#juju-dev 2012-06-04
<davecheney> fwereade: aloha!
<fwereade> davecheney, heyhey
<fwereade> davecheney, how's it going?
<davecheney> good
<davecheney> spent the weekend looking at ARM assembly
<davecheney> so a change is as good as a holiday
<fwereade> davecheney, excellent :)
<TheMue> Morning.
<fwereade> heya TheMue
<fwereade> TheMue, ping
<TheMue> fwereade: pong
<fwereade> TheMue, figuring out what I should work on; wondering how the relation stuff is going
<TheMue> fwereade: One moment please, have to change rooms.
<fwereade> TheMue, np
<TheMue> fwereade: So, back again. My wife has been in the home office before. ;)
<TheMue> fwereade: Currently State.AddRelation() and State.RemoveRelation() are in review and close to LGTM.
<TheMue> fwereade: But Roger and Gustavo discussed it and want to keep open the possibility that a service might have relations of the same role multiple times in future.
<TheMue> fwereade: So I'll now change the topology for it and afterwards the AddRelation() and RemoveRelation() to use this new model.
<fwereade> TheMue, ok, cool, so not *imminent* -- cheers :)
<TheMue> fwereade: Next step will then be to enhance AddRelation() by a better verification of the endpoints. Today it has the same checks as the Py code, but it's not enough.
<TheMue> fwereade: What is not so imminent?
<fwereade> TheMue, sorry, got diverted; I mean that a usable state.Relation type is still some way off
<TheMue> fwereade: There's enough to do, yes.
<fwereade> TheMue, cool; I should probably do something that doesn't depend on it then :)
<TheMue> fwereade: Or participate. ;)
<fwereade> TheMue, if you think the work can be usefully divided that would be cool -- what's the plan?
<TheMue> fwereade: That's the problem, there is no real plan. I'm just doing the functionality top-down. I've started with the relation manager, those methods move into our new State.
<TheMue> fwereade: Next step would be relation type by type.
<TheMue> fwereade: I've seen that the service relation has a lot of functionality.
<fwereade> TheMue, I could happily try to do some rudimentary types
<fwereade> TheMue, see how far I can take them
<fwereade> TheMue, what's the latest stable-ish unmerged relations branch? if you think some will go in soon, it seems sensible to branch from there
<TheMue> fwereade: The latest is go-state-remove-relation. But it has a chain of prerequisites.
<fwereade> TheMue, and is that likely to be merged by the time I have a new type to propose on top of it? and not to change too much between now and then?
<TheMue> fwereade: So maybe you should wait until they are all in the trunk.
<TheMue> fwereade: I hope to keep the API stable, as well as the types so far.
<TheMue> fwereade: Right now only the topology below is changed.
<fwereade> TheMue, ok; in that case I can probably just go from trunk, right?
<TheMue> fwereade: In trunk there's the current topology, but not yet the relation types as well as Add and Remove.
<fwereade> TheMue, I'm just re-checking your current proposals to get myself more up to speed
<TheMue> fwereade: Yes, take a look. I think both will go in today or tomorrow.
<TheMue> fwereade: Just wait for the LGTM. The Add… had a pre-LGTM by Gustavo.
<fwereade> TheMue, hmm, so I should maybe branch from add-relation and add functionality to ServiceRelation?
<TheMue> fwereade: Sounds like a plan. ;) In Remove… it only has one more method, to return the corresponding Relation.
<fwereade> TheMue, cool, I'll ping you if I have any trouble
<TheMue> fwereade: Cheers.
<TheMue> fwereade: You'll see that those zkXyz are now topoXyz, as we talked about in Oakland.
<fwereade> TheMue, sweet
<fwereade> TheMue, just mailed juju-dev; would appreciate your thoughts
<TheMue> fwereade: Will take a look.
<TheMue> fwereade: To summarize: Interface changes (with checks) are ok, but no subordinate changes. Am I right?
<fwereade> TheMue, that's what I *think* we should do
<fwereade> TheMue, you tell me whether that's right :)
<TheMue> fwereade: Hehe, then you have to help me with subordinates first. Where do we find them in our model?
<fwereade> TheMue, we don't really yet
<fwereade> TheMue, I think that I need to incorporate them before we can move much further ahead on the relations
<TheMue> fwereade: Could you please tell me more about subordinates?
<fwereade> TheMue, I'm just changing Service.AddUnit to take (container *Unit) and seeing where I go from there
<fwereade> TheMue, ok, I'll try
<fwereade> TheMue, charms can be subordinate
<fwereade> TheMue, if they are, their characteristics change
<fwereade> TheMue, when deployed, no units are created
<TheMue> fwereade: Do you have an example?
<fwereade> TheMue, when added to a relation, units of the subordinate are deployed inside the containers of every unit of the principal service
<fwereade> TheMue, rsyslog would make a good subordinate charm
<fwereade> TheMue, there are a number of other consequences which I only really discovered today
<fwereade> TheMue, and am not quite sure I have a handle on
<fwereade> TheMue, so stop me if anything sounds like crack
<fwereade> TheMue, one of the issues is that the "connections" are different from those in a normal relation
<fwereade> TheMue, in a normal relation between P and Q, every unit of P can see every unit of Q, and react to settings changes; and vice-versa
<TheMue> fwereade: It's ok so far. I'm only trying to understand it right. Typically I used to work a lot with my colleagues around a whiteboard. I need to visualize it.
<fwereade> TheMue, in a subordinate relation, each unit of P sees only the unit of Q that it shares a container with
<fwereade> TheMue, and settings changes are processed entirely within, and do not leak beyond, that container
<fwereade> TheMue, (everything is still mediated by ZK, obviously; but a settings change for one unit of P will not be noted by a unit of Q outside its container)
<fwereade> TheMue, so... when storing unit relations, we have different ZK paths to the settings etc
<TheMue> fwereade: IC
<fwereade> TheMue, we currently handle the /relations/rel-id case, which is what we want globally; but we also need /relations/rel-id/unit-id (I think)
 * fwereade goes to check
<fwereade> TheMue, yeah; and the unit-id I referred to above is specifically the unit id of the principal (ie the one that "owns" the container)
<TheMue> fwereade: In Py there's code handling /relations/rid/cid/role
<TheMue> fwereade: Where cid is the container id
<fwereade> TheMue, and also /relations/rid/role, right?
<TheMue> fwereade: For global relations, yes.
<fwereade> TheMue, AIUI the structure of the nodes underneath that should be similar
<fwereade> TheMue, does that seem right to you?
<fwereade> TheMue, look for _get_scope_path and how it's used
<TheMue> fwereade: I thought it is, but I haven't yet covered the container-scoped relations.
<fwereade> TheMue, I'll proceed as though it is; can't be sure of every use of it yet but we'll see how it goes
<fwereade> TheMue, the topology also changes to accommodate knowledge of what container a given unit is actually deployed in
<fwereade> TheMue, anyway, I'm just making those changes to begin with
<TheMue> fwereade: So this is also a topology change to the current Py version?
<fwereade> TheMue, no, it's part of the topology already in py
<TheMue> fwereade: OK, then I got you wrong, sorry.
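
For orientation, a minimal Go sketch of the two ZooKeeper path layouts being discussed; the function and key names are invented for illustration (the Py code derives them via _get_scope_path), not the actual juju-core API:

    package main

    import "fmt"

    // relationScopePath returns where a relation role keeps its state.
    // A global relation uses /relations/<rid>/<role>, so every unit of one
    // side sees every unit of the other. A container-scoped (subordinate)
    // relation inserts the principal unit's container id, so settings
    // changes stay inside that container and never leak beyond it.
    func relationScopePath(rid, cid, role string) string {
        if cid == "" {
            return fmt.Sprintf("/relations/%s/%s", rid, role)
        }
        return fmt.Sprintf("/relations/%s/%s/%s", rid, cid, role)
    }

    func main() {
        fmt.Println(relationScopePath("relation-0000000000", "", "peer"))
        // /relations/relation-0000000000/peer
        fmt.Println(relationScopePath("relation-0000000000", "unit-0000000007", "server"))
        // /relations/relation-0000000000/unit-0000000007/server
    }
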
<TheMue> Lunchtime
<Aram> heyhey.
<TheMue> Aram: Hiya.
<hazmat> g'morning
<fwereade> Aram, hazmat: heyhey
<fwereade> heya niemeyer
<niemeyer> Yo
<niemeyer> !
<fwereade> niemeyer, TheMue, Aram: conundrum follows
<fwereade> in the topology, in python, we store Machine and Container for each unit
<fwereade> only one of those can ever be valid
<fwereade> I just got an urge to store Location, which is just a unit key or a machine key (and we can tell which is which, right?)
<fwereade> this feels kinda evil, but so does allowing the content to be insane by having both or neither set
<fwereade> niemeyer, TheMue, Aram: opinions? ^^
<fwereade> niemeyer, TheMue, Aram: I'm leaning towards storing them under separate keys and just being careful about what we write, but...
<niemeyer> fwereade: Having a single key feels bad indeed
<fwereade> niemeyer, TheMue, Aram: hmm, *at the moment* only one can ever be correct
<fwereade> niemeyer, TheMue, Aram: forget I said anything :)
<niemeyer> fwereade: They're different fields meaning different things.. having to guess what the field means based on the value isn't great
<fwereade> niemeyer, indeed
<fwereade> niemeyer, we could still have garbage set in that one field anyway
<TheMue> fwereade: Forgotten. ;)
<niemeyer> fwereade: can you please expand? Maybe I misunderstand
<fwereade> niemeyer, I'm agreeing -- the extra guard against insanity of the icky magic field doesn't eliminate the possibility of insanity anyway, so even that is not an argument for the icky magic field over two separate fields
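
A tiny sketch of the option settled on here, with invented names: keep the two topology fields separate and guard the insane combination at write time, rather than overloading one magic Location field.

    package topo

    import "errors"

    // topoUnit is a hypothetical topology entry for a unit. MachineKey and
    // ContainerKey mean different things, so they stay separate fields.
    type topoUnit struct {
        MachineKey   string // set when the unit is assigned to a machine
        ContainerKey string // set when the unit lives in another unit's container
    }

    // check runs before writing the topology, so invalid content is
    // rejected instead of being guessed about at read time.
    func (u *topoUnit) check() error {
        if u.MachineKey != "" && u.ContainerKey != "" {
            return errors.New("unit cannot refer to both a machine and a container")
        }
        return nil
    }
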
 * fwereade should go and borrow a cuddly toy from laura to ask these questions to before polluting irc :)
<niemeyer> fwereade: Hehe. Jamu used to call that "talking to the bear" :-)
<fwereade> niemeyer, yeah, it's a noble technique, we had a cat called mutex at resolver who sometimes served that purpose
<TheMue> fwereade: Hehe, or take your pet to talk to (if you got one). Cats are good listeners.
<fwereade> TheMue, just a cuddly cat actually
<fwereade> TheMue, I'm not sure a real cat would have the patience anyway ;)
<TheMue> fwereade: I'm sure -- it doesn't. *lol*
<niemeyer> fwereade: That's a great name for a cat :)
<fwereade> niemeyer, isn't it :)
<TheMue> niemeyer: One question about the topo service inversion. Shall I open up the combination of roles even if we don't use them yet, or shall I keep the current validation for provider/requirer or peer?
<niemeyer> TheMue: The validation is fine
<niemeyer> TheMue: I mean, it will have to be adapted, of course
<niemeyer> TheMue: Given the map change
<TheMue> niemeyer: Sure, but I'll keep it and later we can extend it. OK.
<niemeyer> TheMue: But we don't have to pretend it's general right now.. we just don't want to shoot ourselves in the foot
<TheMue> niemeyer: Could hurt, indeed. ;)
<TheMue> niemeyer: So, change is done. Influence on State.AddRelation() is small, only three lines.
<fwereade> niemeyer, TheMue: https://codereview.appspot.com/6268050 has subordinate units
<niemeyer> fwereade: Woohay
<niemeyer> TheMue: Woohay too :)
<TheMue> :D
<fwereade> niemeyer, er, there'll be a small diff upcoming, missed a signature change in the rest of the project
<fwereade> niemeyer, should be a trivial additional file changed
<fwereade> niemeyer, ie shouldn't block sane reviewing if you're of a mind :)
<niemeyer> fwereade: Not yet.. still trying to catch up on post-sprint stuff
<niemeyer> fwereade: But I'll get there
<fwereade> niemeyer, no worries :)
<fwereade> niemeyer, btw, tyvm for reviewing what you did during the sprint, hope it didn't take too much time away
<niemeyer> fwereade: No problem. It didn't take much time away because I've used the evenings for the largest chunks
<fwereade> niemeyer, jolly good, thanks :)
<fwereade> niemeyer, TheMue, Aram: oh, blast: I've put myself in a situation in which I need two prereqs for the next branch
<fwereade> niemeyer, TheMue, Aram: did we ever come up with a nice way of dealing with this?
<niemeyer> fwereade: Is any of them small enough that I can review quickly?
<niemeyer> fwereade: There's no way to handle that automatically with current infra
<niemeyer> fwereade: Launchpad doesn't support it
<fwereade> niemeyer, one of them is TheMue's add-relation branch
<niemeyer> fwereade: Hmm
<fwereade> niemeyer, which IIRC has an LGTM from you already actually
<niemeyer> fwereade: I thought that was in already
<TheMue> fwereade: There's only a final lgtm missing.
<niemeyer> Link?
<TheMue> niemeyer: I only proposed it for the final review after the last open points.
<TheMue> niemeyer: https://codereview.appspot.com/6223055/
<TheMue> niemeyer, fwereade: I'm ready for submitting it.
<fwereade> TheMue, if it has an LGTM you should be good to go unless you disagreed with any of the points in the LGTM
<fwereade> TheMue, I have been known to be too hesitant on this front :)
 * TheMue likes clean LGTMs. ;)
 * fwereade lives in hope of them, and is most joyful when they come to pass
<niemeyer> TheMue: LGTM
<TheMue> niemeyer: Ah, thx, here it comes.
<fwereade> niemeyer, thanks, sorry context switch :(
<niemeyer> fwereade: The point applies in this case, FWIW.. your branch reflected the agreement of the last review
<niemeyer> fwereade: No problem, happy to prevent blockage
<TheMue> niemeyer: Updated the latest proposal (topology). Now the new trunk containing State.AddRelation() is merged, the changes are adopted, and all lamps are green.
<niemeyer> TheMue: Beautiful, thank you
<niemeyer> !
<TheMue> niemeyer: np :D
<niemeyer> mramm: ping
<mramm> pong
<mramm> niemeyer: pong
<niemeyer> mramm: Heya
<niemeyer> mramm: Are you subscribed to juju-dev?
<niemeyer> davecheney: ping
<davecheney> niemeyer: ack
<davecheney> hang on
<davecheney> let me change terminals
<davecheney> x has shat itself
<davecheney> niemeyer: back
<niemeyer> davecheney: Heya
<niemeyer> davecheney: Morning
<davecheney> yes! finally it wasn't raining, so it was my first chance to go for a ride before work for weeks!
<davecheney> niemeyer: how was your trip ?
<niemeyer> davecheney: Oh, congrats! :)
<niemeyer> davecheney: It was great indeed.. very happy that I managed to meet Aram and rogpeppe in addition to the unrelated sprint
<davecheney> niemeyer: didn't you meet roger at UDS ?
<davecheney> :P
<davecheney> btw, is Aram the Aram that wrote doozer ?
<niemeyer> davecheney: Yeah, but it was great even then.. we managed to cover some interesting ground there
<niemeyer> davecheney: He's the Aram that patched doozer for persistence, yeah
<niemeyer> ;)
<davecheney> niemeyer: i had a great day with rog when we went to the computer history museum
<niemeyer> davecheney: By the way, I've pushed a gozk branch that does the semantic change I mentioned
<davecheney> 5 hours on SF public transport left ample time for getting to know each other
<davecheney> niemeyer: sweet, i'll go get -u now
<niemeyer> davecheney: I didn't submit yet, but if you LGTM it I'll push it on
<davecheney> niemeyer: does the notification go to gophers ?
<davecheney> i haven't seen an email
<niemeyer> davecheney: I'd hope so
<davecheney> i guess not
<davecheney> let me double check
<davecheney> https://codereview.appspot.com/6292044
<davecheney> but no email, i just found that from your reply to my handover note
<niemeyer> Hmm
<niemeyer> Weird
<davecheney> am i a member of golang-gophers? or just the juju project ?
<niemeyer> davecheney: I'm somewhat concerned with the lack of tests in the provisioner branch.. there are tons of edge cases there that are being carefully considered but untested
<niemeyer> davecheney: I'm sure you've tested by hand or at least put good thought on these edge cases
<davecheney> niemeyer: yes, i wonder how I can decompose the functions to test them
<niemeyer> davecheney: Have you used TDD before?
<davecheney> nope
<niemeyer> davecheney: Ok.. there are a couple of good books that might be useful.. I like the one from Kent Beck, but just last week I've received good feedback on Growing Object-Oriented Software, Guided by Tests by Steve Freeman
<niemeyer> davecheney: I suspect Kent Beck might be more to-the-point in terms of TDD specifically, though
<niemeyer> davecheney: Regarding decomposing the functions, that may not be necessary
<niemeyer> davecheney: I'm more concerned about testing the real edge cases than the *implementation*
<niemeyer> davecheney: A simple example: if a misconfigured environment somehow arrives, the new configuration is ignored, but a follow up configuration actually takes place
<niemeyer> davecheney: We can integrate this branch before that, though.. I'm happy to have those increments nicely done on top of that, given that what we have in place was well debated over already
<davecheney> niemeyer: cool
<niemeyer> davecheney: Right now I'm just doing some minor commenting on trivial stuff like log messages and whatnot
<davecheney> niemeyer: ... wondering how to observe a misconfigured environment being rejected
<niemeyer> davecheney: What we care about is not that it is rejected, but that the provisioner itself stays up
<niemeyer> davecheney: and that a follow up correct configuration is loaded
<davecheney> niemeyer: gotcha
<davecheney> understood
<davecheney> making a list now
<davecheney> * provisioning does not occur with broken env
<davecheney> * provisioning does occur when env is valid, then broken
<davecheney> * provisioning does occur when env is valid, then broken, then valid
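
A self-contained sketch of the behaviour that list pins down: broken configs must not take the provisioner down, and a later valid config must still be applied. Everything here is invented for illustration, not the provisioner's real code.

    package main

    import "fmt"

    type envConfig struct {
        name  string
        valid bool
    }

    // applyConfigs mimics the loop behaviour under test: a broken
    // environment config is skipped without stopping the provisioner,
    // and a follow-up valid config still takes effect.
    func applyConfigs(configs []envConfig) (applied []string) {
        for _, cfg := range configs {
            if !cfg.valid {
                fmt.Printf("ignoring broken config %q; provisioner stays up\n", cfg.name)
                continue
            }
            applied = append(applied, cfg.name)
        }
        return applied
    }

    func main() {
        applied := applyConfigs([]envConfig{
            {"valid-1", true},
            {"broken", false}, // provisioning must not occur here
            {"valid-2", true}, // ...but must resume here
        })
        fmt.Println("provisioned with:", applied) // [valid-1 valid-2]
    }
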
<davecheney> niemeyer: are you comfortable with me integrating those into the woodenman branch ?
<niemeyer> davecheney: woodenman? Do we already have another branch up? Sorry, I'm still catching up
<davecheney> yeah, strawman is the basic provisioning agent that just observes environment changes
<davecheney> woodenman is the follow on that actually pokes the Environ when it needs to add machines, etc
<niemeyer> davecheney: I'd prefer to fix these outstanding issues before we increase the functionality
<niemeyer> davecheney: We have quite a few pending things there
<davecheney> sure
<niemeyer> davecheney: Tests, state issues within the loop (maybe?), state reconnection
<davecheney> ok
<niemeyer> davecheney: If we add more logic, we'll be increasing the debt
<niemeyer> davecheney: It's not clear to me we need IsValid anymore, btw, assuming the gozk branch
<niemeyer> davecheney: We'll have to think about it..
<niemeyer> davecheney: Well.. and maybe a test
<davecheney> niemeyer: no, as long as the watchers close, then I can return to the Run function, tear everything down, and start again
<davecheney> niemeyer: re: gozk, I guess that looks good
<davecheney> i don't think i can comment authoritatively
<niemeyer> davecheney: Actually, we may still need a note from the state
<davecheney> niemeyer: that is what I thought of from IsValid()
<niemeyer> davecheney: Yeah, may be good to be explicit
<davecheney> so the innerLoop becomes for state.IsValid() { .... }
<niemeyer> davecheney: Yeah, or IsConnected
<davecheney> niemeyer: yup, i'm not fussed about the name
<niemeyer> davecheney: I know, just brainstorming as we go
<davecheney> I had the idea that that would return false after some event that caused the state to be 'broken'
<niemeyer> davecheney: We can watch the session channel for that
<niemeyer> davecheney: Should be an easy walk
<niemeyer> davecheney: Anything returning event.Ok() != true can set the flag to red
<niemeyer> davecheney: Actually, now that I think of it.. I don't think the complexity of machinesChanges and environChanges makes sense anymore, with those decisions
<niemeyer> davecheney: Just imagine in which circumstances these channels will break
<davecheney> niemeyer: you mean making them recreate themselves
<niemeyer> davecheney: Oh, duh.. please ignore me
<niemeyer> davecheney: No, wait.. I'm confused
<niemeyer> davecheney: These channels are internally reestablished..
<niemeyer> davecheney: Yeah, the initial idea seems to proceed
<niemeyer> davecheney: We only need to recreate on !ok, right?
<davecheney> niemeyer: they are only recreated if, a, they were nil to begin with (just started up) or b, their Stop() method was called, in response to them being closed
<niemeyer> davecheney: Yeah, in that second scenario, it doesn't really make sense to bring them back
<niemeyer> davecheney: If they're dead, the state itself must be dead too
<davecheney> niemeyer: with your change, that is now true
<niemeyer> davecheney: That said, it's a trivial simplification we can do in a second moment.. not hurting reliability, so let's move on
<davecheney> they aren't recreated until you go through the loop again, so if the loop has a condition that is false if the state is closed
<davecheney> they won't be recreated
<niemeyer> davecheney: It was always true..
<niemeyer> davecheney: We've changed !ok to happen more frequently, but it was always the case that with !ok the state is dead
<davecheney> niemeyer: absolutely, just those channels never closed
<niemeyer> davecheney: Exactly, and stop is only called when they closed
<davecheney> so, we can use !ok as a proxy for state is closed
<davecheney> ok, that makes things simpler
<niemeyer> davecheney: Yeah, I think so too
<davecheney> if a watcher closes, its underlying state is closed, so collapse back to the top, cleaning up as we go, then start again
<niemeyer> davecheney: Right
<niemeyer> davecheney: I think we pretty much never have to retry in place
<niemeyer> davecheney: These scenarios are edge cases anyway, and we need to handle harsher scenarios too
<niemeyer> davecheney: So we can safely handle hiccups in conservative ways
<davecheney> niemeyer: i also wanted to ask you about retring
<davecheney> retrying
<niemeyer> davecheney: and proceed as if they were major breakups
<davecheney> roger suggested there was some retry logic in the ec2 code, but unless he's talking about goamz, I couldn't find it
<niemeyer> davecheney: Hopefully we'll also be better off when the major break does happen
<niemeyer> davecheney: No, there is indeed, inside environs/ec2
<davecheney> grep -i -r retry didn't help me, i'll try again
<niemeyer> davecheney: grep for shortAttempt
<davecheney> ok
<niemeyer> davecheney: Should give you the seed
<davecheney> i want to reuse that for the loop at the top of the PA, so we don't jump straight back in
<niemeyer> davecheney: Nice; feel free to refactor it off
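
A generic sketch of that bounded-retry pattern; shortAttempt in environs/ec2 is the real seed mentioned above, but the names and shape here are illustrative only.

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // attempt keeps calling f, pausing delay between tries, until f
    // succeeds or total time has elapsed -- the kind of loop the PA could
    // sit in instead of jumping straight back into its main loop.
    type attempt struct {
        total time.Duration
        delay time.Duration
    }

    func (a attempt) do(f func() error) error {
        start := time.Now()
        for {
            err := f()
            if err == nil {
                return nil
            }
            if time.Since(start) >= a.total {
                return err // give up, reporting the last failure
            }
            time.Sleep(a.delay)
        }
    }

    func main() {
        short := attempt{total: time.Second, delay: 200 * time.Millisecond}
        err := short.do(func() error {
            return errors.New("zookeeper still unreachable")
        })
        fmt.Println(err) // gives up after ~1s
    }
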
<davecheney> niemeyer: ok
<davecheney> niemeyer: https://launchpad.net/~gophers << i'm still pending approval
<davecheney> :(
<niemeyer> davecheney: Hah, sorry
<niemeyer> davecheney: That explains it
 * davecheney feels slighted
<niemeyer> davecheney: Ok, review delivered
<davecheney> thank you very much
<davecheney> who is going to look at IsConnected ?
<niemeyer> davecheney: I think the branch can go in with those changes, but I'm fine as well if you decide to refactor it further with some of these ideas we just discussed
<davecheney> niemeyer: I would like to land it today, if there are no objections, then follow up with the other things we have discussed
<niemeyer> davecheney: Feel free to do it if you feel like it, or we can ask TheMue to have a look at it
<davecheney> niemeyer: re IsConnected, i'll take a look today and hand over to TheMue at 4pm if I get stuck
<niemeyer> davecheney: Okay, so let's move ahead.. fixing the comments/messages/etc, so we can get it in with a state we're both happy with
<davecheney> ok
<niemeyer> davecheney: Okay.. the idea is really simple: we have a loop with event, ok := <-session in open.go already.. we just have to add a check to see if event.Ok() for each event received, and set a protected variable in case it's false
<niemeyer> davecheney: You can find some Stop/Start tests within zk_test.go that may be useful
<niemeyer> davecheney: Look for Reconnect in the test name
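
A sketch of that idea with stand-in types (the real gozk event and session channel differ): a goroutine consumes the session watch, flips a mutex-protected flag on the first bad event, and the inner loop tests the flag.

    package main

    import "sync"

    // sessionEvent stands in for a gozk session event; Ok reports whether
    // the connection is still in a good state.
    type sessionEvent struct{ ok bool }

    func (e sessionEvent) Ok() bool { return e.ok }

    // connState holds the protected variable described above.
    type connState struct {
        mu    sync.Mutex
        valid bool
    }

    // monitor consumes session events; the first !Ok() event (or the
    // channel closing) marks the state invalid, so loops written as
    // for st.IsValid() { ... } fall out and tear everything down.
    func (s *connState) monitor(session <-chan sessionEvent) {
        for ev := range session {
            if !ev.Ok() {
                break
            }
        }
        s.mu.Lock()
        s.valid = false
        s.mu.Unlock()
    }

    func (s *connState) IsValid() bool {
        s.mu.Lock()
        defer s.mu.Unlock()
        return s.valid
    }

    func main() {
        session := make(chan sessionEvent)
        st := &connState{valid: true}
        go st.monitor(session)
        session <- sessionEvent{ok: false} // e.g. ZooKeeper went away
        for st.IsValid() {
            // the inner loop's work would go here; it exits once the flag flips
        }
    }
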
<niemeyer> // .. but we take all session events that occur before a session is established as errors.
<davecheney> niemeyer: in state/open.go ?
<niemeyer> davecheney: Hmm
<niemeyer> davecheney: The comment seems clear.. all session events on non-session watches are fatal
<davecheney> i guess i don't understand what a zk session is
<niemeyer> davecheney: There are two kinds of watches: a requested watch (e.g. GetW), and the session watch that is obtained via Dial
<niemeyer> davecheney: Both of these receive events
<davecheney> so a session watch observes everything, a getw is scoped to a path ?
<niemeyer> davecheney: The session watch receives *only* connectivity state changes
<davecheney> ahh, of course, events about this session, this connection
<niemeyer> davecheney: The requested watch receives both these and the actual requested events
<niemeyer> davecheney: Yeah.. I call them session events because they actually have a code of EVENT_SESSION
<niemeyer> davecheney: It is confusing indeed, sorry
<niemeyer> davecheney: Is there a way I can improve that comment in a more understandable way?
<davecheney> niemeyer: no, it makes sense now.
<davecheney> so if you're a watcher, and you receive an EVENT_SESSION, it's a signal that you should close
<davecheney> your comment is correct
<davecheney> afk 15 mins
<niemeyer> davecheney:
<niemeyer>                 // All session events on non-session watches will be delivered
<niemeyer>                 // and cause the watch to be closed early. We purposefully do
<niemeyer>                 // that to enforce a simpler model that takes hiccups as
<niemeyer>                 // important events that cause code to reestablish the state
<niemeyer>                 // from a pristine and well known good start.
<niemeyer>                 if event.State == STATE_CONNECTED {
<niemeyer>                         // That means the watch was established while we were still
<niemeyer>                         // connecting to zk, but we're somewhat strict about only
<niemeyer>                         // dealing with watches when in a well known good state.
<niemeyer>                         event.State = STATE_CONNECTING
<niemeyer>                 }
<niemeyer> davecheney: Hope that's more understandable
<davecheney> +1e9
<davecheney> niemeyer: my only change would be
<davecheney>  // important events which cause code to reestablish the state
#juju-dev 2012-06-05
<davecheney> niemeyer: thanks for fixing gozk
<davecheney> niemeyer: 2012/06/05 11:37:12 JUJU provisioning: environWatcher reported error on Stop: watcher: critical session event: ZooKeeper connecting
<davecheney> now the PA can sense zookeeper going away
<davecheney> niemeyer: i'm liking the way this looks now
<davecheney> if we get a !ok watching for changes, we just return, and I let defers take care of cleaning up
<davecheney> it's much cleaner, and -2 functions
<davecheney> what a crappy day
<davecheney> spent 1/2 the day thinking there was a bug in the watchers
<davecheney> then finally realised, at 5pm, that there are actually connections in place when you are running tests
<TheMue> davecheney: Hiya. Hehe, I know that feeling.
<TheMue> davecheney: But at least you can end the day with the knowledge that you've found it.
<davecheney> and a rush to get my change proposed by 6pm :)
<davecheney> TheMue: i was going mad
<davecheney> i chased it all the way down to gozk and back again
<davecheney> and I was just writing up the message asking for help and explaining that if I actually killed the zookeeper server
<davecheney> then everything would work fine, when I realised that clearly I was closing the wrong connection
<Aram> hello.
<TheMue> Aram: Hi.
<Aram> hey there
<niemeyer> Goooood morning jujuers
<niemeyer> What a beautiful morning it is here indeed
<Aram> morning.
<mramm> good morning.
<Aram> rains here
<mramm> here too
<niemeyer> Aram: Worry not, I'll make sure to send some of the fantastic weather in your direction
<Aram> niemeyer: do you see the transit of Venus in your area?
 * Aram hopes it's not cloudy tomorrow
<niemeyer> Aram: Unfortunately not :-(
<TheMue> niemeyer: Morning. Here it is grey, thankfully dry, and both daughters are ill. *sigh*
<niemeyer> TheMue: Oh :(
<niemeyer> TheMue: What's up with them? Cold?
<TheMue> niemeyer: No, the younger one has typical female probs (she always has a hard first day), the older one has cut herself yesterday.
<niemeyer> Ouch!
<TheMue> niemeyer: Indeed, stayed until 1am in the clinic and hope to get her back today or tomorrow.
<niemeyer> TheMue: Wow.. so it was serious indeed.. is she ok now?
<TheMue> niemeyer: Yes, but the cut has to be stitched.
<niemeyer> TheMue: Ok, but that's generally something fast to do
<TheMue> niemeyer: Yes, now they're only monitoring that it doesn't get infected. And her blood pressure went down.
<niemeyer> TheMue: I see
<TheMue> niemeyer: That's why we now wait for a call from the hospital to fetch her.
<TheMue> niemeyer: By the way, could you take a look at bug 1007373? I added a comment on what we already check and what to add when adding a relation. Maybe you see more.
<niemeyer> TheMue: Will look
<niemeyer> TheMue: It's not clear to me how your comment relates to the description of that bug
<niemeyer> TheMue: Can you please respond (here or there) in terms of the specific issue described?
<niemeyer> TheMue: The issue is very specific.. a "Tests so far" list does not make it clear to me
<TheMue> niemeyer: OK, I'll try to make it clearer.
<niemeyer> TheMue: Talking to me is fine as well :)
<TheMue> niemeyer: The implemented tests so far are the same as in Py. Those are all the points I listed below.
<TheMue> niemeyer: The tests we don't yet have - in both versions - are the open ones.
<TheMue> niemeyer: Mostly if identifiers are not empty and role and scope contain valid values.
<niemeyer> TheMue: Can you please read the description of that bug again?
<niemeyer> TheMue: And explain how you feel about *that specific issue*
<niemeyer> TheMue: You're telling me about tests you have or not.. I raised a specific problem that I'd like to understand
<TheMue> niemeyer: As far as I understood the issue there are open validations regarding the endpoints in State.AddRelation().
<niemeyer> TheMue: Not just that.. it's also about the side effects of validating later rather than sooner
<niemeyer> It'd be bad to be creating state in ZooKeeper just to later tell something
<niemeyer> > trivial to the user that we could have verified upfront. We shouldn't be
<niemeyer> > duplicating this logic, though.
<niemeyer> TheMue: I haven't heard anything about what you think of that yet..
<TheMue> niemeyer: I'm just checking my current version, most is now tested upfront, before any writing to ZK.
<niemeyer> TheMue: Sorry, but I still can't follow.. can you please be more specific?  Something like "No, that can't happen because if the endpoint is invalid foo bar will check and prevent the node from being created."
<niemeyer> TheMue: You may well be right, but I can't tell yet..
<TheMue> niemeyer: That's exactly what I'm implementing now and what the next proposal will show to you.
<niemeyer> TheMue: Heh
<TheMue> niemeyer: I only wanted to know if you see any specific tests I forgot, so I wrote them down.
<TheMue> niemeyer: You'll see then how I test it.
<niemeyer> TheMue: That bug is about one specific issue.. that comment makes no sense in that context
<niemeyer> TheMue: We already have a function to validate endpoints..
<niemeyer> TheMue: That bug is about the fact they are not being validated before the endpoint is being acted upon, which causes state in zookeeper to be created.
<TheMue> niemeyer: Then maybe I've got a problem understanding your issue and would like you to rephrase it for me.
<niemeyer> TheMue: We need to do the validation upfront rather than doing random writing and then testing that the parameter is invalid.
<niemeyer> TheMue: That's all
<TheMue> niemeyer: Not everything is tested yet, and when having two endpoints the first one could be written while the second later breaks the iteration over the endpoints because it is invalid.
<TheMue> niemeyer: Yes, that upfront testing is added in the new branch.
<niemeyer> TheMue: That's what this bug is about
<niemeyer> TheMue: If you're fixing that, it's great
<TheMue> niemeyer: OK, then I only found more to validate.
<niemeyer> TheMue: Yep, that's even better, thank you
<TheMue> niemeyer: The rest is clear.
<niemeyer> TheMue: Awesome, cheers!
<TheMue> niemeyer: cheers
<TheMue> niemeyer: Then I only have to add some negative tests to verify it and it comes in.
 * TheMue just got the news to fetch Janina in 2h from the hospital. *yay*
<mramm> TheMue: Yay indeed
<mramm> TheMue: Must have been scary
<niemeyer> TheMue: Phew!
<TheMue> mramm: Yes, it was. At least the first shock. After we'd seen her it got better. But it's still different from a typical children's illness.
 * TheMue thinks back on what we already had with our girls.
<niemeyer> Woohay.. first fire of the year in the fireplace
<Beret> cold and Brazil just doesn't compute for me for some reason
<Beret> perhaps because I've never been there
<Beret> or perhaps just because the heat is talked about far more than the cold
<niemeyer> fwereade: ping
<niemeyer> Beret: yeah, it's a common illusion to think that the whole country looks like Rio de Janeiro :)
<Beret> Fernando shows me pictures of beautiful beaches and Andreas talks of the heat, and there's my impression
<niemeyer> Beret: Andreas isn't a good parameter.. he probably has his own snow generation engine at home
<Beret> hah
<andrewsmedina> Beret: I live in Rio de Janeiro
<andrewsmedina> :)
<niemeyer> andrewsmedina: Just today my wife was saying she'd like to visit Rio..
<niemeyer> davecheney: Heya,
<niemeyer> davecheney: Good morning
<niemeyer> Just in time for me to step out :)
<davecheney> niemeyer: morning
<niemeyer> May be back later for more reviewing..
<davecheney> no worries
<davecheney> thanks for your work overnight
<davecheney> especially that schema one
<davecheney> i'll abandon that branch
<davecheney> niemeyer: one final thing, with machines.String(), i agree just printing its itoa id is right, but it will be a larger change to the tests. I'll resubmit the branch in a bit
<niemeyer> davecheney: Sounds good
<niemeyer> Back in a few hours for some reviewing
<davecheney> kk
#juju-dev 2012-06-06
 * niemeyer pops up
<niemeyer> Okay.. time for some sleep
<niemeyer> Night all
<Aram> morning.
<davecheney> howdy doodie
<davecheney> putting my head in the lion's mouth, http://codereview.appspot.com/6297048/
<fwereade> mornings
<fwereade> TheMue, ping
<TheMue> fwereade: pong
<fwereade> TheMue, just thought I'd update you quickly re service/unit relations
<fwereade> TheMue, I have something that I think looks good
<TheMue> fwereade: I'm listening.
<fwereade> TheMue, but it's going to be a fair number of branches, which will roughly go:
<fwereade> add state.watcher.VersionWatcher
<fwereade> add state.RelatedUnitWatcher, which watches both the presence node and the settings node of a given unit relation
<fwereade> add state.RelatedUnitsWatcher, which watches a set of unit relations and maintains a RelatedUnitWatcher for each
<fwereade> add state.UnitRelation, the tricky bit of which is WatchRelated (which depends on all the foregoing)
<fwereade> add state.ServiceRelation
<fwereade> phew
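
All of the watcher types in that list share one common shape in this codebase: a Changes channel to range over, plus a Stop that tears the watcher down and reports any error. A hypothetical skeleton (not the real declarations):

    package watcher

    // RelatedUnitsChange is an invented payload naming which unit
    // relations joined, departed, or had their settings changed.
    type RelatedUnitsChange struct {
        Joined, Departed, Changed []string
    }

    // RelatedUnitsWatcher sketches the common shape: an internal loop
    // (elided) feeds the changes channel until Stop is called.
    type RelatedUnitsWatcher struct {
        changes chan RelatedUnitsChange
        done    chan struct{}
    }

    func (w *RelatedUnitsWatcher) Changes() <-chan RelatedUnitsChange {
        return w.changes
    }

    // Stop signals the internal loop to exit; the loop is expected to
    // close the changes channel and surface any error (elided here).
    func (w *RelatedUnitsWatcher) Stop() error {
        close(w.done)
        return nil
    }
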
<TheMue> fwereade: Hehe, good list.
<TheMue> fwereade: ServiceRelation is already started.
<fwereade> TheMue, there's a struct without methods, right?
<TheMue> fwereade: UnitRelation is on the list.
<fwereade> TheMue, I've basically written all that code
<fwereade> TheMue, needs far better tests, and to be broken into that nice clean structure
<fwereade> TheMue, but I wanted to check we weren't duplicating too much
<TheMue> fwereade: Currently it has the first few accessor methods, but more are planned (matching the Py impl.).
<fwereade> TheMue, ah yeah, that's right, sorry
<fwereade> TheMue, that was what I was starting from
<TheMue> fwereade: No problem, it's not yet started.
<fwereade> TheMue, cool; just wanted to reassure you that it was coming, and check you didn't need it imminently
<TheMue> fwereade: I have two smaller branches I'm working on before going on.
<TheMue> fwereade: So it fits.
<fwereade> TheMue, cool
<TheMue> fwereade: One orthogonal issue is the harmonization of error handling in the state code.
<fwereade> TheMue, I'm +1 on the general concept, but not really sure what we're planning
<fwereade> TheMue, the thing is, I do often appreciate the details of the naked errors that bubble up, so I kinda favour a nested error type... would something like that work for you?
<TheMue> fwereade: Right now too many low level errors pop up to the user, like errors containing internal keys or ZK messages.
<fwereade> TheMue, although I'd be willing to lose that so long as we log the errors whenever we transform them
<fwereade> TheMue, actually, I think I like that idea
<niemeyer> Good morning folks
<fwereade> niemeyer, heyhey
<TheMue> fwereade: In private I'm using specific error types, e.g. ServiceNotFoundError. As a struct it can contain further info as fields, like keys. And Error() returns the message. It's a bit more effort, but testing and analyzing is simpler this way. Also later I18N.
<fwereade> TheMue, SGTM
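
A minimal sketch of the error type TheMue describes, with illustrative names: the struct carries extra fields for logging, Error() returns the user-facing message, and callers can test with a type assertion instead of string matching.

    package state

    import "fmt"

    // ServiceNotFoundError carries further info as fields -- like the
    // internal topology key -- so logs stay detailed while users see a
    // clean message free of ZK internals.
    type ServiceNotFoundError struct {
        ServiceName string
        Key         string // internal key; useful in logs, not for users
    }

    func (e *ServiceNotFoundError) Error() string {
        return fmt.Sprintf("service %q not found", e.ServiceName)
    }

    // Analysis without string matching:
    //
    //     if _, ok := err.(*ServiceNotFoundError); ok {
    //         // handle the missing service
    //     }
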
<TheMue> niemeyer: Moin.
<niemeyer> fwereade: https://codereview.appspot.com/6248076 is reviewed
<fwereade> niemeyer, what's the problem with TestAddRelation?
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: Sorry, nevermind..
<niemeyer> fwereade: I got tricked by codereview showing me as being added in a revision you pushed
<fwereade> niemeyer, I can agree that state_test.go is getting a little unwieldy though :)
<niemeyer> fwereade: If I compare against the base it goes away
<niemeyer> fwereade: It is indeed
<niemeyer> For cs:~user/series/mysql
<niemeyer>   cs:~user/mysql
<niemeyer> fwereade: Shouldn't this say "precise" rather than "series"?
<fwereade> niemeyer, oh hell
<fwereade> sorry
<niemeyer> fwereade: np
<fwereade> niemeyer, I'll propose a trivial in a mo
<niemeyer> fwereade: Cool.. feel free to propose+submit directly as a trivial
<fwereade> niemeyer, ok, will do
<fwereade> cheers
<niemeyer> fwereade: Thanks
<niemeyer> Aram: ping
<Aram> pong niemeyer
<Aram> what's up.
<hazmat> looks like the ec2 archives are working again
<niemeyer> Aram: Heya
<Aram> hey
<niemeyer> Aram: That's what I was about to ask too :-)
<niemeyer> hazmat: Sweet
<niemeyer> Aram: Haven't heard much after the sprint.. how're things moving there?
<niemeyer> TheMue: You've got a review on https://codereview.appspot.com/6268049/
<TheMue> niemeyer: Just seen the notification, thanks.
<Aram> niemeyer: working on mgokeeper right now, haven't really worked *at* it until today, I played with mongo in all kinds of scenarios to understand it better.
<niemeyer> Aram: Sweet
<niemeyer> Aram: As usual, it works best if changes are pushed in small solid increments, rather than as a big chunk
<niemeyer> chunks
<niemeyer> Aram: No change is too small.. even one-liners are fine if they look sensible
<Aram> absolutely, I don't intend to come with just one big commit :).
<niemeyer> Aram: I know.. nobody intends, but it often happens nevertheless :-)
<niemeyer> Aram: So it's good to keep it in mind so that breakpoints are more easily detected
<Aram> yes.
<niemeyer> Aram: I promise we'll actually integrate them, rather than being trashed as might have been seen in some other project ;-D
<Aram> heh, indeed.
<TheMue> Lunchtime.
<Aram> enjoy
<niemeyer> Truesome
<niemeyer> TheMue: https://codereview.appspot.com/6250076/ is dirty again
<niemeyer> OMG.. is it lunch time yet... hmmm.. nope, 3h to go.  The body gets pretty confused when I wake up so early.
<niemeyer> TheMue: You've just reproposed a branch that has already been submitted: https://codereview.appspot.com/6120045/
<niemeyer> TheMue: Probably because of the vague branch name ("go")
<hazmat> niemeyer, i always  listen to the body in those situations
<TheMue> niemeyer: Yes, just wondered about the message in the notification too. Strange.
<niemeyer> hazmat: Hehe, wise :)
<TheMue> niemeyer: Check again, should have been the cleaning of the RemoveRelation() branch.
<niemeyer> TheMue: You've reproposed the same branch.. nothing really surprising I guess
<niemeyer> TheMue: Nothing up for review
<TheMue> niemeyer: No, please ignore it, has been my fault.
<TheMue> niemeyer: The right one will come in a few moments.
<niemeyer> TheMue: Sounds good, thanks
<TheMue> niemeyer: Ah, looks cleaner now.
<niemeyer> TheMue: Super, thanks.. I'll just finish the subordinate units branch from fwereade and then will go back to yours
<TheMue> niemeyer: Thx,
<niemeyer> fwereade: https://codereview.appspot.com/6268050/ looks great.. just sent a suggestion for the API
<niemeyer> fwereade: Please let me know how you feel abou tit
<niemeyer> about it
<fwereade> niemeyer, thanks, will do
<niemeyer> fwereade: Cheers
<niemeyer> Also, if someone is in the mood for a quick review: https://codereview.appspot.com/6295048/
<niemeyer> TheMue: Reviewed
<TheMue> niemeyer: Great, thx.
<niemeyer> TheMue: np
<TheMue> niemeyer: I'm just writing down some thoughts about error creation and handling.
<niemeyer> TheMue: Awesome, looking forward to seeing it
<TheMue> niemeyer: Will send it to our list for discussion.
<niemeyer> TheMue: Brilliant
<niemeyer> TheMue, fwereade: We're *almost* zeroing out on branches up for review.. this is probably close to a good time to do the switch over to the juju-core project in Launchpad
<TheMue> niemeyer: Sounds good
<fwereade> niemeyer, there's never going to be a good time; so yeah, let's do it now ;p
<niemeyer> Cool, I'll do the whole dance.. there's still time to merge these last few branches, though
<niemeyer> After you're done with that, we can settle on a no-more-merges approach
<TheMue> Hooray, got my list of branches with prerequisites in. The currently open one and the newly started one are directly based on trunk. Phew.
<niemeyer> TheMue: Awesome.. unfortunately there will be some extra pain just now, as you'll have to rebase the branch on the new tree
<niemeyer> TheMue: But that shouldn't be too much touble hopefully
<niemeyer> trouble
<TheMue> niemeyer: Think so too.
<niemeyer> TheMue: Ok, so all of your impending branches are in?
<TheMue> niemeyer: There's one proposal open, the new branch is in a very early state and not yet proposed.
<niemeyer> TheMue: Okay, can you please get the open proposal in then
<niemeyer> TheMue: So I can do the switch in a less painful way for you
<fwereade> niemeyer, one thing about the formatting I just thought of
<fwereade> niemeyer, the overridden format_smart methods don't produce yaml
<niemeyer> fwereade: Ditto for you.. there's at least one almost-ready-to-merge branch.. if you can get that in now, I'll do the switch over shortly afterwards
<niemeyer> fwereade: What does that mean in practice?
<TheMue> niemeyer: Directly submit it?
<niemeyer> TheMue: How do you mean?
<niemeyer> TheMue: What's the branch?
<fwereade> niemeyer, they print individual list elements directly
<TheMue> niemeyer: It's the changed service/relation mapping in topology.
<niemeyer> fwereade: Well, that sounds like an issue.. it should be yaml as well
<fwereade> niemeyer, this is not a problem if we're not saying that "smart formatting implies yaml"
<fwereade> niemeyer, indeed so :)
<niemeyer> TheMue: That's reviewed already
<niemeyer> fwereade: There's no such thing as "smart formatting" as far the API towards the user goes
<niemeyer> fwereade: That smart formatting is a pretty dumb idea, if you pardon the pun
<TheMue> niemeyer: You had two remarks, not yet a LGTM. But both a simple.
<fwereade> niemeyer, there's "--format=smart", it just happens to be a default
<TheMue> s/a/are/
<fwereade> niemeyer, yeah, agreed
<niemeyer> TheMue: Okay, so the comment I was making is for us to focus on getting these branches in
<niemeyer> TheMue: If you address the points, I'll provide a timely review, and you can submit
<TheMue> niemeyer: OK, it'll come in a few moments.
<niemeyer> fwereade: We want yaml for the output.. if it prints anything else such as a Python object or whatever, it is a bug
<niemeyer> fwereade: smart was an internal implementation that leaked, in a horrible way unfortunately
<fwereade> niemeyer, heh, such is life... but that then means that fixing it will break more than we anticipated... ie relation-ids and relation-list :/
<niemeyer> fwereade: Hmm.. how do you mean?
<fwereade> niemeyer, they're the ones that output straight \n-separated lists atm
<niemeyer> fwereade: Oh, I think those are fine
<fwereade> niemeyer, ok, but they're not outputting yaml; if this is an inconsistency we can live with them I'm ok with that
<niemeyer> fwereade: Their values can't really contain anything but a list of identifiers separated by newlines
<niemeyer> fwereade: Which is a handy API for shell scripts to deal with
<fwereade> niemeyer, agreed, but not providing a yaml option across the board seems a little weak
<niemeyer> fwereade: Or do I miss some more complex case that we have?
<fwereade> niemeyer, we could just add an explicit --yaml option, which should be trivial
<niemeyer> fwereade: We can, but I suggest not worrying about this for now
<niemeyer> fwereade: The shell use case is so strong that it justifies the inconsistency, and doing s.split("\n") is trivial on any language
<fwereade> niemeyer, I think that the complete implementation is "def format_json(self, result, stream):\n    print >>stream, yaml.safe_dump(result)"
<TheMue> niemeyer: It's in for review, and sorry, it was the validation improvement.
<fwereade> niemeyer, but anyway: I like the AddUnitSubordinateTo(principal) idea, and I'm working on that now
<niemeyer> fwereade: Okay, I'm happy with adding that if you'd prefer to have it now.. --output yaml?
<niemeyer> fwereade: Is that the syntax we have everywhere else?
<fwereade> niemeyer, "--format yaml"
<niemeyer> fwereade: Ah, cool.. I'm happy with whatever we're using elsewhere
<niemeyer> andrewsmedina: Is Francisco Souza with you guys as well?
<andrewsmedina> niemeyer: yes
<niemeyer> andrewsmedina: Cool.. just found an unanswered mail from him on the list re. CentOS and was wondering about it
<niemeyer> Will provide some feedback
<andrewsmedina> niemeyer: he is my co-worker and my friend
<andrewsmedina> niemeyer: we ported juju to work on centos
<niemeyer> andrewsmedina: This is awesome
<andrewsmedina> niemeyer: it's working fine now
<niemeyer> andrewsmedina: Really? Ho ho.. this is cool
<andrewsmedina> niemeyer: we need to think about how to provide this compatibility to juju without using a fork
<andrewsmedina> niemeyer: if juju supported a custom cloud-init
<niemeyer> andrewsmedina: Sounds good.. I'll provide some feedback on the list and we can move from there
<niemeyer> fsouza: Timely ;)
<fsouza> niemeyer: andrewsmedina summoned me
<niemeyer> fsouza, andrewsmedina: Mail sent
<fsouza> niemeyer: great, I hope to give it back to you soon :)
<fwereade> niemeyer, https://codereview.appspot.com/6268050/
<niemeyer> fwereade: Looking
<fwereade> niemeyer, cheers
<niemeyer> fwereade: Done
<niemeyer> fwereade: I think you got the last bit of the comment reversed
<niemeyer> fwereade: LGTM with that addressed
<niemeyer> fwereade: That bit, specifically: "Adding a non-subordinate unit on a subordinate service or the other way around should fail in a nice way, though. People may do that with the UI (add-unit foo)."
<fwereade> niemeyer, surely it's impossible to add a principal unit as a subordinate via the UI?
<niemeyer> fwereade: juju add-unit subordinate-service
<fwereade> niemeyer, add-unit will call AddUnit, surely?
<fwereade> niemeyer, AddUnitSubordinateTo should only ever be called by us as a side-effect of addition of some other unit
<fwereade> niemeyer, and we should know what we're doing from the POV of program logic alone
<niemeyer> fwereade: You're right, but the case that is panicking is simply the counterpart of that
<niemeyer> fwereade: Which can plausibly happen as a valid user interaction of some kind
<fwereade> niemeyer, there's no panic in AddUnit
<niemeyer> fwereade: The case that is panicking in AddUnitSubordinateTo is the counterpart of a case that errors in AddUnit
<niemeyer>         if ch.Meta().Subordinate {
<niemeyer>                 return nil, fmt.Errorf("cannot directly add units to subordinate service %q", s.name)
<niemeyer>         }
<niemeyer>         if !ch.Meta().Subordinate {
<niemeyer>                 panic("cannot make a principal unit subordinate to another unit")
<niemeyer>         }
<niemeyer> This is a sensible panic:
<niemeyer>         if principal == nil {
<niemeyer>                 panic("a subordinate unit must be added to a principal unit")
<niemeyer>         }
<niemeyer> Everything else there should be an error
<fwereade> niemeyer, I still don't see how I can panic unless the state is already screwed up badly enough that we started a subordinate relation between the wrong kinds of services
<niemeyer> fwereade: A panic is a serious logical mistake, or system problem
<niemeyer> fwereade: Having two valid units at hand, and doing an operation that is not supported for them, is an error, not a reason to blow the process up
<fwereade> niemeyer, fair enough... I still consider the above to represent a serious logical error that can only result from other code acting flat-out wrong; but I can see there's a distinction there
<niemeyer> fwereade: My proof that this isn't the case is that you correctly spotted that panicking on AddUnit would be bad
<fwereade> niemeyer, ha, yeah
<fwereade> niemeyer, a thought; is there any point panicking explicitly in the first place? the next operation is to call principal.IsPrincipal() which will panic on nil anyway
<niemeyer> fwereade: I'd be fine with dropping the explicit check
<jimbaker> fwereade, thanks for the review, some great refactoring comments!
<fwereade> jimbaker, cheers, hope I haven't missed some subtlety somewhere ;)
<jimbaker> fwereade, the only subtlety, if we will call that, is just the preservation of the json serialization. i'm just trying to preserve any existing warts, since that's the point of adding format: 2
<jimbaker> also i do agree with format_yaml, it's too bad that white space (including \n) in yaml simply means a white-space-separated string and doesn't imply a list
<fwereade> jimbaker, I'm pretty sure that yaml can serialize anything json can; so if we're using it internally, why not go with the more expressive format that is often needed anyway, and dispense with all the extra logic in the protocol?
<fwereade> niemeyer, that's in now
<niemeyer> fwereade: Sweet, thanks
<niemeyer> fwereade: Would you mind to have a quick look at https://codereview.appspot.com/6295048/?
<niemeyer> fwereade: Should be a trivial
<fwereade> niemeyer, on it
<fwereade> niemeyer, LGTM
<niemeyer> fwereade: Cheers!
<niemeyer> Oh, TheMue had already reviewed it too
<niemeyer> Cool
<jimbaker> fwereade, yaml can serialize anything json can. but again there are subtle changes that the json path introduces that charms using it can observe and therefore depend on. in particular, the u"foobar" stuff in output
<fwereade> jimbaker, where does it hit output? all the communications stuff in protocol.py is 100% internal, isn't it?
<jimbaker> fwereade, sure, but it's in the communication that the conversion occurs
<fwereade> jimbaker, that is, everything we serialize to return is immediately deserialized at the other end before being converted to the final format
<jimbaker> fwereade, so info is being added that is unexpected, and then leaks out
<niemeyer> TheMue: LGTM on https://codereview.appspot.com/6268049/
<niemeyer> Will do the switch over once that's in, after lunch
<TheMue> niemeyer: Great, thanks
<jimbaker> fwereade, previously i didn't maintain two separate code paths, and it was 90% (or whatever pareto limit) consistent with the old format
<fwereade> jimbaker, what is the problem with converting object=>yaml=>object=>json, compared to object=>json=>object=>json?
<niemeyer> Lunch!
<fwereade> jimbaker, (ok, the whole idea is silly in the first place, but...)
<jimbaker> fwereade, it's the fundamental reason why this work was done in the first place
<fwereade> jimbaker, ah, hold on, I see now
<jimbaker> so you do relation-get - with format v1 and you get {u"public-address": u"ec2-1-2-3-4.compute-1.amazonaws.com"}
<jimbaker> fwereade, now it's quite possible to avoid this particular case, there's nothing wrong with unicode per se. but then other things start changing
<jimbaker> so keeping the two code paths ensures we get old buggy behavior, as was agreed upon
<jimbaker> and it's the *same* buggy behavior, and the buggy behavior even now has tests :)
<fwereade> jimbaker, yep, consider those objections withdrawn; I had thought we'd duplicate the bugs, but we don't
<fwereade> jimbaker, thanks for bearing with me :)
<jimbaker> fwereade, no worries, and thanks for the good tips again
<fwereade> jimbaker, a pleasure :)
<Aram> niemeyer: am I correct in assuming that in mgo the only way to search for things matching a prefix is with a bson.RegEx?
<fwereade> hazmat, btw, did you have an opinion on https://code.launchpad.net/~julian-edwards/juju/maas-provider-non-mandatory-port/+merge/107577 ? it's been hanging a while
<fwereade> hazmat, it still LGTM
<niemeyer> Aram: Yeah, but when searching for simple prefixes (/^foo/) MongoDB optimizes that away and uses indexes etc
<Aram> yes, yes, I know.
<niemeyer> Aram: Cool :)
<hazmat> fwereade, yeah.. its on my list for today to merge
<niemeyer> Wood is running out.. I'll run out to buy some and be back soon to get trunk transitioned
<niemeyer> Woohay.. wood is burning, stocks replenished
<fwereade> niemeyer, how's the transition going?
<niemeyer> fwereade: Heya
<niemeyer> fwereade: Translating the paths now
<niemeyer> fwereade: Why, do you have something impending?
<niemeyer> Our tests are getting pretty slow..
<niemeyer> The environment ones, mainly
<fwereade> niemeyer, not quite, but I have a longish pipeline that's pretty well mapped out in my mind; but then, hmm: tomorrow is a public holiday
<niemeyer> fwereade: Oh, really? It's a public holiday here as well
<niemeyer> fwereade: Either way, the transition is done
<niemeyer> fwereade: lp.net/juju-core/juju is live
<niemeyer> Killing the old trunk right now
<fwereade> niemeyer, and I don't-entirely-depend-on-it-but-boy-howdy-it's-convenient rog's unit node branch
<fwereade> niemeyer, that doesn't need to come in for a while
<niemeyer> fwereade: I'll review it today still
<niemeyer> fwereade: I didn't bother because rog wasn't around
<fwereade> niemeyer, but I should be able to dash off an easy branch or two without it all the same
<fwereade> niemeyer, yeah, quite right
<niemeyer> I'm actually pondering if I should take this rare chance to actually stop for a couple of days
<niemeyer> (being a national holiday tomorrow and all)
<fwereade> niemeyer, sounds sensible :)
<mramm> niemeyer: rest up now while you still can
<mramm> and don't burn out!
<fwereade> tue/thu national holidays should generally be taken advantage of :)
<niemeyer> fwereade:+1 :-)
<niemeyer> mramm: Yeah, two sprints in a row, and another conf in a couple of weeks.. fun
<mramm> niemeyer: you have been traveling a lot
<niemeyer> mramm: Hey, but I need the miles.. ;-)
<mramm> haha
<niemeyer> It's actually a bit sad.. my wife and I have diverging ideas of what holidays should look like.. she wants to go out on some kind of trip to relax, and I want to stay at home and *not travel* for a change
<mramm> niemeyer: yea, I know how you feel
<mramm> vacation from traveling is nice
<niemeyer> It's been fun, though, honestly.. the amount of progress we've been doing on a daily basis feels so great.
<niemeyer> Aaaand.. I've committed the tombstone on the previous branch.
<niemeyer> We're officially moved!
<niemeyer> A long life to juju-core
 * niemeyer mails people
<niemeyer> fwereade: Forgot to update .lbox.. in case you've branched, please repull in a moment
<niemeyer> Woohoo!
<niemeyer> All Go branches were dealt with, and the move is now done
<mramm> niemeyer: Awesome!
<niemeyer> https://code.launchpad.net/juju/+activereviews
<niemeyer> No Go branches there
<niemeyer> https://code.launchpad.net/juju-core/+activereviews
<niemeyer> Nothing there either!
#juju-dev 2012-06-07
<wrtp> fwereade: mornin'
<TheMue> Good morning, Go-ers.
<wrtp> TheMue: hey!
<wrtp> TheMue: i seem to have problems branching the new juju-core project... is there anything i should know?
<TheMue> wrtp: Ah, here he is. Had a nice time with the Queen?
<wrtp> TheMue: she was lovely thanks. she sends fond regards.
<TheMue> wrtp: Oh, thank you. Send her my best wishes when you're next in for a cup of tea. *lol*
<wrtp> % bzr branch lp:juju-core/trunk juju-core
<wrtp> bzr: ERROR: Permission denied: "Cannot create 'trunk'. Only Bazaar branches are allowed."
<wrtp> wtf?
<TheMue> wrtp: Did not tried it yet, just started and updated the VM.
<TheMue> wrtp: Will try now and see what I get.
<wrtp> TheMue: cool, thanks
<TheMue> wrtp: Simple solution, just remove /trunk.
<wrtp> TheMue: remove /trunk from where?
<TheMue> wrtp: It's bzr branch lp:juju-core <<your dir name>>
<wrtp> TheMue: ah. that's odd, because every other project uses /trunk...
<TheMue> wrtp: ;)
<TheMue> wrtp: Even with bzr? Trunk as a dir I know from svn. But e.g. not in Mercurial based projects.
<wrtp> TheMue: yeah. for example goamz and gozk
<wrtp> TheMue: ha ha. it doesn't work with cobzr!
<TheMue> wrtp: I'm using plain bzr and switch/rename directories
<wrtp> TheMue: yeah. tbh, i think it's perhaps an oversight that niemeyer's not using lp:juju-core/trunk
<wrtp> TheMue: hmm, he *is* using /trunk: https://code.launchpad.net/~gophers/juju-core/trunk
<wrtp> TheMue: hmm, this worked ok: bzr  branch https://code.launchpad.net/~gophers/juju-core/trunk juju-core
<wrtp> TheMue: which is all i needed
<TheMue> wrtp: I think the command above leads to the same result.
<TheMue> wrtp: See "Get this branch:" on the web site.
<wrtp> TheMue: evidently it doesn't, because bzr branch lp:juju-core/trunk failed...
<TheMue> wrtp: There's written "bzr branch lp:juju-core"
<TheMue> wrtp: Nothing about a "/trunk".
<wrtp> TheMue: yeah, but that *should* be the same as bzr branch lp:juju-core/trunk
<wrtp> TheMue: which is what i've been using for all the other projects.
<wrtp> TheMue: (including juju itself)
<wrtp> (the python version)
<TheMue> wrtp: Maybe, here I'm not deep enough in the bazaar conventions.
<wrtp> TheMue: neither me :-)
<TheMue> wrtp: I don't even see a reason for appending trunk.
<wrtp> TheMue: it's just what gustavo told me to do ages ago...
<TheMue> wrtp: What would it be good for?
<wrtp> TheMue: all launchpad branches are of the form lp:project/branch
<TheMue> wrtp: OK, so the URI may contain a branch, but when no branch is given the trunk is taken.
<wrtp> TheMue: yup
<TheMue> Oh, family forces me to a second breakfast on the veranda. Brb.
<wrtp> TheMue: i feel your pain :-)
<TheMue> wrtp: The pain starts when returning to the computer.
<Aram> morning.
<TheMue> Hi Aram
<hazmat> wrtp, bzr branch lp:juju-core
<wrtp> hazmat: why doesn't bzr branch lp:juju-core/trunk work like it does for other projects?
<wrtp> hazmat: seems a bit odd
<wrtp> Aram: hiya
<hazmat> wrtp, because there is no 'trunk' series defined
<hazmat> wrtp, it looks like gustavo named it 'juju'
<hazmat> which is odd
<hazmat> so instead of 'trunk' you have bzr branch lp:juju-core/juju
<wrtp> hazmat: i agree that's odd. it's also odd, given that,  that this link works: https://code.launchpad.net/~gophers/juju-core/trunk
<hazmat> wrtp, right.. the actual branch is trunk (off the gophers group).. but the lp alias derives off  the series
<hazmat> changing the name of the series would fix that
<hazmat> at https://launchpad.net/juju-core/juju
<hazmat> i'll wait to niemeyer is around though
<wrtp> hazmat: yeah, he usually has a good reason for this kind of thing
<wrtp> hazmat: 'juju-core juju series [...] The "trunk" series represents [...] '
<hazmat> wrtp, i rather doubt there's a reason to it
<hazmat> well perhaps to hold the series as separate project containers
<hazmat> wrtp, the reason niemeyer rejected the proposals is so people could create new ones against the juju-core project
<wrtp> hazmat: i have no problem with that
<hazmat> https://www.windowsazure.com/en-us/manage/linux/
<hazmat> wrtp, just noticed you had put an existing proposal back into 'wip'
<wrtp> hazmat: oh, that was unintentional - i tried to do lbox propose -for lp:juju-core, but it found the old proposal
<hazmat> wrtp, no worries.. try the bzr merge --remember lp:juju-core thing first
<wrtp> hazmat: the WIP was so that i didn't make an actual proposal, but had the side effect, i see, of unrejecting it
<wrtp> hazmat: i'm branching from trunk then merging back into that, then proposing that
<hazmat> wrtp, lbox inspects the bzr info command to determine the submit/push branch
<wrtp> hazmat: which i hope should work
<wrtp> hazmat: new proposal https://codereview.appspot.com/6300060 (which is quite a nice number)
<TheMue> wrtp: First impression is good, but it's a deeper change and I've got to see where other parts of the still-to-port state code have to adapt to those changes.
<wrtp> TheMue: submitted, i'm afraid. consensus has it that there shouldn't be a particular problem with it...
<TheMue> wrtp: I thought it was a proposal.
<wrtp> TheMue: it was proposed a week ago, and LGTM'd by gustavo
<wrtp> TheMue: see https://codereview.appspot.com/6247066/
<TheMue> wrtp: OK, only wondered because you wrote "new proposal".
<wrtp> TheMue: it's only new because it had to be created anew for juju-core
<TheMue> wrtp: Ah, understand.
<Aram> today is a national holiday here, I think I'll be around mostly, though.
<wrtp> Aram: seems fairly quiet today...
<TheMue> In Germany some of the federal states have a public holiday too, we don't. So they drive into our towns to go shopping. ;)
<Aram> yeah, when it's a public holiday it's terrible, everything is closed.
<Aram> we have one single generic shop open in a city of 2M people.
<Aram> and even that shop is not non-stop.
<Aram> back in Romania I had 3 non-stop shops on my street, heh.
<Aram> btw, did you guys see the transit of venus yesterday?
<TheMue> Aram: No, too early, will look at it next time. *lol*
<Aram> lol.
<wrtp> hazmat: does remove-unit ever terminate a machine?
<hazmat> wrtp, no
<hazmat> wrtp, the machine is still allocated, and can be used for new units
<hazmat> wrtp, it's a bit bogus atm though
<hazmat> because we don't isolate the units from the root fs
<wrtp> hazmat: i was thinking about dave cheney's problem with the provisioning agent.
<hazmat> so they'll have pre-existing state on them
<wrtp> hazmat: i see
<hazmat> terminate-machine will kill an unused machine
 * hazmat checks for dave's email
<wrtp> hazmat: i'm wondering if it might be best if terminate-machine merely marks the machine as "terminated" and only the provisioning agent actually removes it from the state, once the machine has been really terminated.
<wrtp> hazmat: then there's no need to be able to list all machines in the environment, i *think*.
<hazmat> wrtp, it removes the machine state
<wrtp> hazmat: currently, yeah.
<hazmat> wrtp, the provisioning agent marks machines it creates in some fashion
<hazmat> so it can identify them
<hazmat> in ec2 it uses security groups
<wrtp> hazmat: that's true.
<hazmat> and then it deltas between state and provider instances
<hazmat> and removes those with the mark but no state
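The reconciliation hazmat describes (mark the instances you start, then terminate any marked instance with no state record) reduces to a set difference. A minimal sketch, assuming a toy Instance type and a set of instance ids taken from state; none of these names are juju-core's actual API:

```go
package main

import "fmt"

// Instance is a toy stand-in for a provider instance record.
type Instance struct {
	Id string
}

// unknownInstances returns the instances the provider reports that
// carry our mark (e.g. the juju security group) but have no entry in
// state; those are the ones the provisioning agent may terminate.
func unknownInstances(provider []Instance, inState map[string]bool) []Instance {
	var unknown []Instance
	for _, inst := range provider {
		if !inState[inst.Id] {
			unknown = append(unknown, inst)
		}
	}
	return unknown
}

func main() {
	provider := []Instance{{"i-1"}, {"i-2"}, {"i-3"}}
	inState := map[string]bool{"i-1": true, "i-3": true}
	fmt.Println(unknownInstances(provider, inState)) // [{i-2}]
}
```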
<wrtp> hazmat: i was wondering if it's actually necessary to be able to do that
 * hazmat doesn't understand the question
<hazmat> wrtp, as opposed to synchronous rpc to the provisioning agent?
<hazmat> you can also mark the state as deleted and have the provisioning agent subscribe not just to children but  also to contents..
<wrtp> hazmat: i don't think the provisioning agent subscribes to children
<hazmat> the firewall component of the provisioning agent has some nested watches for exposed port management
<wrtp> hazmat: i think it subscribes to topology
<hazmat> wrtp, oh.. yeah.. right
<hazmat> same principle though, you could mark the deletion in the topology
<hazmat> wrtp, what's the alternative you're thinking about?
<wrtp> hazmat: yeah, that's what i'm suggesting
<hazmat> wrtp, yes you should go down that road
<hazmat> wrtp, see the stop protocol proposal.. as a background
<wrtp> hazmat: rather than simply deleting the machine from the topology and letting the provisioning agent do the delta.
<wrtp> hazmat: link?
<hazmat> http://bazaar.launchpad.net/~hazmat/juju/unit-stop/view/head:/source/drafts/stopping-units.rst
<hazmat> wrtp, it would be nice if go juju had some proposals...
<hazmat> ie. document architecture choices
<hazmat> this is one thing we were really good at during the beginning of juju dev
<wrtp> hazmat: i guess atm our proposals are all "do what py juju does" :-)
<hazmat> that we fell off from; it would be good to resurrect
<hazmat> wrtp, except when its not
<wrtp> hazmat: true.
<hazmat> pretty much everything different is undocumented
<hazmat> the zk interaction etc.
<wrtp> hazmat: that's true. i'd definitely like to see more docs on that. although in the code would be fine for me too.
<hazmat> wrtp, in the code sort of defeats the purpose of a proposal's intent, unless it's extractable prose
<hazmat> api docs are not really architecture documentation
<wrtp> hazmat: isn't a "proposal" about something that might be, rather than something that is?
<wrtp> hazmat: BTW i like the intent of the stop proposal, but i'm of two minds.
<hazmat> wrtp, it's prose documentation about the impl
<hazmat> and architecture
<hazmat> we did typically do it in advance, but the value is historical as well
<wrtp> hazmat: as an operator i'd like to be able to do terminate-machine and have it just stop, not potentially hang waiting for buggy stop hooks to complete.
<hazmat> because niemeyer wanted to agree on architecture before impl for most bits as a coordination/go-slow point
<hazmat> wrtp, that's covered in the proposal
<hazmat> wrtp, the supervision tree is still in effect
<hazmat> it just gives a chance/notification and wait period for the buggy stop hook
<hazmat> as opposed to ruthless termination which precludes any form of child cleanup/activity
<wrtp> oh yeah, i see
 * wrtp hates timeouts in general though :-)
<wrtp> although i know they're unavoidable in general
<hazmat> wrtp, your alternative suggestion is eagerly awaited ;-)
<hazmat> coffee break, bbiam
<wrtp> hazmat: just kill the machine. any system should be able to cope with that as a matter of course anyway...
<hazmat> wrtp, right.. that's what we do now
<hazmat> parent's kill children
<wrtp> hazmat: what's the parent of a machine node?
<hazmat> wrtp, the provisioning agent
<hazmat> the environment from a state perspective
<hazmat> but the point is that's problematic for a coordination system to not coordinate :-)
<hazmat> for example the  switch to contained unit states in the service is something that should get documented as part of the zk tree layout docs we have.
<wrtp> hazmat: agreed.
<hazmat> i'd suggest bringing in juju/docs into the goport tree or into juju-core as a separate series
<hazmat> and updating it there
<wrtp> hazmat: good idea.
<wrtp> hazmat: i think that and the (as yet unused?) presence node stuff are the only major deviations so far
<wrtp> hazmat: but TheMue might know of more
<hazmat> wrtp, well pinger nodes for ephemeral nodes as well
<hazmat> wrtp, and the zk session expiration tomb behavior
<wrtp> hazmat: that's what i meant by "presence node stuff"
<wrtp> hazmat: is that a change visible in zk?
<wrtp> (the session expiration behaviour, that is)
<hazmat> wrtp, its an architecture detail thats worth documenting
<hazmat> considering it was given as one of the main reasons to port to go..
<hazmat> that might be nice
<hazmat> we have lots of non zk state things documented
<TheMue> wrtp: Saw my nick. What should I know more about?
<hazmat> TheMue, pinger presence nodes
<hazmat> as something to be documented
<wrtp> TheMue: deviations of go juju state implementation from py juju state implementation
<wrtp> hazmat: some of this stuff could definitely do with some more documentation, but i feel that the session expiration behaviour, in particular, is an implementation detail that would read well in the code, but isn't inevitable from the architecture of the system.
<TheMue> hazmat, wrtp: Just took a look for orientation. I've never touched them, sorry.
<wrtp> hazmat: although some internal implementation overview docs might be good too, i guess.
<wrtp> hazmat: most of the stuff in docs/source/internals seems to be about higher-level and zk stuff, rather than how stuff is actually done in the python itself
<wrtp> hazmat: is there a place that i can see all the proposals like your stop proposal? (just the mailing list archive?)
<hazmat> wrtp, agreed and the session expiration handling is also a higher level detail
<wrtp> hazmat: i guess so
<hazmat> wrtp, they're published as branches under code.launchpad.net/juju
<wrtp> hazmat: hmm, useful :-)
<hazmat> there isn't a good index of just the branches
<hazmat> that are docs
<hazmat> i've also got rest-api, security, environment-settings proposals extant there
<wrtp> hazmat: might be useful if the branches were named xxx-proposal
<hazmat> true
 * Aram is using Go linear time regular expressions to validate PCRE regular expressions so that I can be sure they are also linear time.
<Aram> this seems... weird.
<hazmat> the nodejs package manager is really quite nice
<Aram> I heard the same about it as well.
<Aram> IMO node.js is a terrible way of doing things, but I actually like the guys because they are bold enough to try something different.
<Aram> people are afraid to try new things these days.
<TheMue> Aram: Not only these days. Leaving well known paths hurts many people.
<Aram> yeah, I was using "these days" in a very loose sense, more likely "since the dawn of humanity" :).
<TheMue> Aram: OK, this timespan matches.
<wrtp> Aram: doesn't it depend very much on the input text?
<hazmat> Aram, agreed, client/server language unification is nice, but via callback hell.. ick..
<hazmat> Aram, their tooling and underlying event reactor usage (libev) is quite nice though
<Aram> wrtp: yes, depends on the input text; that's what I am using with the first Go regexp, to validate it before sending it to PCRE.
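Aram's trick leans on Go's regexp package implementing the RE2 subset, which rejects backtracking-only constructs such as backreferences at compile time. A minimal sketch of that validation step; as wrtp points out, RE2 acceptance narrows but does not eliminate PCRE's backtracking risk, since behaviour still depends on the input text:

```go
package main

import (
	"fmt"
	"regexp"
)

// withinRE2 reports whether pattern compiles under Go's regexp
// package, i.e. stays inside the RE2 subset. Constructs like
// backreferences and lookaround are rejected at compile time.
func withinRE2(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(withinRE2(`^[a-z]+[0-9]*$`)) // true
	fmt.Println(withinRE2(`(\w+)\s+\1`))     // false: backreference
}
```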
<TheMue> Anyone interested in a small proposal: https://codereview.appspot.com/6305067 ?
<wrtp> TheMue: LGTM
<wrtp> TheMue: although i don't seem to be getting proposal emails any more.
<TheMue> wrtp: Thx
<wrtp> TheMue: i saw my own reply, but not your original proposal
<TheMue> wrtp: Maybe due to the URI I used.
<wrtp> TheMue: which URI was that?
<TheMue> wrtp: First bzr push --remember lp:~themue/juju-core/go-state-relation-endpoint-verification and then lbox propose -cr -for lp:juju-core
<TheMue> wrtp: Both times I used juju-core.
<wrtp> TheMue: it should've worked, i think
<TheMue> wrtp: I would expect it too.
<TheMue> wrtp: Your proposal using juju popped up instead.
<Aram> "launchpad.net/juju-core/juju/state"
<Aram> what an ugly import
<Aram> can we do better?
<Aram> my bzr-fu is lacking; why does bzr branch lp:juju-core put everything in juju-core, as opposed to juju-core/juju as it should be for the import paths to work?
 * Aram is confused.
<TheMue> Aram: There's a mail by Gustavo on juju-dev about the naming.
<Aram> I've seen it. still confused about bzr branch behavior though
<TheMue> Aram: I'm coming more from svn and hg, so I'm sometimes confused too.
<Aram> likewise.
#juju-dev 2012-06-08
<wrtp> davecheney: mornin' boss
<davecheney> wrtp: woop woop
<wrtp> davecheney: how's tricks?
<davecheney> wrtp: so, i added a feature to environ.Instances: if you pass nil, you get whatever it knows about at that point
<davecheney> you may find it upsetting
<wrtp> davecheney: i think it's probably the best way forward at this point
<wrtp> davecheney: another alternative was to add a bool "all" argument to request all instances in addition to the ones requested.
<davecheney> http://paste.ubuntu.com/1029872/
<davecheney> the PA calls environ.Instances(nil) in a loop anyway
<davecheney> so I think we can cope with it eventually discovering all machines
<wrtp> davecheney: seems reasonable. what a pity ec2 makes everything so darn hard.
<davecheney> wrtp: it has to be, it's cloud scale
<davecheney> wrtp: http://paste.ubuntu.com/1029874/
<davecheney> I think it's fine to call this in a loop
<davecheney> environ.Instances(nil), that is
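A sketch of the contract davecheney describes, where passing nil to Instances means "everything the environ currently knows about" while a non-nil list filters by id. Environ here is a cut-down stand-in for illustration, not the real interface:

```go
package main

import "fmt"

// Instance stands in for a provider instance.
type Instance struct{ Id string }

// Environ is a toy version of the provider abstraction discussed
// above, holding whatever instances it currently knows about.
type Environ struct {
	known map[string]Instance
}

// Instances returns all known instances when ids is nil, otherwise
// just the instances matching the requested ids.
func (e *Environ) Instances(ids []string) []Instance {
	var out []Instance
	if ids == nil {
		for _, inst := range e.known {
			out = append(out, inst)
		}
		return out
	}
	for _, id := range ids {
		if inst, ok := e.known[id]; ok {
			out = append(out, inst)
		}
	}
	return out
}

func main() {
	e := &Environ{known: map[string]Instance{"i-1": {"i-1"}, "i-2": {"i-2"}}}
	fmt.Println(len(e.Instances(nil)), len(e.Instances([]string{"i-2"}))) // 2 1
}
```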
<wrtp> davecheney: i suppose so, but the eventual consistency stuff seems a little unnecessary.
<davecheney> wrtp: it does sound a little overengineered
<davecheney> it's not like the console is fast or anything because of it
<wrtp> :-)
<wrtp> davecheney: i thought the provisioning agent only needed to get the instance list on startup
<wrtp> davecheney: at least, that's how i understood your message on the mailing list
<TheMue> Morning.
<davecheney> wrtp: well, the pythonic version did it periodically
<wrtp> TheMue: hiya
<davecheney> TheMue: hello!
<davecheney> wrtp: also, findUnknownInstances() can probably cache its results after the first run
<wrtp> davecheney: i suppose that's useful if there might be several provisioning agents running concurrently.
<davecheney> it's more to fix the problem of
<davecheney> add machine, stop PA, remove machine, start PA == instance is lost and not shut down
<wrtp> davecheney: yeah, but that could happen in a 2nd PA instance, i think, so the "unknown instance" thing could happen for the first PA instance even when it wasn't shut down.
<davecheney> wrtp: i'm aware of that, but not putting any effort towards concurrent PAs at this point
<wrtp> davecheney: ok.
<davecheney> TheMue_: internet troubles ?
<TheMue> davecheney: Mobile access. ;)
<davecheney> TheMue: i'll speak slowly then
<davecheney> TheMue: mobile data speeds in australia are pitiful
<TheMue> davecheney: Here it's mostly ok.
<davecheney> bugger, after implementing code that shuts down unknown instances, the other side of the unit tests has broken
<davecheney> FAIL: provisioning_test.go:245: ProvisioningSuite.TestProvisioningDoesNotProvisionTheSameMachineAfterRestart
<davecheney> Error: provisioner started an instance
<davecheney> TheMue: wrtp: have you ever had visibility issues between different zookeeper connections
<wrtp> davecheney: no
<davecheney> ie, you write something with connection A, but it isn't visible to connection B?
<wrtp> davecheney: using the same server?
<davecheney> yup, localhost
<wrtp> davecheney: nope. i suspect a bug in your code :-)
<davecheney> yeah, good call
<fwereade> davecheney, wrtp: I have suspicions that two connections to the same local server *can* have rather different views of what time it is
<wrtp> fwereade: interesting. more than just transient phase difference you mean?
<davecheney> fwereade: the problem I am seeing is, in a test, I start an instance, then write the provider id back to the state with machine.SetInstanceId
<fwereade> davecheney, wrtp: they'll see the same history, sure, but I don't think any guarantees are made about distinct connections being synchronized in any way
<davecheney> then I close that state connection, open another one, but the Id is not there
<wrtp> fwereade: aren't they looking at the same underlying state?
<davecheney> http://codereview.appspot.com/6307049/diff/2001/cmd/jujud/provisioning_test.go
<davecheney> ^ line 245 etc
<fwereade> wrtp, I'm afraid I don't know exactly what form that underlying state takes; but given the terms in which the guarantees are couched I wouldn't find it surprising
<wrtp> davecheney: my first inclination would be to put a debugging print inside zk to print out what attributes are being set
<davecheney> yeah, i'll keep digging
<wrtp> fwereade: i think i would, but then again i haven't delved too deeply into zk storage internals.
<davecheney> this is almost certainly my fault
<wrtp> fwereade: when you've a moment, a chat about upgrade would be good
<fwereade> wrtp, specifically:
<fwereade> Timeliness
<fwereade>     The clients view of the system is guaranteed to be up-to-date within a certain time bound. (On the order of tens of seconds.) Either system changes will be seen by a client within this bound, or the client will detect a service outage.
<wrtp> fwereade: surely that applies only when you've got multiple servers?
<davecheney> wrtp: i'd think so
<davecheney> crap
<davecheney> that is weird
<fwereade> Sometimes developers mistakenly assume one other guarantee that ZooKeeper does not in fact make. This is:
<fwereade> Simultaneously Consistent Cross-Client Views
<fwereade> davecheney, wrtp: but I *hadn't* previously noticed the sync() call they mention for getting round this
<fwereade> davecheney, wrtp: all the above is from the section "Simultaneously Consistent Cross-Client Views"
<fwereade> er, http://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#ch_zkGuarantees
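The sync() pattern the guarantees page recommends looks roughly like this. Client is a hypothetical interface standing in for the zookeeper binding (the real method names may differ); the point is the ordering, not the exact API:

```go
package zkexample

// Client is a hypothetical zookeeper client interface. ZooKeeper
// guarantees connection B eventually sees what connection A wrote,
// but two live connections are not synchronized unless the reader
// issues a sync first.
type Client interface {
	Set(path, value string) error
	Get(path string) (string, error)
	Sync(path string) error // bring this client's view up to date
}

// readAfterWrite shows the pattern from the guarantees page: after A
// writes, B calls Sync before reading so it cannot observe a stale
// view -- consistent with what davecheney's test is hitting.
func readAfterWrite(a, b Client) (string, error) {
	if err := a.Set("/machines/0", "i-12345"); err != nil {
		return "", err
	}
	if err := b.Sync("/machines/0"); err != nil {
		return "", err
	}
	return b.Get("/machines/0")
}
```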
<wrtp> fwereade: "ZooKeeper by itself doesn't guarantee that changes occur synchronously *across all servers*"
<davecheney> which is fair
<wrtp> fwereade: i still think it's talking about cross-server consistency, not single-server consistency
<wrtp> fwereade: i may of course be wrong :-)
<fwereade> wrtp, Single System Image
<fwereade>     A client will see the same view of the service regardless of the server that it connects to.
<wrtp> fwereade: i'm not sure i see the relevance of that
<wrtp> fwereade: we've only got one server.
<fwereade> wrtp, for that to hold, a client can't just be getting the latest state from whatever server it happens to connect to... can it?
<fwereade> wrtp, I agree that I haven't seen anything specifically stating that there are no additional guarantees with single servers
<wrtp> fwereade: i don't see why not. isn't that the whole point, in fact?
<fwereade> wrtp, I think the whole point is that everyone sees the same history but not necessarily at the same time
<wrtp> fwereade: i guess i'm just having trouble seeing how this would happen in the single-server case. a write would have to put the data into a slow pipeline, to be written eventually, but the write consistency guarantees mean that a write must always see the latest view, i think, so on a single server i can't see that happening.
<wrtp> davecheney: you could try putting a sleep after the zk write and see if that made a difference.
<davecheney> fwereade: sounds like 'everyone will see the same events, in the same order, eventually'
<davecheney> uh oh, australia is lagging out
<davecheney> wrtp: nope didn't help, _but_ if I did NOT call st.Close() on the first connection, before opening the second everything worked
<fwereade> davecheney, I don't even think it guarantees that everyone will see the same events... just a consistent series of state snapshots, if you like
<wrtp> davecheney: interesting. maybe the close is losing events or something.
<davecheney> 'everyone will see most of what happened, probably in order'
<davecheney> wrtp: indeed, I wonder if we need an explicit flush
<wrtp> davecheney: the close is happening in the same thread, right?
<fwereade> davecheney, I think it does guarantee "in order", if somewhat more weakly than I would prefer
<davecheney> wrtp: yes, i kill the tomb, which closes in a defer
<wrtp> davecheney: you might want to try reproducing the behaviour in a less convoluted setting.
<davecheney> wrtp: yes
<davecheney> wrtp: i'm certainly seeing some scary things happen around close
<davecheney> will do a simple test case
<wrtp> fwereade: ping
<fwereade> wrtp, pong
<wrtp> fwereade: upgrades...
<fwereade> wrtp, ah yes
<wrtp> fwereade: this was my first thought: http://paste.ubuntu.com/1030040/
<wrtp> fwereade: but then i realised that "exit and let upstart start new version" is asking for trouble
<fwereade> wrtp, really? handing responsibility over to upstart seems pretty sane to me
<fwereade> wrtp, what are you worried about?
<wrtp> fwereade: i'm worried about what happens if someone uploads a dodgy set of tools. suddenly everything will break.
<fwereade> wrtp, hm
<wrtp> fwereade: i have a solution
<wrtp> fwereade: which you might or might not like
 * fwereade listens
<wrtp> fwereade: it would be nice if a program could upgrade itself, but it can't do that and still have upstart handle the case where it crashes.
<wrtp> fwereade: so the idea is to add another "upgrader" command
<wrtp> fwereade: which handles the rendezvous between the old and the new versions
<fwereade> wrtp, that sounds fine until there's a problem with the upgrader :p
<fwereade> wrtp, actually, I think I see another problem
<wrtp> fwereade: the idea is that the upgrader is simple enough that it never needs upgrading itself.
<fwereade> wrtp, sometimes the args we use to start the agents will change
<wrtp> fwereade: that's fine. i've catered for that.
<fwereade> wrtp, so we can't necessarily just reuse the upstart script
<fwereade> ah
<fwereade> cool :)
<fwereade> wrtp, I'm reasonably happy with the upgrader idea
<wrtp> fwereade: the idea is that when you upgrade, you actually run both programs together for a while.
<wrtp> fwereade: the new program connects to the state, does some checks and then says "ok, i'm ready to start"
<fwereade> wrtp, my only objection is that "the upgrader is simple enough that it never needs upgrading itself" feels a touch optimistic to me
<fwereade> wrtp, I don't really like running 2 at a time
<wrtp> fwereade: then the old program shuts down and the upgrader tells the new program to go ahead
<wrtp> fwereade: the nice thing about running them both at the same time is that you get zero down time
<fwereade> wrtp, yeah, it's just my gut reaction
<fwereade> wrtp, I will probably come around to it
<wrtp> fwereade: i've built the upgrader part already (although i've still got some compiler errors)
<wrtp> fwereade: i think it's possible to do some exhaustive verification of it to check that it's correct
<wrtp> fwereade: it's 280 lines of code
<wrtp> fwereade: which is bigger than i'd hoped, but still pretty small
<wrtp> fwereade: at the moment, the assumption is that it talks to the commands that it runs via stdin and stdout. that could change.
<fwereade> wrtp, cool
<fwereade> wrtp, sorry, I have to go help cath with the car for a bit :/
<wrtp> fwereade: np
<fwereade> wrtp, idea sounds broadly sane but I'd like to see an implementation ;)
<wrtp> fwereade: later this morning, i hope :-)
<fwereade> wrtp, sweet
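A rough sketch of the rendezvous wrtp describes: the new version is started alongside the old, reports over stdout once its checks pass, and is told over stdin to take over only after the old version has shut down. The "ready"/"go" lines and helper names are invented for illustration, not the actual upgrader:

```go
package upgrader

import (
	"bufio"
	"fmt"
	"os/exec"
)

// runUpgrade starts the new agent binary, waits for it to report that
// its state checks passed, stops the old agent, then tells the new
// one to proceed. stopOld is whatever shuts the old agent down.
func runUpgrade(newTools string, stopOld func() error) error {
	cmd := exec.Command(newTools)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return err
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}
	// The new version connects to state, verifies itself, and prints
	// "ready" when it is safe to hand over.
	line, err := bufio.NewReader(stdout).ReadString('\n')
	if err != nil || line != "ready\n" {
		cmd.Process.Kill()
		return fmt.Errorf("new version failed its checks")
	}
	if err := stopOld(); err != nil {
		return err
	}
	// Only now does the new version take up its duties, giving the
	// zero-downtime handover discussed above.
	_, err = fmt.Fprintln(stdin, "go")
	return err
}
```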
<fwereade_> TheMue, is it sane to have settings on a service relation?
<TheMue> fwereade_: Today we have settings on a relation. But I don't know yet how they are used.
<fwereade_> TheMue, it seems to me that *unit* relation settings are sane (and are set by units in their hooks) and *service* settings are sane (set by the user via command line)
<fwereade_> TheMue, but that we don't currently have a use case for *relation* or *service relation* settings
<TheMue> fwereade_: Did you check Py state where those settings are changed?
<TheMue> fwereade_: Not that we only create that node and nothing else.
<fwereade_> TheMue, what I'm dancing towards is a suggestion that we drop all creation of role/settings nodes in state.AddRelation, because I think they're meaningless until we add units
<TheMue> fwereade_: Sounds reasonable.
<fwereade_> TheMue, sweet, I think it makes things simpler
<TheMue> fwereade_: Yes. Not much, but every bit counts.
<Aram> morning.
<wrtp> Aram: hiya
<TheMue> Aram: Moin.
<fwereade_> Aram, heyhey
<wrtp> fwereade_: here's the WIP. i've compiled it only, no tests yet. https://codereview.appspot.com/6307061
<fwereade_> wrtp, cheers
<fwereade_> TheMue, are there any existing tests for the scope stuff in AddRelation?
<TheMue> fwereade_: Only the validation of the value, it's in my last proposals.
<TheMue> fwereade_: What exactly do you want to see tested?
<fwereade> TheMue, sorry, lost history; did you respond?
<TheMue> fwereade: Hehe, modern technology.
<TheMue> fwereade: I said that my latest proposal does a value validation, but nothing else.
<TheMue> fwereade: And I asked what do you want to see tested.
<TheMue> fwereade: We've not yet reached the parts that add container scoped paths in ZK.
<fwereade> TheMue, what is meant to happen when the scope doesn't match
<fwereade> TheMue, that's edging its way towards a proposal on my end
<TheMue> fwereade: One moment.
<TheMue> fwereade: https://bugs.launchpad.net/juju-core/+bug/1007373 See latest question, it is still open.
<TheMue> fwereade: Today, in Py as in Go, only one endpoint has to be container-scoped, that's enough.
<fwereade> TheMue, I think that's correct; but we need to verify that we store ScopeContainer on each
<fwereade> TheMue, that's what you do, but it's not tested AFAICS
<TheMue> fwereade: You're right.
<fwereade> TheMue, I was also wondering why we return the ServiceRelations from AddRelation; seems to me it would make more sense to return just the Relation, and give that a Services method
<TheMue> fwereade: Did you check the callers of AddRelation()?
<fwereade> TheMue, there aren't any
<TheMue> fwereade: In Py?
<TheMue> fwereade: Interesting.
<fwereade> TheMue, but if there were, they could just as easily extract the service relations from the service
<fwereade> TheMue, sorry, not in py
<TheMue> fwereade: Then you maybe see the reason.
<TheMue> fwereade: But a relation doesn't have enough information for all services, it would have to dynamically retrieve them again. So maybe it's better to only return the service relations; they have a Relation() method.
<fwereade> TheMue, sorry, I don't see the reason
<fwereade> TheMue, the only client discards the result
<TheMue> fwereade: Hmm, then the old reason for it may be gone.
<fwereade> TheMue, and honestly I don't see what good the Relation type actually does (maybe status?)
<fwereade> TheMue, given a service, I will definitely want to know what relations it's in
<fwereade> TheMue, but for that I want ServiceRelations, I think
<fwereade> TheMue, hmm, actually, even for status a bare Relation type is pretty useless I think
<fwereade> TheMue, I think it should actually be AddRelation(...) error
<TheMue> fwereade: It's used in RemoveRelation() ;)
<fwereade> TheMue, why don't we just pass the same endpoints, calculated in the same way?
<fwereade> morning mramm
<TheMue> fwereade: We can do, no problem.
<TheMue> mramm: Moin.
<fwereade> TheMue, I don't *think* there are any other use cases... but I may be missing something
<TheMue> fwereade: But that change would affect all callers too.
<fwereade> TheMue, how many?
<fwereade> TheMue, AFAICS the only use of get_relation_state is in remove-relation, which then uses it to... remove the relation
<TheMue> fwereade: I don't know. I only want to take care that a redesign of today's API, instead of pure porting, doesn't cost too much time (up to 12.10).
<TheMue> fwereade: And you know, our last API change has now gone almost completely back to the former solution.
<fwereade> TheMue, but a bit of checking of the existing API, and seeing that parts of it are 100% useless, saves us an awful lot of implementation time
<fwereade> TheMue, which one?
<TheMue> fwereade: I'm not sure enough of today's Py code and the history which led to it. So if you see opportunities in those changes please talk to Gustavo.
<TheMue> fwereade: The topology related relation stuff we talked about at UDS.
<fwereade> TheMue, this one:      Services  map[string]*topoRelationService
<fwereade>  ?
<TheMue> fwereade: That's one of the parts, and some implementation details. The last proposals almost looked like the first ones.
<fwereade> TheMue, well, I guess this is the same area... but you are aware that we're burning time implementing putative multi-endpoint relation support that isn't on any roadmap I know of?
<fwereade> TheMue, based on that I can deal with the data format change
<TheMue> fwereade: Yep
<fwereade> TheMue, ok, I just don't understand why it's worth spending time on this
<TheMue> fwereade: By the way, did you read my error handling mail?
<fwereade> TheMue, I thought you seemed to make a solid case but I haven't got much more of a response
<fwereade> TheMue, anyway, sorry about that whole review cycle
<fwereade> TheMue, I have no idea on what basis niemeyer makes that sort of API decision though
<fwereade> TheMue, the AddRelation(args...) business is a known ugliness in python which I'm sure niemeyer complained about himself at one stage
<TheMue> fwereade: That's no problem. It's only that I'm maybe not the best discussion partner for some ideas due to my lack of the project history and many design motivations.
<fwereade> TheMue, yeah, understood -- but I think it's worthwhile talking to you about these things anyway
<TheMue> fwereade: That's true.
 * fwereade tries to figure out what he's actually doing
<fwereade> TheMue, do we have Service.Relations() yet?
<fwereade> TheMue, wait, sorry, ignore me
<TheMue> fwereade: No
<TheMue> fwereade: I'm currently taking a deeper look into our code base regarding error handling.
<fwereade> TheMue, ok, cool
<TheMue> fwereade: There are many places where errors are just passed up. Not only ZK, topology or os.
<fwereade> TheMue, I think the thing I *need* now is Service.Relations; I'll get to work on that
<fwereade> TheMue, I think we should also fix the interface to AddRelation and RemoveRelation
<TheMue> fwereade: IMHO we should separate errors containing possible additional information (as their own types) from the visualization of those errors as a message, log entry, or something else at the UI level (commands, logs, web).
<fwereade> TheMue, would you have any objection to a CL that combined the two? they're somewhat intertwined
<TheMue> fwereade: Combining Add and Remove?
<fwereade> TheMue, making Add return error, and Remove take endpoints... like add
<TheMue> fwereade: So instead of a relation to remove I would have to get the endpoints, then (internally) search for a matching relation and then delete it? Where do you get the EPs from? And did you take a look into today's RemoveRelation code?
<fwereade> TheMue, how do you get a relation instance to remove now?
<fwereade> TheMue, the answer is "you can't"
<TheMue> fwereade: Are there no callers of remove_relation_state in Py today?
<fwereade> TheMue, the only caller uses a relation it got from get_relation_state, which takes endpoints and returns a relation
<fwereade> TheMue, that place is the only client of get_relation_state
<fwereade> TheMue, the actual operation is "given these endpoints, which came from the command line, remove the corresponding relation"
<fwereade> TheMue, just like Addrelation but in reverse
<TheMue> fwereade: OK, I see. So it sounds reasonable.
<fwereade> TheMue, cool, thanks
<Aram> I wonder why Gustavo made mgo take bson.{M,D} arguments instead of using reflection to determine the type of arguments, it's not like you buy any type safety since you can put whatever crap you want inside the bson objects.
<Aram> err = k.nodes.Update(bson.M{"_id": parent}, bson.M{"$push": bson.M{"children": path}})
<Aram> there's a lot of bson.M noise.
<Aram> if you use bson.D it's even worse.
<Aram> surely we can do better.
<wrtp> Aram: what would you want that update line to look like?
<Aram> thinking.
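One way to trim the noise without touching mgo itself is a small local helper around the update document. push here is purely a convenience defined for this sketch, not an mgo API, and the import path depends on the mgo version in use:

```go
package example

import (
	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
)

// push builds the {"$push": {field: value}} update document so call
// sites read better. It is a local helper, not part of mgo.
func push(field string, value interface{}) bson.M {
	return bson.M{"$push": bson.M{field: value}}
}

// addChild is equivalent to the Update line quoted above, with the
// nested bson.M noise hidden behind the helper.
func addChild(nodes *mgo.Collection, parent, path string) error {
	return nodes.Update(bson.M{"_id": parent}, push("children", path))
}
```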
<hazmat> jimbaker, the branch is looking much nicer
<wrtp> fwereade: did you have a look at the upgrade interface in https://codereview.appspot.com/6307061/diff/1/upgrade/upgrade.go ? does it look kinda reasonable? all agents would call Start, and then Upgrade when they need to upgrade.
<fwereade> wrtp, sorry: yeah, it looks sensible, but for some reason I'm a bit suspicious of stdin/stdout
<fwereade> wrtp, trying to figure out whether I'm just being irrational
<wrtp> fwereade: yeah, i wondered about that.
<wrtp> fwereade: it would be perfectly possible to use e.g. unix sockets in the same framework
<fwereade> wrtp, indeed, but then I'm not really sure what that buys us
<wrtp> fwereade: but in the end, stdin/stdout seem to give us exactly what we want, i think.
<fwereade> wrtp, indeed :)
<wrtp> fwereade: i don't think we're gonna use it for anything else
<fwereade> wrtp, consider my reaction to be "tentative approval" then :)
<wrtp> fwereade: and i've added some safeguards in there in case anyone prints random stuff
<wrtp> fwereade: thanks
<wrtp> fwereade: i think the upgrader implementation is really quite neat, BTW. i'm happy to have written it, even if it goes in the bin on monday.
<fwereade> wrtp, haha, yeah, it's nice :)
<wrtp> fwereade: in particular the way that cmd.run recursively calls itself in a new goroutine and talks to the new version of itself via a channel.
<wrtp> fwereade: i started doing it without goroutines and realised it was surprisingly hard to do nicely.
<fwereade> wrtp, ++goroutines :)
<wrtp> fwereade: +1
<fwereade> TheMue, if you have a moment I would appreciate a look over https://codereview.appspot.com/6303060 which we discussed earlier
<TheMue> *click*
<wrtp> hmm, linux really isn't good at coping with runaway processes. whatever happened to time sharing?
<TheMue> fwereade: It will conflict a bit with my latest validation, but so far LGTM.
<TheMue> fwereade: I'm interested what niemeyer will say.
<Aram> I've written the longest commit message of my life
<Aram> http://paste.ubuntu.com/1030536/
<Aram> TheMue: fwereade wrtp: I don't understand something about bzr, say I did these 5 commits in my branch. I want to propose a merge. After the merge is done, will my 5 changes appear in the trunk history or just the merge proposal?
<wrtp> Aram: just the merge proposal
<Aram> hmm.
<Aram> that's unfortunate.
<wrtp> Aram: if you want that commit to appear, you should have it as part of the merge proposal description
<wrtp> Aram: that's what we'll read when looking over the CL anyway
<wrtp> Aram: so you might as well put it there.
<fwereade> TheMue, cool
<Aram> thanks. I know this is how contributing to the Go project also works, but I find it counterintuitive in a distributed versioning system world. it feels more like cvs to me.
<wrtp> Aram: seems reasonable to me. the commits aren't lost, they're just one level deeper.
<Aram> oh? You can still access them?
<wrtp> Aram: yeah
<Aram> how?
<wrtp> Aram: bzr log -n 0
<wrtp> Aram: or -n 2 to show just two levels
<Aram> interesting.
<Aram> ugh, lbox uses exp/html
<Aram> I guess I should install the ppa?
<wrtp> Aram: yeah, i just install the ppa
<Aram> wrtp:
<Aram> 2012/06/08 17:50:06 Authenticating in Launchpad...
<Aram> Go to your browser now and authorize access to Launchpad.
<Aram>  
<Aram> how do I do that?
<wrtp> Aram: i guess your browser should have come up with a new window, assuming you're using a standard web browser. i doubt it'll work with lynx :-)
<Aram> ah, damn.
<Aram> this is a headless box
<Aram> hmm...
<hazmat> Aram, you can do it on a box with a head, and transfer the auth token
<hazmat> Aram, actually you don't even need to do that
<hazmat> Aram, you can go to lp, and authorize the app separately
<hazmat> and copy the token
<hazmat> maybe.. ;-0
<hazmat> hmm.. no that doesn't quite work
<hazmat> i was looking at https://launchpad.net/~<your_id>/+oauth-tokens
<hazmat> you can't actually retrieve the token there though, just manage the apps
<Aram> yes, no option to add
 * Aram is installing a browser in his headless box
<hazmat> hmm.. it might work with lynx
<Aram> great, chrome doesn't like remote X.
<wrtp> lol
<Aram> I have the URL now though
<Aram> and I can do it, I think
<Aram> I think I finally did it.
<Aram> wrtp: fwereade: TheMue: care to review a small branch? (my first, heh...): https://codereview.appspot.com/6298062/
<fwereade> Aram, gladly :)
<wrtp> Aram: will do
<Aram> thanks.
<TheMue> Aram: Today only time for a quick look, have to leave now. Will look again tomorrow. So far ok, only separating comments like // ------ are uncommon in our project.
<Aram> I know, I want to delete those, niemeyer wrote them though
<Aram> heh
<TheMue> Aram: Even if I like them because I'm adding optical separators in my private code too. ;)
<Aram> there's a lot of dead code.
<TheMue> So, I'm off, birthday party of sister-in-law.
<Aram> enjoy
<TheMue> Have a nice weekend.
<Aram> likewise
<TheMue> Aram: Thx.
<wrtp> Aram: i'd delete 'em. gustavo's deleted them in another project recently AFAIR.
<wrtp> Aram: but maybe in another branch...
<Aram> yes.
<wrtp> fwereade: any chance you could have a glance at this before i post it to the list? http://paste.ubuntu.com/1030658/
<fwereade> wrtp, reading
<fwereade> wrtp, sorry, another problem has just crystallized in my brain
<wrtp> fwereade: go on
<fwereade> wrtp, how will this handle zk data format changes? when we need to stop everything, fix zk, and restart everything?
<fwereade> wrtp, not that the original proposal was explicit about that either...
<fwereade> wrtp, and not that I can see anything that will make it *harder* than the original
<wrtp> fwereade: i did have a plan for that... let me think
<fwereade> wrtp, but that does all look good otherwise
<wrtp> fwereade: here's a possibility - if the version is tagged as "pending", then all agents download the new version, stop what they're doing, then wait for the "pending" tag to be removed. *then* they go through the normal upgrade procedure.
<wrtp> fwereade: it's possible we might want to alternate between two server ports, so that new-major-version agents can connect to the new version of the db while others are still hanging on to the old one.
<fwereade> wrtp, first bit sounds sane, bit nervous about the second part
<fwereade> wrtp, I don't think we ever want 2 things connected to the same db that aren't in agreement about what data formats are in play ;)
<wrtp> fwereade: me neither. hence before we bring up the new db, we make sure that everything in the system is halted, waiting for the pending tag to be removed.
<fwereade> wrtp, if we can halt everything we can also tell them to drop their connections, surely?
<wrtp> fwereade: yes, but we can't then tell them when to reconnect, right?
<fwereade> wrtp, the laws of logic be a harsh mistress
<wrtp> fwereade: i'm not sure i can see a way of reliably upgrading the db without using two db instances.
<wrtp> fwereade: and also we'd like to be able to switch the underlying db technology too.
<fwereade> wrtp, hmm, yeah, makes sense
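On the agent side, the "pending" scheme wrtp proposes might look roughly like this; VersionEvent and the callbacks are invented for the sketch:

```go
package upgrade

// VersionEvent sketches what an agent would watch in state: the
// proposed tools version plus the "pending" flag discussed above.
type VersionEvent struct {
	Version string
	Pending bool
}

// waitForUpgrade models the agent's behaviour: on seeing a pending
// version it downloads the tools and halts its work; it proceeds
// with the normal upgrade only once the pending flag is cleared,
// by which point the db migration has been run.
func waitForUpgrade(events <-chan VersionEvent, download func(string) error, halt func()) (string, error) {
	halted := false
	for ev := range events {
		if ev.Pending {
			if err := download(ev.Version); err != nil {
				return "", err
			}
			halt()
			halted = true
			continue
		}
		if halted {
			// Pending flag removed: safe to upgrade now.
			return ev.Version, nil
		}
	}
	return "", nil
}
```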
<wrtp> Aram: there's a convention of using "pathPkg" as an alternative name when importing "path".
<wrtp> fwereade: last thing i saw from you was "[17:55:53] <fwereade> wrtp, the laws of logic be a harsh mistress"
<wrtp> fwereade: in case i missed anything
<fwereade> wrtp, you missed a "yeah, makes sense"
<wrtp> fwereade: cool, thanks
<wrtp> right, i'm off for the weekend. see y'all monday.
<wrtp> fwereade: have a good weekend. enjoy the lovely sun which we haint got.
<fwereade> wrtp, cheers, enjoy :)
<Aram> wrtp: ok, thanks, will change
<hazmat> wrtp, you stop the activity, wait till everyone is stopped, upgrade, then reboot the agents
<hazmat> wrtp, have a good weekend
<hazmat> er. upgrade == upgrade, run migrations
#juju-dev 2012-06-09
<Aram> moin.
#juju-dev 2012-06-10
<twobottux> aujuju: why is it that 'Ubuntu Server. A full deployment will require a minimum of 10 servers.'? <http://askubuntu.com/questions/149012/why-is-it-that-ubuntu-server-a-full-deployment-will-require-a-minimum-of-10-se>
#juju-dev 2013-06-03
<jam> wallyworld: I ended up getting back at a reasonable time if you still want to 1:1
<wallyworld> ok, i'll fire up mumble
<jam> wallyworld: we can hangout instead
<jam> https://plus.google.com/hangouts/_/30081079f3ae053b9ac5eba3f343941780b57763
<jam> wallyworld: ^ I know how much you just adore mumble
<dimitern> wallyworld: client-side api stays as in state, as agreed on the last call - we cannot change that now
<dimitern> wallyworld: it'll require rewriting everything
<dimitern> wallyworld: but it's a transitional state towards having bulk support on the server
<dimitern> wallyworld: and eventually the agents can benefit from that (one at a time) when refactored
<wallyworld> dimitern: we need to discuss. i'm on a call right now. perhaps after standup. the way it is now is fundamentally broken because it is 1. stateful, 2. not connectionless, 3. not service oriented
<dimitern> wallyworld: none of these 3 things can change in a single CL
<dimitern> wallyworld: as i said, changing the client-side api needs to happen once the state api itself is changed accordingly
<dimitern> wallyworld: what's important is to have the api server-side right first, which is what the cl does
<dimitern> wallyworld: but by all means, another discussion is in order with me, you, fwereade and rogpeppe
<TheMue> morning
<fwereade> morning gents, I think my irc client may have swallowed it last time I tried to greet you all
<dimitern> fwereade, TheMue: morning
<TheMue> fwereade, dimitern: heya
<fwereade> dimitern, TheMue, heyhey
<fwereade> dimitern, wallyworld, we should all talk sometime soon
<dimitern> fwereade: yep, sgtm
<dimitern> fwereade: i have 1-1 with jam at 11, maybe after that, before the standup?
<fwereade> dimitern, sure, if that works for wallyworld -- or if right now works for you both that might be even better
<dimitern> wallyworld: you're good for a call now?
<dimitern> fwereade: what was it you wanted to talk about re the api btw?
<fwereade> dimitern, I was just a bit confused about the authorization testing for unit access given a machine connection
<fwereade> dimitern, it seemed like an unrelated machine was happily getting units it shouldn't really have access to?
<dimitern> fwereade: oh?
<fwereade> dimitern, and if that's just because it's a straight port of existing functionality that's ok but I wanted to check what the story was there
<dimitern> fwereade: only the environ manager machine and the machine agent for a specific machine has access to most methods on machines
<dimitern> fwereade: re units, the auth is based on 3 things: entity itself (u.Tag() == authUser.Tag()); or its deployer (principal unit or machine)
<fwereade> dimitern, https://codereview.appspot.com/9797046/patch/21001/22008 seems to log in as a machine but to get units not assigned to that machine
<fwereade> dimitern, possibly I just misread something?
<dimitern> fwereade: can you point me to a specific test?
<fwereade> dimitern, TestUnitsGet
<dimitern> fwereade: so the suite is written so that the authuser is machine 0
<dimitern> fwereade: which (i know this must be wicked, but) is both JobHostUnits and JobManageEnviron
<fwereade> dimitern, s.machine, err = s.State.AddMachine("series", state.JobHostUnits) ?
<dimitern> fwereade: and in general reading operations are less restrictive than write
<fwereade> dimitern, is there something else set up in JujuConnSuite?
<dimitern> fwereade: ah, that's the first CL - yes in there it's just JobHostUnits
<fwereade> dimitern, ah, ok, I'm still reading that one I'm afraid
<dimitern> fwereade: Units.Get does not impose any additional authorization, beyond the requireAgent stuff which happens at srvRoot.Units()
<fwereade> dimitern, anyway that's a side issue really -- the really important thing is to nail down agreement with wallyworld, I think
<dimitern> fwereade: agreed
<fwereade> dimitern, hmm, I think that's probably wrong -- I'd prefer to be as restrictive as possible by default, really
<fwereade> dimitern, nobody gets any information we can't be sure they need ;p
<dimitern> fwereade: well, you cannot really do anything even once you get a unit - any other call which is mutating *has* auth
<dimitern> fwereade: but ok, I can change that to be more restrictive - like the check in Units.SetPassword perhaps?
<dimitern> fwereade: if !u.root.authOwner(unit) && !u.root.authDeployer(unit)
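Put together, a Get with the stricter check dimitern quotes might look like this; the types are stand-ins for the real apiserver code. Note how the ordering of the two checks decides whether an unauthorized caller can learn that an entity exists at all, which is exactly the question discussed next:

```go
package apiserver

import "errors"

var (
	errUnauthorized = errors.New("unauthorized")
	errNotFound     = errors.New("not found")
)

// Unit and root are illustrative stand-ins for the real types; the
// shape of the auth check mirrors the line quoted above.
type Unit struct{ name string }

type root struct {
	authOwner    func(*Unit) bool
	authDeployer func(*Unit) bool
	lookup       func(name string) *Unit
}

// Get returns a unit only to its own agent or its deployer. Checking
// existence before authorization (as here) leaks which entities
// exist; swapping the two checks hides that, at a cost to debugging.
func (r *root) Get(name string) (*Unit, error) {
	unit := r.lookup(name)
	if unit == nil {
		return nil, errNotFound
	}
	if !r.authOwner(unit) && !r.authDeployer(unit) {
		return nil, errUnauthorized
	}
	return unit, nil
}
```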
<fwereade> dimitern, yeah, that sgtm, I think... I'm just wondering whether it's ok to tell unauthorized clients the difference between a missing entity and one they're just not allowed access to
<fwereade> dimitern, it might not matter, but there's a definite parallel with common practice logging into, say, a website
<dimitern> fwereade: as it is right now, they'll get not found before they get unauthorized
<dimitern> fwereade: cf. tests with foo/42
<fwereade> dimitern, you don't say "wrong password" and thereby leak the existence of an account with a given email address
<fwereade> dimitern, you just say "email/password didn't match" (and thereby, ofc, frustrate users who can't remember which email account they signed up with ;p)
<dimitern> fwereade: i see your point, but what's the action? :)
<dimitern> fwereade: should we postpone CodeNotFound until we checked the user is actually authorized?
<fwereade> dimitern, report not-found for anything you try to access but aren't allowed to
<fwereade> dimitern, the trouble is I'm not certain that this is a good idea
<fwereade> dimitern, but nor am I certain it isn't
<dimitern> fwereade: so why is CodeUnauthorized there for at all?
<fwereade> dimitern, there's definitely a use case for it -- trying to modify an entity in an unacceptable way
<dimitern> fwereade: remember, if you got as far as to call that method you're at least an authorized agent
<fwereade> dimitern, eg the deploying machine should be able to get some access to a unit it's responsible for, but it shouldn't be able to set it to Dying
<dimitern> fwereade: i think it might be a good idea not to lie about things like that and make debugging much harder
<fwereade> dimitern, yeah, there's always a tension between security and transparency ;)
<jam> fwereade: I don't think we've actually had a discussion of the privacy model for Juju.
<jam> I'd like to at least bring up the 'helpful hints when you make a mistake' vs the 'dont leak secrets'
<fwereade> jam, quite so
<jam> fwereade: I think it would be reasonable to code Juju more along the lines of "if you have at least authorized with an account, we will give you more helpful errors". Though I guess
<jam> we run some untrusted code
<jam> in the Charm hooks, right?
<jam> so maybe we should be more defensive.
<jam> I'm not entirely sure what would leak if you knew that there was a specific service running.
<jam> fwereade: and then there is multitenancy
<fwereade> jam, remember that this is the agent context, though -- I would prefer that we be defensive in that context to guard against agent compromise (both by malicious actors, and by our own mistakes)
<jam> fwereade: well given the existing model of everyone-has-root-everywhere, it is good to get away from them. Finding a decent happy point is good.
<fwereade> jam, yeah, but not-being-sure-what-would-leak is kinda my problem -- being as restrictive as possible, at least for the private API, seems like a sensible default position
<jam> fwereade: also, what about readonly mode?
<jam> is that just again for the public apis?
<fwereade> jam, readonly mode is a Client thing, not an agent thing, I think
<fwereade> jam, I don't think there's a use case for an agent that can connect but not actually fulfil its duties
<fwereade> jam, and I don't *think* there's much confusion about an agent's duties
<jam> fwereade: at that point Client looks like a really wide API, rather than "some client-level requests about Units".
<jam> Which I think is why there is occasional confusion.
<jam> I think I understand that each Process that talks to the API should generally be talking to one of the top level endpoints?
<fwereade> jam, hmm -- a user will be talking to Client
<fwereade> jam, at the moment an agent will potentially want to talk to several of the root objects
<fwereade> jam, but not to client
<jam> fwereade: yep, that is 1 process (juju FOO) talking to 1 top level API (Client)
<jam> But Machine agents might want to talk to Unit?
<fwereade> jam, quite so
<fwereade> jam, I would be happiest if they were to be *very* heavily restricted
<fwereade> jam, but the set of relevant information, and allowable operations, is a bit messy
<jam> fwereade: so is it possible to say that Machine can only talk about Units that are on that machine? Is that reasonably structured internally that it can be queried easily?
<jam> The only specific logic I've seen so far is "Are you an Agent"
<fwereade> jam, yeah, should be
<jam> vs Are you the Agent for Machine X when requesting actions on X
<fwereade> jam, yeah, "are you an agent" is IMO of limited value
<fwereade> jam, "are you agent X" is a more suitable model in general, I think
<fwereade> jam, I'm not 100% sure, but I think that in the vast majority of cases it is simple to tell what entity has responsibility for what operations on what entities
<dimitern> jam: not quite right
<dimitern> jam: there are both "are you an agent" (requireAgent) when connecting
<dimitern> jam: and "are you the machine agent for machine X" authOwner(machine)
<fwereade> dimitern, jam: it crosses my mind that it *might* be good to partition the APIs by client-task
<fwereade> dimitern, jam: this would perhaps satisfy wallyworld's concerns
<dimitern> fwereade: by client-task?
<fwereade> dimitern, the provisioner service, for example, provides all functionality required by the provisioner task
<fwereade> dimitern, downside is that there will be some overlap of responsibility
<dimitern> fwereade: hmm.. that's not a bad idea
<dimitern> fwereade: but are we deciding to break state api compatibility completely then?
<fwereade> dimitern, ie a Machiner is allowed to set its machine to Dead; and so is a Provisioner
<fwereade> dimitern, yeah, it might be hard to do without having some impact on the client code, that is also a valid concern
<dimitern> fwereade: *some* impact? :)
<fwereade> dimitern, ;)
<fwereade> dimitern, I don't like to overstate things, it makes people jumpy ;p
<dimitern> fwereade: :) yeah, I know what you mean
<jam> dimitern: If the actual API call takes the same args and has the same net effect, then the client impact is just renaming what method you are calling. Which is an effect, but it isn't massive.
<dimitern> fwereade: then maybe we should decide should we throw away what's already there in the workers and rewrite them one by one to use the api, while keeping the others using state
<dimitern> fwereade: if we do that, everything goes for the client-side agents api
<fwereade> dimitern, that's what we're doing already, really, it's just we've been trying to minimize the difference between the two versions
<fwereade> dimitern, and we still should -- but we shouldn't take the original form to be sacrosanct
<fwereade> dimitern, the difference between having a global api.Machine and a provisioner.Machine, for example, is actually quite minimal
<fwereade> dimitern, (vs machiner.Machine)
<dimitern> fwereade: true
<dimitern> fwereade: i'm beginning to like it more and more
<fwereade> dimitern, and it has benefits in that we make it much harder to screw up auth on methods that are relevant for two very different clients in very different contexts
<dimitern> fwereade: there'll be some duplication (machiner.Machine and provisioner.Machine will basically be the same, but this can be fixed by helpers)
<fwereade> dimitern, however, I'm not necessarily sure that task-graining is quite right -- job-graining *might* be better
<fwereade> dimitern, thoughts?
<dimitern> fwereade: it actually simplifies the authorization a lot - we can do it on the top-level - "are you a provisioner?" ok - get the api.Provisioner end point, and no more checks inside the calls themselves
<fwereade> dimitern, yeah
<fwereade> dimitern, I'm considering the distinction between the Provisioning endpoint and the EnvironManagement endpoint
<dimitern> fwereade: i can see more benefits as well - it isolates us nicely from the exact state api
<fwereade> dimitern, yeah
<dimitern> fwereade: so job vs. agent?
<fwereade> dimitern, yeah: I can't tell which is right though :)
<dimitern> fwereade: the jobs (tasks/workers) are the ones actually doing stuff, while the agent has little to do with state anyway
<fwereade> dimitern, a job is a group of tasks/workers
<dimitern> fwereade: (it shouldn't deal with state at all I think, or if it should, it can use the worker api instead perhaps?)
<dimitern> fwereade: yeah, correct
<fwereade> dimitern, I'm concerned right now with whether job granularity is better, or task granularity
<dimitern> fwereade: if we choose tasks, authz gets easier (at end point only)
<fwereade> dimitern, maybe auth should be grouped by job, but actual service endpoints should be split by task?
<dimitern> fwereade: if we choose jobs, there has to be more authz inside methods as well
<fwereade> dimitern, it does act to fossilize the jobs
<fwereade> dimitern, but maybe that's a worthwhile price, because the original idea of making them flexible will be kinda hard to carry off regardless
<fwereade> dimitern, and maybe it's not even any worse than it will be *whatever* we do
<fwereade> dimitern, this plausibly lets us split the whole damn api up into sensible packages though
<fwereade> dimitern, and that idea makes me very happy
<fwereade> dimitern, state is a monster
<fwereade> dimitern, and it'll be hard to ever unmonster that, because it's responsible for cross-entity consistency, sanity, etc
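A sketch of the per-task partitioning under discussion: authorization happens once, when a facade is handed out, and each facade exposes only what its task needs. The job names and types here are illustrative, not a settled design:

```go
package facades

import "errors"

// Entity stands in for the authenticated API caller.
type Entity struct{ Jobs []string }

func (e *Entity) has(job string) bool {
	for _, j := range e.Jobs {
		if j == job {
			return true
		}
	}
	return false
}

// Machiner and Provisioner are per-task facades: each would expose
// only the calls its task needs (SetStatus and Life for the machiner;
// AddMachine, SetInstanceId, watches for the provisioner).
type Machiner struct{}
type Provisioner struct{}

// Root hands out facades, doing the authorization check exactly once
// instead of inside every method.
type Root struct{ caller *Entity }

func (r *Root) Machiner() (*Machiner, error) {
	if !r.caller.has("JobHostUnits") {
		return nil, errors.New("unauthorized")
	}
	return &Machiner{}, nil
}

func (r *Root) Provisioner() (*Provisioner, error) {
	if !r.caller.has("JobManageEnviron") {
		return nil, errors.New("unauthorized")
	}
	return &Provisioner{}, nil
}
```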
<dimitern> fwereade: bbiab, sorry - 1-1
<fwereade> dimitern, no worries, ttyl
<fwereade> dimitern, (fwiw, I think the job/task granularity question can be decided a little way down the line: pick one by gut, keep an eye on it, fix it before we release if it seems to be crazy in short-term hindsight)
<fwereade> dimitern, (by-job is probably easier, fwiw)
<fwereade> dimitern, (no response necessary while you're busy ;p)
<dimitern> fwereade_: back
<dimitern> fwereade_: fwiw, task-based interface seems better to me
<TheMue> dimitern: roger's new task runner isn't yet merged, is it?
<dimitern> TheMue: it is, but it's not used yet
<TheMue> dimitern: hmm, seems i'm looking at the wrong places :)
<dimitern> TheMue: worker/runner.go
<TheMue> dimitern: thx
<fwereade_> dimitern, let's go with per-task then
<fwereade_> dimitern, definitely makes it easier to shift tasks between jobs, which I expect we'll need at some point
<dimitern> fwereade_: cool, I can throw together a proposal of the interface
<fwereade_> dimitern, great
<dimitern> fwereade_: should I start with the provisioner?
<fwereade_> dimitern, best not, frank is up to the elbows in that at the moment
<fwereade_> dimitern, let's go with machiner to start with
<dimitern> fwereade_: ok
<fwereade_> dimitern, which I *think* is an absolutely minimal set of data and methods
<TheMue> fwereade_, dimitern: +1
<fwereade_> dimitern, no need to include pinger
<TheMue> ;)
<dimitern> fwereade_: sure
<fwereade_> dimitern, I'm not sure how we address wallyworld's concerns re statefulness and connectionlessness
<fwereade_> dimitern, the latter is a necessity given routing issues, and the former is a consequence thereof AFAICT
<fwereade_> dimitern, I agree that we're more stateful than we need or want to be long-term, but I don't think it should be addressed today
<fwereade_> dimitern, does that roughly align with your understanding?
<dimitern> fwereade_: exactly
<fwereade_> dimitern, cool
<fwereade_> TheMue, btw, I noted a comment of yours in the loggo CL: "TODO for whom?"
<fwereade_> TheMue, AIUI the badging by name is not "who should do this", it's "who should I talk to about this when I hit it"
<fwereade_> TheMue, if person X is refactoring package Y and sees TODO(Z), and doesn't immediately understand all relevant context, X should talk to Z and take the resulting discussion into account when changing Y
<fwereade_> s/refactoring/working in/
<TheMue> fwereade_: ok
<TheMue> fwereade_: i only know lonely todos as those which will never be done
<fwereade_> TheMue, so I'm not convinced that TODOs *always* require a badge -- it's just a pointer that is expected to persist better than `bzr annotate`
<fwereade_> TheMue, I hadn't seen a badged TODO as a commitment to actually DO :)
<TheMue> fwereade_: it's just shorter than // HASTOBEDONEONCEWHENANYBODYHASTIME ;)
<fwereade_> TheMue, yeah :)
<TheMue> fwereade_: I take them typically as hints for me that I have to do further work here in a second or third step
<fwereade_> TheMue, sure, it's *great* when someone adds a badged TODO and fixes it 3 days later
<TheMue> fwereade_: *rofl*
<fwereade_> TheMue, but I still feel the badge serves the purpose I suggested: to warn others to chat to X before diving in (and thereby discover, hey, it's being done :))
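For illustration only, a badged TODO in the sense just described; the name and the note are invented:

    // TODO(fwereade): this assumes the machiner manages exactly one
    // machine; talk to me before relaxing that assumption.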
<TheMue> fwereade_: btw, just a question. it seems to me my "anybody" has been wrong and "somebody" would be the correct term. am i right?
<fwereade_> TheMue, either reads ok in that context, I think, but maybe "somebody" wins on a technicality
<TheMue> fwereade_: fine, thanks, still learning
<dimitern> fwereade_: there it is -> https://docs.google.com/a/canonical.com/document/d/1Yd2Nil43AemnBq8qv003OkWLptiPzuhWGESBcwlI7Nc/edit?usp=sharing
<dimitern> wallyworld, jam: ^^
<fwereade_> dimitern, cheers
<fwereade_> dimitern, hmm, all of those things STM to be better expressed client-side as methods on a *Machine, as before
<dimitern> fwereade_: i was trying to avoid this
<dimitern> fwereade_: and make wallyworld happy as a side-effect :)
<fwereade_> dimitern, yeah, we should talk to wallyworld about what his objections to the agreed (I thought) client-side API are
<fwereade_> dimitern, the client side should be fixed if it's stupid, sure
<fwereade_> dimitern, but we don't have to write the client-side code as a perfect clone of the API we use to implement it
<dimitern> fwereade_: not sure i understand the last statement..
<fwereade_> dimitern, Machine.SetStatus(foo, bar) can call MachinerService.SetStatus([]MachineStatus{{id, foo, bar}}), and then we get both a helpful long-term API, and a helpful form of code on the client side
<dimitern> fwereade_: there's no MachinerService - at server-side we still use Machines.SetStatus, etc.
 * TheMue is at lunch, bbiab
<dimitern> fwereade_: the Machiner facade is client-side only
<dimitern> fwereade_: at least that's what I understood
<fwereade_> dimitern, hmm, it seemed to me that we'd get the best of it from doing it server-side -- doing it client-side kinda misses the SOA point I think?
<fwereade_> dimitern, that's where we need the auth etc after all
<dimitern> fwereade_: hmm.. good point
<fwereade_> dimitern, MachinerService (or whatever) thus maps approximately to the *State we pass into workers today
<dimitern> fwereade_: so then the client-side API is just a 1-1 mapping to the server-side
<dimitern> fwereade_: how about bulk operations then?
<fwereade_> dimitern, well, the client-side API can look much as it currently does, but expressed in task-specific terms over the wire
<fwereade_> dimitern, I *think* we still want bulk ops, on the basis that they're forward-sensible even if we never use them, and boxing/unboxing is little hassle
<dimitern> fwereade_: that's exactly what wallyworld was against aiui
<fwereade_> dimitern, he was against bulk ops?
<fwereade_> dimitern, ah sorry, against keeping the existing client-side API?
<dimitern> fwereade_: no, against client-side objects even when they are proxying to server-side services
<dimitern> fwereade_: yeah
<dimitern> fwereade_: how about a hybrid approach?
<fwereade_> dimitern, changing that is IMO busywork with no benefit, and it obscures the actual logic of the machiner
<fwereade_> dimitern, a machiner's about messing around with a single machine
<dimitern> fwereade_: still have Machines at server-side, with bulk ops, as now, but in addition have the Machiner facade as well
<dimitern> fwereade_: which will use the machines service but for a single machine
<fwereade_> dimitern, IMO the per-task facades make sense server-side, and let us drop the Machines, Units, etc service-alikes he's not keen on
<fwereade_> dimitern, the issue with "Machines" is that it's a weird division and not very servicey
<dimitern> fwereade_: not really - the "service" deals with machines
<dimitern> fwereade_: that's like openstack api for servers
<fwereade_> dimitern, yeah, but lots of things deal with machines -- splitting by entity kind seems weaker than splitting by groups of business-logic ops that fit well together
<dimitern> fwereade_: ok, if we do that, then i cannot see where the bulk ops come into play
<dimitern> fwereade_: if the api is task-oriented (which is still SOA imho), then we don't really need bulk ops at all
<dimitern> fwereade_: we'll need them only for client-facing apis like for the gui/cli
<fwereade_> dimitern, consider the provisioner -- where long-term we'll want Lifes, InstanceIds, etc
<fwereade_> dimitern, the machiner doesn't need them now and perhaps never will
<fwereade_> dimitern, but habitually expressing things as bulk ops means we never need to make that decision, and hence never run the risk of getting it wrong
<dimitern> fwereade_: fair enough, then we'll introduce bulk ops on methods of the task facades, as needed
<fwereade_> dimitern, I agree that in the narrow machiner context we probably never will
<dimitern> fwereade_: but not *everywhere by default*
<fwereade_> dimitern, I'm advocating everywhere by default, because the cost of doing so is low and it inculcates good habits
<dimitern> fwereade_: the cost is not low at all in the case of the machiner
<fwereade_> dimitern, expand please
<dimitern> fwereade_: if the facade operates on a single machine, why do we need to pretend it works on multiple machines?
<fwereade_> dimitern, because, say, getting machine life *is* an op we will need in bulk
<dimitern> fwereade_: on the other hand, for the provisioner - most ops will be bulk, but still they're on the facade as methods
<fwereade_> dimitern, the machiner only wants one at a time, but the provisioners will want N
<dimitern> fwereade_: if we're talking about eventually changing the underlying state API to default to bulk ops, i agree
<fwereade_> dimitern, we can then make auth the responsibility of the facades, and implement the machine-life-getting separately, as called by both
<dimitern> fwereade_: but for the case of the machiner single ops suffice
<dimitern> fwereade_: so you're talking about a hybrid approach after all - having both Machines (w/ bulk ops by default), and facades, using Machines as well - in case of machiner single ops, with the provisioner - bulk ops?
<fwereade_> dimitern, no... Machines should never be exposed
<dimitern> fwereade_: i'm not saying it should be
<fwereade_> dimitern, but an apiserver-internal Machines might actually fit very nicely
<dimitern> fwereade_: it can be an internal "service"
<dimitern> fwereade_: exactly
<fwereade_> dimitern, and having the various services expressed with a consistent vocabulary is IMO an end in itself
<dimitern> fwereade_: ok, it gets clearer now
<dimitern> fwereade_: apiserver.machines, units, etc. are there
<dimitern> fwereade_: and the exposed facades apiserver.Machiner, etc. use them internally
<dimitern> fwereade_: this simplifies the srvmachines implementation, because it doesn't have to care about authz at all
<dimitern> fwereade_: it'll be done on the facades only
<dimitern> fwereade_: does it make sense?
<fwereade_> dimitern, yes, with the caveat that the thing I want to prescribe is that the data on the wire be expressed as (1) bulk ops on (2) business-logic-level services; the internal services are a possible implementation detail that seems ATM like a good idea, but which doesn't lock us into anything going forward
<fwereade_> dimitern, yeah
<fwereade_> dimitern, the reason I'm quibbling about the internal machines is that we'll actually want that stuff to migrate into the *state* package long-term, I think
<dimitern> fwereade_: see, now I'm not following you
<fwereade_> dimitern, so we can do bulk *db* ops to get, say, life for 10k machines in one go
<dimitern> fwereade_: bulk ops *on-the-wire* applies only to the exposed facades' methods
<fwereade_> dimitern, without having to suck up 10k lists of assigned units as well
<fwereade_> dimitern, yeah, that is my primary focus
<dimitern> fwereade_: ok, then we're on the same page
<fwereade_> dimitern, but it supports, in the long term, smarter and more efficient usage of the underlying db
<fwereade_> dimitern, and so while a Machines service is probably a good idea today, it's not a prerequisite for what I'm after
<dimitern> fwereade_: agreed - once we're free to develop the state api w/o having to worry about breaking the agents, we can improve the actual calls to be bulk
<fwereade_> dimitern, and as soon as we've isolated state from everything except the API server, we start to be able to slice and dice the interface of the *state* package such that things like the Machines service become redundant
<fwereade_> dimitern, exactly
<dimitern> fwereade_: i'll change the proposal to reflect what we just discussed then
<fwereade_> dimitern, <3
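A rough sketch of the agreed shape: a familiar domain-style object on the client, expressed on the wire as a bulk op against a task-grained service. All identifiers here (Caller, MachineStatus, the facade and method names) are illustrative assumptions, not the real juju-core API:

    package api

    // Caller stands in for whatever transports a single RPC request.
    type Caller interface {
        Call(facade, method string, args, result interface{}) error
    }

    // MachineStatus is one element of a bulk SetStatus request.
    type MachineStatus struct {
        Id     string
        Status string
        Info   string
    }

    // Machine keeps the pre-existing domain-object style client-side.
    type Machine struct {
        id     string
        caller Caller
    }

    // SetStatus reads like the old state.Machine method, but on the
    // wire it is a bulk call carrying a single element, so the same
    // endpoint could serve a worker updating thousands of machines.
    func (m *Machine) SetStatus(status, info string) error {
        args := []MachineStatus{{Id: m.id, Status: status, Info: info}}
        return m.caller.Call("Machiner", "SetStatus", args, nil)
    }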
<fwereade_> dimitern, is your standup about to start?
<dimitern> fwereade_: in 30m
<fwereade_> dimitern, ok, cheers -- if I'm afk when wallyworld's ready for a chat, text me please, I'd like to make sure we're all on the same page asap rather than waste your time again :)
<dimitern> fwereade_: sure, np :)
<wallyworld> fwereade_: dimitern: hi, i just got back from soccer, did you guys want a chat?
<dimitern> wallyworld: now or after standup? jam said he won't be around until standup time
<wallyworld> now is fine
<dimitern> fwereade_: you as well can chat now?
<dimitern> wallyworld: just a sec, texting fwereade_
<fwereade_> wallyworld, dimitern, sgtm
<dimitern> fwereade_, wallyworld: ok https://plus.google.com/hangouts/_/7964c6d4ac01386655b7ddf69dbc9cd43365a91a?authuser=0&hl=en
<danilos> jam, dimitern, hi, are we having a stand-up?
<jam> yep!
<fwereade_> wallyworld, look at it like this: by delivering an API, we create real value right now wrt things the business needs *soon*; by refactoring all the client code on top of that, we just increase risk. Maintaining the domain-object style of the pre-existing client code is cheap and easy, and does *not* make it any harder to fix later; but trying to fix the programming model right now across the board *does* impact our ability to deliver real value
 * fwereade_ lunch
<dimitern> fwereade_: ping me when you're back please
<fwereade_> dimitern, ping
<dimitern> fwereade_: i'll start a hangout
<dimitern> fwereade_, wallyworld, jam: https://plus.google.com/hangouts/_/609bce415113fdec03001e8ade27f75c41945d7e?authuser=0&hl=en
<fwereade_> someone else please review https://codereview.appspot.com/9641044/ for frankban
<fwereade_> ttfn guys, I'll be around again later tonight
<FunnyLookinHat> Anyone here know why I'd be getting a 411 Length Required error when trying to bootstrap?  Is the client request class not sending proper headers for some reason?
<FunnyLookinHat> I can't determine if the headers are being attached or not because I haven't found a way to dump the requests being sent :-(
<arosales> ran into this juju core bug this weekend
<arosales> https://bugs.launchpad.net/juju-core/+bug/1187062
<arosales> is anyone else running into this on HP's cloud?
<fwereade_> arosales, `juju image-metadata -i 81078` will generate a pair of files somewhere inside $JUJU_HOME that you need to upload to your public-bucket
<fwereade_> arosales, I thought wallyworld sent a mail about it, but I could be wrong
<arosales> fwereade_, to juju list or juju-dev
<fwereade_> arosales, I can't seem to find it so that may just have been a fever dream
<arosales> I very well could have missed it too
 * arosales running behind on list email
<fwereade_> arosales, I'm pretty sure that if you run that command, though, it tells you what to do next
<fwereade_> arosales, sorry drive-by, I'm not really at work any more
<arosales> fwereade_, do I get 'juju image-metadata' with the ppa juju-core?
<arosales> or do I need to do that from source?
<fwereade_> arosales, hmm, I will be upset if we released simplestreams-only without including the workaround
<fwereade_> arosales, run that command and rage at me if it doesn't help you :)
<arosales> fwereade_, will do
<arosales> I had to remove juju core
<arosales> and had to fall back to juju for my demo this weekend
<fwereade_> arosales, dammit, sorry about that :(
<fwereade_> arosales, I'm often around at antisocial hours but was ill last w/e
<fwereade_> arosales, was that the problem?
<arosales> fwereade_, no worries
<arosales> had a demo at texas linux fest I had to fall back to .7 for
<jcastro> arosales: did the demo go well overall?
<arosales> jcastro, ya .7 went well
<arosales> had good interest and questions.
<arosales> fwereade_, do you know what the valid inputs are for series?
<arosales> http://pastebin.ubuntu.com/5730353/ ?
<fwereade_> arosales, you don't want the "="s in there
<fwereade_> arosales, `juju image-metadata -s precise ...`
<wallyworld> fwereade_: arosales: just saw the backscroll - i did send emails and it's in the release notes how to use image-metadata to generate local simplestreams data
<fwereade_> wallyworld, yeah, I *thought* you had, I must just be stupider than I look
<wallyworld> i tested with canonistack
<wallyworld> you do not look *that* stupid :-)
<arosales> still getting http://pastebin.ubuntu.com/5730651/
<arosales> fwereade_, thanks for the response btw, I know it is late for you.
<fwereade_> arosales, no worries :)
<wallyworld> arosales: did you copy those files to your public bucket?
<fwereade_> arosales, I don't think you want any of those "="s at all, fwiw
<arosales> wallyworld, no  I didn't.
<arosales> I suspect I should find the release notes and follow the instructions . . .
<wallyworld> arosales: it's in the instructions - perhaps i need to make it clearer
<arosales> fwereade_, noted on the "="
<wallyworld> it's also on the command line after you run the command
<wallyworld> arosales: i haven't come across a meeting invite - you said you scheduled one for tomorrow? i only just woke up so may not have found it in my inbox
<thumper> morning
<thumper> hi wallyworld
<thumper> wallyworld: early for you
<wallyworld> thumper: hi, how's the dog
<thumper> bouncy
<wallyworld> yeah
<wallyworld> i bet
<wallyworld> the girls like him?
<thumper> oh yes
<thumper> and her
<thumper> pepper is a girl
<thumper> I think I'll call her pepper pots
<wallyworld> ah, i forgot
<wallyworld> dogs are great. we love ours
<thumper> she has settled down now
<wallyworld> yeah, takes a day or two
<thumper> I've moved into the lounge so I can close it off and watch her
<thumper> well everyone else has left
<thumper> kids at school
<thumper> and rachel at work
<wallyworld> i gotta go do school drop off now. back after breakfast etc
<arosales> wallyworld, ah I forgot to add you to the invite
<arosales> wallyworld, sorry
<arosales> which makes the scheduling tough
<arosales> wallyworld, fwereade_ regarding simple streams. As this is a fundamental change to the way users expect Juju to work, do you plan on making this backwards compatible or optional?
<arosales> ie the use of simple streams?
<fwereade_> arosales, simplestreams is intended to be the One True Way going forward; but we hope that for most public clouds we will be publishing everything needed via cloud-images.ubuntu.com
<arosales> wallyworld, perhaps I can follow up with you and dave later in the evening regarding simple streams
<arosales> fwereade_, gotcha. Now we just need to be sure we are publishing everything to cloud-images.ubuntu.com
<fwereade_> arosales, yeah; there was an issue with us publishing unsigned simplestreams data to the HP bucket, misrepresenting the images as being Official Ubuntu Images -- I think there's a thread you're in somewhere..?
 * fwereade_ looks at his mail again
<arosales> ya, I think I saw ben howard raise that concern
<fwereade_> arosales, I've just accepted the invite for tomorrow, thanks
<arosales> fwereade_, thanks. I"ll follow up with wallyworld and dave later in the US day to get their input/feedback too.
<wallyworld> arosales: sorry, was doing school drop off
<arosales> wallyworld, no worries
 * thumper takes the dog for a walk...
<wallyworld> arosales: ping me today if you have any immediate questions, or otherwise we can ensure everyone is on the same page tomorrow at the meeting
<arosales> wallyworld, will do.  I am seeing a few interesting stack traces in go-juju
<arosales> but perhaps that is just my environment.
<arosales> if it is repeatable I will open up a bug for triage
<wallyworld> yeah, we would love to get all these things fixed
<wallyworld> we're not seeing them as devs but end users have all sorts of different set ups
<wallyworld> and we know that if there are issues (with env or whatever), we need to, in places, improve the error reporting etc
<wallyworld> thumper: back from the walk?
<thumper> wallyworld: yeah, have been for ages
<thumper> the dog doesn't take long walks
<thumper> yet
<thumper> and it is cold outside
<thumper> I had to coax her out
<wallyworld> i bet
<wallyworld> quick chat?
<thumper> sure
<wallyworld> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<mramm> thumper: you around?
<thumper> mramm: yep, chatting with wallyworld now
<mramm> thumper: cool!
<mramm> thumper: got time to chat for a few in like 15 min?
<thumper> mramm: wallyworld said to talk to you first
<mramm> hahaha
<mramm> ok
<thumper> he said it like "mramm is in a worse timezone and needs sleep, so talk to him first"
<mramm> hahaha
<mramm> I guess I was at a meeting with him 12 hours ago
<thumper> mramm: are you starting a hangout/
<thumper> ?
<mramm> yep
<mramm> one sec
<thumper> wallyworld: I'm off with mramm now
<wallyworld> ok
<wallyworld> same url as before
#juju-dev 2013-06-04
<davecheney> https://bugs.launchpad.net/bugs/1187062
<_mup_> Bug #1187062: 1.11.0-1~1240~quantal1 cannot find Precise Image on HP Cloud <juju-core:New> <https://launchpad.net/bugs/1187062>
<davecheney> does anyone have the creds to run sync tools ?
<wallyworld> davecheney: it's not sync-tools - there's no quantal image metadata uploaded to hp cloud yet, just precise
<wallyworld> i can add a quantal image entry to the metadata
<wallyworld> what image id do you want?
<davecheney> wallyworld: thanks for replying
<davecheney> i'm wondering if I should tell antonio not to use quantal
<davecheney> we're only pushing the LTS
<wallyworld> that's what i thought when i uploaded the metadata - that people would just be using precise
<davecheney> i think that is a reasonable response
<davecheney> and is consistent with the company message
<wallyworld> i thought that the charms would only be guaranteed to work with the LTS
<davecheney> also, there are almost 0 charms for quantal
<wallyworld> yeah
<wallyworld> so i didn't want to provide a footgun
<davecheney> wallyworld: the charms also have a series, and there aren't really any for quantal, almost none for raring and zilch for saucy
<davecheney> lol @ footgun
<wallyworld> i know charms have a series - just assumed there were only a number for precise and not many for anything else like you say
<wallyworld> foorgun amuses me too
<wallyworld> footgun even
<wallyworld> for demos, like i think he wants this for, precise is the way to go
<davecheney> wallyworld: thanks, i'll write some stuff to antonio
<wallyworld> i had a chat this morning - there was also an issue not reading the release notes i *think*
<wallyworld> davecheney: there's a meeting tomorrow where all this will be cleared up fwiw, so don't feel like you have to write too much today
<davecheney> ok
<davecheney> yeah, simple streams
<thumper> hi jam, you around yet?
<thumper> davecheney: I have a question for you if you are around
<thumper> davecheney: wondering about lockless reading of an int
<thumper> davecheney: when it is possible that another goroutine may be writing to it
<thumper> any ideas on guarantees?
<thumper> davecheney: nm, I'll use a defined size and sync.atomic
<jam> thumper: hey, what's up ?
<thumper> jam: hi, I've just tweaked your branch a little
<thumper> and about to merge it
<jam> thumper: for loggo?
<thumper> yeah
<thumper> jam: I changed the globalMinLevel to be a uint32 explicitly
<thumper> (and changed the type from int to uint32 for Level)
<thumper> so I could use sync.atomic for the reads and writes
<thumper> otherwise we'd be reading an int without a lock
<thumper> potential problem
<thumper> so using sync.atomic for both reads and writes fixes this
<jam> thumper: because it is a word size, you will either get the existing or the previous value (I'm pretty sure); you're racing when the int gets set against when you log anyway
<jam> which is already a race.
<jam> well 'race'
<thumper> while the writes were all within the mutex, the reads weren't
<jam> thumper: sure, but it doesn't seem like an int that you have to be strict about
<thumper> while most likely, it isn't guaranteed (for the reads)
<jam> because if you got the wrong value, you could have also gotten the 'wrong' value because the async call to change it was 1ms later.
<thumper> yeah, but I'm mildly pedantic about shit like that
<jam> thumper: for word-size, you won't ever read part of the value.
<jam> thumper: for the lowest part of logging, it would be nice to avoid mutexes.
<jam> or sync level calls.
<jam> did you check it with the benchmarks?
<thumper> I seem to recall an exact example where it would
<jam> thumper: for multi-word stuff, you can get craziness.
<thumper> let me poke it with and without the sync calls
<jam> for example 'interface' is 2 words
<jam> (a pointer to type, and a pointer to value, which you can abuse with GOMAXPROCS>1)
<jam> You may still want uint32 instead of 'int' for that same reason.
<jam> thumper: anyway, I have the feeling the specific overhead doesn't matter much, but it is a multi-cpu synchronization point to log something that you are then throwing away (sometimes)
<jam> thumper: btw, you don't have a good way to change the log level of writers. To change the default logging you have to RemoveWriter() RegisterWriter(), because ReplaceDefaultWriter doesn't let you change the level.
<thumper> jam: ah, good point...
<thumper> I suppose we should add that at some stage
<jam> thumper: I was a little surprised that Logger objects track their log level, but Writer objects have it tracked in a separate location.
<thumper> seems to be about 5% slower on the fastest case to use sync
<jam> thumper: tbh the fastest case probably doesn't trigger all that often, since getEffectiveLogLevel is probably going to be the primary DEBUG/TRACE filter?
<thumper> aye
<jam> thumper: the goal is to just make it cheap enough that you don't have to think about perf when adding logging, because it will be cheaply filtered out.
<thumper> so I guess the question is with a read of uint32, will we always get a "whole" value?
<jam> thumper: note, layering the calls has a measurable performance impact, though in the "NS" range.
<thumper> what do you mean layering?
<jam> thumper: Debug calling Log
<jam> it is also measurable if you pass an extra parameter.
<jam> but again, 13ns per call vs 19ns per call.
<jam> sort of range.
<thumper> that is because I special case the one param
<thumper> to avoid Sprintf
<jam> thumper: no, actually, it doesn't get to Sprintf
<jam> Sprintf costs 600ns
<jam> or so
<jam> this is just parameter passing
<thumper> oh, so just the param?
<thumper> weird
<jam> if you change the fastest benchmark test
<jam> thumper: yeah, but again, nanoseconds
<thumper> and you really shouldn't be logging 1000s of things per second
<thumper> hopefully
<jam> thumper: well, millions of things, really
<thumper> although I did log a metric fuck-ton of stuff in unity for testing cleanup
<jam> I could see TRACE getting really verbose.
<thumper> jam: exactly, which is why we want better module separation
<jam> but if you are actually logging stuff, the Sprintfs start to add up.
<thumper> so you can set trace on say, the provisioner
<thumper> but nothing else
<thumper> yeah, I can imagine that it does add up, but the cost/benefit of the logging is worth it I think
<jam> thumper: so on my machine, if you look at BenchmarkLoggingDiskWriterNoMessagesLogLevel
<jam> with s.logger.Debug(msg, i)
<jam> it is 25.5ns/op
<jam> if I change it to
<jam> s.logger.Debug(msg)
<jam> it is 13.8ns/op
<thumper> heh
<jam> if I change it to s.logger.Log(trace.DEBUG, msg)
<jam> It is 9.2ns/op
<jam> so, interesting that all that is measurable
<jam> I imagine the overhead of varargs is because it has to allocate the slice and the backing array
<jam> but you *are* talking 15ns absolute time.
<jam> so if you call this 1000s of times per second, it has a net overhead of 15 microseconds/second (15ppm)
<jam> I don't think that will explicitly ever show up in a pprof :)
<thumper> :)
<jam> compared to TestWriters which is 1574ns
<jam> so once you are actually formatting and writing stuff to a string
<thumper> so, reading around the atomicity of 32 bit reads and writes...
<jam> you are ~100x slower.
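The shape of that comparison, as a benchmark sketch. These are not the actual loggo benchmarks; the variadic signature is what forces the per-call slice and boxing allocation jam is measuring:

    package logbench

    import "testing"

    var sink string // defeat dead-code elimination

    // logFixed stands in for a Debug(msg) fast path.
    func logFixed(msg string) { sink = msg }

    // logVarargs stands in for Debug(msg, args...).
    func logVarargs(msg string, args ...interface{}) { sink = msg }

    func BenchmarkFixed(b *testing.B) {
        for i := 0; i < b.N; i++ {
            logFixed("filtered out below the log level")
        }
    }

    func BenchmarkVarargs(b *testing.B) {
        for i := 0; i < b.N; i++ {
            // i is boxed into an interface{} and a slice is
            // allocated for args on every call.
            logVarargs("filtered out below the log level", i)
        }
    }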
<thumper> seems fine on intel, and amd64
<thumper> but not sure about arm
<jam> thumper: I *believe* that all platforms give you an atomic-word-aligned load because often pointers are used to implement atomic operations.
<thumper> I'll take that 10ns hit for a nicer method signature
<thumper> hmm...
<jam> thumper: http://stackoverflow.com/questions/9399026/arm-is-writing-reading-from-int-atomic
<thumper> jam: so a question is "does go word align integers?"
<thumper> I think the answer should be "I damn well hope so"
<jam> thumper: I don't know of anything that *doesn't* align objects unless you do crazy shit
<thumper> jam: like std::vector<bool>?
<jam> like use an integer pointer into a byte array to manually extract stuff (which I've done, but I've also gotten BUS errors on Mac doing it :)
<jam> thumper: does the stdlib treat it as wide pointers in memory?
<thumper> I don't recall
<jam> there is a fair amount of C code (especially string searching) that knows that on certain platforms
<jam> it is safe to do unaligned loads
<thumper> the general answer was don't do it
<jam> (intel amd is fine, PPC is not)
<thumper> ew
<jam> I think bzrlib itself has some, but also with platform checks.
<thumper> ok, so back to a global uint32
<thumper> should be word aligned?
<thumper> how would we know?
<thumper> apart from taking the address of it
<jam> if you want to *know* then you have to take the address and compare mod 4, but there is reflect.Type.Align as well.
<thumper> yeah, you know what?
<thumper> I'm just going to use sync.atomic
<thumper> it isn't enough of a difference to care
<jam> thumper: did you check what the overhead is? If it is 10ns I agree, if it is 100 or 1000 then I might quibble
<thumper> jam: it is about 4ns
<thumper> on my machine
<davecheney> thumper: back now (lunch)
<davecheney> looks like you already got an answer
<thumper> davecheney: that's ok, me and jam have just been talking about atomic stuff
<davecheney> maybe you've already covered it
<davecheney> but there is a difference between an atomic write, and a write that is safely published
<thumper> what do you mean by safely published?
<davecheney> visible to another thread
<thumper> I don't care about delays
<thumper> just valid reads and writes
<davecheney> you'll probably be ok, but that is playing fast and loose with the memory model
<davecheney> ie, the delay could be infinite
 * thumper pulls a face
<thumper> so how does one safely publish something?
<davecheney> sync/atomic or use a lock
<davecheney> ie, you need a memory barrier
<davecheney> or send it through a channel
<thumper> ah, yes, I decided to use sync/atomic
<davecheney> all good then
<thumper> ok, cool
 * thumper considers something else
<thumper> so, theoretically I could have one go routine set a logging level on a logger, and not have it visible to another go routine?
<thumper> if not protected by locks?
<davecheney> correct
<thumper> or atomic reads/writes
<thumper> hmm...
<davecheney> yes
<thumper> poos
<thumper> davecheney: so, I have "type Level uint32"
<thumper> but I can't have a Level variable and use atomic.LoadUint32
<thumper> because it complains about the casts
<thumper> so perhaps just better not to have the Level type?
<thumper> although I kinda like the String method on it
<thumper> or just have the Level at the public interface
<thumper> and use uint32 internally?
<davecheney> thumper: where is the code again
<thumper> launchpad.net/loggo
 * davecheney looks
<davecheney> two secs
<thumper> particularly considering the func (logger Logger) SetLogLevel method
 * davecheney twiddles fingers
<thumper> davecheney: also, FYI, I'm giving a talk on Go in about an hour
<davecheney> at the NZ meetup
<davecheney> sweet
<davecheney> >
<davecheney> ?
<davecheney> thumper: let me check one thing
<davecheney> i think when you use atomic.StoreUint32, you also need to read using atomic.LoadUint32
<davecheney> let me check in the channel
<thumper> davecheney: yes, was also using both
<thumper> read and write for sync
<davecheney> ok, cool
<thumper> atomic.StoreUint32 and LoadUint32
<thumper> however
<davecheney> indeed
<davecheney> logically that would appear to be the correct usage
<thumper> currently for the logging levels, this isn't done
<thumper> just assigning to a Level variable
<thumper> which is a uint32
<thumper> but the typing is just getting in the way of using the atomic functions to store and load
<thumper> which is frustrating
<davecheney> thumper: got the line(s)
<thumper> davecheney: I'm currently poking jam's branch
<thumper> which isn't currently pushed
<davecheney> ewww
<thumper> ewww what?
<davecheney> just paste the few lines into paste.ubuntu.com
<thumper> return level >= Level(atomic.LoadUint32(&globalMinLevel))
<thumper> sorry, someone at the door
<thumper> that is what I have by saving globalMinLevel as a uint32
<thumper> but I can't seem to have globalMinLevel as a Level, and still use atomic reads and writes
<thumper> davecheney: http://play.golang.org/p/pB65DtZrXr so this is what I want to do but can't
<davecheney> thumper: http://play.golang.org/p/0_ei1fpH9q
<thumper> davecheney: gah... ok, ta
<thumper> I'm pleased it is possible
<thumper> I was getting very frustrated
<davecheney> you need the parens to disambiguate the * deref
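The resolved pattern, as a self-contained sketch: Level keeps its named type (and String-friendly methods), and the parenthesised pointer conversion lets sync/atomic operate on it, which is legal because Level's underlying type is uint32. Function names here are assumptions for illustration:

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    type Level uint32

    var globalMinLevel Level

    // SetMinLevel stores atomically; pairing atomic stores with atomic
    // loads also guarantees the new value is safely published to other
    // goroutines.
    func SetMinLevel(l Level) {
        atomic.StoreUint32((*uint32)(&globalMinLevel), uint32(l))
    }

    // WillLog is the hot-path check discussed above.
    func WillLog(level Level) bool {
        return level >= Level(atomic.LoadUint32((*uint32)(&globalMinLevel)))
    }

    func main() {
        SetMinLevel(2)
        fmt.Println(WillLog(1), WillLog(3)) // false true
    }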
<davecheney> please hold, arguing about StoreUint32 in #go-nuts
<thumper> davecheney: as in arguing the need for it?
<davecheney> not really
<davecheney> more wandering in blind confusion at this point
<thumper> interesting
<thumper> changing the access to the module.level to use atomics made 0.1ns of difference
<thumper> rather than the 4ns of difference changing the globalMinLevel for the writer
<davecheney> thumper: i think i missed that part of the timing discussion
 * davecheney looks for globalMinLevel
<davecheney> thumper: honestly, if you're in the ns territory, it doesn't matter
<davecheney> even in the us territory
<thumper> davecheney: it is in the branch jam proposed
<thumper> not in trunk
<davecheney> can't see the code
<davecheney> but there are probably a few things going on there
<davecheney> 1. this is uncontented
<davecheney> no other CPU is stealing the cache line from the active one
<davecheney> so you're benchmarking the round trip to your L1/L2 cache
<davecheney> which is fair
<davecheney> most synchronisation is uncontended
<davecheney> i'm assuming because globalMinLevel has the word global in it
<davecheney> you're walking the logger tree to the root
<davecheney> correct ?
<thumper> I think so, but I'm off now
<thumper> perhaps we could chat tomorrow?
<davecheney> who's used debug-hooks recently
<davecheney> ?
<fwereade_> davecheney, I haven't but I might be able to talk about it?
<davecheney> fwereade_: so I have a version of debug-hooks that replicates the ssh into tmux behavior
<davecheney> (stole the script straight out of py juju)
<davecheney> but it doesn't really look right
<davecheney> ie, shouldn't HOOK_CONTEXT be exported, blah blah ?
<fwereade_> davecheney, yeah, I think so, HOOK_CONTEXT_ID is needed to make the hook tools run
<davecheney> and probably putting them in the path
<davecheney> but I think pyjuju cheats on that bit
<fwereade_> davecheney, yeah -- it ought to be as close a copy of the normal hook environment as possible
<davecheney> fwereade_: did such a thing contextually exist in pyjuju ?
<fwereade_> davecheney, there must have been *something* that communicated back with the main agent process... let me see if I can find it
<davecheney> fwereade_: i'm looking at control/debug_hooks.py
<fwereade_> davecheney, you've seen the other half of it in hooks/executor.py?
<davecheney> fwereade_: nope
<davecheney> there was a concerning something.setdebug(true)
<davecheney> that i chose to ignore
<davecheney> fwereade_: even if there was a signal to the unit agent
<davecheney> i can't find any hook(sic) in the command that does anything but wait til it reaches that point
<davecheney> then ssh in
<fwereade_> davecheney, I'm afraid I don't recall the experience of actually using debug-hooks
<davecheney> fwereade_: i saw marco use it once
<davecheney> https://codereview.appspot.com/9996043
<davecheney> it works, but doesn't really do much more than juju ssh atm
<fwereade_> davecheney, (but yeah, I think you can forget the set debug log bit for now -- it'd be worth syncing up with thumper on what he has planned in that direction though)
<davecheney> fwereade_: i'm trying to figure out if my facility is poor, or if the debug hooks world is more complicated in our juju
<fwereade_> davecheney, there's definitely some crazy magic that needs to happen at the other end, so that we do actually somehow inject the appropriate hook context while a hook is being run
 * davecheney scrobbles in the code
<davecheney> lucky(~/devel/juju/juju/hooks) % ack tmux
<davecheney> executor.py
<davecheney> 34:# The beauty below is a workaround for a bug in tmux (1.5 in Oneiric) or
<davecheney> 37:tmux new-session -d -s $JUJU_UNIT_NAME 2>&1 | cat > /dev/null || true
<davecheney> 38:tmux new-window -t $JUJU_UNIT_NAME -n {hook_name} "$JUJU_DEBUG/hook.sh"
<davecheney> oh ffs
<davecheney> that is what set debug does
<fwereade_> davecheney, ah, got you, I misunderstood what you said before
<fwereade_> davecheney, you will indeed need to set some state flag so the unit can know to do that
<davecheney> fwereade_: ok, i'll talk to le thump tomorrow
<fwereade_> davecheney, although... we will also need to make sure we unset it, however the connection ends, surely
 * davecheney noted that python didn't appear to unset whatever it sets
<fwereade_> davecheney, oh, ffs, it uses a ZK ephemeral node that goes away when the cli client goes away
 * fwereade_ sighs deeply
<fwereade_> davecheney, ...we could maybe use a presence node with a session guid? cleaning up feels likely to be racy, but perhaps it's actually equivalent
 * davecheney throws up his hands
<fwereade_> davecheney, ZK ephemerals don't go away immediately either
<davecheney> this is over my head
<davecheney> so much for debug-hooks being 'simple' to implement
<fwereade_> davecheney, yeah, I didn't have any recollection of that being the case myself
<davecheney> there was a suggestion of such on the ML
<fwereade_> davecheney, roughly speaking, the presence stuff should enable the same stuff that ZK ephemerals do, so once you're operating at a certain level of rarefied abstraction I can imagine how it could seem simple :/
<davecheney> sure, it's just programming, right ?
<fwereade_> davecheney, it's just 1s and 0s, type faster
<fwereade_> davecheney, and, hmm, we can't get the ownership guarantees with a pinger that we could with a ZK ephemeral node
 * davecheney goes to find whisky
 * fwereade_ looks at the clock, and regretfully does not join davecheney
<davecheney> fwereade_: it's always 5pm somewhere in the world
<fwereade_> davecheney, ok, we *could* fake up everything we need, I think -- I just need to get up properly before I can discuss this sanely
<fwereade_> davecheney, I'll be back in a few minutes; will you be free for a quick hangout then? I think I can give you a rough sketch of what we need and you can decide whether it's doable without bloodshed
<davecheney> fwereade_: i don't think it is worth it
<davecheney> this is over my head
<davecheney> i'll throw this card back into the pool and leave this command for someone else to use
<davecheney> this is a much bigger job than I thought
<davecheney> and this work is not scheduled for a good reason
<dimitern> morning
<fwereade_> dimitern, heyhey
<fwereade_> TheMue, heyhey
<rogpeppe> mornin' all!
<dimitern> fwereade_: hey
<dimitern> rogpeppe: morning
<TheMue> fwereade_, dimitern: heya
<rogpeppe> dimitern, TheMue: yo!
<dimitern> fwereade_: about to propose the machiner facade stuff
<TheMue> rogpeppe: oh, yes, welcome back too ;)
<fwereade_> rogpeppe, heyhey
<fwereade_> dimitern, cool
<rogpeppe> fwereade_: hiya!
<dimitern> rogpeppe: you might be surprised by my next CL :)
<rogpeppe> dimitern: how did you get on with that set of branches?
<dimitern> rogpeppe: didn't manage due to too much other stuff going on
<rogpeppe> dimitern: i haven't seen any emails go by, but that's i think because i never see any CLs i'm not directly involved in
<rogpeppe> dimitern: what's the "machiner facade" then?
<dimitern> rogpeppe: I did 3 separate refactoring proposals last week, all of them changing the API in a different way
<dimitern> rogpeppe: finally we agreed on how to move forward
<dimitern> rogpeppe: so basically we decided to get rid of srvMachine, srvUnit, api.Machine, api.Unit and related stuff
<dimitern> rogpeppe: and instead have srvMachiner, which implements a subset of the API only used by the machiner
<rogpeppe> dimitern: okay... what's the plan then?
<rogpeppe> dimitern: so you're putting the machiner in the API?
<dimitern> rogpeppe: and have lightweight "MachinerMachine" objects proxying calls through the facade
<rogpeppe> dimitern: i'm not sure i understand what you mean by that
<dimitern> rogpeppe: yeah, as an attempt to have an SOA-oriented approach, rather than replicate the state api directly
 * rogpeppe doesn't really know what "service-oriented architecture" actually means
<dimitern> rogpeppe: you'll see shortly - the code will explain it better
<rogpeppe> dimitern: could you point me towards a place where you were discussing this, so i can get some background?
<dimitern> rogpeppe: take a look at the juju-dev ML messages and this document: https://docs.google.com/a/canonical.com/document/d/1Yd2Nil43AemnBq8qv003OkWLptiPzuhWGESBcwlI7Nc/edit#heading=h.fdyojyfogyn1
<dimitern> rogpeppe: it's not up-to-date now, but the proposal implements what was agreed upon
<rogpeppe> dimitern: so, by "bulk operation" you mean that you have a set of objects and you perform the same operation on all of them at once? kinda like vector math but for objects?
<TheMue> rogpeppe: it's not soa in the typical sense, but the style is service-oriented: instead of m := getMachine(4711); m.DoThis(...), it's machineService.DoThis(4711, ...)
<dimitern> rogpeppe: yes
<fwereade_> rogpeppe, yeah, the consensus is essentially that domain objects always suck
<fwereade_> rogpeppe, client-side, we'll still be faking them up, so the refactoring doesn't kill us
<dimitern> rogpeppe: instead of having Machine.SetStatus() -> Machiner(args []{Id, status, info}) -> []errorresults
<rogpeppe> fwereade_: i guess i don't quite see the issue.
<dimitern> rogpeppe: s/Machiner/Machiner.SetStatus/
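Spelled out, that pseudocode might look like the following sketch (all names assumed, not juju-core's): a single bulk method returning one error slot per entry, so one bad id doesn't fail the whole batch:

    package machiner

    // SetStatusArg is one entry in a bulk SetStatus request.
    type SetStatusArg struct {
        Id     string
        Status string
        Info   string
    }

    // ErrorResult holds the outcome for one entry; nil Error is success.
    type ErrorResult struct {
        Error error
    }

    // API is the server-side machiner facade.
    type API struct{}

    func (a *API) setOneStatus(id, status, info string) error {
        // ... authorize for id, look it up in state, set the status ...
        return nil
    }

    // SetStatus applies every entry independently and reports one
    // result per entry, in request order.
    func (a *API) SetStatus(args []SetStatusArg) []ErrorResult {
        results := make([]ErrorResult, len(args))
        for i, arg := range args {
            results[i].Error = a.setOneStatus(arg.Id, arg.Status, arg.Info)
        }
        return results
    }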
<rogpeppe> fwereade_: how do domain objects always suck?
<fwereade_> rogpeppe, well, the API mimics state at the moment, and thereby preserves all the mistakes we made with state -- but is much harder to change
<fwereade_> rogpeppe, consider the provisioner
<fwereade_> rogpeppe, grabbing N machine objects individually is plainly insane
<rogpeppe> fwereade_: it calls AllMachines, no?
<fwereade_> rogpeppe, yes, and AllMachines is total unjustifiable crack
<rogpeppe> fwereade_: interesting p.o.v.
<rogpeppe> fwereade_: why so?
<fwereade_> rogpeppe, it's a load of unnecessary data that'll go stale, and which doesn't allow for useful bulk ops -- if we wanted to do the provisioner right, we'd be asking for 1000 life statuses, then getting the instance ids of the alive 800 of those, then getting whatever the next subset of information is
<fwereade_> rogpeppe, talking about individual machines renders this approach unworkable to the point you don't even consider it
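What that bulk-first provisioner flow could look like, under assumed names for the facade calls:

    package provisioner

    // Facade is the assumed bulk API surface used by the sketch below.
    type Facade interface {
        Life(ids []string) ([]string, error)
        InstanceIds(ids []string) ([]string, error)
    }

    // aliveInstanceIds asks for N life values in one round trip, keeps
    // the alive subset, then fetches all their instance ids in a second
    // round trip: two calls total, however many machines there are.
    func aliveInstanceIds(f Facade, ids []string) ([]string, error) {
        lives, err := f.Life(ids)
        if err != nil {
            return nil, err
        }
        alive := make([]string, 0, len(ids))
        for i, life := range lives {
            if life == "alive" {
                alive = append(alive, ids[i])
            }
        }
        return f.InstanceIds(alive)
    }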
<rogpeppe> fwereade_: we can get bulk ops easily by issuing multiple concurrent requests - but do you consider that an unworkable approach?
<fwereade_> rogpeppe, yes, it's insane
<rogpeppe> fwereade_: that's a strong statement :-)
<fwereade_> rogpeppe, makes it impossible to leverage the db's support for bulk ops
<rogpeppe> fwereade_: ah, that's a good point.
<fwereade_> rogpeppe, I understand *why* it happened like this -- we swapped backends without considering what the state package should look like in the new context
<rogpeppe> fwereade_: and mongodb has such support? (other than bulk query ops)?
<fwereade_> rogpeppe, bulk query ops are exactly what we need in general
<fwereade_> rogpeppe, and in the cases where we need bulk change ops (currently few, though I suspect they'll become more and more important as the project matures)
<rogpeppe> fwereade_: i thought query ops were very fast, and not likely to be a bottleneck for us
<fwereade_> rogpeppe, we still want to be able to express them in a compact way even if we end up needing to write intent to a queue and handle the txn changes in smaller batches
<rogpeppe> fwereade_: is there a risk that we're making things harder for ourselves by making a significant architecture shift here that might actually be premature optimisation?
<fwereade_> rogpeppe, I accept responsibility for that risk
<rogpeppe> fwereade_: have you made some measurements to support these decisions?
<fwereade_> rogpeppe, no, because we can't scale far enough to tell in the first place
<dimitern> there it is -> https://codereview.appspot.com/9896046/
 * dimitern ducks for cover with a steak :)
<fwereade_> rogpeppe, please read through the lists for better context than IRC
<rogpeppe> fwereade_: which lists?
<fwereade_> rogpeppe, juju-dev
<fwereade_> rogpeppe, on a related note, btw, what is the deal with Pinger?
<rogpeppe> fwereade_: i haven't seen anything on juju-dev since dimiter's response to my handoff email
<fwereade_> rogpeppe, have you switched to gmail yet?
<rogpeppe> fwereade_: ahhh
<rogpeppe> fwereade_: that will be the issue
<rogpeppe> fwereade_: i use gmail anyway - i need to point it to my canonical gmail i guess
<rogpeppe> fwereade_: it's a good thing actually that i didn't see any of this while i was away on holiday :-)
<fwereade_> rogpeppe, ha, yeah
<TheMue> rogpeppe: ;)
<rogpeppe> fwereade_: what about Pinger?
<fwereade_> rogpeppe, I was very clear that I did not want pinger in the API
<rogpeppe> fwereade_: ah, what would you like to call it?
<fwereade_> rogpeppe, and that I wanted it out of machiner so we could move forward without it complicating the issue, and give ourselves some breathing room to make a final decision without immediate pressure
<fwereade_> rogpeppe, I don't think it's justifiable at all in the API, myself, as you know
<rogpeppe> fwereade_: ah, so we don't want the agents telling the state they're alive at all?
<fwereade_> rogpeppe, we certainly do not want the machiner pretending it's the machine agent
<rogpeppe> fwereade_: what *is* the machine agent?
<fwereade_> rogpeppe, the bit that runs the various workers
<fwereade_> rogpeppe, not the workers themselves
<rogpeppe> fwereade_: presumably *some* worker has got to do it, no?
<fwereade_> rogpeppe, I'm not sure why that would be the case at all
<fwereade_> rogpeppe, why not keep it purely server-side?
<rogpeppe> fwereade_: how does the server side know a client is around?
<rogpeppe> s/client/agent/
<fwereade_> rogpeppe, I *hope* it can tell, otherwise it'll be running a lot of watchers for a client that's disconnected
<fwereade_> rogpeppe, anyway, the reason I wanted it out of the API was so that we could have this discussion independently of the critical path
<rogpeppe> fwereade_: so you're suggesting that the API see a connection from a given agent and run a pinger on its behalf as a result of that connection being made?
<fwereade_> rogpeppe, that is indeed a possibility, as we discussed in detail before I went away on holiday
<rogpeppe> fwereade_: i remember a few discussions in that area. i hadn't realised you didn't want the pinger in the API at all though, sorry.
<rogpeppe> fwereade_: istr the suggestion to put it in the Agent.Entity call, which didn't work out
<fwereade_> rogpeppe, I wanted it out of machiner so that we could discuss the answer to this question separately, without blocking the machiner work
<rogpeppe> fwereade_: ok, so let's just remove it then
<dimitern> rogpeppe, fwereade_: https://codereview.appspot.com/9896046/ ?
<fwereade_> dimitern, I'm reading it right now
<dimitern> (it's big, but there are mostly removals)
<fwereade_> dimitern, reviewed with a few thoughts
<dimitern> fwereade_: cheers
<fwereade_> dimitern, some of them are a bit vague
<fwereade_> dimitern, Machines in particular is maybe really just Exists()?
<fwereade_> dimitern, but I'm ambivalent about the Refresh()/Life() thing in particular
<dimitern> fwereade_: i think it's a good idea for Refresh to call Life and cache it
<fwereade_> dimitern, so long as we implement the API sanely, I think it's sensible to keep the *Machine interface as close as possible
<fwereade_> dimitern, cool
<fwereade_> dimitern, so long as we all know that's not how it "should" be long-term
<dimitern> fwereade_: yeah
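A minimal sketch of Refresh calling Life and caching the result, keeping the *Machine interface close to state's; every name here is an assumption:

    package machinerclient

    // lifeGetter is the single facade call this proxy needs.
    type lifeGetter interface {
        Life(id string) (string, error)
    }

    // Machine caches the last life value fetched over the API.
    type Machine struct {
        id   string
        life string
        st   lifeGetter
    }

    // Refresh re-queries the server and updates the cached life.
    func (m *Machine) Refresh() error {
        life, err := m.st.Life(m.id)
        if err != nil {
            return err
        }
        m.life = life
        return nil
    }

    // Life returns the value cached by the last Refresh.
    func (m *Machine) Life() string { return m.life }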
<dimitern> fwereade_: not sure I get the point of converting NotFound into Dead?
<dimitern> fwereade_: that's surely the same as how state works
<fwereade_> dimitern, yeah, but all it means is that all the client code has to go around specifically handling NotFound and handling it as Dead ;p
<fwereade_> dimitern, we kinda BDUFed the lifecycle stuff and that's one of the ickiness points
<fwereade_> dimitern, there may or may not in general be a distinction between Dead and NotFound, and it's situational
<fwereade_> dimitern, I suspect that if we're asking for Life explicitly it should probably be reported as Dead -- although this then makes auth errors interesting
 * fwereade_ grumbles at the world
<dimitern> fwereade_: well, how about a true "not found" case?
<dimitern> fwereade_: i mean you asked for a machine that was never there
<fwereade_> dimitern, ok, let's start with Machines() -- what's the use case there on the server side? to get a domain object, we need to discover its Life, but that's the only call we need -- we already know the ID
<fwereade_> dimitern, not-auth is actually a reasonable response to a Life query I guess
<fwereade_> dimitern, (in general, anyway...) but from the POV of the machiner, I think that reporting notfound as Dead is actually quite useful... am I making sense here, or is this all too dependent on my internal context?
<fwereade_> dimitern, it's like calling EnsureDead on a machine that doesn't exist -- it succeeds
<dimitern> fwereade_: do you mean that only for life?
<dimitern> fwereade_: i.e. Machines will still return not found as usual
<fwereade_> dimitern, I'm trying to figure out what the use case is for Machines()
<dimitern> fwereade_: what? no - ensuredead fails on a non-existing machine through the API
<dimitern> fwereade_: because we need to get the machine first
<dimitern> fwereade_: but i guess in state it operates on cached life and might succeed
<fwereade_> dimitern, that's an interesting behaviour change
<fwereade_> dimitern, g+ a mo?
<dimitern> fwereade_: ok, just a sec
<dimitern> fwereade_: https://plus.google.com/hangouts/_/971de9659aecd256626b1d52513288375093bf72?authuser=0&hl=en
<dimitern> rogpeppe: can you take a look as well please? https://codereview.appspot.com/9896046/
<rogpeppe> dimitern: i'm currently going through all the backlog
<dimitern> rogpeppe: ok
<dimitern> fwereade_: if i remove the authEnvironManager from the allowed perm checks
<dimitern> fwereade_: then it kinda defeats the point of having bulk operations - any method on the machiner will only ever succeed for the machine this machiner is responsible for
<dimitern> fwereade_: do you know what I mean?
<dimitern> fwereade_: anyway, it's updated now https://codereview.appspot.com/9896046/
<wallyworld> TheMue: hey frank, thanks for the review. as written, i think the provisioner tweak is safe to land now so i'd like to do that and then your changes can come along later?
<jam> fwereade_: so, we currently have a test that is broken with latest mgo. And it will prevent us from using the juju-core go-bot. Care to give direction about how to fix it?
<jam> TestOpenDoesNotDelayOnHandShakeFailure
<jam> it was written by Dave, because he implemented the logic to have juju-core delay if it gets a connection failure, but *not* delay if it gets a TLS handshake failure.
<jam> The test now fails because mgo unconditionally delays 500ms on *any* failure.
<TheMue> wallyworld: it's ok for me. the information ContainerType() is returning matches exactly my needs.
<wallyworld> great :-)
<TheMue> wallyworld: just code a little outline where it better can be seen how it will be used.
<TheMue> wallyworld: eh, not "just code", but "i'm just coding" ;)
<wallyworld> TheMue: ok. so are you wanting to land your work before mine? wouldn't mine need to land before yours?
<wallyworld> so you can use the new ContainerType() method?
<fwereade_> dimitern, the remaining point of bulk operations is in habit and consistency; and in that we *can't* really predict what a machiner will ultimately be responsible for, and we get some future-proofing by allowing ourselves to express multiple ops if we ever need them
<fwereade_> dimitern, I do agree that the machiner is not in itself a compelling use case for bulk ops
<dimitern> fwereade_: sure, np
<wallyworld> fwereade_: i think you'll be happy with this now, hopefully https://codereview.appspot.com/9820043/
<dimitern> fwereade_: wanna take a look now?
<fwereade_> dimitern, wallyworld, thanks both, I'll take a look
<wallyworld> thanks
<fwereade_> jam, I would provisionally be ok dropping that test and that behaviour -- it seems like it's been taken out of our hands with the mgo change
<jam> fwereade_: well *a* fix is to get something upstreamed
<jam> but an *easier* fix is to just drop it :)
<fwereade_> jam, but it would be good to have a word with davecheney for a bit more context
<jam> fwereade_: AIUI the initial issue was that mgo always retried without any delay, which punished things a bit
<jam> so we put in a delay, and now so has mgo directly.
<fwereade_> wallyworld, what were your thoughts re unit-dirty vs container-dirty?
<fwereade_> jam, in that case I would be happy dropping it, assuming davecheney's approval
<fwereade_> just in case i missed something
<TheMue> wallyworld: yours imho can already land, yes
<wallyworld> fwereade_: hmmm. i must confess i forgot it, if it was raised as a question. sorry. the flag as implemented is a unit-dirty flag i guess. but that's all we need now i think
<wallyworld> TheMue: thanks. i promise to fix that method comment if you +1 it :-)
<fwereade_> wallyworld, fair enough -- I think there will be some interestingness in future though -- possibly I raised it in a different CL
<TheMue> wallyworld: great
<wallyworld> fwereade_: the idea now is that a unit can be deployed if unit-clean=true; if a machine has containers, it doesn't really matter for that case
<wallyworld> or so i understand
<fwereade_> wallyworld, I'm just a bit antsy about what we really want to express -- "clean" and "unused" both feel like rational and distinct requests
<fwereade_> wallyworld, I'm willing to call this progress, though -- nothing's using "unused" at the moment, right?
<fwereade_> wallyworld, I *will* be nervous about reactivating unused while it's not able to take constraints into account, though
<wallyworld> fwereade_: for now, afaiui, we only need to care about clean rather than unused (as far as containerisation goes)
<wallyworld> fwereade_: i think the current AssignUnused really should be AssignClean
<wallyworld> since that's the semantic we are really aiming for right now
<wallyworld> afaiui
<fwereade_> wallyworld, I'd be +1 on a rename there, but you don't need it in this CL
<danilos> mumble trouble :/
<mgz> you're hoping a lot ::)
<wallyworld> fwereade_: yes, i agree with the rename. as you say, that's part of the evolution of this work and not for this mp
<wallyworld> that also matches my thinking of the issue
<jam> danilos: we've put you in another room for now, hopefully you can get it working again.
<danilos> jam: I am trying
<danilos> jam: works fine then kicks me out 5 seconds later :/
<jam> you could try restarting completely...
<fwereade_> wallyworld, you have an LGTM, sorry I let that one linger
<fwereade_> dimitern, I'm on yours now
<wallyworld> np, thanks
<dimitern> fwereade_: cheers!
<danilos> jam: I can, if you mean rebooting? (I've killed mumble and tried again, fwiw)
<jam> that is what I meant, though we could also switch to a hangout
<jam> wallyworld, danilos, mgz: https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
<jam> dimitern: ^^
<dimitern> fwereade_: blast, I realized I have to add MachinerMachine.EnsureDead() and also add client-side tests for the machiner
<fwereade_> dimitern, you *could* very happily just strip out the client-side code from this CL, and repropose those with tests in a new one
<dimitern> fwereade_: good idea, will do
<dimitern> fwereade_: so the pipeline will be: 2) client-side + tests, 3) split suites, 4) split (apiserver|api)/machiner into a separate subpackage, 5) implement Machiner.Watch,
<dimitern> fwereade_: sounds good?
<fwereade_> dimitern, SGTM
<fwereade_> dimitern, and LGTM
<dimitern> fwereade_: tyvm
<fwereade_> dimitern, (although let me know your thoughts on the comments, and if you decide they're candidates for this CL so much the better)
<fwereade_> dimitern, I'm very happy with splitting Authorizer out
<dimitern> TheMue: since you're the OCR today, can you have a look as well? https://codereview.appspot.com/9896046/ (disregard the state/api/machiner.go stuff - will split it in a follow-up)
<TheMue> dimitern: yep
<dimitern> fwereade_: i'm looking at your review and will ask if something is unclear
<danilos> TheMue, hi, I wonder if you can take a look at https://codereview.appspot.com/9876043/?
<TheMue> danilos: *click*
<danilos> TheMue, thanks :)
<TheMue> danilos: Done.
<danilos> TheMue, thanks
<fwereade_> wallyworld, ping
<wallyworld> hi
<fwereade_> wallyworld, free for a chat about containers?
<wallyworld> sure
<fwereade_> wallyworld, I'll start one
<mgz> can I sit in?
<wallyworld> oh, alright
<fwereade_> mgz, sure, I'll invite, just a sec
<dimitern> fwereade_, TheMue: next one - https://codereview.appspot.com/9686047/
<TheMue> dimitern: *click*
<fwereade_> wallyworld, mgz, ffs, sorry
<mgz> no problem
<fwereade_> waiting for plus.google.com...
 * fwereade_ sighs
<fwereade_> wallyworld, mgz, it really doesn't want to talk to me today :/
<wallyworld> fwereade_: try mumble?
<fwereade_> wallyworld, I really ought to have set that up at some point since I started here, shouldn't I :/
<wallyworld> lol
<fwereade_> I'll bounce my router on general principles, bbiab
<wallyworld> kk
<dimitern> TheMue: thanks
<frankban> rogpeppe, dimitern,anyone else: I need another review for https://codereview.appspot.com/9641044/ . could you please take a look?
<TheMue> dimitern: yw
<dimitern> frankban: looking
<frankban> dimitern: thanks
<dimitern> fwereade_, TheMue: and another small one: https://codereview.appspot.com/10003044/
<fwereade__> TheMue, mramm, kanban?
<dimitern> frankban: LGTM
<frankban> dimitern: great, thank you. The test with a float is already present as part of the schema tests.
<dimitern> frankban: wasn't sure, but ok
<TheMue> sh..., sorry, just a technician arrived *grmpf*
<fwereade__> grar
<fwereade__> google dislikes me today
<fwereade__> and it really is just google
<dimitern> fwereade__: :/
<fwereade__> launchpad is positively sprightly by comparison
 * fwereade__ sighs deeply
<fwereade__> dimitern, rogpeppe1, mramm, I think I'm going to give up on this and go reboot ALL THE THINGS
<dimitern> TheMue: https://codereview.appspot.com/10003044/
<dimitern> also second reviewer needed on https://codereview.appspot.com/9686047/
<TheMue> dimitern: *click*
<dimitern> TheMue: cheers
<dimitern> TheMue: tyvm
<dimitern> need to relax a bit, i'm off for now; might be back later
 * rogpeppe2 is done for the day. g'night all.
<thumper> fwereade_: around?
 * thumper sighs
<thumper> can anyone else confirm a build failure with trunk?
 * thumper wonders where our tarmac committer is
<thumper> grr
<thumper> dimitern_: you broke trunk, naughty naughty
<thumper> r1247
<thumper> jujud tests fail to build
<wallyworld> thumper: tarmac is almost ready - was waiting on a failing mongo test to be fixed
<thumper> wallyworld: hi there
<wallyworld> but that test will be deleted
<wallyworld> hi
<wallyworld> how's the dog?
<thumper> wallyworld: how am I supposed to know that? trunk build fails, this is bad
<thumper> dog is fine, was sleeping
<wallyworld> thumper: no, separate issue
<thumper> now is staring at me
<wallyworld> test failure (resulting from upstream mongo changes) was preventing the tarmac bot being deployed
<wallyworld> and don't you love how we just pull upstream from tip so we are not isolated from breaking changes
<thumper> yeah, it's awesome
<wallyworld> ah, who needs dependency management
<thumper> didn't kapil have a dep management thing to add in?
<thumper> hazmat: where is that?
<wallyworld> yeah, but now someone else has proposed yet another Go solution
<thumper> wallyworld: do you know what needs to be fixed in the failing test?
<thumper> wallyworld: I'm not sure what the test is trying to test
<wallyworld> thumper: gopm or something. but given how the last person who proposed something was shot down in flames, i'm not optimistic
<thumper> who proposed a go thing?
<wallyworld> thumper: no idea. i just looked at irc and saw your comments. i know nothing of the build failure yet
<wallyworld> the failing test is being deleted
<thumper> you said someone proposed a go solution...
<thumper> i was asking about that
<wallyworld> yeah, i was told, let me try and find something
<wallyworld> thumper: https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/k8pmk8FQC8w
<wallyworld> https://github.com/GPMGo/gopm-api/
<wallyworld> so just a proposal really it seems at this stage
<wallyworld> thumper: if you add your +1 to this i can land it https://codereview.appspot.com/9820043/
<fwereade_> thumper, hey dude
<fwereade_> thumper, sorry I haven't been around much at sociable hours the last few days
<thumper> wallyworld: sorry, doggy break needed
<thumper> fwereade_: hey
<thumper> wallyworld: we shouldn't land anything new until trunk is fixed
<wallyworld> thumper: sure, i just want to get it unblocked, not going to land immediately
<thumper> fwereade_: so trunk is broken due to r1247
<thumper> fwereade_: just wondering what to do with the now failing test
<thumper> wallyworld: ack, I'll look shortly
<wallyworld> no hurry
 * thumper sighs
<thumper> it isn't obvious how to fix this test
 * thumper comments out the whole test
<thumper> wallyworld: can I get a +1 trivial on this? Rietveld: https://codereview.appspot.com/10022043
<thumper> I could just merge it in, but someone else agreeing helps
 * wallyworld looks
<wallyworld> thumper: done. i had a quick look too but it wasn't immediately obvious what the replacement api to call was
<thumper> yeah
<wallyworld> we could have found it i guess, but other things to do
<fwereade__> thumper, sorry, just getting up to date
<fwereade__> thumper, oh, hell, what's broken?
<thumper> fwereade__: just a test, see review just above
<fwereade__> thumper, ah, ok, the issue is that API no longer has a .Machine()?
<fwereade__> thumper, I'd just drop that bit
<fwereade__> thumper, being able to log in should be evidence enough that something's serving the API
<thumper> fwereade__: well, actually submitted already
<thumper> :)
<fwereade__> thumper, ok, just mail dimitern_ with some light bitching about running *all* the tests then ;p
 * thumper nods
<thumper> fwereade__: do you have time for a quick catch up?
<thumper> like a hangout?
<fwereade__> thumper, sure, would you start one please? with you in 2
<thumper> fwereade__: ok
<hazmat> thumper, it's kind of lame.. ie it works for the ci use case only. it's at lp:goreq
<thumper> fwereade__: https://plus.google.com/hangouts/_/69cccc01076c5b15bb3afbf54ba00501977e7b80?hl=en
<hazmat> there's better vcs management in go juju's deployer impl lp:juju-deployer/darwin
<hazmat> definitely a few go build tools popping up
<thumper> not surprising
<thumper> the problem hits everyone
<hazmat> besides the ones on the golang list.. there's also mozilla's heka-build tool which  supports compile time plugins.
<hazmat> as well as the frozen/repeatable vcs version sets
<hazmat> the latter of which is all goreq does, update a tree/gopath to a known set of versions
<fwereade__> mramm, ping
<thumper> mramm: are you alive?
<mramm> thumper: fwereade__: yep I'm here
<mramm> and alive
<thumper> mramm: can you join us in a hangout?
<mramm> sure
<thumper> mramm:  https://plus.google.com/hangouts/_/69cccc01076c5b15bb3afbf54ba00501977e7b80?hl=en
<thumper> wallyworld: got a few minutes?
<wallyworld> thumper: on a call
<thumper> wallyworld: ack
<thumper> wallyworld: I'm going to go to the gym in about 20 minutes, so perhaps we'll chat when I'm back?
<wallyworld> sure
#juju-dev 2013-06-05
<thumper> wallyworld: ping
<wallyworld> pong
<thumper> wallyworld: hangout?
<wallyworld> sure
<wallyworld> you have one handy?
<thumper> nope,
<thumper> can make one
<wallyworld> ok, your turn
<thumper> https://plus.google.com/hangouts/_/697901d4a23e194d51d2d6a8e6e35c8f7cfe0102?hl=en
<wallyworld> davecheney: go question for you if you are about
<wallyworld> nevermind, figured it out
<davecheney> wallyworld: back
<wallyworld> hi, i figured it out, thanks
 * thumper afk to make dinner, may pop back later to catch fwereade__
<dimitern_> morning
<dimitern_> shit.. my bad, sorry about trunk
<dimitern_> will fix the test and propose a patch
<fwereade__> thumper-afk, hey, just woke up, not starting work quite yet, really dropped by to say hi to TheMue
<fwereade__> TheMue, ping
<TheMue> fwereade__: pong
<fwereade__> TheMue, heyhey
<TheMue> fwereade__: heya
<dimitern_> https://codereview.appspot.com/9686047/
<dimitern_> this includes a fix for the broken test as well
<dimitern_> fwereade__: ^^
<dimitern_> wallyworld: ping
<dimitern_> rogpeppe2: need second LGTM on this, if you can have a look? https://codereview.appspot.com/10003044/
<rogpeppe2> dimitern_: looking
<rogpeppe2> dimitern_: reviewed
<dimitern_> rogpeppe2: cheers
<dimitern_> bump https://codereview.appspot.com/9686047/
<dimitern_> fwereade__: ping
<fwereade__> dimitern_, I will be on reviews as soon as possible
<dimitern_> fwereade__: cheers; sorry, just checking, it seems pretty quiet today
<dimitern_> rogpeppe2: the idea behind embedding baseSuite is to be able to reuse openAs mainly
<rogpeppe2> dimitern_: errorSuite doesn't use a state connection
<dimitern_> rogpeppe2: ah, you mean only there
<rogpeppe2> dimitern_: yeah
<dimitern_> rogpeppe2: ok
<dimitern_> rogpeppe2, fwereade__: another CL https://codereview.appspot.com/10044043
<dimitern_> fwereade__, rogpeppe2: and another (the corresponding work on client-side) https://codereview.appspot.com/10026044
<dimitern_> fwereade__, rogpeppe2: oops, forgot to change MachinerMachine, will repropose
<fwereade__> dimitern_, ha, I already LGTMed, didn't spot that
<fwereade__> dimitern_, thanks
<rogpeppe2> dimitern_: BTW, about the machiner needing only the life status, won't it need to know the provisioning nonce too?
<dimitern_> fwereade__: I would've missed it if not for your comment on the earlier CL
<dimitern_> rogpeppe2: well, not the machiner per se, the machine agent needs that though
<rogpeppe2> dimitern_: same difference from the API perspective
<dimitern_> rogpeppe2: I was thinking of having another facade for the MA
<dimitern_> fwereade__: thoughts? ^^
<rogpeppe2> dimitern_: what's the difference between the MA and the machiner from the API perspective?
<rogpeppe2> dimitern_: i think that having a separate facade for each entity that wants to talk to the API is verging on insanity, but i realise i'm probably the only one that thinks so
<fwereade__> rogpeppe2, do you know off the top of your head where agents download cs: charms from? env storage, or the charm store?
<rogpeppe2> fwereade__: env storage, i'm pretty sure
<rogpeppe2> fwereade__: but i may be wrong
<dimitern_> rogpeppe2: well, aiui the facades give us extra security by not exposing methods that the entities that use them don't need
<fwereade__> rogpeppe2, hmm, how do we do a charm store Get in the api server?
<rogpeppe2> fwereade__: we've got to
<rogpeppe2> fwereade__: we've got to fetch the charm in order to push it into env storage
<fwereade__> rogpeppe2, that shouldn't have a JUJU_HOME, though
<fwereade__> rogpeppe2, ah! we set a global variable in the charm package
<fwereade__> rogpeppe2, hmm
<fwereade__> rogpeppe2, that was the missing piece, thanks
<rogpeppe2> fwereade__: what's the JUJU_HOME connection
<rogpeppe2> dimitern_: i'm not convinced that extra security is worth 100s or 1000s of lines of extra code
<fwereade__> rogpeppe2, I spotted the panic-if-no-charm.CacheDir, and the setting of it in Conn
<fwereade__> rogpeppe2, but missed the setting of it in jujud
<rogpeppe2> dimitern_: i'm not even sure it really buys us that much security tbh
<rogpeppe2> fwereade__: ah
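For the record, the mechanism being pieced together here: the charm package holds a package-global cache directory, the download path panics if it is unset, and both the client Conn and jujud set it at startup. A sketch of that wiring; the JUJU_HOME-derived path is illustrative (the real code derives it from config):

```go
package main

import (
	"os"
	"path/filepath"

	"launchpad.net/juju-core/charm"
)

func main() {
	// Both juju.Conn and jujud do the equivalent of this before any
	// charm-store fetch; downloads panic if charm.CacheDir is empty.
	charm.CacheDir = filepath.Join(os.Getenv("JUJU_HOME"), "charmcache")
}
```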
<dimitern_> rogpeppe2: it's nowhere near 100s or 1000s of lines, maybe a few extra 100s, which is not so bad for the added benefit imho, and it's better design in general i think
<fwereade__> rogpeppe2, dimitern_, if we're talking about dropping the pointless extra API call on Machiner construction just so we fail unauthed a little earlier I'm keen to kill it
<dimitern_> rogpeppe2: well, if there's no method you can call you cannot screw things up, no?
<fwereade__> rogpeppe2, dimitern_, the Authorizer will handle that anyway, right?
<dimitern_> fwereade__: no
<fwereade__> rogpeppe2, dimitern_, if not, ignore me
<rogpeppe2> dimitern_: that is true, but vast, sprawling APIs have their own cost
<rogpeppe2> dimitern_: and i fear that is where we are heading
<dimitern_> fwereade__: we're talking whether it's worth having a separate facade for each worker/agent
<fwereade__> dimitern_, rogpeppe2: different tasks need different information, and have different auth concerns
<dimitern_> fwereade__: for the sake of better separation of who needs what and security
<fwereade__> dimitern_, basically yeah
<dimitern_> fwereade__: so having Machiner *and* MachineAgent facades
<rogpeppe2> fwereade__: that can be dealt with inside the various methods, no?
<fwereade__> dimitern_, although "better security" is here in my mind more in the sense of "comprehensible auth code" rather than actually "different macroscopic behaviour"
<rogpeppe2> dimitern_: there is really no point in having more than one facade for a given authenticating entity
<fwereade__> dimitern_, rogpeppe2, firm +1 on separating Machiner from MachineAgent
<dimitern_> fwereade__: on the api level?
<dimitern_> fwereade__: i.e. Machiner and MachineAgent as separate facades
<fwereade__> dimitern_, rogpeppe2: the machiner has a very tightly circumscribed set of responsibilities, no sense exposing stuff not needed by it
<dimitern_> fwereade__: agreed
<fwereade__> dimitern_, rogpeppe2: there's enough casual conflation of the two already, I would prefer not to extend that
<rogpeppe2> fwereade__: if we're talking about security, what is the point of duplicating types for two clients that are sharing the same session?
<dimitern_> rogpeppe2: they *are* different clients
<fwereade__> rogpeppe2, with different responsibilities and usage patterns
<rogpeppe2> fwereade__: not really. certainly not from a security perspective.
<dimitern_> rogpeppe2: and separate connections
<fwereade__> dimitern_, not separate connections, surely?
<dimitern_> rogpeppe2: although a bit "virtual"
<rogpeppe2> dimitern_: they share the exact same websocket connection
<dimitern_> fwereade__: well the connection is one, but the way they access the api is through different facade factory methods
<dimitern_> hence the "virtual" part
<rogpeppe2> fwereade__, dimitern_: it seems to me that this approach amounts to duplicating a subset of the state package for every possible client.
<dimitern_> so Machiner uses the machiner facade the same way as before it would use api.State
<fwereade__> rogpeppe2, agreed -- the point is that we separate the context of the call from the call itself, and can do the auth work specific to the context; underneath, behind the API, we can absolutely share implementations for the things that make sense to do so
<rogpeppe2> it seems like utter crack to me i'm afraid
<dimitern_> and the MA similarily
<fwereade__> rogpeppe2, exactly so
<fwereade__> rogpeppe2, your problem with this is?
<rogpeppe2> fwereade__: huge amounts of wasted code.
<rogpeppe2> fwereade__: poor maintainability
<fwereade__> rogpeppe2, so a change to the provisioner's usage of a machine should always affect how the machiner sees the world? interesting perspective
<rogpeppe2> fwereade__: i think it shouldn't be too hard to partition the API so that it can be appropriate for all its clients.
<fwereade__> rogpeppe2, we already failed there -- the stuff we send down with the AllWatcher doesn't match the stuff we were sending to the agents
<rogpeppe2> fwereade__: the client interface *is* different. but i don't think we need to do the same thing for every agent.
<fwereade__> rogpeppe2, I think it's suspect to assume that every client needs the same view of the underlying entities they manipulate
<fwereade__> rogpeppe2, that smacks of special pleading
<fwereade__> rogpeppe2, most workers use very different subsets of the same state objects today
<fwereade__> rogpeppe2, the overlap STM to be pretty minimal
<rogpeppe2> fwereade__: they all deal with the same set of basic types - Machine, Unit, etc
<fwereade__> rogpeppe2, well... some of them use overlapping subsets of those basic types, but in very different ways
<rogpeppe2> fwereade__: i think i'm missing something here: where do you see a potential security problem with a shared agent state API?
<rogpeppe2> fwereade__: AFAICS the main issue we're talking about here is visibility of some fields
<rogpeppe2> fwereade__: and i can't currently think of fields whose exposure would be a problem. i'll have another look though.
<fwereade__> rogpeppe2, two sides to it -- one is "don't expose stuff there's not a known need for, security holes are subtle and quick to anger"
<fwereade__> rogpeppe2, but the more important side is
<fwereade__> rogpeppe2, that smooshing all the various security contexts into a single method
<fwereade__> rogpeppe2, is hard to understand and hard to maintain
<rogpeppe2> fwereade__: i think that maintaining n different versions of the same API is also hard to maintain (and to understand and use too)
<fwereade__> rogpeppe2, and that putting the auth for a specific context in a context-specific method is noticeably more clear and maintainable, which *is* a second-order security consideration
<fwereade__> rogpeppe2, how is it hard to maintain when we can share implementations of the bits that really are the same, and keep auth separate from that?
<fwereade__> rogpeppe2, STM like a maintainability win
<rogpeppe2> fwereade__: i'm not sure how much we'll really be able to share
<fwereade__> rogpeppe2, then your argument against maintainability evaporates, no?
<rogpeppe2> fwereade__: eh?
<fwereade__> rogpeppe2, if they're different methods, np, no burden
<fwereade__> rogpeppe, if they're not different at heart, we can share an implementation
<rogpeppe> fwereade__: not necessarily.
<fwereade__> rogpeppe, either way, putting the auth for a specific context in that context, rather than having one place responsible for all the auth logic, is a clear win
<fwereade__> rogpeppe, expand on "not necessarily" please?
<rogpeppe> fwereade__: it depends on the types of the methods
<rogpeppe> fwereade__: and whether they contain references to other types that are implemented by the facades
<jam> mgz: poke
<fwereade__> rogpeppe, are you talking about the client-side implementation burden?
<fwereade__> rogpeppe, rather than the server?
<rogpeppe> fwereade__: both actually
<fwereade__> rogpeppe, the types are all in params...
<mgz> hey jam
<rogpeppe> fwereade__: the plan was to remove params eventually
<rogpeppe> fwereade__: along with statecmd
<rogpeppe> fwereade__: params was a temporary hack to get around a temporarily inevitable cycle
<fwereade__> rogpeppe, having a common vocabulary with which to talk to the various facades seems not unreasonable to me, especially since it already exists and will ease the burden of implementing common functionality in a common way
<fwereade__> rogpeppe, is there a deeper philosophical concern here?
<rogpeppe> fwereade__: really, i just feel we're multiplying entities many degrees beyond necessity.
<rogpeppe> fwereade__: and it tickles my spider sense badly
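A sketch of the structure fwereade__ is arguing for: each task gets its own small facade that applies context-specific auth and then delegates to a shared implementation. All names are illustrative, not juju-core's actual types:

```go
package apiserver

import "errors"

var errPerm = errors.New("permission denied")

// Authorizer reports what the authenticated connection may do.
type Authorizer interface {
	AuthOwner(tag string) bool
}

// lifeOf stands in for the shared, state-backed implementation that
// every facade reuses behind the API.
func lifeOf(tag string) (string, error) { return "alive", nil }

// Machiner exposes only what the machiner task needs: a machine agent
// may ask about its own machine and nothing else.
type Machiner struct{ auth Authorizer }

func (m *Machiner) Life(tag string) (string, error) {
	if !m.auth.AuthOwner(tag) {
		return "", errPerm
	}
	return lifeOf(tag)
}

// Provisioner reuses the same implementation under different auth
// rules: the auth lives with the task's context, not in one method
// that must understand every possible caller.
type Provisioner struct{ auth Authorizer }

func (p *Provisioner) Life(tag string) (string, error) {
	return lifeOf(tag) // the environment provisioner may ask about any machine
}
```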
<jam> mgz: care to do a teddy bear chat about tarmac + requirements, etc?
<mgz> jam: lets do that. mumble?
<jam> mgz: I'm there already :)
<rogpeppe> fwereade__: i feel like we had a nice clear plan for moving forward with this and ending up with a nice clean, coherent and maintainable API, and that suddenly it's all gone to pot and i'm pretty unhappy about it
 * TheMue is afk, lunchtime
<rogpeppe> fwereade__: it may be just post-holiday blues though
<fwereade__> rogpeppe, I felt like we had a short-sighted "let's copy the state api" plan that we only went forward with for want of better ideas
<rogpeppe> fwereade__: that seemed like a really good, low-risk plan to me
<fwereade__> rogpeppe, the state api is, let's not forget, a port of a port of an api designed for a different language against a different backend
<fwereade__> rogpeppe, the core concepts of juju can still be dimly seen through the mist
<rogpeppe> fwereade__: the one thing i really did *not* want to do was to redesign the whole thing as we moved to using a network API
<rogpeppe> fwereade__: i feel we're becoming more and more vulnerable to second-system effect in this area
<fwereade__> rogpeppe, we're designing the *network API itself* -- it is definitely not the same thing as we had before, and I am really glad that we've actually done some design work that takes the different context into consideration
<fwereade__> rogpeppe, rather than implementing a port of a port of a port
<fwereade__> rogpeppe, I think it's a huge win that once we've done the machiner we will have *done the machiner*, rather than half-implemented a type which'll be needed in different contexts in the future, all of which can/will evolve in subtly different directions
<fwereade__> rogpeppe, and which exposes all capabilities to all things, and depends on authorizing all operations for all possible contexts in the same place
<rogpeppe> fwereade__: but then you've got to do the same thing for the provisioner, firewaller, uniter and every other thing we might want to talk to the API. we're not designing "an API" - we're designing many, and the extra burden will continue.
<rogpeppe> fwereade__: it feels like heinous duplication of effort to me
<fwereade__> rogpeppe, better to have many small APIs with clearly defined and largely non-overlapping responsibilities than to propagate the monolithic state package's design to a context that doesn't suffer the same constraints
<jam> davecheney: hey, can you look at: https://code.launchpad.net/~jameinel/juju-core/timeout-bug-1183320/+merge/167264  (or https://codereview.appspot.com/9984046/ if you prefer)
<jam> you wrote the original logic and test, but mgo just changed its logic behind our back
<rogpeppe> fwereade__: i might be happier if the API was not segmented by agent name
<fwereade__> rogpeppe, *task* name not agent name
<rogpeppe> fwereade__: even worse - there's a potentially unbounded set of tasks
<fwereade__> rogpeppe, as opposed to segmentation by domain object type?
<dimitern_> fwereade__: so we're not doing MachineAgent facade then after all?
<rogpeppe> fwereade__: yes
<fwereade__> dimitern_, MachineAgent, I assumed, was for that subset of behaviour specific to the *agent* as opposed to the workers it runs -- did you have a different concept in mind?
<fwereade__> rogpeppe, so your contention is that the monolithic nature of state is a Good Thing, rather than a sad necessity dictated by the constraints of the backend?
<davecheney> jam: yeah i was looking at that earlier today
<davecheney> i didn't read the whole thing
<davecheney> but my general approach is
<rogpeppe> fwereade__: yes. in particular for clients, i think i should import "launchpad.net/juju-core/state/api" rather than "launchpad.net/juju-core/state/api/mytask"
<davecheney> the retry delay caused so many headaches
<davecheney> let's just rip it out and live with it
<davecheney> this *IS* a problem that must be solved in mgo
<dimitern_> fwereade__: my idea was to have state/apiserver/machineagent/ facade
<davecheney> we've tried as much as we can to avoid asking for a properly configurable retry delay
<dimitern_> fwereade__: and the corresponding state/api/machineagent
<rogpeppe> fwereade__: and in fact is there a good reason we can't have that, even if we use different types for each agent on the server side?
<fwereade__> dimitern_, that sounds fine -- that's where stuff like CheckProvisioned and SetPassword should live, right?
<jam> davecheney: k. I think I mostly ripped it out, with the caveat that we still have a custom dial because we need to setup the TLS stuff.
<dimitern_> fwereade__: exactly
<fwereade__> rogpeppe, keeping packages small and task-focused isn't a good enough reason?
<fwereade__> dimitern_, I think that sounds good to me
<dimitern_> fwereade__: cool
<rogpeppe> fwereade__: state is far too big, i agree. but splitting it into n slightly overlapping subsets of it is not going to help matters IMO
<fwereade__> dimitern_, (*consider* the overlap between unit and machine, though, it *may* be that we can clearly define an *agent*-specific api that works for units and machines... just something to bear in mind when we implement the second one)
<dimitern_> fwereade__: i have a list of all state calls by each agent/worker
<rogpeppe> fwereade__: at the very least, i think we should organise the API according to authenticating entity rather than according to the task at the other end.
<fwereade__> rogpeppe, splitting it into N non-overlapping subsets that are partitioned by underlying data model, rather than business logic, seems like a pretty serious encapsulation violation
<dimitern_> fwereade__: the MA only uses machines
<davecheney> jam: consider that LGTM from me
<jam> davecheney: thanks
<dimitern_> fwereade__: the only overlap is in agent.go, where we need Life and SetMongoPassword for both units/machines
<rogpeppe> fwereade__: it is a trade off, of course.
<davecheney> we should be asking Gustavo to add all these features for us
<davecheney> we've tried to do it ourselves, and it blew
<rogpeppe> davecheney: could you raise an issue against mgo?
<rogpeppe> davecheney: hiya, BTW
<davecheney> rogpeppe: sure, i'll set myself on fire for an encore
<fwereade__> dimitern_, yeah, that was just a thought, leave it in the back of your mind for now :)
<jam> rogpeppe: well, he added delay in mgo, which we had done in juju-core, which then broke the test that asserted it delayed the way we were doing it. :)
 * davecheney waves at rogpeppe 
<davecheney> had a good break ?
<dimitern_> fwereade__: sure
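A sketch of the split just agreed: an agent facade covering what an agent needs for itself, as opposed to what its workers need, kept generic enough that it might serve unit and machine agents alike, per fwereade__'s aside. Interface names are illustrative:

```go
package agent

// Entity is the agent's-eye view of itself: the overlap dimitern_
// notes (Life and password-setting) for both unit and machine agents.
type Entity interface {
	Life() (string, error)
	SetPassword(password string) error
}

// Machine adds the machine-agent-only call mentioned above.
type Machine interface {
	Entity
	CheckProvisioned(nonce string) (bool, error)
}
```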
<jam> though now it isn't configurable, and delays on all error types, etc.
<rogpeppe> davecheney: i'm not sure you'll beat me to it
<rogpeppe> davecheney: yeah, it was lovely
<davecheney> rogpeppe: the immolation or the vacation ?
<rogpeppe> davecheney: the latter. we will see about the former.
<davecheney> rogpeppe: i'm on my third pint
<davecheney> so i may leave it til tomorrow morning
<davecheney> i think i will be able to make a lucid argument then
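For context, the shape of the change jam describes: drop juju-core's own retry/delay logic and keep a custom dial only for the TLS wrapping. A sketch along the lines of state.Open at the time; the timeout value and config plumbing are illustrative:

```go
package main

import (
	"crypto/tls"
	"net"
	"time"

	"labix.org/v2/mgo"
)

func dialMongo(addrs []string, tlsConfig *tls.Config) (*mgo.Session, error) {
	return mgo.DialWithInfo(&mgo.DialInfo{
		Addrs:   addrs,
		Timeout: 10 * time.Minute,
		// The TLS wrapping is the only remaining reason for a custom
		// dial; retries and delays are left to mgo itself.
		Dial: func(addr net.Addr) (net.Conn, error) {
			c, err := net.Dial("tcp", addr.String())
			if err != nil {
				return nil, err
			}
			cc := tls.Client(c, tlsConfig)
			if err := cc.Handshake(); err != nil {
				c.Close()
				return nil, err
			}
			return cc, nil
		},
	})
}
```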
<fwereade__> rogpeppe, I'm afraid I need to switch subject now, but you might want to keep an eye on the upcoming conversation
<fwereade__> Makyo, ping
<mgz> davecheney: when you have a moment (maybe my evening, when you're up tomorrow?) it would be good to find out what you need from me over release things
<dimitern_> need second review on these two related CLs: https://codereview.appspot.com/10044043/ https://codereview.appspot.com/10026044/
<davecheney> mgz: i can't G+ atm
<davecheney> but is now a good time ?
<mgz> davecheney: are you setup with the canonical mumble server? otherwise irc is fine
<mgz> now is fine by me
<davecheney> mgz: I'm at the pub
<davecheney> so it's irc or bust
<davecheney> here is how I do a 'release'
<davecheney> 1. push the release now button on the recipe https://code.launchpad.net/~dave-cheney/+recipe/juju-core
<davecheney> 2. wait for that to run
<davecheney> 3. when it's done, for each deb that it produces
<davecheney> run scripts/release-public-tools/release-public-tools.sh $DEB_URL
<davecheney> so that is currently 6 debs
<davecheney> which downloads the deb, extracts the tools, and pushes them back up to s3
<davecheney> mgz: do you still have the magic credentials ?
<davecheney> then I apt-get update/apt-get upgrade to install that version of the client
<davecheney> then I do a bootstrap + wordpress + mysql + relations + expose test on s3
<davecheney> sorry, ec2
<davecheney> if that all works then the release is 'good'
<mgz> ace, wasn't sure what smoketest bits you did
<davecheney> and I submit a branch to bump the version
<davecheney> mgz: i just do the wordpress+mysql test
<mgz> so, there are a few other things, I just want to document this and get some feedback from you+others, but I'm still not certain what we're targeting for releases at the moment... just ppa? or in saucy/raring backports too
<davecheney> put it in the juju/devel ppa
<davecheney> jamespage and davey want a release at some point
<davecheney> but we need to give them the full monty tarball
<davecheney> what mramm wants is just to keep us in the habit of making a release every week
<mgz> that's easy enough to do, with a few tweaks to the process
<mgz> the other thing is propagating the release to places other than ec2
<davecheney> mgz: that is something that I dropped the ball on for 1.11.0
<mgz> we now have semi-official buckets for hp/canonistack, which I don't have credentials on
<davecheney> i think i might have said 'hey, can someone, um you know, run sync-tools, or something'
<davecheney> mgz: well, i know this is a contentious point
<davecheney> but wallyworld and I are going to raise bundling the tools inside the juju client deb again
<davecheney> so that will be lots of fun
<mgz> :)
<fwereade__> Makyo, for when you're around: I've reviewed https://codereview.appspot.com/9975045/ -- give it a quick look, then we should talk and figure out the best way forward. rogpeppe, your input would also be appreciated here.
<davecheney> mgz: the final part of the release is writing the release notes
<davecheney> which i copy pasta from the week before, change the version
<davecheney> then go through the cards in leankit
<mgz> yeah, 'tis the bit I like least :)
<dimitern_> wallyworld: ping
<wallyworld> hi
<dimitern_> wallyworld: you're officially the OCR today :) have a look at these two please:  https://codereview.appspot.com/10044043/ https://codereview.appspot.com/10026044/
<wallyworld> but it's waaaay past my EOD :-(
<wallyworld> i'll look
<wallyworld> also trying to watch the Australian version of the Superbowl
<dimitern_> wallyworld: thanks! I know, but I couldn't find anybody to review them
<davecheney> wallyworld: have there been any punch ons yet ?
<davecheney> mgz: https://code.launchpad.net/~dave-cheney/+recipe/juju-core-daily
<wallyworld> yes! just before half time. Gallen is a dirty dog
<davecheney> wallyworld: good to see they are keeping with tradition then
<wallyworld> yeah, except for the Qld losing bit :-(
<davecheney> mgz: thumper's new log package might need to be added as a nested repo to the build
<davecheney> actually, i'll just do that now
<davecheney> it can't hurt
<davecheney> wallyworld: i'm not seeing why that is a bad thing for me
<wallyworld> davecheney: i can't answer that in a polite way
<davecheney> i don't even follow the rugby and i'm beaming from ear to ear
<wallyworld> you mean rugby league? rugby is different
<davecheney> pfft, you all use the same size ball
<davecheney> what is the difference ?
<davecheney> it's just a mob of blokes with tape around their heads to stop their ears falling off
<wallyworld> one is played with 15 men (rugby union, also called rugby), the other (rugby league) with 13
<davecheney> wallyworld: a significant and crucial difference
<wallyworld> similar concepts but forwards are totally different in their style and attributes
<wallyworld> totally different games really
<davecheney> clearly
<wallyworld> you being sarcastic? they are a lot different
<wallyworld> really!
<davecheney> i don't know how I could have made such a foolish mistake
<dimitern_> :)
<jam>  wallyworld, davecheney: I know dave is 3-pints in, I wouldn't be surprised if wallyworld is 2+, so realize IRC humor might get a bit lost :)
<wallyworld> i haven't had a drink yet - too busy watching the football
<jam> wallyworld: I would imagine rugby union is closer to rugby league than say American Football
<wallyworld> oh yes
<wallyworld> far less stoppages for a start
<wallyworld> and only one team takes the field
<wallyworld> not a different team for attack, defence etc
<jam> wallyworld: so one team goes on the field, runs it up and scores
<jam> then the other team goes on
<jam> and does the same?
<jam> :)
<wallyworld> ha ha
<jam> seems a bit unsporting :)
<dimitern_> jam: can you send me the g+ standup link so I can add it to my calendar please?
<mgz> it's in the calendar already,
<mgz> under the blue standup event
<dimitern_> mgz: I'm not using the team calendar due to TZ issues
<jam> dimitern_: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.162l082d3h3c5uidfclmtcds1g
<dimitern_> jam: thanks
<jam> wallyworld: did g+ crash?
<wallyworld> not for me
<jam> anyone?
<jam> wallyworld: I can't hear you if you are speaking
<jam> but I don't hear anyone
<mgz> jam: you dropped
<wallyworld> we are all fine
<rogpeppe> fwereade__: i'm looking at https://codereview.appspot.com/9975045. are you saying that we need to implement "local" charms before that branch can be landed?
<rogpeppe> fwereade__: FWIW i agree we need to do it - i'm just not entirely sure of the timeline
<fwereade__> rogpeppe, not necessarily -- I'm saying that including local-repo-specific stuff on the server side is definitely wrong
<rogpeppe> fwereade__: i'm not sure what stuff you're referring to there
<fwereade__> rogpeppe, I hope to have a quick chat with you and Makyo when he's available so we can figure out the least amount of work that makes progress for the gui and isn't actively wrong
<fwereade__> rogpeppe, bumpRevision
<fwereade__> rogpeppe, that has no place in the API, or in code used internally by the API
<rogpeppe> fwereade__: bumpRevision is false on the server side.
<rogpeppe> fwereade__: and isn't in the API, is it?
<rogpeppe> fwereade__: ah, "code used internally by the API"
<rogpeppe> fwereade__: does that really matter?
<fwereade__> rogpeppe, I think it does, yes... statecmd will be going behind the API, right?
<fwereade__> rogpeppe1, ^^
<fwereade__> rogpeppe1, and it's implemented in terms that imply and use functionality that does not, and cannot, exist on the API server
<rogpeppe1> fwereade__: yeah. this particular statecmd function is so simple i don't really see the point of having it
<rogpeppe1> fwereade__: but the API server still needs to use conn.PutCharm currently
<rogpeppe1> fwereade__: which has a bumpRevision flag for the time being
<fwereade__> rogpeppe1, I must have miscommunicated, tbh, I had thought I'd explained that the necessary API was exactly and precisely a forward to state.Service.SetCharm
<rogpeppe1> fwereade__: how does the new charm get into the state?
<fwereade__> rogpeppe1, IMO we need a distinct API call for that
<rogpeppe1> fwereade__: i'm not convinced we do actually
<rogpeppe1> fwereade__: i think we want a PutCharm entry point that actually uploads bytes of a local charm
<rogpeppe1> fwereade__: and gives back a URL
<fwereade__> rogpeppe1, ok, we *will* need one to support local charms, but we can short-circuit it for cs: charms
<rogpeppe1> fwereade__: that's my thought
<fwereade__> rogpeppe1, would you agree that there's definitely no excuse for sending a deploy request, or a setcharm request, with an unrevisioned charm URL?
<rogpeppe1> fwereade__: i think a charm URL works ok as a charm handle
<rogpeppe1> fwereade__: i'm not sure i would
<fwereade__> rogpeppe1, expand please?
<fwereade__> rogpeppe1, isn't figuring out the charm to use a client-side responsibility?
<rogpeppe1> fwereade__: i think it's perhaps reasonable for an API client to call Deploy("cs:wordpress")
<rogpeppe1> fwereade__: without actually looking in the charm store itself and working out the desired revision
<fwereade__> rogpeppe1, isn't the GUI presenting charms to users from the charm store anyway?
<fwereade__> rogpeppe1, what's the benefit of deploying a charm that doesn't match the one the user asked for?
<rogpeppe1> fwereade__: the GUI will not be the only API client
<fwereade__> rogpeppe1, the user's always deploying a *specific* charm from the GUI, right?
<rogpeppe1> fwereade__: perhaps. but a command-line user isn't.
<rogpeppe1> fwereade__: we're essentially making it *necessary* for an API client to talk to the charm store as well as the API.
<rogpeppe1> fwereade__: i don't really see why we need to do that
<fwereade__> rogpeppe1, that sounds like a really good thing to me
<fwereade__> rogpeppe1, know what you're asking for
<rogpeppe1> fwereade__: i'm not sure i see why you have a problem with allowing a short-form charm URL
<fwereade__> rogpeppe1, ISTM that smearing the responsibility for figuring out what charm to deploy across layers is self-evidently bad
<fwereade__> rogpeppe1, it's like all the other crazy magic that got put in the deploy API
<fwereade__> rogpeppe1, lol here's a different number of units to what you asked for
<fwereade__> rogpeppe1, service name? meh we'll infer one for you
<fwereade__> rogpeppe1, I want an API that does what you tell it, not one that makes guesses based on inferred context
<fwereade__> rogpeppe1, that's a job for the outermost layer
<fwereade__> whoops, lunch
<fwereade__> bbiab
<rogpeppe1> fwereade__: i agree that the heuristics in the Deploy call are misplaced.
<rogpeppe1> fwereade__: but that's mainly because all that information is already trivially available to the client
<rogpeppe1> fwereade__: that is not the case for the revision number.
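A sketch of the client-side responsibility fwereade__ wants: resolve the revision before calling the API, so the server never sees an unrevisioned URL. charm.MustParseURL and URL.Revision (-1 meaning "no revision") match the charm package of the era; the repository Latest call is an assumption about its interface:

```go
package main

import (
	"fmt"

	"launchpad.net/juju-core/charm"
)

// resolveRevision fills in the revision client-side, consulting the
// charm store (or local repo) when the user gave a short-form URL.
func resolveRevision(repo charm.Repository, curl *charm.URL) (*charm.URL, error) {
	if curl.Revision != -1 {
		return curl, nil // the user already asked for a specific charm
	}
	rev, err := repo.Latest(curl)
	if err != nil {
		return nil, err
	}
	return curl.WithRevision(rev), nil
}

func main() {
	curl := charm.MustParseURL("cs:precise/wordpress")
	fmt.Println(curl.Revision) // -1: no revision given yet
}
```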
<TheMue> hmm, just missed fwereade__
<TheMue> ok, then later
 * rogpeppe1 goes for lunch too
 * dimitern_ lunch
<fwereade__> TheMue, heyhey
<TheMue> fwereade__: ah, there he is. but i already found the answer to my question (it was about a statement in state/machine.go)
<fwereade__> TheMue, cool
<fwereade__> TheMue, which did you pick up?
<TheMue> fwereade__: i've started with the state.Cleanup() and the same maintainer will get the mgo/txn in a second cl
<fwereade__> TheMue, two workers, please
<fwereade__> TheMue, small components, single responsibilities
<fwereade__> TheMue, and check with roger that we're calling it JobManageState -- I think we agreed that yesterday, but it's worth a check
<fwereade__> TheMue, (fwiw I would be fine with you implementing each worker in a separate CL, and then integrating them both in a single followup)
<TheMue> fwereade__: the maintainer worker is small and the cleanup method call in there is very small. the txn one would be an extra method
<fwereade__> TheMue, they're using rather different models, aren't they?
<TheMue> fwereade__: JobManageState? not JobMaintainState anymore?
<fwereade__> TheMue, there's currently no such thing as JobMaintainState
<TheMue> fwereade__: please define models in this context?
<fwereade__> TheMue, one needs to respond to a watcher, one needs to run periodically
<TheMue> fwereade__: no, they are new, take a look in the issues you wrote. you called them JobMaintainState
<fwereade__> TheMue, I know I did, that's because we didn't know what it'd be called yet
<TheMue> fwereade__: the current approach is to have a maintainer worker running jobs periodically. right now only the cleanup.
<TheMue> fwereade__: manager imho is the wrong name for maintenance.
<fwereade__> TheMue, the idea is just to group them under the rough umbrella of "things that need a state connection to function"
<fwereade__> TheMue, but you do understand that jobs are names for *groups* of tasks, right?
<fwereade__> TheMue, a job with two responsibilities runs two tasks
<TheMue> fwereade__: so which task do you want to be done in a, yes, how should it be called, statemanager?
<fwereade__> TheMue, I don't think there's any call for a statemanager type, is there?
<TheMue> fwereade__: my idea of a maintainer is to periodically run different tasks, like the state cleanup
<fwereade__> TheMue, we need a worker that watches for cleanup docs, and runs them
<fwereade__> TheMue, as stated in the bug
<TheMue> fwereade__: and how would you name that worker?
<fwereade__> TheMue, Cleaner?
<fwereade__> TheMue, Janitor if you're feeling cute ;p
<TheMue> fwereade__: ;)
<TheMue> fwereade__: but here's a quote on why i thought the watcher would be interesting later: "probably we should have a CleanupWatcher as well, changes on which cause Cleanups to run"
<TheMue> fwereade__: that's why i thought we need a different approach
<TheMue> fwereade__: but ok, will change it
<fwereade__> TheMue, I'm a bit confused there
<fwereade__> TheMue, it's a task that runs code in response to state changes
<fwereade__> TheMue, we need a good reason *not* to implement it in terms of a watcher
<TheMue> fwereade__: what would be the problem doing it periodically?
<TheMue> fwereade__: which btw doesn't mean that i will not change it
<TheMue> fwereade__: i'm only interested
<TheMue> fwereade__: because multiple approaches are possible
<fwereade__> TheMue, the problem is just that it's picking a different strategy from the conventional one
<fwereade__> TheMue, there may be a reason to do so, in which case, great
<fwereade__> TheMue, but in general we should be using the backend's capabilities to run code when we need to rather than just guessing, I think
<TheMue> fwereade__: it just reminded me of typical database maintenance jobs
<TheMue> fwereade__: that's why i went that way
<fwereade__> TheMue, this argument does not apply to the txn resumer, because we can't simply watch that
<fwereade__> TheMue, cleanup jobs are infrequent -- once per relation removal, basically
<fwereade__> TheMue, but we have the mechanism in place that stores them watchably
<TheMue> fwereade__: how often do you think the watcher would raise an event?
<fwereade__> TheMue, very infrequently
<TheMue> fwereade__: a number, maybe as ratio to units?
<fwereade__> TheMue, totally depends on usage patterns
<fwereade__> TheMue, in most deployments, 0% of the time
<TheMue> fwereade__: that's indeed very low :D
<fwereade__> TheMue, but with a watcher-based approach, it'll be run when it needs to be, even if that usage pattern changes
<fwereade__> TheMue, it's very possible we'll end up with more cleanup types
<TheMue> fwereade__: the watcher approach may have the problem later of having only one responsible machine (and so a spof). periodical cleanups could run on multiple machines, needing only a good way to distribute the runs over the day
<fwereade__> TheMue, why would we run it on only one machine?
<TheMue> fwereade__: in case of multiple machines watching for the cleanup demand, how would those machines react? do all start the cleanup in parallel?
<TheMue> fwereade__: because it's a global operation.
<fwereade__> TheMue, AFAICT State.Cleanup will work fine when called concurrently
<TheMue> fwereade__: ok, due to transactional behavior, but imho still no nice coordination
<TheMue> fwereade__: but we'll start with machine 0, won't we?
<fwereade__> TheMue, right now, that is the only machine that will be running this job, yes
<TheMue> fwereade__: then i'll change it into a cleaner :)
<TheMue> fwereade__: heh, or duster? :D
<fwereade__> TheMue, remember we'll need a watcher for those docs too
<fwereade__> TheMue, might be best to implement that first
<fwereade__> TheMue, and I don't demand great sophistication here
<TheMue> fwereade__: sure, otherwise the worker will wait for what? hmm, some nice day in the future.
<fwereade__> TheMue, something that just sends an event whenever the set of cleanup docs changes should suffice
<TheMue> fwereade__: will do it as first small cl
<fwereade__> TheMue, great, thanks
<TheMue> fwereade__: can start with the cleaner itself then in parallel as followup cl
<fwereade__> TheMue, I couldn't parse that
<fwereade__> TheMue, cleaner worker depends on cleanup watcher
<fwereade__> TheMue, resumer worker is independent and can be done in parallel
<fwereade__> TheMue, hmm
<fwereade__> TheMue, yeah, make it a worker for consistency's sake, it'll just periodically pass through a call to st.runner.ResumeAll()
<fwereade__> mmm, cath has made pud, I'll go take the second half of the lunch break ;p
<fwereade__> see you all in kanban
<TheMue> fwereade__: yes, we match. what i wanted to say is that the watcher will be the first cl, the cleaner worker then follows (i'll only start it while the first cl is in review). the resumer will then be another one
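A sketch of the two workers as agreed: a watcher-driven Cleaner and a periodic Resumer (which has nothing to watch). The State and watcher interfaces are stand-ins for the real ones:

```go
package worker

import "time"

// CleanupWatcher fires whenever the set of cleanup documents changes.
type CleanupWatcher interface {
	Changes() <-chan struct{}
}

type State interface {
	Cleanup() error   // run any pending cleanup documents
	ResumeAll() error // resume incomplete mgo/txn transactions
}

// Cleaner runs cleanups when the watcher reports new cleanup docs,
// rather than guessing with a timer.
func Cleaner(st State, w CleanupWatcher, stop <-chan struct{}) error {
	for {
		select {
		case <-stop:
			return nil
		case <-w.Changes():
			if err := st.Cleanup(); err != nil {
				return err
			}
		}
	}
}

// Resumer has no watchable trigger, so it runs periodically, as discussed.
func Resumer(st State, interval time.Duration, stop <-chan struct{}) error {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-stop:
			return nil
		case <-t.C:
			if err := st.ResumeAll(); err != nil {
				return err
			}
		}
	}
}
```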
<mattyw> is there someone here I can talk to about plugins in juju-core?
<mattyw> specifically it looks like any flag I have for my plugin which has - in it causes the arg list to get trimmed
<mattyw> http://pastebin.ubuntu.com/5735795/
<fwereade__> mattyw, looks like a bug to me, would you let thumper know about it please?
<mattyw> fwereade__, will do thanks: here's a much better paste http://pastebin.ubuntu.com/5735830/
<fwereade__> mattyw, it's always a pleasure to see a clear fail case :)
<mattyw> fwereade__, it's also comforting to know I'm not trying to do something insane :)
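For reference, the behaviour mattyw expects: the plugin dispatcher should hand every remaining argument to the juju-<name> binary verbatim instead of letting its own flag parsing eat them. A minimal sketch, not juju-core's actual plugin code:

```go
package main

import (
	"os"
	"os/exec"
)

// runPlugin passes args through untouched, so flags containing "-"
// survive; the bug above suggests something upstream trims them first.
func runPlugin(name string, args []string) error {
	cmd := exec.Command("juju-"+name, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	// e.g. "juju myplugin --my-flag=x" dispatches to "juju-myplugin --my-flag=x".
	if len(os.Args) < 2 {
		os.Exit(2)
	}
	if err := runPlugin(os.Args[1], os.Args[2:]); err != nil {
		os.Exit(1)
	}
}
```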
<dimitern> rogpeppe1: ping
<rogpeppe1> dimitern: pong
<dimitern> rogpeppe1: can I ask you to have a look at these? https://codereview.appspot.com/10044043/ https://codereview.appspot.com/10026044/
<rogpeppe1> dimitern: will do. i've got another meeting in 8 minutes though, so probably won't get through them before then.
<dimitern> rogpeppe1: sure, no rush
<dimitern> mgz: ping
<mgz> hey dimitern
<dimitern> mgz: hey, did you manage to look at the API stuff?
<mgz> haave looked through the two cls you have up currently
<dimitern> mgz: they are the last in line though
<mgz> yeah, not super relevant
<dimitern> mgz: the actual facade has landed already and the client-side stuff as well
<mgz> I did see the first one, I think, before my week away too
<dimitern> mgz: no, that was rejected
<mgz> ..confusing :)
<dimitern> mgz: if you go back in the pipeline of prereqs  you'll find the first one and follow through
<dimitern> mgz: https://codereview.appspot.com/9896046/ that's the first
<mgz> okay
<mgz> thanks
<dimitern> mgz: np, ping me if you need something clarified as you go
<dimitern> mgz: these are the next ones in line (in order): https://codereview.appspot.com/9686047/ (client-side machiner), https://codereview.appspot.com/10003044/ (splitting test suites server-side), then the ones still up for review
 * TheMue reboots, somehow my box lags and i don't know why
<hazmat> gary_poster, i think metrics or stats.yaml.. vitals is a term for those in the know..
<gary_poster> hazmat, I'll note., thanks
<Makyo> fwereade__, rogpeppe1 I'm available for the rest of the day, minus a 15min call in ~50mins.
<fwereade__> Makyo, heyhey
<fwereade__> Makyo, would you start a hangout please? I just need a couple of mins
<fwereade__> rogpeppe1, join us if you can but don't feel obliged
<rogpeppe1> fwereade__: will do
<Makyo> fwereade__, alright, once I figure out the new hangout interface.  Bleh.
<rogpeppe1> Makyo: go to the top of the page, and share a hangout
<fwereade__> Makyo, back
<rogpeppe1> Makyo: it's unintuitive, i know
<fwereade__> Makyo, type to search, tick name to add, press the silly video-camera-like icon that appeared magically below the name list, IIRC
<Makyo> Ah, was making it too complicated.  https://plus.google.com/hangouts/_/04c2a36e9afa1c5ace4880ec2c4681cc01fb9f79?authuser=1&hl=en
<Makyo> https://codereview.appspot.com/9975045/
<Makyo> fwereade__, rogpeppe1 ^^^
<dimitern> last one for today: https://codereview.appspot.com/9937045/
<dimitern> rogpeppe1, fwereade__ ^^ when you can
<dimitern> rogpeppe1: last nag for today for these two: https://codereview.appspot.com/10044043/ https://codereview.appspot.com/10026044/
<rogpeppe1> dimitern: sorry, looking
<dimitern> rogpeppe1: cheers
<dimitern> freenode's being ddosed wow :) interesting
<dimitern> i'm off, have a good evening all
<rogpeppe1> dimitern: you've got a review
<rogpeppe1> dimitern: g'night
<dimitern> rogpeppe1: tyvm
 * rogpeppe1 is done for the day. g'night all.
<hazmat> rogpeppe1, cheers
<hazmat> anyone run juju-core against maas?
<andreas__> hi guys, did juju bootstrap lose the -v (verbose) option?
<thumper> fwereade__: ping?
<wallyworld__> thumper: g'day
<thumper> wallyworld__: hey
<wallyworld__> quick chat?
<thumper> wallyworld__: I have a call with mramm in 4 min
<wallyworld__> ok, after that, ping me. i'll go have breakfast
<thumper> kk
<wallyworld__> i was hoping to talk to fwereade__ too
<fwereade__> thumper, wallyworld__, heyhey
<thumper> fwereade__: just chatting with mramm
<fwereade__> wallyworld__, guess I'm here for you then :)
<fwereade__> wallyworld__, actually, I have to make laura's lunch for tomorrow
<fwereade__> wallyworld__, but there is an 80% chance I'll be back in 10 mins
<wallyworld__> fwereade__: ok, ping if you are free after your chore
<fwereade__> wallyworld__, ping
<wallyworld__> hey
<wallyworld__> quick hangout?
<fwereade__> wallyworld__, sgtm
<fwereade__> wallyworld__, start it?
<wallyworld__> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<thumper> wallyworld__: want me to join in?
<wallyworld__> yes
#juju-dev 2013-06-06
<thumper> wallyworld__: taking the dog for a walk in the hope she'll go outside...
<wallyworld__> ok
<thumper> geez louise
<thumper> she has great bladder control holding it for inside
<thumper> although this time, she did go outside
<thumper> thankfully
<thumper> only after we had got back, and I refused to take her inside
<thumper> wallyworld__: wow some of our code is wonderfully inefficient
<wallyworld__> thumper: which bit?
<thumper> wallyworld__: the provisioner, that I'm currently refactoring
<wallyworld__> ah. good thing you're onto it then
<thumper> I'm busy teasing apart dependencies
<wallyworld__> i find not having generics leads to a *lot* of code duplication
<thumper> I really wish our state methods would return interfaces rather than real structs
<thumper> makes mocking and testing a PITA
<wallyworld__> oh yes +100 to that one
<thumper> wallyworld__: I've made interfaces for the provisioner on all things that I can
<thumper> so the new bits take interfaces not structs
<wallyworld__> \o/
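A sketch of the style thumper describes: the refactored provisioner depends on narrow interfaces rather than concrete state structs, so tests can substitute fakes. Names are illustrative:

```go
package provisioner

// Machine is the narrow view of a state machine that the provisioner needs.
type Machine interface {
	Id() string
	SetProvisioned(instanceId, nonce string) error
}

// MachineGetter is the only state capability the new code accepts.
type MachineGetter interface {
	Machine(id string) (Machine, error)
}

// Provisioner takes the interface, not *state.State, which makes
// mocking in tests straightforward.
type Provisioner struct {
	st MachineGetter
}

func NewProvisioner(st MachineGetter) *Provisioner {
	return &Provisioner{st: st}
}
```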
<thumper> wallyworld__: OMFG
<thumper> bigjools: ok, can have that talk now
<wallyworld__> thumper: what?
<thumper> wallyworld__: my day of refactoring passed the tests first time (once I fixed all the compile errors)  https://codereview.appspot.com/9937046
<wallyworld__> cool, i'll look
<bigjools> wallyworld__: need to talk to you, and it's about work.... <shock>
<wallyworld__> shit eh
<bigjools> it could be
<bigjools> I could just call you but reception is shit so get on a hangout will ya
<wallyworld__> i thought since you asked you'd create one, but i will
<wallyworld__> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<rogpeppe1> mornin' all
<TheMue> rogpeppe1: heya
<rogpeppe1> TheMue: hiya
<fwereade__> TheMue, ping
<TheMue> fwereade__: pong
<TheMue> fwereade__: just wanted to ping roger ;) but will do it afterwards
<TheMue> fwereade__: ah, it seems i found what i wanted to ask
<fwereade__> TheMue, would you please send thumper a link to your lxc work so he can maybe look through it tonight?
<TheMue> fwereade__: sure, it's a very interesting topic
<TheMue> fwereade__: will do immediately
<TheMue> fwereade__: btw, watcher with test is working
<dimitern> mgz: hey
<mgz> dimitern: hey
<dimitern> mgz: how can we go about pairing on implementing the deployer stuff?
<mgz> I'm open to suggestions
<mgz> one option is using my vm with go stuff on and screen or something, plus voice
<dimitern> mgz: not sure i see how the process will go - can you expand a bit?
<dimitern> rogpeppe1: responded to https://codereview.appspot.com/10044043/
<mgz> trying to remember what jam and I did exactly, basically you just ssh into my canonistack box then run screen with a couple of flags, then we try to not type at the same time :)
<dimitern> mgz: I see, well we can try it out
<dimitern> just as soon as I finish responding to my reviews
<mgz> sure
<dimitern> rogpeppe1: responded to this as well https://codereview.appspot.com/10026044/
<dimitern> looking for reviews on this: https://codereview.appspot.com/9937045/
<dimitern> fwereade__: ^^
<dimitern> mgz: i'm afraid we have to base the deployer stuff on one of my CLs, so until it lands maybe we can discuss it first
<mgz> that sounds good
<dimitern> but hopefully it won't change much, we can just use it as a prereq before it lands actually
<mgz> what's the branch?
<dimitern> lp:~dimitern/juju-core/055-api-machiner-subpackages
<dimitern> assuming we take the same path and develop it first server-side as a subpackage
<dimitern> that branch introduces some interfaces and decouples the root api object from the machiner facade
<dimitern> fwereade__: ping
<dimitern> mgz: g+?
<fwereade__> dimitern, pong
<dimitern> fwereade__: do you have some time to g+ about the general direction of the deployer facade?
<fwereade__> dimitern, give me 5 mins
<dimitern> fwereade__: sure
<mgz> dimitern: can do, but mumble would work as well for this
<dimitern> mgz: sgtm, if fwereade__ won't mind
<mgz> ah, g+ then if we want him too
<dimitern> mgz: in the mean time have a look at worker/deployer/deployer.go
<dimitern> mgz: i have extracted a list of state calls used by it: http://paste.ubuntu.com/5738348/
<mgz> ah, crap, but I need to reboot into chromeos for that...
<dimitern> mgz: oh, sorry about that
<mgz> fwereade__: can you install mumble quickly?
<fwereade__> mgz, dimitern, ok, I am here, I need the bathroom, I will try to get mumble up in a sec
<dimitern> mgz, fwereade__: we can meet in the Blue room on mumble (Root -> Cloud Engineering -> Blue)
<fwereade__> cheers
<dimitern> fwereade__: if you go to settings in mumble and enable push to talk + set the shortcut to something (i use numlock) and hold it while speaking I found it works best
<dimitern> fwereade__: because the noise/silence detection sucks in mumble
<jamespage> mgz, bug 1188126
<jamespage> https://bugs.launchpad.net/juju-core/+bug/1188126
<TheMue> *: any known problems with the .../worker tests? they timed out here when doing test ./...
<mgz> jamespage: thanks!
<TheMue> rogpeppe1: ping
<rogpeppe1> TheMue: pong
<TheMue> rogpeppe1: just performing tests including the worker package. had a timeout, tests ran too long.
<rogpeppe1> TheMue: can you reproduce the problem?
<TheMue> rogpeppe1: now I took TestOneWorkerStart as a first one to see where it happens
<TheMue> rogpeppe1: and already this hangs
<TheMue> rogpeppe1: yes, reproducable
<rogpeppe1> TheMue: this is with trunk?
<TheMue> rogpeppe1: had done a pull before and ran 1253 (plus my change, which is something added but not used so far)
<TheMue> rogpeppe1: will go into trunk and try there too, to be sure
<rogpeppe1> TheMue: if you get a stack trace from the hanging test (with ctrl-\), what does it print?
<TheMue> rogpeppe1: one moment
<TheMue> rogpeppe1: strange
<TheMue> rogpeppe1: the ./... timed out on the shell and from inside the editor
<TheMue> rogpeppe1: now i'm only in that package and it passes
 * TheMue shakes his head
<rogpeppe1> TheMue: how do you know that the worker package is the one that's failing?
<TheMue> rogpeppe1: *** Test killed: ran too long.
<TheMue> FAIL	launchpad.net/juju-core/worker	600.005s
<TheMue> rogpeppe1: and then i started the worker tests from inside the editor with the same result (the output above has been on the shell)
<rogpeppe1> TheMue: can you kill -QUIT the worker tests that are hanging within the editor, please.
<rogpeppe1> TheMue: so we can find out where they're hanging
<TheMue> rogpeppe1: btw, just started it in the shell again and it hangs
<TheMue> rogpeppe1: will paste you the output
<rogpeppe1> TheMue: thanks
<TheMue> rogpeppe1: http://paste.ubuntu.com/5738426/
<TheMue> rogpeppe1: carmen just called me for lunch, biab
<rogpeppe1> TheMue: oh dammit
<dimitern> rogpeppe1: https://codereview.appspot.com/10044043/ - need some help on this please
<rogpeppe1> TheMue: when you get the chance, please run go test -c; then run the resulting binary and kill that
<rogpeppe1> TheMue: then we won't be seeing two stack traces interleaved
<rogpeppe1> dimitern: looking
<rogpeppe1> dimitern: sorry for not responding earlier - i was in a call
<dimitern> rogpeppe1: np
<dimitern> rogpeppe1: i'm really bad at phrasing doc comments and will appreciate suggestions like for the Tagger
<danilos_> dimitern, wallyworld: hi, hangout?
<wallyworld> ok
<rogpeppe1> dimitern: responded
<dimitern> danilos_: uh, yes, forgot we're doing g+ now
<dimitern> rogpeppe1: cheers
<danilos> dimitern, https://codereview.appspot.com/9876043/
<dimitern> rogpeppe1: can we agree to disagree on the apiserver machiner_test? :)
<dimitern> rogpeppe1: and its value
<dimitern> (that it's valuable)
<rogpeppe1> dimitern: is there anything there that can't be tested just as easily through the client API interface?
<rogpeppe1> dimitern: i'm not saying the tests themselves aren't valuable
<dimitern> rogpeppe1: the client-side test is a different one
<dimitern> rogpeppe1: this is a unit test
<dimitern> rogpeppe1: the client one is an integration test
<dimitern> rogpeppe1: and they test different things
<rogpeppe1> dimitern: i see the tests in apiserver/machiner and i think that i would like to verify that all of that stuff works from the client all the way to the server
<dimitern> rogpeppe1: not quite
<rogpeppe1> dimitern: i think by testing everything through the client, we get the best of both worlds
<dimitern> rogpeppe1: the client-side doesn't need to care about the correctness of bulk operations: handling parameters and results, as well as permission checks
<rogpeppe1> dimitern: surely it does
<dimitern> rogpeppe1: no it doesn't, because the interface as exposed is not about that
<dimitern> rogpeppe1: permissions yes, but not bulk ops
<rogpeppe1> dimitern: you mean because we don't expose the bulk operations currently as part of the machiner client API?
<dimitern> rogpeppe1: and the permissions are already tested at server side
<dimitern> rogpeppe1: yeah
<rogpeppe1> dimitern: this feels wrong to me
<rogpeppe1> dimitern: all the way along we have implemented client functionality that gives us access to the operations supported by the server
<dimitern> rogpeppe1: client-side tests are about how you use the interface as a client, not so much how the transport layer works
<rogpeppe1> dimitern: and this is breaking that correspondence
<dimitern> rogpeppe1: don't you agree unit tests are a good thing?
<rogpeppe1> dimitern: i think more test coverage is a good thing
<rogpeppe1> dimitern: and AFAICS by putting the tests on the server side only, we get less test coverage
<rogpeppe1> dimitern: whereas if we do it at the client side, we get to test the full path almost for free
<dimitern> rogpeppe1: exactly
<dimitern> rogpeppe1: and also testing at the right place
<dimitern> rogpeppe1: no, they are client-side tests in the follow-up
<dimitern> rogpeppe1: testing end-to-end only is not sufficient imho
<rogpeppe1> dimitern: really? why not?
<rogpeppe1> dimitern: what can you test on the server side that you cannot test end to end?
<dimitern> rogpeppe1: and incurs the overhead of starting a server when you only need to test the server part in isolation
<rogpeppe1> dimitern: starting a server takes microseconds
<dimitern> rogpeppe1: as i already mentioned, bulk operations, parameters handling, permissions, results
<rogpeppe1> dimitern: we should have client entry points for the bulk operations. i'm not sure how the rest aren't (or can't) be tested with end-to-end tests
<rogpeppe1> dimitern: that's how it's all going to be used, after all
<dimitern> rogpeppe1: we don't need client-side bulk entry points for the machiner yet
<dimitern> rogpeppe1: but we need to support them at transport level, as agreed
<rogpeppe1> dimitern: i think we should have them anyway, so we can test them end to end
<rogpeppe1> dimitern: it's not much code, as fwereade__ keeps on reassuring me
<dimitern> rogpeppe1: ok, i think we're in a vicious circle here
<dimitern> rogpeppe1: we need to test both sides
<fwereade__> rogpeppe1, dimitern, damn, I meant to address this before
<fwereade__> rogpeppe1, dimitern, shall we have a quick g+?
<rogpeppe1> fwereade__: sure
<dimitern> fwereade__: what?
<dimitern> fwereade__: ok
<TheMue> rogpeppe1: http://paste.ubuntu.com/5738483/ <= this one is by calling the test binary
<dimitern> fwereade__: shall I start one or you will?
<rogpeppe1> TheMue: thanks. that's more useful.
<fwereade__> dimitern, would you, just finishing an email
<dimitern> fwereade__: sure
<TheMue> rogpeppe1: great
<dimitern> rogpeppe1, fwereade__: https://plus.google.com/hangouts/_/1842432ac5d47d4b006b6679b49cb3d836c8163f?authuser=0&hl=en
<AeroNotix> When using the goose Nova binding, is the client expected to provide an imageRef themselves?
<AeroNotix> I have a similar set of bindings and I'm at a loss as to how to combat the ever-changing list of images.
<TheMue> *: one CL for review: https://codereview.appspot.com/10078043
<dimitern> rogpeppe3: can you sketch out an idea of what requireMachiner should look like?
<dimitern> rogpeppe3: not sure about the check "is machine"
<rogpeppe3> dimitern: http://paste.ubuntu.com/5738805/
<rogpeppe3> oops
<rogpeppe3> dimitern: s/requireAgent/requireMachine/
<dimitern> rogpeppe3: ah, ok, so it's simpler than i thought
<rogpeppe3> dimitern: not too bad, eh?
<dimitern> rogpeppe3: will add it in machiner.new
<rogpeppe3> dimitern: cool
<dimitern> rogpeppe3: indeed
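
The paste above has since expired; a minimal Go sketch of a requireMachine helper in that spirit, with the Authorizer shape and the error value assumed rather than taken from juju-core:

    package machiner

    import "errors"

    // errPerm is a stand-in permission error (assumed name).
    var errPerm = errors.New("permission denied")

    // Authorizer is an assumed shape for the apiserver authorizer,
    // not the real juju-core interface.
    type Authorizer interface {
        AuthMachineAgent() bool    // is the caller a machine agent?
        AuthOwner(tag string) bool // does the caller own this entity?
    }

    // requireMachine errors unless the authenticated caller is the
    // machine agent that owns the machine identified by tag.
    func requireMachine(auth Authorizer, tag string) error {
        if !auth.AuthMachineAgent() || !auth.AuthOwner(tag) {
            return errPerm
        }
        return nil
    }
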
<dimitern> rogpeppe3: and then there's this: https://codereview.appspot.com/10026044/
<rogpeppe3> dimitern: FWIW i think we could just have api.Machine
<rogpeppe3> dimitern: and have it talk to whatever facade is appropriate
<dimitern> rogpeppe3: not really, because machiner.Machine and provisioner.Machine, etc. are different
<rogpeppe3> dimitern: we could quite easily keep them compatible
<dimitern> rogpeppe3: i think the point of the separation is not making them compatible
<rogpeppe3> dimitern: ah. i thought the reason was security.
<dimitern> rogpeppe3: that's the main part of it
<dimitern> rogpeppe3: not having a method you can call is the best security, isn't it?
<rogpeppe3> dimitern: on the server side, yes.
<dimitern> rogpeppe3: and will catch issues like this at compile time
<dimitern> oh well.. kanban time
<rogpeppe> dimitern: agreed. it depends how much code we're willing to write for that.
<rogpeppe> time to reboot. this machine is becoming unusable.
<dimitern> rogpeppe: so wrt unifying ServerError and ServerErrorToParams
<dimitern> rogpeppe: the only thing that needs wrapping is the param to conn.Serve in apiserver.go
<rogpeppe> dimitern: i'm not sure i understand
<dimitern> rogpeppe: i'd still rather have separate helpers, but if you insist, I'll create a serverError there locally which will call common.ServerError and convert the result to error
<rogpeppe> dimitern: why can't serverError serve both purposes?
<rogpeppe> dimitern: you'll need to type-convert the result, but i don't think that's a big issue
<dimitern> rogpeppe: because conn.Serve needs a func(error) error
<dimitern> rogpeppe: and in all other places I need *params.Error, not error
<rogpeppe> dimitern: serverError(err).(*params.Error) ?
<dimitern> rogpeppe: it's not a method call, it's a callback
<dimitern> rogpeppe: so I need to pass something that takes error, calls common.ServerError internally and returns error
<rogpeppe> dimitern: i mean: keep serverError as it is (but make it always return *params.Error for non-nil errors). then do the above if you want a params.Error
<rogpeppe> dimitern: then there's no need for two separate error types (api.Error and params.Error)
<rogpeppe> dimitern: or for the somewhat awkwardly named ServerErrorToParams
<dimitern> rogpeppe: it has to be something like: err1 := common.ServerError(err); if err1 != nil { return err1.(error) }
<rogpeppe> dimitern: in all the cases you're currently using ServerErrorToParams, you've got that err != nil check anyway
<dimitern> rogpeppe: just a sec, let me paste it
<jtv> Hi folks - quick question if you don't mind.  Is this test failure normal?  http://paste.ubuntu.com/5738993/
<jtv> It looks a bit like the machine's speed just happens to be barely out of the test's tolerance range or something.
<rogpeppe> dimitern: alternatively do as you suggested above: make a local closure inside serveConn
<rogpeppe> jtv: i haven't seen that failure before, but that test is quite new
 * rogpeppe hates timing-sensitive tests
<jtv> rogpeppe hates with reason.
<rogpeppe> jtv: i think the test just needs more wiggle room
<jtv> Heh.
<rogpeppe> jtv: feel free to propose a change that ups it to 200ms or so
<dimitern> rogpeppe: http://paste.ubuntu.com/5739005/ something like this
<jtv> rogpeppe: I'll try to get other tests running first though... not feeling very lucky today!
<rogpeppe> dimitern: that looks ok. i'm not sure i'd bother with the separate func for serverError though - just inline it in the call to conn.Serve
<dimitern> rogpeppe: that's nasty :) but might do
<rogpeppe> dimitern: oh, one other thing:
<dimitern> rogpeppe: or just make it serverError := func(error) error { ... }
<rogpeppe> dimitern: that's fine too
<rogpeppe> dimitern: you'll need to check if the error returns an error code
<dimitern> rogpeppe: why?
<rogpeppe> dimitern: and use that error code if o
<rogpeppe> so
<dimitern> rogpeppe: not sure i get you
<dimitern> rogpeppe: you mean in common.ServerError at the end?
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: it means that API server code can return coded errors if they want
<dimitern> rogpeppe: I'll look into it (hopefully some tests will fail if I mess it up :)
<rogpeppe> dimitern: good point. i'm not sure that functionality *is* specifically tested. you'll probably want to add a test.
<rogpeppe> dimitern: ah, actually that functionality is tested
<dimitern> rogpeppe: so then it becomes: http://paste.ubuntu.com/5739032/
<rogpeppe> dimitern: and the test will indeed fail
<rogpeppe> dimitern: not quite
<dimitern> rogpeppe: that's a direct translation of what happens when servererrortoparams is removed
<dimitern> rogpeppe: and integrated into servererror
<rogpeppe> dimitern: code will never be non-empty on line 17
<dimitern> rogpeppe: how did it work before?
<rogpeppe> dimitern: it returned the error unwrapped
<rogpeppe> dimitern: i think you want something more like this: http://paste.ubuntu.com/5739044/
<rogpeppe> dimitern: oops
<rogpeppe> dimitern: this: http://paste.ubuntu.com/5739046/
<rogpeppe> dimitern: oops, wrong still
<rogpeppe> dimitern: better i think: http://paste.ubuntu.com/5739049/
<dimitern> rogpeppe: ok, sgtm
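
For the record, the shape those pastes converge on is roughly the following Go sketch, with params.Error replaced by a local stand-in. The explicit nil check in the wrapper matters: returning a nil *Error straight through the error interface would produce a non-nil interface value.

    package apiserver

    // Error is a local stand-in for params.Error (assumed fields).
    type Error struct {
        Message string
        Code    string
    }

    func (e *Error) Error() string { return e.Message }

    // serverError wraps any error into an *Error, preserving an
    // error code when the underlying error already carries one.
    func serverError(err error) *Error {
        if err == nil {
            return nil
        }
        code := ""
        if coded, ok := err.(*Error); ok {
            code = coded.Code
        }
        return &Error{Message: err.Error(), Code: code}
    }

    // wrapServe adapts serverError to the func(error) error callback
    // that conn.Serve expects.
    func wrapServe(err error) error {
        if e := serverError(err); e != nil {
            return e
        }
        return nil
    }
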
<jtv> Build failure in trunk as well?  ../../../go/src/launchpad.net/juju-core/environs/openstack/provider.go:592: undefined: identity.AuthKeyPair
<jtv> I wonder why that didn't affect my testing.
<mgz> jtv: just the normal "you need to pull lp:goose" thing? trunk builds for me
<jtv> mgz: I haven't been on juju for a while... what is the "normal you-need-to-pull-lp:goose" thing?  Mind you this is a completely fresh "go get launchpad.net/juju-core/..." from scratch.
<mgz> completely fresh? I'd double check your latest revs in launchpad.net/goose
<mgz> because your error strongly suggests you have an older goose
<mgz> which won't cook as well
<jtv> mgz: yes, completely fresh - cloud instance
<mgz> I'm on r95 "Add juju-tools keystone endpoint and fix tests"
<jtv> Oh, of course I just discarded my instance.  The compile error is on a different system.  Sigh.
<jtv> mgz: ah... I'm on a much older revision - "go get launchpad.net/goose" didn't update it apparently.
<mgz> indeed not
<jtv> mgz: apparently the branch URL has changed... how do I fix that?
<rogpeppe> hmm, my 2-factor auth is no longer working. any suggestions anyone?
<mgz> `bzr pull --remember lp:goose`
<mgz> rogpeppe: use your carefully prepared backup
<rogpeppe> mgz: the keyfob itself is working fine
<rogpeppe> mgz: but i see "The password is invalid" when I try to log in with it
<rogpeppe> mgz: hmm, maybe i messed up "Do not activate your authentication device when not needed. Extra activations will get your device out of sync with the server and lock you out of your account."
<jtv> mgz: that updated it, thanks!  I thought I'd run a "go get -u launchpad.net/juju-core/..." which I assumed would update my dependencies.
<mgz> if you only have one device registered, and it's desynced, you need to ping webops in #webops
<mgz> rogpeppe: ^
<mgz> jtv: go get is not very smart, and bzr recording resolved locations messes it up
<rogpeppe> mgz: thanks, i'll try that. hopefully that's the problem.
<rogpeppe> mgz: i didn't know the devices were synced. that seems highly fragile to me.
<mgz> timebased auth is much nicer
<mgz> but this way is simple, and easily fixable if you have a backup (generally just remove and readd the device)
<rogpeppe> mgz: yeah. or challenge response can work ok too.
<rogpeppe> mgz: readd ?
<mgz> that's not really yubikey accessible though
<mgz> ^re-add
<mgz> generate a new seed, register device with it, start on fresh sequence
<rogpeppe> mgz: ah i see
<rogpeppe> mgz: if i'd had any idea that pressing the button one two many times would be a problem, i would have been a bit more careful
<rogpeppe> s/two/too/
<rogpeppe> mgz: are you sure #webops is the place? perhaps #admin or something might be better?
<gnuoy> rogpeppe, web0ps do 2fa resets, and they hang out in #web0ps
<dimitern> gnuoy: is it with "0" (zero) or upper-case "O"?
<gnuoy> dimitern, sorry I am in the habit of not typing the word as it causes them (and I am one) to be pinged
<dimitern> gnuoy: so web0ps is the magic word, good to know :)
<gnuoy> dimitern, sorry, no. It is webops and #webops
<dimitern> gnuoy: ah, ok
<mgz> (on the internal server, not freenode, too)
<dimitern> gnuoy: now i got you :) sorry
<gnuoy> dimitern, np, sorry for causing the confusion :)
<dimitern> jtv: that's a very good skeleton you have there
<fwereade__> a second review of https://codereview.appspot.com/9698047/ would be good, it's almost trivial
<dimitern> jtv: will you develop most of it like the maas provider?
<dimitern> fwereade__: i'm on it
<jtv> dimitern: so far it's been _exactly_ like the maas provider.  :)
<fwereade__> TheMue, ping
<dimitern> jtv: :) i meant in a separate branch
<jtv> dimitern: ah, no, we won't be doing that this time.  Too much integration pain.
<jtv> That's why I put this up for review now.
 * jtv â eod
<dimitern> jtv: ah ok
<dimitern> rogpeppe: https://codereview.appspot.com/10044043/ all done I think
<dimitern> rogpeppe: if you think it's worth an lgtm now i'd like to land it
<rogpeppe> dimitern: i'm looking now
<rogpeppe> dimitern: reviewed
<fwereade__> danilos, ping
<dimitern> rogpeppe: cheers
<dimitern> rogpeppe: are you sure you want to leak info by reporting CodeNotFound when id != "" and we haven't checked the user is authorized?
<rogpeppe> dimitern: i don't think i'm suggesting that. i'm suggesting returning errBadId in that case
<dimitern> rogpeppe: istm it's good to always check for authorization first
<rogpeppe> dimitern: which doesn't leak any info at all
<dimitern> rogpeppe: hmm..
<dimitern> rogpeppe: ok, i'll go with this, but seems fishy
<rogpeppe> dimitern: when/if we start using the ids in this case, we will check first
<rogpeppe> dimitern: in that case, we'll probably pass the id into machiner.New
<dimitern> rogpeppe: yeah, fair enough
<dimitern> rogpeppe: if we lose the ErrNotLoggedIn returned from RequireMachiner, we'll need to have IsLoggedIn() bool in the Authorizer and check that separately
<rogpeppe> dimitern: hmm, i wonder if a simple ErrPerm would be good enough in that case
<dimitern> rogpeppe: that'll change the login_test
<dimitern> rogpeppe: and complicate things
<dimitern> rogpeppe: so I prefer to have a separate check in root.Machiner() to return ErrNotLoggedIn as well as now
<rogpeppe> dimitern: that seems reasonable actually
<dimitern> rogpeppe: ok
<rogpeppe> dimitern: then it's easy to check that all the root methods check for logged-in-edness
<rogpeppe> dimitern: and it's up to the individual facades to decide on further checks
<dimitern> rogpeppe: exactly
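
A sketch of the convention just agreed, with the types cut down to stand-ins: the root factory method checks logged-in-ness (and rejects the as-yet-unused id), while any further permission checks belong to the facade itself.

    package apiserver

    import "errors"

    var (
        errNotLoggedIn = errors.New("not logged in")
        errBadId       = errors.New("id not found")
    )

    type srvRoot struct{ loggedIn bool } // cut-down stand-in

    type srvMachiner struct{ root *srvRoot }

    // Machiner checks login state first; since ids are not used yet,
    // any non-empty id is rejected outright, leaking nothing.
    func (r *srvRoot) Machiner(id string) (*srvMachiner, error) {
        if !r.loggedIn {
            return nil, errNotLoggedIn
        }
        if id != "" {
            return nil, errBadId
        }
        return &srvMachiner{root: r}, nil
    }
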
<dimitern> rogpeppe: last time for today, i promise :) https://codereview.appspot.com/10026044/ - you already reviewed it once, I fixed it as suggested, and the version arg is gone now
<rogpeppe> dimitern: looking
<rogpeppe> dimitern: oh darn, you've merged trunk
<dimitern> rogpeppe: what?
<rogpeppe> dimitern: i'm seeing lots more diffs than i should
<rogpeppe> dimitern: between patchset 4 and 5
<dimitern> rogpeppe: when i proposed it the diff was screwed, so i reproposed
<dimitern> rogpeppe: i haven't merged trunk
<rogpeppe> dimitern: ah, sorry, i thought it was the other CL again!
<dimitern> rogpeppe: np
<rogpeppe> dimitern: can't we just delete the id argument to api.State.Machiner, as we discussed?
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe: not unless we need to test it
<dimitern> rogpeppe: and i'd like to have a test for id != ""
<rogpeppe> dimitern: you could use Call directly for that
<rogpeppe> dimitern: that's the kind of thing it's there for
<dimitern> rogpeppe: that seems wrong
<rogpeppe> dimitern: it seems wrong to me to clutter the user-facing client API with meaningless parameters
<rogpeppe> dimitern: you could put another method in export_test.go if you don't want to use Call
<dimitern> rogpeppe: well, ok, it's a compromise I can live with for now :)
<rogpeppe> dimitern: in fact there are quite a few methods we should probably test with non-nil ids (the Client entry points, for example)
<dimitern> rogpeppe: agreed
<rogpeppe> dimitern: so a table-driven test using Call could work ok there
<dimitern> rogpeppe: in the apiserver you mean?
<rogpeppe> dimitern: i was thinking in the api client, but you could do server tests instead, yes
<dimitern> rogpeppe: that'll be better I think
<dimitern> rogpeppe: so that we don't need to expose them all the way to the client
<rogpeppe> dimitern: if you use Call, you wouldn't have to
<rogpeppe> dimitern: but i'm ok either way. this is trivial functionality and not actually important in any way that i can really think of.
<dimitern> rogpeppe: I can add a state_test.go in the api client and test with call
<dimitern> rogpeppe: then just drop it and add a todo instead?
<dimitern> rogpeppe: to do it in a follow-up
<rogpeppe> dimitern: seems ok to me
<dimitern> rogpeppe: ok then
<dimitern> I'm off to pack
<dimitern> good night all!
<andreas__> why do I need to add the "local:" prefix when deploying a charm from my disk? Isn't --repository /foo/bar/ specific enough?
<fwereade__> andreas__, --repository just overrides the value of $JUJU_REPOSITORY, which might already be set, but doesn't always imply "look in a local repo" unless specified via the charm schema
<fwereade__> andreas__, I see your point, but I think it's a bit cleaner to have --repository and JUJU_REPOSITORY match in usage as well as meaning, rather than layer on extra cleverness in --repository that doesn't really fit with the env var
<ahasenack> fwereade__: --repository /foo/bar charm-name has surprising results
<rogpeppe> g'night all
<thumper> morning
<hatch> morning
<hatch>  /afternoon
<hatch> :)
 * thumper is now heads up after email reading
<thumper> hmm... coffee
<fwereade__> thumper, want to chat about the provisioner briefly?
<fwereade__> thumper, short version: yes, the machines watcher can be expected to deliver an initial event that represents the complete set of known machines
<fwereade__> thumper, subsequent events delivered are relative to that base "change"
<fwereade__> thumper, if any machine enters the set, you'll be informed
<fwereade__> thumper, if any machine in the set becomes dying, you'll be informed
<fwereade__> thumper, if any machine in the set is removed *or becomes Dead*, you'll be informed, and will receive no further notifications for that id
<fwereade__> thumper, calling AllMachines just raises questions of raciness that are *probably* ok but not trivial to analyse
<fwereade__> thumper, and it seems unnecessary since we have this type sitting there that's designed to send precisely the information necessary to maintain a consistent view of the set of machines in play over the lifetime of the watcher
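
In loop form, that contract might be consumed like this; MachinesWatcher and processMachines are assumed names, with the tomb from launchpad.net/tomb that juju-core workers already use.

    import "launchpad.net/tomb"

    // MachinesWatcher is the assumed shape of the lifecycle watcher.
    type MachinesWatcher interface {
        Changes() <-chan []string // machine ids
        Err() error
        Stop() error
    }

    // Provisioner is cut down to what the loop needs.
    type Provisioner struct {
        tomb tomb.Tomb
    }

    // processMachines would reconcile the provisioner's view; stubbed here.
    func (p *Provisioner) processMachines(ids []string) error { return nil }

    func (p *Provisioner) loop(w MachinesWatcher) error {
        defer w.Stop()
        for {
            select {
            case <-p.tomb.Dying():
                return tomb.ErrDying
            case ids, ok := <-w.Changes():
                if !ok {
                    return w.Err()
                }
                // First event: the complete set of known machines.
                // Later events: only the ids whose lifecycle changed.
                if err := p.processMachines(ids); err != nil {
                    return err
                }
            }
        }
    }
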
<thumper> fwereade__: oh hai
<thumper> fwereade__: was taking the dog for a walk
<thumper> fwereade__: given this knowledge then, I'll refactor a little...
<fwereade__> thumper, nice weather for it?
<thumper> fwereade__: it is actually, sunny and cool (well, cold)
<thumper> but nice
<thumper> given what you have said about the watcher
<thumper> I think we can get rid of the "all machines" bit
<fwereade__> thumper, I'm starting to think wistfully of that sort of weather
<fwereade__> thumper, cool, I'm glad that makes sense
<fwereade__> thumper, I think it's a strict simplification
<thumper> yeah
<thumper> shouldn't be too hard
<thumper> rogpeppe had a lot of good changes for loggo too
<thumper> but includes breaking API
<fwereade__> thumper, awesome
<fwereade__> thumper, bah
<thumper> so would need to land branches in parallel
<thumper> because we don't have library dependencies :)
 * fwereade__ sighs wistfully
<thumper> perhaps soon
<thumper> maybe
<wallyworld> fwereade__: i was going to ask you about the machineContainer stuff but it's very late for you
<fwereade__> wallyworld, I responded to your CL I think, saying essentially machineContainer is good but quibble quibble quibble bikeshed bikeshed
<thumper> :)
<wallyworld> fwereade__: yeah, saw that thanks. question though - if we have fast regex queries, then there's really no need for the children slice
<fwereade__> wallyworld, I've realised there is, I think
<wallyworld> ok. i notice you also nixed my surrogate key :-)
<fwereade__> wallyworld, when we're able to watch an index of documents we care about, in a single document, we get to avoid watching a whole collection
<fwereade__> wallyworld, grep for globalKey in state
<wallyworld> ok
<fwereade__> wallyworld, no objection to the concept, but would prefer to maintain convention when context does not specifically indicate otherwise
<arosales> davechaney around?
<wallyworld> thanks for the input. will push up something today
<wallyworld> fwereade__: also, i'll think about the txnrevno - i do think it is required since the document has to be loaded, modified, and saved again
<wallyworld> and that is not atomic
<wallyworld> i think settings uses it for the same reason
<wallyworld> thumper: you ok to +1 that clean flag branch?
<fwereade__> wallyworld, the model is more for a client to make exactly the document change it needs, asserting only the minimal conditions that must pertain to maintain consistency in state
<fwereade__> wallyworld, settings is crazy
<thumper> wallyworld: I can take a look
<wallyworld> fwereade__:  ok. i'll have to think about it some more. i can't right now see how we can do it without the possibility of two threads adding a container and stomping on each other
<wallyworld> thumper: i didn't change anything, just answered your question
<thumper> wallyworld: I'd have to page in the information, so can't +1 just yet
<wallyworld> sure
<thumper> hence needing to look at it again :)
<thumper> my brain holds only so much in active state :)
<fwereade__> wallyworld, so, briefly, there are mongo ops like $addToSet that we can use in transactions
<wallyworld> ok, will look into that
<wallyworld> thanks
<fwereade__> wallyworld, yeah, $addToSet and $pull are used on the Principals and Subordinates fields that this is exactly analogous to
<wallyworld> fwereade__: i also have the issue that the machineContainers doc might not exist yet
<wallyworld> so i need to create sometimes and update via $addToSet other times
<wallyworld> so there's still a potential race condition
<wallyworld> with Principals, the doc always exists
<thumper> wallyworld: I'm out for a bit, back hopefully in 30-40 minutes
<fwereade__> wallyworld, for simplicity you could just create it alongside the machine doc, earlier in the ops list, and remove it later in its removal ops, such that the existence of a machine document always implies the existence of the containers doc?
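
In mgo/txn terms, the create-alongside approach might look like this sketch; the collection and document names are assumptions, but the $addToSet usage is exactly the Principals/Subordinates pattern mentioned above.

    import (
        "labix.org/v2/mgo/bson"
        "labix.org/v2/mgo/txn"
    )

    // machineContainersDoc is the assumed ancillary document.
    type machineContainersDoc struct {
        Id       string   `bson:"_id"`
        Children []string `bson:"children"`
    }

    // createContainerRefOp runs alongside the machine-creation ops, so
    // the ancillary doc exists whenever the machine doc does.
    func createContainerRefOp(machineId string) txn.Op {
        return txn.Op{
            C:      "containerRefs", // assumed collection name
            Id:     machineId,
            Assert: txn.DocMissing,
            Insert: &machineContainersDoc{Id: machineId},
        }
    }

    // addContainerOp can then simply assert existence and $addToSet.
    func addContainerOp(parentId, childId string) txn.Op {
        return txn.Op{
            C:      "containerRefs",
            Id:     parentId,
            Assert: txn.DocExists,
            Update: bson.D{{"$addToSet", bson.D{{"children", childId}}}},
        }
    }
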
<wallyworld> fwereade__: thought about that but it seems very wasteful since the machine may not even have a container added and we may have lots of machines?
<fwereade__> wallyworld, if you ever get a mongo NotFound from such an ancillary document, you can infallibly infer that the parent document was removed
<fwereade__> wallyworld, they're tiny documents
<wallyworld> ok. i'll create one regardless when a machine is created. it feels very dirty to me to have to do that though
<fwereade__> wallyworld, I infer from http://docs.mongodb.org/manual/faq/developers/#how-do-i-optimize-storage-use-for-small-documents -- which recommends the possibility of noticeably reducing overhead by using shorter field names -- that we're not going to be paying a significantly greater storage cost by moving some of those field names to a distinct document with a clear purpose
<wallyworld> ok. thanks. i'll read that faq also
<fwereade__> wallyworld, I would very much prefer not to go down the route of mangling our field names for the db's convenience until I get evidence that it's a source of trouble for us
<fwereade__> ;p
<wallyworld> i never suggested we mangle field names i don't think
<fwereade__> wallyworld, no, I wasn't trying to imply you were
<wallyworld> ah ok :-)
<fwereade__> wallyworld, I'm saying that if the few bytes per document to store the string keys, which will indeed be somewhat repetitive, are a significant enough factor to recommend tweaking, the additional db-imposed overhead is going to be of a similar order of magnitude for that sort of change to be effective
#juju-dev 2013-06-07
<fwereade__> wallyworld, I may be blithering
<wallyworld> we can always optimise if the approach we decide on becomes an issue
<fwereade__> exactly
<wallyworld> i guess i was taking the approach of don't create something until needed
<wallyworld> since for many/most cases it won't be needed
<wallyworld> so we are loading up the db since we don't want to deal with potentially dirty docs
<fwereade__> yeah, but then -- the different field sets of the various documents actually have very distinct usage and observation patterns that lend themselves well to implementation as distinct types
<wallyworld> oh, i want to use a distinct type - machineContainer
<fwereade__> wallyworld, absolutely
<wallyworld> but i was willing to pay the cost of dealing with potential staleness issues if it meant not having to create unnecessary ones
<fwereade__> wallyworld, it is perfectly possible to create it lazily, but it's more hassle
<fwereade__> wallyworld, you just need to do an existence query, and write a transaction based on that result, with an assertion that the condition still holds
<fwereade__> wallyworld, if it fails, try again, at some point barf with ErrExcessiveContention
<fwereade__> wallyworld, kinda tedious
<wallyworld> yeah. would be nice to bake that logic into the framework
<wallyworld> so optimistic locking becomes easy to use
<fwereade__> wallyworld, (you also need to assert that the machine's in a state that allows for the desired operation -- eg, you can't add containers when it's not Alive)
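
The lazy variant fwereade outlines is then the usual optimistic-locking dance; a sketch reusing the imports and machineContainersDoc type from the previous sketch (plus "errors"), with the machine-is-Alive assertion elided, where txn.ErrAborted signals that the asserted state changed underfoot.

    // State is cut down to the transaction runner (assumed shape).
    type State struct {
        runner *txn.Runner
    }

    var ErrExcessiveContention = errors.New("state changing too quickly; try again soon")

    // containersDocExists would be a Count query; elided here.
    func (st *State) containersDocExists(id string) (bool, error) { return false, nil }

    // addContainer retries a query-assert-run cycle: insert the doc if
    // it was missing, else $addToSet into it, asserting each time that
    // the observed state still holds.
    func (st *State) addContainer(parentId, childId string) error {
        for attempt := 0; attempt < 3; attempt++ {
            exists, err := st.containersDocExists(parentId)
            if err != nil {
                return err
            }
            op := txn.Op{C: "containerRefs", Id: parentId}
            if exists {
                op.Assert = txn.DocExists
                op.Update = bson.D{{"$addToSet", bson.D{{"children", childId}}}}
            } else {
                op.Assert = txn.DocMissing
                op.Insert = &machineContainersDoc{Id: parentId, Children: []string{childId}}
            }
            if err := st.runner.Run([]txn.Op{op}, "", nil); err != txn.ErrAborted {
                return err // nil on success, or a genuine failure
            }
        }
        return ErrExcessiveContention
    }
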
<fwereade__> wallyworld, yeah, I have some ideas floating around in that direction
<wallyworld> i'm finding in general we are doing a lot of (repeated) boiler plate in juju :-(
<fwereade__> wallyworld, I think there's fertile ground for consolidation of some of that
<fwereade__> wallyworld, in a number of places in state
<wallyworld> yes indeed
<fwereade__> wallyworld, fwiw the reason machine ids ended up as strings was, er, at least partially informed by my unreasoning rage at having to copy and paste the EntityWatcher code to handle ints instead of strings
<wallyworld> lol
<wallyworld> if only go had generics
<fwereade__> wallyworld, I came across the two things, they did different stuff, I spent a couple of hours analyzing their behaviour in enough detail to determine that the macroscopic effect of the two approaches was in fact equivalent
<fwereade__> wallyworld, if I get a moment, SettingsWatcher and EnvironConfigWatcher would be candidates for replacement with EntityWatcher as well
<wallyworld> sounds good
<fwereade__> wallyworld, but I should sleep
<wallyworld> yes you should. thanks for the input. appreciated
<fwereade__> wallyworld, holiday tomorrow for me, enjoy your weekend :)
<wallyworld> you too. i have a holiday on monday :-)
<fwereade__> cool, enjoy
<fwereade__> gn
<wallyworld> hight
<wallyworld> night
<thumper> grrrr
<thumper> ffs
 * thumper sighs
<thumper> working out how to enfixorate this test
<thumper> ha
<thumper> cracked it
<thumper> wallyworld: ping
<wallyworld> hi
<thumper> wallyworld: got time for EOW debrief?
<wallyworld> sure
<thumper> https://plus.google.com/hangouts/_/d0771f5fc18ee2fb1740aefa78b11db732ce9419?hl=en
<rogpeppe> mornin' all
<fwereade__> TheMue, I'm on holiday, but... why are you adding an explicit TxnRevno field that you're not using?
<dimitern> TheMue: reviewed with some comments
<TheMue> fwereade__: Otherwise the watcher package doesn't work, it relies on it.
<fwereade__> dimitern, fwiw the convention is that watchers always send an initial event representing "initial" state
<fwereade__> dimitern, this applies to the single-entity watchers as much as the ones that send actual state
<dimitern> fwereade__: ah, ok, wasn't sure this is always the case
<fwereade__> TheMue, oh really?
<TheMue> fwereade__: yes, that's why I had chosen a different approach first
<dimitern> TheMue: TxnRevno is always available on any document in a mgo collection, implicitly
<dimitern> TheMue: the same as the "_id" field
<fwereade__> TheMue, dimitern: well, to be explicit, on any doc used by mgo/txn
<fwereade__> TheMue, dimitern: it is *written* by that package
<dimitern> yep
<TheMue> fwereade__: but found it in machine, service, unit
<dimitern> TheMue: you only need to define it on the document when you need to read it for some reason - like the presence pingers do
<fwereade__> TheMue, dimitern: and is exported in *some* of our entity documents, for use in single-entity watches
<fwereade__> TheMue, dimitern: it might also be exported in certain cases to do coarse-grained optimistic locking, in which you assert the precise state of a full document against which you plan to make changes
<TheMue> fwereade__, dimitern: ah, good to know, will remove it. that led me in the wrong direction *hmpf*
<fwereade__> TheMue, dimitern: but we have 0 cases currently in play where that is necessary
<TheMue> fwereade__: because this way the WatchCollection() is indeed a great helper.
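
So the rule of thumb: mgo/txn stamps txn-revno on every document it touches, and a doc struct only declares the field when the code actually needs to read it, as the single-entity watchers do. Roughly, as a sketch:

    // machineDoc declares TxnRevno only because a watcher reads it;
    // mgo/txn writes the underlying txn-revno field either way.
    type machineDoc struct {
        Id       string `bson:"_id"`
        Life     int8   // stand-in for the Life type
        TxnRevno int64  `bson:"txn-revno"`
    }
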
<TheMue> dimitern: btw, i'm almost hoarse (it's getting better), we won yesterday and are now in the finals
<dimitern> fwereade__: do I need to update the mongo package used by tests, because now in trunk I get this error: http://paste.ubuntu.com/5741365/ (i know that test jam and davecheney fixed relied on some retry/delay behavior that changed in mongo or maybe mgo?)
<dimitern> s/fixed/fixed a test that/
<TheMue> dimitern: thx for your review. i forgot to destroy varnish too, to test the event
<dimitern> TheMue: np
<fwereade__> TheMue, would be good to test coalescing
<fwereade__> TheMue, destroy varnish and call Cleanup, and check that only one new change is sent after that
<fwereade__> TheMue, for example
<fwereade__> dimitern, hmm, yes, I think just update mgo
<dimitern> fwereade__: ah, cool
<dimitern> fwereade__: I was just looking to file a bug :)
<TheMue> fwereade__: yes, will do, thx for the review
<fwereade__> TheMue, also maybe good to check the Stop behaviour, to observe the channel close, for completeness' sake
<TheMue> fwereade__: yup, will add it too.
<dimitern> another trunk test failure after updating mgo: http://paste.ubuntu.com/5741417/
<dimitern> i think somebody complained about it on the list
<dimitern> I filed a bug 1188549 for it and sent mail to the list
<dimitern> _mup_: wtf?
<dimitern> https://bugs.launchpad.net/juju-core/+bug/1188549
<dimitern> fwereade__: i think we should revert 1257
<dimitern> fwereade__: due to the test failure above, unless it's just me who's seeing it
<fwereade__> dimitern, +1
<dimitern> TheMue: can you pull trunk tip and run all tests on it please?
<dimitern> TheMue: to see if you can reproduce the bug above?
<dimitern> mgz: hey
<fwereade__> dimitern, hmm, passes for me
<dimitern> fwereade__: ha.. so it might be only on my machine somehow
<dimitern> TheMue: please confirm ^^
<fwereade__> dimitern, oh wait sorry
<fwereade__> dimitern, nope, I agree, it's broken
<fwereade__> dimitern, +1 on reverting it, give me 1 min to peer quickly at the code
<dimitern> fwereade__: sure
<TheMue> dimitern: oh, sorry, had been on a different screen. one moment, will test
<mgz> dimitern: hey!
<dimitern> mgz: so it turned out machine.WatchUnits is already implemented, which simplifies the deployer refactoring task
<mgz> excellent
<dimitern> mgz: just compiling a short plan of steps what to do about it
<fwereade__> dimitern, hey, looks kinda like crack to me
<fwereade__> dimitern, revert away
<dimitern> fwereade__: ok, will do
<dimitern> fwereade__: one question about the deployer - you meant to remove it from the unit agent and leave it only in the machine agent, right? because i vaguely recall mentioning removing it from the uniter instead
<fwereade__> dimitern, I meant agent
<dimitern> fwereade__: yeah, ok
<fwereade__> dimitern, but there's some special-case shutdown code in the uniter that we can also drop
<dimitern> fwereade__: i'll look into it
<fwereade__> dimitern, because exiting the process will no longer take down the relevant Deployer
<dimitern> fwereade__: ah, sure we can't miss it then, once it's removed from UA
<fwereade__> dimitern, and I *think* that the changes all go well together: switch the deployer type on machine agent, and delete a bunch of unit agent and uniter code at the same time
<dimitern> fwereade__: and in the first step, instead of passing a UnitsWatcher, we can just pass a machine id and call st.Machine(id).WatchUnits inside the loop, right?
<fwereade__> dimitern, that sgtm I think
<dimitern> fwereade__: cool
<TheMue> dimitern: some problem here
<TheMue> dimitern: ouch, not some, same
<TheMue> dimitern: typo
<dimitern> TheMue: yeah, will revert it shortly
<dimitern> TheMue: thanks for checking
<TheMue> dimitern: yw
<dimitern> mgz, fwereade__: http://paste.ubuntu.com/5741545/ - some short stepwise plan for the deployer (each step is roughly a CL)
<fwereade__> dimitern, (1) and (2) must land in the same CL
<fwereade__> dimitern, otherwise you'll end up with double-deployed subordinate unit agents
<fwereade__> dimitern, but the uniter shutdown code cleanup can come after, to keep things small and stepwise
<fwereade__> dimitern, you'll need WatchUnits over that facade as well
<mgz> hm, is the state between 2 and 3 working?
<fwereade__> mgz, yes, because 1/2 are changes to state, and to the existing code, without hitting the API at all
<mgz> ...that's not really the right question
<mgz> why is it safe to remove the deployer task from the unit agent, what is happening instead?
<fwereade__> dimitern, something like `WatchMachineUnits([]MachineId) ([]struct{watcherid, error}, error)`
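
Fleshed out in the style of the params package, that signature might become something like the following sketch; every name here is an assumption, reusing the srvMachiner and serverError stand-ins from earlier, with the initial event riding back on the Watch reply (an option also floated later in this log).

    // Entities names the machines to watch (assumed params shape).
    type Entities struct {
        Ids []string
    }

    type StringsWatchResult struct {
        StringsWatcherId string
        Changes          []string // the initial event
        Error            *Error
    }

    type StringsWatchResults struct {
        Results []StringsWatchResult
    }

    // watchUnits is an assumed helper: it registers a watcher and
    // returns its id plus the initial event; elided here.
    func (m *srvMachiner) watchUnits(id string) (string, []string, error) {
        return "", nil, nil
    }

    // WatchMachineUnits returns one result per id, so a bad id fails
    // only its own slot rather than the whole bulk call.
    func (m *srvMachiner) WatchMachineUnits(args Entities) (StringsWatchResults, error) {
        results := make([]StringsWatchResult, len(args.Ids))
        for i, id := range args.Ids {
            wid, initial, err := m.watchUnits(id)
            if err != nil {
                results[i].Error = serverError(err)
                continue
            }
            results[i].StringsWatcherId = wid
            results[i].Changes = initial
        }
        return StringsWatchResults{results}, nil
    }
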
<fwereade__> mgz, Machine.WatchUnits reports changes to all a machine's units: both its principals and their subordinates
<dimitern> fwereade__: oh, sure
<mgz> ah, I see
<fwereade__> mgz, so the necessary synchronization is that we must remove the unit agent deployer at the same time we change deployer to always watch a machine's complete complement of units
<mgz> so the previous change makes the machine agent responsible for both, which is why 1+2 need to be done together
<fwereade__> mgz, hence 1/2 having to go together
<fwereade__> mgz, exactly
<dimitern> mgz: dumb question about bzr reverts again - do I do bzr pull on trunk, then create a branch as usual, and do "bzr merge . -r1256..1257" if I need to revert only 1257?
<mgz> -r1257..1256
<fwereade__> mgz, there's a bit of side work, in uniter/ModeShutdown() (or something like that) that causes the uniter to delay its own death until its subordinates are removed
<dimitern> mgz: oh, I keep forgetting the order and it always seems weird
<mgz> merging the *reverse* of the change, hence the bigger rev first
<fwereade__> mgz, that can be removed safely once the other is done, but won't hurt anything if it persists after the switch
<fwereade__> mgz, the effect of removing it will be that the principal unit agent dies a bit earlier, once it has no further duties to perform
<fwereade__> mgz, and the uniter becomes a smidgen simpler
<danilos_> whoops, forgot the topic
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: danilos | Bugs: 8 Critical, 71 High - https://bugs.launchpad.net/juju-core/
<dimitern> fwereade__, mgz: updated the plan http://paste.ubuntu.com/5741571/
<fwereade__> dimitern, uniter shutdown code can and should be a separate step
<fwereade__> dimitern, and 3 can be expanded a little into two
<fwereade__> dimitern, writing a clone of the deployer and unit testing it against api rather than state is a perfectly good step
<fwereade__> 3
<dimitern> fwereade__: but if that shutdown code relies on having a deployer and it's gone, won't that be a problem?
<fwereade__> dimitern, step 4 can be dropping the state version and integrating the api version
<fwereade__> dimitern, its duties are taken over by the deployer in the machine agent
<fwereade__> dimitern, so a relevant deployer is always available
<fwereade__> dimitern, it's just running in a different process now
<dimitern> fwereade__: hmm, ok - i haven't looked at the code, sorry
<fwereade__> dimitern, it's ModeTerminating
<fwereade__> dimitern, the whole loop dickery can be straight-up dropped, and we can return ErrTerminateAgent directly
<fwereade__> dimitern, (once we've set StatusStopped)
<mgz> I'd like to tackle the testing deployer clone with some pointers
<fwereade__> mgz, really sorry, I have to go out to lunch now *but* if you ask questions now I might be able to drive-by them
<fwereade__> mgz, will try to do a quick pass on anything you manage to type in the next 10 mins ;p
<fwereade__> bbs
<mgz> no problem, I'm sure dimitern will happily boss me around
<fwereade__> (after that I'll actually be at lunch)
<dimitern> mgz: am i that bad? :)
<mgz> you're that good!
<dimitern> :)
<dimitern> fwereade__, mgz: so this is the (somewhat) final plan http://paste.ubuntu.com/5741603/
<dimitern> mgz: I'd appreciate an LGTM on this https://codereview.appspot.com/10046047/
<mgz> looking at both
<dimitern> mgz: step 3 might need to be split in 2 parts still - apiserver implementation and client api implementation
<fwereade__> dimitern, mgz, +1 to that
<fwereade__> dimitern, mgz: and I'm comfortable with following a general approach of unit tests for the server, integration tests for the client
<dimitern> fwereade__: +100 on that
<fwereade__> dimitern, long-term we can get smarter -- more cheap client unit tests, fewer heavyweight integration tests -- but for now it's a sane convention
<mgz> sure
 * fwereade__ really lunching now
<dimitern> danilos, mgz: are we g+-ing today or mumble-ing for standup?
<danilos> dimitern, hanging out I suppose :)
<dimitern> danilos: can you send me a link to your CL again please?
<danilos> dimitern, sure, https://codereview.appspot.com/9876043/
<dimitern> danilos: thanks
<danilos> dimitern, I suspect yours is https://codereview.appspot.com/9937045/?
<dimitern> fwereade__: in case you're still around at some point, take a look at the https://codereview.appspot.com/9937045/ watchers stuff
<dimitern> danilos: yep
<dimitern> TheMue: you have a second review
<dimitern> danilos: reviewed as well
<dimitern> mgz: so how do you feel about the api stuff lately? does it make sense, and does it help to understand what'll be needed for the deployer?
<danilos> dimitern, thanks
<danilos> dimitern, you've got a review as well, so perfect timing :)
<mgz> yeah, is much clearer to me now
<dimitern> danilos: thanks!
<dimitern> mgz: only the watcher stuff is a bit muddy, that's why I really wanted to finish it off at both server and client side for the machiner, so it can be used as a model, but not sure if I'd manage
<ahasenack> hi, do you guys have any tricks to the problem of having several versions of a local charm in the same directory? Like
<ahasenack> ~/charms/precise/{postgresql,postgresql-fix-1234}
<dimitern> danilos: blast! that's not right at all, I should've looked
<ahasenack> juju won't know which one to deploy, as both are called "postgresql", and the directory name is irrelevant
<danilos> dimitern, the logging changes?
<dimitern> danilos: something went wrong with my long pipeline and trunk merges
<dimitern> danilos: yep
<dimitern> danilos: i'll fix it and repropose, sorry - the bits that were weird shouldn't be there :)
<danilos> dimitern, right, I assumed as much, but wanted to put it out there just in case
<danilos> dimitern, heh, cool
<fwereade__> ahasenack, if they have different revisions, you can use those
<fwereade__> ahasenack, otherwise, you could change the charm name in the metadata to distinguish them more clearly
<fwereade__> ahasenack, otherwise you'll get one of the ones in the local repo with the highest revision
<ahasenack> fwereade__: if I have 10k directories under ~/charms/precise, juju will look into */metadata.yaml, right?
<fwereade__> ahasenack, yeah
<ahasenack> fwereade__: ok
<fwereade__> ahasenack, the local repo has always been a bit questionable, it's hard to fit it into a model that matches the charm store and we didn't really do it very nicely
<ahasenack> the thing with changing the charm name is that it's dangerous to forget: you have to eventually revert that before proposing
<ahasenack> and revert revert when fixing up a review comment, etc
<ahasenack> ok
<fwereade__> ahasenack, yeah, understood -- I'm very open to suggestions that help us improve the experience
<ahasenack> using the directory name would be my first suggestion: there can only be one charm inside it, so use whatever you find in *that* metadata.yaml
<ahasenack> but sure, there are ramifications I don't see
<ahasenack> it's how I tried to use it the first time, and fell on my face :)
<rogpeppe> lunch
<fwereade__> ahasenack, that does sound like a more naturally comprehensible model, indeed, but the main problem is that the charm metadata name is expected to match the charm url name, and charms are requested from the repo by charm url
<ahasenack> that is the root of the problem
<fwereade__> ahasenack, it may be the case that we could transparently fake something up, but I feel the bump-revision business is already somewhat hokey, and adding further layers of hokiness feels potentially, er, suboptimal
<ahasenack> fwereade__: on that topic, what do you think about making this error: "juju deploy --repository /foo/bar mycharm"
<ahasenack> i.e., without the local: prefix before mycharm
<ahasenack> I don't know what it will deploy, honestly. I think it would deploy mycharm from the store
<fwereade__> ahasenack, yeah, it would, and I see how it's very reasonable to infer that of course a local repository is intended in that case
<fwereade__> ahasenack, what's your opinion on `juju deploy --repository /foo/bar cs:somecharm`? if we make the `local:` inference in the case you describe, this one feels like it becomes ill-formed, because a local repo cannot contain cs: charms
<ahasenack> fwereade__: that command should fail
<ahasenack> fwereade__: I think --repository is all that is needed to infer where the charm is
<ahasenack> fwereade__: what are the possible values for --repository?
<fwereade__> ahasenack, always interpreted as a path at the moment
<fwereade__> ahasenack, probably never used otherwise
<ahasenack> fwereade__: are "juju deploy mycharm" and "juju deploy cs:mycharm" the same?
<fwereade__> ahasenack, yeah
<ahasenack> fwereade__: what about "juju deploy local:mycharm", what does it do?
<fwereade__> ahasenack, the inference process is as follows:
<fwereade__> ahasenack, infer a url from the supplied string, using schema cs: (if not specified) and series default-series (if not specified)
<fwereade__> ahasenack, infer a repo from the schema and the repo path, take from $JUJU_REPOSITORY if --repository is not specified
<fwereade__> ahasenack, ask that repo about the inferred url; if the url was unrevisioned, pick the latest revision reported by the repo and add it to the url
<fwereade__> ahasenack, use the charm thus uniquely identified by that url
<fwereade__> ahasenack, (uniquely... yes for the charm store, not so much for local repos)
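
As a runnable Go toy covering just the first of those steps, where URL is a cut-down stand-in for charm.URL and revision-suffix handling is elided:

    import "strings"

    // URL is a cut-down stand-in for charm.URL (hypothetical).
    type URL struct {
        Schema   string // "cs" or "local"
        Series   string
        Name     string
        Revision int // -1 when unspecified
    }

    // inferURL defaults the schema to cs: and the series to
    // default-series when the supplied string omits them.
    func inferURL(spec, defaultSeries string) URL {
        u := URL{Schema: "cs", Series: defaultSeries, Revision: -1}
        if i := strings.Index(spec, ":"); i >= 0 {
            u.Schema, spec = spec[:i], spec[i+1:]
        }
        if i := strings.Index(spec, "/"); i >= 0 {
            u.Series, spec = spec[:i], spec[i+1:]
        }
        u.Name = spec // revision suffix ("-7") handling elided
        return u
    }

So inferURL("mycharm", "precise") yields cs:precise/mycharm while inferURL("local:mycharm", "precise") yields local:precise/mycharm, which is why --repository alone never flips the schema.
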
<ahasenack> fwereade__: sounds complicated
<ahasenack> fwereade__: I'm confused why we have two possible "sources" in the same command line
<ahasenack> fwereade__: --repository, and the prefix for the charm name
<ahasenack> fwereade__: bzr doesn't have any of that, for example, and I think we were at some point following the bzr url conventions
<ahasenack> it's bzr branch /foo /bar
<ahasenack> bzr branch lp:foo
<ahasenack> bzr branch lp:~oioi/foo/bar
<ahasenack> and so on
<fwereade__> ahasenack, yeah, it's a lot of effort just to make `juju deploy wordpress` work
<ahasenack> works with both local and remote
<ahasenack> but, now we have to worry about backwards compatibility
<ahasenack> so that's why I suggested to make the invalid combinations an error
<fwereade__> ahasenack, yeah, I am cautiously with you on this
<fwereade__> ahasenack, can I ask you to mail the juju list about it, just in case anyone's using it in some important way, before we agree it's definitely a bug and can be changed as you suggest?
<ahasenack> sure
<fwereade__> ahasenack, (fwiw I presume you don't ever actually put bundled charms in your local repos?)
<fwereade__> ahasenack, hell even if you do we can make it sane that way
<ahasenack> fwereade__: what is a bundled charm?
<fwereade__> ahasenack, a zip file basically -- just a serialized charm dir
<ahasenack> never heard of it, so no :)
<fwereade__> ahasenack, the format you download charms from the store in vs the format you edit and develop them in
<ahasenack> fwereade__: didn't even know there was a different way to grab charms other than bzr branch
<fwereade__> ahasenack, there's not much point to it AFAICT but it is a public format
<fwereade__> ahasenack, I think it's important to maintain the matching-names guarantee, though
<fwereade__> ahasenack, so making a local repo that behaves less insanely will be a little bit fiddly
<fwereade__> ahasenack, almost certainly worth it though
<ahasenack> fwereade__: I don't know, my personal feeling is that it's very restrictive and surprising
<ahasenack> but any change here could have far reaching consequences
<ahasenack> so I would just start with making the obvious incorrect combinations error out
<ahasenack> maybe someone will jump in and explain what was the reasoning behind --repository and local:charm
<dimitern> fwereade__: can you have another look at https://codereview.appspot.com/9937045/ please?
<dimitern> niemeyer: hey
<dimitern> niemeyer: i think _mup_ has been misbehaving again and not responding to bug #
<dimitern> rogpeppe: I don't want to get into another 3h argument about the watchers, but I think the idea that watchers have to be implemented the same way in the server-side API is wrong
<rogpeppe> dimitern: how do you mean?
<dimitern> rogpeppe: we do need to do coalescing at client side, so that we don't drop events
<dimitern> rogpeppe: and as well as we do that, we can handle the initial events there
<rogpeppe> dimitern: have you got a link to the kanban hangout please? my browser is refusing to load google calendar
<dimitern> rogpeppe: https://plus.google.com/hangouts/_/539f4239bf2fd8f454b789d64cd7307166bc9083
<rogpeppe> dimitern: i think that initial events and client-side coalescing are two independent issues
<rogpeppe> dimitern: thanks
<niemeyer> dimitern: It was down
<fwereade__> rogpeppe, dimitern: fwiw I would probably accept rog's flip-flop proposal for the client-side watcher wrappers
<fwereade__> rogpeppe, dimitern: we can build in coalescence later if the slightly-surprising event delivery burstiness is a problem for client code
<rogpeppe> fwereade__: +1
<rogpeppe> fwereade__: that would make client-side watchers *much* simpler too
<fwereade__> rogpeppe, dimitern: if we've done them all right it won't be
<dimitern> fwereade__: what's the flip-flop stuff?
<fwereade__> rogpeppe, dimitern: I'm just a little twitchy about it because, well, there's bound to be somewhere where someone has made unwarranted assumptions about watcher behaviour
<fwereade__> dimitern, always have one of in and out be nil
<fwereade__> dimitern, so always deliver events with exactly the grouping they're initially received with
<fwereade__> dimitern, this should not matter
<fwereade__> dimitern, and probably doesn't
<dimitern> fwereade__: but how about the initial Changes call at server-side
<fwereade__> dimitern, I'm comfortable with whatever convention we pick
<fwereade__> dimitern, if the convention is that every server-side Watch method pulls the first event, and returns anything interesting in its results, and that every client-side watcher always deliver those results in the first event on its Changes channel, then I'm happy
<fwereade__> dimitern, rogpeppe: since we've got the current Next behaviour we may as well try to make use of it -- some use cases, like uniter.Filter, still abuse it, but we can worry about that when it becomes a problem
<dimitern> rogpeppe, fwereade__: how about not calling Changes at server-side, because we cannot accommodate all watchers (i.e. the lifecycle one has non-empty changes)
<rogpeppe> dimitern: i'm not sure what you mean there
<fwereade__> dimitern, the first event on a watcher is always delivered as soon as possible, and represents the base state relative to which change notifications will be delivered
<rogpeppe> fwereade__: i still think that sending the initial event back with the Watch request is a reasonable thing to do
<fwereade__> rogpeppe, me too
<rogpeppe> s/the Watch request/the response to the Watch request/
<rogpeppe> fwereade__: oh, good
<rogpeppe> fwereade__: i thought that was considered a problem
<fwereade__> rogpeppe, I just hadn't thought of it
<dimitern> rogpeppe: we discussed client-side watchers to return struct{}{} instead of changes, and then after you're notified, you can get the actual changes from the api separately
<fwereade__> rogpeppe, but it makes perfect sense, especially when we've got flip-flop client-side watchers
<rogpeppe> dimitern: let's not do that
<fwereade__> rogpeppe, we don't even need to call Next until the initial result has actually been handed off to the watcher's client
<rogpeppe> dimitern: let's just do for {data := Next(); c <- data}
<rogpeppe> fwereade__: exactly
<dimitern> rogpeppe: so then instead, we do proper coalescing on client-side
<fwereade__> dimitern, depends on the watcher
<rogpeppe> dimitern: no need
<fwereade__> dimitern, lists of documents changing state need to be lists of ids
<dimitern> rogpeppe: you're arguing events won't be dropped?
<rogpeppe> dimitern: yes
<rogpeppe> dimitern: not with the above for loop
<dimitern> rogpeppe: i don't see how you can guarantee that
<rogpeppe> dimitern: how can an event be dropped?
<dimitern> rogpeppe: with the current client-side watchers you do
<fwereade__> dimitern, we never read a new event to drop from the common watcher until we've delivered the previous event to the client
<rogpeppe> dimitern: yes, but they're being too clever
<rogpeppe> dimitern: we're gonna axe all that code
<rogpeppe> dimitern: and go *much* simpler
<dimitern> rogpeppe: ok then
<rogpeppe> dimitern: the server side does coalescing already.
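
The simple loop being proposed, as a Go sketch with assumed types: the goroutine never asks the server for the next batch until the previous one has been handed to the client, so nothing can be dropped and no client-side coalescing is needed.

    import "launchpad.net/tomb"

    type stringsWatcher struct {
        tomb tomb.Tomb
        out  chan []string
    }

    // next is the assumed round trip to the server's Next method; elided.
    func (w *stringsWatcher) next() ([]string, error) { return nil, nil }

    // loop alternates between delivering the current batch and fetching
    // the following one; "flip-flop" because only one of the two is
    // ever pending at a time.
    func (w *stringsWatcher) loop(initial []string) {
        changes := initial // returned with the Watch request itself
        for {
            select {
            case <-w.tomb.Dying():
                return
            case w.out <- changes:
            }
            next, err := w.next()
            if err != nil {
                w.tomb.Kill(err)
                return
            }
            changes = next
        }
    }
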
<dimitern> rogpeppe: as long as we keep to the standards we agreed (please) about how server-side should have bulk ops and client-side only if needed
<rogpeppe> dimitern: i have a possible plan to add universal bulk ops to the rpc package
<fwereade__> rogpeppe, that sounds pretty frickin' sweet to me
<rogpeppe> dimitern: that would mean that a) you'd only need to implement bulk ops when you could do so meaningfully and b) you'd get benefit from using bulk ops even if the server methods were implemented as singletons
<dimitern> rogpeppe: well, as long as it provides the same protocol with less code and better readability, i'm +1
<rogpeppe> fwereade__: pretty much
<rogpeppe> fwereade__: the idea is you'd send an rpc request with Ids: ["id1", "id2"] rather than Id: "id1"
<dimitern> rogpeppe: i don't get a) and b)
<dimitern> rogpeppe: seems too smart
<rogpeppe> fwereade__: then the rpc server would try to find a specific bulk request handler for the type, and fall back to single methods (concurrently) if there's none
<dimitern> rogpeppe: i have to see it in other words :)
<rogpeppe> dimitern: i think it might work ok
<dimitern> rogpeppe: where's the concurrency happening? server-side only?
<fwereade__> rogpeppe, hmm, honestly, I think it's too speculative
<rogpeppe> dimitern: yes
<rogpeppe> fwereade__: it means quite a bit less work implementing server methods
<rogpeppe> fwereade__: for a relatively small piece of work in the rpc package
<rogpeppe> fwereade__: i *think* the trade-off is worth it
<dimitern> rogpeppe: will it also involve splitting the rpc package in more manageable chunks, which could be unit-tested separately?
<rogpeppe> dimitern: maybe.
<rogpeppe> dimitern: the rpc package isn't exactly large though
<dimitern> rogpeppe: I read that as "no" :)
<dimitern> rogpeppe: but anyway, we can do that later, no worries
<rogpeppe> dimitern: what do you think is unmanageable about the rpc package?
<rogpeppe> dimitern: i think it's tested reasonably well, no?
<dimitern> rogpeppe: i'm not getting back into the argument big vs. small packages
<dimitern> :)
<rogpeppe> dimitern: it's only 800 lines of code
<fwereade__> rogpeppe, I would love to see the authentication behaviour split off
<rogpeppe> fwereade__: that's not part of the rpc package
<fwereade__> rogpeppe, but I don't wish to force the issue now
<dimitern> rogpeppe: well, once we implement the separate login step in the authentication, which is not part of the rpc interface, we'll need to split parts of it related to marshalling to reuse them
<fwereade__> rogpeppe, ofc, sorry, I don't have a quarrel with the rpc package itself
<rogpeppe> dimitern: i do actually have a CL somewhere that splits out the method introspection code so it can be used independently.
<dimitern> rogpeppe: great to hear
<rogpeppe> dimitern: those bits are already in a separate package
<dimitern> sorry bbiab
<rogpeppe> dimitern: the rpc package doesn't do any marshalling or unmarshalling
<rogpeppe> dimitern: rpc/jsoncodec does that
<fwereade__> rogpeppe, anyway, please tell me a smidgen more: how are you planning to register the handlers?
<rogpeppe> fwereade__: which handlers?
<fwereade__> rogpeppe,  for bulk vs single operations
<rogpeppe> fwereade__: my plan was to treat some names specially - for instance types with a "Vector" suffix.
<fwereade__> rogpeppe, you are talking about some sort of magic which I am prepared to consider, possibly, to be worth the effort, but I'd like to understand exactly what's involved :)
<rogpeppe> fwereade__: then methods on those would have to conform to the signature foo(args someStruct) (ret []resultStruct, errs []error)
<rogpeppe> fwereade__: where the params and result struct would need to match the non-bulk form of the operation if that's provided
<fwereade__> rogpeppe, ok, so, that's how we implement a bulk API method in terms of single-entity methods
<fwereade__> rogpeppe, where do we get the implementation for the bulk form? what happens if both single and bulk exist?
<rogpeppe> fwereade__: the bulk API method doesn't need to call the single-entity methods
<rogpeppe> fwereade__: for operations on a single id, the single form is chosen by preference, falling back to bulk form. for operations on many ids, the bulk form is chosen by preference, falling back to the single form called concurrently on all the ids.
<rogpeppe> fwereade__: we might want to call the bulk form preferentially in all cases though
<fwereade__> rogpeppe, ok, that sounds nice tbh, I can imagine optimized single-use variants being useful
<rogpeppe> fwereade__: i'm not sure. the logic in the rpc package is not hard though (a couple of map lookups)
<rogpeppe> fwereade__: cool
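
Stripped of the rpc package's reflection, the preference rogpeppe describes reduces to something like this sketch (all types assumed): use the bulk ("Vector") implementation when the handler provides one, otherwise fan the single form out concurrently.

    import "sync"

    // result is a stand-in for a per-id RPC outcome.
    type result struct {
        value interface{}
        err   error
    }

    type singleCaller interface {
        call(id string) result
    }

    type bulkCaller interface {
        callBulk(ids []string) []result
    }

    // dispatch prefers the bulk form; with only a single form, each id
    // gets its own goroutine, matching "called concurrently on all the
    // ids" above.
    func dispatch(h interface{}, ids []string) []result {
        if b, ok := h.(bulkCaller); ok {
            return b.callBulk(ids)
        }
        s := h.(singleCaller)
        results := make([]result, len(ids))
        var wg sync.WaitGroup
        for i, id := range ids {
            wg.Add(1)
            go func(i int, id string) {
                defer wg.Done()
                results[i] = s.call(id)
            }(i, id)
        }
        wg.Wait()
        return results
    }
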
<fwereade__> rogpeppe, but it lives or dies in my mind by clarity and comprehensibility of exactly what API methods will be provided by a given facade type
<rogpeppe> fwereade__: one problem i have with the current facade stuff is that it's only one type
<rogpeppe> fwereade__: but resources (watchers etc) don't fit within that.
<fwereade__> rogpeppe, that one type is equivalent to State -- it's just a cut-down variant tuned to the client's needs
<rogpeppe> fwereade__: that one type doesn't include the watcher Next methods, for example
<rogpeppe> fwereade__: so it doesn't actually include the entire API surface area for the agent
<rogpeppe> fwereade__: one possibility i've been toying with is the idea of allowing an rpc server to dynamically change the current object being served.
<fwereade__> rogpeppe, suggestion, half baked: Watcher facade, with Next(watcherId) and NextIds(watcherId) methods?
<fwereade__> rogpeppe, such that everything authenticated can access that facade
<rogpeppe> fwereade__: not sure. i'd prefer to go more the direction of allowing a given agent to access a set of types rather than just one type
<fwereade__> rogpeppe, the RegisterResource stuff on the server STM to be hinting in that direction -- there are common patterns that apply across all kinds of watcher
<rogpeppe> fwereade__: so the API served for the machine agent, say, might have just Machine and EntityWatcher
<rogpeppe> fwereade__: there's definitely something a little nicer lurking somewhere here
<fwereade__> rogpeppe, yeah, something like that -- I'd been thinking of Watcher as a chunk of common business-level functionality
<rogpeppe> fwereade__: if you can dynamically change the object being served, then the first facade presented could have *only* the Admin type (with its Login method); when successfully logged in, we could choose a facade based on who's logged in and change to serving that
<rogpeppe> fwereade__: i can't immediately see any particular difficulty in doing that actually.
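A rough sketch of that idea (hypothetical types; the real server would resolve methods by reflection): the server starts out exposing only an Admin facade, and a successful Login swaps the served root object for one chosen per authenticated entity.

    package main

    import (
    	"fmt"
    	"sync"
    )

    // Facade is whatever object's methods the rpc server currently exposes.
    type Facade interface{}

    type Server struct {
    	mu   sync.Mutex
    	root Facade
    }

    // serve swaps the object whose methods are visible to clients.
    func (s *Server) serve(f Facade) {
    	s.mu.Lock()
    	defer s.mu.Unlock()
    	s.root = f
    }

    // Admin is the only facade available before authentication.
    type Admin struct{ srv *Server }

    func (a *Admin) Login(tag, password string) error {
    	// ... authenticate tag/password (elided) ...
    	a.srv.serve(facadeFor(tag)) // then serve a facade chosen for this entity
    	return nil
    }

    type MachineAgentFacade struct{ tag string }

    func facadeFor(tag string) Facade { return &MachineAgentFacade{tag} }

    func main() {
    	srv := &Server{}
    	admin := &Admin{srv: srv}
    	srv.serve(admin)
    	admin.Login("machine-0", "secret")
    	fmt.Printf("now serving %T\n", srv.root)
    }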
<fwereade__> rogpeppe, ok, the *concept* has my cautious approval
<fwereade__> rogpeppe, but my absolute current priority is to get us running non-manager machine agents with API connections only, and no access to state
<rogpeppe> fwereade__: agreed
<fwereade__> rogpeppe, and I don't think that plan is a major win wrt that specific goal, even if it is in other areas
<rogpeppe> fwereade__: correct
<fwereade__> rogpeppe, and don't take that Watcher facade idea as gospel, I think it may cause as many problems as it solves
<rogpeppe> fwereade__: do i take it then that you're ok with having a facade per authenticated entity rather than a facade per worker?
<rogpeppe> fwereade__: because that's what the above plan would need
<fwereade__> rogpeppe, I don't think I said that
<rogpeppe> fwereade__: and i think it makes sense from a security point of view
<rogpeppe> fwereade__: if you were changing the facade based on login, why have a different facade for each worker?
<fwereade__> rogpeppe, because I think it's a worthy goal to namespace the methods we expose according to the patterns in which we expect them to be used
<rogpeppe> fwereade__: even if it's loads of code duplication?
<fwereade__> rogpeppe, this does not mean that there will never be facades that are so useful that practically everyone uses them
<rogpeppe> fwereade__: it just seems a weird way of making an API to me.
<rogpeppe> fwereade__: is there precedent for this kind of thing?
<rogpeppe> fwereade__: usually my goal with APIs is to make something as small and neat as possible so there's nice easily analysable surface area
<fwereade__> rogpeppe, um, grouping code according to common purpose is not without precedent
<rogpeppe> fwereade__: i was talking about actual APIs, not just grouping code
<fwereade__> rogpeppe, yes: by exposing the minimum information necessary to accomplish business-level tasks, you can analyse the impact of given chunks of the API in isolation
<rogpeppe> fwereade__: but from a security point of view, you're looking at the union of all the available facades, no?
<fwereade__> rogpeppe, but I don't want Machines as a global concept, because different chunks of juju are interested in different, and not-always-overlapping, subsets of Machine functionality
<rogpeppe> fwereade__: i'm not suggesting Machines as a global concept here
<rogpeppe> fwereade__: i'm suggesting one Machine for the machine agent, and one for the unit agent (if it needs to look at a machine of course)
<rogpeppe> fwereade__: rather than one set of things for each worker
<rogpeppe> fwereade__: and one for the agent itself
<fwereade__> rogpeppe, ok, but different tasks within the machine agent depend on different subsets of the functionality and do them for different reasons
<fwereade__> rogpeppe, machiner has no use for instance id, for example
<rogpeppe> fwereade__: i agree. but i think unifying them into one thing at the server side will a) be much less code b) not too hard and c) make things easier to analyse from a security p.o.v.
<fwereade__> rogpeppe, and if we do that it becomes a real hassle to migrate a task from one agent kind to another
<fwereade__> rogpeppe, one conceptual entity that has facades of nicely-grouped methods hanging off it, fine
<fwereade__> rogpeppe, are we in violent agreement here?
<rogpeppe> fwereade__: i'm not sure. i don't *think* so, though i would *really* like to be agreeing with someone for once. i feel like i've been disagreeing with everything recently, and i don't like that.
<rogpeppe> fwereade__: i take your point about it being a hassle to migrate a task from one agent to another
<fwereade__> rogpeppe, yeah, I think it's the violent collision of two very different sets of experience
<rogpeppe> fwereade__: but i'm wondering how important it is that the security is so perfectly fine grained
<dimitern> rogpeppe: the bulk ops signature has to be method(args someStructWithIds) (resultsAsArrayOfStructsWithErrorAndResultEach, error)
<dimitern> rogpeppe: not "<rogpeppe> fwereade__: then methods on those would have to conform to the signature foo(args someStruct) (ret []resultStruct, errs []error)"
<rogpeppe> fwereade__: i'm thinking particularly of information propagation here. what possible harm could happen if the machiner knows its instance id, for example?
<rogpeppe> dimitern: why so?
<dimitern> rogpeppe: we need errors and results for single ops in an array
<rogpeppe> dimitern: why so?
<dimitern> rogpeppe: because it helps debugging and matching error to result
<rogpeppe> dimitern: if both slices have the same number of elements, i'm not sure i see the difficulty
<dimitern> rogpeppe: no, i tried that initially, wasn't accepted
<dimitern> rogpeppe: and i see why now
<fwereade__> rogpeppe, (1) namespacing is useful for people as well and (2) I would prefer to adopt a general policy of not handing out information for which there is no demonstrated need
<dimitern> rogpeppe: the debugging argument is a big one
<rogpeppe> dimitern: how is it easier to debug?
<dimitern> rogpeppe: you have result+error for each op
<fwereade__> dimitern, he's talking about some magic dispatch to automatically execute bulk requests as a bunch of single ops, which single ops would have that signature
<dimitern> rogpeppe: that keeps related things together, rather than having them spread out
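For contrast, the two shapes under discussion, roughly (illustrative types in the style of the api params, not the actual definitions):

    package params

    // Entities names the targets of a bulk call.
    type Entities struct {
    	Tags []string
    }

    // Error is a wire-friendly error.
    type Error struct {
    	Message string
    }

    // LifeResult keeps each operation's result and its error together,
    // the scheme dimitern describes as accepted.
    type LifeResult struct {
    	Life  string
    	Error *Error // nil on success
    }

    type LifeResults struct {
    	Results []LifeResult
    }

    // The alternative sketched for auto-generated bulk methods would
    // instead return two parallel slices of equal length:
    //
    //	func (f *Facade) LifeVector(args Entities) ([]string, []error)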
<rogpeppe> fwereade__: i'm concerned that at every step we're writing Big Software where we don't actually need to. we don't have to write all this code.
<fwereade__> dimitern, it might be nice but I don't think it's justifiable in light of the current goals
<dimitern> fwereade__: no, aiui it's about having separate errors and results
<fwereade__> dimitern, ah, hmm, yes, I misread that line
<dimitern> rogpeppe: writing code is not a bad thing
<dimitern> :)
<rogpeppe> dimitern: i can see why the current scheme was chosen, but there are good reasons to choose separate slices if we're automating the bulk stuff
<rogpeppe> dimitern: yes it is :-)
<rogpeppe> dimitern: the more code the worse
<dimitern> rogpeppe: writing good code is better than writing a small amount of too-smart-to-reason-about code
<rogpeppe> dimitern: i'd like to write a small amount of easy-to-reason-about code
<dimitern> rogpeppe: but you're rejecting good points as irrelevant
<dimitern> rogpeppe: for the sake of code simplification
<rogpeppe> dimitern: it's always a trade-off
<dimitern> rogpeppe: some trade-offs i'm not willing to take
<dimitern> :)
<rogpeppe> dimitern: perfection is the enemy of the good
<fwereade__> rogpeppe, so, the magic technique for doing so, that many of us have found helpful in our careers to date, is to resist the temptation to lump unrelated concerns together
<fwereade__> rogpeppe, frequently this results in more lines of code
<rogpeppe> fwereade__: it all depends on your ontology of concerns :-)
<dimitern> rogpeppe: aiming for perfection and missing the bigger picture is a bad thing
<fwereade__> rogpeppe, but a lower cognitive load when trying to reason about the system, because the properties of individual components can be reasoned about in isolation
<TheMue> fwereade__: +1, imho readability, and thus maintainability, is a more important factor than size (keeping in mind that this doesn't mean huge is more maintainable)
<fwereade__> rogpeppe, I'd rather have, say, two 60-line types with one clear point of contact between them than one 100-line type
<dimitern> rogpeppe: we don't just change the interface without need for the sake of simpler code, esp. wrt security related stuff
<rogpeppe> fwereade__: i would really like one API that asymptotes and rarely needs adjustment. what we're reaching towards is an API that is ever-growing, one piece for every new kind of client.
<fwereade__> rogpeppe, one piece for every distinct function of a juju system, surely?
<dimitern> rogpeppe: we're aiming towards a specialized api that works for our needs, and grows with them, not something universally simple and practically complicated to use because of its simplicity
<fwereade__> rogpeppe, our understanding of the best grouping will continue to evolve
<fwereade__> rogpeppe, for now, I want to expose the absolute minimum necessary to perform a single task
<dimitern> rogpeppe: ..and to secure, understand, take as a whole..
<dpb1`> Hi all -- any way to get more data from this bootstrap failure: http://paste.ubuntu.com/5742182/?  What URL is it trying?  What data does it see?
<fwereade__> dpb1`, there is a mail describing it on the lists, from wallyworld
<dpb1`> thx, I'll check
<fwereade__> dpb1`, you need to generate a simplestreams representation of your custom image, with `juju image-metadata`
<fwereade__> dpb1`, and upload it to your bucket
<fwereade__> dpb1`, the output from image-metadata tells you what to do in more detail
<rogpeppe> sorry, was afk
<rogpeppe> car insurance hassles
<dpb1`> very interesting.  thanks fwereade__ I'll poke around with it, that gives me something to go on, thanks
<fwereade__> rogpeppe, the goal is that we get what we need for machiner and, bam, it's done
<rogpeppe> fwereade__: i *thought* it was one piece for "the state", with appropriate guards to make sure that agents can't egregiously escape their bounds of responsibility
<fwereade__> rogpeppe, then the other tasks, one after the other, and switching them over to use API connections as soon as that precise set of functionality has landed
<rogpeppe> fwereade__: my goal was we get "the state" and bam, it's done
<rogpeppe> fwereade__: currently, what facades are we planning?
<rogpeppe> fwereade__: perhaps we should actually design (in advance) what each facade should contain
<rogpeppe> fwereade__: machiner, uniter, upgrader, machine agent, unit agent, provisioner, firewaller.
<rogpeppe> fwereade__: resumer?
<rogpeppe> fwereade__: cleanuper?
<dimitern> rogpeppe: I have a list of all state calls that are needed
<rogpeppe> dimitern: that's no longer sufficient
<TheMue> rogpeppe: s/cleanuper/cleaner/
<dimitern> rogpeppe: that's exactly what we need because they are partitioned by who uses them
<dimitern> rogpeppe: in addition to all these you listed we need two more: bootstrap stuff and agent base
<rogpeppe> dimitern: bootstrap stuff doesn't count - there's no API at that point
<rogpeppe> dimitern: agent base?
<dimitern> rogpeppe: stuff in cmd/jujud/agent.go
<rogpeppe> dimitern: i don't think that can have its own facade
<dimitern> rogpeppe: essentially: entity.Tag(), entity.Set(Mongo)Password
<dimitern> rogpeppe: probably not
<dimitern> rogpeppe: but it needs to access these
<rogpeppe> dimitern: i think that has to count as part of the machine-agent and unit-agent facades
<dimitern> rogpeppe: if we can move the code in openState that needs to set the passwords in both agents, yes
<rogpeppe> dimitern: i don't understand that
<dimitern> rogpeppe: well, if in openState in agent.go we create either the MA or UA facade, then the code that sets the password can be generic (with interfaces) and still be in one place
<rogpeppe> dimitern: that's what i thought. no need to move the code then, no?
<dimitern> rogpeppe: I think now it's like that, except for getting the specific facade instance for the agent
<dimitern> rogpeppe: yeah, sorry, just thinking aloud
<rogpeppe> dimitern: the facade instance is got before calling openState
<dimitern> rogpeppe: perfect
<dimitern> rogpeppe: we just have to make sure both UA and MA facades provide these methods that openState needs
<rogpeppe> dimitern: yes
<dimitern> i'm sorry it seems i won't be able to finish the machiner.Watch CL today
<dimitern> maybe someone can take it over?
<dimitern> i need some rest - i slept like 3h since malta
<dimitern> happy weekend everyone and see you on the 26th
<fwereade__> dimitern, can you plausibly hand over to mgz?
<fwereade__> dimitern, happy weekend
<dimitern> fwereade__: i'll write him a mail
<fwereade__> dimitern, cheers
<fwereade__> rogpeppe, dimitern: just bear in mind the possibility that there may be a completely common "agent" facade trying to get out
<fwereade__> anyway I am not really working today, I tell myself
 * fwereade__ bows out for a bit
<TheMue> Carmen calls me for BBQ, have a nice weekend.
<rogpeppe> right, eod reached. have a fine weekend everyone!
<dpb1> Hi all -- http://paste.ubuntu.com/5742835/ <-- another openstack json unmarshalling problem.  Something easy to fix?  Where should I start looking?
<fwereade__> dpb1, I would search for "security_group" in launchpad.net/goose
<fwereade__> dpb1, I would guess that it's a version of openstack that returns a different id type
<dpb1> fwereade__: Hi, I have contacted mgz about it already.  I have a bug and proposed patch that gets it working.  It's a problem of grizzly with quantum.
<fwereade__> dpb1, cool
<dpb1> https://bugs.launchpad.net/goose/+bug/1188815
<_mup_> Bug #1188815: security group on quantum/grizzly is uuid, goose chokes on it <Go OpenStack Exchange:Confirmed> <juju-core:Confirmed> <https://launchpad.net/bugs/1188815>
<fwereade__> danilos, ping
#juju-dev 2014-06-02
<bodie_> if anyone can give me some insight here, it would be really helpful
<bodie_> http://paste.ubuntu.com/7569460/
<bodie_> the value is being retrieved from the db, after being serialized into it from a nested map[interface{}]interface{}
<bodie_> the bit responsible for writing it to the db is here: http://bazaar.launchpad.net/~binary132/juju-core/charm-interface-actions/view/head:/state/state.go#L511
<bodie_> this bit in particular threw me for a loop:
<bodie_> map[string]interface {}{"<interface {} Value>":"foo.bz2"}
<bodie_> it seems like perhaps it needs to be coerced to a map[interface{}]interface{} when being serialized?
<jam> morning all
<wallyworld> hi
<wallyworld> quiet today with nz folks missing and andrew
<jam> wallyworld: I thought you were sick
<jam> I mean, I know you're always *sick*, but I thought you were sick-sick today :)
<wallyworld> jam: i am. so i'm a little slower
<jam> I just didn't expect you to be hanging out in IRC during your recovery
<wallyworld> well, there's lots to get done
<wallyworld> the github cutover has been on my mind  - i need to catch up with martin a bit later
<jam> wallyworld: did I miss the doc which describes how we're actually going to land things?
<wallyworld> jam: no, still wip, will be out today
<vladk> jam: morning
<jam> morning vladk, just switching to the hangout now
<dimitern> morning all
<wallyworld> morning
<jam> wallyworld: just thinking about it, is the intent to get 1.18 into github as well?
<jam> or are we just doing trunk and future releases?
<jam> dimitern: morning. I'm around if you're up for starting our 1:1 a bit early
<wallyworld> jam: sec, otp will talk soon
<jam> wallyworld: also, just to queue up things, we are failing to just bump the version to 1.19.4, https://code.launchpad.net/~sinzui/juju-core/inc-1.19.4/+merge/221570
<jam> any ideas there?
<dimitern> jam, i'm here, we can finish earlier as well
<wallyworld> jam: sorry, was dealing with a crisis at home. there were no plans to do 1.18 as well, just trunk onwards
<wallyworld> jam: saw the 1.19.4 bug. there were 2 raised, one about tools on utopic and failing tests. i fixed that one but was delaying the 1.19.4 one till i talked to tim
<wallyworld> jam: with the ec2 root disk fix, i was hoping that since 1.20 was out soon, we could not worry about 1.18. but can do 1.18 if needed
<jam> wallyworld: I believe 1.18 is going to be the version we can get into trusty for quite some time
<jam> so it stays quite relevant
<wallyworld> jam: ok, i'll bACKPORT
<wallyworld> bloody capslock
<voidspace> morning all
<TheMue> voidspace: morning
<jam> morning voidspace
<mgz> wallyworld: are you feeling up to standup?
<wallyworld> sure
<wallyworld> sorry, was just finishing dinner :-)
<jam> mgz, wallyworld: as long as he can do it laying flat on his back
<mgz> was worried you might be ill in bed till I saw you chatting with jam earlier :)
<dimitern> WTF is github.com/binary132/gojsonschema ??
<dimitern> are we adding random github dependencies now just like that?
<dimitern> I thought we're supposed to bring/fork third parties into github.com/juju/ when using them?
<mgz> dimitern: we can move it into juju
<dimitern> mgz, there should be some agreement and a policy how to deal with these
<jam> dimitern: mgz: fwiw, I'd rather link upstream as much as we can and use dependencies.tsv to ensure their correctness, rather than fork things and not give "credit" to upstream authors.
<dimitern> jam, and fork if the upstream breaks our stuff?
<jam> dimitern: well again, we're pinned to an exact rev
<jam> with deps
<jam> but yes, if they are egregious, then we fork
<dimitern> ah, right
<jam> dimitern: we all have actual copies of the history because of how git works, so if they ever just get deleted, we can always recreate them
<dimitern> jam, modern technology hey :)
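For context, a pinned entry in dependencies.tsv is (roughly, and as an assumption about the exact column set) a tab-separated line of package root, VCS kind, revision id, and revision timestamp; illustrative placeholders, not a real revision:

    github.com/binary132/gojsonschema	git	<revision-hash>	2014-06-02T00:00:00Z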
<mgz> so, this is already our fork of the project as bodie needed to fix stuff, but I'm also happy with it being in his namespace due to the wonders of dvcs
<dimitern> fwereade, jam, vladk, networks= constraint only https://codereview.appspot.com/93670046 as requested (cmd/juju changes in a follow-up)
<dimitern> please take a look
<dimitern> jam, mgz, can you rename a branch and repropose it as new?
<mgz> dimitern: you caaan...
<mgz> easiest just to push from old location to new
<dimitern> mgz, but how about the working dir? what bzr commands to use?
<mgz> dimitern: which of the methods are you using?
<mgz> probably just when on that branch, `bzr switch -b new_name; bzr rmbranch old_name`
<dimitern> mgz, hmm.. will it keep my pipeline intact?
<mgz> dimitern: that is a good question I am not willing to answer :)
<mgz> find aaron :)
<dimitern> mgz, ok, i'll just stick with the obsolete name then :) too much trouble
<voidspace> I'd love a go debugger right now
<dimitern> voidspace, have you checked this http://golang.org/doc/gdb ?
<voidspace> dimitern: I haven't
<voidspace> dimitern: hmmm
<jam> vladk|offline: standup?
<jam> vladk: welcome back. Standup?
<voidspace> ah, that foxed me - replicaset Initiate logging is going to cloud-init.log not all-machines.log
<jam> voidspace: ah, because it is driven by "jujud bootstrap-state" ?
<voidspace> jam: something like that (I wasn't aware of bootstrap-state specifically)
<voidspace> jam: it's setting up the mongo server, so I guess it's needed before jujud can do *anything useful*
<jam> yeah, though I wouldn't be terribly miffed if it also logged to machine-0.log which would get it into all-machines.log
<voidspace> right
<voidspace> it would have saved me twenty minutes of working out why my logging didn't appear to do anything :-)
<perrito666> good morning everyone
<voidspace> perrito666: morning
<jam> morning perrito666
<voidspace> so, on my local machine it is 9 seconds after Initiate that replSetStatus is able to return a successful status
<perrito666> jam: so you made my sunday interesting with your email :p
<jam> might as well keep every day exciting, why should weekends be different
<jam> perrito666: ^^ :)
<voidspace> whereas asking for the config returns immediately even if initiate is still in process - I wonder if that can still be slower on some machines
<perrito666> jam: I was reasonably far from a computer irc-able :p I had my head on that the rest of the day :p
<dimitern> fwereade, when you have 15m today me and vladk (possibly jam as well) want to have a quick g+ with you re what networking-related stuff to add to hardwareCharacteristics (if any)
<fwereade> dimitern, I think I can do that now if you're free
<fwereade> dimitern, I'm not sure if it's going to be the hc *document* itself -- mainly because I worry that network capabilities are potentially mutable, or at least more so than "normal" hc, although what happens when someone hot-swaps hardware?
<jam> https://plus.google.com/hangouts/_/canonical.com/juju-sapphire ?
<fwereade> dimitern, nonetheless I think what we need to store is "all *accessible* networks"
<jam> dimitern: vladk|offline: care to join us?
<perrito666> jam: lemme know when you are free for a moment please, so we can have a chat re the issue of localhost being prioritized
<rogpeppe> anyone care to do a fairly trivial review, factoring some logsuite functionality out of juju-core? https://github.com/juju/testing/pull/8/files
<jam> perrito666: I seem to have abit of time while this meeting comes together
<jam> what's up?
<perrito666> well, I read all of your comments, your main concern seems to be that I might (and most likely will) break the order of the addresses which are not localhost, right?
<jam> right
<natefinch> seems like the answer is not to sort, just extract localhost from the slice if it exists and connect there first
<jam> since sort.sort is not stable
<jam> yeah
<perrito666> the fun thing here is, I had done just that to begin with, but it looked uglier than sorting .. :)
<jam> I feel a little bit that if localhost is in the list, you shouldn't actually try to connect to anything else.
<jam> is there a reason we allow fallbacks ?
 * perrito666 thinks
<perrito666> It made sense to me because I am the one adding localhost there in the first place (actually I might) but I see no reason why localhost should not work in that case, since I am using the set up port
<natefinch> it's a good point... we should probably only connect to localhost if it exists.
<jam> natefinch: it seems like just a good way to accidentally connect to something not localhost and then think that you're also a master, etc.
<natefinch> that's a very good point
<perrito666> I wonder, is it possible for that list to contain more than one localhost?
<perrito666> localhost address that is
<natefinch> with different ports?   Shouldn't be
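A minimal sketch of the approach converged on here (hypothetical helper, not the actual patch): pull localhost addresses out without sorting, so the relative order of the remaining addresses is preserved, and dial only localhost when it is present.

    package main

    import (
    	"fmt"
    	"strings"
    )

    // dialList returns the addresses to try: the localhost addresses alone
    // if any exist (per the discussion above), otherwise the original
    // slice with its order untouched, unlike an unstable sort.Sort.
    func dialList(addrs []string) []string {
    	var local, rest []string
    	for _, a := range addrs {
    		host := a
    		if i := strings.LastIndex(a, ":"); i >= 0 {
    			host = a[:i]
    		}
    		if host == "localhost" || host == "127.0.0.1" {
    			local = append(local, a)
    		} else {
    			rest = append(rest, a)
    		}
    	}
    	if len(local) > 0 {
    		return local
    	}
    	return rest
    }

    func main() {
    	fmt.Println(dialList([]string{"10.0.0.2:37017", "localhost:37017"}))
    }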
<perrito666> ok then, refactor it is... how do I handle re-proposing something?
<jam> perrito666: don't, start a new branch
<perrito666> jam: ok
<natefinch> have you merged it already?
<perrito666> natefinch: yup, I fixed your and waynes suggestions and that gave me a couple of lgtms
<voidspace> natefinch: morning
<voidspace> natefinch: care to take a look at this
<voidspace> natefinch: https://codereview.appspot.com/104800043
<voidspace> this waits for *up to* five seconds for a successful replicaset config - even if the Initiate call errors out
<natefinch> perrito666: that's totally fine... just was going to say that if you hadn't merged it, you could edit on the branch, but since you have, just make a new branch with the fix
<voidspace> natefinch: on my machine it can take 9 seconds before we get a successful *status* after calling Initiate
<voidspace> which is why I went with config - because waiting for status would slow down cloud-init on all machines
<voidspace> (state servers)
<voidspace> on the other hand they can't do anything useful until the replica set is fully initialized, so maybe it doesn't matter
<voidspace> *wouldn't matter
<natefinch> voidspace: won't this only wait a max of 5 seconds, though?
<voidspace> natefinch: I'm waiting for config not for status
<voidspace> natefinch: I didn't wait for status because that would take longer
<natefinch> oh right, ok
<voidspace> if this doesn't work then maybe we need to wait for status
<voidspace> and if that doesn't work I suggest giving up for now...
<voidspace> natefinch: all tests pass with this change
<perrito666> voidspace: didn't all tests pass without it too? (in machines other than CI)
<voidspace> perrito666: yes... what I mean is that I have run all the tests to check I didn't break anything...
<voidspace> perrito666: well, didn't break anything covered by tests anyway
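In miniature, the shape of the change under review (hypothetical check function; the real CL polls the replica set through mgo): keep retrying the status check until it succeeds or a deadline passes, even if Initiate itself returned an error.

    package main

    import (
    	"errors"
    	"fmt"
    	"time"
    )

    // waitForStatus polls check until it succeeds or the timeout elapses,
    // returning the last error seen on failure.
    func waitForStatus(check func() error, timeout time.Duration) error {
    	deadline := time.Now().Add(timeout)
    	var err error
    	for {
    		if err = check(); err == nil {
    			return nil
    		}
    		if time.Now().After(deadline) {
    			return fmt.Errorf("replica set still not ready: %v", err)
    		}
    		time.Sleep(500 * time.Millisecond)
    	}
    }

    func main() {
    	calls := 0
    	// simulate a replica set that only reports healthy on the third poll
    	check := func() error {
    		calls++
    		if calls < 3 {
    			return errors.New("replSetGetStatus: not yet initiated")
    		}
    		return nil
    	}
    	fmt.Println(waitForStatus(check, 5*time.Second))
    }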
 * voidspace lunch
<jam> natefinch: did you want to do a catchup 1:1 now?
<dimitern> fwereade, cheers, thanks
<natefinch> jam: I'm probably going to get interrupted any minute, so probably not. We should probably reschedule this in general to a couple hours earlier when I'm more likely to be baby free
<rogpeppe> dimitern, voidspace, natefinch, jam, fwereade: any chance of a review of this, please? i've got a juju-core branch waiting on it. https://github.com/juju/testing/pull/8/files
<jam> well, I was free an hour ago, but you weren't online yet (after our standup, so not guaranteed), but I also have a gap 2 hours earlier if you can make that time
<natefinch> 2 hours earlier is fine, I'm usually online then, just sometimes hiding ;)
<jam> rogpeppe: I thought splitting something out into juju/$PACKAGE meant that it should actually be independent of juju core stuff
<jam> it feels odd to have JUJU_LOGGING_CONFIG in there
<rogpeppe> jam: i wondered about that
<rogpeppe> jam: i figured JUJU doesn't necessarily imply juju-core
<rogpeppe> jam: and it is in juju/testing
<rogpeppe>  jam: but i'm happy to rename it anything you choose :-)
<jam> So at least, my understanding was that stuff under "github.com/juju/*" should hopefully be applicable to 3rd parties that want to use them, otherwise why split it out
<jam> though I realize the new store changes will be something that is still very Juju related that wants a separate copy
<jam> TEST_LOGGING_CONFIG seems a good fit, though.
<rogpeppe> jam: ok, TEST_LOGGING_CONFIG it is
<jam> dimitern: have you had any chance to think about ipv6 stuff ?
<jam> that sounds like work that we need to be scoping early this week.
<fwereade> jam, dimitern: +1 to that
<dimitern> jam, i'm reading the wiki today as i code
<dimitern> jam, sgtm, i'll have more insight tomorrow i guess
<rogpeppe> another fairly trivial review, please; update golxc to avoid using the (now deleted) testing/logging package: https://codereview.appspot.com/103780043
<rogpeppe> dimitern, jam, voidspace, fwereade: ^
<jam> I thought golxc already had that
<jam> rogpeppe: rev 9 of golxc has "Replace LoggingSuite with IsolationSuite"
<rogpeppe> jam: oh
<rogpeppe> jam: i guess i didn't try bzr pull
<rogpeppe> jam: thanks
<voidspace> fwereade: so mocking CurrentConfig directly isn't possible
<voidspace> fwereade: I assume you want me to add an extra level of indirection to make it testable
<voidspace> further complicated by the fact that *all* replicaset tests Initiate already
<voidspace> but I can workaround that
<voidspace> fwereade: replicaset_test.go is already a whitebox test (in the replicaset package)
<voidspace> fwereade: so ok to just use: var getCurrentConfig = CurrentConfig
<voidspace> fwereade: to permit mocking?
<fwereade> voidspace, works for me
<voidspace> fwereade: cool, reproposing shortly
<natefinch> voidspace: yeah  (you'll find most of my tests are in the same package as the stuff they're testing)
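The pattern in miniature (hypothetical Config and CurrentConfig; the real function talks to the mongo session): production code calls through a package-level variable, and a whitebox test in the same package swaps it out and restores it.

    package replicaset

    type Config struct {
    	Members []string
    }

    // CurrentConfig would normally query mongo for the replica set config.
    func CurrentConfig() (*Config, error) {
    	return &Config{}, nil
    }

    // getCurrentConfig is the indirection point: callers inside the
    // package use it instead of CurrentConfig, so tests can replace it.
    var getCurrentConfig = CurrentConfig

    // In replicaset_test.go (same package), a test can then do:
    //
    //	orig := getCurrentConfig
    //	defer func() { getCurrentConfig = orig }()
    //	getCurrentConfig = func() (*Config, error) {
    //		return &Config{Members: []string{"localhost:37017"}}, nil
    //	}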
<natefinch> voidspace: standup
<voidspace> natefinch: omw
<dimitern> vladk, reviewed
<dimitern> fwereade, a quick look at https://codereview.appspot.com/93670046/ - network constraints?
<fwereade> dimitern, I'm in meeting mode for the rest of the pm, make sure perrito666 takes a look, I will try to swing by and add my opinions in the evening
<dimitern> fwereade, ok, sure
<dimitern> perrito666, PTAL then ^^ :)
<perrito666> dimitern: looking
<dimitern> perrito666, thanks
<perrito666> dimitern: I am sure you have a very good reason for it, but why ^? :)
<dimitern> perrito666, ;)
<dimitern> perrito666, for the CL or for the "thanks" ?
<perrito666> dimitern: for the use of "^"
<dimitern> perrito666, ah, sorry - I meant "look at the previous lines"
<perrito666> dimitern: I meant for the ^ as the "exclude" prefix
<dimitern> perrito666, or you mean in constraints? someone suggested to use it instead of "!" to play nice with bash
<perrito666> dimitern: I think the ideal would be "-" but that would most likely break things
<dimitern> perrito666, yes, we use "-" in tags
<bodie_> anyone have some input on how to properly serialize a nested map[interface{}]interface{}?
<bodie_> I think it's serializing as {"<interface {} Value>":"foo.bz2"} instead of the full map, though I could be retrieving it wrong
<bodie_> er, for use with mongo
<bodie_> if not, I'll just head to the documentation, but if there's a quick answer to short-circuit that, it would be much appreciated
<natefinch> bodie_: not sure why it's not serializing correctly. However, I'm wondering why you have a map with interface{} as keys?
<bodie_> well, the top level is a map[string]interface{} -- however, the depth of that map is unknown until runtime, at which point it's deserialized from YAML
<bodie_> therefore, since the values (of type interface{}) can themselves be keys of a deeper map...
<bodie_> you end up with a map[i{}]i{}
<natefinch> I sort of understand....sort of
<bodie_> natefinch, the actionspecs key here has a list of values, which are themselves keys
<bodie_> https://codereview.appspot.com/94540044
<bodie_> I mean uhh... maps
<bodie_> the values are maps
<bodie_> which contain keys with maps as values
<bodie_> so it's not as complicated looking as it sounds
<bodie_> but, I'm not seeing how to coerce it into a usable format for bson serialization
<perrito666> dimitern: I reviewed your patch, sorry if the comments sound harsh, I lack a bit of polite English, read them as if I was saying them with a smile
<perrito666> bbiab
<natefinch> bodie_: I'm just worried that if you make map[interface{}] then people can put stuff in the value that isn't a valid map key.... like another map
<natefinch> which will only blow up at runtime
<bodie_> yeah, that's probably why it doesn't work, heh
<natefinch> http://play.golang.org/p/mk3tZ4NCUF
<bodie_> I've been pushing for us to make it a map[string]JsonSchemaDoc or simply map[string]string and serialize to JSON-Schema at runtime
<bodie_> because, we DO know that the values are serializable to J-S-docs, we check that at time of deserialization
<jcw4> bodie_, natefinch even without that, I still don't see any place where the key has to be an interface{}
<bodie_> (so that probably also answers your question -- we do a validity check immediately after deserializing from YAML)
<jcw4> the keys, AFAICT, are always string
<bodie_> the keys are, yes, but the values themselves can be maps
<bodie_> with string keys
<natefinch> right, but that means you should have map[string]interface{}
<jcw4> right, so where does map[interface{}]interface{} come from?
<bodie_> the top level is map[string]interface{} because it could have values which are themselves maps
<jcw4> right,
<bodie_> so, those are of type interface{} as far as we know
<jcw4> any value with a map type would still be map[string]interface{}
<bodie_> but they could be the keys of maps
<bodie_> okay
<natefinch> but if they're a key, they're a string, so you can convert them to a string first
<jcw4> no, the value would be a map[string]interface{}
<jcw4> not a list of keys
<bodie_> right
<jcw4> bodie_: there's never a time that there will be a value of type map[interface{}]interface{}
<jcw4> bodie_: it will be an interface{} until you figure out if its a map, in which case you'd cast it to a map[string]interface{}
<bodie_> okay, which means that when I serialize it, I need to step through the top-level map recursively re-casting its children into map[string]interface{}'s if they are such?
<bodie_> and then iterating into them
<jcw4> bodie_: hmm, actually I think when you serialize them it takes care of that...
<bodie_> that's what I thought, too.... :)
<natefinch> most of this stuff should be taken care of for you
<bodie_> right, I'm sure I'm just missing some detail
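A self-contained sketch of the recursive re-casting under discussion (an assumed fix, not bodie_'s branch): goyaml decodes nested maps as map[interface{}]interface{}, which bson refuses to serialize, so convert the keys to strings first.

    package main

    import "fmt"

    // toStringKeys walks a value decoded from YAML and rebuilds every
    // map[interface{}]interface{} as a map[string]interface{}, recursing
    // into nested maps and slices.
    func toStringKeys(v interface{}) interface{} {
    	switch val := v.(type) {
    	case map[interface{}]interface{}:
    		out := make(map[string]interface{}, len(val))
    		for k, inner := range val {
    			out[fmt.Sprint(k)] = toStringKeys(inner)
    		}
    		return out
    	case map[string]interface{}:
    		for k, inner := range val {
    			val[k] = toStringKeys(inner)
    		}
    		return val
    	case []interface{}:
    		for i, inner := range val {
    			val[i] = toStringKeys(inner)
    		}
    		return val
    	default:
    		return v
    	}
    }

    func main() {
    	in := map[interface{}]interface{}{
    		"outfile": map[interface{}]interface{}{"description": "foo.bz2"},
    	}
    	fmt.Printf("%#v\n", toStringKeys(in))
    }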
<dimitern> perrito666, thanks
<jcw4> bodie_: looking at that MR I would first change all your map[interface{}]interface{} lines to map[string]interface{}
<rogpeppe> can anyone remember the usual dance for updating a dependency in the 'bot?
<jcw4> bodie_: then starting from there we can figure out other serialization issues
<bodie_> looks like this technique (map[interface{}]interface{}) is already being used in several other places
<bodie_> charm/meta
<natefinch> if other people jumped off a bridge, would you? ;)
<bodie_> hehe
<bodie_> here we are
<bodie_> http://paste.ubuntu.com/7573584/
<bodie_> so, "outfile" is a map[string]interface{}
<bodie_> I conformed my test to the obtained result, but maybe that's where I need to look closer
<bodie_> http://paste.ubuntu.com/7573598/
<bodie_> so, this is being deserialized by goyaml in this function: http://bazaar.launchpad.net/~binary132/juju-core/charm-interface-actions/view/head:/charm/actions.go#L37
<bodie_> "ReadActionsYaml"
<sinzui> natefinch, jam, fwereade : CI is blocked by https://bugs.launchpad.net/juju-core/+bug/1325074
<_mup_> Bug #1325074: Juju version cannot be set to 1.19.4 <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1325074>
<sinzui> ^ the test suite knows we cannot replace a published version of juju. So when devel gets updated, 1.19.3 is built, and it is rejected because 1.19.3 already exists.
<sinzui> The unit tests are not affected by this, just the substrate and function tests
<jcw4> bodie_: is this code all pushed up to your branch?
<bodie_> yeah
<bodie_> lp:binary132/juju-core/charm-interface-actions
<jcw4> bodie_: so, you're seeing that the goyaml deserializer is returning a type map[interface{}]interface{} ?
<bodie_> that appears to be what's happening, yeah
<bodie_> I think I should have caught this before it hit the DB -- I think adding an extra check to charm_test might help, here
<jcw4> bodie_: before it hit the DB?
<bodie_> right, my issue was that I was retrieving an unexpected <Value: ...> from the database
<bodie_> instead of a map
<jcw4> bodie_: hmm, I didn't see any mongo access in that MR, do you mean Charm YAML?
<bodie_> I added some stuff to state/State.AddCharm
<bodie_> just a minor bit for including ch.Actions() in the cdoc
<jcw4> bodie_: oh, in a previous MR?
<bodie_> no, in charm-interface-actions
<bodie_> http://bazaar.launchpad.net/~binary132/juju-core/charm-interface-actions/view/head:/state/state.go#L511
<jcw4> bodie_: hmm, not in the MR then okay..  I was looking there while I was waiting for the branch
<bodie_> https://code.launchpad.net/~binary132/juju-core/charm-interface-actions/+merge/221595 line 351?
<bodie_> or uh, right, the MR
<bodie_> https://codereview.appspot.com/99640044/patch/20001/30015
<bodie_> no biggie
<bodie_> anyway, I think I can catch it before that in charm_test
<bodie_> pushing up a change
<jcw4> bodie_: yeah.  Which test is failing in your branch though... I'm not seeing the error you pasted
<jcw4> bodie_: (doh) forgot to switch branches
<bodie_> that revision probably has its own problems, I just pushed the test breakage back into the charm package
<bodie_> https://codereview.appspot.com/99640044
<jcw4> bodie_: so when you use map[interface{}]interface{} in the test it passes...
<bodie_> right
<jcw4> bodie_: where is the issue outside the test when you use that?
<bodie_> state/charm_test/TestCharm line 54
<bodie_> the error is now charm/charm_test.go:77: invalid operation: f.Actions().ActionSpecs["snapshot"].Params["outfile"]["description"] (index of type interface {})
<bodie_> Params is a map[string]interface{}
<jcw4> bodie_: i see
<bodie_> so, I need to cast it before checking, I suppose
<bodie_> Panic: interface conversion: interface is map[interface {}]interface {}, not map[string]interface {}
<bodie_> I'm still of the opinion that storing the JsonSchemaDocs as raw strings would be much simpler
<jcw4> bodie_: yeah, that's an mgz and fwereade discussion.
<bodie_> yep
<voidspace> fwereade: natefinch: mp updated
<voidspace> https://code.launchpad.net/~mfoord/juju-core/slow-replset/+merge/221709
<voidspace> fwereade: natefinch: still waiting for lbox to do its thing for the CL to update
<voidspace> fwereade: note that after discussion with natefinch we switched to waiting for a successful replicaset status check rather than a config check
<voidspace> this is slightly slower, but safer and more likely to actually fix the CI problem
<voidspace> and CL updated too
<voidspace> https://codereview.appspot.com/104800043
<natefinch> voidspace: LGTM
<voidspace> natefinch: cool, let's land it and see if it fixes the problem
<voidspace> natefinch: if not I suggest we back it out because of the extra slowdown it causes
<perrito666> back
<voidspace> natefinch: I'm hoping this change makes replicasets more reliable for tests too
<voidspace> natefinch: trying it now
<voidspace> natefinch: hmmm... well, a bit more reliable
<voidspace> natefinch: still some panics
<mgz> fwereade: do you have github permissions to create juju/core?
<mgz> I seem not to.
<fwereade> huh, I think you should
<fwereade> mgz, what description would you like? :)
<mgz> "Preliminary version, do not use yet" (I assume we can change that tomorrow :)
<fwereade> mgz, created :)
<mgz> thanks!
 * fwereade has to go to the shops, bbiab, maybe
<voidspace> sinzui: can you kick off a precise-amd64 bootstrap test
<voidspace> sinzui: the last run was revision 2814
<voidspace> http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/
<sinzui> voidspace, no
<voidspace> sinzui: is this the build problem you emailed about?
<sinzui> voidspace, since the juju version was not updated to 1.19.4, it is not possible to test new revisions
<voidspace> sinzui: right :-/
<sinzui> https://bugs.launchpad.net/juju-core/+bug/1325074
<_mup_> Bug #1325074: Juju version cannot be set to 1.19.4 <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1325074>
<voidspace> sinzui: we think we might have fixed the precise problem, we'll have to wait to see I guess
<sinzui> voidspace, I honestly have been trying. I'm in a utopic instance right now forcing the build steps to give me something to test
<voidspace> sinzui: :-(
<voidspace> is anyone looking at this?
<voidspace> I'm ten minutes before EOD I'm afraid
<voidspace> and it's krav maga tonight
<sinzui> voidspace, I don't think anyone is, sorry
<sinzui> voidspace, I have talked to a few devs, but no one has a clue
 * sinzui ponders the consequences of hacking the juju version just before the tarball is built
 * sinzui thinks unit tests fail (good), but substrate tests will run
<natefinch> mgz: can we just call it "juju"?
<voidspace> right, off to krav maga
<voidspace> probably back on later though
<voidspace> see  you all
<voidspace> EOD
<mgz> natefinch: wha?
<mgz> I was told core. There was a thread a while back and I thought that was the outcome of it.
<natefinch> mgz: hmm, I must have missed that thread
<cmars> hazmat, jam I'm going to tackle LP: #1183309, per https://bugs.launchpad.net/juju-core/+bug/1183309/comments/7. sound good?
<_mup_> Bug #1183309: destroy-service should have a force option <destroy-service> <ui> <usability> <workflow> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1183309>
<jam> cmars: make sure you sync with fwereade because he felt destroy-service --force was a Won't Fix
<cmars> jam, will do, thanks
<jcw4> bodie_:  so bson (mgo/bson) doesn't know how to serialize map[interface{}]interface{}
<bodie_> I figured that was it
<bodie_> so I guess we need to look at how gojsonschema is representing it in memory
<bodie_> I think it does a recursion
<bodie_> in fact I'm certain it does
<jcw4> well our tests indicated it was map[interface{}]interface{}... we know that Params will be a map[string](map[string]string)
<bodie_> that's not necessarily true
<bodie_> first of all, Params can definitely have simple string values
<bodie_> http://paste.ubuntu.com/7574011/
<mgz> jcw4: it seems like you should be able to write a pretty simple function that takes your map[string]interface{} and gives back usable bson
<bodie_> secondly, it could have more deeply nested maps
<jcw4> bodie_: so Params is 100% free form, constrained only by a jsonschema?
<jcw4> mgz: map[string]interface{} is okay with bson
<jcw4> it's map[interface{}]interface{} that bson chokes on
<mgz> well, taking that then
<bodie_> mgz, do you know offhand whether encoding/json can?
<bodie_> cause if we can encode it as a JSON literal, it should be simple to transform that into a bson literal suitable for direct storage
<bodie_> without having to store the raw string
<bodie_> but, I guess jcw4 is on the money with the assessment that the interface{} keys are the issue
<jcw4> play.golang.org can't do imports...
<bodie_> william mentioned we could use juju/schema to enforce String or MapString types
<bodie_> s/MapString/stringMap
<jcw4> http://paste.ubuntu.com/7574595
<wwitzel3> perrito666: thanks for the review
<natefinch> mgz: I looked back at the thread where we talked about the name, I think we were hoping for canonical/juju, but couldn't get canonical. I think otherwise, we didn't really come to a consensus. There is definitely a lot of previous art for using github.com/team/project .... when looking at our list of repos, you'll see "testing" "errgo" "loggo" ... and "core". Except core is not a thing (arguably testing is a bad name too).
<perrito666> wwitzel3: my pleasure
<bodie_> ha, that's so annoying that someone has already taken "canonical"
<perrito666> (also my duty today :p)
<bodie_> his account has also been inactive for at least a year or so
<bodie_> make that 4 years
<perrito666> bodie_: freenode?
<bodie_> oh, my bad, I was thinking of Github.  I'll shut up now ^_^
<bodie_> oh, since Nate had said github.com/team/project
<bodie_> someone is parked on github.com/canonical, but his account has been inactive for 4 years
<jcw4> bodie_: mgz, fwereade ... we know the params *passed in* when invoking an action can be arbitrary,  constrained by a jsonschema; but *declaring* the params in the actions.yaml seems like it will always be of a fixed structure?
<bodie_> JSON-Schema is quite flexible
<jcw4> bodie_: no disagreement there
<bodie_> http://json-schema.org/example2.html
<jcw4> bodie_: but actions.yaml is merely declaring the possible params, and providing a schema for what the value of those params could be
<jcw4> but the name, description, and even default value isn't defined by that schema
<jcw4> the schema we're referencing validates the value of the named parameter
<bodie_> actions.yaml is being deserialized into a document which must be JSON-schema
<bodie_> that's the only constraint as far as I'm aware
<jcw4> bodie_: the point of json schema is to validate input to the api
<jcw4> which 99% of the time is not from the actions.yaml file
<jcw4> we're confusing declaring the action with invoking the action
<bodie_> I'm not following your argument, actions.yaml does nothing but define the JSON-Schema which the Action arguments must be validated against
<bodie_> Actions will therefore accept any JSON which conforms to the JSON-Schema defined in actions.yaml
<jcw4> bodie_: you're saying that actions.yaml itself is the json schema?
<bodie_> precisely
<bodie_> of course, it doesn't have to itself be JSON, though it could since YAML can be constrained to conform to JSON
<bodie_> it is simply deserialized into a JSON document which must conform to JSON-Schema v4
<bodie_> or uh
<jcw4> bodie_: so the "name", "description", and "default" keys are json schema keys?
<bodie_> right, everything under params is JSON-schema
<bodie_> so Actions could actually also have other Actions-related metadata
<bodie_> whereas Actions.ActionSpecs is technically a map[string]JsonSchemaDoc
<bodie_> or uh
<bodie_> map[string]ActionSpec
<bodie_> where ActionSpec contains potential metadata, AND a JSON-Schema document
<bodie_> sorry, I'm sure that's very muddled in translation to english
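One way to read the structure being described, as a sketch (hypothetical field names, not a quote from the branch):

    package charm

    // Actions is the deserialized actions.yaml: per-action metadata plus
    // the JSON-Schema that invocation arguments must validate against.
    type Actions struct {
    	ActionSpecs map[string]ActionSpec
    }

    // ActionSpec holds potential metadata AND a JSON-Schema document.
    // Keeping Params string-keyed (rather than map[interface{}]interface{})
    // also sidesteps the bson serialization problem discussed earlier.
    type ActionSpec struct {
    	Description string
    	Params      map[string]interface{} // the JSON-Schema, as a map
    }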
<perrito666> ok, my brain just segfaulted
<perrito666> I just bzr git pushed
<jcw4> bodie_: #jujuskunkworks?
<bodie_> yeah
<natefinch> perrito666: haha
<natefinch> perrito666: sometimes I juju push... which works about as well
<perrito666> I am so tempted to create plugins for git and bzr that are forgiving of my brain short circuits
<perrito666> sweet, natefinch did you see https://codereview.appspot.com/99670044/ ?
<natefinch> perrito666: yeah, I was trying to talk to fwereade about it.  I was wondering if we're really interested in writing bulk api calls for calls that can't be bulk right now.  I don't know what a bulk call for ensure availability would look like.  We don't have multiple environments on a state server currently.
<bodie_> does anyone know offhand where an example of juju-schema usage would be found?  (I'll be doing a little looking myself starting now, but any pointers are appreciated :) )
<perrito666> natefinch: when is it fall over there?
 * perrito666 gets promised a feature for his phone that he has been waiting on for years
<natefinch> perrito666: heh.... it's fall in like october
<perrito666> bu
<natefinch> perrito666: I take it you're watching the Apple stuff today?  Is it widgets or custom keyboard or something else?  those were the two that struck me as big
<natefinch> (I'm just looking at a summary of a live blog, not actually paying that much attention)
<perrito666> natefinch: I am listening to it, I had to get the mac with safari to get the vid to run so I just let it run on bg while I code
<natefinch> haha
<perrito666> natefinch: but no, the ability to answer phone calls and use sms on the computer
<natefinch> oh, neat
<perrito666> natefinch: since the first time I saw an iphone I wondered what was the goal of paying a fortune for a phone+computer that are made by the same manufacturer and cannot do anything more complex than copy music from one to the other
<natefinch> haha
<perrito666> mm /testing/base.go:26: undefined: "github.com/juju/testing".LoggingSuite
<perrito666> what makes the dependencies go unclean?
<natefinch> perrito666: generally it's that juju-core updates and the external repo doesn't, or vice versa
<perrito666> natefinch: mm, apparently I have conflicts :|
<perrito666> strange
<perrito666> godeps ux is.... nil
<natefinch> yeah
<natefinch> I wish we'd drop it in favor of versioned branches
<perrito666> natefinch: I would be fine if it just did git/bzr pull by itself
<natefinch> perrito666: yeah, the problem is that in order to be fully functional, it would have to basically replicate the whole of go get
<perrito666> natefinch: well, I could do another go lazytool in bash :p but you condemn bash tools for go
<perrito666> :
<perrito666> :p
<natefinch> haha
<natefinch> perrito666: you should send a patch in for godeps
<thumper> fwereade: morning
<thumper> fwereade: around?
<thumper> oh for the love of all things...
<thumper> why do people feel the need to write code where they have to repeat themselves
<thumper> a lot...
<perrito666> hey, hey would someone ptal https://codereview.appspot.com/103820044
<perrito666> thumper: ?
<thumper> hi perrito666
<perrito666> thumper: good mornight
<thumper> ah crap...
<thumper> just remembered that I have to call the school to tell them daughter is at home...
<perrito666> thumper: wont they notice? :p
<thumper> perrito666: sure... but this is to say that I know ...
<perrito666> heh, here they check every day and at the end of each month ask you if you were aware of the days they were not there :p less than ideal
<thumper> just had a mind blank on my own home phone number
<perrito666> home phone, are you from the past?
 * perrito666 misses having a home phone, but is not willing to wait 3-6 months for it
<thumper> perrito666: will localhost ever be in the list more than once?
<waigani>  thunderbird just crashed with the amount of emails I have to catch up on
<waigani> morning menn0, thumper :)
<thumper> waigani: morning
<menn0> morning waigani
<thumper> waigani, menn0: fyi, I'm working on adding display name hand doing more hacking elsewhere to make that happen than I was entirely comfortable with
<thumper> I want you two to review it later
<menn0> thumper: np
<waigani> thumper: display name hand? what is that?
<thumper> s/hand/and/
<thumper> typo
<thumper> menn0: did you have a branch changing --password on user add?
<menn0> thumper: yes. that was merged some time ago
<thumper> menn0: ta
<sinzui> thumper, I fixed the bug
<thumper> sinzui: yeah, I saw after I sent the email
<sinzui> thumper, I found your constant in versions.go, I decided we can have up to 1.19.9 before we need to hack again
 * thumper nods
<voidspace> so my fix for precise-amd64 failed, but the error message is different
<sinzui> thumper, It was obvious to fix when I had a break from the code. Friday was too chaotic for me
<sinzui> voidspace, I think it has changed, yes
<voidspace> sinzui: trying on my precise box now on the off chance I can reproduce, but I'll dig into it anyway
<voidspace> sinzui: glad you managed to get CI working again, thanks
<sinzui> voidspace, is this a timeout issue? Can I change a config to extend the timeout to 10 minutes?
<voidspace> sinzui: I don't think so - the session is *immediately* saying it has already been closed
<sinzui> wallyworld, CI has blessed 1.18.4 trunk. I will release it tomorrow unless you want me to delay. I see one bug in progress.
<alexisb> alright folks, juju is last, if you have a twitter account and a few minutes vote! : http://ibmappthrowdown.tumblr.com/
<alexisb> sinzui, fyi wallyworld is supposed to be out sick today
<jcw4> alexisb: how many votes do you get... I already voted once
<alexisb> 10 I think
<jcw4> woo hoo
<alexisb> you just have to change the first part of the message
<sinzui> wallyworld, since 1.18.2 doesn't support utopic, I couldn't deploy or provision a utopic slave to test. I had to manually install all the test resources to verify that utopic can be a deploy target. Oh the irony.
<voidspace> I've voted from my twitter accounts
<sinzui> alexisb, damn, I will find axwalk later and ask about his plans
<jcw4> alexisb: well, six is about all I can comfortably inflict on my twitter friends
<voidspace> naturally bootstrap succeeds on my precise machine
<alexisb> :)
<alexisb> fair enough jcw4
<jcw4> :)
<sinzui> voidspace, is your machine lxc or vm?
<voidspace> sinzui: it's a VM
<thumper> mwhudson: I think you have spoiled me
<thumper> mwhudson: now I get angry looking at terrible tests
<mwhudson> thumper: happy to help!
<mwhudson> also i mostly got that way from talking to jml, so blame him!
<voidspace> anyone know what sort of time axw is likely to be around?
<mwhudson> voidspace: he's perth, isn't he?
<voidspace> mwhudson: I believe so
<mwhudson> so utc+7 so it's only like 5 am for him now
<mwhudson> so probably another 3-4 hours?
<mwhudson> oh no, utc+8
<voidspace> heh, yeah - 6:21am
<mwhudson> oh heh, WA stopped doing dst in 2009
<mgz> anyone know who's admin on the juju team on github? william was earlier, but I need someone to grant me rights so I can push a branch
<mgz> alexisb: ^halp
<wallyworld> sinzui: i am backporting the fix from trunk now
<wallyworld> should be done real soon now
<mgz> wallyworld: do you have admin on github.com/juju ?
<wallyworld> mgz: let me check
<mgz> I need to push my import up there to switch the bot over
<wallyworld> mgz: i added you as an owner of juju
<wallyworld> mgz: what do you think about it being juju/juju ?
<mgz> wallyworld: thanks, pushing
<wallyworld> i can see the point
<mgz> wallyworld: I don't really have an opinon, which is why I just did what was suggested
<mgz> I'm fine with whichever name people want
<wallyworld> thumper: do you have a strong opinion? github.com/juju/juju-core perhaps?
<thumper> wallyworld: not especially
<thumper> menn0, waigani: with  you in a minute
<mgz> wallyworld, okay, repo up and job switched
<menn0> thumper: np
<mgz> emailing instructions
<wallyworld> mgz: i can see that core may be suboptimal if people fork. can we make it juju-core?
<wallyworld> mgz: awesome
<wallyworld> did you get the tarmac integration sorted?
<mgz> discussed using other non-ec2 clouds with sinzui, seems hp would be the best switch
<mgz> but want to try a couple more things on ec2
<sebas5384> cmars: ping
<cmars> hi sebas5384
<cmars> what's up?
<sebas5384> hey!
<sebas5384> i did the -i parameter to specify the interface (at least i think i did hehe)
<sebas5384> but i don't know how to test
<cmars> oh cool
<sebas5384> i ran the make restore install
<sebas5384> but it always gets stuck in some build
<sebas5384> if you can guide me to test it cmars :)
<wallyworld> mgz: ok, so long as we get to the point where we migrate and the landing is reliable :-)
<cmars> sebas5384, can you push your changes up to your branch, so i can try it?
<sebas5384> cmars: yes!
<sebas5384> cmars: when it's there i'll notify you
<sebas5384> cmars: https://github.com/sebas5384/juju-nat/tree/local-provider-support
 * cmars takes a look
<sinzui> The windows builder and test has gone tits up
<sinzui> I need to build a new one to retest 1.18.4 for release tonight
<perrito666> thumper: ping me when you are back plz
<thumper> perrito666: back and off calls now
<perrito666> I saw your comment, how do you suggest that port be set?
<thumper> there are other examples in the code
 * thumper looks for one
<thumper> perrito666: launchpad.net/juju-core/charm/testing/mockstore.go:
<thumper> perrito666: line 59
<thumper> perrito666: effectively ask for port 0, and then look at what it gave you
<perrito666> I see, thank you
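The trick in self-contained form (not the mockstore code itself): ask for port 0, and then look at what it gave you.

    package main

    import (
    	"fmt"
    	"net"
    )

    func main() {
    	// Port 0 asks the OS to pick any free port.
    	l, err := net.Listen("tcp", "127.0.0.1:0")
    	if err != nil {
    		panic(err)
    	}
    	defer l.Close()
    	// Read back the port the OS actually assigned.
    	port := l.Addr().(*net.TCPAddr).Port
    	fmt.Println("listening on port", port)
    }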
<cmars> sebas5384, commented
<sebas5384> cmars: yeah! i saw, thanks
<sebas5384> cmars: but can you help me on how to test it?
<wallyworld> sinzui: that 1.18.4 bug is now fix committed
<sebas5384> i will make the changes cmars
<cmars> sebas5384, sadly, i do not have unit tests. we're in the awkward position of manually testing the thing, unf
<sinzui> wallyworld, fab, now I need to just build a replacement windows instance to test it
<wallyworld> joy
<cmars> sebas5384, shame on me... i'll add an issue to make some proper unit tests. juju-core has some nice testing infrastructure, which will help
<cmars> in the meantime, manual testing will have to suffice
<sebas5384> nice!
<sebas5384> yeah!
<sebas5384> but how?
<sebas5384> hehe i don't know how to build it and use it
<cmars> sebas5384, i'll note the pull req
<cmars> or the branch, rather
<sebas5384> thanks cmars o/
<sinzui> wallyworld, I think 1.18.4 will have a good ride through CI. My new win machine works. It was the machine that failed to start and caused the curse of the last revision
 * wallyworld crosses fingers
<sinzui> wallyworld, CI won't get to the stable revision for about 90 minutes though
<wallyworld> o
<wallyworld> k
 * thumper goes to hit some things...
<sinzui> hmm. CI now takes 2h10m to test everything
 * sinzui ponders parallel builds in ci3.5
#juju-dev 2014-06-03
<axw> morning all
<waigani> morning axw :)
<perrito666> well github is rather verbose
<waigani> how's bug land?
<axw> not bad, since I haven't been looking at the horrible mongo-related bugs lately
<perrito666> ok guys, EOD
<axw> wallyworld: got time to chat?
<wallyworld> sure
<axw> wallyworld: https://plus.google.com/hangouts/_/gw7us4q4lhcyws5ksu7ipthn6ia?hl=en
<axw> wallyworld: there is already a canonistack slave
<wallyworld> oh
<axw> but... running tests on there may interfere with CI
<wallyworld> yeah
<axw> I'm going to try adding another one in that we can use
<wallyworld> axw: the current script used by jenkins for the github landing fails with an unexpected EOF. i can't spot the issue, here's a pastebin - it's the last bit that's bad. can you see the error? http://pastebin.ubuntu.com/7577197/
<axw> wallyworld: nothing jumps out
<wallyworld> ok, ta
<axw> wallyworld: the "}" needs to be preceded by a newline or semicolon
<wallyworld> ok, will try that
<wallyworld> \o/ thanks, that was it
<axw> nps
<sinzui> wallyworld, mgz, canonistack is under-resourced, our aws account is reaching saturation. joyent and hp cloud have lots of cpus/instances for us to test/build with
 * sinzui will look to move some function tests and series tests to hp this week
<wallyworld> sinzui: ok, we may need to try hp cloud. i'm testing now with a nailed up ec2 instance i was using for mongo testing in order to save time
<sinzui> wallyworld, mgz, will the lander be running unit tests for more than 1 series
<sinzui> ?
<wallyworld> sinzui: not initially since we are just going trunk at first, but it will
<sinzui> wallyworld, I need to teach the default run-unit-tests about nova. At this time we can run ec2 or any existing host. Teaching the setup about joyent would be good too
<sinzui> oh, and azure was saturated last December, not that you wanted to wait 25 minutes to bring up a instance
<wallyworld> sinzui: at the moment, the unit tests the lander runs fail every time using an ephemeral ec2 instance
<wallyworld> using a nailed up instance seems better
<sinzui> wallyworld, bad script, unbound variable
<wallyworld> i have a couple of tests fail due to build errors but everything else passes
<sinzui> that's the problem with cargo-culting a script, you lose the context of what the required setup is
<sinzui> wallyworld, I am adding set -eux to make it easier to debug
<wallyworld> ok
<sinzui> wallyworld, I think you are going to get a pass this round
<wallyworld> i hope so
<wallyworld> well, except for the build error
<wallyworld> sinzui: did you change the test-merge-git script?
<sinzui> wallyworld, no, I think someone fixed the unbound var from run #2
<sinzui> wallyworld, sorry, a failure
<sinzui> oh, is this trusty?
<wallyworld> yeah but at least it was due to a build failure
<wallyworld> i think so
<sinzui> this is one of the common intermittent bugs.
<wallyworld> oh, i just saw the replicaset failure also
<sinzui> maybe we want to call make check a few times to retry intermittent failures
<sinzui> make check || make check || make check
<sinzui> ^ three tries
<wallyworld> on canonistack with tarmac, we seem to have eliminated most of the failures
<wallyworld> seems more flakey on ec2
<wallyworld> but yes, i guess we can do 3 retries
<sinzui> wallyworld, This test always gets current packages before starting
<sinzui> and I think you were using precise which had fewer changes
<sinzui> wallyworld, may I make some quick changes to remove the unused parts of the script and make it retry?
<wallyworld> yes please
<wallyworld> we only have a small window left to get this working
<wallyworld> it will still fail due to the build errors though
<wallyworld> looks like the import pulled in a deleted package
<wallyworld> cmd/jujuc
<wallyworld> sinzui: those obsolete files which cause the compile errors are now gone.
<sinzui> okay
<wallyworld> are you finished with the script?
<wallyworld> can we start another run?
<waigani> thumper: ping
<thumper> waigani: in the middle of something right now
<waigani> thumper: okay, np
<sinzui> wallyworld, I see this script was restarted. I just saved my changes. I removed the non-ephemeral conditions because this use is always ephemeral. I removed the lines that built tools. I need to do the same to the real scripts because we now do a proper isolated build for each test deb and tool
<wallyworld> sinzui: i restarted just to get a head start on whether my build fixes were ok, i'll re-run with your changes also
<sinzui> okay
<wallyworld> sinzui: my main concern is that now the landing process takes a fair bit longer
<sinzui> I just discovered that I cannot build juju packages and run unit tests on the same machine
<sinzui> wallyworld, yep
<wallyworld> so i'm thinking we do want to move to a nailed up instance at some point post cutover
<sinzui> you can make it go much faster with a provisioned instance or one that you start and stop. The deps phases of the script are essentially skipped
<wallyworld> yep
<wallyworld> i think having one that we start and stop will be best
<sinzui> I am pondering having a jenkins slave for each crucial series+arch, and using the instance as a dedicated builder for running local-provider tests
<wallyworld> sounds reasonable
<wallyworld> sinzui: yay, that run worked
<wallyworld> i'll try another with your changed script active
<sinzui> excellent, now we get to see if I can hack a bash script without syntax highlighting
<axw> wallyworld: it says #18 passed, but in the logs the replicaset package's tests failed
<axw> oops
<axw> wrong tab :)
<wallyworld> phew
<axw> wallyworld: why did the PR not get updated?
<wallyworld> it should have i think, not sure
 * thumper grumbles
 * thumper makes unpleasent noises
<thumper> I've worked out why this test is failing
<thumper> it is because JujuConnSuite bootstraps the environment for you
<thumper> and I've changed the base test suite for this test
<thumper> to something much simpler,
<thumper> hence, broken test
<thumper> now how to just write out some jenv stuff...
<axw> wallyworld: I was waiting for a LGTM, then I would do $$merge$$ - that's still the protocol right?
<wallyworld> axw: yeah, i'm experimenting trying to get the bot to pick it up
<wallyworld> the current core will be blown away and replaced with the trunk snapshot
<axw> ok
<waigani> axw, wallyworld: will we use codereview at all anymore?
<axw> waigani: no
<wallyworld> nope :-D
<wallyworld> and good riddance
<waigani> big changes, okay
<axw> wallyworld: the lander didn't pick up the $$merge$$ on your other PR?
<wallyworld> axw: nope :-(
<wallyworld> there's something wrong clearly
<waigani> man git is spamming my inbox!
<wallyworld> there's a cron job, but i also ran it by hand
<axw> wallyworld: are you running it on your machine now?
<axw> mk
<wallyworld> axw: nope, i've sshed into jenkins
<axw> wallyworld: your membership in the juju org needs to be made public
<wallyworld> ok, i didn't realise it wasn't
<wallyworld> ah, that has fixed it
<axw> wallyworld: I'll reply to the list about that
<wallyworld> axw: that's a good pickup because a bunch of others are also private
<axw> yup
<wallyworld> axw: so now with the ephemeral instances we waste about 6-7m getting everything set up to run the tests. not great, but not a show stopper for cut over
<axw> wallyworld: yeah doesn't seem too terrible for a first cut
<wallyworld> axw: the thing now is the extra occurrences of these fraking timeout errors etc which break the tests
<axw> yep... :/
<wallyworld> just got one now in the current run
<wallyworld> i could have sworn that it's worse on ec2
<axw> wallyworld: I wonder if pre-warming the disk would be of benefit
 * axw looks into what we can do there
<wallyworld> axw: yeah, i was wondering that also
<thumper> nailed it!
<thumper> coffee time
<thumper> WTF...
<thumper> lbox still trying to work out the diff
 * thumper goes to make the coffee
<axw> wallyworld: one option is to try using the ephemeral storage, in /mnt
<axw> wallyworld: I think that should be considerably faster
<axw> and we don't need the persistence...
<thumper> omg, that grew a bit... https://code.launchpad.net/~thumper/juju-core/user-display-name/+merge/221823
<wallyworld> axw: can't hurt, looks like the current job will pass second time through. can you tweak the script to use /mnt?
<axw> wallyworld: even just putting TMPDIR in there
<axw> wallyworld: I'll give it a shot
<wallyworld> ok, ta
<wallyworld> TMPDIR sounds sensible also
<thumper> menn0:  https://codereview.appspot.com/102970043 - it grew more than expected
 * thumper -> coffee
<menn0> thumper: looking
<thumper> menn0: I suggest that you, me and waigani chat tomorrow about it
<thumper> as there are a number of things I want to talk through
 * thumper heads out to local code craft meeting
<menn0> thumper: ok. I'm almost done though and I think I understand most of the things you're going to talk about
<thumper> :)
<menn0> thumper: thanks for cleaning up some of my --password changes. Much better.
<wallyworld> axw: that last one took all 3 attempts to merge. different error each time. the watcher / session closed issue seems to be more prevalent of late
<axw> wallyworld: I put in the $TMPDIR change, hopefully that makes a difference
<wallyworld> axw: i'll trigger another merge and we'll see how it goes
<axw> ok
<wallyworld> axw: is it worth setting up a dir in /mnt and setting GOPATH to that?
<wallyworld> before i do another run
<axw> wallyworld: I don't think so, that's just for building the code
<axw> it might make building the code a bit quicker, I guess, but I don't think it's a bottleneck by a long shot
<wallyworld> ok
<axw> wallyworld: https://github.com/juju/core/pull/4
<wallyworld> axw: looks like it's off and running
<axw> goodo
<axw> wallyworld: we should probably run that script before tests too
<wallyworld> yeah
<wallyworld> i think i mentioned that to martin but we never wrote it down
<axw> I'll add it in the job now
<wallyworld> ok, ta
<wallyworld> axw: a random thought - would running the tests with say "-p 4" help, since that may cut down on the thrashing of resources etc
<axw> wallyworld: umm, I think it's going to be 2 as it is
<axw> it's running on a dual core isn't it?
<axw> -p defaults to #CPU
<wallyworld> not sure, i had thought it was higher
<wallyworld> you may be right
<axw> wallyworld: I'll check, but I don't think it's higher than 4
<wallyworld> if it is 4, then may -p 2
<axw> wallyworld: it has 4
<wallyworld> yeah, m1.xlarge
<axw> wallyworld: doesn't hurt to try
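For reference, go test's -p flag caps how many package test binaries run in parallel, and it defaults to the number of CPUs; a trivial Go check of what that default would be on the build machine (4 on an m1.xlarge, per the discussion above):

    package main

    import (
        "fmt"
        "runtime"
    )

    // go test -p defaults to the CPU count, so this prints the implicit
    // test parallelism on the machine it runs on.
    func main() {
        fmt.Println(runtime.NumCPU())
    }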
<wallyworld> the latest run failed twice
<axw> yeah :/
<wallyworld> axw: sigh. latest builds not happy
<wallyworld> go tool: no such tool "vet"; to install:
<wallyworld> 	go get code.google.com/p/go.tools/cmd/vet
<wallyworld> i have to head to soccer, are you able to have a look and i'll check later?
<axw> wallyworld: sorry, will fix
<rogpeppe> so... one hour to github D Day... how's it looking?
<voidspace> morning all
<TheMue> voidspace: morning
<voidspace> TheMue: o/
<jam> wow, it took me 3 hours to actually get a bootable USB stick that works…. so hard, why is it so hard...
<voidspace> jam: I've always found it difficult too
<voidspace> jam: and I think in the end I discovered I could only make it work with *one* of my USB sticks
<jam> voidspace: yeah, there are 4 different master partition table types, and various flags that have to be set, and ....
<jam> and tools that you try to use that just blow up unless everything is already set up correctly
<jam> voidspace: so I formatted with MBR form on my Mac, took it over to Ubuntu and used fdisk to create 1 partition dos compatible, set it active and bootable, set the type to 0x0C, mkfs.fat -F 32, and then use Startup Disk Creator to copy the ISO onto the stick.
<jam> now *maybe* msdos formatting from gparted would have also worked (maybe)
<voidspace> heh, ouch
<jam> I tried a lot of other permutations in there
<jam> but it seems like a streamlined "here is an ISO, and do what you want to this stick to make it boot"
<jam> would be such a better experience
<lifeless> jam: dd usually works, no ?
<voidspace> I always used to use unetbootin on the Mac - it didn't work for me on Ubuntu last time I tried it
<voidspace> lifeless: morning
<jam> lifeless: so the Mac instructions have you convert the ISO into a DMG and then dd it, which worked once
<jam> but I accidentally used the 64 bit version for an old machine
<jam> so I had to fix it for 32 bit
<jam> and â¦ yeah, a lot of stuff didn't work
<jam> I think the key was finding a way to get MSDos compatibility enough
<jam> lifeless: dd doesn't set the bootable or partition table, does it?
<lifeless> jam: it includes a partition table
<lifeless> jam: dd of=/dev/sdb
<jam> lifeless: the biggest problem I had was having the old machine find the USB device as bootable, I'm not sure if I tried just dd from the .iso
<voidspace> axw: ping
<jam> it wasn't in any of the recommended guides
<jam> all that just to run shred sanely on all my old hard drives before getting rid of old equipment
<axw> voidspace: pong
<jam> I'll try that once the shredding is done :)
<jam> lifeless: good to see you around, btw, haven't said hi in a while
<lifeless> jam: hi :)
<lifeless> jam: did you consider netbooting ?
<lifeless> jam: e.g. with maas and run shred from there?
<lifeless> or even perhaps maas should have a decommission-hardware facility, to make removing old content easy
<jam> lifeless: interesting thought, I had not considered it. The main idea was to not have / on the physical disks so that I could nuke everything without worrying about mounts. Because I *could* just boot the thing directly, though it was running Hardy
<jam> and I think the other machine is/was running Fedora Core *3*
<voidspace> axw: so we made a change to Initiate yesterday on the suspicion that the CI machine was just slow
<voidspace> axw: this is the bootstrap problem on precise amd64
<axw> voidspace: I saw
<voidspace> axw: it did not fix the issue - but it made the problem a bit clearer
<axw> oh?
<jam> lifeless: one funny thing working here, is that I'm now the most senior person in all of Juju. Tim is here, but I predate him by a couple of months (since he actually originally applied for the Bazaar dev role)
<voidspace> axw: immediately after calling replSetInitiate we poll until CurrentStatus returns successfully
<voidspace> axw: on the failing machine that fails immediately (and persistently) with a "Closed explicitly" error
<voidspace> axw: that specific error seems to come from the mgo library when socket.Close() is called
<axw> yep
<voidspace> axw: I've read through the code from MaybeInitiate  (I probably need to go further back in the code but we're running out of time) to see if I can find a race condition
<axw> on the other machines it comes back saying "it'll be ready soon" or something?
<voidspace> axw: yep
<voidspace> I thought that we just weren't waiting long enough before giving up on this machine
<voidspace> but that appears not to be the case
<voidspace> axw: I do have one question
<voidspace> inside replicaset/replicaset.Initiate
<lifeless> jam: wow, sounds like there was quite some turnover?
<voidspace> we call monotonicSession.SetMode(mgo.Monotonic, true)
<jam> lifeless: well, this is just Juju, there are still people at Canonical that predate me.
<jam> plus, I'm like #70 or so
<jam> so not a huge gap in front, and 600 people now at Canonical
<jam> so, I was in top 10% to start
<voidspace> axw: session.SetMode - the second parameter is for "refresh", which if true calls session.unsetSocket()
<voidspace> axw: do you know why it does this?
<axw> voidspace: I think the refresh is to ensure restarting of the consistency guarantee
<axw> I don't know any more than that
<axw> voidspace: rogpeppe or natefinch may know more about it
<lifeless> jam: sure, I was meaning in the juju context
<voidspace> axw: as far as I could tell this *doesn't* close the socket - although it does set a couple of sockets to nil
<voidspace> axw: but there's no finalizer that I could find
<lifeless> jam: like, I'd have thougt kapil or mramm joined juju before you
 * axw looks
<jam> lifeless: I mean joined Canonical, where I predate mramm, but maybe not kapil
<voidspace> axw: and it calls socket.Unlock() - which I think comes from the fact that mgo.socket embeds sync.Mutex
<jam> he doesn't directly report on the same chain
<voidspace> axw: which also wouldn't call Close
<jam> for Juju proper, lots of people still working on the project longer than me
<jam> fwereade, mramm, rogpeppe, etc.
<axw> voidspace: unsetSocket does a socket.Release
<voidspace> lifeless: jam: mramm is relatively new - I predate him
<jam> voidspace: yeah, he's only about 2 years or so
<jam> but he's been working as part of Juju longer than me
<voidspace> axw: right, which does an Unlock() followed by a LogoutAll()
<voidspace> axw: none of which *seems* to call Close()
<voidspace> I guess that isn't it - it was just suspicious
<voidspace> axw:  and I wondered if we *needed* that refresh
<axw> voidspace: hmm yeah, I think it just puts it back into a pool to be reused.
<axw> voidspace: that I do not know
<voidspace> axw: something is closing the socket the session is using - causing all further operations to fail
<voidspace> axw: but only on the CI build machine...
<voidspace> axw: I think this morning is the deadline though, we can't leave trunk broken any longer
<voidspace> even if it's just on one machine...
<voidspace> so we'll have to backout my changes and switch to a different strategy for setting the Mongo WMode (which was the purpose of enabling replica sets for local provider)
<axw> voidspace: bugger :(
<voidspace> not having direct access to the machine to try things on doesn't help
<voidspace> coffee
<axw> voidspace: it's the closing of the "Server" object that closes all the sockets
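A minimal sketch of the refresh-then-poll pattern under discussion, assuming the labix.org/v2/mgo import path in use at the time; the function name, attempt count, and delay are illustrative, not juju's actual replicaset code:

    package replicaset

    import (
        "time"

        "labix.org/v2/mgo"
        "labix.org/v2/mgo/bson"
    )

    // waitForStatus polls replSetGetStatus after replSetInitiate, refreshing
    // the session before each attempt so that a socket the server closed
    // during initiation ("Closed explicitly") is not reused.
    func waitForStatus(session *mgo.Session) error {
        var err error
        for i := 0; i < 60; i++ {
            session.Refresh() // discard cached sockets that may have been closed server-side
            var status bson.M
            if err = session.Run(bson.D{{"replSetGetStatus", 1}}, &status); err == nil {
                return nil // the replica set is up
            }
            time.Sleep(time.Second)
        }
        return err
    }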
<jam> wow, driving it at 73MB/s sustained will still take 4 hours to do 1 pass over 1TB of data, and I have it set to do 3, guess I just come back tomorrow :)
<lifeless> jam: doing some map-reduce ?
<lifeless> jam: oh shred ;)
<jam> lifeless: shredding old disks before giving them away
<lifeless> well, gnight :)
<jam> lifeless: gnight
<stub> github.com/ju/ju/core
<voidspace> stub: hah, nice
<voidspace> we should do that
<voidspace> natefinch: morning
<natefinch> voidspace:  morning
<jam> fwereade: quick poke about CodeNotFound vs CodeUnauthorized here. If you pass an environ we don't recognize, what code should we get?
<jam> We are still getting the error during Login, because of how round trips work, so Unauthed is somewhat reasonable
<perrito666> good morning
<wallyworld> rogpeppe: jam: natefinch: you guys ok with github.com/juju/juju-core ? seems to be the least hated option
<jam> wallyworld: omg, wtf,,, how could you say such a thing
<jam> wallyworld: I'm pretty easy going here.
<wallyworld> me too
<wallyworld> just trying to get consensus, whatever that means
<jam> I'd actually like to get rid of the "-core" part, because it feels like it is just baggage because there was a pyjuju that took 'juju' over for too long
<wallyworld> yeah
<natefinch> that was my thought of why it should just be juju
<jam> if you "apt-get install juju" you get juju-core
<wallyworld> so github.com/juju/juju then?
<jam> I agree that github.com/juju/juju/juju and github.com/juju/juju/cmd/juju.Juju are a bit silly, but the -core doesn't actually add any information.
<natefinch> yeah, it's the same stuttering, just with some garbage in the middle
<wallyworld> i thought core because it distinguishes it from logging, testing, etc
<jam> wallyworld: github.com/juju/juju is my #1, with github.com/juju/juju-core being #2, I guess
<jam> fair point, though I'd rather prefer we had libjuju and juju the cmdline tool :)
<natefinch> for the record, I still don't know what the juju-core/juju package is for, maybe we just need to rename that
<wallyworld> or rename juju the team to juju-team :-)
<wallyworld> github/juju-team/juju
<wallyworld> +1 to renaming juju-core/juju
<natefinch> wallyworld: I actually don't like the testing package name, it's not descriptive on its own, and it should be, if it's in its own repo
<natefinch> s/package/repo/
<wallyworld> yeah, i haven't thought about that one, since i didn't set it up
<jam> natefinch: it is the entry point, AIUI, so juju/api.go etc. "juju.NewAPIClientFromName()"
<jam> natefinch: arguably we could have that at the top level
<jam> github.com/juju/juju.NewAPIClient()
<natefinch> jam: having something at the top level would be nice.  I'd love to have cmd/juju at the top level, since that's what you'd normally "go get"  even though we sadly don't support that
<jam> natefinch: IMO that is a clear use case for having 2 separate things, a 'libjuju' and a 'juju command line client"
<wallyworld> that makes sense
<voidspace> perrito666: morning
<wallyworld> what do other github team based projects do?
<wallyworld> is team name normally = product name?
<natefinch> often times, yes
<wallyworld> so i guess we should go with github/juju/juju
<wallyworld> can we all live with that?
<natefinch> the only other person I heard respond was axw, who said he moderately preferred /juju/juju, I think
<voidspace> I prefer juju/juju
<voidspace> fwiw :-)
<wallyworld> ok, the people have spoken :-)
<natefinch> Actually, rogpeppe responded, but sort of vaguely
<rogpeppe> i'm not that keen on juju/juju actually
<rogpeppe> because then we have a package github.com/juju/juju/juju
<rogpeppe> which seems like overkill
<rogpeppe> i actually think that core isn't too bad
<rogpeppe> but i'd be happy with juju/juju-core too
<natefinch> rogpeppe: we can just rename that package
<rogpeppe> as it distinguishes the genuinely juju-specific parts of github.com/juju
<rogpeppe> because actually most packages under github.com/juju are *not* juju specific
<rogpeppe> so having a juju- prefix to the name makes sense
<natefinch> that's why I like calling this repo juju :)
<rogpeppe> natefinch: yeah, but see above :-)
<rogpeppe> jujujujujuju
<voidspace> natefinch: my absolute last-ditch attempt to fix local-deploy-precise-amd64
<voidspace> natefinch: https://code.launchpad.net/~mfoord/juju-core/refresh-session/+merge/221857
<natefinch> see above.... we should rename that package anyway
<voidspace> natefinch: just running all tests here to check it doesn't break anything
<voidspace> natefinch: should I propose it?
<rogpeppe> natefinch: well, the original intent of that package was that we'd have a "juju" package which would be the "face" of juju to external Go code
<rogpeppe> natefinch: i still think that's a reasonable intent
<voidspace> yay for bike-sheds
<natefinch> rogpeppe: rename it to libjuju, since juju is the command line client
<rogpeppe> natefinch: and if we were to export versioned APIs, maybe it might make sense to have a separate (versioned) repo named "juju" which would import juju-core
<rogpeppe> natefinch: libjuju?
<dimitern> vladk|offline, standup?
<natefinch> rogpeppe: maybe github/juju-team is better?
<rogpeppe> natefinch: probably
<rogpeppe> natefinch: but github/juju is fine too
<rogpeppe> natefinch: and github/juju/juju-core works pretty well, i think.
<rogpeppe> natefinch: and leaves the path open for other juju- packages
<natefinch> rogpeppe: juju-core was my second choice, I just don't think core adds much... and using /juju doesn't stop someone else from using juju-foo
 * natefinch wants to make a juju-fu package now
<rogpeppe> natefinch: if the top level of the juju-core repo was a usable package, i'd agree that juju was a better name (it makes for a more guessable identifier) but as it is, i think i'm in favour of juju-core, or just core and people can rename it. tbh i don't care too much.
<natefinch> well, I'd love to put cmd/juju in the root so that the CLI client is the top level "thing that gets built"... but that's probably asking too much ;)
<wallyworld> so do we want juju-core then?
<natefinch> It sounds like we have 4 votes for /juju and one for juju-core ....  so I don't know where that puts us
<wallyworld> i originally was +1 for juju-core if core were not suitable
<voidspace> I'd have thought it was fairly clear where it puts us...
<natefinch> wallyworld: so 2 for juju-core.
<natefinch> voidspace: I'm actually not trying to make it a popularity contest
<voidspace> heh
<mgz> I have decided we're going to use github.com/juju/justsomedamncode
<natefinch> lol
<wallyworld> i just wish our team was called juju-team
<voidspace> it should be justsomedamnedcode
<wallyworld> that would remove a lot of the stuttering
<natefinch> https://github.com/rails/rails
<voidspace> team and project name the same is a pretty common convention I would expect
<natefinch> for things that are actually worked on by outsiders, yes... there's github.com/dotcloud/docker  for example
<natefinch> (for things that are more primarily worked on by the company)
<voidspace> natefinch: ping
<voidspace> natefinch: I'm going on lunch
<voidspace> natefinch: can you take a look at https://codereview.appspot.com/102980043/
<voidspace> natefinch: and see if you think it's worth one last try, or if you have a better idea / modifications
<voidspace> natefinch: or just think it's a bad idea...
<natefinch> voidspace: I looked, it's worth a try.  We should really get on the CI machine... sinzui tells me we can do that now.  I forget exactly what he said but I can find it on the traceback
<perrito666> oh man, I missed the bikeshed
<perrito666> say guys, I have something to submit, but I am not sure were we stand since today is go to github day
<jam> perrito666: well they haven't said "we are on github now" so I think we're still on launchpad
<perrito666> jam: as far as we are not in between
<jam> perrito666: until they say stop, we might as well land where we land
<jam> well, until they say go, I guess
<perrito666> jam: ack
<perrito666> jam: btw https://codereview.appspot.com/103820044/ if you want to take a look
<jam> perrito666: do we need to concern ourselves about getting 127.0.0.1 or ::1 instead of "localhost" ?
<perrito666> for the moment I am getting spammed by gh I need to create a few new filters it seems
<jam> perrito666: fortunately, they are "lists" so gmail filters them ok, but yes, every inline comment = new email
<wallyworld> jam: perrito666: there will be 30 minutes' notice, so land away till then
<wallyworld> a few prereq things are being finalised
<perrito666> jam: you mean why I actually look for localhost in the test?
<jam> perrito666: 		if strings.HasPrefix(addr, "localhost:") {
<jam> 127.0.0.1 is *also* localhost
<jam> as is ::1
<jam> as is the actual public IP of the local machine
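A sketch of a loopback check that treats all the spellings jam lists the same way, using only the standard net package; isLoopbackAddr is an invented helper, not the code under review:

    package main

    import (
        "fmt"
        "net"
    )

    // isLoopbackAddr reports whether a host:port address points at the
    // local loopback, covering "localhost", "127.0.0.1", and "::1" alike.
    func isLoopbackAddr(addr string) bool {
        host, _, err := net.SplitHostPort(addr)
        if err != nil {
            return false
        }
        if host == "localhost" {
            return true
        }
        ip := net.ParseIP(host)
        return ip != nil && ip.IsLoopback()
    }

    func main() {
        for _, a := range []string{"localhost:37017", "127.0.0.1:37017", "[::1]:37017"} {
            fmt.Println(a, isLoopbackAddr(a)) // all three print true
        }
    }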
<mgz> pre-forward notice, bzr landings will be stopping today
<mgz> be aware if you have a branch currently up for review
<perrito666> jam: well we are trying to force it to go trough localhost, if the list already has 123.0.0.1 we should do no harm
<jam> perrito666: well, 127 is pretty specific (vs 123 :)
<perrito666> jam: oh
<jam> mgz: you could do something evil and commit a change to 'lp:juju-core' that just always breaks the test suite :)
 * perrito666 moves the irc to the screen with a reasonable resolution
<jam> well, actually, just change go-bot to run "&& false"
<jam> mgz: then the MP's get rejected
<jam> echo "no longer merging on launchpad
<mgz> jam, I think I just take the lock when ready
<jam> " && falsee
<jam> mgz: well, I mean someone submits, have them get the "not here" feedback
<jam> so leave tarmac running, just have it running and rejecting all requests.
<voidspace> natefinch: ok, if you can get those instructions I'll try it out
<natefinch> looking
<bodie_> morning all
<perrito666> bodie_: hi
<voidspace> natefinch: any luck?
<voidspace> natefinch: we could land it - if it doesn't fix the problem we'll be reverting the lot anyway
<alexisb> natefinch, ping
<wwitzel3> fwereade: https://codereview.appspot.com/103770044/
<frankban> mgz: is it ok to start working on the git project, or do we need to wait until the migration is done?
<alexisb> natefinch, cloudbase call
<mgz> frankban: you need to wait
<mgz> but please feel free to try out the test repo
<rogpeppe1> mgz: are changes to launchpad.net/juju-core frozen?
<mgz> you'll just need to patch any changes you want to keep onto the new one when it's out
<wwitzel3> mgz: did you pull the trigger on disabling lp?
<mgz> just about to
<perrito666> mgz: good luck
<mgz> test git landing disabled
<fwereade> wwitzel3, might not get to that for a bit today
<sinzui> jam: the job doesn't set GOROOT. The test is very bare in fact. It sets GOPATH just before it calls make. I'll add GOROOT. Or maybe this is the bytes issue with the trusty packaging problem
<wwitzel3> fwereade: sure, np :)
<fwereade> wwitzel3, encourage cmars to take a look now; wallyworld or I will surely take a look tonight/tomorrow though
 * cmars takes a look
 * wallyworld will look tomorrow
<natefinch> alexisb: crap, sorry, I thought that was Wednesday... got pulled away from my desk for a while
<wwitzel3> fwereade: I am moving it to GH anyway, so will ping people with the new review, thanks
<alexisb> natefinch, no worries
<alexisb> we are still on
<natefinch> alexisb: jumping on
<alexisb> hi  gsamfira and alexpilotti
<alexisb> kadams54, hi
<bodie_> looking for some guidance serializing this map[interface{}]interface{} -- I know that the values are either lists, maps, or other things that are OK.  if the values are maps, I know that the keys will be strings
<bodie_> however, bson is grumpy because it doesn't know that the keys will be strings, it sees interface{}s
<kadams54> alexisb: hi!
<bodie_> o/
<bodie_> I'm thinking the right answer is to recurse on a type switch and then serialize leaves to bson
<bodie_> but not totally sure and want to get some confirmation
<bodie_> also, there could potentially be maps or lists as values inside lists, I suppose, I mean it's just JSON so I don't know why it doesn't want to play nice
<bodie_> I deserialized into the datastructure using goyaml and I think that might be the issue
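A sketch of the recursive type-switch coercion bodie_ describes, rewriting goyaml's map[interface{}]interface{} into the map[string]interface{} shape that bson will marshal; the function name and error handling are illustrative:

    package coerce

    import "fmt"

    // Coerce walks a goyaml-decoded value, converting every
    // map[interface{}]interface{} into map[string]interface{} (recursing
    // into list elements too) so bson/json marshalers accept it.
    func Coerce(v interface{}) (interface{}, error) {
        switch v := v.(type) {
        case map[interface{}]interface{}:
            m := make(map[string]interface{}, len(v))
            for key, value := range v {
                s, ok := key.(string)
                if !ok {
                    return nil, fmt.Errorf("map key %v is not a string", key)
                }
                cv, err := Coerce(value)
                if err != nil {
                    return nil, err
                }
                m[s] = cv
            }
            return m, nil
        case []interface{}:
            for i, value := range v {
                cv, err := Coerce(value)
                if err != nil {
                    return nil, err
                }
                v[i] = cv
            }
            return v, nil
        default:
            return v, nil // leaf value: strings, numbers, bools pass through
        }
    }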
<voidspace> natefinch: perrito666: wwitzel3: I assume we're delaying standup until natefinch is free
<perrito666> voidspace: ok by me
<alexisb> hi trobert2
<natefinch> voidspace: free now
<voidspace> cool
<alexisb> trobert2, if you have specific questions regarding unit tests this would be a good place to ask them
<alexisb> I am sure that natefinch and team would be willing get you pointed in the right direction
<wwitzel3> how do I get bzr to produce a patch with bzr diff so I can apply it to my gh branch? .. bzr diff --old co:trunk isn't doing anything useful
<voidspace> wwitzel3: create a merge proposal on launchpad and download the diff :-)
<wwitzel3> voidspace: hah, good point
<natefinch> ug, stupid hangouts crashed, brb
<jam1> voidspace: wwitzel3: bzr diff -r ancestor:trunk
<jam1> or bzr diff -r submit: if you have that configured
<mgz> starting the bzr import
<jam1> cd ../trunk
<jam1> bzr merge --preview $MYBRANCH; is the contents of the launchpad MP
<mgz> wwitzel3: my email should have included a working command
<mgz> ..at least for the people who use my way, I forgot to mention that cobzr people need to work out how to address trunk themselves
<trobert2> ok. thank you very much guys :)
<wwitzel3> mgz: yeah, downloading the diff from lp worked out well enough :)
<wwitzel3> mgz: also, I hadn't seen your email yet. It would have also solved it for me.
<mgz> jenkins job cloned and updated
<alexisb> mgz, I will be a few minutes late to our 1x1, will ping you when I am back at the keyboard
<mgz> I'm so tempted to make the github description `juju is`
<mgz> alexisb: no problem
<mgz> first revs pushed, everyone please still wait before poking
<mgz> I'll send an all clear when I'm done
<wwitzel3> so I should stop my automated poking script?
 * wwitzel3 kicks the dirt
<mgz> :D
 * bodie_ pokes wwitzel3 
<perrito666> wwitzel3: if you need a hand with what you are doing poke me, I might be of help there
<mgz> import changes thus far, a little oddness, some files have been resurrected, including some testing charm bits: http://paste.ubuntu.com/7580793
<natefinch> mgz: nice work
<natefinch> mgz: odd that some things are getting resurrected.... do you think it's just a buggy import?  (I presume you're using some bzr to git importer)
<mgz> yeah, just some import oddness I presume
<mgz> seems limited from eyeballing that
<natefinch> we're gonna have to update the readme
<mgz> yeah, I have various modifications to land befre this is live
<natefinch> cool
 * natefinch will stop bugging the person actually getting the work done.
<trobert2> Hey guys, I have a question: anyone know how to expand a short path? ex: "C:\\Users\\ADMINI~1" to "C:\\Users\\Administrator\\"
<natefinch> trobert2: http://msdn.microsoft.com/en-us/library/windows/desktop/aa364980(v=vs.85).aspx
<natefinch> trobert2: which it looks like is in windows' syscall library
<natefinch> trobert2: do you know about running godoc locally so you can see the windows version of the syscall package?
<trobert2> no, I do not
<trobert2> I'll give syscall.toLong a try. Thanks for the response
<natefinch> trobert2: oh, wow, glad I thought to ask.  depending on how you installed Go, you should have an executable called godoc, which you can run to get a copy of the docs that exist at golang.org.  run godoc -http=:8080  and then you can open a browser to localhost:8080/pkg to see the docs for all packages on your system, including the standard library... which is the only way to see the windows version of the syscall package (
<natefinch> since it is OS specific)
<natefinch> trobert2: it should be syscall.GetLongPathName ... toLong is a private function that you won't be able to reference (note the lowercase first letter)
<natefinch> trobert2: but you can use toLong as a guide for how to call GetLongPathName
<natefinch> trobert2: it's probably good to just copy toLong entirely to your own package
<trobert2> natefinch: thank you. Having the docs locally will be quite helpful
<natefinch> trobert2: welcome.  Hugely important when doing windows work, since the windows syscall stuff isn't on golang.org  (I really wish they'd put the non-linux syscall docs online... I realize why they're not on the default doc site, because Google isn't running Windows or OSX, but still....)
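A sketch of expanding a short path with syscall.GetLongPathName, modelled on the private toLong helper natefinch points to; windows-only, using the usual call-twice pattern to size the buffer:

    // +build windows

    package main

    import (
        "fmt"
        "syscall"
    )

    // longPathName expands a short (8.3) path such as C:\Users\ADMINI~1
    // into its long form via the GetLongPathName syscall wrapper.
    func longPathName(short string) (string, error) {
        p, err := syscall.UTF16PtrFromString(short)
        if err != nil {
            return "", err
        }
        buf := make([]uint16, 260) // MAX_PATH to start with
        n, err := syscall.GetLongPathName(p, &buf[0], uint32(len(buf)))
        if err != nil {
            return "", err
        }
        if int(n) > len(buf) { // buffer too small: n is the required size
            buf = make([]uint16, n)
            n, err = syscall.GetLongPathName(p, &buf[0], uint32(len(buf)))
            if err != nil {
                return "", err
            }
        }
        return syscall.UTF16ToString(buf[:n]), nil
    }

    func main() {
        long, err := longPathName(`C:\Users\ADMINI~1`)
        fmt.Println(long, err)
    }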
<mgz> seriously?
<mgz> I appear to have screwd the repo already somehow
<natefinch> lol
<natefinch> just rebase
<natefinch> ;)
<perrito666> mgz: you are a fast kid
<natefinch> that's the git way, right?
<mgz> as in, I have a corrupt pack
<mgz> github trunk should be ready for forking
<mgz> doing test merge
<mgz> I'll send a note to the list when things are actually confirmed usable, but the adventurous can start experimenting now
<mgz> ...I've confused github about who's the bot and who is me
<sinzui> mgz :)
<sinzui> mgz, do you bleed?
<mgz> maybe I should have used a mailinator account
<sinzui> oh, hey, I don't think I closed the bugs, or...I forgot to be the juju-qa-bot
<mgz> everyone: can I have a review? github.com/juju/juju/pull/1
<mgz> sinzui: have you got an alt canonical email address I can use for the bot?
<mgz> I want to reclaim mine for my github account
<sinzui> hmm
<mgz> something like jujubot@ubuntu.com would be grand, but that means talking to IS...
<mgz> sod it
<sinzui> mgz, I don't, but the bots abentley created always have + in the user
<mgz> cunning
<sinzui> mgz https://launchpad.net/~juju-qa-bot
<sinzui> bugger, I did forget to be the bot. sorry jam
 * sinzui updated the release log to ensure the bot credentials are passed
<mgz> perrito666: can you review my branch plz?
<sinzui> mgz, will you be adding a 1.18 branch? I tagged the Lp branch a few hours ago, but postponed the 1.18.5 increment until github was ready
<mgz> nope
<mgz> I wasn't planning on it
<mgz> I left the bzr bits up for 1.18 for now
<sinzui> mgz, oh, that is right, 1.18 doesn't go to github. 1.20 will be the first stable in github
<mgz> keeping it like that would make backports more annoying, but importing multiple branches with fastimport is kinda borked
<sinzui> mgz, fast export to fast import? Does that preserve tags?
<perrito666> mgz: diff says much more than your merge proposal comment :p
<sinzui> I guess not
<perrito666> mgz: reviewed
<jam1> mgz: sinzui: worst case it generates a mapping and we should be able to find the tags, but I think it does at least try to tag revs
<jam1> mgz: if we don't have the release tags, can you make sure to get them put in manually?
<jam1> I feel it is important to keep them
<sinzui> +1. I was going to add them once mgz frees his identity from the bot
<mgz> jam, hm, urk, the instructions I was following stripped various things
<mgz> but retagging should be fine, right?
<jam1> mgz: stripping revs makes me pretty sad
<mgz> the revs should all be there, plus or minus weirdness, of which there was some
<jam1> mgz: as long as you can match the contents, we can do new tags
<mgz> but I didn't generate a marks file or anything
<jam1> I don't really want to have "the mainline commit that sort of looks something like 1.18.1"
<jam1> I really want us to have the "tree that was released as 1.18.1"
<mgz> jam1: I don't see a way around it for old minor versions,
<jam1> mgz: create them
<sinzui> perrito666, do you have a minute to review https://codereview.appspot.com/93710043
<jam1> mgz: these are actually precious artifacts, we *can* go back to the bzr codebase, but that gets pretty clumsy
<jam1> mgz: but really, why would we chose a lossy conversion?
<sinzui> jam1, mgz, we don't have stable tags in the lp's trunk aka git's master. If we want the stable tags, we need to recreate the stable branches
 * jam1 goes off for a while
 * perrito666 is not sure if he should lgtm sinzui's version number changes after seeing how they break builds :p
<perrito666> sinzui: done
<sinzui> perrito666, good point. 1.18 doesn't have a "var switchOverVersion = MustParse("1.19.9")"
<sinzui> perrito666, ^ that number was set to 1.19.3 in devel, telling juju to reject 1.19.4 as a devel release
<perrito666> yeah, I thought that as long as we dont get near 1.19 its all fine :)
<perrito666> mgz: jujubot has your face as avatar that is mildly disturbing
<mgz> perrito666: it is.
<mgz> I am trying to fix that by getting a new account from IS
<perrito666> there is a certain Dr Who sense in the fact that our bot is a british man with a canonical logo for a face
<natefinch> haha
<sinzui> yeah, the flat colour looks like the logo from the 4th Doctor era
<natefinch> github.com/juju/juju builds for me after running godeps -u dependencies.tsv
<natefinch> nice work everyone
<mgz> evil time coming
<mgz> ...actually, I won't be evil.
<natefinch> anyone opposed to making the readme markdown?
<perrito666> natefinch: not at all
<voidspace> wow, 20:54 minutes to scp a juju tarball to the CI machine
<mgz> natefinch: see the pending merge proposal
<natefinch> mgz: you mean the pull request? ;)
<mgz> watevar
<natefinch> :D
<perrito666> voidspace: welcome to southamerica (?)
<voidspace> perrito666: hehe
<voidspace> perrito666: or rural England
<voidspace> natefinch: sinzui: so logged into CI machine as jenkins and running the juju that CI did, a local bootstrap fails
<voidspace> natefinch: sinzui: a local build that I scp'd up (as ubuntu user) succeeds
<sinzui> hmm
<voidspace> natefinch: sinzui: that one refreshes the session before asking for status
<voidspace> sinzui: that's with changes
<natefinch> oooh
<sinzui> voidspace, very interesting
<voidspace> sinzui: natefinch: that Refresh is to avoid the "Closed explicitly" problem
<natefinch> I hope that means it's fixed, not just that where it's built matters
<voidspace> natefinch: well, yes
<voidspace> sinzui: natefinch: so current hypothesis is that this "fix" is worth landing and trying
<sinzui> voidspace, We know that jenkins strips the env of personal vars that we often assume to be there
<sinzui> voidspace, oh, I misunderstood, please land.
<voidspace> sinzui: natefinch: if it doesn't *actually* fix the problem then we will revert the changes that caused this and workaround it some other way to achieve what we want
<voidspace> so one way or another it will be the end of it
<voidspace> (well, of this particular problem)
<sinzui> voidspace, We explicitly define USER=jenkins in the local tests because that var was stripped by jenkins. if local tests, juju, mongo need a missing var, I need to find it and put it back
<voidspace> right
<voidspace> ok, so now all I need to do is work out how to land a branch in the new wild and wonderful world of git
<natefinch> haha
<voidspace> which can mean only one thing
<voidspace> it's time for more coffee
<natefinch> voidspace: http://blog.natefinch.com/2014/03/go-and-github.html
<natefinch> (and anyone else)
<perrito666> natefinch: thank you
<jcw4> natefinch: great tip, thanks!
<perrito666> although we should all use the doc on the repo, so we have better chances of finding errors
<sinzui> Anyone have an idea why the ppc64el (gccgo) doesn't have anything like "go/src/pkg" installed. It may have been, but was removed? The unit tests started failing yesterday because the bytes package doesn't exist
<natefinch> sinzui: no idea.... obviously, it should be there
<mgz> gah, I'm an idiot, I should have been evil
<natefinch> hahaha
<natefinch> is it too late?
<mgz> well, I can half-way it
<sinzui> rick_h_, jcsackett I am still getting jcsackett's vacation requests
<jcsackett> sinzui: ack, thanks. rick_h_ is on vacation, i'll see about sorting that. perhaps an RT to the directory.
<jcsackett> it lists me as reporting to you still as well.
<sinzui> stupid monkeys
<mgz> sinzui: lp:~gz/juju-release-tools/fix_juju_github_path
 * sinzui pulls
<perrito666> mm, bzr diff between two branches using --old and --new yields... a lot
<sinzui> mgz, merged
<mgz> pulled on the bot, thanks
<sinzui> mgz, can I install it on the master and slave now? I cannot think of a reason why not
<mgz> hopefully that's the last of the fallout from the name change
<mgz> sinzui: please do, I did master as I need that for a clean run
<sinzui> We still need to tell CI to poll the git branch for changes
<mgz> I do that
<mgz> it's in the ubuntu user's crontab, which I'm not sure is how you want it
<mgz> I also need to write all the pokings I did on the jenkins machine in your doc
<mgz> or you mean for the other jobs?
<sinzui> mgz, jenkins has a crontab that lists the branches to check. abentley or I will change it shortly. I am kicking the stable 1.18.5 test run to complete
<abentley> sinzui: I was off IRC for a bit.  Missing context.
<sinzui> mgz, The publish step is too brittle now that it builds a lot and publishes even more. Lots of network failure opportunities
<alexisb> jam, fwereade: either of you around?
<sinzui> abentley, juju core is on github. We want to replace lp:juju-core with "gitbranch:master:github.com/juju/juju" I think
 * sinzui typed that from memory
<abentley> sinzui: Yes, I agree.
<abentley> sinzui: I want to run it by hand first, but we won't see the full effect until these tests are done: publish-revision, run-unit-tests-precise-amd64, run-unit-tests-trusty-amd64, walk-unit-tests-amd64-trusty
<mgz> okay, all done but the waiting
<mgz> actually, checklist
<sinzui> abentley, +1
<sinzui> abentley, mgz, it would be nice to pick a known good revision, but trunk has been broken for 9 days
<natefinch> switching version control providers while the tests were passing would have been too boring
<sinzui> :)
<sinzui> natefinch, the brittle publish step always adds a risk to every test
<natefinch> that just makes it more exciting
<sinzui> abentley, do you have a few minutes to discuss a revised process to run unit tests, build, and publish
<abentley> sinzui: sure thing.
<sinzui> abentley, https://plus.google.com/hangouts/_/calendar/Y3VydGlzQGNhbm9uaWNhbC5jb20.lmdv7d975raqphi9pf38tm7juo?authuser=1
<abentley> sinzui: I am in the hangout, but I don't see you.
<sinzui> oh so am i
<perrito666> mgz: natefinch do any of you know where "check" is?
<rogpeppe> perrito666: "check" ?
<perrito666> rogpeppe: in the going-to-git docs there is a reference to
<perrito666> ln -s ../../check .git/hooks/pre-push
<mgz> perrito666: it got moved, the new contributing has the updated location
<perrito666> mgz: did that got merged?
<rogpeppe> perrito666: it's probably renamed from lbox.check
<mgz> perrito666: it's being chewed on
 * perrito666 does coffee to ease the wait
<mgz> we have issues with tests with -p > 1
<voidspace> mgz: I have a branch with a 1 line change in it
<voidspace> mgz: bzr diff -rancestor:co:trunk > ~/initiate-refresh.patch
<voidspace> mgz: produces a 27k patch file
<voidspace> mgz: my bazaar branch has trunk merged into it, so it's up to date
<mgz> odd
<perrito666> voidspace: bzr diff between your commit and the previous one
<voidspace> that's what I thought :-)
<perrito666> I had to do that
<voidspace> perrito666: ok
<perrito666> bzr diff between my branch and trunk produced a very verbose thing
<mgz> I might have the spelling wrong, what did jam say earlier...
<mgz> (worked for the branches I tested though...)
<mgz> hm, jam said the same
<perrito666> mgz: the same?
<mgz> merge -r ancestor:trunk
<voidspace> picking the specific revision worked fine
<voidspace> sooo, I created a branch added-committed-pushed (origin master)
<voidspace> I can't see my branch on github though
<voidspace> https://github.com/voidspace/juju/branches
<voidspace> hmmm... although: https://github.com/voidspace/juju/tree/initiate-refresh
<voidspace> but "This branch is 0 commits ahead and 0 commits behind master"
<Egoist> Hello
<Egoist> I have two instances and want to connect them. The problem is that the first instance wants to connect to the second, but the second has not finished configuring, so it refused the connection, and because of that I get an error in the *-relation-changed hook
<mgz> sinzui: can you keep an eye on github-merge-juju? we're still not having much joy with these tests
<Egoist> does someone know how to handle this?
<natefinch> marcoceppi: ^^
<sinzui> mgz I wish
<sinzui> mgz, I will
<voidspace> ah, I pushed to the wrong branch it seems
<marcoceppi> Egoist: is it configuring in the background?
<marcoceppi> Egoist: how is it not finished configuring?
<voidspace> natefinch: https://github.com/juju/juju/pull/3
<natefinch> voidspace: looking
<natefinch> haha contributing still mentions cobzr
<voidspace> yeah, I just saw that
<perrito666> natefinch: https://github.com/bz2/core/blob/contributing_updates/CONTRIBUTING
<perrito666> slightly newer
<voidspace> the repo names have been updated though
<perrito666> so, juju/juju or juju/core? I am not sure what we settled on
<natefinch> juju/juju is where stuff it
<marcoceppi> \o.
<natefinch> s/it/is/
<marcoceppi> \o/*
<marcoceppi> juju/juju makes much more sense, think of those who clone something, suddenly what is marcoceppi/core ?
<natefinch> exactly what I said
<natefinch> the other option was juju/juju-core .... but the argument there is that we only called it juju-core before because on launchpad, juju was taken by pyjuju
<marcoceppi> natefinch: we're going to be moving juju/docs to juju/juju-docs soon as well
<natefinch> marcoceppi: that's good.  It's good to have the repo names able to stand on their own
<bodie_> YES!  got my funky nested map[interface{}]'s coerced properly
<bodie_> if this doesn't fix the bson marshaling issue around that, i'll be a monkey's uncle
<natefinch> mgz, sinzui: where's the new contributing doc for how we work this thing?  Is there a formal "Approve this PR"? Or is it more informal?
<natefinch> bodie_: awesome
<natefinch> oh, I missed perrito666
<natefinch> 's link
<sinzui> natefinch, sorry. I don't know that yet
<natefinch> Sounds like informal, which is fine.  It was effectively the same thing before.
<perrito666> natefinch: that has been approved for landing
<voidspace> natefinch:
<voidspace> After a proposal has received an LGTM, the landing must be notified to test and merge
<voidspace> the code into master. This is done by a member of the juju project adding the magic
<voidspace> string $$merge$$ in a comment.
<perrito666> but mgz-faced bot is not being able to merge it
<voidspace> natefinch: cool, I have to EOD
<voidspace> natefinch: hopefully this lands and hopefully it fixes the build :-)
<natefinch> voidspace: np.  Good work on that.
<voidspace> if not, tomorrow is the big revert day :-(
<voidspace> heh
<voidspace> we'll see
<voidspace> g'night
<natefinch> I'm optimistic
<natefinch> see ya
<perrito666> mm, reverting something merged with bzr in git, that should be somehow fun
<natefinch> I think the commits are all there, so it's just a matter of pretending they were always in git.
<perrito666> natefinch: nice
<rogpeppe> g'night all
<perrito666> rogpeppe: o/
<bodie_> I'm getting thrown off by state/conn_test.go line 74 AddTestingCharm
<bodie_> it's in the state package, yet it calls state.AddTestingCharm with additional arguments
<bodie_> the only other AddTestingCharm definition I see is in juju/testing/conn.go
<bodie_> and its arity doesn't match, either
<bodie_> does anyone know where to track that down?  I must not be grepping just right
<bodie_> er, it's in the state_test package
<natefinch> that's the thing
<natefinch> state_test is a different package
<natefinch> so in order to call AddTestingCharm, it has to qualify it by the package name
<bodie_> here we go
<bodie_> found it
<bodie_> yeah, I think I got thrown off since I somehow forgot I was in the testing package :)
<bodie_> s/testing/state_test
<natefinch> yeah, it can be a little confusing, especially since that's the only time you can be in the same directory, but a different package
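A minimal illustration of that same-directory/different-package rule, with invented names; the _test file compiles as a separate package, so exported identifiers must be imported and package-qualified:

    // foo/foo.go
    package foo

    func Answer() int { return 42 }

    // foo/foo_test.go -- lives in foo/ but builds as its own package
    package foo_test

    import (
        "testing"

        "example.com/foo" // illustrative import path
    )

    func TestAnswer(t *testing.T) {
        if foo.Answer() != 42 {
            t.Fatal("wrong answer")
        }
    }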
<natefinch> man, I wish env would sort alphabetically
<natefinch> mgz: why do we tell people to do go get -v?
<perrito666> ouch, I forgot a dentist appointment
<natefinch> oops
 * perrito666 runs
<perrito666> Ill ba back later most likely
<perrito666> I also got my git juju working, although I am not sure what is the new name for check
<natefinch> perrito666: I'll see if I can figure it out
<perrito666> well, contributing/readme patch just landed
<perrito666> so perhaps docs are fixed
<natefinch> they look much improved
 * perrito666 maps the dentist office and notices it is perhaps the hardest place to get in driving in the universe...
<natefinch> doh
<perrito666> well I better run then, let me know if you find out what the name of check is :)
<perrito666> cheers guys
<natefinch> such a beautiful URL  https://canonicalhr--fhcm2.eu2.visual.force.com/apex/fHCM2__FairsailProfile
<mgz> woho! landed.
<mgz> I'll send email a bit later, but consider the git repo live
<natefinch> yay!
<sinzui> mgz, Did you add a rule to try -p 1 when the first effort fails?
<mgz> I made it just use -p1 twice in the end, it's the only one that seems to reliably work
<sinzui> and you select xlarge too
<sinzui> this is very worrying
<mgz> I think we just use a smaller machine for now as well, it's not like we're getting benefit from extra cores...
<sinzui> mgz, I am building slaves next week. think about the resources you need. I may be able to provide a dedicated slave for gobot
<sinzui> mgz, those tests do make a difference to the test suite
<sinzui> the cores are used
<mgz> sinzui: ace, thanks
<mgz> sinzui: they would be much better for -p > 1...
<abentley> sinzui: I have tried building the git branch, but so far, breakage: http://juju-ci.vapour.ws:8080/job/build-revision/1436/console
<natefinch> what's wrong with git clone https://github.com/juju/juju  ?
<natefinch> abentley, sinzui: ^^
<abentley> natefinch: That is too vague.
<jcw4> natefinch: works for me?
<natefinch> abentley: eh?
<abentley> natefinch: We want a specific tree, not whatever happens to be the tip revision.
<natefinch> abentley: that'll get the master branch
<natefinch> or rather, whatever is defined as the default branch, which is currently master
<abentley> natefinch: Right.
<jcw4> natefinch: n/m misunderstood your comment :-)
<natefinch> which is what gitbranch:master seems to indicate you want?
<abentley> natefinch: We don't want something that always grabs master.  We want something that grabs what we tell it to grab.
<natefinch> abentley: ok, sorry, I was just looking at what that code seemed to be doing.  Stack Overflow says to do this:
<natefinch> git init
<natefinch> git remote add -t $BRANCH -f origin https://github.com/juju/juju
<natefinch> git checkout $BRANCH
<natefinch> the wacky stuff in the middle is to prevent git from getting the whole repo.... which I presume is what we want. Otherwise you can just do
<natefinch> git clone https://github.com/juju/juju && git checkout $BRANCH
<natefinch> (clone gets all branches in the repo)
<natefinch> this also seems to work:   git clone -b $branch https://github.com/juju/juju
<abentley> natefinch: Good to know how to avoid retrieving other junk.  But we're fine with doing a clone for now.
<bodie_> I'm not totally following this bit -- $ ln -s ../../scripts/pre-push.bash .git/hooks/pre-push
<bodie_> where's that assuming your working directory is?
<natefinch> relative directories are never a good idea in instructions
<natefinch> bodie_: I have no idea where it expects you'll be when you run that command.
<bodie_> also, I'm assuming none of our bzr branches and such have been preserved, so I need to push --force to a forked branch in my user account, right?
<bodie_> if an email went out with all this info, I don't think I saw it -- did I miss that?
<natefinch> bodie_: the email is still in the works
<natefinch> bodie_: yes, none of the bzr branches will exist. You'll need to move your code over... I'm not sure exactly how to do that, honestly.
<bodie_> I think forking it and pushing to our personal forks will suffice for now, but merging might get finicky, unless I'm not thinking through this clearly
<bodie_> the way we worked at DigitalOcean was to push to user branches in the core repo, then open PR's against the branches rather than against user repos
<abentley> sinzui: looks like the packaging is hardcoded for launchpad.net: http://juju-ci.vapour.ws:8080/job/publish-revision/454/console
<sinzui> abentley, natefinch : the make-release-tarball script does something similar
<abentley> sinzui: Yes, that's what I think we were discussing.
<sinzui> abentley, natefinch : git clone $JUJU_CORE_REPO $WORK/src/$PACKAGE
<sinzui>     if git ls-remote ./  | grep origin/$REVISION; then
<sinzui>         git checkout origin/$REVISION
<sinzui>     else
<sinzui>         git checkout $REVISION
<sinzui>     fi
<natefinch> bodie_: any experience you can share with git/github would be much appreciated.  We're all pretty new to it (I have used github for personal projects quite a bit, but not for anything big)
<bodie_> I'll see what I can dig out of my head -- personal forking and PR's are probably not a bad way to do it really
<bodie_> there are a lot of ways to do it
<bodie_> since we were on Github Enterprise w/ a private repo I think it was different since all users were easily able to have push access (although we didn't have push to master)
<sinzui> abentley, I think a step got skipped, The script might be getting the wrong args
<sinzui> abentley, https://code.launchpad.net/~sinzui/juju-release-tools/make-from-git/+merge/221307
<sinzui> make-release-tarball.bash 0ededa1f3e7cd8a50f1c94f6abbb3355735069a6 https://github.com/juju/juju.git would be the expected args
<sinzui> abentley, I will look into this when I get back from school
<abentley> sinzui: I've fixed that issue already.  See http://juju-ci.vapour.ws:8080/job/build-revision/1440/console
<bodie_> natefinch, if there's anything specific in question or that I can help with, feel free to grab my attention any time -- I'm probably not qualified to make positive claims about best practices though :)
<natefinch> bodie_: haha, no problem.  I'm sure there will be things you know that we don't, and things none of us know. It'll be fun.
<bodie_> I'm definitely a huge fan of it.  there are a couple of fundamental schools of thought.  I lean towards the "branch thoughtfully and prodigiously" school
<bodie_> I like to have a master, a beta, a testing, and a dev branch, then user branches, and then grant push access on user branches; merge finished features to dev; merge finished dev sprints to testing; merge greenlit code to beta; merge major completed features to master
<bodie_> then you don't have to worry about what's going into master
<bodie_> but, most people just like to PR against master, I'm just kinda more process oriented
<bodie_> if you're not comfortable with git the branching, rebasing, merging stuff can be kinda high drag / breakage prone
<bodie_> so, I think most people don't like the branch heavy approach
<natefinch> bodie_: the nice thing about forks is that you don't have to grant anyone access, they can just do it.  But yeah, might be the difference between enterprise and github proper
<bodie_> right, but the question is kind of whether you want people opening PRs against master or, what branch
<natefinch> bodie_: yeah.... my feeling is that eventually the code makes it into master anyway, and as long as tests and CI pass, go for it
<menn0> waigani: ~\0/~
<natefinch> I'm trying to decide if that's a guy shrugging, or a guy with really big shoulder pads.... probably neither
<menn0> it's not great is it?
<menn0> kinda looks like superman flying, head on
<bodie_> natefinch, lol
<menn0> officially it's: waving hello/goodbye
 * natefinch is terrible at these things
<bodie_> it's doing the sine dance
<bodie_> :P
<menn0> nice ;-]
<mgz> I have made jambalaya and am watching irc again if people have migration issues
<menn0> mgz: so far so good for me
<menn0> is anyone having trouble with their canonical gmail? I've just been logged out and it's asking me for a password instead of 2FA...
<menn0> nevermind
<mgz> I always get asked, just leave blank and it forwards
<alexisb> mgz, dude! what time is it for you?
<mgz> 21:38
<alexisb> mgz is getting sleepy ... ;)
<menn0> mgz: I tried that first but I got "wrong username/password"
<natefinch> omg, I want jambalaya
<menn0> mgz: entering the token from my phone worked (even though it was asking for a password not a token, weird)
<alexisb> alrighty all, I am off to town for some outreach work and won't be online again until tomorrow morning, if you need me urgently feel free to call my cell
<natefinch> alexisb: have fun!
<thumper> waigani, menn0: morning, care for a hangout to talk through the user display name work?
<menn0> thumper: sure, give me 1 min
<thumper> menn0, waigani: https://plus.google.com/hangouts/_/g5vgmsdpkfpsgdxsxsw3dbvfpqa?hl=en
<sinzui> abentley, wallyworld, CI is broken because the source package branch (also under test) looks for launchpad.net/juju-core. the winbuildtest.py script does something similar
<sinzui> I will fix them though I am a little tired. I won't rush
<abentley> sinzui: Yes, I mentioned this two hours ago: " sinzui: looks like the packaging is hardcoded for launchpad.net: http://juju-ci.vapour.ws:8080/job/publish-revision/454/console"
<sinzui> abentley, sorry I have been in meetings. I know how to do the fix.
<abentley> sinzui: On a happier note, I have preliminary numbers for published revisions: https://chinstrap.canonical.com/~abentley/published-revisions.png
<sinzui> lovely.
<sinzui> thank you abentley
<abentley> sinzui: win-client-build-installer also seems to depend on launchpad.net in the path.
<sinzui> I am fixing that right now
<bodie_> hmmmm.... I just noticed that I'm having dependency issues because all my deps are from juju/juju, but I've changed a bunch of files in my personal repo.
<bodie_> so, I could go into my code and edit it to use my own deps
<bodie_> but then if I merge that, Juju would be using my repositories as deps
<bodie_> is there an established way to deal with this?
<bodie_> it seems like this is a good argument against using user forks, but there is probably a way
<bodie_> maybe we could have a script that symlinks github/juju/juju to the user's workspace
<jcw4> <natefinch> voidspace: http://blog.natefinch.com/2014/03/go-and-github.html
<jcw4> <natefinch> (and anyone else)
<jcw4> bodie_: ^^^
<bodie_> I don't think that answers the question, but I'll look again
<bodie_> okay, so step 4 overwrites the local copy of the repository in-place
<bodie_> that makes sense :)
<jcw4> right, so you're using your repo, but at the gopath location for juju/juju
<bodie_> I don't like that but it works
<jcw4> there's a couple variations in the comments, but nate's original suggestion makes the most sense to me
<bodie_> I guess the only alternative would be to symlink the repo to the user's repo
<bodie_> or, the only one that comes to mind for m
<bodie_> e
<jcw4> A close contender for me was the suggestion in the comments to clone gh:juju/juju in the expected place and then add your personal repo as an upstream repo and then branch from there
<jcw4> but I liked the simpler approach of just cloning your personal repo in the gh:juju/juju gopath location
<jcw4> (and adding gh:juju/juju as an upstream repo)
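A minimal sketch of the setup jcw4 describes, assuming a hypothetical GitHub user USER:

    # clone your personal fork into the GOPATH location Go expects for juju/juju
    mkdir -p $GOPATH/src/github.com/juju
    cd $GOPATH/src/github.com/juju
    git clone git@github.com:USER/juju.git juju
    cd juju
    # add the shared repository as a second remote for pulling upstream changes
    git remote add upstream https://github.com/juju/juju.git
    git fetch upstream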
<perrito666> hello again everyone
<bodie_> if anyone gets a chance (jcw4?) I could use a glance at https://github.com/binary132/juju branch charm-interface-actions
<bodie_> not quite sure what's going on with my tests
<bodie_> http://paste.ubuntu.com/7583269/
<jcw4> bodie_: looks like you're getting a nil ActionSpec instead of an empty one...
<bodie_> hm
<jcw4> (technically its the map[string]charm.ActionSpec that's nil)
<perrito666> bodie_: where exactly is that test?
<jcw4> perrito666: state/state_test.go
<jcw4> perrito666: in bodie_ 's branch
<bodie_> https://github.com/binary132/juju/blob/charm-interface-actions/state/state_test.go#L2032
<thumper> cmars: sorry for being late, can I put you off a day?
<jcw4> bodie_: I think just checking for nil unmarshaledActions.ActionSpecs and assigning an empty map if it's nil will get you past that error...
<jcw4> bodie_: I assume we want to always return an empty ActionSpecs if the Charm doesn't declare one?
<jcw4> bodie_: hmm, no - even with that check it's still getting the error.  Maybe the map isn't getting serialized to mongo, and then when it gets read back out it's nil?
<bodie_> that's possible, yes
<jcw4> bodie_: my hypothesis is that AddTestingCharm returns a Charm with an empty Actions, but when that same charm is read from mongo the Actions is nil
<jcw4> bodie_: whether it's the Actions() or the Actions().ActionSpecs that is nil, I'm not sure... I think the latter actually
<bodie_> hmm
<jcw4> bodie_: no, AddTestingCharm returns the Charm *from* mongo after adding it
<jcw4> bodie_: so you're getting a Charm with an empty charm.Actions().ActionSpecs when you initially add it, but then when you retrieve it the second time the charm2.Actions().ActionSpecs is nil
<perrito666> aghh, my juju computer lacks all my github config
<jcw4> bodie_: ahhh... no.  In your Assert on line 2032, you're putting the expected charm in the obtained spot
<bodie_> am i?  *looks*
<jcw4> bodie_: so the error message is backwards...
<jcw4> bodie_: the Assert call expects the obtained result first, and the expected result later in the params list
<jcw4> bodie_: so.. AddTestingCharm is returning a charm where Actions().ActionSpecs is nil
<bodie_> I'm not certain about that, based on the previous check in that file
<jcw4> bodie_: however the subsequent call to state.Charm() is returning a charm where Actions().ActionSpecs is empty
<jcw4> bodie_: that's what the source for Assert says... gocheck/helpers.go:165
<bodie_> okay, well that must be where we're seeing the problem
<bodie_> wordpress is *supposed* to have an empty Actions
<bodie_> I was thinking we should return Actions{} rather than Actions{ActionSpecs: Charm.ActionSpec{}}
<bodie_> okay, I think I understand, thank you
<bodie_> they're both using newCharm(st, cdoc)
<bodie_> I suspect that since the Actions struct is empty it's simply omitting it from the document entirely
<bodie_> whereas after the Insert, it's still just an empty Actions{}
<jcw4> bodie_: I bet if we add a `yaml:",omitempty"` tag to ActionSpecs it'll work as expected
<jcw4> bodie_: testing....
<bodie_> or maybe a bson
<jcw4> bodie_: yeah, trying bson but that's just guessing on my part now
<bodie_> I think perhaps omitempty is the opposite of what we want?
<bodie_> or...
<jcw4> bodie_: do we want nil or empty?
<bodie_> well if we use omitempty I think it will omit the field from the doc
<bodie_> therefore, when deserialized, will come back nil
<bodie_> right?
<jcw4> bodie_: hmm bson:",omitempty" made the test pass...
<jcw4> bodie_: right
<bodie_> okay, cool
<bodie_> I'm still not fully getting the annotations
<bodie_> what did you do with that?
<bodie_> e.g. ActionSpecs map[string]ActionSpec `yaml:"actions" bson:",omitempty"`
<jcw4> bodie_: http://paste.ubuntu.com/7583486/
<jcw4> bodie_: yeah, pretty close
<bodie_> ah, I see
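A sketch of the tag fix being discussed, paraphrasing rather than quoting the real juju types (field names here are hypothetical); with bson:",omitempty" an empty map is left out of the mongo document, which is what made the test pass:

    // hypothetical paraphrase of the charm.Actions type under discussion,
    // not the exact juju source
    type ActionSpec struct {
            Description string
            Params      map[string]interface{}
    }

    type Actions struct {
            ActionSpecs map[string]ActionSpec `yaml:"actions" bson:",omitempty"`
    }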
<waigani> menn0: I think you just merged that branch by saying "$$merge$$"
<waigani> menn0: maybe you just need to retry a few times
<bodie_> now I'm getting this: http://paste.ubuntu.com/7583514/
<thumper> wallyworld: ping?
<jcw4> bodie_: hmm, do you have any orphaned mongo processes or some other environmental issues?
<wallyworld> thumper: hiya
<bodie_> I do have my own mongod running in the background -- I checked for that though.  maybe I'll have a sweep through /tmp
<wallyworld> coming
<bodie_> welp, I'm gonna have to set this down for tonight, hopefully a reboot will clear away the stormclouds ;)
<jcw4> bodie_: o/
<waigani> menn0: ping?
<menn0> waigani: pong
<waigani> menn0: ah, never mind. Dave turned up for standup.
 * perrito666 makes a pr with a description 10 times longer than the actual patch
#juju-dev 2014-06-04
<davechen1y> poop, http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/439/console
<davechen1y> 504
<davechen1y> no
<davechen1y> 404
<davechen1y> wtf
<abentley> davechen1y: Not sure.  But I think our last build for that is 458.
<davechen1y> i know what the issue is anyway
<davechen1y> will fix
<waigani> wallyworld: so who should comment $$merge$$ ?
<abentley> davechen1y: So your wtf is why don't we retain all the builds?  Disk space.
<wallyworld> waigani: the author of the pull request, after getting a lgtm, just like in launchpad where the author of the mp sets approved
<davechen1y> abentley: ok, fair enough
<davechen1y> welcome to the cloud
<waigani> wallyworld: thought so, just checking. I saw an LGTM and a $$merge$$ by the reviewer
<wallyworld> waigani: np, sometimes the reviewer can do that if the author has gone to bed or whatever and they want to get it landed
<waigani> sure
<wallyworld> i secretly wish we were still on lp :-(
<davechen1y> wallyworld: i don't think it is a secret
<wallyworld> lol
<waigani> hehe
<wallyworld> you got me there
<abentley> Certainly not now : -)
<wallyworld> i just detest git so much compared to bzr
<waigani> wallyworld: do we go here to see everything that needs to be reviewed: https://github.com/juju/juju/pulls
<wallyworld> i think so yeah
<waigani> just asking all the really obvious questions to get them out of the way :)
<wallyworld> i'm still ramping up on github myself
<davecheney> % echo $PWD
<davecheney> /home/dfc/src/github.com/juju/juju/juju/osenv
<davecheney> i'd like to present this for the justification _not_ to call the main repo 'juju'
<perrito666> wallyworld: I am surprised that you did not curse even once in the mail
<davecheney> but it's too late for that now
<perrito666> davecheney: there were very good arguments against
<wallyworld> perrito666: i can curse here on irc :-) had to be polite in the email :-)
<perrito666> such as having clones like github.com/perrito666/core
<davecheney> perrito666: meh, what's done is done
<perrito666> who would want to clone my core? :p
<davecheney> perrito666: github.com/davecheney/testing
<davecheney> not that descriptive either
<wallyworld> davecheney: that issue was mentioned and people did suggest juju-core but the majority wanted juju/juju
<perrito666> davecheney: do you have a production davecheney too? :p
<wallyworld> fwiw i wanted juju-core
<davecheney> func (*importSuite) TestTemporaryDependencies(c *gc.C) {
<davecheney>         c.Assert(coretesting.FindJujuCoreImports(c, "github.com/juju/juju/juju/osenv"),
<davecheney>                 gc.DeepEquals, []string{"utils"})
<davecheney> }
<perrito666> we should have discussed this in person, over beer
<davecheney> what the shit is this supposed to be testing ?!?
<wallyworld> that only the utils package is imported
<davecheney> why ?
<wallyworld> thumper wrote that stuff to ensure we were only importing sensible dependencies
<wallyworld> to pick up layering violations
<wallyworld> because we had several
<waigani> why not change the team name?
<davecheney> wallyworld: ok, this will need some fixing for gccgo
<davecheney> you'll only hit that bug when /usr/bin/go is compiled by gccgo
<wallyworld> waigani: i also asked that question
<wallyworld> davecheney: why will that happen?
<davecheney> wallyworld: under gccgo, the way we package it, the path from the compiled dependency to its source is not ../src
<davecheney> as sinzui discovered
<wallyworld> ah ok
<wallyworld> thumper will have to fix it then :-)
 * davecheney is having a look
<davecheney> thumper: is busy growing his flock
<wallyworld> what does "growing his flock" mean?
<lifeless> do you really want to know?
<davecheney> crap, this test is going to be a bit tricky to fix
 * davecheney goes to ponder it
<sinzui> wallyworld, I declare CI to be fit for juju testing from git. Two tests failed. voidspace thinks he has a fix for the local-deploy-precise-amd64 failure
<wallyworld> great :-)
<sinzui> maybe davecheney  can explain the broken ppc64el unit tests
<wallyworld> sinzui: do you have a pointer to failed tests saving me from looking?
<sinzui> wallyworld, https://bugs.launchpad.net/juju-core/+bug/1325707
<wallyworld> oh goody, a bug
<sinzui>  I looked at the packages wondering if one got removed that provided src/pkg/bytes and all the other missing packages
<wallyworld> looks like a relatively simple fix hopefully
<sinzui> a simple fix for me is install a missing package
<wallyworld> lifeless: well, i was scared about asking, for sure :-)
<davecheney> sinzui: i know the problem
<davecheney> it's a bug with the test helper
<davecheney> the short, and not completely accurate, explanation is: it's gc only
<sinzui> excellent, gccgo manages its deps differently
<davecheney> sinzui: no
<davecheney> sinzui: in short, i'm fixing it
<davecheney> i'm not going to explain it twice
<davecheney> the commit message will be large enough
<perrito666> ok fine people, EOD, good night everyone
<davecheney> https://github.com/juju/testing/pull/10
<davecheney> ^ part 1/2 of the testing failure
<axw> wallyworld: hey. any githubby/landery things you'd like me to look at today?
<wallyworld> axw: hi there, quick hangout?
<axw> sure
<wallyworld> i'll set one up
<wallyworld> https://plus.google.com/hangouts/_/g3h6myvrnbm3pwvzj3dibguaaqa?hl=en
<axw> wallyworld: "the party is over"
<wallyworld> sigh
<axw> :~(
<wallyworld> i sent an invite also
<axw> wallyworld davecheney: could you please take a look at https://codereview.appspot.com/101980044/ when you have some time
<wallyworld> sure
<davecheney> axw: sure
 * thumper isn't growing his flock
 * thumper is getting punched in the head
<waigani> mgz: bzr diff -rancestor:co:trunk. I get bzr: ERROR: Not a branch
<thumper> waigani: you aren't using co-located branches
<thumper> waigani: I'll work it out and let you know if wallyworld doesn't first
<waigani> thumper: sounds good to me :)
<wallyworld> thumper: if you could do it that would be great
<wallyworld> as i have co located branches
<thumper> waigani: I need to do it for mine anyway
<thumper> lifeless: o/
<waigani> cool
<lifeless> thumper: o/
<thumper> wallyworld: are there instructions anywhere on how people should set up the git remotes?
<thumper> wallyworld: personally I've added 'me' as a remote
<thumper> and go 'git push me branch-name'
<thumper> as origin is the juju team branch
<thumper> wallyworld: I was talking with fwereade earlier today, and I think I have a way for us to have a test object factory like the old launchpad one...
<thumper> wallyworld: at least one that works well enough
<wallyworld> thumper: i did git remote add upstream https://github.com/juju/juju.git
<wallyworld> for upstream
<wallyworld> but i guess i should add 'me' too
<thumper> wallyworld: and changed origin to be you?
<wallyworld> not yet but i should
<thumper> wallyworld: so what do you get if you go 'git remote -v'?
<thumper> to me 'origin' is 'upstream'
<thumper> but I guess each person has their mental model
<wallyworld> [master]ian@wallyworld:~/juju/go/src/github.com/juju/juju$ git remote -v
<wallyworld> origin  https://github.com/wallyworld/juju (fetch)
<wallyworld> origin  https://github.com/wallyworld/juju (push)
<wallyworld> upstream        https://github.com/juju/juju.git (fetch)
<wallyworld> upstream        https://github.com/juju/juju.git (push)
<wallyworld> thumper: +1 for better test infrastructure :-)
<axw> wallyworld: can I get you to approve that goamz MP please? I am not a maintainer
<wallyworld> sure
<axw> davecheney: "git rebase -i master". I *think* that since it's been published, it becomes problematic (you'd have to force push and then lose GitHub comments)
<thumper> wallyworld: here is a useful trick: git remote set-url --push origin no-pushing
<thumper> wallyworld: or for you 'upstream'
<wallyworld> what does that do?
<thumper> wallyworld: so you don't accidentally push to the main juju repository
<wallyworld> ah cool
<wallyworld> we'll add these to the doc
<thumper> $ git push
<thumper> fatal: 'no-pushing' does not appear to be a git repository
<thumper> fatal: The remote end hung up unexpectedly
<axw> I think we should remove most people's commit access anyway
<thumper> you can't remove the push url, but you can make it not work
<wallyworld> we will lock it down also
<thumper> I just don't trust myself not to make mistakes
<wallyworld> yep. locking it down is next on martin's todo list
<menn0> thumper: isn't origin supposed to be your personal fork and upstream the central shared repo?
<thumper> menn0: I don't know, is it?
<thumper> I'm happy to move to a more accepted terminology
<axw> thumper: yes it is
<thumper> but in the absence of any information, I made it up :)
<thumper> ok, can someone mention that too in the emails?
<menn0> if you fork on GH and clone that onto your machine, then origin will automagically be your personal repo
<menn0> here's what I've got:
<menn0> origin	git@github.com:mjs/juju.git (fetch)
<menn0> origin	git@github.com:mjs/juju.git (push)
<menn0> upstream	https://github.com/juju/juju.git (fetch)
<menn0> upstream	no-pushing (push)
<axw> http://stackoverflow.com/questions/9257533/what-is-the-difference-between-origin-and-upstream-in-github
 * thumper updates 
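Collected into one sketch, the layout menn0 shows plus thumper's no-pushing trick, assuming a hypothetical GitHub user USER:

    git remote add upstream https://github.com/juju/juju.git
    git remote set-url --push upstream no-pushing   # make accidental pushes fail
    git remote -v
    # origin    git@github.com:USER/juju.git (fetch)
    # origin    git@github.com:USER/juju.git (push)
    # upstream  https://github.com/juju/juju.git (fetch)
    # upstream  no-pushing (push)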
<menn0> and if you do this in your feature branches: git branch --set-upstream-to remotes/upstream/master
<menn0> then stuff like "git log @{u}.." and "git rebase -i @{u}" will work with your changes beyond upstream
<menn0> and this: git config branch.autosetupmerge always
<menn0> means you don't need to do it for every feature branch
<menn0> this all requires git 1.7.something
<menn0> ish
<menn0> but we're probably all running that
<thumper> I don't get the @{u} thing
<menn0> it's short for @{upstream} and means "the rev that upstream is on"
<thumper> what does autosetupmerge do?
<menn0> avoids the need to do "git branch --set-upstream-to remotes/upstream/master" for each new branch you create
<menn0> in recent versions of git "upstream" means something special and there's support through various commands for it
<menn0> actually, I believe for rebase you can just do "git rebase -i" and it will use "upstream" if defined for the branch
<thumper> ah... nice
<thumper> hmm
<thumper> I tried to do the git branch line
<thumper> and it errored
<thumper> error: the requested upstream branch 'remotes/upstream/master' does not exist
<thumper> I started with "go get github.com/juju/juju"
<thumper> then messed with the remotes
<axw> wallyworld: is there a bot for goamz?
<thumper> how do I recheckout my master branch as upstream/master ?
<wallyworld> axw: i thought so
<wallyworld> i'll check
<axw> thumper: I think you want to do "git branch --set-upstream-to upstream/master master"
<axw> assuming you haven't changed it
<thumper> I think I need to fetch heads
<axw> right, do a "git fetch upstream" first
<thumper> got it now
<thumper> axw: so if you don't specify remotes/ it assumes it?
<axw> I don't know what the difference is :)
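The working sequence menn0 and axw arrive at, as a sketch:

    git fetch upstream                                   # populate remotes/upstream/*
    git branch --set-upstream-to upstream/master master  # track the shared master
    git config branch.autosetupmerge always              # do this automatically for new branches
    git rebase -i @{u}                                   # @{u} now resolves to upstream/master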
<wallyworld> axw: bot doesn't do tarmac :-( if the tests pass for you, i guess we just push manually
<axw> passes for me
<axw> can you please submit?
<wallyworld> sure
<wallyworld> axw: done
<axw> wallyworld: thanks :)
<wallyworld> i forgot to update the commit message, oh well
<thumper> heh
<thumper> one of my imports has too many 'juju's
<wallyworld> lol
<thumper> waigani: :~/go/src/github.com/juju/juju$ (cd ~/go/src/launchpad.net/juju-core/ && bzr diff -r ancestor::parent) | patch  -p0 --merge
<thumper> waigani: so I did this from the juju branch
<thumper> it assumes that the branch in juju-core does indeed have the parent branch set to trunk
<thumper> I had three merge conflicts
<thumper> all imports
<thumper> hmm...
<thumper> ick
<thumper> minimumunits_test.go:21:
<thumper>     s.ConnSuite.SetUpTest(c)
<thumper> export_test.go:115:
<thumper>     c.Assert(err, gc.IsNil)
<thumper> ... value *errors.errorString = &errors.errorString{s:"cannot create log collection: local error: bad record MAC"} ("cannot create log collection: local error: bad record MAC")
<thumper> intermittent failure
<thumper> anyone know a way to do the equivalent of 'bzr clean-tree' ?
<thumper> I have some untracked files created due to merge conflicts (I think)
<waigani> thumper: sorry just saw your message
<waigani> thumper: I get an error
<waigani> patch unexpectedly ends in middle of line
<waigani> patch: **** Only garbage was found in the patch input.
<waigani> here's my diff: http://paste.ubuntu.com/7584825/
<thumper> waigani: look at the output of your diff by piping to less rather than patch
<thumper> um...
<thumper> yeah
<thumper> you have colour there
<waigani> thumper: I just piped my diff to pastebin
<thumper> right
<thumper> you have made diff give colour
<thumper> patch can't handle that
<waigani> I have? hmm don't remember doing that
<jimmiebtlr__> Believe there is a clean subcommand in git
<thumper> jimmiebtlr__: ok, so I didn't need this? git status -s | grep "^??" | cut -d " " -f 2 | xargs rm
<jimmiebtlr__> no
<thumper> yep, looks like git clean would do what I want :-)
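For the record, a sketch of the git clean equivalent of bzr clean-tree:

    git clean -nd   # dry run: list untracked files and directories
    git clean -fd   # actually delete them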
<thumper> waigani, menn0: https://github.com/juju/juju/pull/6
<menn0> thumper: do you want us to redo our review comments or just LGTM it?
<thumper> menn0: I addressed the review comments in the old review thingy
<thumper> menn0: should be OK to lgtm it
<menn0> thumper: ok. I'll quickly check the old review and approve
<thumper> sure
<menn0> thumper: done
<menn0> waigani: have you got a bzr alias for diff to cdiff ?
<waigani> menn0: not that I know of
<thumper> waigani: bzr alias
<waigani> I just added a comment on thumper's branch. Do I want to "add a line note" or not?
<thumper> waigani, wallyworld: now I have a question...
<wallyworld> yes?
<thumper> it seems to me that we want relatively clean commit logs
<thumper> now I have one big commit
<waigani> hey I do, menn0 yes you're right
<thumper> and one review comment about changing one line
<thumper> ideally I'd like to clean up the one line
<thumper> but it will then be in the commit history
<thumper> and given how git handles logs differently
<thumper> do we care?
<thumper> or more importantly:
<thumper> what is our process around this
<thumper> because it will happen a lot
<wallyworld> i think if the change you make relates to a review comment, perhaps it should stay in the logs?
<menn0> waigani: yeah, cdiff doesn't check to see if stdout is a terminal or a pipe and always emits color escapes regardless
<thumper> wallyworld: more a "tidy up this line" nothing really log worthy
<wallyworld> that was the original thinking anyway
<wallyworld> i think perhaps it's a case of "do the right thing"
<wallyworld> if just cleanup, then squash
<thumper> wallyworld: how?
<menn0> even easier: amend
<wallyworld> rebase -i ?
<wallyworld> not sure
<menn0> fix the line
<menn0> then: git commit --amend
<thumper> menn0: ta
 * wallyworld makes a note to add that to doc
<menn0> sorry, left out the "git add"
 * wallyworld misses bzr already :-(
<menn0> --amend just allows you to make changes to the last commit and/or edit the commit message
<menn0> useful when you forgot to include a file or realised you mistyped the commit message just as you hit enter
<thumper> menn0: git commit -a --amend?
<menn0> that works too :)
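A sketch of the amend step, with a hypothetical file path:

    git add state/service.go   # stage the one-line cleanup (path hypothetical)
    git commit --amend         # fold it into the previous commit; or: git commit -a --amend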
<thumper> menn0: however this branch is being reviewed...
<thumper> menn0: so how does that work?
<thumper> given that we are changing public history
<menn0> that branch only exists in your personal repo
<menn0> so I don't think it matters that much
<menn0> but i'm not sure what exactly will happen if you push again
 * menn0 thinks thumper makes an excellent lab rat
<thumper> heh
<menn0> I think the "right thing" will probably happen but let's find out for sure
<thumper> menn0: ok git master: how do I get a local diff vs upstream/master?
<thumper> wallyworld: add that to the docs too
<thumper> wallyworld: when he says it
 * thumper waits
 * thumper waits impatiently
<menn0> git fetch upstream
<menn0> then
 * wallyworld taps fingers
<thumper> menn0: git status tells me I am up to date with upstream/master already
<thumper> menn0: what is the difference between fetch and pull?
<menn0> git diff @{u}..
<menn0> I think
<thumper> and the two dots?
 * menn0 is not a git master
<thumper> menn0: well that did something
<menn0> shorthand for a range
<axw> thumper: "In its default mode, git pull is shorthand for git fetch followed by
<axw>        git merge FETCH_HEAD."
<thumper> and it kinda looks like a diff
<thumper> axw: ah... good to know
<menn0> <something>.. is the same as <something>..HEAD
<jimmiebtlr__> there is also git diff
<axw> fetch brings down the commits, pull does that and merges into your working tree
 * thumper likes having git capable folks to ask
 * menn0 thinks thumper didn't read what I wrote
<menn0> git fetch
<menn0> then git diff
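Spelled out, a sketch of what menn0 is suggesting, assuming the upstream tracking branch set up earlier:

    git fetch upstream   # update remote-tracking refs; working tree untouched
    git diff @{u}..      # same as: git diff upstream/master..HEAD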
<thumper> menn0: capable != master
<thumper> menn0: but you are more capable than me
<thumper> and master status is in the eye of the beholder :)
<thumper> says the git pleb
<menn0> much of the stuff I've written about git today, I learned today!
<menn0> all the "upstream" stuff anyway
<thumper> menn0: needed to --force the amended commit
<thumper> :)
<menn0> thumper: ok. I guess that's kinda expected. does the branch look right on GH?
<thumper> the comments are handy too
<thumper> the comments say "waigani commented on an outdated diff 12 minutes ago" with a "show outdated diff" link
 * thumper nods
<thumper> looks ok
 * thumper does the $$merge$$ thing
<thumper> I wonder if this'll work
<waigani> man I'm so outdated ...
<menn0> thumper: the way GH handles updated diffs is nice, keeping track of previous versions and all that
<thumper> yay, jujubot grabbed it
<thumper> although I want a better picture than mgz
<thumper> we should have a robot pic
<menn0> I was thinking the same
<menn0> although mgz is kinda robotic...
<menn0> in a coding machine kind of way
<waigani> thumper: you might like: git config merge.conflictstyle diff3
<waigani> thumper: it'll give you the <<<<<<< ======= inline markers for conflicts
<waigani> menn0: what is your preferred way of handling conflicts?
<menn0> I just use the default but diff3 is sometimes nice too
<menn0> some people like to get git to invoke an external merge tool like kdiff3 or meld, but I usually prefer to handle it in my editor unless it's really hairy
<menn0> the merge.tool config option lets you set one of the many supported tools to invoke in the event of merge conflicts
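As a sketch, the two config knobs mentioned here:

    git config merge.conflictstyle diff3   # include the common-ancestor version in conflict markers
    git config merge.tool meld             # tool to launch via `git mergetool` (kdiff3, meld, ...)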
<jcw4> I just issued a pull request for a very small change in the README.md...
<jcw4> In it I replaced a recommendation to do bzr pull with a recommendation to do 'git pull --rebase' to pull in upstream updates
<jcw4> I haven't heard any discussion about git pull --rebase when pulling in from master but I don't think that will be controversial
<jcw4> (presumably no-one will be making commits in their local master branch, so it should be moot)
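A sketch of the recommended update, assuming the upstream remote layout discussed earlier:

    git checkout master
    git pull --rebase upstream master   # fetch and rebase instead of creating a merge commit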
<jam> sinzui: given that the conversion to git was lossy (it seems all non-mainline revs were stripped), we don't want to just overwrite our existing lp:juju-core branch with a git import. I suppose we could just move it off to the side?
<jcw4> fwereade: I guess you're the pool 1 reviewer :-D  -  https://github.com/juju/juju/pull/7
<jcw4> axw: thanks
<axw> no worries
<axw> wallyworld: another one sorry, https://code.launchpad.net/~axwalk/goamz/ec2test-runinstances-availzone/+merge/221982
<wallyworld> sure
<wallyworld> axw: so it's only the one file changed? no tests?
<axw> wallyworld: right. there were no tests. I could add one, but it seemed a bit against the theme of the existing tests. I'm adding one in juju (that's how I found it)
<wallyworld> oh ok, ta
<wallyworld> axw: done
<axw> wallyworld: thanks
<axw> simple review anyone? https://github.com/juju/juju/pull/5
<jam> axw: do you know what the "Comment": "null-185" stuff is ?
<jam> axw: also, there appears to be a bunch of stuff about Godeps/_workspace that we probably don't actually want
<jam> I don't *think* we want to end up with double copies of all dependencies, do we?
<jam> axw: at least it looks like godep wants to take over your GOPATH and auto-insert Godeps/_workspace as part of GOPATH, instead of working "as normal"
<jam> so I *think* we actually need more discussion about how Godep is actually going to work with our system, since it isn't just a drop-in replacement
<axw> jam: sorry walked away for a bit. where's the Comment?
<jam> axw: https://github.com/juju/juju/pull/5/files#diff-681659ab9abb4b4883e78e8aaa980dbaR10
<axw> jam: ah, no idea. also, I misunderstood the workspace. I think you're right, let's hold off for now
<jam> axw: what is the equivalent of marking an MP as WiP ?
<axw> jam: there is no state for PRs. you could add a label, or just a comment
<jam> axw: is there a "rejected" or "closed" so that other people don't try to review it?
<axw> nope
<jam> axw: yep, you can use the "Close" link
<jam> when commenting
<axw> sorry
<jam> and it goes to Closed without merging.
<jam> So you don't have quite the "please resubmit this" niceness, but you can put that in the comment.
<axw> that seems like a reasonable equivalent
<axw> it'll disappear off the active PRs until that person reopens it
<jam> yeah
<dimitern> morning all
<jam> morning dimitern
<voidspace> dammit, so my change didn't fix trunk - but it looks like that's only because it isn't waiting long enough for status to return successfully
<voidspace> morning all
<axw> morning
<axw> sinzui: any particular reason why CI is using m1.xlarge, and not m3.xlarge? m3 is faster and cheaper. AFAICT, no downside
<axw> mgz: ^^ is it worthwhile me changing in the build script, or do you think you'll have the HP instance plugged in soon?
<axw> (changing in the build script for the lander, only)
<dimitern> jam, perrito666, others? I migrated my networks constraint branch to github, already approved on rietveld - just a final look?
<dimitern> https://github.com/juju/juju/pull/9
<jam> looking
<jam> dimitern: lgtm
<dimitern> jam, thanks!
<davecheney> https://github.com/juju/juju/pull/10
<davecheney> i dunno if we're using godeps anymore, but this fixes 1325707
<davecheney> #1325707
<davecheney> no _mup_ ?
<dimitern> bug 1325707
<dimitern> yep it's gone
<dimitern> davecheney, LGTM
<dimitern> davecheney, yes, we're still using godeps
 * dimitern really misses bzr pipelines in git 
<davecheney> dimitern: ta
<davecheney> so $$merge$$ will submit this
<natefinch> morning all
<davecheney> will it also run a jenkins build that I can see if 1325707 is fixed ?
<natefinch> davecheney: you need to go to https://github.com/orgs/juju/members and set yourself public
<davecheney> natefinch: ta
<davecheney> done
<davecheney> do I need to recomment ?
<natefinch> as does dimitern and fwereade and vladk and rogpeppe and sinzui and voidspace and some people not online
<dimitern> natefinch, cheers, just did
<natefinch> :)
<natefinch> davecheney: no idea
<natefinch> davecheney: probably can't hurt
<dimitern> natefinch, no need - "Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju"
<dimitern> and i haven't recommented
<natefinch> cool
<davecheney> whee!
<dimitern> it's sooo much nicer to see the progress of the merge
<natefinch> yep
<davecheney> poop, timed out
<jam> dimitern: it is, though having it take 4x longer is a bit of a bad tradeoff
<davecheney> does it really have to download every single dependency every time
<davecheney> that is going to make the build super fragile
<davecheney> especially when there are redirectors involved
<jam> davecheney: it bootstraps a brand new image to run the tests on every time
<jam> they could potentially seed it
<dimitern> jam, yeah, I think some initial steps can be optimized
<natefinch> so, why does the new CI system run on different computers than the old CI system?
<natefinch> why can't we just use the same ones, if they worked better?
<jam> natefinch: a different group of people set it up, and it uses different infrastructure (run by Jenkins, because Tarmac can't run against Github)
<natefinch> jam: oh, right, this is the landing bot not CI per se... right
<jam> natefinch: I *think* the idea was to enable us to get to a point where we could actually run several test suites in parallel against multiple new machines
<jam> natefinch: well, the landing bot is now running like CI
<natefinch> jam: right
<jam> rather than running as a dedicated landing machine.
<jam> natefinch: anyway, I think they might have been trying to make it so that gccgo and golang go would run in parallel, and maybe run multiple arch's
<jam> natefinch: personally, the increase in cycle time (how quickly we can land code) is a pretty major thing, which I've been bringing up as they wanted to increase what we do pre-commit. But others may not see it the same way.
<jam> 15min was a bit long already
<davecheney> go build github.com/juju/juju/...
<davecheney> + export JUJU_NOTEST_MONGOJS=1
<davecheney> + JUJU_NOTEST_MONGOJS=1
<davecheney> + export GOMAXPROCS=1
<davecheney> + GOMAXPROCS=1
<davecheney> ^ who did this
<davecheney> it's not wrong
<davecheney> but not right either
<jam> I *think* wallyworld et al are trying to get the test suite to be stable enough to run in parallel. While we can't run any given package test in parallel (because lots of tests mutate global state and then set it back again)
<davecheney> the correct line occurs just after it
<davecheney> go test -p 1 ./...
<jam> I'm really surprised we're running the test suite with JUJU_NOTEST_MONGOJS=1
<jam> did we lose running on Precise, too ?
<davecheney> that is needed when using juju mongodb package
<jam> davecheney: yes, but we should be using mongodb-server on Precise, and we should be running the test suite on precise because if it is going to break *somewhere* it would be there.
<jam> that, and it isn't actually needed anymore: if the test suite notices it is using juju-mongodb, it sets it automatically
<voidspace> yay, after updating filters for github I'm down to 34 emails in my inbox instead of 151
<axw> davecheney: indeed, GOMAXPROCS shouldn't be there. I'll take it out
<axw> jam: I think it's trusty atm. mgz will be looking at changing us from an ephemeral to a nailed-up instance (maybe with start/stop by jenkins? not sure). should be simple to go back to precise then
<axw> as for JUJU_NOTEST_MONGOJS ... not sure why that's in there
<axw> possibly cargo culted
<voidspace> natefinch: hello nate, you're on early today
<natefinch> voidspace: I'm usually up by now, just often hiding :)
<voidspace> natefinch: heh, sensible
<voidspace> I've just "gone public"
<voidspace> at least as far as the juju-team goes
<voidspace> natefinch: soooo... the commit yesterday *didn't* fix the build
<voidspace> natefinch: *but*
<voidspace> natefinch: I think that's mostly because of a mistake on my part
<natefinch> ok
 * natefinch refrains from smashing his forehead on his keyboard for now
<voidspace> natefinch: when I switched from waiting for CurrentStatus instead of CurrentConfig I left it at only waiting 5 seconds
<voidspace> but we know that status can take longer than that, I meant to increase the time
<natefinch> right, which we know might not be long enough
<voidspace> the *good* news is that the error changed
<voidspace> instead of "Closed explicitly" it looks like the Refresh worked and we now get a sensible error message
<voidspace> I'll go copy & paste in a second, but now it basically says "initiating, be ready shortly"
<voidspace> so really it's good news
<voidspace> wrapped up as bad news
<voidspace> 2014-06-04 09:23:45 WARNING juju.replicaset replicaset.go:77 Initiate: fetching replicaset status failed: cannot get replica set status: Received replSetInitiate - should come online shortly.
<natefinch> that does sound good
<voidspace> natefinch: so my bad, but I'm hopeful that *one last attempt* really should solve the problem :-)
<natefinch> That also sounds like an error we should handle explicitly in that code
<voidspace> natefinch: instead of warning?
<voidspace> we have to do a string comparison unfortunately
<natefinch> I know that blows, but if there's an error that we actually understand and can handle programmatically... we should do so
<voidspace> ok
<voidspace> we do that for unreachable members in the same block of code, so it's straightforward enough
<voidspace> new mp coming in shortly when I straighten out my git workflow
<voidspace> heh, pr instead of mp I guess
<voidspace> need to straighten my terminology too
<voidspace> natefinch: on another topic, did you get an email from me this morning?
<natefinch> voidspace: oooh, keynote, huh?  awesome! :)
<voidspace> natefinch: yeah, should be good
<voidspace> natefinch: I'm going to work with bloodearnest on a good demo and do a practice run at PyCon UK
<jam> mgz: wallyworld: another question on the bot. I believe that I see it does a proper "always generate a merge even when you could fast-forward"; I just wanted to check if that was really true.
<natefinch> voidspace: cool cool
<jam> voidspace: well, CL from Rietveld, even
<voidspace> jam: has lbox been updated for git?
 * voidspace needs to read the updated CONTRIBUTING doc
<jam> voidspace: we don't use lbox anymore, AFAICT
<voidspace> ok
<jam> voidspace: you push up your branch, and do a PR against master
<jam> no more Rietveld
<voidspace> I won't miss it terribly
<voidspace> jam: right, I've already done the PR dance once
<voidspace> for a one line change...
<voidspace> but I think I managed to commit to the wrong branch in my github fork - so I need to delete and start again (well I probably don't *need* to, but it's the path of least resistance at the moment)
<natefinch> mgz: do we have something running the go fmt precheck at least?
<axw> natefinch: yes, we do
<natefinch> axw: ok, good.  all is right with the world.
<axw> at least we did before yesterday... I should check it got put back in
<axw> natefinch: yep. the pre-push script in juju/scripts gets run by the lander
 * axw is sad that he can't link branches and bugs anymore
<axw> now I have to remember things
<natefinch> you can post a comment with a link to the bug, right?
<jam> natefinch: you can set up a pre-push hook to the check script, but it means everytime you want to push out your local changes it will block for 30s
<axw> yes, I am doing that. I'll need to also link from the bug back to the PR/commit tho
<axw> kind of a drag
<jam> I would hope they followed tarmac bot's goal of "just run go fmt ./..., because if we can automatically fix it, no reason to block on it"
<jam> propose time != push time, which is definitely != commit time.
<natefinch> jam: that's a good point... as long as go fmt doesn't error out, just run it
<jam> natefinch: yeah, and if it *does* then you have bad code that needs to abort anyway
<axw> sounds like a good thing to do. I just stuck the script in there so we had *something*.
<jam> axw: I like that the check script does stuff like go vet, etc.
<jam> But I'd rather do the "takes a while" checks only when I actually am ready to publish
<jam> because I like to commit a lot, push a couple of times a day, and propose when its ready
<axw> jam: it is possible to not run the hooks
 * axw rummages around for the command
<axw> --no-verify
<jam> axw: well, having to type "git push --no-verify" every time is a bit wonky
<jam> I could invert the default and then have to run "git push --verify"
<axw> you could have an environment variable or something, and an alias that sets that to RUN_SLOW_THINGS=1 before "git push"
<dimitern> you can do git config --global alias.push "push --no-verify"
<axw> ah
<dimitern> that was one of the very first things i did - migrating some of my bzr aliases into ~/.gitconfig
<dimitern> the other thing i'm trying now is Stacked Git, to see how well it works as a bzr pipelines replacement
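Summing up the push-hook thread above as a sketch; the hook wiring and branch names are hypothetical, though --no-verify is the real flag:

    # wire the repo's check script up as a pre-push hook (path hypothetical)
    ln -s ../../scripts/pre-push .git/hooks/pre-push
    git push --no-verify origin my-branch   # routine push: skip the slow checks
    git push origin my-branch               # publishing: run gofmt, go vet, etc.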
<jam> vladk: dimitern: standup ?
<dimitern> jam, brt
<jam> dimitern: I think it is primarily about assembling patches, which doesn't actually assemble branches
<axw> jam: just had another look at godep, and "godep restore" does update the branches in their usual $GOPATH location
<axw> jam: not really sure what the workspace is for...
<jam> axw: I'm pretty sure that "godep go test" sets GOPATH to Godep/_workspace
<axw> yes it does
<axw> so we'd probably not want to do that at all, but rather just "godep restore" && "go test"
<natefinch> yeah, I think you can run it exactly like godeps
<natefinch> godep restore is the same as our godeps -u depedencies.tsv
<natefinch> I believe... I haven't tried it out yet
<davecheney> natefinch: i believe that is true
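Assuming the equivalence nate describes holds, a sketch of the two commands side by side:

    godeps -u dependencies.tsv   # current tool: check out the revisions pinned in the tsv
    godep restore                # godep: fetch/update the pinned deps in the normal $GOPATH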
<jam> natefinch: well it fundamentally wants to copy everything into Godeps/_workspace
<jam> which means while it might approximate it, it still does its own stuff in a different way
<davecheney> jam: yes, however it will collapse imports when it finds one of those deps also uses godeps
<davecheney> this will mean that CI will have less to do to construct an environment to test
<davecheney> and will not depend on external sites
<davecheney> this will also make life easier for sinzui as the current build-release-tar.bash script can be simplified/removed
<jam> davecheney: except we aren't versioning those deps
<jam> so all the things you're saying still aren't true
<jam> anyway, maybe we do want to version them, but if that is true, then clearly we actually need to discuss how we need to change to use it.
<voidspace> natefinch: https://github.com/juju/juju/pull/12
<davecheney> jam: i don't know about versioning
<davecheney> that sounds like releases and build numbers and stuff
<davecheney> which is orthogonal to the problem I am interested in: reliable, reproducible builds
<natefinch> he means vendor
<natefinch> I think
<davecheney> natefinch: yes, reproducible builds via vendoring
<jam> natefinch: I mean "commit it to our tree" which yeah, vendoring is sort of it, though even vendoring could be "github.com/juju/OTHERPACKAGE", which is still different from having it inside your tree committed with the other files.
<jam> If we want to vendor, I'm not strictly against it, but it should be decided "we want to vendor" not wedged in because we thought we should use a different godep tool.
<natefinch> absolutely
<natefinch> I'm not sure I want to vendor btw
<natefinch> actually pretty sure I don't
<tasdomas> hi
<tasdomas> when running `go test -gocheck.vv -gocheck.f=LifeSuite ./...` in /state, I see these log entries:
<tasdomas> [LOG] 36.85478 DEBUG juju.testing tls.Dial(127.0.0.1:52204) failed with dial tcp 127.0.0.1:52204: connection refused
<tasdomas> [LOG] 36.85478 DEBUG juju.testing tls.Dial(127.0.0.1:52204) failed with dial tcp 127.0.0.1:52204: connection refused
<jam> natefinch: well we've talked about it, and we don't do it *yet*, so we've generally decided not to so far
<tasdomas> the tests pass, but are these log entries safe to ignore?
<davecheney> tasdomas: does the warning fire when you don't use -gocheck.f ?
<tasdomas> davecheney, yes, for every test suite that's being set up
<dimitern> jam, fwereade, natefinch, finalizing networks constraints - https://github.com/juju/juju/pull/13 PTAL
<natefinch> there's a lot of crap in the logs even when things work correctly, unfortunately. these are debug messages, so probably safe to ignore
<natefinch> tasdomas: ^^
<natefinch> voidspace: see my comment on your PR
<voidspace> natefinch: hah, you spotted my deliberate error...
<voidspace> natefinch: fixing
<natefinch> heh
<natefinch> just making sure I'm paying attention? :)
<voidspace> natefinch: fixed
<jam> natefinch: tasdomas: I believe it is reasonably expected that while starting up Mongo it takes a try or two to get connected. without ".vv" you shouldn't see them. And hopefully we have a "connected to" if we are giving the "connection refused" messages.
<voidspace> natefinch: I'm assuming 15 seconds will be enough... (it really should be)
<voidspace> we can always bump up maxInitiateStatusAttempts if we need to
<natefinch> yep
<voidspace> 15 seconds to fail is already a bit of a bump though
<voidspace> 15 *extra* seconds to fail
<TheMue> *: Have to leave home for about 1h, see you all later
<natefinch> voidspace: but failure should be rare
<voidspace> natefinch: yep, we didn't see this failure before - so it should be very rare
 * perrito666 is informed by github that he has been subscribed to 14 repositories... let the mailing avalanche begin
<natefinch> haha
<mgz> jam: er, I'm an idiot, can you add me back to the owners group on github?
<natefinch> hahaha
<mgz> I need to remove myself last...
<mgz> perrito666: sorry, spam from me
<dimitern> sorry to be a pest, but a gentle reminder ... jam, fwereade, natefinch, i'd appreciate a review on the following (finalizing networks constraints) - https://github.com/juju/juju/pull/13
<natefinch> should we turn off issues if we want bugs in launchpad?
<mgz> trying to make it so we can't land without the bot actually testing the code
<mgz> natefinch: probably
<mgz> natefinch: can you re-owner me?
<jam> mgz: no
<mgz> ;_;
<jam> mgz: I find it funny that your real account (bz2) doesn't have your picture, but your fake one (jujubot) does
<mgz> yeah, I shall be fixing that
<mgz> I confused github with an email account dance
<jam> mgz: you are now an owner again
<natefinch> where's the ui for that jam?
<natefinch> I couldn't find it
<jam> mgz: I take it the idea is to filter down the list of owners, and then just have the bot controlling the official repo?
<jam> natefinch: github.com/juju, click on Members, click on Teams, click on Owners, "add users to team" edit box
<mgz> jam: yup
<jam> looks like you can skip the "click on Members" and go straight to teams
<mgz> I'm going to leave team leads as owners and see how that works out
<natefinch> ug, ok, I was looking in the list of members not teams
<jam> natefinch: so was I
<mgz> it's a bit more annoying for everyone but should cut out some accidents
<jam> mgz: given I set up lp:juju-core that way, I think you can feel where I fall in that :)
<jam> the only question is whether it would block us commenting on PRs
<natefinch> jam I mean, I didn't know owners was a team and not a role or something
<jam> natefinch: sure
<mgz> okay, *now* I remove myself :)
<jam> natefinch: I certainly thought it was just a role that I could change
<jam> given that the members list shows "members" and "owners" directly
<jam> but I don't think it would show "hackers"
<natefinch> exactly
<jam> so it *is* just a team, but it is a github special team
<natefinch> yeah
<mgz> yeah, and I can't remove its access to push a branch, so we all need de-ownering
<mgz> okay, done
<jam> mgz: so is there any chance to do the same "sudo" trick that we had on LP?
<mgz> teamlead plz do this for me
<jam> I know tim and myself would like to not fuck stuff up by accident, only when we really need to
<mgz> it's not nearly as good
<jam> mgz: I'm more concerned about one of the teamleads accidentally leaving github.com/juju/juju as origin and doing "git push"
<mgz> well, I could leave the bot as the only owner, but actually I don't like that as much
<jam> mgz: well we need owners to do membership status, right ?
<mgz> jam: you'll have to be better than the rest of us and not make mistakes :)
<mgz> right, you have to be owners really
<jam> mgz: and you can't restrict your own rights... :(
<natefinch> responsibility?  I didn't sign up for this
<mgz> or it's painful to add people/change settings/create repos
<jam> natefinch: umm... I think you did :)
<natefinch> oh, right :)
<jam> natefinch: but there is a difference between helping you enforce your own responsibility by intentionally creating boundaries
<mgz> hm, I still have a big button that says merge pull request
<jam> I'm a big believer in "make it hard to do it wrong"
<natefinch> jam: me too
<jam> mgz: "hackers have write access"
<jam> should I be removing that /
<jam> ?
<jam> This team will be able to read its repositories, as well as push to them.
<mgz> nope
<jam> vs Admin Access: This team will be able to push/pull to its repositories, as well as add other collaborators to them.
<mgz> we don't make hackers a team on juju/juju
<jam> vs This team will be able to view and clone its repositories.
<mgz> we need it still for juju/testing say
<mgz> till we move that to the bot, then we can remove the team from that branch
<jam> mgz: https://github.com/orgs/juju/teams/hackers says "juju/juju"
<jam> should I be removing *that* ?
<mgz> can you double-check, but github.com/juju/juju/settings/collaboration should be owners only
<mgz> I may have been a bit too automatic with the adding
<jam> mgz: hackers is listed as a team
<mgz> okay, revoke there
<jam> mgz: and it doesn't let me set perms on team there
<mgz> just revoke
<jam> mgz: it seems there isn't the many-to-many relationship that we want
<jam> "hackers" should have X access on Y project
<mgz> I may need a bots team, which will be on those branches
<jam> revoked
<mgz> in fact, I think I'll do that just in case
<mgz> oh, poo
<mgz> jam: can you reowner me for now... I'll revoke myself later
<jam> mgz: I keep telling you, that's not gonna happen
<mgz> otherwise I'll keep having to bug you when I screw up >_<
<jam> mgz: ownered
<mgz> ta
<jam> mgz: did you see your "merge this" button go away, at least?
<mgz> ah, let me check via another cunning method
<jam> a second account?
<mgz> yes, it does
<mgz> the bot is not yet granted on most branches
<jam> mgz: I don't actually see the "jujubot" account
<jam> mgz: Is it a full fledged github account, or is it a 'something that has my credentials' account?
<mgz> github.com/jujubot
<jam> ah, there it showed up
<voidspace> natefinch: should I add the $$merge$$ magic comment?
<mgz> okay, this is looking good now
<perrito666> mm, there seems to be no doc for HA
<natefinch> voidspace: yep, after the LGTM, you can do the $$merge$$ ... basically the same as before where you'd set the commit message and mark it as approved
<voidspace> natefinch: yeah, done - thanks
<voidspace> perrito666: I believe we have a card for that
<perrito666> voidspace: I am working on it, but as the card said "update HA docs" I assumed there was something to update :p
<voidspace> hah
<voidspace> why would you assume that? :-D
<perrito666> I am a hopelessly naive guy
<natefinch> perrito666: I'll change it to upsert ;)
<voidspace> Looks like that branch is building. Should know whether or not the build is fixed in about an hour or so.
<voidspace> In the meantime
 * voidspace lunches
<jam> dimitern: I do have it open for review
<dimitern> jam, thanks
<jam> lifeless: so I tried just doing "dd" of the ubuntu.iso image. and it comes out with a filesystem that is: /dev/sdg1   *          64     1986559      993248   17  Hidden HPFS/NTFS
<jam> which does have the "boot" flag set
<jam> but my old machine doesn't recognize it as actually being bootable.
<jam> which is why http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-ubuntu has you use startup disk creator
<natefinch> startup disk creator used to 100% reliably fail last time I tried it (about 6 months ago)
<natefinch> which is why I always used dd
<natefinch> which 100% reliably worked
<jam> natefinch: interesting. I think it works because of a newer Bios than my circa 2003 machine
<jam> natefinch: so when I first tried, it kept failing, but that was because you have to create the partition and boot flag, etc, manually, and it just copies what you need onto the disk
<jam> well, usb stick
<jam> it doesn't seem capable of handling no media/bad filesystems
<natefinch> startup disk creator may make a better image... but I couldn't ever get it to actually finish creating anything, so never got to try what it output
<perrito666> jam: I agree, dd should work just fine
<natefinch> jam: for me, dd of the image onto the usb stick worked fine as-is
<jam> perrito666: I can absolutely confirm it does not :)
<jam> as in, I just tried it
<jam> and USB doesn't show up as bootable
<natefinch> jam: but I was putting it into a new computer, so maybe your old one needs something more specifically set up
<perrito666> jam: the machine I am chatting with now was installed by using a dd usb
<perrito666> try unetbootin
<natefinch> perrito666: his machine is oooooold.  So, who knows what people were doing with USB bootable disks in 2003
<jam> perrito666: sure, I bet it does work (sometimes), I can just confirm that it doesn't work here
<perrito666> try a cd :p
<natefinch> I'm honestly surprised a 2003 machine even has that capability
<perrito666> natefinch: that capability is quite old, but back then it wasn't advertised as "boot from usb", more as "hey, this seems to be a drive, boot from it?"
<jam> perrito666: well I could, but I disconnected all the CDs in this thing. I could get there eventually, but fdisk a fat32 partition seems to be workable
<jam> note there's stuff like "what is the byte offset of the first partition"
<wwitzel3> 2003 .. probably has multiple LPT ports
<jam> and what is the master partition table
<perrito666> :|
<jam> (GPT doesn't work here)
<jam> wwitzel3: I only have 1 LPT, but I have 2 COM ports
<perrito666> jam: unetbootin won't create something bootable?
<jam> I haven't tried unetbootin
<jam> I'm just using upstream ISO/Startup disk creator.
<perrito666> jam: do you have network and floppy? if so you can create a debian net install disk, install, change the repos and upgrade. I think that... let's say, works
<jam> perrito666: so fdisk in "dos compatible mode"
<jam> creating a primary partition at byte 63
<jam> set it bootable
<jam> set it as a Fat32, seems to create something that will actually boot
<jam> but I get to "Operating System Not Found" which doesn't seem helpful :)
<jam> it worked two days ago
<jam> I tried "dd" to see if I was just going the long way round
<jam> weird, QEMU was happy ,but now my system is not
<perrito666> jam: maybe this is a stupid question
<perrito666> what size is your pendrive?
<jam> perrito666: 16GB
<perrito666> jam: do you have something smaller to try?
<jam> not on hand, and it did work a couple days ago
<perrito666> meh
<jam> perrito666: note, dd did seem to work on the 64 bit ISO, but it booted and said "this is not a 64-bit system, download the 32bit instead" :)
<perrito666> well, is it?
<perrito666> btw, is a machine that old usable with a newer ubuntu?
<jam> perrito666: again, I was using it for a while, I just wiped it to see if I could dd it
<jam> so yeah, it is a 32 bit machine
<jam> interesting, byte 63 really did matter
<jam> I tried again with the default offset (byte 32) and it failed to detect it as bootable
<perrito666> jam: that is sector iirc
 * TheMue is back after helping his wife buying more and larger plants for our garden
<perrito666> TheMue: sweet, that means better air to breathe
<jam> perrito666: sure, I just mean you have to know the magic 63 or it doesn't work, and it doesn't default to the right magic value
<perrito666> jam: fdisk defaulted to that magic value when your machine was new :p
<jam> perrito666: yeah, and now it is actually a bad one because of sector alignment, etc.
<TheMue> perrito666: exactly, and fine recreation in the garden
<jam> why can I not reproduce what worked 2 days ago...
<perrito666> jam: welcome to sinzui's world
<TheMue> *lol*
<perrito666> even with the properly partitioned stick you cannot boot?
<sinzui> perrito666, jam , speaking of 32bit. If I installed i386 go libs and tools on an amd64 instance, would this machine be valid to run unittests and build i386 juju?
<jam> sinzui: davecheney is the one to ask that: http://dave.cheney.net/2012/09/08/an-introduction-to-cross-compilation-with-go
<sinzui> We are moving away from cross-compiling
<sinzui> I make win32 juju by installing 32bit golang on 64bit server.
<sinzui> the real issue is I cannot get an i386 machine that is powerful enough to run unittests and deliver a binary in 30 minutes
<jam> sinzui: interesting, I'm ~ ok with using an amd64 in 32-bit mode and going with that
<jam> I have a small concern that it won't notice if we accidentally give it a 64-bit binary and fail to test 32-bit at all
<jam> could we run some level of smoke test on a real 32 bit ?
<sinzui> jam, I was thinking of a jenkins slave to act as an i386. After I apt-add-architecture, I change the test setups to honour it. I would then use the i386 slave to run unit tests, build the binary packages, install, then run lxc tests.
<sinzui> I think I need i386 juju-mongodb too
<natefinch> sinzui: 32 bit go on a 64 bit windows should run almost identically to 32 on 32
<sinzui> almost? don't say that that way
<sinzui> natefinch, maybe I don't care. Win users do report bugs, and they are not reporting bugs about the client.
<natefinch> sinzui: it's only almost in that there are file system and registry differences on 64 bit machines, but those are likely things we'll never run into unless we explicitly do some wacky stuff
<jam> natefinch: sinzui: so I think the test suite on 32bit is fine, it is just that we could screw it up and not realize we were missing test coverage.
<sinzui> okay, thank you natefinch. I will try to remember that
<jam> "test suite in 32-bits on a 64-bit machine"
<natefinch> sinzui: basically, the registry on a 64 bit windows also has a segregated 32 bit section... the two normally never realize they're segregated unless you specifically write code to go looking for the other section
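A sketch of the guard jam is asking for, assuming an i386 Go toolchain is installed on the amd64 slave (output path hypothetical):

    go build -o jujud github.com/juju/juju/cmd/jujud
    file jujud   # expect "ELF 32-bit LSB executable, Intel 80386"; a 64-bit result
                 # means the slave is silently building amd64 and i386 coverage is gone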
<voidspace> still building
<natefinch> voidspace: since an hour and a half ago?
<voidspace> natefinch: I think one was already in the queue
<voidspace> natefinch: so yes, not merged yet as far as I can tell
<natefinch> ahh yeah, single threaded, right.
<voidspace> natefinch: yeah, build 10 has been going for 50 minutes
<natefinch> Seems this is something we're going to have to address sooner rather than later
<sinzui> jam: natefinch voidspace : I am setting up dedicated slaves to provide better testing and build resources next week. I will try to provide one for the lander bot to test merges quickly
<voidspace> sinzui: great
<jam> sinzui: given the Canonistack instance was only dual-CPU and 4GB data, why is the ec2 instance too slow for us?
<natefinch> jam, sinzui: isn't it because we're bringing up the machine from scratch for every test?  Or am I misunderstanding the infrastructure?
<mgz> jam: it's a minor mystery
<natefinch> (for every merge, perhaps?)
<jam> natefinch: I would hope it doesn't take 35 minutes to bring up a new machine
<mgz> jam: a good portion is that to get the tests to pass we *have* to run with -p 1
<jam> given our test suite was running in ~15-20 min
<mgz> which doubles the time taken
<natefinch> jam: also, EC2 is *notorious* for being slower than the specs people expect from it.
<sinzui> yep
<jam> natefinch: bad neighbor problem ?
<natefinch> jam: yeah, basically
<mgz> my plan of attack starts with moving the landing test running to another cloud
<wwitzel3> jam: so the WaitAgentPresence call isn't about actually waiting for anything. In fact, calling it with 1 ns results in the agent status being properly reported, even when doing ensure-availability && ensure-availability
<wwitzel3> jam: I just used it there because a) the Presence pinger API was already exposed there and b) I couldn't determine a better place to call it.
<jam> wwitzel3: I don't see how it affects "ensure-availability && ensure-availability" given the agent hasn't even started yet, but I could see "bootstrap && ensure-availability" is that what you meant?
<jam> I think the issue is that it forces an early watch of the agent status changing
<jam> instead of waiting for it to background poll for it.
<wwitzel3> jam: sorry, yes bootstrap && ensure-availability
<jam> wwitzel3: so I think what I'd like to see is something called "RefreshAgentState" that doesn't take a timeout and doesn't have to Watch anything, since it just grabs the current value
<wwitzel3> jam: I see, ok, and then just refactor WatchAgent to use that internally as well
<natefinch> jam: yeah, we're not worrying about ensure-availaiblity getting called twice right now, since that's less of a valid thing to do than bootstrap and ensure-availability
<jam> natefinch: well, given that we said you could run it in cron, it isn't *that* unexpected. however, sure, the patch wasn't about that, I was just making sure I wasn't missing anything.
<jam> I think the key bit from wwitzel3's patch is that we currently have a mechanism for tracking presence, but it takes a long time for it to actually notice that the machine agent is running, when we want it to trigger as soon as the agent logs in.
<natefinch> yep
<jam> I'm slightly concerned about bouncing agents looking up too much
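A toy sketch of the shape jam describes -- a synchronous read of current presence, with no timeout and nothing to watch; the types and names here are illustrative stand-ins, not the actual patch.

```go
package presencesketch

import "sync"

// Watcher is a toy stand-in for the presence watcher: it records
// which agents have pinged.
type Watcher struct {
	mu    sync.Mutex
	alive map[string]bool
}

func NewWatcher() *Watcher {
	return &Watcher{alive: make(map[string]bool)}
}

// Pinged records that the agent with the given key has logged in.
func (w *Watcher) Pinged(key string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.alive[key] = true
}

// RefreshAgentState answers from the current state immediately: no
// timeout parameter, and no watch on future changes.
func (w *Watcher) RefreshAgentState(key string) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.alive[key]
}
```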
<dimitern> jam, poke re that PR
<jam> dimitern: I posted on it, IIRC
<dimitern> jam, ah, I can see it now, thanks
<sinzui> natefinch, jam, mgz, We might need more powerful instances, but giving the git-tester its own queue or resources will prevent it from queuing for 20 minutes while CI and health checks are running
<voidspace> merged!
<natefinch> wwitzel3, perrito666, voidspace I have the TOSCA meeting right now, can we postpone the standup until it's done?  Probably an hour
<voidspace> natefinch: fine with me
<perrito666> natefinch: sure
<natefinch> I need to reschedule our standup on wednesdays, but gmail makes it tricky
<sinzui> voidspace, can I take down the juju env/mongod that is running under ubuntu user on juju-ci?
<voidspace> sinzui: oh crap
<voidspace> sinzui: yes - sorry
<sinzui> no
<voidspace> sinzui: I thought I destroyed that environment, my apologies
<sinzui> problem voidspace
<sinzui> voidspace, you may have. I just see the mongod running
<voidspace> I hope that is "no problem" and not "no. Problem!"
<voidspace> ok
<bodie_> morning all
<perrito666> hi bodie_
<jcw4> o/
<rogpeppe> mgz: any chance i could get permissions to create repos in github.com/juju ?
<mgz> rogpeppe: ah, I haven't sent the email yet
<rogpeppe> mgz: which email?
<mgz> rogpeppe: maybe, the issue is we can't separate the right to create a new repo in juju from the right to screw up any branch under juju
<mgz> which is somewhat accident-y
<mgz> I'll re-owner you for now, but please comment on the list when I send this notice :)
<rogpeppe> mgz: what comment would you like me to make?
<mgz> rogpeppe: done.
<rogpeppe> mgz: ta
<mgz> that you need to be able to create repos... we can find another way on that, perhaps a role-account all or some of us have access to
<dimitern> jam, if you're still here and can have a look, I've updated https://github.com/juju/juju/pull/13 as you suggested
<rogpeppe> mgz: that's not a bad idea
<bodie_> am I the only one getting this problem in state_test?  http://paste.ubuntu.com/7587871/
<bodie_> It has nothing to do with anything I've touched as far as I can tell
<bodie_> so I guess that's a yes haha
<voidspace> local-deploy-precise-amd64 has got as far as bootstrapping the juju machine agent
<voidspace> dammit, failed
<voidspace> and for a different reason
<voidspace> natefinch: sinzui: latest build failed
<voidspace> Initiate: fetching replicaset status failed: cannot get replica set status: can't get local.system.replset config from self or any seed (EMPTYCONFIG)
<voidspace> natefinch: sinzui: I'm going to prepare a branch backing out the changes that caused this
<voidspace> I don't think we should follow this rabbit hole any further
<natefinch> voidspace: ok.  Damn.
<sinzui> :/
<sinzui> thank you voidspace
<lazyPower> Greetings core team, have any of you seen an LXC failure message akin to the following: http://paste.ubuntu.com/7587937/ - something about LXC not understanding the argument 'trusty' ?
<bodie_> odd
<bodie_> lazyPower, are you on Ubuntu?  I had a hell of a time getting everything working properly until I switched my workstation over to 14.04
<lazyPower> bodie_: this is our juju flavored vagrant image
<lazyPower> so its precise atm
<bodie_> ah
<bodie_> well, that's all I've got, hehe
<lazyPower> our trusty image seems to be suffering from similar issues but I don't have enough empirical evidence to present as to the root cause of the issues yet.
<adeuring1> lazyPower: well, I see this error for the Juju vagrant image for precise, for example this one: http://cloud-images.ubuntu.com/vagrant/precise/20140602/precise-server-cloudimg-amd64-juju-vagrant-disk1.box
<adeuring1> The trusty images work fine.
<natefinch> ahhh stupid chrome, man.  Had to switch to firefox to get hangouts to be stable.  That's pretty sad, Google.
<wwitzel3> ahh, was wondering what you did to fix it
<lazyPower> adeuring1: admittedly it's been about 2 weeks since I've looked at the trusty box; if the lxc container issues have been resolved - bueno!
<bodie_> natefinch, lol
<wwitzel3> natefinch: should we do standup now?
<bodie_> hrmn
<bodie_> http://paste.ubuntu.com/7588034/
<jcw4> bodie_: go get -u github.com/juju/testing/...
<bodie_> ah... derp
<bodie_> that's weird though since I did a go get github.com/juju/juju  -- ah, maybe I left off the /...
<bodie_> yep, looks like that's doing it
<jcw4> bodie_: I avoid go get -u github.com/juju/juju because I don't want go get to overwrite my personal repo at that location.
<jcw4> bodie_: I don't think it will with git, but still...
<bodie_> I had the same concern but I was able to immediately re-checkout my own branch, so I don't think there was an issue
<bodie_> hopefully this will do the trick
<natefinch> wwitzel3: yep
<natefinch> voidspace, perrito666: standup
<perrito666> natefinch: going
<bodie_> and all tests are green!  BOOYA!
<natefinch> nice
<voidspace> natefinch: https://github.com/juju/juju/pull/14
<sinzui> I now know what the powerpc debs are. They are the 32bit debs for ppc64el vms. We don't make juju tools for that arch; we could if Ubuntu envisions running juju inside ubuntu on ppc64el
<bodie_> polishing my nonexistent git rebase skillz: does anyone know if it's possible to use multiple commands per commit?
<bodie_> e.g. squash AND edit
<bodie_> I figured it would just be something like s,e
<bodie_> also, this might be worth looking at (natefinch?) http://paul.stadig.name/2010/12/thou-shalt-not-lie-git-rebase-ammend.html
<bodie_> personally I've always used the branch-and-merge-prolifically approach, which might have a more "honest" but noisier / less human readable history
<mgz> sinzui: what's the oldest version of go the jenkins infrastructure compiles juju with?
<sinzui> mgz, 1.1.2 used by precise and saucy
<sinzui> mgz, oh and the win client
<mgz> sinzui: are we planning to change this for the current dev release?
<mgz> you had a newer version compile yourself, right?
<sinzui> mgz, the juju-packaging/devel ppa is building only with 1.2.
<mgz> sinzui: I guess I'd better take it to the list, but would like to make 1.2 the minimum for trunk
<bodie_> https://github.com/juju/juju/pull/15
<bodie_> ^_^
<sinzui> mgz, well I think that is doable since I backported 1.2 to precise. None of the 1.19.3 bugs imply a compiler issue. I can promote my package to the stable archive for 1.20. *BUT* We need foundations to accept golang 1.2 into ctools to really make the transition
<perrito666> anyone remembers what was the outcome of the discussion about writing the docs as markdown (or anything other than plain text)?
<voidspace> perrito666: I *thought* we all agreed on markdown
<voidspace> perrito666: it's pretty plain-text-ish anyway
<perrito666> I thought so too
<mgz> sinzui: bash help, github-merge-juju/12 failed at the build step but the way I'm calling into the build script didn't get me an exit code to report the failure, it just aborts the job
 * sinzui looks
<sinzui> mgz, I think you are saying that the script should get to: echo "Build failure, reporting on proposal"
<mgz> right, but source is wrong
<mgz> it goes and executes as top level, right?
<mgz> that whole block is kinda horrid
<sinzui> mgz make-release-tarball.bash is setting -e and the script is sourcing
 * mgz <3 sh
<sinzui> I think we can avoid the use of source
<mgz> okay, yeah, that'd do it
<mgz> I tried to do it with a || { ..block.. } rather than fiddling with -e but no joy on the syntax
<sinzui> I try to avoid sourcing because of that. I see the job knows how to find the tarball
<sinzui> mgz, may I rebuild to get the proper error?
<mgz> yeah, just add $$merge$$ on the mp
<mgz> dave won't mind the spam :)
<sinzui> mgz watching http://juju-ci.vapour.ws:8080/job/github-merge-juju/13/console
<mgz> ta
<mgz> that worked. thanks sinzui!
<sinzui> :)
<voidspace> natefinch: when I merged your write-majority code I created an mp that I forgot to mark as WIP
<voidspace> natefinch: so there was some interesting discussion on it :-)
<voidspace> natefinch: https://code.launchpad.net/~mfoord/juju-core/write-majority/+merge/220823
<voidspace> natefinch: part of the problem, which I forgot to mention during standup, is that setting the write mode doesn't return an error
<voidspace> natefinch: so you can't use *that* to detect whether there's a replica set or not
<voidspace> natefinch: I'll have to find some analogous operation that does nothing (e.g. replicaset.CurrentConfig() ) but returns an error when there's no replica set
<mattyw> Is someone able to spare a few moment to talk about how we test long running commands in core?
<natefinch> voidspace: ahh, dang.  So you can set write majority and it won't error out... what happens?
<voidspace> natefinch: I think everything after that errors :-)
<voidspace> that's what I was seeing in tests
<natefinch> haha
<voidspace> with the noreplset error we saw
<voidspace> or whatever it was
<voidspace> no, I think session.SetSafe panics with that error
<voidspace> something like that
<voidspace> hooray for no exceptions!
<voidspace> natefinch: I'll sort it out, not a problem
<voidspace>  natefinch I got an LGTM from jam on my revert PR, so I've gone with the magic merge message
<natefinch> voidspace: cool
<natefinch> (sorta)
<voidspace> heh
<voidspace> jujubot is quick
<mgz> THANK YOU
<mgz> beep beep boop
<voidspace> mgz: heh :-)
<voidspace> no, thank you (I assume)
<voidspace> it noticed my magic-merge-message within a few seconds
<jcw4> mgz: they said you were a machine... didn't realize it was literal
<mgz> :)
<bodie_> that sounds like a reference to a bad B sci fi flick
<bodie_> "My masters built me to be the perfect merge resolver.  Now I must kill all humans."
<jcw4> quick... someone feed the merge bot a bad whitespace merge... that oughta save mankind
<alexisb> jam, what networking spec are you and dimiter currently using?
<alexisb> I should say working off of
<voidspace> natefinch: specifically: ... Panic: cannot create database index: norepl (PC=0x414676)
<jam> alexisb: so *right* now Dimiter is working on fixing up what was discussed with Mark S, about not having "juju deploy --exclude-networks" but instead making it a constraint (juju deploy --constraints=network=^foo"
<jam> I'm trying to find the concrete doc on that
<natefinch> voidspace: man, not sure who's panicking there, but that's terrible
<jam> the other actions being worked on are toward: https://docs.google.com/a/canonical.com/document/d/1XZN2Wnqlag9je73mGqk-Qs9tx1gvOH0-wxMwrlDrQU4/edit#heading=h.h1grzzgqa6st
<voidspace> natefinch: yeah, it sucks :-)
<voidspace> natefinch: inside provider/dummy/environs.go I believe
<natefinch> oh, the dummy provider.... that's our own stupid fault.... quite possibly specifically my stupid fault
<alexisb> jam, ack, when we get to the point of network modeling work we will need a spec that we can point to that is current and explains the work being done
<jam> alexisb: so I know of https://docs.google.com/a/canonical.com/document/d/1bHI5ZXbbnGk3xict7d_39ipOcUI2urFVNMKQv1BFlSk/edit and https://docs.google.com/a/canonical.com/document/d/1UzJosV7M3hjRaro3ot7iPXFF9jGe2Rym4lJkeO90-Uo/edit#heading=h.a92u8jdqcrto though I think the actual concrete plan needs to be properly written up.
<voidspace> natefinch: it's state.Initialize that returns that error and the dummy provider panics
<voidspace> natefinch: but this is long *after* we set the write mode
<jam> Which I pointed TheMue towards, but I don't think we've actually started the process of generating the user docs for how we are going to make networking look.
<alexisb> jam ack
<alexisb> we will want to have a target date for completion of the plan write-up that we can communicate to the interlock group + mark s
<voidspace> natefinch: actually, not *long after*, it's probably immediately after
<voidspace> natefinch: write majority is set in state.Open, which is called from state.Initialize which is called from the dummy provider
<voidspace> natefinch: so it's opening the state (newState) after setting write majority that fails
<voidspace> I'll try guarding it with a replicaset.CurrentConfig call first
<voidspace> if that errors we won't set write-majority
<natefinch> voidspace: ok.... sorry, I had hoped it would be more straightforward, but it seems that's rarely the case when interacting with mongo
<voidspace> natefinch: hah, it's interesting - not a big problem
<voidspace> at least this one seems solvable
<natefinch> voidspace: yep
<voidspace> natefinch: guarding setting the WMode with checking the replicaset config seems to work
<natefinch> voidspace: ship it!
<voidspace> natefinch: well, you say that...
<voidspace> but yes...
<voidspace> :-D
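The guard being discussed looks roughly like this -- a minimal sketch assuming the mgo driver and juju's replicaset.CurrentConfig helper; the function name is illustrative, not the actual patch.

```go
package statesketch

import (
	"gopkg.in/mgo.v2"

	"github.com/juju/juju/replicaset"
)

// maybeSetWriteMajority sketches the guard: SetSafe itself reports no
// error, so probe with replicaset.CurrentConfig, which does fail when
// no replica set is configured (as with the dummy provider's mongod).
func maybeSetWriteMajority(session *mgo.Session) {
	if _, err := replicaset.CurrentConfig(session); err != nil {
		return // no replica set: keep the default write concern
	}
	session.SetSafe(&mgo.Safe{WMode: "majority"})
}
```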
<voidspace> is there any way to create a "work in progress" pull request on github?
<voidspace> so I can view the diff
<perrito666> voidspace: sort of
<voidspace> perrito666: explain :-)
<perrito666> voidspace: you can click on pull request button
<voidspace> perrito666: right
<perrito666> and before hitting commit
<voidspace> perrito666: I'd like one I can share
<perrito666> you have the diff tab available
<voidspace> perrito666: clicking "compare & pull request" shows the diff immediately
<voidspace> perrito666: but there's no "diff tab" anywhere I can see ?
<perrito666> mmm hold I think I remember such feature
<voidspace> another chrome extension perhaps
<voidspace> perrito666: once you've created the pull request there's a "files changed" tab that shows you the diff
<voidspace> natefinch: so the actual change is quite simple
<voidspace> https://github.com/juju/juju/pull/17/files
<voidspace> natefinch: testing it on the other hand...
<voidspace> a problem for tomorrow morning I think
<voidspace> EOD
<perrito666> voidspace:
<natefinch> perrito666: the files changed tab has the diffs
<perrito666> ok, this will sound stupid but I know only how to do this via url
<perrito666> I don't know how to get there via the gui
<perrito666> :p
<natefinch> perrito666: https://github.com/juju/juju/pull/17/files
<natefinch> their "tabs" are pretty subtle
<voidspace> yeah, but you only get that by creating a pull request
<voidspace> there's no way to specify that it's still a work in progress except by comment (which is what I've done)
<voidspace> (well - description)
<perrito666> I found it
<perrito666> voidspace: on your repo
 * natefinch should learn to read the history before jumping into a conversation
<perrito666> above the list of files
<perrito666> there is a greyed link, that says compare
<voidspace> ah yes!
<voidspace> perrito666: thank you
<gQuigs> I'm just curious and couldn't find backstory on the decision to move juju-core from lp to github, anyone know where it took place? (I already tried looking through the ml)
<voidspace> https://github.com/voidspace/juju/compare/write-majority
<natefinch> gQuigs: basically an order from On High™ ... the thought is to make the project more visible, and lower the barrier of entry for external contributors, since most people are more familiar with git and github than bzr and launchpad
<perrito666> https://github.com/voidspace/juju/compare/juju:master...voidspace:backout-replicaset-changes
<voidspace> perrito666: yep, thanks - that's helpful
<perrito666> we might want to ship that little piece of advice in the doc
<gQuigs> natefinch: oh.. understood..  was adding git support to lp looked at? - or was the complete package important?
<voidspace> right, really EOD
<voidspace> g'night all
<natefinch> gQuigs: it was more the whole package and the fact that more eyes are on github, and we wanted more eyes on juju development
<gQuigs> natefinch: got it, thanks for explaining
<jam> alexisb: good job keeping it right at 30min
<alexisb> there are some things that people need to go off and investigate
<alexisb> so it made since to keep is to the time
<perrito666> alexisb: what?
<TheMue> jam: just seen you mentioned me. is network documentation prioritized higher than the API?
<sinzui> gQuigs, We were hearing that Juju wasn't a public project. Of course it is, but people didn't see it on github, they couldn't fork it nor could they make a pull request. github ~= "open source and looking for new contributors"
<alexisb> perrito666, it made sense to me, can't you just read what I was thinking and not what I actually typed ;)
<lifeless> jam: heh interesting :(.
<sinzui> jam, lifeless All the senior people are ex-bzr and LP. I think jcastro is the only exception, being 6 years here, but from Ubuntu
<jcastro> jam is older, we established that at the sprint
<jcastro> sorry, I mean "more experienced"
<perrito666> it's pretty clear that jam looks younger than all of us, so hopefully he is
<perrito666> bbl
<bodie_> in order to test an unexported method, I should just use the same package name as the tested file, right?
<bodie_> (is there a better word than file to use in this context?)
<natefinch> bodie_: that's correct
<jcw4> bodie_: your test package has to have the same package name as the method's package
<natefinch> bodie_: there are only two places you can put tests, either the same package as the rest of the directory, or package_test
<natefinch> bodie_: we usually call them internal and external tests or whitebox and blackbox
<bodie_> I see
<bodie_> and there's no special name for a file besides "file" since multiple types and so forth could be in it
<natefinch> bodie_: personally, I just put all my tests in the same package as the rest of the directory. there seems to be no actual reason to use package_test unless you want to have example code that looks like what external packages would use (in other words, functions and types namespaced by the package name)
<bodie_> I see
<natefinch> bodie_: other people like to use blackbox tests as much as possible to ensure they're testing the API rather than the implementation, which I kind of agree with... except that you can still do that in internal tests, where you have a lot more flexibility to mock stuff out and do more focused unit tests
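A sketch of the two placements natefinch describes, with illustrative package and function names (assume a package mypkg exporting Double and keeping double unexported):

```go
// mypkg/internal_test.go -- whitebox: same package, so the test can
// reach unexported names like double directly.
package mypkg

import "testing"

func TestDoubleWhitebox(t *testing.T) {
	if got := double(2); got != 4 {
		t.Fatalf("double(2) = %d, want 4", got)
	}
}
```

```go
// mypkg/external_test.go -- blackbox: package mypkg_test lives in the
// same directory but sees only the exported API, namespaced exactly
// as external callers would see it.
package mypkg_test

import (
	"testing"

	"example.com/mypkg"
)

func TestDoubleBlackbox(t *testing.T) {
	if got := mypkg.Double(2); got != 4 {
		t.Fatalf("mypkg.Double(2) = %d, want 4", got)
	}
}
```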
<hazmat> hey is there a document that maps gemstones to humans ?
<natefinch> lol
<natefinch> hazmat: https://directory.canonical.com/list/team/
<natefinch> hazmat: they're actual teams :)
<jcw4> natefinch: 403 :(
<jcw4> :)
<natefinch> jcw4: haha
<hazmat> natefinch, yeah.. see how most of those are instantly recognizable ...
<natefinch> hazmat: maybe it'd be better if we prefixed them with "Juju Core - "
<natefinch> hazmat: we just didn't want "Team 1" "Team 2"
<hazmat> natefinch, yeah.. that would help. at least they'd group together..
<natefinch> alexisb: ^^
<natefinch> not sure what the process is to get them changed.  I agree, it would be nice if they were grouped together, and more descriptive
<natefinch> I actually hadn't expected them to be made into official HR teams
<natefinch> but I guess since HR needs to know who's managing who
<alexisb> natefinch, I can take care of that
<waigani> morning all
<menn0> moring waigani
<waigani> hey menn0 :)
 * menn0 sighs
<menn0> morning even
 * menn0 has had a bad night's sleep (kids) but AC/DC is helping
<wallyworld__> fwereade: sorry, network issues, will reboot and be there soon
<wallyworld__> fwereade: thumper sorry, g+ has crapped itself
<wallyworld__> trying to get back in
<wallyworld_> alexisb: be there in a sec, just finishing another meeting
<alexisb> me too
<alexisb> wallyworld_, ^^
<wallyworld_> alexisb: sorry, here now
<alexisb> wallyworld_, coming
<alexisb> wallyworld_, sorry
<alexisb> still on another call
 * thumper goes to walk the dog and think prior to standup
<alexisb> wallyworld_, I see you typing
<alexisb> but you cant hear me
<alexisb> wallyworld_, I lost you
<waigani> so already I've: tried to push to upstream and been working in the launchpad repo when I thought I was in github
<waigani> sigh...
 * thumper heads out for lunch
<menn0> waigani: I moved my old bzr repo well out of the way to avoid exactly that
<menn0> waigani: and Tim's tip of setting the push url for upstream to something that doesn't exist is a good idea
<waigani> menn0: yep, luckily I had done that already, so no harm done
<menn0> ok sweet
<waigani> menn0: I'm now thinking about how to propose three branches, each one relying on the one before it
<menn0> I would do them one at a time
<menn0> landing one before proposing the next
<waigani> oh right, yeah
<menn0> that way if there's post-review changes for an earlier branch you can adjust the following branch to suit
<waigani> that will have to do for now
<menn0> otherwise it just gets confusing
<waigani> bzr was a little smarter, but this way is simple and clean and will do for now
#juju-dev 2014-06-05
<menn0> waigani: I've never used it but Stacked Git would probably help with this kind of thing if it becomes something you want to do often. http://www.procode.org/stgit/
<waigani> menn0: ah, thanks for the tip
<waigani> what is the difference between state/api/params/internal.go and params.go?
<wallyworld__> the former is for when code running within the state server makes an api call, vs clients
<waigani> wallyworld__: thanks :)
<wallyworld__> np
<axw> wallyworld__: where should I put backlog cards on Leankit now?
<wallyworld__> axw: let me check
<axw> onyx has a section
<thumper> my recommendation is "in the backlog"
<axw> thumper: there is no one backlog. there's "onyx", "features", "core 1", "defered", and "team nz/au"
<axw> and now tanzanite :)
<thumper> right
<thumper> we can kill "team nz/au"
<thumper> move all them to "defered"
<thumper> I guess
<wallyworld__> axw: in the new tanzanite backlog lane i just created
<axw> wallyworld__: thanks
<wallyworld__> axw: i hadn't had a chance to reply yet - i wanted to shutdown the github issue tracking as per the original plan. i don't think having 2 issue trackers is sensible, especially given we are using lp for scoping work for milestones etc
<thumper> menn0, waigani: comments left on respective pull requests
<waigani> thumper: thanks. I'm just reviewing my user info branch before pushing.
<axw> wallyworld__: I don't have a strong opinion. You can do milestones in GitHub too though.
<wallyworld__> sure, but milestone in lp then relate to ppas etc and all that other good stuff
<wallyworld__> and our release processes are built around that
<axw> yeah, it would be a shame to lose the integration bits
<wallyworld__> i just wanted to check in with you before replying to the thread
<axw> it doesn't really bother me. there are pros and cons to both options. I'd kinda like to wait and see, but it may just be a hassle to maintain both
<axw> if closed, we need a prominent message somewhere pointing people to launchpad
<menn0> thumper: thanks for the feedback. I've responded and will action the points you've raised. I'll probably wait until fwereade has seen these changes before landing though.
<thumper> ok
<axw> wallyworld__: FYI, I'm going to change the merge job to use the m3.xlarge instance type
<wallyworld__> axw: hold off a sec
<axw> ok
<axw> wallyworld__: rebasing and force-pushing does *not* lose comments on GitHub. I'm sure it used to, but maybe I was just on crack
<menn0> thumper: let me know when/if you have a moment to talk about schema migrations
<wallyworld__> axw: https://plus.google.com/hangouts/_/grsvduf57lzffsqepsrfr6mdsqa
<thumper> menn0: sure, just trying to submit a few expenses
<menn0> thumper: no rush
<axw> wallyworld__: just a sec
<wallyworld__> axw: oh balls, just failed :-(
<axw> bugger :/
<wallyworld__> still, quick enough that we should try parallelisation first
<thumper> menn0: how about now?
<axw> wallyworld__: so it adds ~15 mins at worst?
<menn0> thumper: now is good
<wallyworld__> yeah or even a a bit less
<menn0> thumper: https://plus.google.com/hangouts/_/gujkllcsyr6ughkd5siy5jkdeya?authuser=1&hl=en-GB
<axw> wallyworld__: seems fine. I was going to look into creating a fs lock for mongo tests. do you think that'd be worthwhile?
<axw> wallyworld__: then we can run a bunch in parallel, but only those mgo ones will serialise
<wallyworld__> can't hurt, worth a spike on
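A sketch of the fs lock axw mentions, using a plain Linux advisory flock; the path and names are illustrative.

```go
package mongolock

import (
	"os"
	"path/filepath"
	"syscall"
)

// Lock serialises mongo-backed suites across parallel test processes;
// everything else can still run concurrently.
type Lock struct{ f *os.File }

// Acquire blocks until this process holds the shared mongo test lock.
func Acquire() (*Lock, error) {
	path := filepath.Join(os.TempDir(), "juju-mongo-test.lock")
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		f.Close()
		return nil, err
	}
	return &Lock{f}, nil
}

// Release drops the lock and closes the file.
func (l *Lock) Release() error {
	defer l.f.Close()
	return syscall.Flock(int(l.f.Fd()), syscall.LOCK_UN)
}
```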
<wallyworld__> this was the failure btw
<axw> will see if I can whip something up
<wallyworld__> FAIL: container_initialisation_test.go:1: ProvisionerSuite.TearDownTest
<wallyworld__> /home/ubuntu/juju-core_1.19.4/src/github.com/juju/juju/juju/testing/conn.go:135:
<wallyworld__>     c.Check(err, gc.IsNil)
<wallyworld__> ... value *errors.errorString = &errors.errorString{s:"error receiving message: write tcp 127.0.0.1:45210: broken pipe"} ("error receiving message: write tcp 127.0.0.1:45210: broken pipe")
<wallyworld__> ----------------------------------------------------------------------
<wallyworld__> PANIC: provisioner_test.go:407: ProvisionerSuite.TestProvisionerSetsErrorStatusWhenNoToolsAreAvailable
<axw> mmk
<wallyworld__> looks like a race tearing down a test
<wallyworld__> anyways, i'll change the job
<axw> hmm, that's an error closing the API server
<axw> API client event
<wallyworld__> ok, i didn't look past the error just yet
<wallyworld__> axw: my code is out of date, so the line number doesn't match, but looks like we are in a cleanup and can't close the api server. i wonder if it had already been shut down
<axw> yes that is the line
<axw> not sure...
<wallyworld__> sure seems like a potential race
<axw> wallyworld__: it's the client that can't be closed. apparently websockets require you to send data to close the connection. I guess the server has already gone away
<wallyworld__> yeah
<axw> wallyworld__: so... I think the issue is that the client is being closed in an AddCleanup, whereas the server is closed by TearDownTest (prior to running other cleanups)
<axw> I can take a look at fixing if you like
<wallyworld__> ok, that would be great
<wallyworld__> axw: maybe also have a quick check of the code to see if there's other places where something similar is done, so we try and attack them all
<axw> wallyworld__: is there a bug for this failure?
<wallyworld__> not yet that i know of
<axw> k
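A toy model of the ordering bug axw identifies, with illustrative names: the client close must happen while the server is still up, so it belongs in TearDownTest rather than in a later cleanup.

```go
package teardownsketch

import "errors"

var errServerGone = errors.New("error receiving message: write tcp: broken pipe")

// Toy stand-ins for the API server and client.
type apiServer struct{ up bool }
type apiClient struct{ srv *apiServer }

func (s *apiServer) Stop() { s.up = false }

// Close must write to the connection (websockets send a close frame),
// so it fails once the server side has gone away.
func (c *apiClient) Close() error {
	if !c.srv.up {
		return errServerGone
	}
	return nil
}

// TearDownTest sketches the fix: close the client first, while the
// server is still up, instead of leaving the close to a cleanup that
// runs after the server has stopped.
func TearDownTest(client *apiClient, server *apiServer) error {
	if err := client.Close(); err != nil {
		return err
	}
	server.Stop()
	return nil
}
```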
<wallyworld__> axw: +1, now let's see how the new landing job goes
<axw> thanks
<axw> I really like that we can see the bot's progress now
<wallyworld__> axw: \o/
<wallyworld__> worked
<wallyworld__> +1 to seeing progress
<axw> winning
<axw> nice and quick
<sebas5384> is there a way to upgrade a bundle?
<rick_h_> sebas5384: no, not currently.
<axw> wallyworld__: maybe we should stick with ephemeral instances then?
<sebas5384> oh rick_h_ thanks, but there's plans for it?
<wallyworld__> well, testing is down to 11 mins. building the instance takes about 5
<wallyworld__> so 16mins vs 11
<wallyworld__> worth a little more effort i think
<rick_h_> sebas5384: it's been talked about but not currently on the list of stuff being worked on.
<axw> s'pose so.
<wallyworld__> but not much more
<sebas5384> rick_h_: ahh ok then, I imagine it isn't simple
<sebas5384> there are so many complicated situations
<sebas5384> like adding a service, and then removing it, and thats the same for relations, etc...
<thumper> davecheney: seems like your deps file is missing a tab juju-ci.vapour.ws:8080/job/github-merge-juju/17/?
<waigani> thumper: ping
<thumper> hai
<waigani> thumper: I don't understand your review comment "Both user and stateServers for the environment need to read the jenv file. Perhaps consider doing that before?"
<thumper> waigani: well, to get the user and the api end points, you need to read the jenv file
<thumper> you are reading the jenv file twice
<thumper> once in each method
<waigani> oooh
<waigani> got ya
<thumper> (behind the scenes)
<waigani> I'll reuse it
 * thumper nods
<thumper> that's what I was meaning
<thumper> waigani: what happens if there is no jenv file?
<thumper> because that environment isn't bootstrapped?
<thumper> tests for that>?
<waigani> thumper: yep
<thumper> cool
<waigani> thumper: so i discussed this with menn0, and we wanted to support the scenario of switching to an bootstrapped environ - with no jenv file
<thumper> waigani: you can't
<waigani> thumper: in that case my code is wrong :(
<thumper> waigani: if the environment isn't bootstrapped, there is no jenv file
<waigani> as it currently suppresses the file-not-found error
<thumper> if there is a jenv file - it should be bootstrapped
<menn0> I don't recall talking about that :-/
<thumper> it is possible that you have a jenv file that points to an environment that someone else has destroyed
<menn0> actually now I do
<waigani> so I can remove the error suppression and fail when trying to switch to an env with no jenv?
<menn0> it was so you could do "juju switch foo" and then "juju bootstrap"
<waigani> menn0: yep, that's right
<menn0> seems like a reasonable thing to want to do, although not super critical
<waigani> so the current code allows that
<waigani> i mean my branch
<waigani> thumper: do we want to support this usecase: "juju switch foo" and then "juju bootstrap"
<thumper> yes
<thumper> most definitely
<thumper> I do it all the time
<waigani> in which case we have to switch to an env without a jenv
<thumper> yes, but that isn't what you said
<waigani> what did I say?
<thumper> you said "we wanted to support the scenario of switching to an bootstrapped environ - with no jenv file"
<waigani> oh shit
<waigani> UNbootstrapped
<thumper> right
<waigani> typo, sorry
<thumper> that makes more sense
<waigani> right, all on the same page. I've covered it in the code and have tests.
<jcw4> are we using axw's gocov tool on juju regularly, or some other coverage tool?
<davecheney> jcw4: go tool cover
<axw> gocov is mostly redundant now
<jcw4> davecheney, axw I see... does go tool cover have a report function too?  I need to go investigate
<jcw4> (I like gocov test | gocov report)
<axw> jcw4: only as HTML, AFAIK
<jcw4> axw: ah.  I guess lynx or something would be do-able on the command line
<axw> jcw4: gocov just uses the builtin coverage analysis now anyway, so if you prefer that then there's no problem using it
<davecheney> hold up
<davecheney> http://dave.cheney.net/2013/11/14/more-simple-test-coverage-in-go-1-2
<axw> davecheney: yeah, it's not quite the same. "gocov report" gives you a breakdown by function
<jcw4> davecheney: thanks!  I think that will be useful too.  I'll probably end up using both
<axw> it would be nice to have regular coverage reports
<jcw4> it's not in the juju ci pipeline?
<axw> nope
<jcw4> hmm
<jcw4> I could even see it being tuned and added as a pre-push hook...
<jcw4> if you don't mind waiting forever
<jcw4> (metaphorically speaking)
<axw> could do, I think it would be fine in the CI tho
<axw> I prefer to use it just as a tool to guide testing, not to gate changes
<jcw4> axw:  yeah, that makes sense.
<jcw4> otoh, 0% coverage is a pretty big red flag...
<davecheney> jcw4: go test -cover github.com/juju/juju/...
<davecheney> will give you a first aproximation
<davecheney> make sure you have the latest cover tool
<jcw4> k
<davecheney> there was a bug with external tests that showed 0 when there were no internal tests
<jcw4> it would be interesting to develop a hook that could run coverage tests only on modified files
<jcw4> and fail if less than some threshold, say 50%
<davecheney> jcw4: sounds like a nice pipe dream
<jcw4> :-D
<jcw4> yep
<davecheney> there are more serious issues atm I feel
<davecheney> lack of mocks for the state server is the big one for me
<davecheney> the fact the tests take 30 minutes to run is fucked
<jcw4> and that is mostly because of mongo dependencies?
<davecheney> and they do this because we start one mongo per test
<jcw4> yeah
<davecheney> 90% of the current test failures are when mongo just fails to start
<jcw4> what's interesting to me is that the mongo api seems so easy to mock...
<jcw4> are there subtleties I'm missing?
<davecheney> nope
<davecheney> an in memory mongo is a map and a lock
<jcw4> hmm.
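Spelling out davecheney's "a map and a lock" -- a toy in-memory document store shaped vaguely like a mongo collection. Obviously a real in-memory mongo needs far more than this, and all names are illustrative.

```go
package memstore

import "sync"

// Store maps collection -> document id -> document.
type Store struct {
	mu   sync.Mutex
	data map[string]map[string]interface{}
}

func New() *Store {
	return &Store{data: make(map[string]map[string]interface{})}
}

// Insert adds or replaces a document.
func (s *Store) Insert(collection, id string, doc interface{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	c := s.data[collection]
	if c == nil {
		c = make(map[string]interface{})
		s.data[collection] = c
	}
	c[id] = doc
}

// FindId fetches a document by id; ok is false if it doesn't exist.
func (s *Store) FindId(collection, id string) (doc interface{}, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	doc, ok = s.data[collection][id]
	return doc, ok
}
```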
<jam1> morning all, I had forgotten I had closed my IRC window. I hope everyone is doing well.
<vladk> jam1: morning
<lifeless> jam1: you're still in the middle east I guess?
<jam1> lifeless: yeah, in Dubai, UTC +4
<TheMue> morning
<jam1> TheMue: morning
<jam1> I think I have the outline for API versioning up for you
<jam1> fwereade: it would be good if you can give it a look over as well, to make sure my plotted trajectory is sane.
<fwereade> jam1, great, tyvm
<TheMue> jam1: great, finished the planning doc yesterday evening and now waiting for Nick to get it online
<jam1> TheMue: fwiw we'll probably want the API versioning document to become something in trunk developer docs as an .md file, I suppose I need to start reading up on markdown syntax
<TheMue> jam1: I would like all developer docs, as opposed to planning docs, in the standard repo too, yes
<TheMue> jam1: makes more sense to keep them near and also always matching the releases
<TheMue> jam1: thankfully md is pretty simple, and github renders it directly when somebody browses our repo
<jam1> api versioning is potentially a dual doc, since for people that want to consume Juju via the API we do want them to know how we plan to interact with them.
<dimitern> fwereade, jam1, hey, I still need an approval on https://github.com/juju/juju/pull/13 please!
<jam1> dimitern: I'm surprised to see the Disabled stuff come in after the review, I don't think that was part of the discussion so far, did I just miss that part?
<TheMue> jam1: will talk to Nick about a process to also publish released developer docs out of our repo
<dimitern> jam1, it was part of the discussions past :)
<dimitern> jam1, i mean, we talked about configuring only the nics the user wanted, not all possible ones
<dimitern> jam1, i did implement "find all nics/nets and configure them", but now as we're moving to a more dynamic model the networker can do that later eventually
<voidspace> morning all
<fwereade> dimitern, reviewed
<fwereade> dimitern, although, thinking about the disabled stuff
<fwereade> dimitern, I'm not sure it's the right time for that -- we should be able to guarantee that the ^foo networks won't be possible to configure already, so that bit's handled
<fwereade> dimitern, but without the networker worker to handle things, the deployment of units that use a constrained-but-disabled network breaks
<fwereade> dimitern, sane? or do we expect a functional networker so soon that it makes no odds? I'm a bit sceptical there, 1.20 is coming upon us apace
<dimitern> fwereade, sorry, i don't quite follow - constrained-but-disabled network?
<fwereade> dimitern, mentioned in constraints, so it'll be noted as an available network, but not specified with --networks and therefore not enabled
<fwereade> dimitern, without a networker to enable it, we're better off leaving them all configured until we have the machinery in place to handle switching them on and off as required
<fwereade> dimitern, otherwise we'll end up deploying units to machines that can't actually serve them
<rogpeppe> hmm, it looks as if the juju commit history was not filtered to remove the non-trunk commits. is that right?
<rogpeppe> mgz: ^
<rogpeppe> the unfortunate implication of that seems to be that when using git filter-branch to factor out juju subdirs into their own repos while preserving history, none of the trunk commits (the ones with the actual useful messages) get preserved.
<dimitern> fwereade, until we have the networker, we'll still use cloudinit for the setup
<dimitern> fwereade, and in my PR i changed the logic to only bring up what the user wanted, leaving the rest disabled
<fwereade> dimitern, yeah, and if cloudinit leaves some networks disabled then the model will happily assign units that need disabled networks to that machine, and they won't work
<dimitern> fwereade, that's what we agreed upon in austin, isn't it?
<dimitern> fwereade, how can that happen?
<fwereade> dimitern, yeah -- but until we have the networker to switch on those networks when we need them, having them disabled is more broken not less broken
<fwereade> dimitern, this machine records that it has access to foo, bar, baz
<fwereade> dimitern, and it's started with --networks=foo
<dimitern> fwereade, yes, this means we enable only foo
<fwereade> dimitern, the unit assignment logic should consider it a viable place to deploy a service that uses bar
<rogpeppe> fwereade: how much do we care about commit history in git?
<dimitern> fwereade, not yet anyway
<fwereade> dimitern, expecting that the networker will set up the appropriate interface
<dimitern> fwereade, the unit assignment logic + networks currently considers only empty machines
<dimitern> fwereade, there's no way to break things automatically ;)
<fwereade> dimitern, right -- you can start a machine with --networks, and constraints, and the unit assignment logic should consider that clean machine a viable place to put a unit
<fwereade> dimitern, and it should not be worrying about the machine's --networks, or its constraints, only on the networks we record as potentially accessible
<dimitern> fwereade, a new machine, not just a clean one
<fwereade> dimitern, yes: it's new, so it's clean and empty, so it's in the running
<voidspace> PR for someone: https://github.com/juju/juju/pull/24
<dimitern> fwereade, what you're describing involves doing add-machine --networks ... --constraints networks=...
<fwereade> dimitern, and if it has recorded that net-bar *could* be set up, the unit assignment should consider it a viable location for a unit of a service that requires net-bar
<fwereade> dimitern, no, it doesn't require any more than add-machine --networks
<dimitern> fwereade, what networks *can* be configured on a machine comes from the interfaces discovered at provisioning time
<fwereade> dimitern, right
<fwereade> dimitern, create a machine with --networks
<dimitern> fwereade, the disabled flag just signals not to enable them, but we still have the info in state, as far as unit assignment logic is concerned
<fwereade> dimitern, that machine gets provisioned
<fwereade> dimitern, it's recorded as capable of starting a big list of interfaces, but it only actually starts one of them
<fwereade> dimitern, now, much later, we create a unit
<fwereade> dimitern, that unit's service specified --networks=bar
<fwereade> dimitern, correct unit assignment logic says "that machine *could* configure net-bar, so we'll use it"
<fwereade> dimitern, unit goes on machine, networker hasn't landed yet, unit can't use network
<dimitern> fwereade, you're correct about that, but it's not implemented like that yet
<dimitern> fwereade, the networker is coming soon anyway
<dimitern> fwereade, unit assignment logic does not consider what networks are available on the machine yet
<fwereade> dimitern, is there any doubt as to how it should, or will, be implemented? the point is to build a component that continues to work nicely even given the annoying way different streams of work progress at different times and in different ways
<fwereade> dimitern, it should consider it, and it should happen soon, it's all part of the model
<dimitern> fwereade, to do that, we need to take into account what network interfaces are there; constraints are not important, just requested networks (but they will be configured anyway at provisioning time)
<dimitern> fwereade, yes, it's part of the model, but we haven't got that far
<jam1> dimitern: fwereade: so my personal take on it… we shouldn't touch cloud-init, it should ignore anything about disabled, instead we should only have the networker set things up and have it pay attention to a new flag.
<jam1> then, while we are still on just cloud-init without a good worker, it still works
<jam1> and when we are using the worker instead, we get the better behavior
<fwereade> jam1, yes, exactly
<jam1> I think cloud-init only starting some networks is going to just be a point of temporary brokeness
<jam1> brokenness ?
<dimitern> fwereade, ok, so we're temporarily departing from what was important in austin - i.e. do what the user says about networks and no more, until we have the networker in place
<dimitern> fwereade, fair enough, but I still want to keep the Disabled flag
<jam1> I ain't messing wit no broke broke, ya na wha I mean
<jam1> dimitern: so for me Disabled feels a bit weird, rather than having it just not be in Enabled, do we know all the potential networks that we aren't a part of yet
<dimitern> fwereade, but if it wasn't clear, not disabling networks in cloudinit won't make the unit assignment logic "just work" :) until we change it to consider NICs
<dimitern> jam1, maas knows all networks a machine can be on, and we expect the provider to tell us that
<bigjools> potentially knows :)
<dimitern> jam1, so out of these, the user picks what to enable, and we can decide what we can deploy there
<dimitern> hey bigjools  ;)
<bigjools> ahoy dimitern
<fwereade> dimitern, the unit assignment logic *is* a pretty critical part of the model -- we have to apply that even for manual placement attempts
<dimitern> ofc, maas knows if you told it
<dimitern> fwereade, i agree, that's why i'm thinking of working on that as soon as we have the networker running as an MVP
<fwereade> dimitern, gating that work on the networker is not optimal imo
<dimitern> fwereade, until then, nothing will break that used to work before, but networking stuff might
<dimitern> fwereade, well, until we have a way to configure additional nics on a running machine
<dimitern> fwereade, we can only assume the networks will work or not
<dimitern> fwereade, you're saying let's make them work by default, so we can wire the assignment logic?
<fwereade> dimitern, yes
<dimitern> fwereade, got you then :) that's possible before the networker then, sorry
<fwereade> dimitern, cool, sorry I didn't explain clearly
<dimitern> fwereade, you and me both i guess :)
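The compromise jam proposes can be pictured like this -- a toy sketch with illustrative names, in which cloud-init ignores the Disabled flag entirely and only the future networker worker consults it:

```go
package netsketch

// Network is a toy version of the state record under discussion; only
// Name and the Disabled flag matter here.
type Network struct {
	Name     string
	Disabled bool
}

// cloudInitInterfaces ignores Disabled, so today's behaviour (bring
// every known NIC up) is unchanged until the networker lands.
func cloudInitInterfaces(nets []Network) []string {
	var up []string
	for _, n := range nets {
		up = append(up, n.Name)
	}
	return up
}

// networkerInterfaces is what the future worker would use: it honours
// the flag, and can switch interfaces on and off after provisioning.
func networkerInterfaces(nets []Network) []string {
	var up []string
	for _, n := range nets {
		if !n.Disabled {
			up = append(up, n.Name)
		}
	}
	return up
}
```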
<dimitern> fwereade, re network names validation as part of constraints validation - i'm aware of the prechecker and similar things that are supposed to be useful for constraints validation, but it wasn't obvious how to make it work
<dimitern> fwereade, need to look some more into it, or possibly do a follow-up on that, as i suspect it will blow up the PR even more
<jam1> dimitern: anyway, from a meta-perspective, adding this to the code during the middle of having it reviewed for other stuff makes it hard to see if the original questions were addressed
<dimitern> jam1, hence my request to do it in a follow-up :)
<jam1> dimitern: I meant adding the Disabled stuff, but yes, the other would be true as well.
<dimitern> fwereade, just to check something with you re constraints propagation
<fwereade> dimitern, sure
 * axw wishes you could batch comments on GitHub
<axw> so noisy
<dimitern> fwereade, so if i have "networks=foo,^bar" at environ level, all services should inherit them (unless overridden at service level), and all units and machines should also inherit them (unless overridden)
<dimitern> axw, oh me too!
<voidspace> axw: thanks
<axw> voidspace: no worries
<axw> voidspace: I struggled to find an appropriate emoji to convey my feelings about this issue ;)
<voidspace> axw: I thought you did very well...
<dimitern> so :shit: it is :)
<dimitern> fwereade, the same should be for service-level network constraints going down to units and machines
<dimitern> fwereade, and i'm talking about new entities for each kind, not existing ones which might already have constraints
<axw> voidspace: does that job (local-deploy-precise-amd64) just run directly on the jenkins machine?
<axw> looks like it from the jenkins output
<voidspace> axw: yes, it runs on the master
<fwereade> dimitern, sort of -- a new machine's constraints should be immediately combined with environment constraints, and the result used to provision it
<axw> mk
<voidspace> axw: I'm not sure what specifically triggers a new job - it sometimes seems to take a couple of hours
<voidspace> axw: but I'm not aware of it requiring manual intervention
<voidspace> axw: I don't [believe I] have a login on that jenkins, so I can't manually start one
<voidspace> if one hasn't run soon-ish I'll ping sinzui or abentley
<fwereade> dimitern, a new unit's constraints should be taken by snapshotting the combination of the current service and environment constraints
<dimitern> fwereade, ok that's where i don't get it
<fwereade> dimitern, when creating a new machine for that new unit, it should take exactly the constraints just calculated for its unit
<dimitern> fwereade, " a new machine's constraints should be immediately combined with environment constraints" - why not the service as well?
<dimitern> fwereade, i.e. when deploying the first unit of a service + netCons in an env + netCons, the new machine's cons = envCons+svcCons, no?
<fwereade> dimitern, because when I say a new machine I mean the result of add-machine, not of deploy
<axw> voidspace: I *do*, but only because I've been doing github things. I think it's just on a timer
<dimitern> fwereade, or, alternatively machineCons = svcCons (assuming svcCons += envCons)
<axw> I'm wary of kicking one off in case it buggers up the scheduling
<fwereade> dimitern, when creating a machine for a specific unit, there aren't any other machine constraints -- we just take them from the unit
<dimitern> fwereade, right, so if provisioning a machine just like that, not for deploying a unit
<dimitern> fwereade, use the envCons, if a deployment is involved, use svcCons
<fwereade> dimitern, yeah -- when provisioning a machine *not* for a unit, there's no service in the picture, just modify envCons with whatever was specified for that specific machine
<dimitern> fwereade, ok, i think it's clear enough for me now, i can follow what's happening in that test
<fwereade> dimitern, cool
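The propagation rules fwereade lays out can be summarised in a toy sketch; Cons here is a stand-in for the real constraints.Value, which has many more fields, and all names are illustrative.

```go
package conssketch

// Cons is a toy constraints value; a nil field means "not specified
// at this level".
type Cons struct {
	Networks *[]string
}

// combine returns override where set, falling back to base.
func combine(base, override Cons) Cons {
	out := base
	if override.Networks != nil {
		out.Networks = override.Networks
	}
	return out
}

// A new unit snapshots service-over-environment constraints at
// creation time.
func unitCons(env, svc Cons) Cons { return combine(env, svc) }

// A machine created *for* a unit copies that unit's constraints
// verbatim; no other machine constraints are in the picture.
func machineForUnit(unit Cons) Cons { return unit }

// A bare add-machine has no service involved: just environment
// constraints modified by whatever was specified for the machine.
func bareMachineCons(env, machine Cons) Cons { return combine(env, machine) }
```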
<voidspace> axw: ok
<jam1> fwereade: dimitern: I saw that you at least opened up the doc I sent about API versioning, any comments, or just haven't gotten to reading it yet
<fwereade> jam1, sorry, it's in my tab queue
<jam1> np
<jam1> I fully understand
<jam1> fwereade: since you're around, shall we try to decide what actual URLs we want to connect to?
<jam1> https://host:port/ENVUUID/api
<jam1> https://host:port/environment-ENVUUID/api
<jam1> https://host:port/environment/ENVUUID/api
<jam1> concretely: "https://host:port/72e9779c-0c82-4d52-83f9-50bb594941e5/api"
<jam1> vs
<jam1> https://localhost:17070/environment-72e9779c-0c82-4d52-83f9-50bb594941e5/api
<jam1> https://localhost:17070/environment/72e9779c-0c82-4d52-83f9-50bb594941e5/api
 * c7z votes first or third
 * TheMue too
<TheMue> jam1: to be more special, more for the third than the first
<jam1> its a SMOP, but we should decide before it lands
<jam1> TheMue: I think you mean "more specific", but sure
<jam1> to go with
<TheMue> iiirks, eh, yes
<jam1> https://localhost:17070/environment/72e9779c-0c82-4d52-83f9-50bb594941e5/apilog
<jam1> https://localhost:17070/environment/72e9779c-0c82-4d52-83f9-50bb594941e5/api/log
<jam1> ugh
<jam1> stupid thing refuses to let me edit in place
<jam1> https://localhost:17070/environment/72e9779c-0c82-4d52-83f9-50bb594941e5/log
<jam1> https://localhost:17070/environment/72e9779c-0c82-4d52-83f9-50bb594941e5/charms
<jam1> and presumably what URL would we connect to when we are dealing with creating a new environment, /? /environment/ /api?
<c7z> if we'll never need to serve other endpoints, first is better. third gives room for doing non-env specific things from the same server
<axw> also gives us a nice URL for listing environment UUIDs
<axw> https://localhost:17070/environment/
<axw> that is
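For a concrete feel of the third option, a minimal standard-library router; the port, paths, and behaviour are illustrative, not the real apiserver.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	http.HandleFunc("/environment/", func(w http.ResponseWriter, r *http.Request) {
		rest := strings.TrimPrefix(r.URL.Path, "/environment/")
		if rest == "" {
			fmt.Fprintln(w, "list of environment UUIDs would go here")
			return
		}
		// Path shape: /environment/<uuid>/<endpoint>
		parts := strings.SplitN(rest, "/", 2)
		uuid, endpoint := parts[0], ""
		if len(parts) == 2 {
			endpoint = parts[1]
		}
		switch endpoint {
		case "api", "log", "charms":
			fmt.Fprintf(w, "env %s: would serve %s here\n", uuid, endpoint)
		default:
			http.NotFound(w, r)
		}
	})
	log.Fatal(http.ListenAndServe("localhost:17070", nil))
}
```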
<dimitern> jam1, haven't had time to read it, will do
<frankban> mgz: morning, does the CI lander handle changes to dependencies.tsv in the new git world? Or should we keep asking you to update/install packages?
<voidspace> alexisb: ping
<natefinch> voidspace: 3:30am her time right now, probably not awake :)
<voidspace> natefinch: hah, thanks
<voidspace> natefinch: do you know if we still have open headcount?
<voidspace> natefinch: this position is still being advertised:
<voidspace> https://ch.tbe.taleo.net/CH03/ats/careers/requisition.jsp?org=CANONICAL&cws=1&rid=830
<natefinch> voidspace: we have one open spot with at least one person who we are working on giving an offer.. however, there's a couple other related openings in related teams
<voidspace> natefinch: cool
<voidspace> thanks
<axw> mgz: can you please fix https://github.com/juju/juju/pull/8 (or tell me how to)
<axw> mgz: never mind, figured it out. just had to add "Build failed: "
<perrito666> ok bbiab
<voidspace> how do I get a test suite with a replica set?
<voidspace> I think we have some tests that do that
<dimitern> natefinch, voidspace, isn't "juju engineering" the solutions / ci / ecosystem team?
<voidspace> dimitern: heh, I don't know
<dimitern> natefinch, voidspace, (just looking at that open position)
<dimitern> ah, ok
<jam1> dimitern: I think you are thinking "Juju Solutions" ?
<dimitern> jam1, i'm confused with so many vaguely similarly sounding teams :) - the taleo offer there is for juju eng.
<jam1> I thought there was a Juju Engineering team, but I don't see it in directory
<jam1> yeah
<c7z> axw: yeah, that's it. you can also hit rebuild on the job in jenkins
<natefinch> It's pretty bad that people on the Juju team don't understand the job description on our jobs page :/
<natefinch> I'm pretty sure that job that voidspace linked to is for juju core.
<c7z> axw: (sorry, missed ping, I blame cow-orking)
<natefinch> but it goes to alexis, and she's driving a couple other positions as well
<jam1> natefinch: voidspace: the original link https://ch.tbe.taleo.net/CH03/ats/careers/requisition.jsp?org=CANONICAL&cws=1&rid=830 is, indeed, for juju-core, we currently have 2 more slots though I think we're already trying to give a couple people offers
<dimitern> c7z, you're taking pictures of cows? :)
<dimitern> TIL: http://www.urbandictionary.com/define.php?term=orking
<voidspace> jam1: cool. I've had a random-but-seemingly-good person email me. I've sent them that link and told them to apply anyway.
<voidspace> jam1: we have one very good new person starting on Monday.
<jam1> voidspace: IIRC we have a couple more openings in related projects like JaaS, etc. so we still have a few
<voidspace> jam1: do you know which team he will be joining?
<jam1> voidspace: I think it is still being sorted out
<jam1> you mean Eric, right?
<voidspace> yes
<voidspace> he should join natefinch's team...
<natefinch> voidspace: I think he is
<voidspace> awesome :-)
<alexisb> voidspace, pong
<voidspace> alexisb: no problem - all sorted
<voidspace> alexisb: I had someone contact me asking about open positions
<alexisb> voidspace, ack
<voidspace> alexisb: I've told them to just apply...
<jam1> dimitern: since your on-call-reviewer today anyway https://github.com/juju/juju/pull/26
<dimitern> jam1, sure, looking
<jam> just realized, github doesn't do anything with the concept of "prerequisites" does it?
<jam> to break up a bunch of changes into reviewable chunks.
<jam> Is the only chance to rebase and break it into just tiny patches?
<jam> c7z: ^^
<c7z> yeah, I didn't find anything as an analogue
<jam> you could create PRs chaining one into the other one, but it wouldn't show up on the main queue, I think
<c7z> indeed
<dimitern> jam, reviewed
<jam> thx
<wallyworld_> mgz: still in other meeting, will ping soon
<perrito666> rogpeppe: ping_
<rogpeppe> perrito666: pong
<vladk> dimitern: please, take a look https://github.com/juju/juju/pull/16
<dimitern> vladk, looking
<dimitern> vladk, reviewed
<vladk> dimitern: thanks
<vladk> dimitern: how to understand 'd' in the last comment?
<dimitern> vladk, d="delete this line" :)
<dimitern> it's a common shortcut
<vladk> dimitern: when I fix it all, how do I kick the bot?
<jam> vladk: you should push and then comment $$merge$$
<vladk> jam: thx, I wasn't sure I had such rights
<dimitern> vladk, but before that, please rebase most of these changes to squash them into one
<dimitern> vladk, or a few
<sinzui> voidspace, axw. https://docs.google.com/a/canonical.com/document/d/1ZQIJL2YNAYpDHDO4g3kwcq8tHTR8ax6L15xhAXELngc/edit#heading=h.f3goroun9jd1
<sinzui> ^ ci-director looks for new revs, triggers a build of the tarball, which causes a cascade of other jobs to run as resources become available
<sinzui> git-merge is similar. (which is not ideal because branch merging should not be in resource contention with CI)
<dimitern> fwereade, is there a particular reason for state.Unit.constraints to be unexported?
<voidspace> sinzui: thanks
<voidspace> sinzui: I will read and digest
<voidspace> sinzui: so that build is currently "waiting on resources"
<voidspace> given that there are new revisions
<sinzui> oh, CI is idle, I think almost 90% of resources are available
 * sinzui investigates
<voidspace> that's a really useful document though, thanks
 * TheMue is AFK for a moment
<voidspace> it will be useful to actually understand our CI infrastructure
<voidspace> but for now
 * voidspace lunches
 * sinzui fixes script blocking CI
<bodie_> morning all!
<fwereade> dimitern, yes, because they're nobody's business but state's -- it just uses them to put the unit on a suitable machine
<fwereade> dimitern, be it one whose HCs match, or a new one which gets the constraints copied straight over from the unit
<fwereade> bodie_, morning :)
<jam> fwereade: overloading methods that don't match the exposure rules falls pretty much into "spooky action at a distance"/side-effect programming.
<fwereade> jam, yeah, I know, it *is* horrible, it's just that the other one is uglier to look at
<jam> fwereade: yeah
<jam> fwereade: hence why it wasn't a clear "do it this way"
<jam> fwereade: we *can* do the nested embedding until the embedding is too deep and we just copy the whole thing
<fwereade> jam, now I think of it, I realise that when I say "I'm not sure this is horrible or brilliant" I'm actually saying "this is horrible, independent of any brilliance it may also exhibit" ;p
<dimitern> fwereade, yeah, but in this case I need them
<dimitern> fwereade, do you mind exporting it?
<jam> fwereade: :)
<fwereade> dimitern, remind me exactly what for?
<fwereade> dimitern, I might mind a bit
<dimitern> fwereade, for this special case deploy --to lxc:<mid> --constraints networks=good,^bad
<bodie_> fwereade, mgz thought I should ping you about whether we want to have the Actions schema be a member of the Charm documents in State, or have its own collection, e.g. CharmActionSchemas
<bodie_> since altering the document could impact juju upgrades
<fwereade> bodie_, I don't think action schemas vary independently of charms so I'm inclined to keep it in there
<dimitern> fwereade, we create a dirty container to deploy the unit on, but since it's dirty the unit assignment logic is bypassed
<fwereade> bodie_, old charms will just deserialise it out to a nil actions, right?
<bodie_> hmmmm
<fwereade> dimitern, is there any particular reason we can't just hook into the existing logic to create that container pre-loaded with the unit, just as we do when creating a new environ machine?
<bodie_> I think it'll come back empty since we're using omitempty -- we were having an issue with testing where a nonexistent actions.yaml was coming back with an empty Actions (a la Config) but it was being deserialized into a nil value
<fwereade> bodie_, and that's just on the doc, so any Actions() method can just return an empty one
<bodie_> I have to admit I'm not totally clear on how the bson tags work
<bodie_> yeah
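A minimal sketch of the bson behaviour under discussion, assuming a cut-down charm document with made-up field names (the real state charm doc differs): with omitempty on a pointer field, an old document serialised before the field existed unmarshals back with a nil pointer, so an Actions() accessor should substitute an empty value.

    package main

    import (
        "fmt"

        "labix.org/v2/mgo/bson" // the mgo fork juju used at the time
    )

    // charmDoc is a hypothetical cut-down charm document.
    type charmDoc struct {
        URL     string   `bson:"url"`
        Actions *actions `bson:"actions,omitempty"`
    }

    type actions struct {
        Specs map[string]string `bson:"specs"`
    }

    func main() {
        // An "old" doc that predates the Actions field; omitempty
        // drops the nil pointer from the serialised form entirely.
        data, err := bson.Marshal(&charmDoc{URL: "cs:precise/wordpress-1"})
        if err != nil {
            panic(err)
        }
        var doc charmDoc
        if err := bson.Unmarshal(data, &doc); err != nil {
            panic(err)
        }
        // Actions deserialises to nil, so an accessor should return
        // an empty value rather than exposing the nil pointer.
        fmt.Println(doc.Actions == nil) // true
    }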
<fwereade> dimitern, it feels like it's an oversight in the assignment logic in state, which is in danger of slipping out over the boundaries of that package
<bodie_> if anyone has a minute, I could use a sanity check here: https://github.com/binary132/juju/blob/charm-interface-actions-fix-interface-stripper/charm/actions.go#L79
<bodie_> this function LOOKS like it should work correctly, but my tests think otherwise
<axw> sinzui: would it be okay for me to get on the Jenkins machine and manually test the local provider at some stage? I'd like to see if we can get to the bottom of the HA thing, because it's going to be a PITA if we have to carry this code around forever
<bodie_> specifically, I'm getting an inner map[interface{}]interface{}
<sinzui> axw you sure can. You will need to be mindful of http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/
<sinzui> axw, you can disable it if you need
<dimitern> fwereade, well, there are a couple of state.Assign*() methods that can be used there instead, but not quite the same
<axw> sinzui: thanks, I will probably disable it while I'm testing
<dimitern> fwereade, first, in DeployService some trickery is done to "Create the new machine marked as dirty so that  nothing else will grab it before we assign the unit to it"
<axw> (not tonight tho, maybe tomorrow)
<fwereade> dimitern, I guess what I'm asking is "why is there no AssignToNewContainer?"
<dimitern> fwereade, there is AssignToNewMachineOrContainer
<fwereade> dimitern, that sounds like a horrible hack plastered on at a different level to all the rest of the assignment stuff
<dimitern> fwereade, which decides to use a container only if it's specified in the constraints
<fwereade> dimitern, sorry not ATNMOC itself
<fwereade> dimitern, how hard would it be to put that functionality into a new state method?
<fwereade> dimitern, I'm really not keen on exposing internals to do calculations that should be done internally but which we just happen not to
<dimitern> fwereade, I guess instead of that hack we can use AssignToNewMachine there, assuming the dirty flag not being set is ok and also we change ATNM to take containerType and use it if not empty, rather than relying on checking constraints
<fwereade> dimitern, I suspect it needs a bit more analysis -- I suspect that breaking down the methods we expose by the use cases we have at that layer may be more profitable than overloading existing methods
<fwereade> dimitern, wait, deploy --to with --constraints as well?
<fwereade> dimitern, ah deploy --to directive not machine-id
<dimitern> fwereade, deploy --to lxc:XXX and --constraints networks=... specifically
<bodie_> this is weird
<bodie_> it's WORKING here
<bodie_> http://play.golang.org/p/MZ-3jwPZra
<bodie_> so I have to assume there's something wrong either with my test case, or with how I'm handling the values
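For context, the kind of conversion being wrestled with: goyaml unmarshals nested maps as map[interface{}]interface{}, and a recursive helper (names made up here, not bodie_'s actual code) normalises them to map[string]interface{}.

    package main

    import "fmt"

    // cleanUpInterfaceMap recursively converts the
    // map[interface{}]interface{} values a YAML unmarshal produces
    // into map[string]interface{}.
    func cleanUpInterfaceMap(in map[interface{}]interface{}) map[string]interface{} {
        out := make(map[string]interface{})
        for k, v := range in {
            out[fmt.Sprintf("%v", k)] = cleanUpValue(v)
        }
        return out
    }

    func cleanUpValue(v interface{}) interface{} {
        if m, ok := v.(map[interface{}]interface{}); ok {
            return cleanUpInterfaceMap(m)
        }
        return v
    }

    func main() {
        raw := map[interface{}]interface{}{
            "snapshot": map[interface{}]interface{}{"description": "take a snapshot"},
        }
        fmt.Printf("%#v\n", cleanUpInterfaceMap(raw))
    }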
 * perrito666 sees a thread arrive at a length that makes it easier to actually make a pdf and read it on a kindle
<bodie_> oh, cmon... the problem was a minor logical bug in my main loop
<bodie_> sigh
<bodie_> I have yet to learn the universal truth that IT'S ALWAYS SOMETHING DUMB!
<perrito666> sweet, my git vim plugin is useful again
<perrito666> natefinch: voidspace wwitzel3 at this point I am no longer sure if we stand up or not :p
<natefinch> perrito666: let's stand up
<natefinch> ^^ wwitzel3 voidspace
<voidspace> ok
<jam> dimitern: rogpeppe: I updated the proposal, sorry I didn't "lbox propose " the previous one where I had responded to your review questions, it should also be updated now if you want to read direct responses to your original requests
<rogpeppe> jam: thanks
<rogpeppe> jam: i've just responded to one of the responses
<bodie_> okay, so let's say I want to rebase -i my noisy commits into something useful, but not overwrite the history of my existing branch
<jam> rogpeppe: "/" is explicitly special
<bodie_> would I want to checkout -b a new branch, rebase in there, and then cherry-pick the combined commit into the feature branch, for example?
<vladk> jam, fwereade, dimitern: I've got an error at the end of the bot log in lander-merge-result. What should I do?
<vladk> http://juju-ci.vapour.ws:8080/job/github-merge-juju/24/consoleFull
<rogpeppe> jam: really? i couldn't divine that from the docs
<rogpeppe> jam: but if experiment bears you out, that should be ok (apart from the fact that you won't get a JSON error)
<jcw4> bodie_: what is the benefit of keeping your local branch history?
<jam> rogpeppe: https://github.com/juju/juju/pull/26/files#diff-37b85f678233dbf691a5813acd25d6d9R217
<bodie_> https://github.com/EugeneKay/git-jokes/blob/lulz/Jokes.txt
<jam> rogpeppe: I did add an explicit test for connecting to random paths and have it fail
<bodie_> jcw4 -- I have it on my github repo as well
<jam> vladk: I haven't seen that error before. It looks like a bug in the landing script
<dimitern> mgz, c7z, it looks like some script on jujubot fails for vladk http://juju-ci.vapour.ws:8080/job/github-merge-juju/24/consoleFull
<jam> so you could try voting "$$MERGE$$" again
<jcw4> bodie_: is there a specific line that I should read?
<jam> or you could poke c7z and/or wallyworld_ etc
<bodie_> heh, no, just found some funnies at #git
<c7z> dimitern: looking
<jcw4> bodie_: oh, it wasn't an answer to my question... sorry :)
<c7z> vladk: yeah, axw hit that as well, but I'm not sure if he worked out *why* it happened
<c7z> I'll add some debugging and requeue
<bodie_> jcw4, the answer was that my remote has the history, so I don't like overwriting existing server-side truth
<voidspace> so, charm hooks can be "any executable"
<voidspace> so long as those executables run on every platform juju supports, right?
<jcw4> bodie_: I'll have to explore a bit, but I'm not really considering my personal repo on github to be server-side truth at this point...  If no-one is cloning my personal repo it feels more like a sandbox to me.
<bodie_> jcw4, in case mgz doesn't see #jujuskunkworks, I have to get moving in 30-60 minutes here to spend the day with family since they're leaving saturday -- I'll be working sat
<c7z> bodie_: thanks (I'm mgz for today)
<rogpeppe> jam: thanks - it looks like the docs should be a little better there
<jcw4> bodie_: yep, I saw that... hi mgz/c7z
<jcw4> c7z: upgrade from gzip to 7zip?
<c7z> sidegrade :)
<jcw4> hehe
<mattyw> natefinch, rogpeppe <- I'm picking you two at random - congratulations! Can you point me in the direction of someone who can answer some questions about the unit tests in core? Specifically I'd like to know how you use testing.RunCommand to test long-running services (if you do)
<natefinch> mattyw: define long-running?
 * rogpeppe goes to look at testing.RunCommand
 * natefinch does too
<voidspace> natefinch: I can get the session back out of the state with state.MongoSession()
<natefinch> voidspace: that sounds good
<rogpeppe> mattyw: RunCommand isn't for that
<voidspace> natefinch: so I have tests that I think are now testing the right thing
<voidspace> however they panic - but they don't fail!
<rogpeppe> mattyw: it's really for testing short running Commands
<mattyw> rogpeppe, what about things like the machine and unit agents?
<rogpeppe> mattyw: for testing a long running command, i'd usually test it directly (testing.RunCommand really doesn't add much tbh)
<voidspace> natefinch: and now they pass...
<mattyw> rogpeppe, just compile and run the binary via cmd.exec?
<voidspace> natefinch: WMode not set without replSet, it is set with replSet
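Roughly what that setting amounts to in mgo (the address and dial details here are illustrative, not juju's actual wiring): with a replica set configured, the session's write concern is set to majority.

    package main

    import (
        "fmt"

        "labix.org/v2/mgo"
    )

    func main() {
        // Illustrative address for a local mongod started with --replSet.
        session, err := mgo.Dial("localhost:37017")
        if err != nil {
            fmt.Println("dial:", err)
            return
        }
        defer session.Close()

        // Writes are only acknowledged once a majority of replica
        // set members have them.
        session.SetSafe(&mgo.Safe{WMode: "majority"})

        // A test can read the concern back off the session.
        fmt.Printf("%+v\n", *session.Safe())
    }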
<rogpeppe> mattyw: if you look at how they're done, they instantiate the Command directly and there's a tomb in there that's used to tear down the command
<natefinch> voidspace: awesome
<perrito666> hey guys, is any of you working on a virtualized ubuntu on osx?
<mattyw> rogpeppe, I was looking for that kind of thing yesterday but couldn't see where that was going on - I'll have another look
<stokachu> sinzui, hi could you take another look at 1307434
<rogpeppe> mattyw: see MachineSuite.TestRunStop for a simple example
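The shape of that tomb pattern, as a sketch (the agent type and timings are invented; the real agents embed a tomb that tests like TestRunStop use for teardown):

    package main

    import (
        "fmt"
        "time"

        "launchpad.net/tomb" // the tomb package juju used at the time
    )

    // agent stands in for a long-running command such as the machine agent.
    type agent struct {
        tomb tomb.Tomb
    }

    func (a *agent) Run() error {
        defer a.tomb.Done()
        for {
            select {
            case <-a.tomb.Dying():
                return nil
            case <-time.After(50 * time.Millisecond):
                // one unit of agent work
            }
        }
    }

    // Stop is what a test calls to tear the command down.
    func (a *agent) Stop() error {
        a.tomb.Kill(nil)
        return a.tomb.Wait()
    }

    func main() {
        a := &agent{}
        go a.Run()
        time.Sleep(200 * time.Millisecond)
        fmt.Println("stopped:", a.Stop())
    }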
<jcw4> perrito666: sometimes... what are you seeing?
<stokachu> sinzui, was hoping to get that included in the next-stable
 * sinzui looks
<perrito666> jcw4: well, my current work computer does not support ram upgrades and it's falling a bit short, so I noticed my mac has enough ram for both oses and was wondering whether dual booting or virtualizing would leave enough ram to compile
<jcw4> perrito666: I found that on my macbookpro with 14.04 in a virtualbox vm with 2 cores assigned and 4GB of ram that it worked very well
<mattyw> rogpeppe, perfect, thanks for the pointers
<mattyw> rogpeppe, you're a top bloke and no mistake
<rogpeppe> mattyw: bet you say that to all the boys
<jcw4> perrito666: my bottleneck was disk I/O, but even that wasn't too bad
<sinzui> stokachu, it is to be fixed in the next release (one-two weeks), whereas next-stable (1.20) is more than one release away
<jcw4> perrito666: if you have an SSD it might not even be a problem
<mattyw> rogpeppe, however I answer that it's going to be awkward
<rogpeppe> mattyw: orly? :-)
<stokachu> sinzui, ah ok
<perrito666> jcw4: tx, I'll take a look
<stokachu> sinzui, is there a name being used for the next point release?
<voidspace> fwereade: testing the setting of write-majority turned out to be very easy
<sinzui> stokachu, regardless of those two releases, if the fix does not require lots of new code, we can backport to 1.18.5 (because we don't think 1.20 will arrive soon enough)
<stokachu> sinzui, ok sounds good
<sinzui> stokachu, next-stable will be named 1.20.0 when the day comes
<bodie_> c7z, so am I making a new PR using my feature branch?  I assume the other PR has now been closed
<dimitern> jam, fwereade, can I get an LGTM on https://github.com/juju/juju/pull/13 ?
<bodie_> but, I got an email saying the bot had merged it
<bodie_> which confused the heck out of me...
<c7z> O_O
<c7z> bodie_: looking at the pr, and the revision linked, <https://github.com/juju/juju/pull/15>
<c7z> looks like github just got confused when you (correctly) reverted your master branch back to be a mirror of upstream
<c7z> bodie_: so, no harm done, carry on
<c7z> and you can go ahead and repropose
<bodie_> OK, cool
<bodie_> I don't have a lot added, but I did ping fwereade about the question of charms in state
<c7z> ace
<bodie_> or, uh, Actions docs in state.Charm
<bodie_> my understanding is that it won't be a breaking change to add Actions to Charm docs
<bodie_> since Actions will only change in step with changes to the Charm
<fwereade> voidspace, if it's truly horrible I can accept a non-unit-tested writeconcern
<fwereade> voidspace, but please record a bug as well
<voidspace> fwereade: I said it *is* easy
<voidspace> fwereade: have a look https://github.com/juju/juju/pull/17
<fwereade> voidspace, ah! cool
<fwereade> voidspace, I have clearly lost the ability to read :/
<voidspace> heh
<voidspace> you don't need it much
<rogpeppe> frankban, dimitern, anyone else: huge but trivial PR - https://github.com/juju/juju/pull/28/files
<rogpeppe> it factors utils out of juju core
<dimitern> rogpeppe, is it just a mechanical change?
<rogpeppe> dimitern: yup
<dimitern> rogpeppe, right, LGTM then
<rogpeppe> dimitern: thanks
<voidspace> fwereade: I understand what you mean about MongoSession
<voidspace> fwereade: it happens to be really useful for this test... (And of course it pre-dates this MP.)
<voidspace> fwereade: it's only used in a few places, shouldn't be hard to resolve those specific use-cases one by one
<fwereade> voidspace, if you happen to notice opportunities to drive-by them, I heartily endorse any and all such efforts
<voidspace> fwereade: cool
<hackedbellini> hi guys,
<hackedbellini> I'm having some problems on my juju installation using lxc. hazmat told me that you can probably help me here
<hackedbellini> The problem started when I did a "juju upgrade-juju" and it went from 1.18.x to 1.19.2 (it shouldn't since 1.19.x is not a stable version, but this is another issue)
<hackedbellini> Now all my machines/units are spamming this on the log: http://pastebin.ubuntu.com/7573901/ (their agent-state is "down")
<hazmat> basically his agents don't know where the api servers are
<hazmat> this was post an accidental upgrade from 1.18 to 1.19 (unclear how that happened, no flags, pkg version, bug filed)
<hazmat> when it was trying to go from 1.18.1 to 1.18.3
<hackedbellini> also, since I'm already on a development version, I wish to upgrade it (since we are having a problem on 1.19.2 already fixed on 1.19.3), but when I try to "juju upgrade-juju" it says: ERROR no more recent supported versions available
<hackedbellini> I even tried with juju upgrade-juju --version 1.19.3, but this was the result: http://pastebin.ubuntu.com/7595390/
<hazmat> natefinch, ^ any ideas on how stateServers api addresses could go to empty?
<rogpeppe> how do we track the progress of a commit that's being merged?
<hazmat> hackedbellini, can you pastebin one of the agent confs on one of the agents thats not connecting/spinning
<voidspace> the good news is that CI is running again
<voidspace> the bad news is that the local-deploy-precise-amd64 job is still failing :-/
<hazmat> hackedbellini, curious to see what it has recorded for its state server addresses
<voidspace> Exception: Timed out waiting for juju status to succeed: Command '('juju', '--show-log', 'status', '-e', 'local-deploy-precise-amd64')' returned non-zero exit status 1
<voidspace> Ah no, the real error is still:
<voidspace> 2014-06-05 14:23:13 ERROR juju.cmd supercommand.go:308 cannot initiate replica set: Closed explicitly
<voidspace> let's see what revision it is running, it shouldn't be trying to create a replica set if it's using master HEAD
<hackedbellini> hazmat: yes I can. Just give me a minute, I need to ask someone with sudo powers to read it for me
<natefinch> hazmat: I wonder if that's a problem with the HA modifications that we did.... obviously upgrading from 18 to 19 is a huge bug, but it should still *work*
<voidspace> ah, the last revision it ran was one from 21 hours ago
<voidspace> in fact, it has run that revision 5 times
<voidspace> sinzui: local-deploy-precise-amd64 ran a bunch of times with some old revisions
<sinzui> yeah, it was trying to finish the stalled revision under test
<bodie_> do I need to rebase my feature branch onto master before proposing?
<rogpeppe> natefinch, sinzui: is it possible to track the progress of a commit that's being merged?
<hackedbellini> hazmat: oh, kiki took the conf from its own .juju instead of the juju user's. He went to feed the dog and will be back soon. Then I can show the conf
<hackedbellini> in the meantime, he asked me to ask you if this is right: http://pastebin.ubuntu.com/7595448/
<hackedbellini> this is the contents of .juju/local
<hackedbellini> is it right for all those directories to be owned by root?
<rogpeppe> natefinch: ISTR seeing someone say it was, but i can't seem to find that info
<sinzui> rogpeppe, some, but not all tests report the branch and revision under test
<rogpeppe> sinzui: where can i see that?
<rogpeppe> sinzui: i'm kind of expecting some comment on the PR eventually
<sinzui> rogpeppe, the test that voidspace is looking at has http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/
<sinzui> the info in the build history
<perrito666> hey guys, could you take a look at this? the peergrouper section is to be credited to rogpeppe https://github.com/perrito666/juju/blob/update_HA_doc/doc/high_availability.md
<voidspace> rogpeppe: once the pr is merged you get a comment from jujubot
<rogpeppe> sinzui: so is it possible to find out if the 'bot has actually seen my commit?
<sinzui> rogpeppe, we don't have a way to see how many revisions trunk is ahead of CI
<sinzui> yes...
 * sinzui thinks
<rogpeppe> sinzui: i'm not sure i've got the format of my commit message right
<voidspace> rogpeppe: $$merge$$
<voidspace> rogpeppe: as a comment on the pr, not the commit message
<sinzui> rogpeppe, http://juju-ci.vapour.ws:8080/job/github-merge-juju/
<rogpeppe> voidspace: i *think* i used that
<voidspace> heh
<sinzui> ^ I think the branch name is listed as of today
<voidspace> rogpeppe: you should get a comment from the jujubot within 30 seconds
<voidspace> rogpeppe: so if there's no comment, it hasn't been seen
<rogpeppe> sinzui: what should i look at in that page?
<rogpeppe> sinzui: (and is that http address stable?)
<sinzui> rogpeppe, build history
<sinzui> rogpeppe, yes, the address is stable
<voidspace> rogpeppe: see the comments here for example https://github.com/juju/juju/pull/17
<rogpeppe> hmm, still no comment
<rogpeppe> i'm looking at https://github.com/juju/juju/pull/28
<bodie_> https://github.com/juju/juju/pull/29
<bodie_> woop woop!
<rogpeppe> sinzui: so in http://juju-ci.vapour.ws:8080/job/github-merge-juju/, i see build #25 flashing, but nothing to indicate what it might be working on
<bodie_> c7z/mgz, let me know how that looks to you
<voidspace> rogpeppe: comment looks good to me, but the juju bot hasn't noticed it
<rogpeppe> voidspace: ok
<voidspace> mgz: ^
<voidspace> mgz: https://github.com/juju/juju/pull/17
<voidspace> mgz: jujubot doesn't seem to have noticed the magic-merge-message
<voidspace> mgz: gah, wrong url
<voidspace> mgz: https://github.com/juju/juju/pull/28
<bodie_> fwereade, PR opened for charm interface actions -- https://github.com/juju/juju/pull/29 -- feedback appreciated!
<c7z> rogpeppe: not sure why the bot hates you
<c7z> I'll find out
<c7z> bodie_: looks good, thanks!
<rogpeppe> sinzui: it seems wrong that the 'bot tries the tests several times on failure
<c7z> rogpeppe: write better tests then :P
<rogpeppe> sinzui: that means that if you get something trivial wrong, it delays everything a lot
<c7z> (we still have issues with tests that aren't reliable when run concurrently)
<rogpeppe> c7z: i'm looking at http://juju-ci.vapour.ws:8080/job/github-merge-juju/25/consoleFull
<c7z> it will exit out earlier when it's a build or check-catchable issue
<sinzui> rogpeppe, It does because ec2 can fail. We see several failures each week because the mirrors are stale
<natefinch> rogpeppe: is your membership public in juju?
<bodie_> c7z, woot
<rogpeppe> natefinch: i've no idea
<rogpeppe> natefinch: what does that mean?
<natefinch> https://github.com/orgs/juju/members?page=2
<natefinch> set yourself to public
<c7z> rogpeppe: it's not, that's it
<sinzui> rogpeppe, but we all agree tests and the test environment should be 100% reliable
<natefinch> I don't know what it means
<c7z> good call nate
<natefinch> but you need to be public
<sinzui> mgz, http://juju-ci.vapour.ws:8080/job/github-merge-juju/24/console implies the tests pass, but the handoff of the result failed
<rogpeppe> natefinch: ah, that's probably the reason
<voidspace> natefinch: I need to pick something new
<c7z> sinzui: yah, I'm debugging that
<voidspace> natefinch: our lane is sparse
<c7z> I will also make sure in the merge script that kind of thing comes up red
<natefinch> voidspace: getting you something...
<jam> voidspace: given our history with --replSet and Precise, is your new test going to fail running the test suite on Precise ?
<voidspace> jam: hah!
<voidspace> jam: I *really* hope not
<voidspace> jam: however, the test that was failing was the deploy test - not the test suite
<voidspace> jam: and we have other tests that use replSet too
<hackedbellini> hazmat: here is the conf: http://pastebin.ubuntu.com/7595574/ (I removed some possible private information and marked them as <removed>)
<jcw4> c7z: to write private tests, I need to be in the same package right?
<jcw4> c7z: so can I access the state_test package somehow from state package?
<jcw4> c7z: normal imports don't seem to work very well
<c7z> sorry, that sounds backwards
<jcw4> :)
<jam1> jcw4: non test code should never import test code
<jcw4> c7z: I want to test private functions in state package
<jcw4> jam1: agreed
<c7z> if you have state_test, you need to import state, and any objects you need in testing must either be Public, or aliased public in an export_test.go file
<jam1> jcw4: i missed context, is there something you need help sorting out?
<jcw4> jam1: want to test private functions in state package
<jcw4> to date, my tests for state have been in state_test
<jam1> jcw4: either create a state_internal_test.go with package "state" at the top, or use export_test.go to expose the private things only in the test suite.
<jam1> we've generally gone with export_test route
<jam1> you should have plenty of examples
<jcw4> jam1: I think export_test is what I'm looking for... thanks!
<jam1> var MadePublic = madePublic
<jam1> is a pretty common pattern
<jam1> then it is available as "state.MadePublic" only in the test suite
<jcw4> jam1: makes a lot of sense! cool.
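Putting that together, a minimal sketch of an export_test.go (madePublic is a hypothetical unexported helper):

    // export_test.go -- lives in package state but is only compiled
    // during "go test", so it can re-export unexported names for the
    // external state_test package.
    package state

    // state_test code can then call state.MadePublic(...), while
    // madePublic stays unexported in production builds.
    var MadePublic = madePublic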
 * natefinch twitches a little at export_test
<jcw4> natefinch: why you no like?
<voidspace> natefinch: shall I pick up "add ability to set state server's state to down for maintenance"?
<natefinch> voidspace: sounds good
<voidspace> natefinch: all the others are backup related and need orchestrating - probably sequentially
<voidspace> natefinch: what do we want it to do?
<natefinch> jcw4: for one, it's unnecessarily complex when you can just put the test in the same package and test the function directly
<voidspace> natefinch: by "turn off the api" do we mean that all subsequent calls will error with a useful error code/message?
<voidspace> natefinch: and if the api is off, how can it be turned on again
<natefinch> voidspace: I'm making this stuff up as I go along :)
<voidspace> natefinch: should we ensure it is safe to stop mongo once the status has been set
<voidspace> natefinch: hehe, great
<voidspace> natefinch: is there anywhere else the status can/should be visible *other* than via the api?
<natefinch> voidspace: https://docs.google.com/a/canonical.com/document/d/1XZN2Wnqlag9je73mGqk-Qs9tx1gvOH0-wxMwrlDrQU4/edit#heading=h.wjndmqkw7j22
<voidspace> natefinch: ok, reading - thanks
<natefinch> voidspace: there's not a ton of detail, maybe fwereade has ideas
<natefinch> (re backup and restore)
<natefinch> jcw4: I also don't like that now your tests are referencing a function that looks like it is exported from the package, but is not, outside of test time
<perrito666> ptal https://github.com/juju/juju/pull/30
<natefinch> markdown is so nice
<perrito666> natefinch: not really, but the rendered result is
<perrito666> :p
<natefinch> perrito666: better than html... or most alternatives
<perrito666> natefinch: true
<alexisb> mgz, fwereade: no one was on the actions call
<alexisb> was it canceled for today?
<alexisb> bodie_, jcw4  ^^?
<jcw4> alexisb: we just got off
<alexisb> heh
<alexisb> I missed you
<alexisb> was excited I was finally going to be able to join
<jcw4> alexisb: We can hop back on? :)
<jcw4> alexisb: but bodie_ won't be able to make it
<alexisb> no, I have nothing of value to add
<alexisb> I was just going to stalk
<jcw4> alexisb: on the other hand it might be worth it... it was a rare video sighting of mgz /c7z
<jcw4> alexisb: I should've snapped a pic
<alexisb> I know! I saw him this morning
<alexisb> we had a 1x1 earlier
<jcw4> alexisb: I think he did it just to counter the rumours that he's a bot
<alexisb> :)
<jcw4> natefinch: how do you access state_test package stuff from your private state tests that are in the state package?
<natefinch> jcw4: can't... but the reverse works... put the test code in state package and state_test can access it as if it were always in state.  that's what export_test does
<perrito666> rogpeppe: just let me know when you are finished reviewing :) so I read the whole mail thread and correct it
<rogpeppe> perrito666: yeah, sorry - i'd much prefer to go through making comments and send them all at once, but github doesn't like that.
<jcw4> natefinch: I see, but if I want to test state private code using established state_test stuff (ConnSuite) I just have to export the state private code
<perrito666> rogpeppe: not a problem, but it is easier for me to wait until the dust from mail settles :p
<rogpeppe> perrito666: definitely
<hazmat> hackedbellini, i meant one that's not machine-0
<hazmat> hackedbellini, one of the agents that can't connect
<hackedbellini> hazmat: ahh, ok. This was the only one I found... but probably they are inside the lxc machines, right?
<hackedbellini> I'm checking it right now
<perrito666> rogpeppe: done?
<rogpeppe> perrito666: just writing a final comment
<perrito666> heh ok, just asked because my phone stopped going crazy
<hackedbellini> hazmat: http://pastebin.ubuntu.com/7595873/ http://pastebin.ubuntu.com/7595869/ http://pastebin.ubuntu.com/7595875/
<hackedbellini> the first is the machine agent.conf, the second is unit-jenkins and the last one is unit-landscape
<c7z> rogpeppe: 002-move-utils seems to have failed at build
<rogpeppe> c7z: darn
<c7z> not sure if it's you or me yet :)
<hackedbellini> really strange that those confs, although on the same machine, have different "upgradedToVersion"
<rogpeppe> c7z: hmm, this looks like the culprit:
<rogpeppe> 	imports github.com/juju/juju/names: cannot find package "github.com/juju/juju/names" in any of:
<rogpeppe> ahh
<rogpeppe> no, juju/juju/names has gone away
<hazmat> hackedbellini, got time for a g+?
<rogpeppe> ah, it's in goose
<rogpeppe> i'm surprised the tests passed when the names package was moved
<perrito666> rogpeppe: thanks a lot for your corrections :D
<rogpeppe> perrito666: np :-)
<jcw4> rogpeppe: c7z , could it be a merge issue?
<rogpeppe> c7z: hmm, weird, i don't see any references to juju/juju/names in my entire source tree
<rogpeppe> c7z: so i have to suspect some kind of build problem
<hackedbellini> hazmat: sure. Lets go now?
<redir> rogpeppe: I am seeing that too after a pull from upstream master
<rogpeppe> redir: you're seeing the same error?
<hackedbellini> hazmat: I'm online there. Just call me, or pm me the hangout link
<rogpeppe> redir: do you see the same problem if you remove $GOPATH/pkg and reinstall everything?
<redir> rogpeppe: it is in state/apiserver/networker/networker.go
<redir> rogpeppe: juju/juju/names is imported in that file
<rogpeppe> redir: ah, i don't have that file in my branch
<rogpeppe> redir: it must have been pushed recently
<redir> rogpeppe: yeah I just did a pull from upstream
<voidspace> natefinch: I can't find it in that doc at all!
<c7z> ha, so it is a merge issue then
<redir> rogpeppe: https://github.com/juju/juju/commit/c5a50a458a12da8e61c0f83704e53e8f3fa0f891
<c7z> and this is why we do the testing on the post-merge build
<c7z> shame the error message is not that helpful
<c7z> rogpeppe: so, merging/rebasing on upstream should work
<rogpeppe> c7z: ah yes, i just merged but i should have rebased
<rogpeppe> c7z: except that i've published the branch
<rogpeppe> c7z: so i guess i'll just push the merge
<c7z> rogpeppe: try what ever, we'll see what happens!
<voidspace> natefinch: I assume that by "set status to down for maintenance" that is what we want "juju status" to report
<voidspace> natefinch: so all api calls except that one should error
<voidspace> fwereade: ping, you still around?
<voidspace> hmmm... my merge failed
<voidspace> and it was one of the new tests that failed
<voidspace> hmmm
<rogpeppe> it's great to see how much faster linking is with go 1.3
<voidspace> hah, it passed on my branch - when I merge head it fails
<voidspace> dammit
<c7z> voidspace: ha! the same thing
<voidspace> who killed my test!!
<voidspace> I am seeing this: [LOG] 0:00.914 DEBUG juju.state connection failed, will retry: dial tcp 0.1.2.3:1234: invalid argument
<rogpeppe> hmm, what does "Waiting for next available executor" mean
<rogpeppe> it seems to be waiting for quite a long time
<rogpeppe> oh here it goes
<c7z> voidspace: it means you're in the queue
<voidspace> c7z: I think you meant rogpeppe ...
<c7z> meh, whichever :P
<voidspace> I'm flattered by the thought that we're interchangeable, but I assure you it's not true
<rogpeppe> voidspace: we could try it :-)
<voidspace> hehe
<voidspace> I think our respective wives might object for starters...
<voidspace> heh, the error is "no reachable servers"
<voidspace> adding a time.Sleep(time.Second * 20) fixes it
<voidspace> so how should I *really* fix it
<c7z> :D
<voidspace> find something I can usefully poll I guess
<voidspace> this is the issue I fixed (and reverted) as part of trying to get replicasets to work for local provider I think
<voidspace> we start replica set initiation, but don't wait until it is completed
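One way to poll instead of sleep, as a sketch (the helper, timings, and fake check are invented; the real check could be something like replicaset.CurrentStatus succeeding on a session):

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // waitFor polls check until it succeeds or the timeout expires,
    // instead of hoping a fixed time.Sleep is long enough.
    func waitFor(timeout, interval time.Duration, check func() error) error {
        deadline := time.Now().Add(timeout)
        var err error
        for time.Now().Before(deadline) {
            if err = check(); err == nil {
                return nil
            }
            time.Sleep(interval)
        }
        return fmt.Errorf("timed out waiting: %v", err)
    }

    func main() {
        // Fake check that starts succeeding after 3 seconds, standing
        // in for "replica set initiation has completed".
        ready := time.Now().Add(3 * time.Second)
        err := waitFor(10*time.Second, 500*time.Millisecond, func() error {
            if time.Now().After(ready) {
                return nil
            }
            return errors.New("no reachable servers")
        })
        fmt.Println("result:", err)
    }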
<rogpeppe> c7z: hmm: sudo: go: command not found
<rogpeppe> c7z: from http://juju-ci.vapour.ws:8080/job/github-merge-juju/28/consoleFull
<rogpeppe> c7z: that doesn't look as if it's anything to do with my branch
<rogpeppe> hmm, how *did* that networker change go in after names was deleted
<rogpeppe> ?
<c7z> that does seem like a race, lets see
<natefinch> voidspace: sorry, had to step out
<c7z> rogpeppe: okay, same thing again...
<voidspace> natefinch: I've agreed with will to talk to him tomorrow
<voidspace> natefinch: meanwhile I'm trying to fix my write-majority test that fails now
<voidspace> natefinch: it passed reliably until I merged in head :-)
<natefinch> voidspace: ug, ok
<voidspace> natefinch: I think this is the issue I fixed (and reverted) as part of trying to get replicasets to work for the local provider
<voidspace> natefinch: we start replica set initiation, but don't wait until it is completed
<voidspace> natefinch: so the server is unreachable until it is initiated
<natefinch> brb sorry
<c7z> sinzui, rogpeppe: W: Failed to fetch bzip2:/var/lib/apt/lists/partial/us-east-1.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_main_source_Sources  Hash Sum mismatch
<rogpeppe> c7z: awesome
<c7z> not sure what I can do about fixing that
<rogpeppe> c7z: it's that pesky NSA again
<voidspace> natefinch: and I can't call CurrentStatus to poll (in this test) because I don't have a session
<natefinch> voidspace: hmm
<redir> rogpeppe: looks like networker was a newly added file, with the import in it.
<rogpeppe> redir: yeah, but how did it get through?
<rogpeppe> redir: because presumably when it was merged, everything had been changed to use the new names repo and the old one was removed
<rogpeppe> redir: .... unless
<rogpeppe> redir: i bet the 'bot doesn't clean the pkg directory
<redir> rogpeppe: seems reasonable guess
<sinzui> c7z: yep happens all the time, the test should try 3 times because of that
<c7z> sinzui: the issue is this is pre-test
<c7z> are other clouds less pants about corrupting our archive mirrors?
<sinzui> c7z: yes, the test needs to try again
<rogpeppe> here's another trivial pull request, using the new juju/filetesting and removing the old one: https://github.com/juju/juju/pull/31
<rogpeppe> review, anyone?
<sinzui> c7z, rogpeppe every cloud has this problem. aws and azure are the worst, though aws might *appear* to be worse because 50% of testing happens there
<natefinch> voidspace: not sure what to tell you, it sounds like we need a session to be able to poll so we can know when things should succeed
<rogpeppe> sinzui: which problem? randomly corrupting network data?
<voidspace> natefinch: I can get another session to poll with
<voidspace> natefinch: although dialling at all is proving problematic
<sinzui> rogpeppe, probably the cache.
<rogpeppe> sinzui: ha. randomly corrupting on-disk data. that's worse.
<sinzui> rogpeppe, c7z : when http://juju-ci.vapour.ws:8080/view/Cloud%20Health/ this page lists a cloud repeatedly failing because of mirrors, we raise the issue with IS
<voidspace> natefinch: current theory is I need to set DialDirect
<rogpeppe> natefinch: what's the problem?
<rogpeppe> oops
<voidspace> natefinch: nope
<rogpeppe> voidspace: what's the problem?
<voidspace> natefinch: I *really* have to EOD
 * rogpeppe too
<natefinch> voidspace: go, it's ok, we'll figure it out
<voidspace> rogpeppe: if you're around in the morning I'll ask you
<voidspace> natefinch: I can sort it tomorrow - it's my PR that's failing to merge so it's no-one else's problem
<rogpeppe> i'd very much like a rubber stamp on https://github.com/juju/juju/pull/31 before i do EOD, if poss.
<rogpeppe> please
<voidspace> g'night all
<sinzui> c7z, I will set up a dedicated slave for the git-merge-juju job. This guarantees fast availability and mostly working mirrors, but then the test needs to clean up
<c7z> sinzui: my work list includes unifying that script with the existing job so we can just pass in the params for an existing slave
<sinzui> c7z, I am doing something similar for all key series-archs so that I have a reliable way to test tarballs, build packages, and test local-provider
<redir> rogpeppe: LGTM
<rogpeppe> redir: ta!
<c7z> rogpeppe: have people said juju/jujuutils and such to you yet?
<rogpeppe> c7z: no
<redir> rogpeppe: who would ensure the bot does make clean or whatevs?
<redir> and rogpeppe good evening...
<rogpeppe> c7z: not quite sure what you mean there
<c7z> utils is a bad package name, and I'm not sure just being under the juju organisation is great namespacing
<rogpeppe> c7z: ah yes.
<c7z> redir: make clean is not the issue, it's all in a tmp GOPATH anyway
<rogpeppe> c7z: i dunno though. they're not very juju-specific
<c7z> it's more the stuff like lingering mongo processes and other more subtle cruft/leakage
<c7z> rogpeppe: that's a fair point for utils, various unix/ubuntu bits mostly
<redir> c7z: I see
<c7z> sinzui: ec2 seems pretty broken atm
<rogpeppe> g'night all
<c7z> night rogpeppe, I may try and land your bits later, otherwise requeue tomorrow
<jcw4> rogpeppe: o/
<rogpeppe> c7z: thanks that would be nice
<rogpeppe> o/
<sinzui> c7z, The errors last about 30 minutes. The run-unit-tests jobs are doing okay. The test just needs to keep trying when it believes the cloud is at fault
<sinzui> c7z, we have seen many AMIs *spontaneously combust*. We had to select new AMIs to get tests to work
<c7z> sinzui: should it poll on the instance when the apt call fails? spinning up new ones seems bad...
<c7z> but lol, exploding AMIs would be a problem
<sinzui> c7z, since git-merge-juju cargo-culted run-unit-tests, it lost the power to switch from an AMI to an existing host
<c7z> sinzui: right, I'll get that bit back
<sinzui> c7z I think we just need that script to accept an optional path to a tarball.
<c7z> but we don't have an existing way to be robust against mirror errors for the machine-per-run case as far as I see
<c7z> sinzui: yeah, that's the plan
<c7z> plus unifying the other changes
<perrito666> mm, I can mention people on the commits; rogpeppe, did you get any notification about that?
<c7z> perrito666: depends on his notification settings
<marcoceppi> I really want to use actions, any idea what release that will be available in? (Will we see it in a 1.19.x?)
<c7z> jcw4: ^
<jcw4> marcoceppi: c7z ; we're making progress... I'm not sure what the 1.19.x timeline looks like
<jcw4> marcoceppi: c7z I think so
<c7z> marcoceppi: code is landing on trunk now
<marcoceppi> c7z: yeah, I saw the GH notifications
<marcoceppi> c7z: jcw4: is there any prelim documentation so I can start including actions in charms I'm writing now?
<marcoceppi> has the format been outlined yet?
<jcw4> marcoceppi: good question.. fwereade has been after us to produce those
<alexisb> jcw4, there will be a 1.19.x release next week
<alexisb> then the following week we plan to release 1.20
 * marcoceppi is not after you guys
<marcoceppi> /not/now/
<jcw4> alexisb, marcoceppi in that case I don't expect Actions in 1.19.x
<jcw4> 1.20 is hopeful
<marcoceppi> you mean 1.21?
<alexisb> yes he means 1.21 :)
<c7z> I thought we were dropping odd/even
 * marcoceppi was concerned the release pattern had changed *whew*
<alexisb> jcw4, 1.20 will be the stable release
<alexisb> c7z, yes
 * marcoceppi is concerned again
<alexisb> so 1.21beta1 will be the first development with the tag
<c7z> this conversation is hilarious
<jcw4> marcoceppi: what can we do to help
<jcw4> c7z: I take umbrage at that... :)
<c7z> I love working with you guys
<marcoceppi> Oh I really don't care about the release things, just actions will literally rock my world
<alexisb> marcoceppi, why is that
 * jcw4 is a little nervous about literally rocking the world
<alexisb> jcw4, I think that just means a big party and marcoceppi is bringing the refreshments
<jcw4> alexisb: whew
<marcoceppi> alexisb: A lot of my charms use the notion of an external repository for user-maintained assets (like the wordpress charm, and now the nginx-site charm), so having an action to sync this content that a user can initiate against the charm is funderful
<jcw4> marcoceppi: what kind of docs would be most helpful... API docs for defining actions in your charm?
<marcoceppi> jcw4: yes, how do I add actions to my charm would be the best document
<marcoceppi> I can also help get that doc in to the juju docs for when actions land
<alexisb> jcw4, mgz, fwereade: it may be good to invite marcoceppi to one of your interlocks so you can hammer out details on documentation for charmers
<jcw4> marcoceppi: we've (I've) been mostly focused on plumbing and innards... bodie_ has been working on the Actions() method on Charm... but I don't think we have a fully fleshed out API yet.
<jcw4> alexisb: good idea
<jcw4> marcoceppi: what timezone are you in?
<marcoceppi> jcw4: I've transcended timezones, but I'm mostly awake during EDT
<jcw4> we meet daily at 1600 UTC
<alexisb> heh
<marcoceppi> jcw4: cool, feel free to ping me when you do, I'm not afraid to run trunk to hammer out features landing
<marcoceppi> hammer on, even
<jcw4> and sometimes we chat in #jujuskunkworks
<jcw4> although we try to keep discussion in here until it gets too noisy
<marcoceppi> cool
<jcw4> marcoceppi: the hangout link is in the topic for that channel too
 * marcoceppi lurks in there
<marcoceppi> Cool, thanks for the info jcw4 alexisb c7z et al
<jcw4> marcoceppi: thanks for the added impetus
<perrito666> is anyone here in charge-ish of https://juju.ubuntu.com/docs/contributing.html ? I have docs I want to add but I am not really sure about the criteria for the naming
<jcw4> perrito666: jcastro maybe?
<perrito666> jcastro: ?
<jcastro> hi!
<jcastro> naming which part?
<perrito666> jcastro: the source for the docs has the doc names preceded by things like authors- charms- config- howto-
<perrito666> so
<perrito666> I want to add docs on HA, what is the prefix for "feature documentation"
<jcastro> howto-ha
<jcastro> though, howto-highavailability probably makes more sense
<perrito666> yes
<perrito666> just wanted to be sure, I guess that it's used to present the data, so I didn't want my doc to end up in the wrong section
<perrito666> tal
<jcw4> did the networker.go merge issue get fixed and merged in yet?
<jcw4> i.e. are we just waiting for CI to finish?
<jcw4> rogpeppe, mgz ^^
<jcw4> I don't see an MR to fix it
<jcw4> crickets...
<jcw4> https://github.com/juju/juju/pull/32
<natefinch> sorry, no idea what you're talking about
<jcw4> natefinch: tx
<jcw4> natefinch: for the LGTM that is
<natefinch> :)
<natefinch> I knew what you meant
<jcw4> :)
<jcw4> fwereade, mgz any chance of a $$merge$$ ^^
<natefinch> jcw4: for the thing I LGTM'd?
<natefinch> I thought you could do that.  I can do that if you want
<sinzui> *** I am forcing a rebuilt of the git-merge-juju job to see if the mirror issue is resolved***
<sinzui> *** I will build a dedicated slave if this fails ***
<perrito666> omg, that thread will never die
<natefinch> perrito666: which one? :)
<perrito666> natefinch: hosting projects...
<natefinch> yep.. well, it hits close to home for a lot of people. Both tools are so complex, it's hard to know both of them really well, so there's a lot of misunderstanding on both sides, I'd imagine. Luckily, I don't know either of them very well, so it's all the same for me :)
<natefinch> (bzr vs git that is.... LP vs github is a little easier to reason about, I think)
<perrito666> its vi vs emacs
<natefinch> yep
<perrito666> sadly it's a thread unlikely to be ended by godwin's law
<natefinch> I've also heard that when you argue about things that people use to define themselves (I'm a python programmer, I'm a Christian, I'm a vi user)... that you basically can never win.  That attribute is so closely tied to how they identify themselves, that they can't and won't listen to reason about why it might not be good.
<perrito666> I will say only this, git branch is oh so much faster than bzr branches
<natefinch> yeah, that's cause git branch does almost no work
<natefinch> it is pretty cool
<perrito666> also since my .vimrc was heavily tuned for working with git now I have a changed lines indicator which makes my life easier
<natefinch> that's cool
<natefinch> Honestly, I'm happy to get the chance to work more in git in my daily work, since that's what most people use these days, and I'd like to be more proficient with it.   My last company used SVN, and I only used git in minor side projects, so not a lot of merging, rebasing, squashing etc.
<jcw4> thanks natefinch ; I'm not a member of the juju team so I can't do the $$merge$$...
<jcw4> what is the apt mirror corrupted issue on CI?
<jcw4> mgz: ^^
<sinzui> jcw4, I am working on it
<jcw4> sinzui: thanks!
<sinzui> I was going to build an instance, but abentley  has a hack to change the mirrors that I should try first
<sinzui> jcw4, I moved a rule from run-unit-tests into the git-merge-juju job. When we unify the job with run-unit-tests, the job will get all the goodness
<sinzui> tests are running again
<jcw4> sinzui: thanks!
<natefinch> sinzui: I see a test failed here for a different pull request? https://github.com/juju/juju/pull/32
<natefinch> 32 and 33 at the bottom there
<natefinch> I guess that's build 33 not PR 33
<sinzui> natefinch, it is spurious
<natefinch> sinzui: figured, just making sure
<sinzui> sorry. I moved some love from run-unit-tests into the job to get past the mirror issue. CI is using newer solutions to common problems
 * sinzui adds local tarball support to run-unit-tests
 * thumper goes to walk the dog
<menn0> morning all
<jcw4> menn0: morning :)
<thumper> o/
<thumper> menn0: so, I told people last night that you would be taking the db-schema upgrade process to the list :)
<thumper> hmm...
<thumper> the volume of email I now get is much bigger
<thumper> github has very chatty reviews
<jcw4> dimitern and davecheney are reviewers today, but not online right now...
<jcw4> mgz, fwereade https://github.com/juju/juju/pull/33
<jcw4> thumper: lots more chatting coming from that PR ^^
<thumper> um...
 * jcw4 looks around nervously...
<jcw4> yes...
 * thumper looks at menn0
<thumper> 'tis friday
<jcw4> urg... gotta get my timezone math right
<jcw4> wait... it's not friday for natefinch...
 * thumper replies to waigani's email
<menn0> HFS, go
<menn0> ignore that :)
<menn0> thumper: sorry I didn't see all this. I had Google Docs full screen while working on a schema migration doc
<menn0> thumper: I will be taking that to the list
<menn0> thumper: using Google Docs because all the details are getting lost in the email thread
 * thumper nods
<thumper> ack
<menn0> thumper: I'm pretty happy with how the design is working out. Your idea cracked it.
<thumper> cool, happy to help
<thumper> waigani: around?
<waigani> thumper: yep
<thumper> waigani: did you want to come around today?
<waigani> oh right
<thumper> waigani: I think we could nail the switch thing
<waigani> ah, I don't have a car
<waigani> I've addressed all your comments
<waigani> just adding formatting to list now
<waigani> I'm down on castle st
<waigani> hmmm ... i *might* be able to steal molly's car
<waigani> hang on
<waigani> thumper: I've got wheels! Shall I head your way?
<thumper> yeah
<waigani> thumper: I just pushed my branch - addressed everything but formatted output to list
<thumper> waigani: let's go over it here
<waigani> okay, see you soon
 * thumper turns on the coffee machine
<sinzui> wallyworld_, mgz: FYI https://code.launchpad.net/~sinzui/juju-ci-tools/local-tarball/+merge/222266
<thumper> davecheney: o/
<bigjools> wallyworld_: do I need to come round and put a hammer through your router?
#juju-dev 2014-06-06
<davecheney> http://juju-ci.vapour.ws:8080/job/github-merge-juju/35/?
<davecheney> what the heck happened ?
<perrito666> davecheney: well, first of all, it would seem that the bot is not properly merging stuff, since at some point it is in detached mode
<davecheney> can anyone just hulk smash this
<davecheney> this is blocking a fix
<davecheney> i've been trying to merge this SOB for 3 days
<perrito666> davecheney: wallyworld_ is the man you are looking for
<perrito666> well perhaps it's mgz but I think at this time you can only get wallyworld_
 * wallyworld_ is having network issues :-(
<wallyworld_> so i missed the question
<davecheney> wallyworld_: can you land https://github.com/juju/juju/pull/10 manually
<davecheney> i've been stuck on this for days
<wallyworld_> sure, why?
<davecheney> http://juju-ci.vapour.ws:8080/job/github-merge-juju/35
<wallyworld_> so don't you need to fix the merge conflict first?
<davecheney> thanks
<davecheney> i'll try again
<davecheney> that probably means someone already bumped the version of juju testing
<davecheney> so I probably don't need to do this at all
<wallyworld_> i am happy to help if needed
<davecheney> wallyworld_: thanks for your help, I think I have it now
<wallyworld_> ok, i didn't do anything though :-)
<perrito666> wallyworld_: your sole presence is enough
<wallyworld_> debatable :-)
<davecheney> hey, what's happening with juju/names ?
<davecheney> looks like the job is 1/2 done
<jcw4> davecheney: how so?
<davecheney> is there a job to nuke the copy in juju/juju ?
<davecheney> + ./bin/lander-merge-result --ini development.ini --pr=10 --job-name=github-merge-juju --build-number=36
<davecheney> Failed to add comment: Failed to merge: {u'sha': u'6bbbdb86e38120549487326600e16bc1a340d59d', u'message': u'Pull Request successfully merged', u'merged': True}
<davecheney> ++ date
<davecheney> + echo Finished: Fri Jun 6 01:07:08 UTC 2014
<davecheney> Finished: Fri Jun 6 01:07:08 UTC 2014
<davecheney> + exit 0
<davecheney> Description set: davecheney 101-update-juju-testing-dependency
<davecheney> Finished: SUCCESS
<davecheney> wallyworld_: failed, but success ?
<wallyworld_> otp, sec
<jcw4> davecheney: sorry, I thought you were talking about in your own repo/branch
<jcw4> :)
<wallyworld_> davecheney: yeah, dumb message, github seems to think it's merged from what i can see. i'll get martin to look into it as he's tidying up all that stuff at the moment
<davecheney> wallyworld_: i'm moving on to finishing the juju/names package
<davecheney> that is blocking me today
<wallyworld_> davecheney: ok, but at least your dep change got merged even if the console message sucked
<davecheney> \o/
<davecheney> i'll go close the bug on LP then
<wallyworld_> axw: so i have a large set of changes i want to split up into 2 branches (like a bzr pipeline). i want to add a pre-req branch to my current one. the work is currently not committed. do you know how to do that efficiently with git? do I need that stacked git addon?
<axw> wallyworld_: why do you want to do it in two branches - just to split up the review?
<wallyworld_> yeah
<wallyworld_> like we do in launchpad and bzr
<wallyworld_> and i want to switch between the 2
<wallyworld_> make changes in the upstream one and push those to the other
<axw> so there's no prereq sort of thing in GH. if they're small enough, you could just put them in one review as separate commits. you can then see commit-specific changes in the review
<axw> I don't know much about stacked git, just that it exists
<wallyworld_> ok, ta
<wallyworld_> i'm really missing bzr and launchpad already
<davecheney> https://github.com/juju/juju/pull/34
<davecheney> ^ finishes the move to juju/names
<thumper> davecheney: https://github.com/juju/juju/pull/27
<davecheney> thumper: I don't get it
<davecheney> my working copy doesn't have this
<davecheney> oh fuck
<thumper> git fetch upstream?
<davecheney> how do you set that up ?
<thumper> davecheney: what do you have now?
<davecheney> just edited .git/config to set origin to my fork
<davecheney> thats it
<thumper> git remote -v
<davecheney> lucky(~/src/github.com/juju/juju) % git remote -v
<davecheney> origin	https://github.com/davecheney/juju (fetch)
<davecheney> origin	https://github.com/davecheney/juju (push)
<thumper> ok...
<axw> git remote add upstream https://github.com/juju/juju.git
<wallyworld_> davecheney: were there supposed to be so many changes to dependencies.tsv in your mp?
<davecheney> wallyworld_: supposed to only be one change
<thumper> davecheney: git remote set-url origin git@github.com:davecheney/juju.git
<wallyworld_> the diff has a few more than that
<davecheney> i do godeps -t $SOMEPKG >> dependencies.tsv
<thumper> davecheney: and what axw said
<thumper> then do:
<davecheney> then hand edit dependencies.tsv afterwards
<davecheney> because godeps is so picky about the file format
<wallyworld_> i just always hand edit it
<thumper> git branch --set-upstream-to remotes/upstream/master
<davecheney> % git branch --set-upstream-to remotes/upstream/master
<davecheney> error: the requested upstream branch 'remotes/upstream/master' does not exist
<thumper> but  do "git fetch upstream " first
<thumper> sorry
<thumper> ordering
<davecheney> ok
<thumper> davecheney: also "git remote set-url --push upstream no-pushing" will mean you can't accidentally push upstream
<wallyworld_> davecheney: so i think you need to fix the tsv file so the diff only shows one change
<davecheney> there is no way i'll remember this
<axw> if anyone cares, I have a local branch called "master" that tracks upstream/master. I pull into this and branch off there for my feature branches
<davecheney> is this written down anywhere
<thumper> davecheney: you only have to do it once
<thumper> and I think it is written down
<thumper> if not, it should be
<wallyworld_> wtf is git so obtuse and difficult?
<davecheney> thumper: do I need to do the updates upstream thing for every juju repo ?
<thumper> yes
<davecheney> ok
<jcw4> menn0: great feedback so far... let me know when you're done and I'll respond to them all
<menn0> jcw4: will do
<sinzui> CI Loves trunk after 9 days. Blessed: gitbranch:master:github.com/juju/juju 6bbbdb86 (Build #1451)
<jcw4> sinzui: yay
<waigani> thumper: https://github.com/juju/juju/pull/35
<waigani> axw: I have the same setup
<waigani> thumper: you about?
<menn0> jcw4: all done
<jcw4> menn0: sweet, all good comments thanks
<menn0> jcw4: sorry that I don't feel like I can LGTM it (due to my lack of background, not a problem with the code)
<jcw4> I'll address the stuff you brought up and then ask fwereade for review too..
<menn0> cool
<jcw4> menn0: no worries, I appreciated all your feedback
<menn0> hopefully I've caught a few things fwereade would have otherwise
<menn0> davecheney: regarding  https://github.com/juju/juju/pull/34
<menn0> davecheney: I've seen your comment. Is the diff shown no longer accurate?
<davecheney> menn0: it is accurate
<menn0> davecheney: ok, the comment confused me a little. I will review!
<axw> if anyone's able to review goose changes, I need one please: https://codereview.appspot.com/103900045
<menn0> davecheney: why the changes of errors.New to fmt.Errorf?
<davecheney> menn0: why have two different ways of making an error ?
<menn0> davecheney: ok. so because fmt.Errorf is more flexible you're standardising to that?
<davecheney> just reducing the number of imports
<davecheney> why import errors to make one call to it
<davecheney> when fmt is already imported and does the same job
<menn0> why not use the new juju/errors package which gives tracing?
<menn0> davecheney: ^^
<davecheney> menn0: it was just a cleanup
<davecheney> i'm not trying to boil the ocean
<menn0> davecheney: fair enough. but in general we should be using juju/errors right? esp for new stuff?
<davecheney> menn0: nfi
<menn0> davecheney: I think we are but that doesn't matter for this change ;-)
<davecheney> afaik the errors package is for wrapping other errors
<menn0> davecheney: it also generates new errors with the source information recorded
<menn0> there's New and Errorf which are compatible with the standard library functions
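A sketch of the difference being described, using the juju/errors calls mentioned above (output formatting is approximate):

    package main

    import (
        "fmt"

        "github.com/juju/errors"
    )

    func loadConfig() error {
        // Like fmt.Errorf, but the source location is recorded too.
        return errors.Errorf("config %q not found", "env.yaml")
    }

    func main() {
        err := loadConfig()
        fmt.Println(err)                    // just the message
        fmt.Println(errors.ErrorStack(err)) // message plus where it was created
    }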
<menn0> davecheney: anyway, review done.
<davecheney> ta
<thumper> waigani: here now
<thumper> and missed the ping from before
<thumper> was collecting daughter
<waigani> davecheney: creating a test for the error, I can't see how to do this. d.dir is not exported, so I can't override that ... ?
<waigani> menn0: thumper: github is erasing my backslashes. The regex in the comment *should* read: ^.+\.jenv$
<menn0> waigani: regardless I still think you should consider using Glob instead of a regex :)
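A sketch of the Glob alternative menn0 suggests (the directory is illustrative):

    package main

    import (
        "fmt"
        "path/filepath"
    )

    func main() {
        // Roughly equivalent to the regexp ^.+\.jenv$ within one directory.
        matches, err := filepath.Glob("/home/user/.juju/environments/*.jenv")
        if err != nil {
            fmt.Println(err)
            return
        }
        for _, m := range matches {
            fmt.Println(m)
        }
    }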
<waigani> menn0, thumper: what are we doing for review fixes? amending? rebasing? or just pushing?
<menn0> waigani: I've generally been editing history (either git commit --amend or some kind of rebase) and pushing with --force. Github handles that pretty well and it's not like anyone is going to be checking out your PR on their machine.
<waigani> menn0: Cool I'll amend and push -f
<menn0> waigani: if there was a large change in response to a review comment then I would probably commit it as a separate rev
<waigani> yeah, fair enough, makes sense
<jcw4> menn0: I'm going to run some smoke tests on my changes and push an update to PR https://github.com/juju/juju/pull/33
<jcw4> menn0: I did make one comment and asked one question in there though
<menn0> jcw4: cool. I'm about to finish up but can hang around for a little longer
<jcw4> no worries menn0 I'm eod'ing right after I push...
<menn0> jcw4: I've responded to your comments. I thought of more potential issues with the ActionResults() lookup unfortunately.
<jcw4> heh... thanks menn0
<jcw4> menn0: how about ActionResultsForUnit(globalKey), and ActionResultsForAction(actionId)?
<menn0> jcw4: yes, that sounds good
<jcw4> cool, tx
<menn0> jcw4: another thought: a comment describing the structure the of the action ids might be useful
<jcw4> menn0: +1
<menn0> right I'm done
<jcw4> ta
 * menn0 thinks review duty is hard work (but rewarding)
<menn0> have a good weekend all
<jcw4> Okay I pushed latest changes for PR https://github.com/juju/juju/pull/33
<jcw4> fwereade: if you get a chance today your input would be appreciated
<wallyworld_> axw: i'm getting errors trying to push a feature branch (missing packages). so i changed to master, did a git pull upstream master. i want to push that to my fork on gh using git push origin but still it says  packages are missing. do you know what i need to do?
<axw> wallyworld_: sounds like the pre-push hook is failing
<axw> wallyworld_: some packages have been moved around (out of juju/juju)
<axw> you probably just need to go get them
<wallyworld_> ok, will try. git is so easy to use
<axw> wallyworld_: sorted?
<wallyworld_> no :-( sort of. i was working on branch A. I committed some files. I stashed the rest. I checkout -b branch B. I pop the stash. I commit and push. Now branch B in the pull request has all the stuff from branch A as well
<wallyworld_> and I can't do anything with branch B because it depends on changes in branch A
<wallyworld_> this sucks
<wallyworld_> so ideally i want to delete stuff from branch B (all of the branch A work)
<wallyworld_> but nothing in the log seems to indicate where the branch A stuff came from to get into branch B
<dimitern> fwereade, jam, please take a last look at https://github.com/juju/juju/pull/13 i really need an approval to land it already
<axw> wallyworld_: if you "checkout -b", that's coming off the branch you were previously on
<axw> wallyworld_: if you didn't want the stuff from A, branch off master
<axw> wallyworld_: now that you've done that, you can just "git rebase -i master" and delete the commits you don't want in that branch
<wallyworld_> axw: thanks, will do that after soccer
<rogpeppe1> does anyone know if the issues with the jenkins bot are fixed yet?
<rogpeppe1> axw: ^
<axw> rogpeppe1: issues?
<rogpeppe1> axw: there was a checksum issue which was causing all builds to fail like this http://juju-ci.vapour.ws:8080/job/github-merge-juju/30/consoleText
<rogpeppe1> axw: because the go tool wasn't available
<rogpeppe1> axw: if you've succeeded in landing stuff, i guess it's fixed
<axw> rogpeppe1: I'm pretty sure I've seen things landed today
<rogpeppe1> axw: ok, i'll try again
<BjornT> could someone please take a quick look at this bug and say what more information i can add to it, before i shut down the failed instance?  https://bugs.launchpad.net/juju-core/+bug/1327078
<_mup_> Bug #1327078: Failed to bring up an LXC with 1.19.3 <landscape> <juju-core:New> <https://launchpad.net/bugs/1327078>
<rogpeppe1> anyone know what happened to lbox.check? i'd like to run the checks locally that the bot runs remotely, so i don't do silly things like forget to gofmt or govet.
<rogpeppe1> (having just realised i failed to gofmt my last merge req)
<axw> BjornT: if there's no /var/lib/juju, then I think cloud-init-output is all we can work with
<axw> rogpeppe1: symlink scripts/pre-push.bash as .git/hooks/pre-push
<BjornT> axw: ok, thanks
<rogpeppe1> axw: ah, excellent, thanks - i'd have taken ages to find that
<axw> np
<rogpeppe1> axw: is the pre-push script guaranteed to run in the root dir of the repo?
<rogpeppe1> axw: (when doing git push)
<axw> sinzui: interesting thing I found - the HA-failing-on-precise thing only occurs on your machine when the mongo db is on the ephemeral disk
<axw> rogpeppe1: yes, I confirmed that
<axw> when pushed from some other subdir, it was always run in the root
<rogpeppe1> hmm, pity it takes 30s to run
<rogpeppe1> cp /bin/true $(go tool -n vet)
<voidspace> morning all
<TheMue> morning
<dimitern> TheMue, hey
<dimitern> TheMue, I need an LGTM on this if you can take a look - I fixed all that was suggested earlier https://github.com/juju/juju/pull/13
<TheMue> dimitern: yep, will take a look
<axw> voidspace: I've had a breakthrough on the local-provider/replicaset issue
<dimitern> TheMue, thanks
<axw> voidspace: I can reproduce it on my own m2.xlarge instance
<axw> voidspace: the trick is to use the ephemeral disk as the root-dir
<axw> if you use the EBS disk it works fine
<rogpeppe1> is there any way of telling jenkins to stop retrying tests because i've pushed a new version of the branch?
<voidspace> axw: ah right
<voidspace> axw: interesting
<TheMue> dimitern: a bit hard to follow now on GH. the comment by william yesterday at 10:16 (at the bottom), is it covered?
<dimitern> TheMue, about constraints?
<dimitern> TheMue, yes
<dimitern> TheMue, there are a few things i'll do in follow-ups as agreed as well
<TheMue> dimitern: fine
<TheMue> dimitern: done
<dimitern> TheMue, cheers
<TheMue> dimitern: does GH provide a side-by-side diff too?
<dimitern> TheMue, no, but menno found some chrome extension that gives you that - check the mailing list
<TheMue> dimitern: chrome is a no-go for me, it's a resource hog. since changing to ff everything runs better, even the google apps like mail or hangout
<voidspace> axw: any idea *why* that should make a difference - IO performance?
<axw> voidspace: not really. I wouldn't think it'd be iops, because the ephemeral disk should be faster
 * axw checks hdparm
<rogpeppe1> ffs, it's *still* retrying the tests
<voidspace> axw: can you reproduce it with trusty?
<voidspace> axw: or is it just precise
<axw> voidspace: I did, same issue. I sent instructions to juju-dev on how to repro
<voidspace> axw: ah yes, I see
<axw> voidspace: EBS on that machine gets 80.33MB/s, ephemeral gets 552.59MB/s
<axw> voidspace: I tried it on m3.xlarge and the ephemeral disk was fine. on there it's an SSD though
<rogpeppe1> it seems like a failed test delays the merge queue by at least 30 minutes. that's really not great.
<axw> rogpeppe1: we are working on improving it, there are still some intermittent failures
<rogpeppe1> axw: it would be great if it was possible to tell jenkins "please kill this job now - i know it's going to fail"
<axw> hmm yes, that would be nice...
<rogpeppe1> axw: even with intermittent failures, it might still be better if it didn't retry automatically
<rogpeppe1> axw: then at least the person responsible can look at the failure and see if it looks like it's correctly intermittent, or if it's just a bad test
<axw> initially it was so bad that we had to, but now, yeah, I think you may be right
<rogpeppe1> right, hoping for 5th time lucky
<rogpeppe1> dimitern: your branch failed to merge: tmp.ZRpISWCH5X/RELEASE/src/github.com/juju/juju/juju/testing/instance.go:119: undefined: includeNetworks
<rogpeppe1> dimitern: (i can't say i'm unhappy because if yours had succeeded, mine would probably have failed. again)
<dimitern> rogpeppe1, yep, some merge didn't go well i'm fixing it
<voidspace> axw: hmmm... maybe my Friday project can be to diagnose this
<axw> heh :)
<axw> well it sure is mystifying, so it's gotta be educational if you can figure it out
<voidspace> right
<voidspace> not useful education, but education nonetheless ;-)
<rogpeppe1> i love my ISP
<rogpeppe1> ssh: Could not resolve hostname github.com: Name or service not known
<voidspace> rogpeppe1: stop using their dns?
<rogpeppe1> voidspace: i think it's just the whole thing's flaky. i get random delays, even when i'm not using their dns
<rogpeppe1> dimitern: provider/ec2/local_test.go:359: too many arguments in call to "github.com/juju/juju/juju/testing".StartInstanceWithParams
<dimitern> rogpeppe1, just fixed that, but I can't stop the ci job
<rogpeppe1> mgz: would it be possible to turn off the automatic test retry on the bot?
<rogpeppe1> mgz: it's taking over an hour to process a commit with a test failure
<mgz> rogpeppe1: maybe
<mgz> I will at least cut it down to two, I think
<mgz> we have still got bogus failures
<rogpeppe1> mgz: i think one would be better
<rogpeppe1> mgz: we've got to look at the failure status anyway
<rogpeppe1> mgz: and it takes 18 minutes to process a job if all goes well
<rogpeppe1> mgz: but doubling that when a test fails seems wrong
<mgz> I prefer to inconvenience the failing test run, rather than a good branch that hits a random issue
<rogpeppe1> mgz: it inconveniences everyone though
<rogpeppe1> mgz: i've spent the entire morning waiting for the bot
<rogpeppe1> mgz: can we *try* not retrying
<rogpeppe1> ?
<rogpeppe1> mgz: and if the sporadic failures turn out to be a real problem, going back to retrying
<rogpeppe1> mgz: alternatively, try to diagnose the sporadic failures and only retry if it looks like one of those
<dimitern> and I can't merge my PR for 2 days because everything keeps changing across the codebase - imports mostly
<rogpeppe1> dimitern: yeah, sorry about that
<rogpeppe1> dimitern: nothing i can do to help about that, but it does mean that we *can't* submit PRs that will definitely work
<rogpeppe1> dimitern: because we have fundamentally clashing branches
<mgz> rogpeppe1: we've already had runs that have only landed successfully because of the retry
<mgz> see build 34 for instance
<rogpeppe1> mgz: any other examples?
<mgz> I'm trying to find the graph thingy
<dimitern> mgz, we need accounts on the jenkins ci instance for cases like this - stopping a build that'll never succeed
<mgz> dimitern: we don't really
<mgz> we just need our tests to not be so flakey
<mgz> then 10-15mins for a run is fine
<dimitern> and the retry logic could detect build failures and not attempt to retry
<mgz> juju-ci.vapour.ws:8080/job/github-merge-juju/buildTimeTrend
<mgz> I've changed the job to do a single retry, it should fail in around 30-35 mins now
<dimitern> haha :) beat you to it rogpeppe1  :)
<dimitern> sorry about that
<rogpeppe1> dimitern: frick
<mgz> #36 #23 #21 all required the rerun
<rogpeppe1> mgz: could we please change it to no retries at least for today
<rogpeppe1> ?
<rogpeppe1> mgz: because dimitern and i are conflicting and we're constantly watching the jobs and a failing job can double the time we're going to need for all this overhead
<mgz> rogpeppe1: I feel you two could work out a better solution between yourselves..
<mgz> but okay, I'll put it back in after y'all have sorted out your mess
<perrito666> good morning everyone
<rogpeppe1> mgz: so, all those intermittent failures failed with "panic: Session already closed"
<dimitern> rogpeppe1, I think changes like this (extracting a dependency and changing all imports) are very frustrating to deal with, especially when there are several like that in a row
<rogpeppe1> mgz: wouldn't it be possible to see that and retry only in that case?
<rogpeppe1> dimitern: yeah, particularly when the dependency is widely used
<rogpeppe1> perrito666: hiya
<voidspace> crazy internet today :-/
<mgz> yay rural England?
<voidspace> mgz: yeah, and yay talktalk
<dimitern> rogpeppe1, we should've handled that better - announce it on the ML, perhaps pick a day to do a few of these in sequence, so it can be expected and we can fix our PRs after that
<voidspace> rogpeppe1: got a minute to help me diagnose a problem?
<rogpeppe1> voidspace: sure
<rogpeppe1> voidspace: yay talktalk :-\
<voidspace> yeah
<TheMue> dimitern: missed our hangout, but now nobody is there. has it already happened?
<rogpeppe1> voidspace: my talktalk connection is really crappy today too
<rogpeppe1> voidspace: do you have slow/crappy DNS issues too?
<voidspace> rogpeppe1: heh, maybe coincidence maybe not
<voidspace> rogpeppe1: I don't use their DNS
<rogpeppe1> voidspace: 8.8.8.8 ?
<dimitern> TheMue, i haven't bothered today :) my bad - we can still do it, though if we need
<voidspace> rogpeppe1: this test passes
<voidspace> http://pastebin.ubuntu.com/7601025/
<voidspace> rogpeppe1: but note the 20 second sleep
 * dimitern is not ipv6 capable :)
<voidspace> without that it fails
<TheMue> dimitern: I've got nothing special, so it's ok for me to not do it
<TheMue> ;)
<dimitern> my shiny new hurricane electric tunnel just works
<dimitern> (after a few issues resolved)
<voidspace> rogpeppe1: without it state.Open fails with
<voidspace> ... value *errors.errorString = &errors.errorString{s:"no reachable servers"} ("no reachable servers")
<voidspace> rogpeppe1: so I need some way of polling for when mongo is ready
<voidspace> I suspect it is waiting for replica set initiation
<dimitern> who's most git-savvy around?
<rogpeppe1> voidspace: does it work if you increase the DialOpts timeout to 30s from 10?
<rogpeppe1> dimitern: me! *chortle*
<voidspace> rogpeppe1: ah...
<dimitern> i have this issue almost every time i need to push changes from a PR branch after fixing stuff/review comments
<dimitern> it tells me Updates were rejected because the tip of your current branch is behind
<voidspace> rogpeppe1: it's set to testing.LongWait
<dimitern> how can my local branch be behind my origin/samebranch ?
<voidspace> rogpeppe1: which is ten seconds
<rogpeppe1> voidspace: exactly
<voidspace> rogpeppe1: should I create a VeryLongWait ?
<voidspace> or hard code 30 seconds?
<rogpeppe1> voidspace: just hardcode it
<voidspace> ok
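The underlying idea, as a self-contained sketch: poll until the server accepts connections, up to a 30s deadline, rather than sleeping a fixed 20s. This stands in for raising the DialOpts timeout; it is not juju's actual dial code, and the mongod address is made up:

    package main

    import (
        "fmt"
        "net"
        "time"
    )

    // waitReachable polls addr until a TCP connection succeeds or the
    // deadline passes -- the same idea as raising the dial timeout
    // instead of sleeping for a fixed period.
    func waitReachable(addr string, timeout time.Duration) error {
        deadline := time.Now().Add(timeout)
        for {
            conn, err := net.DialTimeout("tcp", addr, time.Second)
            if err == nil {
                conn.Close()
                return nil
            }
            if time.Now().After(deadline) {
                return fmt.Errorf("%s still unreachable after %v: %v", addr, timeout, err)
            }
            time.Sleep(500 * time.Millisecond)
        }
    }

    func main() {
        // 127.0.0.1:37017 is a hypothetical test mongod address.
        if err := waitReachable("127.0.0.1:37017", 30*time.Second); err != nil {
            fmt.Println(err)
        }
    }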
<dimitern> what do you guys use as a workflow? we need to document it somewhere more extensively
<rogpeppe1> voidspace: is that git complaining?
<rogpeppe1> voidspace: when you try to do the push?
<rogpeppe1> oops
<rogpeppe1> dimitern: ^
<dimitern> (i.e. more than what you do initially, but also how you do specific steps)
<dimitern> rogpeppe1, yep
<rogpeppe1> dimitern: are you rebasing before pushing?
<dimitern> rogpeppe1, the only way i found around it is push -f
<dimitern> rogpeppe1, sometimes, but not always
<rogpeppe1> dimitern: if you rebase, you'll have that problem
<voidspace> rogpeppe1: that passes
<voidspace> ship it!
<voidspace> rogpeppe1: thanks
<rogpeppe1> dimitern: if you don't, you should not
<rogpeppe1> dimitern: i've stopped doing any rebasing
<rogpeppe1> dimitern: after the discussion on juju-dev
<dimitern> rogpeppe1, without rebasing you get a lot of pollution in the commit log
<rogpeppe1> dimitern: if you're going to rebase, then do it a) before starting the review and/or b) just before sending the $$merge$$ message
<dimitern> rogpeppe1, like your PR might land from a branch whose last commit was "Fixed merge conflicts", which you'll see in the upstream master commit log, rather than your nicely formatted commit message
<rogpeppe1> dimitern: but i think i'd go with just the former
<rogpeppe1> dimitern: yup
<rogpeppe1> dimitern: this is the world we've moved into
<dimitern> rogpeppe1, i really don't like that
<rogpeppe1> dimitern: if you rebase you'll have to push -f
<rogpeppe1> dimitern: because push will only step forward in time
<voidspace> rogpeppe1: I already had an LGTM, the only change I've made since is increased that timeout
<dimitern> rogpeppe1, there has to be a better way to have both sane commit logs eventually and the ability to go over a reviewed branch fixing stuff
<voidspace> rogpeppe1: so I'm going to merge, unless you want to take a look first...
<rogpeppe1> dimitern: rebase just before sending the $$merge$$ message
<rogpeppe1> voidspace: sgtm
<voidspace> rogpeppe1: thanks
<rogpeppe1> dimitern: but as discussed on juju-dev, you can still see sane commit logs even when all commits go in
<rogpeppe1> dimitern: and i'd prefer not to add all the extra overhead of rebasing
<natefinch> I think it's worth not rebasing if we still have a way to see sane logs of all the merges going in
<rogpeppe1> natefinch: +1
<dimitern> rogpeppe1, how's that going to work?
<voidspace> natefinch: morning
<voidspace> fwereade: ping
<natefinch> voidspace: morning
<dimitern> rogpeppe1, examples?
<fwereade> voidspace, pong
<rogpeppe1> dimitern: see the juju-dev thread
<natefinch> dimitern: the thread is called "not rebasing after PR"
<dimitern> rogpeppe1, natefinch, will do, thanks
<rogpeppe1> can anyone think of a particular reason why juju/testing/charm.go should not live inside charm/testing ?
<rogpeppe1> it doesn't seem like it should live in github.com/juju/testing, to my mind
<rogpeppe1> although... i suppose we could move charm/testing into github.com/juju/testing/charmtesting
<rogpeppe1> jam, jam1: ^
<bodie_> morning all
<mgz> morning bodie_
<natefinch> rogpeppe1: seems like charm.go should be in charm/testing; seems like a good idea to keep charm stuff away from the rest of core
<rogpeppe1> natefinch: that's my thought too
<perrito666> voidspace: did you see axw mail about the replicaset enabled for local issue, apparently he found out what it was, or at least very close
<voidspace> perrito666: I did see
<voidspace> perrito666: he found a way to reproduce it
<voidspace> perrito666: which is not quite the same as finding out what it is :-)
<voidspace> being able to reproduce it is useful
<voidspace> I might do some experimentation
<perrito666> well he mentioned having an idea of what it is :) he just did not tell us
<mgz> he knows how to reproduce it, not fix it. he wasn't leaving it as an educational puzzle for you guys :)
<perrito666> mgz: oh, sad, I love puzzles
<perrito666> although, technically if we don't use ephemeral storage, it would be "fixed" :p or at least worked around
<perrito666> anyone else with good grammar and redaction skills wants to take a shot at https://github.com/juju/juju/pull/30 ?
<perrito666> rogpeppe1 and menn0 have both added very valuable comments which were addressed, but being docs, the more the merrier
 * mgz redacts perrito666
 * rogpeppe1 wonders what is different in the ephemeral fs semantics
 * perrito666 wonders if redacting means the same in English as in Spanish
<mgz> perrito666: I suspect not, but *common* usage in english is now as in "draw black lines over sensitive parts" rather than editing in general
<mgz> so, it's generally understood as closer to "censor" than "edit"
<perrito666> mgz: ahh, I see, in Spanish it's used for "create a text" or "properly format a text"
<mgz> yeah, that's pretty much the original english meaning
 * perrito666 has an outdated dictionary
<natefinch> there's the definition, and then what people *mean* - often different
<perrito666> meh, users :p
<voidspace> the build for my branch passed: http://juju-ci.vapour.ws:8080/job/github-merge-juju/?
<voidspace> (build 47)
<voidspace> but it still hasn't merged yet
<voidspace> this was a few hours ago
<mgz> voidspace: looking
<mgz> voidspace: see the end of the log
<mgz> I'll run the command manually see if it now works
<mgz> nope...
<mgz> I wonder if our bot's merging can pass when github itself will fail, I'm not sure what strategy they use exactly
<voidspace> mgz: ah, failed to merge
<mgz> voidspace: can you see if you conflict with trunk at all?
<voidspace> sure
<mgz> I wonder if the preceding pull conflicts, but when the bot pulled trunk for the next run it didn't get that yet
<mgz> or something
<natefinch> voidspace, perrito666, wwitzel3: standup
<voidspace> mgz: merge seemed to work fine, no conflict
<voidspace> pushing
<voidspace> natefinch: might just be you and me
<mgz> voidspace: also wtf is up with python and getattr({u"a":1}, "a") vs {u"a":1}["a"]
<mgz> last successful job also failed my branch to catch this, even though I *did* change it to use the unicode literal
<voidspace> mgz: getattr does an attribute lookup
<voidspace> mgz: that's an *item lookup*
<voidspace> mgz: not an attribute
<mgz> doh!
<voidspace> mgz: so uhm, they're different
<mgz> too much go
<voidspace> mgz: itemgetter?
<sinzui> mgz, My branch is merged https://code.launchpad.net/~sinzui/juju-ci-tools/local-tarball/+merge/222266
<voidspace> mgz: specifically operator.itemgetter
<mgz> I just want `"a" in {..}`
<abentley> voidspace: How about just {u"a":1}.get("a") ?
<mgz> or that
<voidspace> heh, or that
<voidspace> unless you *want* the KeyError
<sinzui> mgz, We can try to simplify git-merge-juju when you are ready
<mgz> nah, the else raises anyway
<mgz> sinzui: thanks!
<mgz> voidspace: I have requueueeeueued
<voidspace> mgz: I saw, thanks
<mgz> sinzui: in case you missed it in the log earlier, I took out the test retry for now as it was upsetting rog and dimiter's merge conflict landing fun
<sinzui> mgz, understood. I had missed that
<sinzui> mgz, did you see I had a pastebined link to how I *thought* git-merge-juju would work? I didn't know about the retry change, but run-unit-tests doesn't retry
<fwereade> sorry, didn't realise irc was wedged
<mgz> no, I missed that, will find
<fwereade> voidspace, are you around?
<fwereade> wwitzel3, also, please ping me if you have a moment
<voidspace> fwereade: I am, but in standup
<sinzui> mgz, http://pastebin.ubuntu.com/7597888/
<rogpeppe> mgz: ta!
<jcw4> natefinch: menn0 reviewed this yesterday, but wanted other eyes on it too, I've addressed all his issues but the one about the errors package, pending discussion: https://github.com/juju/juju/pull/33
<jcw4> fwereade: I'd appreciate a quick sanity check review on ^^ too?
<natefinch> forgot you can do markdown in comments, that's awesome
<natefinch> jcw4: not a huge fan of encoding one id in another one.  Why do we need to do that?  why not just have a separate field for ActionId?
<jcw4> natefinch: that was an early suggestion by fwereade
<jcw4> natefinch: not that it was intended to be in stone... just for now
<jcw4> natefinch: the benefit is facilitating watchers to filter on specific actions based on key
<voidspace> natefinch: just grabbing a drink - back shortly
<mgz> voidspace: erk, you hit the random failure the retry was in place to get around
<natefinch> mgz: doh
<natefinch> jcw4: not sure I understand how encoding two keys together makes it easier to watch one of those keys :/
<jcw4> natefinch: a unit would have a watcher on the actions collection and filter out only actions that have the unit prefix
<natefinch> jcw4: .... or filter on the unitId field if one existed instead?
<jcw4> natefinch: that was my original design
 * natefinch squints at fwereade 
<jcw4> natefinch: fwereade wasn't issuing an edict just suggesting we try the simpler approach until clarity emerged
<jcw4> :)
<jcw4> natefinch: but you're looking at ActionResults, not Actions... I propagated the pattern because it seemed to make sense to keep it consistent
<rogpeppe> anyone know of a (nicely implemented) package that implements recursive directory copying?
<fwereade> natefinch, there's a quick select-on-initial-simple-regexp which we can use to build the initial watcher cleanly, using just ids without further db hits
<jcw4> natefinch: ActionResults don't need a watcher, so a regular key would probably be fine with an ActionId and a Unit Name in it
<rogpeppe> i want to change charm/testing so it doesn't use cp -r
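A plain-Go replacement for cp -r is small enough to sketch; this version walks the tree with filepath.Walk and preserves file modes, but ignores symlinks and special files (an assumption a real charm/testing change would need to revisit):

    package main

    import (
        "io"
        "os"
        "path/filepath"
    )

    // copyTree copies the directory tree rooted at src to dst.
    func copyTree(src, dst string) error {
        return filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
            if err != nil {
                return err
            }
            rel, err := filepath.Rel(src, path)
            if err != nil {
                return err
            }
            target := filepath.Join(dst, rel)
            if info.IsDir() {
                return os.MkdirAll(target, info.Mode().Perm())
            }
            in, err := os.Open(path)
            if err != nil {
                return err
            }
            defer in.Close()
            out, err := os.OpenFile(target, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, info.Mode().Perm())
            if err != nil {
                return err
            }
            defer out.Close()
            _, err = io.Copy(out, in)
            return err
        })
    }

    func main() {
        // Hypothetical source and destination directories.
        if err := copyTree("testcharm", "/tmp/testcharm-copy"); err != nil {
            panic(err)
        }
    }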
<fwereade> natefinch, it's how relation scope watching works and it seems like a tolerable approach to me
<voidspace> mgz: heh
<voidspace> mgz: github hates me
<fwereade> jcw4, I'm not sure you're right that the results collection doesn't need a watcher; it *probably* doesn't need a watcher that'd benefit from that id style
<jcw4> natefinch: I've also already run into a couple spots where filtering on key seems to be easier than loading the whole doc
<voidspace> mgz: in fact git hates me, which is why I never got on with it in my previous attempts
<jcw4> fwereade: I see... I haven't been able to think of a watcher for results, but I may not be trying hard enough.
<jcw4> :)
<fwereade> jcw4, but I'm not sure I'd want to rule it out -- at the very least you'll want to watch for completion of a particular action, and I can imagine that as a gui user you'd probably want to get completed action notifications streamed in along with everything else
<jcw4> fwereade: +1
<natefinch> fwereade: it just requires that we have intimate knowledge of how the id is constructed.... I've been burned before by code merging two strings and then encoding the knowledge of what the two strings are supposed to be in various places in the code (or even in a single place).
<fwereade> jcw4, but given the 1:1 correspondence of actions to results I'm pretty comfortable keeping the keys the same
<jcw4> natefinch: agreed, but I've tried to abstract the details away behind funcs as much as possible
<voidspace> natefinch: I'd really like to take a break
<jcw4> fwereade: could you imagine multiple results for one action?
<natefinch> voidspace: what do you need?
<fwereade> jcw4, I don't... *think* so
<voidspace> natefinch: well, we need to decide what we're doing next (collectively) and which bit I should take on
<voidspace> natefinch: I think we've decided we're *not* going to need to stop mongo, so we're not going to need to take down a state server for maintenance
<natefinch> fwereade: do you know why we were stopping mongo to run backup before?  Was it just that we didn't know there was a better way?
<fwereade> natefinch, I'd suggest most of the problem lies in allowing the knowledge of that string's structure to spread through the codebase instead of properly abstracting it
<voidspace> natefinch: we wanted to run by fwereade the idea that instead of a url we have a backup "number" and a "juju fetch-backup n" command
<fwereade> natefinch, I forget why we didn't keep mongo going, I'm afraid
<natefinch> fwereade: I guess most of my problem with the id thing is that I don't understand why we have to do it.  Why can't you just filter by another field?  You said it causes another db hit, but I don't understand why that is.
<fwereade> natefinch, I know it was on the table, but I can't recall if there were complications, or whether the one that got implemented just happened to stop-start and that was acceptable for the use case so we didn't quibble
<fwereade> natefinch, primarily because id and txn-revno are the only pieces of information we get out of the watcher package
<fwereade> natefinch, we can make useful inferences about watcher events if we encode some meaning in the _id
<natefinch> fwereade: I see.
<fwereade> natefinch, checking the prefix is what we do when we're filtering the firehose of settings-collection changes to determine which ones apply to a particular watcher
<voidspace> natefinch: we shouldn't store any information about available backups in mongo
<voidspace> natefinch: we should use the filesystem
<fwereade> natefinch, and it seems smart to use the same method of identification where we can, lest subtle semantic drift in other fields catch us unawares
<voidspace> natefinch: otherwise restore will think there are backups that don't necessarily exist
<fwereade> natefinch, so we also construct initial state with a query for docs with a particular _id prefix
<fwereade> natefinch, and it's nice that mongo has a way of doing that query fast
<voidspace> natefinch: and restore is interesting - should we allow restoring from a backup already on the state server or should you always have to upload
<voidspace> natefinch: if we allow both we need two variants of the restore command
<fwereade> natefinch, *also* it's genuinely important that we minimise db hits in our watchers, because any one of them can block the whole watcher infrastructure if it doesn't pick up the events on the channel it registered
<fwereade> natefinch, in essence every watcher is implicitly expected to consume every watcher notification fast enough that it's ready and waiting before the next notification comes in; if that's broken, *every* other watcher has to wait for the slow one to catch up
<natefinch> fwereade: yeah, I forgot you just watch a collection, and can't pre-filter your watch (like, watch this collection and only show me things that match this query)... and then I also didn't realize all we got out of the notification was the id, not the full document
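A toy sketch of the inference fwereade describes: the watcher layer only hands out an id and a txn-revno, so encoding the unit key in the _id lets a filter pick out one unit's actions with no extra db hit. The marker layout here is hypothetical:

    package main

    import (
        "fmt"
        "strings"
    )

    // change is the shape of what the watcher infrastructure delivers:
    // just the document id and its txn revno, never the document itself.
    type change struct {
        Id       string
        TxnRevno int64
    }

    // unitActionFilter keeps only changes for one unit, purely by
    // inspecting the id prefix -- no extra db round trip needed.
    func unitActionFilter(unitKey string, in <-chan change, out chan<- change) {
        prefix := unitKey + "#a#" // hypothetical marker layout
        for ch := range in {
            if strings.HasPrefix(ch.Id, prefix) {
                out <- ch
            }
        }
        close(out)
    }

    func main() {
        in := make(chan change, 2)
        out := make(chan change, 2)
        in <- change{Id: "u#mysql/0#a#7", TxnRevno: 2}
        in <- change{Id: "u#wordpress/0#a#1", TxnRevno: 5}
        close(in)
        go unitActionFilter("u#mysql/0", in, out)
        for ch := range out {
            fmt.Println(ch.Id, ch.TxnRevno)
        }
    }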
<natefinch> voidspace: that's a good point about restoring from an existing backup on the server already
<natefinch> voidspace: yes, we need to keep the backups on disk, not in mongo, so that if everything else on the machine dies, but the disk still works, you can still get your backups.
<perrito666> hey, back
<perrito666> I notice you are talking about backups
<perrito666> my favorite subjects
<natefinch> perrito666: are you aware of why we were calling mongodump with --dbpath rather than --oplog?  sounds like --dbpath requires that you stop the db, whereas --oplog does not
<perrito666> natefinch: I became aware last night when I got a mail from fwereade about it
<bodie_> https://github.com/juju/juju/pull/42 (mgz)
<bodie_> I'll really be happy to get permanently out of charm/actions.go, heh
<bodie_> (jcw4, have a look if you like -- this is the simple version of the function to conform keys to strings)
<jcw4> bodie_: yeah I'm looking through the diff...
<bodie_> mgz, fwereade I wasn't sure if that's too many commits for a single PR, but a few of them were responses to discrete concerns raised over the initial PR, so I figured they might be good to have individually (?)
<mgz> bodie, looks much nicer
<natefinch> voidspace: ha, Wayne had requested yesterday and today off.  It's in the HR site, just not on the calendar
<natefinch> I'd forgotten
<perrito666> natefinch: ah, true, he was traveling right?
<voidspace> natefinch: ah, cool
<voidspace> so he's probably alright then
<natefinch> perrito666: yeah, he was working from some other state on Wednesday
<natefinch> mgz, fwereade, rogpeppe: anyone want to give me a crash course in writing an API facade?  I can deduce a bunch of the proper structure from the code, but getting some background would help I think
<fwereade> natefinch, I might be able to a bit later today, how long are you around for?
<natefinch> fwereade: probably later than you, I would think ;)
<fwereade> natefinch, yeah, but cath's going out soon -- laura will likely fall asleep before she gets back but the precise time that happens can vary
<natefinch> fwereade: I'll stay available
<natefinch> brb, grabbing lunch real quick
<natefinch> voidspace, perrito666:  so, I think we can split up the work so one person works on writing backup, one person works on writing restore, one person works on writing the code to  list & download the backups
<natefinch> and wayne gets whatever crap jobs are left over because he's out today
<voidspace> natefinch: sounds good
<voidspace> natefinch: I'm about EOD though
<voidspace> natefinch: I'm happy for you to assing me stuff to start working on though
<natefinch> voidspace: yep, no problem.  I'll get the facade written today so there's a place to put the code monday morning
<voidspace> awesome
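The crash course never makes it into the log, but the general shape of an apiserver facade at the time was a type whose exported methods take a params struct and return a results struct, so the RPC layer can serve them over the wire. Everything below (package backups, CreateArgs, and so on) is illustrative, not the code natefinch landed:

    package backups

    // Illustrative params types; juju keeps these in a shared params
    // package so client and server agree on the wire format.
    type CreateArgs struct {
        Notes string `json:"notes"`
    }

    type CreateResult struct {
        Id    string `json:"id"`
        Error string `json:"error,omitempty"`
    }

    // API is the facade. The convention (sketched, not quoted from the
    // source) is that exported methods with an args/result signature
    // become callable API methods.
    type API struct {
        // authorizer and state handles would be injected here.
    }

    // Create starts a new backup and returns its id. A real
    // implementation would dump the db and register the archive;
    // this one just returns a plausible-looking id.
    func (api *API) Create(args CreateArgs) (CreateResult, error) {
        return CreateResult{Id: "backup-0001"}, nil
    }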
<mgz> rogpeppe: your charm/testing branch may now conflict with trunk, watch out for that when landing
<mgz> voidspace: I'd not be happy if he was assing me stuff...
<natefinch> mgz: you get used to it after a while
<voidspace> hah
<voidspace> assign
<voidspace> no assing please
<natefinch> dang
<mgz> ehehe
<voidspace> g'night all
<jcw4> natefinch: Are you comfortable LGTM'ing this or should I get fwereade to finalize? https://github.com/juju/juju/pull/33
<natefinch> jcw4: not sure I understand enough of the background to LGTM it.  You'd just have me asking a bunch of stupid questions, like why we're munging two Ids together ;)
<jcw4> natefinch: lol... that was good (and the discussion should be captured in documentation somewhere)
<jcw4> natefinch: I'll bug fwereade
 * perrito666 hacks juju-backup a lot
<natefinch> perrito666: go go go! :)
<perrito666> no, actually bash :p
<perrito666> well mongodump "works" now, let me try to do the same with mongoexport
<perrito666> and then lets see if all of that restores
<natefinch> perrito666: this says that you restore with mongorestore --oplogReplay: http://docs.mongodb.org/manual/reference/program/mongodump/#cmdoption--oplog
<perrito666> natefinch: yup, I am trying to make export work right now
 * perrito666 spent an hour trying to make mongoexport work only to realize its output is not used to restore
<natefinch> heh oops
<natefinch> ug... talking to insurance about what is covered and what's not covered.... my favorite thing ever
<jcw4> :(
<jcw4> hopefully not because of a loss that actually occurred?
<perrito666> natefinch: usually whatever happened to you is not covered
<natefinch> perrito666: yep
<perrito666> did you have an accident?
<natefinch> perrito666: nah, just some tests we'd like to do... you know, to make sure things are ok, and if they're not, do something about it before it gets to be a big expensive problem.
 * perrito666 thinks insurance is something different in the US
<jcw4> perrito666: I think natefinch is talking about medical insurance
<jcw4> which, yes, is something different all right in the US
<natefinch> heh yeah
<perrito666> ah yes, we have public healthcare or private healthcare which are pretty much the same except for a few differences in the prettiness of the hospitals
<natefinch> it's funny how insurance companies will refuse to pay for the $300 test, but then have to pay for the $10,000 medical procedure the test would have prevented
<perrito666> natefinch: wow that is some interesting sum of money, there is no procedure in this country that costs that much
<jcw4> perrito666: our insurance system drives the prices way up for many things
<natefinch> perrito666: the US is the king of expensive procedures. There are few procedures that cost *less* than that, if they involve being in a hospital (versus a doctor's office).
<natefinch> yeah, the problem is that hospitals and doctor's practices are generally for-profit corporations, so their motivation is to make a lot of money, which means charging as much as they can get away with, and *doing* as much as they can get away with.  And since a lot of people get insurance through their work, where it's often partially paid for by work, and taken out of their paycheck before they see it.... the average consumer doesn't really see the cost of the procedures directly... unless you have no insurance, in which case you just go bankrupt any time anything bad happens.
<perrito666> wow, but just for you to use as a comparator 10k USD is around 12 times the avg salary in this country so if procedures were that expensive everyone would be dead
<jcw4> perrito666: which country are you in?
<perrito666> Argentina
<natefinch> perrito666: it's 1/5th the median salary here... so, still sort of amazingly expensive.
<jcw4> I see, and you mean avg salary there is USD 1,000 per year or per month?
<perrito666> jcw4: month
<perrito666> although real avg must be around 600
<natefinch> oh ok, you were doing per month
<jcw4> still...
<perrito666> natefinch: in a country with more than 25% inflation you don't do yearly salary calculations
<natefinch> so, 10k is like 2.5x the monthly salary for a median household in the US
<natefinch> perrito666: wow, I didn't know it was that bad
<jcw4> perrito666: that's nothing, I grew up in Zimbabwe...
 * perrito666 running tests for restore
<jcw4> :)
<natefinch> jcw4: really?
<jcw4> Although I wasn't there during the hyperinflation days
<jcw4> natefinch: yeah
<natefinch> jcw4: that is really interesting. What were your parents doing there?
<jcw4> My dad is South African - a preacher (not missionary :) )
<perrito666> at some point in the 90s we had a currency worth as much as the dollar, which was fake; at some point that was dropped and since then our currency has been losing value, which drives prices up
<jcw4> my mom is American
<jcw4> perrito666: how do people handle inflation?
<jcw4> buy durable goods and sell them later?
<perrito666> jcw4: nope, they buy dollars driving inflation even higher
<perrito666> and land or apartments, driving the prices of real estate also up
<perrito666> :p
<jcw4> :( Zimbabwe tried to make that illegal
<perrito666> so did we
<jcw4> but they gave up eventually
<perrito666> there is a large market for parallel dollar
<perrito666> which is 35% higher than official
<perrito666> and there is fiscal pressure
<jcw4> yeah... danger pay!
<jcw4> fiscal pressure?
<mgz> what's a parallel dollar?
<perrito666> if you buy dollars from an official seller you will be investigated by the IRS
<jcw4> perrito666: gotcha
<perrito666> mgz: in countries where there is fiscal control over who acquires dollars there is a black market for currency
<jcw4> a parallel black market
<natefinch> perrito666: is there interest in bitcoin there?  Not sure if there's enough internet infrastructure / mobile infrastructure to make it feasible to normal folks
<natefinch> it's like the ultimate parallel dollar..... though not very stable either, I suppose
<perrito666> natefinch: regular people invest more in traditional stuff
<perrito666> for instance it's customary to buy real estate in cash (declaring 1/3 of the price to the IRS)
<perrito666> no one takes credit or mortgages
<natefinch> heh
<natefinch> yeah
<jcw4> perrito666: how could you take long-term credit, I suppose
<natefinch> I bet credit is impossible to get for almost anything
<perrito666> ironically, people buy cars in 52 or even 60 mensualities
<jcw4> mensualities?
<jcw4> payments?
<natefinch> I guess if the payments are locked to inflation
<perrito666> jcw4: yes, thank you
<natefinch> jcw4: monthly payments
<perrito666> natefinch: well my car costs 180U$D per month including insurance
<jcw4> from what I've read inflation always favours the debtor
<natefinch> jcw4: same word base as menstruation :)
<perrito666> lol
<jcw4> :)
<jcw4> although the creditor could make terms that removed that favor
<ahasenack> hi, I'm trying to debug a relation-set failure, it fails with "argument list is too long". I want to see that argument list, but juju-log fails with the same error of course :) Before I start opening files, didn't there use to be a debug setting for charms? So it would log everything?
<natefinch> ahh work?  boo.
<jcw4> ahasenack: sorry I'm just here for the economics chatter
<jcw4> ;)
<natefinch> marcoceppi_: ^^ debug setting for charms?   Sorry, I don't know, ahasenack.
<ahasenack> some time ago the log used to be like "going to call this"
<ahasenack> and then it would do it
<ahasenack> and fail, in my case
<ahasenack> but the intent was logged and I was able to see the details there
<natefinch> ahasenack: google says that can happen if you use *.foo, because bash will expand *.foo to actually be all the different files, and if there's too many, boom
<ahasenack> natefinch: well, boom can also happen with silly things like https://bugs.launchpad.net/juju-core/+bug/1274460/comments/1
<_mup_> Bug #1274460: juju-log vs. command line length limits <juju-log> <juju-core:Triaged> <https://launchpad.net/bugs/1274460>
<ahasenack> juju-log "$output" where $output contains, say, -e stuff
<ahasenack> but ok, I'll add something to write this stuff to a file and inspect the file
<marcoceppi_> ahasenack: run juju debug-hooks
<natefinch> ahh right, duh
 * natefinch likes the way you can summon people on irc to swoop in and save the day.
 * natefinch gives marcoceppi_ a cape.
<ahasenack> it's relation_set() from charm helpers that is crashing
<ahasenack> but I'm debugging it now
<ahasenack> I think at some point in time a default changed in juju and DEBUG logging was turned off, that's what I wanted to re-enable
<perrito666> natefinch: backup restore using oplog works
<perrito666> at least the restored machine is functional
<natefinch> ahasenack:  juju set-environment logging-config "unit=DEBUG"
<ahasenack> natefinch: ah, thanks!
<natefinch> ahasenack: (figured it out from juju help logging)
<ahasenack> good to know
<natefinch> for me too :)
<perrito666> I need to leave for a moment now bbl
<natefinch> perrito666: thanks for figuring that out
<perrito666> the only difference is that we will need the password for admin to do the backup, but that is something easy if we are part of the api
<natefinch> perrito666: yep
<jcw4> fwereade: if you happen to get back on again tonight I'd like your approval to merge https://github.com/juju/juju/pull/33
<perrito666> what is the preferred way to call external binaries?
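perrito666's question goes unanswered in the log; in Go the standard answer is os/exec, which builds the argv directly and so avoids the shell-quoting trouble discussed above. The mongodump invocation is just an example (--oplog and --out are real mongodump flags):

    package main

    import (
        "fmt"
        "os/exec"
    )

    func main() {
        // exec.Command takes each argument separately, so nothing
        // is re-parsed by a shell.
        out, err := exec.Command("mongodump", "--oplog", "--out", "/tmp/dump").CombinedOutput()
        if err != nil {
            fmt.Printf("mongodump failed: %v\n%s", err, out)
            return
        }
        fmt.Printf("%s", out)
    }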
<jcw4> menn0: weird your comments showed up in my email but not in github
<jcw4> I'm going to add that errors stuff in and then hope for a merge :)  I can do it myself now, but I'd like fwereade to at least nod :)
<menn0> jcw4: for one comment today I added it then figured out it was dumb so I deleted it. That could be why.
<jcw4> menn0: lol
<jcw4> menn0: it was a good comment
<jcw4> menn0: I should really add comments there to clarify
<menn0> jcw4: the one about actionMarker vs actionResultMarker? yeah a comment could help
<jcw4> menn0: yep
<menn0> jcw4: also, with the comments describing the structures of the action and actionresult ids it might be helpful to show an example of each
<menn0> that would make it super clear
<jcw4> menn0: +1
<jcw4> menn0: godoc doesn't support any markup right?
<menn0> you're asking the wrong person. I'm fairly new to Go.
<menn0> brb
<jcw4> :)
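godoc has no inline markup, but a comment line indented further than the surrounding text is rendered preformatted, which is enough for the per-id examples menn0 asks for. The package name, constant, and id layout shown are hypothetical:

    package actions

    // actionMarker separates a unit's global key from the action
    // sequence number in an action document id, for example:
    //
    //	u#mysql/0#a#3
    //
    // (Hypothetical layout for illustration only.)
    const actionMarker = "#a#"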
<menn0> jcw4: back again. had to deal with the kids for a bit
<jcw4> those kids... :)
<jcw4> mine will be home from a day at the amusement park soon
#juju-dev 2014-06-07
<jcw4> menn0: updated PR https://github.com/juju/juju/pull/33
<jcw4> menn0: but I forgot to enhance the actionId docs... urg
 * jcw4 is EOD
<menn0> jcw4: just had a quick look. the last commit looks fine.
<jcw4> thanks menn0
#juju-dev 2014-06-08
<jcw4> mgz, fwereade (bodie_) some initial actions documentation https://github.com/juju/docs/pull/117
<jcw4> Just a first cut, driven largely by the Actions Draft Spec, but a starting point.
<bodie_> jcw4, good start, maybe I'll PR against your user repo to build on what you've got there --
<bodie_> I added some thoughts here and would appreciate input from rick_h_ and any other frontend hackers
<bodie_> https://github.com/johnweldon/juju-docs/commit/cb7a4709e9a5fed5c885cd3e5e2ecc1cd34da181#commitcomment-6594689
<bodie_> on second thought those comments might be misplaced -- I think some of that stuff should probably instead be in juju/doc
<bodie_> i.e., github doc
<bodie_> nvm, I'm addressing my concerns in a juju/juju PR which will need comments :)
<bodie_> phew
<bodie_> https://github.com/binary132/juju/blob/actions-doc/doc/actions.md
<bodie_> preliminary PR just to get some conversation flowing -- https://github.com/juju/juju/pull/46
 * thumper starts the trawl through the inbox
<waigani> morning all
 * thumper is reading the juju info email thread carefully
<thumper> I found a horrible bug with the local provider yesterday
<thumper> it was already reported
<thumper> and entirely my fault :)
<menn0> thumper: what was it?
<thumper> the template that is created for clone has a user specified log mount in it
<thumper> which will be wrong...
<thumper> for any environments other than the first
<thumper> and breaks the local provider for other users on the same machine
<thumper> waigani: I was thinking this morning... and I think I told you wrong in the review
<waigani> thumper: oh?
<waigani> thumper: I'm still getting through emails
<thumper> waigani: I'm trying to find the pull request
<waigani> is this the List branch?
<thumper> yeah
<thumper> omg, how do we find the pending reviews?
<waigani> thumper: https://github.com/juju/juju/pull/35
<waigani> thumper: pending reviews: https://github.com/juju/juju/pulls
<thumper> got it...
<waigani> man, the email list is information overload
<thumper> waigani: it can be
<waigani> thumper: do you know if you can search open files in sublime?
<thumper> waigani: ah... yeah
 * thumper reboots yet again
<thumper> oh ffs, it is still there
<thumper> "sudo apt-get remove ubuntuone-client" and now I wait...
<thumper> to see if I still get it
<thumper> nope...
 * thumper tries rebooting
<thumper> ok... so far, so good
<waigani> thumper: do I need to test dir version erroring?
<waigani> if so, I'm not sure how to
<thumper> wat?
 * thumper is about to head to the gym
<thumper> I'm not sure what you mean
<thumper> chat when back
<waigani> ah your review comment
<waigani> response to dave
<waigani> thumper: In environs/configstore/interface_test.go:
<waigani> thumper: @davecheney the dir version could error (in theory)
#juju-dev 2015-06-01
<wallyworld> anastasiamac: can i please get a review on this which is a fix for one of the issues discovered last week during the outage analysis http://reviews.vapour.ws/r/1824/
<anastasiamac> wallyworld: looking :D
<anastasiamac> fun \o/
<davecheney> ls
<wallyworld> jam: hey, you around?
<jam> hiya wallyworld
<wallyworld> quiet today with everyone away
<wallyworld> anyways
<wallyworld> i have a MP for python-jujuclient which retries sending requests if juju says it is upgrading
<wallyworld> it should address some of the core issues deployer is having
<wallyworld> could you take a look?
<wallyworld> https://code.launchpad.net/~wallyworld/python-jujuclient/retry-on-upgrade/+merge/260658
<jam> wallyworld: I do wish it was trivial to backoff retries
<wallyworld> yeah
<wallyworld> it *could* be implemented, but this i think is an ok first step
<wallyworld> it covers the small window where juju machine agent needs to first check if upgrades are needed
<wallyworld> during which time the api is limited and so the "upgrade error" is reported
<wallyworld> which would be < 1 second normally
<wallyworld> or thereabouts
<wallyworld> 99% of the time (or pick your own stat), the api goes from limited -> open because no upgrade is required
<wallyworld> this stops the case where people juju bootstrap && juju-deploy via a script
<wallyworld> from going wrong
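The actual fix is in python-jujuclient, but the retry-on-upgrade idea, plus the backoff jam wishes were trivial, sketches easily. Go is used to match the other examples here, and every name is illustrative:

    package main

    import (
        "errors"
        "fmt"
        "strings"
        "time"
    )

    // callWithRetry retries fn while it reports the server is
    // mid-upgrade, doubling the delay on each attempt.
    func callWithRetry(fn func() error, attempts int) error {
        delay := time.Second
        var err error
        for i := 0; i < attempts; i++ {
            err = fn()
            if err == nil || !strings.Contains(err.Error(), "upgrade in progress") {
                return err
            }
            time.Sleep(delay)
            delay *= 2
        }
        return err
    }

    func main() {
        // Simulate a server that reports upgrading twice, then recovers.
        calls := 0
        err := callWithRetry(func() error {
            calls++
            if calls < 3 {
                return errors.New("upgrade in progress")
            }
            return nil
        }, 5)
        fmt.Println(calls, err)
    }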
<jam> wallyworld: if it's upgrading can't we also get disconnected completely during this time?
<wallyworld> so, yes, but what a recent juju change did was keep the api in limited mode until the check to see if an upgrade is required. if an upgrade is required, then it does that without giving the deployer a chance to connect simply to be disconnected
<wallyworld> but that change opened a small window where the deployer trying to connect initially got the "upgrading error"
<wallyworld> because the upgrade worker needed to start
<jam> wallyworld: so I see that you're retrying "upgrade in progress" which is fine, my concern is are we also retrying "I got disconnected completely". IIRC the latter is what broke OIL, etc.
<wallyworld> i didn't intend to retry anything other than "we are upgrading"
<wallyworld> for this change
<wallyworld> the "we got disconnected" case is a bit separate
<jam> wallyworld: Isn't the original bug about getting disconnected vs upgrading?
<jam> (the problem with upgrading is that deployer got disconnected and then just died)
<wallyworld> jam: hangout? a bit easier to explain
<wallyworld> https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand
<jam> sec, need to grab headphones
<wallyworld> ok
<dimitern> dooferlad, standup?
<voidspace> dimitern: thanks to dooferlad it now works!
<dimitern> voidspace, sweet!
<dimitern> voidspace, omw to our 1:1
<voidspace> dimitern: grabbing coffee first
<dimitern> voidspace, sure
<voidspace> dimitern: omw
<perrito666> morning
<dimitern> perrito666, o/
<wallyworld> sinzui: hey, i did a python-jujuclient change to fix the issue of deployer complaining that juju is upgrading. but i need to talk to a maintainer to get that merged (it's approved), and then we need to figure out how to unblock landings
<wallyworld> sinzui: ^^^^ - i can get the fix landed but am unsure what to do next to unblock things. can we add a 1 sec delay to the CI test until python-jujuclient gets rolled out
<sinzui> wallyworld: We automatically test that package.
<sinzui> wallyworld: all slaves use the juju ppa to get its packages. That is how we caused the quickstart regression last week
<wallyworld> sinzui: so as soon as a python-jujuclient fix lands in source, CI will grab that copy?
<sinzui> wallyworld: no, CI gets the built packages
<wallyworld> how long from branch getting merged till CI using the changes?
<sinzui> wallyworld: about 1 hour after the package is built by Lp
<wallyworld> ok, great, i'm just trying to see how long we may still be blocked for
<katco> ericsnow: standup
<sinzui> wallyworld: katco: Do either of you have a minute to review http://reviews.vapour.ws/r/1829/
<katco> sinzui: in a meeting, sec
<wallyworld> sinzui: +1
<sinzui> thank you wallyworld
<dimitern> wallyworld, hey there
<dimitern> wallyworld, if you can find some time, please have a look at http://reviews.vapour.ws/r/1830/ - instancepoller using the api
<ericsnow> wallyworld: any help I can give on #1460171?
<mup> Bug #1460171: Deployer fails because juju thinks it is upgrading <blocker> <ci> <deployer> <regression> <upgrade-juju> <juju-core:In Progress by wallyworld> <python-jujuclient:In Progress by wallyworld> <https://launchpad.net/bugs/1460171>
<dimitern> dooferlad, voidspace, ^^
<voidspace> dimitern: looking
<dimitern> voidspace, thanks!
<wallyworld> ericsnow: waiting for patch to land in python-jujuclient - no core changes
<wallyworld> should be soon i hope
<ericsnow> wallyworld: cool
<wallyworld> thanks for asking
<ericsnow> wallyworld: :)
<wallyworld> dimitern: sorry, was talking to someone else
<dimitern> wallyworld, no worries
<voidspace> dimitern: why does facade version start at 1 whilst others start at 0
<dimitern> voidspace, new facades should start at 1
<dimitern> (there was some decision about this some time ago)
<perrito666> voidspace: 0 is for facades previous to versioning iirc
<voidspace> cool, thanks
<dimitern> wallyworld, I'd appreciate if you can confirm the instancepoller should start once per apiserver (rather than per environment)
<dimitern> fwereade, ^^
<wallyworld> dimitern: so long as it knows how to deal with mult envs
<wallyworld> machines are per env after all
<fwereade> dimitern, wallyworld: yeah, it sounds like a per-env thing to me
<wallyworld> +1
<wallyworld> we could have just the one, but polling intervals get tricky
<fwereade> dimitern, wallyworld: and including multi-env logic in the instancepoller, rather than just running N of them, would seem suboptimal
<wallyworld> yeah
<dimitern> fwereade, wallyworld, but each running instance should only work for a given env?
 * dimitern wonders if requiring JobManagerEnviron will make this "just work", like for other "singleton" workers
<wallyworld> dimitern: almost 1am here, my brain is dead sorry, i need sleep
<dimitern> wallyworld, get some sleep then! :)
<wallyworld> can talk more tomorrow unless fwereade sorts it out
<dimitern> sure, no problem
<wallyworld> see ya later
<fwereade> dimitern, yes, each instance is part of one and only one env
<dimitern> fwereade, so I guess starting one per env should work, as login will take care of which envs to use and subsequently what will the watchers report
<fwereade> dimitern, I think you should just be starting the instancepoller alongside the firewaller and provisioner for each environment
<dimitern> fwereade, right
<dimitern> fwereade, so I'll change that, but the rest should be fine
<dimitern> fwereade, thanks!
<fwereade> dimitern, hey, has instancepoller just always been running non-singular?
<fwereade> dimitern, I'm pretty sure we don't want one per state server per env
<fwereade> dimitern, ...in fact
<fwereade> dimitern, instance address-setting txns have been among the ones we've seen clogging up stuck environments, right?
<dimitern> fwereade, so far it was started in the StateWorker() method of the MA
<fwereade> dimitern, and the problems with mgo/txn absolutely centre around separate flushers racing to write the same doc
<dimitern> fwereade, which means once per state server
<fwereade> dimitern, it's also in startEnvWorkers
<fwereade> dimitern, ...or only there
<dimitern> fwereade, now it's only in startEnvWorkers (running tests still)
<fwereade> ah ok
<fwereade> dimitern, but I *do* see it non-singular in startEnvWorkers
<dimitern> fwereade, where?
<fwereade> dimitern, and as a worker that's yammering at the provider api we definitely want it to be singular, I think, not to mention my FUD about it causing the sort of workload that stresses mgo/txn
<fwereade> dimitern, :1116 in master
<fwereade> 	runner.StartWorker("instancepoller", func() (worker.Worker, error) {
<fwereade> 		return instancepoller.NewWorker(st), nil
<fwereade> 	})
<dimitern> fwereade, right!
<fwereade> dimitern, so s/runner/singularRunner/ and we get a little bit better in a couple of good ways too
<fwereade> dimitern, (on top of passing in the api instead of the state :))
<dimitern> fwereade, in a call, will get back to you
<cherylj> sinzui: Should I backport bug 1442308 to 1.23?
<mup> Bug #1442308: Juju cannot create vivid containers <ci> <cloud-installer> <local-provider> <lxc> <ubuntu-engineering> <vivid> <cloud-installer:Confirmed> <juju-core:In Progress by cherylj> <juju-core 1.24:Fix Committed by cherylj> <https://launchpad.net/bugs/1442308>
<sinzui> cherylj: no, I don't think we will make a 1.23.4 release since we will propose 1.24.0 on Thursday
<cherylj> ok, thanks!
<sinzui> cherylj: I will add a task to the bug as WONT FIX to be clear that we chose not to
<cherylj> sinzui: awesome, thank you
<voidspace> rebooting *sigh*
<natefinch> abentley: you around?
<abentley> natefinch: Yes, but I have standup now.  I'll ping you when done.
<natefinch> abentley: thx
<voidspace> dimitern: ping
<voidspace> dimitern: if you're still around
<voidspace> dimitern: I'm still doing your review by the way...
<voidspace> it's big
<voidspace> (the patch I mean)
<voidspace> but also trying to bootstrap juju with MAAS
<voidspace> and failing - hard to tell if current failure is a MAAS problem or a juju problem, or something else
<voidspace> last problem was HP proprietary drivers causing deploy to fail
<voidspace> current problem is this:
<dimitern> voidspace, yeah, I'm here
<dimitern> voidspace, sorry about the size - it's mostly tests though :)
<voidspace> dimitern: http://pastebin.ubuntu.com/11499441/
<voidspace> dimitern: heh, indeed
<dimitern> voidspace, looking
<voidspace> dimitern: so juju fails to contact MAAS (connection refused)
<voidspace> fetching that URL in the browser works
<voidspace> and there's nothing useful in the MAAS logs
<voidspace> the MAAS node is deployed
<voidspace> dimitern: I updated MAAS version and am running juju latest master
<dimitern> voidspace, why localhost?
<voidspace> dimitern: because MAAS is running locally
<dimitern> voidspace, on port 80?
<voidspace> hmmm... apparently
<voidspace> yes
<voidspace> that's working fine
<dimitern> voidspace, try bootstrapping with --debug to get more context
<voidspace> dimitern: ok, will do
<voidspace> dimitern: it takes about ten minutes or so because these proliants are *slow* to boot
<voidspace> dimitern: the intelligent bios thing takes several minutes to do its thing
<voidspace> I might try and disable it
<voidspace> but it can run in the background whilst I continue the review
<dimitern> voidspace, is MAAS itself configured with http://localhost/MAAS/ ?
<dimitern> voidspace, dpkg-reconfigure maas (IIRC)
<voidspace> dimitern: I'll check
<voidspace> when I went to 127.0.0.1/MAAS instead of localhost I had to login again
<voidspace> so there may be a difference
<voidspace> I'll wait until this bootstrap completes
<dimitern> voidspace, ok
<dimitern> voidspace, I'm pretty sure the MAAS URL has to match exactly - both in maas config and in juju's
<voidspace> dimitern: yep, good call
<abentley> natefinch: I'm free now.
<voidspace> dimitern: I think it needs a visible url and not a local url
<voidspace> dimitern: trying with the machine IP address
<dimitern> voidspace, that sounds good
<voidspace> dimitern: i.e. a node can't use 127.0.0.1 to reach the MAAS API
<voidspace> taking a break
<dimitern> voidspace, I have a similar setup locally, but I use a 192.168.50.X - .2 for maas, the rest for the nodes
<dimitern> voidspace, ok, I'll need to go, but might be back later
<voidspace> dimitern: thanks, see you later
<natefinch> abentley: I was going to do something like this to add the actions feature flag to the CI tests... is this acceptable? http://pastebin.ubuntu.com/11499809/
<abentley> natefinch: That won't work because EnvJujuClient24 is only used for juju 1.24.  I meant that you should add an EnvJujuClient22 that was used for juju 1.22, that supplied the 'actions' feature flag.
<abentley> natefinch: A heads-up: jog is landing support for -e with "action do" and "action fetch" today.
<abentley> natefinch: In this branch: https://code.launchpad.net/~jog/juju-ci-tools/start_chaos
<voidspace> dooferlad: hah, and four days later I have a working juju bootstrapped to MAAS on an HP proliant
<voidspace> dooferlad: the PDU seems to be working fine now too, both for switching machines on and off
<voidspace> dooferlad: http://pastebin.ubuntu.com/11500002/
<natefinch> abentley: I'm not really prepared to spend very much more time on this CI test. It's already taken 3-4 times as long as I had anticipated & scheduled
<natefinch> cc katco ^^
<natefinch> abentley: but if I can just remove my action code and merge with what jog lands, that's fine with me, though it would make for a lot of wasted work on my part.  It's unfortunate both of us were working on the same functionality.
<natefinch> abentley: or maybe I misunderstood what you were talking about.. do you mean he was landing code in the tests or juju-core
<jog> natefinch, sorry I was working on another project and just discovered our juju-ci-tools lib needed to handle actions differently on Friday.
<abentley> natefinch: He's just done an alternative implementation of the _full_args change, none of the rest.
<natefinch> abentley: oh ok, that's good.  I'm glad we didn't overlap much
<natefinch> abentley: do I have to do more in the EnvJujuClient22 than implement the _shell_environ, and add a new elif in EnvJujuClient.by_version?  Something like this? http://pastebin.ubuntu.com/11500166/
<abentley> natefinch: That's all you need to do for that.
<natefinch> abentley: thanks
<katco> natefinch: abentley: hey... so these CI tests are being wrapped up then?
<natefinch> katco: yeah
<katco> yay :D
<natefinch> why the heck do I have to log into ubuntu to "download as text" from pastebin.ubuntu.com?
<perrito666> lol
<perrito666> you can always report it as a bug
<katco> wwitzel3: ping
<wwitzel3> katco: pong
<katco> wwitzel3: hey on the rich status spec? who do you think from ecosystems/accounting would be good to ping?
<katco> wwitzel3: it has to do with charm metadata, so charmers for sure. and i would think someone from accounts would want to give input on what information they'd like when doing installations
<wwitzel3> katco: not 100% sure, so I'd ping arosales and ask him for some candidates that might have a strong interest/opinion
<katco> wwitzel3: ty. arosales, any volunteers? https://docs.google.com/document/d/1JcWkE4SNxXuFClZGBcwnU3w13IpRU1yxMhddQG6mKyE/edit#
<arosales> katco, /me looking . .
<katco> arosales: ty sir
<arosales> katco, I'll bring it up on our daily and send a mail out on it too
<katco> arosales: ty... please let me know who you'd like to delegate so i can add them to the reviewers list
<arosales> katco, will do
<arosales> katco, thanks for looking for the feedback
<katco> arosales: ty again!
<arosales> katco, np. I should have some more information this afternoon.
<katco> arosales: i'm also pulling marcoceppi into https://docs.google.com/document/d/1LORhaYvk_A8yMHkAb9FR_cN9V0S55zEx-T6QXdmr3fU/edit#
<katco> arosales: he expressed interest in nuremberg
<arosales> katco, ah yes, he's a good one for min version
<katco> arosales: juju min. version is the one we'll be focusing on next
<abentley> natefinch_afk: jog's stuff has landed now.
<natefinch_afk> abentley: thanks
<natefinch_afk> abentley, sinzui:  I get this error on several of the tests, despite having run make install-deps
<natefinch_afk> OSError: /usr/lib/python2.7/dist-packages/lookup3.so: cannot open shared object file: No such file or directory
<sinzui> I wonder what that is
<sinzui> natefinch_afk: It appears to relate to jenkins and I see several reports of it failing
<natefinch> sinzui: yeah, just found some interesting things... I found it in /usr/local/lib/python2.7/dist-packages/
<sinzui> natefinch_afk: my apt-cache policy python-jenkins says I have 0.2.1-0ubuntu1
<natefinch> Installed: 0.2.1-0.1
<sinzui> natefinch: how did you get that version? pip? easy_install?
 * sinzui thinks we need the ubuntu version
<natefinch> sinzui: quite possibly
<natefinch> sinzui: I didn't know about make install-deps when I started, so I was just installing stuff however I could find it
<sinzui> natefinch: understood. I have to do the same on the win and OS X machines. The issue I am reading implies the jenkins lib does work on OS X, but it is working wel enough for our tests
<natefinch> I'm on ubuntu... just ran pip install (I think?) because I didn't know how else to get it
<natefinch> and..... now pip is dumping a giant stack trace when I do pip uninstall jenkins.  Nice.
<abentley> natefinch: If you ran make install-deps, you should have python-jenkins installed via apt.
<sinzui> natefinch: you can run pip uninstall jenkins?
 * sinzui isn't sure of the pip package name
<natefinch> sinzui: I can try and have it fail
<natefinch> sinzui: it seemed to recognize the name
<natefinch> abentley: yeah, apt seemed to think I had it installed via apt
<sinzui> abentley: surely pip is installing in a path that takes precedence.
<natefinch> I removed and reinstalled the apt version, it still gives me  0.2.1-0.1
<abentley> I do not have lookup3 installed, and I don't seem to need it.
<abentley> I have python-jenkins 0.2.1-0.1 installed.
<natefinch> full stack trace from running tests (there are a handful of these): http://pastebin.ubuntu.com/11502857/
<abentley> natefinch: Can you delete /usr/local/lib/python2.7/dist-packages/jenkins.py or at least move it aside so that the correct jenkins lib gets loaded?
<natefinch> abentley: sure
<natefinch> FYI, I don't have  /usr/lib/python2.7/dist-packages/jenkins.py
<natefinch> (if I'm supposed to)
<natefinch> It looks like all my jenkins stuff got installed to /usr/local/lib/python2.7/dist-packages/  instead of /usr/lib/python2.7/dist-packages/
<natefinch> that sounds like "you installed something with or without sudo when you should have done it the other way"   but I have no idea what, being both a linux and python n00b
<abentley> natefinch: No, you shouldn't have that, you should have /usr/lib/python2.7/dist-packages/jenkins/__init__.py
<natefinch> abentley: ahh, ok, yes, I have that
<natefinch> I guess get_python_lib()  must be returning the wrong thing
<abentley> natefinch: There are at least two incompatible packages providing 'jenkins': https://pypi.python.org/pypi/jenkins https://pypi.python.org/pypi/python-jenkins and the one installed in /usr/local/lib is the wrong one.
<natefinch> abentley: how am I supposed to install it?
<abentley> natefinch: The right one is already installed.  You just have to get rid of the wrong one.
<natefinch> abentley: ahh, ok, I figured it out. pip uninstall, instead of saying "Hey, this needs to be run with sudo" instead dumped a giant ugly stack trace.
<natefinch> which I incorrectly interpreted as "jenkins wasn't installed with pip"
<natefinch> that fixed it
<rogpeppe> thumper: hiya
<natefinch> is there a bzr plugin that'll let me run an external merge tool to fix conflicts?  I found bzr-extmerge, but it appears to be ancient (tries to run with python 2.4)
<natefinch> thumper, sinzui, abentley: ^^
<abentley> natefinch: No, extmerge is the only one I'm aware of.  But bzr dumps THIS, BASE and OTHER files that you can use an arbitrary tool with.
 * natefinch closes his eyes and runs sudo python ./setup.py install
<sinzui> wallyworld: do you think the maas 1.7 test would pass if we added a 30s delay between bootstrap and deployer?
<wallyworld> sinzui: yes
<wallyworld> sinzui: not even 30s, more like 1 second
<wallyworld> or 2
<sinzui> let me try to solve the issue.
<sinzui> wallyworld: I will start with 5 seconds
<wallyworld> ok :-)
<marcoceppi> katco: you still around?
<sinzui> wallyworld: I am adding a call to status between bootstrap and deployer. Do you think that is enough time? Do you have a branch ready to merge to test my change? I don't want to start a test of an old revision if you have work queued.
<wallyworld> sinzui: everything you need to test should be in tip of 1.24
<wallyworld> sinzui: the python-jujuclient work simply retries during the second or so you will be delaying
<wallyworld> which would make the delay unnecessary
<sinzui> wallyworld: I am pushing a change to all the slaves. I will retest 1.24 tip when I see the changes arrive
<wallyworld> sinzui: tyvm, i will wait with bated breath
<mup> Bug #1460184 changed: Bootstrapping fails with Maas on Ubuntu Vivid <maas-provider> <vivid> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1460184>
<wallyworld> ericsnow: any chance of a trivial time display fix review? the code change is one line, the test changes are a search and replace http://reviews.vapour.ws/r/1823/
<ericsnow> wallyworld: sure
<wallyworld> ty
<ericsnow> nice: "You Require More Vespene Gas" (in a test)
<ericsnow> wallyworld: ship-it!
<wallyworld> ericsnow: ty
<ericsnow> wallyworld: any time
<katco> marcoceppi: am now, what's up?
<wallyworld> waigani_: heya, you working on bug 1376246 ?
<mup> Bug #1376246: MAAS provider doesn't know about "Failed deployment" instance status <landscape> <maas-provider> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1376246>
<davechen1y> great, build is blocked, again
<waigani_> wallyworld: no, I should be able to start on that today though.
<wallyworld> waigani_: great, because we want 1.24 work done so we can look to do a release overnight
<waigani_> wallyworld: okay, let me get a bite to eat and I'll get into it
<wallyworld> ty
<axw> wallyworld: sorry I missed standup, been on the phone with iinet for 40 minutes trying to get my account unlocked :/
<wallyworld> axw: gawd, i hate isps. all fixed?
<axw> wallyworld: yeah, silly error while setting up my new modem. OTOH, seems I got swapped to the new port and now I'm syncing at 16Mb as opposed to 4Mb I was getting for the last few months
<wallyworld> oh good :-)
<wallyworld> axw: you free now for a chat?
<axw> sure, just a quick one tho
<axw> see you in standup
#juju-dev 2015-06-02
<waigani_> wallyworld: lp:gomaasapi/enum.go only has 7 node status consts, where lp:maas/maasserver/enum.py has 15. Do you know if there is a reason for that, or is gomaasapi out of date? If so, I'll add the missing 8 statuses, along with the "Failed deployment" status needed to fix this bug.
<axw> wallyworld anastasiamac: 1.24 release notes are here: https://docs.google.com/document/d/1qKWvSZ06Vx3ZI2RxYg6P7sWdIvOD5QXPfvpUNEtImMA/edit
<cherylj> thumper: ping?
<thumper> hey
<cherylj> should DestroyEnvironment be moved from apiserver/client to apiserver/environmentmanager so it could be called when you're just logged into a system?
<thumper> ah...
<thumper> no
<thumper> don't think so because it actually operates on an environment...
<thumper> which makes me rethink slightly
<thumper> bugger
<cherylj> heh
<thumper> ah FFS
 * thumper thinks
<thumper> shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit shit vv
<cherylj> :(
 * thumper rethinks the option of splitting the command up
<thumper> damn it
<cherylj> thumper: need to chat about it?
<thumper> yeah
<cherylj> okay, one sec.
<thumper> gimmie a few minutes to finish munching
<cherylj> sure
<thumper> cherylj: https://plus.google.com/hangouts/_/canonical.com/destroy-all-the-environments
<wallyworld> waigani_: yes, gomaasapi out of date, which is part of the reason for the bug
<waigani_> wallyworld: I've fixed the bug, now I just have to refresh my memory on bzr to propose to gomaasapi...
<wallyworld> waigani_: does the fix include updating pending machine status to error as per the bug report?
<wallyworld> bzr push lp:~waigani/gomaasapi/your-branch
<waigani_> wallyworld: it adds a TestBootstrapNodeFailedDeploy to the maas provider - that is, bootstrap returns an error
<wallyworld> waigani_: i see, and so that propagates through to make start instance error
<wallyworld> it works for any instance starting up, not just bootstrap?
<waigani_> wallyworld: yes, the fix is in waitForNodeDeployment - but l'll follow up with more testing to make sure
<wallyworld> waigani_: thanks, i'm just being cautious
<waigani_> wallyworld: yep, of course
<axw> anastasiamac: storage-add is all done on 1.24 right?
<axw> anastasiamac: just updating the docs
<wallyworld> axw: anastasiamac had to go out to the school for a bit; add is done except for the 2 remaining PRs for the hook tool fixes
<axw> wallyworld: thanks
<axw> wallyworld: is it MAAS 1.8 that's required for disk constraints?
<wallyworld> axw: yep
<axw> wallyworld anastasiamac: updated docs, would appreciate your eyes over it when you have time: https://github.com/juju/docs/pull/443
<wallyworld> axw: ty, am finishing a branch will look soon
<anastasiamac> axw: tyvm!
<anastasiamac> axw: will look soon :D
<axw> thanks
<anastasiamac> axw: tyvm for comments on PR as well :D
<anastasiamac> axw: if u have time to cast ur eyes over this one, would be gr8!!
<anastasiamac> axw: http://reviews.vapour.ws/r/1828/
<davechen1y> thumper: http://paste.ubuntu.com/11510221/
<davechen1y> latest state of play
<davechen1y> still very messy
<mup> Bug #1460882 was opened: provider/joyent: multiple data races <juju-core:New for dave-cheney> <https://launchpad.net/bugs/1460882>
<thumper> davechen1y: keep at it, you're doing a great job!
<waigani_> wallyworld: https://code.launchpad.net/~waigani/gomaasapi/faildeploy/+merge/260786
<wallyworld> waigani_: thanks will look soon
<waigani_> wallyworld: once that lands I'll update dependencies.tsv and propose the juju-core pr
<davechen1y> thumper: /me salutes
<axw> anastasiamac: :(  it occurred to me that storage-add is a bit too permissive with the constraints. unit should not be able to specify which pool to add storage from
<axw> anastasiamac: that's an operator concern. units should only be able to request more of storage that has already been assigned
<axw> anastasiamac: i.e. they should be able to specify count -- not even sure about size
<axw> I think disallowing pool would be sufficient for now
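
A minimal Go sketch of the restriction axw is proposing for the storage-add hook tool: a unit may only ask for more instances of storage it already has, so a pool (and possibly a size) in the constraints is rejected. The cons type and validateUnitStorageAdd are illustrative names, not juju's actual API:

    package main

    import (
        "errors"
        "fmt"
    )

    // cons loosely mirrors the shape of storage constraints (illustrative only).
    type cons struct {
        Pool  string
        Size  uint64 // MiB; whether units may set this was still being debated
        Count uint64
    }

    // validateUnitStorageAdd rejects constraints a unit should not control.
    func validateUnitStorageAdd(c cons) error {
        if c.Pool != "" {
            return errors.New("pool is an operator concern; units may not specify it")
        }
        if c.Count == 0 {
            return errors.New("count must be at least 1")
        }
        return nil
    }

    func main() {
        fmt.Println(validateUnitStorageAdd(cons{Pool: "ebs", Count: 1})) // rejected
        fmt.Println(validateUnitStorageAdd(cons{Count: 2}))              // ok: <nil>
    }
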
<axw> anastasiamac: added a comment to review - when doing dependent branches, please use rbt to set the parent branch
<menn0> thumper: here's the PR that adds the logsink log file
<menn0> thumper: http://reviews.vapour.ws/r/1835/
<thumper> k
<menn0> thumper: my only concern with it is that in a large env there's potentially 1000's of request handlers using the same lumberjack.Logger
<menn0> thumper: it's goroutine safe (using a mutex internally) so that's not a problem
<thumper> menn0: are we writing one file per stateserver or one file per environment?
<menn0> thumper: but I do wonder about the performance impact for the logsink API
<menn0> thumper: one per stateserver
<thumper> hmm.....
<menn0> thumper: the env UUID is included in each log line
<thumper> I suppose it is only for post-mortems
<menn0> thumper: separate log files would be trickier to manage when envs are destroyed
 * thumper nods
<menn0> thumper: re the performance aspect I was thinking there could be a single goroutine which writes to the file, using a buffered channel to help cope with bursts
<menn0> thumper: not sure if it's worth the complexity (and it might not really help that much)
<thumper> hmm...
<thumper> yeah, probably not worth it just yet
<menn0> thumper: the way things are now, if the server end starts slowing down it should be fine because there's tons of buffering on the client side
<thumper> I guess we'll see how it performs :)
<menn0> thumper: i could run the performance test again
<thumper> up to you...
<thumper> I'm not too worried at this stage
<menn0> ok
<menn0> i'll make a note to run the test again - maybe when i'm writing specs later this week
<natefinch> menn0, thumper: you can easily toss a bufio.Writer around lumberjack and get buffered writing
<natefinch> basically zero complexity.  I know another application using lumberjack did the same thing, because they were doing a ton of small writes.
<wallyworld> axw: could you look at this for me? discussing with curtis we have been wanting this for a little while - bootstrap command waits till api server is ready before exiting http://reviews.vapour.ws/r/1836/
<menn0> natefinch: not a bad idea ...
<wallyworld> waigani_: gomaasapi change looks good
<axw> wallyworld: looking
<waigani_> wallyworld: cool - that will allow the maas provider to report the right error
<menn0> natefinch: the problem is making sure things make it to disk when the agent dies
<natefinch> menn0: yep, that's a problem
<menn0> natefinch: looking at the Writer code it looks like unless you call Flush or Close before things finish up the contents of the buffer at the end won't make it to disk
<menn0> natefinch: so i'd still need something watching the api server's tomb which calls flush as it's dying
<menn0> natefinch: anyway i'll leave things as they are for now. the OS's own caching may be just fine.
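
For reference, natefinch's bufio suggestion plus the flush-on-shutdown concern menn0 raises looks roughly like this (a sketch assuming the lumberjack v2 import path; the file path and sizes are invented):

    package main

    import (
        "bufio"

        "gopkg.in/natefinch/lumberjack.v2"
    )

    func main() {
        lj := &lumberjack.Logger{
            Filename:   "/var/log/juju/logsink.log", // assumed path
            MaxSize:    100,                         // megabytes before rotation
            MaxBackups: 3,
        }
        w := bufio.NewWriter(lj) // buffers the many small writes
        defer func() {
            w.Flush() // without this, buffered lines are lost when the agent dies
            lj.Close()
        }()
        w.WriteString("a burst of small writes gets coalesced by the buffer\n")
    }
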
<menn0> thumper: do you have time to look at that PR?
<thumper> menn0: yeah
<thumper> in a few minutes
<menn0> thumper: thanks
<axw> wallyworld: reviewed, probably good to ship but I have a few questions first
<wallyworld> sure, ty
<wallyworld> axw: answers published
<davechen1y> thumper: here is a nasty one https://bugs.launchpad.net/juju-core/+bug/1460893
<mup> Bug #1460893: many unhandled assigned values <juju-core:New> <https://launchpad.net/bugs/1460893>
<axw> wallyworld: I'll shipit, but what I had expected was that when you bootstrap, Juju wouldn't even export a limited API until bootstrap+possible upgrade had finished
<wallyworld> axw: but then that would not allow status to run
<axw> wallyworld: why does that matter? if we're blocking until the API is available?
<wallyworld> axw: and it's even harder because the upgrade check needs the api
<wallyworld> so we need to start the api for the upgrade worker
<wallyworld> hence everything can see it also
<wallyworld> and so log in
<wallyworld> is that making sense
<axw> wallyworld: it's not unsolvable, but I understand that that makes it more difficult
<wallyworld> it would need changes down at the rpc level
<axw> yep
<wallyworld> so messy this late in 1.24
<wallyworld> and we've wanted bootstrap to behave like this anyway
<axw> it could use loopback-only during bootstrap-upgrade, but that probably wouldn't help for local provider
<wallyworld> likely, yes
<wallyworld> we can take another stab in 1.25 or something
<axw> wallyworld: it's got a rubber stamp now
<wallyworld> ty
<wallyworld> tested on amazon, just testing on local to be sure
<menn0> natefinch: how do the log files managed via lumberjack end up with the "correct" ownership and perms?
<menn0> natefinch: i'm not seeing where that happens
<natefinch> menn0: https://github.com/natefinch/lumberjack/blob/v2.0/lumberjack.go#L200
<menn0> natefinch: ok right. so b/c the shell script used with upstart/systemd sets the initial mode and owner, lumberjack perpetuates it
<natefinch> menn0: yep
 * menn0 needs to do something similar for this other log file
<natefinch> menn0: that's like 2/3rds of the reason I left the upstart script alone: I didn't want to have to recreate that logic somewhere else.
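
The pattern they are describing - reusing an existing file's mode and ownership when reopening it - boils down to something like this simplified, Unix-only sketch (not lumberjack's actual code):

    package main

    import (
        "os"
        "syscall"
    )

    // openLikeExisting reopens name for appending, copying the mode and
    // ownership of any existing file, so whatever the upstart/systemd
    // script set up initially is perpetuated.
    func openLikeExisting(name string) (*os.File, error) {
        mode := os.FileMode(0600)
        uid, gid := -1, -1 // -1 means "leave unchanged"
        if info, err := os.Stat(name); err == nil {
            mode = info.Mode()
            if st, ok := info.Sys().(*syscall.Stat_t); ok {
                uid, gid = int(st.Uid), int(st.Gid)
            }
        }
        f, err := os.OpenFile(name, os.O_CREATE|os.O_APPEND|os.O_WRONLY, mode)
        if err != nil {
            return nil, err
        }
        if uid != -1 {
            f.Chown(uid, gid) // only effective when running as root
        }
        return f, nil
    }

    func main() {
        f, err := openLikeExisting("machine-0.log")
        if err != nil {
            panic(err)
        }
        defer f.Close()
        f.WriteString("hello\n")
    }
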
<mup> Bug #1460893 was opened: many unhandled assigned values <juju-core:New> <https://launchpad.net/bugs/1460893>
<thumper> time to head off and give a juju talk
<wallyworld> axw: if you have time, i'd like another set of eyes on a problem
<axw> wallyworld: yup
<wallyworld> so in that recent PR with bootstrap - it fails on local provider because the isAgentUpgrading() never returns false
<wallyworld> so bootstrap polls and gives up
<wallyworld> and yet the bootstrap is ok because deploy works
<wallyworld> so i can't quite see right now why the channel to signal agent upgrade checks are finished isn't closed
<axw> wallyworld: I don't see any isAgentUpgrading(), did you change something and not upload?
<wallyworld> no, don't think so. that's in jujud/agent/machine.go
<axw> isAgentUpgradePending?
<axw> ok
<wallyworld> seems to all work fine on aws
<wallyworld> if i hard code isAgentUpgradePending() to return false it works
<wallyworld> so for some reason the agent upgrade worker is not closing the channel on local provider
<wallyworld> that's where i've got to so far
<axw> wallyworld: I'm guessing it's something to do with local provider auto-bumping the version
<wallyworld> ah, yeah maybe, that version is stored as agent-version in state i guess
<axw> yes
<wallyworld> i'm sure i tried this, let me try closing that channel at the start of the upgrade worker
<menn0> wallyworld: could you have a quick look at this? http://reviews.vapour.ws/r/1838/
<wallyworld> ok
<menn0> wallyworld: thanks. it's an easy one.
<davechen1y> menn0: sorry, -1 on that PR
<davechen1y> menn0: delete the tests and i'll LGTM it
<menn0> davechen1y: fine by me I guess
<menn0> davechen1y: the tests are kinda pointless but i was expecting a reviewer to require them
<menn0> :)
<menn0> davechen1y: the original func within the juju codebase wasn't tested either
<wallyworld> axw: if i hard code the closing of the agentUpgradeComplete channel at the very start of the agent upgrade worker loop, it still is unhappy, so the version bump may be a red herring
<axw> wallyworld: I'll pull your branch and see if I can spot anything
<wallyworld> axw: thanks, i'm sure it's something ovious
<menn0> wallyworld: you ok with me deleting the test as per davechen1y (above)
<menn0> wallyworld: ?
<wallyworld> menn0: was looking at test
<wallyworld> i guess so
<davechen1y> menn0: os.Chown has tests
<davechen1y> the gymnastics to mock out the call just to prove that we can make that call
<davechen1y> seem unnecessary
<menn0> davechen1y: I agree it's a little contrived
<menn0> davechen1y: i could go either way on this one
<menn0> davechen1y: the test at least shows that the right things are called
<davechen1y> if it didn't work, there are other tests that would break in other packages
<davechen1y> if you want to have tests
<davechen1y> test that calling the os.Chown wrapper changes permissions on disk
<davechen1y> otherwise all you're testing is function dispatch works
<menn0> davechen1y: yeah but you can't do that meaningfully if the tests aren't running as root
<davechen1y> have two tests
<davechen1y> if os.Getuid() != 0 {
<davechen1y>   t.Skipf("test skipped, run as root")
<davechen1y> }
<davechen1y> or something
<menn0> davechen1y: yeah but that will never happen
<menn0> and you shouldn't have to run tests as root
<davechen1y> again, it comes down to what is the test testing ?
<davechen1y> if the test _requires_ that root can os.Chown, then we have to run as root
<davechen1y> otherwise, mocking the function doesn't test anything
<davechen1y> (apart from function dispatch)
<menn0> davechen1y: sure but it's checking that the correct funcs are called with the correct args which is something
<menn0> davechen1y: anyway, i don't actually mind much
<menn0> davechen1y: i'll delete the test
<axw> wallyworld: you have to get a whole new client
<axw> wallyworld: also, you should close the client in that method
<davechen1y> menn0: sgtm
<davechen1y> not worth the argument
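
Spelled out, the root-gated on-disk test davechen1y is suggesting would look something like this; chownPath is a hypothetical stand-in for the wrapper under review:

    package example

    import (
        "os"
        "path/filepath"
        "syscall"
        "testing"
    )

    // chownPath stands in for the wrapper being tested; assume it simply
    // delegates to os.Chown.
    func chownPath(path string, uid, gid int) error {
        return os.Chown(path, uid, gid)
    }

    func TestChownPathChangesOwner(t *testing.T) {
        if os.Getuid() != 0 {
            t.Skip("test skipped, run as root")
        }
        path := filepath.Join(t.TempDir(), "log")
        if err := os.WriteFile(path, nil, 0644); err != nil {
            t.Fatal(err)
        }
        if err := chownPath(path, 1000, 1000); err != nil {
            t.Fatal(err)
        }
        // Verify on disk rather than via a mock, so the test proves more
        // than that function dispatch works.
        info, err := os.Stat(path)
        if err != nil {
            t.Fatal(err)
        }
        st := info.Sys().(*syscall.Stat_t)
        if st.Uid != 1000 || st.Gid != 1000 {
            t.Fatalf("ownership not changed: uid=%d gid=%d", st.Uid, st.Gid)
        }
    }
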
<wallyworld> axw: let me check what i did
<axw> wallyworld: would be nice if the "Waiting for API..." message came before "Bootstrap complete" too
<wallyworld> axw: yeah, i thought of that, the bootstrap complete is down in environs - i could move it from there
<wallyworld> so what do you mean by a new client? is NewAPIRoot() not enough?
<axw> wallyworld: or change environs/bootstrap.Bootstrap to do the waiting - I think that would be preferable, rather than adding more logic to the command code
<axw> wallyworld: I mean, when you retry, you need to get a new client
<axw> wallyworld: otherwise it uses the same apiserver root
<wallyworld> why a new client when retrying?
<wallyworld> oh right, yes
<axw> wallyworld: ^^
<wallyworld> axw: i thought about putting the check in environs, but that would have meant putting client code down there
<wallyworld> and i didn't think that to be appropriate
<axw> wallyworld: well to be truthful the bootstrap *is* complete, so I guess it can stay as it is
<wallyworld> yeah, i sorta came to the same conclusion
<wallyworld> so let me retry getting another client each retry
<wallyworld> axw: yeah, that was it, thank you. i think i was hoping / expecting that a server side change would be noticed by existing api server roots
<axw> wallyworld: the current impl doesn't strike me as ideal
<wallyworld> me either
<wallyworld> i keep thinking it works differently
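
The gist of the fix axw and wallyworld settle on - dial a brand-new API connection on every attempt rather than reusing the first root - in a self-contained sketch; newAPIRoot is a stand-in for juju's real connection constructor:

    package main

    import (
        "errors"
        "fmt"
        "io"
        "time"
    )

    // waitForAPI polls until a connection succeeds, constructing a fresh
    // client each attempt; reusing one client would keep talking to the
    // same (possibly stale) apiserver root.
    func waitForAPI(newAPIRoot func() (io.Closer, error), attempts int, delay time.Duration) error {
        var err error
        for i := 0; i < attempts; i++ {
            var conn io.Closer
            if conn, err = newAPIRoot(); err == nil {
                conn.Close()
                return nil
            }
            time.Sleep(delay)
        }
        return fmt.Errorf("API not available: %w", err)
    }

    func main() {
        fmt.Println(waitForAPI(func() (io.Closer, error) {
            return nil, errors.New("upgrade in progress")
        }, 3, 10*time.Millisecond))
    }
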
<wallyworld> axw: just read the release notes - the link in the note to the wip doc - the doc is out of date eg says placement not supported
<wallyworld> not sure if we should just cut and paste the wip doc into the release notes as doc
<axw> wallyworld: where? I updated that
<axw> wallyworld: sorry, I updated the doc: https://github.com/juju/docs/pull/443
<wallyworld> in Unimplemented Caveats
<wallyworld> oh, ok
<wallyworld> so just needs to be acted on
<wallyworld> axw: makes more sense now. i think maybe it would also be worthwhile to mention the storage hook tools in the google doc so that at least the doc mentions the key features?
<axw> wallyworld: seeing as it's the first release, I was trying to avoid having all the same information in two places. I can if you really want.
<wallyworld> axw: the issue is that the release notes are really quite comprehensive, so give the impression they cover "all the things" and so a reader might see them and not realise hook tools are there. i agree about 2 places. so perhaps the release notes need to be wound back?
<axw> wallyworld: originally it was just "we've done storage, here's a link to the docs" - yes, I think we should go back to that
<wallyworld> ok, sounds good. i have to head to soccer, will be back later. can you liaise with anastasia if necessary to get her stuff landed?
<wallyworld> and get her to re-read the release notes and wip doc?
<axw> wallyworld: no worries
<wallyworld> tyvm
 * anastasiamac cleans storage-add stuff
<anastasiamac> ericsnow: ping :D
<fwereade> well, that was interesting
 * fwereade just found a small lizard hiding in his dressing gown
<voidspace> fwereade: I'm sure it's not that small
<voidspace> don't be so hard on yourself
<fwereade> haha
<rogpeppe> fwereade, dimitern, voidspace: fancy a little review? (a couple of bug fixes to the juju/schema package) https://github.com/juju/schema/pull/6
<dimitern> rogpeppe, LGTM
<rogpeppe> dimitern: thanks!
<anastasiamac> axw: in case u miss my pmsg - doc is LGTM :D loving your writing style!!
<perrito666> morning
<rogpeppe> anyone know if there's any documentation for the possible environment config attributes anywhere?
<anastasiamac> perrito666: morning!
<rogpeppe> dimitern, jam, fwereade, perrito666, mgz_, evilnickveitch: ^
<rogpeppe> anastasiamac, perrito666: hiya
<anastasiamac> rogpeppe: o/
<perrito666> rogpeppe: sorry dont know
<evilnickveitch> rogpeppe, maybe...
<evilnickveitch> do you mean this:
<evilnickveitch> https://jujucharms.com/docs/stable/config-general
<rogpeppe> evilnickveitch: that looks good, thanks. except... some of those values don't seem to have any descriptions.
<rogpeppe> evilnickveitch: it's a good starting point though, thanks
<evilnickveitch> rogpeppe, yes indeed, if you could fill them in as you go along, that would be a great help :)
<rogpeppe> evilnickveitch: i'm just about to make a big table inside environs/config that describes them all :)
<evilnickveitch> hurrah!
<Syed_A> Hello Folks, How can i force remove a service stuck with a failed hook?
<Syed_A> nevermind
<dooferlad> voidspace, dimitern: https://plus.google.com/hangouts/_/canonical.com/maas-juju-net
<dimitern> dooferlad, voidspace, it's over isn't it?
<dooferlad> dimitern: yea
<dooferlad> dimitern: I was the only Juju guy who turned up and didn't have much to add.
<dimitern> dooferlad, too bad :)
<dimitern> dooferlad, yeah, np
<natefinch> perrito666: it ended up being faster to just write a dumb program than try to figure out how to install that bzr plugin: https://github.com/natefinch/bsmrt
<perrito666> lol
<katco> natefinch: as ocr, can you give the reviews for this bug precedence? https://bugs.launchpad.net/juju-core/+bug/1451626
<mup> Bug #1451626: Erroneous Juju user data on Windows for Juju version 1.23 <1.23> <blocker> <juju> <oil> <regression> <windows> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1451626>
<katco> natefinch: it's been sitting for a bit, and we would like to land it ASAP for 1.24
<natefinch> katco: yep.
<katco> natefinch: ty sir
<perrito666> man being able to hangout and build juju is so cool
<katco> :p
<natefinch> perrito666: what, did you upgrade to a supercomputer?
<wwitzel3> lol
<perrito666> I have a backpack cluster to run hangout
<perrito666> the level of annoyance I have to go to get local provider working despite not having juju local installed is.. well annoying :p
<natefinch> katco: lol, Gabriel (gsamfira) has been trying to get me to review that stuff for ages... he'll be glad it's officially my duty to do so now.
<katco> natefinch: haha
<katco> natefinch: sorry... you just happened to be ocr ;p
<natefinch> katco: I've wanted to review it, but we've just been so tight on deadlines, I haven't felt like I could spare the time, so I'm glad to be able to now.
<rogpeppe> evilnickveitch: FWIW, all these config attrs are available but don't seem to be documented: block-remove-object provisioner-safe-mode rsyslog-ca-key block-destroy-environment tools-metadata-url storage-default-block-source block-all-changes lxc-use-clone tools-stream allow-lxc-loop-mounts lxc-default-mtu
<rogpeppe> evilnickveitch: ah, tools-metadata-url is actually documented as deprecated
<evilnickveitch> rogpeppe, yup. thanks for the others though.
<evilnickveitch> rogpeppe, what is the difference between lxc-clone and lxc-use-clone?
<evilnickveitch> or are they the same?
<rogpeppe> evilnickveitch: i've no idea :)
<evilnickveitch> hehehe
<rogpeppe> evilnickveitch: i need to find out though
<mup> Bug #1461111 was opened: Allow status-set/get to a service by its leader unit <juju-core:Triaged by hduran-8> <juju-core 1.24:In Progress by hduran-8> <https://launchpad.net/bugs/1461111>
<wallyworld> evilnickveitch: the same, lxc-use-clone is deprecated
<wallyworld> rogpeppe: ^^^
<rogpeppe> wallyworld: is that documented anywhere?
<evilnickveitch> wallyworld, thanks!
<wallyworld> rogpeppe: not sure tbh
<rogpeppe> wallyworld: is lxc-clone itself documented, in fact?
<wallyworld> rogpeppe: evilnickveitch: also, anastasiamac sent out notes on the block config
<rogpeppe> wallyworld: i'm trying to gather info on all the config attributes to put into a table inside environs/config
<wallyworld> rogpeppe: it would have been in release notes but i sadly suspect we didn't do more than that
<evilnickveitch> wallyworld, cool, I will sync with her on updating the table
<wallyworld> ty
<evilnickveitch> rogpeppe, lxc-clone is documented, but as it is provider specific, it is on the lxc page
<evilnickveitch> https://jujucharms.com/docs/stable/config-LXC
<rogpeppe> evilnickveitch: it's not really provider-specific, is it? all environments can have lxc containers
<wallyworld> evilnickveitch: rogpeppe: it used to be but not anymore
<wallyworld> it was changed in 1.20
<rogpeppe> wallyworld: ok
<wallyworld> i think that's when use-clone was deprecated also
 * wallyworld is really going away now to sleep
<evilnickveitch> heheheh
<evilnickveitch> night night wallyworld
<wallyworld> ttyl
<rogpeppe> wallyworld: thanks for the pointer
<evilnickveitch> rogpeppe, provisioner-safe-mode is deprecated too, as of 1.21.1
<rogpeppe> evilnickveitch: ok, thanks
<rogpeppe> evilnickveitch: presumably superseded by harvest-mode
<evilnickveitch> yes
<evilnickveitch> rogpeppe, tools-stream was replaced with agent-stream
<perrito666> is anyone willing to run a test in 1.24 to confirm something?
<mup> Bug #1461150 was opened: lxc provisioning fails on joyent <blocker> <ci> <joyent-provider> <lxc> <regression> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461150>
<perrito666> I need to leave for a moment can anyone check if, while in debug hooks,  you can call any helper at all?
<perrito666> for 1.24 that is
<natefinch> I'm leaving for a while too, or I would
<perrito666> I think something is seriously broken there
<rogpeppe> mgz_: ha, looks like 445a79b25d7d7a95127ec36a1f4c41674718a98f changed a little more than you hoped it would
<rogpeppe> mgz_: for instance, i just found a bug link to https://bugs.github.com/juju/juju/+bug/1224492
<mup> Bug #1224492: environs/config: zero-valued port settings are allowed but ignored <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1224492>
<rogpeppe> mgz_: which should be the link that mup pointed to...
<fwereade> DAMMIT: ... Error: Account should have balance of 25000, got 24999
<fwereade> I really thought I might have had it that time
<wwitzel3> fwereade: trying to pull a Superman III?
<Syed_A> I am trying to reduce the memory of containers by setting up lxc.cgroup.memory.limit_in_bytes in /var/lib/lxc/container/config, but it is throwing errors.
<Syed_A> How can i change the default memory for containers in juju ?
<Syed_A> lxc_cgmanager - cgmanager.c:cgm_setup_limits:1250 - Error setting cgroup memory:lxc/juju-machine-6-lxc-3 limit type memory.memsw.limit_in_bytes
<lazyPower> question: I'm running 1.24-beta5.1 and i've found some weird behavior today. I don't have unit output as i tore the env down and stood it back up wiping the logs in the process - but i've run into edge cases where it appears that juju did not upload is-leader as part of the toolchain
<lazyPower> is this known behavior that i've missed a bug on? I dont want to file a bug without additional info to support the claim other than lazy's gone skitzo
<lazyPower> katco: btw - ive put aside some time to review the status doc you sent over, will get you feedback before i EOD, ta for sending that over
<fwereade> lazyPower, I'd still go ahead and report it
<lazyPower> fwereade: ack, will do. Sorry about the zero info bug in advance :|
<fwereade> lazyPower, no worries, that points to a pretty specific area of the code, the inputs to which have been changing a bit lately
<Syed_A> Is it possible to reduce the memory of a lxc container started by juju ?
<Syed_A> Can i increase the memory of a lxc container started by juju ?
<natefinch> abentley: btw, I updated the merge proposal, and I believe I addressed all your concerns (for real this time).
<abentley> natefinch: Thanks.  I'll have a look soon.
<natefinch> abentley: thanks
<marcoceppi> katco: can we chat about min version?
<katco> marcoceppi: sure: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=1
<abentley> sinzui: Could you please review https://code.launchpad.net/~abentley/juju-reports/web-enqueue/+merge/260879 ?
<natefinch> katco, marcoceppi: we  should rename it to "Capability Flags" since that better reflects what we're really implementing (until someone decides we need a different color shed).
<sinzui> yes abentley
<abentley> natefinch: You still have this wait_for_started that you haven't explained the need for or removed.
<natefinch> abentley: oh, I thought I deleted that.. honestly, I was just copying what deploy_stack did when it was deploying charms.
<natefinch> abentley: I can delete it, that's fine.  Running a test with it gone right now, just for a sanity check.
<abentley> natefinch: I wish I had a perfect script to point you to.  The problem with deploy_stack is that, as the oldest script, it is out-of-date in places.
<natefinch> abentley: yeah, I didn't realize that when I was copying everything it did :/
<natefinch> abentley: that's part of why I tell people to follow what assess_log_rotation.py does for new CI tests in the wiki page.... because at least that'll be less wrong.
<mup> Bug #1461246 was opened: UpgradeSuite.TestUpgradeStepsHostMachine consistently fails on utopic <blocker> <ci> <regression> <unit-tests> <utopic> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461246>
<natefinch> abentley: test passes, removed and pushed
<abentley> natefinch: That seemed like a premature choice given that assess_log_rotation hadn't yet passed code review.
<natefinch> abentley:  but I knew it was being reviewed carefully, so once it was in, it should be a good example.
<natefinch> abentley: and if not, then I knew someone could just go change the wiki page anyway ;)
<abentley> sinzui: Could you tell me what you think of the indentation changes in https://code.launchpad.net/~natefinch/juju-ci-tools/logrot/+merge/259750 ?
<natefinch> abentley, sinzui: for the record, everything was run through autopep8... apologies for the apparent spurious indentation changes.
<abentley> natefinch: The updated style is more my preference, but I believe the previous style is sinzui's preference and I've been writing to that.
<sinzui> natefinch-afk: abentley I have pondered switching to autopep8. I disagree with the unpythonic closing brace that is NOT specified in PEP8.
<sinzui> natefinch-afk: abentley The formatting is fine. I accept the change.
<sinzui> abentley: natefinch-afk setting "ignore": "E123, E133" for pep8 and autopep8 removes the trailing brace from discussion
 * sinzui uses both to not take sides when he reviews/updates other people's code
<sinzui> r=me abentley. I missed the submit button 30 minutes ago
<wallyworld> waigani: quick status update on the maas fix? should be done today?
<wallyworld> or this morning even :-) ?
<mup> Bug #1460171 changed: Deployer fails because juju thinks it is upgrading <blocker> <ci> <deployer> <regression> <tech-debt> <upgrade-juju> <juju-ci-tools:Fix Released by sinzui> <juju-core:Fix Released by wallyworld> <python-jujuclient:Fix Committed by wallyworld> <https://launchpad.net/bugs/1460171>
<waigani> wallyworld: yes, in stand up - first thing after
<wallyworld> ty :-)
<waigani> wallyworld: what's the best way to merge the branch into gomaasapi?
<wallyworld> waigani: sorry, in meeting, will need to pull trunk and merge and repush, can you ask thumper
<waigani> wallyworld: right, thanks
<wallyworld> sinzui: i don't think this one will be fixed for 1.24 either https://bugs.launchpad.net/bugs/1457225
<mup> Bug #1457225: Upgrading from 1.20.9 to 1.23.3 works, but error: runner.go:219 exited "machiner": machine-0 failed to set status started: cannot set status of machine "0": not found or not alive <cts> <sts-stack> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1457225>
<menn0> natefinch-afk: did you see this: https://bugs.launchpad.net/juju-core/+bug/1461246
<mup> Bug #1461246: UpgradeSuite.TestUpgradeStepsHostMachine consistently fails on utopic <blocker> <ci> <regression> <unit-tests> <utopic> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461246>
<sinzui> oh goody. wallyworld I don't see an action we can take on it. Agreed.
<waigani> wallyworld: gomaasapi updated, dependencies.tsv updated, Tried to land on 1.24, but it's blocked. Shall I JFDI?
<wallyworld> waigani: yeah, i think so
<waigani> wallyworld: done
<wallyworld> tyvm
<whit> if I'm running off 1.24-beta5-trusty-amd64 locally and juju bootstrap --upload-tools, the tools on my deployed machines should match right?
 * whit is seeing tools == 1.24-alpha1.1-trusty-amd64
 * whit is wondering because is-leader is fairly consistently not present
<wallyworld> waigani: did you see the gomaasapi dep looks wrong because build failed?
<waigani> wallyworld: on it
<waigani> wallyworld: I don't get it. TestDependenciesTsvFormat passes and the revision id is from here: http://bazaar.launchpad.net/~juju/gomaasapi/trunk/revision/62
<wallyworld> waigani: sorry, was in another meeting and about to start another but gotta do something first, can you ask thumper (it's not like he doesn't know bzr :-)
<waigani> wallyworld: found it. there was a space before the revision id but after the tab - the format test didn't pick that up.
<wallyworld> waigani: you should always run godeps locally before pushing
<waigani> wallyworld: lesson learnt. I'll update the test also.
<wallyworld> ty :-)
<mwhudson> davechen1y: fwiw, go 1.4 is in wily now
<perrito666> ericsnow: ping
<ericsnow> perrito666: hi
<alexisb> and sinzui and team is working on 1.4.2 verification with juju
<perrito666> ericsnow: heyhey
<perrito666> ericsnow: do you have any extra info on https://bugs.launchpad.net/juju-core/+bug/1434437 besides whats on the bug?
<mup> Bug #1434437: juju restore failed with "error: cannot update machines: machine update failed: ssh command failed: " <backup-restore> <maas-provider> <juju-core:Invalid> <juju-core 1.22:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1434437>
<ericsnow> perrito666: nope
<perrito666> ericsnow: I am quite curious about the original bug, as that would indicate that at least one of the machines was not provisioned?
<perrito666> I am not entirely sure a juju machine can be missing /var/lib/juju
<ericsnow> perrito666: yeah, not sure
<ericsnow> perrito666: could be that thing where juju uninstalls itself
<stokachu> sinzui: any issue with juju failing to create lxc containers? http://paste.ubuntu.com/11530516/
<stokachu> ive hit this in 2 separate labs, one with proxy and one without
<stokachu> on both trusty and precise
<perrito666> ericsnow: not sure, although the issue seems to have been resolved since then and replaced by another
<stokachu> both using maas
<wallyworld> perrito666: standup?
<perrito666> wallyworld: I am there
<davechen1y> mwhudson: right
<davechen1y> so what was all the hubbub about
<mwhudson> not sure
 * mwhudson is fighting monitors
<alexisb> davechen1y, mostly my lack of understanding in the process, I apologize for the unneeded alarm
<davechen1y> alexisb: all good
<davechen1y> i don't care how it happens
<davechen1y> only that it happens
<davechen1y> sinzui: so the next step is to get the W 1.4.4 package backported into the juju ppa
<davechen1y> how would you like to track that work ?
#juju-dev 2015-06-03
<gsamfira> so for anyone that cares: http://blogs.msdn.com/b/looking_forward_microsoft__support_for_secure_shell_ssh1/archive/2015/06/02/managing-looking-forward-microsoft-support-for-secure-shell-ssh.aspx
<gsamfira> still waiting for the announcement saying that windows is adopting the linux kernel
<wallyworld> waigani: thanks for fixing maas bug, auto pilot guys will be happy
<menn0> wallyworld: are you sure that not fixing bug 1457225 for 1.24 is a good idea?
<mup> Bug #1457225: Upgrading from 1.20.9 to 1.23.3 works, but error: runner.go:219 exited "machiner": machine-0 failed to set status started: cannot set status of machine "0": not found or not alive <cts> <sts-stack> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1457225>
<menn0> wallyworld: seems pretty serious
<wallyworld> menn0: from what i'm told, not really reproducible, so the plan is to include in a point release if needed
<menn0> wallyworld: let me have a quick poke at it
<wallyworld> ok, ty
<menn0> wallyworld: i can guess what it is
<waigani> wallyworld: welcome. I've put my notes on the wiki, so next time will be a bit quicker: https://github.com/juju/juju/wiki/Update-Launchpad-Dependency
<wallyworld> waigani: ty
<wallyworld> menn0: i haven't looked into the bug in detail, just going on conversations, so if you can guess, that would be good
<menn0> wallyworld: ok
<wallyworld> natefinch-afk: when you are around, can you look at this bug, which arose from the work moving conf files out of the log directory? https://bugs.launchpad.net/juju-core/+bug/1461246
<mup> Bug #1461246: UpgradeSuite.TestUpgradeStepsHostMachine consistently fails on utopic <blocker> <ci> <regression> <unit-tests> <utopic> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461246>
<mup> Bug #1461339 was opened: rsyslog on bootstrap node causing instability <juju-core:New> <https://launchpad.net/bugs/1461339>
<natefinch> wallyworld: I'll TAL
<wallyworld> ty
<menn0> wallyworld: I have been unable to reproduce bug 1457225 and i've requested more information
<mup> Bug #1457225: Upgrading from 1.20.9 to 1.23.3 works, but error: runner.go:219 exited "machiner": machine-0 failed to set status started: cannot set status of machine "0": not found or not alive <cts> <sts-stack> <upgrade-juju> <juju-core:Incomplete by menno.smits> <https://launchpad.net/bugs/1457225>
<menn0> wallyworld: dropping it from 1.24 is probably ok
<wallyworld> thanks menn0, that matches my understanding also
<menn0> wallyworld: it's not what I thought it was at least. it looks like the upgrade started but never completed for some reason.
<wallyworld> yeah
<wallyworld> good to check though
<anastasiamac> axw_: allowing only count for storage constraints is only applicable to storage-add hook tool, rite?
<axw_> anastasiamac: correct. "juju storage add" should be able to do whatever
<anastasiamac> axw_: tyvm :D
<axw_> anastasiamac: "storage-add" should just take a name and optional count, and take the rest from constraints
<menn0> natefinch: I believe that the failing upgrade test runs all the upgrade steps
<menn0> natefinch: it probably doesn't need to
<menn0> natefinch: it's an old one
<natefinch> menn0: ahh interesting
<menn0> natefinch: mocking out the actual running of the upgrade steps and just confirming that they would be run is probably enough
<menn0> natefinch: the steps and upgrade running infrastructure is all tested comprehensively elsewhere
 * menn0 has to go for a bit
<wallyworld> sinzui: tell me i'm crazy - i run "juju get-env" and the output includes secrets. why shouldn't we be masking those out by default?
<anastasiamac> wallyworld: :(
<mup> Bug #1461342 was opened: unit storage add should only accept count as constraint <juju-core:New for anastasia-macmood> <juju-core 1.24:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1461342>
<natefinch> wallyworld: the two tests that are marked as failing pass just fine on my machine (and I'm on utopic).  That specific test, when I run it, just says it's skipping all the copying of files, because they don't exist. But on the CI machine, it says there's one there that it doesn't have permission to remove.  Seems like an environmental problem - there's a file in /var/log/juju/ that shouldn't exist.
<natefinch> wallyworld: given that it's only failing on utopic makes me suspicious that it's just that CI machine that has the errant file lying around, since the code isn't any different for different series.
<wallyworld> natefinch: that makes sense i think
<wallyworld> maybe mark the bug with a comment
<wallyworld> natefinch: also, i recall a change with "-" in the branch?
<wallyworld> with rsyslog config?
<wallyworld> i just saw this, could be related? http://pastebin.ubuntu.com/11533025/
<wallyworld> seems there's an extra "-" in front of machine-1
<natefinch> weird... might have been an errant typo
<wallyworld> natefinch: also, maybe the upgrade step should just skip files it can't move? with a warning?
<wallyworld> more robust that way
<menn0> wallyworld, natefinch: i was going to say what wallyworld just said
<menn0> you don't want an upgrade completely failing b/c of this
<natefinch> wallyworld: yeah, that makes sense.  No need to kill the upgrade step.. the whole reason we're moving the files is so that if you blow away the folder, you don't mess up our config... but we don't really care if the old files are there anymore or not.
<wallyworld> yep
<natefinch> I'll make a PR
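
The more forgiving upgrade step being agreed on, sketched: skip any file that cannot be moved and log a warning instead of failing the whole upgrade. Paths, names and the logger are illustrative, not juju's actual upgrade code:

    package main

    import (
        "log"
        "os"
        "path/filepath"
    )

    // moveLogConfFiles relocates each file into dest, warning and skipping
    // on failure so one stubborn file cannot abort the upgrade step.
    func moveLogConfFiles(dest string, files []string) {
        for _, src := range files {
            target := filepath.Join(dest, filepath.Base(src))
            if err := os.Rename(src, target); err != nil {
                log.Printf("WARNING: cannot move %q: %v; skipping", src, err)
                continue
            }
        }
    }

    func main() {
        moveLogConfFiles(os.TempDir(), []string{"/var/log/juju/machine-0.conf"})
    }
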
<wallyworld> natefinch: https://bugs.launchpad.net/juju-core/+bug/1461354 - i'm working another bug atm, but for your day tomorrow, if i don't get to this - could you look at it from the perspective it could be caused by the "-" change?
<mup> Bug #1461354: invalid tag and panic seen in state server logs <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461354>
<natefinch> wallyworld: absolutely... though I don't know what the "-" change is, but I can look for it.
<natefinch> oh, actually, I know what you're talking about
<wallyworld> natefinch: in the log move work - you changed rsyslog config
<wallyworld> sorry, didn't know how to explain :-)
<natefinch> yeah yeah... I made the Namespace not include the - in the field, but added it to the logic of writing out instead
<wallyworld> i may be jumping to conclusions
<wallyworld> but seems like a reasonable place to start looking
<natefinch> it does seem suspicious
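
A purely hypothetical illustration of how this kind of bug appears once the "-" moves out of the Namespace field and into the formatting logic; neither function name is juju's:

    package main

    import "fmt"

    // rsyslogTagBuggy joins unconditionally, yielding "-machine-1" when the
    // namespace is empty - the stray leading "-" seen in the paste.
    func rsyslogTagBuggy(namespace, tag string) string {
        return namespace + "-" + tag
    }

    // rsyslogTagFixed only inserts the separator when a namespace is set.
    func rsyslogTagFixed(namespace, tag string) string {
        if namespace == "" {
            return tag
        }
        return namespace + "-" + tag
    }

    func main() {
        fmt.Println(rsyslogTagBuggy("", "machine-1")) // "-machine-1"
        fmt.Println(rsyslogTagFixed("", "machine-1")) // "machine-1"
    }
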
<mup> Bug #1461354 was opened: invalid tag and panic seen in state server logs <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461354>
<natefinch> OMG export_test is the worst
<natefinch> is the bot broken?
<natefinch> ahh maybe just slow
<natefinch> maybe it's just the reviewboard integration that's broken
<natefinch> wallyworld, menn0: care to do a very quick review on github? https://github.com/juju/juju/pull/2476
<wallyworld> ok
<mup> Bug #1461358 was opened: Resource tags should be periodically updated to reflect reality <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1461358>
<wallyworld> menn0: you and i obviously think similarly
<menn0> wallyworld: well in one case.
<menn0> then I found all the trivial stuff and you found what could be actual problems :)
<wallyworld> well, you saved me from the gc.IsNil things :-)
<natefinch> So, I almost put a comment in the actual code about not using PatchValue.  I'm not embedding whatever base suite has that... adding it would run a ton more code for the tests for no reason, and would actually add more lines of code to the file.... all to avoid doing my own defer.
<natefinch> As for warning.... I don't know? I guess I assumed that if we don't care whether the old file is there.... we don't care. No need to warn. Warning is something I feel means "this may indicate there's a problem you should look at".... but a file we can't delete, that we don't read or care about, is not really a problem for Juju.
<natefinch> I started to write it as Warning and then changed to Info.  I'm only like 40% committed to it though, so if you guys think warning is more appropriate, that's fine.
<menn0> natefinch: that BaseSuite embeds CleanupSuite which has PatchValue
<natefinch> menn0: weird, I swear I tried to type it in and it complained at me.  Maybe I typoed it.
<natefinch> menn0: so it does... my autocomplete was not finding it for some reason.
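
The helper in question does the save-and-restore natefinch was writing by hand with a defer. A toy generic version (the real CleanupSuite.PatchValue registers the restore with the suite's cleanup list rather than returning it):

    package example

    import "testing"

    // now is a package-level seam a test might want to stub out.
    var now = func() string { return "real" }

    // patchValue swaps *dst for v and returns a restore func.
    func patchValue[T any](dst *T, v T) (restore func()) {
        old := *dst
        *dst = v
        return func() { *dst = old }
    }

    func TestPatchValue(t *testing.T) {
        restore := patchValue(&now, func() string { return "stub" })
        defer restore()
        if now() != "stub" {
            t.Fatal("patch did not take effect")
        }
    }
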
<natefinch> menn0, wallyworld: Fixes ready, do you want me to make the log message Warning?
<wallyworld> yes please
<natefinch> ok
<natefinch> updated: https://github.com/juju/juju/pull/2476
 * thumper is getting pretty grumpy at all the intermittent failures with the bot
<wallyworld> waigani: ocr? trivial one please for ci blocker http://reviews.vapour.ws/r/1843/
<waigani> wallyworld: shipit
<wallyworld> tyvm
<thumper> YAY (NOT) new intermittent test failure:  filter_test.go:146:   s.assertAgentTerminates(c, f)
<thumper> uniter/filter
<wallyworld> thumper: you running vivid?
<wallyworld> our tests on vivid are worse
<mup> Bug #1461150 changed: lxc provisioning fails on joyent <blocker> <ci> <joyent-provider> <lxc> <regression> <juju-core 1.24:In Progress by wallyworld> <https://launchpad.net/bugs/1461150>
<mup> Bug #1461246 changed: UpgradeSuite.TestUpgradeStepsHostMachine consistently fails on utopic <blocker> <ci> <regression> <unit-tests> <utopic> <juju-core 1.24:In Progress by natefinch> <https://launchpad.net/bugs/1461246>
<natefinch> wallyworld, menn0: care to check my update?  https://github.com/juju/juju/pull/2476/files
<wallyworld> ok
<wallyworld> natefinch: +1
<natefinch> awesome
<wallyworld> axw_: your resources pr - the provisioner version bump - it should be ok with an older env, and upgrades should work too, from what i can tell. i think?
<axw_> wallyworld: the provisioner has a bit of code that loops while it gets a NotImplemented error
<axw_> so I think it should be fine
<wallyworld> ok
<thumper> wallyworld: no
<thumper> wallyworld: and that failure is from the bot
<wallyworld> ah ok
<wallyworld> axw_: lgtm, love the deleted code, much cleaner
<axw_> wallyworld: cool, thanks
<mup> Bug #1461385 was opened: apiserver/instancepoller: data race in test <juju-core:Confirmed for dave-cheney> <https://launchpad.net/bugs/1461385>
<mup> Bug #1461393 was opened: worker/deployer: multiple data races <juju-core:Confirmed for dave-cheney> <https://launchpad.net/bugs/1461393>
<axw> wallyworld: https://github.com/axw/juju/blob/resource-tags-units-storagename-etc/environs/tags/tags.go
<axw> wallyworld: quick look over please, it was missing in the diff before
<wallyworld> ok
<axw> overzealous .gitignore
<wallyworld> axw: actually i did wonder about not seeing that file in the PR
<wallyworld> axw: LGTM
<axw> ta
<fwereade> anyone here got a graph theory background?
<anastasiamac> fwereade: the answer depends on why u r asking :D
<fwereade> anastasiamac, I'm wondering if there's any literature I should look at regarding *potential* cycles
<fwereade> anastasiamac, i.e. I have a graph built up from a bunch of successor lists which do not always agree, and may thus imply cycles
<fwereade> anastasiamac, I know that each node appears at most once in each successor list
<Spads> fwereade: so you're looking to run Tarjan's algorithm on your data set?
<fwereade> Spads, not quite
<fwereade> Spads, I'm looking to determine whether a data set will produce the same tarjan sort for all possible extensions of the original data
<Spads> ahhhh
<fwereade> Spads, given that the inputs can change in somewhat restricted ways
<fwereade> Spads, I thought I had something and, well, the failure rate  on a workload designed to trigger the problem is down from 1 in 10k to 1 in 2m
<fwereade> Spads, which is sort of good
<fwereade> Spads, but sorta implies that I've missed something subtle, hence wondering if anyone remembers any related literature or anything :)
<Spads> fwereade: this is outside my two-decade-old memory of my discrete methods/graph theory classes.  Let me poke someone who may know
<fwereade> Spads, tyvm
<fwereade> Spads, no rush, I have a meeting in a moment anyway
<Spads> fwereade: yeah, she may need to go dig a bit anyway
<fwereade> Spads, cheers
<Spads> fwereade: are your rules aware of the tarjan weights of the nodes they're connecting to?
<dimitern> voidspace, it's just you and me today :)
<fwereade> Spads, I'm not aware of any weighting here
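
For concreteness, the simplest version of the question fwereade is circling - merge the successor lists into one directed graph and ask whether they jointly imply a cycle - can be answered with a three-colour DFS. This toy check is much weaker than the "same Tarjan sort under all possible extensions" property he actually needs; names and data are illustrative:

    package main

    import "fmt"

    func hasCycle(succ map[string][]string) bool {
        const (
            white = iota // unvisited
            grey         // on the current DFS stack
            black        // fully explored
        )
        colour := map[string]int{}
        var visit func(n string) bool
        visit = func(n string) bool {
            colour[n] = grey
            for _, m := range succ[n] {
                if colour[m] == grey {
                    return true // back edge: the lists imply a cycle
                }
                if colour[m] == white && visit(m) {
                    return true
                }
            }
            colour[n] = black
            return false
        }
        for n := range succ {
            if colour[n] == white && visit(n) {
                return true
            }
        }
        return false
    }

    func main() {
        // two successor lists that disagree: a before b, and b before a
        g := map[string][]string{"a": {"b"}, "b": {"a"}}
        fmt.Println(hasCycle(g)) // true
    }
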
<dimitern> voidspace, ping
<voidspace> dimitern: sorry
<voidspace> dimitern: was helping wife with car!!
<voidspace> dimitern: omw
<dimitern> voidspace, np
<rogpeppe> anyone know what the Settings.NoProxy value means in github.com/juju/utils/proxy ?
<rogpeppe> evilnickveitch: here's as far as i've got so far: http://paste.ubuntu.com/11541062/
<rogpeppe> evilnickveitch: i've got descriptions for everything now, at least
<evilnickveitch> rogpeppe, that looks good - I will update the table in docs accordingly
<rogpeppe> evilnickveitch: i could probably generate the table automatically from that table if you'd like - might save a bit of time
<rogpeppe> s/from that table/from that data/
<evilnickveitch> rogpeppe, if you want to give that a go, i'm not stopping you :) the markdown syntax for tables is pretty straightforward
<rogpeppe> evilnickveitch: should only take a minute or two
<rogpeppe> evilnickveitch: i'm just formalising the defaults
<evilnickveitch> cool
<wallyworld> jam: got time for a brief chat about spec?
<rogpeppe> evilnickveitch: http://paste.ubuntu.com/11541625/
<rogpeppe> evilnickveitch: gist.github.com doesn't like it though, so it may well be invalid md syntax
<rogpeppe> evilnickveitch: i left deprecated entries in there - you'll probably want to separate them out
<rogpeppe> evilnickveitch: this is better: https://gist.github.com/rogpeppe/c7ce1f02258a39898d37
<jam> wallyworld: I'm meeting with alexis now, can I chat with you after 30min or so?
<wallyworld> sure
<rogpeppe> voidspace: i'm led to believe you might know something about  github.com/juju/utils/proxy ... is that true?
<evilnickveitch> rogpeppe, yup, that will certainly do the job.  thanks!
<rogpeppe> evilnickveitch: np
<rogpeppe> evilnickveitch: let me fix those multi-line descriptions first
<evilnickveitch> ok
<rogpeppe> evilnickveitch: this is better: https://gist.github.com/rogpeppe/8015a6c7122c85878b43
<evilnickveitch> rogpeppe, it is. I will get this added shortly
<rogpeppe> evilnickveitch: it really needs default values for all the other rows too
<evilnickveitch> yes. I am not sure they all have them
<evilnickveitch> rogpeppe, AFAIR, some keys simply don't exist if they aren't specified
<rogpeppe> evilnickveitch: yeah, but at least all bool vals should document what the default bool is
<evilnickveitch> perhaps they should definitively say 'none' in that case
<evilnickveitch> agreed
<rogpeppe> evilnickveitch: FWIW this is the code i used to generate the table (and check consistency with the real code): http://play.golang.org/p/6ci9O6GOz3
<evilnickveitch> rogpeppe, ok, thanks, i am sure that will be useful in the future too
<rogpeppe> evilnickveitch: the stuff after the ///// comment was pasted directly from the juju-core config code
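
The shape of that generator, reduced to its core: print one markdown row per attribute. The data here is a made-up subset, not the real juju-core config table:

    package main

    import "fmt"

    type attr struct {
        Name, Default, Description string
    }

    func main() {
        attrs := []attr{
            {"lxc-clone", "false", "whether to clone a cached lxc template (illustrative text)"},
            {"agent-stream", `"released"`, "stream from which agent tools are fetched (illustrative text)"},
        }
        fmt.Println("| Key | Default | Description |")
        fmt.Println("| --- | --- | --- |")
        for _, a := range attrs {
            fmt.Printf("| %s | %s | %s |\n", a.Name, a.Default, a.Description)
        }
    }
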
<mup> Bug #1461529 was opened: juju upgrade-charm has no effect for subordinate charms <juju-core:New> <https://launchpad.net/bugs/1461529>
<wwitzel3> ericsnow: ping
<Syed_A> Hello Folks,
<Syed_A> I am trying to deploy HA openstack keystone in 3 lxc nodes.
<Syed_A> I am getting this error: ERROR: grp_ks_vips: id is already in use
<voidspace> rogpeppe: I don't believe so, no
<rogpeppe> voidspace: ok, thanks
<voidspace> :-)
<voidspace> sorry
<mup> Bug #1461534 was opened: presence.$cmd {getLastError} taking 100ms <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1461534>
<sinzui> mgz: I was about to try a reboot from the horizon console
<rogpeppe> evilnickveitch: this also adds information on which attributes are immutable: https://gist.github.com/rogpeppe/cff7976fbf56f135af4b
 * sinzui can move two jobs to another slave
<evilnickveitch> rogpeppe, ah, useful! We may need to coordinate with web design again before making the table any bigger - there is precious little width to display the current info!
<rogpeppe> evilnickveitch: that's your domain :)
<evilnickveitch> I wish it were! But it is my domain to ask the people who can fix it, at least
<ericsnow> wwitzel3: pong
<natefinch> Syed_A: not a lot of open stack experts on here... try #juju
<wwitzel3> ericsnow: hey, want to pair up?
<ericsnow> wwitzel3: yep
<wwitzel3> ericsnow: going to moonstone now
<wwitzel3> jam: ping, you still around?
<jam> wwitzel3: I'm 45 minutes past needing to help my son with homework, so I have to pretend I'm not here, sorry
<wwitzel3> jam: understand completely :)
<perrito666> we could do cooperative help with the homework and get you back faster? :p
<sinzui> natefinch: can you triage bug 1461339? I see several related bugs tagged "rsyslog". This is also an older juju; the issue might be fixed in newer juju
<mup> Bug #1461339: rsyslog on bootstrap node causing instability <logging> <rsyslog> <juju-core:New> <https://launchpad.net/bugs/1461339>
<natefinch> sinzui: that sounds a lot like what happened a couple weeks ago with that big customer problem.  I definitely think it's something we should be looking into addressing.  This is really katco's dept though :)  I'm just a code monkey now ;)
<sinzui> natefinch: If you feel it is still a real issue, then we mark the issue medium, and if we want to fix it in 1.25, it is high
<sinzui> thank you natefinch
<katco> sinzui: tal
<wallyworld> natefinch: you going to look at bug 1461354?
<mup> Bug #1461354: debug-log EOF: invalid tag and panic seen in state server logs <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461354>
<wallyworld> katco: you saw the credentials one in our fix - it's a one liner
<katco> wallyworld: eh? context pls! :)
<wallyworld> the cloudsigma credentials leak bug we talked about this morning
<wallyworld> it's a private bug on beta6 - a few hours ago they added a note to the bug
<katco> wallyworld: ah no i haven't looked at that
<wallyworld> we need to change one line of code to fix
<katco> wallyworld: ok, it sounds like it's proposed?
<wallyworld> proposed?
<wallyworld> oh on rb?
<katco> yeah
<wallyworld> not that i can see
<wallyworld> but i've been busy on other bugs and things sadly and it was past my official eod when i saw the comment and then i had soccer
<katco> wallyworld: ah ok. we'll try and get it fixed
<wallyworld> ty :-)
<wallyworld> past midnight here
<katco> wallyworld: yipes, gtb!
<wallyworld> will do, sorry for not fixing, thought it best to nag you at this stage of the evening :-)
<katco> wallyworld: definitely :)
<katco> wallyworld: share the load
<wallyworld> and the only other one is the critical debug log issue i discovered today
<wallyworld> debug log entries from machines != 0 are busted
<wallyworld> the logged tag seems to have an extra "-" prepended
<wallyworld> might? be syslog related, not sure
<natefinch> wallyworld: I think I know the exact problem
<wallyworld> awesome :-) so a fix is on the way?
<natefinch> wallyworld: yep
<wallyworld> tyvm :-)
<wallyworld> so we can get beta6 out today, yay
<wwitzel3> cherylj: ping, need a review when you have a moment, http://reviews.vapour.ws/r/1798/
<cherylj> wwitzel3: looking now, but I am not a graduated reviewer yet.
<mup> Bug #1461561 was opened: juju run fails with "Permission denied (publickey)" on manual provider <intermittent-failure> <manual-provider> <run> <juju-core:Triaged> <https://launchpad.net/bugs/1461561>
<wwitzel3> cherylj: that's ok, it will help get the ball rolling, thank you
<natefinch> jw4: you around?
<jw4> natefinch: indeed
<natefinch> jw4: what's up with juju action not taking the -e <env> flag until after the subcommand?  It's different from every other command we have, where you can do "juju deploy -e <env> [charm]" for example, but for action it's juju action do -e <env> [action]
<jw4> natefinch: that's weird
<jw4> natefinch: I suspect it's because all of our commands are subcommands, and we don't have any top level command code
<jw4> natefinch: "bug"
<perrito666> omg, how is it possible that the chrome hangouts plugin still does not accept multi account
<natefinch> abentley: addressed your concerns.  Should be a pretty quick review.  https://code.launchpad.net/~natefinch/juju-ci-tools/logrot/+merge/259750/+index
<abentley> natefinch: Thanks.  On the phone.
<natefinch> abentley: np
<rogpeppe> anyone know which commit fixed the second issue mentioned in this comment? https://bugs.launchpad.net/juju-core/+bug/1339770/comments/2
<mup> Bug #1339770: Machines are killed if mongo fails <canonical-is> <landscape> <maas-provider> <juju-core:Fix Released by wallyworld> <juju-core 1.18:Won't Fix> <https://launchpad.net/bugs/1339770>
<natefinch> man what I wouldn't give to have issues on github so when I fix a bug I don't have to go through and copy and paste all the bug data into the comments.
<natefinch> ericsnow: reviewboard doesn't seem to be picking up commits to 1.24... can you investigate?
<ericsnow> sure
<natefinch> wallyworld: fix for your issue: https://github.com/juju/juju/pull/2483
<ericsnow> natefinch: is your 1.24 up-to-date?
<natefinch> ericsnow: ish?
<natefinch> ericsnow: I think I updated yesterday
<natefinch> cherylj: can you review this?  https://github.com/juju/juju/pull/2483   fixes a bug I introduced here: https://github.com/juju/juju/pull/2458/files#diff-29cf702f1f3ee55354dc999b19a2e391R208
<natefinch> (very quick review)
<cherylj> natefinch: yeah, I'm finishing up a review for wwitzel3, give me a couple minutes.
<natefinch> cherylj: np, thanks
<ericsnow> natefinch: re: reviewboard, I'm not seeing a problem
<ericsnow> natefinch: was it a particular PR?
<natefinch> ericsnow: oh, no, maybe something is just really slow... the review is posted now
<ericsnow> natefinch: k
<ericsnow> natefinch: yeah, looks like RB is a little slow right now
<natefinch> (where really slow might be anything greater than my patience level of ~20 seconds ;)
<abentley> natefinch: r=me.  Shall I merge it for you?
<natefinch> abentley: yes please
<natefinch> abentley: thanks for all the time you spent reviewing the code.
<abentley> natefinch: You're welcome.  Thanks for your follow-through.
<mup> Bug #1461605 was opened: juju action commands require -e in the "wrong" place <actions> <juju-core:New> <https://launchpad.net/bugs/1461605>
<mgz_> gsamfira: bug 1455627
<mup> Bug #1455627: TestAgentConnectionDelaysShutdownWithPing fails <ci> <intermittent-failure> <lxc> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1455627>
<gsamfira> mgz_: ahh. Cool. This was the only test that sometimes failed against Go 1.4.2
<gsamfira> mgz_: tested on windows and Ubuntu. The rest all passed
<gsamfira> mgz_: also did a live test. Bootstrapped new env and deployed a noop charm
<gsamfira> on windows and ubuntu
<gsamfira> all worked fine
<mgz_> cool
<gsamfira> mgz_: it should probably be tested for a couple of weeks in the CI before trusting that all is well, but initial tests looked okay :)
<rogpeppe> if anyone fancies a review, here's a change to environs/config that starts us down the road that will let us create sensible UI forms for environment config attributes: https://github.com/juju/juju/pull/2487
<natefinch_> gah... dammit ubuntu... first my screen freezes, now I reboot and my mouse cursor is invisible
<natefinch> at least reviewboard didn't drop all my comments on the floor... that's nice
<perrito666> natefinch: well if it makes you feel better, each time I use my kb after taking a bath 2 finger scroll goes crazy
<natefinch> That sounds like a euphemism for something
<natefinch> gsamfira: I'd like to see tests with multi-byte characters... I wonder if those "interleaved nulls" are really just UTF16 padding for ASCII characters.
<perrito666> natefinch: that does not sound right at all
<natefinch> btw, evidently this was just released: https://github.com/aws/aws-sdk-go
<natefinch> "aws-sdk-go is the official AWS SDK for the Go programming language"
<natefinch> currently at "developer preview release"
<natefinch> https://aws.amazon.com/blogs/aws/developer-preview-of-aws-sdk-for-go-is-now-available/
<perrito666> natefinch: that sounds a lot like alpha :p
<natefinch> "we have moved our GitHub repository from the awslabs organization to aws/aws-sdk-go. This signifies that this SDK is no longer in an experimental state."
<natefinch> *shrug*
<natefinch> dammit git
<perrito666> natefinch: don't complain or we will make you use bzr
<natefinch> somehow it thought my branch was missing a whole buttload of commits... so it wouldn't let me repush
<perrito666> did you rebase?
<natefinch> perrito666: yes, and this is what I got: https://github.com/juju/juju/pull/2483
<natefinch> only the first and last commit are supposed to be there
<perrito666> ah, because rebasing usually results in you not being able to push your branch if you had pushed before
<natefinch> that's ok, I just made a new branch and cherry-picked the right commits: https://github.com/juju/juju/pull/2489
<perrito666> that makes no sense
<perrito666> all the commits there (which are mine) are merged into 1.24
<natefinch> cherylj: if you have a second, this is the same as the last one you reviewed, but with an added test.  http://reviews.vapour.ws/r/1855/diff/#   (the key to the test is the "" in NewRsyslogConfigWorker.... we have another test that does the same thing with a non-empty namespace, but no test with an empty one.)
<perrito666> natefinch: there, it is fine now
<perrito666> seems like github was having a bad day
<cherylj> natefinch: looking now.
<natefinch> perrito666: not sure what "there" means... still looks wrong to me
<perrito666> natefinch: odd, it looks ok to me
<perrito666> hard refresh?
<natefinch> been trying that
<natefinch> even incognito says it's still wrong.  It's ok.
<natefinch> that's what the other PR is for.
<natefinch> only took a second, it's just annoying.
<perrito666> natefinch: ah I am an idiot, I was seeing the new one
<natefinch> lol
<perrito666> natefinch: also, there is an error in your rebase, perhaps a merge in the middle, you can see how one of your commits is at the beginning of the list and the other at the end
<perrito666> in a proper rebase all your commits are at the end
<natefinch> I just did git pull --rebase  *shrug*
<perrito666> so remember that for git there are the nodes and the tree; all those commits are there because you are changing the shape of the tree for that branch
<natefinch> I don't know why there were any commits at all, given that I was on my own damn branch
<perrito666> you need to take it lightly :) come to spend a couple of weeks here where everything works like s**t and you will see git with a different eye
<natefinch> lol
<natefinch> pretty sure the average sound level at my house has increased by about 20 decibels per child
<thumper> sinzui: ping
<thumper> or abentley ?
<abentley> thumper: pong
<thumper> abentley: hey there
<thumper> abentley: where are we on adding a CI test for race conditions?
<abentley> thumper: It's there.
<thumper> abentley: I don't see it in the emails
<abentley> thumper: http://juju-ci.vapour.ws/job/run-unit-tests-race/
<abentley> thumper: It is in the emails, under Failed non-voting tests.
<thumper> abentley: ah, I see it now
<thumper> thanks
<thumper> hopefully we can make it voting next week
<abentley> thumper: That would be great.
<thumper> wallyworld: is the container provisioner fix being back ported?
<thumper> wallyworld: it should be
<wallyworld> thumper: sec
<wallyworld> thumper: not backported to 1.22 (CI didn't show an issue like it did for 1.24) and we are about to put 1.22.5 in trusty. But there will almost certainly be a 1.22 release and as soon as that milestone is created, it will be backported
 * thumper nods
<wallyworld> it only affects joyent
<thumper> ?!
<thumper> ah... no
 * thumper has read the bug, and I see it has been fixed...
 * thumper goes to make coffee
<thumper> WTF? ...
<thumper> ERROR while stopping machine agent: exec ["stop" "--system" "juju-agent-tim-testlocal"]: exit status 1 (stop: Method "Get" with signature "ss" on interface "org.freedesktop.DBus.Properties" doesn't exist)
<thumper> it did destroy the environment, but this is a bit worrying
#juju-dev 2015-06-04
<menn0> thumper: i get that all the time in my trusty VM
<thumper> ugh
<menn0> thumper: we should probably file a bug
<menn0> :)
<anastasiamac> axw: addressed :D PTAL when u get a chance :D
<axw> anastasiamac: looking
<anastasiamac> axw: \0/
<davecheney> any one having troubles adding comments to reviewboard today
<davecheney> the comment box does not load for me
<davecheney> http://reviews.vapour.ws/r/1847/diff/#
<davecheney> can someone please try to add a comment on this review and let me know if it works or not
<natefinch> davecheney: worked for me.
<davecheney> yay, logout / login dance worked
<davecheney> i hate reviewboard
<davecheney> it adds negative value
<natefinch> davecheney: when we started with it, people wanted it  for side-by-side diffs and dependent commits.  I don't know if anyone really uses the dependent commits (I certainly don't), and Github has side by side diffs now.
<natefinch> davecheney: it is a lot easier to see a few commits at a time on reviewboard... on github you see everything or just one.  And reviewboard makes commenting better, because they're batched up.
<natefinch> davecheney: it's also way easier to see old comments and the state the code was in at the time.
<davecheney> sure, no argument there
<davecheney> but IMO that doesn't pay for reviewboard's inherent shitness as an implementation
<anastasiamac> davecheney: ty for shipit!
<anastasiamac> axw: i got t2 for the price of one! tyvm :)))
<davecheney> thumper: what's going on with this workload stuff
<anastasiamac> 2 shipits*
<davecheney> is juju getting into the process management game ?
<axw> anastasiamac: nps
<natefinch> davecheney: yes
<davecheney> is there any value in me advising caution in this respect ?
<thumper> davecheney: what workload stuff?
<natefinch> probably not in the grand scheme of things, but I personally would like to hear what you have to say, and since my team is looking into this, that probably is useful.
<natefinch> current spec (still being revised): https://docs.google.com/document/d/1PcRQXaerlsACro4y1y5LWD-uvhfHya2CkOcoljyFyCU/edit#heading=h.62n9cmrnxg4o
<natefinch> brb
<davecheney> natefinch: IMO the only reliable way to track a process is to be its parent
<thumper> davecheney: updated my branch
<davecheney> and as the charms are the ones that own the start hook, it is not possible for juju to be the parent of any process executed by the start hook
<thumper> davecheney: I hear what you are saying
<thumper> davecheney: we are going for a good approximation of perfection :)
<natefinch> davecheney: I think we could have juju be the parent, but it would go against the way everything else works.  The main problem is that juju is currently set up such that everything in a charm is a script.  Getting juju to be the one to launch the process would likely require a more declarative approach than we currently use.
<natefinch> davecheney: there is a note in the spec about auto-launching from metadata... so that might do what you want.
<natefinch> it is under "open questions" and does say "optional" though
<davecheney> natefinch: i don't think juju can be the parent
<davecheney> postgres for example WILL NOT run in the foreground
<thumper> davecheney: could I get you to take another look at http://reviews.vapour.ws/r/1847/ plz?
<davecheney> and some charms start more than one process, etc
<davecheney> and when I originally joined Juju it was because juju was not going to be a process manager
<davecheney> thumper: done
<wallyworld> thumper: does a django charm take a zip / tarball as the app payload?
<natefinch> way past my bedtime.  Dave - we'll do the best we can.  I'm more than happy to have your input on it.  But I think container management (regardless of what we call it) is sort of required at this point.
<davecheney> it might be required
<davecheney> but getting 100% functionality is not possible without being the parent of the process
<davecheney> and 90% coverage will be like HA, and backups, and leadership, etc
<davecheney> that is, a source of bugs and support calls
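A minimal Go sketch of the distinction davecheney is drawing, with an illustrative command: a parent that calls Start and Wait gets the child's exit status directly from the OS, which juju cannot get for a process forked by a charm's start hook.

    package main

    import (
        "log"
        "os/exec"
    )

    func main() {
        // The parent starts the child itself...
        cmd := exec.Command("sleep", "5")
        if err := cmd.Start(); err != nil {
            log.Fatal(err)
        }
        // ...so Wait receives the exit status straight from the OS:
        // no polling, pidfiles, or guesswork. A process started by a
        // charm's start hook is a child of the hook, not of juju, so
        // juju never gets this notification.
        if err := cmd.Wait(); err != nil {
            log.Printf("child exited with error: %v", err)
            return
        }
        log.Println("child exited cleanly")
    }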
<thumper> wallyworld: no, not at this stage
<wallyworld> thumper: how does django in general consume its app payload when you run a django server?
<wallyworld> an exploded dir of files?
<wallyworld> you give django the dir via some config?
<thumper> wallyworld: there are two ways right now: 1) use a config value to specify a branch of bzr/git/hg/svn that is your app
<thumper> which I don't do
<thumper> or
<thumper> provide a payload subordinate charm that installs it using the 'django-settings' relation
<wallyworld> thumper: forgetting charms, if a user installs django without juju
<wallyworld> how does django itself consume the payload? dir on disk?
<thumper> yes, an app is normally a python module
<thumper> or package
<thumper> always get those two mixed up
<thumper> dir in the python path
<thumper> django executable is called with a settings module
<thumper> it uses those settings to determine which apps are used
<thumper> apps are expected to be in the python path
<wallyworld> thumper: thanks, so with resources, we would store an egg of something and then explode that in the right place for django to pick up
<thumper> yeah
<wallyworld> ta
<thumper> wallyworld: it would certainly be simpler than using a payload charm
<thumper> as long as the resources are optional
<wallyworld> yes, we will still support the current methods
<wallyworld> eg charm specifies bzr url
<thumper> optional meaning: could be there for any particular instance
<thumper> not supported or not
<thumper> it would be good to have an optional resource
<thumper> so we could use it if specified
 * thumper wanders off again
<wallyworld> yes, but if supported an admin could publish a django app to JES and then juju deploy --resource myapp=webstorev2
<davecheney> https://esta.cbp.dhs.gov/
<davecheney> anyone seeing a cert error visiting this site ?
<miken> davecheney: nope (FF 38)
<davecheney> chrome 43 whinges
<davecheney> but not chrome on my nexus
<rogpeppe1> davecheney: thanks for the review of http://reviews.vapour.ws/r/1853/diff/#
<rogpeppe1> davecheney: i've updated accordingly
<rogpeppe1> fwereade: fancy taking a look at http://reviews.vapour.ws/r/1853/diff/# ? it starts to implement some of the stuff that we discussed.
<fwereade> rogpeppe1, lovely, LGTM with a minor
<rogpeppe1> fwereade: ta!
<rogpeppe1> fwereade: i'd prefer not to add constants in this PR as it will obscure the actual necessary changes. i've left as much code as possible untouched so that it's obvious that it's not that invasive a change.
<rogpeppe1> fwereade: ISTM that adding constants is a fix for old code that justifies a separate PR
<fwereade> rogpeppe1, it doesn't *have* to be this PR, but would you do a followup then?
<rogpeppe1> fwereade: sure
<fwereade> rogpeppe1, works for me
<fwereade> rogpeppe1, tyvm
<rogpeppe1> fwereade: and i agree about the -path keys but that was another bite too big for this stage
<fwereade> rogpeppe1, definitely
<rogpeppe1> fwereade: i wanted to start with a fields var that was identical to the original one (i verified with gc.DeepEquals)
<fwereade> rogpeppe1, yeah, I do appreciate how uninvasive it is :)
<rogpeppe1> fwereade: you'll notice that i added another grouping ("juju") for attributes that are created by juju itself and can't be specified by the user.
<rogpeppe1> fwereade: AFAIK agent-version and uuid are the only two members of that group - is that right?
<fwereade> rogpeppe1, ...I *think* so
<axw> dimitern: would you mind reviewing a small change to ec2test? https://github.com/go-amz/amz/pull/51
<dimitern> axw, sure
<dimitern> axw, LGTM
<axw> dimitern: thanks
<voidspace> dimitern: omw
<jam> fwereade: TheMue: shouldn't we be archiving cards now that we've looked over the board?
<TheMue> jam: I don't know the capabilities of the tool. Unless there are no limits and good ways to query, I always prefer archiving, yes.
<jam> TheMue: we probably need to work with Alexis so she can pull out whatever metrics she wants to focus on (how many bugs addressed, velocity, etc)
<TheMue> jam: yep, I'll also talk to katco about the separation between planned tasks in cards and fixes in lp. currently I'm not sure about handling issues in kanban too (double capture vs simple overview in one place)
<rogpeppe1> axw: ping
<axw> rogpeppe1: pong
<rogpeppe1> axw: yay! you're there :)
<rogpeppe1> axw: i'm looking at UpgradeSuite.SetUpTest in cmd/juju/agent, which i think you might have had a hand in
<jam> TheMue: lp doesn't make it easy to break down by team IIRC
<rogpeppe1> axw: i just saw a test panic, and it looks like it might be related to the go func()... setAptCmds statement
<jam> and if we do start sizing bugs, it doesn't track that either
<rogpeppe1> axw: ... possibly :)
<axw> rogpeppe1: possibly dabbled in there. looking
<rogpeppe1> axw: anyway, it looks suspicious - do you know why that test starts a goroutine there?
<TheMue> jam: yes, different focus. would like a kind of  plugin to import an issue into a new card in a given board with auto-linking
<TheMue> *dreaming*
<axw> rogpeppe1: IIRC it's a contortion around the way command execution is hooked
<axw> rogpeppe1: so the commands are hooked, then sent to a channel, then this goroutine watches that channel and does stuff with them
<rogpeppe1> axw: i don't see how it could ever be correct
<rogpeppe1> axw: setAptCmds doesn't seem to synchronise with anything
<natefinch> jam, TheMue:  we don't size bugs.  we treat them as a kind of overhead
<rogpeppe1> axw: i'm fairly convinced this can't have anything to do with the panic i saw (a mgo double-close), but it still looks wrong to me
<jam> natefinch: depends how you want to do it. That portion is up to the team
<natefinch> jam, TheMue: the reason is that the sizing is so we know how long features take to implement.  The bugs factor into the environment that makes features take longer
<jam> TheMue: you can look into lp2kanban if you like, its a python project on LP that was an attempt to use LP's and Leankits APIs to sync them.
<axw> rogpeppe1: what do you mean doesn't synchronise? s.aptCmds locks a mutex on the suite...?
<rogpeppe1> axw: yes, but there's nothing to stop that goroutine running randomly after the test has finished
<rogpeppe1> axw: or even in arbitrary order over several tests
<axw> rogpeppe1: right. yes, that could be a problem.
<rogpeppe1> axw: it's mutexed but not synchronised
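A minimal sketch of the fix being implied here, with illustrative names rather than the real UpgradeSuite: teardown closes the channel the goroutine consumes and then waits for it, so the goroutine is synchronised with the test lifecycle rather than merely mutexed.

    package main

    // exampleSuite stands in for the test suite under discussion.
    type exampleSuite struct {
        cmds chan string
        done chan struct{}
    }

    func (s *exampleSuite) SetUpTest() {
        s.cmds = make(chan string)
        s.done = make(chan struct{})
        go func() {
            defer close(s.done)
            for range s.cmds {
                // ... record each hooked command (under the suite's mutex) ...
            }
        }()
    }

    func (s *exampleSuite) TearDownTest() {
        close(s.cmds) // stop the goroutine...
        <-s.done      // ...and wait, so it cannot run after the test finishes
    }

    func main() {
        s := &exampleSuite{}
        s.SetUpTest()
        s.cmds <- "apt-get install juju-local"
        s.TearDownTest()
    }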
<natefinch> jam, TheMue: if we had bugs in github there are like 1000 projects to sync things :/    for ex: https://waffle.io/juju/juju
<jam> natefinch: unbound 'done' doesn't seem great from Waffle
<axw> rogpeppe1: is this happening for you frequently, or sporadic?
<rogpeppe1> axw: i just saw it once; haven't tried again
<axw> ok
<natefinch> jam: I haven't dived into it super deeply, but I know it's pretty customizable
<rogpeppe1> axw: for the record, this is the panic i saw: http://paste.ubuntu.com/11562154/
<TheMue> natefinch: I would like bugs on github too
<axw> rogpeppe1: I'll make a note to fix it, need to finish off some storage stuff
<rogpeppe1> axw: ta!
<rogpeppe1> axw: it may have something to do with the fact i'm running with go tip, which i believe has changed the default GOMAXPROCS to >1
<natefinch> TheMue: from a purely pragmatic viewpoint, having bugs closer to the code is a good thing... plus then you get all the github bug integrations, like "fixes #123" which marks the bug as closed and puts a link from the bug to the fix automatically.   I think we'd gain a *lot* by moving to github for Juju bugs, just in ability to navigate between bugs and code more easily, and process improvements.
<rogpeppe1> axw: FWIW, it seems to be in the process of tearing down JujuOSEnvSuite, so it's probably not great that it's still running logic from UpgradeSuite :)
<axw> rogpeppe1: heh, yeah. I think I fixed this somewhere else before. I probably cargo culted before that
<rogpeppe1> hmm, no, that's commonMachineSuite
<rogpeppe1> axw: it's a pity we can't tell from the stack trace what the embedding type was
<rogpeppe1> a fairly trivial change to juju-utils (prompted by the discrepancy between UUID checking in juju/schema and that in utils) https://github.com/juju/utils/pull/137
<TheMue> natefinch: +1
<rogpeppe1> there are various UUIDs in the juju tests that fail the more stringent check. I don't really see why they should. e.g. 2d02eeac-9dbb-11e4-89d3-123b93f75cba
<rogpeppe1> TheMue: I think it was your code originally
<rogpeppe1> TheMue: want to take a look?
<TheMue> rogpeppe1: yes, any pointers to failing tests?
<rogpeppe1> TheMue: I changed environs/config to use utils.IsValidUUID rather than schema.UUID. That caused some tests to fail (try grepping for the above UUID in the juju code)
<rogpeppe1> TheMue: I think it's reasonable that the two checks should be aligned, and I don't see any particular reason to forbid non version 4 UUIDs from being parsed
 * rogpeppe1 thinks the whole notion of "versioned" uuids is weird
<TheMue> rogpeppe: I know you're no friend of UUIDs ;)
<rogpeppe> TheMue: i like UUIDs a lot :)
<rogpeppe> TheMue: it's just the RFC i have issues with
<TheMue> rogpeppe: the versions simply express how they are produced, or better, which information is part of the UUID generation, e.g. MAC addresses
<rogpeppe> TheMue: ... which we never use
<TheMue> rogpeppe: IMHO an IsValidUUID should allow all versions, not only v4.
<rogpeppe> TheMue: i guess it might be useful to have a Meta method on UUID that returns information on the UUID by looking at the version
<rogpeppe> TheMue: but most people these days just use /dev/random so it wouldn't be that useful
 * TheMue is currently happy, workers produce machine noise from two sides of the house. *aaaargh*
<TheMue> rogpeppe: yeah, v4 is the most simple one
<rogpeppe> TheMue: thanks. i'd appreciate your LGTM on the review if you think it looks reasonable.
<TheMue> rogpeppe: my own lib (https://godoc.org/github.com/tideland/goas/v2/identifier) produces v1, v3, v4, and v5
<rogpeppe> TheMue: the only reason I can think of for versioning UUIDs is so you can extract metadata (e.g. the mac address) from the UUID afterwards. i totally don't see the point in version 3 or 5, which don't even preserve the hash
<mup> Bug #1461871 was opened: worker/diskmanager sometimes goes into a restart loop due to failing to update state <storage> <juju-core:Triaged by axwalk> <https://launchpad.net/bugs/1461871>
<rogpeppe> TheMue: anyway, old argument, best not restarted :)
<TheMue> rogpeppe: I have to admit I never looked into the history of UUIDs and why those versions have been created. I only implemented them. *lol*
<rogpeppe> TheMue: there are many standards that aren't worth implementing :)
<TheMue> s/standards/code/
<TheMue> hmm, and are to is. or does a plural of code make sense?
<TheMue> rogpeppe: btw, LGTM
<rogpeppe> TheMue: i'd probably say "there is much code that isn't worth implementing"
<rogpeppe> TheMue: thanks
<TheMue> rogpeppe: yeah, I've seen huge "enterprise systems" containing tons of never called code. but nobody ever removed it and so those systems got more and more unmaintainable.
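A minimal sketch of the two checks being aligned; the regexps are illustrative, not the juju/utils code. The permissive form accepts the general UUID shape, the strict form insists on version 4, and the version-1 UUID quoted from the juju tests above passes the first but fails the second.

    package main

    import (
        "fmt"
        "regexp"
    )

    var (
        // Any variant/version: five dash-separated hex groups.
        anyUUID = regexp.MustCompile(
            `^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)
        // Version 4 only: version nibble 4, variant nibble 8/9/a/b.
        v4UUID = regexp.MustCompile(
            `^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)
    )

    func main() {
        u := "2d02eeac-9dbb-11e4-89d3-123b93f75cba" // a version-1 UUID
        fmt.Println(anyUUID.MatchString(u)) // true
        fmt.Println(v4UUID.MatchString(u))  // false: version nibble is 1
    }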
<jam> jam
<rogpeppe> anyone know why state.State.ForEnviron dials mongo rather than just doing a Session.Copy ?
<thumper> rogpeppe: because I don't understand mongo
<thumper> :)
<rogpeppe> thumper: :)
<thumper> if Copy works, happy to change
<rogpeppe> thumper: it should do, unless i've misunderstood what's going on
<thumper> rogpeppe: you probably haven't
<rogpeppe> thumper: (which is very likely as i'm just skimming through code trying to understand why tests are failing)
<rogpeppe> thumper: BTW, in case you missed it: http://reviews.vapour.ws/r/1853/
<thumper> not looked sorry
<rogpeppe> thumper: it's the start of what i discussed with you the other night
<thumper> ack
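A minimal sketch of the alternative rogpeppe suggests, with a hypothetical helper standing in for state.State.ForEnviron: Session.Copy hands back a session backed by the already-dialled connection, so there is no need to dial mongo again per environment.

    package main

    import mgo "gopkg.in/mgo.v2"

    // forEnviron is a hypothetical stand-in for State.ForEnviron.
    func forEnviron(session *mgo.Session, envUUID string) *mgo.Session {
        // Copy reuses the existing connection's socket pool;
        // Dial would open a brand-new connection.
        s := session.Copy()
        // ... wrap s in an environment-specific State keyed by envUUID ...
        _ = envUUID
        return s
    }

    func main() {
        session, err := mgo.Dial("localhost")
        if err != nil {
            panic(err)
        }
        defer session.Close()

        s := forEnviron(session, "some-env-uuid")
        defer s.Close()
    }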
<mup> Bug #1461888 was opened: Units stuck in agent-state: down state <juju-core:New> <https://launchpad.net/bugs/1461888>
<mup> Bug #1461889 was opened: don't turn invalid SetAddresses calls into Assert-only txns <tech-debt> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1461889>
<mup> Bug #1461890 was opened: leadership unreliable in HA <juju-core:Triaged> <https://launchpad.net/bugs/1461890>
<mup> Bug #1461888 changed: Units stuck in agent-state: down state <juju-core:New> <https://launchpad.net/bugs/1461888>
<mup> Bug #1461889 changed: don't turn invalid SetAddresses calls into Assert-only txns <tech-debt> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1461889>
<mup> Bug #1461890 changed: leadership unreliable in HA <juju-core:Triaged> <https://launchpad.net/bugs/1461890>
<mup> Bug #1461888 was opened: Units stuck in agent-state: down state <juju-core:New> <https://launchpad.net/bugs/1461888>
<mup> Bug #1461889 was opened: don't turn invalid SetAddresses calls into Assert-only txns <tech-debt> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1461889>
<mup> Bug #1461890 was opened: leadership unreliable in HA <juju-core:Triaged> <https://launchpad.net/bugs/1461890>
<perrito666> davecheney: from the doc on rand I understand that using just rand.Intn will produce a deterministic result, which I don't really want
<davecheney> perrito666: do you mean it won't be seeded
<perrito666> well, I don't feel the doc is all that clear, but if it behaves as I assume it does, it will always use the same seed
<perrito666> which should return a deterministic sequence of numbers on each run of juju
<perrito666> I might have just missunderstood
<davecheney> perrito666: i think juju should see the rng on startup
<perrito666> s/see/seed?
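A minimal sketch of the behaviour being described: math/rand's top-level functions use a fixed default seed, so every run of the program sees the same sequence unless the RNG is seeded once at startup.

    package main

    import (
        "fmt"
        "math/rand"
        "time"
    )

    func main() {
        // Default seed is 1: this prints the same value on every run.
        fmt.Println(rand.Intn(100))

        // Seed once at startup and the sequence varies between runs.
        rand.Seed(time.Now().UnixNano())
        fmt.Println(rand.Intn(100))
    }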
<katco> wwitzel3: ericsnow: jam and i running a little late, brt
<wwitzel3> katco: rgr
<rogpeppe> here's a fix for an intermittent test failure in worker/envworkermanager; reviews appreciated please: https://github.com/juju/juju/pull/2491
<perrito666> axw: around?
<axw> perrito666: hey, lurking
<perrito666> axw: have time for a short question? (sorry I know its late there)
<axw> perrito666: sure, what's up?
<perrito666> I privmessaged it
<mup> Bug #1461957 was opened: Does not use security group ids <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1461957>
<mup> Bug #1461959 was opened: serverSuite teardown fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1461959>
<mup> Bug #1461961 was opened: UniterSuite teardown fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1461961>
<perrito666> its a shame go sees go go as a typo
<mup> Bug #1461957 changed: Does not use security group ids <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1461957>
<mup> Bug #1461959 changed: serverSuite teardown fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1461959>
<mup> Bug #1461961 changed: UniterSuite teardown fails <ci> <intermittent-failure> <test-failure> <juju-core:Invalid> <https://launchpad.net/bugs/1461961>
<mup> Bug #1461957 was opened: Does not use security group ids <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1461957>
<mup> Bug #1461959 was opened: serverSuite teardown fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1461959>
<mup> Bug #1461965 was opened: UserSuite setup fails <ci> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461965>
<wwitzel3> katco: will be about 3 minutes behind standup
<katco> wwitzel3: k np
<mup> Bug #1461968 was opened: TestLXCProvisionerObservesConfigChanges fails <ci> <intermittent-failure> <test-failure> <juju-core:New> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461968>
<mup> Bug #1461969 was opened: TestDialAgain fails <ci> <intermittent-failure> <test-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461969>
<mup> Bug #1461993 was opened: support using an existing vpc <juju-core:New> <https://launchpad.net/bugs/1461993>
<rogpeppe> jam: it looks like you're ocr today? if so, here are two small juju-core fixes for you, both fixing flaky tests: http://reviews.vapour.ws/r/1859/, http://reviews.vapour.ws/r/1858/
<rogpeppe> anyone else: reviews much appreciated. the first one is just a one line fix.
<jam> rogpeppe: well, technically I'm 2 hours past my EOD but I happen to be around so I'll give it a look
<jam> first one LGTM
<rogpeppe> jam: thanks :)
<jam> rogpeppe: for http://reviews.vapour.ws/r/1858/diff/#
<jam> I feel like "errors.Cause(err) == tomb.ErrDying" isn't right.
<rogpeppe> jam: oh yes?
<jam> I *feel* like that should be "!= tomb.ErrDying"
<jam> rogpeppe: why would we only pass up ErrDying if the underlying runner dies?
<rogpeppe> jam: funnily enough i had it that way originally, but it's wrong
<rogpeppe> jam: luckily i wrote the test :)
<jam> rogpeppe: so I think the test you added is handled by go func() { m.tomb.Kill(m.runner.Wait()) }
<jam> that passes whatever Wait() returns immediately to tomb, doesn't it?
<rogpeppe> jam: but that can happen at any time in the future
<rogpeppe> jam: there's no guarantee that it happens before tomb.Done is called
<jam> sure
<jam> rogpeppe: but I don't see how the concrete error gets returned from loop()
<jam> if it came from runner then
<jam> rogpeppe: ah, you're checking retErr
<jam> I'm not a big fan of secret named return vars
<rogpeppe> jam: yes, i wondered if you hadn't noticed that
<rogpeppe> jam: it's not *that* secret :)
<jam> so only if the current return reason is ErrDying then override with the error from m.runner
<rogpeppe> jam: yes
<jam> rogpeppe: I feel like that is actually the responsibility of tomb.IsFatal and tomb.IsMoreImportant
<rogpeppe> jam: if you've got a better suggestion for how to phrase it, please tell
<jam> rogpeppe: shouldn't we just pass m.tomb.Kill() the value of m.runner always?
<rogpeppe> jam: tomb doesn't have either of those things AFAIR
<jam> and then loop() returns and passes that in as well?
<rogpeppe> jam: yeah, that's probably better
<jam> rogpeppe: k. so the thing I saw was m.runner having the comparison checkers
<jam> rogpeppe: but Tomb also knows about error priority
<jam> at least, it knows how to treat ErrDying vs a real error.
<rogpeppe> jam: yes, w.r.t. other errors > ErrDying
<jam> rogpeppe: do we have to unwrap our error before passing it to tomb.Kill?
<jam> given that you are using errors.Cause()
<jam> we can't pass an errors.Cause() style error directly to tomb.Kill()
<rogpeppe> jam: that's not actually necessary
<jam> if Cause() == ErrDying but *err* != ErrDying then tomb doesn't work right.
<rogpeppe> jam: i just tend to avoid direct error == tests
<jam> rogpeppe: so new review, LGTM though maybe we'd rather just do m.tomb.Kill(m.runner.Wait()) as we do elsewhere rather than reimplementing that check.
<jam> rogpeppe: hm...
<jam> maybe not
<rogpeppe> jam: ha, it should actually probably be done in the caller function, around line 36
<jam> rogpeppe: as if we are talking to m.tomb.Kill() first
<jam> then we take priority
<rogpeppe> jam: i'm thinking we probably want the return value from m.runner to take precedence
<jam> rogpeppe: defer func() {m.runner.Kill(); m.tomb.Kill(m.runner.Wait())} ?
<jam> rogpeppe: I don't know this func that well. ATM loop() sets the value first
<jam> And arguably if m.envHasUUID() or whatever fails we're in a state that it doesn't really matter what m.runner returns.
<jam> sorry m.envHasChanged(uuid)
<jam> rogpeppe: so what you have is ok, moving into the New* func sounds slightly better.
<jam> ordering of errors sounds like it is going to get us into trouble unless people can concretely say that they need this error over that one.
<jam> but if envHasChanged returns an error we are likely in a state where m.runner can't say real things anyway.
<rogpeppe> jam: it's always an interesting issue
<rogpeppe> jam: my thought is that m.runner contains the meatiest stuff
<rogpeppe> jam: so we're much more likely to be interested in that error
<rogpeppe> jam: (i wish that tomb logged errors when it threw them away)
<jam> rogpeppe: agreed about tomb
<rogpeppe> jam: aside: the default value of GOMAXPROCS has changed in go tip, so it's finding these issues more consistently
<rogpeppe> fwereade: do you have any idea what the root cause is behind the intermittent worker/uniter test failures?
<rogpeppe> fwereade: e.g. FAIL: uniter_test.go:875: UniterSuite.TestUniterUpgradeConflicts
<rogpeppe> util_test.go:726:
<rogpeppe>     c.Fatalf("never reached desired status")
<rogpeppe> jam: after experimenting both ways, i think it looks marginally nicer keeping the defer inside the loop function. then the outer level is the classic defer done, kill(loop) idiom
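A minimal sketch of the shape they settle on, with illustrative stand-ins for the envworkermanager types: the loop's deferred func kills the runner and lets its error take precedence over tomb.ErrDying, but not over a real error already being returned.

    package main

    import "gopkg.in/tomb.v1"

    // worker stands in for the managed runner.
    type worker interface {
        Kill()
        Wait() error
    }

    type manager struct {
        tomb   tomb.Tomb
        runner worker
    }

    func (m *manager) loop() (retErr error) {
        defer func() {
            m.runner.Kill()
            // The runner's error wins over ErrDying, not over a real error.
            if err := m.runner.Wait(); retErr == nil || retErr == tomb.ErrDying {
                retErr = err
            }
        }()
        for {
            select {
            case <-m.tomb.Dying():
                return tomb.ErrDying
                // ... react to environment changes in other cases ...
            }
        }
    }

    // nopWorker lets the sketch run end to end.
    type nopWorker struct{ done chan struct{} }

    func (w *nopWorker) Kill()       { close(w.done) }
    func (w *nopWorker) Wait() error { <-w.done; return nil }

    func main() {
        m := &manager{runner: &nopWorker{done: make(chan struct{})}}
        go m.tomb.Kill(nil) // ask the manager to stop
        m.tomb.Kill(m.loop())
        m.tomb.Done()
    }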
<natefinch> rogpeppe: I'm so glad they changed the default of GOMAXPROCS.
<fwereade> rogpeppe, not offhand -- status-setting has changed lately, though
<rogpeppe> natefinch: +1
<rogpeppe> natefinch: i'm so glad they sped up the scheduler so much :)
<natefinch> rogpeppe: hah yeah, that too.  Though honestly, I think I would have set it to NumCPUs anyway, since it was always surprising to people when it was not (and it would have meant they find more race conditions if someone did set GOMAXPROCS).
<rogpeppe> fwereade: here's the test output i saw (excluding log messages) FWIW: http://paste.ubuntu.com/11567979/
<rogpeppe> jam: i've changed it - PTAL
<jam> rogpeppe: https://docs.google.com/document/d/1At2Ls5_fhJQ59kDK2DFVhFu3g5mATSXqqV5QrxinasI/edit?pli=1 those GOMAXPROCS benchmarks look really good in 1.5
<rogpeppe> jam: yeah, the overhead's more or less gone away
<fwereade> rogpeppe, hmm, that looks like we're not clearing the resolved flag
<fwereade> rogpeppe, I always worried that would be racy... :/
 * fwereade thinks he might be able to see it -- util_test.go:978 looks like exactly the sort of hack that'd be rendered unstable by status changes
<jam> rogpeppe: lgtm
<rogpeppe> jam: ta!
 * jam leaves to go play with my son
<fwereade> rogpeppe, would you point wallyworld at it please?
<rogpeppe> jam: enjoy!
<rogpeppe> fwereade: at the test failure?
<fwereade> rogpeppe, yeah, I'm blithely asserting that he did status stuff and should be in a good position to track it down
<fwereade> rogpeppe, or possibly perrito666? ^^
<rogpeppe> ha, i bet this is the same issue
<rogpeppe> https://bugs.launchpad.net/juju-core/+bug/1448308
<mup> Bug #1448308: Skipped TestUniterUpgradeConflicts on ppc64 <skipped-test> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1448308>
<rogpeppe> 	coretesting.SkipIfPPC64EL(c, "lp:1448308")
<rogpeppe> that's surely bogus
<rogpeppe> fwereade: it would be nice to have comments on the uniter test primitives. for example, what are the various fields in the waitUnitAgent meant to imply?
<rogpeppe> fwereade: what does a resolved status of "no hooks" imply?
<rogpeppe> sorry, no-hooks
<rogpeppe> oh i see, ignore me
<voidspace> dimitern: http://reviews.vapour.ws/r/1860/
<mgz_> oh, forgot to propose branch...
<perrito666> sorry I was afk, not having the best health day
<mgz_> rogpeppe: just for you, http://reviews.vapour.ws/r/1861
<perrito666> fwereade: rogpeppe, what's going on?
<rogpeppe> mgz_: i think a fix for those is already landing
<mgz_> doh, this is why I shouldn't forget branches
<rogpeppe> mgz_: as part of https://github.com/juju/juju/pull/2487/files
<rogpeppe> mgz_: although... i can't land that quite yet as i can't get tests to pass
<rogpeppe> mgz_: so LGTM, go for it
<mgz_> rogpeppe: sure thing
<dimitern> voidspace, looking, but also in a call so it might take some time
<mgz_> rogpeppe: I actually did that change after you said in irc but forgot about it...
<voidspace> dimitern: np, I'm working on the higher level stuff anyway
<perrito666> rogpeppe: fwereade do you still need my help? that particular test suite is a pain to follow
<natefinch> ericsnow: I presume ProcessInfo.Status is supposed to be an enum?  i.e., you can't just put whatever string you want in there
<ericsnow> natefinch: pretty much
<rogpeppe> perrito666: i'm seeing a consistently reproducible failure in that test
<rogpeppe> perrito666: and i don't really want to get sidelined into fixing it
<natefinch> ericsnow: I'm going to leave a comment that says we should make it a numeric enum. Making it a string just makes it less obvious it's supposed to be an enum
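A minimal sketch of the suggestion, with illustrative names rather than the actual ProcessInfo API: a numeric enum makes the closed set of values explicit, and a String method keeps a readable form for display.

    package main

    import "fmt"

    type Status int

    const (
        StatusPending Status = iota
        StatusRunning
        StatusStopped
    )

    func (s Status) String() string {
        switch s {
        case StatusPending:
            return "pending"
        case StatusRunning:
            return "running"
        case StatusStopped:
            return "stopped"
        }
        return fmt.Sprintf("Status(%d)", int(s))
    }

    func main() {
        fmt.Println(StatusRunning) // running
    }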
<perrito666> rogpeppe: and this only happens in ppc
<perrito666> ?
<rogpeppe> perrito666: no, this is on my normal laptop
<rogpeppe> perrito666: although i am running with go tip
<rogpeppe> perrito666: i'm sure it's just an inherent problem with the test though
<perrito666> rogpeppe: yes, that test is as brittle as a crystal hammer
<perrito666> well, not really brittle, it just does a very good job of hiding real issues
<rogpeppe> perrito666: verifyWaitingUpgradeError in particular seems very handwavy
<perrito666> rogpeppe: this is master tip right?
<rogpeppe> perrito666: yes
 * perrito666 runs the test
<rogpeppe> perrito666: try running it with GOMAXPROCS=4
<rogpeppe> perrito666: and run it a few times.
<rogpeppe> perrito666: (best to use go test -c and then run the test binary directly)
<perrito666> oh it is one of those bugs
<rogpeppe> perrito666: yeah, it's definitely a race
<natefinch> nick natefinch-afk
<dimitern> voidspace, reviewed
<voidspace> dimitern: thanks
<voidspace> dimitern: cool, not much to do - thanks
<dimitern> voidspace, well, the code looks solid :)
<perrito666> hey I cannot make it to today's meeting I have to take my wife to the dentist, sorry ppl
<alexisb> perrito666, team call
<alexisb> voidspace, team call if you are still around
<alexisb> natefinch, team call
<perrito666> alexisb: as I said above: <perrito666> hey I cannot make it to today's meeting I have to take my wife to the dentist, sorry ppl
<Web> https://jujucharms.com/static/img/jujudocs/1.23/getting_started-aws_security.png <-- time to change this I think.  There is an emphasis on IAM profiles now.  Need to run through the process of using one first.
<natefinch> alexisb: oops, sorry coming
<mgz_> ericsnow: I have a proposed fix, but I don't think I actually have a bitbucket account
<ericsnow> mgz_: k
<mgz_> draft.rich_text = True
<ericsnow> mgz_: ah
<mgz_> but there's also a description_rich_text
<mgz_> just tracking down if that's a subset or not
<mgz_> aha, just rich_text is deprecated
<mgz_> so, description_rich_text and testing_done_rich_text are the new way
<katco> natefinch: 1:1 time
<natefinch> katco: dang, sorry
<natefinch> katco: coming
<katco> no worries
<perrito666> are there no tests for modes code?
<natefinch> perrito666: we don't test code that rhymes
<perrito666> natefinch: not fun
<natefinch> perrito666: better run
<perrito666> you know... you are lucky to be so far away :p
<natefinch> but I do, so I can rhyme all day :)
 * perrito666 shops for a ticket to boston
 * perrito666 shops for a clue by four
<natefinch> sorry, A/C man at the door.  Sounds like he's not going to make us poor.
<thumper> morning
<natefinch> morning
<thumper> how's it going nate?
<natefinch> thumper: not bad... getting our A/C fixed.... which is good, because we thought it needed to be replaced until we got a second opinion.
<natefinch> thumper: so, probably going to cost us ~$500 instead of ~$10,000
<thumper> I bet you're pleased about that
<natefinch> immensely.  We definitely had not planned on spending $10,000 on the A/C this summer.
<natefinch> unrelated, I'm trying to figure out a way to get github to be a satisfactory bug tracker for us, because i'm tired of not having integration with the code.
<thumper> natefinch: nope, nope, nope
<natefinch> heh
<natefinch> thumper: what's wrong with github's bug tracker?
<thumper> natefinch: I'm not even going to get started
<thumper> I don't have long enough
<natefinch> thumper: haha.... that's ok, me neither
 * mgz_ hugs thumper 
<natefinch> me either?  me neither?  I guess I should say "I also do not have time"
<mup> Bug #1462097 opened: Bootstrap error logging needs to be more descriptive <juju-core:New> <https://launchpad.net/bugs/1462097>
<axw> perrito666: worker/uniter/uniter_test.go tests the modes indirectly
<axw> they're more functional style tests than unit tests
<wallyworld> axw: anastasiamac: perrito666: just finishing a meeting, be there soon
<axw> sure
<anastasiamac> k
<perrito666> axw: well yes, I am trying to figure out how to fit my test
<axw> wallyworld anastasiamac: I raised https://bugs.launchpad.net/juju-core/+bug/1462146
<mup> Bug #1462146: cmd/juju/storage: "add" fails to dynamically add filesystem for storage <storage> <juju-core:New> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1462146>
<wallyworld> ty
<anastasiamac> axw: tyvm :D
<wallyworld> axw: anastasiamac: btw, i've started to merge 1.24 into master, *lots* of conflicts so i'll see how i go. might need to cherry pick individual commits if it's too hard
<axw> wallyworld: thank you
<anastasiamac> wallyworld: sure! just say the word :D
<mup> Bug #1462146 opened: cmd/juju/storage: "add" fails to dynamically add filesystem for storage <storage> <juju-core:New> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1462146>
<davecheney> mwhudson: would it short circuit the debian debate if I proposed a change which just added the words we wanted to that page ?
<mwhudson> davecheney: quite possibly, yes
#juju-dev 2015-06-05
<davecheney> mwhudson: ok
<natefinch> thumper: why do we use github.com/juju/schema instead of using struct tags?
<davecheney> natefinch: jw4 might have an opinion
<wallyworld> axw: can you recall the change in master that flattened storage.Volume ie got rid of VolumeInfo. i'm getting conflicts merging 1.24 and just need to recall which way around it is supposed to be in master
<axw> wallyworld: it's hte other way around, I extracted VolumeInfo from Volume
<wallyworld> axw: right, ty
<natefinch> davecheney: note, I'm talking about our own yaml schema validation stuff, for things like metadata.yaml, not jsonschema, if that's what you were thinking.  Pretty sure the schema stuff predates jw4 :)
<davecheney> oh
<davecheney> right
<davecheney> ignore what i said
<davecheney> i thought you were talking about json schema
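A minimal sketch of the struct-tag alternative natefinch is asking about, with illustrative fields rather than the real charm metadata schema: gopkg.in/yaml.v2 decodes metadata-style YAML straight into a tagged struct, at the cost of the coercion and defaulting that a checker package like juju/schema provides.

    package main

    import (
        "fmt"
        "log"

        "gopkg.in/yaml.v2"
    )

    // meta is illustrative, not the real charm metadata type.
    type meta struct {
        Name        string `yaml:"name"`
        Summary     string `yaml:"summary"`
        Subordinate bool   `yaml:"subordinate"`
    }

    func main() {
        data := []byte("name: wordpress\nsummary: blog engine\nsubordinate: false\n")
        var m meta
        if err := yaml.Unmarshal(data, &m); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%+v\n", m) // {Name:wordpress Summary:blog engine Subordinate:false}
    }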
<thumper> fuck fuck fuckity fuck
<wallyworld> thumper: is jessie in today?
<thumper> no, leave
<wallyworld> thumper: ah ok, can you remind him next week then to port bug 1441478 to master, i think he forgot
<mup> Bug #1441478: state: availability zone upgrade fails if containers are present <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Fix Released by waigani> <https://launchpad.net/bugs/1441478>
<wallyworld> i could include it in my forward port, if i can do so without risk i will
<menn0> wallyworld: an old ticket (bug 1318366) has recently been reopened by aaron regarding mgo/txn panics
<menn0> wallyworld: it's attached to me b/c I helped fix last year's issue
<wallyworld> oh?
<menn0> wallyworld: I think fwereade is looking at this issue ("panic: rescanned document misses transaction in queue")
<menn0> wallyworld: do you know?
<wallyworld> yes he's looking
<wallyworld> stuck though
<menn0> wallyworld: found the newer ticket: https://bugs.launchpad.net/juju-core/+bug/1449054
<mup> Bug #1449054: Intermittent panic: rescanned document <ci> <test-failure> <juju-core:Triaged by fwereade> <juju-core 1.22:Fix Released by dimitern> <juju-core 1.23:Won't Fix by fwereade> <juju-core 1.24:Fix Released by fwereade> <https://launchpad.net/bugs/1449054>
<menn0> wallyworld: looks like the fix wasn't merged into trunk though
<menn0> wallyworld: I might deal with this now b/c I want to bump the mgo dependency anyway
<wallyworld> menn0: i'm forward porting a bunch of 1.24 changes now, which includes the latest mgo from 1.24. trunk has an old copy of mgo from a year ago
<wallyworld> menn0: trunk will soon have the same dep rev as 1.24
<wallyworld> or did you want a later one as a new mgo has been released since then
<menn0> wallyworld: I want something from the release from a few days ago
<menn0> wallyworld: it's not urgent though
<menn0> wallyworld: so do your forward port first and i'll handle bumping the dep again afterwards
<wallyworld> ok, i'll hopefully have master up to date (mostly) with 1.24 soon
<menn0> wallyworld: are we likely to ever do another 1.22 release? CI opened up a ticket for a test fix for 1.22 recently that we have fixed in later releases but it was never backported that far back. i think I can close it unless we might do another 1.22 release.
<thumper> wallyworld: got time to hangout?
<wallyworld> thumper: sure
<thumper> back in our chat
<wallyworld> kk
<wallyworld> menn0: we might do a 1.22.6
<wallyworld> menn0: but no decision yet - depends if william finds and fixes the mgo bug
<wallyworld> menn0: so may keep open until decided
<menn0> wallyworld: ok. i'll backport this fix. it's a fairly easy one.
<wallyworld> ok
<thumper> trivial review for someone: http://reviews.vapour.ws/r/1866/
<davecheney> thumper: i shall put a trivial effort into reviewing said
<thumper> davecheney: WFM
<davecheney> i also added a trivial review comment
<davecheney> it should be trivial for you to address same
<menn0> wallyworld: you around?
<wallyworld> yeah
<menn0> wallyworld: see PM
<wallyworld> axw: when you have a moment, could you eyeball this for me? it picks up our various 1.24 fixes for storage, leadership, etc, plus the tags stuff. diff is big, but if you know the work, you should be able to peruse through pretty quickly http://reviews.vapour.ws/r/1868/diff/#
<axw> wallyworld: ok
<wallyworld> no rush
<axw> wallyworld: I found why the blockdevices txns are aborting: structs are in a different order in mongo to in the assertion. I have nfi how they got in that order though. current hypothesis is that mgo/txn is deserialising a queued transaction's ops into a bson.M, and so not preserving the order
<wallyworld> oh wonderful
<wallyworld> that could explain it yeah
<wallyworld> axw: doesn't mongo order the struct fields alphabetically?
<axw> wallyworld: no, it stores them in the order you present them. mgo preserves struct field order when encoding to BSON
<wallyworld> ok, i recall something about alphabetical order, but can't remember now
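A minimal sketch of axw's hypothesis, with an illustrative document type: bson.M is a plain Go map, so re-marshalling through it can reorder fields, while structs and bson.D preserve order; if mgo/txn round-trips queued ops through bson.M, an order-sensitive assertion can then abort.

    package main

    import (
        "bytes"
        "fmt"

        "gopkg.in/mgo.v2/bson"
    )

    type blockDevice struct {
        Name string `bson:"name"`
        Size int    `bson:"size"`
    }

    func main() {
        orig, _ := bson.Marshal(blockDevice{Name: "sda", Size: 100})

        var m bson.M // map: key order is not preserved
        bson.Unmarshal(orig, &m)
        viaMap, _ := bson.Marshal(m)

        var d bson.D // ordered document: key order is preserved
        bson.Unmarshal(orig, &d)
        viaD, _ := bson.Marshal(d)

        fmt.Println(bytes.Equal(orig, viaMap)) // may be false: order can flip
        fmt.Println(bytes.Equal(orig, viaD))   // true
    }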
<axw> wallyworld: merge looks fine
<wallyworld> axw: ty, will land and mark a lot of bugs as fix committed
<wallyworld> axw: reviewed, off for school pickup, bbiab
<axw> wallyworld: thanks. heh, oops, left that && false for debugging
<wallyworld> thought so :-)
 * thumper is done
<mup> Bug #1462213 opened: [Ubuntu Vivid] Cannot bootstrap on local provider <juju-core:New> <https://launchpad.net/bugs/1462213>
<axw> wallyworld: bleh, I broke managedfs
<axw> wallyworld: haven't merged, will fix that first
<wallyworld> ok
<wallyworld> phew
<axw> wallyworld: FYI I've pushed my fix to the PR. tested manually, going to rebase and merge now
<wallyworld> axw: great, ty
<dimitern> voidspace, ping
<dimitern> voidspace, because I keep forgetting, let me ask you now - please send me the ep2015 talk by mail so I can go over it :)
 * dimitern steps out for a while
<voidspace> dimitern: I have to find it
<voidspace> dimitern: but yes - it's not on this computer
<voidspace> dimitern: it's probably on my laptop - and I'm going downstairs to work as the handyman needs my office
<perrito666> good morning all
<natefinch> can someone else try logging on to canonicaladmin?  it keeps telling me the username and password is wrong, but they're saved in my password manager, so I'm sure they're not wrong
<jhobbs> wfm natefinch
<natefinch> hmm... possibly related: https://www.youtube.com/watch?v=Lz9810Y7ZRw
<perrito666> natefinch: In a company I once worked at, that meant talk to HR :p
<perrito666> yeah, exactly that video
<perrito666> natefinch: works for me, still ugly as hell
<natefinch> weird
<natefinch> ericsnow: do we support vivid on 1.22?  re: https://bugs.launchpad.net/bugs/1462213
<mup> Bug #1462213: [Ubuntu Vivid] Cannot bootstrap on local provider <juju-core:New> <https://launchpad.net/bugs/1462213>
<mgz_> natefinch: no
<mgz_> natefinch: not for the local provider specifically
<mup> Bug #1462213 changed: [Ubuntu Vivid] Cannot bootstrap on local provider <juju-core:New> <https://launchpad.net/bugs/1462213>
<mup> Bug #1462409 opened: FilesystemStateSuite setup failed <ci> <test-failure> <juju-core:New> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1462409>
<mup> Bug #1462409 changed: FilesystemStateSuite setup failed <ci> <test-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1462409>
<mup> Bug # opened: 1462409, 1462412, 1462415, 1462417, 1462418
<mup> Bug # changed: 1462412, 1462415, 1462417, 1462418
<natefinch> katco: which hangout are we supposed to be in now?  I'm in retrospective, but no one else is
<ericsnow> natefinch: we all went back to moonstone
<mup> Bug # opened: 1462412, 1462415, 1462417, 1462418, 1462423
<mup> Bug #1457645 changed:  warning: log line attempted over max size - leadership related <cloud-installer> <landscape> <mongodb> <juju-core:Fix Released by wallyworld> <juju-core 1.23:Fix Committed by wallyworld> <juju-core 1.24:Fix Released by wallyworld> <https://launchpad.net/bugs/1457645>
<mup> Bug # changed: 1376246, 1431372, 1454678, 1454891, 1457728, 1459057, 1459060, 1459250, 1459611, 1459616, 1459885, 1459912, 1461111, 1461342
<katco> ericsnow: wwitzel3: you 2 back yet?
<marcoceppi> why doesn't this command work in 1.24-beta5? juju get-env admin-secret
<marcoceppi> let me rephrase, how can I programmatically get the password/admin-secret for an environment from the command line without doing `grep "admin-secret" ~/.juju/environments/gce.jenv | awk '{ print $2 }'`
<rick_h_> marcoceppi: should you use the admin secret any more vs the password? can you programmatically get the password?
<alexisb> ericsnow, wwitzel3 ^^ any idea?
<marcoceppi> rick_h_: I tried password
<marcoceppi> it said that key does not exist
<rick_h_> marcoceppi: oh hmm, might mention that one to thumper
<marcoceppi> marco@galago:~$ juju environment get password
<marcoceppi> ERROR key "password" not found in "gce" environment.
<ericsnow> marcoceppi: is this a GCE-specific issue?
<marcoceppi> this is every environment
<ericsnow> k
<rick_h_> yea, would guess no, more a general thing
<marcoceppi> marco@galago:~$ juju environment get password
<marcoceppi> ERROR key "password" not found in "aws-west-2" environment.
<marcoceppi> also, environment doesn't accept an -e flag :(
<marcoceppi> oh, but get does
<marcoceppi> oops
<natefinch> hahaha
<ericsnow> ah, I don't know a ton about the new environment management stuff
<marcoceppi> well, get-env has been around for a while
<marcoceppi> probably the 1.18 days
<natefinch> we were just talking about the -e stuff in a bug report
<marcoceppi> which is an alias to environment get
<natefinch> marcoceppi: https://bugs.launchpad.net/juju-core/+bug/1461605
<mup> Bug #1461605: juju action commands require -e in the "wrong" place <actions> <improvement> <juju-core:Triaged> <https://launchpad.net/bugs/1461605>
<natefinch> marcoceppi: the comments on the bug go beyond just actions though
<natefinch> basically all the new "double command" commands "action do" "environment get"  require the -e after the second command
<marcoceppi> natefinch: cool, but I really just need to get the password programmatically now ;)
<natefinch> heh
 * marcoceppi goes back to just grep and awk though he's aware of some shake up around .jenvs
<marcoceppi> behold. grep "admin-secret" ~/.juju/environments/$(juju switch).jenv | awk '{ print $2 }'
<perrito666> my vim is all confused of having python in it again
<marcoceppi> getting this error when trying to build in golang
<marcoceppi> ../../../gopkg.in/juju/charm.v5/meta.go:19: import /home/marco/.go/pkg/linux_amd64/gopkg.in/juju/charm.v5/hooks.a: object is [linux amd64 go1.2.1 X:none] expected [linux amd64 go1.3.3 X:precisestack]
<marcoceppi> I have go 1.3.3 installed
<marcoceppi> not sure what I did or how to fix this
<gsamfira> marcoceppi: rm -rf $GOPATH/pkg
<gsamfira> and try again
<marcoceppi> gsamfira: I ended up blowing away charm.v5 and re-getting it
<marcoceppi> seemed to sort it, thanks
<marcoceppi> I'm trying to print a reference
<marcoceppi> because I want to know what it's about and I can't figure it out normally, "cannot use ref (type *charm.Reference) as type string in argument to fmt.Printf" so obviously Printf isn't my friend
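A minimal sketch of the fix for that compile error, with a stand-in type for *charm.Reference: Printf's first argument must be a format string, so print the reference through a verb like %v, or call its String method if the type provides one.

    package main

    import "fmt"

    // reference stands in for *charm.Reference.
    type reference struct{ name string }

    func (r *reference) String() string { return "cs:" + r.name }

    func main() {
        ref := &reference{name: "wordpress"}
        // fmt.Printf(ref)      // compile error: Printf wants a format string
        fmt.Printf("%v\n", ref) // prints via the Stringer: cs:wordpress
        fmt.Println(ref)        // same, with less ceremony
    }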
#juju-dev 2015-06-06
<benji> i/quit
<benji> pfft
#juju-dev 2015-06-07
<mup> Bug #1462418 changed: TestLogging fails <ci> <test-failure> <juju-core db-log:Triaged by menno.smits> <https://launchpad.net/bugs/1462418>
<mup> Bug #1462418 opened: TestLogging fails <ci> <test-failure> <juju-core db-log:Triaged by menno.smits> <https://launchpad.net/bugs/1462418>
<mup> Bug #1462418 changed: TestLogging fails <ci> <test-failure> <juju-core db-log:Triaged by menno.smits> <https://launchpad.net/bugs/1462418>
<menn0> thumper: tiny fix for the db-log test failure on windows: http://reviews.vapour.ws/r/1876/
#juju-dev 2016-06-06
<davecheney> it is strange
<davecheney> i've seen more pictures of HRH in NZ than I have in Au
<mup> Bug #1588636 changed: mgo: Panic: Test left sockets in a dirty state (PC=0x46257C) <panic> <unit-tests> <juju-core:Invalid> <https://launchpad.net/bugs/1588636>
<davecheney> spot the bug         if err != nil && err == mgo.ErrNotFound {
<davecheney>                 return errors.Trace(err)
<davecheney>         }
<natefinch> aside form the err != nil being extraneous?
<natefinch> errors.Cause?
<natefinch> if errors.Cause(err) == mgo.ErrNotFound { return errors.Trace(err) }
<davecheney> yeah, it's either
<davecheney> err != nil && err != mgo.ErrNotFound
<davecheney> or err == mgo.ErrNotFound
<davecheney> but the latter would mean ignoring any error other than mgo.ErrNotFound
<natefinch> well, yes.. it depends on the rest of the context of the code to figure out which one they meant
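[A sketch of the first variant davecheney describes, propagating everything except "not found", using errors.Cause and errors.Trace from the real github.com/juju/errors package; the helper name is made up for illustration:]

```go
package main

import (
	"fmt"

	"github.com/juju/errors"
	mgo "gopkg.in/mgo.v2"
)

// checkFound is a hypothetical helper: not-found is tolerated,
// every other error is annotated with a trace and propagated.
func checkFound(err error) error {
	if err != nil && errors.Cause(err) != mgo.ErrNotFound {
		return errors.Trace(err)
	}
	return nil
}

func main() {
	fmt.Println(checkFound(nil))                      // <nil>
	fmt.Println(checkFound(mgo.ErrNotFound))          // <nil>: tolerated
	fmt.Println(checkFound(errors.New("io timeout"))) // propagated with trace
}
```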
<natefinch> I kinda hate errors.Trace.  It just adds so much noise to the code.   If we had the stacktrace on the error, I don't think I'd ever use trace.
<davecheney> https://github.com/juju/juju/commit/93d52ed7
<davecheney> fix some log messages ... and add an entire state/presence package
<davecheney> no biggie
<davecheney> natefinch: that's why my errors package captures the full stacktrace at the point of creation
<davecheney> it has some drawbacks
<davecheney> but compared to errors.Trace turds everywhere
<davecheney> it's paid for itself
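[The chat doesn't name davecheney's package, but his github.com/pkg/errors behaves as he describes: the stack is recorded once, when the error is created, and printed on demand with %+v, so no Trace call is needed at every return site. A minimal sketch:]

```go
package main

import (
	"fmt"

	"github.com/pkg/errors"
)

func open() error {
	return errors.New("boom") // stacktrace captured here, at creation
}

func main() {
	err := open()
	fmt.Printf("%v\n", err)  // message only: "boom"
	fmt.Printf("%+v\n", err) // message plus the captured stacktrace
}
```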
<natefinch> davecheney: that commit is pretty funny... I presume the presence stuff was intended to be committed separately
<natefinch> davecheney: also, whoowee that is some old code
<davecheney> yeah, that line is older than some of your kids
<natefinch> davecheney: this is true.
<natefinch> wallyworld: I spent Friday looking at the logs, but I can't see any differences between the setaddress calls that work and those that fail. I'm printing out the entire list of operations... and the ones that work look identical to the ones that fail.. and the ones that fail, the ops don't change from attempt to attempt, as they should if things were actually changing.
<davecheney> natefinch: https://github.com/juju/juju/pull/5537 if you feel so inclined
<wallyworld> natefinch: maybe we need to log what is actually in the db
<natefinch> wallyworld: yeah, I had that in there and then must have deleted it while debugging why I wasn't getting any log output.  Sigh.
<natefinch> davecheney: wow, totally messed up error handling for 4 years...
<natefinch> admcleod_: i guess mgo doesn't often return errors other than not found
<natefinch> whoops
<natefinch> davecheney: that was for you, of course
<natefinch> stupid fingers
<natefinch> davecheney: I'd ask for a unit test to verify it, but... uh.. yeah.  LGTM
<davecheney> natefinch: didn't fail before : didn't fail afterwards :)
<natefinch> what I wouldn't give for a separate persistence layer
<davecheney> you had me at separate layer
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1589339
<mup> Bug #1589339: environs/manual: test failure if host does not have a valid reverse dns record <juju-core:New> <https://launchpad.net/bugs/1589339>
<davecheney> nice
<natefinch> lol
<davecheney> solution: hack /etc/hosts
<davecheney> test isolation, how does it work?
<mup> Bug #1589339 opened: environs/manual: test failure if host does not have a valid reverse dns record <juju-core:New> <https://launchpad.net/bugs/1589339>
<mup> Bug #1589339 changed: environs/manual: test failure if host does not have a valid reverse dns record <juju-core:New> <https://launchpad.net/bugs/1589339>
<natefinch> davecheney: wow, I just noticed from an offhand comment on golang-nuts that builds are now deterministic... that's amazing
<davecheney> natefinch: cool
<davecheney> i hope it lasts
<davecheney> googlers don't care about byte for byte comparisons
<davecheney> and there is no test for it
<natefinch> davecheney: gah, there should be a test for it
<natefinch> davecheney: it's SOOOO useful
<wallyworld> axw: here's part 1. next I can introduce a get-controller-config CLI and make get-model-config only show the model specific bits; we can also pass in config from cloud.yaml as controller config at bootstrap time
<wallyworld> http://reviews.vapour.ws/r/4983/
<axw> wallyworld: thanks, just saw. just from reading the summary: why in the "controllers" collection rather than the settings collection?
<wallyworld> axw: collection needs to be global
<wallyworld> so all models can always read the controller config
<axw> wallyworld: ok
<wallyworld> axw: i started out in settings and it all went to shit with different hosted models
<wallyworld> we now use that controllers collection for a few things it seems
<wallyworld> axw: once this gets a +1 or whatever, let me know where you're at so we can decide what bit i should do next - the CLI or controller config from clouds.yaml. also, i think we need to handle the case for things like resource-tags where the model values don't overwrite but merge
<axw> wallyworld: sure. merging config sounds like something we don't need to do right away
<wallyworld> yeah
<wallyworld> axw: i should have said in the PR, but a bunch of the green is model config stuff moved from state.go to its own modelconfig.go file
<axw> wallyworld: yep, thanks
<wallyworld> and i found some old code to delete also
<mup> Bug #1589350 opened: apiserver/provisioner: tests do not pass with go 1.7 beta 1 <juju-core:New> <https://launchpad.net/bugs/1589350>
<mup> Bug #1589350 changed: apiserver/provisioner: tests do not pass with go 1.7 beta 1 <juju-core:New> <https://launchpad.net/bugs/1589350>
<mup> Bug #1589351 opened: provider/azure: test failure during stress test <juju-core:New> <https://launchpad.net/bugs/1589351>
<davecheney> https://launchpad.net/bugs/1589353 what a shitshow
<mup> Bug #1589353: apiserver/annotations: test failure during setup <juju-core:New> <https://launchpad.net/bugs/1589353>
<mup> Bug #1534757 changed: Attempting to run charm before unit provisioned, 1.26 <2.0-count> <juju-release-support> <lxd> <juju-core:Expired> <https://launchpad.net/bugs/1534757>
<mup> Bug #1589353 opened: apiserver/annotations: test failure during setup <juju-core:New> <https://launchpad.net/bugs/1589353>
<davecheney> axw: wallyworld https://github.com/juju/testing/pull/101
<davecheney> here is a simple one
<wallyworld> ok
<wallyworld> davecheney: ty, nice to see that fixed
<davecheney> wallyworld: it's one of those bugs that doesn't stay fixed
<davecheney> wallyworld: i've been trying to repro https://bugs.launchpad.net/juju-core/+bug/1588574
<mup> Bug #1588574: Session already closed in state/presence <blocker> <ci> <intermittent-failure> <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1588574>
<davecheney> but all i've accomplished so far today is raising 4 extra bugs
<wallyworld> yeah, the presence stuff sucks
<wallyworld> axw: we'll need to get to the inheritance stuff pretty quickly to allow config from clouds.yaml to be handled. i guess we should keep controller only config like api port totally separate from anything that can be inherited
<axw> wallyworld: unless it's going to be a huge amount of rework (didn't look like it), I'd feel more confident reviewing them separately.
<wallyworld> ok
<anastasiamac> davecheney: did u try to remove omitempty as per my suggestion?
<davecheney> anastasiamac: no i did not i'm sorry
<davecheney> i'm still trying to reproduce the error locally
<davecheney> anastasiamac: why will changing the json part of a struct fix the bug ?
<anastasiamac> davecheney: it does not cause probs for me except at landing :D
<anastasiamac> davecheney: every attempt at landing came up with the error message u r chasing "session closed"
<anastasiamac> davecheney: possibly unrelated and was just this PR's luck  :-P
<davecheney> anastasiamac: yeah, i cannot reproduce the bug locally either
<davecheney> i'm trying to build an environment in ec2 that matches CI
<davecheney> but i keep hitting flakey tests
<anastasiamac> davecheney: \o/
<davecheney> i've raised 4 bugs already today and haven't managed to run all the tests successfully
<anastasiamac> davecheney: it's sad but even that makes a productive day :-P
<davecheney> if I fix all 4 bugs I raised, then I'm back to square one, so that's something
<anastasiamac> :D
<davecheney> spoiler alert: i'm probably not going to fix all four today
<anastasiamac> funny and here I was looking forward to watching it all being fixed today :D
<wallyworld> axw: bbiab, almost got a revised branch ready, but school pickup
<mup> Bug #1589372 opened: state: state test failure during stress test <juju-core:New> <https://launchpad.net/bugs/1589372>
<mup> Bug #1589385 opened: leftover eth0.cfg in /etc/network/interfaces.d <4010> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1589385>
<davecheney> mongodb is such a pile of crap
<davecheney> I _finally_ got the bug to trigger locally
<anastasiamac> davecheney: \o/ how?
<davecheney> anastasiamac: persistence
<davecheney> just need enough load
<davecheney> in the right way
<anastasiamac> well done! ;-P
<davecheney> https://github.com/juju/testing/pull/102
<wallyworld> axw: would you have time to look at http://reviews.vapour.ws/r/4972/ ? it needs to merge into the feature branch
<dimitern> voidspace, fwereade: standup?
<dimitern> dooferlad: ^^
<voidspace> dimitern: sorry, omw
<voidspace> got distracted
<dimitern> fwereade: how does the advice to return concrete errors (e.g. ErrFooFailed) align with "no global variables" ?
<dimitern> (as I suspect global consts are fine)
<davecheney> fwereade: https://github.com/juju/juju/pull/5543
<davecheney> if you have the time
<fwereade> dimitern, ha :) good point, possibly error types are safer? I have been bitten a couple of times by people assigning to stuff that's "obviously" a global "constant" variable
<fwereade> dimitern, but types don't quite feel like they pay for themselves?
<dimitern> fwereade: yeah, I suspect so
<dimitern> fwereade: it's nice to use if errors.Cause(err) == ErrMyFooFailed
<dimitern> fwereade: well, I guess if we define type SimpleError string and func (s SimpleError) Error() string { return string(s) }, we can define all such concrete errors as const ErrFooFailed SimpleError = "foo failed"
<dimitern> and get both the const guarantee and no global vars
<fwereade> davecheney, LGTM
<fwereade> dimitern, mm, I think I like that
<fwereade> dimitern, neat
<dimitern> fwereade: it even works: https://play.golang.org/p/jbCEra4_xi
<dimitern> (and it should be almost the same using Cause())
<dimitern> fwereade: re tests isolation, while treating tests and production code the same: given a ctor NewFooForSeries(series string) (Foo, error); and Foo being an interface with concrete (unexported) implementations per series
<dimitern> fwereade: how to properly isolate this for tests (assuming e.g. Foo.GetFoo() returns different results based on what's there on the local machine)? One option is to allow "dummy" as series and construct a no-op Foo, that has extra methods tests can use to set a pre-canned result from GetFoo()..
<fwereade> dimitern, I *think* that if they differ interestingly enough to be different types they deserve different ctors for direct testing? then you can have a wrapper around a map[series]ctor with shims as necessary and test that as much as it needs -- but it's much closer to being purely data-driven
<fwereade> dimitern, and then if NewXenialFoo returns an exported *XenialFoo it seems sane/good to check the underlying types that come out of a Foo factory
<dimitern> fwereade: I was thinking something like this might work: https://play.golang.org/p/bW-TI4l8yI
<fwereade> dimitern, or rather checking that some series return a *UpstartFoo and others a *SystemdFoo
<dimitern> fwereade: and the dummyRouteSource there is perfectly valid at run-time, but will be mostly used by tests; no special treatment
<dimitern> fwereade_: hey, did you get the last link?
<fwereade_> dimitern, I think I'd rather see a *WindowsRouteSource and a *LinuxRouteSource
<fwereade_> dimitern, don't think so
<fwereade_> dimitern, (both the above could be unit tested everywhere, I think?)
<dimitern> fwereade_: here: https://play.golang.org/p/bW-TI4l8yI
<fwereade_> ah I did see that one
<fwereade_> <fwereade> dimitern, or rather checking that some series return a *UpstartFoo and others a *SystemdFoo
<fwereade_> <fwereade> dimitern, is anything in there so platform-specific it needs build tags?
<dimitern> fwereade_: my point is NewRouteSource() will be called somewhere early as part of a lot of tests, and by default it can use the dummyRouteSource
<dimitern> fwereade_: it needs to use different sources (usually execute different tools and parse different outputs)
<fwereade_> dimitern, I was thinking I'd rather have an explicit NewFakeRootSource() for tests?
<dimitern> fwereade_: that's cleaner perhaps
<fwereade_> dimitern, and, yeah, but you don't want to hit the actual OS with those calls in unit tests
<dimitern> fwereade_: and the Fake one can implement e.g. SetRoutes() in addition to GetRoutes() from RouteSource
<fwereade_> dimitern, yeah, that sounds good, or you could just pass them in in the ctor
<fwereade_> dimitern, harder for people to induce races ;p
<dimitern> fwereade_: I really don't want to repeat earlier mistakes, e.g. patching the package tests for GetRoutes() to use faked ones, but providing no easy way to do the same outside the package
<fwereade_> dimitern, quite so, it feels to me like it's a foo/footest.NewRouteSource(...) situation
<dimitern> fwereade_: routes as Fake ctor args is nice! will do that
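[A sketch of the shape being agreed on here; all names (RouteSource, Route, NewFakeRouteSource) are hypothetical, but the key point is fwereade_'s: the fake takes its canned routes as constructor arguments, so tests can't race on a setter:]

```go
package routes

// RouteSource abstracts where routes are read from, so unit tests
// never shell out to the real OS tools (ip route / route print).
type RouteSource interface {
	GetRoutes() ([]Route, error)
}

// Route is a minimal placeholder for whatever the real type carries.
type Route struct {
	Destination, Gateway string
}

// NewFakeRouteSource returns a RouteSource pre-loaded with canned
// routes, usable by tests outside the package as well.
func NewFakeRouteSource(routes ...Route) RouteSource {
	return &fakeRouteSource{routes: routes}
}

type fakeRouteSource struct {
	routes []Route
}

func (f *fakeRouteSource) GetRoutes() ([]Route, error) {
	return f.routes, nil
}
```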
<fwereade_> dimitern, where do we need to thread the routes through to, by the way?
<dimitern> fwereade_: and having specific ctor's (NewWindowsRouteSource) rather than a switch-sorta-factory feels better - NewWindowsRouteSource can still be there but it's impl. can be in a +build windows - guarded file
<fwereade_> dimitern, why hide it away? just makes it easier to break when you're not running on windows
<dimitern> fwereade_: atm I need to implement and parse "get default route", and then use it as part of the container provisioning step
<fwereade_> dimitern, unless it's guarding actual syscalls I'd rather steer clear of +build where possible
<dimitern> fwereade_: fair point
<dimitern> fwereade_: it's more about guarding against the lack of a windows-specific tool
<fwereade_> dimitern, ok, but we don't want to *actually* call the tool in the unit tests, regardless of system
<dimitern> fwereade_: nope, only in the package tests
<dimitern> fwereade_: we'll call a patched executable for the tool
<dimitern> fwereade_: I was thinking of using natefinch's PatchExec that uses jujud as the binary to call
<dimitern> fwereade_: and outside of the package, the NewDummyRouteSource() will be used in all tests
<dimitern> fwereade_: how does that sound?
<fwereade_> dimitern, do we have to PatchExec or can we supply, e.g., a `func RunCommand(string, ...string) (notsurewhattoreturn, error)`
<dimitern> fwereade_: https://github.com/juju/testing/blob/master/cmd.go#L203
<dimitern> it's essentially what it does - patching exec.RunCommand
<dimitern> fwereade_: "patching" is incorrect here, as nothing is actually patched
<dimitern> fwereade_: AIUI we're using GetExecCommand when needed in the suite, rather than using PatchValue()
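[fwereade_'s alternative to PatchExec, sketched as a companion to the fake above: inject the exec step as a function value, so production wires in os/exec and tests supply canned output. Everything here is hypothetical illustration, not juju code:]

```go
package routes

import "os/exec"

// runCommandFunc is the injected exec step fwereade_ suggests.
type runCommandFunc func(name string, args ...string) (output string, err error)

// realRun wraps os/exec for production use.
func realRun(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).CombinedOutput()
	return string(out), err
}

type linuxRouteSource struct {
	run runCommandFunc
}

func (s *linuxRouteSource) defaultRoute() (string, error) {
	// Tests construct linuxRouteSource{run: fakeRun} and never hit the OS;
	// production uses linuxRouteSource{run: realRun}.
	return s.run("ip", "route", "show", "default")
}
```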
<axw> wallyworld: I gave that PR a LGTM on GitHub already
<wallyworld> axw: oh great ty. just about to push up a new revision of the other. found a bug where we were setting up cfg with controller uuid and then initialising state with something else
<axw> wallyworld: cool. I probably won't be able to review till the morning tho, sorry
<wallyworld> and then there were inconsistencies. may have just been tests
<wallyworld> that's fine
<fwereade_> dimitern, I think it *does* still depend on some magic patching, from having to embed that suite, and I think it smells a bit maybe? -- but it *does* still give you a real *exec.Cmd, which *is* pretty cool
<fwereade_> dimitern, so that sounds reasonable to me
<dimitern> fwereade_: ah, sorry - PatchValue can be used after all (e.g. environs/tools/build_test.go:TestGetVersionFromJujud), but isn't necessary (e.g. cmd/juju/commands/main_test.go:TestFirstRun2xFrom1xNotUbuntu - we can just run the patched command directly with CaptureOutput)
<dimitern> I quite like that second approach :)
<dimitern> fwereade_: ok, cheers
<fwereade_> dimitern, right, ok, but what's writing to os.Stderr/out in the first place? doesn't having to capture those imply broken isolation?
 * fwereade_ may well be missing something
<dimitern> fwereade_: I think it's still isolated, see: CaptureOutput creates both stderr and stdout in isolation
<dimitern> https://github.com/juju/testing/blob/master/cmd.go#L307
<fwereade_> dimitern, yeah, that's what's bothering me -- what is causing stuff to be written to os.Std* that we need to patch it out in :317?
<dimitern> fwereade_: exec.Cmd.Run() does
<fwereade_> dimitern, this feels like a sign that the (string, ...string) shortcut is maybe the source of the problem
<fwereade_> dimitern, only if you don't set the command up properly ;p
<fwereade_> dimitern, utils/exec doesn't seem that great for replacing either, though, what with RunParams actually being stateful
<dimitern> fwereade_: so you're saying rather than embedding PatchExecHelper, then using GetExecCommand() and CaptureOutput, use a ctor e.g. NewWindowsRouteCmd() that sets it up to run "route print", and NewLinuxRouteCmd() doing "ip route ..", and a NewFakeRouteCmd() taking args and output?
<dimitern> by "sets it up to run" I mean &exec.Cmd{Args: .., Path: .., Stdin: inBuf.Reader(), ..}
<fwereade_> dimitern, I don't *think* so? if we need the types we need the routes, I presume -- clients will just want to accept a `Routes`, right?
<dimitern> fwereade_: Routes like []RouteInfo argument to the OS-specific RouteSource ctor?
<fwereade_> dimitern, no, just a type encapsulating the routes, already created
 * dimitern steps out for ~30m
<frobware> dooferlad: ping
<dooferlad> frobware: pong
<frobware> dooferlad: for the reboot cloud-init stanzas - did that work for all of precise, trusty and xenial?
<dooferlad> frobware: IIRC, yes
<mup> Bug #1589471 opened: Mongo cannot resume transaction <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1589471>
<dooferlad> frobware: that branch that did the reboot is gone :-(
<frobware> dooferlad: do you recall the name of the branch?
<dooferlad> frobware: no
<frobware> dooferlad: I have you as a git remote
<dooferlad> frobware: git reflog to the rescue :-)
<frobware> dooferlad: http://pastebin.ubuntu.com/17060265/
<dooferlad> frobware: yep
<frobware> dooferlad: just shows how long we've been futzing with this
<dooferlad> frobware: no wonder we have a bit of a bunker mentality
<frobware> dooferlad: ping
<dooferlad> frobware: pong
<frobware> dooferlad: please could you capture the output (or perhaps lack of) for the bond/LACP issue and raise a bug.
<frobware> dooferlad: asking as you definitely have more NICs.... :)
<dooferlad> frobware: it simply hangs after the bridge script runs. Nothing more to say!
<babbageclunk> dooferlad, dimitern, frobware: what do maas 1.9 node-groups correspond to in maas 2?
<dimitern> babbageclunk: to rack controllers
<babbageclunk> dimitern: thanks
 * fwereade_ thinks he just barely resisted the urge to name a type *state.Less
<natefinch> fwereade_: nice
<alexisb> hey all, happy monday!
<perrito666> hey
<katco> alexisb: happy monday
<babbageclunk> dimitern: ping?
<dimitern> babbageclunk: pong
<babbageclunk> dimitern: can I pick your brains about this maas spaces demo stuff in the juju-sapphire hangout?
<dimitern> babbageclunk: yeah, ok - omw
<babbageclunk> dimitern: thx!
<katco> hey, does anyone know how to manually log into the controller's mongo instance now? the password in accounts.yaml doesn't seem to work
<perrito666> katco: if mongo is v3 you need to install mongodb-org ppa and pull the 3.x client
<perrito666> people at packaging are working to provide the client along with the tools soon
<katco> perrito666: i'm on beta9, so i assume that v3
<perrito666> so we don't have to
<perrito666> katco: it actually depends on your distro
<perrito666> xenial?
<katco> perrito666: the machine i have available atm is wily
<perrito666> your => your machine 0
<katco> perrito666: host machine
<katco> perrito666: bootstrapped controller is xenial though
<perrito666> then yes, mongo 3
<natefinch> controller is what matters
 * perrito666 is stuck waiting for the plumber and has no food, happy monday
<natefinch> mramm: thank you for your repro case on lxd
<natefinch> mramm: on that ip address setting bug
<katco> perrito666: ta. do you have a url for that ppa?
<perrito666> katco: https://docs.mongodb.com/v3.0/tutorial/install-mongodb-on-ubuntu/ sorry not a ppa, old style debline
<katco> perrito666: ah ok
<katco> perrito666: ta for your help
<perrito666> np, ask if you need anything else
<natefinch> my computer is not so happy about having a 13 machine LXD environment deployed :)
<natefinch> obv time for a bigger machine
<perrito666> natefinch: openstack bundle?
<natefinch> perrito666: just a couple random bundles, HA/big data stuff
<mramm> natefinch: yea, I just built a desktop machine with 64 gig ram because juju deploying so much stuff broke my (relatively powerful) laptop
<perrito666> yep, lots of ram and lots of ssd :)
<mramm> yep, two different SSD's on two different SATA channels, one for the system and one for the lxd filesystem
<natefinch> yep... my laptop is quad core i7 with 16GB of RAM... I'm glad they started offering 32GB on new XPS 15s.  But I keep thinking that I should really just get a desktop and keep this laptop just for sprints.  for the amount I spent on this laptop I could get a ridiculous desktop
<perrito666> go desktop, is the way
<perrito666> I have done that and use the laptop for whenever I cant work at home
<natefinch> plus it would be fun to build a desktop from scratch again... haven't done that in like a decade
<redir> brb reboot
<mup> Bug #1589385 changed: leftover eth0.cfg in /etc/network/interfaces.d <4010> <juju-core:New> <MAAS:Invalid> <https://launchpad.net/bugs/1589385>
<perrito666> bbl, errands
<bdx> I too had to throw together a more capable desktop for deploying lxd in my home lab
<bdx> behold her beauty
<bdx> http://imghub.org/image/UMUX
<bdx> :-) :-)
<bdx> http://imghub.org/image/UrGu
<bdx> one more, http://imghub.org/image/UxWY
<bdx> to be honest, I built her a few yrs ago ... she serves as my lxd lab now though
<natefinch> nice cooling rig
<mup> Bug #1589581 opened: Consistant basic use of debug-log between 1.25 and 2.0 <pyjuju:New> <https://launchpad.net/bugs/1589581>
<natefinch> how many NICs are in there? looks like you have 4 ethernet cords plugged into it?
<bdx> natefinch, yea .... things are arranged slightly different now, but I used to have to make up for not having 10G network infra
<natefinch> bdx: ha.  yeah, must be hard roughing it on single gigabit ethernet
<bdx> I would create iscsi extents and share them to different servers over different interfaces/networks to try to mitigate stomping the 1G
<bdx> lol
<bdx> now she has 10G, so I can share her fast zfs arrays to other servers more better
<bdx> I used to be heavily into the gpu modding game ... those 660's in there are modded to grid k2's
<bdx> they no longer exist tho
<bdx> she is gpu-less nowadays
<alexisb> cherylj, ping
<cherylj> alexisb: pong
<alexisb> heya cherylj
<alexisb> welcome back
<cherylj> thanks :)
<mup> Bug #1588911 changed: Juju does not support 2.0-beta9 <blocker> <ci> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1588911>
<mup> Bug #1589628 opened: Unable to bootstrap lxd with juju 2 because of x509 certificate error <juju-core:New> <https://launchpad.net/bugs/1589628>
<mup> Bug #1589635 opened: github.com/juju/juju/state fails on TestMachinePrincipalUnits with an unexpected name <juju-core:New> <https://launchpad.net/bugs/1589635>
<mup> Bug #1589628 changed: Unable to bootstrap lxd with juju 2 because of x509 certificate error <juju-core:New> <https://launchpad.net/bugs/1589628>
<mup> Bug #1589635 changed: github.com/juju/juju/state fails on TestMachinePrincipalUnits with an unexpected name <juju-core:New> <https://launchpad.net/bugs/1589635>
<mup> Bug #1588911 opened: Juju does not support 2.0-beta9 <blocker> <ci> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1588911>
<katco> perrito666: ping
<perrito666> katco: pong
<katco> perrito666: hey, are api calls currently carrying information about the user who's making it?
<perrito666> iirc, the info is in the connection only
<katco> perrito666: can you give me a jumping-off point?
<perrito666> katco: certainly, if I am guessing correctly what you are looking for, apiserver/client_auth_root.go is a great place to start, there you know the facade, method and the user all in the same place
<katco> perrito666: awesome, that sounds like a winner, ta
<perrito666> wow, my typo rate really worsens when my wrist is hurt
<perrito666> katco: np
<natefinch> anyone online know anything about the state.address struct?
<mup> Bug #1589641 opened: github.com/juju/juju/state fails on ActionSuite.TestUnitWatchActionNotifications <juju-core:Incomplete> <juju-core service-to-application:New> <https://launchpad.net/bugs/1589641>
<redir_lunch> natefinch: not I
<natefinch> voidspace: don't suppose you're working late today?
<mup> Bug #1589670 opened: backups does not implement Backups for non linux OSes <juju-core:New> <https://launchpad.net/bugs/1589670>
<thumper> fwereade_: ping?
<natefinch> alexisb: btw, we'll need to hand off this bug.  I'll be on later, but I'm not going to figure it out in the next half hour.  .
<alexisb> natefinch, did you provide an update in the bug?
<mup> Bug #1589680 opened: Upgrading to cloud-archive:mitaka breaks lxc creation <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1589680>
<natefinch> alexisb: yep.  Huge brain dump in there
<alexisb> thanks
<thumper> davecheney: https://bugs.launchpad.net/juju-core/+bug/1588575
<mup> Bug #1588575: allwatcher_internal_test has intermittent failure <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1588575>
<davecheney> thumper: thanks, no promises I can start on it today, I need to verify that i've found all the places where we stop but don't wait for the watcher to die
<mup> Bug #1576874 changed: restore-backup never completes <backup-restore> <blocker> <ci> <regression> <juju-core:Fix Released by hduran-8> <https://launchpad.net/bugs/1576874>
<perrito666> wallyworld: ffs, that keeps biting me
<wallyworld> easy fix at least
<perrito666> wallyworld: yeah, I keep forgetting we maintain all that thing in windows for no good reason
<alexisb> heya wallyworld, do you have a few minutes for me?
<wallyworld> suppose so :-)
<wallyworld> have standup in 5, meet in 1:1?
<wallyworld> alexisb: in hangout
<wallyworld> axw: anastasiamac: perrito666: be there real soon
<perrito666> k
<axw> looking for the end of my webcam USB in the dark, be there soon too...
<redir> axw: you coming back?
<wallyworld> perrito666: here tis https://github.com/juju/juju/pull/5547
<wallyworld> can you +1 and i'll land
<perrito666> bastard you got review 4994
<perrito666> ship it
<wallyworld> ty
#juju-dev 2016-06-07
<mup> Bug #1589736 opened: BootstrapSuite.TestBootstrapPrintClouds unequal fred and mary <blocker> <ci> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1589736>
<mup> Bug #1589372 changed: state: state test failure during stress test <juju-core:New> <https://launchpad.net/bugs/1589372>
<davecheney> that feel when cpu drops to 0% during tests and you know someone has used time.Sleep
<davecheney> ?       github.com/juju/juju/state/statetest    [no test files]
<davecheney> ok      github.com/juju/juju/state/storage      0.209s
<davecheney> ?       github.com/juju/juju/state/testing      [no test files]
<davecheney> why does the state set of package have _two_ testing helper packages ?!?!?
<natefinch> davecheney: just in case?
<davecheney> better add a third, that's the juju way
<natefinch> davecheney: belt and suspenders
<davecheney> and two pairs of underpants
<wallyworld> anastasiamac: could you +1 this for me? http://reviews.vapour.ws/r/4995/
<perrito666> thumper: hey, have a moment? I think I might be reading a test wrongly
<perrito666> or the test might be wrong
<thumper> perrito666: my head is in the middle of something else just now
<thumper> gimmie 15-20?
<perrito666> sure
<perrito666> ill ping back
<wallyworld> perrito666: could you +1 the above trival pr while you are waiting?
<perrito666> wallyworld: checking
<anastasiamac> wallyworld: lgtm... maybe we should have a test for it tho \o/
<wallyworld> thre is
<wallyworld> a test
<wallyworld> it failed
<wallyworld> hence the fix
<wallyworld> it failed sometimes
<perrito666> wallyworld: how much of an ass am I being if I say that your for loop defines i beforehand, but it could get it by unpacking the slice with range?
<wallyworld> damn yeah, will fix that was left over from previous code
<wallyworld> tnks
<perrito666> lgtm for the rest
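[perrito666's point in Go terms: rather than declaring the index before the loop, range unpacks it (and the element) from the slice directly. A trivial sketch, not wallyworld's actual code:]

```go
package main

import "fmt"

func main() {
	items := []string{"a", "b", "c"}
	// for i := 0; i < len(items); i++ { ... } declares i up front;
	// range yields the index and element together:
	for i, item := range items {
		fmt.Println(i, item)
	}
}
```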
<perrito666> thumper: ping?
<thumper> perrito666: actually, this will take longer :(
<thumper> sorry
<thumper> with natefinch on a hangout
<perrito666> lol, no worries
<thumper> bitching about networking
<perrito666> uh, bitching, I wish I was invited to that
<perrito666> I am a great complainer
<mup> Bug #1589748 opened: commands should prompt you to "juju login" if your password has expired <juju-core:Triaged> <https://launchpad.net/bugs/1589748>
<perrito666> mpf, is it possible that acl tests were broken?
 * perrito666 does not want to be right about this
<davecheney> thumper: still running stress tests of my patch
<davecheney> i hope to submit it this afternoon
<mup> Bug #1589774 opened: Ghost models exist after being destroyed <juju-core:New> <https://launchpad.net/bugs/1589774>
<davecheney> thumper: dagnabit, i think there is still one test case left
<davecheney> just had another "session closed" failure in my stress tests
<thumper> :(
<thumper> anyone else having trouble with the charmstore?
<natefinch> thumper: not I
<anastasiamac> axw_: PR for no controller/model on cli.. http://reviews.vapour.ws/r/4997/
<axw_> anastasiamac: thanks, will take a look a bit later on
<anastasiamac> axw_: no rush.. i will need to make sure that unit tests are aligned.. if there are any that are checking err cause and/or msgs
<mup> Bug # changed: 1466629, 1496143, 1522090, 1531886, 1534804
<natefinch> thumper: I'm off to bed.  Good luck with that bug.
<davecheney> thumper: http://reviews.vapour.ws/r/4990/
<davecheney> Patch set 4 should be the good one
<davecheney> I've been stress testing it for an hour and I think I've found all the places we were leaking watchers
<menn0> thumper: charm migration done: http://reviews.vapour.ws/r/4998/
<redir> wallyworld: axw_ either of you around?
<axw_> redir: yup
<wallyworld> maybe :-)
<redir> got a minute to look at something?
<wallyworld> an offer i can't refuse
<redir> tanzanite?
<wallyworld> sure
<redir> night
<redir> and thanks wallyworld axw_
<wallyworld> see you later alligator
<davecheney> the state/presence tests run in 5 seconds on a variety of machines
<davecheney> someone's put a sleep in there
<wallyworld> axw_: once you land your current work, and i land the initial controller config branch, i have another ready to propose (after resolving any conflicts) which does the shared config thing from bootstrap. i also realised the current PR does correctly reject UpdateModelConfig attrs that are controller ones, so I'll remove that unnecessary TODO in this next branch
<axw_> wallyworld: ok, but I'm not really comfortable with shared model config anymore. I think it may need more thorough design and discussion
<axw_> (see my comments in review)
<wallyworld> ok, looking
<axw_> wallyworld: I'm not really convinced that it's that worthwhile either. how often will or should a controller admin be enforcing shared config for all models?
<wallyworld> a lot in maas
<wallyworld> apt mirror for example
<wallyworld> mainly for private cloud case
<wallyworld> or tools url etc
<axw_> wallyworld: so then I think it might make more sense as cloud config, rather than controller-wide
<axw_> wallyworld: tools url will be going away
<axw_> wallyworld: either way, I think it needs  a bit more thought before we go changing something so fundamental
<wallyworld> it can be made cloud specific fairly easily
<wallyworld> either way, the PR to pull out controller config should be good
<axw_> wallyworld: absolutely 100% agreed
<axw_> wallyworld: just finishing up my branch now, it's going to be big...
<wallyworld> tis ok
<wallyworld> i got to do school pickup, bbiab
<wallyworld> axw_: i think shared cloud (not controller) config is worth pursuing. it will work now with one cloud per controller, and later too. it will be easy to add a global clouds collection and store the settings docs on that, keyed on cloud name. the vast majority of the other code in the branch remains the same. this then allows maas specific apt mirrors etc to be easily set up across hosted models, based on what's in clouds.yaml
<axw_> wallyworld: sure, just so long as we can do it safely
<axw_> wallyworld: and without confusing semantics around updating/removing config
<wallyworld> axw_: yeah, i guess it depends on how confusing is defined. we can certainly warn if a user deletes a model attr that is also shared
<wallyworld> and tell them that the shared value will now be in use
<axw_> wallyworld: can you please write down what you think the end solution should look like, in terms of user commands, and we can discuss at the tech board again
<axw_> I don't think we really covered the inheritance side of things well before
<axw_> and it was just you me and william
<wallyworld> ok. that bit doesn't necessarily need to land before beta9 as it won't affect upgradability
<axw_> wallyworld: yep, +1
<wallyworld> i'm very tightly focused on beta9 today and tomorrow, sadly
<axw_> wallyworld: gotta go get charlotte in a moment, will then do a live test and propose my branch
<wallyworld> ok, i'll look either before or after soccer
<axw_> wallyworld: then there's a tonne of other stuff to do as follow ups :/
<wallyworld> axw_: and my PR needs another look too when you get done proposing
<wallyworld> yep
<axw_> okey dokey
<wallyworld> so long as what we land is upgradable
<axw_> wallyworld: food for thought: I wonder if things like apt-mirror would be better suited as being cloud-specific, and you *cannot* set them at the model level
<axw_> wallyworld: then there would be one place for each thing
<wallyworld> axw_: maybe, but what if i want *my* model to use something else
<axw_> and I would be much happier at least ;p
<axw_> wallyworld: why would you?
<wallyworld> eg i set up my own mirror for testing or whatever
<wallyworld> or to get my own packages
<axw_> wallyworld: if it's just for testing, set up a test cloud?
<axw_> (you're asking legitimate questions, I'm just wondering if we can/should go down that route)
<wallyworld> that seems unwieldy just to use a different setting to one that's shared
<axw_> eep, gtg
<wallyworld> ttyl
<axw_> wallyworld: it's more work, yes, but I expect that's a very uncommon thing to do (setting apt-mirror on a per-model basis). if we have one and only one place for each config attribute, then we can have very clear semantics for updating/removing/etc.
<axw_> wallyworld: BTW I think we want to add identity-url and identity-public-key to the controller-specific config?
<wallyworld> axw_: we probably should, i was just going by what was currently ordained as controller specific, seems like those ones were missing
<wallyworld> axw_: although, unless those are set at bootstrap, there would be no way of setting them after at the moment
<axw_> do we error in validation?
<wallyworld> i can ping uros about how he sets that stuff up
<wallyworld> yes
<wallyworld> if you try and set a controller attr via set-model-config it will error
<axw_> wallyworld: ok. even still, I think we should add it to the list in case we need to change that behaviour
<wallyworld> so if i add those, it just means they need to be set at bootstrap and are then invariant
<wallyworld> for now
<axw_> wallyworld: that's the same as controller-uuid, ca-cert, etc.
<wallyworld> yep
<wallyworld> i just don't know the workflow for setting those
<wallyworld> i'll check with uros
<axw_> okey dokey
<wallyworld> if they set them after bootstrap, then boom
<wallyworld> i do disagree with you about the benefit of shared config, so we'll see what others think
<axw_> wallyworld: sure, I've added my 2c to the thread
<wallyworld> axw_: whether we go for inheritance or not, the bootstrap code in the wip branch will still work - we'd just count stuff in clouds.yaml as cloud config
<wallyworld> all the serialisation etc is there now, and the backend storage
<wallyworld> just need to remove the inheritance bit if that's what we decide
<wallyworld> so i should be able to get all this landed tomorrow
<mup> Bug #1589736 changed: BootstrapSuite.TestBootstrapPrintClouds unequal fred and mary <blocker> <ci> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1589736>
<voidspace> hey all
<frobware> dimitern: ping, 1:1?
<dimitern> frobware: sorry, omw
<dimitern> voidspace: o/
<axw_> wallyworld: sorry for the delay, LGTM
<wallyworld> axw_: nw, tyvm. am landing the feature branch before i head off to soccer
<wallyworld> will have to jfdi it
<axw_> wallyworld: finally, http://reviews.vapour.ws/r/5000/
<axw_> as you'll see from the description, still lots TODO
<wallyworld> axw_: ty, i'll have to look after soccer now
<axw_> wallyworld: no worries
<axw_> wallyworld: I'll be tackling the references next I think, to avoid upgrade steps
<wallyworld> +1
<axw_> and then removing cloud endpoints/creds/etc. from model config
<axw_> probably after your branch lands
<dimitern> dooferlad: ping
<dooferlad> dimitern: hi
<dimitern> dooferlad: so I tried testing my patch against LACP bonds on the NUCs ... and failed miserably
<dimitern> dooferlad: morning :)
<dooferlad> dimitern: I am not entirely surprised.
<dimitern> dooferlad: can I ask you to try it on your hw setup?
<dimitern> dooferlad: it's here http://reviews.vapour.ws/r/4959/
<dooferlad> dimitern: sure, but can we discuss it with Andy and see how it meshes with our plan for the iproute2 / rebooting / ifupdown in the standup?
<dimitern> dooferlad: sure, ok
<dooferlad> dimitern: frobware voidspace hangout time!
<mup> Bug #1589890 opened: juju2 azure fail with error 409 network conflict <cpe-sa> <juju-core:New> <https://launchpad.net/bugs/1589890>
<hoenir> https://bugs.launchpad.net/juju-core/+bug/1588143
<mup> Bug #1588143: cmd/juju/controller: send on a closed channel panic <blocker> <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1588143>
<hoenir> is it because we have a defer inside a defer stmt?
<hoenir> because defer saves the state of the program and it mangles things up?
<hoenir> why not delete the defer inside that defer and put the estate.mu.Unlock() before the <-OpDestroy{//code}
<hoenir> ?
<hoenir> https://paste.ubuntu.com/17085985/
<hoenir> so why not like this.. anyway only one goroutine will access the channel, because of the estate.mu.Lock(), and it unlocks afterwards.
<hoenir> thoughts on this?
<hoenir> anyone?
<hoenir> and I'm referring to this bug https://bugs.launchpad.net/juju-core/+bug/1588143
<mup> Bug #1588143: cmd/juju/controller: send on a closed channel panic <blocker> <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1588143>
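[For context, a minimal illustration of the bug class in that report: a send racing a close panics with "send on a closed channel". One conventional fix, in the spirit of hoenir's suggestion, guards both operations with the same mutex plus a closed flag; the real juju code differs in detail, and all names here are made up:]

```go
package main

import (
	"fmt"
	"sync"
)

type events struct {
	mu     sync.Mutex
	ch     chan string
	closed bool
}

func (e *events) send(op string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if !e.closed {
		e.ch <- op // without the flag, this can panic after close
	}
}

func (e *events) close() {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.closed = true
	close(e.ch)
}

func main() {
	e := &events{ch: make(chan string, 1)}
	e.send("OpDestroy")
	e.close()
	e.send("late op")   // safely dropped instead of panicking
	fmt.Println(<-e.ch) // drains the buffered "OpDestroy"
}
```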
<voidspace> back
<voidspace> man a long wait at the hospital :-(
<dimitern> voidspace: o/
<voidspace> dimitern: hi
<dimitern> voidspace: everything went ok?
<voidspace> dimitern: yeah, routine tests for my wife - just checking something out
<voidspace> dimitern: mostly her being paranoid I think
<dimitern> voidspace: I see, ok
<dimitern> voidspace: btw I'd appreciate a second review on http://reviews.vapour.ws/r/4959/
<voidspace> dimitern: looking
<dimitern> voidspace: ta!
<anastasiamac> voidspace: wives are never paranoid \o/
<voidspace> dimitern: why are you removing all the pre-up (etc) parts of /e/n/i ?
<voidspace> dimitern: why were they needed and why are they no longer needed?
<dimitern> voidspace: they aren't necessary anymore
<voidspace> dimitern: why were they needed and why are they no longer needed?
<voidspace> I'd like to understand if you don't mind
<dimitern> voidspace: they were there to work around known issues when trying to ifup multiple static interfaces
<voidspace> dimitern: and how do we work around it now?
<mup> Bug #1589890 changed: juju2 azure fail with error 409 network conflict <cpe-sa> <juju-core:New> <https://launchpad.net/bugs/1589890>
<dimitern> voidspace: well, we don't :) all my tests on trusty and xenial in the past weeks confirm the initial boot slowdown is gone with simple, statically configured interfaces
<voidspace> dimitern: ok, have you talked it through with dooferlad?
<voidspace> dimitern: he added that stuff IIRC
<dimitern> voidspace: we did talk; but those pre-up and etc. steps were my doing, which I'm glad to drop actually :)
<voidspace> dimitern: ah right, ok - my mistake
<voidspace> LGTM then
<dimitern> voidspace: tyvm!
<dimitern> dooferlad: any update on testing with bonds btw?
<dooferlad> dimitern: was having lunch after my second stand up of the day. On it now.
<dimitern> dooferlad: ok, np - just checking
<dooferlad> dimitern: why bridge_maxwait 0?
<dooferlad> surely we want the bridge to enter forwarding mode before we continue?
<dimitern> dooferlad: that's the intent
<dooferlad> dimitern: http://manpages.ubuntu.com/manpages/precise/man5/bridge-utils-interfaces.5.html says it won't wait for the bridge to enter forwarding mode
<dimitern> dooferlad: otherwise maxwait is 32s by default (although in most of my tests it's a lot shorter in reality)
<dooferlad> dimitern: yea, waiting is good.
<dimitern> dooferlad: with multiple bridges it gets very slow very quickly
<dooferlad> dimitern: if you want that change locally when you are iterating on something that is one thing, but landing it in production code seems wrong.
<dimitern> dooferlad: how about a compromise?
<dimitern> dooferlad: e.g. 5s
<dooferlad> dimitern: I am not comfortable with that either. I assume that 32s was chosen for a reason. If we want to change it we need a better reason.
<dimitern> dooferlad: I can compare the boot times with different values of maxwait, but not specifying it is bad pretty much every time you have >1 br
<dooferlad> dimitern: why? Shouldn't the bridge come up and boot continue? It doesn't always wait 32s right?
<dimitern> dooferlad: what's there to wait for? the port was up and running just before the script was run
<dooferlad> dimitern: if that is the case, why does it not just continue anyway?
<dimitern> dooferlad: it does come up, eventually, but with 7 VLANs => 7 bridges, it can take more than 75s for some bridges
<dooferlad> dimitern: that really doesn't seem bad to me
<dooferlad> dimitern: and, as I said, if we are going to change a default we need to justify it
<dooferlad> dimitern: I would assume that the default is very widely tested and setting it to 0 isn't.
<dimitern> dooferlad: ok, fair enough (will still test with the default maxwait to compare); not having addresses on both nics and bridges seems to be the most important part for reliability, along with stp
<dooferlad> dimitern: Agreed. I just don't want any surprises :-|
<dooferlad> dimitern: though I would love to have it as 0 if it didn't make any difference other than booting faster. Perhaps when we aren't trying to get a release out and we can throw some CI resources at it!
<dimitern> dooferlad: +1
<dooferlad> dimitern: so what do you want me to run as a test? Just see if a bonded interface still works on boot?
<dooferlad> dimitern: then stick a lxc on a machine and check it works/
<dooferlad> ?
<dimitern> dooferlad: I'd suggest - bootstrap on bonded dual-nic with no vlans first
<dimitern> dooferlad: then tear it down, add a couple of VLANs on the bond and bootstrap + add a couple of LXDs to machine 0 (switch controller first)
<dimitern> dooferlad: and in both cases before tearing down, try rebooting and see if it still works
<dimitern> if you can ssh into the node, you should be able to also ssh into the containers
<dimitern> dooferlad: and I'd at least check 'ping google.com' and 'ip r' from inside the container
<dimitern> that should cover most cases
<dimitern> (common ones)
<dooferlad> dimitern: about stp - it was disabled for security reasons according to http://manpages.ubuntu.com/manpages/xenial/man5/bridge-utils-interfaces.5.html so do we have a good reason for turning it on?
<dooferlad> dimitern: is that another boot time thing?
<dimitern> dooferlad: that sounds terribly handwavy
<dimitern> dooferlad: what security concerns?
<dooferlad> dimitern: that is what it said in the man page.
<dooferlad> dimitern: I would like to know more too.
<dimitern> dooferlad: yeah, so far I haven't seen references to such issues
<dimitern> dooferlad: but I did notice improved UX and stability with STP on (I was having terrible broadcast storms with incorrectly configured switches)
<dooferlad> dimitern: http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge_stp
<dooferlad> dimitern: looks like a win on trusted networks
<dooferlad> dimitern: were you having any problems on correctly configured switches?
<dimitern> dooferlad: I see, ok - makes sense and our networks are usually trusted
<dimitern> dooferlad: nope
<dimitern> dooferlad: it might be a problem with certain setups (on a shared substrate)
<dimitern> (you know where..)
<dooferlad> dimitern: so... is there any advantage really? I don't care about poorly configured networks. That is somebody else's problem.
<dimitern> dooferlad: the issue is not at all apparent when it happens
<dimitern> dooferlad: i.e. you might have initial connectivity, which suddenly drops due to e.g. arping an unknown IP
<dooferlad> dimitern: that isn't STP fixing your network - that is broken.
<dimitern> dooferlad: it does fix it though
<dooferlad> dimitern: and Juju deploys servers that are public, i.e. HTTP[S]
<dimitern> dooferlad: by blocking the loops that otherwise happen
<dooferlad> dimitern: I think it really should be something that users can turn on, but shouldn't be on by default.
<dimitern> dooferlad: why?
<dimitern> dooferlad: it affects MAAS setups a lot more than SDNy clouds
<dooferlad> dimitern: turning it on turns on a security hole.
<dooferlad> dimitern: see the linuxfoundation.org page
<dooferlad> The Spanning Tree Protocol has no authentication; all participants are assumed to be trustworthy and correct. This assumption is not true if bridging between a hostile environment like the Internet and a private network. For this reason, STP is turned off by default on the recent versions of Linux.
<dimitern> dooferlad: how does that apply here? we're not bridging hostile networks
<dimitern> dooferlad: all of them are managed by maas
<dooferlad> dimitern: a Juju deployed node could have a public interface
<dimitern> dooferlad: on MAAS?
<dooferlad> dimitern: why not?
<dimitern> dooferlad: not the typical case
<dooferlad> dimitern: a MAAS network could have some machines in a DMZ that are just open to the net. There is no reason why not.
<dimitern> dooferlad: I'm not saying it's impossible, but your maas is likely firewalled deep inside your network
<dooferlad> dimitern: I don't really care about typical for MAAS. I care about changing a default that is that way for a good reason. Doing something unexpected that can result in security problems is bad.
<dooferlad> dimitern: I think it is perfectly reasonable to schedule work to turn STP on if a user asks for it. I think it being a space or subnet wide setting would be a good fit.
<dimitern> dooferlad: pretty much every guide on networking I've read strongly recommends enabling STP especially in complex setups like with MAAS where it's quite easy to have multiple redundant links across
<dooferlad> dimitern: fine, so lets do the right thing please? Not just turn it on everywhere and hope we haven't screwed someone.
<dimitern> dooferlad: oh alright
<dimitern> dooferlad: in the interest of getting something else done, sure
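[For reference, the shape of the stanza under discussion, with the two contested options left at their defaults. bridge_ports, bridge_stp, and bridge_maxwait are real bridge-utils-interfaces(5) options; the interface names and addresses are made up:]

```
auto br-eth0
iface br-eth0 inet static
    address 10.0.0.2
    netmask 255.255.255.0
    gateway 10.0.0.1
    bridge_ports eth0
    bridge_stp off      # the upstream default, per the man page cited above
    bridge_maxwait 32   # default: wait up to 32s for forwarding state
```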
<mup> Bug #1589890 opened: juju2 azure fail with error 409 network conflict <cpe-sa> <juju-core:New> <https://launchpad.net/bugs/1589890>
<dimitern> dooferlad: any issues btw?
<dooferlad> dimitern: sorry, I thought you were updating your patch.
<dimitern> dooferlad: oh, sorry I wasn't sure it was needed for that test
<dooferlad> dimitern: since all your bridge script changes are going away I don't think a bond will make any difference for what it's worth.
<dimitern> dooferlad: but ok, will push an update dropping stp on and maxwait 0
<dooferlad> dimitern: unless I missed something!
<dimitern> dooferlad: no, not all
<dimitern> dooferlad: only the extra bridge settings
<dooferlad> dimitern: in that case I would get that change landed if it works for you. The bond won't make any difference.
<dimitern> dooferlad: without testing it at least once?
<dooferlad> dimitern: 'if it works for you' implied you could at least do a little test :-)
<dimitern> dooferlad: well, not with bonds :)
<dooferlad> dimitern: yea, it won't care
<dooferlad> dimitern: it is a tube with bits going in and out of it :-)
<dimitern> dooferlad: but otherwise I've done enough tests to be reasonably certain it works well
<dooferlad> dimitern: in the container, why is eth0 special?
<dimitern> dooferlad: the thing is, I didn't change how we bridge bond slaves (e.g. removing duplicated IPs from there and leaving them on the bridge)
<dooferlad> dimitern: MAAS only gives an address to one interface IIRC
<dooferlad> dimitern: from the containers POV (which this change is about) it is just a connection to a network.
<dimitern> dooferlad: it's not special, just the first one is connected to the bridge which took over the NIC on the default route
<dooferlad> dimitern: so eth0 is always connected to <dev?> that has the default route on it?
<dimitern> dooferlad: true, having more than 1 gateway rendered in /e/n/i doesn't work
<dooferlad> dimitern: I was more thinking about eth1 having the default route on the host.
<frobware> dooferlad, dimitern: I just asked larry for more details regarding this ^^
<dimitern> dooferlad: "eth0" can be connected to e.g. br-eth3 on the hosty
<dimitern> dooferlad: the names don't have to match
<frobware> dooferlad, dimitern: I wanted to clarify his setup so that we could reproduce and confirm the bug fixes that
<dimitern> frobware: ok, it's good to know that
<dooferlad> dimitern: I would be tempted to copy /etc/network/interfaces from the host to the guest and change IP and MAC addresses to match the container config. If guest-ethx is always connected to host-br-ethx that would work.
<dimitern> frobware: have you tried rebooting a machine with deployed lxd containers?
<frobware> dimitern: that sounds ominous
<dimitern> frobware: I've just discovered none of the lxds come up
<frobware> oh la la
<dimitern> frobware: they seem to be not set to auto-start on boot, as starting them manually otherwise works fine
<frobware> dimitern: lxc config show <container> -- should show autostart state
 * dooferlad goes to get a cup of tea and spend a few minutes not thinking about routes.
<dimitern> frobware: it does say `user.boot.autostart: "true"`
<dimitern> hmm.. looking at the logs
<frobware> dimitern: what's the state of your patch w.r.t. backing out bridge script changes? I see stuff in there for the stp and explicitly matching "vlan_id"
<babbageclunk> voidspace: ping?
<voidspace> babbageclunk: pong
<babbageclunk> voidspace: if a juju deploy failed because the image wasn't available in my maas, how can I ask it to try again now that the image is available?
<dimitern> frobware: as discussed with dooferlad, I'm dropping the added stp on and maxwait 0 options
<voidspace> babbageclunk: just try again with the deploy command?
<babbageclunk> voidspace: I tried retry-provisioning, but it doesn't seem to have done anything.
<voidspace> babbageclunk: I'm assuming you tried that and it didn't work
<dimitern> frobware: and the 2 rm calls for eth0.cfg and 50-cloud-init..
<voidspace> babbageclunk: destroy-service (or application) first?
<babbageclunk> voidspace: ok, just destroy the applications and the deploy again?
<dimitern> frobware: and I think we should land the rest
<voidspace> babbageclunk: I would *expect* that to work
<dimitern> frobware: `error: open /var/lib/lxd/containers: no such file or directory` << in /v/l/syslog on one of the hosts..
<babbageclunk> voidspace: sweet, seems like it.
<natefinch> alexisb: just verifying... is the lxc to lxd work higher priority than that ipaddress borked bug? https://bugs.launchpad.net/juju-core/+bug/1537585   Ian gave me the impression I should work on the lxc to lxd stuff instead of that bug.
<mup> Bug #1537585: machine agent failed to register IP addresses, borks agent <2.0-count> <blocker> <cdo-qa-blocker> <landscape> <network> <juju-core:In Progress by natefinch> <juju-core 1.25:In Progress by natefinch> <https://launchpad.net/bugs/1537585>
<dimitern> frobware: ping
<mup> Bug #1590045 opened: Uniter could not recover from failed juju run <juju-core:New> <https://launchpad.net/bugs/1590045>
<frobware> dimitern: pong
<dimitern> frobware: I've updated the PR as agreed and set it to merge
<dimitern> (it's not picked up yet for some reason though)
<frobware> dimitern: oh. because I wanted to talk about matching on vlan_id
<dimitern> frobware: there's no such setting 'vlan_id'
<frobware> dimitern: so vlan_id is no longer propagated to the bridge device, correct?
<dimitern> frobware: see vlan-interfaces(5)
<dimitern> frobware: correct - it shouldn't have been there to begin with
<dimitern> frobware: only vlan-raw-device is needed
<mup> Bug #1590045 changed: Uniter could not recover from failed juju run <juju-core:New> <https://launchpad.net/bugs/1590045>
<frobware> dimitern: ok, makes sense.
<frobware> dimitern: so, switching topics... LXD reboots...
<dimitern> frobware: yeah
<frobware> dimitern: let's HO - typing takes too long
<dimitern> frobware: ok
<dimitern> frobware: I'm in our 1:1
<voidspace> hah, starting 13 lxd instances is slowing my machine down a bit
<alexisb> natefinch, sorry was otp and missed your ping
<mup> Bug #1590045 opened: Uniter could not recover from failed juju run <juju-core:New> <https://launchpad.net/bugs/1590045>
<mup> Bug #1590065 opened: container/lxd: One rename too far -> "application", "restart", "lxd-bridge" <blocker> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1590065>
<redir> is txn.Op.ID mapped to a mongo collection _id?
<redir> ericsnow natefinch katco ^
<ericsnow> redir: it is always manually set, though always to the "_id" field :)
<redir> sorry that should be txn.Op.Id
<redir> mmm I guess I don't understand ModelUsers yet.
 * redir reads more
<redir> and tx ericsnow
<ericsnow> redir: np
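[To make ericsnow's answer concrete: in gopkg.in/mgo.v2/txn, an Op's Id is the _id of the document in collection C, and the caller always supplies it. A hedged sketch with made-up collection and field names:]

```go
package sketch

import (
	"gopkg.in/mgo.v2/bson"
	"gopkg.in/mgo.v2/txn"
)

// renameOp builds a transaction operation against a hypothetical
// "modelusers" collection; Id becomes the document's "_id".
func renameOp(username, newName string) txn.Op {
	return txn.Op{
		C:      "modelusers",
		Id:     username, // manually set, maps to _id
		Assert: txn.DocExists,
		Update: bson.D{{"$set", bson.D{{"displayname", newName}}}},
	}
}
```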
<mup> Bug #1590095 opened: Model name validation error doesn't specify model name <juju-core:New> <https://launchpad.net/bugs/1590095>
<cmars> now that service-to-application is landed, this unblocks a dependency update. can i get a review of http://reviews.vapour.ws/r/5001/ ?
<redir> cmars: done, but you'll need someone else to bless.
<cmars> redir, thanks
<redir> heh
<redir> is there a way to turn down logging on test failure? DEBUG=0 or -q or something?
<mup> Bug #1590119 opened: add-credential for an openstack cloud asks for domain-name and bootstrap errors with unknown config field <cdo-qa> <juju-core:New> <https://launchpad.net/bugs/1590119>
<thumper> natefinch: hey
<natefinch> thumper: yo
<thumper> natefinch: as you can see from the email, there is some screwiness going around with mgo
<thumper> according to what I dumped out, the assertions are met
<thumper> but they fail
<thumper> I want to talk with menno about this when he gets on
<natefinch> thumper: yep.... good luck... hope you can figure it out.
<thumper> well, I hope menno can work it out, really I've got nothing
<thumper> I think we are going to have to sprinkle mgo with extra logging...
<natefinch> good times
<alexisb> thumper just so you know I asked natefinch to stay focused on lxd updates
<alexisb> given the state of that bug
<mramm> thumper: sounds like a lot of fun ;)
<thumper> alexisb: ok
<mramm> cmars: you have a min to talk about using omnibus for metering instance usage data in openstack for China Mobile?
<mramm> had a very quick chat with Carlos about metering in openstack/ceilometer and I think we have a better system....
<mup> Bug #1590119 changed: add-credential for an openstack cloud asks for domain-name and bootstrap errors with unknown config field <cdo-qa> <juju-core:New> <https://launchpad.net/bugs/1590119>
<mup> Bug #1590143 opened: deploying to cluster for vsphere as provider <oil> <vsphere> <juju-core:New> <https://launchpad.net/bugs/1590143>
<ericsnow> katco: multiple reviewers are required now, no?
<katco> ericsnow: actually, i think that remained undetermined. alexisb, was a decision ever made?
<alexisb> ericsnow, no not at this stage, we do have a review process update planned but it is not rolled out yet
<ericsnow> alexisb: k
<cmars> wallyworld, can i get a review of http://reviews.vapour.ws/r/5001/ ? post-renaming romulus update..
<alexisb> thumper, do you have five minutes to catch up with me before the release call?
<thumper> yep
<alexisb> cmars, wallyworld is out for a bit
<alexisb> thumper, can you join the release standup
<cmars> alexisb, thanks. anyone available for a short review? http://reviews.vapour.ws/r/5001/
<mup> Bug #1590161 opened: apiserver/client: panic: Session already closed <juju-core:New> <https://launchpad.net/bugs/1590161>
<wallyworld> cmars: lgtm, tv
<wallyworld> ty
<wallyworld> katco: ericsnow: did you have a few minutes to chat about dtag?
<ericsnow> wallyworld: sure
<wallyworld> https://hangouts.google.com/hangouts/_/canonical.com/tanzanite-stand
<mup> Bug #1590172 opened: ERROR cmd supercommand.go:448 autorest:WithErrorUnlessStatusCode POST https://login.microsoftonline.com/fb30bf07-xxxx-xxxx-xxxx-02ef08680fb9/oauth2/token?api-version=1.0 failed with 400 Bad Request <juju-core:New> <https://launchpad.net/bugs/1590172>
<katco> wallyworld: sorry just saw this. sure
<redir> where does juju log?
<redir> ~/.local/share?
<redir> duh, nm
<bogdanteleaga> what's the best supported maas on master?
<anastasiamac> bogdanteleaga: both maas 2 and 1.9 should work... preference is most likely 2 \o/
<bogdanteleaga> anastasiamac, cheers
<wallyworld> thumper: did you have 5 minutes for a question?
<wallyworld> or 2 minutes
<thumper> not just now
<thumper> otp with menn0
<wallyworld> ok
<thumper> dealing with this shitty mgo issue
<wallyworld> yay, ping when free
<wallyworld> should be quick
<perrito666> wallyworld: dont forget about me after redir :)
<wallyworld> i won't
#juju-dev 2016-06-08
<wallyworld> perrito666: in standup now
<perrito666> going
<cmars> can i make LP:#1585005 a blocker so i can land a fix for it? :)
<mup> Bug #1585005: list-* commands should be aliases for what they're listing <usability> <juju-core:Fix Committed by macgreagoir> <https://launchpad.net/bugs/1585005>
<thumper> alexisb: ping
<thumper> alexisb: we know what the problem is for the network address issue
<thumper> we know why it fails
<thumper> we know where the bad data gets in
<thumper> we don't know WHY the bad data gets in
<thumper> that is what we are looking at now
<thumper> we know how to avoid the problem, but we are trying to work out why the problem is happening
<alexisb> thumper, ok, please keep the bug updated
<thumper> I'll go add that to the bug
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1588574
<mup> Bug #1588574: Session already closed in state/presence <blocker> <ci> <intermittent-failure> <juju-core:Fix Committed by dave-cheney> <https://launchpad.net/bugs/1588574>
<davecheney> ^ fix committed !!
<alexisb> sweet davecheney!
<davecheney> thumper: i punted on two issues which I'll work on today
<davecheney> so we could land this
<davecheney> alexisb: well, don't open the champagne yet, let's see if I got the fix right :)
<thumper> davecheney: which two issues are those?
<thumper> cannot resolve URL "cs:trusty/apache-flume-syslog-4": cannot resolve charm URL "cs:trusty/apache-flume-syslog-4": cannot get "/trusty/apache-flume-syslog-4/meta/any?include=id&include=supported-series&include=published": Get https://api.jujucharms.com/charmstore/v5/trusty/apache-flume-syslog-4/meta/any?include=id&include=supported-series&include=published: dial tcp: lookup api.jujucharms.com on 127.0.1.1:53: read udp 127.0.0.1:43666->
<thumper> 127.0.1.1:53: i/o timeout
<thumper> WTF?
<thumper> this error is shown to the user when deploying charms
<davecheney> thumper: wow, that is incredibly subtle
<davecheney> the mgo issue
<thumper> yeah
<thumper> kinda shit
<davecheney> thumper: first issue is https://launchpad.net/bugs/1590161
<mup> Bug #1590161: apiserver/client: panic: Session already closed <juju-core:New> <https://launchpad.net/bugs/1590161>
<thumper> we are going through mgo with logging now
<davecheney> the second issue is a followup requested by william to make the presence watchers implement worker.Worker
<davecheney> i'll do that first then tackle https://launchpad.net/bugs/1590161
<davecheney> api/client_test.go:
<davecheney> 31:     jujunames "github.com/juju/juju/juju/names"
<davecheney> you're killing me
<mup> Bug #1590205 opened: list-actions should produce tabular output <juju-core:New> <https://launchpad.net/bugs/1590205>
<mup> Bug #1590205 changed: list-actions should produce tabular output <juju-core:New> <https://launchpad.net/bugs/1590205>
<mup> Bug #1590205 opened: list-actions should produce tabular output <juju-core:New> <https://launchpad.net/bugs/1590205>
<davecheney> menn0: here is a simple one to warm up on https://github.com/juju/juju/pull/5563
 * thumper -> dog walk
<axw_> wallyworld: free to chat now?
<wallyworld> axw_: give me a few, just doing a review
<axw_> wallyworld: sure, ping me
<natefinch> mongo continues to impress: https://engineering.meteor.com/mongodb-queries-dont-always-return-all-matching-documents-654b6594a827#.k7ljsk6fv
<axw_> natefinch: :o
<natefinch> axw_: right?  It's a race condition that should be unlikely, especially in a tiny DB like ours, but still...
<menn0> davecheney: sorry, I've been on calls - ship it!
<davecheney> menn0: ta
<menn0> davecheney: one thing, are you aware of the helpers in the worker/workertest package?
<menn0> davecheney: there's something in there which negates the need for that separate assertStopped in the presence package
<davecheney> i was not
<menn0> i'm not fussed if you leave it in this case, but it's a good package to know about. there's lots of good stuff in there (all by Will)
<menn0> it's fairly new
<davecheney> worker.Stop(w) does the job
<wallyworld> axw_: sorry, just finished doing review, have 30 minute meeting, will ping when i can
<axw_> wallyworld: np
<menn0> davecheney: take a look at worker/workertest/check.go. Those helpers handle the case of Stop/Wait not returning.
<davecheney> menno-afk: in my case, they have to return, otherwise the mongo driver will panic
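(A rough sketch of the helpers menn0 is recommending, assuming the CleanKill/CheckAlive names in worker/workertest/check.go; the suite and constructor are hypothetical. Their value over a hand-rolled assertStopped is that they fail the test with a timeout instead of hanging when Wait never returns:)

    import (
        gc "gopkg.in/check.v1"

        "github.com/juju/juju/worker/workertest"
    )

    func (s *suite) TestWorkerStops(c *gc.C) {
        w := s.newWorker(c) // hypothetical constructor
        workertest.CheckAlive(c, w)
        // Kill the worker and wait for it to exit; if Wait blocks,
        // the helper fails the test rather than hanging forever.
        workertest.CleanKill(c, w)
    }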
<wallyworld> thumper: i'd love a second look at http://reviews.vapour.ws/r/4973/ when you're back
<wallyworld> axw_: standup ho?
<axw_> wallyworld: be there in a mo
<davecheney> menno-afk: thumper next cab off the rank, https://github.com/juju/juju/pull/5564
<davecheney> natefinch: wow, that blog post, what garbage
<natefinch> davecheney: yuuup. kick you right in the assumptions
<thumper> natefinch: got 5 minutes to hear weirdness?
<natefinch> thumper: I love weirdness
<thumper> natefinch: https://hangouts.google.com/hangouts/_/canonical.com/onyx-standup?authuser=0
<menn0> davecheney: looking
<thumper> menn0: confirmed it is mgo bug
<thumper> on the way in to mgo, it is a bson.D
<thumper> when it is applying finally, it passes through a bson.M
<thumper> so it futzes the ordering
<thumper> I've got a bunch of extra logging
<thumper> have all the flusher logging too
<menn0> but you haven't found the exact place yet?
<thumper> looking
<menn0> thumper: it's so weird that we can't find the bson.M usage in the code
<thumper> menn0: when it needs to run
<thumper> it hits a queue
<thumper> of four
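(For anyone following along, a hedged illustration of why that conversion matters, not the actual mgo code path: bson.D is an ordered slice, bson.M is a plain Go map, and MongoDB compares embedded documents field-order-sensitively, so a D-to-M round trip can make a previously-met assertion fail:)

    import "gopkg.in/mgo.v2/bson"

    var (
        // Marshals with "txn-revno" first, then "alive": order kept.
        d = bson.D{{"txn-revno", 2}, {"alive", true}}
        // A map: the marshalled field order is not guaranteed.
        m = bson.M{"txn-revno": 2, "alive": true}
    )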
<davecheney> thumper: urg
<davecheney> this pinger api
<davecheney> it's like pulling out the pin on a hand grenade then losing it in the couch
<menn0> davecheney: haha
<menn0> davecheney: so you've dealt with the panics that occurred during the tests... are you sure the recover() wasn't hiding panics in production too?
<menn0> davecheney: apart from that concern the change LGTM
<menn0> davecheney: i'd be a bit happier if the pinger kept its own Session rather than trying to copy the one attached to the pings collection every time it wanted one
<davecheney> menn0: all the bugs are in tests
<davecheney> their shutdown behaviour is weaksauce
<menn0> but we wouldn't know if there were panics in production that the recover was hiding right?
<menn0> I agree you've dealt with the test issues.
<davecheney> i'm not interested in having a discussion about stability by hiding panics
<davecheney> this code was added ages ago when we didn't understand what was causing the problem
<davecheney> you cannot take that as a basis for it being the "right" thing, and I am proposing something more dangerous
<davecheney> panic => test or production, we find the cause and fix that, not paper over them
<davecheney> this isn't just a philosophical argument
<menn0> i see where you're coming from but a panic in production is still a panic
<davecheney> panics can break lock invariants
<davecheney> menn0: that's why we have upstart to restart processes that die
<menn0> i'd like another person to take a look, but apart from this concern the change looks good
<davecheney> func (s *MachineMockProviderSuite) SetUpTest(c *gc.C) {
<davecheney>         // Change to environ that supports HasRegion
<davecheney>         s.commonMachineSuite.SetUpTest(c)
<davecheney> }
<davecheney> ^ wtf,
<mup> Bug #1590237 opened: juju2 usability: removing cloud makes list-credentials behave weird <landscape> <maas> <usability> <juju-core:New> <https://launchpad.net/bugs/1590237>
<mup> Bug #1590239 opened: juju2 usability: I can import credentials of the wrong type, but can't list them/use them. <landscape> <maas> <usability> <juju-core:New> <https://launchpad.net/bugs/1590239>
<davecheney>         go func() {
<davecheney>                 c.Check(a.Run(nil), jc.ErrorIsNil)
<davecheney>         }()
<davecheney>         defer func() { c.Check(a.Stop(), jc.ErrorIsNil) }()
<davecheney> WTF
<davecheney> the call to a.Stop() can happen _after_ the call to a.Run!!!
<natefinch> goroutines, how do they work?
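(One hedged way to rearrange the quoted fragment so the deferred check cannot race ahead of Run; a and c are the agent and checker from the original snippet, jc the juju checkers package:)

    // Capture Run's result on a channel so the deferred checks wait
    // for the Run goroutine instead of racing it.
    done := make(chan error, 1)
    go func() { done <- a.Run(nil) }()
    defer func() {
        c.Check(a.Stop(), jc.ErrorIsNil)
        c.Check(<-done, jc.ErrorIsNil) // Run has definitely returned
    }()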
<davecheney> Run(c *cmd.Context) error
<davecheney> ^ sign of a bad API, every call to this method passes nil!
<davecheney> func (a *machineAgentCmd) Run(c *cmd.Context) error {
<davecheney>         machineAgent := a.machineAgentFactory(a.machineId)
<davecheney>         return machineAgent.Run(c)
<davecheney> }
<davecheney> how does this help ?
<wallyworld> axw_: i have to go get car. here is a wip, based in part on your work yesterday (to pass in and store the cloud name against a model) http://reviews.vapour.ws/r/5012/  needs more tests and i haven't gone through it all yet but i need to head out for an hour or so
<axw_> wallyworld: okey dokey. about to have lunch, will look soon
<wallyworld> np, ty, bbiab
<natefinch> aahhh stupid frigging slow ass tests
<natefinch> I think this test never fails, it just waits for ever
<natefinch> hey, naked receive on a channel, thanks, tests.
<natefinch> wallyworld: test failures are killing me.  I'm going to have to pick this up in the morning.
<wallyworld> axw_: off to soccer photos, changes pushed, still got some tests to fix, but only minor changes to come
<axw_> wallyworld: ok, will look shortly
<dooferlad> frobware, dimitern: hangout time!
<dimitern> oops omw
<dooferlad> fwereade, jam: standup today?
<jam> dooferlad: thanks, just finishing the other meeting
<wallyworld> jam: would love to catch up about a couple of things, did you want to talk later, maybe an hour's time?
<jam> wallyworld: sure.
<wallyworld> ok, will ping after dinner
<axw_> menn0: actually it was block devices that I fixed: https://bugs.launchpad.net/juju-core/+bug/1461871
<mup> Bug #1461871: worker/diskmanager sometimes goes into a restart loop due to failing to update state <canonical-bootstack> <storage> <juju-core:Fix Released> <juju-core 1.22:Fix Released by axwalk> <juju-core 1.24:Fix Released by axwalk> <https://launchpad.net/bugs/1461871>
<axw_> sorry, clearly I should have emailed the list :/
<menn0> axw_: all good. the fact that it's happened twice just means we need to prevent it happening again :)
<axw_> menn0: +1
<ejat> axw_: can u help me with bug 1590172
<mup> Bug #1590172: ERROR cmd supercommand.go:448 autorest:WithErrorUnlessStatusCode POST https://login.microsoftonline.com/fb30bf07-xxxx-xxxx-xxxx-02ef08680fb9/oauth2/token?api-version=1.0 failed with 400 Bad Request <juju-core:New> <https://launchpad.net/bugs/1590172>
<axw_> ejat: I'll try. just checking, did you redact the UUID in the URL there?
<axw_> or is that verbatim
<ejat> axw_: yups .. i edit it
<ejat> because it's my tenant-id
<axw_> ejat: yep just checking that it wasn't like that in your credentials file
<ejat> axw_: its in full in my credentials file
<axw_> ejat: hmm, so I *think* that the only reason you would get a 400 is if the application, subscription or tenant ID is invalid
<axw_> ejat: you'll get a 401 if they're valid but the password is invalid
<axw_> (which is kinda poor security practice)
<ejat> ic ..
<axw_> ejat: do you have the azure CLI on your laptop?
<ejat> axw_: yes i am
<axw_> ejat: can you please confirm the tenant and subscription IDs in ~/.local/share/juju/credentials.yaml by comparing with the output of "azure account show"
<ejat> yes .. its the same
<axw_> ejat: and then please compare application-id to the "AppId" field for the application you created, by running "azure ad app list"
<axw_> ejat: I can confirm that if I change the application-id to something invalid, I get the same error message
<ejat> i think that might cause me a problem ...
<ejat> may i ask why need to Create an Azure Active Directory (AAD) application ?
<ejat> i mean .. its in 2.0 not in 1.0
<ejat> and what should --name, --home-page, and --identifier-uris be based on? project / models?
<axw_> ejat: that's just what MS/Azure says what we (Juju) are supposed to do
<axw_> ejat: it doesn't really matter, anything will do
<axw_> ejat: BTW there's docs here, they don't seem to be linked very well at the moment: https://jujucharms.com/docs/devel/help-azure
<ejat> okie ... noted ... at least i can tell the MS here the same as what u told me :)
<ejat> axw_: yeah im referring to that doc
<axw_> ejat: also, you may run into this after you get past the 400 issue: https://bugs.launchpad.net/bugs/1589890
<mup> Bug #1589890: juju2 azure fail with error 409 network conflict <cpe-sa> <juju-core:Incomplete by axwalk> <https://launchpad.net/bugs/1589890>
<ejat> error:   'ad' is not an azure command. See 'azure help'.
<axw_> ejat: what version of the CLI do you have? you may need to update it. I'm on 0.10.1, and it exists there
<ejat> so in short , juju2 not working atm with azure?
<ejat> ok let me try to update it
<wallyworld> jam: you free now? https://hangouts.google.com/hangouts/_/canonical.com/tanzanite-stand
<axw_> ejat: it is working, see comment #1. you just need to run a few extra azure CLI commands for Juju to work properly
 * ejat update n create ad 1st .. 
<ejat> thanks for the guiding
<ejat> axw_:
<babbageclunk> dimitern: What's workload-status? It's always unknown for the applications in the maas-spaces model.
<ejat> PS C:\Users\Lenovo> azure --version
<ejat> 0.10.1 (node: 4.2.4)
<ejat> PS C:\Users\Lenovo> azure ad app create --name "informology" --home-page "http://www.informology.my" --identifier-uris "http://www.informology.my" --password $APP_PASSWORD
<ejat> error:   'ad' is not an azure command. See 'azure help'.
<dimitern> babbageclunk: that's always unknown, unless set by the charm with e.g. 'status-set active "Ready"'
<dimitern> babbageclunk: i.e. depends on the charm
<babbageclunk> dimitern: ahh. So maybe these charms just don't set it?
<dimitern> babbageclunk: yeah
<babbageclunk> dimitern: although I guess that might also mean that they're not actually working.
<dimitern> babbageclunk: that's why it's unknown - we just don't know :)
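(For illustration, the smallest possible charm hook that moves a unit out of "unknown", using the status-set tool quoted above; the hook path is just an example:)

    #!/bin/sh
    # e.g. hooks/start: charms that never call status-set leave their
    # workload status as "unknown" forever.
    status-set active "Ready"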
<babbageclunk> dimitern: How can I check? haproxy's listening on port 80, but in the public space.
<axw_> ejat: maybe you need to run "azure config mode arm"?
<dimitern> babbageclunk: ssh in and check the logs for errors?
<babbageclunk> dimitern: yeah, I'll try that
<babbageclunk> dimitern: How could I make a web request to the running haproxy? Do I need to create an interface on the same vlan on the host machine?
<dimitern> babbageclunk: or just open haproxy's public address as a webpage
<dimitern> babbageclunk: if you can't access the public ip directly
<babbageclunk> dimitern: I tried that, nothing.
<dimitern> babbageclunk: try using sshuttle
<babbageclunk> dimitern: Hmm, ok
<mup> Bug #1590362 opened: azure: Azure API errors do not contain information about the cause <azure-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1590362>
<dimitern> babbageclunk: e.g. `sshuttle -r maas-hw 10.14.0.0/20 10.30.0.0/20 10.50.0.0/20 10.100.0.0/20 10.150.0.0/20 10.200.0.0/20 10.250.0.0/20 10.99.0.0/20`
<dimitern> babbageclunk: maas-hw is configured in my ~/.ssh/config to be Host: 192.168.1.104, User: maashw, and I can ssh with my pubkey
<ejat> axw_: http://paste.ubuntu.com/17113187/
<dimitern> babbageclunk: the ranges after that match any maas subnets you want to access (but otherwise can't except when on the rack machine)
<babbageclunk> dimitern: Ok.
<axw_> ejat: hrm. did you do "azure login" again after changing the config mode?
<ejat> axw_: yups
<ejat> axw_: http://paste.ubuntu.com/17113282/
<axw_> ejat: ok, well I guess azure just doesn't let you do that. I'm not sure what we can do about that
<ejat> axw_: i need to contact Azure ?
<axw_> ejat: worth a try. it may be that it's a limitation of BizSpark
<ejat> axw_: Your question was successfully submitted to Microsoft Support using your Azure Subscription Management Support plan. A Microsoft support professional will contact you within 2 hour(s).
<ejat> :(
<axw_> ejat: that's better than I expected :p
<axw_> ejat: I need to head off shortly, could you please update the bug with the outcome? or leave me a PM here, either way
<ejat> thats from azure portal .. or else .. i need to ask favor from MS SG @ MY to expedite
<ejat> okie will do .. thanks a lot .. will pm you once get feedback from Azure
<ejat> axw_: i've updated the bug
<axw_> ejat: thank you. I'm heading off now, good night!
<babbageclunk> dimitern: I can't juju ssh to any of the machines in my model - I can ssh to them directly using the IPs of the physical interfaces from the MAAS UI (192.168.150.*), but not with the IPs in juju status (192.168.10.*). Is that what sshuttle fixes?
<dimitern> babbageclunk: yeah, it should fix that
<babbageclunk> dimitern: ok, thanks
<dimitern> babbageclunk: alternatively, you can add static routes for all maas subnets on your local machine, but sshuttle makes it a lot easier
<babbageclunk> dimitern: ok, I think that makes sense - I was kind of fumbling towards realising that it was because of missing routes, but good to have it confirmed.
<babbageclunk> dimitern: ok, I've got sshuttle running and it says it's connected. I still can't juju ssh to the machines - should I be able to?
<dimitern> babbageclunk: try passing -v at the end of juju ssh ..
<dimitern> babbageclunk: also try ping <haproxy-public-address>
<babbageclunk> dimitern: nope. Maybe the maas controller can't get to the hosts either?
<babbageclunk> dimitern: That seems like it would explain it.
<jam> wallyworld: sorry about the delay, are you still around?
<wallyworld> yep
<wallyworld> https://hangouts.google.com/hangouts/_/canonical.com/tanzanite-stand
<wallyworld> jam: ?
<babbageclunk> dimitern: are you around?
<dimitern> babbageclunk: yeah
<babbageclunk> dimitern: I can't ping the mysql host on its db space IP from the mediawiki host. But I can ping it on its .150 ip (default space)
<babbageclunk> dimitern: The routes look right, I think
<dimitern> babbageclunk: how is mysql deployed? bindings?
<babbageclunk> dimitern: yeah - from the bindings defined in the demo bundle
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17115770/
<dimitern> babbageclunk: and for mysql?
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17115875/
<dimitern> babbageclunk: btw it might be worth updating to the tip of master and rebuilding, so you can take advantage of my fix of yesterday; that e/n/i with addresses on both the bridge and the underlying NIC, e.g. ens3.10 might be causing issues
<babbageclunk> dimitern: Hmm, ok
<dimitern> babbageclunk: wait a sec
<dimitern> babbageclunk: can you paste that last one again, but with the output from 'ip route show' please? I'm not quite sure, but it looks like there are no gateways for the vlan subnets
<babbageclunk> dimitern: ok, gathering
<voidspace> dimitern: babbageclunk: http://fabriclondon.com/club/listing/1294
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17116157/
<dimitern> voidspace: ooh nice! thinking of going?
<voidspace> dimitern: yep
<babbageclunk> voidspace: not all of us will be away from our families and going all rumspringa!
<voidspace> babbageclunk: your choice... :-)
<mgz> you're not having a networking meeting in fabric...
<voidspace> mgz: hah, it would be appropriate...
<dimitern> voidspace: well, I'd love to go actually.. brush up some degrading social skills :D
<voidspace> dimitern: :-)
<dimitern> babbageclunk: ok, it looks fine; and you can't ping 192.168.10.3 from 192.168.10.4 ?
<babbageclunk> dimitern: nope - just get destination host unreachable.
<babbageclunk> dimitern: Do you think I should rebuild and rebootstrap?
<dimitern> babbageclunk: can you ping both of those IPs from maas?
<dimitern> babbageclunk: it will help, but let's try to see if something else is wrong first
<babbageclunk> dimitern: nope, can't get to either of them from maas
<babbageclunk> dimitern: I haven't got the vlans on the maas controller.
<dimitern> babbageclunk: well, that's why then :)
<dimitern> babbageclunk: maas needs to be on those vlans
<babbageclunk> dimitern: that's why can't get to them from the maas controller. But does that also affect the connection from mediawiki to mysql?
<babbageclunk> dimitern: (I mean, I was going to fix the maas vlan issue, particularly so sshuttle would work, but I still don't understand the other bit.)
<dimitern> babbageclunk: yeah, since maas is the gateway for all subnets, if maas itself cannot reach them they won't be able to either
<babbageclunk> dimitern: Oooooooooooooh
<dimitern> babbageclunk: :)
<babbageclunk> dimitern: ok, that makes a lot more sense. Sorry for extreme dumbness!
<dimitern> babbageclunk: my e/n/i on maas looks like this: http://paste.ubuntu.com/17116341/
<dimitern> babbageclunk: np :)
 * dimitern steps out for ~30m
<TheMue> voidspace: ping and hello
<TheMue> or anyone else available, need an information
<dooferlad> TheMue: hello
<TheMue> dooferlad: hey, hello
<TheMue> dooferlad: I need the e-mail address of the HR department (or somebody there) for the request of a special official paper.
<TheMue> dooferlad: sadly I've got no access to mail directory anymore, hehe
<dooferlad> TheMue: ok, will PM you in a moment
<TheMue> dooferlad: great, thanks
<TheMue> dooferlad: and juju is doing well?
<dooferlad> TheMue: yep
<TheMue> dooferlad: great, I miss my time at Canonical
<dooferlad> TheMue: you are missed!
<TheMue> dooferlad: am I?
<dooferlad> TheMue: more hands, more brains, more code :-)
<dooferlad> TheMue: and knowledge of Whisky of course
<TheMue> dooferlad: sadly I don't see a matching job offer
<dooferlad> TheMue: you are looking?
<TheMue> dooferlad: would start immediately
<TheMue> dooferlad: yep, I do
<dooferlad> TheMue: I will keep my ear to the ground for you
<TheMue> dooferlad: had no luck with my last employer
<dooferlad> TheMue: :-(
<TheMue> dooferlad: bad climate, and they also lost one important customer when I started, so we only worked half-time. and then they let the last three employees go. *sigh*
<dooferlad> TheMue: that sucks. I am sorry to hear that.
<TheMue> dooferlad: already have some new contact, will see
<mup> Bug #1590468 opened: Controller machine log contains worker errors <cdo-qa> <cdo-qa-blocker> <juju-core:New> <https://launchpad.net/bugs/1590468>
<natefinch> you know a test is probably doing something wrong when you see atomic.StoreUint32
<natefinch> s.checkStartInstanceCustom(c, m, "pork", s.defaultConstraints, nil, nil, nil, false, nil, true)
<natefinch> no wonder fixing tests takes 3x as long as implementing the feature
<katco> redir: standup time
<dimitern> babbageclunk: hey, did you manage to sort out the vlans stuff?
<babbageclunk> dimitern: not yet - was at code club. Trying to change the script to link the subnets at the same time as it adds the vlans to solve the rerun problem I was seeing.
<dimitern> babbageclunk: ok
<alexisb> fwereade, ping
<alexisb> you still around?
<alexisb> voidspace, you still around?
<voidspace> alexisb: yes
<mup> Bug #1590520 opened: failed to prepare container "0/lxc/2" network config: model "default" networking not supported (not supported) <cdo-qa> <juju-core:New> <https://launchpad.net/bugs/1590520>
<redir> katco: ericsnow natefinch would one of you have time to make sure I am understanding the build transaction convention in state in about 30m?
<redir> about to have lunch
<ericsnow> redir_lunch: sure
<katco> ericsnow: redir_lunch: wouldn't mind sitting in on that just as a refresher
<redir_lunch> Ok. I'll ping you both after I hoover some stew
<natefinch> if at first the test does not succeed... delete the test
<redir_lunch> back
<marcoceppi> I have a question about the API, why do I need to specify the model UUID when connecting to a controller? Why can't I just connect to the controller with user/pass?
<natefinch> marcoceppi: that's a thumper question, I think
<redir> ping katco ericsnow
<redir> ericsnow is gonoe
<redir> s/gonoe/gone
<katco> redir: he's probably just restarting or something
<rick_h_> marcoceppi: because of stuff :P
<redir> watch ericsnow status
<redir> :)
<marcoceppi> rick_h_ natefinch it's a really freakin bummer for libjuju library design
<rick_h_> marcoceppi: I think it's because there's a possibility that it won't be clear in future uses
<rick_h_> marcoceppi: and the UUID is required to help make sure it finds its way to the api server correctly
<marcoceppi> esp since we're not connecting to a controller at that point, we're connecting to a Model, so the idea of a Controller object is really moot
<marcoceppi> I would have expected: connect to Controller, get list of models my user has, connect to model
<redir> I agree w/ marcoceppi I would think I'd connect then switch to a model. or connect to model@controller which might do it all at once
<rick_h_> marcoceppi: redir yes, but I think part of the thing is the way we did controllers was a bit "easy route" in that it's just a model that we treat special and it takes the same api calls as models
<rick_h_> so it just hasn't been pulled out as something different in all the calls, to avoid requiring it
<marcoceppi> but controllers are very special cases
<marcoceppi> how do I do ensure-ha? it's not against a model, it's a controller level feature
<marcoceppi> list me all your controller endpoints, as another example
<rick_h_> right, I'm +1 on trying to design the lib to hide that, but technically that's how it works: you happen to ask one of the dozen models a special question and it answers
<redir> rick_h_: understood that's how we did it, just saying it isn't how me the user would expect it to be.
<rick_h_> redir: I know, I'm +1 on trying to hide it as an implementation detail as much as we can
<rick_h_> redir: but the api is the api so there will be limits
<katco> rick_h_: i think maybe what redir and marcoceppi are trying to say is that it would be nice if the api were designed differently :)
<rick_h_> katco: I know, and I know alexisb-afk would fly to my house and mount my head on a pike if I suggested changing it atm so a bit stuck
<katco> rofl
<rick_h_> I like my head /me inserts "attached to it" joke here
<katco> rick_h_: yes. yes she would. i don't think anyone is blaming you or anyone who implemented it. them's the constraints we were working under.
<alexisb-afk> rick_h_, whats up?
 * alexisb-afk reads back scroll
<rick_h_> alexisb-afk: nothing, just noting that I don't want you to kill me :P
<alexisb-afk> and gets the axe ready
 * marcoceppi ducks
<alexisb-afk> rick_h_, this discussion should have taken place back in october
<alexisb-afk> there were qs that went around
<rick_h_> alexisb-afk: well it goes back before that and yes I agree. It goes back to how it was originally implemented
<alexisb-afk> thumper/menn0  would have details on the reasons for final outcome
<rick_h_> alexisb-afk: but nothing for it atm
<alexisb-afk> yep
<rick_h_> marcoceppi: arosales did our call go away?
<natefinch> so we're agreed that thumper will fix it for 2.0, yes? ;)
<marcoceppi> rick_h_: I'm sprinting this week on vpil stuff
<thumper> heh, wat?
<redir> katco: :)
<alexisb-afk> thumper, marcoceppi has a question for you
<thumper> o/ marcoceppi
<alexisb-afk> and with that I go back to not being here
<rick_h_> marcoceppi: ok, do you have any time to chat? i need to get this SoW finished up and want to make sure I've got your feedback.
<rick_h_> marcoceppi: I can run with your comment in the doc, but think it'll be more than "not do it"
<marcoceppi> rick_h_: sure, we moved it to thursday
<marcoceppi> rick_h_: in that, we have a vpil engineering sync that's pretty sparse
<marcoceppi> but we can chat now if you'd like
<arosales> rick_h_: did the SoW need to get in today, or can we get the VPIL bits in tomorrow?
<natefinch> I just dumped backscroll context for thumper... but I think the consensus was "yes it's suboptimal, but that's how it has to be for now", yes
<thumper> I have 10 minutes if someone wants to jump on a hangout and explain more, marcoceppi?
<natefinch> I feel like dimiter has been here: TestProvisioningInfoWithSingleNegativeAndPositiveSpaceInConstraints
<cherylj> haha
<katco> natefinch: being serious, what's wrong with that name? it's not like you have to call that method from anywhere, and it's descriptive... so... better than ~TestConstraints isn't it?
<natefinch> katco: kinda annoying to type with check.f  *shrug*  mostly it's just hard to read and parse
<katco> natefinch: i like Test<funcName>_<testDescription>
<natefinch> katco: I'd probably call it TestNegativeAndPositive ... assuming it's in a reasonably named test suite (which it's not)
<katco> natefinch: but test negative and positive what? and against what?
<natefinch> so like func (s ProvisioningInfoSuite) TestNegativeAndPositiveSpace
<katco> natefinch: TestProvisioningInfo_PosNegSpace maybe?
<natefinch> katco: well, all the tests in the suite are about turning constraints into provisioning info structs
<katco> natefinch: ah
<natefinch> but really, a nice daffodil yellow really makes the shed stand out from its surroundings
<katco> lol
<redir> +1 natefinch
<mup> Bug #1589670 changed: backups does not implement Backups for non linux OSes <ci> <regression> <windows> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1589670>
<mup> Bug #1589670 opened: backups does not implement Backups for non linux OSes <ci> <regression> <windows> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1589670>
<marcoceppi> thumper: sorry, at a sprint, we're about to EOD
<thumper> marcoceppi: np
<marcoceppi> tl;dr it's weird that I have to give a model UUID for logging into a controller, would have thought I give API endpoint + user + pass, then I can issue commands like ListModels, and initiate switches to other models from there
<marcoceppi> in libjuju we're expecting to have a python object for Controller to do things like list all the valid api endpoints, to ensure_ha over the api, etc
<mup> Bug #1589670 changed: backups does not implement Backups for non linux OSes <ci> <regression> <windows> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1589670>
<redir> what does "state changing too quickly; try again soon" mean? Is this a known uh, thing?
<perrito666> redir: its an error produced when an attempt loop for a transaction fails every time
<perrito666> it is making an assumption that I dont especially like
<redir> perrito666: would that fit errors.IsNotFound?
<perrito666> yes, iirc the retries only happen if the failure is an assertion error so I guess that if you have an assert for doc missing it would yield that completely unrelated error
<perrito666> ymmv
<redir> perrito666: thanks. I can almost imagine a better error message.
<redir> and my mileage varied. It doesn't satisfy errors.IsNotFound.
<perrito666> ahh I did not understand your question
<perrito666> no it wont
<redir> tx perrito666
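(A rough sketch of the retry convention being described, with approximate names; the real runner lives in state and github.com/juju/txn. buildTxn is called with an increasing attempt count, and if the ops still abort on the last attempt the caller sees the generic "state changing too quickly; try again soon" error, even when the real cause is e.g. a missing document:)

    buildTxn := func(attempt int) ([]txn.Op, error) {
        if attempt > 0 {
            // Re-read whatever the asserts below depend on; returning
            // a NotFound error here would surface a useful message
            // instead of the generic contention one.
        }
        return []txn.Op{{
            C:      "modelusers",         // hypothetical collection
            Id:     "deadbeef:bob@local", // hypothetical id
            Assert: txn.DocExists,
            Update: bson.M{"$set": bson.M{"displayname": "Bob"}},
        }}, nil
    }
    err := st.run(buildTxn) // retried a fixed number of times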
<thumper> marcoceppi: you don't need a uuid to log into the controller
<perrito666> yw, I did literally nothing
<thumper> and you can list models without specifying a uuid
<thumper> I made sure this works
 * redir takes a break and walks around.
<redir> perrito666: well at least you didn't do nothing alone.
<thumper> marcoceppi: your libjuju Controller should work
<thumper> btw, I have some interesting python code you might like
<thumper> that I started with my pylibjuju
<thumper> that went nowhere
<redir> I know I should sleep in the state tests!
 * redir ducks
<perrito666> too late, dave already sent someone to kill you
<redir> ericsnow: yt?
<ericsnow> redir: yep
<redir> is now a good time?
<redir> ericsnow: ^
<ericsnow> redir: sure
<ericsnow> redir: moonstone?
<redir> moonstone?
<redir> k
<redir> katco: ^
<katco> redir: sure
<perrito666> I wonder why my windows spends 23 minutes heavily reading the hd each time I boot it
<mup> Bug #1590598 opened: ipv6 interfaces on a machine (in maas) are not added to lxc containers deployed to that machine <juju-core:New> <https://launchpad.net/bugs/1590598>
<menn0> thumper: here's that sort bug: https://bugs.launchpad.net/juju-core/+bug/1590605
<mup> Bug #1590605: juju debug-log sort may exceed MongoDB's limits <juju-core:New> <https://launchpad.net/bugs/1590605>
<mup> Bug #1590605 opened: juju debug-log sort may exceed MongoDB's limits <juju-core:New> <https://launchpad.net/bugs/1590605>
<thumper> menn0: thanks
<thumper> I should file a few bugs I noticed over the last few days too
#juju-dev 2016-06-09
<mup> Bug #1590045 changed: Uniter could not recover from failed juju run <run> <juju-core:Triaged> <https://launchpad.net/bugs/1590045>
<davecheney> thumper: menn0 fwereade https://github.com/juju/juju/pull/5578
<natefinch> *decrement
<natefinch> "The coordination via stopped is not reliably observable, and hence not tested"
<davecheney> (and didn't work :)
<davecheney> fixed misspelling
<thumper> davecheney: well, it looks reasonable to me
<thumper> davecheney: do all the tests now pass?
<davecheney> thumper: the cmd/jujud/agent test that was previously failing because the catacomb had not stopped the workers it was in charge of by the time the test was shut down
<natefinch> I'd really like to see a failing test in the catacomb package that this changes fixes.  relying on a test far away to assert the correctness of the change is less than ideal.
<davecheney> natefinch: sure, https://github.com/juju/juju/pull/5564#issuecomment-224468739
<davecheney> here is the failing test
<davecheney> you can see in the panic output that the firewalls and pinger are still running and were started from the catacomb
<thumper> davecheney: I think what natefinch means is an explicit test on catacomb
<davecheney> there is an explicit test
<thumper> small test
<davecheney> if the worker has not shut down, the catacomb tests will not pass
<davecheney> there is an explicit test for catacomb.Wait()
<thumper> and did that one intermittently fail?
<davecheney> 'cos the catacomb tests before weren't testing shit
<davecheney> now they are
<davecheney> I fixed the code to match the test
 * thumper sighs
<thumper> but there wasn't a failing catacomb test
<davecheney> nope, see the comment in the PR
<davecheney> this wasn't tested
<davecheney> now itis
<davecheney> now it is
<thumper> not explicitly just on catacomb without the agent bollocks
<davecheney>         go func() {
<davecheney>                 defer catacomb.tomb.Done()
<davecheney>                 defer catacomb.wg.Wait()
<davecheney>                 catacomb.Kill(plan.Work())
<davecheney>         }()
<davecheney> ^ the catacomb is not dead until wg.Wait() drops to zero
<davecheney> wg.Wait cannot drop to zero until all the workers have exited
<thumper> davecheney: http://reports.vapour.ws/releases/4039/job/run-unit-tests-trusty-amd64-go1-6/attempt/637
<thumper> pprof thingy
<thumper> davecheney: I think that is related to what you added, yes?
<thumper> I have NFI what is wrong though
<thumper> just saw it fly past
<davecheney> thumper: yes that is the pprof facility we added a while back
<davecheney> i think axw had the last hack at fixing a related bug
<thumper> I need to chat at some stage, perhaps next week, at how to hook into that :)
<davecheney> thumper: its on the wiki
<thumper> what? docs? who reads these days :)
<davecheney> thumper: https://github.com/juju/juju/wiki/pprof-facility
<thumper> ta
<davecheney> thumper: are we good ?
<thumper> is it possible to hook into the listener to add additional output paths?
<thumper> i.e. extra stats?
<davecheney> nope, sorry that's all the runtime exposes
<thumper> hmm... ok
<davecheney> if you want to do something more than the bits we get for free from the runtime
<davecheney> that's a non-trivial amount of work
<wallyworld> axw_: whenever you have time, this stores cloud region against model, and cloud name in controller doc http://reviews.vapour.ws/r/5023/ next one will use separate controllerconfig
<wallyworld> thumper: we really should address those various TODO schema change items in the code base but we probs will run out of time :-(
 * perrito666 runs the whole suite and fires netflix
<thumper> yes :(
<anastasiamac> axw_: GCE region fix: http://reviews.vapour.ws/r/5024/
<natefinch> yay, someone fixed my bug! :)  sounds like it was worse than just lying about where we were bootstrapping
<natefinch> anastasiamac: so, what happens if you bootstrap without specifying a region?
<axw_> anastasiamac: have you tested live?
<anastasiamac> yes
<anastasiamac> natefinch: u will have to specify region in ur clouds. i think u cannot by-pass it now anyway...
<anastasiamac> axw_: prior to change, I'd always end up on us-central, after change i've bootstrapped in us-east :D
<anastasiamac> axw_: (as expected)
<natefinch> why is having a default bad?  I mean, sure, don't override what someone explicitly specifies... but do we require other clouds specify regions?
<axw_> natefinch: we have defaults at a higher level now
<axw_> natefinch: in clouds.yaml, the first region is the default unless you specify one
<axw_> and we always pass that into the provider
<anastasiamac> natefinch: because, like in this case, at the provider level, it's easy to omit getting a mandatory property from bootstrap config
<axw_> (two places for a default => subtle bugs)
<anastasiamac> natefinch: m not planning to remove default from 1.25, only planning to add copy param logic :D
<natefinch> axw_: ok, sure, yeah, not having multiple defaults is fine.... but clouds.yaml isn't really a default, it's a configuration.  This means the user always has to manually specify a region, right?
<axw_> natefinch: no. the first region specified in clouds.yaml, for each cloud, is the default unless you specify one. you can set a default region yourself, otherwise it's the first in the list
<axw_> natefinch: run "juju show-cloud aws", and the first region in there should be the default for aws
<axw_> (unless you have set one with "juju set-default-region")
<natefinch> axw_: I'm confused, I thought clouds.yaml was what we called the thing you had to write out by hand to pass into add-cloud.  Is that a generated file we create on disk?
<axw_> natefinch: sorry, I was being a bit non-specific. there's ~/.local/share/juju/public-clouds.yaml (public clouds), and ~/.local/share/juju/clouds.yaml (personal clouds, ones you added with add-cloud)
<axw_> natefinch: the public-clouds.yaml file won't be there by default, it's also built into the client
<axw_> natefinch: but "juju update-clouds" will pull down a file in that spot if there's been updates, so a client can refer to new clouds/regions without upgrading
<natefinch> axw_: ahh, thanks, ok.  that was the context I was missing :)
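(A hypothetical personal-cloud entry of the kind being described; with no "juju set-default-region", the first region listed, east-1 here, is the default:)

    clouds:
      homestack:
        type: openstack
        auth-types: [userpass]
        regions:
          east-1:
            endpoint: https://east.example.com:5000/v3
          west-1:
            endpoint: https://west.example.com:5000/v3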
<axw_> natefinch: btw that intermediate clouds.yaml to feed into add-cloud should be going away (or at least relegated), as add-cloud will eventually (soon?) be made interactive
<natefinch> axw_: thank goodness.  the less yaml I have to write, the better :)
<axw_> yep, the aim is to stop editing files by hand
 * thumper is on kid duty
<thumper> I'm going to be afk for a while, but back after being a taxi tonight
<thumper> davecheney: oh, also, will miss standup tomorrow morning, on airport pick up duty
<natefinch> cd .
<davecheney> thumper-afk: right ok
<davecheney> no worries
<davecheney> natefinch: shutdown -h
<natefinch> davecheney: heh
<natefinch> davecheney: for your consideration: https://github.com/juju/juju/blob/master/worker/provisioner/provisioner_task.go#L409
<davecheney> natefinch: you'll have to try harder than that if you're trying to shock me
<davecheney> the MustCompile is a nice touch
<davecheney> very devil may care
<natefinch> I just can't even.  So much wrong in that one little line
<wallyworld> axw_: i'm not sure we should include cloud region in migration data, since model could be stored to a different cloud/region
<axw_> wallyworld: I think migration expects to be between controllers in the same region
<axw_> wallyworld: the machines/agents/etc. remain where they are, and are redirected to the new controller
<axw_> wallyworld: they're not destroyed and recreated
<wallyworld> ok
<natefinch> anyone having problems with statesuite.TestPing?  TearDownTest seems to hang forever
<natefinch> log is just full of [LOG] 0:55.041 DEBUG juju.mongo connection failed, will retry: dial tcp 127.0.0.1:39150: getsockopt: connection refused
<davecheney> why does touching anything cause it to break
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1590645
<mup> Bug #1590645: worker/catacomb: test failure under stress testing <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1590645>
<davecheney> top tip: this was already broken
<davecheney> ^ that's master
<mup> Bug #1590645 opened: worker/catacomb: test failure under stress testing <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1590645>
<davecheney> nope, sorry, pebkac
<natefinch> gah, testping doesn't even DO anything
<davecheney> menn0: could you take a second look at https://github.com/juju/juju/pull/5578
<davecheney> I found a bug in stress testing that I have now fixed
<davecheney> menn0: https://github.com/juju/juju/pull/5578/commits/36c3e7f8bd9435bf1cccd2480b4f921bcb6345d9
<menn0> davecheney: looking
<menn0> davecheney: the core change looks good but we still disagree about test suites
<mup> Bug #1590645 changed: worker/catacomb: test failure under stress testing <juju-core:Invalid by dave-cheney> <https://launchpad.net/bugs/1590645>
<davecheney> menn0: fine i'll roll back that change
<davecheney> it's not worth having a fight about
<menn0> davecheney: kk
<davecheney> menn0: done
<davecheney> removed that commit from the PR
<natefinch> it's pretty insane that our api client tests take many many minutes to run
<menn0> davecheney: ship it already :)
<davecheney> menn0: with pleasure
<davecheney> menn0: as you say, it was obvious in retrospect (now we can see the solution)
<davecheney> we _have_ to wait for both goroutines to finish
<menn0> davecheney: yeah totally
<davecheney> the previous stopped channel connected both goroutines in one direction, but didn't in the other
<natefinch> welp.  Not getting these tests passing tonight.  Sorry wallyworld.  I keep running into random tests that time out after 10 minutes, which is sorta killing the development cycle here.  I'm even using gt to avoid retesting code I know hasn't changed.
<wallyworld> ok, can you push what you have?
<natefinch> wallyworld: sure
<wallyworld> ty
<natefinch> wallyworld: made a WIP PR to make it easier to see the diffs: https://github.com/juju/juju/pull/5583
<wallyworld> ty
<wallyworld> i'll try and look after soccer
<natefinch> heh: +547 −5,744
<wallyworld> \o/
<natefinch> and most of that plus is really just a file rename or two
<natefinch> ok, bedtime.
 * redir goes to bed too!
<fwereade> davecheney, would you explain http://reviews.vapour.ws/r/5022/diff/# to me please? I see that the original second goroutine can outlive the catacomb; but I can't see how Kill()ing an already-finished worker triggers session-copy panics
<fwereade> davecheney, ...oh, dammit, is this presence again?
<blahdeblah> Anyone know why when I ask Azure for 8 GB RAM & 50 GB disk, I get 13 GB RAM and 29 GB disk, but it claims to have mem=14336M root-disk=130048M ?
<fwereade> dimitern, do you know about the DocID and tests for same around the linklayerdevices code?
<dimitern> fwereade: what about it?
<fwereade> dimitern, why it's exported, and why it includes the model-uuid
<dimitern> fwereade: it shouldn't be exported
<dimitern> fwereade: sorry about that
<fwereade> dimitern, it happens ;)
<dimitern> fwereade: but why is it surprising that it includes model-uuid as prefix?
<fwereade> dimitern, fixed that already, mainly just checking something wasn't planning to build on it
<fwereade> dimitern, because state isn't meant to know that stuff -- the multi-model stuff does it for you
<fwereade> dimitern, or, it should -- I was wondering if there was something about it that meant it didn't quite fit
<dimitern> fwereade: well, I wasn't that comfortable with the multi-model stuff then I guess, wanted to test it explicitly
<fwereade> dimitern, just checking, though: you aren't using those DocIDs as anything other than _ids, right? not e.g. storing them in fields for subsequent lookup?
<dimitern> fwereade: I'm using mostly global keys
<dimitern> fwereade: and the DocID IIRC is only used for txn.Op{Id:} and FindId()
<fwereade> dimitern, (global keys without any model-uuid prefix, right?)
<fwereade> dimitern, cool, thanks for the orientation
<dimitern> fwereade: without, except for the parent/child refs
<fwereade> bugger
<dimitern> fwereade: let me have a look to remind myself..
<fwereade> dimitern, thanks
<blahdeblah> Also, 9 minutes to bootstrap that instance in Azure - is that expected?
<axw_> blahdeblah: 1.25 or 2.0?
<dimitern> fwereade: nope, so for the refs the docid is used literally as given, no assumptions on prefix
<blahdeblah> 1.25.5
<blahdeblah> axw_: ^
<blahdeblah> Looks like no matter what you ask for in a root disk, you get whatever Microsoft decides you need, which is 31.5 GB raw.
<fwereade> dimitern, can you point me at the code you're looking at?
<dimitern> fwereade: linklayerdevices_refs.go
<axw_> blahdeblah: it's been a while since I looked at the old provider, will have to go spelunking
<blahdeblah> axw_: Is it something where we need to specify instance type instead of mem/disk constraints?
<axw_> blahdeblah: re slowness: yes, sadly that's expected
<blahdeblah> How does this cloud still exist? :-\
<fwereade> dimitern, ok, refs looks safe, it's explicit but doesn't need to be
<axw_> blahdeblah: I *think* the root disk size is the same for all instance types, will need to check
<fwereade> dimitern, what about lldDoc.ParentName?
<dimitern> fwereade: that can be a global key
<blahdeblah> axw_: I suppose I should log a bug saying that there's no indication that the root-disk constraint is not honoured then...
<fwereade> dimitern, whoa, ParentName lets docids leak out too?
<axw_> blahdeblah: I think it may actually be related to the images that Canonical publishes
<dimitern> fwereade: well, not quite the docid, just the gk
<blahdeblah> axw_: That affects the size of sda presented to the OS?  Seems unlikely...
<fwereade> dimitern, still
<axw_> blahdeblah: well the images have the size of the root disk in the name ...
<blahdeblah> But surely that would simply affect the size of the partition created on the disk, not the disk size itself...
<axw_> blahdeblah: maybe, depends on whatever Hyper-V does. I don't know, I'll have a poke around
<blahdeblah> Thanks - appreciated.
<dimitern> fwereade: looking at the code I don't see a good reason to export ParentName() actually.. as there is a ParentDevice() which is more useful outside of state anyway
<fwereade> dimitern, excellent, I'll drop it if I can
<blahdeblah> axw_: I'll have a poke around for relevant bugs
<fwereade> dimitern, still slightly struggling to get my head around what changes I could/should make to get around the internal test failures
<dimitern> fwereade: please don't just drop it - it will still be needed inside the package for refs checks and some other internal logic, but unexporting it should be fine I think
<fwereade> dimitern, sorry, that's what I meant
<fwereade> dimitern, so, to step back for context
<dimitern> fwereade: ok
<fwereade> dimitern, I'm trying to extract a smaller, less stateful, type from State
<dimitern> fwereade: sounds challenging :)
<fwereade> dimitern, the clean line at the moment seems to be {database, model-uuid}
<fwereade> dimitern, and I've tacked on only a few methods -- getCollection, runTransaction, and the docID/localID translators
<fwereade> dimitern, this ofc means that the hacked-up state no longer produces the correct answers
<fwereade> dimitern, because the implementation detail of *how* we calculate docID has changed
<fwereade> dimitern, but I am deeply reluctant to just "fix" that *State
<dimitern> fwereade: I'm afraid I do have a bunch of internal tests for LLDs that verify the docID format :/
<fwereade> dimitern, the biggest question actually
<dimitern> fwereade: I trust the multi-model code better now at least :)
<fwereade> dimitern, is: can I just drop those internal tests? is there any functionality that isn't covered by external ones?
<fwereade> dimitern, so many of them are working with an unconfigured state anyway... ;p
<fwereade> dimitern, first stab at multi-model you did have to care about model-uuid
<dimitern> fwereade: those tests that check the docID includes model uuid prefix? sure - I think those are unnecessary anyway
<fwereade> dimitern, really, all the internal tests
<dimitern> fwereade: but the assumptions on the globalKey format for LLDs are important
<fwereade> dimitern, why so? they strike me as the purest of implementation details
<dimitern> fwereade: I aimed for 100% coverage while writing the code, some bits of it are not possible to test externally, but the internal tests gave me confidence that code is exercised
<fwereade> dimitern, if it's not possible to test it externally, why does it matter?
<fwereade> dimitern, by definition, surely, that makes it an implementation detail
<dimitern> fwereade: well, re gk format - container LLDs are intentionally restricted to only having a parent device on the same machine as the container and only of type bridge
<blahdeblah> axw_: FWIW, seems like it might be region-dependent.  If I bootstrap in debug mode, I get a bunch of error messages about Basic_A[1-4] and Standard_G[1-5] not being available in US East
<fwereade> dimitern, ok, but aren't those restrictions exercisable via the exported api? doesn't matter what strings we use, it's the restrictions on the live types we export that we need to verify
<dimitern> fwereade: sorry, I've looked at the internal tests again; ISTM most can be either moved to the non-internal suites or simply dropped
<dimitern> fwereade: those that could be moved include tests on simple getters or exported, related helper funcs (e.g. IsValidLinkLayerDeviceName)
<fwereade> dimitern, ok, I'll see what I can do, thanks
<axw_> blahdeblah:  "Azure supports VHD format, fixed disks." --  https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-about-disks-vhds/
<dimitern> fwereade: I believe tests around parent gk format can be tested externally, it was just easier to test them with a 'fakeState' (and not bring up the whole stack with the real one)
<axw_> blahdeblah: the old provider should not allow you to specify 50GB in the first place. the new one doesn't
<blahdeblah> axw_: When you say "old provider", you mean the one in 1.25.x ?
<fwereade> dimitern, as in, we can externally test that we can't create bad LLDs?
<axw_> blahdeblah: yep
<blahdeblah> axw_: So if I need more than 31.5 GB root disk, what can I do?
<axw_> blahdeblah: it does not exist in 2.0. Azure completely changed their APIs
<fwereade> dimitern, is there something specific about the gks that I'm missing?
<axw_> blahdeblah: so with the old world, I think you could do it with some hoop jumping: take the existing VHD, resize it and upload to your private storage. then you'd need custom image metadata
<axw_> blahdeblah: not sure why the images have such a small size in the first place tho really
<axw_> blahdeblah: possibly because with "premium storage", you pay by the disk size, rather than usage
<blahdeblah> axw_: When you say "old world", you mean "The only production-ready version currently supported"? :-)
<axw_> blahdeblah: yes.
<dimitern> fwereade: well, there's one thing...
<blahdeblah> axw_: So can we specify instance type or something like that to get around this?
<blahdeblah> Or are we just stuck with 31.5 GB?
<axw_> blahdeblah: nope, it's down to the images I'm afraid. all instance types have the same max limit for OS disk sizes
<dimitern> fwereade: to test e.g. you can't add a parent device to a container device without using the parent's globalkey as LLDArgs.ParentName
<blahdeblah> axw_: OK - thanks for your time.
<dimitern> fwereade: you have to use the gk, which is then verified to exist on the same host machine as the container, and to be a bridge
<fwereade> dimitern, don't we have some other way of identifying the parent?
<dimitern> fwereade: well, we only need its name and its host machine, which is conveniently part of the LLD's gk
<dimitern> fwereade: and in other cases (e.g. adding a new child to existing parent device on the same machine) we just use the plain parent device name in LLDArgs.ParentName
<fwereade> dimitern, sorry, when is a new child *not* added to an existing parent on the same machine?
<dimitern> fwereade: but hey! fortunately, that logic with LLDArgs.ParentName being a gk is only really used in one place usable externally
<dimitern> fwereade: when it's on a different machine (e.g. in a container, while the parent is on the host machine of that same container)
<dimitern> fwereade: SetContainerLinkLayerDevices is the only place currently where we use a gk-based ParentName for adding devices
<dimitern> fwereade: and if LLD.ParentName() is unexported, and ParentDevice() used instead (as it is IIRC), we won't leak gks outside of state
<fwereade> dimitern, I remain a bit suspicious of the still-extant ability to specify gk-ParentName from outside
<fwereade> dimitern, could we, e.g., have always-simple-ParentDeviceName, and ParentMachineID, in LLDArgs?
<dimitern> fwereade: that can be rectified, assuming SetLinkLayerDevices() rejects ParentName set to a gk, and SetContainerLLDs can bypass that check internally
<dimitern> fwereade: that's perhaps better - future-proof and more explicit
<fwereade> dimitern, cheers
<dimitern> fwereade: btw, now that you've had a chance to look at the approach I used for setting LLDs and LLDAddresses in bulk
<dimitern> fwereade: how do you like it?
<fwereade> dimitern, only just looking at that side of it now...
<fwereade> dimitern, just reading setDevicesAddresses... doesn't the providerID stuff have that index gotcha? I see some checking but it's not immediately clear that it's enough
<dimitern> fwereade: it was a noble attempt at first :) but it got complicated as parent/child refs were added.. and my inability to construct a single set of []txn.Op that can insert and verify new parents and new children of those in the same set of ops
<dimitern> fwereade: now providerIDsC is used for subnets, spaces, LLDs, and LLDAs, and the indexing issue is handled there
<fwereade> dimitern, yeah, the "you don't need to assert what you're doing in this txn" thing honestly only makes life harder
<fwereade> dimitern, ok... but the providerID memory-checks aren't guaranteed to run at all here, are they? and I don't think they're asserted at apply-time either
<dimitern> fwereade: yeah txn.DocMissing being the only possible option for inserts can really force you to re-think.. and I do understand why it's the only option
 * dimitern takes a look
<fwereade> dimitern, no, wait, setDevicesAddressesFromDocsOps seems to cover that
<dimitern> fwereade: so the ProviderID on LLDAddrDoc is just for convenience, it's not enforcing integrity like it used to; the asserts on PrIDsC do
<fwereade> dimitern, sorry, I seem to be slow this morning
<fwereade> dimitern, we have some logic around :592 that checks for dupe provider ids -- what does it do that sDAFDO doesn't?
<dimitern> fwereade: it validates whether ErrAborted occurred due to the assert added in sDAFDO
<fwereade> dimitern, won't reinvoking sDAFDO catch those anyway?
<mup> Bug #1590671 opened: Azure provider does not honour root-disk constraint <juju-core:New> <https://launchpad.net/bugs/1590671>
<dimitern> fwereade: well, it looks like sDAFDO doesn't validate, just asserts docMissing
<dimitern> fwereade: sorry, I need to talk to some contractors that just arrived - back in ~15m
<fwereade> dimitern, np
<voidspace> jam: happy birthday
<fwereade> dimitern, should networkEntityGlobalKeyOp theoretically be doing that check? if it's just a FindId().Count() I'd not be too worried, even if we do ask it a bunch of times...
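For context, the uniqueness guarantee being discussed works roughly like this (a sketch with hypothetical doc and helper names; the collection name providerIDsC is from the chat): inserting into the shared provider-IDs collection with a txn.DocMissing assert makes the transaction abort at apply time if the ID is already taken.

```go
// Sketch: provider-ID uniqueness enforced at txn apply time.
op := txn.Op{
	C:      providerIDsC,                          // shared by subnets, spaces, LLDs, LLDAs
	Id:     st.docID("linklayerdevice:" + string(providerID)),
	Assert: txn.DocMissing,                        // aborts on a duplicate provider ID
	Insert: providerIDDoc{ID: string(providerID)}, // hypothetical doc type
}
```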
<davecheney> fwereade: the problem is the goroutine does, w.Stop() and then returns, marking the waitgroup as done
<davecheney> Stop doesn't actually stop, it just asks to stop, so the worker can still be in the process of shutting down when the mongo connection is torn down
<jam> thanks voidspace
<fwereade> davecheney, right; but that's why the wg.Done happened after the Wait in the first goroutine -- how does the wg complete with a running worker? the only race I see is the possibility of a late Kill in the second goroutine
<fwereade> davecheney, and I accept *that's* wrong, because it's too trusting of Kill not to do anything untoward
<fwereade> davecheney, but if I understand correctly, you're saying that workers live too long, not that late Kills of already-dead workers are the problem?
<fwereade> davecheney, and I don't see how that was happening, because we always Wait()ed for the worker before we call wg.Done
<fwereade> davecheney, (I am assuming you're talking about Kill, which doesn't wait, rather than Stop, which does wait but wasn't used in the original)
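The ordering fwereade is defending looks roughly like this (a sketch against the worker Kill/Wait convention; variable names hypothetical): wg.Done only runs once Wait has returned, so the WaitGroup cannot complete while a worker is still shutting down.

```go
wg.Add(1)
go func() {
	defer wg.Done() // marked done only after the worker has fully exited
	w.Kill()        // request shutdown; returns immediately
	w.Wait()        // block until shutdown has actually completed
}()
```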
<dimitern> fwereade: I suppose so
<dimitern> fwereade: but since networkEntityGlobalKeyOp is used for subnets, and spaces as well, it should be done carefully
<fwereade> dimitern, ack
<fwereade> dimitern, (also, I'm pretty sure machineProxy is evil vs extracting ops-composers and using them in both places)
<dimitern> fwereade: yeah, machineProxy was added mostly for convenience, is it evil because it assumes the LLD and the machine are always on the same session?
<fwereade> dimitern, more that it's pretending to be a valid machine and it's really not
<fwereade> dimitern, it's trusting to luck that an empty machine doc won't trigger a bad-memory-state failure
<dimitern> fwereade: fair point
<fwereade> dimitern, the garbage data is *kinda* ok, in that the asserts *should* trap any problems, but you've still got an invalid *Machine lying around and it makes me nervous ;p
<mup> Bug #1590689 opened: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time <cpec> <juju> <maas> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1590689>
<fwereade> dimitern, anyway, I think I have to consider most of this a derail, I'm adding a bug
<dimitern> fwereade: FWIW it's never lying around - only used to call LLD(name) on it in 2 places
<dimitern> fwereade: but still
<fwereade> dimitern, yeah, it's not how it's used right that bothers me so much as how someone will one day use it wrong ;)
<dimitern> (considering all of that was designed and implemented in a week..)
<dimitern> fwereade: indeed; "don't be clever, be obvious"
<dimitern> :)
<fwereade> dimitern, I apologise for the length of this letter, I did not have time to make it shorter ;)
<dimitern> fwereade: which one?
<fwereade> dimitern, it's a quote from someone-or-other that seemed tangentially relevant
<fwereade> dimitern, "shorter" *sort of* maps to "more obvious", so long as it's the right sort of shortness
<dimitern> fwereade: ah :) I was thinking of "clean code"
<fwereade> dimitern, that is rather more directly relevant than pascal, indeed ;)
<mup> Bug #1590689 changed: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time <cpec> <juju> <maas> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1590689>
<babbageclunk> Someone needs to come up with a way of hedging Pascal's wager against Roko's Basilisk.
<fwereade> babbageclunk, ssh!
<fwereade> ;p
<babbageclunk> Sorry you guys.
<mup> Bug #1590689 opened: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time <cpec> <juju> <maas> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1590689>
<mup> Bug #1590699 opened: LinkLayerDeviceArgs exposes globalKeys outside state <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1590699>
<babbageclunk> hey dimitern, got a moment for more stupid questions?
<dimitern> babbageclunk: always :)
<babbageclunk> dimitern: So I've added my vlans to the eni on the maas controller and restarted. The routes *seem* ok, but I still can't ping any of the vlan addresses on the hosts.
<dimitern> babbageclunk: what have you tried pinging?
<babbageclunk> dimitern: 192.168.10.2
<dimitern> babbageclunk: that's a node's address and you pinged from maas?
<babbageclunk> dimitern: yup
<babbageclunk> dimitern: on one node I can see that in ip route, and on the maas node I still get destination host unreachable trying to ping that address.
<dimitern> babbageclunk: do you have ip forwarding enabled on maas?
<babbageclunk> dimitern: Do I need to set the vlan up in the VMM virtual network config?
<babbageclunk> dimitern: Where do I check for ip forwarding?
<dimitern> babbageclunk: sudo sysctl -a | grep ip_forward
<babbageclunk> dimitern: oh, ok - nothing in the ui?
<babbageclunk> dimitern: net.ipv4.ip_forward = 1
<dimitern> babbageclunk: the kvm node has 1 or more NICs, each of which connected to a libvirt bridge; the bridge is layer-2, so it will pass both tagged and untagged traffic
<babbageclunk> dimitern: ok
<dimitern> babbageclunk: on the node page in maas though, you need to have a physical NIC on the untagged vlan and 1 or more vlan NICs
<babbageclunk> dimitern: for the rack controller?
<babbageclunk> dimitern: yes, I've got those.
<dimitern> babbageclunk: can you paste e/n/i from your rack controller?
<dimitern> babbageclunk: where are the nodes and rack ctrl located?
<dimitern> babbageclunk: if the kvm nodes are on the rack ctrl machine, you need bridges there as well; if the rack ctrl is itself a kvm and it's sitting on your machine, along with all the nodes
<babbageclunk> dimitern: Ooh, looking back at your eni example I think I see the problem
<babbageclunk> dimitern: (well, a problem)
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17139880/
<dimitern> babbageclunk: ...then you need to enable forwarding on your machine as well, and the bridges will be on your machine
<babbageclunk> dimitern: I've left off the /24 part of the address
<babbageclunk> dimitern: The rack controller and the nodes are all kvms on my machine/
<dimitern> babbageclunk: yeah! :) it's either e.g. /24 or netmask is needed
<dimitern> babbageclunk: omitting both means /32 IIRC
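For reference, a correct stanza along those lines might look like this (a sketch with hypothetical device and address; either the CIDR suffix or an explicit netmask line is needed, since omitting both is treated as /32):

```
auto eth0.10
iface eth0.10 inet static
    address 192.168.10.1/24
    vlan-raw-device eth0
```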
<babbageclunk> dimitern: ok, fixing and bouncing
<babbageclunk> dimitern: oops, meeting
<babbageclunk> dimitern: thanks again! I owe you about an infinitude of beers by this poing.
<babbageclunk> gah, point
<dimitern> babbageclunk: heh, keep 'em coming :D
<dimitern> anastasiamac: meeting?
<dimitern> anastasiamac: can you try again pushing?
<anastasiamac> dimitern: fatal: unable to access 'https://github.com/juju/juju/': The requested URL returned error: 403
<anastasiamac> dimitern: ;(
<voidspace> anastasiamac: so you shouldn't be able to push to juju/juju - so why is it trying?
<anastasiamac> dimitern: ofco - i have not created a branch \o/
<anastasiamac> thank you!! like i said - one of these days :D
<dimitern> anastasiamac: :)
<babbageclunk> dimitern, that didn't help. :(
<dimitern> babbageclunk: adding the /24 ?
<babbageclunk> dimitern: yup
<babbageclunk> dimitern: What were you saying about bridging above? If the controller and the nodes are all sibling kvms do I need bridges in the controller eni?
<babbageclunk> dimitern: There are bridges in the node ENIs, but I don't know whether they're needed or working.
<dimitern> babbageclunk: juju created those
<babbageclunk> dimitern: Ah, right.
<dimitern> babbageclunk: but yeah, the libvirt bridges where both maas and the nodes connect to are on your machine I guess then
<dimitern> babbageclunk: how are those bridges (networks) configured?
<dimitern> babbageclunk: in vmm
<dimitern> babbageclunk: hmm also - is there anything in /e/n/interfaces.d/ on the rack machine?
<babbageclunk> dimitern: my host eni is basically empty - presumably handled by something else?
<babbageclunk> dimitern: no, nothing in the rack controller interfaces.d
<dimitern> babbageclunk: yeah, usually network manager handles that
<dimitern> babbageclunk: re rack's e/n/i.d/ - ok; please paste the output of `virsh net-list` and `virsh net-dumpxml <name>` for each of those?
<dimitern> (or only those relevant to that maas rack)
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17141198/
<dimitern> babbageclunk: ok, so your maas2 network on ens3 on the rack, but what's 10.10.0.0/24 then?
<dimitern> s/network on/network is on/
<dimitern> babbageclunk: that looks like the issue - eth1.10 on the rack
<dimitern> babbageclunk: is this the pxe subnet for the nodes?
<babbageclunk> dimitern: That's a holdover from a previous space experiment (I think it was for reproducing a bug maybe?) - should I just get rid of it?
<babbageclunk> dimitern: Don't understand your last question.
<babbageclunk> dimitern: Is what the pxe subnet?
<mup> Bug #1590732 opened: unnecessary internal state tests for networking <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1590732>
<dimitern> babbageclunk: the pxe subnet is the one the nodes boot from, it cannot be on a tagged vlan; paste the output of `maas <profile> interfaces read <id>`, for the node you were unable to ping?
<dimitern> babbageclunk: it the node interfaces table in the ui the pxe interface is marked with a selected radio button
<dimitern> s/it/in/
<babbageclunk> dimitern: hang on, I rebooted the controller after the eni change.
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17141467/
<babbageclunk> dimitern: pxe's always on the physical interface.
<dimitern> babbageclunk: ok, thanks, I think I see a few things that might be the cause
<babbageclunk> dimitern: Or do you mean there can't be any tagged vlans hanging off the pxe interface?
<dimitern> babbageclunk: I think the gateway_ip for the 192.168.150.0/24 subnet is wrong
<babbageclunk> dimitern: Really? Wouldn't that have stopped them from getting to the internet?
<babbageclunk> dimitern: The 150 subnet was set up automatically at MAAS install time.
<dimitern> babbageclunk: I believe the maas rack's ip should be the gateway for the nodes (i.e. gateway ip on the 192.168.150.0/24 subnet in maas), as the .1 address is the libvirt bridge
<dimitern> babbageclunk: the rack itself needs to use the .1 ip on ens3 as gateway
<dimitern> babbageclunk: i.e. try updating the 192.168.150.0/24 subnet to use 192.168.150.2 as gateway ip, and then deploy a node directly from maas (no juju) and see how it comes up and whether you can ping its IP from maas and vs
<dimitern> vv.
<babbageclunk> dimitern: ok. (vv?)
<dimitern> babbageclunk: sorry; vice verse (?)
<babbageclunk> dimitern: Ah, right.
<dimitern> I was sure it's written as versa but erc syntax check underlines it..
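The suggested gateway change can also be made from the CLI rather than the UI; roughly (assuming the MAAS 2.0 `subnet update` endpoint, with a placeholder profile name in the style used earlier in the log):

```
maas <profile> subnet update 192.168.150.0/24 gateway_ip=192.168.150.2
```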
<babbageclunk> dimitern: I can't change the subnet while things are using it, I'll have to kill my juju nodes.
<babbageclunk> dimitern: Oh, maybe that's just a limitation in the ui?
<dimitern> babbageclunk: ah, I suppose it's like that for safety
<babbageclunk> Do you think I should blow away my maas install and start again entirely clean?
<dimitern> babbageclunk: no need to go that far I think - just release the nodes, update the subnet's gw and try deploying I guess should fix it, hopefully
<babbageclunk> dimitern: ok, doing that now.
<dimitern> babbageclunk: I need to go out btw.. but should be back in 1h or so
<babbageclunk> dimitern: ok - I'm going to go for a run soon too. Thanks.
<dimitern> babbageclunk: np, good luck :) and I'll ping you when I'm back
<babbageclunk> dimitern: :)
<babbageclunk> dimitern: you're out, but yay! I deployed a machine and can ping it from the controller on the vlan IP and vice versa.
<babbageclunk> dimitern: rerunning my bootstrap now and going for a run.
<natefinch> jam: you around?
<rogpeppe1> anyone wanna review this, which deprecates some fields that wanted to go a long time ago in charm.Meta? https://github.com/juju/charm/pull/212
<natefinch> rogpeppe1: LGTM with a couple doc tweaks
<rogpeppe1> natefinch: thanks
<dimitern> babbageclunk: nice!
<mup> Bug #1590821 opened: juju deploy output is misleading <juju-core:New> <https://launchpad.net/bugs/1590821>
<babbageclunk> dimitern: Hmm - now juju run is hanging on me.
<babbageclunk> dimitern: huh - correction, juju run --unit hangs. juju run --machine is working fine.
<dimitern> babbageclunk: try with --debug ?
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17145175/
<babbageclunk> dimitern: Ooh, I saw something in the juju run docs about show-action-status
<dimitern> babbageclunk: try `juju run --unit haproxy/0 -- 'echo hi'`
<babbageclunk> dimitern: that shows the unit-level jobs I've been running as pending, while the machine ones are completed.
<dimitern> babbageclunk: the extra args are confusing 'juju run' I think
<babbageclunk> dimitern: no difference
<dimitern> babbageclunk: ok, so if the unit agent hasn't set the status to started and it's still pending, something didn't work
<dimitern> babbageclunk: try looking for issues in the unit-haproxy-0.log ?
<babbageclunk> dimitern: I did have some work at the start! I can see them completed in the show-action-status output.
<babbageclunk> dimitern: Ok, checking the log
<dimitern> babbageclunk: since juju run is actually handled by the uniter, it needs to run in order to do anything
<babbageclunk> dimitern: At the end of the log I see a lot of "leadership-tracker" manifold worker returned unexpected error: leadership failure: lease manager stopped
<dimitern> babbageclunk: hmm
<dimitern> babbageclunk: this sounds bad, but it might be orthogonal.. any other errors?
<babbageclunk> dimitern: no, nothing that sounds exciting. Blocks of http://pastebin.ubuntu.com/17145415/
<babbageclunk> dimitern: periodically
<dimitern> babbageclunk: ok, how about the output of juju show-machine?
<babbageclunk> dimitern: http://pastebin.ubuntu.com/17145473/
<dimitern> babbageclunk: hmm
<dimitern> babbageclunk: well, the only thing left to check is `juju status --format=yaml` I guess
<babbageclunk> dimitern: I'll look in the controller logs too.
<dimitern> babbageclunk: yeah, it might help if e.g. the spaces discovery wasn't completed by the time the unit started to get deployed..
<babbageclunk> dimitern: nothing interesting in juju status --format=yaml
<dimitern> babbageclunk: wait a sec... why is machine 0's address 192.168.10.2? I'd have expected to see 192.168.150.x?
<dimitern> babbageclunk: wasn't the .10. subnet on a tagged vlan 10 ?
<dimitern> babbageclunk: or maybe it still is, but just the .10. address sorted before the .150. one and was picked as "preferred private address"..
<babbageclunk> dimitern: yeah, it is on the vlan
<babbageclunk> dimitern: I had to run sshuttle for juju ssh to work.
<babbageclunk> dimitern: .10 is the internal vlan
<dimitern> babbageclunk: do you mind a quick HO + screen share - it will be quicker to diagnose
<babbageclunk> ok
<babbageclunk> dimitern: juju-sapphire?
<dimitern> babbageclunk: joining yesterday's standup
<dimitern> yeah
<dimitern> babbageclunk: you appear frozen..
<babbageclunk> dimitern: yeah, my whole machine hung
<dimitern> ooh :/
<dimitern> babbageclunk: same thing.. you might be having the same issues as I had before downgrading the nvidia driver from proposed to stable
<babbageclunk> dimitern: yay, back again!
<babbageclunk> Ok, maybe not screen sharing - what about tmate?
<dimitern> I haven't used it
<frobware> dooferlad: ping
<dooferlad> frobware: two minutes
<frobware> dooferlad: can you jump into sapphire HO when ready - thx
<frobware> dooferlad: (reverse-i-search)`check': git checkout f0b4d55bd98e5d1a9089399dc7ecee2c75ecc6a8 add-juju-bridge.py
<natefinch> ahh state tests.... my old nemesis
<perrito666> heh, I had time for lunch with dessert and possibly a coffee and the suite is still running
<natefinch> man, my tests do not like to run in parallel... apiserver and state tests both barf when I run all tests, but run fine if I run them by themselves.
<natefinch> I wish we had a "please test my branch on CI because I don't trust the tests running on my own machine" button
<perrito666> natefinch: I use the other machine
<cherylj> hey cmars - were you able to access that windows system from sinzui  for bug 1581157?
<mup> Bug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dave-cheney> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1581157>
<cmars> cherylj, haven't tried it yet
<cmars> cherylj, ok, i can rdp into it
<cmars> cherylj, how would i reproduce the hang there? not familiar with the CI setup
<cherylj> cmars: what I've done in the past is looked at what the GOPATH is
<cherylj> to see where the src might be
<cherylj> is GOPATH set?
<cmars> cherylj, it is, but there's no $env:GOPATH/src
<cherylj> cmars: okay, then you'll need to scp a tar of the src
<cherylj> I've used the ones generated by CI, one sec
<cmars> cherylj, hmm.. i could just use go get to grab the source
<cherylj> cmars, it's easier to get the tarball :)
<cherylj> in my experience anyway
<cherylj> http://data.vapour.ws/juju-ci/products/version-4043/build-revision/build-4043/juju-core_2.0-beta9.tar.gz
<cherylj> get that then scp it to the windows machine
<natefinch> I wish juju status just had a -v to alias --format=yaml
<cherylj> cmars: I had to use a path like this in the scp:  $ scp file_windows.go Administrator@<host>:/cygdrive/c/Users/Administrator
<natefinch> ha: https://bugs.launchpad.net/juju-core/+bug/1575310
<mup> Bug #1575310: Add "juju status --verbose". <feature> <juju-release-support> <status> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1575310>
<natefinch> Anyone have an inordinate amount of free time?  I need a review: http://reviews.vapour.ws/r/5027/    Files changed: 89  +547 -5744
<alexisb> ericsnow, ^^^
<ericsnow> natefinch: looking
<natefinch> it's all pretty mechanical, honestly
<cmars> cherylj, ok. i don't know how to untar on windows, but go get worked fine. i've got `go test ./cmd/jujud/agent` running in a loop, 30 times, on that machine. it's tee'ing output to a log file, will check it after lunch
<cmars> it's running against latest master
<cherylj> cmars: yeah, you have to use that '7z' utility that sinzui mentioned in the email
<mup> Bug #1590909 opened: github.com/juju/juju/state: undefined: cloudSettingsC <juju-core:New> <https://launchpad.net/bugs/1590909>
<natefinch> hmm.... I don't think juju deploy mysql --to lxd is working
<natefinch> ericsnow: is lxd as a container type supposed to work on trusty machnies?
<ericsnow> natefinch: I'd expect it to (given LXD is set up correctly)
<ericsnow> natefinch: and Go is 1.3+
<natefinch> ericsnow: I mean, like, deploy mysql --to lxd  in AWS,  ... should spin up a new machine and deploy a lxd container to it
<natefinch> and deploy mysql to that container
<ericsnow> natefinch: not sure
<cherylj> natefinch: it should work, but only if the container is also trusty
<cherylj> erm, maybe
<natefinch> ericsnow: so far, looks like no... juju add-machine lxd works
<cherylj> that's how it was with lxc anyway
<cherylj> n/m me
<cherylj> heh
<natefinch> lemme try something that uses xeniel
<natefinch> xenial
<natefinch> ...that's a no.
<natefinch> it never creates the base machine
<natefinch> damn
<natefinch> brb
<redir> sigh nil vs non-nil mismatch; obtained <nil>; expected <nil>
<ericsnow> natefinch: fix-it-then-ship-it
<natefinch> sinzui: do we test deploying to containers in CI?
<natefinch> mgz: ^
<sinzui> natefinch: all the time
<sinzui> natefinch: lxd network still fails
<natefinch> sinzui: ahh... umm.. shouldn't that be like a blocking bug or something?
<sinzui> natefinch: we bring it up several times a week. We are told it isn't as hot as the other bugs... but I am sure when lxc is removed, it will be hot
<natefinch> sinzui:  lol, yeah... I'm hesitant to remove lxc if we have no replacement (other than kvm)
<sinzui> natefinch: kvm isn't a replacement because public clouds don't support it
<natefinch> sinzui: ahh, well doubly so then
<sinzui> natefinch: We have bundles that deploy to lxd. the workloads work. lxd mostly works. its networking is broken though. juju cannot ssh into it as it can with lxc.
<natefinch> sinzui: ahh, ok... my current branch doesn't work, so that's interesting.  I'll retry with master to make sure I know what it expect
<natefinch> sinzui: by doesn't work, I mean that if you deploy, no base machine is ever created, so the container is never created, so the service is never deployed.  But it sounds like that's probably my own fault on my branch
<mgz> natefinch: specific things are still an issue with lxd, but the common stuff works. you may either have borked something in your branch or have hit a specific sequence of steps that don't
<mgz> most of lxd tests just throw a bundle at juju, only one or two use add-machine/--to
<natefinch> mgz: since I changed 89 files around container stuff, it's probably my fault.  I'll double check
<natefinch> hmmm ERROR failed to bootstrap model: cannot start bootstrap instance: cannot run instances: Request limit exceeded. (RequestLimitExceeded)
<natefinch> wonder if my previous aws deployment was retrying the machine deployment too much
<natefinch> from previous controller: machine-0: 2016-06-09 18:05:00 WARNING juju.apiserver.provisioner provisioninginfo.go:526 encountered cannot read index data, cannot read data for source "default cloud images" at URL https://streams.canonical.com/juju/images/releases/streams/v1/index.sjson: openpgp: signature made by unknown entity while getting published images metadata from default cloud images
<natefinch> unit mysql/0 cannot get assigned machine: unit "mysql/0" is not assigned to a machine
<natefinch> that is uh, not the most useful error stack
<natefinch> reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG"  ... is this our default logging level now?
<perrito666> bbl
<natefinch> that seems... odd
<mup> Bug #1590947 opened: TestCertificateUpdateWorkerUpdatesCertificate failures on windows <juju-core:New> <https://launchpad.net/bugs/1590947>
<cherylj> natefinch: afaik, it's always been at that logging level
<natefinch> cherylj: weird... maybe I've had a custom logging level set in my environments.yaml for so long that I didn't realize.  It seems like a crazy log level
<cherylj> yes
<cherylj> I agree
<natefinch> cherylj: I mean... not showing info drops a lot of useful context on the floor... and unit=debug? What?   I'll file a bug.
<natefinch> does unit even work?  would't it need to be juju.unit?
<natefinch> mgz, sinzui: FWIW, juju deploy ubuntu --to lxd does not create base machines using master, at least for me (just tried on GCE since AWS was mad at me)
 * natefinch files another bug
<sinzui> natefinch: mad at all of us. none of us could launch an instance in us-east-1 for the last two hours
<natefinch> sinzui: ahh, ok, I thought it was something my code had triggered accidentally
<mup> Bug #1590958 opened: Juju's default logging level is bizarre <juju-core:New> <https://launchpad.net/bugs/1590958>
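For anyone hitting this: the level is configurable per model, e.g. roughly as below (assuming the 2.0-beta-era set-model-config spelling; the logging-config key syntax is the one quoted earlier in the log):

```
juju set-model-config logging-config="<root>=INFO;unit=DEBUG"
```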
<natefinch> btw, looks like add-machine lxd works, and I can then deploy --to 0/lxd/0 ... it's just the straight deploy foo --to lxd  that isn't working
<mgz> natefinch: yeah, that sounds possible, we may just not do that in functional tests
<natefinch> mgz: yep
<natefinch> gotta run, will bbl
<mup> Bug #1590960 opened: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1590960>
<mup> Bug #1590961 opened: Need consistent resolvable name for units <juju-core:New> <https://launchpad.net/bugs/1590961>
<wallyworld> katco: looks like you have a plan for capturing CLI interactions?
<katco> wallyworld: CLI interactions? no, API requests
<wallyworld> ah sorry, yeah
<wallyworld> that's what i meant
<katco> wallyworld: it looks like the RPC stuff already has a concept of listening to what goes on, but unfortunately it's constrained to a single type
<katco> wallyworld: so i'm expanding on that a little
<wallyworld> katco: could you outline your idea and email the tech board just to ensure they are in the loop
<katco> wallyworld: ...seriously? i don't think this is a radical change...
<wallyworld> ok. i was just a little cautious messing with the rpc stuff but i guess it's ok
<katco> wallyworld: i can email them. i don't think it's a breaking change. it's probably easier to just point them at a diff though
<wallyworld> katco: so the plan is to use export the RequestNotifier and use that?
<katco> wallyworld: no, the plan is to support multiple observers, one of which will remain the RequestNotifier
<katco> wallyworld: the other one will be the audit observer
<wallyworld> ok, and you need to export RequestNotifier so you can manage it as an observer
<katco> wallyworld: well it's in an internal package, so its scope isn't any different
<katco> wallyworld: but i wanted to encapsulate all the observers so they're not cluttering apiserver
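The shape katco describes is roughly the following (a sketch with hypothetical names, loosely based on juju's rpc observer hooks): a single Observer interface, with a multiplexer fanning each notification out to the RequestNotifier, the audit observer, and anything else registered.

```go
// Observer receives RPC server events; the multiplexer lets several
// observers listen without the apiserver knowing about each one.
type Observer interface {
	ServerRequest(hdr *rpc.Header, body interface{})
	ServerReply(req rpc.Request, hdr *rpc.Header, body interface{})
}

type multiplexer struct {
	observers []Observer
}

func (m *multiplexer) ServerRequest(hdr *rpc.Header, body interface{}) {
	for _, o := range m.observers {
		o.ServerRequest(hdr, body)
	}
}

func (m *multiplexer) ServerReply(req rpc.Request, hdr *rpc.Header, body interface{}) {
	for _, o := range m.observers {
		o.ServerReply(req, hdr, body)
	}
}
```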
<wallyworld> ok, thanks for clarifying, just trying to get up to speed
<mup> Bug #1590909 changed: github.com/juju/juju/state: undefined: cloudSettingsC <juju-ci-tools:New> <https://launchpad.net/bugs/1590909>
<katco> wallyworld: np, lmk if you have any more qs
<wallyworld> will do ty
<wallyworld> gave your pr +1
<katco> wallyworld: ta
<wallyworld> rick_h_: you around?
<alexisb> wallyworld, rick_h_is out today
<wallyworld> ah, ok
<alexisb> wallyworld, send him mail
<wallyworld> yep, will do
<alexisb> he has been responding occasionally
<thumper> wallyworld: for status I am using ModelName() and ControllerName() to get the local names, but how can I get cloud?
<wallyworld> thumper: get ControllerDetails()
<wallyworld> that should have cloud in it IIRC
<thumper> k
<wallyworld> yep, and it has region
<thumper> gah
<thumper> JujuConnSuite is a PITA
<thumper> where are the tools it can use defined?
<wallyworld> jeez, you just worked that out
<thumper> I need a defined version number
<thumper> a known, static version number
<wallyworld> thumper: there's UploadFakeTools helpers maybe
<wallyworld> i think that's what stuff uses
<thumper> ugh...
<thumper> too much hassle
<wallyworld> it's ugly
<thumper> just let me choose a known version
<thumper> do you recall where they are?
<wallyworld> patch current.Version
<wallyworld> i can look
<perrito666> great, my ISP cannot properly route me to argentinian ubuntu mirrors but can to US ones
<perrito666> yay the upgrade to xenial sort of worked
<anastasiamac> "sort of" :)
<perrito666> I had to finish it  the old way
<anastasiamac> \o/ there is wisdom in old ways :)
#juju-dev 2016-06-10
<redir> so I am back to: value of (*params.Error) is nil, but a typed nil
<redir> :/
<redir> nm
<axw> wallyworld: I'm around now, let me know when you want to chat (can wait till 1:1 if you like)
<wallyworld> axw: am just typing in PR, will push in a sec
<menn0> thumper: tools migration is going well so far. here's one change - several more on their way: http://reviews.vapour.ws/r/5033/
<wallyworld> axw: i have not reviewed or tested live yet http://reviews.vapour.ws/r/5034/ we can chat soon
<wallyworld> omfg those internal networking tests are a waste of time and a bitch to fix
<wallyworld> damn, am still missing one apiserver test too
<menn0> thumper: thanks
<redir> off to do dinner. bbiab
<natefinch> thumper: why is our default log level <root>=WARNING;unit=DEBUG ?
<thumper> because it is how we see what the units are doing
<thumper> unit logging is the output from hooks
<thumper> and always useful
<thumper> but you can explicitly turn it off
<natefinch> ....then why is it at debug?
<natefinch> also, I thought it was juju.unit ?  or is unit special?
<natefinch> also, why don't we show info by default?  defaulting to warning means we drop a ton of useful context on the floor, and make debugging production systems really difficult
<thumper> wallyworld: if you ignore all the hook failures... http://pastebin.ubuntu.com/17163593/
<thumper> natefinch: unit is special
<thumper> natefinch: we should probably change to default to INFO
<thumper> I have no real good reason why
<wallyworld> thumper: nice, were you going to split the charm url also?
<thumper> not in this branch
<wallyworld> lots of people want warning
<wallyworld> info is too verbose for them
<natefinch> wallyworld:  they're welcome to set it to warning, but I think Info is a more reasonable default
<wallyworld> depends who the audience is
<wallyworld> do we cater for developers or devops people?
<natefinch> wallyworld: not really.  We limit the amount of logs we store
<natefinch> wallyworld: and they can turn it down to warning if they want
<wallyworld> and we can turn it up if we want
<wallyworld> devop people i have met do not want lots of verbose logging
<natefinch> it's not verbose.  It's specifically not.
<wallyworld> but i have not talked to lots and lots of them
<natefinch> it's not debug... except unit, evidently :/
<thumper> but info is noise
<wallyworld> verbose is subjective
<wallyworld> yes it is noise
<wallyworld> they just want warnings
<natefinch> I have tried working with logs set to warning and it's basically unusable
<wallyworld> they just want to know when things go wrong
<natefinch> you can't tell WTF is going on
<thumper> natefinch: for us, yes
<wallyworld> unusable for you as a dev
<wallyworld> not unusable for a devop person
<natefinch> usable for anyone who wants to support the server and figure out what is wrong
<wallyworld> and that's the friction that always happens in these cases
<natefinch> I don't believe the devops people choosing warning know what they're talking about.
<thumper> wallyworld: http://reviews.vapour.ws/r/5035/
<wallyworld> you forgot the IMHO
<wallyworld> looking after i finish current queue
<axw> wallyworld: in all cases, APIPort use by the providers is only used in StartInstance. how about we just add it to StartInstanceParams for now?
<wallyworld> hmmm, that would work i think
<natefinch> .... some of them do.  The people (mostly internal to canonical) who have used juju a lot, sure.
<axw> wallyworld: we could do the same for controller-uuid, and then add another method to Environ to destroy all hosted models/resources
<axw> (passing in the controller UUID to that)
<wallyworld> axw: for now, i can just add controller uuid to setconfig params
<wallyworld> and do that next bit later
<axw> wallyworld: yep doesn't have to be in one hit, but I think that's how we can make it a bit cleaner
<wallyworld> +1
<wallyworld> one step at a time
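axw's suggestion amounts to something like this (a sketch; field placement is hypothetical): pass what StartInstance actually needs through its params rather than smuggling it via the environ config.

```go
// Hypothetical additions to StartInstanceParams, per the discussion:
type StartInstanceParams struct {
	// ... existing fields ...
	APIPort        int    // assumed: only consumed at instance-start time
	ControllerUUID string // assumed: likewise, e.g. for tagging resources
}
```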
<menn0> thumper: well that's gone a bit better than charms. tools migration worked first time once all the required infrastructure was in place.
<thumper> menn0: awesome
<thumper> wallyworld: I'm looking at breaking out the charm now rather than my normal friday afternoon thing
<wallyworld> rightio, almost starting a review
<wallyworld> axw: were we going to put region in controllers.yaml?
<axw> wallyworld: I already did
<axw> wallyworld: maybe we want to remove region from there? and just have it on the model?
<wallyworld> yep
<axw> cloud on controller, region on model
<wallyworld> yep
<axw> wallyworld: I added some comments to the diff
<axw> er review comments to your diff
<wallyworld> ty, looking
<wallyworld> axw: what's wrong with embedding that interface?
<axw> it's not what the interface is meant to be doing ...
<axw> wallyworld: its purpose is to get you a state.Model
<axw> not to get a model and model config and controller config
<wallyworld> sure, but i'm extending its behaviour
<axw> wallyworld: which defines its purpose
<wallyworld> an interface can do whatever methods you decide to put on it
<wallyworld> i should change its name i guess
<wallyworld> an Environ i think was from the old days when model was environ
<axw> wallyworld: no, I don't think you should change the name. the checkToolsAvailability function isn't even using the existing method on EnvironGetter AFAICS
<axw> wallyworld: separate responsibilities -> separate interfaces
<wallyworld> axw: it does because it passes it to GetEnviron
<axw> wallyworld: which expects a ConfigGetter, no?
<wallyworld> yes, or an interface that embeds that
<axw> wallyworld: so why would you wrap X in Y, only to pass X through to some other thing? that is pointless
<axw> and makes it unclear what the function really needs
<axw> it doesn't need the Model() method, it only needs the ConfigGetter part
<wallyworld> it means we pass in one param whose behaviour we use in the method body in various places. i can do a separate param if you want
<wallyworld> eg we pass in StateInterface in places and don't always use every method
<axw> wallyworld: yeah, that's a smell. we do that so we don't have to pass around a *state.State, which we used to
<wallyworld> but in this case, for the method being called directly, its logic does use every method on the interface
<axw> less smelly, but still a smell
<axw> wallyworld: checkToolsAvailability doesn't. updateToolsAvailability does
<axw> updateToolsAvailability should take two things: an interface for getting the current config (ConfigGetter), and an interface for updating the model (EnvironGetter)
<axw> checkToolsAvailability only needs a ConfigGetter
<wallyworld> ah, damn, i may have been dyslexic
<wallyworld> i think i was confusing two method names as the same thing
<wallyworld> ffs
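The split axw is arguing for looks roughly like this (interface names from the chat; method signatures are assumptions): each function demands only the capability it actually uses.

```go
// ConfigGetter: just enough to read the current model config.
type ConfigGetter interface {
	ModelConfig() (*config.Config, error)
}

// EnvironGetter: just enough to get at the model for updates.
type EnvironGetter interface {
	Model() (*state.Model, error)
}

// checkToolsAvailability would then take only a ConfigGetter, while
// updateToolsAvailability takes both (hypothetical signatures):
//
//	func checkToolsAvailability(g ConfigGetter, ...) (version.Number, error)
//	func updateToolsAvailability(g ConfigGetter, e EnvironGetter, ...) error
```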
<axw> wallyworld: am I making this login thing a critical/blocker to land?
<wallyworld> sure
<axw> thumper, wallyworld: do we really want to repeat the cloud name for each model? they are always going to be the same
<axw> (in status)
<wallyworld> i had read that as cloud region
<thumper> axw: that's what was asked for
<wallyworld> damn, dyslexic again
<thumper> and it isn't always the same
<thumper> if I have different models, they won't necessarily be in the same controller or cloud
<thumper> hmm...
<wallyworld> true, for the aggregated case
<axw> thumper: we're going to show models for multiple controllers?
<axw> I don't think so...
<thumper> um...
<axw> thumper: OTOH it would be useful to see at a glance from a snapshot of status which cloud
<thumper> perhaps I'm no longer clear what you are talking about
<axw> thumper: if I run "juju status", I'm seeing all the models for one controller
<thumper> um...
<axw> thumper: ah hm never mind
<thumper> if you run juju status, you only see one model
<axw> thumper: yep, forget me. that makes sense
<wallyworld> axw: one of your comments is blank, so the ditto beneath it makes no sense
<axw> wallyworld: ignore ditto sorry. I (tried to) delete a comment after I answered my own question
<wallyworld> ok
<wallyworld> axw: i've left two issues open but answered the questions....
<menn0> thumper: tools migration done: http://reviews.vapour.ws/r/5036/
<axw> wallyworld: "no, different models will want to use their own logging levels on the agents" -- the controller agent(s) manage multiple models\
<wallyworld> axw: so a machine agent on a worker for model 1 will want to log differently from an agent for model 2
<wallyworld> model 1 and model 2 should have their own logging-config right?
<axw> wallyworld: I'm talking about the controller
<axw> wallyworld: they are the same agent
<wallyworld> sure, but not on worker nodes
<axw> fair point about other workers tho
<natefinch> if anyone's feeling ambitious, this is a mostly mechanical change, to drop lxc support and use lxd in its place: http://reviews.vapour.ws/r/5027/
<axw> wallyworld: I guess we shouldn't constrain it to how it works today anyway. it would be nice if it weren't global. we could have each worker in the controller take a logger with levels configured for the model
<axw> wallyworld: so I'll drop
<wallyworld> natefinch: any progress on the --to lxd issue?
<wallyworld> you have a +1 from eric right?
<natefinch> wallyworld: I do have a +1 from eric, yes.... do we need 2 +1's now?
<davecheney> func (fw *Firewaller) flushUnits(unitds []*unitData) error {
<davecheney>   // flushUnits opens and closes ports for the passed unit data.
<wallyworld> not if i have anything to do with it - except for when the reviewer feels like they need a second opinion
<davecheney> worst, name. ever
<natefinch> wallyworld: also, no, I don't have an idea about the lxd thing... my guess is that it's a switch statement that we forgot to add lxd to
<wallyworld> natefinch: so we can land this and then fix the other issue before release
<axw> wallyworld: going for lunch then fixing car, will finish review later
<wallyworld> axw: np, ty
<wallyworld> i'll start on the next bits
<natefinch> wallyworld: master is blocked, and this doesn't have a bug, AFAIK
<wallyworld> natefinch: either jfdi or create a bug - i have been jfdi
<wallyworld> we need this work for release
<wallyworld> natefinch: ah but wait
<wallyworld> we can't land until deploy --to is fixed
<wallyworld> because it will break QA
<wallyworld> doh
<natefinch> right, ok. I'll do that first
<wallyworld> ty
<natefinch> gotta catch up on sleep, will figure it out in the morning.  Seems like it's probably something pretty dumb.
<redir> wallyworld: axw whomever pr is in http://reviews.vapour.ws/r/5037/
<redir> Be back in the local AM.
<wallyworld> ty
<davechen1y> https://github.com/juju/juju/pull/5594
<davechen1y> ^ anyone experienced with the firewaller, this is a small fix as a prereq for 1590161
<axw> wallyworld: reviewed
<wallyworld> axw: ty
<axw> wallyworld: you did a half change in your previous PR, you called the doc "defaultModelSettings" but the method is called "CloudConfig" still. shall I change it to DefaultModelConfig? sounds a bit off -- like it's config for a default model. maybe ModelConfigDefaults?
<wallyworld> axw: yeah, sounds good ty
<frobware> dimitern: ping
<dimitern> frobware: pong
<frobware> dimitern: was just about the resolv.conf issue
<dimitern> frobware: I was looking at those bugs
<dimitern> frobware: trying to reproduce now with lxd on 1.9.3
<frobware> dimitern: I can help out in a bit - just trying to stash some stuff but in a meaningful state.
<dimitern> frobware: ok
<dimitern> frobware: no luck reproducing this so far :/
<frobware> dimitern: sounds like the whole of my yesterday :/
<dimitern> frobware: (that is, if the lxds even come up ok)
<frobware> dimitern: oh?
<dimitern> frobware: I noticed on machine-0 there was an issue and all 3 lxds came up with 10.0.0.x addresses
<frobware> dimitern: heh, that caught me out this morning. they are on the LXD bridge.
<frobware> dimitern: when we probe for an unused subnet, that's pretty much the default address you'll get as there's not much else, network-wise, running
<dimitern> frobware: yeah, the issue is due to a race between setting the observed machine config with the created bridges and containers starting to deploy and trying to bridge their nics to yet-to-be-created host bridges
 * frobware notes that his git stash list has grown to a depth of 32...
<frobware> dimitern: explain that one to me in standup :)
<dimitern> frobware: otoh, if the bridges are created ok, lxds come up as expected with all NICs, and /e/resolv.conf has both nameserver and search (i.e. ping node-5 and ping node-5.maas-19 both work)
<dimitern> frobware: sure :)
<frobware> dimitern: standup
<fwereade> voidspace, http://reviews.vapour.ws/r/5029/
<frobware> dimitern: regarding resolv.conf. we did a change way back to copy the /etc/resolv.conf from the host. is it possible that it is triggering that path but the host has no valid entry (not for you, but the bug reporter)
<dimitern> frobware: it's very much guaranteed that container's resolv.conf will be broken if their host's resolv.conf is also broken
<dimitern> frobware: btw commented on that bug for '--to lxd'
<dimitern> mgz: hey
<dimitern> mgz: are there any places in the CI tests which do the equivalent of 'juju deploy xyz --to lxd' ?
<dimitern> mgz: if there are any, it should be because there is a machine with hostname 'lxd' that's the intended target
<babbageclunk> dimitern: is it actually ambiguous? Can you use a maas-level machine name there instead of a juju-level machine number?
<dimitern> babbageclunk: of course you can
<dimitern> babbageclunk: unless your node happens to be called 'lxd'
<babbageclunk> dimitern: ok, just thought I'd check.
<dimitern> babbageclunk: actually... hmm - maybe only on maas I guess?
<dimitern> babbageclunk: placement is supposed to work with existing machines (including containers), or new containers on existing machines
<babbageclunk> dimitern: So is the bug really that --to lxd (or lxc or kvm) should be an error?
<dimitern> babbageclunk: it even supports a list when num-units > 1: `juju deploy ubuntu -n 3 --to 0,0/lxd/1,lxd:1`
<dimitern> babbageclunk: placement for deploy and add-machine/bootstrap is handled slightly differently
<dimitern> babbageclunk: for the latter you *can* use 'add-machine ... --to lxd' or 'bootstrap --to node-x' (on maas)
<babbageclunk> dimitern: yeah, I was getting confused between them - I've interacted with add-machine and bootstrap more.
<dimitern> babbageclunk: that's an inconsistency though
<dimitern> babbageclunk: add-machine can do more than that - e.g. add-machine ssh:10.20.30.2
<dimitern> babbageclunk: bootstrap --to lxd at least fails with `error: unsupported bootstrap placement directive "lxd"`
<dimitern> babbageclunk: so it looks like a maas provider issue - it implements PrecheckInstance (called by state at AddMachine time), but apparently not very well
<babbageclunk> dimitern: Ok, that seems easy enough to fix.
<dimitern> babbageclunk: tell-tale comment on line 566 in provider/maas/environ.go: `// If there's no '=' delimiter, assume it's a node name.`
<dimitern> but doesn't bother to validate it
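A hedged sketch of what actually validating that branch might look like (the helper name and error handling are assumptions, not the provider's real code): instead of assuming any '='-free directive is a node name, check that such a node exists.

```go
func (env *maasEnviron) PrecheckInstance(series string, cons constraints.Value, placement string) error {
	if placement == "" {
		return nil
	}
	if !strings.Contains(placement, "=") {
		// Previously assumed to be a node name without checking;
		// nodeByHostname is a hypothetical lookup against MAAS.
		if _, err := env.nodeByHostname(placement); err != nil {
			return errors.Annotatef(err, "invalid placement %q", placement)
		}
	}
	return nil
}
```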
<dimitern> fwereade: hey
<dimitern> fwereade: I think we don't have a clear separation between deploy-time placement and provision-time placement (i.e. deploy --to X vs add-machine X)
<dimitern> fwereade: I might be wrong, but I think 'deploy ubuntu --to lxd' was never intended to work, unlike '--to lxd:2', '--to 0', or '--to 0/lxd/0'
<dimitern> frobware: how about if we pass a list of interfaces to bridge explicitly to the script?
<frobware> dimitern: sure; can we HO anyway as I have discovered some issues with lxd on aws
<dimitern> frobware: I was just about to have a quick bite - top of the hour?
<frobware> dimitern: or later if you want more time; that's only 20 mins
<frobware> dimitern: let's say ~1 hour and I'll go and eat too
<babbageclunk> frobware: I'm trying to add a machine to understand deploying to lxd better, but when I do add-machine it never goes from Deploying to Deployed in MAAS.
<dimitern> frobware: ok, sgtm
<frobware> babbageclunk: for that I think you'll have to dig into the MAAS logs.
<frobware> babbageclunk: oh, 2.0?
<dimitern> babbageclunk: trusty?
<babbageclunk> frobware: 2.0, xenial
<dimitern> babbageclunk: you run 'add-machine lxd' ?
<babbageclunk> frobware: just the machine, first - haven't gotten to deploy anything into a container.
<frobware> babbageclunk: I don't use 2.0 very much, if at all. Most of the bugs I'm looking at explicitly reference 1.9.x
<babbageclunk> frobware: no, add-machine --series=xenial
<babbageclunk> frobware: Any idea how I can get onto the machine? I think it's the network that's not coming up.
<frobware> babbageclunk: you can get to and see the console?
<babbageclunk> frobware: yeah, but I don't know login details.
<dimitern> babbageclunk: use vmm ?
<dimitern> if it's a kvm on your machine..
<babbageclunk> dimitern: what username/password though?
<frobware> babbageclunk: apply this http://pastebin.ubuntu.com/17167820/
<frobware> babbageclunk: (cd juju/provider/maas; make)
<dimitern> babbageclunk: none will work; 'ubuntu' but pwd auth is disabled
<frobware> babbageclunk: then build juju
<frobware> babbageclunk: then either start-over or run upgrade-juju and add another machine
<babbageclunk> frobware: hmm, I might try removing all of the vlans from the node first.
<frobware> babbageclunk: that's ^^ a useful exercise as it does allow you to login when we bork networking
<babbageclunk> frobware: ok, will try it.
<fwereade> dimitern, hey, sorry
<dimitern> frobware: what's up?
<fwereade> dimitern, in my understanding `--to lxd` means "hand over deployment to the notional lxd compute provider that spans the capable machines in your model"
<fwereade> dimitern, "I want it in a container, don't bother me with the details"
<dimitern> oops, sorry frobware
<dimitern> fwereade: well, why do we have container=lxd as a constraint then?
<fwereade> dimitern, hysterical raisins
<dimitern> fwereade: so 'juju deploy ubuntu --to lxd' is supposed to work exactly like 'juju add-machine lxd && juju deploy ubuntu --to X', where X is the 'created machine X' add-machine reports
<fwereade> dimitern, yes
<frobware> dimitern: hey, I kept working... can we sync after I have some lunch. :)
<dimitern> frobware: sure :)
<voidspace> dimitern: ping
<voidspace> dimitern: a quick sanity check. Every LinkLayerDevice should have a corresponding refs doc with a ref that defaults to 0. If non-zero the references are the number of devices that have this device as a parent (set in ParentName)?
<voidspace> dimitern: so a quick scan of the linklayerdevices counting parent references should enable me to reproduce it without having to directly migrate it.
<dimitern> voidspace: sorry, just got back
<dimitern> voidspace: yes, I think that's correct
<dimitern> voidspace: ah, well 'quick scan' could work but only if nothing else can add or remove stuff from the db while you do it
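The 'quick scan' could look roughly like this (an mgo sketch; the field names are guesses from the discussion, and as dimitern notes it is only safe while nothing else mutates the collection):

```go
// Recompute ref counts: each device's count is the number of devices
// naming it as parent.
refs := make(map[string]int)
var doc struct {
	Name       string `bson:"name"`
	ParentName string `bson:"parentname"`
}
iter := linkLayerDevices.Find(nil).Iter()
for iter.Next(&doc) {
	if doc.ParentName != "" {
		refs[doc.ParentName]++
	}
}
if err := iter.Close(); err != nil {
	return errors.Trace(err)
}
```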
<babbageclunk> frobware: I tried your patch after trying a few other things, but it seems like passwd -d ubuntu just makes it so that ubuntu can't login through the terminal.
<babbageclunk> frobware: trying it with chpasswd instead.
<frobware> babbageclunk: I use that alll the time
<babbageclunk> frobware: hmm. Definitely didn't let me log in.
<babbageclunk> frobware: maybe it's hanging before the bridgescript runs?
<dimitern> frobware, babbageclunk: the ubuntu account is locked usually
<frobware> babbageclunk: if that's the case my patch is either borked, or the bridgescript did not run
<dimitern> in the cloud images
<babbageclunk> frobware, dimitern - trying deploying from maas without juju.
<babbageclunk> frobware, dimitern - how does the bridgescript get run? Juju gives it to maas which runs it via cloud-init?
<frobware> babbageclunk: yep
<dimitern> babbageclunk: yeah, as a runcmd: in cloud-init user data
<frobware> dimitern: can we HO?
<dimitern> frobware: sure - omw
<dimitern> frobware: joined standup HO
<frobware> dimitern: heh, I was in the other one. omw
<babbageclunk> frobware, dimitern - ok, I see the same problem deploying with maas-only, so presumably the bridgescript never gets to run.
<dimitern> babbageclunk: is this with trusty on maas 2.0 ?
<babbageclunk> the install's paused for a long time with "Raise network interfaces", then it times out and continues, stopping at a login prompt - but that's before cloud-init runs.
<babbageclunk> dimitern: xenial on maas 2.0
<dimitern> babbageclunk: hmm well that's odd
<babbageclunk> dimitern: yeah. I'm going to kill off the vlans, that seems to trigger it. But I don't see why, since they didn't cause a problem before.
<babbageclunk> dimitern: Then at least I can try to understand the lxd deploy bug better without this getting in the way.
<dimitern> babbageclunk: sorry, otp
<natefinch> dimitern, dooferlad:  are you guys looking at the deploy --to lxd issue?  I had started looking at that last night, but didn't get very far. I need that to be fixed so I can land my code that removes all the lxc stuff
<dimitern> natefinch: yeah, I posted updates as well
<frobware> natefinch: we may need to reassign if not finished by EOD
<dimitern> natefinch: deploy --to lxc and --to lxd or --to kvm are equally broken, so it shouldn't block landing your patch
<dimitern> natefinch: side-note: I'm more concerned with removing the LXC container type as valid; wasn't there a discussion to still allow both 'lxd' and 'lxc' (but treat both the same as 'lxd') for backwards-compatibility with existing bundles?
<natefinch> dimitern: bundles will treat lxc like lxd, yes
<natefinch> dimitern: it's just everything else that is getting lxc removed
<dimitern> natefinch: ok then
<natefinch> dimitern: btw, I swear there used to be help text for --to lxc that said "deploy to a container on a new machine"
<natefinch> dimitern: but I don't see it now, so maybe I'm crazy
<dimitern> natefinch: if there was, it was never tested
<natefinch> dimitern: so are we fixing the bug that it doesn't immediately error out, or are we fixing the bug that it doesn't work?
<dimitern> natefinch: and I know for sure maas provider is not handling this as it should; not tried others
<cherylj> hey dimitern, should bug 1590689 be fixed in 1.25.6?
<mup> Bug #1590689: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time <cpec> <juju> <maas> <sts> <juju-core:Fix Committed> <juju-core 1.25:Triaged> <MAAS:Invalid> <https://launchpad.net/bugs/1590689>
<dimitern> cherylj: not without backporting the fix I linked to from master
<cherylj> dimitern: sorry, what I mean is, should we hold off releasing 1.25.6 until that gets done?
<dimitern> cherylj: oh, sorry not that one
<dimitern> cherylj: ah, yeah it *is* that one - and FWIW I think we should not release 1.25.6 without it
<cherylj> dimitern: is the backport already on your (or someone's) to do list?
<mup> Bug #1591225 opened: Generated image stream is not considered in bootstrap on private cloud <juju-core:Incomplete> <https://launchpad.net/bugs/1591225>
<dimitern> cherylj: not to my knowledge
<dimitern> cherylj: I could switch to that and propose it (I have too many things in progress..)
<cherylj> boy I know how that feels.
<cherylj> dimitern: I think we're still a couple days away from a 1.25.6, so maybe aim to have it in by Tuesday?
<dimitern> cherylj: that would be great!
<cherylj> thanks, dimitern!
<perrito666> bbl
<dimitern> frobware: guess what?
<frobware> it's broken
<frobware> dimitern: in beta6
<dimitern> frobware: nope :) it works just the same with beta6
<frobware> dimitern: sigh
<natefinch> dimitern: so are we fixing it so that deploy --to lxd errors out the way --to lxc does?  in my tests --to lxc says: "ERROR cannot add application "ubuntu3": unknown placement directive: lxc"
<dimitern> (...for a change)
<dimitern> natefinch: is that on maas btw?
<natefinch> dimitern: whereas --to lxd doesn't error out (but then never works either)
<dimitern> frobware: added a comment anyway
<frobware> dimitern: thx
<natefinch> dimitern: no.  I never test on maas. don't have one.  GCE.  but I can try aws if it's not still broken like it was yesterday
<natefinch> dimitern: it should be provider independent, though
<dimitern> natefinch: yeah, it *should*, but as it turns out it's not unfortunately
<natefinch> dimitern: I guess maas has that messed up "if it doesn't match anything else, let's assume it's a node" thing
<dimitern> natefinch: I'll do a quick test now how deploy --to lxc and lxd is handled on maas, gce, and aws
<natefinch> dimitern: I did GCE, so you can skip that one
<dimitern> natefinch: ok, I'll try azure then
<natefinch> dimitern: lxd and kvm behave the same - they both return no error, but then never create a machine either
<dimitern> natefinch: something just occurred to me.. lxd uses the 'lxd' as the default domain for container FQDNs
<dimitern> natefinch: it might be the reason why lxd is different
<natefinch> dimitern: I'm pretty sure a placement directive of just a container type is supposed to work: https://github.com/juju/juju/blob/master/instance/placement.go#L71
<dimitern> natefinch: yeah, but there's also the PrecheckInstance from the prechecker state policy, which is called while adding a machine
<dimitern> natefinch: hmm it looks like only maas is affected
<dimitern> natefinch: as all other providers expect '=' to be present in the placement or parsing fails
<dimitern> natefinch: or like joyent simply fails with placement != ""
<dimitern> cloudsigma doesn't even bother to do anything.. precheckInstance is { return nil }.. why implement it then?
<babbageclunk> dimitern, natefinch: I can see in the add-machine case where the decision to add a new machine with a container is made for lxc, I can't find anything corresponding to that in the deploy code.
<natefinch> ahh, add machine, that's where it is: juju add-machine lxd                  (starts a new machine with an lxd container)
<natefinch> I don't know why deploy would be any different
<babbageclunk> dimitern, natefinch: ooh - does State.addMachineWithPlacement need to grow a call to AddMachineInsideNewMachine to do it?
<babbageclunk> (in state/state.go:1249)
<katco> natefinch: standup time
<natefinch> katco: oops, thanks
<natefinch> babbageclunk: 1275
<dimitern> babbageclunk: the actual code deploy uses lives in juju/deploy.go
<babbageclunk> dimitern: Yeah, but that will only put a new container in an existing machine.
<babbageclunk> dimitern: vs this code from add-machine https://github.com/juju/juju/blob/master/apiserver/machinemanager/machinemanager.go#L158
<dimitern> natefinch: on AWS 'deploy ubuntu --to lxd' and --to lxc both appear to work, but neither adds a machine for the unit
<natefinch> dimitern: yeah, same for GCE
<dimitern> natefinch: so it looks consistently broken everywhere :)
<dimitern> I'd vote to reject '--to <container-type>' for deploy on its own (i.e. still allow '--to <ctype>:<id>')
<babbageclunk> dimitern: So the code from add-machine will create a new host with a container inside, but the deploy codepath won't because it doesn't call AddMachineInsideNewMachine.
<dimitern> until we can untangle the mess around it and make add-machine and deploy --to behave the same way
<dimitern> babbageclunk: yeah, because nobody thought about it too much I guess
<babbageclunk> dimitern: I think it's just an extra check in that function - if machineId is "", call AddMachineInsideNewMachine instead of AddMachineInsideMachine.
<babbageclunk> dimitern: testing it now
<dimitern> babbageclunk: that sounds correct
<dimitern> babbageclunk: but definitely *isn't* the way to fix the bug
<dimitern> babbageclunk: I mean.. this will allow deploy --to lxd to work, but it might also open a whole new can of worms on all providers
<babbageclunk> dimitern: I don't see why? (But I haven't been following the discussion closely.)
<dimitern> babbageclunk: e.g. deploy --to kvm on aws will start an instance but then fail to deploy the unit as kvm won't be supported
<babbageclunk> dimitern: Isn't that the same behaviour as add-machine kvm?
<dimitern> babbageclunk: similarly, --to lxd with 'default-series: precise' will similarly seem to pass initially, then fail as lxd is not supported on precise
<dimitern> babbageclunk: add-machine is similarly broken in those cases
<babbageclunk> dimitern: Isn't it worth doing this fix so add-machine and deploy behave in the same way (although both broken in the cases you describe)?
<dimitern> babbageclunk: add-machine accepts other things, e.g. ssh:user@hostname
<dimitern> babbageclunk: they still won't act the same
<dimitern> babbageclunk: but, at least they will be a step closer
<babbageclunk> dimitern: Yeah, it still seems like people expect them to work in the same way in this case.
<natefinch> they should be as consistent as possible
<dimitern> babbageclunk: ok, please ignore my previous rants then :) what you suggest is a good fix to have
 * dimitern is just twitchy about changing core behavior before the release..
<babbageclunk> dimitern: :) I mean, I think you're right that those cases are problems.
<dimitern> we should have a well-defined format for placement, which allows provider-specific scopes; e.g. deploy --to/add-machine <scope>:<args>; where <scope> := <container-type>|<provider-type>; <args> := <target>|<key>=<value>[,..]
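A toy parser for the format dimitern proposes, just to make the grammar concrete; the Placement struct and its fields are illustrative, not juju's actual instance.Placement:

    package main

    import (
        "fmt"
        "strings"
    )

    type Placement struct {
        Scope  string            // container type ("lxd", "kvm") or provider type ("maas", "ec2")
        Target string            // e.g. a machine id or hostname
        Args   map[string]string // e.g. zone=foo
    }

    func parsePlacement(directive string) (Placement, error) {
        p := Placement{Args: map[string]string{}}
        scope, rest, found := strings.Cut(directive, ":")
        p.Scope = scope
        if !found {
            return p, nil // bare scope, e.g. "lxd"
        }
        if !strings.Contains(rest, "=") {
            p.Target = rest // e.g. "lxd:4"
            return p, nil
        }
        for _, kv := range strings.Split(rest, ",") {
            k, v, ok := strings.Cut(kv, "=")
            if !ok {
                return p, fmt.Errorf("malformed key=value pair %q", kv)
            }
            p.Args[k] = v
        }
        return p, nil
    }

    func main() {
        for _, d := range []string{"lxd", "lxd:4", "maas:zone=foo,spaces=dmz"} {
            p, err := parsePlacement(d)
            fmt.Printf("%s -> %+v (err=%v)\n", d, p, err)
        }
    }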
<frobware> dimitern: in AWS with AA-FF why do we use static addresses and not dhcp?
<frobware> dimitern: in containers
<dimitern> frobware: because the FF
<frobware> dimitern: sure, but really asking why static in that case
<dimitern> frobware: i.e. the user asked for static IPs
<dimitern> frobware: we use dhcp otherwise
<dimitern> frobware: but the whole point of the FF and now the multi-NIC approach on maas has always been to have static IPs for containers
<frobware> dimitern: it was AWS I was questioning; the MAAS I can see because you can ask for static/dhcp there
<dimitern> frobware: you can on AWS as well
<dimitern> frobware: AssignPrivateIpAddress
<dimitern> http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_AssignPrivateIpAddresses.html
<dimitern> well not nearly equivalent to what maas offers.
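For reference, assigning a specific secondary private IP on AWS looks roughly like this with the stock aws-sdk-go bindings (juju used its own EC2 bindings at the time; the ENI id and address here are placeholders):

    package main

    import (
        "fmt"
        "log"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/ec2"
    )

    func main() {
        sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-east-1")))
        svc := ec2.New(sess)

        // Assign a specific secondary private IP to an instance's NIC,
        // e.g. the static address handed to a container.
        out, err := svc.AssignPrivateIpAddresses(&ec2.AssignPrivateIpAddressesInput{
            NetworkInterfaceId: aws.String("eni-0123456789abcdef0"), // placeholder ENI
            PrivateIpAddresses: []*string{aws.String("10.0.0.42")},  // placeholder address
        })
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(out)
    }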
<alexisb> natefinch, when you have five minutes I have a few qs
<dimitern> frobware: ping
<dimitern> frobware: here's my patch so far: http://paste.ubuntu.com/17174180/
<natefinch> alexisb: sure.
<dimitern> frobware: now testing on aws w/ && w/o AC-FF (xenial), and on maas-19 (t) / maas-20 (x)
<alexisb> https://hangouts.google.com/hangouts/_/canonical.com/juju-release
<alexisb> natefinch, ^^
<frobware> dimitern: it's nuts... all this manual testing we're BOTH doing... Grrr.
<alexisb> cherylj, feel free to crash the party
<dimitern> frobware: yeah..
<frobware> dimitern: your patch "so far" - does that mean use or wait?
<dimitern> frobware: so far only as long as the currently running make check passes
<dimitern> frobware: or if something comes up from the live tests (will be able to tell you shortly); otherwise I think I covered everything in what I pasted
<dimitern> frobware: yeah, I've missed a few tests in container/kvm
<alexisb> babbageclunk, dimitern: what is the consensus for a fix on lp 1590960 ??
<alexisb> lp1590960
<babbageclunk> alexisb: maybe bug 1590960? Or is mup sulking?
<mup> Bug #1590960: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1590960>
<alexisb> there we go :)
<babbageclunk> I've got a fix, tested manually, just finishing the unit test for it.
<dimitern> alexisb: we can fix deploy to work with --to <container-type>, but that's not what's blocking natefinch's patch LXC-to-LXD
<alexisb> dimitern, correct it is not blocking
<babbageclunk> Should be up for review in ~10 mins
<alexisb> but looking at this morning's discussion there seemed to be some different ideas about what should work with --to and what shouldn't
<alexisb> was just curious what the expected behavior should be
<cherylj> alexisb, natefinch looks like --to lxc is also a problem on 1.25:  https://bugs.launchpad.net/juju-core/+bug/1590960/comments/6
<mup> Bug #1590960: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1590960>
<dimitern> alexisb: that's the real issue: behavior was neither clearly defined nor tested
<alexisb> dimitern, exactly
<dimitern> alexisb: but it's sensible to expect deploy --to X to work like add-machine X does
<alexisb> dimitern, also agree
<natefinch> cherylj: an error is a lot better than silently half-working... but yeah, should be fixed to mirror add-machine
<dimitern> alexisb: and babbageclunk's fix should get us there
<natefinch> huzzah :)
<dimitern> but not all the way
<alexisb> dimitern, though a note to the juju-core team might be good so that we highlight the change and educate the team
<alexisb> babbageclunk, ^^
<dimitern> agreed
<babbageclunk> alexisb, dimitern: Clarifying - am I sending the note about this change?
<dimitern> babbageclunk: I'd appreciate if you do it, I can help clarifying something or other if you need though
<dimitern> frobware: so the patch didn't work for aws
<alexisb> babbageclunk, yeah just to the juju-core launchpad group
<frobware> dimitern: what happened?
<dimitern> frobware: ERROR juju.provisioner provisioner_task.go:681 cannot start instance for machine "0/lxd/0": missing
<dimitern> container network config
<dimitern> frobware: it slipped through somewhere.. looking
<frobware> dimitern: why do I think that's an existing bug... ?
<babbageclunk> dimitern, alexisb: Ok cool - I think I understand the wider issues now. Basically just that this will still do slightly weird things on clouds that don't support the container type, but at least that the add-machine and deploy behaviour is more consistent.
<alexisb> babbageclunk, yep
<alexisb> and we as a team should be clear on what the current behaviour is and the gaps, so we can both explain to users *and* make better decisions on what the behaviour should be
<alexisb> cmars, cherylj, do we have any progress on https://bugs.launchpad.net/juju-core/+bug/1581157
<mup> Bug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dave-cheney> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1581157>
<dimitern> frobware: nope, it was a warning before
<dimitern> frobware: I'll need to add a few more tweaks to the patch and will resend
<cherylj> alexisb: I haven't heard anything from cmars about it
<mup> Bug #1591290 opened: serverSuite.TestStop unexpected error <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1591290>
<dimitern> frobware: fixed patch: http://paste.ubuntu.com/17177501/
<dimitern> frobware: should now work ok on AWS (testing again); all unit tests fixed
<dimitern> frobware: I probably should've proposed it rather than bugging you with it :/
<natefinch> dimitern: you mention placement strings with = in them in that bug... but placement strings don't use = AFAIK?  placement is like --to 0/lxc/0 or --to lxd:4  maybe you mean constraints?
<dimitern> natefinch: on maas you can do --to zone=foo
<dimitern> natefinch: and I think most others support zone= as well
<dimitern> natefinch: see, it's confusing :)
<natefinch> dimitern: gah, zone should be a constraint :/
<natefinch> well... maybe not
<natefinch> I guess constraints are for all units of a service
<natefinch> still... weird
<dimitern> natefinch: yeah, it can't be useful as a constraint if we're to do automatic zone distribution
<natefinch> dimitern: right (sometimes you might not want them distributed, but that's the exception).  Anyway... many valid placements do not use =... like specifying containers or machines
<dimitern> natefinch: there's also a container=lxd constraint btw, hardly tested
<babbageclunk> natefinch, dimitern: halp! After state.AddApplication's been called, the units are just staged, is that right? When/how does juju/deploy:AddUnits get called?
<babbageclunk> Is it triggered by a watcher of some sort?
<natefinch> babbageclunk: there's the unitassigner that makes sure units get assigned to machines
<dimitern> babbageclunk: it goes like this: cmd/juju/application/deploy.go -> api/application/deploy -> apiserver/application/deploy -> juju/deploy -> state
<natefinch> babbageclunk: it's a worker
<babbageclunk> dimitern: yeah, I could follow that, but none of the code in that chain actually ends up calling AssignUnitWithPlacement.
<babbageclunk> natefinch: Ah, ok - thanks.
<natefinch> babbageclunk: yeah, we add a staged assignment during deploy, and then the unitassigner reads those and turns them into real assignments.
<babbageclunk> natefinch: ok - that makes sense. I was trying to understand why I didn't see the error I see in my unit test when running deploy manually.
<babbageclunk> natefinch: It's because the errors are raised by the unitassigner and logged somewhere, rather than coming back from the api to the command.
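A toy rendition of that flow, assuming nothing beyond what's said above: deploy only stages an assignment, and a background worker (juju's real one is the unitassigner) later turns it into a real assignment, so failures surface in status rather than coming back from the API to the command:

    package main

    import (
        "fmt"
        "time"
    )

    type assignment struct{ unit, placement string }

    // deploy just stages the assignment and returns; no placement error
    // can surface to the CLI here.
    func deploy(staged chan<- assignment, unit, placement string) {
        staged <- assignment{unit, placement}
    }

    // unitAssigner plays the worker: it drains staged assignments and turns
    // them into real ones, logging (in reality, setting status on) failures.
    func unitAssigner(staged <-chan assignment) {
        for a := range staged {
            if err := assignToMachine(a); err != nil {
                fmt.Printf("assignment of %s failed: %v\n", a.unit, err)
            }
        }
    }

    func assignToMachine(a assignment) error {
        if a.placement == "kvm" { // pretend this provider lacks kvm support
            return fmt.Errorf("container type %q not supported here", a.placement)
        }
        fmt.Printf("assigned %s to a machine (placement %q)\n", a.unit, a.placement)
        return nil
    }

    func main() {
        staged := make(chan assignment, 1)
        go unitAssigner(staged)
        deploy(staged, "ubuntu/0", "lxd")
        deploy(staged, "ubuntu/1", "kvm")
        time.Sleep(100 * time.Millisecond) // let the worker drain
    }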
<dimitern> frobware: FYI, proposed it: http://reviews.vapour.ws/r/5040/
<babbageclunk> dimitern, natefinch: review please? http://reviews.vapour.ws/r/5041/
<dimitern> babbageclunk: cheers, looking
<babbageclunk> dimitern: I mean, you shouldn't now! It's late there! It's kinda late here now!
<babbageclunk> dimitern: but thanks!
 * babbageclunk is off home - have delightful weekends everyone!
<dimitern> babbageclunk: likewise! :)
<frobware> dimitern: will take a look
<redir> brb reboot
<mup> Bug #1588924 opened: juju list-controllers --format=yaml displays controller that cannot be addressed. <juju-core:Fix Committed> <juju-deployer:Invalid> <https://launchpad.net/bugs/1588924>
<perrito666> cmars: still around?
<mup> Bug #1591379 opened: bootstrap failure with MAAS doesn't tell me which node has a problem <v-pil> <juju-core:New> <https://launchpad.net/bugs/1591379>
<cmars> perrito666, yep, what's up?
 * perrito666 deleted what he was writing because he began in spanish
<perrito666> cmars: I wanted to ask you about juju/permission
<perrito666> We are sort of moving in another direction http://reviews.vapour.ws/r/4973/#comment27181
<cmars> leo un poquito ["I'll read a little"]
<cmars> looking
<perrito666> dont do that (the spanish) you just short circuited my brain badly :p
<perrito666> it is fun to see your own language and not understand it
<cmars> :)
<cmars> perrito666, is there a doc or tl;dr for the permissions changes?
<mup> Bug #1591387 opened: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387>
<mup> Bug #1591387 changed: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387>
<mup> Bug #1591387 opened: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387>
#juju-dev 2016-06-11
 * redir goes eow
<mup> Bug #1569120 changed: wrong lxc bridge still used in juju beta4 <conjure> <juju-core:Expired> <https://launchpad.net/bugs/1569120>
<mup> Bug #1569120 opened: wrong lxc bridge still used in juju beta4 <conjure> <juju-core:Expired> <https://launchpad.net/bugs/1569120>
<mup> Bug #1569120 changed: wrong lxc bridge still used in juju beta4 <conjure> <juju-core:Expired> <https://launchpad.net/bugs/1569120>
<mup> Bug #1591488 opened: Can not bootstrap on private openstack juju 1.25 or 2.0 <cpe-sa> <juju-core:New> <https://launchpad.net/bugs/1591488>
<mup> Bug #1591499 opened: Bootstrap timeout and fail on private cloud <cpe-sa> <juju-core:New> <https://launchpad.net/bugs/1591499>
#juju-dev 2016-06-12
<redelmann> hi there, i'm writing a charm with resources and started wondering: can I set a default resource file?
<redelmann> so if i deploy without setting any resource, some default file will be uploaded
<redelmann> and later, only if the user wants to, they can change the resource file
<redelmann> is that possible?
<perrito666> redelmann: the people that can answer that wont be here until monday morning around 9AM your time :)
<redelmann> perrito666: they do not work on sundays?
<redelmann> perrito666: :P
<perrito666> actually around 20 you get people from nz and aus but none of them is a resources expert
<redelmann> perrito666: was a joke, thank you!
<perrito666> redelmann: I know :p cu
<mup> Bug #1591499 changed: Bootstrap timeout and fail on private cloud <cpe-sa> <juju-core:New> <https://launchpad.net/bugs/1591499>
<thumper> O.M.G.... state tests are so painfully slow
<perrito666> thumper: amen
<thumper> o/ perrito666
<thumper> perrito666: what are you doing here?
<perrito666> thumper: I have too many irc channels open in my quassel to bother closing work ones during weekends
<thumper> :)
<perrito666> btw, you go often to london, any simcard you can recommend me to have 2 weeks of data+calls? (calls being optional)
 * thumper thrashes machine with test runs
<thumper> oh for the love of all that is holy
<thumper> PASS: status_history_test.go:24: StatusHistorySuite.TestPruneStatusHistoryBySize - 32s
<perrito666> I thought fwereade had refactored that
<perrito666> thumper: that was me, sorry
<menn0> thumper: would you mind having a quick look at http://reviews.vapour.ws/r/5036/
<thumper> menn0: trade ya http://reviews.vapour.ws/r/5043/diff/#index_header
<menn0> thumper: ok :)
<menn0> thumper: Done. I suggested one more tiny cleanup.
<menn0> thumper: That PR is a bit like squeezing a pimple ... gross yet satisfying :)
<thumper> yeah
#juju-dev 2017-06-05
<marcoceppi> _thumper_ rick_h wallyworld are we expecting cross model relations in 2.2?
<rick_h> marcoceppi: I think 2.3. The flag is still on right and we're up to rc1. balloons can you confirm? ^
<marcoceppi> cool, and 2.3 is off in the Fall-ish?
<rick_h> marcoceppi: yea, late summer I'd guess. The team's been doing 3-4mo in the 2.x so far
<balloons> marcoceppi, I really really hope it doesn't take that long for 2.3. cmr for single controller for 2.2
<rick_h> balloons: oh is the flag off? /me missed it go by
<marcoceppi> just trying to build out some timelines for long running open source projects building on top of Juju
<balloons> rick_h, yea it's still there. Most of the work is on the feature branch now which is looking towards 2.3
<jam> balloons: shall we meet in appear.in/john-nicholas ?
<thumper> morning
<hml> thumper: morning or afternoon
<wallyworld> marcoceppi: cmr is usable in 2.2 with the feature flag (tested with all major clouds). still a lot of polish and refinement to add though. but 2.2 has acls and does work across regions; also uses proper firewall rules to limit ingress to just the required models etc
<wallyworld> marcoceppi: it hasn't had a huge amount of testing, so would like to get lots of feedback from real world
<wallyworld> rick_h: re bug 1596893.... i think we should also support show-action-status <name> without the --name flag? i would think we could filter on 'id starts with <arg> or name == <arg>'?
<mup> Bug #1596893: show-action-status should accept the action name as a filter as wel <2.0> <usability> <juju:In Progress by ecjones> <https://launchpad.net/bugs/1596893>
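A sketch of the filter wallyworld suggests, purely illustrative (not the juju CLI code): the argument matches either an action id prefix or an exact action name:

    package main

    import (
        "fmt"
        "strings"
    )

    // matchesAction implements "id starts with <arg> or name == <arg>".
    func matchesAction(arg, id, name string) bool {
        return strings.HasPrefix(id, arg) || name == arg
    }

    func main() {
        fmt.Println(matchesAction("backup", "1a2b3c", "backup")) // true: exact name match
        fmt.Println(matchesAction("1a2", "1a2b3c", "backup"))    // true: id prefix match
    }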
<rick_h> wallyworld: that's what I was thinking, but I understand it's a bit of magic I guess. I was trying to think of where else we do uuid/string magic and I guess maybe in models/controllers
<rick_h> wallyworld: is it going out in 2.2 behind a flag or is the flag going to be removed?
<wallyworld> rick_h: is what?
<rick_h> wallyworld: cmr
 * rick_h reads backwards there
<wallyworld> rick_h: in 2.2 it is behind a flag
<rick_h> wallyworld: is there CMR docs for the flag stuff?
<wallyworld> rick_h: there's not really docs yet; i can do some
<rick_h> wallyworld: ok, I'm looking to put together a juju show wed around 2.2 features/etc. If you're interested I can see about putting together a post about CMR and help push it out there.
<rick_h> wallyworld: maybe mention it during the show and then put out a blog post on testing it and how to provide feedback/testing results?
<wallyworld> rick_h: yeah, i think so but am worried about not enough internal testing and that external folks might get something a bit raw
<rick_h> wallyworld: I can keep it clear it's early.
<rick_h> wallyworld: but understand.
<wallyworld> rick_h: ok, sgtm
<rick_h> wallyworld: shoot me the latest in spec/notes and I'll test it out tomorrow and see what I can get ready for wed show
 * rick_h runs to violin
<wallyworld> will do
<thumper> babbageclunk: morning
<thumper> babbageclunk: I'm likely to lose internet shortly as I'm moved from ADSL to fibre
<balloons> wallyworld, https://bugs.launchpad.net/juju/+bug/1696005
<mup> Bug #1696005: sslOnNormalPorts is deprecated <juju:New> <https://launchpad.net/bugs/1696005>
<babbageclunk> thumper: morning too! Sorry, bit of a slow start today.
<wallyworld> ta
<thumper> babbageclunk: no worries, did you want to chat now?
<babbageclunk> thumper: sure, why not
<thumper> the chorus folk are here now drilling
<babbageclunk> thumper: oh cool
<babbageclunk> thumper: jumped into 1:1
<thumper> hmm...
<thumper> fibre is on
<babbageclunk> thumper: you seem faster on irc
#juju-dev 2017-06-06
<thumper> time for coffee, lunch, then tasks
<wallyworld> thumper: you around?
<thumper> yeah
<wallyworld> thumper: can you join me and eric in a team standup ho?
<thumper> yeah
<axw> wallyworld thumper: so I can merge my storage stuff into develop now, yeah?
<wallyworld> axw: yep
<axw> sweet
<thumper> :-(
<thumper> I just tried 'juju login jaas'
<thumper> that bit worked, but then...
<thumper> $ juju models
<thumper> ERROR cannot list models: unable to connect to API: x509: cannot validate certificate for 162.213.33.28 because it doesn't contain any IP SANs
<thumper> how come this hasn't been tested before?
<thumper> FFS
<babbageclunk> thumper: Are you possibly using an IP address for a JAAS endpoint when others are using a DNS name?
<thumper> babbageclunk: I just did 'juju login jaas' using the latest rc
<thumper> I expect it should DTRT
<babbageclunk> thumper: Ah, ok - no specific setup for the jaas controller then.
<thumper> no
<axw> thumper: are you up to date on develop? that was suppoed to be fixed by roger's latest PR
<thumper> yes
<thumper> perhaps his latest PR didn't land?
<thumper> oh...
<thumper> maybe I didn't merge
 * thumper checks
<thumper> no, I'm behind
 * thumper updates and tries again
<thumper> hah, works now
<thumper> thank god
<thumper> well, than rog
<thumper> what's more, I have my first jaas model running
<thumper> a canonical k8s model
<rick_h> thumper: woot!
<thumper> rick_h: I'm doing a meetup this afternoon
<rick_h> thumper: nice, I got a talk accepted at pyohio
<rick_h> thumper: so will be fun to plug some stuff
<thumper> cool
<wallyworld> babbageclunk: i made some suggested tweaks to the first pr, you ok for me to land?
<axw> wallyworld: forgot to ask, did I miss anything in standup/yesterday?
<wallyworld> axw: nah, same old, same old
<wallyworld> all about the release
<axw> wallyworld: cool. any news on the 2.1 hot fix?
<babbageclunk> wallyworld: yup yup
<wallyworld> tyvm for the review
<babbageclunk> wallyworld: by the way, sorry not to roundtrip back to you about your review comments on https://github.com/juju/juju/pull/7441 - wanted to get it landed. Maybe have a look to see if the changes are what you meant?
<wallyworld> sure, looking
<wallyworld> babbageclunk: too late now anyway, i trusted whomever gave it a +1 :=)
<babbageclunk> wallyworld: yeah, I figured that would be ok in the circumstances.
<wallyworld> babbageclunk: we just have to hope people don't run out of disk space when upgrading
<babbageclunk> wallyworld: yeah, true.
<babbageclunk> wallyworld: maybe we should reinstate the deletion - thumper, I know you were looking at performance when running the upgrade step, were you also looking at disk usage?
<wallyworld> babbageclunk: ideally we'd run some tests on a system with the 4GB logs and see how much disk space is gobbled up during the upgrade
<thumper> babbageclunk: yes I was looking at disk usage
<babbageclunk> thumper: and the delete-as-we-go version didn't really help with that? Maybe we need to do something like compact the old collection as we go to get Mongo to release the space?
<thumper> babbageclunk: not sure what speed impact that has
<babbageclunk> thumper: hmm, true - probably not good.
 * babbageclunk goes for a run
<axw> wallyworld burton-aus: the 2.2.0 bump was made to develop, shouldn't that be on the 2.2 branch...?
<wallyworld> axw: oh shit
<wallyworld> yes
<wallyworld> i'll fix develop
<veebers> wallyworld, axw: ugh good catch, sorry that wasn't reflected in the release process doc.
<axw> wallyworld: ok, thanks. just making sure I didn't send my changes in too early :)
<axw> veebers: cool cool
<wallyworld> nah, i didn't notice the merge target
<veebers> wallyworld: does burton-aus need to re-do the pr?
<wallyworld> yup
<veebers> sweet he's on it
<wallyworld> axw: this should fix develop https://github.com/juju/juju/pull/7453/
<axw> wallyworld: thanks, LGTM
<wallyworld> and merged
<lazyPower> heyo juju dev, is there a way for me to "pick" which interface the unit will use for bootstrapping the controller? I'm having a jim dandy time trying to figure out why maas is reporting the management interface only during the bootstrap of the controller phase. the unit has 2 nic's available, both are online, one is addressable one is not, and it never seems to return that addressable interface.
<wallyworld> externalreality: still in a call, running late
<babbageclunk> thumper, wallyworld: created a PR against 2.2 for the TestPruneLogsBySize failure - should it have been against develop? https://github.com/juju/juju/pull/7458
<wallyworld> babbageclunk: sorry, in call, but yeah, develop for now
<babbageclunk> wallyworld: ok, I created it against 2.2 because it was in the 2.2.1 release blockers category
<wallyworld> babbageclunk: that works too
<wallyworld> better actually
<babbageclunk> sweet
<babbageclunk> Anyone not in that call who wants to do a trivial review for me? <bambi eyes> https://github.com/juju/juju/pull/7458
<hml> babbageclunk: gh seems to be unhappy this evening.  :-/ i can't see the diff
<hml> babbageclunk: instead i'm getting angry unicorns
<babbageclunk> hml: oh yeah - wow!
<babbageclunk> that's pretty annoying
<babbageclunk> hml: congratulations on the three months by the way!
<hml> babbageclunk: ty!
<babbageclunk> https://status.github.com/
<hml> oops
<babbageclunk> That's going to make a lot of things more difficult.
<babbageclunk> hml: for reference, here it is as an old school patch: http://paste.ubuntu.com/24797094/
 * hml looking
<hml> babbageclunk: any idea why 5000 was chosen originally?
<babbageclunk> hml: That's the threshold in the pruning code - it doesn't try to prune unless there are more than 5000 rows in the collection.
<babbageclunk> hml: but obviously it wasn't a good choice in the test, because sometimes it'll end up with slightly more than 5000, getting pruned down to just under 5000.
<hml> babbageclunk: that's what I was just thinking... the threshold doesn't mean you can't have less.
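A toy model of that pruning behaviour, with made-up numbers, showing why a test asserting "about 5000 rows" can land on either side of the threshold:

    package main

    import "fmt"

    const pruneThreshold = 5000

    // prune mimics the behaviour described above: the pruner only acts once
    // the collection exceeds the threshold, and then trims it back below it.
    // The target value is illustrative, not juju's real pruning policy.
    func prune(rows, prunedTo int) int {
        if rows <= pruneThreshold {
            return rows // under the threshold: untouched
        }
        return prunedTo
    }

    func main() {
        fmt.Println(prune(4990, 4800)) // 4990: never pruned
        fmt.Println(prune(5010, 4800)) // 4800: just over the threshold, pruned below it
    }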
<hml> babbageclunk: i'd approve if i could get to the correct page to do so
<babbageclunk> hml: Thanks!
<babbageclunk> hml: Ooh, the status just changed, and I can get to the PR now - try again?
<babbageclunk> hml?
<hml> babbageclunk: done
<babbageclunk> hml: yaythanks
<babbageclunk> hmm, not sure whether I'll be allowed to merge it though, since it's on the 2.2 branch.
<balloons> babbageclunk, thank you for https://github.com/juju/juju/pull/7458
<balloons> babbageclunk, i was just about to tell you about it. It's needed on develop too
<babbageclunk> balloons: cool - I'll forward-port it now. Held up by the github outage!
<thumper> externalreality: re bug 1696242, yes it is by design
<mup> Bug #1696242: `juju debug log --help` shows both flags "--tail" and "--no-tail" as having a default value of `false`. <usability> <juju:New> <https://launchpad.net/bugs/1696242>
<babbageclunk> balloons: Am I right that there's no mergebot watching 2.2? So I'll need to hassle someone to merge it?
<externalreality> thumper, :-D
<thumper> externalreality: I'm leaving  detailed comment
<thumper> perhaps we should update the help
<balloons> babbageclunk, yes. And veebers can help land it
<veebers> babbageclunk: aye, Just need to figure out the parameters to fire the job off with
<veebers> babbageclunk: was the one you need to land fix-prune-test
#juju-dev 2017-06-07
<babbageclunk> veebers: yes please!
<veebers> babbageclunk: cool, one snuck through, but failed. Let me check why
<babbageclunk> axw: eg looks really neat, thanks for the tip!
<axw> babbageclunk: cool :)
<axw> babbageclunk: FYI the PR I mentioned is https://github.com/juju/juju/pull/7446, the template I used is in the description
<wallyworld> axw: what do you think about adding the mongotop metrics to a prometheus collector? and other things like txn.logs size
<axw> wallyworld: there is an existing prometheus exporter (https://github.com/dcu/mongodb_exporter) which I think we should use if possible. last time I tried to use it, it was a bit panicky
<axw> wallyworld: not sure if that captures per-collection sizes. if it does not, adding txn.logs size sounds like a good idea to me
<wallyworld> axw: agree to use something existing if possible. top gets useful stats which IMO we'd want to graph over time and correlate with other measurements
<axw> yup
<axw> wallyworld: I think there might be one already, but if there's not we should look at snapping the mongodb prometheus exporter, to make it super easy to set up on the controller
<wallyworld> axw: that would be nice. as an aside, i had a brief look at the prometheus snap itself and didn't see an easy way to tell it to use a given config yaml, but i didn't look too hard
<axw> wallyworld: there should be an existing config file, I forget where... search for prometheus.yml under /snap/prometheus
<axw> wallyworld: also see https://awilkins.id.au/post/juju-2.1-prometheus/ if you haven't already, might be helpful
<wallyworld> axw: yeah there is one, but it sorta sucks to have to search for it and replace it and restart the process
<wallyworld> axw: i already have prometheus running against a local controller; not much to see as it's not busy
<axw> wallyworld: maybe we should provide a tool to reconfigure a prometheus to add scrape targets for juju controllers?
<wallyworld> now that would be good
<wallyworld> axw: are you able to look at fixing the introspection worker to support cpu profiling as a quick win?
<axw> wallyworld: it does support it, it's just the script that's broken
<axw> wallyworld: I can look at fixing the script if it's really important
<wallyworld> right, i meant the script. i'm not 100% sure what needs to change. replace GET with curl?
<axw> wallyworld: I'm not sure either. I can look at it
<wallyworld> would be good to have it work out of the box for 2.2
<wallyworld> since we are upgrading the customer to 2.2 controllers
<axw> ok
<wallyworld> jam: on the surface of it, i can't see a way to intercept incoming http connections prior to the tls negotiation stage to reject logins at that point. there's some methods on the tls.Config that appear to be called for each request that we can override, but doing so results in an internal error in the std lib code. did you have any thoughts on how to implement?
<jam> wallyworld: I didn't have any thoughts yet. my first instinct would be to have a custom Listener
<wallyworld> jam: yeah, getting the right points to intercept before tls happens is the fun bit
<wallyworld> bboab, school pick up
<jam> wallyworld: tls.NewListener takes a net.Listener (plus the tls.Config)
<jam> so if we wrap the passed in net.Listener with our own
<jam> I think it could work
<jam> line 226 of apiserver/apiserver.go
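A minimal sketch of that idea, assuming the wiring jam points at: wrap the inner net.Listener so each inbound connection can be delayed before the TLS layer (tls.NewListener) ever sees it:

    package main

    import (
        "fmt"
        "net"
        "time"
    )

    type throttlingListener struct {
        net.Listener
        delay time.Duration
    }

    func (l *throttlingListener) Accept() (net.Conn, error) {
        conn, err := l.Listener.Accept()
        if err != nil {
            return nil, err
        }
        // Pause before handing the connection on; in this naive form the
        // pause also delays subsequent accepts, which is part of the
        // backpressure being discussed.
        time.Sleep(l.delay)
        return conn, nil
    }

    func main() {
        inner, err := net.Listen("tcp", "127.0.0.1:0")
        if err != nil {
            panic(err)
        }
        defer inner.Close()
        wrapped := &throttlingListener{Listener: inner, delay: 50 * time.Millisecond}
        // In the real server this would be passed to tls.NewListener(wrapped, tlsConfig).
        fmt.Println("listening on", wrapped.Addr())
    }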
<mup> Bug #1696311 opened: layer-basic does not support centos7 <juju-core:New> <https://launchpad.net/bugs/1696311>
<mup> Bug #1696311 changed: layer-basic does not support centos7 <juju-core:New> <https://launchpad.net/bugs/1696311>
<wallyworld> jam: yeah, just poking around. part of the issue is that it's only agent logins we want to throttle. and we only get to read the data off the rpc request to determine that once we've established the secure connection.
<mup> Bug #1696311 opened: layer-basic does not support centos7 <juju-core:New> <https://launchpad.net/bugs/1696311>
<jam> wallyworld: right, if we just added a 1s sleep, or a load-based sleep, I think we could still get away with it, we could do a *bigger* sleep later
<jam> or we could do it by IP address
<jam> 'local' addresses get a bigger delay as they are more likely to be agents vs client
<jam> we could just slow down all Connects when we're under load/based on number of active connections, etc.
<wallyworld> that might work initially
<jam> and then slow down even further once we get to Login layer
<jam> wallyworld: to slow down retries, I had initially investigated a sleep before returning the error
<jam> which should still reduce total load
<jam> its just nice to also reduce it before you get TLS handshake stuff
<wallyworld> jam: right, i am adding an optional pause to the limiter
<wallyworld> so Acquire() might not return immediately even if it can get a slot
<wallyworld> actually, i am looking at pausing before polling the channel
<jam> wallyworld: you mean pause-before-Accept?
<wallyworld> in the Acquire() method of limiter
<wallyworld> pause before attempting to acquire a login slot
<wallyworld> juju/utils/limiter.go
<wallyworld> that will throttle the agents. maybe not the best place to do it?
<wallyworld> seems like it was nice and transparent to the server
<wallyworld> i guess login limit is only 10
<wallyworld> so it may not help that much
<wallyworld> but it will delay any err retry
<wallyworld> so that it limits the cost of the agents trying again and again
<jam> wallyworld: so, I wouldn't do it universally in the generic code, but you could pass in an optional 'time.Duration' if we wanted
<jam> but just doing it at line 92 of admin.go
<wallyworld> right, that's what i'm doing
<jam> knows that we're explicitly rate limiting *logins* right there
<wallyworld> passing in an optional duration to NewLimiter()
<jam> wallyworld: sure, and that's also potentially testable, etc.
<wallyworld> yep
<wallyworld> and pausing before Acquire() means the agents are truely blocked
<wallyworld> as no ErrRetry is issued
<wallyworld> and so they can't just ping again
<wallyworld> immediately
<wallyworld> or that's my theory anyway
<jam> wallyworld: sure, before or after Acquire is fine
<jam> just before returning an error
<wallyworld> yep
<wallyworld> jam: here's a utils PR https://github.com/juju/utils/pull/281
<wallyworld> bah, i broke the API, I will need to fix
<jam> wallyworld: I feel like we need (min, max) instead of (0, max) thoughts?
<wallyworld> yeah ok, can easily add
<jam> or something like (avg, stddev) where we just pick some value for stddev based on avg
<wallyworld> and i'll fix the api too
<wallyworld> hmmm, do we really need that aside from min,max?
<jam> wallyworld: so its the same effect, just thinking about what is useful to express
<jam> well, stddev means you would have a normal distribution instead of a flat one,
<jam> not sure that is useful
<mup> Bug #1696311 changed: layer-basic does not support centos7 <Charm Helpers:New> <https://launchpad.net/bugs/1696311>
<jam> wallyworld: so even just 'max' is better than nothing
<jam> it just means the 'average' time is going to be 'max/2'
<wallyworld> jam: i'll add the min, easy enough
<wallyworld> after dinner though
<jam> wallyworld: reviewed
<jam> wallyworld: I do wonder if we could have a way to know "I've got a lot of load right now, lets slow down active connections a bit more", and provide backpressure
<wallyworld> jam: i also think that we need to do more - this current change is just a small step
<Mmike> Hi, lads. Is there a way to configure juju to store less than 4GB of logs in mongodb?
<thumper> hmm...
<thumper> trying to use the peer-xplod charm from the acceptance tests
<thumper> getting errors with lxd where it says '/usr/bin/env python' doesn't exist
<thumper> root@juju-61a95f-0:~# /usr/bin/env python
<thumper> /usr/bin/env: 'python': No such file or directory
<thumper> from the machine itself
<thumper> seems like the current lxd xenial images only have python3
<jam> thumper: indeed, xenial doesn't come with python 2
<thumper> :-|
<jam> thumper: I thought I had dealt with that once in the past, but maybe that was on my version of the charm and not the one they are using ?
<jam> thumper: 'apt install python2' in 'install'
<thumper> yep
<thumper> did that
<thumper> although i used apt-get so it work on trusty too
<jam> thumper: sure
<thumper> :)
<jam> I have 'apt install -y python' in mine
<jam> thumper: is it a ~juju-qa charm ?
<thumper> no, the one in acceptancetests dir in tree now
<jam> there are a couple small changes between the one in tree and lp:~jameinel/charms/trusty/peer-xplod
<jam> nothing particularly major, just the 'apt-get install' and some small things about 'maximum=0' intending to be unlimited
<jam> thumper: want me to put a PR that brings them in sync?
<thumper> jam: sure, if you have the time
<jam> thumper: https://github.com/juju/juju/pull/7463
<wallyworld> jam: here's a WIP which uses the login rate limiting plus a general connection throttle https://github.com/juju/juju/compare/2.2...wallyworld:throttle-controller-connections?expand=1
<jam> wallyworld: WIP, WIP it good :)
<wallyworld> does it look reasonable? i plucked the numbers out of the air
<wallyworld> funny man
<jam> wallyworld: so I'm wondering why we are sleeping longer for Conn than Login
<jam> wallyworld: I would have thought 1s for conn, and 5s for login
<wallyworld> i can do that
<wallyworld> i thought login was limited to 10 at a time anyway
<wallyworld> but conns once logged in could grow more
<wallyworld> probably flawed thinking
<jam> wallyworld: so conn affects users as well as agents, but you're right that the login rate limit only triggers once we're at 10 active
<jam> ah sorry, we always acquire so we would always hit that
<jam> but only for agents
<wallyworld> yeah, this latest wip does affect clients as well
<wallyworld> but if the system is really, really loaded, then even they should wait a bit?
<wallyworld> they will see a slow down anyway
<jam> wallyworld: 1s is fine IMO
<wallyworld> 1s max
<jam> the question is whether that is *enough* generally, but adding an extra 5 for agents probably will be
<wallyworld> and 5ms per conn?
<jam> wallyworld: so a max 1s delay for Conn to return and a 5s extra delay for Agent Login to return 'go away'.
<jam> neither is what I'd like in 'ideal world' which would be focused on scaling the numbers based on number of active connections
<jam> but its probably a start
<wallyworld> jam: so the 5s max for Accept() was really to attempt to throttle the thundering herd, and the pause time only grows by 5ms per conn
<wallyworld> yeah, this is a quick win for 2.2rc2
<jam> ah, I missed that throttling went up and down
<wallyworld> on a normally loaded system there should be no noticable difference
<wallyworld> yeah, it grows as we get more connections accepted
<jam> wallyworld: so 5s on Conn isn't great. it affects 'juju status' when running on lxd
<jam> 'why is it taking 5s to get a result back with 2 machines'
<wallyworld> that's 5s max
<jam> wallyworld: still avg 2.5s
<wallyworld> only if there are 1000 connections
<wallyworld> the max time grows
<wallyworld> well, that was the intent
<wallyworld> start at min 10ms or so, and then the max pause time grows with conn count
<jam> wallyworld: ah sorry, I've twisted it in my head,
<jam> just got coffee
<wallyworld> np, i'm tired so i could have messed up
<wallyworld> so for accept, on a normally loaded system -> no discernible difference
<wallyworld> but all connections are forced to wait  a bit as conn count grows
<jam> wallyworld: so, all Accept() attempts have a 10ms floor that increases by 5ms for every active connection
<wallyworld> yeah
<jam> up to a max of 5ms from Accept until we do the SSL handshake
<wallyworld> max of 5s
<jam> on Comcast world, that will, on average have 2500/3 = 800, say 1000 active agents
<jam> every 'juju status' will be slower by 5s
<wallyworld> ah right because the connections are long lived
<wallyworld> i could do it based on rate of connection
<jam> wallyworld: right, not for the *clients* which have to pay that on every connect
<jam> wallyworld: but all the agents which have long-lived only pay it 1x
<jam> wallyworld: something like 'number of connections in the last X seconds' would be good
<wallyworld> yep, that would solve the thundering herd issue
<wallyworld> i can tweak it
<jam> wallyworld: (arguably we could do per-IP tracking or something, but again, that would be penalizing users that are actively engaging with the system)
<jam> we really just want the pushback on agents
<jam> and we only know that at the Login time
<wallyworld> agreed, but we don't concretely know what those ip addresses are at that point
<wallyworld> we can guess, but....
<jam> wallyworld: yeah, I don't think we want to do IP based, cause then you have to track all of that
<jam> I think just doing 'how many have connected in the last X' and slow it down up to 5s is ok
<wallyworld> so i reckon 5ms per X rate of new connections
<wallyworld> yep, up to 5s max
<jam> wallyworld: I'd then also have Login that is going to *reject* an agent to come back later, wait another 5s
<jam> wallyworld: which means all the people over the current 10 that we are going to reject, get delayed a little bit extra
<jam> and I'm not apposed to something that delays before Acquire as well
<wallyworld> jam: so add a pause when limiter.Acquire() returns false?
<wallyworld> i think delay before is ok too
<jam> wallyworld: those are the ones that will be reconnecting 3s later
<wallyworld> ok, i can add another param to NewLimiter()
<wallyworld> fixed time to pause if a reject happens
<jam> wallyworld: its not hard to put it just before the "return ErrRetry"
<wallyworld> yeah, ok
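A sketch of the limiter shape being discussed, with illustrative names and numbers (the real code is juju/utils' limiter): a jittered pause before trying for a slot, plus a fixed pause before returning ErrRetry, so rejected agents can't immediately hammer the server again:

    package main

    import (
        "errors"
        "fmt"
        "math/rand"
        "time"
    )

    var ErrRetry = errors.New("try again")

    type limiter struct {
        slots      chan struct{}
        minPause   time.Duration
        maxPause   time.Duration
        retryPause time.Duration
    }

    func newLimiter(maxConcurrent int, minPause, maxPause, retryPause time.Duration) *limiter {
        return &limiter{
            slots:      make(chan struct{}, maxConcurrent),
            minPause:   minPause,
            maxPause:   maxPause,
            retryPause: retryPause,
        }
    }

    func (l *limiter) Acquire() error {
        // jittered pause before even trying for a slot (the min/max shape)
        time.Sleep(l.minPause + time.Duration(rand.Int63n(int64(l.maxPause-l.minPause))))
        select {
        case l.slots <- struct{}{}:
            return nil
        default:
            time.Sleep(l.retryPause) // make the rejection itself slow
            return ErrRetry
        }
    }

    func (l *limiter) Release() { <-l.slots }

    func main() {
        lim := newLimiter(10, 10*time.Millisecond, 50*time.Millisecond, time.Second)
        if err := lim.Acquire(); err == nil {
            defer lim.Release()
            fmt.Println("login slot acquired")
        }
    }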
<wallyworld> jam: so hopefully the net effect of this (pun half intended) is to allow things to come up more controlled without resorting to iptables
<jam> wallyworld: yeah, we need to set up some testing of 'restart times' so we can tune some of these numbers
<wallyworld> next thing would be to throttle log connections
<wallyworld> yeah, testing needed for sure
<jam> wallyworld: I can probably set wpk on it today
<jam> he seemed interested
<wallyworld> ok, i'll finish this work
<jam> wallyworld: I'm also curious what the net effect would be if you are running in HA
<jam> a given controller is going to push back, but will the others, etc
<wallyworld> yeah
<wallyworld> jam: i almost convinced myself those delay params should be configurable, not consts
<wallyworld> so we can play with the numbers
<wallyworld> maybe via env vars
<jam> wallyworld: well, I would hack them with ENV vars, etc to test it
<jam> wallyworld: but it also is something that as soon as we know *we* want a knob
<jam> somebody else will ask for it
<wallyworld> right, but we hide that knob
<wallyworld> those env vars are not publicised
<wallyworld> but we can ask CI to set up a system with lots of xplod charms, get it to steady state, see how it goes, and then kill the controller and see what happens then as well
<wallyworld> and tweak the numbers
<axw> wallyworld jam: https://github.com/juju/juju/pull/7465 has updates to support CPU profiling in the introspection CLI, as well as adding support for easily exposing as HTTP
<axw> wallyworld jam: I started down the road of just modifying the bash code a little bit, but it was very fragile. so ended up with something a bit more comprehensive...
<jam> axw: is this a bit too much for a 2.2 at this point? I suppose we aren't changing the actual socket, nor are we changing the scripts that we used to support
<jam> just how they connect
<jam> and possibly exposing a new thing people will use
<jam> its nice to not need to 'apt install socat' all the time
<jam> small note 'juju-introspect' or 'jujud-introspect'... not sure
<jam> myself
<jam> I guess it is 'juju-run'
<jam> though honestly *that* one is mostly a source of confusion
<axw> jam: the alternatives I can see are: (a) do nothing, (b) use curl, which makes the command more fragile (because of timing issues, starting socat and curl not necessarily having --retry, and other weirdness around socat)
<axw> jam: IMO, this could wait for 2.2.1. it's possible to do all these thigns already with 2.2, just not in a neat command
<jam> axw: so the singlehostreverseproxy is to handle redirecting HTTP to a unix socket?
<jam> well abstract domain socket
<axw> jam: yep
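A minimal sketch of that proxying shape: an HTTP reverse proxy whose transport dials a unix socket instead of TCP (the socket name is a placeholder; Go's net package treats a leading '@' as a Linux abstract-namespace socket):

    package main

    import (
        "context"
        "log"
        "net"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // The target URL's host is ignored for dialing; the transport below
        // redirects every request to the unix socket.
        proxy := httputil.NewSingleHostReverseProxy(&url.URL{Scheme: "http", Host: "localhost"})
        proxy.Transport = &http.Transport{
            DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
                var d net.Dialer
                return d.DialContext(ctx, "unix", "@jujud-machine-0") // placeholder socket
            },
        }
        log.Fatal(http.ListenAndServe("127.0.0.1:19090", proxy))
    }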
<jam> axw: to check are we changing the raw content output then?
<jam> you made a comment about not having the headers
<jam> which sounds good
<jam> but does mean the actual output of "juju-goroutines > saved.txt" is going to be slightly different?
<jam> (AFAICT, it actually means you don't have to munge the file before it is actually useful)
<axw> jam: yes. it's the same except without the HTTP response header
<axw> jam: right
<jam> axw: my concern is anyone who's scripted it may be removing it themselves and we're breaking that
<jam> that's the sort of "shouldn't do in a .patch release", I think
<jam> axw: I do believe it was a gotcha trying to use things like the heap profile
<jam> so ultimately better
<jam> but probably a risk for putting it into rc2, but also a big win for not breaking it in a .patch
<axw> jam: I'm not aware of anyone interpreting them anyway - are you? not that that's proof or anything, but I am curious. they've always just been handed back to dev IME
<jam> axw: well, *I've* used them to run against go tool, and its always been a pain that you have to munge. Its certainly the sort of thing where I'd want us to be careful with compat
<jam> axw: and saying "<2.2.0 you need to trim the front, but we do that automatically in 2.2" sounds much better than
<jam> in '2.2.1'
<axw> jam: yep, fair point
<jam> axw: I'd *like* others to chime in on the "should it be 2.2.0rc2 or 2.2.1"
<jam> but you have my vote
<axw> jam: thanks. I will wait for wallyworld and thumper to chime in at least
<jam> a couple small things
<jam> you list the symlinks in one list over here, but individually multiple times over there
<jam> and 'juju-introspection' vs 'jujud-introspection'.. I'm not sure there, either
<jam> juju- matches other things, but really we are introspecting a jujud
<axw> jam: yep, thanks I'm fixing that list. I'm -0 on jujud-introspect because it has a different prefix to the introspection helpers (juju-goroutines, juju-heap-profile, etc.). they're all about jujud too, but I don't think it'd be helpful to users to have two different prefixes for the same class of commands
<jam> fairy nuff
<axw> jam: family's home, gtg. thanks for the review
<thumper> axw: shipit for 2.2-rc2
<thumper> axw: I was just considering something like this myself
<thumper> so yay
<axw> thumper: okey dokey. I believe the bot is disabled, so how does one do that?
<thumper> axw: one asks one of the QA folk to poke the bot manually
<axw> ah I have to run, I'll check back later
<thumper> axw: probably need to get balloons to do it when he starts
<jam> balloons: ^^ https://github.com/juju/juju/pull/7465
 * thumper should go to bed
<jam> we would like to land that for 2.2rc2
<thumper> well, go do dishes first
<thumper> night all
<jam> thumper: go sleep :)
<marcoceppi> how can I upgrade to 2.2-rc1 from a previous stable version?
<marcoceppi> --agent-version=2.2-rc1 says "ERROR no matching binaries available"
<marcoceppi> I got it upgrading, but how long should an in-place upgrade take?
<wallyworld> marcoceppi: see the release notes for rc1 - we split the logs into per model collections so for this upgrade, it can take a while
<wallyworld> the upgrade may need to split apart up to 4GB of logs
<marcoceppi> wallyworld: thanks
<wallyworld> marcoceppi: i'm guessing it took maybe 5 or 10 minutes?
<wallyworld> we should surface a more complete message than just "upgrading" perhaps
<wallyworld> this was done to improve the model destroy performance for large numbers of models
<marcoceppi> wallyworld: I think my upgrade might be stuck, but I have no way of telling
<marcoceppi> it was started at 48 after the hour
<wallyworld> was it a big deploy?
<marcoceppi> disk space consumption has not changed, and the logs are mostly filled with "login denied, upgrade in progress"
<marcoceppi> 6 machines
<marcoceppi> 1 model
<marcoceppi> but it was a 2.0.4 -> 2.2-rc1
<wallyworld> should work though
<wallyworld> are you able to get a mongo shell and do a db.logs.size() and also a size on the new model logs collection to see if the records are still being copied?
<wallyworld> the new logs collection is something like logs.<modeluuid>
<marcoceppi> wallyworld: how do I get a mongo shell?
<wallyworld> ssh to controller, and then mongo --ssl -u admin -p <oldpassword> localhost:37017/admin --sslAllowInvalidCertificates
<wallyworld> where oldpassword is sudo grep oldpassword /var/lib/juju/agents/machine-0/agent.conf
<wallyworld> then once in shell, do a "use juju"
<wallyworld> that selects the juju database
<marcoceppi> let me take a look
<babbageclunk> wallyworld: should I pick up a bug from the release blockers section?
<wallyworld> babbageclunk: in release call now, just discussing what needs to be done
<babbageclunk> ok
<marcoceppi> wallyworld: I get login fialed with that command
<marcoceppi> but the upgrade completed
<marcoceppi> so I don't care anymore
<wallyworld> marcoceppi: sweet, ok. but we should report better
<wallyworld> babbageclunk: HO in standup?
<babbageclunk> wallyworld: sure
<marcoceppi> wallyworld: I do have another problem
<marcoceppi> since the upgrade `juju models` hangs
<wallyworld> marcoceppi: ah bum, ok
<wallyworld> we haven't seen that
<babbageclunk> :(
<wallyworld> can you turn on debug logging and see what it says?
<wallyworld> raise a bug for sure with as much detail as possible
<marcoceppi> wallyworld: it just says connected to ws
<wallyworld> marcoceppi: does show-model work?
<marcoceppi> wallyworld: add and destroy model work
<wallyworld> show-model?
<marcoceppi> wallyworld: nope
<marcoceppi> wallyworld: http://paste.ubuntu.com/24803880/
<marcoceppi> wallyworld: it says "connection established" then that's it
<thumper> well bollocks
<wallyworld> marcoceppi: can you turn on debug logging and provide a snippet from juju debug-log
<marcoceppi> I think debug logging is on?
<wallyworld> juju model-config logging-config="<root>=DEBUG;"
<thumper> marcoceppi: juju debug-log -m controller
<marcoceppi> model config hangs
<thumper> this is a pretty serious regression
<wallyworld> look at current logging-config first so you can set it back later. juju model-config
<marcoceppi> model-config hangs all together
<wallyworld> wtf
<marcoceppi> to be fair, two hours ago this was a 2.0-beta18 controller
<thumper> marcoceppi: wat?
<wallyworld> can you log onto the controller and look at the apiserver.log file
<marcoceppi> 2.0-beta18 -> 2.0.4 -> 2.2-rc1
<thumper> marcoceppi: I'm not sure beta 18 was upgradable
<marcoceppi> thumper: well, 2.0.4 worked
<thumper> marcoceppi: we didn't say upgradable until 2.0-rc1
<thumper> hmm...
<thumper> in theory, it should work
<thumper> marcoceppi: 'juju debug-log -m controller --replay | pastebinit'
<wallyworld> once we see server logs, we can deduce what's wrong hopefully
<marcoceppi> well now everything is hanging
<marcoceppi> let me see what is happening onthe server
<marcoceppi> load of 13, helllooo
<marcoceppi> okya, model-config works, models doesn't
<marcoceppi> thumper: http://paste.ubuntu.com/24803956/
<marcoceppi> wallyworld: ^
<thumper> machine-0: 18:38:52 DEBUG juju.utils setting GOMAXPROCS to 1
<thumper> huh?
<marcoceppi> my hope is I can just "model migrate" this to 2.2.0 and resolve a lot of whatever the hell I did
<thumper> I wonder why we are seeing so much of this: machine-0: 18:38:54 DEBUG juju.mongo dialled mongodb server at "10.142.0.2:37017"
<marcoceppi> you all want ssh?
<wallyworld> thumper: it appears the api worker can't start
<wallyworld> maybe
<marcoceppi> jujud is pegging this controller at 100%
<marcoceppi> but it's been doing that since 2.0-beta18
<marcoceppi> happy to give this vm more resources if that's what it takes
<thumper> marcoceppi: probably a broken setup...
<thumper> it shouldn't be doing that
<marcoceppi> that's what I wanted to go to 2.2, get them perf fixes
<thumper> heh
<thumper> marcoceppi: need to do this "juju model-config -m controller logging-config=juju=debug"
<marcoceppi> and CMR ,and like all the other good things
<thumper> then some debug log over the models call
<marcoceppi> I've apparently exhausted memory
<marcoceppi> http://paste.ubuntu.com/24804029/
<marcoceppi> I'm going to bump up the VM
<marcoceppi> rebooted, more cpu/ mem
<marcoceppi> now I get this
<marcoceppi> marco@T430:~$ juju models
<marcoceppi> ERROR cannot list models: upgrade in progress (upgrade in progress)
<marcoceppi> marco@T430:~$ juju switch controller
<marcoceppi> silph.io-prod1:admin/test -> silph.io-prod1:admin/controller
<marcoceppi> marco@T430:~$ juju status
<marcoceppi> Model       Controller      Cloud/Region     Version  Notes                               SLA
<marcoceppi> controller  silph.io-prod1  google/us-east1  2.2-rc1  upgraded on "2017-06-07T21:13:29Z"  unsupported
<marcoceppi> App  Version  Status  Scale  Charm  Store  Rev  OS  Notes
<marcoceppi> Unit  Workload  Agent  Machine  Public address  Ports  Message
<marcoceppi> Machine  State  DNS            Inst id        Series  AZ          Message
<marcoceppi> 0        down   35.185.85.250  juju-c9c599-0  xenial  us-east1-b  RUNNING
<marcoceppi> marco@T430:~$ juju models
<marcoceppi> ERROR cannot list models: upgrade in progress (upgrade in progress)
<marcoceppi> crap
<marcoceppi> http://paste.ubuntu.com/24804067/
<thumper> marcoceppi: it may well be migrating the logs
<thumper> marcoceppi: that will take some time
<thumper> marcoceppi: to move 4G of logs on my laptop with an SSD was over 7 minutes
<axw> veebers: hey, would you please land https://github.com/juju/juju/pull/7465 for 2.2? it has thumper's seal of approval
<thumper> axw: we asked veebers to stop making 2.2 special for now
<axw> thumper: ah ok
<veebers> thumper: ah yeah, I'll fix that up now, sorry
<thumper> but we'll keep an eye on who submits what
<thumper> veebers: thanks
<axw> okey dokey
<veebers> thumper, axw: done it should just go through as per normal (once picked up)
<axw> veebers: cheers
<veebers> thumper, axw: any idea what else needs to land for rc2?
<axw> veebers: azure auth stuff
<axw> veebers: which has changed since I reviewed it, re-reviewing now
<thumper> veebers: I'm adding some stuff around state export
<thumper> veebers: wallyworld is working on a statushistory deletion bug
<thumper> veebers: possibly wallyworld's connection backoff code
<thumper> axw: can I get you to look over that too?
<wallyworld> babbageclunk is working on the delete bug
<thumper> wallyworld: ok, ta
<axw> thumper: sure
<veebers> thumper, axw: ack. If you can keep burton and myself in the loop so we know which CI runs to track (and baby) so we're ready to rock and/or roll when needed for release
<thumper> hmm... dealing with a facade bump where we change the args and return values...
<thumper> veebers: yep, sure
<thumper> veebers, wallyworld: we also need to work out why the capped collection overflow didn't stop the agents
<thumper> it *should* have caused all agents to stop immediately
<wallyworld> depends if CPU was overloaded etc
<wallyworld> agents stop once channel selects are processed etc
#juju-dev 2017-06-08
<wallyworld> so doesn't happen immediately if no cpu is available
<thumper> wallyworld: it seems worse than that
<axw> mhilton: your updates LGTM, thank you. did you run the change to use interactive auth-type past uros?
<axw> wallyworld: thanks I'll fix that doc up in a minute
<thumper> I've got a bad feeling about this
<thumper> axw: can I talk something through for 5 minutes?
<thumper> ah... stand up
<thumper> guess now
<thumper> not
<thumper> hmm... maybe this is fine...
<axw> thumper: sorry, missed your message. still need to talk? I can be with you in a couple of minutes
<wallyworld> axw: from what i can see, the token bucket implementation isn't really what i need - that provides a way to control the max rate at which things are processed. i need to use a random delay based on the observed rate of incoming connections.
<thumper> axw: no, I think I'm ok
<thumper> just testing manually now
<thumper> was concerned about client  <-> server facade version handling
<thumper> but I think we are ok
<axw> thumper: okey dokey
<axw> wallyworld: why random?
<axw> wallyworld: token bucket is a standard method for rate limiting, just wondering why it's not enough
<wallyworld> we want to increase the pause depending on load
<wallyworld> the less load, the shorter the pause
<wallyworld> and jitter is also good
<wallyworld> if the controller is not loaded, not processing lots of connections, no real need to rate limit the same way as when it is loaded
<wallyworld> so the algorithm is mindelay + (rate of connections metric) * 5ms
<wallyworld> up to a max delay
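That formula as a tiny sketch, with the constants treated as the tunable knobs being discussed rather than settled values (the "recent connections" rate metric is also an assumption):

    package main

    import (
        "fmt"
        "time"
    )

    // connRateDelay implements: minDelay + (rate of connections metric) * 5ms,
    // capped at maxDelay, so an idle controller adds almost no pause while a
    // thundering herd gets pushed back progressively harder.
    func connRateDelay(recentConns int, min, max time.Duration) time.Duration {
        d := min + time.Duration(recentConns)*5*time.Millisecond
        if d > max {
            return max
        }
        return d
    }

    func main() {
        for _, n := range []int{0, 50, 2000} {
            fmt.Println(n, "recent conns ->", connRateDelay(n, 10*time.Millisecond, 5*time.Second))
        }
    }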
 * thumper is heading out, daughter to dentist
<thumper> bbl
<axw> wallyworld: isn't that a recipe for DoS? throw a bunch of connections at the server, and then everyone else is penalised
<axw> wallyworld: because it's exponential backoff for everyone
<babbageclunk> wallyworld: what else has status history? Do we clean up status history for machines?
<babbageclunk> or axw ^
<babbageclunk> Should we clean it up for machines? Or do we just rely on the pruner to get rid of that?
<axw> babbageclunk: search for probablyUpdateStatusHistory in state, all those things have status history
<axw> I guess we should clean it up for them
<axw> though I don't think they're going to be anywhere near as high volume as units
<babbageclunk> axw: Ok, thanks. Just trying to work out whether anything else has the same problem.
<thumper> well... fark
<thumper> now I need to work out why this isn't working
<thumper> axw: I could use that chat now if you have 10 minutes
<axw> thumper: sure
<axw> thumper: https://hangouts.google.com/hangouts/_/canonical.com/axw-thumper?authuser=1 ?
<thumper> ack
<babbageclunk> jam: take a look at https://github.com/juju/juju/pull/7468? I'm popping out to pick up my daughter.
<axw> wallyworld jam: https://github.com/axw/juju/commit/e77cbf1b49d0a9e158f54c629a44dca253c32426 <- WIP to add rate limiting to log sink. would appreciate your thoughts on if this approach is OK before I proceed
<axw> wallyworld jam: that ratelimit.Bucket is shared by all logsink connections, in case it's not obvious
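The shared-bucket shape, sketched with the juju/ratelimit package (the fill rate and burst capacity are the knobs under discussion, not final values):

    package main

    import (
        "fmt"
        "time"

        "github.com/juju/ratelimit"
    )

    func main() {
        // One bucket shared by all logsink connections: refill a token every
        // millisecond, with a burst capacity of 1000 records.
        bucket := ratelimit.NewBucket(time.Millisecond, 1000)

        handleRecord := func(line string) {
            bucket.Wait(1) // blocks until a token is available
            fmt.Println("writing to mongo:", line)
        }
        handleRecord("unit-ubuntu-0: hello")
    }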
<wallyworld> axw: looking
<wallyworld> jam: here's a connection rate limit PR. i don't have a full feel for the type of connection rates we are seeing, so am not sure of the pause numbers used (eg are they too low) https://github.com/juju/juju/pull/7470
<wallyworld> axw: looks ok to me. i wonder if the 1ms refill rate is set to the most appropriate value. hard to say without a feel for a) what mgo can handle, b) the rate at which incoming log messages arrive
<axw> wallyworld: yeah, I don't know. I see you're making things configurable via env vars in your branch, so I could do that
<wallyworld> yeah, we can then do some perf testing
<wallyworld> set env var, bounce agent etc
<thumper> can anyone point to api client tests that hit multiple fake remote versions?
<thumper> so we just test best version handling?
<wallyworld> there will be tests for FindTools() somewhere
<wallyworld> where people get confused is running proposed tools and then not seeing released tools
<wallyworld> so upgrades from rc to ga need to set agent-stream
<wallyworld> or something along those lines
<thumper> ah fark
<thumper> api/base/testing/apicaller.go hard codes best version to zero
<thumper> :(
<wallyworld> yay
<thumper> StubFacadeCaller looks like the ticket
<thumper> nope...
 * thumper writes one
<thumper> wow... it works
<thumper> huzzah
<axw> wallyworld: FYI, here's how I'm planning to parameterise the config: https://github.com/axw/juju/commit/77a061739b01f37f4eb85448664018c1ee0cec19. I'd rather it get pulled out of env at the command level, and poked in via ServerConfig
<wallyworld> axw: yeah could do, but i wasn't looking to make the config anything formally accessible; is purely intended for our testing purposes
<wallyworld> don't really want to expose those knobs
<thumper> just putting my code up for review
<axw> wallyworld: this is just in agent config, so the user can't see it anyway. *shrug*
<wallyworld> sure, i can add the extra code
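A hedged sketch of the env-var plumbing under discussion: read a tuning knob from the environment at the command level with a fallback default, then poke it into the server config. The variable name and config field below are invented for illustration:

    package sketch

    import (
        "os"
        "time"
    )

    // envDuration returns the duration held in the named environment
    // variable, or def if it is unset or unparseable.
    func envDuration(name string, def time.Duration) time.Duration {
        if v := os.Getenv(name); v != "" {
            if d, err := time.ParseDuration(v); err == nil {
                return d
            }
        }
        return def
    }

    // Usage (hypothetical knob and field names):
    //   cfg.LogSinkRefillInterval = envDuration("JUJU_LOGSINK_REFILL", time.Millisecond)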
<thumper> https://github.com/juju/juju/pull/7471 a bit more chunky than I would have liked but it fixes a bug in dump-model as well by changing the output format
<axw> need exercise... bbs
<wallyworld> thumper: looking
<wallyworld> thumper: here's a one line pr https://github.com/juju/testing/pull/127
 * thumper looks
<thumper> wallyworld: should we be checking versions around it?
<wallyworld> thumper: we only support 3.2, and 1.25 should use pinned dep
<thumper> ack
<thumper> lgtm
<wallyworld> ta
<thumper> wait
<thumper> wallyworld: you targetted master
<wallyworld> thumper: yeah, this is juju/testing repo
<thumper> oh
<thumper> duh
<thumper> sorry
<wallyworld> np
<wallyworld> axw: a small one https://github.com/juju/juju/pull/7473
<axw> wallyworld: how is that acceptance test working? expecting --noprealloc when it shouldn't?
<wallyworld> axw: it's not being run - it was for the 2.4->2.6->3.2 upgrade
<axw> wallyworld: ah, ok
<wallyworld> pretty sure
<wallyworld> thought i'd update anyway
<axw> wallyworld: LGTM
<wallyworld> tyvm
<wallyworld> jam: are you free to look at the pr for server connection rate limiting? we're looking to cut rc2 tomorrow
<wallyworld> https://github.com/juju/juju/pull/7470
<wpk> wallyworld: DefaultLoginRetyPause
<wpk> you ate 'r'
<wpk> rest after I get to the office
<wallyworld> so i did
<jam> wallyworld: just got back home, I'll look at rate limiting
<wallyworld> thank you
<axw> wallyworld: I'm struggling to come up with a way of testing the logsink rate limiting without changing a heap of code, which I don't think is wise for 2.2. do you think it would be OK to land https://github.com/axw/juju/commit/77a061739b01f37f4eb85448664018c1ee0cec19 as is, and add tests on develop (with significant refactoring)?
<wallyworld> axw: i guess you ran up a system with lots of log traffic?
<axw> wallyworld: I ran up enough to observe that rate limiting takes effect
<wallyworld> if jam gives a +1, ok with me
<axw> wallyworld: and twiddled the knobs in agent.conf the see that that works
<jam> wallyworld: hey ian, sorry about the delay, I have to get the salary recommendations before Tim goes and then I'll look at it again.
<jam> wallyworld: for good or bad, I can no longer connect to Comcast, so I have more review time :)
<wallyworld> no worries, i'm off to soccer soonish
<wallyworld> jam: i guess that means we don't know the status of the site after deleting unit status history etc?
<jam> wallyworld: I was able to connect this morning
<jam> wallyworld: for about 30min or so
<jam> things were looking a lot better with the history gone
<jam> wallyworld: but it wasn't 'quiet' yet, either. Status returned, but took 10min
<jam> wallyworld: I did get a couple of cpu profiles dumped, but those are sitting on the disk over there
<jam> it isn't very easy to get data out.
<wallyworld> jam: tim is landing a pr to get all the model data much more efficiently using export. if that performs well, we can rewrite status to use that instead of very inefficiently walking the model
<jam> wallyworld: well, status when things were happy on monday was 15-30s
<jam> wallyworld: so while 'we can have better status code', we can also get the system much happier than it is right now
<jam> It may be that ultimately reworking status just gives better scaling under load, not sure
<jam> 10min vs 30s is 20x factor
<jam> I could see 1-by-1 querying being more impacted by load, though.
<wallyworld> jam: ah i see, didn't realise it was as low as 30s. well yeah, then we have work to do to figure out where things are going amiss
<wallyworld> hopefully the guys on site can get good data/measurements
<jam> wallyworld: I imagine I'll be able to get back in after another 2-4hrs
<axw> jam: https://github.com/juju/juju/pull/7474 adds logsink rate-limiting. as I mentioned to wallyworld above, I've been unable to come up with a test that doesn't involve heavy refactoring of apiserver
<axw> jam: so I'd like to land that as is if you're comfortable (already have +1 from wallyworld), and do refactoring + tests in develop
 * wallyworld off to soccer, back later
<jam> axw: still looking at ian's patch, but almost done
<jam> axw: arguably we should use similar algorithms for how we throttle
<jam> Token Bucket looks quite promising, and is used for most network throttling
<axw> jam: I agree, I did suggest it to wallyworld. his approach could not be captured in token bucket AFAICT, but I'm not sure the approach of exponential backoff across the board is necessarily good anyway. it means latecomers are disadvantaged, which means a DoS could starve out users
<jam> axw: so for logging, I would do it more per logger
<jam> vs over all of them
<axw> although there is an upper limit, so I guess it's not that bad
<jam> I guess you'd want both weights involved?
<jam> but throttling the slow logging because someone is spammy doesn't sound right
<axw> jam: the thought did cross my mind that we might want it at both levels
<jam> axw: it's mostly a 'play fair with your neighbors' algorithm we're looking for
<jam> wallyworld: reviewed
<axw> jam: I've updated my PR to rate limit per connection, rather than all together. I think we may want both, but doing both requires more thought and I don't want to stall this
<axw> jam: it's at least not *worse* this way, which it could be if we did a shared token bucket
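For contrast with the shared bucket earlier, a sketch of the per-connection variant axw lands on: each connection owns its own bucket, so a spammy agent throttles only itself. The channel and persist wiring are hypothetical:

    package sketch

    import (
        "time"

        "github.com/juju/ratelimit"
    )

    // serveLogSink handles one connection's stream of log records with
    // its own private bucket (burst of 100, one token per millisecond).
    func serveLogSink(in <-chan []byte, persist func([]byte)) {
        bucket := ratelimit.NewBucket(time.Millisecond, 100)
        for record := range in {
            bucket.Wait(1) // blocks this connection only
            persist(record)
        }
    }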
<wallyworld> jam: thanks for review. with the on-the-fly config, that was never intended to be in scope; the fact that it has *any* configurability is a concession to initial tuning under lab conditions. with the token bucket thing, i didn't go with that because we discussed a random, on average increasing delay the faster the rate of incoming connections. disconnects don't matter because typically i would think the load would be incurred in accepting the
<wallyworld> connection, ie this was designed for the thundering herd problem on startup, not steady state load
<wallyworld> if we do stick with the approach, i probably should make the 5ms configurable
<jam> wallyworld: sure. I think either way it helps the thundering herd problem. I'm wondering if changing it slightly would make it more understandable what we are tweaking, and it's going to be hard to test live by restarting the controllers
<jam> but we can live with it
<wallyworld> jam: i'm updating the pr to expose more knobs as suggested. because of the purpose, and the desire not to give people more knobs, i think it's ok to just use agent config and require a restart. it's for us to tune, not field folks initially. that can change of course. how did you want to make it more understandable?
<jam> wallyworld: right, the issue is more that while I'm trying to tune it, I have to kill half the world (3rd the world?)
<jam> anyway, have it be a start and a target
<wallyworld> yeah, but that introduces the herd problem which this is designed to fix :-)
<jam> so "at X connections delay should be Y"
<jam> wallyworld: so giving 2 (conns, delay) coordinates
<jam> and then just linear interpolation
<jam> connections_low, delay_low = (10, 10ms)
<jam> connections_high, delay_high = (1000, 5s)
<wallyworld> conns (absolute) or conns (rate of arrival)
<wallyworld> atm it's rate
<jam> wallyworld: probably rate is better
<wallyworld> how many per 10ms
<jam> wallyworld: I'd use human units, something like /s or /min
<jam> probably /s
<wallyworld> so now, it's simple - 10ms min plus 5ms per unit of rate, up to a max
<jam> wallyworld: right but 5ms is fixed and not particularly tuned to anything
<wallyworld> i'm making that tunable
<wallyworld> plus the lookback window
<wallyworld> ie how old the earlier conn time is before we stop looking back
<wallyworld> jam: so i think what is there gives you the low/high thing you want, but it also has a linear backup in between
<wallyworld> *backoff
<jam> wallyworld: so my suggestion was to linearly interpolate between those two points
<jam> which is essentially the same. it fits more in *my* head how to think about it, but I'm sure it's just a transformation between the two
<wallyworld> ok, i can do that
<wallyworld> one less thing to have to tweak
<wallyworld> jam: i've pushed some changes, see what you think. the algorithm now has no randomness, and should be as per what we discussed
<jam> wallyworld: looking
<wallyworld> jam: anything major that's an issue?
<jam> wallyworld: sorry, OTP with the site
<jam> been catching them up to speed
<wallyworld> ok, np. midnight here so i might need to end soon
<jam> wallyworld: sorry to hold you up. lgtm, only small thing would be "conns/10ms" is harder to think about as a human than "conns/s" and it's just a scale factor of 1:100
<wallyworld> jam: ok, i'll see if i can tweak it. how are things at site?
<jam> wallyworld: Juju is up and running, all agents are green.
<jam> status takes 10min
<jam> which is not great, but it succeeds
<wallyworld> and agents all green, which is good
<jam> the Controllers are all using 2-300% cpu as is mongo
<wallyworld> hmmm
<jam> wallyworld: so I think this is our "juju goes into consuming cpus baseline" that we saw with the JAAS tests
<wallyworld> at least we can now start profiling
<jam> wallyworld: so *right* now, I'm working with heather and nick to get us a place we can run "go tool pprof --svg"
<wallyworld> great
<jam> wallyworld: yeah, we can't get files out of the system, so we have to install go and juju source, etc.
<wallyworld> joy
<wpk> jam: technically or politically?
<jam> wpk: mostly politically
<wpk> jam: because technically there's always https://www.aldeid.com/wiki/File-transfer-via-DNS ;)
<jam> wpk: :)
<wallyworld> babbageclunk: veebers: anastasiamac: standup?
<babbageclunk> sorry, omw
<veebers> wallyworld: d'oh snuck up on me omw
#juju-dev 2017-06-09
<axw> wallyworld: back, ready when you are
<axw> wallyworld: lost you
<anastasiamac> babbageclunk: ping?
<wallyworld> axw: so i'm at the neighbour's house - the dick next door cut my internet cable
<axw> wallyworld: :/
<wallyworld> nfi how long they will take to repair it
<axw> wallyworld: I'm instrumenting all the things. adding a prometheus collector for the mgo stats atm, have added pprof profile for sockets and sessions. about to run it and see if anything pops out
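A sketch of the kind of Prometheus collector axw mentions, built on mgo's real SetStats/GetStats hooks; the metric names and the choice of which counters to export are illustrative, not what Juju actually ships:

    package sketch

    import (
        "github.com/prometheus/client_golang/prometheus"
        "gopkg.in/mgo.v2"
    )

    // mgoStatsCollector exposes mgo's internal driver counters as
    // Prometheus metrics.
    type mgoStatsCollector struct {
        socketsInUse *prometheus.Desc
        sentOps      *prometheus.Desc
    }

    func newMgoStatsCollector() *mgoStatsCollector {
        mgo.SetStats(true) // mgo only accumulates stats once enabled
        return &mgoStatsCollector{
            socketsInUse: prometheus.NewDesc("mgo_sockets_in_use", "Sockets currently in use.", nil, nil),
            sentOps:      prometheus.NewDesc("mgo_sent_ops_total", "Operations sent to MongoDB.", nil, nil),
        }
    }

    func (c *mgoStatsCollector) Describe(ch chan<- *prometheus.Desc) {
        ch <- c.socketsInUse
        ch <- c.sentOps
    }

    func (c *mgoStatsCollector) Collect(ch chan<- prometheus.Metric) {
        s := mgo.GetStats()
        ch <- prometheus.MustNewConstMetric(c.socketsInUse, prometheus.GaugeValue, float64(s.SocketsInUse))
        ch <- prometheus.MustNewConstMetric(c.sentOps, prometheus.CounterValue, float64(s.SentOps))
    }

Wiring it up is then a single line: prometheus.MustRegister(newMgoStatsCollector()).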
<wallyworld> let's see
<anastasiamac> wallyworld: what did u do to ur neighbour to have cable cut?
<wallyworld> anastasiamac: he's a fool - he cut down a tree and let it fall across my cable - snap. lucky it missed the electricity connection
<axw> wallyworld: I have a controller with 1 machine, 1 unit. getting ~10 mgo ops sent per second. they're not mgo/txn ops tho
<wallyworld> interesting to see how many sasl logins
<wallyworld> i'm looking at that atm
<wallyworld> axw: data point https://bugs.launchpad.net/juju/+bug/1696739/comments/8
<mup> Bug #1696739: mongodb reporting saslStart and isMaster as being slow <mongodb> <performance> <juju:Triaged> <https://launchpad.net/bugs/1696739>
<axw> wallyworld: hm ok, interesting
<wallyworld> doesn't really match with behaviour seen on site i don't think
<axw> wallyworld: I'm seeing saslCmd ops regularly, with just 1 machine and 1 unit
<wallyworld> hmmm
<axw> wallyworld: also a lot of iterator start/stops
<axw> wallyworld: that appears to be the way the txns.log watcher works
<wallyworld> axw: in the last 10 minutes, the number of sasl logins increased by approx 130, idle system
<axw> seems odd to start/stop watching a capped collection all the time?
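One way to avoid the constant iterator start/stop axw flags is mgo's standard tailable-cursor pattern for capped collections: hold a single cursor open instead of re-running Find/Iter on every poll. A sketch with the record processing stubbed out:

    package sketch

    import (
        "time"

        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // tailTxnsLog keeps one tailable cursor open on a capped collection,
    // waking as new documents arrive rather than re-querying.
    func tailTxnsLog(coll *mgo.Collection) {
        iter := coll.Find(nil).Sort("$natural").Tail(5 * time.Second)
        var doc bson.M
        for {
            for iter.Next(&doc) {
                _ = doc // process the new txns.log entry here
            }
            if iter.Err() != nil {
                iter.Close()
                return // broken cursor; a real watcher would rebuild and resume
            }
            if iter.Timeout() {
                continue // no new documents yet; keep waiting on the same cursor
            }
            // cursor invalidated (e.g. the capped collection rolled over): recreate it
            iter.Close()
            iter = coll.Find(nil).Sort("$natural").Tail(5 * time.Second)
        }
    }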
<wallyworld> yeah
<wallyworld> same time socket login calls increased by 2000
<wallyworld> the socket login calls are all pretty much lease and presence
<wallyworld> and so of the 2000, 130 drop through to require a sasl login
<wallyworld> axw: i left 2 more comments, i think we could target what presence does
<axw> wallyworld: did you try already?
<axw> making the presence change
<wallyworld> axw: not yet, gathering info. i'd like to understand why there are so many cache misses; i need to read john's notes properly
<anastasiamac> wpk: could u plz jump back into our standup?
<anastasiamac> babbageclunk: ^^
<wpk> I'm here
<cory_fu_> It seems like this is from the initial AllWatcher response, which presumably includes the entire model, so I suppose it would depend on the model that you're connecting to.  But it also seems like we should consider chunking those frames; that seems like it would be safer for the controller as well
<cory_fu_> Oops, pasted that out of order
<cory_fu_> Regarding https://github.com/juju/python-libjuju/issues/136 I'm wondering what the maximum expected / reasonable frame size from the Juju websocket would be?  I don't feel entirely comfortable having no restriction whatsoever.
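The same guard cory_fu_ wants for python-libjuju exists on the Go side in, for example, gorilla/websocket: a client can cap the frames it will accept so an oversized AllWatcher delta fails fast instead of buffering without bound. A minimal sketch; the 4MiB figure is illustrative:

    package sketch

    import "github.com/gorilla/websocket"

    // dialAPI opens a websocket and refuses any frame over 4MiB, so a
    // runaway response errors out instead of exhausting memory.
    func dialAPI(url string) (*websocket.Conn, error) {
        conn, _, err := websocket.DefaultDialer.Dial(url, nil)
        if err != nil {
            return nil, err
        }
        conn.SetReadLimit(4 << 20)
        return conn, nil
    }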
<cmars> morning juju, can i get reviews of a model migration bugfix? https://github.com/juju/juju/pull/7478 https://github.com/juju/juju/pull/7479
<urulama_> balloons: ^ this one is important for 2.2 :D
<balloons> urulama_, so 2.2-rc2 is already invalidated, nice
<urulama_> balloons: well, this got discovered yesterday during migrations testing
<cmars> balloons, i'll merge 7478 and 7479 this afternoon after testing an actual 2.1 -> 2.2 migration
<balloons> cmars, ty
<cmars> whoa
<cmars> i just migrated a model
<cmars> what is this magic?
<cmars> balloons, fix for LP:#1696828 has landed in 2.2, will land in develop shortly
<mup> Bug #1696828: failure to migrate a model -- meter status "" is not valid <juju:In Progress by cmars> <juju 2.2:Fix Committed> <juju trunk:In Progress by cmars> <https://launchpad.net/bugs/1696828>
#juju-dev 2017-06-10
<mup> Bug #1697175 opened: juju 2.1.2 trusty node put into recovery mode <juju-core:New> <https://launchpad.net/bugs/1697175>
<mup> Bug #1697175 changed: juju 2.1.2 trusty node put into recovery mode <juju-core:New> <https://launchpad.net/bugs/1697175>
