#juju-dev 2012-05-28
<rogpeppe> fwereade: good morning
<fwereade> rogpeppe, heyhey
<rogpeppe> fwereade: back home now?
<fwereade> rogpeppe, yeah, it's quite a relief :)
<rogpeppe> fwereade: parents can be stressful...
<fwereade> rogpeppe, my parents are fine really... it's the aggregate of travelling around and england in general and having to see far too many people in too few days
<rogpeppe> fwereade: yeah, i can see that...
<rogpeppe> fwereade: still, you got some good weather...
<fwereade> rogpeppe, oh, indeed, it was lovely... but I'm still really happy to be back home :)
<rogpeppe> fwereade: exciting event of the morning: i've just submitted the branch that actually runs the Go s/w on the server...
<rogpeppe> fwereade: as of 30s ago
 * fwereade whoops, hollers
<rogpeppe> fwereade: bootstrap phase 1, complete
<rogpeppe> fwereade: i thought i'd move on to some constraint stuff now
<fwereade> rogpeppe, that sounds like a good plan
<rogpeppe> fwereade: unfortunately i lost all the notes i made over UDS
<rogpeppe> fwereade: not that there were many, mind. i *think* i remember the general gist.
<fwereade> rogpeppe, heh :) hopefully the approximate parameters remain accurate
<fwereade> rogpeppe, if I can be of any assistance let me know
<rogpeppe> fwereade: thanks. i think i'll take a dive into the python code for a little, just to see. have you got a suggestion as to the crux of the constraint code there?
<fwereade> rogpeppe, basically everything core is in juju/machine/constraints.py
<rogpeppe> fwereade: lovely, thanks
<fwereade> rogpeppe, the big thing that should definitely change behaviour-wise is the unit-level constraint stickiness
<rogpeppe> fwereade: is that something we can change without breaking things?
<fwereade> rogpeppe, yeah, I think so -- it just removes a guarantee that I don't think is made explicitly anywhere
<rogpeppe> fwereade: ok. i take it that the current code implements unit-level stickiness, and you're saying that it should be removed, as per discussions here earlier.
<rogpeppe> ?
<fwereade> rogpeppe, it *is* a guarantee that I assumed we'd need to provide, and perhaps some users share my assumptions
<fwereade> rogpeppe, yeah
<fwereade> rogpeppe, but it doesn't fit niemeyer's model and I think he is right about it :)
<rogpeppe> fwereade: yeah, probably. i remember the discussion.
<rogpeppe> fwereade: can i treat this as the authoritative spec? https://juju.ubuntu.com/docs/constraints.html
<rogpeppe> fwereade: for instance, can i assume that no value will contain white space?
<fwereade> rogpeppe, I think you should be safe there, yes
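The no-whitespace-in-values assumption just agreed on makes parsing simple: a constraint string is just space-separated key=value pairs. A minimal sketch under that assumption (the function name and error handling here are hypothetical, not juju's actual API):

```go
package main

import (
	"fmt"
	"strings"
)

// parseConstraints splits a constraint string like "arch=amd64 mem=8G"
// into a map, relying on the assumption discussed above: values never
// contain whitespace, so fields are space-separated key=value pairs.
func parseConstraints(s string) (map[string]string, error) {
	c := make(map[string]string)
	for _, f := range strings.Fields(s) {
		kv := strings.SplitN(f, "=", 2)
		if len(kv) != 2 {
			return nil, fmt.Errorf("malformed constraint %q", f)
		}
		// An empty value ("instance-type=") survives as "", which is how
		// a user can explicitly clear an inherited constraint.
		c[kv[0]] = kv[1]
	}
	return c, nil
}

func main() {
	c, err := parseConstraints("arch=amd64 mem=8G instance-type=")
	fmt.Println(c["arch"], c["mem"], err)
}
```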
<fwereade> rogpeppe, anything related to units should be dropped, and you should maybe hold off on the provider-badged constraints until you've had a chance to sync up with niemeyer
<rogpeppe> fwereade: seems good
<fwereade> rogpeppe, (IIRC he wants ec2-zone to be availability-zone... but I'm not sure how long-term that plan is)
<fwereade> rogpeppe, (and who knows what maas constraints we'll end up with)
<rogpeppe> fwereade: so, the plan is to have a single struct that represents constraints across all providers.
<rogpeppe> fwereade: i guess that means that conflicts are provider non-specific too
<fwereade> rogpeppe, I *think* the conflicts were always fundamentally non-specific, am I missing something?
<fwereade> rogpeppe, ah, I think I see
<rogpeppe> fwereade: we should get the python version to change ec2-zone to availability-zone so that the Go version doesn't break things
<fwereade> rogpeppe, what actually conflicts with what may take some more thinking now I guess
<fwereade> rogpeppe, that sort of question is why I feel we should sync up with niemeyer
 * bigjools has a couple of MPs up if someone could take a looksie please
<fwereade> rogpeppe, and I feel it'd be best to start with a constraints struct that skips all the potentially contentious ones
<rogpeppe> fwereade: i think what i'll do as a first step is move the existing "constraint" stuff out of ec2 and into environs
<rogpeppe> fwereade: so we've got the skeleton types and API there
<fwereade> rogpeppe, great
<fwereade> bigjools, I'll see if I can manage a quick review at some stage but I'm not *really* meant to be thinking about python... so you might end up having to wait for the US to wake up
<fwereade> bigjools, sorry :(
<bigjools> fwereade: poor Python!
<bigjools> they are small, if that helps :)
 * fwereade is flattered :)
<fwereade> bigjools, cool
<rogpeppe> bigjools: lucky Go! :-)
<bigjools> are you planning to switch to Go and forget the Python one at some point?
<rogpeppe> bigjools: that's the plan
<rogpeppe> fwereade: just looking at the INSTANCE_TYPES list in providers/ec2/utils.py. where did you get the stats from. the memory numbers look a bit weird. what units are they in?
<fwereade> rogpeppe, IIRC they're MB; and, er, I forget :/. I've seen a couple of different sets of numbers quoted... amazon is not much help, they just say things like "1.6GB" which is tricky to translate with certainty
<rogpeppe> fwereade: ok, thanks.
<fwereade> rogpeppe, I guess the *other* issue with constraints is whether or not the resource-map thing is planned for 12.10
<rogpeppe> fwereade: BTW with conflicts, i'm wondering if the best approach is simply to let a conflicting requirement be satisfied by nothing.
<fwereade> rogpeppe, because I'm pretty sure we don't want hardcoded numbers; but nor do we really want to saddle go with new features
<fwereade> rogpeppe, please expand re conflicts
<fwereade> rogpeppe, I don't think what you said works
<rogpeppe> fwereade: currently, if you specify two conflicting constraints, it's an error, right?
<fwereade> rogpeppe, yeah
<rogpeppe> fwereade: so i'm wondering if it would be easier if it wasn't an error, but simply resulted in a constraint that was impossible to satisfy
<rogpeppe> fwereade: which means that conflicts don't need to be defined in a provider-non-specific way
<fwereade> rogpeppe, not sure that works... consider an m1.small with at least 512MB. that's not unsatisfiable in any sane universe; but it does I think betray confusion on the user's part, which should be flagged and whined about
<rogpeppe> fwereade: and also means that, potentially, some attributes may conflict only if they have some values
<fwereade> rogpeppe, I fear that that in particular will become hideously complex and easy to screw up
<rogpeppe> fwereade: i'm not sure.
<rogpeppe> fwereade: i think that if we have tags, for example, a tagged class of machine may or may not conflict
<rogpeppe> fwereade: with some other attribute
<rogpeppe> fwereade: and if someone tries to deploy an m1.small with at least 512MB, they'll quickly find out it's impossible...
<fwereade> rogpeppe, I don't see how that's an improvement over immediately finding it's ill-specified
<rogpeppe> fwereade: dunno, just thinking
<fwereade> rogpeppe, consider also arch/instance-type interactions
<fwereade> rogpeppe, it's not insane to ask for an i386 m1.small, but it is to ask for an i386 cc2.8xlarge
<fwereade> rogpeppe, the first cut had arch conflicting with instance-type, and we dropped that conflict for exactly that reason
<rogpeppe> fwereade: i think the difficulty with the "immediately finding it's ill-specified" thing is that there's an overlap between several constraining attributes in terms of the machines that can satisfy them, and that overlap won't always be clear-cut, i think
<fwereade> rogpeppe, and so that falls back to "we can't satisfy those constraints" vs "that request doesn't even make sense, pick one or the other"
<rogpeppe> fwereade: ok, so that's precisely my point
<rogpeppe> fwereade: i don't mind if we always do the former
<fwereade> rogpeppe, I see your point of view
<rogpeppe> fwereade: the reason we're able to do the conflict thing with instance-type is that it specifies almost everything. but it seems to me that it's a special case.
<rogpeppe> fwereade: hmm, i see the advantage though of being able to say to the user "a constraint with these will *never* be able to be satisfied, regardless of the values you choose".
<rogpeppe> fwereade: but even then that's not actually true.
<fwereade> rogpeppe, I think I may still be missing some details... how should we then handle a case where the env specified i386 and a service asks for cc2.8xlarge?
<rogpeppe> fwereade: it would fail to choose any instances (assuming cc2.8xlarge is always amd64)
<fwereade> rogpeppe, the conflict mechanism currently drops the more-general arch constraint because it conflicts with a more-specific instance-type constraint
<fwereade> rogpeppe, if we're not using conflicts at all, I'm not sure how we deal with that situation
<rogpeppe> fwereade: i would treat all the constraints as independent
<rogpeppe> fwereade: given that ec2 is filtering instance types, not machines, i'd start with a list of all instance types, filter out all those without instance-type=cc2.8xlarge, then filter out all those with arch!=i386, leaving nothing.
<fwereade> rogpeppe, so you'd just break it? :p
<rogpeppe> fwereade: no need to encode any knowledge of constraint conflicts AFAICS - they just come out in the wash, by resulting in nothing
<rogpeppe> fwereade: it's broken by the user specifying an impossible constraint
<rogpeppe> fwereade: same as if specified mem=80000
<rogpeppe> s/if/if you/
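The independent-filtering approach rogpeppe describes above can be sketched as follows. The types and memory figures here are illustrative assumptions, not juju's actual code or amazon's exact specs:

```go
package main

import "fmt"

// instanceType is a hypothetical, pared-down view of an EC2 instance type.
type instanceType struct {
	name string
	arch string // e.g. "i386" or "amd64"
	mem  uint64 // megabytes (illustrative numbers below)
}

// satisfies applies each constraint independently: an unset constraint
// (zero value) matches everything, so no conflict table is needed --
// conflicting constraints just produce an empty result.
func satisfies(it instanceType, wantName, wantArch string, minMem uint64) bool {
	if wantName != "" && it.name != wantName {
		return false
	}
	if wantArch != "" && it.arch != wantArch {
		return false
	}
	return it.mem >= minMem
}

func filter(all []instanceType, name, arch string, minMem uint64) []instanceType {
	var out []instanceType
	for _, it := range all {
		if satisfies(it, name, arch, minMem) {
			out = append(out, it)
		}
	}
	return out
}

func main() {
	types := []instanceType{
		{"m1.small", "i386", 1700},
		{"cc2.8xlarge", "amd64", 61952},
	}
	// instance-type=cc2.8xlarge arch=i386 leaves nothing: the conflict
	// "comes out in the wash" without any encoded conflict knowledge.
	fmt.Println(len(filter(types, "cc2.8xlarge", "i386", 0)))
}
```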
<fwereade> rogpeppe, I don't think they're the same situation
<fwereade> rogpeppe, more realistic example
<fwereade> rogpeppe, env: "instance-type=m1.small"; service "mem=8G"
<rogpeppe> fwereade: oh yeah "setting an instance-type constraint will clear out any inherited cpu or mem values, and vice versa,"
<fwereade> rogpeppe, it's entirely unhelpful for us to whine about not having an m1.small with 8G, when IMO it's perfectly clear that the service-level 8G should result in an instance type that does satisfy
<fwereade> rogpeppe, exactly
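The clearing rule being debated can be sketched as a merge of service-level over environment-level constraints. The struct and field names here are hypothetical simplifications; the real logic lives in juju/machine/constraints.py:

```go
package main

import "fmt"

// constraints is a hypothetical, minimal constraint set.
type constraints struct {
	instanceType string
	mem          uint64 // megabytes; 0 means unset
}

// merge layers service-level constraints over environment-level ones,
// applying the rule quoted above: setting mem at the service level
// clears out an inherited instance-type, and vice versa.
func merge(env, svc constraints) constraints {
	out := env
	if svc.mem != 0 {
		out.mem = svc.mem
		out.instanceType = "" // mem overrides inherited instance-type
	}
	if svc.instanceType != "" {
		out.instanceType = svc.instanceType
		out.mem = 0 // instance-type overrides inherited mem
	}
	return out
}

func main() {
	env := constraints{instanceType: "m1.small"}
	svc := constraints{mem: 8192}
	// The service's mem=8G wins; the inherited m1.small is dropped rather
	// than producing an unsatisfiable m1.small-with-8G request.
	fmt.Printf("%+v\n", merge(env, svc))
}
```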
<rogpeppe> fwereade: i think that actually makes constraints harder to reason about.
<rogpeppe> fwereade: BTW what other constraints other than instance-type might result in conflicts?
<fwereade> rogpeppe, the other uses were orchestra-name/orchestra-classes, but orchestra-name got dropped
<fwereade> rogpeppe, in general I think that name constraints will conflict with just about everything else
<rogpeppe> fwereade: i think i'm coming to the thought that certain attributes are symbolic aliases for a set of other attribute values.
<fwereade> rogpeppe, and while I understand the arguments against name constraints I also think we may be stuck with them
<rogpeppe> Aram: yo!
<Aram> morning.
<fwereade> rogpeppe, hmm, I'm not certain the linked attribute values are *necessarily* knowable
<fwereade> Aram, heyhey
<rogpeppe> fwereade: yeah, but perhaps that's true of the attributes they conflict with too.
<rogpeppe> fwereade: i.e. on some providers, instance-type *may* conflict with arch
<fwereade> rogpeppe, I'm thinking about MAAS here -- we can't currently find out cpu/mem, but then that's why we don't even expose them, so probably that will all come out in the wash when MAAS gets more sophisticated
<fwereade> rogpeppe, I think that's a dangerous road to go down... that was what we had, and we dropped it because we don't want the conflict mechanism to come into play unless specifying one side *always* implicitly specifies the other
<rogpeppe> fwereade: oh, interesting. i thought current behaviour was entirely provider-specific
<fwereade> rogpeppe, I guess it is but only as a side-effect of different providers exposing different constraint sets
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, it's an accidental property, not an essential one
<rogpeppe> fwereade: ok.
<rogpeppe> fwereade: 'cos we're proposing to go with a single type for constraints across all architectures
<rogpeppe> s/architectures/providers/
<fwereade> rogpeppe, indeed
<fwereade> rogpeppe, and while this makes me a little uncomfortable I don't have any solid arguments against doing so :)
<rogpeppe> fwereade: which means that the conflicts logic needs to happen in a provider-independent way
<fwereade> rogpeppe, yeah; and I suppose it comes down to what precisely an instance-type does specify
<rogpeppe> fwereade: yup
<fwereade> rogpeppe, AFAICT it's just cpu/mem/local-storage
<rogpeppe> fwereade: and currently we're taking that cue from ec2
<fwereade> rogpeppe, and we don't handle local-storage at the moment anyway
<rogpeppe> fwereade: but it might be different on other providers
<fwereade> rogpeppe, having poked around a little, I think some providers may have tighter requirements (on foo-cloud, perhaps, instance-type will always specify arch)
<rogpeppe> fwereade: so do we treat instance-type as conflicting with arch on those providers?
<rogpeppe> fwereade: does instance-type overwrite arch on those providers?
<fwereade> rogpeppe, personally I think it shouldn't, because that feels like a coincidence they could drop at any time
<rogpeppe> fwereade: i'm sure it's too late now, but i *think* things would be much more straightforward if all constraints were totally independent, and constraints unknown to a provider were ignored.
<fwereade> rogpeppe, like amazon actually did -- originally the only arch variation was with t1.micro, which we were happy to special-case because nobody's going to be deploying important services to those anyway
<fwereade> rogpeppe, I think the instance-type concept becomes meaningless once it's divorced from cpu/mem
<rogpeppe> fwereade: it just seems a bit weird we're proposing a general rule ("instance-type implies cpu/mem but not arch") based only on the way that amazon has specified instance types
<fwereade> rogpeppe, it's intended to be the minimum attribute set implied by instance-type
<rogpeppe> fwereade: across all providers? surely some providers might not use instance-type to imply either cpu or mem?
<fwereade> rogpeppe, that's the question... what then does instance-type actually mean?
<rogpeppe> fwereade: indeed
<rogpeppe> fwereade: in the aws case, it's a provider tag.
<rogpeppe> fwereade: ah, no it's not.
<rogpeppe> fwereade: because if we say cg1.4xlarge, we might not get the GPU h/w
<fwereade> rogpeppe, sorry, not following you there
<rogpeppe> fwereade: if it was a tag, then any machine we get would be guaranteed to get that tag.
<fwereade> rogpeppe, if we ask for cg1.4xlarge, we will surely get GPUs
<rogpeppe> fwereade: but as far as i can see, all the constraints specified by cg1.4xlarge are also satisfied by cc2.8xlarge
<fwereade> rogpeppe, if we ask for cpu/mem such that cg1.4xlarge is the cheapest way of satisfying them, we will also get the GPUs, but nobody cares
<rogpeppe> fwereade: but i'm presuming from the comment that that instance type doesn't have the fancy GPU h/w
<rogpeppe> fwereade: if we ask for cg1.4xlarge, are we guaranteed to not get cc2.8xlarge ?
<fwereade> rogpeppe, er, yes, I think so
<fwereade> rogpeppe, barring crackfulness on amazon's side
<rogpeppe> fwereade:  ah, so an instance-type isn't just an alias for the mem, cpu etc attributes of that instance-type
<rogpeppe> fwereade: it is really a tag.
<fwereade> rogpeppe, still not following... isn't it an alias for a certain amount of GPU as well?
<fwereade> rogpeppe, it's just that we don't happen to expose gpu
<rogpeppe> fwereade: it implies that, yeah.
<fwereade> rogpeppe, back to instance-type, briefly... across all providers I'm aware of, instance-type implies cpu/mem
<fwereade> rogpeppe, in many cases it implies more
<fwereade> rogpeppe, but the common properties are cpu/mem, and I'm having trouble imagining an instance-type that doesn't
<fwereade> rogpeppe, can you flesh out your earlier what-if with a plausible scenario?
<rogpeppe> fwereade: i guess i'm uncomfortable with that as a minimum. for instance, on some providers, instance-type may imply arch, but arch won't be cleared out when we specify instance-type
<rogpeppe> fwereade: and in that case, we'll fall back to the situation i'm suggesting we do for everything - it just won't satisfy anything
<rogpeppe> fwereade: hold on, there's parcel at the door
<fwereade> rogpeppe, it's a first layer of defence against knowably nonsensical requests... we don't have any guarantees that anything will actually be provisionable ever
<fwereade> rogpeppe, and it gives us slightly more helpful behaviour when dealing with the interplay between env and service constraints
<fwereade> rogpeppe, I think the second point is the one you need to address if you want to drop the concept
<rogpeppe> fwereade: it's more than a layer of defense, i think - the "setting an instance-type constraint will clear out any inherited cpu or mem values" thing is awkward
<rogpeppe> fwereade: yeah, that is the second point, i think
<fwereade> rogpeppe, indeed... it may be awkward for us but I think it's useful for actual users in realistic scenarios
<rogpeppe> fwereade: hmm, not entirely convinced. in that scenario earlier, it would be easy for the deployer to do "instance-type= mem=8g"
<rogpeppe> fwereade: i.e. to explicitly clear out the instance-type constraint rather than remembering that mem will override instance-type. (and arch won't...)
<fwereade> rogpeppe, it forces them to think more to get the same result
<rogpeppe> fwereade: really? i think it gives more "stuff" to remember.
<rogpeppe> fwereade: i.e. there's interplay between the attributes as well as the inheritance of attributes.
<fwereade> rogpeppe, the intent is that users shouldn't actually have to worry about it... if they want a machine with 8G, we give them one, without forcing them to take additional context into account
<rogpeppe> fwereade: but that's the context that they themselves have specified...
<fwereade> rogpeppe, that they or someone else specified at some point arbitrarily long ago...
<rogpeppe> fwereade: also, when they get told that no machines will be deployed, that's a pretty big hint they've got something wrong...
<rogpeppe> fwereade: anyway, i've missed the boat here. i think we have to implement the current python semantics
<fwereade> rogpeppe, the point is to reduce the incidence of that sort of situation
<rogpeppe> fwereade: i'm not sure that adding more magic helps here
<fwereade> rogpeppe, and, yes, I think so; but tbh if we're planning to change ec2-zone to availability-zone, et cetera, I feel like we've reopened the whole thing really :/
<rogpeppe> fwereade: the user will encounter this situation with other conflicting attributes, so they've got to know how to deal with this kind of situation anyway
<fwereade> rogpeppe, they may do, but reducing the incidence of "oh ffs, wasn't it obvious what I meant?"s feels worthy to me
<fwereade> rogpeppe, it's not as if it's an unpredictable scenario
 * rogpeppe finds value in a simple conceptual model, even if it makes some things a little harder.
<fwereade> rogpeppe, so, we should just drop instance-type entirely? :p
<rogpeppe> fwereade: no.
<rogpeppe> fwereade: just treat it as independent
<rogpeppe> fwereade: if someone wants to specify instance-type=t1.micro mem=8G, let them do so
<rogpeppe> fwereade: they won't get many machines working for them
<rogpeppe> fwereade: and i think, on further reflection, i'd have providers treat any constraints unknown to them as unsatisfiable.
<fwereade> rogpeppe, unknown-constraints=unsatisfiable definitely works for me
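The unknown-constraints-are-unsatisfiable rule just agreed on is small enough to sketch directly. The constraint names and the provider's known set here are assumptions for illustration:

```go
package main

import "fmt"

// known lists the constraint names a hypothetical provider understands.
var known = map[string]bool{
	"arch":          true,
	"mem":           true,
	"instance-type": true,
}

// satisfiable reports whether this provider could ever satisfy the given
// constraint set: any key it does not recognise makes the whole set
// unsatisfiable, per the rule proposed above.
func satisfiable(c map[string]string) bool {
	for name := range c {
		if !known[name] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(satisfiable(map[string]string{"mem": "8G"}))
	fmt.Println(satisfiable(map[string]string{"maas-tags": "gpu"}))
}
```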
<fwereade> rogpeppe, the issue is really that it's not always obvious that you're specifying instance-type=t1.micro at the same time as mem=8G
<fwereade> rogpeppe, sure, some users will have to dig in and figure out the details at some point
<fwereade> rogpeppe, delaying the point at which they do feels to me like a good thing :)
<rogpeppe> fwereade: we could make it very easy for clients to find out the total set of constraints applying to a service. a tool to debug this stuff would seem very useful to me.
<fwereade> rogpeppe, well, yeah, get-constraints exists
<fwereade> rogpeppe, forcing users to check and calculate things in order to deploy machines feels counterproductive to me
<fwereade> rogpeppe, especially since the effects are not immediately visible in the first place
<rogpeppe> fwereade: ah, now there's an interesting point
<fwereade> rogpeppe, the inheritance is the really annoying bit here
<rogpeppe> fwereade: you don't get any immediate feedback if no machines might satisfy the constraint
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, yeah: changing env constraints could break an existing service (which you might only need to add a unit to in a week)
<fwereade> rogpeppe, we can't entirely eliminate it but we should really try to guard against it
<fwereade> where we can
<rogpeppe> fwereade: or someone might be doing it deliberately - it's always possible
<rogpeppe> fwereade: here's a possibility: when you specify a set of constraints that can't be satisfied, you get an immediate warning.
<fwereade> rogpeppe, if I want to make a specific service undeployable I think I'll be explicitly setting screwed-up settings on that service, not figuring out the right env change to break that and only that
<rogpeppe> fwereade: and when setting environment constraints, you'll get a warning for all services that have been broken.
<rogpeppe> fwereade: that seems potentially more useful (and less provider-specific) than saying that instance-type overrides cpu and mem
<fwereade> rogpeppe, I can get behind that, but IMO it's orthogonal to the is-this-little-bit-of-magic-actually-useful-in-realistic-scenarios
<fwereade> rogpeppe, I would really prefer to avoid forcing users to think when we don't have to
<fwereade> rogpeppe, please give me a plausible scenario in which we have an instance-type that doesn't specify cpu/mem
<rogpeppe> fwereade: hmm. we're not talking MS-Word users here. these, hopefully are people that *can* think!
<Aram> btw, I did this yesterday: code.google.com/p/rbits/godefs . It's a go style command that prints only type definitions from go programs. it's very useful with acme rogpeppe.
<rogpeppe> fwereade: it's not that instance-type won't always specify cpu and mem, but that it often implies more
<Aram> I plan to sort the output to be more useful.
<Aram> but it's massively useful even in this stage
<Aram> to me
<rogpeppe> Aram: i can better that (but only by sending you a binary - my code won't build any more)
<Aram> please.
<fwereade> rogpeppe, my contention is, I guess, that cpu/mem/instance-type constraints are the most obvious and commonly-used ones, and kinder behaviour with those is valuable even if it doesn't include a full DWIM engine for every possible constraint
<rogpeppe> fwereade: and i think that having the special rule in this case will mean that people will be more confused when they move beyond the bounds of this special case
<rogpeppe> Aram: i've sent you a tarball.
<rogpeppe> Aram: assuming you're running ubuntu under amd64, it should work
<Aram> thanks
<rogpeppe> Aram: the way it works is like this
<rogpeppe> Aram: you click on an identifier, and middle-click on "def" in the window tag.
<rogpeppe> Aram: it prints the file address of the definition of that identifier
<rogpeppe> Aram: (and also the file address where you are currently, so you can jump back in context)
<rogpeppe> Aram: it works for almost any identifier
<rogpeppe> Aram: (it doesn't work if you've imported to ".")
<rogpeppe> Aram: you can do "def -a" to find out the actual type and all members of the value.
<rogpeppe> Aram: i really need to port it to Go 1.
<Aram> hi niemeyer
<rogpeppe> niemeyer: yo!
<fwereade> niemeyer, heyhey
<rogpeppe> niemeyer: how's london?
<rogpeppe> niemeyer: looks hot to me
<Aram> rogpeppe: your def thing works great.
<rogpeppe> Aram: cool
<rogpeppe> Aram: i think you're probably user number 1.
<Aram> heh.
<rogpeppe> Aram: i wrote it years ago
<rogpeppe> Aram: unfortunately i forked the go parser to do it, and my changes were not acceptable.
<rogpeppe> Aram: and i've been waiting for something equivalent to appear since then.
<Aram> can go/parser do the job now?
<rogpeppe> Aram: nope.
<Aram> what does it lack?
<niemeyer> Heya
<rogpeppe> Aram: a place to put type definitions. my version of go/parser resolves all symbols at read time.
<Aram> hmm...
<niemeyer> rogpeppe: Great, sprint running smoothly
<rogpeppe> niemeyer: cool. what's the sprint BTW?
<niemeyer> and it's warm, yeah
<rogpeppe> Aram: i just need to bite the bullet and have an independent symbol table. but i haven't had the incentive since the  binary continues to work (the joy of static linking!)
<Aram> indeed.
<Aram> rogpeppe: I wonder if I can't do some plumbing tricks so middle clicking on def is not required?
<rogpeppe> Aram: i don't think you want a right-click to take you to the definition of that identifier. usually you want the next *use* of that identifier
<rogpeppe> Aram: otherwise i'd've done it.
<Aram> yes, indeed.
<rogpeppe> niemeyer: now that go-ec2-use-go-client is in, i thought i'd look towards constraints. i was just having a useful conversation with fwereade about same.
<Aram> niemeyer: what's the stance on zookeeper? do we want to replace it if we can?
<niemeyer> rogpeppe: I'd prefer if we focused on getting the basic image actually running first
<rogpeppe> niemeyer: what do you mean?
<rogpeppe> niemeyer: you mean get the agents going?
<niemeyer> rogpeppe: Constraints give us some freedom about which images and so on to pick
<niemeyer> rogpeppe: But none of us can get a running deployment yet, on any machine
<niemeyer> rogpeppe: That sounds more important to get sorted first
<rogpeppe> niemeyer: AFAICS we need the agents to do that
<niemeyer> rogpeppe: Hmm, yes? :-)
<rogpeppe> niemeyer: perhaps i should look at doing an agent then
<rogpeppe> niemeyer: machine agent seems like it's next in line
<rogpeppe> niemeyer: i'd be happy to start on that
<niemeyer> rogpeppe: Indeed
<rogpeppe> niemeyer: great, will do
<niemeyer> rogpeppe: That'd be brilliant, thank you
<niemeyer> rogpeppe: fwereade was going to do it, but he ended up focusing on the agent tools and whatnot, which are nice things to spin off too
<rogpeppe> niemeyer: that's true. and there's quite a lot more in that area
<niemeyer> Aram: There's a tentative plan to get an implementation of the gozk API on top of MongoDB
<rogpeppe> niemeyer: we need to think about where there might be an Aram-sized hole somewhere too :-)
<niemeyer> Aram: I got started on that front, but ended up not moving on much since the flight from UDS :)
<niemeyer> rogpeppe: Agreed
<Aram> I don't know anything about mongodb, does it offer the same features and constraints as zookeeper/doozer?
<niemeyer> Aram: It's a pretty different beast, for better and for worse
<Aram> I read somewhere that mongo uses "eventual consistency" while the algorithms used by zookeeper/doozer are consistent at all times.
<Aram> indeed.
<niemeyer> Aram: Hmm.. that's not *entirely* the case
<niemeyer> Aram: It has a master
<niemeyer> Aram: and replication
<niemeyer> Aram: It has leader election within a replica set with consensus-style logic
<niemeyer> Aram: So it's not entirely unlike it
<niemeyer> Aram: Either way, as a first step, I suggest reading through the code base and trying to understand the relationship between things
<niemeyer> Aram: and covering the usual startup tasks
<Aram> yes.
<niemeyer> Aram: Then, let's see.. I'd prefer if you helped pushing us forward towards feature parity
<niemeyer> Aram: But we need to identify a nicely-sized problem chunk for that
<niemeyer> Aram: If we can't find an isolated-enough problem, then maybe pushing that gozk alternative might be a good option given your previous experience
<niemeyer> Aram: Perhaps, as a way to get started, you might start looking at a few of the missing command line entry points
<niemeyer> Aram: We have pretty minimum support for that, and might be a nice way to push some small branches while you get used to the workflow and whatnot
<niemeyer> Anyway.. I'll get some food or will miss lunch
<niemeyer> biab
<Aram> btw niemeyer, is this expected: http://pastebin.com/hgWd0qpg
<Aram> people on #go-nuts complained about it as well
<fwereade> niemeyer, I saw that as well, probably should have mentioned it :/
<fwereade> Aram, please keep in touch with me re command line tools; I'm working on stuff that's needed for `juju deploy` at the moment (while relation state doesn't exist yet)
<Aram> ok.
<fwereade> Aram, there are certainly commands that are orthogonal though
<fwereade> Aram, I can't think of any blockers for add-unit and remove-unit
<fwereade> lunch, bbiab
<niemeyer> Aram: Yeah, it's a small bug in wait.js
<niemeyer> Aram: It should say "got tired of waiting" or something
<Aram> ah, so it's nothing on my side that causes the test to fail, ok.
<rogpeppe> fwereade, niemeyer: this looks wrong to me. it looks like it's parsing the internal machine id to get the value returned from Machine.Id. am i on crack? http://bazaar.launchpad.net/+branch/juju/go/view/head:/state/machine.go#L64
<niemeyer> rogpeppe: Why is it wrong?
<rogpeppe> niemeyer: because it's parsing the internal id (the zk path) which i don't *think* has a one-to-one correspondence with the external id
<rogpeppe> niemeyer: but maybe i'm wrong and it does
<rogpeppe> niemeyer: certainly the python code seems to keep them separate
<niemeyer> rogpeppe: state/machine.py, line 276
<rogpeppe> niemeyer: ah, fair enough
<rogpeppe> niemeyer: i was sure we had a discussion recently which decided they were different
<niemeyer> rogpeppe: They are different..
<rogpeppe> niemeyer: something to do with failing half way through allocating a new machine id
<niemeyer> rogpeppe: As the comment says, it's an implementation detail that today we map one to the other
<niemeyer> We should copy the rest of the comment saying that this is an implementation detail
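The internal-key-to-id mapping discussed above can be sketched like this. That the ZooKeeper sequence number maps one-to-one onto the external machine id is, as the comment says, an implementation detail; the key format and function name here mirror the idea rather than the exact juju code:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// machineIdFromKey recovers the external machine id from an internal
// ZooKeeper key such as "machine-0000000007". This parse only works
// while the id-to-sequence mapping remains an implementation detail
// the state layer is free to rely on.
func machineIdFromKey(key string) (int, error) {
	s := strings.TrimPrefix(key, "machine-")
	if s == key {
		return 0, fmt.Errorf("invalid machine key %q", key)
	}
	return strconv.Atoi(s)
}

func main() {
	id, err := machineIdFromKey("machine-0000000007")
	fmt.Println(id, err)
}
```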
<rogpeppe> niemeyer: what's responsible for actually creating the machine directory?
<rogpeppe> i would assume the provisioning agent, but perhaps not
 * rogpeppe goes back to the source
<niemeyer> rogpeppe: state/machine.py, line 18
<niemeyer> rogpeppe: No code outside of state should be manipulating ZooKeeper nodes directly
<rogpeppe> niemeyer: agreed.
<rogpeppe> niemeyer: i'm slightly surprised that clients are doing non-atomic actions though.
<rogpeppe> niemeyer: i guess that means that machine ids may not be allocated in strict sequence.
<niemeyer> rogpeppe: Yep.. there is/was no multinode atomic operation in zk
<niemeyer> rogpeppe: It's no big deal, though..
<rogpeppe> niemeyer: that's true
<niemeyer> Crap, we're moving rooms..
<niemeyer> Without a battery, that means a restart :-(
<niemeyer> biab
<rogpeppe> k
<rogpeppe> off for some lunch
<Aram> godep launchpad.net/... | grep '^launchpad' | awk '{ printf("%s\n\n", $0)}' | awk 'BEGIN { RS=" " } { printf("%s\n", $0) }' | egrep '(^launch)|(^os/exec)|^C|(^net)|(^io/ioutil)|(^$)'
<Aram> oops
<Aram> wrong window :).
<Aram> wrong paste, rather.
<niemeyer> No passwords at least! ;-)
<Aram> niemeyer: I don't understand what path is used for in Coerce.
<niemeyer> Aram: Presenting debugging details in case of errors
<niemeyer> Aram: "a.b.c is wrong"
<Aram> why is path always initialized like this in tests? path := []string{"<pa", "th>"}
<niemeyer> Aram: No good reason..
<Aram> ok, I thought it had some meaning but couldn't find any :).
<Aram> great.
<Aram> rogpeppe: I am already in love with your acme def tool, I used codesearch for similar behavior, but that was noisy, yours is 100% accurate, no visual grepping needed.
<rogpeppe> Aram: cool. yeah, i love it too.
<rogpeppe> Aram: it was even better when i hacked godoc so that the identifiers in the html source code were links
<rogpeppe> Aram: but that went by-the-by sadly
<Aram> yeah, I imagine.
<rogpeppe> Aram: i find it really useful for working my way around strange code bases.
<rogpeppe> niemeyer, fwereade: in machine/unit.py, UnitMachineDeployment seems identical to SubordinateContainerDeployment. is there a reason for the separation? is there something that is likely to come later, for example?
 * fwereade looks
<niemeyer> rogpeppe: I haven't seen that code at all, FWIW
<niemeyer> rogpeppe: Haven't reviewed that code
<rogpeppe> the init methods are byte-for-byte identical
<niemeyer> rogpeppe: I hope we can take the hindsight to clean and simplify things up in the agents, as a general comment
<rogpeppe> niemeyer: i certainly intend to do what i can towards that
<rogpeppe> niemeyer: first blocker i can see is there's no unit watcher
<fwereade> rogpeppe, yeah, I don't see the point of Subordinate...
<rogpeppe> fwereade: thanks
<fwereade> rogpeppe, and it's new to me
<Aram> why is CurrentArch in environs runtime.GOARCH (dependent on where the user runs juju) but CurrentSeries is hardcoded to precise (independent of where the user runs juju), why aren't both hardcoded since they are related only to the cloud environment and not to the local environment?
<Aram> Is it because the default is used in things like testing in a lxc container? Why not set up correct arch in that particular test?
<rogpeppe> Aram: CurrentSeries is TODO
<rogpeppe> Aram: it's only just gone in
<rogpeppe> Aram: and i haven't done that bit yet
<rogpeppe> in fact, i should probably clean that up before starting on the machine agent
<Aram> I see.
<rogpeppe> Aram: CurrentArch needs looking at too, as runtime.GOARCH may not map well to linux arch
<Aram> well yes, I didn't question CurrentSeries, but CurrentArch seemed peculiar to me :).
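[editor's note: rogpeppe's point that runtime.GOARCH may not map cleanly onto Ubuntu architecture names could be sketched like this; the table is illustrative (Go's "386" really is "i386" in the archive), not the mapping juju actually shipped:]

```go
package main

import (
	"fmt"
	"runtime"
)

// ubuntuArch translates a Go GOARCH value into the corresponding
// Ubuntu/Debian architecture name. Only the entries below were
// checked against the archive naming; anything else falls through.
func ubuntuArch(goarch string) string {
	switch goarch {
	case "386":
		return "i386"
	case "amd64":
		return "amd64"
	}
	// Fall back to the raw GOARCH value for anything unknown.
	return goarch
}

func main() {
	fmt.Println(ubuntuArch(runtime.GOARCH))
}
```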
<rogpeppe> fwereade: i was just looking at your upstart package again
<rogpeppe> fwereade: the %q seems dubious (is upstart quoting exactly the same as Go's?) but i couldn't find a definitive reference...
<rogpeppe> fwereade: (i'm referring to the %q as used to quote env vars in the upstart script BTW)
<niemeyer> rogpeppe: Isn't the string well known?
<rogpeppe> niemeyer: Conf.Env looks like it might contain arbitrary text to me.
<rogpeppe> niemeyer: but i've probably misinterpreted
<smaddock> niemeyer: ping
<niemeyer> rogpeppe: No, I think you got it mostly right.. the package is the thing that is a bit questionable I guess
<niemeyer> rogpeppe: We're really using it mostly for our own configuration
<niemeyer> rogpeppe: But it might change in the future
<niemeyer> smaddock: Heya
<rogpeppe> niemeyer: that's true. but *if* we're going to do this package, i think it should be correct for any input.
 * rogpeppe hates it when people define a config format but don't properly define the quoting rules.
<fwereade> rogpeppe, hmm, that sounds like a nice catch
<niemeyer> rogpeppe: Yeah, or add a BUG note
<rogpeppe> fwereade: am currently looking at libnih trying to work out what the actual quoting rules are...
<fwereade> rogpeppe, heh :(
<rogpeppe> niemeyer: that too.
<niemeyer> Given that what's there should do for the foreseeable future
<fwereade> niemeyer, rogpeppe: I'll add a bug note
<rogpeppe> i thought originally that it was shell script, where using %q would be more problematic
<rogpeppe> always fun when someone rings up and tells you you've got a virus and they can fix it
<rogpeppe> i couldn't quite work out what the scam was though. what nasty thing could they do by looking at the eventvwr app?
<rogpeppe> darn, gotta go, i'm late!
<fwereade> later all :)
<davecheney> Aram: did you get the tests to pass ?
<Aram> davecheney: it's upstream, not in my environment. niemeyer confirmed.
<davecheney> reitveld is having a seniors moment
#juju-dev 2012-05-29
<davecheney> woot, my x220 arrived
<davecheney> boo, my 9mm ssd does not fit
<imbrandon> whats the best way to obtain the machine_id outside of a hook context , e.g from a script running on cron so i can pass it to a cli app via --machine_id , is that available yet ?
<wrtp> davecheney, fwereade: mornin'
<fwereade> wrtp, heyhey
<davecheney> morning' lads
<davecheney> wrtp: i committed the environ/setconfig stuff last night
<wrtp> davecheney: yay!
<davecheney> so you should be cool to merge your PutBucket caching
<wrtp> davecheney: i'll do the madeBucket stuff next then
<davecheney> wrtp: jynx
<davecheney> also, my x220 came this afternoon
<davecheney> so I am in heaven
<wrtp> davecheney: great. it's been working well for me
<wrtp> davecheney: did you go for solid state?
<davecheney> i was going to move over the disk I had in my netbook
<davecheney> but it turns out you need a 7mm ssd
<davecheney> so i'll put my current one in my desktop and get another one for this
<davecheney> at the moment i'm in 7200rpm country
<davecheney> eww
<davecheney> wrtp: fwereade g+ says that gustavo is in +1 atm
<davecheney> is that true ?
<wrtp> davecheney: yeah
<wrtp> davecheney: he's in london
<davecheney> conference ?
<wrtp> davecheney: sprint of some kind
<fwereade> davecheney, wrtp speaks truth :)
<davecheney> i heard they are moving out of millbank towers
<wrtp> davecheney: yeah. don't know why though.
<wrtp> davecheney: maybe lease ran out or something
<davecheney> sounds like a long way to go for a working bee, but who am i to judge
<fwereade> davecheney, wrtp: as I understand it just not enough space there
<wrtp> fwereade: ah, makes sense
<fwereade> davecheney, wrtp: bit disappointed in myself that I never got round to visiting, apparently it was rather cool
<wrtp> fwereade: me too.
<davecheney> wrtp: i'm still getting that strange 'cant find cmd/jujud' test error
<wrtp> fwereade: didn't have an excuse though
<davecheney> it eventually sorts itself out
<wrtp> davecheney: have you still got it?
<davecheney> wrtp: happens any time I touch ~/go/src/pkg
<davecheney> it's clearly a bug in the go test support for main packages
<davecheney> but i'm thinking of just giving up and moving the agents out of main
<wrtp> davecheney: yeah.
<wrtp> davecheney: that's not the way forward though!
<davecheney> i'm on the fence if testing main packages is really the right solution
<wrtp> davecheney: i've been trying to home in on the bug, but haven't figured out a way of reliably reproducing it
<wrtp> davecheney: i don't see why not.
<davecheney> wrtp: cd ~/go/src && ./all.bash && rm -rf $GOPATH/pkg/launchpad.net/juju/*
<davecheney> will do it
 * wrtp tries that
<wrtp> davecheney: doesn't seem to
<davecheney> oh, sorry, do a hg update
<davecheney> you need to have a different version of the std lib
<davecheney> % go version
<davecheney> go version weekly.2012-03-27 +1afae7555667
<davecheney> ^ the last bit has to change
<davecheney> wrtp: if that doesn't work, then leave it with me
<davecheney> i'll figure out what is the cause
<davecheney> that upsets it
<fwereade> davecheney, btw, re Machine.Config()... my gut says this should just be a pair of SetInstanceId/InstanceId methods
<wrtp> davecheney: nope, that didn't work either
<fwereade> davecheney, I'm not sure we really want anyone outside to know/care what keys we're using under the hood
<davecheney> wrtp: ok, i'll figure out what i'm doing that shits it
<davecheney> i'm pretty sure it's related to the damage I do to my ~/go
<davecheney> fwereade: fair call
<davecheney> it's not hard to change, the underlying content of the machine node is still a *ConfigNode
<davecheney> unless anyone has objections
<fwereade> davecheney, cool, thanks :)
<fwereade> davecheney, I'll note on the review for form's sake
<davecheney> fwereade: I won't change it tonight (can't be arsed merging) but consider your advice accepted
<wrtp> davecheney: why do you think testing main packages might not be good, BTW?
<davecheney> wrtp: well, 'cos it's buggy
<wrtp> davecheney: apart from that
<fwereade> davecheney, cheers
<wrtp> davecheney: (the bug will be fixed)
<wrtp> davecheney: i only see that bug *very* sporadically BTW
<davecheney> i think (although this is only a matter of personal taste) that main packages should be as small as possible, they should just import the stuff that they need, instantiate it
<davecheney> and off they go
<davecheney> it's just a personal preference
<davecheney> sort of based on the way that the go tool works
<davecheney> ie, most of the logic is in go/build and friends
<wrtp> davecheney: that's definitely not the way that the go tool is written!
<davecheney> shows what I know :)
<davecheney> anyway, it's not even a small objection
<wrtp> davecheney: there are >7000 lines in cmd/go/*.go
<wrtp> davecheney: bits which factor out nicely are done as packages, yeah
<wrtp> davecheney: but i'm not sure i agree about the principle that all the logic of command line program should be in external packages.
<davecheney> more guidelines
<wrtp> davecheney: i think packages should be useful...
<davecheney> i don't think of main as a package
<wrtp> davecheney: it's not
<wrtp> davecheney: well... it is strictly, but.
<davecheney> and that forms the basis of my preference, nothing more
<fwereade> wrtp, I see where davecheney's coming from, but in my case I suspect this is a prejudice based on unpleasant experiences with vast untestable mains in c/c++ :)
<wrtp> davecheney: but i'm thinking about the package that you hive off so that the main package can be minimal
<wrtp> davecheney: taking it to its logical conclusion you just get another package with a Main function...
<davecheney> wrtp: there are limits to any good intention
<wrtp> davecheney: and in Go (that bug notwithstanding) we *can* test main packages and their component pieces
<davecheney> wrtp: maybe I should just try to figure out why go test keeps fucking up, then there is no reason for main to be a 2nd class citizen :)
<wrtp> davecheney: indeed.
<davecheney> what time does aram come online ?
<davecheney> i was going to stick around for a chat
<wrtp> davecheney: as i said, i only see that bug about once a month or so
<wrtp> davecheney: i don't know when Aram comes on line, but soon I imagine
<davecheney> wrtp: i get it more frequently, i'm pretty sure it's related to linker skew when I rebuild go
<davecheney> and I remember a bug with the go tool not checking the return code for the linker or go pack
<davecheney> which might be a clue
<wrtp> davecheney: we had the thought that we could try to have a G+ hangout with all of us today. do you think that might be possible?
<davecheney> remy_o had a CL
<davecheney> but it got superseded by another
<wrtp> davecheney: doesn't linker skew result in an error (incompatible versions, or something)?:
<davecheney> wrtp: it _should_, but I think in this case the actual error is being eaten, so you don't get the .a on disk, then testing can't find it
<davecheney> which is my suspicion
<wrtp> davecheney: BTW if i touch the Go files, the problem goes away for me
<davecheney> yeah, if you rebuild things in just the right way, and face the right direction, then problem goes away
<davecheney> do you use go build or go get ?
<wrtp> davecheney: in the normal course of things?
<davecheney> alias gb='go install -v'
<wrtp> davecheney: for juju, i rarely use anything but go test
<davecheney> oh, that is weird
<davecheney> that isn't supposed to be like that
<davecheney> i thought I made it go install -v
<wrtp> davecheney: i use go test -i too
<davecheney> sorry, i thought it was go get -v
<wrtp> davecheney: why would you use go get?
<wrtp> davecheney: i suppose it gets dependencies automatically if they're not there
<davecheney> yup
<davecheney> turns out i don't need it :)
<wrtp> davecheney: but if they are there, then there's no difference.
<wrtp> i only use go get when i fetch something new
<davecheney> oh, i remember now, i used to alias gb='go build -v'
<davecheney> but that won't install intermediate packages
<davecheney> so I switched to go install
<wrtp> davecheney: i don't use go build because it leaves the executables around
<wrtp> davecheney: and they make the tree dirty
<davecheney> yeah, and poos in your working copy
<wrtp> davecheney: so can you reproduce issue 3417 now, out of interest?
<davecheney> sporadically
<wrtp> davecheney: i meant right at this moment, by following your instructions above.
<wrtp> davecheney: but it doesn't matter - i was just interested
<davecheney> wrtp: actually it's happening to me right now
<davecheney> i wonder if it's timestamp related
<wrtp> davecheney: that's my thought
<davecheney> ctime <> mtime
<wrtp> davecheney: nah, not ctime
<davecheney> i run with noatime
<davecheney> but I don't think go can even look at that
<wrtp> davecheney: could you do an ls -ltR of $GOROOT and $GOPATH
<davecheney> u want it all ?
<wrtp> davecheney: i reckon. it won't be much gzipped and it's easier than just picking out the bits we need.
<wrtp> davecheney: in fact, don't worry about the -t bit
<wrtp> davecheney: better in alphabetical order
<wrtp> davecheney: if you could make it print the times in unix format, that would be better too
<davecheney> wrtp: i'll do it after dinner
<wrtp> davecheney: as numeric timestamps, i mean. i can't remember the option (if there is one)
<wrtp> davecheney: cool.
<davecheney> trying to finish up and send off my email
<davecheney> to see if I can get some review time with gustavo
<wrtp> davecheney: i've been waiting for the opportunity to arise again
<davecheney> trying to avoid another setconfig by keeping the changeset as small as possible
<davecheney> wrtp: bloody lbox, you can't add -req if you've already proposed, even in -wip
<wrtp> davecheney: no, it's a pain
<davecheney> reqs are a pain, full stop
<wrtp> davecheney: it's not lbox, it's launchpad
 * davecheney shakes fist at lp
<wrtp> davecheney: you can't change the prereq of a merge
<davecheney> and you can't have more than one prereq O_o!
<wrtp> davecheney: you'll have to delete the merge request
<wrtp> davecheney: yeah, that too
<davecheney> DVCS for the fail
<davecheney> anyway, this is just end of day whinging
<wrtp> davecheney: lp is not the greatest
<davecheney> fwereade: // InstanceId returns the provider specific identifier for this machine. If no id has been assigned by the provider, this value will be blank.
<fwereade> davecheney, SGTM
<davecheney> and it just panics if there is crap in the config node
<davecheney> actually, i'm not going to do this today, i'm going to call it an evening
<davecheney> it's 6pm
<fwereade> davecheney, yeah, have a rest
<davecheney> fwereade: but i'll address that first thing tomorrow
<davecheney> later lads
<fwereade> davecheney, not 100% sure about panicking
<fwereade> davecheney, take care
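[editor's note: fwereade's reservation about panicking points at the usual Go alternative of returning an error. A hypothetical sketch of the InstanceId accessor under discussion; the config key and the map-backed type are stand-ins, since the real code reads a ZooKeeper-backed *ConfigNode:]

```go
package main

import (
	"errors"
	"fmt"
)

// machine stands in for state.Machine.
type machine struct {
	config map[string]interface{}
}

var errBadInstanceId = errors.New("machine: instance id is not a string")

// InstanceId returns the provider-specific identifier for this
// machine. If no id has been assigned yet, it returns "". Malformed
// config produces an error rather than a panic.
func (m *machine) InstanceId() (string, error) {
	v, ok := m.config["provider-machine-id"] // key name is assumed
	if !ok {
		return "", nil // not assigned yet
	}
	id, ok := v.(string)
	if !ok {
		return "", errBadInstanceId
	}
	return id, nil
}

func main() {
	m := &machine{config: map[string]interface{}{"provider-machine-id": "i-1234"}}
	id, err := m.InstanceId()
	fmt.Println(id, err)
}
```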
<TheMue> Morning, and good night davecheney
<fwereade> heya TheMue :)
<TheMue> fwereade: Moin
<wrtp> davecheney: see ya
<niemeyer> Good mornings!
<fwereade> heya niemeyer
<rogpeppe> niemeyer: yo!
<niemeyer> Hey guys
<niemeyer> fwereade: LGTM on charm-store-backend
<niemeyer> fwereade: Good stuff
<TheMue> niemeyer <- themue.moin()
<fwereade> niemeyer, thanks :)
<niemeyer> TheMue: Moin :)
<rogpeppe> TheMue: i've been looking at doing the machine agent, and i realise that we haven't got a unit watcher yet. have you looked at that at all?
<TheMue> rogpeppe: You mean the watcher returned by watch_service_unit_state in the py code?
<rogpeppe> TheMue: i think so
 * rogpeppe checks
<TheMue> rogpeppe: If so then yes, it's a not yet covered watcher.
<Aram> morning.
<rogpeppe> Aram: heya
<TheMue> Aram: Moin.
<fwereade> Aram, heyhey
<rogpeppe> TheMue: actually the one i need is MachineState.watch_assigned_units
<TheMue> rogpeppe: That's also an open watcher. Machine is so far very small.
<rogpeppe> TheMue: i could do it assuming you're currently working on something else.
<TheMue> rogpeppe: Currently I'm working on State.AddRelation() (redesign after review). Then I can do the watcher.
<TheMue> rogpeppe: Oh, if you would like to then thank you.
<rogpeppe> fwereade: do you know if it's possible to have two units on the same machine using the same charm? by my very coarse reading, it looks like it might be possible and that it could cause problems.
<rogpeppe> niemeyer: ^
<fwereade> rogpeppe, expand a little please... I can't think of an obvious way to do so, but there probably is one
<fwereade> rogpeppe, and it depends on the problems... is it the known will-try-to-use-same-ports deal, or something else that's worrying you?
<rogpeppe> fwereade: we could have two subordinate services with different service names but using the same charm, right?
<rogpeppe> fwereade: it's the shared directory that concerns me
<niemeyer> rogpeppe: The same *charm* in the same machine? Of course
<fwereade> rogpeppe, hmm, ok, I haven't looked into subordinates in details... I *thought* distinct units got distinct dirs
<fwereade> rogpeppe, and will therefore have 2 copies of the charm
<rogpeppe> niemeyer: it looks to me like a given charm will always be stored in the same directory, regardless of the service
<niemeyer> rogpeppe: That's bogus
<fwereade> rogpeppe, that said I have occasionally spotted and fixed bugs along those lines so there could well be more I missed
<rogpeppe> niemeyer: yeah, i think so. i may be wrong though.
<niemeyer> rogpeppe: If you mean expanded in the same directory, that is
<fwereade> rogpeppe, are you sure the charm isn't expanded to a unit-specific dir?
<rogpeppe> niemeyer: yes i do. or at least, that's how it looks.
<rogpeppe> fwereade: no :-)
<niemeyer> rogpeppe: If different service units are living in the same directory, that's a bug
<fwereade> rogpeppe, do you have a reference in the python?
<rogpeppe> fwereade: unit/deploy.py
<rogpeppe> fwereade: download_charm
<rogpeppe> fwereade: charms_directory is always $units_directory/charms
<fwereade> rogpeppe, hmm, yeah, looks like crack to me
<rogpeppe> fwereade: ok, thanks
<niemeyer> rogpeppe: service unit != charm, FWIW
<fwereade> rogpeppe, niemeyer: one thought: why is the machine agent responsible for this in the first place?
<rogpeppe> niemeyer: yeah, i thought so
<niemeyer> fwereade: Probably historical, before we had subordinates
<fwereade> rogpeppe, niemeyer: it's always seemed odd to me that we have duplicated code for getting and unpacking charms in unit/machine agents
<niemeyer> fwereade: Might make sense to have the unit doing it
<fwereade> niemeyer, +1
<rogpeppe> fwereade: yeah, perhaps the unit agent should unpack all charms
<rogpeppe> fwereade: which makes the machine agent simpler, which sounds good to me right now :-)
<fwereade> rogpeppe, cool :)
<rogpeppe> fwereade: while you're here...
<fwereade> rogpeppe, yeah?
<rogpeppe> fwereade: in agents/machine.py, it looks to me like service_state_manager is never used. am i missing something superclassy?
 * fwereade looks
<fwereade> rogpeppe, can't see any usage of it, can't see any reason to need it in the first place
<rogpeppe> fwereade: thanks
<rogpeppe> niemeyer: i think we've said that we want to build in upgrade from the beginning. is there a plan of how to do that, in particular how we want to tell the agents to upgrade themselves?
<fwereade> rogpeppe, do you recall any relevant details re plans for the dummy environ's zookeeper?
<rogpeppe> fwereade: more or less
<fwereade> rogpeppe, I think I need that soonish, so anything you can transfer would be good
<rogpeppe> fwereade: i'll just have a peek to remind myself
<rogpeppe> fwereade: the idea is to start a zookeeper if the zookeeper attribute in the config is true
<rogpeppe> fwereade: it'll only happen the first time that an environ of the given name is opened, but that seems right to me
<fwereade> rogpeppe, the issue is that I'm pretty sure we don't want to start a new zookeeper every time we need a dummy environ for tests
<fwereade> rogpeppe, hm, I must be missing something
 * fwereade looks again
<rogpeppe> fwereade: that's fine - don't set the zookeeper attr to true in those cases.
<fwereade> rogpeppe, I mean I think we'll want to share zookeepers... won't we?
<rogpeppe> fwereade: hmm. that's potentially awkward, but should be doable
<fwereade> rogpeppe, I guess what I was really asking is "this feels awkward but necessary, has someone already come up with a nice plan?"
<rogpeppe> fwereade: i think it can all be done inside dummy with no visibility outside. i'd have a global instance that's used/started for the first zk after any Reset.
<rogpeppe> fwereade: (if someone opens two dummy environs at the same time, we don't want them to share zks)
<rogpeppe> fwereade: alternatively... we could just keep all zk instances around. probably simpler now i come to think of it.
<fwereade> rogpeppe, I fret that shutting them down will become complex
<rogpeppe> fwereade: i don't think that's too hard. we'd need to provide a Shutdown method or something.
<rogpeppe> fwereade: and we'd need to implement state.Close to return any zk instance that the state is using to the unused pool
<fwereade> rogpeppe, it's knowing when to call it that's the problem
<niemeyer> rogpeppe: Nothing written down yet, but it's time to talk about it indeed
 * niemeyer has to switch rooms
<niemeyer> brb
<rogpeppe> fwereade: all in all i doubt that it should take more than about 30 lines of code
<fwereade> rogpeppe, hmm, actually, the existing model where we start up a shared ZK per test package may be a good way to go... we just need to inform the dummy environ of it at some point
<fwereade> rogpeppe, I'm not too bothered about starting multiple dummy environs at the same time... what's the use case?
<rogpeppe> fwereade: we do that in some places, but those cases probably don't need zk
<rogpeppe> fwereade: a panic if it happened would probably be fine.
<rogpeppe> fwereade: that's nicely simpler actually.
<rogpeppe> fwereade: then Reset can remove everything from the dummy's zk
<rogpeppe> fwereade: or maybe the tests should do that, i dunno
<fwereade> rogpeppe, probably one for the tests, but not sure... and, hmm, bootstrap can Initialize state, now I think of it :)
<fwereade> rogpeppe, if zookeeper is set, that is
<rogpeppe> fwereade: i think i like the idea of handing off the server to the dummy environ
<rogpeppe> fwereade: dummy.SetZookeeper(*zookeeper.Server) perhaps
<fwereade> rogpeppe, yeah
<rogpeppe> dummy.SetZookeeper(testing.StartZkServer()) seems nice
<rogpeppe> fwereade: then, yeah, have dummy.Reset do the ZkRemoveTree
<rogpeppe> fwereade: and have dummy.environConfig.Open check, if cfg.zookeeper is true, that we're not creating a new state.
<rogpeppe> fwereade: and panic otherwise
<rogpeppe> fwereade: or actually it can return an error
<fwereade> rogpeppe, sounds like a plan, I'll see what I can do
<rogpeppe> fwereade: sounds good. thanks.
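[editor's note: the plan rogpeppe and fwereade converge on (one shared server handed to the dummy provider via SetZookeeper, with Open refusing a second live zookeeper-backed environ) could look roughly like this. All types are stand-ins; the real code would use gozk's *zookeeper.Server:]

```go
package main

import (
	"errors"
	"fmt"
)

// Server stands in for *zookeeper.Server.
type Server struct{ addr string }

// Package-level state for the dummy provider.
var (
	zkServer *Server
	zkInUse  bool
)

// SetZookeeper hands a running server to the dummy provider, as in
// dummy.SetZookeeper(testing.StartZkServer()) from the discussion.
func SetZookeeper(s *Server) { zkServer = s }

// Reset would also wipe everything under the server's root
// (ZkRemoveTree in the discussion); here it just releases it.
func Reset() { zkInUse = false }

// Open models environConfig.Open for a config with zookeeper: true.
// It errors rather than letting two dummy environs share one server.
func Open(wantZk bool) (*Server, error) {
	if !wantZk {
		return nil, nil
	}
	if zkServer == nil {
		return nil, errors.New("dummy: no zookeeper server set")
	}
	if zkInUse {
		return nil, errors.New("dummy: zookeeper already in use")
	}
	zkInUse = true
	return zkServer, nil
}

func main() {
	SetZookeeper(&Server{addr: "localhost:2181"})
	if _, err := Open(true); err != nil {
		fmt.Println(err)
	}
	_, err := Open(true) // second concurrent open is refused
	fmt.Println(err)
}
```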
<rogpeppe> niemeyer: lsb_release prints n/a when it can't find the current release
<niemeyer> rogpeppe: The test still looks bogus.. why is it conflating an error with a return of a value
<niemeyer> rogpeppe: Or do I misunderstand it?
<rogpeppe> niemeyer: it means the test will still run successfully even on a non-ubuntu platform
<rogpeppe> niemeyer: or a platform where lsb_release doesn't exist
<Aram> rogpeppe: is what you are doing now related to what I asked you yesterday about hardcoding CurrentSeries and CurrentArch?
<rogpeppe> Aram: yes
<niemeyer> rogpeppe: Ok, looks good then
<niemeyer> rogpeppe: A comment would be nice
<rogpeppe> niemeyer: thanks. and thanks for the review. will do a comment.
<Aram> rogpeppe: maybe I'm misunderstanding something fundamental, but where do you want to run lsb_release?
<rogpeppe> Aram: in the test. to make sure my parsing of /etc/lsb-release is sane.
<rogpeppe> Aram: see https://codereview.appspot.com/6260048/diff/2001/environs/tools_test.go#newcode276
<Aram> ah, ok.
<Aram> in the test makes sense.
<Aram> but then we can't run the test on every platform.
<rogpeppe> Aram: we can. and that's the point.
<Aram> ok, let me look at the patch.
<rogpeppe> Aram: that was what we were just discussing
<Aram> so the test runs without lsb_release, but its utility is reduced if lsb_release does not exist.
<rogpeppe> niemeyer: rather than full-blown unquote, i'm doing strings.Trim(line[len(p):], ` \t'"`) which should be sufficient i think
<rogpeppe> Aram: yeah
<niemeyer> rogpeppe: Yeah, sounds good
<niemeyer> rogpeppe: Thanks
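[editor's note: rogpeppe's trim-based approach to parsing /etc/lsb-release can be sketched as follows; the file contents are a made-up sample, and the cutset (space, tab, single and double quotes) matches the strings.Trim call quoted above:]

```go
package main

import (
	"fmt"
	"strings"
)

// currentSeries extracts DISTRIB_CODENAME from /etc/lsb-release
// contents, stripping surrounding whitespace and quote characters.
func currentSeries(lsbRelease string) string {
	const p = "DISTRIB_CODENAME="
	for _, line := range strings.Split(lsbRelease, "\n") {
		if strings.HasPrefix(line, p) {
			// Cutset: space, tab, ' and " (as individual characters).
			return strings.Trim(line[len(p):], " \t'\"")
		}
	}
	return "unknown"
}

func main() {
	sample := "DISTRIB_ID=Ubuntu\nDISTRIB_RELEASE=12.04\nDISTRIB_CODENAME=\"precise\"\nDISTRIB_DESCRIPTION=\"Ubuntu 12.04 LTS\"\n"
	fmt.Println(currentSeries(sample)) // precise
}
```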
<rogpeppe> niemeyer: i wouldn't mind leaving the "current Ubuntu release name" comment in. i remember being a little confused when i came across "series" being used all over the place in the charm packages, and didn't realise it was a synonym for "ubuntu release". if we're going to have a comment anywhere, perhaps this is a good place.
<rogpeppe> niemeyer: it doesn't appear to be a commonly used term even in the ubuntu community
<niemeyer> rogpeppe: I'm not sure.. the whole code base trusts on the term
<niemeyer> rogpeppe: That doesn't look like a special instance of it
<rogpeppe> niemeyer: yes, but nowhere do we actually say what it is.
<rogpeppe> niemeyer: as far as i've seen
<rogpeppe> niemeyer: this at least will show if someone does a recursive grep for Series
<niemeyer> rogpeppe: My point is that this isn't a special place.. there are tons of places one will not understand, and this isn't documenting them
<Aram> rogpeppe: why not copy some known existing lsb-release files and test against those, that way the test is meaningful on every platform?
<niemeyer> rogpeppe: But sure, I don't mind too much either
<Aram> and that way you don't need to care about lsb_release at all.
<niemeyer> Aram: It is doing that
<rogpeppe> niemeyer: it's reasonably special i think. it's actually the only place AFAIK that connects the actual running release with the term.
<niemeyer> rogpeppe: Nevermind.. comments is great, thank you
<rogpeppe> Aram: i did that. i could do with a few more samples though. feel free to add to the tests table.
<Aram> oh.
<rogpeppe> niemeyer: thanks :-)
 * Aram finds Rietveld hard to read.
 * rogpeppe actually made up the quantal lsb-release contents BTW
<rogpeppe> ha, of course the tests fail when CurrentSeries="unknown" anyway
<rogpeppe> hmm, just saw this:
<rogpeppe> runtime: signal received on thread not created by Go.
<rogpeppe> FAIL	launchpad.net/juju/go/state	8.131s
<rogpeppe> that's a little worrying
<rogpeppe> can't reproduce though
<Aram> probably from zookeeper?
<TheMue> Ah, redesign looks nicer and builds, now to adapt the tests.
<niemeyer> Aram: Probably.. which means we'll solve it as a side effect, hopefully
<niemeyer> Aram: Are you up to come on Thursday still?
<Aram> niemeyer: I am not sure... I'd like to. I haven't heard anything from BTS though, I don't know what that means.
<niemeyer> Aram: They've returned to me this morning
<niemeyer> Aram: and we got approval just now
<niemeyer> Aram: I've asked for a second option, though
<niemeyer> Aram: Since the option they gave was coming early Thursday and returning early Friday
<niemeyer> Aram: If you're returning early friday, may as well stay for the day
<Aram> indeed.
<Aram> early thursday -> late friday would be preferable.
<niemeyer> Aram: That's what I asked for.. let's see if they return and I'll put you back on the loop with a +1
<Aram> ok, assuming the feedback is positive, what's the plan? who buys the tickets, me or the company?
<Aram> hi robbiew.
<robbiew> Aram: hey! :)
<robbiew> Aram: company buys the tickets.../me assumes he knows what this conversation is about
<Aram> ok :).
<niemeyer> robbiew: Yes :)
<niemeyer> Aram: The agency will sort it out
<rogpeppe> niemeyer: would it be good for me to come down too, if Aram makes it over?
<niemeyer> rogpeppe: I'm not entirely sure.. let's see how the sprint here looks like
<niemeyer> rogpeppe: Let's sync up on that again tomorrow
<rogpeppe> niemeyer: k
<Aram> I believe I'm just waiting for the e-ticket confirmation now.
<Aram> rogpeppe: do you happen to know if Maestro cards work in the UK?
<Aram> work as in "being widespread", I'm sure if I try hard enough I'll find ATMs.
<rogpeppe> Aram: probably.
<rogpeppe> Aram: it's mastercard, right?
<Aram> not really. it's owned by mastercard, but it's a different product.
<Aram> here in austria *maestro* is the only widespread card that works. good luck with visa/mastercard.
<Aram> almost nobody accepts them.
<rogpeppe> Aram: ah, i've never had any problems with visa. but then again i haven't spent any time in austria :-)
<fwereade> Aram, fwiw I'm used to seeing maestro symbols on ATMs in the UK
<Aram> great
<rogpeppe> fwereade: you've got a couple of reviews
<fwereade> rogpeppe, sweet, tyvm
<fwereade> gn all, need to be off promptly tonight
<fwereade> take care :)
 * Aram is away for an hour or so.
<rogpeppe> fwereade: ok. gn.
 * rogpeppe is done for the day. see you tomorrow...
 * Aram back.
<TheMue> niemeyer: Thx for LGTM, goes in now.
<niemeyer> TheMue: Cheers!
<niemeyer> and I really need some food now
<niemeyer> I'll see you all later/tomorrow!
<TheMue> niemeyer: Enjoy.
<Aram> rogpeppe: do you use upas?
<hazmat> rogpeppe, the extracted charm is separate for multiple units
<hazmat> the download directory is the same as its merely a cache
#juju-dev 2012-05-30
<hazmat> davecheney, the newer samsung ssds (830?) are 7mm
<davecheney> hazmat: yeah, i'll have to get one of those
<davecheney> hazmat: for a moment I considered taking the metal case off my sandforce
<davecheney> but only for a moment
<TheMue> davecheney, fwereade, rogpeppe: Morning.
<fwereade> TheMue, heyhey
<fwereade> and everyone else :)
<bigjools> fwereade: o/
<fwereade> bigjools, heyhey
<bigjools> who do I bug to get my branches reviewed today? :)
<fwereade> bigjools, consider me bugged, I'll do that now :)
<bigjools> tip top, thanks
<fwereade> bigjools, one minor on https://code.launchpad.net/~julian-edwards/juju/maas-provider-non-mandatory-port/+merge/107577
<bigjools> yup?
<fwereade> bigjools, just _s for unused vars
<fwereade> bigjools, otherwise both LGTM
<fwereade> hazmat, ping
<bigjools> fwereade: is that some juju convention?
<bigjools> (for reference)
<fwereade> bigjools, my heart says it is, but I couldn't point you to a document saying so
<bigjools> heh
<bigjools> fwereade: so: _s, port, _s = connect_call.args
<bigjools> ?
<fwereade> bigjools, sorry, _
<fwereade> bigjools, I should have said `_`s or something
<bigjools> right, makes more sense now!
<bigjools> fwereade: ok change pushed up, thanks muchly
<fwereade> bigjools, also I'm not sure what the current gatekeeper rules are for python; I'll have a word with hazmat when he's around and either extract more approves if required, or just merge them
<bigjools> you will land them right?
<fwereade> bigjools, (I should probably just land them but I'm reluctant to come charging back into the python without checking)
 * bigjools nods
<fwereade> bcsaller, ping
<hazmat> fwereade, pong
<davecheney> fwereade: rogpeppe TheMue does anyone have any comments about testing commands and package main_test ?
<fwereade> hazmat, sorry: what are the current landing requirements for juju? bigjools has a couple of patches I'm happy with myself... what else should be done before landing?
<hazmat> fwereade, the sru just landed
<hazmat> for the connect args, i think in this case it's actually more helpful to have the args documented.. it's a completely mocked situation
<fwereade> davecheney, I'm not sure my own preferences are useful... main_test seems to me like the obvious and sane thing to do but if it's flaky I guess we should do things differently
<fwereade> hazmat, heh, sorry
<hazmat> fwereade, and those params are never checked anywhere else in the codebase
<TheMue> davecheney: Not yet. So far I had no main commands to test.
<hazmat> so it's completely unclear what they are if they're elided
<hazmat> bigjools, ^ sorry
<hazmat> bigjools, actually i'd take the reverse notion.. and actually verify those values
<davecheney> fwereade: for a long time gotest didn't work with main packages
<hazmat> since they're not really checked otherwise
<hazmat> which is the root cause of this bug
<davecheney> as they are the end of the line, dependency wise
<davecheney> all the other code i've found that does test main, tests it inside package main
<davecheney> i don't think there is a problem doing unit tests inside the package you are testing
 * hazmat checks tests_auth
<fwereade> davecheney, putting the tests in the package you're testing feels icky to me; but like I said, if main_test is flaky and not expected to be fixed any time soon, I guess we should just change them
<davecheney> fwereade: you aren't putting them in the same package, well, only when testing, but there is no other way to test unexported functions
<davecheney> anyway, i've got to go
<davecheney> i'll see you on the flip side
<hazmat> fwereade, trunk is green for this fix
<hazmat> fwiw
<fwereade> hazmat, ok then, I'll land them both once the change you asked for is done
<fwereade> hazmat, cheers
<hazmat> fwereade, thanks
<rogpeppe> fwereade, TheMue, hazmat: mornin' all
<fwereade> rogpeppe, heyhey
<hazmat> g'night all
<bigjools> hazmat: ok I'll reverse the last commit
<bigjools> I did think it was rather odd to remove useful variable names :)
<hazmat> when their well known and unused, we do tend to elide them.. but this is a special case.. since their effectively unknown.
<bigjools> well they are test case logs
<bigjools> the other ones are tested elsewhere
<bigjools> hazmat: ok pushed up
<hazmat> bigjools, are they? :-)
<hazmat> bigjools, i couldn't find anywhere else where args was looked at
<hazmat> bigjools, thanks
<bigjools> I thought I'd seen it somewhere
<bigjools> my brain ceased to function 30 minutes ago, it's been a long day
<bigjools> hazmat: ah just saw your paste, I'll poke that in too, one sec
<hazmat> cool
 * hazmat gives up on going back to sleep
<bigjools> and pushed
 * bigjools looks forward to a merge email
<hazmat> bigjools, shouldn't this also do port 443 for https?
<rogpeppe> fwereade: i've been looking at CL 6244060. i *think* that having multiple zk Servers sitting around in dummy isn't a good idea.
<fwereade> rogpeppe, I'm somewhat ambivalent, but I think in practice that when one is added we will remember to defer a removal as well
<rogpeppe> fwereade: i think we want to reuse the zk server regardless of the name that the test gives to the environment
<rogpeppe> fwereade: how do we remove a server?
<fwereade> rogpeppe, I guess doing it this way does effectively force us to use one dummy env name per package, which prevents us from encoding any helpful context in there
<fwereade> rogpeppe, SetZookeeper("foo", nil)
<rogpeppe> fwereade: that too
<rogpeppe> fwereade: and if we're going to do that, why bother allowing multiple zk servers
<rogpeppe> ?
<rogpeppe> there's an alternative, actually
<fwereade> rogpeppe, it seemed sensible to do that regardless... I think the issues are orthogonal?
<fwereade> go on
<rogpeppe> fwereade: hmm, another thought: why do we bother with SetZookeeper. why don't we let the dummy environ start its own zk?
<rogpeppe> fwereade: then the dummy environ can manage the server "cache". (it could keep just one server around after a Reset)
<fwereade> rogpeppe, hmm, how do we know when to shut the servers down?
<rogpeppe> fwereade: oh yeah, good point. *that* was why :-)
<rogpeppe> fwereade: anyway, i *think* we want to be able to reuse a zookeeper regardless of the name of the environ.
<fwereade> rogpeppe, tbh that was my initial preference but we do seem to have multiple storages (right?) and consistency seems sensible
<rogpeppe> fwereade: storage is *much* lighter weight than zk. and we trash all the storages after Reset.
<rogpeppe> fwereade: i think the 10s startup overhead of zk means we will never want to start up two
<fwereade> rogpeppe, my instinct says that indeed we won't, and that it will be much easier to note this and remove the capacity at some point in the future than it will to get into heavy discussions with niemeyer about it right now
<fwereade> rogpeppe, my primary consideration, sad to say, is getting it merged so I can get to work on deploy without it being at the end of a looong pipeline of CLs under discussion ;)
<niemeyer> Good mornings!
<fwereade> niemeyer, heyhey!
<rogpeppe> fwereade: i'm concerned that this kinda breaks the dummy environs model
<rogpeppe> niemeyer: yo!
<rogpeppe> niemeyer: just expressing mild concern about SetZookeeper taking an environ name
<rogpeppe> niemeyer: we want the zk to be reused regardless of the name of the environment that's opened, i think.
<niemeyer> fwereade, rogpeppe: No need to get into heavy discussions.. it was just an opinion.. happy to be shown the other side of it :)
<niemeyer> rogpeppe: So what's the point of having environment names in the dummy provider?
<fwereade> niemeyer, my personal view is that the cost of either approach is dwarfed by the cost of talking about it for more than 20s ;)
<niemeyer> rogpeppe: Environment names explicitly enable talking to completely different environments
<rogpeppe> niemeyer: so we can open multiple environments. but Reset should reset everything
<niemeyer> rogpeppe: If we have two different environments that have different internal representations but that in fact are the same ZooKeeper state, isn't it the same environment?
<niemeyer> rogpeppe: Agreed, but if we have a single ZooKeeper state, it's not multiple environments
<rogpeppe> niemeyer: we weren't allowing that
<rogpeppe> niemeyer: we weren't going to allow multiple zk environments
<rogpeppe> niemeyer: because i don't think we'll ever actually want that
<niemeyer> rogpeppe: There's no such thing as a "zk environment"..  an environment contains a ZooKeeper
<rogpeppe> niemeyer: lots of tests use the dummy env without requiring zk
<niemeyer> rogpeppe: If we have two environments with one zookeeper, we don't have two environments
<niemeyer> rogpeppe: Yep, that sounds fine
<rogpeppe> niemeyer: what sounds fine?
<niemeyer> <rogpeppe> niemeyer: lots of tests use the dummy env without requiring zk
<niemeyer> fwereade: Do we have a use case for opening two environments with a single ZooKeeper?
<fwereade> niemeyer, I don't think so
<rogpeppe> niemeyer: at the same time, you mean?
<niemeyer> fwereade: So what's the concern?
<niemeyer> fwereade: I'm concerned we're modeling something unrealistic in the dummy environment
<rogpeppe> niemeyer: we don't want to restart zk every time we start a test.
<niemeyer> fwereade: But doing dummy.SetZooKeeper("name", zk) vs dummy.SetZooKeeper(zk) executes in the same amount of time
<fwereade> niemeyer, from my perspective, that it's just more bookkeeping... I expect to run a bunch of sequential tests, each of which uses a dummy env, and which in practice may as well all use the same ZK
<niemeyer> rogpeppe: Yep.. agreed
<niemeyer> fwereade: Sounds fine too?
<fwereade> niemeyer, doing this effectively means we either need multiple zookeepers, or to explicitly set multiple envs to use the same one, if we ever want to use different env names
<fwereade> niemeyer, but I really have no horse in this race, I think the cost is minimal either way and I'm happy enough with either approach
<niemeyer> fwereade: The cost is zero if we're never doing that.. and if we ever do it we can consciously choose to assign to the same ZooKeeper, right?
<niemeyer> fwereade: I guess I'm still lacking what's the argument against making the interface of the dummy provider a bit more realistic in terms of what happens
<niemeyer> fwereade: We're debating about SetZooKeeper(name, zk) vs. SetZooKeeper(zk), right?
<rogpeppe> niemeyer: i like the fact that dummy.Reset forgets all environ names
<fwereade> niemeyer, the cost is (1) a little bit of extra bookkeeping code and (2) consciously choosing to set up extra zookeepers in the package test function
<niemeyer> rogpeppe: Sure, so continue doing that.. (?)
<rogpeppe> niemeyer: but if we add SetZooKeeper(name, zk) that name remains around
<rogpeppe> niemeyer: even after Reset
<fwereade> niemeyer, like I say, not enough for me to worry about
<niemeyer> rogpeppe: I don't get it.. are you suggesting that Reset() won't reset what SetZooKeeper does? That sounds weird
<rogpeppe> niemeyer: if it does reset it, then we'll need to call SetZooKeeper on every test
<fwereade> niemeyer, ah, hmm, so Reset() should clear all zookeepers and each test should explicitly set one at the start?
<fwereade> niemeyer, that does feel heavyweight to me
<rogpeppe> niemeyer: perhaps that's what we should be doing i guess
<fwereade> niemeyer, having the magic test ZK in python felt like a win to me tbh
<niemeyer> fwereade: Assigning an entry to a map sounds heavyweight?
<niemeyer> fwereade: Reset resets everything else in Dummy.. why is it not acting on what SetZooKeeper does?
<fwereade> niemeyer, keeping track of the zookeeper in each test package, rather than just in the package setup func
<fwereade> niemeyer, Reset trashes the content of every active zookeeper
<fwereade> niemeyer, what I'm trying to avoid is starting a fresh ZK for every test
<niemeyer> fwereade: You don't have to start a fresh zookeeper.. trashing content != starting zookeeper
<rogpeppe> niemeyer: the idea behind SetZooKeeper was "here is a zookeeper server; use it for a zk environment whenever it's opened"
<niemeyer> fwereade: Each test *should* get a zeroed out zookeeper, right?
<fwereade> niemeyer, agreed; and it does
<niemeyer> fwereade: Ok, so I misunderstand your point
<fwereade> niemeyer, what rogpeppe said^^
<niemeyer> rogpeppe, fwereade: Sure.. how does that change what was just pointed out, though?
<fwereade> niemeyer, I think that allowing different ZKs for different envs is fine; I'm not sure we'll use it much in practice
<niemeyer> fwereade: Agreed.. I'm not arguing that we use it.. I'm arguing that we don't have unrealistic scenarios for testing
<niemeyer> fwereade: Two different environments in a configuration, with different names, will have different backing ZooKeepers
<fwereade> niemeyer, and I can get behind that; I've proposed a CL that, I think, does make the scenario more realistic as you suggest, and I'm perfectly happy with it :)
<rogpeppe> niemeyer: if we have Reset clear out all the zk servers too, then every test will need to call SetZooKeeper, which means that every test suite will need a zk server field. if we have just a single zk server instance, then we can do the setup exactly as we do in state.TestPackage
<rogpeppe> niemeyer: will we ever want to have two different environments concurrently with different zookeepers?
<rogpeppe> niemeyer: it's a heavy weight test and one i can't really see a reason for doing
<fwereade> niemeyer, I'm not so keen on the idea of having a zkServer package var for every test package, and explicitly setting it as the dummy zookeeper in every test; is that what you're proposing?
<rogpeppe> niemeyer: if you never want to do that (and you can make it panic if that ever happens accidentally) then the scheme is not unrealistic - it's indistinguishable, i think.
<niemeyer> rogpeppe: <niemeyer> fwereade: Two different environments in a configuration, with different names, will have different backing ZooKeepers
<niemeyer> rogpeppe: That's how environments work for real
<rogpeppe> niemeyer: yes, but will we *ever* want to open two such environments *at the same time* in a test?
<niemeyer> rogpeppe: If we're creating a dummy provider that works in a different way, it sounds like we're doing something wrong..
<niemeyer> rogpeppe: You were the one changing the dummy package so it could take multiple environments with different names.. (!?)
<rogpeppe> niemeyer: that's for basic tests that don't require zk.
<niemeyer> <rogpeppe> niemeyer: yes, but will we *ever* want to open two such environments *at the same time* in a test?
<niemeyer> rogpeppe: That does answer your own question
<rogpeppe> niemeyer: by "such environments" i meant environments with a backing zk instance
<niemeyer> rogpeppe: We're going in circles. I've already pointed out the reasoning behind that. You can agree or disagree.
<rogpeppe> niemeyer: the approach i'm suggesting is the same one taken by state, for instance - we have a single zk instance that is reused in each test.
<niemeyer> rogpeppe: That's where our conversation started.
<rogpeppe> niemeyer: so... should Reset forget the servers registered with SetZooKeeper?
<niemeyer> rogpeppe: That was my expectation, but I guess the dummy package is being used in a different way than what I expected
<rogpeppe> niemeyer: if so, i could live with that. it's another line in each test or suite that opens a zk environ, but the zookeeper.Server could live in a global var just like state.TestingZkAddr.
<niemeyer> rogpeppe, fwereade: So far we set up the dummy package on each suite, I think
<fwereade> niemeyer, ATM we set it up per package
<fwereade> niemeyer, following the model in state
<niemeyer> fwereade: The dummy package?  Do you have an example where we set it up per package I could look at in the current source?
<rogpeppe> cmd/juju/cmd_test.go sets it up per test
<niemeyer> Right, I meant an example where we use the dummy provider on a package context, rather than per test or suite
<fwereade> niemeyer, only in https://codereview.appspot.com/6243067/ which technically doesn't exist yet
<fwereade> niemeyer, (that's calling SetZookeeper in the package test func; did you mean something different?)
<niemeyer> fwereade: Right, so that's why I was assuming a given way of using it, which doesn't match what you had in mind.. not saying either is right or wrong, but part of the argument originates there
<fwereade> niemeyer, I'm just monkey-see-monkey-doing what we did for state wrt zookeepers; seems to work ok ;)
 * fwereade wonders whether "following established conventions" is a better way of putting that ;)
<niemeyer> fwereade: Yeah, you could have monkey-see-monkey-done what the dummy package currently does as well, though :-)
<fwereade> niemeyer, so it seems; state is the one I already knew :0
<rogpeppe> fwereade: "blindly following over a cliff"?
<fwereade> rogpeppe, if all your friends jumped off a cliff and it all seemed to be going just fine...
<rogpeppe> fwereade: come on in, the water's lovely!
<niemeyer> fwereade: Ok, let's just move back to the SetZooKeeper you had then.. it's very straightforward and will get you going
<fwereade> niemeyer, the cost of setting up a ZK was a contributing factor... if that weren't a consideration I'd happily do everything per-suite
<rogpeppe> anyway, to get back to the point. i think i'd be happy with either: one zk instance, reused automatically, as originally; or an explicit SetZookeeper required after Reset
<fwereade> niemeyer, ok, cool
<fwereade> niemeyer, cheers :)
<niemeyer> fwereade: I'm happy to delay my preciousism in the design (if that's even a word) to a second moment
<fwereade> niemeyer, and a panic if anyone tries to SetZookeeper while one is already set?
<niemeyer> fwereade: The fact it matches what's being done with zookeeper now in terms of the test suite is what makes me more comfortable
<rogpeppe> niemeyer, fwereade: thanks for bearing with me. i feel like i've been the precious one :-)
<rogpeppe> fwereade: i don't see why we shouldn't allow SetZookeeper twice.
<niemeyer> fwereade: Not sure.. how do you plan to reset the zk?
<fwereade> rogpeppe, ah ok, perhaps I misremembered an earlier comment of yours
<niemeyer> fwereade: as in, how do you plan to unset the package variable
<fwereade> niemeyer, `defer SetZookeeper(nil)` in the package test func; clear contents on Reset() or Destroy()
<niemeyer> fwereade: Ah, ok.. so accept nil, but not another zk
<niemeyer> Anyway, I have to move rooms
<fwereade> niemeyer, yeah
<niemeyer> Which means I have to shutdown
<niemeyer> :(
<niemeyer> brb
<rogpeppe> fwereade: what bad thing might happen if you allowed setting another zk?
<fwereade> rogpeppe, having two zookeepers around :p
<fwereade> rogpeppe, if someone comes up with a legitimate use case removing the panic is not hard
<fwereade> rogpeppe, but trying to do so indicates to me that something is likely to be being Done Wrong
<rogpeppe> fwereade: i'm not sure i see the big difference between { SetZookeeper(nil); SetZookeeper(anotherZk)} and SetZookeeper(anotherZk)
<rogpeppe> fwereade: but i don't mind actually. require the SetZookeeper(nil) if you prefer.
<fwereade> rogpeppe, in either case, assuming package-level setup, either you need to keep track of the original package ZK *or* someone has failed to clear it out from a previous package
<rogpeppe> fwereade: i can't see how that would happen. we only test one package at a time, right?
<fwereade> rogpeppe, and if a server from a previous package is still there, someone's ballsed something up
<rogpeppe> fwereade: so it would only happen if we happened to call SetZookeeper twice in the same package. which could happen, i guess, but i don't see it as an easy-to-find failure mode.
<fwereade> rogpeppe, fair enough :)
<rogpeppe> fwereade: how could a server from a previous package still be there? they're different programs.
 * fwereade waves his hands vaguely, then runs out of steam and slinks off looking embarrassed
<rogpeppe> :-)
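The setup pattern the discussion converges on — one package-level zookeeper server, set once in the package test function, cleared with `SetZookeeper(nil)`, and panicking on a double set — might look roughly like this sketch. The `zkServer` type, the `SetZookeeper`/`TestPackage` signatures, and the panic behaviour are all assumptions drawn from the conversation above, not the real dummy provider API:

```go
package main

import "fmt"

// zkServer stands in for a zookeeper.Server; purely hypothetical.
type zkServer struct{ addr string }

// current is the single server the dummy provider would use.
var current *zkServer

// SetZookeeper mimics the behaviour discussed above: accept nil to
// unset, but panic if a server is already set, since wanting two
// zookeepers probably means something is being Done Wrong.
func SetZookeeper(s *zkServer) {
	if s != nil && current != nil {
		panic("dummy: zookeeper already set")
	}
	current = s
}

// TestPackage sketches the per-package setup: start one server, share
// it across every suite, and clear it when the package's tests finish.
func TestPackage() {
	srv := &zkServer{addr: "127.0.0.1:2181"}
	SetZookeeper(srv)
	defer SetZookeeper(nil)
	// ... run all suites against srv; Reset() would only trash its
	// contents, never shut it down ...
	fmt.Println("using", current.addr)
}

func main() {
	TestPackage()
	fmt.Println("cleared:", current == nil)
}
```

This mirrors how `state.TestPackage` reuses a single zk instance, avoiding the ~10s startup cost per test.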
<fwereade> rogpeppe, should Destroy be clearing out the appropriate key in providerInstance.state (and closing the listener)?
<rogpeppe> fwereade: erm, zookeeper.Server.Destroy?
<fwereade> rogpeppe, sorry, dummy.environ.Destroy
<rogpeppe> fwereade: i don't see that method. do you mean Reset?
<rogpeppe> oh!
<fwereade> rogpeppe, Reset clears them all
<rogpeppe> just grepped the wrong dir
<rogpeppe> i don't *think* so
<rogpeppe> fwereade: it's not like s3 goes away when you destroy an environment
<rogpeppe> fwereade: i think it's ok for Reset to do that job
<rogpeppe> Aram: yo!
<Aram> morning all.
<fwereade> rogpeppe, I'll read some more, see if I can figure out why your proposed tweak to open panics... that was just a half-formed hypothesis
<TheMue> Aram: Hi
<rogpeppe> fwereade: it could be that something's not calling Reset
<fwereade> Aram, heyhey
<fwereade> rogpeppe, HAHA DISREGARD THAT I wrote the code in the wrong place :/
 * rogpeppe disregards it
<rogpeppe> fwereade, TheMue, niemeyer: i'm wondering about the way that we manage the state with watchers. if i'm watching something, i'll get a notification that something has happened to it (because its config node changed, for example) but then i have to re-read the same info from zk if i want to find out attributes of the object. but the info might have changed - the info i see might not be consistent with what i'm told
<rogpeppe> i'm wondering about a model where each kind of thing (Unit, Service, Machine) caches the info that was known when it was created.
<TheMue> rogpeppe: Typically the specialized watchers return the needed info.
<TheMue> rogpeppe: So in your case a watcher returns an info but you need a different one depending on it?
<rogpeppe> TheMue: i'm looking at doing a unit watcher. the obvious thing to do is have the channel sending *Units that have been added or removed.
<niemeyer> rogpeppe: That's what the machine watcher does..
<rogpeppe> TheMue: but if i want to get any info out of those units, i have to reread it
<rogpeppe> niemeyer: yeah, and i'm a bit concerned about it
<niemeyer> rogpeppe: No, you're not re-reading.. you're reading.. the information out of those units wasn't read before
<niemeyer> rogpeppe: There are different ways to model that problem for sure
<niemeyer> rogpeppe: And perhaps the model we have in place today isn't the best one, but changing it will incur other issues
<rogpeppe> niemeyer: ah, that's true. the info *about* the units is in the topology, but the info *of* the units is in the unit's config node, right?
<niemeyer> rogpeppe: Yep
<niemeyer> rogpeppe: So, in either case, I suggest not modifying that today so that we can move forward
<TheMue> rogpeppe: Nice description, yes.
<niemeyer> rogpeppe: I do think eventually we'll want to cache more than we do, but that can't be done without pondering further about the algorithms
<rogpeppe> niemeyer: it doesn't seem that that's entirely true. i'm looking at AssignedMachineId and it reads the topology. to find new units, i'm going to need to watch the topology. so i'll be reading the topology for every unit as well.
<TheMue> niemeyer: The test of store fails for me due to a non-matching error message in lpad_test.go:42. Is that only here or well known?
<niemeyer> rogpeppe: Yeah, it's a mix of both
<niemeyer> TheMue: Certainly not well known by me.. trunk should never be broken
<rogpeppe> a possible model to consider for the future: each object acts as a local cache of some state. it only goes to zk when it doesn't already hold that state. an Update method could be implemented on each type to fetch new values from zk.
<niemeyer> rogpeppe: Yeah
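rogpeppe's proposed model — each object acting as a local cache of the state it was created with, answering reads locally and touching the backend only through an explicit Update method — could be sketched as below. The `fakeStore` map stands in for ZooKeeper, and all names here are hypothetical illustrations, not the real state API:

```go
package main

import "fmt"

// fakeStore stands in for ZooKeeper, keyed by node path; hypothetical.
type fakeStore map[string]string

// Unit caches the state read at creation time. Accessors answer from
// the cache, giving a consistent view with no round trips.
type Unit struct {
	store fakeStore
	key   string
	cache string
}

func newUnit(s fakeStore, key string) *Unit {
	return &Unit{store: s, key: key, cache: s[key]}
}

// Config answers from the local cache; note it needs no error return.
func (u *Unit) Config() string { return u.cache }

// Update is the only method that talks to the backend, so it is the
// only place an error could surface in a real implementation.
func (u *Unit) Update() error {
	v, ok := u.store[u.key]
	if !ok {
		return fmt.Errorf("unit %q not found", u.key)
	}
	u.cache = v
	return nil
}

func main() {
	s := fakeStore{"unit-0000000001": "X"}
	u := newUnit(s, "unit-0000000001")
	s["unit-0000000001"] = "Y" // concurrent change by another client
	fmt.Println(u.Config())    // still the consistent cached view: X
	u.Update()
	fmt.Println(u.Config())    // now refreshed: Y
}
```

Writes would go straight through to the store in this model, as rogpeppe says later; only reads are cached.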
<rogpeppe> TheMue: tests all pass for me
<TheMue> niemeyer: Ah, found the error. It's the locale. So bzr returns a German message here while an English one is expected in the test. (sigh)
<rogpeppe> niemeyer: that way most methods could lose their error return perhaps - the only place returning an error would be on Update.
<TheMue> rogpeppe: See above, it's my locale.
<rogpeppe> TheMue: ah. useful to have someone in a different language locale!
<TheMue> rogpeppe: Yes. While our own messages are all English the ones of external tools may vary.
<niemeyer> rogpeppe: and then all writes have to be cached as well?  That would mean the flush operation could fail, and we wouldn't know why.. which means we'd have to implement retry points that don't really know what is being retried
<niemeyer> rogpeppe: etc
<rogpeppe> niemeyer: no, i think writes would just write through
<niemeyer> rogpeppe: It's a different model.. worth thinking about at some point.. not obviously better
<TheMue> rogpeppe: As long as we don't have any problem with re-reading I won't do an application based caching. This may be task of the backend (like ZK, Mongo, whatelse).
<rogpeppe> niemeyer: it's this kind of thing that concerns me: client A watches units. client B changes unit 1 to X. client A observes change X. client B changes unit 1 to Y. client A reads Y. client A observes change Y. client A reads Y.
<niemeyer> rogpeppe: How's that an issue?
<rogpeppe> niemeyer: it's very inefficient, particularly when we're observing the topology and it's getting large.
<rogpeppe> niemeyer: maybe i shouldn't worry about efficiency here though.
<niemeyer> rogpeppe: Yeah, let's make it work first
<niemeyer> rogpeppe: Reliably
<niemeyer> rogpeppe: Note that, in your example, even if we cached the data and client A read X instead, there would be a second notification
<rogpeppe> niemeyer: it's reliability i'm concerned about too. working on a slightly newer version than we've just been told about. maybe it's not an issue, but it seems potentially problematic.
<niemeyer> rogpeppe: We haven't been told about anything.. we have simply been notified that a change happened
<rogpeppe> niemeyer: yeah, there are two changes, so there would be two notifications
<rogpeppe> niemeyer: we've been told that some new units have been created, for example.
<niemeyer> rogpeppe: Yes, we have, but that's unrelated to your example
<rogpeppe> niemeyer: and we *did* know some info about those units, but we've thrown it away
<niemeyer> rogpeppe: Yes, but that's unrelated to reliability
<rogpeppe> the caching model seems to me like it has less potential for odd race conditions.
<rogpeppe> niemeyer: anyway, food for thought.
<hazmat> rogpeppe,  agreed it is inefficient
<hazmat> rogpeppe, the other option is to have some notion of logical transactions that the topology records
<hazmat> not necessarily the 3.4 multi-node txn, but a logical one where we carry the topology across multiple ops
<hazmat> just to be clear it's not a race condition per se that it's avoiding, it's the spurious notifications caused by every state change, that may actually compose a larger logical op
<hazmat> but really the largest source of spurious notifies, is everything listening to the topology, getting hit by notifications they don't care about
<rogpeppe> hazmat: yeah that's true
<niemeyer> rogpeppe: Indeed, I do think those are good ideas, and that there's something there for us to evolve towards
<rogpeppe> hazmat: but it doesn't help when we've been listening to the topology, we're told that it's changed and what it changed to, and then we do a round trip just to fetch it again.
<niemeyer> rogpeppe: We quickly did that when we started.. but didn't quite work well
<niemeyer> Lunch time
<rogpeppe> niemeyer: did what, sorry?
<hazmat> as an alternative i think we either need to carefully devolve into multiple topo indexes, and/or start using multi-node transactions..
<hazmat> rogpeppe, that's the nature of watches
<hazmat> rogpeppe, oh.. we got told what it changed to?
<rogpeppe> hazmat: ah, no, of course. in this case we're doing a round trip to fetch it. and then another round trip to fetch it again.
<hazmat> rogpeppe, ah
<hazmat> rogpeppe, we did briefly experiment with an up to date object cache, but lacking transactions or pub/sub events (from watches), it ended up just being more complexity imo.
<hazmat> rogpeppe, but if you could interject into the watch channel to an object cache invalidation protocol it might have some value, but effectively anything in cache would need to be watched
<hazmat> else it could be stale
<rogpeppe> hazmat: i wasn't thinking of a global cache, just a local cache for an object when you read it.
<hazmat> alternatively work with an operation context, like a transaction scope
<hazmat> er. scoped cache
<rogpeppe> hazmat: and a way of saying "please update"
<hazmat> rogpeppe, yeah.. that makes sense, but speaking of race conditions ;-)
<hazmat> its unavoidable i suppose, the local cache gives a consistent view at least
<rogpeppe> hazmat: yeah, that's my thought
<hazmat> we always have to consider the possibility of state change
<rogpeppe> hazmat: indeed.
<hazmat> but if we go to modify a cached obj, we have to reconcile current state and modifications, not cache state and mods
<rogpeppe> hazmat: but it can change when we ask it to
<hazmat> ie. capture versions
<hazmat> at a min or use retry change utilities
<rogpeppe> hazmat: i'd be happy with a write-through cache
<rogpeppe> hazmat: it doesn't solve all the problems, but it alleviates the worst
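hazmat's "capture versions... or use retry change utilities" point — reconcile modifications against *current* state, not cached state — is essentially optimistic concurrency with a version number, the same idea as ZooKeeper's conditional Set. A self-contained sketch with an in-memory stand-in for a znode; all names are illustrative:

```go
package main

import "fmt"

// node mimics a znode with a version counter; hypothetical.
type node struct {
	value   string
	version int
}

// setIfVersion is a compare-and-set: the write succeeds only if the
// caller's read was against the latest version.
func (n *node) setIfVersion(v string, version int) bool {
	if version != n.version {
		return false
	}
	n.value = v
	n.version++
	return true
}

// retryChange re-reads, applies the change function to the current
// value, and retries on version conflict, so the modification is
// always reconciled against current state rather than a stale cache.
func retryChange(n *node, change func(string) string) {
	for {
		old, ver := n.value, n.version
		if n.setIfVersion(change(old), ver) {
			return
		}
	}
}

func main() {
	n := &node{value: "a", version: 0}
	retryChange(n, func(s string) string { return s + "b" })
	fmt.Println(n.value, n.version) // ab 1
}
```

With a write-through cache on top, only this retry path ever needs to consult the backend's version.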
<hazmat> niemeyer, are you in uk?
<rogpeppe> hazmat: he is
 * hazmat notes it's not lunch time yet ;-)
<hazmat> rogpeppe, cool, is there a gojuju sprint going on?
<rogpeppe> hazmat: nope
<rogpeppe> hazmat: niemeyer's at some sprint in london
<hazmat> we should get a juju sprint  going on
<hazmat> i can't even remember the last sprint we had
<hazmat> like august of last year maybe..
<rogpeppe> hazmat: +1
<rogpeppe> hazmat: i haven't done a sprint yet
<hazmat> we should fix that..
<TheMue> rogpeppe, hazmat: Isn't something planned when Mark is on board?
<hazmat> i dunno
<niemeyer> rogpeppe: ping
<rogpeppe> niemeyer: pong
<niemeyer> rogpeppe: Yo
<niemeyer> rogpeppe: Are you game to come tomorrow and return on Friday?
<rogpeppe> niemeyer: i think so
<niemeyer> rogpeppe: Awesome, sorting out hotel
<rogpeppe> niemeyer: will just have a word with carmen
<niemeyer> rogpeppe: Cheers
<fwereade> rogpeppe, any particular reason StorageReader.URL doesn't return a url.URL?
<Aram> niemeyer: eager to meet you tomorrow!
<Aram> anyone else there?
<niemeyer> Aram: Looking forward too
<niemeyer> Aram: Lots of people, but no jujuers unfortunately
<niemeyer> Aram: rog should come, though
<Aram> great.
<niemeyer> rogpeppe: Are we good to go?
<fwereade> niemeyer, do I recall you saying we should revisit the way we handle unit placement in python?
<fwereade> niemeyer, because we're approaching the point at which we should be considering it :)
<niemeyer> fwereade: Hmm.. not that I remember off hand
<fwereade> niemeyer, jolly good then -- if it crosses your mind again, do let me know (I didn't see anything wrong with it myself, but...)
<TheMue> niemeyer: I now start the go test with a temporary locale en_US. But now I've got this error: http://paste.ubuntu.com/1014818/
<rogpeppe> niemeyer: yes.
<rogpeppe> niemeyer: sorry about the wait. got into an involved discussion.
<niemeyer> rogpeppe: Cool, thanks
<niemeyer> rogpeppe: Moving forward then
<rogpeppe> fwereade: there's no need for it to be a parsed URL. http.Get doesn't take a url.URL as an argument.
<niemeyer> TheMue: Sorry, can't look at that now
<TheMue> niemeyer: OK, no problem so far and doesn't hinder state development. ;)
<rogpeppe> fwereade: a string seems a fine representation of a URL to me if you don't need to be looking inside it.
<fwereade> rogpeppe, ah, ok; in that case the question becomes: TheMue, is there any particular reason AddCharm takes a url.URL?
<TheMue> fwereade: Have to think back. IMHO because I'm getting it from an API call. One moment, I'll take a look.
<fwereade> TheMue, thanks
<rogpeppe> fwereade, TheMue: it looks like it could as well be a string, as all it does is call String on it...
<rogpeppe> fwereade, TheMue: and charmData.BundleURL is a string
<fwereade> rogpeppe, concur
<TheMue> fwereade: Don't know the reason anymore, it became a URL during the discussion.
<fwereade> TheMue, rogpeppe: any reason not to make it a string again?
<TheMue> fwereade: Are there any problems with URL?
<rogpeppe> TheMue: it's worth using the same representation for the same thing throughout, i'd say
<fwereade> TheMue, just that it seems kinda silly to turn a string into a url just to turn it back into a string a few us later :)
<rogpeppe> fwereade: +1
<TheMue> rogpeppe: Same representation for the same thing: yes. A URL is a URL. ;)
<TheMue> rogpeppe: But it's ok for me.
<TheMue> rogpeppe: Can parse it to URL if access to the parts is needed.
<rogpeppe> TheMue: yeah
<TheMue> rogpeppe: The only thing is: URL verifies the URL to have a valid scheme.
<rogpeppe> TheMue: i don't think so
<rogpeppe> TheMue: it just verifies that the scheme matches [a-zA-Z][a-zA-Z0-9+-.]*
<TheMue> rogpeppe: OK, not much (but even more than a pure string).
<TheMue> rogpeppe: So AddCharm() gets a charm URL as charm.URL but the bundle URL as string. Hmm.
<rogpeppe> TheMue: assuming we're expecting to have got the charm URL from the charm package, i don't think that's an issue
<TheMue> rogpeppe: Yes, no issue, it only 'smells' a bit when reading the function signature.
<rogpeppe> TheMue: can you enlighten me as to the role of unit names?
<rogpeppe> TheMue: i can't quite see why a unit has both a name and a sequence number
<rogpeppe> TheMue: and a key
<TheMue> rogpeppe: One moment.
<TheMue> rogpeppe: The name is built from service name and sequence number.
<TheMue> rogpeppe: The key is the internal key in ZK, like unit-0000000001.
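TheMue's two identifier forms can be made concrete in a small sketch: the external name is the service name plus its per-service sequence number, while the internal key uses a zero-padded global counter. The helper names are hypothetical; the formats follow his description above:

```go
package main

import "fmt"

// unitName builds the external name: service name plus per-service
// sequence number, e.g. "wordpress/3".
func unitName(service string, seq int) string {
	return fmt.Sprintf("%s/%d", service, seq)
}

// unitKey builds the internal ZooKeeper key, e.g. "unit-0000000001";
// the 10-digit zero-padded counter is global across all units.
func unitKey(id int) string {
	return fmt.Sprintf("unit-%010d", id)
}

func main() {
	fmt.Println(unitName("wordpress", 3)) // wordpress/3
	fmt.Println(unitKey(1))               // unit-0000000001
}
```

The two counters being independent is exactly why a key and a sequence number can't be folded into one another, which is rogpeppe's puzzle here.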
<rogpeppe> TheMue: i don't quite see why key and sequence number aren't pretty much the same thing in different forms
<rogpeppe> ah! i see it, i think
<rogpeppe> TheMue: i *think* it's because a unit is built in two operations; make unit directory, then add the newly made name to the topology.
<rogpeppe> TheMue: i'm still not quite sure why we can't just take the unit number from the unit directory name.
<TheMue> rogpeppe: Yes
<rogpeppe> TheMue: and live with the fact that unit numbers might sometimes skip
<rogpeppe> fwereade: any idea?
<TheMue> rogpeppe: Do you have any trouble with it?
<rogpeppe> TheMue: yeah, it's awkward because i'm implementing topology.UnitsForMachine and i'm not sure what to return from it
<fwereade> rogpeppe, TheMue: I'm torn
<rogpeppe> TheMue: i can't return []*zkUnit because that doesn't include the service name
<rogpeppe> TheMue: i can't return *Unit because that's outside the purview of topology.go
<TheMue> rogpeppe: The unit stuff is almost a 1:1 of the Py code.
<fwereade> rogpeppe, TheMue: smoothly increasing ids throughout would obviously be nicest, and not represent a clear encapsulation breakdown
<TheMue> rogpeppe: So I would investigate there.
<rogpeppe> TheMue: i don't want to define *another* type representing a unit
<rogpeppe> fwereade: that's what we've got now, i think.
<rogpeppe> fwereade: at least, smoothly increasing *external* ids
<fwereade> rogpeppe, TheMue: cool, because that's what we have with machines for sure
<rogpeppe> fwereade: exactly.
<fwereade> rogpeppe, TheMue: and in this I favour consistency
<rogpeppe> fwereade: i'm thinking that machines is doing a very similar thing, and there (i *think) we have a 1-1 correspondence between key and machine id
<fwereade> rogpeppe, yep, I think so too
<TheMue> rogpeppe: The sequence number increases per service, the unit key over all units.
<rogpeppe> fwereade, TheMue: doing this would also mean we could get rid of all the UnitSequence logic in topology.go
<rogpeppe> TheMue: i don't think so, but let me check
<rogpeppe> TheMue: oh yeah
<TheMue> rogpeppe: It is, just checked. It increases a counter per service name in topology.
<rogpeppe> TheMue: that's a bit odd.
<rogpeppe> TheMue: i'm surprised it's not a directory inside the service.
<fwereade> rogpeppe, hmm, yeah, that might be nicer indeed
<TheMue> rogpeppe: It's a value of topology, maybe because of the 'transactional' behavior.
<rogpeppe> TheMue: it's both actually
<rogpeppe> TheMue: well... it's a directory and a node inside the topology
<TheMue> rogpeppe: The increment only happens when the topology change succeeds.
<rogpeppe> TheMue: the directory is inside /units though
<TheMue> rogpeppe: The unit node is below units, yes.
<rogpeppe> TheMue: yeah. but if the directory was inside the service, we could use a zk auto-increment to do the counting for us
<rogpeppe> TheMue: but perhaps there's another reason for having units in a global namespace
<TheMue> rogpeppe: We can do a lot. Do we have the need while the state isn't even completed?
<rogpeppe> TheMue: probably not. it would simplify the code a bit though.
<hazmat> it's to give a semantically useful number
<hazmat> zk sequence numbers can have gaps if an op fails
<rogpeppe> hazmat: that's true. machine ids can do that.
<hazmat> and they're global
<rogpeppe> hazmat: do gaps matter?
<hazmat> across all services, whereas we want the service unit increments
<hazmat> rogpeppe, it's rather disconcerting to see wordpress/5 and wordpress/222
<hazmat> as sequential units
<hazmat> er.. sequentially allocated that is
<hazmat> ie.. the zk sequence is across all units
<TheMue> hazmat: So while the key is purely internal, the name containing the sequence number is also externally visible?
<hazmat> TheMue, yup
<TheMue> hazmat: Makes sense, thx.
<hazmat> TheMue, the python code base distinguishes between names and internal_ids
<rogpeppe> hazmat: doesn't the same logic apply to machine ids?
<hazmat> names being visible, and internal ids being the impl detail
<TheMue> hazmat: Yes, in Go the internal ids are now keys, but have the same role.
<hazmat> rogpeppe, to some extent it does, but it's not really something we ever care about per se: a) juju cares about presenting the services, not the machines; b) the machine global sequence is correct for machine allocation
<rogpeppe> hazmat: i guess i wonder if it's worth putting logic and code behind something that is a) very rare and b) only mildly disconcerting if it does happen.
<hazmat> whereas the units want service-namespaced sequences
<hazmat> rogpeppe, i doubt it re machines
<hazmat> rogpeppe, for service units its important, unless you have another impl of per service unit sequences
<rogpeppe> hazmat: do we rely on the sequentiality anywhere?
<hazmat> rogpeppe, never
<hazmat> rogpeppe, its a user interface question
<rogpeppe> hazmat: or is it just the fact that someone might see 1 followed by 3 and wonder what's going on.
<hazmat> rogpeppe, its not about the 1 and 3, its about 1 and 25..
<hazmat> rogpeppe, the unit sequence in zk is global to all units across all services
<rogpeppe> hazmat: if we put unit directories inside service directories, that wouldn't be a problem
<rogpeppe> hazmat: but maybe there's a good reason why that is a bad idea
<hazmat> rogpeppe, right.. if you have another impl of per service unit sequences
<rogpeppe> hazmat: ah, i see what you mean by that now
 * hazmat loves repetition
<hazmat> rogpeppe, that sounds fine to me
<rogpeppe> TheMue: a reason for doing this is that it means i can map directly from a unit name e.g. wordpress/0 to its directory, without needing to consult the topology.
<rogpeppe> TheMue: so unit name, sequence number and key are all derivable from each other.
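A hypothetical sketch of rogpeppe's point: if the per-service sequence number also formed the ZK key, then the external name ("wordpress/0"), the sequence number, and the key ("unit-0000000000") would all be derivable from one another. None of these helpers is the actual juju API:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unitName builds the external name from service name and sequence number.
func unitName(service string, seq int) string {
	return fmt.Sprintf("%s/%d", service, seq)
}

// unitKey builds a ZK-style key like "unit-0000000001" from a sequence
// number (assuming, hypothetically, the per-service sequence is used).
func unitKey(seq int) string {
	return fmt.Sprintf("unit-%010d", seq)
}

// parseUnitName splits an external name like "wordpress/0" back into
// service name and sequence number.
func parseUnitName(name string) (service string, seq int, err error) {
	i := strings.LastIndex(name, "/")
	if i < 0 {
		return "", 0, fmt.Errorf("invalid unit name %q", name)
	}
	seq, err = strconv.Atoi(name[i+1:])
	if err != nil {
		return "", 0, err
	}
	return name[:i], seq, nil
}

func main() {
	svc, seq, _ := parseUnitName("wordpress/0")
	fmt.Println(svc, seq, unitKey(seq)) // wordpress 0 unit-0000000000
}
```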
<TheMue> rogpeppe: Do we have problems so far consulting the topology?
<TheMue> rogpeppe: Or asking it differently: When do we get scaling probs due to the top size?
<rogpeppe> TheMue: what should UnitsForMachine return?
<hazmat> it hasn't been a problem at scale, but we haven't been measuring the network traffic to get a better understanding of the outflow
<rogpeppe> TheMue: it's not really a performance issue, but one of neatness.
<hazmat> and from a size of topology, its really a non issue with any compression
<hazmat> the sequence numbers that is
<hazmat> we can get about 15k machines/services/service units in atm without blowing 1mb, without compression
<rogpeppe> TheMue: i'll have a play and see how it looks
<TheMue> rogpeppe: So both solutions have to be compared.
<rogpeppe> TheMue: yeah
<TheMue> rogpeppe: Breaking today's topology concept could lead to a larger redesign, step by step. Let's do it here, oh, and there.
<hazmat> rogpeppe, it's not clear that it's meaningful though, you need the topology for most anything.. and unit to service name lookups aren't common
<TheMue> rogpeppe: I only see the risk of changing too much of what we reached before the rest is done. 12.10 is coming. ;)
<hazmat> rogpeppe, ie. that translation is pretty rare in practice for the runtime
<hazmat> rogpeppe, because typically its holding onto a service unit state which has both values captured
<rogpeppe> hazmat, TheMue: ok, i'll leave as is. it should be fairly easy to refactor later actually.
<hazmat> yes.. avoid.. premature optimization ;-)
<TheMue> hazmat: Hehe, yes.
<hazmat> rogpeppe, i agree though that we can do better about the topology slinging, i just don't think that particular case is particularly relevant.
<TheMue> rogpeppe: Maybe we get a better solution when the layer around ZK is done and we change the backend in a second step.
<rogpeppe> hazmat: as always. although i wasn't really considering this in performance terms.
<rogpeppe> hazmat: just that we could reduce the number of entities w.l.o.g.
<hazmat> isn't neatness an eye-of-the-beholder thing?
<hazmat> and is the neatness you defined avoiding fetching the topology?
<rogpeppe> hazmat: no, that was another thing.
<hazmat> nm.. i should move on.. silly deadlines
<rogpeppe> hazmat: in this case we've already got the topology
<hazmat> gotcha
<rogpeppe> TheMue, hazmat: my impulse to change things stemmed from my "so why does a unit need two names" question. and the answer is "it doesn't, but that's the way it's been done" AFAICS.
<hazmat> rogpeppe, then you've ignored the discussion to this point.. it's twofold: to present a nice interface, and to separate interface from impl
<TheMue> rogpeppe: So let's keep it as long as there's no real need. There are still enough tasks to do.
<rogpeppe> hazmat: putting unit directories inside their service directories solves both of those issues, i think.
<hazmat> rogpeppe, agreed
<hazmat> rogpeppe, but that's solving those two issues in a different way, that doesn't obviate the reasons
<rogpeppe> hazmat: if we did that, then a unit would only have one name
<rogpeppe> hazmat: or at least, the different categories of names would map 1-1
 * hazmat nods
<niemeyer> rogpeppe: Did you get the hotel confirmation?
<rogpeppe> niemeyer: yes thanks
<rogpeppe> niemeyer: should i organise my own travel?
<rogpeppe> niemeyer: (if so, what time should i plan to arrive/leave?)
<niemeyer> rogpeppe: Yeah, I think that's simpler in your case
<niemeyer> rogpeppe: Arriving around noonish or early afternoon sounds fine.. I expect Aram will be arriving around that time as well
<rogpeppe> niemeyer: ok, that's easy enough. leaving?
<niemeyer> rogpeppe: Planning for end of afternoon/early night on Friday should work well
<rogpeppe> niemeyer: not cheap, those train tickets!
<rogpeppe> niemeyer: looks like minimum price is around £230.
<rogpeppe> niemeyer: i bet Aram's flight isn't far off that
<niemeyer> rogpeppe: Ugh.. a bit unfortunate indeed
<niemeyer> rogpeppe: Still a lot cheaper than flying you to Brazil, though ;)
<rogpeppe> niemeyer: very true
<rogpeppe> niemeyer: if i want a flexible return time, the price is £301 !
<rogpeppe> niemeyer: so i think i'll plan to leave at 1700 on friday, which should get me home by 2100, if that sounds ok
<niemeyer> rogpeppe: Yep, sounds good
<rogpeppe> niemeyer: that's the train leaving at 1700, so i guess i'd be leaving millbank (which is presumably where we'd be?) at 1600.
<niemeyer> rogpeppe: No, we'll actually not have a room on Friday afternoon.. so we'll be meeting somewhere more open
<niemeyer> rogpeppe: We have a room for the morning rented somewhere, though
<rogpeppe> niemeyer: just as long as it's not too far away from king's cross, that'd be good
<niemeyer> rogpeppe: Everybody is leaving EOD.. so won't be an issue
<rogpeppe> niemeyer: cool. will book tickets now.
<rogpeppe> niemeyer: all booked.
<niemeyer> rogpeppe: Woohay
<niemeyer> Aram: Do you have the new flight confirmation?
<niemeyer> Aram: For real Thursday?
<Aram> niemeyer: yes, I printed my boarding pass.
<niemeyer> Aram: Cool, just to be sure, you noticed that the flight was booked wrong and then re-booked, right?
<Aram> yes, yes, got a funny email asking why I wasn't present on today's flight.
<niemeyer> Aram: Cool
<niemeyer> Aram: It was half my mistake.. I was silly to assume they would actually book in the day I asked for
<Aram> heh.
<niemeyer> Rather than double checking the confirmation
<rogpeppe> niemeyer, fwereade, TheMue: first step towards the unit-assignment watcher: https://codereview.appspot.com/6256070/
<niemeyer> Aram: I'm not being ironic.. that's why they send the flights for confirmation
<rogpeppe> i'm off for the night
<rogpeppe> Aram, niemeyer: see ya tomorrow!
<Aram> have fun.
<niemeyer> rogpeppe: Awesome, have a good one
<niemeyer> I shouldn't take too long either
<niemeyer> Aram: I assume you have address and all?
<Aram> 27th Floor, Millbank Tower
<Aram> 21-24 Millbank
<Aram> it's good, right?
<niemeyer> Yep!
<niemeyer> Aram: So if we don't talk again today, I'll see you here tomorrow
<niemeyer> Stepping out for dinner
<niemeyer> Cheers all
<Aram> yes, great!
<Aram> enjoy!
<niemeyer> Thanks!
<davecheney> niemeyer: are you happy for me to extend, go-cmd-jujud-fix-testing, to include the other commands then submit ?
<davecheney> or should I do another round of reviews ?
<davecheney> is schema_test.go failing for anyone else ?
#juju-dev 2012-05-31
<rogpeppe> davecheney: hiya
<davecheney> rogpeppe: howdy
<davecheney> rogpeppe: what's shaking ?
<rogpeppe> davecheney: am sitting on platform waiting for train...
<rogpeppe> davecheney: to go down to london and meet with Aram and niemeyer
<davecheney> nice
<davecheney> i saw niemeyer online from a french IP this morning
<davecheney> what's up with that?
 * rogpeppe is always a little bit amazed when gatewaying through a mobile phone actually works
<rogpeppe> davecheney: he's in the uk, so that's a little bit odd
<davecheney> could just be the owner of the IP space
<rogpeppe> davecheney: probably
<davecheney> where abouts in london are you going ?
<rogpeppe> davecheney: arrive kings cross. then to millbank tower where canonical lives (for the next week - they're moving out, so i'm glad i'll see it before they do; it's supposed to be a spectacular location)
<davecheney> are they moving somewhere more salubrious?
<rogpeppe> davecheney: somewhere a little larger i think
<fwereade> davecheney, rogpeppe: the new place is just next to the tate modern I think, which sounds pretty cool :)
<rogpeppe> fwereade: cool. next to the river again then.
<fwereade> rogpeppe, nearby, at least :)
<rogpeppe> fwereade: morning BTW!
<davecheney> very nice
<rogpeppe> davecheney, fwereade: was wondering about how we're going to handle upgrades
<davecheney> rogpeppe: that is a small question for a big topic
<rogpeppe> i wondered if we have a "version" field in zk. clients can watch that and if they see it's changed, they'll look for a new version and replace themselves with it
<rogpeppe> does it have to be any more complex than that?
<rogpeppe> the zk tree will have to be backwards compatible anyway, i think.
<fwereade> rogpeppe, at some stage we'll want to sync up all upgrades so we can, eg, change how something's stored in ZK
<davecheney> rogpeppe: will the agents run under some kind of process manager, like upstart ?
<rogpeppe> davecheney: yes
<fwereade> rogpeppe, but, yes, that sounds like a good start
<rogpeppe> fwereade: i think that would be relatively easy too
<rogpeppe> fwereade: set the version to "pending" and wait for the various agents to acknowledge
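The watch-the-version idea could be sketched roughly like this, with a plain channel standing in for a ZK watch on the version node; all names are hypothetical:

```go
package main

import "fmt"

// watchVersion blocks until an observed version differs from the one
// this agent is running, then invokes upgrade and returns. The versions
// channel stands in for a ZK watch on a "version" field; in the real
// system the agent would then replace itself with the new binary.
func watchVersion(current string, versions <-chan string, upgrade func(string)) {
	for v := range versions {
		if v != current {
			upgrade(v)
			return
		}
	}
}

func main() {
	versions := make(chan string, 3)
	versions <- "1.0.0" // same version: ignored
	versions <- "1.0.0"
	versions <- "1.1.0" // new version: triggers upgrade
	close(versions)
	watchVersion("1.0.0", versions, func(v string) {
		fmt.Println("upgrading to", v)
	})
}
```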
<davecheney> rogpeppe: fwereade so each agent is responsible for a symlink (or something) that points to the current version of their binary
<fwereade> rogpeppe, yeah, my concern with this story is entirely in making the upgrade work even on machines that have a hyperactive toddler playing with their reset button
<rogpeppe> davecheney: i think i'd just create a new upstart script when changing versions
<davecheney> and if they notice the value of version in zk, they look for a binary that matches it, change the symlink, then commit seppuku and upstart restarts them ?
<rogpeppe> davecheney: i'm not sure the symlink is necessary
<fwereade> rogpeppe, davecheney's approach sounds like it may be more reliable in an unhelpful environment
<fwereade> rogpeppe, davecheney: but I don't think it covers the case where the args/env need to change
<davecheney> fwereade: i'd like to see a mixed approach, ie, the agents quit if they don't match the version
<davecheney> and dpkg handles installing the right version
<davecheney> but I don't think I understand how binaries get onto the machines
<rogpeppe> davecheney: the cloudinit script downloads them initially
<davecheney> ok, so no package manager
<rogpeppe> davecheney: yeah
<rogpeppe> davecheney: because they can come from a private s3 bucket
<rogpeppe> davecheney: we've already got the logic for choosing versions
<rogpeppe> fwereade: things are a little harder for the unit agent, because there may be commands running
<rogpeppe> fwereade: i suppose it can wait until all commands have completed before upgrading itself
<rogpeppe> davecheney: but i think you're right, i think there should be one thing responsible for actually downloading and restarting the s/w
<fwereade> rogpeppe, yeah, I think so; telling the jujuc server to close and waiting for it should handle that case
<rogpeppe> davecheney: and i think it should probably be the machine agent
<fwereade> rogpeppe, +1
<rogpeppe> fwereade: so i think from the machine agent's point of view, it might go like this:
<rogpeppe> fwereade: see version change; download new s/w; wait for all local agents to shut down; replace upstart scripts; restart everything (including self); exit
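rogpeppe's sequence could be sketched as an ordered run of steps behind a small interface; every name here is made up for illustration, not taken from the juju codebase:

```go
package main

import "fmt"

// upgrader captures the steps the machine agent would perform on a
// version change. The interface and its method names are hypothetical.
type upgrader interface {
	Download(version string) error              // fetch the new tools bundle
	WaitForAgentsStopped() error                // wait for local agents to shut down
	ReplaceUpstartScripts(version string) error // rewrite the jobs for the new version
	RestartAll() error                          // restart everything, including self
}

// runUpgrade performs the steps in order, stopping at the first error.
// On success the caller is expected to exit, letting upstart restart it.
func runUpgrade(u upgrader, version string) error {
	steps := []func() error{
		func() error { return u.Download(version) },
		u.WaitForAgentsStopped,
		func() error { return u.ReplaceUpstartScripts(version) },
		u.RestartAll,
	}
	for _, step := range steps {
		if err := step(); err != nil {
			return err
		}
	}
	return nil
}

// fakeUpgrader records the order of calls, for demonstration.
type fakeUpgrader struct{ calls []string }

func (f *fakeUpgrader) Download(v string) error { f.calls = append(f.calls, "download "+v); return nil }
func (f *fakeUpgrader) WaitForAgentsStopped() error { f.calls = append(f.calls, "wait"); return nil }
func (f *fakeUpgrader) ReplaceUpstartScripts(v string) error { f.calls = append(f.calls, "replace"); return nil }
func (f *fakeUpgrader) RestartAll() error { f.calls = append(f.calls, "restart"); return nil }

func main() {
	f := &fakeUpgrader{}
	if err := runUpgrade(f, "1.2.0"); err != nil {
		panic(err)
	}
	fmt.Println(f.calls) // [download 1.2.0 wait replace restart]
}
```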
<rogpeppe> fwereade, davecheney: train arriving; signal might get dodgy
<fwereade> rogpeppe, yeah, SGTM
<davecheney> rogpeppe: sounds good, and possibly could be reused across all agents
<fwereade> rogpeppe, davecheney: if the MA knows what agents should be running on that machine (including the PA) I'd really prefer it if it were all handled by the MA
<davecheney> fwereade: sure, what i meant to say was, the 'watch version, download binary' might be reusable across all agents
<rogpeppe> davecheney: i don't think anything other than the MA will need to do any downloading
<rogpeppe> davecheney: all the binaries are bundled together
<rogpeppe> davecheney: i *think* that all the other agents will need to do is wait for version change and exit (and possibly indicate that they're exiting)
<rogpeppe> i'm not entirely sure of the best way to stage the shutdown though
<rogpeppe> is anyone still seeing this?
<davecheney> rogpeppe: that would be even better, much better separation
<rogpeppe> i haven't looked into upstart much. is it possible to wait for something to exit without automatically restarting it?
<rogpeppe> hmm, perhaps it could rewrite the upstart script
<davecheney> rogpeppe: I think, but I'm no expert
<rogpeppe> davecheney: i'm pretty sure that upstart monitors the scripts in /etc
<davecheney> rogpeppe: yup, and it will reload _its_ representation of them
<fwereade> rogpeppe, you can add a job-name.override containing "manual" to /etc/init
<fwereade> rogpeppe, I think you need to explicitly stop it though
 * davecheney reads http://upstart.ubuntu.com/cookbook/
 * rogpeppe starts to download it
 * fwereade feels for davecheney (not that it's bad, I'm really happy it exists, but I've always found figuring upstart stuff out harder than it should be)
<fwereade> morning niemeyer
<niemeyer> fwereade: Heya!
<niemeyer> Morning all!
<rog> niemeyer: yo!
<niemeyer> rog: Heya
<niemeyer> rog: Where are you? :)
<rog> niemeyer: we were just having a chat about upgrading
<rog> niemeyer: good. on the train, so intermittent connectivity
<niemeyer> rog: That's awesome :)
<rog> niemeyer: i think we may have something approaching a plan for upgrades
<niemeyer> rog: Sounds great, let's talk this afternoon
<rog> niemeyer: yeah
<fwereade> TheMue, I just had a thought which would have been a lot more helpful 2 weeks ago
<davecheney> night all
<fwereade> TheMue, niemeyer: I'm wondering whether we currently have any reason at all to explicitly allow addition of peer relations from outside state
<fwereade> TheMue, niemeyer: because services with peer relations should *always* have their peer relations set up, and that should perhaps be rolled into AddService
<fwereade> TheMue, niemeyer: (rather than being tacked onto the deploy command, which always felt a little off)
<TheMue> fwereade: How would that look like?
<fwereade> TheMue, AddRelation would always take 2 args; AddService would contain more code for setting up the relation (and I guess defer the topology change to include both the service and the relation as the last step)
<fwereade> TheMue, AddService has access to the charm already, I think
<TheMue> fwereade: Have to take a deeper look.
<TheMue> fwereade: AddService() then has to check if it's a peer or not. So more complexity there.
<fwereade> TheMue, just a thought
<TheMue> fwereade: Maybe a good one, don't want to break it. ;)
<TheMue> fwereade: Just have to understand it more.
<niemeyer> fwereade: That's a fantastic question, actually, and I want to ponder about it this afternoon too
<niemeyer> fwereade: Because we're breaking an assumption that was made in the original code, that I'm not sure makes sense
<niemeyer> fwereade: The original code did allow for multiple peer relations
<niemeyer> fwereade: the new one does not
<niemeyer> fwereade: I'm not sure we want to do that
<niemeyer> fwereade: I'm a bit concerned right now, actually, because the model change in the topology will make it painful to bring that back
<niemeyer> Which means we'll have to rethink the original thinking again
<niemeyer> As breaking compatibility to introduce that would be bad
<niemeyer> TheMue: ^^^
<TheMue> niemeyer: Yep, seen.
<niemeyer> TheMue: We may have to redo the topoRelation stuff once more
<niemeyer> TheMue: But let's get this branch to the end of the line anyway
<TheMue> niemeyer: No problem.
<niemeyer> TheMue: (with the current logic)
<niemeyer> TheMue: It'll be easier to refactor back to the original model, if we have to, than to keep that huge change flying for much longer
<niemeyer> fwereade: Thanks for bringing that up
<fwereade> niemeyer, a pleasure, hope it proves fruitful :)
<niemeyer> fwereade: Already has!
<niemeyer> fwereade: At least we'll know what we're doing, rather than blindly finding out down the road that we did a mistake on the transition
<fwereade> niemeyer, yeah :)
<fwereade> gents: I spent much of last night getting friendly with mosquitoes, so I'm taking a walk in the sun to remind my body it's daytime; bbs
<fwereade> just proposed https://codereview.appspot.com/6245075 if anyone's of a mind
<niemeyer> fwereade: Enjoy the walk
<fwereade> niemeyer, I did :)
<niemeyer> fwereade: Oh, hey, it's been a while :-)
<fwereade> niemeyer, did I miss anything? looked empty...
<niemeyer> fwereade: Hm?
<fwereade> niemeyer, don't worry, I think I misunderstood what you said
<niemeyer> fwereade: I was alluding to my complete lack of sensibility related to the timing of your previous comment
<fwereade> niemeyer, yeah, I get that now :)
<niemeyer> fwereade: Review delivered
<fwereade> niemeyer, cheers
<TheMue> niemeyer: Thx too
<TheMue> niemeyer: And ok, will split. ;)
<niemeyer> TheMue: Sorry, see last comment
<niemeyer> TheMue: Splitting is misleading
<niemeyer> TheMue: RemoveServiceRelation would really remove the *relation*, not the service relation
<niemeyer> TheMue: Can we have a ServiceRelation.Relation method that returns the *Relation, which can be removed?
<TheMue> niemeyer: Yes, is possible.
<TheMue> niemeyer: So one would RemoveRelation(relation) or RemoveRelation(serviceRelation.Relation())?
<niemeyer> TheMue: Yeah, that looks very clear
<TheMue> niemeyer: OK, H5.
<niemeyer> TheMue: Thanks
<Aram> niemeyer: hey
<Aram> I'm here
<Aram> (almost).
<niemeyer> Aram: Heya
<niemeyer> Aram: Where? :)
<niemeyer> Aram: Can't see you :)
<Aram> I'm there in half an hour or so.
<niemeyer> Aram: Woohay
<fwereade> hey again wrtp
<wrtp> yo!
<TheMue> niemeyer: The comments and the logic in my addRelation() (yes, will get a better name) are from/inspired by relation.py line 105 ff. It looks like an explanation why container scoped relations are handled elsewhere.
<TheMue> niemeyer: Sadly I don't know if there's a more elegant way to handle container scoped relations in the same context.
<fwereade> TheMue, is there any way to find out what units are assigned to what machines without topology access?
<TheMue> fwereade: As far as I see not. The assignment only modifies the topology.
<fwereade> TheMue, ok; this makes me fret slightly about the test, but I'll see how I go
<TheMue> fwereade: In a different case (not yet in trunk) I put a helper in export_test.go
<fwereade> TheMue, and I just found AssignedMachineId anyway, which I think gives me everything I need.. not sure how I missed that
<fwereade> TheMue, thanks :)
<TheMue> fwereade: np
<fwereade> niemeyer, do we want to retain the placement config setting for ec2?
<niemeyer> fwereade: How do you mean?
<fwereade> niemeyer, we kept it in for 12.04 somewhat reluctantly as I recall
<fwereade> niemeyer, (allowing setting placement in ec2 config)
<niemeyer> fwereade: Hmm.. oh, you mean in the yaml?
<fwereade> niemeyer, yeah
<niemeyer> fwereade: I'm happy to delay it at this point
<niemeyer> fwereade: But we may have to add it depending what people have been doing with it
<fwereade> niemeyer, sure, that shouldn't be too hard
<fwereade> niemeyer, once we have one environment setting others should be relatively simple to add
<fwereade> niemeyer, hmm, I'm feeling a strange reluctance to test a method that just returns a constant
<fwereade> niemeyer, in python I'd probably do it without thinking
<fwereade> niemeyer, am I being lazy or pragmatic?
<niemeyer> fwereade: Hmm.. good question.. I'm tempted to suggest the test in this case
<fwereade> niemeyer, yeah, I decided to play it safe :)
<niemeyer> fwereade: Mainly because it avoids the silly typo scenario
<fwereade> niemeyer, indeed
<robbiew> fwereade: 1:1 time?
<fwereade> robbiew, heyhey, sorry... cath's in bed and I had to pop out and get some stuff for her
<fwereade> robbiew, have I missed my slot? :(
<robbiew> fwereade: yes...but I can reslot you
<robbiew> ;)
<fwereade> robbiew, cool, when works for you?
<robbiew> fwereade: does tomorrow at the same time work for you?  I can also do it in a little over an hour, but realize that could be a bit late for you
<fwereade> robbiew, tomorrow same time would probably be better if that's ok
<robbiew> sounds good
<fwereade> robbiew, 23h from now, right?
<robbiew> fwereade: yep
<fwereade> robbiew, great, thanks
<davecheney> wrtp: bit late for you mate
<wrtp> davecheney: yo!
<wrtp> davecheney: am in london
<wrtp> davecheney: i can go to bed when i want to
<davecheney> wrtp: that sounds like me in SF
<wrtp> davecheney: how's tricks?
<davecheney> wrtp: good, just polishing up my branches then I was going to go to the cafe for some breakfast
<davecheney> wrtp: had a good night with the lads ?
<wrtp> davecheney: yeah, and had some good discussions about the unit agent, upgrading &c
<davecheney> solid
<robbiew> wow...wrtp burning the midnight oil
<robbiew> assume you met with niemeyer and Aram?
<wrtp> robbiew: it's not midnight yet
<robbiew> almost though ;)
<wrtp> robbiew: yeah. aram is here in the room.
<robbiew> lol
<robbiew> tell him I said "hi"
<wrtp> robbiew: just did
<robbiew> ;)
<Aram> hi
<wrtp> Aram: yo!
<wrtp> lol
<robbiew> Aram: don't take wrtp's lack of a life as the "norm"....it's okay not to work at 11pm ;)
<Aram> heh,
<davecheney> indeed, only I am permitted to be up this late
<robbiew> well...and I usually lurk...and often get back on after "Dad Duties" ;)
<robbiew> btw...Mark Ramm officially starts tomorrow ;)
<wrtp> robbiew: cool.
 * davecheney applauds
<wrtp> davecheney: here's a brief sketch of how i think upgrading should work: http://paste.ubuntu.com/1017166/
<davecheney> wrtp: lgtm, extra points for not making the restarting the agents job
<wrtp> davecheney: oh yeah, forgot about that the machine agent needs to restart itself too: http://paste.ubuntu.com/1017167/
<davecheney> seeing as we're approaching a quorum here
<davecheney> what does everyone think about adding a method like state.IsValid() ?
<wrtp> davecheney: current thought is that perhaps the machine agent should be responsible for starting the provisioning agent.
<davecheney> wrtp: SGTM, it wasn't clear if machine/0 had an MA
<wrtp> davecheney: just lost my connection. bizarrely Aram still saw everything, so i've now seen what you said...
<wrtp> davecheney: last i saw was "davecheney applauds"
<Aram> wrtp: http://paste.ubuntu.com/1017173/
<wrtp> davecheney: not sure what you mean by the isValid thing.
<wrtp> my version of that: http://paste.ubuntu.com/1017177/. irc (or is it just tcp??) is bizarre.
<wrtp> davecheney: machine 0 does have an MA, i think, but it doesn't do much currently.
<wrtp> davecheney: by making the MA responsible for starting the PA, we can move towards a place where the PA is eventually just a unit
<davecheney> wrtp: i like where this discussion is going
<davecheney> would that imply it has a UA as well ?
<wrtp> davecheney: probably
<davecheney> mmm, deliciously self referential
<wrtp> davecheney: mmm, i thought so too
<wrtp> davecheney: bootstrap it up and it eats its own tail
<davecheney> that is when you know it's working right, when there are no special cases
<wrtp> davecheney: i like the idea of the PA as a subordinate charm to the MA
<wrtp> davecheney: why not replicate the PA on every machine, as niemeyer suggested earlier?
<davecheney> no reason not too
<davecheney> hmm
<davecheney> might need a little bit of work in the state
<davecheney> currently the PA marks a machine as started by looking at the value of the 'provider-machine-id' key
<davecheney> well, s/started/claimed/g
<davecheney> but it's not unsolvable to support multiple PAs
<wrtp> davecheney: i think it's worth thinking about when implementing the PA
<wrtp> davecheney: because we need high availability even if we don't implement it as a service
<davecheney> making it a service would make it trivial to do, service PA add unit ; add unit ; add unit
<davecheney> then you have three PA's running
<wrtp> davecheney: exactly
<davecheney> which is a decent number
<davecheney> ok, i'll ponder the implications of that
<wrtp> davecheney: if it was subordinate to the machine service, then you'd have one PA on every machine. i think that may be overkill, but maybe it's just fine.
<davecheney> wrtp: that is one for the customers or product leads, to decide how to play that
<bcsaller> the machine service doesn't run units and isn't a service, thus the concept of subordinate isn't needed. A normal service unit running on the machine, managed outside the ones invoked by the admin, would work
<davecheney> the main problem with n PA's is storing the instance Id would have to become atomic
<wrtp> bcsaller: the machine provider doesn't run units?
<davecheney> which means the topology
<wrtp> bcsaller: i thought that's more-or-less all it did
<bcsaller> it runs unit agents which run units
<bcsaller> but it has no services in the sense you're talking about
<wrtp> bcsaller: ok, i was thinking of a unit agent as a "unit", but i see
<bcsaller> also I think you need to scale back the number of units running zookeeper; at 1000 machines, each with a zookeeper PA, the inter-cluster traffic would be too high, I suspect
<wrtp> bcsaller: i definitely wouldn't want each PA machine to be running zk too
<bcsaller> unit agents have a principal service, that can spawn another unit agent running subordinate to it
<wrtp> bcsaller: the kind of thought we were kicking around today is that maybe that actually maps quite well to the relationship between the machine agent and the provider agent. probably totally crackful :-)
<bcsaller> its not crack, running the PA stuff as services and units makes sense and is something we've wanted to do for a long time, both for HA and scale out
<bcsaller> but it would be a unit of zookeeper and maybe a unit of some local storage service and a unit of the admin backend and so on running on some cross section of machines
<bcsaller> where I think there is more than one juju internal service and they might scale differently
<bcsaller> in that world there isn't a single PA I suspect
<wrtp> bcsaller: so for today my takeaway "good idea" was that the MA is primary and that we can allocate units to a machine and some might be containerised and some might be in the same container (e.g. subsidiary units)
<wrtp> bcsaller: and that that categorisation actually includes more-or-less everything other than the machine agent itself.
<wrtp> bcsaller: ... maybe. definitely a bit of late night hand waving going on. need to think more.
<bcsaller> I think I see how you're thinking about it though
<wrtp> bcsaller: the missing link currently is that we have no way of specifying that several units should run containerised on the same machine (of course we need to solve the network issue first)
<bcsaller> also missing is that we need something that does what the MA does or we can't build out units and hence services, making it hard to treat the MA as a service itself with things running subordinate to it
<bcsaller> I'd rather promote the idea of container to a 1st class object in the system
<wrtp> "build out units and hence services" ?
<wrtp> bcsaller: ^
<bcsaller> sounded like you want to call the MA a service
<bcsaller> which is very cyclic, as it's the thing that puts service units onto machines
<bcsaller> so modeling it as a service is not a clean fit today
<wrtp> bcsaller: that's true. but it doesn't put MAs onto machines
<bcsaller> the PA?
<wrtp> bcsaller: yeah
<wrtp> bcsaller: (crack approaching fast!) so PA finds a new "machine unit", spawns a new machine to run the MA for that unit, which also looks for units allocated inside that unit.
<wrtp> bcsaller: containers now being first class, at least within the admin structure, of course. :-)
<wrtp> bcsaller: so in that sense, the MA becomes a glorified unit agent.
<bcsaller> so it would look for containers assigned to that machine and set those up, which in some future world could be unpacking them from frozen lxc states (how's that for crack)
<wrtp> bcsaller: sounds like my kind of crack
<bcsaller> heh
<wrtp> bcsaller: so... maybe there's actually no need for a machine agent at all. we can actually write the machine agent as a regular charm...
<wrtp> an interesting thought experiment anyway
<bcsaller> if the PA can bring units up running enough code to deploy services, but that last part is what the MA does and thats the loop
<bcsaller> so possible, but it means the images coming up might not be "clean images"
<wrtp> bcsaller: yup
<wrtp> bcsaller: they aren't clean right now
<wrtp> bcsaller: they've already got our shit running on 'em
<bcsaller> we start from a clean image though, we install things as part of their cloud init, but it could be similarly done. Still, that running bit is what we call the MA.
<bcsaller> I sound like a broken record
<bcsaller> and I just realized how dated that expression is
<bcsaller> and now I feel old, thanks ;)
<wrtp> bcsaller: :-
<wrtp> )
<wrtp> bcsaller: "their" cloud init? isn't it *our* cloud init?
<bcsaller> that too
<wrtp> bcsaller: one interesting thought is the idea that we could have PAs that run on different providers. so we could have a genuinely cross-provider juju worm...
<bcsaller> wrtp: you'd have to select container handling code specific to the arch as well, if for example some types of virtualization or isolation were not available. It might be that we allow additional charm metadata (similar to constraints) that say what features from a provider we depend on
<bcsaller> that could apply to juju internal service charms initially but things like EC2 services in the charm metadata would be useful to users as well
<wrtp> bcsaller: makes sense. in fact if we have containers as first class, then what we can embed in a container comes from the kind of that container (you can't put LXC inside LXC for example)
<bcsaller> they've worked hard to make that specific case mostly work, but yeah, I hear ya
<wrtp> bcsaller: oh, really, cool. well anyway, it's probably not so useful to allow it
<bcsaller> but things like network isolation would apply at the container level as well (but possibly requires cross container machine modifications as well)
<wrtp> bcsaller: but i'm thinking we can model container placement as part of constraints perhaps
<wrtp> bcsaller: network connectivity is an interesting issue altogether
<wrtp> bcsaller, davecheney: i should probably stop now. i hear snores coming from the other side of the room... :-)
<bcsaller> ha, ok, nite
#juju-dev 2012-06-01
<hazmat> davecheney, the pa needs a lock around the ma
<hazmat> er. around the machine state it's processing
<hazmat> to allow for concurrent pa
<davecheney> hazmat: yes
<davecheney> hazmat: even simpler the PA stores the provider instance id into the state once it is known
 * hazmat tries to catch up on the conversation
<davecheney> which can be used as a sentinel to say 'hey, at least I started this vm, it might be dead now, but I tried'
<hazmat> davecheney, sure.. that's what it does currently
<davecheney> that would be problematic with multiple PA's all racing to provision machines
<davecheney> TBH, i haven't thought about how to solve that yet
<hazmat> it's just a per-machine lock, or even a queue..
<davecheney> hazmat: would that be implemented in ZK ?
<hazmat> davecheney, sure
<davecheney> finger in the air, maybe that could be done with a new content node /machines/machine-X/state
<hazmat> there's recipes for building all of these structures for zk on the zk wiki.. the py zk lib (txzk) has several implemented..
<davecheney> i'll go check that out
<hazmat>  /machines/machine-x/lock i would think
<davecheney> but i'll have to moderate that with gustavo's desire to not be too closely tied to SK
<hazmat> or even /machines/provisioning-queue/
<davecheney> ZK
<davecheney> hazmat: sounds like maildir
 * hazmat giggles
<hazmat> well he wants to maintain the current client api
<hazmat> so assuming the guarantees are the same, the principles should carry over..
<hazmat> mongo is consistent and has atomic ops..
<davecheney> it's on my pondering queue, but right now we have 0 PA's, so i'm not really focusing on having N PA's yet
<hazmat> doing locks, and queues with it is straight forward
<hazmat> sure..
<davecheney> hazmat: that might be simpler
<hazmat> the goal seems to be to get to the parity..
<davecheney> in mongo, when the client disconnects, do the locks go away ?
<hazmat> and then flip the bits ;-)
<davecheney> my concern with lock files and shit is recovering from a stale lock
<hazmat> davecheney, you record the client and a timestamp on the lock..
<hazmat> davecheney, have a look at  the store code..
<hazmat> it has some locking
<davecheney> i'll do so
<davecheney> hazmat: in python, what was the result of the tcp connection to zookeeper going away ?
<davecheney> did the agent exit, or did you try to recover ?
<hazmat> davecheney, it recovers
<hazmat> davecheney, there's a connection wrapper that  basically creates an immortal connection
<davecheney> where is that handled ? in the zk library, by trying other replicas ?
<davecheney> ahh right
<hazmat> davecheney, it handles transient disconnects and session expirations
<hazmat> for transient disconnect the reconnection is handled by the libzk
<hazmat> for session expirations its handled by the wrapper
<hazmat> also triggers watches on session reconnect, its a fairly straightforward impl.. http://bazaar.launchpad.net/~juju/txzookeeper/trunk/view/head:/txzookeeper/managed.py
<hazmat> re mongo.. here's a simple queue impl i wrote with locking.. effectively each client has to be cooperative about breaking stale locks.. they're easy to detect.. and the breaking is just part of the queue impl.
<hazmat> rog's upgrade solution is odd..
<hazmat> i guess that makes sense.. ma responsible for code upgrade, others sync on that
<davecheney> hazmat: what if the tools were a .deb, upgrades would be as simple as something running an apt-get upgrade on every machine
<hazmat> davecheney, debs are fraught with problems in this context
<hazmat> davecheney, across multiple versions of the distro... needs an apt repo maintenance..
<davecheney> hazmat: sure, i guess that was a naive solution
<hazmat> the actual download isn't a huge problem
<hazmat> its just trying to ensure the coordination around it
<davecheney> i'm just wondering if we're reinventing dpkg hooks
<hazmat> davecheney, how does apt handle multiple versions avail..
<hazmat> it picks the latest..
<davecheney> indeed
<hazmat> by default anyways..
<hazmat> dpkg hooks aren't really about cross machine coordination
<hazmat> the whole thing becomes much more interesting, and indeed.. simpler when you consider it from a db upgrade perspective
<davecheney> no, really just for poking upstart
<davecheney> 'Everyone! Switch to version X!
<hazmat> where you need everything to have the new code ready.. you do a centralized db upgrade.. and then restart everything..
<davecheney> i like the use of the state for signalling things like that
<hazmat> indeed.. that's why we have zk..
<wrtp> davecheney: morning!
<wrtp> TheMue: yo!
<TheMue> davecheney, wrtp, fwereade: Morning.
<wrtp> fwereade: hey!
<davecheney> wrtp: TheMue fwereade Aloha!
<davecheney> wrtp: TheMue I have questions about the state
<davecheney> who is sober enough to answer them ?
<wrtp> davecheney: woke up with a head full of crack after last night's recursive agent discussions...
<TheMue> davecheney: Let's try it.
<wrtp> davecheney: i'm sober enough but i might not know the answers
<davecheney> so, in the latest proposal for the PA, gustavo had this concern
<davecheney> https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#newcode37
<davecheney> TheMue: wrtp after trying all day, I can't actually make the state break, that is, ever return an error
<davecheney> nor can I observe any of the watchers close on me through rough handling of the zookeeper server
<wrtp> davecheney: that's true currently, but might not be in the future, when we're not talking directly to zk (or zk is actually mgo under the covers)
<davecheney> wrtp: yup, after talking to hazmat he said that in python they also have a wrapper that makes the state immortal and can cope with transient disconnections from the server
<wrtp> davecheney: that's controversial...
<davecheney> wrtp: TheMue my intention was to add a state.IsValid() method somewhere so that the PA could drop out of the loop, make another state connection and try again
<davecheney> but without the ability to ever make the damn thing crap itself, it's hard
<wrtp> davecheney: what would IsValid test?
<wrtp> davecheney: isn't that what the channel returned from zk.Dial is supposed to do?
<davecheney> wrtp: my idea was it would be set to false on an error (of the sort that GetW or ExistW give)
<TheMue> wrtp: Yes, good question.
<davecheney> but as I can't get it to generate an error ...
<davecheney> wrtp: IsValid is supposed to detect the 'broken' Gustavo spoke of in that comment
<TheMue> davecheney: That's depending on today's gozk. But as Roger said this will change (or at least be wrapped as a first step).
<TheMue> davecheney: In case of a broken connection the state is not invalid. It's "only" a connection error. Indeed the watcher may be notified if we don't have a mechanism to automatically reestablish it.
<davecheney> TheMue: yup, i'm always planning that ZK has a limited shelf life
<TheMue> davecheney: But reading and writing of the state is always live.
<wrtp> davecheney: i think fwereade did some experiments around breaking zk connections, but i may remember wrong
<davecheney> TheMue: so here is the logic, something is receiving from the watcher channel, and it detects channel closed, so we call Stop on the watcher to clean up
<davecheney> then before going back into the loop, i need something like state.IsValid or !isClosed() to say 'don't try this again, it won't do any better this time'
<TheMue> davecheney: isDisconnected() or something like that, but please not IsValid().
<TheMue> davecheney: Because the state in ZK is still valid.
<wrtp> davecheney: you should get an error from the watcher's Stop method, i think
<davecheney> wrtp: yes, that is another way to do it, but is fraught with difficulties as the error might be wrapped
<davecheney> wrtp: but if you're saying on _any_ error from stop we should consider the state disconnected and tear everything down
<wrtp> davecheney: we should make sure that we make *sure* that a fatal error from zk is passed through in a recognisable way to the Stop return value
<davecheney> wrtp: can you elaborate on what you mean by recognisable way
<wrtp> davecheney: i mean that it should have a known type or a known value
<wrtp> (probably the former, so we can encapslate the actual zk error)
<davecheney> wrtp: that is what I was thinking too, rather than error, state.Error
<wrtp> davecheney: well, *state.Error returned as an error.
<wrtp> gotta go to breakfast
<davecheney> wrtp: of course
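The pattern wrtp and davecheney converge on, a known error type (`*state.Error` returned as an `error`) so callers can recognise a fatal zk failure from a watcher's Stop, might look like this. The type, fields, and `stopWatcher` helper are hypothetical illustrations, not juju's actual state package:

```go
package main

import (
	"errors"
	"fmt"
)

// Error is a hypothetical state-level error type wrapping the
// underlying zookeeper error, so callers can recognise fatal
// failures by type rather than by matching error strings.
type Error struct {
	Op  string // the state operation that failed
	Err error  // the encapsulated zookeeper error
}

func (e *Error) Error() string {
	return fmt.Sprintf("state: %s: %v", e.Op, e.Err)
}

// stopWatcher simulates a watcher Stop that fails fatally.
func stopWatcher() error {
	return &Error{Op: "stop watcher", Err: errors.New("zookeeper: session expired")}
}

func main() {
	err := stopWatcher()
	// A type assertion recognises the fatal state error even though
	// it was returned through the plain error interface.
	if stateErr, ok := err.(*Error); ok {
		fmt.Println("fatal state error during:", stateErr.Op)
	}
}
```

The point of the wrapper type is exactly what davecheney worries about: without it, the zk error might be wrapped somewhere along the way and become unrecognisable to the caller deciding whether to tear everything down.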
<davecheney> btw, did aram write Doozer ?
<fwereade> hey all
<davecheney> has anyone been able to sign up to the HP cloud ?
<davecheney> I tried twice today, with my canonical and personal email addresses, but I haven't got a confirmation email back yet :(
<wrtp> davecheney: i haven't tried
<davecheney> wrtp: i emailed jorge to see if he could reach out to a tech contact there
<wrtp> davecheney: gustavo says "sign up, give them your credit card details, then send me the account id and the tenant id"
<davecheney> wrtp: i'm blocked at the 'sign up' stage
<wrtp> davecheney: (but obviously that might be hard if they won't let you sign up!)
<davecheney> it never sends me the confirmation email
<davecheney> jynx!
<davecheney> OT: at mailguard, our bot would mute someone if you call jynx
<davecheney> it used a Levenshtein computation to see who to mute
<wrtp> davecheney: cool
<davecheney> further proof that with enough time and perl you can achieve even the most useless of things
<wrtp> davecheney: i wonder if there's a go package for computing Levenshtein distance
<davecheney> wrtp: wouldn't be hard, i don't think its very complicated to do in code
<wrtp> davecheney: agreed. it's nice to be able to take stuff off the shelf though
<davecheney> ask uriel
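Davecheney is right that it isn't much code. A minimal Go sketch of Levenshtein distance using the classic dynamic-programming recurrence with a rolling row:

```go
package main

import "fmt"

// levenshtein returns the edit distance between a and b: the minimum
// number of single-rune insertions, deletions, and substitutions
// needed to turn a into b. Uses O(len(b)) extra space via a rolling row.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0 // runes match, substitution is free
			}
			cur[j] = min(prev[j]+1, min(cur[j-1]+1, prev[j-1]+cost))
		}
		prev = cur
	}
	return prev[len(rb)]
}

func min(x, y int) int {
	if x < y {
		return x
	}
	return y
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // prints 3
}
```

Picking the nearest nick to mute, as the mailguard bot did, is then just a matter of minimising this distance over the channel's user list.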
<wrtp> davecheney: gotta go. have fun.
<davecheney> wrtp: cya
<niemeyer> Hellos!
<rog> Aram, niemeyer: it'd be nice to be able to talk, eh?
<Aram> heh, yeah
<niemeyer> rog: Yeah.. well.. I'm just responding to Dave's email, and then we can move elsewhere to chat as well
<niemeyer> fwereade: ping
<fwereade> niemeyer, pong
<niemeyer> fwereade: Heya, good morning
<fwereade> niemeyer, and yourself :)
<niemeyer> fwereade: Do you have the RT# for the juju-dev ML?
<fwereade> niemeyer, it was set up this morning
<niemeyer> fwereade: Oh
<fwereade> niemeyer, I appear to be an administrator, which I guess I should have expected :/
<niemeyer> fwereade: Yeah :-)
<fwereade> niemeyer, something wrong?
<niemeyer> fwereade: No, I want to use it :)
<niemeyer> fwereade: Have you subscribed people to it?
<fwereade> niemeyer, not yet
<niemeyer> fwereade: Ok, would you mind to do that so we can start using it?
<fwereade> niemeyer, on it
<niemeyer> fwereade: Subscribing the obvious candidates, and then mailing juju@ should do it
<niemeyer> fwereade: Thanks a lot
<niemeyer> I'll hold off sending the email to Dave to send it to the list instead
<fwereade> niemeyer, added
 * fwereade worries that he's forgotten someone who will be mortally offended
<niemeyer> fwereade: juju-dev@lists.ubuntu.com?
<fwereade> niemeyer, yeah
<niemeyer> fwereade: Send an email to juju@ inviting people.. no body will be offended
<niemeyer> nobody
<niemeyer> fwereade: Is Aram in?
<fwereade> niemeyer, yeah
<niemeyer> fwereade: Superb
<niemeyer> fwereade: Thanks a lot for taking care of this
<niemeyer> Sending first email
<fwereade> niemeyer, a pleasure :)
<niemeyer> Done
<niemeyer> Let's see if email still works these days ;-)
 * TheMue has the mail in his folder for this list, nice.
<niemeyer> TheMue: ping
<TheMue> niemeyer: pong
<niemeyer> TheMue: I spent some time yesterday with rog and aram
<niemeyer> TheMue: around the concept of relations and how they were inverted
<TheMue> niemeyer: I'm listening.
<niemeyer> TheMue: We got to the conclusion that the original design was actually more future proof than what we have in place in the CL now
<niemeyer> TheMue: It's very hard to imagine that we'll have a single service participating in a single relation with two roles
<niemeyer> TheMue: But it's not so much of a stretch to imagine that two services may participate in a relation with the same role
<TheMue> niemeyer: IC
<niemeyer> TheMue: E.g. a master with two slaves, two peers, etc
<niemeyer> TheMue: The new design makes that unfeasible
<TheMue> niemeyer: Hmm, a change from map[role]serviceKey to map[role][]serviceKey wouldn't help (as a quick thought)?
<TheMue> niemeyer: With all depending changes naturally.
<niemeyer> TheMue: Sounds worse than just doing what we had
<niemeyer> TheMue: map[service key]stuff
<TheMue> niemeyer: OK
<niemeyer> TheMue: That said, hold on a moment
<niemeyer> TheMue: I'll re-review the latest state of that branch.. I'm hoping we can integrate the current version, and then refactor it over
<TheMue> niemeyer: The idea of William and me has been to move away from separate AddRelation() and AddEndpoint(), so that the risk of having a relation w/o EP or only a provider or requirer is lower.
<TheMue> niemeyer: Maybe following the current model but with a new API could do this too.
<niemeyer> TheMue: I don't see how that's related to the point above
<TheMue> niemeyer: It's only the motivation for how the change started. The model change then arose out of a discussion (sadly, it seems, in the wrong direction).
<TheMue> niemeyer: So maybe the model could now better stay as it is in Py, but the API changes to reduce the risk of adding illegal relation sinks.
<niemeyer> TheMue: Agreed, not suggesting further changes.. just the topology has to be adapted to invert the map
<TheMue> niemeyer: OK, start a new branch based on the zk to topo renaming to change the topology first. Afterwards new branches for AddRelation() and RemoveRelation() while the two current open ones will be abandoned.
<niemeyer> TheMue: Please hold on
<niemeyer> TheMue: What are you abandoning?
<TheMue> niemeyer: The AddRelation() and the RemoveRelation() branches. They have older branches as prerequisites.
<TheMue> niemeyer: I don't know how good the newer refactoring branch then works with those two.
<niemeyer> TheMue: No, please don't drop them
<niemeyer> TheMue: We've spent a lot of time polishing those branches
<niemeyer> TheMue: Let's get them in, and have the change on top of them
<TheMue> niemeyer: Merging is now problem?
<niemeyer> TheMue: Me not understand
<TheMue> niemeyer: Ah, but I understand you now. You want to get both in the trunk first before I start refactoring.
<niemeyer> TheMue: Yeah
<TheMue> niemeyer: Am I right?
<TheMue> niemeyer: Good.
<TheMue> niemeyer: Otherwise I would have kept the source and then bring it in with new branches.
<TheMue> niemeyer: The "new" RemoveRelation() propose will come in in a few moments, just testing.
<niemeyer> TheMue: Ah, cool
<TheMue> niemeyer: Aaargh, it's in, but again with a lot of files too many in the review list.
<niemeyer> TheMue: Ok, can you please clean it up so I can review and unblock you?
<TheMue> niemeyer: Will try. It depends on the AddRelation() proposal.
<niemeyer> Aram: https://launchpad.net/mgokeeper
<niemeyer> Aram: bzr branch lp:mgokeeper and profit :)
<niemeyer> TheMue: The pre-requisite is merged.. it's the same problem you had last time, and the same solution as well
<TheMue> niemeyer: Yep
<niemeyer> TheMue: "Today this isn't done. I would like to do it in a new small branch after this
<niemeyer> and RemoveRelation() are in."
<niemeyer> TheMue: Ok, but please note that down as the next task after fixing the topology details
<hazmat> new 'juju-dev' mailing list?
<TheMue> niemeyer: Hmm, merged, pushed, proposed, everything is there. But also still those files not part of the branch. It's only that they are with one delta in rietveld while the changed files have two.
<niemeyer> TheMue: You have to push the pre-req after merging trunk in it, as you did last time
<TheMue> niemeyer: I've done.
<niemeyer> TheMue: The diff is the diff that bazaar generates
<niemeyer> TheMue: So whatever is being pushed is the actual diff between the two branches
<TheMue> niemeyer: I have to dig deeper.
<niemeyer> TheMue: That's all that there is to it really
<niemeyer> TheMue: If you merge trunk on the pre-req, push it to Launchpad, merge trunk on the follow up, repropose, it should work
<TheMue> niemeyer: Why the quotes around the sentence above? "Today this …"
<niemeyer> TheMue: Because it's your sentence
<niemeyer> hazmat: Yeah, William sent a note to juju@
<niemeyer> TheMue: The task should not be lost.. a few times we've said something was going to be done as a follow up and it ended up going missing
<niemeyer> TheMue: So please write down to keep track this should be in the queue after the current branch and the topology changes
<niemeyer> We really need an organized bug tracker that we can look at
<niemeyer> Hmm.. maybe we should create a juju-core project in Launchpad
<TheMue> niemeyer: Yep, missing that. Worked with it since the 90s and introduced it in several projects. It's indeed helping.
<niemeyer> TheMue: LOL.. yeah :)
<niemeyer> TheMue: We have a bug tracker.. it's just a mess because of the mix up of juju py
<TheMue> niemeyer: So time for juju go?
<niemeyer> TheMue: Hm?
<niemeyer> TheMue: So, do you know the task I'm talking about above?
<TheMue> niemeyer Not yet, will look into the logs. Right now I'm fighting with bzr. It says in the prerequisite as well as in the current branch that there's nothing to merge from the trunk and in the prerequisite that there's nothing to push.
<TheMue> niemeyer: Drives me mad.
<niemeyer> TheMue: Ok.. I'll file a bug and assign to you.
<TheMue> niemeyer: Thx
<niemeyer> TheMue: Have you merged trunk onto the pre-requisite?
<TheMue> niemeyer: Yes.
<TheMue> niemeyer: Just tried again, nothing to do. *sigh*
<TheMue> niemeyer: Same in my branch.
<TheMue> niemeyer: And bzr push says that there is nothing to push.
<niemeyer> TheMue: Merge the pre-req onto your branch now
<TheMue> niemeyer: Oh, something new, a warning about criss-cross, but everything merged successful.
<niemeyer> TheMue: Yep.. that's the problem
<niemeyer> TheMue: Try pushing the new content
<niemeyer> and proposing it
<niemeyer> I hope that works
<TheMue> niemeyer: Ah, yeah, better now. Thank you.
<niemeyer> TheMue: Woohay!
<TheMue> niemeyer: Worked too long w/o branching. But by now I've also started to branch my private software. ;)
<niemeyer> TheMue: I see you're renaming from zkRelation to topoRelation.. are tests broken at the moment without that?
<TheMue> niemeyer: Without renaming it has been broken.
<TheMue> niemeyer: Yes.
<niemeyer> TheMue: Ok :(
<niemeyer> TheMue: Trunk is being broken way too often :(
<niemeyer> 	 382                 t.RemoveRelation(relation.key)
<niemeyer>  383                 return nil
<niemeyer> TheMue: Is RemoveRelation returning no error?
<TheMue> niemeyer: No, it's only removing inside a map. Today it's also not checked at that point if it exists in that dir.
<niemeyer> TheMue: OK
<niemeyer> / Relation returns the relation with key from the topology.
<niemeyer> func (t *topology) Relation(key string) (*topoRelation, error) {
<niemeyer>         if t.topology.Relations == nil || t.topology.Relations[key] == nil {
<niemeyer>                 return nil, fmt.Errorf("relation %q does not exist", key)
<niemeyer> TheMue: We don't want the user to be getting this error message
<TheMue> niemeyer: Oh, please wait. Just seen an assert_relation there. Will add it too.
<niemeyer> TheMue: The relation key is completely uninteresting as far as a user is concerned
<niemeyer> TheMue: The state method should verify the state in those cases, and error early with a sensible message
<TheMue> niemeyer: OK
<niemeyer> TheMue: If we get an error from Relation we should display it prefixed with something saying that an unexpected action happened while doing whatever
<niemeyer> TheMue: In which case the relation key might go out, but we have no idea about why it was failing since we've checked it early
<niemeyer> TheMue: So it's debugging rather than something we'd like the user to be commonly looking at
<TheMue> niemeyer: IC. Do you note it in the review?
<niemeyer> TheMue: I'll copy & paste what I just said
<TheMue> niemeyer: I have to break for lunch, wife is calling.
<TheMue> niemeyer: Yes, thx.
<niemeyer> TheMue: Sure, I'll see you on Monday, have a good lunch
<niemeyer> fwereade: Have you added andrewmelina?
<niemeyer> fwereade: To the list?
<fwereade> niemeyer, sorry, no I didn't
<niemeyer> fwereade: no problem, I thought we might have missed it
<fwereade> niemeyer, is that the same person as andrewsmedina at gmail?
<fwereade> niemeyer, can do that now
<niemeyer> fwereade: Yeah, I suspect so
<niemeyer> fwereade: Thanks!
<niemeyer> TheMue: https://bugs.launchpad.net/juju-core/+bug/1007373
<niemeyer> TheMue: https://codereview.appspot.com/6223055/ is still dirty
<TheMue> niemeyer: So, 6223055 is cleaned up.
<niemeyer> TheMue: Thanks
<TheMue> niemeyer: Same problem with its prerequisite again.
<niemeyer> TheMue: Yeah, I do apologize for the problem.. I should have fixed it already
<TheMue> niemeyer: Gna, it's also my inexpertness with branching. So don't mind. The more often I have this problem the better I learn how to avoid it. ;)
<niemeyer> TheMue: It's not actually your fault.. the pre-req support in lbox should handle that case correctly without you having to worry about it
<TheMue> niemeyer: So I'm happily awaiting the next release.
<niemeyer> TheMue: Review delivered, and you have a LGTM with a few trivial comments
<TheMue> niemeyer: Just reading them.
<TheMue> niemeyer: Thanks.
<niemeyer> fwereade: You got a LGTM too
<fwereade> niemeyer, sweet tyvm
<niemeyer> fwereade: it's on the smaller branch, though :(
 * niemeyer <= cheater
 * niemeyer <= hungry too
<niemeyer> We'll step out for lunch
<fwereade> niemeyer, no worries :)
<fwereade> LGTMs invariably LGTM ;p
<niemeyer> Hehe :)
<niemeyer> SGTM! ;)
<fwereade> rog, is DefaultSeries making its way somewhere that's accessible via a juju.Conn?
<fwereade> bah, that was a *we*'ll step out for lunch, wasn't it?
<TheMue> fwereade: Looks like.
<niemeyer> fwereade: You've got another LGTM.. hopefully it's sensible
<niemeyer> fwereade: (the comments, I mean)
<niemeyer> fwereade: Regarding DefaultSeries, yeah, CurrentSeries is going to be available
<fwereade> niemeyer, cool, thanks
<fwereade> niemeyer, is that going to be directly on Environ?
<fwereade> niemeyer, CurrentSeries^^
<niemeyer> fwereade: No, it's a global based on the current machine IIRC
<niemeyer> fwereade: So should be usable from elsewhere, as long as we have no loops
<fwereade> niemeyer, hmm, ok; I presume it will be an env setting at some stage soon, though?
<niemeyer> fwereade: Ah, yep
<niemeyer> fwereade: That's something else, sorry, I misunderstood
<niemeyer> fwereade: DefaultSeries coming from the env config will certainly be accessible via Conn, since it's stored in the State
<fwereade> niemeyer, I have a feeling that rog and dave are both circling that general area so I'm rather reluctant to start implementing that sort of stuff
<niemeyer> fwereade: I don't *think* either of them are looking at it now
<fwereade> niemeyer, hm, ok, I though I saw a CL from dave that was State.Environment
<fwereade> niemeyer, just a ConfigNode for now but clearly a start
<niemeyer> fwereade: Uh.. maybe I missed it
 * niemeyer looks at review queue
<niemeyer> add-environment.. suspect.. looking
<fwereade> niemeyer, https://codereview.appspot.com/6261055/
<niemeyer> fwereade: Yep, looks good
<niemeyer> fwereade: Hmmm
<niemeyer> fwereade: I'm a bit sad that we'll be poking at that configuration manually
<niemeyer> fwereade: For well-known keys
<fwereade> niemeyer, I was a bit uncertain about it; trying to remember the discussion we had at UDS
<niemeyer> fwereade: Wondering if DefaultSeries + SetDefaultSeries are reasonable
<niemeyer> fwereade: and whether they should be adding to that same setting, or living outside
<niemeyer> fwereade: These settings will be common for all providers, I suspect
<fwereade> niemeyer, I'm hoping for a separate type for common env settings that we can manipulate nicely
<niemeyer> fwereade: I'd like that too
<niemeyer> fwereade: We don't even need a separate type, I suppose.. state.DefaultSeries() might work
<fwereade> niemeyer, hmm... DefaultSeries and Constraints are both common env settings, and PlacementPolicy may end up becoming one too... with 3 candidates for that sort of thing I'm reluctant to tack it directly onto State
<niemeyer> fwereade: Why? What do we get by injecting them in another type?
<rog> TheMue, fwereade, niemeyer: here's how it looks when we store units inside services rather than as a global thing. https://codereview.appspot.com/6247066
<fwereade> niemeyer, it's just that State feels big enough already :)
<niemeyer> fwereade: It does, because indeed it has a lot of stuff.. but I'm not sure that having to do state.EnvironSettings().SetDefaultSeries(...) is any better
<fwereade> niemeyer, yeah, maybe so
<rog> fwereade: can DefaultSeries be modelled as a constraint?
<niemeyer> fwereade: Maybe the fact we can change multiple things at once and then flush might justify having a separate type
<fwereade> rog, IIRC I couldn't quite find a clean way to do so
<fwereade> rog, but it's clearly that species of kidney from a certain perspective
<rog> fwereade: do you remember why, by any chance?
<niemeyer> rog: Ah, no, it's actually not part of the constraint, and there's a good reason
<niemeyer> rog: It's part of the charm URL
<fwereade> rog, I think it's that it's the only thing that's a charm constraint as well as a machine constraint
<fwereade> rog, so it ends up feeding into the constraints
<fwereade> rog, but it isn't really one itself
<rog> fwereade, niemeyer: couldn't we just take it from the constraints anyway? if we have no constraints, then we'll need to decide somehow and then that will have to be an instance constraint, presumably
<fwereade> rog, IMO that is likely to end up a bit convoluted
<fwereade> rog, we've already decided, based on the charm, before other constraints really come into play
<rog> fwereade: so does it ever make sense to have a "series" constraint that doesn't match DefaultSeries?
<niemeyer> rog: No, can't come from the constraints, because the charm has a URL
<niemeyer> rog: That's where the decision comes from
<fwereade> rog, juju deploy oneiric/wordpress
<rog> fwereade: so maybe one way of looking at it is that the *charm* implies some constraints.
<fwereade> rog, absolutely so
<fwereade> rog, ubuntu-series is a computed constraint that's never exposed in the UI but is used in picking image
<niemeyer> fwereade: I'm tempted to suggest we just state.SetDefaultSeries => state.EnvironConfig().Set("default-series", ...)
<niemeyer> fwereade: Or maybe not even that, actually.. it's not clear we'll ever set this value through the API outside of environment changes
<niemeyer> fwereade: Given that the client API is actually "juju set-env default-series=oneiric"
<niemeyer> fwereade: Right?
<niemeyer> fwereade: Or did you have something else in mind?
<fwereade> niemeyer, yeah, I think so
<niemeyer> fwereade: The reading aspect seems to make sense, though
<niemeyer> fwereade: I mean
<fwereade> niemeyer, all I'm really saying is that I'll want to know how to read it at some point soon :)
<niemeyer> fwereade: Having state.DefaultSeries() => EnvironConfig().Get("default-series")
<niemeyer> fwereade: jinx
<niemeyer> ;)
<fwereade> niemeyer, :p
<rog> sounds good to me
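The accessor shape niemeyer sketches (state.DefaultSeries() delegating to EnvironConfig().Get("default-series")) could look roughly like this. Everything here is a stand-in: the `configNode` type, its `Get` signature, and the `State` struct are illustrative assumptions, not juju's actual state API:

```go
package main

import "fmt"

// configNode stands in for the shared environment settings node;
// the method names mirror the discussion but are hypothetical.
type configNode struct {
	settings map[string]string
}

// Get returns a setting and whether it was present.
func (c *configNode) Get(key string) (string, bool) {
	v, ok := c.settings[key]
	return v, ok
}

// State is a minimal stand-in for juju's state connection.
type State struct {
	environ *configNode
}

// EnvironConfig returns the environment settings node.
func (s *State) EnvironConfig() *configNode { return s.environ }

// DefaultSeries is the thin accessor being discussed:
// it just reads the well-known key from the environment config.
func (s *State) DefaultSeries() string {
	v, _ := s.EnvironConfig().Get("default-series")
	return v
}

func main() {
	st := &State{environ: &configNode{settings: map[string]string{"default-series": "oneiric"}}}
	fmt.Println(st.DefaultSeries()) // prints oneiric
}
```

Writing stays with the generic path ("juju set-env default-series=oneiric" flowing through the config), so only the read side gets the convenience method.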
<fwereade> niemeyer, btw, ty for Resolve review; ok to submit if I err out on empty local path (and make the other changes ofc ;))?
<fwereade> rog, btw, the unit path change looks really nice to me
<fwereade> rog, is this something we've agreed to go with or just something you're throwing out there for discussion?
<fwereade> rog, if the former I'll give it a more detailed look :)
<niemeyer> fwereade: Yeah, that's implied by the LGTM
<niemeyer> fwereade: If you think something might be done differently, or disagree with one of the points, a new review makes sense, otherwise a change+submit is fine on LGTM
<fwereade> niemeyer, cool, cheers, wasn't quite sure how to interpret the "should we" ;)
<rog> fwereade: just been discussing with gustavo
<niemeyer> Very warmly.. I feel sorry for Aram :)
<rog> fwereade: couple of changes coming up in a mo
<niemeyer> We agree, though, as usual
<niemeyer> OKAY!
<niemeyer> Time to head home
<niemeyer> A good weekend to all, and wish me luck on hiding the big aluminum cigar
<Beret> heh
<niemeyer> s/hiding/riding
 * Beret won't ask
<niemeyer> Beret: airplane..
 * robbiew adds mramm to his buddy list ;)
<TheMue> mramm: Hiya
<mramm> Hy
<mramm> how's everything going?
<TheMue> mramm: Fine, thank you. Welcome on board.
<mramm> Thanks
<robbiew> we are now just waiting for him to get access to the inner sanctum...aka irc.canonical.com & wiki.canonical.com :P
<Beret> from there, his soul is forever tainted
<mramm> very interested in looking around the inner sanctum
<robbiew> lol
<mramm> sure, but there's interesting stuff
<Beret> thus the tainting
<mramm> to look at while my soul is being eaten away
<robbiew> it's where we hide the oompa loompas
<mramm> ahh
<mramm> I wondered
<fwereade> robbiew, ping
<robbiew> fwereade: pong
<robbiew> fwereade: one sec ;)
<robbiew> fwereade: invite sent
<rog> Beret: lol
<fwereade> happy weekends all!
<rog> fwereade: and you!
<TheMue> fwereade, rog: Happy weekend too.
<rog> TheMue: and you. i'm away monday and tues
<rog> TheMue: so see you wed!
<rog> trying hard to lbox propose from the train through my phone...
<rog> it's not liking it much!
<TheMue> rog: Greet the queen.
<rog> TheMue: of course. i always pop in for a quick cuppa if i'm in town
<TheMue> rog: *lol*
<rog> TheMue: yay! https://codereview.appspot.com/6247066
<TheMue> rog: Uh, there are flies on the front page. Your train is too fast.
<rog> TheMue: lol
<rog> TheMue: that CL makes the unit key look like unit-00000-00000, where the first number is the service key number and the second is the id within the service
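The zero-padded key scheme rog describes could be sketched like this; the field widths are assumed from the example in the chat, not taken from the actual CL:

```go
package main

import "fmt"

// unitKey formats a unit key in the zero-padded form from the CL
// discussion, e.g. "unit-00000-00000". The five-digit widths are an
// assumption based on the example, not juju-core's real code.
func unitKey(serviceKey, unitID int) string {
	return fmt.Sprintf("unit-%05d-%05d", serviceKey, unitID)
}

func main() {
	fmt.Println(unitKey(3, 12)) // unit-00003-00012
}
```

Zero-padding keeps keys a fixed width, so lexical ordering matches numeric ordering within a service.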
<Aram> hello.
 * Aram is back in austria.
<andrewsmedina> Aram: hi
#juju-dev 2012-06-02
<Aram> moin.
#juju-dev 2012-06-03
<cprofitt> hey all
#juju-dev 2013-05-27
<thumper> wallyworld_: hangout now?
<wallyworld_> ok, just let me get my headphones
<wallyworld_> thumper: https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<thumper> wallyworld_: looking at your create machine branch
<thumper> wallyworld_:  surprised at the amount of change
<thumper> wallyworld_: wondering why you didn't just add a function: func (st *State) AddMachineWithConstraints(series string, extraCons *constraints.Value, jobs ...MachineJob) (m *Machine, err error) {
<thumper> wallyworld_: then there wouldn't be so much change
<thumper> wallyworld_: also, putting constraints before the other params seems wrong too...
<wallyworld_> thumper: i thought creating machines with constraints would become the norm
<thumper> wallyworld_: not in our tests
<thumper> it is just noise now
<wallyworld_> you also dislike the constraints param before the jobs param?
<wallyworld_> it can't go after
<thumper> no, before instance and nonce
<thumper> strange as it may seem...
<thumper> to me it is like pushing in
<wallyworld_> i think it (the constraints param)  aligns more closely with the series param
<wallyworld_> it is pushing in, but that's deliberate
<wallyworld_> series and constraints sort of go together to me
<thumper> perhaps...
<thumper> but I think we should have an extra method for adding constraints
<wallyworld_> i can look at adding the extra method to reduce the noise
<thumper> that means all the tests still look good
<wallyworld_> yep, ok
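The wrapper approach thumper proposes keeps existing call sites untouched. A minimal sketch, using simplified stand-ins for the real state and constraints types (`Value`, `MachineJob`, and the field set here are illustrative, not juju-core's actual definitions):

```go
package main

import "fmt"

// Value is a stand-in for juju-core's constraints.Value.
type Value struct {
	Mem *uint64
}

// MachineJob is a stand-in for state.MachineJob.
type MachineJob string

type Machine struct {
	Series string
	Cons   *Value
	Jobs   []MachineJob
}

type State struct{}

// AddMachineWithConstraints is the extra method suggested above:
// it carries the new constraints parameter.
func (st *State) AddMachineWithConstraints(series string, extraCons *Value, jobs ...MachineJob) (*Machine, error) {
	return &Machine{Series: series, Cons: extraCons, Jobs: jobs}, nil
}

// AddMachine keeps its original signature and delegates, so the many
// existing tests that don't care about constraints stay unchanged.
func (st *State) AddMachine(series string, jobs ...MachineJob) (*Machine, error) {
	return st.AddMachineWithConstraints(series, nil, jobs...)
}

func main() {
	st := &State{}
	m, _ := st.AddMachine("precise", "JobHostUnits")
	fmt.Println(m.Series, len(m.Jobs)) // precise 1
}
```

The delegation keeps the common path noise-free while still letting the provisioner pass constraints explicitly.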
<thumper> I'll keep reviewing
<thumper> also...
<thumper> soy flat white tastes funny
<wallyworld_> *soy* why???????
<thumper> ran out of milk
<wallyworld_> you ran out of milk
<thumper> and don't like black coffee
<wallyworld_> i would have had a pure espresso
<thumper> nah
<wallyworld_> good coffee, tastes quite nice
<wallyworld_> don't you have a shop just up the road?
<thumper> sick kids at home
<thumper> can't really leave them alone
<thumper> and don't want to take them with me
<wallyworld_> oh yeah, forgot that
<wallyworld_> thumper: thanks for the reviews, i think i've addressed the issues
<thumper> ok, will look shortly
<wallyworld_> no hurry
<dimitern> morning everybody
<fwereade> morning all
<dimitern> fwereade: morning
<fwereade> dimitern, heyhey
<dimitern> fwereade: i'll start implementing the necessary stuff to allow agents to connect to the API server first thing today
<fwereade> dimitern, great, I'm still catching up on email etc myself
<thumper> fwereade: you going?
<fwereade> thumper, oh, hey, you're still around? sorry, I didn't check, I thought it'd be unreasonably late for you
<kyhwana> fwereade: it's only 20:00
<thumper> fwereade: no, it is about 0800UTC which is the time I'm back
<thumper> I said that in the email :)
<thumper> I came back especially to talk to you and TheMue
<thumper> fwereade: so time for the "ah ma gahd" hangout?
<fwereade> thumper, cool, thanks, I forgot UTC was 2 hours
<thumper> :)
<fwereade> thumper, TheMue, I am at your service
<thumper> fwereade: I'll chat first, and will grab TheMue after...
 * TheMue is listening
<dimitern> reviewers? https://codereview.appspot.com/9781044/ - a small step towards having API connection in agents
<dimitern> fwereade, TheMue: ^^
<thumper> TheMue: ping?
<dimitern> yo thumper!
<thumper> dimitern: hey there
<dimitern> thumper: if you have 5m take a look at that CL? ^^
<thumper> sure, while I'm waiting :)
<thumper> dimitern: you have one :)
<dimitern> thumper: cheers!
<TheMue> thumper: pong
<thumper> TheMue: hey, care to join the hangout?
<dimitern> fwereade: ping
<fwereade> dimitern, sorry, meeting
<dimitern> fwereade: np, ping me when you have 5m pls
<fwereade> dimitern, hey dude
<dimitern> fwereade: hey, can you take a look at https://codereview.appspot.com/9781044/ please?
 * fwereade looks
<fwereade> dimitern, can we g+ a moment? need to read surrounding code and think out loud at you a bit
<dimitern> fwereade: sure, i'll start one
<fwereade> dimitern, cheers
<dimitern> fwereade: https://plus.google.com/hangouts/_/31a0adf406ea37411478622bf93d2c364764ab93?authuser=0&hl=en
<danilos> dimitern, hey, joining us?
<dimitern> danilos: oops be right there
<dimitern> fwereade: https://docs.google.com/a/canonical.com/document/d/1qNSzFUh_r_fnceAUDsIT4SPVJhCH4G8V-YfXpv_GQGA/edit?usp=sharing
<fwereade> dimitern, cheers
<dimitern> fwereade: actually it looks like you cannot access ping/pong frames directly - they're handled internally
<dimitern> fwereade: and the gui guys actually wait for a connection to be dropped and then reconnect
<dimitern> fwereade: also, looking at the apiserver implementation, when a connection is dropped the server dies, so that's a possibility to detect
<dimitern> fwereade: ping
<fwereade> dimitern, heyhey, sorry, meeting again -- g+?
<dimitern> fwereade: yeah, just a sec
<dimitern> fwereade: https://plus.google.com/hangouts/_/4cdea2d176b825a710dfe5865a1b69ca5bd61944?authuser=0&hl=en
<dimitern> fwereade: you just froze and i tried to rejoin
<dimitern> fwereade: https://codereview.appspot.com/9811044/
<fwereade> dimitern, I'm not sure we should be exposing SetDeadline directly... and I'm becoming unsure whether it's actually what we need, I think my brain was still parsing it as SetTimeout
<fwereade> dimitern, it doesn't strike me as very goroutine-safe
<fwereade> dimitern, in the service of quick progress I think I'd be fine with just a trivial Ping() method
<fwereade> dimitern, if we want to get clever with deadlines I think we need to do it around where we're actually reading/writing on the conn
<fwereade> dimitern, sensible?
<fwereade> dimitern, and remember, we need to have this turning into a channel close at some point so we can select on it
<fwereade> dimitern, ideally all we'd be exposing on State is a method returning that channel
<fwereade> dimitern, sorry, gtg, bank
<dimitern> fwereade: sgtm, I'll leave Ping() only then
<fwereade> dimitern, consider keeping it unexposed on the client, though, I can't think of much reason to use it
<fwereade> dimitern, I'm thinking of something like select { <-apist.ConnDead(): ... }
<dimitern> fwereade: we can't - how would we call it if it's not exposed on the client?
<fwereade> dimitern, can't the client keep track internally of the connection's state?
<dimitern> fwereade: are we talking about Ping here?
<fwereade> dimitern, that's what I'm looking for
<fwereade> dimitern, Ping is just a means towards that end
<dimitern> fwereade: ok, i'm confused now
<fwereade> dimitern, so the server needs it
<fwereade> dimitern, but there's little cause for a client of api.State to ever call it
<fwereade> dimitern, because a client of api.State cares about whether the conn has gone down
<dimitern> fwereade: what then?
<fwereade> dimitern, and api.State should be exposing that information
<dimitern> fwereade: it's intended to be called by the task monitoring the heartbeat
<fwereade> dimitern, I wasn't expecting that to be a task as such... and, sorry, I really must go, I'll ping you when I'm free again
<dimitern> fwereade: so you're suggesting to implement a goroutine in the client to periodically ping the server and have a Closed() <-chan error available outside?
<dimitern> fwereade: ok, ping me when you're back
<mramm> just checking for a few minutes here on my day off -- anobody need anything?
<dimitern> mramm: everything is fine :)
<mramm> dimitern:  I figured it would be.   You guys are all great.
<dimitern> :)
<dimitern> i'm off - can't think anymore, need to rest
<ahasenack> hm, since default-image-id is now deprecated:
<ahasenack> 2013/05/27 16:49:25 WARNING config attribute "default-image-id" (4dade8d2-2b95-4e4c-b947-c5e3ff4a31ea) is deprecated and ignored, use simplestreams metadata instead
<ahasenack> and gojuju by itself seems unable to find a suitable image id:
<ahasenack> 2013/05/27 16:49:37 ERROR command failed: cannot start bootstrap instance: no "precise" images in lcy02 with arches [amd64]
<ahasenack> how do I bootstrap? In this case, it's openstack
<ahasenack> (canonistack to be more precise)
<fwereade> ahasenack, `juju image-metadata` will generate simplestreams data for a single image as specified, and you can then copy those to your public-bucket
<fwereade> ahasenack, I *think* we are expecting simplestreams data to be published for lcy02 before too long, but I'm not really up to speed on that, was on holiday last week
<ahasenack> fwereade: ok, got something to play with now
<ahasenack> $ juju image-metadata -a amd64 -i 4dade8d2-2b95-4e4c-b947-c5e3ff4a31ea -r lcy02 -s precise -e https://keystone.canonistack.canonical.com:443/v2.0/
<ahasenack> Boilerplate image metadata files "index.json, imagemetadata.json" have been written to /home/andreas/.juju.
<ahasenack> Copy the files to the path "streams/v1" in your cloud's public bucket.
<ahasenack> thanks
<fwereade> ahasenack, great
<ahasenack> fwereade: it was so simple when all I needed was default-image-id :D
<fwereade> ahasenack, it's a bit annoyingly manual at the moment, but it's better than having to assume that the default image id always has the exact required characteristics (regardless of series, arch, etc)
<fwereade> ahasenack, default-image-id just enables too many fail cases -- if the tools don't match the image, nothing will work, and the reasons will be desperately unclear... although we kinda dropped the ball on the error message there, the resolution is far from clear. bah.
<fwereade> ahasenack, would it have helped if the error message just ended with "(run `juju image-metadata`)"?
<wallyworld_> fwereade: another problem which we can discuss tonight (my time) - stale data. there are issues with the tests which mask stale data issues in the domain entities. our code is broken, need to think about how to fix
<wallyworld_> right now, some tests pass by fluke. adding code to ensure dirty attribute is properly refreshed causes other tests which rely on the stale data to fail
#juju-dev 2013-05-28
<jam> greetings all
<jam> fwereade: when you come online, I'm available for a chat
<jam> wallyworld_: hey, just the guy I wanted to say hi to
<jam> wallyworld_: ?
<dimitern> morning all
<jam> hi dimitern
<dimitern> jam: hi, feeling better?
<jam> dimitern: yeah, fever is down, though still not 100%, but probably 80 or so :)
<dimitern> jam: good to hear it's improving
<jam> dimitern: yeah, my temperature spiked to almost 40, with an accompanying whole-body ache
<jam> and headache that was pretty bad
<jam> but only lasted the afternoon and a bit in the morning.
<dimitern> fwereade: ping
<fwereade> dimitern, jam, hi both
<fwereade> TheMue, hi also
<dimitern> fwereade: hey, take a look at this? https://codereview.appspot.com/9811044/
<dimitern> fwereade: (should've changed the description, but will do before I submit)
<fwereade> dimitern, reviewed, looks sane, enough little quibbles that it's not quite there yet
<TheMue> fwereade, dimitern, jam: morning
<dimitern> fwereade: updated
<fwereade> dimitern, LGTM
<dimitern> fwereade: cheers!
<dimitern> TheMue: morning btw
<dimitern> anyone else wants a small, easy review? https://codereview.appspot.com/9811044/
<dimitern> jam, TheMue: ^^
<fwereade> bbiab
<wallyworld_> jam: hi, sorry i missed you, school pickup
<wallyworld_> jam: feeling better?
<dimitern> wallyworld_: hey
<wallyworld_> G'DAY
<jam> wallyworld_: about 80%, the fever, headache, and whole body ache are gone, but still a little upset stomach.
<wallyworld_> oops caps, sorry
<wallyworld_> at least it's recovering
<dimitern> wallyworld_: quick review? https://codereview.appspot.com/9811044/
<wallyworld_> sure
<TheMue> dimitern: review done
<jam> wallyworld_: yeah, I can at least think and work, I just have to stay near a bathroom :)
<dimitern> TheMue: tyvm
<dimitern> fwereade: so now on to the 3 loops
 * TheMue just returned to the keyboard after hunting the escaped rabbits of our daughter ;) those guys found out how they can open the door. don't tell me anything about dumb rabbits anymore. :D
<fwereade> jam, is mgz around today?
<jam> fwereade: he is off the whole week, and part of next
<jam> I think he is back next Tues
<rvba> Hi guys, when we were testing juju-core with MAAS we were using: "juju bootstrap --upload-tools --fake-series=precise,raring,quantal".
<rvba> I saw somewhere that --upload-tools is a dev tool only there when building juju-core from source.
<rvba> What should I be using instead now? (using juju-core from raring)
<jam> rvba: if I understand you correctly, you want to have the tools available on your MaaS cluster for multiple series
<rvba> Yes
<rvba> The MAAS cluster is raring and I want to deploy precise nodes.
<jam> we have a (fairly crufty but serviceable) command "juju sync-tools". Which pulls the tools from the public ec2 bucket, and puts them into your local environment.
<jam> you have to source EC2 credentials so it can read the public bucket.
<jam> (which is part of the 'crufty' nature of the command)
<rvba> jam: thanks, I'll try that…
<jam> rvba: feel free to poke here for feedback on getting it working. It was designed for some internal use cases (sharing with HP and Canonistack), so it still needs some love for widespread adoption.
 * fwereade lunch
<jam> fwereade: ping
<fwereade> jam, pong
<jam> fwereade: you wanted to chat with me about some API changes. I was wondering if you wanted to have a hangout with me and ian
<fwereade> jam, how's it going?
<fwereade> sure, give me 2 mins
<fwereade> jam, you in a hangout already?
<jam> fwereade: not yet, but I can set one up
<fwereade> jam, thanks
<jam> well... I used to know where the start a hangout link was, give me a sec
<jam> fwereade: wallyworld_: https://plus.google.com/hangouts/_/6c0cd174f8cbe910b0ceaa885a6c216331e77f07
<jcastro> mramm: heya, whens the next release planned?
<mramm> 08:18 marcoceppi: mramm: Any idea when the next release of juju-core will be landing?
<mramm> 08:19 mramm: davecheney was working on it this weekend
<mramm> 08:19 mramm: I expect it out any day now, last I heard it was delayed by some build contention on the build servers
<jcastro> ah
<mramm> seems like this must have come up in a meeting ;)
<jcastro> no I just wanna do cool stuff
<jcastro> and need a release
<jcastro> :)
<mramm> cool
<jcastro> if it's builder contention I can help him
<mramm> I will touch base with dave tonight
<mramm> ok
<jcastro> but I take it it'll be a 24 hour turn around?
<marcoceppi> I just really want to be able to use the majority of charms that have relation-list -r :)
<jcastro> wait
<jcastro> is that in the PPA?
<mramm> I'm not sure if that made it in before the last PPA release
<jcastro> 4:20 and 5:50 delays in the PPAs, distro is smaller.
<mramm> interesting
<jcastro> mramm: I'm going to ask the team if we can get a bump in priority
<mramm> ok
<jcastro> if a builder takes 6 hours to build it means cheney gets one shot a day to get it working
<jcastro> if not, we wait 24 more hours.
<mramm> and I'll talk to dave about the release tonight (when he is in)
 * jcastro nods
<jcastro> I'll send him a mail wrt. builders, see if there's anything I can do
<marcoceppi> When downloading the mongo binary, where should I put it?
<marcoceppi> Also, in juju-core's readme it refers to running `go install -v launchpad.net/juju-core/...` which doesn't actually work
<marcoceppi> "launchpad.net/juju-core/..." matched no packages
<ahasenack> jcastro: doesn't matter if the fix is in the ppa, it needs to be in the tools tarballs too, unless you plan to use --upload-tools
<marcoceppi> fwereade: got a second to help me resolve building from source?
<ahasenack> marcoceppi: I can help, I did it a few times
<fwereade> marcoceppi, sorry, meeting
<jcastro> ahasenack: ah bummer. :-/
<marcoceppi> ahasenack: cool, I've followed the readme up until go install -v launchpad.net/juju-core/... which just gives me a matched no packages error
<ahasenack> jcastro: https://wiki.canonical.com/InformationInfrastructure/IS/CanonicalOpenstack/CanonistackWithJujuCore and I have a g+ post about it too
<ahasenack> marcoceppi: did you do "go get -v launchpad.net/juju-core/...", with the "..." at the end?
<marcoceppi> ahasenack: yup, let me try again
<marcoceppi> ahasenack: http://paste.ubuntu.com/5710325/
<ahasenack> marcoceppi: start with go get
<marcoceppi> ahasenack: fudge, I copied the wrong command
<marcoceppi> thanks
<ahasenack> marcoceppi: you will also need to bootstrap with --upload-tools later, or else it will download the old version of the tools into your units, and not the trunk one you have now
<marcoceppi> ahasenack: ack, I'll hopefully remember to do that :)
<ahasenack> marcoceppi: https://plus.google.com/114091308548248656535/posts/cLgeVv7xBcp hopefully helps
<ahasenack> marcoceppi: with canonistack and other private clouds, there is a new complicator, I'm working on a post about that
<ahasenack> it doesn't support default-image-id anymore, so there are a few other commands to run
<marcoceppi> ahasenack: so the readme mentions having to grab mongodb from source, etc. Do I just ignore that?
<ahasenack> marcoceppi: yes, I didn't have to do that
<dpb1> go get probably takes care of all that?
<marcoceppi> ahasenack: cool, thanks
<ahasenack> dpb1: I don't know
<ahasenack> dpb1: maybe mongodb doesn't need to be updated that frequently and whatever juju is fetching is good enough
<dpb1> ya
<ahasenack> is there a way to use juju image-metadata to specify default images for multiple regions?
<marcoceppi> ahasenack: is what you were talking about with "private" clouds and default-image-id related to this: error: cannot start bootstrap instance: no "precise" images in az-3.region-a.geo-1 with arches [amd64]
<ahasenack> marcoceppi: hm, *maybe*, which cloud is that? hp?
<marcoceppi> ahasenack: yeah, hp
<ahasenack> marcoceppi: do you have a public-bucket-url for that hp cloud in your environments.yaml file?
<marcoceppi> ahasenack: yes, https://region-a.geo-1.objects.hpcloudsvc.com/v1/60502529753910
<ahasenack> marcoceppi: also, did you bootstrap with -v? Was there any mention of an url with juju-dist?
<ahasenack> marcoceppi: ok, take a look:
<ahasenack> marcoceppi: https://region-a.geo-1.objects.hpcloudsvc.com/v1/60502529753910/juju-dist/
<ahasenack> (I'm debugging with you now, I don't know yet what the problem is)
<ahasenack> marcoceppi: https://region-a.geo-1.objects.hpcloudsvc.com/v1/60502529753910/juju-dist/streams/v1/imagemetadata.json and
<marcoceppi> ahasenack: (I'm all for debugging thanks for taking the time)
<ahasenack> marcoceppi: https://region-a.geo-1.objects.hpcloudsvc.com/v1/60502529753910/juju-dist/streams/v1/index.json
<ahasenack> are of interest to us
<ahasenack> that uses the new "simplestreams" from smoser to specify the image id juju will use
<marcoceppi> ahasenack: so -v didn't give any mention of a url with juju-dist
<ahasenack> marcoceppi: if I understand it correctly, it has an image id definition of 81078 for region az-1.region-a.geo-1
<marcoceppi> just some warnings about deprecated configs
<ahasenack> marcoceppi: ok, the way I understood it now is that you can only use region az-1.region-a.geo-1
<marcoceppi> Ah, so I'm using az-3, let me try az-1
<ahasenack> marcoceppi: well, for some definition of "only", it's how it's set up. To use another one there are other steps to take
<marcoceppi> ahasenack: path of least resistance is the one for me atm
<ahasenack> marcoceppi: ok, so switch regions and try again
<marcoceppi> ack, activating/switching
<marcoceppi> ahasenack: that resolved that, thanks
<ahasenack> marcoceppi: nice!
<marcoceppi> ahasenack: so, how much harder is it to get this working with other az's?
<ahasenack> marcoceppi: I'm still writing it up, only have a few notes so far. This is what I have for canonistack so far: https://pastebin.canonical.com/91689/
<ahasenack> marcoceppi: you have to run the new image-metadata command, then create a juju-dist bucket and upload the files to it
<ahasenack> and adjust acls so it can be read anonymously
<marcoceppi> ahasenack: doesn't sound too bad
<ahasenack> marcoceppi: I don't know yet how to make it work for multiple regions, as that image-metadata command seems to only create the files for one at a time
<ahasenack> marcoceppi: I think for other regions you have to just edit the files manually
<ahasenack> I mean, to support multiple regions in one single such file
<marcoceppi> ahasenack: I only have need to support one region that isn't supported, so that's out of scope for me anyways
<ahasenack> marcoceppi: hah, how murphy of you, needing the region that is not supported :)
<marcoceppi> hah, yeah. I've really lucked out. Figured the highest number region would be the best to use!
<fwereade> ahasenack, marcoceppi: fwiw, environments are single-region, so that point is hopefully moot
<ahasenack> fwereade: given the location of the index.json and imagemetadata.json files, and their names
<ahasenack> fwereade: I mean, the region is not in the path, or the filename
<ahasenack> fwereade: so it has to be in the content
<ahasenack> fwereade: so you have to be able to specify multiple regions inside these files, no?
<fwereade> ahasenack, if you're sharing a public-bucket across regions, I agree, you'll want to have metadata for all regions in play
<ahasenack> fwereade: in canonistack at least the public bucket is the same, I don't see a region differentiation on it
<ahasenack> fwereade: I don't know about hp
<ahasenack> fwereade: but the file in their public bucket only had one region, and juju failed when marcoceppi tried to bootstrap in another
<fwereade> ahasenack, I believe that we will be publishing simplestreams data to HP anyway
<ahasenack> fwereade: so juju wasn't able to find the default image id for the other region
<fwereade> ahasenack, ha, ok
<ahasenack> fwereade: so, the end goal, just to understand, is that those json files describe the default image id for all regions?
<ahasenack> (amongst other things I suppose, since simplestreams are generic)
<ahasenack> or, well, use different public buckets per environment
<fwereade> ahasenack, the goal is that the simplestreams files describe the images that are available, so that juju can pick a/the right one for what you need
<ahasenack> fwereade: ok, this can be solved then by either adding all of them to the file, or using different files in different buckets
<ahasenack> the difference being one single public-bucket-url for all regions, or one per region
<fwereade> ahasenack, in general we will be looking in cloud-images.ubuntu.com/releases
<fwereade> ahasenack, AIUI this will hold data for a whole bunch of public clouds
<ahasenack> fwereade: that's for public clouds
<ahasenack> ok
<ahasenack> but for my own private cloud, I need to manage those simplestream files
<fwereade> ahasenack, right; and in canonistack I believe we are publishing simplestreams data and advertising it via openstack already for lcy01, just not for lcy02
<fwereade> ahasenack, for your own private cloud, we need tooling, I agree
<ahasenack> fwereade: so canonistack seems to be using something different. I managed to bootstrap on lcy01, but there is no juju-dist/streams/v1/index.json file for example
<fwereade> ahasenack, it's published... somehow, in openstack, that I am trying to remember and look up
<ahasenack> so it's not using simplestreams as far as I understand, but something different
<Makyo> Are gocheck tests run in the order they're defined?
<Makyo> I'm getting "cannot add service \"riak\": service already exists" on the first test, so I'm curious.
<Makyo> nvm -gocheck.vv shows the order.
<TheMue> so, quitting, cu tomorrow
<ahasenack> hm, I wish bootstrap -v showed the buckets it is querying
<Makyo> Hm.  Test fails with  "service \"riak\" not found", so I try deploying it at the beginning of the test, which gives me "cannot add service \"riak\": service already exists".
<FunnyLookinHat> Is there a way to enable verbose request logging with juju-core ?  including HTTP headers, etc.
<FunnyLookinHat> I was hoping to not use wireshark, but that would work if necessary...
<thumper> morning folks
<FunnyLookinHat> hmmm
<FunnyLookinHat> oh hey thumper :D ( David @ S76 )
<thumper> hi FunnyLookinHat
<FunnyLookinHat> thumper, You aware of any means to do verbose logging of all requests made when bootstrapping ?
<FunnyLookinHat> ( or in general )
<thumper> umm...
<thumper> I think -v is about as verbose as it gets (right now)
<thumper> the whole logging, and output stuff is something that I have an interest in fixing
<FunnyLookinHat> Ah ok - trying to see if we can close the gap on rackspace support...  narrowed it down to security groups ( which they are implementing ) and a strange 411 error when trying to authenticate
<fwereade> FunnyLookinHat, --debug is more verbose than -v, but I can't guarantee it'll give you what you're after; logging is somewhat spotty
<FunnyLookinHat> Ah ok - thanks fwereade
<thumper> hi fwereade
<fwereade> thumper, heyhey
<thumper> fwereade: did you want a catch up chat?
<thumper> sorry about yesterday
<thumper> snow day here
<thumper> very chaotic
<fwereade> thumper, sgtm, although probably not a very long one
<fwereade> thumper, no worries
<thumper> sounds fine
<thumper> I have a chat with mramm at 10
<thumper> (in about 35min)
<fwereade> different times I can handle, but different seasons? madness
<thumper> haha
 * thumper writes a todo list
<mramm> Another snow day?
<mramm> ;)
<thumper> mramm: nah, sunny and high of 11°C
<mramm> I can be flexible with that meeting time if it helps
<thumper> well, should be sunny later
<mramm> ahh
<thumper> pretty grey right now
<mramm> got it
<thumper> snow is all melting
<thumper> much to the kids' disappointment
<thumper> still have two at home sick
<fwereade> mramm, you shouldn't need to be too flexible, I'll want to go to bed around then ;)
<thumper> but they aren't the two that fight so much
 * thumper heads to the gym
<thumper> wallyworld_: I'll chat with you when I'm back
<wallyworld_> ok
#juju-dev 2013-05-29
<wallyworld_> thumper: i have to go to the hardware store - bit of a bathroom issue, back soon hopefully
<thumper> wallyworld_: ok.. ping me when you're back
<wallyworld_> thumper: ping
<thumper> wallyworld_: hi
<thumper> wallyworld_: hangout?
<wallyworld_> yep
<wallyworld_> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<kyhwana> hey thumper, do yo know cucumbererror?
<thumper> kyhwana: sorry, no
<kyhwana> ahh ok, he's dunners too, works at some educational institution in IT or helldesk
<thumper> kyhwana: oh, I assumed you were talking about errors from the python library cucumber :)
<thumper> but still, no
<kyhwana> hehe
<kyhwana> ah right, at UO. oh well
<thumper> kyhwana: what is their real name? I may know them but not their irc nick
<kyhwana> Oh, darryl someone
<thumper> kyhwana: doesn't ring any bells
<kyhwana> ah well, you probably know someone that knows him
<thumper> kyhwana: almost certainly
<kyhwana> thumper: Do you drink of the beers?
<thumper> kyhwana: sometimes
<kyhwana> hmm, do you goto Beervana in wellington?
<thumper> no
<thumper> not that keen
<thumper> I know some who do
<kyhwana> ahh ok
<kyhwana> I think that's my only trips to wellington this year, Nethui, beervana and Kiwicon :P
<thumper> kyhwana: I do kiwipycon
<thumper> not going to nethui
<thumper> although Unfocused is
<thumper> wallyworld_: email sent to dev list re: opaque keys and natural business keys
<kyhwana> ahh ok
<thumper> wallyworld_: I mentioned in the email that you feel more strongly about this than I do, so no additional rant needed
<thumper> wallyworld_: but feel free to +1 it with some non-ranty additional info
<TheMue> morning
<fwereade> TheMue, heyhey
<TheMue> fwereade: heya
<TheMue> fwereade: where i just see you, could you tell me more about the background of nonce (e.g. used in StartInstance() )?
<fwereade> TheMue, it's to deal with provisioner failures after starting an instance but before recording that fact in state
<fwereade> TheMue, if that happens, we get a rogue instance starting up, thinking it's running machine (say) 7
<fwereade> TheMue, but state being unaware of that
<fwereade> TheMue, so when we start up a machine agent, the first thing we do is check that the nonce matches the one recorded in state
<fwereade> TheMue, if it doesn't, or there's no nonce in state, the MA does nothing and just sits back and waits for its host instance to be trashed
<TheMue> fwereade: ah, ok, thx, that makes it more clear.
<TheMue> fwereade: my dict translates nonce as sex offender, esp. paedophile. strange.
<fwereade> TheMue, when it's a container being provisioned, this is potentially less of a problem -- we *could* arrange matters so that "instance" ids are predictable based on machine ids -- but I'd prefer to just keep the nonce mechanism so we aren't forced to do that
<fwereade> TheMue, and so that we have the same code path in all cases
<TheMue> fwereade: +1 on that, yes
<fwereade> TheMue, cool
<fwereade> TheMue, (yeah, it's one of those interestingly overloaded terms, I knew that meaning before I was aware of the technical one)
<TheMue> fwereade: i never heard it before
<TheMue> fwereade: but wikipedia knows the correct meaning
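The nonce handshake fwereade describes could be reduced to a single check run by the machine agent at startup. This is an illustrative sketch only — the function name and nonce values are made up, not juju-core's actual code:

```go
package main

import "fmt"

// checkProvisioned sketches the startup handshake: the machine agent
// compares the nonce baked into its instance with the one recorded in
// state. On a mismatch, or when state holds no nonce at all (the
// provisioner died before recording the instance), the agent must do
// nothing and wait for its rogue host instance to be trashed.
func checkProvisioned(stateNonce, agentNonce string) bool {
	return stateNonce != "" && stateNonce == agentNonce
}

func main() {
	fmt.Println(checkProvisioned("machine-7:nonce-abc", "machine-7:nonce-abc")) // true: proceed
	fmt.Println(checkProvisioned("", "machine-7:nonce-abc"))                    // false: rogue instance, wait for teardown
}
```

Keeping the same check for containers, as fwereade notes, means one code path regardless of whether instance ids are predictable.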
<dimitern> fwereade: hey
<fwereade> dimitern, hey dude
<dimitern> fwereade: sorry, yesterday i suddenly felt tired and decided to lie down for a while and it was midnight when I woke up :)
<dimitern> fwereade: saw your msg too late
<fwereade> dimitern, haha, no worries
<dimitern> fwereade: I have a proof-of-concept proposal of the bulk-ops: https://codereview.appspot.com/9847044
<fwereade> dimitern, awesome
<fwereade> dimitern, btw, cath and I were pondering a cuba lunch today, feel like joining us?
<dimitern> fwereade: i was thinking I can just get rid of srvUnit altogether
<dimitern> fwereade: sgtm, although i'll bring my apples :)
<fwereade> dimitern, haha, ok -- what time works for you?
<dimitern> fwereade: either before standup at 13:30 or after 14
<fwereade> dimitern, maybe 12 or 12:15 then?
<dimitern> fwereade: ok
 * dimitern brb
<TheMue> fwereade, dimitern: shall i join too? ;)
 * TheMue orders his jet
<dimitern> TheMue: :) if you manage, by all means
<TheMue> dimitern: don't think so, my wife just took it for a short trip to new york
<dimitern> fwereade: so what do you think of the approach ^^ ?
<fwereade> dimitern, looks ideal
<dimitern> fwereade: haven't figured out a way to test it server-side yet, i'm afraid
<fwereade> dimitern, I was about to say "I think it's time to introduce direct server testing" ;p
<dimitern> fwereade: and also, if we need bulk ops exposed at client-side how do you picture it? having st.AllUnits? or something like that
<fwereade> dimitern, I was vaguely expecting st.Units([]string)
<fwereade> dimitern, I think AllUnits is potentially problematic
<dimitern> fwereade: ah, ok can do that
<dimitern> fwereade: will this be the only entry point?
<fwereade> dimitern, and I would rather not implement either until there's a need -- so, for now, yes, the st.Unit will remain the only entry point in state/api
<dimitern> fwereade: maybe if we need it later, we could have st.UnitsStatus(names) perhaps
<dimitern> fwereade: s/UnitsStatus/Units.Status(names)/
<fwereade> dimitern, I have a small part of my brain wondering whether a type that wraps a list of unit ids will be justifiable/handy at some point
<fwereade> dimitern, but yeah, exactly
<dimitern> fwereade: although if st.Units takes names, then .Status() is redundant, unless you don't want all the other info over the wire
<dimitern> fwereade: in which case, we'll need separate top-level methods - st.UnitsStatus(names)
<fwereade> dimitern, Status doesn't get sent over in the unit data does it?
<dimitern> fwereade: anyway, I'm glad I got you finally :) i'm afraid i was being stupid last couple of days
<dimitern> fwereade: well, no actually - only stuff accessible from the u.doc directly
<fwereade> dimitern, fwiw I don't think you have been
<fwereade> dimitern, the jujud agent stuff is not much fun to play with
<dimitern> fwereade: agreed!
<fwereade> dimitern, and the number of layers involved in the api stuff makes it very hard to communicate precise instructions
<fwereade> dimitern, the blocker was in clearly explaining the precise scope of the problem
<dimitern> fwereade: so to emulate Unit(name).Status() with bulk ops, i'll need to do srvUnits().Status(names) and call that
<fwereade> dimitern, yep, sgtm
<fwereade> dimitern, just one thing to bear in mind
<dimitern> fwereade: ok, so my plan is to get rid of srvUnit and replace it with bulk-only ops, while keeping the client-side the same first, then think how to test it
<fwereade> dimitern, returning a map potentially bumps up against the json key-type restriction; consider the tradeoffs in returning an []params.Unit instead
<dimitern> fwereade: and add api.Units(names) as well
<fwereade> dimitern, I'm not 100% sure it's smarter but my gut leads me that way
<dimitern> fwereade: what are these restrictions, elaborate pls?
<fwereade> dimitern, no int keys
<dimitern> fwereade: well, we'll make them strings throughout :)
<fwereade> dimitern, and there is at least one situation in which we need to retrieve relations by int key
<dimitern> fwereade: or only return a map for string keys proper, so that lookups are fast client-side
<fwereade> dimitern, suggestion: preserve order of input names
<dimitern> fwereade: can do; that'll actually save some bytes over the wire
<fwereade> dimitern, that way we can loop in a defined order if necessary, and otherwise don't have to mind
<fwereade> dimitern, great
<dimitern> fwereade: can you please comment on the CL, so it doesn't look like i'm pulling stuff from my ass there :)
<fwereade> dimitern, ofc, sorry
<dimitern> fwereade: cheers
<fwereade> dimitern, sorry sent it a few mins ago
<dimitern> fwereade: saw it, thanks
<fwereade> dimitern, I've noticed something else in the api, that you had no way of knowing about... we really shouldn't be sending data down from the environ config watcher... we should be consistent with the other watchers and just send notifications
<fwereade> TheMue, ^^
<fwereade> TheMue, since you'll be working on the provisioner, please ensure you fix that and keep the API up to date
<jam> fwereade, dimitern: anyone have insight into the state_test.go TestOpenDoesNotDelayOnHandShakeFailure test? I think it was written by davecheney
<jam> It succeeds for me locally, but fails reliably on go-bot
<dimitern> jam: not really
<jam> It seems to do 3 TLS Handshake failures locally
<jam> but on gobot I get maybe 6 of them
<fwereade> jam, it *looks* like it's intended to test that we fail out immediately on handshake failure
<fwereade> jam, I seem to get 4 locally
<fwereade> (can you tell I've written no code since I got back? :/)
<TheMue> fwereade: yep, will keep in mind
<jam> fwereade: well, it is slightly nondeterministic if you get the failure before the last log message (if I force the test to fail on an otherwise working system)
<jam> I think I see 4 as well
<jam> I'm slightly worried that it is a timing based test and the go-bot is just slow
<fwereade> jam, from local behaviour it STM to be an actual problem
<fwereade> TheMue, dimitern: are you getting that test passing?
<jam> fwereade: as in, it should be only trying 1 time?
<fwereade> jam, that's how I read the test name
<jam> it passes for me locally
<jam> (maybe the issue is that locally it is too fast?)
<fwereade> jam, interesting
<fwereade> jam, could be
<dimitern> fwereade: let me check
<jam> fwereade: it fails for you?
<fwereade> jam, did just now, don't know whether it's consistent
<jam> fwereade: so I believe the code it cares about is in open.go around line 93
<dimitern> fwereade, jam: it passes for me - tried 3 times
<jam> which has a time.Sleep if you have a connection failure
<jam> but no time.sleep if you connect but fail to TLS handshake
<fwereade> jam, ah, but the TLS bit below *does* retry but does not delay
<fwereade> jam, yeah, ok, I suspect I misread intent
<fwereade> jam, sounds timingy then :/
<jam> fwereade: that is what I get from the "DoesNotDelay"
<fwereade> jam, seems to consistently take 1.5s for me
<jam> fwereade: that is what it was taking me as well on go-bot
<jam> so it *sounds* a *lot* like an actual sleep somewhere
<fwereade> jam, hmm, go version related perhaps?
<fwereade> jam, go version go1.1 linux/amd64
<jam> fwereade: it failed on gobot with 1.0.2, so I upgraded to 1.0.3 and still failed
<jam> fwereade: the fact that you and I both get 1.5s on very different hardware
<jam> means I'm pretty confident there is a hidden sleep
<jam> or timeout
<fwereade> jam, (I thought we'd standardized on go1.1?)
<fwereade> jam, yeah
<jam> fwereade: there will be no 1.1 in precise
<jam> that we can use for building
<jam> so use 1.1 when we can, but our syntax needs to stay 1.0.3 compatible
<fwereade> jam, oh balls -- I suppose we can't use a PPA?
<fwereade> jam, forget it, derail, anyway
<jam> fwereade: there are issues with dependent builds
<jam> at least for backports
<jam> you can get go 1.1 into backports
<fwereade> jam, heh, fair enough
<jam> but a backport of juju-core can't use debs from other backports to build itself
<jam> limitations of the builders
<fwereade> jam, blarg, ok
<fwereade> jam, good to know then
<jam> and we can get 1.0.3 into an SRU (which builders *can* use)
<jam> but not a 1.1 sort of change
<jam> fwereade: especially given dfc just mentioned there is a bug in 1.1 and we should actually wait for 1.1.1
<fwereade> jam, sounds worth investigating... sorry I can't help more :(
<fwereade> jam, ah, yeah, dimitern found some gc-related panic?
<dimitern> fwereade: yeah, let me find the issue link
<dimitern> fwereade: https://code.google.com/p/go/issues/detail?id=5554
<fwereade> dimitern, (fwiw, wrt 10 mins ago, please ping me before you implement any watchers that send actual data, rather than notifications... they're not all necessarily invalid, but they deserve discussion)
<fwereade> dimitern, cheers
<dimitern> fwereade: not planning to implement any today :) but might have to change what's already implemented
<fwereade> dimitern, if it's just EnvironConfigWatcher I've kicked that over to frank, he's making provisioner changes
<dimitern> fwereade: ah, cool
<dimitern> TheMue: ping me if you need help around the api/watchers - it's not pleasant :)
<TheMue> jam: test failed here with 1.0.3 too
<fwereade> dimitern, (and should thus be the best person to coordinate keeping that precise bit in sync)
<TheMue> dimitern: will do
<dimitern> fwereade: meet you in 10-15m @cuba?
<jam> TheMue: so the real question is why is it passing locally for me :)
<jam> on go-bot, I actually see it retrying maybe 6 times
<jam> found it, I think
<jam> fails locally now
<jam> the issue is that mgo driver
<jam> changed
<TheMue> jam: indeed, a good question
<jam> it now sleeps
<jam> it didn't as of the earlier rev I was using
<jam> TheMue: I think: https://groups.google.com/forum/#!msg/mgo-users/DU5pXpIxCg8/cAXAGWaOl0sJ
<jam> " A short delay was introduced at
<jam> this point to minimize the impact of these retries"
<fwereade> jam, ahhhh
<fwereade> dimitern, sgtm, just about to leave
<dimitern> fwereade: omw in 5
<jam> fwereade, TheMue: 'time.Sleep(500 * time.Millisecond)' was added to the retry code in mgo
<jam> hence, 3 retries (4 total tries), always equals 1.5s for everyone
<fwereade> jam, hmm, maybe the answer is to drop the delay in state.open regardless, and suck up the wait?
<fwereade> not sure
 * fwereade -> lunch
 * dimitern lunch as well
<TheMue> fwereade, dimitern: enjoy
<TheMue> jam: I updated mgo yesterday, so that change may be the reason, yes
<jam> TheMue: I'm pretty confident about it. 3 retries * 500ms = 1.5s, which is what we are all seeing.
<jam> TheMue: https://bugs.launchpad.net/juju-core/+bug/1183320
<_mup_> Bug #1183320: TestOpenDoesNotDelayOnHandshakeFailure fails with updated MGO <test> <juju-core:Confirmed> <https://launchpad.net/bugs/1183320>
<jam> TheMue: so right now mgo allows you to specify a custom Dial (which is how we inject TLS support)
<jam> does it sound like we should be able to inject a retry delay?
<jam> or just punt
<jam> and remove the test.
<jam> I would think that locally for testing we'd *really* like to set the retry delay to something really low, because we aren't dealing with network partitions while running the test suite.
<jam> so sleeping 500ms in a test suite is a bit silly
<TheMue> jam: sgtm
<jam> I feel like waiting for a patch to land in mgo so that we can use the Tarmac bot for juju-core and goose is a bit of a *long* way around the problem... :(
<TheMue> jam: niemeyer should be available soon
<jam> dimitern: for the Units api that you're working on.
<jam> Do the individual Unit objects you are returning have 'name' in them?
<jam> (I'm guessing they do)
<jam> For something like an RPC, I would like to avoid depending on the order being identical to the request order.
<TheMue> jam: dimitern is at lunch
<jam> also, I don't think we want 1 bad name to fail the rest of the requests because of potentially stale data.
<jam> TheMue, fwereade: In go, if you close a channel, then a pending <- on the channel returns, right? And it returns the zero value for the type?
<jam> Which means that technically the code in state/presence/presence.go which is waiting on "case <- done:" is actually getting false, but since it is just using it as a block
<jam> it doesn't actually care that the result is 'false'
<TheMue> jam: one moment
<jam> TheMue: I *think* I sorted out that the logic is it wants a channel that returns a bool, but all the inner code does is close the channel it has pending
<jam> it never sends something on that channel
<TheMue> jam: just looking at the code
<jam> it is a bit convoluted :)
<jam> I was trying to understand TestSync
<jam> as to why it thinks one call to Sync blocks
<jam> (for at least 200ms)
<jam> but it will be done after a later change.
<TheMue> jam: so far it seems you're correct, it's just closing the collected syncs
<TheMue> jam: wife called me for lunch, biab
<jam> TheMue: enjoy
<jam> fwereade, wallyworld_: I just sent an email to the list that is intended to roughly summarize what I got out of our chat last night about API. It would be nice if you could read over it and see where you agree/disagree/have more notes, etc.
 * dimitern is back
<wallyworld_> ok
<dimitern> jam: what's up?
<jam> dimitern: I ended up just commenting on your merge proposal
<danilos> jam: hi, trouble connecting to mumble server (connection times out :/)
<danilos> this seems to be the weird problem I've been seeing since quantal where my internet just stops working for any new connections but IRC keeps ticking along
 * danilos reboots
<fwereade> aw hell
<fwereade> dimitern, I think LifecycleWatcher is broken... ISTM that we trash intermediate results if the client doesn't pull from out before a new value arrives from in
<fwereade> dimitern, also (and this is nothing to do with you) this style completely invalidates the assumptions rog made when he did the original always-call-Next model
<fwereade> dimitern, do I recall you saying he was involved in its implementation?
<dimitern> fwereade: standup still, will get back to you
<fwereade> dimitern, np
<TheMue> *hmpf*
<fwereade> TheMue, encountering pain?
<TheMue> fwereade: only effort. ;)
<TheMue> fwereade: agent.Conf and upstart are so self-aware about their location (used in Write() and Install())
<fwereade> TheMue, relative to DataDir, right?
<TheMue> fwereade: yes, but they also write references to the DataDir as far as I can see
<TheMue> fwereade: so using the fully qualified container dir here would later lead to an illegal dir when used from inside the container
<fwereade> TheMue, gaah, code reference?
<TheMue> fwereade: I have to check, it has been an assumption as it is a field in each struct. but maybe you're right. that would be great.
<fwereade> TheMue, I think you just write a conf to the full path, and start the agent with the container-relative path, and you're done
<TheMue> fwereade: oh, in agent conf the read value from the file is overwritten by the passed one which is an arg of the jujud
<TheMue> fwereade: nice
<fwereade> TheMue, oh, we write it out? damn, that's stupid
<dimitern> fwereade: just finished, but have to dash out quickly, sorry
<fwereade> dimitern, no worries
<fwereade> dimitern, ping me when you're back
<TheMue> fwereade: we write it, but then, when reading the file in ReadConf(dir, tag), that dir is used.
<TheMue> fwereade: so the right one is used later; only the fully qualified path is written in the conf file
<fwereade> TheMue, it is more than somewhat insane that we write the file's path into the file though, feel free to fix that (or at least comment it, because it's seriously misleading)
<fwereade> TheMue, or hell just leave it
<TheMue> fwereade: so it only would confuse somebody debugging and reading that file inside a container wondering where this dir comes from
<fwereade> TheMue, agent.Conf is ludicrous anyway :/
<TheMue> fwereade: hehe
<fwereade> TheMue, it's absurdly overloaded
<TheMue> fwereade: absolutely. hmm, wouldn't omitting the writing of that field be enough? if it is set each time after reading the file?
<fwereade> TheMue, it'd probably be slightly better
<fwereade> TheMue, better still to actually *separate* the ancillary in-memory data from the stuff we serialize rather than smooshing it all together
<TheMue> fwereade: yes, but it's always so convenient to use a field instead of passing an argument through all nested methods. ;)
<fwereade> TheMue, heh :(
<TheMue> fwereade: hehe
<TheMue> fwereade: *hmpf* again, there are so many references to that DataDir. thankfully most simply read the field
<dimitern> fwereade: back
<fwereade> dimitern, hey, sorry I missed you
<fwereade> dimitern, 5 min chat in the hangout before kanban?
<dimitern> fwereade: oops, too late i think
<dimitern> fwereade: i'm in the kanban g+ now
<fwereade> ffs
<Makyo> Running into the following problem, would appreciate input: a test is set up with a service (riak), but when running, the assert fails that the service "riak" doesn't exist.  if I deploy riak as the first step of the test, I get "cannot add service \"riak\": service already exists" - any ideas?
<Makyo> Here: http://bazaar.launchpad.net/~makyo/juju-core/upgrade-charm2-1171548/view/head:/state/statecmd/upgradecharm_test.go#L80
<dimitern> fwereade: mail sent
<FunnyLookinHat> let's say I wanted to disable the security ( nova / quantum ) group stuff - where would I look to start commenting stuff out in src / trunk ?
<fwereade> FunnyLookinHat, driveby answer: environs/openstack/provider.go:809; and consider stubbing out all the methods with names like OpenPort, ClosePort, etc
<FunnyLookinHat> fwereade, perfect - thanks!
<wallyworld_> fwereade: hi, if you have a chance, can you take another look at https://codereview.appspot.com/9662048. tim had a question about the --series param and non-ubuntu deployments. i can change the flag's description text to something more generic. perhaps that's all we need for now
<thumper> morning
<thumper> hi wallyworld_
<thumper> wallyworld_: how are you doing today?
<wallyworld_> hi, was going to ping you soon. getting through some mechanical changes to clean up some aspects of our errors
<wallyworld_> a bit noisy here - guys trashing our bathroom
<thumper> hah
<wallyworld_> :-(
<wallyworld_> $$$$$$$
<thumper> expected trashing?
<wallyworld_> well, the shower has been rooted for weeks, big leak :-(
<thumper> hopefully only one of them and not all at once
<thumper> :(
<wallyworld_> so they need to pull apart the entire room
<thumper> wallyworld_: I've been trying to get loggo working
<wallyworld_> cool
<thumper> wallyworld_: however there are so many issues with our logging in the tests
<thumper> that makes it hard
<thumper> time consuming
<thumper> and energy sapping
<thumper> I need something enervating to do
<wallyworld_> yeah, this is my surprised face, not
<wallyworld_> we can discuss surrogate ids :-)
<bigjools> g'day
<thumper> hi bigjools
<wallyworld_> bigjools: fuck off
<thumper> bigjools: how are you feeling these days?
 * thumper slaps wallyworld_
<thumper> none of that here
<thumper> play nice
<wallyworld_> thumper: he knows i'm joking
<thumper> wallyworld_: but others reading here don't
<wallyworld_> it's the way we say hello in australia
<bigjools> it's his standard greeting.  If he doesn't say it I know he hates me :)
<thumper> lovely jubbley, say it on the private channels then :)
<bigjools> consider yourself bitchslapped wallyworld_
<wallyworld_> sigh
<bigjools> heh
<wallyworld_> everything is so politically correct nowadays
<thumper> sarcasm also doesn't come across well on irc
<thumper> wallyworld_: be non-pc on private channels, not public logged ones
<thumper> sad but true
<thumper> aspect of online life
<wallyworld_> yeah i know
 * thumper sighs
 * wallyworld_ sighs again
 * thumper looks to bigjools for the next sigh
<bigjools> meh
<bigjools> tired of sighing lately
<bigjools> thumper: anyway to answer your question, I am slightly better than I was thanks.  Had a torrid time last week but seem to be relatively ok at the mo. (fingers crossed)
<thumper> cool
#juju-dev 2013-05-30
 * thumper needs to reply to the api thread to agree with wallyworld_
 * wallyworld_ wants to talk surrogate ids with thumper
<thumper> wallyworld_: what about them?
 * thumper feels like he needs something else...
<thumper> just finished some ham, cheese and pineapple toasties
<thumper> perhaps another coffee
<wallyworld_> i think we can add a natural key to a machine without needing a schema change, but i could be wrong
<wallyworld_> there's always time for another coffee
 * thumper goes to make that coffee
<thumper> hi davecheney
<wallyworld_> thumper: hangout time?
<thumper> wallyworld_: sure
<thumper> wallyworld_: you create
<wallyworld_> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<bigjools> today is going to need more coffee and more painkillers
<dimitern> morning
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: dimitern | Bugs: 8 Critical, 71 High - https://bugs.launchpad.net/juju-core/
<wallyworld> davecheney: i don't type any import lines - my IDE does it all automatically :-D
<wallyworld> davecheney: so you want all "jujuerrors" import aliases to go away and "errors" aliased to "stderrors" where there are conflicts? not just in the one place you made a comment
<davecheney> wallyworld: i think there was only one place where we import both errors and jujuerrors
<davecheney> sort of like the way we do with testing and jujutesting
<davecheney> testing => stdtesting
<wallyworld> there were several in my branch
<davecheney> juju-core/testing => testing
<wallyworld> ok, i'll fix them all
<wallyworld> all of mine i mean
<davecheney> cool
<wallyworld> thanks for pointing it out
<wallyworld> i wasn't sure what the convention was
<davecheney> maybe you get to be the trend setter this time
<davecheney> at least it's consistent
<dimitern> wallyworld: and instead of jujuerrors, you can use coreerrors
 * davecheney doesn't care about the color of the bike shed, only that it has a traditional number of walls
<wallyworld> i won't need that though if i alias errors to stderrors
<dimitern> wallyworld: like coretesting, etc.
<dimitern> wallyworld: if you need to
<wallyworld> do we want to stick with coretesting?
<davecheney> i dont
<dimitern> yes we do
<wallyworld> me either since it's inconsistent with stdblah
<wallyworld> we need *one* pattern
<wallyworld> i don't mind which
<dimitern> any day i'll prefer a standard convention throughout the code over a new one
<davecheney> dimitern: this isn't a new one
<wallyworld> depends on what Go best practice is though
<wallyworld> if best practice is stdfoo, then we should use that
<wallyworld> i'm deferring to davecheney here :-)
<davecheney> wallyworld: i don't think the std library can help here
<dimitern> i mean if we tend to name core packages, when an import conflict will happen, "core"+package, let's do that
<davecheney> we're in uncharted territory
<davecheney> but we do have a pattern of importing the std lib's testing as stdtesting
<dimitern> yeah, exactly
<wallyworld> so: do we alias the core packages to corefoo, or the std lib packages to stdbar. we need to choose one or the other
<davecheney> so ithink what wallyworld and I are saying is, lets keep doing that
<dimitern> wallyworld: i don't mind which one - there are examples of both
<davecheney> as an aside, if we were so arrogant as to give one of our packages the same name as a package that already exists in the standard library, we should change our package's name
<dimitern> wallyworld: but i do mind inventing yet another prefix scheme
<wallyworld> the bike shed should be all purple or all red, not red and purple :-)
<wallyworld> dimitern: sure, using "juju" was wrong, i didn't realise we used "core"
<dimitern> wallyworld: that's all i meant, thanks
<davecheney> holy fuck
<davecheney> just submit the thing
<wallyworld> np, i knew what you meant :-)
<davecheney> the person that cares _that_ much about naming things can do a followup
<wallyworld> i don't mind this discussion, really! i'd like to get it right rather than propagating more inconsistency into the code base
<dimitern> at the moment i mostly care for us to come to an agreement what are we doing with  the api
<dimitern> 3 days of discussion now and no single solution on the horizon, we're wasting time
<davecheney> dimitern: you must be new here
<dimitern> davecheney: :D
<mramm> is there a notes doc for today's meeting yet?
<wallyworld> davecheney: with the "arrogant" thing - if a package has a really generic name like "errors", it's hard to avoid a clash
<jam> dimitern, wallyworld, mramm: I'm probably going to miss the juju-core meeting. I'll see if I can get back in time, but today is "show your stuff to your parents day at school".
<jam> It is done right when the meeting starts.
<mramm> jam: no problem
<wallyworld> oooh good - what's on display?
<jam> dimitern: I didn't feel your email actually had a "should I do it X or Y". or at least, the question wasn't very clearly stated.
<davecheney> mramm: it's the same doc as always
<davecheney> danilos wanted to keep it all in one document from now on
<mramm> ahh, right
<jam> wallyworld: well, it is KG2, so some crudely drawn paintings that all of us parents will surely fawn over
<dimitern> wallyworld: we can introduce a cunning scheme of replacing package names with base64(name) or rot13, so such clashes with std packages are avoided }:>
<wallyworld> lol
<mramm> jam: got time for a quick sync up before you leave?
<wallyworld> rotflmao
<jam> mramm: It doesn't hurt to resend it to canonical-juju to remind us all about it. Though as it is a private doc we have been suggested to only send it to the private list.
<jam> mramm: unfortunately, I have to head out the door now.
<davecheney> wallyworld: fair point, but that doesn't mean if we start a juju-core/utils/math package, we have more right to the name "math"
<mramm> jam: no problem
<mramm> jam: have fun!
<jam> mramm: what are you even doing awake?
<jam> :)
<mramm> I am trying out staying up until the meeting
<wallyworld> davecheney: agreed! we just need a convention for dealing with it - i think i prefer stdfoo
<davecheney> wallyworld: seconded
<davecheney> the I's have it
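The stdfoo convention the discussion settles on can be written down as an import block; the pattern is real, but the juju-core import paths here are illustrative, not copied from any one file:

```go
// When a juju-core package clashes with a standard-library name, the
// standard-library import gets the "std" prefix and the core package
// keeps its natural name (paths illustrative of the pattern).
import (
	stderrors "errors"
	stdtesting "testing"

	"launchpad.net/juju-core/errors"
	"launchpad.net/juju-core/testing"
)
```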
<mramm> ... see if I am more awake that way than getting up in the middle of the night...
<wallyworld> \o/
<dimitern> guys, i'd appreciate your input on the mail i sent yesterday - API Bulk Operations
 * davecheney is super testy because he suspects the mongo retry behaviour is part of what is making the tests unreliable with the .deb mongo
<davecheney> in fact, i'm going to rip the retry shit out and see if that helps
<wallyworld> dimitern: sorry about the delay. will look after i pick up my kid from viola soon
<wallyworld> davecheney: if you fix the tests you will be my hero
<dimitern> wallyworld: cheers
 * wallyworld goes to get kid from school
<davecheney> https://codereview.appspot.com/9857046/
<dimitern> davecheney: on it
<dimitern> davecheney: reviewed; how come nobody encountered an issue with that?
<davecheney> dimitern: they probably did
<davecheney> but didn't realise it
<dimitern> davecheney: i'd like to see a test showing the issue before and after the change, if possible
<davecheney> dimitern: run the race detector on the cmd/jujud package
<dimitern> davecheney: ah, ok, thanks
<davecheney> that is how i found it :)
<dimitern> davecheney: where's the agenda for today's meeting?
<davecheney> dimitern: same as last week
<davecheney> remember danilos asked to have it all on one page
 * dimitern lunch
<wallyworld> fwereade: you now ok with https://codereview.appspot.com/9662048/ ?
 * TheMue => lunchtime
<wallyworld> dimitern: since you asked :-) https://codereview.appspot.com/9666050/
<dimitern> best standup ever!
<danilos> :)
<wallyworld> jam: oh, so sorry, you missed the standup :-P
<dimitern> wallyworld: reviewed
<wallyworld> thank you
<fwereade> wallyworld, sorry, I'll make sure your reviews are ready for your morning
<wallyworld> fwereade: np. i've yet to do any more coding on the dirty flag or assignment policy ones
<fwereade> wallyworld, tentative yes on that one but let me check; I'll need to do a spot more thinking on the assignment policy though
<wallyworld> fwereade: with the assignment policy - i'll remove the code which provides the default constraint value in env till we work out how to do it consistently for all constraints
<wallyworld> fwereade: with the dirty flag, i'm switching it to "clean" so the semantics are backwards compatible with existing schemas
<fwereade> wallyworld, ok, so new machines start with "clean" set and old ones are assumed dirty? sgtm
<wallyworld> fwereade: that's the plan. sort of like a failsafe - assume dirty unless told otherwise
<wallyworld> i had intended to do it that way originally but got talked out of it :-)
<fwereade> wallyworld, heh, annoying, sorry about that
<wallyworld> not your fault!
<fwereade> wallyworld, I still need to think about assignment policy, there's something knocking at my mind but I'm not sure what ;)
<fwereade> wallyworld, I'm just expressing regret, not taking responsibility ;p
<wallyworld> fwereade: the end game is to use a constraint, not to hard code on each provider. i think that bit is ok. but we need to decide how to specify any default value if the user doesn't specify a policy
<wallyworld> i added an env setting for this. but am happy to hard code until we decide how to do constraint default values
<fwereade> wallyworld, yeah, agreed; I wish that state could have a concept of an actual environ other than via a config
<wallyworld> juju config seems "interesting" and complex to me
<fwereade> wallyworld, I'd be most comfortable if we stuck with that rather than taking a "temporary" step in an unhelpful direction... for example, we *could* have the env inject it into the bootstrap constraints if not already set
<fwereade> wallyworld, don't even start :/
<fwereade> wallyworld, it's trying to be too many things
<wallyworld> yeah. i don't profess to *fully* understand the design rationale
<fwereade> wallyworld, and I'm terrified of the "worse" phase in "it'll get worse before it gets better"
<wallyworld> indeed
<fwereade> hey ho
<fwereade> ok, I am feeling somewhat crappy, I'm going to have a bit more of a rest in the hope of becoming coherent for a chunk of the rest of the afternoon
<fwereade> ttyl guys
<wallyworld> have a drink! i've imbibed a few already :-)
<teknico> marcoceppi: hi, you around?
<marcoceppi> teknico: I am!
<teknico> marcoceppi: could you please have a look at https://bugs.launchpad.net/juju-plugins/+bug/1185820 ?
<marcoceppi> teknico: branch I can run juju-test against?
<teknico> marcoceppi: https://code.launchpad.net/~juju-gui/charms/precise/juju-gui/trunk , revno 59
<marcoceppi> teknico: thanks, it looks like some discrepancies between what jitsu was doing and what the new plugin does. Based on feedback from others it now executes everything in the tests/ directory, not just .test files. Which is why your utils.py files are even being considered for tests
 * marcoceppi pokes some more to figure out the other issues
<teknico> marcoceppi: thank you
<marcoceppi> teknico: you can specify test(s) to be run at the end of the juju-test command, and juju-test won't traverse directories in the tests folder. So you could reorganize your helpers in another directory or just run juju-test unit.test deploy.test
<teknico> marcoceppi: good to know about the subdirs. is there a way to reuse an already bootstrapped environment?
<marcoceppi> teknico: not at the moment, but I can add that option. The downside being that you wouldn't get a teardown between tests. If your tests are OK with that it shouldn't be a problem
<marcoceppi> teknico: finally, there are verbose controls, -v(vvv) which will give you debug output
<teknico> marcoceppi: yes, I'm going to try again with those
<marcoceppi> teknico: could you attach the output with -vv in the report when it's done running?
<teknico> marcoceppi: sure. what about the default env? any chance to use it?
<marcoceppi> teknico: I made it explicit out of concern people would trample their default environments accidentally. However, I guess it'd be ok to use the default as it does a sanity check for already-bootstrapped
<teknico> marcoceppi: yeah, it sounds reasonable
<marcoceppi> I'll add default-environment support in a few mins
<teknico> marcoceppi: great, thanks. I guess in order to actually use "juju test" rather than juju-test, we'll need a package in a PPA, right?
<teknico> marcoceppi: output of a -vv run added
<teknico> marcoceppi: those "Permission denied" errors are weird, perms and ownership are ok locally
<marcoceppi> teknico: permission errors are because the files aren't +x
<marcoceppi> well, your helper files aren't
<teknico> ah, right
<marcoceppi> teknico: It looks like environment variables aren't being passed to test properly. I'll see if I can get a fix for that out asap
<teknico> marcoceppi: awesome, thanks
<jcastro> dave mentions in the release notes that the PPA has changed
<jcastro> but afaict, it's still ppa:juju/devel
<marcoceppi> teknico: Give the latest a try. I wasn't able to check against your current repo as I don't have all the dependencies for testing, but it should fix the errors as well as stop trying to execute non-executables. Though that last bit might go away in the near future. You may want to reorganize or just document that you have to specify which .test files you want to run
<marcoceppi> teknico: as for the default environment I'll see if I can get that in there later today
<teknico> marcoceppi: thanks, I will in a while, in a call now
<marcoceppi> teknico: ack, thanks for filing the bug and playing guinea pig :)
<teknico> marcoceppi: oink ;-)
<teknico> marcoceppi: there are some problems with last revno, running tests after applying this diff: http://pastebin.ubuntu.com/5717057/
<FunnyLookinHat> Anyone here know why Content-Length wouldn't be included in the request to /tokens w/ the OpenStack provider ?
<FunnyLookinHat> ( http://hastebin.com/fanagihive.xml )
<marcoceppi> teknico: sorry about that! Not sure why make check didn't catch it. I've pushed a fix that should work now
<teknico> marcoceppi: now I'm having unrelated problems with selenium not being able to open displays :-/
<marcoceppi> teknico: does selenium use the DISPLAY environment variable?
<teknico> marcoceppi: I don't think so, it looks for display :1032
<teknico> marcoceppi: the expert on that is frankban, I'll ask his help after our standup call ;-)
<marcoceppi> teknico: cool, if there's anything else juju-test needs to set in the way of env vars, etc. (what's actually set when each test runs is pretty lightweight), let me know!
<teknico> marcoceppi: I will, thanks again!
<fwereade> danilos, https://codereview.appspot.com/9876043/ reviewed
<fwereade> danilos, and, doh, disregard last question, I see you already answered it
<fwereade> Makyo, ping
<Makyo> fwereade, Hey
<fwereade> Makyo, I'm a bit concerned about where the UpgradeCharm logic has moved to, can we g+ quickly?
<Makyo> fwereade, sure
<fwereade> Makyo, heh, I'm not sure which of the many matthew scotts you are...
<fwereade> Makyo, I'm sure I *have* met you in person...
<Makyo> fwereade, https://plus.google.com/105798904379156554275/posts
<teknico> marcoceppi: so, I was wrong, and selenium does indeed need the DISPLAY env var
<fwereade> cool, the review queue's looking a bit more manageable
<fwereade> later all
<marcoceppi> teknico: no problem, I'll add that to the plugin in a second
<marcoceppi> teknico: anything else while I'm poking around?
<teknico> marcoceppi: no thanks, I patched juju_test.py myself and am trying to run the test again
<teknico> marcoceppi: there seem to be problems with manually destroying the env after the juju-test command is interrupted prematurely
<teknico> marcoceppi: btw, the feature of only running tests in executable files is working flawlessly, thanks
<marcoceppi> teknico: what's the error? That's likely a juju problem but I'll look in to trying to trap Ctrl+C to do better cleanup
<teknico> marcoceppi: I get "error: environment is already bootstrapped"
<marcoceppi> teknico: ah, because it didn't destroy-environment during the premature exit. You'll have to destroy the environment before running again (or, when the option is added, use the "use-existing-bootstrap" option)
<teknico> marcoceppi: yes, that's what I was trying to do, but it takes quite a few retries, for some reason, or maybe just a bit of time
<teknico> ok, EOD, I'll resume this tomorrow
<teknico> marcoceppi: thanks again for the help
<marcoceppi> teknico: thank you so much for the feedback!
<jcastro_> https://bugs.launchpad.net/juju-core/+bug/1027873
<_mup_> Bug #1027873: cmdline: Implement constraints set/get <cmdline> <juju-core:In Progress by fwereade> <https://launchpad.net/bugs/1027873>
<jcastro_> marcoceppi: is this it?
<marcoceppi> jcastro_: similar, that's setting them overall. this is specifically "juju add-unit --constraints"
<fwereade> jcastro_, marcoceppi: heh, I apparently suck at bug management
<marcoceppi> fwereade: is that released!
<marcoceppi> ?
<jcastro_> fwereade: hey so we just ran into this while doing a live site
<fwereade> marcoceppi, no such thing as unit constraints (except internally, so pretend I didn't say that -- it's not meaningful in the model we want to expose)
<jcastro_> where we deployed to an xsmall on HP cloud but need to bump the instance size up
<marcoceppi> fwereade: right, so the idea was "juju add-unit --constraints" then we can remove-unit the old one
<fwereade> marcoceppi, you should be able to set service constraints, add-unit, and then remove the old one
<marcoceppi> fwereade: how? I didn't find anything in juju help
<fwereade> marcoceppi, this then means that future ones will be deployed with the new constraints as well -- is that unhelpful?
<fwereade> marcoceppi, `juju help set-constraints` looks, er, kinda crap... but `juju set-constraints myservice mem=4G` should work?
<marcoceppi> fwereade: I'll give it a shot, thanks
 * fwereade looks somewhat dolefully at the command and its help, and presumes he implemented it that way to match python
 * fwereade observes that, no, it doesn't even match python
<fwereade> ah! that's ok
<fwereade> I was reading get-constraints
<fwereade> marcoceppi, juju set-constraints -s myservice mem=4G
<fwereade> that's more like it
<marcoceppi> cool
<marcoceppi> fwereade: thanks, that did the trick!
<fwereade> marcoceppi, great
<marcoceppi> fwereade: along those same lines, can you force remove-unit?
<marcoceppi> It's in a state of dying, but the machines never came up. Now I can't remove unit or terminate machine
<thumper> morning
<bac> hi thumper
<thumper> hi bac
<bac> hi marcoceppi, do you have a moment?  i'd like to talk about the charm-tools build recipe
<marcoceppi> bac: I've got about 10-15 mins
<bac> marcoceppi: perfect.  hangout?  i'll invite you
<marcoceppi> bac: sounds good
<bac> marcoceppi: failed
<marcoceppi> just copy/paste a link?
<bac> https://plus.google.com/hangouts/_/b665b424984af215e05d2968dab0964ada7130ed?hl=en
 * thumper pulls a funny face
<thumper> fwereade: not still up are you?
<thumper> arse biscuits
<thumper> wallyworld: you around?
<wallyworld> thumper: hi
<wallyworld> thumper: just a sec, need to talk to plumber
<wallyworld> thumper: i gotta go do something. i'll ping you in a bit
<thumper> wallyworld: ack
#juju-dev 2013-05-31
<thumper> wallyworld_: finally got the logging branch passing all the tests.
<wallyworld_> \o/
<thumper> last failures with the uniter tests where it is watching log messages and using a regex to match
<thumper> to make sure hooks run
<thumper> had me cringing on the inside
<wallyworld_> ah yes, i ran into that too
<wallyworld_> very hard to figure out the test failures
<thumper> heh
<thumper> yeah
<thumper> https://code.launchpad.net/~thumper/juju-core/use-loggo/+merge/166611 is the work in progress, i'm just going to go through the diff and clean up anything that needs tweaking
<thumper> wallyworld_: I see it is "land it friday" for your branches
<wallyworld_> yeah, 2 more to go if i can get the issues sorted
<wallyworld_> thumper: a quick read of the code - looks nice, what i would expect to see for logging. now, if only we had logger.Trace(....) and devs could juju set-env tracelevel to turn it on/off etc :-)
<thumper> wallyworld_: we do have logger.Trace
<thumper> and we should do a set-env
<wallyworld_> ah ok, sorry. didn't see it in my quick look
<thumper> wallyworld_: it is in launchpad.net/loggo
 * wallyworld_ nods
<thumper> wallyworld_: we have loggo.ConfigureLogging
<thumper> or
<thumper> logger := loggo.GetLogger("juju.uniter.foo")
<thumper> logger.SetLogLevel(loggo.TRACE)
<thumper> INFO, DEBUG, WARNING, ERROR
<wallyworld_> cool
<wallyworld_> travel levels would be nice :-)
<wallyworld_> trace
<wallyworld_> not travel
 * thumper nods
<thumper> we have them
<thumper> by default not set though
<thumper> and --debug
<thumper> only sets juju to DEBUG
<thumper> you'd need to use the configure logging to get trace
<thumper> juju --log-config juju=TRACE magic
<wallyworld_> that's fine. i think trace is a set-env thing anyway
<wallyworld_> perhaps
<wallyworld_> thumper: hangout?
<thumper> sure
 * wallyworld_ starts one
<wallyworld_> https://plus.google.com/hangouts/_/d3f48db1cccf0d24b0573a02f3a46f709af109a6
<bigjools> thumper: it's good to see you used loggo :)
<thumper> bigjools: yeah, after both you and jam suggested it, i changed it
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: - | Bugs: 8 Critical, 71 High - https://bugs.launchpad.net/juju-core/
<dimitern> morning
<fwereade> hey guys, guess what I did? dropped laura at school without her lunch :/
<fwereade> bbiab
<fwereade> ok, I can no longer kid myself I'm not sick, I'm going to sleep and hope it gives me strength enough to collect laura this pm (cath's worse :/)
<fwereade> I'll probably drop by every so often 'cos I'm obsessive like that, but don't count on it
<TheMue> fwereade: ok. so get well soon and recharge yourself.
<dimitern> fwereade: get well soon, both of you
<dimitern> fwereade: ping
<dimitern> jam: (if you're actually here) Thanks for the review!
<dimitern> jam: very good suggestions, I'll update the proposal
<TheMue> dimitern: will also take a look, as I now have to take a step back from the provisioner stuff
<dimitern> TheMue: cheers, I'll be updating it shortly with jam's suggestions
<TheMue> dimitern: ok
<teknico> marcoceppi_: hi, is this enough to propagate the DISPLAY env var in juju-test? http://pastebin.ubuntu.com/5719876/
 * TheMue grumps
<ahasenack> how can I get a help from the juju tools? Like relation-get --help?
<TheMue> Washing machine and daughter's bike are broken.
<ahasenack> since they are not part of the juju package
<ahasenack> and there is no manpage
<dimitern> ahasenack: you can look in the code perhaps: worker/uniter/jujuc/relation-set.go
<TheMue> dimitern, ahasenack: just wanted to give the same advice ;)
<dimitern> TheMue: :)
<ahasenack> ok
<ahasenack> dimitern: that tip doesn't work for relation-list
<ahasenack> worker/uniter/jujuc/relation-list.go doesn't help much
<ahasenack> actually
<ahasenack> it's the same
<dimitern> ahasenack: oh?
<ahasenack> there is no simple text template to copy from
<ahasenack> have to look at flags and stuff
<dimitern> mramm: kanban?
<dimitern> ahasenack: --smart and -r are supported
 * TheMue has to leave, will step in later again
#juju-dev 2013-06-01
<arosales> Have any folks deployed juju-core 1.11.0-1~1240~quantal1 on hp cloud?
<arosales> seems juju core 1.11 can't find a precise image on HP :-/
 * arosales will file bug for follow up
#juju-dev 2013-06-02
<ajnr> hi, my ubuntu 12.04 hangs while shutting down, please help me sort out the problem. https://bugs.launchpad.net/ubuntu/+source/indicator-session/+bug/1186605
<_mup_> Bug #1186605: 12.04 ubuntu shutdown hangs <indicator-session (Ubuntu):New> <https://launchpad.net/bugs/1186605>
<fwereade> wallyworld, just popping in to say that dimitern's done a couple of API branches and I'd really appreciate your reviews on them tonight if you have a mo
<fwereade> wallyworld, but I'm going to sleep now
<wallyworld> ok, will look
#juju-dev 2014-05-26
<davecheney> good news everyone, save joyent, http://paste.ubuntu.com/7517545/
<davecheney> ppc64el is passing
<wallyworld> davecheney: o/
<wallyworld> i am fixing joyent this morning
<wallyworld> davecheney: have you seen the jenkins job? it fails with a compile error
<wallyworld> what's your take on that?
<wallyworld> e.g. http://juju-ci.vapour.ws:8080/job/walk-unit-tests-ppc64el-trusty-devel/309/console
 * davecheney looks
<davecheney> is there a bug raised for this ?
<wallyworld> not that i know of, i only just saw it this morning. there's a history of job failures we should extract bugs for
<wallyworld> how is the jenkins setup different from yours?
<davecheney> it is possible that the gccgo fix has not landed in main yet
<davecheney> i am running the fix from the ppa
<wallyworld> ah
<davecheney> can you ssh to the test machine and look at dmesg
<davecheney> the last 20 lines should be sufficient
<wallyworld> i'll have to determine what the machine is
<wallyworld> let me get the upstream joyent fixes in the core first
<davecheney> ummm
<davecheney> + juju_version=juju-core_1.19.3
<davecheney> + set -x
<davecheney> + set +e
<davecheney> I think this is wrong
<wallyworld> where is the above from?
<davecheney> that link
<davecheney> you posted
<davecheney> +e turned off break on error afaik
<wallyworld> we can talk to curtis about the test scripts used
<wallyworld> davecheney: this will make the joyent tests go faster https://codereview.appspot.com/101740043
<wallyworld> davecheney: is this the ppa i would need to use to get the same gccgo as you? https://launchpad.net/~ubuntu-toolchain-r/+archive/ppa/?field.series_filter=trusty
<axw> morning all
<thumper> davecheney: juju-ci.vapour.ws:8080/job/walk-unit-tests-ppc64el-trusty-devel/312/console
<thumper> davecheney: ec2 tests paniced
<thumper> davecheney: is this new or known about?
<thumper> actually, that isn't ec2
<axw> looks like it panicked in the go tool itself
<thumper> is it joyent?
<thumper> yeah, looks like compiling error somewhere
<thumper> doesn't seem like it gives good feedback as to which bit causes the problem
<thumper> reminds me of the "internal compiler error" I used to get with gcc and hairy templates
<wallyworld> the gccgo on jenkins is old apparently
<wallyworld> doesn't have the latest fixes
<wallyworld> the one from the ppa is better
<thumper> ah...
<wallyworld> i'm trying it out now
<thumper> that makes sense
<wallyworld> see my above link
<wallyworld> axw: morning. if you have a moment, could you +1 this small mp. fixes the joyent tests. https://codereview.appspot.com/101740043
<axw> sure, looking
<wallyworld> ta
<axw> yay, they merged
<wallyworld> yeah, finally :-)
<wallyworld> the tests still take a little too long, but much better
<davecheney> thumper: we're just discussing that
<thumper> kk
<davecheney> my hunch is the gccgo fix hasn't landed in trusty-updates yet
 * thumper nods
<davecheney> wallyworld: doko moves ppa's more often than the tide
<davecheney> that one will probably work
<wallyworld> cool, trying it locally
<thumper> heh
<davecheney> Guest78498: trying the joyent fixes now
<Guest78498> davecheney: sadly, i don't get a clean test run. http://pastebin.ubuntu.com/7517679/
<Guest78498> a test failure or two and compile errors
<Guest78498> that's with the ppa
<Guest78498> but joyent tests pass :-)
<Guest78498> still 3 times slower than maas tests, or 6 times slower than ec2
<Guest78498> but, it's a start
<thumper> Guest78498: what have you don't with wallyworld?
<thumper> s/don't/done/
<davecheney> Guest78498: can you send me the last 20 lines of dmesg from that host
<Guest78498> thumper: computer crashed, and freenode takes way too f*cking long to drop the connection, so it won't let me back in
<davecheney> cmd juju ran too long
<thumper> Guest78498: ghost wallyworld
<thumper> hmm...
<Guest78498> davecheney: i ran the tests on my laptop, you want dmesg from that?
<davecheney> signal: segmentation fault (core dumped)
<davecheney> FAIL    launchpad.net/juju-core/environs/simplestreams  1.552s
<davecheney> Guest78498: gotta be on ppc
<davecheney> that is where the bug is
<davecheney> skip the dmesg request
<davecheney> you don't need to apply the ppa to gccgo on amd64
<davecheney> it is unaffected
<davecheney> looking at the rest of these tests
<davecheney> looks like they just took too long
<Guest78498> what about the seg fault
<davecheney> no idea
<davecheney> check dmesg
<davecheney> (yes, I know i just told you not to)
<Guest78498> davecheney: oh joy, lookie
<Guest78498> [  674.033377] CPU7: Core temperature above threshold, cpu clock throttled (total events = 1921)
<Guest78498> [  674.034391] CPU7: Core temperature/speed normal
<Guest78498> [  674.034392] CPU3: Core temperature/speed normal
<Guest78498> [  675.690153] mce: [Hardware Error]: Machine check events logged
<Guest78498> maybe that had something to do with it?
<davecheney> Guest78498: if I had to guess, 8 tests compiled with gccgo plus 8 mongodbs is causing you to swap
<Guest78498> i have 16GB RAM
<Guest78498> but no swap partition
<davecheney> some of the tests consume gigabytes when run under gccgo
<Guest78498> i've never needed a swap partition till now with that much memory. sigh
<davecheney> adding swap won't help
<davecheney> probably reducing the number of tests run concurrently will
<davecheney> go test -p 4 ./...
<davecheney> go test -p 4 -compiler=gccgo ./...
<Guest78498> even if i don't have gomaxprocs set currently?
<davecheney> unrelated
<davecheney> go test tries to start as many test jobs in parallel as you have CPUs
<Guest78498> ok, i'll try that after getting the joyent branch landed
<Guest78498> ffs, how long does freenode want to hold open my old connection for
<davecheney> ok, good news and bad news
<davecheney> good news: ok  launchpad.net/juju-core/provider/joyent 127.919s
<davecheney> bad news: some other transient error, http://paste.ubuntu.com/7517706/
<davecheney> usual unreachable servers bullshit
<axw> Guest78498: I managed to reproduce the issue kapil had where kvm containers are leaking. I'll be looking at that today
<Guest78498> \o/ great thanks
<axw> I think it doesn't happen in 1.19.3, but I'd like to figure out what's going on to be sure
<Guest78498> sounds good
<Guest78498> davecheney: that replicaset stuff has been so f*cking unreliable
<Guest78498> it's been made more robust, but......
<Guest78498> still can fail
<wallyworld> davecheney: much better with -p 4. the jujud watcher error occurs sometimes on amd64 also, but there's a fault in the openstack provider tests http://pastebin.ubuntu.com/7517849/
<wallyworld> davecheney: here's the dmesg from the CI ppc machine used to run the tests every hour and for which i posted the output with all the faults etc earlier http://pastebin.ubuntu.com/7517890/
<wallyworld> thumper: have you run a local provider with kvm before?
<thumper-otp> yes
<wallyworld> seen this?
<wallyworld> ian@wallyworld:~$ juju status
<wallyworld> ERROR failed getting all instances: exit status 1
<wallyworld> ERROR Unable to connect to environment "local-kvm".
<wallyworld> Please check your credentials or use 'juju bootstrap' to create a new environment.
<wallyworld> i just bootstrapped, no errors
<thumper-otp> no
<wallyworld> :-(
<wallyworld> hmmm, seems KVMObjectFactory.List() is sad
<wallyworld> ian@wallyworld:~$ virsh -q list --all
<wallyworld> error: failed to connect to the hypervisor
<wallyworld> error: no valid connection
<wallyworld> error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Permission denied
<wallyworld> thumper: could it be that we need sudo and it's not prompting?
<axw> wallyworld: seems odd that "juju status" is trying to list providers; that implies your .jenv doesn't have a valid API server address cached
<wallyworld> axw: works fine for local with lxc
<axw> wallyworld: you should add your user to the group that owns libvirt-sock tho
<wallyworld> ok
<axw> then you should be able to use virsh
<axw> I think it's libvirtd
<wallyworld> if we need that for local kvm to work, we should check for it
<axw> it shouldn't be necessary, sounds like something else is wrong
<axw> but it's useful for using virsh
<wallyworld> does local with kvm work for you?
<axw> yes
<axw> just works
<wallyworld> hmmmm
<wallyworld> i'm trying to reproduce bug 1322281
<_mup_> Bug #1322281:  local provider deployment of mysql to kvm hangs in pending state <cloud-installer> <juju-core:Triaged> <https://launchpad.net/bugs/1322281>
<axw> I am in the group, however all the important bits run under root anyway
<axw> I tried that, it just worked for me
<axw> i.e. I installed mysql in a kvm container and it worked fine
<wallyworld> :-(
<wallyworld> so my setup is screwed somehow
<wallyworld> we might need to ask for the cloud init log
<wallyworld> for the user of that bug
<axw> yeah may be useful
<thumper> Guest21046: crash again?
<thumper> Guest21046: I have a branch where the only failing bit is tool selection and syncing
<Guest21046> thumper: nah, had to log out and back in so a new group membership would stick
<thumper> Guest21046: not sure why... perhaps you could take a look?
<Guest21046> sure
<thumper> Guest21046: lp:~thumper/juju-core/enable-alpha-versions
<thumper> Guest21046: I can't see where the version comparison stuff is different for tool selection
<thumper> and since you most recently touched it, you may know more
<Guest21046> sure, no problem
<Guest21046> thumper: just pulling your branch and looking without running tests, ottomh there are a number of tests which rely on minor version being odd to be a dev version. and that is now no longer the case in you branch
<thumper> ah...
<thumper> ok
<Guest21046> i'll run the tests though to confirm
<thumper> almost certain that'll be the problem then
 * thumper goes to enfixorate
<Guest21046> ok
<Guest21046> let me know if you want me to do anything to help
<jam> axw: sorry I missed this a few days ago, but the fix here https://code.launchpad.net/~axwalk/juju-core/lp1321025-mongod-path-1.18/+merge/220544
<jam> should really be to default to juju-mongodb except for precise and raring (I think)
<jam> otherwise everything will fail again when V is released.
<davecheney> jam: and raring is out of support
<axw> jam: that is what it does, see the code above
<axw> jam: it's using the result of MongoPackageForSeries, which was already doing the right thing
<jam> axw: ah, I see. It was just that we had a line that was using == trusty, gotcha
<axw> yup
<jam> I misread the diff given you added explicit "utopic: juju-mongodb" line
<jam> 	but that was just in tests
<axw> wallyworld: I think you put the wrong LP# in the replicaset card
<wallyworld> sigh yes
<wallyworld> bigjools was distracting me with drivel
<axw> heh :)
<bigjools> he's easy to distract
<bigjools> I just throw a ball
<wallyworld> ball? where?? me fetch? can I? huh? huh?
<menn0> hmmm... i'm trying to add a field to the FullStatus API response and everything hangs if I use a map keyed by int but it's fine if the map is keyed by string
<bigjools> I am NOT rubbing your tummy
<menn0> is this a known thing?
 * wallyworld tries so hard to be pc
<bigjools> makes a change :)
<menn0> does our API serialisation only like string keys on maps?
<wallyworld> as thumper says, it's a family channel
<bigjools> hello menn0, apparently you used to work with my brother in law at BATS
<thumper> heh
<menn0> what's his name?
<bigjools> Glyn
<menn0> yep :)
<menn0> I know Glyn well
<wallyworld> unlike his brother in law, Glyn is an awesome guy
<bigjools> I've known him for 19 years, some say 19 years too many
<menn0> Glyn is awesome... if a little grumpy at times :)
<bigjools> lol
<menn0> any takers on my question? what's our serialisation format for the API?
<wallyworld> davecheney: apart from replicaset shite, ppc tests on CI server now pass also http://juju-ci.vapour.ws:8080/job/walk-unit-tests-ppc64el-trusty-devel/317/consoleFull
<menn0> if it's JSON I can see how there could be a problem...
<wallyworld> i think it is
<menn0> right
<menn0> that makes sense
 * menn0 will use strings
<wallyworld> don't ya love json for that stuff
<menn0> it's not a big deal in this case
<davecheney> wallyworld: w00t
<wallyworld> davecheney: i upgraded gccgo-4.9 from the ppa
<davecheney> cool
<davecheney> we're just waiting for that to land in trusty-updates
<wallyworld> davecheney: did you see my local test run? fault in openstack tests http://pastebin.ubuntu.com/7517849/
<davecheney> wallyworld: have you run godeps ?
<wallyworld> i did earlier
<wallyworld> which dep in particular?
<jam> wallyworld: did you see https://bugs.launchpad.net/bugs/1321009
<_mup_> Bug #1321009: juju-metadata doesn't produce content that 1.19.2 bootstrap can use <landscape> <metadata> <regression> <juju-core:Fix Committed by wallyworld> <https://launchpad.net/bugs/1321009>
<davecheney> mgo
<jam> hmm,, mail client shows me the old stuff, but not the new
<jam> looks like you already fixed it
<wallyworld> jam: yes, fixed, it wasn't a simplestreams data format bug; rather, bootstrap looked up the simplestreams data for supported arches before it was uploaded, in the case where --metadata-source was specified
<wallyworld> davecheney: i've got rev 275 of mgo locally even though juju-core's dep file says 273
<davecheney> ... o_O
<wallyworld> that shouldn't cause the test failures though
<wallyworld> looking at the logs, it seems a mongo process is just dying unexpectedly
<davecheney> we had a bug a few days ago where our client was causing mongo to crash
<davecheney> looks like the replica set stuff is a bit fragile
<davecheney> as in, mongos'
<wallyworld> and the test doesn't recover. it gets EOF, refreshes and calls Ping(), but tries to reach the process that has just died
<wallyworld> yeah, i reckon mongo itself is fragile :-(
<wallyworld> wish we didn't use it
<wallyworld> jam: how detailed is your mgo HA knowledge?
<jam1> wallyworld: I wouldn't say it is amazing, but I'm willing to tell you what I can :)
<wallyworld> jam: quick hangout?
<jam> sure
<jam> link?
<wallyworld> https://plus.google.com/hangouts/_/gxweywbcs523zdbqunelr5u4uua
<vladk> jam: morning
<jam> wallyworld: says the party is over..
<jam> morning vladk
<wallyworld> invite sent
 * wallyworld -> school run bbiab
<vladk> jam: hangout time
<dimitern> morning all
<TheMue> morning
<axw> fwereade: on closer inspection, my assumption about the keymanager test was correct; it's tested by TestImportKeys above. going to take that test back out again...
<wallyworld_> mgz: standup?
<mgz> wallyworld: lost you
<wallyworld_> mgz: we're here
<wallyworld_> you're muted
<mgz> I.. can't now hear anything
<wallyworld_> mgz: you dropped out and came back
<TheMue> jam: I lost you. I can see you're in, but muted and no video.
<jam> TheMue: you're frozen for me as well, I'll reconnect
<TheMue> jam: could it be that I'm not allowed to write into the team calendar?
<jam> TheMue: fixed
<TheMue> thx
<jam> dimitern: fwereade: https://codereview.appspot.com/96600043 adds a caching layer to the srvRoot code so that all facades are now cached
<dimitern> jam, looking
<dimitern> jam, LGTM
<jam> hx
<jam> thx
<dimitern> TheMue, fwereade, standup?
<jam> dimitern: we lost you after "I almost"
<frankban> hi all, is anyone available for reviewing https://codereview.appspot.com/92610045 ? thanks!
<perrito666> good morning everyone
<jam> vladk|offline: we're done for now, don't worry about coming back
<fwereade> frankban, LGTM
<frankban> fwereade: thank you!
<jam> fwereade: so for modelling "environ tag" as part of the data we store in the .jenv, we currently have a split of EnvironInfo.SetAPIEndpoint() and EnvironInfo.SetAPICredentials()
<jam> I'm tempted to just put EnvironUUID into APIEndpoint information
<jam> Which currently holds Addresses and CACert
<jam> but seems a decent fit
<fwereade> jam, +1
<jam> k, thanks for confirming it was sane
<jam> now I just have to figure out the spaghetti mess to figure out how to hook it together... :)
<jam> perrito666: wwitzel3  is natefinch gone today?
<perrito666> jam: natefinch and wallyworld_
<perrito666> aghh
<perrito666> wwitzel3:
<perrito666> holiday
<jam1> ah right memorial day
 * perrito666 goes again
<perrito666> jam1: nate and wwitzel3 and voidspace are on holiday today
<bodie_> greetings
<jcw4> bodie_: o/
<jcw4> fwereade: I'm trying to gather my thoughts to ask you some questions about cleanup.go
<jcw4> fwereade: are you going to be around for a bit?
<fwereade> jcw4, let's talk now
<fwereade> jcw4, if you can?
<fwereade> jcw4, otherwise we can do the one in 80m
<fwereade> jcw4, I have a meeting in 20m but if I can save you a bit of time I would be delighted
<jcw4> fwereade: I've started working on cleanupDeadUnit
<jcw4> fwereade: and it's unclear if it should be called in unit destroyOps
<jcw4> fwereade: or in a new *Ops on the unit?
<jcw4> fwereade: seems there should be some hook for EnsureDead
<jcw4> fwereade: to call some *Ops fn
<fwereade> jcw4, hmm, so, Destroy will go from alive to either dying or removed
<fwereade> jcw4, if it's in fast-forward mode, we should add a cleanup there in destroy
<jcw4> fwereade: I see... so EnsureDead is not guaranteed to run... makes sense.
<fwereade> jcw4, wondering whether it makes more sense to tack it onto the remove ops
<jcw4> fwereade: okay, so just add it as another op right after the cleanupDyingUnit...
<fwereade> jcw4, the cleanups would not be guaranteed to run before the unit was removed anyway
<fwereade> jcw4, not sure I follow you there
<jcw4> fwereade: unit.go:325
<fwereade> jcw4, nah, that'll run while the unit is still dying and not necessarily dead
<jcw4> fwereade: ok. that's what I was stumbling against.  so removeOps... line 367...
<jcw4> I see
<fwereade> jcw4, I'd add it in Service.unitRemoveOps
<jcw4> fwereade: okay.. that makes sense
<fwereade> jcw4, sorry, removeUnitOps
<fwereade> jcw4, cool
<jcw4> fwereade: I'll start looking there and ping you with further questions as needed
<bodie_> looks like the dirt in xeipuuv/gojsonschema is in sigu-399/gojsonreference
<bodie_> I need to make sure there's not more in gojsonschema itself, but gojsonpointer looks pretty OK if not terribly beautifully written
<bodie_> mgz, you around?
<bodie_> :)
<jcw4> fwereade: If we cleanup actions in removeOps it seems that we could miss actions added while the unit is Dying...
<perrito666> hey, what does it mean when godeps yields "blah is not clean"
<perrito666> where blah is a dependency path
<fwereade> jcw4, I don't think so?
<jam> fwereade: should we be versioning the Admin api and not changing Login without a version bump?
<fwereade> jam, ha, yes, good point
<fwereade> jam, excellent testbed
<jam> ... :(
<jam> true, though more work for me
<fwereade> jam, we will probably notice if login fails
<jam> fwereade: well, it is our entry point where we were going to tack on the compat stuff
<jam> so it is a bit hard to get right
<jam> fwereade: namely old servers will happily pay no attention to a "Version: 2" in the Admin login request
<jam> is that good/ok ?
<jam> (as in, you pass V2, but you just get login v1)
<jam> well, v0 at least
<fwereade> jam, it's not ideal, but I think it's inescapable... trying to figure out the worst that could happen
<jam> fwereade: well, we can just "Login", 2 and have it just work but not give us back the actual v2 of the call.
<bodie_> okay this gojsonreference module is weird
<bodie_> heh, discarding errors in the test....
<bodie_> let's see how much of gojsonschema is broken once this is unbroken
<bodie_> okay, implemented some much better testing in gojsonschema
<bodie_> I'm pretty sure they just had some dumb typos, but it looks like there might be a problem with their implementation of gojsonpointer regarding URL scheme
<bodie_> (fwereade, mgz, rogpeppe)
<bodie_> sorry, implemented testing in gojsonreference, not gojsonschema
<bodie_> the uglier of the dependencies
<jcw4> bodie_: o/
<bodie_> :)
<jcw4> bodie_: you haven't pushed back up to github yet?
<bodie_> I have, actually.  it's under a dev branch
<jcw4> bodie_: I see now.. tx
<bodie_> pushed up latest bits
<bodie_> oy, just noticed Go playground is more verbose about errors than my vim plugin.
<bodie_> that's very annoying.
<jcw4> oh?
<bodie_> http://play.golang.org/p/dXHaRCeB_l
<bodie_> mine was giving me something like "expression expected"
<jcw4> ah, but not the actual syntax error "unexpected comma..."
<bodie_> right
<bodie_> urg.... found the really ugly bits of this
<hazmat> jam, if you expect an echo/assert of the version in the response the client could handle appropriately
<hazmat> with absence -> v0
<thumper> fwereade: around maybe?
<jimmiebtlr> Any hints as to where in the code I would find a machine tag being calculated, or how to get a machine tag from an id?
<jcw4> jimmiebtlr: names/machine.go ?
<fwereade> thumper, heyhey
<thumper> fwereade: got some time to chat?
<fwereade> thumper, 5 mins?
<jcw4> fwereade: me next if possible ;)
<thumper> fwereade: as in "in 5 minutes" or "only have 5 minutes" ?
<jcw4> (after thumper )
<fwereade> thumper, sorry, in 5 minutes
<fwereade> jcw4, sure
<thumper> fwereade: that's fine
<jimmiebtlr> thanks
<jcw4> jimmiebtlr: yw
<jcw4> fwereade: I know it's late there... I'll just post my question and you can respond at your convenience
<jcw4> fwereade: https://codereview.appspot.com/92630043 is the WIP branch
<jcw4> fwereade: all good except the cleanup test TestCleanupEnvironmentServices is now failing because somehow the actions cleanup is getting queued but not run
<jcw4> fwereade: when I explicitly call Unit.Remove() and then Service.Cleanup() it works...
<jcw4> fwereade: and AFAICT cleanupUnitsForDyingServices *should* call unit.Destroy() and trigger the actions cleanup, but I *think* the unit is gone before that line of code gets run
<jcw4> fwereade: http://paste.ubuntu.com/7524594/
<jcw4> fwereade: I see that cleanupUnitsForDyingServices only processes Units that are Alive...
<fwereade> jcw4, it's ok, I think: you clean up the environment services, but those cleanups schedule more cleanups because some units got removed; you then need to clean up *again* before there are no cleanups left
<jcw4> fwereade, I think that's what I just came to
<fwereade> jcw4, unit.Destroy will *queue* the actions cleanup but not run it
<jcw4> I'm busy adding one more assertCleanupsRun to the test + comments explaining
<fwereade> jcw4, assertCleanupCount is what I introduced last branch for exactly that reason wrt dying-unit cleanups
<fwereade> jcw4, comments explaining why it's used will generally be appreciated, indeed
<jcw4> fwereade: Okay, I'll plan on using assertCleanupCount too to make sure we're not succeeding accidentally
<fwereade> jcw4, cool
<jcw4> fwereade: tx
<jcw4> fwereade: btw... do you have involvement in the ubuntu sprint in your neck of the woods this week?
<jcw4> fwereade: I just saw niemeyer's comment on G+ and thought it was an interesting coincidence :)
<fwereade> jcw4, not really, but I will surely pop down into sliema to say hi
<fwereade> jcw4, it's client stuff really
<jcw4> fwereade: i see :)
<jcw4> fwereade: tests passing now.. .I think I'll do a real lbox propose now
<fwereade> jcw4, cool, I need to do tidying up and stuff now, if I still have energy when I'm done I'll pass by and see what I can do
<jcw4> fwereade: thx, no pressure :)
<jcw4> fwereade, fwiw https://codereview.appspot.com/92630043/
<waigani> why does this test not run: http://pastebin.ubuntu.com/7524918 ?
<waigani> it is not inside juju-core, just an exercise to understand the testing setup
<wallyworld_> waigani: you need to register the test
<wallyworld_> func Test(t *testing.T) {
<wallyworld_> 	gc.TestingT(t)
<wallyworld_> }
<wallyworld_> not that exact test name, i mean, but you need that hook set up to run with gocheck
<davecheney> wallyworld_: is ppc passing in ci yet ?
<wallyworld_> davecheney: yes and no. intermittent mongo failures, not related to ppc http://juju-ci.vapour.ws:8080/job/walk-unit-tests-ppc64el-trusty-devel/
<wallyworld_> there's a good run of blue there
<davecheney> awesome
<davecheney> do you think the mongo failures are because mongo on ppc is not well tested ?
<wallyworld_> davecheney: it fails on amd64 too. mongo is not my favourite db, let's put it that way to stay polite
<waigani> wallyworld_: Thank you! so that is how gc stitches up to the standard go testing package?
<wallyworld_> yep
<wallyworld_> a lot of our packages have a package_test.go in them - that's the convention we use since there is often more than one test file
<wallyworld_> and we only want to register once per package
#juju-dev 2014-05-27
<hazmat> umm.. https://bugs.launchpad.net/juju-core/+bug/1256053/comments/3
<hazmat> i think we need to mark this critical https://bugs.launchpad.net/juju-core/+bug/1215579
<_mup_> Bug #1215579: Address changes should be propagated to relations <addressability> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1215579>
<hazmat> i'm working up a script to resolve it, but even transporting the script to all the units is a bit tedious without resorting to ansible and a juju inventory
<hazmat> posted a workaround to both, nevermind re critical..
<davecheney> hazmat: creating a worker running on each machine agent that listens to netlink should be straightforward
<davecheney> what hook would it fire ?
<hazmat> davecheney, the user's issue isn't actually the one in that bug
<hazmat> davecheney, the netlink this is pure optimization
<hazmat> davecheney, the actual issue is that juju doesn't update the relations per bug http://pad.lv/1215579
<hazmat> davecheney, ie. you can have a unit related to another service whose ip address changes. juju will pick up the address change. but the private-address of that unit in its relations (which juju set) will never be updated.
<hazmat> s/netlink this/netlink thing
<hazmat> davecheney, fwiw.. here's my workaround for folks today.. https://gist.github.com/kapilt/a61efcb4eaef9e685397
<wallyworld_> hazmat: i'll get tanzanite to start working on that bug after the githib migration; likely early next week
<hazmat> wallyworld_, awesome, thanks
<wallyworld_> maybe even this week if we have bandwidth
<wallyworld_> hazmat: stuff like that, just ensure i know about it and we'll prioritise it
<hazmat> wallyworld_, will do.
 * hazmat calls it a night
<axw> morning all
<wallyworld_> axw: morning
<wallyworld_> axw: i've got agreement to set up a test instance running mongo 2.7.1 to see if that helps with the mongo instability
<wallyworld_> well, mongo > 2.4.x
<wallyworld_> 2.7.1 is the latest
<axw> wallyworld_: cool
<wallyworld_> it's compiling now on the instance, will create a jenkins job
<wallyworld_> axw: because in one of the replicaset test failures i looked at yesterday, the mongo process just died for no reason and hence broke the test. there are several mongo bugs which might be the culprit, fixed in 2.6 or later but not being backported
<wallyworld_> so if this goes well, we'll look to get mongo 2.6/7 into juju-mongodb
<axw> wallyworld_: did it actually die (core?) or did it just drop the connection?
<axw> wallyworld_: meant to ask on the hangout, but forgot
<wallyworld_> axw: the log showed the mongo process itself died
<axw> ok
<wallyworld_> and the test tried to reset and ping and failed cause the process was gone
 * wallyworld_ taps fingers waiting for mongo to finish compiling
<davecheney> thumper: http://paste.ubuntu.com/7525844/
<davecheney> what am I not seeing ?
<thumper> davecheney: whitespace maybe?
<thumper> davecheney: menn0 and I suggest doing a full print prior to the assert
<thumper> just to check
<menn0> 	err := st.users.Find(bson.D{
<menn0> 		{"_id", bson.RegEx{
<menn0> 			Pattern: "^" + regexp.QuoteMeta(name) + "$",
<menn0> 			Options: "i", // case-insensitive
<menn0> 		}}}).One(udoc)
<menn0> 	if err == mgo.ErrNotFound {
<menn0> 		err = errors.NotFoundf("user %q", name)
<menn0> 	}
<menn0> 	return err
<menn0> }
<menn0> sorry, ignore that
<davecheney> thumper: full print ?
<menn0> davecheney: fmt.Printf("%q", tw.Log)
<davecheney> kk
<davecheney> hmm, that just makes things more confusing
 * davecheney goes to look at js.LogMatches
<davecheney> grr
<davecheney> this is a bug with the matcher
<davecheney> http://paste.ubuntu.com/7525911/
<davecheney> when times didn't match, the matcher returns false but keeps the reason to itself
<davecheney> has anyone tried to write a gocheck checker that checks the output of another checker
<davecheney> oh and the checker type is not exported
 * davecheney throws a table
<axw> davecheney: gocheck.Not does that
<davecheney> axw: can I use that to write a test of a checker ?
<axw> davecheney: sorry maybe I misunderstood. I don't know how you'd do that
<davecheney> axw: s'ok, i'm working on something
<jcw4> if state/state.go has an un-exported logger, can I use that in state/action.go, or should I declare a new logger with a different name to avoid a name collision?
<jcw4> thumper: since it's midday likely for you, and you just emailed about errors... ^^^ :)
<jam1> jcw4: you can use logger in any file in the same package
<axw> jcw4: all source files in the same package can/should use the same logger, as a general rule
<jam1> so just use "logger.Debugf" in action.cog
<jcw4> jam1, axw great.  Thanks
<jimmiebtlr> Is it expected that the output of add-machine is to Stderr?  ex. "created machine 1"
<wallyworld_> axw: with the replicaset stuff - we can back it out if it doesn't help, or, if 2.7.1 works better we can change it; also, william likes the idea of mocking out the replicaset behaviour and adding separate CI tests so we'll be looking into that as well
<axw> wallyworld_: sounds good
<jcw4> jimmiebtlr: I assume so... sorry. I don't know
<jcw4> I can't seem to get -gocheck.f="ActionSuite"
<jcw4> to work
<jcw4> is there a trick to it, or does it just not work?
<jcw4> jam1: ^^  :)
<jam1> jcw4: you're sure it is spelled "ActionSuite" and not "actionSuite" or some other spelling?
<jcw4> jam1: interesting... yep, just verified
<jam1> go test -gocheck.v -gocheck.f=apiclient does what I would expect
<jcw4> is it somehow related to not having the testing package initialize in that suite?
<jam1> jcw4: are you trying to run the tests recursively? or just in one dir?
<jcw4> go test -gocheck.f="ActionSuite" ./state/...
<jam1> jcw4: if you don't have a testing.T function in a package, then no tests get run by "go test"
<jam1> jcw4: well, I'm sure that syntax won't work, because you are passing the "..." at the end
<menn0> jcw4: which directory are you running "go test" from?
<jcw4> jam1: I see
<jam1> and go test is a bit picky about argument order
<jcw4> menn0: juju-core
<jcw4> what is the preferred argument order jam1
<jam1> jcw4: at the least, you need to spell it "go test ./state/... -gocheck.f=ActionSuite"
<jcw4> jam1: Ah!  I see
<jam1> the issue is that there are some arguments that are processed by "go test"
<menn0> try changing to the package directory where ActionSuite is defined
<jcw4> menn0: +1
<jam1> and there are other arguments that are processed by the binary that it builds and runs for you
<jam1> "..." is telling *go test* to run everything
<jam1> "-gocheck.f" is being passed to the binary that gets built to tell it what to run
<axw> I believe the -gocheck args don't get passed down if you run use "..."
<jam1> since "go test" itself doesn't understand what "-gocheck" means, it stops processing arguments and hands the rest to the binary being run
<axw> davecheney: right? ^
<jam1> axw: it does, you just have to spell it right
<jcw4> axw, jam1, menn0 fantastic help.  Thanks
<jam1> axw: I originally thought it was all broken because "-gocheck.v" looks like it doesn't work, but actually it does, but without "-v" "go test" in recursive mode suppresses the output
<jam1> so you have to do:
<jam1> go test -v ./... -gocheck.v -gocheck.f=XXX
<jam1> axw: ^^
<jam1> axw: and you can't quite do that in juju-core because the 3rd party packages under utils don't use gocheck, and thus fail if you try to pass them those flags
<axw> hmm ok
<axw> gocheck.v isn't doing anything when I use ...
<jam1> what sucks is that passing the test binary positional arguments
<jam1> doesn't fail
<jam1> but it does fail if you pass named options
<jam1> (ah consistency....)
<jcw4> axw, jam1, menn0 yep it worked (when I didn't use ..., and reordered gocheck opts to the end)
<axw> jam1: ah right, missed the need for -v
<davecheney> jcw4: axw correct
<davecheney> those arguments should go at the end of the command line
<jam1> axw: yeah, so did I for a *long* time.
<davecheney> there is no reason gocheck couldn't adjust gocheck.v based on the test.v value which is better handled by the go tool
<davecheney> does anyone have experience with writing gocheck checkers ?
<davecheney> this is doing my head in
<jam1> davecheney: I've done some
<davecheney> jam1: can you explain how the Check signature works ?
<davecheney> Check(params []interface{}, names []string) (result bool, error string)
<davecheney> names appears unused
<davecheney> error is a string
<jam1> if you are writing a checker, you should generally not put anything into error
<davecheney> result appears to indicate success or failure
<davecheney> jam1: why not ?
<jam1> because it indicates there is actually something you cannot test
<jam1> davecheney: if you return an error it gets reported immediately, and breaks things like Not()
<jam1> because Not(error) is still error
<davecheney> ok
<jam1> while Not(resultFalse) is still true
<davecheney> this all started because
<jam1> s/still/
<jam1> davecheney: so IIRC, names can be used as an output variable, but you don't have to do anything with it as input
<davecheney> http://paste.ubuntu.com/7525844/
<davecheney> but names is passed as a value
<davecheney> there is no way I could append to names inside the function
<davecheney> and have that visible to the caller
<jam1> davecheney: you may not be able to append, but I think you can change in place
<davecheney> true
<davecheney> ok
<jam1> I believe that is how I made slightly nicer error messages by replacing params in place (so when you get an array of 100 things, you can report the 1 that is actually wrong, and hide the rest)
<davecheney> ok
<davecheney> that helps me understand what is going on
 * jam1 really dislikes the in/out parameters, but that seems to be how gocheck works
<davecheney> gocheck is ok as an assertion framework, but writing checkers is ass backwards
<jam1> I was trying to figure out how to give nicer errors, but I can't just return something into "error" because that is a forceful failure
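[editor's note] The in/out calling convention discussed above can be sketched standalone. The real Checker interface lives in gocheck (gopkg.in/check.v1); contentEquals and the exact reporting behaviour here are illustrative assumptions, but the sketch shows the two points jam1 makes: mutate params/names in place for nicer reports, and reserve the error string for cases where the checker cannot be applied at all.

```go
package main

import "fmt"

// contentEquals follows the Check(params, names) (bool, string)
// shape discussed above. Both slices are in/out: elements may be
// rewritten in place to improve the failure report, but appending
// would be invisible to the caller.
func contentEquals(params []interface{}, names []string) (result bool, errMsg string) {
	if len(params) != 2 || len(names) != 2 {
		// A non-empty error string means "checker could not be
		// applied" and is reported immediately (it breaks Not()),
		// so reserve it for cases like this.
		return false, "contentEquals takes 2 arguments"
	}
	obtained := fmt.Sprintf("%v", params[0])
	expected := fmt.Sprintf("%v", params[1])
	if obtained == expected {
		return true, ""
	}
	// Rewrite in place so the framework reports the normalised
	// strings under friendlier names.
	params[0], params[1] = obtained, expected
	names[0], names[1] = "obtained", "expected"
	// Plain mismatch: false with an empty error composes with Not().
	return false, ""
}

func main() {
	params := []interface{}{41, 42}
	names := []string{"a", "b"}
	ok, _ := contentEquals(params, names)
	fmt.Println(ok, names[0], names[1]) // false obtained expected
}
```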
<jam1> davecheney: fwiw, I think the actual failure from your paste is that the LogLevel is wrong, but it isn't being shown
<davecheney> checked that
<davecheney> [{"INFO" "audit" "audit_test.go" ')' "2014-05-27 11:52:08.804353055 +1000 EST" "user-agnus: donut eaten, 7 donut(s) remain"}]
<davecheney> expected
<davecheney> obtained
<davecheney> ["INFO user-agnus: donut eaten, 7 donut(s) remain"]
<jam1> ah, ffs, the match is a regex
<jam1> and (s) doesn't match \(s\)
<davecheney> if you excuse me
<davecheney> i have to go outside and shout
<davecheney> well, on the positive side
<davecheney> i've got a branch which improves the coverage of the LogMatches to 100%
 * jam1 wonders if we should have a fallback "regex did not match, but exact match did" error message.
<jam1> :)
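[editor's note] The failure mode jam1 spotted above, a literal "(s)" silently treated as a regex group, is easy to demonstrate with the standard library; the helper names here are illustrative, not juju's LogMatches implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesRaw compiles pattern as-is, so regex metacharacters like
// "(s)" become a capture group matching the letter "s".
func matchesRaw(pattern, line string) bool {
	return regexp.MustCompile(pattern).MatchString(line)
}

// matchesQuoted escapes the pattern first, so it matches the
// literal text, parentheses included.
func matchesQuoted(pattern, line string) bool {
	return regexp.MustCompile(regexp.QuoteMeta(pattern)).MatchString(line)
}

func main() {
	line := "user-agnus: donut eaten, 7 donut(s) remain"
	pattern := "donut eaten, 7 donut(s) remain"
	fmt.Println(matchesRaw(pattern, line))    // false: "(s)" matched as a group
	fmt.Println(matchesQuoted(pattern, line)) // true: "\(s\)" matches literally
	fmt.Println(regexp.QuoteMeta("donut(s)")) // donut\(s\)
}
```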
<fwereade> axw, hey, re https://launchpad.net/bugs/1215579
<_mup_> Bug #1215579: Address changes should be propagated to relations <addressability> <reliability> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1215579>
<axw> fwereade: yo
<fwereade> axw, please let me know what you're planning to do there
<axw> fwereade: just trying to figure it out right now...
<axw> fwereade: much bigger than I realised :)
<fwereade> axw, indeed, it's in a positive thicket of subtleties
<axw> fwereade: I was thinking a new worker, relationuniter, which watches for address changes on a machine and then updates the private-address setting for all relation units on that machine
<fwereade> axw, the main issues are: (1) we have to have a hook for it, otherwise units can't respond to their *own* changes of address
<fwereade> axw, -100
<fwereade> axw, a worker that keeps an eye on address changes is one thing
<fwereade> axw, fucking around with relations outside the purview of the uniter will totally fuck us
<axw> ok
<fwereade> axw, one issue is that they use the much-maligned state.Settings, which is only safe for writes from a single client
<fwereade> axw, and that client is the uniter
<axw> fwereade: yeah, I modified it to be safe in that regard
<axw> with a loop that checks for concurrent changes... but I hadn't considered units changing the address themselves
<fwereade> axw, but even if that were fixed, the units (1) have to know what their addresses are and (2) have to be able to reject private addresses they don't like
<fwereade> axw, and this only becomes more critical as we build up the networking model
<fwereade> axw, (re (2): proxy charms)
<axw> ok. I think I lack enough knowledge in this area that I'm going to be dangerous. I'm going to unassign myself for now
<fwereade> axw, well, it demands care and forethought, but I wasn't trying to actively scare you off...
<axw> I will keep looking at it, I just don't want anyone to get their hopes up :)
<fwereade> axw, remind me about what you did to settings?
<fwereade> niemeyer, morning, welcome to malta, sorry I haven't said hi before now
<niemeyer> fwereade: Heya, thanks!
<axw> fwereade: nothing in the tree yet. I made a RelationUnit.UpdateSettings method; extracted the operation building from Settings.Write (into WriteOps), and had UpdateSettings do that with an assertUnchanged in a loop
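[editor's note] The approach axw describes (build the write ops, assert the settings are unchanged, retry on concurrent modification) can be sketched generically. Everything below — the store type, ErrChanged, the version counter — is a hypothetical stand-in for juju's mongo-backed state, not its actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrChanged stands in for a failed txn assertion in the real code.
var ErrChanged = errors.New("settings changed concurrently")

// store is a toy versioned settings document.
type store struct {
	version int
	data    map[string]string
}

// read returns a snapshot of the data plus the version it was read at.
func (s *store) read() (map[string]string, int) {
	snapshot := make(map[string]string, len(s.data))
	for k, v := range s.data {
		snapshot[k] = v
	}
	return snapshot, s.version
}

// writeIfUnchanged applies updates only if version still matches:
// the moral equivalent of building write ops with an assertUnchanged.
func (s *store) writeIfUnchanged(version int, updates map[string]string) error {
	if s.version != version {
		return ErrChanged
	}
	for k, v := range updates {
		s.data[k] = v
	}
	s.version++
	return nil
}

// updateSettings retries the read-modify-write until it lands.
func updateSettings(s *store, updates map[string]string, maxRetries int) error {
	for i := 0; i < maxRetries; i++ {
		_, version := s.read()
		switch err := s.writeIfUnchanged(version, updates); err {
		case nil:
			return nil
		case ErrChanged:
			continue // someone else wrote first; re-read and retry
		default:
			return err
		}
	}
	return errors.New("too much contention")
}

func main() {
	s := &store{data: map[string]string{"private-address": "10.0.0.1"}}
	if err := updateSettings(s, map[string]string{"private-address": "10.0.0.2"}, 3); err != nil {
		panic(err)
	}
	fmt.Println(s.data["private-address"]) // 10.0.0.2
}
```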
<jimmiebtlr> should lbox propose should create a code review on rietveld?
<axw> jimmiebtlr: yes, that's its purpose
<jimmiebtlr> ok, I got the error "User does not have sufficient permissions to edit the bug task milestone." and no prompt for credentials for it
<fwereade> axw, ah, excellent -- there are other problems with settings, too, and if we can find an excuse to address them we should do so (because at last, with the db not exposed, we can do so sanely)
<axw> jimmiebtlr: try without -bug
<jimmiebtlr> ok
<fwereade> axw, are you free for a hangout in a few minutes?
<axw> fwereade: sure, let me just make a cup of tea
<jimmiebtlr> axw:  That did it, thanks
<axw> jimmiebtlr: no worries. if you go to the bug, you can associate the branch manually (top right, "Link to a related branch")
<axw> fwereade: ready when you are, shall I start one?
<fwereade> axw, please
<davecheney> jimmiebtlr: how come your merge proposals don't have code review links ?
<jimmiebtlr> When I do propose with -bug set, it gives permissions error and never seems to make it to the rietveld creation
<jimmiebtlr> I think I have to first do it without setting bug, then go into launchpad and set it through the gui.
<jimmiebtlr> Sorry about the number of emails, trying to figure the tools out
<jcw4> fwereade: https://codereview.appspot.com/92630043/ -- updated to address your feedback
<jcw4> fwereade: should I lbox propose again, or wait til you've reviewed the code?
<jcw4> fwereade: I suppose I have to lbox propose again to pick up the newly edited changes...
<fwereade> jcw4, lbox propose again please
<jcw4> fwereade: incoming....
 * jcw4 is EOD... cya later
<jam1> fwereade: I've put together a patch to change Login to accept and return EnvironTags, and to have those values saved in the cache and passed in on demand.
<jam1> It does not do it with a v2 of the Login api
<jam1> because I haven't gotten to the client side of that stuff yet, and because my other stuff hasn't landed anyway.
<jam1> You're welcome to say "I only want this in a v2" and it goes into WIP until I can do it.
<jam1> but I figure the code as it stands is reasonable to review.
<fwereade> jam1, cheers
<jam1> fwereade: https://codereview.appspot.com/101760046
<fwereade> jcw4, btw, if you have draft comments waiting when you lbox propose again, it will publish them alongside the PTAL message
<jam1> fwereade: so the post is up, an email sent to the CI guys, and I'm off for about 2 hours to go be entertained by what my son is doing in school. bbiab
<fwereade> jam1, take care, have fun
<voidspace> morning all
<fwereade> voidspace, heyhey
<jam1> fwereade: thanks for the review
<jam1> fwereade: one other thing I wanted to run by you, while working on this, I discovered that only JujuConnSuite and cmd-restore use NewAPIConn, and I'd like to kill it with fire
<jam1> it is the one that returns a Conn that has both a State and an Environ
<jam1> but nobody actually needs that, and JujuConnSuite does bad things with it.
<jam1> (JujuConnSuite.APIInfo() is implemented in terms of APIConn.Environ.StateInfo(), for examples of bad behavior)
<fwereade> jam1, awesome! has long been an ambition of mine, glad we're at the point we can do it
<fwereade> jam1, +100
<jam1> great, I thought you felt that way, and it got in the way of something I wanted to do :)
<fwereade> jam1, https://codereview.appspot.com/100460045/ LGTM
<jam1> fwereade: thanks for getting to it, I know its a bit big
<fwereade> jam1, np; need to pop out for a while myself; would you take a look at casey's https://codereview.appspot.com/100400043/ if you get a moment please? I feel like I've lost perspective on it
<jam1> k
<axw> fwereade: sorry, won't be able to chat about relation address things again today. Tomorrow hopefully, if that suits you
 * axw bbl
<voidspace> so bzr blame says the code I'm interested in was committed by Andrew in revision 2636
<voidspace> but launchpad (juju-core) says that revision 2636 was by nate and doesn't contain this code
<voidspace> will bzr blame give the revision from the "original branch" rather than the revision it was merged onto trunk?
<mgz> voidspace: are you sure it's not 2636.something.something?
<voidspace> mgz: it's 2636.4.2
<voidspace> mgz: what does the 4.2 indicate?
<voidspace> mgz: "bzr help blame" says "revision, author and date"
<mgz> okay, so then what you want is `bzr log -c mainline:2636.4.2`
<voidspace> mgz: ah, cool
<voidspace> mgz: thanks
<voidspace> that tells me what I need
<mgz> the dotted revnos are changes off the mainline, (Andrew started from Nate's revision)
<voidspace> mgz: ah
<voidspace> now I need to see if I can find the original MP on launchpad, and any tickets closed as a result
<voidspace> but the commit message from "bzr log" has a link to the CL, which is handy
<mgz> voidspace: should be linked from the... right
<voidspace> the CL has a link to the MP
<voidspace> gets me a step closer :-)
<voidspace> fwereade: I tested the "immutable syslog port" with local provider and it works fine
<voidspace> fwereade: running a test against a canonistack HA deployment before I merge, just for sanity
<voidspace> wallyworld_: ping, you still around?
<bloodearnest> heya all, anyone got any more details on LP #1307434 ? We're hitting the same issue using mgo in a different project
<_mup_> Bug #1307434: talking to mongo can fail with "TCP i/o timeout" <cloud-installer> <performance> <juju-core:Triaged> <https://launchpad.net/bugs/1307434>
<wallyworld_> voidspace: i'm here, but eating dinner, can ping you soon
<rogpeppe> jam: ping
<jam1> rogpeppe: pong
<voidspace> wallyworld_: cool, thanks :-)
<rogpeppe> jam1: i was just looking at this https://codereview.appspot.com/101760046/ and a slightly different approach occurred to me, which i thought i might run by you
<rogpeppe> jam: because i've been thinking about multi-tenant state servers
<rogpeppe> jam1: specifically, i was wondering about, rather than specifying the environment in login, we could specify it in the URL for the API
<jam1> so we could, but we don't currently connect to a URL in as many letters. The information we store is a host:port combination, which we turn into a URL
<jam1> so yes, with a but
<rogpeppe> jam1: that means that *all* the URLs for an environment (including charm putting, logs, etc) can be accessed with respect to an environment
<rogpeppe> jam1: yeah, you'd need to know the environment uuid when you call api.Open
<rogpeppe> jam1: but you probably want that anyway (because nothing much calls Login explicitly)
<jam1> which I am migrating towards, but all existing .jenv files, etc don't know the UUID
<jam1> so I think at the least we still need to get that data into the files
<rogpeppe> jam1: you're migrating towards calling Login explicitly?
<jam1> no, towards passing EnvironTag/UUID into api.Open via the api.Info struct
<rogpeppe> jam1: that sounds great
<rogpeppe> jam1: i guess the question is just what happens under the hood in api.Open
<jam1> *today* it is easier to do it compatibly via Login, if only because I've already written that code :)
<jam1> but also, changing all the URLs we use is going to involve a whole different layer (the HTTP mux)
<rogpeppe> jam1: yeah, it will (but it's work we do want to do some time anyway)
<jam1> but I also think what I've done doesn't preclude us from doing that in the future.
<rogpeppe> jam1: i guess it doesn't matter if we pass the UUID in Login too
<rogpeppe> jam1: when the UUID is in the url, it will just be a tiny bit of extra validation
<rogpeppe> jam1: i guess if we're explicit that the UUID in Login is *only* for validation, not for selection, then we'll be ok for backward compatibility too
<rogpeppe> jam1: BTW, do we now generate the UUID and store it in the .jenv file at bootstrap time?
<jam1> not yet
<wallyworld_> voidspace: back now. late dinner tonight after soccer. what can i do you for?
<mgz> wallyworld_: hi!
<wallyworld_> heya
<voidspace> wallyworld_: hey
<voidspace> wallyworld_: so I'd like to re-enable mongo replica sets for local provider
<wallyworld_> great :-)
<voidspace> wallyworld_: apparently there was a problem with them previously, which I'm led to believe you reported
<wallyworld_> i can test since it broke for me
<voidspace> wallyworld_: I'm struggling to find the issue though
<wallyworld_> but
<voidspace> we obviously need to work out why it was broken for you and fix it
<wallyworld_> i upgraded my kernel today and that broke cgroups and hence local :-(
<voidspace> hah
<wallyworld_> i'll need to downgrade
<voidspace> ouch
<wallyworld_> it broke for me because mongo just didn't start
<wallyworld_> can't recall the exact error
<wallyworld_> but the upstart job failed to start mongo with --repl or whatever the option was
<wallyworld_> but i can just test it again to see if it works
<wallyworld_> i think one other person also had the issue
<wallyworld_> not sure who now
<voidspace> wallyworld_: ok, thanks
<wallyworld_> voidspace: i can do that tomorrow after i downgrade my kernel, is that ok?
<voidspace> wallyworld_: I'll create a branch with HA always on (just delete the code that switches it off) and you can try it
<voidspace> wallyworld_: tomorrow is fine, thanks
<wallyworld_> i have some other stuff i need to try and get done tonight
<voidspace> cool
<wallyworld_> i'll let you know how i go
<voidspace> wallyworld_: I'll push a branch and email you a link
<voidspace> wallyworld_: tomorrow is fine
<wallyworld_> ok
<wallyworld_> ta
<dimitern> jam1, fwereade, standup?
<jam1> dimitern: sorry about that, brt
<wwitzel3> hello
<wwitzel3> need a reboot, brb
<fwereade> dimitern, with you in a mo
<fwereade> dimitern, don't wait
<wallyworld_> mgz: how's the github stuff? did you get to talk to rick or you don't need to now?
<mgz> wallyworld_: I didn't need anything from him yesterday, will probably bug both him and curtis today about the remaining small things
<wallyworld_> ok
<wallyworld_> mgz: if you talk to curtis make sure he is ok with the timeframe or else we'll need to delay
<mgz> yeah
<wallyworld_> mgz: you on track to get the contributing doc finished up today (with everything else you have to do)?
<mgz> yup
<wallyworld_> great :-)
<mgz> thanks for the comments from you and andrew
<wallyworld_> np, hope they helped
<wallyworld_> i'd like to email it out soon so will do so when you finalise it
<perrito666> morning
<wwitzel3> perrito666: morning
<voidspace> perrito666: morning
<voidspace> wwitzel3: morning
<dimitern> fwereade, jam1, vladk - the goamz branch I'd love a review on https://codereview.appspot.com/100780048
<voidspace> right, lunch time
 * voidspace lurches
<perrito666> voidspace: uh, thats true, medicine time, bbl
<fwereade> jam1, replied to bug instead of you, sorry
<dimitern> niemeyer, hey
<niemeyer> dimitern: Heya
<dimitern> niemeyer, I have another small goamz branch, if you can take a look https://codereview.appspot.com/100780048
<niemeyer> fwereade: When are you coming?
<niemeyer> dimitern: Cool
<fwereade> niemeyer, heh, haven't really figured it out yet -- I was thinking mid-afternoon on wed, thu, or fri would be pretty convenient, though, because I can get a lift most of the way there
<fwereade> niemeyer, do you have something particular in mind that I should be at?
<fwereade> niemeyer, because, you know, there's nothing stopping me at any other time
<fwereade> niemeyer, (well, I have a washing machine being delivered in a couple of hours, so today ain't great, but still)
<niemeyer> fwereade: Nope, nothing specific.. might just be nice to take the chance that 100+ co-workers are around to interact a bit
<fwereade> niemeyer, yeah, I just don't want to show up and end up hanging around being useless ;)
<fwereade> niemeyer, I'd been thinking at least that I could come down and see some people in the evening on one (or more?) of those days
<perrito666> niemeyer: well you are trying to make a geek choose between a new gadget and 100+ people... that is unfair
<fwereade> niemeyer, how are your plans looking?
<niemeyer> fwereade: I would take the chance to be here most of the time, to be honest, even if you're just working on your usual tasks during meetings
<niemeyer> fwereade: There's open space to hack, and just being around to talk to people over breaks is worth it
<niemeyer> fwereade: That's how I would feel if people would so close to me, either way.. :)
<niemeyer> s/would so/were so/
<fwereade> niemeyer, that's very true, I guess I'd kinda assumed that hacking space might not be so available
<hazmat> mgz, re gh please don't transfer history how gui did it (squashing branch commits into mainline history)
<mgz> hazmat: yeah, we'll keep all the commits, even though git doesn't handle them quite as neatly
<hazmat> mgz, well more to the point.. goal would be not polluting main branch history with every branch's individual commit as linear line.
<hazmat> which is what gui ended up doing
<mgz> hazmat: ...that sounds like the opposite of what I understood your first comment as
<hazmat> mgz, squashing was a bad word choice, s/polluting would have been better
<hazmat> mgz, ie. doing a git log on trunk should see merge commit messages, not every branch's individual commit messages.
<mgz> hazmat: it may be something to raise on the mailing list then, the current planned import does no squashing
<hazmat> mgz, hmm.. actually i think i might be off track here.. git log --oneline --graph shows what i would expect, just the default git log shows minor commits on feature branches by default
<hazmat> yeah.. git log --oneline --first-parent is the equivalent of bzr log
<hazmat> er. bzr log --line
<mgz> yeah, it's an issue with the git tooling
<mgz> git itself does the dag okay
<jam> dimitern: https://codereview.appspot.com/100780048/ reviewed
<jam> mgz: just that most git tools think first parent isn't relevant and sort by sha1 for ordering rather than the actual recorded merge parent ordering.
<dimitern> jam, thanks!
<dimitern> niemeyer, I've updated the branch after fwereade and jam reviewed it, I just need your review :) https://codereview.appspot.com/100780048/
<niemeyer> dimitern: Okay, I cannot promise to be timely there
<dimitern> niemeyer, ok, when you can, thanks
<niemeyer> dimitern: Or just go ahead and merge it.. I trust John and William
<dimitern> niemeyer, thanks!
<dimitern> niemeyer, it's just for testing, no public-facing logic has changed
<niemeyer> dimitern: Super, thank you
<jam> dimitern: I think you can simplify the "known" changes if you go back to a bool map.
<jam> since you can just do && idMap[id]
<jam> sort of stuff
<alexisb> welcome gsamfira !
<gsamfira> hey alexisb :)
<dimitern> jam, oh, sorry I forgot about this
<dimitern> jam, but I'll fix it next time i need to make changes, it's not a big deal
<jam1> cmars: I reviewed https://codereview.appspot.com/100400043/
<wwitzel3> perrito666: where'd you go? :)
<cmars> thanks jam1. I've also become concerned over compatibility issues with clients -- including older juju clients. Having second thoughts about making ecdsa the default.
<jam> cmars: for juju clients, we only really need to be compat with 1.18
<cmars> jam1, i'm considering leaving the support for switching to ecdsa in this proposal, issuing separate certs for the state server & mongo, but leaving the default RSA
<jam> cmars: I really like the idea of splitting out the certs, as sharing the certs has always bothered me
<jcw4> fwereade: "Checking the logs would be ideal" ... how does one check the logs in a test?
<jam> jcw4: you can look for test that use LogMatches
<jcw4> jam: tx!
<jcastro> http://askubuntu.com/questions/472849/destroy-a-juju-service-and-also-its-associated-machine
<jcastro> would it be a good idea to have a --destroy-the-machine-too flag for destroy-service?
<fwereade> jcastro, it surely would, even as default behaviour, but I can't remember whether we have it scheduled for this cycle
<jcastro> yeah I remember talking about it
<fwereade> jam, do you recall what heading we had it under?
<jcastro> should I file a bug?
<fwereade> jam, I have a feeling it might be one of the tanzanite ones
<fwereade> jcastro, if you can't find one, yeah, go for it
<jcastro> https://bugs.launchpad.net/juju-core/+bug/1299034
<_mup_> Bug #1299034: garbage collect/remove destroyed service machines by default <destroy-service> <improvement> <juju-core:Triaged> <https://launchpad.net/bugs/1299034>
<perrito666> natefinch: wwitzel3 voidspace heh, sorry that seemed easier to explain on code that over the phone :)
<perrito666> and in that process I found out the solution, :)
<natefinch> awesome
<perrito666> happens, I usually explain things to my dog to do this
<perrito666> but he rarely answers
<natefinch> fwereade: you around for the ensure availability talk?
<voidspace> gah, "syslog immutable" branch test failures
<voidspace> one spurious, one at least looks genuine
<voidspace> re-running locally
<voidspace> just one
<voidspace> I really thought I'd run all tests
<voidspace> *sigh*
<fwereade> natefinch, tasdomas: sorry, drama, guys came to deliver the washing machine, decided to try to do it from the road without a permit, much excitement
<fwereade> natefinch, tasdomas: joining
<victor13> hola jodedores
<perrito666> now that was a bit rude
<perrito666> and poorly translated
<natefinch> perrito666: what is jodedores?
<natefinch> google translate is failing
<perrito666> a rather literal translation of f****ers
<natefinch> haha
<voidspace> juju seems to have got itself into a tizzy and I can't use the local provider
<voidspace> "ERROR cannot use 37017 as state port, already in use"
<voidspace> netstat doesn't show anything using that port though
<voidspace> rebooting to fix...
<voidspace> yay that seems to have fixed it
<natefinch> voidspace: yeah, that happens to me once in a while
<voidspace> heh
<voidspace> natefinch: making syslog-port immutable has landed (I had to fix a failing test first)
<voidspace> natefinch: and I've created a branch with replica set enable for local provider, for wallyworld_ to try out
<voidspace> natefinch: so next is looking at replica sets for tests
<voidspace> natefinch: you said there was *one place* to make the change, can you easily find out where that is or should I go spelunking?
<voidspace> natefinch: MgoInstance run method presumably
<voidspace> which is what starts mongo for the MgoSuite
<voidspace> hmmm... just adding "--replSet juju" is not looking *too* promising
<voidspace> tests appear to have hung
<voidspace> we'll see if they timeout in ten minutes (did we ever change that timeout?)
<voidspace> Panic: no reachable servers
<voidspace> *sigh*
<voidspace> two minute timeout
<voidspace> for two failures
<voidspace> so maybe one minute
<voidspace> 23 passed though
<jcw4> fwereade: https://codereview.appspot.com/92630043/
<jcw4> mgz: I should be pinging you on these too right? ^^
<sinzui> natefinch, do you have a minute to review https://codereview.appspot.com/98630044
<sinzui> ^ perrito666
<sebas5384> hey! :)
<sebas5384> there's any plans for hardware constraints work for juju-local (lxc) ?
<sebas5384> since cgroups help you with that already
<natefinch> I'm not sure what you mean about hardware constraints on local.  Isn't it, by definition, always running on your local hardware?
<natefinch> sinzui: I can review now
<natefinch> heh, perrito666 beat me
<sebas5384> natefinch: linux containers can be constrained, like cpu, mem, etc..
<sebas5384> https://bugs.launchpad.net/juju-core/+bug/1242783
<_mup_> Bug #1242783: containers should use constraints to configure cgroups/kvm values <constraints> <local-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1242783>
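The constraints-to-cgroups mapping the bug describes would ultimately emit container config along these lines (LXC 1.x `lxc.container.conf` syntax; the values here are made up for illustration):

```
# Hypothetical constraint-derived limits in an LXC container config:
lxc.cgroup.memory.limit_in_bytes = 536870912
lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 512
```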
<perrito666> natefinch: I am fast :p
<natefinch> sebas5384: ahh, hmm, interesting... that never occurred to me
<sebas5384> natefinch: that feature would be awesome!
<sebas5384> :)
<natefinch> sebas5384: juju is open source - write it ;)
<sebas5384> since now we can deploy an ubuntu charm, or mysql, etc...
<sebas5384> heheh http://sorisomail.com/img/127989758237.jpg
<perrito666> natefinch: wanted to contribute something today before my cold kills me
<sebas5384> natefinch: there's some guide to newbies to contribute?
<sebas5384> i would like to at least see if I can contribute :P
<natefinch> sebas5384: http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/CONTRIBUTING
<natefinch> not really a newbie guide, just a guide.  We should probably work up something a little easier to swallow
<perrito666> there are more specific docs in the /doc folder
<sebas5384> natefinch: yeah that would be nice, thanks, i will read it :)
<alexisb> natefinch, ping
<natefinch> alexisb: hey hey
<alexisb> do you remember who on the core team was assigned to shadow hazmat in the TOSCA effort?
<alexisb> was it you?
<alexisb> natefinch, ^^^
<natefinch> alexisb: I remember, it was me :)
<alexisb> ok I am going to invite you to tomorrow morning's call even though I think it is a bad time for you
<natefinch> alexisb: ok
<jcw4> sebas5384: feel free to ping bodie_ or me with questions too.. we recently started contributing so we may have some fresh memories of any issues
<natefinch> jcw4: thanks, that's a huge help... for people who have been doing this for a long time, we may have forgotten a lot of the setup steps we now take for granted
<natefinch> alexisb: I guess I'll go read up on TOSCA then ;)
<jcw4> the *right* thing to do would be to capture each question we get and make an FAQ
<natefinch> jcw4: make FAQ.txt and slap it in the /doc directory, or even in the root directory for that matter.
<jcw4> natefinch: will do
<natefinch> jcw4: put it in the root directory, then make the first FAQ "where is the dev documentation?" and point to CONTRIBUTING and /doc  :)
<jcw4> natefinch: lol!
<jcw4> natefinch: +1
<jcw4> okay sebas5384 ... I'm ready for the questions.. :)
<sebas5384> nice!!! jcw4 thank you!
<jcw4> sebas5384: :)
<sebas5384> jcw4: I started doing some reconnaissance to work on the bug
<sebas5384> http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/provider/local/environ.go#L314
<sebas5384> so i found this already
<perrito666> rogpeppe: are you around by any chance?
<jcw4> sebas5384: looks like constraints for lxc containers are lower priority now though (well medium instead of high)
<sebas5384> yeah jcw4 but i'm planning to use it
<sebas5384> because i'm going with lxc containers way :)
<jcw4> sebas5384: cool, I'm interested to see how that goes
<sebas5384> yeah! and will be my first contribute :)
<sebas5384> jcw4: whats your environment for developing?
<jcw4> trusty
<jcw4> vim
<sebas5384> bzr hehe
<sebas5384> never used
<jcw4> sebas5384: apparently if you wait a week you'll be able to use git
<jcw4> sebas5384: when I started I was gung-ho to move to github (still favour that)
<sebas5384> they are going to move to git?
<jcw4> sebas5384: but I've gotten to like some things about bzr and especially lbox
<jcw4> sebas5384: I believe so
<sebas5384> jcw4: sweet!
<natefinch> yep... moving to github thursday evening, I believe
<sebas5384> yeah i imagine, every tool has some features that the other one doesn't
<sebas5384> yeah!! \o/
<sebas5384> thats nice
<jcw4> https://lists.ubuntu.com/archives/juju-dev/2014-May/002459.html
<sebas5384> juju-gui is over there already
<jcw4> sebas5384: natefinch next FAQ : mailing lists...
<jcw4> Although that's in CONTRIBUTING already I think
<jcw4> nope...
<sebas5384> didn't see about that
 * jcw4 rubs his hands and chuckles... 
<sebas5384> jcw4: i have to build a fresh install of trusty in a vbox
<sebas5384> using mac here :(
<jcw4> I was able to do some things on a mac directly, but yeah ubuntu is certainly the main dev env
<jcw4> sebas5384: if you don't mind paying a little bit, digital ocean has decent prices on vm's, and I was doing quite well on an AWS $100/month vm
<jcw4> sebas5384: but I ended up paving over a laptop of mine with Trusty
<hazmat> natefinch, ping if you have questions re tosca.. i'm diving back in myself..
<natefinch> hazmat: thanks.  I looked at the website for like a minute last time, so... basically just need to read more in-depth
<natefinch> sebas5384: the tests are kind of the major problem with developing on a mac.  most will run, but I'm sure a bunch will fail.  And definitely the local provider won't work on OSX
<perrito666> sebas5384: well, except for the compile time, which can be a bit heavy, working on a VM works quite well on a mac
<hazmat> natefinch, you want this https://www.oasis-open.org/committees/download.php/52954/TOSCA-Simple-Profile-YAML-v1.0-wd02-Rev-05.docx
<hazmat> natefinch, just go to the docs section from the home page.. https://www.oasis-open.org/committees/documents.php?wg_abbrev=tosca
 * perrito666 did the smallest change and broke a lot... such is my luck
<sebas5384> perrito666: yeah! i will bootup a vbox with vagrant
<sinzui> natefinch, is the landing bot ill?
<sebas5384> i'm doing that right now
<natefinch> sinzui: uh, no idea
<natefinch>  hazmat thanks
<hazmat> natefinch, how's your python?
<hazmat> natefinch, most of the code for this (and all of ostack bits) is in python.. there are two repos of interest
<natefinch> hazmat: my python is ok.  I'm a little rusty, and never wrote a ton of it, but it's generally pretty easy to read as long as they're not trying to be too crazy
<hazmat> my take on the spec is https://github.com/kapilt/juju-tosca  its doing some auto gen of types/classes.  some of the other tosca guys have been working on this https://github.com/spzala/heat-translator
<jcw4> natefinch: FAQ: https://codereview.appspot.com/98650043
<natefinch> I love that the name of the thing is "OASIS Tosca YAML Simple Profile 1.0"
<natefinch> it sounds so.... Standards-y
<jcw4> especially the word Simple in a six word title
<natefinch> Right?  Especially because Tosca is an acronym already
<natefinch> OASIS Topology and Orchestration Specification for Cloud Applications YAML Simple Profile 1.0
<jcw4> haha, and OASIS too presumably?
<jcw4> and YAML
<natefinch> Organization for the Advancement of Structured Information Standards
<alexisb> alrighty all I am off for the afternoon doing community work, be back online later this evening to check in
<natefinch> alexisb: have fun
 * jcw4 leaves for a couple hours
<bodie_> jcw4, you around?
<bodie_> if you're not too busy, I could use a second brain for a minute
<natefinch> does that work? 'Cause I need a second brain like... always
<perrito666> natefinch: sure, amazon ec2
<perrito666> you just add more brains
<bodie_> it's kind of like networked storage
<bodie_> you need good protocols and error handling
<bodie_> you can usually assume things might not make their way from sender to receiver in one piece
<bodie_> etc
<natefinch> heh
<sebas5384> hehe
<sebas5384> BOA - brain oriented architecture
<sebas5384> and like always, as a service ... xD
<perrito666> the usage of addr:port as a string instead of two separate entities has made my life difficult many times these days
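Go's standard library already handles the fiddly part of splitting and joining addr:port strings; `net.SplitHostPort` copes with IPv6 literals (which contain colons), which is one reason storing host and port separately is less error-prone:

```go
package main

import (
	"fmt"
	"net"
)

// split wraps net.SplitHostPort: unlike a naive strings.Split on
// ":", it copes with IPv6 literals, which contain colons.
func split(hostport string) (host, port string) {
	host, port, err := net.SplitHostPort(hostport)
	if err != nil {
		panic(err)
	}
	return host, port
}

// join re-brackets IPv6 hosts as needed.
func join(host, port string) string {
	return net.JoinHostPort(host, port)
}

func main() {
	host, port := split("[::1]:37017")
	fmt.Println(host, port)           // ::1 37017
	fmt.Println(join("::1", "37017")) // [::1]:37017
}
```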
<waigani> fwereade: --define ?
<perrito666> sweet, fwereade now accepts flags?
<waigani> lol
<waigani> I wonder what the output would be
<bodie_> fwereade --help
<bodie_> lol
<menn0> I think he also takes --review and --architect-something
<bodie_> heh
<wallyworld_> sinzui: hi, did you have time for a quick catchup?
<sinzui> I do
<wallyworld_> i'll start a hangout
<wallyworld_> https://plus.google.com/hangouts/_/gwsrc7uen7yzc2m4y5nxqxyhnea?hl=en
<thumper> morning folks
<thumper> sinzui: morning
<sinzui> hi thumper. my fix for the win installer is playing now. I hope to see a pass for all juju http://juju-ci.vapour.ws:8080/
<thumper> sinzui: so we are good?
<thumper> sinzui: I also sent a reply to the list
<thumper> sinzui: and then noticed that you had already committed a fix for the win installer
<sinzui> thumper, probably. Everything That needed to know about the version appears to. I just need to see a pass for everything
<thumper> sinzui: I think we should have a note in the version.go file to say "if you change this, please remember the windows installer file"
<sinzui> thumper, or a test.
<thumper> a test would be good
<sinzui> I learned from natefinch that updating one always means updating the other :). my first two releases failed because of that
<mfoord> signing off (not been here for a while but forget to say goodnight)
<mfoord> so goodnight...
<fwereade> waigani, thumper, hey, I'm here for a bit
<waigani> fwereade: hello
<fwereade> waigani, thumper, have I been telling you to do stupid things?
<waigani> fwereade: I was wondering what you thought of $ juju switch --define
<thumper> fwereade: I have 7 minutes before my call with cmars
<thumper> fwereade: can we hangout?
<fwereade> waigani, ah-- hmm, I'm not sure anyone's *defining* anything
<fwereade> thumper, sure
<waigani> fwereade: okay, worth a shot
<thumper> --define?
<waigani> thumper: fwereade does not like --env-info
<thumper> what's --env-info?
<waigani> the flag to dump all the info on the environ
<waigani> $ juju switch --env-info
<waigani> then it gives you user, endpoints etc
<waigani> its all in my branch
<waigani> ANYWAY
<waigani> just looking for a better name
<sinzui> thumper, juju doesn't support -beta1. Every deploy test fails
<sinzui> http://juju-ci.vapour.ws:8080/job/aws-deploy/1244/console
<sinzui> ^ one example
<thumper> what?
 * thumper looks
 * thumper wonders why tests didn't catch this
<thumper> sinzui: shall we roll back the version stuff? and I'll take more of a look at it when I have some time
<sinzui> thumper, might be best if you don't see an obvious fix
<thumper> sinzui: I don't have an obvious fix
<thumper> I say rollback
<sinzui> thumper, hold
<sinzui> thumper, in that log, the actual failure is 1.18 attempting to destroy an env
<sinzui> thumper, never mind. that succeeded, as did the proper clean up. 1.20-beta1 cannot bootstrap itself from streams
<sinzui> thumper, lets roll back
<sinzui> This will be a test that my changes properly support the old version
<thumper> cmars: around?
<cmars> thumper, hey
<cmars> thumper, i'm in the hangout
<thumper> coming
<thumper> cmars: I can hear you
<thumper> volume up?
<waigani> davecheney, thumper: menn0 and I did our stand-up. So the next call can just focus on tags discussion if you like.
<davecheney> waigani: cool
 * davecheney has many comments
<waigani> menn0: can you send me that email please?
<menn0> waigani: done
<waigani> menn0: thank you :)
#juju-dev 2014-05-28
<bodie_> nice, got my gojsonreference reimplementation built and passing tests
<bodie_> now need to get tests on gojsonschema passing again
<jcw4> bodie_: o/
 * thumper takes a big breath
 * thumper exhales
<waigani> thumper: davecheney: are we just using the standup channel?
<thumper> waigani: sure
<davecheney> yup, see you back in the hangout
<axw> morning all
<wallyworld_> morning
<perrito666> man looking at all the discussions that include menn0 and thumper in mailing lists is like reading a webcomic, we need to wait 12 hs between each episode
<davecheney> perrito666: welcome to canonical
<waigani> I've added some user docs and apparently I need to let evilnick know about them. What is the best way to do that?
<waigani> thumper: $ juju user detail foobar; #output: username: foobar
<waigani> that is one useful cmd!
<thumper> waigani: exactly, which is why we want display name :)
<thumper> and later... groups and stuff
<thumper> waigani: we probably also want 'date created' etc, and other items not yet in the document
<waigani> thumper: i'll throw in anything i can find and we can then cut back
<waigani> except maybe password ;)
<thumper> waigani: add in some TODOs for when we modify the doc
<sinzui> thumper, I have deployed with the last built juju twice.
<thumper> and?
<sinzui> thumper, I don't see the issue universally reported by ci's tests
<sinzui> I also know that there were no stale jenv files because I reset all the JUJU_HOMEs an hour before
<thumper> sinzui: using upload tools?
<sinzui> never for cloud tests. only the local host is allowed to use upload tools
<sinzui> thumper, I am reading all the shell env hoping to find a 1.18 invocation of juju that could cause the problem
<thumper> sinzui: gee, wouldn't it have been useful to have a stack trace there...
<sinzui> yes it would. I wish I could trust juju not to reveal keys when in --debug
<sinzui> thumper, I was trying to get you such a stack trace when the bugger succeeded
<wallyworld_> davecheney: this makes me happy. http://juju-ci.vapour.ws:8080/job/walk-unit-tests-ppc64el-trusty-devel/    ppc tests more reliable than amd64 right now
<davecheney> wallyworld_: it makes me happy as well :)
<wallyworld_> amd64 failures are down to random io timeouts etc
<thumper> \o/
<thumper> congrats davecheney and wallyworld_
<davecheney> thumper: and axw
<davecheney> he saved our bacon
<wallyworld_> thumper: axw as well
<thumper> and axw :-)
<wallyworld_> he did a lot
<axw> winning
<axw> the bot is way happier lately too
<axw> glad the work has been useful
<davecheney> i noticed
<axw> every time I see a merge attempt email I have a heart attack, thinking there's going to be a new intermittent failure
<davecheney> axw: same
<davecheney> negative conditioning
<axw> just need to make sure everyone does this once in a while ;)
<thumper> axw: it has been incredibly valuable, kudos
<axw> davecheney wallyworld_: the ppc64 test instance is nailed up right? I wonder if that has anything to do with stability
<axw> as opposed to amd64 which is spun up/down each time
<wallyworld_> axw: yeah
<davecheney> axw: well, when maas works on ppc then maybe we can introduce that element of randomness
<davecheney> i can't say it's high on my list
<axw> heh :)
<sinzui> thumper, I am convinced there is a bug in the test code that ensures the right version of juju is bootstrapped.
<sinzui> thumper, please don't revert
<thumper> kk
<sinzui> thumper, I found the problem. I don't know how to fix it yet, but I was able to feed 1.20-beta1 into the test suite and find the bad method
<thumper> oh?
<davecheney> thumper: https://github.com/juju/testing/pull/4
<davecheney> *cough* when you have a sec
 * thumper looks
<wallyworld_> axw: was otp before. if i can get juju tests passing with mongo 2.6, initially that will be on a nailed up ec2 instance but then we could switch to an ephemeral instance and see how well it works
<axw> wallyworld_: okey dokey, sounds good
<wallyworld_> got some fundamental test failures to sort out. once root cause found, should then just work
<wallyworld_> i can bootstrap an env with mongo 2.6, just not get passing tests
<davecheney> wallyworld_: did it build properly with TLS
<davecheney> ?
<wallyworld_> yeah
<axw> bbs, school run
<davecheney> i've heard you have to sacrifice a newborn to make that happen
<wallyworld_> seemed to work ok
<davecheney> maybe you need to add more children
<wallyworld_> could bootstrap a system
<wallyworld_> test failure related to a permission problem with listDatabase() ops and such
<wallyworld_> bitch to debug remotely so building locally
<thumper> davecheney: I'm just about to go and have a coffee... but I'm curious, what was the problem with the code matching yesterday?
<menn0> thumper: it's pretty funny, in a not funny kind of way :)
<sinzui> thumper, I have a success. I will attempt to erase CI's recent memory of this revision to restart all the tests
<davecheney> thumper: SimpleMessage, the message isn't a string, it's a regex
<davecheney> because i had included parentheses in the matching text
<davecheney> sadface
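The failure mode davecheney hit is easy to reproduce with the stdlib: when a checker treats its expected message as a regex, parentheses in the message are consumed as group syntax and the literal text no longer matches itself until the metacharacters are escaped with `regexp.QuoteMeta`:

```go
package main

import (
	"fmt"
	"regexp"
)

// matchWhole anchors pattern and reports whether it matches all of
// text -- the way a checker that treats its message as a regex would.
func matchWhole(pattern, text string) bool {
	return regexp.MustCompile("^" + pattern + "$").MatchString(text)
}

// quote escapes regexp metacharacters so text matches literally.
func quote(text string) string {
	return regexp.QuoteMeta(text)
}

func main() {
	msg := "cannot start worker (no reachable servers)"
	// The parentheses are regexp metacharacters, so the raw message
	// used as its own pattern does NOT match itself...
	fmt.Println(matchWhole(msg, msg)) // false
	// ...until the metacharacters are escaped.
	fmt.Println(matchWhole(quote(msg), msg)) // true
}
```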
<thumper> davecheney: https://github.com/juju/errors/pull/2
<thumper> davecheney: ahh..
<thumper> axw: reflect.DeepEqual does work
<thumper> axw: it really just isn't what I tend to think of when wanting identity
<davecheney> thumper: if you are happy then I am happy
<thumper> perhaps I just have to accept it
<davecheney> i know you've explained to me a bunch of times
<davecheney> but I still don't understand what you need
<davecheney> or what it does
<davecheney> so if you're happy
<davecheney> then i'm happy
<thumper> davecheney: the tests pass, so I'm happy
<davecheney> case closed
<davecheney> thumper: i'll take quick peek at what reflect.DE does for interface types
<thumper> davecheney: if there is a short circuit success pass then I'm extra happy
<thumper> but in reality, this function isn't called all that often
<thumper> so it will never be any form of bottle-neck
<sinzui> thumper, deploy is fixed, bug upgrade has a panic https://bugs.launchpad.net/juju-core/+bug/1323937
<davecheney> thumper: it's even nicer than that
<davecheney> two secs
<thumper> sinzui: upgrade panics?
<thumper> sinzui: ah... new version
<thumper> arse
<thumper> damn...
<davecheney> http://golang.org/src/pkg/reflect/deepequal.go?s=3380:3419#L128
<thumper> sinzui: ok, I have a plan
<thumper> sinzui: we keep the code, but revert the version number to 1.19.3
<davecheney> reflect.ValueOf returns a reflect.Value that represents the concrete type of the interface value
<sinzui> thumper, the details imply simple streams...because 1.18.1 cannot even bootstrap when it discovers that version exists
<thumper> sinzui: we keep the behaviour of the numbers until 1.20.0 has been released
<sinzui> thumper, yeah. I was thinking the same
<thumper> sinzui: really? hmm...
<thumper> wallyworld_: thoughts?
<sinzui> thumper, I can only guess. I am tired
<thumper> wallyworld_: finding tools versions where versions aren't parsable by old code?
<wallyworld_> hmmm
<thumper> sinzui: I'll submit a patch to make the current version go back
<thumper> sinzui: because otherwise 1.18 clients will panic when getting a new version number it can't handle
<davecheney> so reflect.DE gets the underlying values, and comparest their types,
<wallyworld_> thumper: i think we need current-1 to be able to find tools for current to allow upgrades
<sinzui> thumper, could the - be the issue? could 1.20beta1 be acceptable?
<davecheney> which is what we were doing by snarfing in the first word of the interface struct
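What davecheney describes — `reflect.DeepEqual` unwrapping interface values to their concrete types before comparing, rather than snarfing interface words — can be seen directly:

```go
package main

import (
	"fmt"
	"reflect"
)

type thing struct{ name string }

// deepEqual is a thin wrapper to make the comparison explicit.
func deepEqual(a, b interface{}) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	var a, b interface{} = thing{"x"}, thing{"x"}
	var c interface{} = &thing{"x"}

	// DeepEqual compares the dynamic (concrete) types first, then
	// the values -- no peeking at the interface representation.
	fmt.Println(deepEqual(a, b)) // true: same type, same value
	fmt.Println(deepEqual(a, c)) // false: thing vs *thing

	// Pointers are followed, so two distinct pointers to equal
	// values are deeply equal.
	fmt.Println(deepEqual(&thing{"x"}, &thing{"x"})) // true
}
```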
<thumper> sinzui: this will be the cause of the upgrade failure
<thumper> sinzui: let me fix it
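The breakage being chased here — 1.18 agents choking on the "1.20-beta1" string — comes down to version parsing. Juju's real parsing lives in its version package; this is only a hypothetical sketch of a parser that tolerates a pre-release tag, illustrating what a strict x.y.z-only parser would reject:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion accepts "major.minor.patch" and "major.minor-tagN"
// forms. A strict parser that only accepts x.y.z (as an older
// client might) would reject "1.20-beta1" outright.
func parseVersion(s string) (major, minor int, tag string, err error) {
	rest := s
	if i := strings.Index(rest, "-"); i >= 0 {
		rest, tag = rest[:i], rest[i+1:]
	}
	parts := strings.SplitN(rest, ".", 3)
	if len(parts) < 2 {
		return 0, 0, "", fmt.Errorf("invalid version %q", s)
	}
	major, err = strconv.Atoi(parts[0])
	if err == nil {
		minor, err = strconv.Atoi(parts[1])
	}
	if err != nil {
		return 0, 0, "", fmt.Errorf("invalid version %q", s)
	}
	return major, minor, tag, nil
}

func main() {
	maj, min, tag, err := parseVersion("1.20-beta1")
	fmt.Println(maj, min, tag, err) // 1 20 beta1 <nil>
}
```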
<jimmiebtlr> It looks like any writes to command context (Infof, Verbosef) go to stderr, is this expected?
<thumper> jimmiebtlr: yes
<thumper> jimmiebtlr: some commands have strict format output that goes to stdout
<thumper> jimmiebtlr: we wanted to make sure that stdout stays parsable by tools expecting that strictness
<thumper> jimmiebtlr: so extra info goes to stderr
<jimmiebtlr> ok
<thumper> and to be consistent, we always send to stderr
<jimmiebtlr> so for instance from the add-machine command would the output "created machine X" be expected to output to stderr or stdout?
<thumper> jimmiebtlr: I'd expect it to go to stderr for the above reason
<jimmiebtlr> ok, thanks
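The convention thumper describes — strict, machine-readable results on stdout, human-oriented info on stderr — can be sketched like this (identifiers are made up; juju's command context wraps these writers itself):

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// report writes the machine-readable result to out and the
// human-oriented note to errOut: stdout stays strictly parsable,
// extra info such as "created machine 3" goes to stderr.
func report(out, errOut io.Writer, result, note string) {
	fmt.Fprintln(out, result)
	fmt.Fprintln(errOut, note)
}

// demo captures both streams, as a test would.
func demo() (stdout, stderr string) {
	var out, errOut strings.Builder
	report(&out, &errOut, "machine-3", "created machine 3")
	return out.String(), errOut.String()
}

func main() {
	report(os.Stdout, os.Stderr, "machine-3", "created machine 3")
}
```

Piping the command (`juju add-machine | some-tool`) then only ever sees the result line.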
<thumper> Rietveld: https://codereview.appspot.com/93570046
<thumper> wallyworld_, axw, sinzui, anyone... ^^^^
<thumper> fixes the broken upgrades of trunk
<wallyworld_> ok
<thumper> by reverting version from 1.20-beta1 back to 1.19.3
<axw> doh
<thumper> yeah...
<thumper> the 1.18 agents need to be able to parse the new version updated in config without dying :-)
<thumper> my boo-boo
<sinzui> thumper, when I release 1.20.0 next month, I will then set the version to 1.21-alpha1?
<thumper> sinzui: yes, or 1.21-dev1
<thumper> what ever flicks your switch
<thumper> alpha, bravo, charlie, delta, echo, ...
<thumper> finish with a foxtrot
<sinzui> albatros
<thumper> sinzui: you could change the naming scheme with every release :-)
<thumper> 1.21 - birds
<sinzui> I am off to bed. CI has actually just finished the last revision. the 1.18.x upgrade tests were true failures. canonistack's failure was swift related
<thumper> sinzui: thanks
<sinzui> oh, wallyworld_ I think I need to revert the ppc64el deb -> ppc64 tgz tool. This failure happened after removing support for the ppc64el tool http://juju-ci.vapour.ws:8080/job/local-deploy-trusty-ppc64/10/console
<wallyworld_> looking
<wallyworld_> sinzui: what's inside the deb? ++ wget -q http://juju-ci.vapour.ws:8080/job/publish-revision/lastSuccessfulBuild/artifact/juju-core_1.20-beta1-0ubuntu1~14.04.1~juju1_ppc64el.deb
<wallyworld_> ppc64el.tar.gz or ppc64.tar.gz
<wallyworld_> should be the latter for juju to find it
<wallyworld_> if i understand the issue correctly
 * bigjools wonders what thumper has been doing to get an icky taste in his mouth
<thumper> :P
<davecheney> bigjools: management
<bigjools> +1
<davecheney> bigjools: sorry, you set that one up, I just needed to take a run up
<bigjools> davecheney: quite all right
<menn0> i've been deep diving in to how upgrades work and I've got a handle on most things
<menn0> there's just one bit I'm not sure of
<menn0> has anyone got 5 mins to help?
<jam> fwereade: axw: are we meeting to discuss bug #1215579 ?
<_mup_> Bug #1215579: Address changes should be propagated to relations <addressability> <reliability> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1215579>
<jam> I don't see dimitern yet
<axw> jam: I was about to join, but I guess we should wait for dimiter
<jam> I'm there, but just to be around if someone joins
<jam> axw: fwereade is here now
<axw> okey dokey
<wwitzel3> fwereade: ping
<fwereade> wwitzel3, pong
<wwitzel3> fwereade: you have some time to discuss presence things?
<fwereade> wwitzel3, sure :)
<fwereade> wwitzel3, moonstone?
<wwitzel3> fwereade: k, let me grab my coffee and I'll be in moonstone.
 * fwereade does same
<vladk> jam, fwereade: networker API facade: https://codereview.appspot.com/98600047
<voidspace> morning all
<rogpeppe> voidspace: hiya
<voidspace> rogpeppe: hey roger
<rogpeppe> voidspace: how's tricks?
<voidspace> rogpeppe: not bad, although today's task is working out why turning on replica set for tests makes the "no reachable servers" problem about a million times worse
<voidspace> rogpeppe: so not much fun :-)
<voidspace> rogpeppe: how's jazz-land?
<wwitzel3> voidspace: morning
<voidspace> wwitzel3: hey, morning
<voidspace> wwitzel3: sleep any better?
<wwitzel3> voidspace: sadly no
<voidspace> wwitzel3: :-(
<voidspace> I thought being around this early wasn't a good sign
<voidspace> grabbing coffee, back in a minute
<wwitzel3> voidspace: I was able to get a nap in the afternoon yesterday for about an hour and I tossed and turned for about 5 hours this evening for maybe 2 hours of total sleep.
<wwitzel3> voidspace: I'll probably take a sleep aid tonight, so I don't become a zombie.
<voidspace> wwitzel3: yeah, sounds like a good idea :-/
<jam> fwereade: wrt https://code.launchpad.net/~jimmiebtlr/juju-core/add_machines_tag/+merge/221013 are we wanting to expose things as tags to the CLI?
<fwereade> jam, certainly not
<fwereade> jam, not sure where that came from
<jam> yeah, I didn't see an associated bug, and you've mentioned to me that we didn't want it
<fwereade> jam, we have a bug that debug-log uses tags (grarrrr)
<fwereade> jam, I assigned it to thumper but forgot to mention it to him, I should do that
<jam> I'm not personally all that happy with bare numbers for machines, but if we want to not use tags, then we shouldn't
<jam> vladk: I have some feedback for https://codereview.appspot.com/98600047/
<jam> I haven't reviewed everything, but I did give a bit of a "here are some stuff that we need to iterate on"
<jam> fwereade: for changing Login so that it goes via "https://host:port/ENVUID/api" sort of URLs. Would you be averse to using a different Mux? One that was path aware? (like github.com/gorilla/pat) ?
<jam> I'm not sure I like Pat, as it looks to be doing a separate global hash map for extra data
<jam> (It at least pulls in gorilla/context which does that exact thing)
<fwereade> jam, I don't have an opinion; am happy to follow your judgment on that
<jam> but it seems a bit of "we could roll our own" that would be less flexible in the long term
<jam> though our needs today are very modest
<fwereade> jam, I'm all for pulling in well-defined battle-tested solutions from elsewhere
<fwereade> jam, hand-hacking the perfect tool with zero fat is a beguiling approach but by no means necessarily optimal
<jam> fwereade: bringing in tons of dependencies and their own cruft isn't always great, either. but Mux seems like something there are a few possible solutions around.
<fwereade> jam, sure, some degree of taste and sanity needs to inform these decisions
<jam> fwereade: I found http://www.alexedwards.net/blog/a-mux-showdown though it is interesting that now Pat is gorilla/pat and imports gorilla/mux, so I'm not sure why to use Pat over Mux if we go that route
<jam> weird, apparently there is also https://github.com/bmizerany/pat
<jam> which seems a bit lighter
<jam> fwereade: for URLs, should it be UUID or Tag ?
<jam> eg: /environ-dead-beef/api/
<jam> or
<jam> eg: /dead-beef/api/
<jam> or possibly
<jam> eg: /environ/dead-beef/api/
<jam> it feels a bit like putting the environment- prefix in the URL isn't needed.
<jam> (I should be clear that the tag has the full environment- prefix, I always want to spell it with the short name)
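The URL shape jam and fwereade settle on — the bare UUID with no "environment-" prefix, as in `/dead-beef/api` — needs only a trivial path parse; a stdlib-only, purely illustrative sketch (no third-party mux):

```go
package main

import (
	"fmt"
	"strings"
)

// parseEnvPath extracts the environment UUID from a path of the
// form "/<uuid>/api" -- the bare UUID, without any "environment-"
// tag prefix.
func parseEnvPath(path string) (uuid string, ok bool) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 2 || parts[1] != "api" || parts[0] == "" {
		return "", false
	}
	return parts[0], true
}

func main() {
	uuid, ok := parseEnvPath("/dead-beef/api")
	fmt.Println(uuid, ok) // dead-beef true
	_, ok = parseEnvPath("/api")
	fmt.Println(ok) // false
}
```

A real server would register this behind a mux (gorilla/mux or bmizerany/pat, as discussed) rather than hand-parsing every path.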
<voidspace> axw: ping
<voidspace> if you're still around
<axw> voidspace: heya
<axw> I may disappear soon, but I am here now
<voidspace> axw: hey, you wrote the code in this CL
<voidspace> https://codereview.appspot.com/88350043/patch/40001/50005
 * axw hides
<voidspace> haha
<voidspace> no, it's all good
<voidspace> it includes the code that *doesn't* enable HA for local provider
<axw> right
<voidspace> the problem that necessitated that appears to have disappeared
<voidspace> so I'd like to just enable HA for local provider
<voidspace> shouldEnableHA includes the following comment
<voidspace> +// Eventually this should always be true, and ideally
<voidspace> +// it should be true before 1.20 is released or we'll
<voidspace> +// have more upgrade scenarios on our hands.
<voidspace> axw: as I'm going to remove shouldEnableHA *before* 1.20 (i.e. now)
<voidspace> is it correct that I don't have to consider any *additional* upgrade scenarios - the existing code will handle everything
<axw> voidspace: correct
<voidspace> axw: awesome :-)
<voidspace> thanks
<axw> voidspace: I assume you were able to reproduce the bug before, but can't now?
<voidspace> axw: wallyworld was able to reproduce before and can't now
<axw> ah great
<axw> sweeeet
<natefinch> voidspace: nice!
<voidspace> axw: so we'll enable it - and then I'll email juju list and warn people to try it
<voidspace> natefinch: yep, morning :-)
<natefinch> voidspace: morning :)
<axw> thanks, SGTM
<axw> very much
<voidspace> natefinch: enabling replica sets for tests seems to be more problematic, but I'll start digging into that once this mp is ready
<natefinch> voidspace: we might have to punt on that if it gets too big, but we should at least give it the old college try before giving up.
<voidspace> natefinch: yep
<voidspace> natefinch: although it will be annoying to have a production check for replica sets merely to accommodate the test environment
<natefinch> voidspace: well, I think we can do it a different way.  Just mock out a function in test or something.
<voidspace> natefinch: ah yes
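natefinch's "mock out a function in test" is the common Go seam of a package-level function variable that tests patch and restore; a hypothetical sketch (the names are invented, not juju's actual API):

```go
package main

import "fmt"

// replicaSetReady is a package-level function variable so tests can
// swap it out -- the kind of seam suggested above. The real
// implementation would query mongo; this stand-in is hypothetical.
var replicaSetReady = func() bool {
	return false // pretend production check
}

func ensureAvailability() string {
	if !replicaSetReady() {
		return "waiting for replica set"
	}
	return "ready"
}

func main() {
	fmt.Println(ensureAvailability()) // waiting for replica set

	// In a test, patch the variable and restore it afterwards:
	orig := replicaSetReady
	replicaSetReady = func() bool { return true }
	fmt.Println(ensureAvailability()) // ready
	replicaSetReady = orig
}
```

This keeps the production code free of test-only conditionals, which is exactly the annoyance voidspace wants to avoid.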
<fwereade> jam, I think the API can just be uuid
<fwereade> jam, sorry, the *URL* can just have the UUID
<jam> right
<jam> I understood you
<jam> dimitern: are you back ?
<jam> standup ?
<dimitern> jam, yes, brt
<jam> fwereade: are you coming to our standup today ?
<fwereade> jam, why not :)
<wwitzel3> fwereade: do you know the distinction between agent-state and agent-state-info? in both the cases of ensure-availability and status .. agent-state-info seems to have the correct information, even though agent-state is reporting "down".
<fwereade> wwitzel3, the SetStatus has a status "enum", and an info string, and we tacked on a map for more info later
<wwitzel3> fwereade: I'm wondering if having the AgentPresence check inspect agent-state-info for hints about the status (ie is it really down?) might be ok?
<fwereade> wwitzel3, the difficulty is that agent-state: down is hacked in in the status api, and we don't send the real value of agent-state-info down in status
<fwereade> wwitzel3, I don't *think* it would help
<wwitzel3> fwereade: ahh ok
<fwereade> wwitzel3, agent-state might be a better bet, though
<fwereade> wwitzel3, if it's never yet set a value, it's probably not there yet
<wwitzel3> fwereade: right, makes sense
<fwereade> wwitzel3, although even that is imperfect, because the pinger will start a bit before the machiner gets round to setting started
<wwitzel3> fwereade: good news is the rename worked just fine, compiled and tests pass
<fwereade> wwitzel3, *and* we should expect that the set of statuses, and when they're set, will change as well
<fwereade> wwitzel3, cool
<axw> fwereade: it's in a bit of a shoddy state, but I've got addresses flowing end to end now
<axw> fwereade: I modified AliveHookQueue to do something similar to changedPending- does that sound terrible?
<axw> fwereade: AliveHookQueue selects on another channel, and then sets a flag to say that it wants to send a relation-address-changed hook next
<voidspace> axw: that was quick! Thanks.
<axw> no worries
<voidspace> Nice when a CL is mostly code removal.
<axw> deleting code is one of my favourite things
<voidspace> :-)
<voidspace> "if you think writing code is awesome you should try deleting code!"
<axw> heh :)
<bodie_> :)
<bodie_> I deleted a bunch of code yesterday... felt great
<bodie_> I kind of get the appeal of dentistry now, too...
<voidspace> hah
<voidspace> I kinda like my teeth
<perrito666> morning
<voidspace> perrito666: morning
<wwitzel3> perrito666: morning
<voidspace> rebooting to get local provider working again :-/
<voidspace> ERROR cannot use 37017 as state port, already in use
<hazmat> voidspace, ps aux | grep mongo
<voidspace> hazmat: I had this yesterday, killed mongo - still got the same error
<hazmat> voidspace, do see port 37017 in output of sudo netstat -tulpn
<voidspace> hazmat: not after rebooting :-)
<voidspace> hazmat: I'll try next time, happened yesterday too
<hazmat> voidspace, i'd try juju destroy-environment --force  on it
<voidspace> thanks
<natefinch> yeah, there's a few things that aren't always cleaned up perfectly in the local provider.... we should probably do some work to try to make it more reliable, because it does tend to get stuck like that, whether it's mongo or lxc or whatever.   You know, when we have spare time ;)
<voidspace> natefinch: local provider still seems to work with my changes, and I have an LGTM
<voidspace> natefinch: so merging and switching to looking at tests after lunch
<natefinch> voidspace: awesome
<hazmat> voidspace, if its not working for you yet, i'm up for doing a g+ screenshare to debug
 * hazmat reads backlog and realizes its already  working
<hazmat> cool
<hazmat> rogpeppe, do you still have that api parser code handy? the one that extracted an api listing from the src?
<rogpeppe> hazmat: yes
<rogpeppe> hazmat: but recent changes mean that it's not going to work for much longer (assuming it does still)
<hazmat> rogpeppe, bummer
<rogpeppe> hazmat: the API is not longer statically determinable
<rogpeppe> s/not/no/
<hazmat> the refactoring that's been going on in api makes it much harder to determine what the full api is and identify the delta for client construction
<hazmat> i guess i should be using godoc instead of src.. ala http://godoc.org/launchpad.net/juju-core/state/api
<sinzui> jam, natefinch fwereade Can someone look into this regression, bug 1324110
<_mup_> Bug #1324110: go get cannot find github.com/juju/testing/logging <ci> <packaging> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1324110>
<jam> sinzui: I see github.com/juju/testing just fine, are you sure you are "go get" ing all the new dependencies?
<jam> people have been breaking out a lot of the core code into libs
<jam> ah, hmm... lo testing/logging
<jam> that I don't see
<sinzui> jam, CI runs this
<sinzui> go get -v -d launchpad.net/juju-core/...
<jam> sinzui: I think the bug is that something was removed and we didn't notice because go build leaves the old artifacts around
<sinzui> and it says github.com/juju/testing/logging is missing. I see testing/ but not testing/logging
<frankban> sinzui, jam: this can be related to my current change in https://github.com/juju/testing/pull/5 . I am also waiting on a review for https://codereview.appspot.com/92660046 . After that, I can fix the dependencies on core
<jam> sinzui: sure, I think it used to exist there, and it got moved, and some imports moved with it, but we missed this one because everyone running the test suite still had it from before it was moved
<jam> frankban: the issue is that trunk is currently already broken
<sinzui> jam, understood,
<jam> because it imports a module
<jam> and updated godeps
<jam> to point to one that didn't have it anymore
<jcw4> fwereade, mgz, natefinch : on a whim inspired by natefinch I added an FAQ yesterday.  Are 3 FAQ's worth including? I expect it to grow: https://codereview.appspot.com/98650043/
<bodie_> morning john
<jcw4> bodie_: o/
<jcw4> fwereade: I think this one is ready to go too: https://codereview.appspot.com/92630043/
<mgz> jcw4: well, that's useful info at least
<jam> frankban: can you either fix the import or revert the godeps so that trunk can build again?
<mgz> I kind of think it's stuff that should just be in HACKING rather than faq form
<jcw4> mgz: I'm open to whatever seems best
<mgz> jcw4: landing as is seems best for now, as I'm also rewriting things
<jam>  2783 Francesco Banconi	2014-05-23 [merge]
<jam>       [r=gz] Update the github.com/juju/testing dependency.
<jam> is probably the one that broke us
<mgz> easy to bring different bits together later
<jcw4> mgz: excellent
<mgz> jam: yeah, I've seen that
<mgz> I'll fix
<jam> mgz: should we change the bot to always do a cleanroom build? so stuff like this can't land
<mgz> maybe... I actually need the go get to work... which requires tip juju/testing to not be borked I think
<jam> I don't think it would add huge amounts of compile time overhead
<mgz> jam, well, my bot-test just broke on it
<jam> mgz: bzr merge -r 2783..2782
<mgz> so... EOW
<jam> End of Week ?
<mgz> Hopefully!
<jam> mgz: its only Wednesday :)
<jam> mgz: or you mean the github version will do cleanroom builds
<mgz> jam: right, git on jenkins is from scratch each time at present
<natefinch> jcw4: LGTM'd
<jcw4> natefinch: ta
<mgz> hm, go get really screws things
<mgz> if any import in any tip of a dep is changed, it will fail
<frankban> jam, mgz so I am not sure how to dupe that problem, but there are two places in the code which still try to import testing/logging
<frankban> one is golxc, and I have a fix here: https://codereview.appspot.com/92660046
<jam> frankban: rm -rf $GOPATH/pkg
<jam> then when you build again, it should fail
<frankban> the other is juju-core testing/testbase/log.go, which is easy to fix
<mgz> frankban: I'm looking at the second
<mgz> I guess we land the golxc one, then I can bump that dep as well
<mgz> will review
<mgz> currently I'm missing PatchEnvironment..
<frankban> mgz the second is something like http://pastebin.ubuntu.com/7536578/
<mgz> frankban: ta
<frankban> mgz: Once the golxc change lands, we should update the golxc and github/juju/testing dependencies in juju-core, apply the patch, and hopefully we are all using the new testing version
<mgz> frankban: lgtm
<frankban> mgz: landing it
<mgz> yeah, I'm doing that, go ahead and land and give me the sha1
<mgz> oh, it's actually a lp branch
<frankban> mgz: merged revno 9
<frankban> mgz: proposing a fix for juju-core
<mgz> frankban: I've got it
<mgz> ...when lbox responds, that is
<frankban> mgz: ok
<mgz> (I'll also need to fiddle with the landing bot anyway)
<frankban> mgz: yeah, we need to update both golxc and github/juju/testing, correct?
<mgz> frankban: yup. cr up... nearly, man I won't miss lbox
<mgz> downloading all of juju-core each time just to make a diff for cr is insane
<frankban> mgz: on the other hand, Rietveld is freaking good
<mgz> frankban: 92600046
<frankban> mgz: looking
<frankban> mgz: LGTM
<mgz> frankban: bot deps done, marking approved
<frankban> cool
<mgz> bot has picked it up
<natefinch> OMG, why does LibreOffice turn on line numbers by default on documents?
<jam> natefinch: Tools - Line Numbering set your own options?
<natefinch> jam: yeah, just seems like an insane default
<natefinch> jam: at first I assumed the document was corrupt or something...
<jam> natefinch: I just created a new Writer and it doesn't have line numbers on
<jam> are you sure you didn't change your setting some time in the past?
<natefinch> jam: possible I suppose.... though I honestly can't imagine ever turning on line numbers for a document like that.  Anything's possible though
<perrito666> natefinch: I cant remember opening *office since google docs exist
<natefinch> yeah, me either.   but, the TOSCA stuff is in a docx you need to download :/
<natefinch> which.... pretty much tells you all you need to know about TOSCA
<perrito666> such as "we still use ms word"
<natefinch> yep
<jam> natefinch: its docx, though, isn't that the "open" XML standard, right? ..... right?
<natefinch> it's a lot better than .doc yeah :)
<perrito666> yup, the one that has an open spec filled with references to closed specs :p
<natefinch> haha
<perrito666> it was a fun read
<mgz> wtf test failures
<frankban> :-/
<mgz> these are new to me
<mgz> why is it trying apt-get at all...
<frankban> mgz: I guess there is something wrong with the new LoggingCleanupSuite
<mgz> jam: any ideas on the failures for merge lp:~gz/juju-core/logging_dep_fixes ?
<mgz> frankban: I guess, it seems isolationy
<mgz> frankban: can you try to reproduce locally?
<frankban> mgz: duped, lots of failures
<mgz> k, poke me if you need help
<jam> mgz: apt exit status 100 seems like "something got really screwed up"
<jam> now I don't know why we are actually running apt-get anything on the bot, but you could just try again
<jam> http://askubuntu.com/questions/347830/how-can-i-get-a-verbose-apt-get-exit-code
<jam> mgz: "could not open lock file"
<jam> we shouldn't be root anyway
<mgz> jam: something bork-ish with isolation on the suite somehow I think, the log above shows it complaining about not being root
<mgz> (rightly, we don't run the tests as root as we're not insane)
<mgz> how we end up on a "must run apt-get" path is mysterious
<jam> mgz: are you reverting the version of juju/testing ?
 * jam wonders if we were getting "correct" isolation with the right version of juju/testing
<mgz> jam: probably should now, I tried going forward (as otherwise go get is broken)
<mgz> going backwards will at least unblock landings
<frankban> mgz: maybe I've found something
<mattyw> natefinch, if you have more HA work I'm available
<frankban> mgz: could you please try it again after applying this patch? http://pastebin.ubuntu.com/7536884/
<natefinch> mattyw: cool, in a meeting, talk to you in a bit
<frankban> mgz: all this Logging* stuff can be confusing
<mattyw> natefinch, no problem, whenever is good
<mgz> frankban: ha, I nearly did that, my bad, should have double checked when you posted the first one
<frankban> mgz: tests seems to work locally with that patch
<mgz> frankban: trying on the bot again
<frankban> fingers crossed
<frankban> mgz: merged \o/
<wwitzel3> cargo culting something and then working backwards on actually understanding it is a totally valid approach right?
<mgz> frankban: ace
<mgz> wwitzel3: provided part #2 happens :)
<frankban> mgz: thanks a lot for your help
<wwitzel3> mgz: right :)
<mgz> everyone: stuff should now be landable, when pulling trunk you'll want new golxc and testing packages
<wwitzel3> nice
<natefinch> wwitzel3, perrito666: finishing up last meeting, be there in a couple minutes
<perrito666> natefinch: ack
<wwitzel3> natefinch: sounds good, no one else is here yet anyway :P
<perrito666> wwitzel3: michael and I are
<perrito666> you might want to check where you are
<wwitzel3> moonstone?
<perrito666> wwitzel3: yup
<wwitzel3> perrito666: had to restart it, standard issue hangout
<voidspace> natefinch: you still in a meeting?
<natefinch> voidspace: coming
<dimitern> anyone willing to review a trivial fix for bug 1323263 ? https://codereview.appspot.com/101820044/
<_mup_> Bug #1323263: remove --exclude-network from deploy <deploy> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1323263>
<dimitern> natefinch, perrito666, wwitzel3, voidspace, ^^ ?
<voidspace> dimitern: looking
<dimitern> voidspace, cheers!
<perrito666> voidspace: natefinch ping me on irc if you need a hand loving mongo, bbiab going to lunch
<voidspace> dimitern: yep, looks good to me
<dimitern> voidspace, thanks
<voidspace> hmmm... the mgoDialTimeout is 60 seconds so I don't think it's that we're not waiting long enough
<perrito666> fwereade: the warning message is very fun
<fwereade> perrito666, :D
<perrito666> voidspace: any luck?
<voidspace> perrito666: specifying an explicit logpath seems to cause a tls connection failure!
<voidspace> ERROR: cannot read certificate file: /tmp/test-mgo234085341/server.pem error:02001002:system library:fopen:No such file or directory
<voidspace> which is new :-)
<perrito666> voidspace: yay
<voidspace> perrito666: I can tell the problem because it *does* create the logfile
<perrito666> race? permissions?
<voidspace> perrito666: it creates the log file fine so it can't be permissions I don't think
<voidspace> I'm going to remove the logfile options to see if those are *really* the cause
<voidspace> I suspect not
<jcw4> fwereade: thx
<voidspace> maybe they really are. seems to be back to failing for the old reasons.
<natefinch> back
<voidspace> natefinch: hey, hi
<voidspace> natefinch: so adding a "--logpath" parameter to mongod causes it to fail to start with "ERROR: cannot read certificate file"
<voidspace> I guess it logs to syslog by default
<voidspace> hmmm... maybe not
<perrito666> voidspace: grep on /var/log :p it must be there somewhere
<voidspace> perrito666: we're starting it without logging options - I think it logs to stdout by default
<voidspace> perrito666: so not necessarily
<perrito666> really?
<voidspace> well, not sure :-)
<natefinch> sorry, that back was premature.  Back now.
<voidspace> perrito666: but there are options to set logging to syslog and options to set it to a file
<voidspace> perrito666: there is no option to set it to standard out - but it's mentioned several times
<voidspace> perrito666: the implication being that it's the default if you don't specify a logpath or syslog
<voidspace> natefinch: I'm grabbing coffee
<voidspace> logging to syslog causes the tests to hang
<voidspace> this time logging to a file caused the error:
<voidspace>  ERROR: dbpath (/tmp/test-mgo975498476) does not exist.
<voidspace> ah, however maybe we *use* the standard output to tell when mongo has started
<voidspace> so diverting logging causes the tests to not know when mongo has started
<voidspace> yeah, we're capturing output
<sinzui> voidspace, I am watching this test. It failed, but I see a new set of debs being published that will run this test again http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/
<sinzui> voidspace, mongodb replication set issue on precise localhost http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/1349/console
<voidspace> sinzui: oh dear
<sinzui> voidspace, the test used mongodb-server 1:2.4.6-0ubuntu5~ctools1
<voidspace> dammit
<voidspace> I have to leave now
<sinzui> damn, voidspace The newest revision fails the same way http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/1350/console
<voidspace> I'm going out with wife and friends so can't postpone
<sinzui> voidspace, I will talk to the other devs
<voidspace> but that looks like a problem
<natefinch> voidspace: go go, we'll figure it out
<natefinch> (or not)
<voidspace> maybe we need to back out the local provider change :-/
<voidspace> needs testing with precise
<voidspace> natefinch: thanks, :-(
<voidspace> I have to go, so EOD
<voidspace> natefinch: if there's anything I need to do for the morning email me
<natefinch> voidspace: will do
<voidspace> natefinch: I could setup a precise VM to try it (in fact I probably have one)
<sinzui> natefinch, note that this issue happened first in 2804, which passed on trusty ppc64. It will be retested with 2806 soon
<alexisb> awesome, thanks cmars for the doc
<perrito666> mm launchpad just emailed me saying it could not email me a code review comment
<perrito666> anyone got one of those before?
<perrito666> btw, https://codereview.appspot.com/100810045 I will especially appreciate comments on how to improve tests
<perrito666> bbiab
<bodie_> fwereade, https://codereview.appspot.com/94540044/
<perrito666> fwereade: you overbritished me :p what is pointification
<fwereade> perrito666, speaking as if one were the pope :)
<perrito666> fwereade: well, the pope speaks spanish :p
<fwereade> perrito666, ("to speak in a pompous or dogmatic manner", anyway)
<perrito666> although I would definitely drop my jaw if I got such a network-savvy comment
<fwereade> haha
<fwereade> perrito666, that's not how VLANs work, my child
<perrito666> 22/TCP in "thou shall not pass"
<perrito666> fwereade: anyway I like your approach better
<fwereade> perrito666, I'm not quite sure I really like either of my suggestions -- I have a nasty feeling that the "best" one is an intractably tedious hassle to actually implement
<perrito666> fwereade: 2 is better than current
<fwereade> perrito666, the two-part one might be viable, but get someone else's input, it's late and I'm a bit tired
<perrito666> fwereade: well, the late night shift is about to kick in
<perrito666> and apparently I have a meeting tonight at 11PM
<fwereade> perrito666, yeah, I won;t be at that one I'm afraid
<fwereade> perrito666, all I'm really good for is reading devops borat, I should probably just go to bed
<perrito666> well, my current time zone makes me feel guilty to miss any of the 3 meeting shifts :p
<perrito666> oh man, now you just got me hooked on devops borat
<fwereade> When you are run DROP DATABASE you are have of many problem but Big Data is not of one of them.
<bodie_> lol
<bodie_> that was eminently quotable
<fwereade> menn0, waigani, thumper: was there anything I needed to talk to any of you about?
<thumper> fwereade: not entirely sure
<waigani> fwereade: just getting through emails now. Did we reach agreement on a flag?
<menn0> fwereade: I'm waiting a tiny bit of review feedback ...
 * menn0 looks up review
<fwereade> waigani, I like thumper's idea: make it a cmd-specific format
<waigani> so just $ juju switch --format
<fwereade> waigani, still not sure if it should be called "smart" or something else, but the default output format can be the original behaviour and we can get json/yaml as usual
<waigani> fwereade: ^
<fwereade> waigani, yeah
<waigani> okay, done
<fwereade> waigani, cool
<menn0> fwereade: never mind me. I just saw that you have responded (still getting through email)
<fwereade> menn0, cool, I thought I had :)
<waigani> menn0 I know the feeling - tooo many emails!
 * perrito666 notices fwereade setting the terrain to flee
<menn0> fwereade: I also have a tiny question about how one aspect of upgrades works but I don't know if you're the person for that
<menn0> waigani: the email volume here is tiny compared to my last job (500-1000 per day). working here has been a relief in that regard.
<waigani> yikes!
<perrito666> menn0: wow, that is so much mail to ignore
<waigani> did you work at a spam company?
<waigani> menn0: I think wallyworld knows a bit about upgrades
<perrito666> waigani: he worked at a company that processed dbus calls, by hand
<fwereade> menn0, I was just looking at them and I'm worried about the order in which we run the upgrade steps, so I might be able to answer
<menn0> no, just a company with a lot of automated systems that squawked via email when unhappy and a company culture where you got included in every email discussion
<menn0> fwereade: hangout?
<fwereade> menn0, sure, would you start one please? just grabbing a drink :)
<menn0> fwereade: sure. I think this will be quick if that helps :)
<perrito666> fwereade: I see you are setting the example
<fwereade> perrito666, ehh, not sure how laudable it is really ;p
<perrito666> fwereade: I said example, the word good was never involved
<menn0> fwereade: https://plus.google.com/hangouts/_/gutacrivw5fqrtq2mojvs5qjxma?authuser=1&hl=en-GB
<perrito666> fwereade: can I steal tomorrow a moment from you so we think trough the options for apiworker host? I like option 2 but I am not sure how much work is involved vs how much does it add against 1
<fwereade> perrito666, the big question is how easy it is to extract information about what machines the addresses are associated with from the doc in state, and I've completely forgotten its content
<fwereade> perrito666, if it's easy to get that info back, it's easy for the API server to tell that machine X wants to know state server addresses, hey machine X is a state server, let's tell it local host
<fwereade> perrito666, if it's hard, it might not be worth doing... but, hmm, even if it's not in that doc it should be pretty trivial to craft a query to reassemble that data with machine ids as part of it
<perrito666> fwereade: I am a bit sleepy, cold medicine has been doing that to me a lot, tomorrow I will definitely re-ping you regarding this
<fwereade> perrito666, ok, be prepared for a bit of spitting with rage -- not at you, but at the common.APIAddresser code, which is not set up for bulk calls
<fwereade> perrito666, because "who would ever need them"
<fwereade> perrito666, and yet it fucks us because we don't know who's asking without fiddling inappropriately with the auth object
<perrito666> fwereade: I noticed, I tried to reproduce it for the thing that keeps track of state instances and it really is... interesting
<fwereade> perrito666, it's not "what are the api servers" -- it is "what are the api servers we want to expose for entity X"
<perrito666> yep
<fwereade> perrito666, every time we make that assumption it leads to a fuckup
<perrito666> fwereade: I must admit it was a pretty large thing to swallow and I still am not entirely sure to understand how it fully works
<fwereade> perrito666, and it's always for the same fucking stupid reason, that we can't possibly imagine when we'd want to discriminate according to the recipient of the information
<fwereade> perrito666, sorry, I'm ranting at you and you don't deserve it
<fwereade> perrito666, it's a definite candidate for immediate unfucking once we have api versions, though
<fwereade> perrito666, ok, I will be off, and let you go
 * fwereade does a bit of crazy-person handwaving and eyerolling on his way out
<menn0> given that today is github conversion day do we need to hold off merging anything until it's done?
<fwereade> menn0, if you can slip it in before wallyworld wakes up I think you're ok -- but I'm not sure if he actually sleeps
<wallyworld> wot
<perrito666> fwereade: lol, rant away, I dont mind
<perrito666> I would mind less if I had a beer
<wallyworld> fwereade: menn0: i am delaying the cut over
<wallyworld> email not sent yet
<wallyworld> but soon will be
<fwereade> oh, yeah, curtis isn't ready?
<fwereade> anyway, really off now
<perrito666> fwereade: bye
<menn0> fwereade: bye, thanks for the help
<wallyworld> fwereade: a few reasons, but given CI is currently unhappy, we really need to get that sorted out first
<wallyworld> thumper: can we reschedule our 1:1 to after the core meeting today?
<waigani> thumper: my calendar tells me we have a core meeting from 2pm to 3pm today. Is that correct?
<perrito666> waigani: if that is in like 4 hs yes
<waigani> perrito666: sounds about right
<waigani> yay, I'm actually going to be away for a meeting
<waigani> AND have my voice!
<waigani> now if only I had something to say...
<waigani> s/away/awake
<perrito666> waigani: we are now doing shifting meetings
<perrito666> so everyone has the chance to go to at least 2 iirc
<waigani> yeah I saw that, hard to keep track of
<waigani> but i'm sure I'll get used to it
<thumper> waigani: yes...
<perrito666> waigani: not really
<thumper> waigani: this is the attempt to have a rolling meeting schedule
<thumper> waigani: so it isn't terrible for some people all the time
<perrito666> this one is outside of my tz but if I am awake I will most likely be there
<thumper> waigani: but instead terrible for everyone some of the time
<thumper> waigani: also we don't have to go to them all now...
<waigani> hehe, true
<thumper> like the ones at 4am or whatever
<waigani> that is good
<waigani> okay, time to code
<thumper> wallyworld: yeah, sure
<wallyworld> ok
<thumper> menn0, waigani: rick_h_ got me thinking in the meeting we just had, where he said we were making a terrible user experience, and I think he is right
<thumper> this is with respect to sharing environments
<thumper> how about 'juju login username@api-endpoint'
<thumper> prompts for password
<thumper> hmm... needs a name
<thumper> juju login env-name username@api-endpoint
<thumper> makes env-name.jenv file
<thumper> errors out if env-name already in the environments store
<thumper> juju logout could remove the jenv file
<thumper> with obvious warnings like "this doesn't destroy the environment"
<waigani> so if there were two environs with the same name, they would be disambiguated by the endpoint? Oh, not with a multi tenant state
<waigani> thumper: would this replace switch?
<thumper> waigani: no... it replaces the "here is a jenv file, put it in the right place"
<thumper> waigani: we may well implicitly switch when they login
<waigani> right I get you
<thumper> waigani: the env-name is just what I call it
<thumper> waigani: obviously when we have multi-env state servers, we will need a way to identify the environment within the state server
<waigani> yeah - and THAT should be the environ-name (or id)
<thumper> probably uuid
<waigani> actually, we already have that
<thumper> we do, but in a state server where there is only one environment, we can leave it out
<thumper> but we could write the code now to specify it
<thumper> I think that is a good plan
<thumper> so...
<waigani> So we have the UUID field in environmentDoc, but we currently don't use it?
<thumper> juju login <local-name> <user@endpoint> [<env-uuid>]
<thumper> waigani: right
<thumper> or we don't use it for much :)
 * thumper thinks...
<thumper> consider this:
<waigani> and the usecase is: environ exists, has been bootstrapped, has default .jenv file. Then you want to give a new user access to this environ?
<thumper> juju login prod-foo thumper@jaas.io paste-uuid-from-email
<thumper> waigani: yes
<thumper> I think this is better than the 'generate jenv' option
<waigani> why do you need the prod-foo?
<thumper> it is the name of the environment for me
<thumper> ~/.juju/environments/prod-foo.jenv is created
<thumper> otherwise, what do we call it?
<waigani> hmm, does the user really need to name that? I suppose for switching later?
<thumper> right...
<thumper> although, interestingly...
<thumper> we could default to try to use the name that is identified by the environment itself
<waigani> I'm just wondering if we can automate it in the background
<thumper> as the environment has a 'name'
<waigani> right, appended with the user's name/id?
<thumper> ick
<waigani> then the .jenv just becomes a background implementation detail
<thumper> it needs to be something that the user can switch to easily
<waigani> I'm missing where authentication is happening?
<thumper> I don't want all my environments named with my name at the end
<thumper> we authenticate as part of the login, what the login does is create the .jenv file for us
<waigani> so the user will get prompted for a password?
<thumper> yes
<waigani> why not also prompt them for a .jenv name?
<waigani> with a default
<thumper> we could... if they don't specify a name on the command line
<waigani> $ juju login thumper@jaas.io paste-uuid-from-email
<waigani> password:
<waigani> env-name [default]:
<thumper> sure
<waigani> we have to stop calling it environ-name
<thumper> well... maybe
<waigani> are we storing the user's uuid in the .jenv?
<waigani> if .jenv is deleted on logout, do we need it in the first place? Is it just playing the role of keeping user session?
<perrito666> ok I am going to get dinner, I might be back for the team meeting if I dont fall sleep before cheers
<waigani> the other way to approach this with a different workflow: 1. authenticate to a state server 2. be presented with environs you are allowed to access 3. once an environ is chosen, be presented with users that you can login as
<waigani> thumper: ^
<thumper> waigani: that doesn't really make sense
<thumper> waigani: as step 1 and step 3 don't really work together
<thumper> waigani: I do see what you are getting at, but it doesn't scale
<thumper> waigani: consider jaas.io where I have access to 1500 environments
<waigani> right
<waigani> so which part does not scale?
<thumper> be presented with environs I can access to choose the one
<waigani> well, that can be handled in the ux - e.g. apt-get
<thumper> except which of the 1500 uuids do I choose?
<thumper> I understand where you are coming from, I just disagree in the approach
<waigani> I have some gaps in my understanding of the user workflow and what role .jenv is playing
<waigani> standup time!
<sinzui> I think mongo killed stilson-06. I killed the proc, but juju cannot bootstrap anymore. And the disk is full, but I cannot find what is taking up 4G+ of space
<sinzui> reboot at least restored 2G
<sinzui> no, 7G
 * sinzui tries stable juju again
<sinzui> yeah
#juju-dev 2014-05-29
<wallyworld> axw: meeting?
<axw> wallyworld: sure
<wallyworld> rotating team meeting
<wallyworld> see cal
<axw> oh right, core team
<axw> whoops
<wallyworld> :-)
<wwitzel3> perrito666: if you're still around
<wallyworld> thumper: want the 1:1 now?
<thumper> wallyworld: yeah, in our normal hangout
<wallyworld> already there :-)
 * thumper goes to make a coffee
<sinzui> axw, I updated bug 1324255 to explain that the juju and the slave were dead from disk shortage and zombie mongos. from other tests
<_mup_> Bug #1324255: Local provider cannot boostrap <ci> <local-provider> <mongodb> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1324255>
<axw> sinzui: hmm ok. I'm pretty sure I saw it on the amd64 tests to
<axw> too*
<sinzui> axw, CI doesn't have a trusty local lxc test yet
<axw> sinzui: sorry, I was unclear; I meant the unit tests themselves see the "Closed explicitly" error. not in local deployment, just running the unit tests
<axw> sinzui: e.g. http://juju-ci.vapour.ws:8080/job/walk-unit-tests-amd64-trusty/485/console
<axw> it may be unrelated, I'm not sure
<sinzui> I see
 * thumper making dinner, back in about 2 hours for a bit more
<waigani> https://codereview.appspot.com/98700043 :)
<wallyworld> axw: thanks for review, more info on reason for new interface etc is here https://docs.google.com/a/canonical.com/document/d/1J6EZ37gfiZdVMdoxdFrih4R27LCcqyBzi50wjWFjbiA
<axw> okey dokey, will have a read
<axw> wallyworld: I think calling it "err" will also be a problem, because an "err" in the body will shadow it
<wallyworld> axw: err in body is not meant to be a new var
<wallyworld> that's why i used = not :=
<axw> wallyworld: not on the first line of the function
<wallyworld> that's because file is new but err will not be as it's the 2nd var and there's one in scope
<wallyworld> or have i got that wrong?
<axw> mm I think it only counts if it's declared in the same scope. will verify
<axw> I could be full of crap
<wallyworld> the err at the top is in the same scope afaik
<wallyworld> bbiab, gotta pick up kid
<wallyworld> axw: i was tempted to leave the err in the defer as err := but thought it could be confusing so changed its name even though not necessary
<axw> wallyworld: you are correct
<wallyworld> np
<wallyworld> that's why i hate := and =
<wallyworld> too easy to make subtle mistakes
<wallyworld> now i'm really bbiab
<voidspace> morning all
<voidspace> is anyone working on the "juju doesn't bootstrap on precise" issue?
<voidspace> I suspect it's my fault
<voidspace> I'm spinning up a precise VM to look at it
<axw> voidspace: if you can reproduce it, I have a small change I'd love to test
<axw> I'll also start a precise image on AWS
<voidspace> I'm creating a local VM. Still downloading precise - my "old" VM was 13.10 not precise.
<axw> voidspace: from what I've found in Googling, it seems that mgo will kill sockets for servers that it doesn't think are available in the cluster
<axw> voidspace: so I've made a small change to copy (rather than clone) the session when running replSetInitiate, to force it to create a new socket
<vladk> jam: ping
<jam> vladk: pong
<vladk> jam: I'd like to discuss my further steps in developing Networker API
<jam> sure, can we get dimitern involved as well?
<vladk> jam: of course
<jam> (well that was mostly just to ping him)
<vladk> dimitern: ping
<vladk> jam: can we do it on hangout?
<jam> certainly, I was just waiting to see if dimitern was actually around
<dimitern> jam, vladk, i'm here
<jam> k I need about 30s
<jam> we can just use our team standup hangout
<jam> https://plus.google.com/hangouts/_/canonical.com/juju-sapphire
<jam> dimitern: vladk ^^
<fwereade> axw, pre-commented on https://codereview.appspot.com/102860043/ -- let me know if it sounds sane to you
<voidspace> axw: awesome, that sounds very useful
<voidspace> axw: might help test issues as well
<voidspace> axw: we start mongo ok, but then fail to connect to it
<voidspace> axw: (when using replica sets for tests)
<voidspace> axw: and it's not a timing issue, we wait sixty seconds
<voidspace> axw: I'm creating a branch with my changes backed out - so once I get the VM setup I can quickly tell if it's my changes
<axw> fwereade: thanks, seems sensible to put it on RelationUnit. in my PoC I did the address selection TODO in the uniter, which is a bit dumb I guess.
<axw> fwereade: yes, free for a chat tomorrow
<axw> voidspace: we wait sixty seconds for what?
<voidspace> axw: we wait up to sixty seconds (mongoDialupTimeout I believe) when trying to connect to the mongo server we have created for tests
<voidspace> axw: you're missing some context
<voidspace> axw: as well as using replica sets for local provider I wanted to switch us to using replica sets for tests too
<axw> I see
<voidspace> axw: so that our test mongo matches the production mongo
<axw> we're already doing that in a bunch of tests - you want to do it in *all* of the,?
<axw> them*
<voidspace> axw: but when I do that, many tests fail with "unreachable servers" - many of them deterministically
<voidspace> axw: yep
<axw> ok
<voidspace> JujuConnSuite *doesn't*
<voidspace> the driving motivator is that we want to switch to write-majority
<axw> sounds like it will add quite a bit of overhead to the tests. setting up a replica set can take quite some time
<fwereade> axw, no worries, we converge :)
<voidspace> the command to switch to "write majority" (WMode) errors if you're not using replica sets
<axw> :/
<voidspace> so one fix is to use replica sets everywhere
<voidspace> that's what we were attempting
<voidspace> it maybe that because of precise we *can't*
 * axw sighs
<axw> of course it bootstraps first time for me
<axw> of course it'd help if I had pulled trunk
<voidspace> heh
<voidspace> axw: this branch reverts my local provider replica set changes
<voidspace> axw: lp:~mfoord/juju-core/revert-local-replset
<axw> ok
<voidspace> or "bzr merge -r 2803..2802"
<axw> voidspace: if you manage to reproduce the error with replSet enabled, try this patch: http://paste.ubuntu.com/7542857/
<axw> (please)
<voidspace> axw: yep
<voidspace> axw: precise installing now
<voidspace> axw: you couldn't reproduce?
<axw> voidspace: nope :(
<voidspace> odd
<wallyworld> mgz: 1:1 time
<mgz> wallyworld: whoops, not looking, joining now
<axw> voidspace: it's not just precise. http://juju-ci.vapour.ws:8080/job/local-deploy-trusty-ppc64/
<axw> there's a bunch of failures there on r2807
<axw> "no reachable servers"
<voidspace> axw: :-/
<voidspace> axw: configuring my vm to try it
<voidspace> 215mb of updates...
<menn0> hi all
<voidspace> menn0: morning :-)
<voidspace> menn0: howsit?
<menn0> voidspace: alright
<menn0> voidspace: how are you?
<menn0> voidspace: we had a bit of household chaos here this afternoon so I'm catching up now. have one big email to send to #juju-dev today. I'm off until Tuesday.
<voidspace> menn0: ah, fun
<voidspace> menn0: I hope "household chaos" is good chaos
<menn0> voidspace: not really, but all fine now
<voidspace> menn0: :-/
<voidspace> we're all good here, lots going on
<voidspace> better a hectic life than a boring one though
<voidspace> trying to reproduce precise bugs at the moment
<menn0> voidspace: I think there's a balance :)
<menn0> voidspace: yeah I heard about that problem
<menn0> voidspace: tricky to reproduce?
<voidspace> menn0: so far
<voidspace> menn0: I'm just updating and configuring a precise vm to see if I can reproduce
<voidspace> axw has failed so far
<menn0> thumper: hi again :)
<thumper> o/
<alexisb> cmars, fwereade: joining us?
<natefinch> ahhhh stupid hangouts keeps crashing
<voidspace> natefinch: morning
<natefinch> voidspace: morning
<voidspace> so if I try to bootstrap with the local provider on precise it tells me that juju-local is required
<voidspace> but juju-local is also unavailable for precise
<natefinch> ahh hrm
<voidspace> do I need to add a ppa?
<voidspace> looking
<voidspace> natefinch: "Due to needing newer versions of LXC the local provider does require a newer kernel than the released version of 12.04."
<voidspace> Therefore we install Linux 3.8 from the LTS Hardware Enablement Stack:
<voidspace> natefinch: I've added ppa:juju/stable and juju-local is now available
<voidspace> looks like I need to upgrade the kernel too
<axw> voidspace: I used cloud-archive:tools
<voidspace> axw: is there a difference?
<voidspace> I guess "stable" is likely to be out of date
<axw> I mean to get the newer lxc
<voidspace> ah
<voidspace> hmmm
<axw> hm, I guess you got an older 12.04, I probably got 12.04.03
<voidspace> I did a full update
<axw> ok I dunno
<voidspace> axw: those quotes were from the docs, not an error message
<voidspace> https://juju.ubuntu.com/docs/config-LXC.html
<voidspace> axw: I might try without the new kernel first
<voidspace> maybe it will fail in the same way...
<mgz> voidspace: `sudo apt-add-repository cloud-archive:tools` if you've not
<voidspace> although Curtis is pretty switched on, so I doubt that's the problem
<voidspace> mgz: ok
<voidspace> mgz: you sound like you know what you're talking about :-)
<mgz> worth checking your kernel version too, by default upgrades may not give you a newer kernel
 * axw disappears
<voidspace> axw: o/
<mgz> night axw
<voidspace> mgz: they're unlikely too
<voidspace> *to
<voidspace> mgz: would you recommend I remove the juju/stable ppa?
<mgz> if you're not trying to use the *client* we have released on that machine, yes
<vladk> jam. dimitern: what is your way to keep go packages up to date?
<vladk> 'go get -u ./...' damages my working juju-core, so I wrote a script http://pastebin.ubuntu.com/7543496/
<mgz> vladk: yeah, go get is dangerous
<jam> vladk: I generally run just "godeps -u dependencies.tsv" and then go update the ones that are out of date
<jam> you can just "go get -u github.com/juju/errors/..." for example
<mgz> vladk: ^that's what I do too
<voidspace> mgz: I'm trying to use trunk, thanks
<mgz> well, I don't use go get at all, I pushd to the branch and git pull or whatever
<jam> mgz: sure, I've done that as well, depends on my mood, I guess
<mgz> jam: and yeah, I do use go get as well, if say we've added a new dep with its own deps
<voidspace> ok, installing new kernel
<voidspace> so lunch
<vladk> jam, mgz: I want to update all of the packages
<dimitern> vladk, that's a handy script, thanks
<wwitzel3> hello
<voidspace> wwitzel3: morning
<voidspace> so, juju bootstrap on precise, with trunk, seems to work fine for me
<wwitzel3> so after not sleeping more than 5 hours the last 3 days, I finally zonked out and managed to sleep 10 hours.
<vladk> dimitern: use with care, not fully tested, I'm not sure about pull options
<wwitzel3> feels good, like I'm sane again :)
<voidspace> going on lunch
<voidspace> will dig deeper after that
<thumper> night folks
<dimitern> vladk, sure, i'll give it a try
<voidspace> wwitzel3: that's good news, welcome back to the land of the living
<wwitzel3> voidspace: thanks, enjoy lunch
<alexisb> cmars, hazmat, fwereade: omnibus review
<perrito666> good sort of morning
<mgz> sort of morning?
<perrito666> mgz: I usually start around or before 7AM now its 10
<mgz> sinfully late
<wwitzel3> perrito666: it is a trend today, I didn't get on until just after 8 and I even went to bed right after the meeting last night.
<perrito666> wwitzel3: I did too, it was like midnight when the meeting ended, but this AM I just spent 2.5hrs in a queue trying to do some bureaucracy and then an extra half an hour having to exit and re-enter the city because the path through the center was blocked by a cluster of strike protests
<wwitzel3> that sounds less fun than sleeping
<perrito666> yep
<wwitzel3> fwereade: ping
<fwereade> wwitzel3, pong, listening, but in meetings, so responding slowly
<wwitzel3> fwereade: np, I was pretty dense yesterday and I don't think I retained the full value of our earlier conversation.
<wwitzel3> fwereade: looking at the presence code, I'm not seeing us actually use WaitAgentPresence anywhere. We define it and have tests.
<wwitzel3> fwereade: I'm wondering if we should just be using that with or instead of SetAgentPresence in the Pinger.
<fwereade> wwitzel3, not quite following -- I know we don't use WaitAgentPresence except in tests, though, but I don't get the larger thread
<wwitzel3> fwereade: part of the problem is I'm still cloudy on the "Why" SetAgentPresence doesn't think the agent is there when it is called the first time.
<fwereade> wwitzel3, I *think* it's because it uses "2 consecutive pings" as the "yep, it's there" condition
<fwereade> wwitzel3, but please validate that against reality
<fwereade> wwitzel3, (ie if you've only had one ping it's not yet considered presence)
<natefinch> Wow, I just realized the Juju Core Team meeting was last night not tonight.  Stupid time zones making my Thursday meeting on Wednesday :/
<perrito666> yup, do we have standup today?
<voidspace> hah
<perrito666> I just set up an alarm for the day before, so I know in advance when that meeting is
<natefinch> I had skipped thursdays, because we used to have the juju core meeting before it, but now that the juju core meeting is moving, I think I need to reinstate the thursday standup.  Short answer: yes
<wwitzel3> fwereade: I'm not seeing the two pings, following the trail from api status -> machine.AgentPresence -> presence.Alive which fires a watcher.sendReq for the machine key and we select over the result and return bool,err back up the stack.
<wwitzel3> fwereade: I can make the problem go away with a long enough wait and recalling SetAgentPresence, but that isn't a legitimate fix and I am still failing to really grok the problem in the first place.
<natefinch> wwitzel3, voidspace: standup?
<wwitzel3> natefinch: oops, sorry
<natefinch> wwitzel3: it's ok, it wasn't on the calendar for today until now
<dimitern> fwereade, natefinch, mgz, I'd love a review on this networks constraints CL https://codereview.appspot.com/102830044
<dimitern> wow! LP now supports inline comments on MP diffs! \o/
<voidspace> wow
<dimitern> is it too late now to not switch to github? :D
<wwitzel3> dimitern: hah
<natefinch> haha
<natefinch> fwereade: did you and Ian figure out if we should use --replset in tests?
<fwereade> wwitzel3, ok, I may be wrong there: is it just the case that the first ping is not landing quickly enough to be seen?
<fwereade> natefinch, essentially yes
<fwereade> natefinch, for the tests that *by their fundamental nature* require mongo, we should be using --replset, because those really are integration tests and we should mimic a real environment as closely as possible
<fwereade> natefinch, but we have many tests that only coincidentally require mongo, and in those cases --replset is not required, because we should be aiming to make those tests unitier anyway -- to the point where they don't depend on mongo
<fwereade> natefinch, in particular replset clearly needs a real mongo, and almost certainly peergrouper too; and very probably state
<fwereade> natefinch, other test that merely happen to use state should be migrating away from the mongo dependency anyway, and so the loss of --replset is not a big deal
<fwereade> natefinch, how upsetting is this from your perspective?
<voidspace> sinzui: ping
<sinzui> hi voidspace
<voidspace> sinzui: ah, actually ping for five minutes
<natefinch> fwereade: it seems very squishy.  Deciding if a test "really" needs mongo is non-trivial.
<voidspace> I assumed you'd be slower :-)
<fwereade> natefinch, well, state/presence and state/watcher are the other two definite ones
<natefinch> fwereade: btw are you on the cross team meeting, or should I jump on?
<fwereade> natefinch, but generally anything we *could* back with a double for state.State is IMO a candidate
<fwereade> natefinch, I'm not, alexis is; your presence couldn't hurt though :)
<natefinch> heh ok
<wwitzel3> fwereade: thank you, I think you've given me an epiphany
<voidspace> I've been asked to speak at PyCon India
<voidspace> A good opportunity to promote juju
<voidspace> Have to work on a snappy demo
<natefinch> Cool
<perrito666> voidspace: sweet
<perrito666> voidspace: you can try the one from LAS
<bodie_> fwereade, did you get a chance to check out my MR?
<voidspace> perrito666: Las Vegas?
<voidspace> that was pretty IBM centric
<bodie_> I'm looking for direction to either dig deeper on the gojsonschema to add (id) references, or to move forward towards state and param validate
<voidspace> something equally impressive though I hope
<perrito666> voidspace: yes, but nevertheless pretty
<voidspace> yep
<voidspace> something python related would be good
<perrito666> voidspace: some of the material used on the os demo, minus the lego part
<voidspace> I'll talk to the online services guys as they're charming up their infrastructure
<mgz> bodie_: I'm on it
<voidspace> deploying and updating a django app would be good
<voidspace> and scaling out
<dimitern> fwereade, mgz, natefinch, a gentle reminder about https://codereview.appspot.com/102830044/
<bloodearnest> voidspace: oi! that's my planned demo for PyConUK :D
<voidspace> bloodearnest: awesome, I can use yours
<mgz> dimitern: I'm not wild about that syntax...
<voidspace> oops, I mean "we can work on it together"...
<dimitern> mgz, it's decided from above
<perrito666> voidspace: sure you meant that :p
<mgz> dimitern: code seems fine so far
<voidspace> :-)
<dimitern> mgz, I'll wait for jam and/or fwereade  to take a look as well, because i'm not sure about a few bits, but any comments are welcome
<perrito666> voidspace: that is cool, I only get invited to pycons I cant afford at times I cant travel :p
<bloodearnest> voidspace: when is pycon india?
<voidspace> perrito666: right, thankfully they're covering the costs or I wouldn't be able to afford it either
<voidspace> bloodearnest: end of September
<voidspace> so after PyCon UK I think
<bloodearnest> kk
<bloodearnest> voidspace: so we should do a joint talk proposal for pyconuk then :)
<bodie_> thanks mgz, nothing too fancy really.  just got your changes in to the github repo (all tests are green) but need to make a decision about gojsonschema itself
<voidspace> bloodearnest: heh, yeah - sounds good
<bodie_> fwereade indicated interest in getting it MR'd as such even if it's not a "real" merge request, just to get visibility
<sinzui> voidspace, http://pastebin.ubuntu.com/7544599/
<voidspace> sinzui: looking, thanks
<sinzui> voidspace, I asked abentley to join us to discuss what the test is doing bad
<voidspace> sinzui: so that worked
<voidspace> sinzui: ok
<voidspace> sinzui: although revision 2802 on master worked
<voidspace> so *something* changed
<voidspace> abentley: hello
<abentley> voidspace: Hi.
<sinzui> voidspace, yes, but the only thing I think changed is that I renamed the test from local-deploy to local-deploy-precise-amd64
<voidspace> heh
<voidspace> it was failing deterministically before that
<sinzui> abentley, voidspace I can try to restore the test name
<voidspace> that *really* shouldn't make any difference
<sinzui> oh but ci is still running
<sinzui> I can copy the test to the old name
<sinzui> oh, voidspace, I haven't run upgrade yet. I will do that to get the new lxc
<voidspace> sinzui: so the initial manual bootstrap works, but we think that re-bootstrapping from the generated .jenv file doesn't?
<abentley> sinzui: Are you re-bootstrapping from old jenvs?
<sinzui> abentley, voidspace, the env name and the pregenerated jenv are the difference
<voidspace> or by "generated jenv file", do you mean your pre-canned one?
<sinzui> abentley, I used true local with a jenv
<sinzui> abentley, I used true local withOUT a jenv
<voidspace> can I get a copy of the failing jenv file?
<abentley> sinzui: And you used an old jenv for testing local-deploy-precise-amd64?
<abentley> voidspace: We don't have a pre-canned jenv.
<voidspace> I can't see it in the workspace to get at it
<sinzui> voidspace, abentley. It is generated. not pre-canned. We let juju make the jenv file. We update the version in it to ensure juju does what we and enterprises mean
<voidspace> ah, I see
<mgz> bodie_: see comments on review, one thing needs fixing and some other comments, then lets land
<sinzui> voidspace, we tell users to share jenvs. we are
<bodie_> cool
<abentley> sinzui: We leave the .jenv entirely untouched.  It's the yaml that we tweak.
<sinzui> I am tempted to do the lxc upgrade, then ask jenkins to run the test as usual
<voidspace> sinzui: ok
<bodie_> mgz, I agree about pulling out the yaml -- is there a place in the repo where test files could go?  there will be a charm actions yaml loader tested separately as part of dir (I think it was)
<voidspace> can I get at the CI scripts? I'm reading the logs, but it looks like a lot of the work is in deploy_job.py
<mgz> bodie_: it would be an improvement to just have them as top level variables so that's easy later, splits the test logic a bit but easier to pull apart later
<mgz> voidspace: the branches should be public
<voidspace> mgz: ok
<mgz> voidspace: bzr branch lp:juju-ci-tools
<voidspace> yep, just found it :-)
<voidspace> mgz: thanks
<bodie_> mgz, I moved them to local scope due to fwereade's comments here: https://codereview.appspot.com/94540044/diff/1/charm/actions_test.go#newcode21
<mgz> bodie_: lets not go back and forth on that then :)
<abentley> voidspace: Every juju action is dumped to stdout so you should be able to reproduce the script actions from the log output.
<sinzui> abentley, voidspace jenkins failed again after the update
<bodie_> there's probably a case to be made for either approach.  I'm personally kind of a fan of the way the gojsonschema tests work with external files, but in this case it's not really testing the schema as much as the YAML reader, which is simple enough
<sinzui> oh, I forgot the mem constraint
 * sinzui run manually with the constraint
<voidspace> cannot initiate replica set
<voidspace> so it is the problem with mongo
<voidspace> and just as mysterious as last time we saw it apparently
<rogpeppe> frankban: reviewed https://codereview.appspot.com/92700044
<bodie_> okay, going to put my kid down for a nap, then into the salt mines with me!
<mgz> lick lick?
<sinzui> voidspace, were any of these mongo warnings relevant to the failure http://juju-ci.vapour.ws:8080/job/local-deploy-precise-amd64/1371/console
<frankban> rogpeppe: thanks
<bodie_> mgz -- after I get these tweaks in shall I start digging towards State then?
<bodie_> maybe start on a new branch to get dir and the loader landed
<mgz> bodie_: yeah, I think so
<mgz> we'll make sure this lands first :)
<voidspace> sinzui: unknown config field. Shouldn't be.
<voidspace> I'll look at the last successful build, see if those warnings are new.
<voidspace> nope, those warning were there before
<sinzui> voidspace, the warning is from a vestigial environments.yaml. I think I can remove it now, juju 1.18 doesn't need it. 1.16 did
<sinzui> voidspace, abentley maybe a memory issue...this is the failure from my manual run. I recalled the good bootstrap command from bash history, then appended the mem=2G http://pastebin.ubuntu.com/7544723/
<voidspace> I'll try that locally
<voidspace> I didn't use the memory constraint
<voidspace> although my VM only has 1G!
<sinzui> machine has lots of memory
<abentley> sinzui: Hmm.  I wouldn't have thought that had an effect.  Especially since mongo runs outside the lxc.
 * sinzui tries again without mem=2G
<sinzui> abentley, agreed
<voidspace> nope, works
<voidspace> for me
<voidspace> if we can't figure out the difference we'll have to back out the replica set change
<sinzui> abentley, I believe 1G should be adequate for our testing on lxc anyway
<voidspace> I'm looking up that specific failure (cannot initiate replica set)
<voidspace> natefinch: ^^^
<voidspace> natefinch: the precise failure is consistent when run as a CI job
<voidspace> natefinch: although effectively the same command run on the CI machine succeeds
<voidspace> natefinch: the specific failure is "cannot initiate replica set"
<voidspace> natefinch: so the replica set is the problem, although it's not obvious why
<natefinch> mongo logs probably have a better message.
<voidspace> that error implies that we have a mongo started *without a replica set* and then tried to add a replica set
<voidspace> or we tried to add a non-empty mongo to a replica set
<natefinch> it's the initiation which has to happen after it starts
<voidspace> but we start mongo with the replset argument
<natefinch> that's not initiation
<voidspace> nope, but initiation will fail if the db isn't empty
<sinzui> voidspace, abentley I bootstrapped again manually *without* the mem constraint and it failed. This is the relevant errors: http://pastebin.ubuntu.com/7544771/
<perrito666> voidspace: you are missing a step there?
<voidspace> natefinch: https://jira.mongodb.org/browse/SERVER-6294
<natefinch> voidspace: I don't think that's true.
<voidspace> natefinch: that's the closest I could find to a reference
<voidspace> natefinch: it's local.* that needs to be empty
<voidspace> perrito666: I'm sure I'm missing lots of steps
<natefinch> voidspace: ahh, yeah... I don't think that's the problem
<natefinch> usually it's that mongo can't resolve the address that you gave it during initiate, or you don't have access to the admin database
<voidspace> rebootstrapping from an existing jenv worked for me
<voidspace> natefinch: should I back out the change to unblock CI?
<voidspace> natefinch: without a way to repro it's going to be hard to re-introduce though
<natefinch> what's the version of mongo on ci vs your vm?
<voidspace> sinzui: can you check the mongo version on the CI machine for me please
<voidspace> natefinch: I'm using 2.4.6 it looks like
<sinzui> voidspace,  Installed: 1:2.4.6-0ubuntu5~ubuntu12.04.1~juju1
<voidspace> sinzui: thanks
<voidspace> natefinch: so looks like the same
<natefinch> I really don't want to revert a change that seems to work correctly on all machines except CI :/
<natefinch> especially without understanding why
<natefinch> my guess is that the address we're putting into the replicaset information in mongo is somehow not resolvable
<perrito666> voidspace: run that rs.config by hand by connecting with the client
<perrito666> you might get something more useful that failed out of it
<voidspace> perrito666: the only place it fails is on the CI machine which I don't have direct access to
<voidspace> perrito666: what specific command are you suggesting, we can ask sinzui or abentley to run it
<perrito666> voidspace: line 5 of http://pastebin.ubuntu.com/7544771/
<perrito666> rs.initiate(daconfig) iirc
<voidspace> so, attempt to manually re-initiate? From my hunting it looks like that error message we get is the full mongo error.
<natefinch> yeah, all you really need is { members: [ { host: "10.0.3.1:37019" } ] }
<voidspace> it's odd that  we're using a 10.0 address in that call to replicaSet.Config
<natefinch> yeah
<voidspace> 2014-05-29 15:21:41 DEBUG juju.worker.peergrouper initiate.go:34 Initiating mongo replicaset; dialInfo &mgo.DialInfo{Addrs:[]string{"127.0.0.1:37019"},
<voidspace> Followed by:
<voidspace> 2014-05-29 15:21:42 INFO juju.replicaset replicaset.go:52 Initiating replicaset with config replicaset.Config{Name:"juju", Version:1, Members:[]replicaset.Member{replicaset.Member{Id:1, Address:"10.0.3.1:37019", Arbiter:(*bool)(nil), BuildIndexes:(*bool)(nil), Hidden:(*bool)(nil), Priority:(*float64)(nil), Tags:map[string]string{"juju-machine-id":"0"}, SlaveDelay:(*time.Duration)(nil), Votes:(*int)(nil)}}}
<natefinch> yeah, you have to dial using localhost in order to get mongo to let you call initiate
<voidspace> I'm checking what address we use when it succeeds on my machine
<wwitzel3> grabbing some food
<voidspace> ah, except I'm only seeing INFO not debug here
<natefinch> run with --debug
<voidspace> thanks
<voidspace> I see *no* mongo related output
<natefinch> voidspace: oh, right, you don't actually get the output, you have to look at the logs under $HOME/.juju/local/log
<voidspace> natefinch: my successful run uses 10.0 addresses too
<natefinch> maybe it's just the CI machine can't connect to that IP/port
<natefinch> although I don't know why it would be different for precise and trusty
<voidspace> natefinch: I'm pretty sure that "local.oplog.rs is not empty" error is coming back from mongo
<voidspace> nor why it would be different on *some* precise machines
<natefinch> maybe the CI machine already has the database populated in a way that a normal machine would not?
<voidspace> I'm grabbing coffee
<perrito666> uh, cffee, good idea
<mgz> even better with an o
<rogpeppe> anyone know what our convention is for landing branches in github.com/juju ?
<rogpeppe> i want to submit this PR: https://github.com/juju/testing/pull/6
<sinzui> natefinch, voidspace there are no other mongos running on ci. I did check that because I found that had happened before
<voidspace> right
<sinzui> but maybe there is config left behind?
<voidspace> I wonder if somehow artifacts are persisting between test runs on *that* mongo
<voidspace> natefinch: perrito666: is there some step we can run at the beginning of CI to ensure the database is clean and there is no config / other artefacts left around?
<rogpeppe> in fact, have we got any CI set up on the github repos?
<voidspace> the failure indicates to me that we're trying to initiate a replica set on a non-clean db
<voidspace> rogpeppe: no idea. I know that's not helpful, but I'm not ignoring you :-)
<rogpeppe> voidspace: thanks
<natefinch> voidspace: if mongo's not running, you can just delete and recreate the folder that we use for mongo
<natefinch> rick_h_: what's the process for landing stuff on github?
<perrito666> natefinch: +1 restore does just that
<natefinch> rogpeppe: I don't think we have CI on github right now (at least for the juju-core stuff)
<rogpeppe> natefinch: ok, ta. i'll just rebase and push it then.
<mgz> natefinch: what stuff?
<rogpeppe> mgz: github.com/juju/...
<mgz> all the little bits?
<mgz> yeah, that's all botless
<rogpeppe> mgz: yeah
<rogpeppe> mgz: ok, thanks
<natefinch> we should fix that
<natefinch> especially since we're moving more and more stuff into its own repo
<rogpeppe> mgz: pity we've lost that, but i'm guessing sinzui has plans for reinstating it
<mgz> it's covered as part of juju-core
<rogpeppe> mgz: it's going to be part of other pieces too
<mgz> once you start depending on a broken rev, core will refuse the dep bump
<rogpeppe> mgz: juju-core CI runs the github.com/juju tests ?
<mgz> that still leaves go get etc borked as we saw yesterday
<sinzui> my ideal bot will do the squashing for us so that we don't need to worry that we lose conversations when commits are removed
<rogpeppe> sinzui: my plan is to rebase-squash only at submit time
<rogpeppe> sinzui: but i have a feeling that violates the "don't rebase after publishing" rule
 * rogpeppe misses Rietveld already
<sinzui> It does and github will delete conversations linked to the revisions
<rogpeppe> sinzui: so is there any way we can a) keep the conversations and b) keep the revision history coherent ?
<mgz> rogpeppe: squash on merge would
<rogpeppe> mgz: that's what i was suggesting, i think
<sinzui> rogpeppe, I see that homebrew is now making their own squashed pull request for each pull request I make. My conversation and commits are maintained, but they do not show up as merged.
<mgz> sinzui: yeah, that's the upshot
<mgz> no pull requests are ever really merged
<rogpeppe> yeah, i can see that's the case. i can live with that, i think
<rogpeppe> just so long as i can get from the revision history to the conversations, i think it's all ok
<sinzui> rogpeppe, I think the bot can merge the approved branch, test, squash, commit, then add a comment naming the replacement commit when it closes the original pull request
<rogpeppe> sinzui: i guess we'll need a convention to allow the bot to choose an appropriate commit message.
<sinzui> yep
<rogpeppe> sinzui: most recent commit (by, erm, some measure of "recent") with some keyword in the message might work
<sinzui> +1
<natefinch> question..... do we really need to squash?
<natefinch> So like... yes, roger's PR has 4 revisions and if you don't squash, those 4 revisions all go on main...... who cares?
<mgz> natefinch: not necessarily, especially if everyone gets in the habit of rebasing neatly before proposing
<rogpeppe> natefinch: yes, i care
<natefinch> mgz: sure
<mgz> natefinch: mostly hurts with lots of small review comment fixes
<voidspace> natefinch: so the mongo directory is defined as agentConfig.DataDir() + '/db'
<voidspace> natefinch: where is the default location
<rogpeppe> natefinch: sometimes i'll commit hundreds of times before proposing
<rogpeppe> natefinch: with minute changes
<voidspace> "/var/lib/juju"?
<rogpeppe> natefinch: i don't want all of those commits to end up cluttering the main log
<natefinch> voidspace: I think so, yeah
<rogpeppe> voidspace: /var/lib/juju/<agent-tag>
<voidspace> rogpeppe: natefinch: between CI builds is it safe to delete /var/lib/juju altogether?
<natefinch> voidspace: it better be
<rogpeppe> voidspace: it should be
<voidspace> good
<natefinch> brb
<rogpeppe> voidspace: although CI shouldn't be touching /var/lib/juju at all
<voidspace> sinzui: can you check if there's a /var/lib/juju on the CI machine - and preferably delete that at the start of the CI build
<rogpeppe> voidspace: because CI may be running on a juju node
<rogpeppe> voidspace: in fact, it probably will be
<rogpeppe> voidspace: so that will kill the local juju agents, which is not a great thing
<voidspace> rogpeppe: we're worried that db artefacts are what is causing bootstrap of local provider to fail *only* on the CI machine
<voidspace> sinzui: hmmm... if jenkins is provisioned by juju then blowing the whole thing away is not a good idea
<voidspace> we just want to delete the part created by the CI run
<sinzui> voidspace, I was about to say I want to keep the machine and unit agents alive
<voidspace> heh
 * sinzui looks for cruft
<voidspace> probably preferable
<rogpeppe> it would be nice if CI never actually touched /var/lib/juju
<sinzui> voidspace, /var/lib/juju is not contaminated.
<voidspace> damn
<voidspace> sinzui: thanks
<perrito666> bbiab
<rogpeppe> g'night all
<voidspace> rogpeppe: g'night
<voidspace> I have to go, EOD
<voidspace> natefinch: so, we still don't have a resolution for this issue :-(
<voidspace> which sucks majorly
<natefinch> voidspace: ug.... yeah, thanks for all your work on it
<voidspace> heh
<natefinch> I think I might have some time to actually look into it... not that I currently have a precise VM, but I can make one
<voidspace> cool
<voidspace> let me know of any progress you make
<voidspace> g'night all
<bodie_> nite
<bodie_> I'm not following why this is still showing a conflict in dependencies.tsv (all the way at the bottom)
<bodie_> https://code.launchpad.net/~binary132/juju-core/charm-actions/+merge/219926
<bodie_> oh, god.  did I commit it with merge comments still in it?  -_-
<mgz> bodie_: you need to actually merge trunk I think
<bodie_> that's what I thought I did, heh
<mgz> the last rev seems to only have one parent
<mgz> just `bzr merge co:trunk`, resolve, commit, push, should be fine
<bodie_> alright, why would bzr merge lp:juju-core not work?
<bodie_> doesn't that refer to trunk?
<perrito666> how expensive is to iterate a slice?
<natefinch> perrito666: O(n) :)
<wwitzel3> yeah, plain old iteration should be linear
<perrito666> natefinch: nobody likes a smartass :p
<natefinch> perrito666: it's just like iterating over an array in C, with some bounds checking that makes it minutely slower. If you're using the for n, val := range slice { version, then it copies each value too, but with a for x := 0; x < len(slice); x++ { loop there's no copying until you access an index of the slice (obviously)
<natefinch> perrito666: heh
<wwitzel3> lol
<perrito666> natefinch: tx, that was the answer I was looking for
<perrito666> :)
<natefinch> perrito666: a slice is just a struct with a pointer to a backing array, the length of the current slice, and the capacity of the backing array.  Iterating over it is just like iterating over the backing array.
<perrito666> I was curious about how much copying was range doing
<wwitzel3> I'd just use the syntax that expresses the idea in a more clear fashion
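natefinch's point about `range` copying each element can be sketched in a standalone snippet (illustrative only, not juju code):

```go
package main

import "fmt"

func main() {
	nums := []int{1, 2, 3}

	// range copies each element into val, so mutating val
	// does not touch the slice's backing array.
	for _, val := range nums {
		val *= 10
	}
	fmt.Println(nums) // [1 2 3]

	// The index-based loop makes no copy; assigning through an
	// index mutates the backing array in place.
	for i := 0; i < len(nums); i++ {
		nums[i] *= 10
	}
	fmt.Println(nums) // [10 20 30]
}
```

As natefinch notes, the slice header itself is just a pointer to the backing array plus a length and capacity, so the index loop is essentially C-style array iteration with bounds checks.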
 * perrito666 is not so happy with having to check if a string is in a slice
<natefinch> 90% of the complaints about Go are "I have to write this loop where in my other language, the loop is hidden from my view"
<perrito666> natefinch: well, of course
<wwitzel3> at the end of the day it is all just typing and loops
<perrito666> natefinch: I think the issue is that almost everyone trying this knows python
<natefinch> if x in foo:
<natefinch> yeah I know
<perrito666> wwitzel3: and if err != nil, don't forget if err != nil
<bodie_> mgz, nope, bzr merge co:trunk didn't do it for me.  Not a branch: "/home/bodie/go/src/launchpad.net/juju-core/.bzr/branches/master/"
<bodie_> er, that's when I tried to merge co:master because I got that error for trunk
<natefinch> perrito666: yep
<bodie_> sorry this is taking so long, I've been doing bzr gymnastics
<mgz> bodie_: er, are you using the same branching scheme as me?
<bodie_> poorly
<bodie_> I'm using cobzr
<mgz> okay, you're not
<mgz> cobzr people, how do you address trunk?
<natefinch> there's too many different ways to do colocated branches in bzr (which is to say, there's two, and they both seem to have the same name)
<mgz> well, three
<mgz> one was an actual bzr plugin that did the right things, and got superseded by support in 2.6
<mgz> then there's the weird go thing that doesn't play well with anything...
<natefinch> I thought that's what cobzr was?
<natefinch> oh
<natefinch> I don't know that third thing
<bodie_> okay, so what I need to do is really simple
<bodie_> and I thought I already did it...
<bodie_> but now I've erased and re-checked-out my repo, and I'm not certain it will be normal
<bodie_> going in circles and driving myself nuts
<natefinch> perrito666: you're welcome to write a utils package that has a function that takes a slice of strings and a string and returns a boolean that says whether the string is in the slice.  then you could do like strslice.Contains(list, value)
<mgz> natefinch: what do you call trunk with cobzr?
<perrito666> natefinch: I am really considering it, although I am a bit suspicious about the fact that it's not something that's already there and widely used, so I presume there must be a decent reason
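The helper natefinch describes is a few lines of Go (the `strslice` package name he mentions is hypothetical; here it's sketched as a plain function):

```go
package main

import "fmt"

// Contains reports whether value is present in list, replacing
// Python's "if x in foo:" with the explicit loop Go expects.
func Contains(list []string, value string) bool {
	for _, s := range list {
		if s == value {
			return true
		}
	}
	return false
}

func main() {
	list := []string{"trusty", "precise"}
	fmt.Println(Contains(list, "precise")) // true
	fmt.Println(Contains(list, "saucy"))   // false
}
```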
<bodie_> I think bzr merge lp:juju-core worked (I don't have any conflicts in my dependencies.tsv, in the file) but in Launchpad it shows conflicts
<natefinch> mgz: uh.... I use the built in support.... I think?
<natefinch> mgz: I have a branch called trunk
<mgz> bodie_: yeah, that's fine too
<mgz> bodie_: what does `bzr status` say?
<natefinch> mgz: I just use bzr switch -b foo to make a new colocated branch....  I think that's the built-in support, not cobzr, right?
<mgz> natefinch: can be either..
<natefinch> lol
<mgz> the actual naming is what's different (and confusing), and the name is what you need for merge
<perrito666> natefinch: that depends on what your bzr does
<perrito666> mgz: I have to point to the place where my bzr branches actually live with the full path
<mgz> perrito666: you use tim's method, right?
<bodie_> there's a conflict in deps
<perrito666> mgz: I... am not sure
<perrito666> I am using A method
<perrito666> which might be tim's
<bodie_> mgz, I merged dependencies.tsv, fixed the conflict, tried to commit, ran bzr resolve dependencies.tsv, committed, pushed, erased the core repo, go get'd juju-core, bzr branch charm-actions, bzr switch charm-actions, bzr pull --overwrite lp:~binary132/juju-core/charm-actions, lastly bzr merge lp:juju-core/dependencies.tsv
 * mgz blinks
<bodie_> heh
<mgz> okay, your problem
<mgz> you merge just the branch
<mgz> not a file *in* the branch, that's a cherrypick
<mgz> you *want* the actual revisions to resolve the conflict, not just the file contents
<mgz> bzr merge lp:juju-core <- what you want
<bodie_> I don't understand why since all I'm trying to do is grab the differences in that file that will cause issues and tweak it to match trunk plus my addition
<bodie_> so you're saying grab all of trunk just to make sure everything is shipshape before I land the branch?
<perrito666> bodie_: you are trying to git your bzr
<bodie_> :'(
<mgz> because with a cherrypick, you still have a split history
<mgz> and the launchpad merge resolution plays it safe
<mgz> if there's not a clean history merge, it throws it back for a real person to resolve
<bodie_> so I have to merge the whole trunk in order to merge a single file?
<bodie_> that doesn't seem sensible
<mgz> you're not merging a single file
<mgz> even git doesn't work like that, it's not cvs
<bodie_> hrm
<mgz> you can do the merge in other ways to make launchpad happy, but just merging trunk as a whole is fine
<mgz> (you can rebase all your changes onto trunk if you like)
<mgz> (and push --overwrite the branch)
<bodie_> I doubt it'll conflict since my stuff doesn't touch other places
<bodie_> but how would I rebase my changes onto trunk?
<mgz> eg, `bzr switch trunk && bzr switch feature_redux && bzr merge -rancestor:trunk..branch:feature && bzr commit` or similar
<mgz> using the rewrite plugin means you can keep your individual revisions if you like
<mgz> -b on the second switch, and the revspec may not be quite right
<mgz> then you bzr push --overwrite <wherever remote feature branch is>
<bodie_> I see
<bodie_> I think I'm gonna stick to the basics for the time being
<bodie_> still having enough trouble with bzr, heh
<bodie_> once I've merged trunk then I simply need to commit to finish the job, right?
<bodie_> I've edited the conflicts out and bzr resolved them
<bodie_> Okay, I committed it, and pushed up to my branch, and it still appears to have conflicts, which I expressly fixed
<bodie_> so either Launchpad is lagging, or I'm going insane, or both
<bodie_> okay, it was just lagging.  thank god
<perrito666> bodie_: you might still be going insane
<bodie_> haha, for sure
<bodie_> mgz, I'm looking at the other comment now -- I think this may actually be a more significant problem than it appears
<bodie_> or --
<bodie_> yeah, I don't think we're hitting it.  that's supposed to be for schemas that don't adhere to the spec for JSON-Schema
<bodie_> which are supposed to be caught by gojsonschema
<natefinch> perrito666: btw, next on our plate is doing backup and restore the right way.  WIth the schema upgrades stuff getting taken over by Tim's team, we'll be working on that pretty soon.
<perrito666> natefinch: sweet
<perrito666> with luck we will be able to do that before it breaks again
<natefinch> haha
<bodie_> oh, mgz I recall why we're not testing that error -- william thought we shouldn't test gojsonschema in our tests
<bodie_> I'll add a couple of edge cases
<mgz> bodie_: I'm a little less worried about the coverage than I am the error being bogus, but it should be easy enough to do something there
<bodie_> hm?  I'm not following
<mgz> the error in the branch has two fmt operators and one arg given
<bodie_> ah
<bodie_> gotcha
<bodie_> why would that not build?
<bodie_> I fixed that, thought you meant something different
<mgz> it does build, it just is pretty bogus
<mgz> you get some junk in the error string
<bodie_> oh, since it's a formatting string it's at runtime
<bodie_> I see
<bodie_> I meant why would that build
<bodie_> thanks :)
<bodie_> mgz, if you're still waiting for my branch to come in, it's because I may have found an error with gojsonreference
<bodie_> getting a nil exception in my new test
<mgz> bodie_: urk?
<natefinch> that feeling when you accidentally reboot your laptop instead of the VM running on it
<natefinch> EOD anyway
<mgz> natefinch: oh yeah...
<natefinch> night all
<mgz> have a good evening
<waigani> morning all
<perrito666> waigani: sure, morning, why not
<waigani> hehe
<waigani> fwereade: are you still awake?
<fwereade> waigani, heyhey
<waigani> fwereade: hello!
<fwereade> waigani, how's it going?
<waigani> fwereade: I'm stumbling my way into understanding client/server api cmd
<waigani> fwereade: what do you mean by bulk api call?
<waigani> fwereade: I checked the api.txt docs, but couldn't see anything
<fwereade> waigani, I mean taking an array of args, and returning an array of results
<waigani> ah okay, that's all it means?
<fwereade> waigani, pretty much, yeah
<waigani> and what I did was not good because it had no args?
<fwereade> waigani, it's inconvenient in some respects but not having it is more inconvenient
 * fwereade looks at api.txt, and adds it to the list of things to fix
<waigani> fwereade: maybe if you elucidate your reasoning in api.txt, I'm happy to read that later
<fwereade> waigani, yeah, I'm worried I may have given offence, I was *seriously* pissed off yesterday to discover that horacio can't fix a bug he's working on cleanly because someone implemented *another* non-bulk call
<waigani> I'm not offended, I don't understand - that's all
<waigani> fwereade: ^
<waigani_> fwereade: why is having a cmd that calls an api without args an issue?
<waigani_> i think my connection is having a spaz
<fwereade> waigani, the short version is: you can always call a bulk API with one arg, but you can't call a singular one with bulk args; and you have many more opportunities for optimizing bulk calls, that you just don't get if you have singular ones
<fwereade> waigani_, and the reason that no-args is bad is because it encodes the assumption that the result will always be the same for everyone everywhere
<waigani_> fwereade: but couldn't that be the case?
<fwereade> waigani_, in the StateServerAddresses (or whatever it is) case, one might think "ehh, the set will always be the same, doesn't matter who's calling", right?
<fwereade> waigani_, but this is not true, because the actual address one might use to access machine X may depend on where you're calling it from
<fwereade> waigani_, if we're explicit about "what does the state server address list look like to machine-4?" we can accommodate that problem without having to change the API
<waigani_> right
<waigani_> so it's about allowing room to move in the  future
<fwereade> waigani_, and when we're, I dunno, provisioning a whole bunch of machines, it's much more efficient to say "what will the state servers look like for machines 5, 6, 7, 8,9,..."
<fwereade> waigani_, yeah: changing an API is a whole heap more hassle than changing an implementation
<waigani_> so api calls should ALWAYS take an args struct? Even if there are no args in the struct?
<fwereade> waigani_, the meta-point is that the assumption about common sets of state servers has a character to it that I have grown sensitive to because that sort of assumption seems to end up wrong *disturbingly* often
<fwereade> waigani_, I'm not sure there are any cases where there aren't any args: the closest I can come up with is getting environ info, because there's always one environ per conn
<fwereade> waigani_, that still doesn't feel to me like enough of a unique snowflake to be worth breaking convention for
<waigani_> okay I get it
<fwereade> waigani_, because I've also used apis in which singular/bulk are mixed apparently at random, and I'd really like us to be predictable :)
<fwereade> waigani_, (and when I have it usually seems that they have fallen prey to exactly that class of assumption, and had to fix it later, and it always just looks embarrassing ;))
<waigani_> fwereade: is there a way of ensuring args is always passed?
<fwereade> waigani_, heh, we probably could do something to our rpc package
<waigani_> could save a lot of explaining ...
<fwereade> waigani_, but we're carrying the weight of the original client api, designed by rog, who doesn't believe in bulk api calls, and we're not ready to retire it all yet
<fwereade> waigani_, much though I may want to
<waigani_> I'm missing something with your grudge about the state servers
<alexisb> fwereade, what time is it for you?
<fwereade> waigani_, the specific bug was that we want state servers' api tasks to connect to the local api for preference
<fwereade> alexisb, eh, 11:45, but cath went to bed early leaving me with 2 g&ts to drink, I have to entertain myself somehow
<waigani_> lol
<waigani_> by trying to drum some sense into us newbies!
<alexisb> fwereade, hehe, fair enough
<fwereade> waigani_, it's much more sinister than that, I'm trying to turn you all into mental copies of me :)
<waigani_> okay, keep trying please :)
<fwereade> waigani_, the first cut at the bug was a bit of a hack -- it was looking up the state servers in the agent config, grabbing one of them for the port number, and replacing the list with localhost:port
<fwereade> waigani_, and so I said "eww!" and started thinking how we could do better
<fwereade> waigani_, and to be fair it's not entirely easy to do better
<fwereade> waigani_, it's not smart to keep the other api servers secret and *always* just write localhost, because we want to remain open to machines being demoted but still coming back into the fold as viable deployment targets
<fwereade> waigani_, so we do indeed want to store the complete set of state servers
<waigani_> I assumed that is what so I would assumed StateServerAddresses would return
<waigani_> the complete set of state servers
<waigani_> ugh, let me type that again
<waigani_> I assumed that is what StateServerAddresses would return
<fwereade> waigani_, right -- but the best set of addresses for machine 2 to know about is not "m0.dns, m1.dns, m2.dns"
<fwereade> waigani_, it's "m0.dns, m1.dns, *localhost*"
<fwereade> waigani_, but you can only return tailored results like this if you know who's asking
<waigani_> seems like two different functions to me
<fwereade> waigani_, think broader
<fwereade> waigani_, the address you use to hit a particular machine depends on your own location
<fwereade> waigani_, this is a very narrow example, but the general problem is much broader
<fwereade> waigani_, think of state servers spread across a manual environment
<waigani_> okay
<fwereade> waigani_, 10.x addresses may or may not be valid, depending on where you're connecting from
<waigani_> one set of state servers per environ?
<waigani_> ha
<fwereade> waigani_, the same 10.x address might refer to entirely different machines, according to two different clients
<waigani_> yep, of course
<waigani_> I'm stuck in web world where everything has a public IP
<fwereade> waigani_, yeah, I've had a bit of difficulty adjusting myself
<fwereade> waigani_, remember when computers never talked to one another at all? life was so easy :)
<waigani_> private networks make the story  much more interesting
<waigani_> lol
<fwereade> waigani_, anyway, this is essentially anecdotal, but it's a case in which being explicit about the intended context of the result would have made life a lot easier for us, because it would have been a pure api-side fix
<waigani_> okay cool so, the result of StateServerAddresses is always relative to the agent calling it, and that agent needs to be passed in as an arg
<waigani_> or an identifier for the agent i should say
<fwereade> waigani_, and once you're being explicit about the agent who's interested, you may as well implement the bulk call for the reasons discussed above
<fwereade> waigani_, yeah
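The bulk-call convention fwereade describes can be sketched like this (type and field names are illustrative, not the actual juju params package; the localhost tailoring follows his machine-2 example above):

```go
package main

import "fmt"

// AddressesArgs asks about many entities at once: you can always
// call a bulk API with one arg, but never a singular API with many.
type AddressesArgs struct {
	Tags []string
}

// AddressesResults mirrors the args slice one-to-one, so each
// caller gets an answer tailored to who is asking.
type AddressesResult struct {
	Addresses []string
}

type AddressesResults struct {
	Results []AddressesResult
}

// StateServerAddresses returns the address list as seen by each
// requested agent: a state server prefers localhost, everyone
// else gets the remote dns names.
func StateServerAddresses(args AddressesArgs) AddressesResults {
	out := AddressesResults{Results: make([]AddressesResult, len(args.Tags))}
	for i, tag := range args.Tags {
		if tag == "machine-2" { // pretend machine-2 is itself a state server
			out.Results[i].Addresses = []string{"m0.dns", "m1.dns", "localhost"}
		} else {
			out.Results[i].Addresses = []string{"m0.dns", "m1.dns", "m2.dns"}
		}
	}
	return out
}

func main() {
	res := StateServerAddresses(AddressesArgs{Tags: []string{"machine-2", "machine-5"}})
	fmt.Println(res.Results[0].Addresses) // [m0.dns m1.dns localhost]
	fmt.Println(res.Results[1].Addresses) // [m0.dns m1.dns m2.dns]
}
```

A no-args singular call could only ever return one answer for everyone; making the asking agent explicit is what leaves room for this per-caller tailoring, and for batching ("machines 5, 6, 7, 8, 9, ...") without an API change.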
<waigani_> sweet I think I get it :)
<waigani_> time for a quick question about my branch?
<fwereade> waigani_, ofc, it is a matter of certainty that there will be some APIs for which my insistence on bulk never does us a shred of good, and they really are only ever called with a single arg, and that arg always matches the authenticated entity
<fwereade> waigani_, but I think the costs are worth bearing when weighed against the costs of the alternative ;)
<waigani_> yep, like my bogus CurrentUserInfo
<waigani_> Why not just enforce args?
<waigani_> ensure that a struct is always passed, even if empty?
<waigani_> ohh because of backwards compatibility
<fwereade> waigani_, mainly hysterical raisins -- I'm not confident that we've ever been entirely without APIs-that-make-me-grumpy
<fwereade> waigani_, if we can get to the point where we can, I will be happy, but we *do* have to carry the 1.18 api forever
<fwereade> waigani_, well, 5 years, which is approximately equal to forever
<waigani_> oh
<waigani_> yeah true
<waigani_> well, at least get this in the docs
<fwereade> waigani_, anyway, yes, let's talk about your branch -- can I take 5 for a ciggie first please?
<waigani_> of course (and another g&t, makes things much more fun)
<fwereade> waigani_, yeah, there were plenty of ephemeral google docs around the time we were discussing it but they never made it into dev docs
<waigani_> ping me when you're ready
<fwereade> waigani_, back :)
<waigani_> fwereade: cool
<waigani_> fwereade: I've got two questions
<waigani_> one practical the other just for understanding
<thumper> o/
<fwereade> thumper, heyhey
<waigani_> fwereade: how do you authorise the api calls?
<thumper> fwereade: want to catch up hangout?
<thumper> fwereade: I have some points around connect
<fwereade> waigani_, the facades get created with something that implements Authorizer, and you can ask that who's connected
<fwereade> waigani_, that's the primary mechanism for keeping inappropriate clients' dirty hands off apis they shouldn't have access to
<waigani_> fwereade: cool I'll look into that, as I didn't want to expose UserInfo to just anyone
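The shape fwereade describes, a facade created with an Authorizer it can ask about the connected entity, looks roughly like this (interface and method names are illustrative, not the exact juju Authorizer API):

```go
package main

import "fmt"

// Authorizer is a sketch of what the facade is handed at
// construction time; it answers "who is connected?".
type Authorizer interface {
	AuthTag() string  // tag of the connected entity, e.g. "user-bob"
	AuthClient() bool // true if the connection is a client, not an agent
}

type UserInfoFacade struct {
	auth Authorizer
}

// UserInfo refuses to answer for anyone but the authenticated
// user, keeping inappropriate clients' hands off the data.
func (f *UserInfoFacade) UserInfo(tag string) (string, error) {
	if !f.auth.AuthClient() || f.auth.AuthTag() != tag {
		return "", fmt.Errorf("permission denied")
	}
	return "info for " + tag, nil
}

// fakeAuth stands in for the real authorizer in this sketch.
type fakeAuth struct{ tag string }

func (f fakeAuth) AuthTag() string  { return f.tag }
func (f fakeAuth) AuthClient() bool { return true }

func main() {
	f := &UserInfoFacade{auth: fakeAuth{tag: "user-bob"}}
	if _, err := f.UserInfo("user-alice"); err != nil {
		fmt.Println("blocked:", err) // blocked: permission denied
	}
	info, _ := f.UserInfo("user-bob")
	fmt.Println(info) // info for user-bob
}
```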
<fwereade> thumper, shortly, yes please
<waigani_> fwereade: second why have a client side api?
<waigani_> fwereade: why can't clients just call and get results from the server api?
<fwereade> waigani_, I'm not quite sure what you're asking, but I think the answer is "because we're bad at naming things"
<waigani_> right, yeah I'm not quite sure what I'm asking either
<fwereade> waigani_, the Client facade is implemented on the server, but it's so named because it's intended for the use of clients
<waigani_> I get that, but you were talking about machine 2 having access to the api on localhost?
<waigani_> I don't understand that?
<fwereade> waigani_, if machine 2 is a state server, api clients on that machine should prefer to connect to the local api server rather than the remote one
<fwereade> (s)
<waigani_> oooh it's a state server
<fwereade> waigani_, if you're not a state server you have no choice but to connect to a remote one
<fwereade> waigani_, sorry, yeah, that was the context that I never explicitly mentioned
<waigani_> okay I'm done! Can I be a fly on the wall for the connect conversation?
<waigani_> fwereade: thank you :)
<waigani_> a little part of my brain feels a little less like me ;)
 * fwereade cackles cheerfully
<fwereade> thumper, ready now, please invite waigani_ too
<thumper> waigani_: https://plus.google.com/hangouts/_/gwnzn2d4zfnezivhtfspwc2r3ma?hl=en
<thumper> waigani: did you want to do a standup? it is just the two of us
<bodie_> fwereade, mgz, looks like there's a corner case in gojsonschema where a "$schema" key not at the document root causes a nil exception
<waigani> thumper: okay
<waigani> thumper: I'm in the channel
<waigani> thumper: how can I check if --format is used?
<thumper> waigani: you can just check the formatter name
<waigani> thumper: cool, got it thanks
#juju-dev 2014-05-30
<waigani> thumper: ping
<waigani> thumper: emailed you my question
<thumper> ok
<thumper> waigani: you need to cast it to what you know it is
<waigani> oh, so simple
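thumper's "cast it to what you know it is" is a Go type assertion; the comma-ok form fails gracefully when you're wrong (a generic illustration, since the actual types in waigani's branch aren't shown in the log):

```go
package main

import "fmt"

// Formatter stands in for an interface value a command holds.
type Formatter interface{ Name() string }

type jsonFormatter struct{}

func (jsonFormatter) Name() string { return "json" }
func (jsonFormatter) Indent() bool { return true }

func main() {
	var f Formatter = jsonFormatter{}

	// Assert down to the concrete type you know it is; ok is
	// false (with no panic) if the dynamic type differs.
	if jf, ok := f.(jsonFormatter); ok {
		fmt.Println(jf.Indent()) // true
	}
}
```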
<wallyworld> mramm: can i get editing perms for that doco?
<axw> wallyworld: what is the card "maas provider picks maas name using placement" ?
<axw> wallyworld: I already implemented maas-name placement
<axw> is there something missing?
<wallyworld> axw: i got that from the next steps on a previous mp
<wallyworld> maybe it's been done
<wallyworld> all those 4 cards are from the same source
<axw> probably talking about supporting it in "juju deploy --to" and "juju add-unit --to"
<wallyworld> yeah
<axw> I spoke with William at Vegas about placement, and he reckoned we should just stick with machine-id in those commands for now
<wallyworld> ok
<wallyworld> feel free to update the cards as necessary
<axw> supporting placement in those is a little trickier
<axw> ok
<wallyworld> i just wanted to capture out of the mp the next steps
<axw> wallyworld: going to delete the maas-name one, since that's done. anything remaining is provider-independent
<wallyworld> sounds good
<axw> I'm going to prod my AZ changes along at the same time as this address stuff
<wallyworld> ok
 * thumper has hit friday afternoon itis
 * thumper makes a sad face
<thumper> ugh...
<bodie_> on that note, if anyone feels like having a look at this to see why $schema values are causing nil exceptions in gojsonschema, input would be welcome
<bodie_> https://codereview.appspot.com/94540044
<bodie_> I'll be looking at it tomorrow
<bodie_> going to bed in a minute here
<jcw4> Anyone wanna chat about logging / mongo and the state package?
<bodie_> https://codereview.appspot.com/94540044 fwereade mgz
<jcw4> I'm working on adding a 'results' collection for Actions, and if I'm reading it right the 'log' collection on watchers is in a separate mongo db from the state db?
<jcw4> okay.. I see now there are 3 db's: juju, presence, and admin
<jcw4> and I now see that the watcher logs are on the juju db
<wallyworld> axw: i have to go to soccer so won't get to look at your branch before i go, i'll look when i get back later if it's still unreviewed
<axw> wallyworld: thanks
<axw> no particular rush
<axw> I'm going to work on manual AZ support now
<wallyworld> ok
<wallyworld> you might be EOD before I get back so have a good long weekend :-)
<axw> thanks wallyworld . enjoy soccer
<fwereade> jcw4, if you're still around, I can talk about it: I suspect if you're needing to look at the log collection you're doing it wrong somehow
<fwereade> axw, just responding to your review
<axw> fwereade: thanks
<fwereade> bodie_, I might not get to that in time for this evening, hopefully mgz will though
<fwereade> axw, so, I wrote you an essay: https://codereview.appspot.com/102860043/
<fwereade> mgz, are you around? in dimitern's absence I'd like what clarity you can provider on container addresses -- see my comment in that review ^
<axw> fwereade: thanks. when I said "relation unit in state", I really meant the scope doc (but that doesn't invalidate anything)
<fwereade> axw, I don't think the unit wants to watch its own scope/settings docs though
<fwereade> axw, it wants to watch something else, and the settings only get updated according to the unit's having responded to those changes, detected elsewhere
<fwereade> axw, anyway, I'm going to make some coffee and have a ciggie; ping me when you're ready to talk and I'll be there soon
<axw> fwereade: thanks, I'll just digest this first
<axw> fwereade: cool, it all makes sense. I kinda came to the same conclusion, after assuming that we could allow the RU to enter scope first. you mentioned the addresses-watching worker the other day, but it didn't click at the time
<fwereade> axw, also I suspect we already have the seed of that worker underway, because the networker is going to be responsible for setting up the machine's networks in the first place; having it update the machine with its addresses seems entirely natural to me
 * axw nods
<fwereade> axw, btw, is there anything new for review in that or can I wip it?
<axw> fwereade: nope, it can go to WIP
<fwereade> axw, (btw, if you want to propose that driveby cleanup on its own I would look favourably on such an endeavour :))
<axw> certainly
<axw> fwereade: not sure what else there is to discuss, but I am free now if you want to chat still
<fwereade> axw, the main thing that's on my mind is whether we can and/or should combine the relation model with the exposure model, and if so what it'd take to do so
<fwereade> axw, but that's not directly relevant to what you're doing
<axw> fwereade: frankly I don't think I know enough to have any useful input on that right now
<axw> maybe once I've gotten through the initial bits I will
<fwereade> axw, yeah, nobody does really tbh, I should probably find a cuddly toy and explain my thoughts to them and see if I still think they're a good idea afterwards
<axw> ;)
<voidspace> Morning all
<axw> morning void
<axw> voidspace rather
 * axw looks into the void
<mgz> fwereade: I'll take a look
<voidspace> axw: you know what they say - the void looks back
<axw> heh :)
<fwereade> axw, https://codereview.appspot.com/99660047/ LGTM but might benefit from a gentle massage to move some of the functionality to slightly more natural-feeling places
<fwereade> axw, let me know if the review doesn't give enough direction
<fwereade> wallyworld, hey, are you around? I wanted to talk about md5 vs sha256
<axw> fwereade: thanks, sounds fine
<axw> fwereade: my predilection for methods is to avoid polluting the namespace, but I understand the desire to keep methods near their types too
<fwereade> axw, one thing that floats around my mind is the idea that the functionality that depends entirely on the exposed interfaces of exposed types should probably be kept separate from the stuff that needs access to the internals
<axw> fwereade: yeah, that makes sense
<fwereade> axw, it's tricky especially in go, though, because if you want to use them in-package (which you very often do) you can't move them out (lest import cycles) and therefore you can't actually enforce those boundaries
<fwereade> axw, but free funcs in their own files are probably the closest we have
<fwereade> axw, (or ofc you can explode the package into a *load* of smaller packages but that's often a dauntingly big job even for a small package, and with something like state it's devastatingly hard
<fwereade> )
<axw> yeah, I'll just go with free funcs ;)
<niemeyer_> <mikespook> `Added charm "cs:precise/ntpmaster-3" to the environment.
<niemeyer_> <mikespook> ERROR cannot assign unit "ntpmaster/0" to machine: cannot assign unit "ntpmaster/0" to new machine or container: cannot assign unit "ntpmaster/0" to new machine: use "juju add-machine ssh:[user@]<host>" to provision machines`
<niemeyer_> That's a pretty bad error message
<fwereade> niemeyer, it is, isn't it :/ do we have a bug for it?
<fwereade> niemeyer, we definitely ought to catch that case earlier, regardless -- axw, are you still around?
<niemeyer> fwereade: That seems rooted on how these new error helper libraries work.. I bet this is not the only one
<fwereade> niemeyer, it's rooted in the old way of doing things, actually, it's a balance it's been difficult to maintain across the board
<fwereade> niemeyer, I have pretty high hopes that they'll help us present clearer error messages to users but record the context usefully as well
<fwereade> niemeyer, nobody ever said that too much annotation is not a problem when it leaks up to the user
<fwereade> dimitern, would you take a look at https://codereview.appspot.com/102860043/ and contribute your understanding of how we'll be handling the container-address/machine-address issues?
<niemeyer> fwereade: The problem is not just leaking to the user.. this is adding the same annotation repeatedly, which isn't helpful to developers either
<fwereade> niemeyer, quite -- it's a failure to annotate helpfully, which is orthogonal to the mechanism we use for annotation
<niemeyer> fwereade: It might be orthogonal, or it might not
<niemeyer> fwereade: If you stipulate that every level should annotate, this encourages developers to add boilerplate to fill up the form
<fwereade> niemeyer, and fwiw it's not *quite* the same annotation each time -- it does give you a trail of breadcrumbs that helps me as a developer see how we came to be in that situation
<fwereade> niemeyer, this is not a defence of the UX, or of the particular sequence of messages
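The breadcrumb trail fwereade describes comes from each layer annotating the error it receives from below; a minimal reproduction of the shape of mikespook's message (function names and nesting are illustrative, and the hint text is abbreviated):

```go
package main

import "fmt"

func provision() error {
	return fmt.Errorf(`use "juju add-machine" to provision machines`)
}

func assignToNewMachine(unit string) error {
	if err := provision(); err != nil {
		return fmt.Errorf("cannot assign unit %q to new machine: %v", unit, err)
	}
	return nil
}

func assignUnit(unit string) error {
	if err := assignToNewMachine(unit); err != nil {
		// Each layer repeats "cannot assign unit ...": useful as a
		// trail for developers, but exactly the redundancy niemeyer
		// objects to when it leaks to the user.
		return fmt.Errorf("cannot assign unit %q to machine: %v", unit, err)
	}
	return nil
}

func main() {
	fmt.Println(assignUnit("ntpmaster/0"))
}
```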
<fwereade> niemeyer, on a separate note, I was thinking I would grab a lift from cath when she comes into town this pm, so I could come and see you -- do you have any time this pm?
<fwereade> niemeyer, and this evening I was thinking of going into floriana (easy frequent bus ride from sliema) for http://www.whatson.com.mt/en/home/events/9093/ghanafest-2014.htm?fb_action_ids=10152073938286851&fb_action_types=og.likes&fb_ref=.U4gRG8XT_7Q.like (if you or anyone else might be interested)
<fwereade> niemeyer, I have no idea if it's any good
<niemeyer> fwereade: Oh, that looks nice
<niemeyer> fwereade: Yeah, pm is mostly free, besides a few events
<fwereade> niemeyer, ok, I'll probably get to the meridien a bit after 3
<niemeyer> fwereade: Cool
<wwitzel3> fwereade: I was wondering if you would pair with me on a review at some point today.  I find it beneficial to do a review, thinking out loud, and then watch someone else do the same review.
<dimitern> fwereade, looking
<fwereade> wwitzel3, it would be a pleasure, we should do it in the next 2 hours or so, but not right now I think, I need a bit of a lunch break
<wwitzel3> fwereade: at your convenience, just ping me
<voidspace> wwitzel3: morning
<fwereade> wwitzel3, it would be great if you would take a pre-look at https://codereview.appspot.com/92610043/ https://codereview.appspot.com/100810045/ https://codereview.appspot.com/94540044/ in the meantime -- and if you can come up with comments please go ahead and post them
<wwitzel3> voidspace: morning :) and any break on that precise regression?
<voidspace> wwitzel3: nope, although I have one indication it may just be a timing issue - I wonder if sleeping and retrying would help :-)
<fwereade> wwitzel3, regardless it will be good if you have a bit of context loaded for them before we begin
<voidspace> wwitzel3: http://efreedom.net/Question/1-9992535/Azure-MongoDB-Replica-Sets-Initialization-Error-Localoplogrs-Empty
<voidspace> There was no problem at all. This exception is normal behavior. Just need to wait couple of seconds and replica set will be initialized.
<voidspace> - See more at: http://efreedom.net/Question/1-9992535/Azure-MongoDB-Replica-Sets-Initialization-Error-Localoplogrs-Empty#sthash.sbXHTaFd.dpuf
<voidspace> Hah, original reference for that: http://stackoverflow.com/questions/9992535/azure-mongodb-replica-sets-initialization-error-local-oplog-rs-is-not-empty
<wwitzel3> fwereade: indeed
<voidspace> or maybe even that the replica set is already initialized
<voidspace> we see many logging lines of the same failure - maybe we're trying to initiate multiple times
<dimitern> TheMue, vladk, fwereade, standup?
<voidspace> I have to nip out for an early lunch
<voidspace> biab
<wallyworld> fwereade: i am now
<wallyworld> i used md5 cause it was native in gridfs, but can change it
<wallyworld> fwereade: just ping me if you want to chat
<perrito666>  hi ppl
<natefinch> morning
<hazmat> g'morning
<fwereade> wallyworld, bugger, my schedule is becoming problematic
<fwereade> wallyworld, did that review make any sense to you?
<fwereade> wallyworld, I fear it was a bit light on direction
<fwereade> wallyworld, mainly I really want us to get the existing storage api in place against gridfs before we try to build in the content-addressability stuff
<fwereade> wallyworld, or at least deduplication, sorry, my brain is firing on relatively few cylinders today
<wallyworld> fwereade: the gridfs storage is already implemented
<wallyworld> this is the next step
<wallyworld> i also considered pulling out a separate txn type base class but wanted to wait till I see how it evolved
<wallyworld> there's a few more branches to go
<wallyworld> the next step will be a ManagedResource service which uses ResourceCatalog and ResourceStorage to provide an exported, authenticated storage manager which handles dups etc
<wallyworld> we probably need to talk face-face. if you get a break in your schedule, ping me or we can do it later, whatever suits
<sinzui> voidspace, I am rebooting juju-ci-vapour.ws; my last desperate effort to address the lxc deploy issue
 * sinzui waits for jenkins to return
<perrito666> sinzui: still not able to reproduce that somewhere else?
<sinzui> perrito666, I have tried elsewhere, but I played over revisions of juju and the test passes.
 * sinzui rebuilds after upgrade and restart
<rogpeppe> so... import aliases: import (coretesting "launchpad.net/juju-core/testing"; jujutesting "launchpad.net/juju-core/juju/testing"; ???? "github.com/juju/testing")
<rogpeppe> oh yeah, i missed: stdtesting "testing"
<dimitern> and apiservertesting as well :P
<rogpeppe> any suggestions for github.com/juju/testing. i'm currently using gitjujutesting.
<wwitzel3> is the github package specific to juju or just living in that org namespace?
<rogpeppe> wwitzel3: it's under juju control, but it's not specific to -core
<rogpeppe> wwitzel3: i believe that juju-core/testing will continue to exist
<rogpeppe> ha, envtesting, toolstesting
<wwitzel3> rogpeppe: I haven't explored, but I've always wondered the difference between juju-core/testing and juju-core/juju/testing
<rogpeppe> wwitzel3: the former can be used by lower level packages - it doesn't import state
<wwitzel3> rogpeppe: ahh, ok
<wwitzel3> ugh yuck
<wwitzel3> rogpeppe: no matter what you do, you won't like it
<wwitzel3> rogpeppe: does that help? lol
<rogpeppe> wwitzel3: lol
<wwitzel3> I guess my initial reaction was to name the gh/juju/testing package something else
<wwitzel3> rogpeppe: also I really hate that we alias the go stdlib testing package as stdtesting .. if anything should just be testing, it should be the stdlib package, but I digress.
<wwitzel3> rogpeppe: I guess gitjujutesting .. it will only make your eyes bleed occasionally
<wwitzel3> if we just stopped writing tests this wouldn't be a problem ;D
<wwitzel3> rogpeppe: you could name it something like .. yatp
<wwitzel3> moartesting
<jcw4> fwereade: wrt. the log collection, I was just trying to understand where actionResults would go.  It seems clear now that it would be a collection in the juju db in mongo
<perrito666> hey, perhaps some of you might find this useful, patches are welcome, it is a pretty quick hack https://github.com/perrito666/golazytools/blob/master/gogrep
<jcw4> and probably held in the state struct like actions too
<natefinch> wwitzel3, voidspace: standup?
<wwitzel3> natefinch: yep, thanks
<fwereade> jcw4, yeah, I think it's just another collection, like actions
<bodie_> morning all
<fwereade> bodie_, heyhey
<jcw4> fwereade: yeah.  I started looking for logs 'cause it was the closest analog I could think of, and noticed that the watchers had their own mgo.Collection on them, and that's what led me astray
<jcw4> hi bodie
<fwereade> jcw4, yeah, the logs are really at a level below state
<jcw4> fwereade: I think I'm clear on where the collection should be now, but I'm not sure how much detail to put in the results... timestamp, unit?, output, error?, running time? etc. etc.
<mgz> jcw4: good questions
<mgz> all of that sounds good, I'm not sure on the form of the unit link though
<jcw4> mgz: I suppose we can start light just action name, unit id?, and output, and then add more detail later as needed?
<jcw4> mgz: yeah, globalKey? u.doc.Name? etc.
<jcw4> mgz: encoded in _id like the actions collection
<bodie_> okay, I'm going to see about getting the next steps plugged in, I guess an Actions() method on the Charm interface
<jcw4> bodie_: sounds good to me
<bodie_> anyone have a few spare cycles to give me a quick LGTM on https://codereview.appspot.com/94540044 ?
<bodie_> I think we've just about beaten out every possible flaw there, heh
<bodie_> unless fwereade has input about whether to move the yaml back to global scope
<bodie_> or to a file?
<mgz> bodie_: I wouldn't go *that* far :)
<bodie_> lol
<jcw4> "that far" being "beaten out every possible flaw"?
<jcw4> lol
<mgz> :D
<bodie_> I'm more than willing to go back over it with an industrial-strength iron, but I'd personally really like to be out of that file
<fwereade> wwitzel3, did you get a chance to look at that one? ^^
<wwitzel3> fwereade: that is the one I was just pulling up now, I just did them in the order you linked them and that one happened to be last :)
<mgz> bodie_: lgtm, land it. go to the launchpad page, set the commit message, and mark approved
<bodie_> aye-aye :)
<fwereade> mgz, you might need to land it for him?
<mgz> fwereade: I may need to toggle the approved
<mgz> he can set the commit message
<fwereade> wwitzel3, it's worth casting an eye over it anyway (but this is not a blocker on landing it)
<fwereade> mgz, +1
<mgz> ...I guess making him do it isn't that useful as we're changing all this, but hey
<wwitzel3> fwereade: sure, doing that now
<vladk> dimitern: machine-1-lxc-0 can't authorize machine-1/lxc/0, is it a bug?
<jcw4> mgz: it got pushed back at least once... maybe again...
<wwitzel3> fwereade: the first one I reviewed, add user info to juju switch, was fairly heavily reviewed, so I didn't have any comments on it. But I did read through the progression of the code and all the comments.
<fwereade> wwitzel3, I'm afraid I'm disappearing for half an hour to regather my sanity, because today has somehow been stupidly relentless, and you have been very nice about my inattention so I propose to trespass upon it further :/
<mgz> jcw4: I'll get yelled at if it does
<jcw4> mgz: lol
<fwereade> vladk, machine-1/lxc/0 is not a thing, is it?
<fwereade> vladk, it's a mix of tag and id syntax
<fwereade> vladk, machine-1-lxc-0 is the tag, 1/lxc/0 is the id
<wwitzel3> fwereade: haha, no worries, I'm here for hours yet :) enjoy your siesta
<vladk> fwereade: thanks
<fwereade> vladk, (or possibly my brain has turned to cheese, double-check in the juju-core/names package)
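(The tag/id distinction fwereade describes can be shown with a tiny conversion. The real helpers live in the juju-core/names package; this is a simplified stand-in, not that package's API:)

```go
package main

import (
	"fmt"
	"strings"
)

// machineTag converts a machine id like "1/lxc/0" into its tag form
// "machine-1-lxc-0": the tag uses dashes and a "machine-" prefix, the id
// uses slashes. Mixing the two, as in "machine-1/lxc/0", is not a thing.
func machineTag(id string) string {
	return "machine-" + strings.Replace(id, "/", "-", -1)
}

func main() {
	fmt.Println(machineTag("1/lxc/0"))
}
```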
<wwitzel3> head cheese
<perrito666> uhh, cheese
 * perrito666 is near lunchtime
<perrito666> I see there is not much consistency in the tests between jc.IsTrue and gc.Equals with true; jc.IsTrue sounds to me like the best way to express the assertion, is there any reason not to use it?
<rogpeppe> perrito666: nope
<rogpeppe> perrito666: the latter is what we had to use before jc.IsTrue
<perrito666> ah, I see
<rogpeppe> perrito666: (personally, i think gc.Equals expresses the logic perfectly, but i think some people begrudged the extra characters)
<rogpeppe> frankban, anyone else: review appreciated: https://github.com/juju/testing/pull/7
<bodie_> btw mgz, I think the nil exception was coming from a logical error in my tests
<bodie_> I caught that thanks to your suggestion to test the one edge bit
<bodie_> so, thanks :)
<mgz> :D
<bodie_> how detailed / verbose do we like our commit messages to be?
<rogpeppe> frankban: it's trivial BTW
<rogpeppe> bodie_: i generally prefix the commit message with the package name(s) where reasonable, and a single line that describes the change to some degree
<rogpeppe> bodie_: the important one is the MP message, which turns into the eventual trunk commit msg
<bodie_> I see, thanks
<frankban> rogpeppe: done
<rogpeppe> frankban: thanks
<wwitzel3> rogpeppe: why do you have to skip C (cgo) for it to work?
<rogpeppe> wwitzel3: because build.Default.Import fails for cgo's import "C"
<rogpeppe> wwitzel3: and the net package uses cgo
<rogpeppe> wwitzel3: "C" is a fake package - it doesn't actually exist anywhere
<wwitzel3> rogpeppe: reading about that now, it is essentially a marker for C preamble.
<rogpeppe> wwitzel3: yeah
<rogpeppe> wwitzel3: and C names are inside its name space
<wwitzel3> rogpeppe: yep, makes sense
<bodie_> rogpeppe, fwereade, mgz do you think I need to amend the code with a comment for the spec the Actions YAML is expected to take, or is that more of a docs/ addition?
<bodie_> I'm just combing the last kinks out of the MP message
<rogpeppe> bodie_: i think that a doc comment is always good
<rogpeppe> bodie_: for anyone trying to parse charms from go and using the package, that's all they'll see
<rogpeppe> bodie_: although you could add a link to the spec instead
<bodie_> hmm
<bodie_> okay
<bodie_> give me a few minutes to land something appropriate, I'm just putting my kid down for a nap
<bodie_> then hopefully we can finally get this thing in
<mgz> bodie_: sorry, can we land your branch yet?
<rogpeppe> review appreciated of this: https://codereview.appspot.com/99670045/
<rogpeppe> pretty trivial stuff - mostly just code movement
<voidspace> natefinch: wwitzel3: perrito666: hey guys, sorry I missed standup
<voidspace> natefinch: wwitzel3: perrito666: got stuck in traffic and have been out a stupidly long time
<voidspace> natefinch: I assume you didn't make any progress on the precise CI problem yesterday?
<bodie_> mgz, I'd like to basically copy the MR message into a comment for the Actions type, just give me a minute here
<bodie_> sorry for the delay
<voidspace> natefinch: from "further research" this morning I have a theory - going looking in the code to see if it could be correct
<bodie_> mgz, do I need to re-propose?
<bodie_> (I know I'd want this change as a user of the type)
<natefinch> voidspace: yeah, no, sorry, no progress
<mgz> bodie_: nope
<bodie_> mkay
<bodie_> okay, it says I need review since I pushed a change, even though it's a simple one...
<bodie_> just a richer comment on the type
<bodie_> so I guess that means I need to lbox propose
<bodie_> I promise the next iteration won't have such a low signal to noise ratio :/
<fwereade> bodie_, you just need to refresh the launchpad page and approve on the fresh page, I think
<bodie_> alrighty
<bodie_> fwereade, I'm not seeing the approved button any longer
<bodie_> can I just get a quick approval here... https://codereview.appspot.com/94540044
<bodie_> someone/anyone?  All I added was a couple of comments to types that needed to be more informative
<bodie_> https://codereview.appspot.com/94540044/
<bodie_> fwereade, mgz, natefinch, rogpeppe, just want to land this thing already, sorry to pester but apparently the comments I added to the code mean it needs a fresh review
<bodie_> just so it's on your radar
<mgz> bodie_: landing
<bodie_> thank ye sir
<bodie_> was that for merge 219926?
<mgz> why does their url for v4 have /latest/ rather than /v4/ or something >_<
<mgz> bodie_: I poked the commit message a little and flagged for the bot
<bodie_> oy...   I thought I linked the v4 draft.  bleh
<bodie_> yeah, that fact was driving me nuts
<mgz> oh, poo, I need to go pull in the deps
<bodie_> I think I made up for it by linking the v4 _schema_ rather than the draft since I couldn't find a copy of the v4 draft
<bodie_> technically, JSON-Schema lets you include a $schema key to define the schema to use -- however, I noticed the "latest", i.e. v4 draft, is from last august
<bodie_> I don't even know if they landed a final version, or if it fell out of date, or what's up with all that
<mgz> heh `fail`... I should totally make that tyop an alias for `tail -f`
<mgz> bodie_: bot is chewing on it
<jcw4> mgz lol
<mgz> bodie_: landed
<jcw4> woot
<mgz> everyone: you'll want to pull trunk and get the new gojsonschema related dependencies
<bodie_> pilgrim's progress
<bodie_> fwereade, anyone else, where do you think the Actions params validator should go?  since Actions is a map belonging to Charm, which needs to be loaded by gojsonschema in order to validate a Params map, it seems really sensible to me to implement that as a member of the Charm interface
<bodie_> something like func (c *Charm) Validate(map[string]interface{}) error { ... }
<bodie_> however, I could also see that being the responsibility of a method in State
<jcw4> bodie_: I don't think Charms or State should know about Action parameter validation (directly), I would expect the Validate to be on an ActionDefinition type (not sure what we'll call that type)
<jcw4> bodie_: in the example you give... How does Charm know *which* action you're validating?
<bodie_> sorry, one moment, helping the girls up
 * jcw4 wil brb
<bodie_> maybe we could use an Action interface to tie it all together?
<bodie_> like Charm
<bodie_> that probably isn't what we need since Charm has a different meaning for Actions then State will have
<bodie_> than*
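(A sketch of the shape jcw4 is arguing for: validation hangs off the action's own definition rather than off Charm or State, which also answers "which action are you validating?". The type name and fields are hypothetical, and a real version would delegate to gojsonschema rather than this stdlib-only stub:)

```go
package main

import "fmt"

// ActionDefinition is a hypothetical per-action type: it knows its own
// declared parameters, so validation naturally lives here.
type ActionDefinition struct {
	Name   string
	Params map[string]string // param name -> expected kind ("string", "int", ...)
}

// Validate checks a supplied params map against the definition. This stub
// only rejects undeclared keys; the real thing would load the action's
// schema into gojsonschema and validate types too.
func (d ActionDefinition) Validate(params map[string]interface{}) error {
	for k := range params {
		if _, ok := d.Params[k]; !ok {
			return fmt.Errorf("action %q: unknown parameter %q", d.Name, k)
		}
	}
	return nil
}

func main() {
	def := ActionDefinition{Name: "snapshot", Params: map[string]string{"outfile": "string"}}
	fmt.Println(def.Validate(map[string]interface{}{"outfile": "a.tar"}) == nil)
	fmt.Println(def.Validate(map[string]interface{}{"bogus": 1}) == nil)
}
```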
<voidspace> natefinch: I'm approaching EOD, and it still looks like we're no nearer a solution
<voidspace> natefinch: although interestingly it looks like the specific failure has changed - we're now seeing "Closed explicitly" as the error message
<voidspace> natefinch: we need to time bound how long we're going to continue looking at this I think
<sinzui> natefinch, voidspace : do you have a minute to review https://codereview.appspot.com/100880046
<voidspace> sinzui: looking
<voidspace> that was tricky :-)
<voidspace> natefinch: although this reference implies that the call to replSetInitiate can error, but be successful
<voidspace> http://stackoverflow.com/questions/9992535/azure-mongodb-replica-sets-initialization-error-local-oplog-rs-is-not-empty
<voidspace> There was no problem at all. This exception is normal behavior. Just need to wait couple of seconds and replica set will be initialized.
<voidspace> Inside Initiate we wait 1 second (well, ten sleeps of 100ms)
<voidspace> we could wait a bit longer (up to a few seconds) whilst polling for the replica set status
<voidspace> it could be a timing issue (slow machine) which is why it only happens on the build machine
<natefinch> voidspace: back
<natefinch> voidspace: hmm slow machine sounds possible
<voidspace> natefinch: hey, hi - only a few lines of scrollback
<voidspace> natefinch: the question is whether or not that specific error is genuinely recoverable
<voidspace> natefinch: we have replicaset.CurrentStatus()
<voidspace> natefinch: I could add polling for that - if we get an alive status (or equivalent) then continue
<natefinch> upping the timeout seems harmless.  bootstrap is already relatively slow, making it a few seconds slower in the edge case where there's a non-recoverable error seems like not a big deal
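(The "poll the status for a few seconds before giving up" idea can be sketched like this. `makePoll` is a hypothetical stand-in for `replicaset.CurrentStatus` on a slow machine, failing a few times before the replica set comes up healthy:)

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// makePoll simulates a status check that fails the first few times, the
// way a slow machine might while initiation is still in progress.
func makePoll(failures int) func() error {
	return func() error {
		if failures > 0 {
			failures--
			return errors.New("local.oplog.rs is not empty")
		}
		return nil
	}
}

// waitHealthy retries the status check at a fixed interval, treating the
// error as recoverable until the attempts run out: the "wait a couple of
// seconds and the replica set will be initialized" behaviour.
func waitHealthy(poll func() error, interval time.Duration, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = poll(); err == nil {
			return nil
		}
		time.Sleep(interval)
	}
	return err
}

func main() {
	err := waitHealthy(makePoll(3), time.Millisecond, 10)
	fmt.Println(err == nil)
}
```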
<voidspace> natefinch: MaybeInitiateMongoServer only returns an error
<voidspace> natefinch: ok I'll create an mp adding that and we can see if it fixes the problem
<natefinch> I hate this blindly swinging at invisible errors stuff.
<voidspace> me too :-/
<voidspace> natefinch: this log with debug (the CI console output logs are INFO only) shows us attempting to call replSetInitiate multiple times
<voidspace> http://pastebin.ubuntu.com/7544771/
<natefinch> gustavo might have a better idea of what's going on, but he's playing squash or something in Malta, I believe
<voidspace> and if it's "in progress" it makes sense that subsequent calls fail with this error
<voidspace> effectively meaning "I'm already initiating a replSet for you"...
<voidspace> you are only supposed to call it once
<natefinch> ahh right, good point
<voidspace> the odd thing is that the loop for multiple calls is explicitly checking for unreachable servers
<perrito666> voidspace: well the first call declares it fails, those can be retries
<voidspace> perrito666: we actually have two loops
<voidspace> one inside initiate.go and one inside replicaset.go
<voidspace> the initiate.go one logs "replica set initiation failed, will retry" and we're *not* seeing that
<voidspace> so it must be the replicaset.go one
<voidspace> hmmm... ah wait
<voidspace> the other loop is: if err != nil && err.Error() == rsMembersUnreachableError
<voidspace> looks like a debug log has been removed
<voidspace> perrito666: http://pastebin.ubuntu.com/7544771/
<perrito666> voidspace: migration step?
<voidspace> that log is showing repeated retries inside Initiate - but not repeated calls to the initial logging line
<voidspace> perrito666: what do you mean?
<perrito666> voidspace: sounds like something intended to introduce peergrouper in a harmless way
<perrito666> 31 // MaybeInitiateMongoServer checks for an existing mongo configuration.
<perrito666>  32 // If no existing configuration is found one is created using Initiate.
<perrito666> also there is a comment by rogpeppe
<perrito666> 43         // TODO(rog) remove this code when we no longer need to upgrade
<perrito666>  44         // from pre-HA-capable environments.
<voidspace> perrito666: right, but the logging shows we go into Initiate and then retry multiple times inside there
<voidspace> perrito666: and this reference implies that this error can be non fatal (i.e. initiation is still happening)
<voidspace> http://stackoverflow.com/questions/9992535/azure-mongodb-replica-sets-initialization-error-local-oplog-rs-is-not-empty
<voidspace> ok, I need to EOD
<voidspace> g'night all
<perrito666> voidspace: well in restore, after running init of rs I do run a loop trying to connect as the client until it actually works
<perrito666> voidspace: bye
<natefinch> night voidspace, thanks for the help.  Sorry it's been such a hassle
<voidspace> natefinch: heh, such is life
<rogpeppe> voidspace: the reason to remove that code is that in post HA-version environments, MaybeInitiateMongoServer should only be called once, at bootstrap time. at that time, there are no users, so there's no need to connect as a specific user
<voidspace> rogpeppe: right, but it's currently harmless and definitely *not* the source of the current problem
<rogpeppe> voidspace: ok, that's fine
<voidspace> which is a problem of initiating failing at bootstrap time
<voidspace> on one machine
<voidspace> and that machine only so far...
<rogpeppe> voidspace: ok, that's what i was about to ask
<voidspace> so, latest theory is that initiating is just slow on that machine
<voidspace> so I'll poll replicaset status and if we see a healthy replica set within a few seconds we'll ignore that error
<voidspace> as I've found one reference saying that this can be the case
<voidspace> if that fixes the problem then great
<voidspace> but for now...
<voidspace> good night :-)
<natefinch> rogpeppe: yeah, it's just the precise CI machine that can't run local provider with --replset for whatever reason.
<natefinch> rogpeppe: works on other peoples' precise VMs
<rogpeppe> natefinch: have we tried doing it manually on the CI machine?
<natefinch> I think Michael had said he tried it manually, but I'm not 100% sure
<natefinch> certainly worth testing out
<natefinch> sinzui: is there a way I can get on the precise CI machine to noodle with mongo?
<perrito666> natefinch: I think Michael mentioned not having access to the machine to try it manually
<natefinch> perrito666: ahh yeah, right, well, we should get on there.
<bodie_> okay, so I've gone and forgotten to switch branches before writing code, and committed to master
<bodie_> is there a quick and dirty way to just yank that commit into a new branch somehow without taking the fact that it came from master?
<bodie_> I'm not entirely clear on how much bzr cares about branch history vs file contents
<bodie_> like a cherry-pick
<bodie_> I guess that's a merge -c
<natefinch> sorry, no idea
<jcw4> bodie_: If that doesn't work, I would personally just generate a patch of the changes and then remove your local repo and start again
<jcw4> bodie_: otherwise, you'll be fighting the changes in master when you pull from lp:juju-core, and when you try to create new branches
<bodie_> right
<bodie_> I think I can merge -r 1 and then cherry-pick the incorrect changes from master, which should bring them in free of branch history info
<bodie_> then go back to master and merge -r 1
<mgz> bodie_: you just committted on master?
<jcw4> bodie_: unless you can actually remove those revisions from master I think you'll have issues
<mgz> oh, god, but you're using cobzr so I'm not sure how you address trunk >_<
<jcw4> mgz: :)
<bodie_> I think I can just refer to it as master
<bodie_> I was thinking if I could just cherry-pick the changes into a feature branch without branch history, then I could revert master to its rightful state
<bodie_> so, like john was saying, a patch would accomplish that
<bodie_> then just nuke my repo and start over, then apply the branch
<mgz> bzr branch -r-2 co:trunk co:prevtrunk
<mgz> is what I'd do
<mgz> give yourself a new branch of clean trunk
<mgz> then you can pull off the changes from trunk to a new branch of that
<mgz> and finally push --overwrite prevtrunk over trunk
<sinzui> Hi. Does juju not think that 1.19.3 is the last dev version? My branch that changes juju to 1.19.4 has the same failure in repeated tests. https://code.launchpad.net/~sinzui/juju-core/inc-1.19.4/+merge/221570
<bodie_> jcw4, do you know if bzr has a built in patching mechanism?
<jcw4> you mean like git apply?
<bodie_> this
<bodie_> http://doc.bazaar.canonical.com/plugins/en/bzrtools-plugin.html#patch
<bodie_> but my bzr is telling me it doesn't know what "patch" means
<bodie_> bzr patch --usage
<bodie_> cobzr: /usr/bin/bzr patch --help: exit status 3
<bodie_> bzr: ERROR: unknown command "patch"
<jcw4> seems like you don't have the bzrtools plugin?
<bodie_> ah
<bodie_> I wonder if that would mess with cobzr...
<bodie_> meh.
<jcw4> I don't think so
<bodie_> I think cobzr only filters out commands that it knows
<jcw4> http://doc.bazaar.canonical.com/plugins/en/plugin-installation.html
<jcw4> I bet you could cd to $HOME/.bazaar/plugins/ and then just do bzr branch lp:bzrtools
<bodie_> oh, good find
<sinzui> natefinch and his esteemed minions. We have two critical bugs :(
<sinzui> https://bugs.launchpad.net/bugs/1325074
<_mup_> Bug #1325074: Juju version cannot be set to 1.19.4 <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1325074>
<sinzui> https://bugs.launchpad.net/bugs/1325072
<_mup_> Bug #1325072: unit tests fail on utopic <ci> <test-failure> <utopic> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1325072>
<natefinch> sinzui: looking
<sinzui> oh, but I see arm64 unit tests run for the first time ever. Even if they fail, this is a big improvement over the compilation errors I have gotten over the last 3 months
 * sinzui finds children
<bodie_> hmm
<bodie_> jcw4, did we decide it made more sense to permit Actions to come back nil?
<bodie_> or to return an empty struct
<jcw4> from a Charm?
<natefinch> sinzui: unfortunately, I don't know what the log is supposed to look like when this kind of thing succeeds.  I can't imagine just bumping the revision just broke something.... is there no simplestreams data for 1.19.4, is that the problem?
<bodie_> yeah
<jcw4> I *think* empty struct... but I could convinced either way
<bodie_> hmm
 * jcw4 goes to pick up his kids from school
<bodie_> if we model it on Config, it'll come back empty
<bodie_> ah
<bodie_> fwereade, mgz, natefinch -- I noticed Config comes back with an empty Options member if there's no config.yaml
<bodie_> why not just return an empty Config?
<bodie_> I guess because then you can iterate on := range c.Config().Options without checking for nil
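(That iteration point is worth spelling out: in Go, ranging over a nil map is safe and simply yields zero iterations, so even a nil Options member doesn't force callers to nil-check. This is a simplified hypothetical config type, not the actual charm.Config:)

```go
package main

import "fmt"

// config mimics the shape under discussion: Options may be nil when the
// charm has no config.yaml.
type config struct {
	Options map[string]string
}

// optionNames ranges over Options without a nil check: a nil map behaves
// like an empty one for reads and range, so returning a struct whose
// Options member was never instantiated is still safe to iterate.
func optionNames(c *config) []string {
	var names []string
	for k := range c.Options {
		names = append(names, k)
	}
	return names
}

func main() {
	empty := &config{} // no config.yaml: Options stays nil
	fmt.Println(len(optionNames(empty)))
}
```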
<sinzui> natefinch, thank you for looking. I could consider this issue an offer from fate to let me take the weekend off.
<natefinch> sinzui: sounds like it to me
<natefinch> bodie_: empty is probably nicer
<bodie_> natefinch, I realized it's coming back empty now, but its members aren't instantiated
<bodie_> I think I'm going to return empty members
<bodie_> or uh, a struct with empty but instantiated* members
<bodie_> does that sound reasonable?
<natefinch> bodie_: but I don't think we have a convention
<natefinch> bodie_: depends on the struct.  Not sure exactly what you're making.  The fields all default to the zero value, which, if designed well, is an ok default
<bodie_> natefinch, perhaps I was getting a nil value for a different reason
<bodie_> thanks for helping me understand :)
<bodie_> jcw4, if you have spare time tonight, if you'd like you can check out the state side of the Charm Actions() interface method
<bodie_> I got the dir and bundle stuff put together and tested
<bodie_> I think it's good
<jcw4> cool.  pushed up to lp?
<bodie_> yeah, http://launchpad.net/~binary132/juju-core/charm-interface-actions
<bodie_> testing whole thing now
<bodie_> I think the only places the implementation needs to be locked down are state/store and state/charm
<jcw4> k
<bodie_> pushed another minor fix
<jcw4> k
<bodie_> jcw4, with that change I believe it's in the clear
<jcw4> in the clear meaning ready for me to refer to when looking at the state side implementation?
<bodie_> I'm going to propose -wip
<bodie_> just need to add a few tests, methinks
<bodie_> heading out for the night, laters
<jcw4> ttyl
<bodie_> lbox proposed -wip :) https://codereview.appspot.com/99640044 (fwereade, mgz?)
<bodie_> Still needs a few tests implemented
<jcw4> cool
<jcw4> long lunch alexisb
<alexisb> jcw4, yes it was
<jcw4> :)
<alexisb> I met a colleague in town, but town for me is a long ways away
<alexisb> so I took the opportunity to run some in town errands
<jcw4> ... I was thinking you were east coast like most of the folks, but I forgot you were in my old stomping grounds
<alexisb> o yeah? where are your old stomping grounds?
<jcw4> I lived in Vancouver (WA) and worked in Portland from 2004 until January
<jcw4> Now I'm down in Sacramento
<alexisb> heh that is funny, I was born in Sacramento
<jcw4> haha
<alexisb> jcw4, where in Sacramento do you live?
<jcw4> Natomas, basically two blocks from the fields to the north
<alexisb> nice
<jcw4> alexisb: reprieve from the rain for sure
<alexisb> yes, Sacramento is not a bad town, when I graduated we looked at living there and me working in folsom at Intel
<jcw4> that would have been nice.  Folsom is quite pretty
<alexisb> but ultimately the commute was more than I wanted to deal with
<jcw4> I dunno, PDX traffic was way worse than any I've encountered here
<jcw4> although I AM working from home
<jcw4> :)
<alexisb> :)
<alexisb> if you have to be in portland yes, traffic sucks, especially vancouver to pdx
<jcw4> yeah... that was the main problem.  Two bridges
<alexisb> but I actually live in yamhill and also wfh :)
<jcw4> Oh, very nice... LOVE yamhill county and IIRC the little town of yamhill is very very nice
<alexisb> yes we really like it here
<jcw4> thats on the back road between dundee and forest grove right?
<alexisb> yep
<jcw4> I'm surprised you get decent internet there?
<alexisb> a very good local company called coho
<alexisb> they do a nice job and give very reliable service
<jcw4> very nice
#juju-dev 2014-06-01
<perrito666> hello, a lot of people for a sunday
<sebas5384> cmars: ping
#juju-dev 2015-05-25
<davechen1y> ping http://reviews.vapour.ws/r/1773/
<thumper> davechen1y: shipit
<davechen1y> it looks like the reviewboard <> github integration is broken
<davechen1y> reviewboard doesn't understand markdown
<davechen1y> [LOG] 0:10.674 ERROR juju.worker exited "rsyslog": failed to write rsyslog certificates: cannot create temp file: open /var/log/juju/rsyslog-cert.pem412364416:
<davechen1y> no such file or directory
<davechen1y> [LOG] 0:10.675 INFO juju.worker restarting "rsyslog" in 250ms
<davechen1y> mocking fail
<davechen1y> [LOG] 0:10.702 INFO juju.worker start "reboot"
<davechen1y> [LOG] 0:10.706 ERROR juju.worker exited "reboot": mkdir /var/lib/juju: permission denied
<davechen1y> [LOG] 0:10.706 INFO juju.worker restarting "reboot" in 250ms
<wallyworld> menn0_: did you have any time for me to ask questions about that mgo txn / leadership issue?
<davechen1y> thumper: ping http://reviews.vapour.ws/r/1774/
<thumper> davechen1y: shipit
<davechen1y> thumper: ta
<davechen1y> ... value string = "error: dial unix /mnt/tmp/check-8992814163298018886/10/bad.sock: no such file or directory\n"
<davechen1y> ... regex string = ".*/mnt/tmp/check-8992814163298018886/10/bad.sock: .* no such file or directory\n"
<davechen1y> how are these two not a match ?
<davechen1y> oh, i see, two spaces
<menn0_> wallyworld: sorry, just back after being taken out for lunch
<wallyworld> menn0_: no probs, hope you enjoyed it
<menn0_> wallyworld: i can talk now if you're still free
<wallyworld> sure, onyx for a change
<menn0> wallyworld: see you there
<axw> wallyworld: do you need me to look at bugs, or carry on with storage cards?
<wallyworld> axw: sec, talking to menno
<axw> sure
<wallyworld> axw: the bugs situation is looking a little dire, so if you could pick up anything from beta 5 milestone that would be great eg bug 1454678
<mup> Bug #1454678: "relation-set --file -" doesn't seem to work <landscape> <relation-set> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1454678>
<axw> wallyworld: okey dokey
<davechen1y>  LOG_TXT='17.04,Angsty Antelope,angsty,2016-10-23,2017-04-30,2018-01-29'
<davechen1y> orly
<thumper> :)
<davechen1y> better than, Awful Aardvark
<davechen1y> wallyworld: your build got shat on
<davechen1y> + set +e
<davechen1y> + scp -o 'StrictHostKeyChecking no' -o 'UserKnownHostsFile /dev/null' /var/lib/jenkins/workspace/github-merge-juju/juju-core_1.24-beta5.tar.gz 'ubuntu@ec2-54-152-75-93.compute-1.amazonaws.com:~/'
<davechen1y> ssh: connect to host ec2-54-152-75-93.compute-1.amazonaws.com port 22: Connection timed out
<davechen1y> lost connection
<davechen1y> i see the problem
<davechen1y> set -e not set +e
<davechen1y> axw: sorry, that was your build that got shat on
<axw> davechen1y: thanks
<axw> happens every now and again
<davechen1y> who can fix the build plan
<davechen1y> i'm pretty sure it's leaked a machine
<mup> Bug #1458447 was opened: Enforce 'optional: false' in metadata.yaml <juju-core:New> <https://launchpad.net/bugs/1458447>
<davechen1y> axw: you can't catch a break today
<axw> the internets hate me
<axw> also, iinet. 8 minute callback ends up being 30 minutes, then I have to wait on the phone again
<davechen1y> they are fuckers
<davechen1y> protip: iinet and internode are the same price
<davechen1y> vote with your feet
<axw> davechen1y: I'm thinking about moving, waiting to see what happens with takeover
<davechen1y> same
<davechen1y> afaik although they have the same parent company
<davechen1y> they still run independent networks
<davechen1y> so internode should suck less
<mup> Bug #1458452 was opened: Enforce 'limit' in metadata.yaml <juju-core:New> <https://launchpad.net/bugs/1458452>
<axw> wallyworld: thanks for updates, replied
<wallyworld> ty, looking
<wallyworld> axw: i have a meeting now, will see what i can do afterwards. hopefully what is needed can be encoded into an assert
<axw> wallyworld: that jujuc stdin thing is actually a bit of a PITA. "juju run" uses "bash -s" to read commands from stdin, which conflicts with jujuc commands run underneath
<wallyworld> oh joy
<wallyworld> so it's thumper's fault :-)
<axw> so not actually fixed yet
<wallyworld> np, i think this week will be a write off for feature work :-(
<jam> wallyworld: just finishing up my previous meeting, will be there in a bit
<axw> wallyworld: looks like it
<axw> write off that is...
<wallyworld> yeah
 * dimitern steps out for 2h
<axw> wallyworld: I've got a telco technician coming round some time tomorrow to diagnose a line quality issue, so may be offline for some time
<wallyworld> np
<axw> some time between 8am-6pm, very helpful :)
<wallyworld> usually the way
<axw> wallyworld: installing things on a Windows VM to make sure I don't bugger up the utils/exec code before proposing a change to "juju run" .. hope to be done with that bug tomorrow
<wallyworld> thanks for checking on windows
<axw> nps, about time I ran the tests there anyway
<wallyworld> wonder if there's a CI test that is run already
<wallyworld> i'll ask tomorrow
<axw> I *think* there is
 * dimitern is back
<anastasiamac> dimitern: \o/
<anastasiamac> dimitern: :D
<dimitern> anastasiamac, hey :)
<anastasiamac> dimitern: just wanted to say it's good u r back :D
<dimitern> anastasiamac, good to be back :)
<dimitern> any reviewers around to have a look at http://reviews.vapour.ws/r/1777/
<dimitern> ?
<mgz> dimitern: I can take a look in a min
<dimitern> mgz, thanks!
<thumper> so... us holiday?
<waigani> axw_: when you have a moment, see my comment on 1457728
<menn0> thumper: I'd like to merge current upstream into the db-log feature branch: https://github.com/juju/juju/pull/2419
<thumper> menn0: jfdi this type of thing
<menn0> thumper: kk
 * thumper sighs
 * thumper wishes that the ... worked when there were already some args and you just want extra
<wallyworld> waigani: hey, i see you've assigned yourself to the cinder bug - could you leave that one to us? there are other more important ones for the 1.24 release
<wallyworld> waigani_: you dropped off irc i think? did you see my last message?
<waigani_> wallyworld: I didn't get pinged but I see it now, okay I'll leave it. Is this the best list to go by: https://launchpad.net/juju-core/+milestone/1.24.0
<wallyworld> waigani_: there's still bugs on beta5 also, and beta5 will be the next release, but yeah 1.24.0 milestone is also relevant
<waigani_> wallyworld: okay, I'll work through beta5, cheers
<wallyworld> i'm intending to try and take a first look at leadership related bug(s) today
<wallyworld> ty
<wallyworld> waigani_: looking at beta5, probably the remaining bug to look at is bug 1457225, not sure what you'll find there, i haven't looked in detail
<mup> Bug #1457225: Upgrading from 1.20.9 to 1.23.3 works, but error: runner.go:219 exited "machiner": machine-0 failed to set status started: cannot set status
<mup> of machine "0": not found or not alive <cts> <sts-stack> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1457225>
<waigani_> wallyworld: ha, I just opened it
<wallyworld> might be a remaining upgrade issue, not sure
<wallyworld> maybe we can't repro
<waigani_> I'll see what I can grok
<axw_> wallyworld: eep, I got cut off
<wallyworld> ah
<waigani_> axw_: thanks for taking a look. If you were testing on tip of 1.24 (which is beta5), what were you upgrading to?
<axw_> waigani_: itself
<axw_> waigani_: --upload-tools bumps version build number. didn't require any code change to trigger the bug
<waigani_> axw_: ah okay, just --upload-tools, increment patch no. and that caused the bug?
<axw_> yes
<waigani_> axw_: what did you have deployed?
<axw_> wallyworld: can you please merge that change? I don't have rights
<wallyworld> ok
<axw_> waigani_: I have a dummy charm with a peer relation, just added 2 units of that
<axw_> waigani_: not sure if presence of units makes a difference or not
<wallyworld> axw_: done
<axw_> wallyworld: thanks
<waigani_> axw_: okay, thanks for the info. I'll keep digging.
<axw_> waigani_: nps, thanks
#juju-dev 2015-05-26
<davecheney> thumper: http://paste.ubuntu.com/11360922/
<davecheney> current state of play
 * thumper looks
<thumper> davecheney: seems only 10 packages have races
<davecheney> this was run without -p 1
<davecheney> so some tests timed out
<davecheney> because of contention on the cpu
<davecheney> yeah 10 looks about right
 * davecheney makes cards
<mwhudson> davecheney: so, the "don't strip go binaries" thing
<mwhudson> davecheney: do you know what the actual problems are, or is it more "it's not tested and sometimes breaks things so don't do it"?
<axw_> thumper: when you have a moment, can you glance over https://github.com/juju/utils/pull/134 and tell me if there's any reason why this should break "go run"?
<axw_> please
<thumper> ok
<thumper> axw: do I take it from this question that it is breaking juju run?
<axw> thumper: err yeah, juju run not go run :)
<davecheney> mwhudson: it's sort of self referential
<davecheney> strip(1) doesn't really follow elf
<axw> thumper: context: fixing https://bugs.launchpad.net/juju-core/+bug/1454678
<mup> Bug #1454678: "relation-set --file -" doesn't seem to work <landscape> <relation-set> <juju-core:Triaged> <juju-core 1.24:In Progress by axwalk> <https://launchpad.net/bugs/1454678>
<davecheney> it just doesn't mangle gcc produced things
<davecheney> so that broke go binaries
<davecheney> mainly anything that wasn't amd64
<axw> thumper: with my pending fix, jujud would consume stdin and pass it to the backend
<davecheney> now, we don't test stripped binaries
<davecheney> so if they got better or worse over time, we don't know
<axw> thumper: that breaks juju run, because it reads the subsequent commands piped to bash
<thumper> hmm...
<axw> thumper: e.g. if you did "juju run 'cat; echo 123'", you'd get output of "echo 123" rather than "123"
<davecheney> so it's sort of a circular problem, we tell people not to strip, they file bugs, we close them, we don't test that strip works, we tell people not to strip binaries, etc
<thumper> axw: well, juju run just calls 'juju-run' on the server, which enters a hook context to execute the commands...
<thumper> couldn't we just change how the juju-run server side command sends the actual script?
<thumper> axw: cmd/jujud/run.go
<thumper> axw: couldn't we just hook up the stdin around line 111
<thumper> ?
<axw> thumper: that doesn't solve this particular issue, though we might want to do that too. the problem is that at the moment, hook tools don't accept stdin at all
<axw> hang on, I'll link my branch
<thumper> I have the pull from above
<axw> wallyworld: http://reviews.vapour.ws/r/1776/
<wallyworld> ok
<axw> err sorry, thumper^^
<axw> wallyworld: ignore sorry
<axw> thumper: so, atm you cannot do "echo yaml | relation-set ... --file=-"
<axw> thumper: my branch changes it so you can. but that showed up a problem in a test where a hook tool was running underneath "juju run"
<axw> thumper: if there are multiple hook tool commands in the same juju-run, then the first one would consume the stdin which happened to be the rest of the juju-run commands
<thumper> ah...
<thumper> that's kinda weird
<thumper> and a bit strange...
<thumper> not quite sure how to fix that
<thumper> sorry
<axw> thumper: my change to utils/exec fixes it :)  I'm just wondering if there's any reason why we shouldn't do it.. I don't think so
<thumper> axw: I can't see a reason not to
<axw> thanks
<davecheney> thumper: there are a SHITLOAD of changes on juju/utils
<davecheney> which aren't deployed because godeps has pinned the version way back in the past
<thumper> success!
<davecheney> no
<davecheney> hold on that
<davecheney> for some reason godeps didn't update my working copy
<davecheney> anyone http://reviews.vapour.ws/r/1782/
<axw> thumper: how do I turn up logging in tests? is there a doc on this somewhere?
<thumper> axw: in the setup, do something like this:
<axw> thumper: no env var? :\
<thumper> loggo.GetLogger("juju.whatever").SetLogLevel(loggo.TRACE)
<axw> ok, thanks
<thumper> axw: no we protect all the tests from the environment
<axw> sure, we could set up logging and then remove the env var tho
<axw> doesn't matter, that'll do for now
<davecheney> is anyone looking at the bug in reviewboard that causes it to shit on markdown links?
<davecheney> axw: thanks for the review, here is another https://github.com/juju/juju/pull/2420/files
<axw> LGTM
<mup> Bug #1458717 was opened: utils/featureflag: data race on feature flags <juju-core:New> <https://launchpad.net/bugs/1458717>
<mup> Bug #1458721 was opened: lease: data races in tests <juju-core:New> <https://launchpad.net/bugs/1458721>
<axw> davecheney: dunno about the markdown links. I pinged ericsnow, but didn't hear back
<mwhudson> davecheney: right, i get the self-referential bit
<mwhudson> maybe i'll try to bang on the details for 1.6 or something
<davecheney> mwhudson: external linking passes everything to /bin/ld ?
<davecheney> that may work
<mwhudson> davecheney: yes
<davecheney> but using the internal linker will probably cause sadness
<mwhudson> ah yeah
<mwhudson> makes sense
<menn0> thumper: here's the PR to move the unit agent: http://reviews.vapour.ws/r/1784/
 * thumper looks
<thumper> shipit
<menn0> thumper: sweet
<davecheney> thumper: on kanban, the LP bug link just sends me back to the board, not to lp
<thumper> davecheney: I'll fix it
<thumper> it is board specific
<thumper> and I didn't set it assuming the board I copied did
<davecheney> ta
<thumper> davecheney: done
<thumper> menn0: I'm thinking I should have perhaps, maybe, not tried to do all this at once
 * thumper takes another bite of the elephant in the package
 * thumper makes it compile first
<menn0> thumper: I know that feeling well
<davecheney> menn0: nice change on moving code out of the cmd
<thumper> order of operation:
<davecheney> testing commands is a pain
<thumper> tests compile first
<davecheney> move the code elsewhere
<thumper> tests pass second
<thumper> tests right and correct third
<thumper> although perhaps 2 and 3 will be reversed
<davecheney> 1, 2, you know what to review, http://reviews.vapour.ws/r/1785/
<menn0> davecheney: thanks... the change was essential in order to properly test what i'm working on
<axw> davecheney: RB is screwed, I can't reply to your comment. I don't think it makes sense to change to io.Writer, since we want to buffer the output and return it as []byte
<davecheney> fair enough
<davecheney> i couldn't see from the diff
<davecheney> so it was easier to throw a comment over the wall
<davecheney> anyone want to return the favor
<davecheney> http://reviews.vapour.ws/r/1785/
<davecheney> it's a 2 line change
<mup> Bug #1458693 was opened: juju-deployer fills up ~/.ssh/known_hosts <juju-core:New> <https://launchpad.net/bugs/1458693>
<davecheney> axw: why do you think moving the line above the go statement changes the semantics of the test ?
<axw> davecheney: because the time is going to be different
<axw> davecheney: seems the time is meant to be after the lease was claimed
<davecheney> sure, but that go routine may not be scheduled til some point in the future
<davecheney> how about I move more code up ?
<axw> davecheney: that's what I'm suggesting: move the ClaimLease call above "leaseClaimedTime := time.Now()"
<davecheney> axw: done
<davecheney> ptal
<davecheney> fwiw both versions passed my stress test
<davecheney> but yours is more correct
<axw> davecheney: LGTM
<axw> thanks
<thumper> ok... I gotta go cook dinner before picking rachel up from the airport
<thumper> see you folks tomorrow
<davecheney> oh the irony
<davecheney> http://paste.ubuntu.com/11364012/
<mup> Bug #1458741 was opened: cmd/jujud/agent: TestJobManageEnvironRunsMinUnitsWorker fails <juju-core:New> <https://launchpad.net/bugs/1458741>
<anastasiamac> axw_: tyvm :)
<anastasiamac> axw_: I'll look tonite :D
<axw_> anastasiamac: nps
<anastasiamac> axw_: this store that I am adding ("allecto") exists on the charm that I am using.
<anastasiamac> axw_: the whole idea was to use charm with storage
<anastasiamac> axw_: and this one has 2 charm stores :D
<anastasiamac> i'll update the code later on but i think u r spot on the money with writechanges!
<anastasiamac> axw_: brilliant! tyvm :)))
<axw_> anastasiamac: sorry, didn't realise storage-block had been updated
<anastasiamac> axw_: guilty as charged :))
<axw_> anastasiamac: writeChanges shouldn't cause your test to pass though, that would only make a difference if you passed an error into FlushContext
<axw_> anastasiamac: ah, I know what the issue is then
<axw_> anastasiamac: you didn't specify a Count, so it was set to the MinCount of that store which is 0
<axw_> anastasiamac: it should default to 1
<axw_> (in the case of this method only)
<anastasiamac> axw_: omg! u r 100% right!!! thnx!!!
<anastasiamac> axw_: :D
<anastasiamac> axw_: i need this store to have 0, so I'll pass Count as 1 in the test :)
<anastasiamac> axw_: the whole idea of adding this store to the test charm was to have a 0 for count range :)
<axw_> anastasiamac: I think state.AddStorageToUnit should set Count to 1 if it's 0
<anastasiamac> axw_: sure?
<anastasiamac> axw_: u don't want it to send an error back? saying env default is 0 so storage wasn't added?
<axw_> anastasiamac: doesn't make sense to add storage with 0 count
<axw_> anastasiamac: IMO, storage-add should add a single instance unless otherwise specified
<axw_> anastasiamac: so maybe the state method should just error if Count is 0/unspecified
<axw_> and require the client to specify it
<anastasiamac> axw_: k, i'll add it to PR too! thanks for the thoughts :D
<anastasiamac> axw_: at state - err if count is 0; in storage-add - set count to 1 if none specified
<axw_> anastasiamac: yep. storage.ParseConstraints already does that (you're using that right?)
<axw_> yes you are
<axw_> anastasiamac: so, just error if Count is 0 and fix the tests to specify non-zero count
<anastasiamac> axw_: will do! tyvm :)))))))))
<mup> Bug #1458754 was opened: $REMOTE_UNIT not found in relation-list during -joined hook <juju-core:New> <https://launchpad.net/bugs/1458754>
<mup> Bug #1458758 was opened: enable to execute a command/script on lxc/kvm hypervisors before containers are created <feature-request> <juju-core:New> <https://launchpad.net/bugs/1458758>
<dimitern> reviewers ? PTAL http://reviews.vapour.ws/r/1777/
<wallyworld> dimitern: what are the plans for bug 1348663 ? given 1.24 is delayed till next week, are there plans to fix?
<mup> Bug #1348663: DHCP addresses for containers should be released on teardown <maas-provider> <network> <oil> <juju-core:Triaged by mfoord> <juju-core 1.24:Triaged by mfoord> <MAAS:Invalid> <https://launchpad.net/bugs/1348663>
<dimitern> wallyworld, yes, the plan is to work around this by using the new devices api from maas - michael is working on implementing it this week
<wallyworld> dimitern: awesome ty. for 1.24 then i assume?
<dimitern> wallyworld, at the very least juju lets maas (1.8+) know when it spins up a container and which node is its parent
<wallyworld> great
<dimitern> wallyworld, yes, I hope we'll make it for 1.24.0, if not - for .1
<wallyworld> dimitern: ok, maybe then we move that bug off beta5 milestone and onto 1.24.0
<dimitern> wallyworld, sounds good to me
<wallyworld> done
<dimitern> cheers!
<dimitern> wallyworld, if you can, can you review http://reviews.vapour.ws/r/1777/ please?
<wallyworld> ok
<axw_> fwereade: any thoughts on how to fix this? https://bugs.launchpad.net/juju-core/+bug/1457728/comments/6
<mup> Bug #1457728: `juju upgrade-juju --upload-tools` leaves local environment unusable <local-provider> <upgrade-juju> <vagrant> <juju-core:Triaged> <juju-core 1.24:In Progress by axwalk> <https://launchpad.net/bugs/1457728>
<axw_> fwereade: my initial thought is to make it more like the watcher API, which can be canceled when the worker is killed
<wallyworld> dimitern: done, but a few comment sorry. i have to run away to soccer for a bit but will be back later
<dimitern> wallyworld, ta!
<dimitern> wallyworld, I was trying to find a way not to use JujuConnSuite, but couldn't find how - ideas welcome
<dimitern> axw_, ^^
<axw_> dimitern: see {api,apiserver}/diskmanager for example
<dimitern> axw_, ah, ok - thanks!
<axw_> dimitern: convert the state.State to an interface {ResumeTransactions()}
<axw_> then in the tests you replace the state.State with a mock version
<wallyworld> dimitern: i referenced diskmanger in the comments :-)
<dimitern> axw_, the problem is RegisterStandardFacade needs a factory method taking *state.State
 * wallyworld runs away to soccer 
<axw_> dimitern: yeah that's a bit of a pain. couple of options: limited use of PatchValue as in apiserver/diskmanager, or have the factory defer to some other code that takes an interface
<dimitern> axw_, right, that's an option, but we really should change facade factory methods across the board to avoid the need to pass state
<axw_> dimitern: I agree
<axw_> just haven't gotten around to it :)
<fwereade> axw_, oops, sorry, looking
<fwereade> axw_, I'm not sure the Block is intrinsically the problem; but, yes, a watcher-style approach would be much more in keeping with everything else in juju
<fwereade> axw_, the core problem I *think* is that the block can outlive the manager responsible for notifying of the change
<axw_> fwereade: yeah, the lease manager on the apiserver just exits without notifying the subscribers
<axw_> fwereade: so they just sit there waiting, forever
<fwereade> axw_, grrrmbl
<fwereade> axw_, it has a few other hang bugs too
<axw_> fwereade: so we can close those channels, but I'm not too sure how to prevent new ones from coming in yet. the whole thing's a singleton, which makes it slightly difficult
<fwereade> axw_, the singleton is a goddamn nightmare
<fwereade> axw_, let me forward you a couple of mails
<axw_> okey dokey
<fwereade> axw_, if you have input re replacing it cleanly I would be most grateful
 * axw_ lights the pipe and puts on his reading glasses
<axw_> sure thing
<fwereade> axw_, but every approach I can see has tentacles :(
<fwereade> axw_, I'm going out for a short run soon but ping me and I'll respond when I can
<axw_> fwereade: will do, I'll have to digest all of this first
<fwereade> axw_, yeah, I'm not expecting immediate responses at all :)
<axw_> :)
<axw_> fwereade: I'll investigate making lease a non-singleton. will let you know if I get anywhere
<fwereade> axw_, awesome, tyvm, http://reviews.vapour.ws/r/1787/ and my responses may be relevant background also
<axw_> ok
<axw_> fwereade: re worker dependencies, I think I'd avoid that initially and return an error if the apiserver facade attempts to use the lease manager if the worker is stopped. is that reasonable?
<fwereade> axw_, yeah, that's fine by me
<fwereade> axw_, but then we need a strategy for wiring the fresh lease manager into the api server when it's bounced...
<axw_> fwereade: ah, I was thinking they'd all bounce.. that won't happen though will it. unless we make all lease-manager errors fatal.
<fwereade> axw_, if we made the lease manager part of state directly we might cut through that problem entirely
<fwereade> axw_, a state already looks after the watcher and presence "worker"s
<fwereade> axw_, it's not a *good* solution but it might make a good solution easier to see
<fwereade> axw_, not sure
<fwereade> axw_, really have to go out now, bbs
<axw_> sure, ttyl
<dimitern> axw_, fwereade - http://reviews.vapour.ws/r/1777/ PTAL
<dimitern> fwereade, you'll like this I believe :) ^^
<axw_> dimitern: is resumer really run once per env? I would've thought it'd be once for the state server
<axw_> I don't think there's a separate txn log per env is there?
<dimitern> axw_, I think it's run once per state server (jobmanageenviron)
<axw_> dimitern: sorry, reading fail. I saw perEnvSingular and read perEnv
<dimitern> axw_, ah :)
<dimitern> axw_, yeah - perEnvSingular could be named better - like envManagerWorkers
<axw_> dimitern: actually... it does look like it'll be one per (hosted) env
<axw_> env worker manager starts those workers for each env in state
 * axw_ doesn't know JES well
<dimitern> axw_, hmm - well, that smells fishy
<dimitern> axw_, but I haven't changed the logic there I believe
<axw_> dimitern: you moved it into startEnvWorkers, so I *think* there'd be one of them per hosted env. I could be wrong, thumper and co could tell you definitively. anyway, I'll keep reviewing
<dimitern> axw_, fair point, will ping thumper or menn0
<axw_> dimitern: stupid question. what do we gain by running this over the API anyway? it's pretty closely tied to mongo
<dimitern> axw_, satisfying the "thou shalt not use state directly ever" concept :)
<dimitern> axw_, fwereade is really keen on this and I agree - better isolation, mockability, etc.
<dimitern> axw_, I guess I could move the starting of resumer in postUpgradeAPIWorker when isEnvironManager == true
<axw_> dimitern: mk. well, what's there LGTM, apart from that possible per-env issue
<dimitern> axw_, thanks!
<axw_> dimitern: yeah that looks like it'd work
<dimitern> axw_, it will still run 1 resumer per apiserver I guess, but it should work regardless
<dimitern> (for all hosted envs and in HA setup)
<axw_> hm yeah, we don't have singular workers over API. welp, I dunno. is it valid for two things to try to resume transactions?
<axw_> I guess it must be
<dimitern> axw_, looking in state/txn.go - ResumeAll() that ultimately gets called, it seems we always find all txns and try to resume !tapplied || !taborted
<perrito666> mornin
<wallyworld> fwereade: with that pr, i was only trying to do the minimal work to improve what was there for 1.24, not solve the bigger picture issues which would take a lot more effort. i was hoping that as long as what was there was no worse, and hopefully better than what exists, it could solve the huge txn queue issues (but not everything else)
<fwereade> wallyworld, I *suspect* that all that'd take is dropping the delete/add, and leaving everything else as is
<fwereade> wallyworld, but the txn builder doesn't add anything afaics -- if anything it makes it slightly worse by making the lease managers more relentless in overwriting one another
<fwereade> wallyworld, (I think?)
<wallyworld> fwereade: that last point i did question - i think it could be changed to just error out if the txn revno differed
<fwereade> wallyworld, it doesn't help
<fwereade> wallyworld, you're just checking that the database looked how it did when you decided to make the change
<fwereade> wallyworld, but you're not using the database to help you decide whether that change is sane
<wallyworld> well isn't the database looking as you expect sufficient?
<fwereade> wallyworld, no, because the only component that knows how it should look is the lease manager
<fwereade> wallyworld, the lease persistor is just doing as it's told and not synchronising anything afaics
<fwereade> wallyworld, it's only the lease manager that understands on what basis it's replacing the lease, but it's keeping that basis secret from the persistor, so the persistor can't know whether it's still a good idea at the time it looks at the db
<wallyworld> hmmm, sounds like the lease manager needs to use the db as a point of synchronisation rather than an in memory model
<fwereade> wallyworld, I think that is unquestionable
<wallyworld> it could work if we could guarantee that the db 1:1 reflected the in memory model, but that doesn't work for ha etc
<fwereade> wallyworld, it's one of those communication screwups where I'd thought that was the only way that could ever possibly work, and that clever in-memory stuff might be a smart optimisation
<fwereade> wallyworld, it didn't even cross my mind that we'd try to build a distributed lease manager *without* synchronisation
<wallyworld> it wouldn't be so bad if mongo wasn't so fucking dumb
<fwereade> wallyworld, yeah, it's a genuinely interesting problem
<wallyworld> so i was looking for a quick 1.24 fix (not perfect)
<wallyworld> i thought that by at least making the db writes conditional, we may avoid the huge txn queue issue
<wallyworld> not trying to fix everything
<wallyworld> also not ignoring errors
<fwereade> wallyworld, I haven't checked yet but I strongly suspect that the huge queues are because of the delete/add
<wallyworld> at least we'd see what may be failing
<wallyworld> right, so the delete add is gone
<fwereade> wallyworld, and the trouble with not ignoring errors is that you can't really escape the tentacles
<wallyworld> by using the buildtxn function we avoid the delete/add
<wallyworld> as i said, not meant to be perfect
<wallyworld> but no worse
<wallyworld> with visible errors
<fwereade> wallyworld, errors visible in the wrong place to a random subset of clients, I think?
<wallyworld> errors will cause worker to reboot
<wallyworld> with logging
<fwereade> wallyworld, right
<wallyworld> so better since they are visible
<wallyworld> and maybe txn issue solved
<fwereade> wallyworld, but the worst worker problems that cause hangs and deadlocks are not touched
<wallyworld> yes
<fwereade> wallyworld, and you're delivering the errors to inappropriate places
<wallyworld> but that wasn't the goal
<wallyworld> why inappropriate? the worker will reboot, the cache will be reloaded, the error will be logged = improvement
<wallyworld> as it is now, the cache can be corrupt
<fwereade> wallyworld, the clients who called the method will get some weird error they should never see
<fwereade> wallyworld, other clients will just hang
<wallyworld> but that's no worse than now is it?
<wallyworld> at least the error will be visible somehow instead of swallowed
<fwereade> wallyworld, some errors will be visible to some clients
<wallyworld> right, but only if something failed
<fwereade> wallyworld, no
<fwereade> wallyworld, ...or maybe I misunderstood you
<wallyworld> quick hangout maybe?
<fwereade> wallyworld, sure, 5 mins?
<wallyworld> ok
<wallyworld> in our 1:1
<mup> Bug #1457218 changed: failing windows unit tests <ci> <regression> <windows> <juju-core:Fix Committed by ericsnowcurrently> <juju-core 1.23:Fix Committed by ericsnowcurrently> <juju-core 1.24:Fix Committed by ericsnowcurrently> <https://launchpad.net/bugs/1457218>
<jam> wallyworld: fwereade: any solutions coming out of the hangout?
<wallyworld> jam: you could join us briefly?
<wallyworld> https://plus.google.com/hangouts/_/canonical.com/ian-william
<jam> wallyworld: link? (I'm supposed to be meeting with mramm, but he's not showing up yet)
<jam> wallyworld: he just showed up
<wallyworld> jam: tl;dr; i think we can land the pr with slight mods
<wallyworld> jam: fwereade is thinking about it :-)
<jam> wallyworld: fwereade: can we do it with opaque tokens? (manager gives a request to persister which manager needs to pass back in the next time)
<wallyworld> jam: i'm off to bed, fwereade will fill you in
<fwereade> jam, so, I'm reasonably sure that wallyworld's PR doesn't make things *worse*, with a couple of fixes we can put that in
<fwereade> jam, re passing tokens -- possibly? I couldn't think of a way to do that nicely, because of the smearing of knowledge across the layers (lease persistor knows what's written; lease manager knows what those leases mean; leadership manager knows how leases map to leadership)
<fwereade> jam, but maybe I mistake what problem you're addressing?
<wwitzel3> natefinch: ping
<natefinch> ericsnow: check out https://github.com/natefinch/pie
<ericsnow> natefinch: nice :)
<voidspace> dimitern: ping
<dimitern> voidspace, pong
<voidspace> dimitern: I've created three tasks for working with the devices api
<voidspace> dimitern: pre-generating MAC addresses is actually probably simpler than our initial approach of a machine agent and apiserver methods for the container to report the MAC address after provisioning
<dimitern> voidspace, great, thanks! I'll have a look shortly
<voidspace> dimitern: there are some open questions however
<voidspace> dimitern: it doesn't look like you can associate a "device" with a "host"
<voidspace> dimitern: so on host destruction we'll still have to manually release the addresses (destroy the containers)
<voidspace> dimitern: that's easy, but not what we hoped
<dimitern> voidspace, wait I don't quite follow
<voidspace> dimitern: I thought part of the point we were hoping to get from the devices api was the ability to declare a container as belonging to a host machine
<dimitern> voidspace, you need the system-id (instance id in juju terms) of the host to pass as parent= in device new, right?
<voidspace> dimitern: gah
<dimitern> voidspace, that establishes the link
<voidspace> dimitern: I was looking at get not new
<voidspace> dimitern: so I didn't see parent
<voidspace> dimitern: cool, that's great
<dimitern> voidspace, :) yeah
<voidspace> dimitern: storing the device's uuid will be interesting
<voidspace> dimitern: 1) it's provider specific
<voidspace> dimitern: 2) the logical place for it is in instanceData - but that normally doesn't get created until after provisioning
<voidspace> dimitern: so there'll be some re-working there
<dimitern> voidspace, yeah, true
<dimitern> voidspace, it seems like we need to extend SetInstanceInfo to take an extra argument
<voidspace> dooferlad: dimitern: I picked up that PDU you recommended (dooferlad) for cheap on ebay (about half the price of that refurbed one)
<voidspace> dimitern: right
<dimitern> voidspace, if that argument is set, we'll store it in a new field in the instanceData doc for the container
<dimitern> voidspace, nice! does it work ok?
<voidspace> dimitern: waiting for it to arrive
<voidspace> dimitern: alternatively, we can fetch the device id from the mac address
<voidspace> dimitern: so we can just store that, and it's not provider specific
<dimitern> voidspace, interesting
<dimitern> voidspace, so an environ method like InstanceIdFromMAC(mac string) (instance.Id, error)
<voidspace> dimitern: well, the release IP address method could do that
<voidspace> dimitern: the MAAS specific one
<voidspace> dimitern: probably no need for a new public method on Environ
<dimitern> voidspace, I like this!
<dimitern> voidspace, the hostname can be used as well
<voidspace> dimitern: right
<dimitern> (but it needs to be a FQDN)
<voidspace> dimitern: so it should be easy, and no need to store provider specific information
<dimitern> voidspace, cool!
<voidspace> dimitern: so MAC address is not stored on the machine, nor the instanceData but in a networkInterfaceDoc
<voidspace> dimitern: (in terms of state)
<voidspace> dimitern: and that's done from SetInstanceInfo
<dimitern> voidspace, yeah, that's a bit crappy and needs fixing at some point
<voidspace> dimitern: is it the right way to store container mac address for now?
<voidspace> dimitern: or is it *already* done like that
<ericsnow> dimitern: is there (or will there be) networking info in charm metadata?
<voidspace> dimitern: i.e. if we specify the MAC address for the container on creation, it will be populated correctly in state by SetInstanceInfo
<voidspace> ericsnow: networking will largely be done as deploy time constraints and environment configuration
<ericsnow> voidspace: hmm, I would have thought it would be similar to storage, where the charm specifies up-front what networking resources it will need
<ericsnow> voidspace: see http://bazaar.launchpad.net/~axwalk/charms/trusty/postgresql/trunk/view/head:/metadata.yaml
<dimitern> voidspace, well, considering we'll most likely change what we do in SetInstanceInfo apart from calling SetProvisioned
<voidspace> ericsnow: what networking resources do you have in mind?
<ericsnow> voidspace: not sure exactly :)
<voidspace> ericsnow: what *could* a charm usefully specify...
<dimitern> voidspace, I'd suggest to reuse SetInstanceInfo, if possible (pass the MAC as part of the network info)
<ericsnow> voidspace: what have you got? :)
<voidspace> dimitern: they should be already - as interfaces
<voidspace> ericsnow: what "spaces" a unit can be in - specified at deploy time
<voidspace> ericsnow: and then the creation of spaces and the creation of subnets and allocating them to spaces
<ericsnow> voidspace: spaces as in subnets?
<voidspace> ericsnow: a space is a collection of subnets
<ericsnow> voidspace: k
<voidspace> ericsnow: and they're environment specific, so you can't usefully specify anything about them in a charm
<ericsnow> voidspace: so "space" is what could meaningful in the charm metadata
<voidspace> ericsnow: raise ParseError("what?")
<ericsnow> voidspace: you could at least identify the space
<voidspace> ericsnow: but each environment will have different spaces
<voidspace> ericsnow: so you specify them at deploy time
<ericsnow> voidspace: I'm asking in context of charm-launched containers
<ericsnow> voidspace: we are looking to specify them in the charm metadata
<voidspace> ericsnow: well, a container will only be able to be in the spaces that the host can see
<ericsnow> voidspace: part of that would be identifying the networking resources the container should use
<voidspace> ericsnow: the spaces available to a container will depend on the host - if the physical (or virtual!) machine a container is *in* doesn't have access to the subnets in a space then the container can't either
<voidspace> ericsnow: so I don't think there's anything useful to specify in the charm metadata there
<voidspace> ericsnow: unless the charm can get the spaces available at container creation time and (effectively) say "be on this subnet"
<voidspace> ericsnow: which if the host is in several spaces, that may be useful
<ericsnow> voidspace: exactly
<ericsnow> voidspace: if there is only one possibility then there's no need to decide :)
<voidspace> ericsnow: this is metadata added at charm runtime, not upfront then?
<ericsnow> voidspace: it's in the face of multiple options that we'd like to be explicit
<ericsnow> voidspace: no, it will be part of the charm metadata
<voidspace> ericsnow: you can't know at charm creation time what spaces will be accessible to a machine at arbitrary machine creation time
<voidspace> ericsnow: so you can't know anything useful upfront, it's deploy time data not charm data
<ericsnow> voidspace: mostly declaring the space to use for a container is relevant if the charm has multiple containers and multiple spaces and the containers should be on the same subnet
<voidspace> ericsnow: so if this is metadata encoded into the charm (i.e. not to be determined at hook runtime / container creation time) then you can't know ahead
<voidspace> ericsnow: but what spaces units of a charm are to be deployed to is the decision of the person deploying the charm not the person writing the charm
<voidspace> ericsnow: so you can't encode that into the charm
<voidspace> I think if a charm (unit of a service) creates a container, the assumption has to be that it will have the same constraints as those specified for the charm
<perrito666>  /query natefinch
<perrito666> lol
<perrito666> my irc client has the worst UI in history
<ericsnow> voidspace: okay, so we'll just have to wing it :)
<voidspace> ericsnow: yeah
<voidspace> ericsnow: so there may need to be some code / checking that we *do* pick the same subnet for configuring the networking of the container
<voidspace> ericsnow: but I think that's deterministic, so it shouldn't be a problem currently
<ericsnow> voidspace: agreed
<voidspace> ericsnow: eventually we will do per-instance (including containers) firewalling - and setup routing rules so that spaces are isolated from each other
<voidspace> ericsnow: so the host will need to know what ports the container is using as we're doing NAT
<voidspace> ericsnow: at least with addressable containers we are
<voidspace> ericsnow: but per-instance firewalling, and routing rules for spaces, are both some way off
<ericsnow> voidspace: you mean like we mostly had to do for the new vsphere provider? :)
<voidspace> ericsnow: thankfully I have no idea...
<voidspace> g'night all
<natefinch> I hate it when my job comes down to: let's find the least-sucky way to do this.  ...because invariably people disagree which way is least sucky.
<natefinch> wwitzel3: you around?
<wwitzel3> natefinch: yeah
<wwitzel3> natefinch: in moonstone with ericsnow
<natefinch> kk
<natefinch> I was wondering if you knew if it's possible to load the existing syslogconfig ...  I can find a Write method, but not a Read method... so I don't know if we even support reading from whatever config we wrote to disk.
<wwitzel3> natefinch: don't know off hand, I can poke around in a bit
<natefinch> wwitzel3: that's ok, I can poke around, just figured I'd ask if you knew
<natefinch> dammit, I hate it when the docs don't specify what happens in edge conditions.  If you os.Rename a file and the target exists.. what happens?
<perrito666> In unix, most likely an overwrite
<perrito666> unless there is a guard
<wwitzel3> anyone able to explain the workflow process of developing new stuff in juju/charms?
<wwitzel3> do you work against v5-unstable? and propose to v5?
<niedbalski> Has anybody experienced this error (missing series)  "21":   agent-state-info: invalid binary version "1.23.3--armhf" ?
<thumper> cmars: we on for today?
<thumper> niedbalski: wow, cool...
<thumper> unknown series?
<thumper> niedbalski: what host?
<niedbalski> thumper, 1.23.3-vivid (client), 1.23.2 ( bootstrap node ) on armhf. This happens on sync-tools / add-machine operations.
<thumper> niedbalski: what hardware are you using?
<thumper> for armhf?
<niedbalski> thumper, raspberry pi 2
<niedbalski> thumper, this is not super critical, it's for my local lab, but the bug is ugly anyways :)
<thumper> ack
<thumper> can you file a bug plz?
<thumper> cmars: nm, I just saw the email about the decline
<niedbalski> thumper, ok, it seems that other archs experienced this same issue in the past, btw. (http://irclogs.ubuntu.com/2014/09/24/%23juju.txt)
<niedbalski> thumper, https://bugs.launchpad.net/juju-core/+bug/1459033, anything else I can add?
<mup> Bug #1459033: Invalid binary version, version "1.23.3--amd64" or "1.23.3--armhf" <juju-core:New> <https://launchpad.net/bugs/1459033>
<thumper> niedbalski: nah, that is a good start
<thumper> niedbalski: thanks
<mup> Bug #1459033 was opened: Invalid binary version, version "1.23.3--amd64" or "1.23.3--armhf" <juju-core:New> <https://launchpad.net/bugs/1459033>
<waigani> wallyworld, axw: I've hit a bug with 1.24, ec2 --upload-tools - there are a bunch of CLOSE_WAIT connections on the server to s3 - full details: #459047
<mup> Bug #459047: [105158.082974] ------------[ cut here ]------------ <amd64> <apport-kerneloops> <kernel-oops> <linux (Ubuntu):Confirmed> <https://launchpad.net/bugs/459047>
<wallyworld> oh joy
<wallyworld> maybe bug 1459047 perhaps
<mup> Bug #1459047: juju upgrade-juju --upload-tools broken on ec2 <juju-core:New> <https://launchpad.net/bugs/1459047>
<waigani> wallyworld: ugh, what did I paste?
<wallyworld> missing the 1
<waigani> ah, right heh
<wallyworld> waigani: so i think you're on bug duty for onyx? looks like you've a bug to work on :-)
<waigani> wallyworld: yep
<wallyworld> waigani: we're having fun fixing lease manager stuff \o/
<waigani> wallyworld: any idea why we're connecting to s3 with --upload-tools? I thought it was using gridfs?
<waigani> wallyworld: oh yeah, that one looked interesting
<wallyworld> s3 was at one stage a repository for public tools
<waigani> wallyworld: do you know if we are using it for anything now?
<waigani> s/are/should be
<wallyworld> and s3 is still used for bootstrap state file i think (need to check)
<wallyworld> i don't think we've ported off that yet
<waigani> right
<wallyworld> so very minimal use for new environments
<waigani> okay, I'll leave you to your leasing :)
<wallyworld> we can swap :-P
<waigani> haha
<mup> Bug #1459047 was opened: juju upgrade-juju --upload-tools broken on ec2 <juju-core:New> <https://launchpad.net/bugs/1459047>
#juju-dev 2015-05-27
<wallyworld> menn0: if i have a txn-queue uuid from a db.lease.find(), what's the cmd to show the queue size?
<menn0> wallyworld: the queue size is literally the length of the txn-queue field on the document (it's an array)
<wallyworld> ah right, the array is length 1 with a uuid
<wallyworld> on  a system with 5 units
<menn0> wallyworld: a single item in the array is totally normal
<wallyworld> yep
<menn0> mgo/txn leaves the last txn that touched a document in the field
<wallyworld> i guess i'll leave it running for a bit and see if the queue size explodes
<menn0> wallyworld: the system that experienced that bug had a number of services deployed
<wallyworld> i have 2
<wallyworld> with 5 units each
<wallyworld> i can add more
<menn0> wallyworld: this one had 8 or something, not sure if that makes a difference
<menn0> wallyworld: are you trying to repro with master or checking your fix?
<wallyworld> my fix
<wallyworld> maybe i should have reproed first
<wallyworld> i might try a bundle
<menn0> wallyworld: it would be good to know if you can see the problem with master first, otherwise you don't know whether you've actually helped
<wallyworld> true, i just assumed it would show up with enough load
<wallyworld> deploying landscape bundle anyhow, we'll see what happens
<wallyworld> axw: anastasiamac: i have to head out for a blood test, been fasting for 12 hours. bbiab with food :-)
<anastasiamac> wallyworld: good luck :D
<wallyworld> i studied all night
<wallyworld> hope i pass
<menn0> waigani: is there any reason why this PR hasn't been reviewed yet? http://reviews.vapour.ws/r/1748/
<menn0> waigani: never mind
<menn0> waigani: I just saw that it's a forward port
<menn0> waigani: to avoid having these clutter up RB i've been manually marking them as submitted in RB even before they get merged
<menn0> waigani: it avoids some confusion
<waigani> menn0: okay, I'll do that in the future.
<mup> Bug #1459057 was opened: multiple storage-add constraints for the same storage unsupported <juju-core:New> <https://launchpad.net/bugs/1459057>
<mup> Bug #1459060 was opened: add support for "storage-add <name>" <juju-core:New> <https://launchpad.net/bugs/1459060>
<axw> menn0: would you kindly take another look at http://reviews.vapour.ws/r/1781/
<mup> Bug #1459060 changed: add support for "storage-add <name>" <juju-core:New> <https://launchpad.net/bugs/1459060>
<davecheney> % tree | grep logger
<davecheney> │   ├── logger
<davecheney> │   │   ├── logger.go
<davecheney> │   │   └── logger_test.go
<davecheney> │   ├── logger
<davecheney> │   │   ├── logger.go
<davecheney> │   │   └── logger_test.go
<davecheney> │   ├── logger
<davecheney> │   │   ├── logger.go
<davecheney> │   │   └── logger_test.go
<davecheney> │   │   │   └── logger.go
<davecheney> juju has at least three packages called logger !>!
<davecheney> how does this help anything ?
<anastasiamac> davecheney: the more the merrier? :D
<mup> Bug #1459060 was opened: add support for "storage-add <name>" <juju-core:New> <https://launchpad.net/bugs/1459060>
<davecheney> you get a logger! and you get a logger! and you get a logger! /oprah
<thumper> davecheney: pretty sure those are all mine :)
<thumper> davecheney: at least I didn't call it "state"
<anastasiamac> :D
<menn0> axw: ship it! looks great.
<axw> menn0: cheers
<menn0> axw: sorry it took me a while, had to go to the post office
<axw> no worries
<davecheney> so, today i'm looking at races in the logger and leadership packages
<davecheney> but I can't figure out which one i need to look at
<axw> davecheney: leadership and lease are in flux a bit, FYI
<axw> there's some critical bugs that need fixing
<davecheney> eek
<davecheney> http://paste.ubuntu.com/11381996/
<davecheney> axw: https://bugs.launchpad.net/juju-core/+bug/1459064
<mup> Bug #1459064: worker/leadership: data races in test and code <juju-core:New> <https://launchpad.net/bugs/1459064>
<axw> joy
<wallyworld> menn0: \o/
<menn0> wallyworld: ?
<wallyworld> trivially reproducible
<wallyworld> "txn-queue" : ["5565255b0c132d0fb0000150_978b92fe", "556525790c132d0fb0000198_5c4423d2", "556525970c132d0fb00001b8_741af2e9", "556525b50c132d0fb00001cb_36de73a9", "556525d30c132d0fb00001ef_156997b6"]
<wallyworld> and my fix works
<menn0> sweet!
<menn0> that's great news
<wallyworld> yeah
 * wallyworld goes to hit merge
<menn0> now i'm curious as to what you changed
<menn0> i'll have to check out the diff
<wallyworld> it's not a complete fix for all the issues
<wallyworld> just this one specific thing
<thumper> wallyworld: have you fixed leadership?
<thumper> oh...
<wallyworld> thumper: one teeny little bit
<thumper> are ya gunna fix it all?
<wallyworld> axw will :-)
<wallyworld> we are both working on it
<wallyworld> and william
<mup> Bug #1459064 was opened: worker/leadership: data races in test and code <juju-core:New> <https://launchpad.net/bugs/1459064>
<menn0> wallyworld: one thing to check... do the txn-queue entries get automagically fixed when upgrading from a system with the bug to one with your fix?
<wallyworld> menn0: that's a good point. i guess i should try that
<wallyworld> since this started in 1.23 :-(
<wallyworld> if it were 1.24 betas, i wouldn't bother
<wallyworld> menn0: if it's not fixed, would the answer be to run the txn purge tool?
<menn0> wallyworld: I guess the new txnpruner worker should take care of them
<menn0> wallyworld: b/c they are completed txns
<wallyworld> i'll try it out
<menn0> wallyworld: actually, it won't
<menn0> wallyworld: b/c the txnpruner doesn't modify the txn-queue field. it only removes docs from the txns collection for completed txns that are no longer referenced
<menn0> wallyworld: so it won't help
<wallyworld> menn0: i'm worried this is a time bomb which will hit other deployments
<menn0> wallyworld: but I have a feeling that mgo/txn may tidy up the txn-queue fields for you
<wallyworld> we'll see soon once my deployment bootstraps and i upgrade
<menn0> yep. fingers crossed.
<wallyworld> menn0: my upgrade hung the state server - i think i may have hit bug 1457728. but in any case, i'm thinking maybe it might be best just to wipe the lease collection on upgrade
<mup> Bug #1457728: `juju upgrade-juju --upload-tools` leaves local environment unusable <local-provider> <upgrade-juju> <vagrant> <juju-core:Triaged> <juju-core 1.24:In Progress by axwalk> <https://launchpad.net/bugs/1457728>
<menn0> wallyworld: is that safe?
<wallyworld> i think so. i'll try upgrading again first
<menn0> wallyworld: is one result of your PR that ClaimLeadership doesn't block forever? that's the issue that's causing me hassles at the moment
<wallyworld> menn0: that aspect is not deliberately fixed
<menn0> wallyworld: ok. i'll have to figure out another way to do this
<wallyworld> might be by accident, but there was no intent to change that behaviour
<wallyworld> menn0: what's the scenario?
<menn0> wallyworld: i'm creating feature tests to ensure that logging to mongodb works
<menn0> wallyworld: the one that spins up the machine agent works
<menn0> wallyworld: the unit agent variant hangs during teardown
<wallyworld> ah, deadlocks on test teardown - william has found ways to work around that
<menn0> wallyworld: it's because in the unit agent the leadershiptracker hangs making its initial request for leadership
<wallyworld> menn0: i am fairly sure william has been through this pain and i thought he had a solution
<menn0> wallyworld: I suspect it's possibly also b/c i'm using JujuConnSuite so there isn't a full set of workers running on the server side
<wallyworld> possibly
<menn0> wallyworld: which is possibly why it's getting stuck in the first place
<menn0> wallyworld: but i don't know how the lease/leadership stuff works all that well
<menn0> wallyworld: i'll bug will if I don't get anywhere by EOD
<wallyworld> yeah, i'd ask william, i'll mention it as i'm meant to be talking to him later
<wallyworld> menn0: until a proper worker with tombs is used, and the singleton goes away, it will remain broken
<menn0> wallyworld: ok
<menn0> wallyworld: i'm just wondering if I can get enough infrastructure in place so that leadership does work in the tests
<wallyworld> menn0: ack, but sadly i can't give you a good answer
<mup> Bug #1459082 was opened: flag for debug-hooks to skip non-existent hooks <debug-hooks> <juju-core:New> <https://launchpad.net/bugs/1459082>
<menn0> wallyworld: np, thanks
<menn0> wallyworld, thumper: I think I just cracked it. spinning up the lease manager loop allows the leadership functionality in the uniter to work and unblocks the test.
<wallyworld> ah, that sounds plausible - i didn't realise the loop wasn't running
<wallyworld> menn0: when/if you have a moment, i'd like to check something with you about upgrades, maybe ping me when you're done writing tests
<menn0> wallyworld: under JujuConnSuite it's not
<wallyworld> :-(
<menn0> wallyworld: i wouldn't have expected it to... JujuConnSuite gives you State and an API server but not necessarily all the workers that the state servers run
<wallyworld> that's true
<wallyworld> but 'tis a trap
<menn0> wallyworld: yep and it's a hard to diagnose trap
<menn0> wallyworld: i'll be free in a sec to discuss that upgrade issue
<wallyworld> ok, ta, see you in onyx
<davecheney> thumper: https://bugs.launchpad.net/juju-core/+bug/1459085
<mup> Bug #1459085: worker/logger: data race in tests <juju-core:New> <https://launchpad.net/bugs/1459085>
<davecheney> this is a bug in loggo
<davecheney> i have a fix to propose
<davecheney> which branches of juju does this need to land on ?
<thumper> hmm
<thumper> davecheney: what is the loggo data race?
<thumper> davecheney: go test -race in loggo seems fine... perhaps we need more coverage?
<thumper> davecheney: the races probably need to land on master
<thumper> unless they are obviously fixing broken behavour in supported jujus
<davecheney> i'm on the fence on that one
<davecheney> if I can avoid spending a day nursing it back to 1.22
<davecheney> i'd prefer to avoid it
<davecheney> thumper: btw, you going to land this ?
<davecheney> https://github.com/juju/loggo/pull/6
 * thumper looks
<thumper> no
<thumper> decided against it
<davecheney> thumper: then close it
<davecheney> thumper: https://github.com/juju/loggo/pull/11
<davecheney> will try to add a test
<davecheney> it is likely that older versions of the race detector can't spot this bug
<thumper> davecheney: just did
<thumper> close it that is
<thumper> I wonder if the bot is doing this yet
 * thumper pokes to see
<thumper> davecheney: ah... ok
<thumper> davecheney: without this patch, do the current loggo tests exhibit a problem?
<thumper> I'm guessing no, because I don't have goroutines doing things in the tests
<davecheney> ok, the loggo tests don't trigger this because they always use ConfigureLogger which uses a mutex which acts as an accidental memory barrier
<davecheney> to trigger it you need to do; logger.SetLevel(x); go logger.LoggerInfo()
<davecheney> it has to be in a second goroutine otherwise by definition there is no race
<davecheney> lemmie try to add a test as well
<davecheney> i'll do that in another PR
<thumper> davecheney: looks like there is a bot
<mup> Bug #1459085 was opened: worker/logger: data race in tests <juju-core:New> <https://launchpad.net/bugs/1459085>
<davecheney> thumper: hmm, this is going to be tricky to write a test for
<davecheney> let me try a bit more, then give up if the worker/logger test's race is fixed
<natefinch> son of a.... ERROR juju.worker runner.go:219 exited "upgrader": invalid series "wily"
<natefinch> haven't we fixed this dumb bug by now?
<natefinch> (the "Juju doesn't know about future series" bug)
<natefinch> wallyworld, thumper ^  thoughts on how I can get around this?  I just want to test the upgrade step I'm writing
<wallyworld> natefinch: install the distro info package
<wallyworld> juju can't guess what a new series is
<natefinch> wallyworld: but it can install the distro info package itself :/
<wallyworld> i guess so, yeah
<wallyworld> natefinch: any progress on bug 1370896 as we need to look to get 1.24 wrapped up
<mup> Bug #1370896: juju has conf files in /var/log/juju on instances <canonical-bootstack> <logging> <rsyslog> <juju-core:Triaged by natefinch> <juju-core 1.24:In Progress by natefinch> <https://launchpad.net/bugs/1370896>
<natefinch> wallyworld: exactly what I'm working on.  We need an upgrade step to move the config files from the old location to the new location.
<wallyworld> makes sense
<wallyworld> you should be in bed anyway
<davecheney> thumper: got a test
<davecheney> it's going to come with a LARGE comment
<natefinch> wallyworld: well, I work when I get a chance, sometimes having 3 kids under 4 years old means that chance is quite late at night
<wallyworld> 3!!
<wallyworld> you've been busy
<wallyworld> s/quite/very
<natefinch> wallyworld: heh, yeah, we had our third, a boy, on Christmas eve.   So we have a 5 months, almost 2, and almost 4 year old
<wallyworld> and an upcoming vasectomy :-)
<davecheney> actually no
<davecheney> can't add a test for this easily
<natefinch> wallyworld: Ironically, I'm the one that wants one, and my wife wants another kid... but i think she's crazy :)
<wallyworld> me too :-)
<wallyworld> you could always get one secretly
<mwhudson> natefinch: my brother had 4 kids at ~2 year intervals
<mwhudson> i can't say i recommend it
<natefinch> lol
<natefinch> yeah, no
<natefinch> that's what my wife wants... but I'm quite happy with 3.  4 presents a host of problems we can narrowly avoid with 3
<wallyworld> larger car, larger house
<natefinch> wallyworld: exactly.  House is 3 bedrooms... the girls are doubling up, and hopefully that'll be fine for a long time... but the other bedroom is pretty small for two kids.
<wallyworld> long time = until they turn into teenagers
<natefinch> right... at which point the one we trust the most can move into the finished basement
<wallyworld> ha, and you will be outside the door with a shotgun
<natefinch> so anyway, distro-info has not helped, even after restarting jujud
<wallyworld> hmmm
<natefinch> yeah, distro-info --all doesn't contain wily
<wallyworld> wtf, well that sucks
<natefinch> sounds like one of those "someone put up a stream that they shouldn't have" things
<wallyworld> i would have thought it would have
<wallyworld> add it by hand to get by for now i guess
<natefinch> is there a reason we can't just ignore series we don't understand?   I mean, if we're not even using that series, who cares?
<natefinch> remind me where that file is, again?
<wallyworld> /usr/share something, i'll check
<wallyworld> /usr/share/distro-info
<natefinch> wallyworld: got it, thanks
<wallyworld> i think the idea is to stop typos and misconfigured systems
<natefinch> it just seems to screw us like clockwork, every 6 months
<wallyworld> eg accidentally running a centos series with ubuntu tools
<wallyworld> yeah
<natefinch> no matter how smart we try to get
<wallyworld> waigani: you working on bug 1459047? can you assign yourself and mark in progress?
<mup> Bug #1459047: juju upgrade-juju --upload-tools broken on ec2 <juju-core:New> <https://launchpad.net/bugs/1459047>
<wallyworld> so we know it's being looked at
<waigani> wallyworld: done
<wallyworld> tyvm
<wallyworld> waigani: helps when we have release meetings each morning
<waigani> wallyworld: of course, I'll make a note to keep all that updated
<wallyworld> ty
<mup> Bug #1459093 was opened: Upgrade fails if there's a series in streams Juju doesn't recognize <upgrade-juju> <juju-core:New> <https://launchpad.net/bugs/1459093>
<dimitern> thumper, hey, still around?
<dimitern> menn0, ^^
<dimitern> fwereade, hey
<dimitern> fwereade, re http://reviews.vapour.ws/r/1777/
<dimitern> fwereade, I'd appreciate it if you could reply to my mail from yesterday and give some rationale around why we're bothering to migrate the remaining state-bound workers to use the API, as it seems it's not generally obvious to a few people
<fwereade> dimitern, the resumer worker one?
<dimitern> fwereade, yes
<fwereade> dimitern, ok, I'm a little bit more concerned that anyone considers it ok for random workers to touch the database directly, but will do
<dimitern> fwereade, :) cheers
<TheMue> morning o/
<TheMue> dimitern: I see there are discussions about the API usage migration of the workers?
<dimitern> TheMue, morning
<dimitern> TheMue, yeah
<thumper> dimitern: I have come back to do a little more
<thumper> dimitern: whazzup?
<dimitern> thumper, hey, I wanted to ask your opinion about moving the resumer worker to use the API and run once per apiserver - http://reviews.vapour.ws/r/1777/
<thumper> once per apiserver or once per environment?
<thumper> I don't see any real reason to have the resumer use the API
<thumper> it only ever runs on state server machines
<thumper> dimitern: making it use the api server seems weird - adding layers for the sake of layers when it doesn't need them
<dimitern> once per apiserver - it applies to all hosted environments AIUI
<thumper> well, we only really need one running
<thumper> what do we have now?
<dimitern> thumper, fwereade can explain better why it's bad to have *any* workers coupled to state.State directly
<thumper> dimitern: well... for me it seems like busy work, and something that isn't really needed to be done right now
<thumper> why now?
<dimitern> thumper, before my PR, it was running in the StateWorker of the machine agent, now I moved it in the method starting per-env workers, but I'll move it to postUpgradeWorker instead
<dimitern> thumper, now is as good a time as any, and it was planned with lower priority for quite some time
<thumper> dimitern: what problem is being fixed by the work?
<thumper> dimitern: if there is an active remit to move every worker away from state directly, then I'm fine with it
<dimitern> thumper, decoupling the worker from *state.State, improving tests (dropping JujuConnSuite across the board in favor of BaseSuite + mocking), faster and more deterministic tests, the ability to evolve state.State and what the worker needs separately
<thumper> sure
<mup> Bug #1459148 was opened: azure: juju can't create compute/network optimized instances <juju-core:New> <https://launchpad.net/bugs/1459148>
<dimitern> thumper, so can you review that PR with your comments why it's a good thing ? :)
<thumper> dimitern: not right now :)
<dimitern> thumper, ok, sure
<thumper> dimitern: but I will bring it up with menno tomorrow
<dimitern> thumper, cheers!
<fwereade> thumper, do you recall "api everywhere" as being a major goal about a year ago? having resumer there indicates that we didn't bother to actually finish that work, and at least some people were actively lying about it being complete
<fwereade> thumper, and afaics left the door open so that everybody just thought, ehh, direct db access? not so bad, nobody really cares, ehh
<thumper> fwereade: IIRC we stopped when the non-state server machine and unit agents were able to not use state
<thumper> we agreed that doing all was a time sink at that stage
<thumper> I'm not saying it isn't a worthwhile goal
<thumper> just questioning timing
<thumper> effort vs. priorities
<fwereade> thumper, if that was the case why did we do, eg, firewaller?
 * thumper shrugs
<thumper> wasn't me :)
<fwereade> thumper, because we don't want all our workers running across the db like a gang of drunken monkeys
<thumper> heh
<thumper> now that is an amusing image
<fwereade> thumper, we *will* fuck things up :)
<thumper> dude, we screw up all the time
 * thumper points fwereade at our bug list
<fwereade> thumper, I would like the bulk of our potential fuckups to at least pass through an authorization layer and not to have the power to do literally anything
<thumper> sure
<thumper> I'm not questioning the work, just the order of work
<fwereade> thumper, ok, but we've got people writing new workers that use state now
<thumper> oh?
<fwereade> thumper, dblogpruner is new, right?
<thumper> yeah...
<fwereade> thumper, why would we leak the fact that the logs are stored in the db?
<fwereade> thumper, the api server can have a "prune logs now please" method
<thumper> point taken, I'll bring it up tomorrow
<fwereade> thumper, cheers
<thumper> actually, I can do better than that
 * thumper puts a card on the board
<fwereade> thumper, <3
<voidspace> dimitern: it seems that with kvm we can supply a template to "uvt-kvm"
<voidspace> dimitern: the template is "libvirt domain xml" format
<voidspace> dimitern: and that can have a network interfaces section specifying MAC address
<voidspace> dimitern: by default, we get the default template (of course)...
<dimitern> voidspace, great! have you actually tried if it works ? :)
<voidspace> dimitern: (i.e. we're not using the template argument currently)
<voidspace> dimitern: no...
<voidspace> dimitern: will do
<voidspace> dimitern: we'll have to clone the default template and add a network interface definition
<voidspace> dimitern: https://libvirt.org/formatdomain.html#elementsNICS
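The interface element voidspace refers to would go into the cloned domain template. A minimal illustrative fragment per the libvirt docs above (bridge name and MAC address are placeholders, and the real default uvt-kvm template carries many more elements):

```xml
<interface type='bridge'>
  <source bridge='br0'/>
  <mac address='52:54:00:12:34:56'/>
  <model type='virtio'/>
</interface>
```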
<perrito666> good morning
<dimitern> voidspace, can't we generate the template always, like with lxc.conf? (based on the default template XML + some text/template tags)
<voidspace> dimitern: well, yes
<voidspace> dimitern: I meant that initially we'll have to clone the default one
<voidspace> dimitern: expressed badly
<dimitern> voidspace, sounds good
<perrito666> aghh I think the lease singleton is making my tests die :(
<fwereade> perrito666, are you running a dummy provider, and are you trying to run your own lease worker?
<perrito666> fwereade: none, but I think that the fact of importing lease might be making my test go for 10m until they are killed
<fwereade> perrito666, that is unlikely, I think
<perrito666> fwereade: well Ill know more in about 10m :p
<fwereade> perrito666, -test.timeout 20s
<fwereade> the mere fact of importing lease will not cause that
<fwereade> perrito666, what are you trying to do with it?
<perrito666> fwereade: I stand corrected its something else
<perrito666> every output log looks different after coffee
<perrito666> (and putting my glasses on)
<fwereade> perrito666, lol
 * TheMue takes a bike-ride to the new potential co-location office now. we'll see how fast I'll be back online ;)
<mup> Bug #1459250 was opened: centos 7 is ambiguous in streams <centos> <simplestreams> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1459250>
<wwitzel3> ericsnow, natefinch: ping
<natefinch> sinzui: https://bugs.launchpad.net/juju-core/+bug/1459093
<mup> Bug #1459093: Upgrade fails if there's a series in streams Juju doesn't recognize <upgrade-juju> <juju-core:New> <https://launchpad.net/bugs/1459093>
<natefinch> sinzui: wily isn't in distro-info (at least on trusty)... but I'm guessing someone published a stream for wily
<sinzui> :/ How many times does that bug need to be fixed?
<natefinch> sinzui: just once... we need to stop caring if we see series we don't understand, and just ignore them
<natefinch> sinzui: my environment shouldn't care if someone publishes a stream for the series "TOTALLY_INVALID"
<sinzui> natefinch, we haven't built wily
<sinzui> natefinch, unit tests are run on a machine with a fake series registered in distro-info-data to catch this error
<natefinch> sinzui: it's not a problem with something extra in distro-info... it's a problem with someone somewhere mentioning a series that *isn't* in distro-info
<sinzui> natefinch, I think wallyworld reported a similar bug/solution.
<natefinch> ls
<sinzui> natefinch, yep, it is the same long standing issue that juju thinks it has to know the world
<mup> Bug #1459288 was opened: TestWriteTokenReplaceExisting fails <ci> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1459288>
 * fwereade has just had guests arrive, stopping for now
<sinzui> mgz, abentley do either of you have a moment to review https://code.launchpad.net/~sinzui/juju-release-tools/agent-archive/+merge/260321
<abentley> sinzui: sure.
<abentley> r=me
<abentley> sinzui: So I have this bug:https://bugs.launchpad.net/juju-core/+bug/1449208
<abentley> sinzui: I know it exists in jes-cli.  I don't have evidence that it exists in master.
<sinzui> :/
<abentley> Should I create a jes-cli series and target to that?
<sinzui> abentley, I don't want to, but I think Lp requires it
<abentley> sinzui: I will be the bad guy, then :-)
 * sinzui really hates lp bug targeting
 * natefinch ^^^^ +1000
<mgz> I think we need to sort-of ignore bugs on feature branches
<mgz> eg, windows tests are currently borked on jes-cli - but that's kinda their problem
<natefinch> or just target all bugs to master and use tags to track branches.  Then we'd actually be able to search all the bugs at once, too.
<mup> Bug #1459298 was opened: TestMissingServerFile fails <ci> <test-failure> <juju-core:Incomplete> <juju-core jes-cli:Triaged> <https://launchpad.net/bugs/1459298>
<mup> Bug #1278831 changed: debugging first run of install hook is not straight forward <debug-hooks> <juju-core:Triaged> <https://launchpad.net/bugs/1278831>
<mup> Bug #1458693 changed: juju-deployer fills up ~/.ssh/known_hosts <juju-core:New> <https://launchpad.net/bugs/1458693>
<mup> Bug #1458754 changed: $REMOTE_UNIT not found in relation-list during -joined hook <hooks> <relations> <juju-core:New> <https://launchpad.net/bugs/1458754>
<mup> Bug #1459327 was opened: Juju MAAS netwoking with custom bridge inside service <juju-core:New> <https://launchpad.net/bugs/1459327>
<mup> Bug #1459337 was opened: UniterSuite setup fails <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1459337>
<natefinch> abentley: is there a python equivalent of gofmt that'll fix the linting rules enforced by make lint?  like trailing whitespace, visual indenting, etc?
<natefinch> sinzui, mgz: ^^
<sinzui> natefinch, no. there are some tools that will do some of that, but nothing that reformats to pep8
<natefinch> boo.... manually wrapping lines by hand is uh... not a hugely valuable use of my time
<natefinch> especially when it then complains that I don't indent things perfectly.
<mgz> natefinch: what editor do you use?
<sinzui> natefinch, maybe this will suffice https://pypi.python.org/pypi/autopep8
<natefinch> mgz: sublime
<mgz> natefinch: top google result is a pep8 autoformat plugin
<mgz> for "sublime python format"
<natefinch> yeah, just found and installed one... so much better
<hatch> using 1.24-beta5.1-trusty-amd64 the deploy command simply hangs is this a known bug?
<natefinch> hatch: pretty sure it works fine for me
<hatch> natefinch: ok I'm going to restart this machine, see if I can get it working or reproduce reliably
<sinzui> abentley, mgz: do either of you have time to review https://code.launchpad.net/~sinzui/juju-ci-tools/more-agents/+merge/260353
<abentley> sinzui: looking
<abentley> sinzui: r=me
<sinzui> thank you abentley
<natefinch> ahh yes, the old "you have to update 100 fragile mocks in order for your tests to pass"
<natefinch> s/fragile/fragile and incomprehensible/
<perrito666> natefinch: ah, welcome to my world
<natefinch> perrito666: heh
<natefinch> perrito666: I've had central A/C for 5 years... if the house is above 75°F/24°C, everyone's miserable.  Actually, I'm not too bad, but my wife is really uncomfortable, which means I'm uncomfortable
<perrito666> oh, so 27 C is a bad temp for you
<natefinch> yes
<perrito666> now I get it
<perrito666> I keep forgetting that you people have blood instead of coolant
<natefinch> yep
<perrito666> you need more south americanness
<natefinch> I agree, but my wife would probably be mad at me if I imported some
<natefinch> mental note: upgrade steps get run as-is on your local machine during tests... which means if they delete files off disk... guess what??
<perrito666> lol
<natefinch> luckily I was just deleting stuff from /var/log/juju ... but still
<wallyworld> sinzui: so looking at osversions, i can see why it was done - we just have 13.10 etc for juju series
<wallyworld> but we use win8 etc for windows
<wallyworld> so i guess for !ubuntu, we need to be more explicit
<wallyworld> ericsnow: can i have a trivial review pretty please? http://reviews.vapour.ws/r/1800/
<ericsnow> wallyworld: sure
<ericsnow> wallyworld: ship it!
<wallyworld> ty
<wallyworld> ty
<sinzui> wallyworld: yeah, that was my expectation.
<wallyworld> sinzui: fix landing for 1.24, soon master
<thumper> cherylj: regarding the forward port of the vivid/systemd issues, if the branch hasn't significantly changed from the version that landed on the earlier branch, you don't need another review
<thumper> just get the bot to merge
<thumper> if the branch does have significant changes to the previous, it is useful to say what those were in the description
<thumper> this is to smooth the way for fixes to be forward ported
<thumper> given that *most* of the time, it is exactly the same fix
<thumper> in the newer versions
<mup> Bug #1453634 changed: juju upgrage-juju --upload-tools hangs with jes flag <ec2-provider> <upload-tools> <juju-core:Invalid by waigani> <juju-core trunk:Invalid by waigani> <https://launchpad.net/bugs/1453634>
<mup> Bug #1459047 changed: juju upgrade-juju --upload-tools broken on ec2 <juju-core:Invalid by waigani> <https://launchpad.net/bugs/1459047>
<wallyworld> waigani: i have a bug for your todo list for beta 6 - bug 1441478
<mup> Bug #1441478: state: availability zone upgrade fails if containers are present <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1441478>
<waigani> wallyworld: yay :)
<wallyworld> hopefully won't be too bad
<waigani> wallyworld: :), just finishing this one off, then I'll get onto it
<wallyworld> ty
<menn0_> thumper: http://reviews.vapour.ws/r/1802/ pls
<menn0> thumper: easy one :)
#juju-dev 2015-05-28
<natefinch-afk> anyone familiar with the upgrade tests?  I added upgrade steps to move the rsyslog config files from /var/log/juju to /var/lib/juju ... but the tests fail because the tests mock out a bunch of stuff and evidently don't set up the log directory.
<anastasiamac> axw_: wallyworld: backort of storage-add http://reviews.vapour.ws/r/1803/
<axw_> looking
<anastasiamac> axw_: \o/
<wallyworld> natefinch-afk: isn't it just a case of adding a LogDir() method to fakeConfigSetter
<natefinch> wallyworld: could be.  upgrade_test.go is 1000 lines long... finding what particular part I need to tweak in what particular way is not always obvious.
<wallyworld> i just guessed by looking at what setup test did, hopefully it works
<natefinch> wallyworld: oh yeah, I didn't think to look at setuptest
<wallyworld> actually, maybe not setup test, i forgot what i looked at, but see NewFakeConfigSetter
<wallyworld> it has some of the Config methods but not all
<wallyworld> i find that both useful and difficult in Go
<wallyworld> hard to know what implements what
<natefinch> we just mock out and fake out so much of our code to run the tests... it's hard to know what actual code is getting run.
<natefinch> every test we have runs hundreds of lines of setup code, most of which is very far away from the test itself.
<natefinch> wallyworld: I seriously have no clue how to even begin to do this.  Adding LogDir() to fakeConfigSetter does not seem like it will populate that directory with the correct expected files.
<wallyworld> it won't - you need to do that
<wallyworld> with c.MkDir()
<wallyworld> and then pass that dir to fakeconfigsetter
<wallyworld> so it can be returned from the LogDir() func
<wallyworld> just guessing
<natefinch> I can't even really tell where this test ends up creating a fakeConfigSetter
<wallyworld> see TestContextInitializeWhenNoUpgradeRequired
<wallyworld> looks like the test sets it up
<menn0> thumper: reporting dropped log messages: http://reviews.vapour.ws/r/1804/
<menn0> thumper: btw, i really don't like the tests for logsender and I have a plan for reworking them to be more unit-testy
<thumper> menn0: ack
<axw_> wallyworld: FYI I have the deadlock in hand, just working through updating all the tests and so on
<wallyworld> axw_: awesome, so the worker uses a tomb etc. did you get rid of the singleton?
<axw_> wallyworld: I did, but now I'm putting it back (sort of)
<axw_> wallyworld: it's a huge PITA to change all of the tests
<wallyworld> yeah :-(
<axw_> wallyworld: so I'm changing it so there's a worker, and it updates a singleton which you can call and will fail if there's no active worker
<axw_> wallyworld: changing *most* of the tests was fine, but then JujuConnSuite things were a nightmare, because they have their own API server which needed to have resources threaded through... just not worth the effort right now
<wallyworld> i hate jujuconnsuite
<wallyworld> i'm off to soccer soonish, will look in more detail when i get back
<axw_> wallyworld: sure. I've not proposed yet, still fixing tests
<axw_> enjoy
<wallyworld> will do, not quite leaving yet
<wallyworld> axw_: with your change, will tests still need to start their own worker loop, or did you get to remove that ickiness too?
<axw_> wallyworld: which tests?
<axw_> wallyworld: the uniter tests will still need to
<wallyworld> axw_: william put a worker loop in dummy provider, and menno also added one for some agent tests
<wallyworld> hmmm, i think the agent tests are on master though
<axw_> wallyworld: didn't see anything in dummy provider. I don't think my change will improve on that
<wallyworld> maybe william's code not landed yet
<wallyworld> if blocking calls were removed, that would help with the problem
<wallyworld> i guess with your changes though, it's a matter of starting a worker rather than a loop directly
<wallyworld> and maybe blocking calls can be aborted when the worker dies
<axw_> wallyworld: if the worker isn't started, lease/leadership calls will error out
<axw_> wallyworld: so they won't hang forever
<wallyworld> great, that's an improvement
<axw_> wallyworld: not sure if that's enough to cover all the tests
<wallyworld> gotta start somewhere
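The behaviour axw_ describes (lease/leadership calls erroring out immediately instead of hanging when no worker is running) boils down to a guarded singleton. A minimal sketch, with all names hypothetical and no relation to the real lease package's API:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errStopped mirrors the failure mode from the chat: callers fail fast
// rather than blocking forever when no worker is active.
var errStopped = errors.New("lease manager stopped")

// manager stands in for the worker-backed singleton.
type manager struct {
	mu     sync.Mutex
	active bool
}

func (m *manager) Start() { m.mu.Lock(); m.active = true; m.mu.Unlock() }
func (m *manager) Stop()  { m.mu.Lock(); m.active = false; m.mu.Unlock() }

// Claim errors out immediately when the worker is not active.
func (m *manager) Claim(lease string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if !m.active {
		return errStopped
	}
	return nil
}

func main() {
	var m manager
	fmt.Println(m.Claim("leadership")) // worker not started: errors out
	m.Start()
	fmt.Println(m.Claim("leadership")) // worker running: succeeds
}
```

This is only the guard; the real worker would also route the claim through its main loop and translate tomb errors before they reach clients, as fwereade discusses below.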
<axw_> fwereade: http://reviews.vapour.ws/r/1806/
<axw_> FYI
<axw_> wallyworld: ^^
<fwereade> axw_, reviewed; basically awesome, but I have a few quibbles
<dimitern> fwereade, can you stamp http://reviews.vapour.ws/r/1777/ please?
<fwereade> wallyworld, axw_: and, yes, I suspect I didn't land the branch that put lease manager in the dummy provider, dammit, sorry
<axw_> fwereade: I guess we don't really need an exported error for the BlockUntilLeadershipReleased call at all, do we? I could just create an error inline
<fwereade> axw_, true
<fwereade> axw_, but you need the same error in several circumstances, right?
<fwereade> axw_, so an unexported errStopped is probably a good idea
<axw_> fwereade: just that one method in leadership, but I'll do that anyway
<fwereade> axw_, shouldn't claim and release have similar?
<fwereade> axw_, we don't want to leak the lease error to the clients, and we definitely don't want to let tomb.ErrDying out of the worker
<axw_> fwereade: their cancellation happens in the lease package. so... I could export an error from there instead
<axw_> fwereade: fair enough. ok, so I'll export an error from lease, but trap it in leadership and translate
<axw_> I was hoping not to put that in the lease contract
<fwereade> axw_, well that's sort of why I was advocating a worker.ErrStopped
<axw_> I see
<axw_> ok then, that'll do
<fwereade> axw_, from the POV of a client it barely matters which worker was stopped or why
<fwereade> axw_, it's as simple as "I cannot fulfil your request because the necessary components are not running"
<fwereade> axw_, the non-running components ought to have their own failures logged anyway, so there's little benefit to leaking anything else out of the methods
<axw_> fwereade: yep, sounds good
<mup> Bug #1459610 was opened: juju status --format=tabular > 80 characters wide <juju-core:New> <https://launchpad.net/bugs/1459610>
<mup> Bug #1459611 was opened: juju status --utc does not display utc and is confused <juju-core:New> <https://launchpad.net/bugs/1459611>
<axw_> fwereade: updated, PTAL when you can
<fwereade> axw_, cheers
<mup> Bug #1459611 changed: juju status --utc does not display utc and is confused <juju-core:New> <https://launchpad.net/bugs/1459611>
<mup> Bug #1456957 changed: rsyslog worker should not add machines that are not ready yet <cpec> <logging> <rsyslog> <juju-core:Won't Fix> <juju-core 1.22:New> <juju-core 1.23:New> <juju-core 1.24:New> <https://launchpad.net/bugs/1456957>
<mup> Bug #1459611 was opened: juju status --utc does not display utc and is confused <juju-core:New> <https://launchpad.net/bugs/1459611>
<mup> Bug #1459616 was opened: 'juju status' timestamps should use rfc3339 or ISO8601 <juju-core:New> <https://launchpad.net/bugs/1459616>
<wwitzel3> sorry I missed the first half, but the last half was useful :)
<anastasiamac> wwitzel3: it was just as great for me too! I had all my questions answered :D
<wallyworld> ericsnow: you ocr? can you look at http://reviews.vapour.ws/r/1805/ sometime during your day?
<wallyworld> fwereade: just as an fyi ^^^^^
<wallyworld> since we discussed it
<fwereade> wallyworld, cheers
<thumper> I'm outa here
<fwereade> perrito666, would you also cast an eye over http://reviews.vapour.ws/r/1805/ with a view to understanding how/whether it interacts with the restore-mode api
 * perrito666 casts eyes
<marcoceppi> Is the environment UUID exposed in the API?
<fwereade> marcoceppi, Client.EnvironmentInfo
<fwereade> marcoceppi, and you get the tag, which is "environ-<uuid>" when you log in
<mup> Bug #1459679 was opened: MinUnitsSuite teardown fails <ci> <intermittent-failure> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1459679>
<mup> Bug #1459679 changed: MinUnitsSuite teardown fails <ci> <intermittent-failure> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1459679>
<wwitzel3> ericsnow: ping
<perrito666> how can I go from a name to a Tag?
<wwitzel3> perrito666: ?
<wwitzel3> perrito666: what do you mean?
<wwitzel3> perrito666: you have string name?
<perrito666> yes, and I want a way to get a tag
<wwitzel3> perrito666: if you don't know the type, you can just use ParseTag which will switch over the different tag types for a match
<wwitzel3> perrito666: if you know the tag type you want, you can call ParseUnitTag(s) for example
<perrito666> wwitzel3: what I have is what a unit will return by calling Name()
<perrito666> I was not sure if that parsed to tag
<wwitzel3> perrito666: yes, it will
<perrito666> k tx
<wwitzel3> perrito666: all u.UnitTag does is wrap names.NewUnitTag(u.Name())
<wwitzel3> perrito666: so it will be the same result, except NewUnitTag panics and ParseUnitTag has an error return, so just depends on the behavior you want
<perrito666> tx a lot man
<wwitzel3> np
 * perrito666 adds the appropriate string in the appropriate parts
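For reference, the name-to-tag conversion discussed above amounts to a simple string encoding. A stdlib-only sketch of what names.NewUnitTag produces for a unit name; the helper below is hypothetical and skips the validation the real juju names package performs (which is why NewUnitTag can panic on bad input):

```go
package main

import (
	"fmt"
	"strings"
)

// unitNameToTag mimics the encoding behind names.NewUnitTag:
// the unit name "mysql/0" becomes the tag string "unit-mysql-0".
// Illustration only; no validation of the name is done here.
func unitNameToTag(name string) string {
	return "unit-" + strings.ReplaceAll(name, "/", "-")
}

func main() {
	fmt.Println(unitNameToTag("mysql/0")) // unit-mysql-0
	fmt.Println(unitNameToTag("wordpress/12"))
}
```

Going the other way, ParseUnitTag takes the "unit-mysql-0" form and returns an error instead of panicking, which is the trade-off wwitzel3 describes above.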
<marcoceppi> how is juju backup different than just, say, a bundle?
<mup> Bug #1459761 was opened: Unable to destroy service/machine/unit <juju-core:New> <https://launchpad.net/bugs/1459761>
<mup> Bug #1459761 changed: Unable to destroy service/machine/unit <destroy-machine> <local-provider> <lxc> <juju-core:Triaged> <https://launchpad.net/bugs/1459761>
<mup> Bug #1459761 was opened: Unable to destroy service/machine/unit <juju-core:New> <https://launchpad.net/bugs/1459761>
<ericsnow> marcoceppi: what do you mean by "bundle"?  a charm bundle?
<mup> Bug #1457011 changed: init system discovery script fails with: [[: not found <cloud-init> <compatibility> <regression> <tech-debt> <juju-core:Fix Released by ericsnowcurrently>
<mup> <juju-core 1.23:Fix Released by ericsnowcurrently> <juju-core 1.24:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1457011>
<mup> Bug #1459775 was opened: init system discovery script has bashisms <juju-core:In Progress by ericsnowcurrently> <juju-core 1.24:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1459775>
<mup> Bug #1457011 was opened: init system discovery script fails with: [[: not found <cloud-init> <compatibility> <regression> <tech-debt> <juju-core:Fix Released by
<mup> ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <juju-core 1.24:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1457011>
<mup> Bug #1459775 changed: init system discovery script has bashisms <juju-core:In Progress by ericsnowcurrently> <juju-core 1.24:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1459775>
<mup> Bug #1457011 changed: init system discovery script fails with: [[: not found <cloud-init> <compatibility> <regression> <tech-debt> <juju-core:Fix Released by ericsnowcurrently>
<mup> <juju-core 1.23:Fix Released by ericsnowcurrently> <juju-core 1.24:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1457011>
<mup> Bug #1459775 was opened: init system discovery script has bashisms <juju-core:In Progress by ericsnowcurrently> <juju-core 1.24:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1459775>
<mup> Bug #1459785 was opened: systemd-related tests may fail under windows <systemd> <windows> <juju-core:In Progress by ericsnowcurrently> <juju-core 1.24:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1459785>
<mup> Bug #1459785 changed: systemd-related tests may fail under windows <systemd> <windows> <juju-core:In Progress by ericsnowcurrently> <juju-core 1.24:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1459785>
<mup> Bug #1459785 was opened: systemd-related tests may fail under windows <systemd> <windows> <juju-core:In Progress by ericsnowcurrently> <juju-core 1.24:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1459785>
<perrito666> does anyone know why 1.18 would not bootstrap to ec2?
 * perrito666 looks at sinzui 
<sinzui> perrito666: 1.18 did/does
<perrito666> I have this error https://bugs.launchpad.net/juju-core/+bug/1358933
<perrito666> or related
<mup> Bug #1358933: localHTTPSServerSuite.TestMustDisableSSLVerify fails in lxc <ci> <test-failure> <juju-core:Fix Released by sinzui> <https://launchpad.net/bugs/1358933>
<perrito666> mm, or just related
<sinzui> perrito666: and cloud health shows it did 16 minutes ago, so AWS has not changed to break 1.18
<perrito666> it was that bug, juju 1.18 depends on ca-certificates yet the package doesn't
<sinzui> perrito666: that sounds true. ca-certificates is delivered with every desktop system
<perrito666> sinzui: I am running juju from inside an lxc container
<perrito666> there's a corner case :p
<sinzui> perrito666: sure easily fixed by the edge user
<perrito666> indeed
<sinzui> Though I hope that wasn't the only reason devs swore juju in lxc was unsupported
<perrito666> although I don't think it would hurt the package to depend on its dependencies even if they are installed by default
<cmars> perrito666, they might not be, if you install ubuntu server with the "minimal virtual machine" option
<cmars> installed by default, that is..
<perrito666> I assume using lxc by hand is not all that common but I dont feel all that edgy
<cmars> monocultures are bad. i like edgy :)
<ericsnow> cherylj: ping
<cherylj> ericsnow: what up
<ericsnow> cherylj: I have some concerns with http://reviews.vapour.ws/r/1797/
<ericsnow> cherylj: just wanted to talk about it real quick :)
<cherylj> ericsnow: sure.  Reading through your comments now
<ericsnow> cherylj: cool
<cherylj> ericsnow: to answer your question about adding in a start for the service with systemd - the short answer is that the container never halted if I didn't explicitly start the service.
<cherylj> When I was debugging, I saw that the service was in an inactive state once I added in the correct After and Install info
<ericsnow> cherylj: yeah, it kind of makes sense
<cherylj> And I guess you had a different take on who should make the decision about the cloud-init AfterStopped target.
<ericsnow> cherylj: relatedly, how does the shutdown service get removed?  or does is simply stick around but do nothing?
<ericsnow> cherylj: yeah, the code in service/systemd shouldn't have any knowledge of cloudinit
<ericsnow> cherylj: also, part of what the cloudbase guys did was to pull all the cloudinit-related code into one place (under cloudconfig)
<cherylj> ericsnow: the ExecStopPost disable works just fine.  It completely removes the service such that any containers created from that template do not have it in their registered services
<ericsnow> cherylj: oh yeah :)
<cherylj> ericsnow: I can go in and make your suggested changes
<ericsnow> cherylj: cool
#juju-dev 2015-05-29
<waigani> axw_: #1441478 when you say it "bails if any of the instances cannot be found", are you talking about the txn.DocExists assert in the upgrade step?
<mup> Bug #1441478: state: availability zone upgrade fails if containers are present <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:In Progress by waigani> <https://launchpad.net/bugs/1441478>
<wallyworld> thumper: can you recall what happens if jujud panics in its Run() method - will the upstart service restart the agent?
<wallyworld> ah yes i think it does
<thumper> wallyworld: can we chat earlier today?
<thumper> wallyworld: and yes, upstart restarts
<wallyworld> sure, free whenever
<thumper> kk, just making some toast, with you shortly
<thumper> wallyworld: in our hangout now
<axw_> waigani: was taking my daughter to school, looking now
<axw_> waigani: I mean the azFunc function may return environs.ErrNoInstances
<waigani> axw_: ah right, okay thanks
<waigani> I've setup maas with a few nodes on my laptop to test, got juju bootstrapped, but after a destroy env and rebootstrap I've run afoul of #1412621
<mup> Bug #1412621: replica set EMPTYCONFIG MAAS bootstrap <bootstrap> <maas-provider> <mongodb> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1412621>
<waigani> so looking at the original bug, but ironing out maas issues along the way
<mup> Bug #1459885 was opened: harvest mode setting not used by container provisioner <juju-core:Triaged> <https://launchpad.net/bugs/1459885>
<axw_> wallyworld: 1:1?
<wallyworld> oh yeah
<davecheney> https://github.com/juju/juju/pull/2447
<davecheney> can I get a review on this
<davecheney> it shits me to tears every time I go to fix a bug in one of our libraries
<davecheney> i find that juju is lagging behind using fixes that have landed in them by months
<davecheney> thumper: waigani http://reviews.vapour.ws/r/1810/
<davecheney> all the on call reviewers are in the EU timezone today
<davecheney> do you have a second to take a look
<waigani> davecheney: it's just a dependencies update?
<davecheney> yup
<waigani> LGTM
<davecheney> danka
<natefinch> if only there were some way to automatically pull from the head revision, so we'd always get all the new fixes....
<davecheney> nah, that leads to madness
<davecheney> and irreproducible builds
<davecheney> what i don't understand is
<natefinch> you can freeze release branches
<davecheney> when I land a fix on juju/testing
<davecheney> it's because I need it to fix a bug in juju
<natefinch> I don't understand why we freeze a development branch, though
<davecheney> who is landing fixes on juju/testing that doesn't need them ?
<natefinch> davecheney: probably people working on the charm store etc
<natefinch> also, shouldn't it be obvious in the commit log? :)
<wallyworld> natefinch: it's generally accepted pulling tip of your dependencies is bad - but Go does a lot of things I disagree with :-)
<natefinch> wallyworld: I guess the problem is that we don't have CI running across all tests that use our own dependencies, so in theory, changing github.com/juju/testing could break charm store if we only test the change in juju-core
<natefinch> wallyworld: otherwise, really, it's not a dependency.... it's just our code.  Anymore than github.com/juju/juju/agent is a "dependency" of github.com/juju/juju/state ...
<wallyworld> natefinch: yeah, the Go approach works ok if you consider all the source code in the entire repo part of your project, which is what google does
<wallyworld> but for shared libraries not so well
<natefinch> wallyworld: we probably should, since otherwise we can get into a state where the charm store makes a change to testing for its needs, and then when juju-core updates, that change is incompatible with our own code
<wallyworld> the overriding factor though needs to be reproducible builds
<natefinch> ....for release branches
<wallyworld> and proper dependency management
<natefinch> not master
<natefinch> nobody is reproducing builds off master
<wallyworld> hmmm
<wallyworld> let's add this to the "discuss over several red wines" list
<natefinch> haha
<wallyworld> the more the better :-)
<natefinch> someday, someone will actually be able to explain to me what "proper dependency management" means.   Because everyone seems to have different - usually fairly fuzzy and not actionable - ideas about it.
<wallyworld> repeatable builds, explicitly choosing the version/revno of the lib you want to link to/ build against, and a way to manage and track that
<natefinch> so, like godeps
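For context, godeps records pins in a dependencies.tsv file at the repo root, one dependency per line. The tab-separated layout sketched below (project, VCS, revision id, revision number, with placeholders left unfilled) is from memory and should be treated as an assumption, not a spec:

```
github.com/juju/testing	git	<revision-sha>	<revno>
launchpad.net/tomb	bzr	<committer-revision-id>	<revno>
```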
<davecheney> natefinch: well, they are going to have very bad day when I land my refactor
<natefinch> davecheney: what are you refactoring?
<davecheney> https://launchpad.net/bugs/1459064
<mup> Bug #1459064: worker/leadership: data races in test and code <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1459064>
<natefinch> ahh awesome
<natefinch> davecheney: not sure who they is, and why that would make them have a bad day
<davecheney> imma in your package, breakin your api
<natefinch> see, you're not supposed to do that :)
<davecheney> what if your api is not thread safe
<natefinch> time for v2
<davecheney> and the package is supposed to be used in a multithreaded manner ?
<natefinch> heh ouch
<natefinch> then you screwed up.  I guess in that case, you break the API and then because everyone's pulling tip, they are helpfully given compile errors that tell them they need to update their code because the old code is totally borked.
<natefinch> as opposed to when you're pinning revisions and you go on forever using that old racy P.O.S. code
<davecheney> sounds like you're screwed either way
<davecheney> have fun
<natefinch> Yes. but I'd rather know than go on thinking I'm ok
<mup> Bug #1459288 changed: TestWriteTokenReplaceExisting fails <blocker> <ci> <regression> <unit-tests> <windows> <juju-core:Fix Released by wallyworld> <juju-core 1.24:Fix Released by wallyworld> <https://launchpad.net/bugs/1459288>
<mup> Bug #1459616 changed: 'juju status' timestamps should use rfc3339 or ISO8601 <juju-core:Won't Fix> <https://launchpad.net/bugs/1459616>
<mup> Bug #1459912 was opened: juju agent opens api when upgrade is pending <juju-core:In Progress by wallyworld> <juju-core 1.24:Fix Committed by wallyworld> <https://launchpad.net/bugs/1459912>
<davecheney> http://reviews.vapour.ws/r/1813/
<wallyworld> axw_: can i please get a trivial review? http://reviews.vapour.ws/r/1817/
<axw_> looking
<waigani> axw_, wallyworld: Here is the PR for #1441478 http://reviews.vapour.ws/r/1818/ I've hit EOD trying to manually test this on maas (hit other maas bugs). So I can follow up with manual testing
<mup> Bug #1441478: state: availability zone upgrade fails if containers are present <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:In Progress by waigani> <https://launchpad.net/bugs/1441478>
<wallyworld> waigani: ty
<wallyworld> waigani: are these maas bugs known? do you need to report them?
<waigani> wallyworld: actually yes they are
<wallyworld> oh joy, makes it hard for us
<wallyworld> axw_: ty for review, i need to head to doctor to get more cholesterol medicine, will review your tags one when i get back
<axw_> thanks
<waigani> wallyworld: thumper has given me some other maas bugs to look at while I'm in maas land - so I'll have a maas(ive) bug list to keep going on next week :)
<wallyworld> waigani: yeah, i told him to give you that one :-P
<waigani> thanks :/
<wallyworld> anytime :-)
<wallyworld> figured you were on a roll
<waigani> heh
<wallyworld> the right man for the job
<waigani> more like a tumble ....
<waigani> but good to get the MAAS knowledge
<axw_> waigani: LGTM, thanks
<waigani> axw_: wow, quick thanks :)
<axw_> wallyworld: please see my reply on the lease change, when you're back
<wallyworld> axw_: thanks for reply, ship it i reckon
<voidspace> Fledgling MAAS cluster: PDU, switch, plus two proliant servers
<voidspace> https://www.dropbox.com/s/fvmz7aj5lsvk0pb/2015-05-29%2009.55.44%20HDR.jpg?dl=0
<voidspace> dooferlad: ^^
<voidspace> dimitern: https://www.dropbox.com/s/fvmz7aj5lsvk0pb/2015-05-29%2009.55.44%20HDR.jpg?dl=0
<perrito666> good morning
<mwhudson> voidspace: do the proliants have bmcs, or is that what the pdu is for?
<dooferlad> mwhudson: the proliants have an API that you can poke over one of their network interfaces, but voidspace judged that engineer time to get that doing what we wanted vs cost of a PDU favoured shopping over typing
<dooferlad> mwhudson: it would be interesting to play with one at some point - apart from being larger than a NUC, if we can get the RESTful API doing what we need, they will be great for a home MAAS
<mwhudson> ah not ilo or ipmi or anything like that?
<dooferlad> mwhudson: I believe it is iLO
<mwhudson> huh apparently maas only supports the kind of ilo you get in moonshots?
<voidspace> I'm sure it's possible to get it working
<voidspace> a PDU seemed the path of least resistance though
<voidspace> and means I can plug other things in - like my older N36 proliant
<waigani> wallyworld: ping
<wallyworld> waigani: HI
<waigani> wallyworld: hey, just saw your email - pushing up a targeted PR to 1.24
<wallyworld> waigani: you rock, thank you
<waigani> wallyworld: the bug was in a 1.22 upgrade step - hence the 1.22 target. I intended to forward port up each version. Should I also target 1.23?
<waigani> wallyworld: landing on 1.24: https://github.com/juju/juju/pull/2456
<wallyworld> waigani: sorry, was afk, bit of an emergency here. but if you've proposed for 1.22, may as well do 1.23 as well
<wallyworld> thanks for 1.24, that's great
<mup> Bug #1460071 was opened: relation-set --file ignores --format <juju-core:New> <https://launchpad.net/bugs/1460071>
<voidspace> dooferlad: ping
<dooferlad> voidspace: pong
<voidspace> dooferlad: I'd like to setup a network for my maas - and I'd like you to help if you can :-)
<dooferlad> sure, give me 5 minutes to finish this review
<voidspace> I have a switch plus cables and spare ethernet port on my desktop
<voidspace> sure
<voidspace> and maas running on my desktop
<mup> Bug #1460087 was opened: quickstart deployment fails to add relations when bootstrap goes "down" <juju-core:New> <https://launchpad.net/bugs/1460087>
<dooferlad> voidspace: right, what do you want to know? Should we jump in a hangout?
<voidspace> dooferlad: hangout is good
<voidspace> dooferlad: sapphire?
<dooferlad> voidspace: yep
<voidspace> I'm there
<dooferlad> voidspace:  so am I... did you hit join? That is my usual problem.
<voidspace> dooferlad: juju-sapphire?
<voidspace> dooferlad: I'm in
<voidspace> "waiting for people to join this video call..."
<voidspace> I'll leave and re-join
<natefinch> It's a good thing we use *_test packages to only test the exported interface of an API: https://github.com/juju/juju/blob/master/upgrades/export_test.go
<natefinch> s/API/package/
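The `export_test.go` trick natefinch is pointing at works roughly like this; a single-file sketch with illustrative names, not juju's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// normalize is unexported: an external foo_test package cannot call it.
func normalize(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

// In juju this alias would live in an export_test.go file, which is only
// compiled during `go test`, so the external *_test package can reach the
// unexported helper without widening the package's real API.
var Normalize = normalize

func main() {
	fmt.Println(Normalize("  Hello  "))
}
```

The joke, of course, is that once `export_test.go` re-exports enough internals, the "we only test the exported interface" claim is honored mostly in the breach.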
<mup> Bug #1460071 changed: relation-set --file ignores --format <juju-core:Invalid> <https://launchpad.net/bugs/1460071>
<voidspace> dimitern: struggling to join the conference
<voidspace> dimitern: the number on the event for London is wrong
<dimitern> voidspace, hey, I won't manage
<dimitern> voidspace, but if you can, please do and we'll chat later
<voidspace> dimitern: it's been cancelled - to be reconvened
<voidspace> dimitern: was a terrible connection :-/
<dimitern> voidspace, ok, thanks for the heads up
<cherylj> ericsnow: ping?
<mup> Bug #1460171 was opened: Deployer fails because juju thinks it is upgrading <blocker> <ci> <deployer> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1460171>
<mup> Bug #1460175 was opened: apiserver_test authhttp_test SetUpTest.debugLogSuite failed <intermittent-failure> <ppc64el> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1460175>
<natefinch> man I love unit tests. I don't know how I ever wrote code that worked without them.
<dimitern> any reviewers around to have a look at http://reviews.vapour.ws/r/1821/ ? this (hopefully) fixes the critical bugs#1449054 and #1452221
<dimitern> fwereade, fyi ^^
 * dimitern @eod
<ericsnow> cherylj: sorry, I totally missed your ping
<perrito666> mm, chaining hooks seems to be something less trivial than expected
<cherylj> ericsnow: np, I answered my own question :)
<ericsnow> :)
<mup> Bug #1460184 was opened: Bootstrapping fails with Maas on Ubuntu Vivid <maas-provider> <vivid> <juju-core:Incomplete> <https://launchpad.net/bugs/1460184>
<perrito666> once again my isp did not show to install the upgrade, if I did not work at home I would already have lost 2 work days waiting for them
<natefinch> people who write functions that take 5 strings should be flogged
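The usual alternative to a five-string signature is a params struct; a sketch with made-up field names:

```go
package main

import "fmt"

// Five positional strings invite call sites that silently swap arguments:
//   func NewMachine(id, series, arch, zone, cons string) — which is which?
// A config struct makes every call site self-describing and lets fields be
// added later without breaking callers. All names here are illustrative.
type MachineParams struct {
	ID     string
	Series string
	Arch   string
	Zone   string
}

func describe(p MachineParams) string {
	return fmt.Sprintf("machine %s: %s/%s in %s", p.ID, p.Series, p.Arch, p.Zone)
}

func main() {
	fmt.Println(describe(MachineParams{ID: "0", Series: "xenial", Arch: "amd64", Zone: "az1"}))
}
```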
<mup> Bug #1459616 was opened: 'juju status' timestamps should use rfc3339 or ISO8601 <juju-core:New> <https://launchpad.net/bugs/1459616>
<ericsnow> cherylj: thanks for revising that
#juju-dev 2015-05-30
<natefinch> man I love debugging tests that compare strings that are 62 lines long each
#juju-dev 2015-05-31
<mup> Bug #1421260 changed: juju 1.21.1 bootstrap timeout <bootstrap> <oil> <oil-bug-1372407> <juju-core:Expired> <https://launchpad.net/bugs/1421260>
#juju-dev 2016-05-30
<thumper> ugh
<thumper> how about we just fix the tests to not suck?
<davecheney> thumper: http://reviews.vapour.ws/r/4920/
<davecheney> review pls ? kai thx
 * thumper looks
<anastasiamac> davecheney: morning \o/ r u working on bug 1586244? shall i mark it as in progress?
<mup> Bug #1586244: state: DATA RACE in watcher <2.0-count> <race-condition> <juju-core:New for dave-cheney> <https://launchpad.net/bugs/1586244>
<axw> wallyworld: lol, 42 pages of diff
<axw> 840 files changed
<wallyworld> oh jeez
<wallyworld> i didn't look
<anastasiamac> \o/
<wallyworld> didn't realise we had that many juju/name imports
<mup> Bug #1581966 changed: controller node need to be restart for comissionning new nodes or lxc .. <cpe-sa> <orange-box> <juju-core:Invalid> <https://launchpad.net/bugs/1581966>
<davecheney> anastasiamac: i had a look at it on friday
<davecheney> its not a simple bug to fix
<davecheney> but i believe it is restricted to tests only
<davecheney> because they mutate the results they get back from the state watcher
<davecheney> which the contract for Get says they must not
<anastasiamac> k. i marked as triaged and unassigned from u (until the time when u or someone else can pick it up). tyvm :D
<davecheney> np
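One way to enforce the "callers must not mutate the result of Get" contract davecheney describes, rather than trusting every test, is a defensive copy; a minimal sketch, not the watcher's real code:

```go
package main

import "fmt"

// store's Get contract says callers must not mutate the result.
// Returning a copy enforces the contract instead of relying on discipline,
// at the cost of an allocation per call.
type store struct {
	cache []string
}

func (s *store) Get() []string {
	out := make([]string, len(s.cache))
	copy(out, s.cache)
	return out
}

func main() {
	s := &store{cache: []string{"alpha", "beta"}}
	got := s.Get()
	got[0] = "mutated" // harmless: only the caller's copy changes
	fmt.Println(s.cache[0])
}
```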
<axw> wallyworld: reviewed
<wallyworld> ty and sorry
<wallyworld> axw: yeah, i forgot about the global key prefix, sigh another 1000000 files affected in the next branch
<mup> Bug # changed: 1482015, 1518131, 1540650, 1575469, 1577939, 1585300
<davecheney> thumper: ping
<thumper> hey
<thumper> coming
<davecheney> np
<axw> menn0: you didn't make any changes to do with known_hosts at bootstrap time did you? still getting WARNING at bootstrap
<menn0> axw: no I didn't do that. I created bug 1579593 to track that.
<mup> Bug #1579593: SSH host keys for bootstrap aren't checked <security> <juju-core:Triaged> <https://launchpad.net/bugs/1579593>
<axw> menn0: cool, thanks
<mup> Bug #1586880 opened: provider/lxd: instance names are overly long <blocker> <juju-core:Triaged> <https://launchpad.net/bugs/1586880>
<axw> wallyworld: can you PTAL at http://reviews.vapour.ws/r/4919/
<wallyworld> sure
<mup> Bug #1586890 opened: Cloud/credentials details should be stored in state separate from model config <juju-core:Triaged> <https://launchpad.net/bugs/1586890>
<mup> Bug #1586891 opened: There is no command for removing clouds <blocker> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1586891>
<dimitern> frobware: pinh
<dimitern> ping even
<mup> Bug #1578376 changed: Cannot add MAAS credentials through juju add-credential <juju-release-support> <maas-provider> <usability> <juju-core:Invalid> <https://launchpad.net/bugs/1578376>
<perrito666> morning to those of you not on holidays
<thumper> fwereade: ping
<fwereade> thumper, joining
<thumper> kk
<thumper> holy moley I think I'm done...
 * thumper runs tests again
<thumper> I am still finding issues with a few tests
<thumper> but I think at least one of them has a fix from davecheney racy
<thumper> is the reboot package still intermittently failing for others?
 * thumper should pull master and update
<wallyworld> perrito666: could you please eyeball this mechanical change and +1 https://github.com/juju/charmrepo/pull/95
<perrito666> sure
<perrito666> wallyworld: done, looks good, ship it
<wallyworld> ty
<perrito666> their bot accepts a squirrel as parameter, we need that
<perrito666> like now
<davecheney> perrito666: ð for not lgtm
<davecheney> ð­ for really not lgtm
<davecheney> ð£ for LGTM and we'll fix the review comments in the next PR
<davecheney> (i hope the emoji is coming through)
<perrito666> they are, but my chat client is using the standard-ish unicode ones
<perrito666> which are not all that nice
<perrito666> it is amazing how those things are actually googleable
#juju-dev 2016-05-31
 * thumper just pushed a mega-branch
<thumper> http://reviews.vapour.ws/r/4937/
<thumper> even though it touches 59 files
<thumper> the diffstat is just: +541 −387
<thumper> it needs some cleanup in instance/namespace.go and instance/namespace_test.go but the rest is good for review
 * thumper goes to walk the dog before she climbs the walls
<anastasiamac> thnx thumper \o/.. i *think* i'm meant to be OCR today :-P
<thumper> wallyworld: this is the shortler lxd name branch
<thumper> wallyworld: it also fixes the maas container dns issue
<thumper> and makes some things more consistent
<wallyworld> thumper: awesome
<wallyworld> thumper: my branch a couple of days ago was 840 files
<thumper> :)
<thumper> someone has to do the big hunks
<thumper> see you in a bit
<anastasiamac> hunks or chunks?
<davecheney> thumper: OH MY GOD
<davecheney> i just found a set of state tests that duplicate TearDownTest
<perrito666> davecheney: doing test archeology?
<anastasiamac> perrito666: last i've heard, all these were under davecheney's couch or something :)
<davecheney> perrito666: no, i hit my toe on this
<perrito666> davecheney: If you are to believe Indiana Jones movies, that is how most archeology is done
<davecheney> https://github.com/juju/juju/pull/5493
<davecheney> perrito666: i have not excavated deep enough to get to the real issue yet
<davecheney> i'm still digging up the past
<perrito666> ah, allwatcher
<davecheney> perrito666: hide yo' kids, hide yo' wife, all watcher coming
<thumper> wallyworld: http://reviews.vapour.ws/r/4937/ updated. Nothing really surprising. Heading home to test maas now. Car wheel alignment now done.
 * thumper heads off line briefly
 * thumper tries to remember how to bootstrap a maas thing again
<davecheney> thumper: I have a fix for https://bugs.launchpad.net/juju-core/+bug/1586244
<mup> Bug #1586244: state: DATA RACE in watcher <2.0-count> <blocker> <race-condition> <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1586244>
<davecheney> it's not perfect
<davecheney> but it will unblock things for beta8
<davecheney> the "we can just zero shit out to make the test pass" logic was woven through all those tests
<davecheney> a custom equality function would have been a _lot_ of work
<davecheney> and would have taken days to test
<davecheney> days of wall time
<davecheney> because the state tests are such an utter cluster fuck
<thumper> wallyworld: what is the expected way to be able to bootstrap maas?
<thumper> I don't need to add-cloud do i?
<thumper> just creds?
 * thumper is confused
<natefinch> thumper: unless it has changed in the last week, you do need add cloud
<thumper> what is supposed to go into the cloud config for a maas cloud?
<thumper> juju help maas doesn't do anything
<thumper> I feel it should tell you how to set up juju for maas
<natefinch> thumper: yeah.... I am hoping whoever did the maas work will fix that
<thumper> it is the credential and cloud definitions
<thumper> not the maas provider
<thumper> what auth-type should be defined for maas?
<thumper> it uses an oauth token
<natefinch> thumper: http://pastebin.ubuntu.com/16858151/
<natefinch> thumper: your yaml file for add-cloud should look like that... fix the cloud name and endpoint of course.
<thumper> right
<thumper> and creds?
<natefinch> thumper: I think when you add it, it'll ask for the outh key
<natefinch> maybe during bootstrap?  I forget when it asked me
<natefinch> I really wish juju bootstrap maas/https://myhostname.com/MAAS worked
<natefinch> the release notes say that works, but it doesn't
<anastasiamac> mine works and creds are in format http://pastebin.ubuntu.com/16858205/
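The pastebins above have long since expired; for reference, a juju 2.x MAAS cloud and credential definition generally looks something like the fragment below (cloud name, endpoint, and credential name are placeholders):

```yaml
# clouds.yaml fragment, for: juju add-cloud vmaas clouds.yaml
clouds:
  vmaas:
    type: maas
    auth-types: [oauth1]
    endpoint: http://192.168.1.2/MAAS

# credentials.yaml fragment; maas-oauth is the API key from the MAAS UI
credentials:
  vmaas:
    admin:
      auth-type: oauth1
      maas-oauth: <MAAS API key>
```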
<thumper> I add creds for vmaas
<thumper> then go "juju list-credentials"
<thumper> and it errors
<natefinch> fantastic
<thumper> saying "removing secrets from credentials for cloud maas: cloud maas not valid"
<thumper> works if you say "juju list-credentials vmaas"
<natefinch> uhh... that sounds like a bad idea.... deleting secrets because we encountered an error?
<anastasiamac> i did mine by hand... and here is my clouds... http://pastebin.ubuntu.com/16858228/
<natefinch> or do they just mean they're eliding them from the output?
<natefinch> because that error message is scary
<anastasiamac> also...maybe do not name your cloud 'maas'? maybe there is a confusion with type 'maas'
<thumper> got it now
<natefinch> meh.... if they're one and the same, what's the difference?
<thumper> it is bootstrapping
<natefinch> huzzah
<natefinch> I name all my environments after the provider type.  I only have one of each... why reinvent the wheel
<thumper> anastasiamac: I didn't name it maas
<thumper> I called mine "vmaas"
<thumper> so no idea where it is falling down there
<anastasiamac> :/
<anastasiamac> i've crafted files by hand and could bootstrap... the commands gave me grief :(
<thumper> maas tests running
 * thumper packing up and taking laptop while Maia does BJJ
<thumper> emails and stuff
<davecheney> all := newStore()
<davecheney> why ...
<davecheney> github.com/juju/juju/payload/api/private
<davecheney> private is a terrible name for a package
<natefinch> davecheney: naming is hard.  It's the api for agents, as opposed to the client api
 * thumper isn't going to start on that one
 * thumper does hr bollocks
<mup> Bug #1587236 opened: no 1.25.5 tools for vivid? <juju-core:New> <https://launchpad.net/bugs/1587236>
<davecheney> yes, the client api is called this github.com/juju/juju/payload/api/private/client
<davecheney> ffs
<davecheney> thumper: did some looking into the api bifurcation
<davecheney> lots and lots of refactoring will be needed before its possible
<davecheney> the api depends directly on watchers
<davecheney> i don't even know how that's possible
<davecheney> oh and directly on the state/multiwatcher
<davecheney> lucky(~/src/github.com/juju/juju/api) % pt multiwatcher
<davecheney> allwatcher.go:
<davecheney> 9:      "github.com/juju/juju/state/multiwatcher"
<davecheney> 51:func (watcher *AllWatcher) Next() ([]multiwatcher.Delta, error) {
<davecheney> right, so we expose the state types directly inside the api, even though they get pooped into and out of json
<davecheney> this should be reasonably straightforward to fix
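The fix davecheney is gesturing at is the usual DTO split: give the api package its own wire type and convert at the boundary, dropping the direct import of state/multiwatcher. A sketch with illustrative names:

```go
package main

import "fmt"

// stateDelta stands in for the server-side type in state/multiwatcher.
type stateDelta struct {
	Removed bool
	Entity  string
}

// Delta is an api-local mirror of the delta. Since the values travel as
// JSON anyway, the api package can unmarshal into its own type and stop
// depending on state packages directly.
type Delta struct {
	Removed bool   `json:"removed"`
	Entity  string `json:"entity"`
}

// fromState converts at the package boundary; the only place that needs
// to know both types.
func fromState(d stateDelta) Delta {
	return Delta{Removed: d.Removed, Entity: d.Entity}
}

func main() {
	d := fromState(stateDelta{Removed: false, Entity: "machine-0"})
	fmt.Println(d.Entity)
}
```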
<thumper> davecheney: as fun as it is, now is not the time to be looking into this
<thumper> we need to keep focused on the 2.0 beta 8 bugs
<thumper> o/ balloons
<davecheney> thumper: understood, I'm not making any changes
<davecheney> but I have some time to think during test runs
<thumper> doc is fine to start
<thumper> :)
<davecheney> indeed
<wallyworld> thumper: there's 2 docs in state which have EnvUUID - endpointBindingsDoc and resourceDoc - these should be deleted right
<thumper> wallyworld: 'juju deploy ubuntu --to lxd:" doesn't work
<wallyworld> correct
<wallyworld> it's on the todo list
<thumper> why not?
<thumper> oh, bug?
<wallyworld> known issue, just haven't got to it yet
<thumper> um... not sure about those docs
<thumper> we should double check with the original authors
<thumper> ack
<thumper> won't file another bug then
<thumper> wallyworld: btw, just finished testing with maas, changes look good
<wallyworld> thumper: nate is looking at that plus other lxc to lxd things this week
<wallyworld> awesome
<thumper> wallyworld: juju add-machine talks *a lot* about lxc
<thumper> we should change to lxd
<wallyworld> yes
<wallyworld> there's a lot there
<wallyworld> just a sed script
<wallyworld> sigh
 * thumper nods
<thumper> sed is magic like that
<thumper> wallyworld: I thought you'd find my typo amusing on the PR
<thumper> fuxes-nnnn
<thumper> u and i are so close together
<wallyworld> lol
<wallyworld> thumper: with the EnvUUID thing, these fields are tagged "env-uuid" so even if we keep them, sure we want to use "model-uuid"
<wallyworld> surely
<thumper> yes
<thumper> surely
<thumper> we probably can just remove them
<thumper> I have been cleaning up docs as I go
<thumper> most model-uuid fields are implicit and aren't needed
<thumper> the framework ensures they are there
<thumper> and valid
<wallyworld> ok, i'll delete them
 * thumper looks sadly at tomorrow
<thumper> meetings solid from 8am to noon
 * thumper sighs
<thumper> yay
<thumper> later peeps
<thumper> I'll check in on that branch to see if it lands - should only be intermittent failures if it doesn't
<davecheney> oh no
<davecheney> we have tests which accidentally mutate data stored in a cache
<davecheney> then expect to match that accidentally mutated data
<davecheney> menn0: thumper-bjj http://reviews.vapour.ws/r/4941/
<davecheney> ^ fix for beta8 blocker
<wallyworld> axw: rb doesn't like the latest pr, you can eyeball the changes here https://github.com/juju/juju/pull/5497/files?diff=split
<wallyworld> if you get time, could you take a look? not really urgent
<axw> wallyworld: ok
<axw> only 380 files changed this time
<wallyworld> yeah :-(
<dimitern> axw: hey, do you know if any of the openstack charms use the enhanced storage support?
<axw> dimitern: AFAIK only ceph is using it
<axw> dimitern: why do you ask?
<dimitern> axw: I've been deploying openstack-base bundle on my hardware maas for the past few days
<dimitern> axw: it mostly works :)
<axw> dimitern: cool :)   are you looking to test storage?
<dimitern> axw: yeah, I was looking at the various ceph-related charms, and ISTM none of them define "storage" section in their metadata
<axw> dimitern: hrm, possibly in staging still
<dimitern> axw: and it appears cinder is required in order to later use juju on the deployed openstack
<dimitern> axw: ok, just checking whether I missed something obvious..
<dimitern> nova-lxd is quite cool!
<axw> dimitern: it's in here: https://api.jujucharms.com/charmstore/v5/ceph/archive/metadata.yaml
<axw> dimitern: apparently it still hasn't made its way over to ceph-osd yet
<dimitern> axw: what's the charm url for the above? cs:~?/ceph ..
<axw> dimitern: cs:xenial/ceph-1 (or just "ceph")
<dimitern> axw: ah, I see - so ceph has it but ceph-osd not yet
<axw> dimitern: if you want to use juju storage, don't set the osd-devices config. instead, deploy with --storage osd-devices=<...>
<axw> dimitern: seems so. I thought it had been copied over
<dimitern> axw: but for that to work the charm needs a storage section in the metadata, right?
<axw> dimitern: yep, so that only works for ceph, and not ceph-osd atm
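The storage stanza axw is referring to looks roughly like this in a charm's metadata.yaml (shape per the charm storage schema; the range value is illustrative):

```yaml
# metadata.yaml fragment declaring a juju-managed block store
storage:
  osd-devices:
    type: block
    multiple:
      range: 0-
```

With that declared, something like `juju deploy ceph --storage osd-devices=...` attaches volumes through juju instead of the charm's osd-devices config option.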
<dimitern> axw: do you know the difference ? is ceph-osd+ceph-mon = ceph ?
<axw> dimitern: not entirely sure, but I think so
<dimitern> axw: ok, I'll ask jamespage for details
<axw> dimitern: Chris Holcombe is in charge of the ceph charms, FYI
<dimitern> jamespage: can I use 3x ceph instead of 3x ceph-osd + 3x ceph-mon for an openstack base deployment?
<dimitern> axw: oh, good to know, thanks!
<axw> dimitern: totally unrelated, I'm curious to know what you're doing with concourse.ci. I saw it a while ago, but didn't dig too deep. got anything to show off at the sprint perhaps? :)
<dimitern> axw: we have a call later today with the QA guys to discuss concourse ci
<axw> dimitern: okey dokey. I shall watch this space
<dimitern> axw: I'm charming concourse so it can be evaluated easily, first attempt in bash, now "properly", i.e. with charm layers and unit tests
 * axw nods
<axw> wallyworld: I can't open the full diff on GitHub either, so reviewing this is going to be difficult...
<wallyworld> faaark
<axw> wallyworld: I've got the raw diff, I guess I'll just email comments :/
<wallyworld> damn, sorry
<jamespage> axw, dimitern: ceph-osd has storage support in master branch - not yet released
<dimitern> wallyworld: oh wow it's happening! service -> application
<jamespage> cs:~openstack-charmers-next/xenial/ceph-osd
<wallyworld> dimitern: yeah, it is. omfg it's been a big job
<axw> jamespage: righto, thanks for clarifying
<dimitern> jamespage: nice, I'll use that then - did your rabbitmq-server fix for bug 1574844 land on cs:xenial/rabbitmq-server ?
<mup> Bug #1574844: juju2 gives ipv6 address for one lxd, rabbit doesn't appreciate it. <conjure> <juju-release-support> <landscape> <lxd-provider> <juju-core:Won't Fix> <rabbitmq-server (Juju Charms Collection):Fix Committed by james-page> <https://launchpad.net/bugs/1574844>
<dimitern> wallyworld: it will be worth enduring it now rather than later :)
<wallyworld> indeed. we need to get this all done for beta8
<wallyworld> since after beta8, we need to support upgrades
<dimitern> jamespage: another question - since all of my NUCs have 1 disk only, I decided to try emulating 2 disks by using a volume group with 3 volumes - root, ceph, and lxd (for nova-lxd) - seems to work
<dimitern> jamespage: well, the question is - should it work equally well like this I guess?
<dimitern> axw: I've found an issue with storage/looputil/ tests failing if you have a loop device attached locally (e.g. losetup /dev/loop0 /var/lib/lxc-btrfs.img)
<axw> dimitern: eep, sorry. they're meant to be isolated
<axw> dimitern: which test fails?
<axw> or is it all of them? :)
<dimitern> axw: attempted a fix here: http://reviews.vapour.ws/r/4871/diff/# (that was discarded, but I think about extracting and proposing a fix like the one in the diff in storage/)
<dimitern> axw: in parseLoopDeviceInfo
<dimitern> with the fix the tests no longer fail, but I guess the isolation issue is still present..
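The isolation problem is that `losetup -a` reports every loop device on the host, including ones the tests didn't create. A sketch of the kind of parse-and-filter fix being discussed (illustrative code, not storage/looputil's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// parseBackingFile extracts the backing file from one line of `losetup -a`
// output, e.g. "/dev/loop0: [0021]:7096 (/var/lib/lxc-btrfs.img)".
func parseBackingFile(line string) string {
	open := strings.Index(line, "(")
	end := strings.LastIndex(line, ")")
	if open < 0 || end < open {
		return ""
	}
	return line[open+1 : end]
}

// onlyUnder keeps loop devices backed by files under dir, so devices
// already attached on the host cannot leak into a test's view.
func onlyUnder(lines []string, dir string) []string {
	var out []string
	for _, l := range lines {
		if strings.HasPrefix(parseBackingFile(l), dir) {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	lines := []string{
		"/dev/loop0: [0021]:7096 (/var/lib/lxc-btrfs.img)",
		"/dev/loop1: [0021]:7097 (/tmp/test-123/disk.img)",
	}
	fmt.Println(len(onlyUnder(lines, "/tmp/test-123")))
}
```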
<jamespage> dimitern, not yet - working that today
<jamespage> dimitern, you can always test with the ones from ~openstack-charmers-next - that's an upload of the master branch of each charm as it lands
<dimitern> jamespage: I used -next initially, but a few deployments failed due to incompatible changes across revisions (e.g. lxd started using 'block-devices' vs 'block-device')
<axw> dimitern: seems like a reasonable change
<dimitern> axw: cheers, I'll propose it as a separate fix then
<axw> dimitern: could you please also file a bug about isolation? it should be fixed also, but it's a separate issue
<dimitern> axw: will do
<axw> thanks
<jamespage> dimitern, well those will always happen at some point in time - lxd was never actually released until 16.04 so we broke it before then...
<dimitern> jamespage: that's ok - it's under development still
<dimitern> jamespage: I really should've tried that earlier (full openstack deployment with lxc or lxd).. now I see how flaky our multi-nic approach is :/
<dimitern> lxc is *even* worse with the default lxc-clone: true .. any lxc always comes up with 2 IPs per NIC due to cloning ;(
<dimitern> frobware: are you around?
<frobware> dimitern: yep
<dimitern> frobware: wanna sync?
<frobware> dimitern: oh yes
<dimitern> frobware: ok, omw
<axw> wallyworld: sorry, this is just too immense for me to review
<wallyworld> ok, i wonder wtf is up with github :-(
<wallyworld> maybe i'll try reproposing
<axw> wallyworld: regardless of that, it's too big. the other one was big, and was about 1/4 the size diff
<wallyworld> the trouble is when you rename even one thing, the corresponding changes are huge
<axw> and that one was basically the same change repeated in most of the files, this one is all over the place
<axw> wallyworld: I don't know how to fix it, but I can't review as is
<wallyworld> ok, i'll see if it behaves a second time
<axw> wallyworld: one thing I did pick up was an inappropriate rename of juju/service package path
<axw> that's about systemd/upstart services, not juju services
<wallyworld> i didn't mean to do that
<wallyworld> axw: are you sure?
<wallyworld> it's not renamed as far as i can see
<axw> wallyworld: I just replied to your question via email. I misremembered, there's just an invalid comment change
<wallyworld> ah, rightio
<wallyworld> axw: i'll try and get the diff up using rbtool or something. after soccer
<axw> wallyworld: ok, but I don't think it's going to help much. it really needs to be broken up I think
<wallyworld> damn, that will be very difficult :-(
<wallyworld> since even an error message change percolates through several packages and tests
<axw> wallyworld: you could do all messages/strings in one branch. types in another, functions in another... I don't know. all I know is I can't perform any kind of useful review on a 60K line diff
<frobware> dimitern: https://bugs.launchpad.net/juju-core/+bug/1576674
<mup> Bug #1576674: 2.0 beta6: only able to access LXD containers (on maas deployed host) from the maas network <lxd> <maas-provider> <oil> <ssh> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1576674>
<dimitern> frobware: ta!
<dimitern> axw: fyi, bug 1587345
<mup> Bug #1587345: worker/provisioner: Storage-related tests not isolated from the host machine <tech-debt> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1587345>
<mup> Bug #1587345 opened: worker/provisioner: Storage-related tests not isolated from the host machine <tech-debt> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1587345>
<frobware> dimitern, fwereade: I'm going to skip the meeting in 15 - need to catch up with other stuff and other meetings later in the day too.
<dimitern> frobware: that's ok - marked you as optional anyway :)
<dimitern> fwereade: omw
<anastasiamac> dooferlad: standup
<babbageclunk> dimitern, fwereade, voidspace, frobware: review please? http://reviews.vapour.ws/r/4944/ It's the other side of the Mongo3.2 slowness fixes.
<dooferlad> anastasiamac: i am on holiday until Thursday :-)
<anastasiamac> dooferlad: ooh lucky some \o/ did not see it in team's calendar -sorry :) have fun
<dimitern> babbageclunk: sorry, was otp till now - looking
<fwereade> dimitern: further thoughts: nursery workers probably want a StringsWorker that notifies only on enter-relevant-state (or watcher-created-state-already-relevant)
<fwereade> dimitern, failures kinda need to be the nursery's responsibility to reschedule
<fwereade> dimitern, I'm not sure whether that will be better with more controller-side infra, or not
<dimitern> fwereade: a stringsworker as coordinator?
<fwereade> dimitern, if we do have more infra we should probably go all the way with it: so when we create a machine we add a schedule-whatever doc for "now", and if the worker fails it just sends a reschedule-because-error message that writes a new time on the ticket, updates the status(?) and moves on confident that the watcher will deliver it when required
<fwereade> dimitern, well, the worker wants to be based on some watcher? I don't know for sure that strings>notify
<dimitern> fwereade: strings sounds better as the nursery entities should be few and short-lived anyway
<dimitern> fwereade: I've taken notes for those
<dimitern> fwereade: that makes total sense
<dimitern> fwereade: it's already taking shape in my mind.. tyvm!
<dimitern> babbageclunk: ping
<babbageclunk> dimitern: pong
<dimitern> babbageclunk: how can I test your PR locally? run tests before and after apt install mongodb-3.2 ?
<babbageclunk> dimitern: Yup - although it's juju-mongodb3.2
<dimitern> babbageclunk: ok, I'll try it now
<dimitern> babbageclunk: cheers
<babbageclunk> dimitern: Then build, then run tests with JUJU_MONGOD=/usr/lib/juju/mongo3.2/bin ...
<babbageclunk> dimitern: Thanks for looking!
<dimitern> babbageclunk: oh, I see, ta! (was just wondering..)
<babbageclunk> dimitern: You might see the occasional failure in subnets_test - that's the one that seems to be a problem with txns between mgo and mongo3.2
<dimitern> babbageclunk: with your patch and both mongodb versions? or only with 3.2?
<babbageclunk> dimitern: yes, with my patch, only with mongo3.2.
<dimitern> babbageclunk: ok
<babbageclunk> dimitern: Actually you see the same without my patch.
<dimitern> babbageclunk: how much longer is "too long" with 3.2?
<babbageclunk> dimitern: ...but with mongo3.2
<babbageclunk> dimitern: Well, running the state tests with 3.2 for master takes longer than a couple of hours, so longer than I could bear to wait while doing something else. Running a small slice of them showed they were about 100x slower.
<babbageclunk> dimitern: Is the maas-juju meeting I have in my calendar still current?
<dimitern> babbageclunk: oh! I better not wait more than a couple of minutes over the time it took with 2.4 then :)
<dimitern> babbageclunk: anybody else around from maas?
<babbageclunk> dimitern: Not unless you're feeling very bored.
<babbageclunk> dimitern: No, I'm the only one here!
<dimitern> babbageclunk: I don't have anything new so I guess we should skip it
<babbageclunk> dimitern: Cool cool
<mup> Bug #1585836 changed: Race in github.com/juju/juju/provider/azure <azure-provider> <blocker> <ci> <race-condition> <regression> <unit-tests> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1585836>
<dimitern> babbageclunk: all tests pass with 3.2, the time difference is more than double here though
<dimitern> 233.236 s vs 477.708 s
<babbageclunk> dimitern: :(
<babbageclunk> dimitern: Not sure there's much I can do about it unfortunately.
<dimitern> babbageclunk: I'll run it a couple of times to get better stats
<babbageclunk> dimitern: Thanks
<redelmann> hi there
<redelmann> im seeing a lot of this output: http://paste.ubuntu.com/16863930/ in debug-log
<redelmann> juju version 1.25.5
<redelmann> it also happen in 1.24.x before upgrade to 1.25
<mup> Bug #1585300 opened: environSuite invalid character \"\\\\\" in host name <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1585300>
<dimitern> redelmann: on what cloud are you seeing this?
<redelmann> dimitern: aws
<dimitern> redelmann: have you changed firewall-mode for the environment?
<redelmann> dimitern: no, everything is in default
<dimitern> redelmann: and those machines showing the error - can you access exposed workloads on them regardless of the error?
<redelmann> dimitern: here is a larger output: http://paste.ubuntu.com/16863964/
<redelmann> dimitern: yes
<mup> Bug #1585300 changed: environSuite invalid character \"\\\\\" in host name <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1585300>
<redelmann> dimitern: ok, this is new: exited "firewaller": AWS was not able to validate the provided access credentials (AuthFailure)
<dimitern> redelmann: it looks like something is odd about your AWS credentials?
<frobware> dimitern, babbageclunk, voidspace: testing meeting?
<redelmann> dimitern: good point, i will research a little
<dimitern> redelmann: uh, sorry I need to take this - back in ~1/2h
<redelmann> dimitern: thank you
<babbageclunk> frobware: I don't have an invite?
<dimitern> babbageclunk: you should have now
<babbageclunk> dimitern: ta
<mup> Bug #1585300 opened: environSuite invalid character \"\\\\\" in host name <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1585300>
<mup> Bug # changed: 1518807, 1518809, 1518810, 1518820, 1519141, 1576266, 1583772
<dimitern> babbageclunk: I suspect the longer run-time was bogus as the second run finished in 482.571 s; running a third now
<babbageclunk> dimitern: oh, good - yeah, it's definitely pretty variable.
<dimitern> redelmann: I'd suggest to install the official AWS CLI tools and using the same AWS credentials you give to juju to try e.g. starting and stopping an instance with --dry-run
<dimitern> babbageclunk: you have a review
<babbageclunk> dimitern: cheers!
<dimitern> it seems the test actually take less time the more you run them :D
<rogpeppe> if anyone wanted to know what the Juju API looked like: http://rogpeppe-scratch.s3.amazonaws.com/juju-api-doc.html
<babbageclunk> dimitern: Sorry! I think I misled you about the value for JUJU_MONGOD - it needs to be the full path to mongod, including the filename, so for 3.2 you need to set it to /usr/lib/juju/mongo3.2/bin/mongod
<babbageclunk> dimitern: Otherwise it'll silently fall back to using the 2.4 binary.
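The gotcha babbageclunk describes, in guard form; a hypothetical pre-flight check one could drop in a test script (the path is the one quoted above, and the check is purely illustrative):

```shell
# JUJU_MONGOD must name the mongod binary itself, not its bin directory,
# or the test suite silently falls back to the 2.4 binary.
JUJU_MONGOD=/usr/lib/juju/mongo3.2/bin/mongod
case "$JUJU_MONGOD" in
  */mongod) echo "ok: points at the mongod binary" ;;
  *)        echo "warning: not a mongod path; tests may fall back to 2.4" ;;
esac
```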
<alexisb> rogpeppe, :)
<rogpeppe> alexisb: i don't think there are any API docs anywhere, right?
<alexisb> rogpeppe, nope
<alexisb> just what is on github
<dimitern> babbageclunk: that's what I did, yeah
<rogpeppe> alexisb: i guess some people might find this useful then
<dimitern> babbageclunk: it was using 3.2
<alexisb> rogpeppe, you bet!
<alexisb> thank you
<babbageclunk> dimitern: Huh. Ok, awesome!
<rogpeppe> alexisb: my pleasure :)
<dimitern> babbageclunk: I double checked by 'ps -ef|grep mongod' while the test was running
<perrito666> good morning
<babbageclunk> dimitern: Sweet, that would definitely be better.
<dimitern> perrito666: o/
<alexisb> frobware, I am going to be a few minutes late
<frobware> alexisb: ack
<dimitern> babbageclunk: hey! you know what?
<babbageclunk> dimitern: what?
<dimitern> babbageclunk: running the tests on 2.4 with your patch (which I've just thought to do), actually cuts the run-time in *half* !
<frobware> dimitern: heh, nice
<babbageclunk> dimitern: Wow, that's a bigger change than I saw. But I didn't want to get excited about the tests going faster under 2.4, because that will just make the pain of 3.2 more intense.
<dimitern> babbageclunk: running again to get a better sample
<dimitern> babbageclunk: well, it's quite reasonable to expect this, especially with not doing things like recreating dbs for *every* test case
<babbageclunk> dimitern: Yeah, definitely - it's doing a lot less.
<dimitern> babbageclunk: I can confirm - ~276 s on 2.4 vs ~476 on 3.2
<dimitern> babbageclunk: great job! please land this soon! :)
<babbageclunk> dimitern: ok!
<dimitern> babbageclunk: I did a final run with 3.2 with comparable system load levels - it's still ~476
<babbageclunk> dimitern: Cool.
<babbageclunk> dimitern: If someone's commented on a change, and I think I've addressed those comments, I should probably wait for them to put a Ship It on the review before merging it, right?
<natefinch> babbageclunk: if it was trivial, then just go ahead and merge.  If it was something complicated, I usually wait to make sure they agree with my change
<natefinch> babbageclunk: that's assuming you got a "fix it, then ship it"... if you didn't get the ship it, then definitely ask if they intended to give you a ship it or if they think it will need a re-review
<babbageclunk> natefinch: Makes sense - thanks!
<mup> Bug #1587503 opened: LXD provider fails to set hostname <juju-core:New> <https://launchpad.net/bugs/1587503>
<babbageclunk> natefinch: (I got a bit excited and didn't check with someone for a previous change, when I realised I figured it was probably a bit rude.)
<voidspace> babbageclunk: dimitern: frobware: sorry I missed the testing meeting - I was at the dentist. I have the invite now.
<natefinch> hmm... pretty sure I'm a team of 1 for my standup
<dimitern> voidspace: that's ok - we'll have one every other week
<voidspace> dimitern: yeah, I saw. Great.
<dimitern> voidspace: the gist of it is we need to come up with a list of relevant networking tests that could be added to the current CI
<voidspace> dimitern: right, sounds like a good start
<babbageclunk> frobware: Are you ok with me merging http://reviews.vapour.ws/r/4944
<babbageclunk> frobware: ?
<arosales> natefinch: or katco or anyone familiar with resources could we get your ack on https://github.com/juju/docs/pull/1122
<arosales> need this to land in docs asap so we can have charmers start using terms
<natefinch> arosales: will look
<arosales> natefinch: thanks!
<rogpeppe> fwereade, dimitern: you might like this: http://rogpeppe-scratch.s3.amazonaws.com/juju-api-doc.html
<dimitern> rogpeppe: awesome! tyvm
<dimitern> rogpeppe: what did you use for generating this?
<rogpeppe> dimitern: some code :)
<rogpeppe> dimitern: one mo and i'll push it to github
<rogpeppe> dimitern: it does rely on a function i implemented in apiserver/common to get all the facades
<dimitern> rogpeppe: nice! cheers
<mup> Bug #1587552 opened: GCE Invalid value for field 'resource.tags.items <blocker> <ci> <gce-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1587552>
<frobware> babbageclunk: sorry, was otp
<babbageclunk> frobware: no worries - not blocking me, just thought I'd check
<fwereade> rogpeppe, nice
<rogpeppe> fwereade: obviously a lot of the comments are Go-implementation-oriented, but it's better than nothing :)
<fwereade> rogpeppe, absolutely
<perrito666> bbl
<rogpeppe> dimitern: i've pushed the code to github.com/rogpeppe/misc/cmd/jujuapidochtml and github.com/rogpeppe/misc/cmd/jujuapidoc
<rogpeppe> dimitern: the former just generates the HTML; the latter generates the computer-readable form that jujuapidochtml works from
<dimitern> rogpeppe: thanks! starred and bookmarked :)
<alexisb> natefinch, is the PR in this bug merged?: https://bugs.launchpad.net/juju-core/+bug/1581885
<mup> Bug #1581885: Rename 'admin' model to 'controller' <juju-release-support> <usability> <juju-core:In Progress by natefinch> <https://launchpad.net/bugs/1581885>
<alexisb> seems to be??
<natefinch> alexisb: yes, sorry, it's in
<natefinch> alexisb: I marked it as such
<alexisb> natefinch, awesome, thank you
 * thumper looks sadly at the calendar for this morning and sighs
<rick_h_> thumper: look at all those opportunities!
 * perrito666 wonders why the code is working
<natefinch> it's never good when you have to wonder why something works
 * thumper slaps rick_h_
<perrito666> uh, did you just ....rickslapped him? <puts sun glasses> <cue csi miami music>
 * rick_h_ says "Thank you sir may I have another?"
<redir> brb reboot
<thumper> trivial review for someone http://reviews.vapour.ws/r/4948/
<perrito666> thumper: ship it
<thumper> perrito666: ta
<mup> Bug #1587644 opened: jujud and mongo cpu/ram usage spike <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1587644>
<mup> Bug #1587236 changed: no 1.25.5 tools for vivid? <juju-core:Won't Fix> <https://launchpad.net/bugs/1587236>
<mup> Bug #1587653 opened: juju enable-ha accepts the --series= option  <cdo-qa> <ha> <juju-core:New> <https://launchpad.net/bugs/1587653>
<mup> Bug #1587653 changed: juju enable-ha accepts the --series= option  <cdo-qa> <ha> <ui> <juju-core:New> <https://launchpad.net/bugs/1587653>
<mup> Bug #1587236 opened: no 1.25.5 tools for vivid? <juju-core:Won't Fix> <https://launchpad.net/bugs/1587236>
<mup> Bug #1587653 opened: juju enable-ha accepts the --series= option  <cdo-qa> <ha> <ui> <juju-core:New> <https://launchpad.net/bugs/1587653>
<perrito666> alexisb: did you need anything?
 * perrito666 notices he is answering old messages because of lag
<alexisb> perrito666, nope, I got an update from Ian, thanks!
<perrito666> oh, I get my updates from the internet, I just apt-get update :p
<alexisb> :)
 * perrito666 tries to unfreeze
<thumper> alexisb, wallyworld: http://reviews.vapour.ws/r/4949/diff/#
<thumper> I've not done other packages because I think we actually need some of them
<wallyworld> ok
<thumper> potentially we could do the apiserver too
<thumper> but haven't done that yet
<thumper> state takes ages
<alexisb> thumper, well that was a simple fix to remove a bunch of pain
<alexisb> thank you
<thumper> alexisb: I told you it wouldn't be big
 * perrito666 runs tests for state and wonders if he could make dinner while he waits
<axw> thumper: in http://reviews.vapour.ws/r/4925/, you say "business object layer". would it make sense to have it in the core package maybe? I still think it belongs in "names" at the moment for consistency, but maybe if we were to move core outside, and fold names into it?
<thumper> axw: have it in names for now
<axw> okey dokey
<mup> Bug #1463420 changed: Zip archived tools needed for bootstrap on windows <simplestreams> <tools> <windows> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1463420>
<alexisb> menn0, it seems our 1x1 has fallen off the calendar
<menn0> alexisb: yeah, we haven't had one for a while
<alexisb> I will put some time for us tomorrow
<menn0> alexisb: sounds good
#juju-dev 2016-06-01
<wallyworld> perrito666: i made some more comments
<davecheney> how's LP, is it still crapped up ?
<perrito666> wallyworld: ack, re registering in the shim, I just cargoculted that from other use of that pattern, ill change it
<perrito666> wallyworld: I can do the per series disallowance, that is free
<perrito666> wallyworld: I answered
<wallyworld> perrito666: so we copy whatever init files are relevant to that internal directory? upstart or systemd?
<davecheney> thumper: when you skip a test you should use t.Skipf
<davecheney> thumper: https://github.com/juju/juju/pull/5496/files#diff-758d572a629db5b9982ba1841b64dda5R14
<davecheney> thumper: but, two thumbs up for not running the state tests on windows
<perrito666> wallyworld: yup, it's quite a clever thing that eric did, we create the init files there and they are linked to the right path
<wallyworld> perrito666: ok, so long as it all works :-)
<perrito666> mmpf, trying to bootstrap trusty with --config enable-os-upgrade=false --config enable-os-refresh-update=false never ends :(
<thumper> davecheney: ah, ok
<thumper> will remember for next time
<mup> Bug #1587689 opened: Can't upgrade controller with --upload-tools (due to rename) <juju-core:New for natefinch> <https://launchpad.net/bugs/1587689>
<davecheney> thumper: s'ok, I had to rebase my branch anyway, so I fixed it for you
<axw> wallyworld anastasiamac: I'm back
<axw> free to chat now
<anastasiamac> axw: wallyworld: tanzanite?
<wallyworld> ok, one sec
<axw> anastasiamac wallyworld: I'm there
<davecheney> thumper: can we do the same thing we did to the windows state tests timing out to here
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1581157
<mup> Bug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dave-cheney> <https://launchpad.net/bugs/1581157>
<davecheney> ?
<davecheney> the underlying cause is the same
<thumper> state was different because we don't run apiservers on windows
<thumper> however jujud is needed
<thumper> I'd be loath to skip those on windows
<thumper> however
<thumper> we could skip it for now until the npipe implementation is fixed
<thumper> as long as we tag the skip with a bug number
<thumper> and have the bug targetted to 2.0
<thumper> thoughts?
<davecheney> thumper: it ain't going to get fixed soon
<thumper> davecheney: why? can't the windows npipe impl just use select?
 * thumper taps his fingers waiting for tests
<perrito666> wallyworld: around?
<wallyworld> yes
<perrito666> wallyworld: answer my priv then :p
<wallyworld> perrito666: was getting coffee
<wallyworld> perrito666: it's already committed
<wallyworld> perrito666: but actually
<wallyworld> it can't be used yet, i'll fix
<wallyworld> after you land
<wallyworld> so ignore me
<perrito666> lol, ok, because I tried merging master and did not bring the change
 * perrito666 ponders getting coffee then remembers the tim
<perrito666> time
<thumper> menn0: got a second?
<menn0> thumper: yep
<thumper> menn0: 1:1 hangout
<davecheney> thumper: 'cos it's windows and select doesn't exist on windows
<thumper> wallyworld: are you touching core/description with the Service -> Application rename at all?
<wallyworld> thumper: yeah
<thumper> ok
<wallyworld> if you make changes between now and landing, i'll do the merge work
<thumper> I'm not
<thumper> but I was going to if you weren't
<wallyworld> all good, i'm covering everything
<davecheney> thumper: the metrics sender uses a global channel to send metrics ...
<davecheney> one channel
<davecheney> per process
<davecheney> no matter how many workers or manifolds are running
<thumper> I reader many writers?
<davecheney> no, it's just another global that has to be patched everywhere
<davecheney> thumper: oh, it gets worse
<davecheney> now I've extracted that global
<davecheney> it is clear nothing is draining that channel
<davecheney> so shit is sending into it (it's not buffered of course) and nothing is reading from it
<thumper> uh... wat?
<davecheney> it's only used in tests
<mup> Bug #1587701 opened: worker/metricworker: test timeout under race detector <juju-core:New> <https://launchpad.net/bugs/1587701>
<davecheney> thumper: up, when not patched by tests, metricsworker.notify is nil
<thumper> um...
<thumper> isn't that ok though
<thumper> ?
<thumper> if notify is nil is it still sending?
<davecheney> its only used as a hook during tests
<davecheney> a nil channel is never ready to send or receive it is ignored in select
<davecheney> looking into this a little further, almost everything in that package is exported because tests
<davecheney> nothing is called outside juju
<davecheney> err, calling outside test scope
 * thumper sighs
 * thumper smacks himself in the face again with the lxd tests
<thumper> FAIL	github.com/juju/juju/worker/terminationworker	1200.065s
<thumper> failed to stop
<thumper> 	/home/tim/go/src/github.com/juju/juju/worker/terminationworker/worker_test.go:63 +0x1ee
<thumper> bah humbug
<davecheney> thumper:         -n
<davecheney>                 do not execute test binaries, compile only
<davecheney> ^ I call this nope mode
<davecheney> i just added it to gb to compile, but not run the tests
<anastasiamac> axw: wallyworld:thumper: one less invalid login ... https://github.com/juju/juju/pull/5506
<axw> looking
<axw> anastasiamac: reviewed
<axw> anastasiamac: and I don't know why, but I reviewed on github
<anastasiamac> axw: i saw.. thank you :D m  moving logic around based on suggestion...
<anastasiamac> axw: revew on github is my mistake - i've sent u github link instead of RB
<axw> anastasiamac: that explains it :p
<davecheney> thumper: menn0 https://github.com/juju/juju/pull/5507
<menn0> davecheney, thumper looking
<axw> wallyworld: in cmd/juju/application/unexpose.go there's still "A application"
<thumper> davecheney: how does this fix the timeout?
<wallyworld> axw: damn, i must have had the case checkbox ticked, i'll fix in the next one
<davecheney> thumper: 1, removes the use of the suite
<davecheney> 2, adds buffering to the channel
<davecheney> thumper: look at metricmanager_test, that has a buffer of 2
<menn0> davecheney: good stuff. ship it.
<davecheney> but when testing the sender and the cleanup, they didn't have a buffer
<thumper> davecheney: shipit
<davecheney> thanks
<davecheney> i was also pleased that I could unexport most of the code in that package
<davecheney> which made passing the notify channel for test less gross
<menn0> thumper: simple alias help formatting change: http://reviews.vapour.ws/r/4953/
<thumper> shipit
<davecheney> thumper: there's just one blocker left on the race build https://bugs.launchpad.net/juju-core/+bug/1587716
<mup> Bug #1587716: worker: test timeout during race build <juju-core:New> <https://launchpad.net/bugs/1587716>
<davecheney> i'll look a that after lunch
<thumper> awesome
<thumper> thanks davecheney
<davecheney> no worries
<menn0> thumper: password check now passes for controller workers using HTTP endpoints for a hosted model
<thumper> hazaah
<menn0> thumper: still need to sort out the nonce issue though
<menn0> thumper: i'll be pulling this out into a separate PR
<thumper> sounds good
<menn0> it's reasonably terrible
<menn0> and problematic if I screw it up
<anastasiamac> axw: cleaned and all tests pass. here is RB link :D http://reviews.vapour.ws/r/4951/
<anastasiamac> axw: which is to say.. thre is probably nothing that really tests this well.. i'll talk to ppl to get better ci coverage i think
<mup> Bug #1587716 opened: worker: test timeout during race build <juju-core:New> <https://launchpad.net/bugs/1587716>
 * thumper sighs
<thumper> where has --upload-tools gone?
<thumper> bugger
<thumper> pebkac
<axw> anastasiamac: couple more small things
<mup> Bug #1587734 opened: worker/machiner: test failure during race build <juju-core:New> <https://launchpad.net/bugs/1587734>
<bradm> is there anyway to kill a juju run action?
<bradm> got some queued that will never run
<wallyworld> axw: FFS, why is it when you really need to land something, the bot hates you
<axw> wallyworld: :(
<wallyworld> still trying to land branch #2
<axw> wallyworld: I've paused my branch, started a new one to restructure instancecfg. I need to pass more bootstrap params through, and the ball of mud in there was driving me crazy
<axw> now finding bugs in providers, woo
<wallyworld> yay
<dimitern> dooferlad: ping
<dimitern> dooferlad: didn't we port the fix that puts add-juju-bridge.py in /var/tmp vs /tmp from 1.25 to master ? precise is still broken because of this..
<wallyworld> axw: yay, branch #2 landed, will propose #3
<dooferlad> dimitern: yes, we did. Thought I had. I am on holiday until tomorrow so can you do it / file a bug / drop me an email to remind me?
<dimitern> dooferlad: ah, ok - sorry then, enjoy your holiday :)
<wallyworld> axw: only 235 files this time :-/ https://github.com/juju/juju/pull/5509
<frobware> dimitern: not sure we have
<frobware> dimitern: re: /var/tmp
<dimitern> frobware: we haven't
<dimitern> frobware: I've verified this earlier, but it only affects precise fortunately
<frobware> dimitern: I don't see it in master or 1.25 :(
<dimitern> frobware: it's in 1.25
<dimitern> frobware: https://github.com/juju/juju/blob/1.25/provider/maas/bridgescript.go#L11
<frobware> dimitern: right. sorry - my ref juju clone was out of date
<frobware> dimitern: the scripts should be identical in 1.25 and master -- at least that's my expectation
<frobware> dimitern: so the /var/tmp change is missing. the other change I think dooferlad has in flight, or in review
<dimitern> frobware: the script is the same AFAICS, only the path is different
<frobware> dimitern: I see a difference
<frobware> dimitern: let me propose the /var/tmp change
<dimitern> fwereade, jam: standup?
<axw> wallyworld: looking now
<axw> 10000 lines fewer than last time I think? :)
<frobware> babbageclunk, dimitern: http://reviews.vapour.ws/r/4957/
<wallyworld> axw: yeah, on home run now. still lots of vars etc not done. but those can wait
<dimitern> frobware: LGTM, testing on precise now
<axw> menn0: I wonder if we should split the help text out into separate files, and compile them in afterwards. might make it easier for, say, the docs team to manage them separately?
<axw> menn0: also would be a sensible direction if we ever want to i18n
<fwereade> axw, menn0: +1, but, omg, i18n, /shudder
<axw> fwereade: heh, yeah. but won't somebody please think of the enterprises!
<menn0> axw: yeah, not a bad idea
<menn0> fwereade: I meant to say during the call, that a potential idea for the sprint would be to run a workshop where we run through code with issues and review them as a group
<menn0> I wouldn't use specific code from Juju but come up with code that's heavily inspired by existing code, past or current
<menn0> it would be helpful to work through a few examples, say a new worker, some apiserver changes, a client API change and a CLI change
<menn0> and then try to hit as many of the issues you address in your doc as possible
<fwereade> menn0, I am writing bits of http://paste.ubuntu.com/16886293/ when I need a break from agent tests -- and it STM that, to be useful, it needs to cover juju-specific scenarios and will thus inevitably end up with content a bit like that
<dimitern> frobware: your fix works as far as the bridges get created on precise, however it seems lxc is still broken on precise due to: 2016/06/01 09:39:17 http: TLS handshake error from 10.12.19.103:43989: tls: client offered an unsupported, maximum protocol version of 302
<axw> wallyworld: shipit
<wallyworld> axw: you are awesome ty
<frobware> dimitern: I'm ignoring LXC until advised not to
<dimitern> frobware: yeah, just saying :)
<frobware> dimitern: so I do notice on precise that we end up with an additional eth0.cfg (not 50-cloud-init.cfg) :(
<dimitern> frobware: that's coming from the cloud image
<frobware> dimitern: yep, it's just different from trusty, xenial
<mup> Bug #1587689 changed: Can't upgrade controller with --upload-tools (due to rename) <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1587689>
<mup> Bug #1587739 opened: rackspace,vsphere: firewalling in HA setup is broken <juju-core:Triaged> <https://launchpad.net/bugs/1587739>
<mup> Bug #1587788 opened: MAAS bridge script needs to reside in /var/tmp on precise <bootstrap> <network> <juju-core:Fix Committed by frobware> <https://launchpad.net/bugs/1587788>
<fwereade> jam, incidentally, a picture of what's happening in an agent: http://paste.ubuntu.com/16889596/
<dimitern> fwereade: do you have ~15m for a quick chat?
<fwereade> dimitern, sure
<dimitern> fwereade: I'm in today's standup HO
<alexisb> natefinch-afk, please ping when you are in
<natefinch> alexisb: ping
<mup> Bug #1587701 changed: worker/metricworker: test timeout under race detector <blocker> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1587701>
<babbageclunk> dimitern: I'm tempted to have a goroutine sleep in a test. Am I a bad person?
<dimitern> babbageclunk: :) you mean os.Sleep?
<mup> Bug #1553292 changed: TestGoroutineProfile dial unix : no such file or directory <go1.5> <go1.6> <intermittent-failure> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1553292>
<babbageclunk> dimitern: yup
<dimitern> babbageclunk: that actually is a known trick to yield back to the scheduler (with os.Sleep(0) for example)
<dimitern> babbageclunk: but otherwise it's rarely a good idea
<natefinch> babbageclunk: sleeping is really only bad if you wait until the thing wakes up.
<dimitern> indeed, natefinch
<babbageclunk> dimitern: Well, it's that or the goroutine goes into a tight spin until its OplogTailer gets stopped (which should happen very shortly after).
<natefinch> dimitern: you shouldn't ever need to yield to the scheduler manually... every function call is a yield point these days.  Unless you're doing odd math stuff in a tight loop, you probably never need to worry about yielding
<babbageclunk> dimitern: But yeah, it seemed like it was probably a bad idea.
<natefinch> babbageclunk: well, ideally you'd have a channel that gets closed when the tailer stops
<dimitern> natefinch: personally, I haven't found a good use for os.Sleep(0) except "let's see what will happen" :)
<babbageclunk> natefinch: yeah, I do
<babbageclunk> natefinch: but between it running out of canned data and getting stopped by the test, it goes into a spin.
<babbageclunk> natefinch: When running for real it's querying mongo with a timeout, but that's faked out, essentially.
<babbageclunk> natefinch: I think I'm worrying about nothing. Or at least, putting a sleep in to mimic the timeout would be actively worse.
 * babbageclunk deliberately adds imports out of order for the joy of seeing them pop into place.
<natefinch> lol
<natefinch> I copy and paste code and intentionally don't even try to make my edited code look nice, because why bother when gofmt will just fix it up?
<mup> Bug #1568179 changed: filestorageSuite.TestRelativePath fails because s390x host /tmp is different <ci> <s390x> <test-failure> <unit-tests> <juju-core:Fix Released by reedobrien> <https://launchpad.net/bugs/1568179>
<babbageclunk> alexisb: ping?
<alexisb> babbageclunk, pong
<alexisb> whats up?
<alexisb> oh you can merge your bug
<alexisb> just tag it
<babbageclunk> Ok, that was it - thanks!
<perrito666> bbl lunch
<babbageclunk> dimitern: I've tagged my bug with blocker and ci - should it be showing up on juju.fail?
<babbageclunk> dimitern: It isn't.
<babbageclunk> gah
<mgz> babbageclunk: only if critical
<mgz> babbageclunk: you can just run the script locally
<babbageclunk> mgz: Ah, that'll be it.
<babbageclunk> mgz: Where do I get it?
<mgz> lp:juju-ci-tools
<babbageclunk> mgz: great, thanks
<babbageclunk> mgz: Welcome back by the way!
<mgz> ./check_blockers.py check master
<mgz> and wow, big list
<mgz> want python-launchpadlib - can install via apt or pip
<mgz> babbageclunk: thanks :)
<babbageclunk> mgz - ok, so as you said, needs to be critical as well.
<alexisb> babbageclunk, you should be good now
<babbageclunk> alexisb: Thanks!
<natefinch> man, just used the 1.25 local provider.. forgot how fast it bootstraps.
<perrito666> fast to bootstrap, fast to wipe your env
<perrito666> and your machine
<perrito666> and your network
<mup> Bug #1588041 opened: juju bootstrap with vsphere provider hangs with xenial <oil> <juju-core:New> <https://launchpad.net/bugs/1588041>
<natefinch> perrito666: I know I know... I just have a need for speed.
 * thumper dashing to Jessie's school to drop off homework
<perrito666> shame on you, you do your daughter's homework?
<thumper> homework delivered
<perrito666> there was a way to bootstrap (as in bootstrap.Bootstrap) that prevents upgrade after bootstrap, is that still possible?
<natefinch> perrito666: probably passing config in with bootstrap
<alexisb> wallyworld, thumper will be a few minutes late
<wallyworld> ok
<menn0> anastasiamac: should bug 1514874 be Fix Committed now? (for 1.25 and master)
<mup> Bug #1514874: "Invalid entity name or password" error with valid credentials. <blocker> <juju-core:In Progress by anastasia-macmood> <juju-core 1.25:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1514874>
<anastasiamac> menn0: m waiting for stakeholder to verify..
<anastasiamac> menn0: i'll have more info soon... m not too keen to "fix commit" atm.. it's a bit of a handful bug \o/
<mup> Bug #1559299 opened: cannot obtain provisioning script <blocker> <bootstrap> <ci> <manual-provider> <regression> <xenial> <juju-core:Incomplete> <juju-core 1.25:Triaged> <juju-core api-call-retry:Fix Released by axwalk> <https://launchpad.net/bugs/1559299>
<anastasiamac> menn0: but fwiw i *think* it's fixed ;-P
<menn0> anastasiamac: ok np... just thought the bug status update had been missed :)
<anastasiamac> menn0: no :-D it's intentional. thnx for checking
<mup> Bug #1588084 opened: "upgrade-juju --upload-tools" is hard coded to the model name "admin" <juju-core:Triaged> <https://launchpad.net/bugs/1588084>
<menn0> thumper and anastasiamac : trivial change to change command alias formatting: http://reviews.vapour.ws/r/4961/
<thumper> shipit
<anastasiamac> menn0: u have 3 shipits \o/
<menn0> anastasiamac: that should do it :)
<anastasiamac> yes \o/ highly desired, i guess :-P
<axw> anastasiamac: do we still need to store image metadata in simplestreams format in the blob store?
<axw> wallyworld: ^^
<anastasiamac> axw: for 2,0?
<axw> anastasiamac: yes
<anastasiamac> axw: i don't think so
<anastasiamac> axw: i don't think we need to read from it anymore...
<axw> anastasiamac: I don't see why it should be there either, so I'll take it out and see what breaks
<axw> thanks
<anastasiamac> axw: nps :D
<wallyworld> axw: yeah, simplestream is external to juju, we model the image/tools metadata in the blobstore
<axw> it's nice when putting things into a good structure shakes out a lot of old and broken things
<axw> wallyworld: yep, just wasn't sure if anything was still reading it. sounds like not, which is what I expected/hoped
<wallyworld> yup
<anastasiamac> axw: r u resolving an issue around it or just a cleanup while u r in the area?
<axw> anastasiamac: I'm changing the way we transfer args to the bootstrap agent, and part of that means changing how we transfer custom image metadata
<anastasiamac> axw: awesome \o/
<axw> anastasiamac: I'm transferring it in the original ImageMetadata struct format, so there's no need to save to disk then read back off
<axw> anastasiamac: and while I was there I saw that we were still writing to blobstorte
<anastasiamac> axw: brilliant - thnx :D
<axw> store*
<axw> it'll be much more straight forward to parameterise the bootstrap agent now
<wallyworld> redir: standup?
<mup> Bug #1588084 changed: "upgrade-juju --upload-tools" is hard coded to the model name "admin" <juju-core:Triaged> <https://launchpad.net/bugs/1588084>
<redir> sorry
<redir> brt
<thumper> davecheney: ping
<thumper> davecheney: unping
<cmars> thumper, davecheney see my comments on LP:#1581157
<mup> Bug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dave-cheney> <https://launchpad.net/bugs/1581157>
<thumper> davecheney: actually if you are around, I'd like to chat quickly
<mup> Bug #1588092 opened: juju-2.0 has no way to cancel an action <canonical-bootstack> <juju-core:Triaged> <https://launchpad.net/bugs/1588092>
<mup> Bug #1588095 opened: help for juju run-action refers to commands that don't exist <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1588095>
<alexisb> menn0, I will be a few minutes late
#juju-dev 2016-06-02
<menn0> alexisb: no worries. i'm in the hangout when you're ready
<wallyworld> redir: can you pop back into the standup hangout?
<thumper> axw: just looking at migrating the storage collections
<thumper> axw: which order should I do them?
<thumper> what's the dependency tree like?
<thumper> axw: storageC first?
 * thumper thinks block devices first
<axw> thumper: I'd do block devices, volumes, volume attachments, filesystems, filesystem attachments, storage, storage attachments
<thumper> axw: ta
<natefinch> ahhhhhhhhh
<natefinch> no wonder my logging isn't coming through: reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG"
<perrito666> oghh, we suck at naming things
<perrito666> c.client.Do, really? the one thing that is not a verb and the method is called Do
<natefinch> perrito666: https://golang.org/pkg/net/http/#Client.Do
<natefinch> perrito666: should be called send
<natefinch> thumper: where do we store the logging config on the server?  I'm debugging a system where the juju client doesn't work, but the log level is too high for what I need.  I didn't see it in the agent.conf
<perrito666> natefinch: lol, seems like a widespread bad practice
<menn0> natefinch: you want to change the log level of the client or the server?
<menn0> natefinch: to change the logging config for a model:
<menn0> juju set-model-config logging-config='<root>=DEBUG'
<menn0> natefinch: for the client:
<menn0> JUJU_LOGGING_CONFIG='<root>=DEBUG' juju --debug ...
<menn0> the specific logging config is up to you of course
<natefinch> wallyworld: https://github.com/crewjam/rfc5424
<thumper> natefinch: it is stored in the config
<thumper> for a model
<natefinch> thumper: where is the config stored?
<natefinch> thumper: in the DB?
<thumper> environs/config/config.go
<davecheney> thumper: sorry i missed your ping
<davecheney> i was on another call
<davecheney> and then it scrolled off the screen
<davecheney> THE RACE BUILD IS GREEN! http://data.vapour.ws/juju-ci/products/version-4018/run-unit-tests-race/build-1518/consoleText
<davecheney> oh
<davecheney> it's not
<thumper> heh
<davecheney> + echo 'Race detection compliancy is not supported by 1.25.6'
<davecheney> Race detection compliancy is not supported by 1.25.6
<davecheney> it's jus been disabled ...
<davecheney> compliancy is not a word people
<thumper> davecheney: I've sorted my issue any way
<davecheney> cool
<davecheney> glad I could help
<mup> Bug #1588135 opened: worker/terminationworker: test timeout during race build <juju-core:New> <https://launchpad.net/bugs/1588135>
<mup> Bug #1588137 opened: cmd/jujud/agent: incorrect use a sync.Waitgroup <juju-core:New> <https://launchpad.net/bugs/1588137>
<menn0> thumper, wallyworld, axw, anastasiamac : two of you please: http://reviews.vapour.ws/r/4963/
<wallyworld> ok
<axw> menn0: looking
<menn0> thanks
 * thumper rages
<thumper> doer?
<thumper> really?
<thumper> I'm going to doer the next person that adds 'er' on to things it shouldn't be on
<davecheney> thump-er
<davecheney> thumper: this bug is bad, https://launchpad.net/bugs/1588137
<mup> Bug #1588137: cmd/jujud/agent: incorrect use a sync.Waitgroup <juju-core:New> <https://launchpad.net/bugs/1588137>
<axw> is she a doer, eh, eh
<axw> menn0: LGTM
<natefinch> lol doer, yeah
<natefinch> thumper: so.... where is the config stored?  the DB, I presume, since it's per-model?
<mup> Bug #1588143 opened: cmd/juju/controller: send on a closed channel panic <juju-core:New> <https://launchpad.net/bugs/1588143>
<thumper> natefinch: in the environ config
<thumper> yes, the db
<thumper> pretty sure it is in the modelC
 * thumper thinks
<thumper> no
<thumper> maybe
<thumper> can't remember
<thumper> davecheney: hmm... interesting
<thumper> davecheney: we obviously need some way to reject the connection before trying to do wg.Add
<davecheney> thumper: i have a fix
<thumper> davecheney: coolio
<davecheney> this is important to fix because if the wg count goes wrong, the apiserver doesn't stop and -- tada, test timeout
<davecheney> it's probably a rare race
<davecheney> i'm trying to reproduce it locally with a stress test
<davecheney> thumper: short version is we need to reverse the order of operations inside the srv.traceRequest method, plus a bit of other cleanup
<thumper> k
<davecheney> what is o_O is the way the apiserver shuts down when the mongo pinger returns an error
<davecheney> there is no other way to do it
<davecheney> that I can see
<davecheney> which is kind of crap
<davecheney> the mongo pinger should be external to the apiserver and call its Close method externally
<davecheney> rather than being tightly coupled
<mup> Bug #1588147 opened: when i bootstrap whith the option --metadata-source=/root/juju  option,there is not any debug or error info about the  specified the metadata source <juju-core:New> <https://launchpad.net/bugs/1588147>
<davecheney> thumper: good news, I can repro the failure locally
<davecheney> testing fix now
<thumper> awesome
<thumper> that's often the hard bit
<davecheney> thumper: http://reviews.vapour.ws/r/4964/
<davecheney> ^ careful review appreciated
<davecheney> i need to run some more stress tests before I'm confident with this one
<natefinch> I cannot, for the life of me, figure out why my log statements aren't getting hit.
<mup> Bug #1498081 changed: loopback storage provider not completing successfully <storage> <juju-core:Expired> <https://launchpad.net/bugs/1498081>
<thumper> davecheney: looks good to me.
<davecheney> thumper: thanks, my laptop is currently screeeaming running a stress test
<davecheney> i'll submit this later if it looks ok
<natefinch> wallyworld or anyone else want to tell me what stupid thing I'm doing such that I'm not seeing logging output where I think I should?
<wallyworld> be careful what you wish for
<wallyworld> natefinch: do you have info?
<natefinch> wallyworld: I added some log lines, at Error level (although david fixed the log level on the machines, so debug+ is shown).  I'm definitely running my function since the error message is correct, but I'm not getting the log statements.  I checked the md5 of the jujud I built and the one that's running and they're the same.
<wallyworld> natefinch: where do you see the error message if not in the logs?
<natefinch> wallyworld: what I see is the error message returned from the function that my log messages are in... but the log messages aren't getting triggered, even though one of them is at the very top of that function
<wallyworld> and you are using debug log?
<natefinch> wallyworld: I never use debug log.  it's too damn slow.  I'm just tailing the machine log
<wallyworld> hmmm, have you tried looking in the logsink file also?
<natefinch> I don't even know what the logsink file is
<wallyworld> it's what the controllers write out as a backup to the logs in the db
<natefinch> ahh hmm
<natefinch> where is it?
<wallyworld> /var/log/juju maybe
<natefinch> wallyworld: oh, I'm a dip... this is 1.25, btw
<wallyworld> oh, then allmachines.log is relevant right?
<natefinch> wallyworld: yes
<wallyworld> and there's nothing in there?
<natefinch> not for my log lines, no
<wallyworld> this is a machine agent on a worker machine?
<natefinch> yeah, it's a container on a non-controller machine
<wallyworld> and you're looking at /var/log/juju in the container rootfs?
<natefinch> wallyworld: yep
<wallyworld> and it's not something that the agent on the host machine of the container logs?
<natefinch> I replaced the jujud binary for the container, because the container hit the bug we're looking for
<wallyworld> i think you should have done upgrade-juju to ensure all the binaries are the same
<natefinch> be that as it may.... the only thing I really care about is getting a tiny bit more logging on this one machine, and not trying to upgrade an entire openstack deploy before doing so
<wallyworld> the only thing i can think of is that the log is not coming from where you think it is
<wallyworld> and an older binary is not producing it
<natefinch> I think what I'll do is change the error message that is coming out of it, to make sure it's actually running my code
<wallyworld> that's a good step
<natefinch> also adding some old fashioned printf, which I think we still redirect to the logs
<bradm> is there any way to force what IP juju agent uses?  I'm getting it using the wrong ip, and I want to force it to the right one
<bradm> basically it seems to be using the numerically lowest one
<bradm> looks like bug 1469193
<mup> Bug #1469193: juju selects wrong address for API <kvm> <local-provider> <lxc> <network> <sts> <juju-core:Expired> <https://launchpad.net/bugs/1469193>
<bradm> this isn't with lxc, but with maas
<bradm> well, no local provider, we do have lxcs being spun up
<wallyworld> bradm: i don't think there's a way to force it
<bradm> wallyworld: .. really.
<bradm> that's going to suck, its picking an IP from the cloud floating IP range as the juju api server
<wallyworld> it's not exactly straightforward - we can't guarantee to know the available ips ahead of time
<wallyworld> so any solution requires a bit of thought to cover all corner cases
<bradm> right..
<wallyworld> bradm: I think network spaces is intended to contribute to a solution, for 2.0 at least
<wallyworld> you can bind a relation endpoint to a given nic, or something to that effect i think
<bradm> this is getting the wrong IP from the bootstrap node, so the agents don't even start
<bradm> will have to hack up something with routing, I guess
<natefinch> honestly, using the local provider for anything more than a demo on your laptop is asking for trouble
<bradm> this isn't local provider
<bradm> this is maas
<bradm> hopefully using maas isn't asking for trouble :)
<natefinch> no no :)
<natefinch> my mistake :)
<bradm> the bug is local provider I think, its the closest thing
<bradm> want me to reopen that one?  or a new one?
<bradm> given its a different provider I guess..
<natefinch> there's been a bug in the past where we chose the wrong IP address... I think it's been addressed, but possibly only in 2.0?  I'm not sure.  I assume you're on 1.25?
<bradm> yeah, 1.25.5 for this customer
<bradm> its picking the IP on br1 for the bootstrap node, not on br0
<wallyworld> bradm: you are best to ask the folks in europe (dimiter, andy) in an hour or so
<wallyworld> they are across this much better than me
<bradm> cool, cool.
<bradm> I suspect there's something not quite right with the routing for this network range too, but still..
<natefinch> wallyworld: somehow my binary is not being run... even though I can confirm the md5 matches the one I built on my laptop.  bizarre.
<wallyworld> natefinch: you just copied it across the old one i assume and restarted the jujud service
<natefinch> wallyworld: yep
<wallyworld> obviously something went wrong, maybe upgrade-juju --upload-tools is the right thing after all
<natefinch> gah, I can try
<wallyworld> not sure why you're against it
<wallyworld> i use it a bit in testing, not on openstack though
<natefinch> going through the whole upgrade process... just a lot to go wrong
<wallyworld> as opposed to doing it by hand and having it fail? :-P
<natefinch> the binary is in the right spot and seems to be the one being run :/
<natefinch> so... how does --upload-tools pick the jujud to upload?
<bradm> is there anyway to tell a container to retry provisioning?  juju retry-provisioning doesn't seem to support containers
<wallyworld> bradm: yeah, i don't think you can, i think that was just for cloud instances
<wallyworld> natefinch: the same way as bootstrap --upload-tools does
<bradm> bummer.
<wallyworld> i could be wrong, but that's my recollection
<natefinch> wallyworld: whoever thought that magically picking some random binary to upload was a good idea?  Why not let the user just specify a path? :/
<wallyworld> natefinch: yeah, can't argue there, that was before my time
<natefinch> wallyworld: now I get to wait to see if this works.
<natefinch> wallyworld: so far, not looking good.  juju status is not connecting
<wallyworld> shouldn't take too long
<natefinch> hey, it finished
<natefinch> hey, it finished
<natefinch> hey, it finished
<natefinch> oops, wrong window, sorry
<natefinch> well crap, it used the wrong jujud
<natefinch> I thought it was supposed to use one that exists in the same directory as the juju that is running, but evidently not.
<natefinch> I gotta go to bed... I have to be up in ~4.5 hours
<natefinch> wallyworld: ^
<wallyworld> natefinch: it uses the one in the path, not sure
<wallyworld> tomorrow is another day
<natefinch> wallyworld: ahh, yeah, probably
<axw> wallyworld: tests are still running, but optimistically send PR: https://github.com/juju/juju/pull/5518
<wallyworld> ok
<wallyworld> axw: reading the cover description - maybe i misunderstood before. we still need custom image metadata in state, but structured in the image metadata collection, not a simplestreams json blob
<axw> wallyworld: "Custom image metadata no longer needs to be written to the blob store in Mongo" -- it's no longer in the blob store, but structured metadata is still there
<wallyworld> axw: ah, right, sorry, misread it
<axw> wallyworld: do you have a doc for the rest of what's coming for bootstrap?
 * axw bbs
<wallyworld> axw: posted to #juju
<axw> wallyworld: I'm not there (desktop is off for diagnosis, don't have the auth set up on this laptop), can you email me?
<wallyworld> sure
<mup> Bug #1588186 opened: reboot-executor does not run in jujud tests <juju-core:Triaged> <https://launchpad.net/bugs/1588186>
<axw> wallyworld: is it known that HA is broken atm?
<axw> I see no critical bug blocking master, so I guess not
<frobware> babbageclunk: ping - sync?
<babbageclunk> frobware: oops! Jumping into juju-sapphire now
<babbageclunk> frobware: oh no - saw the right one
<babbageclunk> frobware: in sync now.
<babbageclunk> frobware: is it just named that for the pun?
<dimitern> FYI fwereade just texted me his internet connection is down atm
<voidspace> dimitern: hangouts playing up but omw
<voidspace> gah
<voidspace> not working - freezing firefox
<voidspace> dimitern: frobware: dooferlad: babbageclunk: firefox crashed!
<dooferlad> voidspace: mine died too
<dooferlad> voidspace: and it is chrome
<voidspace> nice
<babbageclunk> voidspace, dooferlad: google hangouts experimenting with "kill packets"
<voidspace> babbageclunk: you jest...
<dooferlad> babbageclunk: :-)
<babbageclunk> voidspace, dooferlad: future iterations expected to have lethal options
<dimitern> LOL
<dimitern> then they'll just send killbots next time
<mwhudson> connection interrupted by ordnance
<anastasiamac> dooferlad: standup today ?
<perrito666> morning all
<perrito666> mongodb is an endless source of joy
<natefinch> is there a JFDI for upgrade-juju?  --reset-previous-upgrade does not seem to actually do anything
<natefinch> wallyworld, fwereade, perrito666? ^
<natefinch> wallyworld: (this is why I don't use upgrade-juju btw - ERROR some agents have not upgraded to the current environment version 1.25.5.2: machine-0-lxc-0, machine-0-lxc-1, machine-0-lxc-2, machine-0-lxc-3, unit-ceph-mon-0, unit-glance-0, unit-landscape-client-3, unit-landscape-client-6, unit-landscape-client-7, unit-ntpmaster-0)
<mup> Bug #1588390 opened: 2.0 beta7: can't bootstrap with vsphere cloud provider - ERROR invalid config: host: expected string, got nothing <oil> <juju-core:New> <https://launchpad.net/bugs/1588390>
<perrito666> I know very little about upgrade-juju
<mup> Bug #1588390 changed: 2.0 beta7: can't bootstrap with vsphere cloud provider - ERROR invalid config: host: expected string, got nothing <oil> <juju-core:New> <https://launchpad.net/bugs/1588390>
<perrito666> natefinch: look in the archive for mails from menno on that aspect
<perrito666> there was a way to kick the chair from under juju to trigger an upgrade
<perrito666> iirc, it required touching mongo
<perrito666> whoever shortened lxc names, thank you
<natefinch> heh
<mup> Bug #1588390 opened: 2.0 beta7: can't bootstrap with vsphere cloud provider - ERROR invalid config: host: expected string, got nothing <oil> <juju-core:New> <https://launchpad.net/bugs/1588390>
<alexisb> perrito666, that was thumper
<alexisb> morning/evening all
<perrito666> alexisb: Ill make sure to thank him
<perrito666> morning alexisb
<mup> Bug #1588403 opened: Tab completion missing in Juju 2.0 betas <juju-core:New> <https://launchpad.net/bugs/1588403>
<natefinch> fwereade: is there a way to force upgrade-juju to work even with units in error states?
<fwereade> natefinch, hmm, not without surgery that I can think of
<fwereade> natefinch, if you can create the upgrade doc I don't think the errors will impede the process
<natefinch> fwereade: so... all I really want to do is swap out the binary for one particular machine... but somehow when I do so, it is not running the new binary... I think I must be doing something very dumb.
<fwereade> natefinch, ah, right: ok, exactly what did you swap out? tools/<version>?
<natefinch> fwereade: yes
<fwereade> natefinch, what's in the tools/<agent> dir?
<natefinch> fwereade: hang on, sorry... a previous failed juju-upgrade is putting me in a bad state
<fwereade> natefinch, (it certainly *used* to just be a symlink to the one in version, but I have a feeling there was some churn there at some point)
<fwereade> natefinch, you can make the tools think they're a different version by putting some file next to them, if that helps? you can use that to get everything reporting a consistent version if you have to
<natefinch> fwereade: ahh.. it is not a symlink
<fwereade> natefinch, yay, vague instinct saves the day ;p
<natefinch> lol, another bug I fixed elsewhere getting hit on this machine.  it's gonna be one of those days
<alexisb> natefinch, is this dpb machine?
<natefinch> alexisb: yes, it's not  a big deal... there were multiple machines experiencing the bug so I just switched to a different one
<natefinch> er experiencing the bug I'm debugging, not the one I already fixed
<alexisb> dimitern, ping
<mgz> can I get a stamp on reviews.vapour.ws/r/4968 for service-to-application
<mup> Bug #1588446 opened: MAAS bridge script emits "RTNETLINK: file exists" during bootstrap many times <bootstrap> <maas> <network> <juju-core:New for frobware> <https://launchpad.net/bugs/1588446>
<mgz> ocr, or anyone, poke ^
<perrito666> mgz: shipit
<mgz> thank you horachan
<redir> who knows about user management?
<natefinch> cmars: FYI I am aware of the PRs against npipe... will get to them when I have some time.  Let me know if they become critical for me to look at.
<cmars> natefinch, thanks. i've been hacking away at this with gsamfira. going to try to confirm my latest fix with jujud tests, will let you know
<natefinch> cmars: glad you're getting help from Gabriel, that windows code is nasty business, and he knows his stuff.
<natefinch> cmars: feel free to ping me when you think it's all stable
<cmars> sinzui, what kind of instance would run the windows CI test, run-unit-tests-win2012-amd64 ?
<cmars> how many cores would it have?
<sinzui> cmars: it is a c4.xlarge
<alexisb> cmars, you are a rockstar thank you!
<cmars> sinzui, thanks! gsamfira, looks like 4 cores running windows tests
<gsamfira> test will probably pass if you set GOMAXPROCS=1...but that is not a fix.
<mup> Bug #1274755 changed: simplestreams test metadata only lists tools for arm and amd64 <arm64> <hs-arm64> <simplestreams> <tech-debt> <juju-core:Won't Fix> <https://launchpad.net/bugs/1274755>
<mup> Bug #1517092 changed: [xenial] libgo panic doing a bootstrap on ARM64 <2.0-count> <arm64> <bootstrap> <xenial> <juju-core:Fix Released> <https://launchpad.net/bugs/1517092>
<wallyworld> cmars: sorry, that may have been me who removed the model uuid param :-(
<cmars> wallyworld, no worries
<cmars> I really ought to have CI tests to catch this stuff
<mup> Bug #1588542 opened: Juju2 is slow with MAAS2, log shows errors, works anyway <juju-core:New> <https://launchpad.net/bugs/1588542>
<alexisb> thumper, ping
<alexisb> axw, wallyworld ping
<wallyworld> alexisb: otp with rick
<alexisb> wallyworld, ack
<axw> alexisb: pong
<alexisb> heya axw, I need someone to hash through manual provider planning with me
<alexisb> do you have time to chat?
<axw> alexisb: sure
<alexisb> https://hangouts.google.com/hangouts/_/canonical.com/core-leads-call
<alexisb> wallyworld, feel free to join us when you are free
<perrito666> wallyworld: wanna call off the 1:1? ill be there for standup anyway
<wallyworld> perrito666: sorry :-( just finished with rick, now this one, i'll catch you after standup
<perrito666> wallyworld: dont worry I forgive you (did you see that restore landed fixing 2 bugs in its way?)
<wallyworld> perrito666: yay, ty
<menn0> wallyworld: the charms upload HTTP endpoint seems to require a series argument... this doesn't work for multi-series charms
<menn0> wallyworld: should this be relaxed?
<wallyworld> menn0: yeah
<menn0> wallyworld: I'm trying to use the endpoint to upload charms during migration
<wallyworld> good catch
<menn0> but because I'm testing with the ubuntu charm it's failing because the charm URL series is blank
<wallyworld> series in url should now be optional
<menn0> ok cool
<wallyworld> anastasiamac: perrito666: be right there
<mup> Bug #1588559 opened: juju agree should tell the user what to do next <juju-core:New for cmars> <https://launchpad.net/bugs/1588559>
<wallyworld> cmars:
<wallyworld> # github.com/juju/juju/cmd/juju/application
<wallyworld> cmd/juju/application/deploy_test.go:931: unknown metricRegistrationPost field 'ApplicationName' in struct literal
<wallyworld> cmd/juju/application/deploy_test.go:978: unknown metricRegistrationPost field 'ApplicationName' in struct literal
#juju-dev 2016-06-03
<perrito666> I am saying yay a lot on meetings its a sign of either madness or that I am becoming a teenager schoolgirl
<wallyworld> redir: chat now if you want
<redir> k
<wallyworld> standup ho
<redir> same bat place?
<redir> k
 * thumper wades through more migration tests
<menn0> wallyworld, axw: With mongo3.2 I can't use the mongo shell to connect to Juju's DB any more
<menn0> wallyworld, axw: I suspect it's that the SCRAM-SHA-1 auth mechanism is used by default now
<menn0> but the 2.6 client doesn't support it
<menn0> do you guys know a way around it
<wallyworld> menn0: right, i have organised for mongo client to be packaged with our juju mongo db
<wallyworld> this is not done yet :-(
<menn0> wallyworld: ok, so at least it's a known issue
<wallyworld> balloons: ^^^^^ how far away is our mongo packaging fix?
<menn0> wallyworld: will it be in the juju-mongo3.2-tools package?
<wallyworld> yes
<wallyworld> so it is available on any controller
<perrito666> menn0: install mongodb-org ppa and then use mongodb-org-clients
<menn0> perrito666: awesome - thanks
<perrito666> menn0: I might have misplaced some of those dashes
<menn0> perrito666: i'll figure it out?
<perrito666> but the names of the packages/ppas are those
<davechen1y> thumper: this time I think the race build did complete totally
<davechen1y> and I think I can fix the build hang in the termination worker
<anastasiamac> davechen1y: while fixing races, did u come across any in uniter and hooks?
<anastasiamac> davechen1y: i have bugs against 1.25 and m wondering if i need to start from scratch or backport..
<davechen1y> anastasiamac: hard to say, do you have a bug reference
<davechen1y> at this point, i'd think backporting from 2.0 is near impossible
<davechen1y> the codebases have diverged so much
<anastasiamac> davechen1y: the one i was going to start with is bug 1486712
<mup> Bug #1486712: Race on uniter-hook-execution, prevents to resolve unit. <canonical-bootstack> <hooks> <race-condition> <sts> <sts-needs-review> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1486712>
<davechen1y> anastasiamac: this is a logical race, not a data race
<davechen1y> i guess
<davechen1y> (as there is no output from the race detector)
<davechen1y> in fact, i've no idea why the OP said race in the title
<mup> Bug #1585015 changed: worker/terminationworker: test timeout during CI <tech-debt> <unit-tests> <juju-core:Won't Fix> <https://launchpad.net/bugs/1585015>
<mup> Bug #1585846 changed: worker/terminationworker: test timeout after 20 minutes <timeout> <unit-tests> <juju-core:Won't Fix> <https://launchpad.net/bugs/1585846>
<mup> Bug #1588572 opened: cmd supercommand.go:448 cannot read tools metadata in tools directory: open /var/lib/juju/tools/2.0-beta6-xenial-amd64/downloaded-tools.txt: no such file or directory <juju-core:New> <https://launchpad.net/bugs/1588572>
<davechen1y> anastasiamac: this doesn't look like a data race, just a straight out bug from not handling an error path and the uniter getting out of sync with what the state thinks it is
<davechen1y> ie, dying when the action will not be retried
<axw> wallyworld: is it safe to land my instancecfg change yet, or is the beta8 cutover still happening?
<davechen1y> or maybe it's retrying in a loop
<davechen1y> but not making any progress
<anastasiamac> davechen1y: k
<wallyworld> axw: safe to land, they are taking a previous revision
<axw> goodo
<wallyworld> axw: haven't looked yet, kust finished talking to people, but here's the acl wip https://github.com/juju/juju/pull/5526/files
<axw> wallyworld: yeah I just added a comment
<davechen1y> menn0: thumper https://github.com/juju/juju/pull/5528
<davechen1y> fix for terminationworker hang
 * thumper looks
<menn0> davechen1y: you already have 2 ship its and thumper hasn't even finished yet :)
<davechen1y> ta
<davechen1y> landing it
<mup> Bug #1585424 changed: all: data races in tests are surpressed <race-condition> <tech-debt> <juju-core:Fix Released> <https://launchpad.net/bugs/1585424>
<mup> Bug #1588574 opened: Session already closed in state/presence <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1588574>
<mup> Bug #1588575 opened: allwatcher_internal_test has intermittent failure <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1588575>
<menn0> wallyworld: would you be open to the charm upload HTTP endpoint taking an optional "schema" argument
<menn0> wallyworld: for migrations I need to preserve the original schema
<wallyworld> menn0: sure, i think
<menn0> when charmstore charms are migrated they end up with the wrong charm URL
<menn0> (local: instead of cs:)
<menn0> wallyworld: the only concern I can think of is some sort of nefarious behaviour were someone installs a dodgy charm as if it were a charmstore charm
<menn0> s/installed/uploads
<menn0> gah
<menn0> s/installs/uploads/
<wallyworld> menn0: can't think at the moment, otp with tim
<menn0> wallyworld: ok np
<axw> sinzui: what happened to the DNS records for vapour.ws?
<davechen1y> hey people who think we're using mongodb 3.2
<davechen1y> we're not using mongodb 3.2
<davechen1y> or even mongodb 2.6
<davechen1y> we're using  Get:20 http://archive.ubuntu.com/ubuntu/ trusty/universe juju-mongodb amd64 2.4.9-0ubuntu3 [6,936 kB]
<davechen1y> does this come as a surprise to anyone ?
<axw> davechen1y: I think we only use 3.2 on xenial? not sure tho, need to double check
<axw> anyone got a cached IP for juju-ci.vapour.ws?
<davechen1y> axw: this is the landing bot
<axw> davechen1y: ah right, in the tests.
<davechen1y> % host juju-ci.vapour.ws
<davechen1y> juju-ci.vapour.ws has address 54.86.116.234
<axw> davechen1y: thanks
<davechen1y> .ws does appear to be busted at the moment
<cmars> wallyworld, on it
<thumper> menn0: a very dull PR https://github.com/juju/juju/pull/5529
<wallyworld> ty
<menn0> thumper: looking
 * thumper heads out to get the dog some exercise
<thumper> and to leave the house for the day
<thumper> bbl
<menn0> thumper:  ship it
<natefinch> wallyworld: ok, time to actually figure out this bug, now that I have some data that's not just "something went wrong"
<wallyworld> natefinch: awesome, otp will catch up soon
<natefinch> chrome/linux/somebody is getting mad that I'm copying & pasting 3.8megs to pastebin
 * natefinch remembers that pastebinit exists
<davechen1y> nuts, my dns entry for .ws aged out
<davechen1y> thumper: https://bugs.launchpad.net/juju-core/+bug/1588574
<mup> Bug #1588574: Session already closed in state/presence <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1588574>
<davechen1y> i cannot reproduce this locally
<davechen1y> stress tested for two hours
<davechen1y> can you ?
<davechen1y> ironically it just hit me in CI http://juju-ci.vapour.ws:8080/job/github-merge-juju/7972/console
<axw> wallyworld: am I reviewing https://github.com/juju/juju/pull/5519 or https://github.com/juju/juju/pull/5525?
<wallyworld> axw: the one merging into the feature branch
<wallyworld> cmars: i don't see how the current PR fixes the deploy_test compile error?
<cmars> wallyworld, did i forget to push? checking..
<cmars> wallyworld, it fixes it because I fucked up the field name in changeset number 2. see http://reviews.vapour.ws/r/4971/diff/2-4/
<cmars> so from 0-4 you'll see no net difference but the landing bot failed on #2 i think
<wallyworld> cmars: ah right, ok. so we retain the field name but serialise as "service-name"
<cmars> wallyworld, yeah, wireformat needs more time to change
<wallyworld> cmars: np, lgtm
<wallyworld> ty
<cmars> wallyworld, thanks!
<thumper> davechen1y: I don't think I have hit it locally for quite some time
<thumper> not sure where the race
<thumper> is
<davechen1y> i'm pushing some debug into mgo.v2 to record the stack of _who_ closed the session that we got handed
<davechen1y> it's going to be hard to test this if I cannot repro it locally
<davechen1y> fwiw, the presence tests are unstable AF, but just not this failure ...
<thumper> davechen1y: yeah, presence is kinda crap
<thumper> menn0: speaking of mgo, what's the status of the idea of having mgo tell us which assert failed ?
<axw> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1586089  -- I've thought about the azure instance IDs a bit, I think we should do the same thing that thumper has done for LXD/MAAS
<mup> Bug #1586089: azure arm instance-ids are not ids, cannot find machine in azure <azure-provider> <ci> <ha> <jujuqa> <testing> <juju-core:Triaged> <https://launchpad.net/bugs/1586089>
<axw> wallyworld: atm they're just "machine-0" (e.g.) because instance IDs never clash across models, there's the invisible resource group qualifying them
<wallyworld> that could be a good solution
<axw> wallyworld: but then you need to know which resource group to look in, so having the suffix of the model UUID would make it easier to identify
<wallyworld> yep, and it's no worse than other providers
<axw> wallyworld: another thing that needs to be done before beta9 I'm afraid
<wallyworld> indeed
<davechen1y> 1588574
<davechen1y>  logger.Infof("watcher loop failed: %v", err)
<davechen1y> seriously ...
<natefinch> davechen1y: getting notified of which assert failed would probably have saved me days worth of work on my current bug
<davechen1y> errors logged at info
<menn0> thumper: I haven't raised the idea with niemeyer
<menn0> thumper: it would be a big change b/c the current API returns a static ErrAborted
<menn0> thumper: all users of mgo would have to change
<thumper> ugh
<thumper> :(
<wallyworld> natefinch: how is the bug going?
<natefinch> wallyworld: examining the transaction asserts to try to figure out why they're failing.  Luckily I have some success cases and some failure cases to compare
<wallyworld> ok
<natefinch> wallyworld: btw, the problem was that I was replacing the machine agent's binary, but it was actually logging an error received from an API call to the controller.. so it was the controller that I really needed to replace the binary on.  It just wasn't clear from the error message in the machine agent log that it was an API call. But once I thought about it, I realized it had to be the controller, since it was code in state.  I was too hung up on
<natefinch> the assumption that I needed to patch the code where the error was being logged.
 * thumper afk to collect kids from school
<wallyworld> natefinch: makes sense. easy in hindsight
<davechen1y>         defer func() {
<davechen1y>                 // If the session is killed from underneath us, it panics when we
<wallyworld> natefinch: also, you'll see you got a bug to fix some fallout from the admin to controller name change
<davechen1y>                 // try to copy it, so deal with that here.
<davechen1y>                 if v := recover(); v != nil {
<davechen1y>                         err = fmt.Errorf("%v", v)
<davechen1y>                 }
<davechen1y>         }()
<davechen1y> what the F
<natefinch> wallyworld: good times :/
<natefinch> davechen1y: where's that code?
<davechen1y> state/presence
<wallyworld> axw: should we change the model global key to "m" (currently "e"). this will be our last chance
<natefinch> davechen1y: if only someone had reviewed that code: https://github.com/juju/juju/pull/361#discussion-diff-15266925
<davechen1y> touche
<axw> wallyworld: I think m is taken for machines?
<natefinch> I think it's natural to hate past you.  Past me screws me all the time.
<davechen1y> I think this is before michael foord discovered just how bad an idea this was
<davechen1y> natefinch: https://github.com/juju/juju/pull/5530
<davechen1y> give that we don't apply the same logic everywhere else we call Copy
<wallyworld> axw: yeah, m#123 is, but not "m" on its own
<axw> wallyworld: I think having m be wildly different from m#123 is a recipe for confusion
<wallyworld> ok
<wallyworld> my OCD kicks in when I see "e"
<axw> wallyworld: "e" is for "em" ? :)
<wallyworld> lol
<thumper> haha
<thumper> perhaps we should rename machine to instance
<thumper> then we wouldn't have two m's
<wallyworld> thumper: i do wish we could solve this. but too hard i think
<wallyworld> "n" for node maybe
<thumper> model for model
<wallyworld> global keys can only be one letter i think
<natefinch> I was gonna say, is there a byte shortage?
<thumper> bollocks
<wallyworld> in the past i've run into issues
<thumper> global keys are made up from multiple ids
<wallyworld> the prefix bit
<thumper> r#foo#bar#baz
<thumper> uses split on #
<wallyworld> right, the r# bit
<wallyworld> the r is only one letter
<thumper> they just have to be unique in the places it is used
<thumper> no, can be anything
<wallyworld> there's code that assumes it's one letter
<thumper> it is short by convention
<thumper> where?
<wallyworld> i'll try and find it
<wallyworld> i've tried to change it before and it's caused a bug
<natefinch> rule #1 of code club is: don't encode data in your ids
<wallyworld> thumper: backingEntityIdForGlobalKey
<wallyworld> that would apply to model though in this case
<wallyworld> wouldn't
<wallyworld> so we could make it "model"
<wallyworld> there might be other places too
<thumper> JFDI and see what  breaks
<wallyworld> but the above is one i can find quickly
<wallyworld> yeah, might do just that
<davechen1y> menn0: thumper http://reviews.vapour.ws/r/4977/
<menn0> davechen1y: lookin
<davechen1y> ta
<davechen1y> why are there three watcher implementations in state
<davechen1y> we have state/
<davechen1y> state/multiwatcher
<davechen1y> and state/presence
<davechen1y> each with their own watcher model
<davechen1y> thinggy
<menn0> davechen1y: review done... not LGTM I'm afraid
<menn0> davechen1y: the Watcher in state/presence is a completely different thing to the watchers elsewhere
<menn0> it's unfortunate that the same name was used
<natefinch> wallyworld: oh, I wanted to talk about the lxc to lxd stuff
<wallyworld> ok
<menn0> davechen1y: state/watcher implements the low level infrastructure that accepts requests for things to watch, tracks the txn log, and reports when things that are being watched have changed
<natefinch> wallyworld: if this is the right doc: https://docs.google.com/document/d/1SXBJJ_HHDX4_WvNGVGtLhNajXaioOF_Pn8-EvP1WvaU/edit
<natefinch> wallyworld: then I'm pretty unclear on which things are tasks Im supposed to do
<wallyworld> --to lxd
<wallyworld> lxd in bundles
<wallyworld> are the first 2 main things
<menn0> davechen1y: the stuff in state/watcher.go and state/allwatcher.go and state/multiwatcher all rely on what's in state/watcher
<natefinch> wallyworld: this says to remove --to lxc.. but I think you want it to stick around, warn, but then convert to a --to lxd, correct?
<wallyworld> well, we could remove that one. the thing we need to warn about is in bundles
<davechen1y> menn0: what a facepalm of tightly coupled code
<natefinch> wallyworld: ok
<wallyworld> since we don't want to break existing bundles
<natefinch> wallyworld: right
<davechen1y> menn0: thanks for the review
<wallyworld> natefinch: also stuff like lxd-default-mtu name change
<wallyworld> instead of lxc-default-mtu
<natefinch> Adding "lxd-default-mtu" to mirror "lxc-default-mtu" and pass it through the API stack
<davechen1y> menn0: i don't agree with your feedback
<natefinch> wallyworld: I have no idea what those words mean ^
<davechen1y> the session.Copy pattern is everywhere in the code
<davechen1y> including multiple times in the presence package
<wallyworld> natefinch: the default mtu is a network setting - "maximum transmission unit"; it defines packet size
<natefinch> wallyworld: ok, I guess I can grep for it easily enough
<davechen1y> i think this code was an incomplete solution to a problem that michael foord fixed later
<wallyworld> natefinch: the doc says to mirror the setting but i don't see why we can't replace it
<wallyworld> i'll check with john
<natefinch> wallyworld: yeah, just renaming it seems fine
<wallyworld> natefinch: does this mean the other bug is fixed?
<natefinch> wallyworld: no, but I wanted to talk to you before tomorrow, so when I get it fixed, I know what to do afterward.
<wallyworld> yay, ty
<wallyworld> bet you'll be happy to see the back end of that bug
<natefinch> wallyworld: yeah, I looked at the lxd stuff today and realized I was not clear about what I should be doing. Now I am. Thank you.
<wallyworld> np
<natefinch> wallyworld: oh man... this thing was such a cluster just getting it going... like 3 different dumb mistakes stacked on top of one another.  And added on top of needing to deploy openstack each time through another person... not good times.  We lost like an hour or so today when their maas somehow imploded and lost all networking
<natefinch> (which is one of the times when I had a chance to go looking through the code and the lxd stuff)
<natefinch> if anyone wants to stare at some mgo transactions and try to figure out why they're failing, there's a nice fat 4meg log here: http://paste.ubuntu.com/16938088/
<wallyworld> joy
<natefinch> some of the logging is unfortunately duplicated as I was adding in fmt.printlns to try to ensure I wasn't getting screwed by f'd up logging.
<natefinch> so far I'm not seeing a pattern... the assert that fails is asserting that the current address either doesn't exist or is empty (which probably means it doesn't exist, and the precondition check is returning a zero value for the struct).  Which sounds like it could be bad, except that the exact same kind of assert passes elsewhere
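The ambiguity natefinch describes — an assert that the address "either doesn't exist or is empty", with the precondition check returning a zero value for the struct — is a classic Go zero-value trap. A minimal stdlib-only sketch (hypothetical names; the real juju document and assert logic differ):

```go
package main

import "fmt"

// Address is a stand-in for the document the assert inspects
// (hypothetical; the real juju struct is different).
type Address struct {
	Value string
}

// lookup mimics a precondition check: for a missing document it
// returns the zero value, so without the ok flag a caller cannot
// tell a missing doc from one whose field is genuinely empty.
func lookup(docs map[string]Address, id string) (Address, bool) {
	a, ok := docs[id]
	return a, ok
}

func main() {
	docs := map[string]Address{"present-empty": {}}

	missing, okMissing := lookup(docs, "absent")
	empty, okEmpty := lookup(docs, "present-empty")

	// The two values compare equal even though only one doc exists;
	// only the ok flags distinguish the cases.
	fmt.Println(missing == empty, okMissing, okEmpty)
}
```

This is why "the exact same kind of assert passes elsewhere" is not reassuring: the assert can't observe which of the two cases it is in unless the existence flag is carried alongside the value.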
<natefinch> wallyworld: whelp, I gotta sleep.  Will pick it up in the morning, now that we've figured out all the problems with getting debug builds up, I can sprinkle in some more logging, see if I can figure out what's going wrong.
<wallyworld> ok, ty
<menn0> davechen1y: I've replied
<davechen1y> menn0: thanks, it's also frustrating that I cannot reproduce this on my machine
<davechen1y> protip -- killall mongod while the tests are running; not a big deal
<davechen1y> nothing panics
<davechen1y> nothing stops
<davechen1y> menn0: session.Copy is nuts
<davechen1y> all it does is return a copy of the session, pointing to the same backing socket and stuff
<davechen1y> but with its _own_ mutex
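The `session.Copy` behaviour davechen1y is describing — same backing socket state, fresh mutex per handle — can be sketched with a rough stdlib-only analogy (all names here are hypothetical stand-ins, not the actual mgo implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// pool stands in for the shared socket/server state behind a session.
type pool struct{ conns int }

// Session is a hypothetical analogue of an mgo session handle.
type Session struct {
	mu      sync.Mutex // each handle owns its *own* mutex...
	backing *pool      // ...but shares the backing pool
}

// Copy mirrors the pattern from the log: return a new handle over
// the same backing state, so concurrent users don't serialize on
// a single lock.
func (s *Session) Copy() *Session {
	s.mu.Lock()
	defer s.mu.Unlock()
	return &Session{backing: s.backing}
}

func main() {
	root := &Session{backing: &pool{conns: 2}}
	c := root.Copy()
	fmt.Println(c.backing == root.backing) // shared backing pool
	fmt.Println(&c.mu != &root.mu)         // independent mutex
}
```

The upshot, as in the log: copying is cheap, and it's why the per-request `Copy`/`Close` pattern shows up everywhere in the codebase, including the presence package.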
<davechen1y> menn0: axw thumper https://bugs.launchpad.net/juju-core/+bug/1588614
<mup> Bug #1588614: mongodb dying does not cause tests to fail <juju-core:Confirmed> <https://launchpad.net/bugs/1588614>
<davechen1y> can you please try to confirm this for me
<davechen1y> it's a 10 second test
<axw> davechen1y: dies straight away for me
<axw> davechen1y: the tests exit straight away I mean
<davechen1y> WOW, this is absolutely the words
<davechen1y> https://github.com/go-mgo/mgo/blob/v2/log.go#L76
<davechen1y> worst
<davechen1y> if the race detector is running, add some extra locking to make it happy
<davechen1y> otherwise -- FUCKING!
<davechen1y> FUCKIT
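What davechen1y is objecting to in that mgo `log.go` line is code that only takes a lock when built under the race detector. A hedged one-file sketch of the pattern (in real code the constant is set by build tags, `//go:build race` vs `!race`, across two files; it's hard-wired here so the example is self-contained):

```go
package main

import (
	"fmt"
	"sync"
)

// raceDetector would normally be set per build tag; hard-wired
// false here for a single-file sketch.
const raceDetector = false

var (
	logMu  sync.Mutex
	logger interface{} // stand-in for a package-level logger
)

// setLogger only locks under the race detector -- exactly the
// pattern criticized in the log: normal builds still perform an
// unsynchronized write, so the race is hidden, not fixed.
func setLogger(l interface{}) {
	if raceDetector {
		logMu.Lock()
		defer logMu.Unlock()
	}
	logger = l
}

func main() {
	setLogger("some logger")
	fmt.Println(logger)
}
```

The correct fix is to lock unconditionally (an uncontended mutex is cheap); conditional locking merely silences the detector while leaving the data race in production builds.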
<davechen1y> axw: can you paste the log
<davechen1y> seriously, it's been running happily for a few minutes with no bo
<davechen1y> no db
<davechen1y> ah, no, my mistake
<axw> davechen1y: http://paste.ubuntu.com/16940975/
<davechen1y> i don't get any of that output
<davechen1y> my behaviour is entirely different, the tests sit trying to find a working cluster member for minutes
<davechen1y> then give up
<axw> davechen1y: you're on master right?
<davechen1y> yes
<axw> wallyworld: we may need to remove the ability to bootstrap without first defining a cloud (e.g. bootstrap manual/<IP>), if we're requiring each model to have a cloud name
<axw> or otherwise come up with a way of generating cloud names
<axw> it's probably less confusing just to require that they be added first tho
<mup> Bug #1588636 opened: mgo: Panic: Test left sockets in a dirty state (PC=0x46257C) <juju-core:New> <https://launchpad.net/bugs/1588636>
<wallyworld> hmmm, that was a nice short cut for people, but we could remove it
<dimitern> frobware: joining now the HO for top of the hour
<dimitern> frobware: i'm in
<dimitern> jamespage: hey there
<frobware> dimitern: otrp
<dimitern> jamespage: is the ntp charm maintained by the os-charmers?
<dimitern> frobware: ok, np
<dimitern> jamespage: it's the only one not reporting 'Unit is ready' status; see - just 4 lines short of a perfect output: http://paste.ubuntu.com/16942268/ :)
<babbageclunk> frobware, dimitern: could you take a look at http://reviews.vapour.ws/r/4967
<babbageclunk> I was hoping to get Menno to look at it but I think he's been too busy.
<dimitern> babbageclunk: sure, will take a look
<babbageclunk> dimitern: thanks!
<axw> wallyworld: charlotte just reminded me that it's a public holiday here on monday
<axw> wallyworld: nearly got the guts of my branch done, still tests to write though
<perrito666> axw: still here?
<perrito666> axw: I left you a priv message
<dimitern> babbageclunk: reviewed
<jamespage> dimitern, not at the moment
<jamespage> dimitern, we do have todo's to ensure that all charms that are part of openstack do status
<dimitern> jamespage: ok
<dimitern> jamespage: side-note - any plans for neutron to support bindings for the external network as well as the internal (data)?
<dimitern> jamespage: it looks like the tenant net (and its underlying vlan in maas) also wants to use dhcp and insists on claiming a full /20 (i.e. router's int port uses the .1 address, which is also used by maas..)
<frobware> dooferlad: could you try my patch as-is right now with a bonded interface (LACP)?
<frobware> dooferlad: http://reviews.vapour.ws/r/4969/
<frobware> dooferlad, dimitern: we're screwed
<frobware> dooferlad, dimitern: I just got caught out again by /etc/network/interfaces.d/eth0.cfg
<frobware> dooferlad, dimitern: guess where my route over eth0 came from... http://pastebin.ubuntu.com/16944207/
 * frobware sulks and finds lunch...
<mup> Bug #1588784 opened: juju/state: intermittent test failure in SubnetSuite with mongod 3.2 <mongodb> <unit-tests> <juju-core:New> <https://launchpad.net/bugs/1588784>
<babbageclunk> fwereade: Here are all the details for that bug, with the standalone reproduction attached.
<fwereade> babbageclunk, cool
<fwereade> babbageclunk, nice bug report <3
<frobware> dooferlad, dimitern: the only way I can get stuff to work atm is... https://github.com/frobware/juju/tree/master-lp1588446-hack-hack-hackity-hack
<dimitern> frobware: looking
<frobware> dimitern: I'll save you the effort: https://github.com/frobware/juju/blob/master-lp1588446-hack-hack-hackity-hack/provider/maas/add-juju-bridge.py#L454
<dimitern> frobware: aw, that..
<dimitern> fwereade: why also remove /e/n/i.d/* ?
<dimitern> oops
<dimitern> frobware: ^^
<frobware> dimitern: because eth0 has a running dhclient
<frobware> dimitern: from eth0.cfg
<frobware> dimitern: so we end up with a route out via eth0 and the nascent br-bond0
<dimitern> frobware: but, if we replace /e/n/i completely like in my patch?
<frobware> dimitern: show me your wares....
<dimitern> frobware: https://github.com/juju/juju/pull/5512/files#diff-59aa120b6b815c64439ef2971190a313R61
<frobware> dimitern: this is a joke.
<frobware> dimitern: we need to stop SPINNING endlessly on this.
<dimitern> frobware: nope, that's for containers, sorry..
<frobware> dimitern: I was confused...
<dimitern> frobware: well, my patch doesn't change where we generate /e/n/i with bridges
<dimitern> frobware: but looking at one of the nodes, I can see http://paste.ubuntu.com/16944715/
<frobware> dimitern: the problem is we have a  dhclient (unexpected item in bagging area) running against eth0
<dimitern> frobware: and my patch does rm 50-cloud-init link from /e/systemd/network/
<frobware> dimitern: does it kill the dhclient for that iface?
<frobware> dimitern: after bridging bond0 (parent eth0, eth1) I have http://pastebin.ubuntu.com/16944207/
<dimitern> frobware: well, it's not running now on that same node
<frobware> dimitern: everything is fine apart from line 2
<dimitern> frobware: but haven't checked right after boot
<dimitern> frobware: I suspect that link causes sysd to spawn dhclient, not the one in /e/n/i.d/
<frobware> dimitern: if your 50-cloud-init.cfg has 'eth0: dhcp' you're screwed too
<dimitern> frobware: I need to try with a bond to see how is it
<dimitern> frobware: paste the -before-add-juju-bridge /e/n/i ?
<frobware> dimitern: http://pastebin.ubuntu.com/16944804/
<frobware> dimitern: line 45 is the killer
<babbageclunk> fwereade: thx :)
<dimitern> frobware: yeah..
<frobware> dimitern: we're dooomeed!
<dimitern> frobware: is this xenial or trusty?
<frobware> dimitern: trusty. i.e. customer centric. :)
<dimitern> frobware: no, why? we can just filter out any source stanzas
<frobware> dimitern: too late
<frobware> dimitern: there's already a running dhclient for that eth0
<dimitern> frobware: didn't one of dooferlad's fixes for 1.25 make sure dhclient is stopped or something?
<frobware> dimitern: hence my hack which downs all interfaces, rm's the crap, ifup's then bridges dynamically
<dimitern> frobware: let me try my fix with trusty
<frobware> dimitern: the problem is we have two configs for eth0: static and DHCP
<frobware> dimitern: my exemplary setup is a bond0 on eth0 and eth1
<dimitern> frobware: yeah, I get that, but it's also different for upstart and systemd
<frobware> dimitern: ok
 * frobware really goes for lunch. really really really
<tasdomas> what determines if a deployed unit gets an ipv4 or ipv6 address?
<dimitern> tasdomas: on what the provider tells us first
<tasdomas> dimitern - running juju built from master and it seems pretty random to me (also, pretty annoying, since some charms don't handle ipv6 addresses)
<dimitern> tasdomas: see bug 1576674 for relevant info
<mup> Bug #1576674: 2.0 beta6: only able to access LXD containers (on maas deployed host) from the maas network <lxd> <maas-provider> <oil> <ssh> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1576674>
<dimitern> tasdomas: oops, sorry, not that one..
<dimitern> tasdomas: this - bug 1574844
<mup> Bug #1574844: juju2 gives ipv6 address for one lxd, rabbit doesn't appreciate it. <conjure> <juju-release-support> <landscape> <lxd-provider> <juju-core:Won't Fix> <rabbitmq-server (Juju Charms Collection):Fix Committed by james-page> <https://launchpad.net/bugs/1574844>
<dimitern> tasdomas: fwiw, those charms should be fixed to accept both types of addresses
<dimitern> tasdomas: there are ways to work around this, depending on the provider
<tasdomas> dimitern, lxd?
<dimitern> tasdomas: i.e. disabling ipv6
<tasdomas> dimitern, agree that charms should handle this, but random assignment is not a good sign anyway
<tasdomas> dimitern, ah, yeah
<dimitern> tasdomas: yeah - easy, sudo dpkg-reconfigure lxd and No any IPv6 related questions
<dimitern> s/No any/answer No to any/
<tasdomas> dimitern, thanks
<dimitern> frobware: good news :)
<dimitern> frobware: smoser says curtin>379 disables the 50-cloud-init generation (currently in xenial-proposed)
<dimitern> testing now
<frobware> dimitern: how/where/what?
<dimitern> frobware: see on #server@c
<frobware> dimitern: though I'm happy to come back to good news. :)
<frobware> dimitern: I just disconnected - did I miss anything? last was good news, testing now.
<dimitern> frobware: sent you a pm with the log
<frobware> dimitern: ok, so waiting on r390
<dimitern> frobware: AIUI only for precise?
<frobware> dimitern: true, but I was testing that and noticed it failed there too
<dimitern> frobware: I've seen it fail for precise, but not trusty before (well, at least not every other deployment)
<frobware> dimitern: too much shifting sand. This is _exactly_ why we need better automation around this. And "this" is only bootstrap.
<mup> Bug # changed: 1519095, 1573294, 1582731, 1588135, 1588137
<mup> Bug #1568895 changed: Cannot add MAAS-based LXD containers in 2.0beta4 on trusty <ci> <jujuqa> <lxd> <maas-provider> <cloud-images:Confirmed> <juju-core:Invalid> <MAAS:Invalid> <https://launchpad.net/bugs/1568895>
<mup> Bug #1584979 changed: better container networking <juju-core:New> <https://launchpad.net/bugs/1584979>
<frobware> dimitern: another case for dynamic bridges - https://bugs.launchpad.net/juju-core/+bug/1585847
<mup> Bug #1585847: LXD creates all the interfaces as the physical machine has when using MAAS. <lxc> <maas-provider> <network> <spaces> <juju-core:Triaged> <https://launchpad.net/bugs/1585847>
<dimitern> frobware: yeah, it was never intended to stay that way :)
<dimitern> frobware: nothing present in /e/n/i.d/ on xenial with latest curtin btw
<frobware> \o/
<frobware> dimitern: are you using daily images?
<dimitern> frobware: using https://images.maas.io/ephemeral-v2/releases/
<frobware> dimitern: ok
<mup> Bug #1588897 opened: Unable to kill or destroy the lxd controller <juju-core:New> <https://launchpad.net/bugs/1588897>
<mup> Bug #1588898 opened: Unable to kill or destroy the lxd controller <juju-core:New> <https://launchpad.net/bugs/1588898>
<mup> Bug #1588911 opened: Juju does not support 2.0-beta9 <blocker> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1588911>
<mup> Bug # changed: 892552, 1287949, 1505504, 1513165, 1514874, 1522544, 1540900, 1557254, 1568122, 1568854, 1568944, 1569361, 1571053, 1572741, 1573136, 1576120, 1576528,
<mup> 1576750, 1577415, 1577609, 1579148, 1580417, 1580418, 1580946, 1580964, 1581074, 1581885, 1581886, 1582620, 1585582, 1585851, 1586217, 1586880, 1586891
<natefinch> mgz_, sinzui, perrito666: https://github.com/juju/version/pull/3
<natefinch> ..and now back to debugging this stupid mongo bug
<natefinch> perrito666: do you have a little time?  I could use some help
<natefinch> gah, he's idle
<mgz_> natefinch: looks good to me, I'd expect tag should be alpha only (have we ever defined clearly?), we just need numbers for the buildid
<natefinch> mgz_: yeah, I don't know. I expect that someone misunderstood \w to mean just alphabet characters, but *shrug*
<mgz_> natefinch: my one worry of the current change is it's futhering that initial confusion maybe
<mgz_> and just getting rid of \w is the correct intention
<natefinch> mgz_: I'm happy to make it alphabeticals and underscore only... or even only a-zA-Z (or even only a-z)
<natefinch> mgz_: not sure we really ever want WoOt_ as a tag
<mgz_> as far as CI is concerned, all we've ever needed is a-z
<natefinch> mgz_: I'm happy with that.  I'll make it [a-z]+
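The rule natefinch and mgz_ settle on is expressible as a single anchored pattern. A small sketch of that check (the `validTag` helper name is hypothetical, not the actual juju/version code):

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPat encodes the agreed rule: lowercase ASCII letters only.
// The original confusion came from \w, which also admits digits
// and underscore.
var tagPat = regexp.MustCompile(`^[a-z]+$`)

func validTag(s string) bool { return tagPat.MatchString(s) }

func main() {
	fmt.Println(validTag("alpha")) // accepted
	fmt.Println(validTag("WoOt_")) // rejected: \w would have matched the underscore
	fmt.Println(validTag("beta9")) // rejected: digits belong in the build number
}
```

Anchoring with `^` and `$` matters: without them the pattern would accept any string *containing* a run of lowercase letters.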
<mgz_> natefinch: sounds good, lets post to -dev or something as well saying what changed so people have a chance to yell
<natefinch> mgz_: good idea
<mgz_> natefinch: stamped the pr
<mbruzek> Can someone advise me how to clean up my controller when I am not able to with juju commands?
<mbruzek> Context:  https://bugs.launchpad.net/juju-core/+bug/1588898
<mup> Bug #1588898: Unable to kill or destroy the lxd controller <juju-core:New> <https://launchpad.net/bugs/1588898>
<mbruzek> I just want to clean up so juju does not think I have a controller and I can continue to do testing/development.
<mbruzek> I knew which files to clean up with juju 1.x but 2.x is slightly different.
<natefinch> mgz_, sinzui: that reminds me, we need to set up CI for github.com/juju/version
<natefinch> mgz_, sinzui: https://github.com/juju/juju/pull/5533
<natefinch> I gotta step out for a bit... whoever wants that to land is welcome to $$merge$$ it once LGTM'd
<mgz_> natefinch: okay, I'll merge
<redir> anyone here understand authorization in apiserver bits?
<cmars> redir, like the admin facade? i worked on that a long time ago
<cmars> things have probably changed some
<redir> like adding and removing users cmars
<cmars> redir, less familiar with that
<redir> :) Me too
<cmars> redir, i think thumper would know, but he's eow unf
<redir> yeah
<mup> Bug #1588970 opened: go test .\cmd\jujud\agent hangs on Windows (firewaller) <juju-core:New> <https://launchpad.net/bugs/1588970>
#juju-dev 2016-06-04
<mup> Bug #1306315 changed: juju-mongodb process hangs around after package removal <cloud-installer> <destroy-environment> <local-provider> <juju-core:Expired> <juju-mongodb (Ubuntu):Invalid> <https://launchpad.net/bugs/1306315>
<mup> Bug #1528703 changed: juju unable to deploy juju-gui to lxc containers on servers with multiple networks <maas-spaces> <juju-core:Expired> <https://launchpad.net/bugs/1528703>
<mup> Bug #1563918 changed: controller bootstrap process boots node twice <cdo-qa> <juju-core:Expired> <https://launchpad.net/bugs/1563918>
<mup> Bug #1306315 opened: juju-mongodb process hangs around after package removal <cloud-installer> <destroy-environment> <local-provider> <juju-core:Expired> <juju-mongodb (Ubuntu):Invalid> <https://launchpad.net/bugs/1306315>
<mup> Bug #1528703 opened: juju unable to deploy juju-gui to lxc containers on servers with multiple networks <maas-spaces> <juju-core:Expired> <https://launchpad.net/bugs/1528703>
<mup> Bug #1563918 opened: controller bootstrap process boots node twice <cdo-qa> <juju-core:Expired> <https://launchpad.net/bugs/1563918>
<mup> Bug #1306315 changed: juju-mongodb process hangs around after package removal <cloud-installer> <destroy-environment> <local-provider> <juju-core:Expired> <juju-mongodb (Ubuntu):Invalid> <https://launchpad.net/bugs/1306315>
<mup> Bug #1528703 changed: juju unable to deploy juju-gui to lxc containers on servers with multiple networks <maas-spaces> <juju-core:Expired> <https://launchpad.net/bugs/1528703>
<mup> Bug #1563918 changed: controller bootstrap process boots node twice <cdo-qa> <juju-core:Expired> <https://launchpad.net/bugs/1563918>
<mup> Bug #1589061 opened: Juju status with no controllers offers up juju switch <juju-core:New> <https://launchpad.net/bugs/1589061>
<mup> Bug #1589066 opened: Alias a 'foos' to list-foos <juju-core:New> <https://launchpad.net/bugs/1589066>
<mup> Bug #1589066 changed: Alias a 'foos' to list-foos <juju-core:New> <https://launchpad.net/bugs/1589066>
<mup> Bug #1589066 opened: Alias a 'foos' to list-foos <juju-core:New> <https://launchpad.net/bugs/1589066>
#juju-dev 2016-06-05
<mup> Bug #1317109 changed: unable to override login password <add-user-story> <cloud-installer> <juju-core:Expired> <juju-gui:Won't Fix> <https://launchpad.net/bugs/1317109>
<mup> Bug #1588898 changed: Unable to kill or destroy the lxd controller <juju-core:New> <https://launchpad.net/bugs/1588898>
<davecheney> those silly kiwis, won't they ever learn
<davecheney> the queens real birthday is not on the 5th
<davecheney> 6th
<anastasiamac> davecheney: is it public holiday in nz for queen's bday?
<davecheney> yeah
<anastasiamac> i find it so hard to believe that they even celebrate it.. I mean u no longer swear ur fealty to the crown when accepting nz citizenship..
<anastasiamac> unlike real colonies like australia, where u still swear to the queen :D
#juju-dev 2017-05-29
<veebers> thumper: having "http-proxy:..." set in config.yaml passed at bootstrap doesn't do the job :-\
<thumper> veebers: may well be a bug
<axw> wallyworld: you cut out?
<wallyworld> axw: i'm here
<axw> maybe it's me...
 * babbageclunk goes for a run
<wallyworld> babbageclunk: i've now moved the Consume() API back; there was a fair bit of rework for the mocks, but it means more unit tests rather than full stack tests
<babbageclunk> wallyworld: nice - I'll take another look
<jam> morning all
<anastasiamac> jam: o/
<veebers> thumper: who would be good to ask about my proxy issue? It seems it can see enough to query the stream to see there is an agent, but when it comes to the curl it fails there
<veebers> babbageclunk: I seem to recall that you did some proxy work in the past? If true do you have a couple of moments?
<babbageclunk> veebers: yeah, sure
<babbageclunk> veebers: want to hangout?
<veebers> babbageclunk: sweet, firing up one now
<babbageclunk> veebers: where/
<babbageclunk> ?
<veebers> babbageclunk: heh sorry had pm-ed url. http://hangouts.google.com/hangouts/_/canonical.com/veeb-clunk
<wpk> rogpeppe: https://github.com/juju/juju/pull/7414
<thumper> wpk: not sure rogpeppe is online to review
<thumper> but looks ok to me
<rogpeppe> wpk: i'll take a look tomorrow morning
#juju-dev 2017-05-30
<wallyworld_> axw: hey, with your race fix PR - would it be simpler just to have each test start/stop workers explicitly rather than mess with setup/teardown
<axw> wallyworld_: TBH I don't think we really need to duplicate the tests for each permutation. I'd rather fix the dependencies in the long term, and do the easy fix for now
<wallyworld_> ok, maybe add a TODO
<axw> wallyworld_: done
<wallyworld_> ta
<axw> wallyworld_: did I miss anything in standup? I've got PRs up for GCE and OpenStack to use volume attachment AZs in StartInstance now, will move on to PrecheckInstance if there's nothing release-related I should be looking at
<wallyworld_> axw: we're waiting on a new CI test run with resource limits fixed so we can see where we stand. So right now, nothing that warrants attention (but that might change) We want to release Thursday. I saw the PRs, will look soon
<axw> wallyworld_: okey dokey
 * axw screams into a pillow
<axw> ci failures driving me crazy
<veebers> axw: which failure is giving you grief?
<axw> veebers: mostly grant and windows
<axw> veebers: and mostly mongo stuff on windows
<axw> if not only
<veebers> ah right :-\
<veebers> axw: I might get a chance today to look at why grant is so flaky, see how it might be fixed
<axw> veebers: that would be wonderful, thanks. I know you're busy though
<axw> I intend to see what we can do to either cut mongo out of windows tests, or see if they can be made more robust
<axw> but features
<wallyworld_> babbageclunk: have you looked at assess_log_forward.py ?
<wallyworld_> that Ci test should have deets on how to set stuff up
<babbageclunk> wallyworld_: ooh, no - I'm trying the charm at the moment to see how that does it. Looking at the test now.
<babbageclunk> wallyworld_: thanks
<rogpeppe> wpk: ping
<wpk> pong
<rogpeppe> wpk: i just took a look at https://github.com/juju/juju/pull/7414
<rogpeppe> wpk: i think that returning the original error is fine, but there should definitely not be a warning there
<wpk> rogpeppe: Ian merged it already, I wanted for you to take a look at it as that's your change
<wpk> Why no warning?
<rogpeppe> wpk: because that situation happens all the time and it's normal
<rogpeppe> wpk: we don't want people to see warnings all the time - it makes them worried :)
<wpk> rogpeppe: the warning is issued only if both methods fail
<wpk> rogpeppe: (and the connection fails)
<rogpeppe> wpk: hmm, i guess it's not that common to get a cert error then another error. i still don't think it's worth a warning though.
<rogpeppe> wpk: i think i'd put it at info level
<wpk> rogpeppe: but the connection fails
<rogpeppe> wpk: not necessarily
<rogpeppe> wpk: there can be many concurrent dial instances
<rogpeppe> wpk: and if it's the only reason it fails, we'll display the returned error anyway
<wpk> but the returned error is half of the story
<rogpeppe> wpk: yeah, but i can't see that mattering much unless you're trying to debug, and in that case you can see Info or Debug level messages easily
<rogpeppe> wpk: i'm wary of unnecessary warnings
<rogpeppe> wpk: oh yes, and also line 705 already logs the returned error, so we'll be logging the same error twice
<rogpeppe> wpk: i'd suggest changing line 746 to: logger.Debugf("failed to connect to websocket with public cert after private cert failed: %v", rootCAErr)
<rogpeppe> wpk: i'm particularly wary of extra warning messages because they're a common source of bug reports
<wpk> rogpeppe logger.Debugf("Failed dialing websocket using fallback public CA - %q", rootCAErr
<wpk> rogpeppe: looks OK?
<rogpeppe> wpk: usually we prefix errors with a colon
<rogpeppe> wpk: and i think "failed to dial" reads a bit better than "failed dialing"
<rogpeppe> wpk: i'm never sure whether log messages should start with a capital letter or not...
<wpk> logger.Debugf("Failed to dial websocket using fallback public CA: %v", rootCAErr)
<rogpeppe> wpk: SGTM, but i'd probably use lower case "failed" as 90% of debug messages start with lower case
<rogpeppe> wpk: thanks
<wpk> rogpeppe: https://github.com/juju/juju/pull/7416
<rogpeppe> wpk: LGTM
<SimonKLB> is it possible to set up the controller and a worker on the same machine using manual provisioning?
<SimonKLB> im just setting up a poc on a single machine right now, but id like to have the possibility to scale out to two machines later on
<SimonKLB> using the localhost controller will bind me to a single machine right?
<axw> SimonKLB: you can deploy applications to the controller machine. just bootstrap, then switch to the controller model ("juju switch controller"), and then deploy the app with "--to 0" to place the application unit on machine 0
<SimonKLB> axw: ah of course :) thanks
<SimonKLB> there seem to be some problems installing a snap-based charm in lxd, the mount unit fails
<SimonKLB> is this a known limitation?
<SimonKLB> May 30 09:07:02 juju-222327-0-lxd-0 mount[10844]: fusermount: mount failed: Operation not permitted
<rogpeppe> axw: if you're still around, i need a second review of https://github.com/juju/juju/pull/7407
<rogpeppe> jam: ^
<jam> SimonKLB: snaps and lxd have some known caveats regardless of juju and charms
<jam> SimonKLB: I don't remember exactly, it may be that you can install user-space mounting tools and get it to work
<jam> SimonKLB: https://stgraber.org/2016/12/07/running-snaps-in-lxd-containers/
<jam> SimonKLB: seems you need 'squashfuse' ?
<SimonKLB> jam: https://bugs.launchpad.net/snappy/+bug/1611078
<mup> Bug #1611078: Support snaps inside of lxd containers <landscape> <lxd> <nova-lxd> <verification-failed-xenial> <Snappy:Fix Released by stgraber> <apparmor (Ubuntu):Fix Released by tyhicks> <linux (Ubuntu):Fix Released by jjohansen> <lxd (Ubuntu):Fix Released by stgraber> <apparmor (Ubuntu
<mup> Xenial):Fix Released by tyhicks> <linux (Ubuntu Xenial):Fix Released by jjohansen> <lxd (Ubuntu Xenial):Fix Committed> <apparmor (Ubuntu Yakkety):Fix Released
<mup> by tyhicks> <linux (Ubuntu Yakkety):Fix Released by jjohansen> <lxd (Ubuntu Yakkety):Fix Released by stgraber> <https://launchpad.net/bugs/1611078>
<SimonKLB> it got fixed but then it hit a new (?) issue
<SimonKLB> this guy seem to have hit the exact same thing as i have https://bugs.launchpad.net/snappy/+bug/1611078/comments/29
<jam> SimonKLB: is that after installing squashfuse?
<SimonKLB> jam: nope :) that fixed it
<SimonKLB> probably need to add that to the snap layer
<jam> SimonKLB: you only need squashfuse inside lxd
<jam> so snaps on VM/baremetal don't need it
<SimonKLB> jam: right, so what would be the correct way to fix it? determine where the charm is deployed and install squashfuse if it's in lxd?
<jam> SimonKLB: its an unfortunate leaky abstraction, and I don't have a great answer for it. charms generally shouldn't know where they are installed, but maybe occasionally they have to
<SimonKLB> jam: agreed
<jam> *juju* shouldn't care that a charm uses snaps, etc
<jam> SimonKLB: if anything, I would tend to say that 'snapd' should know its in a container and depend on squashfuse
<SimonKLB> jam: yea, could this perhaps be fixed using profiles?
<jam> SimonKLB: or maybe, since the default ubuntu images for lxd come with snapd installed, they should have squashfuse installed as well
<SimonKLB> jam: i think that would be the most straight forward fix
<SimonKLB> since snap is pretty useless in lxd, the lxd image should have squashfuse installed
<jam> (caveat that there is a push to have 1 'pristine' Ubuntu build, that then gets used for MAAS/LXD, etc)
<SimonKLB> pretty useless without squashfuse that is
<jam> SimonKLB: yeah, I agree that snaps in lxd without squashfuse just don't work
<jam> SimonKLB: I think it is "known", I'm not sure who/where they should be working on it.
<SimonKLB> jam: any idea where to propose this addition to the lxd images?
<SimonKLB> ah
<jam> SimonKLB: so, file a bug, tell me about it, and I'll raise it to some people
<SimonKLB> is there a repo or something like that for the images?
<SimonKLB> or where should it be filed
<jam> SimonKLB: if anything I would raise it against Snappy in Ubuntu
<SimonKLB> okok!
<jam> https://bugs.launchpad.net/ubuntu/+source/snapd
<SimonKLB> snappy or snapd? :D
<jam> they can always add other projects to it
<SimonKLB> right
<jam> well 'snapcraft'/ building snaps doesn't need it, I think
<jam> I don't know snappy vs snapd very well
<jam> snapcraft being the user tools, and something being the snap store, etc.
<SimonKLB> me neither :) ill report it in snapd and someone that knows better can re-label it perhaps
<SimonKLB> jam: https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1694411
<mup> Bug #1694411: Add squashfuse to the Ubuntu LXD images <snapd (Ubuntu):New> <https://launchpad.net/bugs/1694411>
<jam> SimonKLB: thanks, will add some people
<SimonKLB> great!
<SimonKLB> got everything installed for a basic kubernetes install on manual provisioned lxd containers except etcd
<SimonKLB> ERROR cannot add application "etcd": cannot deploy to machine 0/lxd/6: adding storage to lxd container not supported
<SimonKLB> :(
<SimonKLB> is this a WIP?
<SimonKLB> turns out this only applies when you use the --to flag when deploying a charm
<SimonKLB> https://github.com/juju/juju/blob/635d98d8cb34e0124f1baf1aa3baec2e28511a64/state/unit.go#L1496
<jam> SimonKLB: axw is in Perth, Australia who's been working on storage, but AIUI we don't support custom storage on LXD provider yet, I'm not sure whether the bundle is trying to supply something custom that we don't support, or whether we just have a bug where we're trying storage when it isn't requested
<SimonKLB> jam: looking at that line i linked to it seems that the storage is only validated when you deploy using the --to flag to do a specific machine assignment
<SimonKLB> wasn't a problem deploying etcd without --to
<SimonKLB> should probably validate it a bit less strictly though, since persistent storage isnt a requirement of etcd, just an option
<rogpeppe> jam: ping
<rogpeppe> jam: i've replied to your concerns voiced in https://github.com/juju/juju/pull/7407 - i'd be interested to have a chat about it if you have a moment at some point
<jam> rogpeppe: hey, I'm mostly at the end of my day, can we do something more when you come online tomorrow? Should be middle of my day
<rogpeppe> jam: i'd really love to be able to get the next PR up for review today - it's been too long already
<rogpeppe> jam: but i can't propose it until this one lands
<rogpeppe> jam: but if your day's finished, then not much we can do i guess
<jam> rogpeppe: so i think we can talk more about whether we want a dns cache, but that doesn't have to block this code landing
<jam> I don't think you've made anything worse, as long as the next PR wouldn't be unhinged if we removed it all
<rogpeppe> jam: ok, cool. if you could LGTM it with that caveat, that would be great, thanks
<jam> done
<jam> rogpeppe: found an interesting performance bug
<wpk> yay, another dumb Windows-only unit test error fixed!
<jam> rogpeppe: txn.Resume ends up being quadratic on the number of txns to resume at least when they are all on the same doc
<rogpeppe> jam: ha, i'm not too surprised
<jam> interestingly, cpuprofile says we spend 5.5s/10s in token.id() which parses the token string into a ObjectId
<jam> and those numbers are also quadratic
<jam> for 10 entries, we make 316 calls, for 400 entries we make 402,601 calls
<jam> for 800 entries, 1,605,201 calls
<jam> that's a lot of short-lived strings and hex parsing
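A quick sanity check on the numbers jam reported: doubling the queue from 400 to 800 entries roughly quadruples the token.id() call count, which is the signature of O(n²) behaviour. This sketch just does the arithmetic on the figures quoted above; nothing here is from the mgo/txn code itself.

```go
package main

import "fmt"

func main() {
	// Reported token.id() call counts for n queued txns on one doc.
	calls := map[int]int{10: 316, 400: 402601, 800: 1605201}

	// If calls grow as ~c*n^2, doubling n should ~quadruple the count.
	ratio := float64(calls[800]) / float64(calls[400])
	fmt.Printf("800/400 call ratio: %.2f\n", ratio) // ≈ 3.99
}
```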
<wpk> Q: if you'd see DialOpts{Total: 0, RetryDelay: 0, Min: 0} - what's your intuition about how many times could it be retried ?
<jam> wpk: either we'd say "that's a zero value, so replace it with the default", or 1 attempt
<wpk> jam: that's passed to retry.Regular used in retry.StartWithCancel
<wpk> (directly)
<wpk> on Linux that means try once, we failed, oops - Total exceeded, return
<wpk> on Windows we can try to connect 3 or even 4 times during those 0 seconds.
<jam> wpk: often on windows time resolution is actually like 15ms
<jam> its a very old 'clock()' resolution thing
<jam> I believe you have to explicitly request high-perf clocks on Windows and many don't
<wpk> jam: I'll just add a workaround
<jam> either that or something like if you call it it *globally* changes the clock down to 1ms resolution (affects other programs), something like that
<wpk> if delay==0 && total == 0 -> delay = 1
<jam> wpk: delay = 1 or total = 1?
<wpk> delay = 1
<wpk> after first try it'll notice that now+delay > total and return
<rogpeppe> wpk: that's a really good question
<rogpeppe> wpk: i think the current behaviour is wrong
<wpk> rogpeppe: and what would be the correct behaviour?
<wpk> rogpeppe: I think that 'try once' is OK in this case
<wpk> At least it's intuitive for me
<rogpeppe> wpk: i'm not sure. we'd need to decide how long to try for.
<rogpeppe> wpk: does a zero deadline mean an infinite deadline?
<wpk> for me a zero means no deadline in this case
<wpk> 'try once'.
<rogpeppe> wpk: so zero deadline means "no deadline" and a zero retry-delay means "no retries" ?
<wpk> zero deadline means 'try once', zero retry-delay means 'no delay between retries'
<wpk> total 1, retry-delay 0 means try as many times as possible in 1 second
<wpk> that's consistent with gopkg.in/retry nomenclature
<wpk> https://github.com/juju/juju/pull/7417
<wpk> This fixes TestDialAPIMultipleError for me
<rogpeppe> wpk: i'd like to make the deadline a proper hard deadline on the whole dial - at the moment it's not
<rogpeppe> wpk: are you aware that the delay timing is in nanoseconds? a delay of 1 really won't make it wait much longer... :)
<wpk> rogpeppe: are you sure?
<rogpeppe> wpk: sure about what?
<wpk> rogpeppe: that it's in nanoseconds?
<wpk> rogpeppe: because it works
<wpk> oh, and it makes a difference
<rogpeppe> wpk: see https://golang.org/pkg/time/#Dur
<wpk> because N + 1 is still > N
<rogpeppe> wpk: see https://golang.org/pkg/time/#Duration
<wpk> so it's enough
<rogpeppe> wpk: maybe the retry code is wrong. i'm not sure it should retry if the timeout is zero
<wpk> it's start=now, end=now+Total, start.add(delay); if start.after(end) (a sharp comparison) return
<wpk> that'd work should that be 'not_before'
<wpk> oh, that's your package :)
<wpk> if !end.after(start) should work here
<rogpeppe> wpk: agreed
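The boundary condition being debated can be made deterministic by simulating the loop on a frozen logical clock, so the Linux/Windows clock-resolution difference disappears. This is an illustrative sketch only: `attempts` and the safety bound stand in for the real retry package, they are not its code.

```go
package main

import (
	"fmt"
	"time"
)

// attempts simulates the retry loop's exit check. With strict=true it
// uses the old check (stop only when next > deadline); with
// strict=false it uses the proposed fix (stop when next >= deadline).
func attempts(total, delay time.Duration, strict bool) int {
	start := time.Unix(0, 0)
	end := start.Add(total)
	next := start
	for n := 1; ; n++ {
		next = next.Add(delay)
		if strict {
			if next.After(end) { // old: sharp comparison
				return n
			}
		} else {
			if !end.After(next) { // fix: next >= end
				return n
			}
		}
		if n >= 1000 { // safety bound standing in for "retries as fast as it can"
			return n
		}
	}
}

func main() {
	fmt.Println(attempts(0, 0, true))  // old check never exits on a frozen clock: 1000
	fmt.Println(attempts(0, 0, false)) // fixed check tries exactly once: 1
	fmt.Println(attempts(0, 1, true))  // the delay=1ns workaround also stops: 1
}
```

On a real Linux clock the strict check terminates only because `now` advances between iterations; on Windows's coarser clock several iterations fit inside one tick, which is exactly the 3-4 retries wpk observed.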
<stokachu> rogpeppe: does https://github.com/juju/juju/commit/874fbd53dd898c325edc36ec37d0518f03bfd987 fix that issue i was seeing with the certificate error trying to connect to jaas?
<rogpeppe> stokachu: not quite. the next PR to follow it will though.
<rogpeppe> stokachu: sorry, it's taken a while to get it reviewed
<stokachu> Cool np!
<thumper> mornign
<thumper> ^T
<babbageclunk> wallyworld_: ping?
<babbageclunk> morning thumper
<thumper> babbageclunk: morning
<babbageclunk> thumper: I finally got log forwarding working - certificates are fiddly and opaque, although that's probably just unfamiliarity.
 * thumper nods
<thumper> coolio
<wpk> 23:01 <@thumper> mornign
<thumper> wpk: hey
<thumper> wpk: why are you still up?
<babbageclunk> thumper: I still need to fix something about the last-seen tracking, but otherwise it seems to be working.
<thumper> babbageclunk: sweet
<babbageclunk> wpk: timezones, ha
<wpk> thumper: bugs won't fix themselves ;)
<wpk> babbageclunk: I had a proposal that we all should just abandon timezones and switch to UTC
<wpk> babbageclunk: that'd make scheduling much easier
<babbageclunk> wpk: good call - gets rid of the blight that is daylight savings too.
#juju-dev 2017-05-31
<blahdeblah> thumper: You around?  Looking for some advice on how to tackle a problem.
<blahdeblah> I've got a production 1.25.10 environment in a rather bad way, showing the traditional symptoms of 1587644, plus the broken status updates of 1666396, but with no missing txn-revnos.
<blahdeblah> debug-log is full of https://pastebin.canonical.com/189580/
<blahdeblah> restarting jujud-machine-0, juju-db, and rsyslog has had no effect; it's just constant leadership-tracker spam per above
<blahdeblah> ^ Or anyone else, for that matter :-)
<babbageclunk> blahdeblah: can you see any message about why the leadership-tracker isn't running?
<babbageclunk> Or has that fallen off the bottom of the log?
<blahdeblah> Let me run the debug-log for a while and grep out the noise
<blahdeblah> I got one of these: machine-0: 2017-05-31 00:22:31 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: cannot find transaction ObjectIdHex("5912fa035290d208a719a8cc")
<blahdeblah> Is that a smoking gun for an mgopurge?
<babbageclunk> blahdeblah: I think so (although I'm not an expert).
<blahdeblah> Seems like it from https://github.com/juju/juju/wiki/Incomplete-Transactions-and-MgoPurge
<blahdeblah> I'll try that
<thumper> blahdeblah: yeah
<thumper> blahdeblah: oh, seems like you've sorted it
<blahdeblah> thumper: so mgopurge is the right step?
<thumper> blahdeblah: yes
<blahdeblah> cool - thanks, babbageclunk & thumper
<blahdeblah> Hmmm. This is a rather persistently missing transaction.  Same one keeps repeating in the debug log.
<blahdeblah> mgopurge output: https://pastebin.canonical.com/189582/
<blahdeblah> Anything look awry there? ^
<blahdeblah> thumper: ^ When you have a sec
<blahdeblah> ^ Full mgopurge (a.o.t. just pruning per https://github.com/juju/juju/wiki/MgoPurgeTool#pruning) did the trick
<thumper> cool
<babbageclunk> wallyworld_ or thumper: take a look at https://github.com/juju/juju/pull/7418 please?
<wallyworld_> righto
<babbageclunk> ta
<wallyworld_> babbageclunk: a small test quibble
<babbageclunk> wallyworld_: cool, thanks
<axw> wallyworld_: I thought you were talking about http://qa.jujucharms.com/releases/5314/job/run-unit-tests-race/attempt/2811 (obtained 3, expected 4) in the standup ... but now I see a card for TestAgentConnectionsShutDownWhenAPIServerDies. I've fixed the former, did you want me to look at the latter?
<babbageclunk> wallyworld_: oh yeah, good point
<wallyworld_> axw: that card is from a day or 2 ago;  i think the test run is in the details of the card. but yeah, any race failure is good to fix
<axw> okey dokey
<natefinch> can't even spell my own damn name
<mup> Bug #1494661 changed: Rotated logs are not compressed <canonical-bootstack> <uosci> <juju:In Progress by jsing> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1494661>
<natefinch> @thumper you around?
<meetingology> natefinch: Error: "thumper" is not a valid command.
<thumper> natefinch: yep
<thumper> damn bots
<natefinch> @thumper has this code been reviewed by someone on Juju?
<meetingology> natefinch: Error: "thumper" is not a valid command.
<thumper> natefinch: you need to stop using @
<natefinch> right sorry, slack
<natefinch> creeping in my brain
<natefinch> heh
<thumper> I *could*, or perhaps axw might be better suited
<thumper> slack, pah
<natefinch> haha
<thumper> natefinch: if you like, I could quickly test as well...
<natefinch> mostly wanted to know if I was rubber stamping or if I need to actually be really thorough on this
<thumper> natefinch: is it verification you wanted?
<natefinch> sounds like the latter
<thumper> I did look through it when it was initially proposed
<thumper> and it looked fine to me
<natefinch> *nod*
<thumper> but I didn't feel like I should comment on the PR
<natefinch> it's open source, you can do whatever the hell you like ;)
<natefinch> what I'd really like is if most of the new logic were in separate functions that could be unit tested
<natefinch> part of that is my own fault for not doing the same when I was writing it
 * babbageclunk goes for a run
<thumper> axw: is there a bug for the status history pruner failures?
<axw> thumper: not that I know of
 * axw searches
<axw> thumper: no not that I can see. why do you ask?
<thumper> axw: where did you see that it periodically fails?
<thumper> just thinking there should be a bug
<axw> thumper: http://qa.jujucharms.com/releases/5314/job/run-unit-tests-race/attempt/2811
<thumper> ah ha
<thumper> ok
<natefinch> thumper:  I'll approve with the mode/chown code added.
<axw> wallyworld_: coming to tech board?
<wallyworld_> axw: i have to rush off to the football, i was looking for a second set of eyes. live testing seems to work, but unit test fails - the err is nil rather than not found. the txn ops to remove the constraints are getting queued in the slice, so it's just something dumb i've missed. https://github.com/juju/juju/pull/7421
<blahdeblah> wallyworld_: Wave for the camera - I'll be watching for you. :-)
<wallyworld_> axw: also, i just noticed deploying the second time results in a storage attached hook error, but i can't dig in now, i'll have to look later tonight when i get back
<wallyworld_> will do :-) hope we win
<wallyworld_> axw: maybe send an email if you see something and i'll pick up from there. ttyl
<axw> wallyworld_: ok np. enjoy
<rogpeppe> jam: hiya
<wpk> rogpeppe: https://github.com/go-retry/retry/pull/1
<jam> hi rogpeppe, currently digging into some hairy code, I'm guessing you'd like to chat about DNS Cache stuff?
<rogpeppe> jam: yeah
<rogpeppe> wpk: reviewed
<wpk> rogpeppe: updated
<rogpeppe> wpk: thanks!
<rogpeppe> wpk: i'm not sure you've pushed your changes
<wpk> I didn't, sec
<wpk> ready
<jam> rogpeppe: so, DNSCache stuff. a few thoughts. To start with, realize that my understanding of the MAAS issue is not code that I worked on, but stuff that I heard around the area so my memory may be inaccurate.
<jam> rogpeppe: to start with, one major caveat is that code was much more about caching DNS *misses* than about hits
<jam> which is not what you're focused on
<jam> I believe the problem worked out as
<jam> 1) We only ever connected to a single 'address' when doing things like 'juju ssh'
<rogpeppe> jam: yeah, that's my issue too - i'm not entirely sure what the issues were, and i think we've probably lost all the code reviews from then :-\
<jam> 2) We thought 'hey, hostnames, that should be better', so we started preferring hostnames to IP addresses
<jam> 3) Then MAAS started giving us hostnames that we can't resolve because they aren't in most users laptops
<jam> and we didn't want to continually try to look up hostnames that weren't resolvable, and we *definitely* didn't want to use them as the preferred address for 'ssh' if we were only going to try 1
<jam> we now do attempt multiple targets
<jam> I'm not sure why we would want to internally add yet another DNS cache
<jam> (IIRC Linux defaults to using a local DNS cache anyway)
<jam> if we're doing something internally where we're connecting repeatedly and DNS lookups are a significant problem, I'm not opposed to them
<jam> but as always 'cache invalidation' is one of the core problems in programming
<jam> so avoiding cache when you don't actually need it is often a good plan
<blahdeblah> jam: +1000
<jam> and balloons would be the person to talk to about the archived review site, I'm fairly sure the content was not deleted, just the site taken down
<jam> as maintaining those machines (keeping security updates, etc) has a nonzero cost to us
<rogpeppe> jam: yeah, i'm aware of that, but it really is a useful resource
<rogpeppe> jam: hmm, so the ssh issue is one i hadn't thought about
<jam> rogpeppe: your DNSCache isn't caching negative results, so it doesn't really touch that problem, but AIUI that is why we had the "unvalidated addresses"
<jam> unresolved
<rogpeppe> jam: so my problem with the unresolved addresses is that it makes it sound like the other ones are resolved, but they're not. the two fields sit in uneasy tension - their responsibilities are unclear
<rogpeppe> jam: the direction i'm trying to head is that one field has the addresses as returned by the controller, and that other fields provide meta-info that records stuff related to the addresses (e.g. their resolved IP address or whether we could resolve the address)
<rogpeppe> jam: the meta fields don't impact on correctness and can always be deleted without a problem (except potentially some extra connection time)
<rogpeppe> jam: so... you think that there's not really a problem with slow DNS lookups?
<rogpeppe> jam: if so, why did the original code bother to record the resolved IP addresses at all? it could just have moved addresses that resolve OK to the front of the list.
<jam> rogpeppe: I think it was a case of "when we can find IP addresses prefer them, because for everything that isn't JAAS they're actually more 'real'" and available everywhere
<jam> regardless of my personal configuration, etc.
<jam> I think JAAS throws a wrench into that
<jam>  that comes after that code was landed
<rogpeppe> jam: but IP addresses can change
<rogpeppe> jam: and i still don't really understand. "when we can find IP addresses"... that's the responsibility of DNS, right? why are we doing it ourselves?
<rogpeppe> jam: that is, why is it a good idea to store the resolved addresses in controllers.yaml?
<jam> so for things like shared (old) environments.yaml files, what DNS servers you could see was often disjoint from what IP addresses you could see
<jam> so if I had one machine that *was* configured to see MAAS, putting the IP addresses in there meant that I could share it with another machine that *couldn't* see MAAS's DNS
<jam> but could route to MAAS
<rogpeppe> jam: ha, so the MAAS DNS addresses were only resolvable locally, but the resolved IP addresses worked globally?
<jam> rogpeppe: so MAAS runs its own DNS server that tracks all of the machines that it is managing
<jam> you can certainly have a *route* to the MAAS network
<rogpeppe> jam: if that's the case, that's a reasonable argument for maintaining a DNS cache
<jam> without changing your local DNS to point to MAAS's bind
<jam> (it's not bind, but whatever it is)
<wpk> (it is bind ;) )
<rogpeppe> jam: so if that's the case, how is the code much more about caching DNS misses?
<jam> wpk: I thought it was dnsmasq or something like that
<jam> maybe I'm thinking the DHCP one changed, not DNS
<jam> I know they rewrote one of the backends
<jam> well, switched backends
<wpk> jam: btw, when you're done with this you could take a look at https://github.com/juju/juju/pull/7383 ?
<jam> wpk: looks like I started, but just didn't finish, will refresh
<axw> jam: sorry, didn't see you had reviewed ian's branch already... I don't understand why his change would fix anything. maybe you can answer my questions?
<jam> axw: I'm going off the comments sections around where he had touched, but did not try to completely validate the logic myself. It sounded like one of those cases where a TXN can't chain its actions
<jam> (op2 doesn't see the result of op1, IIRC)
<jam> at least in terms of all-assertions are triggered before all ops
<jam> 'are checked'
<jam> it sounded like if there were multiple ways that we might decref the reference counters during teardown, it wouldn't always go to 0.
<jam> though there is a "$inc", -1 (or if there is a $dec); those operations shouldn't be trying to check the value and set it to one less than it currently is
<jam> axw: what *I* got out of it, was that if you always did the finalization, then you actually end up with 2 finalization calls sometimes and the second would fail
<jam> so instead he changed it to be "always call it at the end, but avoid calling it early"
<jam> axw: at least, that was my understanding and why it 'seemed like it would be ok', but I'll admit to not really digging deep into everything.
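The "always call it at the end, but avoid calling it early" rule jam describes is essentially a guard against double finalization: two code paths both reaching the finalizer makes the second call fail. A generic sketch of the pattern (illustrative only, not the actual state-package code):

```go
package main

import (
	"errors"
	"fmt"
)

// resource sketches a refcounted doc whose finalizer must run exactly
// once; a second call mimics the double-finalization failure.
type resource struct {
	finalized bool
}

func (r *resource) finalize() error {
	if r.finalized {
		return errors.New("already finalized")
	}
	r.finalized = true
	return nil
}

func main() {
	r := &resource{}
	fmt.Println(r.finalize()) // <nil>
	fmt.Println(r.finalize()) // already finalized
}
```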
<axw> jam: ok. I'm not 100% sure, but I didn't think it would do that because there's no asserts on the ops
<jam> axw: so double finalize sounds like it could fail
<jam> cause the doc you're removing doesn't exist
<axw> jam: Remove will succeed even if the doc doesn't exist, unless you assert txn.DocExists (just tested by duplicating the ops). pretty sure the issue is that "isFinal" is not triggering, but I don't know why
<axw> I started extracting a "transaction builder" for a limited set of State, but the yak hair was growing faster than I could cut it
<rogpeppe> axw: lol
<rogpeppe> wpk: i just merged your retry PR, thanks!
<wpk> rogpeppe: great, I'll update juju PR with just new dependencies
<wpk> rogpeppe: ok, juju PR updated
<rogpeppe> wpk: thanks
<rogpeppe> jam: (sorry, was busy trying to debug juju-run issue...)
<rogpeppe> jam: so, what's the upshot of our discussion?
<rogpeppe> jam: if we still care about copying controllers.yaml files and retaining previously resolved IP addresses, then ISTM that we'll still need some sort of DNS cache
<rogpeppe> jam: but i'm not quite sure whether we need to record DNS failures too
<rogpeppe> jam: currently i can't quite see that it's necessary.
<axw_> rogpeppe: just saw a test failure in CI for TestWithUnresolvableAddrAfterCacheFallback (http://juju-ci.vapour.ws:8080/job/github-merge-juju/11036/artifact/artifacts/xenial.log/*view*/)
<axw_> I'm logging off shortly, will look tomorrow if you don't get to it
<rogpeppe> axw_: thanks for the heads up
<rogpeppe> axw_: i'll take a look
<axw_> cheers
<jam> rogpeppe: well the current way to share controllers is things like 'register' and we're looking to have some other way to share with yourself
<jam> cause we don't *want* to copy controllers.yaml around manually
<rogpeppe> jam: ok, so perhaps we can lose all the DNS caching stuff. all we really need to do is put the dialed host name at the start of the address list
<jam> rogpeppe: the only other thing to sanity check is things like 'git blame' to see what commit messages say about things.
<rogpeppe> jam: my current approach would mean that if there's a controller with a host name that resolves to several IP addresses and one of them is down, that the second time it would always try that IP address first
<rogpeppe> jam: unfortunately our commit messages are often pretty crap
<rogpeppe> jam: i really miss having the review history
<jam> rogpeppe: so with git blame and a small amount of walking, you can find the rev that actually merged the code, which gives you at least the review message
<rogpeppe> jam: --ancestry-path is very useful for that
<jam> rogpeppe: why would the IP that didn't resolve get chosen first the next time?
<jam> I also thought we always sort and then move the one we successfully connected to, to the front
<rogpeppe> jam: the IP that *did* resolve would be chosen first next time, sorry
<rogpeppe> jam: we do currently. my plan was to remove the unresolved-api-endpoints field and add a dns-cache field mapping host names to ip addresses
<rogpeppe> jam: when you successfully dial an address, you move that hostname to the start of api-endpoints and the dialed ip address to the start of the dns-cache entry
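The bookkeeping rogpeppe describes reduces to a move-to-front operation applied to both lists: the hostname within api-endpoints, and the dialed IP within that hostname's dns-cache entry. `moveToFront` here is a hypothetical helper, not the real juju code.

```go
package main

import "fmt"

// moveToFront returns a copy of addrs with addr at position 0 and the
// relative order of the remaining entries preserved.
func moveToFront(addr string, addrs []string) []string {
	out := make([]string, 0, len(addrs)+1)
	out = append(out, addr)
	for _, a := range addrs {
		if a != addr {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	// On a successful dial: hostname to the front of api-endpoints,
	// dialed IP to the front of that hostname's dns-cache entry.
	endpoints := moveToFront("ctrl.example.com", []string{"10.0.0.2", "ctrl.example.com"})
	dnsEntry := moveToFront("10.0.0.7", []string{"10.0.0.5", "10.0.0.7"})
	fmt.Println(endpoints, dnsEntry) // [ctrl.example.com 10.0.0.2] [10.0.0.7 10.0.0.5]
}
```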
<wpk> rogpeppe: could you check https://github.com/juju/juju/pull/7417 ?
<rogpeppe> wpk: reviewed
<natefinch> backup compression has landed in lumberjack FYI.
<natefinch> not sure who is on during US work hours anymore
<natefinch> hi rick_h marcoceppi rogpeppe alexisb
<rogpeppe> natefinch: yo!
<alexisb> heya natefinch
<natefinch> howdy :)
<natefinch> rogpeppe: how's things in juju land?
<rick_h> Howdy natefinch!
<rogpeppe> natefinch: scrumptious as always :)
 * natefinch waves at everyone
<natefinch> haha
<rick_h> natefinch: how's the weather up in the Northeast treating ya?
<natefinch> rick_h: pretty good.  mild most days, barely need heat or A/C.
<rick_h> natefinch: awesome, great time of the year
<natefinch> thumper wanted backup compression done by today, so it's in.  Updating to master of gopkg.in/natefinch/lumberjack.v2 will bring it in.  Also tagged it as v2.1 for anyone who might be using something that cares about semantic versioning.
<rick_h> natefinch: that's awesome ty much!
<marcoceppi> o/ natefinch
<natefinch> hi marcoceppi
<thumper> veebers: so... why does the assess_log_rotation acceptance test require a JUJU_HOME/environments.yaml?
<veebers> thumper: due to how the tests currently setup the environment to bootstrap, we have a source for credentials and settings etc. which are named (hence 'env' argument). ci-tests take that and prepare a JUJU_DATA (known as JUJU_HOME for historic reasons for the test arg)
<thumper> I'm not sure what I need to pass it to get it running locally
<veebers> thumper: if you have cloud-city you need: JUJU_HOME=<path to cloud city> ./<script name> <env name> where env name is parallel-lxd
<thumper> veebers: ok it is running now...
<veebers> thumper: cool
<thumper> babbageclunk: https://bugs.launchpad.net/bugs/1694559
<mup> Bug #1694559: Log forwarding + debug log level = infinite messages <juju:New> <https://launchpad.net/bugs/1694559>
<thumper> babbageclunk: is there any way to easily enforce a larger batch size?
<thumper> larger minimum that is
<babbageclunk> thumper: you'd need to change the structure of the code a bit - at the moment it just sends batches as it's handed them.
<babbageclunk> But I don't think it'd be especially hard.
<thumper> wallyworld, babbageclunk: is this bug still accurate? https://bugs.launchpad.net/juju/+bug/1646907
<mup> Bug #1646907: gce open-port does not create firewall rules <gce-provider> <network> <open-port> <juju:Triaged> <https://launchpad.net/bugs/1646907>
<babbageclunk> thumper: don't know, would need to try it out sorry
<thumper> babbageclunk: that's ok, I thought it might have been covered by work you did there
<thumper> with the firewaller
<wallyworld> thumper: don't *think* so. there was a lot of cleanup and improvement to that code that i did in the past couple of months, and the bug was from dec
<thumper> if you don't know, I'll just drop priority and we can address later
<wallyworld> +1
<thumper> wallyworld: it seems to me that if https://bugs.launchpad.net/juju/+bug/1613823 was still a problem, we'd see many more CI failures for gce
<mup> Bug #1613823: Google Compute Engine IP is ephemeral by default <gce-provider> <juju:Triaged> <https://launchpad.net/bugs/1613823>
<thumper> thoughts?
<anastasiamac> thumper: i think u'd see it to be a problem on a longer-running juju... how many CI tests are long-running?
<thumper> anastasiamac: but this is talking about controller dialing from a client
<wallyworld> thumper: the IP does change on reboot of a machine, but i didn't think it changed arbitrarily during use
<thumper> so if the controller reboots... nothing can talk to it?
<thumper> that seems terrible
<wallyworld> yeah, i think that may be the case
<wallyworld> i haven't tested fully myself
<wallyworld> but it does seem an issue
<wallyworld> we should look at for 2.3
#juju-dev 2017-06-01
<mup> Bug #1634390 changed: jujud services not starting after reboot when /var is on separate partition  <uosci> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1634390>
 * thumper relocates while car gets serviced
<blahdeblah> Any ETA on 2.2 release?  Asking for a friend's HA controller. :-)
<wallyworld> blahdeblah: 2.2 rc1 this week or more likely early next week; 2.2 fa soon after
<anastasiamac> s/ga/fa
<blahdeblah> anastasiamac: or s/fa/ga/ even :-)
<blahdeblah> wallyworld: thanks
<anastasiamac> blahdeblah: as long as u've figured :)
<anastasiamac> details... you know?... :D
<wallyworld> babbageclunk: free now if you want to chat
<babbageclunk> wallyworld: hey, yes please - back in standup?
<wallyworld> sure  be there in a sec
<axw_> wallyworld: we've got a lot of landing failures in the api package, going to look at that before windows things
<wallyworld> axw_: ok. i'm free now too whenever you wanted to talk
<axw_> wallyworld: ok, see you in standup then
<babbageclunk> wallyworld: yay, it turns out that MachineAgent.apiserverWorkerStarter would just leak the state if any error occurred when creating the apiserver in MachineAgent.newAPIserverWorker. The syslog tests pass now.
<wallyworld> whoot
<axw_> anastasiamac: would you kindly review https://github.com/juju/juju/pull/7428?
<anastasiamac> me looking
<anastasiamac> axw_: lgtm, tyvm!!
<axw_> gracias
<thumper> axw_: is this one that either you or wallyworld have fixed? https://bugs.launchpad.net/juju/+bug/1665040
<mup> Bug #1665040: Race in github.com/juju/juju/worker/peergrouper <ci> <race-condition> <regression> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1665040>
<thumper> github.com/juju/juju/worker/peergrouper.(*workerSuite).TestSetsAndUpdatesMembers.func1.1()
<wallyworld> thumper: yep, axw i think, i'd need to check the pr
<axw_> thumper wallyworld: pretty sure I fixed a different one, checking now
<wallyworld> but there's no more peer grouper races in the lastest runs
<thumper> thanks
<axw_> thumper: actually my PR would have fixed a bunch of tests, it was related to some common code. so yes
<thumper> sweet
<thumper> axw_: can you put the pr in that bug?
<axw_> yup
<thumper> ta
<axw_> wallyworld: https://github.com/juju/juju/pull/7429 should fix the windows test failure. going for a ride, bbs
<axw_> feel free to $$merge$$ if you're happy with it
<wallyworld> axw_: alright. after that i need to talk to you about storage
<thumper> veebers: ping
<thumper> veebers: are you able to jump on a quick hangout?
<veebers> thumper: yep, real quick have a standup coming up :-)
<thumper> oh, you go to jam's?
<veebers> aye, most of the time
<jam> yeah
<thumper> https://hangouts.google.com/hangouts/_/canonical.com/quick
<babbageclunk> ha ha, state has 270 public methods.
<thumper> :)
<thumper> hazaah
<veebers> thumper: any luck with that test now?
<thumper> veebers: got some time?
<thumper> I only have a few minutes before heading out
<thumper> taking Maia to guides
<thumper> veebers: I'm still in quick
<thumper> veebers: nm, have to head out now
<thumper> I'm running an attempt at a test fix
<wallyworld> babbageclunk: burton-aus: i think this failure may just be a slight difference in file content, but i haven't looked closely http://reports.vapour.ws/releases/5321/job/log-forward/attempt/1238
<jam> axw_: babbageclunk: I've been doing some tweaking on the internals of mgo/txn, and while I'd usually reach out to menn0, he's not around to discuss them. Are either of you interested?
<babbageclunk> jam: I would be, but I need to drop soon for child feeding and hosing down, sorry.
<jam> babbageclunk: well, these are not high priority, so if you're interested in the area, we can schedule it for the future
<babbageclunk> jam: yeah, definitely!
<babbageclunk> whoa, that was probably more enthusiasm than I intended.
<babbageclunk> But I definitely am interested.
<veebers> ugh, sorry thumper was peeling potatoes :-\
<babbageclunk> wallyworld, burton-aus: I haven't looked very hard at that, but there are definitely forwarded messages in the logs there, so I think it might be a test issue?
<wallyworld> babbageclunk: yeah. my initial thought was that the changes done should have been transparent, so if the test was passing before, it should pass now also
<babbageclunk> wallyworld: now that I think about it there might be ordering differences (since the logs for each model would be forwarded independently), or the forwarding might only be set up for the controller model in the test (expecting that would also forward the model logs, which isn't true any more).
<wallyworld> babbageclunk: the latter sounds more plausible
<wallyworld> we'll have to get the test updated
<babbageclunk> wallyworld: want me to take a look at that? I think I'm finished the log collection splitting. Just fixing state tests that look directly in the logs collection, and then migration steps.
<babbageclunk> I mean, upgrade steps
<wallyworld> babbageclunk: i think it maybe better to continue your wip
<wallyworld> babbageclunk: burton-aus might get to it first, otherwise you could look after putting up the log split PR
<wallyworld> even if we just identify that the test needs fixing, we can sort out something to unblock the release
<axw_> wallyworld: sorry, you wanted to chat storage? 1:1?
<wallyworld> axw_: yeah, standup ho?
<wallyworld> axw_: https://github.com/wallyworld/juju/compare/cleanup-removes-app-artefacts...wallyworld:cleanup-removes-app-artefacts2?expand=1
<axw_> jam: if you have something written down about your changes, I'd be interested to read - I don't know enough about the insides of mgo/txn to provide immediate useful feedback
<jam> axw: sure. the specific changes in this case are doing some caching and preloading of db requests
<babbageclunk> wallyworld: yeah, I thought that too, just checking.
<veebers> wallyworld, burton-aus: with the log-forwarding test, the test uses a regex to look for log entries that come from the other machine, that might need to be updated (maybe simplified)
<burton-aus> veebers this is the one I guess:
<burton-aus> "^[A-Z][a-z]{,2}\ +[0-9]+\ +[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}\ machine-0.3ec9b846\-9520\-4d40\-87f2\-5c9114c8a28f\ jujud-machine-agent-3ec9b846\-9520\-4d40\-87f2\-5c91\ .*$"
<burton-aus> veebers though that machine related string is just fetched from the run.
<veebers> burton-aus: aye, that's the one.
<veebers> babbageclunk, wallyworld, burton-aus: It's kind of hidden but the failure is in ensure_multiple_models_forward_messages, which adds a new model and deploys something, then checks that logs from that model appears in the rsyslog machine logs
<veebers> so Looking at what babbageclunk mentioned, perhaps there is some extra config needed to make sure those logs get forwarded as well?
<wallyworld> veebers: that bit should be transparent IIANM
<veebers> wallyworld: I'm sorry I don't understand, which part, that there needs to be extra config for the models? Or that there shouldn't be any need for extra config?
<wallyworld> no need for extra config
<wallyworld> if the test is checking logs from a model, that bit should work the same as before
<wallyworld> axw: i think we have an issue still - cleanupDyingUnit calls cleanupUnitStorageAttachments() with remove=false, so the storage removal doesn't happen, and EnsureDead() fails. i can't see that the processing of a dying unit adds a cleanup job to remove dying storage
<veebers> wallyworld: ah cool, thanks for clarifying. It's possible the regex check needs tweaked (and or relaxed) if the format has changed a bit
<wallyworld> i am likely missing something
<axw> wallyworld: just a minute, looking
<wallyworld> veebers: the format should be the same also
<wallyworld> veebers: xtian will need to look into it a bit
<axw> wallyworld: right, so cleanupDyingUnit causes the storage attachments to go to Dying (detach but don't remove)
<wallyworld> veebers: it could be a test tweak as well, we just don't know yet
<axw> wallyworld: then the uniter will run detach-storage hooks
<axw> and will then remove them
<wallyworld> axw: ah right, i need to run that bit manually as well
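The sequencing axw describes can be sketched as a small state machine: `cleanupDyingUnit` only moves attachments to Dying (remove=false), the uniter runs the detach-storage hooks and removes them, and only then can the unit go Dead. All names and structure here are illustrative, not juju's actual state code.

```python
# Illustrative lifecycle sketch; hypothetical names, not juju's real code.
ALIVE, DYING, DEAD = "alive", "dying", "dead"

class StorageAttachment:
    def __init__(self):
        self.life = ALIVE
        self.detach_hook_ran = False

class Unit:
    def __init__(self, attachments):
        self.life = ALIVE
        self.attachments = attachments

def cleanup_dying_unit(unit):
    # remove=False: attachments only go to Dying here; removal happens
    # later, after the uniter has run the detach-storage hooks.
    unit.life = DYING
    for a in unit.attachments:
        if a.life == ALIVE:
            a.life = DYING

def uniter_process(unit):
    # The uniter reacts to Dying attachments: run the hook, then remove.
    for a in unit.attachments:
        if a.life == DYING:
            a.detach_hook_ran = True
            a.life = DEAD

def ensure_dead(unit):
    # EnsureDead fails while storage attachments are still present.
    if any(a.life != DEAD for a in unit.attachments):
        raise RuntimeError("unit has storage attachments")
    unit.life = DEAD
```

Running `ensure_dead` before the uniter step fails, which is the behaviour wallyworld saw when testing without running the uniter manually.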
<veebers> wallyworld: ack ok, keep us posted :-)
<thumper> jam: just here to beg a review
<jam> thumper: of?
<thumper> https://github.com/juju/juju/pull/7430
<veebers> thumper: I missed your ping before, available now if you like
<thumper> veebers: see PR
<thumper> veebers: can you just check the CI test aspects?
<veebers> thumper: link to PR?
<thumper> veebers: two lines above your mention
<veebers> thumper: ah ha :-) looking now
<thumper> it really is pretty simple
<thumper> 7 files, +10 −6
<veebers> thumper: sweet, commented. LGTM
<thumper> jam: ?
<jam> thumper: was otp, do you want it right away?
<thumper> I'll poke axw
<thumper> I was wanting to kick off the merge
<thumper> it's very very simple
 * thumper looks at axw
 * thumper will pop back in 10min
 * thumper needs to clean house a bit
<thumper> veebers: I'm assuming for develop we use the in tree tests and charms
<jam> thumper: are we guaranteed never to see .log before it becomes .log.gz
<veebers> thumper: not yet. That's something we're working toward (won't be far away)
<thumper> veebers: oh... well the CI test will fail then
<thumper> you do see a .log before it becomes .log.gz, but just very briefly
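Because the `.log` file briefly exists before it becomes `.log.gz`, a test that checks for the compressed file needs to tolerate that window rather than fail on the first look. A hypothetical polling helper (not the actual CI code) might look like:

```python
import os
import time

def wait_for_compressed_log(path, timeout=30, interval=0.5,
                            exists=os.path.exists, now=time.time):
    """Poll until path + '.gz' appears, tolerating the brief window in
    which only the uncompressed .log file exists. Hypothetical helper
    sketching the timing issue, not the actual CI code."""
    deadline = now() + timeout
    while True:
        if exists(path + ".gz"):
            return True
        if now() >= deadline:
            return False
        time.sleep(interval)
```

The `exists` and `now` parameters are injectable purely so the helper can be exercised without touching the filesystem.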
<veebers> thumper: ack, once that branch lands we can do the separate "update all" which will propagate the changes (then re-run the test again if needed)
<thumper> veebers: but... but.. then the tests will fail from older versions
<thumper> or have you fixed that?
<axw> thumper: sorry I was afk, you wanted a review from me?
<thumper> https://github.com/juju/juju/pull/7430
<thumper> axw: discussing ^^
<thumper> jam: I guess if we hit a weird timing issue, we can add a sleep 5 to the action
<thumper> :)
<thumper> but it passed here
<thumper> on lxd with ssd
<thumper> well... I guess we'll find out
 * thumper needs to head off now...
<axw> veebers: I can merge tim's branch, but CI will start failing - how much longer are you around? isn't it already past your EOD?
<veebers> axw: aye it is, the CI test will fail on the revision-build, we can make it pass though by updating the nodes once it lands
<veebers> It's a bit messy as we're in the process of making it so testing is done from in tree
<axw> veebers: ok. looks like this is meant to be going into rc1, so I'll merge and hopefully balloons can sort it out when he wakes up
<veebers> axw: ack, I'll email and let him know
<axw> veebers: thanks :)
<wallyworld> rogpeppe: hey, i'm told  a recent change to add/use a dns cache may have added a flakey test, TestDNSCacheUsed... any chance you could look? we are trying to get an rc out this week. here's an example of a failure http://reports.vapour.ws/releases/5325/job/run-unit-tests-xenial-amd64/attempt/255
<rogpeppe> wallyworld: was that before https://github.com/juju/juju/pull/7429 landed?
<wallyworld> rogpeppe: it's off the latest CI run, let me check to see that rev it is
<rogpeppe> wallyworld: thanks
<wallyworld> rogpeppe: yeah, the CI run is from testing PR 7430 which landed 5 hours after
<rogpeppe> wallyworld: OK, i'll look into it
<wallyworld> rogpeppe: tyvm, i'm off to bed real soon
<wallyworld> we are looking to get a good CI run for the morning in australia
<wpk> around midnight UTC?
<rogpeppe> wallyworld: there's one problem that really should be fixed before release
<rogpeppe> wallyworld: https://bugs.launchpad.net/juju/+bug/1692905
<mup> Bug #1692905: cert error on public controller: cannot validate certificate  <juju:New> <https://launchpad.net/bugs/1692905>
<rogpeppe> wallyworld: i'm working on the fix
<balloons> hey wallyworld ;)
<balloons> Looks like if you land the unit test fix the only issue will be with the windows deploy test. We think the slave is sick
<balloons> rogpeppe, will you have a fix for that today?
<rogpeppe> balloons: i am hoping to, yes
<rogpeppe> balloons: i've fixed the code - just writing tests for it
<balloons> Awesome. So we can get a bless on that landing. Changing any dependencies?
<rogpeppe> balloons: here's a fix for another flaky test of mine... https://github.com/juju/juju/pull/7434
<balloons> rogpeppe, ack. Good stuff.
<mup> Bug #1694988 opened: AWS instances created by juju don't have an associated IPv6, even if "auto-assign IPv6 addresses" is enabled for the subnet <juju-core:New> <https://launchpad.net/bugs/1694988>
<rogpeppe> this PR fixes juju bug 1692905: https://github.com/juju/juju/pull/7438; reviews appreciated
<mup> Bug #1692905: cert error on public controller: cannot validate certificate  <juju:New> <https://launchpad.net/bugs/1692905>
<wpk> rogpeppe: unit tests are failing
<cmars> wpk, could i please get a review of https://github.com/juju/juju/pull/7439 ?
<wpk> cmars: done.
<cmars> wpk, thanks
<rogpeppe> wpk: looking
<rogpeppe> cmars: you could look at the server version and warn if it's 2.2 or greater, I guess
<rogpeppe> wpk: ok, a bunch of fairly trivial things; i was too lazy to run the tests on my own machine, can you tell? :)
<rogpeppe> wpk: i'd very much appreciate a review if you're up for it, BTW
<rogpeppe> wpk: tests should be fixed now
<wpk> rogpeppe: full suite kills my laptop, so I always do the bare minimum and then test it in Jenkins :)
<rogpeppe> wpk: at least the tests are now run before you hit $$merge$$
<babbageclunk> veebers: hey, sorry to miss the discussion last night - I think if you set up the model defaults with log forwarding settings before creating the model it should start forwarding logs straight away.
<veebers> babbageclunk: that's contrary to what wallyworld said re: the settings being transparent isn't it?
<wallyworld> veebers: you always did need to set up log forwarding in the initial config
<babbageclunk> veebers: It's transparent if you were already setting the model defaults. ;)
<veebers> ah right
 * veebers checks what the test is doing now
<babbageclunk> veebers: There is a change in behaviour - before if you set up forwarding for the controller it would automatically forward for all models. Now if you want that you need to put the settings in model defaults.
<veebers> babbageclunk: ah right ok, I think that's the missing part in my thinking, cheers
<babbageclunk> veebers: Sorry, I probably should have mentioned that earlier!
<veebers> babbageclunk: you have docs re: what the settings values are for that?
<externalreality> Does anyone happen to know if there exist CI jobs that depend on the fact that Go is being installed by Juju's top level Makefile?
<veebers> balloons: is it only unit tests that we expect juju to install go as part of its own setup?
<balloons> externalreality, yes we do depend on it
<balloons> veebers, when we run the merge jobs we use the makefile on the new instance to test
<babbageclunk> veebers: no, I don't think it's documented anywhere. The settings haven't changed - still logforward-enabled, syslog-host, syslog-ca-cert, syslog-client-cert, syslog-client-key.
<babbageclunk> veebers: Just how they're used has changed.
<veebers> babbageclunk: do you need to set the syslog-* stuff on the model config too?
<babbageclunk> veebers: yup - they could be independent, in theory.
<babbageclunk> veebers: Want to have a quick hangout about it?
<veebers> babbageclunk: would love to, just in release call, will be a little bit before i'm free, can I ping you?
<rogpeppe> thumper, axw, wallyworld: you might wanna take a look at this PR - I'm hoping it can land before the release: https://github.com/juju/juju/pull/7438
<wallyworld> rogpeppe: might be able to :-) btw there's still a dns cache related race
<babbageclunk> veebers: yup yup
<rogpeppe> wallyworld: got a link to a failure?
<wallyworld> give me a sec to find it
<wallyworld> rogpeppe: TestDNSCacheUsed. I think you fixed a different one. http://reports.vapour.ws/releases/5331/job/run-unit-tests-race/attempt/2831
<rogpeppe> wallyworld: ooh, an actual race!
<rogpeppe> wallyworld: thanks
<babbageclunk> ooh!
<wallyworld> rogpeppe: we are delaying rc1 till next monday/tuesday, so that will get time for your pr to land
<wallyworld> rogpeppe: yeah, we have fixed several actual races this week elsewhere as well. so close to getting a proper blessed CI run
<rogpeppe> wallyworld: ok, that's a trivial fix
<wallyworld> yay
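A data race in a DNS cache like the one behind TestDNSCacheUsed typically comes down to unsynchronized access to the shared map. A minimal mutex-guarded sketch of the idea (illustrative only, not rogpeppe's actual fix):

```python
import threading

class DNSCache:
    """Minimal thread-safe host -> addresses cache; a sketch of the kind
    of locking such a race calls for, not the actual juju code."""
    def __init__(self, resolve):
        self._resolve = resolve          # fallback resolver function
        self._lock = threading.Lock()
        self._entries = {}

    def lookup(self, host):
        with self._lock:
            if host in self._entries:
                return self._entries[host]
        addrs = self._resolve(host)      # resolve outside the lock
        with self._lock:
            self._entries[host] = addrs
        return addrs
```

Resolving outside the lock keeps slow lookups from blocking other readers, at the cost of occasionally resolving the same host twice, which is harmless for a cache.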
<babbageclunk> Is anyone else having trouble with pushes to github taking a long time?
<veebers> babbageclunk: I did last night and thought it was just my internet (or that I did something wrong)
<rogpeppe> wallyworld: https://github.com/juju/juju/pull/7440
<wallyworld> rogpeppe: you rock, ty, will look real soon
<wallyworld> i'll merge
<wallyworld> rick_h: just finishing meeting, be there in a sec
<rick_h> wallyworld: all good
<babbageclunk> veebers: Mine's just sitting here with a git pack-objects process doing nothing. I guess the other end of the connection is loaded?
<babbageclunk> ugh, finally!
<thumper> babbageclunk: can you join the release call plz?
<babbageclunk> thumper: sure
<wallyworld> rick_h: still in release call, can we delay for 15 mins, or defer?
<thumper> rick_h: I'm feeling left out, you haven't asked me for a call
<rick_h> wallyworld: rgr, just setup something that works for you next week if that's ok
<rick_h> thumper: well, I like wallyworld :P
<wallyworld> rick_h: sure, and sorry, 2.2 release is so close
<wallyworld> we need to get stuff sorted
<rick_h> wallyworld: <3
<rick_h> yea
<rick_h> wallyworld: thumper the one thing for 2.2 I wanted to bring up is if this article effects instance type availability and needs to be mentioned. https://goo.gl/VNe9oC
<rick_h> wallyworld: other than that I'll catch you later
<wallyworld> rick_h: we'll look at it
<wallyworld> babbageclunk: could you also tweak the controller setting max-txn-log-size when you do the mustString() thing for the other ones?
<babbageclunk> wallyworld: ok
<babbageclunk> wallyworld: we don't ww
<babbageclunk> oops
<babbageclunk> wallyworld: covfefe
<wallyworld> lol
<babbageclunk> wallyworld: We don't want to run the log pruner per-model do we?
<wallyworld> it should be like the status history pruner
<wallyworld> i think that's per model
<babbageclunk> wallyworld: oh, ok - I'll take a look at that
#juju-dev 2017-06-02
<anastasiamac> axw: ping
<wallyworld> babbageclunk: bug 1688635 fwiw
<mup> Bug #1688635: 'max-logs-age' panic on bootstrap with lxd provider <juju:In Progress by 2-xtian> <https://launchpad.net/bugs/1688635>
<babbageclunk> wallyworld: ok, thanks
<wallyworld> babbageclunk: i may have a slightly different solution, just looking into it now
<babbageclunk> wallyworld: cool, let me know how it goes.
<wallyworld> babbageclunk: yep, i will do a fix elsewhere
<babbageclunk> wallyworld: ah, ok - so I don't need to think about config at all then?
<wallyworld> no
<wallyworld> babbageclunk: was a 3 or 4 line fix, just writing a test
<babbageclunk> awesome
<thumper> babbageclunk: https://github.com/juju/juju/compare/develop...howbazaar:separate-log-dbs?expand=1 updated, I'm heading out for lunch, will test more when I get home.
<wallyworld> thumper: was a simple fix, just had to change how restore constructed controller config. Using New() fills in any defaults https://github.com/juju/juju/pull/7442
<wallyworld> or babbageclunk ^^^^ if thumper has run away
<thumper> babbageclunk: my branch had a bug
<thumper> will update
<babbageclunk> thumper: ok - luckily I haven't looked at it yet!
<babbageclunk> thumper: just sorting out tests for pruning.
<thumper> babbageclunk: I'm testing the upgrade test now
<thumper> manually
<babbageclunk> thumper: cool.
<babbageclunk> thumper: do you know whether there's a way to pass a yaml file to model-default or model-config (the way you can to --config or --model-default at bootstrap time)?
<thumper> no sorry
<babbageclunk> see veebers it's thumper's fault
<veebers> babbageclunk: hah :-)
<babbageclunk> thumper: also a question from veebers that I don't know the answer to: do you do the model-default sub command on the controller or the model itself?
<babbageclunk> thumper: and: does setting model defaults after a  model has been created update the config on that created model?
<babbageclunk> I also would like to know the answers to these Qs
<thumper> model-defaults are set at bootstrap time
<thumper> not sure if they can be set after that...
<babbageclunk> thumper: there's a model-defaults command that works just like model-config except presumably it sets defaults. But I don't really understand what it does if it's not working on the controller.
<thumper> it probably is working just on the controller
<thumper> also, if you have a model and update the defaults
<thumper> I'm not sure if the defaults are re-propagated, wallyworld probably does though
<wallyworld> say wot
<wallyworld> model defaults are used when adding models
<wallyworld> the new model config is seeded from the defaults
<babbageclunk> wallyworld: yeah, but the model-defaults command can take a model - what's that about?
<veebers> wallyworld: so if I bootstrap and have a default model, then set the model-defaults, that default model won't have the defaults?
<veebers> wallyworld: but if I add a model after that, the new model will?
<wallyworld> babbageclunk: what do you mean "can take a model"? it's a controller command IIANM
<wallyworld> veebers: correct, the model defaults are not inherited but copied
<wallyworld> veebers: once a model is added, that's it
<wallyworld> it has its own config
<veebers> wallyworld: ok thanks.
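The copy-not-inherit semantics wallyworld describes can be sketched in a few lines: a model's config is seeded by copying the defaults at add-model time, so changing the defaults afterwards does not touch existing models. Names here are illustrative.

```python
# Sketch of defaults-seeding semantics; hypothetical names, not juju code.
def add_model(name, model_defaults, extra_config=None):
    config = dict(model_defaults)        # copied, not referenced
    config.update(extra_config or {})
    return {"name": name, "config": config}

defaults = {"logforward-enabled": False}
first = add_model("first", defaults)
defaults["logforward-enabled"] = True    # change defaults afterwards
second = add_model("second", defaults)
```

Only `second`, created after the change, picks up the new default; `first` keeps the config it was seeded with.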
<babbageclunk> wallyworld: Ah, you're right! But the examples include a -m version. Which was blowing my tiny mind, 'specially on a Friday afternoon.
<babbageclunk> I'mm'a push a fix for that.
<veebers> babbageclunk, wallyworld I think that means I have a way forward with the log-forwarding test, gonna try now
<wallyworld> babbageclunk: whomever wrote those examples should be shot. hope it wasn't me
 * babbageclunk runs a git praise
<wallyworld> whew
<wallyworld> not me!
<babbageclunk> You're safe for now!
<thumper> hmm...
<thumper> my testing rendered my controller non-responsive
<babbageclunk> wallyworld: also, what does it mean to specify a cloud/region for that command?
<wallyworld> babbageclunk: different cloud regions can have different defaults, eg proxy
<wallyworld> or apt mirror
<babbageclunk> wallyworld: but how can a controller have different clouds or regions? Oh, is this a jaas thing?
<wallyworld> for a model, if there's no cloud region default, it will look to use a model default sans region
<wallyworld> we now support models in different regions in a single controller
<wallyworld> the underlying cloud has to be the same for each model
<babbageclunk> wallyworld: oh right, ok - that makes sense. thanks!
<wallyworld> np
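The lookup order wallyworld describes — a region-specific default wins, otherwise fall back to the cloud-wide default sans region — can be sketched as follows. The data layout and value strings are illustrative, not juju's actual schema.

```python
# Sketch of region-scoped default lookup; illustrative structure only.
def effective_default(key, defaults, region):
    regions = defaults.get("regions", {})
    if region in regions and key in regions[region]:
        return regions[region][key]      # region-specific default wins
    return defaults.get(key)             # fall back sans region

defaults = {
    "apt-mirror": "http://archive.ubuntu.com/ubuntu",
    "regions": {
        "us-east-1": {"apt-mirror": "http://us.mirror.example/ubuntu"},
    },
}
```

A model added in us-east-1 would be seeded with the region's mirror; one in any other region gets the cloud-wide value.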
<babbageclunk> thumper: that sounds bad
<thumper> yeah, investigating
<thumper>  /var/lib/juju/db/collection-69--8034483712429095609.wt: handle-write: pwrite: failed to write 4096 bytes at offset 4849664: No space left on device
<thumper> which is weird, because there is space now...
<babbageclunk> thumper: and we're only talking ~300M of data right?
<thumper> ah
<thumper> my zfs system is full
<babbageclunk> doh
<thumper> I had some old machines lying around
<thumper> not sure where they are from
 * thumper cleans up
 * babbageclunk fights the urge to talk about atm machines and pin numbers.
<veebers> wallyworld, babbageclunk: should I be concerned if I see this: WARN  juju.cmd.juju.model defaultscommand.go:592 key "syslog-ca-cert" is not defined in the known model configuration: possible misspelling
<wallyworld> we shouldn't see that. it's likely noise but should not be printed
 * thumper starts again
<wallyworld> if everything works, it's worth a bug
<babbageclunk> wallyworld: maybe I need to add it somewhere? Odd though - I didn't change the config settings.
 * thumper taps fingers waiting
<wallyworld> babbageclunk: there is a check that what the user types belongs to the config schema; not sure off hand why that attr didn't pass muster
<babbageclunk> thumper: are your latest changes in your branch? I'll start pulling them optimistically.
<thumper> babbageclunk: let me push
<thumper> babbageclunk: there now
<babbageclunk> thumper: thanks
<thumper> oh yeah...
<thumper> machine sluggish now
<thumper> load over 10, CPU over 70% of every core
<thumper> 1323 M of log collection data
<thumper> pruning cuts in every 5 minutes
<thumper> so I'll wait for that then run the upgrade
<babbageclunk> thumper: Removing version - how far should I go? Take it off params.LogStreamRecord too? I think that's ok, api-versioning-wise - logstream's only used from workers in the controller.
<thumper> babbageclunk: I think so
<babbageclunk> thumper: book
<veebers> babbageclunk, wallyworld: Hmm, that didn't work. Have you confirmed this manually? I might be missing some step
<wallyworld> i haven't tested, xtian has
<babbageclunk> veebers: I haven't tested that though, only at bootstrap time. I'll try it now, hang on.
<veebers> babbageclunk: ah, I might not be setting logforward-enabled on the model
<veebers> babbageclunk: only on the controller
<babbageclunk> veebers: that should do it - that has to be enabled for each model.
<veebers> babbageclunk: let me try again
<babbageclunk> veebers: although if it's enabled in defaults then it will be for new models.
<veebers> babbageclunk: right, need to check that's what I need in the test or if the timing of the start of forwarding is important
<thumper> hmm...
<thumper> why isn't log pruning happening
<babbageclunk> thumper: hmm, version's needed by logforwarding - it goes into the origin for juju in the syslog - I guess I can still rip it out and just have the forwarder add version as it writes to the syslog.
<axw> wallyworld thumper: https://private-fileshare.canonical.com/~axw/lp1677434-jujud.tar.xz <- contains the presence and status history fixes on top of 2.1, with version set to 2.1.2
<thumper> axw: I think they are on 2.1.3
<thumper> hmm...
<thumper> well they should be as that is the security release
<axw> thumper: the bug says 2.1.2
<babbageclunk> thumper: is your system a bastardised one with your split logs collections? pruning might be a bit bung.
<thumper> huh
<thumper> you are right
<thumper> babbageclunk: no, 2.1.2
<thumper> what was the log size? I thought it was 300 meg
 * babbageclunk shrugs then
<babbageclunk> thumper: ripping version out is getting a bit intense, I'm going to park it for now and hopefully do it later.
<axw> thumper: if it turns out they're running a newer version, they can drop a FORCE-VERSION file into the same dir as jujud
<thumper> babbageclunk: ack
<thumper> axw: ack
<thumper> oh fuck
<thumper> default max is 4 gig
 * thumper isn't going to fill that
<thumper> hmm...
 * thumper looks again
<thumper> ugh... should be ok
 * thumper makes more logs
<veebers> babbageclunk: on closer inspection, some log forwarding was always working, it's just the added model that's not. I'm checking now
<veebers> ah, possible that it needs to be enabled specifically
<thumper> ok... over 4G of logs
<thumper> this should be a good test for upgrade step
<babbageclunk> veebers: ok, that sounds good.
<thumper> babbageclunk: took 7 minutes
<thumper> but it successfully split the logs into 5
<thumper> and that was for 4G of logs
<thumper> which is the default max of 2.1.2
<thumper> now for other interesting bits
<thumper> on 4.4G of logs indexes were 89M
<thumper> now with the split
<thumper> we have...
 * thumper queries and adds up
<thumper> meh...
<thumper> doesn't seem to be that much different in size to be honest
<thumper> but faster to clean up
<babbageclunk> thumper: Hmm, is there any way we can make it resilient to crashing half-finished? Find the max id in any of the child collections and start from there, maybe?
<thumper> babbageclunk: for the upgrade?
<babbageclunk> thumper: yup
<thumper> babbageclunk: if it crashes half way through it hasn't "upgraded" so agent.conf isn't updated
<thumper> next time it starts, it will run the upgrade steps again
<thumper> which will just continue
<babbageclunk> right, but will we get double-ups in the child log tables?
 * thumper thinks...
<thumper> yes
<thumper> but they'll get pruned
<thumper> if it does crash...
<thumper> minor problem with dupes
<thumper> for a while
<thumper> the "correct" way would be to remove each doc as it is moved
<thumper> will probably double the time for the upgrade
<thumper> but more resilient to failure
<babbageclunk> thumper: but I think finding the latest id in any of the child collections would work too, wouldn't it? Or at least restrict double ups to the time it was up to at the crash.
<thumper> are object ids strictly sortable?
<babbageclunk> thumper: no, I don't think so in the general case, unless you know they were only generated in one machine
<thumper> babbageclunk: actually, perhaps we should batch removals?
<thumper> babbageclunk: it would minimise restart dupes
<thumper> but also limit the doubling of logs during migration
<thumper> babbageclunk: thoughts?
<babbageclunk> thumper: yeah, sounds sensible to me.
<thumper> babbageclunk: as I'm slightly concerned about disk usage during upgrade
<thumper> perhaps bulk insert too
 * thumper considers
<babbageclunk> thumper: ooh, nice.
<thumper> insert takes (docs... interface{})
<veebers> babbageclunk: sweet, looks like I have a fix for the test, it was a lot more simple than I first thought
<babbageclunk> veebers: awesome
<thumper> babbageclunk: ah fark...
<babbageclunk> ?
<thumper> babbageclunk: harder to do batch inserts
<thumper> due to different collections
<thumper> can do batch delete
<babbageclunk> probably the easier option would be to grab a batch, group them by model, insert each model-batch, then delete the batch from source.
<babbageclunk> non-optimal since it's doing smaller inserts, but still better.
<thumper> babbageclunk: I've gone for single inserts, batch deletes
<thumper> happy medium for effort / reward
<babbageclunk> thumper: fair enough.
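The happy medium thumper settled on — per-doc inserts into the per-model collections, one batched delete from the shared source — can be sketched with plain dicts standing in for the mongo collections. This is an in-memory illustration of the strategy, not the real upgrade step.

```python
# In-memory sketch of the split-log-collections migration strategy;
# dicts stand in for mongo collections, purely for illustration.
def migrate_batch(source, per_model, batch_size=1000):
    batch = list(source.items())[:batch_size]
    for doc_id, doc in batch:
        dest = per_model.setdefault(doc["model-uuid"], {})
        dest[doc_id] = doc               # single insert per doc
    for doc_id, _ in batch:              # one batched delete per batch
        del source[doc_id]
    return len(batch)

def migrate_all(source, per_model):
    while migrate_batch(source, per_model):
        pass
```

Deleting in batches bounds the duplication window on restart while avoiding the disk growth of copying everything before removing anything.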
<thumper> babbageclunk: http://paste.ubuntu.com/24744827/
<babbageclunk> thumper: are you going to do another perf test to see how much the batch deletes cost?
<thumper> babbageclunk: have you grabbed the upgrade steps yet?
<thumper> it is more the disk usage during upgrade
<thumper> performance will be slightly worse
<thumper> no
<babbageclunk> thumper: nope - been fixing test failures.
<thumper> wasn't going to do another test
 * thumper pushes change
<babbageclunk> thumper: any chance it'll be lots worse?
<thumper> not lots
<thumper> a little
<babbageclunk> You could do the test on a much smaller sample to get a feel for how much worse.
<thumper> oh alright then
<babbageclunk> :)
<babbageclunk> anyway, I'm getting the changes and putting them into my tree.
 * thumper creates 4G of logs
<veebers> babbageclunk, wallyworld: hey thanks for talking me through the log-forwarding stuff, turns out this is the fix: https://github.com/juju/juju/pull/7445 (a lot simpler than I first thought)
<veebers> :-P
<wallyworld> veebers: told you it was essentially transparent :-)
<wallyworld> just a tweak to enablement
<veebers> indeed, sorry for the noise wallyworld (in my defense I did learn some stuff)
<wallyworld> veebers: hey, no problem! always good to question how things are
<wallyworld> that's how we discover our mistakes and improve
<babbageclunk> veebers: nice one.
<wallyworld> by our i mean dev
<veebers> ^_^
<wallyworld> easy for dev to make assumptions because we are close to it all the time
<veebers> There is also a fix for the model migration test (well a general fix, migration tests are the most affected)
<wallyworld> yay, so CI should look pretty sweeeeeet
<wallyworld> thumper: you forgot my PR :-(
<thumper> yes I did
<thumper> sorry, too busy testing and working with babbageclunk
<wallyworld> no worries, understand
<wallyworld> axw: could you take a look, it's only 4 lines https://github.com/juju/juju/pull/7442
<wallyworld> thanks thumper
<veebers> wallyworld, thumper, babbageclunk: I'm off o/ I'll check in tomorrow to make sure those test fixes where landed etc. and re-run any tests that might need it
<babbageclunk> veebers: thanks!
<wallyworld> veebers: thanks for all the help! so close to getting a bles snow
<babbageclunk> the best kind of snow
<veebers> no worries. Yeah those test results are looking heaps better!
<wallyworld> stupid typo
<veebers> I think I like "bles now" better than a bless
<wallyworld> still a race or two to fix
<wallyworld> babbageclunk: i have a branch with the new split consume working. just need to do a couple of more tests. will propose for review next week
<babbageclunk> wallyworld: oh cool!
<wallyworld> so close to multi-controller cmr but we probs won't productise it
<wallyworld> foundations will serve other purposes though
<thumper> babbageclunk: um...
<babbageclunk> ?
<thumper> babbageclunk: my code is broken
<babbageclunk> :(
 * thumper enfixes
<babbageclunk> Ok, as long as the changes are just in the split func and tests it's pretty easy for me to update.
<thumper> babbageclunk: have you copied stuff yet?
<babbageclunk> yup
<babbageclunk> just been fixing tests then pushing.
<thumper> http://paste.ubuntu.com/24745126/
 * thumper thinks how to copy this to the dead machine
<babbageclunk> ta
<babbageclunk> thumper: oh, easy fix.
<thumper> oh FFS
<thumper> babbageclunk: we need to handle dupe inserts
<thumper> in the upgrade step
<thumper> in the case of restart
<thumper> 2017-06-02 05:39:58 ERROR juju.upgrade upgrade.go:149 upgrade step "split log collections" failed: failed to insert log record: E11000 duplicate key error collection: logs.logs.c4bfb84a-fa78-4def-8b7e-e5bb0b3d15b1 index: _id_ dup key: { : ObjectId('5930ef6656401a20508455dc') }
<thumper> but I need to leave
<thumper> rachel will kill me otherwise
<thumper> babbageclunk: so an error from the insert that is a dupe should be ignored
 * thumper out
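The fix thumper is asking for — treat a duplicate-key insert as already-done work when the upgrade step re-runs after a restart — amounts to an idempotent insert wrapper. In the real code this would be a check on mongo's E11000 error; the sketch below simulates it with a dict-backed collection and a hypothetical exception type.

```python
class DuplicateKeyError(Exception):
    """Stands in for mongo's E11000 duplicate key error."""

def insert(collection, doc_id, doc):
    if doc_id in collection:
        raise DuplicateKeyError(doc_id)
    collection[doc_id] = doc

def insert_ignoring_dupes(collection, doc_id, doc):
    # On restart the upgrade step re-copies records it already moved, so
    # a duplicate insert just means the work was done: swallow it.
    try:
        insert(collection, doc_id, doc)
    except DuplicateKeyError:
        pass
```

With this in place, re-running the "split log collections" step after a crash completes instead of failing on the first already-copied record.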
<babbageclunk> wallyworld: could you take a gander at https://github.com/juju/juju/pull/7441? I still need to make the fix thumper alluded to there.
<wallyworld> ok
<babbageclunk> But if I don't stop now to help with dinner then I also will be killed.
<babbageclunk> the kids have been especially trying today.
<babbageclunk> quick, hide!
<wallyworld> babbageclunk: see what you think of my comments
<rogpeppe> jam: since you were involved before, you might want to take a look at https://github.com/juju/juju/pull/7438
<axw> balloons: I think the windows test machine might be dead? http://juju-ci.vapour.ws:8080/job/github-merge-juju/11077/artifact/artifacts/windows.log/*view*/
<wpk> axw: it picked a great day to die
<arosales> jam: Thanks for the comment in https://bugs.launchpad.net/juju/+bug/1677434
<mup> Bug #1677434: listing models is slow <adrastea> <amd64> <arm64> <canonical-bootstack> <ppc64el> <uosci> <usability> <juju:Triaged> <https://launchpad.net/bugs/1677434>
<arosales> jam: is there a recommended way to bet cpuprofile with the 2.1 agents?
<arosales> s/bet/get/
<balloons> axw, I'll look
<balloons> axw, your merge is re-running
#juju-dev 2018-05-28
<thumper> https://github.com/juju/juju/pull/8772/files
 * anastasiamac_ looking
 * thumper sighs
<thumper> ugh... wrong channel even
<anastasiamac_> mondays r hard
