#juju-dev 2012-09-03
<TheMue> goooooood morning
<davecheney> helllo
<rogpeppe> TheMue, davecheney: mornin'
<TheMue> rogpeppe: heya
<fwereade> yay, I has internets again, I hope
<fwereade> mornings :)
<TheMue> fwereade: internet makes stupid (topic of a German TV show yesterday)
<rogpeppe> just rebooting
<Aram> yo.
<wrtp> Aram: hiya
<TheMue> Aram: hi
<TheMue> Aram: jfi, i'm working on mstate/unit
<TheMue> lunchtime
<wrtp> niemeyer: yo
<wrtp> !
<niemeyer> wrtp: Buenas!
<wrtp> niemeyer: do you know of any way of doing an "unmerge"?
<wrtp> niemeyer: i.e. i've got a branch that merged another one some time ago, and i want to back out from all the changes that that merge introduced
<niemeyer> wrtp: The only reasonable path for that is to apply a reversed diff
<niemeyer> wrtp: It's not free of side effects, unfortunately
<wrtp> niemeyer: thanks.
<niemeyer> wrtp: The old branch won't merge properly anymore, so it'll need to be recreated  by branching and applying the forward diff again
<wrtp> niemeyer: i'll probably just build the branch from scratch again.
<wrtp> lunch
<niemeyer> wrtp: Enjoy
<wrtp> pwd
<niemeyer> :)
<TheMue> Aram: after unit, as said, i'm now starting with relation units (first step w/o presence and watcher)
<niemeyer> I have a scheduled visit to the doctor.. back soon
<wrtp> TheMue, Aram: a fairly small cleanup of the juju.Conn interface: https://codereview.appspot.com/6488077
<wrtp> niemeyer: ^
<TheMue> wrtp: you've got a review
<wrtp> TheMue: thanks
<TheMue> wrtp: yw
<wrtp> TheMue: about that comment - you're probably right, but i think i'll leave it be for the time being, as it's an unrelated change.
<TheMue> ok
 * niemeyer waves
<wrtp> niemeyer: hiya
<niemeyer> wrtp: Review sent
<wrtp> niemeyer: ooh you're a lovely man
<niemeyer> Lunch time, biab
<niemeyer> wrtp: :-)
<TheMue> so, sports is calling, cu tomorrow
 * niemeyer is back
<niemeyer> and heading to a meeting
<niemeyer> I suppose
<wrtp> niemeyer: is that CL LGTY with the suggested change?
<niemeyer> wrtp: Yeah, it does look reasonable
<wrtp> niemeyer: thanks
<wrtp> niemeyer: done. will submit.
<niemeyer> wrtp: Thanks!
<wrtp> niemeyer: good to get one more dependency out of the way!
<niemeyer> wrtp: Yeah, feels good doesn't it :)
<niemeyer> wrtp: Progress!
<wrtp> niemeyer: yeah.
<wrtp> niemeyer: upgrade-juju is done; just writing the tests.
<wrtp> niemeyer: i have an idea for the provisioner agent stuff - i'll see if it works out.
<niemeyer> wrtp: Woohay
<wrtp> niemeyer: if i'm right (and you think it's ok) it could save lots of time and code. we'll see.
<niemeyer> wrtp: Sweet
<niemeyer> wrtp: I'm just finishing the presence stuff with docs and whatnot
<wrtp> niemeyer: right, that's me for the day. good to finish with a submit...
<niemeyer> wrtp: +1
<wrtp> niemeyer: have a good rest-of-day!
<niemeyer> wrtp: Have a good evening
<niemeyer> wrtp: thanks!
<wrtp> niemeyer: will do
<davecheney> heads up, while i'm waiting for a few things to stabilise I'm doing juju ssh
<davecheney> i'll do scp at the same time as it is very similar
<niemeyer> davecheney: Cheers
<davecheney> niemeyer: could you have a read of the email I sent to you and Aram yesterday about mstate.Open/Dial
<davecheney> when you have a chance
<niemeyer> davecheney: It does support TLS
<niemeyer> davecheney: But I think we can chew one thing at a time
<niemeyer> davecheney: We'll have to define a new protocol to establish the connection
<davecheney> niemeyer: this is mainly for juju.Conn, what will replace the current UseSSH flag on state.Info ?
<davecheney> I guess I can do UseTLS and then define mstate.Dial/DialTLS
<niemeyer> davecheney: I think we should make mstate work entirely before we shift that over
<niemeyer> davecheney: Because otherwise we'll have new issues
<davecheney> of course
<davecheney> i'm not trying to put the cart before the horse
<niemeyer> davecheney: How to validate the certificate, how to provide the initial password, etc
<davecheney> yeeesh
<davecheney> all that :(
<davecheney> so, do you want mongo tunneled over ssh ?
<niemeyer> davecheney: UseSSH should at least so we can take the blocker out of everybody's plate
<davecheney> understood
<niemeyer> davecheney: Yeah, same thing we're doing now with ZooKeeper
<niemeyer> davecheney: After we get everything else working, we can fix that
<davecheney> sounds like a good plan to me
<davecheney> i'm all booked for lisbon, gonna get there on time this time :)
<niemeyer> davecheney: That's great to hear. We'll have a good time there
<niemeyer> I'll get some dinner and bbiab
<davecheney> niemeyer: we have unit.AssignedMachineId()
<davecheney> would I be able to add
<davecheney> (or replace it with)
<davecheney> unit.AssignedMachine
<davecheney> as most of the other machine related methods on unit take a *Machine, not an id
#juju-dev 2012-09-04
<niemeyer> davecheney: The use cases we have will dictate what's the best API
<davecheney> i'll leave it as a TODO for now
<davecheney> it's only one extra line
<niemeyer> davecheney: It certainly makes sense from a high-level perspective
<niemeyer> davecheney: I don't know how we use it, though
<davecheney> % juju ssh mongodb/2 -- uptime
<davecheney> Warning: Permanently added 'ec2-184-72-139-96.compute-1.amazonaws.com,184.72.139.96' (ECDSA) to the list of known hosts.
<davecheney>  00:49:01 up 21 min,  1 user,  load average: 0.08, 0.03, 0.05
<niemeyer> davecheney: WOah!
<davecheney> niemeyer: there is a small issue currently
<davecheney> you need the --'s to tell gnuflag to stop consuming arguments
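The `--` convention mentioned above can be sketched in Go. This is only an illustration of the idea, assuming a hypothetical helper `splitAtTerminator`; it is not how gnuflag or `juju ssh` is implemented.

```go
package main

import "fmt"

// splitAtTerminator is a hypothetical helper (not part of gnuflag).
// It splits args at the first "--": everything before it is left for
// the command's own flag parsing, everything after it is passed
// through untouched (for example to the remote ssh command).
func splitAtTerminator(args []string) (own, rest []string) {
	for i, a := range args {
		if a == "--" {
			return args[:i], args[i+1:]
		}
	}
	// No terminator: every argument belongs to the command itself.
	return args, nil
}

func main() {
	own, rest := splitAtTerminator([]string{"mongodb/2", "--", "uptime", "-p"})
	fmt.Println("own args:", own)
	fmt.Println("passed through:", rest)
}
```

Without the `--`, a flag parser that accepts flags anywhere on the line would try to interpret `-p` as one of its own flags.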
<davecheney> but it works
<wrtp>  fwereade__: mornin'
<fwereade__> wrtp, heyhey
<wrtp> fwereade__: how's it going?
<fwereade__> wrtp, good, I'm pretty sure
<wrtp> fwereade__: great!
<fwereade__> wrtp, the upgrader stuff is now (I think) approaching reviewable
<fwereade__> wrtp, 3rd attempt :)
<wrtp> fwereade__: well, it's not easy stuff...
<fwereade__> wrtp, it has helped me to see a specific area I need to work on though... I've realised that, these days, once something's merged I'm *much* more resistant to even considering changing it
<wrtp> fwereade__: interesting. i should watch out for that too.
<fwereade__> wrtp, I suspect it has to do with the amount of effort we put into getting things right the first time
<fwereade__> wrtp, but it took me an annoyingly long time to see that hook state and charm state (where state here implies local, persistent) are really just aspects of *uniter* state
<fwereade__> wrtp, and that life gets much easier if they're integrated
<wrtp> fwereade__: that is true. but all that initial effort is usually predicated on some imagined use cases, and if those don't pan out, things might be better changing.
<fwereade__> wrtp, indeed so
<wrtp> fwereade__: i find a source of endless wonder from the way that getting an abstraction right can make whole swathes of difficulty just drop away.
<wrtp> fwereade__: i have this feeling that *somewhere* in there are the *real* "design patterns"
<fwereade__> wrtp, haha, yeah -- it's a great feeling when it happens
<wrtp> fwereade__: can i run an idea past you?
<fwereade__> wrtp, ofc
<wrtp> fwereade__: the next stage with agent upgrading is to do unit upgrade, which should be trivial. but the next after that is the provisioning agent, which gustavo suggested a while ago might require a whole bunch of infrastructure in state. e.g. NewProvisioner, Provisioner.AssignToMachine etc etc
<fwereade__> wrtp, hmmm, interesting
<wrtp> fwereade__: it seems to me that almost all of the latter is currently covered by state.Unit
<fwereade__> wrtp, yeah, things-managed-by-a-machine-agent sounds like a category worthy of investigation
<wrtp> fwereade__: and that i should be able to side step all of it by getting the bootstrap machine to start a hidden service ("juju-provisioner" say) and have the machine agent know about that special service and start the agent directly rather than starting a unit agent for it.
<fwereade__> wrtp, gut says +1 to first bit, -1 to second bit
<wrtp> fwereade__: i wondered about reserving all service names starting "juju-*"
<fwereade__> wrtp, that definitely sgtm
<wrtp> fwereade__: i think it's important that the provisioner not be started by a unit agent
<fwereade__> wrtp, why is that so?
<wrtp> fwereade__: because the whole reason for this is to have a place to put the agent version for upgrades
<wrtp> fwereade__: and if it's started by the unit agent, the unit agent uses that
<fwereade__> wrtp, I think you need to expand on "a place to put the agent version for upgrades" before I can follow properly
<wrtp> fwereade__: ok, so each agent puts their current running version in the state so that you can see what's upgraded to what
<fwereade__> wrtp, ok, yep
<wrtp> fwereade__: we've got {Machine,Unit}.SetAgentVersion
<wrtp> fwereade__: but nothing for Provisioner, because there's no such abstraction in State currently.
<wrtp> fwereade__: i don't see this as a special-case hack for the provisioner BTW; i see it as a way of being able to add *any* new agent at minimal cost.
<wrtp> fwereade__: for instance, separating the firewaller from the provisioner would be trivial.
<wrtp> fwereade__: actually, i don't think the machine agent itself would need any logic for this - i think container.Deploy(unit) could see that the service is juju-* and deploy "jujud \1"
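A minimal sketch of the mapping wrtp describes (a reserved `juju-*` service name turning into a `jujud <name>` invocation), assuming a hypothetical function `agentCommand`. It does not reflect the real `container.Deploy` code, which takes a unit rather than a string.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedPrefix marks service names reserved for juju's own agents,
// following the "juju-*" idea from the discussion.
const reservedPrefix = "juju-"

// agentCommand is a hypothetical helper. For a service named
// "juju-<agent>" it returns the command line "jujud <agent>";
// for any other service it reports ok == false, meaning a normal
// unit agent would be deployed instead.
func agentCommand(serviceName string) (cmd []string, ok bool) {
	if !strings.HasPrefix(serviceName, reservedPrefix) {
		return nil, false
	}
	agent := strings.TrimPrefix(serviceName, reservedPrefix)
	if agent == "" {
		return nil, false
	}
	return []string{"jujud", agent}, true
}

func main() {
	for _, name := range []string{"juju-provisioner", "mysql"} {
		cmd, ok := agentCommand(name)
		fmt.Println(name, cmd, ok)
	}
}
```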
<fwereade__> wrtp, I'm still generally +1 on the idea that juju components should be services wherever practical
<fwereade__> wrtp, but this feels like a digression slightly
<wrtp> fwereade__: yeah. the controversial part of this is that there would be a unit without a unit agent.
<fwereade__> wrtp, I think I'm missing something, because I'm still (dogmatically?) opposed to the concept and the reason it's necessary has not yet clicked
<wrtp> fwereade__: is it possible that we could use the charm upgrade mechanism to upgrade the provisioning agent?
<fwereade__> wrtp, ha, that hadn't clicked... but offhand *perhaps* no reason why not
<wrtp> fwereade__: it seems a little inconsistent, given that machine agents are done differently.
<fwereade__> wrtp, although the *charm* itself wouldn't be changing much I'd imagine
<fwereade__> wrtp, seems more like a service-config thing really
<wrtp> fwereade__: indeed.
<wrtp> fwereade__: ah, that's interesting.
<wrtp> fwereade__: BTW is there anything that the unit agent is *required* to do, other than maintain its presence node?
<fwereade__> wrtp, eg I think there are a couple of charms that let you switch between stable and bleeding edge for example
<fwereade__> wrtp, restate please, in a sense it's required to do everything it does ;)
<fwereade__> wrtp, set unit status and charm version in state
<fwereade__> wrtp, and "participate in relations"
<wrtp> fwereade__: assume no relations
<fwereade__> wrtp, the rest doesn't touch external state IIRC
<fwereade__> wrtp, then presence, unit status, charm version
<wrtp> fwereade__: i'm wondering if a way to look at it could be that the provisioning agent is its own unit agent.
<fwereade__> wrtp, I'm still not following something fundamental
<fwereade__> wrtp, I had this vague understanding that each individual agent was watching juju version and upgrading itself?
<wrtp> fwereade__: yes, that's right. juju version is now global.
<wrtp> fwereade__: and it sets something in the state to say when it's upgraded.
<wrtp> fwereade__: which status prints out, for example.
<wrtp> fwereade__: and which we'll need to watch when we do major-version upgrades.
<fwereade__> wrtp, yep, ok
<wrtp> fwereade__: i'm not sure how that would work if the PA was started by a uniter
<wrtp> fwereade__: but if the PA is its own uniter, it's trivial.
<fwereade__> wrtp, yeah, without some out-of-band signalling mechanism to the charm it's problematic
<wrtp> fwereade__: the problem is really that we don't have any out-of-band signalling mechanism *from* the charm, right?
<fwereade__> wrtp, the idea *feels* like ++confusion not --confusion... but it could be that point when you're pulling on a tangled mass of slinky just before it reconfigures back into the neat form
<fwereade__> wrtp, I can live with that perspective too :)
<wrtp> fwereade__: thing is, i *think* that in terms of lines of code, it will be almost negligible
<wrtp> fwereade__: like three or four lines in container, a couple of lines in status to exclude juju-* services from printing by default, and a little stuff in jujud bootstrap-init to actually set up the service.
<wrtp> fwereade__: and that's what attracts me to the idea.
<wrtp> fwereade__: and the alternative previously suggested is to build an entirely new edifice in state with many methods and tests, just to do something we already *almost* do.
<fwereade__> wrtp, I still feel like unitless services (or reimplementations of the uniter) are kinda smelly
<fwereade__> wrtp, there are definitely good things floating around in this space though
<fwereade__> wrtp, wait a mo
<fwereade__> wrtp, is there much meaningful distinction between the environment config and the uniter's service config?
<fwereade__> wrtp, it suddenly feels like they're (almost) exactly the same thing
<wrtp> fwereade__: it's a *config.Config, not a *state.ConfigNode
<fwereade__> wrtp, couple of conceptual levels up though
<wrtp> fwereade__: but that's only a recent thing
<fwereade__> wrtp, what "is" the environment config if not the config of the juju service?
<wrtp> fwereade__: that seems like a nice way to look at it
<wrtp> fwereade__: that would mean we could put the machine agent under this umbrella too
<wrtp> maybe
<fwereade__> wrtp, I think it *might* be, but I suspect the ramifications will be somewhat hefty even if it is
<fwereade__> wrtp, hmm, I had still been thinking of the MA as the root of everything, maybe we can take it a step further but it feels wrong somehow
<wrtp> fwereade__: ooh, that might actually solve some problems down the line
<wrtp> fwereade__: like... the machine agent doesn't need to know environment secrets, but the provisioner does.
<fwereade__> wrtp, one MA per machine which is responsible for starting the things that should run on that machine -- everything else is services
<wrtp> fwereade__: perhaps you could think of the MA as just one unit on a machine. it's a bit special, because it's started by starting the instance, but maybe...
<wrtp> fwereade__: but yes, that's what i was thinking.
<fwereade__> wrtp, it does all definitely sound worthy of investigation :)
<wrtp> fwereade__: but that does imply uniterless units, i think
<fwereade__> wrtp, indeed, we need an escape hatch *somewhere* in order to do things that nicely
<wrtp> fwereade__: unless the *uniter* is the first thing to run on a machine
<wrtp> fwereade__: can a hook do config-set, BTW?
<fwereade__> wrtp, nope, which also may be a problem
<fwereade__> wrtp, the only bits of state a unit agent even *can* write are status, charm, presence, and relation stuff
<wrtp> fwereade__: it's not really a problem. i kinda feel that each unit should have its own config actually.
<fwereade__> wrtp, and I'm starting to like the idea that the MA could be a unit of the juju-machines service
<wrtp> fwereade__: but that probably goes against the whole concept of homogeneous units
<fwereade__> wrtp, but I'm a little concerned about that depth of the rabbit hole
<wrtp> fwereade__: yeah, that's definitely a bigger thing than i was suggesting.
<fwereade__> wrtp, I'm mainly just worried that this whole area of investigation is just the innocuous entryway to a rabbit megalopolis that we aren't really in a position to explore right now ;)
<wrtp> fwereade__: i think i'll give the PA idea a play and see what the code looks like.
<fwereade__> wrtp, cool
<wrtp> fwereade__: thanks for the discussion - i now know the points to be careful about when proposing the idea!
<fwereade__> wrtp, haha, glad it was useful :)
<wrtp> davecheney: hiya
<wrtp> davecheney: if you're feeling so inclined, i'm looking for reviews of https://codereview.appspot.com/6490067/
<davecheney> wrtp: OH MY GOD
<davecheney> today has been such a drama
<davecheney> at one point I had three separate tradesmen here
<davecheney> the tiler managed to short out my power
<wrtp> davecheney: lol
<davecheney> then the electrician had to come
<davecheney> and then the guy from the power company because the electrician noticed the voltage was a bit high
<davecheney> and now the flooring guy just turned up to do a moisture test
<davecheney> anyway
<davecheney> drama is over
<davecheney> i wish I was back in lisbon
<davecheney> most of the renovations happened when we were there
<wrtp> davecheney: do you know this song? http://www.youtube.com/watch?v=zyeMFSzPgGc
<davecheney> wrtp: i don't think this is a situation for levity
<wrtp> davecheney: you're probably right. but i've found that song to be spot on many times :-)
<davecheney> lol
<davecheney> wrtp: i'll review your CL in two secs
<wrtp> davecheney: thanks
<davecheney> just submitting one that has been waiting for my intertubes to return
<wrtp> davecheney: there's a follow-up too
<davecheney> wrtp: i've been enjoying watching your cleanups
<davecheney> refactor !!
<wrtp> davecheney: thanks
<davecheney> wrtp: for you https://codereview.appspot.com/6499071/
<wrtp> davecheney: do it when you see it, i reckon
<davecheney> wrtp: absolutely
<wrtp> davecheney: i haven't looked at more than the title, but definite +1 - i thought of that before. but... i wondered if there was a particular reason for it.
<wrtp> davecheney: i.e. is there a time when it's valid to have an assigned machine id but no actual machine
<wrtp> ?
<wrtp> davecheney: "
<wrtp> All the non test consumers of this
<wrtp> method are actually after a *Machine, not the id anyway
<davecheney> wrtp: i can't see how, you have to pass a *state.Machine to unit.AssignToMachine
<wrtp> "
<wrtp> not *quite* true - status.go only uses the id :-)
<davecheney> status.go can suck it up
<wrtp> lol
<davecheney> firewaller also uses it, but it is a reasonable change
<davecheney> i ran across this writing juju ssh
<wrtp> davecheney: can you remove a machine without removing the units assigned to it?
<davecheney> wrtp: this is overall too much code to turn a unit name into an instance
<davecheney> http://paste.ubuntu.com/1185173/
<davecheney> wrtp: i don't know the answer to that question
<wrtp> davecheney: it's an important question to answer in the context of that CL
<wrtp> davecheney: if you can, then it's a reason for AssignedMachineId
<wrtp> davecheney: or... maybe we say it can return nil.
<wrtp> or an error
<davecheney>         return u.st.Machine(keySeq(machineKey))
<davecheney> will return the error
<wrtp> davecheney: seems reasonable.
<fwereade__> wrtp, davecheney: IIRC you cannot remove machines until they're empty
<davecheney> that sounds reasonable
<fwereade__> wrtp, davecheney: and AFAIK there is nothing in place that allows users to move units from one machine to another, but AIUI that will not necessarily always be the case
<wrtp> davecheney: i think we should have State.Unit - then your code could be: http://paste.ubuntu.com/1185191/
<davecheney> wrtp: that was infact the first method I reached for
<davecheney> wrtp: and wouldn't the unit's name be "unit/0"
<davecheney> so I wouldn't have to split c.Target at all
<wrtp> davecheney: indeed.
<wrtp> davecheney: but... i'm not entirely sure.
<wrtp> davecheney: in fact, yeah, that's right
<davecheney> func (s *State) Unit(name string) (unit *Unit, err error)
<davecheney>     Unit returns a unit by name.
<davecheney> DOH!
<davecheney> i'm a fuckwit
<davecheney> i think I didn't use that initially because I had the id of the unit/NUM
<wrtp> davecheney: oh yeah, i just looked for that and missed it
<davecheney> and I was frustrated that i had an int, but needed a string
<davecheney> still, i need to get from Unit -> assigned Machine -> Instance
<wrtp> davecheney: yeah. i think returning a *Machine makes total sense.
<wrtp> davecheney: i can't think of any case where the id is useful or even significant by itself.
<davecheney> and Machine.Id() does not return an err
<davecheney> so it is much easier to use than state.Machine(id)
<davecheney> which does
<wrtp> davecheney: indeed.
<wrtp> davecheney: yup
<davecheney> wrtp: http://paste.ubuntu.com/1185220/
<davecheney> getting better, and I can reduce it further with my other CL
<wrtp> davecheney: not entirely sure
<wrtp> davecheney: the Unit request can fail because of more than just the name being malformed,
<wrtp> davecheney: i think that strings.Index(name, "/") > 0 might be a better test
<wrtp> davecheney: that way you don't incur the round trip when it's not a unit
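The local check wrtp suggests can be sketched as below. The function name `looksLikeUnitName` is an assumption for illustration; the check only decides whether a target is worth a state round trip, and full validation of the name is still left to the state package.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeUnitName is a hypothetical helper. It reports whether
// target has the rough shape "service/N": a "/" that is present and
// not the first character. It deliberately does not validate the
// name any further.
func looksLikeUnitName(target string) bool {
	return strings.Index(target, "/") > 0
}

func main() {
	for _, target := range []string{"mysql/0", "0", "/0"} {
		fmt.Printf("%q unit-like: %v\n", target, looksLikeUnitName(target))
	}
}
```

A target such as `0` fails the check and can be treated as a machine id without asking the state for a unit first.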
<davecheney> wrtp: will do, the python code also included that logic
<davecheney> wrtp: speaking of round trips
<davecheney> adding mstate.Info
<davecheney> we're going to reuse the UseSSH logic to establish a tunnel to mgo
<davecheney> but in the future there exists a possibility of using TLS directly
<wrtp> davecheney: i'd love that ssh tunnelling code to disappear
<wrtp> davecheney: it was a right pain to write, and i still don't fully trust it
<davecheney> given we control the mgo code, it might be easier to use the go.crypto ssh package
<davecheney> but that is bordering on out of scope
<davecheney> i am very confident of the tcp forwarding code, several of us have been banging on that for months
<davecheney> wrtp: the state already checks the unit name for us
<davecheney> % juju ssh mysql/q
<davecheney> error: cannot get unit "mysql/q": "mysql/q" is not a valid unit name
<davecheney> wrtp: btw +1 to your change to always make a state connection when we open juju.Conn
<wrtp> davecheney: good point.
<davecheney> anyway, i'll leave that CL for the moment
<davecheney> the tests for it are going to be a day's work
<davecheney> we have to mock *everything*
<davecheney> wrtp: sorry, still reading your CL
<wrtp> davecheney: which CL is that?
<davecheney> the one you pasted me 30 mins ago
<wrtp> davecheney: i mean, the one you're gonna leave
<davecheney> oh, juju ssh
<davecheney> i haven't proposed that yet
<davecheney> will finalise it tomorrow
<davecheney> i'm sort of jumping between a few at the moment
<davecheney> in good news, juju deploy is getting very solid
<davecheney> juju bootstrap ; juju deploy mysql ; juju deploy mongodb ; juju add-unit mongodb ; juju add-unit -n2 mysql ; juju status
<davecheney> just works
<wrtp> davecheney: cool
<davecheney> wrtp: in reusing the UseSSH code from state/ssh.go
<davecheney> do you think it is worth moving it into another package
<davecheney> or just copy it to mstate as we're nix'ing state soon ?
<wrtp> davecheney: the latter
<davecheney> wrtp: roger, roger
<wrtp> davecheney: i want that code to go!
<wrtp> davecheney: and giving it its own package is not the way forward :-)
<davecheney> wrtp: no, we don't want to give it an endorsement
<davecheney> wrtp: https://bugs.launchpad.net/mgo/+bug/1045678
<wrtp> davecheney: +1.
<wrtp> davecheney: or even an io.ReadWriteCloser
<davecheney> sure, if mgo doesn't need to know Local/RemoteAddr
<Aram> moin.
<davecheney> hey
<wrtp> davecheney: those changes could be handled in a separate CL, but it's only 9 lines of changes total, with no impact on other code, and they were both as a result of problems i encountered when putting this CL together.
<davecheney> wrtp: i'd like to see them split out, but it's not a strong objection
<wrtp> davecheney: ok, i'll split 'em out. there's no dependency on them i think.
<wrtp> davecheney: BTW i suspect you might have been the one that used Errorf in opRecvTimeout. any particular reason for it?
<Aram> anyone seen mramm lately?
<TheMue> good morning
<Aram> morning.
<fwereade__> right, I think I might actually be happy with the charm upgrades at last -- taking a short break before polish and propose
<fwereade__> (good mornings to those I have not thus far addressed)
<niemeyer> Gooood morning rockstars
<fwereade__> niemeyer, heyhey
<TheMue> niemeyer: hiya
<fwereade__> huh:
<fwereade__> runtime: signal received on thread not created by Go.
<fwereade__> FAIL	launchpad.net/juju-core/state	21.458s
<fwereade__> anyone seen that before?
<davecheney> fwereade__: yes
<davecheney> not in state
<fwereade__> davecheney, is it something I did? :)
<Aram> no.
<Aram> it happens sometimes.
<fwereade__> is there a bug for it?
<davecheney> it's an artifact of the gozk c bindings
<Aram> it's poor zookeeper cgo interaction.
<davecheney> fwereade__: probably should raise one
<fwereade__> ah, ok, cool, I'll go and do that
<Aram> there is one bug in the Go tracker.
<fwereade__> Aram, for gozk?
<davecheney> Aram: there is a googler who is playing in that area at the moment
<Aram> not for gozk, for cgo.
<Aram> cgo is at fault.
<fwereade__> Aram, ahh, ok
<Aram> nothing we can do, sadly.
<fwereade__> Aram, very well, I shall wait for state to go away entirely then :)
<davecheney> jolly good, carry on
<davecheney> fwereade__: longer version, it is an argument about who owns the signal handler
<davecheney> if a signal is delivered to a thread while in c code, the go signal handler doesn't have the proper registers setup to handle it
<davecheney> so it has to panic
<fwereade__> davecheney, good to know, thank you
<davecheney> fwereade__: there isn't much of a solution at the moment, apart from 'don't use so much cgo you darn kids'
<davecheney> c'mon, let's get this show on the road
<wrtp> davecheney: ah! i'd wondered where that came from.
<niemeyer> Party time!
<niemeyer> Invites sent
<Aram> https://plus.google.com/hangouts/_/3016131d787ecda60f236d1dec4e32264e3353ad?authuser=0&hl=en
<niemeyer> davecheney, fwereade__, wrtp, TheMue, Aram
<Aram> "google can't load CSS", wow, never seen that one.
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1045151
<davecheney> niemeyer: Aram ^^
<davecheney> state.waitForInitialised
<Aram> that doesn't require a watcher though.
<TheMue> lunchtime
<fwereade__> niemeyer, https://codereview.appspot.com/6500072/ -- I hope this one meets with your approval :)
<fwereade__> niemeyer, the only way I could see to split it was uncomfortably artificial -- without the upgrade modes in place, the changes to the other modes looked unpleasantly arbitrary to me
<Aram> is there a way to turn line numbers off in codereview?
<niemeyer> fwereade__: Sure.. I'll have a look
<niemeyer> fwereade__: There's some low-hanging fruit to split there, but I'll look as-is
<fwereade__> niemeyer, indeed, I guess a couple of types could indeed have been implemented independently without affecting anything else, but then I'd end up with both used and unused code for doing the same thing, and that confuses me ;)
<niemeyer_> fwereade__: Cool, I'd just appreciate if things were a bit more broken down overall.. your branches have been consistently large
<niemeyer_> fwereade__: This is not only tricky for review, but also creates that frustration where it feels like the branch is always coming back because it's all in the same hunk
<fwereade__> niemeyer_, indeed, I know they have been less than ideal lately, I shall try to do better :)
<wrtp> niemeyer_: i don't think of it as a trivial point. concurrent code is not easy to grok, and every extra select is another bifurcation point that makes things a little harder to grasp. that was the motivation behind my original comments. i apologise for the diversion though.
<niemeyer_> wrtp: I apologize, but all you're doing is insistently complaining that I'm writing down safer concurrent code.
<niemeyer_> wrtp: Read my original point and ponder about it for half a second.
<wrtp> niemeyer_: depends what we mean by "safer" i guess. i like being able to trace an obvious no-conditions path through the code. with the select, a reply can be lost, which isn't obvious; i think it's easier to verify that replies are always sent. i appreciate that this is a matter of taste though.
<niemeyer_> wrtp: I hope I was clear by what I meant by safer in the review.
<wrtp> "if the other side decides to abort in the middle of handling this due to Dying" seems a bit like "if you forget to call file.Close" to me. yes you can forget it, but it's easy to verify, whereas concurrent states aren't quite as easy.
<wrtp> niemeyer_: ^
<fwereade__> late lunch, bbiab
<wrtp> fwereade__: hmm lunch, good idea
<niemeyer_> wrtp: No, it means that if we do any operation in the middle that can fail, exactly as we already have in other cases such as refresh, we have an unperceived deadlock in the code.
<niemeyer> wrtp: That's writing safer concurrent code. If you're unhappy with that, I'll have to apologize.
<wrtp> niemeyer: it depends where you see the locus of control, i think. with RPC-style channel comms like this, i think it's reasonable to think of sending the RPC request as handing control to the central routine. then it's up to that routine to reply appropriately if something fails.
<wrtp> niemeyer: both approaches are "safe". one has slightly simpler code which is easier to verify IMHO.
<niemeyer> wrtp: It's sensible to think like this, except it's wrong in some cases, and thus not safe.
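The two positions can be made concrete with a small sketch of the select-based form niemeyer argues for, where the requester also watches a dying channel. All names here (`server`, `request`, `Alive`, `errDying`) are assumptions for illustration and are not taken from the presence package.

```go
package main

import (
	"errors"
	"fmt"
)

var errDying = errors.New("watcher is dying")

// request is an RPC-style message: the caller sends it to the central
// loop and waits for an answer on reply.
type request struct {
	key   string
	reply chan bool
}

type server struct {
	reqs  chan request
	dying chan struct{}
}

func newServer() *server {
	return &server{reqs: make(chan request), dying: make(chan struct{})}
}

// loop is the central goroutine. The reply channel is buffered, so
// this send never blocks even if the caller has already given up.
func (s *server) loop(alive map[string]bool) {
	for {
		select {
		case <-s.dying:
			return
		case req := <-s.reqs:
			req.reply <- alive[req.key]
		}
	}
}

// Alive selects on dying at both steps. If the loop aborts before
// accepting the request or before replying, the caller gets errDying
// instead of blocking forever. The cost, as wrtp notes, is an extra
// branch: a reply that was sent may go unread.
func (s *server) Alive(key string) (bool, error) {
	req := request{key, make(chan bool, 1)}
	select {
	case s.reqs <- req:
	case <-s.dying:
		return false, errDying
	}
	select {
	case ok := <-req.reply:
		return ok, nil
	case <-s.dying:
		return false, errDying
	}
}

func main() {
	s := newServer()
	go s.loop(map[string]bool{"u/0": true})
	fmt.Println(s.Alive("u/0"))

	// A server whose loop is not running and which is already dying:
	// with unconditional sends and receives this call would deadlock.
	dead := newServer()
	close(dead.dying)
	fmt.Println(dead.Alive("u/0"))
}
```

The unconditional form wrtp prefers would replace both selects in `Alive` with a plain send and a plain receive, which is simpler to trace but relies on the loop always replying.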
<fwereade__> niemeyer, am I right to think it is acceptable for the UA to read from environment config to determine the provider type?
<niemeyer> fwereade__: Yes, I think it's fine for us to do that right now
<fwereade__> niemeyer, cool, cheers
<niemeyer> fwereade__: The UA shouldn't have access to the whole environment config, but when we fix that we can do whatever
<fwereade__> niemeyer, yeah, I understand we don't want it to have access long-term
<TheMue> niemeyer: one question about the usage of your presence.Watcher. i need Alive() for Unit.AgentAlive(). is it intended that every "user" of such a watcher creates its own, or shall State provide one watcher, e.g. unitw?
<niemeyer> TheMue: Single instance for State
<TheMue> niemeyer: ok
<TheMue> niemeyer: currently i have no overview. do we need watchers for all collections?
<niemeyer> TheMue: We need the watchers we'll actually use
<niemeyer> TheMue: The watchers being used by workers are a good guideline to start with
<TheMue> niemeyer: ok, i have to scan the code
<TheMue> niemeyer: thx
<niemeyer> TheMue: np
<niemeyer> TheMue: Are you busy or would you have a moment?
<niemeyer> TheMue: Would just like to run an idea by you that I exchanged with wrtp the other day, for the next time you get blocked
<TheMue> niemeyer: i'm integrating your presence in unit.AgentAlive() and co.
<TheMue> niemeyer: ok, i'm listening
<niemeyer> TheMue: Right now we have the handling of environment somewhat hardcoded in each different worker
<niemeyer> TheMue: It'd be nice to have a shared way to ensure a worker is handling the environment loading and reloading basically correctly
<niemeyer> TheMue: One idea we came up with was to have an observer channel being passed into the worker
<TheMue> niemeyer: yeah, i've seen that wrtp has been already active regarding provisioner and firewaller
<niemeyer> TheMue: If not nil, we send the environment into the channel every time the local "<w>.environ" property is set
<niemeyer> TheMue: So that we can create a shared test suite that is run against any worker
<niemeyer> TheMue: Makes sense?
<wrtp> niemeyer: i thought we decided that that wasn't a great way to go
<TheMue> niemeyer: a peephole ;)
<TheMue> niemeyer: yes, sounds good
<TheMue> wrtp: why not?
<niemeyer> Aug 31 10:11:06 <rogpeppe>      niemeyer: perhaps you could mention to TheMue the thoughts you had on the worker tests, re: the above provisioner_test.go CL.
<niemeyer> Aug 31 10:11:36 <niemeyer>      rogpeppe: Will do
 * wrtp goes back to look
<niemeyer> wrtp: My memories and this makes it feel like we agreed on something else
<niemeyer> wrtp: But I'm happy to talk again
<TheMue> niemeyer: the way right now with the op codes and the secret doesn't feel very good
<niemeyer> TheMue: Yeah, it's a good interim solution
<TheMue> niemeyer: exactly
<wrtp> niemeyer: hmm, i remembered more discussion than that
<niemeyer> TheMue: But it'd be nice to have a way we can reuse for all the workers, instead of having to hack it together
<niemeyer> wrtp: It was a G+ call
<TheMue> niemeyer: absolutely
<niemeyer> wrtp: Still, I have vivid memories of the agreement
 * wrtp wishes we could archive G+ calls
<niemeyer> wrtp: I recall even you saying you'd prefer to have the channel passing in through the New function
<niemeyer> wrtp: While my original proposal was to have a method
<TheMue> niemeyer: my favorite would be a method to set the channel. so it's nil by default and would be only set in tests.
<TheMue> niemeyer: could get a neat interface EnvironTester { SetEnvironChan(ch) }
 * wrtp is still trying to remember the context in detail
<wrtp> TheMue: the problem with that is that it has to interact with the worker to set the channel.
<wrtp> TheMue: and it would probably need a channel to do that, so we don't gain anything.
<niemeyer> wrtp++
<TheMue> wrtp: sh.., yes
<wrtp> niemeyer: i can't remember if we decided it was best to have a chan of environs.Environ or of *config.Config
<wrtp> ah, i remember!
<niemeyer> wrtp: It was environ the whole time, IIRC. The point was ensuring that it was properly loaded
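A minimal Go sketch of the observer-channel idea discussed above, assuming the channel is handed in through the constructor (the variant wrtp is said to prefer) and carries the environment itself. The names `worker`, `newWorker`, `setEnviron` and the stand-in `Environ` type are hypothetical and are not taken from the juju code.

```go
package main

import "fmt"

// Environ stands in for environs.Environ; only the name matters here.
type Environ struct{ Name string }

// worker is a hypothetical worker. observer is nil in production and is
// supplied only by tests that want to see every environment it loads.
type worker struct {
	environ  *Environ
	observer chan<- *Environ
}

// newWorker takes the observer at construction time, so nothing has to
// talk to a running worker just to install the channel.
func newWorker(observer chan<- *Environ) *worker {
	return &worker{observer: observer}
}

// setEnviron records the environment and, if an observer was supplied,
// reports it on the channel.
func (w *worker) setEnviron(e *Environ) {
	w.environ = e
	if w.observer != nil {
		w.observer <- e
	}
}

func main() {
	obs := make(chan *Environ, 1)
	w := newWorker(obs)
	w.setEnviron(&Environ{Name: "peckham"})
	fmt.Println((<-obs).Name)
}
```

A shared test suite could then be written once against the observer channel and run against any worker built this way.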
<TheMue> niemeyer: thx btw, will go in. the other one is WIP, i need to discuss something for it with you
<niemeyer> TheMue: Sure, what's up?
<TheMue> niemeyer: so far the agent alive flag has been a zk node below the entity node
<TheMue> niemeyer: now we don't have that hierarchy.
<niemeyer> TheMue: Indeed
<wrtp> niemeyer: my confusion had been about the discussion we had about an *incoming* channel
<wrtp> niemeyer: i didn't read your initial comment above properly, doh!
<niemeyer> wrtp: Ah, no worries
<TheMue> niemeyer: shall i add a collection for alive like for config nodes, with own namespaces "m/..." and "u/..."?
<TheMue> niemeyer: so presence.Watcher and Pinger would act on this collection.
<niemeyer> TheMue: Sorry, I kind of know what you're talking about, but given the way you phrase it I'm not sure
<TheMue> niemeyer: oh, sorry, will rephrase it.
<TheMue> niemeyer: the base question is, what shall be watched/pinged for Machine.AgentAlive and Unit.AgentAlive? We don't have own nodes for it anymore.
<niemeyer> TheMue: As a side note, I quickly agreed with Aram that we'll move towards using "m#...", "u#...", etc
<niemeyer> TheMue: This avoids the ambiguity of "m/1" being both a unit name and a resource key
<TheMue> niemeyer: ok, will change it in my code.
<niemeyer> TheMue: Okay, if I understand your question, for machine 1 we should watch/ping "m#1"
<TheMue> niemeyer: yes, exactly. or u#wordpress/1.
<niemeyer> TheMue: +1
<TheMue> niemeyer: cheers, will do it for units first and then machines
<TheMue> niemeyer: and both use the same collection
<niemeyer> TheMue: Btw, the pinger should live in a separate MongoDB database
<niemeyer> TheMue: "presence", probably
<niemeyer> TheMue: This means it gets its own write lock
<niemeyer> TheMue: Which is relevant given how much traffic this may end up generating, and how unrelated it is to the activity in "juju"
<TheMue> niemeyer: hmm, ok, have to take a look how that works.
<niemeyer> TheMue: No big deal.. just a different session.DB
<TheMue> niemeyer: but it sounds reasonable
<TheMue> niemeyer: ok
 * TheMue loves mgo
<niemeyer> TheMue: As another idea, I think we may end up having a type entity interface { entityKey() string }
<TheMue> niemeyer: +1
<niemeyer> TheMue: That unifies all things "C#ID"
<niemeyer> Or most, anyway
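A small Go sketch of the "C#ID" key scheme and the `entity` interface floated above. Only the interface shape and the two example keys ("m#1", "u#wordpress/1") come from the conversation; the `machine` and `unit` types here are simplified assumptions.

```go
package main

import "fmt"

// entity is the interface suggested in the discussion.
type entity interface {
	entityKey() string
}

// machine and unit are simplified stand-ins for the state types.
type machine struct{ id int }
type unit struct{ name string }

// The "#" separator avoids the ambiguity of "m/1" being both a unit
// name and a resource key.
func (m machine) entityKey() string { return fmt.Sprintf("m#%d", m.id) }
func (u unit) entityKey() string    { return "u#" + u.name }

func main() {
	for _, e := range []entity{machine{1}, unit{"wordpress/1"}} {
		fmt.Println(e.entityKey())
	}
}
```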
<wrtp> niemeyer: upgrade-juju command now works: https://codereview.appspot.com/6498085
<niemeyer> wrtp: Woohay!
<niemeyer> wrtp: Did you actually see the thing working live?
<wrtp> niemeyer: i'll just try it now. i have the utmost confidence :-)
<niemeyer> wrtp: Man, this is so awesome
<wrtp> niemeyer: ('cos it actually doesn't do much)
 * wrtp goes for a drink while the amazon test runs
<wrtp> jeeze it's slow
<wrtp> *still* uploading the same set of tools.
<wrtp> maybe it's hung. you just canna tell.
<niemeyer> https://codereview.appspot.com/6490067/diff/3018/cmd/juju/cmd_test.go
<niemeyer> wrtp: On that CL,
<niemeyer> wrtp: time.Sleep(500 * time.Millisecond)
<wrtp> niemeyer: yeah, looks spurious
<niemeyer> wrtp: Weren't you complaining about timings just yesterday?
<niemeyer> :)
<niemeyer> wrtp: Let's please not do this
<wrtp> niemeyer: i'm just removing it - i think it was from a debug session
<niemeyer> wrtp: I can imagine why you had it there
<niemeyer> wrtp: I don't think the error reordering is entirely right either..
<wrtp> niemeyer: it makes for a much more useful diagnostic
<niemeyer> wrtp: Ah, you've added buffering to opc
<wrtp> niemeyer: if the command has died with an error, you get to see the error rather than a "timed out" message
<wrtp> niemeyer: yeah
<niemeyer> wrtp: Okay, I guess that should work
<wrtp> niemeyer: i made the change because the diagnostics were terrible when it failed...
<niemeyer> wrtp: +1
<wrtp> niemeyer: will remove the sleep pronto
<niemeyer> wrtp: Can we bump a bit further along the 20 number? Like 256?
<niemeyer> wrtp: We'll never see the wasted memory, and it may save us precious debugging time some day
<wrtp> niemeyer: sure. although given that it's only 2 in practice...
<niemeyer> wrtp: Okay, 64 then ;)
<niemeyer> wrtp: The point is really to get unrealistically large
<niemeyer> wrtp: So we don't have to debug a most extraneous bug some day
<wrtp> niemeyer: i thought 20 already was, but will change. 50 perhaps. can't really see why a power of two is useful here.
<niemeyer> wrtp: Although, the whole logic will probably need some caring for :(
<niemeyer> wrtp: Since it has hardcoded ops
<wrtp> niemeyer: that's probably true. but not in scope for this CL
<wrtp> niemeyer: i've wondered about giving dummy.Listen a mask of operations we're interested in.
<wrtp> niemeyer: then hardcoded ops would be just fine, i think.
<niemeyer> wrtp: I'd rather aim for more useful tests..
<niemeyer> c.Check((<-opc).(dummy.OpPutFile).Env, Equals, "peckham")
<niemeyer> What is this telling us really?
<niemeyer> That a method was called, in whatever circumstances, with whatever arguments..
<niemeyer> THUS, IT MUST WORK!
<wrtp> niemeyer: agreed. we should test that the tools now exist in the environment
 * niemeyer <= mocking disbeliever
<wrtp> niemeyer: and that we can connect to the state, etc.
<wrtp> niemeyer: that's much easier with the infrastructure we have now actually
<niemeyer> wrtp: Yeah
<wrtp> niemeyer: i'll file a bug
<niemeyer> wrtp: Cheers.. something for us to prize, if nothing else :)
<wrtp> niemeyer: done. also, sleep removed and buffer size increased.
<niemeyer> wrtp: Cheers
<fwereade__> wrtp, i'm running a machine agent on EC2 in the hope that it will deploy a unit agent for me, but it appears to be stuck opening state... does this sound familiar to you?
<wrtp> fwereade__: how long have you waited?
<fwereade__> wrtp, maybe 10 mins
<wrtp> fwereade__: wait another 5 mins perhaps
<fwereade__> wrtp, why would that happen?
<wrtp> fwereade__: what do the log messages say?
<fwereade__> wrtp, 2012/09/04 15:28:32 JUJU state: opening state; zookeeper addresses: ["ec2-107-20-89-226.compute-1.amazonaws.com:2181"]
<fwereade__> wrtp, nothing after that
<wrtp> fwereade__: have you tried sshing to the machine?
<fwereade__> wrtp, that's where I got the log from; is there something else I should be looking for?
<wrtp> fwereade__: ah, so that message is from the machine agent log?
<fwereade__> wrtp, yeah
<wrtp> fwereade__: did you run the initial bootstrap with --verbose --debug?
<fwereade__> wrtp, gah, no
<wrtp> fwereade__: ISTR that that flag propagates through to the bootstrap machine and thence to the machines it starts
<wrtp> fwereade__: have you tried sshing to the machine it's trying to connect to?
<wrtp> fwereade__: in fact, have you tried juju status?
<fwereade__> wrtp, since it's the state machine I'm implicitly sshing to with every command, no
<fwereade__> wrtp, status is fine I think
<wrtp> fwereade__: hmm, what about sshing to the machine agent's machine, then trying to ssh from that machine to the address listed above
<wrtp> fwereade__: perhaps it can't use the external address internally or something
<fwereade__> wrtp, ah-HA, security groups
<fwereade__> wrtp, opened 2181, stuff is happening
<fwereade__> wrtp, but...
<fwereade__> 2012/09/04 15:49:43 JUJU deploying unit etherpad-lite/0
<fwereade__> 2012/09/04 15:49:43 JUJU cannot deploy unit etherpad-lite/0: cannot find executable: exec: "jujud": executable file not found in $PATH
<fwereade__> 2012/09/04 15:49:43 JUJU machiner waiting for change
<fwereade__> wrtp, and I need to be off :(
<wrtp> fwereade__: hmm, isn't 2181 open for everything?
<fwereade__> wrtp, I have every hope that I will be on later...
<fwereade__> wrtp, juju-amazon just had 22
<wrtp> fwereade__: interesting. i'll check.
<wrtp> fwereade__: and cloud-init should set the PATH
<wrtp> fwereade__: good stuff anyway!
<wrtp> fwereade__: environs/ec2/ec2.go:664: 			// TODO authorize internal traffic
<wrtp> :-)
<niemeyer> wrtp: Sent a review..
<niemeyer> wrtp: It looks great in general
<wrtp> niemeyer: brill, thanks
<niemeyer> wrtp: There are a couple of things to sort out, one of them is connected to the test stuff we were just talking
<niemeyer> wrtp: The other is about environ config changes
<wrtp> niemeyer: yeah, i wondered how you'd take that
<niemeyer> wrtp: Unfortunately I'll have to step out for lunch to avoid a fight here, but I'll be back soon :)
<wrtp> :-)
<wrtp> niemeyer: i wanted to avoid timeouts
<niemeyer> wrtp: Sure.. that's a great goal.. but there are many ways to do it
<niemeyer> Anyway, lunch
<niemeyer> biab
<wrtp> niemeyer: enjoy!
<TheMue> niemeyer: presence integration in unit works great *hug*
<wrtp> fwereade__: i've got a fix for the internal traffic TODO. will propose tomorrow.
 * wrtp is off for the night. see y'all tomorrow
<niemeyer> wrtp: Night!
<fwereade__> wrtp, you rock :)
<niemeyer> fwereade__: ping
<fwereade_> niemeyer, pong
<niemeyer> fwereade_: yo
<fwereade_> niemeyer, how's it going?
<niemeyer> fwereade_: Was just wondering if you had any input on the ForceRefresh conversation with TheMue
<fwereade_> niemeyer, sorry, no
<niemeyer> fwereade_: But not to worry, it's late for you
<fwereade_> niemeyer, cath and laura are asleep, I'm weighing my options -- if I can help then I'm happy to
<niemeyer> fwereade_: Thanks, but that was all I had at least.. I was wondering if you had any insight from the uniter perspective on the issue, but it's not a big deal either way
<fwereade_> niemeyer, I was just thinking that I could ofc break down that CL I proposed earlier into (1) replace state files (2) add Deployer, extend GitDir (3) dump Manager, use Deployer, implement upgrades
<fwereade_> niemeyer, if you haven't already spent time on it then that might be a sensible use of my time before you get up tomorrow
<fwereade_> niemeyer, although none of the breaks there are quite as clean as I would hope
<niemeyer> fwereade_: I haven't spent time on it, and I wouldn't mind waiting until tomorrow morning as the plate is full ATM
<niemeyer> fwereade_: This will likely improve your progress on it too
<niemeyer> fwereade_: As we'll surely be able to starting merging bits sooner rather than later
<niemeyer> start
<fwereade_> niemeyer, hopefully so -- deployer is a nicely separable chunk, at the very least
<fwereade_> niemeyer, (and that then leads me to consider another possible partitioning...)
 * fwereade_ just wishes we had a good multi-prereq story
<niemeyer> fwereade_: =1
<niemeyer> +1
#juju-dev 2012-09-05
<davecheney> Path conflict: cmd/juju/ssh.go.THIS / cmd/juju/ssh.go
<davecheney> why does this keep happening
<davecheney> this file isn't even in this branch
<rogpeppe> fwereade_: mornin'
<rogpeppe> TheMue: hiya
<TheMue> rogpeppe: morning
<TheMue> rogpeppe: up so early in the morning?
<rogpeppe> TheMue: yeah, lying awake, so got up
<TheMue> rogpeppe: makes sense
<rogpeppe> fwereade_: ping
<rogpeppe> fwereade_: ping
<fwereade_> rogpeppe, pong
<rogpeppe> fwereade_: i've done a fix for the internal traffic thing, but need to write a test for it, and that requires deploying a charm live, and i'm not quite sure of the best approach
<rogpeppe> fwereade_: the approach i'm thinking of is to use testing/charm and assume that the charms in there are portable
<fwereade_> rogpeppe, sounds like something we have little option but to leave to the FTs if that's so -- presumably you can at least check the security groups though?
<rogpeppe> fwereade_: i.e. copy one of the charms into a local directory under the current series
<fwereade_> rogpeppe, sorry, what *precisely* are you trying to write a test for?
<rogpeppe> fwereade_: tbh i want to write the test anyway. i don't see why we shouldn't have one.
<rogpeppe> fwereade_: i want to check that a new machine comes up and manages to connect to the state.
<rogpeppe> fwereade_: actually, i *can* do that without deploying a charm.
<fwereade_> rogpeppe, well, indeed, +1 to testing everything, but -1 to actually touching outside state in unit tests
<rogpeppe> fwereade_: what do you mean by "touching outside state"? we already do, in the ec2 live tests.
<rogpeppe> fwereade_: i think it would be good to have a test that deploys a charm in jujutest/livetests
<fwereade_> rogpeppe, does that run every time?
<rogpeppe> fwereade_: no!
<fwereade_> rogpeppe, +1 to that
<fwereade_> rogpeppe, sorry, miscommunications clearly
<rogpeppe> fwereade_: you'd know if it did - it takes about 12 minutes to run
<rogpeppe> fwereade_: so...
<fwereade_> rogpeppe, yeah, I thought so
<fwereade_> rogpeppe, which is why I don't consider it to be amongst the "unit tests"
<rogpeppe> fwereade_: that's why i didn't call it a "unit test" :-)
<fwereade_> rogpeppe, although I am aware that the various categories of test are somewhat fuzzily defined
<fwereade_> rogpeppe, "but -1 to actually touching outside state in unit tests"
<rogpeppe> fwereade_: ah yes
<fwereade_> rogpeppe, aaanyway, good morning :)
<rogpeppe> fwereade_: top of the day to yourself too, sir :-)
<TheMue> fwereade_: good morning
<fwereade_> TheMue, heyhey
<rogpeppe> fwereade_: so, do you think we should have a repository of charms, one per series, with charms that are actually intended to run; or do you think we can assume that the charms that we use in testing are portable across series?
<fwereade_> rogpeppe, I feel like that should be a pretty safe assumption, and pretty easy to rectify if we balls it up
<fwereade_> rogpeppe, I *would* like to lean up the testing repo though
<rogpeppe> fwereade_: perhaps we could add a "series" argument Repo.ClonedURL
<rogpeppe> s/Repo/to Repo/
<fwereade> rogpeppe, sorry, I suspect I missed stuff; last I saw was "...t the charms that we use in testing are portable across series?"
<rogpeppe> fwereade: yes, but we need a URL with the correct series in
<fwereade> rogpeppe, sure, but the series is almost entirely irrelevant to the testing charms -- its structure happens to be a Repo but that's basically coincidental
<rogpeppe> fwereade: (funny really how the "URL" in charm.URL is anything but "universal" - it only finds charms :-])
<fwereade> rogpeppe, heh, indeed
<rogpeppe> fwereade: yes, but when we put the charm into the state, it needs to be put in with the correct series (i think!) so that the remote machine will use it correctly.
<fwereade> rogpeppe, the series of the charm determines the series of the machine
<fwereade> rogpeppe, pick the series you want, construct a URL, and you're done... right?
<rogpeppe> fwereade: the URL has to actually refer to something
<fwereade> rogpeppe, why?
<fwereade> rogpeppe, you can make up whatever crap you like
<rogpeppe> fwereade: because it gets pushed to the state
<rogpeppe> fwereade: so you'd need to copy the directory too. it's not hard of course, but perhaps worth a function in testing, no?
<fwereade> rogpeppe, I do that sort of thing in a couple of places around the tests for Uniter and BundlesDir... perhaps there's something relevant there
<rogpeppe> fwereade: i'd like to use this as an excuse to live-test our standard deploy mechanism.
<fwereade> rogpeppe, then I don't see what needs to be put in state that isn't already handled by deploy
<rogpeppe> fwereade: we need to give a charm to deploy, no?
<fwereade> rogpeppe, right
<rogpeppe> fwereade: which needs to be in a directory named after a valid series
<fwereade> rogpeppe, I *see*
<fwereade> rogpeppe, (or just using a faked-up charm store if I want to be pedantic, but yes, that is the best way)
<rogpeppe> fwereade: yeah, we could do that do
<rogpeppe> to
<rogpeppe> too :-)
<fwereade> rogpeppe, anyway, sorry, my perspective has been skewed looking at this problem from the UA perspective, when I can force the right context on myself everything you say makes sense
<rogpeppe> fwereade: np!
<fwereade> rogpeppe, and I think it's just fine to assume that testing charms are valid on every series
<fwereade> rogpeppe, I'm not sure any of them *actually* do anything
<rogpeppe> fwereade: well, we can rectify that in the future...
<fwereade> rogpeppe, but that shouldn't be a problem for this test, right?
<rogpeppe> fwereade: exactly
<fwereade> rogpeppe, cool
<rogpeppe> fwereade: i wonder about some prefix for names of charms that are intended for live testing
<rogpeppe> fwereade: "live-basic" for a starter charm, perhaps, that does nothing other than install
<rogpeppe> fwereade: then it's obvious which charms need to be looked at from the point-of-view of "will this actually run?"
<fwereade> rogpeppe, hmm, maybe... part of me is saying we should put live test charms in the charm store ;p
<rogpeppe> fwereade: that's not a bad idea actually
<fwereade> rogpeppe, it has aspects of a good idea and aspects of a bad idea ;)
<rogpeppe> fwereade: except that i can't do that now
<fwereade> rogpeppe, quite so -- for now I would just slap some oneiric, precise, quantal symlinks into the testing repo and call it a day
<rogpeppe> fwereade: ah, now yer talkin'!
<fwereade> :D
<rogpeppe> fwereade: maybe we should give explicit "series" arguments to all the functions in testing.Repo
<rogpeppe> fwereade: that way we can easily ask for a cloned charm in a particular series for example
<rogpeppe> fwereade:
<rogpeppe> // ClonedURL makes a copy of the charm directory named name
<rogpeppe> // into the destination directory (creating it if necessary),
<rogpeppe> // and returns a URL for it. The URL will have the given series.
<rogpeppe> func (r *Repo) ClonedURL(dst, series, name string) *charm.URL {
<rogpeppe> fwereade: i think that's all we need.
<fwereade> rogpeppe, feels like overkill, surely? can't you use a local repo pointing to the dir with the symlinks and leave it at that?
<fwereade> rogpeppe, I worry I have skewed context again
<rogpeppe> fwereade: i don't think so. all the methods in testing.Repo hard-code the series name "series"
<fwereade> rogpeppe, yeah, testing.Repo *might* even be meant to be internal really
<rogpeppe> fwereade: yeah, i wondered about that
<fwereade> rogpeppe, have a vague feeling something used it once, maybe
<rogpeppe> fwereade: it's quite nice as a namespace for all those methods
<fwereade> rogpeppe, true
<fwereade> rogpeppe, but its path is also a perfectly good path for JUJU_REPOSITORY, I think
<fwereade> rogpeppe, does it have a clone-the-whole-thing method?
<rogpeppe> fwereade: that's actually insufficient, i think, as cloned charms can't be under the same path
<rogpeppe> fwereade: it should do.
<rogpeppe> fwereade: or at any rate, any requested charm should be cloned.
<rogpeppe> fwereade: what was the last thing you saw?
<rogpeppe> fwereade: i saw:
<rogpeppe> [08:50:35] <fwereade> rogpeppe, does it have a clone-the-whole-thing method?
<fwereade> <fwereade> rogpeppe, or alternatively just clone a charm to where you want it for one specific test, which may well be in a newly-hacked-up repo with the perfect series already
<fwereade> rogpeppe, I saw fwereade: it should do.
<rogpeppe> [08:52:19] <rogpeppe> fwereade: or at any rate, any requested charm should be cloned.
<fwereade> rogpeppe, I think it's more just that any dir we tell juju about to should be outside source control
<fwereade> s/to //
<rogpeppe> fwereade: indeed
<rogpeppe> fwereade: maybe it should be a fixture
<rogpeppe> CharmSuite
<rogpeppe> fwereade: then we can have JujuConnSuite use it and most other things get it for free
<fwereade> rogpeppe, I'm not sure that's justified -- lots of tests already get by just cloning the one or two they need, and that's a lot of cloning that might otherwise be unnecessary
<rogpeppe> fwereade: i'd make it so that it only clones on request
<rogpeppe> fwereade: if you ask for a charm, it ensures it's cloned, then gives you a reference to it
<fwereade> rogpeppe, fair enough then
<rogpeppe> fwereade: the advantage of making it a fixture is that it's clear who has responsibility for cleaning up the clones
<rogpeppe> fwereade: as it is, clients have that responsibility, and that means you can't use a single global JUJU_REPOSITORY
<fwereade> rogpeppe, yeah, sounds sensible
<rogpeppe> fwereade: actually, i don't think you *can* use a single global JUJU_REPOSITORY
<rogpeppe> fwereade: because you might have more than one active CharmSuite
<fwereade> rogpeppe, ha, yes
<rogpeppe> fwereade: but i don't wanna use the env var anyway; we'll just have CharmSuite.Path or something
<rogpeppe> RepoPath probably
<fwereade> rogpeppe, yep, sgtm
<rogpeppe> fwereade: ok, seems like a plan
<rogpeppe> fwereade: funny how fixing something fairly trivial (internal traffic enabling) leads on to a seemingly totally unrelated code change.
<fwereade> rogpeppe, yeah, one takes some strange paths on occasion :)
<rogpeppe> fwereade: oh, that's *much* nicer now
<fwereade> rogpeppe, cool :D
<rogpeppe> fwereade: 16 lines less and more general
<fwereade> rogpeppe, awesome
<Aram> moin.
<fwereade> Aram, heyhey
<TheMue> Aram: moin moin
<rogpeppe> Aram: hiya
<fwereade> evening davecheney
<fwereade> rogpeppe, you recall that jujud-not-on-PATH thing I pasted last night?
<rogpeppe> fwereade: yes
<rogpeppe> fwereade: i have a feeling that our cloud-init doesn't set up PATH, though i haven't checked
<fwereade> rogpeppe, ah ok -- isn't the path to the executable predictable based on tools, though?
<fwereade> rogpeppe, maybe we should be using an absolute path
<rogpeppe> hmm. the path should be based on agent name
<rogpeppe> fwereade: /var/lib/juju/tools/$agent/jujud i think
<fwereade> rogpeppe, and we know what agent we're trying to run :)
<fwereade> rogpeppe, and that's all somewhere in container?
<rogpeppe> fwereade: yeah, although perhaps it might seem better for the agent Main to set up the PATH
<rogpeppe> fwereade: no, not in container
<rogpeppe> fwereade: container just does a path search
<fwereade> <fwereade> rogpeppe, ah, I thought that was what actually installed the unit
<rogpeppe> fwereade: last thing i saw:
<rogpeppe> [12:18:59] <fwereade> rogpeppe, and that's all somewhere in container?
<rogpeppe> [12:19:23] <rogpeppe> fwereade: yeah, although perhaps it might seem better for the agent Main to set up the PATH
<rogpeppe> [12:19:38] <rogpeppe> fwereade: no, not in container
<rogpeppe> [12:19:48] <rogpeppe> fwereade: container just does a path search
<fwereade> rogpeppe, you are probably right that they should just be on PATH
<fwereade> rogpeppe, but well hmm
<fwereade> rogpeppe, multiple versions
<rogpeppe> fwereade: that's fine
<fwereade> rogpeppe, how so?
<rogpeppe> fwereade: the agent dir is a symlink to the right version
<rogpeppe> fwereade: that's how we upgrade
<fwereade> rogpeppe, ok, sorry, so what exactly goes on PATH?
<rogpeppe> fwereade: i'll just check
<fwereade> rogpeppe, ahhhh sorry the agent Main
<davechen1y> fwereade: i think you should always try to call the binary with an absolute path
<rogpeppe> fwereade: yeah
<rogpeppe> davechen1y: why so?
<davechen1y> the cloudinit or initd environment is very sparse
<fwereade> davechen1y, +1
<fwereade> rogpeppe, yeah, I maybe misunderstood again
<rogpeppe> davechen1y: we do always call the binary with an absolute path
<fwereade> rogpeppe, which agent main sets up what path?
<davechen1y> rogpeppe: i thought so, I remember that from cloudinit
<rogpeppe> davechen1y: but we want the agents to set up PATH so that anything doing FindPath("jujud") will get the right thing
<davechen1y> rogpeppe: i would try to avoid that
<davechen1y> PATH is a shell thing
<davechen1y> and we're not an interactive process
<davechen1y> I think you'll find it more reliable if you manage that without PATH
<fwereade> rogpeppe, I'm inclined to just make sure AgentToolsDir is on the PATH when we're calling hooks
 * Aram agrees
<Aram> with davechen1y that is
<rogpeppe> fwereade: definitely
<rogpeppe> davechen1y: the PATH isn't just for our internal consumption
<rogpeppe> davechen1y: it's also for commands running inside hooks
<davechen1y> rogpeppe: that is an important, but separate issue
<fwereade> rogpeppe, but that's unit-agent-specific at the moment
<rogpeppe> davechen1y: if we don't use PATH, then we have to have a separate place to store the current agent name
<rogpeppe> davechen1y: because the path to the executable depends on the currently running agent
<rogpeppe> fwereade: hmm, which piece of code was failing?
<rogpeppe> davechen1y: have you changed the ssh code in state recently?
<rogpeppe> davechen1y: i just saw a transient error in TestSSHConnect
<rogpeppe> davechen1y: http://paste.ubuntu.com/1187088/
<davechen1y> rogpeppe: never seen that before
<davechen1y> maybe you are racing with the mock sshd starting
<rogpeppe> davechen1y: me neither. let's hope it never happens again.
<fwereade> rogpeppe, I am wondering how we're going to manage juju-run though, given that there could be multiple unit agents in a container
<rogpeppe> fwereade: that's fine
<rogpeppe> fwereade: each agent has its own name
<rogpeppe> fwereade: the unit agent name should contain the unit name
<fwereade> rogpeppe, so juju-run picks the right jujuc to call based on unit name... I guess
<rogpeppe> fwereade: interesting
<rogpeppe> fwereade: i think it should look at the context.
<rogpeppe> fwereade: which might contain the unit name, yeah
<fwereade> rogpeppe, what happens when we want to upgrade juju-run, I wonder ;p
<fwereade> rogpeppe, I don't think there's any way to know that
<rogpeppe> fwereade: what's juju-run again?
<fwereade> rogpeppe, the idea of juju-run is it's called from anywhere
<fwereade> rogpeppe, run this script as if it were a hook please
<fwereade> rogpeppe, I think the plans are pretty firm, just not urgent
<rogpeppe> fwereade: wouldn't it be reasonable to say that it must be run by a process that descends from a hook?
<fwereade> rogpeppe, totally unreasonable, wouldn't it?
<rogpeppe> fwereade: really?
<fwereade> rogpeppe, calling juju-run from within a hook will deadlock, surely?
<rogpeppe> fwereade: yes. but calling juju-run from within a background process started by a hook should be fine
<fwereade> rogpeppe, and how do you make a cron job run in a process descended from a hook?
<rogpeppe> fwereade: i was wondering about that
<rogpeppe> fwereade: how does that juju-run know how to talk to the right jujud anyway?
<fwereade> rogpeppe, I think juju-run has to take a unit name... or *maybe* infer it from context where possible... but I think I'd rather be explicit
<rogpeppe> fwereade: i'm not sure i'm happy with executables being called with no control at all. need to think on it.
<fwereade> rogpeppe, sorry, which executables?
<rogpeppe> fwereade: juju-run in particular. but also the executables that are called by the script invoked by juju-run
<fwereade> rogpeppe, the idea of juju-run is that it will wait for no other hook to be running and then execute the supplied script within a normal context
<rogpeppe> fwereade: it scuppers my idea of how to do major-version upgrades, i think
<fwereade> rogpeppe, just like any other hook, neatly serialized with all the other hook executions
<fwereade> rogpeppe, heh :(
<fwereade> rogpeppe, anyway, we can worry about this later and figure out some way to make it work with the upgrades that will already be in place
 * fwereade pops a couple of levels
<fwereade> rogpeppe, now, the stuff that was failing was:
<fwereade>  2012/09/04 15:49:43 JUJU deploying unit etherpad-lite/0
<fwereade> gah
<rogpeppe> fwereade: ah, of course. that's fine.
<rogpeppe> fwereade: that would be fixed by setting up $PATH
<fwereade>  2012/09/04 15:49:43 JUJU cannot deploy unit etherpad-lite/0: cannot find executable: exec: "jujud": executable file not found in $PATH
<rogpeppe> fwereade: which, despite mr cheney's objections, i think is probably the right approach
<fwereade> rogpeppe, I guess I need to read the code some more, I'm not clear on what exactly is happening here
<rogpeppe> fwereade: it's quite simple
<fwereade> rogpeppe, I think I'm convinced that PATH is not something we should be depending on, though
<rogpeppe> fwereade: container does a path search for the jujud exe
<rogpeppe> fwereade: then puts that in the cloudinit executable
<fwereade> rogpeppe, I don't think it's any of our business what PATH upstart happens to run with
<rogpeppe> fwereade: we don't use any PATH stuff inside the upstart script
<fwereade> rogpeppe, I'll shut up, you were saying useful things
<rogpeppe> fwereade: the container package itself does the path search
<rogpeppe> fwereade: if we don't have PATH, we need some global place to store the executable dir
<rogpeppe> fwereade: i.e. some global place to store the current agent name
<rogpeppe> fwereade: i think that PATH is actually a reasonable way to encode that. it's all we care about. and everything just falls out from that imo
<fwereade> rogpeppe, sorry, what are we using the executable dir for?
<rogpeppe> fwereade: all it takes is that each agent does os.Setenv("PATH", environs.AgentToolsDir(agentName) + ":" + os.Getenv("PATH"))
<rogpeppe> fwereade: the executable dir is agent-specific
<rogpeppe> fwereade: which means each agent can upgrade in its own sweet time
<fwereade> rogpeppe, I understand that bit
<rogpeppe> fwereade: ha! i've just realised that the cloudinit heuristic is wrong, wrong, wrong!
<fwereade> rogpeppe, well, I shall feel pleased with myself in a non-specific and slightly bewildered way :)
<rogpeppe> fwereade: we don't need the path at all
<fwereade> rogpeppe, I was wondering what it was for
<rogpeppe> fwereade: because the cloudinit script should not run the jujud from the agent that's creating the script
<rogpeppe> fwereade: so there needs to be a little more logic in container to create the agent tools dir symlink
<rogpeppe> fwereade: for the agent it's about to start
<rogpeppe> fwereade: it might be good if it set PATH in the cloudinit script too
<rogpeppe> fwereade: that way jujud doesn't have to muck about with PATH at all
<fwereade> rogpeppe, excellent, our thoughts are in alignment then... but maybe not on that last bit
<rogpeppe> fwereade: well, that's only a half-baked thought tbh
<fwereade> rogpeppe, I feel that creating PATH for the hooks is very specific to hook-running, and not something that should be dealt with by all agents just for this one case
<fwereade> rogpeppe, anyway, I might take a look at that briefly, unless you are underemployed
<rogpeppe> fwereade: please do
<fwereade> rogpeppe, although if I get unblocked on uniter implementation that will take precedence ;)
<fwereade> rogpeppe, cheers
<fwereade> rogpeppe, that might be an after-lunch thing though :)
<niemeyer> Hullah!
<niemeyer> I'll just announce a mgo release before kicking fully in
<fwereade> niemeyer, heyhey
<rogpeppe> niemeyer: yo!
<niemeyer> fwereade, rogpeppe: Heya!
<fwereade> niemeyer, I have a hideous pile of reviews for you today, but it should be better than one hideous review :)
<niemeyer> fwereade: LOL
<niemeyer> fwereade: Thanks a lot for splitting them up
 * rogpeppe wishes he saw all the launchpad emails.
<fwereade> niemeyer, np, it turned out to be very helpful to do so, the code is better thanks to it
<fwereade> niemeyer, there's still one big one but an awful lot of the lines are moves/deletes/trivials, so hopefully it won't hurt too bad
<niemeyer> fwereade: Sweet
<niemeyer> Okaaay
 * niemeyer jumps onto a pleasant review queue
<rogpeppe> fwereade: here's the upshot of our earlier conversation. i *think* it's nicer, but i'm biased. what do you think? https://codereview.appspot.com/6495086/
 * fwereade looks
<niemeyer> rogpeppe: What's the motivation behind the change?
<rogpeppe> niemeyer: the initial motivation was that i wanted to deploy a charm in jujutest.LiveTests
<rogpeppe> niemeyer: that meant that i needed to create my own charm repository and manually populate it with a charm of the right series.
<niemeyer> rogpeppe: The first changed file already shows something that is slightly less ideal
<niemeyer> rogpeppe: It's now bundling a new charm for every test, rather than for the suite
<rogpeppe> niemeyer: the time impact is minimal
<niemeyer> rogpeppe: We can't care both about having a fast suite and then disregard that kind of change
<rogpeppe> niemeyer: 0.05seconds of penalty on that package
<rogpeppe> niemeyer: i don't think we mind too much about 50ms
<niemeyer> rogpeppe: Okay, that is irrelevant indeed
<rogpeppe> niemeyer: though i would indeed like it if we could choose whether to use test fixtures for the duration of a test or a suite
<niemeyer> rogpeppe: That doesn't make sense to me.. the test fixture can do whatever it pleases, and we can organize the tests in whatever way we like
<niemeyer> rogpeppe: If you want a suite for the duration of a test, have a suite with a test (!)
<rogpeppe> niemeyer: what if you want a suite which sets up everything for each test, but you want it to work for a whole suite?
<fwereade> rogpeppe, +1 to that, +-0 to the change itself purely from an API point of view... much of the time "series" is irrelevant, and it seems tedious to type it out each time
<niemeyer> rogpeppe: That's exactly what suites do?
<rogpeppe> niemeyer: it seems wrong to call SetUpTest inside a SetUpSuite
<niemeyer> rogpeppe: Sorry, that makes even less sense
<niemeyer> rogpeppe: SetUpTest runs for each test.. SetUpSuite runs for each suite.. we can do whatever with those
<rogpeppe> niemeyer: yes, but that means the fixture itself decides whether the context it sets up is appropriate for a test or a suite
<rogpeppe> niemeyer: but i *think* that it doesn't matter - what matters is the context that it sets up, not where it's used
<niemeyer> rogpeppe: The fixture itself should only be used when it makes sense
<niemeyer> rogpeppe: If it's not appropriate to have a test in a fixture, don't put it
<niemeyer> rogpeppe: We can have as many suites as we please
<niemeyer> rogpeppe: With whatever organization we do
<rogpeppe> niemeyer: i'm not sure what you mean by "in" a fixture. AFAICS a test suite *uses* a fixture. and perhaps that's the difficulty i'm having.
<niemeyer> rogpeppe: Nope, a test suite *is* the fixture
<rogpeppe> niemeyer: i think that's a hangover from inheritance days
<rogpeppe> niemeyer: it doesn't seem a good fit for the go way of composing things.
<niemeyer> rogpeppe: Well, we can debate about that, but that's a separate problem from not being able to do things
<rogpeppe> niemeyer: sure. anyway, i suppose that your concern about doing the bundle in every test comes out of this issue. if this fixture was *solely* concerned with setting and tearing down a charm repo (in whatever context), then we could call SetUp inside SetUpSuite as easily as SetUpTest
<niemeyer> rogpeppe: There's absolutely nothing preventing a type from doing *solely* whatever it wants to do and being called *solely* wherever we want to
<rogpeppe> niemeyer: i've idly thought quite a few times that all the current fixture "suites" could be factored into types that conform to this interface: http://paste.ubuntu.com/1187225/
<fwereade> niemeyer, rogpeppe: do we have any easy way to get a state.Info out of a state.State?
<fwereade> niemeyer, rogpeppe: feels kinda evil so I imagine not
<rogpeppe> fwereade: i don't think so.
<rogpeppe> fwereade: why?
<fwereade> rogpeppe, I need it to fill in --zookeeper-servers for the unit agent
<rogpeppe> fwereade: haven't you got an Environ?
<niemeyer> rogpeppe: Yeah, we may eventually have something like that supported by stock gocheck
<niemeyer> rogpeppe: Meanwhile, there's nothing preventing it from existing, except for the convenience of having a suite embedded onto another type and having it just work
<fwereade> rogpeppe, I'm in simpleContainer.Deploy; should I?
<fwereade> rogpeppe, and I'm called by Machiner, which just has a *state.Machine
<fwereade> rogpeppe, I'll also want it for Uniter when that comes to deploy subordinates
<rogpeppe> niemeyer: actually, you can do that too: http://paste.ubuntu.com/1187233/
<rogpeppe> niemeyer: then embed gocheck.TestFixture(something) or gocheck.SuiteFixture(something) and it'll all be done for you
 * rogpeppe thinks that's quite neat actually
<niemeyer> rogpeppe: I think it's more code without a clear win
<rogpeppe> niemeyer: you could even have fixture composition operators
<niemeyer> rogpeppe: The point was to not embed.. if I have to embed and manage it all by hand, there's no reason to do it
<rogpeppe> niemeyer: if i include 2 fixtures currently, i have to write about 20 lines of code that are entirely mechanical.
<niemeyer> rogpeppe: You're not improving that situation by having a different mechanism to create a type that does the same thing we have today
<niemeyer> rogpeppe: You are adding a step to *create a suite*. Everything involved in *using a suite* is contained in *create suite* + *use suite*
<rogpeppe> niemeyer: maybe so. would you get rid of the SetUpTest etc methods entirely?
<niemeyer> rogpeppe: I'd make it so that they are not necessary
<rogpeppe> niemeyer: interesting. i don't quite see how it would work though.
<niemeyer> rogpeppe: I'll get there eventually
<fwereade> niemeyer, rogpeppe: ISTM that we are going to end up with noticeable duplication across Uniter and Machiner, if both are to be responsible for deploying units
<fwereade> niemeyer, rogpeppe: it feels like we're missing some abstraction somewhere
<fwereade> niemeyer, rogpeppe: does this resonate with either of you?
<rogpeppe> niemeyer: ah, you could do it by having test methods which take a fixture as an arg
<rogpeppe> fwereade: container is supposed to be the shared abstraction
<niemeyer> fwereade: Not really
<rogpeppe> fwereade: what else will be duplicated?
<niemeyer> fwereade: Deploying another unit from within a set up container should be as complex as putting an upstart script in place and running it, right?
<fwereade> niemeyer, true
<fwereade> niemeyer, I guess it's just a matter of a... UnitSubordinateWatcher?
<fwereade> niemeyer, I guess the client code does end up reasonably different
<fwereade> TheMue, we don't have any way to watch a unit's subordinates, do we?
<TheMue> fwereade: today? afaik no.
<fwereade> TheMue, np, we'll do it when we need it :)
<rogpeppe> fwereade: you can watch units assigned to the current machine and ignore all but the subordinates
<fwereade> rogpeppe, yeah, easy enough, cheers
<niemeyer> rogpeppe: Reviewed it
<niemeyer> fwereade: How do you mean (different)?
<fwereade> niemeyer, not in any major way -- just stuff like what containers they actually deploy to -- ie it doesn't feel like it will really be so unbearably similar as to demand a single implementation
<rogpeppe> niemeyer: thanks. it was WIP though, and i'm currently thinking this is a better direction, based on the above discussion and fwereade's reaction: http://paste.ubuntu.com/1187296/
<niemeyer> rogpeppe: Heh
<niemeyer> rogpeppe: I think it was mostly ok, despite a few issues
<niemeyer> rogpeppe: I'll wait until you *actually* want a review, though
<rogpeppe> niemeyer: the essence is the same except that the original Repo type name is retained and everyone operates on that.
<rogpeppe> niemeyer: rather than creating a fixture.
<rogpeppe> niemeyer: very sorry to have wasted your time.
<rogpeppe> niemeyer: most of your comments are still accurate BTW
<niemeyer> rogpeppe: No problem, will be glad to review whatever you think makes sense as an improvement
<fwereade> niemeyer, rogpeppe, TheMue, Aram, mramm2: could be better, but -- and forgive the spam:
<fwereade> william@diz:~/code/go/src/launchpad.net/juju-core$ juju status
<fwereade> machines:
<fwereade>   0:
<fwereade>     dns-name: ec2-50-16-72-153.compute-1.amazonaws.com
<fwereade>     instance-id: i-1478c96e
<fwereade>     proposed-agent-version: 0.0.0
<fwereade>     agent-version: 0.0.1
<fwereade>   1:
<fwereade>     agent-version: 0.0.1
<fwereade>     dns-name: ec2-23-20-216-107.compute-1.amazonaws.com
<fwereade>     instance-id: i-f66adb8c
<fwereade>     proposed-agent-version: 0.0.0
<fwereade> services:
<fwereade>   etherpad-lite:
<fwereade>     charm: etherpad-lite
<fwereade>     units:
<fwereade>       etherpad-lite/0:
<fwereade>         proposed-agent-version: 0.0.0
<fwereade>         status: error
<fwereade>         status-info: 'hook failed: "install"'
<fwereade>         agent-version: 0.0.0
<fwereade>         machine: 1
<fwereade>     exposed: true
<rogpeppe> fwereade: yay!
<niemeyer> fwereade: Holy crap!!
<niemeyer> HOLY CRAP!
<TheMue> Yiiiieha!
 * niemeyer does the funky chicken around the chair
<fwereade> :DD
<TheMue> fwereade: grats, guy, this looks damn good
<TheMue> fwereade: and is definitely no spam, it good beef
<fwereade> haha
<TheMue> s/it/it's/
<mramm2> fwereade: So very awesome!
<niemeyer> mramm2: Well, s/fwereade/team/, really..
<niemeyer> This is the peak of several months of everybody's hard work.. remarkable really
<TheMue> dinnertime (instead of lunch today)
<niemeyer> TheMue: :-)
<niemeyer> TheMue: Enjoy
<rogpeppe> niemeyer, fwereade: i think this is nicer (and it's ready for review :-]) https://codereview.appspot.com/6495086
<niemeyer> rogpeppe: You've got a review on https://codereview.appspot.com/6498085/
<rogpeppe> niemeyer: yeah, i've occasionally wondered about version.IsValid. will do.
<rogpeppe> niemeyer: thanks a lot for the review BTW!
<rogpeppe> niemeyer: we need to have a chat about those tests in that earlier CL
<rogpeppe> niemeyer: i wondered what you'd think of my version bumping scheme :-)
<rogpeppe> niemeyer: IMO *any* version bumping screws up our versioning.
<rogpeppe> niemeyer: so we might as well screw it up in an obvious way
<niemeyer> rogpeppe: Not really, there are ways we can fix this
 * rogpeppe listens
<niemeyer> rogpeppe: Well, let's do it in parts..
<rogpeppe> niemeyer: sure, i'll remove it for the time being
<niemeyer> rogpeppe: For that branch, can we please drop the bumping?
<niemeyer> rogpeppe: It's mostly ready otherwise, except for those trivials
<rogpeppe> niemeyer: still gotta get the prereqs through... but thanks, i'm happy
<niemeyer> rogpeppe: Super, me too, thanks a lot for the functionality
<niemeyer> rogpeppe: So, the tests..
<niemeyer> rogpeppe: Which one do you wanna talk about first?
<rogpeppe> niemeyer: the "does nothing" tests
<niemeyer> rogpeppe: Okay,
<niemeyer> rogpeppe: This is indeed boring to test, no doubts
<rogpeppe> niemeyer: yeah. i'm interested to hear your thoughts.
<niemeyer> rogpeppe: Let me explain first a vague thought I had, which depends on functionality that is coming in mstate
 * rogpeppe is listening
<niemeyer> rogpeppe: What we're trying to test, vaguely speaking, is that a change in the version was observed
<rogpeppe> niemeyer: yeah
<niemeyer> rogpeppe: So that we can be reassured that it is being ignored, rather than updating
<rogpeppe> niemeyer: yup
<niemeyer> rogpeppe: The difficulty we have is that we don't have a good mechanism to define "steady state"
<niemeyer> rogpeppe: So I'm trying to imagine ways we could fix that
<niemeyer> rogpeppe: I'm not even sure if mstate can help there, to be honest
<rogpeppe> niemeyer: ah, are you thinking we can rely on the synchronous delivery of watcher events to all watchers?
<niemeyer> rogpeppe: Yeah, but that doesn't quite work I think
<rogpeppe> niemeyer: no, different connections
<niemeyer> rogpeppe: Because we're precisely trying to make it so that events are not really delivered if the consumer didn't bother to ask, right? That was yesterday's conversation
<rogpeppe> niemeyer: i await the watcher API with eager anticipation :-)
<niemeyer> rogpeppe: Well, the API is ready.. we need the implementation :)
<rogpeppe> :-)
<niemeyer> rogpeppe: So, let's see..
<niemeyer> rogpeppe: If we had a mechanism to force all events to at least be made available to consumers if they wish,
<niemeyer> rogpeppe: it might be simpler to define the concept of "steady state" for a loop
<niemeyer> rogpeppe: So we could say, for example, upgrader.Sync()
<rogpeppe> niemeyer: i'm not sure what that means
<niemeyer> rogpeppe: and know it has dealt with every other event that was available till then, and found nothing else to do
<niemeyer> rogpeppe: If we do that, we can realize changes in the state, Sync, and test that nothing was done
<rogpeppe> niemeyer: isn't that almost equivalent to the current approach - just send a value into the inner loop rather than receiving one?
<niemeyer> rogpeppe: Without assuming things about specific code paths
<niemeyer> rogpeppe: It's completely unrelated to the current approach, IMO
<niemeyer> rogpeppe: They just have the same goal
<niemeyer> rogpeppe: Which, well, is precisely what we're trying to preserve
<rogpeppe> niemeyer: i'm not sure what Sync means though. we don't have events, we have state.
<niemeyer> rogpeppe: We have both state and events that are emitted as a consequence of changing state
<rogpeppe> niemeyer: ah, do you mean State.Sync ?
<niemeyer> rogpeppe: No, I meant upgrader.Sync() indeed
<niemeyer> rogpeppe: Because we must verify that the *upgrader* is done processing events
<rogpeppe> niemeyer: but the events are being delivered by the state, right?
<niemeyer> rogpeppe: Right
<niemeyer> rogpeppe: But you're on the right track
<rogpeppe> niemeyer: so how do we know that the Sync call has arrived before the event we're expecting from the state?
<niemeyer> rogpeppe: I'm planning to make available state.ForceWatcherRefresh()
<niemeyer> rogpeppe: That will at least make all events ready for consumption locally
<rogpeppe> niemeyer: ok. but does it guarantee they've been delivered?
<niemeyer> rogpeppe: No, because there's an intermediate loop that processes watches
<rogpeppe> niemeyer: so i still don't see how upgrader.Sync can work reliably
<niemeyer> rogpeppe: The thing that actually implements the e.g. foo.WatchWhatever() loop
<niemeyer> rogpeppe: Right, I'm still thinking through to see if there's a better way, but no matter what in the short term there's no fantastic way to assert that all events have been made available
<niemeyer> rogpeppe: So,
<niemeyer> rogpeppe: To avoid blocking you further,
<rogpeppe> niemeyer: AFAICS the only way of telling if an event has been delivered is to ask the thing it's been delivered to
<niemeyer> rogpeppe: Sure, but that sucks for various reasons
<rogpeppe> niemeyer: yeah, i see your point. but... the alternative sucks too :-)
<niemeyer> rogpeppe: Not the least one, it means hardcoding logic for every tiny test we need to do
<niemeyer> rogpeppe: My proposal is for us to implement a mechanism that is good enough for the moment
<niemeyer> rogpeppe: and avoids the hackery
 * rogpeppe is all ears
<niemeyer> rogpeppe: By defining a test mode in those cases that enables the Sync method to work
<niemeyer> rogpeppe: When we call Sync(), the loop initializes a channel that was previously nil to time.After(100 * time.Millisecond)
<rogpeppe> niemeyer: i still don't see how Sync can ever work properly
<niemeyer> rogpeppe: and puts it in the main loop
<niemeyer> rogpeppe: When Sync is acting, the value is reset every time the loop begins
<niemeyer> rogpeppe: When the timeout happens, Sync returns
<niemeyer> rogpeppe: With mstate's flush logic I described, this will become quite reliable, and we can probably reduce the timeout to half of this or less
<rogpeppe> niemeyer: if we're going to do that, we may as well just have a timeout in the test rather than cluttering the code for a feature that may never be possible to implement properly
<niemeyer> rogpeppe: I believe that eventually we can actually have a Sync() method on all watchers, and turn the mechanism into a deterministic system, and avoid issues completely
<rogpeppe> niemeyer: even *with* the flush logic, we'll still need a timeout
<niemeyer> rogpeppe: Not necessarily, no, as I just mentioned
<rogpeppe> niemeyer: i'm not convinced that's possible, i'm afraid
<niemeyer> rogpeppe: Wouldn't be the first time. :)
<rogpeppe> niemeyer: how can any watcher know if it has received all the events that might happen as a result of a change to the db?
<niemeyer> rogpeppe: It's fine, though.. if you want to put a short timeout in the test for the moment, that works for me.. the test should ensure that nothing happens though, rather than just assuming nothing happens.
<niemeyer> rogpeppe: I'm not going to solve that problem today
<rogpeppe> niemeyer: ok, i think i'll go for that as the simple approach for the time being, pending some future mechanism. i'm not very happy about it though. we can easily check that the event gets delivered... and even if we have Sync, we'd be checking the same thing, i think. anyway, i'll go for the timeout.
<niemeyer> rogpeppe: Agreed. I'm not very happy either. But I'm even less happy with what's there now.
<niemeyer> rogpeppe: I'm really concerned about having that as a normal practice, and dirtying up all code with that kind of "address breakpoint" that not only doesn't pay for itself, but makes for brittle tests that are hard to debug and dirty up the logic.
<niemeyer> rogpeppe: I promise I'll keep your core concern in mind, though, and try to have a more reasonable proposal eventually
<niemeyer> I'll step out for lunch.. biab
<rogpeppe> niemeyer: BTW about ChangeEnviroConfig...
<rogpeppe> niemeyer: ok, enjoy
<niemeyer> rogpeppe: Oh, ok
<niemeyer> rogpeppe: Wanna conitnue?
<niemeyer> continue
<rogpeppe> niemeyer: it was only a brief observation: my intention wasn't to do a delta, but to atomically write the whole thing. the delta is defined by the function itself. but maybe that's not possible with mstate actually.
<rogpeppe> niemeyer: i've fixed the tests to do a timeout BTW; i think the branch may be LGTY now.
<niemeyer> rogpeppe: There's no way to not do a delta with the sample you provided.
<niemeyer> rogpeppe: You did attrs := AllAttrs; attrs["foo"] = "bar"
<niemeyer> rogpeppe: So attrs is a whole document, which has the issues I described
<rogpeppe> niemeyer: and then returned all the attributes, which are bundled into a Config and written in their entirety.
<niemeyer> rogpeppe: Exactly.. you said you're trying to fix the problem of applying deltas, but that isn't
<rogpeppe> niemeyer: or can't you write a whole Config at once
<niemeyer> rogpeppe: We can't, and that's the interface we have today
<niemeyer> Erm
<niemeyer> rogpeppe: We can, and that's the interface we have today
<niemeyer> rogpeppe: With the known issues..
<niemeyer> rogpeppe: If we're not solving the issues, we can just use it
<rogpeppe> niemeyer: can't we do the equivalent of zk's RetryChange?
<niemeyer> rogpeppe: I've covered that in the review too
<niemeyer> rogpeppe: Still presents issues
<rogpeppe> niemeyer: we can't make an invalid config because we make the Config before writing it. i'm still not seeing it. but anyway, i've removed the function.
<fwereade> niemeyer, I have a feeling I'm going to feel very stupid in a moment, but: http://paste.ubuntu.com/1187523/
<fwereade> brb
<niemeyer> rogpeppe: A Config is not necessarily a valid environment configuration..
<rogpeppe> niemeyer: true
<niemeyer> fwereade: No perceived stupidity.. it's indeed not immediately obvious
<niemeyer> fwereade: I've argued quite a bit when the jujucharms stuff was created to avoid that kind of issue, but lost the argument in a miserable way
<fwereade> niemeyer, can you explain what is going on in words suitable for the hard of thinking?
<niemeyer> fwereade: It's actually trivial: the charm store revisions charms monotonically.. anything that is looking at the content through alternative means may report a different reality
<fwereade> niemeyer, I thought that ended up meaning we picked the lowest possible rev that was higher than any other possible rev
<niemeyer> fwereade: and to be honest, I don't see anything out of sync in jujucharms
<niemeyer> fwereade: It's not reporting the wrong revision, at least in this page
<niemeyer> fwereade: branch revision != charm revision
<niemeyer> fwereade: revision numbers in Bazaar (or any DVCS) are wildly unreliable
<fwereade> niemeyer, yeah, I've been thinking about the content of revision and/or the legacy revision in metadata.yaml
<niemeyer> fwereade: Charms obtained from the store should have the proper number there
<fwereade> niemeyer, it appears not to
<niemeyer> fwereade: Example?
<fwereade> niemeyer, well, pasted -- the store thinks that 2 is the most recent revision, but there is nothing I can see on jujucharms with rev 2 at all
<niemeyer> fwereade: Sorry, I don't understand
<fwereade> niemeyer, according to the rev files on jujucharms, it should be 22, AFAICT
<niemeyer> fwereade: Which rev files, and why does it matter?
<niemeyer> fwereade: jujucharms is not the store
<fwereade> niemeyer, well, yes, this is true, but I was expecting them to have roughly similar content
<niemeyer> fwereade: They have roughly similar content
<fwereade> niemeyer, to the point where at least I could figure out some correspondence from one to the other
<niemeyer> fwereade: This is actually possible
<niemeyer> fwereade: Not sure if it's exposed, but there's a digest in the store
<niemeyer> fwereade: Which is the revision digest
<fwereade> niemeyer, but, I'm sorry, I have to go eat supper... I'll reread some stuff this evening and see if I can see the light
<niemeyer> fwereade: The branch revision number is not the way to do that, though..
<niemeyer> fwereade: Because revision numbers are highly unreliable in Bazaar and in pretty much any DVCS
<fwereade> niemeyer, yeah, I only reported the branch revision for the sake of completeness
<fwereade> niemeyer, anyway, sorry, must go
<niemeyer> fwereade: Well, not really
<niemeyer> fwereade: You were basing your analysis on it I think
<niemeyer> fwereade: Enjoy
<niemeyer> rogpeppe: I don't get the "// If a test failed, make sure we see any error from the upgrader.".. if a test failed, we already have a failure.. why are we deferring a second check?
<rogpeppe> niemeyer: i had this problem during development - if an assert fails because some event was not received, it's quite possible the upgrader has died with an error which we want to see
<niemeyer> rogpeppe: Let's drop the assert then, and use a check + wait?
<niemeyer> rogpeppe: It feels like we're fighting our own test in a way
<rogpeppe> niemeyer: waitDeath, yes?
<niemeyer> rogpeppe: Yeah, it looks nice, thanks
<rogpeppe> niemeyer: so you're saying use Check rather than Assert?
<rogpeppe> niemeyer: sure. they're equivalent in a defer anyway
<niemeyer> rogpeppe: I'm just wondering about that defer which seems to fight the test itself
<rogpeppe> niemeyer: if we don't use a defer, we can't use Assert anywhere else
<rogpeppe> niemeyer: there are lots of tests elsewhere that do defer func() {err := x.Close(); c.Assert(err, IsNil)}
<rogpeppe> niemeyer: this is the moral equivalent
<rogpeppe> niemeyer: i've gotta go. very close to reproposing the upgrade-juju branch, but just remembered Version.IsValid which is more than i can do in 2 minutes...
<niemeyer> rogpeppe: Where is that test running Stop?
<rogpeppe> niemeyer: it isn't. but it always waits for the upgrader to die, so that should be ok, i think.
<rogpeppe> niemeyer: (no need to call Stop if Wait has returned, right?)
<niemeyer> rogpeppe: If it always waits for the upgrader to die, why do we have a defer?
<niemeyer> rogpeppe: No, that's not true
<niemeyer> rogpeppe: If the test fails, we have an upgrader running in background
<rogpeppe> niemeyer: that's true. we should call stop anyway, right? in a defer.
<niemeyer> rogpeppe: Wait is merely saying "Uh.. didn't see it stop!"
<niemeyer> rogpeppe: Yeah, c.Assert(Stop(), IsNil)
<rogpeppe> niemeyer: ok, i'll call Stop
<niemeyer> rogpeppe: That'd be fine
<rogpeppe> niemeyer: in a defer?
<niemeyer> rogpeppe: Yeah, as we generally do for the stop stuff
<niemeyer> rogpeppe: Sorry for not being clear before.. it had something smelly and I couldn't quite put my finger on it
<rogpeppe> niemeyer: that's fine. Stop is much better there.
<rogpeppe> niemeyer: https://codereview.appspot.com/6490067/ is ready for final review i think
<niemeyer> rogpeppe: I'm reproducing the counter bug now, btw.. I'll have a deeper look and try to fix it once and for all
<rogpeppe> niemeyer: great. i do see it a *lot*
<niemeyer> rogpeppe: Sorry about that
<rogpeppe> niemeyer: like at that exact moment!
<niemeyer> :-)
<rogpeppe> niemeyer: gotta go. have fun.
<niemeyer> rogpeppe: Have fun, and thanks!
<rogpeppe> TheMue, fwereade, Aram, niemeyer: night all
<fwereade> gn rg
<fwereade> gn rogpeppe
<niemeyer> fwereade: Btw, https://bugs.launchpad.net/charmworld/+bug/1046444
<niemeyer> fwereade: I'll also add the digest information to the returned info
<fwereade> niemeyer, ah, cool, tyvm
<niemeyer> fwereade: np, and thanks for raising the issue
<fwereade> niemeyer, always a pleasure to blunder constructively into things :)
<niemeyer> fwereade: Yeah, appreciated. It is a slightly unclear subject, unfortunately
<TheMue> rogpeppe: gn
<niemeyer> Hah
<niemeyer> I think the status counter stuff is an actual bug
<niemeyer> mthaddon: ping
<niemeyer> And so it is..
<fwereade-on-juju> niemeyer: would you believe how I'm posting this?
<fwereade-on-juju> niemeyer: http://ec2-50-19-8-166.compute-1.amazonaws.com:3000/
<niemeyer> fwereade-on-juju: OMG
<TheMue> niemeyer: i have to leave. do you see any chance to review the two agent alive branches till tomorrow? tomorrow my vacation starts, but i can do some stuff till friday in the evening hours.
<OMG-IT-WORKS> WOOHOO
<TheMue> fwereade-on-juju: FANTASTIC
<OMG-IT-WORKS> TheMue: Definitely, I'll make sure to have it ready by your morning
<fwereade-on-juju> and the unit agent log looks sane as well, barring presence spam that I've absent-mindedly fixed in about 3 branches that never made it to trunk
<fwereade-on-juju> http://paste.ubuntu.com/1187643/
<OMG-IT-WORKS> fwereade-on-juju: Oops, sorry about that
<TheMue> OMG-IT-WORKS: great, thx. i want to have it in before we leave.
 * fwereade-on-juju dances, whoops, backflips
<fwereade-on-juju> OMG-IT-WORKS: nah, I suspect they've always been working branches that get taken apart and proposed piece by piece
<TheMue> fwereade-on-juju: hehe, my claps his hands to your dance
<OMG-IT-WORKS> TheMue: Thanks a lot, that's appreciated, although I feel a bit bad that you're looking at it over your holidays
<TheMue> OMG-IT-WORKS: don't worry, our current progress is too fantastic to not participate :D
<OMG-IT-WORKS> What *doesn't* seem to work so well is Subway, though :-)
<OMG-IT-WORKS> Some messages just don't show up
<fwereade> OMG-IT-WORKS, yeah, seems so
<fwereade> OMG-IT-WORKS, ah well
<OMG-IT-WORKS> NOT OUR PROBLEM
<OMG-IT-WORKS> -)
<fwereade> :D
<OMG-IT-WORKS> (for once!)
<fwereade> haha
<fwereade> right, I think I really am done for the night, unless la famille falls asleep early again
<fwereade> gn all, take care, and wave a tearful goodbye to the first ever working go-juju service deployment :)
<fwereade_> rogpeppe, btw, is there something specific that demands that unit agent names be "unit-foo-123" rather than just "foo-123"?
<fwereade_> rogpeppe, I'm feeling weird tacking "unit-" in front of everything
<niemeyer> fwereade_: +1
<niemeyer> fwereade_: Although I don't know the context
<fwereade_> niemeyer, I don't think it introduces any ambiguities
<fwereade_> niemeyer, just for AgentToolsDir
<niemeyer> fwereade_: Ah, the reasoning is that we'll have other agents there I think
<niemeyer> fwereade_: machine-N, provisioning-N, etc
<niemeyer> fwereade_: So unit-NAME is just making it even
<fwereade_> niemeyer, I'd be just as happy with prefixes everywhere
<niemeyer> fwereade_: I think we do have them, don't we?
<fwereade_> niemeyer, at the moment I think there's an unstated assumption that there will be <=1 of each of those, so they're not N'd
<niemeyer> fwereade_: Hmm
<niemeyer> fwereade_: I see.. we'll probably not have more than a single one, but I do think we should have the suffix
<fwereade_> niemeyer, yeah... only question is where the provisioner ID comes from for now :)
<fwereade_> niemeyer, just copy the machine ID?
<niemeyer> fwereade_: I'd be fine with that.. rogpeppe will be fixing it soon anyway, as we'll need a real entity for the provisioner as part of the upgrading work
<niemeyer> (which sounds like a sensible thing, no matter what)
<fwereade_> niemeyer, +1
<fwereade_> niemeyer, I guess it becomes AgentToolsDir(kind, name string)
<niemeyer> fwereade_: I wonder if we might have something more straightforward
<fwereade_> niemeyer, I wouldn't say no to that -- you have a suggestion?
<niemeyer> fwereade_: func AgentToolsDir(agent interface{}) { switch agent.(type) { ... } }
<niemeyer> fwereade_: Where do we use this function ATM?
 * niemeyer looks
<fwereade_> niemeyer, not clear what agent is actually expected to be
<niemeyer> fwereade_: Okay, we use it in a number of different places.. so, better idea:
<niemeyer> func (u *Unit) PathKey() string { return "unit-" + u.Name() }
<niemeyer> func AgentToolsDir(agent interface{ PathKey() string }) string { return filepath.Join(prefix, agent.PathKey()) }
<niemeyer> EOM :)
<fwereade_> niemeyer, with /-replacement, ok; similar for machine; and just hack something up for provisioner?
<fwereade_> niemeyer, (man, I really want to just auto-deploy a juju-provisioner charm :))
<niemeyer> fwereade_: Yeah, for the moment we can have a locally-built value for provisioner
<niemeyer> fwereade_: That's feeling unlikely at the moment.. it's actually less convenient due to upgrading/versioning/etc
<niemeyer> fwereade_: This PathKey() sounds useful for other places too, btw
<niemeyer> fwereade_: (like, building the /var/lib dir for the unit)
<fwereade_> niemeyer, yes indeed
<fwereade_> niemeyer, (but I bet we could still do a juju-provisioner charm that works with upgrades... not now though :))
<niemeyer> fwereade_: Hmm.. that'd be interesting, but I think we'd have to bridge two different worlds for that (charm upgrades and agent upgrades)
<fwereade_> niemeyer, I *think* they're independent... I don't see the charm as needing many upgrades, and all we'd need is to bounce the upstart job whenever we saw a config-changed
<fwereade_> niemeyer, so long as the upstart job is written in terms of the unit's AgentToolsDir, anyway :)
<fwereade_> niemeyer, it *is* a bit hackish, I agree :)
<niemeyer> fwereade_: Yeah :)
<niemeyer> Bug fixed!
<fwereade_> niemeyer, yay! (what bug?)
<niemeyer> LOL
<niemeyer> fwereade_: The one in juju-core/store you likely see every other time you run "go test" (/me hides)
<fwereade_> niemeyer, YAY!
<niemeyer> fwereade_: and interestingly, it was a *real* bug
<fwereade_> niemeyer, ha, hiding in plain sight
<niemeyer> Yeah
<niemeyer> Will need to ping mthaddon to see if the counter data in production needs fixing
<niemeyer> and it's up: https://codereview.appspot.com/6490083
<niemeyer> fwereade_: and this one has the digest support: https://codereview.appspot.com/6500083
 * niemeyer opens a bug on charmworld to use it
<niemeyer> fwereade_: ping by any chance?
<niemeyer> Okay, that was a long and nice day
<niemeyer> davecheney: Morning!
<niemeyer> I'm stepping out for dinner+rest
#juju-dev 2012-09-06
<davecheney> ./conn_test.go:117: undefined: juju.NewConnFromAttrs
<davecheney> grrr, thanks rog
<TheMue> Good morning
<fwereade_> TheMue, heyhey
<davecheney> hello
<fwereade_> davecheney, heyhey
<TheMue> fwereade_: Had a nice party yesterday?
<fwereade_> TheMue, ha, just a sleep :)
<TheMue> fwereade_: Thought you would be too upset to fall asleep. (Do you say so?)
<fwereade_> TheMue, excited maybe more than upset
<davecheney> what did I miss ?
<fwereade_> TheMue, upset usually carries unhappy connotations :)
<fwereade_> davecheney, I ran something on juju :)
<davecheney> fwereade_: indeed you did
<davecheney> a cause for celebration, not for sadness by my account
<fwereade_> davecheney, definitely, it's just that I celebrated by going to sleep
<davecheney> sounds like a valid choice to me
<TheMue> fwereade_: Ah, excited is the missing word. Thx.
<TheMue> fwereade_: Already felt that upset is wrong.
<davecheney> possibly elated
<TheMue> Btw, my vacation starts today. But I'll be in later today again as I have two LGTMs.
<davecheney> ok
<TheMue> While I'm away you can reach me by mail, thx to those little portable computers which also can phone. ;)
<davecheney> TheMue: i have a really bad camera that also has email on it
<davecheney> btw, http://codereview.appspot.com/6497070/
<davecheney> this might be the solution to the strange EOF problems we have talking to LP and AWS
<davecheney> *might*
<davecheney> or it could just be the intertubes in australia
<fwereade_> TheMue, btw, if you're still around, why do you favour the switch over the if/else in loose-hook-info-members?
<fwereade_> TheMue, it's clearly a perfectly legitimate thing to do but I have a strong bias toward switching on just one thing... is this just because I'm used to "broken" switch statements?
<rogpeppe> davecheney, fwereade_: morning
<davecheney> morning
<rogpeppe> davecheney: the change to NewConnFromAttrs was niemeyer's suggestion BTW
<rogpeppe> davecheney: sorry about that
<davecheney> rogpeppe: so'k
<davecheney> wasn't hard to figure out
<rogpeppe> davecheney: it made for better alignment with the entry points in environs
<davecheney> yeah, 'tis an improvement
<TheMue> fwereade_: Just optical reasons. I like the switch more than if-else-chains (even if they are short). But don't care, just personal preferences. :)
<fwereade_> TheMue, changed it, still somewhat ambivalent, we'll see
<rogpeppe> fwereade_: does this look better to you than the version as test fixture?  https://codereview.appspot.com/6495086/
<fwereade_> rogpeppe, heyhey
 * fwereade_ looks
<rogpeppe> fwereade_: i generally prefer a switch if there are more than two branches
<TheMue> fwereade_: Thx. So, off for the moment.
<fwereade_> rogpeppe, there are only two here
<rogpeppe> fwereade_: in which case i'm probably on the fence, although i haven't seen the code in question
<fwereade_> rogpeppe, it may just be that I had a couple of reviews requesting switch->if-else and have grown wary of them :)
<fwereade_> rogpeppe, that looks much nicer to me
<rogpeppe> fwereade_: cool. i'm not entirely sure about the "series" helper methods (explicit better than implicit?) but i'm happier with it as a normal type not a fixture.
<fwereade_> rogpeppe, definitely
<rogpeppe> davecheney: good work on tls fix - let's hope it helps
<rogpeppe> fwereade_: the only other thing i considered was embedding LocalRepository or something like that.
<rogpeppe> fwereade_: 'cos they're really quite similar
<rogpeppe> fwereade_: but most uses don't care, and it's easy to make up a LocalRepository when needed
<fwereade_> rogpeppe, IIRC I wanted to do that when I was first writing it but couldn't get them quite similar enough :)
<davecheney> rogpeppe: i'm not really sure it was the problem, but TSAN picked it up so it needed to be fixed
<rogpeppe> fwereade_: i think it's ok - they both represent a repository, but do different things with it. and all the state is in the directory, so it doesn't matter.
<rogpeppe> davecheney: ah. i'm looking forward to that being included as standard. i'm interested to see what it makes of our code...
<davecheney> rogpeppe: there is a reason why I don't run GOMAXPROCS > 1
<davecheney> that will make gozk EXPLODE!!
<rogpeppe> davecheney: have you tried it?
<davecheney> once
<davecheney> got a segfault in gozk
<rogpeppe> davecheney: ha ha. i thought it *should* be safe.
<davecheney> rogpeppe: did you log that issue to have the signal details logged?
<rogpeppe> davecheney: ah, no. i'll do that.
<rogpeppe> davecheney: it would be quite a hassle to fix sadly
<rogpeppe> davecheney: because in that context we can't call any functions
<davecheney> isn't there a panicf
<davecheney> or a printf that doesn't allocate ?
<rogpeppe> davecheney: hmm, maybe. i'm not sure of what invariants we need at that level.
<rogpeppe> davecheney: of course, the real fix is to allow callbacks from non-Go-created threads.
<davecheney> rogpeppe: that fix might be saying 'the real fix is to relax the laws of relativity to allow high paying packets to travel at their chosen speed'
<rogpeppe> davecheney: yeah
<davecheney> m and g are not available, but as long as the GC doesn't need to run, it should be safe to use runtime.print
<rogpeppe> davecheney: do you know what linux signal behaviour is w.r.t. threads? is the signal delivered to one thread only? or all threads?
<rogpeppe> davecheney: hmm, one random thread, it looks like. hurray.
<davecheney> rogpeppe: yup, but all threads wouldn't help either
<rogpeppe> davecheney: if it was all threads we could block the signals in all but the one we care about and ignore if we find no m, no?
<davecheney> rogpeppe: good point, but i'm sure that was tried and found wanting
<rogpeppe> davecheney: better would be if a thread had to explicitly enable signals rather than receiving them by default
<davecheney> /home/dfc/src/launchpad.net/juju-core/testing/mgo.go:119: c.Fatal("Test left sockets in a dirty state")
<davecheney> what does this mean ?
<fwereade_> davecheney, sorry, no idea... failed to delete them maybe?
<davecheney> fwereade_: i was leaking a mgo.Session
<davecheney> my fault
<fwereade_> np
<rogpeppe> davecheney: do you know a way of reliably reproducing the signal problem? my naive attempt doesn't fail: http://paste.ubuntu.com/1188513/
<davecheney> rogpeppe: not really
<davecheney> it is quite rare
<davecheney> definitely run with high GOMAXPROCS
<rogpeppe> davecheney: if the issue is what we think it is (the signal is randomly delivered to some thread), i can't see why the above code doesn't trigger the issue
<davecheney> rogpeppe: try firing the signal from the C side
<rogpeppe> davecheney: i tried firing the signal from the command line. no difference.
<davecheney> rogpeppe: i'm not surprised, it's not supposed to happen
<rogpeppe> davecheney: well... how can they be stopping it from happening?
<davecheney> 'supposed to happen' == no panic
<rogpeppe> davecheney: i've seen a couple more failures of the TestSSHConnect test BTW. (one just now and another one yesterday).
<rogpeppe> davecheney: it concerns me
<davecheney> rogpeppe: my suggestion to that would be to switch to the ssh crypto package
<rogpeppe> davecheney: i'd love to. we started off trying that direction, but it wasn't mature enough.
<davecheney> rogpeppe: all it needs is support for loading keys off disk (in the same order that ssh does it)
<davecheney> and talking to the ssh agent (which we have now)
<davecheney> i'll put it on my list for this weekend
<davecheney> must fly
<Aram> moin.
<fwereade_> Aram, heyhey
<fwereade_> rogpeppe, trivial: https://codereview.appspot.com/6490086
<rogpeppe> fwereade_: LGTM
<fwereade_> rogpeppe, cheers
<rogpeppe> fwereade_: pity it had never been tested live before...
<fwereade_> rogpeppe, indeed
 * Aram has troubles finding flights to lisbon.
<rogpeppe> fwereade_: need trivial LGTM please: https://codereview.appspot.com/6494090
 * fwereade_ looks
<fwereade_> LGTM
<fwereade_> rogpeppe, ^
<rogpeppe> fwereade_: ta!
 * rogpeppe is ashamed to have broken the build :-(
<rogpeppe> fwereade_: FWIW, i think a string argument to AgentToolsDir is still ok. each agent can decide on its own name, no?
<rogpeppe> fwereade_: (i just saw your discussion with gustavo last night)
<fwereade_> rogpeppe, it happens ;p
<fwereade_> rogpeppe, kinda... but, atm, everybody has to prepend unit-, and deslash, whenever they want to specify a unit name
<fwereade_> rogpeppe, this feels kinda icky
<fwereade_> rogpeppe, although I am also not 100% sold on KeyPath
<rogpeppe> fwereade_: i'm not sure what you mean. unit names aren't prepended with "unit-" AFAICS
<rogpeppe> fwereade_: the *agent name* for a unit is though
<fwereade_> rogpeppe, isn't that what AgentToolsDir expects?
<rogpeppe> fwereade_: yes, but how often do you need to call AgentToolsDir?
<fwereade_> rogpeppe, a few places that may eventually coalesce into one place
<fwereade_> rogpeppe, the hook env needs it, EnsureJujucSymlinks needs it
<fwereade_> rogpeppe, container needs it
<rogpeppe> fwereade_: why does the hook env need it?
<fwereade_> rogpeppe, so we can put it on the PATH
<rogpeppe> fwereade_: why not just do that in the uniter main?
<fwereade_> rogpeppe, and, later, add it as an explicit env var for use when setting up juju-run commands
<fwereade_> rogpeppe, because it's only needed by the hooks?
<rogpeppe> fwereade_: fair enough. i'd probably do something like uniter.AgentName(unit *state.Unit) string
<rogpeppe> fwereade_: or even uniter.ToolsDir(unit *state.Unit) string
<fwereade_> rogpeppe, not unreasonable for container to know about uniter, I guess... and environs doesn't need to know about uniters, because that's always done by machiner... yeah, probably good
<rogpeppe> fwereade_: yeah, i hadn't thought about container needing to know about uniter, but as you say, sounds reasonable, as it's starting the worker (indirectly)
<fwereade_> rogpeppe, yeah, exactly
<fwereade_> rogpeppe, a further thought
<fwereade_> rogpeppe, if we do have a consistent "kind-name" badge for the various agents, which sounds sensible, we really should be using it in more places than just AgentToolsDir
<fwereade_> rogpeppe, it should be in logfile names, for example
<rogpeppe> fwereade_: that seems like a good plan
<rogpeppe> davecheney: did you have any further discussion with niemeyer about your CL https://codereview.appspot.com/6499071
<rogpeppe> ?
<rogpeppe> davecheney: i'm not sure i understand his objections
<rogpeppe> davecheney: but perhaps his objections are *only* to the CL description
<davecheney> rogpeppe: no, i will try again tomorrow
<rogpeppe> davecheney: and if it was changed to "most uses just want the *Machine" it would be ok
<davecheney> rogpeppe: couldn't hurt
<rogpeppe> davecheney: although he's right - there's actually *no* real code that wants the Machine rather than the id
<davecheney> rogpeppe: it's probably not worth the discussion
<davecheney> what I have works
<davecheney> and there are so many more important things to do
<rogpeppe> true 'nufff
 * rogpeppe goes for some lunch
<rogpeppe> niemeyer: morning!
<niemeyer> rogpeppe: Morning!
<niemeyer> Hello all!
<rogpeppe> niemeyer: thanks for the review!
<niemeyer> rogpeppe: np, thanks for all the niceties
<rogpeppe> niemeyer: no problem at all
<rogpeppe> niemeyer: i don't think there are two round trips to get machine id and machine. in state i think that's true anyway. or are you thinking of mstate?
<niemeyer> rogpeppe: Sorry, your logic escapes me.. you mean that state.Machine(id) doesn't have a roundtrip to the state server?
<rogpeppe> niemeyer: no, i'm saying that MachindId needs the round trip anyway
<rogpeppe> MachineId
<rogpeppe> AssignedMachineId even
<niemeyer> rogpeppe: The logic still escapes me.. the CL makes it call Machine, always
<niemeyer> rogpeppe: Machine has a roundtrip to the state server, both in state and in mstate
<rogpeppe> niemeyer: ah yes, you're right. that's wrong. but we can construct the machine from the topology without needing an extra round trip.
<niemeyer> rogpeppe: No, we can't, because we use cached state in mstate
<rogpeppe> niemeyer: ah, so Unit.AssignedMachineId doesn't go to the server?
<niemeyer> rogpeppe: Yes, and Machine has to go to the server to get the machine state
<rogpeppe> niemeyer: right, gotcha. presumably we *could* fetch the machine too when we fetch the unit.
<niemeyer> rogpeppe: Dude..
<rogpeppe> lol
<rogpeppe> ok ok
<fwereade_> huh, are we not actually using --juju-directory for anything?
<mramm> Hi all.   I was on swap yesterday, but am back now.
<mramm> Just got off a call about doing a Juju marketing video
<Aram> hi mramm
<fwereade_> mramm, heyhey
<fwereade_> mramm, I deployed something yesterday, would you believe?
<mramm> I'm also working on the info for the website, a juju webinar slide deck, and talking to jorge about marketing juju to developers and using it as a DIY PaaS.
<mramm> fwereade_: That is AWESOME!
<mramm> fwereade_: Everybody's hard work is coming together!
<fwereade_> mramm, yep :D
<mramm> I am excited!
<fwereade_> mramm, now ofc I have to hammer it into proposable shape
<mramm> and using too many exclamation points!!!!!!!!!!!!!
<fwereade_> mramm, but that shouldn't be too far off
<mramm> fwereade_: great!
<mramm> I am excited about the juju video
<mramm> the people we've got working on it seem competent
<rogpeppe> fwereade: you'll be pleased to know that authorisation of internal traffic now works
<fwereade> rogpeppe, sweet
<fwereade> mramm, (that was the only manual step)
<rogpeppe> fwereade: just waiting on a live test that checks that both machine agents upgrade
<rogpeppe> YAY!
<rogpeppe> it works
 * fwereade cheers
<mramm> I also had a talk with Clint about packaging juju for Ubuntu on Tuesday night
<mramm> rogpeppe: AWESOME!
<mramm> so very much awesome going on right now!
<rogpeppe> fwereade: 3.87s to upgrade both agents FYI
<rogpeppe> fwereade: and that's using the proper deploy infrastructure, juju.AddService, AddUnits, etc
<fwereade> rogpeppe, sweeet
<rogpeppe> fwereade: yeah, it's nice to have some functional tests that are a bit higher level
<rogpeppe> fwereade: it's just a pity that pushing the tools takes so long. sometimes as long as 5 minutes.
<rogpeppe> fwereade: it would have been done earlier if amazon's docs weren't so crap :-)
<fwereade> rogpeppe, ha, I know your pain
<mramm> rogpeppe: haha
<mramm> So, I have a product review meeting tomorrow
<mramm> if you have things you think I should give management heads up on, let me know
<mramm> I've got the unit agent, updater, mongo, and marketing update basics figured out
<mramm> so if you have any additional detail you think is important on those things, or any other stuff that people need to know about, shoot it on over.
<fwereade> rogpeppe, niemeyer: sanity check on http://paste.ubuntu.com/1189087/ if you have a mo
<fwereade> niemeyer, KeyPath on state entities was not so nice in the end -- we don't always have the state entities available
<fwereade> niemeyer, but I *think* this ends up quite neat
<fwereade> niemeyer, rogpeppe: I would almost certainly want to pair it with a PathSuite or something that swaps out LibDir, LogDir, InitDir
<fwereade> niemeyer, (and possibly $HOME, why not, eh?)
<niemeyer> juju/testing/conn.go:// It also sets up $HOME and environs.VarDir to
<niemeyer> fwereade: ^
<fwereade> niemeyer, yeah; and a lot of other tests swap one or both of those out here or there
<fwereade> niemeyer, it was moving environs.VarDir that made me think "ouch, too much duplication"
<niemeyer> fwereade: My only concern is that right now we already have an Agent with an interface
<fwereade> niemeyer, but if I could find a sensible name, it would probably be ok?
<fwereade> niemeyer, and actually I'm not sure agent.Spec is all that much subject to confusion
<fwereade> niemeyer, wait a mo, there is no Agent interface that I can find
<fwereade> niemeyer, now I want to just call that an Agent :)
<niemeyer> fwereade: cd cmd/jujud; grep Agent *
<fwereade> niemeyer, I see AgentConf and AgentState
<niemeyer> fwereade: Uh?
<niemeyer> fwereade: UnitAgent, MachineAgent, ProvisioningAgent, AgentState, AgentConf, etc etc
<fwereade> niemeyer, ok... you said we had an Agent with an interface, maybe I mistook what you meant
<niemeyer> fwereade: AgentConf being JujuDir + StateInfo..
<fwereade> niemeyer, ah yeah, meant to mention that, JujuDir is not used
<niemeyer> fwereade: I mean that the concepts you're playing with already exist
<niemeyer> fwereade: We can't just add another package without sorting their situation out
<fwereade> niemeyer, ok, I see now
<fwereade> niemeyer, well, JujuDir doesn't actually do anything
<niemeyer> fwereade: That won't change if we just add another package :)
<fwereade> niemeyer, but I'm pretty sure what it *should* do is overwrite environs.VarDir (which would be agent.LibDir)
<rogpeppe> niemeyer: https://codereview.appspot.com/6500089
<niemeyer> rogpeppe: Cheers.. I think I'll focus a bit on the watcher today, though, otherwise we'll never have it
<rogpeppe> niemeyer: no problem
<rogpeppe> niemeyer: you'll be glad to know that this enables the uniter stuff to work live though
<niemeyer> rogpeppe: Wow, sweet!
<rogpeppe> niemeyer: and that it tests two machine agents upgrading at the same time
<rogpeppe> niemeyer: through the juju.Conn deploy machinery
<niemeyer> rogpeppe: Oh man
<niemeyer> rogpeppe: It's coming together
<rogpeppe> niemeyer: if nothing else, just have a brief glance at https://codereview.appspot.com/6500089/diff/2001/environs/jujutest/livetests.go
<rogpeppe> niemeyer: (i'm really happy that worked)
<niemeyer> rogpeppe: Impressive indeed.. even more impressive to have that as a stock test
<rogpeppe> niemeyer: yeah, i'm happy about that
<niemeyer> fwereade: wb
<rogpeppe> fwereade: https://codereview.appspot.com/6500089
<niemeyer> fwereade: I sent you and -dev a mail
<niemeyer> fwereade: With the feedback, before I forget and so that I could move on
<fwereade> niemeyer, cheers, I will take a look
<fwereade> niemeyer, yeah, consider my mind already churning on these thoughts
<niemeyer> fwereade: +1, cheers
<rogpeppe> fwereade: shouldn't JujuDir set environs.VarDir?
<fwereade> rogpeppe, I intimated as much above
<rogpeppe> f.StringVar(&environs.VarDir, "juju-directory", environs.VarDir, "juju working directory")
<rogpeppe> fwereade: oh yes, i didn't notice that
<fwereade> rogpeppe, although I am starting to feel dissatisfied with its globalness, and am wondering if we really need it
<rogpeppe> fwereade: yeah, i'd be happy for it to avoid globalness
<rogpeppe> fwereade: i think most of environs/tools.go could happily fit in a new package
<rogpeppe> fwereade: with no globals
<rogpeppe> fwereade: maybe just a package called "tools"
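One way to avoid the global, sketched under assumptions: bind the flag to a field of a per-invocation struct instead of a package-level variable. The `Config` type, `ParseConfig` function, and the `/var/lib/juju` default are hypothetical, and this uses the standard library `flag` package rather than whatever flag package jujud actually uses.

```go
package main

import (
	"flag"
	"fmt"
)

// Config is a hypothetical per-agent configuration value; nothing in it
// is stored in package-level state.
type Config struct {
	VarDir string
}

// ParseConfig parses args into a fresh Config. The flag name matches the
// one quoted in the chat; the default directory is an assumption.
func ParseConfig(args []string) (*Config, error) {
	cfg := &Config{}
	fs := flag.NewFlagSet("jujud", flag.ContinueOnError)
	fs.StringVar(&cfg.VarDir, "juju-directory", "/var/lib/juju", "juju working directory")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return cfg, nil
}

func main() {
	cfg, err := ParseConfig([]string{"-juju-directory", "/tmp/juju"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.VarDir)
}
```

With this shape, tests can build a `Config` pointing at a temporary directory without swapping out and restoring a shared variable.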
<rogpeppe> aw
<rogpeppe> fwereade: last thing you saw?
<rogpeppe> fwereade: last thing from you i saw was
<rogpeppe> [16:39:26] <fwereade> rogpeppe, although I am starting to feel dissatisfied with its globalness, and am wondering if we really need it
<fwereade> rog, I saw  fwereade: i think most of environs/tools.go could happily fit in a new package
<rogpeppe> [16:41:37] <rogpeppe> fwereade: with no globals
<rogpeppe> [16:42:19] <rogpeppe> fwereade: maybe just a package called "tools"
<fwereade> rogpeppe, I was just pondering exactly that while having a ciggie :)
<rogpeppe> fwereade: because that's what this is all about, i *think*
<fwereade> rogpeppe, yeah, there's a sideline in "bits of container look awfully like bits of cloudinit" -- which originally started me off on this jaunt -- but you are absolutely right
<niemeyer> Lunch break
<rogpeppe> fwereade: which bits were you thinking of?
<fwereade> rogpeppe, just the bits that create Confs are awfully similar and mildly tediously inconsistent
<fwereade> rogpeppe, not a big deal at all really, but it was nagging at me
<rogpeppe> fwereade: Conf?
<fwereade> rogpeppe, upstart.Conf
<rogpeppe> fwereade: yeah, i think that code would fit quite happily in a tools package
 * rogpeppe is quickly hacking together a sketch
<rogpeppe> fwereade: how about something like this? http://paste.ubuntu.com/1189197/
<rogpeppe> fwereade: a few bits tidied up:
<rogpeppe> package tools
<rogpeppe> // SearchFlags gives options when searching for tools.
<rogpeppe> type SearchFlags int
<rogpeppe> const (
<rogpeppe> 	// HighestVersion indicates that versions above the version being
<rogpeppe> 	// searched for may be included in the search. The default behavior
<rogpeppe> 	// is to search for versions <= the one provided.
<rogpeppe> 	HighestVersion SearchFlags = 1 << iota
<rogpeppe> 	// DevVersion includes development versions in the search, even
<rogpeppe> 	// when the version to match against isn't a development version.
<rogpeppe> 	DevVersion
<rogpeppe> 	// CompatVersion specifies that the major version number
<rogpeppe> 	// must be the same as specified. At the moment this flag is required.
<rogpeppe> 	CompatVersion
<rogpeppe> )
<rogpeppe> // List holds a list of available tools.  Private tools take
<rogpeppe> // precedence over public tools, even if they have a lower
<rogpeppe> // version number.
<rogpeppe> type List struct {
<rogpeppe> 	Private []*state.Tools
<rogpeppe> 	Public  []*state.Tools
<rogpeppe> }
<rogpeppe> // ListAll returns a List holding all the tools
<rogpeppe> // available in the given environment that have the
<rogpeppe> // given major version.
<rogpeppe> func ListAll(env Environ, majorVersion int) (*List, error)
<rogpeppe> // Put builds the current version of the juju tools, uploads them
<rogpeppe> // to the given storage, and returns a Tools instance describing them.
<rogpeppe> // If vers is non-nil it will override the current version in the uploaded
<rogpeppe> // tools.
<rogpeppe> oops
<rogpeppe> sorry everyone
<rogpeppe> i meant this: http://paste.ubuntu.com/1189199/
<rogpeppe> fwereade: http://paste.ubuntu.com/1189199/
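For readers unfamiliar with the `1 << iota` pattern in the SearchFlags paste above, here is a minimal sketch of how such bit flags combine and are tested. The type and constant names are copied from the paste; the `has` method is a hypothetical addition for illustration and is not part of the pasted sketch.

```go
package main

import "fmt"

// SearchFlags mirrors the type from the pasted sketch.
type SearchFlags int

const (
	// Each constant gets its own bit: 1, 2, 4.
	HighestVersion SearchFlags = 1 << iota
	DevVersion
	CompatVersion
)

// has reports whether every bit in want is set in f.
// This helper is an assumption added for the example.
func (f SearchFlags) has(want SearchFlags) bool {
	return f&want == want
}

func main() {
	// Flags are combined with bitwise OR.
	flags := HighestVersion | CompatVersion
	fmt.Println(flags.has(CompatVersion))
	fmt.Println(flags.has(DevVersion))
}
```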
<rogpeppe> hmm, i wonder if i've been muted
<fwereade> rogpeppe, sorry, I can hear you
<rogpeppe> np
<fwereade> rogpeppe, it looks sane, I think, I'm not sure exactly how it will square with the other wild ideas I am chasing
<rogpeppe> fwereade: what kind of thing are you thinking of?
<fwereade> rogpeppe, agent.Agent, and what exactly it should have attached to it, if it should even exist
<rogpeppe> fwereade: i'm not convinced it should exist
<rogpeppe> fwereade: an agent is its own beast
<fwereade> rogpeppe, there are an *awful* lot of commonalities across agents, I think there is a useful common structure waiting to emerge
<rogpeppe> fwereade: i go along with niemeyer's comments in this respect
<rogpeppe> fwereade: when it emerges, we can factor it out
<rogpeppe> fwereade: as particular chunks of functionality
<fwereade> rogpeppe, I think it's worth an overnight ponder at least :)
<rogpeppe> fwereade: saying "*this* is an agent" is a bit like the inheritance way of thinking, i think
<fwereade> rogpeppe, I guess it is somewhat predicated on the assumption that an agent should correspond to a single state entity
<rogpeppe> fwereade: indeed
<rogpeppe> fwereade: which may very well not be the case in the future
<rogpeppe> fwereade: well... i guess we will probably always have at least one item in the state for an agent
<fwereade> rogpeppe, hmm -- so it's workers that should correspond to state entities?
<rogpeppe> fwereade: no
<rogpeppe> fwereade: it's running processes
<rogpeppe> fwereade: i.e. things that can upgrade themselves
<fwereade> rogpeppe, ok (is there now a Provisioner state entity that I haven't noticed?)
<rogpeppe> fwereade: but i may very well be seeing through very upgrading-centric eyes currently :-)
<rogpeppe> fwereade: no, but there will be
<rogpeppe> fwereade: of some kind
<rogpeppe> fwereade: otherwise we can't tell when a PA manages to upgrade itself
<fwereade> rogpeppe, cool, thought so, and this just makes all the agents look even more similar to me, but indeed the stars might not yet be aligned
<rogpeppe> fwereade: there are definitely similarities
<rogpeppe> fwereade: but we can abstract those out when we need some common functionaliy
<rogpeppe> ty
<rogpeppe> fwereade: what kinds of operations do you envisage on agent.Agent?
<rogpeppe> fwereade: would it be a concrete type or an interface?
<SpamapS> Hello golangian friends. I am wondering, are there any examples of Go applications already packaged (via PPA, or even in the Ubuntu archive) ?
<SpamapS> and on a related note, I am looking at writing a charm in Go as an exercise, and wondering how best to do it.. #!/usr/bin/gorun is great for prototyping, but at some point I think I'll want to just compile it
<niemeyer> SpamapS: lbox and cobzr might be good examples to start with
<niemeyer> SpamapS: They're both in PPAs
<SpamapS> niemeyer: perfect
<niemeyer> SpamapS: and auto-building
 * SpamapS already has the PPA's for those.. apt-get source to the rescue
<SpamapS> niemeyer: any reason you had to be so explicit on binary-arch here: http://bazaar.launchpad.net/~niemeyer/lbox/package/view/head:/debian/rules
<SpamapS> niemeyer: seems like an override_dh_install: might have handled that.
<TheMue> hiho
<TheMue> niemeyer: ping
<rogpeppe> SpamapS: i had some ideas about writing charms in Go, but haven't had time to do anything about them yet
<SpamapS> rogpeppe: it seems a bit heavy handed for most charm duties.
<rogpeppe> SpamapS: yeah, a shell script is often a good fit
<SpamapS> which boil down to "parse this file. put that file over there. run this command" ..
<niemeyer> TheMue: Pong
<niemeyer> SpamapS: Hmm
<SpamapS> rogpeppe: I'm actually finding puppet's DSL quite good and succinct for charm duties. ;)
<TheMue> niemeyer: just wanted to inform you that the CLs are in
<niemeyer> TheMue: Beautiful, thanks
 * rogpeppe hasn't actually looked at puppet yet
<TheMue> niemeyer: You had some notes on the first and not on the second and vice versa. I applied them on both.
<SpamapS> rogpeppe: the biggest drawback is that you have to bust out to ruby to do anything interesting
<niemeyer> TheMue: Well, actually you didn't, as I mentioned in the review, but that's fine
<TheMue> niemeyer: Like the tabs, the panic message and the comment.
<niemeyer> TheMue: We can sort out these details when you're back
<niemeyer> TheMue: Ah, yeah, that was great,thanks
<TheMue> niemeyer: What exactly are you referring to?
<rogpeppe> SpamapS: interesting. ruby's Yet Another Language I Have Never Used
<rogpeppe> SpamapS: i always figured it was too close to python to be worth learning unless i needed to.
<niemeyer> TheMue: The three points in the review
<TheMue> niemeyer: WaitAgentAlive() on Unit?
<niemeyer> TheMue: Where I suggested they might be done in a different CL
<SpamapS> rogpeppe: more like perl meets javascript
<niemeyer> TheMue: There's a bug opened as well with them now
<TheMue> niemeyer: Yeah, the Dying and the Alive error, but not the removal of the loop.
<niemeyer> TheMue: Don't worry, what's going in is a great start, and we can easily sort the points out in a follow up when you're back
<TheMue> niemeyer: You made the comment not on the last checkin.
<niemeyer> SpamapS: I don't really know.. I bet there are better ways to handle it.. I'm just not a well versed packager
<rogpeppe> SpamapS: anyway, Go might be good for charms that do a lot of orchestration dancing
<SpamapS> niemeyer: it's really only a future-proofing thing anyway.. the way it's written now is 100% correct
<SpamapS> rogpeppe: right, I was thinking of centralized things that have to handle all coming/going .. like monitoring or logging
<rogpeppe> SpamapS: i reckon you could write it in good Go style, as a goroutine that receives "charm-hook-execute" events in a loop, then acts on the hook and replies.
<TheMue> niemeyer: Please take a look at https://codereview.appspot.com/6494073/diff/2002/mstate/unit.go at line 230ff
<rogpeppe> SpamapS: rather than our traditional "you get called when a hook happens" model
<niemeyer> TheMue: Ok?
<TheMue> niemeyer: It's now w/o the loop, like for Machine.
<rogpeppe> SpamapS: but these thoughts are only fleeting and unformed...
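The goroutine-based charm model floated above might look something like the sketch below. Everything here is hypothetical: `HookEvent`, `runCharm`, and `execHook` are invented names for illustration, and juju provided no such API at the time of this conversation.

```go
package main

import "fmt"

// HookEvent is a hypothetical "charm-hook-execute" event. The charm
// replies on Reply once it has finished handling the hook.
type HookEvent struct {
	Name  string
	Reply chan error
}

// runCharm receives hook events in a loop, acts on each one and replies.
// State (here just the list of hooks seen) lives in ordinary local
// variables across hooks. The list is sent on done when events is closed.
func runCharm(events <-chan HookEvent, done chan<- []string) {
	var seen []string
	for ev := range events {
		seen = append(seen, ev.Name)
		var err error
		switch ev.Name {
		case "install", "start", "config-changed":
			// A real charm would do its work here.
		default:
			err = fmt.Errorf("unknown hook %q", ev.Name)
		}
		ev.Reply <- err
	}
	done <- seen
}

// execHook sends one hook event and waits for the charm's reply.
func execHook(events chan<- HookEvent, name string) error {
	reply := make(chan error)
	events <- HookEvent{Name: name, Reply: reply}
	return <-reply
}

func main() {
	events := make(chan HookEvent)
	done := make(chan []string)
	go runCharm(events, done)
	for _, name := range []string{"install", "start", "bogus"} {
		fmt.Println(name, execHook(events, name))
	}
	close(events)
	fmt.Println(<-done)
}
```

The contrast with the traditional model is that the charm keeps running between hooks, so it can hold state in memory instead of re-deriving it on every invocation.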
<niemeyer> TheMue: Ok, that's great
<SpamapS> rogpeppe: that's actually what I was thinking too. But I think such things will only be necessary in the extreme cases where something has many thousands of related service units.
<SpamapS> rogpeppe: for 99% of the cases.. python/ruby/bash will be up to the task of running a hook every few seconds.
<rogpeppe> SpamapS: it's not necessarily about large scale - it could make it easier to make logic simpler
<SpamapS> rogpeppe: I doubt that. :)
<TheMue> Fine, now I can leave relaxed. I will look into mail via phone to stay informed during my vacation. It's an exciting time for juju.
<rogpeppe> SpamapS: you may well be right :-)
<SpamapS> rogpeppe: simpler to a go developer maybe.. but not to a charmer. :)
<SpamapS> the most insanely complex charms are still pretty straight forward
<rogpeppe> SpamapS: well, i certainly intend to write some more charms at some point :-)
<rogpeppe> SpamapS: also, i have a feeling we'll get more golangers coming along in the python/ruby space, though i may be wrong in that
<SpamapS> http://www.cloudifysource.org/
<SpamapS> another juju-like project
<niemeyer> TheMue: Yeah, have a great time there
<TheMue> niemeyer: Thx, the weather forecast looks good, so we'll have some fine days (and evenings with a bottle of wine) at the water. It's only already too cold for swimming.
<rogpeppe> i'm off too. have fun all. see you tomorrow!
<rogpeppe> TheMue: enjoy your break!
<TheMue> gn rogpeppe
<TheMue> and thank you
<fwereade> niemeyer, ping
<niemeyer> fwereade: Pongus
<fwereade> niemeyer, I think trunk is broken, I'm getting a panic from somewhere inside presence in the mstate tests... just verifying it *actually* happens on trunk, not just my merged version
<fwereade> niemeyer, if it is, is it still ok to submit over the top?
<niemeyer> fwereade: Can you please paste the panic?
<fwereade> niemeyer, https://bugs.launchpad.net/juju-core/+bug/1047051
<niemeyer> fwereade: Yeah, if that's the only failure, it'd be fine
<fwereade> niemeyer, cool, thanks
<fwereade> niemeyer, yeah, trunk too
<niemeyer> fwereade: Ah, ok.. I can tell what it is
<niemeyer> fwereade: I've missed that when reviewing Frank's branch
<fwereade> niemeyer, ++psychic debugging :)
<niemeyer> fwereade: We're creating a watcher and not stopping
<fwereade> niemeyer, jolly good, could be worse ;)
<niemeyer> fwereade: Hehe :)
<niemeyer> fwereade: I'll fix it as soon as I stop the current line of thinking here
<fwereade> niemeyer, lovely, tyvm
<niemeyer> fwereade: Sorry for the trouble
<fwereade> niemeyer, no trouble :)
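The class of bug described above (a watcher created but never stopped) is usually avoided by pairing creation with a deferred Stop. A minimal sketch follows; the `Watcher` type is a hypothetical stand-in and not the mstate or presence watcher itself.

```go
package main

import (
	"fmt"
	"sync"
)

// Watcher is a hypothetical watcher that owns a background goroutine.
type Watcher struct {
	Changes chan int
	stop    chan struct{}
	wg      sync.WaitGroup
}

// NewWatcher starts the background loop; the caller must call Stop.
func NewWatcher() *Watcher {
	w := &Watcher{Changes: make(chan int), stop: make(chan struct{})}
	w.wg.Add(1)
	go w.loop()
	return w
}

// loop offers an increasing counter on Changes until Stop is called.
func (w *Watcher) loop() {
	defer w.wg.Done()
	for i := 0; ; i++ {
		select {
		case w.Changes <- i:
		case <-w.stop:
			return
		}
	}
}

// Stop terminates the goroutine and waits for it to exit.
// It must be called exactly once in this simplified sketch.
func (w *Watcher) Stop() {
	close(w.stop)
	w.wg.Wait()
}

// firstChange shows the pattern: create, defer Stop, then use.
func firstChange() int {
	w := NewWatcher()
	defer w.Stop() // without this the goroutine would leak
	return <-w.Changes
}

func main() {
	fmt.Println(firstChange())
}
```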
<niemeyer> fwereade: ping
<fwereade> niemeyer, pong
<niemeyer> fwereade: Yo
<fwereade> niemeyer, how's it going?
<niemeyer> fwereade: Up for a quick review fixing that issue?
<fwereade> niemeyer, absolutely
<fwereade> niemeyer, then sleepytime ;)
<niemeyer> fwereade: Cool, pushing
<niemeyer> fwereade: I bet! 8)
<niemeyer> fwereade: https://codereview.appspot.com/6510043
<niemeyer> I've screwed up the description.. fixing meanwhile
<fwereade> niemeyer, LGTM
<niemeyer> Done
<niemeyer> fwereade: Cheers!
<fwereade> niemeyer, btw, I think I have convinced myself that the reset --soft is pure superstition; and that in any cases which may exist in which deleting the lock file is insufficient, we really can't do anything about this
<fwereade> niemeyer, *but* that I also shouldn't even try to recover on unit agent startup -- the user could very easily be logged in and messing with the dir while the uniter is in an error state, and we can't just charge in and start doing stuff
<niemeyer> fwereade: Ah, cool, thanks for the note.. I really wasn't sure
<fwereade> niemeyer, so I now want to just speculatively delete the lock file before every command that hits the index
<niemeyer> fwereade: Sounds good.. your atomicity scheme kind of makes it unnecessary as long as the automated process is concerned, so sounds sane
<niemeyer> fwereade: Uh oh
<fwereade> niemeyer, hm, is that bad?
<niemeyer> fwereade: Sounds like the two extremes..
<niemeyer> fwereade: We're not resetting because the user might be messing around, but we're happy to kill the lock all the time?
<fwereade> niemeyer, the most authoritative-sounding source I could find was a git list thread from 2009, in which they discussed making the "just delete the lock file" message carry more of that intent
<fwereade> niemeyer, the only times we have any reason to kill the lock are precisely when we're doing things with the index
<fwereade> niemeyer, this implies that we are upgrading, or running hooks, and if the user is messing around in the charm dir while we are working that is on his own head ;)
<niemeyer> fwereade: I see, so hitting the index as in changing it
<niemeyer> fwereade: Not just reading, right?
<fwereade> niemeyer, I'm not sure it'll even read the index when something else has a lock
<fwereade> niemeyer, but this was in fact done by a process of pure research
<niemeyer> fwereade: I'd rather start more conservative, if feasible
<niemeyer> fwereade: The lock is there precisely to prevent processes from stepping on each other
<fwereade> niemeyer, of all the commands we use: add, commit, pull, reset all require that the lock file not exist
<niemeyer> fwereade: Sure, because if it exists someone/something else may be acting on the directory
<niemeyer> fwereade: But, as I understand all the changes are done in independent directories
<niemeyer> fwereade: With your atomicity mechanisms
<niemeyer> fwereade: Which means that if we ever see a lock, something awkward is happening
<niemeyer> fwereade: Breaking when something awkward happens seems legit
<fwereade> niemeyer, not actually true, but it could be made to be so
<niemeyer> fwereade: Ah, indeed, the final pull I guess
<fwereade> niemeyer, yeah, the charm dir itself is not a swappable symlink
<fwereade> niemeyer, although maybe it should be
<niemeyer> fwereade: Cool, we can leave that
<niemeyer> fwereade: Even then, I'd be fine with asking the user to go there and fix it for the moment, before we're sure that stuff works fine
<niemeyer> fwereade: Seems useful to learn about how frequently we'll observe lock files left around and whatnot
<fwereade> niemeyer, hm, ok, let me think a sec
<niemeyer> fwereade: We can always change our mind and be less conservative
<fwereade> niemeyer, in that case I suspect I favour just explicitly blowing up when we see a lock file that would impede the impending operation
<fwereade> niemeyer, sane granularity?
<niemeyer> fwereade: Yeah, that's what I mean
<niemeyer> fwereade: Ah, explicitly
<niemeyer> fwereade: You mean checking it's there?
<fwereade> niemeyer, we shouldn't be operating on the charm dir except at times when the user really shouldn't be operating on the charm dir
<fwereade> niemeyer, hmm, just let it fail even :)
<niemeyer> fwereade: Right.. there's no point in checking, since that's exactly the race that the lock file is meant to protect against
<fwereade> niemeyer, true :)
<fwereade> niemeyer, I'm just thinking it through... this is a potential new error state
<niemeyer> fwereade: Really? How is this any different from any other error related to the charm deployment?
<niemeyer> fwereade: Can't create dir, can't download, etc
<fwereade> niemeyer, ah, ok, it's a totally-screwed-agent-down-uniter-log-filled-with-errors situation, not a polite-error-message-log-in-nicely-and-resolve-it one?
<fwereade> niemeyer, I guess all of those are
<niemeyer> fwereade: Right
<fwereade> niemeyer, yep, ok, I'm just more scared of git because I'm more aware of my ignorance
<niemeyer> fwereade: Well.. some of those errors surely are resolvable too, right?
<niemeyer> fwereade: I'm not sure if you're saying these are conditions we can't recover from or not?
<fwereade> niemeyer, I'm just saying that currently we don't
<niemeyer> fwereade: Okay, I'm happy to move on and polish later, but in principle those conditions should also be recoverable
<niemeyer> fwereade: E.g. a download error is a pretty unexceptional error
<niemeyer> fwereade: As Amazon would gladly tell you
<fwereade> niemeyer, so *that* will work just fine assuming the resource is actually there... eventually it'll succeed
<niemeyer> fwereade: Perhaps not gladly, no :-)
<fwereade> niemeyer, and therefore shouldn't be reported, at least not immediately, because we expect it to be transient
<niemeyer> fwereade: Right, ok.. so that's what I mean.. I see all of those cases as "wedged, needs action"
<niemeyer> fwereade: but "juju resolved" in general should mean "Alrightly, will try again and see if that works now"
<fwereade> niemeyer, I'm not sure I agree that a dropped connection during a download is reason enough to stop doing things until the user comes to help us
<niemeyer> fwereade: Sorry, yes, you're right.. that's not a good example
<fwereade> niemeyer, that's the trouble
<niemeyer> fwereade: I mean a case like "can't create directory"
<fwereade> niemeyer, some uniter errors are resolvable by time, and others justify a stop-the-world
<fwereade> niemeyer, we do not as yet have a way of classifying them
<niemeyer> fwereade: I see, ok
<fwereade> niemeyer, it would be a nice thing to be able to do so but it also feels like it might get hairy
<niemeyer> fwereade: I think we might do with some heuristics
<niemeyer> fwereade: without having to put them in separate buckets
<fwereade> niemeyer, so, given that, I would like to just drop the concept of recovery entirely and let the process fail as it pleases
<niemeyer> fwereade: E.g. if it's something we want to retry automatically, don't yet tell it's an error
<niemeyer> fwereade: But if we retry so many times, or for so long, and it's not working, stop trying and get into an error state
<fwereade> niemeyer, yeah, reasonable -- feels like a different change to just dont-ever-recover though
<niemeyer> fwereade: This is even a conservative approach to prevent a large system from self-destructing
<niemeyer> fwereade: Agreed
<fwereade> niemeyer, but, yeah, absolutely, sounds like what we should be doing to me
<niemeyer> fwereade: To make things simpler for us to move forward, it's probably fine to start with don't-ever-recover, and move towards auto-retry
<fwereade> niemeyer, well, auto-retry will happen on everything for free via upstart
<niemeyer> fwereade: That's not very nice
<niemeyer> fwereade: We're managing retrying internally in other cases already
<niemeyer> fwereade: E.g. connections to state
<niemeyer> fwereade: Well, all errors really
<niemeyer> fwereade: In other agents we manage retry-after-error internally
<fwereade> niemeyer, oh, er, yeah, sorry, I *did* copy that bit in from the machine agent
<fwereade> niemeyer, ok, I see where you're coming from now
<fwereade> niemeyer, so managing the heuristics becomes a little easier, and it's still a good idea to add some sort of heuristics
<fwereade> niemeyer, er, sorry nonsense
<fwereade> niemeyer, it feels to me like "try again" should always be reasonable behaviour if we've written the workers right
<fwereade> niemeyer, the spinning and logspam are unsightly, it is true
<niemeyer> fwereade: It is always true, but in some cases we may need authorization to do so, due to the non-idempotency of some operation
<fwereade> niemeyer, and that should be handled internally by the uniter, surely?
<fwereade> niemeyer, the uniter knows about these things; the unit agent just needs to keep a uniter running at all times, except when it's upgrading
<fwereade> niemeyer, ah ok I think I see a source of confusion
<niemeyer> fwereade: That, what?
<fwereade> niemeyer, when I say always-retry is the default I am specifically referring to errors that cause the uniter to fail
<fwereade> niemeyer, hook errors do not cause the uniter to fail -- when state indicates a hook error, the uniter runs just fine, in a mode in which it waits for resolution
<fwereade> niemeyer, the uniter will not retry an upgrade or a hook unless requested
<fwereade> niemeyer, and so it should always be safe for the unit agent to launch a new one after the first has failed
<niemeyer> fwereade: If a conflict happens on a merge.. how is that handled?
<fwereade> niemeyer, the GitDir returns ErrConflict, magic happens, the uniter runs ModeConflicted
<niemeyer> fwereade: and then?
<fwereade> niemeyer, eventually the user resolves the error, and we pull again, commit, and continue in ModeStarted
<niemeyer> fwereade: Okay, it is an error that doesn't cause the uniter to bail out, then?
<fwereade> niemeyer, exactly
<niemeyer> fwereade: How's a Conflicted error different from a "directory has a lock" one, from our perspective?
<fwereade> niemeyer, no different, essentially; this is a point of trouble :/
<niemeyer> fwereade: How's it trouble? It seems like a solution.. if it's the same, let's unify the handling
<fwereade> niemeyer, ha, yes indeed
<fwereade> niemeyer, there's no good reason for any of the git operations to fail, none of them touch the network
<fwereade> niemeyer, if they *ever* do we can go straight to help-fix-git-please mode
<niemeyer> fwereade: Right
<niemeyer> fwereade: help-fix-deployment even
<niemeyer> fwereade: heck-were-is-my-data alternatively
<niemeyer> :-D
<niemeyer> where
<fwereade> niemeyer, indeed, it's currently called ModeConflicted and assumes that it's because of an upgrade, but I don't think that's fundamental, just coincidental
<fwereade> haha
<niemeyer> fwereade: Yeah, ModeDeploymentError or something
<niemeyer> fwereade: status-info ftw
<fwereade> niemeyer, yep, ok, I think it's all percolating nicely through my brain; time to sleep for now, I hope I'll have time to sort out smarter error handling before too long
<fwereade> gn
<niemeyer> fwereade: Have a good night
#juju-dev 2012-09-07
<niemeyer> davecheney: Morning!
<davecheney> hey
<niemeyer> WOohay! First watcher test just passed
<niemeyer> davecheney: All good there/
<niemeyer> ?
<davecheney> watchers! hot damn
<davecheney> it's good to have you in the trenches with us
<niemeyer> davecheney: Thanks, I actually enjoy it a lot too, and it's getting even more exciting now that we can actually use the stuff we've been doing
<niemeyer> davecheney: Well, or are about to use, anyway
<niemeyer> 3 tests passing!
<davecheney> niemeyer: i have UseSSH patched into mstate
<niemeyer> davecheney: Wot
<niemeyer> woot
<davecheney> i'm going to see if I can change the test suites to run twice, once direct, the other via ssh
<davecheney> the state/* tests don't do this
<davecheney> they have a simple test for ssh forwarding only
<davecheney> actually, i'll do that in a followup CL
<davecheney> AAAAAAAAAAAAAAAAAAAAGH
<davecheney> i forgot the prereq !!!
<niemeyer> Uh oh :)
<niemeyer> and it takes shape..
<niemeyer> Watcher foundation is up for review..
<niemeyer> and I'm down for bed..
<fwereade> niemeyer, gn
<niemeyer> fwereade: Heya
<niemeyer> fwereade: Up early? Or maybe I'm late..? :)
<fwereade> niemeyer, I'm a bit early, you're very late :)
<fwereade> niemeyer, sounds like a fruitful day though
<niemeyer> fwereade: Indeed.. pretty happy with it
<fwereade> niemeyer, cool
<niemeyer> fwereade: Slightly dense, but not too much code at all
<fwereade> niemeyer, great
<niemeyer> fwereade: and performant.. a thousand events written down, monitored and dispatched in a couple of seconds
<fwereade> niemeyer, fantastic :D
<niemeyer> fwereade: Unblocking FTW :)
 * niemeyer heads to a nice shower and bed..
<niemeyer> fwereade: Have a good time there
<fwereade> niemeyer, sleep well :)
<niemeyer> fwereade: tks!
<davecheney> niemeyer: great stuff!
<rogpeppe> fwereade: mornin'
<fwereade> rogpeppe, heyhey
 * rogpeppe goes to look at the watcher foundation...
<fwereade> rogpeppe, trivial: https://codereview.appspot.com/6488092
<rogpeppe> fwereade: all code deletion LGTM!
<fwereade> rogpeppe, cheers :)
<fwereade> rogpeppe, another trivial: https://codereview.appspot.com/6503087
<rogpeppe> fwereade: isn't that just the kind of thing that log.Debugf is for? we already *have* a Debug flag
<fwereade> rogpeppe, one global debug flag is IMO not adequate -- *most* of the time we don't need this logging, but it has on occasion been invaluable
<fwereade> rogpeppe, when it's running, though, it overwhelms the agent logs with spam
<rogpeppe> fwereade: maybe we should remove the global debug flag entirely
<fwereade> rogpeppe, possibly -- but I suspect that fully "correct" logging is likely to be a contentious subject and would prefer to avoid that rabbit hole while I can ;p
<rogpeppe> fwereade: or provide some way of registering particular debug flags
<fwereade> rogpeppe, yeah, indeed, *I* think we need something like that but niemeyer was -1 in related discussions a while ago
<rogpeppe> fwereade: fair enough. LGTM with the above reservations.
<fwereade> rogpeppe, cheers :)
<fwereade> rogpeppe, I'm heading out to work in a cafe for a while, bbs
<rog> davecheney: hiya
<fwereade_> going home again, later all
<davecheney> rog: hey
<Aram> hey all.
<Aram> I'm not really here today, but hello.
<rog> hello not-here Aram
<rog> davecheney, Aram: trivial LGTM please? https://codereview.appspot.com/6495103/
<davecheney> rog: fire at will
<rog> davecheney: ta
<rog> fwereade: as a stopgap measure, how about: http://paste.ubuntu.com/1190585/
<fwereade> rog, something like that is indeed a possibility, that was kinda where we started
<rog> fwereade: gets us through for the time being anyway, i think. and doesn't pretend to encapsulate an agent, just provides a way for a given agent to find its name.
<rog> fwereade: FYI i think your trivial.EnsureDir is exactly equivalent to os.MkdirAll
<fwereade> rog, ha, that'll learn me :/
<rog> fwereade: apart from it produces a slightly different error message when the target isn't a directory :-)
<rog> fwereade: how about /var/lib/juju/agents/unit-foo-3 as a uniter directory name (rather than /var/lib/juju/units/foo-3) ? then all the agent directories can live under the same directory.
<fwereade> rog, definitely yes
<rog> fwereade: great. done :-)
<fwereade> rog, even though none of the other agents need local storage yet, it seems crazy not to think about them
<rog> fwereade: +1
<fwereade> rog, ah ok -- what are you working on atm?
<rog> fwereade: i'm fixing that container bug
<rog> fwereade: where it's using the wrong path
<rog> fwereade: because it's directly affecting me currently
<rog> fwereade: erm, are you on the same thing?
<fwereade> rog, not exactly... I don't think... but which bug?
<rog> fwereade: a bug i mentioned yesterday (but haven't actually reported)
<rog> fwereade: where container is looking up the path in $PATH
<rog> fwereade: but it needs to use the agent name in the executable path
<rog> fwereade: otherwise upgrades won't work
<fwereade> rog, ah right, if that were still around I'd be drive-by fixing it next time I saw it
<fwereade> rog, but knock yourself out :)
<rog> fwereade: thus far, until i tried earlier today, we haven't tried upgrading a non-cloudinit-started agent yet
<rog> fwereade: i'm already reeling from the blow
<fwereade> rog, when does that happen except on unit deployment?
<fwereade> rog, because container.Deploy is just generally broken
<rog> fwereade: it doesn't
<rog> fwereade: oh, how else is it broken?
<fwereade> rog, no state info
<rog> fwereade: i'll fix it in this CL
<fwereade> rog, no logfiles, no output, the upstart stuff is just generally inconsistent with the way agents are started in cloudinit
<rog> fwereade: ah, i now understand your remark yesterday
<rog> fwereade: "can we get StateInfo from State?"
<fwereade> rog, ha, yeah :)
<fwereade> rog, actually just passing it around is not really any trouble
<rog> fwereade: well, we can't even get a State from a Unit AFAIK
<fwereade> rog, indeed
<rog> Deploy(environs.Environ, *state.Unit) ?
<fwereade> rog, hmm, maybe, why get an environ when everything that deploys should have a stateinfo readily available?
<rog> fwereade: sounds reasonable to me. i'll see how it pans out.
<fwereade> rog, cool
<fwereade> brb
<rog> fwereade: it's a kind of moral equivalent of Environ.StartInstance if you think about it like that
<fwereade> rog, gut still says -1 but I'll wait and see :)
<rog> fwereade: -1 to what?
<fwereade> rog, giving an Environ to Deploy... but I could well be wrong
<rog> fwereade: (i wasn't suggesting passing an environs in - Environ.StartInstance takes a StateInfo as an argument)
<rog> s/environs/environ/
<fwereade> rog, ah!
<fwereade> rog, ok, sorry :)
<rog> fwereade: environs.VarDir is going :-)
<fwereade> rog, excellent news
<rog> fwereade: just looking at worker/uniter/tools_test.go. this seems a little weird:
<rog> 	s.toolsDir = c.MkDir()
<rog> 	toolsDir := filepath.Join(s.varDir, "tools")
<rog> 	err := os.Mkdir(toolsDir, 0755)
<rog> is that first s.toolsDir assignment a mistake?
 * fwereade looks
<fwereade> rog, I think it's sane, but I agree the names are a bit off
<rog> fwereade: so what are the two meanings of "toolsDir" there?
<rog> niemeyer: yo!
<rog> niemeyer: good work on the watcher foundation BTW
<niemeyer> Hello!
<niemeyer> rog: Hah
<niemeyer> rog: From the amount I've heard about this, I thought you'd be significantly happier ;-)
<rog> niemeyer: not quite sure i get you there
<niemeyer> rog: "I will be really happy when I see watchers working." -- Roger Peppe
<rog> niemeyer: haven't seen 'em working yet!
<fwereade> rog, s.toolsDir is the place the tools are actually stored; toolsDir is what environs thinks the main tools dir is
<rog> niemeyer: but it looks good.
<niemeyer> rog: Run the tests and enjoy then
<niemeyer> :)
<fwereade> rog, it is probably not necessary, but when I was writing it all I knew was that the tools would actually be a symlink away, so it seemed sensible to test it that way
<fwereade> rog, your changes sound likely to make that redundant
<rog> fwereade: ah i see. so s.toolsDir could be environs.ToolsDir(someVersion)
<fwereade> rog, yeah, exactly
<rog> fwereade: this is what i'm doing: http://paste.ubuntu.com/1190853/
<fwereade> rog, that looks ideal
<rog> fwereade: great
<niemeyer> Alright!
<niemeyer> rog: Carefully responded to your concerns with background, and repushed with the fixes
<niemeyer> rog: Thanks for the review
<niemeyer> I'm now stepping out for some holidaying
<rog> niemeyer: have a great time! you deserve it.
<niemeyer> rog: Thanks!
<fwereade> rog, this is pleasing, I find myself bumping up against wanting a ToolsDir just as you suggested yesterday
<rog> fwereade: yesterday? or just now?
<fwereade> rog, I thought yesterday -- did you show me something other than ToolsSuite today?
<fwereade> rog, I presume it's pretty much what you've been doing, what with the VarDir removal
<rog> fwereade: ah, did i have a type named ToolsDir?
<fwereade> rog, heh, maybe that wasn't its name
<rog> fwereade: tools.Dir perhaps
<fwereade> rog, that could be it :)
<rog> fwereade: i think i'm happier passing around varDir tbh
<rog> fwereade: i'm not sure the type is that useful
<fwereade> rog, that also works for me tbh
<rog> fwereade: i quite like the way it's turning out, and it's not too disruptive.
<fwereade> rog, excellent... I think I have something nice but it *is* a bit disruptive, I need to keep poking before I can decide whether it's right
<fwereade> rog, I have a sudden feeling that it would be neat to use "agents/unit-x-1/tools" rather than "tools/unit-x-1"
<fwereade> rog, I suppose it requires that we create the agent directory for everything, which is slightly inconvenient
<rog> fwereade: i don't think so.
<rog> fwereade: it also means we can't have all the tools directories sitting in the same parent
<fwereade> rog, but having the stuff that the agent depends on within the agent's own directory feels like a nice thing
<rog> fwereade: with simple same-directory symlinks between them
<rog> fwereade: several agents may depend on the same tools
<fwereade> rog, what's wrong with absolute symlinks?
<fwereade> rog, I know that
<rog> fwereade: i *think* it's neater to have all the tools directories in one place
<rog> fwereade: and as you say, currently we don't need to create a directory for most agents
<fwereade> rog, it feels even neater to me to only have the real tools directories there
<rog> fwereade: so agents/unit-x-1/tools would be a symlink?
<rog> fwereade: i prefer relative symlinks too
<rog> fwereade: as it is (well, could be :-]), we can move the juju directory somewhere else, start things with a different --juju-dir and it'll work
<fwereade> rog, fair enough
 * rog reserves the right to change his mind, as always :-)
<mramm> Hey all.   Good work this week.  Review with sabdfl & managers went well today.
<rog> mramm: thanks
<rog> mramm: things are coming along nicely i think. as always the horizon recedes as we get towards it though :-)
<fwereade> mramm, cool
<fwereade> dammit, I think I have something really good coming along, but I have to go
<rog> fwereade: bother, i'm about to propose the vardir change and thought you might like a look :-)
<rog> fwereade: but fair enough, it's friday night!
<rog> fwereade: have a great weekend!
<fwereade> rog, that is extremely relevant to my interests, I will almost certainly look at it over the w/e
<fwereade> rog, and you :)
<rog> fwereade: cool
<rog> fwereade: all tests just ran clean, yay
 * fwereade cheers
<rog> fwereade: this branch is just a prelude to the branch that fixes container though, i'm afraid
<rog> fwereade: https://codereview.appspot.com/6501106
<fwereade> rog, cheers, I'll open the tab but I really must go
<fwereade> rog, take care, have fun :)
 * rog thinks that 3 hours for that amount of refactoring is pretty good going. huzzah for static typing!
<rog> fwereade: oh yes, incidentally this also naturally fixes the bug that --juju-dir didn't work :-)
<rog> right, that's me for the weekend.
<rog> see y'all monday; have a good one!
#juju-dev 2013-09-02
<bigjools> davecheney: seems like the azure provider is now basically blocked on a cloudinit sru to precise...
<davecheney> bigjools: ok, i'm making a 1.14 branch
<davecheney> we can backport fixes in there to get it working once cloudinit is done
<axw> davecheney: https://bugs.launchpad.net/juju-core/+bug/1216770 is merged, I just forgot to update the bug
<axw> it is in 1.13.3
<davecheney> axw: no worries
<davecheney> https://docs.google.com/a/canonical.com/document/d/1aEvcmxSJaj1i9zNjGy48yKF-SPlTFwW-NiKfoO_Ygo4/edit#heading=h.h7wry0fbg197
<davecheney> ^ hint hint
<axw> davecheney: updated
<axw> feel free to edit
<wallyworld> axw: hi, i had a test failure locally with the manual provisioning stuff - detectSeriesAndHardwareCharacteristics() because for some reason the bash that runs doesn't like my .shinit. i think we should ensure the bash that runs ignores all such files
<axw> wallyworld: doh, thanks. I'll get onto it
<wallyworld> np :-)
<davecheney> axw: cool
<davecheney> the builders are backed up
<davecheney> i won't get the build til after lunch
<davecheney> but we've tagged 1.13.3 now
<davecheney> so feel free to wreck the build
<wallyworld> axw: i had a quick look - using --noprofile or --norc didn't work; there must be another option to ensure .shinit is ignored. .shinit is more a sh thing i think
<axw> wallyworld: k, thanks
<axw> wallyworld: do you mean the unit tests were failing? or you were playing with it?
<axw> ah the sshscript thing I guess...
<wallyworld> axw: test failure. i ran the tests cause i merged trunk and there were api changes i needed to port into the new manual provisioning code.
<wallyworld> i have a .shinit, the bit that failed was a mkdir, go figure
<hazmat> davecheney, re slowness on cli.. how long does bootstrap take for you ? i'm consistently getting 2m plus
<davecheney> lucky(~) % juju bootstrap -e us-west-1 -v
<davecheney> 2013-09-02 01:24:23 INFO juju.provider.ec2 ec2.go:209 preparing environment "us-west-1"
<davecheney> 2013-09-02 01:24:23 INFO juju.provider.ec2 ec2.go:187 opening environment "us-west-1"
<davecheney> 2013-09-02 01:24:23 INFO juju.environs.tools tools.go:81 filtering tools by released version
<davecheney> 2013-09-02 01:24:23 INFO juju.environs.tools tools.go:28 reading tools with major version 1
<davecheney> 2013-09-02 01:24:23 INFO juju.environs.tools tools.go:36 filtering tools by series: precise
<davecheney> 2013-09-02 01:24:28 INFO juju.environs.tools tools.go:43 falling back to public bucket
<davecheney> 2013-09-02 01:24:29 INFO juju.environs.tools tools.go:92 picked newest version: 1.12.0
<davecheney> 2013-09-02 01:24:38 INFO juju.environs.boostrap bootstrap.go:56 bootstrapping environment "us-west-1"
<davecheney> 2013-09-02 01:24:38 INFO juju.environs.tools tools.go:28 reading tools with major version 1
<davecheney> 2013-09-02 01:24:38 INFO juju.environs.tools tools.go:33 filtering tools by version: 1.12.0
<davecheney> 2013-09-02 01:24:38 INFO juju.environs.tools tools.go:36 filtering tools by series: precise
<davecheney> 2013-09-02 01:24:39 INFO juju.environs.tools tools.go:43 falling back to public bucket
<davecheney> 2013-09-02 01:24:48 INFO juju.provider.ec2 ec2.go:426 started instance "i-24a95f7e"
<davecheney> 2013-09-02 01:24:50 INFO juju supercommand.go:284 command finished
<davecheney> that is from australia
<davecheney> i suspect the growing number of tools we have in the public bucket is not helping
<hazmat> wow
<hazmat> thats a huge delta from what i see
<davecheney> 2013-09-02 01:14:29 DEBUG juju state.go:159 waiting for DNS name(s) of state server instances [i-9b6cfffc]
<hazmat> davecheney, yeah.. it would be nice to just have a latest pointer
<davecheney> 2013-09-02 01:14:40 INFO juju.state open.go:68 opening state; mongo addresses: ["ec2-54-226-75-153.compute-1.amazonaws.com:37017"]; entity ""
<davecheney> i have no idea what is happening in those 10 seconds
<davecheney> oh, sorry
<hazmat> so that's roughly 16s vs 25s
<davecheney> that is getting the dns name from the provider
<hazmat> yeah.. instance id -> ip addr
<davecheney> it's ec2 being slow
<davecheney> what can I say
<hazmat> its still 75% of the runtime cost that can be saved with some trivial caching
<hazmat> per the bug title
<davecheney> lucky(~) % juju status -e us-west-1 -v
<davecheney> 2013-09-02 01:28:26 INFO juju.provider.ec2 ec2.go:187 opening environment "us-west-1"
<davecheney> 2013-09-02 01:28:29 INFO juju.state open.go:68 opening state; mongo addresses: ["ec2-54-219-20-209.us-west-1.compute.amazonaws.com:37017"]; entity ""
<davecheney> 2013-09-02 01:28:30 INFO juju.state open.go:106 connection established
<davecheney> environment: us-west-1
<davecheney> machines:
<davecheney>   "0":
<davecheney>     agent-state: started
<hazmat> pastebin pls
<davecheney>     agent-version: 1.12.0
<davecheney>     dns-name: ec2-54-219-20-209.us-west-1.compute.amazonaws.com
<davecheney>     instance-id: i-24a95f7e
<davecheney>     instance-state: running
<davecheney>     series: precise
<davecheney>     hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M
<davecheney> services: {}
<davecheney> 2013-09-02 01:28:33 INFO juju supercommand.go:284 command finished
<davecheney> you'll live :)
<hazmat> davecheney, i'm using us-west-2 incidentally
<hazmat> us-west-1 has higher pricing
 * davecheney tried us-west-2
<hazmat> davecheney,  http://pastebin.ubuntu.com/6053452/
<hazmat> 36s compared to 17s
<davecheney> hazmat: it's slow in us-west-2 'cos there are no tools
<hazmat> davecheney, there's tools in us-west-1?
<hazmat> i thought only in us-east
<davecheney> i have no idea why
<davecheney> tools are not region specific in ec2
<davecheney> the automatic fallback to sync-tool is very confusing if you don't pass -v
<davecheney> hazmat: http://paste.ubuntu.com/6053456/ us-west-1
<davecheney> us-west-2 is taking much longer
<davecheney> i can only assume it's doing sync tools
<hazmat> generally with fios/fiber to the home, i have pretty good bandwidth & latency.. in terms of ruling out client connectivity.
<hazmat> davecheney, there's some sleep and retry logic in the ec2/s3 code for eventual consistency as well
<davecheney> hazmat: if I can bootstrap in 25 seconds from australia
<davecheney> it's not a latency problem :)
<hazmat> wow
<hazmat> 2m9 vs 25s
<davecheney> it's sync tools
<davecheney> hazmat: still going ...
<davecheney> us-west-2
<hazmat> davecheney, plus it looks up tools multiple times..
<hazmat> in bootstrap
<hazmat> one to find/upload them, one to launch an instance
<hazmat> which gets duplicated by the provisioning worker for every launch
<davecheney> hazmat: http://paste.ubuntu.com/6053480/
<davecheney> ends with a Fffffuuuuu
<hazmat> 2m4s
<davecheney> 'cos of sync tools
<davecheney> workaround: don't use us-east-2
<davecheney> west-2
<hazmat> that's interesting i don't see the tool sync
<davecheney> gotta use -v
<davecheney> always -v
<davecheney> use -v for everything
<davecheney> got a question
<davecheney> -v
<davecheney> need answers
<davecheney> -v
<hazmat> i've been using --debug
<hazmat> which seems like it should imply -v
<davecheney> sure
<davecheney> i use-v
<davecheney> did i mention i use -v
<davecheney> -v is great
<davecheney> http://paste.ubuntu.com/6053487/
<davecheney> wut
<davecheney> it either is, or is not bootstrapped
<hazmat> davecheney, i know that bug
<hazmat> davecheney, its when you have a bucket but no environment, ie. shutdown from the console
<hazmat> davecheney, the only resolution was to destroy the environment again
<davecheney> whoo
<hazmat> or manually empty the bucket..
<davecheney> hazmat: http://paste.ubuntu.com/6053492/
<davecheney> 25seconds
<hazmat> davecheney, for comparison http://paste.ubuntu.com/6053494/
<wallyworld> axw: i looked at the simplestreams branch - i marked it as lgtm so long as a few issues are fixed. let me know if you have any questions. as soon as it lands, i need to munge simplestreams format a bit
<axw> wallyworld: okey dokey, thanks
<davecheney> 2013-09-01 14:47:35 INFO juju.environs.tools tools.go:93 picked newest version: 1.13.2
<davecheney> 2013-09-01 14:48:04 DEBUG juju.provider.ec2 storage.go:52 Creating bucket
<davecheney> what the heck is it doing for 30 seconds there
<davecheney> 2013-09-01 14:48:35 DEBUG juju.provider.ec2 ec2.go:394 ec2 user data; 12541 bytes
<davecheney> 2013-09-01 14:48:56 DEBUG juju.provider.ec2 ec2.go:397 ec2 groups setup
<davecheney> hazmat, whatever ec2 endpoint you are talking to
<davecheney> it's screwed
<hazmat> davecheney, that's us-east
<hazmat> davecheney, its the same time for us-west
<davecheney> sure, but bgp says we don't see the same thing
<hazmat> yeah
<axw> wallyworld: where did you try adding --noprofile/--norc?
<wallyworld> axw: in the call which invoked bash
<wallyworld> cmd := exec.Command("ssh", sshHost, "bash")
<wallyworld> but i may have not done it right, i was in a hurry
<axw> ok, that wouldn't work - bash doesn't actually get executed there
<davecheney> axw, maybe need -t
<axw> davecheney: ?
<axw> ssh -t?
<davecheney> otherwise bash won't have a pty
<axw> davecheney: yeah.. don't need it here. not actually running a real ssh in the tests
<davecheney> kk
 * davecheney stops 'helping'
<axw> ;)
<axw> wallyworld: I'm not sure how to reproduce the problem. Can you please try "var sshscript = `#!/bin/bash --noprofile" in detection_test.go, when you have a minute
<wallyworld> axw: sure, i'll just park some stuff and try it
<wallyworld> axw: no good :-(. you could maybe reproduce it by having a ~/.shinit with a non zero exit code?
<axw> wallyworld: tried that..
<axw> hrm
<wallyworld> hmmm. was your .shinit run?
<axw> probably not :)
<axw> it is if I run "sh"
<wallyworld> so i'm confused
<axw> I guess your bash profile is running something through sh?
<wallyworld> ah, i think i might invoke it
<wallyworld> elsewhere
<axw> I'm sure we have other tests which do this though
<axw> so I don't know why this one in particular would break
<wallyworld> i have it in my .bashrc
<wallyworld> BASH_ENV=$HOME/.shinit
<wallyworld> but .bashrc should not be run, right?
<thumper> wallyworld: what are you doing that is so special?
<wallyworld> thumper: the line that fails is a mkdir -p foobar
<thumper> axw: I suggest we just blame wallyworld and make him fix it
<axw> hehe
<thumper> wallyworld: what is in your .shinit?
<wallyworld> except we shouldn't be running profile scripts when doing the ssh :-)
<axw> wallyworld: so you could change --noprofile to --norc, but it's still a bit weird
 * thumper reads that quickly as "shit init"
 * axw snorts
<axw> I did the same :)
<thumper> heh
<wallyworld> thumper: i have a bunch of exports, and a line to make a tmp dir cause i have /tmp mapped to ram. the line which errors in the tests is mkdir -p /var/tmp/mailman/logs
<wallyworld> the test error is
<thumper> wallyworld: why does it fail?
<wallyworld> ...     "error detecting hardware characteristics: exit status 126 (/home/ian/.shinit: line 49: /bin/mkdir: Argument list too long\n" +
<wallyworld> ...     "/tmp/gocheck-8674665223082153551/3/ssh: line 14: /tmp/gocheck-8674665223082153551/3/ssh: Argument list too long\n" +
<wallyworld> ...     "/tmp/gocheck-8674665223082153551/3/ssh: line 14: /tmp/gocheck-8674665223082153551/3/ssh: Success)"
<thumper> wallyworld: why have it in .shinit instead of .bashrc?
<wallyworld> thumper: i call .shinit from my bashrc
<thumper> why?
<wallyworld> not sure now. there was a reason at some point
<thumper> if you renamed it to something else, what fails?
<wallyworld> if i rename it, the tests pass
<wallyworld> but then it is not run
<thumper> \o/
<wallyworld> well, there is still the problem that my .bashrc is being run
<wallyworld> in order to call .shinit
<wallyworld> and we shouldn't be doing that
<thumper> well, no
<thumper> /bin/sh loads .shinit
<axw> wallyworld: can you try replacing --noprofile with --norc?
<wallyworld> sure
<wallyworld> damn. no good
<axw> wtf
<wallyworld> yeah, makes no sense to me
<wallyworld> i'll just rename the shinit
<axw> wallyworld: re "Please make this a big number so that we can ensure the fix actually works"
<axw> that test is not testing the fix at all
<axw> it's testing that standard JSON unmarshalling behaviour takes palce
<axw> place*
<wallyworld> ah, ok. i thought that the new unmarshall/marshall stuff was being invoked all the time
<axw> if you unmarshal a number into an interface{}, you get a float64
<axw> no, only if you go through ParseCloudMetadata
<axw> because only then do you have a template value
<wallyworld> ok, no problem. thanks for clarifying
<axw> nps, I'll add a comment in the review too in case anyone else cares
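A minimal sketch of the behaviour axw describes, using only the standard library's encoding/json. The helper name decodeAny is made up for illustration; this is not the simplestreams code under review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAny is a hypothetical helper: it unmarshals arbitrary JSON into an
// empty interface, as happens when there is no typed template value.
func decodeAny(data string) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal([]byte(data), &v)
	return v, err
}

func main() {
	v, err := decodeAny(`{"size": 12345}`)
	if err != nil {
		panic(err)
	}
	size := v.(map[string]interface{})["size"]
	// JSON numbers decoded into interface{} arrive as float64, not int.
	_, isFloat := size.(float64)
	fmt.Printf("%T %v\n", size, isFloat)
}
```

Decoding into a struct with an integer field avoids the float64 conversion, which is why a typed template value changes the outcome.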
<wallyworld> did i make sense when i said i just wanted the marshall/unmarshall code moved out?
<axw> yes
<axw> I've kept "itemCollection" (the clone of ItemCollection) and the ItemCollection methods in json.go
<axw> everything else has gone back
<axw> is that right?
<axw> ItemCollection methods = UnmarshalJSON & construct
<wallyworld> yeah, i think that's good. i wanted the business logic / data model as it was, and the on the wire stuff separate so it's easier to grok etc
<axw> cool
<wallyworld> cause the user reading the simplestreams logic doesn't care about the json crap
<thumper> dog walk time
<wallyworld> and vice versa
<wallyworld> ha, that last comment could also apply to thumper
<wallyworld> even though it wasn't meant to
<axw> davecheney: the azure bugs can be fixed pretty quickly it seems
<axw> davecheney: any chance of slipping anything else in?
<axw> wallyworld: have you seen this? https://bugs.launchpad.net/juju-core/+bug/1219123
<davecheney> axw: too late
<davecheney> release is tagged
<davecheney> we can always back port
<axw> okey dokey
<wallyworld> axw: no, haven't seen that before
<thumper> arse biscuits
<thumper> some of our code is truly horrible
 * thumper thinks
 * thumper decides not to fix everything at once
<wallyworld> axw: are you landing your simplestreams branch now?
<axw> wallyworld: yes. do you want to make another pass, or shall I go ahead?
<wallyworld> axw: nah, Just Do It. we can iterate as needed
<axw> cool
<wallyworld> i need to merge it into my work after it lands, and then change stuff to do with the data format
<axw> okey dokey. let me know if there's something you'd like me to do
<axw> also, sorry it took so long
<wallyworld> no problem, it didn't take long, it was fairly involved
<axw> wallyworld: just merging with trunk, I'll let you know when the bot's done
<wallyworld> yay, thanks
<axw> or not, one of the tests is playing up
 * thumper wonders how to withdraw a rietveld review
<wallyworld> axw: i have school pickup, i check when i get back to see how the merge is going
<axw> wallyworld: ok, later. fixing tests atm
 * thumper uses some mocks for agent.Config
<thumper> it feels GOOD
<thumper> hmm...
<thumper> WTF went wrong there...
<thumper> copy and paste went wrong there
<jam> axw: I bring up the SHA256 thing because it failed in the bot
<axw> jam: ah right. yeah there was a bug in my test code, I was accidentally writing dynamic content to the fake tools files
 * thumper is almost done
<axw> must've been lucky with the gocheck temporary dir being the same on my machine
<thumper> argh!!!
 * thumper takes a deep breath
<jam> thumper is never almost done
<jam> :)
<thumper> :P
<jam> well, never *actually* done
 * thumper leaves it for now
<axw> jam: expect MS to call you if you've signed up for Azure
<axw> they woke me up the other day..
<jam> axw: yeah.
<jam> axw: so when I "LGTM" with some comments, if you address and agree with the comments, you can feel free to just land it
<axw> jam: ok, will do
<axw> I'm just updating the default-precise comment now
<axw> and testing
<davecheney> 1.13.3 is released
<davecheney> the lp builders are lagged
<davecheney> i'll upload the tarball and publish the release notes once the tools are uploaded
<jam> thanks davecheney
<davecheney> will look at the 1.14 branch next
<davecheney> /s/branch/series
<davecheney> i release 1.13.3 so y'all would stop putting bugs back in there :)
<davecheney> https://launchpad.net/juju-core/+milestone/1.14.0
<wallyworld> axw: just saw your test error - that's intermittent, i raised a bug 1219602
<wallyworld> i resubmitted and it worked
<axw> wallyworld: thanks, I was about to look into it...
<axw> cool
<wallyworld> np, that's why i pinged you to save you the trouble :-)
<axw> wallyworld: it says "needs review" still?
<wallyworld> axw: if a test run fails, the bot puts it back to needs review
<wallyworld> you need to flip it to approved again
<axw> <wallyworld> i resubmitted and it worked  <- what worked?
<wallyworld> axw: the landing bot ran the tests successfully and merged the branch
<wallyworld> when i say resubmitted, i mean i marked the mp as approved
<axw> oh ok
<axw> yep..
<wallyworld> and so the bot noticed and re did its thing
<axw> I'm confused because it's not saying merged, but it doesn't matter. I'll do it again
<wallyworld> axw: it's not saying merged because the merge did not happen because the tests failed
<wallyworld> so the bot emailed the test failures and flipped the status back to needs approved
<axw> wallyworld: yeah I get that, but I thought you said it passed
<axw> sorry, I'm being a bit thick
<wallyworld> it passed for *me* for my branch when i tried a second time
<axw> ahh right, on a different MP
<wallyworld> yeah, sorry
<dimitern> morning!
<axw> morning dimitern
 * wallyworld goes to soccer
<axw> wallyworld: it has finally merged
<axw> enjoy soccer
<jamespage> davecheney, I'm assuming that my next upload to saucy should be 1.14.0 right?
<jamespage> and thats really just 'stable release of 1.13.x'?
<jamespage> (need to phrase my changelog message right as I don't think its a huge jump from 1.13.2)
<jam> jamespage: that sounds right to me
 * jam heads out to run a couple of errands.
<davecheney> jamespage: correct
<davecheney> will be ready by weeks end if not sooner
<davecheney> i wouldn't worry about 1.13.3
<mgz> so, it's a no US day today
<thumper> jam: hey, seen mramm?
<mgz> thumper: I assume he's out as it's a US holiday today
<thumper> mgz: he's in the UK :)
<mgz> oh, then I guess not :)
<jam> mgz:  /wave
<jam> I haven't seen mramm all day anyway
<mgz> hey jam
<dimitern> https://codereview.appspot.com/13272045 - mgz, jam, please take a look (uniter api's remaining unit ops)
<dimitern> fwereade: ^^?
<fwereade> dimitern, sure
<dimitern> fwereade: ty
<mgz> dimitern: sure
<jam> mramm: hey, good to see you around
<mramm> jam: in the london office for the gui sprint
<dimitern> mramm: hey, so we can book now for the sf sprint?
<mramm> dimitern: yep
<dimitern> mramm: cool!
<dimitern> mramm: arriving 21, leaving 25 oct, right?
<dimitern> mramm: or thereabouts
<jam> dimitern: I think you need the final signoff from mramm before you can actually get the agency to book the travel
<jam> but you could probably start the conversation.
<dimitern> jam: well, following the steps in the mail, I'm filling in the form first, then will contact bts travel
<jam> dimitern: right, so after filling in the form mramm gives a final signoff on it, and you get a confirmation ID that you need to share with BTS before they'll book anything
<jam> I think you actually share the whole email with them.
<dimitern> jam: ok
<mgz> will miss the standup, will catch up when I'm back shortly after
<fwereade> dimitern, couple of questions, not *quite* an LGTM, the Destroy test is the only one that gave me serious pause
<dimitern> fwereade: ok
<dimitern> fwereade: well, the uniter does call unit.Destroy
<fwereade> dimitern, but not on itself
<dimitern> fwereade: filter.go:453, uniter.go:498
<fwereade> dimitern, bah, filter, I stand corrected
<dimitern> fwereade: :) so it's ok then?
<fwereade> dimitern, yeah, I think so
<fwereade> dimitern, it might be neater to implement the test for a unit destroying its own subordinate, though, because that's the one that's least sensitive to future change
<fwereade> dimitern, and you've already written that support code, so it should be easy ;p
<dimitern> fwereade: ok, will do
<fwereade> dimitern, thanks
<dimitern> fwereade: hmm
<fwereade> dimitern, eh, problems?
<dimitern> fwereade: I'm not sure it's possible to do an api call for a subordinate if you've logged in as the principal
<fwereade> dimitern, doesn't the principal destroy its own subordinates? maybe I misremembered
<dimitern> fwereade: i mean, it'll return ErrPerm because the auth'ed entity tag won't match
<fwereade> dimitern, modes.go:298
<dimitern> fwereade: so when logged in as a principal, you should be able to call Destroy() (and only that method?) on its subordinates as well
<fwereade> dimitern, that's the only one I can think of offhand
<dimitern> fwereade: ok, that'll require a few changes at server-side
<fwereade> dimitern, hopefully minor?
<dimitern> fwereade: they should be, yeah
<fwereade> dimitern, cool, thanks
<fwereade> dimitern, if it needs to be a separate CL so be it
<dimitern> fwereade: yeah it seems more like that
<dimitern> fwereade: more changes are needed actually - I need to be able to fetch a subordinate from a logged in principal
<dimitern> fwereade: so what do you say to leaving it as is, adding a TODO to fix it and doing the fix in a follow-up?
<fwereade> dimitern, LGTMed
<dimitern> cheers
<fwereade> dimitern, well, you don't need to get the subordinate for that piece of code
<fwereade> dimitern, that's just a Destroy loop over SubordinateNames
<dimitern> fwereade: not really
<fwereade> dimitern, hell, the first 10 lines of that mode should really just be a "destroy all subordinates plskthx" api call
<dimitern> fwereade: SubordinateNames returns names, not tags, and I need a uniter.Unit proxy object to call the server-side
<dimitern> fwereade: so I have to construct it in place, it'll be a bit ugly
<fwereade> dimitern, all the more reason to replace it all then -- but that's not the approach we agreed, I know
<fwereade> dimitern, that said
<fwereade> dimitern, why would we use names over the API? *shouldn't* they be subordinate tags? ;p
<dimitern> fwereade: we return names, but take tags as args
<fwereade> dimitern, that's incoherent
<fwereade> dimitern, how widespread is it?
<dimitern> fwereade: in most cases we use only tags, except for a few cases like SubordinateNames and the strings watchers
<dimitern> fwereade: not much, but I have to check
<fwereade> dimitern, so long as there's general agreement that the appropriate language is tags, not names, I'm reasonably happy
<fwereade> dimitern, bugger, we have deployed lifecycle watchers that use names, haven't we?
<dimitern> fwereade: so these return or use names: GetPrincipal, SubordinateNames,Relation (inside the RelationResult struct there's a ServiceName field), ReadRemoteSettings (only in the error message "cannot read...")
<dimitern> fwereade: lifecycle watchers are notifywatchers I think
<dimitern> fwereade: but the deployer uses a stringswatcher that returns names
<fwereade> dimitern, ideally I think all those would be fixed before we deployed something that expected names
<fwereade> dimitern, but the StringsWatcher thing has rather screwed us
<dimitern> fwereade: yep, so I looked
<dimitern> fwereade: stringswatchers, error messages, a few params struct fields, and these 2 methods of UniterAPI use names
<dimitern> fwereade: standup
<jam> TheMue: standup? https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471
<dimitern> wow, so with some of the cli now using the api, cmd/juju tests running time increased like 5 fold
 * dimitern lunch
<dimitern> fwereade: hey
<fwereade> dimitern, heyhey
<dimitern> fwereade: the way uniter api works now, it calls Life when you call uniter.Unit(tag)
<dimitern> fwereade: and that's the only way to get a unit proxy object
<dimitern> fwereade: does Life have to work for subordinates as well? because that opens a different can of worms
<fwereade> dimitern, I don't think so
<fwereade> dimitern, I suspect a simple DestroyAllSubordinates will be the easiest way
<dimitern> fwereade: so the client-side api has to distinguish between a principal and a subordinate
<dimitern> fwereade: ah, so make a new method on unit - DestroyAllSubordinates then?
<dimitern> fwereade: and the matching call at server-side
<fwereade> dimitern, I think so
<dimitern> fwereade: this will require some changes in the uniter as well
<fwereade> dimitern, agreed
<dimitern> fwereade: ok, will add a TODO for that as well
<fwereade> dimitern, we're writing code either way
<fwereade> dimitern, use your judgment
<fwereade> dimitern, if it's cleaner to preserve the structure then that's ok too
<dimitern> fwereade: it's not cleaner at all - it involves changing LifeGetter to accommodate that special case
<dimitern> fwereade: I prefer not to do it unless we need to
<fwereade> dimitern, huh? surely not
<fwereade> dimitern, auth is specified outside LifeGetter
<fwereade> dimitern, it's the same getauth as deployer, surely?
<fwereade> dimitern, ah not quite
<fwereade> dimitern, but still
<dimitern> fwereade: well, the auth func will get a lot more complicated, but yes LifeGetter might not need to change
<dimitern> fwereade: well, the auth func will get a lot more complicated, but yes LifeGetter might not need to change<
<dimitern> fwereade: oops, sorry
<dimitern> fwereade: too many windows :)
<dimitern> fwereade: there it is https://codereview.appspot.com/13426045
<dimitern> fwereade: ping
<fwereade> dimitern, hey, sorry, I'm looking
<dimitern> fwereade: cheers
<fwereade> dimitern, LGTM
<fwereade> dimitern, sorry delay
<fwereade> dimitern, although actually...
<dimitern> fwereade: thanks!
<fwereade> dimitern, how about putting that comment in uniter?
<fwereade> dimitern, that's where we'll need to know it, not the other places, really
<dimitern> fwereade: ah, good idea!
<dimitern> fwereade: another, quite trivial https://codereview.appspot.com/13348050
<fwereade> dimitern, a thought
<fwereade> dimitern, if we have DestroyAllSubordinates, the only use of Subordinates would be better expressed plain old HasSubordinates
<dimitern> fwereade: have to check
<dimitern> fwereade: yeah, that seems sane
<dimitern> fwereade: there are only 2 cases where it's used, and the second one is just like HasSubordinates
<dimitern> fwereade: so you're saying rename Subordinates to HasSubordinates, returning just a bool
<dimitern> fwereade: the penultimate for today, promise :) https://codereview.appspot.com/13383046
<dimitern> fwereade: the last will be a trivial rename s.unit -> s.stateUnit and unit -> apiUnit in client-side tests
<fwereade> dimitern, <3
<fwereade> dimitern, reviewed
<dimitern> fwereade: tyvm
<dimitern> fwereade: updated https://codereview.appspot.com/13348050
<fwereade> dimitern, LGTM, couple of things to check
<dimitern> fwereade: cheers
<dimitern> fwereade: updated https://codereview.appspot.com/13383046/
<hazmat> anyone tried local provider lately? i keep running into bug 1219887
<_mup_> Bug #1219887: local provider needs to wait for upstart to load service file <juju-core:New> <https://launchpad.net/bugs/1219887>
<rick_h> hazmat: used it on friday, what version? I was on the stable ppa I think
<hazmat> rick_h, trunk of go and juju
<dimitern> fwereade: last one - https://codereview.appspot.com/13472043
 * dimitern bbiab
<dimitern> fwereade: ping
<dimitern> TheMue, mgz: still aroung?
<dimitern> around
<TheMue> dimitern: will take a look after dinner
<dimitern> TheMue: cheers; it's a trivial renaming
<TheMue> dimitern: you've got a LGTM
<dimitern> TheMue: thanks!
<thumper> WTF?
<thumper> my review seemed to pick up all sorts of weird changes
 * thumper grunts
<thumper> because lbox uses the launchpad copy of the branch, not the local one
<davecheney> mramm ping
<davecheney> mramm ping
<mramm> davecheney: pong
<davecheney> goodbye sweet prince
<davecheney> that's what you get for using a mac
 * davecheney waves
<thumper> gym time \o/
#juju-dev 2013-09-03
<wallyworld> thumper: back?
<davecheney> wallyworld: https://code.launchpad.net/~dave-cheney/+recipe/juju-core
<davecheney> what has happened to the build recipe ?!?
<davecheney> why don't the debs end in the series ?
<wallyworld> i have no idea. my knowledge of packaging stuff is zero :-(
<wallyworld> bigjools: ?
<bigjools> sup
<wallyworld> ^^^
<davecheney> fuck
<davecheney> this broke the release
<davecheney> bugger it
<davecheney> i can fix this in the release script
<bigjools> what is the question for me, exactly?
<wallyworld> [10:36:18] <davecheney> what has happened to the build recipe ?!?
<wallyworld> [10:36:33] <davecheney> why don't the debs end in the series ?
<davecheney> bigjools: why do the debs produced by the recipe no longer end in the series
<bigjools> that makes no sense
<bigjools> what do you mean by "end in the series" ?
<davecheney> the name of the deb used to end in ~saucy
<davecheney> for example
<davecheney> now it ends in ~ubuntu13.10
<bigjools> because your recipe says "{debupstream}-1~{revno:juju-core}"
<davecheney> bigjools: that has never changed
<davecheney> and that gives me -1~1739
<davecheney> the remainder was added by the build bot
<bigjools> I don't see any complete builds that don't have that
<wallyworld> bigjools: [FULLYBUILT] juju-core - 1.13.2-4~1703~saucy1
<davecheney> bigjools: https://code.launchpad.net/~juju/+archive/devel/+builds?build_text=&build_state=built
<davecheney> something happened in the last two weeks
<davecheney> look at the name of the 1.13.2 packages
<davecheney> and earlier
<bigjools> it has changed to use release names instead of code names by the looks of things
<bigjools> which is correct IMHO
<bigjools> nothing is broken
<davecheney> bigjools: is it possible to change it back ?
<davecheney> i was relying on that misfeature
<bigjools> in what way?
<davecheney> i used the series in the name of the file produced to deduce the series the deb was produced for
<bigjools> can't you change it to look at the ubuntuNN.NN.N instead?
<davecheney> i can
<davecheney> but I wanted to know why this changed
<bigjools> no idea
<davecheney> who can I talk to about this ?
<bigjools> I'd ask steve or william
<wallyworld> i poked in launchpad-ops
<wallyworld> waiting on a reply
 * davecheney joins
<bigjools> I asked in #launchpad-dev :)
<bigjools> but there's probably better ways to deduce the series
<davecheney> bigjools: suggestions welcome
<davecheney> i have a deb file
<davecheney> i'd like to know which series it came from
<davecheney> maybe the changelog
<bigjools> use the LP API
<davecheney> bigjools: i have the file
<davecheney> all i have is the file
<wallyworld> davecheney: here's the answer
<wallyworld> [10:46:35] <wgrant> wallyworld_: It's changed to ~ubuntuYY.MM.1
<wallyworld> [10:46:45] <wgrant> As Ubuntu's backports have been for a while.
<wallyworld> [10:46:53] <wgrant> Because in a few releases the alphabet will wrap.
<davecheney> the file is all i have
<bigjools> where are you getting the file?
<davecheney> wallyworld: bigjools ok
<davecheney> i'll fix it in code
<davecheney> thanks for your help
<bigjools> davecheney: where are you getting the deb from?
<davecheney> the link from the recipe
<davecheney> i'll just make a table to convert from number to name
<bigjools> how are you accessing the recipe?
<bigjools> hang on :)
<davecheney> bigjools: manually
<davecheney> copy and paste the url
<bigjools> you have a script that scrapes the page?
<davecheney> http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/doc/juju-core-release-process.txt
<davecheney> i have a person that scrapes the page
<bigjools> release notes should always talk about version numbers not code names, FWIW
<bigjools> I was never comfortable with juju's config files containing "precise" etc
<bigjools> should be 10.04
<bigjools> 12.04 even
<davecheney> bigjools: i don't think its possible to change that now
<bigjools> it's always possible :)
<bigjools> just work
<davecheney> bigjools: if you feel strongly about it
<davecheney> raise an issue
<davecheney> it's not on anyone's radar
<bigjools> well I am just pointing out Ubuntu policy
<bigjools> given my prior work area
<axw> it's kinda on my radar; I was thinking ahead about how tools might be named for non-Ubuntu targets, if we were to support them
<bigjools> given that so many people get it wrong anyway it's not worth pushing hard
<axw> LSB codename doesn't work so well for other distros
<davecheney> lol at lsb
<bigjools> but it's something to be aware of because Ubuntu may change things like you just experienced
<davecheney> ubuntu doesn't start with the letter l
<davecheney> bigjools: fair enough, i am so warned
<bigjools> cool
<bigjools> davecheney: anyway I am struggling from that large page of text to work out what exactly you're doing with the deb
<bigjools> you get the deb url manually then?
<bigjools> oh and the script parses the deb file name.
<bigjools> you might want to specify that manually tbh
<bigjools> in Ubuntu debs can be in more than one release if they didn't change between
<davecheney> bigjools: that page just documents what happens now
<davecheney> it does not describe what we want to happen
<bigjools> fair enough
<bigjools> I'd honestly make an LP API script to automate this
<davecheney> bigjools: i'm sure someone will replace this with something much better
<bigjools> ok.  said my piece :)
<davecheney> thanks all for your help
<bigjools> any time
<davecheney> wasn't hard to work around
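A rough sketch of the "table to convert from number to name" workaround davecheney mentions. The function name, the regular expression and the sample file name are assumptions for illustration, not the actual release script; the release-to-codename pairs are the published Ubuntu ones.

```go
package main

import (
	"fmt"
	"regexp"
)

// ubuntuRelease picks the YY.MM part out of a "~ubuntuYY.MM.N" suffix.
var ubuntuRelease = regexp.MustCompile(`~ubuntu(\d+\.\d+)`)

// seriesByRelease maps Ubuntu release numbers to series code names.
var seriesByRelease = map[string]string{
	"12.04": "precise",
	"12.10": "quantal",
	"13.04": "raring",
	"13.10": "saucy",
}

// seriesFromDeb is a hypothetical helper that deduces the series from a
// deb file name using the new version suffix.
func seriesFromDeb(filename string) (string, error) {
	m := ubuntuRelease.FindStringSubmatch(filename)
	if m == nil {
		return "", fmt.Errorf("no ubuntu release found in %q", filename)
	}
	series, ok := seriesByRelease[m[1]]
	if !ok {
		return "", fmt.Errorf("unknown ubuntu release %q", m[1])
	}
	return series, nil
}

func main() {
	// Assumed file name in the new style; real names may differ.
	series, err := seriesFromDeb("juju-core_1.13.3-1~1739~ubuntu13.10.1_amd64.deb")
	fmt.Println(series, err)
}
```

The table needs a new entry for every release, which is part of bigjools's argument for asking the Launchpad API instead of parsing file names.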
<wallyworld> davecheney: i know i am stupid, but i'm getting a nil pointer panic and i can't see why :-( any clues? http://pastebin.ubuntu.com/6057264/
<davecheney> wallyworld: can't do it like that
<davecheney> what is happening is gocheck does this
<davecheney> s := new(LegacyToolsSuite)
<davecheney> legacy tools suite contains a ToolsSuite as a value initialised to its zero value
<davecheney> and inside that is a field called toolsTestHelper which is an interface
<davecheney> the interface is initialised to its zero value
<davecheney> which is a nil pointer
<wallyworld> why is polymorphism so hard in Go :-(
<davecheney> cos there is no polymorphism
<wallyworld> yeah :-(
<wallyworld> kinda limiting
<wallyworld> thanks, i'll rework it
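A small sketch of the situation davecheney explains. The pastebin contents are not shown here, so the types below only reuse the names from the chat (LegacyToolsSuite, ToolsSuite, toolsTestHelper); the interface, its method and the SetUp fix are assumptions.

```go
package main

import "fmt"

// testHelper stands in for the interface type mentioned in the discussion.
type testHelper interface {
	Name() string
}

type legacyHelper struct{}

func (legacyHelper) Name() string { return "legacy" }

// ToolsSuite holds the helper as an interface field; its zero value is nil.
type ToolsSuite struct {
	toolsTestHelper testHelper
}

// LegacyToolsSuite embeds ToolsSuite by value, so new(LegacyToolsSuite)
// yields a zeroed ToolsSuite with a nil toolsTestHelper.
type LegacyToolsSuite struct {
	ToolsSuite
}

// SetUp assigns a concrete helper (one possible rework, assumed here).
func (s *LegacyToolsSuite) SetUp() {
	s.toolsTestHelper = legacyHelper{}
}

// helperName calls through the interface and reports whether that panicked.
func helperName(s *LegacyToolsSuite) (name string, panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	return s.toolsTestHelper.Name(), false
}

func main() {
	s := new(LegacyToolsSuite) // what gocheck effectively does
	fmt.Println(helperName(s)) // the nil interface call panics
	s.SetUp()
	fmt.Println(helperName(s)) // works once the field is set
}
```

Embedding gives method promotion, not virtual dispatch, so the outer suite has to hand the inner one a concrete helper explicitly.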
<davecheney> ok, i've found another bug with that change
<davecheney> the version inside the 12.10 deb is 13.10
<wallyworld> \o/
<davecheney> http://paste.ubuntu.com/6057271/
<wallyworld> i'll ask in launchpad-ops
<davecheney> oh
<davecheney> hang on
<davecheney> something is wrong
<davecheney> shit
<davecheney> sorry
<davecheney> i fucked up
<wallyworld> :-)
<wallyworld> thumper: ping
<davecheney> axw: lp:~axwalk/juju-core/lp1218329-azure-released-images
<thumper> wallyworld: hi
<davecheney> are you able to backport that to the 1.14 branch ?
<axw> davecheney: sure
<davecheney> ta
<davecheney> there is no bot
<davecheney> merge manually
<axw> okey dokey
<wallyworld> thumper: hi, i was reviewing your pass-through-agent-config branch. i couldn't tell for sure, but are you ensuring the default value for dataDir is "/var/lib/juju". it looks like that is missing with the rework?
<thumper> wallyworld: that is because it is dumb :)
<thumper> it was saying "if it is empty"
<thumper> it is never empty
<thumper> agent.Config will not be created with an empty dataDir
<wallyworld> ok. other than that, it all looks pretty mechanical
<davecheney> right
<davecheney> this release is a dud
<davecheney> spot the bug., http://paste.ubuntu.com/6057341/
<axw> davecheney: pushed to 1.14
<davecheney> thanks
<davecheney> might have been for naught
<davecheney> the machiner can't restart properly
<davecheney> it has a requirement on the api server being up
<davecheney> and that races with the api server job
<davecheney> restarting
<davecheney> ok, it looks like it finally gets there
<davecheney> but takes many many restarts
 * davecheney files bug
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1220027
<_mup_> Bug #1220027: worker/provisioner: cannot restart cleanly due to hard dependency on api server <juju-core:Triaged> <https://launchpad.net/bugs/1220027>
<thumper> I think this would be pretty easily solved if we didn't make everything fatal
<thumper> that way the provisioner would try to start, fail and then retry
<thumper> while the api server comes up
<thumper> or,
<thumper> we should make sure that the api server comes up before we start any other jobs that depend on it
<thumper> that sounds sensible
<davecheney> thumper: i think we have to do that
<davecheney> before, the provisioner didn't use the api to provision
<davecheney> now it does
<thumper> we used to have this for state
<thumper> at least it made sure the state connection was valid
<thumper> now we need a valid api endpoint
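thumper's first option (try, fail, retry while the api server comes up) could look roughly like the loop below. This is a generic sketch, not juju-core's worker code; retry, errNotReady and the connect callback are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a stand-in for "the api server is not up yet".
var errNotReady = errors.New("api server not ready")

// retry calls connect up to attempts times, sleeping delay between
// failures, instead of treating the first failure as fatal.
func retry(attempts int, delay time.Duration, connect func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = connect(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %v", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady // simulate the api server still starting
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

The second option, starting the api server job first and only then the jobs that depend on it, removes the race instead of papering over it.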
<davecheney> this is the danger of testing on ec2
<davecheney> we expect everything to be dog slow
<davecheney> wallyworld: can someone sync tools to canonistack and hp cloud ?
<wallyworld> davecheney: ok
<davecheney> wallyworld: ta much
<davecheney> axw: https://bugs.launchpad.net/juju-core/+bug/1218329
<_mup_> Bug #1218329: Update default environment.yaml for Azure to use Precise for default-series <juju-core:Fix Committed by axwalk> <juju-core 1.14:Fix Committed> <https://launchpad.net/bugs/1218329>
<davecheney> which rev was this fixed on 1.14 ?
<thumper> axw: what is a simple way to copy a slice?
<thumper> foo[:] ?
<davecheney> thumper: nope
<davecheney> use the copy() statement
<thumper> works in python :P
<davecheney> sorry function
<thumper> builtin?
<davecheney> that will give you a new slice with the same backing array
<thumper> what benefit does that really give us over returning the slice itself?
<axw> thumper: sorry missed... append([]x, y...)
<thumper> if it shares the backing array
<axw> []x{} rather
<davecheney> thumper: i don't know, i don't know what the question is
<thumper> davecheney: axw suggested in a review to return a copy of our caCert []byte instead of the one we have
<thumper> just so it isn't misused
<thumper> accidentally
<axw> append([]byte{}, caCert...)
<thumper> axw: does append(nil, caCert..) work?
<thumper> or does it need to be a typed nil?
<axw> umm. I don't recall
<axw> don't think so, but I'll try
<axw> thumper: nope
<axw> (nope doesn't work; needs to be typed)
 * thumper nods
 * thumper uses append
<axw> thumper: as for your question of "what benefit does that [returning a resliced slice] give" - none in this case
<thumper> the new copy I get
 * thumper returns a copy
<thumper> and runs the tests prior to submitting
<thumper> axw: I also fixed the comment.
<axw> thumper: thanks
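A short sketch of the difference discussed above: reslicing with foo[:] shares the backing array, while append onto an empty typed slice makes an independent copy. cloneBytes is a made-up helper name; the caCert value is a placeholder.

```go
package main

import "fmt"

// cloneBytes returns a copy of b backed by a fresh array.
// append(nil, b...) does not compile because nil is untyped there;
// append([]byte(nil), b...) would also work.
func cloneBytes(b []byte) []byte {
	return append([]byte{}, b...)
}

func main() {
	caCert := []byte("cert")

	alias := caCert[:] // reslice: same backing array
	alias[0] = 'C'
	fmt.Println(string(caCert)) // the original changed as well

	clone := cloneBytes(caCert)
	clone[0] = 'X'
	fmt.Println(string(caCert)) // the original is unaffected by the clone
}
```

The builtin copy does the same job when the destination slice is allocated first with make.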
<axw> oh god
<axw> it's going to take ~24h to get to SFO
<davecheney> axw: WHAT
<davecheney> 4 hours to Syd
<davecheney> then 14 to SFO
<davecheney> 14 to LAX
<davecheney> then 2 to SFO
<axw> davecheney: not all in the air
<axw> don't forget fun times waiting at the airports
<wallyworld> davecheney: syncing finished for canonistack and hp cloud
<davecheney> wallyworld: ta
<davecheney> testing now
<wallyworld> axw: serves you right for living in perth :-P
<axw> wallyworld: yeah :(  let's all have the next one here.. ha ha
<wallyworld> i'd rather europe :-)
<axw> or that
<davecheney> ooooooooooooooooooooooh
<davecheney> now i understand what hazmat is talking about
<davecheney> if the provisioner gets into the restart loop
<axw> still a long flight, but at least there's cheap beer and good food at the end of it
<davecheney> status will take ages
<davecheney> because there is no api server
<davecheney> axw: in SFO
<axw> davecheney: ?
<davecheney> uh, are you sure ?
<axw> europe
<axw> davecheney: what's the provisioner problem? this sounds a bit like an issue I saw with the local provider
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1220027
<_mup_> Bug #1220027: worker/provisioner: cannot restart cleanly due to hard dependency on api server <papercut> <juju-core:Triaged> <juju-core 1.14:Triaged> <juju-core trunk:Triaged> <https://launchpad.net/bugs/1220027>
<axw> mk
<axw> that does look similar
<davecheney> lucky(~) % juju bootstrap -v
<davecheney> 2013-09-03 02:44:45 INFO juju.provider.openstack provider.go:126 opening environment "nec-az3"
<davecheney> 2013-09-03 02:44:45 INFO juju.environs.tools tools.go:29 reading tools with major version 1
<davecheney> 2013-09-03 02:44:45 INFO juju.environs.tools tools.go:37 filtering tools by series: precise
<davecheney> 2013-09-03 02:44:48 INFO juju.environs.tools tools.go:44 falling back to public bucket
<davecheney> 2013-09-03 02:44:48 INFO juju.environs.tools tools.go:93 picked newest version: 1.13.2
<davecheney> wallyworld: did you sync tools to hp cloud ?
<wallyworld> davecheney: supposedly
<wallyworld> what public bucket are you using?
<wallyworld> i've got https://region-a.geo-1.objects.hpcloudsvc.com/v1/60502529753910
<davecheney>     public-bucket-url: https://region-a.geo-1.objects.hpcloudsvc.com/v1/60502529753910
<davecheney> wallyworld: i am using az3
<wallyworld> wtf, they're different
<davecheney> they look the same to me
<wallyworld> ok, so i need to sync to yours too
<wallyworld> oh hang on
<wallyworld> i can't read
<wallyworld> they're the same
<wallyworld> so looks like sync tools lied to me
<davecheney> wallyworld: should I try a different az ?
<davecheney> wallyworld: how do you have sync tools setup ?
<davecheney> do you put the public bucket as your control bucket then run juju-sync tools ?
<wallyworld> no, i've not run it before,i assumed it would upload to the public bucket of the env
 * davecheney has no idea
<wallyworld> i'll take another look
<wallyworld> davecheney: i forgot the --public
<davecheney> bzzzzzzzzzzzzzzt
<wallyworld> so it uploaded to my private bucket
<wallyworld> sorry
<davecheney> rofl
<wallyworld> fuck off :-)
<wallyworld> damn
<wallyworld> auth error
<wallyworld> davecheney: ok, i've had to do hp cloud by hand since there are different credentials for public bucket vs my normal account i think. i'll now do canonistack, but there are authentication issues there too
<davecheney> let me try
<davecheney> maybe my creds work for hp cloud
<wallyworld> davecheney: i can upload via the web ui. but i think my creds to write to the public bucket differ from the creds i need to use to access my juju assets and so sync-tools doesn't like that
<wallyworld> similar issue with canonistack - i'm having to do that by hand too
<wallyworld> davecheney: canonistack done too hopefully
<davecheney> kk
<davecheney> wallyworld: tools in hpcloud look good now
<davecheney> thanks
<wallyworld> great
 * thumper goes to make coffee
 * thumper puts the  music on loud to try to cover the screaming kids
<thumper> they are playing some xbox game
 * thumper shelves the logging work until someone can help explain the api
<thumper> wallyworld: hey there
<thumper> wallyworld: wotcha up to?
<wallyworld> yo
<wallyworld> tools stuff
<thumper> call?
<wallyworld> ok
<thumper> wallyworld: https://plus.google.com/hangouts/_/775c25d6dba9048907bcf2c54e570e6a4008a22f?hl=en
 * thumper back later
<wallyworld> fwereade: you may have an interest in this https://codereview.appspot.com/13343046, i've used some new interfaces, get out the paint cans
<fwereade> wallyworld, nice
<fwereade> wallyworld, thanks
<wallyworld> i'd be happy to discuss before/after the standup
 * wallyworld unpacks his new Nexus 7 II tablet :-D
<dimitern> fwereade: hey
<fwereade> dimitern, heyhey
<dimitern> fwereade: so looking at the code I think it's high time I start doing the migration from relation-id to relation-key tags
<fwereade> dimitern, good plan, +1
<dimitern> fwereade: this, and the relationunitswatcher are the only missing bits at server-side
<dimitern> fwereade: otherwise most of the stuff that's left is mostly trivial
<fwereade> dimitern, that is music to my ears :)
<dimitern> fwereade: :)
<dimitern> fwereade: I need to dig up logs for rogpeppe's findings on relation names format, do you know that site that has all irc logs in html format?
<fwereade> dimitern, er not offhand, I usually just search and poke around until something shows up
<dimitern> fwereade: ah, found it http://irclogs.canonical.com/YYYY/MM/DD/
<fwereade> dimitern, same time as me :)
<dimitern> fwereade: hmm, but this one doesn't have the freenode logs
<dimitern> fwereade: so, for the record it's http://irclogs.ubuntu.com/2013/08/16/
<fwereade> dimitern, cheers :)
<dimitern> fwereade: [a-z][a-z0-9]*(-[a-z0-9]+)* seems to be the right regexp for relation names
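A minimal sketch of checking names against the pattern dimitern quotes above. The helper name validRelationName is hypothetical and not taken from juju-core; the pattern is anchored here so that the whole name has to match.

```go
package main

import (
	"fmt"
	"regexp"
)

// relationNameRe is the pattern from the discussion, anchored at both ends
// so that partial matches are rejected.
var relationNameRe = regexp.MustCompile(`^[a-z][a-z0-9]*(-[a-z0-9]+)*$`)

// validRelationName is a hypothetical helper (not a juju-core function)
// reporting whether name matches the pattern.
func validRelationName(name string) bool {
	return relationNameRe.MatchString(name)
}

func main() {
	for _, name := range []string{"db", "my-relation123", "Bad_Name", "db-", "my--rel"} {
		fmt.Printf("%q: %v\n", name, validRelationName(name))
	}
}
```

Names must start with a lowercase letter, and every dash must be followed by at least one letter or digit, so trailing and doubled dashes are rejected.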
<dimitern> fwereade: so for the tag format itself, what do you think? "relation-<name>" with the format above, no changes to dashes, etc.?
<dimitern> fwereade: ah, but that's just one name, we have "name1:service1 name2:service2"
<dimitern> fwereade: so we need something like "relation-<name1>@<service1>|<name2>@<service2>" ?
<fwereade> dimitern, yeah, I think so
<fwereade> dimitern, the first step is tidying up the names considered to be valid in charms
<dimitern> fwereade: hmm.. the pipe is not a valid filename char
<dimitern> fwereade: maybe ~ ?
<fwereade> dimitern, I would prefer to preserve filename-sanity if possible
<fwereade> dimitern, # maybe?
<dimitern> fwereade: "relation-my-relation123@some-service~my-other-relation42@remote-service23"
<dimitern> fwereade: is it valid?
<dimitern> fwereade: ok, I'll go with #
<fwereade> dimitern, ehh I'm uncertain about swapping the service:relation ordering
<fwereade> dimitern, seems mainly like an opportunity to make mistakes
<fwereade> dimitern, I understand it fits nicely with @
<dimitern> fwereade: ah, was it the other way around? no problem, wasn't sure
<fwereade> dimitern, yeah, but then "service@relation" is kinda crazy
<dimitern> fwereade: it kinda is
<fwereade> dimitern, how about... relation-svc.rel#svc.rel perhaps?
<dimitern> fwereade: dots?
<fwereade> are they bad?
<dimitern> fwereade: don't think so, just a bit weird, sgtm
<fwereade> dimitern, svc.rel feels to me closest to svc:rel
<fwereade> dimitern, and it's an endpoint on a specific service so i think the ordering is useful
<dimitern> fwereade: according to wikipedia http://en.wikipedia.org/wiki/Filename dots should be ok
<fwereade> dimitern, so I would hope indeed :)
<dimitern> fwereade: but : is definitely not
<fwereade> dimitern, quite so
<fwereade> dimitern, svc:rel is the existing syntax though
<jam> dimitern: are you saying "sane filenames on Linux" or "sane filenames on Windows". Because on Linux any 8-bit is allowed except for "/" and "\x00".
<dimitern> jam: well, not really - some are shell special chars
<dimitern> jam: and need escaping, which is ugly
<jam> dimitern: they are "valid filename chars"
<jam> it happens that shell likes to do things with them
<dimitern> fwereade: so the conversion should be s/:/./ and s/ /#/ + "relation-" as prefix
<jam> which is more about clarity "not shell characters" is not the same as "only filename chars"
<jam> : ; ^ , {} [] " ' all have shell meaning but are valid in filenames
<dimitern> jam: yeah, but let's not make our lives harder in the remote possibility we actually need to use tags as filenames :)
<fwereade> dimitern, sgtm I think
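The conversion dimitern describes (s/:/./, s/ /#/, plus a "relation-" prefix) can be sketched as below. This is only an illustration of the transform as stated in the chat; relationTag is a hypothetical name, and the format that eventually lands in juju-core may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// keyToTag rewrites ":" to "." and " " to "#", as proposed in the discussion.
var keyToTag = strings.NewReplacer(":", ".", " ", "#")

// relationTag is a hypothetical helper turning a relation key such as
// "svc1:rel1 svc2:rel2" into a filesystem-friendly tag.
func relationTag(key string) string {
	return "relation-" + keyToTag.Replace(key)
}

func main() {
	fmt.Println(relationTag("wordpress:db mysql:server"))
}
```

Because service and relation names cannot contain ".", "#", ":" or spaces under the pattern discussed earlier, the mapping stays reversible after the prefix is stripped.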
<jam> mgz, fwereade, dimitern, TheMue, wallyworld_: standup time: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471
<mgz> ta
<jam> fwereade: poke ?
<fwereade> jam, oops, sorry
<jam> TheMue: can you link me to the ssl bug ?
<jam> wallyworld_: you may have hit: 1207294
<jam> bug #1207294
<_mup_> Bug #1207294: sync-tools fails with --public <juju-core:Triaged> <https://launchpad.net/bugs/1207294>
<dimitern> fwereade: https://codereview.appspot.com/13490043
<wallyworld_> jam: yes, at first glance could be
<jam> mgz: do you know the status of https://bugs.launchpad.net/juju-core/+bug/1215949 and how we fix it?
<_mup_> Bug #1215949: juju-core in the devel ppa does not use alternatives <juju-core:Triaged> <https://launchpad.net/bugs/1215949>
<jam> would be really good to have for 1.14 (without having to go via Saucy)
<mgz> sort of...
<mgz> we just need to get that building from the ubuntu-branch packaging
<mgz> the reason that's not quite trivial is the ubuntu packaging is set to build from a stable tarball, rather than pull trunk of everything,
<mgz> so, if we update deps, we'd break the devel packaging potentially
<mgz> could overlay everything still..
<jam> mgz: things in ppa:juju/devel are still releases, so we can still build from *a* tarball
<jam> wallyworld_: if you are still here, did this get landed: https://bugs.launchpad.net/goose/+bug/1132618
<_mup_> Bug #1132618: swift service double lacks container list prefix filtering <swift-double> <Go OpenStack Exchange:Fix Committed by wallyworld> <https://launchpad.net/bugs/1132618>
<jam> it looks old, but it is still open as "Critical"
<jam> TheMue: another poke if you have the bug # for the ssl-hostname-verify stuff.
<jam> There is an old bug #1074025 where we said we needed to support it, and then disabled it or something
<_mup_> Bug #1074025: environs/config: juju must support ssl-hostname-verification config option <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1074025>
<jam> but I'm pretty sure there should be a new bug
<jam> natefinch: https://codereview.appspot.com/13079045/ has been review
<jam> reviewed
<jam> TheMue: I think I found it, bug #1202163
<_mup_> Bug #1202163: openstack provider should have config option to ignore invalid certs <cts> <papercut> <Go OpenStack Exchange:Confirmed> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1202163>
<dimitern> jam: could you take a look at https://codereview.appspot.com/13490043/ ?
<jam> dimitern: you realize that # is also a shell character, meaning comment, and so if you type "foo#bar" it will get seen as just "foo"
<jam> I personally think : is a less volatile "shell char"
<TheMue> jam: I assigned it to you
<dimitern> jam: but on windows it won't work
<dimitern> jam: how about replacing # with @ ?
<TheMue> jam: it's https://bugs.launchpad.net/juju-core/+bug/1202163
<_mup_> Bug #1202163: openstack provider should have config option to ignore invalid certs <cts> <papercut> <Go OpenStack Exchange:Confirmed> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1202163>
<jam> dimitern: so, I just tried "echo foo#bar" and I get "foo#bar" but "echo foo #bar" gives just "foo"
 * TheMue just finished flight booking
<dimitern> jam: but there's no space anywhere
<dimitern> jam: echo relation-svc1.rel1#svc2.rel2 seems to work
<jam> right, we're not allowed spaces, though I won't guarantee all shells act like my bash :)
<jam> dimitern: though I would bias towards making it something you can easily write in a shell, vs something that can be turned into a filename
<jam> and today, we already write them in the shell as "juju add-relation service:rel service:rel"
<dimitern> jam: tags are not supposed to be typed I think - they're just used mostly over the wire in the API
<jam> anyway, what you have is fine, and I don't mean to just cause churn, but I don't quite see what we actually gain rather than having them closer to the actual relation key
<jam> I *think* "." isn't very nice in Mongo, (you don't want to put it into keys, because by default search uses "." to indicate a nested doc)
<dimitern> jam: tags are not stored in mongo
<dimitern> jam: they are constructed from mongo keys
<jam> dimitern: sure, but it seems like playing nicer with mongo is better than playing nice with FS. Again, it would be possible to do it your way, but why mutate something that is close to fine as is?
<jam> I would get rid of the space
<jam> but why not have it be
<jam> relation-service1:foo#service2:bar
<dimitern> jam: because of windows mostly
<jam> dimitern: if we aren't storing them in mongo, and we aren't writing them to disk, and we aren't typing them into the command line, I don't think we've actually gained anything over a simpler transform
<dimitern> jam: rogpeppe can explain better than me the idea behind having all tags FS compatible
<dimitern> jam: all the others are like that, why change it now?
<jam> dimitern: do you know why we want Key based rather than Id based? I think I missed that conversation as well.
<dimitern> jam: fwereade can explain better
<dimitern> jam: it's something related to performance of keys/indices in mongo perhaps; and the key being more obvious what it's about, rather than an opaque number
<dimitern> jam: also, rel ids are used in very few places - we use keys for most things
<dimitern> jam: so..
<dimitern> jam: can I get a review please? :)
<dimitern> jam: thanks!
<fwereade> dimitern, jam: the main thing is that tags were originally conceived as filesystem-safe, and it seems a bit capricious to drop that property
<fwereade> jam, we can't have id based ones because they're reported as lists of keys from the watcher
<dimitern> fwereade: right!
<fwereade> jam, and need to convert them for the api *even* when the corresponding doc no longer exists
<fwereade> jam, but that's where the id would be stored
<jam> fwereade: I'd really be happier if clients never generate tags from other data, is there a reason the watcher or whatever can't talk in terms of tags?
<fwereade> jam, tags don't exist at the db level
<fwereade> jam, auto-converting from _id would just move the problem, not replace it
<dimitern> fwereade: but we need to convert the stringswatcher to report tags, and we have the names package for conversions
<fwereade> jam, I would also be happier if clients never generated tags from other data -- what situation are you thinking of?
<dimitern> fwereade: or maybe just the api stringswatchers
<fwereade> jam, the server is the thing that needs to convert _ids from a state watcher to tags for an api watcher
<jam> fwereade: golang question. I have a "type Foo struct { Bar }" and I have a "f *Foo" object, but I want to call a function that takes a "*Bar" object. Is there an obvious way to do so that isn't (&f.Bar) ?
<fwereade> jam, the existing problem is that it returns names and thus mixes vocabularies
<fwereade> jam, don't think so
<dimitern> jam: try f.Bar.Call()
<dimitern> jam: ah, it's embedded
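A small sketch of jam's embedding question, using the Foo and Bar names from his message; the bump function and Incr method are made up for illustration. A plain function taking *Bar needs the explicit &f.Bar, while methods declared on *Bar are promoted and can be called on f directly.

```go
package main

import "fmt"

// Bar is the embedded type from the question.
type Bar struct {
	N int
}

// Incr has a pointer receiver; it is promoted to *Foo through embedding.
func (b *Bar) Incr() { b.N++ }

// Foo embeds Bar by value, as in "type Foo struct { Bar }".
type Foo struct {
	Bar
}

// bump is a hypothetical function that takes a *Bar argument.
func bump(b *Bar) { b.N++ }

func main() {
	f := &Foo{}
	bump(&f.Bar) // a *Foo is not a *Bar, so the address must be taken explicitly
	f.Incr()     // promoted method: the compiler passes &f.Bar as the receiver
	fmt.Println(f.N)
}
```

Embedding promotes fields and methods but does not make *Foo assignable to *Bar, which is why &f.Bar is the usual spelling.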
<fwereade> dimitern, the reason not to fix stringswatcher api now is that things are already using it, and nobody but us is using it, so it doesn't get any worse with time
<dimitern> fwereade: yeah, that's right
<fwereade> jamespage, is there any action we should be taking for juju-core wrt https://bugs.launchpad.net/juju-core/+bug/1200878 ?
<_mup_> Bug #1200878: Upgrade breaks existing pyjuju deployment <apport-collected> <papercut> <regression-release> <saucy> <juju-core:Triaged> <juju (Ubuntu):Fix Released> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1200878>
<dimitern> fwereade: before implementing relationunitswatcher I'll make a second attempt to remove the unitsettings from the relationunitchange struct
<dimitern> fwereade: actually do you have some time for a g+?
<jam> fwereade: bug #1200878 does sound like it is packaging-only
<_mup_> Bug #1200878: Upgrade breaks existing pyjuju deployment <apport-collected> <papercut> <regression-release> <saucy> <juju-core:Triaged> <juju (Ubuntu):Fix Released> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1200878>
<jam> so more of a "juju-core (Upstream)" bug
<jam> sorry juju-core (Ubuntu) only bug
<jam> mgz: if you see jamespage can you ask him about ^^
<jamespage> jam: it's not packaging only
<jamespage> fwereade, ^^
<jamespage> the packaging upgrades just fine - but there is no way to upgrade a running py-juju environment to juju-core
<jamespage> I restored py-juju to Saucy to allow people to continue managing py-juju environments....
<jam> jamespage: I would argue that means we have a separate bug about a migration path, but bug #1200878 is about "apt-get upgrade" breaking a py-juju environment.
<_mup_> Bug #1200878: Upgrade breaks existing pyjuju deployment <apport-collected> <papercut> <regression-release> <saucy> <juju-core:Triaged> <juju (Ubuntu):Fix Released> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1200878>
<jamespage> jam: I'm not concerned how this is represented in the bug tracker - so long as it's documented that there is no migration path from py-juju to juju-core right now
<jamespage> in fact the bug was only about losing the ability to manage a py-juju environment
<jamespage> not breaking it
<jam> jamespage: fwereade: so last I recall we weren't going to spend effort trying to come up with a way to actually migrate a running environment. However, I wonder if "install GUI, get a Bundle, reinstall elsewhere" might be an answer?
<fwereade> jamespage, jam: hazmat is looking into it somewhat actively now AIUI
<jamespage> fwereade, indeed he is
<dimitern> fwereade: ping
<fwereade> dimitern, pong
<dimitern> fwereade: g+?
<fwereade> dimitern, sure, please start one
<dimitern> fwereade: i'm running into issues with removing unitsettings's settings field and need some help understanding hookqueue and tests
<dimitern> fwereade: https://plus.google.com/hangouts/_/f0bf3d6a8b15defb35894c5637399f0b87363f4b?hl=en
<dimitern> fwereade: i'm waiting
<sidnei> https://bugs.launchpad.net/juju-core/+bug/1220269
<_mup_> Bug #1220269: update-<timestamp> directories being left behind <juju-core:New> <https://launchpad.net/bugs/1220269>
<natefinch> jam, fwereade (and anyone else): what's your opinion on tests that simply duplicate the code in the function they're testing?  I hate not testing things, but I also hate duplicating code just for the sake of coverage metrics. For example: http://pastebin.ubuntu.com/6059183/
<natefinch> jam, fwereade (and anyone else): (the os-specific functions have their own individual tests)
<mgz> jam: I wonder if rather than exposing NonValidating as a thing inside goose, it mightn't be better to just have the alternate constructor take an http.Client object
<mgz> then juju-core can just pass one in with the InsecureSkipVerify flag set, and maybe also reuse connections to other things
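A rough sketch of what mgz suggests: a constructor that accepts an *http.Client, so the caller decides about certificate checking. The Client type, NewClientWithHTTP and SkipsVerify are hypothetical stand-ins, not goose's real API; only the net/http and crypto/tls parts are standard library.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// Client is a hypothetical stand-in for a goose-style client.
type Client struct {
	httpClient *http.Client
}

// NewClientWithHTTP is a hypothetical alternate constructor that lets the
// caller supply the HTTP client; nil falls back to http.DefaultClient.
func NewClientWithHTTP(hc *http.Client) *Client {
	if hc == nil {
		hc = http.DefaultClient
	}
	return &Client{httpClient: hc}
}

// insecureHTTPClient builds a client that skips TLS certificate verification.
// Its Transport also keeps connections alive for reuse across requests.
func insecureHTTPClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
}

// SkipsVerify reports whether the underlying transport disables verification.
func (c *Client) SkipsVerify() bool {
	tr, ok := c.httpClient.Transport.(*http.Transport)
	return ok && tr.TLSClientConfig != nil && tr.TLSClientConfig.InsecureSkipVerify
}

func main() {
	fmt.Println(NewClientWithHTTP(insecureHTTPClient()).SkipsVerify())
	fmt.Println(NewClientWithHTTP(nil).SkipsVerify())
}
```

With this shape the library never needs to know about a "non-validating" mode; that policy stays with whoever builds the http.Client.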
<mgz> natefinch: do you still need help to get your azure branch landed?
<natefinch> mgz: yeah... I've been procrastinating because I wasted so much time trying to get an azure environment up and running, I didn't want to waste more time on it until I knew it would be worth it.
<natefinch> mgz: I probably should have hounded the red squad guys a little more to get it done
<mgz> do you know what you wanted to test? we could just ask one of red to try it out.
<natefinch> mgz: *nod* that's a decent idea. It's just the addressability stuff.  I think I'd have to write some custom code for it, since AFAIK, the addresses aren't actually used anywhere right now.... is that correct? Or is it one of those things where it's stubbed out and once it's not it'll get picked up?
<mgz> just looking at hostname should do, but it would be a little easier with a few extra bits
<mgz> an actual live test would be great...
<natefinch> mgz: maybe allenap has an azure environment that works, or knows someone on red squad that does?
<mgz> all of them should have a working setup
<dimitern> fwereade: when you're back and have some time PTAL https://codereview.appspot.com/13494043 and also let's discuss how to live test it with the local provider
<jam> natefinch: I would think we have helpers for patching the environment, but if we don't your code is ok.  I would probably change the comment to "unset JUJU_HOME to force OS specific home selection".
<jam> natefinch: your test also gives us some ability to grow it as we might need to do something different on other platforms (though Darwin looks to still use $HOME for now at least)
<jam> natefinch: "tests for tests' sake" still fall into the "if I change X, something lets me know that I might have broken something" category, rather than just assuming I manually audited everything.
<hazmat> is there any timeline on manual provisioning?
<thumper> hi hazmat
<thumper> hazmat: landed in trunk
<thumper> hazmat: it may even be in 1.13.3, I'd have to check
<thumper> hazmat: although axw does need to write some docs for it
<hazmat> thumper, awesome
<thumper> hazmat: new focus on that area is the null provider and manual bootstrap
<hazmat> thumper, so the manual provisioning works with existing providers?
<thumper> yes
<hazmat> perfect
<thumper> as long as the machine you are adding can see the bootstrap node
<thumper> all is good
<thumper> caveat:
<thumper> unlikely to work with the local provider
<thumper> as it gives a 10.0.3.0/24 address for storage
<thumper> we should write that down
<thumper> :)
<hazmat> thumper, i was mostly considering maas as a context
<thumper> should work fine there
<hazmat> thumper, i wonder if we'll ever get to the point where we can consider undocumented features a bug
<thumper> yes, I think we should
<hazmat> the stop gap for core dev documenting features was supposed to be rel notes, but this one didn't make it in.
<thumper> no, it should have
 * thumper will poke the right people :)
<hazmat> thumper, got a moment to chat re manual bootstrap?
<thumper> um... sure..
 * thumper dies a little inside
 * bigjools prods thumper
 * thumper jumps
<bigjools> there. not dead.
<bigjools> (I have caffeine for that)
<axw> thumper: my exchange's fibre got severed yesterday afternoon. apparently the current connectivity is a temporary workaround
<axw> so if I'm not online today, that's why
<axw> (and it's why I wasn't online from 3:30 yesterday)
<thumper> ok, np
 * thumper is heading off to the gym shortly
<axw> enjoy
#juju-dev 2013-09-04
<bigjools> anyone know if the go-bot setup is documented anywhere?
<wallyworld_> bigjools: there was an email sent to juju-dev
<bigjools> wallyworld_: archive-diving I go then, ta
<wallyworld_> it turned into a thread of about maybe 10 emails
<wallyworld_> it may well be documented somewhere but that's all i know of
<wallyworld_> the thread is called Landing bot setup steps
<bigjools> wallyworld_: when was it sent?
<wallyworld_> bigjools: july 30
<bigjools> ta
<bigjools> aaand found it
 * thumper hmmms
 * thumper wishes once again that go had generics
<thumper> copying boilerplate is error prone and annoying
 * thumper wanders off to think a little
<axw> thumper: are we doing a code review today?
<thumper> yeah, why not
<thumper> I figure we all get on a hangout and look through the code
<thumper> or do you think we should look first, then discuss on the call?
<axw> I have looked already, so either way
<thumper> wallyworld_: your opinion^^^
<wallyworld_> i haven't looked
<wallyworld_> i guess i have 10 mins
 * thumper goes to see what file it is again
<axw> provider/state.go
 * thumper is in the hangout
<bigjools> wallyworld_: how long should I wait for a canonistack instance to bootstrap?  it's been 15 minutes
<wallyworld_> bigjools: canonistack is unusable most times for me
<wallyworld_> too slow
<bigjools> \o/
<wallyworld_> the servers are heavily overcommitted i think
<bigjools> wallyworld_: I have an instance
<bigjools> which I can't ssh into.... woo
<bigjools> sigh
<wallyworld_> yay
<bigjools> today is Fail Central
<axw> bigjools: I had better luck with lcy02 than 01 - try that if you're not already on it
 * thumper has had enough
<bigjools> axw: got it.  The wiki page I found didn't mention I needed sshuttle ...
<axw> ahh, right
<axw> I got caught a few times by that
<bigjools> how do I convince juju to destroy a service with a failed install hook?
<bigjools> resolved I guess?
<axw> bigjools: yep
<bigjools> ah it leaves an empty machine - will that get used next deployment?
<bigjools> guess I will find out soon
<axw> so I'm told, but I haven't witnessed it
<axw> otherwise you can --to
<bradm> bigjools: it does use it again, but I'm pretty sure it doesn't clean it up - so it's got all the config of the previous service on it
<bradm> bigjools: which makes it pretty useless in reality, since you can't be sure of the config
<bigjools> bradm: that's probably a bug I guess.  Thanks.
<bradm> bigjools: you can fake it out by doing a juju deploy ubuntu and then destroy the service
<bradm> bigjools: we've used that in the past to pre-deploy juju instances
<bigjools> ok cheers
<davecheney> bigjools: use resolved to get the machine to a started state (even if this is a lie) then destroy it
<davecheney> bigjools: juju never reuses machines
<davecheney> never
<davecheney> --to isn't part of that statement
<bigjools> davecheney: ok thanks - I am recalling what you did in the sprint
<davecheney> 'cos it just takes off all the safety guards
 * davecheney goes back to bd
<davecheney> bed
<bigjools> ...
<bradm> it must have been in pyjuju days when it did that, since I can't replicate it now
<bradm> so thats good
<jam> bradm: bigjools: we are supposed to support "Clean & Empty", which means that if nothing was actually deployed to the machine yet, we will use it for a future deploy/add-unit.
<jam> you can use "juju add-machine" (I think it works with a provider) to pre alloc machines.
<bradm> jam: we did use the juju deploy ubuntu ; juju destroy-service ubuntu with pyjuju at a sprint to get the machines deployed to make training easier
<bradm> jam: it's debatable how much it actually helped, but anyway :)
<jam> bradm: I'm pretty sure that doesn't work by default in juju-core, but you should be able to "juju deploy --to X" to force it, and
<jam> juju add-machine should let you allocate machines ahead of time
<bradm> jam: yes, it doesn't seem to work like it did in pyjuju, I tested it on canonistack
<bradm> jam: honestly, I think it's a good thing, I could imagine it biting us as we use juju more
<jam> bradm: right, the default policy is intended to be "if a machine was made dirty by installing something directly, don't default to putting something onto that machine"
<bradm> jam: totally makes sense.
<jam> the problem that  bigjools highlights, is that if you destroy a unit, what is the point of leaving the machine around (by default)
<bradm> jam: it's really a hack what we were doing
<bradm> jam: to recover logs, any data that you need
<jam> bradm: sure, but I would say "allow a flag to destroy-unit that keeps the machine alive" not "create a flag to also destroy the machine"
<jam> so if you need something, you can just remove the unit, but otherwise don't leave unusable machines around
<bradm> jam: sure, or even a config option that says "keep the units after a service destroy" and default to destroying it, any of those
<bradm> I'm still getting up to speed at how we're using juju
<dimitern> morning
<dimitern> fwereade_: hey
<fwereade_> dimitern, heyhey
<dimitern> fwereade_: I did the changes as discussed (hopefully) - https://codereview.appspot.com/13494043/
<fwereade_> dimitern, sweet, I'll take a look before I go out... I have more houses to look at
<dimitern> fwereade_: can you take a closer look please, and I'd appreciate a stepwise testing plan :) (never used the local provider, and not sure how to trigger these hooks - should I write a charm?)
<dimitern> fwereade_: thanks
<fwereade_> dimitern, I'd suggest writing two metadata-only charms that just define relations
<fwereade_> dimitern, that can relate to one another
<fwereade_> dimitern, deploy one unit of each
<fwereade_> dimitern, wait until they're started
<fwereade_> dimitern, then open 2 windows, and debug-hooks each of the units
<fwereade_> dimitern, then add a relation between them
<dimitern> fwereade_: ok
<fwereade_> dimitern, and then you can basically do whatever you need -- it'll just pause and let you run whatever you want for each hook
<dimitern> fwereade_: hmm
<dimitern> fwereade_: that's what's unclear - what should I do while paused?
<dimitern> fwereade_: call relation-set/get something?
<fwereade_> dimitern, specifically you need I think to just keep on setting fresh unit data on either side of the relation and checking it arrives properly at the other side
<fwereade_> dimitern, yeah exactly
<fwereade_> dimitern, set on one side
<fwereade_> dimitern, get on the other
<dimitern> fwereade_: ok, sounds simple enough
<fwereade_> dimitern, if you want to be extra clever add a config to the charm
<dimitern> fwereade_: how?
<fwereade_> dimitern, config.yaml
<dimitern> fwereade_: aah :) ok
<fwereade_> dimitern, just a string setting you can change to trigger a hook and call relation-set in to get out of steady state again
<dimitern> fwereade_: just a sec
<fwereade_> dimitern, cath is pointing out that we're going to be late... I will review as soon as I'm back
<fwereade_> dimitern, sorry :(
<dimitern> fwereade_: ok
<dimitern> fwereade_: will try it out
<dimitern> fwereade_: tyvm
<dimitern> TheMue: hey
<dimitern> TheMue: can you point me out to the local provider docs you wrote?
<TheMue> dimitern: sure, one moment
<TheMue> dimitern: https://juju.ubuntu.com/docs/config-local.html
<dimitern> TheMue: cheers
<TheMue> dimitern: yw
<bigjools> I've got a charm that does an adduser in the installation hook, and it errors out with "Authentication token manipulation error"
<TheMue> ah, great, now shows exactly the behavior i wanted
<TheMue> dimitern: btw, if you discover errors or missing parts in the doc please let me know
<dimitern> TheMue: well, it worked, but only as root
<dimitern> TheMue: I had to copy my env.yaml to /root/.juju/ and also my ssh keys to /root/.ssh
<dimitern> TheMue: and add juju binaries to root's PATH
<dimitern> TheMue: sudo juju bootstrap failed with "juju not found"
<dimitern> TheMue: so I had to do sudo su - and run as root from that point on
<TheMue> dimitern: strange
<TheMue> dimitern: i've done it as regular user and only bootstrapped with sudo
<TheMue> dimitern: it's described in the code that way
<dimitern> TheMue: perhaps some steps are missing
<TheMue> dimitern: which release? i took 13.04 on a clean system
<dimitern> TheMue: sudo x doesn't do a login shell session, so PATH is not set
<dimitern> TheMue: raring
<dimitern> TheMue: and my system is mostly clean (reinstalled a month ago, so most of the packages are missing)
<dimitern> TheMue: didn't install mongodb-server, just lxc - I have mongo installed for juju tests already
<TheMue> dimitern: i'm using a testing image in a parallel vm for those tests. here i installed juju, mongo and lxc as user (but with sudo)
<TheMue> dimitern: and then i generated the config which has been in $HOME/.juju (as expected)
<TheMue> dimitern: after that bootstrapping with sudo (so it uses my env)
<TheMue> dimitern: et voila
<dimitern> TheMue: well it didn't work for me
<TheMue> dimitern: deployed our lovely pair wp/mysql then and it worked
<TheMue> dimitern: service deployment has been as non-root w/o sudo
<TheMue> dimitern: because the provisioner already works as root
<dimitern> TheMue: I suppose if sudo juju bootstrap worked, the rest would've worked as well
<dimitern> TheMue: but juju could not be found, so I had to do as described above
<dimitern> TheMue: if I had juju installed from the archive/ppa it should work though, it'll be available for any user
<dimitern> TheMue: so my point is - if you're running juju from source, the described steps should be different
<TheMue> dimitern: sounds reasonable, on that system i had no juju source
<jam> TheMue: thanks for the reviews. One more: https://codereview.appspot.com/13360044/
<TheMue> jam: looking
<jam> TheMue: did it help to split this out into concrete steps? I sort of evolved the patch all at once, but I thought it might be easier to review in layers.
<TheMue> jam: yes, helps a lot. would have been too large then
<TheMue> jam: but i can't review the first one, get an error
<jam> TheMue: one of them I submitted with a missing prereq so I deleted and resubmitted it.
<TheMue> jam: ah, ok
<jam> TheMue: do you mean https://codereview.appspot.com/13391047/ ? (the httpsuite-ssl change, vs the httpclient-ssl change, vs the final client/Client change)
<jam> I thought you had reviewed 2 patches, but it was just that I got your email 2 times (once from Rietveld, once from LP)
<TheMue> jam: yes, that one. "old chunk mismatch"
<jam> TheMue: for https://codereview.appspot.com/13391047/ the unified diff looks correct, I don't know what is wrong with the side-by-side diff
<jam> I'll try to repropose
<natefinch> morning all
<TheMue> jam: thx
<TheMue> natefinch: morning
<jam> TheMue: reproposing seems to have fixed the bug
<jam> morning natefinch
<TheMue> jam: ah, yes, looks better now, will look at it in a few moments
<TheMue> jam: LGTM, wow, all together a nice change
<jam> TheMue: fwereade_, wallyworld_, dimitern, natefinch, mgz, jam: standup ? https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471
<jam> fwereade_, mgz  &&
<jam> ^^
<dimitern> fwereade_: can you take a look at this please https://codereview.appspot.com/13523043/
<dimitern> fwereade_: I'll try the extra live testing steps you mentioned about the previous change
<fwereade_> dimitern, great, thanks
<natefinch> jam: if you guys are still reviewing... it bugs me that loadState closes the reader it gets. It seems much clearer to have the callers close their own readers, even if it does mean one duplicated line. I think it's better to have the close method next to the method that creates the Closer
<jam> natefinch: I did have that one in my list, but we're wrapping up and it wasn't huge
<natefinch> jam: yeah, not really a big deal
<jam> natefinch: it does eliminate some redundancy, but it makes it a bit harder to audit for the http.Get bug
<jam> (you must call Close explicitly or http leaks file handles)
 * TheMue has to step out, wife awaited me since 3
<natefinch> jam: really, I think it's just a separation of concerns... loadState doesn't care if the thing is closed, it just wants to be able to read. And if you then have something that wants to use loadState that doesn't have a close method, you have to manufacture one
<jam> natefinch: "is it better that LoadStateFromURL hand off the defer r.Close to the loadState
<jam> helper? It eliminates redundancy, but makes it slightly harder. (I wouldn't
<jam> directly guess that "Load*" would close the object it is given. though taking a
<jam> ReadCloser does tell you that.)
<jam> "
<jam> was my notes
<natefinch> yep
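A minimal Go sketch of the ownership split natefinch is arguing for: the helper only reads, and whoever opened the reader closes it. Only the name loadState comes from the discussion; its signature here, and closeTracker, newTracker, openFixed and loadStateFromSource, are hypothetical stand-ins and not the juju-core code.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// closeTracker is a hypothetical ReadCloser that records whether Close ran.
type closeTracker struct {
	io.Reader
	closed bool
}

func (c *closeTracker) Close() error {
	c.closed = true
	return nil
}

func newTracker(s string) *closeTracker {
	return &closeTracker{Reader: strings.NewReader(s)}
}

// openFixed returns an "open" function that always hands back rc.
func openFixed(rc io.ReadCloser) func() (io.ReadCloser, error) {
	return func() (io.ReadCloser, error) { return rc, nil }
}

// loadState only reads. It accepts a plain io.Reader and never closes it.
func loadState(r io.Reader) (string, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// loadStateFromSource plays the caller: it opens the reader, so the Close
// sits right next to the call that created the Closer.
func loadStateFromSource(open func() (io.ReadCloser, error)) (string, error) {
	rc, err := open()
	if err != nil {
		return "", err
	}
	defer rc.Close()
	return loadState(rc)
}

func main() {
	src := newTracker("state-data")
	state, err := loadStateFromSource(openFixed(src))
	fmt.Println(state, err, src.closed)

	// A reader with no Close method works without manufacturing one.
	plain, err := loadState(strings.NewReader("plain"))
	fmt.Println(plain, err)
}
```

The cost is the one duplicated defer line per caller that jam mentions; the gain is that an audit for unclosed http response bodies only has to look at the call sites that open them.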
<natefinch> jam: what's my business unit?  None of these things seem to fit quite right (doing the travel form for October sprint)
<jam> natefinch: you're under CDO
<jam> that is the group underneath Robbie (Mark Ramm's direct manager)
<natefinch> jam: ok, cool
<dimitern> fwereade_: so when you can https://codereview.appspot.com/13523043/
<fwereade_> I'm even doing it now :)
<dimitern> fwereade_: sweet!
<fwereade_> dimitern, reviewed, LGTM with one extra test
<dimitern> fwereade_: thanks
<dimitern> fwereade_: am I getting this correctly - once the mysqlUnit has entered scope, leaving scope should generate a change in the watcher, right?
<fwereade_> dimitern, scope for unit X is unaffected by any change to unit X
<dimitern> fwereade_: yeah, but unit X in this case is wordpressUnit
<fwereade_> dimitern, ah ok yes
<fwereade_> dimitern, mysqlUnit leaving scope should not generate a change, but should do a depart
<dimitern> fwereade_: isn't that also a change?
<dimitern> fwereade_: in Departed
<dimitern> fwereade_: I'm not getting anything though.. strange
<fwereade_> dimitern, ah yes, got you, I had mismatched levels
<fwereade_> dimitern, missing SyncBackend somewhere?
<dimitern> fwereade_: hmm perhaps, will dig into it
<fwereade_> dimitern, oh, and, I meant to say
<fwereade_> dimitern, you've written the WatcherC
<fwereade_> dimitern, would be really nice to use it in the state tests too
<fwereade_> dimitern, if you swap that in there you'll be able to tell whether the test harness works ;)
<dimitern> fwereade_: ah, right!
<dimitern> fwereade_: I'll start there then
<dimitern> fwereade_: does this look ok to you? http://paste.ubuntu.com/6062708/
<sidnei> hi folks, can i get a pair of reviews? https://codereview.appspot.com/13242049
<fwereade_> dimitern, hmm, might be worth a repeated sync on ShortWait timeout
<dimitern> fwereade_: just Sync() ?
<fwereade_> dimitern, but check how the state tests do it, I might be wrong
<fwereade_> dimitern, ah, hmm
<dimitern> fwereade_: do I need the StartSync() then?
<fwereade_> dimitern, just a sec, I should remind myself
<dimitern> fwereade_: hmm there's no state.Sync() anymore it seems
<dimitern> sidnei: now you need only a single LGTM btw
<sidnei> even better :)
<dimitern> sidnei: fwereade_ might be the right reviewer for this
<fwereade_> sidnei, I am now churning through reviews
<sidnei> thanks!
<dimitern> fwereade_: nevermind, found the issue :)
<dimitern> fwereade_: one spurious for loop that never ends
<teknico> wedgwood: you might find something useful in https://code.launchpad.net/~teknico/charm-helpers/lint-fixes/+merge/181859
<teknico> if not, no worries :-)
<wedgwood> thanks. I normally do reviews on Wednesday afternoons.
<wedgwood> teknico: taking a cursory look, I see one place where imports are commented out (rather than removed) (line 36-39) and one place where logic is changed (line 268-276)
<teknico> wedgwood: the first one is because those imports are example placeholders, IIUC
<teknico> wedgwood: the second is different logic but same behavior: the flow does not ever exit the for loop prematurely
<wedgwood> teknico: ah, right you are. tbh I thought for ... else behaved differently.
<teknico> wedgwood: that was a possibility I wanted to dig up by changing the code ;-)
<wedgwood> it definitely doesn't do what it seems like it should. Poor choice of keywords.
<teknico> wedgwood: yes, it needs to be looked at in the right wavelength light to even begin making sense :-)
<wedgwood> In any case, thanks for the cleanups :)
<teknico> wedgwood: my pleasure, I hope I wasn't too much obsess... I mean, comprehensive about them :-)
<wedgwood> teknico: improvements are improvements, I say
<teknico> I actually removed a few more hundred lines of diffs that changed double-quote string delimiters into single-quote ones, it was becoming a bit too much. :-) maybe some other time
<sidnei> fwereade_: addressed comments, first time using export_test, so hopefully it's good. :)
<fwereade_> sidnei, cheers
<fwereade_> sidnei, LGTM, just pull that duplicated code block out into a func and you're good
<sidnei> fwereade_: a method of DeployerSuite or a standalone func?
<dimitern> fwereade_: ping
<fwereade_> sidnei, standalone, I think, unless it's using fields on the type
<fwereade_> dimitern, pong
<dimitern> fwereade_: so the txnRevno for settings I get from relUnit.Settings always seems to be 1 less than the one I get from the watcher
 * fwereade_ raises a somewhat surprised eyebrow
<dimitern> fwereade_: I'll push the branch as soon as the tests finish
<dimitern> fwereade_: so originally I had c.Assert(actual.Changed, DeepEquals, expected.Changed) in AssertChange()
<dimitern> fwereade_: but it seems all but the versions match (not sure if it's always off by one but it seems likely)
<dimitern> fwereade_: I noticed that in relationunit_test after changing them to use the RelationUnitsWatcherC
<dimitern> fwereade_: the versions were not checked before, just the len(changed), which seems weird
<fwereade_> dimitern, ahhhh-ha, that rings a bell, just a mo
<dimitern> fwereade_: so I resorted to checking actual.Changed, HasLen, len(expected.Changed) and then making sure each key in actual is also in changed
<dimitern> fwereade_: it seemed weird that on remoteUnit.EnterScope(nil) with a fresh relation the initial event reported txnRevno 2 for unit settings (always)
<fwereade_> dimitern, so, just one thought... what are you basing expected versions on?
<fwereade_> dimitern, I think you'll find there is a reason
<dimitern> fwereade_: no idea really - I thought txnRevno should be 1 on a fresh scope
<fwereade_> dimitern, try setting the unit's public address before entering scope ;p
<dimitern> fwereade_: hmm, ok
<fwereade_> dimitern, (entering scope depends on the unit's life, so the txn revno will be 1 + max(scope doc, unit doc))
<dimitern> fwereade_: you mean instead of EnterScope(nil) to do myRelUnit.EnterScope(map[string]interface{}{"private-address": "blah"})
<dimitern> fwereade_: in both cases the initial event comes with version 2
<fwereade_> dimitern, no, I mean make an extra change to the unit doc before joining
<dimitern> fwereade_: aah
<fwereade_> dimitern, so basically settings versions are volatile
<fwereade_> dimitern, now I think of it you should be testing that they're increasing per unit, but should not be testing specific values
<dimitern> fwereade_: I did mysqlUnit.SetPrivateAddress("1.2.3.4") and then myRelUnit, err := rel.Unit(s.mysqlUnit); myRelUnit.EnterScope(nil) - still
<dimitern> fwereade_: still ver 2
<fwereade_> dimitern, you sure private wasn't already set?
<dimitern> fwereade_: positive
<dimitern> fwereade_: I could try some other unit doc field..
<fwereade_> dimitern, or just set it twice, even, I think
<dimitern> fwereade_: did SetPrivateAddress() twice - the same ver 2
<dimitern> fwereade_: (with different values for private address)
<dimitern> fwereade_: I could just ignore the version and make sure the changed field's keys are sane
<fwereade_> dimitern, hum, I'd expected that it would be the unit doc
<fwereade_> dimitern, ok, in any given test you *should* ignore the version
<fwereade_> dimitern, but we should *also* have a test that the version increases with each change of a unit
<fwereade_> dimitern, the value will not be stable as the codebase changes
<fwereade_> dimitern, but we depend on its always getting bigger
<dimitern> fwereade_: when should it increase?
<fwereade_> dimitern, sorry: the settings revno should increase by at least 1 whenever the settings change
<dimitern> fwereade_: where should this test reside?
<fwereade_> dimitern, this may be crazy, but... how about integrating it in the WatcherC?
<fwereade_> dimitern, the client says what should change
<dimitern> fwereade_: but doesn't specify a version
<fwereade_> dimitern, the WatcherC validates the stream of data-per-unit coming out of the watcher by checking strictly increasing versions?
<dimitern> fwereade_: the WatcherC keeps the old txnRevno and checks the new one is greater
<fwereade_> dimitern, yeah, exactly
<fwereade_> dimitern, does that sound crazy?
<dimitern> fwereade_: sgtm
<fwereade_> dimitern, cool
<dimitern> fwereade_: so I need to keep track of each unit that's changed and make the check on each AssertChange
<fwereade_> dimitern, yeah, just keep a map of unit: last-change and whenever you get a new one check and set
<sidnei> fwereade_: it's not, but neither the 'bundle' function is. i shall convert both then.
<fwereade_> sidnei, nice, thanks
<dimitern> fwereade_: worked nicely
<fwereade_> dimitern, sweet
<sidnei> fwereade_: so, on a similar subject. we need to get this fix into currently-deployed prodstack instances at some point. what's the process for that, once we have a built release (which at this point is 1.15.x?) to upgrade the existing deployments that are on 1.13.3-ish?
<natefinch> mgz: can you take a look at this real quick? It's been mostly LGTM'd by john, but I added one more tweak to get the stuff working live with azure: https://codereview.appspot.com/13082044/
<mgz> natefinch: sure
<mgz> hm, can't persuade rietveld to give me the 1->2 diff...
<mgz> ah, pick a file, then change in the dropdown works
<natefinch> mgz: yeah, I don't know what's going on there.  The only addition was one line, adding DeploymentName to the struct we pass to GetDeploymentRequest
<mgz> yeah, but is still blank... I'll branch the thing and look at the actual history
<mgz> probably just the merges from trunk upsetting it
<natefinch> mgz: yeah
<dimitern> fwereade_: would like you to have a final look though https://codereview.appspot.com/13523043
<TheMue> fwereade_: thx for review
#juju-dev 2013-09-05
 * thumper watches the tumbleweed roll past
<axw> it is rather quiet
<axw> wallyworld_: LP just crapped out on an email notification; I've LGTM'd your two simplestreams branches
<wallyworld_> axw: awesome thanks :-)
<wallyworld_> i'm finishing another soon then i'll poke the landing bot
<wallyworld_> axw: with the location of the interfaces in interfaces.go - I prefer them to be in separate files like you suggest, but others on the project want all the interfaces declared in one big file :-(
<axw> wallyworld_: my comment is more about the package location than the file
<axw> if something other than an Environ is to implement HasConfig, then it probably shouldn't be in environs
<wallyworld_> same thing in a way - i would want to define the HasConfig interface in environs.config, but other times got pushback from stuff like that
<wallyworld_> HasConfig returns a environs.config.Config
<axw> yeah, that's why I thought it would be better there
<wallyworld_> me too
<axw> your call - just letting you know what I think :)
<wallyworld_> but for some reason, people don't like cohesion and loose coupling :-(
<axw> lol
<wallyworld_> i agree with you fwiw, i just knew the pushback i'd get
<wallyworld_> i figured i'd have enough to argue about wrt the interface names :-)
<axw> hehe
<axw> why'd the standup change time?
<wallyworld_> axw: cause it conflicted with breakfast time for nate
<axw> ah
<bigjools> I have a juju debug log writing to my terminal, yet I have no actual debug-log running in it ...
<bigjools> wtf
<axw> wallyworld_: Path isn't meant to be relative?
<wallyworld_> axw: it is in the metadata, but when used, a full url is required
<wallyworld_> cause the tools can then be retrieved via wget
<axw> mmk. just slightly confusing that ToolsMetadata.Path has different forms based on whether it's input/output
<wallyworld_> if you are looking at a mp i just posted, i'm fixing a few other things
<axw> yeah I am
<wallyworld_> i can add a new attr
<wallyworld_> Path is only used internally
<wallyworld_> outside of simplestreams, the Tools struct is used
<axw> I'll keep looking, it's probably okay if it doesn't escape
<wallyworld_> i probably should add an absolute path for clarity, even if it is internal only
<axw> wallyworld_: yeah, I think I'd prefer adding either a BaseURL or URL field
<wallyworld_> can do if it aids clarity
<bigjools> is the only was to pass config to a charm initial deployment via the --config directive?
<bigjools> s/was/way/
<wallyworld_> nfi
<wallyworld_> bigjools: where does maas and azure keep a copy of the tool tarballs? or are they just downloaded from the shared s3 bucket?
<bigjools> haha
<bigjools> there's a public storage account for azure that only I know how to upload to
<wallyworld_> which wget works with?
<bigjools> yes
<bigjools> maas doesn't have public tools AFAIK
<wallyworld_> so s3?
<bigjools> since it's not a public cloud
<wallyworld_> or --upload-tools
<bigjools> NFI whatever juju does
<wallyworld_> s3 then, unless upload-tools is used
<bigjools> I normally use --upload-tools
<wallyworld_> bigjools: does maas provider support unauthenticated access to files in its storage using the Storage.URL() method?
<bigjools> it used to but not any more
<wallyworld_> yes, looks like it
<wallyworld_> oh?
<wallyworld_> code seems to be there
<bigjools> it's a security hole
<bigjools> I think we still need to fix it :)
<wallyworld_> hmmm. amazon supports it
<bigjools> maas is not amazon
<bigjools> maas has a backward compatibility mode for unauth access IIRC
<bigjools> jtv1: can you remember the details for this?
 * wallyworld_ is sad that our key interfaces that all providers are supposed to implement are not universally supported
<wallyworld_> which means interface design is wrong
<bigjools> yeah
<bigjools> very hard to get right though
<jtv1> bigjools: for storage?  Do you mean the anon URLs?
<bigjools> jtv1: yes
<jtv1> Oh, and we had the weirdness with the transition between single-tenant and multi-tenant storage, didn't we?
<bigjools> jtv1: I recall that rvb stuffed in some egregious hack
<bigjools> yes exactly that
 * jtv looks
<jtv> There's FileStorage.anon_resource_uri...  it evaluates to a URL where the file can be retrieved anonymously.  To keep the file secret, keep the URL secret.
<wallyworld_> but that is deprecated?
<jtv> Doesn't say so in the source...
<wallyworld_> bigjools seemed to think it may be going away?
<jtv> Key-based access.  That was the way we dealt with this.  You pass the file's key to get its anonymous URL.
<jtv> There were two related issues.  One was legacy files from the single-tenant era, which have no owner.  The other is anonymous access.
<jtv> Ownerless files should go away.  I don't know if we'll ever fully be rid of the code, but we should at least keep up the pretense.
<wallyworld_> sounds like i shouldn't count on anon access
<jtv> Key-based anonymous access was meant to last, as I remember it.
<wallyworld_> ok
<bigjools> I think this hack was put in to make juju work
<bigjools> if memory serves (which it doesn't much these days)
<bigjools> thumper: if I want to pass a chunk of data to a charm, what's the best way?
<bigjools> a huge ugly piece of yaml?
 * thumper shrugs
<thumper> how about base64 encoded bson?
 * thumper ducks
<bigjools> well it's an ssh private key so it's already encoded
<thumper> in all honesty though, NFI
<bigjools> but it's just unwieldy in a charm config
<bigjools> I was hoping there might be a config that says "throw the contents of this file at the charm"
<thumper> sometimes I'm amazed at people's ability to accept so much boilerplate and copy and paste in tests
 * thumper refactors
<bigjools> :(
 * thumper headdesks
<thumper> it is worse than I thought
 * thumper umms
<thumper> to fix it all now, or to make others do it...
<thumper> fuck it
 * thumper fixes
 * thumper chooses appropriate music
<bigjools> Ugly Kid Joe? :)
<thumper> Device will do
<thumper> heh first track is "You Think You Know"
<thumper> second is "Penance" followed by "Vilify"
 * bigjools discovers multiline config in yaml
<wallyworld_> thumper: it's not just tests - the !#^^!@! boilerplate is everywhere. it seems to be the Go way :-(
<jtv> Not just Go.  We used to do it in Launchpad as well.  I think cultural isolation has also been contributing to the repetition of mistakes.
<jtv> Although Go _is_ the first language where I've seen "we'd rather repeat code to minimize dependencies" stated as policy.
<wallyworld_> yep :-(
<bigjools> so thumper, I can't ctrl-c out of juju debug-log
<bigjools> nor, worryingly, ctrl-z
<bigjools> vartdafark?
<jtv> Isn't that the "suspend" signal?
<bigjools> yes
<bigjools> I very rarely see it fail to suspend a shell job
<jtv> Quite.
<bigjools> only special things like ssh catch it and redirect
 * bigjools files a bug
 * thumper sighs
<thumper> bigjools: I ctrl-C'ed out of debug-log before
<bigjools> it's consistent for me
<bigjools> each new env
<thumper> I've got a brain dead review for someone
<jtv> I'll take it.
<jtv> Appropriately.
 * thumper is waiting on damn lbox
<jtv> It's already on Launchpad.
<thumper> so many of these songs are appropriate for hacking/refactoring
<thumper> War of Lies
<thumper> Opinion
<thumper> Haze
<thumper> Through it all
<thumper> Out of Line
<thumper> ah fark... still waiting
 * thumper twiddles thumbs
 * thumper hands jtv the lp review
<thumper> https://code.launchpad.net/~thumper/juju-core/open-api-as-new-machine/+merge/184012
<thumper> \
<thumper> lbox still isn't done
<jtv> Argh.  Somebody has to stop this "while we're here, let's change the gocheck import" insanity.
<thumper> sorry
<thumper> eventually it'll all be done
<jtv> Not at this rate.  Birthday paradox.
<thumper> we aren't adding more
<thumper> so eventually we'll fix it
<thumper> oh rietveld is finally done
<jtv> A new change will come along before it's done.
 * thumper ignores it
<thumper> jtv: I was busy copying tests, as you do, and I saw this setup step. I thought "hey I'd like to use that, I'll factor it out"
<thumper> then I saw the function in the ConnTestSuite
<thumper> and thought who is using that
<thumper> lo and behold, lots of duplicate code
<jtv> thumper: what does "creates a new machine in state" mean?
<thumper> so create a function that does what you actually care about
<thumper> state is the "state of the system" duh, obvious isn't it?
<thumper> that was the answer I got when I first asked
<thumper> and since the package and main variable is called state
<jtv> I see the duplication in the diff...  awesome.  You get the coveted Negative Lines Of Code productivity rating.
<thumper> I'm passing it on
<thumper> 10 files changed, 58 insertions(+), 134 deletions(-)
<jtv> So... we don't know what it means and nobody will explain?
<thumper> I can explain
<thumper> adding a machine to state is like adding a machine row in a db table
<thumper> except we don't have tables
<thumper> or a row in the machine table
<thumper> but we aren't using a relational database for our relational data
<thumper> because mongo has legs
<jtv> So "create a machine in state" is not the same as "create a machine that lies in state"?
 * thumper leaves off the sarcasterisc
<thumper> the whole codebase lies
<thumper> but that isn't what you mean
<jtv> I don't know what I mean any more.  :)
<thumper> I'd rather say "lives in state"
<thumper> you ok with that?
<jtv> But if "state" is the global, general, universal state of everything, why bother saying it in the first place?
<thumper> // OpenAPIAsNewMachine creates a new machine entry that lives in system state, and
<thumper> // then uses that to open the API.
<thumper> could just say "create a new machine and use that..."
<jtv> Definitely clearer.
<jtv> Either is fine by me.
<thumper> do you prefer more verbose?
<thumper> I'll leave it with what I have
<thumper> new revision pushed
<thumper> jtv: and the diff will update automagically \o/
<jtv> "s.st"...  the thing about cratily si ahtt si't leasiy ostl.
 * thumper agrees
<thumper> I HATE it
<thumper> and if I could use bold, 24 point, I would
<thumper> using psychologically damaging RED
 * jtv starts looking for the Health & Safety manual
 * thumper waits for jtv to click approve
<jtv> Mind if I just review the Launchpad proposal?
<thumper> not at all
<jtv> Done.
<thumper> thanks
 * thumper ducks out for a while
<jtv> For the maas & azure providers, I just went in and applied the new import rules throughout.  Not a lot of work, and then it's over with.
<bigjools> is there any way I can force a service to re-install so I can debug its install hook?
<jpds> bigjools: juju resolved --retry service
<bigjools> aha
<bigjools> thanks
<bigjools> jpds: error: unit "tarmac/0" is not in an error state
<jpds> bigjools: Ah, so; destroy-service and redeploy ?
<bigjools> jpds: won't the debug hook disconnect?
<jpds> I would debug-log at that point.
<bigjools> it's a bit of a race getting that in place before the install hook runs and I lose
<bigjools> yes, I am going to have to debug-log :(
<bigjools> or perhaps force an error so I can use resolved
<jpds> bigjools: juju ssh into the box, and look at the log in /var/log/juju/ ?
<bigjools> yeah I am doing that but I need to work out why a command isn't doing what I expect
<jpds> One thing I can think of is juju add-machine; wait for the agent to come up then deploy --to=machine.
<bigjools> could work
<bigjools> but debug-hooks needs a service unit which is not there yet
<bigjools> jpds: ah that worked!
<jpds> bigjools: Brilliant.
<bradm> is it a bad idea to try opening and closing ports in a juju charm by calling /usr/bin/close-port or /usr/bin/open-port ?  that means you need juju installed on the units, right?
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: dimitern | Bugs: 7 Critical, 94 High - https://bugs.launchpad.net/juju-core/
<wallyworld_> bradm: i don't know, but i'll ask later when some more people come online and find out
<bradm> wallyworld_: pretty sure the problem is that squid-reverseproxy is hard coded to call /usr/bin/close-port, when it's moved into /var/lib/juju/tools somewhere
<wallyworld_> yeah i thought close-port was py-juju, but not sure
<bradm> /var/lib/juju/tools/1.13.3-precise-amd64/close-port exists on the node
<bradm> so I'm guessing removing the path might help
<wallyworld_> on the units, jujud is installed automatically, via the tools tarball
<wallyworld_> that path above indicates it's symlinked to jujud
<wallyworld_> and comes from the tools tarball
<bradm> oh, interesting
<bradm> ok, I submitted a fix for the squid-reverseproxy thing, it's a really trivial patch
<bigjools> damn, this juju stuff works!
<wallyworld_> fwereade: i'd love a design chat, maybe in an hour or so after dinner?
<fwereade> wallyworld_, sgtm
<fwereade> wallyworld_, I'm actually doing your review now fwiw
<fwereade> wallyworld_, sorry for the delay
<wallyworld_> fwereade: np. i want to move the new interfaces. the HasConfig one to environs.config. The other one elsewhere
<wallyworld_> i only put them in environs/interfaces.go because that was a stated preference but one i don't agree with
<mattyw> is someone able to answer a quick api question? When I send the json to deploy a service with CharmURL: cs:precise/wordpress I get the error "charm url must include revision". What's the correct way of getting it to deploy the latest version of a charm? I thought you used to be able to do that?
<dimitern> mattyw: this sounds like a bug in the api deploy command
<mattyw> dimitern, there's code in client/client.go:135 to do it
 * dimitern is looking
<dimitern> mattyw: well, by definition rev == -1 means "latest"
<dimitern> mattyw: and it's the default if not set I think
<mattyw> dimitern, looking at parseURL it looks like the default is -1 yeah
<mattyw> dimitern, shall I file a bug?
<dimitern> mattyw: actually looking at the commit log, this changed in rev 1262 by fwereade
<fwereade> damn, what did I do?
<dimitern> mattyw: there's an explicit test for "cs:precise/wordpress" -> "must include revision"
<fwereade> dimitern, mattyw: ah right
<fwereade> dimitern, mattyw: the GUI already has access to the charm store, and the charm store will tell you the latest revision of a given charm
<dimitern> fwereade: I knew there was more to it :)
<fwereade> dimitern, mattyw: and in fact the gui is necessarily using a particular version, with a specific readme, and specific config settings and metadata
<fwereade> dimitern, mattyw: so my position is that the gui should be knowably deploying exactly that version, rather than opening ourselves up to rare and surprising confusion in which you deploy charm X with config Y and are told that it doesn't actually work
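To make the two behaviours in this exchange concrete, here is a small Go sketch: a parser that treats a missing revision as -1 ("latest"), and an API-side check that refuses such URLs. splitRevision and requireRevision are hypothetical helpers written for illustration, assuming the revision is the trailing "-N" of the URL; they are not juju-core's charm URL parsing. Only the error text is taken from mattyw's report.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// latest is the sentinel revision meaning "no revision given".
const latest = -1

// splitRevision separates a trailing "-N" revision from a charm URL.
// If there is no numeric suffix it returns the URL unchanged and latest.
func splitRevision(url string) (string, int) {
	i := strings.LastIndex(url, "-")
	if i < 0 {
		return url, latest
	}
	rev, err := strconv.Atoi(url[i+1:])
	if err != nil || rev < 0 {
		return url, latest
	}
	return url[:i], rev
}

// requireRevision mirrors the stricter API behaviour: the caller must pin
// an exact revision rather than ask for whatever is newest.
func requireRevision(url string) error {
	if _, rev := splitRevision(url); rev == latest {
		return errors.New("charm url must include revision")
	}
	return nil
}

func main() {
	for _, u := range []string{
		"cs:precise/wordpress",
		"cs:precise/wordpress-12",
		"cs:precise/squid-reverseproxy",
	} {
		base, rev := splitRevision(u)
		fmt.Println(base, rev, requireRevision(u))
	}
}
```

A client that is not the GUI would first ask the charm store for the latest revision and then send the pinned URL, which is the trade-off fwereade describes.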
<fwereade> mattyw, does this break something you need?
<mattyw> fwereade, dimitern ok understood, but doesn't that make the api harder to use if you aren't the gui?
<mattyw> fwereade, nothing important, I'm just playing with a few side projects
<mattyw> fwereade, I was playing with writing a client in elixir
<wallyworld_> fwereade: are you free for a quick chat?
<fwereade> mattyw, sorry for the inconvenience -- hopefully the charm store api is itself not too painful to use, though?
<mattyw> fwereade, it's no problem at all! :). where can I find the charm store api?
<dimitern> fwereade: there it is https://codereview.appspot.com/13559043 - RUW completed
<fwereade> wallyworld_, yeah, sgtm, but I may eat at you for some of it
<fwereade> dimitern, sweet, I'll look as soon as I can
<fwereade> mattyw, er, let me look around a mo
<wallyworld_> np. https://plus.google.com/hangouts/_/ff292c4c07ae7359f6cd87925661e2868636a3bd
<wallyworld_> fwereade: ^^^^^  if you are busy we can do it
<wallyworld_> between the meetings
<dimitern> fwereade: btw I tested with debug hooks and the steps you suggested - it all works fine
<fwereade> dimitern, <3
<fwereade> mattyw, it looks pretty weak, but there's this: https://juju-docs.readthedocs.org/en/latest/internals/charm-store.html#upgrades
<fwereade> mattyw, I'm not actually sure what the original source of that was
<fwereade> mattyw, presumably the internals/ docs in pyjuju
<fwereade> mattyw, but it should still be accurate, because it's talking about the exact same charm store
<fwereade> mattyw, it's pretty narrow but it *will* tell you the latest versions :)
<mattyw> fwereade, ok thanks, I'll have a play around and see what I can come up with, thanks a lot for your help
<fwereade> mattyw, cheers, always a pleasure
<fwereade> mmm, leftover pistachio pizza
<dimitern> :)
<thumper> mgz: meeting ping
<dimitern> fwereade: so how about that review?
<fwereade> dimitern, better finish ian's first
<dimitern> fwereade: ah, ok
<natefinch> so, is manual provider available in trunk now?  I thought I saw something about that
<mgz> it's on trunk, but needs some more work, see the thread in response to the last release announcement
<yolanda> hi, i need to send static contents like images on a charm, is that possible in some way?
<mgz> yolanda: yes, more details on exactly what? base64 and yaml into config or a relation works fine, for instance
<yolanda> mgz, i have a gerrit charm and i'd like to have some way to configure look&feel
<yolanda> it includes sending html and css files, but also some images if needed
<yolanda> such as a canonical logo or whatever
<yolanda> mgz, i used that to send some contents but not sure if that will work for a list of images, we need to upload image content and image name
<mgz> yolanda: it could work, but may be rather ugly (eg, list of key-value path/base64 contents would be fine)
<mgz> I wonder if other charms have similar use cases, and what they've done? for instance I know jenkins and horizon allow of front end customisation
<yolanda>  mgz, i'll take a look at those then
<mgz> yolanda: maybe look at one of those, or poke james or adam for suggestions?
<yolanda> mgz, i'm sending some values to charm with base64 encoding, but for a defined field, not a list, so looked ugly to me
<fwereade> yolanda, mgz: sending a readable reference to a place to get the files would be rather nicer, really
<fwereade> yolanda, mgz: ideally humans looking at how you've configured your charm ought to be able to derive some sort of meaning from it directly
<dimitern> fwereade: so i saw ian's review's done :)
<fwereade> dimitern, yeah, a meeting is happening... I will try to multitask it :)
<yolanda> fwereade, so what do you suggest, maybe point to some directory where to upload all the images?
<mgz> fwereade: where though? we don't really give charms access points into filestore, or provide tools to stick stuff in there
<fwereade> yolanda, mgz: IIRC wordpress lets you set a repo url for that sort of thing
<fwereade> yolanda, mgz: it is true we don't provide tools to help directly
<dimitern> fwereade: thanks, no rush, just poking
<mgz> hm, I guess a separate versioned branch does make some sense
<yolanda> mgz, fwereade, i really like the wordpress approach, but what do you do in case of deployments that don't allow to get content from external sources? you just first copy it locally?
<mgz> yolanda: you'd need a branch that your cloud could access, yeah
<yolanda> mgz, so setting that on a launchpad branch sounds good
<yolanda> it's a clean approach
<mgz> all, are we doing standup today, or just going on to the new time tomorrow?
<TheMue> mgz: as i understood jam we'll continue tomorrow with the new time
<natefinch> mgz: yeah, my understanding as well
<TheMue> mgz: the calendar shows no more standups on thursday with the change of the meeting time
<mgz> okay, shall code onwards then
<dimitern> fwereade: still meeting?
<fwereade> nah, I'm actually doing your review :)
<dimitern> cheers :)
<fwereade> dimitern, LGTM
<dimitern> fwereade: thanks!
<sidnei> uhm, anyone interested in upgrade bugs? im trying to do upgrade-juju from ~1.15.0.1 to ~1.15.0.2 and it's stuck in a restart loop
 * TheMue => late lunch
<fwereade> sidnei, I'm interested
<dimitern> sidnei: are you upgrading from a release to a trunk version?
<fwereade> sidnei, are these just your own changes between, or did you merge in as well?
<sidnei> dimitern: both are from source, not from release, but the version number differs
<dimitern> sidnei: you did go install . in cmd/juju and cmd/jujud before trying --upload-tools, right?
<sidnei> dimitern: nope, only go install launchpad.net/juju-core/...
<dimitern> sidnei: well, there's that thing - if you use --upload-tools from source, you should always do these two install steps, otherwise you'll get older binaries
<dimitern> sidnei: pretending to be newer version
<sidnei> dimitern: ok. still didn't solve the issue anyway.
<dimitern> sidnei: np, just wanted to make sure that's out of the way
<dimitern> sidnei: then it seems there's a bug
<sidnei> dimitern, fwereade: http://paste.ubuntu.com/6066470/
<fwereade> sidnei, ah, hell, that looks like https://bugs.launchpad.net/juju-core/+bug/1214676
<_mup_> Bug #1214676: upgrade-juju in local environment causes bootstrap machine agent to restart continuously <juju-core:Triaged> <https://launchpad.net/bugs/1214676>
<sidnei> indeed
<natefinch> jam, fwereade, etc: You guys have anything you think I should be working on? I'm kinda out of stuff. I could look at bugs, but not sure if there's something more appropriate
<fwereade> natefinch, oddly enough, that upgrade that sidnei just pointed out above is a pretty big deal
<natefinch> fwereade: I'd be happy to look at it
<fwereade> natefinch, I was just thinking "gaah who will look at that"
<fwereade> natefinch, <3
<mgz> :)
<dimitern> fwereade: if I have a suite A that embeds another suite B, do I need to define A.SetUpTest, just so it will call B.SetUpTest ?
<dimitern> fwereade: provided that's the only thing A.SetUpTest has to do?
<fwereade> dimitern, it *will* but it's prone to accidental screwing-up so we prefer to be explicit
<mgz> gah, editor lag meant my change didn't get in that file? why did I not look at the diff again before landing...
<dimitern> fwereade: ah, ok - it seems to work both ways, but I too prefer to be explicit
<sidnei> fwereade: https://bugs.launchpad.net/juju-core/+bug/1190985 is another one I'd love if you could prioritize
<_mup_> Bug #1190985: Confusing upgrade-charm and deploy -u behavior <juju-core:Triaged> <https://launchpad.net/bugs/1190985>
<fwereade> sidnei, ha, yes, that is an ugly one
<dimitern> a quick, trivial review anyone? https://codereview.appspot.com/13561043
<mgz> dimitern: can I have a rubber stamp on cl 13562043
<mgz> swap you :)
<dimitern> mgz: sure
<mgz> was typing before you asked :)
<dimitern> mgz: stamped! :)
<mgz> hm, same // Note in both new files, and some other dup still, but not a bad +484/-434 considering it's much clearer overall
<mgz> lgtmed.
<dimitern> thanks
<dimitern> will fix the note
<dimitern> what other dup?
<mgz> I think the note is correct, it's just... odd having the same long comment in two different tests in different files
<dimitern> ah, the watcher note, yeah
<mgz> the dup is mostly just that every action takes three lines
<mgz> create something; assert err is nil, assert actual assertion... and several tests have three or more actions before they get to what they actually are about
<natefinch> mgz: sounds like we need a checker that takes err, realValue and does that for us
<mgz> hm, actually, some of these later tests just create a unit then check the error, and don't also assert Life() is alive
<dimitern> well, the unit is alive
<dimitern> it's created every time
<dimitern> only in refresh and life tests it's explicitly asserted over
<mgz> right, so maybe that check is actually redundant in the earlier tests (if we get there and the thing isn't alive, we have bigger problems other tests would break on)
<fwereade> oh, hell, I have to look at houses again in a few minutes, and it's the cross-team call imminently
<mgz> there doesn't seem to be much easy reduction to do though, the tests are just verbose
<fwereade> can I deputise someone to attend and let them know what we're up to please?
<mgz> fwereade: I could cross-team; is there anything particular you wanted to say?
<fwereade> mgz, not really
<mgz> have fun looking at houses :)
<fwereade> mgz, oh, I am
<fwereade> mgz, this week is so much better than normal weeks ;p
 * fwereade shouldn't grumble
<natefinch> fwereade: if we didn't grumble, how would anyone know we're programmers? :)
<fwereade> haha :)
<natefinch> am I supposed to be able to SSH into the lxc container with juju ssh 0?  I get connection refused
<mgz> natefinch: with lxc, you ssh into the units, not the machines
<natefinch> mgz: ahh, ok
<natefinch> mgz: I'm guessing this is probably bad: agent-state-info: '(error: container "nate-local-machine-1" is already created)'
<mgz> that does seem bad, but you may be able to straight resolve it
<natefinch> mgz: I had been diddling with local provider a while back... maybe some  cruft left over
 * natefinch is ramping up his lxc-fu
<mgz> all the way to lxc-fuuuuu
<natefinch> rofl.... exactly
<natefinch> I think that's a short ramp
<TheMue> natefinch: tell me about your installation experiences so that i can possibly change the docs (if i have)
<TheMue> natefinch: here it worked, but we today already have seen troubles with the firewall, so i changed the doc (it's proposed)
<natefinch> TheMue: the biggest problem I had was that my go installation was local to my normal account, so when I do sudo juju bootstrap, it couldn't find the go executable
<natefinch> (so it could build jujud)
<TheMue> natefinch: so, back again, had a talk with robbiew
<TheMue> natefinch: yep, doing it as a juju developer is different, the docs are for users on a clean system. but i'll talk to the others about adding a kind of box "local provider for juju contributors"
<mgz> TheMue: did robbiew poke you, or did you just join the call?
<robbiew> mgz: I'll join..running late
<mgz> robbiew: no problem, just wanted to make sure I was in the right place :)
 * mgz proposes merge in the meantime
<TheMue> mgz: had my 1:1
#juju-dev 2013-09-06
<wallyworld_> davecheney: do you know how to get the magic gobbledegook to put into the dependencies.csv file for a new goose revision?
<wallyworld_> nevermind, found it
<davecheney> wallyworld_: that deps file doesn't do anything
<davecheney> you know that right ?
<davecheney> nothing consumes it
<davecheney> so we can't tell if it is wrong
<wallyworld_> i think the bot uses it?
<davecheney> nope
<davecheney> i recommend deleting the file until there is something that can consume it
<wallyworld_> hmmm. i'll ask at the standup later today
<davecheney> kk
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: - | Bugs: 7 Critical, 94 High - https://bugs.launchpad.net/juju-core/
<wallyworld_> fwereade: i've been having a lot of issues with the bot today. io timeouts on RelationUnitSuite.TestProReqWatchScope and RelationUnitSuite.TestContainerWatchScope. 2 of my branches landed ok. the tests pass locally. on one bot run, a test setup also timed out. have you seen this at all before?
<fwereade> wallyworld_, hmm, no, I have not... *but* I reinstated those tests only a few days ago, after they got dropped accidentally, and dimitern then modified them just yesterday
<fwereade> dimitern, can I ask you to take a quick look please?
<wallyworld_> strange that my first 2 branches landed, and the tird failed
<wallyworld_> third
<fwereade> I will be out for houses for, hopefully, the last time today
<wallyworld_> np. i'm about to have dinner soon
<dimitern> fwereade: will look
<TheMue> fwereade: one short question regarding the result of ServiceGet in case of defaults. if an option is set to default but this default is nil we currently return "value: nil".
<TheMue> fwereade: if i remember it right we said that in this case we can omit the output of value.
<TheMue> fwereade: am i right? (not sure anymore)
<dimitern> wallyworld_: is it always the same test failing?
<dimitern> ah, there are 2 of them
<TheMue> fwereade: already out?
<wallyworld_> dimitern: yeah, the same 2
<wallyworld_> dimitern: https://pastebin.canonical.com/97052/ if it helps
<dimitern> wallyworld_: it's strange that they didn't fail when I submitted my change
<wallyworld_> yeah, and they seem to pass locally for me too
<dimitern> wallyworld_: and the bot wasn't under load presumably
<wallyworld_> don't think so
<wallyworld_> i guess we can wait till the next bot run and see what happens
<dimitern> yeah
<dimitern> it's something to do with the dummy provider I think
<dimitern> yay i just got my Clean Code book
<dimitern> wallyworld_: I have no clue what could have gotten wrong :/
<wallyworld_> thanks for looking, i was hoping maybe something would jump out
<dimitern> there's no specific timeout that could've been triggered there
<dimitern> it's just a state call
<dimitern> mgz: are you the OCR today?
<dimitern> https://codereview.appspot.com/13249050/
<dimitern> i'd like a review on that if anyone's available
<TheMue> looking
<TheMue> dimitern: you've got a review
<dimitern> TheMue: cheers!
<raywang> hi, anyone knows how to change juju gui's admin password?
<TheMue> raywang: did you already ask in #juju-gui? i can't help, sorry.
<raywang> i see, thanks TheMue
<frankban> raywang: the password is the admin-secret in your environments.yaml. AFAIK there is no way to change that after the environment is bootstrapped.
<raywang> frankban, well, user sometimes need to change the password then. it's necessary to have this functionality :)
<frankban> raywang: improvements on the auth machinery are in progress, and eventually it will be possible to do that, but for now you can change the admin secret only before you create the environment
<raywang> frankban, understood, thanks for the information :)
<frankban> raywang: welcome
<dimitern> TheMue: https://codereview.appspot.com/13269051 - this should fix the tests failing only on the bot and preventing stuff from landing
<TheMue> dimitern: taking a look
<TheMue> dimitern: those two calls lead to the failing?
<dimitern> TheMue: yeah - see https://pastebin.canonical.com/97052/
<dimitern> TheMue: I hope this should fix the issue, if not we'll look for other things
<TheMue> dimitern: interesting, it dislikes the connection to mongo in that moment
<dimitern> TheMue: it's really weird
<dimitern> TheMue: but since I added these asserts recently, removing them shouldn't harm anything
<TheMue> dimitern: yep, just lgtm'ed, give it a try
<dimitern> TheMue: cheers
<dimitern> it didn't work :(
<TheMue> dimitern: still fails, yes
<TheMue> dimitern: just took a look at the result
<dimitern> TheMue: I still have no idea why that happens
<dimitern> TheMue, natefinch, fwereade,mgz, wallyworld_: standup
<natefinch> fyi the standup link is different from before (I had the old one bookmarked)
<natefinch> fwereade, mgz ^^
<mgz> ah, doh, forgot about the time move
<natefinch> heh, figured
<dimitern> hmm some things were missing and had to be reconfigured after the bot machine reboot
<dimitern> like the crontab job
<dimitern> and it appears the golang version got updated from 1.1.1 to 1.1.2 so I had to wipe out the linux_amd64 object files
<dimitern> seems to be working .. for now
<mgz> hm... wonder how we'd cope with the upgrade in a less manual fashion
<dimitern> easy - put a rc.d script that does these steps on boot
<mgz> that seems sane
<dimitern> but in fact the charm should do that in a hook when it starts
 * dimitern fingers crossed.. tests almost done
<dimitern> :( nope, still the same failure
<TheMue> weird
<dimitern> so, an update - tried downgrading go to 1.1.1 on the bot - the two tests still fail, so I reverted back the go version to 1.1.2 from the archive and I'm proposing this change, which skips these two tests; tried manually on the bot - tests pass
<dimitern> proposing now and will send the link shortly
<dimitern> https://codereview.appspot.com/13487044
<dimitern> can someone lgtm that so I can land it
<dimitern> mgz, TheMue: ^^
<mgz> looking
<mgz> dimitern, have you filed a bug?
<dimitern> mgz: no
<dimitern> mgz: will do
<dimitern> added bug 1221705
<_mup_> Bug #1221705: relationunit_test.go: 2 tests fail only on the bot <juju-core:Confirmed> <https://launchpad.net/bugs/1221705>
<dimitern> \o/ !! it landed
<dimitern> the bot should be good again
<mgz> well played dimitern :)
<TheMue> +1, super
<dimitern> well, I haven't really fixed the problem, but we have carte blanche for now
 * TheMue just made something to eat and now has lunch beside the computer
<dimitern> wallyworld_: your last branch should be good to land
#juju-dev 2013-09-07
<benonsoftware> ~/sort
#juju-dev 2013-09-08
 * davecheney waves
<bradm> davecheney: hey, how do I get my squid-reverseproxy fix into the charm?  it looks like you've approved it?
<davecheney> bradm: i'm only a baby charmer
<davecheney> i'm not grown up enough to do release yet
<davecheney> let me whinge in #eco
<bradm> davecheney: cool - I'm happy if the answer is wait, just not sure if I'm supposed to do anything or what
<davecheney> bradm: me neither
<davecheney> apparently some juju (ha!) is needed to rev the charm in the charm store
<bradm> haha, right
<bradm> no rush on it, just interested - this'll be my first patch to a charm, and I've got more coming, so I wanted to know if I'd done something wrong or what
<davecheney> i'm sure marco will fix it in short order
<pjdc> davecheney: can you also check if the same magic can be applied to https://code.launchpad.net/~pjdc/charms/precise/haproxy/default-timeouts/+merge/177302 please?
<bradm> davecheney: cool, thanks for chasing it up
#juju-dev 2014-09-01
<waigani> thumper: do we have any nice way of passing a userTag over the wire?
<waigani> I'm currently breaking it up on the client, putting it in a params struct and rebuilding it on the server
<waigani> alternatively, we could not build the tag until the server side, and just pass through usernames...
<thumper> waigani: yes, it is a tag
<thumper> wallyworld: got 5 min?
<wallyworld> sure
<waigani> thumper: we should allow the env to be 'unshared' as well right?
<thumper> wallyworld: https://plus.google.com/hangouts/_/canonical.com/tim-ian
<thumper> waigani: yes...
<thumper> waigani: not sure "unshare" is the right verb
<waigani> hence the scare quotes
<waigani> under the hood that would just be RemoveEnvironmentUser
<thumper> trivial review for someone: https://github.com/juju/juju/pull/641
<thumper> and my notes about bootstrapping: https://github.com/juju/juju/pull/642
<axw> review please: https://github.com/juju/juju/pull/643  -- fixes CI blocker
<thumper> axw: does this address both blockers?
<axw> thumper: I've only looked at the one so far
<axw> will take a look at the other now
<thumper> kk
<wallyworld> thumper: what's the other blocker? there was only one i thought
<wallyworld> the other critical bug is not a blocker as it's not a regression
<thumper> https://bugs.launchpad.net/juju-core/+bug/1363143
<mup> Bug #1363143: local lxc deployments fail to create machines <ci> <local-provider> <lxc> <precise> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1363143>
<thumper> showing "ci regression"
<wallyworld> thumper: ah, it hasn't got a milestone
<wallyworld> i was looking at 1.21alpha1 milestone bugs
<wallyworld> it could be the same root cause as the manual provider fix
<thumper> could be
<axw> wallyworld thumper: is local meant to be doing apt-get update/upgrade by default now?
<axw> cos it seems to be
<wallyworld> axw: with lxc-clone, it wasn't meant to
<axw> isn't lxc-clone implicit on local?
<wallyworld> it is now yes
<wallyworld> as of 1.20
<axw> I'm doing add-machine on trunk with no special config, and added machines are updating/upgrading
<wallyworld> if local is doing the apt dance with lxc-clone = true, that's wrong
<axw> I expected that to clone and not update/upgrade
<wallyworld> do you have a template container created?
<axw> yes
<wallyworld> and if it is using that, ie clone = true, then there's a bug if it's doing apt
<axw> I'll look into it more after I've repro'd this bug
<waigani> thumper: added Usermanager.AddEnvironmentUser to the serverapi, we can expose it as Environment.Share on the client: https://github.com/juju/juju/pull/644
<thumper> waigani: Noooooooo
<waigani> sigghhhhhhh
<wallyworld> axw: i'll look as well - people wanted the option of doing apt in clones so as not to have stale templates, but we wanted to also preserve the current default behaviour
<axw> thanks
<waigani> thumper: hangout?
 * thumper nods
<wallyworld> axw: there's a test that passes TestLocalDisablesUpgradesWhenCloning
<wallyworld> maybe the test is wrong
<thumper> waigani: muted?
<thumper> waigani: you are very quiet
<axw> wallyworld: I didn't specify lxc-clone or use-clone or anything
 * axw looks at the test
<wallyworld> axw: lxc-clone defaults to true - maybe the apt logic is broken in that it doesn't handle implicit defaults
<wallyworld> i'll check
<wallyworld> axw: i found the bug
<wallyworld> well, i think so from reading the code - i'll fix
<axw> wallyworld: cool, so I'm not going crazy :)
<axw> thanks
<wallyworld> nope, not this time :-)
<axw> :p
<wallyworld> i'll write a test to check my theory
<wallyworld> axw: just checked the config code - seems that lxc-clone defaults to false, which means i totally misremembered what was done
<axw> huh, ok.
<axw> wallyworld: it's in container/lxc
<wallyworld> i *think* we must have wanted to retain the original old behaviour which was before clone was supported ie do not clone unless asked
<axw> if it's not set, container/lxc auto-detects support based on series
<axw> https://github.com/juju/juju/blob/master/container/lxc/lxc.go#L104
<wallyworld> axw: huh, well that conflicts with the code in config
<wallyworld> it will be messy because that check in lxc.go is done on the host machine
<wallyworld> the config parsing is done on the client
<axw> they're one and the same :)
<axw> for local anyway
<wallyworld> not for maas with lxc etc
<axw> yes, will be messy for them probably
<wallyworld> for now, maybe best just to be explicit with lxc-clone=true
<thumper> axw: I'll update my doc to match your comments
<axw> thumper: thanks
<axw> and thanks in general, I've been meaning to write that for a while...
<thumper> yeah, me too
<thumper> well...
<thumper> I deleted the AddAdminUser method
<thumper> and got all the breakages I expected before...
 * thumper enfixorates
 * thumper chuckles at the pure number of panics
<thumper> just for state: OOPS: 248 passed, 90 FAILED, 213 MISSED
<davecheney> whoops, juju dev is usually #2 on my irris
<waigani> thumper: https://github.com/juju/juju/pull/645
<davecheney> whelp, that's a circular import
<davecheney> time to table flip and try again
<davecheney> state/state.go:
<davecheney> 30:     "github.com/juju/juju/environmentserver/authentication"
<davecheney> ffs
 * thumper weeps and leaves
<thumper> searched for "user-admin" in our tests
<thumper> lots
<thumper> no longer admin
 * thumper goes to make dinner
<thumper> problem for tomorrow-tim
<axw> wallyworld: it looks like the tools-in-cloud-config for local may be what's making things not work well in local/lxc
<axw> I'm going to back it out and see if it fixes things
<axw> from what I can see, it makes cloud-init take a lot more CPU. I guess it's hurting the YAML parser
<wallyworld> axw: could be, yeah. i would prefer another way tbh
<axw> wallyworld: prefer another way? as in, other than what's currently in master?
<wallyworld> axw: scp or something like that - to get tools into the bootstrap machine. we used to set up an http service but that fails if firewall ports are closed
<axw> wallyworld: ah. well I'm only changing it for non-bootstrap machines atm. bootstrap is fine
<axw> (but could be better, I agree)
<wallyworld> np, let's iterate on it
 * wallyworld -> school pickup, bbiab
<jam> axw: are you putting a 8MB tarball into a text configuration file ?
<axw> jam: indeed, and now reverting that :)
<jam> axw: did you test whether that size was even feasible for Userdata ?
<axw> jam: yes, it does work. I only did it on the local provider, and it works for both lxc and kvm
<axw> but it does seem to add significant overhead, which I hadn't noticed before
<axw> and delays agent startup... probably didn't notice because I only tested on my laptop before
<axw> just tested on a VBox VM, and it was noticeable
<jam> yaml doesn't do length-prefixed delimiting, so I can imagine that looking for the closing marker on 8MB * (4/3 base64) of tarball is a bit expensive
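The 4/3 factor jam mentions can be checked with the standard library. This is a small sketch; the 8 MiB figure is an assumed size for the tools tarball, not a measured one.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodedSize reports how many bytes n raw bytes occupy after standard,
// padded base64 encoding.
func encodedSize(n int) int {
	return base64.StdEncoding.EncodedLen(n)
}

func main() {
	raw := 8 * 1024 * 1024 // assumed tarball size: 8 MiB
	enc := encodedSize(raw)
	fmt.Printf("raw=%d bytes, base64=%d bytes, ratio=%.3f\n",
		raw, enc, float64(enc)/float64(raw))
}
```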
<axw> jam: https://github.com/juju/juju/pull/648 reverts it, if you have a moment to review. fairly trivial
<jam> axw: the first part of that certainly looks like we are only putting the URL into the user-data
<jam> Tools.URL
<jam> is being set
<jam> shouldn't there be some sort of encoding of the actual content?
<axw> jam: environs/cloudinit treats file:// specially, and reads the file in when generating the cloud-config
<axw> that's used for bootstrap at the moment
<axw> and manual provisioning
<jam> axw: that looks so totally horrible to me... :(
<jam> "if something starts with file://" then it must be a tools prefix and thus we should read it into our cloud config file as tools.tar.gz
<jam> very "spooky action at a distance"
<axw> yes, it is a bit magical and needs fixing
<jam> maybe if it had at least "tools" in that string.
<axw> in which string?
<axw> jam: oh, we only do that in one specific place: when generating the "copy tools" command
<jam> nm, it isn't anything with file:// it is just if Tools.URL has file:// which is slightly better, but still
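The scheme-based special case under discussion might look roughly like the sketch below. localToolsPath is a hypothetical helper, not the code in environs/cloudinit, and the example paths and hosts are made up.

```go
package main

import (
	"fmt"
	"net/url"
)

// localToolsPath reports whether toolsURL uses the file scheme and, if
// so, returns the local path it points at. Any other scheme means the
// tools would be downloaded rather than read from disk.
func localToolsPath(toolsURL string) (string, bool) {
	u, err := url.Parse(toolsURL)
	if err != nil || u.Scheme != "file" {
		return "", false
	}
	return u.Path, true
}

func main() {
	for _, s := range []string{
		"file:///var/lib/juju/tools.tar.gz", // hypothetical local path
		"https://example.com/tools.tar.gz",  // hypothetical remote URL
	} {
		path, local := localToolsPath(s)
		fmt.Printf("%s local=%v path=%q\n", s, local, path)
	}
}
```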
<axw> yes, still a bit magical. it'd be better if we just scp'd it in the first place
<jam> axw: just to confirm you've tried it with and without and the overhead is significantly better after your patch, right?
<axw> will try to reorganise things at some point to accommodate that
<jam> axw: agreed
<axw> jam: yes, on my VM it's noticeably faster
<axw> also doesn't leave 8MB cloud-init files lying around in the lxc container cache
<jam> axw: so with your change we just get the Tools.URL that we discovered, rather than overwriting it as a "file://' url, right?
<axw> yup, it'll just do what every other provider does - download from the API server
<jam> axw: oh man... AddBinaryFile adds a shell script which is doing "printf %s BASE64CONTENTS | base64 -d > file"
<jam> We're lucky the shell was allowing it given that size
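To see why size matters here, this is a sketch of how a command of the shape jam quotes could be generated. binaryFileCmd and shQuote are hypothetical helpers, not the real AddBinaryFile; the point is that the whole payload ends up inline on one shell command line.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// shQuote wraps s in single quotes for a POSIX shell, escaping any
// single quotes it contains.
func shQuote(s string) string {
	return "'" + strings.Replace(s, "'", `'\''`, -1) + "'"
}

// binaryFileCmd returns a shell command that recreates data at path by
// piping an inline base64 literal through "base64 -d".
func binaryFileCmd(path string, data []byte) string {
	encoded := base64.StdEncoding.EncodeToString(data)
	return fmt.Sprintf("printf %%s %s | base64 -d > %s", encoded, shQuote(path))
}

func main() {
	cmd := binaryFileCmd("/tmp/example.bin", []byte("hi"))
	fmt.Println(cmd)
	fmt.Println("command length:", len(cmd))
}
```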
<jam> axw: LGTM
<axw> thanks
<jam> axw: fwiw, ec2 User Data is capped at 16kB, which I think would be a sane rule for us to follow.
<axw> jam: yep, I won't make that mistake again. bootstrap will migrate sooner or later, but that at least does not seem to have this problem
<axw> bbs, school pickup
<jam> later
<TheMue> morning
<jam> dimitern: just grabbing some coffee, will be there in a bit
<dimitern> jam, sure, omw
<wallyworld> fwereade: heya, you back on board this week?
<fwereade> wallyworld, heyhey
<fwereade> wallyworld, yeah :)
<wallyworld> fwereade: awesome. i'd love to catch up via hangout when you have some time
<wallyworld> maybe ping me later when you have read all your email
<fwereade> wallyworld, cheers
<fwereade> wallyworld, will do
<jam1> dimitern: did my connection die or is it yours?
<mattyw> so - is there a way for me to know if landing is currently blocked?
<mattyw> also - morning all
<jam1> mattyw: try to land something and CI will reject it?
<jam1> AFAIK there is no blockers right now
<jam1> are no
<wallyworld> critical bugs are in the topic - maybe we should update those to indicate which ones block landings
<mattyw> jam, I was wondering if there was a better way - so I can work out if there's any point trying to land something
<mattyw> but I guess better to ask forgiveness and all that
<jam1> mattyw: well you can do the search yourself
<jam1> for any "ci+regression" bugs
<wallyworld> maybe the bugs in the topic are the blockers already
<jam1> https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL
<mattyw> jam, ok great
<jam1> wallyworld: looks like LXC deployments failing to create machines is considered blocking?
<mattyw> wallyworld, jam, that's what blocked me on friday
<jam1> mattyw: axw just submitted something for it
<jam1> at least, I reviewed something that sounded like it was this
<wallyworld> jam1: yeah, there were 2 - one for manual, one for local
<axw> first is done, second is on its way
<wallyworld> the manual one was fallout of two branches sort of landing together
<jam1> mattyw: https://github.com/juju/juju/pull/648 is addressing bug #1363143
<mup> Bug #1363143: local lxc deployments fail to create machines <ci> <local-provider> <lxc> <precise> <regression> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1363143>
<jam1> so it is queued right now
<wallyworld> fwereade: also, since you're ocr today, i'd love a review of this which *may* solve one of the container pending issues we cannot reproduce but the landscape guys can https://github.com/juju/juju/pull/646
<fwereade> wallyworld, just having a ciggie; quick hangout after that?
<wallyworld> sure
<jam1> wallyworld: so my quick take from my experience there was that it was that the provisioner would puke on certain kinds of errors
<jam1> is that what you're changing?
<mattyw> wallyworld, we still using the spreadsheet to work out our ocr days? isn't it changing to the calendar soon?
<wallyworld> mattyw: yes, soon :-)
<jam1> wallyworld: I saw that when tools couldn't be found
<jam1> it treated no-tools as a bug in the Provisioner code, which would cause it to restart
<wallyworld> jam1: pretty much - but in this case, it's an inconsistent database. the  machine was there but the status record wasn't
<jam1> but of course, its queue still said "i need to start a machine with tools that aren't available" so it would just keep doing that
<wallyworld> this is different
<jam1> so it *felt* to me that errors during provisioning shouldn't be treated as a Provisioner failure but a failure to provision
<jam1> and thus the Provisioner could keep going to the next thing to start.
<wallyworld> yep, agreed
<jam1> which is what we did, but specific to tools
<wallyworld> this problem is one case of the provisioner not being robust
<wallyworld> and also is due to our lack of transactions
<wallyworld> so the provisioner gets told a machine is ready to provision, except it isn't
<wallyworld> because there's no status record yet
<wallyworld> that arrives later, after the provisioner has already errored for that machine
<wallyworld> well, that's what the debug logs show
<wallyworld> and it explains the behaviour that's been seen
<jam1> mattyw: wallyworld: andrew's patch 'succeeded' except for a replicaset_test timeout
<jam1> so it should be unblocked RSN
<wallyworld> !@%$!~$@~ replicasets
<jam1> wallyworld: so it looks to be a traceback in TestAddRemoveSet for the MongoSuite and not MongoSuiteIPv6
<jam1> which leads me to believe that it is just both that are flakey
<jam1> though you specifically called out just the IPv6 version
<jam1> (they run the same test with different addresses)
<wallyworld> jam1: i called that one out because there's been a bug raised for it and it was assigned to 1.20 series
<wallyworld> i think it was holding up CI at one point
<wallyworld> ie failing very often, hence they raised it and assigned it
<jam1> wallyworld: sure. Looking at the code AFAICT they are identical except we always do the IPv4 setup, and then we follow that with an IPv6 setup in the v6 case
<jam1> and I thought that doing double set up might be the problem.
<jam1> but didn't get a chance to actually run it enough to have any confidence there.
<wallyworld> ah, ok. i hadn't looked specifically at the test
<jam1> wallyworld: it only matters because *my* team was responsible for adding IPv6 but we didn't write the original test, not that we can't be the ones to fix it
<wallyworld> jam1: fair enough. i was sort of thinking that whoever wrote the failing tests could have the best chance of fixing them, but of course anyone can fix any test
<mattyw> jam, so even though the lxc bug has been marked as fix committed my branch still fails to land because of it - do I need to wait till it's marked fix-released?
<jam1> mattyw: I just checked the URL that the bot is supposed to be using, and it is clear now. Though LP timed out the first time I tried
<jam1> maybe just try submitting again? It hadn't updated when it got to your previous request
<mattyw> jam1, will do - thanks
<mattyw> jam1, I just tried again and it failed - I'll go make some coffee and try later, I don't want to spam the poor thing on a monday morning
<jam1> mattyw: can you link the PR, I'd like to see it
<mattyw> jam1, https://github.com/juju/juju/pull/562
<jam1> axw: wallyworld: I thought the bot ignored Fix Committed bugs, but it is clearly still complaining.
<jam1> Do we need to drop it from CRITICAL so that we can land code again?
<axw> jam1: not sure, I thought it ignored them too. can try it I guess
<jam1> axw:  http://paste.ubuntu.com/8204244/ is the specific request the bot is making
<jam1> that sure looks like just Triaged and In Progress (and not Fix Committed)
<jam1> axw:  mattyw: when I go to https://api.launchpad.net/devel/juju-core?ws.op=searchTasks&status%3Alist=Triaged&status%3Alist=In+Progress&importance%3Alist=Critical&tags%3Alist=regression&tags%3Alist=ci&tags_combinator=All
<jam1> it returns an empty list
<axw> weird
<mattyw> I'll just try to land one more time
<mattyw> I'm sure the bots likes being kept busy anyway
<axw> jam1: where's the code that's making the request? could it be caching?
<jam1> axw: I'm pretty sure the code is: https://code.launchpad.net/~juju-qa/juju-ci-tools/trunk
<jam1> but I can't find anything that calls check_blockers.py
<jam1> axw: that code is just doing urllib2.urlopen so it shouldn't be doing any caching.
<axw> jam1: the Jenkins job calls it directly
<axw> calls check_blockers.py
<jam1> axw: k, I don't think I have visibility into that layer
<axw> nothing enlightening
 * axw shrugs
<jam1> and it has now been 7 minutes from a $$Merge$$ before the bot has noticed.
<jam1> all the other requests in that thread show as "request accepted within 1 min"
<mattyw> it really doesn't want to land that branch
<mattyw> jam1, I've seen it completely miss a branch before - can't remember the reason, but I've seen it happen
<jam1> TheMue: I'm currently in a meeting, so I might be a bit late for our 1:1
<jam1> but I'll keep you updated
<mattyw> mgz, ping?
<TheMue> jam: ok, I'm here
<TheMue> jam: so ping me when you're ready
<wallyworld> axw: you free for standup now since katherine is away?
<axw> wallyworld: sure, gimme a couple of mins
<wallyworld> jam: sorry, was on a call before, did you get the bot issue sorted?
<jam2> wallyworld: the bot seemed to be sleeping for an hour: http://juju-ci.vapour.ws:8080/job/github-merge-juju/
<jam2> see the 1hr gap from 8:18 to 9:17
<jam2> wallyworld: axw: mattyw: it appears to have woken up for axw, but is still ignoring matty ?
<wallyworld> jam: so it was, hmmmm
<jam2> wallyworld: https://github.com/juju/juju/pull/562 seem to have been pending for an hour without the bot noticing
<mattyw> jam2, looks like it - I've seen this happen before - but not for ages
<axw> wallyworld: can barely hear you
<mattyw> jam, jam2 which is the real one?
<wallyworld> jam: i had no idea why ottomh
<wallyworld> maybe mgz  can look into it
<mattyw> jam, last time this happened mgz did some magic
<axw> now frozen
<jam> mattyw: I'm on 2 machines, jam2 happens to be my laptop which is a bit better than my desktop *right now*
<mattyw> jam, so I guess what would be most annoying for you would be pinging them alternately?
<mattyw> jam2, right?
<jam2> my laptop pings on both of them
<wallyworld> axw: you froze, you still there?
<axw> wallyworld: hangouts isn't loading..
<axw> I got cut off, now I can't load hangouts
<axw> hang on
<wallyworld> \o/
<axw> my wifi setup is dodgy atm
<jam> TheMue: I'm in the hangout, if you can try to make it quickly
<TheMue> jam: OK
<mattyw> fwereade, we should have a chat about what we've discussed around metrics & environment
<mattyw> fwereade, also, good morning
<fwereade> mattyw, heyhey :)
<jam> TheMue: just a reminder to be booking your travel to brussels
<wallyworld> fwereade: maybe this time https://github.com/juju/juju/pull/650
<fwereade> wallyworld, LGTM
<wallyworld> fwereade: tyvm
<fwereade> mattyw, free for a quick chat?
<wallyworld> jam: i take your point but if I do prereqOps = append(prereqOps, machineOp) then the result isn't just prereqOps anymore, so I deliberately used a different name. prereqOps isn't used elsewhere so if it gets modified it doesn't matter
<jam> dimitern: https://github.com/juju/juju/pull/627
<wallyworld> sound ok?
<jam> wallyworld: I understood why, but it means that line has an unspecified side effect, I'd rather we were concrete about them.
<jam> You could just do:
<jam> return mdoc, append(...), nil
<jam> I'll live with it either way, but I feel it is risky to do "append()" and assign it to another variable
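The risk jam is pointing at comes from slices sharing a backing array. A minimal sketch, with made-up contents and a deliberately oversized capacity so the aliasing shows up:

```go
package main

import "fmt"

// aliasDemo appends to the same slice twice, assigning each result to a
// different variable. Because prereqOps has spare capacity, both appends
// write into the same backing array slot.
func aliasDemo() (string, string) {
	prereqOps := make([]string, 1, 4) // length 1, capacity 4
	prereqOps[0] = "prereq"

	allOps := append(prereqOps, "machineOp")
	otherOps := append(prereqOps, "otherOp") // overwrites allOps[1]

	return allOps[1], otherOps[1]
}

func main() {
	first, second := aliasDemo()
	fmt.Println(first, second)
}
```

Whether this bites in practice depends on the capacity of the original slice and on whether it is appended to again, which is why the side effect is easy to miss.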
<wallyworld> thought about that too, but it looked ugly. i might change it to that though. the side effect is local only
<wallyworld> there's one case with two lists that need appending where it will get messy
<mattyw> fwereade, sorry - yes I am now
<fwereade> mattyw, np, I'll start a hangout
<bogdanteleaga> hey, could I get another review for https://github.com/juju/testing/pull/31 ? It's a small PR in the testing package that adds a missing windows script, as well as new failing functionality for testing
<dimitern> jam, you've got a reviw
<dimitern> review even
<jam> dimitern: thanks
<mattyw> perrito666, ping?
<perrito666> mattyw: pong
<mattyw> perrito666, good morning, you did a fix for this? https://bugs.launchpad.net/juju-core/+bug/1363079
<mup> Bug #1363079: userManagerSuite.TestUserInfoUserExists fails <test-failure> <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1363079>
<mattyw> perrito666, the reason I ask is that a number of the test failures in our google doc look to be the same error as was reported by that bug
<mattyw> perrito666, I'm trying to work out if we can consider the other tests "fixed" or at least keep an eye on them for probably/ maybe being fixed
<perrito666> mattyw: I did not manage to get it solved, I did figure what might be happening to that particular test
<perrito666> mattyw: want to tell me more?
<tasdomas> could I get somebody to take another look at https://github.com/juju/juju/pull/517 ?
<gsamfira> hello folks. Can someone have another look at: https://github.com/juju/testing/pull/31 ?
<jcw4> fwereade: I think you're OCR today? https://github.com/juju/juju/pull/651
<fwereade> jcw4, cheers
<jcw4> fwereade: tx
<bodie_> hello all, three last branches for Actions on the unit ready for final review --
<bodie_> https://github.com/juju/juju/pull/615
<bodie_> https://github.com/juju/juju/pull/415
<bodie_> https://github.com/juju/juju/pull/520
<bodie_> sorry, 617, not 615
<mattyw> perrito666, we'll talk about those errors tomorrow if that's ok?
<thumper> morning
<hazmat> thumper, g'morning mate
<thumper> o/
<perrito666> mm, something is odd here, thumper said good morning and it's not yet night here
<thumper> perrito666: days are getting longer
 * perrito666 calls the ministry of truth to fix that
<lifeless> perrito666: the ratio of daylight to non-daylight is increasing in the southern hemisphere
<perrito666> lifeless: I should know, I live there
<lifeless> perrito666: kk :)
<perrito666> lifeless: we are now on the road to the interesting time where it's 10PM and it's still afternoonish
<lifeless> perrito666: you must be waaaay south for that - mcmurdo?
<perrito666> lifeless: geographic center of argentina, which is quite south.
<perrito666> I don't mind the sun much, the 30°C during the night is what gets me
<lifeless> ah, I should have been able to guess that from the earlier quip ;)
<lifeless> perrito666: 30'is too hot to sleep comfortably for sure
<waigani> thumper: when making a user, what is a valid 'Name' field? Should '@' be allowed at all?
<thumper> waigani: no, it should fit through this hole: var validPart = "[a-zA-Z][a-zA-Z0-9.-]*[a-zA-Z0-9]"
<thumper> waigani: that is the name part from names/user.go
<waigani> thumper: currently state.AddUser checks via names.IsValidUser
<thumper> waigani: right, but valid user is user@provider
<thumper> what I think we want is "names.IsValidUserName"
<thumper> which just checks the valid part
<waigani> thumper: so that will be a new func on names?
<thumper> right
<menn0> thumper: meta-review please: https://github.com/juju/juju/pull/649
<thumper> menn0: kk, shortly
<menn0> thumper: it's a small one so shouldn't take long
<fwereade> thumper, menn0, waigani, mornings :)
<waigani> fwereade: morning :)
<menn0> fwereade: hai!
<waigani> or evening for you
<thumper> fwereade: o/
<fwereade> waigani, details details, fell asleep putting laura to bed, bit confused :)
<fwereade> how's everything?
<waigani> fwereade, menn0: I'm taking a week off before the sprint. All goes well, weekend in London and a few days in Paris.
<fwereade> waigani, oh, lovely
<waigani> so any must see/do please email me :D
<menn0> waigani: sounds good!
<waigani> hopefully Molly will come - it will be our first real holiday :)
<menn0> fwereade: are you around for a bit? I have a few quick questions regarding stuff we did in Dunedin.
<menn0> thumper: another quick meta-review please: https://github.com/juju/names/pull/24
<thumper> omg, this branch is going to be so horrible
<waigani> menn0: happy with regex? https://github.com/juju/names/pull/24/files
 * thumper nods
<davecheney> waigani: "^"+ValidPart+"$"
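[editor's note] A minimal sketch of the anchored check davecheney suggests. The validPart pattern is quoted from thumper above; isValidUserName is a hypothetical stand-in for the proposed names.IsValidUserName, not the actual function from the pull request. Without the ^ and $ anchors, a name such as "bob@local" would pass because the pattern matches the "bob" substring.

```go
package main

import (
	"fmt"
	"regexp"
)

// validPart is the name pattern quoted in the discussion.
const validPart = "[a-zA-Z][a-zA-Z0-9.-]*[a-zA-Z0-9]"

// Anchoring forces the whole string to match, not just a substring.
var validUserName = regexp.MustCompile("^" + validPart + "$")

// isValidUserName is a hypothetical helper that accepts only the bare
// name part, with no "@provider" suffix.
func isValidUserName(name string) bool {
	return validUserName.MatchString(name)
}

func main() {
	for _, name := range []string{"bob", "bob@local", "bob-", "b"} {
		fmt.Printf("%q valid: %v\n", name, isValidUserName(name))
	}
}
```

Note that the pattern as quoted requires at least two characters, so a single-letter name is rejected.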
 * thumper wonders if the compile part of the test run will be done before the standup
<thumper> fan going full speed
<lifeless> thumper: I thought you used go nowdays?
<thumper> lifeless: I do
<thumper> lifeless: the tests are compiled
<lifeless> thumper: and that it had ultra super awesome compile times
<thumper> lifeless: they do...
<perrito666> thumper: renice
<thumper> each package is compiled into its own executable I think
<davecheney> correct
 * thumper afk to collect rachel and have a coffee
<davecheney> waigani: thumper menn0 as discussed https://github.com/juju/juju/pull/653
<waigani> thumper: should state.AddUser be able to take "user@provider" or just "user"?
#juju-dev 2014-09-02
<thumper> waigani: just user IMO
<thumper> waigani: because all users are local
<waigani> thumper: ok
<davecheney> thumper: waigani menn0 https://github.com/juju/juju/pull/653
<davecheney> if you have a chance
<davecheney> this is what we discussed on the call
<menn0> davecheney: sorry, I haven't had a chance to look yet
<davecheney> s'ok
<davecheney> not that urgent, but not that complex either
<menn0> davecheney: done
<davecheney> thanks
<davecheney> re gofmt
<davecheney> that's sort of how it comes
<davecheney> and i'm moving those packages to the top level so that will change the ordering again
<davecheney> menn0: you were right about gofmt
<davecheney> fixed
<davecheney> thumper: the state/api{,server} -> / change is coming next
<thumper> davecheney: kk, awesome
<davecheney> thanks mongo, http://paste.ubuntu.com/8211128/
<thumper> ugh...
 * thumper is starting to really hate the juju/juju/juju package
<davecheney> back in the day j/j/juju was supposed to be "the" way of connecting to state
<davecheney> but now we have
<davecheney> state/api
<davecheney> and agent/
<davecheney> i guess all good things come in threes
 * thumper nods
<thumper> j/j/juju needs to die IMO
<davecheney> i'm the man for that job
<thumper> unfortunately I can't kill the method right now
<thumper> as restore plugin has decided to use it
<thumper> everywhere else is just tests
<thumper> for the method I'm looking at
<thumper> juju.NewAPIState(environ, api.DialOpts{})
<davecheney> hmm
<davecheney> there is OpenAPIState in another package
<davecheney> probably a few of them
<davecheney> and testing/OpenAPIAs
<thumper> colour me unsurprised
<davecheney> I am daves complete lack of surprise
 * thumper needs to find some time to write the LXC talk he is giving in about 4.5 hours
<davecheney> glwt
<davecheney> thumper: http://godoc.org/github.com/davecheney/graphpkg
<davecheney> just updated this a little
<davecheney> so I can see the lay of the land
<davecheney> you can also use godoc to get a graph of the packages
<davecheney> but I don't trust it to be up to date
<davecheney> and dot went to 100%
<davecheney> :(
<davecheney> not a good sign
<davecheney> wallyworld_: I think we have more new flaky test
<davecheney> http://paste.ubuntu.com/8211272/
<wallyworld_> davecheney: known issue, already in the doc
<davecheney> cool
<davecheney> thanks for confirming
<wallyworld_> np
<davecheney> i haven't seen that one before
<wallyworld_> you're lucky
<davecheney> i'm not sure if that means I need to look more, or less
<wallyworld_> sadly we have a few of those common ones
<wallyworld_> but once it's fixed, several breakages will go away
<davecheney> it's an odd one
<davecheney> it's happening in test tear down
<davecheney> btw, I love the open source community
<davecheney> https://github.com/davecheney/gmx/issues/2
<davecheney> such people skills
<wallyworld_> yup, it also can happen elsewhere i think, but teardown is most common
<wallyworld_> lol
<davecheney> i'm a tool
<wallyworld_> davecheney: in Launchpad, the various build recipe APIs were wrong for a while because I couldn't spell
<davecheney> who can't spell gauge
<wallyworld_> i thought it was recipie
<davecheney> wallyworld_: how many i's are there in recipippie ?
<wallyworld_> not enough
<wallyworld_> imo
<davecheney> wallyworld_: you're not alone, i have a hard time remembering that recipe isn't spelled, recepit
<davecheney> fuck
<davecheney> recepit
<wallyworld_> i just can't spell in general
<davecheney> damn it
<davecheney> i really cannot spell that word
<davecheney> or any other that is slightly like it
<davecheney> without the red underscore I am worthless
<wallyworld_> dyslexic much
<wallyworld_> me too
<wallyworld_> axw: well, i have done my friday test fix early this week - stupid mega watcher test race bit me trying to land some code
<axw> wallyworld_: okey dokey,
<davecheney> https://github.com/juju/juju/pull/655
<davecheney> truly enormouse, entirely mechanical change
<axw> wallyworld_: I think I'm going to do tools catalogue & storage in one
<axw> probably not much bigger of a PR in the end
<wallyworld_> whatever is easiest :-)
<wallyworld_> i don't mind big pr's sometimes
<davecheney> wallyworld_: https://github.com/juju/juju/pull/655/files
<axw> I don't think it'll be all that big, we'll see
<wallyworld_> 600 files!!
<davecheney> 600 files changed, +600 lines, -600 lines :(
<wallyworld_> davecheney: so just a package move
<davecheney> sed baby, sed
<davecheney> yup
<davecheney> hopefully uncontentious
<wallyworld_> yup
<davecheney> people i talk to think putting the api at the top level makes sense
<davecheney> i'm +/- on having both api and apiserver at the top level
<davecheney> i remember back in the day there was something like
<davecheney> api/apiserver
<davecheney> or apiserver/api
<davecheney> but that sounds no better
<axw> davecheney: IMO we should move api out of this repo altogether at some point
<davecheney> axw: +1 << 32
<axw> have language-specific API repos
<davecheney> yes, that was part of the reason for moving api/params to apiserver/params
<axw> cool.
<davecheney> i'd be keen to see the api (client) move to its own repo
<davecheney> but I don't think it is urgent
<davecheney> ie, it doesn't justify your time to setup another repo and top level build job
<axw> agreed
<menn0> davecheney: FWIW the gofmt comment was waigani's, not mine
<waigani> prefixed with "super minor point" gofmt -s -d says it should come after
<davecheney> menn0: right
<davecheney> sorry
<davecheney> anyway, it's fixed now
<thumper> (╯°□°)╯︵ ┻━┻
<davecheney> that's the ticket
<davecheney> thumper: waigani the current state of play http://godoc.org/github.com/juju/juju/state?import-graph
<waigani> hey that's really cool
<davecheney> hit hide(all) to hide the std lib packages
<thumper> davecheney: can you generate that but only look at "github.com/juju/juju" ?
<davecheney> that will make it a bit simpler
<davecheney> thumper: do you mean the juju/juju package
<davecheney> or juju/juju/...
<davecheney> thumper: https://github.com/davecheney/juju/compare/juju:master...davecheney:188-state-clone-state-serving-info?expand=1
<davecheney> not ready for review yet
<davecheney> I think I can reduce the delta  a bit more
<davecheney> but we're still going to have three copies of params.StateServingInfo after this change
<davecheney> there are also a lot of things which now depend directly on state
<davecheney> which is not good either
<wwitzel3> thumper: ping
<thumper> wwitzel3: hey
<thumper> whazzup?
<wwitzel3> hey, you have a second for a quick hangout, I have some questions about some juju run stuff I've been working on.
<thumper> sure
<wwitzel3> thumper: I'll go to the Onyx standup one
<thumper> ack
<davecheney> all: juju/mongo is a helper package for setting up and managing mongo servers, right ?
<thumper> fixed it: ┬─┬ ノ( ゜-゜ノ)
<hazmat> so much awesome.. http://vimeo.com/95066828
<hazmat> comedy value and distributed systems
<wwitzel3> thumper: looking at state/apiserver/client/run.go .. the client.Run method, it calls ParallelExecute which calls ssh.ExecuteCommandOnMachine
<thumper> wwitzel3: yeah, that is how the api server calls out to the machines that it needs to call
<thumper> wwitzel3: it calls the 'juju-run' command on the other side
<wwitzel3> thumper: ahh, ok
<wallyworld_> jam: i have an api version question if you're around
<thumper> waigani: hmm...
 * thumper thinks of expedience
<thumper> vs learning
<thumper> bugge
<thumper> r
<thumper> waigani: the issue I'm having is that you have done things exactly as they have been done before
<thumper> the problem is that the way they were done before is kinda crap
<waigani> haha
<waigani> well, let's do it right
<thumper> sure?
<waigani> ugh .. no but your call
 * thumper takes a quick look to determine the amount of work
<waigani> hangout?
<thumper> sure
<thumper> waigani: use the standup one
<axw> hazmat: if you haven't already, you should read his columns too. hilarious and educational
<mattyw> morning all
<voidspace> back and I have internet....
<voidspace> currently provided by a 30metre ethernet cable slung into next door...
<TheMue> morning
<TheMue> voidspace: hehe, welcome back. troubles with provider switch at least almost solved?
<voidspace> well, that's 330mb of updates done
<mattyw> voidspace, have EE given you a better date than 2 weeks?
<dimitern> mattyw, since you're the OCR along with myself, would you mind taking a look at this trivial fix? https://github.com/juju/juju/pull/659
<mattyw> dimitern, looking
<mattyw> dimitern, done
<mattyw> dimitern, and thanks for pointing it out, the branch I'll be landing soon had some loggers I needed to rename
<dimitern> mattyw, cheers!
<dimitern> mattyw, mail sent
<dimitern> jam, can you take a look at https://github.com/juju/juju/pull/659 as well please?
<jam> dimitern: heya. want to fill me in on what you and markS talked about?
<jam> wallyworld_: /wave
<wallyworld_> jam: hi there, just otp, will ping soon
<jam> dimitern: so on https://github.com/juju/juju/pull/659 I'm wondering if we want to change the logging path, since if anyone was filtering they'd have to change their scripts
<jam> dimitern: review
<jam> reviewd
<dimitern> jam, you mean the prefixes?
<jam> dimitern: the logging prefixes, yes
<jam> but it seems better in the new place anyway
<dimitern> jam, yeah, and I've sent a mail to juju-dev, just in case someone complains, we can deal with it
<dimitern> jam, thanks, I'll merge it then
<hazmat> axw, those are precious.. thanks "A systems programmer will know what to do when society breaks down, because the systems programmer already lives in a world without law."
<hazmat> although the nosql bane line in the presentation takes the cake for me. bane voice. "let your reads and writes choose their own destiny." :-)
<hazmat> jam, did you have a chance to try the rudder charm or look at the readme.
<jam> hazmat: I looked through the readme, but I haven't played with it
<jam> its essentially just creating a tunneling network just like you would get installing openvswitch etc on each node
<axw> hazmat: :)
<hazmat> jam, in a nutshell, but its all coordinating the subnet split to host map without central point of failure and it works in environments that are gre hostile (azure).
<hazmat>  and you don't really need openvswitch to do the point to point gre tunneling.
<menn0> fwereade: the UpgradeInfo work is finally up for review. https://github.com/juju/juju/pull/660
<jam> dimitern: standup?
<dimitern> jam, omw, sorry
<perrito666> morning
<fwereade> menn0, thanks, there is a small chance I will get to it today...
<menn0> fwereade: it probably needs someone else to have a look as well given that we're the authors
<menn0> fwereade: but it would be good to get your feedback on the change I made since you last looked at it
 * perrito666 discovers new levels of heartburn
<wallyworld_> jam: hi, got time for a question?
<jam> wallyworld_: so I'm on the phone now myself, I'm happy to chat in IRC a bit
<wallyworld_> sure
<mattyw> perrito666, morning
<perrito666> mattyw: morning
<jam> wallyworld_: so what's your question?
<mattyw> perrito666, I managed to get that auth fails bug again this morning - but I wasn't looking for it when it happens :|
<davecheney> dimitern: thanks for fixing my fuckup
<davecheney> it never occurred to me about the logger names
<jam1> voidspace: http://paste.ubuntu.com/8214743/
<davecheney> please consider my suggestion in reply
<davecheney> i think we can make the default case of the logger less of a footgun
<jam> davecheney: what is the call stack for top level var inits?
<mattyw> mgz, am I being hated again? https://github.com/juju/juju/pull/562
<perrito666> mattyw: I presume you cannot make it happen willingly right?
<davecheney> jam it will have the package name in it
<davecheney> all package level vars are initialised by a faux init() function
<davecheney> basically one per file
<davecheney> that is created by the compiler for you
<davecheney> hang on
<davecheney> i'll prove it
<mattyw> perrito666, well I've tried running just a subset of the tests that should show the error and I've got through hundreds of runs with no error. But I've seen it a couple of times during a make check
<wallyworld_> jam: so, i have a client api call that i can add params to. when run against older juju, the new params will be ignored. but i'd rather have the client detect what version of the api is running server side and warn the user. it's the EnsureAvailability() api call on client. can we up that one api call to version 2? or does it have to be done at the facade level?
<perrito666> mattyw: I have the theory that under a decent load on the testing machine you should be able to trigger it
<davecheney> jam: http://paste.ubuntu.com/8214777/
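[editor's note] The contents of the paste are not reproduced in this log. As a separate, minimal sketch of the point being made: a package-level variable initializer runs inside compiler-generated initialization code, and the calling frame's function name carries the package path. The helpers callerFunc and packageOf are hypothetical and only illustrate how a logger name could, in principle, be derived from the caller rather than hard-coded.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// callerFunc reports the fully qualified name of the function that called it.
func callerFunc() string {
	pcs := make([]uintptr, 1)
	// Skip runtime.Callers itself and callerFunc.
	if runtime.Callers(2, pcs) == 0 {
		return ""
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	return frame.Function
}

// initializedBy is set during package initialization, before main runs.
var initializedBy = callerFunc()

// packageOf strips the function part of a qualified name, leaving the
// package path.
func packageOf(funcName string) string {
	slash := strings.LastIndex(funcName, "/")
	dot := strings.Index(funcName[slash+1:], ".")
	if dot < 0 {
		return funcName
	}
	return funcName[:slash+1+dot]
}

func main() {
	fmt.Println("initializer ran in:", initializedBy)
	fmt.Println("package:", packageOf(initializedBy))
}
```

The exact function name reported for the initializer is an implementation detail of the toolchain, so the sketch prints it rather than relying on a particular value.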
<dimitern> davecheney, no worries :)
<jam> wallyworld_: api versioning is at the Facade level, and I would like us to bump the version if we change anything, even if it is "technically compatible"
<wallyworld_> ok, sad that the client facade is sooooo big
<mfoord> jam: back on freenode - rejoining hangout
<mfoord> jam: calling Remove on a replicaset is likely to cause primary renegotiation
<mfoord> jam: (I would expect)
<hazmat> davecheney, you heading out early on the sprint to hit go eu?
<mfoord> jam: which is why it would be slow
<jam> wallyworld_: agreed. would it be easier to move EnsureAvailability to a locally clustered API and version *that* one ?
<davecheney> hazmat: negative
<wallyworld_> jam: if the version gets bumped to V2, how will a caller know what methods may be incompatible or not? they won't.
<davecheney> hazmat: people couldn't get their shit sorted in time
<davecheney> so I couldn't change my flights
<jam> wallyworld_: they know only about the ones that they are using. So we will still expose v0 for clients that don't know about v2
<davecheney> i'm going to dotGo only
<jam> wallyworld_: but no, we don't have WADL sort of thing so that you can programmatically say "oh, v2 is exactly like v0 except for here"
<hazmat> davecheney, ugh. bummer. i was thinking of coming in for the day from brussels just to check it out.
<davecheney> hazmat: i wouldn't say no
<wallyworld_> jam: i could introduce a new facade. it makes sense that all apis on a facade should be bumped together
<davecheney> i'm going to be in paris from the 9th til the 12th
<jam> wallyworld_: you've already been part of the great Client breakup
<jam> Keyserver et al
<wallyworld_> jam: yes indeed, i was going to use the same technique
<jam> wallyworld_: just to mention, that all new facades should start at V1, as v0 is intended to be the version that was in 1.18
<wallyworld_> jam: is there sample code that shows how to set up a new api version? point me where to look?
<wallyworld_> i guess there's a version number registered somewhere
<jam> wallyworld_: so just creating a new one is easy
<jam> the line of Register()
<jam> takes a 1 instead of a 0
<wallyworld_> ok, ta
<jam> wallyworld_: the client side code needs to check what the supported version of the API is, and then fall back to the Client facade if it can't get to the new facade
<wallyworld_> how does it do that? in this case it doesn't matter as the new facade will be there or not, but in general?
<jam1> wallyworld_: Facade.BestAPIVersion()
<wallyworld_> thanks
<voidspace> jam: I think I'm back again, lost the hangout though *sigh* - really sorry
<jam> :)
<jam> TheMue: can you help wallyworld_ find the right API docs to figure out how to properly introduce a new Facade version?
<wallyworld_> jam: TheMue: in standup, will check back soon
<jam> I feel like we've documented all this, but I'm not finding the docs in our source tere.
<jam> tree.
<jam> wallyworld_: so there is a bit of "must be careful" here, I believe, which maybe we can make better. But for a Facade that didn't exist in 1.18, then "BestAPIVersion()" should return 0, which we "know" didn't exist, so your client code can tell that it should use the other facade.
<jam> It is also possible that the new client layer could do the transition without the cmd/juju stuff caring.
<jam> so we would have api/statemanagement/statemanagement.go
<jam> and that would embed a FacadeCaller for your new StateManagement facade
<jam> and it could have EnsureAvailability() on it
<jam> which internally would check
<jam> if self.BestAPIVersion() == 0 { // switch to old interface:
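[editor's note] A minimal sketch of the fallback jam outlines, under stated assumptions. BestAPIVersion, EnsureAvailability and the StateManagement facade name come from the discussion; the facadeCaller interface, the fixedVersion stub and the string return value are hypothetical stand-ins, not juju's real client types. The real method would issue an API call where this sketch only names the facade it would use.

```go
package main

import "fmt"

// facadeCaller is a hypothetical stand-in for the embedded facade caller.
type facadeCaller interface {
	BestAPIVersion() int
}

// fixedVersion is a stub that reports a fixed best version, for illustration.
type fixedVersion int

func (v fixedVersion) BestAPIVersion() int { return int(v) }

// stateManagementClient sketches the proposed client-side wrapper.
type stateManagementClient struct {
	facade facadeCaller
}

// EnsureAvailability returns the name of the facade method it would call.
func (c stateManagementClient) EnsureAvailability() string {
	if c.facade.BestAPIVersion() == 0 {
		// New facades start at version 1, so 0 means the server does not
		// know this facade: switch to the old interface.
		return "Client.EnsureAvailability"
	}
	return "StateManagement.EnsureAvailability"
}

func main() {
	oldServer := stateManagementClient{facade: fixedVersion(0)}
	newServer := stateManagementClient{facade: fixedVersion(1)}
	fmt.Println(oldServer.EnsureAvailability())
	fmt.Println(newServer.EnsureAvailability())
}
```

Keeping the check inside the client layer means cmd/juju does not need to know which facade served the request.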
<TheMue> wallyworld_: back from lunch, ping me when available too
<mfoord> jam: I don't have the wireless router connected at all now, and my connection is via a switch
<jam> but you're still only either or?
<jam> mfoord: either IRC or Hangout, but never both... :)
<voidspace> jam: it still looks like my connection is being broken every minute
<voidspace> jam: I've emailed you
<voidspace> jam: I can browse and use email fine but both freenode and hangouts hate me
<voidspace> jam: I *have* to get IRC working as a very minimum
<jam> voidspace: yeah
<jam> I emailed you
<jam> good luck sorting it out
<jam> I'm close to EOD anyway
<jam> voidspace: it seems to be calling c.stopNow() which is fundamentally calling runtime.Goexit()
<jam> which certainly seems like it should be exiting ungracefully
<jam> voidspace: ah, goexit is just this goroutine
<voidspace> jam: looking at the traceback
<jam> eg, kill this thread, not kill this process
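[editor's note] A minimal sketch of the behaviour jam describes: runtime.Goexit ends only the calling goroutine, after running its deferred calls, and the rest of the process carries on. The function name runAndExit is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// runAndExit starts a goroutine that calls runtime.Goexit and reports
// whether its deferred function ran and whether code after Goexit ran.
func runAndExit() (deferredRan, reachedAfter bool) {
	done := make(chan struct{})
	go func() {
		defer close(done) // runs last, signalling the caller
		defer func() { deferredRan = true }()
		runtime.Goexit()
		reachedAfter = true // never executed
	}()
	<-done
	return deferredRan, reachedAfter
}

func main() {
	deferredRan, reachedAfter := runAndExit()
	fmt.Println("deferred ran:", deferredRan)
	fmt.Println("code after Goexit ran:", reachedAfter)
	fmt.Println("main goroutine is still running")
}
```

This is why a deferred cleanup such as Remove still executes when a test assertion stops the test goroutine.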
<voidspace> jam: so Remove didn't used to be an attemptLoop - but that was unreliable too because Remove *needs*  to be an attemptLoop (for the same reason that the other operations do)
<voidspace> jam: we probably don't want Remove to be in a defer
<jam> voidspace: it doesn't help the traceback that we defer a func() that calls a function that takes a func()
<voidspace> right...
<voidspace> jam: we could just not defer those removes and do them at the end, that would simplify
<voidspace> jam: that function isn't used outside of those two tests I don't think - so it shouldn't matter (?) if a test fails and we don't call remove
<jam> voidspace: I wonder if we want to Remove at all
<voidspace> we'll clean up the replicaset anyway
<jam> I would think changing the test to just nuke everything would be cleaner
<voidspace> jam: well, it is "assertAddRemoveSet"
<jam> voidspace: so *those* Remove calls are just "cleanup what I set up"
<jam> because they are in defer
<jam> there is a call on line 241
<voidspace> they are, but the test name implies we want to test that Remove works
<jam> that is the actual testing of Remove working
<voidspace> ah
<wallyworld_> TheMue: hi. i can read the code to see how to bump up an api version, or are there docs i can read?
<jam> that may also be the flakiness
<jam> voidspace: if we get an error, and the replicaset is in a bad state
<jam> and then we hammer it with "remove them one by one"
<jam> while I'm destroying some of them
<jam> I imagine it isn't happy with that
<voidspace> the tests setup a new replicaset every time - so perhaps we don't need those deferred removes at all
<voidspace> as we call Destroy immediately after
<jam> voidspace: The only reason we *might* is if we wanted to not have to set up the Root to be shared between tests or something like that
<jam> but I have the feeling, we're better off *for the replicaset tests* to start and return to scratch each time
<jam> I know we share the DB between tests when we can, just calling Drop Databases in TestTearDown
<jam> but I don't think we need to do that here.
<TheMue> wallyworld_: do you know jams doc on Google?
<voidspace> jam: TearDownTest calls root.Destroy()
<voidspace> jam: and in the IPV6 suite we explicitly call Destroy ourselves
<wallyworld_> TheMue: i don't think so. i may have got an email at one point but can't recall now
<jam> TheMue: probably not. as the goal was to get them somewhere in the source tree/on juju.ubuntu.com and I think those never got finished
<voidspace> jam: so I don't think that's a concern
<jam> voidspace: right, so that is just the root, but yeah, we should just be clearer that we call Instance.Destroy() on everything and not try to Remove them down the stack
<jam> because cleaning up a Mongo Cluster by removing one, killing it, removing one, killing it
<jam> is likely to just cause us more pain than we need to be testing.
<TheMue> jam, wallyworld_: ah, ok. a part of it is here: https://github.com/TheMue/juju/blob/api-implementation-guide/doc/design/juju-api-implementation-guide.md
<voidspace> jam: we also defer the Destroy on the members
<wallyworld_> TheMue: thanks, will read that
<TheMue> jam: I only wait to get my current experiences with new facades into it too
<TheMue> wallyworld_: I'm currently writing some code helping to test the client side when the server side not yet has the newest version
<wallyworld_> ok, sounds good
<jam> TheMue: sure, so just share it as we can, and we should get it published soon. better to have something for reference.
<TheMue> jam: yep
<jam> even if it isn't 100% complete, as long as it isn't *incorrect*
<wallyworld_> jam: i need to be able to run ensure-availability with a placement directive to have a specified machine become a state server, not just any old new one. the guys are happy to pass in either a machine number or maas name (like in --to for add-unit). This will require either converting a HostUnits machine to a ManageEnvirons machine, since add-machine creates HostUnits machine, or 2. adding an option to add-machine to create a
<wallyworld_> ManageEnvirons machine which does not participate until activated. pros and cons to both. maybe there's a 3rd option.  i think 1 may be most flexible. update machine jobs in state; a watcher in machine agent notices, updates agent conf to record the manageenviron job, then restarts machine agent. but i'm not fully across all of the HA mechanics so i'm not sure what else needs to be done. do you have any thoughts or should i ask nate
<wallyworld_> or william?
<jam> wallyworld_: what has never been clear from the MaaS guys is why they need to add-machine first
<jam> I realize they want to control locations
<wallyworld_> jam: this is landscape guys
<jam> and maybe there is a problem that they need to specify 2 machines
<jam> wallyworld_: sure, but its on MaaS
<wallyworld_> yes, >1 will be needed
<wallyworld_> --to foo,bar
<katco> wallyworld_: hey regarding https://github.com/juju/juju/pull/630/files#r16968425
<wallyworld_> they want this machine right here to be the state server, not any other one
<jam> voidspace: lost you again ?
<katco> wallyworld_: will the updated config be persisted to environments.yaml making the change permanent?
<jam> wallyworld_: sure, I'm just wondering why we need to "add-machine" first
<jam> can't we just "juju ensure-availability --??? maas-name:A,maas-name:B"
<wallyworld_> katco: it will get written to jenv, becoming permanent
<katco> wallyworld_: ahh ok. ty
<wallyworld_> jam: we could i guess, but that will be a bit of work, since we won't have run cloud init to set stuff up
<jam> wallyworld_: ?
<jam> wallyworld_: the point is that we can have ensure-availability still allocating machines, just give them the knob to say what machine
<wallyworld_> add-machine boots a machine and runs cloud init to set up the tools, agent config etc. if we don't do that and ensure-availability to an arbitrary machine, that machine still has to then be set up to run juju
<jam> wallyworld_: presumably they had a knob available on "add-machine" so that they could decide what that exact machine was
<jam> and couldn't we supply it on ensure-availability?
<wallyworld_> ah, i think the add-machine knob was to specify ssh:
<wallyworld_> that would take an existing machine and put juju on it
<wallyworld_> so i guess if we did juju ensure-availability --to maas-name:A,maas-name:B  then that would have to ssh in and set up juju
<jam> wallyworld_: the idea is that they are manually provisioning with ssh?
<jam> "juju add-machine ssh:foo@bar" /
<jam> I thought they were just doing "juju add-machine" but with maas-name
<jam> to pick an explicit machine to proviison
<jam> provision
<wallyworld_> they want to use whatever they use with add-unit
<wallyworld_> add-unit --to blah
<jam> wallyworld_: add-unit or add-machine ?
<wallyworld_> same syntax as for add-unit
<jam> wallyworld_: AFAIK you can't add-unit to a manually provisioned machine
<wallyworld_> so a machine number or maas name, right?
<jam> wallyworld_: right, but that just requests a machine from the Provisioner, which lets us set up everything in cloud-init.
<jam> wallyworld_: now... I *hope* all of the EnsureMongoServer stuff has gotten done correctly so we can create a state server late
<jam> rather than only at cloud-init time.
<jam> So I'd be *happy* if "juju ensure-availability --to 1" worked correctly. But For what they're doing what they need is
<jam> "juju ensure-availability --to maas-name:A"
<wallyworld_> how does maas-name placement directive work then? does that assume a virgin maas machine?
<jam> not "juju ensure-availability --to 1" (I think)
<jam> wallyworld_: like tags
<jam> wallyworld_: each machine just has a unique name
<wallyworld_> with tags, that assumes that a maas machine is not running, but tagged in maas, and so is then booted with cloud init etc?
<jam> wallyworld_: right, same with maas-name
<jam> maas-name is a Provisioning constraint (placement directive)
<jam> give me a machine that fits this description
<jam> which happens to uniquely address one machine ever
<wallyworld_> ok, so "juju ensure-availability --to maas-name:A" would be like add-machine but setting it up to run a state server etc
<jam> wallyworld_: so... I'd like "juju ensure-availability --to 1,2" to work. But I'd really like to understand what Landscape is trying to do, as I feel that them doing "juju add-machine" first is actually bad practice
<wallyworld_> rather than host units
<jam> wallyworld_: (I think everything still gets host units, but it would also get JobManageEnviron)
<wallyworld_> yep
<wallyworld_> i think we can tell them what best practice is
<wallyworld_> i think not using add-machine is the right way to go
<jam> wallyworld_: I certainly understand their desire to say "ensure-availability and put it exactly here"
<jam> And we want to definitely support that case.
<jam> And ideally we could turn a machine into a proper state server late.
<jam> You'd have to try it/check with nate to see if it might work today
<jam> because we *shouldn't* be setting all that stuff in cloud-init now
<jam> but doing it as part of EnsureMongoServer
<mattyw> mgz, ping?
<wallyworld_> that sounds right
 * fwereade has a doctor's appointment and may be gone for the rest of the afternoon, but will be back in the evening sometime
<wallyworld_> fwereade: just ask him to warm his hands first
<wallyworld_> and use gloves
<mattyw> wallyworld_, are you able to help solve problems with the landing bot?
<mattyw> wallyworld_, my branch isn't being picked up https://github.com/juju/juju/pull/562
<wallyworld_> mattyw: i may not have login credentials to jenkins handy, i'll have a look
<mattyw> wallyworld_, thanks
<voidspace> jam: ping
<jam> voidspace: /
<voidspace> I'm here for now...
<jam> voidspace: sure, I'm just in the next meeting now
<voidspace> jam: ah, ok
<voidspace> I'm getting 12mbps downstream but can't connect to google hangouts
<jam> voidspace: I'm usually EOD now, but I have a meeting with markr today
<voidspace> jam: I'll create a PR with those extra Removes removed
<voidspace> and look at GetStatus
<jam> sounds good, just test it a bit as well
<voidspace> ok
<wallyworld_> mattyw: i can't seem to ssh in to the slave to look, you'll have to ask mgz sorry
<jam> voidspace: well maybe if you stopped streaming 12mbps of youtube videos your google hangouts would  be more stable :)
<hazmat> does api server returning non json results ring a bell to anyone.. (1.20.5-1.20.6) https://bugs.launchpad.net/juju-deployer/+bug/1364375
<mup> Bug #1364375: TypeError: expected string or buffer <oil> <juju-deployer:New> <https://launchpad.net/bugs/1364375>
<jam> wwitzel3: ping about cloud-sigma reviews
<jam> I know you mentioned you've been doing some, but I'm wondering if I'm missing the comments somewhere
<mattyw> sinzui, ping?
<sinzui> hi mattyw
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1348477
<wwitzel3> jam: no, you aren't missing them, I've only got my comments published for two of the PRs. I will get them published for #173 and #172 today.
<jam> wwitzel3: how are you managing to delay publishing them, writing them somewhere else?
<wwitzel3> jam: yeah, I've been doing most of the review locally since i'm trying to use the amazon and azure provider as examples.
<wwitzel3> jam: then I just take them and add them all to github at the same time
<ericsnow> natefinch, perrito666, wwitzel3: standup?
<wwitzel3> jam: I'm still feeling like I'm not reviewing this correctly though. I feel like there is some bigger picture I am supposed to be aware of, but I'm just reviewing usage and ensuring common idioms.
<wwitzel3> jam: yeah, my notes for 172 are done, I will publish those right after standup and you can take a look.
<perrito666> here
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1348477 1364410
<natefinch> ericsnow, wwitzel3, perrito666: trying to join... google is defying me
<natefinch> ericsnow, wwitzel3, perrito666: chrome has been a bitch since it updated this morning.
<perrito666> natefinch: different browser?
<cmars> sinzui, i'm going to take a look at LP: #1348477, since I'm already set up with access to a power machine
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <ppc64el> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<cmars> would latest master branch be a good place to start?
<sinzui> thank you cmars... I think mattyw saw it
<mattyw> sinzui, we're going to work together
<mattyw> sinzui, for moral support
<dimitern> mattyw, perrito666, natefinch, what's that windows PR that needs reviewing?
<mattyw> dimitern, which one?
<dimitern> mattyw, something about logging? not sure..
<dimitern> mattyw, alexisb asked me to take a look as OCR
<mattyw> dimitern, #652?
<dimitern> mattyw, could be - looking
<wwitzel3> jam: also I just realized that my client review ended up on the environinstance review. Guess that is a problem with doing them offline and then adding them.
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1348477 1364410 1364438
<voidspace> wwitzel3: ping
<wwitzel3> voidspace: pong
<katco> evilnick: i understand you're the guy to talk to regarding juju documentation! please see https://github.com/juju/docs/pull/155 :)
<voidspace> No wonder this test was passing every time no matter what I did to it
<voidspace> "OK: 0 passed"
<voidspace> gocheck takes a regex pattern not a glob...
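The regex-versus-glob point above can be illustrated with a small Go sketch. It does not use gocheck itself; it only compares how the same filter string behaves under the standard `regexp` and `path` packages. The suite and test names (`MySuite.TestPing`) and the helper names are hypothetical, chosen for illustration only.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// matchesAsRegex treats pattern as an unanchored regular expression.
// An invalid expression is reported as "no match".
func matchesAsRegex(pattern, name string) bool {
	ok, err := regexp.MatchString(pattern, name)
	return err == nil && ok
}

// matchesAsGlob treats pattern as a shell-style glob.
func matchesAsGlob(pattern, name string) bool {
	ok, err := path.Match(pattern, name)
	return err == nil && ok
}

func main() {
	name := "MySuite.TestPing" // hypothetical test name
	patterns := []string{
		"*.TestPing",       // valid glob, invalid regex (leading '*')
		`.*\.TestPing`,     // the regex spelling of the same intent
		`MySuite\.Test.*`,  // regex selecting every test in the suite
	}
	for _, p := range patterns {
		fmt.Printf("%-18q regex=%v glob=%v\n",
			p, matchesAsRegex(p, name), matchesAsGlob(p, name))
	}
}
```

A filter written as a glob can silently select nothing when it is interpreted as a regular expression, which is consistent with a run that reports zero tests passed.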
<evilnick> katco, okay! thanks!
<evilnick> katco, that all sounds very terrifying, thanks
<katco> evilnick: lol
<katco> evilnick: the evil you know... etc. :)
<evilnick> katco, *I* am the evil I know. But yeah. I might have to put some sort of warning for those of a nervous disposition on those docs
<katco> evilnick: https://github.com/juju/juju/pull/630/files#diff-7f2ac2b013ef527684140a73c9773b54R110
<evilnick> katco :)
<katco> evilnick: halloween isn't far away either. we could do a special on juju and "how the reaper harvests" ;)
<mattyw> perrito666, ping?
<perrito666> mattyw: pong
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1348477 1364410 1364438 1359837
<natefinch> ericsnow: can we move our 1:1 back another half hour?  I just got back and need to do a few things first
<ericsnow> natefinch: sure
<mattyw> wwitzel3, ping?
<wwitzel3> mattyw: pong
<mattyw> wwitzel3, I'm not sure how much you know about the presence watcher. I want to know if there's a reason why it uses mgo.Collection rather than taking a session and opening/closing it as needed
<mattyw> wwitzel3, actually - I just noticed it does session.copy
<wwitzel3> mattyw: ok, I was going to say no I don't know why, so my response wasn't going to be very helpful
<mattyw> wwitzel3, ok no problem
<wwitzel3> mattyw: yeah, that ping method copies the session, not sure why specifically
<wwitzel3> mattyw: actually looks like the prepare method does as well, maybe others
<mattyw> wwitzel3, looks like a few of them
<wwitzel3> mattyw: yeah looks like most of them do
<mattyw> wwitzel3, I'm looking into https://bugs.launchpad.net/juju-core/+bug/1348477 with cmars and it appears that it's the presence watcher that is returning the error in some cases
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <ppc64el> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<wwitzel3> mattyw: ahh, interesting
<ericsnow> natefinch: ready?
<natefinch> ericsnow: yep
<evilnick> katco, well, I think we have the trick sorted. not sure about the treat.
<cmars> i'm having a gccgo toolchain issue on ppc64el and opened a bug, https://bugs.launchpad.net/ubuntu/+source/gccgo-go/+bug/1364562
<mup> Bug #1364562: Intermittent go test crashes on ppc64el <gccgo-go (Ubuntu):New> <https://launchpad.net/bugs/1364562>
<cmars> anyone experienced this? where should I look for support (if any is available)?
<alexisb> cmars, I always start with dave cheney
<alexisb> cmars, do you know if this is a known gcc-go toolchain issue?
<alexisb> there are a few out there we seem to always hit
<cmars> alexisb, will try to find out
<alexisb> cmars, this one seems to come up a lot:
<alexisb> https://bugs.launchpad.net/ubuntu/+source/gccgo-4.9/+bug/1362906
<mup> Bug #1362906:  internal compiler error: in comparison, at go/gofrontend/expressions.cc:6508 <gcc-4.9 (Ubuntu):Fix Released> <gccgo-4.9 (Ubuntu):Invalid> <gcc-4.9 (Ubuntu
<mup> Trusty):Invalid> <gccgo-4.9 (Ubuntu Trusty):Confirmed> <gcc-4.9 (Ubuntu Utopic):Fix Released> <gccgo-4.9 (Ubuntu Utopic):Invalid> <https://launchpad.net/bugs/1362906>
<cmars> alexisb, yep, that one bit us last week. this one's weird, its as if the 'go test' command itself has the issue
<alexisb> ok
<cmars> davecheney, good morning, I have a gccgo question for you
<cmars> i keep seeing these nil pointer refs and segfaults on ppc64el, https://bugs.launchpad.net/ubuntu/+source/gccgo-go/+bug/1364562
<mup> Bug #1364562: Intermittent go test crashes on ppc64el <gccgo-go (Ubuntu):New> <https://launchpad.net/bugs/1364562>
<cmars> known issue? any advice?
<davecheney> cmars: sorry mate
<davecheney> had to close that bug
<davecheney> i dunno where you got that compiler from but it's too old
<cmars> davecheney, what should I use then?
<davecheney> cmars: can I ask a different question
<davecheney> how did you hit this bug ?
<cmars> mattyw and I were pairing on reviews, and it was raised by QA
<cmars> we were looking at https://bugs.launchpad.net/juju-core/+bug/1348477
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <ppc64el> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<cmars> so i apt-get install gccgo-go, rsync over a snapshot of master (because no github access in the lab) and then go test is falling all over the place
<davecheney> cmars: yes that is because the compiler fix has not landed in main
<cmars> ack
<davecheney> i wish I could explain why
<davecheney> just steer clear of gccgo bugs
<davecheney> they need special care and attention to repro
<davecheney> that bug isn't ppc64 related
<davecheney> take the ppc tag off it
<davecheney> yup, all those failures are the "known failures" that wallyworld_ is tracing
<davecheney> tracking
<cmars> ah, oh
<cmars> ok
<hazmat> gsamfira, ping
<gsamfira> hazmat: pong
<hazmat> gsamfira, sweet
<hazmat> gsamfira, i was wondering about maybe getting juju to work on debian.. and figured you're the right person to ask ;-)
<davecheney> cmars: thanks for fixing the tag on that bug
<davecheney> as you saw, it's one of our known failures, which ian logged on the 25th of july
<gsamfira> debian should be easy :). All you need to fix up is the userdata :)
<gsamfira> the rest is almost identical
<davecheney> i can't speak for sinzui but I think there is a general agreement that those bugs are not build blockers
<davecheney> for whatever value of living with broken windows you choose
<gsamfira> hazmat: namely, making sure that you add proper repos before doing the apt-get install bits
<hazmat> gsamfira, juju has a few hardcodes around the distro series detections bits
<gsamfira> hazmat: and of course creating the repos for the bits you need :)
<hazmat> gsamfira, nothing really needed re repos afaics, package names are all the same
<gsamfira> hazmat: yes. Correct. But that should be easy to code. If it was easy for windows, debian should be 20 mins work :). Look in version/osversion.go
<hazmat> gsamfira, i thought you might have already done the basics around centos
<hazmat> windows is different enough that it's a separate path through a lot of the base, whereas linux variants remove distro assumptions
<gsamfira> hazmat: not yet. Merging the windows support took longer than expected.
<gsamfira> hazmat: off the top of my head, for os detection: https://github.com/juju/juju/blob/master/version/supportedseries.go#L28
<gsamfira> osversion_linux.go should work for debian as well
<hazmat> gsamfira, aha.. yeah. thats what i wanted
<hazmat> thanks
 * hazmat hacks
<gsamfira> hazmat: you are welcome. I remember juju having to add a few repos on older versions of ubuntu to get up to date packages
<sinzui> davecheney, which bugs?
<hazmat> gsamfira, yeah.. only on precise.. for mongo
<gsamfira> also, for MaaS you need to enable debian as a supported OS
<davecheney> sinzui: the list that wallyworld_ is curating
<gsamfira> hazmat: talk to Blake Rouse, or Andreas about any plans to do so
<hazmat> gsamfira, yeah.. they have user uploaded images in maas now.. its basically a masquerade there
<davecheney> sinzui: https://docs.google.com/a/canonical.com/document/d/1k6o9yBzlJeRdaSuH3ZgPcdhL7SzhS2Ws7wDKSN99fAU/edit
<hazmat> well maas 1.7
<davecheney> look ma, not on LP :)
<hazmat> gsamfira, targeting gce atm though not maas, but yeah.. i had this discussion with them as well ;-)
<gsamfira> hazmat: awesome :D
<sinzui> davecheney, right, they are not release blockers. I am happy to defer them to a future release to deliver goodness to people sooner
<gsamfira> hazmat: debian should be quick and easy to add
<gsamfira> hazmat: centos will require a bit of abstraction on package management, and an updated cloud-init package.
<gsamfira> hazmat: they still have an outdated package from the paleolithic
<sinzui> davecheney, But I am happy to discuss the essential definition of 1.21.0. If stakeholders won't use it because they want something on that list... I think we will require the issue to be resolved
<hazmat> gsamfira, i'm trying with manual provider to get started
<hazmat> skip the cloud init bits..
<davecheney> sinzui: cool
<gsamfira> hazmat: https://github.com/juju/juju/blob/master/version/supportedseries.go#L17 and https://github.com/juju/juju/blob/master/version/supportedseries.go#L75 might interest you as well
<davecheney> i don't want to be part of that discussion
<davecheney> only trying to help cmars
<gsamfira> happy hacking :D
<gsamfira> hazmat: looking forward to seeing debian work :D
<hazmat> interesting...
<menn0> _thumper_, waigani: morning
<hazmat> juju upgrade juju -> error message can't upload charm http://pastebin.ubuntu.com/8218255/ silly
<hazmat> argh.. cpu-checker
<hazmat> so not needed.. it's like two lines of shell script to check for kvm support in /proc/cpuinfo output
<hazmat> its like
<alexisb> hazmat, ping
<mbruzek> Hello juju-dev I am having a tools problem with 1.20.6 and I was hoping someone could have a look at the bug https://bugs.launchpad.net/juju-core/+bug/1364631
<mup> Bug #1364631: juju fails to find matching tools <juju-core:New> <https://launchpad.net/bugs/1364631>
<mbruzek> It may _look_ like a dupe of https://bugs.launchpad.net/juju-core/+bug/1309805 but I do not believe it is.
<mup> Bug #1309805: LXC / Local provider machines do not boot without default-series <config> <local-provider> <lxc> <juju-core:Fix Released by jose> <https://launchpad.net/bugs/1309805>
<sinzui> mbruzek, I doubt the issue relates to the bug. What is your tools-metadata-url set to? Are you on a private network?
<sinzui> mbruzek, There was a subtle change to the rules for locating tools in 1.20.6. The command line didn't change to prevent regressions, but the rules certainly are different
<mbruzek> sinzui, I have no tools-metadata-url set in environments.yaml for local and not on a private network this is my laptop
<sinzui> mbruzek, very interesting, do you see the trusty precise tools being uploaded when you first bootstrap?
<mbruzek> sinzui, I do I also have a pastebin of the bootstrap with --debug
<mbruzek> http://paste.ubuntu.com/8218239/
<sinzui> mbruzek, thanks I was going to ask about that. So bootstrap confirms the tools were uploaded and the metadata was generated
<sinzui> mbruzek, is it the deployments that didn't find tools?
<mbruzek> sinzui, it looks to me like the tools are found and downloaded, but never found by the machines.  I can not deploy cs:precise/ubuntu
<sinzui> mbruzek, We need the cloud-init-output.log from machine-1
<mbruzek> sinzui, sure, looking for that now...
<sinzui> mbruzek, I did this today but lost the command to copy that file from the container to $HOME to read it before it is destroyed
<mbruzek> /home/mbruzek/.juju/local/cloud-init-output.log
<mbruzek> Is that the one you are looking for?
<sinzui> no
<sinzui> mbruzek, we need to log from the actual container that failed, not your localhost (state-server)
<sinzui> mbruzek, I did something like this today after I saw the status of a container say there was an error
<sinzui> sudo cp /var/lib/lxc/jenkins-local-precise-utopic-amd64-machine-1/rootfs/var/log/cloud-init-output.log ./
<mbruzek> sinzui, I can not juju ssh to those machines to get it.
<sinzui> mbruzek, you can always ssh to the machines juju created even when jujud isn't installed. juju provisions with your keys so you just ssh to the ip address
<mbruzek> sinzui, The machines do not have an IP address they are in a pending state
<sinzui> mbruzek, sudo lxc-ls --fancy will always show the truth
<sinzui> mbruzek, juju often claims it cannot create machines/containers because the jujud didn't report...but lots of things happen before jujud is downloaded as a tool
<mbruzek> http://pastebin.ubuntu.com/8218539/
<sinzui> mbruzek, that isn't a tools error. you cannot get a tools error without a machine. what is juju status showing?
<mbruzek> sinzui, http://pastebin.ubuntu.com/8218548/
<mbruzek> juju status is reporting "no matching tools available"
<sinzui> mbruzek, I award you a gold star for finding an awesome cryptic bug
 * mbruzek bows
<sinzui> mbruzek, you are on trusty deploying a precise charm right?
<_thumper_> o/
<mbruzek> sinzui, yes.
<mbruzek> sinzui, you can see that I tried to deploy precise first and then trusty.
<sinzui> mbruzek, did you set default-series in environments.yaml?
<mbruzek> It *was* set when I reported the bug, but I suspected that was an issue so I removed default-series for this particular bootstrap
<mbruzek> sinzui, I will add it again but charms will not deploy
<sinzui> mbruzek, I am just looking for differences between my own setup. I am trusty with default-series of trusty
<waigani> thumper: the Action field in ModifyUsersStruct, where are we using that?
<mbruzek> sinzui, I am setting default-series, do you want me to set it to trusty or precise?
<sinzui> mbruzek, trusty
<thumper> waigani: we should be using that in the server side, check that the action = 'add' and error if it doesn't with 'unknown action'
<thumper> waigani: the idea being that we'll implement 'remove'
<waigani> thumper: but isn't the action determined by the function called: ShareEnvironment would always add a user, so what value does the check have in there?
<thumper> waigani: the current client api will always add to the server api
<thumper> waigani: we will add another client api that calls the same server api with a different action
<thumper> the server api stays the same
<thumper> but gains another action
<thumper> does that make sense?
<mbruzek> sinzui, http://pastebin.ubuntu.com/8218617/
<waigani> thumper: what do the cmds look like?
<thumper> what commands?
<thumper> juju ?
<thumper> we haven't decided yet
<waigani> e.g. juju share add user@pro
<thumper> no
<thumper> juju share user@pro user2@pro
<thumper> perhaps...
<thumper> juju share --remove user@pro
<waigani> ah
<thumper> default is to share with someone
<thumper> but we need to allow removal
<waigani> so that is why we need the actions
<thumper> yes
<waigani> got it
<thumper> there will be two different client api calls
<thumper> or...
<thumper> maybe later, one call
<thumper> with an action option
<thumper> but the server needs to support the action
<thumper> as that is the bit that shouldn't change
<waigani> should I make the actions consts and where should they live?
<thumper> apiserver code
<waigani> ok
<thumper> with the params
<waigani> right
<thumper> consts are preferred but not essential
<waigani> iotas ?
<thumper> cmars: around?
<thumper> no, strings
<waigani> ok
<thumper> "add", "remove"
<thumper> better for understanding the wire protocol
<cmars> thumper, here
<waigani> makes sense, going over the wire
<thumper> cmars: are we meeting today?
<cmars> thumper, we are. thought it was monday, sorry
<thumper> :)
<sinzui> mbruzek, I am trying juju sync-tools to change the tools in the env, if that works, then we know something more
<sinzui> mbruzek, oh!
<mbruzek> sinzui, ?
<sinzui> before you try sync-tools, pastebin a copy of ~/.juju/local/storage/tools/streams/v1/com.ubuntu.juju\:released\:tools.json
<sinzui> mbruzek, this is mine: http://pastebin.ubuntu.com/8218656/
<sinzui> mbruzek, sync-tools is not fast :( It will pull down about 8 tools for precise and trusty
<mbruzek> http://paste.ubuntu.com/8218657/
<sinzui> mbruzek, that looks fine to me
<mbruzek> sinzui, Looking at the diff of those 2, looks very similar to me.
<wallyworld_> sinzui: mbruzek: i haven't read all the backscroll in detail. 1. you have found a tools bug? 2. if you want to juju ssh into a local provider lxc instance, you need to use sudo ssh or else you get a permission denied error
<sinzui> wallyworld_, I just updated the bug https://bugs.launchpad.net/juju-core/+bug/1364631
<mup> Bug #1364631: juju fails to find matching tools <deploy> <local-provider> <lxc> <juju-core:New> <https://launchpad.net/bugs/1364631>
<mbruzek> wallyworld_, I don't think my lxc instances are coming up.
<mbruzek> sinzui,  thanks for the update
<mbruzek> wallyworld_, I believe I have found a tools related bug.  I can not deploy and it looks to be tool related.
<sinzui> mbruzek, It took me 15 minutes to sync-tools. I forgot that it would download utopic too. I wonder if matching tools were found after you had the non .1 versions
<mbruzek> sinzui, I have not yet synced would you like me to?
<wallyworld_> sinzui: mbruzek: i can retest with 1.20 trunk, but last time i tried it last week, there were no issues that i know of. so i'm sad if there's a new problem :-(
<sinzui> mbruzek, Lp is timing out. I am trying to set the bug to 1.21-alpha1 with a suggestion to backport to 1.20.7
<wallyworld_> sinzui: mbruzek: there is a potential issue in trunk because of aria2 not being found - that is going to be reverted
<wallyworld_> that issue will cause the lxc container to come up but tools copy into the container to fail
<sinzui> wallyworld_, I have been using 1.20.6 all day, and I cannot reproduce it myself. I even did this same kind of deployment this morning
<wallyworld_> :-( that will make it hard to fix
<sinzui> wallyworld_, failure to download tools means we get a machine and a nice log.
<wallyworld_> so we will need all the logs from /var/lib/juju/containers
<wallyworld_> let's get these attached to the bug if possible
<sinzui> mbruzek, does this show anything interesting about deploy: juju debug-log -l ERROR --replay
<sinzui> mbruzek, or just attach the .juju/local/log/all-machines.log to the bug
<mbruzek> OK.
<wallyworld_> and the /var/lib/juju/containers logs
<wallyworld_> ^^^^ these are important as they container cloud init data
<wallyworld_> contain
<sinzui> wallyworld_, there is no cloud init because there were not tools selected
<mbruzek> wallyworld_, that path contains 4 directories, which files do you want from there?
<sinzui> wallyworld_, sudo lxc-ls --fancy shows no containers being created
<wallyworld_> mbruzek: all of them if possible, in  tar.gz
<mbruzek> http://pastebin.ubuntu.com/8218736/
<wallyworld_> sinzui: oh, ok. so it's failing really early
<sinzui> wallyworld_, mbruzek /var/lib/juju/containers The four containers are the new and old template containers I think
<wallyworld_> mbruzek: ok, just the ones with lxc
<wallyworld_> the ones without are the old directories and can be deleted
<mbruzek> wallyworld_, uploaded to https://bugs.launchpad.net/juju-core/+bug/1364631
<mup> Bug #1364631: juju fails to find matching tools <deploy> <local-provider> <lxc> <juju-core:Triaged> <https://launchpad.net/bugs/1364631>
<wallyworld_> mbruzek: ty. can you also include all-machines.log
<mbruzek> wallyworld_, on it
<wallyworld_> ty
<mbruzek> done.
<mbruzek> sinzui, your command shows only 2 lines
<mbruzek> machine-0: 2014-09-02 22:05:47 ERROR juju.worker runner.go:218 exited "api": unable to connect to "wss://localhost:17070/"
<mbruzek> machine-0: 2014-09-02 22:06:20 ERROR juju.provisioner provisioner_task.go:418 cannot find tools for machine "1": no matching tools available
<sinzui> mbruzek, and that one extra line is the difference from me.
<mbruzek> sinzui, you mentioned that you saw this problem earlier today?
<sinzui> mbruzek, no, I have never seen this problem
<sinzui> I have done deployments exactly like yours today without incident
<wallyworld_> mbruzek: can you please upload your ~/.juju/local/storage/tools ?
<wallyworld_> so we can see what tools and metadata were generated at bootstrap
<sinzui> wallyworld_, it is in the pastebin I added to the bug
<wallyworld_> ok, ta, looking
<sinzui> wallyworld_, mbruzek , but I didn't ask to confirm the tools were there and readable. We just verified http://paste.ubuntu.com/8218657/
<mbruzek> looking
<mbruzek> everything looks readable by mbruzek
<wallyworld_> mbruzek:  sinzui: that pastebin is the products file, is there the index file also?
<wallyworld_> it's the index file that appears to be the problem
<mbruzek> $ ls -l streams/v1/
<mbruzek> total 8
<mbruzek> -rw------- 1 mbruzek mbruzek 1671 Sep  2 17:05 com.ubuntu.juju:released:tools.json
<mbruzek> -rw------- 1 mbruzek mbruzek  498 Sep  2 17:05 index.json
<mbruzek> wallyworld_, do you need to see index.json?
<wallyworld_> yup
<wallyworld_> that's what the error message in machines.log refers to
<mbruzek> http://paste.ubuntu.com/8218852/
<wallyworld_> hmmm, looks ok
<mbruzek> wait
<mbruzek> fetchData failed for "http://10.0.3.1:8040/tools/streams/v1/index.sjson": file "tools/streams/v1/index.sjson" not found
<mbruzek> the file is named index.json
<mbruzek> typo?
<mbruzek> or is that just a different file?
<wallyworld_> mbruzek: no, it first looks for the sjson version
<wallyworld_> so that's ok
<mbruzek> OK
<wallyworld_> mbruzek: can you fetch http://10.0.3.1:8040/tools/streams/v1/index.json ?
<wallyworld_> subst localhost for 10.0.3.1
<wallyworld_> the sjson is signed json metadata
<mbruzek> $ wget https://localhost:8040/tools/streams/v1/index.json
<mbruzek> --2014-09-02 17:57:47--  https://localhost:8040/tools/streams/v1/index.json
<mbruzek> Resolving localhost (localhost)... 127.0.0.1
<mbruzek> Connecting to localhost (localhost)|127.0.0.1|:8040... failed: Connection refused.
<wallyworld_> we use signed json on the official simplestreams website
<wallyworld_> do you still have your local provider running?
<wallyworld_> or have you destroyed it?
<mbruzek> still running
<wallyworld_> hmm, maybe you need to leave the 10.0.3.1
<wallyworld_> i'm just curious as to what data is actually served up
<wallyworld_> just to rule out it is not providing different data to what we think
<sinzui> wallyworld_, mbruzek NO ONE NEEDS sjson, None of us have keys to sign it
<sinzui> only streams.canonical.com gets signed metadata
<wallyworld_> yep
<mbruzek> wallyworld_, leaving 10.0.3.1 worked
<mbruzek> http://paste.ubuntu.com/8218897/
<sinzui> mbruzek, I thought you were on local host
<wallyworld_> mbruzek: there's the problem
<mbruzek> why does it say ppc64?
<wallyworld_> exactly
<mbruzek> sinzui, I AM on the local provider on my intel laptop
<wallyworld_> it's generating tools metadata for ppc64
<sinzui> mbruzek, your bug gets more awesome by the hour
<wallyworld_> mbruzek: your local laptop isn't ppc64 is it? :-)
<mbruzek> no it is not
<wallyworld_> not that i think it is
<mbruzek> what would cause the arch to be incorrect.
<mbruzek> ?
<wallyworld_> well, now we know why it can't find tools, just have to figure out how the fark it's generating ppc64 metadata
<wallyworld_> don't know right now, i'll do a little digging in the code
<mbruzek> wallyworld_, OK.
<sinzui> wallyworld_, mbruzek is one of the few people who can develop on ppc64, Is there a backdoor to select arch that mbruzek could have tripped in his shell env?
<sinzui> mbruzek, if you run sync-tools, then return in 30 minutes, you will have every tool and I suspect you can deploy.
<wallyworld_> something is making juju think it needs ppc64 tools
<mbruzek> wallyworld_, do you want me to do that?  Will that provide anything useful for your debug?
<wallyworld_> not straight away
<mbruzek> sinzui, I dumped my env vars and grepped for ppc
<wallyworld_> let me first check to see how juju gets the arch it thinks it needs
<sinzui> well I think uname -p says what we expect to be selected
<mbruzek> $ uname -p
<mbruzek> x86_64
<wallyworld_> yes, that's what i think too, but something else must be happening
<wallyworld_> i can't think of why it then writes ppc64
<wallyworld_> mbruzek: when possible, i'd love to see a copy of the output when you "juju bootstrap --debug --show-log"
<wallyworld_> that will contain the info needed to delve deeper into this
<wallyworld_> to see where the tools are getting generated
<wallyworld_> mbruzek: also a listing of the ~/.juju/local/storage/tools/releases directory
<mbruzek> mbruzek@workhorse:~/.juju/local/storage/tools/releases$ ls -l
<mbruzek> total 16008
<mbruzek> -rw------- 1 mbruzek mbruzek 8195954 Sep  2 17:05 juju-1.20.6.1-precise-amd64.tgz
<mbruzek> -rw------- 1 mbruzek mbruzek 8195954 Sep  2 17:05 juju-1.20.6.1-trusty-amd64.tgz
<mbruzek> Let me destroy the environment and get that for you
<mbruzek> http://paste.ubuntu.com/8219045/
<wallyworld_> thank you
<mbruzek> ~$ juju bootstrap --debug --show-log -e local 2>&1 | tee juju_bootstrap_4.txt
<mbruzek> That is how I ran it
<wallyworld_> mbruzek: that log appears to show everything is correct at first glance, ie amd64 tools metadata, not ppc64
<wallyworld_> what does the index.json file contain now?
<wallyworld_> since this looks perfectly as expected:
<wallyworld_> 2014-09-02 23:25:37 DEBUG juju.environs.simplestreams simplestreams.go:367 read metadata index at "file:///home/mbruzek/.juju/local/storage/tools/streams/v1/index.json"
<wallyworld_> 2014-09-02 23:25:37 DEBUG juju.environs.simplestreams simplestreams.go:534 candidate matches for products ["com.ubuntu.juju:14.04:amd64"] are [{Tue, 02 Sep 2014 18:25:36 -0500 products:1.0 content-download  [] streams/v1/com.ubuntu.juju:released:tools.json [com.ubuntu.juju:12.04:amd64 com.ubuntu.juju:14.04:amd64]}]
<wallyworld_> 2014-09-02 23:25:37 DEBUG juju.environs.simplestreams simplestreams.go:847 finding products at path "streams/v1/com.ubuntu.juju:released:tools.json"
<wallyworld_> 2014-09-02 23:25:37 DEBUG juju.environs.simplestreams simplestreams.go:885 metadata: &{map[com.ubuntu.juju:12.04:amd64:{ 1.20.6.1 amd64   map[20140902:0xc21000a360]} com.ubuntu.juju:14.04:amd64:{ 1.20.6.1 amd64   map[20140902:0xc21000a660]}] map[] Tue, 02 Sep 2014 18:25:36 -0500 products:1.0 com.ubuntu.juju:released:tools  }
<wallyworld_> 2014-09-02 23:25:37 INFO juju.environs.bootstrap bootstrap.go:58 newest version: 1.20.6.1
<wallyworld_> 2014-09-02 23:25:37 INFO juju.environs.bootstrap bootstrap.go:86 picked bootstrap tools version: 1.20.6.1
<mbruzek> wallyworld_, the wget gives me ppc64.
<mbruzek>                 "com.ubuntu.juju:12.04:ppc64",
<mbruzek>                 "com.ubuntu.juju:14.04:ppc64"
<wallyworld_> wtf
<mbruzek> I don't know!
<wallyworld_> the content of the file on disk?
<wallyworld_> says ppc64?
<mbruzek> give me location again?
<wallyworld_> "file:///home/mbruzek/.juju/local/storage/tools/streams/v1/index.json"
<mbruzek>                 "com.ubuntu.juju:12.04:amd64",
<mbruzek>                 "com.ubuntu.juju:14.04:amd64"
<mbruzek> this is a mystery !
<wallyworld_> yes
<mbruzek> Where is it getting ppc64.
<wallyworld_> so if you wget from http://10.0.3.1/.... you get ppc64
<mbruzek> yes
<davecheney> mbruzek: netstat -anp
<davecheney> who is listening on 10.0.3.1
<mbruzek> oh crap
<mbruzek> sinzui, was right.
<wallyworld_> what did he say?
<mbruzek> I am running sshuttle -r ubuntu@stilson-01 10.0.3.0/24 so I can see the local systems on the power machine stilson-01
<davecheney> bad ida
<davecheney> idea
<wallyworld_> lol
<davecheney> especially on that range
<wallyworld_> well that explains a lot
<wallyworld_> i was beginning to worry
<mbruzek> Doh!
<mbruzek> I am sorry
<wallyworld_> no problemo
<wallyworld_> glad we found the problem
<davecheney> so, has this fixed all the problems ?
<davecheney> or just some of them ?
<mbruzek> bootstrapping.
<wallyworld_> what is "all the problems"?
<davecheney> wallyworld_: all the ones mbruzek reported to me earlier
<davecheney> with tools selection being screwed
<wallyworld_> yes, i think this will fix all that
<davecheney> good
<wallyworld_> i hope  anyway
<davecheney> i hope everyone learnt a valuable lesson :)
<mbruzek> sorry guys
<davecheney> np
<wallyworld_> no need to apologise :-)
<davecheney> we found the problem
<wallyworld_> we introduce enough bugs of our own :-)
<davecheney> mbruzek: your penance is to write up what happened
<davecheney> and forward it to anyone who is using sshuttle
<mbruzek> yes sir
<mbruzek> I can confirm that the system bootstraps now, and I can deploy both precise and trusty ubuntu charms.
<wallyworld_> \o/
<davecheney> alexisb: i've finished my review of the gccgo4.9 bugs
<davecheney> please see my email for details
<menn0> anyone able to review this? https://github.com/juju/juju/pull/660
#juju-dev 2014-09-03
<davecheney> menn0: reviewing
<menn0> davecheney: thanks
<menn0> davecheney: sorry that it's a fairly big one
<mbruzek> wallyworld_, note sent, please reply if I missed something.
<wallyworld_> will do, thank you
<mbruzek> davecheney, sinzui, wallyworld_  thanks very much for helping me on this problem.
<wallyworld_> anytime
<mbruzek> have a good day/night
<wallyworld_> you too
<davecheney> menn0: review done
<menn0> davecheney: thanks very much
<waigani> thumper: any hints on how to get resources and authorizer to call the NewClient? Do we have mocks somewhere?
<davecheney> waigani: there is a mock for Authorizer
<davecheney> apiserver/testing.FakeAuthorizer
<davecheney> takes a tag
<waigani> yes!
<waigani> thank you
<davecheney> dunno about resources
<davecheney> not even really sure what they are
<ericsnow> axw: ping
<axw> ericsnow: hey, just got back from drop off
<ericsnow> axw: cool, could you take a look at bug #1364438?
<mup> Bug #1364438: utopic lxc tools.tar.gz and aria2c not found <ci> <local-provider> <lxc> <regression> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1364438>
<axw> ericsnow: ah, yep, thanks
<ericsnow> axw: I'm pretty sure it's related to the aria2 change from the other day
<axw> indeed
<ericsnow> axw: thanks!
<waigani> thumper: ping
<thumper> waigani: hey
<thumper> waigani: look at common.NewResources()
<waigani> thumper: yeah got that
<waigani> on to the next prob now, state.getCollection borks
<waigani> because st.authenticated is a nil pointer
<waigani> i'm guessing because of the mock authenticator not setthing it?
<waigani> *is not setting it
<thumper> waigani: that means that the open call was done without a user/password
<waigani> open call?
<thumper> waigani: how are you getting *state.State
<waigani> from the base suite
 * thumper thinks
<waigani> JujuConnSuite
<thumper> I think I need to see the code
<waigani> sure
<waigani> screen share, pastebin or push pr?
<thumper> um... push pr probably easiest
<waigani> thumper: http://pastebin.ubuntu.com/8219767/
<waigani> this is how I'm setting up the suite
<wwitzel3> ok, so I changed RunCommands in every place I could find .. yet I still am getting an error from the rpc.Register, http://paste.ubuntu.com/8219764/
<wwitzel3> having trouble tracking down why the exported method isn't suitable
<thumper> wwitzel3: I think you may have to restructure so it has only one in param and one out param
<wwitzel3> thumper: I will give that a shot, thanks
<thumper> wwitzel3: there is weird magic in the api layer registration
<wwitzel3> lol, good enough explanation for me
<wwitzel3> :)
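A minimal sketch of the "one in param and one out param" shape thumper suggests, assuming a reflection-based check. The `Facade`, `Params`, `Results`, and `oneInOneOut` names are hypothetical and this is not the actual juju rpc registration logic, which may accept other shapes.

```go
package main

import (
	"fmt"
	"reflect"
)

// Params and Results are hypothetical request/response structs.
type Params struct{ Commands string }
type Results struct{ Output string }

// Facade is a hypothetical API facade used only for illustration.
type Facade struct{}

// Run takes a single argument and returns a single result.
func (Facade) Run(p Params) Results { return Results{Output: "ran " + p.Commands} }

// RunMany takes two arguments, so the check below rejects it.
func (Facade) RunMany(a, b Params) Results { return Results{} }

// oneInOneOut reports whether the named exported method takes exactly one
// argument (besides the receiver) and returns exactly one value.
func oneInOneOut(v interface{}, name string) bool {
	m, ok := reflect.TypeOf(v).MethodByName(name)
	if !ok {
		return false
	}
	// m.Type counts the receiver as the first input parameter.
	return m.Type.NumIn() == 2 && m.Type.NumOut() == 1
}

func main() {
	for _, name := range []string{"Run", "RunMany"} {
		fmt.Println(name, oneInOneOut(Facade{}, name))
	}
}
```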
<thumper> waigani: that looks ok to me...
<thumper> waigani: although...
<thumper> waigani: you probably want to change the auth tag
<thumper> waigani: to be names.NewUserTag(state.AdminUser)
<thumper> all that is changing in my branch
<waigani> ah, sure
<thumper> but it is likely to be a problem for you
<thumper> as the code that does the lookup
<thumper> won't find that user
<waigani> thumper: I'll hunt down the st.authorized problem - got a few leads :)
<waigani> oracle is helping for once
<waigani> thumper: dummy provider does not set Tag or Password in mongo.MongoInfo before opening a connection: provider/dummy/environs.go:112
<thumper> waigani: but other tests pass... so figure out why :-)
<waigani> sigh
<thumper> waigani: think of it as a learning exercise
<thumper> and to help things out, it is probably something simple and stupid
<thumper> at least it will seem that way once you have found it
<wwitzel3> I always feel that way
<wwitzel3> :/
<axw> review please, https://github.com/juju/juju/pull/661 - fixes CI blocker
<menn0> davecheney: PTAL https://github.com/juju/juju/pull/660/
<menn0> davecheney: you may find it quicker to just look at the second commit which is where I've addressed your review feedback
<waigani> thumper: so s.State is nil in SetUpSuite, even though I've called s.baseSuite.SetUpSuite(c)
<waigani> thumper: fix would be to init the client in SetUpTest, but why is state nil?
<thumper> don't know...
<thumper> and I'm busy reading...
<thumper> chat with menn0 :)
<davecheney> menn0: ta
<davecheney> looking
<waigani> names.ParseUserTag("user@provider") returns names.UserTag{name:"", provider:""}
<waigani> is that expected?
<waigani> davecheney ^?
<davecheney> waigani: nope
<davecheney> is there a test case covering this ?
<menn0> davecheney: I've responded to your remaining comment for PR 660
<waigani> davecheney: just hit it
<waigani> so I'd say no
<davecheney> waigani: hang on, parseUserTag returns two values
<waigani> yeah, so when err is nil
<waigani> it's still an empty tag
<davecheney> lucky(~/src) % go run tt.go
<davecheney> user- "dave@deathstar" is not a valid tag
<waigani> ah shit ignore me, sorry for the noise - we are not returning on err
<davecheney> ok
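A minimal sketch of the mistake waigani describes: using the first return value of a two-value parse function without returning on `err`. The `parseUserTag` function here is a hypothetical stand-in with made-up rules, not the `names.ParseUserTag` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// userTag is a hypothetical stand-in for a parsed user tag.
type userTag struct{ name, provider string }

// parseUserTag is a hypothetical parser: it requires a "user-" prefix and
// returns a zero-value tag together with an error when the input is invalid.
func parseUserTag(s string) (userTag, error) {
	if !strings.HasPrefix(s, "user-") {
		return userTag{}, fmt.Errorf("%q is not a valid tag", s)
	}
	parts := strings.SplitN(strings.TrimPrefix(s, "user-"), "@", 2)
	if parts[0] == "" {
		return userTag{}, fmt.Errorf("%q is not a valid tag", s)
	}
	tag := userTag{name: parts[0]}
	if len(parts) == 2 {
		tag.provider = parts[1]
	}
	return tag, nil
}

// describe returns as soon as parsing fails, so the zero-value tag is
// never mistaken for a real result.
func describe(s string) (string, error) {
	tag, err := parseUserTag(s)
	if err != nil {
		return "", err
	}
	return tag.name + " at " + tag.provider, nil
}

func main() {
	for _, in := range []string{"user@provider", "user-dave@deathstar"} {
		out, err := describe(in)
		fmt.Println(out, err)
	}
}
```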
<axw> wallyworld_: I'm just rebasing then should be in a position to push my tools-in-state branch up. it's grown quite a bit, so I think I'll have to try and split it up a bit
<wallyworld_> ok
<axw> I'll push it up anyway if you want to take a look at the core bits though
<axw> wallyworld_: FYI https://github.com/axw/juju/compare/state-tools-catalogue
<axw> would appreciate a glance over state/tools.go specifically
<wallyworld_> ok
<axw> that's the bit that uses blobstore
<wallyworld_> shit, i still gotta fix that
<axw> wallyworld_: would you like me to create a bug against 1.21 so we don't release without it?
<wallyworld_> yeah, that would be good actually
<axw> wallyworld_: https://bugs.launchpad.net/juju-core/+bug/1364750
<mup> Bug #1364750: blobstore's hashing needs improvement <juju-core:New> <https://launchpad.net/bugs/1364750>
<wallyworld_> ta
<axw> wallyworld_: just realised I got the databases back-to-front for the managed storage. IIANM, the blobs should go in their own DB and the catalogue can go in the juju db
<wallyworld_> yep
<wallyworld_> axw: also, the path for tools storage should be prefixed with /tools
<wallyworld_> cause there'll also be a /charms etc
<axw> wallyworld_: atm it's "tools-", so I should change that to "tools/"? is the leading slash required?
<wallyworld_> yeah change to tools/, leading slash not required. storage will prepend with /environs/<uuid>/
<axw> thought so. cool
<wallyworld_> axw: also, i was thinking we'd have a ToolsStorage interface with an implementing struct that is constructed with NewToolsStorage(), rather than bulking up state.State with more methods
<wallyworld_> NewToolsStorage() would take the environ uuid etc as parameters, and probably a txnRunner of some sort
<wallyworld_> the txnRunner could just be state instance passed in
<wallyworld_> this then allows for easier, standalone testing using dependency injection etc
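A rough sketch of the shape wallyworld_ describes, assuming an in-memory map in place of real blob storage. Only the `ToolsStorage` and `NewToolsStorage` names come from the chat; the method names, the `TxnRunner` interface, and `countingRunner` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// TxnRunner is a hypothetical stand-in for "a txnRunner of some sort".
type TxnRunner interface {
	Run(op func() error) error
}

// ToolsStorage is a small interface; its method names are assumptions.
type ToolsStorage interface {
	AddTools(version string, data []byte) error
	Tools(version string) ([]byte, error)
}

type toolsStorage struct {
	envUUID string
	runner  TxnRunner
	blobs   map[string][]byte // in-memory stand-in for real blob storage
}

// NewToolsStorage takes its dependencies as parameters, so a test can
// inject a fake runner instead of building a full state.
func NewToolsStorage(envUUID string, runner TxnRunner) ToolsStorage {
	return &toolsStorage{envUUID: envUUID, runner: runner, blobs: make(map[string][]byte)}
}

func (s *toolsStorage) AddTools(version string, data []byte) error {
	return s.runner.Run(func() error {
		s.blobs[version] = data
		return nil
	})
}

func (s *toolsStorage) Tools(version string) ([]byte, error) {
	data, ok := s.blobs[version]
	if !ok {
		return nil, errors.New("tools not found: " + version)
	}
	return data, nil
}

// countingRunner is a test double that records how often it was used.
type countingRunner struct{ calls int }

func (r *countingRunner) Run(op func() error) error {
	r.calls++
	return op()
}

func main() {
	runner := &countingRunner{}
	storage := NewToolsStorage("hypothetical-uuid", runner)
	if err := storage.AddTools("1.21.0-trusty-amd64", []byte("tarball")); err != nil {
		fmt.Println("add failed:", err)
		return
	}
	data, err := storage.Tools("1.21.0-trusty-amd64")
	fmt.Println(string(data), err, runner.calls)
}
```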
<axw> wallyworld_: how would the ToolsStorage be accessed?
<wallyworld_> how is it accessed now in your branch?
<axw> it's currently used in two places IIRC (not counting the myriad tests): by bootstrap, and by the apiserver/tools.go code
<axw> and apiserver/common/tools.go
<wallyworld_> it looks like we construct a ToolsGetter passing in state
<wallyworld_> so that would change to pass in a ToolsStorage, which is constructed from state
<wallyworld_> that's one example in tools.go
<axw> I'll have a look at how it can be split out, I'm not seeing a lot of benefit right now though
<wallyworld_> adding more and more to state kinda sucks
<wallyworld_> i'd much prefer smaller, standalone components
<wallyworld_> easier to test and reuse
<wallyworld_> axw: if you have time at some point, i'd love a review of this PR so i can give something to the landscape guys when they come on. the next OCR people aren't on for a while :-( maybe if you are sick of tools and want something slightly different for a short time. https://github.com/juju/juju/pull/662
<axw> wallyworld_: can you please elaborate on what is only called at bootstrap time?
<wallyworld_> at bootstrap, the floating ip is tracked and associated with the instance, and the addresses correctly stored. then, the address poller queries the instances running and gets their addresses and overwrites what was done at bootstrap, because the api used by the instance poller did not take account of the floating ip
<axw> wallyworld_: ah, I see, thanks
<wallyworld_> i tested live on hp cloud and it worked
<wallyworld_> took a little extra testing because they use a funny address range that highlights the bug and we don't
<axw> wallyworld_: I'm just looking at goose/nova.ServerDetail, and I think there's already fields in there for floating IPs
<axw> namely, AddressIPv4 and AddressIPv6
<axw> may be a simple matter of just using them in getAddresses()
<wallyworld_> hmmm, could be, i didn't see those
<wallyworld_> i'll check it out
<axw> wallyworld_: tests are broken now, but can you see if this is more to your liking? https://github.com/axw/juju/compare/state-toolstorage
<wallyworld_> sure
<axw> wallyworld_: code implementation moved into state/toolstorage, with a method on State to create one
<axw> core impl*
<wallyworld_> axw: looks better. we can now test the tools stuff without instantiating a state at all. just mongo and a txn runner
<axw> okey dokey. I'll fix up the tests and propose this in isolation
<axw> then I'll get back to the rest
<thumper> (╯°□°)╯︵ ┻━┻
<wallyworld_> axw: sadly, those AddressIPv4 and AddressIPv6 fields are never filled in. i've run up an hp cloud env and even explicitly querying for server details after assigning the public address, they come back empty :-(
<axw> rats
<axw> wallyworld_: welp, in that case my comment on the PR stands
<wallyworld_> keeping the code to process the floating ips in the Instances() calls means also it is only done once
<axw> wallyworld_: yep, fair enough - that's why I went looking at ServerDetails
<wallyworld_> not each time an individual instance's Addresses() is called
<axw> so it would need to be done on AllInstances too
<axw> I think those are the only two places
<wallyworld_> ok, i didn't notice AllInstances
<wallyworld_> i'll fix after school pickup
<wallyworld_> axw: one thought i had also - tools storage is really associated with an environment, so i wonder if the toolstorage package should be environs/toolsstorage not state/toolsstorage. it also means that stuff under state doesn't depend on environs
<axw> wallyworld_: it's inherently tied to mongo, so I don't think it's a good idea
<axw> there are other environment-specific things in state
<axw> like, most of state
<wallyworld_> ok, fair enough
<wallyworld_> axw: i updated the PR to fix AllInstances()
<axw> wallyworld_: just commented on it
<mattyw_> morning folks
<axw> jam: the reason I set it as critical is that we should not release code with blobstore without fixing it first. is there a better way to flag that?
<jam> axw: Critical means drop everything, IMO.
<jam> axw: and FWIW I actually think we'd be perfectly safe releasing with SHA-1
<jam> it isn't *more safe* to also use MD5 and SHA-256 is better
<jam> but we aren't leaving a critical security hole open by only using SHA-1
<axw> jam: we're not using SHA-1, we're using MD5+SHA-256 concatenated
<axw> we want to drop MD5
<jam> axw: which is fine, but still not a *security* issue
<axw> if we release *with* it, we'll have a migration problem to deal with
<jam> I agree with the improvement.
<voidspace> morning all
<wallyworld_> jam: did that email thread conclude we could just use SHA-384?
<wallyworld_> instead of SHA256 and MD5
<mattyw> is landing still blocked on https://bugs.launchpad.net/juju-core/+bug/1348477?
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<wallyworld_> oh, also, we weren't concatenating
<wallyworld_> jam: the implementation required the user to know both checksums as per marks's original directive to william. if that's what is meant by concatenating, then we were I guess
<wallyworld_> but the checksums were specified separately
<wallyworld_> mattyw: hmmm, that's not a regression
<wallyworld_> that bug test failure has been around for a while
<wallyworld_> it's intermittent
<mattyw> wallyworld_, yes, but I was sure it had appeared as a reason for not allowing landing yesterday
<mattyw> wallyworld_, unless it's been downgraded
<wallyworld_> could have done i suppose, i haven't been keeping up
<wallyworld_> it's still marked as a regression
<wallyworld_> i think i'm going to change it
<mattyw> wallyworld_, I think that's probably a wise move
<mattyw> wallyworld_, is there a simple query I can do on lp to get the list of bugs that block landing at any moment? is it just critical bugs with ci + regression tags?
<wallyworld_> mattyw: I think so yes
<mattyw> wallyworld_, looks like there are 4 at the moment?
<wallyworld_> could be, i haven't looked
<axw> wallyworld_: cleaned it up, https://github.com/juju/juju/pull/663
<axw> wallyworld_: I meant (md5(x), sha256(x)), as opposed to md5(sha256(x)) or sha256(md5(x))
<mattyw> this one could probably be changed to not include ci, regression as well https://bugs.launchpad.net/juju-core/+bug/1364410
<mup> Bug #1364410: Timeout TestManageEnviron MachineSuite in ppc64el <ci> <intermittent-failure> <ppc64el> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1364410>
<wallyworld_> axw: sorry, what's the context?
<axw> wallyworld_: that's what I meant by concatenating
<axw> basically, what you said
<wallyworld_> axw: oh, i didn't realise that you mentioned concatenating, i think it came from the email thread in my mind
<wallyworld_> but yes, i think we agree
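A short sketch of the two checksum schemes mentioned: a separately specified pair (md5(x), sha256(x)) versus a single SHA-384 digest. The function names are hypothetical, this is not the blobstore code, and the chat does not say that SHA-384 was agreed on.

```go
package main

import (
	"crypto/md5"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// pairChecksums returns the two digests separately, i.e. (md5(x), sha256(x)),
// rather than nesting one hash inside the other.
func pairChecksums(data []byte) (md5Hex, sha256Hex string) {
	m := md5.Sum(data)
	s := sha256.Sum256(data)
	return hex.EncodeToString(m[:]), hex.EncodeToString(s[:])
}

// sha384Checksum returns a single SHA-384 digest as one identifier.
func sha384Checksum(data []byte) string {
	s := sha512.Sum384(data)
	return hex.EncodeToString(s[:])
}

func main() {
	data := []byte("tools tarball contents")
	m, s := pairChecksums(data)
	fmt.Println("md5:   ", m)
	fmt.Println("sha256:", s)
	fmt.Println("sha384:", sha384Checksum(data))
}
```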
<wallyworld_> mattyw: i see 2 critical ci regression bugs now
<wallyworld_> i'm not sure that either should block landings
<wallyworld_> but i'm also reluctant to override curtis without talking to him
<mattyw> wallyworld_, what search are you doing?
<mattyw> wallyworld_, agreed, I'll talk to him when he's around
<wallyworld_> https://bugs.launchpad.net/juju-core/?field.searchtext=&orderby=-importance&search=Search&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.tag=ci+regression+&field.tags_combinator=ALL&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on
<wallyworld_> i forgot to filter on critical
<wallyworld_> axw: i gotta run to soccer, i'll look at your PR when I get back
<axw> wallyworld_: cheers, have fun
<voidspace> jam: http://pastebin.ubuntu.com/8221800/
<voidspace> jam: an unstable replicaset has members in state Recovering
<voidspace> and sometimes state Unknown it seems
<voidspace> (states 3 & 6 respectively)
<voidspace> just setting up an lxc with nbd so I can hammer replicasets with a slow disk
<voidspace> After adding members they start in Unknown (state 6)
<voidspace> removing members causes some of them to go into Recovering (state 3)
<fwereade> dimitern, ping
<dimitern> fwereade, hey
<fwereade> dimitern, I have been adding a few comments to https://github.com/juju/juju/pull/517/files that I think are relevant to your interests
<jam> voidspace: so that seems ok, though I think we need some amount of logic about how long we'd be willing to wait.
<fwereade> dimitern, cast an eye over it and let me know if anything springs to mind
<dimitern> fwereade, sure, will do
<dimitern> fwereade, thanks for taking the time to review it
<fwereade> dimitern, sorry that one's been languishing so long :(
<dimitern> fwereade, I'm planning to update the document later today and convert comments to proposals, where relevant
<fwereade> dimitern, awesome
<dimitern> fwereade, no worries
<perrito666> morning
<voidspace> jam: sure, that just returns true or false
<voidspace> jam: we could use that in an attemptLoop (for example)
<mattyw> mgz, ping?
<voidspace> jam: another question is, do we always wait for *all* members to be healthy
<jam> voidspace: for the purposes of the test I think we do
<jam> voidspace: for "realsiez" we probably just wait for the majority ?
<voidspace> jam: right, but don't we want something that backup (et al) can use
<voidspace> right
<mattyw> who would like to talk to me about the presence watcher?
<mattyw> fwereade, quick favour?
<fwereade> mattyw, what can I do for you?
<mattyw> fwereade, I've made a new pr for my metric cleanup pr: https://github.com/juju/juju/pull/665. The bot was ignoring $$merge$$ on the original
<mattyw> fwereade, could you just give it a once over to confirm it's the same as the one that has already been LGTM'd
<mattyw> it should be exactly the same
<fwereade> mattyw, as long as it is the same, which you'd know best, I'm happy to trust you to self-LGTM with a link to the original PR for context
<mattyw> fwereade, ok thanks
<mattyw> fwereade, who's a good person to pester about the presence watcher?
<voidspace> jam: I need to leave for hospital appointment. *Probably* not back in time for standup. I have a branch with extraneous "Remove" removed - works fine but not yet tested with a slow disk. I also know how to check replicaset health (as discussed) but also want to test that on a slow disk. I'm now getting nbd working and mounted (wrestling a bit with nbd-server config). Will then create an lxc on the nbd mounted disk.
<voidspace> Creating an lxc container with the backing filesystem on another disk is straightforward. I should be able to share home directory without having to setup a full dev environment.
<voidspace> It's only mongo that needs to be running on the nbd device, not jujud.
<voidspace> I might then need to look at trickle, but just using nbd should add a significant latency I would expect.
<voidspace> anyway, gotta go
<jam> voidspace: hope it all goes well
<fwereade> jam, dimitern has a power cut
<jam> fwereade: thanks for relaying the message
<jam> TheMue: looks like its just you and me today
<jam> apparently it was just me
<hazmat> what's the trick to not have bootstrap destroy the environment on failure?
<hazmat> ah.. keep-broken
<mattyw> perrito666, ping?
<perrito666> mattyw:
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1348477 1364410  1359837
<natefinch> damn, the day  I need stuff reviewed is the day I'm on-call :/
<perrito666> natefinch: well, self review :p
<natefinch> perrito666: doesn't matter, CI is blocked anyway, just realized
<perrito666> natefinch: I am finishing a couple of tests and then I can take a look to the blockers if no one else is on them
<mattyw> folks, I'm looking into this bug: https://bugs.launchpad.net/juju-core/+bug/1348477. I'm trying to work out if there is any significance in us calling s.State.SetAdminMongoPassword("") right before we attempt to close. I wonder if making that call while the presence worker is doing its thing is causing the error
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<mattyw> because the auth fails error is coming from the presence watcher
<jcw4> rick_h_: I want to make sure we have a good corpus of use cases for Actions from the perspective of the GUI - is there someone I should work with from your team?
<rick_h_> jcw4: we were actually just talking about how we could use actions for our current work
<jcw4> rick_h_: sweet
<rick_h_> jcw4: but for a corpus, we've not built a list yet. I'd send an email to juju-dev mailing list and use the power of the masses to generate a list.
<jcw4> rick_h_: +1
<rick_h_> jcw4: and we'll be sure to reply with our use case
<jcw4> rick_h_: thanks
<marcoceppi> Good morning, seeing an issue that I confirmed was not in the cloud images (as best as I can tell)
<marcoceppi> Openstack charms are failing (and any charm with charmhelpers) to deploy on manual provider but not any other provider (including LXC on manual provider)
<marcoceppi> Does manual provider use cloud-init to set up the image?
<natefinch> marcoceppi: I don't think it can.  The whole point is that it's an already-running machine that you want to plop juju on.  Pretty sure it just ssh's in and runs stuff.
<marcoceppi> natefinch: that's the issue then, well at least thats what it seems to be
<marcoceppi> python-yaml isn't being installed, likely because of lack of cloud-initing, causing a discrepancy in all other provider images and the "manual" provider experience
<wwitzel3> natefinch: ping
<natefinch> wwitzel3: howdy
<wwitzel3> natefinch: I'm in moonstone
<perrito666> I need to choose a roomie or I will be automatically added :p anyone wants to be my roommate? I wake up too early, take showers equally early and sleep late :p and I might disassemble my laptop
<wwitzel3> natefinch: if you have time for the 1on1
<natefinch> wwitzel3: ahh yeah, sure
<natefinch> wwitzel3: belay that... kiddo has 102.8 temp... back in a bit
<wwitzel3> np
<perrito666> ouch that is like 39C :(
<mattyw> sinzui, ping?
<sinzui> hi matty
<mattyw> sinzui, good morning, I'm still looking into https://bugs.launchpad.net/juju-core/+bug/1348477. Do you think it should be blocking landing?
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<sinzui> mattyw, I think it should, though it didn't appear in the last three test runs. I would like to say the bug has to prove itself to be rare
<mattyw> sinzui, ok
<mattyw> sinzui, understoof
<mattyw> sinzui, also understood
<alexisb> TheMue, I will be missing our 1x1 today as I have a customer meeting
<mattyw> perrito666, ping?
<perrito666> mattyw: pong
 * perrito666 match, game, set
<TheMue> alexisb: ok, almost forgot it. good that you remind me. ;)
<perrito666> or something like that, I never understood those sports scoring systems
<natefinch> alexisb, wwitzel3, hazmat: I'm going to miss the TOSCA call in a few minutes... need to take my daughter to the pediatrician's office.
<TheMue> alexisb: beside my current fight with versioning ;) I also have nothing special
<perrito666> natefinch: ok, keep us posted on how she goes and divert our way anything that we can take care of for you
<mattyw> perrito666, in state/presence/presence.go:231. If you change the period from time.Second to time.Millisecond you can make the auth fails error much more often I think
<perrito666> mattyw: sweeeeeet
<alexisb> natefinch, hope all is ok
<alexisb> wwitzel3, you joining us?
<wwitzel3> alexisb: yep, sorry
<dimitern> alexisb, thanks for the travel approval,  i've sent a booking request to bts
<alexisb> dimitern, sweet
<dimitern> anyone willing to review my fatal #666 PR? :) https://github.com/juju/juju/pull/666 - added more error tracing and logging to help catch a few CI bugs
<dimitern> natefinch, wallyworld_, voidspace, fwereade, others? ^^
 * fwereade is briefly irrationally jealous of dimitern
<voidspace> dimitern: looking
<dimitern> thanks! :)
<perrito666> dimitern: I presume I am the person for that
<voidspace> I finally have an lxc container created inside an nbd device
<dimitern> hehe
<voidspace> it took ages, I couldn't get the standard way to work at all and had to force the server to start without config file and using "old style" exports (so I could specify port manually)
<voidspace> any other way and nbd-server just failed to do anything
<voidspace> using the lxc container seems appropriately slow
<dimitern> I know it's bad, but if the PR looks ok, I'll try merging it with __fixes-1348477__ to overcome the bot block and see more context for the errors
<voidspace> now to see if I can get juju tests running inside it
<perrito666> dimitern: no need there is a flag for that
<perrito666> for "this will not fix anything but I really really need it up"
<dimitern> perrito666, oh, what's that flag?
<perrito666> JFDI
<dimitern> oh :) i like it
<perrito666> please provide a justification for it, since it has been abused in the past
<voidspace> dimitern: what does errors.Trace do? wraps the error I presume
<mattyw> perrito666, is that right?
<perrito666> mattyw: please disambiguate "that"
<mattyw> perrito666, JFDI
<perrito666> yup
<mattyw> I'd feel bad using it, but that doesn't mean I'm not tempted
<voidspace> dimitern: so this essentially fixes a bug caused by the fact that some errors are wrapped and some aren't?
<voidspace> plus adds more consistent error handling
<perrito666> mattyw: you should not be using it unless you have a very good reason for it
<perrito666> dimitern: the extra info you added could help discover wtf is happening with the auth error
<voidspace> dimitern: and on line 815/816 you add a branch that explicitly doesn't wrap the error with Trace
<voidspace> dimitern: is this because it's already wrapped, or some other reason
<voidspace> does Cause recursively unwrap (root cause) or just one layer?
<dimitern> voidspace, the root cause
<voidspace> dimitern: subject to those questions LGTM
<voidspace> dimitern: and the reason for not using Trace on lines 815/816 of the diff?
<dimitern> voidspace, it's because tomb and some other packages like juju/txn check if err == SomeExactErrVar
<dimitern> voidspace, which file is that for lines 815/816 ?
<voidspace> dimitern: right, so wrapping screws that up
<dimitern> voidspace, exactly
<perrito666> state
<voidspace> dimitern: state/state.go
<voidspace> ah, line numbers are per file
<voidspace> } else if err == jujutxn.ErrExcessiveContention {
<perrito666> voidspace: yup, it's a unified diff
<voidspace> dimitern: for ErrExcessiveContention you explicitly avoid Trace
<dimitern> voidspace, yes, exactly because juju/txn internally check for ErrExcessiveContention and some other errors with if err == X
<voidspace> dimitern: right
<voidspace> LGTM then
<perrito666> voidspace: that is an error expected to be handled
<dimitern> voidspace, cheers!
<voidspace> ah, perrito666 was doing it too :-)
<voidspace> double LGTM
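A small sketch of why wrapping can break callers that compare with `err == SomeExactErrVar`, and how unwrapping to the root cause recovers the sentinel. The `traced` type, `trace`, `cause`, and `errContention` are hypothetical stand-ins, not the juju/errors or juju/txn implementations.

```go
package main

import (
	"errors"
	"fmt"
)

// errContention is a hypothetical sentinel that callers compare against with ==.
var errContention = errors.New("state changing too quickly")

// traced is a hypothetical wrapper that records where an error passed through.
type traced struct {
	cause    error
	location string
}

func (t *traced) Error() string { return t.cause.Error() }

// trace wraps err with a location; the result is a new value, so it is
// no longer == to the original sentinel.
func trace(err error, location string) error {
	if err == nil {
		return nil
	}
	return &traced{cause: err, location: location}
}

// cause unwraps every layer and returns the root error.
func cause(err error) error {
	for {
		t, ok := err.(*traced)
		if !ok {
			return err
		}
		err = t.cause
	}
}

func main() {
	wrapped := trace(trace(errContention, "inner.go"), "outer.go")
	// The direct comparison fails because wrapped is a *traced value.
	fmt.Println("direct match:", wrapped == errContention)
	// Unwrapping to the root cause recovers the sentinel.
	fmt.Println("cause match: ", cause(wrapped) == errContention)
}
```

Returning the sentinel unwrapped, as the PR does for that one branch, keeps existing `==` checks working without requiring every caller to unwrap first.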
<perrito666> dimitern: you could drop a comment there since its cause for doubts
<perrito666> voidspace: is your lgtm as worthless as mine?
<perrito666> :p
<dimitern> perrito666, which comment to drop?
<voidspace> hehe, not officially I believe
<voidspace> but in practise...
<dimitern> perrito666, ah, you mean add a comment why ErrExcCont is not wrapped?
<dimitern> perrito666, sure
<perrito666> dimitern: sorry perhaps a bad translation, add a small comment on why you are not trace wrapping that particular err
<mattyw> dimitern, how much stuff do you know about the jujuconnsuite teardown logic?
<mattyw> cmars, and and I are looking into it now - there's a few bits that seem odd to us
<alexisb> gsamfira, do we have a hangout for the meeting?
<gsamfira> no, will you create one ?
<gsamfira> alexisb ^
<alexisb> yep
<dimitern> mattyw, not much, but my PR there aims to help debugging this mess
<alexisb> gsamfira, sent you an invite
<mattyw> dimitern, cmars and I are looking into it now, we've a few ideas and we can recreate the issue *sort of*
<ericsnow> natefinch, perrito666, wwitzel3: standup?
<perrito666> ericsnow: nate is not available afaik
<perrito666> wwitzel3: did you finish your meeting?
<wwitzel3> yep
<wwitzel3> if I have an endpoint can I get a relation_id from that? ..
<wwitzel3> the int value that is
<natefinch> ericsnow: here but can't standup, sleeping sick baby on my lap and only one hand to type
<ericsnow> natefinch: no worries
<natefinch> ped appt is later, unfortunately, but at least she's sleeping.
<perrito666> natefinch: all is well I hope
<perrito666> is the doc going to your house?
<perrito666> wwitzel3: ?
<natefinch> perrito666: nope, they don't do that here
<perrito666> :(
<dimitern> mattyw, sorry, in a meeting, so i'm responding when i can; i'm interested to hear your ideas how to repro it?
<mattyw> dimitern, it seems timing related so it's not sure fire: http://paste.ubuntu.com/8224259/
<ericsnow> wwitzel3: standup?
<mattyw> tasdomas, was just looking at this: https://github.com/juju/juju/pull/667
<mattyw> tasdomas, I'm probably trying to do too many things at once but I couldn't work out what the significance of those changes is
<perrito666> mattyw: hey, I am all yours now, can I give you a hand fixing the auth fails issue?
<mattyw> dimitern, ping?
<dimitern> mattyw, pong
<mattyw> dimitern, do you have time to talk about this auth fails bug?
<dimitern> mattyw, trying to merge my PR now, and if it happens to fail on the bot with auth fails or something, we'll see the logging/error tracing
<dimitern> mattyw, yes, I have some time
<voidspace> heh, rate limiting my nbd drive to 200kps up/down with trickle means the lxc container living there will take about a week to start...
<voidspace> unless I kill it first...
<wwitzel3> lol
<voidspace> the drive was running at 147 mb/s before - so 200kb/s is probably a bit too slow....
<wwitzel3> why are you rate limiting it?
<voidspace> wwitzel3: to simulate a slow disk
<wwitzel3> voidspace: ahh, cool
<voidspace> wwitzel3: to work with mongo replica sets
<voidspace> wwitzel3: nbd lets you serve a volume over tcp and access it over tcp
<voidspace> nbd-client and nbd-server
<voidspace> so I rate limit the server and then mount the volume
<voidspace> and there's an lxc container living on the rate limited volume
<voidspace> and the intention is to have mongo running on that
<voidspace> I haven't actually got that far yet
<voidspace> I think I'll need a reboot as I had to kill a mount command and now can't mount the volume...
<voidspace> but first - jogging...
 * natefinch is back
<natefinch> ericsnow: great writeup on your charm
<ericsnow> natefinch: I hope it's useful
<ericsnow> natefinch: like I said before, it went pretty smoothly and the only criticisms I have are pretty mild
<natefinch> ericsnow: definitely.  It gives me a lot to think about to compare with my experience, which I think was more frustrating than yours, not that it was insurmountable.
<ericsnow> natefinch: the order of operations I outlined is mostly a best guess and undoubtedly not accurate, but should capture the bulk of what I did
<natefinch> ericsnow: maybe I expect too much, but I was annoyed with a lot of the process, most of which is our own fault
<natefinch> ericsnow: like debugging a hook.... I have one unit of one service deployed, it has one failed hook..... juju debug-hooks should just do the right thing and put me into the hook context on that machine.
<ericsnow> natefinch: oh, I didn't bother with that stuff.  I opened the logs directly, tweaked my charm accordingly, removed it, and re-deployed
<ericsnow> natefinch: I never tried debug-hooks, and only tried debug-log once
<natefinch> ericsnow: I had to do my testing on Amazon, because my charm didn't work in LXC, so re-deploying was painful
<ericsnow> natefinch: ah, so maybe that is the big difference
<ericsnow> natefinch: it would have been much more frustrating if it hadn't been on local provider
<natefinch> ericsnow: yeah, a lot of my complaints were around that
<natefinch> ericsnow: I also found the local repository stuff to be unnecessarily complicated.  Why can't I just say juju deploy --local=<path> ?
<ericsnow> natefinch: oh, yeah, that's a good one (I would have had it on my writeup if I'd remembered)
<natefinch> ericsnow: btw did you send that to anyone else or just me? :)
<ericsnow> natefinch: just you, I figured you would know to whom to forward it
<ericsnow> natefinch: or if you like I can just post it to the juju-dev list (or some other more appropriate list)
<natefinch> Writing to juju-dev is a good idea
<ericsnow> natefinch: will do
<ericsnow> natefinch: done
<natefinch> marcoceppi: if I use --config when I deploy, can I access those config variables during the install hook with config-get?
<marcoceppi> natefinch: theoretically, yes
 * natefinch squints at marcoceppi 
<natefinch> marcoceppi: what does that mean? :)
<marcoceppi> natefinch: yes, I'm like 90% sure you can
<marcoceppi> as in, I can't remember, but I'm pretty sure you can
<natefinch> ok :)
<natefinch> that's cool
<marcoceppi> you have access to everything up until the execution of that hook context, then it's locked until next hook
<marcoceppi> so you can even juju set before the install hook runs and it'll make it in
<marcoceppi> again 90% sure
<natefinch> I would hope so, but it didn't occur to me until just now that I might be able to.  It helps skip a restart if I can prepare config during install
<ericsnow> marcoceppi: I think you're right because that behavior actually broke a charm I'm using :P
<natefinch> haha
<natefinch> sorry
<ericsnow> perrito666: too bad you didn't get PR #666 :)
<TheMue> so, next step reached, tests run, merge done, conflicts resolved. time to go to bed
<TheMue> good n8 folks
<wallyworld_> katco`: hi, how was your day?
<thumper> wallyworld_: what is the current plan with the ci regression blockers?
<thumper> wallyworld_: do we have one?
<rick_h_> and with split diffs there was much rejoicing! https://github.com/juju/juju-gui/pull/526/files?diff=split
<wallyworld_> thumper: i understood that matt was going to ask about taking those intermittent test failures off the blocker list last night
<wallyworld_> i commented on one of the bugs that i didn't think it was a regression
<sinzui> wallyworld_, What info do we need to ask for to know how to fix bug 1365035
<mup> Bug #1365035: MAAS provider bootstrap: Timeout, server <server> not responding. <bootstrap> <cloud-installer> <landscape> <maas-provider> <timeout> <tools> <juju-core:Triaged> <https://launchpad.net/bugs/1365035>
<wallyworld_> sinzui: i'll read the bug, sec
<thumper> rick_h_: nice
<katco`> wallyworld_: good... 3 PRs up now
<wallyworld_> katco`: 3!!!
<katco`> wallyworld_: well not all today
<katco`> wallyworld_: just 1 today haha
<katco`> wallyworld_: working on another
<wallyworld_> katco`: i'm ocr today but also 1/2 my day is filled with meetings, sigh
<katco`> wallyworld_: have i mentioned how nice of a person you are? :)
<wallyworld_> katco`: maybe, but you can remind me any time :-)
<katco`> wallyworld_: oh great wally, ruler of all things -- for this is your world -- please look upon my PRs favorably
<wallyworld_> lol
<wallyworld_> katco`: and sacrifices, i love sacrifices
<sinzui> wallyworld_, okay, I see the bug I was looking to backport was already targeted by you to 1.20.8
 * katco` sacrifices bugs at the altar of wally
<wallyworld_> sinzui: i'm not sure how "Timeout, server tesla.beretstack not responding." isn't a network failure?
<sinzui> wallyworld_, yep
<sinzui> wallyworld_, the attached log looks a lot like cloud-init-output.log. I think cloud-init did its job and that the agent could talk to the other party
<wallyworld_> sinzui: yeah, looks like it. let's put our head in the sand till 1.20.7 is out and they can have the option to leave the broken system running and then poke around
<wallyworld_> sinzui: so what was the other bug?
<sinzui> wallyworld_, bug 1361374
<mup> Bug #1361374: maas provider assumes machine uses dhcp for eth0 <addressability> <maas-provider> <network> <juju-core:Fix Committed by dimitern> <juju-core 1.20:Triaged> <https://launchpad.net/bugs/1361374>
<wallyworld_> sinzui: ok, i was asked about that one by jorge so i already added it to 1.20.8 to be backported
<sinzui> wallyworld_, I was pretending I was going to release 1.21-alpha1 today so I started reviewing all the bugs
<wallyworld_> there's a lot of them
<sinzui> katco`, you are awesome. You are in a comfortable third place in fixes https://launchpad.net/juju-core/+milestone/1.21-alpha1
<waigani_> thumper: testing server side is done. How shall I test the client side? mock out the FacadeCall ?
<katco`> sinzui: wow, really??
<wallyworld_> sinzui: with the ci blockers - i think there's a couple of intermittent test failures marked as regressions. i don't think they should be as those issues have been around for a while and people are working on improving the tests now for part of every week and there's no way we can quickly fix those
<sinzui> wallyworld_, when we are pressured to release unblessed code, those tests are in the way.
<thumper> waigani_: ideally the client should test against a mock for all but one to show that it is in fact connected
<waigani_> right, I remember you saying that now
<wallyworld_> sinzui: they are agreed. but they are intermittent failures and hard to track down. i fear we will be blocked for a long time if we keep them
<waigani_> thumper: so one unit test which uses the real facade and others using a mock
<thumper> waigani_: yes
<sinzui> wallyworld_, given that ci has only 6 blessed revisions for 1.20.6 after 10 weeks, I think someone can easily say I haven't been strictly enforcing quality. I have a mortgage to pay
 * wallyworld__ is so sick of this kernel bug killing his network all the time :-(
<mattyw> wallyworld__, sinzui I spent time today looking into https://bugs.launchpad.net/juju-core/+bug/1348477. I have some idea what's causing it, and I think I can almost reproduce it on demand
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<mattyw> ^^ but I need to talk to someone more familiar with the code to know for sure how to fix it
<cmars> mattyw, that's great news
<wallyworld__> mattyw: awesome
<wallyworld__> mattyw: what are your thoughts?
<perrito666> nites people
<mattyw> wallyworld__, making this change is enough to cause the error to happen on demand: http://paste.ubuntu.com/8227111/
<mattyw> wallyworld__, although it's still timing related so the more tests you run the more likely you are to see it
<mattyw> wallyworld__, and it appears that the call to SetAdminMongoPassword in juju/testing/conn.go:434 is deleting the admin user - quite a number of watchers access state directly using the admin user. so if that admin user is deleted and a watcher does some work before state is closed you'll get that error
<wallyworld__> mattyw: yeah, that's part of the problem - a lot of our tests suck because of timing issues so running them on different platforms triggers issues
<wallyworld__> mattyw: hmm, i had thought that people had removed direct access to mongo from the business logic
<mattyw> wallyworld__, I experimented with splitting the state.Close call into two functions. One to stop the watchers and another to close the session - that seemed to fix that problem - but we got more errors during the call to dummy.Reset()
<wallyworld__> sounds like a good start
<sinzui> there are 168 tested commits in 1.20-alpha1, only 11 passed. all of which were 2 weeks ago
<perrito666> mattyw: hey man, any luck?
<mattyw> perrito666, we have a mongo log that more or less shows us trying to connect with an old user name
<sinzui> wallyworld__, mattyw ^ I know I am being difficult. We cannot delude ourselves into thinking it is okay to add features when there is no evidence that this juju version is good
<wallyworld__> sinzui: i agree except that we know the test failures are due to issues with the tests
<mattyw> sinzui, I understand
<mattyw> sinzui, wallyworld__ if we could make the auth fails bug less likely to happen is that something that would be useful in the interim?
<wallyworld__> mattyw: sounds good to me
<mattyw> wallyworld__, I'll be going to bed soon - but in the morning I'll see if I can land some stuff to make it less likely to happen - although obviously  whether or not it actually works enough for ci to run will remain to be seen
<mattyw> sinzui, does that sound acceptable to you?
<mattyw> sinzui, also, I think this has been fixed already https://bugs.launchpad.net/juju-core/+bug/1365124
<mup> Bug #1365124: "juju deploy --to <non-existent-machine> <charm-name>" juju still tries to deploy the service. <deploy> <placement> <juju-core:New> <https://launchpad.net/bugs/1365124>
<sinzui> mattyw, me too, I couldn't find the bug...
<sinzui> mattyw, but I can get an interesting error reproducing it: Juju tells me no way
<mattyw> sinzui, ?
<sinzui> mattyw, status still lists an impossible service
<mattyw> sinzui, weird - if you send me the steps to reproduce it I'd be happy to take a look tomorrow
<cmars> sinzui, copy me on that one as well plz
 * thumper takes a deep breath
<thumper> down to only 6 failing tests, and all in cmd/juju
<mattyw> thumper, failing how?
<thumper> mattyw: I've been removing "admin"
<mattyw> thumper, admin?
<thumper> mattyw: the first user is no longer "admin", but the name of the logged in user
<thumper> mattyw: has far reaching impact
<thumper> current unified diff is over 4k
<thumper> will break it down somewhat
<mattyw> thumper, wow
<mattyw> thumper, what have you done with Mr cheney?
<thumper> mattyw: what do you mean?
<mattyw> thumper, he's top of my list of people I want to talk to
<thumper> mattyw: he starts in about 45 min
<mattyw> thumper, oh right yeah - I was sure he was already up this time yesterday
<thumper> mattyw: there was an early meeting yesterday
<mattyw> thumper, that's pretty inconsiderate
<perrito666> mattyw: its like 11pm for you right?
<mattyw> perrito666, yeah - I'm not really working
<mattyw> perrito666, don't worry - I'm not that dedicated
<perrito666> your non working you is amazingly similar to your working you
<perrito666> the proof is in the private ping you just answered on a working communication means
<perrito666> :p
<mattyw> perrito666, that's laziness - not closing the connection there
<voidspace> wwitzel3: ping
 * perrito666 has no moral grounds for this discussion
<mattyw> perrito666, isn't it 9pm for you?
<perrito666> nah 7:30
<perrito666> but I will be here at 9:30
<perrito666> and even later lol
<perrito666> brb
<voidspac_> wwitzel3: ping
 * perrito666 chopping onions and reading flaky test, not sure which one is the one provoking the crying
<thumper> perrito666: are you looking at mattyw's branch and the flakey teardown?
<perrito666> thumper: the teardown in tip
<davecheney> perrito666: if you are working on that bug
<davecheney> please mark it in progress
<perrito666> davecheney: I am working on another caused by the same issue, I'll mark it as soon as I make sure tim and I are not working on the same thing :p
<thumper> ok... tests pass
 * thumper runs make check to test
<menn0> davecheney, thumper: so I've been looking at that CI blocker and the way we find a free port to use for API servers and mongod in tests is crazy
<menn0> just like davecheney said
<menn0> fixing it is hard though
<davecheney> menn0: i found the bit that finds a port for the api server and it is sane
<davecheney> but the way we do for mongo is not sane
<davecheney> menn0: suggestion, add one to the port that the port finding thingy picks
<menn0> davecheney: the same function and approach is used for both
<davecheney> menn0: nah, for mongo we bind, then close then give that address to mongo
<davecheney> for the api server it should just do a bind :0
<davecheney> and use that listener
<menn0> davecheney: adding 1 to the port isn't much better. some other process is just as likely to have grabbed that one
<davecheney> menn0: i don't believe so
<davecheney> we have a roughly 1:10000 chance of two tests getting the same port
<davecheney> my assertion is the Close() leaves the port still in use
<davecheney> for a very short amount of time
<davecheney> so mongo can't bind to that port
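The bind, close, then hand-over pattern being criticised here looks roughly like the sketch below. This is an illustration of the general shape only; findFreePort is a hypothetical name and is not the actual FindTCPPort code.

```go
package main

import (
	"fmt"
	"net"
)

// findFreePort sketches the fragile pattern: bind to port 0, note which port
// the kernel picked, then close the listener and hand the number to a
// different process or goroutine.
func findFreePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	port := l.Addr().(*net.TCPAddr).Port
	// After Close nothing reserves the port any more, so another process
	// may take it before the intended server gets around to binding.
	if err := l.Close(); err != nil {
		return 0, err
	}
	return port, nil
}

func main() {
	port, err := findFreePort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("port that was free a moment ago:", port)
}
```

The returned number is only a hint: the longer the gap between this call and the real bind, the more likely something else has claimed the port.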
<menn0> davecheney: for the jujud tests (i.e. what that ticket is talking about) FindTCPPort is used to generate the state server config and there could be a significant time between when the port was determined and when the API server is started
#juju-dev 2014-09-04
<menn0> davecheney: see cmd/jujud/agent_test.go:347
<davecheney> why is it writing out a file
<davecheney> just to start a mock api server ?
<menn0> the jujud tests often fire up real machines
 * davecheney puts head in hands
 * davecheney starts to sob
<menn0> I'm just the messenger man :)
<menn0> davecheney: what we should probably do is use a port of 0 in the config, let the API server get a port itself and then find out what it was and use that for client connections
<menn0> (as long as we have tests that fire up real API servers)
<menn0> no idea how hard that will actually be in practice
<menn0> davecheney: getting back to your earlier point about two tests getting the same port... what about any other processes on the host? we're not just competing for ports against our own tests here.
<menn0> davecheney: also, you may be right about Close() leaving the port unavailable for a short time afterwards. I think I've seen that before even with SO_REUSEADDR being used (which it is).
<davecheney> it's just a fragile pattern
<menn0> davecheney: but... given that the apiserver retries the listen I don't think that's what's happening here
<menn0> davecheney: agreed. the pattern sucks.
<menn0> davecheney: we either need to expect and handle "socket already in use" or use port 0 when setting up the actual server (where possible)
<menn0> davecheney: shall I update the ticket with some of what we've discussed?
<davecheney> sure
<hatch> hey all - none of the mysql charms (precise/trusty) will deploy 1.20.6-trusty-amd64
<hatch> where should I be filing this bug?
<sinzui> wallyworld__, thumper cross your eyes and toes and hope for a Heisenbug. I am retesting dimitern's commit to add better logging. CI has 2 tests to complete to pass master tip. Dare I say that the commit to add logging decreased the chances of the bug recurring
<wallyworld__> oh dear
<hatch> lazyPower: marcoceppi you guys look like the last people to modify the mysql charms?
<sinzui> hatch, did you try hp and aws? hp required more mem because the charm does bogus steps at setup
<hatch> I'm on aws now on a 1.7GB ram
<sinzui> yeah, that is too low
<sinzui> hatch, 2G seems to be what the charm needs. QA learned that last year with juju 1.14
<sinzui> hatch, on hp the charm needs 4G
<hatch> sinzui:  that doesn't seem like the real issue https://gist.github.com/hatched/cff0bc4929b3b3201bd9
<sinzui> hatch, I am just repeating my experience with that charm in clouds with juju 1.14..1.18. It just doesn't run in under 2G anywhere reliably
<hatch> ok I can try to spin up a bigger one....I've never done that in the GUI...hmm
<hatch> sinzui: so is it just the charm that requires so much? Or mysql?
<sinzui> wallyworld__, 1 test to go...
<wallyworld__> sinzui: maybe i forgot - what's the agreed method to tells devs that ci is blocked? the #juju-dev topic?
<sinzui> hatch, mysql can be configured to be greedy and fast, or modest and slow...the charm tuned mysql to need too much at the start
<sinzui> wallyworld__, I have been using the topic
<wallyworld__> sinzui: ok, could we make the text then reflect that these are the critical bugs blocking ci? since not all criticals do
<sinzui> wallyworld__, and I haven't updated the topic since this morning.
<wallyworld__> s/Open critical bugs/Critical bugs blocking CI
<sinzui> wallyworld__, lets be honest, critical means do this now.
<sinzui> wallyworld__, now we get to talk about what to do now
<wallyworld__> yes, but not all devs need to care about all criticals, but they do need to know if CI is blocked
<sinzui> wallyworld__, in a few minutes http://reports.vapour.ws/releases is going to list build 1785 of master 5ecf58fb as blessed. The only change is the api name changes and logging
<hatch> sinzui: even using a massive instance i get the same error, looks like the mysql charms are just busted on ec2 and lxc
<wallyworld__> sinzui: that sucks in a way, shows just how fragile our tests are :-(
<sinzui> wallyworld__, We can demote the remaining bugs to high, opening master to mass landings
<wallyworld__> that will make thumper happy
<hatch> now where do I file bugs....man this must be super frustrating for new users, finding where the real source is, then trying to find somewhere to file bugs
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Blocking bugs: None
<sinzui> wallyworld__, http://reports.vapour.ws/releases/1785
<wallyworld__> o/
<wallyworld__> \o/
 * wallyworld__ merges katco's harvesting mode branch
<sinzui> wallyworld__, I think abentley can release 1.20.7 tomorrow while I prep for 1.21-alpha1 for the day after
<wallyworld__> sounds good
<wallyworld__> and 1.20.8 next week sometime
<hatch> sinzui: so....do you have any idea where I'm supposed to file bugs? Best I can come up with is https://bugs.launchpad.net/charms/trusty/+source/mysql but I'm pretty sure that's not the real repository
<wallyworld__> axw: you online?
<axw> wallyworld__: I am
<axw> good morning
<wallyworld__> morning :-)
<axw> hooray, CI is happy again
<axw> hum
<wallyworld__> axw: with the branch to mock out the provisioner api call - there's code to do that for uniter and usermanager facades (but duplicated). can we look to set up some common, shared code to do this?
<axw> wallyworld__: not sure it's worth it, it still requires in-package code because of the need to access private field
<axw> s
<wallyworld__> fair enough. perhaps then we should make sure all the implementations are the same
<axw> I looked at the usermanager one and wasn't keen on it, I'll take a look at the uniter one
<wallyworld__> it's pretty much the same
<wallyworld__> whatever we decide, i think consistency is best
<axw> spose so. I'll take another shot
<menn0> sinzui, wallyworld__ : I've updated bug 1364410 with some more details based on a discussion davecheney and I had earlier. We don't think it's PPC related.
<mup> Bug #1364410: Timeout TestManageEnviron MachineSuite in ppc64el <ci> <intermittent-failure> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1364410>
<wallyworld__> menn0: i would be surprised if it were
<wallyworld__> menn0: i like option 1 as well. i wonder why the fark we didn't implement it that way to start with :-(
<menn0> wallyworld__, sinzui: tags and title updated
<menn0> wallyworld__: possibly because it's hard to pull the port out of the depths of the API server
<wallyworld__> menn0: well, that's something that should have been considered in the design
<menn0> wallyworld__: and possibly because it makes managing the API server configuration more difficult
<wallyworld__> IMO
<menn0> wallyworld__: agreed
<sinzui> hatch, That is the real location of the bug tracker, and this shows the recommended and user branches that were converted to charms https://code.launchpad.net/charms/trusty/+source/mysql
<wallyworld__> menn0: sometimes a just despair of how stuff has been written without concurrency in mind. i mean, the port issue should have been clearly raising alarm bells when the code was developed
<wallyworld__> s/a/I
<hatch> sinzui:  yeah found it I filed a bug https://bugs.launchpad.net/charms/+source/mysql/+bug/1365205
<mup> Bug #1365205: Charm cannot be deployed, fails on install hook with gpg error <mysql (Juju Charms Collection):New> <https://launchpad.net/bugs/1365205>
<menn0> wallyworld__: the docstring for FindTCPPort even mentions that there is a race but that the probability of an actual problem should "hopefully" be small
<menn0> wallyworld__: that didn't work out :)
 * wallyworld__ cries on the inside
<wallyworld__> ffs
<axw> menn0 wallyworld__: not sure if it's helpful or not, but I changed apiserver not too long ago to take a net.Listener as an arg, for the purpose of getting its port
<axw> i.e. listen on port 0, get the port, then pass the listener to api server
<wallyworld__> hmmmm, that may be useful indeed
<menn0> axw: that sounds pretty helpful
<menn0> axw: the problem is the config is generated and then the server is started later... and I think that config is also used for establishing client connections
<menn0> so there will be a bit of faff to get the port actually used by the API server to where it's needed to allow clients to connect
<axw> menn0: so I see. yuck
<axw> menn0: one option is to store a net.Listener there (in primeStateAgent), then override net.Listen so that it returns that listener
<axw> a bit horrible maybe
<menn0> axw: yeah a bit... but at least it stops the tests breaking every now and again
<thumper> wallyworld__, sinzui: there is a race condition in our tests...
<thumper> no surprise there
<thumper> but the location is interesting
<wallyworld__> a??
<wallyworld__> several
<wallyworld__> many
<wallyworld__> lots
<thumper> we open a port (:0) to figure out which port to use for mongo
<thumper> we then close it
<thumper> and pass it to mongo
<thumper> and mongo uses it
<thumper> sometimes it isn't fully closed before mongo tries
<thumper> we could add a retry thingy
<thumper> around mongo
<thumper> checking for "port in use"
<thumper> just a few times
<thumper> should reduce the chance of it happening
<wallyworld__> we could
<thumper> funnily enough, we can't test that the port is closed properly without opening it
<thumper> can we?
<wallyworld__> or wait for the port to close before passing to mongo
<menn0> thumper: are you referring to bug 1364410 or something else?
<mup> Bug #1364410: API server fails to start with "address already in use" in MachineSuite tests <ci> <intermittent-failure> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1364410>
<wallyworld__> not sure
<thumper> menn0: yes, that one
<menn0> thumper: because that bug is about the API server, not mongo
<thumper> same problem
<menn0> yes
<thumper> we open a random port and pass it on
<menn0> yes
<wallyworld__> doesn't mongo support passing in 0 for the port?
<menn0> we've already been talking about it quite a bit here
<thumper> menn0: oh, ok, I wasn't watching...
<wallyworld__> it chooses one and reports what it is listening on
<thumper> I'll leave it to someone else :)
<wallyworld__> maybe it doesn't support that
<menn0> thumper: read scrollback and check the updates to the ticket :)
<menn0> for mongo we do actually retry a bunch of times as you suggest
<menn0> see juju/testing/mgo.go:167
<menn0> it's a bit harder for the API server
<menn0> bbs
<axw> wallyworld__: you can't use port 0 in mongo
<wallyworld__> i was wishing more than anything
<axw> wallyworld__: we could maybe do something horrible like get it to use a unix socket and redirect a socket to that
<axw> but it would be better just to not use mongo in the unit tests :)
<wallyworld__> you think? :-)
<axw> wallyworld__: I updated the api test mocking PR
<wallyworld__> you mean "unit" tests right?
<wallyworld__> ok
<axw> unified them around a common patching thingy
<wallyworld__> \o/
<wallyworld__> i'm still looking at 663
<axw> okey dokey
<axw> I'll go back to fixing the tools URL stuff I emailed about
<wallyworld__> ok
<perrito666> uh am I late for bashing our tests for using mongo?
<perrito666> ahh I am
<davecheney> has anyone tried the new side by side diffing on github yet ?
<axw> what, this is news
<perrito666> oh I just did
<axw> yay
<perrito666> its glorious
<perrito666> and it remembers my choice
<perrito666> sweeeeet
 * axw removes octosplit
<perrito666> and it highlights the diffed parts
 * perrito666 cries
<perrito666> our whining has been heard
<davecheney> thumper: menn0 https://github.com/juju/juju/pull/670
<davecheney> a slightly controversial one
<axw> wallyworld__: that comment about failing is wrong, it just overwrites now
<axw> sorry
<axw> I had originally intended it to, but changed my mind
<thumper> davecheney: is it possible to just change it in state, but leave it with the api?
<axw> this is why the hash is encoded in the path, so if the put to the blobstore OR to mongo fails, we don't end up with inconsistent blob/metadata
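Encoding the hash in the path, as described above, means a blob's location is determined by its content, so a half-finished write cannot leave metadata pointing at different bytes under the same name. A minimal sketch, assuming SHA-384 and a hypothetical "prefix/hexdigest" layout (neither is confirmed as the actual tools storage scheme):

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// blobPath derives a storage path from the content itself. The layout
// "prefix/hexdigest" is an assumption made for this example.
func blobPath(prefix string, data []byte) string {
	sum := sha512.Sum384(data)
	return prefix + "/" + hex.EncodeToString(sum[:])
}

func main() {
	v1 := []byte("tools tarball, first upload")
	v2 := []byte("tools tarball, second upload")

	// Identical content always maps to the same path, so re-uploading it
	// is harmless; different content can never land on the same path.
	fmt.Println(blobPath("tools", v1) == blobPath("tools", v1))
	fmt.Println(blobPath("tools", v1) == blobPath("tools", v2))
}
```

With this layout, metadata that records a path is either valid for exactly those bytes or refers to a blob that does not exist yet, which is easier to detect than a mismatch.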
<davecheney> thumper: nope
<davecheney> thumper: do you want a long discussion or a short discussion about why this type doesn't add anything
<thumper> short
<thumper> the only benefit I see is that it is shorter to type
<davecheney> thumper: http://play.golang.org/p/4KQ6WrhFGf
<davecheney> there are no methods on StatusData
<davecheney> all it does is obscure what is actually being stored in the StatusData field
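The point about a method-less named type can be illustrated with a small sketch. The StatusData definition below is assumed from the discussion (a map[string]interface{} with no methods), not copied from the juju source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StatusData mirrors the kind of named type under discussion: a map type
// with no methods of its own. Its definition here is an assumption.
type StatusData map[string]interface{}

// describe accepts the named type and renders it as JSON.
func describe(d StatusData) string {
	out, err := json.Marshal(d)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	plain := map[string]interface{}{"attempts": 3}

	// A plain map is assignable to the named type without a conversion,
	// because both share the same underlying type and one is unnamed.
	fmt.Println(describe(plain))
	fmt.Println(describe(StatusData{"attempts": 3}))
}
```

Both calls produce the same JSON, which is the sense in which the name adds no behaviour; whether it still earns its keep as documentation is the judgment call argued over in the channel.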
 * thumper nods
<davecheney> most of the code churn is in the tests
<davecheney> as usual
<davecheney> the actual api code, apart from the method sigs
<davecheney> is unaware of the change
<davecheney> thumper: thanks
<davecheney> this will probably be the most controversial change
<davecheney> the rest are more mechanical
<menn0> davecheney, thumper: FWIW, I'm -0 on this. I prefer the type aliases sometimes. They provide documentation and in this case, repeatedly spelling out of an otherwise awkward type.
<menn0> map[string]interface{} is ugly to look at
<menn0> especially when it's used a lot
<davecheney> menn0: i dunno how to incorporate signed zero feedback
<davecheney> either you're +1 or -1
<menn0> it means, I don't like it, but I don't feel strongly enough about it that I'm going to ask you to stop
<davecheney> fair enough
<menn0> too late now anyway
<davecheney> noted
<davecheney> i don't think there will be many more cases like this
<lazyPower> hatch: bug tracker for the charm is on the GUI - http://i.imgur.com/MW05eAO.png, it should be in the README though.
<lazyPower> hatch: what's going on with MYSQL? did you get a bug filed that I can look at?
<thumper> got this interesting intermittent failure: http://paste.ubuntu.com/8228881/
<thumper> why the 2 minute waits?
<thumper> anyone got any ideas?
<wallyworld__> thumper: i *think* the waits are added to allow mongo replicaset to come up
<wallyworld__> from memory
<thumper> ugh
<wallyworld__> which sucks. it's bad enough we have "unit" tests with mongo required, but to also turn on the replicaset stuff for *all* tests is even worse
<wallyworld__> if we must use mongo for now for tests, why have the replicaset stuff enabled unnecessarily
<davecheney> wallyworld__: didn't someone note that recently
<davecheney> i have a memory there was a note in the weekly minutes to stop doing that to ourselves
<wallyworld__> davecheney: probably, but it hasn't been done i don't think
<wallyworld__> i was hoping nate would be doing it since he's looking at the replicaset tests
<wallyworld__> i'll ask him
<davecheney> ta
<axw> wallyworld__: I don't understand your comment about Tools tests & primitives
<axw> what do you want to see?
<wallyworld__> axw: it was a suggestion that we use things other than the component under test to test itself. so for Tools(), we would get a blobstore, write some data manually, and check that Tools() can load it. ie a simplified bit of code without error checking and with hard coded values or whatever
<wallyworld__> AddTools() can still use Tools() in its test because Tools() has been separately tested
<axw> I see
<axw> ok
<wallyworld__> does that make sense? do you agree?
<davecheney> thumper: menn0-afk https://github.com/juju/juju/pull/671
<davecheney> one more, less contentious this time
<wallyworld__> otherwise i could write an implementation that passes the tests but which doesn't work
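The testing idea above (check Tools() against data written by some other means, then let the AddTools() test rely on Tools()) might be sketched as follows. The toolsStore type and its map-backed storage are hypothetical simplifications standing in for the real blobstore-backed code.

```go
package main

import (
	"errors"
	"fmt"
)

// toolsStore is a hypothetical storage type; the map stands in for the
// blobstore that the real implementation would use.
type toolsStore struct {
	blobs map[string][]byte
}

func (s *toolsStore) AddTools(version string, data []byte) {
	s.blobs[version] = data
}

func (s *toolsStore) Tools(version string) ([]byte, error) {
	data, ok := s.blobs[version]
	if !ok {
		return nil, errors.New("tools not found: " + version)
	}
	return data, nil
}

// checkTools exercises Tools without going through AddTools: the backing
// store is populated by hand with hard-coded values.
func checkTools() error {
	s := &toolsStore{blobs: map[string][]byte{"1.20.6": []byte("payload")}}
	got, err := s.Tools("1.20.6")
	if err != nil {
		return err
	}
	if string(got) != "payload" {
		return fmt.Errorf("tools returned %q, want %q", got, "payload")
	}
	return nil
}

// checkAddTools may rely on Tools, since Tools was verified independently.
func checkAddTools() error {
	s := &toolsStore{blobs: make(map[string][]byte)}
	s.AddTools("1.20.7", []byte("other"))
	got, err := s.Tools("1.20.7")
	if err != nil {
		return err
	}
	if string(got) != "other" {
		return fmt.Errorf("tools returned %q, want %q", got, "other")
	}
	return nil
}

func main() {
	fmt.Println("Tools check:", checkTools())
	fmt.Println("AddTools check:", checkAddTools())
}
```

If both checks went through AddTools and Tools together, a pair of implementations that agree with each other but store nothing useful would still pass.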
 * thumper fears the rebase
 * thumper takes a deep breath
<davecheney> thumper: never go the full rebase
<thumper> why?
<davecheney> thumper: yes, agent/bootstrap.go depends on state
 * davecheney really wishes our package had package comments
<davecheney> that way I wouldn't have to guess what the agent package did
<davecheney> thumper: i agree, to swap one dependency for another would not be a win
<davecheney> but agent already depends on state
<thumper> yep
<davecheney> now, i'm not sure if that makes sense
<thumper> that is why I'm fine with this
<davecheney> but i don't really know what the agent package is
<davecheney> i'm guessing it's helpers for jujud agent processes
<thumper> yeah
<thumper> ideally it wouldn't depend on state
<davecheney> no
<davecheney> i can put that on the end of the list if you like
<davecheney> thumper: that might be it for the changes today
<davecheney> need to do arm64 paperwork this afternoon
<thumper> kk
<davecheney> and figure out how to come up with a plan for how to schedule 5 bits of concurrent work
<menn0> davecheney: that PR you just submitted has 7 commits, 4 of which have a commit message of just "wip"
<davecheney> menn0: yup
<davecheney> i don't rebase
<davecheney> i can't make it work
<menn0> davecheney: that's awful
<davecheney> yup
<davecheney> we've discussed this before
<menn0> git rebase -i HEAD~<number of revs back>
<menn0> easy
<davecheney> there has been no firm guidance here
<menn0> but you have to admit that a commit with a message of "wip" is pretty useless
<davecheney> yup
<davecheney> it's my old process from bzr
<davecheney> menn0: i'm hoping that the bot will grow the ability to squash commits
<menn0> davecheney: but squashing commits or even just rewording the commit messages is ridiculously easy
<menn0> davecheney: nevermind.... I doubt I'm going to change your mind
<thumper> menn0: I have a conflict during rebase
<thumper> and I want to do a three way merge to see
<thumper> any idea how?
<menn0> do you have a favourite merge tool? (e.g. meld or kdiff3)
<thumper> yes
<thumper> either of those is fine
<menn0> if so, type "git mergetool" at the conflict
<menn0> select the tool you want to use
<menn0> and it launches so that you can do the merge
<menn0> that's what I do
<thumper> what do you mean "at the conflict" ?
<menn0> when the rebase stops due to the conflict
<thumper> yes...
<thumper> anything else?
<menn0> then you type "git mergetool"
<menn0> does that work for you?
<menn0> thumper: ^^^?
<thumper> kinda
 * thumper is in meld hoping he is doing the right thing
<menn0> with meld I usually use "Merge All" and then check how well it did
<menn0> it normally does a pretty good job
<menn0> but for really tricky conflicts you sometimes need to do some hand editing or manual merges
<menn0> I didn't realise for sometime after I started using meld that you can edit in the middle pane
<menn0> :)
<menn0> thumper: ^^
 * thumper runs make to test
<thumper> haha
<thumper> everything fails
<thumper> fixed the two trivials
 * thumper waits to try again
<thumper> coffee machine is calling
<wwitzel3> is there an easy way to push a new build to an existing juju environment?
<wallyworld__> wwitzel3: juju upgrade-juju --upload-tools
<wwitzel3> wallyworld__: thanks
<wallyworld__> i've not used it myself :-)
<wwitzel3> I'm just peppering the code with Debugf statements and it gets old rebuilding the environment each time
<wallyworld__> yup
<thumper> wallyworld__: or axw: https://github.com/juju/juju/pull/673 - trivial move
<wallyworld__> ok
<thumper> gah...
 * thumper rebased the wrong branch
<thumper> FFS
<thumper> how?
<thumper> nope
<thumper> did the right branch
<thumper> tried to merge the wrong branch
<thumper> d'uh
<hatch> hey anyone here know anything about haproxy? I'm trying to find documentation, release notes, etc, but the 'official' site hasn't been updated since 2013
<thumper> ugh...
<thumper> I need to apply a reverse merge for just one file from a previous commit
<thumper> git master help needed
<thumper> just hit: [LOG] 0:00.293 ERROR juju.worker exited "apiserver": listen tcp :60622: bind: address already in use
<hatch> thumper: so you want to remove one file from a previous commit?
<thumper> hatch: remove the changes to one file from a previous commit, yes
<hatch> thumper:  is it a recent commit?
<thumper> two back
<hatch> git rebase -i HEAD~2 then change 'pick' for the commit in question to 'edit'
<hatch> if I am remembering correctly
<hatch> I believe edit will allow you to modify the commit
<thumper> hmm... ok
<hatch> then once you do the changes you do `git commit --amend` I believe
<hatch> sorry I don't have a local repo to test with atm :)
<hatch> thumper: work?
<thumper> yeah
<hatch> excellent
<hatch> thumper: rebase has a lot of really awesome functionality, if you ever get some free time it's definitely worth reading the docs on it
<wallyworld__> axw: whenever you're free, i've updated blobstore to just use sha384 checksums https://github.com/juju/blobstore/pull/14
<axw> looking
<wallyworld__> axw: ty, i've merged as we don't have a landing bot yet
<axw> cool
<davecheney> menn0: i'm sorry about the ugly commit
<davecheney> i'll try to produce better PR's in the future
<menn0> davecheney: :)
<davecheney> menn0: thumper
<davecheney> i'm at the stage where I have a bunch of enumeration types
<davecheney> lets say params.Life
<davecheney> that's an easy one
<davecheney> they are defined in params
<davecheney> but used in state as they are stored in the database
<davecheney> now, the first suggestion would be to move them into state
<davecheney> which makes sense for the apiserver
<davecheney> but that means api clients would also be importing state
<davecheney> and that feels wrong
<thumper> yes...
<davecheney> thoughts ?
<thumper> this is why we want a general separate package that handles that :)
<davecheney> i think there is general agreement that the api/ packages should know nothing about state
 * thumper is being called for dinner
<thumper> agreed
<davecheney> thumper: yes, but then both state and the apiserver and api clients have to import those
<davecheney> and that feels like odd coupling
<davecheney> i'm thinking about using raw types in state, ie strings
<davecheney> then having parse and tostring() in the params package
<davecheney> so the api and apiserver deal with enumerated types
<davecheney> and we convert them to strings to be stored in the database
<davecheney> or something
<davecheney> similar to the method I wrote that converts
<davecheney> state.StateServingInfo -> params.StateServingInfo
<menn0> davecheney: seems ok except we then don't have nice friendly constants to use in state code
<davecheney> menn0: i agree
<menn0> which kinda sucks
<davecheney> and, taking params.Life as an example again
<davecheney> it does have helper methods on it that make it more than just an alias, like StatusData was
<davecheney> menn0: hang on, give me a sec to try something
<davecheney> hmm, what about
<davecheney> package state; type Life int; const ( LifeAlive Life = iota; ... )
<davecheney> ^ this is what we have in params at the moment
<davecheney> then in apiserver/params
<davecheney> type Life state.Life
<davecheney> these should still marshal across the wire
<davecheney> and callers of the api won't be able to tell
<davecheney> there will be a transitive dependency from api -> apiserver -> state
<davecheney> but not a direct one
<davecheney> it will also prevent anyone using api/ types in state
<menn0> that seems pretty good to me
<davecheney> let me try a PR
<davecheney> see how it looks
<davecheney> that means in the apiserver
<davecheney> you do
<davecheney> f(life params.Life) { st.Something(state.Life(life)) }
<davecheney> which is probably acceptable, given the constraints
<menn0> and if we decide that there needs to be a difference between the constants in the apiserver vs state, it could be unpicked fairly easily
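The idea davecheney describes above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two types would live in separate packages (state and apiserver/params), but here they are collapsed into one file, and the names StateLife, ParamsLife, describe and handle are hypothetical. One thing the sketch makes visible is that a type declared on top of another named type shares its underlying type but not its methods, so the conversion at the boundary is explicit.

```go
package main

import "fmt"

// StateLife stands in for a hypothetical state.Life enumeration.
type StateLife int

const (
	Alive StateLife = iota
	Dying
	Dead
)

// String gives the enumeration a readable form.
func (l StateLife) String() string {
	switch l {
	case Alive:
		return "alive"
	case Dying:
		return "dying"
	case Dead:
		return "dead"
	}
	return "unknown"
}

// ParamsLife stands in for a hypothetical params.Life declared as
// "type Life state.Life". It shares the underlying type but does not
// inherit StateLife's methods.
type ParamsLife StateLife

// describe plays the role of a state-layer function that only accepts
// the state type.
func describe(l StateLife) string {
	return "unit is " + l.String()
}

// handle plays the role of an apiserver method: it receives the API
// type and converts explicitly before calling into the state layer.
func handle(l ParamsLife) string {
	return describe(StateLife(l))
}

func main() {
	fmt.Println(handle(ParamsLife(Dying)))
}
```

If the two layers ever need different constants, only the conversion inside handle has to change, which matches menn0's point that the arrangement could be unpicked fairly easily.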
 * menn0 is EOD...
<davecheney> kk
<fwereade> wallyworld__, I have to go into town to see the dentist shortly; I think there's just one major unresolved thing in the qos/status stuff, but it's a biggie: it's the per-relation statuses
<fwereade> wallyworld__, I can't see a way to do health well without relation granularity at least
<wallyworld__> fwereade: ok, i'm updating some comments in the doc now, yet to get to that bit
<wallyworld__> but i will
<fwereade> wallyworld__, and I can't see a way to do relation-granular statuses easily and comprehensibly either
<wallyworld__> fwereade: i have soccer in an hour or so, perhaps we can talk after the TL meeting?
<fwereade> wallyworld__, yeah sgtm
<wallyworld__> fwereade: what time is it?
<fwereade> wallyworld__, FWIW it's not really about stuff in the impl doc, it's unresolved questions in the reqs doc
<wallyworld__> tooth-hurty :-D
<fwereade> wallyworld__, haha
<wallyworld__> thought you'd "appreciate" that
<fwereade> wallyworld__, oh, I do, I had a 2:30 dentist appointment when I was small and told the dentist that exact joke
<wallyworld__> \o/
<fwereade> wallyworld__, he was very nice and pretended he'd never heard it before
<wallyworld__> lol
 * fwereade has to go
<wallyworld__> axw: do we need to, or should we be, including the uuid in the tools url? are tools something that we care about segregating per environment?
<axw> wallyworld__: they are in the URL
<wallyworld__> axw: yes, that's what i'm referring to
<axw> https://<apiaddr>/environment/<uuid>/tools/<version>
<axw> wallyworld__: ah, sorry thought you thought they weren't and should be
<axw> um
<wallyworld__> one sec, someone at door
<axw> wallyworld__: I think one user should not affect another user's env by manipulating tools
<wallyworld__> axw: that's a fair point
<wallyworld__> and the data is de-duped anyway
<axw> they're going to be deduped in the blobstore
<axw> yeah
<wallyworld__> yup :-)
<axw> wallyworld__: we need to support 1.18 upgrading directly to 1.21? my understanding was that the client needs to be compatible, but upgrades still need to go through the hops
<wallyworld__> axw: yes, correct, upgrade steps are run one version at a time. i was worried if older clients attempted to use that attribute if it is taken away, i could be wrong
<axw> wallyworld__: I mean, I thought the client still was required to upgrade to 1.20 first, and then to 1.21
<wallyworld__> axw: we need to support people using juju 1.18 clients "forever"
<wallyworld__> even with the back end upgraded
<axw> wallyworld__: everything except upgrading directly to 1.21 will work still
<axw> if you want to upgrade you still need to go to 1.20, then to 1.21 (this is my assumption)
<wallyworld__> axw: sure, but i might upgrade to 1.22 and you may still want to use an older 1.18 client
<wallyworld__> we need to always ensure 1.18 clients can be used with any 1.20, 1.22 etc
<axw> wallyworld__: that'll be fine. the bit of code in question is only relevant to the upgrader
<wallyworld__> axw: ok, np. just being doubly sure by asking the question
<axw> sure
<wallyworld__> we'd be in the shit if stuff broke :-)
<thumper> wallyworld__, davecheney, axw, someone: https://github.com/juju/juju/pull/675
<wallyworld__> thumper: currently reviewing, will look after i finish unless i have to bail for soccer
<thumper> wallyworld__: ack
<davecheney> menn0: state already has a Life type
<davecheney> and it's defined as a uint8 ...
<davecheney> so we have params.Life, a string
<davecheney> and state.Life, an uint8
<thumper> ick
<thumper> wallyworld__: if you are happy with my branch, please add the merge flags
<wallyworld__> will do
<thumper> wallyworld__: I'm off for the evening until meeting time
<thumper> cheers
<mattyw> morning all
<wallyworld__> o/
<mattyw> wallyworld__, thanks for merging my branch - it makes me very happy
<wallyworld__> mattyw: i landed your branch for you :-)
<wallyworld__> np
<wallyworld__> mattyw: i wanted to get in in case CI was blocked again before your SOD :-)
<mattyw> wallyworld__, I appreciate it, thanks very much
<wallyworld__> anytime
<dimitern> morning all
<wallyworld__> axw: off to soccer, i lgtm'ed your pr with a suggested change
<axw> wallyworld__: thanks
<axw> enjoy
<tasdomas> morning
<tasdomas> could somebody take a look at https://github.com/juju/juju/pull/667 ?
<dimitern> tasdomas, morning, and looking :)
<tasdomas> dimitern, thanks
<dimitern> tasdomas, reviewed, i'm afraid it looks like it needs a bit more
<tasdomas> dimitern, thanks
<TheMue> morning btw
<voidspace> TheMue: morning
<TheMue> voidspace: ah, good to see you. sharing room in Brussels?
 * TheMue btw is fighting with code that worked yesterday before merging but doesn't anymore *wonder*
<voidspace> TheMue: I have a roomie TheMue
<voidspace> TheMue: if that was an offer, which I assume it was, thank you
 * voidspace is also wrestling, but with lxc not code
<TheMue> voidspace: ok, thx for info
<gsamfira> if anyone has a few minutes to spare, can I have a review on: https://github.com/juju/utils/pull/27 ?
<jam2> morning TheMue
<TheMue> jam2: hi jam
<jam2> voidspace: I read that for a second as wrestling with your roomie.
<jam2> TheMue: I missed you yesterday, was I just not around when you were, or did I miss something else?
<TheMue> jam2: I thought I said on Tuesday that I had a doc appointment
<jam2> TheMue: it is certainly possible that you did and I just forgot. No problem. Just turned out that Voidspace had a doc appointment, and dimiter had a power outage all during standup time.
<jam2> I was so lonely.
<TheMue> jam2: my whole working day moved in the afternoon and evening, at about 11 I finished with merging and fixing conflicts :)
<TheMue> jam2: hehe, next time I will be with you using my mobile hangout :D
<TheMue> jam2: I even had to break my work in the afternoon for a short while, I got an urgent call by my daughter
<TheMue> jam2: she and her friend bought furniture and wanted to rent a larger transporter. but they are too young
<TheMue> jam2: so they needed daddy as driver
<jam2> natefinch: I commented on https://github.com/juju/juju/pull/512
<voidspace> jam2: heh, not yet wrestling with a roomie
<voidspace> jam2: that comes soon
<voidspace> jam2: I finally have juju running on an lxc under trickle
<jam2> voidspace: beware of thumper, he has a history there
<voidspace> hah, I can believe it
<voidspace> jam2: the fact that replicasets are taking *an age* to complete implies that it's working
<voidspace> caching makes it slightly hard to tell
<jam2> voidspace: do you know about drop cache ?
<voidspace> jam2: I don't, but I will shortly... thanks
<jam2> voidspace: http://www.linuxinsight.com/proc_sys_vm_drop_caches.html
<voidspace> I didn't rate limit it too much whilst I was installing packages and fetching the juju dependencies
<voidspace> 2MB/s
<voidspace> I completely failed to get nbd-server to read a config file and actually start
<voidspace> I ended up forcing it to skip the config file and specifying the parameters at the command line, which is actually a deprecated way of starting it
<voidspace> https://plus.google.com/114852031032123777881/posts/LgLST5uPHsJ
<stub> Interesting... my agent died while in debug-hooks. Anything I can do to diagnose? Its lxc.
<voidspace> and as a testimony to my experimentation I now have an lxc container I can't destroy (as its filesystem is on a disk that no longer exists)
<jam2> voidspace: and inode handles keep them all alive?
<jam2> stub: so you can pastebin the panic
<stub> http://pastebin.ubuntu.com/8231693/
<stub> jam2: this chan prob better ;)
<jam2> stub: it looks cut off at the start, though I'm not sure
<jam2> I only see from goroutine 28
<stub> jam2: yeah, hang on. paste fail.
 * stub looks to see if it logged locally, outside of debug hooks terminal
<stub> http://pastebin.ubuntu.com/8231715/
<stub> jam2: So, yeah. I typed 'relation-get db' or something and killed it :)
<jam2> stub: so yeah... if a Charm can Panic Jujud, that's *bad*. Can you file a bug with just that first part of the traceback?
<jam2> I can understand how that code could be triggered
<jam2> (something assumes that it is getting a valid object, and then turns it into a Tag that assumes the input is well formatted and panics if it isn't)
<jam2> davecheney: ^^
<jam2> looks like a case where we might be trying to validate input
<jam2> but it is getting turned into a Tag before we query the DB
 * davecheney looks
<jam2> but that panic() rather than giving a "no such thing" error
<stub> relation-get - db
<stub> That was what I typed
<davecheney> yup, the bug is github.com/juju/juju/state/api/uniter.(*RelationUnit).ReadSettings
<davecheney> is using NewUnitTag
<jam2> stub: sure, it probably is supposed to be "relation-get unit-db-0"
<davecheney> not ParseUnitTag
<stub> yeah, it was my fat fingers
<jam2> stub: but regardless, it is a bug in Jujud code
<davecheney> jam2: hmm, not sure
<jam2> that we would panic() because of bad user input
<davecheney> i think it should be db/0
<davecheney> that is the name of the unit
<jam2> davecheney: sure, I'm not quibbling on what Stub should type, but *jujud* should not panic because of a typo in a charm
<davecheney> jam2: no argument there
<jam2> stub: I'd consider this a Critical bug, fwiw, since it means a bad charm makes your environment unreliable
<jam2> stub: or davecheney: can you file a bug to track this? I'm currently in 2 other conversations
<stub> https://bugs.launchpad.net/juju-core/+bug/1365412
<mup> Bug #1365412: relation-get with invalid relation name panics agent <juju-core:New> <https://launchpad.net/bugs/1365412>
<TheMue> aaaargh, found why it's not working anymore
<jam2> stub: is this with 1.20 or only with juju trunk?
<stub> jam2: 1.20
<tasdomas> dimitern, thanks for the review - I've responded to the comments
<voidspace> jam2: I think it's lxc configuration that prevents it from destroying the container
<voidspace> jam2: it shows up under lxc-ls but lxc-destroy reports that the container doesn't exist
<davecheney> jam2: stub i have a fix
<davecheney> PR coming RSN
<jam2> great davecheney !
<dimitern> tasdomas, cheers, will look in a bit
<davecheney> jam2: stub https://github.com/juju/juju/pull/677
<jam2> davecheney: is there a function like NewUnitTag that would do the check for us rather than doing IsValidUnit call?
<davecheney> jam2: nope
<davecheney> the value we're handling is not a tag
<davecheney> so names.ParseUnitTag() is not the right hammer either
<stub> My, that looks like parameter validation ;)
<davecheney> but, there is a simpler fix
<perrito666> good morning
<jam2> davecheney: ah right, it is the name not the tag
<davecheney> if you're prepared to send this to the API server
 * perrito666 sees a heavy storm outside and announces that he might be suddenly disconnected
<davecheney> note that we're turning the unit name into a tag, then calling tag.String() on it to pass it to the api
<davecheney> why both ?
<davecheney> why bother
<davecheney> actually, not
<davecheney> we need to
<davecheney> because the api only accepts tags
<davecheney> this fix is correct
<jam2> davecheney: is there a reason NewUnitTag should panic() rather than having error like other things?
<davecheney> yes, it's a testing function
<davecheney> its use in prod code is a smell
<davecheney> i discourage anyone from calling it
<jam2> davecheney: how do you turn user input into name.Tag objects then?
<davecheney> depends
<davecheney> if the input is a string form of a tag with ParseTag
<davecheney> if not, we need to validate it first
<davecheney> then handle it with care
<TheMue> *iiirks*
<jam2> davecheney: it sounds like we want something similar to ParseTag (ParseName) ?
<jam2> davecheney: it certainly seems like validating user input for these things should be common code in the names package
<jam2> anyway, your fix seems fine for now, but it doesn't seem "ideal" in the design sense.
<davecheney> jam2: i agree
<davecheney> the use of NewXXXTag is very dangerous
<davecheney> i discourage everyone from using it
<davecheney> my use of it in this case was a mistake because I think at the time i started this work I didn't understand the scope of the problem
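The "validate it first, then handle it with care" approach discussed above might look something like the sketch below. It does not use the real names package: validUnitName is a simplified, hypothetical pattern, and unitTagFromName is a hypothetical helper. The only things taken from the conversation are the example unit name "db/0", the tag form "unit-db-0", and the goal of returning an error rather than panicking on bad user input.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// validUnitName is a simplified, hypothetical pattern for names such as
// "db/0"; the real rules in the names package may differ.
var validUnitName = regexp.MustCompile(`^[a-z][a-z0-9]*(-[a-z0-9]+)*/[0-9]+$`)

// unitTagFromName validates untrusted input and returns an error
// instead of panicking when the name is malformed.
func unitTagFromName(name string) (string, error) {
	if !validUnitName.MatchString(name) {
		return "", fmt.Errorf("%q is not a valid unit name", name)
	}
	return "unit-" + strings.Replace(name, "/", "-", 1), nil
}

func main() {
	// "-" is the kind of stray argument that triggered the panic.
	for _, in := range []string{"db/0", "-"} {
		tag, err := unitTagFromName(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(tag)
	}
}
```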
<TheMue> jam2: is it intended that once the client requests a higher version of an API that doesn't exist on the server even if the API and the called function exist in a lower valid version, it fails (means: returns an error)?
<jam2> TheMue: it is intended, yes
<jam2> the client is responsible for noticing that the higher version is not available
<TheMue> jam2: so when introducing a new version maybe the whole client has to be changed, because now an error is returned and an explicit call to a lower version (which the code was intentionally written for) has to be made
<jam2> TheMue: BestAPIVersion has already worked out what version is available and makes calls against that version
<jam2> the point is that api/client.go should know about the available versions and give the compatible internal API
<jam2> TheMue: so you would have Client.DoSomething(), which internally might be calling "Client", 1, "DoSomething", args
<jam2> or might be calling "Client", 0, "DoSomethingElse", otherargs
<TheMue> jam2: it does it when starting once by retrieving them from the server, doesn't it?
<jam2> TheMue: at Login time we find out what facade versions the server has available
<TheMue> jam2: then here maybe my troubles are
<jam2> and when instantiating the api.Client object it determines (FacadeCaller) what the matching version is
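As a rough illustration of the negotiation jam2 describes, the sketch below picks the highest facade version that both sides know about and then dispatches to a matching call. It is not juju's BestAPIVersion or FacadeCaller: bestVersion and doSomething are hypothetical helpers, and the version lists are made up. The "Client", DoSomething and DoSomethingElse names are borrowed from jam2's example.

```go
package main

import "fmt"

// bestVersion returns the highest version present in both lists, and
// false if the client and server have no version in common.
func bestVersion(clientSupports, serverOffers []int) (int, bool) {
	offered := make(map[int]bool, len(serverOffers))
	for _, v := range serverOffers {
		offered[v] = true
	}
	best, found := 0, false
	for _, v := range clientSupports {
		if offered[v] && (!found || v > best) {
			best, found = v, true
		}
	}
	return best, found
}

// doSomething hides the version choice from the caller. This
// hypothetical client knows how to speak versions 0 and 1.
func doSomething(serverOffers []int) (string, error) {
	v, ok := bestVersion([]int{0, 1}, serverOffers)
	if !ok {
		return "", fmt.Errorf("no compatible Client facade version")
	}
	if v >= 1 {
		return "Client v1 DoSomething", nil
	}
	return "Client v0 DoSomethingElse", nil
}

func main() {
	for _, offers := range [][]int{{0}, {0, 1, 2}, {5}} {
		call, err := doSomething(offers)
		fmt.Println(offers, call, err)
	}
}
```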
<voidspace> jam2: running the AddRemoveSet tests on my machine (normally)  takes 60 seconds
<voidspace> jam2: running them on an lxc rate-limited to 2MB/s up/down takes 4 minutes
<voidspace> jam2: limited to 1MB/s dies with unreachable servers
<TheMue> jam2: the code for temporarily discarding a version also has to do this on the already retrieved versions
<voidspace> jam2: (this is with the extraneous Remove removed)
<jam2> voidspace: sounds like a good start, you can probably increase some timeouts to see if unreachable servers can be recovered from.
<voidspace> jam2: ok, I'll look at that - I might try with an intermediate value as well
<voidspace> jam2: I'd like to get a good few runs with a slow disk showing that we don't die in test cleanup when we don't have the Remove
<jam2> voidspace: well, for our tests to be more consistently reliable, I think we probably need to change the timeouts and move it to CI
<jam2> voidspace: well, first I'd like you to see that we *do* die in cleanup with Remove there
<jam2> voidspace: always have the test fail first
<jam2> before you fix it
<voidspace> ah, ok - good call
<jam2> voidspace: standard bugfixing, you have to know that you're reproducing the bug before you know that your fix is actually fixing it :).
<voidspace> jam2: sure :-)
<voidspace> that's why changing existing tests is dangerous
<voidspace> unless you also make sure you see the *modified test* fail
<voidspace> it's easy to change a good test into a no-op without noticing...
<voidspace> of course, testing replicasets on a *deliberately slow* system is even more painful than normal
<jam2> voidspace: dimitern: standup?
<jam2> voidspace: yeah, replication is painful
<voidspace> omw
<dimitern> jam2, omw, sorry
<voidspace> jam2: dimitern: TheMue: you can tell the next door neighbours kids have gone back to school - my internet connection is now good enough for google hangouts...
<jam2> voidspace: :)
<TheMue> voidspace: *lol*
<dimitern> :)
<dimitern> tasdomas, you've got another review with some suggestions
<thumper> is anyone else having trouble pushing master to github?
<thumper> if not, are you using the pre-push hook?
<mgz_> thumper: I can try now
<thumper> mgz_: ta
 * mgz_ suspects windows syslog changes
<mgz_> and indeed, seems to be
<mgz_> lets just double check...
<mgz_> hm, happy after I did a dep upgrade
<mgz_> thumper: works for me
<mgz_> on trusty, after running godeps -u dependencies.tsv
<mgz_> $ git push origin master
<mgz_> Everything up-to-date
<thumper> state/metrics_test.go:42:11: (*github.com/juju/juju/state.Metric).Key -> <nil>
<thumper> MethodSet {}
<thumper> panic: method sets and lookup don't agree [recovered]
<thumper> 	panic: method sets and lookup don't agree [recovered]
<thumper> 	panic: method sets and lookup don't agree
<jam2> voidspace: so it sounds like you're really close to the "land the code" state, is that true?
<thumper> that is what I get
<jam2> voidspace: because I have other work for you to be looking into if you do.
<mgz_> thumper: I have that line, but not the panic. go 1.2.1
<jam2> TheMue: voidspace: dimitern: we're going to be picking up the PortRange changes from domas, so it would be good for a couple of you to read over the patch so far
<jam2> tasdomas: can you work with those guys ^^ to help share knowledge of what is going on?
<mgz_> thumper: I guess if you just do `./scripts/pre-push.bash` you get that panic?
<dimitern> jam2, i've reviewed it twice and am familiar with the previous work
<voidspace> jam2: well - Remove the Remove is ready to land just need to create the PR
<jam2> voidspace: yep.
<voidspace> jam2: the replicaset CheckHealth function (name?) is done but needs a test / logging / deciding where to actually use it
<jam2> voidspace: sounds reasonable
<voidspace> jam2: if we just use it in the tests instead of sleeps then I can land that today too
<jam2> (name wise)
<voidspace> jam2: and it probably doesn't need its own tests if we just use it in the tests
<jam2> voidspace: think carefully about it, and I'll let your educated thoughts drive where we put it :)
<voidspace> jam2: heh, ok
<jam2> voidspace: The only good way to test it is by mocking out mongo
<jam2> IMO
<tasdomas> jam2, TheMue, voidspace: I still need to update the patch w.r.t fwereade's comments and I have a day off tomorrow - could we set aside time to go over this on Monday?
<jam2> voidspace: I liked your ideas about having Add wait until things should be reliable
<voidspace> jam2: yep
<voidspace> jam2: ah, ok - I thought you *weren't* keen on that
<jam2> voidspace: but I'm not deep in the code to have a good feeling for how it all hangs together
<jam2> voidspace: sorry, it sounds like we must have been talking past each other. I think having things transition from a known good state to another known good (enough) state is worthwhile
<jam2> and Add() seems like the place where you might do that.
<voidspace> jam2: let me get the function usable, after I PR Remove the Remove, and look at plugging it into Add / Remove
<voidspace> jam2: ok, cool - what about Remove?
<voidspace> which is already slow
<jam2> The issue in the code is whether it expects things to be more asynchronous and would work poorly if we waited here and there.
<voidspace> right
<jam2> voidspace: changing the replicaset should be really infrequent
<jam2> so waiting seems great
<voidspace> we have code (in tests) that expect stuff to be unstable and just retries the next operations until they succeed
<voidspace> which is horrible
<jam2> Again, we can make "waiting" be until quorum
<voidspace> jam2: I'll look around it
<jam2> or maybe "wait X for everything stable, then back off to Y for quorum"
<voidspace> yep, with logging
<jam2> voidspace: I like the idea with you can spend a bit of time for full stability as long as it doesn't really impact
<voidspace> I'll see how they are used in the code and the wait strategies around them
<jam2> better than always returning immediately because you always have quorum
<voidspace> and as the current operations leave the state unstable, we're either *already* waiting
<voidspace> or we have code that can sometimes fail horribly
<voidspace> so we're either replacing one wait with a better wait, or fixing a bug
<voidspace> so it sounds like a win
<jam2> voidspace: right, and it is certainly possible that in your slow-case testing you'll find we can't reliably work with the system in a pre-commit test timeout
<jam2> (which IMO, real unit tests should be milliseconds....)
<wallyworld_> fwereade: i'm just about to do our standup, can i ping you after that, soon?
<voidspace> jam2: I didn't quite parse that - you mean that waiting for stable should help with the slow system timeouts
<voidspace> because we're waiting for stable state
<jam2> voidspace: I mean that waiting-for-stable-state may make the test unsuitable for the pre-commit suite and it needs to go into CI
<voidspace> (specifically I couldn't parse "with the system in a pre-commit test timeout")
<jam2> and while I don't want to lose you to it, we need people who know how to add stuff to CI, so I'm willing for you to spend the time there.
<voidspace> jam2: ah
<voidspace> jam2: yep, parsed it now
<voidspace> jam2: I thought "pre-commit" was mongo terminology
<voidspace> jam2: agreed
<jam2> voidspace: pre-landing-on-trunk
<voidspace> yep
<voidspace> But first, coffee
<jam2> :)
<mgz_> wait, what's this pre... I see
<mgz_> you're suggesting voidspaces adds this as a ci job rather than a unit test
<jam2> voidspace: I like the idea of having a "ci/" subdirectory that only runs in CI fwiw
<mgz_> ...why did I plural...
<jam2> mgz: so TestAddRemoveSet is currently in github-merge-juju
<jam2> mgz: we think that it needs to just be slower, which means it would be better as "run-trusty-tests" or whatever there
<mgz_> jam2: yeah, it sounds like a reasonable idea to me
<mgz_> the alternative is leave it in unit test form, but off unless --flag, and switch that on in the post-gating jobs... but that's pretty ugly
<jam2> mgz_: FWIW I like ENV_VAR more than --flag
<jam2> just because you can't pass --flag to "go test" reliably
<mgz_> yeah, that's a pain
<jam2> because it passes it to all packages
<jam2> not just the one you want
<jam2> otherwise I'd prefer flag
<jam2> I'm not a big fan on ENV magic, but go test seems close to requiring it
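The environment-variable gating jam2 prefers could be sketched as below. The variable name JUJU_SLOW_TESTS and the helper slowTestsEnabled are hypothetical, invented for this example. The lookup function is passed in so the helper can be exercised without touching the real environment; production code would pass os.Getenv.

```go
package main

import (
	"fmt"
	"os"
)

// slowTestsEnv is a hypothetical variable name chosen for this sketch.
const slowTestsEnv = "JUJU_SLOW_TESTS"

// slowTestsEnabled reports whether slow tests were requested. The
// lookup function is injected; real callers would pass os.Getenv.
func slowTestsEnabled(lookup func(string) string) bool {
	return lookup(slowTestsEnv) != ""
}

func main() {
	fmt.Println("slow tests enabled:", slowTestsEnabled(os.Getenv))
}
```

A slow test would then start with a check along the lines of `if !slowTestsEnabled(os.Getenv) { t.Skip("set JUJU_SLOW_TESTS to run") }`, and the CI job would export the variable before running the tests.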
<thumper> mgz_: yeah... I have ./scripts/pre-push.bash hooked up as the git pre-push hook
<thumper> makes pushes take a lot longer
<mgz_> thumper: me too
<thumper> but it was in the docs
<thumper> mgz_: so why is mine panicking?
<mgz_> that's what I want to know :)
 * thumper is too tired to work it out
 * thumper goes to bed
<thumper> night all
<voidspace> I made a radical decision, I went for tea instead of coffee
<jam2> voidspace: omg....
<voidspace> jam2: yes, I like having CI tests in trunk - so developers can write new ones easily and have a sense of ownership over them
<voidspace> heh, indeed
<jam2> voidspace: that's crazy talk
<jam2> go back to coffee before you have any more
<jam2> ownership over tests.... NOOOOO!
<voidspace> :-)
<jam2> then we're responsible when they fail
<voidspace> it would be nice to reduce sinzui's blood pressure at least a *bit*
<jam2> voidspace: we need a lot more ownership of CI stability, so I'm very happy to hear it.
<voidspace> yep
<voidspace> jam2: https://github.com/juju/juju/pull/679
<jam2> voidspace: just to confirm, you saw reliable failures under slow disk with the Remove in, and not with it out?
<voidspace> jam2: reliable as in about 3 runs out of 4
<voidspace> jam2: but yes
<jam2> voidspace: that is good reproducibility and good reduction without them, right?
<jam2> great
<voidspace> jam2: yep
<jam2> voidspace: LGTM
<voidspace> jam2: thanks
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Blocking bugs: 1365480
<perrito666> mattyw: ping without hurry
<voidspace> natefinch: ping
<jam2> TheMue: can you have a look at https://github.com/juju/juju/pull/392 ? It is doing versioning of the root Login function, and I think it would be worthwhile to have you think about it, too.
<jam2> I'm mostly happy, though I had been hoping we wouldn't have to version the *entry point* into the system.
<jam2> We can only do that *today* because 1.18 would let you Login even if you pass a version, and 1.20 will give you CodeNotImplemented if you did so.
 * voidspace lunch
<TheMue> jam2: yep, will do
<mattyw> perrito666, ping
<mattyw> perrito666, pong, even
<natefinch> wwitzel3: standup?
<wwitzel3> yep, sorry
<fwereade> axw, ping? must be late for you
<fwereade> axw, wallyworld_, if either of you are awake: in the context of "drop provider storage", how far are we from the "can do HA manual provider" subgoal?
<fwereade> axw, wallyworld_, wwitzel3: because the cloudsigma thing will be incoming, and last I looked that was still using the manual storage method, which means that won't be HA-capable either
<bodie_> I thought we were avoiding gopkg.in
<fwereade> bodie_, I didn't think so?
<bodie_> hum, I got some pushback about it a while back, just didn't know whether I should worry about it
<voidspace> natefinch: I've submitted a new expenses claim I would appreciate you approving
<voidspace> natefinch: travel for Lexington sprint
<natefinch> voidspace: ok sure
<perrito666> natefinch: ping me when you have a slot available
<natefinch> perrito666: will do
<katco`> hey, any opinions on how to name functions stuffed into var's for patching in tests?
<katco`> i have environForName as the function now, i'd like to expose it with a var
<natefinch> katco: tricky
<katco> natefinch: i really dislike this pattern =|
<katco> natefinch: i'm thinking the var should be the name that makes sense since it should be the thing used everywhere, and then the func should be called *forPatching or something ugly to dissuade use
<natefinch> katco: yeah, that's a good idea
<natefinch> katco: or *ProductionFunc or something
<katco> natefinch: i like that a bit better
<katco> natefinch: we'll see how that looks, thanks dude
<natefinch> katco: welcome.  I'm pretty good at coming up with ugly names ;)
<katco> natefinch: haha aren't we all
<bodie_> WouldBeUglierIfJava
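The naming scheme katco and natefinch settle on might look like the sketch below: the variable gets the natural name, and the real function gets a deliberately awkward one. Only the name environForName comes from the conversation; its signature, the return values, and the patchEnvironForName and describe helpers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// environForNameProductionFunc is the real implementation, named
// awkwardly on purpose so callers reach for the variable instead.
func environForNameProductionFunc(name string) (string, error) {
	if name == "" {
		return "", errors.New("no environment name given")
	}
	return "environ:" + name, nil
}

// environForName is what the rest of the code calls; tests may swap it.
var environForName = environForNameProductionFunc

// patchEnvironForName replaces environForName and returns a function
// that restores the previous value.
func patchEnvironForName(fake func(string) (string, error)) (restore func()) {
	orig := environForName
	environForName = fake
	return func() { environForName = orig }
}

// describe is an ordinary caller that goes through the variable.
func describe(name string) string {
	env, err := environForName(name)
	if err != nil {
		return "error: " + err.Error()
	}
	return env
}

func main() {
	fmt.Println(describe("local"))
	restore := patchEnvironForName(func(string) (string, error) {
		return "fake", nil
	})
	fmt.Println(describe("local"))
	restore()
	fmt.Println(describe("local"))
}
```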
<perrito666> ouch is mattyw gone?
<tasdomas> perrito666, mattyw is probably just having connection problems
<perrito666> * mattyw has quit (Max SendQ exceeded)
<perrito666> sounds like flooding
<katco> evilnickveitch: ping
<TheMue> *yeeeeeehaw*
<TheMue> sorry for being loud, but this crazy versioning test now works as it should
<jcw4> TheMue: yay!
<wwitzel3> is there a way to upload new tools without re-bootstrapping? I tried juju update-tools but that isn't a command.
<katco> wwitzel3: juju upgrade-juju --upload-tools maybe?
<wwitzel3> katco: yeah, tried that, but it just says they are the same version, which is true
<wwitzel3> :)
<wwitzel3> wondering if I hacked the version and incremented it, if it would work then
<katco> wwitzel3: ah. look at the status, it should be incrementing the version. i think that error is... well erroneous ;)
<katco> wwitzel3: yeah, before: agent-version: 1.21-alpha1.1; after: agent-version: 1.21-alpha1.3
<katco> wwitzel3: i remember reading up on this; there is a bug or something out there to simplify all this behavior. it's a very confusing command.
<katco> wwitzel3: also check your path to make sure you're not using the one from apt
<perrito666> lunch brb
<wwitzel3> katco: sorry, forgot to say thanks :)
<katco> wwitzel3: haha no worries. although i _was_ wondering if it worked!
<katco> wwitzel3: btw i tried some bourbon the other day... maker's mark? i think? it wasn't as good as the stuff you bought in boston. but i'm a bourbon newb.
<wwitzel3> katco: I wish Maker's was as good as the stuff I buy, well actually, Jessa wishes it more than me :)
<katco> wwitzel3: haha
<katco> wwitzel3: i'm guessing the stuff you're into is rather pricey
<perrito666> natefinch: team meeting?
<perrito666> anyone else?
<wwitzel3> ericsnow: ping
<ericsnow> wwitzel3: coming
<perrito666> anyone else around here who feels they should be in this meeting?
<jcw4> perrito666: sure ! :)
<wwitzel3> anyone know what line 50 of worker/uniter/runlistener.go is actually doing?
<mattyw> night all
<natefinch> wwitzel3: setting the value of result to the value of runResult
<natefinch> wwitzel3: er vice versa
<natefinch> wwitzel3: it's like an out parameter
<natefinch> wwitzel3: I have no idea why they're not just returning the result and the error
<wwitzel3> natefinch: yeah, that's what I was wondering, if there was a reason to not just return them there
<natefinch> wwitzel3: probably... maybe an interface that's being fulfilled or something (though why that interface itself wouldn't allow the result to get returned, I don't know)
<wwitzel3> the problem is, that if that RunCommands call actually does return nil for the result (as it does in all its error cases), that line throws a NPE
<natefinch> wwitzel3: heh uh yeah, that's bad.
<katco> wwitzel3: that code isn't run from within a closure by chance is it?
<wwitzel3> katco: yes, I think so, right
<katco> wwitzel3: funny, i recently did this in a personal project. at the time of the creation of the closure, it needed the variable to assign to; i.e.: the closure was being run in such a way that it couldn't return a value due to an interface it had to conform to
<katco> wwitzel3: and so i did a pointer dereference
<natefinch> wwitzel3, katco: what's funny is that on line 24 there's an interface that is written in the right way, which returns results and the error
<katco> natefinch: ok i call shenanigans on that code then :p
<wwitzel3> hrmm
<natefinch> katco, wwitzel3:  JujuRunServer is passed to an rpc server... that's the trick
<katco> natefinch: so it can't return a result?
<natefinch> katco: http://golang.org/pkg/net/rpc/#Server.Register
<wwitzel3> natefinch: thanks
<wwitzel3> katco: thanks
<katco> natefinch: ah there we are. good find natefinch :)
<wwitzel3> well it still doesn't change the fact that if the results from runner.RunCommand are nil that it panics with a NPE
<wwitzel3> so I should probably fix that while I'm in here
<katco> wwitzel3: quit being so proactive.
<natefinch> wwitzel3: absolutely fix it.  If err != nil, we shouldn't look at the other value at all
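The out-parameter shape discussed above comes from net/rpc, which expects methods of the form `func (t *T) Method(args A, reply *R) error`. A minimal sketch of the nil-safe version, assuming a runner that returns a nil result on error; `ExecResponse`, `runCommands` and `RunServer` are hypothetical stand-ins, not the actual code in worker/uniter/runlistener.go:

```go
package main

import (
	"errors"
	"fmt"
)

// ExecResponse is a stand-in for the result type; the fields are illustrative.
type ExecResponse struct {
	Code   int
	Stdout []byte
}

// runCommands is a hypothetical stand-in for the real command runner.
// Like the call discussed above, it returns a nil result on error.
func runCommands(cmd string) (*ExecResponse, error) {
	if cmd == "" {
		return nil, errors.New("no command given")
	}
	return &ExecResponse{Code: 0, Stdout: []byte("ran " + cmd)}, nil
}

// RunServer has the method shape net/rpc requires: one argument, one pointer
// "reply" used as an out parameter, and a single error return.
type RunServer struct{}

// RunCommands copies the result into reply only after checking the error,
// so a nil result is never dereferenced.
func (RunServer) RunCommands(cmd string, reply *ExecResponse) error {
	result, err := runCommands(cmd)
	if err != nil {
		return err
	}
	*reply = *result
	return nil
}

func main() {
	var reply ExecResponse
	fmt.Println(RunServer{}.RunCommands("", &reply))
	fmt.Println(RunServer{}.RunCommands("hostname", &reply), string(reply.Stdout))
}
```

Returning early on the error keeps the reply untouched in the failure case, which matches the "if err != nil, don't look at the other value" rule.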
<natefinch> ericsnow: what PR did you want me to review?
<ericsnow> https://github.com/juju/juju/pull/606
<ericsnow> natefinch: thanks!
<natefinch> ericsnow: I think you're trying too hard to hide stuff in that backups code
<ericsnow> natefinch: I guess I'm trying to avoid the way we leak stuff all over the rest of our code base :)
<ericsnow> natefinch: anything in particular?
<natefinch> ericsnow: db.connInfo  ... you should pretty much never have exported functions returning non-exported types
<ericsnow> natefinch: that's a pretty crucial rule-of-thumb I wasn't aware of :(
<ericsnow> natefinch: though it certainly makes sense
<natefinch> ericsnow: well for one thing, it means you can't see documentation about the type in godoc.   It also means you can't write a function that takes that type
<natefinch> ericsnow: that's a really good example of a struct that really should just have all public fields.  It doesn't need any methods at all
<natefinch> ericsnow: if you really want the Check function, you can make it a standalone function that takes a ConnInfo and does basically what it does now.
<ericsnow> natefinch: doesn't that put more burden on the caller unnecessarily?
<natefinch> ericsnow: but honestly, the check function seems spurious.  It might be useful as an internal helper function.. but  ConnInfo is just a data bucket.  Whether or not the address being empty is a bad thing depends on the function using it
<ericsnow> natefinch: considering that the context is backups (specifically dumping the DB), I think the constraint is appropriate
<natefinch> ericsnow: It ties the logic of what makes a valid conn info to the type, rather than to the function.... and it's only the function that cares.
<ericsnow> natefinch: thanks for this feedback, by the way :)
<natefinch> ericsnow: welcome :)
<ericsnow> natefinch: but the point is that the type has a very specific use that is tied to that logic
<ericsnow> natefinch: I'd expect a more general-purpose type to live in a more general-purpose package
<natefinch> ericsnow: I like that Go makes it easy to separate data from logic.  You don't need to have types and methods for everything. You can have a struct with just some fields and a standalone function that verifies the struct has the right things set.  It would be a lot less code.  I *do* think that NewMongoConnInfo is useful.  That's some pretty specific logic there.  The rest is just boilerplate stuff because you're hiding the fields in ConnInfo (and hiding ConnInfo itself).
<natefinch> ericsnow: I know we were all taught that public variables are the devil.  It turns out, well, sometimes they're really not... especially if it's just data, and not internal state.
<ericsnow> natefinch: hey, I'm a Python guy, so the whole public/private thing is kind of new to me...I guess I've been feeling it out
<natefinch> ericsnow: heh.... it's funny, because I come from C# / C++ etc where you have to think about that stuff all the time, and people are much more rigorous about never exposing anything...
<ericsnow> natefinch: being able to syntactically express expectations like that is the one thing I like about static type systems
<natefinch> ericsnow: I like being able to statically express the type of thing I'm expecting to get.... it's good for me and it's good for the person calling my code :)
<natefinch> ericsnow: certainly, with Go's interfaces, it makes it a lot less burdensome on the caller than in other languages.  "Give me something with a Read() method, I don't care what"  is a lot nicer than "Give me exactly a ByteRead type, and if you don't have one, guess you're gonna have to make one and wrap your thing in it"
<ericsnow> natefinch: type inference helps a lot too
<natefinch> ericsnow: yep, definitely.
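A rough sketch of the "data bucket plus standalone function" layout suggested above. Only the empty-address check is mentioned in the conversation; the `Username` and `Password` fields and the `checkForDump` name are assumptions for illustration, not the code in the PR:

```go
package main

import (
	"errors"
	"fmt"
)

// ConnInfo is an exported struct with exported fields and no methods, so it
// shows up in godoc and callers can write functions that accept it.
// Username and Password are assumed fields.
type ConnInfo struct {
	Address  string
	Username string
	Password string
}

// checkForDump is a hypothetical helper owned by the code that dumps the DB.
// The rule "address must not be empty" lives with the function that cares
// about it, not with the type.
func checkForDump(info ConnInfo) error {
	if info.Address == "" {
		return errors.New("missing address")
	}
	return nil
}

func main() {
	fmt.Println(checkForDump(ConnInfo{}))
	fmt.Println(checkForDump(ConnInfo{Address: "localhost:37017"}))
}
```

Another caller with different requirements can reuse `ConnInfo` and apply its own check, which is the trade-off being argued for.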
<perrito666> davecheney: if you have time, I LGTMd this https://github.com/juju/juju/pull/681 and therefore need your validation to assert my self worth
<perrito666> :p
<davecheney> perrito666: sure
<davecheney> looking
<davecheney> i like it
<davecheney> if we don't write our own logic
<davecheney> we don't have to test it
<davecheney> win/win!
<davecheney> hmm
<davecheney> i dunno how to review this
<davecheney> i don't know if c.Remove respects the transaction system
<davecheney> and if it doesn't what the results will be
<davecheney> i'll write as such on the PR
<perrito666> davecheney: you are right, I somehow just assumed that it was not a problem since we are deleting and not changing (and therefore we do not worry about the main benefit of always working on an updated copy)
 * perrito666 automatically loses all sense of self worth and crawls under his desk
<perrito666> wallyworld_: say my name when you are available plz
<wallyworld_> perrito666: sure, on a call, so soon
 * perrito666 actually needs to start using world clocks to know when people are here
<perrito666> wallyworld_: davecheney what are your tzs?
<davecheney> perrito666: australian eastern time
<davecheney> +10 (+11) I never remember which
<davecheney> you're in the same tz as US west coast right
<perrito666> davecheney: gmt -3 which I believe has 1h diff with west coast
<perrito666> well my clock puts you both tomorrow
<davecheney> that is true, ian is from the future
<perrito666> that explains a lot
<wallyworld_> perrito666: hi, what's happening
<perrito666> wallyworld_: priv
<thumper> morning folks
<rick_h_> thumper!
<thumper> wallyworld_: check with gccgo before caring about power
<menn0> thumper: good moaning :)
<thumper> menn0: 'ello 'ello
<wallyworld_> thumper: so just finished eating and looking at this bug, the log attached shows compile errors for power which makes me sad
<wallyworld_> the version on the test machine is 2 ahead of my gccgo which does compile it ok
<wallyworld_> ffs
<wallyworld_> awesome, i upgrade my compiler and it still works locally
<waigani> menn0: standup?
<wallyworld_> \o/ lots of seg faults running tests with gccgo
<waigani> menn0: where were you going to make that one char change?
<menn0> waigani: cmd/jujud/machine.go:546
<waigani> cheers
<menn0> the call to net.Listen
<wallyworld_> davecheney: i think gccgo is broken again - it seems it no longer registers the hooks used in some of the tests, hence the code required to make the tests pass is never run
<wallyworld_> i think this may have come up before?
<wallyworld_> i have no idea what to do about it - i can't see that this should be a blocker that stops commits to trunk as it's not our code that is broken
#juju-dev 2014-09-05
<jcw4> wallyworld_: looking at the error messages I was wondering if it was an issue with stale .a files on the test machine... is that possible?
<wallyworld_> jcw4: maybe, but i can reproduce locally using compiler=gccgo
<wallyworld_> i get lots of segfaults as well
<jcw4> wallyworld_: ah.. I misunderstood... I thought you couldn't repro locally.
<wallyworld_> but i can get the test failure
<jcw4> wallyworld_: is there any way to debug if you don't have a ppc machine?
<wallyworld_> i can't reproduce unless i just run one test at a time
<wallyworld_> you have to know how gccgo works i think
<wallyworld_> i have no idea :-(
<jcw4> I see
<jcw4> :)
<wallyworld_> davecheney: you around?
<davecheney> wallyworld_: ack
<wallyworld_> davecheney: bug 1365480 is blocking ci. it appears to be a gccgo issue because it fails to run the hooks used to mock out method calls
<mup> Bug #1365480: ppc64el unit tests fail in many ways <ci> <ppc64el> <regression> <juju-core:Triaged by wallyworld> <https://launchpad.net/bugs/1365480>
<wallyworld_> i have no idea how to fix
<wallyworld_> this is failing to work
<wallyworld_> 	cleanup := s.srv.Service.Nova.RegisterControlPoint(
<wallyworld_> 		"addFloatingIP",
<wallyworld_> 		func(sc hook.ServiceControl, args ...interface{}) error {
<wallyworld_> 			return fmt.Errorf("failed on purpose")
<wallyworld_> 		},
<wallyworld_> 	)
<wallyworld_> the register func uses the stackframe to figure out what to do
<wallyworld_> i guess it's broken again - i think it was broken before at some point?
<wallyworld_> there's several tests affected
<davecheney> yeah, it breaks a bit
<davecheney> has the version of gccgo on the builder machine changed ?
<wallyworld_> nfi
<wallyworld_> i was running gccgo (Ubuntu 4.9.1-10ubuntu2) 4.9.1
<wallyworld_> the build machine had gccgo (Ubuntu 4.9.1-12ubuntu2) 4.9.1
<wallyworld_> i updated to -12 locally
<davecheney> ok, and it only repro's on ppc ?
<wallyworld_> i can repro locally using -compiler=gccgo
<wallyworld_> but i get LOTS of segfaults
<wallyworld_> i have to specify each test one at a time
<wallyworld_> and yes, ci fails when running on ppc
<wallyworld_> davecheney: am i able to ask you to look into this a bit? i have no idea where to start with regard to gccgo
<davecheney> wallyworld_: can I fix it on monday ?
<wallyworld_> davecheney: it's blocking landings sadly, unless we can get the regression tag removed
<davecheney> i recommend removing the regression tag
<davecheney> if this is a compiler fix
<davecheney> we can't do that at critical level
<wallyworld_> sinzui: bug 1365480 looks like a gccgo issue, is there anyway we can remove the regression tag?
<mup> Bug #1365480: ppc64el unit tests fail in many ways <ci> <ppc64el> <regression> <juju-core:Triaged by wallyworld> <https://launchpad.net/bugs/1365480>
<wallyworld_> davecheney can do a compiler fix but not till monday
<sinzui> wallyworld_, you're mad
<davecheney> i can look at it on monday
<davecheney> i can't promise a fix
<sinzui> wallyworld_, the old version of juju works, and now it doesn't.
<wallyworld_> sinzui: i can prove that code which has not been touched for ages fails because gccgo does not register the monkey patch being applied
<sinzui> wallyworld_, we can retest an older revision, maybe the one that passed. If the test fails like the new revision then we know something other than juju changes
<wallyworld_> gccgo can be fragile when it comes to looking at the call stack
<wallyworld_> which is how the monkey patching stuff works
<wallyworld_> fragile = different to golanggo
<sinzui> I will retest the last passing revision, if it fails the same way then you are vindicated
<wallyworld_> sinzui: was gccgo updated recently?
<wallyworld_> on the test vm?
<sinzui> wallyworld_, We would see that in the first test that failed
<sinzui> wallyworld_, you lose, http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/1213/console clearly states that gcc was already the latest version and that no packages were installed for the test
<wallyworld_> and yet the tests that are failing have not changed and the failure is clearly due to gccgo not executing monkey patched code that the tests rely on to pass
<wallyworld_> i put a panic in the code and it did not trigger
<sinzui> wallyworld_, there is a difference, but it is not seen in the installs...
<sinzui> The passing one has
<sinzui> go version xgcc (Ubuntu 4.9.1-10ubuntu3) 4.9.1 linux/ppc64
<sinzui> The failing one has
<sinzui> go version xgcc (Ubuntu 4.9.1-12ubuntu2) 4.9.1 linux/ppc64
<wallyworld_> yes, that's what i used to have here till i upgrade
<wallyworld_> i'm on utopic now and it doesn't give me an option to go back to -10
<wallyworld_> wait
<wallyworld_> yes it does
<sinzui> wallyworld_, I can look into this after I avert the disaster that really cannot be averted
<wallyworld_> ok
<wallyworld_> i'll try testing with -10
<davecheney> ok, this is not good
<davecheney> -12 must be the new version in proposed which fixes a different bug
<sinzui> FU&CKI
<sinzui> wallyworld_, even after using s3cmd to sync the tools that are on aws, I still get different filesizes from streams.canonical.com
<wallyworld_> i can't seem to get apt to allow me to downgrade to -10 to test
<wallyworld_> wot :-(
<wallyworld_> sinzui: that is not good :-(
<davecheney> wallyworld_: juju bootstrap && juju deploy cs:ubuntu
<sinzui> wallyworld_, Am I experiencing this because i finally reported the versioning issue as a bug https://bugs.launchpad.net/juju-core/+bug/1365633
<mup> Bug #1365633: cannot rebuild replacement tools for streams <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1365633>
<wallyworld_> looking
<sinzui> wallyworld_, We have lived with this since February, I report the bug and now I need the fix
<sinzui> wallyworld_, tools that should be identical are not, and I cannot give them extra version information to differentiate their origin to avoid confusion or outright malign intent
<wallyworld_> sinzui: simplestreams supports versioning using dates
<sinzui> wallyworld_, that is not helping the users
<wallyworld_> new tools tarballs with different names could be uploaded
<wallyworld_> and new metadata with a newer date added
<sinzui> wallyworld_, I am going to remake this data, and now I can expect users to complain that tools of the same name don't match
<wallyworld_> the tarball name used to matter before simplestreams but it doesn't now
<sinzui> wallyworld_, but these tools from two different machines that should be the same have different sums
<wallyworld_> the tools tarball could be called juju-1.20.6-release1-precise-amd64 and juju-1.20.6-release2-precise-amd64.
<wallyworld_> which one to use comes from the simplestreams metadata
<wallyworld_> the latter one would be in the metadata with a later date, so that would be picked up if juju asks for which tools to use for series/arch/release
<wallyworld_> maybe i'm missing something
<sinzui> wallyworld_, That would help. when I tested alternate names for tools, the metadata command ignored them :(
<wallyworld_> that may be a limitation of that command :-(
<wallyworld_> which needs to be fixed
<sinzui> I have done evil things to preserve the greater good
<wallyworld_> that command i think from memory does use the filename to suck stuff in
<wallyworld_> it could be made smarter
<sinzui> yeah, the convention is convenient for many people copying tools.
<wallyworld_> or made so it can be called from a script, passing in the required tarball and params
<sinzui> wallyworld_, maybe...
<wallyworld_> sinzui: we are moving to a shared tarball across series
<wallyworld_> ie one tarball only for precise/trusty/utopic
<wallyworld_> since they are the same
<sinzui> I have just reconciled the diffs from what was last in the CPCs and my own machine to make a json that describes what was there and what I am now uploading.
<wallyworld_> so the filename will become less relevant
<sinzui> wallyworld_, I would like to do that. The number of tools we make and publish do take a lot of time
<wallyworld_> yes indeed :-(
<wallyworld_> it's sorta happening now as part of moving tools into mongo storage
<wallyworld_> and removing the need for cloud storage
<sinzui> wallyworld_, I think if this command worked for azure, we might have prevented my misadventure
<sinzui> juju metadata validate-tools --juju-version 1.20.7
<wallyworld_> oh, azure doesn't currently support custom metadata
<wallyworld_> i think it's because there's no central storage we can use, from memory
<sinzui> wallyworld_, but joyent does. I don't understand? isn't the command getting the json and answering the version question?
<wallyworld_> like we have for aws and hp cloud
<wallyworld_> it's been ages since i looked at that stuff - from memory it's because there's no support for a custom search path on azure, i can't recall why
<wallyworld_> i'll have to go digging in the code
<wallyworld_> and even if no custom tools location is supported, i would think the metadata command should still work
<wallyworld_> don't know why it doesn't :-(
<sinzui> wallyworld_, oh yes, now I understand. I faced some of that using their python sdk
<sinzui> wallyworld_, We add md5 and shasum metadata to each tool we upload to azure and manta because we wrote our own rsync tools to do what real storage systems do
<wallyworld_> ok
<sinzui> manta still sucks though. there is a 5 minute period where we make 1000+ calls to look up the sums because it doesn't support bulk queries
<sinzui> well swift doesn't either, but the web/xml interface does
<wallyworld_> 1000+ !!
<wallyworld_> sinzui: we will soon not need cloud storage for juju
<sinzui> wallyworld_, indeed...part of the tools problem is that each machine is downloading tools from one or more sources and that allows for mismatches
<wallyworld_> yeah, so soon all machines will get tools from the state server
<wallyworld_> the tools are loaded into the state server on bootstrap
<sinzui> wallyworld_, I am 1. starting a rebuild of the last good master rev. I am 2, looking for the old packages to revert one of the machines to
<wallyworld_> ty
<wallyworld_> i've updated the bug with my thoughts
<sinzui> wallyworld_, I might be able to go back to what was in place on Aug 31 http://ports.ubuntu.com/pool/universe/g/gcc-4.9/
<wallyworld_> sinzui: that would be great. you may also find that gcc-base and other packages need downgrading also
<sinzui> wallyworld_, yeah, that is what makes this hard
<wallyworld_> indeed :-(
<thumper> wallyworld_: I'm going to see if I can fix this bug: https://bugs.launchpad.net/juju-core/+bug/1348477
<mup> Bug #1348477: userAuthenticatorSuite.TearDown failure <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1348477>
<thumper> wallyworld_: I have a plan
<wallyworld_> thumper: awesome, can we catch up in a sec, i'm otp with axw
<sinzui> wallyworld_, you are vindicated by the replay of the passing tarball
<sinzui> wallyworld_, I am too tired to install the old packages. Maybe I shouldn't because I am not awake enough to know that this is stupid
<wallyworld_> sinzui: \o/ does that mean we can remove the regression tag and unblock landings?
<wallyworld_> sinzui: we do need to fix the compiler still
<wallyworld_> dave can look at that on monday
<sinzui> wallyworld_, I am going to take the test's voting rights away. if it starts passing, then we can assume the code and the compiler are in agreement and restore the vote
<wallyworld_> great,sounds good,
<sinzui> I can do this now, and then add the real source for the bug
<wallyworld_> sinzui: is there an eta then on landings being unblocked?
<sinzui> wallyworld_, I will lower the priority of the bug because obviously we cannot do anything now that it is out of our power...let me fix the vote first
<wallyworld_> ok, thank you :-)
<sinzui> oh, actually. I cannot go to sleep until this test completes
<wallyworld_> :-(
<sinzui> wallyworld_, on the other hand the apiserver.metrics might actually have problems. but without a safe compiler, we won't know
<wallyworld_> thumper: did you want to talk about your plan?
<thumper> wallyworld_: yeah, cause it isn't working
<wallyworld_> ok, see you in onyx standup hangout?
<thumper> ok
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Blocking bugs: None
<thumper> https://github.com/juju/juju/pull/683 anyone? refactoring work still from this week's mega branch being broken up
<thumper> bug fix coming for auth failed
<thumper> wallyworld_: https://github.com/juju/juju/pull/685
<wallyworld_> looking
<katco> wallyworld_: hey thanks for landing all my branches :) take away the fun part why don't ya!
<katco> and now i'm off to bed. night all.
<thumper> axw: could I get you to cast your eyes over https://github.com/juju/juju/pull/642 again?
<axw> looking
<thumper> axw: I've updated it based on recent changes and your suggestions
<axw> thumper: line 20 can be dropped I think
<thumper> sure
<thumper> will do
<thumper> and pushed
<axw> thumper: reviewed, thank you
<thumper> nm
<axw> wallyworld_: https://github.com/axw/juju/compare/state-tools-take2 if you're interested in seeing the core changes
<axw> fixing tests again now
<wallyworld_> sure, looking
<axw> wallyworld_: apiserver/common/tools.go and apiserver/tools.go are probably of most interest
<axw> wallyworld_: also cmd/jujud/bootstrap.go
<wallyworld_> kk, just got a phone call, will look soon
<wallyworld_> axw: ToolsStorager NOOOOOOOOOOO
<axw> heh
<wallyworld_> not funny :)
<axw> ToolsStorageProvider? it's really a very minor thing, I don't really care
<axw> Getter is just as horrible to me
<wallyworld_> is there already a "ToolStorage", can't recall
<axw> yes, but this is a thing that has a ToolsStorage method
<wallyworld_> otherwise ToolsStorageProvider
<axw> ok
<wallyworld_> sorry, i HATE that particular Go idiom
<wallyworld_> fwereade: do you have a moment?
<mattyw> morning all
<TheMue> morning
<axw> wallyworld_: did you find anything obviously wrong, apart from that name?
<axw> morning TheMue
<wallyworld_> axw: no, looked ok. i got distracted a bit by a bug report, let me just give it one more look
<axw> no worries
<axw> wallyworld_: doesn't need to be too deep, just wanted a glance over before I get too stuck into fixing tests
<axw> which reminds me, tests
<wallyworld_> axw: nothing jumped out, but i didn't go over the find in storage logic too closely
<axw> ok
<axw> thanks
<wallyworld_> dimitern: hi there
<dimitern> wallyworld_, hey
<wallyworld_> dimitern: i backported your fix for allowing maas to disable network config to 1.20. the 1.20 branch is a little different to trunk. could you please review my back port? and type $$merge$$ if you are happy as i have to head to soccer https://github.com/juju/juju/pull/687
<dimitern> wallyworld_, sure, looking
<wallyworld_> thank you
 * wallyworld_ heads out to soccer
<TheMue> dimitern: heya, mind another look at https://github.com/juju/juju/pull/626 ?
<TheMue> dimitern: it now also covers the simulation and testing of a V0 machiner API.
<dimitern> TheMue, cheers, will have a look
<TheMue> dimitern: great, thanks
<TheMue> dimitern, voidspace: hangout?
<voidspace> TheMue: omw
<voidspace> dimitern: after changing TIME_WAIT I haven't seen the tests fail...
<voidspace> dimitern: not conclusive, but they were failing regularly before
<voidspace> dimitern: I'll go to 2MB rate limit (used to fail every time) and see if they now pass
<dimitern> voidspace, good news then :)
<voidspace> dimitern: ah no, fail :-/
<dimitern> voidspace, too bad.. but hey, it's some progress at least
<TheMue> so, back from lunch
<TheMue> dimitern: thanks for review
<TheMue> dimitern: only regarding the test for the providers I don't like to change
<TheMue> dimitern: simply so that all providers, also future ones, always follow the same approach
<dimitern> TheMue, well, I really don't like passing an opaque array of booleans
<TheMue> dimitern: I recognized it as advantage in the moment I added the testing for the V0
<TheMue> dimitern: and I don't like to do everything the same way but only ...
<TheMue> dimitern: these exceptions always make it more difficult for later maintainers
<TheMue> dimitern: but I could change it that I define the standard behavior as a const (ok, it's a var), so the tests read better
<dimitern> TheMue, it will be difficult for anyone to see what [16]bool{true,true,true,false,false,...} actually means
<dimitern> TheMue, that sounds better, yes
<TheMue> var ExpectedStandardBehavior = [16]bool { ... }
<dimitern> TheMue, btw why [16]bool and not []bool ?
<TheMue> dimitern: OK, that's a compromise for me
<TheMue> dimitern: hey, we all love Go for its type safety. so why open a door to pass too few or too many values?
<TheMue> dimitern: only to save two chars?
<dimitern> TheMue, ok, as long the [16]bool is hidden behind a var, I'm fine for the time being
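A small sketch of the compromise reached above: a fixed-size array hidden behind a named variable. The meaning of each of the 16 positions is not stated in the conversation, so the values and the `countEnabled` helper are placeholders:

```go
package main

import "fmt"

// ExpectedStandardBehavior names the default set of flags once, so tests can
// refer to it instead of repeating an opaque literal. The values here are
// placeholders; unlisted positions default to false.
var ExpectedStandardBehavior = [16]bool{0: true, 1: true, 2: true}

// countEnabled takes a fixed-size array, so the compiler rejects any argument
// that does not have exactly 16 elements.
func countEnabled(flags [16]bool) int {
	n := 0
	for _, on := range flags {
		if on {
			n++
		}
	}
	return n
}

func main() {
	// Arrays are values: this copy can be adjusted without touching the default.
	custom := ExpectedStandardBehavior
	custom[3] = true

	fmt.Println(countEnabled(ExpectedStandardBehavior), countEnabled(custom))

	// countEnabled([3]bool{true, true, true}) would not compile:
	// [3]bool and [16]bool are different types.
}
```

With `[]bool` the length would only be checked at run time, if at all, which is the door TheMue did not want to open.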
<gsamfira> hello folks. If anyone has some time, can I get a review on: https://github.com/juju/utils/pull/27/ ?
<TheMue> dimitern: will hide it
<perrito666> natefinch: fetching aurics, brt
<perrito666> ericsnow: wwitzel3 do we?
<voidspace> so I can confirm that CurrentStatus will report members in PrimaryState/SecondaryState even when primary renegotiation is happening and the replica set is unstable
<voidspace> although it looks like it sets Uptime to 0 when that happens
<voidspace> who wrote the replicaset code?
<voidspace> It's part of juju not mgo
<natefinch> voidspace: I wrote the replicaset code.
<voidspace> natefinch: ok
<voidspace> natefinch: I've butchered the applyRelSetConfig code
<voidspace> natefinch: I don't think the loop inside that does quite what it looks like it does
<voidspace> natefinch: however I've got rid of it anyway, so my question is now moot
<natefinch> voidspace: heh ok
<voidspace> natefinch: I have a new WaitForMajorityHealthy function which we can use to tell when the replica set is stable
<voidspace> natefinch: so far it's mostly working - except for the times when it doesn't...
<sinzui> alexisb, I am going to delay 1.21-alpha1 until Monday. There are too many changes to write up as release notes in a single day. I honestly don't know what features are in this release and how to explain to users how to use them
<natefinch> voidspace: That was definitely not the finest code in the world.  I wish there were better ways to do pretty much everything in that code... mostly around querying mongo for "WTF are you doing right now?"
<voidspace> natefinch: it's the fact that you change cmd to "Ping"
<alexisb> sinzui, understood, no one is pining for it today
<voidspace> natefinch: which is only useful if you re-enter the block "if err == io.EOF"
<alexisb> sinzui, you and I and Ian need to sync on release roadmap for 1.21 though
<voidspace> natefinch: which almost certainly isn't what Ping returns
<voidspace> natefinch: and even if Ping is successful we retry the loop instead of breaking
<voidspace> natefinch: as there's no check for err == nil
<voidspace> natefinch: if my function is reliable, it will look like this instead
<voidspace> natefinch: http://pastebin.ubuntu.com/8260487/
<natefinch> voidspace: hmm yeah that's not good. That  code has been tweaked by a lot of people who were trying to make it more reliable... it's quite possible there were some screw ups along the way.  A lot of it was trial and error trying to figure out what mongo will do at any particular time.
<sinzui> alexisb, agreed
<natefinch> voidspace: can you show me waitformajorityhealthy?  That's the key part that I had difficulty writing myself.
<natefinch> voidspace: also, when does session.Run return EOF?  We should comment why that's an ok error to get
<voidspace> natefinch: http://pastebin.ubuntu.com/8260518/
<voidspace> natefinch: I should add back a comment about that
<voidspace> natefinch: it's when changing the config causes primary re-negotiation so existing connections are dropped
<voidspace> natefinch: it's fine - we just need to refresh
<voidspace> natefinch: which WaitFor... does
<voidspace> natefinch: this is currently not stable - I'm sometimes seeing WaitFor... timeout, so I need to add some debugging
<voidspace> this is what I'm doing now
<voidspace> it *mostly* works
<natefinch> voidspace: thanks for putting in time on this, it'll make our code a lot more robust, and hopefully fix a lot of mongo related errors in the tests
<voidspace> maybe... :-/
<voidspace> it's been dead end after dead end so far
<voidspace> this looks really promising, but I'm still seeing timeouts
<natefinch> sinzui: is amazon sick today?  one of my PR's failed in a weird way: http://juju-ci.vapour.ws:8080/job/github-merge-juju/546/console
<sinzui> natefinch, that indeed looks like aws failed to provide an instance
<sinzui> natefinch, I saw messages yesterday that clearly states there weren't any instances of the size requested for the AZ :(
<natefinch> sinzui: I suppose AWS could just be busy
<perrito666> mattyw: hey, are you around?
<mattyw> perrito666, yep
<hazmat> sinzui, that's a bug imo, juju should recover and try a different az
<hazmat> although that's different than what natefinch's build says
<perrito666> mattyw: did you see axw's last pr?
<mattyw> perrito666, removing the call to setadminmongopassword?
<perrito666> yup, I applied that and ran with and without your patch
<perrito666> that seems to at least fix half of the errors yet the error related to presence is still there
<mattyw> perrito666, my patch?
<perrito666> http://paste.ubuntu.com/8227111/
<perrito666> "patch"
<mattyw> perrito666, does axw branch make use of the change that thumper landed overnight?
<perrito666> yes
<perrito666> https://github.com/juju/juju/pull/688
<natefinch> how the hell are you supposed to use juju run?  I can't for the life of me figure out how to get it to do anything but say "unrecognized args <stuff in the command to run>"
<wesleymason> natefinch: juju run --service <servicename> 'command here'
<wesleymason> for example
<natefinch> in quotes?
<wesleymason> yeah, in single quotes so bash/zsh etc. doesn't interpolate first
<wesleymason> recommended anyway
<natefinch> ahh that was it.  I was trying with -- to keep it from parsing flags.... we really need better help on that command
<natefinch> or like ONE example would be nice
<wesleymason> +1
<natefinch> I'll work on that.  bad help is a pet peeve of mine
<voidspace> natefinch: do you know how to debug "no reachable servers" errors?
<natefinch> voidspace: when initiating the replicaset?
<voidspace> natefinch: no, after applying a config change or during a Dial
<voidspace> natefinch: but in both cases I have a replicaset with several members
<natefinch> voidspace: either they're all still trying to come up, or the addresses are internal to the cloud, not public...
<voidspace> natefinch: it's during tests, so not a cloud issue
<voidspace> natefinch: and I'd like to know *how* to tell whether or not they're trying to come up
<natefinch> voidspace: I wish I knew
<voidspace> natefinch: as I've waited five minutes and CurrentStatus is failing
<voidspace> because of the connection error
<voidspace> hah, right
<natefinch> niemeyer: ^^
<natefinch> niemeyer: we're trying to make our code more robust with respect to Mongo, especially when initiating a replicaset and when bringing up instances of mongo during testing.  We get what appear to be random failures where sometimes they either never come up or take a really long time, or initiating takes a really long time.   Part of the problem is that we don't really know how to figure out what state mongo is in... all we can do is dial and see if it responds within a timeout.  Is there some better way we can do this?
<voidspace> I'm seeing a lot of errors like:
<voidspace> [LOG] 6:43.772 DEBUG juju.testing tls.Dial(127.0.0.1:35846) failed with dial tcp 127.0.0.1:35846: connection refused
<voidspace> Even with session.Refresh() and waiting for (up to) five minutes
<niemeyer> natefinch: Yes, you can always ask the server for its status
<voidspace> niemeyer: how specifically?
<voidspace> calling CurrentStatus(session) is failing with connection refused
<niemeyer> voidspace: http://docs.mongodb.org/manual/reference/command/replSetGetStatus/
<voidspace> niemeyer: that's precisely what CurrentStatus is doing
<niemeyer> voidspace: If the connection is refused, you know the status :)
<voidspace> niemeyer: any idea *why* sometimes our connections die like that and just don't come back
<niemeyer> voidspace: Okay, that's not what Nate said above
<niemeyer> voidspace: Hmm
<niemeyer> voidspace: Die with connection refused?
<voidspace> [LOG] 6:43.772 DEBUG juju.testing tls.Dial(127.0.0.1:35846) failed with dial tcp 127.0.0.1:35846: connection refused
<natefinch> niemeyer: sorry... what I mean is - we tell it to initiate... and then can never get it to respond
<niemeyer> voidspace: The TCP port is not open..
<niemeyer> natefinch: Look at the logs
<niemeyer> natefinch: I've never seen anything similar before
<niemeyer> natefinch: the test suite of mgo routinely shoots servers down and brings them back up
<natefinch> niemeyer: it's the single most common failure for our tests - mongo going away and never coming back
<natefinch> niemeyer: it's quite likely we're just doing something wrong, we just don't know what that is.
<niemeyer> natefinch: That makes no sense.. a connection refusal is a TCP port not open, which in general means MongoDB is not even running
<niemeyer> natefinch: I'd look at the logs to see why
<natefinch> niemeyer: it's not always connection refusal... that's the problem this time, often times the dial will just time out eventually
<niemeyer> natefinch: Heh..
<voidspace> that particular failure was during a call to instance.MustDialDirect() - *after* waiting for CurrentStatus to report all members up
<niemeyer> natefinch: First thing to do is make up your mind about what the symptom is :)
<voidspace> well, I just did another test run and got the same symptom
<voidspace> [LOG] 6:43.764 DEBUG juju.testing tls.Dial(127.0.0.1:37222) failed with dial tcp 127.0.0.1:37222: connection refused
<niemeyer> Yeah, that's a server down.. the logs will say why
<natefinch> voidspace: I think you'll need to hack the code a little to prevent gocheck from cleaning up the mongo directory, so you can look at the logs
<voidspace> niemeyer: do you know where the logs should be? I've got a horrible feeling we redirect mongo logging somewhere useless.
<voidspace> natefinch: ah, right
<voidspace> natefinch: when we start mongo don't we get it to log to standard out so we can parse the logs...
<voidspace> natefinch: meaning we get no logs
<voidspace> natefinch: or does it log to the directory as well?
<niemeyer> voidspace, natefinch: -check.work will prevent it from being removed, and display it as well
<voidspace> niemeyer: cool, thanks
<natefinch> niemeyer: oh, awesome, thanks
<niemeyer> voidspace: But I don't know where the logs are being sent to
<natefinch> voidspace: I'm pretty sure mongo's logs are still written to disk, but I honestly don't remember
<voidspace> we're still using the launchpad version of gocheck of course
<voidspace> wasn't there a thread about that?
<voidspace> yeah, looks like we're about to update
<natefinch> niemeyer: is that check.work flag available on launchpad's gocheck?  I can't find docs on the flags it takes
<niemeyer> natefinch: -gocheck.work, likely
<niemeyer> natefinch: -help on the test binary, or just passing a wrong flag, will print the options
<voidspace> I don't think it is available
<voidspace> we're at the latest revision of launchpad
<voidspace> natefinch: copying the gopkg.in one over the top of the launchpad one seems to work though
<voidspace> :-p
<natefinch> voidspace: heh, we're lucky we always rename the package, otherwise that wouldn't work
<voidspace> right
<wwitzel3> woo, I have passing tests!
<natefinch> anyone know why I'd get "cannot open ports 80-80/tcp on machine 5 due to conflict" when I re-ran my install hook?  Shouldn't open-port be idempotent?
<gsamfira> natefinch: there was a discussion on the mailing list about this a while back. Subject was "Port ranges - restricting opening and closing ranges". Not sure of the conclusion on that though
<gsamfira> https://lists.ubuntu.com/archives/juju-dev/2014-August/003131.html
<perrito666> anyone knows the difference between using net.Listen("tcp", "localhost:0") and net.Listen("tcp", ":0") ?
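The question above is left unanswered in the log. As a hedged illustration: with Go's `net` package, `"localhost:0"` binds to a loopback address only, while `":0"` binds to the unspecified address, meaning all local interfaces. In both cases port 0 asks the kernel for a free port. The helper name `describeListener` is hypothetical, and the sketch assumes `localhost` resolves on the machine running it.

```go
package main

import (
	"fmt"
	"net"
)

// describeListener opens a TCP listener on addr, then reports whether it is
// bound to a loopback address and which port the kernel assigned.
func describeListener(addr string) (loopbackOnly bool, port int, err error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return false, 0, err
	}
	defer ln.Close()
	tcpAddr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		return false, 0, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return tcpAddr.IP.IsLoopback(), tcpAddr.Port, nil
}

func main() {
	for _, addr := range []string{"localhost:0", ":0"} {
		loopback, port, err := describeListener(addr)
		if err != nil {
			fmt.Printf("%-12s error: %v\n", addr, err)
			continue
		}
		fmt.Printf("%-12s loopback only: %v, port: %d\n", addr, loopback, port)
	}
}
```

For tests that start local servers, the loopback form is usually the safer choice, since the listener is not reachable from other hosts.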
#juju-dev 2014-09-07
<menn0> thumper: morning
<marcoceppi> thumper: you around?
<thumper> marcoceppi: yep
<marcoceppi> thumper: opinions on this? https://bugs.launchpad.net/juju-core/+bug/1366627
<mup> Bug #1366627: Automagically configure environments.yaml to work for an encypted home directory <juju-core:New> <https://launchpad.net/bugs/1366627>
<marcoceppi> the title is wrong
<marcoceppi> but the body was updated
 * marcoceppi updates
<thumper> marcoceppi: triaged
<marcoceppi> thumper: \o/ brussels beers all around
<thumper> marcoceppi: I don't think I'll be doing it, but we'll try to get someone on to it.
<marcoceppi> thumper: awesome, it's biting new users left and right and the docs to caveat this are always falling behind
 * thumper nods
<thumper> fair call
<thumper> it wasn't something that I thought of when we created it
<marcoceppi> thumper: yeah, we're sprinting now and basically had a eureka moment
<wallyworld_> thumper: marcoceppi: there's also this one: bug 1361759
<mup> Bug #1361759: Early LXC log files should be in /var/log not /var/lib <landscape> <logging> <lxc> <juju-core:Triaged> <https://launchpad.net/bugs/1361759>
<wallyworld_> apparently that annoys people
<thumper> heh
#juju-dev 2015-08-31
<davecheney> thumper: ready for 1:1 ?
<thumper> davecheney: yeah, you early?
<davecheney> not in the chat yet
<davecheney> just jumping in now
<davecheney> thumper: https://docs.google.com/document/d/1EWnZX0NU5Ib0PYCfrr0Hz1TgsHOtuy2HYIN5SIuUQx0/edit
<wallyworld> menn0: i can't see where toBsonD is defined
<menn0> wallyworld: it's in collection.go (it's also used for other things)
<menn0> wallyworld: all multi-env related util funcs are about to get moved to their own file
<wallyworld> ah, my master branch must be out of date
<menn0> i'm actually doing that right now
<menn0> I think toBsonD landed on Friday night
<wallyworld> makes sense, i'm on master from friday am
<anastasiamac> i read it as "i'm master friday am" :D
<anastasiamac> from*
<mup> Bug #1489896 changed: Juju cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1489896>
<axw> anastasiamac: mind if I take the card "Add status for filesystems" off you?
<axw> anastasiamac: I want to do the rescheduling one, but it needs this first
<axw> wallyworld: if I add status to filesystems, can that change go into 1.25? it'll need an upgrade step
<wallyworld> axw: depends on how we word the change. if it's done as a bug in status, we could i think
<wallyworld> but we could leave till 1.26
<wallyworld> i don't think it's that urgent imho
<wallyworld> what do you think?
<axw> wallyworld: could leave it I think. there's no CLI for filesystems anyway
<wallyworld> yep
<anastasiamac> axw: so to answer ur question - no, I do not mind :D
<axw> anastasiamac: that's good, because I did it already ;)
<anastasiamac> axw: in general, I think names on any card in this lane are indicative... just pick up next in priority :D
<anastasiamac> axw: \o/ ur amazing! tyvm :D
<mup> Bug #1490480 opened: juju apt proxy vs MaaS <juju-core:New> <https://launchpad.net/bugs/1490480>
<mup> Bug #1490552 opened: local hostname not in /etc/hosts <juju-core:New> <https://launchpad.net/bugs/1490552>
<natefinch> ericsnow: standup
<fwereade> tasdomas, re RB2519
<tasdomas> fwereade, yes?
<fwereade> tasdomas, I'm not completely convinced that either is right -- what's the thinking behind choosing that one?
<fwereade> tasdomas, because we do have that verifyWaiting
<fwereade> tasdomas, which *should* indeed queue a config-changed when we're back in a stable state -- once, that was the first thing we did in modeabide, now I presume that check is a bit below the upgrade check?
<fwereade> tasdomas, and that is actually fine, I think
<tasdomas> fwereade, you mean running the config-changed hook after uniter's restart?
<fwereade> tasdomas, we *do* still get a config-changed guaranteed after bouncing, we just slip the upgrade in first so we don't have to do it twice
<fwereade> tasdomas, yeah
<tasdomas> fwereade, yeah - I think that's how it works currently
<tasdomas> fwereade, there was an email from Ian last week where he asked me to hold off on implementing an explicit run-config-changed resolver for now
<tasdomas> fwereade, and the way config version is handled right now (aiui it's a transient state variable), config-changed does get run after restart
<fwereade> tasdomas, yeah
<fwereade> tasdomas, shit it
<fwereade> dammit
<fwereade> tasdomas, LGTM
 * fwereade sighs at himself
<tasdomas> fwereade, that's how I look at the code I write - good to know I'm not alone ;-]
<ericsnow> biab
<mbruzek> Who is working on the CentOS support in juju?
<jcastro> the cloudbase guys
<mup> Bug #1490603 opened: TestSubnets fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1490603>
 * perrito666 suddenly remembers he owes them a couple of reviews
<jcastro> Hi, I have questions about: https://bugs.launchpad.net/juju-core/+bug/1489477
<mup> Bug #1489477: m4 instance types not supported on AWS <benchmarking> <ec2-provider> <feature> <juju-core:Triaged> <https://launchpad.net/bugs/1489477>
<jcastro> when targeted to 1.26 does that mean we won't get this until January?
<alexisb> jcastro, in a stable release yes
<jcastro> ouch
<alexisb> we are feature frozen for 1.25 already
<jcastro> is there a way so we can make it so that getting new instance types is easier in older jujus?
<jcastro> supporting a new instance type 8 months after launch is going to be problematic.
<perrito666> is that really a feature?
<natefinch> jcastro, alexisb: my thought has always been that if someone types instancetype=foo ... we just pass foo to the provider, and not try to do our own validation on it.  If it works, it works. If it doesn't, it doesn't.  But don't artificially try to restrict what instance type we request for no reason.
<jcastro> yeah, and on top of that we have this: https://bugs.launchpad.net/juju-core/+bug/1373516
<mup> Bug #1373516: Switch default instance type from m1.small to t2.small/m3.medium for EC2 provider <ec2-provider> <juju-core:Triaged> <juju-core 1.26:Triaged> <https://launchpad.net/bugs/1373516>
<bogdanteleaga> somebody from CI around?
<katco> jcastro: someone will be picking that up shortly
<natefinch> jcastro: moonstone is going to be working on that one in the coming couple weeks
<jcastro> \o/
<jcastro> natefinch: I agree with you 100%, also same with AMIs
<alexisb> katco, I just want to clarify that it is targeted for 1.26, correct?
<katco> alexisb: at this point, yes
<natefinch> jcastro: yeah, AMIs is a little different just because we don't currently support specifying them manually... but in general, yes, that would be nice, and probably not too hard.
<katco> alexisb: i think the only one we said we'd target for a 25.1 was http://pad.lv/1486553
<jcastro> https://bugs.launchpad.net/juju-core/+bug/1489484 is the bug I filed for that one natefinch
<mup> Bug #1489484: Juju should support passing an AMI to deploy <charmers> <ci> <ec2-provider> <feature> <juju-core:Triaged> <https://launchpad.net/bugs/1489484>
<natefinch> alexisb, katco:  I think we should consider putting the default instance type one into 1.25 if possible.  It should be a straightforward change, and is a cause of some fairly common problems.
<natefinch> well, one fairly common problem
<jcastro> yeah the problem with the default instance types is that m1.smalls are being phased out
<natefinch> which we knew a year ago :/
<jcastro> so if you deploy a bundle today with AWS and it's over a few nodes, it has a good chance of failing
<jcastro> so you're like, oh no problem, I'll just use what AWS recommends, m3's and m4's, and why wouldn't I want to use m4's, they're cheap and fast.
<alexisb> this sounds like a regression of behavior due to lack of current support
<alexisb> so we have a case for a 1.25 fix
<jcastro> it's more like when amazon says deprecated they'll actually stop handing out those instances
<alexisb> jcastro, it is my first day back after being out but we will discuss this in the release call today and get it prioritized appropriately
<jcastro> nod
<jcastro> don't worry it gets worse
<katco> alexisb: wb btw :)
<jcastro> we'll have to fix it for every version we support
<alexisb> well we will need to make a more doable line for that, ie what is in updates on lts is the oldest we will support
<alexisb> katco, thank you and thank you all for holding the fort while I was away
<jcastro> but we should probably figure out a general workflow for "aws announced new instance types"
<natefinch> jcastro: absolutely
<perrito666> we could really use a declarative, external, list of those things
<jcastro> they call their older ones previous generations: https://aws.amazon.com/blogs/aws/ec2-update-previous-generation-instances/
<jcastro> so I think as long as we try to stay off that list for any defaults we should be fine
<natefinch> jcastro: it would be nice if this kind of thing were configurable, so we could just update a configuration file at runtime and get new behavior.
<jcastro> that goes for bundles who declare instance types too, so it affects eco as well
<jcastro> natefinch: we used to have `default-instance-id` in environments.yaml, no idea why it was removed
<natefinch> jcastro: yeah, but I want something to configure juju's defaults outside of metadata.yaml, so users don't have to go google why their juju is broken and then go twiddle in the yaml by hand.
<jcastro> natefinch: I wonder though, if we can ask AWS programmatically for the support status of an instance type
<natefinch> jcastro: metadata/environments/
<jcastro> so juju can just W: unsupported instancetype foo, please use instancetype bar
<natefinch> jcastro: dunno... my guess is that they don't have that... they don't really like people being able to programmatically query the capabilities of their system
<jcastro> crazyness
<ericsnow> fwereade: do you think there's room for a "run arbitrary commands" worker on the machine agent?  (juju-run would leverage this rather than SSH)
<fwereade> ericsnow, quite possibly
<fwereade> ericsnow, it has crossed my mind that the ability to provide an execution context is pretty much completely independent of the unit agent
<ericsnow> fwereade: for vsphere and rackspace providers we are using an SSH connection to manage a local firewall, but would use that instead
<fwereade> ericsnow, and tasdomas et al have been busily extracting it from uniter so it's becoming more plausible
<ericsnow> fwereade: plus, apparently it would help juju-run on windows
<ericsnow> fwereade: k
<ericsnow> bogdanteleaga: ^^^
<fwereade> ericsnow, that firewall -- an implementation of the environ methods used by firewaller? or something entirely different?
<ericsnow> fwereade: we have to manually implement a firewall on the local machine (iptables) for the sake of the provider
<fwereade> ericsnow, ! (in a good way)
<fwereade> ericsnow, that's awesome, I think
<fwereade> ericsnow, because the environ-based firewalling has many weak points
<ericsnow> fwereade: it's a naive implementation of what I believe core is planning to do more completely as a later feature
<fwereade> ericsnow, but I'm still not sure why we need to ssh into the machines?
<fwereade> ericsnow, cool
<ericsnow> fwereade: to run the iptables commands :)
<fwereade> ericsnow, shouldn't that be a local worker doing that stuff?
<ericsnow> fwereade: for the firewall stuff I expect that a worker would be a better approach, but for now we SSH
<ericsnow> :)
<bogdanteleaga> ericsnow, fwereade, I think the same reasoning goes for juju run then as well, it doesn't need to be a "run arbitrary commands" worker
<fwereade> bogdanteleaga, sure, I'm thinking of it as "having an internal Runner that can be used by some other workers" rather than just having a generic mechanism for injecting shell commands
<fwereade> brb
<marcoceppi> natefinch jcastro aws doesn't like that, but we could make it a simple stream, couldn't we?
<marcoceppi> instance types supported/deprecated and how they translate to generic constraints
<marcoceppi> then we can iterate it outside of a core release
<jcastro> marcoceppi: yeah, having to update juju with this kind of metadata seems like not-the-right-way to do it
<jcastro> especially since we have to ping simplestreams for other stuff anyway
<mup> Bug #1490649 opened: TestSortingAndFilteringBeforeCachingRespectsPreferIPv6 <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1490649>
<mup> Bug #1490649 changed: TestSortingAndFilteringBeforeCachingRespectsPreferIPv6 <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1490649>
<natefinch> SomeoneNeedsToFindOutHowToUseCommentsToDescribeAFunction
<mup> Bug #1490649 opened: TestSortingAndFilteringBeforeCachingRespectsPreferIPv6 <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1490649>
<mup> Bug #1490653 opened: TestMissingSocket <ci> <test-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1490653>
<mup> Bug #1490656 opened: TestBlockAddRelation <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1490656>
<natefinch> katco: it occurs to me - if we're going to change workload status to be behind juju ps rather than juju status, maybe I should skip the feature tests for the work I did in juju status?
<katco> natefinch: or at least delay them until ps gets coded up
<natefinch> katco: yeah, that's what I meant. Cool.
<natefinch> gah, launchpad....  There's a 1.26 series version of the bug that is assigned to a 1.25 milestone?   on bug 1373516
<mup> Bug #1373516: Switch default instance type from m1.small to t2.small/m3.medium for EC2 provider <ec2-provider> <juju-core:Triaged> <juju-core 1.26:Triaged> <https://launchpad.net/bugs/1373516>
<natefinch> katco: I guess we decided that one should be in 1.25? ^
<mup> Bug #1490665 opened: maas provider: juju bootstrap tools download errors hard to debug <cloud-archive> <landscape> <juju-core:New> <https://launchpad.net/bugs/1490665>
<katco> natefinch: i think we were going to talk about it in the release standup
<katco> natefinch: but probably?
<natefinch> katco: kk
<natefinch> gah, how the **** are you supposed to know if the production code does the right thing if you mock out the data it bases its decision on? :/
<perrito666> natefinch: i dont follow
<katco> natefinch: answer: different tests test different things. the test you're looking at most likely just tests that the function passes information on in the correct manner
<natefinch> perrito666: the code to choose which instance type to deploy is all kind of implicit in the values for the instance type's features... like CPU power etc.  But when we test what gets deployed, we mock out that list.
<natefinch> katco: as far as I can tell, this is the only test that says "if you don't give FindInstanceType any constraints, what type does it return?"
<katco> natefinch: eek... i am modifying that area of the code right now. hope we're not stomping on each other
<natefinch> katco: I might be reading the tests wrong though... they're kind of hard to read
<ericsnow> natefinch: when you get a chance could you follow up on a couple reviews: http://reviews.vapour.ws/r/2424/ and http://reviews.vapour.ws/r/2426/
<natefinch> ericsnow: absolutely
<ericsnow> natefinch: thanks
<katco> natefinch: fyi, i am fixing bug 1489142, so if you can try and focus on just finding where we declare the default and toggle that. i'll handle making sure specifying instance type overrides all else
<mup> Bug #1489142: cpu-power constraint conflicts with with instance-type when trying to launch a t2.medium <constraints> <juju-core:Triaged> <juju-core 1.24:In Progress by cox-katherine-e> <https://launchpad.net/bugs/1489142>
<natefinch> katco: yep, that's what I'm trying to figure out.  It looks like there's no actual default specified... it's just implicit in the default constraints that we set.  We just set constraints that happen to work out so that we choose m1.small.... and there seems to be no test that actually tests that directly.
<katco> natefinch: ahhh
<katco> natefinch: in that case, maybe the bug is invalid? when t1 is gone will it pick a correct instance?
<natefinch> katco: I'll try commenting out the existence of m1.small and see what happens... that may well work
<natefinch> katco: it's not invalid, because we'll always try and fail with an m1.small first
<natefinch> katco: I can adjust the default minimum requirements to be more memory than the m1.small has, assuming m3.medium is what we want to pick by default (which is only 8.8 cents per hour instead of m1.small's 7 cents)
<natefinch> tested that and it works
<katco> natefinch: that doesn't seem quite right... min. constraints are a truth
<katco> natefinch: we shouldn't bend them to get the right answer
<natefinch> katco: this is just the default minimum constraints.  It's how we're doing it right now.  We set the minimum  CPU power to exclude all the tiny instances that aren't generally useful (but you can get one if you explicitly set the CPU power lower)
<natefinch> this would just be tweaking it so that we also exclude m1.small
<katco> natefinch: if what we mean is "juju should have a default instance", we should probably codify that and not minimum constraints. if we really mean min. constraints, we should not change those but change the algorithm to not select invalid instances
<natefinch> katco: well... it's tricky.  Right now there's only two ways to choose an instance type - set constraints, which can match anything, or pick an instance type explicitly.  We don't have the idea of invalid instances.  And m1.small is not exactly invalid... it's just a poor default because it's fairly often not available.
<katco> natefinch: can i ask you to set this one on the shelf a bit? after my code lands it may become much easier
<katco> natefinch: i'm sorry i didn't say that earlier. i had it in my mind that there was just a default somewhere we'd have to tweak
<natefinch> katco: setting CPU Power to 300 instead of 100, and memory to 2048 happens to have the right outcome... however a more appropriate fix would be to mark all the 1st generation types as such, and exclude them by default unless you explicitly ask to include them.  However, that's significantly more work.
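The selection behaviour discussed above can be sketched in Go. This is not juju's actual `FindInstanceType` code: the types, field names, and the `cheapestMatch` helper are hypothetical, the CPU power and memory figures are illustrative assumptions, and the hourly costs are the ones quoted in the conversation. The sketch shows both options that were raised: raising the default minimum constraints so m1.small no longer matches, and marking previous-generation types so they are excluded unless explicitly requested.

```go
package main

import "fmt"

// instanceType is a simplified, hypothetical stand-in for a provider's
// instance type description.
type instanceType struct {
	Name        string
	CPUPower    uint64  // relative CPU power units (assumed values)
	MemMB       uint64  // memory in MB (assumed values)
	Cost        float64 // cents per hour, as quoted in the discussion
	PreviousGen bool    // hypothetical "1st generation" marker
}

// constraints holds the minimum requirements used for matching.
type constraints struct {
	MinCPUPower        uint64
	MinMemMB           uint64
	IncludePreviousGen bool // previous-generation types are skipped unless set
}

// catalogue is illustrative data only.
var catalogue = []instanceType{
	{Name: "m1.small", CPUPower: 100, MemMB: 1740, Cost: 7.0, PreviousGen: true},
	{Name: "m3.medium", CPUPower: 300, MemMB: 3840, Cost: 8.8},
}

// cheapestMatch returns the lowest-cost type satisfying the constraints.
func cheapestMatch(types []instanceType, c constraints) (instanceType, bool) {
	var best instanceType
	found := false
	for _, t := range types {
		if t.PreviousGen && !c.IncludePreviousGen {
			continue
		}
		if t.CPUPower < c.MinCPUPower || t.MemMB < c.MinMemMB {
			continue
		}
		if !found || t.Cost < best.Cost {
			best, found = t, true
		}
	}
	return best, found
}

func main() {
	cases := []struct {
		label string
		c     constraints
	}{
		{"low defaults, previous gen allowed", constraints{MinCPUPower: 100, IncludePreviousGen: true}},
		{"raised defaults (300 / 2048)", constraints{MinCPUPower: 300, MinMemMB: 2048, IncludePreviousGen: true}},
		{"low defaults, previous gen excluded", constraints{MinCPUPower: 100}},
	}
	for _, tc := range cases {
		t, ok := cheapestMatch(catalogue, tc.c)
		fmt.Printf("%-38s -> %s (found=%v)\n", tc.label, t.Name, ok)
	}
}
```

The first option changes what "minimum" means in order to steer the result, which is the objection katco raises. The second keeps the minimums honest at the cost of maintaining a generation marker per type.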
<natefinch> katco: I'm happy to set it aside, but given that it's something we really should address for 1.25, let's not set it aside for too long.
<katco> natefinch: i should have a patch up for review tomorrow
<natefinch> katco: cool.
<katco> ericsnow: wwitzel3: it looks like we need to modify the wpm spec to conform to the decisions that have been made
<ericsnow> katco: k
<natefinch> like not calling it WPM anymore? ;)
<natefinch> WM just doesn't have the same ring
<natefinch> "The P is silent"
<natefinch> ericsnow: why do the components need a datadir now?
<ericsnow> natefinch: for WPM we need a place to store plugin data
<ericsnow> natefinch: otherwise there are situations where the status worker won't work right
<natefinch> ericsnow: can you explain?  What plugin data are we storing?  And why not store whatever data we need in the database?
<ericsnow> natefinch: we store all sorts of things locally already for other "components"
<ericsnow> natefinch: for WPM we need to store the path to the executable (for the feature tests)
<natefinch> ericsnow: seems like a lot of code churn and added complexity just to support something only used for feature tests
<ericsnow> natefinch: I'm open to suggestions on an alternative :)
<natefinch> ericsnow: I think I'd just make it built in, and use an environment variable to enable it.  At least that wouldn't require public API changes etc.
<ericsnow> natefinch: that doesn't resolve the problem though
<natefinch> ericsnow: I thought the problem was the plugin executable
<ericsnow> natefinch: wait, you're suggesting re-writing the plugin?
<natefinch> ericsnow: I'm suggesting not having an executable plugin and instead having it as a package like the docker package, except only enabled if there's an environment variable set.
<natefinch> ericsnow: assuming I'm at all understanding the problem correctly, which I may not be.
<ericsnow> natefinch: we already have a plugin that we use during the feature tests
<ericsnow> natefinch: it is not trivial to re-write that in Go in order to make it a lib
<natefinch> ericsnow: I may be misunderstanding the problem.  Why did we have to add the data directory, when it wasn '
<natefinch> wasn't needed before?
<ericsnow> natefinch: it *was* needed before :)
<natefinch> ericsnow: but it's new in your patch....
<ericsnow> natefinch: my patch fixes that particular brokenness
<ericsnow> natefinch: I'm going to pull that part of the patch out into a separate one so it's a bit more clear
<natefinch> ericsnow: cool
<natefinch> ericsnow: I'll look some more tonight.  Been looking at both the reviews you pointed me to earlier.
<ericsnow> natefinch-afk: thanks
#juju-dev 2015-09-01
<mup> Bug #1490789 opened: [juju authorised-keys add] cannot specify a key file <juju-core:New> <https://launchpad.net/bugs/1490789>
<wallyworld> thumper: what's your fav flag design?
 * thumper looks for a copy
<anastasiamac> wallyworld: with stripes?...
<wallyworld> they've narrowed down to 4
<thumper> http://www.stuff.co.nz/national/71624100/live-final-four-nz-flag-designs-unveiled
<thumper> top left
<thumper> black and blue
<thumper> second favourite is bottom right
<wallyworld> me too
<thumper> red and blue
<wallyworld> i would have preferred southern cross on all black and white
<mwhudson> obviously they should have chosen http://fm.cnbc.com/applications/cnbc.com/resources/img/editorial/2015/05/15/102682735-3127-flag.530x298.png?v=1431699254
<wallyworld> i wish we would change ours :-(
<thumper> someone did a variant of the top left with the black and blue inverted
<thumper> mwhudson: :)
<mwhudson> or nyan kiwi
<mwhudson> https://www.govt.nz/browse/engaging-with-government/the-nz-flag-your-chance-to-decide/gallery/design/2811
<anastasiamac> i like monochromatic fern
<thumper> mwhudson: I actually like the new design (top left one)
<mwhudson> thumper: it's ok i guess
<mwhudson> not a subject i am really capable of getting excited over :-)
<mwhudson> i don't like the way the fern is drawn i think, although i'm not sure what is wrong
<mwhudson> too chunky maybe
<thumper> personally I'm in favour of changing the flag, but I believe that the government is trying to hide a lot of shit by saying, "hey look over there, new flag"
<mwhudson> yeah definitely
<mwhudson> wallyworld: your country's politics is pretty surreal at the moment
<wallyworld> indeed :-(
<wallyworld> nz has their shit together on so many things
<wallyworld> we are a basket case
<thumper> pfft
<thumper> I don't follow ozzie politics much
<thumper> but I'm not sure I'd say NZ has their shit together
<thumper> although to be honest
<thumper> pretty much any country I look at more than superficially seems screwed in some way
<thumper> why does humanity suck so much?
<anastasiamac> thumper: careful - u r aproaching wisdom
<anastasiamac> *pp
<thumper> I'll go back to being superficial
<davechen1y> thumper: http://reviews.vapour.ws/r/2474/
<davechen1y> got all the kinks out of it now
<thumper> davechen1y: lgtm, if you mark the last issue resolved, then you have the shipit from axw
 * thumper heads off to code craft
<davechen1y> ta
<davechen1y> fixed
<wallyworld> axw: school pickup time, if you get a chance maybe you could look at https://github.com/juju/juju/pull/3168 for me? otherwise i'll ping one of the emerald guys later
<axw> wallyworld: sure
<wallyworld> ty, i'll look at yours when i get back
<wallyworld> jam: hiya, do you think the resources spec is ready to distribute more widely? maybe take another look and get back to me?
<axw> wallyworld: if you have time, would you mind reviewing http://reviews.vapour.ws/r/2530/diff/# also? it's a prereq of the other one
<wallyworld> sure
<wallyworld> axw: have to go to dinner, will finish review in a bit
<axw> wallyworld: no worries, it can wait till tomorrow
<axw> or OCR, whichever comes first
<mup> Bug #1490075 changed: juju use lxcbr0 rather than juju-r0 <juju-core:Invalid> <https://launchpad.net/bugs/1490075>
<mup> Bug #1490865 opened: destroy-environment on an unbootstrapped MAAS environment can release all my nodes <juju-core:New> <https://launchpad.net/bugs/1490865>
<wallyworld> axw: reviewed, one main concern to consider
<axw> wallyworld: thanks
<wallyworld> np
<axw> wallyworld: if you're still there, please see my replies before I merge
<wallyworld> sure
<wallyworld> axw: with the order of operations - we ran into the issue with service status. the status was retrieved first and then the service wasn't there
<wallyworld> i'm inclined to prefer the parent doc exist first
<wallyworld> the megawatcher was the component affected
<wallyworld> it was watching status and then got into trouble when it tried to get the service
<axw> wallyworld: then we need to do it for all other entities too. machine, unit, etc. I'm pretty sure we don't want status docs floating around if the entity we create them for doesn't get created though.
<wallyworld> my suggestion is to create the entity doc first
<wallyworld> hence status won't be floating around
<axw> wallyworld: in which case you can find the entity and not the status...
<wallyworld> correct
<wallyworld> but that i think is the lesser of 2 evils
<wallyworld> as we found with megawatcher
<wallyworld> consider the case of watching the status - we get notified there's a status update and then find no parent doc
<wallyworld> that i think is a more common case
<wallyworld> or at least it has bitten us that way
<axw> wallyworld: I think we should just make the megawatcher resilient to that, rather than making *everything* else prone to errors due to missing prereq docs
<wallyworld> i'm surprised unit machine etc are the "wrong" way
<wallyworld> i would argue the parent entity is the prereq :-)
<wallyworld> you can't have an arm without a body
<wallyworld> in any case, the megawatcher was made resilient before the root cause was found
<wallyworld> i guess if unit, machine etc are one way, then we could be consistent
<wallyworld> but i disagree with the order nonetheless
<axw> wallyworld: ok, so, say I did make this change, and a failure occurs part way through creating a filesystem and we get a filesystem but no status. we restart, find we have the filesystem, then proceed to try to provision... update status and the agent will explode because the status doc is missing
<axw> wallyworld: this is one example, there will be many more. how would we deal with that?
<wallyworld> yeah, there's failure modes in either case :-(
<wallyworld> leave it as is i guess
<axw> my point is that status->parent is pretty well secluded to the megawatcher
<fwereade> wallyworld, the motivation for the current approach is that "if you have an X, you should be able to find X's data, and you should be able to treat lack of X data as a guarantee of lack of X"
<wallyworld> at least it will be consistent
<wallyworld> fwereade: but if you have X's data, you should be able to find X
<wallyworld> as we found out with megawatcher and service status
<wallyworld> if I have an arm, i should be able to find the body it's attached to
<fwereade> wallyworld, right, and if we had atomic/isolated txns we could do that, but we don't
<wallyworld> indeed :-(
<fwereade> wallyworld, neither a body without an arm nor an arm without a body make much sense
<wallyworld> fwereade: so service status i think was or is inverted in order then
<wallyworld> to remove the megawatcher race
<fwereade> wallyworld, it *should* come into existence before the service, and be removed afterwards
<wallyworld> a body without an arm makes more sense
<wallyworld> a tyre without a car makes less sense than a car without a tyre
<wallyworld> but if we are to be consistent, then i'll defer to the current practice
<wallyworld> i just with mgo were Atomic :-(
<wallyworld> wish
<fwereade> wallyworld, right, but I will go insane if all the state code I write means I have to figure out what a missing satellite doc means
<wallyworld> and i could argue the other way too :-)
<wallyworld> but, i'll defer to the current implementation for most things
<wallyworld> at least we have a shared understanding that there's a potential issue
<fwereade> wallyworld, yeah, in a greenfield scenario we could have an interesting discussion about which way hurts less
<wallyworld> axw: so ship it as is
<fwereade> wallyworld, I *would* point out that the megawatcher is not an especially sane piece of code, though, and "it's hard to make it work with the megawatcher" is not a very strong argument against a given practice
<fwereade> wallyworld, http://thecodelesscode.com/case/97 :)
<wallyworld> fwereade: with megawatcher, i was referring more to the order of access used
<wallyworld> which i think is a valid use case
<fwereade> wallyworld, an approach that wasn't sickeningly codependent with the storage details would work fine
<fwereade> wallyworld, the existence of an X implies the existence of all X's satellite data
<fwereade> wallyworld, so you *watch for Xs* and then pick up the rest of their data when you need it
<wallyworld> fwereade: we are moving to mongo 3.0 (or planning to) and that has proper acid under the covers, so we will get there soon enough :-)
<fwereade> wallyworld, right, all we have to do is rewrite the storage backend again, it'll be easy
<wallyworld> yeah :-D
<axw> hoho
<wallyworld> i should have said "soon"
<wallyworld> a guy can dream
<wallyworld> one day Halle Berry will return my calls
<fwereade> wallyworld, I know, and you're not as wrong as I'm making out :)
<fwereade> wallyworld, it'll certainly be *easier*
<wallyworld> indeed
<fwereade> wallyworld, and the uncommitted state stuff will hopefully help there
<wallyworld> yes
<fwereade> wallyworld, did you get a chance to read that wiki doc btw?
<wallyworld> fwereade: i wish :-( on my todo list, will get there tomorrow
<wallyworld> or tonight after a drink
<fwereade> wallyworld, no worries, but I think we can probably have interesting discussions about it
<wallyworld> fwereade: if nothing else, i plan on drinking way too much with you in seattle and talking there :-D
<fwereade> wallyworld, sgtm :)
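[editor's note] The ordering fwereade describes (satellite data such as status comes into existence before its parent and is removed after it, so that having an X guarantees X's data can be found) can be sketched as follows. This is a minimal illustration only, assuming a hypothetical in-memory `store` type with two maps standing in for collections that cannot be updated atomically together; none of these names come from the juju state package.

```go
package main

import (
	"errors"
	"fmt"
)

// store is a hypothetical stand-in for two collections that cannot be
// written in one atomic transaction.
type store struct {
	services map[string]bool   // parent docs ("X")
	statuses map[string]string // satellite docs ("X's data")
}

func newStore() *store {
	return &store{services: map[string]bool{}, statuses: map[string]string{}}
}

// addService writes the satellite status first, then the service itself.
// A failure between the two steps leaves an orphan status, never a
// service without a status.
func (s *store) addService(name, status string) {
	s.statuses[name] = status // step 1: satellite
	s.services[name] = true   // step 2: parent
}

// removeService deletes in the opposite order: parent first, satellite last.
func (s *store) removeService(name string) {
	delete(s.services, name)
	delete(s.statuses, name)
}

// serviceStatus watches for the parent and only then picks up its data,
// relying on the invariant that an existing service has a status.
func (s *store) serviceStatus(name string) (string, error) {
	if !s.services[name] {
		return "", errors.New("service not found: " + name)
	}
	status, ok := s.statuses[name]
	if !ok {
		return "", errors.New("invariant broken: no status for " + name)
	}
	return status, nil
}

func main() {
	s := newStore()
	s.addService("mysql", "pending")
	fmt.Println(s.serviceStatus("mysql"))

	// Simulate a failure halfway through creation: only the satellite exists.
	s.statuses["wordpress"] = "pending"
	fmt.Println(s.serviceStatus("wordpress"))
}
```

The trade-off wallyworld raises still applies: with this ordering an orphan status can exist without its service, so any code that starts from the satellite document has to tolerate a missing parent.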
<thomnico> hello all (I'm new to go and dev in juju please be gentle)
<thomnico> I would like to get a config parameter in juju/container/image.go
<thomnico> like http-proxy for expl..
<thomnico> can someone provide me an example of that ?? I got lost with the config / environ etc.. options
<mup> Bug #1490947 opened: Unable to add manually provisioned machine previously destroyed <juju-core:New> <https://launchpad.net/bugs/1490947>
<perrito666> cherylj: congratulations :)
<perrito666> katco: why would you cancel tanzanite standup?
<katco> perrito666: eh?
<perrito666> I just got an email from gcal stating that you cancelled tanzanite standup, forever
<katco> perrito666: uh... can't be a coincidence, i'm trying to get kmail to work with my calendar. i hope it's not doing something completely stupid
<katco> perrito666: i just refreshed my canonical calendar...
<perrito666> well, seems kmail got fun
<perrito666> katco: trying the new kde?
<katco> perrito666: no, i use i3
<mgz> perrito666: you died... ;_;
<katco> perrito666: trying kmail because i can't find a good linux mail client. and i especially can't find one that supports caldav. looks like kmail is not an exception >.<
<mup> Bug #1490977 opened: juju run moves errored units state to pending <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1490977>
<TheMue> katco: I now simply use inbox in a browser, at least for our canonical mail
<katco> TheMue: i have 3 inboxes to manage... don't like keeping 3 tabs open 24/7
<TheMue> katco: my standard mailer can't be told here  *cough* *cough* Apple Mail *cough*
<TheMue> katco: yeah, multi account is always nice. once tried thunderbird and quickly shrugged
<katco> TheMue: Thunderbird is terribly slow for me
<TheMue> katco: I had troubles with the UI, not very intuitive
<katco> TheMue: kmail is working ok. pretty speedy, but UI is not intuitive in some places.
<cherylj> thanks, perrito666 :)
<TheMue> cherylj: also from my side
<cherylj> thanks, TheMue :)
<anastasiamac> katco: u did not like tanzanite standup?... at all?... :(
<katco> anastasiamac: i'm sorry :( see above chat... my calendar software decided to be stupid
<perrito666> katco: i3? why you hate eye candy so much? :p
<katco> perrito666: i love eye candy :) i love tiling more
<anastasiamac> katco: seen and m sympathetic but will have fun in the morning :D it's almost midnight
<katco> anastasiamac: haha ok... go to bed!
<anastasiamac> katco: :D but... but... it's the best time to work - everyone else is finally in bed
<katco> lol
<perrito666> anastasiamac: amen
<perrito666> but if your husband is anything like my wife you will eventually hear a "wtf are you doing typing in the middle of the night"
<anastasiamac> perrito666: he is in IT as well... he is typing in his office too  now :)
<perrito666> anastasiamac: glorious
<perrito666> the good thing about night habits is that they get you things if your SO is not a night person
<perrito666> last thing I heard is  "would you please get a new kindle with light? your bedside light is not letting me sleep"
<anastasiamac> perrito666: :D better new kindle than separate rooms :)
<perrito666> anastasiamac: well the only other room of the house is my office :p
<perrito666> new house got an office at the opposite end of the bedroom so I can be noisy late
<anastasiamac> perrito666: that's something to look forward to :D
<perrito666> who has rb and github superpowers?
<perrito666> I need to close a couple of prs
<mgz> bogdanteleaga: looks like juju/testing was broken by the last commit on mgo.v2
<mgz> bogdanteleaga: two ways forward, fix that properly, or add a dependencies.tsv for juju/testing that specifies a working revision of mgo.v2
<bogdanteleaga> mgz: well, we can't set deps.tsv separately afaik, I think we can just wait for an upstream fix since this isn't that much of a priority now
<mgz> bogdanteleaga: I am doing it now.
<mgz> we just need a dependencies.tsv file specific for juju/testing
<mgz> like a subset of the juju/juju one
<mgz> bogdanteleaga: done it, works
<mgz> bogdanteleaga: I'll get my branch reviewed and land
<bogdanteleaga> mgz: hmm, this still doesn't seem like a good approach tho
<bogdanteleaga> mgz: it does solve this problem, but it feels really sketchy to have different dependencies for the juju/* stuff that's not core
<mgz> well, it would be nice if the tip of all branches a project depends on never breaks
<mgz> but that's what happens to us in practice
<bogdanteleaga> :)
<bogdanteleaga> wait don't we already use the deps.tsv from core?
<bogdanteleaga> I think that's what's getting me confused
<mgz> only juju/juju does, the other projects at least pretend at being independent
<bogdanteleaga> oh, so the rest build with tip
<bogdanteleaga> ok
<bogdanteleaga> then it definitely makes sense
<bogdanteleaga> if you could find a way to make it pull the core one it would be great
<mgz> the problem with just using the core one is we actually don't want to pull in the mountains of stuff that does
<mgz> and we still have the issue that changes on another branch can break us, if juju/juju deps get updated
<ericsnow> katco: are we still targeting 1.25 for #1478156
<mup> Bug #1478156: tabular format does not give enough details about machine provisioning errors <charmers> <juju-core:Triaged> <https://launchpad.net/bugs/1478156>
 * marcoceppi hopes
<ericsnow> :)
<katco> ericsnow: nope; feature freeze for 1.25.0 has passed. 1.25.1
<ericsnow> katco: right, I meant 1.25 :)
<katco> ericsnow: just updated the bug
<ericsnow> katco: thanks
<katco> ericsnow: ty for asking
<bogdanteleaga> mgz: well yeah, if they try to be independent it does make sense to have different ones
<bogdanteleaga> mgz: you could just take what you need from core tho
<natefinch> mgz: have you talked to gustavo about the mgo breakage?
<mgz> natefinch: I commented on the mp that broke the test suite
<mgz> natefinch: https://github.com/go-mgo/mgo/issues/153 if you are curious
<natefinch> mgz: thanks.
<mgz> bogdanteleaga: merged
<natefinch> niemeyer3: have you seen the update about our tests failing because of that IPv4-only change to mgo? ^^
<niemeyer3> natefinch: Have you seen the update on the update? :)
<bogdanteleaga> mgz: yay thanks
<natefinch> niemeyer3: have now.  Thanks! :)
<natefinch> mgz: ^
<niemeyer3> np, sorry for the trouble
<natefinch> mgz: sounds like we can avoid putting a dependencies.tsv in juju/testing, which would be my preference
<mgz> natefinch: I got around it by just putting it in the job
<mgz> natefinch: we're going to have to have one for charm though I think
<natefinch> mgz: more than one dependencies.tsv in a build is asking for trouble.
<mgz> yup, but I think the projects can just ignore any in their dependencies
<bogdanteleaga> can somebody close https://github.com/juju/juju/pull/2951?
<bogdanteleaga> I pulled it in another PR to update dependencies and run gofmt
<natefinch> mgz: not really a fan of packages having their own dependencies.tsv ... why is charm going to need it?  Just for mgo?
<bogdanteleaga> also http://reviews.vapour.ws/r/2534/ and http://reviews.vapour.ws/r/2533/. They're closed on github but reviewboard didn't pick it up
<mgz> natefinch: charm has various vN branches, the tips of which are not compatible with the tips of all the branches they use
<mgz> natefinch: http://paste.ubuntu.com/12246351/
<natefinch> mgz: why not?  it should be referencing only things which are stable and/or using gopkg.in for versioning... otherwise we're doing something wrong.
<natefinch> mgz: that looks like we're doing something wrong :/
<katco> ericsnow: wwitzel3: natefinch: has anyone updated the wpm spec yet?
<ericsnow> katco: I haven't
<ericsnow> katco: I'll do it now
<katco> ericsnow: ty
<wwitzel3> thanks ericsnow :)
 * ericsnow cringes every time he sees "whit has quit"
<lazyPower> Are any core devs that are familiar with proxy settings available to help troubleshoot an issue w/ an orangebox in #juju? kwmonroe and I are starting to come up dry as this is one of those grey areas we dont tread in much
<ericsnow> katco: did we settle on what "juju status" is supposed to report for workloads?  just a count?  IDs only?
<katco> ericsnow: no, just that it would be a 1-line representation
<ericsnow> katco: k
<ericsnow> katco: 1 line per workload or 1 line per unit?
<katco> ericsnow: 1 per workload, right? because it's workload status, not unit status
<ericsnow> katco: k
<katco> ericsnow: lmk when you're finished
<ericsnow> katco: k
<perrito666> mgz: still around?
<ericsnow> katco, natefinch, wwitzel3: could you double-check my updates to the spec?  in particular check my changes relative to status
<wwitzel3> ericsnow: ok, I'll give it a read
<katco> ericsnow: i'm tal, but i'd like wwitzel3 to look at it as well since he had a hand in drafting it
<ericsnow> wwitzel3, katco: thanks
<katco> ericsnow: in the yaml, do we really need the top-level workloads key?
<ericsnow> katco: probably not
<katco> wwitzel3: here
<katco> wwitzel3: https://docs.google.com/document/d/1PcRQXaerlsACro4y1y5LWD-uvhfHya2CkOcoljyFyCU/edit#heading=h.hvw0kpxrzscu
<natefinch> ericsnow, katco: keep in mind that the yaml has to be produced by code, which has to handle collisions etc.  The reason we have components:workloads:<workload> is because components is a map of all the component names (so you don't have collisions between component names and other stuff in that part of the status), and then workloads as the component name to keep the workload data separate from other component data
<natefinch> ericsnow, katco:  Although... I guess if we're switching this to juju ps, then all of that is irrelevant
<ericsnow> natefinch: we're not switching all of it
<wwitzel3> we aren't?
<natefinch> ericsnow: oh, I thought we were totally yanking workloads out of status
<ericsnow> natefinch: if we are then it certainly is simpler :)
<wwitzel3> ericsnow: it looks good to me, but I was always under the impression that all status stuff was moving under ps
<ericsnow> wwitzel3: that's fine with me :)
<katco> ericsnow: wwitzel3: i can go with that
<ericsnow> katco, wwitzel3: I'll update the spec
<natefinch> ericsnow: do we plan on supporting executable plugins in production in the future?
<ericsnow> natefinch: no immediate plans, but I'm hopeful
<mup> Bug #1491132 opened: Failing ec2 tests <juju-core:New> <https://launchpad.net/bugs/1491132>
<natefinch> ericsnow: should we change the name of the doc?
<ericsnow> natefinch: nice catch :)
<ericsnow> natefinch: done
<natefinch> If you call 72 point font a nice catch, sure ;)
<natefinch> ericsnow: should we remove all references to plugins for now?.  I think communication on that point has been rather clear and decisive... which doesn't mean we can't bark up that tree another day.
<ericsnow> natefinch: I thought I had moved all those references down to an "open questions" section
<natefinch> ericsnow: ahh, yes, sorry, the section title had scrolled off the top of the screen :D
<natefinch> ericsnow: there are some comments that still use the word plugin though
<ericsnow> natefinch: I did not edit any comments (nor planned on doing so)
<natefinch> ericsnow: *nod*, just trying to avoid any misunderstanding due to the use of "old" terminology.
<ericsnow> natefinch: np
<menn0> thumper: here's the PR that was almost done: http://reviews.vapour.ws/r/2552/
 * thumper finally gets to look at email
<wallyworld> axw: anastasiamac: perrito666: our standup event is gone for some reason, i'll create a new one
<perrito666> yes please
<axw> wallyworld: katco cancelled it... accident I guess
<wallyworld> created
 * wallyworld makes a note to cancel katco's meetings in revenge
<axw> hehe
<wallyworld> axw: can you get in?
<katco> axw: wallyworld: hey i'm so sorry... i tried to refresh my calendars in kmail and for some unknown reason it cancelled the tanzanite standup. i have no earthly idea why
<wallyworld> katco: sure, it was all google's fault
<katco> wallyworld: kmail!
<katco> wallyworld: didn't we just talk yesterday about how they all suck? ;p
<wallyworld> katco: yes we did
<wallyworld> perrito666: re: rpc and json marshalling of nested map
<wallyworld> see bug 1486254
<mup> Bug #1486254: Raw Go errors reported to users <jes> <ui> <juju-core:Triaged by menno.smits> <https://launchpad.net/bugs/1486254>
<wallyworld> the last comment
<wallyworld> looks like there's some potential overlap there
<wallyworld> perrito666: you may want to just deal with simple maps initially
<perrito666> wallyworld: key=string/int ?
<wallyworld> perrito666: yeah, that status data was i think only intended to be simple string->string
<wallyworld> or string->primitive type
<perrito666> well that is there then, going with that :)
<wallyworld> yep
<wallyworld> no need to over complicate it
#juju-dev 2015-09-02
<perrito666> bbl dinner
<wallyworld> axw: i've updated the pr when you are back
<anastasiamac> I am OCR next Tuesday but am on holidays - anyone keen to swap? :D
<axw> wallyworld: shipit
<wallyworld> ty
<axw> wallyworld: I think we should eventually tease out those model representations from api/uniter, and move them into the worker/uniter package itself, wrapping a flat API
<axw> wallyworld: then we can think about auto-generating the API client code
<axw> not today obviously, just thinking about things we can do to improve quality in future
<wallyworld> yeah, there's room for improvement in that area. we need to rethink a few of our initial design decisions
<wallyworld> not that we are hitting these pain points
<wallyworld> now
<axw> wallyworld: well we are with testing. and writing API clients isn't *hard* but it's more work than necessary
<axw> time better spent elsewhere
<wallyworld> yep
<davecheney> axw: I have a question about tools storage
<davecheney> tools_test.go:331: c.Assert(url, gc.Equals, "https://0.1.2.3:1234/environment/my-uuid/tools/"+version.Current.String())
<davecheney> ... obtained string = "https://0.1.2.3:1234/environment/my-uuid/tools/1.26-alpha1-trusty-arm64"
<davecheney> ... expected string = "https://0.1.2.3:1234/environment/my-uuid/tools/1.26-alpha1"
<davecheney> i've been tinkering with something and have a test failure
<davecheney> but I think the expected message is wrong
<davecheney> why would the expected want tools without a series and an arch ?
<axw> davecheney: sec, finding the code to refresh memory
<davecheney> apiserver/common
<axw> davecheney: yeah, it should be the binary string... version.Current is a version.Binary in master, did you change its type?
<davecheney> i did
<axw> davecheney: show me your diff?
<davecheney> it's very large
<davecheney> so when you say binary string
<davecheney> it should be the output of version.Binary.String() ?
<axw> davecheney: yep
<davecheney> m'kay
<davecheney> thanks
<thumper> menn0: got time to talk migrations?
<menn0> thumper: yep
<davecheney> axw: i see the problem
<davecheney> axw: https://github.com/juju/juju/compare/master...davecheney:remove-version-arch?expand=1
<davecheney> if you want to see the whole thing
 * thumper peeks
<davecheney> axw: the error message above is back to front
<davecheney> the expected is being constructed incorrectly ...
<axw> davecheney: I think you just want to s/version.Current/current/ on the last line of TestToolsURLGetter
<axw> davecheney: the URL returned *should* include arch/series
<axw> it's a URL to download a specific tools archive
<davecheney> yeah, i understand what is going on now
<davecheney> thanks
<axw> nps
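[editor's note] To illustrate axw's point that the tools URL should end in the full binary string (number, series and arch), here is a small self-contained sketch. The `Binary` type and `toolsURL` helper below are simplified, hypothetical stand-ins written for this note; they are not the real juju `version` package or the code under test in apiserver/common.

```go
package main

import "fmt"

// Binary is a simplified, hypothetical stand-in for a tools version:
// a version number plus the series and architecture it was built for.
type Binary struct {
	Number string
	Series string
	Arch   string
}

// String renders the full binary string as number-series-arch.
func (b Binary) String() string {
	return fmt.Sprintf("%s-%s-%s", b.Number, b.Series, b.Arch)
}

// toolsURL builds a download URL for one specific tools archive, which is
// why the series and arch have to be part of the final path element.
func toolsURL(base, envUUID string, b Binary) string {
	return fmt.Sprintf("%s/environment/%s/tools/%s", base, envUUID, b)
}

func main() {
	current := Binary{Number: "1.26-alpha1", Series: "trusty", Arch: "arm64"}
	fmt.Println(toolsURL("https://0.1.2.3:1234", "my-uuid", current))
}
```

With the values from the failing test this reproduces the "obtained" string, which is consistent with the conclusion that the expected value in the test was the one built incorrectly.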
<mup> Bug #1491226 opened: juju status hangs at 'connection established to' until jujud-machine-0 restarted <juju-core:New> <https://launchpad.net/bugs/1491226>
<thumper> I'm just popping out again to take sad daughter with sadder laptop back down to repair person
<thumper> bbs
<thumper> back
<thumper> dying hard disk
<thumper> there goes another $150
<thumper> boo
<davecheney> thumper: version.Binary.OS is _always_ derived from version.Current.Series ...
<davecheney> we don't need to put that on the struct
<davecheney> we can recover it with a function
<thumper> right, I don't know who did, or why exactly
<thumper> perhaps look for the commit that added it and look at the review?
<thumper> I'm guessing it was added as a "helper"
<thumper> where a plain function would have been better
<davecheney> having it on the value does have one advantage
<davecheney> it's always known to be valid
<davecheney> recovering it from version.Current.Series when needed does mean you have to handle the error there
<davecheney> that doesn't look like too much of a problem
<davecheney> and that means version.Binary goes back to being just Number, Series and Arch
<davecheney> i think it's worth the cost
<davecheney> _especially_ as most of the callers that use version.Binary.OS, are actually asking ... is this windows
<davecheney> is this centos
<davecheney> they don't want to switch on the version
<davecheney> just check if they should enable something or not
<davecheney> I have not found one case where we overwrite version.Current.OS
<davecheney> I think i'll move the OS stuff out of version/ into juju/os
<davecheney> like we have juju/arch
<thumper> sounds good to me
<thumper> also good to have explicit functions for checking OS
<thumper> I'm getting called for dinner
<thumper> chat tomorrow
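[editor's note] The idea discussed above (recover the OS from the series with a function instead of storing it on the struct, and give callers explicit checks such as "is this windows") might look roughly like this. Everything here is an assumption made for illustration: the `OSType` constants, the `seriesOS` table, its series names, and the helper names are hypothetical and are not the API of the juju/os package davecheney proposes.

```go
package main

import "fmt"

// OSType is a hypothetical enumeration of operating systems.
type OSType int

const (
	Unknown OSType = iota
	Ubuntu
	Windows
	CentOS
)

// seriesOS is an illustrative, deliberately incomplete lookup table.
var seriesOS = map[string]OSType{
	"precise":   Ubuntu,
	"trusty":    Ubuntu,
	"win2012r2": Windows,
	"centos7":   CentOS,
}

// osFromSeries derives the OS from a series name. Unlike a struct field
// that is always valid, the caller has to handle the unknown-series error.
func osFromSeries(series string) (OSType, error) {
	t, ok := seriesOS[series]
	if !ok {
		return Unknown, fmt.Errorf("unknown series %q", series)
	}
	return t, nil
}

// isWindows is the kind of explicit check most callers want; an unknown
// series is treated as "not windows".
func isWindows(series string) bool {
	t, err := osFromSeries(series)
	return err == nil && t == Windows
}

func main() {
	fmt.Println(isWindows("win2012r2"), isWindows("trusty"))
	_, err := osFromSeries("no-such-series")
	fmt.Println(err)
}
```

This shows the cost davecheney mentions: the error has to be handled at the point of use, in exchange for the version value going back to just number, series and arch.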
<mup> Bug #1491353 opened: workers ignore stop channel <juju-core:New> <https://launchpad.net/bugs/1491353>
<mgz> bogdanteleaga: I have some bugs for you
<mup> Bug #1491398 opened: RebootSuite test failures on windows <ci> <regression> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1491398>
<mup> Bug #1491399 opened: TestInstallMongodServiceExists fails on precise <ci> <precise> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1491399>
<niemeyer_> natefinch, mgz: Can you please see if the IPv6 test is now working again?
<mgz> niemeyer_: sure thing
<niemeyer_> mgz: Thanks!
<wwitzel3> ericsnow: what is the state of the workload plugins to libs?
<mgz> niemeyer_: all good, <http://paste.ubuntu.com/12253283>
<wwitzel3> ericsnow: want me to pick it up? my resolve storage variables is being blocked by it
<ericsnow> wwitzel3: basically all done
<niemeyer_> mgz: Sweet
<ericsnow> wwitzel3: I split it up; first: http://reviews.vapour.ws/r/2541/
<niemeyer_> mgz: Still need to do more for the full IPv6 address resolution story, but at least addresses are working properly again
<mup> Bug #1491451 opened: [Juju 2.x] Remove command aliases <juju-core:New> <https://launchpad.net/bugs/1491451>
<katco> ericsnow: wwitzel3: natefinch: new card on the board for removing the cloudsigma feature flag. if one of you could pick that up in the next few days, that'd be great
<katco> ericsnow: wwitzel3 natefinch why is "massachusetts names" at the top of the spec?
<natefinch> lol
<natefinch> mistype sorry
<wwitzel3> hah
<bogdanteleaga> can someone take a look at this? http://reviews.vapour.ws/r/2557/
<benji> k/quit
<mup> Bug #1491547 opened: [upgrade-juju] Poor user experience <juju-core:New> <https://launchpad.net/bugs/1491547>
<mup> Bug #1491547 changed: [upgrade-juju] Poor user experience <juju-core:New> <https://launchpad.net/bugs/1491547>
<alexisb> cherylj, ping
 * alexisb looks at the calendar and sees cherylj is on vacation
<mup> Bug #1491547 opened: [upgrade-juju] Poor user experience <docteam> <juju-core:New> <https://launchpad.net/bugs/1491547>
<lazyPower> Greetings core hackers o/ I've run into an interesting scenario that appears to be a dupe of this bug: https://bugs.launchpad.net/juju-core/+bug/1416928 - juju has picked the wrong interface to attempt to communicate with the state server
<mup> Bug #1416928: juju agent using lxcbr0 address as apiaddress instead of juju-br0 breaks agents <api> <lxc> <network> <juju-core:Fix Released by dooferlad> <juju-core 1.21:Fix Released by dimitern> <juju-core 1.22:Fix Released by dooferlad> <https://launchpad.net/bugs/1416928>
<lazyPower> in this particular case its the docker0 bridge interface, and none of the LXC interfaces. Is there a suggested work around?
<natefinch> lazyPower: I presume you're on a version of juju that has the fixes to that bug?
<mbruzek> natefinch: it is a customer who has encountered this and they are at 1.24.5.1
<lazyPower> natefinch: 1.25.1 to be exact.
<lazyPower> er
<lazyPower> sorry mbruzek is correct. 1.24.5.1
<natefinch> *nod*
<mbruzek> natefinch http://paste.ubuntu.com/12256492/
<mbruzek> natefinch: I can see Juju dialing "wss://192.168.122.1:17070/environment
<mbruzek> natefinch: then I see Juju trying to download https://172.17.42.1:17070/environment
<natefinch> mbruzek, lazyPower:  yeah, looking at the fix, the bugfix was very specifically targeted at lxc, so it wouldn't filter out other bridge adapters, like docker's.
<lazyPower> natefinch: i can file a follow up bug if thats helpful
<lazyPower> we have enough log output and i see what happened in the deployment well enough to outline whats happened.
<mbruzek> natefinch: is that the right bug 1416928 ?
<mup> Bug #1416928: juju agent using lxcbr0 address as apiaddress instead of juju-br0 breaks agents <api> <lxc> <network> <juju-core:Fix Released by dooferlad> <juju-core 1.21:Fix Released by dimitern> <juju-core 1.22:Fix Released by dooferlad> <https://launchpad.net/bugs/1416928>
<mbruzek> Can I have them watch that one for a fix?  Or where?
<natefinch> lazyPower: yes please.  I'm sure it's the exact same thing, just the fix explicitly just uses lxc-net to find the lxc bridge
<natefinch> mbruzek: watch the new one lazyPower makes.  And lazyPower, please link to that lxc bug.  Hopefully we can find a more general solution that won't break whenever something else adds its own bridge
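[editor's note] One shape a more general fix could take is to filter candidate addresses by interface name using a configurable exclusion set, rather than special-casing the lxc bridge. The sketch below only illustrates that idea: the `ifaceAddr` type, the `filterBridgeAddrs` helper, and the example addresses for eth0 and lxcbr0 are hypothetical, and this is not how juju actually selects its API address.

```go
package main

import "fmt"

// ifaceAddr pairs an address with the (hypothetical) interface it was found on.
type ifaceAddr struct {
	Iface string
	Addr  string
}

// filterBridgeAddrs drops every address that belongs to an excluded
// interface and returns the remaining addresses in their original order.
func filterBridgeAddrs(addrs []ifaceAddr, excluded map[string]bool) []string {
	var out []string
	for _, a := range addrs {
		if excluded[a.Iface] {
			continue // skip local-only bridges such as lxcbr0 or docker0
		}
		out = append(out, a.Addr)
	}
	return out
}

func main() {
	// The exclusion set is data, so a new bridge only needs a new entry.
	excluded := map[string]bool{"lxcbr0": true, "docker0": true}
	addrs := []ifaceAddr{
		{Iface: "eth0", Addr: "10.0.0.5"},
		{Iface: "lxcbr0", Addr: "10.0.3.1"},
		{Iface: "docker0", Addr: "172.17.42.1"},
	}
	fmt.Println(filterBridgeAddrs(addrs, excluded))
}
```

A name-based list still has to be maintained by hand, so it only partly answers natefinch's wish for something that does not break when another tool adds its own bridge.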
<mup> Bug #1491578 opened: [wily] When adding a machine, can't find the juju-local metapackage  <juju-core:New> <https://launchpad.net/bugs/1491578>
<natefinch> katco: I just reread https://bugs.launchpad.net/juju-core/+bug/1486553 and only now noticed the linked bug at the bottom of the description that is a private bug.... it seems like there's no actual juju-core bug to reproduce.  Yes, there's a network timeout when running deploy, but that's not a bug, that's the network being the network.
<mup> Bug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1486553>
<natefinch> katco: It seems like that bug is "theoretically, something bad could happen" not "something bad has actually happened"
<katco> natefinch: we have customer sites that this is affecting
<natefinch> katco: nothing in the bug says what problems they're having, other than a network timeout.... after which they look at the environment and the deploy succeeded.
<katco> natefinch: let me find someone you can talk to
<natefinch> katco: that would be helpful :)
<katco> natefinch: talk to landscape/beret (dean)
<natefinch> Beret: I'm looking into https://bugs.launchpad.net/juju-core/+bug/1486553   hoping you can help me understand how core can help.
<mup> Bug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1486553>
<Beret> natefinch, bad stuff does indeed happen
<Beret> and happens often
<Beret> ahasenack, around?
<Beret> natefinch, did you see Dimiter's comment?
<natefinch> Beret: yes, but I'm not sure if that's the specific problem the customers are seeing.
<dpb1_> natefinch: wdyn
<natefinch> dpb1_: the bug https://bugs.launchpad.net/juju-core/+bug/1486553 doesn't tell me what specific problem the customer is seeing, other than a network timeout, which is sorta out of my jurisdiction.
<mup> Bug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1486553>
<ahasenack> natefinch: we had two occurrences related to that timeout
<ahasenack> natefinch: a) juju deploy returned a timeout, we retried it, it then said the service already exists
<mup> Bug #1491578 changed: [wily] juju add-machine, can't find the juju-local metapackage <juju-core:New> <https://launchpad.net/bugs/1491578>
<ahasenack> natefinch: b) juju deploy returned a timeout, service was deployed but without the configuration we asked  (just its defaults)
<ahasenack> natefinch: I'm a bit less clued in the (b) case, I know what was reported in the bug in that case
<ahasenack> natefinch: but about (a), we see that rather often, as in a few times per week
<ahasenack> natefinch: we found out the (b) case when we started handling the "service already exists" error for when we retry. We assumed it was all good, but later found out the service config options were not what we set
<natefinch> ahasenack: b) is much more concerning to me.  a) seems like the price we pay for an unreliable network... though perhaps there's some bug causing the timeout.  b) is definitely just a bug that should be fixed (though I'm sure it'll be tricky).
<ahasenack> natefinch: it's tricky to workaround as well, we would have to remove the service and deploy again to be sure
<ahasenack> natefinch: it's when we hit (b) that the bug got escalated, since we have a workaround in place for (a)
<ahasenack> "if it says service is already deployed, good, let's just move on"
<ahasenack> natefinch: iirc the timeout was to localhost,
<ahasenack> natefinch: not between us (client) and state server
<ahasenack> natefinch: gotta run now
<mup> Bug #1491578 opened: [wily] When adding a machine, can't find the juju-local metapackage  <juju-core:New> <https://launchpad.net/bugs/1491578>
<natefinch> a timeout to localhost is pretty bogus, for sure
<ahasenack> back
<ahasenack> natefinch:
<ahasenack> Aug 7 20:45:06 job-handler-1 INFO Traceback (failure with no frames): <class 'canonical.juju.errors.RequestError'>: cannot add service "mongodb": read tcp 127.0.0.1:37017: i/o timeout
<ahasenack> that's what I meant with timeout to localhost
<ahasenack> job-handler is our juju client in this scenario
<ahasenack> natefinch: so if true, it's not a networking issue, more like a local high load perhaps, short timeout, all threads occupied, etc, scenario
<natefinch> ahasenack: yeah,  I was going to say... what's the timeout?
<natefinch> like length of time
<ahasenack> ours? don't know, would have to check
<natefinch> deploy *should* be quick, but it depends on if you're doing a deploy of a local charm, etc...
<ahasenack> but that is to localhost, so it's on your end
<ahasenack> that message, "read tcp 127.0.0.1:37017: i/o timeout", comes back from juju, no?
<ahasenack> our connection is still up and fine
<natefinch> ahasenack: ahh ok, sorry, was getting confused by the extra stuff on the beginning of the error.   Yes, the 'cannot add service "foo" ' is our message.
<natefinch> ahasenack: any more info you can put in that bug would help... like, the type of environments you've seen this in, etc.  You shouldn't be connecting to localhost unless you're running local provider or running the client on the state machine
<ahasenack> natefinch: for what is worth, we deploy a lot of containers, so disk i/o is intense
<ahasenack> natefinch: we do not connect to localhost
<natefinch> ahasenack: sorry, I mean that juju should not be connecting to localhost
<ahasenack> landscape ---------> |17070:state server -> mongo:37017|
<ahasenack> is that a fair diagram?
<natefinch> ahasenack: oh, yes, duh, ok
<natefinch> ahasenack: I gotta run, it's dinner time here.... I'll work on this some more tonight, see if I can reproduce, or induce a reproduction.
<ahasenack> natefinch: cheers, thx
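[editor's note] The client-side problem ahasenack describes (a deploy times out, the retry reports that the service already exists, and the service may have been created with default config only) can be sketched as a retry that checks the config instead of assuming all is well. This is a hedged illustration: the `client` interface, the error values, the `fakeClient`, and the option name "some-option" are all hypothetical, and re-applying the config is one possible workaround, not the one Landscape uses (ahasenack mentions removing and redeploying to be sure).

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errAlreadyExists = errors.New("service already exists")
	errTimeout       = errors.New("i/o timeout")
)

// client is a hypothetical deploy API, not the real juju client.
type client interface {
	Deploy(name string, cfg map[string]string) error
	Config(name string) (map[string]string, error)
	SetConfig(name string, cfg map[string]string) error
}

// ensureDeployed retries Deploy; if a retry finds the service already
// there, it verifies the config rather than assuming it was applied.
func ensureDeployed(c client, name string, want map[string]string, attempts int) error {
	err := errors.New("no attempts made")
	for i := 0; i < attempts; i++ {
		if err = c.Deploy(name, want); err == nil {
			return nil
		}
		if errors.Is(err, errAlreadyExists) {
			return reconcile(c, name, want)
		}
	}
	return fmt.Errorf("deploy %q: %w", name, err)
}

// reconcile re-applies the wanted config if any option differs.
func reconcile(c client, name string, want map[string]string) error {
	got, err := c.Config(name)
	if err != nil {
		return err
	}
	for k, v := range want {
		if got[k] != v {
			return c.SetConfig(name, want)
		}
	}
	return nil
}

// fakeClient simulates case (b): the first Deploy creates the service
// with defaults only and then reports a timeout.
type fakeClient struct {
	calls  int
	exists bool
	cfg    map[string]string
}

func (f *fakeClient) Deploy(name string, cfg map[string]string) error {
	f.calls++
	if f.exists {
		return errAlreadyExists
	}
	f.exists, f.cfg = true, map[string]string{}
	return errTimeout
}

func (f *fakeClient) Config(name string) (map[string]string, error) { return f.cfg, nil }

func (f *fakeClient) SetConfig(name string, cfg map[string]string) error {
	f.cfg = cfg
	return nil
}

func main() {
	f := &fakeClient{}
	err := ensureDeployed(f, "mongodb", map[string]string{"some-option": "value"}, 3)
	fmt.Println(err, f.calls, f.cfg)
}
```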
<mup> Bug #1491592 opened: local provider uses the wrong interface <customer-support> <local-provider> <networking> <juju-core:New> <https://launchpad.net/bugs/1491592>
<mbruzek> Hello juju-devs this bug that mup just referenced is from a customer.
<mbruzek> I see that natefinch has already left https://bugs.launchpad.net/juju-core/+bug/1491592
<mup> Bug #1491592: local provider uses the wrong interface <customer-support> <local-provider> <networking> <juju-core:New> <https://launchpad.net/bugs/1491592>
<mbruzek> It is a networking related problem with the local kvm provider
<rogpeppe> wallyworld: i think you might have been involved in some of this code. perhaps you might want to review this fix? https://github.com/juju/utils/pull/148
<wallyworld> rogpeppe: sure thing, just have to finish a meeting first
<rogpeppe> wallyworld: thanks!
<mup> Bug #1491608 opened: importing juju/utils should not side-effect http.DefaultTransport <juju-core:New> <https://launchpad.net/bugs/1491608>
<alexisb> axw, I will be a few minutes late
<axw> alexisb: np, ping when you're there please
<alexisb> axw, I am on the hangout when you are ready
#juju-dev 2015-09-03
 * thumper dogwalk
 * perrito666 returns
<natefinch> ahasenack: thanks for the update on the bug... I was going to post something similar.  Sorry for the initial misunderstanding.
<axw> wallyworld: sorry I missed standup, was in 1:1. FYI yesterday I put up a branch that adds scheduling/retries/status for filesystems in the storage provisioner; another branch with tests for retries/status; and then fixed that error reporting issue in maltese-falcon
<wallyworld> axw: no worries, i'm currently reviewing some stuff on rb, will get to yours soon
<axw> wallyworld: thanks
<wallyworld> axw: actually, i can't see yours, must have already landed?
<axw> wallyworld: RB hates me: https://github.com/juju/juju/pull/3177
<wallyworld> yay rb
<davechen1y> thumper: I have a PR for the new juju/os package
<davechen1y> i'm on call reviewer
<davechen1y> which is a problem
<davechen1y> do you want to take a look ?
<davechen1y> thumper: https://github.com/juju/juju/pull/3193
<thumper> davechen1y: ack
<perrito666> cu all
<natefinch> bye perrito666
<thumper> davechen1y: it seems that while I've been reviewing this it has gone from diff version 1 to 4
<davechen1y> yes
<davechen1y> git --force is magical
<davechen1y> the delta between diffs is small
<davechen1y> just renamed a few imports from coreos to jujuos, 'cos we don't need the hassle
<davechen1y> the key parts are version/ and juju/os/
<davechen1y> the rest are just mechanical translation
<davechen1y> the rest are just mechanical translations
<thumper> I saw the change
<thumper> I actually recommended s/coreos/jujuos/g
<thumper> just a few other questions and some comments, but pretty happy overall
<davechen1y> cool, i'm just fixing a few tests that wont compile due to the move of the constants from version to juju/os
<davechen1y> will push another proposal v soon
<davechen1y> i'd like to try to get this landed today
<davechen1y> so I can take a run at deleting version.Binary.OS
<axw> wallyworld: I'm going out for a while, school carnival
<wallyworld> sure, have fun
<anastasiamac> axw: school carnivals r fun :) enjoy!
<anastasiamac> axw: especially if there is cotton candy...
<davechen1y> thumper: thanks for the review
<davechen1y> i've responded to all the points
<davechen1y> PTAL
<thumper> kk
<davechen1y> thumper: http://reviews.vapour.ws/r/2559/
<davechen1y> just gave this the thumbs down
<thumper> I said the same on the github review
<davechen1y> bzzt
<mup> Bug #1491688 opened: all-machine logging stopped, x509: certificate signed by unknown authority <landscape> <juju-core:New> <https://launchpad.net/bugs/1491688>
<davechen1y> ^^ that'll be ntp
<natefinch> davechen1y: ubuntu should have a feature that smacks you if you mess with the clock
<thumper> menn0: does the db log pruner worker run for each environment?
<axw> anastasiamac: didn't see cotton candy, but they had a coffee van... very professional
<anastasiamac> axw: 2nd bets thing!
<anastasiamac> best
<davechen1y> natefinch, fancy a side bet that this was a MAAS install ?
<anastasiamac> cd ..
<anastasiamac> oops :)
<thumper> heh
<anastasiamac> thumper: just checking who is watching :D
<davechen1y> anastasiamac: least it wasn't your password
<menn0> thumper: no it's a global thing
<menn0> thumper: sorry didn't notice your msg
<wallyworld> axw: if you have time, i finally got school pick up and dinner out the way http://reviews.vapour.ws/r/2573/
<axw> wallyworld: sure
<wallyworld> ta
<axw> wallyworld: done. one problem with clearing the UpdateStatusRequired field, but otherwise looking good
<wallyworld> axw: i relied
<wallyworld> replied
<wallyworld> thanks for looking
<axw> wallyworld: me too
<wallyworld> axw: fair point, i'll tweak it
<axw> wallyworld: thanks
<axw> wallyworld: I'd just add a StatusVersion like ConfigVersion, should be straight forward
<wallyworld> yup
<wallyworld> need to help kid with homework, so might not do it immediately
<axw> nps
<mattyw> wallyworld, ping?
<wallyworld> mattyw: hey
<mattyw> wallyworld, hey - so are you ok with runhook being left as is for now?
<wallyworld> mattyw: yeah, the tests still bother me (started and installed inconsistent), but we can fix later
<mattyw> wallyworld, that's sort of the point, they're inconsistent because they're testing change
<mattyw> wallyworld, the other option is I could select on the hooks that would run if installed = true
<wallyworld> mattyw: sure, but the reader will look and go "wtf" why is installed false when start is true
<mattyw> wallyworld, so I can try to catch all states - to make sure the output is always at least consistent in itself
<wallyworld> but i agree it's not urgent before merging
<mattyw> if I get time today I'll make that change and ping you et al to take a look
<wallyworld> mattyw: yeah, selecting is what i suggested, but i wouldn't hold up merging for it
<mattyw> wallyworld, in the meantime - bed time again
<wallyworld> mattyw: of more importance - i think you assigned yourself to do some doc?
<wallyworld> ttyl
<mattyw> wallyworld, I started a flowchart yeah
<mattyw> wallyworld, I need to put leadership etc on there
<mattyw> wallyworld, and it doesn't include the relation hooks yet
<mattyw> wallyworld, but I think we'd agreed that it could happen after merging
<wallyworld> mattyw: people are whining there's been no sprint summary email. i told them we'd wait for doc and code to be done :-) neither is done yet
<wallyworld> +
<wallyworld> 1
<mattyw> wallyworld, ok - I'll talk to casey tomorrow, we'll put together some press statement before we propose the branch :)
<wallyworld> we can help too, maybe if a draft were done on google docs we could all hack on it
<mattyw> sounds great
<mattyw> wallyworld, nighty night
<wallyworld> see ya
<wallyworld> axw: actually, the watcher has to reset remotestate.updatestatus to false, otherwise when each snapshot is taken, it will remain true. it needs to go to true and straight back to false. a one shot trigger. if the resolver needs to ensure it acts on it, it can if it wishes record something in local state when the channel fires.
<axw> wallyworld: hence a version, something that only goes forwards
<wallyworld> ok
<wallyworld> seems "wrong" to call it a version when it's a trigger, but it will work
<axw> wallyworld: Snapshot() is meant to describe the goal state, and we should not be encoding the logic of ignoring state changes in the state watcher
<wallyworld> it's not ignoring - the resolver will pick it up
<wallyworld> whether it acts on it is still up to the resolver
<wallyworld> but i'll change it to be consistent
<axw> wallyworld: it boils down to "UpdateStatusNeeded" not really being a goal state. the idea of having a version is to say that the local state is out of date if its version is less than that of the remote state's
<wallyworld> yes, agreed. it's not a goal state, just an advisotry
<wallyworld> advisory
<axw> wallyworld: so yes, the idea is to convert a trigger into something that is not a trigger, but a goal
<axw> wallyworld: and I'm saying that's up for the client to decide
<wallyworld> i agree, just differ on the implementation. but it's cool, i'll change
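The trigger-versus-version point above can be sketched in Go. This is a minimal illustration, not the uniter's actual code: the type names RemoteSnapshot and LocalState, the StatusVersion field, and the needsUpdateStatus helper are all assumptions made for the example. The idea it shows is the one axw describes: the watcher only ever moves a counter forwards, and local state counts as out of date while its version is lower than the remote one.

```go
package main

import "fmt"

// RemoteSnapshot is a hypothetical stand-in for the remote-state snapshot.
type RemoteSnapshot struct {
	ConfigVersion int
	// StatusVersion is hypothetical: the watcher increments it each time
	// the update-status timer fires, and never resets it.
	StatusVersion int
}

// LocalState is a hypothetical stand-in for what the resolver has acted on.
type LocalState struct {
	ConfigVersion int
	StatusVersion int
}

// needsUpdateStatus reports whether the local state is behind the remote one.
func needsUpdateStatus(local LocalState, remote RemoteSnapshot) bool {
	return local.StatusVersion < remote.StatusVersion
}

func main() {
	remote := RemoteSnapshot{}
	local := LocalState{}

	remote.StatusVersion++ // the timer fires; the snapshot keeps the new value
	fmt.Println(needsUpdateStatus(local, remote)) // true

	local.StatusVersion = remote.StatusVersion // the resolver ran the hook
	fmt.Println(needsUpdateStatus(local, remote)) // false
}
```

With this shape, taking a second snapshot before the resolver acts does not lose the request, and taking one afterwards does not repeat it.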
<lazyPower> dimitern: Good morning dimitern o/
<dimitern> lazyPower, morning :)
<lazyPower> dimitern: Can i call your attention to this bug briefly while its still fresh in my mind? https://bugs.launchpad.net/juju-core/+bug/1491592
<mup> Bug #1491592: local provider uses the wrong interface <charmers> <customer-support> <local-provider> <networking> <juju-core:New> <https://launchpad.net/bugs/1491592>
<dimitern> lazyPower, sure, let me have a look
<dimitern> lazyPower, hmm so it's local + kvm?
<lazyPower> yeah
<lazyPower> I riffed briefly with nate yesterday about the referenced bug - where a fix was provided to filter lxc bridges, but that seems fairly brittle since we're growing our SDN providers, and most of them create bridge devices. This just happens to be the docker0 bridge thats causing the trouble this time :/
<dimitern> lazyPower, well, we're only filtering whatever lxc_bridge is in agent config
<dimitern> lazyPower, I bet it's lxcbr0 or (for kvm) virbr0
<dimitern> lazyPower, there's a potential workaround you could try
<dimitern> lazyPower, ignore-machine-addresses: true in environments.yaml (I'll double check the setting name)
<perrito666> morning all
<dimitern> lazyPower, that's it - is it possible to give that setting a try and re-bootstrap (or ask the customer)?
<dimitern> perrito666, o/
<lazyPower> dimitern: we sure can
<lazyPower> Thanks for taking a look dimitern
<dimitern> lazyPower, no worries, I hope it might help
<katco> wwitzel3: standup
<katco> natefinch: bug 1486553 is now targeted for 1.25.0, so please do your work there first and forward-port to 1.26
<mup> Bug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:In Progress by natefinch> <juju-core 1.25:New> <https://launchpad.net/bugs/1486553>
<natefinch> katco: ok, thanks
<ericsnow> katco, alexisb: FYI, the cloudsigma provider is now no longer provisional in 1.25 (I'll forward-port shortly)
<katco> ericsnow: awesome! ty for picking that up. alexisb ^^^
<alexisb> ty ericsnow !
<ericsnow> katco, alexisb: np :)
<natefinch> ericsnow, wwitzel3: either of you want to trade 1:1 times with me?  We have a guest coming to the house at 3, which is when my 1:1 is with katco.
<ericsnow> natefinch: I can
<natefinch> ericsnow: cool, when's yours?
<ericsnow> natefinch: 4 eastern
<natefinch> that's better than 3 for me... though earlier would be better. wwitzel3 - how about you?
<wwitzel3> natefinch: I'm in mine right now and it is over in 17 minutes
<natefinch> wwitzel3: haha ok
<voidspace> fwereade: ping
<mup> Bug #1415517 changed: juju bootstrap on armhf/keystone hangs <armhf> <bootstrap> <hs-armhf> <juju-core:Fix Released> <https://launchpad.net/bugs/1415517>
<natefinch> katco: so, mind if ericsnow and I switch 1:1 times?
<katco> natefinch: ericsnow: not at all... sorry
<natefinch> katco: cool
<ericsnow> natefinch: thanks
<ericsnow> katco: ^^^
<natefinch> ericsnow: any time ;)
<mgz> I think katco should talk to ericsnow about natefinch and natefinch about ericsnow in that case
<ericsnow> mgz: that's what we already do <wink>
<mgz> :)
<mgz> master is blocked btw, ftb on windows
<mup> Bug #1491923 opened: FTB on windows - ReadOSRelease undefined <blocker> <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1491923>
<mgz> the windows-public-clouds branch has been rolled back one rev
<fwereade> voidspace, sorry, pong
<voidspace> fwereade: I think we're about to tear-down, I'll ping you tomorrow
<fwereade> voidspace, ok, cheers
<rogpeppe2> fwereade: hiya
<fwereade> rogpeppe, o/
<fwereade> rogpeppe, am just off actually :(
<rogpeppe> fwereade: oh...
<rogpeppe> fwereade: just wanted to ask what "workloads" were in charms
<rogpeppe> fwereade: and chew the fat a bit :)
<rogpeppe> fwereade: perhaps you could tell me who knows about them
<fwereade> rogpeppe, can do the first: things like the bunch of docker containers you're running and want to surface to the admin
<fwereade> rogpeppe, katco knows all
<rogpeppe> fwereade: cool
<rogpeppe> fwereade: just wanted to make a slightly better doc comment than https://github.com/juju/charm/pull/150/files
<katco> rogpeppe: haha, yes
<katco> rogpeppe: so actually, i'm not sure if that code will stay in there. wwitzel3 ericsnow natefinch do you recall?
<rogpeppe> katco: it would be great if someone could add a decent explanation of what workloads are to the charm package
<katco> brb
<ericsnow> wwitzel3: think you could take a look at rogpeppe's request ^^^
<ericsnow> wwitzel3: I'd be glad to as well
<katco> ericsnow: weren't we talking about moving that out of the charm repo. because we won't be in metadata.yaml any longer?
<rogpeppe> katco, ericsnow: that would be ideal from my pov
<ericsnow> katco: we were deliberating; on the one hand workloads.yaml is certainly a charm-level concern, but on the other hand it isn't a fully realized part of all the charm machinery such that it *has* to be in the charm repo
<katco> rogpeppe: it was in there originally because we were adding to metadata.yaml; now we'll have a separate file: workloads.yaml, but there may still be a reason to keep it in there... remembering something about needing to load the workloads.yaml at the same time as metadata.yaml
<katco> ericsnow: ah ok. we need to make that decision soon
<ericsnow> katco: conceptually I think it still belongs in the charm repo but we should come to a consensus with the GUI/eco teams
<natefinch> ericsnow: now that we don
<natefinch> ericsnow: don't have plugins... do we really need type-options?
<natefinch> ericsnow: seems like that feature was mostly to give plugins an escape hatch
<ericsnow> natefinch: we still need it
<ericsnow> natefinch: it's not about executable vs. lib plugin
<ericsnow> natefinch: it's about generic options vs. technology-specific options
<natefinch> ericsnow: right, but if the types are all hard coded, any type-specific option could just be in the main struct, and ignored by the types that don't use it.
<natefinch> ericsnow: sorry, just thinking out loud.  I think type-options is fine, just less important now that we know at compile-time what all the possible options are.
<ericsnow> natefinch: I'm not sure that having a struct with *all* possible options supports maintainability well
<natefinch> ericsnow: one option I thought of was to let the plugin provide the value to be serialized/deserialized. Then a plugin that wanted to extend the base struct could embed it and add its own fields.  But that's a significant change, and somewhat less straight forward than just having an extra "other stuff" map.
<ericsnow> natefinch: keep in mind that the struct is effectively exposed to charmers so we want to keep the focus on the language we're effectively establishing for dealing with workloads in Juju
<ericsnow> natefinch: what you are suggesting would make more sense if we had a separate type used internally from the one exposed to charmers (which I would consider overkill)
<natefinch> ericsnow: well, what's exposed to charmers is a yaml file, right?  Certainly we'd want 95% of that to be the same for all workload types.  So the question is how you support the other 5%
<ericsnow> natefinch: yeah
<ericsnow> natefinch: this is a discussion we had a while back and the type-options approach was the one we settled on
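The type-options approach discussed above might look roughly like the following Go sketch. It is an assumption-laden illustration rather than the real charm package definition: the Workload struct, its field names, and the typeOption helper are hypothetical. Generic options get named fields, and anything technology-specific goes into a string map that only the matching workload type reads.

```go
package main

import "fmt"

// Workload is a hypothetical shape for one workloads.yaml entry.
type Workload struct {
	Name  string
	Type  string // e.g. "docker"
	Image string // a generic option shared by all workload types

	// TypeOptions holds technology-specific options. Other workload
	// types ignore keys they do not understand.
	TypeOptions map[string]string
}

// typeOption returns a technology-specific option, but only when the
// workload is of the type that defines it.
func typeOption(w Workload, wantType, key string) (string, bool) {
	if w.Type != wantType {
		return "", false
	}
	v, ok := w.TypeOptions[key]
	return v, ok
}

func main() {
	w := Workload{
		Name:        "web",
		Type:        "docker",
		Image:       "nginx",
		TypeOptions: map[string]string{"network": "host"},
	}
	if v, ok := typeOption(w, "docker", "network"); ok {
		fmt.Println("docker network:", v)
	}
}
```

The trade-off natefinch raises still applies: with a fixed set of types the map could be replaced by extra struct fields, at the cost of one struct carrying every type's options.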
<katco> ericsnow: will be just a few mins late
<ericsnow> katco: np
<mup> Bug #1492000 opened: mongo: testEnsureNumaCtl fails when $TMPDIR is set <juju-core:New> <https://launchpad.net/bugs/1492000>
<perrito666> mattyw: scheme seems to do only checking I need conversion
 * perrito666 is really hoping for mattyw to correct him
<mattyw> perrito666, I think it does conversion as well
<mattyw> perrito666, that's what the coerce calls are for
<perrito666> mattyw: I have not looked in detail but it seems to do the conversion just as a way to check types
<ericsnow> wallyworld_: could you take a quick look at my fix for #1491923?  http://reviews.vapour.ws/r/2587/
<mup> Bug #1491923: FTB on windows - ReadOSRelease undefined <blocker> <ci> <regression> <windows> <juju-core:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1491923>
<ericsnow> wallyworld_: I'm going to test it locally under windows too
<wallyworld_> sure
<ericsnow> wallyworld_: thanks
<wallyworld_> ericsnow: i think it looks ok :-)
<ericsnow> wallyworld_: I can't remember my admin PW for my Windows VM lol
<wallyworld_> ha
<wallyworld_> windozesux maybe?
<ericsnow> wallyworld_: yeah, that was it <wink>
<mwhudson> davecheney: so can you explain how i broke freebsd/arm?
<mwhudson> oooh
<mwhudson> ok nevermind
<mwhudson> davecheney: https://go-review.googlesource.com/#/c/14280/1 take 2
<mwhudson> davecheney: can you test on freebsd or nacl or android or something?
<perrito666> wtf google signing me off
<mup> Bug #1492058 opened: Deploy CentOS images with Juju KVM provider <juju-core:New> <https://launchpad.net/bugs/1492058>
<mup> Bug #1492066 opened: cloud-init fails when deploying CentOS with Juju. <centos> <cloud-init> <juju> <juju-core:New> <https://launchpad.net/bugs/1492066>
#juju-dev 2015-09-04
<davecheney> mwhudson: testing now, thanks
<davecheney> this stuff is a bit subtle
<mwhudson> davecheney: as has been noted before the go linker is terrible
<mwhudson> davecheney: i know what i was doing wrong though
<mwhudson> basically the value of goarm was being appended to the value from the runtime.a file
<mwhudson> so it was \x00\x07
<mwhudson> and it's only a uint8 so the runtime was just seeing 0
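The failure mwhudson describes can be modelled with a toy Go program. This is not linker code: readUint8 and the byte slices are assumptions that only mimic the situation, where the new value was appended after the byte already present from runtime.a instead of replacing it, so a one-byte read saw the old zero.

```go
package main

import "fmt"

// readUint8 mimics a consumer that treats the symbol as a single uint8
// and therefore only looks at the first byte of its data.
func readUint8(data []byte) uint8 {
	return data[0]
}

func main() {
	fromArchive := []byte{0x00} // placeholder value already in the archive

	// Buggy behaviour: the GOARM value is appended, giving \x00\x07.
	appended := append(fromArchive, 0x07)
	fmt.Println(readUint8(appended)) // 0

	// Intended behaviour: the value replaces the placeholder.
	replaced := []byte{0x07}
	fmt.Println(readUint8(replaced)) // 7
}
```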
<ericsnow> wallyworld_: I've landed that fix to unblock master (thanks for the review)
<wallyworld_> np, thanks for fixing :-)
<ericsnow> wallyworld_: CI is supposed to auto-unblock once CI passes, right?
<wallyworld_> yes
 * ericsnow crosses fingers
<ericsnow> g'night
<davecheney> ericsnow: thanks for fixing that
<stokachu> ran into an interesting bug: https://bugs.launchpad.net/juju-core/+bug/1492088
<mup> Bug #1492088: juju bootstrap fails inside a wily container <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1492088>
<stokachu> anyone seen this before?
<mup> Bug #1492088 opened: juju bootstrap fails inside a wily container <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1492088>
<mwhudson> davecheney: say, do you know off hand the difference between GOARM=5 and GOARM=6?
<mwhudson> oh soft float seems like a major suspect
<davecheney> 12:10 < dfc> thumper: http://reviews.vapour.ws/r/2588/
<davecheney> 12:10 < dfc> remove version.Binary.OS
<davecheney> mwhudson: yes, and no atomics
<davecheney> no STREX/LDREX
<mwhudson> i think this is soft float
<mwhudson> davecheney: if you enable the -shared flag for arm things go very wrong in hashmap code
<mwhudson> and there is floating point code that could explain it
<davecheney> yes, the hash factor
<davecheney> or fill factor or something
<davecheney> from memory hashmap.go:mapinit
<perrito666> cmd/juju/status_test.go is a clear form of modern torture
<wallyworld_> yes it is :-(
 * perrito666 uses the time between testruns to learn emacs
 * perrito666 wonders if he can set kb layouts only for a given app
<mwhudson> davecheney: yeah, luckily the softfloat is so broken that it doesn't get 10.0*100.0 right...
<davecheney> perrito666: i'm working on the great american novel while running tests on ppc64
<perrito666> heh
<perrito666> oh finally, success
<perrito666> I changed one value in formatted status
<perrito666> git diff status_test.go | grep "^\+" | wc -l
<perrito666> 105
<mwhudson> ARGH
<davecheney> sinzui, I thought we had a voting race builder now ?
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1492095
<davecheney> mwhudson: soft float only works for values of 1 with extremely large exponents
<mup> Bug #1492095: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1492095>
<mwhudson> haha i fixed it
<mwhudson> not sure that was a sensible use of my time, but it took waaaay less than figuring out what was going on
<mup> Bug #1492095 opened: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1492095>
<davecheney> thumper: thanks for your comments, please see my reply
<thumper> davecheney: I agree with your summation
<thumper> davecheney: all within time
<perrito666> davecheney: thanks for finding the race (that was me, most likely)
<davecheney> perrito666: np
<davecheney> thumper: it's a valid operation
<davecheney> but I think it deserves to be broken out into its own logic
<davecheney> possibly in an upcoming juju/series package ?
<thumper> perhaps
<davecheney> mwhudson: congrats on your +2
<davecheney> proving the motto of the Go team: "you get commit rights when we get sick of committing your stuff"
<mwhudson> davecheney: thanks
<mwhudson> and, yeah
<davecheney> we're classy like that
<wallyworld_> axw: a small one if you have a moment http://reviews.vapour.ws/r/2590/
<axw> wallyworld_: looking
<wallyworld_> ta
<axw> wallyworld_: filesystem CLI is delayed, the existing volume stuff needs cleaning up first. I created another card on the board
<wallyworld_> np
<axw> wallyworld_: I don't really understand why this branch is required at all. when would we ever get to the end of the resolver and have Started==false?
<wallyworld_> axw: during my testing i saw the update-status hook fire before the start hook had run (after install hook i think, before leader-elected)
<wallyworld_> so it's in response to observed behaviour
<wallyworld_> hmmm
<axw> wallyworld_: ah hm, apparently Start isn't run until after the first ConfigChanged
<wallyworld_> maybe the refactoring got rid of the problem
<axw> wallyworld_: so if the update status trigger comes in before config changed, this could happen I think
<wallyworld_> config changed may come first yeah
<wallyworld_> yeah
<axw> wallyworld_: but no... we always wait for the first config changed
<wallyworld_> oh, ok, i forgot that
<axw> wallyworld_: although I think it could still happen in the case of a failed/resolved config-changed
<wallyworld_> could do. adding this extra started check is trivial and seems like a good safety net
<axw> wallyworld_: we should probably move that logic into the resolver (out of operation/runhook.go), and have it drop out of the resolver if !Started and waiting for config changed
<axw> maybe not now
<wallyworld_> the stuff in run hook commit?
<axw> wallyworld_: yes
<wallyworld_> yeah, makes sense to do that now i think, but not earlier with the old code
<axw> wallyworld_: actually even that doesn't make sense. if we resolved the config-changed, it'd still commit and go to Started
<wallyworld_> so maybe now that the update status trigger has been pulled into the main loop processing, the issue is moot
<wallyworld_> whereas before it was a concurrency lottery
<axw> wallyworld_: right, yeah. I thought you were testing with your change
<wallyworld_> i just added a bunch of cards, but i should have retested after the first refactor
<wallyworld_> i'll retest and drop this branch probably
<wallyworld_> just goes to show how fragile the uniter was before
<wallyworld_> and how this rework has fixed a bunch of stuff implicitly
<axw> wallyworld_: that particular bit was okay before maltese-falcon, it got broken during (I think)
<wallyworld_> that may well be true
<wallyworld_> axw: and i think those other cards about duplicate status may be bugs in status history (need to dig further). so the feature branch may well be almost ready
<axw> wallyworld_: cool.
<wallyworld_> i have soccer now but will continue testing after
<fwereade> cmars, axw: you know the worker/gate thing I did?
<fwereade> cmars, axw: I think that we want a sort of extension of the concept to describe what charmdir.worker really does
<fwereade> cmars, axw: because I think it really is just a custom synchronisation construct, the charmdir relationship is entirely incidental
<fwereade> cmars, axw: metaphorically something like Fortress sorta works -- clients can Visit(func() error), the person in charge can Unlock (unblock Visits) and Lockdown (stop accepting new Visits, wait for existing ones to complete)
<fwereade> cmars, axw: but that's mainly just because I'm thinking about gates
<fwereade> or anyone who's interested in naming problems :) ^^
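The Fortress idea above can be sketched in Go as follows. It is only an illustration of the Visit/Unlock/Lockdown metaphor, not the worker that was eventually written. One simplifying assumption is labelled in the code: a Visit attempted while the fortress is locked fails immediately with ErrLocked instead of blocking until Unlock.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrLocked is returned by Visit while the fortress is locked down.
// Failing fast (rather than blocking) is an assumption of this sketch.
var ErrLocked = errors.New("fortress is locked down")

// Fortress lets clients run functions only while it is unlocked.
type Fortress struct {
	mu       sync.Mutex
	cond     *sync.Cond
	unlocked bool
	visitors int
}

// NewFortress returns a Fortress that starts locked down.
func NewFortress() *Fortress {
	f := &Fortress{}
	f.cond = sync.NewCond(&f.mu)
	return f
}

// Unlock starts accepting visits.
func (f *Fortress) Unlock() {
	f.mu.Lock()
	f.unlocked = true
	f.mu.Unlock()
}

// Lockdown stops accepting new visits and waits for running ones to finish.
func (f *Fortress) Lockdown() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.unlocked = false
	for f.visitors > 0 {
		f.cond.Wait()
	}
}

// Visit runs visit if the fortress is unlocked.
func (f *Fortress) Visit(visit func() error) error {
	f.mu.Lock()
	if !f.unlocked {
		f.mu.Unlock()
		return ErrLocked
	}
	f.visitors++
	f.mu.Unlock()

	defer func() {
		f.mu.Lock()
		f.visitors--
		f.cond.Broadcast() // wake a pending Lockdown
		f.mu.Unlock()
	}()
	return visit()
}

func main() {
	f := NewFortress()
	fmt.Println(f.Visit(func() error { return nil })) // locked: ErrLocked
	f.Unlock()
	fmt.Println(f.Visit(func() error { return nil })) // <nil>
	f.Lockdown()
}
```

Calling Lockdown from inside a Visit would deadlock in this sketch, since Lockdown waits for that very visit to finish.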
<perrito666> davecheney: still here?
<voidspace> dimitern: http://reviews.vapour.ws/r/2593/
<voidspace> fwereade: ping
<dimitern> voidspace, cheers
<mup> Bug #1492232 opened: backup hogs resources. <juju-core:New> <https://launchpad.net/bugs/1492232>
<mup> Bug #1492237 opened: juju state server mongod uses too much disk space <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1492237>
<mup> Bug #1492241 opened: juju upgrade-juju cli doesn't provide clear feedback on action being taken <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1492241>
<dimitern> fwereade, hey, are you around?
<fwereade> dimitern, heyhey -- and voidspace, oops, sorry
<dimitern> fwereade, :) voidspace has a branch that I'd like you to have a look, if possible
<fwereade> dimitern, just saw; voidspace, looking forward to it :)
<dimitern> fwereade, hopefully rectifies some of the issues with unit addresses changing randomly
<dimitern> fwereade, awesome, thanks :)
<dimitern> fwereade, we were careful not to break api compatibility
<ericsnow> mgz: could we get a CI run against master to clear that blocking bug?
<voidspace> fwereade: cool
<voidspace> fwereade: it touches the uniter which is why we pinged you particularly
<voidspace> fwereade: (touches it in a very minor way)
<fwereade> voidspace, phew :)
<fwereade> voidspace, what with maltese-falcon and all ;)
<voidspace> fwereade: yeah...
<voidspace> fwereade: hopefully a good touch not a bad touch...
<ericsnow> mgz, dooferlad: ping
<dooferlad> ericsnow: pong
<ericsnow> dooferlad: could you kick off a CI run against master to clear the blocker bug?
<dooferlad> ericsnow: I wouldn't know how...
<ericsnow> dooferlad: ah
<dimitern> ericsnow, abentley, mgz, jog_, or sinzui are better people to ask :)
<ericsnow> dimitern: duh, mixed up irc handles :)
<ericsnow> abentley, mgz, jog_: could you kick off a CI run against master to clear the blocker bug?
<abentley> ericsnow: looking...
<ericsnow> abentley: thanks!
<abentley> ericsnow: The lxc on our wily slave is unhappy.  We're still in the middle of testing 1.25.  I'm looking into fixing lxc.
<ericsnow> abentley: k
<ericsnow> abentley: any way we could get an exception for unblocking master?
<ericsnow> abentley: I've verified locally that Windows builds and passes the test suite now
<abentley> ericsnow: We need the lxc working before we can test master.
<wwitzel3> katco: having google issues
<katco> wwitzel3: there's no issues like google issues!
<abentley> ericsnow: I've fixed the lxcs and queued master to be tested next.
<ericsnow> abentley: thanks!
<ericsnow> abentley: is it still about 2 hours to run?
<abentley> ericsnow: yes.
<ericsnow> k
<natefinch> wwitzel3: we should circle up on my bug... I think I'm going to end up being out a lot today.
<wwitzel3> ok, now is good
<wwitzel3> fwereade: ping
<fwereade> wwitzel3, pong
<wwitzel3> fwereade: question about deploy.go and bug #1486553
<mup> Bug #1486553: i/o timeout errors can cause non-atomic service deploys <cisco> <landscape> <juju-core:Triaged> <juju-core 1.25:In Progress by natefinch> <https://launchpad.net/bugs/1486553>
<fwereade> wwitzel3, ah yes
<wwitzel3> fwereade: if you look at the DeployService code, is there a specific reason all of that AddService and UpdateConfig and AddUnits aren't in a single transaction?
<wwitzel3> fwereade: is there some chicken egg thing going on? Or is a possible fix to prevent the empty service being created just to perform all those in a single transaction?
<wwitzel3> fwereade: if that isn't possible, then my other thought was to wrap all of those so I could handle the error and properly cleanup previous transactions manually
<fwereade> wwitzel3, apart from sheer inertia, the trickiest bit of fixing that would be to unpick the machine assignments
<fwereade> wwitzel3, I am generally a bit underwhelmed by "clean up the mess" approaches, because it's hard to guarantee that they get run
<wwitzel3> fwereade: yeah, I thought about that, but given that the placement to the unit is the last thing that happens, if that errors, I wouldn't have to actually worry about unpicking the assignments right?
<fwereade> wwitzel3, I *think* it goes add/assign/add/assign/add/assign etc
<wwitzel3> fwereade: hrmm, ok, so even if the AddUnitsWithPlacecode returns an error, it may have done the add but not the assign?
<wwitzel3> fwereade: and that won't be cleaned up?
<fwereade> wwitzel3, yeah :(
<wwitzel3> fwereade: that's shitty
<wwitzel3> lol
<fwereade> wwitzel3, it has always been like that: my justification is that at least it's *possible* to clean it up manually, and there are only so many hours in the day
<fwereade> wwitzel3, however
<wwitzel3> fwereade: :)
<fwereade> wwitzel3, now that at last we've been told it's important to fix it, we can actually dive into Doing It Right
<fwereade> wwitzel3, which I *think* is not that hard
<fwereade> wwitzel3, because:
<wwitzel3> fwereade: take my hand, show me the way
<fwereade> wwitzel3, as you observe, add-service/set-config/add-unit go very nicely together
<fwereade> wwitzel3, there will be tweaks necessary -- eg set all the unit refcounts in the service doc at once
<fwereade> wwitzel3, and *create* service settings with X data instead of create-empty and set-later
<fwereade> wwitzel3, and I really don't think there's anything terribly *hard* there
<fwereade> wwitzel3, but, obviously, that leaves us with a bunch of unassigned units
<fwereade> wwitzel3, ...and *that* sounds to me like a candidate for a watcher/worker that just assigns unassigned units
<fwereade> wwitzel3, now this is clearly not a *small* bugfix
<wwitzel3> fwereade: right, I can probably put a good dent in it today though and hand it off (I'm out next week)
<fwereade> wwitzel3, but if we have traction on fixing it I think it would be worth while
<fwereade> wwitzel3, sweet
<wwitzel3> fwereade: so in the case where they used placement .. the worker/watcher would handle that too
<fwereade> wwitzel3, (placement directives might be a touch fiddly -- some (`--to 0`) you can just run (or reject) directly, but others will need to be stored somewhere and used by the assigner
<wwitzel3> fwereade: but then, there would be a delay right?
<fwereade> wwitzel3, yeah, there would, I think that's just the price we have to pay
<wwitzel3> fwereade: we would still have to validate placement as part of the operation though right?
<wwitzel3> fwereade: we wouldn't want the assigner to come back later and give the user a bad placement error
<fwereade> wwitzel3, I think it's equivalent to a provisioning error
<fwereade> wwitzel3, any pre-validation we can do, hell yes
<fwereade> wwitzel3, in fact I think that covers everything, right?
<fwereade> wwitzel3, we reject invalid ones on the way in, and we can ask the environ about them
<wwitzel3> fwereade: so it looks like .. add service w/ config, increment unit refs, validate and store placement directives .. run that transaction
<fwereade> wwitzel3, but they *might* induce provisioning errors on the associated machines later, just like any other machine
<fwereade> wwitzel3, yeah exactly
<wwitzel3> fwereade: assigner picks up job, attempts to use unassigned units, surface an error to the user like a provisioning error
<fwereade> wwitzel3, yeah
<fwereade> wwitzel3, in fact I think it is possible that assignment could fail there
<fwereade> wwitzel3, manual provider with unhelpful assignment policy?
<fwereade> wwitzel3, we don't have any way to retry machine provisioning *with different constraints/placement*, do we?
<wwitzel3> fwereade: right, so in the case of this bug, does this fix their issue though .. I'm not sure. If we fail at assignment, we still have a service?
<wwitzel3> fwereade: I guess this bug was caused by the adding of the service and the updating the config and units not being atomic .. this does solve that
<wwitzel3> fwereade: an assignment error would be something else and wouldn't be caused by a timeout to the API since the worker would be retrying in cases of timeout
<fwereade> wwitzel3, sorry justa sec
<wwitzel3> fwereade: np
<fwereade> wwitzel3, so, yes, I think that it's a different situation, even if I'm not 100% sure why
<fwereade> wwitzel3, failing to put my finger on what it is about the assignment logic that plays badly with transactions
<fwereade> wwitzel3, but I think if we (1) surface the errors and (2) think through how users'd want to address them, we will provide a much better experience there
<wwitzel3> fwereade: well, at the least, this is an improvement over the current implementation and it addresses the bug
<fwereade> wwitzel3, (the assignment stuff must be coming up to 3 years old now... memory is hazy)
<fwereade> wwitzel3, yeah
<wwitzel3> fwereade: thank you
<fwereade> wwitzel3, np :)
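The plan agreed above (validate placement up front, write the service, its settings and its units in one step, and leave machine assignment to a separate worker) can be sketched with an in-memory toy in Go. Everything here is hypothetical: the state and unit types, the validatePlacement rule, and assignUnassigned are stand-ins, and a real fix would build a single database transaction rather than mutate maps.

```go
package main

import (
	"fmt"
	"strings"
)

type unit struct {
	name      string
	placement string // stored directive; "" lets the assigner choose
	machine   string // "" until the assigner has run
}

type state struct {
	services map[string]map[string]string // service name -> settings
	units    []*unit
}

// validatePlacement is a stand-in rule: directives may not contain spaces.
func validatePlacement(p string) error {
	if strings.Contains(p, " ") {
		return fmt.Errorf("invalid placement %q", p)
	}
	return nil
}

// deploy validates everything first, then records the service, its
// settings and its (unassigned) units together, or nothing at all.
func (st *state) deploy(name string, settings map[string]string, placements []string) error {
	if _, exists := st.services[name]; exists {
		return fmt.Errorf("service %q already exists", name)
	}
	for _, p := range placements {
		if err := validatePlacement(p); err != nil {
			return err
		}
	}
	st.services[name] = settings
	for i, p := range placements {
		st.units = append(st.units, &unit{name: fmt.Sprintf("%s/%d", name, i), placement: p})
	}
	return nil
}

// assignUnassigned plays the role of the assigner worker and returns how
// many units it assigned. newMachine supplies a machine when no directive
// was stored.
func (st *state) assignUnassigned(newMachine func() string) int {
	n := 0
	for _, u := range st.units {
		if u.machine != "" {
			continue
		}
		if u.placement != "" {
			u.machine = u.placement
		} else {
			u.machine = newMachine()
		}
		n++
	}
	return n
}

func main() {
	st := &state{services: map[string]map[string]string{}}
	err := st.deploy("wordpress", map[string]string{"debug": "no"}, []string{"0", ""})
	fmt.Println(err, len(st.units))
	fmt.Println(st.assignUnassigned(func() string { return "1" }))
}
```

An assignment failure in the second phase would then surface like a provisioning error, without leaving a half-created service behind.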
<perrito666> who is ocr today??
<natefinch> wwitzel3, katco, ericsnow: sorry, my wife is not really getting much better, so I won't be getting anything done today.
<katco> natefinch: hope she starts feeling better soon :(
<ericsnow> natefinch: hope she gets better soon!
<natefinch> thanks, hopefully it'll get better overnight
 * natefinch stays on to be able to read scrollback
<mup> Bug #1492396 opened: Misleading error when agent-version doesn't match juju version on bootstrap <bootstrap> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1492396>
<perrito666> any one willing to rubber stamp a couple of fwports?
<perrito666> mmpdf, seem to be having flaky tests again
<perrito666> MachineWithCharmsSuite.TestManageEnvironRunsCharmRevisionUpdater <-- anyone seen that one ?
<natefinch-afk> wwitzel3: you around?
<wwitzel3> natefinch-afk: yeah
<natefinch> wwitzel3: let's catch up
#juju-dev 2015-09-05
<mup> Bug #1491923 changed: FTB on windows - ReadOSRelease undefined <blocker> <ci> <regression> <windows> <juju-core:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1491923>
<mup> Bug #1469799 changed: Agent tests fail with no output <ci> <ppc64el> <test-failure> <juju-core:Expired> <juju-core 1.24:Invalid> <https://launchpad.net/bugs/1469799>
<mup> Bug #1492530 opened: Improve bootstrap "tools" download message <juju-core:New> <https://launchpad.net/bugs/1492530>
<mup> Bug #1492530 changed: Improve bootstrap "tools" download message <juju-core:New> <https://launchpad.net/bugs/1492530>
<mup> Bug #1492530 opened: Improve bootstrap "tools" download message <juju-core:New> <https://launchpad.net/bugs/1492530>
<mup> Bug #1492598 opened: It is hard to find nested commands in juju help <juju-core:New> <https://launchpad.net/bugs/1492598>
<mup> Bug #1489346 changed: /var/lib/juju/db taking lots of disk space <juju-core:New> <https://launchpad.net/bugs/1489346>
#juju-dev 2016-09-05
<menn0> thumper: I figured out the blobstore thing. Although Juju doesn't use github.com/juju/blobstore any more, charmstore.v5 still does. I've emailed rog about it.
<wallyworld> thumper: can you take another look at my PR?
<anastasiamac> thumper: u r an OCR \o/ my monday might get better after all :D could u PTAL http://reviews.vapour.ws/r/5590/?
<thumper> yep
<anastasiamac> tyvm!
<anastasiamac> thumper: one line dependencies update - http://reviews.vapour.ws/r/5591/
<thumper> anastasiamac: timestamp is a lie
<thumper> anastasiamac: do this:
<thumper> godeps ./... | grep "juju/util"
<thumper> you can then just copy that line
<thumper> veebers: now good?
<anastasiamac> thumper: done \o/
<thumper> anastasiamac: I used to hand craft those lines until I worked out that trick
<anastasiamac> thumper: u r awesome \o/ we should have this in our wiki, I think :D
<anastasiamac> thumper: i need to update these licence on 1.25 branch of juju/utils but no amount of fetching/pull is giving me this branch to create my own on...
<anastasiamac> i suspect my upstream/origin pair r not correct but can't figure out where m going wrong
<anastasiamac> could anyone point me in the right direction?
<thumper> anastasiamac: probably not worth fixing IMO
<thumper> there are hoops we could jump through if someone thinks it is important enough
<thumper> but kinda icky due to godeps
<thumper> I know we were looking at updating godeps to allow it to look at different branches
<thumper> but AFAIK, this isn't done yet
<anastasiamac> thumper: the thing is i don't think it's too hard.. i think the problem is my setup.. juju/utils does have 1.25 branch which juju 1.25 uses.. my thinking i branch from utils/1.25 to change licence and then do update deps for juju/1.25...
<thumper> anastasiamac: well, it kinda is hard because godeps expects the revision to be in the master history
<thumper> what you have to do is make a branch from the 1.25 hash
<thumper> make the change
<thumper> then merge in master
<thumper> resolve conflicts
<thumper> then propose that to merge into master
<thumper> even though it may not change any files
<anastasiamac> thumper: is this not  utils for 1.25? https://github.com/juju/utils/tree/1.25
<thumper> then you need to update the hash that 1.25 uses to be the non-mainline hash
<thumper> that is the tag on master from which 1.25 was released
<thumper> well branch
<thumper> but you can't specify non-master branch in godeps
<thumper> yet
<anastasiamac> ah ... k.. i'll mark as won't fix on 1.25 and if it makes someone cry, we'll fix then :D
<thumper> it is not impossible
<thumper> just difficult at the moment
<anastasiamac> of course \o/ but since 1.25 is in critical only, i can downgrade it to High and if there is a burning desire to have it in, then we'll go the painful way...
<anastasiamac> m sure there are other licencning inconsistencies in 1.25 deps :D
<thumper> anastasiamac: menn0 thinks I'm wrong
<thumper> which may well be trye
<thumper> true
<anastasiamac> thumper: menn0: which part wrong? tracking only master branches in dependencies.tsv?
<thumper> to create a branch off the 1.25 branch do this
<thumper> git branch update-1.25-foo upstream/1.25
<thumper> then checkout
<thumper> we were arguing about how git works when fetching revisions
<anastasiamac> thumper: menn0: fatal: Not a valid object name: 'upstream/1.25'
<anastasiamac> i think my remotes r funny
<thumper> anastasiamac: do you have upstream set as the remote?
<menn0> anastasiamac: what does "git remote -v" show?
<anastasiamac> $ git remote -v
<anastasiamac> origin  https://github.com/anastasiamac/utils.git (fetch)
<anastasiamac> origin  https://github.com/anastasiamac/utils.git (push)
<anastasiamac> upstream        https://github.com/juju/utils (fetch)
<anastasiamac> upstream        https://github.com/juju/utils (push)
<anastasiamac> menn0: ^^
<menn0> anastasiamac: that looks ok
<menn0> anastasiamac: what about "git branch -r | grep 1.25" ?
<anastasiamac> menn0: and yet it does not work :( must b monday... 'git branch' yields nothing, just returns
<menn0> anastasiamac: ok, do a "git fetch" and then try again
<anastasiamac> menn0: did and it just returns too
<anastasiamac> somehow m not "seeing" 1.25 in my git
<anastasiamac> but I can see it against juju/utils on github in browser
<anastasiamac> thumper: wallyworld: veebers: is it normal that landing jenkins says "(pending - juju-core-slave is offline)"
<wallyworld> nope
<thumper> anastasiamac: do you find that git asks for your github password everytime you push?
<anastasiamac> yes
<thumper> anastasiamac: considered using ssh rather than https?
<veebers> anastasiamac: I'll see if I can find out whats happening
<thumper> anastasiamac: make sure github has your public key
<anastasiamac> i'd consider anything at this stage especially if it'll solve my current trip
<wallyworld> either that or use a keyring
<thumper> then set the remote url
<thumper> git remote origin set-url git@github.com:anastasiamac/utils.git
<thumper> then no more password requests
<thumper> anastasiamac: do `git branch -r`
<menn0> anastasiamac: also worth trying "git fetch -v upstream" to find out more about the fetch
<anastasiamac> thumper: $ git remote origin set-url git@github.com:anastasiamac/utils.git
<anastasiamac> error: Unknown subcommand: origin
<thumper> anastasiamac: I got set-url and origin the wrong way around, it should be: git remote set-url origin git@github.com:anastasiamac/utils.git
<anastasiamac> menn0: with "git fetch -v upstream" i can see that 1.25 is there but it's not being fetched for me :(
<veebers> anastasiamac: I've kicked that node back into action
<anastasiamac> veebers: awesome
<anastasiamac> thumper: menn0: finally after setting up ssh, i can checkout 1.25 \o/
<anastasiamac> m moving on to creating my branch - THANK YOU :D
<menn0> anastasiamac: schweet
<menn0> thumper: prechecks wired up inside the InitiateMigration API handler: http://reviews.vapour.ws/r/5595/
<menn0> thumper: so much better from a UX perspective (see QA steps)
<thumper> looking
<menn0> thumper: just figured out the bug that jam filed a while back where migrations get stuck. it happens when you try to migrate a model back to a controller it's just been migrated from and the migrationmaster is still finishing up there.
<menn0> thumper: the cause is subtle but I know how to fix it now.
<wallyworld> axw: hey, hope your macaroon stuff is going ok. you probs don't have time today, but if you did get a few minutes free, would love a pretty straightforward review on a change to list-models output http://reviews.vapour.ws/r/5594
<axw> wallyworld: going ok, but slowly. will take a look
<wallyworld> only if you are able
<frobware> dimitern: ping - can we sync?
<dimitern> frobware: hey, sure, just give me 5m
<frobware> dimitern: ok, will be in standup HO
<babbageclunk> wallyworld: Around?
<voidspace> babbageclunk: ping
<babbageclunk> voidspace: pong
<voidspace> I have a test failure on master
<voidspace> state/metrics_test.go:748:
<voidspace>     c.Check(metricBatches[0].Unit(), gc.Equals, "metered/0")
<babbageclunk> godeps?
<babbageclunk> voidspace: ^^
<wallyworld> babbageclunk: hey
<voidspace> babbageclunk: nope, fully godep'ed up
<babbageclunk> wallyworld: hey!
<voidspace> babbageclunk: probably a map ordering issue
<perrito666> morning all btw
<voidspace> perrito666: o/
<babbageclunk> wallyworld: Replying to your email - maybe worth a hangout if you can? (Sorry, realise it's late for you)
<wallyworld> babbageclunk: no worries, give me 5 and i'll ping
<babbageclunk> wallyworld: awesome, thanks
<voidspace> babbageclunk: yeah, the result has two entries and they're swapped over
<voidspace> babbageclunk: so the test is order dependent
<babbageclunk> voidspace: stink
<voidspace> babbageclunk: easy enough to fix, I'll do it in my branch
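One way to make a check like the one voidspace quotes independent of result order is to sort the values before asserting on them. The following is only a minimal sketch: `metricBatch` is a hypothetical stand-in for the real state type, and the second unit name `metered/1` is assumed for illustration; it is not necessarily the fix that landed.

```go
package main

import (
	"fmt"
	"sort"
)

// metricBatch is a hypothetical stand-in for the type returned in the test.
type metricBatch struct{ unit string }

// Unit returns the name of the unit the batch belongs to.
func (m metricBatch) Unit() string { return m.unit }

// sortedUnits collects the unit names and sorts them, so assertions do not
// depend on the order in which the batches were returned.
func sortedUnits(batches []metricBatch) []string {
	units := make([]string, 0, len(batches))
	for _, b := range batches {
		units = append(units, b.Unit())
	}
	sort.Strings(units)
	return units
}

func main() {
	// The two entries arrive swapped relative to what an index-based
	// check such as metricBatches[0].Unit() would expect.
	batches := []metricBatch{{unit: "metered/1"}, {unit: "metered/0"}}
	fmt.Println(sortedUnits(batches))
}
```

The test can then compare the sorted slice against the expected names instead of indexing into the raw result.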
<frobware> dimitern: which HO?
<dimitern> frobware: I've realized it's not there so I just added one
<dimitern> https://hangouts.google.com/hangouts/_/canonical.com/juju-dns-nss
<dimitern> jam: ^^
<jam> dimitern: brt
<perrito666> wow 500M of Res memory is a bit of a heavy footprint
<mup> Bug #1613992 changed: 1.25.6 "ERROR juju.worker.uniter.filter filter.go:137 tomb: dying" <landscape> <juju-core:Won't Fix> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1613992>
<wallyworld> babbageclunk: free now, did you want to hang out?
<babbageclunk> wallyworld: yes please - how about in core?
<wallyworld> sure got a link handy?
<wallyworld> i don't have that one visible in my calendar
<babbageclunk> wallyworld: https://hangouts.google.com/hangouts/_/canonical.com/core
<wallyworld> babbageclunk: ah damn, missed one place - ControllerInstances is called when restoring from backup
<babbageclunk> wallyworld: doh
<wallyworld> so we'll need that query by tag
<wallyworld> damn
<babbageclunk> Oh well
<wallyworld> so close
<babbageclunk> But it's still only query by one tag for now, right? I don't need to do multiple calls and then intersect them together?
 * perrito666 hears backup and peeks in
<babbageclunk> wallyworld: Oh no, I do - the machines in the controller model are not all necessarily controllers.
<wallyworld> correct
<wallyworld> iscontroller=true && controller-uuid=blah
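The "query by more than one tag" requirement above amounts to keeping only the machines that match every requested tag. A rough sketch under stated assumptions: the `instance` type and `matchAll` helper are hypothetical, and the tag keys are copied from the chat rather than from any provider API.

```go
package main

import "fmt"

// instance is a hypothetical record: an ID plus its provider tags.
type instance struct {
	id   string
	tags map[string]string
}

// matchAll returns the IDs of the instances that carry every key/value
// pair in want. This is the intersection of the single-tag queries.
func matchAll(instances []instance, want map[string]string) []string {
	var ids []string
	for _, inst := range instances {
		ok := true
		for k, v := range want {
			if inst.tags[k] != v {
				ok = false
				break
			}
		}
		if ok {
			ids = append(ids, inst.id)
		}
	}
	return ids
}

func main() {
	instances := []instance{
		{id: "m0", tags: map[string]string{"iscontroller": "true", "controller-uuid": "blah"}},
		// A machine in the controller model that is not itself a controller.
		{id: "m1", tags: map[string]string{"controller-uuid": "blah"}},
		// A controller that belongs to a different controller UUID.
		{id: "m2", tags: map[string]string{"iscontroller": "true", "controller-uuid": "other"}},
	}
	want := map[string]string{"iscontroller": "true", "controller-uuid": "blah"}
	fmt.Println(matchAll(instances, want))
}
```

If the provider can only filter on one tag per call, the same predicate can be applied client-side to the result of the narrower query.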
<frobware> dimitern: I pushed an update for bridging all interfaces - https://github.com/frobware/juju/tree/master-bridge-all-interfaces
<dimitern> frobware: ok, looking
<frobware> dimitern: that's just the unit test cases updated (and the bridge script updated). Was going to try some genuine deployments next.
<dimitern> frobware: at first glance looks like it does what we need
<frobware> voidspace: standup?
<voidspace> frobware: oh yeah, thanks
<dimitern> frobware, voidspace, babbageclunk: can you please have a look at this tiny PR: http://reviews.vapour.ws/r/5597/
<babbageclunk> dimitern: LGTM!
<dimitern> babbageclunk: thanks!
<dimitern> frobware: I'll land this unless you have comments I guess?
<frobware> dimitern: was just looking at it
<dimitern> frobware: ok, no rush then
<frobware> dimitern: do you know if the sysfs paths are consistent across kernel versions?
<dimitern> frobware: as far as I can see - yes, tested on trusty, xenial, and centos 7
<frobware> dimitern: I wonder why "brif" is not listed in here: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-net
<dimitern> frobware: not sure, but I've seen lots of examples using brif
<frobware> dimitern: OK I see this in if_bridge.h - #define SYSFS_BRIDGE_PORT_SUBDIR "brif"
<dimitern> frobware: also this - https://books.google.bg/books?id=ALapr7CvAKkC&pg=PT420&lpg=PT420&dq=sys+class+net+bridge+brif&source=bl&ots=gVdzaVkYPv&sig=PVTMP6WszdmaCcAj4v6hNmYY67c&hl=en&sa=X&ved=0ahUKEwje3OPnwfjOAhUDWxQKHVJqBBYQ6AEIKTAC#v=onepage&q=sys%20class%20net%20bridge%20brif&f=false
<frobware> dimitern: I'll believe the header file :)
<frobware> dimitern: LGTM
<dimitern> frobware: :) thanks!
<frobware> dimitern: stepping out for early dinner (read no breakfast and no lunch)...
<dimitern> frobware: enjoy!
<frobware> dimitern: neither were intentional, just the way the day planned out
<dimitern> frobware: you've got a review on https://github.com/juju/juju/pull/6156 btw
<frobware> dimitern: thanks
<dimitern> frobware: I'm about to propose the next step - decided to split networkingcommon changes in 2 PRs for easier review
<frobware> dimitern: doh! :)
<dimitern> frobware: here's the next step: http://reviews.vapour.ws/r/5598/
<dimitern> I don't expect you to review it tonight :) just FYI
<axw> wallyworld: sorry I was a bit rushed in my review yesterday. I'll try again later.
<axw> wallyworld: it would be helpful if there were a definition for the access levels for models somewhere. we don't have that do we?
<wallyworld> axw: there's a doc, let me find it, but it is a bit vague tbh
<menn0> thumper: standup?
<thumper> coming
<anastasiamac_> thumper: have u made the decision about Jesse's MADE PRs?
<thumper> no
<thumper> I'm waiting until after 2.0 to look
<thumper> no time right now
<thumper> also
 * thumper weeps at apiserver/params importing mongo
 * thumper looks at perrito666
<thumper> UpgradeMongoParams
<thumper> apiserver/params shouldn't import external packages
<thumper> especially not for arg structures
<thumper> I'
<thumper> I'll fix it in my branch
<thumper> because I have introduced the cycle
#juju-dev 2016-09-06
<menn0> thumper: Fix for the "migrations get stuck" bug: http://reviews.vapour.ws/r/5599/
 * thumper looks
<thumper> menn0: just one Q on the review
<menn0> ok
<thumper> oh ffs, will this branch die...
<anastasiamac_> thumper: u reviewed Casey's PR yesterday (http://reviews.vapour.ws/r/5588/) r u k to stamp it?
<axw> wallyworld: still going on this, but if you have any time I'd appreciate any early comments before I delve into unit tests: https://github.com/juju/juju/compare/master...axw:login-local-macaroon?expand=1
<wallyworld> ok, give me 10
<axw> wallyworld: in particular, there's some UX changes
<wallyworld> or 5
<axw> sure, take your time
<anastasiamac_> axw: menn0:wallyworld:thumper: any idea who is looking after https://github.com/juju/blobstore? us or Uros's team?
<wallyworld> either
<wallyworld> whoever needs to make changes does so
<wallyworld> we created it initially
<wallyworld> they have landed patches since
<menn0> anastasiamac_: I emailed rog about it yesterday. what's up?
<anastasiamac_> wallyworld: menn0: m trying to determine if the issues against this libraries are ours (and hence need to be in launchpad) or theirs (and then I do not need to worry)
<anastasiamac_> library*
<wallyworld> i don't think we track bugs for that library on launchpad
<wallyworld> any issues for those upstream libraries in github.com/juju are tracked in github IIANM
<anastasiamac_> wallyworld: not all, there are some bugs in launchpad. for example, most recently, licence bugs were floating around
<wallyworld> oh? ok. what project were they filed against?
<anastasiamac_> wallyworld: and as far as I can see we r not *really* tracking in github
<menn0> anastasiamac_: in case it's relevant, I emailed rog b/c we don't use github.com/juju/blobstore directly. We just use the newer version at gopkg.in/juju/blobstore.v2.
<wallyworld> oh, that needs to be fixed :-(
<menn0> anastasiamac_: the only reason we still have it in dependencies.tsv is because charmstore.v5 still uses it
<anastasiamac_> menn0: tyvm - u r the best \o/ we'll wait for rog :D
<menn0> rog said that it shouldn't cause a problem b/c of the way it's used
<wallyworld> we could do that one ourselves if we really wanted to :-)
<menn0> and that they're moving away from using mongodb for a blobstore "soon" anyway
<anastasiamac_> oooh.. what r they moving toward to?
<menn0> anastasiamac_: no idea, he didn't say
<anastasiamac_> interesting ... :D
<menn0> this is what rog said: We actually plan to move away from using mongo-based blobstore at some point but in the meantime we should probably move to v2. I don't see any potential harm in having both versions as deps in juju though in the meantime - neither have any global state AFAIR.
<anastasiamac_> menn0: \o/ k
 * wallyworld hates carrying unnecessary deps
<wallyworld> adds to build time and landing bot time
<anastasiamac_> wallyworld: so here is one for ur and thumper's delight: 2 PR against juju/cmd (on github) LGTM'ed in 2014.. one can be merged, one has conflicts :D
<wallyworld> "delight"
<anastasiamac_> wallyworld: thumper: could we plx decide if we close them or land them?
<anastasiamac_> \o/
<anastasiamac_> always "delight"
<anastasiamac_> hand-in-hand with "wonder" :)
 * thumper relocates
<anastasiamac_> to OZ?
<thumper> dropping kid at bjj and working from cafe there
<thumper> back online from cafe
<anastasiamac_> wallyworld: i think u've just been delegated
 * wallyworld is busy, will look later
<anastasiamac_> awesome \o/
<anastasiamac_> axw: menn0: wallyworld: is https://github.com/juju/errgo officially deprecated in favour of juju/errors?
<wallyworld> only in core
<wallyworld> it's still used upstream :-( :-(
<axw> wallyworld: does it? macaroon stuff uses gopkg.in/errgo.v1
<wallyworld> axw: that's my point :-)
<axw> can't see any imports - I think it's dead
<wallyworld> i want to use juju/errors everywhere
<anastasiamac_> wallyworld: axw: so if there is a PR against this library, who is to review/land?
<axw> wallyworld: I just meant there's gopkg.in/errgo.v1, which != juju/errgo. but yes, we still have two
<anastasiamac_> r we responsible for it?
<wallyworld> we = rog
<anastasiamac_> we != rog :D
<wallyworld> juju core doesn't maintain errgo
<wallyworld> juju core would like not to have to pull in errgo at all
<anastasiamac_> \o/
<anastasiamac_> i'll happily ignore these PRs then :D
<anastasiamac_> wallyworld: axw: menn0: what about https://github.com/juju/errors?
<wallyworld> that's out error package
<wallyworld> our
<menn0> kinda need that one :)
<axw> we could just do better and stop creating errors
<wallyworld> lol
<anastasiamac_> oooh, axw, wallyworld: what about juju/go4?.. talking about lugging things around :D
<axw> anastasiamac_: I think rog is using that
<anastasiamac_> k :) 4? isn't he on the tip of go7?
<wallyworld> that's not go the sdk
<anastasiamac_> wallyworld: axw: what about juju/ratelimit? is it rog too?
<wallyworld> not sure, would need to search the code base
<thumper> :-( 	 cmd/juju/controller/listblocks_test.go           |  124 -------------
<thumper>  49 files changed, 1014 insertions(+), 1238 deletions(-)
<thumper> tim@elwood:~/go/src/github.com/juju/juju (block-cmd-rework)$ git diff master | wc -l
<thumper> 3229
<thumper> wallyworld: sorry
<wallyworld> yay!
<wallyworld> and i know you're not sorry at all
<thumper> http://reviews.vapour.ws/r/5600/
<thumper> wallyworld: just adding qa steps
<thumper> wallyworld: it has been epic
<thumper> wallyworld: that review addresses all the block related commands, and a few drive by fixes
<thumper> some of which were necessary, others not entirely so
<wallyworld> ok, am finishing something, then looking at andrew's diff then yours
 * thumper nods
<thumper> the change that introduced the import cycle that I had to fix was moving the CmdBlockHelper out of cmd/juju/common
<thumper> it was the only file that brought gocheck into the package, which made gocheck a non-testing dependency
<thumper> moved it into testing
<thumper> that introduced an import cycle in the api package
<thumper> due to the mongo.Version structure over the api
<anastasiamac_> thumper: what an adventure! thank you :D
<veebers> thumper: That PR removes the block, unblock commands right?
<thumper> veebers: it renames
<thumper> veebers: block -> disable-command
<thumper> unblock -> enable-command
<thumper> block list -> disabled-commands
<veebers> thumper: sorry yeah renames. That's going to need a change to the assess_block CI test when it lands
 * thumper nods
<veebers> thumper: I should be able to rough it up now, but not sure if it'll land before/when your PR does
<thumper> veebers: the branch is still being reviewed
<veebers> thumper: ack, I'll get something ready in the wings
 * thumper relocates again
<axw> wallyworld: I've just updated my branch, I think I've got all the loose ends now. now I need to write a bunch of tests
<wallyworld> axw: ok, almost ready to look
<wallyworld> axw: looking now. i've added to my PR to address the comments, but also added machine and core count to show-controller. a difference with current behavior is that it does an AllModels() api call to get the model info to stick in the output, rather than simply only looking at locally cached info
<wallyworld> and that's more correct anyway
<wallyworld> but it doesn't currently update the local yaml though
<axw> wallyworld: doesn't update the cache for list-controllers you mean?
<wallyworld> axw: doesn't update the local model yaml
<wallyworld> so this is for show-controller
<wallyworld> previously, show controller would only look at local models yaml
<wallyworld> now it makes an api call
<wallyworld> as it needs to get the machine and core info
<wallyworld> and so it also then gets the latest models as well
<axw> wallyworld: looks fine, but can you please get OCR to check too. it's over 500 lines now
<wallyworld> ffs
<wallyworld> i was going to propose separately
<wallyworld> should have done that
<wallyworld> at the expense of velocity
<menn0> anastasiamac_: I improved that test as a result of your comment
<axw> wallyworld: I only say because I'm actually having trouble keeping track of all the changes
<anastasiamac_> menn0: veni, vidi, vici
<axw> tiny brain
<anastasiamac_> menn0: i've commented and stamped \o/ looks awesome
<wallyworld> axw: you don't need to import errgo just to call Cause() - juju/errors.Cause() will work IIANM
<axw> wallyworld: ah ok, thanks
<wallyworld> axw: errgo and errors both define their causer interface identically
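The point about identically defined causer interfaces can be illustrated with a small, self-contained sketch. It does not use either real package: `causer`, `wrapped` and `cause` below are local stand-ins, written on the assumption that both libraries expose a `Cause() error` method as described in the chat.

```go
package main

import (
	"errors"
	"fmt"
)

// causer mirrors the method set the discussion says both packages share.
type causer interface {
	Cause() error
}

// wrapped is a hypothetical error type from "some other package" that
// happens to provide a Cause method.
type wrapped struct {
	msg   string
	cause error
}

func (w *wrapped) Error() string { return w.msg + ": " + w.cause.Error() }
func (w *wrapped) Cause() error  { return w.cause }

// errNotFound is an example sentinel error.
var errNotFound = errors.New("not found")

// cause returns the underlying cause when err provides a non-nil one,
// otherwise err itself. Go interfaces are satisfied structurally, so the
// type assertion succeeds for any error with a matching Cause method, no
// matter which package declared the interface it was written against.
func cause(err error) error {
	if c, ok := err.(causer); ok {
		if inner := c.Cause(); inner != nil {
			return inner
		}
	}
	return err
}

func main() {
	err := &wrapped{msg: "cannot read model", cause: errNotFound}
	fmt.Println(cause(err) == errNotFound) // the cause is unwrapped
	fmt.Println(cause(errNotFound) == errNotFound)
}
```

This is why importing a second error package only to unwrap a cause should not be necessary, as long as the error values really do implement the same method.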
<axw> wallyworld: it might be more useful for you to pull the branch and test it, than to try and review it as it is. I'll break it up later on and propose a few bits separately. mostly it'll have to be done as one though
<axw> if you don't have time, that's fine
<wallyworld> axw: ok, can do. the fine detail of the auth stuff is a little out of my comfort zone
<axw> wallyworld: no worries. me too, hence why it's taken so long :)
<wallyworld> have to wait for a 2nd review anyway :-/
<wallyworld> def get rog to look :-)
<wallyworld> axw: actually, looks like there will be a conflict, wanna rebase before i pull?
<axw> wallyworld: sure, just a minute
<axw> wallyworld: pushed. I just realised I missed something, which is that "juju logout" will need to clear any cookies for the controller. I might do that in a follow up though
<wallyworld> ok
<wallyworld> axw: appears to work ok. password change, deleting go-cookies etc. i get prompted as i would expect for the password and then from there it seems to use the macaroon ok. any thing in particular else to test?
<axw> wallyworld: not really, mostly just wanted you to sanity check. does it feel natural to you? as opposed to how it was, where you would get told that your login expired, and you had to run "juju login"
<axw> wallyworld: FYI, I also tested "juju register" to make sure you get logged in automatically still
<wallyworld> axw: yeah, much nicer to have you simply prompted
<axw> cool
<wallyworld> IMHO
<axw> wallyworld: thanks. updating tests now, will propose it for real later on
<wallyworld> axw: eg with github, if my cached password expires, it simply just prompts again
 * axw nods
<wallyworld> anastasiamac_: if you get a chance, i'd love a 2nd look at this, it's a fraction over 500 lines sadly http://reviews.vapour.ws/r/5594/
<babbageclunk> wallyworld: Morning!
<wallyworld> hey
<babbageclunk> wallyworld: Any idea what this maas user data that thumper's talking about is?
<babbageclunk> wallyworld: I can't find it.
<wallyworld> no :-(
<wallyworld> i'm sure the maas guys told me to use tags
<babbageclunk> Seems kind of unlikely that there's yet another way of searching for machines.
<wallyworld> yeah
<babbageclunk> ok cool - I was feeling a bit silly there
<wallyworld> you and me both
<wallyworld> might pay to ask a maas person directly
<babbageclunk> are there any in a convenient timezone?
<babbageclunk> I'll ask allenap once he's up.
<anastasiamac_> wallyworld: was afk - kids/school/life (apparently, there is that). I can have a look after dinner/kids bedtime :D is it k?
<wallyworld> anastasiamac_: sure, am working on the next bit in preparation
<anastasiamac_> wallyworld: \o/ tyvm :) hope u have dinner in btw too :D
<wallyworld> yeah, at some point :-)
<axw> wallyworld menn0: to what degree is migration supposed to be stable in 2.0? there's some macaroon stuff in there that will need to change
<wallyworld> axw: you mean model migration or schema changes for upgrades?
<axw> wallyworld: model migration
<wallyworld> axw: we have some latitude - an external tool is being used to handle parts of the model the 2.0 agent doesn't know about
<wallyworld> assuming the stuff you need done doesn't make 2.0
<axw> wallyworld: well I'm going to have to disable the ability to use macaroons to do the migration, so it would only be usable if you have a password in accounts.yaml
<frobware> mgz: you about?
<frobware> mgz: ah, no... you're crewing...
<voidspace> babbageclunk: ping
<babbageclunk> voidspace: pong
 * rick_h_ pokes head up and looks around
<macgreagoir> frobware: You have 5 mins? Wanna stay in that HO?
<frobware> macgreagoir: I'm still in there
<macgreagoir> frobware: Hmmm... I can't see you...
<perrito666> bbl lunch
<alexisb> perrito666, ping
<perrito666> alexisb: pong
<alexisb> perrito666, have you picked up a bug yet?
<perrito666> yes I have, https://bugs.launchpad.net/juju/+bug/1616197
<mup> Bug #1616197: juju restore-backup error <backup-restore> <juju:In Progress by alexis-bruemmer> <https://launchpad.net/bugs/1616197>
<perrito666> but I am in time to switch if you have something not backup related
<perrito666> anything at all
<alexisb> perrito666, ok proceed thank you
<alexisb> nope that is a good one for you to take
<perrito666> k, Ill proceed
 * perrito666 puts the hazmat suit again
<thumper> morning folks
<perrito666> going to fetch some meds, bbl
<alexisb> morning thumper
<thumper> alexisb: morning
<perrito666> Thumper sorry for whatever I broke that you were saying last night
<perrito666> I already forgot what it was
<thumper> perrito666: already fixed and landed
<thumper> reminder though not to use external package structures over the api
<thumper> apiserver/params was using mgo.Version
<thumper> no
<thumper> mongo.Version
<thumper> anyway
<thumper> now it isn't
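The rule thumper restates (no external package structures in API argument types) usually means giving the params package its own plain struct and converting at the boundary. The sketch below only illustrates that shape; every field name is hypothetical and `mongoVersion` merely stands in for the external type mentioned in the chat.

```go
package main

import "fmt"

// mongoVersion stands in for a type owned by another package
// (mongo.Version in the discussion). Its fields are hypothetical.
type mongoVersion struct {
	Major, Minor  int
	StorageEngine string
}

// MongoVersion is the wire-level struct that would live in the params
// package. It uses only built-in types, so params needs no extra imports
// and cannot take part in an import cycle through this type.
type MongoVersion struct {
	Major         int    `json:"major"`
	Minor         int    `json:"minor"`
	StorageEngine string `json:"engine"`
}

// toParams converts at the API boundary, in a package that is allowed
// to import both sides.
func toParams(v mongoVersion) MongoVersion {
	return MongoVersion{
		Major:         v.Major,
		Minor:         v.Minor,
		StorageEngine: v.StorageEngine,
	}
}

func main() {
	v := mongoVersion{Major: 3, Minor: 2, StorageEngine: "wiredTiger"}
	fmt.Printf("%+v\n", toParams(v))
}
```

The cost is a little duplication; the benefit is that the argument structures stay importable from anywhere, including test helpers.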
<veebers> thumper: what's the easiest way to find out which beta version the block command was added to?
<thumper> veebers: um... block has been around for a long time
<thumper> do you mean my change to block?
<veebers> thumper: oh no, I meant originally added, this is for updating the ci tests :-)
<veebers> thumper: all good, that pretty much answers my question
<thumper> if `juju block list` returns non-zero, it isn't found and has new commands
<thumper> veebers: it is in 1.25
<veebers> thumper: ah right, thanks :-)
<thumper> BOOM!
<thumper> success on that test
<menn0> wallyworld: ping
<wallyworld> hey
<veebers> thumper: Am I right in my reading that you changed all-changes -> all for enable-command? (the PR says you did but the help text still says 'all-changes')
<thumper> veebers: yes, "all-changes" is now just "all"
<veebers> thumper: Cool, I'll file a bug for the help text
<mup> Bug #1620830 opened: destroyEnvSuite.TestDestroyEnvironmentCommandEFlag interface is nil <ci> <intermittent-failure> <panic> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1620830>
<mup> Bug #1620832 opened: github.com/juju/juju/featuretests MSpan_Sweep: bad span state <ci> <go1.6> <intermittent-failure> <regression> <unit-tests> <juju:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1620832>
 * redir goes to make tea and change computers, brb
<thumper> menn0 or other reviewer? http://reviews.vapour.ws/r/5609/
<menn0> thumper: looking
<menn0> axw: ping me when you're around regarding migrations + macaroon issues
<thumper> update juju/testing dep http://reviews.vapour.ws/r/5610/
<thumper> one line review
<thumper> wallyworld: ^^
<menn0> thumper: more ship its
<thumper> menn0: cheers
<axw> menn0: I'm here now
<thumper> review board not picking this up for some reason.https://github.com/juju/juju/pull/6172
<thumper> https://github.com/juju/juju/pull/6172
<thumper> wallyworld:  access stays unknown... http://pastebin.ubuntu.com/23143762/
<thumper> hmm... older controller
<thumper> perhaps wasn't sending right data back
<wallyworld> thumper: correct. did you even read the message
<wallyworld> --refresh
<thumper> see the last line
<thumper> with --refresh
<wallyworld> oh
<wallyworld> in that case the controller is sending back the access, i'll have to understand the repro steps
<thumper> bootstrapping newer
<wallyworld> thumper: at bootstrap, access and version is filled in, it should never say unknown unless you are running an older version
<thumper> wallyworld: I think it was probably due to it being an older controller
<wallyworld> my guess is you had an older version bootstrapped
<wallyworld> yep, that won't work
#juju-dev 2016-09-07
<wallyworld> thumper: this one fixes the race curtis was talking about, but it needs another +1 :-( http://reviews.vapour.ws/r/5594/
 * thumper looks
<thumper> wallyworld: why does the testing show a badly indented table?
<wallyworld> thumper: markdown char in paste
<menn0> thumper: internet is back up
<thumper> menn0: cool
<menn0> thumper: I must admit it was very productive having no distractions for a bit :)
<thumper> awesome
<menn0> axw: do you have time for a hangout to discuss macaroons
<lazyPower> i too, see the badly indented table
<lazyPower> wallyworld :D glad its not just me
<lazyPower> o/ heya core devs
<wallyworld> lazyPower: hey
<lazyPower> i dont suppose any of you are coming to the summit D:
<wallyworld> the charmer summit?
<wallyworld> perrito666 is
<lazyPower> wooo perrito666
<wallyworld> lazyPower: save up all the beatings until then please :-)
<lazyPower> haha no beatings! i want you people here so you can see and connect with your users!
<wallyworld> indeed, they can beat us
<lazyPower> you know, the reason - other than - endless milestone sheets being handed down from planning sessions.
<lazyPower> i'm so hype right now. I bet that wears off quickly :D
<wallyworld> lazyPower: i can't wait for juju 2.0 to hit rc next week - hard to go back to 1.25 now
<lazyPower> tell me about it. i'm still poking about in -stable for our testing on k8s
<lazyPower> we get charm testing fixed just in time for it to change again .surprise!
<axw> menn0: sorry was otp, free to hangout now
<menn0> axw: give me 30s
<wallyworld> lazyPower: as soon as we hit rc, no more breaking changes is the aim
<axw> the 1.25 CLI feels very clunky every time I go back to it
<axw> particularly bootstrap
<thumper> wallyworld: review done, few issues
<wallyworld> ok, ta
<thumper> lazyPower: wish I was, but alas, can't
<lazyPower> wallyworld - i'm not even mad my friend, not even a little bit....
<perrito666> lazyPower: as wallyworld said, you can beat me
<perrito666> and users can connect some punches too
<lazyPower> perrito666 - not sure when your team turned into masochists/sadists, but you wear the role really well :P
<lazyPower> thumper - shameeeeeeee my man! I was looking forward to seeing you again
<lazyPower> next time
<wallyworld> lazyPower: the beatings will continue until morale improves
<perrito666> lazyPower: somewhere around the first time we broke the gui without letting urulama know first
<thumper> lazyPower: I'm going to the CDO sprint in nov rather than the Juju one
<lazyPower> hahaha
<thumper> so we get coverage
<lazyPower> and hatch tries so hard to tell me they aren't core ;)
<thumper> you at that one?
<lazyPower> hatch - you're totally owned by core now, be tee dubs
<perrito666> lazyPower: dimitern is coming too so you can beat him about network stuff
<lazyPower> thumper - no idea. but i doubt it. Marco has done a good job of putting us grunts to work
<lazyPower> perrito666 - oh and i have plans on that one ;)
 * thumper nods
 * thumper heads out, time to pick up kiddo
<thumper> bbs
<lazyPower> cheers thumper
<lazyPower> perrito666 - well i look forward to hanging out during the summit as well :) seek me out and we can make good on those OCP chassis beers promised from forever ago
<wallyworld> thumper: issues addressed. i couldn't find any uses of those tw helper functions. am i the first to use them?
<thumper> no
 * thumper finds example
<thumper> blocks/list.go
<thumper> uses them
<thumper> w := output.Wrapper{tw}
<thumper> w.Println(value, value, value)
<thumper> or
<thumper> w.Print(value, value)
<wallyworld> thumper: oh, i found other ones
<wallyworld> output.TabWriterPrint(tw)
<wallyworld> i didn't see anywhere else using those
<thumper> yeah...
<thumper> those were written initially to replace the stand alone p functions
<wallyworld> they just wrap what you did by hand above :-)
<thumper> but in the end I felt it was better to do it other ways
<wallyworld> sigh :-)
<thumper> should probably just remove them :)
<wallyworld> double sigh
<wallyworld> i'll update the PR
<wallyworld> thumper: that wrapper really also needs a Printf so that if you are just printing one thing, you don't need to resort to fmt.Fprintf(tw, ...) and can consistently use w.Print everywhere
<thumper> Print can take just one thing
<thumper> what do you mean?
<wallyworld> thumper: you need to do fmt.Sprint() inside the w.Print()
<thumper> output.CurrentHighlight.Fprintf(tw, "%s\t", name) -> w.PrintColor(output.CurrentHighlight, name)
<thumper> wallyworld: for now, yes
<wallyworld> right, so if there were a w.Printf(onething) that would help
<thumper> you don't need it for %d
<thumper> w.Print(cores)
<wallyworld> but i have other cases where it's needed
<thumper> where?
<wallyworld> fmt.Fprintf(tw, "CONTROLLER: %v\n", c.ControllerName())
<thumper> other places, sure, but not in this branch
<wallyworld> that's in the bits i'm changing in this branch to use the new helper
<wallyworld> would be nice not to have to use fmt.Printf
<thumper> w.Println("CONTROLLER: " + c.ControllerName())
<thumper> sure
<wallyworld> sure, but FPrintf is nicer
<thumper> no it isn't
<thumper> :)
<wallyworld> just a suggestion for when you remove those other ones
<wallyworld> the above is a simplistic example
<thumper> I know
<wallyworld> so if you are in the area anyway..... to remove those obsolete helpers.....
 * thumper nods
<wallyworld> thumper: so is there a recommended way to replace this output.CurrentHighlight.Fprintf(tw, "%s\t", name)
<thumper> see comment 6 minutes ago
<wallyworld> ah :-)
<wallyworld> oh alright then
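The helper discussed above can be sketched roughly as follows. This is a minimal illustration, not the real juju `output.Wrapper`: the `Wrapper` type, its method set, and `renderExample` are assumptions made up for this sketch, and the `Printf` method is the addition wallyworld suggested rather than something known to exist.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"text/tabwriter"
)

// Wrapper is a hypothetical stand-in for the helper discussed in the log.
// It wraps any io.Writer (typically a *tabwriter.Writer).
type Wrapper struct {
	io.Writer
}

// Print writes each value followed by a tab, so each value becomes a cell.
func (w Wrapper) Print(values ...interface{}) {
	for _, v := range values {
		fmt.Fprintf(w, "%v\t", v)
	}
}

// Println writes the values as tab-separated cells and ends the row.
func (w Wrapper) Println(values ...interface{}) {
	for i, v := range values {
		if i > 0 {
			fmt.Fprint(w, "\t")
		}
		fmt.Fprintf(w, "%v", v)
	}
	fmt.Fprintln(w)
}

// Printf formats a single cell and terminates it with a tab
// (assumed behaviour for the suggested addition).
func (w Wrapper) Printf(format string, args ...interface{}) {
	fmt.Fprintf(w, format+"\t", args...)
}

// renderExample builds a two-row table and returns the aligned text.
func renderExample() string {
	var buf bytes.Buffer
	tw := tabwriter.NewWriter(&buf, 0, 1, 2, ' ', 0)
	w := Wrapper{tw}
	w.Println("NAME", "CORES")
	w.Printf("ctrl-%d", 1)
	w.Println(4)
	tw.Flush()
	return buf.String()
}
```

With a `Printf` on the wrapper, callers would not need to drop back to `fmt.Fprintf(tw, ...)` for formatted cells, which is the consistency argument made above.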
<thumper> I'm at that stage where I know what I need to do
<thumper> but it is more than I wanted to do...
 * thumper sighs
<thumper> oh well
 * thumper dives in
<thumper> minion, fetch me a coffee
<thumper> oh... no minions around, will fetch own coffee
<thumper> such are the problems of working from home
<redir> guests here. back later.
<menn0> axw: what type will be used to represent a macaroon cookie jar in CLI code? []macaroon.Slice?
<menn0> i'm just trying to make things as easy as I can for myself
<menn0> by having types line up
<axw> menn0: yeah, that's what api.Info accepts
<axw> that gives us most flexibility
<menn0> axw: I ask b/c the API was giving the macaroon to accounts.yaml as a string
<menn0> rather than a macaroon.Macaroon
<axw> menn0: that was just to avoid having a dependency
<axw> using []macaroon.Slice would be simplest I think
<menn0> ok great. that makes my life easier
<wallyworld> mwhudson: go 1.7 installed from the ppa doesn't seem to support running race tests, i have to resort to using 1.6
<wallyworld> runtime.raceinit: __tsan_init: not defined
<wallyworld> etc
<wallyworld> is there something i'm missing?
<mwhudson> wallyworld: install golang-1.7-race-detector-runtime from the ppa too?
<wallyworld> ah let me check
<mwhudson> should be recommended though
<mwhudson> do you have recommends turned off?
<wallyworld> i just installed golang-1.7 or whatever using apt-get
<wallyworld> not sure, i haven't changed it explicitly i don't think
<wallyworld> mwhudson: so gophers-archive/now doesn't have it listed but gophers-archive/xenial does
<mwhudson> wallyworld: i don't know what either of those things mean :-)
<mwhudson> wallyworld: this is the PPA you should be using https://launchpad.net/~gophers/+archive/ubuntu/archive/+packages
<wallyworld> me either, i just fired up synaptic and that's what it shows as package sources
<mwhudson> oh heh
<wallyworld> but yeah, i didn't have the race package installed
<wallyworld> works now, ty
<wallyworld> and is much faster
<wallyworld> thumper: can you see if you're ok with http://reviews.vapour.ws/r/5589/ now?
 * thumper looks
<wallyworld> anastasiamac_: not sure how busy you are, but would love a review on http://reviews.vapour.ws/r/5611/ if you get a moment today
 * thumper sighs
<thumper> some of the points were completely misunderstood
<thumper> wallyworld: let me finish what I'm doing and I'll go through it properly
<wallyworld> righto
<thumper> See Also:
<thumper> or
<thumper> See also:
<thumper> menn0: ^^
<thumper> There are 9 "See Also:"
<thumper> and 59 "See also:"
<natefinch> Latter... it's not a title, IMO.
<thumper> was there a definitive answer somewhere?
<thumper> wallyworld: ?^^
<wallyworld> thumper: i think menn0 said "See Also"
<thumper> he has to me before, but I'm collecting input
<thumper> I'll skip for now
<wallyworld> i sort of prefer "See also" but that's just IMO
<thumper> I was going to drive by fix them
<menn0> thumper: If "See also" is the majority let's just use that
<thumper> ok
 * thumper fixes
<natefinch> +1 for making the smaller fix that makes it consistent
<natefinch> wallyworld: this auto upload-tools thing is garbage, btw
<wallyworld> what do you really think?
<natefinch> wallyworld: sorry... it's just been frustrating for me almost every other day
<wallyworld> works great for all the use cases i know of
<wallyworld> especially snaps
<wallyworld> and devel
<natefinch> it seems like, if the version on my branch is ever older than what's in streams, I get what's in streams
<wallyworld> yes
<wallyworld> that's intended
<natefinch> what if I need to work on old code?
<wallyworld> use --build-agent
<wallyworld> it's been done to solve the 99% cases
<wallyworld> so if you work on old code, you just use --build-agent instead of --upload-tools
<wallyworld> does that not work for you?
<natefinch> I didn't realize build-agent was the new upload tools.
<natefinch> I thought there were some different semantics to it
<wallyworld> there is - it will always compile
<natefinch> oh, I don't (really) care about that
<wallyworld> whereas upload-tools used to not compile if it found a binary
<wallyworld> i'm sure i documented this either in email or the release notes
<natefinch> I guess I don't understand why we have auto upload-tools if we have build-agent
<wallyworld> for snaps
<wallyworld> and 99% of development cases
<natefinch> the problem during development is that depending on invisible data in the cloud (i.e. what is in streams), juju bootstrap will behave differently
<natefinch> so, every time curtis pushes a new build, the code I built that used to auto-upload now doesn't, until I pull.
<wallyworld> so use --build-agent as you used to use --upload-tools
<natefinch> (assuming master has been updated with a new version number which is also not always instant)
<natefinch> well, yes... but then why have auto-upload?
<wallyworld> for the other 99%, it's so liberating not to have to type --upload-tools all the time
<wallyworld> for snaps
<natefinch> ... I don't understand what "for snaps" means
<wallyworld> what don't you get? :-)
<wallyworld> snaps need to not require upload-tools
<natefinch> I know (vaguely) what a snap is.  I don't understand what that has to do with whether or not juju auto uploads
<natefinch> again, what does a snap have to do with juju?  (other than installing juju)
<wallyworld> a snap is a contained juju/jujud
<wallyworld> juju run from the snap will use the jujud in the snap
<wallyworld> for that to work, it needs auto discovery of the binary
<wallyworld> this is all needed to allow people to publish daily snaps etc to edge
<wallyworld> plus for development, people get so sick of having to type upload-tools all the time
<natefinch> heh
<wallyworld> and we also want people to be able to install debs and for stuff to just work
<wallyworld> this is all about 1. snaps, and 2. removing friction for bootstrap
<wallyworld> and it holds 99% of the time
<natefinch> so, wait... I still don't get what snaps have to do with this.  Why would the snap not just use what's in streams?  couldn't we just publish a nightly stream or something?
<wallyworld> but when running older code, you just use --build-agent
<wallyworld> snaps are self contained
<natefinch> sure... on the client
<wallyworld> and also when publishing to the edge channel
<wallyworld> we want to remove friction for people to say "try my stuff"
<wallyworld> creating a snap is easy
<wallyworld> publishing jujud to some stream you don't control is crack
<natefinch> sure
<natefinch> ok, so this is for when a dev wants to give a test binary for someone to try out a fix or new feature.  Ok
<wallyworld> yep, also daily snaps so people can get the latest stuff
<wallyworld> for example, menno did one recently
<wallyworld> to test out a status feature
<menn0> axw: this macaroons change is tedious. i'm almost done but need to do a preschool pickup soon.
<menn0> will continue afterwards
<axw> menn0: no worries, thank you
<natefinch> I still wish it always worked the same way unless you gave it a flag.... For example, we could build the PPA with a flag that makes the default not-upload-tools and for everything else, have it default to upload-tools. Then at least it would be consistent.  My main problem is that I don't *trust* it, because it has failed for me a few times, and it's really hard to pick out the right line of logging as it scrolls past.
<wallyworld> we are moving away from ppas
<wallyworld> failed is subjective - it's behaved exactly as designed
<wallyworld> you just didn't know about --build-agent
<thumper> menn0: http://reviews.vapour.ws/r/5612/
<wallyworld> also bootstrap messages have been improved to show what version is being used
<natefinch> the design is bad, because it behaves differently than my intent.... some of the time.  And it behaves differently according to factors that are basically impossible for me to know ahead of time... and even while bootstrapping, it's difficult to see if it's uploading or not
<wallyworld> natefinch: there are flags to force intent, like agent-version
<wallyworld> if you want to be explicit, you can
<wallyworld> otherwise, it will find a streams version which matches the client
<wallyworld> that's very easy to understand the behaviour
<wallyworld> and if you are developing and want to compile a binary which basically is not really what it says it is, then use --build-agent
<natefinch> a brake pedal that doesn't work when the moon is in conjunction with pisces is very easy to understand too... that doesn't make it a good design.  One gets used to it working, and then one day, without any visible changes, it acts differently
<wallyworld> the changes are documented, the change in semantics is deliberate. sometimes shit changes
<wallyworld> the change solves the 99% cases very well, and for the other 1%, use --build-agent
<natefinch> I'm not saying that the code changes are the problem.... I'm saying that with the current code "juju bootstrap" has different results based on effectively the stage of the moon
<thumper> wallyworld: just for you http://reviews.vapour.ws/r/5613/
<wallyworld> looking
<thumper> wallyworld: actually http://reviews.vapour.ws/r/5612/ was for you too
<wallyworld> thumper: did you reopen those 3 issues on that pr? i don't agree that pr should solve the check empty args case
<thumper> not sure why menno got it
<thumper> wallyworld: if we are changing how the command works, then giving the user good feedback when they do something wrong is part of the work
<thumper> consider quality :P
<natefinch> wallyworld: I'll just use build-agent all the time from now on.  That will fix my problems.
<wallyworld> natefinch: it behaves deterministically. it will search for a packaged version which matches the client version
<wallyworld> there's no moon involved
<thumper> wallyworld: yes, I reopened those 3 issues
<wallyworld> if the client reports it is a version other than what it is, then yes build-agent is needed
<natefinch> there is a moon. it's the contents of streams.  When I'm typing at my terminal, there's no way for me to know what's in streams.
<wallyworld> how so?
<wallyworld> there's the validate-tools plugin
<wallyworld> that will tell you
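The bootstrap behaviour argued about above can be modelled as a small decision function. This is only a sketch of the rules as stated in this conversation, not juju's actual code: `AgentSource`, `chooseAgent`, and the version strings are hypothetical, and real version matching is assumed to be more involved than an exact string comparison.

```go
package main

import "fmt"

// AgentSource names where bootstrap gets its jujud binary from.
// These names are illustrative only.
type AgentSource string

const (
	BuildAndUpload AgentSource = "build-and-upload" // --build-agent: always compile
	UseStreams     AgentSource = "streams"          // a packaged agent matches the client
	UploadLocal    AgentSource = "upload-local"     // automatic upload of the local jujud
)

// chooseAgent models the rules described in the log:
//  1. --build-agent always compiles and uploads the local code;
//  2. otherwise a streams agent matching the client version wins;
//  3. otherwise the local jujud is uploaded automatically.
func chooseAgent(clientVersion string, streams []string, buildAgent bool) AgentSource {
	if buildAgent {
		return BuildAndUpload
	}
	for _, v := range streams {
		if v == clientVersion {
			return UseStreams
		}
	}
	return UploadLocal
}

// describe renders the decision as a line a bootstrap log might show.
func describe(clientVersion string, streams []string, buildAgent bool) string {
	return fmt.Sprintf("client %s -> %s", clientVersion, chooseAgent(clientVersion, streams, buildAgent))
}
```

Under this model the outcome depends on both the flag and the contents of streams, which is natefinch's complaint. It is also fully determined by those two inputs, which is wallyworld's point.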
<menn0> axw: change of plan... not doing pickup
<wallyworld> thumper: awesome, that's for printf
<wallyworld> *thanks
<menn0> thumper: in your PR's QA steps, I presume you deleted the model before running destroy-controller ?
<menn0> otherwise you would have needed --destroy-all-models right?
<menn0> thumper: review done
<thumper> menn0: I did think that was weird
<thumper> no, I didn't
<menn0> hmmm
<thumper> I think that may be a different bug
<menn0> maybe the -y is (incorrectly) allowing the check to be bypassed?
<thumper> no
<thumper> don't think so
<wallyworld> thumper: reviewed, i have opinions on some of the text, see what you think
<thumper> ok
 * thumper away from internet for a bit
<thumper> kiddo dropoff and wait
<menn0> axw: here's the support for multiple macaroons with migrations: http://reviews.vapour.ws/r/5614/
<menn0> adding QA steps
<axw> menn0: TYVM, reviewing now
<axw> menn0: LGTM. is there some doc somewhere which describes the commands to use for migrating? I will need to QA also when I make the changes to read from ~/.go-cookies
<menn0> axw: it's just one command: juju migrate
<menn0> axw: but I'll email you some details
<axw> menn0: cheers
<wallyworld> axw: if you get a moment after the macaroon stuff, here's a small win http://reviews.vapour.ws/dashboard/
<axw> wallyworld: I'll take a look now
<wallyworld> and if you are feeling like poking your eye out and have time later also http://reviews.vapour.ws/r/5611/
<axw> assume you mean the local: prefix one
<wallyworld> ty
<wallyworld> yeah, or both, or whatever you have time for
<wallyworld> macaroons comes first
<wallyworld> the local: one is small
<wallyworld> axw: i am 100% (well 99.99%) sure our builders are all 1.6. i can confirm and land separately. also that localhosthomestack thing is deliberate to ensure the openstack cloud abuts the lxd one and is last in the file (the \n are stripped off)
<axw> wallyworld: lemme just re-read that last bit then, I must have glanced over something
<axw> wallyworld: ok I understand. can you please add a comment explaining that?
<wallyworld> sure
<wallyworld> thanks for review
<axw> np
<menn0> axw: i've just sent the multiple macaroons PR for merging
<axw> menn0: awesome, thanks!
<redir> wallyworld: yt?
<wallyworld> depends who's asking :-)
<redir> Ed McMahon
<redir> wallyworld: ^
<wallyworld> oh, well in that case
<redir> wallyworld: thumper raised an issue re: model-config returning a string of the value when --format=json is present
<redir> which makes sense to me
<wallyworld> yeah, i think it's a valid point
<redir> but should it return '{value}'
<wallyworld> shouldn't it be {name=value}
<redir> or '{key: {value: value, source: model}}'
<wallyworld> yeah, something like that
<redir> as if it is one element of the lot
<wallyworld> what do we do for the yaml
<wallyworld> i think with yaml it prints out gobs of extra metadata
<redir> but tabular should still just output the value
<redir> wallyworld: so does json
<wallyworld> yeah, users want to have the option to get *just* the value printed
<redir> wallyworld: works for me
<wallyworld> for without a format arg that's what they should get
<wallyworld> but json and yaml should be formatted
<wallyworld> i'd argue yaml and json should have equivalent info
<wallyworld> all the extra metadata
<wallyworld> so just match what the yaml does now
<redir> wallyworld: but I think that affects the juju model-config --format=yaml | juju model-config --yaml -
<redir> idea
<wallyworld> redir: oh, wait, i was thinking of *application* config sorry
<redir> heh
<wallyworld> that is the one with gobs of extra metadata
<wallyworld> so we can still support the pipe, the get just needs to ignore the source attribute. but for now, let's just not include it in the json or yaml
<redir> wallyworld: model-config prints the key: \n value: value\nsource: thesource\n for yaml now in model-config
<wallyworld> so tabular = value, from etc
<wallyworld> ok
<redir> OK
<wallyworld> so the get should just ignore that field right?
<redir> so tabular single gets just the string value
<redir> everything else structured
<wallyworld> yeah, tabular doesn't change
<redir> wallyworld: makes sense to me
<wallyworld> and my comment before about *just* the value - that's for application config
<redir> just wanted to verify before doing and footshooting
<wallyworld> so i think the only change here is to the json output?
<wallyworld> to fix that bit
<wallyworld> yaml is ok as is right?
<wallyworld> and tabular
<wallyworld> is ok
<redir> wallyworld: yes I think so.
<wallyworld> sgtm
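The outcome agreed above (tabular prints just the value, json is structured like one element of the full listing) could look roughly like this. It is a sketch under assumptions: `ConfigValue`, `formatSingle`, and the example key are invented for illustration, and the JSON field names follow redir's `{key: {value: ..., source: ...}}` suggestion rather than any confirmed juju output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ConfigValue is a hypothetical shape for one model-config attribute.
type ConfigValue struct {
	Value  interface{} `json:"value"`
	Source string      `json:"source"`
}

// formatSingle renders a single attribute. For "json" it emits the same
// structure one element of the full listing would have; for anything else
// (the tabular default) it emits only the bare value.
func formatSingle(key string, v ConfigValue, format string) (string, error) {
	if format == "json" {
		out, err := json.Marshal(map[string]ConfigValue{key: v})
		if err != nil {
			return "", err
		}
		return string(out), nil
	}
	return fmt.Sprint(v.Value), nil
}
```

Keeping the single-key json output shaped like the full listing means a consumer can parse both with the same code.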
<redir> I didn't realize we didn't do that already:/
<wallyworld> the json issue?
<redir> wallyworld: yes
<wallyworld> yeah, i had no idea either
<wallyworld> wtf
<redir> added tests too
 * redir shrugs
<redir> should have an update and a PR for the model-defaults stuff soon
<wallyworld> with the other comment(s), use "configuration values" i think?
<redir> then tomorrow hopefully they'll land and I can get the --region bits in for defaults
<wallyworld> ok, awesome
<redir> http://reviews.vapour.ws/r/5589/ is ready for another look.
<redir> and also http://reviews.vapour.ws/r/5616/ is ready. This second one is stacked on the first, so it may be easier to review after it lands. It is nearly identical but to flatten model-defaults rather than model-config.
<redir> g'nite
<jam> frobware: dimitern: ping
<jam> we're talking about the IPAM spec, and we wanted to get some info about the rationale for a couple of the decisions.
<frobware> jam: how/where do you want to do this?
<jam> frobware: https://hangouts.google.com/hangouts/_/canonical.com/tech-board
<fwereade> sorry, hangouts flailing
<dimitern> anastasiamac_: I've updated http://reviews.vapour.ws/r/5598/ with your suggestions and added a minute change to populate parent name for bridge ports, I'd appreciate a second look, if you can
<dimitern> frobware: ^^
<anastasiamac_> dimitern: tyvm! I'll need to go afk - apparently kids need to go to bed :D - i'll look when i return later on \o/
<dimitern> anastasiamac_: sure, thanks!
<anastasiamac_> dimitern: actually, looks kind of awesome \o/ reviewed.. m going afk now
<dimitern> anastasiamac_: \o/ :)
<voidspace> dimitern: ping
<voidspace> frobware: ping
<frobware> voidspace: pong, otp
<voidspace> frobware: have you deployed openstack (specifically I need nova compute) to KVM?
<frobware> voidspace: not. well, not since I last deployed openstack on arm. but that was with devstack.sh
<voidspace> frobware: does devstack.sh setup the KVM config for you?
<frobware> voidspace: nope
<frobware> voidspace: but then again, what do you mean by kvm config?
<voidspace> frobware: setup the required KVM instances with the correct network config for openstack to work - similar to your scripts
<voidspace> frobware: I imagine that in order to deploy openstack there are fairly specific requirements on what machines are available
<voidspace> frobware: dimitern has probably done this more
<frobware> voidspace: never tried it with/via juju
<voidspace> frobware: I have a bug that can be repro'd with a shutdown command to nova compute issued via a run-action
<dimitern> voidspace: sorry, was in a call until now, but I need to leave - bbiab
<macgreagoir> voidspace: I've done a bit of openstack on kvm, if i can help.
<voidspace> macgreagoir: great
<voidspace> macgreagoir: so how do I do it ;-)
<macgreagoir> :-)
<voidspace> macgreagoir: can you tell me what KVM setup I need and then how to do the deploy
<macgreagoir> The challenge may be having enough VMs to host the services. Is it HA?
<voidspace> macgreagoir: I'd like the simplest possible setup that gets me nova compute
<voidspace> macgreagoir: I can create the VMs if I know what configuration I need
<macgreagoir> I have some scripts to create a set of KVMs with two nics each, which would be OpenStack priv and pub nets.
<frobware> voidspace: are you looking at https://bugs.launchpad.net/juju/+bug/1555808
<mup> Bug #1555808: Cannot deploy a dense openstack bundle with native deploy <2.0> <2.0-count> <bundles> <cdo-qa> <ci> <deployer> <eda> <juju-release-support> <jujuqa> <maas-provider> <juju:Triaged by rharding> <https://launchpad.net/bugs/1555808>
<voidspace> frobware: no
<voidspace> frobware: https://bugs.launchpad.net/juju/+bug/1534103
<mup> Bug #1534103: "unknown operation kind run-action" (1.26alpha3) <2.0-count> <actions> <sts> <juju:In Progress by mfoord> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1534103>
<voidspace> macgreagoir: there's a guide here that suggests that four machines with containers should be enough
<voidspace> macgreagoir: https://www.hastexo.com/resources/hints-and-kinks/ubuntu-openstack-juju-4-nodes/
<voidspace> macgreagoir: but that assumes physical machines and doesn't detail the network config they need so I can't easily repro it with KVM
<macgreagoir> voidspace: Two should be enough. What storage is the bundle using, do you know?
<macgreagoir> As in ceph?
<voidspace> macgreagoir: I don't know what bundle *they're* using, I'd like to use whatever is the simplest approach
<macgreagoir> If you just want to test nova-compute, a simple storage should be fine, making the VMs easier.
<voidspace> macgreagoir: that sounds good
<macgreagoir> voidspace: This is a cleaned-up copy of my scripting: https://github.com/macgreagoir/maas-juju-sandbox
<macgreagoir> It assumes one maas server and three nodes, but a fourth should be OK.
<voidspace> macgreagoir: ok, great - that looks like it deploys MAAS to KVM (which I have but I don't mind starting from scratch)
<voidspace> macgreagoir: are you suggesting I repurpose that for openstack?
<macgreagoir> It sets up the two new virsh nets too.
<macgreagoir> I would think the nodes will be OK for an openstack test.
<voidspace> macgreagoir: ah, cool
<voidspace> macgreagoir: and how to do the openstack deploy to it, using simple storage as you suggest
<voidspace> macgreagoir: will I need to choose a bundle and configure it? (I've not done that before so any help you can give will be appreciated)
<macgreagoir> In Canonical reference deployments the openstack controller services would be in lxd containers. I haven't seen if that example does the same...
<voidspace> macgreagoir: ah, you mean follow the example for that part
<voidspace> macgreagoir: ok
<macgreagoir> voidspace: Let me see if there is a non-HA bundle. HA is what will kill your resources.
<macgreagoir> I have an old OpenStack example deployment somewhere too, but it's not Juju.
<voidspace> macgreagoir: I'll try the example and hassle you when I have problems
<voidspace> macgreagoir: thanks :-)
<macgreagoir> voidspace: nw, I'm just finishing a thing here, but I'll try to send you something more concrete in a few.
<voidspace> who knows anything about run-action?
<voidspace> fwereade: ping
<fwereade> voidspace, pong
<voidspace> fwereade: hey, you've left comments in the run action code - so you've probably at least looked at it...
<voidspace> fwereade: ok to ask a question (I'm back on the "unknown operation kind run-action" bug)
<fwereade> voidspace, in uniter? right, I probably still remember some of it ;p
<fwereade> voidspace, ofc
<voidspace> fwereade: yeah, uniter resolver
<voidspace> fwereade: so the specific error (unknown operation kind) is coming out of the resolver switch statement looking at "localState.Kind"
<voidspace> fwereade: if localState.Kind is RunAction then before it gets there it will have gone into the Actions resolver
<fwereade> voidspace, hmm -- so the actions resolver isn't handling it?
<voidspace> fwereade: ah, and if we have an operation.RunAction with a hook of nil then we will return resolver.ErrNoOperation
<voidspace> fwereade: so if the hook is nil but Kind is RunAction then we will drop into that switch statement which will barf
<voidspace> fwereade: so what does a Kind of RunAction with a Hook of nil mean? is it an error that we should flag in some other way
<voidspace> *however*, if that's happening I ought to see "run-action hook is nil" in the logs - which I don't
<fwereade> voidspace, surprised Hook is relevant -- isn't action-id the bit that we should be looking at?
<voidspace> if I can repro I can check
<voidspace> fwereade: worker/uniter/actions/resolver.go
<voidspace> fwereade: line 57
<fwereade> voidspace, ha
<voidspace> fwereade: should there be a return there?
<fwereade> voidspace, I think we need to return some sort of operation that marks the current action failed in the db and updates the local state to make sense again
<voidspace> fwereade: ok, any pointers on that?
<voidspace> fwereade: like what "makes sense again" means :-)
<voidspace> the operation factory doesn't have a failure operation it seems
<fwereade> voidspace, if you look at operation/runaction.go
<voidspace> there is a FailAction on the Callbacks interface
<fwereade> voidspace, the Commit step seems to have the right way to change local state once an action has completed...
<fwereade> voidspace, and, indeed, you (probably) want an operation that just calls FailAction and then restores local state in the same way we do on successful commit
 * dimitern is back
<fwereade> voidspace, we've done the action, success or failure is immaterial to what we do next
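The operation fwereade describes (mark the action failed, then restore local state as a successful commit would) might be sketched like this. Everything here is a simplified assumption for illustration: the `State`, `Kind`, `Callbacks`, `failAction`, and `recorder` types are stand-ins, not the real uniter `operation` package, and the exact state reset performed by the real Commit step is not shown in this log.

```go
package main

import "fmt"

// Kind is a simplified stand-in for the uniter's operation kind.
type Kind string

const (
	RunAction Kind = "run-action"
	Continue  Kind = "continue"
)

// State is a cut-down, hypothetical version of the uniter's local state.
type State struct {
	Kind     Kind
	ActionId *string
}

// Callbacks holds only the method mentioned in the discussion.
type Callbacks interface {
	FailAction(actionId, message string) error
}

// failAction is the hypothetical operation: it records the failure and
// then leaves local state as if the action had been committed.
type failAction struct {
	actionId  string
	callbacks Callbacks
}

// Execute fails the action remotely and returns the restored local state.
func (fa failAction) Execute(state State) (State, error) {
	if err := fa.callbacks.FailAction(fa.actionId, "action terminated"); err != nil {
		return state, fmt.Errorf("failing action %q: %w", fa.actionId, err)
	}
	// Success or failure of the action is immaterial to what happens next,
	// so reset the state the same way a successful commit is assumed to.
	state.Kind = Continue
	state.ActionId = nil
	return state, nil
}

// recorder is a test double that remembers which actions were failed.
type recorder struct {
	failed []string
}

func (r *recorder) FailAction(actionId, message string) error {
	r.failed = append(r.failed, actionId)
	return nil
}
```

The point of resetting `Kind` is that the resolver never again sees a pending run-action with nothing to run, which is the path that reaches the "unknown operation kind" switch.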
<dimitern> voidspace: you shouldn't need any kvm config - nova takes care of creating these based on the configured flavors
<dimitern> voidspace: I've installed openstack-base bundle with a few mods to the 4 NUCs I have here, and managed to start a guest via nova, after configuring OS as described in the bundle
<dimitern> my 5/6 blog posts are all about that
<voidspace> dimitern: cool, thanks
<fwereade> voidspace, making sense?
<dimitern> frobware: ping
<frobware> dimitern: pong
<voidspace> fwereade: so an operation that marks the action as failed and modifies localState
<dimitern> frobware: any thoughts on my last PR? I'm eager to land it, so we can then land yours (re bridge everything one)
<fwereade> voidspace, yeah, I think that's the right thing
<dimitern> frobware: http://reviews.vapour.ws/r/5598/
<frobware> dimitern: ah, let me take a look. a thousand different things.
<dimitern> frobware: I've come up with a really simple merge step for the follow-up
<dimitern> frobware: no more sorting, guessing, etc.
<voidspace> fwereade: ok on the high level, I might come and pester you for details
<dimitern> and once that's done, we're .. well perhaps 2 PRs away from fixing bug 1566791
<mup> Bug #1566791: VLANs on an unconfigured parent device error with "cannot set link-layer device addresses of machine "0": invalid address <2.0> <4010> <cpec> <network> <juju:In Progress by dimitern> <https://launchpad.net/bugs/1566791>
<fwereade> voidspace, any time :)
<voidspace> fwereade: I'm going to try and repro first to confirm this is the problem
<fwereade> sgtm
<voidspace> fwereade: or at least the code path, although I think it must be
<voidspace> fwereade: thanks
<fwereade> pleasure :)
<rick_h_> morning party people
<frankban> redir: hi, could you please take a look at https://github.com/juju/juju/pull/6181 ? thanks
<rick_h_> frankban: redir is a ways from being up. fwereade are you able to peek at ^ please?
<fwereade> rick_h_, ack
<frankban> fwereade: ty
<frankban> rick_h_: morning
<dimitern> welcome back rick_h_ ;)
<rick_h_> dimitern: ty :)
<frobware> dimitern: I reviewed http://reviews.vapour.ws/r/5598/
<dimitern> frobware: thanks! wow you've been thorough! :)
<frobware> dimitern: we need to get this right - it's complicated and partitioning the steps we require would be really helpful. (I know that some of this may come in a follow-up PR.)
<dimitern> frobware: yeah, the follow-up will clean a lot of the mess in networkingcommon
<rock_> Hi. I have a question. We have developed a JUJU Charm for configuring cinder to use one of our Storage array as the backend.   So How to redeploy the Charm to add more storage arrays to configure cinder without destroying/removing the current deployed charm. [For example, We don't want to remove the current configured storage arrays from the Cinder configuration.]
<rock_> Please anyone provide me some solution for this.
<voidspace> dimitern: ah, you deployed to NUCs - I want to deploy *to* KVM
<voidspace> dimitern: so yes I need some KVM config...
<mgz> rock_: you probably want to ask in the #juju channel, but generally you just use `juju set-config` and `juju upgrade-charm` to change a deployed charm's behaviour
<dimitern> voidspace: I've managed to do the same on 4 KVMs and force nova to use nested kvms - dead slow though
<voidspace> dimitern: I only want it running so I can shut it down again, don't need it to create any nodes for me
<voidspace> dimitern: I don't have a hardware setup that can do this
<voidspace> dimitern: shutting it down via an action should be enough to repro the bug
<rock_> mgz: OK. Thank you.
<frobware> voidspace: I'm confused by your KVM "config" - do you not just need machines available in your MAAS setup or is this about making those KVM nodes accessible/available to openstack?
<alexisb> babbageclunk, I am on the HO when you are ready
<babbageclunk> alexisb: Oh, sorry! I thought your email also applied to me - omw
<alexisb> :)
<voidspace> frobware: well, doesn't openstack have some requirements about what networks are available?
<voidspace> frobware: or if I just create six nodes with one network will that work
<voidspace> frobware: I suspect I need at least two networks for each node, however I don't know which is why I'm asking
<frobware> voidspace: perhaps this is why my OpenStack deploy failed a little earlier today
<voidspace> frobware: macgreagoir seems to think that two networks should be enough - so I can use your scripts to create the nodes
<frobware> dimitern: do you have a working (dense) openstack bundle?
<voidspace> frobware: and there's a guide here (from 2015) suggesting four nodes
<voidspace> frobware: https://www.hastexo.com/resources/hints-and-kinks/ubuntu-openstack-juju-4-nodes/
<macgreagoir> voidspace: Sorry I haven't got back to you yet. Also note which libvirt type nova-compute is configured to use. You'll want to not use kvm (on kvm). nova-lxd maybe?
<dimitern> frobware: how dense do you need? :)
<frobware> dimitern: I have one 24GB node.
<frobware> VM
<wallyworld> frankban: hey, looks like the latest rev still uses a separate APICallCloser? did you forget to push an update?
<frankban> wallyworld: no, it seems to me that reviewboard went crazy, see https://github.com/juju/juju/pull/6181/files
<frankban> wallyworld: I also applied some changes to the test file suggested by fwereade
<wallyworld> frankban: awesome, ty. sorry for noise. not sure what crack rb is on
<dimitern> frobware: 24G disk or ram?
<frobware> dimitern: RAM. disk is a solved problem. :)
<dimitern> frobware: :) well - i haven't tried all-in-one deployment
<dimitern> apart from I guess deploying maas on that node adding kvms and then deploying to them :D
<natefinch> rick_h_: btw, I'm not convinced that marking https://bugs.launchpad.net/juju-core/1.25/+bug/1610880 as invalid is correct. the explanation about CPC doesn't make sense with the symptoms stated.  We're failing to validate the certificate that Juju created.... this is a couple steps away from anything to do with CPC
<mup> Bug #1610880: Downloading container templates fails in manual environment <juju-core 1.25:Invalid> <https://launchpad.net/bugs/1610880>
<rick_h_> natefinch: ok, looks like might be a few things goin on there.
<rick_h_> natefinch: let's let dave/dan run with it for now and if it comes back up we'll know where to start
<natefinch> rick_h_: ok.  FWIW, I had a hellish time trying to figure out what was going on.  Lots of moving parts and unclear what the correct behavior was of several parts of the code... almost entirely due to the fact that manual is a special flower.
<natefinch> rick_h_: for a week it was 100% reproducible, and then all of a sudden, 100% not reproducible.  I stopped working on it because I wasn't getting anywhere, and didn't want to sink any more time into it without talking to you.
<babbageclunk> wallyworld: it turns out thumper was referring to owner data (although it would have been handy if he'd replied to my question to clarify!). :(
<rick_h_> natefinch: understand, we've got other stuff to run onto so we'll just keep an eye out for a repro
<mup> Bug #1474607 opened: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <ci> <go1.5> <go1.6> <regression> <windows> <juju:Fix Released by axwalk> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1474607>
<redir> morning
<redir> alexisb: yt?
<alexisb> redir, yep
<alexisb> wuz up?
 * rick_h_ goes for late lunchables
<babbageclunk> Anyone seen thumper?
<alexisb> babbageclunk, he has not arrived yet today
<alexisb> should be on shortly though
<babbageclunk> ok, thanks alexisb
 * perrito666 does some surgery in jujupy
<alexisb> babbageclunk, as promised :)
<alexisb> there is thumper
<alexisb> thumper who forgot to send mail to babbageclunk ;)
<alexisb> morning thumper
<thumper> oops
<thumper> my bad
<alexisb> yep
<natefinch> gahh.. I cannot for the life of me reproduce https://bugs.launchpad.net/bugs/1614635 even in the same environment where ahasenack says it happens 50% of the time.
<mup> Bug #1614635: Deploy sometimes fails behind a proxy <deploy> <landscape> <proxy> <juju:In Progress by natefinch> <https://launchpad.net/bugs/1614635>
<natefinch> wallyworld: how do I make bootstrap just use what's in streams?  I removed jujud locally and now I just get  ERROR cmd supercommand.go:458 failed to bootstrap model: cannot package bootstrap agent binary: no prepackaged agent available and no jujud binary can be found
 * thumper claps
<wallyworld> natefinch: that means there is no matching binary in streams for your client
<wallyworld> you will need to force it with agent-version
<wallyworld> but you then wear any compatibility issues
<wallyworld> between mismatched agent and client
<wallyworld> or you use --build-agent to build a local compatible binary
 * thumper needs an alias of 'got est' to be 'go test'
<thumper> wallyworld: are you doing this one? https://bugs.launchpad.net/juju-core/+bug/1458576
<thumper> wallyworld: as I'm going to be doing some more status stuff soon
<wallyworld> thumper: my plate is full this week, i wasn't going to
<wallyworld> can look next week
<thumper> I'll take it
<thumper> I'll be in there
<thumper> messing around
<wallyworld> ok
<voidspace> thumper: ping
<thumper> voidspace: hey, in release call
<thumper> need a review?
<voidspace> thumper: yeah, just checking you were aware
<thumper> yep, aware, just busy
<voidspace> thumper: np
<redir> wallyworld: got a minute to chat about the cloud/region stuff?
<wallyworld> redir: in meetings sadly, i will ping when free
<redir> np
<redir> anyone know where I can find the cloud type the current modelcmd is operating on?
<wallyworld> redir: i have a couple of minutes between meetings if you want to chat
<redir> sure thing
<wallyworld> 1:1
<redir> brt wallyworld
<alexisb> wallyworld, I have a few minutes if you are available
<alexisb> if not I will catch you tomorrow
<wallyworld> alexisb: ok, brt
<thumper> review up: http://reviews.vapour.ws/r/5622/
<axw> wallyworld: you'll appreciate this: https://twitter.com/MarkKriegsman/status/739664279083814912
<wallyworld> axw: indeed :-)
<perrito666> oh, we need to spray stencil juju changelog on a cow
<anastasiamac_> thumper: just to confirm: u r taking over this one, so i can re-assign to u? https://bugs.launchpad.net/juju/+bug/1455627
<mup> Bug #1455627: TestAgentConnectionDelaysShutdownWithPing fails <ci> <intermittent-failure> <lxc> <test-failure> <unit-tests> <windows> <juju:Triaged by dimitern> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1455627>
<redir> wallyworld: landing PR, let me know if there's anything I can do to start on changing modeldefaults to be a controller command
<thumper> looking at it
<wallyworld> redir: great. to make the change, the api methods need to move to the ModelManager facade, and the CLI command itself needs to embed BaseController not BaseModel
<wallyworld> redir: so essentially cut and paste and a bit of jiggery
<perrito666> the only nice thing about working in b&r is that I never ever get a conflict
<redir> whelp I'll see what I can do to start and leave it with you at EoD
<wallyworld> sure
<thumper> really would like this reviewed: http://reviews.vapour.ws/r/5622/
<redir> thumper: looking
<redir> thumper: LGTM, but you'll want another +1.
<thumper> wallyworld: care for another +1?
<wallyworld> oh if i must
<thumper> wallyworld: or alternatively, we graduate redir so we don't need to bother
 * thumper has lunch appt
<thumper> so off for a bit
 * thumper just proposes a bit of work
#juju-dev 2016-09-08
<thumper> wallyworld: this too https://github.com/juju/juju/pull/6186
<thumper> now lunch
<wallyworld> thumper-lunch: you can tell rb to pick it up using rbt post
<wallyworld> menn0: a whole lot of tests in api/controller/controller_test.go are no longer there with the various new migration tests being introduced. did you move them anywhere?
<menn0> wallyworld: yep.. see legacy_test.go in the same package
<menn0> legacy_test.go
<wallyworld> ah righto thanks
<menn0> weird ... the underscore doesn't show up for me :)
<menn0> wallyworld: the migration tests couldn't work any more when using JujuConnSuite so I set things up for moving these tests to using a mock API
<wallyworld> sgtm
<mup> Bug #1474607 changed: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <ci> <go1.5> <go1.6> <regression> <windows> <juju:Fix Released by axwalk> <juju-core:Invalid> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1474607>
<perrito666> you know where your day is going when you find yourself reading the code for mongo tools
<anastasiamac_> wallyworld: could u please answer user question on https://bugs.launchpad.net/juju/+bug/1620886? :D
<mup> Bug #1620886: juju status returns  ERROR connection shutdown <juju:Incomplete> <https://launchpad.net/bugs/1620886>
<wallyworld> don't have time right now
<wallyworld> maybe tomorrow
<wallyworld> need to get beta ready
<redir> what's this mean? http://juju-ci.vapour.ws:8080/job/github-merge-juju/9120/console at the end
<redir> the bot failed to merge but it tried to?
<redir> sinzui: ^^
<redir> if you're around
<anastasiamac_> axw: could u plz comment on the bug wallyworld cannot? ^^
<wallyworld> anastasiamac_: it will take analysis and investigation
<wallyworld> can it wait till tomorrow?
<wallyworld> we need to concentrate on the beta today
<anastasiamac_> wallyworld: did u read the bug? it's a simple question...
<wallyworld> "Is there a way i can bring the model back"
<wallyworld> that is a can of worms
<wallyworld> he started on beta12
<wallyworld> we made incompatible changes in 13/14
<wallyworld> he will likely need to hand edit yaml files
<anastasiamac_> i've read the bug. to me the answer is simple "no" he'll have to re-bootstrap. but u and andrew r more familiar with the area
<anastasiamac_> i'll assign to u \o/
<wallyworld> no thanks :-)
<wallyworld> don't have time right now
<anastasiamac_> who does?
<wallyworld> no one today
<wallyworld> beta takes priority
<sinzui> balloons: You need to follow up with http://juju-ci.vapour.ws:8080/job/github-merge-juju/9120/console the merge failed, but the tests passed and the job itself reports success.
<wallyworld> let's see who's free tomorrow
<sinzui> redir: have you tried to force your branch to merge?
<redir> sinzui: don't know how
<sinzui> redir: okay. I am taking a guess I will pretend to be jenkins on that host and run the failing command
 * redir checks wiki
<sinzui> redir: "Pull Request successfully merged"
<redir> thanks sinzui-bot
<sinzui> redir, balloons: There must have been a network hiccup. running the command again took a second to complete
<redir> sinzui: balloons I've noticed some slow GH action for the last 12h or so
<redir> wallyworld: I moved the api bits here: https://github.com/reedobrien/juju/tree/feature/move-model-defaults
<wallyworld> righto
<wallyworld> i'll look when i can
<redir> wallyworld: but haven't had time to make anything useful of moving the command bits yet
<redir> and I have to go RSN
<wallyworld> sure, np, i can pick it up
<wallyworld> ty
<redir> later juju-dev
<axw> wallyworld: please see replies on http://reviews.vapour.ws/r/5618/
<wallyworld> ok
<wallyworld> axw: all good, i was thinking the calling code would test for ErrExpired, but wasn't sure how it would be used
<axw> wallyworld: we check for NotFound atm, but not ErrExpired. it will make sense to do so when we have browser logins. because the client would pop up a browser and then go to the /auth/wait page straight away
<axw> wallyworld: it would find the interaction, but it *would* be expired if the user doesn't hit login inside 2 minutes
<wallyworld> yep
<natefinch> Whazzat?  2016-09-08 01:48:00 ERROR cmd supercommand.go:458 failed to initialize state: validating initialization args: mismatching uuid (2eb2f71a-00d1-43db-8575-2d4eacb2bf3f) and controller-uuid (4d5116b1-945a-46af-8a27-69e7269d2753) not valid
<natefinch> I saw some stuff go by on email/irc about UUIds... what's going on here?  I'm bootstrapping with a client from master, but trying to deploy beta17 .... is that broken, or did I do something wrong?
<anastasiamac_> menn0: is there anything in rsyslog on 1.25 that would have caused this? u/thumper were going to check but i have not seen an update... https://bugs.launchpad.net/juju-core/1.25/+bug/1616832
<mup> Bug #1616832: manual environment juju-db timeout <manual-provider> <juju:Incomplete> <juju-core:Won't Fix> <juju-core 1.25:Incomplete> <https://launchpad.net/bugs/1616832>
<axw> wallyworld: you ok with that branch landing then?
<wallyworld> yep
<wallyworld> natefinch: yep broken
<axw> thanks
<wallyworld> natefinch: that's why we strictly enforce client and agent version match
<natefinch> wallyworld: I thought we explicitly didn't do that, so you can use one client with multiple controllers that might be at different versions.... or is this just unique to bootstrap itself?
<wallyworld> just bootstrap
<wallyworld> the version check is just for bootstrap
<natefinch> Is this unique just to this one point in time, where pre-beta18 servers won't work with post beta18 and later clients, or is this a new policy in general?
<wallyworld> it happened between beta17 and 18
<natefinch> ok
<wallyworld> it's not a policy
<natefinch> ok
<wallyworld> but we don't guarantee client N can bootstrap agent N+1
<wallyworld> it works 99% of the time
<wallyworld> in betas, we are less strict about checking for such compatibility
<natefinch> yep
<menn0> anastasiamac_: NFI what's going on with that rsyslog issue
<menn0> anastasiamac_: digging some more
<anastasiamac_> menn0: thnx :)
<menn0> anastasiamac_: hmmm... I think I've found something
 * menn0 digs more
<thumper> very simple review: http://reviews.vapour.ws/r/5623/ only adds logging
<anastasiamac_> menn0: \o/
 * anastasiamac_ hopes menn0 found the reason for cpu/mem spikes on 1.25 and will solve all world problems!!
<menn0> anastasiamac_: I doubt this has anything to do with cpu/mem spikes :)
<anastasiamac_> menn0: i know but m mentioning to see who else is watching :D
<menn0> anastasiamac_: here's a helpful article: http://b.kl3in.com/2011/10/ubuntu-server-slowly-stops-responding/
<menn0> anastasiamac_: it seems that if syslogd/rsyslogd hangs and stops reading from /dev/log then the buffer fills up as processes keep writing out logs, and when that happens processes start hanging when they try to log
<menn0> anastasiamac_: i'm trying that out now
<anastasiamac_> menn0: this is so sad :(
<menn0> yeah, that's pretty awful
<anastasiamac_> menn0: definitely worth adding to the bug and marking as Invalid for juju... there is nothing we can really do on our side :(
<menn0> anastasiamac_: well there is... we can have something that checks if rsyslogd is wedged and restart it when it is
<menn0> anastasiamac_: there's a suggested script in the blog article
<menn0> anastasiamac_: it's also possible that juju generated an invalid rsyslog config (or cert) which was causing it not to be able to work
<anastasiamac_> menn0: both of which are gr8 suggestions and should b mentioned in the bug..
<menn0> anastasiamac_: just doing a bit more research and then I'll update the bug
<anastasiamac_> menn0: awesome \o/ if u could add how much effort would be involved in resolving these, it'll help power-that-be to plan.
<menn0> ok
<anastasiamac_> menn0: i'll also add an eda tag to it in hopes that QA may come up with functional test to reproduce it \o/
<axw> wallyworld: external user login support: http://reviews.vapour.ws/r/5624/
<wallyworld> axw: awesome, am about to propose a PR and then i'll look at the remaining PRs of yours
<axw> thanks
<mup> Bug #1557769 changed: private-address returns name, not ip, under 1.25.4 <Charm Helpers:In Progress by stub> <juju:Triaged> <juju-core:Invalid> <juju-core 1.25:Triaged> <cassandra (Juju Charms Collection):Fix Released by stub> <https://launchpad.net/bugs/1557769>
<mup> Bug #1577556 changed: unit failing to get unit-get private-address in the install hook <intermittent-failure> <network> <juju:Triaged> <juju-core:Invalid> <juju-core 1.25:Triaged> <ubuntu-openstack-ci:Triaged> <mysql (Juju Charms Collection):Fix Released> <https://launchpad.net/bugs/1577556>
<natefinch> wallyworld: I don't even seem to be able to juju status from current master client to beta17 server.  is that expected?
<wallyworld> yes
<natefinch> ok
<wallyworld> controller and controller model uuids are now different
<wallyworld> they used to be the same
<wallyworld> hence not uuids by definition
<natefinch> well, they are uuids... just not of the same thing :)
<wallyworld> natefinch: note, with tools etc, the bootstrap messages always tell you what has been done so there are no surprises
<natefinch> wallyworld: yes, but it's buried in there if you're using --debug
<wallyworld> that  is true, debug is a firehose
<thumper> axw: got a few minutes?
<axw> thumper: yeah?
<thumper> looking at a storage bug and I want to get ideas on how to test and if I'm doing it right
<thumper> 5 min hangout?
<axw> thumper: sure
<thumper> axw: https://hangouts.google.com/hangouts/_/canonical.com/storge?authuser=0
<wallyworld> axw: or thumper: or anyone, here's a PR I'd like to land today if anyone has time to review at some stage http://reviews.vapour.ws/r/5625/
<natefinch> wallyworld: I can look at it
<wallyworld> yay, ty
<natefinch> anastasiamac_ (or anyone else) - what should I mark a bug assigned to me that appears to have been fixed, though perhaps not intentionally, and not by me?  Fix committed?
<anastasiamac_> natefinch: of course \o/
<natefinch> anastasiamac_: ok :)
<anastasiamac_> natefinch:  it'll help if u knew PR that fixed it but otherwise, unless u r happy to field questions around the bug, keep it assigned to u
<natefinch> anastasiamac_: yeah, I have no idea. I could repro in beta 17 reliably, can't in master.  ¯\_(ツ)_/¯
<anastasiamac_> natefinch: best kind of bug :D
<anastasiamac_> natefinch: well done \o/
<axw> wallyworld: http://reviews.vapour.ws/r/5620/ has already been looked over by mhilton and ashipika, so I'll stick with just your review if that's ok?
<wallyworld> axw: sure, np, just making sure
<natefinch> wallyworld: lol, took me a while to figure out what hoak meant
<wallyworld> lol
<wallyworld> i can tweak that
<natefinch> er haok
<natefinch> wallyworld: I didn't have a better suggestion, so I was going to ignore it, but it was kinda funny.
<natefinch> wallyworld: one thing - I kinda wonder if we'll make people worry if they see 3/4 machines for HA, when really, that fourth machine is hasvote: false wantsvote: false
<natefinch> wallyworld: I don't know if it's really fixable, it's a lot of info to cram into a single column
<wallyworld> natefinch: wantsvote false is not supposed to show
<natefinch> wallyworld: oh, I misread the test I was looking at.. You're right.  Nevermind
<wallyworld> natefinch: the 2/3 format is the same as is used for the scale column in status
<wallyworld> i just made it up as a compact way to display
<wallyworld> this beta will get feedback on that
<wallyworld> and we'll tweak as needed
<natefinch> cool
<thumper> axw: testing our fix now on canonistack
<axw> cool
<natefinch> wallyworld: you have a review
<wallyworld> natefinch: awesome, ty
<wallyworld> natefinch: i had better fix any issues before you disappear
<anastasiamac_> menn0: and since today is an rsyslog, just found this pearl :) https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1318378
<mup> Bug #1318378: rsyslog starts dropping connections in large environments <logging> <rsyslog> <scalability> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318378>
<menn0> anastasiamac_: that one is probably fixable by tweaking rsyslogd's config or raising process limits
<menn0> anastasiamac_: as for the first one, I've demonstrated that if rsyslogd is stuck, anything logging to it will get stuck.
<menn0> anastasiamac_: this will be mongodb and in 1.25, all the agents in the model
<anastasiamac_> \o/
<anastasiamac_> these bugs are crying for ur comments ;)
<menn0> writing up now
<anastasiamac_> natefinch: if u r still here, thumper: i *think* this has been fixed coz we do not run mongod as sudo.. right? https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1208430
<mup> Bug #1208430: mongodb runs as root user <mongodb> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1208430>
<natefinch> anastasiamac_: no idea... but uh... if they can 0 day the database.... what else are we protecting?  The log files?  All the important data is *in* mongo :/
<natefinch> ok, I gotta run, I'm passing out here.  wallyworld - changes LGTM
<wallyworld> natefinch: awesome, tnas again
<wallyworld> thanks
<mup> Bug #1317909 changed: juju add-unit performance degrades in large environments <add-unit> <performance> <scalability> <sm15k> <juju:Triaged> <juju-core:Invalid> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1317909>
<mup> Bug #1318148 changed: Unit.PublicAddress shouldn't treat no machine as an error <landscape> <logging> <tech-debt> <ui> <juju:Triaged> <juju-core:Invalid> <https://launchpad.net/bugs/1318148>
<mup> Bug #1318378 changed: rsyslog starts dropping connections in large environments <logging> <rsyslog> <scalability> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318378>
<menn0> anastasiamac_: lots of detail added to bug 1616832
<mup> Bug #1616832: manual environment juju-db timeout <eda> <manual-provider> <juju:Incomplete> <juju-core:Won't Fix> <juju-core 1.25:Incomplete> <https://launchpad.net/bugs/1616832>
<menn0> anastasiamac_: it should probably be reopened against 2.0 b/c it could happen there too since mongod still logs to syslog in 2.0
<menn0> anastasiamac_: it's probably less likely because we demand a lot less of rsyslogd in 2.0 though
<wallyworld> axw: if you are free, here's a cut and paste review http://reviews.vapour.ws/r/5627/
<axw> ok
<mup> Bug #1318378 opened: rsyslog starts dropping connections in large environments <logging> <rsyslog> <scalability> <sm15k> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1318378>
<wallyworld> thanks axw
<axw> nps
<thumper> axw: http://reviews.vapour.ws/r/5628/
<axw> looking
<thumper> I'm off now
<thumper> axw: if you're happy, can you do the merge dance?
<thumper> laters
<axw> thumper: will do. later
<axw> wallyworld: I missed something in my branch when I QAd before. need to make a small change, will ping you for a review soon. change-user-password isn't storing a macaroon, so you need to log in again after changing your password
<wallyworld> sure np
<wallyworld> axw: i assume you'll update the release notes at EOD as well?
<axw> wallyworld: yup
<axw> wallyworld: http://reviews.vapour.ws/r/5620/diff/3-4/
<wallyworld> looking
<axw> wallyworld: I've just re-run through all the QA steps again, and also confirmed that "change-user-password && status" does not prompt you after the password is changed
<wallyworld> great
<wallyworld> axw: lgtm, good that you found it
<axw> wallyworld: thanks
<mup> Bug #1554436 changed: Juju adds any RFC1918 address it finds on any state servers to the apiaddresses list in agent.conf <canonical-bootstack> <network> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1554436>
<axw> wallyworld: there's a problem with "juju logout" which I was planning to defer, but maybe shouldn't. if you "juju logout", your cookie remains in the jar. then "juju login <user>" will use it again without you being prompted
<axw> wallyworld: fixing it is going to take a bit, because there's an issue with persistent-cookiejar
<wallyworld> axw: we can document as a known issue for this beta imo
<axw> wallyworld: ok
<babbageclunk> voidspace: ping
<wallyworld> axw: not sure if you have a minute for a trivial format fix http://reviews.vapour.ws/r/5630/
<wallyworld> babbageclunk: or maybe you could take a peek at the above for me? it's trivial, and needed for the beta
<babbageclunk> wallyworld: sure, looking now
<wallyworld> you rock ty
<babbageclunk> wallyworld: LGTM!
<wallyworld> babbageclunk: yay, ty. btw, it seems you finally got the right person from maas to talk to?
<wallyworld> jeez, what a saga
<babbageclunk> wallyworld: Yup - ripped out all my bodges and put in the right thing really easily, working on the provider changes now.
<wallyworld> babbageclunk: sigh, how frustrating. seems like only a couple of people knew this secret knowledge
<babbageclunk> wallyworld: Well, it was there in the docs, but I never searched with a keyword that matched it.
<wallyworld> well, i didn't know the magic word either
<babbageclunk> wallyworld: It's good though - the hacked version would've been nasty.
<wallyworld> indeed, but it's all we thought we had :-)
<frankban> axw: could you please take a look at http://reviews.vapour.ws/r/5631/ ?
<axw> frankban: looking
<frankban> axw: ty
<axw> frankban: LGTM, thanks
<frankban> axw: cool thanks
<voidspace> babbageclunk: pong, sorry
<babbageclunk> dimitern: around?
<dimitern> babbageclunk: yeah?
<babbageclunk> dimitern: I'm trying to test something on AWS, but I lost my credentials when my machine crashed - actually, might have just found an email, hangon.
<babbageclunk> dimitern: Remind me of the url for the AWS console?
<dimitern> babbageclunk: I remember sending the shared AWS account creds by mail a while ago, but might have missed you
<dimitern> babbageclunk: pm-ed you the link
<babbageclunk> dimitern: awesome, thanks
<babbageclunk> voidspace, dimitern, frobware: Could someone take a look at this PR? https://github.com/juju/gomaasapi/pull/57
<voidspace> babbageclunk: looking
<babbageclunk> voidspace: thx!
<voidspace> babbageclunk: so if args.OwnerData is nil then the iteration in ownerDataMatches just drops through to true
<voidspace> babbageclunk: it seems a bit weird to call the filtering function even when there's no filtering
<voidspace> babbageclunk: it's not wrong, just feels like the code doesn't match the user intent
<voidspace> babbageclunk: only a slight niggle
<voidspace> boiler repair man here, have to let him in
<babbageclunk> voidspace: I think it does match the intent, though - a nil filter should match any owner data, right?
<voidspace> babbageclunk: to me a nil filter means "don't bother checking"
<babbageclunk> voidspace: I prefer not to have a different control flow (like an early exit) if the normal path through the block does the right thing.
<babbageclunk> voidspace: But I don't feel super-strongly about it in this case. I can add `args.OwnerData != nil &&` on the start.
<voidspace> babbageclunk: I don't feel strongly, it's a matter of taste and it's your code
<voidspace> babbageclunk: so fair enough :-)
<babbageclunk> voidspace: ok cool - I tried it and I don't like it. :)
<voidspace> hah
<dimitern> babbageclunk: sorry otp - looking though
<babbageclunk> dimitern: no rush - I'm not blocked on it.
<babbageclunk> dimitern: thanks though!
<babbageclunk> dimitern: Actually, I just realised I can't make myself a new access key for AWS - can you make me a new one and invalidate the old one? Or do I need jam or someone else?
<voidspace> babbageclunk: you don't explicitly test the nil case in new tests, is that because it's covered by the existing ones?
<babbageclunk> voidspace: Yeah - if it didn't work then they wouldn't get any machines back.
<voidspace> babbageclunk: cool, LGTM
 * babbageclunk successkid
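The nil-filter behaviour voidspace and babbageclunk discuss above can be illustrated with a small sketch. The function name `ownerDataMatches` comes from the conversation, but the signature and map types below are assumptions, not the actual gomaasapi code.

```go
package main

import "fmt"

// ownerDataMatches reports whether every key/value pair in filter is
// present in ownerData. Ranging over a nil map runs zero iterations,
// so a nil filter falls straight through to true and matches anything.
func ownerDataMatches(ownerData, filter map[string]string) bool {
	for key, want := range filter {
		got, ok := ownerData[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	machine := map[string]string{"owner": "juju", "model": "default"}
	fmt.Println(ownerDataMatches(machine, nil))
	fmt.Println(ownerDataMatches(machine, map[string]string{"owner": "juju"}))
	fmt.Println(ownerDataMatches(machine, map[string]string{"owner": "other"}))
}
```

This is the "no early exit" style babbageclunk prefers: the normal path already handles the nil case, so a separate `filter != nil` guard would change the control flow without changing the result.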
<dimitern> babbageclunk: sure, just give me a few minutes
<dimitern> babbageclunk: sent you new kets
<dimitern> keys
<voidspace> gah
<voidspace> juju deploy stuck in allocating/pending and nothing in the debug-log and can't ssh
<voidspace> need moar logging
<rick_h_> voidspace: is this with tip?
<rick_h_> voidspace: shooting you an email please see if this is what you're seeing
<voidspace> rick_h_: it  is, but it's also with a custom charm on lxd to test an interrupted sleep action
<voidspace> rick_h_: ah, does look a bit like it yes
<voidspace> rick_h_: I have an ipv6 address! but I can ssh into that
<rick_h_> fwereade: ping
<voidspace> and no juju logs on the system
<rick_h_> fwereade: important email your way please
<rick_h_> voidspace: so looking more closely at things, that email is only when HA is turned on close to when a deploy is in progress.
<rick_h_> voidspace: just testing a bootstrap/deploy without HA isn't an issue and I'm assuming you're not in HA for your testing
<dimitern> babbageclunk: reviewed
<rick_h_> voidspace: so thinking there's different things goin on between your setup and the email thread there
<babbageclunk> dimitern: thanks!
<babbageclunk> frobware: Sorry, just realised I merged that without waiting for an all-clear from you - were you alright with that change other than the comment you made?
<rick_h_> voidspace: ok, betting that marcoceppi's upcoming email to the juju list is your issue.
<rick_h_> voidspace: try http://paste.ubuntu.com/23147632/ and see if that unblocks you there
<rick_h_> voidspace: heh and as bac points out the commands there are moved in trunk so need to use set-model-config as model-config is an alias for get-model-config
<babbageclunk> dimitern: How can I get values into Config.ResourceTags() in a test?
<dimitern> babbageclunk: what's Config.ResourceTags()?
<babbageclunk> dimitern: It's a set of user-defined tags that get applied to instances when they're started by the provider.
<dimitern> babbageclunk: ah, so the juju one, not the maas one
<dimitern> babbageclunk: let me have a look
<babbageclunk> dimitern: Yeah - I'm hooking it up to maas owner data now.
<babbageclunk> dimitern: Ah, think I've found the answer in config.go: "Config holds an immutable environment configuration."
<babbageclunk> dimitern: I can't. Alright, I'll create it with them instead.
<dimitern> babbageclunk: yeah :) also config with "resource-tags" key is used to hold them
<dimitern> rick_h_: hey
<dimitern> rick_h_: I've noticed no HO link on the sync call
<rick_h_> dimitern: oh my bad
<dimitern> rick_h_: and I'm sitting in the standup one
<rick_h_> dimitern: k, I'll meet you there
<babbageclunk> dimitern: Thanks!
<babbageclunk> anyone seen alexisb this morning?
<alexisb_> babbageclunk, I just sat down
<alexisb_> omw
<rick_h_> natefinch: ping for standup
<dimitern> jam, fwereade: ^^
<perrito666> morning all
<dimitern> perrito666: o/
<natefinch> ahasenack: btw, I couldn't repro https://bugs.launchpad.net/juju/+bug/1614635 with current master.  I could repro easily with beta17, so hopefully it just got fixed by something else.
<mup> Bug #1614635: Deploy sometimes fails behind a proxy <deploy> <landscape> <proxy> <juju:Fix Committed by natefinch> <https://launchpad.net/bugs/1614635>
<ahasenack> natefinch: got it
<ahasenack> thx
<voidspace> rick_h_: hah, just seen your IRC message telling me to try model-config instead of set-model-config :-)
<rick_h_> voidspace: :)
<voidspace> rick_h_: so, thanks...
<redir> morning juju-dev
<alexisb_> morning redir
<redir> is there a qa list?
<mgz> redir: yes, juju-qa@lists.canonical.com
<mgz> you can always send things to it and can be manually moderated through, or you can join the fun
<redir> thanks mgz
<redir> I certainly don't get enough email as it is
<rick_h_> dimitern: frobware ping, CI is seeing this failure: https://bugs.launchpad.net/juju/+bug/1621538 can we confirm that this is part of ongoing issues/work
<mup> Bug #1621538: container networking: cannot juju ssh to container <ci> <maas-provider> <netowork> <regression> <juju:Triaged by dimitern> <https://launchpad.net/bugs/1621538>
<rick_h_> dimitern: frobware and this started yesterday
<rick_h_> dimitern: frobware it's blocking a beta18 as containers aren't working on maas
<dimitern> rick_h_: looking
<dimitern> 2016-09-08 13:55:23 ERROR juju.worker.proxyupdater proxyupdater.go:160 can't connect to the local LXD server: LXD socket not found; is LXD installed
<dimitern> but yeah, apart from that - the root cause is: WARNING juju.provisioner lxd-broker.go:62 failed to prepare container "0/lxd/0" network config: host machine device "br-eth0" has no addresses
<redir> alexisb: WRT https://goo.gl/yqrrPI what commands is the application config collapsing/replacing?
<rick_h_> dimitern: ok, this is blocking beta18 and we need to get it turned around asap if we're going to try to release tomorrow
<alexisb> redir,one sec
<dimitern> rick_h_: we have 3 options - 1) ignore this until we land the rest of the PRs fixing bug 1566791, 2) temporarily disable functional-container-networking job (or make it non-voting), 3) revert frobware's bridge all PR and merge it back after beta18
<mup> Bug #1566791: VLANs on an unconfigured parent device error with "cannot set link-layer device addresses of machine "0": invalid address <2.0> <4010> <cpec> <network> <juju:In Progress by dimitern> <https://launchpad.net/bugs/1566791>
<frobware> dimitern: is this related to the bridge all the world?
<dimitern> frobware: yeah :/
<dimitern> and only because of the way CI is configured
<frobware> dimitern: I ran into issues too
<rick_h_> dimitern: frobware can you all join https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1
<dimitern> not with authuser=1 I can't :)
<rick_h_> dimitern: :P feel free to url edit
<alexisb> ok redir, you have time for a HO?
<perrito666> Bbl lunch
 * rick_h_ goes for lunchables and such
<redir> alexisb: yes
<redir> if you still want to
<alexisb> only if you need it redir
<redir> alexisb: I am on track I think, thanks
<redir> alexisb: at least for the next 10 minutes:)
<babbageclunk> Is anyone else having problems adding machines with trunk? Mine just sit in pending.
<rick_h_> babbageclunk: there's a message on the mailing list around it atm
<babbageclunk> thanks rick_h_ - just saw that thread.
<rick_h_> babbageclunk: https://lists.ubuntu.com/archives/juju/2016-September/007845.html
<alexisb> babbageclunk, you still around?
<babbageclunk> alexisb: yup yup
<alexisb> babbageclunk meet hml
<alexisb> she is helping us out on some api updates for openstack
<alexisb> and had some qs around goose
<babbageclunk> hi hml
<alexisb> hml, babbageclunk may know given he just did some updates w/ goose
<babbageclunk> uh, do I know anything about goose?
<babbageclunk> I've been making changes to gomaasapi
<alexisb> babbageclunk, ah that is right, my bad
<alexisb> hml, ask anyways, we may get lucky :)
<hml> hi babbageclunk
<babbageclunk> I mean, I'll do my best!
<hml> babbageclunk: is there a strategy for version number changes? i'm looking at upgrading goose to use some newer api - like neutron.
<rick_h_> babbageclunk: hml we've done that in the past with versioning the branch that the library is on so that you can rev the api with the branch and the dep pulls that branch
<rick_h_> babbageclunk: hml see https://github.com/juju/charm for an example for a versioned lib like this
<hml> thanks rick_h_
<babbageclunk> thanks rick_h_ :)
<babbageclunk> hey marcoceppi, are you the right person to ask for a review of https://code.launchpad.net/~2-xtian/charm-helpers/application-version-set/+merge/300183?
<redir> back
<marcoceppi> babbageclunk: sure, but you've got a competing implementation
<rick_h_> fight fight!
<babbageclunk> marcoceppi: doh!
<rick_h_> I've got $5 on babbageclunk
<marcoceppi> maybe not
<alexisb> I will make it 50 on babbageclunk ;)
<rick_h_> ooooh, the big players are dropping in
<babbageclunk> (psst, marcoceppi, we could make a little moolah off these rubes.)
<marcoceppi> well, you're up against lazypower ;)
<marcoceppi> babbageclunk: you lost
<marcoceppi> https://code.launchpad.net/~lazypower/charm-helpers/add-workload-version/+merge/305062
<babbageclunk> It got merged yesterday! I totally should have chased a while ago.
<marcoceppi> babbageclunk: and it had tests ;)
<marcoceppi> rick_h_ alexisb pay up
 * marcoceppi does an arbitrary release of charm-helpers
<alexisb> darn it babbageclunk!
<alexisb> I was counting on you ;)
<perrito666> marcoceppi: 2.becauseIsaso.1 ?
<rick_h_> marcoceppi: hah, owe you a beverage next week then
<rick_h_> damn, didn't know I'd be up against lazypower.
<rick_h_> never do that
<marcoceppi> it was a close one
<marcoceppi> 0.9.0 of charm-helpers is in pypi now, charm build will use it going forward and all charms will have access to it
<marcoceppi> babbageclunk: thanks for contributing though!
<babbageclunk> marcoceppi:  :) ah well, maybe next time!
<babbageclunk> cool cool - I just deployed postgres and noticed that the version column wasn't filled in. I'll do a PR for using it in the charm.
<marcoceppi> babbageclunk: yeah, we've been updating all of ours
<babbageclunk> marcoceppi: nice
<babbageclunk> morning menn0 - how's KiwiPyCon?
<menn0> babbageclunk: hey hey... NFI, I'm not going until tomorrow. already had plans for tonight.
<menn0> babbageclunk: how's things?
<babbageclunk> menn0: pretty good!
<babbageclunk> getting pretty stressed about the move
<menn0> babbageclunk: yeah, it's not an easy thing, but it'll be done with soon enough
<babbageclunk> menn0: looking forward to hanging out with the Tartleys on the way though.
<menn0> babbageclunk: totally! how long are you with then?
<menn0> them
<babbageclunk> menn0: I think 4 or 5 days, should be cool
<menn0> babbageclunk: that'll be awesome. z will be so different now I imagine.
<alexisb> perrito666, ping
<perrito666> alexisb: pong
<alexisb> heya perrito666, what is going on with your current PR
<alexisb> https://github.com/juju/juju/pull/6165
<perrito666> alexisb: I forgot to push a missing file and am Just returning home (where the file is in my computer) right now working on the laptop
<perrito666> apologies
<alexisb> perrito666, ok, lets please get that landed
<perrito666> aie
<mup> Bug #1317896 changed: juju-restore requires mongodb-clients <backup-restore> <tech-debt> <juju-core:Fix Released> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1317896>
<perrito666> alexisb: merged
<alexisb> perrito666, thanks
<voidspace> babbageclunk: hey, you there
<menn0> axw or perrito666: https://github.com/juju/juju/pull/6198 pls
<menn0> That's the result of me running go vet manually and realising we were missing stuff
<perrito666> menn0: ship it
<menn0> perrito666: thans
<menn0> thanks
<perrito666> menn0: tx for doing that
<mup> Bug #1621658 opened: juju attach does not like to be canceled <juju-core:New> <https://launchpad.net/bugs/1621658>
<babbageclunk> voidspace: wasn't but am now
<babbageclunk> menn0: Mind doing a super-quick review? https://github.com/juju/gomaasapi/pull/58
<menn0> babbageclunk: will do, almost done with standup
<babbageclunk> or voidspace can ^^
<babbageclunk> menn0: Oh yeah - I should have popped in.
<menn0> babbageclunk: isn't it like crazy late for you?
<babbageclunk> :(
<babbageclunk> yeah, wanted to get something finished off, got a bit carried away.
<menn0> babbageclunk: I get it :)
<menn0> babbageclunk: LGTM on that change (although the correct change is really to fix the MAAS API - dashes are clearly superior to underscores :-D)
<babbageclunk> menn0: I know, right?
#juju-dev 2016-09-09
 * redir wonders where â and â fall in that spectrum...
<redir> they look so similar
<redir> and how about â ?
<alexisb> babbageclunk, dude!  it is time for sleeping
 * alexisb goes to pick up kiddo talk to you all later
<babbageclunk> alexisb: you are right! I'm all done, except can't do a final test because of some weird dns issue.
 * babbageclunk goes to bed!
<wallyworld> axw: i'd like to get this landed for beta if possible, it's a small tweak to list controllers http://reviews.vapour.ws/r/5635/
<axw> looking
<axw> wallyworld: reviewed
<wallyworld> ta
<wallyworld> axw: 1:1?
<natefinch> I hate it when I fix a simple bug and it reveals a huge gnarly bug behind that one.
<anastasiamac_> natefinch: only one gnarly behind? consider it ur lucky day :D
<natefinch> well.. it's one gnarly bug because of two pieces of gnarly code working at cross purposes... both assuming too much about what the other one will do or not do
<natefinch> axw: got a minute?
<natefinch> axw: I'm going to ask you about code you wrote three years ago ;)
<natefinch> start dusting off the tape drive :)
<axw> natefinch: sorry, missed your message
<axw> natefinch: what's up?
<natefinch> axw: heh, no problem. I've actually engineered a fix around it, but we can talk a little...
<natefinch> axw: juju help-tool
<natefinch> axw: https://github.com/juju/juju/blame/master/cmd/juju/commands/helptool.go#L123
<natefinch> axw: we make a dummy hook context and then pass that into all the functions and hope they don't look too closely at it :p
<axw> let me guess, something is looking at it? :)
<natefinch> axw: yes :)
<natefinch> axw: but they don't *really* need to, at least not so early, and since all this code does is call info... I just deferred the code that looks at the value until we actually run the command.
<axw> natefinch: ok. perhaps the interface for those commands should be changed, so that Run is the thing that takes the hook context?
<natefinch> axw: that's a good general solution.  I don't think anything should really need to use the hook context on command creation, and "context" is really more of a "while running" concept anyway (which is why we pass in the command context there)
<natefinch> axw: I made the minor change to get this one to work.  I think in the future it would be good to make the fix you suggested, but that seems like a bigger change than we really have time for right now.
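A minimal sketch of the deferral natefinch describes above, assuming a cut-down command shape: the command keeps the hook context it is given but does not look at it until Run, so a dummy context is enough for help output. HookContext, dummyContext, realContext and resourceCmd are hypothetical names for illustration, not the actual juju types.

```go
package main

import (
	"errors"
	"fmt"
)

// HookContext is a hypothetical, cut-down stand-in for a hook context.
type HookContext interface {
	ResourceName() (string, error)
}

// dummyContext plays the role of the placeholder context used for help output.
type dummyContext struct{}

func (dummyContext) ResourceName() (string, error) {
	return "", errors.New("dummy context has no resource")
}

// realContext stands in for a context that exists while a hook is running.
type realContext struct{ name string }

func (c realContext) ResourceName() (string, error) { return c.name, nil }

// resourceCmd stores the context at creation but does not inspect it.
type resourceCmd struct{ ctx HookContext }

func newResourceCmd(ctx HookContext) *resourceCmd { return &resourceCmd{ctx: ctx} }

// Info only returns static help text, so it is safe with a dummy context.
func (c *resourceCmd) Info() string { return "resource-get: fetch a resource" }

// Run is the first place the context is actually consulted.
func (c *resourceCmd) Run() (string, error) {
	name, err := c.ctx.ResourceName()
	if err != nil {
		return "", err
	}
	return "resource:" + name, nil
}

func main() {
	fmt.Println(newResourceCmd(dummyContext{}).Info())
	out, err := newResourceCmd(realContext{name: "db-dump"}).Run()
	fmt.Println(out, err)
}
```

The broader change axw suggests would pass the hook context to Run itself, rather than storing it on the command at creation.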
<natefinch> axw: http://reviews.vapour.ws/r/5636/diff/#
<natefinch> or if anyone else wants an easy review ^ (+26 -12)
<axw> natefinch: reviewed, please add QA steps tho
<axw> (code LGTM)
<natefinch> oops, right, updated
<axw> natefinch: do you know if we have a CI test that uses resource-get?
<natefinch> axw: not sure, I can check
<natefinch> axw: yeah, there is
<axw> natefinch: cool, thanks
<wallyworld> ffs, landing bot is down :-(
<axw> wallyworld: can you please check if you can see the data I just shared on drive?
<wallyworld> axw: yep, am looking at it now. i think it would be good to extract the 3 interesting numbers for each run and tabulate
<axw> wallyworld: indeed, I need to prepare something for my talk later tho :/  I've responded with rough figures that are most interesting, will see what else I can do
<wallyworld> axw: i can pull out some numbers if i get time a bit later
<wallyworld> axw: ah i see the next email; i think that's ok for now, it gives the key data point
<wallyworld> axw: only other data point would be what it was before your changes
<axw> wallyworld: yeah, but I figured it's not that useful to know how bad the really bad implementation is :p
<axw> I'd prefer to focus on how to get it better
<wallyworld> fair enough
<menn0> axw or wallyworld: could I get a review of this please? http://reviews.vapour.ws/r/5629/
<menn0> it's fully QAed now
<wallyworld> ok
<wallyworld> menn0: except it appears landing bot is down :-(
<menn0> wallyworld: ah... oh well
<menn0> wallyworld: the review would be good to get done anyway
<wallyworld> sigh
<wallyworld> yep
 * menn0 looks at the bot
<wallyworld> menn0: oh, it's back
<menn0> wallyworld: ok cool
<wallyworld> got a server error before
<menn0> wallyworld: TestCertificateUpdateWorkerUpdatesCertificate is failing really often
<wallyworld> awesome
<wallyworld> it has been ok for a while, was failing a bit a while back
<natefinch> menn0: how is it failing?
<menn0> natefinch: http://juju-ci.vapour.ws/job/github-merge-juju/9153/artifact/artifacts/windows-out.log
<menn0> natefinch: the test thinks the certificate isn't being updated
<menn0> natefinch: it's happened to me twice in a row for a PR that changes a very unrelated shell script
<wallyworld> it should even be running on windows
<wallyworld> shouldn't
<menn0> it failed for me yesterday a few times too
<wallyworld> anything controller related does not need to run on windows
<menn0> wallyworld: agreed, but this doesn't seem like a windows specific problem
<menn0> i.e. maybe we're just lucky that it's not failing more often on linux
<natefinch> [LOG] 0:05.803 DEBUG juju.worker.certupdater addresses haven't really changed since last updated cert
<natefinch> I was looking into this cert updater stuff a lot for a manual provider bug I was looking into, so I'm fairly familiar with the code.   There's a watcher here that's getting fired, but we think nothing has changed. If nothing has changed, why is the watcher firing?
<menn0> natefinch: could be an update to the doc which just set the same values? that would still cause the watcher to fire.
<natefinch> yeah, true.  Still, seems suspicious that we're getting that "nothing changed" firing and then the test failed because nothing changed.
<menn0> natefinch: true
<menn0> natefinch: I just dug through the code. the watcher is tracking the machine document so any change to the machine will fire it. not all those changes will be address updates.
<natefinch> oh right, that makes sense
<menn0> natefinch: so that message could be completely normal
<menn0> or it's a clue :)
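A rough sketch of why the "addresses haven't really changed" message can be normal, assuming the worker is notified on every machine change and only regenerates the certificate when the address set differs. certUpdater, handle and sameAddresses are hypothetical names, not the real certupdater code.

```go
package main

import (
	"fmt"
	"sort"
)

// sameAddresses reports whether two address lists hold the same values,
// ignoring order. It sorts copies so the callers' slices are left alone.
func sameAddresses(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

// certUpdater is a hypothetical worker that remembers the last address set.
type certUpdater struct {
	addresses []string
	updates   int
}

// handle is called for every watcher event. It returns true only when the
// addresses differ from the last ones seen, which is when a new certificate
// would be generated.
func (u *certUpdater) handle(current []string) bool {
	if sameAddresses(u.addresses, current) {
		return false
	}
	u.addresses = append([]string(nil), current...)
	u.updates++
	return true
}

func main() {
	u := &certUpdater{}
	fmt.Println(u.handle([]string{"10.0.0.1"}))             // first addresses seen
	fmt.Println(u.handle([]string{"10.0.0.1"}))             // unrelated machine change
	fmt.Println(u.handle([]string{"10.0.0.1", "10.0.0.2"})) // real address change
}
```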
<natefinch> is the landing bot working, or not?
<wallyworld> menn0: lgtm
<wallyworld> natefinch: it is again now
<menn0> wallyworld: thanks
<wallyworld> menn0: i think i found the pinger issue. haven't tested yet, just looking at code
<wallyworld> but i don't fully understand it either
<menn0> wallyworld: for the cert updater?
<wallyworld> menn0: in apiserver/admin.go there's a call to startPingerIfAgent()
<wallyworld> but it is inside an if{}
<wallyworld> and i think nothing ever calls it
<wallyworld> that's the only place i can see where the pinger would be started
<wallyworld> i'll add some logging and test a bit
<menn0> wallyworld: I believe the if is there to avoid unnecessarily starting pingers for controller machine agents which log in on behalf of a model.
<menn0> the thinking is there's no need to run the pinger if the client is on the same machine
<wallyworld> ok. i'll confirm if we are starting the pinger or not and go from there
<menn0> sounds good
<wallyworld> fwereade: you around?
<fwereade> wallyworld, yeah, sorry
<mup> Bug #1493058 changed: ensure-availability fails on GCE <docteam> <gce-provider> <ha> <jujuqa> <juju:Triaged> <juju-core:Invalid> <juju-core 1.24:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1493058>
<anastasiamac_> wallyworld: u've kindly filed this bug... but I *think* this has now been fixed as of last week. right? :D
<anastasiamac_> https://bugs.launchpad.net/juju/+bug/1559701
<mup> Bug #1559701: kill-controller manual provider broken <2.0-count> <manual-provider> <manual-story> <juju:Triaged> <https://launchpad.net/bugs/1559701>
<anastasiamac_> axw: do u know? ^^
<axw> anastasiamac_: that's fixed. destroy-controller auto destroys manual machines now
<anastasiamac_> axw: \o/
<dimitern> mgz: hey there - about?
<voidspace> babbageclunk: ping
<babbageclunk> voidspace: pong
<mgz> dimitern: hey
<dimitern> mgz: I'm looking into bug 1621538
<mup> Bug #1621538: container networking: cannot juju ssh to container <ci> <maas-provider> <netowork> <regression> <juju:In Progress by dimitern> <https://launchpad.net/bugs/1621538>
<dimitern> mgz: it seems the error is different
<dimitern> mgz: not about parent br-eth0 not having an address but something even weirder - node with that [device] system_id already exists
<mgz> dimitern: there are only two changes in the regression window
<mgz> pr #6156 and pr #6158
<dimitern> mgz: still digging in both maas and juju sources to see how that duplicate error can even happen - we're namespacing the hostnames(instance ids) we generate for containers and their backing maas devices
<babbageclunk> dimitern: dhcp won't start on my maas :( complaining about a missing conf file
<voidspace> fwereade: ping
<fwereade> voidspace, pong
<dimitern> babbageclunk: is it inside LXD ?
<voidspace> fwereade: so I'm trying to build an operation that "marks the action as failed and restores local state to a sensible state"
<babbageclunk> dimitern: I think it's because the drive filled up - I made the fs bigger. Any way I can get it to rewrite the config file?
<babbageclunk> dimitern: no, kvm
<fwereade> voidspace, ah yes
<voidspace> fwereade: afaics marking an action as failed means calling "state.State.ActionFinished"
<dimitern> babbageclunk: I'd try dpkg-reconfigure maas-dhcp first
<voidspace> fwereade: and nothing in the uniter/actions/resolver has access to that
<voidspace> fwereade: hmm... op_callback has some code that does that
<fwereade> voidspace, I thought I saw a method on callbacks that did that
<voidspace> fwereade: right
<fwereade> voidspace, so you should be able to supply that capability in the op factory
<babbageclunk> dimitern: just tried that, no dice.
<babbageclunk> dimitern: I guess it's reinstall time!
<voidspace> fwereade: right, sounds good to me :-) thanks
<dimitern> babbageclunk: what's the error you're getting?
<fwereade> voidspace, and the "sensible state" would be "that achieved by running the same code as in RunAction.Commit", I think
<babbageclunk> dimitern: http://paste.ubuntu.com/23154038/
<voidspace> fwereade: yeah, that I'm fine with (well - until I actually get to doing it...)
<babbageclunk> dimitern: And it's right - that file's not there.
<fwereade> voidspace, cool :)
<dimitern> babbageclunk: ok, before reinstalling, try also dpkg-reconfigure maas-rack-controller
<babbageclunk> dimitern: ooh yeah, that looks better!
<babbageclunk> dimitern: didn't seem to help - I'll try bouncing the machine just in case
<dimitern> babbageclunk: yeah - and also check the permissions on /var/lib/maas/*
<babbageclunk> dimitern: Hold the phone, actually I think that did it! Yay thanks!
<dimitern> babbageclunk: \o/ :)
<mup> Bug # changed: 1543660, 1546805, 1587644, 1592887, 1621658
<mup> Bug # opened: 1543660, 1546805, 1587644, 1592887, 1621658
<mup> Bug # changed: 1543660, 1546805, 1547806, 1587644, 1592887, 1621658
<mup> Bug #1547806 opened: open-port does not work on EC2 <juju:Triaged> <https://launchpad.net/bugs/1547806>
<rick_h_> dimitern: morning, how goes the battle with the functional container network tests?
<dimitern> rick_h_: morning, still trying to repro the new error
<dimitern> rick_h_: not related to missing addresses on host bridges
<rick_h_> dimitern: rgr
<dimitern> rick_h_: it seems maas 2.0 specific so far
<rick_h_> dimitern: they just upgraded to maas 2.0 final in CI yesterday before this run
<dimitern> rick_h_: I thought at first that job runs on maas 1.9 and wasted some time trying to find the error in lp:maas/1.9
<dimitern> rick_h_: good that I checked the job's jenkins config
<rick_h_> dimitern: it came up yesterday that it was on 2.0rc3 I think and was upgraded to 2.0 final
<dimitern> rick_h_: ok, that's useful - I'll concentrate on changes since 2.0rc3
<mup> Bug #1547806 changed: open-port does not work on EC2 <juju:Triaged> <https://launchpad.net/bugs/1547806>
 * rick_h_ goes to get the boy to school and make a coffee run, biab
<fwereade> babbageclunk, http://reviews.vapour.ws/r/5637/ is up if you have a moment; ask many questions if required :)
<frobware> dimitern: ping
<dimitern> frobware: pong
<frobware> dimitern: any joy with the container failure?
<dimitern> frobware: so far it seems something broke after the maas 2.0rc3 on finfolk got upgraded to 2.0 final
<frobware> dimitern: the maas update that we did with mgz yesterday?
<frobware> dimitern: (I thought that was 1.9)
<dimitern> frobware: I thought so as well, but the job runs on 2.0
<mgz> that was a different maas
<mgz> and this failure has been happening since the landings on the 7th
<dimitern> mgz: the first time it happened was after frobware's PR got reverted, so it can reach a bit further preparing container NICs
 * frobware jumps on a train; back in 90 mins... or so.
<babbageclunk> fwereade: was lunching - looking now
<fwereade> babbageclunk, cheers
<natefinch> rick_h_: got a suggestion for another bug to pick up?
<rick_h_> natefinch: for that one I pushed back on not doing anything on that
<rick_h_> natefinch: did you get the reply?
<natefinch> rick_h_: missed that, will go look
<rick_h_> ty
<natefinch> gah, I need multiple return values on my executables :/
<babbageclunk> GAH, can someone please tell me what I should avoid so that the bot will post to RB?
<natefinch> heh
<babbageclunk> double gah!
<babbageclunk> o/
<natefinch> babbageclunk: not sure what's wrong with the rb bot
<babbageclunk> Is it working for others at the moment?
<natefinch> babbageclunk: last night it made the review for me, but didn't update my PR description with the link
<babbageclunk> I think it's something to do with things I put in the description - it's looking for (Review request: http://blah) so it knows whether to update or create a new one. But it gets confused.
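A small sketch of the kind of matching babbageclunk describes, assuming the bot scans the PR description for a "(Review request: URL)" marker to decide between updating and creating a review. The pattern and the findReviewURL helper are guesses for illustration, not the bot's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// reviewLink matches a marker of the form "(Review request: http://...)".
// The exact format the bot expects is an assumption based on the chat.
var reviewLink = regexp.MustCompile(`\(Review request: (https?://\S+?)\)`)

// findReviewURL returns the first review URL in a PR description, and
// false when no marker is present (meaning a new review would be created).
func findReviewURL(description string) (string, bool) {
	m := reviewLink.FindStringSubmatch(description)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	desc := "Use dashes in keys.\n\n(Review request: http://reviews.vapour.ws/r/5638/)"
	url, ok := findReviewURL(desc)
	fmt.Println(url, ok)
}
```

Text in the description that resembles the marker but is not one could trip a matcher like this, which would fit the confusion described above.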
<rick_h_> natefinch: what about going with resource-get-fingerprint
<natefinch> rick_h_: well, seems like the information we really want is "is the info different on the server"  Do we care what the actual fingerprint is?  I don't think that tells us anything by itself
<babbageclunk> dimitern, voidspace, frobware: can I get a review of http://reviews.vapour.ws/r/5638/ please?
<dimitern> babbageclunk: looking
<babbageclunk> dimitern: thx!
<rick_h_> natefinch: I just sent an email off to chuck to clarify and make sure I'm understanding things correctly
<natefinch> rick_h_: cool
<rick_h_> natefinch: while that goes on how about looking at https://bugs.launchpad.net/juju/+bug/1620056 please
<mup> Bug #1620056: constraints should support cores=X <juju:Triaged by rharding> <https://launchpad.net/bugs/1620056>
<rick_h_> natefinch: need to create a card on the board for it
<natefinch> rick_h_: ok, will do
<natefinch> heh, I was like "don't we already support that?"  but it's a name change. I get it.
<rick_h_> natefinch: ty
<rick_h_> natefinch: right, little tweaking
<natefinch> rick_h_: the main problem is that the best interface for resource get would be to return two values: the path of the file on disk, and a boolean indicating whether or not it has been updated.  But that's kind of hard to do on the command line.
<rick_h_> natefinch: right, so that leads to two commands
<natefinch> rick_h_: but if you're running two commands already, you might as well just use md5sum
<rick_h_> natefinch: but thing there is that folks can/will use different tools/etc. Making things like building a charm with layers and such more complicated. What mechanism did this guy use to hash/etc
<rick_h_> natefinch: if it's a common use case thing we can make an opinionated standard that everyone just follows
<rick_h_> natefinch: honestly, I want something like a return code from resource-get that says "did the file on disk change or not"
<rick_h_> but I have a feeling anything non-0 might be more problematic for folks to rely on and build on
<babbageclunk> fwereade: LGTM
<natefinch> rick_h_: that's what I was thinking too.... if it's behind a new flag, it's probably not a big deal, since we wouldn't be breaking backwards compatibility.  like resource-get --check-new
<natefinch> rick_h_: probably most people would just use whatever function we put in charm helpers
<natefinch> rick_h_: replied to your email. The other option is to have a --yaml flag (or --json) that outputs structured data, so we can return the path and a boolean
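A sketch of what the structured output floated above might look like, assuming a JSON document with the resource path and a "changed" boolean. The flag does not exist in the chat beyond a proposal, and resourceInfo, describe and the use of SHA-256 to detect a change are all assumptions for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// resourceInfo is a hypothetical output shape: where the file is on disk
// and whether its content differs from what was there before.
type resourceInfo struct {
	Path    string `json:"path"`
	Changed bool   `json:"changed"`
}

// describe compares the previous and current content by hash and renders
// the result as JSON, so a caller gets both values from one command.
func describe(path string, previous, current []byte) (string, error) {
	changed := sha256.Sum256(previous) != sha256.Sum256(current)
	out, err := json.Marshal(resourceInfo{Path: path, Changed: changed})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := describe("resources/app.tgz", []byte("v1"), []byte("v2"))
	fmt.Println(out, err)
}
```

A charm author could then parse one document instead of running a second command or relying on a non-zero exit code.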
<rick_h_> voidspace: fwereade ping for standup
<rick_h_> dimitern: ping for standup
<dimitern> omw
<babbageclunk> hmm - anyone have advice about how to get into a 2nd gen x1 carbon to replace the ssd?
<voidspace> rick_h_: omw
<mgz> wat
<mgz> why you drop me google
<alexisb> babbageclunk, ping
<babbageclunk> alexisb: pong
<alexisb> heya babbageclunk, sorry moving slowish this morning
<alexisb> did you want to touch base this morning?
<babbageclunk> I forgot I was sitting in the hangout - camera's been on while I've been replacing my ssd.
<alexisb> lol
<alexisb> ok I will pop on
 * dimitern is back
<voidspace> fwereade: was it you I'm due to be reviewing something for?
<fwereade> voidspace, http://reviews.vapour.ws/r/5637/
<fwereade> voidspace, ta
<babbageclunk> voidspace: If you're looking for a pretty relaxing Friday afternoon-style review you can look at this one: http://reviews.vapour.ws/r/5638/
<voidspace> babbageclunk: hah, because I have nothing else to do :-p
<voidspace> fwereade: looking
<babbageclunk> Ok, so do I want to do full-disk encryption? Will I still be able to install Windows for indie gaming fun?
<babbageclunk> Duh, no, that's what full-disk means.
<natefinch> babbageclunk: in my opinion, full disk encryption will just make your life harder.  In theory, if Something Bad™ happens, you'll have protection against the government grabbing your laptop and using its contents against you.  However, it doesn't prevent said government from hitting you with a pipe until you give them your password.
<dimitern> babbageclunk: LGTM, sorry it took so long
<babbageclunk> natefinch: true.
<babbageclunk> dimitern: No worries - sounds like there was some other stuff going on!
<babbageclunk> dimitern: Thanks!
<dimitern> babbageclunk: yeah, :/ a friday MAAS mystery (or likely even misery)
<dimitern> mgz: how goes?
<babbageclunk> dimitern: stink
<mgz> dimitern: doing some futzing still, will yell in a sex
<mgz> -x+c
<natefinch> wow, that really changes the meaning of your sentence
<dimitern> ;)
<mgz> ;_;
<dimitern> mgz: ok
 * rick_h_ goes for long lunch with family today, biab
<mgz> dimitern: running now
<dimitern> mgz: I'm tailing the maas logs in the meantime
<mgz> dimitern: state server coming up on 10.0.30.13
<mgz> you should be able to ssh in there with the ci staging key shortly
<dimitern> mgz: yeah, trying now
<mgz> dimitern: okay, we're at the fail point
<dimitern> mgz: no errors so far in the controller log
<voidspace> fwereade: LGTM
<dimitern> mgz: ok, got the error - looking earlier in the logs
<fwereade> voidspace, ta
<dimitern> mgz: /var/log/postgresql/postgresql-9.3-main.log on maas has some very interesting errors
<dimitern> frobware: ^^
<redir> morning juju
<frobware> dimitern: categorically works for me on 1.9
<dimitern> frobware: it works for me on both 1.9 and 2.0
<frobware> dimitern: hmm; I have a few teething problems on my maas 2.0 setup atm
<dimitern> frobware, mgz: I think I nailed it
<dimitern> the db got corrupted
<dimitern> when the upgrade was done yesterday
<mgz> dimitern: what's up with having both 9.5 logs (last from -09-08) and 9.3 logs (current)?
<dimitern> instead of upgrading from 2.0.0rc3 to 2.0.0, it was done first to 2.1.0alpha3, then downgraded to 2.0.0 final
<mgz> dimitern: aha
<dimitern> unfortunately the alpha3 package ran some db migrations, which dropped maasserver_node_system_id_seq
<mgz> the migrations *should* be two ways
<dimitern> and that's causing the device creation to fail
<dimitern> I bet I can repro it manually even with the maas cli only
<dimitern> pgsql 9.3 is the one maas uses, the newer was installed with the intent of upgrading the maas one, but that didn't happen
<dimitern> all those hints can be observed in /var/log - ./apt/term|history.log, ./postgresql/.. but unfortunately not in any of the maas logs
<dimitern> frobware, mgz: and indeed I did - http://paste.ubuntu.com/23155215/
<dimitern> that maasserver_node_system_id_seq got dropped months ago after 2.0 switched to shorter node names (check the MP - the only google result for the seq name)
<dimitern> I guess nobody seriously thought the existing maas 1.9 node ids would be a problem after upgrading to 2.0 and then to 2.1
<frobware> dimitern, mgz: so I just don't run into this because I don't upgrade... :(
<frobware> dimitern, mgz: bitter experience from Windows 95.
<dimitern> frobware: well isn't it better not to run into these issues? :)
<frobware> dimitern: yes, but customers (you would hope) are using our older versions...
<dimitern> mgz: I'm trying to see if it's salvageable without reinstall
<dimitern> by reverting the 2.1.0 migrations
<mgz> dimitern: thanks...
<frobware> dimitern, mgz: funnily enough I am having psql problems installing a fresh version of maas 2.0
<frobware> dimitern: that's my third install which has failed.
<rick_h_> frobware: dimitern any progress on identifying the issue?
<frobware> rick_h_: see issues above about psql failures in migrations
<dimitern> rick_h_: yeah
<rick_h_> frobware: reading backlog, so there's issues with upgrading and how the ids change over time?
<rick_h_> frobware: dimitern k, sounds like we need to file a bug against MAAS?
<rick_h_> frobware: dimitern and then to update our install to be a fresh install of MAAS in CI and we should pass tests cleanly?
<rick_h_> mgz: ^ ?
<dimitern> rick_h_: botched maas upgrade 2.0r3->2.1.0a1->2.0.0 corrupted the dn
<dimitern> db
<rick_h_> dimitern: oh, it was a botched upgrade?
<rick_h_> dimitern: ok
<dimitern> some db migrations ran, but not all are reversible
<rick_h_> sinzui: mgz so are we able to clean the MAAS and rerun?
<frobware> rick_h_: agreed to the clean install; leave the other as-is for continued investigation
<frobware> mgz:  ^^
<dimitern> rick_h_: I was trying to manually revert those 2.1.0 migrations, but not all can be reverted
<rick_h_> dimitern: ok, let's not bother with that imo
<dimitern> ok
<sinzui> dimitern: rick_h_  ouch, and the error happened before the botched upgrade
<rick_h_> sinzui: right, but it was a different error
<rick_h_> sinzui: that was backed out with the revert, but then we kept failing with a new error due to the botched upgrade
<dimitern> sinzui: yeah, a bit more complicated than usual
<rick_h_> sinzui: mgz dimitern frobware hangout to set the path forward?
<sinzui> rick_h_: dimitern: I don't know what clean up means in this case. A clean install is something I have never done. I won't promise it for today
<dimitern> rick_h_: sure - standup HO?
<rick_h_> dimitern: rgr
<dimitern> I need to go soon
<mgz> I think we need a new maas vm
<frobware> sinzui: I can help with a clean install
<mgz> or some more serious help from a maas expert
<frobware> mgz: agreed
<rick_h_> alexisb: BrettD mgz sinzui welcome to join https://hangouts.google.com/hangouts/_/canonical.com/core?authuser=1
<alexisb> rick_h_, not atm
<rick_h_> alexisb: k, will fill you in afterwards then
<sinzui> mgz hop back up
<sinzui> back on
<sinzui> mgz: really, hop back on to https://hangouts.google.com/hangouts/_/canonical.com/core?authuser=1
<mgz> sinzui: omw
<marcoceppi> natefinch: what version of juju did you disable ciphers
<marcoceppi> like the shitty insecure ciphers
<rick_h_> marcoceppi: https://bugs.launchpad.net/juju/+bug/1604474 b16
<mup> Bug #1604474: Juju 2.0-beta12  userdata execution fails on Windows <azure-provider> <ci> <juju2.0> <oil> <oil-2.0> <regression> <vpil> <windows> <juju:Fix Released by natefinch> <https://launchpad.net/bugs/1604474>
<natefinch> marcoceppi: uh
<marcoceppi> rick_h_: nice, thanks
<natefinch> yes that
<sinzui> rick_h_: mgz: functional-container-networking does pass on maas 1.9. maas 2.0 is being rebuilt now
<rick_h_> sinzui: ty
<rick_h_> sinzui: feel a bit better about things now
<natefinch> method returns value, error. The only place we call the method... we ignore the error.  FanTASTIC.
<lazyPower> silly question, i've got an application deployed, its a subordinate, and its active in the "unit view", but its not listed as active in the "app view" - is this known and expected? http://imgur.com/AGp83ik
<natefinch> lazyPower: dunno... there's been a lot of churn in that area.  Not sure who last updated that code.
<lazyPower> ok, i figured it was intentional as subordinates dont technically occupy a unit, they co-locate.. but it was a bit startling of a realization
<lazyPower> thanks natefinch
<natefinch> lazyPower: I know we intentionally hide them in some parts... not sure if this was an intentional part or not :)
<redir> ready for a review if anyone is still around: http://reviews.vapour.ws/r/5640/
<redir> bbiab
#juju-dev 2016-09-10
<mup> Bug #1622136 opened: Interfaces file source an outside file for IP assignment to management interface <juju-core:New> <https://launchpad.net/bugs/1622136>
#juju-dev 2017-09-04
<thumper> babbageclunk: hey there
<thumper> babbageclunk: can I get you to look at a PR for me?
<thumper> babbageclunk: https://github.com/juju/juju/pull/7821
<babbageclunk> thumper: sure
<thumper> babbageclunk: thanks
<thumper> babbageclunk: I'm working on a followup to that one
<thumper> that builds on it
<wallyworld> axw: when you get time, would like a second opinion on the modelling for the firewall rules. i've tried to keep it all very simple at the core data model; layers on top can add complexity if needed https://github.com/juju/juju/pull/7823
<axw> wallyworld: ok, will look in a bit. feeling like crap, been sneezing nonstop all morning... going to go lie down after
<wallyworld> axw: that's no good, it can wait, go afk and rest
<thumper> axw: seconded, go rest
<babbageclunk> thumper: approved - minor comment about a confusing variable name.
<thumper> babbageclunk: cool, thanks
<babbageclunk> wallyworld: looking at yours now
<wallyworld> yay, ty
<babbageclunk> wallyworld: approved.
<wallyworld> awesome, ty
<bdx> giving 2.3 edge a test run
<bdx> is there some kind of endpoint config for my model
<bdx> lets say the model from which machines are offering
<bdx> or applications are offering
<bdx> take the use case of an application deployed to maas
<bdx> on private networks
<bdx> relating to an application in a public cloud
<bdx> I seem to remember this being discussed somewhere
<bdx> possibly
<bdx> ha
<bdx> anyone know about a model config that defines a public egress gateway or something?
<wallyworld> thumper: veebers: balloons: would you guys be free now for release call?
<wallyworld> bdx: with the latest edge, you can do what you want. you offer the endpoint in the public cloud for example. if your maas is behind a NAT firewall and traffic originates from a given NAT address, you can either set a model config "egress-subnets" or when you relate to the offered endpoint in the public cloud, use "juju relate --via <subnet>". in each case, subnet is a CIDR, eg <nat address>/32
<veebers> wallyworld: I could be in 5, balloons is off today.
<wallyworld> ok, that would be great
<bdx> wallyworld: thats great! thx
<wallyworld> bdx: the model config is global for all relations, the --via option is per relation
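A minimal sketch of the "<nat address>/32" form mentioned above, assuming the NAT address is known as a plain IP string. The helper name hostCIDR and the address 203.0.113.7 are illustrative only and are not part of juju; the result is the kind of value that would be given to "egress-subnets" or "--via".

```go
package main

import (
	"fmt"
	"net/netip"
)

// hostCIDR (hypothetical helper) turns a single address into the CIDR that
// covers only that address: /32 for IPv4, /128 for IPv6.
func hostCIDR(addr string) (string, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return "", fmt.Errorf("parsing %q: %w", addr, err)
	}
	// BitLen is 32 for IPv4 and 128 for IPv6, so the prefix matches one host.
	return netip.PrefixFrom(ip, ip.BitLen()).String(), nil
}

func main() {
	// 203.0.113.7 is a documentation-range address standing in for the NAT address.
	cidr, err := hostCIDR("203.0.113.7")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cidr)
}
```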
<wallyworld> bdx: i have tested with mysql in aws, and i then deploy mediawiki to a lxd cloud on my laptop; similar to your set up with maas i think
<bdx> I see, perfect .... yeah I think thats what was stopping my postgresql relations earlier
<bdx> Ill give it a whirl, thanks
<wallyworld> bdx: ah postgres - there will need to be some charm updates, it may not work out of the box just yet
<bdx> gotcha
<wallyworld> it might work, but the postgres charm needs to update the hba.conf
<wallyworld> and there's internal juju changes in progress to allow things to be modelled properly
<wallyworld> i don't think the NAT scenario would work just yet
<wallyworld> for postgres anyway
<wallyworld> it will all work real soon
<bdx> ok
<bdx> not sure if this is a corner case or not
<bdx> so like in my datacenter, we have a direct route to our aws us-west-2 vpc
<bdx> and vice versa
<bdx> the routing tables in aws point back to my private networks in the datacenter
<bdx> my datacenter (MAAS) nodes, and my aws instances dont have to traverse the WAN to talk
<bdx> because they talk via a virtual private gateway, and have fiber straight to our racks at the datacenter
<bdx> so for my controllers/models
<bdx> I would want the services to talk over the VPG and not the wan
<bdx> for my on-prem <-> aws instance
<bdx> we also dont get charged for data that travels in and out of our racks at the datacenter to us-west-2
<bdx> I'm wondering if providing the '--via' for my internal nets will make juju do what I want it to do
<bdx> (route via the VPG instead of externally)
<bdx> because I've already got the routes
<bdx> I guess this will turn into a matter of juju wanting to support the remote relation via the public endpoint eh?
<bdx> hmmmm
<thumper> wallyworld, veebers: oh, have you had the release call?
<thumper> here I am sitting all alone
<veebers> thumper: heh yeah we met as wallyworld needed to pop out. You want to meet?
<thumper> veebers: is there anything interesting to discuss?
<veebers> thumper: not really, just we have 2 things being worked on, they should hopefully be done this week for release, otherwise we push out a little longer
<thumper> veebers: I do have a question for you though
<thumper> veebers: http://ci.jujucharms.com/job/github-merge-juju/216/
<thumper> veebers: how do I find out what failed from this link?
<veebers> let me look now
<thumper> hmm... [xenial] Error: retrieving gpg key timed out.
<veebers> thumper: hit the "open blue ocean" link on the side there, gives nice logging output
<veebers> thumper: but yeah, looks like a hiccup there failed it :-\
<thumper> what's blue ocean?
<thumper> looks pretty though
<veebers> thumper: blue ocean is a UI layer for jenkins designed for the pipeline builds (a way of declaring builds in code etc.)
<veebers> thumper: we use it for our merge and check-merge jobs, we were going to use it for the ci-run replacement, but have pivoted away from that
 * thumper nods
<babbageclunk> thumper: ok - after that hump of cloud names the amazon migration fails because we use different machine tagging: juju-env-uuid vs juju-model-uuid.
<babbageclunk> thumper: this doesn't happen on maas
<babbageclunk> thumper: since we use the agent-name value from maas.
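A rough sketch of the tag mismatch described above, assuming instance tags are available as a plain map. Only the two key names come from the discussion; the function upgradeTags, the assumption that juju-env-uuid is the older key, and the map representation are all hypothetical, and real provider APIs will look different.

```go
package main

import "fmt"

const (
	oldModelTag = "juju-env-uuid"   // assumed to be the older key
	newModelTag = "juju-model-uuid" // assumed to be the newer key
)

// upgradeTags (hypothetical) returns a copy of tags in which the old model
// key has been renamed to the new one. An existing new key is never overwritten.
func upgradeTags(tags map[string]string) map[string]string {
	out := make(map[string]string, len(tags))
	for k, v := range tags {
		if k != oldModelTag {
			out[k] = v
		}
	}
	if v, ok := tags[oldModelTag]; ok {
		if _, exists := out[newModelTag]; !exists {
			out[newModelTag] = v
		}
	}
	return out
}

func main() {
	// "some-uuid" is a placeholder value, not a real model UUID.
	fmt.Println(upgradeTags(map[string]string{oldModelTag: "some-uuid"}))
}
```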
<thumper> ok
<babbageclunk> thumper: do you know how we do it in openstack?
<babbageclunk> I'll have a dig.
<babbageclunk> thumper: hmm, looks like it'll be a problem there too.
<thumper> hazaah
<babbageclunk> thumper: is canonistack on openstack?
<thumper> yep
<babbageclunk> ok - working on sorting that out now
#juju-dev 2017-09-05
<anastasiamac> a review anyone plz - https://github.com/juju/juju/pull/7824
<anastasiamac> thumper: disallow self-reset and old facade version fix ^^
<wallyworld> babbageclunk: axw: late for standup, still doing interview
<babbageclunk> wallyworld: ok
<wallyworld> babbageclunk: axw: finished now
<wallyworld> thumper: if you get a chance to look at my pr? https://github.com/juju/juju/pull/7823
 * babbageclunk goes for a run
 * thumper headdesks
<babbageclunk> uh oh
 * thumper saves a minute off the test time by not running things twice
<thumper> the bundle resource tests were composed of a concrete test suite
<thumper> which tested all bundle deployment
<thumper> so to run the resource test, it reran all the bundle tests
<thumper> I've fixed that in my branch
<thumper> ugh...
<thumper> I see why it was needed now
<thumper> but it is horrible
<thumper> cyclic includes
<babbageclunk> ugh
<thumper> babbageclunk: https://github.com/juju/juju/pull/7825
<babbageclunk> thumper: looking
<thumper> babbageclunk: thanks
<thumper> hmm... local test failure that is nothing to do with my work...
<thumper> FAIL: config_test.go:39: ConfigSuite.TestGenerateControllerCertAndKey
<thumper> in controller package
<thumper> fails for anyone else?
<thumper> seems to consistently fail
<thumper> how did this get in?
<babbageclunk> thumper: on develop or 2.2?
<thumper> 2.2
<babbageclunk> thumper: hang on, building
<thumper> oh...
<thumper> I wonder if it is an embedded time in the testing certs
<babbageclunk> thumper: and they've just expired?
<thumper> hmm.. I'd expect more failures if that was the case
<thumper> config_test.go:62:
<thumper>     c.Assert(err, jc.ErrorIsNil)
<thumper> ... value x509.HostnameError = x509.HostnameError{Certificate:(*x509.Certificate)(0xc420125900), Host:"anyServer"} ("x509: certificate is not valid for any names, but wanted to match anyServer")
<babbageclunk> thumper: sorry, got distracted. It passes for me.
<thumper> hmm...
<thumper> weird
<thumper> I wonder why it started failing for me...
<babbageclunk> thumper: probably you broke it somehow. ;)
<thumper> babbageclunk: we'll see if the merge bot likes it I guess
 * thumper shrugs
 * thumper follows his own instructions on setting up metrics gathering
<thumper> I'm giving a presentation this afternoon on prometheus and grafana
<babbageclunk> thumper: approved
<thumper> babbageclunk: awesome, ta
<babbageclunk> thumper: looks like the mergebot liked it - did you work out why that test was failing?
<thumper> nope
<thumper> also...\
<thumper> babbageclunk: Yay
<axw> wallyworld: just realised a bit too late, might be worth changing Save to accept a FirewallRule. can be done later since you're already merging
<wallyworld> axw: i have a followup to add the cli and facades i'm about to propose, can do a drive by
<axw> wallyworld: thanks
<wallyworld> you feeling better?
<wallyworld> thanks for first review
<wallyworld> axw: if you get a chance at some point, here's PR with lots of boilerplate to add the firewall rules CLI; it also tweaks the state save api. the boilerplate stuff can be skimmed to make the review quicker https://github.com/juju/juju/pull/7826
 * thumper rolls 2.2 into develop again
<thumper> wallyworld: you around yet?
<wallyworld> maybe
<thumper> wanna jump in the release call early?
<wallyworld> sure
<thumper> I have a few questions around some of your work
<babbageclunk> thumper: ping
<babbageclunk> ?
<thumper> hey
<thumper> otp
<thumper> babbageclunk: what is your go version?
<babbageclunk> thumper: still 1.8
<thumper> I think that is why the test is passing for you
<thumper> and not me nor wallyworld
<babbageclunk> oh sting
<babbageclunk> k
<babbageclunk> Seems like a weird thing for a go version change to break. Maybe we were inadvertently relying on something undocumented/incorrect?
<thumper> it is a bug fix in the x509 package
<thumper> If any SAN extension, including with no DNS names, is present in the certificate, then the Common Name from Subject is ignored. In previous releases, the code tested only whether DNS-name SANs were present in a certificate.
<thumper> from the release notes
<babbageclunk> thumper: yeah, that would do it.
<thumper> we need to fix juju
<babbageclunk> thumper: when you get a moment, I'm having real trouble working out how to create an environ in juju1 - all of my understanding about it is juju2, and it changed a lot!
<thumper> babbageclunk: I can chat now...
<babbageclunk> thumper: cool - ho?
<babbageclunk> duh, I mean 1:1?
<thumper> ack
#juju-dev 2017-09-06
<thumper> axw: morning, how well do you know x509?
<axw> thumper: morning. not intimately, why?
<thumper> axw: we have a bug in the controller package with go 1.9 because upstream fixed a bug in x509 package
<thumper> I'm after someone who understands what we are doing a bit more
<thumper> possibly our test is a bit (or a lot) wrong
<axw> thumper: ah right. where is it?
<thumper> controller package
<thumper> the fix is outlined here: https://golang.org/doc/go1.9#minor_library_changes
<thumper> If any SAN extension, including with no DNS names, is present in the certificate, then the Common Name from Subject is ignored. In previous releases, the code tested only whether DNS-name SANs were present in a certificate.
<thumper> and our cert package Verify function
<thumper> we specify: opts := x509.VerifyOptions{DNSName: "anyServer", Roots: pool, CurrentTime: when}
<thumper> it is the DNSName that dies in the test
<thumper> config_test.go:62:
<thumper>     c.Assert(err, jc.ErrorIsNil)
<thumper> ... value x509.HostnameError = x509.HostnameError{Certificate:(*x509.Certificate)(0xc4201d5900), Host:"anyServer"} ("x509: certificate is not valid for any names, but wanted to match anyServer")
<thumper> I don't feel I understand our cert usage enough to work out what to do
<thumper> hoping someone else on the team does
<axw> thumper: looking
<axw> thumper: so it's failing on the second test case, because it has IP SANs specified. our certs are generated with a common name of "*", which was previously matching the "anyServer" we specify in Verify
<thumper> ah...
<thumper> hmmm
<axw> thumper: seems that we can just drop the DNSName from Verify, and not check that... seems pointless anyway
<thumper> axw: hmm...
 * thumper jumps in yet another call
<babbageclunk> axw: Gah, gopkg.in/amz.v3 doesn't expose DeleteTags. Rather than trying to add it to the package, I'm just going to use CreateTags to set juju-controller-uuid to "" - sound reasonable to you?
<axw> babbageclunk: ounds fine to me
<axw> sounds*
<babbageclunk> ool
<axw> )
<wallyworld> axw: maybe at some point you could look at this PR to change how relation status is modelled - no longer a field on the relation doc but a status entry. i have to do the juju.description change and update dependencies.tsv before landing https://github.com/juju/juju/pull/7831
<axw> wallyworld: will try for today, but may have to be tomorrow. just getting into some vsphere stuff
<wallyworld> no worries
<wallyworld> it can wait
<axw> wallyworld: code looks fine, but I'll take another look in the morning with a fresh mind, to relook at the watcher bits
<wallyworld> axw: no worries, i'll do the juju/description stuff in the meantime. with the watcher, the existing tests all pass so hopefully things are ok
<thumper> wallyworld: https://github.com/juju/juju/pull/7834
<wallyworld> looking
<wallyworld> thumper: yeah, that should be all that's needed hopefully
<thumper> I do recall that there was another problem before where the secondaries were not connecting to localhost...
<thumper> but were going to the primary
<thumper> but that was a different bug and also fixed
<thumper> so perhaps this was just another symptom of that?
<thumper> I wish I had the bug reference for that bug
<babbageclunk> thumper: I think I've sorted the openstack tag upgrading. ec2 is a lot harder, because the security groups need renaming (which really means recreating with the new name, associating with the instances and deleting the old one).
<thumper> babbageclunk: don't worry about ec2
<babbageclunk> thumper: So I'm thinking I'll leave that for now, not merge my partial
<thumper> right now we only care about maas
<babbageclunk> yeah
<babbageclunk> maas is fine - doesn't use tagging on maas 1.9
<thumper> veebers: probably worthwhile getting the tests run over 2.2 before I forward port the fixes to devel
<thumper> just to be sure
<thumper> veebers: did you work out how to add artful to the tool gen?
<veebers> thumper: artful will happen for any release happening now (1.25.13 got artful) will need to do something extra to get it for previous releases
<thumper> veebers: I think we'll be ok as long as we get artful when we get the 2.2.3 release
<veebers> thumper: ack re: tests, is there something blocking that? (it will just happen once you land that branch)
<veebers> thumper: we will get it for that release
<thumper> veebers: both branches have landed for 2.2
<thumper> I was just wanting to make sure that the fixes worked before forward porting
<thumper> so testing should be underway
<veebers> thumper: ack, awesome
<veebers> I'll keep an eye out, try minimise infra noise in the results
<veebers> *sigh* I need to work out why unit tests are taking an age on this machine: "ok   github.com/juju/juju/agent/agentbootstrap 612.908s"
<veebers> that's with GOMAXPROCS=8, this machine has 32 CPUs and 252GB ram :-\ could be IO bottleneck?
<thumper> veebers: the tests for any package are run in serial
<thumper> veebers: could be i/o with DB access
<veebers> thumper: ack, looking into possible io bottlenecks etc.
<veebers> thumper: hah yeah, it's IO: "DSK |          sda | busy    101% | ..."
<thumper> :)
#juju-dev 2017-09-07
<babbageclunk> axw: thanks for the review!
<axw> babbageclunk: np, will review the other one soon
<babbageclunk> axw: cheers
<axw> jam wpk: just saw this in the GCP newsletter, in case you're not subscribed: https://cloud.google.com/compute/docs/create-use-multiple-interfaces
 * babbageclunk goes for a run
<jam> good/bad: "You can only configure a network interface when you create an instance"
<babbageclunk> wallyworld: chasing a really weird bug with subordinates after upgrade
<wallyworld> you mean 1.25 -> 2.x?
<babbageclunk> yup
<wallyworld> joy
<babbageclunk> wallyworld: I needed to put in code to handle subordinates with multiple principals, which was pretty straightforward, but now I've got a weird thing going on in the upgraded model. Trying to track it down.
<wallyworld> babbageclunk: ok. do you need any changes to the 2.2 branch to fix any of this? we are about to cut the 2.2.3 release
<babbageclunk> wallyworld: not sure yet but don't think so
 * thumper relocates to a cafe as car getting a warrent
<rick_h> getting a warrent?
<babbageclunk> He's renting it out for a war.
<veebers> rick_h: NZ has WOF (warrent of fitness), need to get your car checked out every, um, year?
 * veebers should check his rego and wof
<babbageclunk> wallyworld: do you have a moment? Want to talk about the problem I'm seeing
<wallyworld> babbageclunk: ok
<babbageclunk> wallyworld: sorry, chatting to thumper now
<thumper> wallyworld: I'm not going to be able to make the meeting, still in the cafe
<wallyworld> ok
<thumper> was hoping to be home by now, but started talking to xarses in #juju
<thumper> and I don't want to leave him half done
<wallyworld> babbageclunk: joining us?
#juju-dev 2017-09-08
<babbageclunk> wallyworld: bugger - I think I've worked it out and it's actually a bug in the migration import (so in 2.2.3) for subordinates.
<babbageclunk> wallyworld: just checking to see whether I can reproduce it in straight 2.2.3->2.2.3 migration
<babbageclunk> wallyworld: yup
<babbageclunk> thumper: ^
<thumper> fuck
<babbageclunk> :(
 * thumper sads
<thumper> what don'
<babbageclunk> Sorry, I wish I'd worked it out sooner
<thumper> what don't we do right?
<babbageclunk> It's basically the reverse of the problem you fixed in the export - needs to skip creating relation scopes when the relation unit isn't valid.
<thumper> bugger...
<babbageclunk> thumper: https://github.com/juju/juju/blob/2.2/state/migration_import.go#L1090
<wallyworld> axw: i've pushed some changes to the status watcher
<thumper> bollocks
<axw> wallyworld: ok, looking
<thumper> babbageclunk: can you please create a patch against 2.2
<babbageclunk> yup yup
<thumper> we may roll a fast 2.2.4
<thumper> but I'd like confirmation that it works first :)
<babbageclunk> thumper: of course. :)
 * babbageclunk pops out briefly to pick up Ada 
<wallyworld> axw: if you need a break from vsphere, here's PR which adds firewall checking to cmr ingress. the worker business logic is all of 3 lines, but it needs some apiserver boilerplate to add the SetRelationStatus API https://github.com/juju/juju/pull/7839
<axw> wallyworld: ok, a bit later
<wallyworld> sure, np
<thumper> babbageclunk: please file a bug so we can track this issue
 * thumper creates a 2.2.4 milestone
<thumper> ffs
<babbageclunk> thumper: ok
<veebers> babbageclunk: when you have a moment could you describe the repro steps? It seems there is a gap in the functional testing that this slipped through
<babbageclunk> veebers: typing up the steps in the bug now - I'll paste a link here in a mo
<veebers> babbageclunk: awesome, thanks
<wgrant> Is this likely to be a regression relating to the subordinate relation issues that were fully fixed in 2.2.2, or is it just a migration bug?
<wgrant> ie. can I safely upgrade my smaller controllers to 2.2.3 tonight to test things out before bigger controllers get upgraded?
<babbageclunk> wgrant: it's related to the subordinate problem in 2.2, but it only happens on migration - upgrading won't cause it.
<wgrant> babbageclunk: Cool, thanks.
<babbageclunk> wgrant: if the subordinates are ok in your 2.2.2 model then they'll be ok in 2.2.3 (at least in the context of this bug).
<wgrant> Great.
<babbageclunk> wgrant: sorry, my mention of upgrading earlier probably muddied the water a bit, but I'm working on 1.25 -> 2 upgrade, which is migration under the hood.
<wgrant> babbageclunk: Oh, I didn't see that, I was just worried since the subordinate relation code had been historically fragile.
<babbageclunk> wgrant: right - this is another lurking case of that same problem unfortunately.
<babbageclunk> veebers: https://bugs.launchpad.net/juju/+bug/1715794
<mup> Bug #1715794: migration: subordinate with multiple principals imports incorrectly <juju:Triaged> <https://launchpad.net/bugs/1715794>
<veebers> babbageclunk: sweetbix, I'll check out the repro
<veebers> babbageclunk: ah, nice catch. yeah the current model migration functional test doesn't do any relation removals
<wgrant> Would a state dump from both ends of the migration reveal this, or would the original fix exclude the problematic relationscopes from the dump?
<babbageclunk> wgrant: do you mean dump-model? I think it wouldn't show the problematic relationscopes, because we've changed it not to look them (since they won't exist in a good model).
<babbageclunk> wgrant: but dump-db will show them.
<babbageclunk> oops - changed it not to look them up
<babbageclunk> wallyworld: PR for the fix? https://github.com/juju/juju/pull/7840
<wallyworld> ok
<babbageclunk> thanks
<wallyworld> babbageclunk: so to make sure i understand, we modify import to ignore bad data
<wallyworld> should the first line in the PR description be "incorrect" not "correct"
<babbageclunk> wallyworld: no, the import is good - we were fabricating relation scopes even when they weren't in the import data.
<babbageclunk> I mean, the import data is good.
<wallyworld> hmmm ok, i'm a bit confused then
<wallyworld> we change import to say if ru not valid then....
<wallyworld> how would ru not be valid if import data was ok
<babbageclunk> The relationscopes aren't reflected directly in the export format, but where it used to have two entries for settings under the endpoint it now only has one.
<wallyworld> right, so the import data *is* wrong
<babbageclunk> wallyworld: nope
<wallyworld> but you just said above it is?
<wallyworld> we export bad data
<babbageclunk> No, we create bad data when importing good data.
<wallyworld> which we then attempt to import
<wallyworld> but the data by definition is not good if it allows us to create bad data
<babbageclunk> The export is right, but the nested loop in the code creates relationscopes and settings records even when there's nothing corresponding to them in the import data.
<wallyworld> sounds like the export format is flawed
<babbageclunk> wallyworld: I think you might be right - the import is smarter than it should be in this case.
<wallyworld> i guess i don't understand why ru.Valid() is needed - that seems like too late in the process. we should not even be getting to the point where we have bad ry data
<wallyworld> if we are then either the export data is flawed, or the import is broken
<wallyworld> we should never create bad ry's
<wallyworld> ru
<wallyworld> which we then happen to filter out via Valid()
<wallyworld> shouldn't the problem be solved further up?
<babbageclunk> The other option I toyed with was to continue if endpoint.Settings(unit.Name()) returned nil.
<babbageclunk> Shall we do a hangout?
<wallyworld> sure
<wallyworld> standup one?
<babbageclunk> yup
#juju-dev 2017-09-10
<babbageclunk> wallyworld: quick review? https://github.com/juju/description/pull/23
<babbageclunk> wallyworld: hey!
<wallyworld> babbageclunk: sorry, omw now, issues with coffe machine
<babbageclunk> covfefe machine
<wallyworld> babbageclunk: juju/description lgtm
<babbageclunk> wallyworld: ta!
<wallyworld> np
<babbageclunk> wallyworld: ah, I forgot no mergebot - could you merge it too?
<wallyworld> ok
<babbageclunk> Thanks!
<wallyworld> done
<babbageclunk> you rock
