#juju-dev 2012-03-19
<bigjools> fwereade__: around?
<fwereade__> bigjools, heyhey
<bigjools> fwereade__: hello!
<fwereade__> bigjools, how goes it?
<bigjools> fwereade__: desperately need your help, I am trying to fix that system_id thing
<bigjools> totally failing so far
<bigjools> do you have some time?
<fwereade__> bigjools, yeah, I'm not sure I explained myself very clearly
<fwereade__> bigjools, ofc
<bigjools> thank you, ok let me explain where I got to
<fwereade__> bigjools, cool
<bigjools> firstly - I am unclear on how to test this in a QA environment since juju checks code out of Launchpad (!)
<bigjools> that's utterly bizarre
<fwereade__> bigjools, yeah, that underlying bug has led me to break every-juju-in-the-world twice now :/
<bigjools> when LP is down, by any chance?
<bigjools> or broken trunk
<fwereade__> bigjools, nah, not even *broken*... just a significant-enough change in trunk can be deadly
<bigjools> anyway, I can't work out how to get my branch's code on there to test
<fwereade__> bigjools, you should be able to use the origin field in environments.yaml
<fwereade__>     juju-origin: lp:~fwereade/juju/set-service-constraints
<fwereade__> bigjools, which is, well, good enough for testing
<bigjools> oh boy, ok :)
<bigjools> next question, what do you know about cloud-init? :)
<fwereade__> bigjools, um, embarrassingly little :(
<bigjools> fair enough, I am trying to work out why it's crashing when I boot my node :(
<fwereade__> bigjools, my go-to technique is "vdiff with a known-good one and see if anything jumps out at me"
<bigjools> oh at which level does that config go BTW?
<fwereade__> bigjools, that's inside a given environment
<bigjools> ok
<rogpeppe> mornin' campers
<fwereade__> heya rogpeppe
<bigjools> so at the same level as admin-secret et al?
<fwereade__> bigjools, do you know exactly what is crashing on boot?
<bigjools> I don't, the logs are useless unfortunately
<fwereade__> bigjools, is that the traceback you sent in the mail or something else?
<bigjools> it just says it exited with status 1
<bigjools> no traceback
<fwereade__> bigjools, sorry... what exited with status 1?
<bigjools> no this is cloud-init crashing now, not juju
<fwereade__> bigjools, ah-ha
<bigjools> which is a pre-requisite to getting as far as juju :)
<fwereade__> bigjools, quite so
<bigjools> I suspect I need Daviey
<fwereade__> bigjools, would you pastebin me the cloud-init file, just in case?
<bigjools> fwereade__: you want the user-data it's using?
<fwereade__> bigjools, just in case anything leaps out at me
<fwereade__> bigjools, btw, are you using system_id as instance id throughout now?
<bigjools> well I changed it in launch.py but as I said, not even getting close to testing that ATM
<bigjools> too many other cloud-init changes have broken things I think
<fwereade__> bigjools, if that's all you changed and it's now killing cloud-init, it sounds interesting
<fwereade__> bigjools, ah, sorry, what else has changed?
<bigjools> not entirely sure tbh, the server guys have been busy!
<fwereade__> bigjools, ha, ok:)
<bigjools> btw why on earth is it branching code on the master node anyway? can a tarball not be pushed through?  even bzr serve on the client end would be better!
<fwereade__> bigjools, we absolutely need a sane use-the-same-code-everywhere-in-an-env story
<bigjools> no kidding :)
<fwereade__> bigjools, thinking out loud, I presume you don't know where cloud-init crashes?
<bigjools> fwereade__: I don't
<bigjools> stuff flashes up on the guest's console ... AHA
<fwereade__> bigjools, can you look on the instance and make inferences based on what's installed so far though?
<rogpeppe> fwereade__: hiya
<bigjools> vt7 has a traceback
<fwereade__> bigjools, cool
<bigjools> ImportError: No module named DataSourceMAAS.  It's the freaking s/MaaS/MAAS/ that happened recently.
 * bigjools takes it to the right channel ... :)
<rogpeppe> fwereade__: i just got that transient testing error too. i think i'll just choose a port at random.
<fwereade__> rogpeppe, cool, sounds good
<fwereade__> bigjools, ouch :(
<fwereade__> bigjools, still, progress :)
<bigjools> fwereade__: yeah, I can attack this now, I'll be back with you later!  Although having said that when I did a check-seed on the node, the user-data still had env JUJU_MACHINE_ID="0"
<fwereade__> bigjools, that should be there -- that's what it uses to poke the "machine 0 is already provisioned" data in
<fwereade__> bigjools, but it needs an instance id as well
<bigjools> oh it's not the system_id then?
<bigjools> where is instance_id conveyed?
<fwereade__> bigjools, nah, sorry: we have machine ids which are basically just ints, and instance ids which are provider-dependent
<fwereade__> bigjools, instance_id is sent in through set_instance_id_accessor and I *think* it's only used in the `juju-admin initialize` script
<bigjools> oh from zk
<bigjools> I see it now, it's set
<fwereade__> bigjools, cool, and it's a system_id?
<bigjools> ok let me fix cloud-init and then I can test this
<bigjools> it is :)
<fwereade__> bigjools, sweet
<fwereade__> bigjools, it's a public holiday for me today but I'm working the first half so I'll be around for a few hours more
<fwereade__> bigjools, just ping me if you need anything
<bigjools> fwereade__: ah ok thanks, very much appreciated
<fwereade__> bigjools, a pleasure :)
<fwereade__> rogpeppe, btw, I had a thought over the weekend: one of the big problems with the hook package is its name
<fwereade__> rogpeppe, because hooks themselves are really only very tangentially related to what it's doing
 * rogpeppe always likes a good name change
<fwereade__> rogpeppe, I'm starting to think that the best place for this code is cmd/server
<fwereade__> rogpeppe, but there's probably an even better place I haven't thought of yet
<rogpeppe> fwereade__: does this code actually need its own package in fact?
<rogpeppe> fwereade__: couldn't it just go into the unit agent package
<fwereade__> rogpeppe, we don't have a unit agent package: you didn't want one :p
<rogpeppe> fwereade__: lol
<rogpeppe> fwereade__: well, then in the place that has that
<fwereade__> rogpeppe, that's in cmd/jujud and I don't think that's the right place
<rogpeppe> fwereade__: no?
<fwereade__> rogpeppe, I have a forthcoming cmd/server which is only connected to jujud in that a process invoked by jujud will happen to run the server
<fwereade__> rogpeppe, and it's starting to seem that the server, the tool execution context, and the tool implementations themselves should probably all go in there
<rogpeppe> fwereade__: sorry, i think i lost the implication: cmd/server is a command?
<fwereade__> rogpeppe, it may be that we want a main package/func in cmd/jujuc, and then to stick it in cmd/jujuc/server
<fwereade__> rogpeppe, it's not really, no, but it is a "command server" and it "serves" cmd/Commands
<fwereade__> rogpeppe, ...but they're purely for use by jujuc, so jujuc/server may be clearer
<rogpeppe> fwereade__: i think that for our own sanity the subdirectories under cmd should all be main packages
<rogpeppe> but cmd/jujuc/server might work
<fwereade__> rogpeppe, it seemed that if I tried to add cmd/jujuc/server when there wasn't any code in cmd/jujuc, go just ignored it
<fwereade__> rogpeppe, is that expected or did I do something wrong?
<rogpeppe> fwereade__: ignored it when you did what?
<fwereade__> rogpeppe, go didn't run the tests in cmd/jujuc/server when I put the server code in there with jujuc otherwise empty
<rogpeppe> fwereade__: if it does that, it's a bug
<fwereade__> rogpeppe, hm, that was when running go test .../cmd/...
<rogpeppe> fwereade__: it should still work. let me check.
<fwereade__> rogpeppe, or more likely that I did something wrong :p
<fwereade__> rogpeppe, but I was expecting at least a "you're stupid, I'm not doing that" message
<rogpeppe> fwereade__: it works for me.
<fwereade__> rogpeppe, then I guess I did something stupid, cmd/jujuc/server it shall be (if that makes sense to you?)
<rogpeppe> fwereade__: yeah, that makes sense. it's the server side of the jujuc commands.
<fwereade__> rogpeppe, and so that'll have Server, Context, and a whole bunch of things like LogCommand and RelationSetCommand
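A sketch of the layout being discussed, with the contents named in this exchange (the exact split is an assumption, not settled code):

    cmd/
        jujud/          # machine/unit/provisioning agents (main package)
        jujuc/          # hook tool client (main package)
            server/     # Server, Context, LogCommand, RelationSetCommand, ...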
<rogpeppe> fwereade__: the only hesitation i have is you might want it to depend on stuff internal to jujud.
<fwereade__> rogpeppe, go on... such as?
<rogpeppe> fwereade__: just a hunch
<rogpeppe> fwereade__: depends how closely the callback commands interact with the stuff in the unit agent.
<rogpeppe> fwereade__: i guess anything we need can go in an interface, and that should be fine.
<fwereade__> rogpeppe, I'm not seeing it yet; I think that when I figure out precisely what is responsible for the socket things may be rearranged slightly
<fwereade__> rogpeppe, but the only things I expect them to hit are log and state
<fwereade__> rogpeppe, I'm not sure quite how the state will get there yet but I think that's a future consideration
<rogpeppe> fwereade__: if that's the case, it's a nice clean separation and +1
<fwereade__> rogpeppe, it's my intent anyway :)
<rogpeppe> fwereade__: well the server stuff is invoked by jujud, right?
<rogpeppe> fwereade__: (well, the unit agent within jujud)
<fwereade__> rogpeppe, it will be but I don't yet know exactly how
<rogpeppe> fwereade__: ok
<fwereade__> rogpeppe, and my own hunch says that we will start to want a unit agent package around the time it all starts to get hooked up
<fwereade__> rogpeppe, incidentally, one nice effect of go I'm coming to appreciate:
<rogpeppe> fwereade__: maybe. i'm still thinking the agents are small enough they can live inside jujud. but we'll see.
<fwereade__> rogpeppe, no-unused-imports means that just by opening a file and seeing a bunch of unrelated imports you detect a smell
<rogpeppe> fwereade__: yeah
<fwereade__> rogpeppe, the unit agent is I think big enough that it'll feel wrong
<fwereade__> rogpeppe, all the lifecycle and workflow and scheduler stuff basically
<rogpeppe> fwereade__: yeah, maybe you're right.
<fwereade__> rogpeppe, the MA and the PA are probably compact enough they wouldn't feel bad really
 * rogpeppe goes to see how many lines of code the python version is
<fwereade__> rogpeppe, I could very well be wrong -- ATM the code run directly by the UA is smeared across juju.hook and juju.unit (in addition to all the state stuff etc)
<fwereade__> rogpeppe, but perhaps it isn't actually *big* enough to warrant its own package and I'm just responding to the unclear factoring
<rogpeppe>  fwereade__: yeah, that's quite a lot of code actually.
<rogpeppe> fwereade__: i'm wondering that with server and jujuc factored out the actual core unit agent code might be reasonably compact.
<rogpeppe> fwereade__: i.e. the core lifecycle, workflow and scheduler stuff.
 * fwereade__ is cautiously optimistic
<rogpeppe> fwereade__: it *feels* compact in my head, but that's probably because i'm not familiar with it :-)
<fwereade__> rogpeppe, it's fiddlier than it looks
<fwereade__> rogpeppe, as I discovered when I thought "yeah, I'll pick up agent upstartification, how hard can it be?"
<rogpeppe> fwereade__: yeah. it's probably the fiddliest bit of the whole system, right?
<fwereade__> rogpeppe, yeah, I think so
<rogpeppe> fwereade__: but i guess it's that bit which is really what makes juju juju.
<fwereade__> rogpeppe, but *even then* I think it's that the unit agent itself is intrinsically fiddly, and so a jujud/unit subpackage might be just the ticket
<fwereade__> rogpeppe, yeah, it's all about the agents :)
<rogpeppe> fwereade__: i was thinking it's all about mapping juju state transitions to shell scripts...
<fwereade__> rogpeppe, there are indeed many valid perspectives :)
<rogpeppe> fwereade__: a review for you, if you choose to accept it: https://codereview.appspot.com/5853048/
<fwereade__> :)
<fwereade__> rogpeppe, I have a few from the other day
<rogpeppe> fwereade__: unfortunately it breaks the environs/ec2 amazon tests. but i think fixing that is for another review.
<fwereade__> rogpeppe, I hope you like how hook/context turned out after discussing with niemeyer for a while
<rogpeppe> fwereade__: oh yeah, from friday. i'll have a look - i've been pointedly avoiding looking at my email this morning...
<rogpeppe> fwereade__: oh, i did see that you'd made some changes that i wasn't expecting
<fwereade__> rogpeppe, as long as we don't end up *merging* broken stuff I'm fine with that :)
<rogpeppe> fwereade__: ExecInfo went away - i'm happy to see it, but i didn't see any discussion about it.
<fwereade__> rogpeppe, the crucial insight is that this really is only very slightly related to hooks in the first place
<rogpeppe> fwereade__: was that your G+ conversation with gustavo?
<fwereade__> rogpeppe, but it took me a while longer to think "maybe this shouldn't be in the 'hook' package at all"
<fwereade__> rogpeppe, that was what crystallised it, yeah
<rogpeppe> fwereade__: cool. i was like "i thought i didn't manage to convince you, but you've gone and done it anyway... how did *that* happen?!"
<fwereade__> rogpeppe, and it now makes me think that Context.ExecHook is what we'll need in the end but until it has a client I'm comfortable as it is
<rogpeppe> fwereade__: yeah, i'm happy how it looks now.
<fwereade__> rogpeppe, the leap was too great for me to see while I was still thinking it was about hooks
<fwereade__> rogpeppe, once you forget about hooks the rightness of your approach is clear
<rogpeppe> fwereade__: i still quite liked Exec and vars being methods on Context.
<fwereade__> rogpeppe, if you're OK with that I'll gladly put them back on
<rogpeppe> fwereade__: yeah, i'm very happy with that.
<rogpeppe> fwereade__: they're tied closely enough to Context that i think they work well as methods on it.
<rogpeppe> fwereade__: and it's trivial to factor them out later if we want.
<rogpeppe> s/want/need/
<fwereade__> rogpeppe, I'm thinking that if I do that I will move them into cmd/jujuc/server as well, may as well start as I mean to go on
<fwereade__> rogpeppe, at which point I think the methods actually become ExecHook and hookVars
<rogpeppe> fwereade__: doesn't Context move into jujuc/server too?
<fwereade__> rogpeppe, yes, exactly
<rogpeppe> fwereade__: so they can still be Context.Exec and Context.vars if you like
<fwereade__> rogpeppe, I'm not sure, I think they become an "alien" concept once it's under jujuc
<rogpeppe> fwereade__: hmm, i dunno. if they were appropriate as methods on Context before, i don't really see why that's changed when Context has moved.
<fwereade__> rogpeppe, sorry: they're still context methods, but they should change their names to make it clear that they're about hooks (not the jujuc tools themselves, which will only be called as side effects if you like)
<rogpeppe> fwereade__: ok, that makes sense.
<fwereade__> rogpeppe, cool
<rogpeppe> fwereade__: one thought: maybe "RunHook" rather than "ExecHook"
<fwereade__> rogpeppe, perfect
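A minimal Go sketch of the Context discussed above, using the names from this exchange (RunHook, hookVars); the fields, environment variables, and signatures are assumptions for illustration, not the final API:

    package server

    import (
        "fmt"
        "os"
        "os/exec"
        "path/filepath"
    )

    // Context carries the state shared between a running hook and the
    // jujuc command server.
    type Context struct {
        UnitName   string
        SocketPath string // socket the jujuc tools call back on
    }

    // hookVars builds the environment a hook runs with.
    func (ctx *Context) hookVars(charmDir string) []string {
        return append(os.Environ(),
            "CHARM_DIR="+charmDir,
            "JUJU_UNIT_NAME="+ctx.UnitName,
            "JUJU_AGENT_SOCKET="+ctx.SocketPath,
        )
    }

    // RunHook executes the named hook inside charmDir.
    func (ctx *Context) RunHook(hookName, charmDir string) error {
        cmd := exec.Command(filepath.Join(charmDir, "hooks", hookName))
        cmd.Dir = charmDir
        cmd.Env = ctx.hookVars(charmDir)
        if err := cmd.Run(); err != nil {
            return fmt.Errorf("hook %q failed: %v", hookName, err)
        }
        return nil
    }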
<bigjools> fwereade__: http://pastebin.ubuntu.com/890372/
<bigjools> fun!
<fwereade__> bigjools, we're making progress though
<bigjools> slow!
<fwereade__> bigjools, I think that just means that the resource-uri/system-id confusion ran deeper
<fwereade__> bigjools, would a resource-uri be unique and immutable in the same way as system-id is?
<fwereade__> bigjools, if so it is probably a more convenient representation and would allow you to forget about system-id entirely?
<bigjools> yes, resource_uri is just a URL with the system_id in there somewhere
<fwereade__> bigjools, ok: that makes it sound like you can drop the notion of system-id entirely and just use resource-uri as instance_id throughout
<fwereade__> bigjools, sorry, poor advice before
<bigjools> I am seriously confused
<fwereade__> bigjools, sorry, let me step back a mo
<fwereade__> bigjools, a juju machine id is really entirely abstract -- it's a predictable way for us to refer to specific machines internally, regardless of whether or not they're actually provisioned
<fwereade__> bigjools, so it's basically just an int
<fwereade__> bigjools, we maintain a mapping between machine ids and provider-specific instance ids (I forget exactly how it's stored)
<fwereade__> bigjools, and the provisioning agent keeps an eye on that mapping
<bigjools> ok so far
<fwereade__> bigjools, and provisions new instances in response to seeing machine states which *aren't* yet associated to an instance
<fwereade__> bigjools, once it's provisioned an instance for a juju machine, it sticks it in the mapping
<bigjools> ok
<fwereade__> bigjools, I am not aware of any restrictions on the format of instance-id -- I don't think we ever try to parse them
<fwereade__> bigjools, so the only relevant property of instance-id is that it affords a convenient way to talk to the provider about a specific instance
<fwereade__> bigjools, system-id was that (or near enough) in the orchestra provider, which is why I suggested that it should be the case here
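A conceptual sketch (Go, not actual juju code) of the mapping just described; the resource_uri value is a placeholder:

    package main

    import "fmt"

    // provisioned maps juju machine ids (plain ints) to provider-specific
    // instance ids. juju stores the instance id but never parses it, so for
    // MAAS it could be a system_id or a resource_uri -- the only requirement
    // is that one representation is used consistently.
    var provisioned = map[int]string{}

    func main() {
        // At bootstrap, machine 0 is recorded as already provisioned so the
        // provisioning agent doesn't try to provision an instance for it.
        provisioned[0] = "/MAAS/api/1.0/nodes/<system_id>/" // placeholder
        fmt.Println(provisioned[0])
    }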
<bigjools> oh hmmm
<bigjools> not sure the checkout worked ok from cloud-init
<fwereade__> bigjools, if you have enough information to construct a resource-uri given (1) a system-id and (2) the maas provider details
<bigjools> bzr: ERROR: A control directory already exists: "file:///usr/lib/juju/juju/".
<fwereade__> bigjools, huh, not seen that, maybe it's just reacting to droppings from a previous attempt?
<bigjools> yes
<bigjools> I neglected to wipe properly
<fwereade__> bigjools, anyway, if you *can* construct the uri given system-id then it might make sense to keep system-id, but I don't have a firm handle on whether or not that's actually a good idea
<bigjools> fwereade__: well this is how it was originally, right?
<bigjools> I was setting the machine_id as the resource_uri
<fwereade__> bigjools, yeah but if it's not the best fit for the problem it should change
<bigjools> still confused tbh since I don't know what's going on in the depths
<fwereade__> bigjools, the problem is that the MaaSMachine thinks system_id is the instance id, while other parts of the code think that resource_uri is
<bigjools> what is it doing with the machine_id later?
<bigjools> e_toomanyids
<fwereade__> bigjools, machine id is I think a red herring here
<fwereade__> haha
<bigjools> so what is: cloud_init.set_instance_id_accessor() doing? I thought it set machine_id?
<fwereade__> bigjools, nope: instance_id
<bigjools> so its name has a clue :)
<fwereade__> bigjools, the clue's in the name :p
<bigjools> when instance_id is looked up later, how is it used?
<fwereade__> bigjools, give me a mo, double-checking
<fwereade__> bigjools, it's only actually used by the provisioning agent AFAICT
<fwereade__> bigjools, the only reason it intrudes on your consciousness at all is because we need to fake up initial state on bootstrap, to say "machine id 0 is already provisioned on instance id WHATEVER", and prevent the PA from trying to provision itself
<fwereade__> bigjools, that is done by `juju-admin initialize` -- grep for that and you should see how set_instance_id_accessor is relevant
<bigjools> fwereade__: sorry, total PC lockup :/
<fwereade__> bigjools, np
<bigjools> I have a call in 4 minutes
<fwereade__> <fwereade__> bigjools, give me a mo, double-checking
<fwereade__>  bigjools, it's only actually used by the provisioning agent AFAICT
<fwereade__>  bigjools, the only reason it intrudes on your consciousness at all is because we need to fake up initial state on bootstrap, to say "machine id 0 is already provisioned on instance id WHATEVER", and prevent the PA from trying to provision itself
<fwereade__> <-- bigjools has quit (Read error: Connection reset by peer)
<fwereade__> <fwereade__> bigjools, that is done by `juju-admin initialize` -- grep for that and you should see how set_instance_id_accessor is relevant
<fwereade__> bigjools, I should still be around afterwards unless it's *really* long
<bigjools> 20 mins
<fwereade__> bigjools, just grab me when you're free then :)
<bigjools> ok thanks
<fwereade__> rog, thinking about your review
<rog> fwereade__: cool, thanks
<fwereade__> rog, there are quite a lot of tests that start by Initializing a State
<fwereade__> rog, and the required data for initialization will become more complicated
<fwereade__> rog, so we will at some stage want a testing.InitializeState(addrs string) function, but maybe it's not justified yet
<fwereade__> rog, OTOH when we do need it, if it already exists, it'll be just one place to change
<rog> fwereade__: i think i'd leave that until we need it
<rog> fwereade__: it's trivial to find occurrences and to add
<rog> fwereade__: in fact, won't Initialize need to take an addrs method?
<rog> s/method/argument/
<fwereade__> rog, it already does (implicitly, in the Info), I think
<rog> fwereade__: ah, so what would testing.InitializeState give us?
<fwereade__> rog, but the eventual required args to Initialize will be more complicated than to Open
<rog> fwereade__: ok. what other stuff will it have?
<fwereade__> rog, at the very least we need the instance id, to set up the state I've been talking to bigjools about
<fwereade__> rog, and I'm 99% sure that we'll end up passing in the environment settings too, imminently
<fwereade__> rog, like must-be-done-for-12.04-imminently
<fwereade__> rog, maybe that's not too much to duplicate
<fwereade__> rog, after all, dummy provider env settings are going to be basically empty
<rog> Initialize is only called in three places AFAICS. when the duplication becomes a burden we can factor it out.
<rog> fwereade__: for now, let's not add stuff that we don't need.
<TheMue> fwereade__: just one question after reading niemeyers comment to my last proposal: when i've got two pingers pinging the same node and i tell one to kill its work, the second one will recreate the node, won't it?
<fwereade__> rog, so it will probably be `(info *Info, instanceId, providerType string)`
<fwereade__> TheMue, it should do, but 2 pingers on the same node is Doing It Wrong
<rog> TheMue: there should never be two pingers pinging the same node :-)
<fwereade__> TheMue, what are you trying to accomplish?
<fwereade__> rog, (yes indeed, it's not called for, ty for discussing :))
<fwereade__> rog, why a 3 minute timeout?
<TheMue> fwereade__: niemeyer found a problem with retrieving an instance of Agent() twice
<fwereade__> TheMue, go on
<TheMue> fwereade__: you get two different instances today
<TheMue> fwereade__: which, when keeping a pinger inside, indeed isn't good
<fwereade__> TheMue, yeah, makes sense; I thought you were taking the pinger out anyway?
<TheMue> fwereade__: on the other hand he suggested an api change to return a pinger with agent.StartPinger()
<rog> fwereade__: because it takes about 2 minutes to boot, and 3 minutes seemed long enough for the zk node to be inited after boot (maybe it's not and that's why my test is failing). the test harness fails after 6 minutes.
<TheMue> fwereade__: but here the problem stays the same
<rog> TheMue: there's no problem if the agent doesn't cache the pinger
<rog> TheMue: i think
<TheMue> rog: it's exactly the same problem
<rog> TheMue: what's the problem?
<TheMue> rog: in both cases it's an illegal usage of the api
<fwereade__> rog, I *think* that we have 2 interesting cases: on the instance, if any code is running before initialize is complete we Have A Problem
<TheMue> rog: if i create two agent instances or two pinger instances, both are wrong
<rog> TheMue: i don't think you can stop that. it's a distributed system.
<fwereade__> rog, and if we're connecting from outside I think we want to wait forever and let the user interrupt us
<TheMue> rog: Pinger has the method Kill()
<fwereade__> TheMue, why would you ever create 2 pingers for the same node anyway?
<rog> TheMue: that's fine. that's to kill that particular pinger
<TheMue> fwereade__: ask niemeyer why one would create two agents for the same unit anyway
<rog> fwereade__: ok. i added the timeout as an afterthought because my test was timing out after 6 minutes. but maybe that was correct, and i should just up the test harness timeout time.
<fwereade__> TheMue, why is agent different to any other state class? you can have N state.Units referring to the same ZK state and that shouldn't be a problem
<TheMue> fwereade__: i'm only saying that if the one way is an error, that error won't go away by returning the pinger
<TheMue> fwereade__: so it should be with pinger too
<fwereade__> rog, I'm not *sure* that my analysis is correct, give it a bit of a mental kicking
<fwereade__> TheMue, it's always possible to write code that does the wrong thing
<rog> TheMue: returning the pinger seems good to me. it means that the Agent doesn't need to keep track of that state - it's less code and no less correct IMHO
<fwereade__> TheMue, in practice the unit agent process will call StartPinger once and only once, and that's it
<fwereade__> TheMue, and the agent process itself will decide when it needs a Stop/Kill
<rog> fwereade__: how long does the bootstrap node take to come up and be usable, usually?
<TheMue> fwereade__: so why return the pinger?
<fwereade__> rog, I've never actually measured it
<rog> [09:47] <rog> TheMue: returning the pinger seems good to me. it means that the Agent doesn't need to keep track of that state - it's less code and no less correct IMHO
<rog> fwereade__: maybe i'll take the timeout out again.
<fwereade__> rog, it may be there's some case I missed
<fwereade__> TheMue, what rog said :)
<rog> fwereade__: no, i think you're right. i guess i thought that three minutes waiting after zk connect *should* be fine. surely we don't take that long to start up the juju init command after starting zookeeperd?
<fwereade__> TheMue, it may be we have some disconnect on how we expect state.Agent to be used?
<TheMue> rog: i only have in mind the poor maintainer, new to the code two years from now: asking state for a unit, asking the unit for an agent, asking the agent to start a pinger (why a pinger? i'm only interested in signalling that the agent is alive, so what does a pinger have to do with it?) and then keeping the pinger
<rog> TheMue: that's what a pinger *does*.
<fwereade__> rog, that sounds right
<rog> TheMue: (i wasn't happy with the name "Pinger" (i preferred "Occupy" and "Occupied" but gustavo's choice)
<rog> )
<TheMue> rog: it's a technological description of how it works. but when i drive a car i'm not interested in how the motor works, i wanna drive from a to b
<rog> TheMue: i know that
<rog> TheMue: but that's a debate to have about Pinger, i think.
<TheMue> rog: my intention is to hide HOW we do something but to tell WHY we do it
<fwereade__> TheMue, yeah, I liked Occupy too
<rog> TheMue: if StartPinger was called "Occupy", would you be happier?
<TheMue> rog: the pinger is a fine tool, i only have the opinion that i have to keep the tool inside to provide a clean api regarding agent (and later anything else) for the user of this api
<rog> TheMue: the pinger is *the* tool for detecting and signalling agent occupation
<fwereade__> rog, TheMue: `RegisterPresence() (*presence.Pinger, error)`?
<TheMue> rog: yes, this way it makes more sense
<rog> fwereade__: yeah, that would be fine for me.
<TheMue> fwereade__: i still wouldn't return the pinger. i would hide it.
<rog> TheMue: why hide it?
<rog> [09:54] <rog> TheMue: the pinger is *the* tool for detecting and signalling agent occupation
<bigjools> fwereade__: ok so I'm free now
<rog> TheMue: we've built this abstraction, why not use it as is?
<TheMue> rog: in this case the name isn't optimal
<rog> TheMue: otherwise perhaps we should build it slightly differently, so we *can* use it as is.
<fwereade__> TheMue, the trouble is that it ends up making state.Agent unique among state.FOOs in that it's not something you can reconstruct safely from a fresh state with nothing but keys
<fwereade__> bigjools, where were we? was I making sense? ;)
<bigjools> fwereade__: unfortunately not :)
<rog> TheMue: i don't think we should get hung up on the name.
<TheMue> fwereade__: that's why i wanted to embed it. btw, now the pinger (or better the AgentOccupier) is special too.
<bigjools> but I need to re-establish my test env, so I'll be a few mins
<fwereade__> bigjools, heh, ok: did I ever start making sense, or was there a specific point where I started babbling crackfully?
<fwereade__> TheMue, pinger is not just for agents
<rog> TheMue: i don't see that hiding it gains anything.
<bigjools> fwereade__: it's not you, more that I don't really understand what's going on inside juju when it deploys stuff
<fwereade__> rog, I think that hiding it keeps the name out of the way, and the name exposes the implementation too much for comfort
<TheMue> fwereade__: that's ok, that's how i understood it at first. that's why i wanted to encapsulate it for agent, so that the agent api is clear
<fwereade__> bigjools, ultra-high-level sketch:
<rog> fwereade__: because it's called "Pinger" rather than "Occupier"?
<fwereade__> bigjools, the user makes changes to an "ideal" state stored in ZK and the PA starts/kills machines in response to changes in the ideal state
<fwereade__> rog, exactly (or some other name, whatever ;))
<rog> i do think that "Pinger" is an unfortunate name because it implies polling, and we might use some other technique in the future. but...
<fwereade__> bigjools, that's the steady state and it's pretty simple really (devil in details ofc)
<rog> i think that that package is exactly the right place for the thing returned from an Agent.
<fwereade__> bigjools, the ugliness comes at bootstrap time
<fwereade__> rog, agreed
<rog> TheMue, fwereade__: if we're writing more code just to hide a name that we've only just invented, let's just change the name!
<rog> TheMue: but you can be the one to persuade gustavo :-)
<fwereade__> bigjools, the PA is responsible for making sure that the machines which should exist do exist; and machine 0 is just another part of the environment, we don't want to have to treat it specially
<TheMue> rog: simply changing the name isn't enough if it's still a multipurpose tool
<TheMue> rog: i'm talking about encapsulation and api design
<fwereade__> bigjools, so before we let the PA look at state, we prime the state such that it sees "machine 0 is meant to be provisioned... and, hey, it already is"
<rog> TheMue: it's a multipurpose tool that is designed for signalling presence on whatever underlying storage system we're using. that's *exactly* what the agent presence stuff is about.
<fwereade__> bigjools, doing so involves storing the instance id and the machine id together
<fwereade__> bigjools, hence the requirement for instance_id at bootstrap time
<rog> TheMue: so it seems perfect that it's that that's returned from Agent.
<fwereade__> bigjools, instance id should in all other circumstances purely be an internal detail
<rog> TheMue: we're adding more layers of abstraction "just in case", but YAGNI!
<fwereade__> rog, TheMue: strongly agree that it's not up to state.Agent to stop the pinger
<fwereade__> bigjools, but sadly you need to deal with it at bootstrap time
<TheMue> fwereade__: indeed not, it's up to the user of agent (he has got his pinger from agent) to also use it to signal "hey, it's me, the agent, i'm stopping".
<bigjools> fwereade__: the traceback is from a "deploy" though
<fwereade__> bigjools: huh, sorry, let me reread
<rog> TheMue: that sounds right to me
<fwereade__> bigjools, right, sorry: the trouble is that you're still using 2 different notions of instance_id
<rog> TheMue: pinger := unit.Agent().StartPinger(); .... pinger.Kill()
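Expanded slightly, that usage might look like this inside an agent (a sketch only: StartPinger's signature and the surrounding types are assumed from this discussion, and the state package import is elided):

    // runAgent sketches an agent's use of the proposed API.
    func runAgent(unit *state.Unit) error {
        pinger, err := unit.Agent().StartPinger()
        if err != nil {
            return err
        }
        // the agent process itself decides when to Stop/Kill the pinger
        defer pinger.Kill()
        // ... do the agent's work while the pinger signals presence ...
        return nil
    }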
<fwereade__> bigjools, either always use system_id, or always use resource_uri
<bigjools> fwereade__: where am I using those?
<TheMue> rog: so to me an api like agentAPI.SignalWork() and agentAPI.SignalEndOfWork() sounds more natural
<bigjools> just launch.py
<bigjools> ?
<rog> TheMue: that makes the agentAPI stateful, which it doesn't need to be
<fwereade__> bigjools, (1) MaaSMachine turns system_id into MaaSMachine.instance_id
<rog> TheMue: the state can live in the pinger.
<fwereade__> bigjools, (2) the provider takes instance_ids in some methods
<TheMue> rog: the pinger IS stateful, and the pinger IS part of the state api today
<fwereade__> bigjools, (3) you also need to set one at bootstrap time
<fwereade__> bigjools, I think that's it
<rog> TheMue: there are two places that are stateful: the underlying zk tree, and the local pinger state. you'd be adding a third.
<fwereade__> bigjools, (4) it'll be *stored* using a juju.state.machine.MachineState but you should be 100% insulated from that detail
<TheMue> rog: i don't add one, i only hide the already existing one from the user of the agent api
<bigjools> fwereade__: is this provider api documented anywhere?  are there instructions on adding providers?
<rog> TheMue: you do add one. you store whether the pinger has already started so agentAPI.SignalWork can return an error if so.
<TheMue> rog: instead of keeping an instance of the pinger he can also keep an instance of the agent, it's the same
<bigjools> because so far I scratched around
<bigjools> and clearly my understanding of things isn't right
<TheMue> rog: so where's the difference?
<fwereade__> bigjools, only the stuff in juju.providers.common.base; I'm sorry if it's unhelpful :(
<fwereade__> TheMue, but why make state.Agent special in this way?
<bigjools> fwereade__: not entirely unhelpful, but a bit lacking :)
<rog> TheMue: it's not needed. and there's nothing stopping you (possibly on another machine) creating another State and another Agent and (erroneously) starting another pinger on that.
<fwereade__> TheMue, every other state.Foo is safe to grab a fresh instance of at any time
<rog> TheMue: just checking local usage isn't sufficient.
<fwereade__> TheMue, making state.Agent special even in this one circumstance feels like a bad move
<bigjools> fwereade__: I'm really sorry, I am still struggling to get my head around all this ID stuff.  It's massively confusing :(
<fwereade__> bigjools, I definitely remember it was a pain for a while :(
<bigjools> not sure where to go from here
<TheMue> rog: so nothing stops you from creating a new unit, getting a new agent, and creating a second pinger. it's the same mistake, the same wrong usage, as you said yourself above.
<rog> TheMue: sure.
<TheMue> fwereade__: pinger is already special
<fwereade__> TheMue, then why allow the specialness to affect Agent?
<fwereade__> bigjools, ok, mechanical solution
<fwereade__> bigjools, grep the maas provider for uses of instance_id
<bigjools> ok
<fwereade__> bigjools, ensure that they all come from consistent sources -- they should all either be coming from a system_id, or from a resource_uri
<fwereade__> bigjools, it seems most of the provider was written to expect resource_uri
<TheMue> fwereade__: as said above, only to keep together what belongs together, and not to tell a "start this" but later tell b "stop this". instead i wanna have it bundled in a clean way: a "start this" and later a "stop this"
<fwereade__> TheMue, that's exactly what Pinger offers
<TheMue> fwereade__: no, the design after the last review shall be Agent.Start and Pinger.Stop
<TheMue> fwereade__: that's pain for a maintainer later who isn't involved in the design process today
<fwereade__> TheMue, it's pinger := agent.StartPinger() and pinger.Stop(), which doesn't appear to me to be unclear
<fwereade__> TheMue, were it agent.Start and pinger.Stop I'd agree
<fwereade__> bigjools, does that help at all?
<fwereade__> bigjools, it *should* just be s/system_id/resource_uri/ in maas.machine, and maas.launch
<TheMue> fwereade__: it's StartPinger(), indeed, i only shortened it
<bigjools> fwereade__: ok let me digest that
<fwereade__> TheMue, that may be the problem, it's not actually starting the agent at all
<TheMue> fwereade__: but where is the problem with agent.Occupy() and agent.Release()?
<fwereade__> TheMue, I have *no* problem with it, that was proposed
<fwereade__> TheMue, niemeyer didn't like it
<fwereade__> TheMue, I think he may have suspected me of political activism
<fwereade__> TheMue, OCCUPY JUJU
<TheMue> fwereade__: hehe
<fwereade__> TheMue, ah wait sorry
<TheMue> fwereade__: but i also proposed different names.
<fwereade__> TheMue, I *do* have a problem with agent.occupy/release
<fwereade__> TheMue, but I've tried to explain it -- the mismatch with the other state types -- and I think I'm not communicating it well
<bigjools> fwereade__: I don't understand what you mean by " ensure that they all come from consistent sources"
<fwereade__> TheMue, or possibly you disagree?
<TheMue> fwereade__: who will start the pinger and what is his intention behind it?
<fwereade__> bigjools, 1 sec
<rog> fwereade__, TheMue: for me, it's just less code and easier to get correct if you return the pinger.
<rog> which usually swings the argument for me.
<fwereade__> TheMue, the agent process will start it, and the agent process will know how it means to stop and if/how it should kill the pinger
<fwereade__> TheMue, I don't think stop/kill is a sensible or meaningful distinction on the *agent* itself
<TheMue> rog: and once we change to a different method, all code which isn't interested in a pinger but in working agent functionality has to be changed?
<fwereade__> TheMue, I'd even say that the only reason to put StartPinger on there is so that we can keep the knowledge of the actual agent presence path entirely internal
<TheMue> fwereade__: but WHY does the agent start the pinger? what's his intention?
<rog> TheMue: what do you mean by "a different method"?
<fwereade__> TheMue, to signal presence... is that a trick question? ;)
<TheMue> fwereade__: ah, so a better name than Occupy() may be SignalPresence()?
<fwereade__> TheMue, I suggested something like that above
<TheMue> fwereade__: exactly ;) just a rhetorical question
<fwereade__> TheMue, RegisterPresence maybe?
<rog> [09:54] <fwereade__> rog, TheMue: `RegisterPresence() (*presence.Pinger, error)`?
<rog> :-)
<fwereade__> TheMue, I still feel that returning a PresenceRegisterer (eww) is the right thing
<fwereade__> TheMue, for now we have just one and we call it a Pinger
<rog> fwereade__: yeah.
<rog> fwereade__: seems ok to me.
<fwereade__> TheMue, it exposes an implementation detail but not gratuitously
<TheMue> rog: yep, only wanted to come back after Occupy()
<fwereade__> TheMue, we could handle it as a KillerStopper interface if we had any reason to worry about implementation changes
<rog> fwereade__: it's a pity it's called Pinger, but if you half-close your eyes and imagine it's called something less gratuitously implementation-specific, i think it works.
<TheMue> fwereade__: i'm no friend of exposing implementation details
<fwereade__> TheMue, but for now, an extra interface is gratuitous
<TheMue> fwereade__: i prefer a symmetric handling where i can also tell agent to DeregisterPresence() regardless of the used underlying technology
<rog> we've designed an interface (presence) that is specifically about agent occupation. if we feel we have to hide it because it's "too implementation-specific" then we've got it wrong.
<fwereade__> TheMue, I may have to bow out and suggest you propose a name change to niemeyer; you're in a better position to advocate for it than I am, you're an actual client
<fwereade__> bigjools, sorry :)
<TheMue> rog: above you told that it's not only for agent
<fwereade__> bigjools, I *think* there are only 2 places where instance_ids enter the system
<rog> TheMue: that was fwereade__ i think
<bigjools> fwereade__: I think the basic problem is that it's unclear as to what exactly each provider's API is in terms of what data goes in and out
<fwereade__> bigjools, one of them is in MaaSMachine, where it passes d["system_id"] up to ProviderMachine.__init__
<rog> TheMue: for now, why don't we go with the KISS approach? we're using a statically typed language - this stuff can easily be changed in the future.
<bigjools> so your s/system_id/resource_uri/ still doesn't make sense to me. :(
<fwereade__> bigjools, the other one is CloudInit.set_instance_id_accessor
<rog> TheMue: (if we feel it's necessary)
<rog> TheMue: it's not like we'll have lots of clients using the Agent type.
<fwereade__> bigjools, when the PA is looking at machines it looks them up by machine id and gets back data that includes the instance id
<fwereade__> bigjools, to interact with the provider, it uses the instance id
<fwereade__> bigjools, the only places that information enters the system are (1) set_instance_id_accessor and (2) MaaSMachine.__init__
<fwereade__> bigjools, and that information is passed back into provider methods that take instance ids (or possibly MaaSMachines, which themselves have instance_ids set)
<fwereade__> bigjools, so whatever you set in .machine and .launch will get passed back into the provider as instance_id
<fwereade__> bigjools, oh, hold on, I need to look more closely at the maas provider
<TheMue> rog: maybe you're right. but i only have my experiences of maintaining legacy code, also in static languages. and juju too will one day be legacy.
<bigjools> fwereade__: holding
<fwereade__> bigjools, ha: juju/providers/maas/tests/test_maas.py|64| # TODO: Add test for get_nodes with a system_id parameter.
<bigjools> :/
<TheMue> rog: but indeed, maybe it isn't worth it, maybe Unit and Machine should use pinger directly instead of an extra Agent. the old mixin approach led in a wrong direction.
<rog> TheMue: that was my thought originally, which is why i suggested you put this branch on hold for a while. but still, i think it does add some useful encapsulation, so it's still useful.
<fwereade__> bigjools, ok, I was confused (er, and may still be)
<fwereade__> bigjools, correct me if I'm wrong
<fwereade__> bigjools, scratch that
<fwereade__> bigjools, look at MaaSClient
<fwereade__> bigjools, get_nodes expects system ids
<rog> [08:37] <fwereade__> rogpeppe, as long as we don't end up *merging* broken stuff I'm fine with that :)
<fwereade__> bigjools, start, stop, release expect resource uris
<rog> fwereade__: in this case, the broken stuff is already merged...
<fwereade__> bigjools, juju is passing its idea of "instance_id" in in both those cases
<rog> fwereade__: (i just had a test time out after 20 minutes - i'm sure the zk should have been initialised within that time scale)
<fwereade__> rog, yeah, my instinct says that the Right Thing to do if not inited is context dependent
<fwereade__> rog, agents should just throw a hissy fit
<rog> fwereade__: yeah, i think you're right.
<bigjools> fwereade__: hummm
<rog> fwereade__: hmm, actually maybe that means that agents shouldn't wait at all
<fwereade__> rog, command line tools may want to time out sensibly, or just wait for ^C
<rog> fwereade__: maybe that means we should have Open and WaitOpen
<fwereade__> rog, I think that may be the case
<fwereade__> rog, +0.5, thinking
<rog> fwereade__: and perhaps WaitOpen could have a timeout duration argument
<fwereade__> rog, for now I'd leave it; not hard to add once we need it
<rog> fwereade__: k
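A sketch of the Open/WaitOpen split as discussed here (Info and State are stand-ins for the real types; the retry interval and error value are assumptions):

    package state

    import (
        "errors"
        "time"
    )

    type Info struct{ Addrs []string }
    type State struct{}

    var errNotInitialized = errors.New("state not initialized")

    // Open fails fast if the state hasn't been initialized -- suitable for
    // agents, which should never start before `juju-admin initialize` runs.
    func Open(info *Info) (*State, error) {
        // ... connect to zookeeper and check for the initialized marker ...
        return nil, errNotInitialized // placeholder
    }

    // WaitOpen retries until the state is initialized or timeout expires --
    // suitable for clients connecting right after bootstrap.
    func WaitOpen(info *Info, timeout time.Duration) (*State, error) {
        deadline := time.Now().Add(timeout)
        for {
            st, err := Open(info)
            if err != errNotInitialized {
                return st, err
            }
            if time.Now().After(deadline) {
                return nil, errors.New("timed out waiting for initialization")
            }
            time.Sleep(200 * time.Millisecond)
        }
    }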
<fwereade__> rog, re broken stuff, I think I missed the import
<rog> fwereade__: ?
<fwereade__> rog, if it makes things *worse* we should merge it at the same time as some other branch that also fixes it
<fwereade__> rog, if it is a step towards making a broken thing less broken it's fine
<rog> fwereade__: it *exposes* some existing brokenness.
<fwereade__> rog, (er, "import" in the sense of "importance")
<rog> lol
<rog> fwereade__: i.e. it seems that the existing bootstrap stuff isn't actually succeeding in getting juju init to be called.
<fwereade__> rog, in that case I would prefer that it not be merged without another branch that either hides or fixes the brokenness
<rog> fwereade__: which i wasn't testing for before
<fwereade__> rog, Initialize doesn't work itself, so this is all somewhat moot
<rog> fwereade__: given that amazon tests weren't working until one merge ago, i don't think it's too bad to regress for a small while.
<rog> fwereade__: (i doubt anyone besides myself has actually run that test :-])
<rog> fwereade__: why moot?
<rog> fwereade__: we're interacting with the python backend for the time being.
<fwereade__> rog, ha true, forget I said anything
<fwereade__> rog, I'll leave it to your judgment then :)
<fwereade__> bigjools, how painful will it be to fix that?
<bigjools> fwereade__: I can't deal with this. I am tired and frustrated and about to pull all my hair out.  I am going to punt, but I don't know where
<bigjools> fwereade__: I can see there might be a mismatch of ID types but none of them make any sense to me still
<fwereade__> bigjools, you *should* only have to think about one at a time
<bigjools> fwereade__: this is one situation where I need to be in a room with someone
<fwereade__> bigjools, if it's not instance_id, you can ignore it
<fwereade__> bigjools, if it *is* instance_id, it needs either to *always* mean system_id, or *always* mean resource_uri
<rog> fwereade__, bigjools: maybe this is a time that G+ hangouts with extras (i.e. screen sharing) could be useful?
<fwereade__> bigjools, I misled you by asking you to fiddle with the input data (which did need to be done) while neglecting to notice the deeper internal mismatch in MaaSClient
<bigjools> quite possibly, but it's almost 9pm here and I am out of brain juice :(
<fwereade__> bigjools, let me think a mo
<fwereade__> bigjools, is the maas api documented at all?
<bigjools> yes, I need to find a URL for you
<bigjools> fwereade__: http://people.canonical.com/~gavin/docs/lp:maas/api.html
<bigjools> it needs regenerating, slightly out of date
<fwereade__> bigjools, if you get me that and point me to the tree you're working from I will take a quick look at it and see if I can either tell you what you need to do (or do a sketch of it myself if it's trivial enough)
<bigjools> fwereade__: I've not made any significant changes yet, still trying to work things out
<fwereade__> bigjools, without an actual maas provider to test against it will be just a sketch though I think
<bigjools> fwereade__: the api you can infer from the calls it's making in maas.py.  It's restful.
<fwereade__> bigjools, cool, I will see what I can do :)
<bigjools> which is why we need to use a combination of resource_uri and system_id :/
<bigjools> (e.g. get_nodes takes system_ids because there's no resource_uri as such)
<bigjools> so I can see where that needs fixing up
<fwereade__> bigjools, hr'm, I see the problem
<fwereade__> bigjools, it sounds like there's not much that can be done on the juju side until that's in place
<fwereade__> bigjools, bah
<bigjools> indeed
<fwereade__> blast, I put some toast on 10 mins ago, just a mo
<bigjools> fwereade__: I need to rest, migraine coming.  I'll hand this over to Gavin as he's working with me here
<fwereade__> bigjools, sorry I wasn't more help :(
<fwereade__> bigjools, tell him to ping me if he needs
<rog> fwereade__: i decided to add WaitOpen, since you mentioned it in your review :-)
<fwereade__> rog, cool
<fwereade__> rog, https://codereview.appspot.com/5832045
<fwereade__> rog, TheMue, that's me for the day (public holiday )
<rog> fwereade__: LGTM
<rog> fwereade__: cool, have fun!
<TheMue> fwereade__: enjoy
<niemeyer> Good mornings!
<rog> niemeyer: yo!
<TheMue> niemeyer: moin
<rog> TheMue: there's a review for you, BTW: https://codereview.appspot.com/5853048/
<TheMue> *click*
<rog> lc
<TheMue> rog: lgtm, only one minor note
<rog> TheMue: hmm, i don't know about the readability - i read '2e9' as 2 seconds. but i can't do 0.2 * time.Second, unfortunately.
<rog> TheMue: but i'll change it anyway
<TheMue> rog: yeah, it sadly only wants ints. maybe time.Second/5 ;)
<rog> :-)
<niemeyer> rog: We went over that before, IIRC.. I don't think we need Open and WaitOpen(..., timeout)
<rog> niemeyer: i had a discussion with william this morning
<rog> niemeyer: he thought we did...
<rog> niemeyer: so i changed it to be like that
<niemeyer> rog: Why?
 * rog goes to copy the code review comment he's making
<rog> that's what i had until this morning's discussion with william. we decided that the length of time to wait was quite consumer-dependent. in particular agents shouldn't wait at all, because they should never be started before zk is initialised. on the other hand there's not necessarily any right answer to how long to wait after bootup (i had it at 3 minutes, but william queried it, rightly i think).
<rog> niemeyer: our discussion is here (http://irclogs.ubuntu.com/2012/03/19/%23juju-dev.html) from 0940
<niemeyer> rog: Ok, I don't think that's the case..
<rog> niemeyer: ok, i can easily revert.
<niemeyer> rog: We can start agents in the same machine, or even in distributed machines, and those should wait
<niemeyer> rog: The initialization mechanism is there precisely to avoid silly races
<rog> niemeyer: when would we ever start an agent before calling juju-init ?
<niemeyer> rog: Between starting ZooKeeper and it having its data initialized
<rog> niemeyer: why would we do that?
<niemeyer> rog: ZooKeeper is started by init
<niemeyer> rog: At package installation time
<rog> niemeyer: but juju-init isn't called then, is it?
<niemeyer> rog: Maybe not, but still, there's no point
<rog> niemeyer: i think it's trivial to assure that juju-init is *always* called before any agent is started
<rog> niemeyer: and then the timeout serves no purpose
<niemeyer> rog: So why are you adding it?
<rog> niemeyer: for the client code.
<niemeyer> rog: Yeah, so it is useful.. what you're saying is that there are many cases where we won't have a race
<rog> niemeyer: yes.
<niemeyer> rog: what I'm saying is that it's pointless to consciously remove a race protection because you know that in some cases you don' t have it
<niemeyer> rog: It's like dropping a mutex because you know that in some cases you're calling the function serially
<rog> niemeyer: we're still protected - the function will return with an error.
<niemeyer> rog: Ok.. what's the benefit of removing the race protection?
<niemeyer> rog: That's not protection.. that's a crash
<rog> niemeyer: it means that if we *do* fail to juju-init first, we'll fail early
<niemeyer> rog: Sure, so let's not fail it!
<rog> niemeyer: it'll fail anyway, just 2 or 3 or 10 minutes later.
<niemeyer> rog: I'll repeat: there's no point in preventing a mechanism that avoids a race.
<niemeyer> rog: If juju-init doesn't run, it's a serious bug, and we'll know about it 3 minutes later.
<rog> niemeyer: ok. what's a good timeout, BTW.
<rog> ?
<niemeyer> rog: I'd rather have a single function, that is protected by default.
<niemeyer> rog: 3 minutes is fine for me.
<rog> niemeyer: ok. as i said, that's what i had before - i was just trying to represent fwereade__'s side of the argument...
<niemeyer> rog: There's no point in us thinking "Oh, do I have a race here? Do I use Open or WaitOpen?  How long do I use on WaitOpen?" on *every single call* of Open.
<rog> niemeyer: i had the same thought about DNSName...
<niemeyer> rog: It's the exact opposite case.
<niemeyer> rog: Open should generally complete in seconds under normally working circumstances.
<niemeyer> rog: Because whoever is starting zookeeper should initialize it shortly afterwards.
<niemeyer> rog: We're just closing that window.
<rog> niemeyer: ok.
<rog> niemeyer: PTAL
<niemeyer> rog: Checking
<niemeyer> rog: Review sent
<rog> niemeyer: what do you think about time.Sleep(0.2e9) vs time.Sleep(200 * time.Millisecond) ?
<niemeyer> rog: Hmm.. good question
<niemeyer> rog: The latter might be friendlier to newcomers. I grew used to the 0.2e9 or 2e8 notation, so I can quickly see what it means, but maybe it'd be wise to make it more readable.. do you have an opinion?
<rog> niemeyer: i also find it easy to read, but TheMue suggested less terse notation was more readable.
<rog> niemeyer: when it's used with time.Sleep i think the raw number is fine
<rog> niemeyer: with other calls, it's probably best to use explicit duration constants
<niemeyer> rog: That'd be a strange line to draw..
<rog> niemeyer: it's just a pity you can't use 0.2 * time.Second :-)
<rog> niemeyer: you're probably right
<niemeyer> rog: Yeah, it's a bit unfortunate indeed.. it should work as long as the division results in an integer
<rog> niemeyer: i don't think it does, because a floating point constant can't multiply a non-ideal constant
<rog> niemeyer: (perhaps that should be allowed in fact)
<niemeyer> rog: It really should
<TheMue> rog: tried time.Second/5?
<rog> TheMue: i'd prefer not to do that.
<rog> TheMue: 200 * time.Milliseconds is better
<TheMue> rog: especially as it reads even worse.
<TheMue> rog: yep, definitely
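The notations under discussion, side by side; Go's constant rules explain why 0.2 * time.Second is rejected:

    package main

    import "time"

    func main() {
        time.Sleep(0.2e9) // untyped constant, exactly 2e8 ns: compiles

        // time.Sleep(0.2 * time.Second) // does NOT compile: time.Second
        // is a typed integer constant, and 0.2 would be truncated

        time.Sleep(200 * time.Millisecond) // the readable spelling
        time.Sleep(time.Second / 5)        // valid too, but reads worse
    }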
<rog> niemeyer: done
<niemeyer> rog: LGTM
<rog> niemeyer: thanks
<niemeyer> rog: I hope you like the result as well
<rog> niemeyer: yeah, i was a little bit unhappy about adding WaitOpen earlier, but i was persuaded it was a good idea. you persuaded me back :-)
<niemeyer> rog: Hehe :)
<rog> niemeyer: and i was thinking about putting the initialize code in open anyway; it was a coin toss.
<rog> niemeyer: BTW this has exposed a bug in the environs/ec2 code - the juju init code never gets run, it seems. so the amazon test is currently broken.
<niemeyer> rog: Hmm
<niemeyer> rog: I suppose that's due to the lack of running the initialization code via cloud-init?
<rog> niemeyer: i'm guessing so
<niemeyer> rog: Cool, that should be easy to fix
<rog> niemeyer: my next branch hooks juju logging up to gocheck.C
<rog> niemeyer: then i'll get into the init code.
<niemeyer> Ah, cool
<TheMue> niemeyer: after some discussion about the Agent, its usage only in Unit and Machine and the role of presence and the Pinger i today would say there's no need for the Agent type anymore. instead Unit and Machine should use presence directly.
<niemeyer> TheMue: Hmm, that might be a nice improvement indeed.. have you tried it out to see how it looks yet?
<TheMue> niemeyer: i wrote some comments after your review and discussed them with rog and fwereade__
<TheMue> niemeyer: i'm still not really happy with "StartPinger()". to me it doesn't really say anything about the semantics behind it.
<TheMue> niemeyer: it's more a description of "how" something is done (by actively pinging a node).
<niemeyer> TheMue: The semantics is that it "starts a pinger" for the respective agent.
<niemeyer> TheMue: This feels a lot more clear in that regard than "Connect"
<TheMue> niemeyer: i would prefer something like "SignalPresence()", because it signals the presence of the agent
<TheMue> niemeyer: yeah, Connect() has been a bad choice, indeed
 * rog is a bit wistful for "Occupy" :-)
<niemeyer> TheMue: we already have presence.StartPinger.. let's please stick to the same terminology.
<rog> niemeyer: a review for you. https://codereview.appspot.com/5841067
<rog> niemeyer: as i say, i'm not entirely convinced it's right.
<rog> niemeyer: i'm off for lunch.
<niemeyer> rog: Enjoy
<TheMue> niemeyer: IMHO it's state's task to build a higher abstraction on the state model. so StartPinger() in the context of presence ("Is node X present?") is ok, but from state's view I would expect different naming.
<niemeyer> TheMue: If I see pinger, err := unit.StartPinger(), I know what that means.
<niemeyer> TheMue: Why introducing another term for the same concept?
<TheMue> niemeyer: you, yes, because you're currently involved in the implementation. how about the poor maintainer, new to the team two years from now, asking himself why he has to call "StartPinger()" to signal that the agent is now alive?
<niemeyer> TheMue: Yeah, maybe..
<robbiew> TheMue: rog: was there any progress made on you being able to access allhands.canonical.com?  Or should I nag HR ;)
<niemeyer> TheMue: StartAlivePinger?
<TheMue> robbiew: Esther already talked to me. She passed it to IS.
<robbiew> TheMue: ack, thx
<TheMue> niemeyer: Yeah, this way it gets more clear.
<niemeyer> TheMue: SetAgentAlive might be even better
<TheMue> niemeyer: So we would have "AgentAlive()", "WaitAgentAlive()" and "SetAgentAlive()" directly on "Unit" and "Machine", with "Set…" returning a "Pinger".
<TheMue> niemeyer: Sounds like a plan. ;)
<niemeyer> TheMue: Cool, +1
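A sketch of the agreed shape (method names from the conversation; signatures and the Pinger stub are assumptions):

    package state

    import "time"

    // Pinger stands in for presence.Pinger in this sketch.
    type Pinger struct{}

    func (p *Pinger) Kill() error { return nil }

    // agentPresence lists the methods that would appear on both Unit
    // and Machine.
    type agentPresence interface {
        AgentAlive() (bool, error)
        WaitAgentAlive(timeout time.Duration) error
        SetAgentAlive() (*Pinger, error)
    }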
<niemeyer> Lunch, biab
<rog> robbiew: just tried it and it worked this time!
<robbiew> rog: awesome
<robbiew> let me know when you've submitted, so I can sign..then you can countersign and start the review ;)
<rog> robbiew: done
<robbiew> rog: ..and right back at 'cha ;)
<robbiew> thnx
<niemeyer> hazmat: Let's continue the charm URL discussion after that call
<hazmat> niemeyer, sounds good
<rog> niemeyer: any hints for debugging a machine that i can't ssh to? (i suspect that's a symptom of the problem)
<rog> niemeyer: my userdata looks like this: http://paste.ubuntu.com/891013/
<niemeyer> rog: EC2?
<rog> niemeyer: yeah
<niemeyer> rog: ec2-get-console-output may help
<rog> niemeyer: ah, didn't know about that
<niemeyer> rog: Note it's not synchronous.. it may take a moment for the output to be visible
<niemeyer> rog: Are those broken lines supported by cloud-init?
<rog> niemeyer: i suspected those - they're valid YAML, and i'm not sure how i can tell yaml.Marshal not to produce them.
<rog> niemeyer: i didn't know you could do that actually
<TheMue> niemeyer: so, next agent try plus some cleanup with https://codereview.appspot.com/5782053
<niemeyer> rog: it may well be fine.. I recall broken lines on Python too
<rog> niemeyer: hmm, the console output is useful, but i can't see any mention of cloudinit in there unfortunately: http://paste.ubuntu.com/891026/
<rog> niemeyer: oh, cloud-init
<rog> niemeyer: none of those key fingerprints matches the one sent, BTW
<rog> TheMue: i preferred it when the code wasn't duplicated. unit.Agent().Alive() reads well to me.
<TheMue> rog: see discussion above with niemeyer
<rog> TheMue: yeah, i saw that, but i didn't quite realise the implications. i think that the code duplication is unnecessary.
<niemeyer> TheMue: Review delivered
<niemeyer> rog: The duplication is certainly unnecessary, but it's trivial to avoid it without introducing a new type
<TheMue> niemeyer: yup, notification just received
 * TheMue feels reminded of the first proposal, with three helper functions ...
<rog> niemeyer: what method would you use?
<rog> niemeyer: just define a helper function?
<niemeyer> rog: Yeah.. it's just a simple function used twice
<TheMue> niemeyer: LOL
<rog> niemeyer: AFAICS a new Agent type (with three methods, Alive, SetAlive and WaitAlive) would fit well here.
<TheMue> niemeyer: that has been the first proposal
<rog> niemeyer: then Unit and Machine both get Agent() *Agent methods
<niemeyer> rog: It would, but TheMue seems to prefer this approach, and I have no reason to disagree with him
<rog> niemeyer: ok. less code, but fair enough.
<niemeyer> rog: Disagree.. the amount of code in his implementation is fairly minimal.. adding a type wouldn't save much, if anything at all
<niemeyer> TheMue: Sorry about the back and forth.. I don't know about the whole history, and either way would be fine for me. Having two 30-line functions with the exact same implementation sounds unnecessary, though
<rog> niemeyer: 16 lines eventually, and three fewer methods. well, i know it's not much, but every little helps :-)
<niemeyer> rog: There are at least the three methods for the implementation of Agent, the declaration of Agent itself, and the two methods for returning the agent
<niemeyer> Besides another file, and imports
<niemeyer> rog: It's not less code, and even if it was, the current implementation is on the trivial side.
<rog> niemeyer: util.go :-)
<niemeyer> rog: No. Thanks.
<niemeyer> TheMue: It's cool to move on with your approach, just remove the dups please
<TheMue> niemeyer: so back to the first approach with three funcs? used then in the methods of unit and machine?
<niemeyer> TheMue: No..
<niemeyer> TheMue: WaitAgentAlive is the only function that has logic duplication. We just need a single function with that logic for reuse
<TheMue> niemeyer: in an own file named agent.go or in util.go?
<TheMue> niemeyer: and just to remove the dup, the reader later has to go looking for where this logic lives, while the other funcs are implemented directly using presence.
<niemeyer> TheMue: A file for a single function would be too much.. it's fine to put it below one of the implementations, and hook it on the next one
<TheMue> for those few locs
<rog> niemeyer: i'm off for the evening. will probably be mostly out of contact tomorrow afternoon as i travel down to the london Go meetup (i'll be working on the train, but the reception is patchy)
<TheMue> i'll do so, just to stop the discussions.
<TheMue> rog: i would have liked to participate in that meeting.
<TheMue> rog: but the trip is too expensive.
<rog> TheMue: it's quite expensive for me to go down but i decided that i couldn't miss it
<TheMue> rog: now i'm giving a talk about go at the gtug in bremen in april
<niemeyer> rog: Cool, we'll certainly talk before that, but have fun there no matter what
<rog> niemeyer: hopefully it'll be interesting.
<niemeyer> rog: I bet it will!
<TheMue> rog: i think so, there are some interesting people
<rog> right, i smell food!
<niemeyer> rog: Mmmm.. that'd be nice :)
<TheMue> niemeyer: one wish: presence gets a function "WaitAlive(zkConn, path, timeout) error" and that can be used in both agent methods (and anywhere else)
<TheMue> niemeyer: so it's a one-liner there too
<niemeyer> TheMue: Sounds good.. let's mutate the current function in that fashion then, since that's precisely the use case we need it for
<niemeyer> TheMue: Just push that in a different branch, please
<TheMue> niemeyer: so a new branch with the func and when it's submitted merge it into my agent branch and use it there?
<niemeyer> TheMue: Right.. you can just merge from trunk once it's in
<niemeyer> TheMue: Hopefully we can do the whole thing with a very quick turnaround
<TheMue> niemeyer: ok, will do so tomorrow, it's late here. should be easy.
<niemeyer> TheMue: Sounds good, thanks
<niemeyer> TheMue: The idea of tweaking presence is a good one
<TheMue> niemeyer: thx
<TheMue> niemeyer: i'm off, bye
<niemeyer> TheMue: Have a good evening
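A minimal sketch of the presence.WaitAlive(zkConn, path, timeout) helper just agreed on, with the connection parameter dropped so the example is self-contained, and aliveW standing in for presence.AliveW (its real signature is an assumption): the pattern is a watch loop raced against time.After.

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // aliveW stands in for presence.AliveW: it reports current liveness
    // and returns a watch that fires when the liveness changes. This stub
    // flips to alive after 100ms so the example terminates.
    func aliveW(path string) (alive bool, watch <-chan bool, err error) {
        ch := make(chan bool, 1)
        go func() {
            time.Sleep(100 * time.Millisecond)
            ch <- true
        }()
        return false, ch, nil
    }

    // waitAlive blocks until the presence node at path is alive, or fails
    // once timeout has elapsed.
    func waitAlive(path string, timeout time.Duration) error {
        deadline := time.After(timeout)
        for {
            alive, watch, err := aliveW(path)
            if err != nil {
                return err
            }
            if alive {
                return nil
            }
            select {
            case alive = <-watch:
                if alive {
                    return nil
                }
                // still dead: loop and set a fresh watch
            case <-deadline:
                return errors.New("presence: node still dead after timeout")
            }
        }
    }

    func main() {
        fmt.Println(waitAlive("/units/wordpress-0/agent", time.Second))
    }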
<niemeyer> fwereade_: So, you've decided to drop it all for now?
<niemeyer> hazmat: ping
<niemeyer> hazmat: I'm up for that conversation about charm URLs, when you're ready
<hazmat> niemeyer, cool
<hazmat> niemeyer, i'm ready just picked up ethan from day care
<niemeyer> hazmat: Ok.. we can even go here I guess
<hazmat> niemeyer, sounds good
<hazmat> so wrt the urls.. they're of the form scheme:~user/series/name
<niemeyer> hazmat: The charm URL needs to work for deployments.. it's already supported in the command line, but not taken into account for the actual deployment with the backend
<niemeyer> hazmat: What's the issue you were mentioning about that?
<hazmat> niemeyer, my understanding is that with the merge of ~fwereade/juju/pa-start-machine-constraints that we can enforce series as a constraint
<hazmat> looking over deploy, it doesn't do it by itself
<niemeyer> hazmat: Series must not be a constraint
<hazmat> niemeyer, its not a user constraint
<niemeyer> hazmat: Not a user visible one, at least
<hazmat> its a system constraint
<niemeyer> hazmat: Ok
<hazmat> niemeyer, this is going to explode the question of a consistent version of juju across distros, per SpamapS's email earlier on the same topic
<niemeyer> hazmat: I don't know what the question is
<niemeyer> hazmat: Or how the version of juju is involved in that?
<hazmat> niemeyer, different versions of juju released in different distro releases
<hazmat> all in the same env
<niemeyer> hazmat: Yeah, it's an interesting problem
<niemeyer> hazmat: Versions have to be compatible, basically
<hazmat> niemeyer, it's a simple problem, if we didn't have to use packages
<niemeyer> hazmat: Right
<hazmat> it's the juju-upgrade-itself problem.. but with env version pinning added
<hazmat> niemeyer, okay.. i've looked over the charm deploy code, this should be straightforward
<niemeyer> hazmat: Hmmm.. kind of
<hazmat> to add the series as system constraint from charm
<niemeyer> hazmat: juju upgrade has its own set of problems
<hazmat> niemeyer, agreed, it needs some configurable policy choice to enable cluster workflows
<hazmat> but focusing on what we can do atm.. i'm wondering if this is something important enough to try and get done for 12.04, ie. get a spec out on it today
<hazmat> niemeyer, did you see SpamapS's email on the topic https://lists.ubuntu.com/archives/juju/2012-March/001337.html
<hazmat> ie. being able to upgrade an environment seems pretty important for prod like usage
<niemeyer> hazmat: Definitely
<niemeyer> hazmat: I have a personal embargo to comment on that issue ;-)
<hazmat> niemeyer, you mean shipping the go binaries into provider storage won't fly ;-)
<hazmat> seems perfectly reasonable
<niemeyer> hazmat: I think it's reasonable too
<niemeyer> hazmat: It's probably the real long term solution
<niemeyer> hazmat: It's also boringly easy
<niemeyer> hazmat: Since it consists of a single binary being uploaded/downloaded
<niemeyer> Well, maybe two
<niemeyer> hazmat: But that won't solve the juju-py issue
<niemeyer> hazmat: I suppose the short term solution is to upgrade Oneiric
<hazmat> niemeyer, well it's possibly N.. you're uploading a binary tagged to a version, and instructing the cluster to use it by ref/version
<hazmat> niemeyer, well i'm debating if we should just do the same for juju py
<niemeyer> hazmat: Where the same means..?
<hazmat> allow for code repo + rev or bundle w/ version
<hazmat> for consistent cluster deploys
<hazmat> i'd rather not. cause it will tie me up for the cycle
<SpamapS> hazmat: since this is merged: https://code.launchpad.net/~fwereade/juju/apply-machine-constraints .. does that mean that https://code.launchpad.net/~fwereade/juju/pa-start-machine-constraints/+merge/86451 can land, and then constraints will be released?
<hazmat> but if it needs to get done..
<hazmat> SpamapS, yes, but only for ec2
<niemeyer> hazmat: It's not so simple, since there are deps involved too
<niemeyer> hazmat: It'd need to deal with downloading packages + deps, and uploading packages, if I see what you mean
<hazmat> niemeyer, i'm less concerned with the deps.. outside of txzk they're all stable
<hazmat> niemeyer, yes.. for a bundle, exactly that
<SpamapS> hazmat: thats fine.. OMG Ubuntu needs it. :)
<niemeyer> hazmat: Right..
<niemeyer> hazmat: All the deps that need to :)
<hazmat> niemeyer, not necessarily dpkg binaries though, i was just going to grab and push py eggs
<niemeyer> hazmat: It's also not readily available locally
<niemeyer> hazmat: Ugh..
<hazmat> niemeyer, :-)
<hazmat> niemeyer, the dpkg is avail local
<hazmat> unless its source
<niemeyer> hazmat: It's not available locally
<hazmat> in which case use branch and revision
<hazmat> niemeyer, its not in the dpkg cache?
<niemeyer> hazmat: No, not necessarily.. it's a cache after all
<hazmat> SpamapS, considering it was a client skew there.. not sure this would help
<SpamapS> hazmat: this is orthogonal to that problem
<hazmat> agreed
<SpamapS> hazmat: they need xlarges for the webserver, and m1.small for everything else.
<SpamapS> webserverS I should say
<hazmat> SpamapS, oh.. constraints would help them yes..
<niemeyer> hazmat: As far as OMG-what-to-do-for-12.04 goes, I'd make sure that Oneiric and Precise match each other, and beg SpamapS for help
<hazmat> niemeyer, the omg refs are for the omgubuntu site.. although it does apply to the other ;-)
<niemeyer> hazmat: Yeah, I was consciously not referring to that part of the problem :)
<hazmat> jimbaker, just a heads up.. per our conversation friday, i'm implementing the status changes per the spec
<jimbaker> hazmat, yes, i've just updated the spec per the discussion over that spec
<hazmat> jimbaker, awesome
<niemeyer> hazmat, SpamapS: Long term, having raw binaries in storage feels like the best in terms of portability and stability of environments
<jimbaker> i certainly welcome your impl help! thanks
<hazmat> niemeyer, so the problem is how to get the binary for py-juju.. and then spec the rest of it
<niemeyer> hazmat: So you mean you're also going to do that for 12.04? :-)
<hazmat> niemeyer, if needs to get done, i'll have to find something else to drop
<hazmat> niemeyer, but it also eases the machine takeover/upgrade for go-juju
<niemeyer> hazmat: Kind of.. we can always do that later, and I'm not optimistic that we can do something that will be used as-is for upgrading Go
<niemeyer> Upgrading to Go will likely be a breaking change for the environment itself
<niemeyer> It won't break charms, though
<niemeyer> hazmat: So we have to evaluate what the pluses and minuses are..
<hazmat> huh.. i thought we were pre-approving any state changes for the purpose of transparent upgrade
<niemeyer> hazmat: If you get some ideas for how to do it in Python, I'd be happy to talk about them. Let me know how you'd do it in Python..
<hazmat> niemeyer, speaking of which there is a state change notice for a trivial change to upgrade flag storage on the list
<niemeyer> I need to get some quick food now, though, as Ale is waiting..
<hazmat> niemeyer, okay.. i can spec it out
<hazmat> niemeyer, i guess what i really need is guidance from you and SpamapS  that this is critical for 12.04
<niemeyer> hazmat: Will go over it.. there are a couple of entries in the TODO about answering emails
<hazmat> if its not then... i'll push on other things
<niemeyer> hazmat: My vague view on it is still the same since the last milestone
<hazmat> if it is i can at least get a spec for discussion
<niemeyer> hazmat: The focus is on making it *stable*
<niemeyer> hazmat: A lot has been happening, and significant changes are still going in right now
<niemeyer> Anyway.. biab
<niemeyer> hazmat: Back
<niemeyer> hazmat: The spec sounds good, though
<niemeyer> hazmat: I expect it will be fairly uncontroversial, at least in the part that does the actual upgrade
<niemeyer> hazmat: I don't know what you have in mind for the harvesting of dependencies, though
<hazmat> niemeyer, sounds good, i'll put on my list for today, first some impl and reviews
<niemeyer> hazmat: Thanks a lot
<hazmat> niemeyer, np
<niemeyer> jimbaker:
<niemeyer>  38        relations-error:
<niemeyer>  39          db: [blog3, blog4]
<niemeyer>  40        relations-pending:
<niemeyer>  41          db: [blog2]
<niemeyer> jimbaker: I don't think this reflects the agreement. Please see hazmat's email.
<niemeyer> In fact.. do we even need that spec?
<niemeyer> It was good to bootstrap the discussion, but hazmat's email to the list looks great
<andrewsmedina> rog: ping
<hazmat> niemeyer, for spec review there's also http://codereview.appspot.com/5847053/ i'm making some changes per fwereade_'s review, but they're minor
<hazmat> jimbaker, there is no pending.. only relations, relations-error
<niemeyer> andrewsmedina: Heya
<niemeyer> andrewsmedina: It's pretty late for rog now
<niemeyer> hazmat: Looking
<andrewsmedina> niemeyer: ty
<niemeyer> hazmat: Hmmm.. will wait for the next round if that's ok
<hazmat> niemeyer, sure
<niemeyer> hazmat: Do we need {force: true} on the yaml state?
<niemeyer> hazmat: for upgrade-charm?
<niemeyer> hazmat: I thought we were just not setting the upgrade depending on the state
<hazmat> niemeyer, it's to support the --force upgrade, to allow upgrades from any state; i wanted to distinguish it from a normal upgrade
<niemeyer> hazmat: Just trying to understand why.. I thought the non-forcing behavior was client-side only
<hazmat> niemeyer, the additional behavior is only enabled by the --force flag, else normal upgrade checks apply
<niemeyer> hazmat: Right, but aren't the checks client side?
 * hazmat checks something
<niemeyer> hazmat: In fact.. upgrade-charm --force seems to go into the void
<hazmat> niemeyer, there's also agent side checks
<niemeyer> hazmat: Cool, +1 on the concept then.. but there seems to be a hole in the implementation
<hazmat> niemeyer, i'm all ears
<niemeyer> https://codereview.appspot.com/5752069/diff/6001/juju/control/upgrade_charm.py#newcode56
<niemeyer> hazmat: ^
<hazmat> niemeyer, indeed
<hazmat> missing a test clearly
<niemeyer> hazmat: It might be good to have a run with it in practice too, to check that it's glued up end-to-end
<jimbaker> hazmat, that makes sense re pending
<hazmat> niemeyer, re reviews, i would appreciate it if you could have a look at jimbaker's relation specs; all the ones you reviewed have feedback incorporated, and are ready for another look.
<jimbaker> hazmat, niemeyer, sounds good
<hazmat> we need to start them... now effectively, to get them implemented
<jimbaker> hazmat, agreed
#juju-dev 2012-03-20
<hazmat> niemeyer, there is some redundancy with the new format, since we're showing the service relations (db: [myblog, teamblog]) at the service level and showing the same for each unit as well in the relations/relation-errors block
<hazmat> we had talked about collapsing them, but per unit rel status is important
 * hazmat looks for previous context
<niemeyer> hazmat: Wasn't the redundancy always there?
<niemeyer> hazmat: The distinction is we show the active relations
<niemeyer> hazmat: Maybe all we need is relation-errors?
<hazmat> niemeyer, yeah.. that sounds reasonable
<hazmat> there's not really any point to showing it duplicated, and we flag errors separately
<niemeyer> hazmat: yeah.. and if we regret we can always go back
<hazmat> sounds good, i'll do it up that way
<hazmat> niemeyer, thanks
<niemeyer> hazmat: np
<hazmat> hmm.. some of the dot rendering needs that info
<hazmat> nm
 * hazmat yawns
<rog> mornin' to anyone silly enough to be up at this hour
<rog> andrewsmedina: pong
<bigjools> speak for yourself, it's 5:12pm here :)
<rog> bigjools: :-) i knew it was a silly thing to say.
<rog> bigjools: australia?
<bigjools> yup
<bigjools> gah, this whole juju branching from LP thing has really screwed my day
<rog> bigjools: how's that?
<bigjools> I waited for a bootstrap to finish while it was setting up a node (takes ages) and then:
<bigjools> bzr: ERROR: http://bazaar.launchpad.net/%2Bbranch-id/348938/.bzr/repository/packs/c77bcca4ec91aee207a8f9d37b1a8608.pack is redirected to https://launchpad.net
<bigjools> which is a bug in LP/bzr somewhere
<bigjools> and the whole bootstrap grinds to a halt
<rog> aren't web services marvellous?
<bigjools> not today  :)
<rog> bigjools: BTW how long does it take for you to bootstrap an environment?
<rog> (usually)
<bigjools> but I'm still gobsmacked at juju branching off LP at all during deployment/bootstrap :/
<rog> bigjools: depends on the environment settings, right?
<bigjools> it depends.  I am testing maas, and if I wait for a machine to install, an hour, otherwise it depends on network speed for apt-get update/install etc.
<bigjools> are you talking about juju-origin?
<rog> bigjools: yeah
<bigjools> I still think that's crazy
<rog> bigjools: that you should have the option?
<bigjools> that it tries to check out a branch
<rog> bigjools: does it do that even when juju-origin is distro or ppa?
<bigjools> is this an artifact of my dev environment, or does it always do that by default?
<rog> bigjools: i couldn't say i'm afraid
<bigjools> let me rephrase - does it always bzr branch/checkout? or does it try to use other ways to get the code down?
<rog> bigjools: i was under the impression that if you used juju-origin=distro or ppa that it would just do apt-get
<bigjools> ah ok, I didn't know that
<bigjools> so just the default is crazy then :)
<rog> bigjools: ah, the default depends on your local environment
<rog> bigjools: i'd forgotten about that.
<bigjools> ah, that's what I was getting at
 * rog looks at the code
<rog> bigjools: it looks at the output of apt-cache policy
<bigjools> ok
<rog> bigjools: what does "apt-cache policy juju" produce for you when you run it?
<rog> bigjools: to be honest, i didn't really understand the motivation behind the code when i ported it to Go. i just copied the semantics and the tests :-) insight into why it's good or bad would be useful...
<bigjools> I really have no idea :/
<bigjools> juju is not installed locally, anyway, which is why I guess it branches it
<wrtp> bigjools: no idea what happened there
<wrtp> bigjools: last thing i saw from you was:
<wrtp> [07:31] <bigjools> I really have no idea :/
<bigjools> wrtp: I just said that juju is not installed from a package
<wrtp> bigjools: could you paste the exact output of apt-cache policy juju, please?
<wrtp> bigjools: i'm just looking to see how the code would deal with it - it's got quite a few cases.
<bigjools> wrtp: http://pastebin.ubuntu.com/891862/
<wrtp> bigjools: thanks
<wrtp> bigjools: yeah, it's the "Installed: (none)" that triggers it
<bigjools> wrtp: right
<wrtp> bigjools: set juju-origin and you'll be fine, i guess
<bigjools> wrtp: I am using it to pull my dev branch from LP
<bigjools> but got caught by that LP bug
<wrtp> bigjools: the underlying problem is outlined in this email: https://lists.ubuntu.com/archives/juju/2012-March/001337.html
<bigjools> :)
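A sketch of that default-origin decision, assuming (not quoting) the actual juju logic: shell out to apt-cache policy, and fall back to branching trunk from Launchpad when the Installed line reads (none).

    package main

    import (
        "fmt"
        "os/exec"
        "strings"
    )

    // defaultOrigin is a hypothetical version of the juju-origin default:
    // if the juju package is installed, use the package origin; if
    // "Installed: (none)", branch trunk from Launchpad instead.
    func defaultOrigin() (string, error) {
        out, err := exec.Command("apt-cache", "policy", "juju").CombinedOutput()
        if err != nil {
            return "", err
        }
        for _, line := range strings.Split(string(out), "\n") {
            line = strings.TrimSpace(line)
            if strings.HasPrefix(line, "Installed:") {
                if strings.Contains(line, "(none)") {
                    return "lp:juju", nil // not installed: branch the code
                }
                return "distro", nil // installed from a package
            }
        }
        return "lp:juju", nil
    }

    func main() {
        origin, err := defaultOrigin()
        if err != nil {
            fmt.Println("error:", err)
            return
        }
        fmt.Println("juju-origin:", origin)
    }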
<TheMue> rog, fwereade: morning
<fwereade> heya wrtp, TheMue, bigjools
<wrtp> fwereade: yo!
<bigjools> hello fwereade
<bigjools> wrtp: I found what looks like a bug, can you verify this:
<bigjools> the user-data contains lines that append to the /etc/init/..conf files
<bigjools> so on multiple boots you end up with it saying to start the provisioning agent etc multiple times
<wrtp> bigjools: multiple juju bootstraps? or multiple boots of the machine?
<wrtp> bigjools: BTW it's ironic that the problem i'm currently trying to debug with the Go port is directly to do with the issue you had above...
<bigjools> wrtp: heh - each boot uses the cloud-init user-data to append the same stuff to the conf files
<wrtp> bigjools: hmm. i think juju usually assumes a fresh machine each time.
<bigjools> wrtp: not sure why it is using >> in the bash script then
<wrtp> bigjools: is that in juju or in cloudinit.py ?
<bigjools> wrtp: the user-data's runcmd coming via cloudinit
<bigjools> afk for a while
<wrtp> bigjools: i don't see that, but maybe i'm looking in the wrong place.
<wrtp> oh yeah, i do now.
<wrtp> bigjools: i think that might have changed since i did the port. hmm.
<wrtp> fwereade: you did the upstart stuff, right?
<fwereade> wrtp, yeah, reading back...
<wrtp> fwereade: from a test data file (cloud_init_ppa):
<wrtp>     /var/log/juju, 'cat >> /etc/init/juju-machine-agent.conf <<EOF
<wrtp> why the append?
<fwereade> wrtp, because, apparently, the crack was strong with me that day, but let me check something
<fwereade> wrtp, yeah, crack
<wrtp> bigjools: ^
<wrtp> :-)
<wrtp> fwereade: do any of the tests try rebooting the instances?
<fwereade> wrtp, hm, no
<wrtp> fwereade: it seems to me that it might be a good idea to try to test this stuff
<fwereade> wrtp, but rebooting them has AFAICT always worked in the past
<wrtp> fwereade: lucky :-)
<fwereade> wrtp, well, it tests that it has behaviour that has been experimentally verified to work, even if that was by sheer luck :/
<wrtp> fwereade: :-)
<fwereade> wrtp, it seems that ec2 runs user scripts only once anyway
<wrtp> fwereade: that would be very sensible
<wrtp> fwereade: maybe we can say it's a bug in MaaS?
<fwereade> wrtp, nah, my bug
<wrtp> fwereade: but... isn't it up to the OS to decide whether to run the init scripts or not?
<fwereade> wrtp, I think of it as being up to cloud-init, and it's not not-a-bug just because ec2 machines happen to be set to run them only once
<fwereade> wrtp, it was hitherto a latent bug
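A tiny sketch of the fix being discussed, as a hypothetical helper the Go port might use (the name and layout are not the actual juju code): write the upstart job with > rather than >>, so a reboot that re-runs the user-data rewrites the file instead of appending a duplicate stanza.

    package main

    import "fmt"

    // writeConfCmd renders a runcmd entry that (re)writes an upstart job,
    // using > instead of >> so re-running it cannot duplicate the stanza.
    func writeConfCmd(name, conf string) string {
        return fmt.Sprintf("cat > /etc/init/%s.conf <<'EOF'\n%s\nEOF\n", name, conf)
    }

    func main() {
        fmt.Print(writeConfCmd("juju-machine-agent", "exec python -m juju.agents.machine"))
    }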
<wrtp> fwereade: BTW, looking at juju/providers/common/tests/data/cloud_init_ppa, it looks like the "--session-file" argument is on a different line to the "exec python -m juju.agents.machine" command.
<wrtp> fwereade: how can that work then?
<fwereade> wrtp, isn't that yaml being "clever"?
<fwereade> wrtp, in that section line breaks are represented as pairs of line breaks, AFAICT
<wrtp> fwereade: quite probably. i haven't found a good explanation of all yaml's cleverness anywhere yet
<wrtp> fwereade: i'd have thought that stuff inside ' ... ' is treated literally.
<wrtp> fwereade: jeeze who could ever think of calling it "simple"?
<fwereade> wrtp, I basically treat yaml as a binary format
<fwereade> wrtp, it passes through a library that causes it to make sense, somehow, and I'm happy enough with that :p
<wrtp> fwereade: can we move to json sometime, please?
<fwereade> wrtp, I have no idea, I think that's one to punt to niemeyer
<fwereade> wrtp, I don't know where the original yaml dependency came from
<wrtp> fwereade: yeah. i think niemeyer likes the indentation-based format.
<fwereade> wrtp, tbh I'm not *so* bothered by it -- in the places where users might use it, it's usually pretty clear and obvious
<fwereade> wrtp, but, yeah, it gets ungainly with our own data
<wrtp> "In addition, it is only possible to break a long single-quoted line where a space character is surrounded by non-spaces."
<wrtp> from: http://www.yaml.org/spec/1.2/spec.html#id2788097
<fwereade> wrtp, OTOH note that cloud-init is not our code, and that needs to be yaml anyway
<wrtp> fwereade: yeah, i realise that.
<wrtp> fwereade: unfortunately.
<fwereade> wrtp, so this specific example is not really directly relevant to the cause anyway ;)
<wrtp> fwereade: yeah, it was more of a by-the-by.
<wrtp> fwereade: i can't see where in that section it says that newlines are removed.
<wrtp> fwereade: i suppose example 7.9 implies that though.
<fwereade> wrtp, I'm afraid I have little stomach for a deep dive into the yaml spec, but the way it converted all the \ns to \n\n (in that section only) seems to me to be strong-enough circumstantial evidence for my position ;)
<wrtp> fwereade: true 'nuff :-)
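The doubled line breaks can be checked by feeding them back through a YAML parser; this sketch uses gopkg.in/yaml.v2 as a stand-in for the goyaml library of the time. In a single-quoted scalar a lone line break folds to a space while an empty line encodes a literal newline, which is why every \n in the marshalled command came out as \n\n.

    package main

    import (
        "fmt"

        "gopkg.in/yaml.v2"
    )

    func main() {
        // The empty line inside the single-quoted scalar folds back to a
        // single "\n" in the decoded value; a lone break would fold to a
        // space instead.
        doc := "cmd: 'exec python -m juju.agents.machine\n\n  --session-file /var/run/juju/session'\n"
        var out map[string]string
        if err := yaml.Unmarshal([]byte(doc), &out); err != nil {
            panic(err)
        }
        fmt.Printf("%q\n", out["cmd"])
        // "exec python -m juju.agents.machine\n--session-file /var/run/juju/session"
    }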
<fwereade> wrtp, hmm, I wonder whether deploying machine constraints will break everyone's jujus again :/
<wrtp> fwereade: it'll probably break the hacks that everyone's been using to get around the lack of machine constraints...
<fwereade> wrtp, oh, *that* is guaranteed, but *hopefully* people have known they were going away since last year
<fwereade> wrtp, I'm just worried about the impact on stuff that's already deployed from PPA, at the point it hits the PPA
<wrtp> fwereade: hmm, yeah. is the zk schema backwardly compatible?
<fwereade> wrtp, I need to double-check what happens if the constraints key is ever not found... but actually, that should be broken *already* if it's a problem
<TheMue> wrtp, fwereade_: a simple review: https://codereview.appspot.com/5843068
<TheMue> wrtp: btw, why today wrtp and not rog?
<fwereade_> TheMue, the big question is "what will this be used for?"
<TheMue> fwereade_: inside WaitAgentAlive() of Unit and Machine, like discussed yesterday with niemeyer
<fwereade_> TheMue, ok, and we definitely need a timeout for that?
<TheMue> fwereade_: we should leave waiting users the option to have a timeout (as usual in concurrent and distributed computing). which concerns do you have with timeouts?
<fwereade_> TheMue, those waiting users being?
<fwereade_> TheMue, my only concern is that it's code that won't be used, and we don't have enough information to choose a sensible timeout
<fwereade_> TheMue, the only things that wait for agents are command-line tools
<fwereade_> TheMue, I was under the impression that we deliberately *didn't* time those out so that people can write scripts that bootstrap and just keep going with the next command
<fwereade_> TheMue, I can understand that there are other plausible scenarios in which we might want to have a timeout, but they don't exist yet
<fwereade_> (sorry mangled english)
<TheMue> fwereade_: one moment please, doing esta in parallel ;) back in a few seconds
<TheMue> fwereade_: so, back again, started ESTA for UDS
<fwereade_> TheMue, ah cool
<fwereade_> TheMue, anyway it's not bad code; it's not a bad idea; it's just not something we have a reasonable certainty of needing imminently, so I'd prefer it if it just didn't exist
<fwereade_> TheMue, every line of code is a small weight on each of our brains ;)
<TheMue> fwereade_: WaitAlive() is the same functionality as there has been before in Unit and Machine. afaik a low level func like this one should provide the possibility for timeouts, and it's the task of the calling code to determine how long it is willing to wait
<TheMue> fwereade_: no more code duplication later for selects with time.After
<TheMue> fwereade_: this argument by rog yesterday led to the reduction to one func
<TheMue> fwereade_: btw, why were there no concerns before, when the same code was in Unit and Machine?
<fwereade_> TheMue, my point is that nobody's going to be doing anything other than "alive, ok = <-watch", without select, regardless
<TheMue> fwereade_: even you in your tests worked with a timeout
<fwereade_> TheMue, very specifically, only in the tests
<TheMue> fwereade_: timeouts and their handling are essential in distributed and concurrent programming
<fwereade_> TheMue, by adding this mechanism, you *force* me to choose a timeout
<TheMue> fwereade_: so you think it's ok that parts of the software block forever until anyone kills the process?
<fwereade_> TheMue, exactly so
<TheMue> fwereade_: i could add a -1 for automatic max duration ...
<TheMue> fwereade_: but again, why hasn't that question come up before, when the same code was in Agent and later in Unit and Machine?
<fwereade_> TheMue, that's still forcing me to make a choice, it's just that there's still only 1 meaningful choice
<TheMue> fwereade_: and btw, nobody is forced to use this function, anyone can use the other functions, they are unchanged
<fwereade_> TheMue, if I failed to spot and complain about the timeout before, I apologise for my inattention
<fwereade_> TheMue, sure; and if nobody does use the function, why have it in the first place?
<TheMue> fwereade_: i will use it, in Unit and Machine
<fwereade_> TheMue, to do what?
<TheMue> fwereade_: like discussed yesterday with niemeyer and rog
<TheMue> fwereade_: wait for an agent
<fwereade_> TheMue, I'm sorry I missed that; what are the new use cases in which a timeout doesn't break user expectations?
<rogpeppe> TheMue: out of interest, BTW, what code waits for an agent?
<TheMue> rogpeppe: would have to look again who is calling it
<fwereade_> TheMue, it's juju-ssh and juju debug-hooks
<fwereade_> TheMue, and that's it
<rogpeppe> fwereade_: interesting. why do they wait for an agent? (and which agent do they wait for?)
<fwereade_> TheMue, in both cases it's the command line and it will break existing behaviour that users expect
<TheMue> rogpeppe: there are several callers of watch_agent() in today's py code
<fwereade_> TheMue, I see two
<TheMue> fwereade_: just counted the search results, they include the tests
<fwereade_> rogpeppe, ssh waits for the machine agent
<rogpeppe> fwereade_: both of those occurrences just seem to be there to get the ip address of the machine
<fwereade_> TheMue, yeah; there are 2 non-test uses
<fwereade_> rogpeppe, I don't entirely agree with it in juju ssh
<fwereade_> rogpeppe, but that's what was agreed
<rogpeppe> fwereade_: i suppose it's better than polling ec2 for the ip address to appear
<fwereade_> rogpeppe, but it's needed for debug-hooks because it's utterly meaningless without an active unit agent at the other end effectively forwarding you a session
<rogpeppe> fwereade_: ah
<rogpeppe> TheMue: the existing watch_agent doesn't seem to have a timeout
<TheMue> fwereade_, rogpeppe: is either of you interested in continuing the agent method implementation (2 x 3 simple methods)? it's real fun, with a lot of discussions of principle across different timezones and languages ...
<rogpeppe> :-)
<TheMue> rogpeppe: i implemented exactly (!) today's behavior as a first draft. and that had to be changed.
<fwereade_> TheMue, sorry, I am honestly trying to save you work, but I don;t think I'm succeeding :(
<rogpeppe> TheMue: i seem to remember interfaces and embedding in the first draft :-)
<rogpeppe> actually, no interface, probably
<TheMue> rogpeppe: one interface (which was only there to verify that Unit and Machine provide the right methods, otherwise useless) and three simple functions to use them directly
<TheMue> rogpeppe: embedding was already another approach, after also switching to presence
<TheMue> fwereade_: which work do you wanna save? it's already done, and changing it is new work. each time.
<TheMue> fwereade_: the wish to reduce code duplication (yes, the one with (!) the timeout) came from rog, niemeyer followed it, and my only part was to move it into presence, instead of a single func in one of the state files, to provide this functionality (waiting with timeout) for other users in future too (afaik there are several more watches).
<fwereade_> TheMue, I just don't understand why you need a timeout at all
<fwereade_> TheMue, perhaps one day you will
<fwereade_> TheMue, but I don't see what it gains us except lines of code
<fwereade_> TheMue, I clearly failed to adequately express my concerns with the original watcher type
<TheMue> Unit.WatchHookDebug(), Unit.WatchResolved(), but i don't know if they will base on presence, only some notes of methods that still have to be implemented
<fwereade_> TheMue, they certainly won't wait on presence
<rogpeppe> TheMue: i was under the impression that that only agents will be based on the presence package
<fwereade_> TheMue, they're called by the unit agent itself
<fwereade_> TheMue, it *knows* it exists ;)
<fwereade_> rogpeppe, we also have presence nodes for unit relations as I recall
<rogpeppe> fwereade_: interesting.
<fwereade_> rogpeppe, used to signal active participation in a relation (as distinct from "it may be working now by coincidence, but the unit won't react to changes")
<fwereade_> TheMue, I seem to recall pointing out the very limited use cases for agent watching before, but perhaps that got lost in the noise
<rogpeppe> fwereade_: is that implied by the unit agent's presence, but duplicated in a different place for convenience? i haven't looked into how this stuff works at all.
<fwereade_> rogpeppe, unit relations have state, of which "up" and "down" are generally relevant
<rogpeppe> fwereade_: a unit relation can be down when its unit agent is up?
<fwereade_> rogpeppe, certainly; failed hook?
<rogpeppe> ah, sure
<fwereade_> rogpeppe, the unit relation state has the last value written by the agent
<fwereade_> rogpeppe, but if the agent isn't even well enough to maintain its presence node we can be pretty sure that something is rotten in the state of... um, the service
<rogpeppe> fwereade_: i suppose what i'm trying to work out is if the pinger thing is necessary in this case, or whether we can just use zk as usual.
<rogpeppe> ah, ephemeral nodes.
<fwereade_> rogpeppe, we need some way to know that a remote thing is active
<fwereade_> rogpeppe, indeed :)
<TheMue> fwereade_: so the presence package is only for agents?
<fwereade_> TheMue, no, it's a general replacement for ephemeral nodes, which we have decided aren't worth the trouble
<rogpeppe> fwereade_: if we know the unit agent is alive, doesn't that imply the unit relation is, erm, actively inactive?
<TheMue> fwereade_: and there is no such usage of ephemeral nodes where the watcher isn't willing to wait endlessly?
<fwereade_> rogpeppe, if the unit agent is alive we hope/trust that it's also watching its watches
<fwereade_> TheMue, we don't wait on ephemeral nodes except in the 2 cases I mentioned
<rogpeppe> fwereade_: if the unit agent is dead, we can assume its unit relations are dead?
<fwereade_> TheMue, and we only directly care about unit relation ephemeral nodes in the context of `juju status` in which we definitely don't want to wait on them
<rogpeppe> i'm speculating wildly. please ignore me.
<fwereade_> rogpeppe, if it's dead then the service may well still be working correctly underneath
<fwereade_> rogpeppe, but we can be sure that it won't respond correctly to settings changes etc
<fwereade_> TheMue, the trouble is that this is not obvious from reading the python code :(
<rogpeppe> fwereade_: i'm trying to think slightly deeper about why we use a pinger. AFAICS it's to signify that there's something actively pinging the node. given that (i think) the unit relation node is managed by the same code that manages the unit agent presence node, the activeness of the latter could be used to imply the activeness of the former, perhaps, is what i'm thinking.
<TheMue> fwereade_: how do those other watches work? what do they watch? the presence of nodes or the change of node contents?
<rogpeppe> activeness of the latter could... activeness of the former
<fwereade_> rogpeppe, I think that's what we already do
<rogpeppe> fwereade_: so perhaps an ephemeral/presence node is unnecessary for the unit relation?
<fwereade_> TheMue, the only ephemeral watches I am aware of are those in ssh and debug-hooks, which simply wait for the presence of the thing at the other end
<fwereade_> rogpeppe, hm, that's interesting
<TheMue> fwereade_: so you say it's ok to wait forever there?
<fwereade_> rogpeppe, something's knocking at the corner of my mind, gimme a sec
<fwereade_> TheMue, yes, absolutely
<rogpeppe> fwereade_: i can imagine there might be race conditions that make it difficult
<TheMue> fwereade_: how shall i test it without blocking the test forever?
<fwereade_> TheMue, we explicitly got rid of timeouts on command line tools
<fwereade_> TheMue, you do something like I did in the original presence node tests?
<rogpeppe> TheMue: what i tend to do is to make sure that it blocks by waiting for a short period, then unblock it and check it unblocks.
<TheMue> fwereade_: hmm, yes, could do so too, indeed
<rogpeppe> TheMue: in fact you'll be testing almost exactly the same code...
<fwereade_> rogpeppe, ServiceRelationState.get_unit_state also checks the ephemeral node, not going to analyse all the callers of that at this stage ;)
<rogpeppe> fwereade_: the unit relation ephemeral node?
<fwereade_> rogpeppe, yeah
<fwereade_> rogpeppe, (also we use an ephemeral node to signal presence of an active debug-hooks session, watched by the unit agent; that one should also wait forever)
<TheMue> fwereade_: still got pain with code waiting endlessly for external events. it disregards > 20 yrs of knowledge of distributed and concurrent systems, where that caused so much pain.
<fwereade_> TheMue, it *is* basically the underlying model of juju though
<TheMue> fwereade_: but maybe indeed it's irrelevant for our system
<fwereade_> TheMue, I can't actually think of many places where timeouts are appropriate at the juju level
<TheMue> fwereade_: yeah, seems so
<fwereade_> TheMue, ZK is keepalive like hell underneath, but from our perspective we can wait forever
<rogpeppe> we've already got timeouts and retries happening at a low level
<fwereade_> TheMue, if there are problems we trust ZK to tell us about it
<rogpeppe> fwereade_: yeah
<TheMue> fwereade_: once again i'm driven by my history where this had been a bad behavior
<fwereade_> TheMue, yeah, we all carry baggage
<fwereade_> TheMue, it's hard for me to separate "what the python does" from "how juju should actually do it" ;)
<fwereade_> TheMue, but I think this is a juju-level property not a code-level one
<fwereade_> TheMue, if you see what I mean
<rogpeppe> perhaps the question is not: "should there be a timeout?" but "where should the timeout be?"
<TheMue> fwereade_: but i would like you to discuss it with niemeyer, as he said "yes, implement it in presence", and i don't want to follow your advice now and get another one from him in the evening.
<fwereade_> TheMue, I think it comes down to "do you trust zookeeper" ;)
<fwereade_> TheMue, I can totally understand that
<rogpeppe> TheMue: for now, why not just make a function in the state package, as suggested by niemeyer?
<fwereade_> TheMue, niemeyer has firm opinions and the final say and while they are not arbitrary or capricious they are hard to predict with 100% accuracy
<TheMue> rogpeppe: putting it in presence, as its own branch, came from niemeyer
<TheMue> rogpeppe: we talked about it yesterday when you stepped out
<TheMue> rogpeppe: the problems of time zones ;)
<fwereade_> TheMue, did he implement it because he knew it was needed himself, or as an alternative to your watcher proposal?
<fwereade_> TheMue, on the assumption that it *was* a necessary feature
<TheMue> rogpeppe: btw, now rogpeppe, this morning wrtp, yesterday rog. why this?
<fwereade_> TheMue, I always assumed it was low-level psyops to contribute to an aura of glamorous mystery
<rogpeppe> TheMue: better than rogpeppe_ and rogpeppe__, i thought
<rogpeppe> fwereade_: that too
<TheMue> fwereade_: no, the implementation is by me. niemeyer and i talked about moving it to presence yesterday.
<TheMue> rogpeppe: and why not only one?
<rogpeppe> TheMue: because my irc client sometimes reconnects when the irc server already thinks i'm connected, so it has to choose a different one
<TheMue> rogpeppe: ic
<TheMue> rogpeppe: thankfully it's seldom here. and short time after reconnect i get my nick back automatically
<TheMue> rogpeppe: irc is sometimes really unstable, yep
<rogpeppe> fwereade_: i think we should let presence.WaitAlive through, despite misgivings. it can go later if we find it's never used.
<fwereade_> rogpeppe, TheMue: I'm ok with that
<TheMue> rog, fwereade_: hehe, and when i use it in Unit and Machine i'll get the next comments by you? *LOL*
<rogpeppe> TheMue: i'll be interested to see what value you choose for the timeout :-)
<fwereade_> TheMue, well, yes, because IMO if you're using timeouts by default then you're using ZK wrong
 * TheMue again wonders why the timeout hasn't been a topic before?
<fwereade_> TheMue, and I don't think you can come up with a timeout value that is a 100% accurate indicator of "something is wrong" as opposed to "ec2 is taking ages, whaddayagonnado?"
<fwereade_> TheMue, I'm pretty certain I have already talked with you about the very small set of clients for watch_agent, and their very limited use cases
<TheMue> fwereade_: but not about the timeout
<rogpeppe> yeah, i don't think we've really discussed timeouts before
<fwereade_> TheMue, no; when I said that it was all unnecessary, and the presence package covers all the use cases already, that's what I meant
<TheMue> fwereade_: i'm not good at intention reading ;)
<fwereade_> TheMue, nor does it have a custom memory allocator, because it doesn't need that either ;p
<fwereade_> TheMue, I think I honestly did try to direct you to the places you needed to see to understand the use cases
<TheMue> fwereade_: indeed
<fwereade_> TheMue, the fact that all user interactions must block forever is I suspect the crucial bit of floating context that you were missing
<fwereade_> TheMue, I know that only because I was around when it was discussed and changed to that behaviour
<fwereade_> TheMue, let's put it this way
<TheMue> fwereade_: exactly, this "block forever" is hard to get. i know different behaviors from those systems i've done in the past.
<fwereade_> TheMue, sorry I forgot what I was going to say
<TheMue> fwereade_: hehe
<fwereade_> TheMue, really it comes down to trusting ZK
<fwereade_> TheMue, if we do, we should generally assume that the events will land according to ZK's limited guarantees; and if we don't, we should panic
<TheMue> fwereade_: a software written in java? no, never! *scnr*
<fwereade_> TheMue, haha :)
 * TheMue has done JEE for > 7 yrs, it has been no (!) good time
<fwereade_> TheMue, I can imagine
<fwereade_> rogpeppe, TheMue: lunchtime :)
<TheMue> fwereade_: enjoy
<rogpeppe> fwereade_: likewise
<rogpeppe> TheMue, fwereade_: i'm off to get the train down to london now. will probably be incommunicado tomorrow morning too. see you tomorrow!
<TheMue> rogpeppe: have fun, take videos, copy slides, publish everything. ;)
<hazmat> fwereade_, ping
<hazmat> fwereade_, i'm a little concerned about the ambiguity around the ec2-instance-type arch.. esp as it's the most common way people will use constraints
<hazmat> on ec2
<fwereade_> hazmat, pong
<fwereade_> hazmat, there was a quiet discussion on the lists about how a 64-bit default would be a good idea, let's do it
<niemeyer> Mornings
<fwereade_> hazmat, is there some other ambiguity?
<hazmat> niemeyer, mornings
<fwereade_> heya niemeyer
<hazmat> fwereade_, ah right, yeah.. given 64bit on all types its not much of a concern
<fwereade_> hazmat, I can accept that it's a bit annoying not to be able to type "ec2-instance-type=m1.medium arch=i386"
<fwereade_> hazmat, but there are what, 4 arch-choice types
<fwereade_> hazmat, t1.micro is a bit of a joke really
<fwereade_> hazmat, m1.small is the default in many people's minds, and is what you get if you specify bare "arch=i386"
<fwereade_> hazmat, I think the proportion of our user base who specifically need 32-bit m1.medium, c1.medium, and t1.micro may have to bear it
<hazmat> fwereade_, fair enough
<fwereade_> hazmat, in fact, they get to experience the awesome productivity shortcut of typing fewer characters in total! ("ec2-instance-type" is a bit of a mouthful...)
<fwereade_> hazmat, niemeyer: of more concern is the HVM image thing
<niemeyer> fwereade_: hm?
<hazmat> fwereade_, people will choose the explicit when possible
<hazmat> fwereade_ are there ubuntu images for hvm?
<fwereade_> hazmat, niemeyer: I misread the EC2 information back in the day and somehow got the impression that hvm images were a nice improvement on cluster machines, not a hard requirement
<fwereade_> hazmat, I think so, just a mo
<fwereade_> hazmat, http://uec-images.ubuntu.com/query/oneiric/server/released.current.txt
<fwereade_> hazmat, oneiric	server	release	20120222	ebs	amd64	us-east-1	ami-beba68d7			hvm
<hazmat> fwereade_, it looks like it's only in us-east-1
<fwereade_> hazmat, just the one, but hopefully that's good enough
<fwereade_> hazmat, cluster instances are only available in us-east-1
<fwereade_> hazmat, it does mean that get_image_id and get_instance_type are no longer independent, but they were already slightly uncomfortably linked
<fwereade_> hazmat, if I manage to do that quickly this afternoon, would you try to review it in your afternoon?
<hazmat> fwereade_,  cool, that makes more sense then, as for the image_id/instance_type.. that's fine.
<fwereade_> hazmat, I also wanted to ask about your branch
<hazmat> fwereade_, i could, but i'm wondering if its worth the trouble
<hazmat> re cc larges
<hazmat> it would be nice i guess
<fwereade_> hazmat, I'd rather spend a couple of hours to make it not be *guaranteed* to break
<fwereade_> hazmat, it's just way too shoddy to put up with
<hazmat> fwereade_, if you're up for it, it would be nice to round out our ec2 constraints support with support for the biggest baddest vm on the block ;-)
<fwereade_> hazmat, exactly :)
<fwereade_> hazmat, I already need to check that field anyway, so I don't *accidentally* get an hvm image that won't work with normal instances ;)
<niemeyer> jimbaker: You've got a review on https://codereview.appspot.com/5836049
<hazmat> fwereade_, the series from charm branch?
<fwereade_> hazmat, yeah
<fwereade_> hazmat, just let me find it
<hazmat> fwereade_, https://codereview.appspot.com/5845073/
<fwereade_> hazmat, AFAICT it's a no-op
<niemeyer> fwereade_: So, what's the deal about 386 vs. amd64?
<fwereade_> niemeyer, you can choose arch on more instance types than just t1.micro now
<hazmat> fwereade_, hmm. yeah. with_constraints returns the new constraint..
<fwereade_> niemeyer, for that, I was already defaulting to 64-bit
<niemeyer> fwereade_: Ok, so you were just wondering if it was fine to default to amd64 to all?
<fwereade_> hazmat, and the series certainly is baked into the constraints; it just happens not to be done at that point
<fwereade_> niemeyer, I think we discussed that on the lists; seemed to get a muted "yeah, sounds good" sort of response
<niemeyer> fwereade_: Yeah, it certainly sounds good to me
<niemeyer> fwereade_: Is there any other contentious point?
<fwereade_> niemeyer, it's a little ungainly specifying 32-bit instances other than m1.smalls, but I think that's an acceptable price
<fwereade_> niemeyer, you'd have to ask for "arch=i386 cpu=5" instead of "arch=i386 ec2-instance-type=c1.medium"
<hazmat> fwereade_, hmm.. ic, its part of the service state api
<hazmat> k, i'll yank that branch
<niemeyer> fwereade_: Uh, why?
<fwereade_> hazmat, yeah: the service knows the charm, the series isn't *important* until you've got an actual unit that needs to find a machine
<fwereade_> niemeyer, because of the overlapping behaviour that I think we agreed on
<fwereade_> niemeyer, I could just as easily decouple arch from ec2-instance-type
<fwereade_> niemeyer, but that opens us up to "arch=i386 ec2-instance-type=c1.xlarge" which is still a nonsensical request
<niemeyer> fwereade_: If there is a possibility of selecting the architecture, it means that there isn't an overlap
<fwereade_> niemeyer, true
<fwereade_> niemeyer, yep, that's what I should do then
<fwereade_> niemeyer, when it was just t1.micro defaulting to 64-bit seemed to be sensible
<niemeyer> fwereade_: Yeah, t1.micro is ridiculous enough that it doesn't matter much indeed
<niemeyer> fwereade_: Please feel free to not fix it now, though
<niemeyer> fwereade_: I don't know what stage this is, and it feels like a minor bug that could be filed on Launchpad and wait until someone has time to fix it
<fwereade_> niemeyer, it should be no more than 2 lines of core code to fix the conflict and make arch explicitly default to amd64
<niemeyer> fwereade_: Ah, ok, sounds less painful than filing a bug even! :)
<fwereade_> niemeyer, there will be tests to fix ofc, but it really will be cheap :)
<fwereade_> niemeyer, and it's a semantic bug really which always feels worse to let out into the wild
<niemeyer> fwereade_: Superb
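A hypothetical sketch of the small fix fwereade_ describes (the i386-capable type set is taken from the instance types mentioned above; none of this is the actual juju code): default arch to amd64, and reject nonsensical pairs like arch=i386 with c1.xlarge.

    package main

    import "fmt"

    // i386-capable EC2 instance types as discussed above; assumed, not
    // verified against the juju sources.
    var i386Types = map[string]bool{
        "t1.micro": true, "m1.small": true, "m1.medium": true, "c1.medium": true,
    }

    // resolveArch decouples arch from ec2-instance-type: arch defaults to
    // amd64, and an i386 request is rejected for 64-bit-only types.
    func resolveArch(arch, instanceType string) (string, error) {
        if arch == "" {
            arch = "amd64"
        }
        if arch == "i386" && instanceType != "" && !i386Types[instanceType] {
            return "", fmt.Errorf("instance type %s does not support i386", instanceType)
        }
        return arch, nil
    }

    func main() {
        fmt.Println(resolveArch("", "m1.large"))      // amd64 <nil>
        fmt.Println(resolveArch("i386", "c1.xlarge")) // error
    }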
<fwereade_> niemeyer, the other question that perhaps you missed is about cluster instances
<niemeyer> fwereade_: Yeah, I saw it, but it wasn't clear to me what it was about
<fwereade_> niemeyer, cluster instances require HVM images
<niemeyer> fwereade_: They need a specific image, I suppose
<fwereade_> niemeyer, I mistakenly though it was an option rather than a requirement
<fwereade_> niemeyer, it's not quite so cheap to fix but it's still less than an afternoon's work all told
<niemeyer> fwereade_: That's going to be interesting.. it means that getting an image has to cross over the charm URL and the constraints
<niemeyer> fwereade_: I'd not fix that now..
<fwereade_> niemeyer, they're combined at unit deploy time by the service
<fwereade_> niemeyer, series is a constraint, just not one exposed directly to the user, because we know we'll get it from the charm
<niemeyer> fwereade_: We can wait until someone reports a bug about it.. I suspect it will take a while, perhaps enough for it to be fixed already
<fwereade_> niemeyer, I'm not really confident pressing too far on with the go code until I have a couple of reviews for the early steps... can I please take a little time to polish it up until I have a couple of reviews in?
<niemeyer> fwereade_: Sure.. I was actually going to ask you about these branches in review
<fwereade_> niemeyer, cool
<niemeyer> fwereade_: In one of them you seem to have reverted back to kill the whole idea?
<fwereade_> niemeyer, the hook context one?
<niemeyer> fwereade_: Yeah
<fwereade_> niemeyer, it became clear after sleeping on our discussion that it is *not* a *hook* context
<fwereade_> niemeyer, it's a tool context
<fwereade_> niemeyer, which happens to be useful for hooks
<fwereade_> niemeyer, the hook package itself became unjustifiable; but it fits really nicely IMO as one of a number of things in a cmd/jujuc/server package
<fwereade_> niemeyer, jujuc is the dumb main.main that calls the server package
<fwereade_> niemeyer, the server package contains the various tools implemented as Command, and the RPC server that executes them
<fwereade_> niemeyer, it's a good place for that particular set of, er, synergistic functionality
<niemeyer> fwereade_: Sounds reasonable
<fwereade_> niemeyer, and it'll be as convenient as anything to use once we have an agent far enough along that it's *actually* needing to run hooks
<niemeyer> fwereade_: +1
<fwereade_> niemeyer, sweet
<fwereade_> niemeyer, well, that's the point of that change :)
<niemeyer> fwereade_: Cool, thanks
<fwereade_> niemeyer, as for the others: go-add-cmd-context you looked at once and seemed generally happy with, but there may be contention on FlagSet output
<fwereade_> niemeyer, go-tweak-supercommand sits on that and is pretty trivial really
<fwereade_> niemeyer, and go-testing-charm has I think been fixed
<niemeyer> fwereade_: I'll have a pass at it again as my next task right away
<fwereade_> niemeyer, awesome, thanks
<niemeyer> fwereade_: np, sorry for the delay.. I should probably have reviewed these before other things I've been reviewing
<niemeyer> fwereade_: Btw, this change probably deserves a better summary
<fwereade_> niemeyer, it's ok, I haven't been entirely blocked, there's always something worth my time :)
<niemeyer> fwereade_: It's saying "remove hook package", but it's actually pushing things forward
<fwereade_> niemeyer, haha, I hope the second part is generally assumed to be my intent ;)
<niemeyer> fwereade_: right :)
<jimbaker> niemeyer, thanks
<rogpeppe> niemeyer: morning!
<niemeyer> rogpeppe: Heya
<niemeyer> jimbaker: np
<rogpeppe> hmm, not surprising the connection is unreliable on the train...
<niemeyer> fwereade_: I see I didn't actually review the test files in the branch. I'll have a quick pass at that post LGTM
<niemeyer> fwereade_: Cool, delivered
<niemeyer> fwereade_: LGTM still
<niemeyer> fwereade_: Thanks for the ride through that branch :)
<fwereade_> niemeyer, excellent
<fwereade_> niemeyer, and thank you, it was fun :)
<rogpeppe> launching ec2 instances from on the train feels kinda cool
<TheMue> fwereade_: split WaitAlive() into two variants, so i can use the one w/o timeout in Unit and Machine but we keep the one with timeout. if it never gets used it can be removed.
<fwereade_> TheMue, cool
<niemeyer> TheMue: Wait, what?
<niemeyer> TheMue: Why do we need two variants?
<niemeyer> TheMue: You actually needed the timeout, right?
<niemeyer> TheMue: There's no other use of this logic today.. if we're finding it incorrect, let's fix the one we have
<TheMue> niemeyer: this morning fwereade_ convinced me that we don't need a timeout
<niemeyer> TheMue: Ok, I'm not disputing either way.. let's just make the one function we have suit our use case
<TheMue> niemeyer: yes, i've done it that way and currently integrate it. so it behaves as we have it today.
<niemeyer> TheMue: Awesome, thanks
<TheMue> niemeyer: additionally i kept a version with timeout based on the one w/o. so if we ever use presence nodes and need timeouts, we have it. i could also remove it, but i think the situation will come. ;)
<niemeyer> TheMue: Reading it..
<niemeyer> TheMue: Reviewed
<niemeyer> TheMue: Let's drop the function we don't need.. we can merge the timeout on WaitAlive itself when we need it (if ever)
<TheMue> niemeyer: ok *sigh*
<niemeyer> TheMue: Yeah, sorry for coercing you into removing code you don't need..
<TheMue> niemeyer: it's just that i still have a problem with systems waiting endlessly. that's based on my experiences with other systems. it indeed seems to be ok for juju, but it will take a while until i'm feeling happy with it
<niemeyer> TheMue: That's a different problem that I'm not trying to convince you on
<niemeyer> TheMue: I'm happy to have a timeout, and it may indeed be the best thing.. maybe even a timeout by default?
<niemeyer> TheMue: We'll see when we actually have to use this function
<TheMue> niemeyer: the test for WaitAlive() is hidden in WaitAliveTimeout(). so now i'll move that internal code into the test. ;)
<niemeyer> TheMue: What I'm unhappy about is having two functions just because we have no idea about what we need
<niemeyer> TheMue: Heh
<niemeyer> TheMue: and now you know in practice why "hidden tests" are a bad idea
<TheMue> niemeyer: yes, you're right, having timeouts also needs to know what to do if a timeout happens
<niemeyer> TheMue: I'm not against having a timeout.. we'll probably have to add one soon enough.
<TheMue> niemeyer: but you also say that today the callers of WaitAlive() can wait endlessly?
<niemeyer> TheMue: In fact, I start to wonder whether WaitAlive is even a good plan
<TheMue> niemeyer: oh
<niemeyer> TheMue: When are we going to be calling those methods on unit and machine?
<niemeyer> TheMue: Have you had a look at the current code base to get an idea?
<TheMue> niemeyer: we found two places
<TheMue> niemeyer: in juju.control.debug_hooks.py and juju.control.ssh
<TheMue> niemeyer: debug_hooks waits in a while 1 loop
<niemeyer> TheMue: Looking
<TheMue> niemeyer: and ssh.py waits for the watch
<niemeyer> TheMue: Why don't we just expose the results of AliveW?
<niemeyer> TheMue: WatchAgentAlive(...) { return presence.AliveW(...) }
<TheMue> niemeyer: can do so, yes, it's only a bit more inconvenient for the caller (handling err, alive and watch itself, which is now done in one function)
<niemeyer> TheMue: Ok.. I'm happy either way
<niemeyer> TheMue: Let's just not sprawl functions we have no use case for
<TheMue> niemeyer: ok
<TheMue> niemeyer: i just wouldn't have expected a function w/o a timeout. too much erlang influence, where receive has a timeout clause. ;)
 * wrtp thought that unit, machine, etc could just return a string path to the agent presence node. then clients could use presence on it to their hearts' desire.
<wrtp> but that's probably a bit subversive to say at this point, sorry.
<TheMue> wrtp: shut up, you're sitting on a train, no proper workplace. *rofl*
<TheMue> wrtp: seems your connection is too good ;)
<TheMue> wrtp: but as long as we're moving inside the state package it's no problem to get the path, yes. we have zkAgentPath() for it
<TheMue> wrtp: only for callers outside of state would it be difficult. we would have to make the function public.
<wrtp> lol
<wrtp> TheMue: that was my thought
<niemeyer> TheMue: Let's have a timeout there then..
<niemeyer> TheMue: Sounds totally reasonable..
<niemeyer> TheMue: WaitAlive(conn, path, timeout)
<niemeyer> TheMue: The only thing I've been saying is that we have to decide what we want, and go with it. Having multiple functions that have no use case is no good.
<TheMue> niemeyer: totally agree
 * TheMue hugs niemeyer for the timeout variant ;)
<niemeyer> :)
<hazmat> bcsaller1, ping
<wrtp> i wondered why environs tests were taking 45s to run locally - then realised that FindImageSpec does an http request to uec-images.ubuntu.com several times. perhaps i should allow tests to run connectionless somehow.
<hazmat> bcsaller1, whenever you're up... looks like a late night. i was checking out the subordinates
<niemeyer> wrtp: Definitely
<niemeyer> wrtp: The Python tests have a dump of the content locally to check that out
 * niemeyer => lunch!
<hazmat> bcsaller1, there's a missing yield in process new relations around subordinate deploy.. and the other problem is that it looks like it ends up attempting nested containers
<hazmat> because unit deployer picks up the provider type local and tries to use containers, which we don't want for subordinates
<fwereade_> allenap, ping
<wrtp> niemeyer: we also have a local dump of the context to allow that (it's used to test FindImageSpec), but i'm not quite sure what the best way is to divert only image requests. perhaps have a variable holding the address of the image server, and change it for local tests. seems a pity to pollute the real code for testing, but maybe worth it.
<hazmat> bcsaller1, nevermind, it is picking up the right deployer
<fwereade_> hazmat, niemeyer: SpamapS makes a good point on the lists about default-instance-type, default-image-id
<fwereade_> hazmat, niemeyer: namely that changing them at this stage will justifiably piss people off
<fwereade_> hazmat, niemeyer: and that we should probably allow them but print deprecation warnings :((
<fwereade_> hazmat, niemeyer: thoughts?
<hazmat> fwereade_, he does make a good point, deprecation warnings, and using them still sounds reasonable, but it seems less of a debt to see that folded into defaults for environment constraints rather than remaining as global overrides
<fwereade_> hazmat, crap, good point, we *have* to break environments.yaml anyway
<fwereade_> hazmat, so we should at least make sure to do it only once
<fwereade_> hazmat, when can we expect that change?
<hazmat> fwereade_, not sure
<hazmat> i wish i could get out of my juju talk this evening
<hazmat> i'll push forward on the specs
<hazmat> fwereade_, btw thanks for the reviews
<fwereade_> hazmat, cool, thanks
<fwereade_> hazmat, a pleasure, I still haven't really figured out what I'm thinking re GC
<fwereade_> hazmat, I have an incoherent draft email gathering dust :/
<hazmat> fwereade_, for the most part its just watching the topology and cleaning out non referenced things, along with recorded actions that have been completed
<fwereade_> hazmat, yeah, but the devil is in the details
<hazmat> fwereade_, always :-)
<fwereade_> hazmat, unit relations still referenced by other units need to survive until all other units have acked their departure
<hazmat> fwereade_, speaking of which if you have a moment, i was hoping to get a review of https://code.launchpad.net/~hazmat/juju/scheduler-peek-list/+merge/98104 from either you or niemeyer.. i basically rewrote the relation hook scheduler, its much simpler and more robust now
<fwereade_> hazmat, otherwise we'll get missing data in relation hooks on the other units
<hazmat> and much better tested
<fwereade_> hazmat, cool, I'll take a look
<hazmat> fwereade_, yeah.. relations don't get cleaned up till they're not referenced in the topology, which basically happens when either service endpoint leaves the rel
<hazmat> fwereade_, thanks
<hazmat> bcsaller1, can a subordinate open a port on the container?
<SpamapS> Hey, I was trying to look at how maas calculates unit addresses, but I don't see it listed in juju.unit.address.get_unit_address .. also.. that if/elif chain seems really wrong.. this method should be moved to the providers themselves.
<fwereade_> hazmat, https://codereview.appspot.com/5841080
<bcsaller1> hazmat: they return information from their container when asked about networking. There might need to be a change in expose but really it's the container service doing the work at that point. I hadn't considered whether the implications of that are confusing in status wrt expose
<fwereade_> SpamapS, concur
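A rough sketch of the refactor SpamapS is suggesting above: replace the if/elif chain on provider type in juju.unit.address with a method each provider implements itself. All class and attribute names below are hypothetical illustrations, not the actual juju API:

    class MachineProvider(object):
        """Base class; each provider supplies its own address lookup."""
        def get_unit_address(self, unit):
            raise NotImplementedError("provider must implement address lookup")

    class EC2Provider(MachineProvider):
        def get_unit_address(self, unit):
            # On EC2 the unit is reachable via its instance's public DNS
            # name (hypothetical accessor).
            return unit.instance.public_dns_name

    class MAASProvider(MachineProvider):
        def get_unit_address(self, unit):
            # A maas provider would instead ask the MAAS API for the
            # node's hostname (hypothetical accessor).
            return unit.node.hostname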
<fwereade_> hazmat, re unit relations: surely "leave them until the service endpoint dies" is problematic... it means that the number of unit relations for a long-lived service with lots of occasional clients will just grow and grow forever
<hazmat> fwereade_, why? if the client leaves the relation is broken
<hazmat> either the relation exists between the services, or it doesn't; if one side leaves it, it can be gc'd
<fwereade_> hazmat, but it can't, can it? not until we know that the units on the other side have run the appropriate broken/departed hooks
<fwereade_> hazmat, or possibly have been GCed themselves
<fwereade_> hazmat, if we don't wait for that we'll have hooks falling over needlessly
<fwereade_> hazmat, maybe it doesn't matter if we know they're going to die soon anyway
<fwereade_> hazmat, but I don't like the idea that a hook could fail to get state that exists according to its view of the world (ie the output of relation-list)
<fwereade_> hazmat, actually, wait, I'm going to WIP that and reinstate the default-image-id and default-instance-type stuff
<wrtp> right, train just arriving.
<fwereade_> hazmat, SpamapS: wait again, I think we should talk about that
<fwereade_> hazmat, SpamapS: actually, no, I think it's OK: we can just do a deprecation warning saying that using this field silently stomps over any constraints you may set at a later date
<hazmat> fwereade_, they'll need to remove their presence nodes in the rel
<hazmat> empty role containers is probably the threshold for gc
<fwereade_> hazmat, something will need to, but if an agent itself becomes unresponsive before removal then it won't be able to clean out its own presence nodes
<fwereade_> hazmat, sorry not presence nodes
<hazmat> fwereade_, then its session ephemeral/pinger will expire naturally
<fwereade_> hazmat, I misspoke, presence nodes are not relevant
<fwereade_> hazmat, I meant settings nodes
<fwereade_> hazmat, that's what other units may still be looking at for an arbitrarily long time
<fwereade_> hazmat, we can't delete those until we know nobody will be looking at them any more
<fwereade_> hazmat, a working agent can register disinterest itself, that's fine
<hazmat> fwereade_, the settings nodes are gc material not coordination.. the presence nodes are for coordination, if they're dead and the rel has been marked for removal..
<fwereade_> hazmat, a GC/machine agent that cleans up after a unit that won't die cleanly will have to clear those out itself
<fwereade_> hazmat, the trouble is that a lack of presence doesn't *necessarily* imply that the unit agent won't come back, does it?
<fwereade_> hazmat, ...but if it also doesn't exist in the topology it's a pretty safe bet
<fwereade_> hazmat, hmm
<fwereade_> hazmat, and it's easy if the rel has been marked for removal
<hazmat> fwereade_, two scenarios, clean exit from unit relation by running unit, stalled exit by badly behaving unit, the former is easy
<niemeyer> TheMue: LGTM, thanks!
<fwereade_> hazmat, agreed
<hazmat> so on the latter what does it see when it eventually comes alive.. or is terminated
<hazmat> we could have clean stop kill the settings node
<hazmat> and then only gc the structure, and on bad unit, its broken rel context has the required local data
<hazmat> from its settings node
<fwereade_> hazmat, but surely other units can be behind? ie still running a relation-changed hook that requires that unit's settings
<fwereade_> hazmat, if the unit clears out its own nodes it still has to wait for other units
<fwereade_> hazmat, regardless of where the functionality lies, *something* has to keep unit relation settings around until it's safe to delete them
<hazmat> fwereade_, true it has to wait for the stop of its execution
<fwereade_> hazmat, no, *other* units' hook execution
<hazmat> er. hook
<hazmat> fwereade_, ick ;-)
<fwereade_> hazmat, that's the problem
<hazmat> fwereade_, we can start with a much more pessimistic base version of gc
<hazmat> ie both endpoint services removed
<hazmat> and by removed, i mean the services are destroyed
<fwereade_> hazmat, true enough, nobody said this has to be the final version
<fwereade_> hazmat, yeah, service destruction is easy
<fwereade_> hazmat, I'm much more comfortable ignoring potential hook errors when *everything* related is going down as well ;)
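A minimal sketch of the pessimistic first-cut GC rule hazmat proposes above: collect a relation's state only once both endpoint services have been destroyed. The topology layout here is invented for illustration and is not juju's real format:

    def can_gc_relation(topology, relation_id):
        """Conservative rule: collect relation state only when both
        endpoint services are gone from the topology."""
        endpoints = topology["relations"][relation_id]["endpoints"]
        return all(sid not in topology["services"] for sid in endpoints)

    # The relation survives while either endpoint service still exists.
    topology = {
        "services": {"s-mysql": {}},
        "relations": {"r-1": {"endpoints": ["s-mysql", "s-wordpress"]}},
    }
    assert not can_gc_relation(topology, "r-1")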
<fwereade_> SpamapS, hazmat, niemeyer: this is interesting, default-ami doesn't work
<fwereade_> SpamapS, hazmat, niemeyer: this implies to me that we can basically just drop it without warning
<fwereade_> SpamapS, hazmat, niemeyer: the ec2 provider was looking for default-image-id instead, and therefore never finding it :/
<niemeyer> fwereade_: Nice, that makes things easier :)
<fwereade_> niemeyer, ok, I'll just drop default-ami
<fwereade_> niemeyer, is there a specific way I should be doing a deprecation warning?
<fwereade_> niemeyer, I thought we had some already but I can't find them
<SpamapS> fwereade_: default-ami is not valid. its always been default-image-id
<niemeyer> fwereade_: If it doesn't work, there's no need to warn :)
<fwereade_> SpamapS, are you 100% sure? it was default-ami in config back in july, and it's default-ami now
<fwereade_> SpamapS, I suspect that in fact nobody has ever used it
<hazmat> fwereade_, people have definitely used it, but with default-image-id
<hazmat> fwereade_, that's whats documented
<fwereade_> SpamapS, niemeyer: oh, wait: does config accept other random keys? if it does that would make sense
<hazmat> fwereade_, it does; the validation is only against known keys
<niemeyer> fwereade_: default-image-id was always a hack, though.. I'd just send a polite message to the list pointing that the hack is being replaced by the real implementation
<hazmat> fwereade_, SpamapS has a branch out there to fix it, but i think the notion was that it was going away...
<hazmat> the schema validation of it that is
<fwereade_> hazmat, niemeyer, SpamapS: I see, thank you all
<fwereade_> niemeyer, SpamapS's point is that dumping *any* enforced change on our users at this point is a bad thing
<fwereade_> niemeyer, and I think he's right; and furthermore, the global env change will change environments.yaml as well
<fwereade_> niemeyer, I don't think it's sensible to land this and then make everyone change *again* imminently
<niemeyer> fwereade_: This isn't just "any change". This is a bug in juju that I've been pointing out since that flag was first introduced.
<niemeyer> fwereade_: It *has* to be fixed, or juju won't work
<fwereade_> niemeyer, yes: however much people hate it, we *must* land at least one environments.yaml change
<niemeyer> Also, https://code.launchpad.net/~clint-fewbar/juju/remove-default-ami/+merge/71278
<fwereade_> niemeyer, however, dumping *multiple* changes, over a few days, at this stage in the cycle, is surely going to lead to calls for our heads on pikes etc
<niemeyer> This was the fix to the schema, that never landed..
<fwereade_> niemeyer, the fix to the schema that was deferred until we had a replacement -- constraints -- which then sat in review purgatory for, what, 2 months
<niemeyer> fwereade_: I don't have anything positive to say about that, I'm afraid..
<fwereade_> niemeyer, in hindsight I should clearly have been making a big stink about it myself
<fwereade_> niemeyer, but, well, I didn't :(. and we have to deal with the situation as we find it in addition to figuring out how we can avoid blundering into it again in the future
<fwereade_> niemeyer, I think that we have left it too late to break them, but we can discourage their usage by complaining every time they're used
<fwereade_> niemeyer, however... if I apply this same argument to the global-settings change... I get worried
<fwereade_> niemeyer, because I think it *does* still apply
<niemeyer> fwereade_: default-image-id must die, now.. there's no way to implement support for multiple distributions while supporting it
<fwereade_> niemeyer, with default-series as well, you get something close enough: you just have to make sure the series matches the image
<fwereade_> niemeyer, if I could be sure nobody was using it I would 100% agree
<niemeyer> fwereade_: If someone is depending on it, they'll need to tell us, and we'll need to find something else
<niemeyer> fwereade_: default-image-id can't be supported as it is.
<niemeyer> fwereade_: We can talk about how to avoid introducing changes late in the cycle, we can talk about how to prioritize tasks appropriately over a cycle, but these are unrelated to the simple fact this has to be fixed.
<fwereade_> niemeyer, would it not be sensible to add a deprecation warning and kill it for 12.04.1?
<fwereade_> niemeyer, which robbiew tells me is the actual target for constraints
<niemeyer> fwereade_: It's not about constraints.. it's about series. If I deploy cs:~fwereade/precise/foobar
<niemeyer> fwereade_: foobar is in precise
<niemeyer> fwereade_: default-image-id is broken, and has always been. I was against supporting it at first, and now it's time to kill it and implement the real thing.
<fwereade_> niemeyer, I have no argument against "it must be fixed", and "it must die"
<niemeyer> fwereade_: At least you haven't brought them up so far. :)
<niemeyer> fwereade_: I can buy arguments like "we need to allow people to customize image selection"
<niemeyer> fwereade_: I can't take "let's continue supporting default-image-id" as it is, because it's a relevant bug to have it.
<fwereade_> niemeyer, nah, I *really* don't want anyone doing that
<fwereade_> niemeyer, I'm suggesting we leave it in with a suitable dire deprecation warning now, and actually remove it in the release 3 months down the line
<niemeyer> fwereade_: default-image-id = "ami-for-oneiric", deploy cs:~fwereade/precise/foobar, BOOM!
<fwereade_> niemeyer, yeah, and that's bad, no argument
<hazmat> well more than a deprecation warning
<hazmat> we can issue the error message with the solution, and just refuse to honor it
<robbiew> +1
<fwereade_> hazmat: so, a specific check during env parsing for those 2 keys, and immediate explosion saying "you must now use constraints"
<niemeyer> fwereade_: That's my opinion, though. To be honest, this will bite hazmat and SpamapS more than me, so I'm leaving it up for them to decide what's the best approach.
<hazmat> fwereade_, i'd keep it context relevant, to deploy/add-unit
<fwereade_> hazmat, hmm, ok
<hazmat> hmm.. and bootstrap
<fwereade_> hazmat, tbh it still feels like too sudden and painful a change for anyone depending on it
<niemeyer> hazmat: That's a brand new feature being developed. -1 on deciding/supporting that at this stage in the cycle.
<hazmat> fwereade_, i don't think people are depending on it.. it was mostly used as a hack for lack of arch support
<fwereade_> hazmat, if we can be reasonably sure that's the case, then great
<fwereade_> hazmat, in that case I'm happy saying "tough, use constraints"
<hazmat> fwereade_, default-instance-type is much more commonly used
<niemeyer> There's a collection of opinions on this here:
<niemeyer> https://bugs.launchpad.net/juju/+bug/830995
<fwereade_> hazmat, but I kinda feel it should happen at env-parsing time
<niemeyer> "default-instance-type and default-image-id are inadequate" - 2011-08-22
<fwereade_> hazmat, if we're going to force immediate action then we should probably just force immediate action
<niemeyer> "I currently use this in development and testing of formulas in different ubuntu versions"
<niemeyer> -- Juan
<niemeyer> "Often need stacks with different types of machines deployed for different services."
<niemeyer> -- Mark
<hazmat> fwereade_, i'm just thinking about the existing environments, nothing to do unless you actually conflict
<hazmat> all of which are addressed by constraints
<niemeyer> etc .. etc..
<niemeyer> Right.. they're all claiming for constraints.
<niemeyer> And series selection
<niemeyer> Which is done via charm URL
<niemeyer> We're giving them what they want, not what they've asked for..
<hazmat> fwereade_, but parse time is fine.
<fwereade_> hazmat, OK
<fwereade_> hazmat, now wrt breaking environments.yaml *twice*
<fwereade_> hazmat, I really think we should avoid doing that
<fwereade_> hazmat, so I think we have to coordinate it with the env-settings change
<hazmat> fwereade_, the env-settings change is mostly related to storage and syncing.. with the removal of image-id, instance-type, and default-series.. what remains in env.yaml is probably the same
<hazmat> ie. its a behavioral change not a structural config change
<hazmat> although one more env.yaml key that comes to mind is apt-proxy-url
<fwereade_> hazmat, IIRC niemeyer was keen that we remove non-access settings from env.yaml
<hazmat> fwereade_, apt-proxy-url is an access setting
<hazmat> in envs that need it, we can't even bootstrap correctly without it
<fwereade_> hazmat, huh, seems you're right, I was sure we had more stuff gunking it up
<fwereade_> hazmat, I wasn't thinking of apt-proxy, I was thinking we had other non-access ones in there
<fwereade_> hazmat, OK, so we can be sure this is the *only* breaking change we can know we'll need to introduce?
<hazmat> hmm..
<hazmat> the subordinates increment the topology version but there's a transparent migration there
<fwereade_> hazmat, apart from anything else the env change will change zookeeper storage so people will have to bounce the whole env to upgrade
<hazmat> hmm actually that won't be meaningful unless the cluster is on the same code rev.. ie. it would still be problematic
<fwereade_> hazmat, (I think?)
<hazmat> fwereade_, absent consistent code versions, yeah.
<hazmat> we can migrate state, but unless everything deployed understands the newer version it doesn't really resolve anything
<fwereade_> hazmat, so, OK, we do have 2 inevitable breaking changes
<fwereade_> hazmat, I think it's very important that we reduce that to 1
<fwereade_> hazmat, opinion?
<fwereade_> SpamapS, hazmat, niemeyer: just to confirm, btw: default-instance-type should be treated in exactly the same way we do default-image-id; right?
<hazmat> fwereade_, yes imo.
<fwereade_> SpamapS, negronjl: in light of your perspective, and beyond making sure there is only one of them: is there any way we can mitigate the impact of a necessary breakage?
<fwereade_> hazmat, I have been speaking to niemeyer: consider me at your disposal for the next couple of weeks
<fwereade_> hazmat, how can I have the most impact?
<fwereade_> hazmat, I *want* to do constraints-get and unit constraints as part of it but I'm not sure they're our highest priority
<hazmat> fwereade_, awesome!
<hazmat> fwereade_, wrt to constraints.. + env-constraints
<fwereade_> hazmat, yep, the environment change seems to me to be the most critical thing
<fwereade_> hazmat, once we have that we can add env constraints
<hazmat> fwereade_, that sounds like a good place to start, env-constraints and settings changes
<hazmat> we should land those soon
<fwereade_> hazmat, yep, landing them is the big one
<hazmat> i'm going to work with bcsaller on subordinates, i've been testing them last night and today, i see a few problems that need resolving
<fwereade_> hazmat, ok, I will aim for a series of branches that culminate in env-constraints, along with complete removal of default-image-id and default-instance-type
<hazmat> fwereade_, awesome thanks
<fwereade_> hazmat, I think that I should push my current branch; with dii/dit printing loud deprecation warnings, but continuing to work as before (and completely overriding constraints)
<fwereade_> hazmat, people on the bleeding edge thereby get a few days' warning, and they can start using constraints as soon as they cut the cord there
<hazmat> fwereade_, that sounds reasonable, what's the default for the bootstrap node?
<hazmat> oh.. it's going to honor them still
<fwereade_> hazmat, same as everything: effectively an m1.small
<hazmat> fwereade_, for series?
<fwereade_> hazmat, it's actually quite easy to keep them working
<fwereade_> hazmat, IIRC it's taken from default-series
<fwereade_> hazmat, hmmm
<fwereade_> hazmat, that's not an access setting
<hazmat> fwereade_, right.. but that's one of the things that's going away..
<fwereade_> hazmat, I know there was something
<fwereade_> hazmat, oh!
<hazmat> and keep in mind osx clients, we can't always take the series from the current host
<hazmat> we'll need to pass it on the cli for bootstrap i think
<fwereade_> hazmat, what are we going to use instead?
<hazmat> or use the latest current
<fwereade_> hazmat, required?
<fwereade_> hazmat, env constraints will need to be settable at bootstrap time
<fwereade_> hazmat, as will default series
<hazmat> fwereade_, i'd suggest required in the absence of host inspection, but that still feels implicit
<hazmat> SpamapS, any thoughts wrt to bootstrap node release series specification?
<fwereade_> hazmat, I kinda feel we should default to latest released series
<hazmat> fwereade_, custom introspection off the cloud images url, that is?
<fwereade_> hazmat, *however*, whatever we do, moving default-series is *another* env.yaml change
<hazmat> fwereade_, yes, and it should be part of the warnings
<fwereade_> hazmat, yep, I'll add that now
<fwereade_> hazmat, so, for now: warnings on all three default-* keys, behaviour unchanged
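A minimal sketch of that plan, assuming a hypothetical hook into env.yaml parsing (the real juju schema code differs): warn loudly on all three keys, change nothing else:

    import logging

    DEPRECATED_KEYS = ("default-image-id", "default-instance-type",
                       "default-series")

    def warn_deprecated_keys(config):
        """Behaviour stays unchanged for now; we just complain loudly
        every time one of the doomed default-* keys is parsed."""
        for key in DEPRECATED_KEYS:
            if key in config:
                logging.warning(
                    "%s is deprecated and will be removed soon; it silently "
                    "overrides any constraints you set. Switch to "
                    "constraints (or set-env, for default-series).", key)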
<fwereade_> hazmat, blast I have to go
<fwereade_> hazmat, I'll try to be on later
<hazmat> fwereade_, cheers, i'm out in 2hrs, back in 7hr
<niemeyer> hazmat: Just submitted a review for https://codereview.appspot.com/5849054/
<niemeyer> hazmat: Good stuff.. just a few details
<niemeyer> hazmat: Let me know if you want to sync up on any of the points
<niemeyer> fwereade_: This may be relevant to you as well ^
<hazmat> niemeyer, thanks
<hazmat> i've switched tracks to pulling together a presentation
<SpamapS> hazmat: was at lunch.. just read your question..
<SpamapS> IMO, all those default-* settings that are being obsoleted have to stay for a little while, and where they contradict the new way, warnings should be emitted.
<niemeyer> fwereade_: we should probably rename relation-list to relation-units in the future (keeping the former for compatibility)
<SpamapS> If somebody has asked for a default-image-id, we should honor that request, and warn them that we can't guarantee we're deploying the right charm on that image.
<niemeyer> fwereade_: We'll likely add something like relation-ids, which will make relation-list ambiguous, and potentially error prone
<SpamapS> If somebody has a default-series of oneiric, so be it. I see no reason to *remove* that setting.
<SpamapS> And for default-instance-type .. same deal.. leave it be and let it override the hard coded 'm1.small' as a default.. but let constraints override both of those.
<niemeyer> SpamapS: It will be removed, because the behavior you just described is a bug
<niemeyer> SpamapS: It doesn't have to be removed now, if you and hazmat agree that's the best approach
<niemeyer> SpamapS: But the behavior you just described is a bug, which has to be fixed
<SpamapS> I'm not asking you to keep cruft around forever. Just take on the pattern of deprecation before removal so people have time to adapt.
<niemeyer> jimbaker: https://codereview.appspot.com/5837050/ reviewed
<hazmat> SpamapS, so we can keep but deprecate instance-type, but image-id and series are more fundamentally problematic, because they're just broken.. it's not going to break an existing environment though, we're just forcing folks to update the config file.
<jimbaker> niemeyer, cool
<SpamapS> niemeyer: if the documentation says "this is deprecated don't use it" and it warns and says "This is deprecated don't use it" and the software still does what the user asked.. that's a reasonable method to fix a very sticky bug.
<SpamapS> hazmat: *that* is breaking an automated environment.
<niemeyer> SpamapS: That's the approach I'd use for a silly option that I wanted to deprecate.  I'd not do that on an option that creates a bug for scenarios I care about.
<niemeyer> SpamapS: Again, that's just my opinion, though.
<hazmat> SpamapS, forcing any user interaction, even while keeping existing apps and services running, is breaking?
<niemeyer> SpamapS: I'm happy to have you and hazmat deciding how to handle the removal
<SpamapS> hazmat: at this point, if you error out because of options that were ok before.. you are breaking automated systems.
<SpamapS> And at *least* deprecate the options in the documentation first.
<hazmat> SpamapS, done
<SpamapS> I'm probably overreacting. But I've been through this 4 or 5 times now.. backward incompatible breaks have been coming at a pretty steady rate.. and at this point, the impact causes ripples through quite a few people's workflows
<jimbaker> niemeyer, thanks for the review. it does seem to me that relation-ids should take an --interface option, as i note in my response in the mp
<jimbaker> otherwise, this looks like we are in consensus on this
<niemeyer> jimbaker: Do you have a real use case where that's relevant?
<jimbaker> niemeyer, i mention it in the accompanying keystone example
<niemeyer> jimbaker: That example seems wrong to me
<niemeyer> jimbaker: You're blindly iterating over a list of relation ids, without any mention of what these ids are actually associated with
<niemeyer> jimbaker: An interface defines a protocol
<niemeyer> jimbaker: There's generally little reason to list things based on protocol
<niemeyer> jimbaker: You'll generally want to know "where's my cache mongodb"
<niemeyer> jimbaker: Not "where's any mongodb"
<jimbaker> niemeyer, sounds good to me
<jimbaker> niemeyer, i will further update the proposal, especially the example, with this in mind
<fwereade_> SpamapS, hazmat: not sure if I can get back on properly today; but the first branch I propose will include: constraints updates; functioning default-* keys which override all constraints (so those who didn't specify them can use constraints straight off, but those who did don't have to deal with a *really* sudden behaviour change); and dire deprecation warnings of constraints incompatibility and imminent removal for all 3 default-* keys
<fwereade_> SpamapS, hazmat: I think this will be safe to land quickly and is a necessary first step
<fwereade_> SpamapS, hazmat: so ppa users who don't keep up with the lists get *some* warning
<fwereade_> SpamapS, hazmat: if you foresee any problems with the above, please let me know :)
<SpamapS> fwereade_: sounds very reasonable
<fwereade_> SpamapS, cool, thanks
<fwereade_> SpamapS, I'm sorry about all this :(
<fwereade_> SpamapS, it also crosses my mind that nobody will actually be able to start using constraints without restarting their env
<SpamapS> fwereade_: don't be! I appreciate the volume and quality of change, and I know it's not possible to cover every case every time.
<SpamapS> fwereade_: that's just the nature of the beast until we have an 'upgrade-environment' command
<fwereade_> SpamapS, but we're cutting it fine with this one, and realistically we will have at least one more release which an environment cannot live through usefully
<fwereade_> SpamapS, yeah, exactly, my feeling is that that's a critical for the next release
<fwereade_> SpamapS, shame we couldn't get to it this cycle, but so it goes
<fwereade_> SpamapS, once we have an upgrade-environment mechanism it will at least be *possible* to have non-breaking releases
<SpamapS> fwereade_: breaking releases are different from not having a feature available to you in an existing environment though.
<fwereade_> SpamapS, only slightly different, it's not like we can replace the PA
<fwereade_> SpamapS, and, blast, I have thought of a possible breakage, the client (which does get updated) could try to get constraints info for a state that was never created with it
<fwereade_> SpamapS, not hard to fix but not optional either
<fwereade_> SpamapS, anyway, sorry, gtg again
<SpamapS> fwereade_: cheers!
<niemeyer> Time for some outsiding..
<fwereade_> if anyone's around, do you recall why default-series is optional for some providers but not apparently all?
#juju-dev 2012-03-21
<flacoste> hazmat: jimbaker: bcsaller: if any of you is around, would one of you be available to help poor bigjools with the problems we are having making the maas provider work?
 * bigjools makes puppy dog eyes
 * SpamapS can try to help
<SpamapS> I doubt I can offer quite the same level of code introspection as them.. but I *did* write a provider many moons ago
<bigjools> heh
<bigjools> SpamapS: ok thank you. my problem is that the provisioning agent craps out when it tries to deploy stuff. Let me try to pastebin a log.
<bigjools> we are confused about system/instance/resource etc. IDs.
<bigjools> the current branch is here, FWIW: lp:~julian-edwards/juju/maas-system-id
<flacoste> SpamapS: thanks
<SpamapS> bigjools: I noticed that there was some stuff missing from other places in the code today too. Providers aren't nearly as self-contained as they should be
<bigjools> SpamapS: :(
<SpamapS> bigjools: pulling your branch now. What exactly can I help with?
<bigjools> SpamapS: I am trying to get my prov agent log, one sec
<SpamapS> bigjools: just an FYI, you need to add maas to juju/unit/address.py
<SpamapS> bigjools: but I do not know if that is for sure your issue. Most likely not. ;)
<bigjools> SpamapS: ok thanks
<fwereade_> bigjools, based on the error robbiew forwarded earlier, I would look askance at MaasProvider.get_machines
<bigjools> SpamapS: so here's a snippet of log: http://pastebin.ubuntu.com/893090/
<bigjools> yeah basically the same as that email
<bigjools> fwereade_: hi!
<fwereade_> bigjools, the description was basically that it was trying to shut down machines that it shouldn't even know about, right?
<fwereade_> bigjools, heyhey :)
<fwereade_> bigjools, that implies that the PA is getting more machines than it should back from provider.get_machines
<bigjools> fwereade_: well that AND the fact it was shutting down something that was not running (maas returns the 409 CONFLICT for that)
<bigjools> fwereade_: ah that is useful, thanks
<fwereade_> bigjools, it will only ever try to shut a machine down if the provider tells it it exists
<flacoste> bigjools: could it be that get_nodes() doesn't limit itself to the nodes that have been acquired?
<fwereade_> bigjools, of all those machines which exist, any which correspond to a machine state will be saved
<fwereade_> flacoste, sounds very likely
<bigjools> flacoste: yeah could be
<bigjools> I need to debug this
<fwereade_> flacoste, orchestra had to make that distinction
<bigjools> on an internet connection that has some b/w
<fwereade_> bigjools, best of luck
<SpamapS> it does look like get_machines shows all machines
<bigjools> ok I'll look into that, thanks for the advice guys!
<fwereade_> bigjools, I'm afraid I have to sleep now
<fwereade_> bigjools, a pleasure, glad to be of service :)
<SpamapS> bigjools: each of the other providers uses something to filter the list of all machines that can be seen w/ the service+creds down to just the ones pertinent to this environment
<bigjools> right
<SpamapS> bigjools: for orchestra it was mgmt class, for ec2 the group is used
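The filtering pattern SpamapS describes, sketched for a maas-style provider. The maas client call and node fields are hypothetical; the point is just that get_machines narrows "everything my credentials can see" down to this environment's nodes:

    def get_machines(maas_client, environment_name):
        """Return only the nodes pertinent to this environment, mirroring
        orchestra's mgmt-class filter and ec2's security-group filter."""
        all_nodes = maas_client.list_nodes()  # hypothetical API call
        return [node for node in all_nodes
                if node.get("acquired_by") == environment_name]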
<bigjools> I assumed it was just asking about all nodes passed
<flacoste> bigjools: so in our case, returning the nodes associated with the user is probably fine
<flacoste> although
<flacoste> we might need to introduce a new environment key
<bigjools> flacoste: how so?
<flacoste> (in case a user, wants to start two separate environments)
<flacoste> juju bootstrap; juju bootstrap
<flacoste> both with the same credentials
<SpamapS> You need something to tag each node. One maas may host many envs for many users.
<flacoste> SpamapS: we are multi-tenant already
<flacoste> but not sure multi-env per user yet
<SpamapS> the potential to break your juju environment just by manually starting a node is a bit high though
<SpamapS> if nothing else, prefix the names
<SpamapS> or allow an optional filter in the environment configuration
<SpamapS> but anyway
<bigjools> I need to understand what juju needs here, I am not sure yet.  But let me come back to that when the other stuff is working
<SpamapS> sounds like the problem is in hand
<SpamapS> Anyway, to test this hypothesis one should be able to try it out with a maas that only has juju nodes
<hazmat> bigjools, ping
 * hazmat backtracks
<bigjools> hazmat: hey
<hazmat> hi bigjools just backtracking through the logs
<hazmat> it sounds like orchestra is returning something via the api that it shouldn't
<bigjools> you mean maas?
<bigjools> yeah - we think the get_machines call is returning the wrong things, it's not filtering properly
<hazmat> bigjools, yup i mean the resource uri vs the node url seem to be two different concepts. anyways.. i'm around for a few hrs.. if you need any further debugging
<bigjools> hazmat: great, thanks!
 * bigjools relocating
<rogpeppe> mornin' all
<rogpeppe> TheMue: sorry i didn't take any pictures or record anything at all from last night!
<rogpeppe> TheMue: but it was interesting, and a good time was definitely had.
<fwereade_> heya rogpeppe
<rogpeppe> fwereade_: heya too
<fwereade_> rogpeppe, TheMue: laura's off school and cath has to go out, I will be away for a couple of hours
<roubi> hi
<rogpeppe> fwereade_: lots of tunnels just after kings cross so communication will be patchy...
<roubi> why can't i see discussions taking place in this channel?
<TheMue> rogpeppe: moin
<TheMue> roubi: which kind of discussion do you expect?
<roubi> questions that everybody asks about juju
<wrtp> fwereade_: lots of tunnels just after kings cross so communication will be patchy anyway...
<TheMue> wrtp: so you're in the train?
<wrtp> TheMue: i am
<wrtp> TheMue: again!
<TheMue> wrtp: ic ;)
<wrtp> TheMue: i said hi to Eleanor and Erik for you BTW
<TheMue> wrtp: great, thx
<wrtp> TheMue: at least... i said hi to *an* Erik!
<wrtp> TheMue: (the one that gave the talk)
<TheMue> wrtp: that's exactly the right one
<TheMue> ;)
<wrtp> TheMue: i'm not sure he remembered your name though - they do *many* startup gatherings...
<TheMue> wrtp: that's one of the problems when you've so far only had electronic contact via nicknames
<wrtp> TheMue: yeah
 * wrtp can now run the environs/ec2 tests without accessing the internet. yay!
<TheMue> wrtp: great
<fwereade_> hazmat, ping, I'm getting a horrible feeling that we need to hit the "placement" key as well
<fwereade_> hazmat, it really doesn't feel like an access setting ;) :(
<niemeyer> Heya!
<fwereade_> heya niemeyer
<fwereade_> niemeyer, I think I'm missing some context on placement policies
<fwereade_> niemeyer, why did they become env-level only, when we used to have deploy --placement?
<niemeyer> fwereade_: They may be set both in the environment and in the service, right?
<niemeyer> fwereade_: I mean, constraints can
<fwereade_> niemeyer, *constraints* can
<niemeyer> fwereade_: Yes, and that's the real placement.. --placement IIRC is a hack to allow re-use of machine 0
<fwereade_> niemeyer, it seems that *actually* placement ought to be a constraint, but not one I'd fully appreciated
<fwereade_> niemeyer, OK, this is another point of potential pain
<fwereade_> niemeyer, do we wish to retain that hack in some form?
<fwereade_> niemeyer, given that we currently have no other way to express it, my guess is "yes", but..?
<niemeyer> fwereade_: What's the potential pain?
<fwereade_> niemeyer, it *also* seems that they *cannot* be set on the service
<fwereade_> niemeyer, but perhaps I'm blind there
<fwereade_> niemeyer, that it's a non-access setting that currently lives in environments.yaml
<niemeyer> fwereade_: Placement policy sounds orthogonal to constraints and to environment settings
<niemeyer> fwereade_: Sorry, yes, not to environment settings
<fwereade_> niemeyer, yeah, local policy overrides constraints
<niemeyer> fwereade_: Why is it tricky?
<fwereade_> niemeyer, which I think matches intuitive intent
<niemeyer> fwereade_: Sounds like a normal setting that should go into set-env
<fwereade_> niemeyer, yeah; it's just something else we have to warn people about :(
<niemeyer> fwereade_: We need to warn them about only one thing
<niemeyer> fwereade_: Settings in environments.yaml are being moved onto the environment itself
<niemeyer> fwereade_: We don't need a separate warning for each setting
<fwereade_> niemeyer, this is true, but the required actions are different for different settings
<fwereade_> niemeyer, "default-series" and "placement" are both "use bootstrap/set-env"; the ec2 ones are "use constraints"
<niemeyer> fwereade_: That's fine.. we can easily document the distinction in a wiki page, and have both the email note and the error from juju pointing out to it
<niemeyer> fwereade_: We also don't have so much variation
<fwereade_> niemeyer, ok, that sounds good then
<niemeyer> fwereade_: Can we please rename placement to "placement-policy" on the way to that?
<fwereade_> niemeyer, sounds like a good idea to me
<niemeyer> fwereade_: I might even call it debug-placement-policy, but YMMV ;)
<fwereade_> niemeyer, sorry, paralysed myself wishing we'd implemented it as deploy/add-unit --force-machine=0
<fwereade_> niemeyer, I'm not quite sure that debug works myself
<fwereade_> niemeyer, but then again I'm not sure I have the best perspective on how people are using it
<niemeyer> fwereade_: This option was never intended to be visible/widely used
<niemeyer> fwereade_: Some clever people may be using it to fine tune specific tests and deployments in good ways, but it spoils much of the reasoning why we have juju
<fwereade_> niemeyer, I guess we would have got rather more blowback from making it harder to use back whenever it was that it disappeared from the unit-adding commands
<fwereade_> niemeyer, I'm not sure it does, it feels at least 50% like a reasonable temporary workaround to what remains one of juju's drawbacks
<niemeyer> fwereade_: It's not a reasonable workaround at all, IMO, except in those very specific circumstances
<fwereade_> niemeyer, I guess the question is "how do we quell the ire of those who *are* using it"
<niemeyer> fwereade_: Having everything sitting on machine 0, next to ZooKeeper, in ways that can't ever be uninstalled...
<fwereade_> niemeyer, this is true, it does suck
<niemeyer> fwereade_: Hold on.. I wasn't suggesting removing it..
<niemeyer> fwereade_: We need this behavior for the local provider case.. that's why we have it
<fwereade_> niemeyer, well, the behaviour is fine for that; you seemed to be saying it was a mistake to have it available at all on ec2
<niemeyer> fwereade_: That's not what I said.. the option is just not intended to be very visible or widely used
<fwereade_> niemeyer, anyway, ok, understood
<fwereade_> niemeyer, we just move it exactly like default-series
<niemeyer> fwereade_: It's fine to have it in EC2.. bcsaller made some very good use of that while developing some of the local provider stuff
<niemeyer> fwereade_: Since he was able to test the LXC deployments straight in EC2
<fwereade_> niemeyer, cool, sorry misunderstanding :)
<niemeyer> fwereade_: That's why I'd name it debug-placement-policy
<niemeyer> fwereade_: This causes the environment to behave in very awkward ways if one doesn't know what's going on
<niemeyer> fwereade_: But it's fine if you're really into it
<fwereade_> niemeyer, the "debug" in there implies dev-only, rather than "what you should do for openstack" for example
<fwereade_> niemeyer, which IIRC was a lot of the motivation for it initially(?)
<niemeyer> fwereade_: You shouldn't do that for openstack, unless you're Adam :)
<fwereade_> niemeyer, rabbitmq with mysql
<fwereade_> niemeyer, haha
<niemeyer> fwereade_: There's a sequence of events that requires using it in a precise way. This is much closer to "debugging" than to being a "juju feature"
<fwereade_> niemeyer, I agree it's awkward; just fretting that it may have become a de facto feature
<niemeyer> fwereade_: For Adam, it has :)
<niemeyer> SpamapS, m_3: Is anybody else using the placement hack these days?
 * rogpeppe is back on a proper internet connection again.
<niemeyer> rogpeppe: welcome back to modernity
<rogpeppe> niemeyer: actually i felt quite futuristic controlling ec2 instances from a train :-)
<niemeyer> rogpeppe: That's quite amazing indeed :)
<rogpeppe> niemeyer: managed to push out two merge requests too. it worked better when the signal was stronger, usually in a station.
<rogpeppe> niemeyer, fwereade_, TheMue: https://codereview.appspot.com/5866049/ and https://codereview.appspot.com/5864047/ if you fancy taking a look.
 * rogpeppe is very much enjoying having lbox do prerequisites properly!
<m_3> niemeyer: not sure
<m_3> doubt it
<niemeyer> m_3: Cheers
<niemeyer> fwereade_: ^
<fwereade_> niemeyer, m_3: thanks :)
<m_3> np
<niemeyer> fwereade_, hazmat: Just had another round on https://codereview.appspot.com/5849054/
<SpamapS> niemeyer: the placement hack has been unavailable to us since --placement was removed
<SpamapS> Hey question..
<SpamapS> if you remove default-image-id ... how do private clouds work?
<niemeyer> SpamapS: It's actually been available all along.. but thanks, that answers the question as well
<SpamapS> niemeyer: heh, ok.. I'll take your word on that. I never knew it was.
<niemeyer> SpamapS: Adam said it was important for his use case, so we just moved it..
<niemeyer> SpamapS: He said it was fine, as long as it was available
<niemeyer> SpamapS: The question about private clouds is a good one
<niemeyer> fwereade_, hazmat: ^
<niemeyer> SpamapS: We need a proper way to map series > image in those as well
<SpamapS> I suspect the simplest way will be to set image id on each deploy.
<SpamapS> but thats ... tedious :-P
<fwereade_> niemeyer, *that* is plausibly an env access setting, now I come to think of it
<niemeyer> fwereade_: Yeah
<fwereade_> niemeyer, and SpamapS' observation does make me all nervous about forcing automated image selection on people now
<fwereade_> (again)
<niemeyer> fwereade_: Do you have any suggestions that do not involve having "deploy cs:~fwereade/oneiric/mongodb" deploying it in Precise?
<fwereade_> niemeyer, no, I don't
<fwereade_> niemeyer, I would suggest that the people sophisticated enough to use custom image ids are demonstrably comfortable in less-comfortable environments
<niemeyer> fwereade_, SpamapS: The proper way is likely to have e.g.
<niemeyer> images:
<niemeyer>     precise: <image id>
<niemeyer>     oneiric: <image id>
<niemeyer>     ...
<fwereade_> niemeyer, arch? hvm?
<niemeyer> fwereade_: arch is an issue.. hvm doesn't exist outside of ec2
<fwereade_> niemeyer, true, but that's another way of saying "it exists"
<fwereade_> niemeyer, it'll all be against the ec2 provider...
<niemeyer> fwereade_: Yeah, you're right
<niemeyer> hmm
<fwereade_> niemeyer, IMO the cost of persisting the hack until 12.04.1 is outweighed by the benefit of offering an actual transition path for anyone who is using it
<niemeyer> fwereade_: precise: {id: <id>, constraints: <matching constraint expression>}
<niemeyer> fwereade_: ?
<fwereade_> niemeyer, hmm, that could work, but it feels kinda icky
<niemeyer> fwereade_: Forget "persisting the hack".. let's describe it in terms of having series not working
<niemeyer> fwereade_: I don't care about the hack.. I care about series not working
<fwereade_> niemeyer, ok
<fwereade_> niemeyer, what I want is for everyone who needs it to run their own local uec-images :)
<fwereade_> niemeyer, ie, same data format and url %s-able by series name
<fwereade_> niemeyer, that makes it an env access setting, the users are responsible for advertising sane images, and we're done
<niemeyer> fwereade_: That's a plausible idea. We just have to reduce the burn of setting it up, and we also need to think about the fact that non-real-EC2 deployments don't have pretty pre-defined classes of machines
<fwereade_> niemeyer, heh, I wish I could remember who told me that basically all openstack deployments reuse instance names (even if they don't perfectly match capabilities)
<fwereade_> niemeyer
<fwereade_> niemeyer, actually it's not just uec-images data: it *is* the capabilities of the instances, assumptions about what instance types can run what images are hardcoded :(
<fwereade_> niemeyer, however I don't think even amazon publish that data in a usefully-consumable format
<niemeyer> fwereade_: We do..
<fwereade_> niemeyer, it's a tradeoff that seemed sensible given the constraints I was operating under
<niemeyer> fwereade_: But we can't publish it for arbitrary deployments that have their own image ids and their own capabilities
<fwereade_> niemeyer, oh wait, did I totally miss the meaning of "we do"?
<niemeyer> fwereade_: seemed sensible? was operating under? I think you missed it :)
<niemeyer> fwereade_: Canonical publishes details for images in EC2
<fwereade_> niemeyer, if you're saying that we publish uec-images: yes, I know :p
<niemeyer> fwereade_: No, we publish the data that you said Amazon doesn't
<fwereade_> niemeyer, details of instance types?
<fwereade_> niemeyer, which ones require HVM images, etc?
<fwereade_> niemeyer, which ones it's OK to start with an i386 image
<fwereade_> niemeyer, it's the fact that even if the *names* match, private "ec2"s might have (say) m1.smalls that are 64-bit only
<niemeyer> fwereade_: Yes
<niemeyer> fwereade_: http://uec-images.ubuntu.com/query/precise/server/daily.txt
<fwereade_> niemeyer, I see lots of image data there in exactly the format I was asking for
<niemeyer> fwereade_: You need to know whether the instance is amd64/i386/ebs/hvm..
<fwereade_> niemeyer, I see nothing about instance types
<fwereade_> niemeyer, yes
<fwereade_> niemeyer, ebs we assume everywhere so I really have no option but to punt on that for now at least
<niemeyer> fwereade_: This is still relevant as it gives us which ones are ebs or not
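For reference, a sketch of consuming that query data to pick an image. It assumes the tab-separated 2012-era column layout (root store, arch, region, ami id in fixed positions), which should be checked against the actual feed:

    import urllib2

    QUERY_URL = "http://uec-images.ubuntu.com/query/%s/server/daily.txt"

    def find_image_id(series, arch, region, store="ebs"):
        data = urllib2.urlopen(QUERY_URL % series).read()
        for line in data.splitlines():
            fields = line.split("\t")
            # Assumed columns: 4=root store, 5=arch, 6=region, 7=ami id.
            if len(fields) > 7 and \
                    (fields[4], fields[5], fields[6]) == (store, arch, region):
                return fields[7]
        raise LookupError("no %s image for %s/%s in %s"
                          % (store, series, arch, region))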
<fwereade_> niemeyer, how is any of that information useful to me, as juju, running against a private cloud and attempting to determine whether or not it is appropriate for me to start a c1.medium with an i386 image?
<niemeyer> fwereade_: The constraints part of the picture must be provided by the provider
<niemeyer> fwereade_: This is already the case for the constraints mechanism you're implementing to exist at all
<fwereade_> niemeyer, agreed
<fwereade_> niemeyer, this information will indeed need to be exposed by the provider
<niemeyer> fwereade_: This is solving the other half of the problem.. given you know what constraints a machine is under, what image?
<niemeyer> fwereade_: I'll have lunch while I think a bit more about the issue :)
<fwereade_> niemeyer, that is not a problem in need of a solution
<fwereade_> niemeyer, we do that already
<niemeyer> fwereade_: Is it not?  That's great.. so what's the answer to SpamapS question?
<niemeyer> <SpamapS> if you remove default-image-id ... how do private clouds work?
 * niemeyer => lunch
<fwereade_> niemeyer, SpamapS: if I have my way, private clouds will work as follows
<fwereade_> niemeyer, SpamapS: (1) they have to publish uec-images format image data just like uec-images, and must specify the url in the env access settings
<fwereade_> niemeyer, SpamapS: (2) they have to [somehow] publish instance-type data in a format we can interpret
<fwereade_> niemeyer, SpamapS: if we remove default-image-id now, we utterly crush any current efforts to play with juju in private openstack clouds (right?)
<fwereade_> niemeyer, SpamapS: if we focus on it, we *can* come up with a sensible way to import the required information by 12.04.1, and we can IMO bear the risk of sophisticated users telling juju to do bad things in the meantime
<fwereade_> niemeyer, SpamapS: it now seems clear that default-instance-type and default-image-id are both critically important for this use case -- without d-i-t you're forced to use the ec2 instance types, and without d-i-i you're completely helpless
<fwereade_> niemeyer, SpamapS: is this approaching the tipping point at which we say "ok, we really actually cannot break this now, however much we'd like to"?
<niemeyer> fwereade_: What's the suggested behavior of "cs:~fwereade/precise/mongodb" vs. "cs:~fwereade/oneiric/mongodb"?
<hazmat> eek.. du -hs machine-agent.log
<hazmat> 163G    machine-agent.log
<niemeyer> hazmat: Holy crap
<niemeyer> rogpeppe: You have a couple of (trivial) reviews
<hazmat> niemeyer, yeah.. zk client debug logging is a bit verbose ;-)
<hazmat> i can dev/null it, the alternative is to attach a pipe, and filter/rate-limit messages into python logging
<rogpeppe> niemeyer: thanks a lot
<hazmat> it seems like it goes into a tailspin of errors/warnings when it has trouble connecting
<fwereade_> niemeyer, look up an appropriate image at a url based on series; use other constraints to pick an appropriate image from that data source
<niemeyer> hazmat: Sounds fine to dev/null it
<niemeyer> fwereade_: I mean for 12.04
<fwereade_> niemeyer, we loudly and angrily warn that default-image-id is deprecated every time we parse env.yaml
<niemeyer> fwereade_: Sorry, I mean what's the plan for supporting multiple series in 12.04
<fwereade_> niemeyer, in private clouds, we don't have one
<fwereade_> niemeyer, for everyone else, they just have to stop using those keys
<fwereade_> niemeyer, well, that one specific key
<niemeyer> fwereade_: Ok, I can buy into that as an intermediate step
<rogpeppe> niemeyer: thanks for the LGTM on https://codereview.appspot.com/5857049/. it's got https://codereview.appspot.com/5864047/ as a prerequisite BTW in case you'd overlooked that one.
<fwereade_> niemeyer, cool
<fwereade_> niemeyer, I'm certainly not advocating that it should live on in 12.04.1
<fwereade_> niemeyer, however, since d-i-i was the driver for a lot of this, we should check response sanity for the other keys
<niemeyer> rogpeppe: I didn't overlook it.. just didn't get there
<rogpeppe> niemeyer: np
<fwereade_> niemeyer, d-i-t is I think helpful in the private cloud use case; aws users can just retire it when they feel ready
<niemeyer> fwereade_: I think d-i-t is --constraints, isn't it?
<niemeyer> fwereade_: If it is still useful after constraints, we're doing something wrong
<fwereade_> niemeyer, it is, but it makes aws-specific assumptions about the nature of the instance types it exposes
<niemeyer> fwereade_: Really? Why?
<niemeyer> fwereade_: I thought we had non-AWS needs in mind the whole time
<fwereade_> niemeyer, I am only aware of one place that publishes that information, and it's a github project that I'm reluctant to pull and parse at runtime
<niemeyer> fwereade_: Which means constraints is broken
<niemeyer> fwereade_: It must be finished
<niemeyer> fwereade_: default-instance-type is a constraint like any other
<niemeyer> fwereade_: cpu=N arch=X etc
<fwereade_> niemeyer, default-instance-type is a bad name: what it really is is force-instance-type
<niemeyer> fwereade_: Huh!?
<fwereade_> niemeyer, that is the effect it has always had
<niemeyer> fwereade_: "default" in there should really be *default*
<niemeyer> fwereade_: Because there's no other way to select the instance type
<niemeyer> fwereade_: Constraints should entirely obsolete the need for default-instance-type
<niemeyer> fwereade_: Or we're doing something wrong
<fwereade_> niemeyer, the wrong thing that we are doing is encoding assumptions about AWS in the environment
<niemeyer> fwereade_: Such as?
<fwereade_> niemeyer, cc2.8xlarge requires an HVM image
<niemeyer> fwereade_: We don't have that in the environmen, do we?
<fwereade_> niemeyer, t1.micros can be started with i386 and amd64 images
<fwereade_> niemeyer, are you aware of any other place to get that information?
<niemeyer> fwereade_: Sorry.. we're using "environment" in different ways
<niemeyer> fwereade_: It's not an environment setting
<fwereade_> niemeyer, developing the infrastructure to distribute it ourselves seemed a touch overambitious for a small component of a feature I had 1 month to do
<niemeyer> fwereade_: EC2 knowing that in Amazon a t1.micro is i386 sounds quite ok
<fwereade_> niemeyer, hence hardcoded assumptions like the above
<fwereade_> niemeyer, right
<fwereade_> niemeyer, can we make that assumption about instance types available in private clouds?
<niemeyer> fwereade_: No.. didn't we go over that already?
<fwereade_> niemeyer, a consequence of that is that constraints are "broken", as you put it, in certain private-cloud situations
<niemeyer> fwereade_: This has to be provided by the provider
<hazmat> even private openstack environments typically do have something mapping to the instance types from ec2, but the definitions are more ad hoc and user defined. there is a way via the native api to query out those capabilities.
<hazmat> this discussion sounds familiar
<fwereade_> hazmat, that is good to know
<fwereade_> niemeyer, hazmat: does anything like that exist for AWS (apart from that github project)?
<hazmat> fwereade_, but the mapping is not always complete for ec2 types
<hazmat> fwereade_, just that project the boto author did as far as machine parsable ones
<hazmat> fwereade_, we could see if smoser/someone could set that up on cloud-images
<niemeyer> <niemeyer> fwereade_: EC2 knowing that in Amazon a t1.micro is i386 sounds quite ok
<fwereade_> hazmat, it's not just serving it; it's updating it etc
<hazmat> but we'd also need to start distinguishing in the ec2 provider on how to query capabilities based on impl (ostack/aws)
<niemeyer> fwereade_: We can also offer a document
<fwereade_> hazmat, it feels like a pretty serious responsibility to take on
<niemeyer> fwereade_: Similar to uec-images
<hazmat> fwereade_, really? https://github.com/garnaat/missingcloud setting up a cron job?
<niemeyer> fwereade_: You mean, providing data about which instance types exist?
<hazmat> fwereade_, oh it's hand-written you mean?
<fwereade_> niemeyer, as I said, I rejected that option as being unrealistic in the time we had available
<TheMue> niemeyer: today's watches of service and unit are built with callbacks. any already existing concept on how to handle it in go? or shall i use the idea of a watcher like rog and i already outlined a few days ago?
<fwereade_> niemeyer, perhaps I was wrong there
<niemeyer> fwereade_: Which option? Sorry.. it feels like there are wires being crossed all the time
<niemeyer> fwereade_: I'm a bit lost
<fwereade_> niemeyer, yeah, I'm losing track a bit
<fwereade_> niemeyer, I feel that it is inappropriate to depend on someone else's AWS data in an automated way
<niemeyer> fwereade_: We can publish that information
<fwereade_> niemeyer, and that taking on the responsibility of publishing it ourselves would be excessively painful
<fwereade_> niemeyer, and that even if that were not the case (I guess it isn't)
<niemeyer> fwereade_: Why?
<niemeyer> fwereade_: Canonical published the whole *operating system* that people are using
<niemeyer> fwereade_: I can't see how publishing which instance types are availalbe is a problem
<fwereade_> niemeyer, that my prospects of getting all that set up and out of my hair in the course of december 2011 were unrealistic
<fwereade_> niemeyer, if we provide it we have to keep up with the amazon announcements and always remember to update
<niemeyer> fwereade_: Yep
<niemeyer> fwereade_: Changes in instance types are nowhere close to frequent
<fwereade_> niemeyer, which IMO makes it only less likely that it will be something that we will remember to do in a timely way
<fwereade_> niemeyer, and makes a certain juju-update-dependent lag in access to latest instances acceptable
<niemeyer> fwereade_: Nothing too bad happens if it takes an entire month to be updated, really
<niemeyer> fwereade_: This is a non-issue in the context of things we've been talking about.. let's see what are the actual issues we have to decide now
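For context, the kind of hardcoded knowledge being debated, sketched as a Go table; the entries are examples, not a complete or authoritative list.

    package ec2

    // instanceArches is an illustrative stand-in for the provider's
    // built-in knowledge of which architectures each Amazon instance
    // type offers, e.g. treating t1.micro as i386 per the discussion.
    var instanceArches = map[string][]string{
        "t1.micro":  {"i386"},
        "m1.small":  {"i386", "amd64"},
        "m1.large":  {"amd64"},
        "c1.xlarge": {"amd64"},
    }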
<TheMue> niemeyer: e.g. my watch question above. *scnr*
<TheMue> niemeyer: ;)
<fwereade_> niemeyer: basically, given that we cannot discard d-i-i -- and that doing so was the primary motivation for dropping a sudden format change on our users -- can we perhaps entirely avoid inflicting an env.yaml change on them until sometime after 12.04?
<fwereade_> niemeyer, by keeping d-i-i for a while we're committing to another change down the line
<fwereade_> niemeyer, by making d-i-i and d-i-t act as though "default" meant "force", we can preserve existing behaviour for everyone, and make constraints accessible to all who can give up those keys
<niemeyer> fwereade_: Sorry, this whole conversation is flying well above my head
<niemeyer> fwereade_: default is default, not force
<niemeyer> fwereade_: constraints are broken, as you describe
<niemeyer> fwereade_: You're asking to not change environment, but I don't know what you mean by that
<niemeyer> fwereade_: We need a call
<fwereade_> niemeyer, sounds good
<niemeyer> TheMue: Not a good time, sorry
<TheMue> niemeyer: yeah, ic, followed your discussion with one eye
<TheMue> niemeyer: will port state/auth first
<fwereade_> niemeyer, invited on g+
<niemeyer> fwereade_: Uh oh
<niemeyer> Uh oh.. again
<rogpeppe> fwereade_, niemeyer: do you know anything about the way that admin_identity is used to make acls, by any chance? i'm seeing an "invalid acl" error, which i'm presuming is from StateHierarchy.initialize.
<rogpeppe> jimbaker: ^
<rogpeppe> i don't see any traceback
<rogpeppe> this is how juju-admin initialize is being invoked:
<rogpeppe> juju-admin initialize --instance-id='$(curl http://169.254.169.254/1.0/meta-data/instance-id)'  --admin-identity=sham --provider-type='ec2'
<rogpeppe> i'm wondering if admin-identity needs to be in a particular format
<rogpeppe> ah, got it!
<rogpeppe> wonderful how explaining things to people so often solves the problem...
<rogpeppe> (not that it's solved for definite yet - waiting for the machine to boot now)
<rogpeppe> dammit, i was wrong.
<rogpeppe> i think i've seen it now though...
<niemeyer> Ok.. off the call with fwereade_, talking to mthaddon about store now
<rogpeppe> dammit, i wrote that function once and can't find it! anyone know of a way to grep through all files in all branches in a bzr history? the bzr-grep plugin doesn't seem to do it.
<niemeyer> hazmat: It'd be good to have a call at some point today/tomorrow with you, fwereade_, and myself, to sync up on that conversation
<niemeyer> hazmat: Mainly clarification of the whole series/etc conversation
<hazmat> niemeyer, i'm game for it now if you'd like
<niemeyer> hazmat: fwereade_ just stepped out for some family time after about 2h of phone call
<niemeyer> hazmat: We both need a break before diving into it again
<rogpeppe> ah, found bzr-search and found my code!
<hazmat> niemeyer, ic fair enough, i hadn't realized, i'm around whenever you guys are ready
<hazmat> niemeyer, also the unit-stop spec could use a review, just pushed the latest
<niemeyer> hazmat: Aweomse, will have a look
<hazmat> niemeyer, i liked awsum ;-)
<hazmat> and thanks
<niemeyer> hazmat: Me too.. I also found funny the way it was ignored :)
<niemeyer> hazmat: Like "OMG, stop the bikeshed!" :)
<hazmat> hmm.. where oh where do we sync settings
<hazmat> ah there it is.. deploy
 * niemeyer perceived fwereade_ seems to have been hit by the network bug in Precise too
<niemeyer> [niemeyer@gopher ~]% ps auxw | grep chromium-browser | wc -l
<niemeyer> 10
<niemeyer> [niemeyer@gopher ~]% killall chromium-browser
<niemeyer> chromium-browser: no process found
<niemeyer> Why oh why
 * niemeyer invokes awk powers to do a trivial action.. poor normal users
<rogpeppe> all tests pass. woop woop.
<rogpeppe> we have lift off
<rogpeppe> niemeyer: a happy note to end the day on: https://codereview.appspot.com/5868051
 * rogpeppe is off for the day. see y'all tomorrow.
<niemeyer> rogpeppe: Neat!
<niemeyer> hazmat: I'm on the stop spec, btw
<niemeyer> hazmat: Delivered
<niemeyer> jimbaker: Any progress on the specs?
<niemeyer> jimbaker: and in their implementation?
<niemeyer> rogpeppe: Uh oh. I'm wondering if having pre-req support is a good idea. I'm starting to feel like we're getting tiny changes that are completely independent being piled up on unrelated changes.
<niemeyer> Now I'm getting two messages when I post a message in Rietveld.. eventual consistency is rocking our world :)
<fwereade_> niemeyer, hazmat: so, it turns out discussion of instance types and environment key deprecation is an excellent family insomnia cure; who would have guessed?
<fwereade_> niemeyer, hazmat: that is to say I have time for a chat :)
<niemeyer> fwereade_: Hehe :)
<niemeyer> fwereade_: I'm game as well
<niemeyer> hazmat?
<jimbaker> niemeyer, i've been working on two branches re impl, relation-id and relation-hook-context (to manage the contexts associated with using -r in the relation hook commands spec)
<niemeyer> jimbaker: I was just reading that spec, actually
<jimbaker> niemeyer, i'm going to do another round on the spec for relation-ids (what was called relation-info)
<jimbaker> niemeyer, good, i hope it's in the direction you like
<niemeyer> jimbaker: Is there anything else being said in addition to "relation-get needs to support the -r <relation id> argument"?
<jimbaker> niemeyer, not so much. most of the spec in relation-hook-commands-spec is to describe various scenarios through an example, and to address such details as the specifics of the caching of the relation hook contexts that are read in with -r
<jimbaker> and the order in which they are written out
<niemeyer> jimbaker: Cool
<niemeyer> jimbaker: I don't think the ordering is important, btw
<jimbaker> niemeyer, sounds reasonable
<jimbaker> niemeyer, the ideal scenario is that it uses a ZK multi
<jimbaker> in which case, it goes away
<jimbaker> however, one counterpoint is that it does make it easier to test against log output corresponding to relation changes, if any
<jimbaker> so perhaps just an impl detail
<niemeyer> jimbaker: Right
<hazmat> niemeyer, i was just chatting with jim and looking over the wip impl. he's out at the moment though,
<hazmat> niemeyer, fwereade_ i'm up for the chat
<jimbaker> hazmat, i'm around right now
<niemeyer> hazmat: He seems alive and kicking :)
<hazmat> jimbaker, doh.. yeah i had written that message before i went away myself
<jimbaker> and was just chatting with niemeyer re the relation-hook-commands spec
<fwereade_> hazmat, niemeyer: heyhey
<hazmat> fwereade_, niemeyer  g+ invites out
<niemeyer> hazmat: Just give me a couple and will be with you
<niemeyer> jimbaker: Review delivered
<fwereade_> hazmat, would you follow up the warning thread with a precise and reassuring explanation of what you're planning re env settings? I fear missing some nuance ;)
<fwereade_> hazmat, and am a touch sleepy ;)
<hazmat> fwereade_, ack, and get some sleep
<hazmat> niemeyer, incidentally i noticed there's an lbox bug on milestone selection, it always selects the newest milestone afaics, instead of the oldest open milestone
#juju-dev 2012-03-22
<hazmat> bcsaller, can you finish up the review on this one .. https://codereview.appspot.com/5752069/
<hazmat> the force upgrade
<bcsaller> hazmat: yeah, looking now
<niemeyer> hazmat: Uh, indeed.. I guess I got it wrong
<bigjools> anyone around who can do some reviews please?
<bigjools> fwereade_: hi
<fwereade_> heya bigjools
<bigjools> fwereade_: I am this ---><--- close to getting the maas provider working
<bigjools> just one small issue left
<fwereade_> bigjools, I can absolutely do some reviews, but there's at least one other I have to finish first
<bigjools> which I need your advice on
<fwereade_> bigjools, sweet, can I help with that?
<bigjools> when the master starts up a second node, it blows up at the point where it tries to get the charm from the provider file store
<bigjools> because it doesn't use authentication
<bigjools> I think we talked about this before but I  can't remember the resolution
<fwereade_> bigjools, we touched on it ever so briefly... IIRC all I said was something like "yeah, it would be sensible to require authentication for writes"
<bigjools> your connection is flaky today
<fwereade_> bigjools, how would you feel about either making the provider storage readable without auth, or about perhaps using signed URLS as the ec2.files get_url method does?
<bigjools> fwereade_: the provider already uses a signed request, I don't know why it's not been used in this case
<bigjools> perhaps the config is not available?
<fwereade_> bigjools, the url used to access the charm will have originally been created by MaaSFileStorage.get_url
<bigjools> fwereade_: the url is correct but it is not getting dispatched using the correct method that adds an oauth header
<fwereade_> bigjools, hm, who would it be authenticating as?
<bigjools> whatever is configured in environments.yaml
<fwereade_> bigjools, nodes shouldn't have access to provider credentials
<bigjools> why was none of this in docstrings? :/
<bigjools> ok so is it safe to make "get" operations unauthed?
<fwereade_> bigjools, I guess the answer is "we're bad at predicting which context people will need" :(
<fwereade_> bigjools, we think so
<bigjools> I'll see if we can do oauth in the url as well
<bigjools> otherwise, no auth :)
<fwereade_> bigjools, if you can sign the urls that would be nice but don't overexert yourself; no auth was considered acceptable for orchestra
<bigjools> ok
<fwereade_> bigjools, there are definitely a number of identity/auth questions that have yet to be addressed
<bigjools> it's a trivial change to remove auth in maas for file retrieval
<fwereade_> bigjools, then tbh I would go with that
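For illustration, a rough Go sketch of the signed-header alternative mentioned above, assuming MAAS's OAuth 1.0 PLAINTEXT scheme; the host, path, and key values are placeholders, and this is not the provider's actual code.

    package main

    import (
        "fmt"
        "math/rand"
        "net/http"
        "time"
    )

    // oauthHeader assembles an OAuth 1.0 PLAINTEXT Authorization header;
    // the signature is just the two secrets joined by a percent-encoded
    // ampersand, so no request hashing is involved.
    func oauthHeader(consumerKey, consumerSecret, tokenKey, tokenSecret string) string {
        return fmt.Sprintf(
            `OAuth oauth_version="1.0", oauth_signature_method="PLAINTEXT", `+
                `oauth_consumer_key="%s", oauth_token="%s", oauth_signature="%s%%26%s", `+
                `oauth_nonce="%d", oauth_timestamp="%d"`,
            consumerKey, tokenKey, consumerSecret, tokenSecret,
            rand.Int63(), time.Now().Unix())
    }

    func main() {
        req, err := http.NewRequest("GET", "http://maas.example.com/api/1.0/files/charm.zip", nil)
        if err != nil {
            panic(err)
        }
        req.Header.Set("Authorization", oauthHeader("ckey", "", "tkey", "tsecret"))
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println(resp.Status)
    }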
<fwereade_> bigjools, the question of agent identity is I think agreed in principle, but at the moment we just trust them
<fwereade_> bigjools, it's one of the many stories I would love love love to have addressed before 12.04, but... :(
<bigjools> ok :(
<bigjools> fwereade_: is there a way to restart the agent on the node so it retries getting the charm?
<fwereade_> bigjools, you should be ok to just kill the machine agent process
<fwereade_> bigjools, (IMO the unit agent should really be the one getting the charm, but hey ho)
<fwereade_> bigjools, (it's upstartified, it'll come back up (I hope :p))
<bigjools> ok
<bigjools> I'll fix0rate maas
<bigjools> fwereade_: ok got past that hurdle only for a new steeplechase
<bigjools> in the charm.log it has:
<bigjools> juju.errors.JujuError: Unknown provider type: 'maas', unit addresses unknown.
<bigjools> so I guess I am missing some config?
<fwereade_> bigjools, hmm, let me investigate
<fwereade_> bigjools, would you pastebin the log?
<bigjools> fwereade_: http://pastebin.ubuntu.com/894806/
<fwereade_> bigjools, actually, sorry, it's clear
<fwereade_> bigjools, juju/unit/address.py
<bigjools> fwereade_: oooookay, we have a charm deploying
 * fwereade_ cheers and showers bigjools with confetti
<fwereade_> bigjools, awesome
<bigjools> what a long road!
<bigjools> fwereade_: so is that UnitAddress supposed to be public or private?
<fwereade_> bigjools, it should provide both
<bigjools> fwereade_: what's it for?
<fwereade_> bigjools, if there's no such thing as a private address just return the public one in that case
<bigjools> not a single comment in that file!
<fwereade_> bigjools, the only thing I'm sure of is that hooks can do something like `unit-info public-address`
<fwereade_> bigjools, and it's almost certainly used by `juju status`
<fwereade_> bigjools, I'm not aware of other uses
<bigjools> ok thanks
<fwereade_> allenap, ping, can we chat about _extract_system_id in your branch? (or possibly bigjools, if you have context?)
<allenap> fwereade_: Sure.
<allenap> fwereade_: Voice or here?
<fwereade_> allenap, here should be fine; I was just wondering about the return-unchanged behaviour
<fwereade_> allenap, what's the motivation? isn't it just deferring an exception minutely?
<allenap> fwereade_: I guess it was to defer the decision as to what is or isn't correct back to maas, in case of doubt.
<fwereade_> allenap, the only plausible case I can see is if *something* for some reason holds a system_id instead, in which case it does the "right" thing, kinda, but IMO the presence of a system_id in place of a resource_uri is unquestionably indicative of a bug
<fwereade_> allenap, is there some scenario I've missed?
<allenap> "Be forgiving with input, be unforgiving with output" or something like that.
<allenap> fwereade_: No, you haven't missed anything.
<allenap> fwereade_: I'm happy to raise an exception instead.
<fwereade_> allenap, I see this as pre-existing internal data, rather than input, myself
<fwereade_> allenap, cool, if you would do that that would be great
<allenap> fwereade_: Would an assertion fit with the rest of Juju, or should I raise some other exception?
<fwereade_> allenap, I'd probably suggest a juju.errors.ProviderError
<fwereade_> allenap, up to your judgment whether it should be exposed or just tested indirectly via garbage params to get_nodes
<fwereade_> allenap, I'm going to repaste, some of it probably got lost :/
<fwereade_> allenap, possibly even MachineNotFound but I'd need to think about that a bit
<fwereade_> allenap, the only other thing is that we don't like to import anything with an _, even in tests
<fwereade_> allenap, but we're not opposed to exposing utility functions purely for tests
<fwereade_> allenap, I'll just make those comments on the review, ping me when it's ready for another round ;)
<allenap> fwereade_: I'll change it to a public function.
<allenap> fwereade_: Thank you!
<fwereade_> allenap, perfect
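A Go rendering of the stricter behaviour agreed here (the real helper is Python, and the URI layout below is assumed): a value that does not look like a node resource URI becomes an error rather than being passed through unchanged.

    package maas

    import (
        "fmt"
        "strings"
    )

    // extractSystemID pulls the system_id out of a resource URI such as
    // "/api/1.0/nodes/node-123/"; anything else, including a bare
    // system_id, is reported as an error instead of returned unchanged.
    func extractSystemID(resourceURI string) (string, error) {
        parts := strings.Split(strings.Trim(resourceURI, "/"), "/")
        if len(parts) < 2 || parts[len(parts)-2] != "nodes" {
            return "", fmt.Errorf("not a MAAS node resource URI: %q", resourceURI)
        }
        return parts[len(parts)-1], nil
    }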
<rogpeppe> fwereade_: i was just looking at https://codereview.appspot.com/5847059 and realised there's a big gap in my knowledge - what is the scheduler actually *doing*?
<fwereade_> rogpeppe, handling callbacks from watch_unit_relations
<fwereade_> rogpeppe, which represent either settings node version changes for known relations, or adds/removes on the set of related ones
<rogpeppe> fwereade_: ah, so we probably wouldn't have a direct equivalent in the go port - we'd probably just use a goroutine and listen on channels, right?
<fwereade_> rogpeppe, and converting that stream of events into a stream of required hook executions in a context with the correct membership for the time at which the change was noted
<fwereade_> rogpeppe, I'm deliberately not thinking about that part of the problem yet, I don't want to prejudice the implementation ;)
<rogpeppe> fwereade_: that's fine. i (obviously) am :-)
<fwereade_> rogpeppe, my instinct says that hook scheduling will look somewhat different in go but that's all I can say really :)
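To make the shape of the problem concrete, a toy Go sketch of the goroutine-and-channels framing rogpeppe suggests: a stream of relation events becomes a stream of hook executions carrying the membership observed with each change. Purely illustrative; as fwereade_ says, the port's design was deliberately left open at this point.

    package main

    import "fmt"

    type Event struct {
        Unit    string
        Kind    string   // "joined", "changed", or "departed"
        Members []string // membership at the time the change was observed
    }

    type HookInfo struct {
        Kind    string
        Unit    string
        Members []string
    }

    // schedule turns watch events into hook executions; the real scheduler
    // also coalesces and orders events, which this sketch omits.
    func schedule(events <-chan Event, hooks chan<- HookInfo) {
        for e := range events {
            hooks <- HookInfo{Kind: e.Kind, Unit: e.Unit, Members: e.Members}
        }
        close(hooks)
    }

    func main() {
        events := make(chan Event)
        hooks := make(chan HookInfo)
        go schedule(events, hooks)
        go func() {
            events <- Event{"wordpress/0", "joined", []string{"wordpress/0"}}
            close(events)
        }()
        for h := range hooks {
            fmt.Printf("run relation-%s for %s with %v\n", h.Kind, h.Unit, h.Members)
        }
    }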
<rogpeppe> fwereade_: BTW the amazon tests work again :-)
<rogpeppe> fwereade_: although only about 60% of the time :-(
<rogpeppe> i think we should do more retries when ec2 gives us an error that's likely to be transient.
<rogpeppe> fwereade_: have you got any outstanding reviews i should be looking at?
<fwereade_> rogpeppe, yes please, I should have a few on the active reviews page
<fwereade_> rogpeppe, re amazon, sounds sensible
<rogpeppe> i can never remember how to find the active reviews page
<rogpeppe> ah, found it.
<rogpeppe> fwereade_: are they in any particular order?
<fwereade_> rogpeppe, most are independent, but tweak-supercommand sits on top of add-cmd-context
<fwereade_> rogpeppe, although I'm not sure it actually needs to
<allenap> fwereade_: Ready for round 2?
<fwereade_> allenap, sure
<fwereade_> allenap, can't see updated MP yet
<allenap> fwereade_: Sorry, I should have waited until that was ready before pinging.
<fwereade_> allenap, np, it's not a costly check for me ;)
<fwereade_> allenap, btw, heads up: not sure how you're deploying your charms (by full name including series, I guess?) but you'll need to handle default-series in environments config at some stage
<fwereade_> allenap, what this actually means is that you can completely ignore the requirement
<fwereade_> allenap, but you'll need to add a line temporarily to your environments.yaml when it lands
<allenap> fwereade_: I don't really know what that means :)
<fwereade_> allenap, it won't last long at all, and you should never otherwise have to care about it
<allenap> default-series, that is.
<fwereade_> allenap, `juju deploy wordpress` infers series from default-series
<fwereade_> allenap, `juju deploy cs:precise/wordpress` knows the series from the charm url
<allenap> fwereade_: What does series refer to? Distroseries, or something else?
<fwereade_> allenap, yeah, exactly
<fwereade_> allenap, charms target a specific series
<fwereade_> allenap, (that we must guarantee they actually be deployed on ;))
<fwereade_> allenap, so if you see a default-series error cropping up after a merge, go and add "default-series: oneiric" to your maas config and forget about it
<allenap> fwereade_: Okay, thanks.
<fwereade_> allenap, and when you later see errors complaining that default-series is no longer valid, remove it :)
<allenap> Hehe :)
<fwereade_> allenap, you *might* not even be hit by it
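A tiny Go sketch of the inference being described (the function name is invented): a bare charm name picks up the environment's default-series, while a full charm URL already carries its own.

    package main

    import (
        "fmt"
        "strings"
    )

    func inferCharmURL(name, defaultSeries string) string {
        if strings.Contains(name, "/") {
            return name // e.g. "cs:precise/wordpress" already names a series
        }
        return fmt.Sprintf("cs:%s/%s", defaultSeries, name)
    }

    func main() {
        fmt.Println(inferCharmURL("wordpress", "oneiric"))            // cs:oneiric/wordpress
        fmt.Println(inferCharmURL("cs:precise/wordpress", "oneiric")) // unchanged
    }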
<fwereade_> rogpeppe, TheMue: btw, I'm not sure this was actually announced while you were around: I'm focusing on python for a couple of weeks
<rogpeppe> fwereade_: ah, ok. yeah, i didn't know that.
<fwereade_> rogpeppe, TheMue: so I am unlikely to be a timely and helpful reviewer
<rogpeppe> fwereade_: :-|
<fwereade_> rogpeppe, TheMue: I will ofc hit those if I have any time left over in my ~1 go hour per day
<rogpeppe> fwereade_: (i enjoy your reviews!)
<fwereade_> rogpeppe: thanks :)
<fwereade_> rogpeppe, TheMue: and ofc I'm also always around to talk, but otherwise generally expect diminished engagement from me for a short time
<allenap> fwereade_: There seems to be a problem with my branch; it's not getting scanned by Launchpad. I'm talking to the LP webops about it now.
<fwereade_> allenap, np, if it looks likely to take too long let me know and I'll diff it manually
<fwereade_> rogpeppe, a thought: the above doesn't mean I wouldn't really appreciate a reduction in my own go review backlog, or that I won't be trying to get them merged, just that they're not at the *top* of my list atm :p
<TheMue> fwereade_: hope you'll be back for go soon
<fwereade_> TheMue, thanks, I hope so too
<fwereade_> TheMue, I don't like python any more, it spells "true" wrong ;p
<TheMue> fwereade_: *lol*
<allenap> fwereade_: Turns out I was being a numpty. It's there now.
<fwereade_> allenap, cool
<fwereade_> allenap, LGTM
<fwereade_> allenap, I'll merge this and the other one shortly, just need to persist some of my own state before I do that though ;)
<allenap> fwereade_: Thanks! There's a trivial follow-on from use-maas-uris - https://code.launchpad.net/~allenap/juju/maas-to-maas/+merge/98759 - and it's already approved by hazmat. Would you be able to land that at the same time?
<fwereade_> allenap, surely :)
<allenap> Thanks.
<hazmat> allenap, fwereade_ it lives!
<fwereade_> hazmat, ...charm store?
<fwereade_> hazmat, oh maas
<fwereade_> hazmat, sorry braindead :p
<allenap> hazmat: \o/
<hazmat> fwereade_, the former soon as well hopefully
<hazmat> allenap, is there any api control over the image that's used on a machine in maas?
<fwereade_> hazmat, actually, I need a spot of advice(but I think your question to allenap is more important, I'll sit quiet a mo)
<allenap> hazmat: Not yet; it's plain precise only for now. It will definitely be added though.
<fwereade_> hazmat, ok: I'm thinking about blocking d-i-i/d-i-t use on the amazon cloud only
<fwereade_> hazmat, and it's leading me in a bad direction
<fwereade_> hazmat, the problem is this
<fwereade_> hazmat, to know whether an environment should accept those keys, we don't just need the ec2 uri
<fwereade_> hazmat, we also need to know if we're running on a legacy environment
<fwereade_> hazmat, (right?)
<fwereade_> hazmat, so that we don't just render existing environments inaccessible to the point of not even being destroy-environment-able
<fwereade_> hazmat, however, we cannot generally know if we're running in a legacy environment until we've seen how env state is stored in ZK
<fwereade_> hazmat, which then means we have to check twice: once on bootstrap (in which case it's easy, don't accept bad keys)
<fwereade_> hazmat, and once any time we want to connect to ZK (in which case we need to wait until we've connected to find out whether it's a legacy environment)
 * hazmat catches up
<fwereade_> hazmat, it would be insanity to find and tweak every call to provider.connect()
<fwereade_> hazmat, but to avoid doing *that*, we have to make ec2.MachineProvider do its own check/barf in an overridden connect(), and that involves hitting the state module
<fwereade_> hazmat, which I'm pretty sure the providers should not know about
<fwereade_> hazmat, that's about as far as I'd got
<hazmat> fwereade_, if a user doesn't perform an activity that identifies a legacy environment as such, then it's not really appropriate to cause a warning imo
<hazmat> ie. if they're not deploying/adding-unit.. is it an active concern for the env
<fwereade_> hazmat, hm, maybe it is just 3 checks
<fwereade_> hazmat, bootstrap, deploy, add-unit
<fwereade_> hazmat, ok, sounds good to me; thanks
<hazmat> fwereade_, cool
<fwereade_> hazmat, hm, how would it be if I added something like get_in_legacy_environment() to GlobalSettingsStateManager
<fwereade_> ?
<fwereade_> hazmat, would just return False for now, so we always barf on bad keys, but if we're doing this in a parallel branch that's OK
<fwereade_> hazmat, could always return True I guess but I'd prefer to merge something that has the eventual desired behaviour but which doesn't ever actually trigger until there's a way to detect legacy environments
<fwereade_> hazmat, sensible?
<hazmat> fwereade_,  what needs to use get_in_legacy_environment?
<hazmat> fwereade_, the new code will end up turning legacy into new style data structures, its the old code that needs to use the legacy data, and it won't be using a new api
<hazmat> fwereade_, but the legacy structures will still be in place
<fwereade_> hazmat, I need to use it to do the checks: in a legacy environment we want to warn about those keys, but in a new one we want to error; don't we?
<hazmat> fwereade_, true
<fwereade_> hazmat, does GSSM sound like the right place to you for now?
<hazmat> fwereade_, perhaps on env manager
<fwereade_> hazmat, cool, whatever's best for you
<hazmat> either works, but env manager is used more commonly in the places we need to check this
<hazmat> gssm is used more when we tweak or read providertype/debug-log
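A Go sketch of the warn-versus-error rule being settled here, where d-i-t and d-i-i abbreviate default-instance-type and default-image-id; the function and its placement are invented for illustration, since the real check lived in the Python codebase.

    package environ

    import (
        "fmt"
        "log"
    )

    // checkDeprecatedKeys errors on the deprecated keys in a new
    // environment, but only warns in a legacy one so existing
    // environments stay accessible.
    func checkDeprecatedKeys(conf map[string]interface{}, legacy bool) error {
        for _, key := range []string{"default-instance-type", "default-image-id"} {
            if _, ok := conf[key]; !ok {
                continue
            }
            if legacy {
                log.Printf("warning: %s is deprecated and will be ignored", key)
                continue
            }
            return fmt.Errorf("%s is no longer supported; use constraints instead", key)
        }
        return nil
    }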
<niemeyer> Heh, great.. I was talking to myself..
<niemeyer> Good morning all :)
<niemeyer> hazmat: Are you around yet by any chance?
<hazmat> niemeyer, yes.. just finishing up a subordinate review
<niemeyer> hazmat: Heya
<niemeyer> hazmat: I'd like to sync up at some point re. the charm store
<niemeyer> hazmat: For the initial test/beta/alpha/point-nil phase :-), we'll be deploying it without an SSL certificate
<niemeyer> hazmat: Which means we'll need to s/https/http/ on the code
<hazmat> niemeyer, i guess without cert checking its rather moot
<hazmat> niemeyer, yeah.. easy enough its a one liner
<niemeyer> hazmat: Super, +1 on cowboying it
<hazmat> niemeyer, does that mean a deploy is near?
<niemeyer> hazmat: Yeah, as quick as I can develop a charm for the charm store :)
<niemeyer> hazmat: We went full round and it's back on me now
<hazmat> niemeyer, so like 10m ;-)
<niemeyer> hazmat: Kind of :).. I mean a serious charm! :)
<hazmat> dog walk bbiab
<niemeyer> mthaddon: So, I'm dropping the SSL need, to avoid the conflict there
<mthaddon> niemeyer: you've discussed this with elmo? not sure if the issue was the SSL cert specifically or the ubuntu.com domain
<niemeyer> mthaddon: I didn't discuss this with elmo, but that's what I understood was a problem from our conversation yesterday
<niemeyer> mthaddon: There's nothing special about the ubuntu.com domain, AFAIK..
<niemeyer> mthaddon: If that's an issue too, I'm happy to change the client to store.eat-your-lunch.com
<mthaddon> niemeyer: eh? there's quite a lot that's special about the ubuntu.com domain
<mthaddon> ah, I see what you mean
<niemeyer> mthaddon: Sure, no problem..
<niemeyer> hazmat: Let's change the domain too, please..
<robbiew> niemeyer: I don't think we should change the domain
<niemeyer> robbiew: It's fine.. I'll provide the Elastic Balancer domain to hazmat
<niemeyer> robbiew: and will use the Elastic Balancer to distribute load on two independent charm store frontends, deployed with juju
<niemeyer> robbiew: Won't be a first class domain, but no one sees that domain anyway
<robbiew> niemeyer:  good point
<wrtp> niemeyer: admin-secret CL is now much smaller... except that lbox propose seems to have gone a little mad.
<niemeyer> wrtp: How so?
<wrtp> niemeyer: it's showing lots of diffs that it shouldn't
<niemeyer> wrtp: It's probably showing the diffs that bzr is showing
<wrtp> niemeyer: "diff --old lp:juju/go" shows the correct (small) diffs
<niemeyer> wrtp: Take the two revisions that are in Rietveld, and do a diff
<niemeyer> wrtp: Check if the revisions are right
<niemeyer> wrtp: Or if the diff is different
<wrtp> niemeyer: good idea
<wrtp> niemeyer: BTW, how can i find out the complete revision id of the current branch tip?
<niemeyer> wrtp: bzr revision-info
<niemeyer> wrtp: bzr log --show-ids also
<wrtp> niemeyer: thanks
<hazmat> niemeyer, which domain?
<wrtp> niemeyer: looks like it's using the wrong revision-id for the old version
<niemeyer> hazmat: Trying to figure out
<niemeyer> wrtp: What's the revision id it's using?
<wrtp> roger.peppe@canonical.com-20120321155821-85i0cf6wo39qrpg6
<niemeyer> wrtp: Isn't that the revision id of the pre-req?
<wrtp> niemeyer: quite possibly - but that's wrong in this case, because the prereq has already been merged.
<niemeyer> wrtp: I see, ok
<niemeyer> wrtp: Will have to change lbox to use a different base depending on whether the pre-req was already merged or not
<wrtp> niemeyer: yeah - i wouldn't have thought of that either...
<niemeyer> wrtp: Can you please just confirm that this is indeed the case? Do you get the same "wrong" diff if you diff against the pre-req?
<TheMue> niemeyer: https://codereview.appspot.com/5875047 is ready for a review
<niemeyer> TheMue: Cheers!
<TheMue> niemeyer: moin ;)
<niemeyer> andrewsmedina: ping
<wrtp> niemeyer: yes
<niemeyer> TheMue: Erm, hold on
<wrtp> lunch
<niemeyer> wrtp: yes, you can check, or yes, you've checked?
<niemeyer> TheMue: I don't think we want that stuff right now
<niemeyer> TheMue: This isn't in use yet
<niemeyer> TheMue: Even in Python, I mean
<niemeyer> TheMue: I didn't even recall that this was actually merged already
<andrewsmedina> niemeyer: 64 bytes from andrewsmedina: icmp_seq=0 ttl=251 time=2.677 ms
<TheMue> niemeyer: hmm, thought i'd seen calls to these functions
<niemeyer> andrewsmedina: :-)
<niemeyer> andrewsmedina: Heya
<andrewsmedina> niemeyer: everything ok?
<niemeyer> andrewsmedina: You've told me to ping you if there was something you might be involved in
<niemeyer> andrewsmedina: I think I have something for you to participate in. Interested?
<andrewsmedina> niemeyer: yes
<niemeyer> andrewsmedina: It's a bit less straightforward than the last task, but actually important
<andrewsmedina> niemeyer: I'm working on local env
<niemeyer> andrewsmedina: hazmat is conducting some adaptations in the way environments are stored, introducing a couple of commands (set-env, get-env) and also adapting bootstrap to take these options
<TheMue> niemeyer: make_identity() is called twice outside of state and several times in state/security.py
<andrewsmedina> niemeyer: nice
<niemeyer> andrewsmedina: We need to adapt our side of things with similar logic
<niemeyer> TheMue: This isn't in use..
<niemeyer> TheMue: Deploy a charm and try to see who touches that logic
<niemeyer> andrewsmedina: Interested in picking the task?
<wrtp> niemeyer: i fixed the CL for the time being anyway
<andrewsmedina> niemeyer: did in the python lib?
<wrtp> although i need to edit the description
<andrewsmedina> niemeyer: oops
<andrewsmedina> niemeyer: hazmat did this in the python lib?
<niemeyer> wrtp: How?
<niemeyer> andrewsmedina: He's working on it right now
<TheMue> niemeyer: why has it been developed? will it be needed in future? or is it just code that should have been removed?
<niemeyer> andrewsmedina: You can get more details with him to see where's the branch with the Python diff, etc
<niemeyer> andrewsmedina: It should be significantly simpler on our side, because not all of the things changing are implemented yet
<TheMue> niemeyer: and what do you need to be ported next? i still have the Watch…() methods in Unit and Service open.
<niemeyer> TheMue: It has been developed because long ago there was a push to have more security around ZooKeeper, but other needs walked over that problem
<TheMue> niemeyer: understand
<andrewsmedina> niemeyer: ok
<andrewsmedina> hazmat: can you help me?
<niemeyer> TheMue: I don't know if it's going to be used or not, but I'm against developing and maintaining code while we don't know the answer to that
<TheMue> niemeyer: i'll put the code in my archive, maybe we need it again later. ;)
<niemeyer> TheMue: Sounds great.. pushing it to Launchpad is also a good idea
<niemeyer> TheMue: The one you proposed is obviously already there
<niemeyer> TheMue: If you have another one, just push as well
<hazmat> andrewsmedina, yes
<andrewsmedina> niemeyer: I did the implementations for interfaces related to the local environment using lxc :)
<niemeyer> TheMue: Regarding the Watch, I thought we had a pretty good agreement on that
 * hazmat catches up
<niemeyer> TheMue: What's still pending?
<niemeyer> andrewsmedina: Wow, nice!!
<niemeyer> andrewsmedina: Btw, this may be a good time to remind you that small branches are easier to deal with :)
<andrewsmedina> niemeyer: I know
<hazmat> the security stuff is completely unused outside of making an identity for the admin user, and even there its not backed with an acl, so its not functional
<TheMue> niemeyer: maybe i'm lost in the fact that there's an agreement. last info here is the approach rog and i already made for a Watcher (specific goroutines waiting for change signals and then retrieving all they need)
<andrewsmedina> niemeyer: I've been kinda busy this week
<TheMue> niemeyer: if that's the agreement i'm pretty fine with it
<hazmat> andrewsmedina,  was there something in particular you needed help with ?
<andrewsmedina> hazmat: you could show me the diff of what you did?
<niemeyer> TheMue: I don't know what you and rog agreed.. I know that the three of us talked about an approach back in the Rally, and that we talked about this approach again a few days ago
<niemeyer> TheMue: What's the question?
<TheMue> niemeyer: yes, that's the one i mean
<TheMue> niemeyer: only wanted to be sure. maybe you and rog discussed something else, so that i would start in the wrong direction.
<niemeyer> TheMue: We did discuss it, but everything we talked about was here in the channel and back at the Rally
<niemeyer> TheMue: We didn't have any discussions outside of that
<niemeyer> TheMue: Did you see this, for example: http://paste.ubuntu.com/870030/
<hazmat> andrewsmedina, you'll have to be more specific
<TheMue> niemeyer: yep, that's the one i have in mind
<niemeyer> TheMue: Actually, I think rog had a more complete one
<TheMue> niemeyer: i think i still have the link in my temp folder
<andrewsmedina> hazmat: niemeyer told me that you are conducting some adaptations in the way environments are stored
<andrewsmedina> hazmat: I will do it in Go port
<hazmat> andrewsmedina, gotcha, its not done yet.. i just started work on it yesterday, but basically the provider credentials/access go to /environment/provider and the rest go to /environment
<hazmat> and instead of setting the environment every deploy, its only done once, and subsequent modification requires the use of set-env
<andrewsmedina> hazmat: you're working on a branch?
<niemeyer_> Boom
<niemeyer_> TheMue: Did you find it?
<flacoste> SpamapS: around?
<TheMue> niemeyer: not yet, started with branching and inspecting the py code first
<hazmat> andrewsmedina, yup.. its lp:~hazmat/juju/environment-settings.. i haven't pushed yet, cause i'm still trying to figure out the api a bit
<andrewsmedina> hazmat: ok
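A hypothetical Go sketch of the layout hazmat describes, with a stand-in connection interface; juju's real ZooKeeper client and its YAML serialization differ.

    package env

    import "encoding/json"

    // Conn is a stand-in for a ZooKeeper connection.
    type Conn interface {
        Set(path string, data []byte) error
    }

    // writeEnvironment stores provider credentials under
    // /environment/provider and everything else under /environment,
    // keeping the credentials separable from the general settings.
    func writeEnvironment(zk Conn, settings, credentials map[string]interface{}) error {
        for path, doc := range map[string]map[string]interface{}{
            "/environment":          settings,
            "/environment/provider": credentials,
        } {
            data, err := json.Marshal(doc)
            if err != nil {
                return err
            }
            if err := zk.Set(path, data); err != nil {
                return err
            }
        }
        return nil
    }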
<wrtp> back
<niemeyer> TheMue: Found it: http://paste.ubuntu.com/871544/
<wrtp> niemeyer: i merged and pushed the prereq branch
<niemeyer> wrtp: Brilliant
<niemeyer> wrtp: Thanks
<SpamapS> flacoste: I am, whats up?
<niemeyer> TheMue: There may be a few details, like the Done order being inverted, but wrtp's basis on it is pretty good
<niemeyer> TheMue: Ah, we agreed to get rid of Err for the moment too
<flacoste> SpamapS: did you intend to do an upload of juju to the archive today for beta freeze 2, if you do, can you make sure that the two branches necessary for it to work with maas are included?
<flacoste> pretty please
<flacoste> one of them is merged at revision 487
<TheMue> niemeyer: great, thanks. my agent watcher ones worked the same way (until i realized it wasn't needed here)
<wrtp> niemeyer: yeah, i think i had a slightly updated version that took into account our discussion; i'll have a look
<flacoste> the other one is still waiting to be merged: https://code.launchpad.net/~allenap/juju/use-maas-uris/+merge/98756
<niemeyer> SpamapS: Ah, if you are considering an update, please hold on until a couple of hours as I sort out the repo address with hazmat
<niemeyer> SpamapS: Will be a one-liner
<niemeyer> wrtp: Thanks much
<flacoste> niemeyer: well, he didn't reply yet, but i see you'd also like one :-)
<niemeyer> flacoste: Yeah :)
<SpamapS> flacoste: I had hoped to, but I have not seen subordinates fully drop yet
<TheMue> niemeyer: btw, do we have a (powerful) search interface to our irc logs? ;)
<niemeyer> TheMue: Very powerful one.. grep
<niemeyer> :)
<flacoste> SpamapS: i think it would still be worth it - even without subordinates, as that would allow people to test juju + maas from the archive
<TheMue> niemeyer: ok, on the command line
<flacoste> but subordinates would also be great!
<wrtp> niemeyer, TheMue: here's a slightly updated version. it still has Err though. http://paste.ubuntu.com/895174/
<wrtp> niemeyer: the question is: are we prepared to discard all errors that the watcher encounters?
<SpamapS> flacoste: I will file my FFe then.
<wrtp> niemeyer: (or just log them, i guess)
<niemeyer> hazmat: http://23.21.254.154
<niemeyer> hazmat: This will do for now..
<niemeyer> hazmat: It's an Elastic IP, so I can make sure it is preserved, and lands on something sensible shortly
<niemeyer> wrtp: Uh.. I'm certainly not prepared to discard any errors..
<hazmat> niemeyer, hmm.. those do get recycled.
<hazmat> niemeyer, ie. we can't ever release that eip
<hazmat> i mean we  can but its breakage
<wrtp> niemeyer: if there's no Err method, then the thing using the watcher can't get the error.
<niemeyer> wrtp: Wait returns the error..
<wrtp> niemeyer: FooWatcher doesn't have a Wait method
<niemeyer> hazmat: They don't get recycled unless I say so
<wrtp> niemeyer: maybe it should, but the watch channel can fulfil that role.
<niemeyer> hazmat: That's the main point of the Elastic IP
<niemeyer> wrtp: It has a Stop method that returns the result of Wait
<hazmat> niemeyer, yeah.. i just remember hearing some startup story about how they started getting netflix traffic because of a recycled eip and dns caches
<wrtp> niemeyer: does that mean if you get eof on the channel that you always have to call Stop?
<niemeyer> wrtp: It just means we have access to the error..
<niemeyer> wrtp: Let's see the code, please..
 * TheMue should develop a little Go app for searching inside the logs. in time ranges, with patterns, filtered by user, output with surrounding lines and links directly into the found occurrence
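For what it's worth, a minimal start on such a tool: filter a log on stdin by nick and regular expression. Time ranges, context lines, and links are left as an exercise.

    package main

    import (
        "bufio"
        "fmt"
        "os"
        "regexp"
        "strings"
    )

    func main() {
        if len(os.Args) != 3 {
            fmt.Fprintln(os.Stderr, "usage: logsearch <nick> <pattern> < irc.log")
            os.Exit(1)
        }
        nick, pattern := os.Args[1], regexp.MustCompile(os.Args[2])
        scanner := bufio.NewScanner(os.Stdin)
        for scanner.Scan() {
            line := scanner.Text()
            // Match lines spoken by the given nick that also match the pattern.
            if strings.HasPrefix(line, "<"+nick+">") && pattern.MatchString(line) {
                fmt.Println(line)
            }
        }
    }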
<wrtp> niemeyer: i pasted it above
<niemeyer> wrtp: Let's see that code being used I mean.
<wrtp> niemeyer: sounds good.
<niemeyer> wrtp: It's trivial to add an Err method if we need it
<wrtp> niemeyer: i don't mind if the caller always needs to call Stop actually.
<niemeyer> wrtp: It's not trivial to suggest good usage without any usage
<niemeyer> wrtp: Right, it should anyway
<niemeyer> And I should get lunch!
<niemeyer> biab
<wrtp> niemeyer: that's a nice invariant thing, even if we "know" that when we get eof on the channel, it's already stopped.
<wrtp> niemeyer: enjoy!
<flacoste> SpamapS: thanks!
<wrtp> TheMue: here's an updated version of the watcher demo code, according to the discussion above: http://paste.ubuntu.com/895188/
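Along the same lines as that demo, a sketch (not the paste itself) of the shape being agreed: values arrive on a channel, there is no Err method, and Stop both terminates the watcher and returns whatever error ended it, so a caller that sees the channel close calls Stop to learn why.

    package watcher

    // ContentWatcher delivers successive content values on Changes;
    // after the channel closes, Stop reports the terminating error.
    // Stop must be called exactly once.
    type ContentWatcher struct {
        changes chan string
        stop    chan struct{}
        done    chan error
    }

    func NewContentWatcher(poll func() (string, error)) *ContentWatcher {
        w := &ContentWatcher{
            changes: make(chan string),
            stop:    make(chan struct{}),
            done:    make(chan error, 1),
        }
        go w.loop(poll)
        return w
    }

    func (w *ContentWatcher) Changes() <-chan string { return w.changes }

    // Stop shuts the watcher down and returns the error, if any, that
    // terminated it, standing in for a separate Wait/Err pair.
    func (w *ContentWatcher) Stop() error {
        close(w.stop)
        return <-w.done
    }

    func (w *ContentWatcher) loop(poll func() (string, error)) {
        defer close(w.changes)
        for {
            content, err := poll()
            if err != nil {
                w.done <- err
                return
            }
            select {
            case w.changes <- content:
            case <-w.stop:
                w.done <- nil
                return
            }
        }
    }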
<flacoste> SpamapS: for the upload, any revision starting at 488 will be good for MAAS
<TheMue> wrtp: thx
<SpamapS> flacoste: great
<hazmat> andrewsmedina, its very much a work in progress and raw, but i went ahead and pushed the branch fwiw.
<SpamapS> bcsaller: any chance subordinates will land in the next 3 hours?
<SpamapS> hazmat: ^^ ?
<hazmat> SpamapS, no
<SpamapS> ok, will leave that one out. ;)
<flacoste> SpamapS: once your FFe is filed, just ask Daviey for approval :-)
<hazmat> SpamapS, i think we've fixed most of the issues, but it needs more unit tests, and support for departed hooks
<hazmat> SpamapS, one moment, i've got one more branch to add
<hazmat> its a maas beautification thing
<SpamapS> I won't be uploading until much later
<SpamapS> just want to know what to put in the FFE
<SpamapS> whats the status of constraints at the moment?
<hazmat> SpamapS, the branch currently removes support for some previous environments.yaml settings
<hazmat> SpamapS, its ready to go in though
<hazmat> the backwards compatibility thread basically stopped its merge
<SpamapS> as it should. :)
 * hazmat sighs..
<hazmat> that's my fault
<fwereade_> bigjools, allenap: maas branches merged, one of you please verify it still works ;)
<hazmat> i delayed on its review/merge because i wasn't comfortable with the static constraints instead of provider based ones.
<hazmat> but that wasn't really important, given the big picture
<hazmat> so now compatibility
<allenap> fwereade_: I'll give it a go. bigjools is probably fast asleep.
<hazmat> fwereade_, thanks for merging all those maas branches
<hazmat> hmm.. eucalyptus compatibility..
<hazmat> niemeyer, fwereade_, jimbaker  can i get a +1 for this trivial (charm-store-url) http://paste.ubuntu.com/895250/
<hazmat> moving into cowboy phase
<fwereade_> hazmat, a pleasure
<hazmat> niemeyer, is there a store  hooked up to that ip address?
<fwereade_> hazmat, +1 despite the inherent cowboyishness, assuming niemeyer says "yes" to your last question ;)
<fwereade_> allenap, cool
<jimbaker> hazmat, +21
<jimbaker> well i meant +1
<hazmat> fwereade_ there's a python store impl in the code base ? i wanted to try an end2end test
<fwereade_> hazmat, there is a kinda hackish one, would be better to use the real one really
<hazmat> fwereade_, doesn't appear to be running atm
<hazmat> fwereade_, where is the python one?
<hazmat> nm.. oh.. i guess i can just run the go one locally
<fwereade_> hazmat, lp:~fwereade/juju/charm-store-hack fwiw
<hazmat> quite a few conflicts running charmload in parallel
<hazmat> team meeting? i'm happy to skip it
<jimbaker> hazmat, +1 on skip
<fwereade_> hazmat, also +1 :)
<niemeyer> hazmat: No, it's a public IP unhooked
<niemeyer> hazmat: But there will be
<niemeyer> hazmat: Soon!
<SpamapS> niemeyer: wtf is 5 revs behind..
<niemeyer> SpamapS: Will check, thanks
<fwereade_> hazmat, hm, what parallel branch should we be working against?
<niemeyer> wrtp: So admin-identity is showing the real thing now?
<niemeyer> hazmat: +1 on the address.. -1 on env variable.
<wrtp> niemeyer: i'm not sure what you mean by "the real thing" there
<niemeyer> wrtp: The CL
<wrtp> niemeyer: yeah
<niemeyer> wrtp: pre-req issues
<wrtp> niemeyer: yes
<wrtp> niemeyer: there are a few other minor changes too (formatting, extra log messages) which i'm hoping you don't mind me bundling in the same CL
<wrtp> niemeyer: i'm a bit concerned that the live tests fail with relatively high frequency (~ 40% without actually measuring). it's due to transient errors from the EC2 servers. i'm wondering if we should make goamz/ec2 automatically retry when it gets one of those errors.
<niemeyer> wrtp: Huh?
<niemeyer> wrtp: I thought we had just addressed that?
<wrtp> niemeyer: that was different
<wrtp> niemeyer: that was dealing with eventual-consistency issues, not random server failure
<niemeyer> wrtp: What's random server failure?
<wrtp> niemeyer: here are a few examples i've collected: http://paste.ubuntu.com/895342/
<wrtp> niemeyer: i've seen all of those errors multiple times
<wrtp> niemeyer: actually, the first one i've only seen once.
<niemeyer> wrtp: no instances found is not a random server error..
<niemeyer> wrtp: Seems like normal eventual consistency issues
<wrtp> niemeyer: yeah, that's different, sorry, it shouldn't have been there.
<niemeyer> Ok, np
<niemeyer> wrtp: Yeah, those errors are awkward
<wrtp> niemeyer: i *think* that the ec2 package is best placed to deal with them
<wrtp> goamz/ec2, that is
<niemeyer> wrtp: I'm happy to have a pass at that after I get rid of my current assignments
<wrtp> niemeyer: i could do it if you like
<niemeyer> wrtp: I've implemented something I'm comfortable with in that area before
<niemeyer> wrtp: Please leave that with me as I've done it before.. should hopefully not take long
<niemeyer> wrtp: I'll copy some code over
<wrtp> niemeyer: cool, np.
<wrtp> niemeyer: the "remote error: handshake failure" error comes from the crypto/tls package, BTW
<niemeyer> wrtp: I was guessing that..
<niemeyer> wrtp: It's probably the same "unexpected EOF" problem in a different client side location
<wrtp> niemeyer: it doesn't look like it, actually.
<wrtp> niemeyer: the "handshake failure" comes from an error code sent by the remote side.
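The sort of wrapper being discussed for goamz/ec2, sketched minimally (not the code niemeyer had in mind): retry an operation a few times with exponential backoff when the failure looks transient.

    package retry

    import "time"

    // transient reports whether an error is worth retrying; real code
    // would inspect EC2 error codes and network failures rather than,
    // as here, treating every error as retryable.
    func transient(err error) bool {
        return err != nil
    }

    // withRetries runs op up to attempts times, doubling the delay
    // between tries, and returns the first success or the last error.
    func withRetries(attempts int, delay time.Duration, op func() error) error {
        var err error
        for i := 0; i < attempts; i++ {
            if err = op(); err == nil || !transient(err) {
                return err
            }
            time.Sleep(delay)
            delay *= 2
        }
        return err
    }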
<wrtp> have we got a meeting now?
<niemeyer> wtf      20165  0.0  5.0 194760 25184 pts/2    Sl+  Mar20   0:49 python /home/wtf/ftests/build/juju/bin/juju status
<niemeyer> SpamapS: ^
<niemeyer> hazmat: ^
<niemeyer> That's why the wtf is locked up
<niemeyer> TheMue: and yeah, timeouts are good ;)
<niemeyer> wrtp: Indeedfully!
<TheMue> niemeyer: :D
<wrtp> timeouts are good, but go makes it easy to program them at the level they're needed, rather than baking them into every call...
<niemeyer> wrtp: It's great that no one is suggesting that we should bake them in every call then.
<wrtp> niemeyer: cool.
<niemeyer> wrtp: It's also great that we seem to have found pretty good spots for our timeouts so far.
<wrtp> niemeyer: just saying, before we start doing it :-)
<niemeyer> wrtp: Sure, just sayin' too..
<hazmat> SpamapS, did you have any luck tracking how to reproduce that destroy service issue?
<SpamapS> hazmat: its on my todo after I get through the 500+ email backlog for today ;)
<SpamapS> hazmat: but it happened twice where destroy-service gave that 'no node' and then had to be run one more time.
<niemeyer> SpamapS: wtf is alive again
<SpamapS> niemeyer: thanks!
<niemeyer> SpamapS: np.. we should have some kind of timeout to kill/retry the tests
<niemeyer> wrtp: LGTM on admin-identity, with a trivial only
<wrtp> niemeyer: thanks
<wrtp> wasn't this meeting supposed to start 40 minutes ago?
<wrtp> (or 1h40m ago if you look at the calendar event...)
<niemeyer> wrtp: Yeah.. I'm happy to have it now
<niemeyer> TheMue, hazmat, fwereade_, bcsaller, jimbaker?
<niemeyer> Perfect timing :-)
<wrtp> not william, it seems :-)
<bcsaller> heh
<niemeyer> William is probably a bit overloaded with meetings.. I think we spent something like 4h on G+ yesterday
<niemeyer> fwereade_: Didn't we? :)
<fwereade_> niemeyer, sorry, I think I missed something
<niemeyer> <niemeyer> William is probably a bit overloaded with meetings.. I think we spent something like 4h on G+ yesterday
<niemeyer> wrtp: Ok, I think that meeting isn't flying :-)
<wrtp> hmm, i just joined a hangout with fwereade_ and hazmat, but i don't think it was the team meeting :-)
<niemeyer> wrtp, fwereade_, TheMue: I'd like to take that chance to schedule a weekly Go port meeting
<wrtp> niemeyer: looks like it
<wrtp> niemeyer: i think we should definitely do that
<niemeyer> When is a good time for you?  I guess my mornings are best?
<wrtp> niemeyer: the usual meeting is too many people to get useful stuff discussed, i think
<niemeyer> TheMue, fwereade_?
<wrtp> niemeyer: mornings definitely best, yes
<fwereade_> niemeyer, sorry, talking to hazmat
<niemeyer> fwereade_: No need to apologize.. I'm fine with you talking to hazmat.
<niemeyer> ;-)
<niemeyer> wrtp: What about.. Monday, at ..
<niemeyer> 14UTC?
<wrtp> niemeyer: i'm going to be away for two mondays in a row soon (the 2nd and the 9th) if that makes a difference
<niemeyer> wrtp: Well, it certainly does
<wrtp> but usually that would be fine, and the time's fine too
<niemeyer> wrtp: We can start with Tuesday, same time, and then move back to Monday once you're done with those
<wrtp> niemeyer: i'm away the entire week of the 2nd and monday and tues the following week
<niemeyer> Ugh, ok :)
<wrtp> niemeyer: (two public holidays around easter & i'm taking some other days too)
<niemeyer> Well, let's do it Monday then..
<niemeyer> Since you'll be off next week, no day will fit :)
<niemeyer> Next week => Week of the 2nd
<wrtp> niemeyer: yeah
<wrtp> niemeyer: but i'm around wed-fri on the week after that
<niemeyer> wrtp: Sure, we can move that week maybe, or have two meetings on the same week
<wrtp> niemeyer: sounds good.
<niemeyer> I'll add to the calendar
<wrtp> niemeyer: i like the idea of a monday meeting in general
<niemeyer> wrtp: yeah, it's good to kick the week off in a good way
<wrtp> yeah
<fwereade_> wrtp, niemeyer: sorry... but week after next I'll be off mon/tue :(
<niemeyer> fwereade_: No worries.. people will be on/off casually.. we can adapt specific events
<TheMue> niemeyer: monday is ok
<fwereade_> wrtp, niemeyer: in principle, though, ++monday
<TheMue> niemeyer: could it be that tomb isn't tagged for the current weekly?
<wrtp> fwereade_: was just looking at adding a juju destroy-environment command. is there any command that *doesn't* take a --environment flag?
<niemeyer> TheMue, fwereade_, wrtp: If I didn't screw up, you should have received an event notification at the adequate time
<fwereade_> wrtp, there are certainly mooted ones, that don't exist yet
<niemeyer> TheMue: Checking
<wrtp> fwereade_: ok. just thought it could be a general argument that applied to all.
<wrtp> fwereade_: what mooted commands might not talk to the juju environment, BTW?
<fwereade_> wrtp, I'm not sure if there are any that exist right now, but sadly it's not completely general
<fwereade_> wrtp, juju source to branch a charm, I think
<niemeyer> TheMue: You're right.. pushing
<TheMue> niemeyer: great, thx
<wrtp> fwereade_: hmm, i wouldn't mind if that command still accepted the --environment flag, but errored out in ParsePositional if so.
<fwereade_> wrtp, it's not appropriate for supercommand though -- jujud uses it too and that's never available there
<wrtp> fwereade_: as it's definitely the exception rather than the rule
<fwereade_> wrtp, an embedded environment-arg-handling type would see a fair bit of use though
<wrtp> fwereade_: yeah, i'll do something like that.
<fwereade_> wrtp, cool
<fwereade_> hazmat, improved status output LGTM
<fwereade_> off for a while, everyone, take care
<niemeyer> hazmat: Sorry for bothering, but is the change to drop the env var for the store coming?
<niemeyer> hazmat: Just want to make sure SpamapS has the right thing to publish, if an update is rolling
<wrtp> fwereade_: have fun
<wrtp> i'm off too now. see you tomorrow
<niemeyer> wrtp, fwereade_: Cheers²
<TheMue> and i'm finishing for today too, bye
<hazmat> niemeyer, was on another call, just checked in with SpamapS, the change will be in, working on it now
<niemeyer> hazmat: Thanks a lot
<niemeyer> hazmat: Synchronizing re. store with IS ATM
<flacoste> SpamapS: do we have a package yet?
<SpamapS> flacoste: no, a few more commits trickling in
<flacoste> SpamapS: 35 minutes man ;-)
 * niemeyer breaks down for a moment
<SpamapS> OH
<SpamapS> flacoste: I misunderstood the timing
<flacoste> SpamapS: daylight savings confused you?
<SpamapS> no I thought it was tomorrow morning
<SpamapS> my bad
<hazmat> someone just pointed out to me a  memory leak in txzk
<hazmat> might be in the bindings.. still investigating
<flacoste> SpamapS: do you think you can still get it in?
<SpamapS> flacoste: working as fast as I can
<flacoste> we are in good hands!
<SpamapS> test suite fails
<SpamapS> http://paste.ubuntu.com/895639/
<SpamapS> hazmat: ^^
<hazmat> SpamapS, sigh.. so much for the cowboy
<hazmat> SpamapS, un momento
<SpamapS> lol
 * SpamapS blows smoke off his six shooter and reloads
<hazmat> SpamapS, done
<SpamapS> what was the deal?
<hazmat> SpamapS, we changed the store.charms.ubuntu.com hardcoded url to an elastic ip address for now
<hazmat> SpamapS, we'll change back b4 the rc once the store domain issues are resolved
<SpamapS> oh boy fun
<SpamapS> and more fails
<SpamapS> http://paste.ubuntu.com/895650/
<SpamapS> sorry guys this is a total no-go
<SpamapS> Even if I upload
<SpamapS> it will FTBFS
<rogpeppe> niemeyer: i'm just trying to plan summer hols. i've got "platform rally" in my diary for june 25th to 29th, but there seems no indication of that on the wiki. is there anything happening at around that time or has it been cancelled?
<fwereade_> SpamapS, huh, I wasn't familiar with that acronym but I'd expected at least *one* of the Fs to mean what I thought
<SpamapS> fwereade_: Fail To Build From Source
<hazmat> argh.. running full suite
<SpamapS> https://launchpadlibrarian.net/97931138/buildlog_ubuntu-natty-i386.juju_0.5%2Bbzr492-1juju3~natty1_FAILEDTOBUILD.txt.gz
<SpamapS> hazmat: is there some reason this test is slow: juju.control.tests.test_upgrade_charm.RemoteUpgradeCharmTest
<SpamapS> test_latest_dry_run
<SpamapS> I mean
<SpamapS> perhaps something new to mock in twisted?
<hazmat> SpamapS, yes its.. still using the old domain address to mock out an external call, which doesn't match so then it goes external to the actual ip, but that's not hooked up yet, so it hangs
<SpamapS> Ok
<SpamapS> so.. just a lot of last minute thrashing w/o running the test suite?
<hazmat> yeah.. not in full
<hazmat> one constant typed out everywhere
<SpamapS> Heh.. that'll teach ya
<SpamapS> so it was committed as a trivial..
<SpamapS> but it really wasn't :)
<SpamapS> ping me when there's a new commit, I'll re-try and re-upload. Can you also document the whole thing in a bug (release team requests as much)
<SpamapS> Its not clear at all *why* we made that last minute change from teh changelog.
<niemeyer> rogpeppe: I actually don't know how that looks like yet
<rogpeppe> niemeyer: ok, thanks.
<hazmat> niemeyer, so why are we changing the url to an ip address, if we need to change it back again as an SRU?
<hazmat> niemeyer, ie. isn't it better to just fix the domain name
<niemeyer> hazmat: Hmm.. I guess the ip address is a bad idea indeed, as it'll make it harder to switch in a bit.
<niemeyer> hazmat: I'm just trying to move things forward.. right now there's some contention going on in terms of getting the store moving
<niemeyer> hazmat: I'd prefer to use a domain, but apparently using a ubuntu.com in my account is an issue
#juju-dev 2012-03-23
<hazmat> SpamapS, any chance you haven't uploaded already?
<hazmat> SpamapS, we were just debating the whole value of using an ip address..
<hazmat> versus just using a domain name, and letting dns updates figure it out..
<SpamapS> hazmat: I am delaying uploading
<SpamapS> I think the IP should be fine as long as there is only one place to change it
<hazmat> SpamapS, niemeyer if we use the IP we will need to SRU, but the IP will work till then
<hazmat> SpamapS, yeah.. i centralized all the refs to it, one line change now
<niemeyer> SpamapS: Yeah, hazmat is right.. with a temporary domain, we'll still have to SRU to fix the domain soonish, but we can also switch the A record at the same time to the real store
<niemeyer> SpamapS: Which means we can stop the temporary store as soon as the new one is live
<niemeyer> (and the domain cache expires)
<SpamapS> I'd be much happier with a single line change SRU to re-point things than a non-working distro version.. :)
<SpamapS> Would be nice if users could work around archive sync issues with an environment variable tho
<niemeyer> SpamapS: Hmm.. there's no reason for the distro version to be non-working.. that's actually another argument in favor of the domain name
<niemeyer> SpamapS: Much better than a variable too, because it actually works without any environment changes
<SpamapS> I guess my point is, if you can get a version out now that works with the store, but introduces a potential need to re-point things later.. I'd prefer that over no store functionality.
<niemeyer> SpamapS: We're on the same page
<niemeyer> SpamapS: Let's point the code to juju-store.labix.org
<niemeyer> SpamapS: I'll make that work over the next couple of days
<niemeyer> SpamapS: If we need to SRU later, it'll be just to have it pointing to something more official. That domain can continue working for as long as necessary.
<SpamapS> hmm
<SpamapS> I'm just looking here..
<SpamapS> how does the charm store protect against MITM?
<SpamapS> I see no security in there
 * SpamapS fights the urge to facepalm
<niemeyer> SpamapS: There's none anymore.. that was https, and will be back one day once we manage to make devops a reality.
<SpamapS> Ok, so, in that case, I'd rather not have a charm store. :)
<niemeyer> SpamapS: Please don't facepalm.. I've been doing that and it's starting to hurt..
<SpamapS> at least with bzr branches from launchpad, you have ssh closed loop with users uploading
<niemeyer> SpamapS: OKAY.. maybe I should just shutdown the computer and get a beer. :)
<SpamapS> niemeyer: maybe we should invent a special glove with soft palms for doing code review. :)
<niemeyer> SpamapS: It's not even about code reviews.. it's really about the store
<SpamapS> niemeyer: I have some thoughts on how we can have our charm store cake and eat it too .. we'll need to embed a CA cert in juju but its better than nothing.
<niemeyer> SpamapS: I just want this thing live.. we can very easily improve things, move back and forth and so on
<SpamapS> OR we can all chip in $20 and get a real cert ;)
<niemeyer> SpamapS: We'd not even have to pay.. there are free certs in some providers
<SpamapS> true
<SpamapS> as long as none of them involve a rectal exam I'll volunteer to investigate
<niemeyer> SpamapS: I've registered one before with a registrar that I can't remember right now, but I have an SSL I can check
<SpamapS> StartSSL.com I believe might work
<niemeyer> SpamapS: If you feel strong about it, just put https://juju-store.labix.org in there, and we'll sort it out
<SpamapS> not sure if their CA is in the ubuntu ca certs.. but we can add one in the packaging
<niemeyer> SpamapS: The ones in that registrar I'm sure it is
<niemeyer> SpamapS: In the one I don't recall right now
<SpamapS> niemeyer: I'm worried about the fact that charms run as root is all..
<niemeyer> SpamapS: I know.. there's a reason why it was https://store.juju.ubuntu.com
<SpamapS> niemeyer: indeed
<niemeyer> SpamapS: I've just been conceding on issue after issue
<SpamapS> niemeyer: I don't think we're going to solve this today... sounds like its been a long, weird day today for everybody
<hazmat> I vote we have it at the original https://store.juju.ubuntu.com
<hazmat> it will get sorted out when it gets sorted out then
<niemeyer> hazmat: Heh
<hazmat> in the end that's where its supposed to be
<niemeyer> hazmat: In the end that's where it will be
<niemeyer> hazmat: Not today, not tomorrow
<hazmat> niemeyer, but before the release i assume
<niemeyer> hazmat: I don't know..
<niemeyer> hazmat: I don't have control over it
<hazmat> and it might be solved next week... and its still fine... or even two months from now and its less fine.. but we can have env var for it..
<niemeyer> hazmat: It's up to you to decide, though.. if you update to that store and it's put live elsewhere, you'll have to deal with the change later.
<hazmat> its certainly not going to be at an ip address, which is what it is now
<niemeyer> hazmat: I'm happy with whatever, really
<hazmat> i'd rather pick the final destination, and the rest of it will take care itself... whenever it does
<niemeyer> hazmat: I'll be working to put that store live somewhere
<SpamapS> oddly enough.. the IP is a lot harder to MITM than the hostname
<SpamapS> still not "secure enough"
<SpamapS> one evil wifi is all it would take
<niemeyer> SpamapS: This is a test phase
<SpamapS> but yeah.. I have no answers for you guys.. :-P
<niemeyer> But I don't care, really.. I've been facing way too many walls for that stuff.. I'll just make it work and people can decide.
<hazmat> i'd like to go ahead and change it back to https://store.charms.ubuntu.com
<SpamapS> If left at store.juju.ubuntu.com ...
<SpamapS> with https...
<SpamapS> could we come up with ways to help people install a ca cert that we provide, for testing, and a /etc/hosts entry?
<SpamapS> I know thats *super* yuck
<SpamapS> just trying to flesh out the options
<niemeyer> SpamapS: IS won't provide me with the SSL for that domain, and they're not happy with it pointing to something I manage either
<SpamapS> Right, but we could use our own CA
<SpamapS> and put that CA in a "juju-testing-ca" package
<hazmat> niemeyer, are you sure about the store env var..
<niemeyer> SpamapS: Heh.. using another domain name is a *lot* simpler
<SpamapS> aye
<niemeyer> hazmat: Absolutely
<SpamapS> it is
<SpamapS> ok, I think I'll go see what beer is on sale at whole foods now :)
<niemeyer> I'd love some company for a beer right now, to be honest
 * hazmat too
<SpamapS> St. Paulie Girl is nice company :)
<SpamapS> she always smiles at my jokes
 * SpamapS disappears
<niemeyer> hazmat: The machine is stopped at
<niemeyer> Unpacking python-txzookeeper (from .../python-txzookeeper_0.8.0-0ubuntu1_all.deb) ...
<niemeyer> Setting up python-txzookeeper (0.8.0-0ubuntu1) ...
<niemeyer> You have not informed bzr of your Launchpad ID, and you must do this to
<niemeyer> write to Launchpad or access private data.  See "bzr help launchpad-login".
<niemeyer> hazmat: It's also booting Oneiric.. have we not switched to Precise yet?
<hazmat> hmm.. i've been using default-series
<hazmat> niemeyer, what do you have specified in your environments.yaml for default-series?
<hazmat> niemeyer, can you pastebin console output
<niemeyer> hazmat: Nothing
<hazmat> that bzr message is pretty standard
<niemeyer> hazmat: I mean, I don't have anything special in default-series
<hazmat> its a one time thing
<hazmat> niemeyer, do you have 'precise' there?
<niemeyer> hazmat: I don't have the setting at all
<niemeyer> hazmat: Using juju from trunk on 486
<niemeyer> hazmat: Hmmm.. still using juju-branch, though.. this is bogus
<niemeyer> hazmat: It's juju origin now, isn't it
<hazmat> juju-origin
<niemeyer> hazmat: Do I need to set the series for it to pick precise?
<hazmat>  lp:juju
<hazmat> yes
<niemeyer> hazmat: Hmmm.. shouldn't it pick it automatically, given that it's the expected value for 12.04?
<hazmat> niemeyer, yeah.. i was just digging into why that is
<hazmat> so the environment doesn't actually require default-series, so its None, but the get ami method has a default release value of oneiric
<niemeyer> hazmat: Alright, using default-series for the moment, switched to juju-origin, and restarting
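
For reference, the settings under discussion are per-environment keys in environments.yaml; a minimal sketch with illustrative values:

    environments:
      sample:
        type: ec2
        default-series: precise   # unset, the ami lookup falls back to oneiric
        juju-origin: lp:juju      # formerly juju-branch
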
<hazmat> i use ec2 on a regular basis
<hazmat> niemeyer, have you checked gozk for mem leaks?
<niemeyer> hazmat: Hmm.. not that I recall.. why?
<hazmat> niemeyer, someone handed me a txzk example on #twisted that showed some leakage
<hazmat> i was about to convert it to another library to isolate where the problem lies
<niemeyer> hazmat: Not entirely surprising.. I've fixed a few leaks early on in python-zookeeper
<hazmat> yeah.. i suspect the python bindings indeed
<niemeyer> hazmat: and it was entirely "by chance", realizing while looking at the code
<niemeyer> hazmat: I wouldn't be surprised if there's more of the same
<niemeyer> hazmat: Go is mark & sweep, so it's harder to get that kind of leak
<hazmat> i was going through some of the stdlib yesterday, its quite readable
<niemeyer> hazmat: Okay, kicking another run
<niemeyer> of the wtf
<niemeyer> hazmat: Yeah, they're pretty diligent about it
<niemeyer> hazmat: There was also a serious push for Go 1
<niemeyer> hazmat: So it's better in quite a few ways than it was 4 months ago
<niemeyer> hazmat: Still no go :(
<niemeyer> hazmat: juju status is blocked again..
<niemeyer> Will log onto the machine to see what's up
<hazmat> niemeyer, can you strace it?
<niemeyer> hazmat: Setting up the env
<niemeyer> hazmat: "2012-03-22 21:20:13,919 DEBUG Environment still initializing. Will wait."
<niemeyer> hazmat: Seems to have reached the server
<niemeyer> hazmat: Agents are running..
<niemeyer> hazmat: zk is empty.. initialization never ran indeed
<niemeyer> hazmat: Have you been using juju-origin?
<niemeyer> hazmat: To run with a branch, that is
<niemeyer> hazmat: http://paste.ubuntu.com/895896/
<niemeyer> hazmat: That's probably it..
<niemeyer> SpamapS: So, wtf is stopped because deployment is broken for real
<hazmat> odd
<hazmat> that's the new setuptools bit for pypi mac installs
<niemeyer> Time for some sleep
<niemeyer> Night all
<hazmat> its fixed in trunk
<hazmat> sleep sounds good
<TheMue> rogpeppe: moin
<rogpeppe> TheMue: hiya
<TheMue> rogpeppe: same beautiful weather as here? yesterday i worked for some hours on our veranda, watching our bunnies on the lawn
<rogpeppe> TheMue: it was beautiful yesterday, but today it's misty.
<rogpeppe> TheMue: went for a longish bike ride this morning and should have had lovely views but didn't
<TheMue> rogpeppe: *sigh* hope it will get better today
<TheMue> rogpeppe: oh, carmen called, should help her with a flower in the garden, brb
<rogpeppe> TheMue: who's carmen?
<TheMue> rogpeppe: my wife, she's doing spring preparations
<rogpeppe> TheMue: funny, my wife is also called Carmen!
<TheMue> rogpeppe: yep, we already talked about it. *lol*
<rogpeppe> i'd totally forgotten!
<TheMue> rogpeppe: hehe
<TheMue> rogpeppe: already started uds preparation?
<rogpeppe> TheMue: no, am going to book tickets today
<TheMue> rogpeppe: i've got my flights booked and now doing the ESTA form
<TheMue> rogpeppe: i would like you or william as room mates, same or almost the same time zones ;)
<rogpeppe> TheMue: my time zone is always weird at events like that!
<TheMue> rogpeppe: hehe
<rogpeppe> TheMue: flights sorted
<TheMue> rogpeppe: fine. i'll arrive sunday around 12:30 at SFO
<TheMue> rogpeppe: so enough time to travel to the hotel, look around a bit and go to bed early
<rogpeppe> TheMue: i arrive 1420 SFO
<TheMue> rogpeppe: ah, also not too late
<rogpeppe> TheMue: and leave 1855 on the following saturday
<TheMue> rogpeppe: arrival is 1220, leaving is 1250
<rogpeppe> fwereade_: ping
<TheMue> rogpeppe: i've been in since 7:31 (your time) and haven't seen him yet
<TheMue> rogpeppe: maybe he's biking like you
<fwereade_> rogpeppe, pong
<rogpeppe> ha!
<fwereade_> TheMue, I'm around, I'm just quiet :p
<TheMue> fwereade_: hehe, ok
<rogpeppe> fwereade_: just wanted to chat about testing strategy
<rogpeppe> fwereade_: when you've got a mo
<fwereade_> rogpeppe, 10 mins? want to get a propose in and get a mnemonic test failure in another branch
<rogpeppe> fwereade_: np
<rogpeppe> fwereade_: any time
<rogpeppe> fwereade_: ping me
<rogpeppe> i discovered yesterday evening that my ex-lorry-driver neighbour, in his 80s, has installed ubuntu. i'm impressed.
<TheMue> rogpeppe: maybe he has seen that video of the daddy trying win, os x and ubuntu
<rogpeppe> TheMue: i don't think so. apparently a relation of his is using it, so he gave it a go. he says he gets a bit lost when he has to use the command-line...
<rogpeppe> TheMue: but otherwise he finds it pretty good.
<rogpeppe> TheMue: he's considerably older than the guy in that video
<rogpeppe> TheMue: they've been living in this street for 60 years
<TheMue> rogpeppe: wow
<rogpeppe> TheMue: he came across last night because he'd seen my wi-fi network and wanted to warn me that i should secure it properly!
<fwereade_> rogpeppe, golly :)
<TheMue> rogpeppe: i don't think i'll reach this number here. we have a quite large house, and we plan to move to a smaller one when we get older and the kids have left the house
<fwereade_> rogpeppe, I may have to wait a little longer, cath needs my help a mo
<TheMue> rogpeppe: he is really fit
<rogpeppe> fwereade_: np
<TheMue> rogpeppe: i've changed my wifi to wpa2. now my sister-in-law who is living in our house too has troubles. she's using xp, and by default it doesn't support wpa2.
<rogpeppe> TheMue: oh dear.
<TheMue> rogpeppe: maybe she should give ubuntu a try too
<rogpeppe> TheMue: depends how much she's dependent on windows stuff. some people, particularly those that are somewhat more tech-savvy, find it hard, because the microsoft-compatible apps don't work too well
<TheMue> rogpeppe: she mostly is doing web and mail and only private usage of office apps from time to time. so it could work.
<rogpeppe> TheMue: if she's an Outlook user, it could be hard
<TheMue> rogpeppe: no, only web mail (thankfully)
<rogpeppe> TheMue: she could also install WPA2 support, presumably.
<TheMue> rogpeppe: i had to use outlook the last 8 years, argh
<TheMue> rogpeppe: yep, i've dl'ed two fixes. maybe it's working then
<fwereade_> rogpeppe, so, testing
<fwereade_> rogpeppe, I saw your collected random failures which did indeed look a little alarming
<rogpeppe> fwereade_: i'm thinking of testing commands
<fwereade_> rogpeppe, ah, ok
<rogpeppe> fwereade_: so... i want to try to avoid testing the same stuff many times
<rogpeppe> fwereade_: so i'm wondering what you think about putting a dummy environs interface underneath the command stuff
<rogpeppe> fwereade_: which sends on a channel when a given operation is executed, but doesn't actually do anything
<rogpeppe> fwereade_: although it may start a zookeeper for the state
<fwereade_> rogpeppe, yeah; I think having an *actual* dummy environ type, like we do in pyjuju, and working against a real zookeeper, is probably sensible
<rogpeppe> fwereade_: yeah. for the moment i'm putting it in cmd/juju/environ_test.go, but it might be useful elsewhere in future
<fwereade_> rogpeppe, sounds good
<rogpeppe> fwereade_: i'm exporting the Environ within juju.Conn
<rogpeppe> fwereade_: that way there's a way in
<fwereade_> rogpeppe, cool
<rogpeppe> fwereade_: ok, great. just making sure you were ok with the general thrust.
<fwereade_> rogpeppe, definitely sounds sensible to me
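
A sketch of the dummy-environ idea rogpeppe describes: an environ that reports each operation on a channel instead of touching a real provider, so command tests can assert on what was requested. The interface here is abbreviated and the names are assumptions, not the real juju Environ:

    package main

    import "fmt"

    // Operation records something the command under test asked for.
    type Operation struct {
        Kind string // e.g. "bootstrap", "destroy"
    }

    // Environ is a deliberately tiny stand-in for the real interface.
    type Environ interface {
        Bootstrap() error
        Destroy() error
    }

    // dummyEnviron sends each operation on ops and does nothing else.
    type dummyEnviron struct {
        ops chan<- Operation
    }

    func (e *dummyEnviron) Bootstrap() error {
        e.ops <- Operation{Kind: "bootstrap"}
        return nil
    }

    func (e *dummyEnviron) Destroy() error {
        e.ops <- Operation{Kind: "destroy"}
        return nil
    }

    func main() {
        ops := make(chan Operation, 16) // buffered so the dummy never blocks
        var env Environ = &dummyEnviron{ops: ops}
        env.Bootstrap() // the command under test would trigger this internally
        fmt.Println("observed:", <-ops)
    }
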
<rogpeppe> fwereade_: as cmd/juju is "yours" :-)
<fwereade_> rogpeppe, kinda, I guess :p
<rogpeppe> fwereade_: you touched it last anyway. i think that gives you some veto rights :-)
<rogpeppe> fwereade_: i'm tearing it apart anyway :-)
<fwereade_> rogpeppe, haha, that does sound a little alarming :p
<fwereade_> rogpeppe, what's it conspicuously lacking for your purposes?
<rogpeppe> fwereade_: don't worry too much. it's all with due respect, i hope
<rogpeppe> fwereade_: it's mostly just refactoring shared stuff between commands, now that there's more than one
<rogpeppe> fwereade_: the basic structure remains
<fwereade_> rogpeppe, cool, I fully expect bootstrap to take a justified kicking; I would worry a little if you were planning to gut SuperCommand
<rogpeppe> fwereade_: oh no! i wouldn't dream of it.
<rogpeppe> fwereade_: currently i'm strictly within cmd/juju
<fwereade_> rogpeppe, ah, cool, that's great then ;p
<fwereade_> rogpeppe, and ofc if cmd itself needs work that's fine as well, the only reason I'd be nervous of that is because a new client is on the way
<rogpeppe> fwereade_: a new client?
<fwereade_> rogpeppe, cmd/jujuc/server will also use it, it fits too nicely to go duplicating stuff like command selection IMO
<rogpeppe> fwereade_: i thought it already did.
<fwereade_> rogpeppe, not in trunk
<fwereade_> rogpeppe, yet
<rogpeppe> fwereade_: ah, i didn't know if it'd been merged yet
<fwereade_> rogpeppe, that specific branch is currently not even proposed, I want to lose a bit of the backlog first before I make it work with the final versions of the stuff currently in review
<rogpeppe> fwereade_: ah, i lose track
<fwereade_> rogpeppe, tell me about it ;)
<TheMue> lunchtime
<hazmat> morning folks
 * hazmat tracks down a memory leak
<rogpeppe> hazmat: morning
<rogpeppe> lunch
<TheMue> back on the veranda
<niemeyer> Good morning everybody
<rogpeppe> niemeyer: yo!
<TheMue> niemeyer: morning
<rogpeppe> back
<hazmat> g'morning niemeyer
<fwereade_> niemeyer, if I was stupid and set the wrong -for in lbox propose, how do I do a new MP for that branch?
<niemeyer> Hey all
<niemeyer> fwereade_: Just delete the previous one
<fwereade_> niemeyer, d'oh :)
<niemeyer> fwereade_: ;)
<fwereade_> hazmat, when you have a moment:
<fwereade_> hazmat, https://codereview.appspot.com/5882056/
<fwereade_> hazmat, https://codereview.appspot.com/5876070/
<fwereade_> hazmat, https://codereview.appspot.com/5874064/
<fwereade_> hazmat, in that order
<niemeyer> fwereade_: Please note that there's a minor detail rogpeppe pointed out in terms of pre-req that still must be fixed
<fwereade_> niemeyer, oh, bother, missed that
<fwereade_> niemeyer, what's the problem?
<niemeyer> fwereade_: If you merge the pre-req, and then merge trunk onto the follow up branch, the next propose will still be diffing against the pre-req
<niemeyer> fwereade_: It's not such a big deal
<niemeyer> fwereade_: pre-req is still useful even with that
<rogpeppe> it confused me quite effectively though!
<niemeyer> fwereade_: You'll just get a wrong diff in the specific case where you merge trunk on the follow up
<rogpeppe> (not that it takes much :-])
<niemeyer> (and not in the pre-req)
<fwereade_> niemeyer, ah ok, so it really just introduces more noise as it remains unreviewed?
<fwereade_> niemeyer, that's *almost* a feature :p
<niemeyer> fwereade_: yeah, if you merge trunk on the pre-req too, it fixes the problem
<fwereade_> niemeyer, cool
<niemeyer> fwereade_: Yeah ;)
<niemeyer> fwereade_: I have to fix lbox so that it checks the merge state of the pre-req before deciding what to diff against
<fwereade_> niemeyer, sounds good
<niemeyer> hazmat: What do we do with the wtf?
 * rogpeppe feels all virtuous every time he writes "package testing_test"
<hazmat> niemeyer, i fixed that bug in trunk
<niemeyer> hazmat: Awesome, so I guess we'll have to jump over a few revisions
<niemeyer> hazmat: Will do that in a bit
<niemeyer> rogpeppe: :-)
<TheMue> rogpeppe: pls take a look at http://paste.ubuntu.com/896502/. it realizes a Watcher as a util type where custom behaviors with three methods (only those individually needed) can be plugged in
<hazmat> niemeyer, yeah.. it needs to skip 486-494
<hazmat> 494 is good
<rogpeppe> TheMue: what about watchers that watch children of a node?
<rogpeppe> TheMue: or watchers that watch the contents of a node, come to that
<TheMue> rogpeppe: would have to check it. i've only chosen ExistsW because it doesn't do much i/o.
<rogpeppe> TheMue: how many different watchers will there be?
<TheMue> rogpeppe: it works with content, see my last proposal
<rogpeppe> TheMue: link?
<TheMue> rogpeppe: https://codereview.appspot.com/5885059/
<TheMue> rogpeppe: but here still without behavior
<TheMue> rogpeppe: it's an idea to discuss. if the watch fires for every event regarding a node (content, creation, deletion, children) then it would work fine with one Watcher type
<rogpeppe> TheMue: i don't think that's a good idea, as it means more network traffic than necessary.
<TheMue> rogpeppe: why?
<rogpeppe> TheMue: i *think* that registering a watcher for a node sends a message to zookeeper (if that watcher type isn't already registered for that node)
<TheMue> rogpeppe: afaik watches always fire all kinds of events
<rogpeppe> TheMue: no, it would be nice if they did, but they don't
<rogpeppe> TheMue: GetW doesn't fire when children change, for example
<TheMue> rogpeppe: so my "exists watch" fires content changes and deletions by accident (monitored that)
<rogpeppe> TheMue: the set of events depends on the call. existsW does give changed events, yes.
<rogpeppe> but not children events
<rogpeppe> i think i documented it once
<TheMue> rogpeppe: do we have an i/o problem regarding watches or is this a kind of premature optimization?
<rogpeppe> TheMue: it makes the implementation more complex too - you have to fire off a new goroutine each time you do a watch.
<TheMue> rogpeppe: ok, so i would change the Watcher to be configurable. e.g. Init() of the behavior could return where it is interested in.
<TheMue> rogpeppe: like in http://paste.ubuntu.com/895188/, yes
<rogpeppe> looks like my updated zk docs didn't make it in
<TheMue> rogpeppe: one Watcher, one goroutine, that's no problem, that's what go is made for
<rogpeppe> TheMue: not quite - you need a new goroutine each time through the loop, or at least that's how i did it last time.
<TheMue> rogpeppe: so, if needed, fine control of what exactly should be watched would be only a small task
<TheMue> rogpeppe: not in my approach
<TheMue> rogpeppe: tested it in my proposal, multiple changes received fine
<rogpeppe> TheMue: with children and content changes?
<TheMue> rogpeppe: with content changes
<TheMue> rogpeppe: could easily extend it to alternatively watch children
<rogpeppe> TheMue: that's easy. but you're proposing watching content at the same time as children, no?
<TheMue> rogpeppe: currently not. my proposal right now is only interested in content. but as i said, i don't think it's difficult to make this configurable.
<TheMue> rogpeppe: so when creating the watcher instance you can say what you're interested in
<rogpeppe> TheMue: if it's configurable, the types become awkward because you can receive new children ([]string) or new content ([]byte)
<rogpeppe> TheMue: how about two kinds of watcher - one for children and one for content, both with appropriate types.
<rogpeppe> TheMue: the children watcher could generate events on a channel corresponding to which children are deleted or created.
<TheMue> rogpeppe: the behavior's Update() is called if there is a change. it then retrieves what it is interested in, and additional methods allow grabbing this data after the change notification via the channel
<rogpeppe> TheMue: (i.e. deltas)
<rogpeppe> TheMue: that really is inefficient, because it incurs an extra round trip for each change.
<TheMue> rogpeppe: do we have i/o problems?
<rogpeppe> TheMue: no need to increase network traffic more than we need to.
<rogpeppe> TheMue: in some places it may be billed
<TheMue> rogpeppe: so it's premature optimization?
<rogpeppe> TheMue: no, it's appropriate design :-)
<rogpeppe> TheMue: it's easy to do this in a more efficient way
<TheMue> rogpeppe: last time you had trouble with code duplication, this time it shall be duplicated for each watch?
<rogpeppe> TheMue: no, each kind of watch - children or contents. are there any examples where a watcher needs to watch both of these at once?
<TheMue> rogpeppe: so far i haven't found any
<rogpeppe> TheMue: here's a sketch: http://paste.ubuntu.com/896534/
<rogpeppe> TheMue: that's the API
<rogpeppe> TheMue: well, a possible API :-)
<TheMue> rogpeppe: what is ChildChange.New?
<mthaddon> niemeyer: do you know if charmd will place nice with start-stop-daemon's --background, or if there's another way of daemonising the process?
<rogpeppe> TheMue: it says whether the names have been added or removed
<TheMue> rogpeppe: and ContentWatcher in my case has to return a ConfigNode
<rogpeppe> TheMue: it could be a type with Add and Del constants
<rogpeppe> TheMue: why's that?
<rogpeppe> TheMue: these are the types that the config node watcher (and all other watchers) would build on
<TheMue> rogpeppe: because it watches data in the config node
<mthaddon> one way to find out, I guess
<rogpeppe> TheMue: you could implement the config node watcher something like this: http://paste.ubuntu.com/896539/
<rogpeppe> TheMue: obviously there would have to be error checking and a tomb there too.
<TheMue> rogpeppe: that's what i wanna provide with the Watcher type
<rogpeppe> TheMue: or perhaps it could share a tomb with the ContentsWatcher, i'm not sure
<TheMue> rogpeppe: i will think about a mix of your and my ideas
<rogpeppe> TheMue: think about a nice strongly typed, pipeline
<rogpeppe> oops
<rogpeppe>  TheMue: think about a nice strongly typed pipeline of events being processed
<rogpeppe> TheMue: channels work really well that way
<TheMue> rogpeppe: yep
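
The pastes above have since expired; what follows is a hedged reconstruction of the two-watcher API rogpeppe sketches, with one watcher per kind of change and an appropriately typed channel on each. All names are assumed, including the change type standing in for the ChildChange.New field discussed:

    // Package watcher: a reconstruction, not the paste's actual contents.
    package watcher

    // ChangeType says whether child names were added or removed.
    type ChangeType int

    const (
        Add ChangeType = iota
        Del
    )

    // ChildChange reports a delta in a node's children.
    type ChildChange struct {
        Type  ChangeType
        Names []string
    }

    // ChildrenWatcher delivers child deltas for a single node.
    type ChildrenWatcher struct {
        changes chan ChildChange
    }

    func (w *ChildrenWatcher) Changes() <-chan ChildChange { return w.changes }

    // ContentWatcher delivers a node's new content after each change.
    type ContentWatcher struct {
        changes chan []byte
    }

    func (w *ContentWatcher) Changes() <-chan []byte { return w.changes }

The point of the split is that each channel carries exactly the right type, rather than one channel mixing []string and []byte events.
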
<niemeyer> mthaddon: It doesn't do anything about daemonization..
<niemeyer> mthaddon: So --background should be fine
<mthaddon> k, will give it a try - thx
<niemeyer> mthaddon: As I said, though, if you need something else I can look into it
<mthaddon> niemeyer: k, will let you know - have got it saying "2012/03/23 15:02:06 JUJU Store opened. Connecting to: 127.0.0.1:27017" in the logs so far
<niemeyer> mthaddon: What's the status quo, btw? I'll be working on the charms today as I managed to find out what was happening yesterday
<niemeyer> mthaddon: Cool, that's all that should be necessary
<niemeyer> mthaddon: It means it's happily serving requests
<niemeyer> mthaddon: Hook charmload in the other end via crond, and you should have a fully working setup
<mthaddon> niemeyer: I think I'm getting pretty close - just need to open up some firewall rules, get DNS entries in place, and land the last few bits of the config that I've been testing with then we should be done
<niemeyer> mthaddon: Ok.. are we charming up.. are we loading that and moving on without charms.. ?
<niemeyer> mthaddon: Sorry, I'm completely uncertain of the situation given Elmo's email, sabdfl's requests, etc
<mthaddon> niemeyer: we're working on both in parallel so it gives us options (by which I mean I'm continuing to work on the DC stuff while you get started on the charm stuff)
<niemeyer> mthaddon: Sounds great then.. thanks for that. I'll just put the store online and we can delegate for someone else to decide which one goes live.
<niemeyer> mthaddon: I'm happy either way, as long as we have something serving it.
<fwereade_> gents, popping out for a walk and a think, bbiab
<niemeyer> fwereade_: Enjoy
<niemeyer> I'm heading to lunch too
<niemeyer> biab
<rogpeppe> niemeyer, fwereade_: enjoy
<hazmat> fwereade_, cheers
<hazmat> SpamapS, it looks like we'll need a patch for python-zookeeper
<hazmat> SpamapS, there appear to be memory leaks in all the async apis :-(
<hazmat> testing the patch now
<niemeyer> mthaddon: I don't know why --background is failing for you
<niemeyer> mthaddon: I've run it locally, and it works
<mthaddon> niemeyer: really, cos I've run it locally and it fails for me too - can you show me what you're doing locally?
<mthaddon> (i.e. outside of the production env)
<niemeyer> mthaddon:
<niemeyer> [niemeyer@gopher ..go/store/charmd]% start-stop-daemon --background --start --exec $PWD/charmd -- 1.2.3.4:5 6.7.8.9:1
<niemeyer> [niemeyer@gopher ..go/store/charmd]% ps auxw | grep charmd
<niemeyer> niemeyer 28212  0.5  0.0  62292  3452 ?        Sl   12:46   0:00 /home/niemeyer/src/launchpad.net/juju/go/store/charmd/charmd 1.2.3.4:5 6.7.8.9:1
<niemeyer> niemeyer 28216  0.0  0.0   9368   912 pts/7    S+   12:46   0:00 grep charmd
<mthaddon> ok, so you're not really testing the same options as we have in the initscript - in any case, I'll try and reproduce that locally
<niemeyer> mthaddon: start-stop-daemon doesn't seem to handle the program stdout well with --background, so we may need a log file option at least
<niemeyer> mthaddon: Yeah, I'm just trying it out in isolation, and observing that it actually works
<mthaddon> I still get that error removing the LOGFILE redirect in my local test
<niemeyer> mthaddon: Yeah, it's certainly unrelated
<mthaddon> interesting, and now it's saying "ERROR" locally but actually working
<mthaddon> damn, --verbose doesn't help
<mthaddon> I was wrong, it isn't working for me locally - process dies after a little while with no logfile
<mthaddon> will keep trying variations of the initscript
<niemeyer> mthaddon: I'll add an option to take the log file out of that picture
<niemeyer> mthaddon: Process dying after a little while probably reflects the address for MongoDB not being available
<hazmat> SpamapS, the patch fwiw is http://paste.ubuntu.com/896621/
<mthaddon> niemeyer: I wouldn't worry about it yet - I have some theories about how we can make it work (sudo to the user in question)
<mthaddon> niemeyer: I've installed mongodb locally so it will work
<mthaddon> (well, in my local lxc instance that I'm testing with, anyway)
<niemeyer> mthaddon: Have you considered using upstart?
<mthaddon> niemeyer: that's another option for sure
<hazmat> just as another data point, i did get them working locally for me yesterday, while testing out the client integration
<hazmat> although i did notice some mutual conflicts when running multiple charmloads
<niemeyer> hazmat: Ah, nice.
<niemeyer> hazmat: What kind of conflict did you get?
<hazmat> niemeyer, they would both detect the other was working on a charm, and both skip it
<niemeyer> hazmat: Hmm.. do you have the logs?
<niemeyer> hazmat: It works with a lock mechanism, so the first should go on
<hazmat> niemeyer, hmm. not anymore, but should be simple to reproduce
<niemeyer> hazmat: They will skip each others charms, though
<niemeyer> hazmat: But they won't both skip the same charm
<niemeyer> hazmat: Unless something caused the process to crash/terminate midway
<niemeyer> hazmat: In that latter case, any future executions will skip, until the lock timeout expires
<SpamapS> hazmat: huh?
<niemeyer> hazmat: I've actually tested that case, FWIW
<niemeyer> hazmat: Which is why I'd be interested in how to reproduce it
<hazmat> niemeyer, i'm not seeing it at the moment, i'll let it run for a bit more to see if it comes up
<hazmat> SpamapS, we are in dire need of a patch to go in our zk package
<niemeyer> hazmat: Might you have seen the crash case? If you kill the process while in progress, somehow, it'll not try this charm again until the lock expires
<hazmat> SpamapS, there's a significant memory leak otherwise
<niemeyer> mthaddon: upstart seems a lot smarter about those details
<niemeyer> mthaddon: Might be a trivial job to put it up with it
<mthaddon> k, will take a look
<SpamapS> hazmat: ahh.. bug?
<SpamapS> hazmat: probably worth SRU to 11.10 then
<hazmat> niemeyer, it looks fine to me  now, re charmload parallel
 * SpamapS still can't read those weird oldschool diffs... :-P
<niemeyer> hazmat: Ok, I bet you've seen the held lock case
<niemeyer> hazmat: If you tried again after a while (as you're doing now, actually) it'll note the expired lock and move on
<fwereade_> hazmat, do you have a moment to chat about MachineProviders and new-style environments?
<fwereade_> hazmat, it's not an exceptionally big deal, it's just that to create a provider we'll need data from both environment and environment/provider
<hazmat> fwereade_, sure
<fwereade_> hazmat, and I'm wondering whether it will still be possible/easy to perpetrate something like EnvironmentStateManager(client).get_config().get_default().get_machine_provider()
<hazmat> fwereade_,  the api for providersettings.. should return the merged dict
<fwereade_> hazmat, this is because a genericised instance-type will need a provider in order to construct Constraints objects
<fwereade_> hazmat, excellent
<fwereade_> hazmat, I don't think anything but the client and the PA will actually need provider instances so that should be fine even when we come to use ZK ACLs
<fwereade_> hazmat, so anyway: if I go ahead and use something like the train-wreck above for now, I should be fine, because an ESM will be all I'll need to construct a provider when I come to merge with you
<fwereade_> hazmat, right?
<niemeyer> mthaddon:
<niemeyer> mthaddon: http://paste.ubuntu.com/896649/
<niemeyer> mthaddon: upstart++
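
That paste has also expired; an upstart job for charmd along the lines discussed might look like the following, with the paths, addresses, and log location all being assumptions:

    # /etc/init/charmd.conf
    description "juju charm store server"
    start on runlevel [2345]
    stop on runlevel [!2345]
    respawn
    # listen address first, MongoDB address second, matching the
    # start-stop-daemon invocation shown earlier
    exec /srv/charmd/charmd 0.0.0.0:8080 127.0.0.1:27017 >> /var/log/charmd.log 2>&1
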
<hazmat> fwereade_, yeah.. that should be fine.. we can just sub out to  ProviderSettings(client).get_config()
<fwereade_> hazmat, great, tyvm
<hazmat> fwereade_, the get_default looks odd, there's only one in the env
<hazmat> for the zk storage
<fwereade_> hazmat, yeah, IMO it's kinda stupid, but ESM returns an EnvironmentsConfig with only one environment that you can then retrieve with get_default without knowing the name
<hazmat> fwereade_, ah.. right.. gotcha
<mthaddon> niemeyer: that's running it as root, right? but yeah, looks like a good place to start from
<fwereade_> hazmat, btw, I'll need to stop a bit promptly today, but I'll be around on and off this evening and at the w/e -- I'd love it if I could get a couple of reviews before your eod
<niemeyer> mthaddon: setuid and setgid are options in the upstart file
<mthaddon> yep yep
<niemeyer> mthaddon: :)
<hazmat> SpamapS, https://bugs.launchpad.net/ubuntu/+source/zookeeper/+bug/963280
<hazmat> fwereade_, thank you, have a good night
<SpamapS> hazmat: thanks, I'll escalate and get that pushed out ASAP
<SpamapS> hazmat: is it known upstream?
<fwereade_> hazmat, everyone: cheers, happy weekends :)
<hazmat> SpamapS, its not yet, i'm in progress on notifying everyone and putting the patches out
<hazmat> SpamapS, i pinged them on irc, but no response yet
<SpamapS> hazmat: ok, I'd like to get it reported there before pushing it into our packages
<niemeyer> hazmat: So there was a leak in pyzk indeed?  Where's the leak?
<SpamapS> hazmat: if I don't hear back from you in about 10 minutes, I'll go ahead and forward the issue myself.
<niemeyer> SpamapS: Is there a patch somewhere?
<SpamapS> niemeyer: hazmat has one apparently
<hazmat> SpamapS, yeah.. i tested it against 3.3.4
<hazmat> but the build system in 3.3.5
<hazmat>  is different, just  trying to verify against a clean source
<SpamapS> yeah precise has 3.3.5
<hazmat> SpamapS, patch attached
<jamespage> hazmat, SpamapS: want me to pick that up? I can push it back upstream as well
<hazmat> jamespage, i can do the upstream bit, the distro bits i'd rather leave in yours and SpamapS's capable hands
 * jamespage wishes LP had JIRA tracking capability
<SpamapS> doesn't make sense why it doesn't
<SpamapS> other than.. nobody has finished it
<hazmat> looks like the guy who  reported it to me also upstreamed.. ZOOKEEPER-1431
<jamespage> SpamapS, I think that is the case - there is a feature bug for it
<jamespage> hazmat, great
<SpamapS> ok, I'll leave this bug in jamespage's capable hands
<SpamapS> jamespage: thanks for stepping up
<jamespage> SpamapS, ack - on it now
<jamespage> SpamapS, np
<jamespage> hazmat, please could you propose that patch upstream as well
<jamespage> if it takes some load off you I'm happy to forward
<jelmer> jelmer
<jelmer> whoops, sorry :)
<hazmat> jamespage, done already
<jamespage> hazmat, sweet - thanks
<hazmat> jamespage, its attached to the issue
 * hazmat grabs some food
<jamespage> hazmat, SpamapS: tested and uploaded to precise - the release team will need to accept it but I don't think that will be an issue.
<hazmat> jamespage, awesome
<hazmat> bcsaller, jimbaker you guys got time for a quick meeting to catch up?
<bcsaller> hazmat: sure
<hazmat> bcsaller, jimbaker invites out
<SpamapS> hazmat: packages failing to build now
<SpamapS> Writing /build/buildd/juju-0.5+bzr494/debian/juju/usr/lib/python2.7/dist-packages/juju-0.5.egg-info dh_install -O--buildsystem=python_distutils
<SpamapS> cp: cannot stat `debian/tmp/usr': No such file or directory
<hazmat> SpamapS, huh
<hazmat> SpamapS, that doesn't look familiar..
<hazmat> i wonder if its find_packages in setup.py
<SpamapS> I believe that is known to cause issues
<SpamapS> I'm trying locally
<rogpeppe> i'm off.
<rogpeppe> have a great weekend everyone!
<rogpeppe> niemeyer: bootstrap & destroy commands *almost* there - just a couple of failing tests to rectify
<niemeyer> rogpeppe: Sweet!
<rogpeppe> niemeyer: you might wanna have a look at https://codereview.appspot.com/5892043/ which is a prereq
<niemeyer> rogpeppe: I do want, but I ought to push other things first
<rogpeppe> np
<niemeyer> hazmat, SpamapS: 2012-03-23 13:38:30-04:00 Running test ec2-wordpress... OK
<niemeyer> I've added some timeout logic to avoid some of the issues seen, so that it unblocks itself in the future
<niemeyer> (hopefully)
<jimbaker> hazmat, i was eating lunch just now
<niemeyer> http://wtf.labix.org/ is awake and working. 494 has two OKs
<SpamapS> hazmat: no I think this is something else
<SpamapS> hazmat: probably a packaging issue, not upstream
<hazmat> jimbaker, no worries, are you up for doing it now?
<hazmat> jimbaker, invite out
<niemeyer> fwereade_: You said that you were going to change the default to amd64 after that conversation.. has that taken place yet?
<niemeyer> Hmm.. I guess not
<hazmat> niemeyer, there's a branch in review for that i believe
 * hazmat switches  out to reviews
<niemeyer> hazmat: Ah, sweet.. I'd like to deploy the store mongos on 64s
<hazmat> yeah.. 2gb limit sucks otherwise
<hazmat> niemeyer, its only a juju-origin away ;-)
<hazmat> crazy talk i know
<niemeyer> hazmat: I'll be using it, but I'm trying to stay trunk-crazy at most :)
<niemeyer> I'm stepping out for an appointment
<niemeyer> Back in ~1h
<hazmat> interesting http://zerovm.org/motivation/
<SpamapS> hazmat: very interesting
<SpamapS> hazmat: http://outflux.net/teach-seccomp/ ... possible answer to LXC's problems
<hazmat> SpamapS, serge has been interested in it
<hazmat> NACL come to roost
<hazmat> who is kees, and why is he so awesome
<hazmat> ah.. ic
<SpamapS> hazmat: Hah, former Canonicaler and Ubuntu Security Team Lead
<SpamapS> hazmat: and TB member :)
<niemeyer> hazmat: http://paste.ubuntu.com/897051/
<niemeyer> /var/lib/cloud/instance/scripts/runcmd: 8: /var/lib/cloud/instance/scripts/runcmd: juju-admin: not found
<niemeyer> /usr/bin/python: No module named juju.agents
<niemeyer> /usr/bin/python: No module named juju.agents
<niemeyer> run-parts: /var/lib/cloud/instance/scripts/runcmd
<niemeyer> :-/
<SpamapS> I'd guess that juju wasn't fully installed given that error.
#juju-dev 2012-03-25
<hazmat> jimbaker, can you resubmit those branches with lbox -cr
<hazmat> fwereade_, irc proxies are nice
<SpamapS> indeed.. or just irssi+screen :)
#juju-dev 2013-03-18
<thumper> davecheney: ta
<thumper> yay VMs
<thumper> fired up a clone for the live tests
<thumper> while running normal ones on another
<thumper> davecheney: ok so some of the live tests are failing...
<thumper> davecheney: how often do they fail for you?
<thumper> davecheney: can you ping me when you're back plz
<thumper> hi jam
<thumper> hi davechen1y
 * davechen1y blows a kiss
<jam> hi thumper
<jam> 6am is usually a bit early for me :)
<bigjools> I have several maps of string to array of strings and I want something that gives me those sorted by the map keys.  How do I do it in Go?
<davecheney> bigjools: you need to extract the keys into a []string
<davecheney> then sort the strings
<bigjools> how do I keep the values of the keys?
<davecheney> then using that sorted []string, use that as a key into your map
<bigjools> what map?
<bigjools> I have many maps
<davecheney> bigjools: ok, i'll step back and let you explain the problem
<davecheney> i mistook your previous sentence
<bigjools> I should be more specific :)
<bigjools> I have a http.Requesl
<bigjools> argh
<bigjools> I have a http.Request.Query()
<bigjools> arse, hang on
<bigjools> which returns Values
<davecheney> bigjools: maybe launchpad.net/goamz/ec2/sign.go will be of use
<bigjools> which is a map[string][]string
<bigjools> we need to ultimately have a string containing "name:value,name:value" ... sorted on name
<bigjools> this is 2 lines of Python but in Go we're struggling :/
<davecheney> yes, Go does not provide the syntactic sugar for list comprehensions
<davecheney> er, whatever
<davecheney> bigjools: what happens when you have this
<davecheney> map["something"] = []string { "foo", "bar" }
<davecheney> what does that look like when it is encoded
<bigjools> I like comprehensions :)
<bigjools> davecheney: ok we have a handle on this I think, thank you
<davecheney> bigjools: http://play.golang.org/p/9k54JbkHsG
<davecheney> ^ a start, maybe ?
<bigjools> davecheney: awesome, good start.  Do you know if sort is case insensitive?
<bigjools> or can be made so
<davecheney> that is the default sort
<davecheney> i'm sure it is case sensitive
<bigjools> I forgot to say that we need to lower case the map keys in the output string
<davecheney> bigjools: are you doing this to sign a request ?
<bigjools> davecheney: it's part of a signing, yeah
<bigjools> it's a rather complicated signature :/
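
Putting this exchange together (url.Values is a map[string][]string; keys lower-cased, sorted, then joined as "name:value,name:value"), a sketch might look like this. The exact output format the signature requires is an assumption:

    package main

    import (
        "fmt"
        "sort"
        "strings"
    )

    func canonicalize(q map[string][]string) string {
        // Remember the original key for each lower-cased name, so we
        // can sort case-insensitively yet still look the values up.
        // (Keys differing only in case would collide; fine for a sketch.)
        orig := make(map[string]string, len(q))
        names := make([]string, 0, len(q))
        for k := range q {
            lk := strings.ToLower(k)
            orig[lk] = k
            names = append(names, lk)
        }
        sort.Strings(names)

        var parts []string
        for _, lk := range names {
            for _, v := range q[orig[lk]] {
                parts = append(parts, lk+":"+v)
            }
        }
        return strings.Join(parts, ",")
    }

    func main() {
        q := map[string][]string{"Zone": {"a"}, "name": {"foo", "bar"}}
        fmt.Println(canonicalize(q)) // name:foo,name:bar,zone:a
    }
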
<rogpeppe> mornin' all
<dimitern> rogpeppe: morning
<rogpeppe> dimitern: hiya
<jam> bigjools: I realize this is late, but is this correct: http://play.golang.org/p/xtOOa3bDno
<fwereade> rogpeppe, dimitern, jam: heyhey
<jam> h
<jam> hi fwereade
<rogpeppe> fwereade: yo!
<dimitern> fwereade: hiya
<rogpeppe> jam: hiya
<fwereade> just stepping out for a bite of breakfast
<dimitern> wallyworld, mgz: standup?
<mgz> hey
<dimitern> jam: ping
 * TheMue is afk, lunchtime, biab
<rogpeppe> fwereade: any chance of a review of https://codereview.appspot.com/7815044/ ?
<rogpeppe> fwereade: i'll be happier if you've reviewed at least one of these branches :-)
<rogpeppe> fwereade: otherwise i feel a bit out on a limb
<fwereade> rogpeppe, I'm just finishing up a good solid morning of hackery, I'll be on reviews very soon
<rogpeppe> fwereade: cool, thanks
<rogpeppe> fwereade: nice to hear BTW.
<rogpeppe> fwereade: (the hackery that is)
<fwereade> rogpeppe, am I right in thinking that TestBootstrapAndDeploy is still "meant to be" failing at remove-unit time?
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, well cool :)
<rogpeppe> fwereade: i got some way into fixing the test, but it didn't work, and i haven't got back to it since
<rogpeppe> fwereade: if you wanna take a look, i'll send you the branch
<fwereade> (ie, cool, I haven't broken anything ;p)
<rogpeppe> fwereade: i *really* want to fix that bloody live test!
<fwereade> rogpeppe, I'll grab the branch, but I can't promise to focus on it any time soon
<rogpeppe> fwereade: maybe you could have a look to check for obvious wrongness anyway
<rogpeppe> fwereade:  bzr+ssh://bazaar.launchpad.net/~rogpeppe/juju-core/223-jujutest-fix-deploy-test/
<fwereade> rogpeppe, cheers
<jam> dimitern: yes?
<dimitern> jam: that was before you appeared in the standup
<jam> rogpeppe: what is "fix rpc test failure in trunk" on the Kanban board. Is that https://code.launchpad.net/~rogpeppe/juju-core/213-rpc-test-timeouts/+merge/148162?
<jam> I'm just noticing that the review lane went red, but I don't actually know which one that is.
<rogpeppe> jam: one mo, i'll just check
<fwereade> jam, sorry, that was me
<fwereade> jam, I accidentally did too much
<fwereade> jam, I am in review mode now :)
<jam> fwereade: yeah, I see you have 5 constraints things in the queue
<fwereade> jam, hopefully they're reasonably clear
<rogpeppe> jam: jeeze, i can't remember!
<rogpeppe> jam: i'm pretty sure i fixed that ages ago
<jam> rogpeppe: yeah, I think so, I think it just didn't get moved over
<jam> which is why I was checking.
<rogpeppe> jam: and i don't *think* it was that CL
<jam> rogpeppe: I don't see any RPC test fix CLs in https://code.launchpad.net/juju-core/+activereviews
<mgz> rogpeppe: can't find anything else that looks like it...
<rogpeppe> jam: note to self: write better descriptions and link 'em to the CL in question!
 * rogpeppe wants an ls -lt on branches
<jam> rogpeppe: certainly I just reviewed "Write API design" though maybe that landed now?
<jam> rogpeppe: 'bzr branches' ?
<rogpeppe> jam: -t ?
<jam> some people have written plugins to check if branches have been merged, I could dig one of them out for you
<jam> so it does the recursive "find all branches not merged into X"
<jam> well, find all branches merged into X and prune them.
<rogpeppe> jam: that is: i want to see them in time-last-modified order.
<rogpeppe> jam: that plugin would be useful too though
<rogpeppe> pwd
<rogpeppe> jam: i think this was the fix: https://codereview.appspot.com/7324048/
<rogpeppe> anyone use bzr pipes here?
<fwereade> rogpeppe, https://codereview.appspot.com/7815044/ has a question
<rogpeppe> fwereade: looking
<rogpeppe> fwereade: the worst that could happen is that the entries get seen again and ignored, i think
<fwereade> rogpeppe, what happens if we remove an entity that was apparently never there?
<rogpeppe> fwereade: it's a noop
<fwereade> rogpeppe, ok, cool
 * rogpeppe checks again to be sure
<fwereade> rogpeppe, LGTM then assuming comment there
<rogpeppe> fwereade: thanks.
<rogpeppe> fwereade: BTW, although it isn't a problem if we remove an entity that's not there, how do you see that happen with respect to the getAll/watch logic?
<rogpeppe> s/happen/happening/ ?
<rogpeppe> fwereade: oh, i see. ignore me.
<fwereade> rogpeppe, I have become sensitised to such constructs ;)
<rogpeppe> fwereade: the thing that isn't so great (although i'm not quite sure how to get around it) is that while we're doing the getAll, we're not reading from the watcher channel, so we can stall other things.
<rogpeppe> fwereade: perhaps i should start a goroutine that reads from in and aggregates the results into a slice that's processed after getAll has completed.
<fwereade> rogpeppe, for now that feels like a good candidate for a comment, in case we see negative consequences later, but probably not worth the work right now
<rogpeppe> fwereade: agreed
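
A sketch of the aggregating goroutine rogpeppe proposes above: buffer incoming events into a slice so the sender on the watcher channel is never blocked while getAll runs. The event type and all names are made up:

    package main

    import "fmt"

    // Change stands in for the watcher events being discussed.
    type Change struct{ Id string }

    // drain buffers events from in until done is closed, then delivers
    // everything accumulated on out. The sender on in is therefore
    // never blocked while a slow initial read (the getAll) runs.
    func drain(in <-chan Change, done <-chan struct{}, out chan<- []Change) {
        var buf []Change
        for {
            select {
            case c := <-in:
                buf = append(buf, c)
            case <-done:
                out <- buf
                return
            }
        }
    }

    func main() {
        in := make(chan Change)
        done := make(chan struct{})
        out := make(chan []Change)
        go drain(in, done, out)

        // Events arriving while the getAll would be in progress:
        in <- Change{"unit-0"}
        in <- Change{"unit-1"}

        close(done) // getAll finished; collect what piled up
        fmt.Println("buffered:", <-out)
    }
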
<rogpeppe> dimitern: have you seen this? 245-allwatcher-state-backing
<rogpeppe> dimitern: http://pastebin.ubuntu.com/5625282/
<dimitern> rogpeppe: looking
<dimitern> rogpeppe: i've seen similar while fixing the uniter tests in my last CL, but all was working fine - i ran the tests 10 times ok
 * rogpeppe is off for some lunch
<frankban> dimitern: I see that failure every time, last one using revno 1013. go1.0.2 on quantal.
<jam>  rogpeppe: that looks a bit like the failures thumper and wallyworld saw on Raring (consistent failures)
<jam> but I thought they landed changes to fix it.
<jam> I wonder if it might  have regressed things on Quantal/Precise?
<fwereade> frankban, jam, trunk is fine on precise for me
<fwereade> frankban, just in case, would you try removing $GOPATH/pkg/linux_amd64 and testing again?
<frankban> fwereade: trying
<jam> fwereade: confirmed. If I just run the worker/uniter tests directly everything passes for me on precise + go1.0.3
<jam> (P itself only has go1)
<dimitern> trunk is fine for me as well on quantal
<dimitern> using go1.0.3
<frankban> fwereade: same failures. I'll try upgrading go, how to do that (gophers ppa?), and what's the version you suggest?
<fwereade> frankban, I'm on 1.0.2 and I *thought* that was what we were using
<dimitern> fwereade: got a minute?
<fwereade> dimitern, sure
<dimitern> fwereade: i merged the changes, but u.SetCharmURL and u.SetCharm(old, removed now) assume some slightly different things
<fwereade> dimitern, oh yes?
<dimitern> fwereade: I'm fixing the tests now, but i'd like to look at it carefully after i propose
<fwereade> dimitern, if the change is significant it's probably worth thinking it through before we hit the tests too hard... they might actually still be right ;p
<dimitern> fwereade: i think i can sort it out, but not sure about one assert there
<fwereade> dimitern, cool, let me know if you need anything
<fwereade> dimitern, if lots of tests are broken, though, I think that is most likely to point at a problem elsewhere
<dimitern> fwereade: not a lot, just a couple for now
<fwereade> dimitern, ah, jolly good
<rogpeppe> fwereade: i responded to your review. (https://codereview.appspot.com/7815044) i added a comment, but resisted some of the other suggestions. YMMV.
<dimitern> fwereade: whoohoo! all passed! proposing :)
 * fwereade cheers
<fwereade> frankban, no joy?
<rogpeppe> fwereade: i don't see why you mind about the length of a test timeout failure. if anything, i think we should increase the limit elsewhere, as a heavily loaded system may easily pause for 500ms.
<rogpeppe> fwereade: (some tests have a 5 second timeout)
<fwereade> rogpeppe, true enough, sgtm
<rogpeppe> fwereade: cool. i wonder if we should have the short and long pauses as constants in testing somewhere actually. or as command line params.
<fwereade> rogpeppe, not a bad idea
<frankban> fwereade: no :-/ tried replacing quantal own golang-* (1.0.2) with gophers ppa golang-stable (same version), still the same failures. asked Matthew to run the tests in our branch, if they pass, then I further investigate on this problem of my configuration later
<fwereade> rogpeppe, one for our Copious Free Time ;p
<rogpeppe> fwereade: indeed
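
A sketch of those shared constants; the package layout, names, and values here are assumptions, not necessarily what eventually landed:

    // Package testing holds helpers shared by the test suites.
    package testing

    import "time"

    const (
        // ShortWait is for things that should happen almost at once.
        ShortWait = 500 * time.Millisecond
        // LongWait is the outer limit for real work; heavily loaded
        // machines can be surprisingly slow.
        LongWait = 10 * time.Second
    )
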
<fwereade> frankban, sorry to hear that
<dimitern> fwereade: https://codereview.appspot.com/7497047
<dimitern> fwereade: if you can take a look specifically in u.SetCharmURL changes
<dimitern> I'm off for today
<dimitern> good evening all
<frankban> fwereade: Makyo encountered the same failures in trunk. after increasing the timeouts in uniter_test.go (from 5 to 20 seconds and from 50 to 200 milliseconds) the tests pass in my machine (which is, apparently, quite slow :-/ )
<fwereade> frankban, crikey
<fwereade> frankban, I knew those tests were slow but they must be flat-out awful for you
<rogpeppe> fwereade: how would you feel about moving the Constraints type out of state and into its own package?
<fwereade> rogpeppe, +1
<fwereade> rogpeppe, it's been itching at me once or twice in every branch
<rogpeppe> fwereade: cool
<rogpeppe> fwereade: because it resolves an import cycle problem that benji is having
<rogpeppe> fwereade: (api/params wants to reference Constraints, but state imports api/params)
<fwereade> rogpeppe, all the better then :)
<rogpeppe> fwereade: indeed
 * rogpeppe really really hates arbitrary timeouts in tests.
<frankban> fwereade: the full suite takes about 3:30 minutes here
<rogpeppe> fwereade: i'm thinking launchpad.net/juju-core/constraints, but launchpad.net/juju-core/state/constraints could work too.
<fwereade> rogpeppe, I'd prefer top-level than inside state
<rogpeppe> fwereade: cool
<rogpeppe> fwereade: any suggestions for another noun other than Constraints in "constraints.Constraints" ? i haven't got a good one, but thought someone might.
<fwereade> rogpeppe, I was having wicked thoughts of constraints.T, but I don't think it'd be popular
<rogpeppe> fwereade: yeah, i was thinking about constraints.C
<fwereade> rogpeppe, pondering constraints.Value
<rogpeppe> fwereade: i thought of that too
<fwereade> rogpeppe, verbose enough to read well, no more annoyingly bulky than the existing specifier
<rogpeppe> fwereade: +1
<rogpeppe> fwereade: and not annoyingly stuttery
<fwereade> rogpeppe, cool
<fwereade> rogpeppe, yeah
<rogpeppe> fwereade: is there any particular reason not to use constraints.Value as the constraintsDoc?
<fwereade> rogpeppe, I am reluctant to make the db document the same type as the document I expose via the api
<fwereade> rogpeppe, I could maybe be convinced that we can swap in a new type when we need to
<rogpeppe> fwereade: i'm not sure it's worth the cost of the separation. we already use external types in our entity documents.
<rogpeppe> fwereade: (and that too)
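
A sketch of the package split being discussed, with constraints.Value at the top level. The fields are illustrative; pointers are used so an unset constraint is distinguishable from an explicit zero:

    // Package constraints: a sketch of the proposed top-level package.
    package constraints

    // Value describes a set of machine constraints. A nil field means
    // the constraint was not specified.
    type Value struct {
        Arch     *string
        CpuCores *uint64
        CpuPower *uint64
        Mem      *uint64
    }

Moving this out of state also resolves the import cycle mentioned above, since api/params can import constraints without importing state.
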
<rogpeppe> here's the next branch in the allWatcher branches: https://codereview.appspot.com/7650048
<rogpeppe> reviews appreciated :-)
<rogpeppe> time to stop. g'night all.
<thumper> fwereade: hey, you around.
<benji> OK intrepid developers, I have a new branch up for review that adds more to the API: https://codereview.appspot.com/7600044
<fwereade> thumper, heyhey
<thumper> fwereade: hey
<fwereade> thumper, I'm not sure how clear I managed to be earlier
<thumper> fwereade: we should chat
<thumper> gimmie five though to fix these bugs
<fwereade> thumper, sorry about that, the whole machine went into spasm somehow
<thumper> fwereade: hangout?
<fwereade> thumper, sure
 * fwereade starts one
<fwereade> morning davecheney
<davecheney> fwereade: howdy
<fwereade> davecheney, how's it going?
<davecheney> good
<davecheney> trying to get an env running on hp cloud
<davecheney> also have a branch nearly ready for review that increases the delay time between mgo retries from *as fast as possible* to something we control
<fwereade> davecheney, cool -- not *too* painful I hope?
<davecheney> as always, testing is a problem
<fwereade> davecheney, sweet, that was annoying me again just today
<davecheney> esp as to do the delay, i'm working around mgo's built in retry logic
<davecheney> which i'm assuming can't be changed
<fwereade> davecheney, I don't recall anything that would lead me to believe otherwise
<davecheney> things in charm testing land are positive
<davecheney> when ec2 plays ball charm-test and graph-test work
 * fwereade cheers
<davecheney> but ec2 continues to be unreliable inside its own network
<fwereade> davecheney, any indications on whether hp cloud is any better?
<davecheney> none yet
<davecheney> see priv message
 * davecheney disappears for a quick bio break
<davecheney> thumper: can i get your $0.02NZ ob my concerns about using floats for money, https://codereview.appspot.com/7693048/
<thumper> sure...
#juju-dev 2013-03-19
<thumper> davecheney: as commented, I'd agree no floats if we were charging people, but I don't think we are
<thumper> as it is just used for ordering, I think floats are ok
<davecheney> thumper: i think it's going to bite us in the arse
<davecheney> and ordering could easily be accomplished by
<davecheney> const millicent = 1
<davecheney> const cent = 100 * millicent
<davecheney> etc
<thumper> what's this about go constants being untyped?
<davecheney> they are untyped
<davecheney> they are ideal
<thumper> it makes constants like C #defines
<thumper> which are horrible
<thumper> why would they replicate that?
<davecheney> probably, but a little more specified
<davecheney> so you don't need 100000000000000ULL
<davecheney> it works pretty well in practice
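
davecheney's integer-money scheme, spelled out: untyped constants adapt to whatever numeric type they are assigned to, so no ULL-style suffixes are needed and ordering stays exact:

    package main

    import "fmt"

    // Money as an integer count of the smallest unit; the constant
    // names follow davecheney's example above.
    const (
        Millicent = 1
        Cent      = 100 * Millicent
        Dollar    = 100 * Cent
    )

    func main() {
        price := 3*Dollar + 25*Cent // exactly $3.25, no float rounding
        fmt.Println(price, "millicents")
    }
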
<bac> davecheney: could you have a look at https://codereview.appspot.com/7610046 again?
<davecheney> bac: sure
 * davecheney has just filled up all the WIP slots on the review lane
<jam> davecheney: poke about hp bootstrapping
<jam> just to check what your OS_REGION_NAME is set to. It needs to be something like: export OS_REGION_NAME="az-3.region-a.geo-1"
<jam> if it is just 'region-a.geo-1' then we can't find which compute region you actually want (az-1, az-2, az-3)
<davecheney> jam: ahh, that is probably it
<davecheney> error: cannot start bootstrap instance: cannot find image satisfying constraints: No such flavor
<davecheney> http://www.vh1.com/celebrity/bwe/images/2009/10/FLAVA-FLAV-DAYLIGHT-SAVINGS-TIME.JPG
<davecheney> jam: do I need to supply a default-image-id or something to run on hp ?
<jam> davecheney: yes, you need default-image-id. I thought that 'juju init' would report an HP image
<davecheney> i didn't use juju init
<jam> davecheney: default-image-id: "75845"
<davecheney> ta
<jam> default-instance-type: "standard.xsmall"
<davecheney> oh god, number cannot be string
 * davecheney pulls hair out
<jam> davecheney: where was that? I've gotten it working with the above config on Windows, so I would think you wouldn't have a problem.
<davecheney> don't quote 75845
<davecheney> then yaml tries to 'sense' the type of the key
<jam> davecheney: http://paste.ubuntu.com/5627423/
<jam> I'm using a ""
<jam> I think it is supposed to be a string
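
Collecting the working values from this exchange in one place (treat them as illustrative, since the image id turns out to have expired moments later):

    # the region must include the az prefix, or juju cannot pick a
    # compute region
    export OS_REGION_NAME="az-3.region-a.geo-1"

    # environments.yaml fragment
    default-image-id: "75845"                 # quoted, or YAML reads it as a number
    default-instance-type: "standard.xsmall"
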
<davecheney> wow, bootstrap just pooed a screenful of binary data at me, ending with, returned unexpected status: 400; error info: {"badRequest": {"message": "Cannot find requested image 75845: Image 75845 could not be found.", "code": 400}}
<davecheney> is that image for region a, or region b ?
<jam> davecheney: there is nothing in region b
<davecheney> hmm, i'm using region-a
<davecheney> but using the identity endpoint in region-b
<jam> yeah, I just got the same, it worked last week. :(
<jam> apparently they expired their images
<davecheney> bzzzzzzzzzzzzzt
<rogpeppe> mornin' all
<dimitern> morning!
<dimitern> jam: sorry about the meeting yesterday - got the mail too late to attend
<jam> dimitern: np
<dimitern> jam: so it's going to be a regular weekly meeting at 19h UTC ?
<jam> yeah
<jam> I think 16h UTC
<jam> yeah, 16h is what I see on the calendar
<jam> which is 17/18 DST
<jam> for you,  I believe
<dimitern> jam: weird.. it showed as 20-20:30 local for me
<jam> If it was the email I sent, it was 20-20:30 local time in Dubai
<jam> not CST
<jam> CET?
<dimitern> jam: I see :)
<dimitern> mramm: can you add the monday meeting to the juju calendar please?
<mramm> which monday meeting?
<dimitern> mramm: the juju cross team meeting (or smth)
<mramm> I'll get it added
<dimitern> mramm: cheers!
<rogpeppe> dimitern: https://codereview.appspot.com/7650048/ is up for review if you fancy it
<dimitern> rogpeppe: click
<dimitern> rogpeppe: while i'm on it - https://codereview.appspot.com/7497047 ?
<rogpeppe> dimitern: looking
<dimitern> rogpeppe: reviewed
<rogpeppe> dimitern: thanks!
<jam> mgz: /wave
<mgz> jam: mumble?
<jam> soitenly
<dimitern> rogpeppe: ping
<rogpeppe> dimitern: pong
<dimitern> rogpeppe: how about that CL?
<rogpeppe> dimitern: sorry, i've been chatting with mark
<dimitern> rogpeppe: np, no rush
<jam> mgz: mumble is timing out for me right now, so I'm going to go grab a snack, bbias
<mgz> jam: mumble kicked me...
<mgz> and I can't reconnect
<jam> mgz: try one more time
<jam> I'm back on it
<jam> mgz: still no love?
<mgz> it's back
 * dimitern quick lunch
<dimitern> i'd like a review on https://codereview.appspot.com/7497047
<rogpeppe> dimitern: reviewed. not very well, i'm afraid though - i'm having difficulty wrapping my head around it.
<dimitern> rogpeppe: tyvm
<dimitern> rogpeppe: the idea is to have multiple service settings (per charm url), and the refcount is used to track and clean them
<dimitern> rogpeppe: we need this in order to keep the old settings until there are any units that might use them (before they are upgraded to the new charm)
<rogpeppe> dimitern: one difficulty i have is that we're building a quite involved set of txn ops, and i have no idea what it might look like eventually, nor how the various asserts and ops might play together
<dimitern> rogpeppe: well, there are tests to make sure the refcounts are handled correctly
<rogpeppe> dimitern: are there any concurrent-change tests?
<rogpeppe> dimitern: i suppose we know that the whole txn must go through as a unit, so maybe it's not worth it
<rogpeppe> dimitern: i'm just concerned that there are many possible txn "programs" this code can be building and i'm not clear that the tests are exercising all possibilities.
<dimitern> rogpeppe: no concurrent tests, but I don't think they'll be that useful, and moreover i can't really wrap my mind around how to do it
<rogpeppe> dimitern: i wonder if you could enumerate a few cases for me, showing the whole set of txn ops produced for a SetCharm in various cases
<dimitern> rogpeppe: if you refer to settingsInc/Dec ops they are used only in a few places (add/destroy service mostly)
<rogpeppe> dimitern: they're used by SetCharm
<dimitern> rogpeppe: you mean service.SetCharm, right?
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: hmm, maybe i should download the branch and put a few printfs in
<dimitern> rogpeppe: please do, if you can
<dimitern> rogpeppe: so, cases:
<dimitern> rogpeppe: we need to make sure the config is compatible, and also check the charmurl is different (not already set)
<wallyworld_> mgz: https://pastebin.canonical.com/87106/
<rogpeppe> dimitern: because you're not allowed to change the charm url once it's set?
<dimitern> rogpeppe: no, because it's already set
<rogpeppe> dimitern: so you can't upgrade to a charm from a different URL?
<dimitern> rogpeppe: the crux of the ops happens in changeCharmOps
<dimitern> rogpeppe: you can't upgrade to the same url
<rogpeppe> dimitern: ah, i see
<dimitern> rogpeppe: and also we need to either create or replace the settings for this charm and service
<dimitern> rogpeppe: i suppose adding more comments inline will help with understanding better what's going on
<dimitern> rogpeppe: and we inc the new, dec the old, adding asserts as needed
<rogpeppe> dimitern: yeah, i think that would help me, at any rate
<dimitern> rogpeppe: ok, i'll do that then
<mgz> wallyworld_: ta
<rogpeppe> dimitern: FYI, here's an example of the ops we're producing: http://paste.ubuntu.com/5628008/
<wallyworld_> mgz: lp:~wallyworld/juju-core/rsyslog-conf   <-- wip, bootstrap = good, unit nodes = bad
<dimitern> rogpeppe: wow, that's nifty - how did you generate it?
<rogpeppe> dimitern: github.com/davecgh/go-spew/spew is really handy :-)
<dimitern> rogpeppe: cool :) thanks
<rogpeppe> dimitern: i produced the output above with simply: 		spew.Dump(ops)
<rogpeppe> dimitern: (oh except i ran Edit x/^ +/x/ /c/	/ on it because i prefer tab indentation and that's not the default)
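go-spew is a real library and spew.Dump has exactly that shape; a minimal standalone example (the op type below is a stand-in invented for illustration, not juju-core's actual txn op type):

```go
package main

import "github.com/davecgh/go-spew/spew"

// op is a stand-in type for illustration; the real txn ops live in
// juju-core's state package.
type op struct {
	C      string
	Id     string
	Insert interface{}
}

func main() {
	ops := []op{{
		C:      "settings",
		Id:     "s#wordpress#cs:precise/wordpress-3",
		Insert: map[string]int{"refcount": 1},
	}}
	spew.Dump(ops) // prints a fully expanded, typed dump of the slice
}
```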
<dimitern> rogpeppe: nice!
<mgz> wallyworld_: so, first thing to try is just copying /var/lib/cloud/instance/scripts/runcmd and running it, see what errors you get if any
<wallyworld_> ok, will do
<mgz> it looks from your cloud-config like it should be safe to run again at a later stage
<mgz> but I note there's no set -e
<mgz> so everything should be getting run, regardless of exit codes
<mgz> the other thing that comes to mind...
<wallyworld_> mgz: running runcmd manually worked
<mgz> are the perms on the config being borked somehow? this runs as root, I expect rsyslog isn't fussy, but you never know
<wallyworld_> rsyslog.conf can be owned by root
<mgz> right, and your issue was it wasn't present at all...
<wallyworld_> yes, "sudo sh ./runcmd" worked and generated the onf file and restarted syslog and everything is act expected
<wallyworld_> but it's not running on boot
<wallyworld_> or so it seems
<mgz> I think it is... from the output
<mgz> before the key generation, there's a line from wget
<mgz> then a note about rsyslog start/running
<wallyworld_> ah yes, missed that
<mgz> ...I'd hurl in a `cat /etc/rsyslog.d/25-juju.conf` before that restart cmd
<mgz> I wonder if this was just the port thing
<wallyworld_> can't see how - it's just echoing to a file to generate the conf
<mgz> and just the lack of output, and somehow confusion over the conf file existing was borking stuff
<wallyworld_> i'll dig some more tomorrow, too tired now, thanks for help
<mgz> I don't see what's special about the non-provisioning state apart from the send-logs-too part
<wallyworld_> yeah
<mgz> er... non-bootstrap node, whatever we're calling it
<mgz> a ghost!
<jam> mgz: mumble?
<mgz> ghost!
<abentley> jam: I've updated the branch to remove references to format 2.  Could you land it for me, please?
<jam> abentley: sure, I'll try
<abentley> jam: Thanks.
<jam> abentley: in progress now
<jam> abentley: done
<jam> rev 1022
<abentley> jam: Thanks!
<jam> abentley: can you run the test suite on trunk just to be sure? I'm running it here
<abentley> jam: Okay.
<bac> dimitern: sorry, i sent the email before pushing the code.  i'll ping you when it is done
<dimitern> bac: sure, np
<dimitern> bac: I got into the habit of leaving draft comments on rietveld and then just lbox propose-ing - it sends them along
<bac> dimitern: it does?  oh that's great.  i thought i had to do it manually
<dimitern> bac: yes, it's nifty (both submit and propose do this)
<rogpeppe> dimitern: yeah, and it has the nice advantage that you don't post comments saying "done" when you haven't actually pushed the changes yet...
<dimitern> rogpeppe: exactly :) i use this a lot!
<dimitern> rogpeppe: btw I don't know why with your benchmarking branch one of the cases was run 50 instead of 100 times (like in your dump)
<rogpeppe> dimitern: it runs it more times if it runs faster
<rogpeppe> dimitern: it tries a small number of iterations first
<dimitern> rogpeppe: I see
<rogpeppe> dimitern: obviously yours was a little bit slower than mine and pushed it over the boundary
<dimitern> rogpeppe: no doubt - it has a teensy core i3 (one of the slowest)
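That behaviour comes from Go's testing package: a benchmark body is driven by b.N, which the framework grows from a small trial run until the timing is stable, so a slower machine settles on a smaller final count. A minimal sketch (this goes in a _test.go file; doWork is a hypothetical workload):

```go
package demo

import "testing"

// doWork is a hypothetical workload standing in for the code under test.
func doWork() int {
	s := 0
	for i := 0; i < 1000; i++ {
		s += i
	}
	return s
}

// `go test -bench .` first runs this with a small b.N, then keeps
// growing b.N until the run is long enough to time reliably.
func BenchmarkDoWork(b *testing.B) {
	for i := 0; i < b.N; i++ {
		doWork()
	}
}
```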
<rogpeppe> dimitern: SSD?
<dimitern> rogpeppe: yes and a good one
<rogpeppe> dimitern: i'm surprised that doesn't make a bigger difference actually. i suspect it's all io bound
<rogpeppe> dimitern: i just got the SSD that i could order with the laptop
<dimitern> rogpeppe: mine is OCZ Vertex 128G
<dimitern> rogpeppe: it supports sata3 i think, but my mb is too old
<rogpeppe> dimitern: i've no idea how to find out what mine is...
<dimitern> rogpeppe: :) just googled around for this - sudo hdparm -d /dev/sda
<dimitern> oops
<dimitern> not -d: -i instead
<rogpeppe> dimitern: one of the great things about macs is it's very easy to find out all the gory details of the machine specs
<dimitern> rogpeppe: how?
<rogpeppe> dimitern: apple menu, About, Details.
<rogpeppe> dimitern:  HDIO_GET_DMA failed: Inappropriate ioctl for device
<bac> dimitern: changes pushed
<TheMue> rogpeppe: isn't the answer simple - always too slow and too small (any system). ;)
<dimitern> rogpeppe: ah, yes - ubuntu used to have something like that (easily accessible somewhere... not the About this Computer in the tray menu though)
<rogpeppe> TheMue: actually i find it's pretty quick.
<dimitern> rogpeppe: -i is the info, -d is get/set dma sorry - see above
<dimitern> bac: sweet!
<rogpeppe> dimitern: ah, i thought -d was the "dump" option :-)
<TheMue> rogpeppe: i'm happy right now too, it has been a geeks statement that hw can't be fast enough. :D
<rogpeppe> dimitern: TOSHIBA THNSNC128GCSJ
<dimitern> rogpeppe: well, I was instantly afraid it was something bad, so checked :)
<dimitern> bac: nice, good to go
<bac> dimitern: by that you mean, submit?
<TheMue> rogpeppe, mramm: hangout?
<abentley> jam: All tests ok.
<jam> abentley: yep, same here
<dimitern> bac: yeah, at least on my part I like it
<bac> dimitern: ok.  dfc already said LGTM and i addressed his issue.  so off it goes.
<dimitern> rogpeppe: can you take a look at https://codereview.appspot.com/7497047 again to see if it makes more sense now? added more comments
<rogpeppe> dimitern: will do
<dimitern> mramm: what happened to the OCR card?
<dimitern> mramm: I mean any steps towards it? what do you think?
<rogpeppe> anyone other than dimitern fancy doing a review of https://codereview.appspot.com/7650048/ ?
<rogpeppe> also there's a fairly trivial followup which actually makes the StateWatcher work: https://codereview.appspot.com/7663048
<dimitern> rogpeppe: I'll take the follow-up
<rogpeppe> dimitern: ta!
<dimitern> rogpeppe: what will happen to the other two watchers in state/state.go - watcher, pwatcher? should they be removed (eventually)?
<rogpeppe> dimitern: no - they're what all the other watchers (including the allWatcher) build on
<dimitern> rogpeppe: but having them in State as fields - wouldn't it be a bit confusing?
<rogpeppe> dimitern: you mean allWatcher vs watcher?
<dimitern> rogpeppe: yes, and pwatcher
<dimitern> (the fields, not the actual impl)
<rogpeppe> dimitern: the fields are needed (and used) by all the other watchers
<niemeyer> So, is allWatcher going to receive data about every single thing that happens in the environment?
<rogpeppe> dimitern: but if you can think of better names, please feel welcome to suggest
<dimitern> rogpeppe: no, just asking, sgtm
<niemeyer> By the way, state importing from state/api/params when state/api depends on state seems slightly backwards
<rogpeppe> niemeyer: currently, yes (because that's what the GUI requires). in the near future we will summarise and hopefully avoid seeing everything.
<niemeyer> rogpeppe: Ugh!
<rogpeppe> niemeyer: there have been some awkward cyclic import issues around there. the params package is there precisely to avoid a cyclic import.
<niemeyer> rogpeppe: There are many ways to avoid cyclic imports
<rogpeppe> niemeyer: we need to work with the GUI as it currently is before moving forwards.
<niemeyer> rogpeppe: Often, though, cyclic imports means that the organization itself isn't great
<rogpeppe> niemeyer: perhaps you're right.
<rogpeppe> niemeyer: suggestions welcome.
<niemeyer> rogpeppe: I had several suggestions before:
<niemeyer> 1) Implement the proper API that makes the GUI work with it
<niemeyer> This could require changes in the GUI, of course
<rogpeppe> niemeyer: i'm not sure what that means
<niemeyer> rogpeppe: Not implementing allWatcher
<rogpeppe> niemeyer: i could not get the GUI folks to change their entire structure at this point, i'm sorry.
<niemeyer> rogpeppe: Well, you actually could.. we've talked about this before in a call as I remember it
<niemeyer> rogpeppe: Spending an awful lot of time to put something broken in juju-core is no better than fixing the GUI
<rogpeppe> niemeyer: i believe that even when fixed, the GUI will want information about the status of all units, even if it's in summary form.
<niemeyer> rogpeppe: That's a broken design
<rogpeppe> niemeyer: is it broken to want to know how many units are in a failed state?
<niemeyer> rogpeppe: Nobody can possibly see 100k units at once in a meaningful way
<rogpeppe> niemeyer: sure they can - *summarised*
<niemeyer> rogpeppe: Heh.. that's a pretty different API than the watch API we have in place
<rogpeppe> niemeyer: really?
<niemeyer> rogpeppe: You wouldn't be watching a unit.. you'd be watching a single summary document
<rogpeppe> niemeyer: yes, either that, or summarising changes into a summary held in memory
<rogpeppe> niemeyer: that's the next phase in the allWatcher implementation (it's probable that the name "allWatcher" may no longer be appropriate after those changes)
<niemeyer> rogpeppe: This is a massive infrastructure to compute "100 units in failing state"
<niemeyer> rogpeppe: That's a single database query
<niemeyer> rogpeppe: Indexed
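For contrast, here is a sketch of the single indexed query niemeyer is describing, using labix.org/v2/mgo; the database, collection, and field names are assumptions for illustration, not juju-core's actual schema:

```go
package main

import (
	"fmt"
	"log"

	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
)

func main() {
	session, err := mgo.Dial("localhost")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// one indexed query instead of watching every unit document
	n, err := session.DB("juju").C("units").Find(bson.M{"status": "error"}).Count()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d units in a failed state\n", n)
}
```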
 * rogpeppe goes off to die
<dimitern> :D
<niemeyer> http://www.dilbert.com/2013-02-25/
 * rogpeppe can't muster a smile.
<dimitern> in such cases i remember monty python's "always look at the bright side of life"
<niemeyer> rogpeppe: We just need a proper plan for how to move forward
<niemeyer> rogpeppe: Perhaps this conversation took place and I'm just not aware, but given the conversations we had when I was pushing things closer, I thought there was agreement on certain things that are clearly not the case at the moment
<niemeyer> rogpeppe: It's not necessarily bad, but it's definitely the kind of thing that needs alignment to avoid digging a big hole
<niemeyer> rogpeppe: I'll send a public sync up email
<niemeyer> rogpeppe: Or half-public, perhaps
<mramm> niemeyer: I don't think the hole is quite that big
<mramm> niemeyer: we can pretty easily change both sides -- either now or later
<mramm> niemeyer: which is not to say that digging holes and then patching them in later is the best thing to do
<niemeyer> mramm: Agreed. My concern at this point is mostly that I had discussed this with rogpeppe and hazmat in a call before, and it felt like the direction was a different one
<niemeyer> mramm: I don't want to be right, though.. I just want us to know that we're not digging that hole, and have a plan
<mramm> niemeyer: yes, but unfortunately that was 2+ months ago and discussions have happened since then
<niemeyer> mramm: Excellent
<niemeyer> mramm: Discussions is what we need
<mramm> niemeyer: I know roger and gary have been in pretty constant communication
<niemeyer> mramm: Why is it hard to build the GUI on top of the existing watchers in watcher.go?
<mramm> niemeyer: I have not been present in all of those meetings myself
<mramm> niemeyer: my (partial) understanding is that there are many things that the gui needs to know that are not yet watched in watcher.go
<niemeyer> mramm: Why is it hard to change the GUI to not require a "give me everything that ever changes in the whole environment" model?
<mramm> niemeyer: I think everybody agrees that we are going to change the GUI not to require that
<mramm> niemeyer: I think the only question is how that happens -- timelines, how steps in that direction get defined, etc
<niemeyer> mramm: i understand that. I'm simply surprised that, given the timelines, it's easier to rebuild watcher.go from the ground up.
<mramm> niemeyer: Why do you say from the ground up?   Am I missing something about the implementation?
<niemeyer> mramm: None of the logic in state/watcher.go is being used
<mramm> niemeyer: ok
<niemeyer> mramm, rogpeppe: There's no one dying.. and perhaps what is being done is the best possible way to do things.
<mramm> that is much more useful information than anything I've seen so far
<mramm> alarm bells are less helpful than specifics
<niemeyer> mramm, rogpeppe: But if nothing else, we need to be aware of the reasons for going in that direction, and why doing less would be harder
<niemeyer> mramm: My email was mostly a request for information..
<mramm> and questions like "hey, is there a reason we couldn't reuse any of the logic in state/watcher?" are much more likely to advance the discussion
<niemeyer> mramm: I assumed that these topics were well covered internally
<niemeyer> mramm: That question happened months ago
<niemeyer> mramm: Either way, I'm sending a mail to canonical-juju
<mramm> ok, what was the answer then?
<mramm> niemeyer: please do not do that yet
<niemeyer> mramm: It's a sync up mail
<mramm> niemeyer: please send it to me first
<mramm> you can do what you want, of course
<mramm> but I think that it would be very valuable to focus this issue on technical discussion
<niemeyer> mramm: Don't worry, it's not a heat-up email.. it's just a way to get a written down plan about where we're headed to
<mramm> and remove the high-stakes sort of alarm trigger words that have been used thus far
<hazmat> fwiw i do agree that a summary doc (effectively aggregating unit counts up to the service level) would cut out a lot of traffic
<mramm> my understanding at this point (not being in all the meetings) is that we all sort of agree on the future
<gary_poster> y
<hazmat> there's some concern about the latency of fetching the actual unit once the ui is interactive, i.e. pay the costs up front and keep the rest of the traffic as async deltas
<mramm> where we will have summary documents, and not just a firehose of data
<hazmat> but that may have been pushed by some of the zk communication overhead.. even simple things like login were adding multi-second latency due to libzk issues (patches extant).
<mramm> I think we all agree that being notified of every change to a twenty thousand node nova compute center would be problematic
<mramm> hazmat: yea, I don't think it makes sense long term for the gui to think that it can effectively keep the entire state in memory...
<mramm> one of the reasons we went to mongo is that we don't want to have to be able to keep the state in memory on the server -- let alone on somebody's phone ;)
<hazmat> agreed, there was a notion that it could shift to indexeddb and keep only the working set in memory.. but yes.. simplicity would be better served by a summary
<hazmat> and invalidation refs
<hazmat> mramm, html5 apis like indexeddb or even localStorage allow for quite a lot of client-side state not in memory
<mramm> hazmat: sure
<mramm> niemeyer: so, with respect to the current situation -- I'm pretty sure we agree on end-state
<mramm> niemeyer: the discussion right now is about how to most expediently get there
<dimitern> hazmat: the only issue with local storage is the quota limit of 5MB (i think at least in chrome, ff maybe as well)
<hazmat> dimitern, right.. hence indexeddb
<dimitern> hazmat: there's no quota there?
<mramm> dimitern: hazmat: but let's be clear here, with a reasonable mongo database you *don't* need to keep the entire state in memory for the GUI
<hazmat> indexeddb and webworkers both have  good support on mobile
<hazmat> dimitern, 50mb
<dimitern> mramm: except some of it - basically for performance reasons
<mramm> sure
<mramm> some data needs to be local
<mramm> but that is a subset of the whole state
<hazmat> the original collection view that the gui was supporting needed to be able to display effectively all of a service's units, with state and ip info, on a single view.
<hazmat> with dynamically sized units and filtering.. it seems the future designs have moved to a paging / filtering mode
<dimitern> hazmat: i thought you displayed only services, not individual units
<hazmat> dimitern, try clicking through a service on the ui.. you get to a service overview with  units, and to a unit details with ability to manipulate a unit (resolved, retry, delete, etc).
<dimitern> hazmat: right, I see it now
<hazmat> anyways.. i'm much more confident of paging through a result set of units w/ mongodb and low latency than zk (where each node is at *least* one round trip)
<mramm> hazmat: clicking through can query more data right (sorry, was distracted for a bit)
<hazmat> mramm, yup... but keep in mind we get some pretty bad physical separation between clients and api servers that also adds to latency.. i.e. a South American client querying an openstack .. the full push delta work minimized the perceived user latency once the app was loaded; previously there would be random pauses as you clicked through some pages.
<hazmat> one of the larger concerns is making sure the client is operating on a correct server side representation when performing actions
<hazmat>  realistically only the server can enforce that
<hazmat> which means the client needs to send revision handles for the contexts it's operating on
<hazmat> there's also one significant api aberration in the juju-cli.. all the others operate on setting state to goal state.. but add-unit operates as a delta
<rogpeppe> hazmat: agreed
<rogpeppe> hazmat: but i'm not sure about the "needs to send revision handles" part
<rogpeppe> hazmat: as long as we have sensible operations and unique handles for entities, i think we're ok
<rogpeppe> hazmat: (we don't currently have the latter, unfortunately)
<hazmat> rogpeppe, it's last write wins then.. two clients writing out to config
<rogpeppe> hazmat: yeah. i think that's reasonable.
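A sketch of the alternative hazmat raises - guarding the write with the revision the client last saw - using mgo/txn's document revisions; the collection, key, and field values here are assumptions for illustration:

```go
package main

import (
	"log"

	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
	"labix.org/v2/mgo/txn"
)

func main() {
	session, err := mgo.Dial("localhost")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	runner := txn.NewRunner(session.DB("juju").C("txns"))
	clientRevno := int64(42) // the revision the client last read (assumed)

	// assert the doc is still at the revision the client saw; the txn
	// package maintains txn-revno on documents it manages
	ops := []txn.Op{{
		C:      "settings",
		Id:     "s#mysql",
		Assert: bson.D{{"txn-revno", clientRevno}},
		Update: bson.D{{"$set", bson.D{{"options.foo", "bar"}}}},
	}}
	err = runner.Run(ops, "", nil)
	if err == txn.ErrAborted {
		log.Println("settings changed since the client read them; reject or retry")
	} else if err != nil {
		log.Fatal(err)
	}
}
```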
<niemeyer> hazmat, rogpeppe, mramm: So, sync up mail sent..
<niemeyer> Hopefully demoted of emotion and any kind of blame.. just trying to see if we can get a written plan
<niemeyer> I'm heading off to lunch, back in a bit
<mramm> niemeyer: have fun at lunch
<mramm> niemeyer: hopefully we can get down to technical specifics and hammer out anything that needs hammering out
<mramm> I just noticed that you sent to canonical-juju, that alleviates my concern; I was flipping back and forth between IRC conversations and misread you to say you were going to send a message to canonical-tech
<mgz> fuuuu
<rogpeppe> done for the day
<rogpeppe> g'night all
<niemeyer> mramm: Yeah, certainly not a topic relevant for tech..
<benji> hi all, I have gone through one review round on https://codereview.appspot.com/7600044/ and now need a second +1
<benji> by the way, while we're discussing reviews, I have a question: my current understanding that two LGTM reviews (with or without "trivials") are all that are required for approval (after any "trivials" are addressed to the branch author's satisfaction).  Is that the case?
<m_3> _thumper_: hey... lp question about moving bugs to new branches if you gotta sec
<_thumper_> m_3: hey, otp right now
<m_3> ack
<niemeyer> benji: It used to be.. but if someone questions it, that counts too
<niemeyer> benji: I mean, two LGTM + 1 "I don't get it" == hold
#juju-dev 2013-03-20
<davecheney> _thumper_: seen mramm ?
<_thumper_> davecheney: yeah
 * davecheney waits on mramm for our 1-on-1
<bigjools> davecheney: I'm trying to do a go get on github.com/andelf/go-curl and go says "cannot find package", yet it seems to work for others.  Any ideas?
<bigjools> ok we worked it out
<bigjools> FFS
<bigjools> my GOPATH started with a :
<bigjools> ridiculous
<davecheney> bigjools: O_O!!
<davecheney> ahh, there is a long story about that
<davecheney> which isn't (all) go's fault
<bigjools> davecheney: I see :)
<davecheney> for example, did you know that PATH=$PATH:: is shorthand for
<davecheney> PATH=$PATH:.
<bigjools> !
<bigjools> davecheney: so when I do a perfectly reasonable line like this in my .bashrc, I am screwed then: export GOPATH="$GOPATH":/my/path
<davecheney> yup, that isn't awesome
<davecheney> that says GOPATH=.:/something/else
<davecheney> where . is the first element, and will be the target for go get
<davecheney> if it is of any consolation, i have fixed many of these usability problems in 1.1
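A tiny diagnostic sketch of the pitfall above: an empty element in a colon-separated path list (leading, trailing, or doubled colon) behaves like ".", so go get can silently target the current directory:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	gopath := os.Getenv("GOPATH")
	if gopath == "" {
		return
	}
	// an empty element in the list behaves like the current directory
	for i, p := range strings.Split(gopath, string(os.PathListSeparator)) {
		if p == "" {
			fmt.Printf("GOPATH element %d is empty and will act like %q\n", i, ".")
		}
	}
}
```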
<davecheney> bigjools: thumper do you two have time to talk about the mongo problem ?
<bigjools> sure
<davecheney> so, latest i have is everyone is working really hard to get mongo into at least raring by 13.04
<davecheney> but I think we need a backup
<davecheney> bigjools: what I would like to do is move from the tarball to your ppa for getting mongo for the bootstrap nodes
<bigjools> the current raring packaging has a bug when it tries to build with ssl, fwiw
<bigjools> \o/
<bigjools> how about an official PPA
<bigjools> I just did mine as I don't trust unsigned binaries :)
<davecheney> bigjools: sounds fine by me, i'd really like your guidence on this
<bigjools> ok
<davecheney> the trick is we need P, Q and R, amd64 as a must
<davecheney> 386 and arm as a maybe
<bigjools> thumper was saying that the 2.2.3 build didn't work for him
<davecheney> shitter
<bigjools> 2.2.2 was defo ok
<bigjools> I built 2.2.3 in the same way so I dunno what's up
<davecheney> bigjools: how do I, or you
<davecheney> i think you'd be faster
<davecheney> get a PPA location setup so I can then reference that in the cloud init scripts ?
<bigjools> activate a new PPA for the juju devs?
<davecheney> bigjools: full disclosure, i'm a launchpad numpty
<davecheney> you'll have to walk me through this
<bigjools> heh
<bigjools> np
<bigjools> which team should own the ppa?
<davecheney> i think there is a juju team now
<davecheney> it used to be ~gophers
<bigjools> perfect
<davecheney> but that was changed recently
 * davecheney checks
<davecheney> bigjools: if you are not a member
<davecheney> i can fix that
<bigjools> so if you are an admin of the team you can activate a ppa for it
<bigjools> once you do that you can't rename the team
<bigjools> not sure I need to be a member
 * davecheney goes to look
<davecheney> https://launchpad.net/~juju
<davecheney> ^ i can create a new ppa
<bigjools> on the left mid way down, yes
<bigjools> so I suggest making one called devel, one called staging and one called stable
<bigjools> or perhaps s/devel/experimental/
<bigjools> then we can test packages and copy them later to the stable ppa
<davecheney> bigjools: sounds reasonable
<davecheney> i'm going to be adding this ppa to the cloudinit script
<bigjools> the forms on LP are pretty self-explanatory I hope
<davecheney> how will having dev/stage/stable play into this
<bigjools> yeah then it does need a production PPA that is stable :)
<davecheney> ok, done devel and stable
<davecheney> that'll do for the moment
<davecheney> bigjools: also, because i'm a lp numpty, i'm not set up with all the pkg signing apparatus
<bigjools> easy to sort
<davecheney> ok
<bigjools> on the cmd line just type "add-apt-repository ppa:account/ppa-name"
<davecheney> oh yeah, i've done that side
<bigjools> it won't work until the 1st package is in the PPA though
<davecheney> but the publishing part will probably involve more steps
<bigjools> packaging is tricky, yes
<bigjools> I can help there
<davecheney> sweet, thanks
<bigjools> I suggest that once you have the experimental PPA up, you copy my 2.2.3 package into there
<davecheney> ok, experimental created
<davecheney> i copied your ppa once before
 * davecheney looks
<bigjools> so visit my PPA's packages page
<bigjools> and click copy packages
<bigjools> then your experimental PPA will be in the destination drop down
<bigjools> copy to same series and copy source + binary
<davecheney> https://launchpad.net/~julian-edwards/+archive/mongodb
<davecheney> why is it always a sonofabitch to find the copy link ?
<bigjools> click "view packages"
<bigjools> it's on that page
<davecheney> gotcha
<davecheney> ok, wheels are grinding now
<davecheney> while that is baking i shall make a branch to use it
<bigjools> will be interesting to see why it fails
<bigjools> or at least thumper said it fails :)
<davecheney> bigjools: can I use copy package to change the series to quantal, etc ?
<bigjools> davecheney: not a good idea to go backwards
<davecheney> ok
<bigjools> davecheney: I did a 2.2.2 for quantal
<bigjools> it works fine
<davecheney> that is ok
<davecheney> once we get _one_ working mongo
<davecheney> i can use default-series to boot an environment that matches
<bigjools> you can promote 2.2.2 from quantal to raring
<davecheney> bigjools: is there anyone I should talk to about using ppa's in cloudinit ?
<bigjools> smoser
<davecheney> of course
<davecheney> da man
<davecheney> bigjools: are you on juju-dev ?
<davecheney> ML
<bigjools> davecheney: yes I think so
<davecheney> ok, just announced my intentions for this packaging bollocks there
<davecheney> please comment and/or throw fruit
<bigjools> seems ok
<davecheney> didn't expect it to be controversial
<bigjools> I don't understand juju well enough to comment really
<bigjools> writing the python maas provider was enough for me :)
<davecheney> i can give you more background
<davecheney> but you probably don't want to know
<davecheney> thumper: seen this test failure ?
<davecheney> ... obtained *net.OpError = &net.OpError{Op:"write", Net:"tcp", Addr:(*net.TCPAddr)(0xc200245d50), Err:0x68} ("write tcp [::1]:46014: connection reset by peer")
<davecheney> ... expected *rpc.ServerError = &rpc.ServerError{Message:"transformed: message", Code:"transformed: code"} ("server error: transformed: message (transformed: code)")
<thumper> umm...
<thumper> perhaps
<thumper> I do still get some intermittent failures
<davecheney> happens 100% of the time
<davecheney> will bisect in a few mins
<thumper> hmm
<thumper> I ran tests just before I went out
<thumper> and all passed
<davecheney> trying to improve my jenkins builder to run our tests
<thumper> let me check my trunk revision
<davecheney> thumper: which revno
<thumper> r1024
<thumper> davecheney: what do you have?
<davecheney> hmm, how can I check
<davecheney> this branch has local changes
<thumper> bzr revno
<davecheney> yeah, checking 1024 now
<davecheney> thumper: yup, totally repeatable @1024
<davecheney> if you can confirm the failure I will log a bug about it
<thumper> hmm... I don't get it on r1024
<davecheney> we've seen failures like this before
<thumper> let me triple check
<davecheney> races between the client reading the response and the server closing the connection
<thumper> ok  	launchpad.net/juju-core/rpc	0.186s
<thumper> that is with r1024 of trunk
<davecheney> thumper: can you please try GOMAXPROCS=2 go test launchpad.net/juju-core/rpc
<thumper> davecheney: tried three times, succeeded each time
 * thumper is trying to land the upload tools tweaks
<davecheney> with GOMAXPROCS=2 (or any number other than 1)
<davecheney> fair enough
<thumper> OMG there were a lot of conflicts with trunk
<thumper> davecheney: yes, with set to 2
<davecheney> that is important, gets the review queue unblocked
<thumper> exactly
<davecheney> carry on
 * thumper nods
<davecheney> bigjools: https://launchpad.net/~juju/+archive/experimental/+packages
<davecheney> looks like it worked
<bigjools> davecheney: cool
<bigjools> so hax0r away
<thumper> bigjools: have you fixed mongo?
<thumper> davecheney: branch landed
<bigjools> rofl
 * thumper is done
<bigjools> thumper: it's not a packaging problem AFAICS therefore it's a juju problem
<bigjools> or mongo.... take your pick. either way, it's not *my* problem :)
<thumper> heh
<bigjools> you said you saw libssl linked?
<bigjools> perhaps 2.2.3 has something that breaks juju
<thumper> maybe... NFI
 * thumper goes to help make dinner
<davecheney> LFTM, ldd $(which mongod) linux-gate.so.1 =>  (0xb76ef000) libssl.so.1.0.0 => /lib/i386-linux-gnu/libssl.so.1.0.0 (0xb7690000)
<davecheney> mongo is stupid
<davecheney> they require the elephant that is libboost
<davecheney> but they only need it for the headers
<davecheney> thumper: ok, the rpc test failure is real
<davecheney> just verified under 1.0.3 as well
<davecheney> it fails under my stress test script
<davecheney> will raise a bug
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1157553
<_mup_> Bug #1157553: rpc: tests are unreliable <juju-core:New> < https://launchpad.net/bugs/1157553 >
<davecheney> jam1: question about juju cli for windows ?
<jam> yes davecheney?
<davecheney> did you get a cross compile working, or did you do it directly on win32 ?
<jam> davecheney: just did it directly
<davecheney> fair enough
<davecheney> that is what I thought
<davecheney> as you were
<jam> well, I need to actually run it there anyway, right ? :)
<davecheney> yes
<davecheney> sorry, was answering a question from another channel
<jam> np
<jam> I did recompile go, but I probably could have just downloaded the 32-bit version. The big key was just needing 'gcc' for cgo.
<jam> Though maybe I would need a "close" match between the gcc for go and the one for goyaml
<jam> so maybe I would need to recompile go
<jam> C usually isn't as version specific as C++
<davecheney> i would really like to remove the cgo requirement for goyaml
<davecheney> which is a medium sized job
<davecheney> but might be helpful if we have to cross compile a lot in the future
<jam> when I mentioned it in the past (1yr?) gustavo claimed that he didn't want to implement the mess that was yaml parsing.
<davecheney> ie, I don't see lp supporting a win32 target any time in the future
<davecheney> yeah, that is very true
<davecheney> goyaml just wraps the python yaml c code
<jam> right
<davecheney> so it has the same compatibility as the python version
<davecheney> a decent compromise
<jam> from what kapil said, yaml parsing speed becomes a bottleneck at large scale, so having a slow parser might also be bad
<jam> but it could be *really* useful if the API was compatible
<jam> so you could use "A" or "B" yaml libs
<jam> and then we could have native go  yaml for stuff like win32
<davecheney> i don't think yaml parsing speed is that important
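For reference, basic goyaml usage looks like this; the config struct below is a made-up example, not juju-core's actual environment config type:

```go
package main

import (
	"fmt"
	"log"

	"launchpad.net/goyaml"
)

// config is an illustrative struct; field tags map yaml keys to fields.
type config struct {
	DefaultSeries string `yaml:"default-series"`
	AdminSecret   string `yaml:"admin-secret"`
}

func main() {
	data := []byte("default-series: precise\nadmin-secret: sekrit\n")
	var c config
	if err := goyaml.Unmarshal(data, &c); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", c)
}
```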
<jam> davecheney: do you know why cgo doesn't support cross-compiling? You certainly have cross-compiling gcc
<davecheney> we don't have the topology node
<davecheney> jam: it's not that it doesn't support it
<davecheney> but the moving parts required especially to go from elf -> PE
<davecheney> are substantial
<davecheney> if you did all the compilation steps by hand, and had the right cross compiler and cross linker, as well as headers and .so's
<davecheney> you could do it
<davecheney> but it was just too hard to automate with the simple go tool
<jam> sure. So for Bazaar, we just created an EC2 instance that we bundled, and then brought up when we did a release to build the next binaries.
<jam> It wouldn't be hard to set that up again.
<rogpeppe> mornin' all
<jam> hi rogpeppe, hope your day has started well so far
<davecheney> morning rogpeppe
<rogpeppe> jam: the shower was hot :-)
<rogpeppe> davecheney: hiya
<davecheney> jam: i think that is a good suggestion
<davecheney> lets see how it pans out
<davecheney> maybe someone will kick in the resources for this machine on a native platform
<davecheney> :P
<jam> davecheney: so in the short-term, I have no problem doing the compiling on win32, but I imagine we'll want a more automated system so we can detect issues pre-release.
<dimitern> morning
<davecheney> jam: smells like a jenkins shaped problem
<jam> hi dimitern
<jam> davecheney: it does, indeed. Though my personal jenkins-fu is quite rusty.
<davecheney> jam: i've setup a jenkins CI build now that i have the right i386 mongo
<jam> I've used it, I've played with it, but not all that much.
<jam> davecheney: you have a mongod that runs on Windows? Or is that Linux-i386?
<davecheney> jam: jenkins is easy, imagine a hammer that has been worn down by repeatedly bashing on hard things
<davecheney> that is jenkins
<davecheney> jam: i386, see mail to juju-dev about backstop plan for how to get away from using a tarball for mongo
<jam> vila from our old squad had jenkins stuff set up. I think his biggest trouble was trying to create a reproducible save state (so if needed someone else could bring it up similarly). It wasn't terrible to configure and sort of hack it together.
<jam> davecheney: yeah, I saw the "start at a ppa"
<jam> though IIRC the ppa build breaks the test suite.
<jam> If you run with '-gocheck.vv' I *think* rogpeppe and I (mostly him at my request) made it so the failing mongod would at least get logged.
<davecheney> jam: i think we fixed that
<dimitern> setting up jenkins is painful, but at least takes less than a day, even from scratch - did it twice :)
<rogpeppe> davecheney: i saw your TestTransformErrors failure yesterday
<rogpeppe> davecheney: it only happens in go tip
<rogpeppe> davecheney: and i haven't narrowed it down yet
<davecheney> rogpeppe: it happens with 1.0.3
<davecheney> i checked
<rogpeppe> davecheney: orly?
<davecheney> it's timing related
<davecheney> rogpeppe: http://play.golang.org/p/lSPW2BO2Tk
<davecheney> cd $PKG
<davecheney> bash stress.bash
<rogpeppe> davecheney: i'm sure it never happened before on go tip, so i'm surprised about 1.0.3
<davecheney> fails in a few seconds
<dimitern> fwereade: https://codereview.appspot.com/7497047/ please?
<rogpeppe> davecheney: ah, i've managed to reproduce the problem on 1.0.2 (and i also saw a log-target-race related panic)
<davecheney> it's less prevalent under the old 1.0.x scheduler
<davecheney> but the race is there
<rogpeppe> davecheney: yeah (this time i saw the issue in a different test, BTW, but still "connection reset by peer")
<rogpeppe> davecheney: (when reading a response, not writing the request though - it might be a different issue but i suspect not)
<davecheney> rogpeppe: indeed, the failure modes are always subtly different but it appears to be the client and the server racing to read the final message
<davecheney> rogpeppe: didn't dimitern fix a very similar bug in Atlanta ?
<fwereade> dimitern, sorry, on it right now
<dimitern> fwereade: cheers!
<rogpeppe> davecheney: ah, i *think* i might see the problem
<dimitern> how can i subscribe to canonical-juju ?
<davecheney> dimitern: how much cash do you have ?
<dimitern> davecheney: :D
<davecheney> dimitern: https://lists.canonical.com/mailman/listinfo/canonical-juju
<rogpeppe> davecheney: yup, fixed it, i think
<dimitern> davecheney: as a BG rapper sang once "i own tens of bucks"
<dimitern> davecheney: cheers
<rogpeppe> davecheney: https://codereview.appspot.com/7919043
 * davecheney looks
<davecheney> rogpeppe: that'll do it every time
<rogpeppe> davecheney: yup :-)
<davecheney> i'm surprised that ever worked
<rogpeppe> davecheney: yup.
<davecheney> what is the history on that file
<davecheney> i have a worrying feeling that we tried to 'fix' it before
<davecheney> (poorly)
<rogpeppe> davecheney: i guess it was almost always running the new goroutine first
<rogpeppe> davecheney: i'm not sure. i don't remember doing so
<davecheney> i have a vague memory from atlanta
<rogpeppe> davecheney: i might've missed it, being involved with gui stuff
 * davecheney checks
<davecheney> the reason i mention it is we had a very similar bug that week
<rogpeppe> davecheney: well, there are no fixes of that kind in the rpc package at any rate
<rogpeppe> davecheney: this produces only ian's recent changes to log:  for(i in *.go) {echo $i; bzr blame $i | grep -v '^ +\|' | grep -v roger.p }
<davecheney> someone showed me
<davecheney> bzr log --show-diffs
<davecheney> but I can't make it work
<davecheney> what am I doing wrong ?
<rogpeppe> davecheney: i just checked that noone other than me has made any changes to the files in rpc. :-)
<davecheney> maybe it was a similar error in another package
<davecheney> screw it
<davecheney> get one more LGTM and job done
<davecheney> yup, i'm happy we are not reverting another fix
<fwereade> dimitern, qualified LGTM, ping me if there's any uncertainty
<dimitern> fwereade: tyvm
<dimitern> fwereade: no, i'm not getting the thing about using the wrong charm url in unit.go, please explain
<fwereade> dimitern, the service config we care about from the perspective of the unit is the config that is known to match the *unit*'s charm url
 * TheMue likes failing tests *hmpf*
<dimitern> fwereade: exactly what i'm doing isn't it?
<dimitern> fwereade: getting u.doc.CharmURL + u.doc.Service for the key
<dimitern> (well, vice versa, but the code is correct i think)
<fwereade> dimitern, looking up the service's charm url is defeating the point of doing all the work in the first place
<fwereade> dimitern, not entirely ofc, because you do the right thing in watcher.go
<fwereade> dimitern, when watching the unit's service config
<dimitern> fwereade: are you talking about WatchServiceConfig() ?
<fwereade> dimitern, WatchServiceConfig is Correct
<fwereade> dimitern, ServiceConfig is not
<dimitern> fwereade: aah, got you :)
<fwereade> dimitern, they should be using the same document
<dimitern> fwereade: right, sorry - i'll fix it
<fwereade> dimitern, np, I didn't catch that first time round anyway
<fwereade> dimitern, but, sorry, wait a mo
<fwereade> dimitern, no, we're good
<fwereade> dimitern, thanks :)
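A hypothetical sketch of the fix just agreed: the settings key must be built from the unit's own charm URL rather than the service's. The struct shapes and key format below are assumptions for illustration; the real juju-core types differ:

```go
package state

import "fmt"

// Assumed shapes for illustration.
type unitDoc struct {
	Service  string
	CharmURL string
}

type Unit struct {
	doc unitDoc
}

// serviceSettingsKey's format here is an assumption; the point is which
// charm URL feeds it.
func serviceSettingsKey(service, charmURL string) string {
	return fmt.Sprintf("s#%s#%s", service, charmURL)
}

// settingsKey uses the unit's own charm URL, so a unit that has not yet
// upgraded keeps reading the settings that match its charm.
func (u *Unit) settingsKey() string {
	return serviceSettingsKey(u.doc.Service, u.doc.CharmURL)
}
```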
<dimitern> fwereade: i'm a bit divided what comments to put around the ops
<fwereade> dimitern, which ones? :)
<dimitern> fwereade: this case you described about inc ops with a single upgraded unit in error state getting reverted
<dimitern> fwereade: should we document this somehow, or make the "first use" comment more correct
<fwereade> dimitern, I think that the explanatory line should be either made strictly accurate, or maybe even dropped; as it is, it clearly contradicts the preceding if/else chain
<fwereade> dimitern, I'm not sure what explanatory power a strictly accurate version has though
<dimitern> fwereade: ok, i'm dropping it :) can't be bothered to wrap my mind around commenting all cases in a sentence
<fwereade> dimitern, how about "The document might need to be created"?
<fwereade> dimitern, I think that is accurate and high-level helpful
<fwereade> dimitern, "A new settings document might need to be created"?
<dimitern> fwereade: that's like saying the water is wet
<dimitern> fwereade: we're creating it anyway
<fwereade> dimitern, ha, sorry, "a new refcount document might need to be created"?
<dimitern> fwereade: hmm... let me think a bit
<dimitern> fwereade: what we're actually doing there is either creating a refcount=1 or just refcount++
<fwereade> dimitern, honestly, it is documented anyway in settingsIncRefOp
<dimitern> fwereade: yeah, i think it makes sense, although not saying when we might do it... ah right - just take a look at the settingsIncRefOp
<fwereade> dimitern, on service creation we just blat it out directly because we don't need the complexity of an incref, but in this case we need it so we use it
<dimitern> fwereade: ok, i'll change both inc and dec comments so
<fwereade> dimitern, maybe the comments are best couched in terms of "the unit adds a reference to the new settings doc" and "the service drops its reference to its old settings doc"?
<dimitern> fwereade: even better! 10x
<fwereade> dimitern, cool
<TheMue> Ha, killed the next failure. *g*
 * fwereade cheers at TheMue
<TheMue> fwereade: After changing the hard-coded references to $HOME/.juju there are now still some hidden glitches in the tests. ;)
<fwereade> TheMue, yeah, no doubt -- good luck with them
<TheMue> fwereade: thx
<dimitern> the update manager this morning suggested doing a "partial distribution upgrade" - never seen this before, i did it and /etc/issue still reports 12.10 - anyone seen this?
<dimitern> fwereade: done with the changes - would you like one last check (mostly for wording in comments) before I submit?  https://codereview.appspot.com/7497047/
<fwereade> dimitern, sure
<davecheney> good night gentlemen
<davecheney> don't forget, release on friday
<davecheney> i'll do the release notes dance tomorrow
<dimitern> davecheney: good night!
<davecheney> night y'all
<jam> night davecheney
<jam> fwereade: so we had a question come up recently about Series vs Arch. Specifically, one seems to be a conf parameter, and one is meant to be a --constraint (from what I could tell).
<jam> I noticed because both are a bit borked on Windows
<jam> where it wants <unknown> i386, and we don't have those charms available :)
<jam> I can change the code so that if series is unknown, it returns the default series from conf, but where do I overload the arch? (--constraint was the recommended method from mgz, IIRC)
<jam> mgz, dimitern: can either of you try bootstrapping on HP? I'm getting weird inability to write to swift, but I don't know if it is something about me.
<jam> I would ask wallyworld, but he is obviously away right now.
<fwereade> jam, heyhey
<fwereade> jam, usage of version.Current.Series will hopefully be evaporating imminently: tim's in the process of clearing up the environs api to accept series instead of tools
<jam> also, fwereade, https://code.launchpad.net/~fwereade/juju-core/bootstrap-constraints-3a/+merge/153725 landed, so I think your "EC2 should choose" kanban card can go into merged?
<jam> fwereade: yeah, I was hoping I wouldn't have to do much about series, and could let Tim do the work for me there.
<jam> But Arch is also important in this context, and it was unclear how that happens.
<jam> At the very least, it seemed both Arch and Series should be configured in the same manner
<jam> (environments.yaml, or --constraints, but not both)
<jam> well, even both, but not one one way and the other differently
<mgz> jam: can check
<jam> thanks mgz, good morning, btw
<jam> I hope your Muselix is tasty today
<fwereade> jam, I don't consider my card really done until I've merged -6 -- it was originally meant to include that too
<fwereade> jam, but I am working on merging it all today
<jam> k
<fwereade> jam, wrt series and arch I think matters are a little more subtle
<mgz> eaten long ago, the fun of getting on the 7:45 bus...
<jam> mgz: I thought colocation was Thurs?
<mgz> at least I have a net connection good enough for audio+ssh today, not sure what was up with the crazy packet loss at home yesterday
<jam> yeah, that got pretty crazy at the end.
<mgz> it is, I'm casually in town
<jam> I could hear you fine, but your ssh lag was pretty bad
<mgz> and irc stayed up...
<mgz> but incoming audio was being mangled, and screen was paaainful
<jam> mgz: so how much to get Fi-OS to your house? :0
<mgz> well, there have been promises from the local authority...
<jam> we promise to make you pay through the nose
<jam> to get 1Mbit
<fwereade> jam, would you like a quick hangout?
<jam> I guess T3 was 1.5Mbit?
<jam> fwereade: sure
<jam> fwereade: you want to set it up or me?
<mgz> it's mostly a service reliablity and cost thing these days rather than speed, which is actually pretty reasonable
<fwereade> jam, I was getting water but will do so now
<jam> np
<dimitern> fwereade: ping
<fwereade> dimitern, pong
<dimitern> fwereade: does it look ok?
<fwereade> dimitern, dammit sorry
<dimitern> fwereade: np, i'm already half though the upgrade-charm cmd anyway :)
<rogpeppe> dimitern: one last question on your review https://codereview.appspot.com/7497047/
<dimitern> rogpeppe: just answered :)
<rogpeppe> dimitern: hmm, but settingsRefsDoc doesn't *have* an id field. that was what was confusing me.
<dimitern> rogpeppe: it has an implied id
<dimitern> rogpeppe: if not specified mongo creates one, afaik
<rogpeppe> dimitern: oh, i see. we're relying on a mongo-created _id?
<dimitern> rogpeppe: yeah
<rogpeppe> dimitern: so how do we find out that key to increment the ref count?
<dimitern> rogpeppe: we use the serviceSettingsKey() to construct the key, and then Find/FindId
<rogpeppe> dimitern: oh, i see, we do specify the id when creating the document
<dimitern> rogpeppe: exactly
<rogpeppe> dimitern: we just don't have it in the doc
<rogpeppe> dimitern: hmm, i think that's a bit confusing. all the other docs contain their id fields.
<dimitern> rogpeppe: yeah, it seemed a bit weird at first, but fwereade explained it and it made sense
<fwereade> dimitern, rogpeppe: no, we specify the key in the txn.Op
<rogpeppe> fwereade: yes, i saw that. but don't all the other doc types have the key in the struct too?
<dimitern> rogpeppe: well, saves a bit of typing: settingsRefsDoc{1} instead of {Id:x, RefCount:1}
<fwereade> dimitern, rogpeppe: it ends up cleaner imo, because then we can just use the doc directly in update operations without mgo whining about _id not being sane to update
<fwereade> rogpeppe, and since it's not actually required anywhere else, meh, why include it?
<rogpeppe> fwereade: ok. maybe a comment next to the settingsRefDoc would be appropriate then
<fwereade> rogpeppe, +1 to that
<dimitern> rogpeppe: i'll add a comment
<rogpeppe> fwereade: because it wasn't obvious to me at any rate
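The comment being asked for might boil down to something like this sketch (the field tag and exact wording are assumptions):

```go
package state

// settingsRefsDoc deliberately omits an _id field: the key is supplied
// in the txn.Op (built with serviceSettingsKey), so creation reads as
// settingsRefsDoc{1} and updates can marshal the whole doc without mgo
// objecting to a change of _id.
type settingsRefsDoc struct {
	RefCount int `bson:"refcount"`
}
```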
<mgz> jam: bootstrap on hp worked btw, want to compare config?
<fwereade> rogpeppe, yeah, good observation
<jam> mgz:  my hp config: https://pastebin.canonical.com/87223/
<mgz> basically, pulled juju-core trunk, `go install ./...`, sourced creds, `~/go/bin/juju bootstrap --upload-tools` and `~/go/bin/juju status`
<dimitern> fwereade, rogpeppe: thanks for the reviews. i'll include your suggestions and submit it shortly
<fwereade> dimitern, cool, thanks
<jam> mgz: and the failure: https://pastebin.canonical.com/87224/
<jam> I can't --upload-tools on Windows
<jam> but that isn't where it is failing
<mgz> jam: https://pastebin.canonical.com/87225/
<mgz> jam: check your OS_REGION_NAME
<jam> mgz: export OS_REGION_NAME="az-3.region-a.geo-1"
<jam> I checked that
<jam> and the URL for the swift service looks right
<jam> it is going to: https://region-a.geo-1.objects.hpcloudsvc.com/v1/AUTH_3...
<jam> but it is getting an EOF
<jam> which if I then 'swift list' after changing OS_REGION_NAME, I don't see the bucket mentioned in the bootstrap debug
<jam> while I *do* see your hpgztesting bucket
<mgz> I'll try changing the bucket name to see if creation is borked somehow, I would have had that bucket before
<jam> mgz: I would have thought 'destroy-environment' would delete the bucket.
<mgz> nope, that works as well
<mgz> jam: you'd have thought...
<jam> mgz: (that can also be tested :)
<jam> mgz: I don't see your old bucket, and I do see a "hpgztesting-new" bucket
<mgz> $ swift list
<mgz> No handlers could be found for logger "keystoneclient.v2_0.client"
<mgz> Endpoint for object-store not found - have you specified a region?
<mgz> ...thanks python-swiftclient
<mgz> ...is that actually the bucket name?
<mgz> ah, no, it should have my hex bit as well
<jam> mgz: you have to change OS_REGION_NAME for swift list
<jam> because it doesn't match the nova region
<mgz> ah, ta
<jam> (note that you have to use the regular one for 'nova list' but you have to change it for 'swift list'... sigh)
<mgz> jam: so yeah, sorry, works for me, and nothing's obviously wrong with your setup in comparison
<dimitern> rogpeppe, fwereade: is it ok for both of you if I just drop the last sentence of the settingsRefsDoc comment (the one about the last unit to upgrade dropping the old settings doc) and leave the rest? that is already mentioned inline in service.go
<fwereade> dimitern, I think I'd still favour dropping all but the first, for the same reasons, but I think I should defer judgment to roger who's in a better position to know how the code looks when it's unfamiliar
<dimitern> fwereade: ok, so to make peace, I'll leave it then :)
<fwereade> dimitern, what? this means WAR!
<fwereade> dimitern, (no, that's fine ;))
<dimitern> fwereade: :D
<jam> WAR? This means PEACE!
<jam> wait...
<rogpeppe> fwereade: is there anywhere else that has a more coherent comment about the overall strategy for managing settings docs?
<fwereade> rogpeppe, there is not
<dimitern> rogpeppe: not in one place, no
<fwereade> rogpeppe, it would be a very good thing for us to write such a document, but in my personal queue it comes behind a how-to-write-transactions document
<rogpeppe> fwereade: in which case, i think it's good to have that comment there (and to perhaps fix it if it's not accurate), rather than inferring the strategy from scattered comments
<jam> wallyworld_: I didn't mean to scare you away, please come back :)
<fwereade> rogpeppe, yep, +1
<wallyworld_> jam: stupid mumble is freezing :-(
<rogpeppe> fwereade: i really think it's nice to have comments in the code (perhaps in addition to other external docs)
<rogpeppe> fwereade: or at the least a pointer to the external doc within the code.
<fwereade> dimitern, for pedantry's sake would you mention that the service decrefs the settings doc for its old url on change please?
<dimitern> fwereade: sure, +1
<rogpeppe> fwereade: when the service config settings change?
<fwereade> rogpeppe, no, when the service's charm url changes
<rogpeppe> fwereade: how is that different from upgrading?
<fwereade> rogpeppe, it's not, but the decref is not mentioned -- only the incref to the new one
<dimitern> rogpeppe, fwereade: there it is - take it or leave it :) http://paste.ubuntu.com/5630902/
<rogpeppe> fwereade: "
<rogpeppe> When a unit upgrades to the new charm,
<rogpeppe>  717 // the old service settings ref count is decremented
<rogpeppe> "
<rogpeppe> isn't that what you're talking about?
<fwereade> rogpeppe, ok, I would appear to be lacking in perception
<fwereade> dimitern, go for it
<rogpeppe> fwereade: ah, i thought i was missing something crucial :-)
<dimitern> cheers
<jam> wallyworld_: no love for mumble today?
<wallyworld_> trying
<wallyworld_> jam: mumble won't connect, wanna do a hangout?
<jam> wallyworld_: unfortunately, mgz is on his chromebook, and there are no binaries for G+ on it... :( but I can hangout with you and proxy for martin
<jam> wallyworld_: can you hear us?
<jam> I see you in mumble
<wallyworld_> no :-(
<jam> if you can hear, then you can "speak" in the channel maybe?
<wallyworld_> mumble locks the cpu
<jam> joy joy... :(
<wallyworld_> i guess we can start a hangout
<jam> wallyworld_: https://plus.google.com/hangouts/_/261ed219337d5a6c8fb184b90b243b7a06b746f3?authuser=2&hl=en
<jam> dimitern: mgz ^^
<mgz> will be there shortly
<wallyworld_> jam: can you hear me?
<jam> wallyworld_: we can hear you
<jam> you can't hear us now?
<jam> mgz: no audio
<TheMue> Ahhh! *jump* *jump* *jump* It seems I've found it.
<TheMue> *tschakka* Yeah, got it. *big-smile*
 * fwereade_ cheers at TheMue again
<fwereade_> rogpeppe, does http://paste.ubuntu.com/5631012/ look like something you've seen before?
<jam> mgz: it might have been the tools that are in the bucket on hp-cloud (in the shared account). When I added public-bucket-url, it seems to be working...
<jam> (maybe we have a bug if you download from swift and then try to upload it)
<jam> does swift do anything with uploading identical content?
<jam> like tell you "I don't need you to send any bytes because you already have a file matching that sha1" ?
<rogpeppe> fwereade_: no - AddRelation is new in the api. it's almost certainly something to do with the way that the AddRelation operation is being undone (or not...)
<rogpeppe> fwereade_: at least, i *think* AddRelation is new
<fwereade_> rogpeppe, yeah, that's the latest commit
<fwereade_> teknico, I'm seeing a consistent failure in trunk: http://paste.ubuntu.com/5631012/
<teknico> fwereade_, what did I break? looking
<dimitern> fwereade_: i have the same issue
<dimitern> just run the tests
<dimitern> on trunk
<fwereade_> teknico, if it's not happening for you, that's not immediately clear :)
<dimitern> fwereade_: could it be because of the go version? i'm on 1.0.3
<fwereade_> dimitern, I'm 1.0.2
<fwereade_> teknico, assuming it's not so trivial as to take only a couple of minutes, would you back it out please? I'd be happy to help investigate on another branch if it turns out to be something tricky
<fwereade_> teknico_, what was the last thing you saw?
<teknico_> fwereade_, my own question at 13:33:55
<fwereade_> teknico_, I saw nothing from you before 13:32
<fwereade_> teknico, I have http://paste.ubuntu.com/5631037/
<teknico> fwereade_, thanks, here's what I said:
<teknico> [13:33:55] <teknico> fwereade_, is it the test on line 1952? it did pass here before I submitted
<teknico> checking right now
<fwereade_> rogpeppe, fwiw it is not totally wonderful that a test failure seems to cascade into a panic; would it be simple to address this?
<teknico> fwereade_, yes, it's happening here too, I'm not clear on why, so I'm reverting the whole branch
<fwereade_> teknico, no worries, let me know when it's reverted
<rogpeppe> fwereade_: it's because opClientAddRelation returns a nil function pointer. i can't see how that could ever have worked
<rogpeppe> fwereade_: i don't believe the tests were run before submitting
<fwereade_> rogpeppe, ah, jolly good
<fwereade_> rogpeppe, thanks
<benji> fwereade_: I am about to submit this branch https://codereview.appspot.com/7600044/ and it was suggested that you might want to know it is coming because it may cause merge conflicts with your current work
<teknico> fwereade_, reverting branch is lp:~teknico/juju-core/revert-add-relation-commit
<teknico> fwereade_, shall I directly "lbox submit", or do I still need to "lbox propose"?
<fwereade_> teknico, I think you need to do both
<teknico> fwereade_, https://codereview.appspot.com/7719050 , wanna have a look or shall I just submit it?
<fwereade_> teknico, LGTMed on trust for form's sake
<TheMue> rogpeppe: ping
<rogpeppe> TheMue: pong
<dimitern> benji: please wait before submitting - trunk is currently broken
<TheMue> rogpeppe: just updated and made a test ./...
<benji> dimitern: thanks for the heads-up
<TheMue> rogpeppe: now I've got this http://paste.ubuntu.com/5631081/
<TheMue> rogpeppe: any idea?
<dimitern> TheMue: that's the problem we have in trunk, pending a revert on last commit
<rogpeppe> TheMue: yes, the recently merged branch broke trunk
<TheMue> rogpeppe: ah, thx, so it's well known
<teknico> fwereade_, landed, sorry about that
<teknico> and everyone too :-)
<fwereade_> teknico, np, I'm pretty sure we've all done it, thanks for cleaning it up swiftly
<dimitern> teknico: it happened to me recently, i know the feeling
<rogpeppe> fwereade_: i replied to the "Synchronizing: watchers, GUI, and API" thread BTW. you might want to have a look and weigh in.
<fwereade_> rogpeppe, looks sane to me -- I don't feel any particular need to add to it, because I've shunted those details off into my mental "oakland" bucket
<rogpeppe> fwereade_: cool, thanks
 * rogpeppe goes for some lunch
<rogpeppe> back
<niemeyer> Greetings all
<niemeyer> mthaddon: Heya
<niemeyer> mthaddon: Can we do a quick store deploy today?
<mthaddon> niemeyer: sure, anyone from webops can do that - can you file an RT with the details?
<niemeyer> mthaddon: Sure
<mthaddon> thx
<niemeyer> mthaddon: Doing that right now
<niemeyer> mthaddon: #60244 is up
<_mup_> Bug #60244: install problem <ubiquity (Ubuntu):New> < https://launchpad.net/bugs/60244 >
<niemeyer> _mup_: no no no
<mthaddon> thx
<niemeyer> mthaddon: Thank you!
<mthaddon> niemeyer: https://pastebin.canonical.com/87252/ - has the branch location changed?
<niemeyer> mthaddon: Hmm
<niemeyer> mthaddon: It has, kind of
<niemeyer> mthaddon: It's still at lp:juju-core
<niemeyer> mthaddon: BUt the trunk owner has changed
<niemeyer> mthaddon: To ~juju
<niemeyer> mthaddon: If you had lp:juju-core, it should theoretically just work
<niemeyer> mthaddon: If you must use the explicit reference for some reason, then lp:~juju/juju-core/trunk should do
<mthaddon> bzr remember resolves the full URL I believe, so I'm not sure if that's an option - will adjust manually
<mthaddon> ok, happier now
<niemeyer> sweet
<mthaddon> niemeyer: deployed
<niemeyer> woot, thanks!
<rogpeppe> dimitern: i just saw this in trunk; any ideas?: http://paste.ubuntu.com/5631407/
<rogpeppe> dimitern: hmm, maybe not trunk actually, but i'm pretty sure the branch hasn't got a problem there.
<dimitern> rogpeppe: hmm, haven't seen this
<dimitern> rogpeppe: let me look closer
<dimitern> rogpeppe: does it happen every time?
<rogpeppe> dimitern: i'll let you know when this test finishes
<dimitern> rogpeppe: if it appears hung, just kill it - this happens sometimes on failure in the uniter tests
<rogpeppe> dimitern: it just passed
<rogpeppe> dimitern: i think it may just be a timeout that's too short
<dimitern> rogpeppe: so it's intermittent
<rogpeppe> (i hate those friggin' arbitrary timeouts)
<frankban> rogpeppe, dimitern: that seems similar to the 8 failures in trunk I mentioned yesterday, I was able to make the suite pass by increasing timeouts in uniter_test
<rogpeppe> frankban: yes, that's what i was thinking
<dimitern> rogpeppe: no this seems something else
<dimitern> rogpeppe: the unit's charm never gets installed for some reason
<dimitern> fwereade_: ideas? ^^
<dimitern> frankban: what were these failures and what did you change?
<fwereade_> dimitern, rogpeppe: looking
<rogpeppe> fwereade_: the most significant difference i can see is that the broken version goes into "awaiting error resolution for "start hook"" after "loading uniter state" where the ok version says "charm is not deployed"
<fwereade_> rogpeppe, yeah, I am staring at that in complete bafflement
<frankban> dimitern: I still see these failures in trunk -> http://pastebin.ubuntu.com/5631469/
<dimitern> frankban: at trunk tip?
<frankban> dimitern: tests passed when I locally increased the timeouts in uniter_test.go (from 5 to 20 seconds and from 50 to 200 milliseconds)
<frankban> dimitern: revno 1034
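
As an aside, a minimal sketch of what centralizing those timeouts could look like, so a slow machine only needs one knob to scale the whole suite; worstCase, pollDelay, and waitFor are illustrative assumptions, not the actual identifiers in uniter_test.go:

    package uniter_test

    import "time"

    // These would replace the scattered literal timeouts; a slow
    // machine can then tune the whole suite by editing two values.
    var (
        worstCase = 5 * time.Second       // how long to wait for an expected state
        pollDelay = 50 * time.Millisecond // how often to check for it
    )

    // waitFor polls done until it succeeds or worstCase elapses.
    func waitFor(done func() bool) bool {
        deadline := time.After(worstCase)
        for {
            select {
            case <-deadline:
                return false
            case <-time.After(pollDelay):
                if done() {
                    return true
                }
            }
        }
    }
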
<dimitern> frankban: hmm... i'm getting trunk and running tests now
<dimitern> frankban: have you seen these before? r1034 is gustavo's store CL - not related to the uniter
<frankban> dimitern: yes, 2 days ago, 1034 is just the revno of my last test run in trunk
<frankban> dimitern: I think it was 1013.
<rogpeppe> fwereade_: can i run something by you please?
<fwereade_> rogpeppe, sure
<dimitern> frankban: that's really strange - i run all tests on trunk at least 10 times a day and haven't seen these (apart from temporary failures when I was working on the uniter itself before proposing)
<rogpeppe> fwereade_: i'm looking at line 1353 on https://codereview.appspot.com/7598043/diff2/31001:88001/state/state_test.go
<rogpeppe> fwereade_: and wondering why we're calling Initialize on the state when all we want is to write the environ config
<rogpeppe> fwereade_: there are quite a few other places that do that too. and Initialize is not in fact called in the usual case in tests!
<rogpeppe> fwereade_: (AFAICS)
<rogpeppe> fwereade_: i'm thinking that we should always call state.Initialize, apart from the few places we actually want to test Initialize.
<fwereade_> rogpeppe, +1 to that
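
A rough sketch of the pattern rogpeppe is proposing, with the Initialize signature and helper names assumed from context rather than copied from juju-core:

    // Hypothetical SetUpTest: go through state.Initialize everywhere,
    // leaving direct environ-config writes to the few tests that
    // exercise Initialize itself.
    func (s *ConnSuite) SetUpTest(c *C) {
        cfg := testEnvironConfig(c) // assumed helper returning *config.Config
        st, err := state.Initialize(s.stateInfo(c), cfg)
        c.Assert(err, IsNil)
        s.State = st
    }
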
<dimitern> frankban: all tests pass for me @1036 - what's your go version?
<rogpeppe> fwereade_: i'll leave frankban's changes to go in and then we'll fix it in a later branch
<fwereade_> rogpeppe, cool, thanks
<frankban> rogpeppe: cool
<frankban> dimitern: 1.0.2
<frankban> dimitern: it seems that my wall clock is faster than yours, i.e. for some reason, my local machine is slower
<dimitern> frankban: i'm on 1.0.3, but this shouldn't be the problem really.. are you using HDD or SSD drive?
<frankban> dimitern: hybrid drive
<fwereade_> rogpeppe, so, the only scenario I can come up with when looking at the code in trunk is that there's some weird stale uniter state somewhere
<dimitern> frankban: which timeouts in uniter_test you needed to change?
<fwereade_> rogpeppe, but I don't see how that's possible either
<dimitern> frankban: i'm on ssd, but i think this is not the most common in the team, so also probably not an issue
<frankban> dimitern: I changed all, just to check that time was the problem there
<rogpeppe> fwereade_: i haven't looked at the code yet, i'm afraid. i'm just running all tests again to see if it happens again.
<rogpeppe> frankban: LGTM
<frankban> dimitern: I think Makyo reproduced those 8 failures too, in his local env
<frankban> rogpeppe: \o/
<rogpeppe> frankban: thanks a lot for your patience with this branch!
<Makyo> frankban, dimitern.  Yes.  I have a log saved, I think.
<dimitern> frankban: with changed timeouts how long does it take to run all tests? these are my timings: http://paste.ubuntu.com/5631522/
<frankban> rogpeppe: thank you for your help
<dimitern> Makyo: does it happen with the current trunk tip as well? also timeouts perhaps..
<fwereade_> TheMue, am I being dense? I can't see what happens when $JUJU_HOME is not set
<rogpeppe> fwereade_: i just saw the same failure again. not in trunk though - perhaps my megawatcher changes are making a difference. that wouldn't surprise me entirely, as we've got the allWatcher that might be affecting things.
<dimitern> fwereade_: i have some questions about how to properly test upgrade-charm cmd and a wip CL, if you have the time?
<fwereade_> dimitern, I am very keen to take a look but chat with TheMue takes priority, if he's available
<fwereade_> dimitern, send me the link and I'll let you know my thoughts
<dimitern> fwereade_: sure, np - https://codereview.appspot.com/7927043/
<Makyo> dimitern, checking.
<TheMue> fwereade_: if it's not set it will fallback to $HOME/.juju.
<TheMue> fwereade_: and if $HOME is also not set it will panic with an according message.
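
A minimal sketch of the lookup TheMue describes, using hypothetical names rather than the actual juju-core code:

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
    )

    // JujuHome returns the directory juju uses for its files, such as
    // environments.yaml: $JUJU_HOME if set, else $HOME/.juju, else it
    // panics with an explanatory message.
    func JujuHome() string {
        if jujuHome := os.Getenv("JUJU_HOME"); jujuHome != "" {
            return jujuHome
        }
        if home := os.Getenv("HOME"); home != "" {
            return filepath.Join(home, ".juju")
        }
        panic("cannot determine juju home: neither $JUJU_HOME nor $HOME is set")
    }

    func main() {
        fmt.Println(JujuHome())
    }
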
<TheMue> fwereade_: dimitern asked for a test, i'll see how I can do it. i already tested it by hand and it's working.
<dimitern> TheMue: PanicMatches checker?
<rogpeppe> dimitern, fwereade_: and another failure: http://paste.ubuntu.com/5631542/
<fwereade_> TheMue, hold on, we panic if we don't have a $HOME? why should we need a $HOME to load the config package?
<TheMue> dimitern: yep, remove both envs before and then call Restore... with the PanicMatches.
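
The test dimitern suggests might look roughly like this, assuming the gocheck conventions juju-core already uses and a hypothetical suite name:

    // Clear both variables, then assert that JujuHome panics with the
    // expected message. Getenv returns "" for unset and empty alike,
    // so setting them to "" is enough here.
    func (s *JujuHomeSuite) TestJujuHomePanics(c *C) {
        os.Setenv("JUJU_HOME", "")
        os.Setenv("HOME", "")
        c.Assert(func() { JujuHome() }, PanicMatches, "cannot determine juju home.*")
    }
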
<rogpeppe> dimitern: i think the allWatcher must be interfering with the uniter operation somehow, but i'm not sure how.
<TheMue> fwereade_: if we don't have a home, how do you know where to locate .juju?
<dimitern> rogpeppe: wow.. I've never seen this one - it's falling apart :)
<fwereade_> TheMue, why do we need to know that in order to manipulate a config file
<TheMue> fwereade_: we have a lot of places in the code where $HOME is read and taken, even if it is unset. Getenv returns "" then.
<fwereade_> dimitern, that latest one smells of races and luck in the change you made actually ;p
<TheMue> fwereade_: where do you read environments.yaml from if $HOME isn't set?
<dimitern> gents, have to step out for about half an hour, bbiab
<fwereade_> dimitern, having hit start-failed is not sufficient reason to believe we've run the subsequent upgrade
<fwereade_> dimitern, ttyl
<fwereade_> TheMue, but this panics on package load if $HOME is not set, right?
<fwereade_> TheMue, I don't think that's a reasonable requirement
<TheMue> fwereade_: yep, but I also can live with a different default for the juju home if $JUJU_HOME and $HOME aren't set. so what do you want JujuHome() to return then?
<fwereade_> TheMue, JujuHome can panic, that's fine
<TheMue> fwereade_: so we would panic a bit later. ;)
<TheMue> fwereade_: when the first one accesses it.
<rogpeppe> dimitern: wow, *everything* is failing now: http://paste.ubuntu.com/5631564/
<fwereade_> TheMue, right, and if that happens it's usually an error
<rogpeppe> dimitern: and that's without the allWatcher running.
<rogpeppe> dimitern: but this still isn't in trunk.
<rogpeppe> dimitern: so it may well be a problem i've introduced
<TheMue> fwereade_: so not JH() string but JH() (string, error)? no prob, please write it into the review.
<fwereade_> TheMue, I'm thinking about it
<fwereade_> rogpeppe, I'm baffled by the jump into start mode, but that other one seems like a real problem to me
<fwereade_> TheMue, I think the situation is usually better communicated by a panic than an error -- it generally indicates that one of us is not thinking properly and is calling the wrong code in the wrong context
<rogpeppe> fwereade_: ha, yes, i just saw that failure in trunk
<rogpeppe> dimitern: ^
<fwereade_> TheMue, but I also think we should have no risk of panic in the actual client: even if they don't have either set for whatever crazy reason, a goroutine trace is not a nice way to say hello
<rogpeppe> dimitern: so i don't think the issue is my fault, so i'm going to submit the allWatcher integration branch
<fwereade_> TheMue, in addition, even if $HOME is set, that is not enough reason to set JH
<benji> I think the tests are leaving un-cleaned-up files in /tmp/.  Is that a known issue?
<TheMue> fwereade_: IMHO it's a panic situation, but you're right about the info to the user.
<rogpeppe> dimitern: here's the latest failure i've seen, in trunk: http://paste.ubuntu.com/5631582/
<fwereade_> benji, it is not, they should in general be being cleaned up
<benji> fwereade_: thanks; I'll verify that it is and file an issue.
<fwereade_> benji, thanks
<TheMue> fwereade_: so I could switch from an automatic init() to a manual Init() error that has to be called in commands. and also a panic by JujuHome() if the init hasn't been done.
<rogpeppe> dimitern: it also has the same "[LOG] 15.63869 INFO: worker/uniter: awaiting error resolution for "start" hook" error
<fwereade_> rogpeppe, I'm sure that's a straight-up bug
<rogpeppe> fwereade_: yeah, i think it must  be
<rogpeppe> fwereade_: i was worried it was provoked by my allWatcher stuff, but i don't think so
<fwereade_> rogpeppe, well, actually, I don't know what it *is*, it still defies my understanding
<rogpeppe> fwereade_: well, i can reproduce the issue reasonably consistently
<fwereade_> rogpeppe, are you getting test droppings in your temp dir as well as benji?
<rogpeppe> fwereade_: in /tmp ?
<fwereade_> rogpeppe, yeah, but I forget where exactly
<fwereade_> rogpeppe, the code seems very clear that it *has* loaded a state file in a sane format, and that the state file told it what to do
<fwereade_> rogpeppe, which makes me wonder whether that one is a test isolation issue
<fwereade_> rogpeppe, although it's strange that it never came up before
<rogpeppe> fwereade_: it seems quite possible. perhaps the uniter isn't being cleaned up synchronously or something?
<fwereade_> rogpeppe, quite possible, I shall poke at it
<rogpeppe> fwereade_: i don't see any recent test droppings in /tmp
<fwereade_> rogpeppe, ok, cool, that is good to know
<rogpeppe> fwereade_: you might see if you can reproduce the issue with GOMAXPROCS=5 go test
<rogpeppe> fwereade_: also, i'm running go tip, which has a completely different scheduler now, which might be the trigger
<Makyo> dimitern, current trunk tests for me: http://paste.ubuntu.com/5631605/
<fwereade_> TheMue, I think so -- set it to panic-on-call by default, and make cmd/juju call something to make it safe
<TheMue> fwereade_: yep, will do so. you'll note it in the review too, please?
<fwereade_> TheMue, sure, I'm just soliciting opinions live before I do the review proper
<TheMue> fwereade_: yeah, sure, I only want to have it noted afterwards, in case I'm not remembering it correctly. I'm getting older. :D
<fwereade_> TheMue, sent
<TheMue> fwereade_: great, thank you
<TheMue> fwereade_: you ask why the juju home is set in the hook context. exactly there, during a hook execution, i had the case of no $JUJU_HOME and no $HOME.
<rogpeppe> fwereade_: you've got a review of the state dev docs branch, BTW
<TheMue> fwereade_: i sadly don't have the log anymore, but it has been the install hook.
<dimitern> back
<fwereade_> TheMue, my point is that I think agent code *should* fail if it even tried to look at JUJU_HOME
<fwereade_> TheMue, it's not its business at all
<fwereade_> TheMue, it's no more meaningful than trying to hit the ec2 metadata service from your own laptop
<TheMue> fwereade_: yeah, the change we discussed will lead to a removal of this setting.
<TheMue> fwereade_: good catch.
<fwereade_> rogpeppe, thanks, good comments
<dimitern> fwereade_: if you still feel like it, take a peek at my CL as well please
<fwereade_> dimitern, sorry, a load got added to the stack
<fwereade_> dimitern, fwiw I think I know the problem in the uniter tests
<dimitern> fwereade_: oh, what is it?
<dimitern> fwereade_: np about the cmd CL, i'll leave it for tomorrow
<fwereade_> dimitern, two things; :572 is asserting something it should be waiting for, and the tests aren't properly cleaning up after themselves
<fwereade_> dimitern, there's a RemoveAll(s.unitDir) that is not run quite as often as it should be if tests fail
<fwereade_> dimitern, if I check your review would you see if you can repro rog's failures against go tip?
<dimitern> fwereade_: sorry, i'm confused now
<dimitern> fwereade_: the 2 things are about uniter failures?
<fwereade_> dimitern, yeah, rog sent a number of pastes while you were away
<dimitern> fwereade_: these were test execution dumps, not code right?
<fwereade_> dimitern, yeah
<dimitern> fwereade_: and :572 is where?
<fwereade_> dimitern, line 572, the second forced upgrade error test iirc
<dimitern> fwereade_: let me check
<fwereade_> dimitern, it's a verifyCharm{1} that depends on unwarranted assumptions
<fwereade_> dimitern, then when that fails it takes out all subsequent tests because it doesn't clean up
<dimitern> fwereade_: is this the possible case of tests hanging on failure sometimes?
<dimitern> *cause
<dimitern> fwereade_: right! the RemoveAll in runUniterTests?
 * TheMue is AFK for some time, bbl
<fwereade_> dimitern, hanging forever is unrelated, is there a bug for that?
<fwereade_> dimitern, yeah
<dimitern> fwereade_: don't think so - but I've only seen it twice over a month probably and it was always when some other test fails first
<dimitern> fwereade_: so that RemoveAll should be deferred as well probably
<fwereade_> dimitern, can't remember what happens if RemoveAll fails
<fwereade_> dimitern, but yeah, I think that would do it
<dimitern> fwereade_: I *cannot* reproduce any of these failures against tip - tried 5 or 6 times already
<fwereade_> dimitern, hum, that is annoying
<dimitern> fwereade_: trying again now
<fwereade_> dimitern, quick hangout re upgrade-charm when you're free? or maybe over a beer later?
<dimitern> fwereade_: +1 for beer
<dimitern> fwereade_: still no joy - all tests pass
<fwereade_> ok: dimitern, rogpeppe, I need to visit the shops with some haste
<fwereade_> dimitern, if you make a test fail deliberately you can verify the effect of a deferred RemoveAll; if that's shown to work it'll get a swift LGTM on its own when I return
<fwereade_> dimitern, rogpeppe: but in the large I think this is a situation for skip-the-test, add-critical-bug
<rogpeppe> fwereade_: sounds reasonable
<fwereade_> dimitern, rogpeppe: I'm pretty sure I know what's going on but it won't be fixed tonight and I don't *think* in this case that it is better to back out the change
<fwereade_> ttyl all
<rogpeppe> dimitern: have you tried running against go tip, with GOMAXPROCS > n ?
<dimitern> i'll add the quickfix for RemoveAll and try
<dimitern> rogpeppe: no
<rogpeppe> dimitern: that's where i see the problem (the go tip scheduler is different)
<dimitern> rogpeppe: i have go tip at hand, but haven't updated recently
<rogpeppe> dimitern: "hg sync; cd src; ./all.bash" and bob's yer uncle
<dimitern> rogpeppe: hg pull you mean?
<rogpeppe> dimitern: yeah, same thing. i think "sync" is a synonym they introduced that also copes with CL changes.
<rogpeppe> dimitern: unless you're contributing to the go project, there's no need to use it.
<dimitern> rogpeppe: haven't seen it, it might be a plugin (reports an error here)
<rogpeppe> dimitern: yeah, it's a plugin
<rogpeppe> dimitern: just use hg pull; hg update tip instead
<dimitern> rogpeppe: done, building now
<dimitern> rogpeppe: what's n? ^^ 2 or more
<rogpeppe> dimitern: i used 5, one more than the number of cores in my machine
<dimitern> rogpeppe: ok
<dimitern> rogpeppe: how do i run it with 5 ?
<rogpeppe> dimitern: GOMAXPROCS=5 go test
<dimitern> rogpeppe: running all tests now, first without GOMAXPROCS
<rogpeppe> dimitern: i hope you've changed GOROOT and GOPATH
<rogpeppe> dimitern: otherwise you might be running the wrong binaries
<dimitern> rogpeppe: ofc, even run go version: go version devel +8cc853b84f89 Wed Mar 20 09:06:33 2013 -0700 linux/amd64
<rogpeppe> dimitern: cool
<rogpeppe> dimitern: i have a separate GOPATH dir for the alternative go compiler, so there's no chance of confusion
<dimitern> rogpeppe: actually, do I have to wipe out pkg/* as well as changing GOROOT?
<rogpeppe> dimitern: probably not, as everything will have changed, but in general yes
<rogpeppe> dimitern: that's why i have a separate copy of everything in its own GOPATH
<dimitern> rogpeppe: so you have 2 separate bzr working trees?
<rogpeppe> dimitern: yup
<rogpeppe> dimitern: and a command (go-alt-pull) that pulls the current branch in my usual working dir into the alternative branch
<dimitern> rogpeppe: with a corresponding magical shortcut, beginning with ",x.." ? :)
<rogpeppe> dimitern: (most of the time i work with go tip, but i test against 1.0.2 before submitting
<rogpeppe> )
<dimitern> all tests pass with trunk/tip and go/tip, now will try GOMAXPROCS=5
<rogpeppe> dimitern: nah, i just type go-alt-pull
<rogpeppe> :-)
<rogpeppe> dimitern: ,x just edits.
<dimitern> rogpeppe: ahaa
<rogpeppe> dimitern: ("," is short for "0,$" and "x" says do something for everything matching the following regexp)
<dimitern> rogpeppe: apart from some weird warnings in decode/cgo or something, no other issues
<rogpeppe> dimitern: hmm.
<rogpeppe> dimitern: try it a few times :-)
<dimitern> i have some failures now
<rogpeppe> dimitern: cool
<rogpeppe> dimitern: the same ones?
<dimitern> in rpc though http://paste.ubuntu.com/5631798/
<dimitern> fwereade_: riak testing charm surprisingly has rev 7
<dimitern> rogpeppe: nope, no errors, will run it a couple of times again
<dimitern> rogpeppe: (apart from the rpc panic)
<rogpeppe> dimitern: ah, i think i know why that happens, and it should be fixed by dfc's upcoming branch
<dimitern> rogpeppe: the logging stuff?
<dimitern> rogpeppe: on second run (n=5) rpc passed
<rogpeppe> dimitern: yeah, the race setting log.Target
<dimitern> rogpeppe: it failed now, but with a different issue (SIGSEGV) http://paste.ubuntu.com/5631814/
<rogpeppe> dimitern: hmm, i've not seen that before
<rogpeppe> re
<dimitern> rogpeppe: how often do the uniter tests fail like that for you?
<rogpeppe> dimitern: most times
<rogpeppe> dimitern: i can reproduce by just running the uniter tests too
<dimitern> rogpeppe: rpc failed again (the same issue), but the uniter tests passed (i noticed with n=5 they are slightly faster - about 10-15%)
<dimitern> running again without n=5
<fwereade_> rogpeppe, does my sketchy judgment that it looks like (1) a genuinely flaky test that fails in verifyCharm, and (2) poor isolation causing all subsequent test cases to fail
<fwereade_> rogpeppe, ...match your own observations?
<dimitern> fwereade_: imo 2 is definitely happening
<rogpeppe> fwereade_: i haven't been delving into it as much as dimitern
<dimitern> fwereade_: alas, not for me
<dimitern> that segfault is troubling though
<rogpeppe> fwereade_: (i figured both you and dimitern are closer to the problem than me, and will likely find the fix much sooner)
<fwereade_> dimitern, while I have concern about the things you're observing I am focused pretty hard on the uniter tests
<fwereade_> rogpeppe, ofc -- that was just something tossed into the air in case it were to get shot down with an instant "well no I've also seen X, Y, Z" :)
<fwereade_> dimitern, agreed
<fwereade_> dimitern, 8ish @cuba?
<dimitern> wow, new panic now (second run, n=1 go tip): http://paste.ubuntu.com/5631831/
<dimitern> fwereade_: sgtm
<dimitern> I've seen this before once
<dimitern> and it's a pass (except the panic)
<rogpeppe> fwereade_: looking at the errors, yes, i think your analysis seems likely
<dimitern> fwereade_: so my thoughts about the cleanup: there are 2 approaches - extract the loop body of runUniterTests and add a defer RemoveAll there; OR just add an additional defer RemoveAll in runUniterTests
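
The first approach could look roughly like this; the identifiers are assumptions based on the discussion, not the real uniter test code:

    // Pull the loop body of runUniterTests into a closure so a
    // deferred RemoveAll runs for every test case, even when an
    // assertion fails part-way through.
    func (s *UniterSuite) runUniterTests(c *C, tests []uniterTest) {
        for _, t := range tests {
            func() {
                defer os.RemoveAll(s.unitDir) // cleanup happens on failure too
                s.runUniterTest(c, t)         // assumed per-case runner
            }()
        }
    }
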
<rogpeppe> dimitern: hmm, i can't see how that could happen
<dimitern> rogpeppe: what?
<rogpeppe> dimitern: the panic in randomHexToken
<dimitern> rogpeppe: today's the day for obscure test failures it seems :)
<rogpeppe> dimitern: unless... what revision of goose are you using?
<dimitern> maybe it's biorhythms or full moon or smth
<dimitern> rogpeppe: haven't updated recently, that could be it, lemme check
<rogpeppe> dimitern: yup, that's it
<rogpeppe> dimitern: if you're before r80, that's the reason
<dimitern> rogpeppe: i was, now @80, running again with n=5/tip
<dimitern> now the uniter tests failed..goroutine 4574  wtf?!
<dimitern> http://paste.ubuntu.com/5631858/
<dimitern> maybe mgo has a hard time coping with a lot of concurrent procs?
<fwereade_> dimitern, whoa, that definitely deserves a bug, probably against mgo... niemeyer? does a theory leap out at you?
<fwereade_> actually, I have to be off for supper
<dimitern> fwereade_: so see you around 8 @cuba then
<fwereade_> rogpeppe, would you skip that  test and add a critical bug/card for it please?
<rogpeppe> fwereade_: the uniter test?
<fwereade_> rogpeppe, if nobody's around to approve it I pre-LGTM a single line that just skips the one that fails for you :)
<fwereade_> rogpeppe, please, if you have time
<rogpeppe> fwereade_: ok, will do
<fwereade_> rogpeppe, awesome, tyvm
<fwereade_> later all
<dimitern> i'll call it a day as well
<dimitern> night y'all
<rogpeppe> dimitern: g'night
<hazmat> seems odd that a co of juju-core gets .   submit branch: /home/rog/src/go/src/launchpad.net/juju-core/juju/.bzr/cobzr/go-state-units-under-service
<rogpeppe> hazmat: that is really weird
<rogpeppe> hazmat: i think that branch dates from when i did see some weirdness around there
<hazmat> rogpeppe, just some cached metadata in the branch
<hazmat> not an issue, just odd
<rogpeppe> hazmat: it may have happened when i did a push --overwrite, possibly
<rogpeppe> anyway, eod here
<hazmat> rogpeppe, cheers
<rogpeppe> hazmat: cheers to you too
<rogpeppe> and g'night all others
<niemeyer> fwereade_: I can't see how there could be a nil reference in that line.. mongoServers is part of the cluster by value
<niemeyer> fwereade_: Do you know what Go version this is running on?
<thumper> morning folks
 * thumper does the conflict resolution dance again
<robbiew> lol
 * thumper sighs
<thumper> never just as simple as resolving conflicts
 * thumper fixes test failures, import errors and other boring bits
 * thumper wishes go had a build_tests
<thumper> so I could check all the tests built before running them...
<mramm> that would be nice
<davecheney> thumper: any final comments on https://codereview.appspot.com/7524046
<davecheney> i don't think this bike shed will stand up to any more coats of paint
 * thumper looks
<thumper> davecheney: morning
<davecheney> morning
<davecheney> gutter cleaners are here today in the complex
<davecheney> and they have brought the Australian strategic reserve of petrol-powered leaf blowers
<thumper> davecheney: defer log.SetTarget(log.SetTarget(c))
<thumper> does the defer command evaluate all the args at the call time?
<thumper> davecheney: I'm assuming that log.SetTarget(c) is evaluated prior to the defer stashing the method away for later
<davecheney> thumper: yes
<davecheney> so the inner log.SetTarget(c) sets the target to c and returns the previous value
<davecheney> that value is captured by the defer and passed to the outer log.SetTarget when run
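
A self-contained illustration of that evaluation order, using a stand-in setTarget rather than the real log package:

    package main

    import "fmt"

    var target = "old"

    // setTarget swaps the current target and returns the previous one,
    // mimicking the log.SetTarget pattern under discussion.
    func setTarget(t string) (prev string) {
        prev, target = target, t
        return prev
    }

    func main() {
        fmt.Println(target) // old
        func() {
            // defer evaluates its arguments immediately: the inner
            // setTarget("new") runs now, and its return value ("old")
            // is what the deferred outer call restores on exit.
            defer setTarget(setTarget("new"))
            fmt.Println(target) // new
        }()
        fmt.Println(target) // old again: the deferred call restored it
    }
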
<thumper> davecheney: got time for a hangout chat?
<davecheney> sure
<davecheney> lemmie go upstairs
<thumper> ok
#juju-dev 2013-03-21
<thumper> davecheney: so how do we define a custom error type again?
<thumper> davecheney: what do I search for in code?
<thumper> davecheney: actually nm.
<thumper> davecheney: the error I care about is a NotFoundError, which seems sane...
<davecheney> anything that has an Error() string method satisifies the error interface
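
For example, a minimal NotFoundError of the kind thumper mentions; the names are illustrative, not necessarily the juju-core definitions:

    package main

    import "fmt"

    // NotFoundError records what could not be found. Its Error method
    // is all it needs to satisfy the error interface.
    type NotFoundError struct {
        What string
    }

    func (e *NotFoundError) Error() string {
        return fmt.Sprintf("%s not found", e.What)
    }

    // IsNotFound reports whether err is a *NotFoundError.
    func IsNotFound(err error) bool {
        _, ok := err.(*NotFoundError)
        return ok
    }

    func main() {
        var err error = &NotFoundError{What: "machine 42"}
        fmt.Println(err, IsNotFound(err)) // machine 42 not found true
    }
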
<thumper> gah...
<thumper> tests seem to be blocking again for some reason
<thumper> how the hell can I know if the tests have stopped running?
<thumper> I think it is getting deadlocked somewhere
<davecheney> cntl-\
<davecheney> speaking of wtf
<thumper> oh, handy
<davecheney> we have a method to override the default dial timeout when calling the state.Open
<davecheney> BUT IT IS NEVER USED !!!
<davecheney> well, that isn't strictly true, it is used inconsistently
<thumper> hmm...
<thumper> it seemed to be stuck in the guts of go
<thumper> not in our code
<thumper> wtf?
<thumper> davecheney: can I pastebin for you to take a look at?
<davecheney> kk
<thumper> http://paste.ubuntu.com/5633106/
<davecheney> you might get a cleaner stack trace if you send SIGQUIT to the test process
<davecheney> in go 1.0
<davecheney> cntl-\ will panic both the test program, and the test runner
<davecheney> so you get two stack traces intermixed
<davecheney> i fixed that for 1.1
<davecheney> try again, and high the $PKG.test with a SIGQUIT
<thumper> kk
<davecheney> s/high/hit
<thumper> http://paste.ubuntu.com/5633115/
<thumper> davecheney: ^^
<thumper> davecheney: can you see what it is doing?
<davecheney> sorry, i think that is the wrong process
<davecheney> that is the `go` process
<davecheney> you want its child
 * thumper sighs
<thumper> I seem to have a bucketload of mongod processes and go test processes running
<davecheney> you will have at least 8
<thumper> I had about 30 something tests, and 4 different mongod processes with 8 children each
<thumper> I've killed them all, and will try again
<davecheney> might be a good idea to clean out your /tmp
<davecheney> there will be lots of junk in there
<thumper> hmm...
<thumper> there is a lot of mongo*.sock
<thumper> davecheney: ok, it has blocked again...
<thumper> davecheney: using htop to look at the process tree
<thumper> davecheney: it seems that go test ./... has five different /tmp/go-build/.... something.test children
<thumper> I thought it wouldn't run in parallel?
<davecheney> it does not run the tests inside a package in parallel
<davecheney> but it will test packages in parallel
<thumper> ok
<davecheney> go test -p 1 ./...
<thumper> well, they seem to have somehow deadlocked each other
<davecheney> will disable
<thumper> oh ffs
<thumper> even with -p 1, the tests hang
<davecheney> which test is it
<davecheney> there will be one
<thumper> can't tell
 * thumper is busy killing things
<thumper> I just killed the system mongod process
<thumper> and rerunning the tests after switching to trunk
<thumper> best to start from a mostly known state
<thumper> getting further this time
<davecheney> btw, there is a working mongo PPA now
 * thumper hates how gocheck doesn't clean up its test directories
<davecheney> if you want to switch to that
<thumper> davecheney: what was the problem?
<davecheney> no idea
<thumper> so... what changed?
<davecheney> the latest build jools did worked
<davecheney> dunno, ask him
<davecheney> i just copied his package into our ppa
<thumper> not for me it didn't
<davecheney> https://launchpad.net/~juju/+archive/experimental
<davecheney> pls try this one
<thumper> I timed the tests here:
<thumper> real	2m42.342s
<thumper> user	5m34.444s
<thumper> sys	0m26.396s
<thumper> gah...
<thumper> second run through fails
<davecheney> do you have the test output ?
<thumper> by fails I mean hangs
<thumper> I had switched to my branch though
<thumper> I'm back on trunk again, and trying that
<thumper> I'm beginning to wonder if it is my work
<thumper> which while not impossible
<thumper> would surprise me
<thumper> as I didn't think I was doing anything weird
<thumper> but I must be
<thumper> trunk worked again
<thumper> well, got further than before
 * thumper back for the meeting later
<davecheney> kk
<wallyworld> jam: can you change ownership of goose to ~juju instead of ~gophers since i can't triage bugs etc anymore
<jam> wallyworld: I'm not in ~gophers anymore either. I thought you said you had super powers still for LP
<wallyworld> i was wrong :-(
<wallyworld> jam: but you registered the project, so you should be able to do it
<wallyworld> change ownership
<jam> https://launchpad.net/~gophers/+members#active looks like only Gustavo is in ~gophers. I'll try
<jam> wallyworld: shouldn't it be an edit link beside "Maintainers" ?
<wallyworld> jam: yeah, if you can't see it then you don't have permission. just to check, the administer link
<jam> yeah, I tried that one too
<jam> just lets me set "aliases" for the project
<wallyworld> balls, i thought you would be able to
<jam> I can apparently set whether it tracks bugs in launchpad, but not change who can access those
<wallyworld> oh well, we'll have to ask gustavo to change ownership etc
<wallyworld> i filed a bug before but couldn't triage or set importance etc
 * wallyworld relocates
<jam> wallyworld_: seconded your rsyslogd stuff
<wallyworld_> ta, will land
<rogpeppe> mornin' all
<dimitern> rogpeppe: morning
<rogpeppe> dimitern: hiya
<thumper> hi rogpeppe, dimitern
<rogpeppe> thumper: hya
<rogpeppe> thumper: hi, even
<dimitern> thumper: yo!
<fwereade__> hey all
<dimitern> fwereade__: heyhey
<fwereade__> dimitern, heyhey
<jam> morning mgz
<jam> wallyworld: side note I didn't want to forget about. Did we need to change the default firewall ports to include the rsyslogd port?
<wallyworld> jam: i didn't change anything, but it worked regardless
<wallyworld> jam: i had a mysql node and it logged back to bootstrap with no fw changes
<jam> wallyworld: so supposedly they are all in the same security group, so everything can talk to everything in the group, I wasn't sure if that was true on HP/Canonistack
<wallyworld> it appears to be true - i verified it works by trying it out :-)
<thumper> jam: can blue take this one ? https://canonical.leankit.com/Boards/View/103148069/104151606
<jam> thumper: that looks like one that is intended for us, but I have a really hard time going from URL links to finding the actual cards on the board
<thumper> jam: I'll move it to blue backlog
<thumper> jam: it has been moved to blue todo
<wallyworld> jam: mgz: dimitern:  this is a critical bug for release tomorrow, could you +1 it and i'll land after the standup later? https://codereview.appspot.com/7937044/
<jam> wallyworld: I have an in-progress review of it. I'll make sure to finish that.
<jam> your String() function looks like it exists in 2 places.
<jam> ah, nm
<wallyworld> yes, different structs
<jam> genericId vs genericInstanceId
<dimitern> wallyworld: i'm on it
<wallyworld> thanks guys :-)
<jam> rogpeppe: you submitted https://code.launchpad.net/~rogpeppe/juju-core/212-api-doc/+merge/147919 as of yesterday, can we move your 'write API design' card to Merged?
<rogpeppe> jam: i wasn't aware that i'd submitted it
<rogpeppe> jam: i still haven't seen two LGTMs
<rogpeppe> jam: in fact, i definitely haven't submitted it
<jam> rogpeppe: yeah sorry, I was looking at https://codereview.appspot.com/7919043/ and saw the last thing was submitted, but clearly that was the wrong link I had followed.
<rogpeppe> jam: np
<jam> rogpeppe: as a doc, which is better than having nothing, I'm willing to have you land it with a single +1
<rogpeppe> jam: thanks for your review BTW
<jam> so 'trivial'
<rogpeppe> jam: ok, thanks. will do.
<jam> TheMue: looking here: https://launchpad.net/~gophers/+related-projects there is "golxc" is that something that should stay in ~gophers or move to ~juju?
<rogpeppe> jam: did my responses to your questions seem reasonable, BTW?
<jam> rogpeppe: so I think the answer is: (a) it is stateful, but we can just put more processes/machines in front
<jam> (b) as a websocket, you maintain the connection until it gets interrupted, in which case you have to set up the state from scratch again
<jam> (c) we know this design is a bit worrisome because it needs to know about all changes, but that is a whole different discussion I'm happy to defer for now :)
<rogpeppe> jam: a) i'm not sure "in front" is right there - we'd be replicating the API server itself, not putting more things in front of it.
<jam> rogpeppe: in front of mongo
<rogpeppe> jam: ah yes, indeed
<jam> rogpeppe: on that point, could you put an haproxy sort of thing in front to load balance?
<rogpeppe> jam: yeah, that's what i'd do
<jam> as long as it knew to maintain the websocket to the same api server?
<rogpeppe> jam: yup
<jam> I'm not as familiar with websocket, but presumably it is "give me a connection and then stop pretending I'm talking HTTP"
<rogpeppe> jam: yeah, it hijacks the connection AFAIR
<rogpeppe> jam: but it has some cruft in the middle too (it does packets)
<jam> rogpeppe: so i'm happy to chat about the design, etc, but none of that blocks the landing of a doc that describes what we have/you are actually building.
<rogpeppe> jam: sure
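
For reference, a minimal sketch of the websocket shape being discussed, using golang.org/x/net/websocket as a stand-in for whatever the API server actually uses:

    package main

    import (
        "io"
        "log"
        "net/http"

        "golang.org/x/net/websocket"
    )

    func main() {
        // The handler hijacks the HTTP connection; each client keeps
        // its websocket open until something interrupts it, at which
        // point it must reconnect and rebuild its state from scratch.
        http.Handle("/api", websocket.Handler(func(ws *websocket.Conn) {
            io.Copy(ws, ws) // echo frames back until the connection drops
        }))
        log.Fatal(http.ListenAndServe(":8080", nil))
    }
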
<thumper> fwereade__: Rietveld: https://codereview.appspot.com/7943043
 * thumper is done for the day now
<fwereade__> dimitern, ping
<dimitern> fwereade__: pong
<fwereade__> dimitern, was the source of that bug yesterday clear?
<dimitern> fwereade__: which one?
<fwereade__> dimitern, the one roger saw, that cascaded nastily
<fwereade__> dimitern, rogpeppe: actually would you just update me quickly on what's done/planned wrt that issue?
<dimitern> fwereade__: not really, no (for me at least, and couldn't reproduce it)
<rogpeppe> fwereade__: i just disabled the test
<rogpeppe> fwereade__: and filed a bug
<rogpeppe> fwereade__: assigned to dimitern :-)
<dimitern> fwereade__: i filed the one about mgo today btw
<rogpeppe> dimitern: what one was that?
<dimitern> rogpeppe: bug 1158190
<_mup_> Bug #1158190: intermittent failure with go tip and GOMAXPROCS=5 <mgo:New> < https://launchpad.net/bugs/1158190 >
<fwereade__> dimitern, cool, thanks
<fwereade__> rogpeppe, ok, cool, thanks
<rogpeppe> dimitern: which revision of mgo are you using?
<dimitern> rogpeppe: how can i check? bzr info doesn't say
<rogpeppe> dimitern: i use bzr log
<dimitern> rogpeppe: ah, bzr info -h also shows that i found. so rev 183
<rogpeppe> dimitern: ok, cool, that's the same one as me
<dimitern> sorry, that was bzr info -v
<rogpeppe> dimitern: that's weird then; i can't see how it could get a nil pointer error on that line
<dimitern> rogpeppe: panic dumps don't lie :)
<rogpeppe> dimitern: actually, they can quite often be out by a line
<dimitern> rogpeppe: oh, i didn't know this
<rogpeppe> dimitern: it's *usually* fairly obvious
<rogpeppe> dimitern: and in this case, i can't see an obvious candidate (servers is not nil, and that's the only way that line can panic AFAICS)
<dimitern> rogpeppe: weird..
<rogpeppe> dimitern: tip has had some very significant changes recently. i wouldn't entirely rule out some memory-corruption issue.
<fwereade__> long lunch today, bbl
<TheMue> lunchtime too
<rogpeppe> dimitern, fwreade: i just saw another uniter test failure in trunk : http://paste.ubuntu.com/5633753/
<dimitern> rogpeppe: this looks like the same issue
<rogpeppe> dimitern: i'm not sure. it first dies in a different test, for a different initial reason
<rogpeppe> dimitern: (it first dies in "hook error service dying")
<dimitern> rogpeppe: is it consistently failing?
<rogpeppe> dimitern: nope
<rogpeppe> dimitern: i saw that one after about 10 runs
<rogpeppe> dimitern: all with different values for GOMAXPROCS
<rogpeppe> dimitern: (that one was with GOMAXPROCS=60)
<dimitern> rogpeppe: so values of n > 2xnumber of cores still work?
<rogpeppe> dimitern: yeah, you can have any number
<rogpeppe> dimitern: it's just the number of processes that can be running cpu-bound stuff at once
<rogpeppe> dimitern: i'm continually running dfc's stresstest shell script
<dimitern> rogpeppe: i see, so what should we do about it?
<rogpeppe> dimitern: we should delve in and try to understand what's happening
<rogpeppe> dimitern: i'd start by comparing the logs from the passing test with the logs from the failing test and see where they diverge
<dimitern> rogpeppe: ok
<jam> rogpeppe, fwereade__: I'm trying to make the change to default to "precise". I can easily update the test. However, when I change it, it only tests the logic inside Config.New() (where if the value is empty it gets auto-set to a new value)
<jam> it does not test the value in schema.Defaults
<jam> do you know how to trigger schema.Defaults?
<rogpeppe> jam: schema.Defaults is triggered when a config attribute isn't specified, no?
<jam> rogpeppe: I would think so, but if I just change the line in "New" then all paths return the value I specify
 * rogpeppe goes to look
<jam> So, AIUI, there should be "not specified" as separate from "specified as the empty string"
<jam> rogpeppe: ah, maybe I'm on crack because "default-series": version.Current.Series *is* precise on my machine.
<jam> And even though I'm monkey-patching the value during testing, it is too late, because the value is already in the map
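
A self-contained sketch of that distinction; applyDefaults is hypothetical and merely stands in for the schema package's behaviour:

    package main

    import "fmt"

    // applyDefaults fills in a default only when the key is absent,
    // which is different from the key being present as "".
    func applyDefaults(attrs, defaults map[string]string) map[string]string {
        out := make(map[string]string, len(attrs))
        for k, v := range attrs {
            out[k] = v
        }
        for k, v := range defaults {
            if _, present := out[k]; !present { // absent, not merely empty
                out[k] = v
            }
        }
        return out
    }

    func main() {
        defaults := map[string]string{"default-series": "precise"}
        fmt.Println(applyDefaults(map[string]string{}, defaults))                     // default fires
        fmt.Println(applyDefaults(map[string]string{"default-series": ""}, defaults)) // empty string wins
    }
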
<rogpeppe> jam: ha! (you're still running precise?)
<jam> rogpeppe: the last one to support Unity-2D, and 3D doesn't work very well in a VM
<rogpeppe> jam: ah, i see
<rogpeppe> jam: i didn't realise you used a VM
<jam> not always, but I'm in Windows to do the windows building, etc.
<jam> but I can't run the test suite there
<jam> so VM
<rogpeppe> of course
<jam> I do have a raring VM which works ok, but the "3D" support is pretty poor for virtualbox
<jam> even with the "allow 3D for guest" checked.
<gary_poster> fwereade__, fwiw, bug 1131608 is still a blocker for us to fully deliver (developing the charm needs it)
<_mup_> Bug #1131608: deployed series is arbitrary <juju-core:New for fwereade> < https://launchpad.net/bugs/1131608 >
<fwereade__> gary_poster, I *think* that, as of thumper's branch proposed this morning, it should actually be resolved
<gary_poster> fwereade__, oh, that would be great, thanks.
<fwereade__> gary_poster, I'm pretty sure that's the last piece that needed to be added
 * TheMue forgot to restart irc after reboot *facepalm*
<fwereade__> rogpeppe, ping
<rogpeppe> fwereade__: pong
<fwereade__> rogpeppe, kanban :)
<rogpeppe> fwereade__: oh, bugger
<fwereade__> rogpeppe, ping
<rogpeppe> fwereade__: pong
<fwereade__> rogpeppe, since you can repro it, would you try out a fix for that uniter test please?
<rogpeppe> fwereade__: sure
<fwereade__> rogpeppe, line 570 should read:
<rogpeppe> fwereade__: let me just clone $GOROOT :-)
<fwereade__> info: "upgrade failed",
<fwereade__> ...and that's it
<fwereade__> rogpeppe, I *think*
<rogpeppe> fwereade__: so where's the source of the indeterminacy?
<rogpeppe> fwereade__: i'm first verifying i can still reproduce the bug; then i'll try the fix.
<fwereade__> rogpeppe, that we wait until a few steps *before* the point we care about, and then assert we're at that point
<rogpeppe> fwereade__: ah, i think i see - "hook failed: "start"" is just a stage on the way to "upgrade failed"
<rogpeppe> fwereade__: yeah
<fwereade__> rogpeppe, but actually wait
<fwereade__> rogpeppe, I am suddenly very confused by the test
<fwereade__> rogpeppe, even if it works, something's funny
<fwereade__> rogpeppe, fuck, it's harder than I thought
<rogpeppe> fwereade__: yeah, sorry, still fails
<rogpeppe> fwereade__: it would be great to sort out the test isolation issue too
<fwereade__> rogpeppe, yeah, I'm feeling a bit blocked on the thing I picked up, might take a proper look t both of those
<rogpeppe> pwd
 * rogpeppe wishes that gocheck printed a searchable-for string with every assertion failure
<mgz> rogpeppe: like what exactly?
<rogpeppe> mgz: like "assertion failed"
<mgz> ah, you mean just searchable in the output
<rogpeppe> mgz: yeah
<mgz> yeah.
<rogpeppe> mgz: 'cos some tests (looking at uniter here) produce heroic quantities of output, and finding the errors is not easy!
<rogpeppe> mgz: ah, this'll work pretty well: search for _test\.go
<mgz> ha
<rogpeppe> fwereade__: i just saw another uniter test failure (in trunk, or nearly trunk, this time). in steadyUpgradeTests.
<rogpeppe> fwereade__: same symptom (never got expected hooks)
<fwereade__> rogpeppe, cool, I'll take a look there too, might be similar
<rogpeppe> fwereade__: will paste you the output if you want
<fwereade__> rogpeppe, please
<rogpeppe> fwereade__: http://paste.ubuntu.com/5634430/
<fwereade__> rogpeppe, thanks
<rogpeppe> pretty simple cleanup CL, if anyone wants to take a look: https://codereview.appspot.com/7945044
<fwereade__> rogpeppe, when you have a moment, would you try lp:~fwereade/juju-core/fix-1157898 please?
<rogpeppe> fwereade__: running tests on it now
<fwereade__> rogpeppe, that other one you saw is profoundly weird... it looks like the hook is (maybe?) running but the juju-log tool is not
<fwereade__> rogpeppe, if that fix works I might add a couple of logging lines before I propose, to help diagnose the other one if we see it again
<rogpeppe> fwereade__: looking good so far - three full tests without incident
<fwereade__> rogpeppe, excellent
<rogpeppe> fwereade__: so what's the significance of the code move in Uniter.deploy ?
<fwereade__> rogpeppe, to make the things I said in the test true -- specifically, by delaying the SetCharm until the operation is not stoppable
<fwereade__> rogpeppe, the critical point is doing so after the download has completed, after that it doesn't check Dying until it's done
<rogpeppe> fwereade__: so is that a genuine bug?
<fwereade__> rogpeppe, motivationwise, a bit of a hack; otherwise a reasonable change, I think... could definitely be argued to be more correct to not set a charm until you actually *have* the charm in hand
<rogpeppe> fwereade__: sounds reasonable to me
<fwereade__> rogpeppe, I think it is arguable that it was -- I'm not quite sure how it would behave after having set a charm but finding itself unable to download
<fwereade__> rogpeppe, but the test was definitely racy, and I think that now it no longer is
<rogpeppe> fwereade__: 5 iterations good so far
<fwereade__> rogpeppe, awesomesauce
<rogpeppe> fwereade__: and the Reset fix looks... why didn't we see that before? :-)
<rogpeppe> 6
<rogpeppe> pwd
<fwereade__> rogpeppe, haha :)
<benji> I'm seeing test failures on trunk, is that a known thing?  After skimming the scroll-back I didn't see mention of it.
<fwereade__> benji, I am not aware of any myself, would you paste them please?
<benji> fwereade__: http://paste.ubuntu.com/5634632/
<benji> I'll check back after lunch and see if things are better.
<fwereade__> benji, I think that might be jam's change
<fwereade__> jam, are you (1) around and (2) running on precise?
<dimitern> fwereade__: i think he mentioned he's running on precise
<fwereade__> dimitern, grumble grumble, I bet that's it
<rogpeppe> fwereade__: 12
<fwereade__> rogpeppe, more awesomeness, I'm starting to feel good about it
<mgz> `; beep` ha, a sign of a test suite that takes too long to run #;0~~
<dimitern> fwereade__: i think frankban could reproduce these failures as well
<fwereade__> dimitern, good point
<fwereade__> frankban, am I right in thinking you were seeing the uniter test failure that rog disabled yesterday?
<mgz> confirmed that's a regression due to r1044 on quantal
<mgz> shall I just back the change out for now, as it's jam's eod?
<frankban> fwereade__, dimitern: yes you are
<frankban> fwereade__: 8 failures in uniter_test
<fwereade__> frankban, would you try out lp:~fwereade/juju-core/fix-1157898 please?
<mgz> fwereade__: ^
<frankban> fwereade__: sure
<fwereade__> mgz, yes please
<fwereade__> mgz, I'll write him a bug
<fwereade__> mgz, hmm, should it be a bug? I'll just mail him
<mgz> mail or note in the review should be fine I'd say
<mgz> proposed, and will go ahead and submit
<mgz> benji: please pull lp:juju-core
<mgz> jam: merge lp:~gz/juju-core/backout_r1044 into your feature branch, revert+fix, and repropose
<dimitern> mgz: i doubt jam is around to do this at this time
<mgz> he has the log :)
<mgz> I don't expect him to do it till his next work day
<mgz> (true, this kind of thing could also go to the list)
<dimitern> but that would be sunday, right? i think it's better to back out that now, since trunk is broken
<fwereade__> dimitern, it is backed out :)
<frankban> fwereade__: tests pass in your branch
<fwereade__> frankban, sweet, tyvm
<frankban> fwereade__: np
<dimitern> fwereade__: oh, ok :) sorry i missed this
<rogpeppe> fwereade__: i just saw this after 22 successful runs: http://paste.ubuntu.com/5634687/
<fwereade__> rogpeppe, that's outside the scope of what I did, so I think I'm going to propose it as is for now
<rogpeppe> fwereade__: sure. just so's you know :-)
<dimitern> rogpeppe: are you running the tests with dave's stress testing script?
<rogpeppe> dimitern: yeah
<dimitern> rogpeppe: cool! so it's as stable as it gets for now after 22 runs
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: it's still not good though
 * rogpeppe says, with a serious face on
<dimitern> rogpeppe: why? have you seen the same issue again
<rogpeppe> dimitern: no, but any intermittently failing test is Bad
<rogpeppe> dimitern: even if it's "only" once every 22 times
<dimitern> rogpeppe: ah, sure :) but this one at least seems fixed
<rogpeppe> dimitern: yup
<fwereade__> dammit, must dash: if anyone fancies https://codereview.appspot.com/7950043 I'd be really happy
<rogpeppe> fwereade__: will take a look
<rogpeppe> reminder to self: never interrupt bzr at work
<rogpeppe> g'night all
<rogpeppe> i have a few CLs up for review if anyone fancies taking a look; they all have kanban tickets
<benji> fwereade__: the trunk appears happy now, thanks
<thumper> morning
<thumper> fwereade__: don't suppose you are around?
 * thumper wonders if there was anything else to email the list with...
<fwereade__> thumper, heyhey
<fwereade__> thumper, not really, but kinda
 * thumper was confused there for a minute
<thumper> fwereade__: I'm just wondering about testing the tools selection for start instance
<thumper> fwereade__: also, tools are specified by bootstrap...
<thumper> fwereade__: trying to finish this off before looking at the machine things we talked about last night my time
<fwereade__> thumper, still thinking about the tests, but not quite following on bootstrap
<thumper> well... you said that the tools weren't set in the start instance params
<thumper> well, they aren't if coming from StartInstance, but are from Bootstrap
<fwereade__> thumper, tools selection with --upload-tools should Just Work once we get rid of the weird multi-bucket fallback stuff in tools
<fwereade__> thumper, assuming we set default-series and agent-version at upload time, anyway
<fwereade__> thumper, ofc none of that is written yet
<thumper> :)
<fwereade__> thumper, but I think it's the stuff I was talking about in the saner bits of my last email
<thumper> I'm not clear on why the multibucket stuff needs to change
<thumper> fwereade__: which we do, except for it being mildly broken
<thumper> fwereade__: by version.Current anyway
<fwereade__> thumper, mainly because the falling back is deeply confusing to me
<fwereade__> thumper, so long as we only use the first bucket that has any tools, I can figure it out quite easily
<fwereade__> thumper, but the way bad matches from closer buckets beat good matches from distant buckets breaks my brain
<thumper> hmm...
<thumper> so, here is a question...
<thumper> if I have a development version of tools I have uploaded
<thumper> how does this interact with start instance?
<thumper> when it is looking for tools
<thumper> this bit still confuses me
<thumper> but I could probably just talk with dave about that when he starts
<thumper> if we just used the version defined in agent-version, we may not get my special uploaded tools
<fwereade__> thumper, I *think* that if we do agent-version right, we can easily pick the tools right
<fwereade__> thumper, but I confess to some uncertainty around the magic insertion of the dev-version flag
<fwereade__> thumper, there might be a "development" field in env-config that comes into play somewhere
<fwereade__> thumper, but I shouldn't really be getting into this now tbh
<thumper> :)
<fwereade__> thumper, if I'm still awake when nobody else is I'll swing by again
<thumper> fwereade__: just sleep :)
<fwereade__> thumper, if I don't, happy weekend :)
<thumper> you too
<thumper> long weekend for me
<thumper> davecheney: morning
<thumper> davecheney: I have a question (or two)
<davecheney> thumper: shoot mate
<thumper> davecheney: apart from all those other questions on the email list
<davecheney> i haven't got through all the correspondence on the list yet
<thumper> davecheney: https://canonical.leankit.com/Boards/View/103148069/104140367 is this done by the change I recently landed in worker/provisioner/provisioner.go line 264?
<thumper> davecheney: in trunk as of this morning
<davecheney> gimme a sec
<thumper> if so, yay, another thing done quickly
<davecheney> i suspect it is
<davecheney> the provisioner calls StartInstance
<thumper> cool, I'll move it to done
<davecheney> please hold, confirming
<davecheney> thumper: LGTM,
 * thumper holds (even though he has already moved the card)
 * davecheney has much love for bzr log --show-diff
<davecheney> wow, 3 LGTMs on the logging bikeshed
<davecheney> fuck yeah
<davecheney> i'm gonna submit that before anyone changes their mind
<thumper> :)
<thumper> davecheney: I saw two, but thought I'd throw mine in too for good measure
<davecheney> wallyworld_: has a point
<davecheney> but I prefer cut -f for log splitting
<davecheney> so that is how I wrote it
<thumper> I don't think wallyworld_ has a point
<thumper> don't put the colon in
<thumper> better without
<thumper> given a defined timestamp, and a defined severity
<thumper> it is trivial to get those two out, or ignore them if need be
<thumper> adding a colon adds a character for no real benefit
<davecheney> okie dokes
<davecheney> shit, all the lanes are full
<davecheney> better get cracking
<thumper> one reason I did a few reviews :)
<davecheney> i'm going to move rogers api doc back out of review
 * thumper heads out for lunch
<thumper> back later
<davecheney> it's been in there since before atlanta
#juju-dev 2013-03-22
<davecheney> chasing another LGTM, https://codereview.appspot.com/7782049/
<davecheney> please
<wallyworld> davecheney: looking
<davecheney> wallyworld: ty
<wallyworld> davecheney: by the looks of it, most of the changes are just mechanical
<wallyworld> davecheney: in provisioner_test, was not using TestingDialTimeout an oversight?
<thumper> hmm....
<thumper> coffee...
<thumper> I think I should go make one
<wallyworld> bigjools just finished making me a coffee :-D
<wallyworld> he's getting better at it, needs more practice
<bigjools> no more for you then you cheeky cnut
<wallyworld> but you need the practice
<lifeless> wallyworld: o/
<wallyworld> hello there
<wallyworld> how's things?
<lifeless> wallyworld: pretty good, other than being stuck in san antonio :)
<lifeless> wallyworld: you?
<wallyworld> lifeless: well, switched teams to work on juju. interesting project. learning Go has had its ups and downs. main issues for me are the lack of expected tools and std lib functionality cf more mature platforms like python etc
<lifeless> yah
<lifeless> its hard at the start of a setup like that
<lifeless> specially when it doesn't eagerly use existing libraries
<wallyworld> lifeless: san antonio? was that pycon or something?
<lifeless> pycon was in santa clara, I was there
<lifeless> did a side trip to san antonio to visit rackspace; plane I was meant to fly back to SJC on was bust
<lifeless> so an extra unplanned night here
<lifeless> fortunately I had a day leeway before my return flight from SJC to NZ
<thumper> hi lifeless
<lifeless> hi thumper :)
<wallyworld> lifeless: did you see "dongle-gate"
<wallyworld> wtf happened there? sounds like a storm in a tea cup all blown out of proportion
 * thumper invokes barry's 2nd law (http://barry.warsaw.us/software/laws.html)
<lifeless> wallyworld: I only heard bout it after the conf
<davecheney> no thumper ?
<davecheney> http://paste.ubuntu.com/5636105/
<davecheney> getting this very consistently
 * davecheney can't figure out why the hook tests are so unreliable
<rogpeppe> mornin' all
<fwereade__> rogpeppe, heyhey
<rogpeppe> fwereade__: yo!
<fwereade__> rogpeppe, would you take a quick look at https://codereview.appspot.com/7950043/ please?
<rogpeppe> fwereade__: looking
<fwereade__> rogpeppe, after that I'd like a quick chat about the params.*Info types
<rogpeppe> fwereade__: sounds good
<rogpeppe> fwereade__: reviewed
<rogpeppe> fwereade__: afk for a little bit, back in a short while
<rogpeppe> fwereade__: back
<fwereade__> rogpeppe, I think you're right about the Reset but I got distracted by broken trunk :/
<rogpeppe> fwereade__: that there should be a Flush in the TearDownTest too?
<fwereade__> rogpeppe, yeah, but I need to think about that bit a little more
<rogpeppe> fwereade__: shall we have that chat?
<fwereade__> rogpeppe, with you in a sec, would you start a hangout please?
<rogpeppe> fwereade__: https://plus.google.com/hangouts/_/405ae790bfb55499116d2961f7e7f2ecc352fb68?authuser=0&hl=en-GB
<rogpeppe> fwereade__: any chance you could jump back into that hangout? i just wanted to chat about one other thing
<fwereade__> rogpeppe, sorry, totally didn't see that
<rogpeppe> fwereade__: np. i was just wonderin about the suggestion to drop version.Current.
<rogpeppe> fwereade__: it seems a bit odd to me - the whole point is so that a given binary can know what version it is
<fwereade__> rogpeppe, I don't think it's sane, I'm responding now
<rogpeppe> fwereade__: and report it as such
<rogpeppe> fwereade__: ok, cool
<rogpeppe> davechen1y: hiya
<rogpeppe> davechen1y: can you tell us why dropping version.Current is a good idea?
<davechen1y> rogpeppe: because it is non deterministic
<davechen1y> i can build and test a cli on one series, and run it on another series and it does different things
<rogpeppe> davechen1y: but version.Current isn't about testing
<rogpeppe> davechen1y: it's about giving a unit the ability to find out what version it's running
<rogpeppe> davechen1y: or a juju binary too
<davechen1y> yes, and it is that multitude of roles that is causing the problem
<davechen1y> i think version.Current needs to be replaced by several things which are specific to the role of the caller
<fwereade__> davechen1y, many uses of version.Current are crack but many are not
<rogpeppe> davechen1y: how would you find out the current version?
<davechen1y> rogpeppe: sure, that is _1_ use for version.Current
<davechen1y> arguably the correct one
<rogpeppe> davechen1y: well, that's why i implemented it...
<davechen1y> but the use of version.Current for 'fuck, I need a version value, but I don't have one, fuckit, version.Current works'
<rogpeppe> davechen1y: agreed totally
<davechen1y> throughout the code is causing tim problems
<rogpeppe> davechen1y: but that doesn't imply that we should drop version.Current
<davechen1y> and it is _those_ cases we need to fix
<rogpeppe> davechen1y: just that we shouldn't use it in tests
<davechen1y> rogpeppe: please don't read too much into my use of the word drop
<davechen1y> my use of the word drop was as a tool: expose, then audit all the places where version.Current is being used (mostly incorrectly)
<davechen1y> then, once we've found and fixed them, we can put version.Current back
<davechen1y> this would probably all happen within one change proposal
<rogpeppe> davechen1y: ah, when i see a subject line like "version.Current must die"  and your proposal to "delete version.Current" that sounded pretty unambiguous :-)
<rogpeppe> davechen1y: i don't think we need to remove it
<rogpeppe> davechen1y: just grep for it
<davechen1y> well, tim wants to kill it, and I want to bisect it
<rogpeppe> davechen1y: tim has no proposal for how you would *actually* find the current version
<davechen1y> rogpeppe: i cna't speak for him fully
<davechen1y> but I believe he believes most of those pieces of data come from the environment configuration
<rogpeppe> davechen1y: another possibility is to always set version.Current at the start of the tests, to a known value
<davechen1y> i cannot speak further to this
<rogpeppe> davechen1y: actually none of those pieces of data come directly from the env config
<davechen1y> rogpeppe: like I said, I cannot explain his plans
<davechen1y> i am not fully aware of them
<rogpeppe> davechen1y: np. sorry, i thought you'd discussed it at length.
<davechen1y> this is only what I recall from our conversations
<davechen1y> we'd talked at length about the problem
<davechen1y> ... less so about the solution
<rogpeppe> davechen1y: what do you think about just setting version.Current to a known value at the start of tests?
<davechen1y> i don't think tim will accept that
<davechen1y> and that doesn't address the problem that we use it in too many places for too many things
<rogpeppe> davechen1y: then testing.DefaultVersion() ?
<davechen1y> rogpeppe: i don't want to speak too much to this
<davechen1y> i don't know tims plans
<rogpeppe> davechen1y: ok
<davechen1y> but this is more than just testing
<davechen1y> and is related to the set of cards he has to do with inserting series into start instance
<rogpeppe> davechen1y: ok
<rogpeppe> davechen1y: thanks for the input
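A minimal sketch of the approach discussed above: pin version.Current to a known value at the start of a test and restore it afterwards, so results never depend on the build host's series or arch. The helper name is an assumption for illustration, not juju-core's actual API.

    package testing

    import "launchpad.net/juju-core/version"

    // SetTestVersion pins version.Current (a version.Binary) to a known
    // value and returns a func that restores the original, e.g.
    //   defer SetTestVersion(version.MustParseBinary("1.16.0-precise-amd64"))()
    func SetTestVersion(v version.Binary) (restore func()) {
        orig := version.Current
        version.Current = v
        return func() { version.Current = orig }
    }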
<davechen1y> rogpeppe: related, your issue you raised that looks like it was a crash in the hash map code
<davechen1y> there are two things that might be causing that
<davechen1y> 1, is gc can run concurrently with hashmap insertion
<davechen1y> and corrupt the hash map
<davechen1y> we believe we've fixed those bugs now
<davechen1y> so tip right now shouldn't do that
<davechen1y> assuming that is the cause
<davechen1y> 2. this might actually be real concurrent hash map mutation
<rogpeppe> davechen1y: it died in append - i didn't think it looked like it was in the hashmap code
<rogpeppe> davechen1y: but perhaps you were looking at some other aspect of the trace
<davechen1y> but the gdb you got pointed to the hashmap iterator
<davechen1y> for sure, things with the new GC are not all beer and skittles
<rogpeppe> davechen1y: ah, the second one, yes
<rogpeppe> davechen1y: yeah, i'm not really surprised - it's a big change
<rogpeppe> davechen1y: i don't know that i entirely trust the disass output though - i was dumping a binary that might have been different from the original
<davechen1y> oh
<davechen1y> well that could point the finger at anything
<rogpeppe> davechen1y: well, i *think* i rebuilt the same binary, but i'm not entirely sure
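For reference, the failure mode behind davechen1y's case 2: goroutines writing to one Go map without synchronization can corrupt its internal hash table. A generic sketch of the usual mutex guard, not the juju code in question.

    package main

    import "sync"

    // safeCounts guards a plain map with a mutex; without the lock,
    // concurrent writes can corrupt the map and crash the runtime.
    type safeCounts struct {
        mu sync.Mutex
        m  map[string]int
    }

    func (c *safeCounts) inc(key string) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.m[key]++
    }

    func main() {
        c := &safeCounts{m: make(map[string]int)}
        var wg sync.WaitGroup
        for i := 0; i < 10; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                c.inc("hits")
            }()
        }
        wg.Wait()
    }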
 * davechen1y wishes he'd been able to convince rsc to allow go test -c ./...
<rogpeppe> davechen1y: i really tried (perhaps a bit too hard actually)
<rogpeppe> davechen1y: workaround: http://paste.ubuntu.com/5636443/ (i call it "gotest-c")
<davechen1y> nah, rsc was right, we're trying to get 1.1 out the door
<davechen1y> not the time for new features
<davechen1y> and we can always just patch our go trees
<davechen1y> or use your workaround
<rogpeppe> davechen1y: true 'nuff
<rogpeppe> davechen1y: i wish we didn't use cgo at all. then i could be absolutely sure it was a Go bug, not something we're doing wrong at the C boundary
<davechen1y> rogpeppe: good point
<davechen1y> i was talking to jam on wed that cross compiling for win32 would be much simpler if we didn't use cgo for goyaml
<rogpeppe> davechen1y: if only we could just ditch yaml
<TheMue> rogpeppe: +1 ;)
<davechen1y> rogpeppe: i like the cut of your jib, sir
<davechen1y> fwereade__: could I trouble you to finish the sentence you started yesterday, https://docs.google.com/a/canonical.com/document/d/1d_b6VTt1bNDCNCrrIVEmEtNIoZi1toGvt5MvVBnxGxw/edit
<fwereade__> davechen1y, done, sorry
<davechen1y> fwereade__: no problem
<davechen1y> if the tree passes i'll do the release tomorrow morning
<fwereade__> davechen1y, cool, thanks
<TheMue> so, with the latest change all tests should pass (just running). then final cleanup and propose.
<davechen1y> TheMue: yes, the tree passes at the moment
<davechen1y> did anyone hear from jam or mgz about a public bucket to push tools' for hp cloud ?
<TheMue> *hmpf* one fail left
<davechen1y> TheMue: my tree passed, what is failing for you ?
<TheMue> davechen1y: not the current trunk, only my branch with my changes. ;) they affect several places.
<davechen1y> sadface
<TheMue> davechen1y: why?
<fwereade__> TheMue, things affecting several places :)
<fwereade__> davechen1y, sorry, I don't know about an hp public bucket
<TheMue> fwereade__: hehe, yep, never simple.
<davechen1y> s'ok
<davechen1y> i wrote to jam and mgz
<davechen1y> worst case we can always publish an addendum to the release notes
<fwereade__> davechen1y, sgtm
<TheMue> *ouch* that panic is new and definitely not caused by my change. will repeat the test, maybe an intermittent one.
<TheMue> yip, passes
<fwereade__> TheMue, if so, please make sure it's recorded
<TheMue> fwereade__: yep
<TheMue> *very-big-smile* all pass
<TheMue> lunchtime
<bac> hi rogpeppe
<rogpeppe> bac: hiya
<bac> rogpeppe: i've got a question about charm Meta -- you the right guy?
<rogpeppe> bac: ... maybe :-)
<bac> rogpeppe: for charms that don't provide a 'limit', the default is 0; except for logging, where the limit in the provides section is set to 1. i'm trying to figure out if that is correct and a general rule
<bac> so if you look at testing/repo/series/logging/metadata.yaml you'll see no limit set for the 'provides' section.  but those values are being returned as 1
<dimitern> bac: take a look at charm/meta.go how the yaml gets parsed and default values filled in
<bac> dimitern: i have and i didn't find my answer.
<rogpeppe> bac: hmm, i wasn't previously aware of the "limit" attribute
<bac> dimitern: i don't, however, understand the 'coerce' bits
<rogpeppe> bac: Coerce is kinda clever
<rogpeppe> fwereade__: does limit limit the number of units that can join a particular relation?
<dimitern> bac: judging from Coerce and the schema (ifaceSchema) limit can be either nil (empty, like "limit: ") or an int
<fwereade__> rogpeppe, I *think* that limit is meant to mean how many services that relation can be established with
<rogpeppe> fwereade__: i think that's what i meant actually
<fwereade__> rogpeppe, but to the best of my knowledge there has never been code that actually interprets it
<rogpeppe> fwereade__: ok, so bac's question is kinda moot
<fwereade__> rogpeppe, yeah
<fwereade__> rogpeppe, there's a bug for it somewhere
<fwereade__> rogpeppe, and also for, er, is it "required"?
<dimitern> fwereade__: yeah, that was weird for me when i saw it as well
<fwereade__> rogpeppe, some other meta relation field anyway
<rogpeppe> fwereade__: i'm trying to think of a scenario where it might be useful
<bac> rogpeppe: it is moot, but right now i have to change my expected results to match something i don't really expect (1 vs 0) and that makes me uncomfortable.
<rogpeppe> bac: ha, i see.
<fwereade__> rogpeppe, it doesn't make much sense to have wordpress talking to two different databases
<bac> if that is the case, i'd rather just remove it from the results i return.
<fwereade__> bac, +1 to not representing it in the gui
<rogpeppe> fwereade__: hmm, i didn't realise you could have a "requires" relation with more than one service
<bac> fwereade__: ok, so remove it from the results i send back.  sound good to me.
<fwereade__> bac, sgtm
<rogpeppe> fwereade__: so currently, you can do: juju deploy wordpress mysql1 mysql2; juju add-relation wordpress mysql1; juju add-relation wordpress mysql2 and it'll "work" ?
<fwereade__> rogpeppe, I am not aware of anything that would prevent it
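For context, a hedged sketch of the metadata shape under discussion, parsed with goyaml: a pointer field keeps "limit not set" (nil) distinct from an explicit 0. The struct is illustrative, not charm/meta.go's actual types.

    package main

    import (
        "fmt"

        "launchpad.net/goyaml"
    )

    // relation mirrors one entry in a charm's provides/requires section.
    // Limit is a pointer so an absent limit (nil) differs from 0.
    type relation struct {
        Interface string `yaml:"interface"`
        Limit     *int   `yaml:"limit"`
    }

    func main() {
        var r relation
        if err := goyaml.Unmarshal([]byte("interface: logging\n"), &r); err != nil {
            panic(err)
        }
        fmt.Println("limit set:", r.Limit != nil) // prints "limit set: false"
    }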
<dimitern> mramm: i see you added the on call reviewer schedule to the calendar, but how about what we discussed - having 2 people each day (distributed across time zones)?
<rogpeppe> fwereade__: hmm
<fwereade__> dimitern, I think we decided to go with the simplest thing for now
<dimitern> fwereade__: ah, ok then
<fwereade__> rogpeppe, TheMue, mramm: I have to pick up laura today, I won't make the kanban meeting I'm afraid
<rogpeppe> fwereade__: ok
<TheMue> fwereade__: ok, thx for the info.
<mramm> dimitern: I don't know how to do time zone distribution in a reasonable way -- there are basically 2 zones, EU and AU, with 5 and 3 people in them, plus john who is sort of in the middle.
<mramm> and I thought we had decided to just do 1 person for now
<mramm> but if we need time-zone spread, we can do that
<dimitern> mramm: well, we can have 1 guy around AU and 1 around EU each day maybe?
<mramm> well, that means that the AU guys have to do it more than 1x week
<dimitern> mramm: for now it's ok, but we should think about it if it proves insufficient (slow reviews e.g.)
<mramm> right
<mramm> we will improve as needed
<bac> rogpeppe, dimitern: if you've got time to do a review i'd appreciate it:  https://codereview.appspot.com/7668045
<dimitern> bac: i'll take a look in a bit
<bac> thanks!
<rogpeppe> bac: me too
<bac> \o/
<dimitern> bac: reviewed
<bac> thanks
<fwereade__> TheMue, would you explain the high-level thinking behind the various funcs for manipulating home/juju_home?
<fwereade__> TheMue, I'm having some difficulty forming a picture of what gets called when and why
<rogpeppe> fwereade__: latest in the version.Current saga: tests fail when run on quantal 386
<rogpeppe> fwereade__: because imagesData doesn't have any image for that combination
<fwereade__> rogpeppe, I'm not surprised, I'm starting to think we basically shouldn't ever be using real series
<rogpeppe> fwereade__: in testing, yeah.
<fwereade__> rogpeppe, indeed :)
<rogpeppe> fwereade__: except live tests, i guess
<fwereade__> rogpeppe, true, but they need distinct config anyway
<rogpeppe> fwereade__: yup
<fwereade__> TheMue, ping
<TheMue> fwereade__: just fetched a tea ;) and wanted to answer your question
<fwereade__> TheMue, heyhey, I was just looking at the JUJU_HOME stuff
<TheMue> fwereade__: Init() is to init the juju home based on $JUJU_HOME or $HOME
<TheMue> fwereade__: JujuHome() returns it
<fwereade__> TheMue, the big question in my mind is why we're calling Init() in test setup
<TheMue> JujuHomePath() allows to build a path to a file/dir in juju home
<TheMue> fwereade__: because otherwise some tests are failing (with a panic) when they retrieve the juju home somewhere in a nested function
<TheMue> fwereade__: I first tried it w/o any init and then dug down step by step
<fwereade__> TheMue, but why would we want to require that HOME or JUJU_HOME be set in order to run tests?
<TheMue> fwereade__: so if a test tests Foo(), which calls Bar() and that Yadda() and Yadda() references to the juju home, you simply need it
<fwereade__> TheMue, no
<fwereade__> TheMue, we should not need any magic env vars set just to run the tests
<TheMue> fwereade__: we do, already now
<fwereade__> TheMue, those? or others?
<TheMue> fwereade__: we have a lot of code which relies on $HOME
<TheMue> fwereade__: and those tests would fail w/o a $HOME too
<fwereade__> TheMue, I thought this branch was replacing that?
<TheMue> fwereade__: this branch now replaces the static init()
<TheMue> fwereade__: but the code using $HOME is nested
<TheMue> fwereade__: we said JujuHome() shall panic, if $HOME or $JUJU_HOME are not set
<fwereade__> TheMue, so what? surely the tests should be manipulating exactly one thing -- the value of jujuHome
<TheMue> fwereade__: so the replacement of the old direct code with the usage of the new API would panic
<fwereade__> TheMue, and we have SetTestJujuHome for that
<fwereade__> TheMue, but you have made STJH panic if Init hasn't been called, which STM to be missing the point
<TheMue> fwereade__: ok, sounds reasonable
<fwereade__> TheMue, also, what's the deal with origJujuHome?
<TheMue> fwereade__: but still there is code that is initialized statically in tests and relies on a juju home (or on $HOME in the past)
<fwereade__> TheMue, can you give me an example of when we need to call Init in a test (that isn't testing Init? ;p)
<TheMue> fwereade__: it allows to restore it, even if $JUJU_HOME as env var has been changed inside the tests
<fwereade__> TheMue, well, kinda
<TheMue> fwereade__: one moment, cleaned up my editor, have to find the place
<fwereade__> TheMue, but it stops you being able to nest `defer STJH(STJH(c.MkDir()))`
<fwereade__> TheMue, if you do that when it's already been done somewhere else, things will act weird
<TheMue> fwereade__: https://codereview.appspot.com/7923043/diff/12001/environs/cloudinit/cloudinit_test.go?column_width=120 line 26ff
<fwereade__> TheMue, I see no reason to call Init there
<TheMue> fwereade__: why do you want to nest setting a test home?
<fwereade__> TheMue, ah, ok, I do -- Init and SetTestJujuHome are equally bad there
<TheMue> fwereade__: ok, setting home should work too, yes. inside of config.New() today $HOME/.juju is accessed
<fwereade__> TheMue, because when you see that line, you think it's bug-free in and of itself, but actually it's not... or maybe it is, but it becomes really hard to tell when you have all these different mechanisms crossing over
<TheMue> fwereade__: sorry, can't follow your last sentence
<fwereade__> TheMue, sorry
<fwereade__> TheMue, I'm saying that there is too much surprising action at a distance for me to follow
<TheMue> fwereade__: but yes, I can follow your idea of using STJH() for initialization in tests too. my thinking fell short here.
<fwereade__> TheMue, but I don't seem to have communicated why it is wrong to call Init in tests?
<fwereade__> TheMue, I wouldn't have STJH in code but I'm not typing it out every time in irc ;)
<fwereade__> TheMue, you should not be calling Init in tests because it's making a global state change that cannot be restored
<TheMue> fwereade__: what I meant is, that STJH shouldn't rely on an Init before
<fwereade__> TheMue, ah, got you
<TheMue> fwereade__: it can be restored
<TheMue> fwereade__: that's why ...Orig is only set once and never changed again
<fwereade__> TheMue, original correct state -- with jujuHome="" -- *cannot* be restored
<fwereade__> TheMue, it will panic if you try
<fwereade__> TheMue, if you call Init in one test, other tests in the package might or might not run Inited, depending on circumstance
<fwereade__> TheMue, am I making sense to you?
<TheMue> fwereade__: would have liked to see it painted on a whiteboard. *lol*
<TheMue> fwereade__: but yes, Restore... relies on Init before
<TheMue> fwereade__: and that's the failure
<fwereade__> TheMue, what's the actual use case for Restore... anyway?
<fwereade__> TheMue, can't we just just STJH throughout?
<TheMue> fwereade__: hmm, that would be possible. but I would prefer a mechanism like in FakeHome: an own type with a Restore method
<TheMue> fwereade__: so the call would be defer STJH("foo").Restore()
<fwereade__> TheMue, that would be fine too, my problem is that we have a load of different ways to do basically the same thing
<TheMue> fwereade__: yes, we have, the problem with a universal programming language ;) but roger's idea is very elegant
<fwereade__> TheMue, in fact that would be nice, because you can just stick it in a field with clear intent
<fwereade__> TheMue, sorry, which idea?
<TheMue> fwereade__: the idea of an own type (based on a string) with a Restore method
<fwereade__> TheMue, we have two fakeHome types already I think, I would be hoping to merge those rather than to create a new one:)
<TheMue> fwereade__: i left them almost untouched for now, but i think they may be merged and better integrated into the juju home idea in future (refactoring card for post 13.04)
<fwereade__> TheMue, but the main point is that config.Init() must not be called in tests, except for those testing itself
<TheMue> fwereade__: yep, got you
<fwereade__> TheMue, long review sent I'm afraid
<TheMue> fwereade__: ok, thx. will handle it with care. ;)
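A sketch of the "own type with a Restore method" idea above, assuming a package-level jujuHome string as in the branch under review; all names are illustrative.

    package config

    import "os"

    // jujuHome is the package-level state being discussed.
    var jujuHome string

    // FakeJujuHome remembers the previous home so it can be put back.
    type FakeJujuHome string

    // SetTestJujuHome swaps in a test home without requiring a prior
    // Init, enabling: defer SetTestJujuHome(c.MkDir()).Restore()
    func SetTestJujuHome(home string) FakeJujuHome {
        orig := jujuHome
        jujuHome = home
        os.Setenv("JUJU_HOME", home)
        return FakeJujuHome(orig)
    }

    // Restore reinstates the original value, even the pristine "".
    func (f FakeJujuHome) Restore() {
        jujuHome = string(f)
        os.Setenv("JUJU_HOME", string(f))
    }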
<fwereade__> aww poo, self-evaluation
<fwereade__> guess I won't be getting anything else done today
<TheMue> fwereade__: btw, the cachePath change to $JUJU_HOME/cache works because of the setting of $JUJU_HOME in Init if it's not yet set
<rogpeppe> fwereade__: oh bugger, that
 * TheMue did this evaluation stuff this morning
<fwereade__> TheMue, but that doesn't happen
<TheMue> fwereade__: what doesn't happen?
<fwereade__> TheMue, setting $JUJU_HOME in Init
<fwereade__> TheMue, which is a good thing, because I can think of no good reason to do so
<fwereade__> TheMue, but it does mean the charm store will only work in very specialized circumstances
<rogpeppe> fwereade__: i'm afraid i got horribly diverted today by a feasibility study of whether it was possible to test our transactions thoroughly. http://play.golang.org/p/iyfOvjcWR1
<rogpeppe> fwereade__: answer: it would be possible, but quite expensive.
<TheMue> fwereade__: argh, sorry, missed it when changing the code. had it in there before.
<fwereade__> rogpeppe, that's cool, but, yeah, not lightweight
<rogpeppe> fwereade__: if our ops on mongo weren't so slow, it might be feasible
<rogpeppe> fwereade__: the nice thing is it can test arbitrary transactions together in a thorough way
<fwereade__> rogpeppe, that is very nice, indeed
<rogpeppe> fwereade__: it's something that we might consider writing tests for, even if they took days to run, just so we can have some kind of assurance.
<fwereade__> rogpeppe, yeah, I would be very keen on future work in this direction
<rogpeppe> fwereade__: i'll save it for later :-)
<rogpeppe> fwereade__: the only major assumption is that when an operation generates a transaction, it's a deterministic function of the state it's previously seen.
<fwereade__> happy weekends everyone
<dimitern> fwereade__: you too! :)
<rogpeppe> fwereade__: and you!
<rogpeppe> happy weekends all!
<dimitern> rogpeppe: same to you :)
#juju-dev 2013-03-23
<Rondonimus> Hi
#juju-dev 2014-03-17
<davecheney> rick_h_: it's *always* mongodb
<waigani> I'm hitting an error that I can't track down. Here is my wip: https://codereview.appspot.com/76670043
<waigani> error: launchpad.net/juju-core/testing/testbase.PatchEnvPathPrepend(0): not defined
<davecheney> wallyworld: launchpad.net/juju-core/testing/testbase.PatchEnvPathPrepend(0)
<davecheney> ^ still importing the lp version
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1293269
<_mup_> Bug #1293269: juju fails to destroy environment due to missing configuration key <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1293269>
<davecheney> great, now destroy environment doesn't work
<thumper> o/ davecheney
<thumper> I have a branch for the local provider lxc settings
<thumper> davecheney: ah, I know what you did there
<davecheney> thumper: thanks
<thumper> davecheney: that was all me...
<thumper> they used to be omit, then I changed the default to "", and now they are omit again
<thumper> pretty sure they were only omit for releases
<davecheney> :(
<thumper> dev envs are likely screwed
<davecheney> here is my environment.yaml
<davecheney>     local:
<davecheney>         type: local
<davecheney> <<EOF
<thumper> yeah...
<thumper> you will need to do the manual thing
<thumper> although, I have a bug on my plate to fix destroy environment
<thumper> however due to the early dying
<thumper> won't help this case
<davecheney> fudge
<davecheney> thumper: has this screwed 1.17.4 -> 1.17.5 upgrades ?
<thumper> davecheney: no, those are fine
<thumper> the setting was changed and changed back in between releases
<davecheney> right
<davecheney> cool
<davecheney> sorta
<thumper> yeah...
<thumper> kinda
 * davecheney remains alarmed, but not alert
<bodie_> rick_h_, davecheney thanks :) I'm on the packaged Mongo for Ubuntu 14.04 -- tried building 2.4.9 from source but I got tons of build errors for some reason.  bleh
<davecheney> bodie_: that is the worst of all choices
<davecheney> sudo apt-get install juju-mongodb
<bodie_> oy
<bodie_> ok
<bodie_> much appreciated
<davecheney> this should have already happened for you
<davecheney> this package is a dep of juju-local
<davecheney> is this a documentation bug ?
<bodie_> Maybe it didn't go in because I already had Mongo
<bodie_> but, I don't think Go would fetch a package, would it?  Just source, I thought
<bodie_> I'm still just getting my head wrapped around the whole thing, I think I may have been told the packaged Mongo was suitable
<davecheney> not suitable
<davecheney> mandatory :)
<davecheney> this smells like a bug
<davecheney> juju should detect that you had /usr/bin/mongodb
<davecheney> where we require
<davecheney> /usr/lib/juju/bin/mongodb [sic]
<davecheney> if you uninstall mongo
<bodie_> hm
<davecheney> and run a juju command it *should* give you reliable advice
<davecheney> thumper: well fuck, how do I delete that environment then ?
<davecheney> just table flip it ?
<thumper> yeah...
<thumper> lxc-destroy all the machines
<thumper> stop the services
<thumper> and delete the upstart entries
<thumper> delete the rsyslog entry
<davecheney> right-o
<thumper> and delete everything from /var/lib/juju/containers
<bodie_> I think I have some kinda conflict with a mongo already sitting on my system, but I just now removed everything mongo using apt... maybe the one I'd built from source
<bodie_> yeah
<bodie_> okay, removed that and now it's suggesting I install mongodb-server
<bodie_> but you're saying I need to use juju-mongodb
<davecheney> bodie_: this is absolutely a bug
<bodie_> yay, I accomplished something
<wallyworld> davecheney: you recently updated the ec2 provider to add ppc and arm arches to tools lookup constraints, right?
<davecheney> wallyworld: only ppc
<davecheney> someone else is to blame for arm
<davecheney> arm64
<wallyworld> davecheney: because we have a list of hard coded aws instances types (m1.small, etc) which are recorded as supporting i386 and amd64 only
<davecheney> wallyworld: i know
<wallyworld> hence image matching will fail
<davecheney> this was raised in review and I believe those lines were removed
<wallyworld> which lines?
<wallyworld> the tools lookup arches?
<davecheney> the ones in the ec2 and azure provider from memory
<wallyworld> so
<wallyworld> 	return &simplestreams.MetadataLookupParams{
<wallyworld> 		Series:        e.ecfg().DefaultSeries(),
<wallyworld> 		Region:        region,
<wallyworld> 		Endpoint:      ec2Region.EC2Endpoint,
<wallyworld> 		Architectures: []string{"amd64", "i386", "arm", "arm64", "ppc64"},
<wallyworld> 	}, nil
<davecheney> guess I didn't get all of them
<wallyworld> the Architectures above should just be i386, amd64 right?
<davecheney> not sure about arm
<wallyworld> no problem
<davecheney> i'd say no
<davecheney> but I think arm via stgraber's qemu ami is possible
<davecheney> being realistic, just 386 and amd64
<wallyworld> ok. so long as we are consistent across the board
<wallyworld> that's what i'm aiming for
<wallyworld> then we can add in the extra arches
<wallyworld> i just got a little confused when reading the code
<bodie_> do I need to submit a bug report or can I simply lbox propose my fix?
<davecheney> bodie_: if you can fix it yourself, go for it
<bodie_> so it HAS to be juju-mongodb?  I thought it just had to be a version with SSL support
<davecheney> bodie_: they are one and the same
<davecheney> especially if you are on precise
<bodie_> hmm
<bodie_> not in my case I guess since I'm using trusty tahr
<bodie_> it installed 2.7.0-pre
<davecheney> bodie_: oh bodie_ the problem is so many turtles deeper
<bodie_> I like turtles... but not this many turtles
<bodie_> was really confusing trying to make my tests pass -- I knew it had something to do with the Mongo version but beyond that it gets really murky
<bodie_> and then having to wrangle build issues on top of that was making me insane
<bodie_> I think GCC 4.8.2 treats a certain thing as a warning that isn't supposed to be treated as a warning
<bodie_> do you know if the Vagrant box is suitable for dev?
<davecheney> bodie_: is the README in the project incorrect ?
<bodie_> ah, the make install-dependencies bit?  I hadn't tried that, I think I stopped reading when I saw the directions for installing the binaries and assumed the README was about that
<bodie_> the juju build was failing due to an incompatibility with the latest gwacl, but someone told me about the "known working revisions" doc
<bodie_> anyway, thanks for the assist :)
<waigani> the "PatchEnvPathPrepend(0): not defined" error I was hitting was resolved by rebuilding the pkg dir
<waigani> which means the branch is now ready for review: https://codereview.appspot.com/76670043
<davecheney> thumper: good news
<thumper> davecheney: yes?
<davecheney> building static tools means we don't need to install more pkgs into the environment
<wallyworld__> axw: how far away is this?
<wallyworld__> 	// TODO(axw) 2014-02-11 #pending-review
<wallyworld__> 	//     Embed state.Prechecker, and introduce an EnvironBase
<wallyworld__> 	//     that embeds a no-op prechecker implementation.
<axw> wallyworld__: oops, that was reviewed and should've had an issue added...
<axw> but the EnvironBase thing may not happen, as the implementation ended up changing
<axw> wallyworld__: why?
<wallyworld__> axw: i want to introduce some more common, shared functionality across all providers
<wallyworld__> in this case SupportedArchitectures()
<axw> wallyworld__: okay. I have no immediate plan to add it, so feel free to do it
<wallyworld__> i'll just do a new interface i think
<wallyworld__> Go loves single method interfaces :-/
<wallyworld__> just gotta think of a name, and no I don't want to use Archer cause there's no arrows to be seen anywhere
<axw> why do you need another interface?
<axw> adding it to Environ seems appropriate
<wallyworld__> guess so
<wallyworld__> i think though having an EnvironCapability interface would be good
<wallyworld__> we can add more methods to it as needed
<wallyworld__> eg is block storage supported
<wallyworld__> and easier to stub out for testing
<axw> sounds reasonable
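A sketch of the EnvironCapability interface floated above; the method set is illustrative and would grow as needed.

    package environs

    // EnvironCapability reports what a provider can do, so callers can
    // query capabilities (and tests can stub them) without needing a
    // full Environ implementation.
    type EnvironCapability interface {
        // SupportedArchitectures returns the instance architectures
        // this environment can provision, e.g. []string{"amd64", "i386"}.
        SupportedArchitectures() ([]string, error)

        // SupportsBlockStorage is a hypothetical further capability of
        // the kind mentioned above.
        SupportsBlockStorage() bool
    }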
<thumper> wallyworld__:  https://codereview.appspot.com/74660044
<wallyworld__> thumper: https://codereview.appspot.com/76710043/
<wallyworld__> :-)
<thumper> np
<thumper> wallyworld__: I have a parent teacher interview, but will review ASAP on return
<wallyworld__> ok, no hurry
<wallyworld__> have fun
<thumper> wallyworld__: why not utils.arch ?
<thumper> then arch.AMD64
<thumper> or something
<wallyworld__> thought about that, yeah can do that
<thumper> I'll keep looking first
<wallyworld__> just didn't want to introduce such a small package
<thumper> just wondering out loud at the first thing I saw
<thumper> ok, let me keep looking first
<wallyworld__> sure, but i think you may have a valid point cause i almost did it that way
<davecheney> 2014-03-17 02:54:32 INFO juju.worker.uniter uniter.go:474 running "config-changed" hook
<davecheney> 2014-03-17 02:54:32 ERROR juju.worker.uniter uniter.go:480 hook failed: fork/exec /var/lib/juju/agents/unit-u0-0/charm/hooks/config-changed
<davecheney> just hit this with 1.17.5.1
<bodie_> latest test output....
<bodie_> http://paste.ubuntu.com/7105935/
<bodie_> going to bed, will look over this in the morning. any input welcome
<bodie_> o/
<davecheney> bodie_: sorry mate, still the wrong mongodb
<davecheney> or more specifically
<davecheney> the test suite tries to bring up a mongodb, but fails
<davecheney> most of those tests don't consider the case that mongo fails to start
<davecheney> and so panic during tear down
<davecheney> well fuck
<davecheney> i need to log a bug about charm-helpers-sh being missing
<davecheney> but i can't confirm if it works on precise because I've hit another bug
<thumper> davecheney: is that hook failure one where the hook doesn't exist?
<thumper> davecheney: the code paths should catch that, and I was horribly confused...
<davecheney> thumper: that is correct
<davecheney> it shouldn't happen
<thumper> I don't understand how we get that error
<davecheney> axw: was looking into it
<davecheney> i think
<thumper> we explicitly catch that error
<thumper> and I see that explicit catch happening locally
<thumper> and on ec2
<thumper> so I'm dumb struck
<axw> I was looking at what?
<axw> nope.. don't think so
<axw> oh yeah I remember this...
<davecheney>  /insert jarring chord
<jam> morning thumper, I'm in the hangout whenever you're around
<wallyworld__> thumper: so can we now drop JUJU_TESTING_LXC_FORCE_SLOW?
<thumper> wallyworld__: yes
<thumper> jam: ok
<wallyworld__> thumper: wanna do that then as part of your work?
<thumper> wallyworld__: yeah
<wallyworld__> ok, i'll re-review then
<axw> davecheney: so the uniter code normally works because exec.Command internally uses LookPath
<axw> LookPath is the one that returns an *exec.Error
<axw> is it not working with gccgo?
<davecheney> axw: this was one ec2
<davecheney> with gc
<axw> huh
<davecheney> this is not a ppc64el specific issue
<thumper> wallyworld__: https://codereview.appspot.com/74660044/
<wallyworld__> ok
<axw> davecheney: you're on Go tip aren't you? (or > 1.2 at any rate)
<axw> davecheney: behaviour of os/exec.Command has changed in 1.3
<davecheney> axw: is that a question or a statement ?
<axw> first is a question, second is a statement
<davecheney> then yes, and ok
<axw> the statement leads to my question, I'm backwards like that
<axw> ok
<davecheney> but others have hit this problem using released tools
<axw> mmk, nfi why it would fail otherwise
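The behaviour axw describes, in a standalone sketch: exec.LookPath returns an *exec.Error when a binary cannot be found, which lets a caller distinguish a missing hook from a hook that ran and failed.

    package main

    import (
        "fmt"
        "os/exec"
    )

    // isMissing reports whether err means the command was never found,
    // as opposed to found but failed while running.
    func isMissing(err error) bool {
        _, ok := err.(*exec.Error)
        return ok
    }

    func main() {
        _, err := exec.LookPath("no-such-hook")
        fmt.Println("missing hook?", isMissing(err)) // prints "missing hook? true"
    }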
<jam> axw: LGTM on https://code.launchpad.net/~axwalk/juju-core/lp1293310-non-existent-hooks/+merge/211247
<axw> jam: thanks
<rogpeppe3> mornin' all
<axw> morning rogpeppe3
<rogpeppe3> axw: hiya
<axw> rogpeppe3: I made the changes to https://codereview.appspot.com/75990043/, but thought I'd wait until you had a glance in case you weren't happy about the new package
<rogpeppe3> axw: looking
<rogpeppe3> axw: tiny thing: i wonder if the test should use /bin/sh rather than /bin/bash, as i think that might be what cloud-init is using.
<rogpeppe3> axw: (i may be wrong there though)
<axw> rogpeppe3: the synchronous bootstrap bit is always bash
<rogpeppe3> axw: ok, and the DumpFileOnError code is only ever run in that context?
<axw> rogpeppe3: and add-machine ssh, but that's also bash
<rogpeppe3> axw: LGTM
<axw> thanks
 * jam => lunch
<dimitern> wallyworld_, are you around?
<jam> morning dimitern
<dimitern> jam, morning
<jam> dimitern: I missed you earlier
<jam> rogpeppe3: 1:1?
<rogpeppe3> jam: ah yes!
<dimitern> jam, i'm sorry i trusted my nexus ubuntu touch to sound the alarm but it didn't and i read later that it's not implemented yet
<wallyworld_> dimitern: hi
<vladk> dimitern: hi
<dimitern> hey vladk
<dimitern> wallyworld_, I have a question about upgrades
<wallyworld_> sure
<dimitern> wallyworld_, we need to create some dirs with root permissions
<wallyworld_> ok. where?
<dimitern> wallyworld_, specifically, when upgrading rsyslog config and logdir
<dimitern> wallyworld_, what's the best way to do that? i've seen provider/local calling exec.Command("sudo", "/bin/bash", "-s") and attaching stdout and stderr
<wallyworld_> dimitern: the  machine agent runs as root, right?
<dimitern> wallyworld_, or maybe generate a script and pass it on stdin (i.e. mkdir -p <logdir>)
<dimitern> wallyworld_, so you're saying we should be able to create /var/log/juju-<namespace> in an upgrade step?
<wallyworld_> i think so
<dimitern> ok, i'll try that
<wallyworld_> it's just an educated guess
<wallyworld_> but would be the simplest i think
<wallyworld_> cause an upgrade step is easy enough to write, and is only executed the once when needed to get to 1.18
<dimitern> wallyworld_, yeah, i was testing my fix for bug 1291400 and found some errors due to the logdir in /var/log not getting created, hence rsyslog cannot create its certs there
<_mup_> Bug #1291400: migrate 1.16 agent config to 1.18 properly (DataDir, Jobs, LogDir) <regression> <upgrade-juju> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1291400>
<wallyworld_> dimitern: good luck, let me know if it works. my only question is if the owner of the dir would be root but i think so since machine agent runs as root
<dimitern> wallyworld_, it has to be a specific user/group i think, i'll check
<wallyworld_> maybe, not sure ottomh
 * dimitern really hates how we don't chown local provider logs to the user so I don't have to do sudo less ...
<wallyworld_> even so, the process will have the privileges to do it
<wallyworld_> dimitern: i agree. file a bug :-)
<dimitern> will do :)
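A hedged sketch of the upgrade step discussed above: since the machine agent runs as root, a plain os.MkdirAll suffices, with ownership adjusted afterwards if rsyslog needs a specific user. The function name and the ownership detail are assumptions.

    package upgrades

    import (
        "fmt"
        "os"
    )

    // ensureRsyslogLogDir creates /var/log/juju-<namespace> so rsyslog
    // can write its certs there. The agent runs as root, so no sudo or
    // shelling out is needed.
    func ensureRsyslogLogDir(namespace string) error {
        dir := fmt.Sprintf("/var/log/juju-%s", namespace)
        if err := os.MkdirAll(dir, 0755); err != nil {
            return err
        }
        // If a specific owner is required (e.g. syslog:syslog), look up
        // the uid/gid via os/user and call os.Chown(dir, uid, gid) here.
        return nil
    }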
<wwitzel3> hola
<rogpeppe3> hi all
<rogpeppe3> voidspace: ping
<wwitzel3> hey rogpeppe3
<rogpeppe3> wwitzel3: yo!
<wwitzel3> rogpeppe3: you have a good weekend?
<rogpeppe3> wwitzel3: an excellent weekend, thanks
<rogpeppe3> wwitzel3: we actually had some sun yesterday, and even better, managed to go out and take advantage of it
<rogpeppe3> wwitzel3: you?
<wwitzel3> rogpeppe3: that's great :)
<wwitzel3> rogpeppe3: yeah, it was good, nothing exciting, but nice and relaxing
<voidspace> rogpeppe3, morning
<rogpeppe3> voidspace: hiya
<voidspace> rogpeppe3, hi :-)
<voidspace> rogpeppe3, sorry about Friday
<rogpeppe3> voidspace: how's your configuration working now?
<voidspace> rogpeppe3, well, mostly good
<voidspace> rogpeppe3, :-)
<voidspace> rogpeppe3, I can't get the USB monitor working with Ubuntu and I think I'm stuck on a maximum of three monitors
<voidspace> rogpeppe3, but other than that, good
<voidspace> rogpeppe3, I still need to move the drives to the new machine
<rogpeppe3> voidspace: only three monitors, how can you survive? :-)
<voidspace> heh
<voidspace> it is hard
<voidspace> anyway
<voidspace> rogpeppe3, shall I pair with you today?
<rogpeppe3> voidspace: i was just about to suggest that, yes
<voidspace> great
<rogpeppe3> voidspace: whenever you're ready
<voidspace> rogpeppe3, I'm pretty much ready
<voidspace> rogpeppe3, hangout for audio?
<rogpeppe> voidspace: let's start with that
 * rogpeppe tries to work out how to create a hangout
<voidspace> rogpeppe: shall we use the juju-core one, or start our own?
<rogpeppe> voidspace: let's just use the juju-core one
<rogpeppe> voidspace: https://plus.google.com/hangouts/_/canonical.com/juju-core
<mgz> jam: are you around?
<jam> mgz: I am, I just need to do a quick "get some coffee/tea" run, and then I'll be happy to chat
<jam> mumble or G+ ?
<mgz> sure thing
<mgz> lets try mumble
<jam> mgz: no love....
<jam> I'll try again
<jam> mgz: so I'm guessing this is Dubai's anti-VOIP software going into effect.
<mgz> we'll have to make a new hangout
<mgz> cunning
<jam> At least, it is what it sounds like when I try to use Skype-out
<jam> mgz: there is a hangout associated with our 1:1 calendar event
<jam> can you get to that?
<mgz> is the canonical mumble server not over ssl?
<mgz> lets see
<rogpeppe> mgz: have you got the link to the instructions for reconfiguring the gobot, please?
<jam> rogpeppe: https://lists.ubuntu.com/archives/juju-dev/2014-March/002182.html
<rogpeppe> jam: thanks!
<jam> rogpeppe: he's in G+ with me, and its easier to paste via me, then copy it to IRC :)
<rogpeppe> mgz: is there a reason that the verify_command in the tarmac configuration doesn't do a godeps -u ?
<mgz> because it would also need go get -u
<jam> rogpeppe: because half the time that would just break because you have to fetch first?
<jam> and go get -u can overwrite the current branch that we are trying to test?
<rogpeppe> jam: in that case, it *should* break, i think
<rogpeppe> jam: otherwise we're testing with the wrong deps
<rogpeppe> jam: currently i'm trying to help michael get his Instances aggregation branch in, and there's no easy way to do it
<voidspace> rogpeppe, https://pastebin.canonical.com/106567/
<rogpeppe> voidspace: yeah, just fixed it, i think
<voidspace> rogpeppe, cool, thanks
<voidspace> rogpeppe, nope :-)
<rogpeppe> voidspace: indeed
<mgz> now the hook borked
<voidspace> rogpeppe, I'm grabbing coffee
<rogpeppe> voidspace: k
<perrito666> hi everyone
<rogpeppe> fwereade: just encountered an interesting error with juju resolved
 * fwereade peers nervously at rogpeppe
<fwereade> hi again perrito666 :)
<rogpeppe> fwereade: it's not too bad actually
<rogpeppe> fwereade: i did "juju resolved", and the hook was still in error state
<rogpeppe> fwereade: i did it again, and it said "ERROR cannot set resolved mode for unit "tarmac/0": already resolved"
<rogpeppe> fwereade: but that was because a hook was still running
<rogpeppe> fwereade: i wonder if the uniter should set the unit status out of error state the moment it starts to try rerunning the hook
<fwereade> rogpeppe, it *was* designed that way but I'm trying to remember why
<rogpeppe> fwereade: BTW the first time i did "juju resolved --retry"
<rogpeppe> fwereade: it certainly felt a bit weird when it happened: "why can't i resolve this hook that's in error state?"
<fwereade> rogpeppe, IIRC the idea is that we make sure we actually try to handle the resolved before we mark it ready to try again
<rogpeppe> fwereade: i think that's the bit that's not so intuitive
<voidspace> fwereade, I've been exchanging emails with  Erik Naslund :-)
<fwereade> rogpeppe, the failure mode of setting resolved early is that we could not actually respond, but claim we did
<rogpeppe> fwereade: because the user has no visibility into when hooks are actually running
<fwereade> rogpeppe, I *think* that's worse
<fwereade> voidspace, ah cool -- how's he doing?
<fwereade> voidspace, say hi :)
<voidspace> fwereade, I will do
<voidspace> fwereade, he is using mock at his startup and sent an email saying thanks
<fwereade> voidspace, excellent, I'm pretty sure I introduced him to it :)
<wwitzel3> natefinch: ping
<natefinch> wwitzel3: howdy
<wwitzel3> natefinch: quick hangout?
<natefinch> sure
<bodie_> is there a standard for which version of ubuntu this should work best on?  davecheney says I'm still using the wrongodb, but this is definitely the one make install-dependencies set me up with
<jam> bodie_: most testing is done on Precise, we tried to enable Trusty correctly, but then ended up regressing support elsewhere, so we have a patch in progress to restore proper support for Trusty's local provider
<bodie_> makes sense
<bodie_> is there anything I can do to help there?
<bodie_> maybe I'll try the Vagrant setup in the meantime
<axw> fwereade: heya. did you want to review https://codereview.appspot.com/70190050/ before Jesse lands it?
<axw> it's a big 'un
<axw> looks (nearly) good to me, but just checking if I should provision a LGTM on your review
<axw> rogpeppe: yeah, I found that (resolved not changing error state immediately) a bit unintuitive too. some sort of feedback would be nice
<fwereade> axw, if I haven't looked at it by eod I think we can go ahead
<bodie_> just downloaded the 639MB vagrant image in almost exactly a minute... love this fios
<axw> fwereade: okey dokey
<fwereade> axw, it's had several rounds and you were making smart comments last time I saw
<fwereade> axw, you can represent me :)
<axw> I'll do my best
<fwereade> axw, btw I have been trying to assimilate all the awesome azure changes, couple of quick questions
<fwereade> axw, 1 (possibly trivially working, probably deserves minor reflection) subordinates in fancy-azure mode
<fwereade> axw, I suspect they will all work fine, but if you haven't considered them explicitly please do some thinking
<axw> I haven't, and I think they'll work fine, but yes I will have a think about it
<fwereade> axw, 2 (maybe more of a sticking point) can we change the startinstance params so that it's expressed in terms of instances to avoid, rather than leaking the concept of "principals" into environs?
<fwereade> axw, I haven't figured out whether that screws everything up wrt AddMachine though
<axw> <fwereade> axw, I suspect they will all work fine, but if you haven't considered them explicitly please do some thinking
<axw> <axw> I haven't, and I think they'll work fine, but yes I will have a think about it
<axw> sorry, net cut out
<fwereade> <fwereade> axw, 2 (maybe more of a sticking point) can we change the startinstance params so that it's expressed in terms of instances to avoid, rather than leaking the concept of "principals" into environs?
<fwereade> <fwereade> axw, I haven't figured out whether that screws everything up wrt AddMachine though
<axw> fwereade: that makes sense for ec2, but not for azure I think?
<fwereade> axw, my thought for azure was that we could just put the machine into the same cs/as as all supplied instances (in fancy-azure mode) or ignore (in placement mode)
<axw> i.e. with AZ you want to avoid other machines in the AZ, but in azure you want to stick them all in the same Cloud Service
<axw> true
<fwereade> axw, yeah, but in azure we can do so iff the instances to avoid all match
<fwereade> axw, otherwise we can't distribute for reliability but whether that's an issue depends on the setting I think
<fwereade> axw, thank god we decided not to allow racing provisioners though
<axw> :)
<axw> indeed
<fwereade> axw, there might be some horrible hole in my logic but I'd like us to try quite hard to keep state concepts out of environ implementations
<fwereade> axw, the jobs/info dependencies make me grumpy but they don't feel too fundamental, this rather does
<axw> fwereade: yeah, I'll have to have a think about how it'll work
<fwereade> axw, tyvm
<axw> It'll mean recording more information specifically for provisioning
<fwereade> axw, expand please?
<axw> fwereade: instances to avoid is determined by the provider policy, right? that needs to happen as we're doing AddMachine if we're going to support clean machine assignment for ec2
<axw> then it needs to be picked back up by the provisioner to pass to StartInstance
<fwereade> axw, I think we can just always assume instances-to-avoid == instances-running-a-unit-of-this-service
<fwereade> axw, provider is free to best-fit that
<fwereade> axw, and in the case of manual placement it's completely overridden
<fwereade> axw, ah doh I see
<fwereade> axw, or maybe not
<axw> fwereade: I think that might work, need to let it marinate
<fwereade> axw, it's more info we need at provisioning time, but I think we can just get away with a quick scan of the units on that machine -- if any -- and an API call to find out what other instances are also running those units
<fwereade> axw, (principals only, I think, but you already know that)
<axw> yup. so basically what I was going to do, but hiding the principals from StartInstance
<fwereade> axw, (also consider how/whether we can/should handle principals running on hosted machines -- I'd accept a TODO-figure-it-out there because it won't become a real concern until we start implementing zone distribution inside the other providers)
<fwereade> axw, yeah, exactly
<fwereade> axw, it's reimplementation but not fundamentally rearchitecting :)
<axw> on hosted machines...?
<axw> fwereade: what do you mean by "hosted machines"?
<fwereade> axw, 2/lxc/3
<axw> ah
<axw> yep, I only thought about it in so far as it's not really supported in Azure :)
<fwereade> axw, I'm not quite decided what the Right Thing is yet; in an ideal world we'd be supplying some sort of weighting information -- ie it matters much more that we're not near instance X than Y, because the unit running on X is not well-distributed yet but the one on Y is
<fwereade> axw, but that's off in best-is-enemy-of-good territory
<fwereade> axw, so sticking with the simplest thing might be wise
<fwereade> gaah need to do something quickly before meeting at 2, gtg
<axw> fwereade: later, thanks for the chat
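A sketch of the parameter shape fwereade argues for: the provisioner hands over the set of related instances, and each provider interprets it on its own terms (spread across zones on EC2, co-locate in one cloud service on Azure). All names here are hypothetical.

    package environs

    import "launchpad.net/juju-core/instance"

    // StartInstanceParams, hypothetically extended: rather than leaking
    // "principal units" into environs, state-level knowledge is reduced
    // to a set of related instance ids.
    type StartInstanceParams struct {
        // DistributionGroup holds instances already running units of
        // the same service; a provider may avoid them (for reliability)
        // or join them (Azure cloud services), as suits its model.
        DistributionGroup []instance.Id
    }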
<rogpeppe> voidspace: lp:~rogpeppe/juju-core/515-state-rename-api-addresses
<voidspace> mgz, ping
<voidspace> mgz, I'm starting again on a new machine
<voidspace> mgz, so I need the bzr magic incantations to create the local branches
<voidspace> mgz, and yet again I don't remember them :-)
<voidspace> nor indeed where you put the instructions
<mgz> voidspace: yeah, I said I should write them down
<mgz> voidspace: `bzr switch -b trunk; bzr pull --remember lp:juju-core; bzr switch -b feature_branch`
<voidspace> mgz, wonderful, thanks
<voidspace> I will email those to myself
<voidspace> it's just the initialisation I can't remember
<voidspace> using it is fine :-)
<voidspace> https://pastebin.canonical.com/106571/
<voidspace> rogpeppe, ^^
<wwitzel3> the logging calls in state/open.go (logger.Infof), where do those get logged to?
<rogpeppe> wwitzel3: i'm not sure i understand the question
<rogpeppe> wwitzel3: do you mean "where do the log messages end up?"
<wwitzel3> there are logger.Infof calls in state/open.go, Open() where do the string values passed to Infof end up?
<wwitzel3> what file would I tail to see them?
<wwitzel3> cloud-init-output.log has them
<bodie_> Is there a virtualbox / vagrant image specifically for core dev?
<bodie_> i.e., Go installed, etc
<rogpeppe> trivial method-renaming review anyone? https://codereview.appspot.com/76890043
<bodie_> if not I'm going to host one
<rogpeppe> wwitzel3: it depends on the context
<rogpeppe> wwitzel3: is this when running a real environment?
 * rogpeppe goes for lunch
<wwitzel3> rogpeppe: I found it on the machine being bootstrapped in cloud-init-output.log
<bodie_> does anyone know an eta on fixing the azure provider?
<bodie_> er, rather the compat with the updated provider
<bodie_> ... with the updated driver*
<bodie_> sigh
<mgz> bodie_: use godeps to flip back to the compatible revision
<bodie_> I was just using bzr
<mgz> axw has some juju-core branches in progress but they're not yet landed
<bodie_> is there a fancy simple go way?
<mgz> and this is why we have godeps :0
<mgz> rogpeppe: ^you got a potted simple way?
<axw> godeps -u dependencies.tsv
<bodie_> hehe.. potted.  fun working with people who have variations in their syntax ;)
<bodie_> thanks axw
<axw> nps
<mgz> should be something like: `go get launchpad.net/godeps/...; go install launchpad.net/godeps/...; godeps -u dependencies.tsv`
<mgz> from inside juju-core dir
<axw> go install is not necessary if you do go get
<bodie_> go install rebuilds, right?
<bodie_> *double-checks note to self: "stop asking dumb questions"*
<axw> bodie_: go install = go build and install the result to $GOPATH/(pkg|bin)
<mgz> go get installs? that seems obnoxious, no wonder I generally avoid it
<bodie_> heh
<bodie_> I feel like the go tool is trying to do too much and not doing enough at the same time
<bodie_> it should either do very little, or do things properly
<natefinch> wwitzel3: ready to start up again?
<natefinch> bodie_: the go tool does the right thing... we're doing slightly the wrong thing right now.   *usually* trunk just builds.  I actually think it's a mistake for us to ever have trunk not build.... but others may disagree.
<bodie_> heh
<jamespage> sinzui, good morning
<sinzui> hi jamespage
<jamespage> sinzui, so I was poking at 1.17.5 for the trusty upload this morning, and wanted to make a time/risk decision without having the full facts to hand
 * sinzui nods
<jamespage> sinzui, and that really boils down to when 1.18.x will be released; I don't really want to upload 1.17.5 (and revert the juju-mongodb change I put in for .4) if it's only really going to be there for a few days
<sinzui> jamespage, understood
<wwitzel3> natefinch: yep, ready
<sinzui> jamespage, My only angst relates to ppc64 and arm64.
<jamespage> sinzui, in the context of 1.18?
<sinzui> jamespage, I can ask utlemming to release streams.canonical.com, but without those arches among the tools, they won't be tested
<sinzui> jamespage, but to your point...they need to know juju-mongodb
<jamespage> sinzui, right
<sinzui> jamespage, I still think we can see 1.18.0 this week.
<jamespage> sinzui, so will 1.18.0 use juju-mongodb across all archs if available? this is fairly critical for the MIR
<sinzui> or I release again when I see juju-mongodb restored
 * jamespage scrubs fairly from that sentence
<sinzui> jamespage, yes, juju-db will be used when it is available
<jamespage> sinzui, great
<jamespage> sinzui, sorry - juju-db or juju-mongodb?
<jamespage> (just to make sure we all know which one)
<sinzui> jamespage, sorry. I tried to type less. I mean the latter
 * jamespage breathes again
<jamespage> I know that was discussed as a rename
<bodie_> so I'm on the verge of wiping my workstation and installing 12.04 (using 14.04 now) -- should I wait for this change to go in and just use a VM for now or something?
<bodie_> tests seem to be breaking
<bodie_> gustavo said it's mongo
<bodie_> and I am using juju-mongodb
<natefinch> bodie_: 14.04 works fine.  I'm on 14.04 and so are several other devs.
<natefinch> bodie_: does mongod --help have the ssl section in the help now?
<bodie_> yeah
<natefinch> bodie_: can you pastebin the failing tests?  I know you have before, but I want to see if there are differences now that you have the right mongo
<bodie_> http://paste.ubuntu.com/7108482/
<bodie_> feeling a tiny bit hopeless at this point, it's been 4 or 5 solid days of struggle
<bodie_> mongod isn't in my path
<bodie_> it's in /usr/lib/juju/bin/mongod
<natefinch> bodie_: the code I believe right now expects mongod to be in /usr/bin/
<bodie_> it's these tiny little gotchas that make this impossible for people to join in on
<bodie_> ok, not impossible
<bodie_> frustratingly unobvious
<bodie_> this is what I got from using make install-dependencies
<natefinch> bodie_: that actually is probably the problem.... we have a bug with the current code about the juju-mongodb.   I know it's terrible, but try moving mongod to /usr/bin
<bodie_> can I just symlink it?
<natefinch> bodie_: possibly, but you may need to actually remove the juju/bin one.... try symlinking first.
<natefinch> bodie_: and yes, we need a better onboarding process, with better documentation etc.
<bodie_> I keep thinking it's just me being an idiot
<bodie_> was really hoping to get some code in by friday at least
<bodie_> thanks for the assistance
<bodie_> see, gustavo said it was the wrong mongo version
<bodie_> I don't know how he got that from my test output
<bodie_> do I just need to move mongod or all 6 binaries?
<natefinch> bodie_: I think just mongod, but I'm honestly not 100% sure.
<natefinch> bodie_: That is very strange that you're getting all those panics, though.   While we have problems with the juju/bin mongod... it doesn't usually panic like that
<bodie_> you're on trusty, right?
<natefinch> bodie_: yes
<bodie_> so I don't really have any reason to expect switching distros to help
<bodie_> i'll see if moving the binaries fixes this
<bodie_> where did you get your copy of mongodb?
<bodie_> James used make install-dependencies and it put his copy in /usr/bin, unlike mine
<natefinch> bodie_: I built mine from source
<bodie_> finally.  looks like the symlinks worked
<bodie_> interesting, what version of gcc?
<natefinch> bodie_: 4.8 I think lemme check
<natefinch> bodie_: 4.8.2
<bodie_> I wasn't able to get the source build to work, it kept spitting out warnings
<bodie_> -_-
<bodie_> forums looked like maybe a gcc 4.8 issue
<bodie_> but I guess not
<natefinch> bodie_: er.... hmm.... that's my version of gcc now, but I was on Raring when I built mongo
<bodie_> ah
<natefinch> bodie_: so possibly an older version of gcc, I don't know
<bodie_> ok
<rogpeppe> dimitern, natefinch, mgz, fwereade: trivial review please? https://codereview.appspot.com/76890043/
<bodie_> aaaand 107,772 test errors
<mgz> rogpeppe: wwitzel3 got there already
<rogpeppe> mgz: cool
<dimitern> rogpeppe, reviewed
<rogpeppe> dimitern: thanks
<rogpeppe> wwitzel3: thanks to you too - i just hadn't refreshed the page to see your review :-)
<rogpeppe> dimitern: i don't understand your comment
<rogpeppe> dimitern: the API hasn't changed (and hopefully won't change)
<dimitern> rogpeppe, ah, sorry - the public/agent facing api is still the same
<rogpeppe> dimitern: yeah
<dimitern> rogpeppe, then ignore me ;)
<rogpeppe> dimitern: duly ignored
<TheMue> adeuring: started testing but have to update my 3rd party packages first
<TheMue> adeuring: could you paste me your error(s)?
<adeuring> TheMue: http://paste.ubuntu.com/7108636/
<TheMue> adeuring: hehe, yep, bingo
<TheMue> adeuring: exactly the same here
<adeuring> TheMue: the last error is from r 2428. Reverting to r2427 fixes that one
<adeuring> but the other two error messages remain...
<TheMue> adeuring: that's bad :/
<mgz> adeuring: you need the right rev of gwacl, not top
<mgz> *tip
<mgz> and you want the ratelimiter package
<mgz> see the log ^ for godeps tips
<adeuring> mgz: ahhh.. thanks, that fixed it!
<bodie_> natefinch... any thoughts?  http://paste.ubuntu.com/7108679/
<bodie_> I have very little idea what could be causing this
<natefinch> bodie_: that's the test that fails if mongod exists in /usr/lib/juju/bin :)
<natefinch> bodie_: rename or move mongod from that directory and it'll pass.
<natefinch> bodie_: it's a bug in the code & tests
<natefinch> bodie_: we half-added a feature to use mongo from that directory
<natefinch> bodie_: (where we = me)
<natefinch> rogpeppe, mgz: is there a technical reason why cloud-init-output.log isn't mirrored back to the client when bootstrapping with --debug?  That would make my life so much easier
<rogpeppe> natefinch: axw is the one to ask
<rogpeppe> natefinch: i think it now sends back the output if the bootstrap fails
<mgz> natefinch: it's because bootstrap is still in a funny place
<mgz> were I redoing from scratch, I'd just have --debug echo/tail/poll the console-log nova api, and similar for local/manual
<mgz> currently we have a funny mix of cloud-init parts and ssh-in-and-do-things parts that have disparate logging
<bodie_> oy.  ok.  thanks natefinch
<natefinch> mgz: probably why some of our errors come back as "rc: 1"
<mgz> yeah, those are particularly bad
<mgz> because the stderr gets dropped
<mgz> unlike with cloud-init
<natefinch> mgz: I actually tried fixing that, and ended up with stderr being "return code 1"  or something along those lines equally useless
<mgz> :)
<mgz> too many lovely layers
<stokachu> is there any documentation on using kvm as a container?
<stokachu> the code looks like it supports it but couldnt find any documentation
<bodie_> natefinch, can you point me to the bit that tries to use /usr/lib/juju/bin ?  would be nice if I could take a crack at fixing something
<natefinch> stokachu: there's a bug about us needing to add documentation about it
<stokachu> ok cool, any notes lying around?
<bodie_> I guess /agent/mongo/mongo.go
<natefinch> bodie_: yeah... I think the problem is that the test is using the MongoPath method, and the MongoUpstartService function is not.  we hard coded the MongoUpstartService but forgot to fix the test.  Really we should hardcode MongoPath for now, and put MongoUpstartService back to using MongoPath.
<natefinch> stokachu: juju help add-machine, and just replace lxc with kvm
<stokachu> natefinch: ah ok thanks
<bodie_> I see
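A sketch of the fix natefinch outlines: route both the test and MongoUpstartService through a single MongoPath lookup that prefers the juju-packaged mongod and falls back to the system binary. The paths come from the conversation; the function shape is illustrative.

    package mongo

    import (
        "os"
        "os/exec"
    )

    // MongoPath prefers the juju-packaged mongod, falling back to
    // whatever mongod is on $PATH, so the tests and the upstart
    // service definition agree on one binary.
    func MongoPath() (string, error) {
        const jujuMongod = "/usr/lib/juju/bin/mongod"
        if _, err := os.Stat(jujuMongod); err == nil {
            return jujuMongod, nil
        }
        return exec.LookPath("mongod")
    }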
<stokachu> and use manual provider right?
<perrito666> hey, make on master yells errors in lxc.go and clonetemplate.go
<natefinch> stokachu: that's orthogonal
<perrito666> is this expected or I have something broken?
<natefinch> perrito666: I'm not sure anyone actually uses make
<natefinch> stokachu: do juju help deploy to see how to deploy to a specific machine (including new container)
<stokachu> ok thanks will do
<rogpeppe> voidspace: can you hear me?
<rogpeppe> voidspace: or are you still coffeeing?
<voidspace> rogpeppe, https://lastpass.com/
<voidspace> rogpeppe, vault
<voidspace> rogpeppe, that's the lastpass terminology anyway
<rogpeppe> dimitern, fwereade, natefinch, mgz: small review (adding State.SetAPIAddresses)? https://codereview.appspot.com/76950043
<stokachu> i <3 juju+kvm
<natefinch> voidspace, rogpeppe: I love lastpass.  I don't know how anyone actually uses the web without it (or something like it)
<rogpeppe> natefinch: yeah, i think i need to use it
<rick_h_> natefinch: +1 got the wife using it as well
<voidspace> natefinch, I've just migrated from 1password as part of the move away from OS X
<voidspace> natefinch, so far so good :-)
<natefinch> it makes it trivial to use a different very strong password with every single service, which is really the only way to be secure.
<natefinch> voidspace: I've been using it for a few years now.  No complaints.
<rogpeppe> two branches out for review, please: https://codereview.appspot.com/76950043, https://codereview.appspot.com/76870044
<rogpeppe> dimitern, natefinch, wwitzel3, mgz, fwereade: ^
<natefinch> voidspace: one thing I like is that if you pay for it, you get the mobile client and access to the dolphin browser plugin (if you're on android, not sure if they have it on iOS)... the dolphin plugin works just like the desktop plugin, which is handy for browsing on your phone
<natefinch> rogpeppe: I'll grab at least one
<voidspace> natefinch, I've been using 1password for years which is very similar, except with syncing via dropbox instead of the passwords on their servers...
<rogpeppe> natefinch: the 1st is a prereq of the 2nd
<voidspace> natefinch, you can't have browser plugins on iOS unfortunately :-/
<natefinch> voidspace: pretty sure lastpass is encrypted locally before you send your data to them, btw
<natefinch> voidspace: I have a suggestion you probably won't like ;)
<voidspace> natefinch, you buy me a new phone and I'll use it :-)
<rick_h_> natefinch: works in mobile firefox. I use it in that on android
<natefinch> voidspace: rofl
<natefinch> voidspace: that's cool.  I'll have to try it and see how it compares to dolphin
<natefinch> er rick_h_ ^
<natefinch> rogpeppe: there you go
<rogpeppe> natefinch: thanks
<rogpeppe> natefinch: the second one is even simpler :-)
<natefinch> rogpeppe: I did both ;)
<rogpeppe> natefinch: brilliant!
<voidspace> rogpeppe, back
<voidspace> rogpeppe, I got dumped out of the hangout - trying to join again
<rogpeppe> voidspace: k
<voidspace> rogpeppe, hmm... it won't let me back in
<rogpeppe> voidspace: what error do you see?
<voidspace> rogpeppe, it takes me to the error page and says "there was an error", with a "Start a new hangout" button
<rogpeppe> voidspace: awesome
<voidspace> rogpeppe, now it's saying "It's taking too long to connect you to this video call. Try again in a few minutes"
<voidspace> rogpeppe, which at least is progress I think...
<rogpeppe> voidspace: i could try leaving and joining again
<wwitzel3> Google is having some issues with hangouts atm
<voidspace> that would explain it...
<rogpeppe> voidspace: it still thinks you're connected, BTW
<rogpeppe> voidspace: ha, i also cannot rejoin
<voidspace> rogpeppe, creating a new one fails. wwitzel3 would appear to be correct :-)
<rogpeppe> voidspace: yup
<voidspace> rogpeppe, I'm going to work on my vim setup as that's pretty essential
<voidspace> rogpeppe, I have half an hour to EOD
<rogpeppe> voidspace: sgtm
<rogpeppe> voidspace: we've done alright today i reckon
<voidspace> rogpeppe, Mondays I have to leave promptly at 6pm due to krav maga, most other days I can be more flexible
<voidspace> rogpeppe, yep, been fun
<rogpeppe> voidspace: branch just merged
<rogpeppe> voidspace: one still to go
<voidspace> I saw
<voidspace> great
<voidspace> yep
<rogpeppe> dimitern: ping
<rogpeppe> fwereade: ^
<natefinch> TIL not to mess with voidspace.... unless there's some other krav maga I don't know about, that's like, a UK form of knitting
<voidspace> natefinch, hehe
<voidspace> natefinch, I've only been doing it for a couple of months
<voidspace> natefinch, I really enjoy it but still at the "complete beginner" stage
<voidspace> natefinch, I still advise not messing with me though...
<perrito666> natefinch: wwitzel3 hi, just wanted to say hello, you seem to be my most overlapped co-teammers :)
<voidspace> right - EOD
<voidspace> 'night all
<bodie_> laters
<voidspace> see you tomorrow
<natefinch> voidspace: night!
<natefinch> perrito666: hi, yes, it's nice not to be the only one here in the afternoons :)
<wwitzel3> perrito666: hi :)
<natefinch> rogpeppe, mgz: wwitzel3 and I were wondering if there were things in state.Open we should *not* be doing in an HA environment... like if you're not the first bootstrap node to come up.   It's a little hard to tell what might cause problems if it were run more than once.
<rogpeppe> natefinch: i've certainly been *trying* to make everything that happens in state.Open applicable in a HA environment
<natefinch> rogpeppe: we're having a problem where bootstrap seems to finish successfully, but then juju status gets an error ERROR state/api: websocket.Dial wss://ec2-54-198-131-15.compute-1.amazonaws.com:17070/: dial tcp 54.198.131.15:17070: connection refused
<rogpeppe> natefinch: is this on tip?
<natefinch> rogpeppe: tip-ish.  I synced my branch last thursday, I believe
<rogpeppe> natefinch: what do you see in the logs?
<rogpeppe> natefinch: (it sounds like the machine agent isn't coming up properly)
<natefinch> rogpeppe: I'm bringing up a new environment now, I'll check and see if I see anything noteworthy in the logs.
<natefinch> rogpeppe: ahh, hmm, looks like we're getting a mongo error when we try to read the replsetconfig while opening state (we do that to know if we need to call initiate)
<rogpeppe> natefinch: sounds right
<rogpeppe> natefinch: i have some code that does the initial replicaset setup for state, if it might be helpful for you
<natefinch> rogpeppe: we actually just were working on that today.  Right now our code does it in state.Open if the replset isn't already initiated
<rogpeppe> natefinch: i'm not sure that's a great idea
<rogpeppe> natefinch: it seems to take quite a long time after setting the initial (1 member!) replica set configuration until you can actually use the mongo again
<rogpeppe> natefinch: i have no idea why that should be
<rogpeppe> natefinch: i would prefer to keep the replicaset logic out of the state package
<natefinch> rogpeppe: I was just putting it there because there's 40 lines of setup to dial mongo, which we have to do anyway
<rogpeppe> natefinch: yeah, i see that, but i wonder if there's some other possibility
<natefinch> rogpeppe: I'm sure we can factor out the 40 lines so we don't have to write it multiple times.  And re-dialing isn't a big deal.
<rogpeppe> natefinch: well, we'll have to redial anyway
<natefinch> rogpeppe: right
<bodie_> This line should read juju-mongodb, right?  http://paste.ubuntu.com/7109636/
<bodie_> It currently says mongodb-server
<bodie_> which I don't think will come from the ppa
<rogpeppe> natefinch: i can think of two possibilities atm
<rogpeppe> natefinch: 1) add a method to State that returns the mongo DialInfo that's appropriate for dialling the state's mongo server.
<rogpeppe> natefinch: 2) factor out the dial code into another package and make state.Open call it
<natefinch> rogpeppe: seems like the mongo package we already have would be a reasonable place to put the refactored method.  Getting stuff out of state seems like a good idea.
<rogpeppe> natefinch: 3) factor the dial code out of state entirely and pass a mongo session into state.Open and state.Initialize
<natefinch> rogpeppe: 2 or 3 seem fine, I don't really have an opinion as to which is better.  Probably passing the session into state is more flexible.
<natefinch> (so I guess I do have an opinion ;)
<natefinch> no strong opinion :)
<rogpeppe> natefinch: the first option is least work
<rogpeppe> natefinch: the third option is the most work
<natefinch> rogpeppe: the difference between 2 and 3 seems small. though I don't know about the tests.  Mostly it seems like cut and paste.
<natefinch> (aside from tests)
<rogpeppe> natefinch: yeah, 3 isn't actually so bad - there are only 33 calls to state.Open in the code
<natefinch> heh
<rogpeppe> natefinch: i think it's my preferred option actually
<rogpeppe> natefinch: but it is actually quite a big change, now i think about it
<rogpeppe> natefinch: we'd need to move state.Info out of state
<rogpeppe> natefinch: it might all work out nicely though
<rogpeppe> natefinch: i would definitely run the idea past fwereade though, as it is definitely making the state abstraction more leaky
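A rough sketch of option 3, assuming the mgo driver juju-core already uses: the dial logic lives outside the state package, and state.Open would accept the resulting session. Illustrative only, not the real signatures.

    package statedial

    import (
            "time"

            "labix.org/v2/mgo"
    )

    // Dial establishes the mongo session that state.Open (option 3)
    // would receive instead of dialling for itself.
    func Dial(addrs []string) (*mgo.Session, error) {
            return mgo.DialWithInfo(&mgo.DialInfo{
                    Addrs:   addrs,
                    Timeout: 10 * time.Minute,
            })
    }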
<wwitzel3> natefinch: done for the day, but I will still be here on and off
<wwitzel3> natefinch: if you make any breakthroughs, make sure you push them to your branch so I can check it out in the morning
<natefinch> wwitzel3: will do.
<natefinch> rogpeppe: I'll mention it to fwereade.  Probably, to get things going, the best idea right now is to do #1.  Bringing on extra work, even if it's refactoring things a little more nicely, is probably not the best idea.
<rogpeppe> natefinch: agreed
<perrito666> hey, I am on my way out, see you all tomorrow :)
<natefinch> perrito666: see you tomorrow :)
<thumper> o/
<natefinch> o/ thumper
<hazmat> thumper, so with aufs local lxc.. is that optional?
<thumper> hazmat: it is about to be
<thumper> hazmat: I have a branch that allows configuration
<hazmat> thumper, awesome
<thumper> spent yesterday writing tests for it mostly
<thumper> hazmat: it was on all the time (other than btrfs)
<hazmat> yeah.. 'twould be nice as a constraint..
<hazmat> but env config is an ok fallback
<hazmat> thumper, cause next would be aa-unconfined profile as container/workload constraint
<hazmat> re ostack on local
<hazmat> all of which would come back to supporting provider specific constraints
 * thumper nods
<rogpeppe> right, i'm done
<rogpeppe> g'night all
<natefinch> EOD for me.  Night all.
<marcoceppi> thumper: did fast lxc land in 1.17.5?
<thumper> marcoceppi: yes, but it seems there may be permission issues
<thumper> looking into it
<marcoceppi> thumper: well if there's something i have to do, like modify a path or whatever, I'm down for that, I just want fast lxc while writing this charm
<thumper> yes, it is there
<thumper> mwhudson: ping
<thumper> mwhudson: maybe you are flying...
 * thumper waits for the merge...
<thumper> phew, third time lucky
<marcoceppi> thumper: so, after the first time I deploy, each deployment (even between bootstraps) should be fast?
<marcoceppi> thumper: oh, actually, I got an error executing lxc-clone
<marcoceppi> thumper: ah, juju-local needs to depend on aufs-tools
<thumper> ah
<marcoceppi> thumper: :\ http://paste.ubuntu.com/7110975/
<thumper> write that shit down :)
<marcoceppi> after installing aufs-tools
<thumper> hmm...
<marcoceppi> thumper: oh, you'll def file a bug as soon as I track down the issue
<thumper> coolio
<thumper-gym> heh
<marcoceppi> thumper-gym: yeah, aufs-tools didn't fix it and i'm not sure where else to look. I'm guessing the lxc-container-aufs option is in trunk and not released?
<mwhudson> thumper-gym: hey, i am at kingsford smith airport so am happy for you to try to stop me getting bored out of my mind :-)
<mwhudson> (that said the internet is pretty rubbish, so i might not be very good conversation)
#juju-dev 2014-03-18
<waigani> wallyworld_: morning
<wallyworld_> hey
<waigani> I have a problem for you:
<waigani> pulled trunk, got "undefined: ratelimit.New", go get github.com/juju/ratelimit, still get error. What did I miss?
<wallyworld_> what error?
<waigani> ratelimit is undefined
<waigani> it can't find the package
<wallyworld_> go get -u
<wallyworld_> maybe
<waigani> ah
<davecheney> marcoceppi: thanks for fixing the charm-helper-sh problem
<waigani> wallyworld_: no luck, strange..
<wallyworld_> hmmm
<wallyworld_> i did hear that tip of ratelimit is not compatible with juju core
<wallyworld_> but for the whole package to be undefined doesn't fit with that
<wallyworld_> i just pulled ratelimit and it is there
<waigani> it must be something with my setup, otherwise others would be hitting this
<wallyworld_> are you sure your GOPATH is correct?
<waigani> I'll check go paths etc (again!)
<waigani> hmm, GOPATH looks correct, ratelimit is in $GOPATH/src/github.com/juju/ratelimit, tried rebuilding pkg hmmm
<thumper> mwhudson: where are you off to?
<mwhudson> thumper: uk
<thumper> sprint?
<thumper> or holiday?
<davecheney> waigani: i'm concerned that the solution to your problems is 'rebuild' everything
<mwhudson> holiday!
<mwhudson> dad's 80th
<thumper> nice
<davecheney> that smells like something is wrong with your setup
<thumper> have fun
<thumper> waigani: you need to do this:
<thumper> go get github.com/juju/ratelimit
<thumper> then run
<mwhudson> thumper: i have to keep reminding myself that the routing with 6 hours in sydney was over $1000 cheaper :-)
<thumper> godeps -u dependencies.tsv
<thumper> mwhudson: haha
<davecheney> mwhudson: where you going ? neptune ?
<thumper> mwhudson: emma and pheobie going too?
<thumper> mwhudson: how do you spell that?
 * thumper feels like it is wrong
<mwhudson> thumper: no, solo
<mwhudson> thumper: phoebe
<davecheney> o before e == long eee sound
<waigani> thumper: thanks, davecheney: more of a desperate attempt than a solution
<davecheney> waigani: lets fix the underlying problem
<thumper> waigani: you have godeps built?
<davecheney> please tell the doctor where it hurts
<thumper> davecheney: the underlying problem is not having the deps right :)
<hatch> hey I just updated to 1.17.5 on precise and now local machines give me this error "(error: container failed to start)" is this a known issue?
<waigani> thumper: godeps -u dependencies.tsv
<waigani> "/home/jesse/go/src/github.com/juju/ratelimit" now at 0025ab75db6c6eaa4ffff0240c2c9e617ad1a0eb
<waigani> godeps: cannot update "/home/jesse/go/src/github.com/juju/testing": fatal: reference is not a tree: 9c0e0686136637876ae659e9056897575236e11f
<davecheney> waigani: go version
<thumper> waigani: cd ~/go/src/github.com/juju/testing
<davecheney> go env | pastebinit
<thumper> git pull
<davecheney> thumper: yeah, if waigani is using go get
<thumper> probably a way to do it without changing directory
<waigani> davecheney: go version go1.2.1 linux/amd64
<davecheney> and a working copy already exists
<davecheney> then it will not update
<davecheney> go get -u github.com/juju/ratelimit
<thumper> also, godeps doesn't pull
<davecheney> is required
<thumper> davecheney: ack
 * thumper breaks his own rules and takes the laptop into the kitchen
<waigani> thumper: davecheney: wallyworld_: all working now, tests pass thanks
<thumper> good
<thumper> axw: all that thinking and we already have it in the Makefile :)
<axw> doh :)
<thumper> davecheney: any idea why I'd be getting link errors when trying to build the tests on power?
<thumper> oh fark
<thumper> stabby
<thumper> stabby
<thumper> stabby
<rick_h_> thumper: needs more stabby!
<thumper> rick_h_: †‹
<thumper> rick_h_: †‹ †‹†‹†‹
<rick_h_> lol
<mwhudson> thumper: heh, you're being roped into the "make it work on funny arch" game?
<thumper> mwhudson: yeah...
<wallyworld_> davecheney: hiya
<thumper> davecheney: I need help
<axw> waigani: shipit
<axw> waigani: well, apparently there's a text conflict in state/initialize_test.go, so after you fix that :)
<waigani> oh really?
<waigani> okay, axw thanks for the massive review
<axw> waigani: no worries :)
<thumper> waigani: https://code.launchpad.net/~waigani/juju-core/remove-checkers-dir/+merge/210935 needs updating with trunk, someone added a test pointing to old checkers
<waigani> thumper: on it
<thumper> waigani: also the migrate testbase branch has conflicts with trunk
<davecheney> hello
<davecheney> i'm here to help
<davecheney> thumper: sorry, was eating
<thumper> davecheney: you aren't allowed to eat
<thumper> my god man
<davecheney> sorry sir
<thumper> why is mongodb-server failing on power?
<thumper> I'm guessing you've solved this already
<davecheney> thumper: ahh
<davecheney> this is a good question
<davecheney> juju looks for /usr/bin/mongod
<davecheney> and chides you if it's not there
<davecheney> BUUUUUUUUUUUUUUUUT
<davecheney> the actual path of juju-mongodb is somewhere else
<davecheney> because this is our special version of mongo
<davecheney> so even if the package is installed, the check still fails
<davecheney> or the upstart script points to the wrong location
<thumper> are we installing that?
<davecheney> well
<thumper> I didn't think that that branch had landed
<thumper> it did land and was reverted
<davecheney> the install instructions say 'apt-get install mongodb-server'
<davecheney> which isn't correct
<thumper> so the default mongo-server on power is fucked
<thumper> and we need to use our one?
<davecheney> thumper: not fucked
<davecheney> not available
<thumper> I installed it
<davecheney> thumper: as I understood it
<davecheney> we (juju) should always be recommending the juju-mongodb package
<thumper> but it fails to start
<davecheney> not just for trusty/ppc64el
<davecheney> but for everything back to precise
<davecheney> thumper: it doesn't work 'cos v8 doesn't run on ppc
<thumper> it should be juju-db package (according to our sabdfl)
<davecheney> the juju-mongodb package does not have a javascript interpreter enabled
<thumper> well... unless you have the v8 power thing from github
<mwhudson> um
<thumper> o/ mwhudson
<mwhudson> i thought a v8 port was available for power?
<mwhudson> probably not packaged though i guess
<thumper> I don't think it is packaged
<davecheney> thumper: sir, it's a little late to reopen this discussion
<thumper> developed but not available
<davecheney> we (juju) should always be recommending the juju-mongodb package
<thumper> davecheney: I'll take it to the people I know...
<davecheney> ^ am I mistaken?
 * thumper sighs
<mwhudson> i agree that davecheney is right about where things _should_ be
<thumper> davecheney: you are correct with what we originally said
<mwhudson> nfi what the current state is though
<thumper> however we had been trumped along the way
<thumper> but I'm not sure whether it has been actioned
 * thumper will take to the email
<davecheney> who wants to get on a hangout and thrash this out ?
<thumper> davecheney: so... your local on power had a hacked juju-db upstart?
<davecheney> thumper: NO
<davecheney> my hack is
<davecheney> ubuntu@winton-02:~$ ls -l $(which mongod)
<davecheney> lrwxrwxrwx 1 root root 24 Mar 14 00:55 /usr/bin/mongod -> /usr/lib/juju/bin/mongod
<davecheney> IMO two things need to happen
<mwhudson> i don't want to muddy the water tooooo much
<davecheney> 1. juju stops recommending the mongodb-server package
<mwhudson> but https://launchpad.net/ubuntu/+source/mongodb/1:2.4.9-1ubuntu1/+build/5664994 is a ppc build of mongo for trusty
<davecheney> 2. upstart scripts are corrected to reflect the path is /usr/lib/juju/bin/mongod
<davecheney> mwhudson: sorry
<davecheney> please, this discussion is over
<davecheney> james page made this package
<mwhudson> ok
<thumper> mwhudson: I did try it, it failed to start
<davecheney> it's too late to try to re-solve this problem
<mwhudson> ah ok, so it builds but doesn't work, ok
<davecheney> we have a solution
<davecheney> it's a really good solution
<mwhudson> davecheney: i'm not trying to change the solution
<davecheney> because it *ALSO* solves the problem for precise
<thumper> davecheney: AFAIK nate is working on changing to use the juju db package
<davecheney> which we still have to support for another 3 years
<thumper> davecheney: it is a different argument about the name 'juju-db' or 'juju-mongodb'
<thumper> I want to see where we are with it
<davecheney> thumper: cool
<davecheney> the name of the replacement package is unimportant
<thumper> davecheney: so you just symlinked /usr/bin/mongod
<davecheney> the key is that it does not live in /usr/bin/mongod
<davecheney> and it isn't called mongodb-server
<thumper> ack
<davecheney> thumper: yes, that is my workaround
<davecheney> to unblock myself
<davecheney> it is not a solution
<thumper> right
 * thumper is on to fix the solution
<davecheney> 1. juju stops recommending the mongodb-server package
<davecheney> 2. upstart scripts are corrected to reflect the path is /usr/lib/juju/bin/mongod
<davecheney> ^ i believe this is the correct sequence
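To make step 2 concrete, a corrected upstart job would exec the juju-mongodb binary by its real path. A minimal sketch (the real juju-db job carries more flags, e.g. for auth and SSL):

    description "juju state database"
    start on runlevel [2345]
    stop on runlevel [!2345]
    respawn
    exec /usr/lib/juju/bin/mongod --dbpath /var/lib/juju/db --port 37017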
<thumper> sure...
<thumper> davecheney: do you want to be CC'ed on the emails, or are you happy to be left off?
<thumper> axw: due to the current aufs issues, I'm going to submit a branch to default aufs to off
<axw> thumper: okey dokey
<davecheney> thumper: sure
<davecheney> email away
<wallyworld_> davecheney: i'm guessing i need to configure the ppc vms to be able to ssh out to an ec2 instance?
<davecheney> thumper: re aufs: i agree
<davecheney> i know mramm really really wants it
<davecheney> but lets get *something* working
<davecheney> before moving on to making it use all the super amazing optional features
<davecheney> wallyworld_: oh, there is no external access
<thumper> wallyworld_: if you have the http proxy
<thumper> davecheney: none at all?
<davecheney> goota go through the proxy
<wallyworld_> thumper: i used the squid proxy to get http
<davecheney> and the proxy is whitelist only
<wallyworld_> the ip addresses didnt work
<davecheney> wallyworld_: i'm really sorry
<davecheney> you have to raise an RT
<davecheney> please cc me on the RT
<davecheney> i'm maintaining a list of open ones
<davecheney> i'm trying to get the PM to be more active in getting them actioned
<thumper> wallyworld_: also, ask for all the vms
<wallyworld_> davecheney: so my ssh-fu is crap - i can't just set up a config and use the proxies from the doc, they are only for http?
<davecheney> wallyworld_: not quite sure what you are asking
<thumper> wallyworld_: axw has good ssh-fu
<davecheney> the doc talks about from bne -> your vm
<davecheney> not the other way around
<davecheney> incoming access is via batuan, the jump host
<wallyworld_> davecheney: yes, but the doc also gives proxies to use inside the vm
<davecheney> outgoing access is via proxy
<wallyworld_> i have outgoing http access via http://squid.internal:3128/
<davecheney> correct
<davecheney> but that is whitelisted
<wallyworld_> but you are saying an RT is needed to get outpoing ssh?
<davecheney> wallyworld_: yes
<davecheney> and also to have additional sites whitelisted for the proxy
<wallyworld_> davecheney: ok, will raise rt. cause once that's done, i have a branch which would have been able to bootstrap a workload on aws
<davecheney> for example I had to ask rog to hold off on the launchpad.net/goyaml -> gopkg.in/goyaml change
<davecheney> because that site was not whitelisted by the proxy
<davecheney> OH FUCK
<davecheney> i just realised that
<davecheney> sync bootstrap
<wallyworld_> yep
<davecheney> we required ssh access from the client
<davecheney> WE"RE FUCKED
<wallyworld_> yep
 * davecheney cries
<wallyworld_> oh?
<wallyworld_> we can't organise that?
<davecheney> the chances of elmo allowing that are effectively zero
<davecheney> can we disable sync bootstrap ?
<axw> what's the problem?
<wallyworld_> no outgoing ssh
<wallyworld_> from ppc boxen
<axw> you can't proxy it?
<axw> what is outgoing?
<wallyworld_> bootstrap tries to ssh into aws instance and fails to connect
<davecheney> axw: sync bootstrap
<axw> if you can ssh, you can ssh out using a tunnel at the worst
<axw> if you can ssh in*
<davecheney> axw: a human can
<davecheney> juju probably cant
<axw> davecheney: do you have an HTTP proxy you can go out through?
<axw> davecheney: if so, you can set up ssh to use HTTPS/CONNECT using ProxyCommand
<axw> man ssh_config, look for ProxyCommand
<axw> and set up your ~/.ssh/config accordingly
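axw's suggestion spelled out as a ~/.ssh/config sketch, tunnelling ssh through the squid proxy mentioned earlier via HTTP CONNECT (the host pattern is illustrative, and this assumes the proxy permits CONNECT to port 22, which the whitelist may not):

    Host *.compute-1.amazonaws.com
        ProxyCommand nc -X connect -x squid.internal:3128 %h %p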
<davecheney> axw: worth a shot
<davecheney> i think some parts of aws are whitelisted
<davecheney> but it might just be the api endpoints
<davecheney> axw: are you actively trying to use the proxy for outgoing ssh ?
<axw> davecheney: I'm not doing ppc stuff, so no
<davecheney> axw: thanks for confirming
<axw> davecheney: I can help debug if you're having trouble though
<davecheney> axw: i am also not trying
<axw> ok
<thumper> wallyworld_: are you trying?
<wallyworld_> no
<wallyworld_> am finishing tests
<wallyworld_> i can try after that cause i want to get the branch proposed
<wallyworld_> and it's a bit of a rabbit hole with existing code cut and pasted around a bit
<wallyworld_> the branch i'm working on is not blocked by the ssh issue
<thumper> wallyworld_: https://codereview.appspot.com/77220043
<thumper> wallyworld_: I changed lxc-clone and lxc-clone-aufs to bools
<wallyworld_> ok
<thumper> wallyworld_: as I tried to use it locally and realised how fucked up strings were in their place
<wallyworld_> will look in 2 minutes
<thumper> sure
<thumper> no rush
<thumper> I have to head out shortly
<thumper> also has aufs defaulting to off for now
<wallyworld_> thumper: what's failing at the moment are tests which i don't think should have passed originally
<thumper> haha
<thumper> damn
<wallyworld_> ie upload tools should not upload for non dev versions
<wallyworld_> ie 1.2.x should not upload since it's not a dev release
<thumper> right... and I take it that it is?
<wallyworld_> yep
<wallyworld_> so i'm having to change existing tests
<thumper> but 1.2.x.1 should right?
<wallyworld_> yep
<thumper> ok
<wallyworld_> those table driven tests are a pita
<wallyworld_> or at least that's how it appears
<wallyworld_> i have to dig through gobs of output
<wallyworld_> i might review your stuff to give my eyes a rest
<davecheney> thumper: https://codereview.appspot.com/77220043/ LGTM
<thumper> davecheney: ta
 * thumper goes away for while...
<davecheney> wallyworld_: is there an MP for the tools/arch change you are working on ?
 * davecheney is ready to review
<wallyworld_> davecheney: it's still wip sadly. it was ready and i used it to test live but i added extra tests and they failed cause we have cut and paste code and broken tests which passed even though they should not have IMO
<davecheney> le fuck
<wallyworld_> unless i am wrong, if version.Current is 1.2.x say, the --upload-tools should not work cause 1.2 is a release version
<davecheney> wallyworld_: i think that you are correct
<davecheney> but that restriction was never implemented in code
<wallyworld_> and yet tests set up such a version.Current and called bootstrap --upload-tools and expected tools to be uploaded
<wallyworld_> well, it was in some places
<davecheney> so, the tests assume that restriction is not in place
<davecheney> remove the restriction ?
<wallyworld_> hmmm. i would like to keep it
<wallyworld_> cause release tools should be available
<wallyworld_> i think it's reasonable to say that upload-tools should be for dev versions
<davecheney> wallyworld_: sure, i have no position on this
<davecheney> --upload-tools always produces x.y.z.1
<wallyworld_> yeah
<davecheney> so there is no possibility of a conflict
<wallyworld_> there were 2 pieces of almost identical code
<wallyworld_> one lot enforced it i think, and the other didn't
<wallyworld_> i've made changes so can't recall exactly
<wallyworld_> but the code is now shared, so we can change the policy in one place
<wallyworld_> davecheney: if you still feel like reviewing https://codereview.appspot.com/77270043 a lot of the diff is deleted/moved code and tests. i also retained upload release versions for explicit uploads
<wallyworld_> davecheney: you should be happy once this lands
<wallyworld_> tested on ppc box but failed due to ssh - the tools bit worked fine
<davecheney> wallyworld_: /me looks
<wallyworld_> ta :-)
<vladk> jam, good morning
<vladk> jam, https://juju.ubuntu.com/docs/config-LXC.html says that
<vladk> "The usage of LXC Linux Containers enforces that bootstrapping and destroying of an environment are done as root. ...
<vladk> ... sudo juju bootstrap"
<vladk> but I have got
<vladk> "ERROR bootstrapping a local environment must not be done as root"
<vladk> where can I write about it?
<dimitern> vladk, it is done by root, but sudo is called later in the bootstrap process
<vladk> dimitern, morning
<dimitern> vladk, morning!
<vladk> dimitern: I see, but there is an error in the documentation: if I follow it, I get the above error. So I would like to know who I can write to to fix the documentation
<vladk> dimitern: how can I 'juju bootstrap' with trusty series?
<dimitern> vladk, evilnickveitch is in charge of the documentation, and there are bugs filed for the docs separately
<dimitern> vladk, https://bugs.launchpad.net/juju-core/docs
<dimitern> vladk, so on trusty, you just need to do juju bootstrap --upload-tools
<vladk> dimitern: how should I understand upload-tools?
<dimitern> vladk, ah, that's an instruction to bootstrap to package your existing binaries (from cmd/juju and cmd/jujud) into a tools.tar.gz package and set the version to version.Current.Number + 0.0.0.1
<dimitern> vladk, i.e. if version.Current is 1.17.6, with --upload-tools version 1.17.6.1 will be uploaded (and .2 next time you call --upload-tools and so on)
<jam> vladk: morning
<jam> vladk: the docs do need updating, they are correct for juju 1.16.6 (our last stable release), but not correct for trunk
<jam> however, we won't update the docs until 1.18 is out, since otherwise they'd be wrong for everyone using stable
<evilnickveitch> vladk, dimitern, feel free to file bugs anyhow on things you think are wrong, but mention you are running the dev version
<evilnickveitch> someday we will have notes for the unstable releases too
<dimitern> evilnickveitch, \o/
<rogpeppe> mornin' all
<vladk> rogpeppe: morning
<rogpeppe> vladk: hiya
<dimitern> mgz, fwereade, maas vlan meeting?
<jam> mgz: vladk, fwereade: care to join us to discuss MaaS ?
<jam> https://plus.google.com/hangouts/_/canonical.com/discuss-vlan
<mfoord> rejoining
<jam> mgz: poke about MaaS vlan
<jam> mgz: poke
<perrito666> good morning everyone
<jam> morning perrito666
<jam> wwitzel3: let me know when you come online
<wwitzel3> jam: I'm here
<jam> wwitzel3: hi. We're still doing some of the MaaS VLAN discussion, but I think we're winding down.
<jam> wwitzel3: I'm in the hangout associated with the 1:1 calendar event I sent you
<jam> mgz: poke for standup
<dimitern> vladk, ^^
<mgz> gah, sorry guys
<adeuring> could somebody have a look here: https://codereview.appspot.com/77260044 (trivial fix)
<voidspace> adeuring: LGTM
<adeuring> voidspace: thanks
<rogpeppe1> voidspace: shall we make another hangout?
<jam> natefinch: if you have known tests that fail, and we can fix them with runtime.GOMAXPROCS(1), we can certainly add them.
<wwitzel3> I'm interested in vlan stuff, but if I sit in on that, I will lose all the HA stuff jam and I talked about this morning
<wwitzel3> natefinch: I am going to take a look at the code changes you pushed up and run them through MaaS, ping me when you're back in the saddle.
<rogpeppe1> voidspace: https://plus.google.com/hangouts/_/7acpj0plqg71qcoie9g4705ddo?hl=en-GB
<voidspace> rogpeppe1: coming
<voidspace> just testing something - please ignore this message (or not at your pleasure)
<rogpeppe1> mgz, fwereade, dimitern, jam: we want to factor out the address-related stuff from the instance package. suggested package is juju-core/netaddr, with the same Address, etc types that are currently in the instance package
<rogpeppe1> does that seem reasonable?
<dimitern> rogpeppe1, +1
<mgz> why the factor out?
<rogpeppe1> mgz: because addresses need to be seen by clients, and it doesn't really make sense for clients to import instance
<mgz> that surprises me a little, I thought everything imported instance
<rogpeppe> mgz: there's a load of stuff in the instance package which is really not applicable to API clients
<mgz> eg, lots of things use Instance.Id even if they don't use other bits
<rogpeppe> mgz: factoring out the address-related stuff seems kind of reasonable to me
<rogpeppe> mgz: instance.Id, yes
<mgz> it seems reasonable to move it if it helps things
<mgz> I just thought instance was pretty pervasive
<rogpeppe> mgz: it is, but i guess to me the address stuff feels much less juju-specific than the other stuff in the instance package
<rogpeppe> mgz: and dividing them up could make for two nicely coherent packages
<rogpeppe> mgz: but if you don't think it's a good idea, i don't think it matters too much
<mgz> it seems reasonable if it actually helps anything
<mgz> as in, if there are places that don't end up importing both anyway :)
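The shape of the proposed split, roughly: a small leaf package holding just the address types, importable by API clients without dragging in the rest of instance. A sketch with guessed field names, not the actual juju-core types:

    package netaddr

    // AddressType distinguishes hostnames from IP literals.
    type AddressType string

    const (
            HostName    AddressType = "hostname"
            Ipv4Address AddressType = "ipv4"
            Ipv6Address AddressType = "ipv6"
    )

    // Address is a machine address as seen by both servers and API clients.
    type Address struct {
            Value string
            Type  AddressType
    }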
<TheMue> mgz, rogpeppe: could you tell me why juju-backup/juju-restore are not regular commands (e.g. juju backup/juju restore)?
<TheMue> and why they are called plugins?
<mgz> TheMue: they are, just magic plugin things
<rogpeppe> TheMue: they're a bit hacky to be first-class citizens
<rogpeppe> TheMue: but "juju backup" should work
<TheMue> rogpeppe: hacky in the way they are implemented? sure, a bit different. ;)
<rogpeppe> TheMue: as should "juju restore"
<rogpeppe> TheMue: (the former requires the shell script to be installed tho)
<TheMue> rogpeppe: hmm, where is the magic hidden that the script is called? didn't find a reference to the script in the code. maybe I'm too blind.
<rogpeppe> TheMue: it starts with "juju-"
<TheMue> rogpeppe: ah, so if juju (the command) doesn't find a command (e.g. "foo") directly implemented it looks for a "juju-foo" file to execute it?
<TheMue> rogpeppe: then also the naming as plugin makes sense
<TheMue> rogpeppe: found the magic, thanks. important for the documentation I'm writing
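The magic in a nutshell: if juju doesn't recognise a subcommand, it searches $PATH for an executable named juju-<subcommand> and execs it. A toy reimplementation, not juju-core's actual code:

    package main

    import (
            "fmt"
            "os"
            "os/exec"
    )

    // runPlugin maps an unknown subcommand "foo" onto an executable
    // "juju-foo" found on $PATH, passing the remaining args through.
    func runPlugin(subcommand string, args []string) error {
            path, err := exec.LookPath("juju-" + subcommand)
            if err != nil {
                    return fmt.Errorf("unrecognised command: juju %s", subcommand)
            }
            cmd := exec.Command(path, args...)
            cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
            return cmd.Run()
    }

    func main() {
            if len(os.Args) < 2 {
                    fmt.Fprintln(os.Stderr, "usage: juju <command> [args...]")
                    os.Exit(2)
            }
            if err := runPlugin(os.Args[1], os.Args[2:]); err != nil {
                    fmt.Fprintln(os.Stderr, err)
                    os.Exit(1)
            }
    }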
<natefinch> jam: want to meet now?
 * jam whimpers
<jam> soo many meetings
<jam> but actually yes
<natefinch> jam: ok
<jam> fwereade: are you back yet
<natefinch> jam: maybe the new manager person will help take some of the meetings off your plate?  She has a name that I have, of course, forgotten.
<jam> natefinch: alexis is *my* manager, not a team lead
<jam> so just another person to report to :)
<natefinch> jam: heh... I just hoped she might be able to help with some of the managerial stuff, but sounds like that's not the problem :)
<jam> natefinch: I'm in the channel from the earlier calendar event
<natefinch> jam:  oh, ok
<sinzui> jamespage, Did you unsubscribe me from ~juju-packaging/+archive/stable ?
<jamespage> sinzui, nope
<sinzui> oh dear. I don't know who could do it besides you and me
<sinzui> jamespage, I will avert the 1.18.0 release catastrophe by resubscribing and updating cloud city with the new credentials
<rogpeppe> x
<rogpeppe> mgz: fancy a review of some instance package additions? https://codereview.appspot.com/77410043/
<voidspace> rogpeppe: I can hear you
<voidspace> rogpeppe: wow, that's about a ten second latency on audio
<rogpeppe> voidspace: ok, let's try the IRC challenge
<rogpeppe> voidspace: i'll type x and say it at the same time
<voidspace> rogpeppe: ok
<rogpeppe> x
<rogpeppe> x
<rogpeppe> y
<voidspace> rogpeppe: the first one seemed instantaneous
<rogpeppe> z
<voidspace> rogpeppe: maybe a second or two
<voidspace> rogpeppe: let me try
<voidspace> rogpeppe: this
<voidspace> rogpeppe: did you even get that?
<voidspace> wow
<voidspace> ouch
<rogpeppe> about 11 seconds delay
<voidspace> rogpeppe: wwitzel3: I have 1Mbit upstream
<voidspace> rogpeppe: wwitzel3: maybe if I switch off my camera it will help
<voidspace> reduce my upstream
<voidspace> rogpeppe: lunch sounds *great*
<voidspace> :-)
<rogpeppe> awesome, let's do lunch :-)
<voidspace> byeeee
<natefinch> wwitzel3: ok, ready, finally
<wwitzel3> natefinch: ok :)
<adeuring> another trivial MP: https://codereview.appspot.com/77430043
<mgz> rogpeppe: looking
<axw> hazmat: you don't use the PublicAddress client API do you?
<axw> hazmat: actually.. never mind, probably too late in 1.17 to remove it now.
<mgz> rogpeppe: lgtm
<rogpeppe> mgz: thanks
<mgz> adeuring: lgtm.
<adeuring> mgz: thanks
<mgz> adeuring: did you also check the other providers just in case? :)
<adeuring> mgz: let me look again ;)
<adeuring> no looks good
<bits3rpent> Any reason updated master wont replace juju binaries?
<natefinch> are you doing juju install or juju build? Build doesn't install, it just compiles
<natefinch> er sorry, go build vs go install
<natefinch> brains
<bits3rpent> go install launchpad.net/juju-core/...
<bits3rpent> haha, it's fine :)
<hazmat> axw, i don't, there's a usage mode in deployer (-f) where people want that but i extract from status
<hazmat> axw, iotw i was going to.. but no i don't use it atm
<fwereade> bits3rpent, are you talking about replacing binaries in $GOBIN or those installed via apt?
<natefinch> bits3rpent: that should work, unless there's no changes to the code.   Do you have just a single path in your GOPATH?  go install always installs to the first path in GOPATH, if you have multiple
<natefinch> fwereade: ahh good question
<bits3rpent> fwereade: $GOBIN
<axw> hazmat: thanks, it'll likely stay in anyway
<fwereade> bits3rpent, hmm, that's surprising -- does it install to that location if you move/delete what's already there?
<axw> fwereade: I'm planning to make "juju ssh" connect to private addresses, proxying through the API server address. Any objections to this approach?
<axw> it won't be azure-specific
<fwereade> axw, I'm +1 on that in general, public ips are in short supply all over the place
<axw> cool, I can work on that independent of the azure stuff
<axw> fwereade: did you see my reply about PrincipalUnits vs. DistributionInstances? I can't really see a way around it while also preventing add-machine at that point
<fwereade> axw, sorry, I didn't get to that yet, just a sec
<axw> add-machine can be prevented by a Prechecker, but that's not supposed to be reliable
<fwereade> axw, remind me the forces that lead towards a callback rather than a plain slice
<fwereade> axw, but DistributionInstances sounds like a tolerable name to me :)
<axw> fwereade: purely for performance
<fwereade> axw, so we don't bother calculating it for providers that can't handle it? fair enough
<axw> fwereade: though it's on the state server already, so... *shrug*
<fwereade> axw, I'm fine leaving it as a callback, I had a vague recollection there was another reason to prefer it, I just can't remember what
<axw> fwereade: perhaps I'll think of it in the middle of the night, but I can't think of any other reason now
<fwereade> axw, fwiw I wouldn't be opposed to a field that took a string that providers are free to use to try to helpfully tag instances if possible
<axw> fwereade: sure. that's not a blocker really. the thing blocking me now is that I can no longer prevent StartInstance with no assigned principals
<mgz> axw: it *is* the middle of the night, isn't it?
<axw> mgz: it's only 10:40pm :)
<mgz> :)
<fwereade> axw, I see -- it means there's no way to prevent add-machine?
<axw> fwereade: yup
 * fwereade makes a face and stares into space a bit
<axw> fwereade: I'm using the principals in two places for that: in PrecheckInstances, and in StartInstance
<axw> PrecheckInstance*
<fwereade> axw, how do we prevent --to?
<axw> fwereade: a new UnitAssigner state policy
 * axw context switches a bit
<axw> fwereade: UnitAssigner.CheckUnitAssignment(*state.Unit, *state.Machine) error
<fwereade> axw, yeah, just found it again
<axw> fwereade: so Azure just returns an error all the time. It only gets called if a new machine isn't created
<axw> err, if an existing machine is requested
<fwereade> axw, so, one thing we had for a while that didn't end up sitting quite right was having the environ itself return the assignment policy
<fwereade> axw, ie AssignLocal, AssignCleanEmpty, etc
<fwereade> axw, I'm trying to figure out if there's some similar way of describing the allowable operations at that sort of level
<axw> fwereade: --to effectively overrides the policy
<axw> oh, I see what you mean
<fwereade> axw, thinking a bit more: it's *kinda* a CanAssignUnits flag on an environ, that controls validity of both --to and add-machine
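The two mechanisms under discussion, side by side, as a hypothetical sketch (stub types stand in for *state.Unit and *state.Machine; none of this is the real juju-core API):

    package policy

    // Stubs standing in for *state.Unit and *state.Machine.
    type Unit struct{}
    type Machine struct{}

    // UnitAssigner is the state policy axw mentions: it vets assigning
    // a unit to an existing machine (the --to path); azure's
    // implementation would always return an error.
    type UnitAssigner interface {
            CheckUnitAssignment(u *Unit, m *Machine) error
    }

    // UnitPlacer is the environ-level flag idea: a provider that cannot
    // co-locate units declares it once, gating both --to and add-machine.
    type UnitPlacer interface {
            CanAssignUnits() bool
    }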
<voidspace> rogpeppe: are you back?
<rogpeppe> yup
<axw> fwereade: sounds reasonable. I'm fine with blocking it at that level
<voidspace> rogpeppe: want to try starting a new hangout
<voidspace> rogpeppe: nearly typed hangup by mistake
<rogpeppe> voidspace: https://plus.google.com/hangouts/_/76cpjgribubncq8d4f1s1nalr0?hl=en-GB
<voidspace> I have enough hangups already without you starting more
<voidspace> cool :-)
<axw> fwereade: though, the UnitAssigner was meant to grow to handle filtering machines that can be assigned to for ec2 AZs
<axw> fwereade: so you may support assignment, but you can't assign to any old machine
<fwereade> axw, --to overrides that, right? this is just for auto-assignment to existing machines?
<axw> fwereade: yup
<axw> fwereade: supporting add-machine/--to could be a flag, independent of that
<axw> just saying I had intended to put them both together in one interface
<bits3rpent> Is there any way to only display unassigned bugs on launchpad?
<fwereade> axw, I'm wondering whether the assignment could be handled with an interface more like FurthestPossibleInstance(candidates, onesToAvoid []Instance) (Instance, error)
<fwereade> bits3rpent, sorry: there's an Assignee field in https://bugs.launchpad.net/juju-core/?advanced=1
<fwereade> bits3rpent, with a "Nobody" radio button
<bits3rpent> fwereade, thanks!
<axw> fwereade: for ec2, yes, that sounds about right
<fwereade> bits3rpent, if you click the "search" button at the top of the page it might ignore the stuff you specify below, though
<fwereade> bits3rpent, use the one at the bottom :)
<fwereade> axw, well, for azure in assignment-allowed mode we can just return any of them
<bits3rpent> fwereade, will do :)
<fwereade> axw, (and for ec2 we will also need an error meaning "you can start an instance that's even further than any of those"
<fwereade> )
<axw> yeah that makes sense
<axw> this is getting ahead of the immediate need though I think
<axw> fwereade: did you have any thoughts on how to flag a StartInstance as being add-machine'd without PrincipalUnits?
<axw> how that works actually has ramifications on API and maybe state
<axw> the other stuff can be changed with refactoring
<fwereade> axw, I was thinking we could actually get away with doing so at the API/state layer, if we just look up the flag on the environ
<fwereade> axw, if we can guarantee only good data gets into state, we don't need to worry about checking it again later
<fwereade> note: the above is obviously not true
<fwereade> but sometimes it's not unreasonable to pretend it is
<axw> :)
<axw> it's actually not the end of the world if we do start the instance, so I'm okay with that
<axw> so basically some variation of the PrecheckInstance modification I did
<marcoceppi> I've got a bone to pick with the destroy-environment command
<fwereade> marcoceppi, oh yes?
<marcoceppi> fwereade: I just watched someone type juju status <environment> and get annoyed that the output was something other than what the user expected
<marcoceppi> We're mixing user experiences with environments and destroy-environment started this
<fwereade> marcoceppi, ha, so the destroy-environment change trained them wrong
<fwereade> marcoceppi, bugger
<natefinch> dang
<marcoceppi> yeah, I feared this, but now I have evidence
<fwereade> can anyone remember where that requirement originated?
<axw> fwereade: better go to sleep before I fall asleep on the keyboard. I'll rework my stuff and repropose after the ssh proxy branch is up. Thanks for the chat again...
<fwereade> axw, thank you
<fwereade> axw, sleep well
<natefinch> fwereade, marcoceppi:  it was my change that made the environment name mandatory, with -e optional, because flags shouldn't be mandatory
<natefinch> (for destroy-environment)
<natefinch> fwereade, marcoceppi: the name was made mandatory due to feedback from several users at canonical that said destroy environment was scary, especially if you have several environments on the same machine.  So forcing the user to type out the name of the environment is insurance against destroying the wrong one
<marcoceppi> natefinch: I'd rather just make the -e flag mandatory
<fwereade> marcoceppi, natefinch: if it's causing active upset I think I'd come down on the side of consistency with other uses of -e there... did we do a stable release with that change in yet?
<marcoceppi> fwereade: I don't think so
<natefinch> marcoceppi, fwereade: yeah, in hindsight, the consistency with the rest of the commands is more important than the fact that it's weird to have a mandatory flag
<fwereade> natefinch, I'm worried that people will have updated to use the no-e version
<fwereade> natefinch, any serious drawback to just accepting both forms?
<marcoceppi> fwereade: could add a deprecation warning
<natefinch> fwereade: we accept both now
<natefinch> fwereade: or at least, on tip we do
<fwereade> marcoceppi, natefinch: just swap the deprecation warning :)
<marcoceppi> fwereade natefinch should I file a bug wrt this?
<sinzui> natefinch, Are you still working on Bug #1271937, I thought someone else was going to be assigned
<_mup_> Bug #1271937: Use juju-mongodb when the package is available <arm64> <ci> <mongodb> <ppc64el> <trusty> <juju-core:In Progress by natefinch> <https://launchpad.net/bugs/1271937>
<mgz> fwereade: if I hurl up a mp of a first pass of state networks doc, can you look it over?
<fwereade> mgz, soon, on a call atm
<mgz> I currently lack tests, and want to know what a sane base level of testing is for new state documents
<mgz> (my template, constraints, has no unit-testy bits for just the constraints doc)
<dimitern> anyone willing to review my fix for bug 1291400 ? https://codereview.appspot.com/77340044/
<_mup_> Bug #1291400: migrate 1.16 agent config to 1.18 properly (DataDir, Jobs, LogDir) <regression> <upgrade-juju> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1291400>
<dimitern> mgz, voidspace ^^ ?
<dimitern> natefinch, wwitzel3 ^^ ??
<mgz> dimitern: can probably look in a sec
<dimitern> mgz, ta! it's not huge
<voidspace> dimitern: he beat me to it...
<voidspace> xchat notifications are very subtle
<mgz> fwereade: https://codereview.appspot.com/77270046
<vladk> dimitern: ping
<natefinch> sinzui: about Bug #1271937, I have code that makes it work.  it should be landing this week. I'm not sure what the timeline is for people that want it
<_mup_> Bug #1271937: Use juju-mongodb when the package is available <arm64> <ci> <mongodb> <ppc64el> <trusty> <juju-core:In Progress by natefinch> <https://launchpad.net/bugs/1271937>
<dimitern> voidspace, :) thanks anyway
<dimitern> vladk, pong
<sinzui> natefinch, The landing will unblock several people. I will consider a 1.17.6 release when the fix is available.
<fwereade> mgz, yeah, that's pretty much what I was looking for
<natefinch> sinzui: there's basically two ways to make it work:  one involves a fair amount of code, which is what I have, and one is a little hackier but much easier.  currently we create the upstart script on the client and send it to the bootstrap node, so it doesn't know if the bootstrap node has juju-mongodb or not.
<natefinch> sinzui: the fix is either to create the upstart script on the bootstrap node, or just create it with the command as "mongod" and let the PATH figure out which one to use
 * sinzui nods
<natefinch> sinzui: my code does the former, but the latter would be trivial to get in.  my code should be ready in a day or two, but historically I've been bad about estimating when this particular code will finish
 * fwereade bbs
<wwitzel3> rogpeppe: any idea where we might get a hold of the actual hostname of a machine (to use for mongo replicaset) during bootstrap?
<rogpeppe> wwitzel3: ha ha
<rogpeppe> wwitzel3: actually...
<rogpeppe> wwitzel3: perhaps you can, these days
<rogpeppe> wwitzel3: i'm not entirely sure whether bootstrap-state has a valid environ config now or not
<wwitzel3> rogpeppe: ok, so the environ config is where it would be though
<rogpeppe> wwitzel3: no, but the environ config lets you get an environs.Environ which allows you to get an instance.Instance which allows you to find out the address :-)
<rogpeppe> wwitzel3: you'll also need the bootstrap machine's instance id
<wwitzel3> rogpeppe: obviously
<rogpeppe> wwitzel3: from ... i can't quite remember where
<wwitzel3> rogpeppe: sounds straightforward ...
<dimitern> wwitzel3, rogpeppe, it's provider-state
<rogpeppe> dimitern: ah yes
<dimitern> bootstrap-state just says "storage is writable"
<rogpeppe> dimitern: i guess now that bootstrap is synchronous, there's actually no need for the bootstrap instance to query the provider-state file
<rogpeppe> dimitern: (assuming it still does)
<wwitzel3> rogpeppe, dimitern: what does that mean, it's provider-state?
<rogpeppe> mgz: another instance package review for you: https://codereview.appspot.com/77470043
<rogpeppe> wwitzel3: there's a file in the environ's Storage that's conventionally used to store the bootstrap machine's instance id
<dimitern> mgz, review poke ;)
<mgz> dimitern: nearly have a gap :0
<dimitern> mgz, cheers :)
<wwitzel3> rogpeppe: so how do you retrieve an instance if you have a given id?
<rogpeppe> wwitzel3: Environ.Instances([]instance.Id{id})
<rogpeppe> wwitzel3: godoc launchpad.net/juju-core/instance and  launchpad.net/juju-core/environs for details
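Put together, the lookup rogpeppe sketches: read the bootstrap instance id from provider-state, then ask the Environ for the instance and its addresses. Roughly, against the 1.x packages (the helper here is illustrative):

    package bootstrapaddr

    import (
            "launchpad.net/juju-core/environs"
            "launchpad.net/juju-core/instance"
    )

    // bootstrapAddresses resolves the bootstrap machine's addresses from
    // the instance id conventionally stored in the provider-state file.
    func bootstrapAddresses(env environs.Environ, id instance.Id) ([]instance.Address, error) {
            insts, err := env.Instances([]instance.Id{id})
            if err != nil {
                    return nil, err
            }
            return insts[0].Addresses()
    }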
 * dimitern is away for a while
<rogpeppe> any reviews of this would be appreciated:  https://codereview.appspot.com/77470043
<rogpeppe> voidspace: away for a minute or two
<viperZ28_> it looks like Juju does not have the ability to enforce runtime state, only startup
<viperZ28_> In my test I brought up a multi-node RabbitMQ cluster, I then took one of the machines out using `lxc-shutdown`,
<viperZ28_> Juju did not try to restart the machine or spin up another one
<viperZ28_> I was hoping Juju would sense the downed machine and make an attempt to restart it
<viperZ28_> I am also looking for plans to integrate with vSphere/ESX stack
<perrito666> hey fine people, please review this if you've got the time https://codereview.appspot.com/77490043
<natefinch> viperZ28_: you're correct, it doesn't do that currently.  At some point it probably will, we're just not there yet.
<viperZ28_> any plans for ESX/v* integration?
<bodie_> anyone familiar with wemux?
<bodie_> I'm trying to set up pairing for my team and it's not being friendly
<natefinch> viperZ28_: as far as I know there's no plans for it, but plans can change and I'm not the leader of much.  fwereade might have a better idea, but I think he's AFK for a while
<fwereade> viperZ28_, oddly enough we've found that most people seem to get upset when we start up downed units/machines for them -- and the python version *did* do that, and nobody noticed for months, until they tried to shut down some unused machines out-of-band and got upset when juju replaced them :/
<viperZ28_> perhaps this should be an option
<voidspace> rogpeppe: grabbing a drink too
<viperZ28_> if I want a HA RabbitMQ cluster I could not use Juju for this use case
<wwitzel3> rogpeppe: can you invite me to your hangout?
<fwereade> viperZ28_, I think I agree there, but I'm not sure I can promise it'd get scheduled any time soon -- to Do It Right in the general case we'd need to be able to assign the storage from the old nodes to the new ones
<bits3rpent> is there still a bug in juju debug-log?
<fwereade> viperZ28_, well, you'd need monitoring and human response, at least
<fwereade> viperZ28_, automated response to *apparent* failure can itself be problematic
<bodie_> anyone know what's up with the bug that's like
<bodie_> # launchpad.net/juju-core/worker/instancepoller
<bodie_> go/src/launchpad.net/juju-core/worker/instancepoller/aggregate.go:67: undefined: ratelimit.New
<bodie_> ?
<bodie_> I'm getting this as output of go get on a fresh system
<viperZ28_> seems in a cloud model or IT as a service this is a requirement, of course with added alerting and throttling
<natefinch> bodie_: that's a new package that recently got added
<natefinch> bodie_: go get github.com/juju/ratelimit I believe
<bodie_> shouldn't that be pulled down automatically if it's needed?
<bodie_> bits3rpent check this out
<fwereade> viperZ28_, it also enables a whole host of exciting problems -- consider a network partition in which a majority of nodes are lost to the state servers, and the state servers bring up replacements
<natefinch> bodie_: go get -u launchpad.net/juju-core  will get it, but it'll overwrite whatever branch you're working on in juju-core
<viperZ28_> I think this is manageable as long as a single master is controlling units running
<natefinch> bodie_: do a go get launchpad.net/godeps   and then you'll have the godeps program, which we use to track dependencies
<natefinch> bodie_: then from the root of juju-core, you can do godeps -u dependencies.tsv and it'll tell you if any dependencies are missing or out of date
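natefinch's dependency-fixing recipe in one place:

    go get launchpad.net/godeps
    cd $GOPATH/src/launchpad.net/juju-core
    godeps -u dependencies.tsv      # reports missing or out-of-date deps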
<viperZ28_> so is the Juju model more about orchestrating startup and less about operation?
<viperZ28_> or should I say maintaining state and operation
<fwereade> viperZ28_, (also: re vsphere/esx: no current plans, but we love new substrates and would be keen to assist anyone who wanted to work on it)
<natefinch> viperZ28_: for now, basically, yes.  I mean, for operation, you can add or remove instances, add or remove services.
<fwereade> viperZ28_, not necessarily -- but the correct responses at runtime are very application-specific
<fwereade> viperZ28_, and hence more within the individual charms' purview
<viperZ28_> this is true, I think this is why there is a gap in PaaS and IaaS when talking about services
<bodie_> natefinch, I got that bit.  it looks OK now.
<natefinch> bodie_: cool
<rogpeppe> wwitzel3: https://plus.google.com/hangouts/_/76cpjgribubncq8d4f1s1nalr0?hl=en-GB
<fwereade> viperZ28_, more detailed runtime orchestration is certainly possible via the API fwiw -- we're not currently focused on building more sophisticated external brains for juju, but we're definitely not opposed to them :)
 * fwereade puts his own supper on, bbs
<viperZ28_> Does Juju currently have REST API?
<viperZ28_> I saw this https://bugs.launchpad.net/juju/+bug/804284, but didn't see a resolution
<_mup_> Bug #804284: API for managing juju environments, aka expose cli as daemon <pyjuju:Triaged> <juju-core:Fix Released by jameinel> <https://launchpad.net/bugs/804284>
<hatch> viperZ28_ that bug is resolved
<viperZ28_> is there any documentation for the API?
<hatch> I don't think so, the GUI communicates via a websocket
<bodie_> cool
<bodie_> that appeases me
<natefinch> viperZ28_, hatch: right, there's an API, but it's through websocket  (the CLI uses that too)
<fwereade> viperZ28_, er, not really :( -- the juju-gui talks to it, but I'm not sure that bit is very well abstracted out; and then there's https://launchpad.net/python-jujuclient
<fwereade> bbiab again
<hatch> well, not abstracted out to the point you could put it into another project
<hatch> there is a feature request to do that however
<viperZ28_> thanks all for the insight
<hatch> viperZ28_ the GUI is open source so you could extract the commands from here https://github.com/juju/juju-gui/blob/develop/app/store/env/go.js
<hatch> ex) https://github.com/juju/juju-gui/blob/develop/app/store/env/go.js#L454
<bits3rpent> Does the API server in fact currently log everything as debug?
<bits3rpent> I am trying to replicate https://bugs.launchpad.net/juju-core/+bug/1173122
<_mup_> Bug #1173122: API server should not log passwords <logging> <security> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1173122>
<bits3rpent> tail -f ~/.juju/local/log/unit-*.log shows no debug messages
<bits3rpent> Has anyone replicated this bug?
<natefinch> bits3rpent: if you bootstrap with --debug it'll output debug messages
<bits3rpent> Thank you.
<bodie_> so many gotchas :(
<natefinch> bodie_: is there something we can do to help?
<bodie_> we're just figuring out the collaborative side of things right now, I'm putting a vm together for us to share that hopefully will have working versions of the code and such
 * rogpeppe is done for the day
<rogpeppe> g'night all
<bodie_> nite
<perrito666> hey, my local copy of juju-core, obtained using go get, lacks a trunk; it does have a branch master, why is that?
<natefinch> perrito666: go get doesn't actually check out the branch, it just gets the code.  You can do a bzr branch lp:juju-core to get an actual branch
<perrito666> natefinch: just a branch and then I can switch to trunk?
<natefinch> perrito666: there's no real reason to be on trunk per se.  You can't commit directly to trunk anyway, and a branch of trunk is just as good as trunk as long as it's up to date
<perrito666> natefinch: indeed, but since I was just pair programming with mgz and his local had trunk, I was wondering if my own branch was from the wrong place
<natefinch> perrito666: oh, so, that's not a real thing.  That's a branch called trunk that he keeps in sync with master
<natefinch> perrito666: he's using cobzr, which is a way of keeping multiple branches in the same directory on disk.   It's the way git works, and it's handy for working with go.
<perrito666> natefinch: I am cobzring too :)
<natefinch> perrito666: the way you start it is to do bzr checkout lp:juju-core, which will grab trunk/master/tip, whatever you want to call it, from juju-core, then do bzr switch -b trunk, and that'll make you a colocated branch called trunk that you can keep in sync with main
<natefinch> perrito666: ahh good
<perrito666> altough the idea to have a trunk in place is nice
<natefinch> perrito666: yeah, it's a lot easier to just have a cobzr branch called trunk, then you can always just bzr switch trunk; bzr pull; bzr switch -b <newbranch>
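And the cobzr workflow natefinch describes, as a block (the branch name is illustrative):

    bzr checkout lp:juju-core       # grab upstream tip
    bzr switch -b trunk             # colocated branch tracking upstream
    # day to day:
    bzr switch trunk && bzr pull    # refresh trunk
    bzr switch -b new-feature       # branch new work from it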
<bodie_> you guys need a bot to answer these questions :S
<bodie_> I had the same exact problems not two days
<bodie_> ago
<natefinch> bodie_: how do you know I'm not a bot? :)
<perrito666> bodie_: we could also add it to CONTRIBUTING or README I guess
<perrito666> :p
<bodie_> now that's just crazy talk
<bodie_> READ the README?
<perrito666> bodie_: for what I have seen of natefinch he is a bot that has a tiny human controlling him in front of him
<bodie_> lol
<bodie_> pay no attention to the man behind the curtain...
<bodie_> or, maybe it's more like the men in black aliens that pilot a humanoid robot
<natefinch> bodie_: he means my 9 month old daughter who sits on my lap for our morning standups because she always wakes up at 6am
<natefinch> (or earlier)
<bodie_> I can so relate!
<bodie_> the tiny masters are powerful...
<natefinch> yep
<perrito666> natefinch: tx, I have my repo up and running
<perrito666> out of the box tests explode right?
<natefinch> perrito666: uh.... not if your environment is set up correctly.  The main thing is that you need mongo with SSL installed in /usr/bin  (note that apt-get does not install one with SSL capability)
<natefinch> perrito666: also you currently need to *not* have mongod at /usr/lib/juju/bin/mongod
<perrito666> duh, I had run the install dependencies step on another vm
<perrito666> natefinch: thanks again
<natefinch> perrito666: welcome
<perrito666> curiosity, tests also fail if bzr name is not set
<natefinch> perrito666: yeah that should really be fixed, but since you'll almost certainly have one to work on juju, it's usually not a huge deal
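For anyone hitting that failure: the bzr identity the tests expect can be set with bzr whoami (the name and address here are placeholders):

    bzr whoami "Jane Hacker <jane@example.com>"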
<thumper> morning
 * thumper peruses the email for urgent items
<bodie_> howdy
<perrito666> one more: I am getting compile errors from lxc.go, something there that I am missing, right?
 * perrito666 runs updates for go lxc
<waigani> thumper: https://codereview.appspot.com/76670043/, https://codereview.appspot.com/73390043/ (added tests for auto-restart) (oh and good morning)
<thumper> morning
<thumper> perrito666: what sort of errors?
 * thumper looks at his calendar
<thumper> natefinch: ping
<natefinch> thumper: yo
<thumper> hi natefinch
<thumper> the juju-mongodb package work, is that progressing?
<thumper> or shall I take it off you?
<natefinch> wayne and I are finishing it up. Currently works except for the local provider for some reason
<thumper> I'm also curious as to the actual other reasons it broke things
<thumper> natefinch: want me to look at it?
<natefinch> thumper: sure, help would be appreciated. the branch is lp:~natefinch/juju-core/030-MA-HA
 * thumper grabs it
<natefinch> thumper: the reason it broke things is because we were creating the upstart script for mongo on the client, and using the existence of /usr/lib/juju/bin/mongod on the client to determine the path of mongod to specify in the upstart script that gets created on the server
<thumper> ah
<natefinch> how that came about was that I landed half the changes in a branch, and so it had the changes to figuring out mongod, but not the changes to create the upstart script on the server
 * thumper nods
<natefinch> thumper: also, destroy environment local isn't cleaning up the new upstart script yet, so you may need to do a stop juju-db-v2 between bootstraps, or it'll complain the mongo port is in use
<thumper> any particular reason for changing the upstart script name?
<natefinch> thumper: poor man's version control
<natefinch> so we can know if we need to remove the old version and install the new version
<thumper> natefinch: what is the purpose of all the agent config changes?
<thumper> seems somewhat unrelated to just the mongo package
<natefinch> thumper: so the branch is really for HA support, so that's code the machine agent runs to call EnsureMongoServer when it notices it's now supposed to be an EnvironManager
<perrito666> thumper: I had an outdated version of golxc
<thumper> perrito666: ack
<thumper> natefinch: hmm...
<natefinch> thumper: we could probably factor out just the code that calls ensure mongo server during bootstrap, but it would kind of be a pain
<natefinch> thumper: the other option is a temporary hack to change the upstart script to have the command it runs be "mongod"  and then if /usr/lib/juju/mongod is the first mongod in your path, bam, it works.
<thumper> I'm very concerned that we have tied a dependent piece for something I need right now into HA which is still coming
<natefinch> thumper: there's not much actual HA stuff here.  it can land on its own without the rest of HA.
<natefinch> we didn't know we needed it right now until a few days ago, from my understanding, so we didn't worry too much about it as long as it wasn't breaking things.
<thumper> yeah... I didn't know it was breaking things either
<natefinch> Well, so it was breaking things, and then martin made a patch to revert it to the previous behavior, and then from what I hear, some people with a lot of money decided they wanted it to work sooner rather than later.
<natefinch> though this is all third hand, so I may be misunderstanding various bits.
<thumper> natefinch: how would you feel if I created a branch that pulled bits out of your branch and just did the package change thing?
<natefinch> thumper: I'd be happy it was you and not me.
<thumper> haha
<thumper> natefinch: I'm pretty sure I know what I need to do...
<thumper> ish
<natefinch> I think all you really need to do is take the code from agent/bootstrap.go and have it call EnsureMongoServer.  We've tweaked ensure mongo server to do some replicaset stuff, but that should be easy to see and ignore
<thumper> I was thinking a bit simpler than that :-)
<natefinch> what's your plan?
 * thumper is still reading the diff
<natefinch> very little of the diff on that branch is applicable
<natefinch> the only thing that really matters is the call to ensuremongoserver in agent/bootstrap.go and making sure agent/mongo/mongo.go is doing the right thing to figure out the mongo path
<natefinch> come to think of it, that's not really hard at all to pull out
<thumper> so... the key bit is that you are moving the creation of the mongo upstart script to the machine agent?
<natefinch> so, yes and no.  That part is actually only needed for HA.  If you only have one bootstrap node, you can just toss EnsureMongoServer in agent/bootstrap.go
<natefinch> and remove the code that creates the upstart script on the client
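A very rough, hypothetical Go sketch of the single-server shape natefinch describes: call EnsureMongoServer from agent/bootstrap.go instead of writing the upstart script on the client. EnsureMongoServer and the agent/mongo package are named in the discussion, but the signature and surrounding names here are invented for illustration:

    // illustrative only; not juju-core's actual code
    package agent

    import (
        "fmt"

        "launchpad.net/juju-core/agent/mongo"
    )

    func ensureStateServer(dataDir string, statePort int) error {
        // hypothetical signature; the real EnsureMongoServer may differ
        if err := mongo.EnsureMongoServer(dataDir, statePort); err != nil {
            return fmt.Errorf("cannot ensure mongo server: %v", err)
        }
        return nil
    }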
<thumper> well... actually the bare minimum I need to do is to update cloudinit :-)
<thumper> and fix the path of the import
<natefinch> You mean before we send it to the server?  Isn't that going to hit the same problem, where we don't know where mongod will exist on the server?
<thumper> natefinch: oh... now I get it
<thumper> natefinch: as the simplest thing that can work now, how about this:
<thumper> natefinch: we update the mongo upstart script to add the location of the juju-mongodb mongod executable to the path, and don't specify the full path for mongod ?
<thumper> that way if we have juju-mongodb installed, we pick up the right mongodb
<natefinch> thumper: yeah, that was my hack to get it done ASAP in a way that's not awful
<thumper> I think that is the best thing to get what we need *right now*
<thumper> that way I'm not blocked on you
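A hedged sketch of the upstart hack being agreed on here: prepend the juju-mongodb bin directory to PATH and run mongod without a full path, so the right binary wins whenever juju-mongodb is installed. The stanza is illustrative, with flags trimmed from the mongod invocations pasted later in this log:

    # /etc/init/juju-db.conf (sketch, not the real script)
    env PATH=/usr/lib/juju/bin:/usr/sbin:/usr/bin:/sbin:/bin
    exec mongod --auth --dbpath=/home/ubuntu/.juju/local/db --port 37017 --syslog --smallfiles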
 * thumper was surprised to see that the mongo upstart script location had moved in trunk
<thumper> hadn't noticed that move
 * thumper hacks around a bit
<thumper> natefinch: your branch didn't install the juju-mongodb package
<natefinch> thumper: nope
<thumper> ok
<natefinch> ok, I gotta run.  Email me if you need anything, I'll keep an eye on it.
<natefinch> btw, the mongo upstart script moving was part of my half-feature commit that broke things :/
<thumper> ah bollocks
<jamespage> thumper, it does "juju-mongodb | mongodb-server"
<thumper> ah
<thumper> can we change that?
<jamespage> we can
<thumper> jamespage: I'm thinking that for the local provider on trusty, we always use mongodb-server
<thumper> either that, or I need to check
<jamespage> thumper, can't
<jamespage> thumper, mongodb-server is not mir'able
<jamespage> juju-mongodb is
<thumper> sorry, typed the wrong thing...
<jamespage> phew
<thumper> what I meant to say was:
<thumper> on trusty, I want to have the local provider use juju-mongodb
<thumper> so we just have one mongo thing for trusty juju
<thumper> so I don't have to see what you have installed
<jamespage> thumper, so mongodb-server would not be supported
<jamespage> that's fine
<thumper> umm...
<thumper> perhaps as of 1.17.6 we make the local provider use juju-mongodb
<thumper> on trusty
<thumper> does that sound reasonable?
<thumper> jamespage: I'm trying to get a minimal patch in for juju so that it will work with power and not screw up the ha stuff too much
<thumper> effectively I want to do a switch internally that says "if trusty use juju-mongodb, otherwise use mongodb-server"
<thumper> and deal with other operating systems as we get them
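A minimal Go sketch of the internal switch thumper describes; the function name and series list are illustrative rather than juju-core's actual code:

    // mongoPackage picks the mongod package for an Ubuntu series.
    func mongoPackage(series string) string {
        switch series {
        case "precise", "quantal", "raring", "saucy":
            return "mongodb-server" // pre-trusty: archive package
        default:
            return "juju-mongodb" // trusty and later
        }
    }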
<jamespage> thumper, that would apply to all providers right?
<thumper> right
<jamespage> thumper, not just local
<thumper> right
<jamespage> OK - that's fine
<jamespage> thumper, if >= trusty :-)
<thumper> we have juju-mongodb for trusty everywhere right?
<thumper> right
<jamespage> thumper, yes
<thumper> ok... that is my plan then
<jamespage> +1
<jamespage> thumper, I'll line up the juju-mongodb change when 1.17.6 releases
<thumper> awesome...
<thumper> jamespage: also...
<thumper> jamespage: I have a plugin coming 'juju-local' that we should include in the juju-local package
<bodie_> worth noting that trusty installs juju-mongodb to /usr/lib/juju/bin and juju looks for it in /usr/bin
<thumper> but it has nothing in it yet
<thumper> I'll keep you in the loop
<jamespage> thumper, ok
<thumper> bodie_: ack, this will be part of the change
<bodie_> hooray
<bodie_> does that mean I don't have to wipe my workstation and reinstall with 12.04?  *sigh*
<thumper> bodie_: you can if you like
<bodie_> I've just been setting up a remote dev box today in the hopes it'll get resolved soon on trusty
<bodie_> as much as i love reinstalling my pc...
 * thumper dev machine is trusty
<thumper> it is also my day to day machine
<bodie_> same here
<bodie_> I just keep getting one thing after another
<bodie_> moved the mongod binary to /usr/bin, now some other thing is broken
<thumper> don't do that
<thumper> :)
<bits3rpent> Hello all, I am working on the bug stating that juju in debug mode stores passwords of logins etc.
<bits3rpent> I will (most likely) be fixing the issue with a configurable switch
<bits3rpent> i.e juju bootstrap --debug --show-passwords (Default is false)
<jcw4> this bug right?  https://bugs.launchpad.net/juju-core/+bug/1202682
<bits3rpent> Would everyone prefer the whole log be dropped, or just the password parsed out?
<_mup_> Bug #1202682: debug-log doesn't work with lxc provider <cts-cloud-review> <debug-log> <local-provider> <papercut> <ssh> <ui> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1202682>
<bits3rpent> https://bugs.launchpad.net/juju-core/+bug/1173122
<_mup_> Bug #1173122: API server should not log passwords <logging> <security> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1173122>
<jcw4> oops thanks bits3rpent
<bits3rpent> I wanted to keep it as a switch just in case someone wanted it, but do most of you not want any possibility of passwords being logged (via a very specific switch)?
<bits3rpent> jcw4 no problem
<bits3rpent> Also I was wondering if all of you would rather have the whole log output dropped or just the password parsed out.
<thumper> bits3rpent: how do you propose to parse the output?
<bits3rpent> thumper: check for ,"Password":*", and replace with empty string
<bits3rpent> check result of jsoncodec.DumpRequest(hdr,body) to be specific
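A self-contained Go sketch of the redaction bits3rpent describes, scrubbing the "Password" field from the dumped request before it reaches the log; the regexp and the <redacted> placeholder are illustrative, and the eventual fix may differ:

    package main

    import (
        "fmt"
        "regexp"
    )

    // passwordRE matches a JSON "Password" field in a dumped request.
    var passwordRE = regexp.MustCompile(`"Password":"[^"]*"`)

    func scrubPasswords(dump string) string {
        return passwordRE.ReplaceAllString(dump, `"Password":"<redacted>"`)
    }

    func main() {
        fmt.Println(scrubPasswords(`{"Request":"Login","Params":{"Password":"sekrit"}}`))
        // prints: {"Request":"Login","Params":{"Password":"<redacted>"}}
    }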
<thumper> wallyworld_: I have a few errands in town, back later
<wallyworld_> ok
<thumper> wallyworld_: lbox is still working out the diff for my branch
<wallyworld_> of course it is
<thumper> https://code.launchpad.net/~thumper/juju-core/juju-mongodb/+merge/211642
<wallyworld_> will look
<thumper> not tested on power yet, will check that later...
 * thumper thinks
<wallyworld_> hopefully rt will be done soon too
<thumper> might have time now
 * wallyworld_ reboots to apply updates
<thumper> wallyworld_: power still seems to have issues, will chase more later
<wallyworld_> ok, what issues?
<thumper> not starting issues
<wallyworld_> thumper: i think nate is doing some work in the area of EnsureMongoServer etc
<thumper-afk> yes I know
<thumper-afk> I've taken a minimal approach
<thumper-afk> his work wasn't yet installing juju-mongodb
<thumper-afk> and was tied up with HA stuff
<wallyworld_> ok, so no issues landing yours then
<thumper-afk> nope
<davecheney> sinzui: lost you
<davecheney> i think we're done
<davecheney> no need to reconnect
<bodie_> http://paste.ubuntu.com/7117018/ ....
<bodie_> this is on a totally clean 13.10 install... did make install-dependencies and go get -u dependencies.tsv
<bodie_> any input welcome
<davecheney> i'd guess the ssh command exited early
<bodie_> so maybe just a fluke?
<davecheney> bodie_: hope so
#juju-dev 2014-03-19
<davecheney> worker/instancepoller/aggregate.go:67:22: error: reference to undefined identifier "ratelimit.New"  bucket := ratelimit.New(gatherTime, 1)
<davecheney> i have the latest github.com/juju/ratelimit
<bodie_> same test failure on my part again
<bodie_> http://paste.ubuntu.com/7117154/
<bodie_> hrm
<bodie_> broken pipe
<waigani> davecheney: godeps -u dependencies.tsv
<thumper> wallyworld_: the reason we don't use version.readSeries is because it conflates the host series with the series we are going to use for the tools
<wallyworld_> ok
<wallyworld_> is there no way to avoid "some-series"
<thumper> sure, can use "precise"
<thumper> however
<thumper> in order for that method to work correctly, they need to pass in series
<thumper> so it should use that when it happens
<thumper> that function is unused as yet
<wallyworld_> ok
<wallyworld_> just make sure you tell them.....
<thumper> ack
<thumper> davecheney: ping
<thumper> davecheney: I'm getting the same failure for juju-mongodb that I was getting for mongodb-server on power
<thumper> it seems to think there is no journal present
<thumper> is mongod changing the user from root?
<thumper> davecheney: you had the local provider working, did you do anything special for mongod?
<thumper> davecheney: I want to check in prior to filing a bug
<thumper> o/ axw
<axw> hey thumper
<davecheney> thumper: all I did was the symlink
<davecheney> to be fair
<davecheney> tests don't pass on ppc64el
<thumper> davecheney: I get it failing to even start
<thumper> davecheney: complaining about no journal
 * thumper does an update
<thumper> lets see if that changes anything
<thumper> nope
<davecheney> whaaaaaaaaaaaa ?
<davecheney> when you deploy the local machine
<davecheney> sorry, bootstrap local env
<thumper> umm... not long ago
<davecheney> worker/instancepoller/aggregate.go:67:22: error: reference to undefined identifier "ratelimit.New"  bucket := ratelimit.New(gatherTime, 1)
<thumper> I grabbed my branch
<davecheney> ^ any idea how to fix this one
<thumper> rebuilt
<davecheney> then I can try to have a look
<thumper> run godeps?
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core$ go get -u -v github.com/juju/ratelimit
<davecheney> github.com/juju/ratelimit (download)
<davecheney> github.com/juju/ratelimit
<davecheney> what ?
<davecheney> head isn't buildable
<davecheney> where is the source for godeps ?
<thumper> launchpad.net/godeps
<davecheney> sorry
<davecheney> dumb question
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core$ godeps -u github.com/juju/ratelimit
<davecheney> godeps: cannot parse "github.com/juju/ratelimit": open github.com/juju/ratelimit: no such file or directory
<davecheney> oh for fuxcks sake
<davecheney> why the fuck doesn't it default to something sane
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core$ pstree -p | grep -C2 mongo
<davecheney> |              |-{jujud}(31735)
<davecheney> |              `-{jujud}(31742)
<davecheney> |-mongod(31700)-+-{mongod}(31705)
<davecheney> |               |-{mongod}(31706)
<davecheney> |               |-{mongod}(31707)
<davecheney> |               |-{mongod}(31708)
<davecheney> urgh
<davecheney> thumper: bzr up'd
<davecheney> and apt-get updated
<davecheney> still works for me
<davecheney> thumper: have you removed the mongodb-server package ?
<thumper> davecheney: no
<davecheney> that'll be the problem
<thumper> why?
<davecheney> first in the path maybe ?
<thumper> nah, full path specified
<davecheney> yes, but /usr/bin/mongod => mongodb-server <-- borken!
<davecheney> thumper: hang on, let me test that scenario
<thumper> davecheney: my upstart script specifies /usr/lib/juju/bin/mongod
<davecheney> thumper: i'm still trying to reproduce the failure
 * thumper is reading mongo source files ...
<davecheney> that'll only make it worse
<hazmat> davecheney, my eyes are bleeding already
<hazmat> davecheney, the assert seems to be from here.. https://github.com/mongodb/mongo/blob/v2.6/src/mongo/util/logfile.cpp#L241
<davecheney> thumper: i've just bootstrapped
<davecheney> root     32316  2.1  0.5 445460 43088 ?        Ssl  01:19   0:00 /usr/bin/mongod --auth --dbpath=/home/ubuntu/.juju/local/db --sslOnNormalPorts --sslPEMKeyFile /home/ubuntu/.juju/local/server.pem --sslPEMKeyPassword xxxxxxx --bind_ip 0.0.0.0 --port 37017 --noprealloc --syslog --smallfiles
<davecheney> worked fine
<hazmat> hmm
<hazmat> davecheney, are you on the same emulation thingy thumper is on?
<thumper> davecheney: so what is different for you and me?
<thumper> this seems to be an os alignment bug
<bodie_> I could really use a second pair of brain cells on this
<davecheney> ii  juju-mongodb                2.4.9-0ubuntu2     ppc64el            MongoDB object/document-oriented database for Juju
<davecheney> ii  mongodb-server              1:2.4.9-1ubuntu1   ppc64el            object/document-oriented database (server package)
<thumper> hazmat: davecheney is on a different machine
<davecheney> bodie_: sure, mine are lonely
<bodie_> it's in TestWriteFailure in environs/sshstorage
<bodie_> http://paste.ubuntu.com/7117154/
<bodie_> no idea what's supposed to happen, or what's going wrong, except the broken pipe bit
<thumper> davecheney: can I get you to try my branch?
<bodie_> replicable
<bodie_> it's on a remote
<davecheney> thumper: sure
<thumper> lp:~thumper/juju-core/juju-mongodb
<davecheney> bodie_: you might have to turn up the debug on the remote sshd
<davecheney> bodie_: my guess is /bin/ssh is
<thumper> davecheney: build it, and make sure that juju-mongodb is installed
<davecheney> a. crashing
<davecheney> b. being hung up on
<thumper> bootstrap a simple local provider
<thumper> davecheney: and it should start with no tweaking at all
<davecheney> OT: hazmat where can I get a trusty install image that is < 2.5 Gb
<thumper> davecheney: in theory
<bodie_> could it be caused by disabling root login?
<davecheney> bodie_: absolutely
<bodie_> hm
<davecheney> check /var/log/secure.log
<davecheney> or audit.log
<davecheney> or wherever the AUTH syslog target points to
<davecheney> bodie_: the bug is that that test is discarding the output of stderr
<hazmat> davecheney, hmm.. i just use cloud-images most of the time.. they're minimal and small.. but you mean like a boot installer.. if you need super small and have the bandwidth.. there's always net install
<davecheney> hazmat: i just want to download it and try it
<davecheney> do I really need a 2.5 Gb iso ?
<bodie_> hrm ? ... I'm redirecting stderr into the test output..  so you're saying it didn't go in because it was being discarded in the first place?
<hazmat> davecheney, http://cdimage.ubuntu.com/daily-live/current/ look like 950mb
<davecheney> hazmat: ta, that'll do nicely
<hazmat> davecheney, kvm with cloud image.. gets you down to even smaller
<davecheney> bodie_: by default stderr and stdout go to dev null unless otherwise requested
<hazmat> davecheney,  240mb http://cloud-images.ubuntu.com/trusty/current/
<davecheney> hazmat: ta muchly
<hazmat> davecheney, bottom of that page for the raw downloads
<davecheney> much better solution
<bodie_> I called it like so: go test ./... > output.txt 2>&1 &
<hazmat> davecheney, also.. virtualbox images here.. http://cloud-images.ubuntu.com/vagrant/trusty/current/
<davecheney> bodie_: this isn't something you've done
<bodie_> ok
<davecheney> this is a bug inside the test code
<davecheney> hazmat: a bounty of resources
<davecheney> hazmat: that error
<davecheney> da fuq
<wallyworld_> thumper: pingy
<davecheney> that isn't a journal problem
<davecheney> this is something to do with the 64mb pages option
<thumper> coming
<davecheney> thumper: this may be the problem
<davecheney> Linux winton-02 3.13.0-8-generic #28-Ubuntu SMP Mon Feb 17 08:22:39 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
<davecheney> thumper: what you got ?
<davecheney> i reckon you've got a different kernel
<thumper> uname -a?
<davecheney> thumper: yup
<thumper> Linux rockne-02 3.13.0-15-generic #35-Ubuntu SMP Mon Mar 3 15:56:08 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
<davecheney> hmm
<davecheney> ok, that is another angle to try
<waigani_> thumper: hangouts is not loading for me :(
<waigani_> wallyworld_: you lot still in hangout? I can't even load https://plus.google.com/
<wallyworld_> waigani_: yeah, we're here
<waigani_> wtf?
<waigani_> :(
<wallyworld_> talk about you :-P
<davecheney> waigani_: thumper is using the internet
<davecheney> you'll have to wait your turn
<wallyworld_> lol
<waigani_> lol - bastards
<thumper> davecheney: can you run 'getconf PAGESIZE'
<thumper> davecheney: I get 65536
<davecheney> ubuntu@winton-02:~$ getconf PAGESIZE
<davecheney> 4096
<davecheney> bingo
<wallyworld_> i get 4096 also
<thumper> arse
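The same check is available from Go if that is handier while debugging: os.Getpagesize reports the kernel page size that mongod's journal-alignment assert appears to trip over:

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        fmt.Println(os.Getpagesize()) // 65536 on thumper's VM, 4096 on davecheney's
    }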
<davecheney> thumper: https://docs.google.com/a/canonical.com/spreadsheet/ccc?key=0Aje1qBKEwuLVdDIxdG9NeWF2REJjemdLbllzbGotamc&usp=drive_web#gid=0
<wallyworld_> thumper: patch getconf :-)
<davecheney> line 10
<thumper> wallyworld_: on rockne?
<wallyworld_> yeah
<davecheney> thumper: wallyworld_ the 4k/64k thing is provided by the kvm emulation
<davecheney> it's totally possible that different vm's get different views of the underlying hardware
<wallyworld_> but why does a log function care?
<wallyworld_> we could just patch it to see if it works as an experiment
<davecheney> wallyworld_: simpler/safer just to apt-get update a new kernel image
<wallyworld_> davecheney: what does your uname -r say?
<davecheney> 3.13.0-8-generic
<davecheney> i'm a week or two behind
<wallyworld_> i wonder what thumper's is?
<davecheney> wallyworld_: check scrollback
<wallyworld_> ah. -15
<davecheney> wallyworld_: i think that assert is checking that the log file is aligned to the os page size
<davecheney> probably because they use mmap a lot
<davecheney> ^ just a guess
<davecheney> but probably not totally wrong
<wallyworld_> makes sense i guess
<davecheney> mongo does a lot of dumb shit for performance without proof that the simple way was a problem to begin with
<wallyworld_> seems like they could do better though
<wallyworld_> s/mongo does a lot of dumb shit for performance without proof that the simple way was a problem to begin with/mongo does a lot of dumb shit
<davecheney> fassert( 16142, _fd >= 0 );
<davecheney> what the bollocks is this ?
<davecheney> every assert has a unique number
<davecheney> but this isn't a constant ?
<thumper> davecheney: it'll be a number they can grep the source for :-)
<wallyworld_> fassert(42, 'the meaning of life')
 * thumper takes a stab in the dark
<davecheney> thumper: i hope youre joking
<wallyworld_> it's mongo, of course not
 * thumper guesses not
<davecheney> fassert( 2 , 2 != 2 ) // totally unique, bro
<waigani_> chrome does not load plus.google.com anymore, yet firefox does ?!? You'd think, if anything, it would be the other way around
<thumper> davecheney: can you try my branch?
<thumper> I was told that there was a second VM for me, but it looks like not...
<thumper> permission denied (publickey)
<thumper> wallyworld_: added cloudinit test for trusty state server
<wallyworld_> thanks :-)
<thumper> np
<davecheney> wallyworld_: bug: 2014-03-19 03:04:00 DEBUG juju.environs.simplestreams simplestreams.go:995 metadata: &{map[com.ubuntu.juju:14.04:ppc64:{ 1.17.6.1 ppc64   map[20140319:0xc21042bde0]} com.ubuntu.juju:12.04:ppc64:{ 1.17.6.1 ppc64   map[20140319:0xc21042b720]}] map[] Wed, 19 Mar 2014 03:03:55 +0000 products:1.0 com.ubuntu.juju:released:tools}
<davecheney> thumper: root     11980  0.7  0.4 1076640 37628 ?       Ssl  03:04   0:00 /usr/lib/juju/bin/mongod --auth --dbpath=/home/ubuntu/.juju/local/db --sslOnNormalPorts --sslPEMKeyFile /home/ubuntu/.juju/local/server.pem --sslPEMKeyPassword xxxxxxx --bind_ip 0.0.0.0 --port 37017 --noprealloc --syslog --smallfiles
<davecheney> branch looks fine, LGTM
<davecheney> urgh, bugs in the server package
<davecheney> Removing mongodb-server (1:2.4.9-1ubuntu1) ...
<davecheney> arg: remove
<davecheney> mongodb stop/waiting
<wallyworld_> davecheney: me not understand?
<davecheney> the middle line
<davecheney> some output that shouldn't leak to the user
<wallyworld_> that's debug
 * thumper goes to drink coffee
<thumper> davecheney: thanks for testing
<davecheney> thumper: looking good
<davecheney> sorry your vm has brain issues
<davecheney> https://bugs.launchpad.net/ubuntu/+source/mongodb/+bug/1294455
<_mup_> Bug #1294455: mongodb-server leaks debug information during remove <mongodb (Ubuntu):New> <https://launchpad.net/bugs/1294455>
<davecheney> not important
<bodie_> so I tried upping the log level on sshd to see about that broken pipe, and it caused a bunch of other test failures for some reason.
<bodie_> I guess my next step is to try a 12.04 vm, and if that doesn't work, i don't know
<davecheney> bodie_: you said you disabled root login
<davecheney> that is probably the root cause
<bodie_> yeah, still breaks with root login enabled
<davecheney> ok
<davecheney> bodie_: are you canonical
<davecheney> ?
<bodie_> contracting w/ canonical, so, sorta
<bodie_> hai o/
<davecheney> right, you have my sympathies
<wallyworld_> davecheney: as soon as outgoing ssh and access to port 10170 is sorted out, you can pull trunk and should be able to bootstrap ec2, hp cloud etc environments
<davecheney> i have no idea how you have managed to get yourself into this knot
<bodie_> i suspect it's a reality tv show
<davecheney> i recommend a 12.04 vm
<davecheney> that is what the landing bot uses
<davecheney> wallyworld_: OH CRAP, i forgot about the api port
<davecheney> this is a nightmare
<wallyworld_> davecheney: i asked for the api port in the rt
<bodie_> hey man, I'm getting paid to write Go and work from home.  I'LL.  TAKE.  IT.
<davecheney> wallyworld_: could I try it now
<bodie_> I just really want to get rolling.  the testing stuff is driving me #go-nuts.
<wallyworld_> davecheney: you could, but it won't bootstrap cause it will get stuck on ssh
<wallyworld_> unless your vm is special
<davecheney> wallyworld_: could I bootstrap from my amd64 machine to a precise/386 image, or from an armhf machine to an amd64 precise image
<davecheney> that is effectively the same thing
<wallyworld_> yes
<wallyworld_> yes you could
<davecheney> good, i'll test that while I wait
<wallyworld_> so long as the tools are there
<wallyworld_> or, so long as they are packaged and available locally
<davecheney> wallyworld_: i'll choose a version that I know tools exists for
<davecheney> bodie_: same, https://twitter.com/davecheney/status/446075333360361472
<wallyworld_> davecheney:  if you have the tarballs, you can use the --metadata-source bootstrap arg and they will be uploaded
<wallyworld_> just fyi
<davecheney> wallyworld_: whoa there big fella, i'm not that keen
<axw> davecheney: I came to the realisation the other day that I basically never wear shoes anymore
<axw> my feet never felt so free
<axw> thumper: https://codereview.appspot.com/77340046 when you're not too busy
<axw> I'd kinda like to get it in for 1.17.x so we don't have backwards compat nightmares
 * axw goes to pick up his daughter
<thumper> kk
 * thumper will look now
<davecheney> wallyworld_: I don't think we should/need to ask for a firewall hole for 17017
<davecheney> that must be proxyable
<davecheney> and we should proxy it
<davecheney> it's http after all
<wallyworld_> hmm, yeah
<davecheney> so the proxy will have to allow
<davecheney> CONNECT machine-0:17017 HTTP/1.1
<wallyworld_> 17070 i think you mean
<wallyworld_> i'll update the rt
<davecheney> yes
<davecheney> i don't know what i'm talking about
<davecheney> just kind of wildly stabbing at the keys
<thumper> davecheney: so, with my branch and your VM, did the local provider start and work with no tweaks?
<davecheney> yes
<thumper> \o/
<thumper> davecheney: I've made your tweak suggestion
<davecheney> hang on, lemmie get some proof
<thumper> davecheney: I was kinda relying on us refactoring that code in the next six months due to other OS support
<thumper> but was a good call
<thumper> so I changed it
<thumper> o/ vladk
<vladk> thumper: hi
<davecheney> O_o! http://paste.ubuntu.com/7117795/
<davecheney> thumper: http://paste.ubuntu.com/7117798/
<davecheney> winning!
<thumper> yeah baby, yeah!
<thumper> davecheney: make sure you write that one up for the end of day email!
<thumper> wallyworld_: so for power -> ec2 we are just waiting firewall poking?
<wallyworld_> thumper: believe so. of course my code will work first time, guarantee it
<thumper> coolio
<wallyworld_> thumper: before we ship 1.18, i want to get my current branch in if possible, which is proper detection of supported arches on each provider, based on what images are available via simplestreams
<thumper> wallyworld_: line it up
<wallyworld_> thumper: still wip
<davecheney> wallyworld_: root     11980  0.1  0.4 1076640 41212 ?       Ssl  03:04   0:05 /usr/lib/juju/bin/mongod --auth --dbpath=/home/ubuntu/.juju/local/db --sslOnNormalPorts --sslPEMKeyFile /home/ubuntu/.juju/local/server.pem --sslPEMKeyPassword xxxxxxx --bind_ip 0.0.0.0 --port 37017 --noprealloc --syslog --smallfiles
<davecheney> out of the box
<davecheney> noice
<wallyworld_> davecheney: you mean the password being logged?
<davecheney> wallyworld_: eh ?
<wallyworld_> davecheney: sorry, i didn;t grok your paste
<davecheney> mongo obscures the passwd on the cmdline itself
<davecheney> i didn't do that
<davecheney> marcoceppi: what would it take to promulgate a trusty/mysql charm ?
<davecheney> thumper: does debug-log work for the local provider ?
<thumper> davecheney: not until it is handled over the api
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core$ juju debug-log
<davecheney> Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
<davecheney> Permission denied (publickey).
<davecheney> ERROR rc: 255
<davecheney> ok
<thumper> davecheney: there is an all-machines.log
<davecheney> yeah, i just forgot
<thumper> np
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core$ tail -f ~/.ju
<davecheney> -bash: cannot create temp file for here-document: No space left on device
<davecheney> ^C
<davecheney> uh oh
<davecheney> something has gone batshit
<axw> thumper: thanks, yeah I tested with local and canonistack - just about to test with azure
<davecheney> ok cock
<davecheney> root@winton-02:~# du -sh /var/lib/mongodb/*
<davecheney> 3.1G    /var/lib/mongodb/journal
<davecheney> 64M     /var/lib/mongodb/local.0
<davecheney> 17M     /var/lib/mongodb/local.ns
<davecheney> 0       /var/lib/mongodb/mongod.lock
<davecheney> 4.0K    /var/lib/mongodb/_tmp
<davecheney> mongo has gone batshit
<davecheney> oh hang on
<davecheney> this is the system mongo db
<davecheney> not ours
<davecheney> ok
<davecheney> i'll just delete that
<thumper> haha
<davecheney> it probably is still batshit insane
<davecheney> lemmie raise a bug for that
 * thumper tries to land that branch again
<davecheney> If you're happy and you know it, land your branch ♬
<davecheney> root@winton-02:~# du -sh /var/lib/mongodb
<davecheney> 8.0K    /var/lib/mongodb
<davecheney> great, now I can't trigger the problem
<davecheney> asshole
<thumper> davecheney: branch landed
<dimitern> morning all
<dimitern> mgz, jam, relatively quick review ? https://codereview.appspot.com/77340044/
<rogpeppe> mornin' all
<rogpeppe> dimitern: 1000 line diff is "relatively quick"? :-)
<axw> morning
<axw> relative to a 2000 line diff it is ;)
<dimitern> :)
<dimitern> rogpeppe, most of that are tests
<rogpeppe> dimitern: so we're not supposed to read tests as closely then? :-)
<dimitern> :D
<rogpeppe> axw: hiya
<dimitern> i'll remember that one
<rogpeppe> dimitern: i confess that i probably *don't* read tests as closely as production code.
<dimitern> rogpeppe, well then ;) give it a try ? https://codereview.appspot.com/77340044/
<rogpeppe> dimitern: currently reviewing https://codereview.appspot.com/77500045
<dimitern> rogpeppe, sure, np
<dimitern> fwereade, so it turned out there *will be* a sprint in Malta after all.. just not for us ;)
<dimitern> last week of may for the apps & client guys
<axw> I'd rather go to Malta than Las Vegas... hmm that is quite the first world problem
<fwereade> dimitern, ha, yeah
<rogpeppe> lol
<davecheney> y'all are writing the wrong Go application
<fwereade> dimitern, I will continue to quietly agitate for malta sprints
<rogpeppe> axw: do we now have a valid environment configuration in the state from the word go?
<axw> rogpeppe: yup
<rogpeppe> axw: cool. we can probably simplify various pieces of agent logic because of that
<axw> rogpeppe: yeah, there are definitely some things I haven't gotten around to simplifying. e.g. workers waiting for valid config
<rogpeppe> axw: that's what i was thinking of
<axw> rogpeppe: it is still possible to get an invalid config *after* bootstrap, but that's very close to being fixed
<axw> rogpeppe: there's a small amount of work required to ensure config changes in state are transactionally validated&updated
<vladk> morning all
<axw> morning vladk
<rogpeppe> axw: ah yes, because if you've got two set-environments concurrently, the combination might not be valid, i guess
<rogpeppe> vladk: hiya
<axw> rogpeppe: yup. waigani has been working on fixing it. there's a bit more to go, but it's still racy for the moment
<dimitern> mgz, ping
<dimitern> axw, fwereade, rogpeppe, jam, i'd really appreciate a review so i can land https://codereview.appspot.com/77340044/
<rogpeppe> dimitern: sorry, still on that other review
<fwereade> dimitern, I'll take a look
<axw> about to go afk, will take a look later if it's not reviewed
<dimitern> fwereade, ta!
<rogpeppe> fwereade, jam: do you think we need to elide passwords from service-set configuration attributes in the log?
<fwereade> dimitern, reviewed, let me know what you think
<dimitern> fwereade, cheers, looking
<dimitern> fwereade, i'm not quite following you on that one https://codereview.appspot.com/77340044/diff/1/agent/agent.go#newcode135
<dimitern> fwereade, how do you suggest to improve the doc comment?
<fwereade> dimitern, so, we've got a global mutex for config changes
<dimitern> fwereade, I agree it's not perfect, but it's a step in the right direction
<dimitern> fwereade, yeah
<fwereade> dimitern, right, yeah, as I say, progress not perfection
<fwereade> dimitern, just add a note to configMutex explaining the situation if there isn't already one
<dimitern> fwereade, that it's used to guard against modifying a created config?
<fwereade> dimitern, that it sort of implies a singleton config, but all we really have is an accidental-singleton config
<dimitern> fwereade, ok, got it
<jam1> rogpeppe: can I get a follow up review on: https://codereview.appspot.com/76120044/ I simplified it
<rogpeppe> jam1: looking
<rogpeppe> jam1: what was the problem with my suggested (much simpler still) approach?
<jam1> rogpeppe: the only difference is I use package vars (that I can change in a threadsafe manner), and do some logging.
<rogpeppe> jam1:  utils.OverrideGOMAXPROCSFuncs seems ugly to me
<rogpeppe> jam1: and it doesn't seem necessary
<rogpeppe> jam1: i'd prefer to keep mocking inside package boundaries when reasonable to do so
<jam1> rogpeppe: so the alternative is to make them public variables, which doesn't seem great, vs a public function to set them.
<jam1> rogpeppe: how do you suggest telling if GOMAXPROCS was called? by calling it once to force it to a given value, and then calling it again to observe that the runtime has changed?
<rogpeppe> jam1: we could use the approach already used inside cmd/jujud: var useMultipleCPUs = utils.UseMultipleCPUs
<jam1> (runtime.GOMAXPROCS(1); run the agent; finalN := runtime.GOMAXPROCS(0); c.Assert(finalN, gc.Equals, 2) ) ?
<rogpeppe> jam1: (see newDeployContext for an example)
<jam1> fairy nuff
<jam1> we lose a bit of the end-to-end nature of the test, but we do test the two layers
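A hedged Go sketch of the in-package mocking pattern rogpeppe points at (the var useMultipleCPUs = utils.UseMultipleCPUs style): keep the dependency in a package variable so tests can swap it out instead of exporting mocking hooks from utils. Names are illustrative, and a stdlib test is shown for brevity where juju-core itself uses gocheck:

    package agent

    import "runtime"

    // useMultipleCPUs is a package variable precisely so tests can replace it.
    var useMultipleCPUs = func() {
        runtime.GOMAXPROCS(runtime.NumCPU())
    }

    func runAgent() {
        useMultipleCPUs()
        // ... start the agent proper
    }

and in the corresponding test file:

    // in agent_test.go (imports "testing")
    func TestRunAgentUsesMultipleCPUs(t *testing.T) {
        called := false
        orig := useMultipleCPUs
        defer func() { useMultipleCPUs = orig }()
        useMultipleCPUs = func() { called = true }
        runAgent()
        if !called {
            t.Fatal("useMultipleCPUs was not called")
        }
    }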
<axw> fwereade: I reworked the DistributionInstances CL. I'm now looking at reworking (replacing) the UnitAssigner one. Based on our discussion last night, I'm going to add a method, SupportsUnitPlacement, to the new EnvironCapability interface that wallyworld introduced
<dimitern> fwereade, re https://codereview.appspot.com/77340044/diff/1/agent/agent.go#newcode176
<jam1> rogpeppe: TBH I don't really give a ****, I just want GOMAXPROCS getting called.
<dimitern> fwereade, the _DELETE_ key is used to specify values to delete
<axw> fwereade: then I'll move EnvironCapability into state, and check it during AddMachine/Unit.AssignTo*
<fwereade> axw, I'm going to pre-send a single comment on one of your reviews, please read it and ping me :)
<rogpeppe> jam1: yeah, sorry. i just didn't like the complexity i saw there.
<axw> fwereade: sure
<axw> NOT LGTM? ;)
<rogpeppe> jam1: particularly as it cluttered the public interface of utils
<fwereade> axw, ;p
<fwereade> dimitern, yeah, and it makes me sad
<rogpeppe> jam1: FWIW your test wasn't properly end to end either
<fwereade> dimitern, it feels like we're kind of abusing that Params struct because it has some fields that overlap with what we really want
<dimitern> fwereade, initially I tried having "" to signal delete, but it's not quite enough if you just need to add/set an empty field
<fwereade> dimitern, it seems like it'd be clearer and less vulnerable to hilarious screwups to use a different struct for the migrate func
<dimitern> fwereade, ah, you're saying we need a different struct than AgentConfigParams ?
<fwereade> dimitern, eg changing tag/nonce is likely to be fatal
<axw> fwereade: why would StateServer not be true for new state servers? are you thinking about repurposed machine agents?
<dimitern> fwereade, *any* reckless change is likely to be fatal :)
<fwereade> dimitern, granted, but I don't see much reason to allow changes that basically can't ever make sense
<fwereade> axw, the trouble is that MachineConfig.StateServer really means MachineConfig.Bootstrap IIRC
<fwereade> axw, maybe everything already ignores StateServer if passed into StartInstance?
<axw> fwereade: it's only used in environs/cloudinit
<dimitern> fwereade, ok, what then, to summarize? change the params struct and remove some of the things you can change? or just change the struct
<fwereade> axw, if so that's sort of ok but still very vulnerable to accidental screwups
<fwereade> dimitern, wait, was that struct new in that CL? I thought it was repurposed from creation-time
<fwereade> dimitern, I think we want a separate struct
<dimitern> fwereade, AgentConfigParams was chosen because it fits nicely with the NewAgentConfig() taking params and constructing one
<fwereade> dimitern, but maybe I missed something?
<fwereade> dimitern, yeah, my contention is that they're different jobs and want different params
<dimitern> fwereade, yeah, but if we're going to have mostly the same fields?
<fwereade> dimitern, stuff like "" meaning "leave alone" in one context, and "set to empty" in another, etc
<dimitern> fwereade, ok, I'll look into it and repropose
<fwereade> dimitern, how many fields are you actually changing though?
<dimitern> fwereade, potentially DataDir, LogDir, Jobs and Values
<fwereade> dimitern, I'd really prefer a struct with those fields (and a separate DeleteValues field, rather than a magic key)
<dimitern> fwereade, changing DataDir is as dangerous as changing Tag or Nonce, but it's required for this migration
<dimitern> fwereade, will do
<fwereade> dimitern, understood, this is a sharp knife we're building
<axw> fwereade: I was thinking I could change the DistributionInstances code to return the instances for state server machines. You still need to know whether it's going to be a state server at provisioning time (I had planned to use MachineConfig.StateServer to decide whether to do this)
<fwereade> axw, I think we will know that, but we don't currently express it well
<axw> ... because instances can't be moved between Cloud Services later
<fwereade> rogpeppe, can I get your input on this discussion with axw? see comments in https://codereview.appspot.com/73910043/ and discussion here
<axw> fwereade: alternatively, base it on the jobs in state for the machine being provisioned
<rogpeppe> fwereade: looking
<fwereade> axw, yeah, ISTM that the DistributionInstances path is the right one
<axw> fwereade: if MachineConfig.StateServer really means "BootstrapMachine" then I think that's the way to go
<fwereade> axw, I think it pretty much does *but* you have done the most stuff to bootstrap lately, and I have only *read* that code rather than *actively worked with it* so your instincts may well be more finely tuned
<axw> fwereade: that CL is going to have to change a bit anyway, because labels won't be useful anymore
<fwereade> axw, if you can say for sure I'm wrong there then I will gladly defer to your superior perspective :)
<axw> fwereade: it is now, but that's only because HA isn't merged yet :)
<fwereade> axw, yeah, indeed :)
<axw> fwereade: that code was all kinda circumspect
<axw> I figured it'd have to change a bit
<fwereade> axw, there's a whole load of stuff that sync-bootstrap has rendered irrelevant-shading-to-crazy, and it'd be lovely to get time to rationalise all that one day
<axw> indeed
<fwereade> axw, but while it's not actively hurting us it remains a copious-free-time deal ;)
<axw> fwereade: does the EnvironCapability thing sound reasonable?
<axw> I think it's basically what you were saying last night
<fwereade> axw, that name sounds good, but I sorry, which CL?
<axw> fwereade: none, I was referring to a message before
<axw> <axw> fwereade: I reworked the DistributionInstances CL. I'm now looking at reworking (replacing) the UnitAssigner one. Based on our discussion last night, I'm going to add a method, SupportsUnitPlacement, to the new EnvironCapability interface that wallyworld introduced
<axw> <axw> fwereade: then I'll move EnvironCapability into state, and check it during AddMachine/Unit.AssignTo*
 * fwereade hunts frantically for the EnvironCapability code
<axw> fwereade: environs/interface.go
<axw> fwereade: landed yesterday I think
 * fwereade sees it there and wonders how his grep failed
<axw> fwereade: we could also use this for enabling/disabling the firewaller job, for example
<fwereade> axw, yeah, that sounds sane to me
<fwereade> axw, thanks
<axw> goodo
<axw> thanks
<axw> fwereade: out of interest, why would/should StateServer not affect cloudinit? won't that be the place for installing juju-mongodb/mongodb-server still?
<fwereade> axw, it seemed rather better to let the new state servers know all their special info over the api connection rather than jamming important info into cloudinit (where other things can read it)
<axw> I think that might be the only thing to do though, the rest requires a secure connection - so probably an InstallMongo flag would be more appropriate
<fwereade> axw, also, it'd be nice to be able to promote a machine to a state server at some date in the future
<axw> yeah, makes sense
<fwereade> axw, even if we don't do so today
<axw> fwereade: sure, except for on azure of course :)
<fwereade> axw, quite so :)
<fwereade> axw, goddamn azure :)
<axw> heh
<fwereade> axw, anyway I think the DistributionInstances thing is good *anyway*
<axw> the new manual provider
<fwereade> axw, because we'll want to distribute state servers where possible everywhere, right?
<axw> yeah I think so too, I was just curious
<axw> fwereade: yes it'll be useful for ec2
<perrito666> good morning everyone
<axw> fwereade: bbl, gotta take care of my little one. thanks for the chat.
<fwereade> axw, cheers
<jam1> rogpeppe: https://codereview.appspot.com/76120044 updated
<jpds> jam1: Hey.
<jam1> hi jpds
<jpds> What happened to "juju image-metadata" ?
<jam1> jpds: it is 'juju metadata generate-image'
<jam1> or 'juju metadata validate-image'
<jpds> And how do I get index and imagemetadata.json into my S3 stuff?
<jam1> jpds: in 1.16 or in 1.17+ ?
<jam1> We've been polishing it a bit in 1.17
<jpds> jam1: 1.16.
<jam1> :(
<jpds> jam1: OK, so I have the stuff; where am I suppose to put it?
<jam1> jpds: you haven't bootstrapped yet, correct?
<jpds> jam1: Trying to.
<jam1> (in 1.17 you can juju bootstrap --metadata-source /path/on/local/disk) but that won't help you for 1.16
<jpds> Excellent.
<jam1> I'm digging out 1.16 to see if I have answers for you
<jam1> jpds: as a rough approximation, you need to put it somewhere preserving the directory structure (streams/v1/index.json) and then point "image-metadata-url=PATH_TO_ROOT"
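An illustrative sketch of the layout jam describes: keep the generated streams/v1 structure intact and point image-metadata-url at its root. The second file name is what generate-image produced at the time, to the best of my recollection:

    $ROOT/streams/v1/index.json
    $ROOT/streams/v1/com.ubuntu.cloud:released:imagemetadata.json

    # environments.yaml (illustrative URL):
    #   image-metadata-url: http://example.com/metadata/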
<jpds> I'm just going to upgrade to 1.17.
<jam1> jpds: works for me :)
<voidspace> grabbing a cuppa before standup
<jpds> jam1: It's still complaining that the index file has no data for the cloud.
<jam1> jpds: do you have the backtrace of commands that you've run and their output?
<jpds> jam1: Yes.
<jam1> You need to make sure that your "generate-image" command has the settings that match what you are bootstrapping.
<natefinch> standup
<jam1> natefinch: thanks for the reminder, switching machines
<jam1> brt
<rogpeppe> jam1: reviewed
<jam> mgz: poke
<dimitern> fwereade, updated https://codereview.appspot.com/77340044/
<dimitern> fwereade, should be ready to land now i think
<perrito666> so, to what degree of poking am I allowed to go to obtain a review? :p
 * perrito666 starts mailing plastic fingers
<jam> wallyworld: if you're around, any ideas why "juju bootstrap" would be trying to look for "com.ubuntu.cloud:server:12.04:i386" ?
<jam> Is this a case of "if local Arch == i386, boot a machine with i386" ?
<jam> This is with released 1.17.5
<jam> so not trunk.
<wallyworld> um
<jam> wallyworld: this is custom image metadata, where the target arch should only contain amd64
<jam> but the image metadata lookup fails with:
<jam> index file has no data for product name(s) ["com.ubuntu.cloud:server:12.04:i386"]
<wallyworld> jam: is the client machine precise i386?
<jam> wallyworld: I think we should add a line at startup of "juju" processes which logs version.Current
<wallyworld> that would be helpful
<wwitzel3> natefinch: http://paste.ubuntu.com/7119308/
<fwereade> dimitern, LGTM
<fwereade> perrito666, hassle everybody mercilessly
<dimitern> fwereade, tyvm
<dimitern> fwereade, and I did test it live again :)
<axw> rogpeppe: thanks for the review. you're right about AS not being changeable, but I'm going to change the use of StateServer based on discussion with fwereade anyway
<axw> rogpeppe: it'll unify some of the logic between state/non-state grouping, and it'll also be useful for ec2 AZs later
<axw> (where we need to have a set of instances so we can manually distribute)
<perrito666> fwereade: what reckless advice :p
<rogpeppe> axw: sgtm
<wwitzel3> rogpeppe: in the cmd/jujud/bootstrap_test.go , there are some tests to initBootstrapCommand, which calls InitializeState. That, after our changes from yesterday calls environs.New(envCfg). It works fine when running, but the tests are now failing with an "environment is not prepared" error from environs.New
<wwitzel3> rogpeppe: any thoughts on either refactoring the code to make those tests not fail OR refactoring the tests so that the environment is prepared by the time environs.New is called?
<rogpeppe> wwitzel3: i think the tests *should* be passing config from a prepared environment
<rogpeppe> wwitzel3: if they're not, they need to be fixed
<wwitzel3> rogpeppe: ok, so the testConfig getting passed in to initBootstrapCommand ?
<rogpeppe> wwitzel3: probably
<wwitzel3> rogpeppe: or I guess, how would I ensure that the config being passed in is from a prepared environment? Is there an example of that in a test somewhere?
<wwitzel3> rogpeppe: testConfig right now is just a dummy.SampleConfig that gets run through the b64yaml helper.
<rogpeppe> mgz: ping
<rogpeppe> wwitzel3: you can get the prepared environment config from the dummy environment that is started up in most tests
<rogpeppe> fwereade, jam, mgz: could we talk about API addresses for a moment or two?
<fwereade> rogpeppe, got a call in 7 mins, but if it's quick, sure
<rogpeppe> fwereade: https://plus.google.com/hangouts/_/canonical.com/juju-core ?
<rogpeppe> fwereade: might be quickest
<mgz> rogpeppe: I'll be there
<rogpeppe> mgz: thanks, that's very useful
<mgz> rogpeppe: that sounded good to me,
<mgz> and hangouts hate me
<mgz> so, carry on :)
<wwitzel3> natefinch: I pushed tests for the initiateReplicaSet and some other minor fixes to my branch
<wwitzel3> natefinch: I am still trying to figure out this environment not prepared stuff
<natefinch> wwitzel3: ok, I just got back, I'll merge in and see what I can see
<ahasenack> hi guys, https://bugs.launchpad.net/juju-core/+bug/1271144 is still happening, I added some log files to it
<_mup_> Bug #1271144: br0 not brought up by cloud-init script with MAAS provider <cloud-installer> <landscape> <local-provider> <lxc> <maas> <regression> <juju-core:Fix Released by rogpeppe> <https://launchpad.net/bugs/1271144>
<ahasenack> can someone reopen the bug?
<bodie_> morning fine fellas
<rogpeppe> ahasenack: i *thought* we'd fixed it
<sparkiegeek> rogpeppe: so did we ;)
<ahasenack> rogpeppe: I just tried with 1.17.5, still no br0 and lxc deployments to bootstrap fail
<rogpeppe> ahasenack: but it was very hard for us to test that our fix worked
<rogpeppe> ahasenack: because we don't have access to a maas to test on
<ahasenack> rogpeppe: I reproduced it both with a real metal maas server, and with maas running on kvm
<rogpeppe> ahasenack: do you know which revno corresponded to 1.17.5 ?
<ahasenack> let me see if the branch is tagged
<ahasenack> is juju-core still in lp? That's where I'm looking
<ahasenack> rogpeppe: juju-1.17.5          2422
<wwitzel3> yeah, after the fix, I wasn't able to reproduce the error on my vbox maas
<ahasenack> current is 2443
<ahasenack> rogpeppe: we can help debug whatever you need
<ahasenack> that was just running juju tags, I'm assuming that is the official tag for 1.17.5
<sparkiegeek> wwitzel3: did you tear down and re-deploy each time?
<ahasenack> er, bzr tags
<sparkiegeek> wwitzel3: eminently reproducible here :)
<rogpeppe> wwitzel3: can you run with this?
<jam> wwitzel3: a quick question, were you testing LXC on machine-0 or on one of the other machines?
<bodie_> hmm...  i'm getting breakage on a fresh go get launchpad.net/juju-core
<bodie_> log/syslog/config.go:228: method slConfig.CACertPath is not an expression, must be called
<bodie_> and a few others
<bodie_> I don't imagine broken code would get merged
<bodie_> yeah tons of stuff is broken.  the only thing I can understand that might explain this is a bad connection....
<jam> bodie_: a fresh go get probably got updates from 3rd party branches that are newer than we are using
<jam> (go get grabs the tip of all dependencies)
<jam> we have a dependencies.tsv to state what version we are using
<jam> you should be able to "go get launchpad.net/godeps", and then do "godeps -u dependencies.tsv"
<jam> which should set your tree to be the exact version of dependencies we should use for building.
<bodie_> tried that, and install still complained about tons of random breakage
<bodie_> I just double-checked on my workstation to confirm, and nothing went wrong
<bodie_> but it's totally reproducible on my remote
<bodie_> starting to think the gopher gods have it in for me
<natefinch> bodie_: godeps doesn't actually update your branches, it just tells you which ones are broken.  I'm not sure what the bzr command is to tell those branches to move to the right commit.
<jam> bodie_: *potentially* it is a different version of go libraries?
<jam> natefinch: godeps -u dependencies.tsv does the right thing *if* you've already downloaded the full histories
<jam> it won't fetch new data
<jam> but if you have it locally, it will set the branch to the right revision
<natefinch> jam: ahh, so you need to bzr pull first, then run godeps
<bodie_> but go get already bzr pulls doesn't it?
<jam> natefinch: well, potentially "go get -u" to do the recursive fetch of all dependencies, but that potentially overwrites any local commit you have in a dependency.
<jam> bodie_: "go get" or "go get -u" ?
<jam> -u is needed to update in place
<natefinch> bodie_: go get puts you at head... we need some non-head stuff
<bodie_> go get -u -...
<bodie_> well i'm just saying, if you already go got, godeps should have the latest bzr content
<jam> bodie_: right, go get -u launchpad.net/juju-core/... ; godeps -u dependencies.tsv should leave you in a working state
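That sequence, consolidated, since it keeps coming up in this log (the commands mirror the ones quoted above):

    go get -u -v launchpad.net/juju-core/...   # fetch/refresh juju-core and all dependencies at tip
    go get launchpad.net/godeps
    godeps -u dependencies.tsv                 # rewind each dependency to the pinned revision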
<natefinch> jam: it seems like having head of all our branches not building is a big mistake for exactly this reason.  ideally godeps should only be needed for release branches that need older stuff
<jam> natefinch: you can't develop a 3rd party lib and never land it independently of trunk
<natefinch> jam: oh, yeah, I see... if you're working on a feature branch, and the feature needs updates to external packages.... yeah ok.
<bodie_> I might have just screwed up here by starting off with go get -u -v when I never ran go get without -u
<bodie_> giving this a try
<wwitzel3> jam: I was deploying the lxc to machine-0 and I was doing a full destroy each time. I have it running now.
<natefinch> jam: this is why Gustavo's versioning of packages is good... prevents exactly this problem by just making a new version that the new code references.
<jam> natefinch: right, the recent one is that axw has been improving gwacl, but that meant gwacl tip was incompatible with juju-core for a while.
<jam> natefinch: "ish"
<jam> except Gustavo has broken us several times
<jam> because he *thought* his change was compatible
<niemeyer> jam: have I?
<jam> when it really wasn't
<jam> niemeyer: you broke the test suite
<niemeyer> jam: Really!?
<bodie_> fight fight!
<jam> when you added timeouts for mongo
<niemeyer> jam: I'm now aware of that
<jam> because we were already doing timeouts in juju-core
<jam> and your timeouts and our timeouts doubled up
<natefinch> jam: heh, yeah, it's definitely not always possible to know if a change will break... but there are changes you *know* will break people
<niemeyer> jam: Well, sorry.. but that's a completely different kind of "breakage"
<niemeyer> jam: What else have I broken?
<jam> niemeyer: our test suite started failing because of an update to mgo.
<jam> I don't know of a case where stuff stopped compiling
<niemeyer> jam: That's good..
<niemeyer> jam: What else?
<jam> niemeyer: the test suite has broken at least one other time, and we pinned to rev 241 for a while.
<jam> I don't remember the exact reason offhand
<niemeyer> jam: Okay, phew.. glad the "several times" was an excited overstatement then
<niemeyer> jam: In terms of timeouts, why did it break again?  Just out of curiosity..
<bodie_> oh god
<bodie_> digitalocean's mirror has go1
<jam> niemeyer: we had a test that asserted we retried an appropriate number of times, but we ended up having mgo retrying on top of juju retrying
<jam> something like that
 * bodie_ slowly claws skin off face
<niemeyer> jam: Oh man.. okay..
<bodie_> I'm going to go yell at someone.  brb.
<niemeyer> jam: That's *totally unspecified*
<niemeyer> jam: If you assume mgo will connect a specific number of times, updates *may* break it
<jam> niemeyer: so, more context as I'm remembering. mgo used to flood connections to Mongo, and we needed it to slow down a bit. We did the slow-down in Juju code, but then mgo started slowing down, and then we were too slow, IIRC.
<niemeyer> jam: Sure, that's okay..
<niemeyer> jam: I reckon the update broke the test logic
<mgz> jam: the socket timeout issue was the most painful, that gave us random failures
<niemeyer> jam: But the test logic was trusting in unspecified behavior.. I simply cannot promise to not touch that kind of logic
<mgz> rogpeppe will remember the details
<niemeyer> mgz: There was an actual bug there in this case
<niemeyer> So yeah, that kind of breakage is on me.. will pay a beer to the debugger, sorry.
<niemeyer> But that's unrelated to natefinch's point.
<mgz> rogpeppe will look forward to that :)
<niemeyer> mgz: Yeah, I owe him more beers than I can pay
<niemeyer> So, the point remains: the versioning of stable APIs is a good method to avoid API breaks.
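The convention niemeyer refers to, as mgo itself used at the time: the major version is baked into the import path, so an incompatible change ships under a new path instead of breaking existing callers:

    import (
        "labix.org/v2/mgo" // v2 API; a breaking v3 would live at a new import path
    )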
<rogpeppe> voidspace: lp:~rogpeppe/juju-core/524-state-api-hostports
 * rogpeppe agrees with niemeyer
<bodie_> niemeyer++
<bodie_> okay, now I just think I'm going insane
<bodie_> ubuntu 12.04 golang is version 1?!
<bodie_> this can't be right
<niemeyer> bodie_: Why not?
<natefinch> bodie_: this is why I just always build go from source, plus you need to do that to enable cross compilation anyway
<bodie_> I don't know...  I guess it was my mistake to assume something would be a proper version on 12 LTS
<jam> natefinch: bodie_: precise is just that old
<jam> you need to "add-apt-repository ppa:juju/golang-go"
<jam> and we'll give you 1.1/1.2 (I don't quite remember now)
<niemeyer> bodie_: Hmm.. 1 looks like a proper version...? :)
<niemeyer> bodie_: 12 was 2 years ago
<natefinch> niemeyer: almost all go code written these days won't compile in go 1.
<niemeyer> natefinch: These days is two years after two years ago.. :)
<natefinch> niemeyer: yes... and there's been plenty of time to update the golang package to 1.1
<niemeyer> natefinch: Distributions don't get updated with new code just because it got released
<niemeyer> natefinch: LTS means long term *support*
<bodie_> just asked the same question in #go-nuts and they pointed me to niemeyer's godeb.  heh
<niemeyer> natefinch: Security fixes and the such will get applied
<bodie_> (a couple of minutes ago)
<niemeyer> natefinch: If you want new code, just update to a new release
<bodie_> I feel special
<jam> niemeyer: natefinch: yeah, generally Ubuntu doesn't change your toolchain underneath you in the official archives.
<jam> it would have been possible to add a "golang-go-1.1" sort of package.
<jam> but not to change the "golang-go" package out of 1.0 status.
<bodie_> yeah, cause that's the version it was at, at that point in time.  I get that.
<bodie_> it's just annoying to realize... lol
<jam> bodie_: :)
<jam> well, I would have pushed to be
<bodie_> spent an hour trying to figure out why everything was broken until I checked the version -_-
<jam> compilable on default tools of Precise for a bit longer.
<niemeyer> bodie_: That's reasonable :)
<jam> but 1.1 was just too much better than 1.0
<bodie_> I don't think making a habit of making sure you have a properly updated toolchain is a bad idea
<bodie_> I just get lazy and assume I'll have the things I need from repos
<bodie_> didn't even cross my mind to check the version until I noticed the output of go get -v was so different
<natefinch> bodie_: you probably should run trusty.  It's what most of us are developing on at this point
<bodie_> maybe I'll make a mention of this in CONTRIBUTING...
<bodie_> oh I'm running trusty on my workstation
<bodie_> not fun
<bodie_> this is why I'm making a 12.04 vm
<natefinch> no?  I have no problems with it
<bodie_> it's been literally one thing after another for an entire week including the weekend
<bodie_> maybe I'm just being dumb, maybe not everything is totally shipshape on 14.04
<bodie_> ruling out the latter
<bodie_> you also built your mongo on a different release
<bodie_> the packaged mongo doesn't work with juju
<bodie_> the packaged juju-mongo doesn't work with juju
<natefinch> bodie_: true.  I guess if mongo doesn't even build on trusty, that's a problem.   Though there was a fix to the mongo problem last night.  You just need to install juju's mongodb on trusty now
<bodie_> GCC 4.8.2 doesn't work with mongo source
<bodie_> okay, cool :)
<bodie_> maybe I'll do that after I get this 12.04 vm running.  lol
<natefinch> bodie_: if you want.  It seems like you're much more likely to run into problems running an old version than a new version
<bodie_> pretty stoked that Go is moving to a Go-compiled Go.  pretty awesome :D
<bodie_> yeah, hopefully I can just make it work on this box.
<bodie_> otherwise to the VM I go
<bodie_> also had issues with a 13.10 vm last night ....
<bodie_> ssh breaking
<bodie_> I don't even...
<bodie_> I'm about ready to decide I should have gone into a career as a gardener
<ppetraki> has anyone tried to download a bundle lately from the charm store? I've tried to export several different ones and all I'm getting is
<ppetraki> envExport:
<ppetraki>   services: {}
<ppetraki>   relations: []
<ppetraki>   series: precise
<wwitzel3> ahasenack, jam: run through it a few times now, still unable to reproduce with my local maas. tear down and clean instance each time .. http://paste.ubuntu.com/7120061/
<ahasenack> wwitzel3: try getting rid of the squid cache
<wwitzel3> ahasenack: be happy to if I knew how
<ahasenack>   if [ ! -d /var/cache/squid-deb-proxy/00 ]; then
<ahasenack>    $SQUID -z -f /etc/squid-deb-proxy/squid-deb-proxy.conf
<ahasenack>   fi
<ahasenack> maybe trashing that directory and restarting
<ahasenack> the initscript will create it again
<arosales> ppetraki: also may want to try #juju-gui
<bodie_> hmmmm
<bodie_> getting the same weirdness as I was last night on my 13.10
<bodie_> http://paste.ubuntu.com/7120110/
<bodie_> anyone know what might be up?  I tried debugging sshd last night, but I just got more breakage
<wwitzel3> ahasenack: removed the folder and turned off the squid-deb-proxy service, giving it another try now
<ahasenack> wwitzel3: I'm not sure you can leave it off, juju might expect it to be running
<wwitzel3> ahasenack: true
<wwitzel3> ahasenack: removed the squid-deb-proxy directory, rebooted, and the br0 interface comes up .. my maas is providing DNS and DHCP. Is yours not? Maybe that is the difference?
<ahasenack> wwitzel3: ours is
<ahasenack> wwitzel3: I tried another maas, an older one on precise (v1.4), and I also nuked the squid cache there. There it worked this time
<ahasenack> wwitzel3: so we only have one maas where this is happening right now, before nuking the cache there I think we need to know what is going on
<axw> bodie_: those tests don't run a real ssh executable
<axw> bodie_: see storageSuite.sshCommand; it writes a fake ssh command to $PATH
<wwitzel3> ahasenack: I am able to replicate the behavior before rogpeppe's fix .. maybe if I revert to the pre-fix version, bootstrap, and populate the cache, then upgrade to 1.17.5 and try to bootstrap without removing the cache .. I can replicate it that way.
<wwitzel3> ahasenack: I will try that to see if i can get in to the same error pattern you're experiencing
<ahasenack> wwitzel3: ok, I'll try again on that maas machine to see if it wasn't just an archive fluke, but without touching the squid cache
<wwitzel3> ahasenack: sounds good
<axw> fwereade: https://codereview.appspot.com/77820044  - when you have some time
<axw> (please)
<sparkiegeek> wwitzel3: can you paste your cloud-init-output.log from a successful run?
<sparkiegeek> (cc: ahasenack ^^)
<sparkiegeek> wwitzel3: I see it trying to install bridge-utils immediately *before* an apt-get update
<sparkiegeek> wwitzel3: also line 24 from http://paste.ubuntu.com/7120061/ looks like template fail :)
<ahasenack> sparkiegeek: wwitzel3: http://pastebin.ubuntu.com/7120231/
<ahasenack> wwitzel3: that's /var/lib/cloud/instance/cloud-config.txt on the bootstrap node
<bodie_> axw..... ok...  so how am I supposed to tackle this
<ahasenack> it's the same on the other bootstrap node where it worked
<ahasenack> could be "luck". Would love to see if it's just a missing apt-get update
<ahasenack> oh, btw
<ahasenack> sparkiegeek: wwitzel3: where it's failing is trusty, and where it worked for me now is precise
<ahasenack> so what I mean is that it's using different archives
<axw> bodie_: sorry, I'm not sure what the root cause is. it just looks like a broken test, but I can't tell at the moment
<natefinch> axw: isn't it like 2am there?
<axw> natefinch: nah, just after 11pm
<bodie_> it's a broken test, I think that's as far as I've gotten.  heh
<axw> bodie_: I meant as opposed to broken code under test
<natefinch> axw: ahh, what time zone is that?  I don't have it in my world clock list
<bodie_> ok
<bodie_> probably caused by my custom shell
<axw> natefinch: UTC+8, aka AWST
<bodie_> although that doesn't make sense either
<natefinch> axw: cool.  I'll add that to my list.  Gotta keep track of everyone's time zone so I know when it's reasonable to bug them :)
<axw> :)
<perrito666> yey, for he who loves reviews now we have a 2 for 1 great offer, review https://codereview.appspot.com/77850043 and get to also review its dependency https://codereview.appspot.com/77490043
<perrito666> do not miss this opportunity dear juju hackers
<axw> bodie_: it's explicitly executing as bash, though it's possible that your ~/.bashrc is influencing
<natefinch> perrito666: I'll look
<mgz> axw: that's my best guess too, but I'm a little surprised .bashrc gets triggered
<axw> mgz: that's *a* bug, we should disable it with --norc in the test code
<axw> some of the tests do that, but evidently not this one
<axw> bodie_: try sticking "--norc" after "#!/bin/bash" in sshCommand, see if that does anything
<bodie_> ok
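(A rough, self-contained illustration of the trick axw describes, not the actual storageSuite code: write a stub "ssh" script into a directory prepended to $PATH, with --norc on the shebang line so the user's bash setup can't leak in:)

```go
package main

import (
	"os"
	"os/exec"
	"path/filepath"
	"testing"
)

func TestFakeSSH(t *testing.T) {
	dir := t.TempDir()
	// --norc on the shebang keeps ~/.bashrc and friends out of the test.
	script := "#!/bin/bash --norc\necho fake-ssh \"$@\"\n"
	if err := os.WriteFile(filepath.Join(dir, "ssh"), []byte(script), 0o755); err != nil {
		t.Fatal(err)
	}
	// Shadow the real ssh for the duration of this test.
	t.Setenv("PATH", dir+string(os.PathListSeparator)+os.Getenv("PATH"))

	out, err := exec.Command("ssh", "somewhere").CombinedOutput()
	if err != nil {
		t.Fatalf("fake ssh failed: %v: %s", err, out)
	}
	if string(out) != "fake-ssh somewhere\n" {
		t.Fatalf("unexpected output: %q", out)
	}
}
```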
<axw> fwereade: yeah, Base doesn't quite sit well with me but I can't think of a better name :\
<axw> EnvironDefaults?
<rogpeppe> axw: in general i prefer entities that do one thing. "Base" types can so easily become a bin to chuck all kinds of random stuff in
<bodie_> woot, got broken pipe test passing.
<bodie_> sshstorage
<bodie_> on to the next one
<axw> bodie_: how? --norc?
<axw> rogpeppe: hmm... point taken. maybe it can be replaced with multiple types with default behaviour for specific things
<rogpeppe> axw: that would seem better to me
<rogpeppe> axw: that means that if you add some new methods, you'll need to explicitly add the default behaviour to all providers, but i actually think that's an advantage
<rogpeppe> axw: as it forces providers to think about whether the default behaviour is actually appropriate for that provider
<axw> yeah
<axw> fair enough. I will take another look at that part tomorrow.
<rogpeppe> axw: thanks
<axw> I'm off, g'night folks
<bodie_> axw, I got it working by changing my shell back to bash
<natefinch> bodie_: weird.... I was pretty sure rogpeppe was running uh... some non-bash shell, and he can run the tests.
<bodie_> totally weird
<rogpeppe> natefinch: yeah, i do
<rogpeppe> natefinch: my $SHELL
<rogpeppe> % echo $SHELL
<rogpeppe> /home/rog/other/plan9port/bin/rc
<bodie_> :)
<bodie_> hah!
<bodie_> I'm using oh-my-zsh
<bodie_> probably some plugin messing with things
<bodie_> I'm quickly realizing I need to be as absolutely stock as possible
<bodie_> THEN get my tests to pass
<bodie_> then start experimenting, I guess
<mgz> bodie_: you're doing a fine job finding test suite isolation issues...
<mgz> just not really what you want to be doing
<natefinch> mgz, bodie_: a good point.... that actually is quite useful, just, yeah....
<bodie_> even then, it surely would be better to get tests passing first so I at least know what's breaking what, and then start customizing my environment
 * fwereade is disappearing for a bit to have a snippet of his public holiday
<bodie_> heh, enjoy!
 * fwereade will be around for a call in 4 hours at the very least
<bodie_> ppa:juju/golang-go isn't found -- is it $1/golang ?
<mgz> bodie_: yup
<bodie_> fyi, ppa is 1.1.2
<bodie_> maybe THAT'S the problem with my 14.04?  it's running 1.2.1?
<bodie_> oy.
<bodie_> natefinch, go version?
<natefinch> bodie_: I'm running 1.3, anything >= 1.1 should be fine
<natefinch> bodie_: er 1.2
<natefinch> it would be a trick to be running 1.3
<bodie_> I would like to do that as well lol
<voidspace> rogpeppe: do you have me on mute?
<voidspace> rogpeppe: or can you just not hear me?
<voidspace> rogpeppe: or are you ignoring me?
<voidspace> hah
<rogpeppe> mgz: a review for you? https://codereview.appspot.com/77820043
<wwitzel3> ahasenack: http://paste.ubuntu.com/7120556/ , also I was not able to reproduce the error state by bootstrapping with the pre-fix version, then destroying and bootstrapping with the fix.
<mgz> rogpeppe: looking
<rogpeppe> mgz: ta
<mgz> rogpeppe: now actually reviewing it
<rogpeppe> mgz: ta#2
<mgz> rogpeppe: lgtm.
<rogpeppe> mgz: ta#3!
<rogpeppe> mgz: the document has only been around for a few revisions. it's not used by anything else
<mgz> as in, wasn't in the last 1.17 release either? that's perfectly fine then.
<natefinch> Am I the only one that thinks this formatting is terrible?
<natefinch> func StartInstance(
<natefinch>         env environs.Environ, machineId string,
<natefinch> ) (
<natefinch>         instance.Instance, *instance.HardwareCharacteristics, error,
<natefinch> ) {
<natefinch> ....
<natefinch> }
<mgz> yeah, it's dire
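(For contrast, the same shape of signature with stand-in types, formatted the more conventional way; a sketch, not the real juju-core declaration:)

```go
package main

import "fmt"

type Environ struct{}
type Instance struct{}
type HardwareCharacteristics struct{}

// Keeping the signature on one line avoids the orphaned parentheses
// in the paste above; long, but grep-able.
func StartInstance(env Environ, machineId string) (*Instance, *HardwareCharacteristics, error) {
	return &Instance{}, &HardwareCharacteristics{}, nil
}

func main() {
	inst, hw, err := StartInstance(Environ{}, "0")
	fmt.Println(inst, hw, err)
}
```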
<ahasenack> wwitzel3: how could we inject some debugging into that cloud-init script? I would love to see the contents of sources.list and sources.list.d/* at that point, and/or maybe just try adding an apt-get update before it tries to install bridge-utils
<ahasenack> wwitzel3: like http://pastebin.ubuntu.com/7120722/ (untested)
<mgz> ahasenack: can be done.
<mgz> rogpeppe: I wonder if SetAPIHostPorts should actually error out on a zero port
<rogpeppe> mgz: perhaps
<rogpeppe> mgz: there are so many other ways that the addresses can be wrong though, it seems like it's not really worth doing that one thing
<mgz> yeah, maybe, not sure what the right level to be checking these things is really
<jamespage> is it possible to change the timeout on the bootstrap command? it appears to be 10 mins right now - I can think of physical server deployments where this is not long enough :-)
<natefinch> jamespage: yep, there's a config value you can set in the environments.yaml
<mgz> jamespage: bug 1257649 (see status)
<_mup_> Bug #1257649: ssh timeout for bootstrap could be configurable <config> <maas-provider> <micro-cluster> <juju-core:Fix Released by dimitern> <https://launchpad.net/bugs/1257649>
<jamespage> natefinch, mgz: phew
<natefinch> jamespage: juju help bootstrap will show you how, I believe
<rogpeppe> mgz, fwereade: currently we store instance ids in the control bucket. can you think of a strong reason to continue to do that for the api addresses, rather than storing the addresses themselves?
<rogpeppe> if we store addresses, then we won't need an Environ to be able to work out the API server addresses - just read-only access to the control bucket
<rogpeppe> although, i guess we will need to know where the control bucket is
<mgz> yeah, we also don't promise that one is globally readable either I think
<mgz> but putting addresses in it ('as well') should be fine as a basic optimisation
<rogpeppe> mgz: i'm wondering if we need to maintain the instance ids at all
<rogpeppe> mgz: ah, of course we do...
<rogpeppe> mgz: for destroy-environment --force
<mgz> yup.
<mgz> weeeel...
<mgz> you can implement that in other ways
<mgz> just nuking all machines with matching names is how we generally do it I think
<rogpeppe> mgz: good point
<rogpeppe> mgz: so... maybe we don't need to do it
<rogpeppe> mgz: it would make some logic simpler, and it opens the door to having long-term useful .jenv files without provider credentials
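(A sketch of the name-matching force-destroy mgz mentions; the Instance type and the "juju-<env>-machine-<n>" pattern are assumptions for illustration, not juju-core's actual scheme:)

```go
package main

import (
	"fmt"
	"strings"
)

// Instance is a stand-in for a provider's instance record.
type Instance struct{ Name string }

// doomedInstances selects every instance whose name matches the
// environment's assumed machine-naming scheme, so destroy-environment
// --force could proceed without instance ids from the control bucket.
func doomedInstances(all []Instance, env string) []Instance {
	prefix := "juju-" + env + "-machine-"
	var doomed []Instance
	for _, inst := range all {
		if strings.HasPrefix(inst.Name, prefix) {
			doomed = append(doomed, inst)
		}
	}
	return doomed
}

func main() {
	all := []Instance{{"juju-prod-machine-0"}, {"unrelated-vm"}, {"juju-prod-machine-1"}}
	fmt.Println(doomedInstances(all, "prod"))
}
```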
<rogpeppe> here's the branch that changes the peergrouper to publish the api addresses in the state: https://codereview.appspot.com/77600048
<rogpeppe> reviews appreciated
<rogpeppe> mgz, dimitern, fwereade, natefinch, wwitzel3: ^
<wwitzel3> rogpeppe: will take a look in a bit
<rogpeppe> wwitzel3: ta
<perrito666> wow, launchpad just sent me an email stating it errored trying to send an email from a comment I added, how rude
<perrito666> :p
<bodie_> oh my god
<bodie_> I think my tests may finally be passing
<mgz> you're special now :)
<jamespage> mgz, I'm not getting an all-machines.log aka debug-log with the MAAS provider and 1.17.5 on trusty - is that something new?
<jamespage> can't find a bug that looks like it
<mgz> ugh, probably, we did change several things around rsyslog
<ahasenack> wwitzel3: around?
<mgz> jamespage: the big change was bug 1281071
<_mup_> Bug #1281071: juju internal use of rsyslog should use ssl/tls for aggregation <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1281071>
<wwitzel3> ahasenack: yep
<ahasenack> wwitzel3: it worked with this patch: http://pastebin.ubuntu.com/7121040/
<mgz> ...but that was in 1.17.4?
<ahasenack> wwitzel3: apt-get update being the fix for me, the rest was just debugging
<ahasenack> Setting up bridge-utils (1.5-2ubuntu7) ...
<ahasenack> stop: Unknown instance:
<ahasenack> networking stop/waiting
<ahasenack> installed bridge-utils, restarted networking, br0 is up
<mgz> jamespage: only obvious change since then is the adding of the rsyslog-gnutls package
<wwitzel3> ahasenack: ok, I will figure out a way to get that in there
<wwitzel3> ahasenack: have to ping some people to see if there is a preferred way to make that happen or if we can just do as you did and add the apt-get update line.
<jamespage> mgz, yeah - I've seen that work OK on OpenStack with 1.17.4
<ahasenack> wwitzel3: I'm building debs in my junk ppa, others can try too (I see jamespage was affected too)
<mgz> jamespage: we probably need a new bug filed, if you can find anything on a unit about rsyslog and the forwarding, that would be relevent
<jamespage> mgz, doing that now
<jamespage> https://bugs.launchpad.net/juju-core/+bug/1294776
<_mup_> Bug #1294776: No debug-log with MAAS, 14.04 and juju-core 1.17.5 <juju-core:New> <https://launchpad.net/bugs/1294776>
<mgz> thanks!
<wwitzel3> mgz: do you have an opinion on the br0 fix? just add an apt-get update before the explicit apt-get install bridge-utils that was rogpeppe's fix?
<wwitzel3> mgz: since we have it confirmed fixing the issue for ahasenack
<rogpeppe> wwitzel3: ah, so apt-get update is necessary?
<ahasenack> it was for me, not for wwitzel3
<rogpeppe> ahasenack: weird
<ahasenack> could be a difference in the images that were used to install the node?
<rogpeppe> ahasenack: do you know what's actually going on there?
<ahasenack> rogpeppe: apt-get install bridge-utils was not finding bridge-utils, that's all I know
<rogpeppe> ahasenack: ahhh
<ahasenack> I checked sources.list, they look fine
<ahasenack> so I tried apt-get update
<ahasenack> later during the bootstrap process apt-get update is run
<rogpeppe> ahasenack: so i guess some images ship with a sources.list that doesn't include bridge-utils
<ahasenack> and later again, bridge-utils is installed by something else, and then it works
<ahasenack> rogpeppe: well, no, sources.list is ok; if it weren't, a simple apt-get update wouldn't have fixed it
<ahasenack> rogpeppe: maybe some images were generated after apt-get update was run, and some were generated without apt-get update having ever been run
<ahasenack> I can check /var/lib/apt if you want, probably
<rogpeppe> ahasenack: sounds worrying :-)
<ahasenack> it's always good to start with a fresh apt-get update run anyway
<ahasenack> you do it in the sync bootstrap
<ahasenack> "just to be sure"
<ahasenack> Adding apt repository: deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/cloud-tools main
<ahasenack> Running apt-get update
<ahasenack> Running apt-get upgrade
<ahasenack> more because you add another sources.list, sure
<rogpeppe> to me it all seems like more things that slow down our already slow instance start time, but perhaps that's just me
<rogpeppe> i do see that we need to do it
<rogpeppe> but i look at docker and think "a better way is surely possible"
<mgz> I'm a little nervous about fixes when we don't understand why...
<mgz> cloud-init does update/upgrade as one of its earliest steps
<mgz> well before we get to our runcmd
<mgz> why do we need it twice?
<rogpeppe> mgz: +1
<wwitzel3> rogpeppe, mgz, ahasenack: when I look at the order that things run on my maas, we issue an apt-get update before attempting to install bridge-utils
<ahasenack> mgz: where does it do it?
<ahasenack> wwitzel3: where's the evidence? There are two places where bridge-utils gets installed
<ahasenack> wwitzel3: in my case, the first one fails
<ahasenack> my first one:
<ahasenack> Reading state information...
<ahasenack> E: Unable to locate package bridge-utils
<ahasenack> stop: Unknown instance:
<ahasenack> networking stop/waiting
<ahasenack> + install -D -m 644 /dev/null /var/lib/juju/nonce.txt
<ahasenack> that comes from that piece of code I patched to include apt-get update
<ahasenack> it failed
<ahasenack> then later,
<ahasenack> The following NEW packages will be installed:
<ahasenack>   bridge-utils
<ahasenack> and it goes on and installs it
<ahasenack> I don't know what prompted that one
<mgz> ahasenack: the cloud-config specifies apt_update as true
<ahasenack> I think it was bootstrap
<ahasenack> right, it's in that part where it installs package by package
<ahasenack> git
<mgz> ahasenack: check the cloud-config file in /var/lib/cloud or whereever it is
<ahasenack> mgz: that apt_update true kicks in after
<mgz> o_O
<ahasenack> mgz: that file does not have apt-get update
<ahasenack> only after I added it via the patch
<mgz> oh god, is this an ssh race
<mgz> wwitzel3: is the bridge-utils install done via runcmd or via the ssh-in-and-do-shit step?
<ahasenack> first attempt, that failed, is via runcmd
<ahasenack> it's in that patch, I pasted it
<ahasenack> http://pastebin.ubuntu.com/7121040/
<ahasenack> I just added -y, and apt-get update before
<mgz> runcmd should absolutely be *after* apt_upgrade
<ahasenack> but apt-get install bridge-utils was there already
<mgz> smoser!!!!
<ahasenack> and that's the one failing
<bodie_> welp, got my tests to pass
<smoser> yes
<mgz> wait, I think we already had this conversation
<smoser> runcmd runs after cloud-init's own apt-get install
<mgz> did when runcmd happens get changed?
<mgz> ahasenack is seeing odd things
<ahasenack> mgz: where do we set apt_update True?
<smoser> running 'apt-get update' is not "just to be sure", it's really a requirement if you want to avoid missing packages in -updates.
<mgz> environs/cloudinit/cloudinit.go AddAptCommands
<wwitzel3> mgz: I have it being installed twice (bridge-utils) once during the ssh-in .. it is one of the first things to happen after SSH, then later another install is attempted, but at that point it is already at the latest version.
<ahasenack> wwitzel3: you have it as part of cloud-init too
<mgz> smoser: the question is, why do we have to run it *again* in runcmd when the cloud-config has apt_update set
<ahasenack> wwitzel3: check /var/lib/cloud/instance/cloud-config.txt
<ahasenack> mgz: how does the cloud-config "file" generated by provider/maas/environ.go get to play with the one from environs/cloudinit/cloudinit.go ?
<voidspace> rogpeppe: sorry, I thought the call was at 7pm
<voidspace> rogpeppe: it's a EuroPython committee meeting call
<rogpeppe> voidspace: np
<rogpeppe> voidspace: it's 7pm CET, i guess
<voidspace> yep
<ahasenack> mgz: and, remember, this issue of br0 not coming up only happens with the bootstrap node, not with other nodes that are brought up by juju later on
<ahasenack> hints at different cloud-inits
<smoser> mgz, well there are race conditions that are unavoidable in apt
<mgz> ahasenack: well, does your /var/lib/cloud/instance/cloud-config.txt have the update key or not?
<smoser> but i doubt you're hitting that
<smoser> do you see evidence of apt-get update having run in /var/log/cloud-init-output.log ?
<ahasenack> mgz: without my patch, no apt-get update in it.
<mgz> smoser: yeah, seems more like some odd bug in our code if you've not done anything evil recently
<mgz> ahasenack: okay, that's the bug then.
<smoser> mgz, trusty ? precise ?
<smoser> for sure neither one *should* have had a regression to this effect.
<ahasenack> smoser: I don't think it's an issue with cloud-init.
<smoser> and look in /var/log/cloud-init.log
<smoser> look for WARN in /var/log/cloud-init.log
<ahasenack> there is no evidence that apt_update: True was passed to it.
<smoser> well, look at /var/log/cloud-init-output.log
<smoser> it will clearly show you if it was run
<ahasenack> I did
<smoser> (and you'll also see mention of it in /var/log/cloud-init.log if it ran it)
<ahasenack> it's clear that it is following the commands juju gave it via http://pastebin.ubuntu.com/7121040/
<ahasenack> and there was no apt-get update in that
<mgz> so, somehow wayne has that, and you don't
<ahasenack> luck I guess
<ahasenack> also, we use the fast-path-installer, which dumps a cloud image on the node
<ahasenack> wwitzel3: do you use that too?
<mgz> I'm assuming he does not
<wwitzel3> ahasenack: nope
<ahasenack> ok, that narrows it down a lot
<ahasenack> if you use the debian installer, then it's more or less guaranteed that apt-get update was run
<ahasenack> and /var/lib/apt is still "fresh"
<ahasenack> whereas the fast-path-installer uses a cloud image, that I think was downloaded to the maas server only once when it was first installed
<ahasenack> or it downloads it from the internet, I don't know
<wwitzel3> ahasenack: yeah the first 3 lines when I bootstrap after the ssh keys are generated are adding the cloud-tools repository, then an update, and then an upgrade
<ahasenack> wwitzel3: ok, the bug happens before that
<ahasenack> wwitzel3: you only see it in the cloud-init-output.log file
<ahasenack> it happens before juju bootstrap ssh'ed in
<ahasenack> wwitzel3: if you want, switch to the fastpath installer and try again, maybe you'll hit the bug too
<wwitzel3> ahasenack: yeah, my cloud-config.txt doesn't have an apt-get update in it, but my bridge is set up fine, probably because update has already been run
<ahasenack> that's my conclusion
<ahasenack> mgz: ^^^ wwitzel3's cloud-config.txt doesn't have apt-get update
<ahasenack> like mine
<ahasenack> he got lucky :)
<ahasenack> I didn't :)
<mgz> okay, so we need to find out why that's not getting in, and fix it.
<ahasenack> hazmat: remember the br0 not coming up bug on the bootstrap node? There were two cloud-init configs involved, right? Because br0 does get up in services deployed to new machines, just not in the bootstrap one
<ahasenack> hinting that the one used for bootstrap is different than the rest
<ahasenack> there was a long discussion here in the channel when that was first filed
<hazmat> ahasenack, yes.. there was.. i thought it was addressed with 1.17.5
<ahasenack> hazmat: nope, still happening
<hazmat> ahasenack, in a nutshell: on the bootstrap node, bridge-utils previously wasn't installed because of a divergent code path for bootstrap when constructing cloud-init
<ahasenack> hazmat: in my case, a simple 'apt-get update' just before 'apt-get install bridge-utils' worked
<ahasenack> hazmat: so the fix was to just add "apt-get install bridge-utils" in bootstrap's cloud-init?
<hazmat> ahasenack, ah.. the package is old..
<ahasenack> hazmat: right
<natefinch> wwitzel3: btw, when you get time to get back on the stuff we were working on, I merged from trunk and fixed some tests. Still have the environment not prepared failure, but that's the only one left right now.
<ahasenack> hazmat: I think it was assumed that apt_update: was set to True in cloud-init
<ahasenack> hazmat: which it might as well be, but not for bootstrap's cloud-init
<hazmat> ahasenack, i'd file a bug.. i gotta run out for a min.. bbiab
<ahasenack> hazmat: sure, it's filed, we were "remembering" it
<ahasenack> that should be enough info to fix it
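(The shape of the fix ahasenack's patch points at, sketched as the runcmd list it would generate; illustrative only, not the actual environs/cloudinit change:)

```go
package main

import "fmt"

// bridgeSetupCmds returns the runcmd entries for bringing up br0 on
// the bootstrap node. The explicit update is the point: on fast-path
// cloud images /var/lib/apt can be stale, and without it the install
// fails with "Unable to locate package bridge-utils".
func bridgeSetupCmds() []string {
	return []string{
		"apt-get update",
		"apt-get install -y bridge-utils",
		// ...bridge configuration and the networking restart follow here
	}
}

func main() {
	for _, cmd := range bridgeSetupCmds() {
		fmt.Println(cmd)
	}
}
```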
<wwitzel3> natefinch: great, I will pull those changes and take a look
<voidspace> g'night all
<rogpeppe> i'm done too
<rogpeppe> happy evenings, all
<wwitzel3> later rogpeppe
<perrito666> natefinch: could you give a second look to https://codereview.appspot.com/77850043/ ? I think it is much better now
<perrito666> tx :)
<natefinch> perrito666: LGTM
<perrito666> natefinch: thank you
<natefinch> wwitzel3: looks like the dummy environ expects some configuration to be set when you get the state out of it.  Looks like it expects a state-id to be set
<wwitzel3> natefinch: is there an example of that?
<natefinch> wwitzel3: not sure
<wwitzel3> k, I'll take a look around, brb
<natefinch> thumper: how's your knowledge of the dummy provider?  I'm trying to update the jujud bootstrap tests to include a valid environment and it's bombing out due to an environment not being prepared, but I can't figure out how I'm actually supposed to prepare an environment for the dummy provider.
<thumper> natefinch: sorry, no idea
<natefinch> thumper: heh ok
<bodie_> http://paste.ubuntu.com/7121842/
<bodie_> any ideas?
<bodie_> I'm on 14.04
<bodie_> go1.2.1
<bodie_> have the mongo from make install-dependencies
<bodie_> have all the right revisions via godeps -u dependencies.tsv
<natefinch> bodie_: you're building trunk, right?
<bodie_> just wiped all my source code and go-got the new thing before doing all of the above
<bodie_> yeah
<natefinch> bodie_: do you have mongod in /usr/lib/juju/bin/ ?
<bodie_> http://paste.ubuntu.com/7121855/
<bodie_> so I guess I have it, but not in my PATH
<bodie_> this is just what I got via make install-dependencies
<natefinch> bodie_: is this running the replicaset tests?
<natefinch> bodie_: nm, I see at the bottom of the paste
<bodie_> it's not crucial, but i'd really like to get the tests passing on my local
<bodie_> do I need juju-mongo in my PATH?
<natefinch> bodie_: no, I don't have it in my path
<natefinch> bodie_: ha, I'm lying, I do have *a* mongod in my path, and removing it causes that exact error
<natefinch> bodie_: so, yeah, add that directory to your path, should fix that error
<bodie_> nice
<bodie_> thought it looked familiar to what was going wrong when I had no mongo
<perrito666> you know, it is very hard for me to code in this and not think permanently on http://www.gamefabrique.com/games/juju-densetsu/ which I played as a kid
<natefinch> that is perhaps the worst error handling I've ever seen, though.  No "can't find mongo"; instead, a nil pointer error.
<bodie_> heh
<bodie_> yeah, the tests need some work
<bodie_> I just am keen to get rolling on actions
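(A sketch of the friendlier failure natefinch is asking for: probe for mongod up front and report clearly instead of dereferencing nil later. The paths mirror the ones mentioned above; the helper itself is hypothetical:)

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// findMongod prefers juju's bundled mongod and falls back to $PATH,
// returning an explicit error rather than letting callers trip over
// a nil pointer when neither exists.
func findMongod() (string, error) {
	const jujuMongod = "/usr/lib/juju/bin/mongod" // hypothetical default, as discussed above
	if _, err := os.Stat(jujuMongod); err == nil {
		return jujuMongod, nil
	}
	if path, err := exec.LookPath("mongod"); err == nil {
		return path, nil
	}
	return "", fmt.Errorf("mongod not found at %s or in $PATH", jujuMongod)
}

func main() {
	path, err := findMongod()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("using", path)
}
```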
<natefinch> perrito666: I don't know the game, but man, do those kind of graphics bring back memories
<perrito666> natefinch: It was a famicom game :) I have it ringing in my head since I heard of juju
<bodie_> yeesh.
<bodie_> *** Test killed: ran too long (10m0s).
<bodie_> FAIL	launchpad.net/juju-core/worker/uniter/debug	600.003s
<bodie_> :(
<thumper> sinzui: https://bugs.launchpad.net/juju-core/+bug/1294632
<_mup_> Bug #1294632: lxc broken in trunk r2440 <bootstrap> <ci> <local-provider> <lxc> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1294632>
<thumper> sinzui: have there been any packaging changes with respect to jujud?
<sinzui> yeah. that is sad
<thumper> sinzui: I can run the local provider locally and all good
<thumper> the only difference is that I compile jujud
<thumper> and the bot probably doesn't
<thumper> which means it may not have a jujud locally
<thumper> and hence goes looking for tools
<sinzui> thumper, I did more than compile. I packaged it
<thumper> sinzui: what is it testing? the package?
<thumper> and which packages?
<thumper> does your package contain jujud?
<natefinch> bodie_: weird, those tests run in ~1 second for me.
<sinzui> juju-core_1.17.6
<thumper> sinzui: the CI machine, does it do any compile from source, or just tests packages?
<bodie_> what the hell...
<bodie_> hmm, maybe there's some driver I'm missing (this is on a VM)
<sinzui> thumper, I think you mean the new behaviour searches the env (PATH actually) for the jujuds and asks if they match
<thumper> sinzui: no... the local provider always needs to get some tools
<thumper> what it does by default is to look for a jujud that lives next to the command line juju
<thumper> if it doesn't find one, it tries to compile one
<sinzui> thumper, CI makes the tarball, makes the package from the tarball, then installs the package in a non-system location. Paths are updated and the tests begin.
<thumper> if it can't compile one, it looks for tools
<sinzui> thumper, tests worked until the revision..tools were found before that rev
<bodie_> with juju-mongodb in my PATH now: http://paste.ubuntu.com/7121938/
<thumper> sinzui: ok, what is in the package?
<sinzui> What WE RELEASE
<bodie_> (this is my 14.04 machine, not the one taking 10 minutes)
<thumper> and the contents hasn't changed recently?
<sinzui> thumper, nothing else failed, so we know the tools were found and also sent to the CPCs
<thumper> hmm...
<sinzui> packaging rules haven't changed for a few weeks
<thumper> sinzui: I've assigned it to wallyworld to fix
<thumper> he should have a better idea what is going on
<natefinch> thumper, sinzui: this seems bad (from running the tests with the new juju-mongodb): value *mgo.QueryError = &mgo.QueryError{Code:16149, Message:"exception: cannot run map reduce without the js engine", Assertion:false}
<thumper> sinzui: is there any way I can look at the CI machine?
<thumper> natefinch: which test?
<thumper> AFAIK, we shouldn't be using the javascript engine at all
<natefinch> thumper: TestCharmStreaming   from store/server_test.go
<sinzui> thumper, ~/jobs/local-deploy/workspace/extracted-bin/usr/lib/juju-1.17.6/bin$ jujud version
<sinzui> 1.16.6-precise-amd64
<thumper> sinzui: what about ./jujud version
<thumper> it doesn't search the path, it looks for one beside the juju being executed
<sinzui> ah correct, I didn't export path like the test does
<natefinch> thumper: looks like mgo's MapReduce function must require the javascript engine  (used in Store/store.go  Store.Counters)
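(Why MapReduce trips the js engine: in mgo the map and reduce bodies are JavaScript strings executed server-side, so a mongod built without JS — like the stripped-down juju-mongodb — rejects the call. A self-contained sketch, not juju's store code; database and collection names are made up:)

```go
package main

import (
	"fmt"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

func main() {
	session, err := mgo.Dial("localhost")
	if err != nil {
		panic(err)
	}
	defer session.Close()

	// The Map and Reduce fields are JavaScript source, run by mongod.
	job := &mgo.MapReduce{
		Map:    "function() { emit(this.kind, 1) }",
		Reduce: "function(key, values) { return Array.sum(values) }",
	}
	var results []bson.M
	if _, err := session.DB("test").C("counters").Find(nil).MapReduce(job, &results); err != nil {
		// On a JS-less mongod this is where "cannot run map reduce
		// without the js engine" comes back.
		fmt.Println("mapreduce failed:", err)
		return
	}
	fmt.Println(results)
}
```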
<natefinch> sinzui: do we run the tests in CI with juju-mongodb? This seems like it should have failed there.
<thumper> natefinch: CI runs precise
<thumper> no juju-mongodb for precise
<sinzui> thumper, you are now subscribed to lp:~sinzui/+junk/cloud-city. The staging-id_rsa key will let you ssh to jenkins@54.84.137.170
<sinzui> thumper,
<sinzui> PATH=/var/lib/jenkins/jobs/local-deploy/workspace/extracted-bin ./jujud version
<sinzui> 1.17.6-precise-amd64
<thumper> sinzui: actually
<thumper> you are right
<thumper> bootstrap is broken locally
<thumper> for me too right now
 * thumper sends evil looks wallyworld's way
<natefinch> thumper: how can we know trusty works if we don't have CI running on it?
<thumper> natefinch: we cross our fingers
<natefinch> thumper: it didn't work
<thumper> sinzui: can we have CI set up for trusty?
<natefinch> :)
<natefinch> sinzui: sorry to give you more work.  I know you've been terribly bored lately.
<sinzui> natefinch, mongodb-server is forced on the unittests because juju-mongodb support was reverted
<sinzui> natefinch, ppc64el and arm64 always have juju-mongodb installed
<sinzui> natefinch, the lxc tests do *not* use juju-mongodb...it broke deployments to the cloud
<bodie_> so is this why my 14.04 getup is failing?
<bodie_> pardon me for being obvious here, but...
<bodie_> I just want to know if I need to nuke my workstation and replace it with a bulletproof version
<natefinch> bodie_: it seems juju has an incompatibility with the version of mongodb we built specifically to work with juju...
<bodie_> oy
<natefinch> bodie_: when you find a bulletproof version, let us know
<thumper> sinzui: so I landed a branch yesterday that forces juju-mongodb for all trusty deployments
<natefinch> bodie_: we stripped out the javascript engine because it had problems on some OSes... but I guess we didn't realize one of the functions we call actually gets sent out to the javascript engine.
<bodie_> ah, right
<sinzui> thumper, the aws and canonistack trusty tests are happy
<thumper> interesting
<thumper> sinzui: what do they do?
<sinzui> Deploy a wordpress stack, then another tests deploys a stack on stable, then upgrades to the RC
<natefinch> bodie_: do you want to borrow my mongod?  I can zip up the whole directory and you can just drop it on disk.  So long as you're on 64 bit, it should work
<sinzui> thumper, I can see juju-mongodb was installed and started in the last test
<bodie_> well, it's working on my remote.  I'd rather get it to legitimately work on trusty
<bodie_> ootb type thing
<bodie_> that way I know it's ... well.  trusty
<bodie_> i'll just use the vps for now
<natefinch> bodie_: I'm not sure there is a "legitimate" way right now, if you can't actually build mongodb on trusty
<bodie_> well there's the packaged version and the juju packaged version
<bodie_> packaged version seems to not work because it's 2.7.0-pre
<thumper> sinzui: was mongodb-server installed too?
 * sinzui looks
<marco-traveling> quick, what does juju set-env do
<marco-traveling> because juju help is a bit ambiguous
<sinzui> thumper, There is no evidence of it
<thumper> hi marco-traveling
<marco-traveling> hi thumper
<thumper> marco-traveling: it updates the config kept in the state database
<thumper> marco-traveling: that is used by everything
<thumper> marco-traveling: doesn't touch anything local
<marco-traveling> thumper: cool, so it doesn't update the environment variables for a hook env
<sinzui> thumper, I am confident that juju-mongodb was the only mongo installed.
<natefinch> gotta run, sorry.  Good luck with mongo everyone
<thumper> marco-traveling: no, it updates the juju environment config
<thumper> sinzui: thanks
<ev> is there a preferred way for injecting environment variables such that the juju hook context can see them? I tried baking the cloud image with /etc/environment populated with http_proxy, socks_proxy, no_proxy, etc, but that didn't make its way into $charm/hooks/hooks.py
<thumper> ev: not at this stage
<thumper> ev: what are you after?
<thumper> ev: juju is now proxy aware (for http, https, ftp and no)
<thumper> ev: no socks_proxy though
<ev> thumper: no_proxy support, basically
<ev> I can't send traffic to swift through the proxy
<thumper> ev: you can either put it in your environments.yaml, or for a running environment: 'juju set-env no-proxy=foo,bar'
<ev> thumper: if you had to do this, how would you? Right now I've hacked os.environ['http_proxy'] into every single charm, which is...less than ideal.
<ev> wait what?! How do I get this in my environments.yaml, and how did I miss this option
<thumper> ev: 'juju set-env http-proxy=http://myproxy.com'
<thumper> ev: because I'm terrible at documentation...
<thumper> ev: it is very recent...
<thumper> ish
 * thumper goes to look at docs
<thumper> ev: sorry about that...
<thumper> ev: environment now has:
<thumper> http-proxy, https-proxy, ftp-proxy, no-proxy, apt-http-proxy, apt-https-proxy, apt-ftp-proxy
<thumper> the apt values default to the plain values (ie. apt-http-proxy defaults to http-proxy)
<thumper> but can be overridden by themselves
<thumper> eg. my local provider has "apt-http-proxy: http://10.0.3.1:8000" for squid deb proxy
<ev> thumper: that's excellent
<ev> thank you
<thumper> np
<waigani> thumper: https://codereview.appspot.com/77930043
<bodie_> hmmm
<bodie_> getting "no reachable servers" in my store_test
<bodie_> anyone know what might cause that?
<bodie_> (i.e. is the store down?)
<thumper> intermittent test failure
<thumper> we all get that
<thumper> really annoying
<thumper> feel free to fix
<thumper> will trade fix for beer
 * bodie_ puts up an ultraviolet bug-zapper lamp.
<bodie_> apparently that worked
<bodie_> passed this time
<thumper> I did say it was intermittent
<sinzui> thumper, if you scratch out some notes about the proxy changes, I will write it up for the release notes and put it into the unstable docs
<thumper> sinzui: ok, ta
<ev> thumper: is there any way to tell juju to not clean up the bootstrap node on failure? I constantly find myself wanting this.
<ev> so I can peek at its brain
<thumper> ev: no... not that I'm aware of
<ev> rubbish
<ev> so if I'm seeing:
<ev> Installing package: git
<ev> 2014-03-19 21:54:31 ERROR juju.provider.common bootstrap.go:127 bootstrap failed: rc: 1
<ev> Stopping instance...
<ev> do I have any recourse?
<thumper> ev: which provider
<ev> openstack
<thumper> does it need an apt proxy?
<ev> it has one set
<ev> hmm, maybe this is on my end
<thumper> ev: can you pastebin a 'juju get-env'
<ev> let me just make sure spawning an instance normally works
<ev> thumper: that wont work without a bootstrap :)
<thumper> haha, yeah
<thumper> ev: environments.yaml config values?
<wallyworld> thumper: https://codereview.appspot.com/77960043
 * thumper looks
 * wallyworld goes to make breakfast now he has fixed his regression
<waigani> thumper: https://codereview.appspot.com/77970043
<ev> thumper: it was on my end - dns was missing
<thumper> ev: ok, cool
<ev> suspected something was up when apt /immediately/ returned 1, instead of hanging like when it's not going via the proxy :)
<ev> woooo! "starting juju machine agent"
<thumper> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1285923 is that part of your branch?
<_mup_> Bug #1285923: provider/ec2: tests fail expecting a ppc64 image for precise  <ec2-provider> <ppc64el> <juju-core:Triaged by thumper> <https://launchpad.net/bugs/1285923>
<thumper> ev: \o/
<wallyworld> thumper: not that i know of, i'd need to look
<wallyworld> could be
<thumper> ok
<bodie_> d'you guys have an opinion on goimports?
<bodie_> I like being lazy, but I don't like being wrong and lazy
<wallyworld> thumper: the arch bit is fixed. i wasn't aware of the series issue. i'll have to read that more closely to grok what the claimed issue is
<thumper> wallyworld: ok
<thumper> bodie_: I don't use it
<thumper> not yet anyway
<bodie_> is it necessary to use this PPA to get the lbox tool?
<bodie_> ppa:gophers/go
<bodie_> when I try to apt-get update, it looks like it's down
<bodie_> can't I just go get launchpad.net/lbox ?
<bodie_> giving that a shot to see if it works
<davecheney> bodie_: it's just a command
<davecheney> use go get
<bodie_> yeah, that worked
<bodie_> I'm trying to use lbox now -- it's asking me to navigate to my browser, but I'm on a headless remote.  is this a problem?
<davecheney> yes that will be a problem
<davecheney> you could try
<bodie_> like, is there any other way I can do it, or should I just rsync it to my local and do it from here?
<bodie_> bah....
<davecheney> bzr login
<davecheney> you need an OAUTH key from LP
<bodie_> yeah
<bodie_> hm
<bodie_> bzr unknown command login
<bodie_> does that require cobzr?
<bodie_> er, no
<bodie_> well, can I just push my branch and then use launchpad.net to propose it?
<bodie_> bzr lp-propose-merge?
<davecheney> bodie_: try copying the file ~/.lpad_oauth
<bodie_> from where to where?  It's not on either machine
<bodie_> can't I at least use bzr to push to my account on launchpad.net and then merge it with my web browser on my workstation?
<bodie_> er, "propose" the merge
<bodie_> oh LOL it lets me view the site in my terminal
<bodie_> excellent
<bodie_> w3m
<davecheney> sensible-browser for the win
<bodie_> hmmm
<bodie_> works til I get to here
<bodie_> bodie@juju-dev:~/go/src/launchpad.net/juju-core$ bzr launchpad-login
<bodie_> binary132
<bodie_> bodie@juju-dev:~/go/src/launchpad.net/juju-core$ bzr register-branch fix-bson-references
<bodie_> launchpad.net password for binary132@gmail.com: :
<bodie_> bzr: ERROR: Invalid url supplied to transport: "https://binary132%40gmail.com:<my password in plaintext!  yikes>@xmlrpc.launchpad.net/bazaar/": nonnumeric port
<bodie_> pardon my paste
<davecheney> bodie_: that isn't your LP login
<davecheney> thats your gmail login, yes ?
<bodie_> well, I used bzr launchpad-login to set it as binary132, which is my lpad login
<bodie_> so I'm not quite sure why it's asking for "launchpad.net password for binary132@gmail.com"
#juju-dev 2014-03-20
<stokachu> anyone know what "juju.container interface.go:55 unused config option: "use-clone" -> "false"" is referring to?
<sinzui> stokachu, I think thumper added a feature for trusty users to make lxc fast using the lxc-clone feature
<stokachu> sinzui: ah this error shows up during kvm deploys
<stokachu> which makes sense if this is only for lxc
<stokachu> and only seems to show up when i set container: kvm in the environments.yaml for local
<stokachu> otherwise lxc is used for bootstrap while the others are kvm based
<sinzui> stokachu, its a warning isn't it?
<stokachu> sinzui: yea looks to be just a warning im running into issues where kvm instances aren't starting up though
<stokachu> http://paste.ubuntu.com/7122808/
<sinzui> I see lots of warning because my configs are 1.16.6 and 1.17.6
<stokachu> if i dont set container: kvm then i can deploy kvm's just fine, however, they dont honor the machine constraints
<sinzui> stokachu, I think the clone issue is just juju being verbose
<stokachu> ok cool, ill dig some more to see why i cant get these instances started
<sinzui> stokachu, the constraint issue may be bug 1294783
<_mup_> Bug #1294783: deploy to kvm does not honor --constraints <cloud-installer> <constraints> <kvm> <local-provider> <cloud-installer:New> <juju-core:Triaged> <https://launchpad.net/bugs/1294783>
<stokachu> yea i filed that one lol
<sinzui> stokachu, sorry, my listing doesn't show reporters
<stokachu> was trying to work around it with using kvm as the bootstrap node
<thumper> stokachu: the fact that it is showing up is a bug
<thumper> stokachu: it shouldn't be
 * thumper thinks...
<thumper> yeah
<thumper> it should omit it if it isn't there
<stokachu> yea looks like it did omit it as i got the kvm instances deployed, unfortunately, i couldn't get it to honor the constraints still
<thumper> stokachu: what are you trying to do?
<thumper> with the local provider, the host is the bootstrap node
<stokachu> so im working on a cloud-installer project where the single installer bootstraps juju onto a kvm instance and deploys openstack charms on separate instances
<stokachu> ah ok
<stokachu> so with that said im trying to get the kvm instances deployed with a --constraints mem=1G
<stokachu> ive tried juju set-constraints mem=6G and used juju deploy <charm> --to kvm:0 --constraints mem=1G
<stokachu> thinking i had to set the machine constraints to something that would allow 1G instances
<thumper> umm...
<stokachu> only thing i haven't tried is juju add-machine --constraints mem=1G
<thumper> you can't deploy onto machine 0 with the local provider
<thumper> or at least you shouldn't be able to...
<stokachu> the containers are what im deploying to
<stokachu> under machine 0
<thumper> right, nope, that's not supported
<thumper> if you are trying to do crazy stuff like that, the manual provider is the recommended way
<bodie_> anyone got a clue how to propose a merge from a remote dev box?
<stokachu> http://paste.ubuntu.com/7122846/ - thats not supported?
<stokachu> machine 0 is the host machine which connects to libvirt to create the containers
<thumper> stokachu: the fact that it works is mildly surprising to me
<stokachu> it actually works better than lxc lol
<thumper> but it isn't supported...
<thumper> as in, I've not made sure it works
<bodie_> pardon my frustration, i've been sitting on this commit for like an hour while I eat dinner, hoping to get it at least proposed by the end of the night
<stokachu> so both lxc/kvm are not supported as containers to be deployed to on machine 0?
<thumper> no
<thumper> I guess it works
<thumper> but it isn't supported
<thumper> the networking isn't guaranteed to work for a start
<thumper> if it does, it's a fluke
<thumper> stokachu: the local provider works by creating containers "as machines"
<thumper> so machine 1 on the local provider is normally lxc
<thumper> stokachu: AFAIK, you are the first crazy person to try this
<stokachu> do i unlock an achievement award or anything? :D
<thumper> sure...
 * thumper hands stokachu an award
<stokachu> lol
<thumper> however...
<stokachu> so with manual provider do i need to do anything special to deploy to kvm only?
<thumper> you have successfully worked out how to have a mixed container local provider
<thumper> with no extra work from me
<stokachu> hah yea i mixed kvm/lxc together on machine 0
<thumper> no...
<thumper> do this
 * thumper thinks...
<thumper> actually, I should probably check this out locally first
<thumper> the thing is,
<thumper> the containers inside machine 0
<thumper> are just using the default dhcp
 * thumper thinks harder
 * thumper goes to read the source
<stokachu> yea whatever virbr0 is
<stokachu> i think
<stokachu> it creates a bunch of vnetX interfaces on the host machine too
<thumper> right
<thumper> so it is using the default bridge for kvm on the host
<thumper> and lxc container would be using lxcbr0
<thumper> you could... make them talk
<stokachu> yea i didnt test if they could talk to each other
<thumper> by changing 'lxc-bridge' -> virbr0
<stokachu> ooo
<sinzui> bodie_, I am not current with what you're stuck on, and I try to avoid lbox. Do you have a bzr issue?
<thumper> in the config
<stokachu> im going to try that
<thumper> then the lxc containers would use the same host bridge
<thumper> stokachu: you are crazy btw
<stokachu> haha im not even sure how i got on this path
<stokachu> it just happened
<bodie_> sinzui, I'm working on a remote VPS because my local machine isn't passing tests due to the mongo stuff.
<thumper> stokachu: and also, AWESOME
<bodie_> I have a bzr branch ready to lbox propose, but it wants me to use the web browser on the host
<stokachu> haha ty
<thumper> waigani: I think I need to log into the bot and blow away the pkg dir
<bodie_> which is headless
<thumper> waigani: it has to do with some things not being rebuilt that should be IMO
<thumper> waigani: still referring to types that aren't there any more
<thumper> waigani: not sure why go isn't rebuilding them properly
<bodie_> I tried using bzr, but I'm not sure what the workflow should be, and I'm getting stuck on bzr register-branch
<waigani> thumper: right, I should probably learn how to do that sometime
<thumper> bodie_: what are you doing?
<bodie_> I got it logged in to Launchpad.net on the remote via w3m, but it gets grumpy when I try to register-branch
<sinzui> bodie_, does bzr whoami agree with your email on lp
<bodie_> yeah
<sinzui> bodie_, does bzr lp-login prompt for you password
<sinzui> and do you agree lp has your keys
<bodie_> bzr lp-login just shows my username
<sinzui> if so, I push branches to LP before using lp for everything, but lbox sucks green donkey's ****
<bodie_> maybe the ssh key on my remote isn't my user's key
<bodie_> lol
<thumper> stokachu: ok, and that warning you were getting for the kvm local provider, that is fixed in trunk
<waigani> thumper: do I need to pass lxc Start a configFile/consoleFile or is it smart enough to read the ones passed in on creation?
<sinzui> `bzr push` works for me, though I have my bzr locations set up to push to the project. You can be explicit though....
 * thumper looks
<sinzui> bodie_, ... bzr push lp:~binary132/juju-core/my-branch
<thumper> waigani: take what is now in the create container for the start bit, and move it into start container
<thumper> waigani: then have create container call start container
<waigani> thumper: sure
<bodie_> thanks sinzui.  I owe you
<bodie_> now what?
<stokachu> thumper: sweet thanks
<stokachu> thumper: im actually testing that network-bridge option now
<sinzui> bodie_, I think lbox will honour where you put the branch, so you can run
<thumper> stokachu: so what does your provider config look like now?
<sinzui> bodie_, lbox propose -cr
<stokachu> thumper: http://paste.ubuntu.com/7122907/
<thumper> stokachu: FWIW, you don't need either of these two: "default-series: precise" or " authorized-keys-path: ~/.ssh/id_rsa.pub"
<stokachu> then i just do juju deploy <charm> --to [kvm|lxc]:0
<stokachu> ah ok good to know
<sinzui> That will probably run some tests in a tainted environment, and if satisfied will create the LP MP, followed by the RV...and then I fail half the time because it wants me to log in with an identity and password I try never to use
<thumper> stokachu: for kvm, use --to kvm:0
<thumper> stokachu: for lxc, just do as normal
<thumper> juju deploy foo
 * thumper shakes his head
<bodie_> woohoo! https://code.launchpad.net/~binary132/juju-core/fix-bson-references/+merge/211847
 * thumper mutters "crazy shit"
<bodie_> I did that via web portal
<stokachu> lol will be awesome if it works
<thumper> bodie_: or as we like to refer to it as "launchpad"
<bodie_> ahhh SHIT!  I didn't see there were conflicts.....
<sinzui> bodie_, :( conflicts.
<thumper> wallyworld: got the bot ip address?
<thumper> bodie_: always with conflicts
<thumper> all the time
<thumper> \o/
<wallyworld> 10.55.61.118
<bodie_> sigh
<sinzui> bodie_, just push the updates, Lp will see the change and update the diff
<bodie_> :) looks like very few conflicts
<bodie_> ok
 * thumper makes an alias gobot='ssh ubuntu@10.55.61.118'
<thumper> wallyworld: how can I tell if the bot is currently trying to merge?
<wallyworld> thumper: i just tail the log file in ~tarmac/logs
<thumper> waigani: reapprove your branch now
<bodie_> sinzui -- so I need to merge trunk into my branch?
<waigani> thumper: done
<bodie_> then I'll see the conflict and fix it
<bodie_> or do I need to branch trunk, merge my branch into that
<sinzui> bodie_, that's right, merge trunk. the conflicts will be listed; edit each file, then use `bzr resolve`, then `bzr commit`
<bodie_> ok
<thumper> sinzui: https://bugs.launchpad.net/juju-core/+bug/1293330
<_mup_> Bug #1293330: trusty charms are not deployable on ec2, causes provisioner to go into a restart loop <juju-core:Triaged> <https://launchpad.net/bugs/1293330>
<thumper> sinzui: you said ec2 trusty CI all green
<sinzui> It is, and you might see I tested it locally with the RC candidate. It passed
<thumper> sinzui: can you try ap-southeast-2
<sinzui> I will
<thumper> sinzui: the regions shouldn't be different, but they are
<waigani> thumper: landed!
<stokachu> thumper: looks like setting the network-bridge to virbr0 doesn't start lxc containers
<thumper> stokachu: status?
<stokachu> kvm starts but lxc containers sit in a pending state
<stokachu> looking through the logs now to see whats going on
<thumper> stokachu: is it downloading the lxc cloud image?
<thumper> stokachu: are you on raring?
<thumper> raring? no trusty
<stokachu> the host is saucy
<stokachu> it downloads the lxc-ubuntu-cloud image
<thumper> run sudo lxc-ls --fancy
<stokachu> hah http://paste.ubuntu.com/7122992/
<stokachu> i guess it does work
<stokachu> and juju ssh works
<stokachu> ohh and juju status updated its info from pending to started after i juju ssh'd
<stokachu> testing the trusty wordpress+mysql deployment :X
<bodie_> ok, think I cleared up the merge conflicts
<bodie_> ready for review! https://code.launchpad.net/~binary132/juju-core/fix-bson-references
<bodie_> really dumb fix
<stokachu> sweet
<stokachu> thumper: http://paste.ubuntu.com/7123027/
<stokachu> works
<thumper> stokachu: awesome
<stokachu> so what doesn't work is deploying to lxc on machine-0
<stokachu> but everything else started up like a champ
<thumper> yeah, don't do that
<thumper> stokachu: because the lxc containers on machine 0 are using lxcbr0
<thumper> so on a different network
<stokachu> ah ok
<sinzui> thumper, I can reproduce the issue with ap-southeast-2. I attached the only log I could see to https://bugs.launchpad.net/juju-core/+bug/1293330
<_mup_> Bug #1293330: trusty charms are not deployable on ec2, causes provisioner to go into a restart loop <juju-core:Triaged> <https://launchpad.net/bugs/1293330>
<thumper> sinzui: ta
<stokachu> thumper: i guess my last question is should i keep my findings to myself? :)
<thumper> stokachu: no
<stokachu> cool, i was thinking about a blog post but with a huge disclaimer lol
<sinzui> axw, regarding bug 1293198,  I could try setting termination protection on the instance as juju is setting up. The call to destroy the machine will fail, so we can do an autopsy on the machine.
<_mup_> Bug #1293198: cannot bootstrap with win-client <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1293198>
<axw> sinzui: one other thing you could do is run juju bootstrap with --debug
<axw> sinzui: then we can see what commands are being executed
<sinzui> axw, It didn't show me anything different before the error...it was identical to show-log from the call to apt-get update
<sinzui> I can get you the debug log in a few minutes though
<waigani> thumper: ping
<thumper> waigani: ya
<waigani> pmed you
<waigani> I'm probably off the bottom of your screen
<bodie_> I assume that by making a merge proposal it'll catch someone's attention soon enough?  or do I need to ping people in here to get it reviewed?
<axw> sinzui: there should be a "Running script on..." line before any of the apt-get, etc. feedback
<thumper> bodie_: mostly
<thumper> yes
<bodie_> all righty
<sinzui> axw, I attached the debug log to the bug
<axw> sinzui: thanks
<axw> sinzui: mkdir -p '\var\lib\juju\agents\machine-0'
<axw> :(
<axw> should be easy to fix
<sinzui> ha ha
<axw> I'll get on that
<thumper> axw: approved
<axw> thumper: thanks, just fixing ssh now...
<thumper> sinzui: I just want to confirm something
<thumper> sinzui: our trusty CI tests have trusty bootstrap nodes, yes?
<sinzui> thumper, status says so
<thumper> sinzui: ok, cheers
<thumper> just rebutting a bug
<sinzui> thumper, it download the precise?
<sinzui> I think so
<thumper> sinzui: https://bugs.launchpad.net/juju-core/+bug/1293330/comments/2
<_mup_> Bug #1293330: trusty charms are not deployable on ec2, causes provisioner to go into a restart loop <juju-core:Triaged> <https://launchpad.net/bugs/1293330>
<thumper> that is your status
<thumper> showing bootstrap on trusty
<thumper> which is good, because it matches the code
<sinzui> the checksum didn't match. I was going to download it and run sha256
<thumper> kk
<thumper> wallyworld: where is the supported series stuff for providers?
<wallyworld> um
<wallyworld> there isn't really
<wallyworld> you ask for an image for a series and it tells you if it has one
<thumper> hmm...
<wallyworld> eg you deploy cs:trusty/mysql
<wallyworld> it looks for a trusty image
<wallyworld> if there's none there, then no deployed charm for you
<thumper> wallyworld: I'm looking at the bootstrap issue
<thumper> lets say I'm on a power machine
<thumper> and I want to bootstrap to ec2
<thumper> ec2 doesn't have power
<wallyworld> there, the series comes from the env config default-series
<thumper> I don't think we should fail by default
<wallyworld> power is the arch though
<thumper> if they have explicitly specified an arch
<thumper> then we can fail if it can't find one
<thumper> if they didn't specify, then we should try with amd64
<wallyworld> that's what we do
<thumper> hmm... no it isn't
<thumper> because the tests fail
<thumper> I think it is what we said we should do
<thumper> but I don't think anyone has implemented it
<wallyworld> so you're saying it looks for an image with arch ppc64
<wallyworld> if none is specified
<axw> thumper: https://codereview.appspot.com/78070043
<thumper> looking
<thumper> wallyworld: yep
<thumper> wallyworld: and I'm saying that if we don't find ppc64, we should try amd64
<wallyworld> thumper: that's what constraints are for i guess
<sinzui> wallyworld, thumper . I just ran curl to get the tools again on the pending machine. My call got tools that passed the sha256sum
<thumper> sinzui: that is just weird
<sinzui> CI was idle. It was publishing new tools...
<wallyworld> i'm glad we put in the checksum then
<sinzui> but if a proxy is involved, maybe it delivered a version from a few hours ago
<sinzui> well I can deploy again and see if it works
<wallyworld> turn it off and on again
<wallyworld> thumper: the default fallback arch could be amd64, or it could be an env config just like "default-series" is
<thumper> wallyworld: I think amd64 would be a good fallback
<thumper> it has support everywhere
<wallyworld> yes but i'm not sure explicit is always good
<wallyworld> implicit i mean
<wallyworld> ymmv
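The fallback behaviour thumper describes can be sketched as follows; pickArch and hasImage are hypothetical names for illustration, not juju's real API:

    package main

    import "fmt"

    // pickArch sketches the rule discussed above: an explicitly
    // requested arch fails hard if no image exists for it; otherwise
    // try the client's arch and fall back to amd64.
    func pickArch(explicit, hostArch string, hasImage func(string) bool) (string, error) {
        if explicit != "" {
            if !hasImage(explicit) {
                return "", fmt.Errorf("no image for requested arch %q", explicit)
            }
            return explicit, nil
        }
        for _, arch := range []string{hostArch, "amd64"} {
            if hasImage(arch) {
                return arch, nil
            }
        }
        return "", fmt.Errorf("no image for %q or the amd64 fallback", hostArch)
    }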
<sinzui> I meant to say CI was idle, it was NOT publishing. I wouldn't have deployed using the release candidate if it was about to mutate
<sinzui> wallyworld, thumper, I deployed the charm again. It was successful. I think network/proxy is in play and my chosen test with a volatile version of juju is a factor
<wallyworld> at least that explains it
<sinzui> thumper, I don't think bug 1293330 is critical, and I am not even sure it is high.
<_mup_> Bug #1293330: trusty charms are not deployable on ec2, causes provisioner to go into a restart loop <juju-core:Triaged> <https://launchpad.net/bugs/1293330>
<thumper> sinzui: me neither
<thumper> sinzui: is it an ec2 issue?
<thumper> or was it a tools sync issue?
<sinzui> I used streams
<sinzui> I suspect a proxy. I wouldn't work on this bug unless it was affecting tools that had been released unchanged for a few days
<thumper> wallyworld: https://codereview.appspot.com/78030044/
<thumper> wallyworld: it is small
<thumper> and needed shortly by wai
<thumper> waigani
<wallyworld> ok
<thumper> boo yeah
<thumper> wallyworld: patching version.Current in two places fixes the bug
<wallyworld> \o/
<thumper> wallyworld: it was just finding out where to put them :-)
<wallyworld> i thought i got them all
<wallyworld> clearly not
<thumper> confirmed fixed on power
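For context, "patching version.Current" is a test-suite pattern; a rough sketch, assuming a gocheck suite that embeds juju-core's CleanupSuite (which provides PatchValue and restores the original value at teardown -- the suite and test names here are illustrative):

    func (s *mySuite) TestUsesPatchedArch(c *gc.C) {
        // PatchValue swaps the value for the duration of the test and
        // restores it automatically afterwards.
        s.PatchValue(&version.Current.Arch, "ppc64")
        // ... exercise the code that consults version.Current ...
    }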
<wallyworld> thumper: reviewed with bikeshed
<thumper> you know how much I love bikesheds
<wallyworld> i think it is a good suggestion
<wallyworld> much clearer intent
<thumper> yeah, looks fine
<wallyworld> to me
<thumper> I'll paste you this other real simple branch as soon as lbox is done
<thumper> https://code.launchpad.net/~thumper/juju-core/fix-ec2-test-isolation/+merge/211855
<thumper> or you can just look there
 * wallyworld waits with bated breath
 * thumper goes to make coffee
<wallyworld> thumper: only if you have time. it's smaller than it looks because of deletions. https://codereview.appspot.com/78030045
<wallyworld> otherwise i'll bother someone in europe to look
<thumper> wallyworld: I have time
<wallyworld> ok, ta
<thumper> davecheney: why did you change the importance and milestone of https://bugs.launchpad.net/juju-core/+bug/1293330 after sinzui?
<_mup_> Bug #1293330: deploys may fail on ec2 when the juju tools are new. <ec2-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1293330>
<davecheney> thumper: 'cos I think the bug is more serious
<davecheney> something is corrupting tools on download
<thumper> davecheney: what makes you think that?
<thumper> davecheney: no, nothing is corrupting it
<thumper> the tools were changing
<davecheney> fine, change it back
<davecheney> not even going to ask why tools are being overwritten
<thumper> I think it is part of the daily building of the latest code
<thumper> the CI for tip
 * davecheney puts fingers in ears
<wallyworld> thumper: do you just want a comment for VType? the code has been there for a while
<thumper> wallyworld: I just didn't know what it was
<wallyworld> we were just not marshalling it
<thumper> ok
<thumper> no comment needed
<wallyworld> ta
<thumper> simplestreams can remain a black box :)
<wallyworld> well, it's not simplestreams per se
<wallyworld> it came from the old cloud init data format
<wallyworld> before simplestreams
<wallyworld> i'll make it more verbose
<wallyworld> thumper: the pass by pointer question - not my code, but at a guess, we tend to pass structs by pointer in many places
<wallyworld> why not there
<thumper> sometimes we do, sometimes we don't
<wallyworld> well, it is sorta my new code
<thumper> if it is always passed
<thumper> then that's fine
<thumper> consider a lot of our XXXParams structs
<thumper> they are always pass by value
<thumper> not reference
<wallyworld> is there a clear idiom
<wallyworld> i prefer pass by pointer out of C++ habits
<thumper> wallyworld: I prefer pass by reference
<thumper> can't be nil
<thumper> in C++
<wallyworld> true
 * axw enjoys not having to think too much about object ownership
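A toy contrast of the two conventions under debate; the type and function names are illustrative, not juju's real params types:

    package main

    type DeployParams struct {
        ServiceName string
        NumUnits    int
    }

    // By value: the callee gets a copy and can never be handed nil,
    // which is the property thumper prefers.
    func deployByValue(p DeployParams) { /* ... */ }

    // By pointer: avoids copying large structs, but introduces nil as a
    // possible input that must be guarded against.
    func deployByPointer(p *DeployParams) {
        if p == nil {
            return // a failure mode that by-value simply doesn't have
        }
    }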
<jam1> wallyworld: I'm running into a problem where your new UploadTools logic is refusing to bootstrap the local provider. Have you run into this ?
<wallyworld> jam1: have you pulled trunk? I fixed that first thing this morning
<jam1> wallyworld: no I hadn't, that's why I was asking. Thanks
<wallyworld> np. i refactored about 3 lots of the same logic together in one method and misinterpreted a flag
<vladk> good morning
<dimitern> morning vladk
<rogpeppe> dimitern, vladk: mornin'
<dimitern> hey rogpeppe :)
<dimitern> thumper-afk, when you can please check cadmin, I filed some leave for approval
<jam1> morning dimitern, rogpeppe
<rogpeppe> jam1: hiya
<rogpeppe> jam1, dimitern, fwereade: currently looking for a review on this, if you have a little time: https://codereview.appspot.com/77600048
<dimitern> rogpeppe, looking
<rogpeppe> dimitern: ta
<axw> rogpeppe: are you working on https://bugs.launchpad.net/juju-core/+bug/1271144? I'll take a look if you're not
<_mup_> Bug #1271144: br0 not brought up by cloud-init script with MAAS provider <cloud-installer> <landscape> <local-provider> <lxc> <maas> <regression> <juju-core:Triaged by rogpeppe> <juju-core (Ubuntu):Triaged> <juju-core (Ubuntu Trusty):Triaged> <https://launchpad.net/bugs/1271144>
<rogpeppe> axw: wwitzel3 has been looking at that, i believe
<axw> okey dokey
<rogpeppe> axw: you might also want to take a look at https://codereview.appspot.com/77600048 BTW
<axw> rogpeppe: sure, will take a look
<voidspace> morning all
<rogpeppe> axw: last i heard, people were confused about the cause of the br0 issue. wwitzel3 checked it and it worked on his setup. but it failed for ahasenack. unless you have access to a MAAS, you'll find it difficult to test.
<vladk> morning jam1 voidspace
<rogpeppe> voidspace: mornin'
<axw> morning voidspace
<voidspace> vladk: axw: rogpeppe: morning
<axw> rogpeppe: yeah, I don't at the moment
<sparkiegeek> rogpeppe: axw: I can help with debugging that if you need it
<axw> rogpeppe: the specific problem I was going to look at was bridge-utils not installing
<rogpeppe> axw:  yeah
<rogpeppe> axw: that's the issue that was being looked at
<sparkiegeek> axw: I can reproduce it every time :)
<axw> sparkiegeek: thanks, I'll make sure I'm not duplicating effort first :)
<axw> that helps
<axw> rogpeppe: ok
<rogpeppe> axw: do we set AptUpdate in the cloud-init we produce at bootstrap time?
<axw> rogpeppe: no, it's done in the synch phase
<rogpeppe> axw: ah, that may well be the problem
<axw> rogpeppe: the simplest change would be to do it in both
<rogpeppe> axw: yeah
<axw> rogpeppe: I was going to modify the MAAS provider code to add it in specifically for MAAS though
<axw> anyway, will wait to see if wwitzel3 has it under control
<rogpeppe> axw: ok, why don't you do that?
<rogpeppe> axw: given that we've got sparkiegeek around and eager to test for us :-)
<axw> rogpeppe: why don't I do what?
<sparkiegeek> axw: I tried compiling 1.17.5 but I got a failure with gwacl, if you give me a binary I can run with it
<axw> sparkiegeek: okay cool, I'll try and knock one up
<rogpeppe> axw: add AptUpdate to the cloud-init for maas
<axw> sparkiegeek: got somewhere I can dump the binary?
<jam1> rogpeppe: hopefully one final review of https://codereview.appspot.com/76120044/ ?
<rogpeppe> jam1: already done
<jam1> sparkiegeek: you should be able to cd $GOPATH/src/launchpad.net/gwacl; bzr update -r 231
<jam1> we are using a slightly older version from tip because there are changes in progress
<jam1> (or bzr pull . -r 231 --overwrite, if update doesn't work)
<jam1> anyone have any ideas on: "... cannot use 37017 as state port, already in use"
<jam1> I'm trying to bootstrap the local provider
<jam1> and everything has been destroyed
<jam1> there is no local.jenv, and no jujud processes are running.
<jam1> now, I *can* telnet localhost 37017
<jam1> but I don't think anything is actually running there
<jam1> ahh... hmmm. mongod is still running
<jam1> and destroy-environment --force isn't killing it
<dimitern> rogpeppe, reviewed
<rogpeppe> dimitern: ta!
<dimitern> mgz, how's it going with https://codereview.appspot.com/77270046/ ?
<sparkiegeek> jam1: so that moved me a little further along, but now I get http://paste.ubuntu.com/7124257/
<jam1> sparkiegeek: well, the easiest thing is to use 'godeps', so "go get launchpad.net/godeps" and then "cd $GOPATH/src/launchpad.net/juju-core; godeps -u dependencies.tsv"
<jam1> which will set all your dependencies to the right version.
<jam1> rogpeppe: I think I'm comfortable enough doing "godeps -u" in the bot now, however we'll want to make sure it gets installed as part of the setup script. Do you have access to the Juju environment?
<sparkiegeek> jam1: I was just following the README (hint hint, nudge, nudge)
<rogpeppe> jam1: the gobot juju environment?
<jam1> rogpeppe: yeah
<rogpeppe> jam1: funnily enough, i was at that very moment about to look at the gobot with a view to doing just that
<rogpeppe> jam1: (because i want to update the code to use a more recent version of the ratelimit package)
<rogpeppe> jam1: i do have access, yes
<rogpeppe> jam1: (at least, i did last time i looked)
<axw> rogpeppe: sparkiegeek has live tested my change and it works, so I've reassigned that bug to myself now - proposing a fix now
<rogpeppe> mgz, jam: i'm thinking that after doing juju set on the gobot environment, we should probably do a "swift upload tarmac gobotnext.yaml" of the same data, right?
<rogpeppe> axw: thanks!
<jam1> rogpeppe: yes
<axw> fwereade: if you didn't see, I came up with a less stupid name for the API/param field: DistributionGroup
<axw> well I think it's less stupid anyway
<fwereade> axw, yeah, iliked that
<axw> fwereade: also updated the code to do the right thing for env managers
<fwereade> axw, I'm just pondering notfound vs unauthorized for unknown machines -- probably nbd but I need to think it through a bit
<axw> fwereade: no worries, no rush. this stuff is hanging around till 1.19 anyway
<axw> rogpeppe: do you have a moment to look over https://codereview.appspot.com/77890045/ ?
<rogpeppe> axw: looking
<jam1> mgz: where is https://bugs.launchpad.net/juju-core/+bug/1291165 at ?
<_mup_> Bug #1291165: empty .jenv file breaks destroy-environment and bootstrap <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1291165>
<wwitzel3> hello
<rogpeppe> axw: reviewed
<axw> rogpeppe: thanks
<axw> rogpeppe: you make a good point about just configuring the cloudinit.Config. ideally we would do both SetAptUpdate and AddPackage there, for MAAS only
<axw> rogpeppe: I'd rather get this fixed though, so I'll add a tech-debt bug
<rogpeppe> axw: sgtm
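The fix shape being agreed here, using the SetAptUpdate and AddPackage calls axw names above (the wrapper function itself is hypothetical):

    // configureMAASUserData sketches the MAAS-specific cloud-init tweak:
    // run apt-get update and install bridge-utils before the br0 setup
    // script runs.
    func configureMAASUserData(cfg *cloudinit.Config) {
        cfg.SetAptUpdate(true)         // fresh MAAS images may have no package lists yet
        cfg.AddPackage("bridge-utils") // required to bring up the bridge
    }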
<jam> morning wwitzel3, you should probably sync up with axw about the MaaS br0 bugs
<jam> wwitzel3: axw has a patch, which should be landing now
<axw> hey wwitzel3, just about to land https://codereview.appspot.com/77890045/
<axw> wwitzel3: fixes the apt-get bits, and uses ifdown/ifup instead of service networking restart
<dimitern> jam, fwereade, I was thinking of creating a blueprint for MaaS VLAN support, so we can track the progress more easily
<fwereade> dimitern, +100
<jam> dimitern: you can do so if you want, but *I* certainly find dragging cards on the Kanban board easier than editing the WIP of a blueprint
<dimitern> jam, I'll take care of updating it regularly
<wwitzel3> axw: I was actually just reviewing that :)
<axw> wwitzel3: ah, shall I hold off on approving then?
<wwitzel3> axw: I am pulling down that branch and will give it a go on my MaaS, on a node that I set up with fastpath/curtin so I can verify the fix.
<axw> wwitzel3: no idea what fastpath/curtin is, but sounds good - thanks
<axw> wwitzel3: it has been live tested on Garage MAAS, fwiw
<wwitzel3> axw: well the OP of the bug was using the fastpath installer for maas, which turns out was the main difference in why my maas node had already run apt-get update and theirs had not.
<sparkiegeek> wwitzel3: ahhh, that's what it was
<axw> wwitzel3: ah, I see. thanks, would be good to know it still works in that mode then
<sparkiegeek> axw: FWIW, it wasn't Garage MAAS, but A.N.Other MAAS
<axw> sparkiegeek: oops, thanks for correcting me :)
<perrito666> good morning
<wwitzel3> sparkiegeek: yep, that was what we narrowed it down to
<wwitzel3> morning perrito666
<sparkiegeek> axw: fastpath (AKA curtin) downloads an image and dds it onto the disk; non-fastpath is regular debootstrap
<axw> sounds nifty, I'll have to take a look
<sparkiegeek> (i'm being a bit loose on the exact behaviour, but it's broadly like that)
<rogpeppe> mgz, jam: where's a good place to install godeps on the 'bot so that it's accessible from tarmac verify scripts?
<jam> rogpeppe: ~/.local/bin IIRC
<jam> rogpeppe: we have some other stuff in there already, so it is in the $PATH for cron
<rogpeppe> jam: ah, great, that's just what i needed to know. i was trying to think of a way of working out what $PATH was in that context
<jam> rogpeppe: crontab -l
<jam> rogpeppe, wwitzel3, dimitern, waigani: standup time
<rogpeppe> vladk: ^
<rogpeppe> https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.sbtpoheo4q7i7atbvk9gtnb3cc
<wwitzel3> jam: hangouts is being a pest
<wwitzel3> attempting to get in there
<jam> wwitzel3: you probably have to be on your canonical account
<jam> we are over the 10 user threshold
<jam> so it is a @canonical only hangout to get to 15
<axw> wwitzel3: thanks for the comments. I'll land what I've got and then look at refactoring. Probably trivial, but needs more testing
<rogpeppe> voidspace: there's a meeting currently
<rogpeppe> voidspace: see above
<voidspace> ah
<voidspace> thursday early meeting!
<voidspace> rogpeppe: thanks
<rogpeppe> voidspace: yeah
<rogpeppe> voidspace: you will *just* fit in if you arrive now
<axw> alexisb: if you're awake ^^
<voidspace> rogpeppe: too late I think
<voidspace> "you aren't allowed to join this call"
<voidspace> maybe the wrong identity, will try again
<jam> voidspace: you have to join as your @canonical
<jam> voidspace: easiest IME is to use the calendar event
<jam> since it should be on the right calendar for you
<rogpeppe> voidspace: add "?authuser=1" to the url
<voidspace> jam: yea, that's what I did - thanks
<voidspace> in
<dimitern> vladk, found a nice tiny bug 1227074 we can quickly pair on later, if you want
<_mup_> Bug #1227074: runtime panic when running any juju command in a deleted directory <juju-core:Triaged> <https://launchpad.net/bugs/1227074>
<perrito666> dimitern: nothing that has panic on the title can be nice and tiny :p
<vladk> dimitern: fine, but probably tomorrow, I am going to a meeting with Mike Baker today
<dimitern> vladk, sure, np
<dimitern> perrito666, it's a silly panic anyway :)
<vladk> dimitern: do we need juju to work from a deleted directory, or just change the panic to an error?
<dimitern> vladk, the latter
<perrito666> to test things with ec2, do we have an amazon account purposed for testing or should I use my own?
<jam> dimitern: is there anything in juju that actually cares about $CWD?
<jam> I don't really care if we fail, but it seems very spurious
<natefinch> perrito666: use your own and expense it
<dimitern> jam, apparently we just panic in that case
<jam> (I know metadata generate-image cares, but aside from that)
<perrito666> natefinch: ack
<jam> perrito666: I can give you creds for the shared account
<wallyworld> mramm: bug 1248332
<jam> you're allowed to use your own and expense
<_mup_> Bug #1248332: user doc for simplestreams metadata and private clouds <docs> <juju-core docs:Triaged by evilnick> <https://launchpad.net/bugs/1248332>
<jam> but I can give you creds as well
<perrito666> jam: as you prefer, both are the same to me, I guess I'll go with mine, since I only set it up for this and it is already set in juju
<dimitern> jam, it seems the issue is cmd.DefaultContext includes the dir and that's why it reads os.Getwd() and panics
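The agreed fix shape for the deleted-directory panic: propagate the os.Getwd error instead of panicking. A self-contained sketch; Context and newDefaultContext are stand-ins for the real cmd package code:

    package cmd

    import (
        "fmt"
        "os"
    )

    type Context struct {
        Dir string
    }

    func newDefaultContext() (*Context, error) {
        dir, err := os.Getwd()
        if err != nil {
            // previously a panic; a deleted cwd now yields an error
            return nil, fmt.Errorf("cannot determine current directory: %v", err)
        }
        return &Context{Dir: dir}, nil
    }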
<jam> perrito666: sure, you'll just have to go through the expensing it at the end of every month (which is a bit of a pain)
 * perrito666 thinks that if he yawns a little bit more he will eat his coffee cup
<perrito666> jam: I guess it is analogous to the pain I had loading my national holidays :p
<jam> perrito666: but you get to do it every month, and have to copy your receipts to expenses@canonical.com
<natefinch> perrito666: I highly suggest setting up an alert for if your monthly bill goes over a certain amount (like $100).
<natefinch> perrito666: (I had a $1000 bill a couple months ago due to forgetting about machines)
<natefinch> 0.
<natefinch> 2
<perrito666> natefinch: don't worry, .ar govt takes care of making a big fuss every time I spend US dollars
<natefinch> perrito666: heh
<wallyworld> fwereade: i can land this for 1.17.6 if you +1 it https://codereview.appspot.com/78030045/
<dimitern> wwitzel3, i might be able to help you with bug 1294776 if you get stuck btw
<_mup_> Bug #1294776: No debug-log with MAAS, 14.04 and juju-core 1.17.5 <logging> <regression> <juju-core:Triaged by wwitzel3> <https://launchpad.net/bugs/1294776>
<wwitzel3> dimitern: I have to get it setup a bit first, as my local maas is running precise
<wwitzel3> dimitern: but that shouldn't take too long
<dimitern> wwitzel3, np
<rogpeppe> wallyworld: so does something like this look plausible? http://paste.ubuntu.com/7124711/
<wallyworld> looking
<wallyworld> rogpeppe: almost. the tools metadata is generated off tarballs, not just jujud. so we need to add a tar step
<wallyworld> i'd have to look at the juju build tools code to see exactly what goes into a tarball
<wallyworld> i think we could extend generate-tools to do a build step
<fwereade> wallyworld, +1, so long as it's explicit
<wallyworld> yep
<fwereade> wallyworld, it's the inscrutable magic in upload-tools that, uh, screws us
<wallyworld> yeah
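Since tools metadata is generated from tarballs, the extra build step would look roughly like this sketch (the real release scripts may package more than just the jujud binary; Close errors are ignored for brevity):

    package main

    import (
        "archive/tar"
        "compress/gzip"
        "io/ioutil"
        "os"
    )

    // buildToolsTarball wraps a jujud binary in the gzipped tarball
    // layout that tools metadata is generated from.
    func buildToolsTarball(jujudPath, outPath string) error {
        data, err := ioutil.ReadFile(jujudPath)
        if err != nil {
            return err
        }
        f, err := os.Create(outPath)
        if err != nil {
            return err
        }
        defer f.Close()
        gz := gzip.NewWriter(f)
        defer gz.Close()
        tw := tar.NewWriter(gz)
        defer tw.Close()
        if err := tw.WriteHeader(&tar.Header{
            Name: "jujud",
            Mode: 0755,
            Size: int64(len(data)),
        }); err != nil {
            return err
        }
        _, err = tw.Write(data)
        return err
    }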
<fwereade> wallyworld, https://codereview.appspot.com/78030045/ reviewed, no real blockers but I'd like a chat, brb for that
<wallyworld> sure
<rogpeppe> wallyworld: right, so that's the kind of thing i'm thinking of that we should really make trivial to do, and it doesn't appear to be currently
<wallyworld> there would be release scripts for it too
<wallyworld> since that's what the guys use for CI
<wallyworld> so we could look at packaging those
<wallyworld> so far there's been no need to do it externally (apart from release) because upload tools does it
<wallyworld> but if we get rid of upload tools, i agree 100% we need to add tooling support
<rogpeppe> wallyworld: it would actually be useful even without --upload-tools
<rogpeppe> wallyworld: i mean, even *with* --upload-tools
<rogpeppe> :-)
<dimitern> rogpeppe, fwereade, mgz https://codereview.appspot.com/76910044 - Client.ServiceDeployWithNetworks API
<wallyworld> sure :-)
<wallyworld> rogpeppe: the release scripts are hosted on lp, i can't recall the branch name right now
<wallyworld> fwereade: if you wanted a quick hangout to clarify the branch, i can do that. ping me when you are ready
<fwereade> wallyworld, I'm back in the meeting hangout
<fwereade> ah no
<fwereade> rog/michael are there, let's start a new one
<wallyworld> ok
<fwereade> wallyworld, https://plus.google.com/hangouts/_/76cpi6pcd36amc6k6f520f6lq8?hl=en
<rogpeppe> mgz: does this look reasonable as a gobot config setting? http://paste.ubuntu.com/7124775/
<rogpeppe> mgz: i've
<rogpeppe> mgz: here are the diffs: http://paste.ubuntu.com/7124790/
<rogpeppe> jam: ^
<mgz> havin' a look
<mgz> so, apart from being in the wrong format for juju set, I don't understand the diff at all
<mgz> I'm pretty terrified of go get as a tarmac test step
<rogpeppe> mgz: oh shit, it contains private keys
<rogpeppe> mgz: go get doesn't get anything if it's not already there
<mgz> yeah, that should have been paste.canonical.com :)
<rogpeppe> s/not //
<rogpeppe> mgz: do we need to change those keys now
<rogpeppe> ?
<mgz> doesn't matter too much as we don't give the bot a public ip, so only people in canonical could screw us anyway
<mgz> rogpeppe: nah, just get IS to take the paste down
<mgz> oh, well, the launchpad private key is bad
<mgz> that's free access to our trunk
<mgz> hey everyone, come modify juju-core
<jam> rogpeppe: I would *not* do go get as a test step
<jam> that, and it looks like you're wrapping the commands?
<jam> not sure why it looks that way
<rogpeppe> jam: i can't see how to get around it - how else do we add a new dependency without manually editing the configuration script?
<rogpeppe> jam: and go get is a no-op unless we've added a new dependency
<jam> rogpeppe: we have to do the work when updating a dep anyway
<jam> since you aren't doing -u
<fwereade> mgz, hey, btw, joyent-provider-storage still seems to be hanging around
<fwereade> mgz, we fixed the deps, right?
<rogpeppe> jam: when we update a dep, we can just do juju set tarmac bogus=foo
<jam> rogpeppe: well, we could do that anyway then
<rogpeppe> jam: to poke the bot into running the config-changed hook
<rogpeppe> jam: that doesn't help when we add a new dependency
<rogpeppe> jam: because the dependency won't be in trunk
<rogpeppe> jam: so go get -u .../juju-core/... won't fetch it
<jam> rogpeppe: I'm willing to have some potentially dangerous operations be manual, tbh
<jam> if it isn't something that happens all the time
<rogpeppe> jam: i'm not sure what you mean by "wrapping the commands" BTW. are you referring to the yaml quoting wrapping?
<rogpeppe> jam: why is this potentially dangerous?
<jam> rogpeppe: your diff has certain "verify_command" lines on 2 lines
<rogpeppe> jam: yaml quoting concatenates lines that aren't separated by a blank line
<jam> rogpeppe: it is downloading 3rd party code without a human actually running the command
<rogpeppe> jam: so is the "go get -u .../juju-core/..."
<jam> rogpeppe: we initiate it manually by changing config
<jam> vs everytime the bot sees a branch to merge
<rogpeppe> jam: maybe we should have a separate config setting holding additional branches to go-get
<jam> rogpeppe: that sounds like a good way to trigger it.
<rogpeppe> jam: in fact, that would be trivial to do, i think
<rogpeppe> jam: i don't think it requires any changes to the charm
 * perrito666 wishes he would stop writing git pull when he tries to bzr pull
<mgz> perrito666: `alias git="echo use bzr you fool"`
<jam> mgz: that won't work for *too* much longer :)
<perrito666> mgz: that is actually an idea I am really considering
<mgz> you just need a noreallygit alias too :)
<jam> mgz: well "\git" is that in bash, IIRC
<perrito666> heh, I will only have this machine until friday so I could very well do that
<rogpeppe> jam, mgz: ok, how about this? http://paste.ubuntu.com/7124845/
<jam> you can certainly do "\rm foo" to get around "alias rm=rm stuff"
 * rogpeppe only just manages to avoid pasting private keys again :-)
<mgz> rogpeppe: where is that unzip mongodb-server coming from
<rogpeppe> unalias -a is my friend
<perrito666> I could actually do a wrapper that checks in what kind of repo I am and do the proper thing without making me feel bad for misspelling bzr :p
<mgz> rogpeppe: I nearly poked you :)
<rogpeppe> mgz: that's already there
<rogpeppe> mgz: it's actually the end of an apt-get install line
<rogpeppe> mgz: those are two packages that it's apt-get installing
<mgz> oh dear god the diff syntax
<mgz> I see
<mgz> thanks
<rogpeppe> mgz: but yaml has wrapped the line, oh so helpfully
<rogpeppe> dimitern: i've made (almost) all the changes pwd
<rogpeppe> dimitern: ignore me for the moment
<rogpeppe> dimitern: PTAL  https://codereview.appspot.com/77600048
<rogpeppe> mgz: so does it look ok?
<rogpeppe> mgz: if so, i'll try it out
<fwereade> bodie_, ping
<mgz> perrito666: it seems you have bzr whoami borked on your local machine
<mgz> it's "Horacio n <horacio.duran@canonical.com>Dur"
<perrito666> mgz: ah, marvels of intermachine encoding
<mgz> I guess macs suck with non-ascii? :P
<fwereade> mgz, did I miss a response to the joyent-storage question?
<perrito666> mgz: that is actually ubuntu server
<bodie_> fwereade, what's up :)
<perrito666> which does indeed suck with a few things non ascii
<mgz> fwereade: probably not, it just needs landing
<mgz> something's been borked whenever I've tried, but I could actually force it through
<fwereade> bodie_, hey, I was just looking at your MP -- would it be possible to get lbox set up so I can do shiny happy line-by-line comments on the review?
<fwereade> bodie_, to be fair, I haven't seen anything that needs commenting yet
<perrito666> mgz: thanks for the heads up, I'll fix it
<mgz> perrito666: r2437.3.3 is fine, and that's the one *I* committed on ubuntu server :)
<fwereade> bodie_, but referencing particular bits of code inLP reviews is really tedious ;)
<bodie_> Understand, what's missing for me to be fully set up?
<bodie_> Rietveld?
<perrito666> mgz: again, my ubuntu server :p
<rogpeppe> natefinch: ping
<fwereade> bodie_, in one of the docs -- possibly CONTRIBUTING? it describes lbox setup
<fwereade> bodie_, I'm pretty sure you just need to auth with google on the CLI to use it
<natefinch> rogpeppe:  sorta here
<rogpeppe> natefinch: if you had a moment, i wondered if you could join this call for a few moments to discuss bootstrap-state
<fwereade> bodie_, it also enforces description format and lets you auto-link bugs and stuff
<rogpeppe> wwitzel3 too, presuming you're pairing with nate currently
<bodie_> fwereade, the problem is that I'm working on a headless remote host configured with 13.10 since my local workstation is on 14.04 and refuses to play nice
<rogpeppe> https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.sbtpoheo4q7i7atbvk9gtnb3cc?authuser=1
<bodie_> I tried using the happy MP technique in the doc and it asked me to open a web browser, which was impossible...
<wwitzel3> rogpeppe: I'm here
<fwereade> bodie_, you don't need to do anything *right now*, I've just approved it for you
<fwereade> bodie_, for which, tyvm
<fwereade> bodie_, sorry brb
<rogpeppe> wwitzel3: could you join us?
<rogpeppe> dimitern: any chance of a LGTM on that branch you reviewed?
<wwitzel3> rogpeppe: keep getting not allowed to join
<natefinch> rogpeppe: joining
<rogpeppe> wwitzel3: try changing authuser=1 to authuser=0
<rogpeppe> wwitzel3: in the URL
<jam> wwitzel3: probably need to be in your canonical account
<wwitzel3> yep on my canonical account
<wwitzel3> rogpeppe: can you send me an invite
<jam> wwitzel3: the account is hard-coded in that URL (the first or second login with "authuser")
<jam> I usually strip that off
<perrito666> mgz: apparently "I am in argentina but set this up as an english server" confused the settings a little bit and left me without a proper LC_ALL
<wwitzel3> jam: it tells me I am the first one there, then I join, and then I get the not allowed message, this is on my canonical account, tried authuser=0 and removing it altogether
<jam> wwitzel3: I just invited your Canonical account
<wwitzel3> jam: thanks
<jam> wwitzel3: well, the link worked for me, but meh :)
<bodie_> http://blog.labix.org/2011/11/17/launchpad-rietveld-happycodereviews
<bodie_> anyone know if there's a way to do this in headless mode?
<bodie_> I'm working on a remote while we get 14.04 mongo nonsense up to speed
<bodie_> speaking of which, here's today's 14.04 mongo nonsense if anyone has a few brain cells to spare: http://paste.ubuntu.com/7124946/
<dimitern> rogpeppe, i was mostly concerned with not using statetesting
<dimitern> rogpeppe, LGTM
<dimitern> rogpeppe, fwereade, I still need a review on https://codereview.appspot.com/76910044 please
<fwereade> bodie_, is that consistent? we have intermittent "no reachable servers"es that nobody's managed to get to the bottom of
<dimitern> mgz, ping
<mgz> dimitern: hey
<bodie_> i'll double-check, but i'm pretty sure this is what I got last time
<jam> fwereade: bodie_: that particular test can just be run again, we haven't managed to track down why it takes >45s to start up the server.
<jam> it doesn't usually
<dimitern> mgz, I was wondering about your serviceNetworks branch
<rogpeppe> dimitern: thanks!
<fwereade> jam, wrt dimitern's review, I feel like we *really* need API versioning :/
<dimitern> mgz, how far away are you from landing it?
<mgz> dimitern: yeah, I need to add a test and land it, have just been poking jenv things
<fwereade> dimitern, and I'm afraid I have to eat, cath's just back
<mgz> if it's blocking your path I'll go ahead and do that
<dimitern> fwereade, sure, when you can
<dimitern> mgz, almost - i can land mine without yours and then do a follow-up
<dimitern> to integrate
<mgz> dimitern: I'll go ahead and finish it up
<dimitern> mgz, cheers!
<dimitern> perrito666, are you working on "CLI "juju deploy --network/--exclude-network"" card?
<perrito666> dimitern: I began last night but got nowhere in the time I had; I am now on the bug fwereade just threw at me. you can take it if you want (I just assigned myself the next available task, I have no particular interest in it)
<dimitern> perrito666, not to worry, just checking to update the blueprint
<fwereade> mgz, can you handle the joyent call tonight? I need to stop a bit early
<mgz> fwereade: sure
<bodie_> newest 14.04 mongo stuff (you were right jam, the timeouts went away on their own -- this is what I was seeing yesterday)
<bodie_> http://paste.ubuntu.com/7125084/
<dimitern> rogpeppe, do you have time to look at https://codereview.appspot.com/76910044 ?
<rogpeppe> dimitern: am currently pairing with voidspace; should be able to have a look later.
<dimitern> rogpeppe, ok then
<rogpeppe> dimitern: could we not just add IncludedNetworks and ExcludedNetworks to params.ServiceDeploy, and make sure that the behaviour when they're empty is the same as the current behaviour of Deploy?
<rogpeppe> dimitern: i.e. keep the existing API call
<dimitern> rogpeppe, except we can't verify if the apiserver actually did anything when we set these
<dimitern> rogpeppe, (i.e. an old server will just ignore them)
<dimitern> rogpeppe, hence the new API call - we can test whether it's supported
<rogpeppe> dimitern: if we're talking to an old server, is giving an error actually better than doing something without networks?
<rogpeppe> dimitern: (given that we can't specify networks on that server)
<bodie_> did anyone have any thoughts on that test report?
<bodie_> oh
<bodie_> it's the mapreduce issue with not having the js engine
<dimitern> rogpeppe, yes, because we can detect and report it immediately, rather than hitting issues later
<dimitern> rogpeppe, if the server does not support it, that's fine - we can't do it, but it's better to know early from UX perspective
<bodie_> should I open a bug report for the issue?
<dimitern> bodie_, it seems you're using the juju-mongodb package?
<dimitern> bodie_, it doesn't have the v8 js engine built-in, hence the error
<bodie_> right
<bodie_> so that's a bug, right?
<dimitern> bodie_, please file one, yes
<bodie_> I know natefinch and a couple of others were discussing this yesterday
<bodie_> now, I'm not getting this error on my 13.10 instance
<bodie_> even though it is configured the same way (with juju-mongodb)
<dimitern> that's because the package in saucy is probably different?
<bodie_> however, I think its version of juju-mongodb is different -- 2.4.6
<bodie_> yeah
<rogpeppe> dimitern: ok, i guess that's fair enough
<rogpeppe> dimitern: i don't see why we don't make DeployWithNetworks backwardly compatible with Deploy though
<dimitern> rogpeppe, what do you mean?
<rogpeppe> dimitern: well, you currently require some networks to be set
<rogpeppe> dimitern: i don't really see that's necessary
<dimitern> rogpeppe, that's the only thing different from servicedeploy
<dimitern> rogpeppe, why'd you call it otherwise?
<rogpeppe> dimitern: if it wasn't for the backward compatibility issue, we'd just call it Deploy, right?
<rogpeppe> dimitern: in some ways it would just be better to call this DeployV2 or something
<rogpeppe> dimitern: from a client point of view, they could *always* call DeployWithNetworks rather than Deploy
<natefinch> bodie_: the current position is that juju-mongodb is *only* for trusty, and older versions should use regular mongodb (with SSL)
<dimitern> rogpeppe, yeah
<rogpeppe> dimitern: (and that's true for our client code too)
<dimitern> rogpeppe, and if we *had* api versioning it would've been even easier
<natefinch> bodie_: sorry... you hit us just as this stuff was stabilizing (but before it was actually stable)
<rogpeppe> dimitern: not sure about that - if we went william's direction, you'd need to copy the entire API code
<bodie_> all good, I got my remote working so I'm happy
<bodie_> natefinch, isn't the juju-mongodb on Saucy just a repackaging of the system default?
<dimitern> rogpeppe, any api should have means of telling you "i support this" for a specific version
<bodie_> https://bugs.launchpad.net/juju-core/+bug/1295140 <--- filed bug
<_mup_> Bug #1295140: Trusty juju-mongodb map-reduce fails due to lacking js engine <js> <mongodb> <trusty> <juju-core:New> <https://launchpad.net/bugs/1295140>
<natefinch> bodie_: I have no idea. Possibly.
<dimitern> rogpeppe, and the ability to change the interface between versions
<dimitern> esp. one we need to support for 5y
<bodie_> another question: let's say I want to run "real" mongo on my workstation (I'm working on another project that uses it, which might need the js engine.)  Can I have both packages installed?
<rogpeppe> dimitern: for this specific thing, i had in mind that we could ask the API what calls it supports, and potentially what the argument and return types look like
<bodie_> also, hullo
<rogpeppe> dimitern: that would have made it easy to add arguments to the call and still preserve the ability to check that the API implements it correctly, without needing versioning
<rogpeppe> dimitern: one other thing: you can embed params.ServiceDeploy into params.ServiceDeployWithNetworks
<dimitern> rogpeppe, yeah, that will save us a lot of boilerplate in the long run
<dimitern> rogpeppe, i did that initially, but didn't like it very much
<natefinch> bodie_: no idea about having multiples.  I just have one, which is real mongodb built with SSL.  That's probably your best bet.
<dimitern> rogpeppe, wasn't sure it'll serialize properly as well to be honest
<rogpeppe> dimitern: it serialises correctly
<rogpeppe> dimitern: here's another possibility: just add the parameters to Deploy
<rogpeppe> dimitern: but also add DeployWithNetworks as an alias for Deploy
<bodie_> yeah, that would be nice to get working
<bodie_> but since make install-dependencies installs juju-mongodb, I feel like that should work before I go tweaking things to make it work
<dimitern> rogpeppe, hmm.. that's a interesting possibility
<bodie_> I have my remote working, so I'm content until that gets cleared up
<rogpeppe> dimitern: then clients that care can call DeployWithNetworks; eventually we can deprecate DeployWithNetworks.
<dimitern> bodie_, it should be possible to have both packages running, since they use different ports for mongo, but i doubt anyone tested this
<dimitern> rogpeppe, ok, write that in the review pls
<dimitern> rogpeppe, i'll get to it once mgz lands his state changes
<rogpeppe> dimitern: ok, will do
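The embedding rogpeppe suggests keeps the wire format compatible, because encoding/json marshals an anonymous struct field's exported fields inline rather than nested. A sketch with illustrative fields (only the two type names come from the discussion above):

    package main

    type ServiceDeploy struct {
        ServiceName string
        CharmURL    string
        NumUnits    int
    }

    type ServiceDeployWithNetworks struct {
        ServiceDeploy // embedded: serialises flat, so old payloads still parse

        IncludedNetworks []string
        ExcludedNetworks []string
    }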
<bodie_> thanks :) I think I'll wait to make sure the packaged version works before changing anything
<bodie_> i've just spent a week trying to get my tests to pass at all
<bodie_> so, the fewer degrees of freedom, the better
<dimitern> bodie_,  :) fair enough
<wwitzel3> dimitern: for bug 1294776 does the node itself need to also be 14.04? I upgraded my MAAS provider to 14.04 but I have an all-machines log
<_mup_> Bug #1294776: No debug-log with MAAS, 14.04 and juju-core 1.17.5 <logging> <regression> <juju-core:Triaged by wwitzel3> <https://launchpad.net/bugs/1294776>
<dimitern> wwitzel3, istm you need 1.17.5 juju client bootstrapping on a 14.04 (v1.5 I think?) maas env
<sinzui> mgz, how goes bug 1291165 ?
<wwitzel3> dimitern: yeah, that is what I have .. 14.04 maas, 1.17.5 client, precise node ... I will install a trusty node as well and try it there.
<dimitern> wwitzel3, you're sure you rebuilt cmd/juju & cmd/jujud before bootstrapping with --upload-tools?
<mgz> sinzui: doing a little juggling, will try to land today
<wwitzel3> dimitern: yeah I removed them before I re-ran go install.
<sinzui> mgz, wwitzel3 . I think CI will pass the current rev. I will defer your bugs to the next release (which might be tomorrow if your fixes land... or I call it 1.18.0)
<wwitzel3> sinzui: thanks for the update
<mgz> sinzui: that sounds okay
<perrito666> Hey, I am trying to run a script for a test and, at some point, the script creates an instance on amazon and tries to run "juju --show-log upgrade-juju -e amazon --version 1.17.6", which fails saying there are no matching tools available, I am running juju from trunk
<perrito666> any hints?
<dimitern> perrito666, better use --debug than --show-log, and put it after the command; you won't need -e envname if "amazon" is your current (juju switch)
<perrito666> dimitern: I'll make the changes, although that is run by the testing script (which is not mine)
<dimitern> perrito666, then --version is not really needed (unless you want to force a specific version). paste the log you're getting? paste.ubuntu.com
<perrito666> dimitern: tx, I'll re-run with the debug flag
<niemeyer> Morning all
<dimitern> hey niemeyer
<rogpeppe> niemeyer: hiya
<wwitzel3> hi niemeyer
<niemeyer> Yos!
<natefinch> rogpeppe: I updated our code with your pastebin, and modified it so it compiles.... I think it's the right translation to actual code, but when we do environs.NewFromAttrs() it says "Environment is not prepared"  Which is like... uh, yeah, how can I prepare it before I have it to prepare?
<natefinch> rogpeppe: the new code: http://paste.ubuntu.com/7125553/
<rogpeppe> natefinch: ah, i think you probably need to call environs.Provider(cfg.Type(), and then call prov.Prepare on the provider
<natefinch> rogpeppe: ahh
<perrito666> dimitern: http://paste.ubuntu.com/7125593/
<rogpeppe> natefinch: alternatively you could use configstore.NewMem to create a configstore.Storage and call environs.Prepare
<rogpeppe> natefinch: i'm not sure that's worth doing though
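Pulling rogpeppe's two suggestions together as a fragment (the call signatures follow the conversation and may not match the real API exactly):

    // prepareEnviron looks up the provider for the config's type and
    // prepares through it, instead of calling NewFromAttrs on an
    // unprepared config.
    func prepareEnviron(cfg *config.Config) (environs.Environ, error) {
        prov, err := environs.Provider(cfg.Type())
        if err != nil {
            return nil, err
        }
        env, err := prov.Prepare(cfg)
        if err != nil {
            return nil, err
        }
        // Configs are immutable, so read the prepared config back via
        // env.Config() rather than reusing cfg (see the exchange below).
        return env, nil
    }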
<perrito666> dimitern: ignore the python traceback there
<mgz> rogpeppe: (or anyone) do you remember where we got to with a juju-level error for doing retries of provider transient issues at the sprint?
<mgz> we don't seem to have landed anything
<rogpeppe> mgz: i can't quite remember if we wanted to land that on the providers themselves, or the code that calls them
<mgz> a bit of both
<rogpeppe> mgz: did those gobot changes look ok to you in the end, BTW?
<mgz> rogpeppe: I have some general fears still, but I'm happy to let you try blowing things up :)
<dimitern> perrito666, are you sure there are actually 1.17.6 tools available?
<mgz> and have confidence We Can Rebuild It
<rogpeppe> mgz: i don't do the go get any more in verify, FWIW
<perrito666> dimitern: ¯\_(ツ)_/¯ most likely not
<rogpeppe> mgz: (but i'm sure you saw that)
<mgz> yeah, last version seemed much less risky
<dimitern> perrito666, what's this script testing?
<perrito666> dimitern: backup/restore
<perrito666> From what I see in their jenkins logs they were running this with .5
<rogpeppe> mgz: cool. i'll push it and see what happens.
<perrito666> perhaps I should try to fetch that rev
<dimitern> perrito666, I see
<dimitern> perrito666, try looking in the bucket for what tools are there
<rogpeppe> oh ffs, you can't do juju set with the output of juju get
<rogpeppe> that is really crappy behaviour
<rogpeppe> and the error is really not obvious
<natefinch> rogpeppe: that is ugly
<rogpeppe> huh, but... it must've worked before
<rogpeppe> i must be doing something wrong
<mgz> rogpeppe: right, it sucks
<rogpeppe> mgz: ah, no i wasn't doing anything wrong
<rogpeppe> mgz: yes it does
<mgz> you need to dedent everything a level and remove all the config junk
<rogpeppe> mgz: yeah
<rogpeppe> mgz: i remember going on about this before, but it stayed the same for compatibility reasons
<rogpeppe> mgz: but having encountered it for real, it really does suck badly
 * rogpeppe writes a little shim to automate the crappy editing required
<natefinch> rogpeppe: do I need to prepare the environment after calling provider.Prepare?  provider.Prepare returns an environ... seems like it would be weird to call environs.NewFromAttrs() to make a new environ again
<rogpeppe> natefinch: no, Prepare prepares the environment, unsurprisingly
<natefinch> rogpeppe: ok, so, when the tests run, they're still getting a "environment not prepared" error
<rogpeppe> natefinch: where are you getting the environment config attributes from?
<voidspace> natefinch: you were obviously never a boy scout then
<natefinch> voidspace: I quit after cub scouts, it's true
<voidspace> "always be prepared"...
<voidspace> bdum-tish
<natefinch> rogpeppe: from dummy.sampleconfig
<rogpeppe> natefinch: right - they need to come from the environment
<natefinch> rogpeppe: I think they do?  http://paste.ubuntu.com/7125800/
<natefinch> rogpeppe: full file: http://paste.ubuntu.com/7125804/
<rogpeppe> natefinch: no, they don't
<rogpeppe> natefinch: configs are immutable
<rogpeppe> natefinch: you need to get the env returned by Prepare and call Config on it to get the config
<natefinch> rogpeppe: ahh, ok.  weird. Sure.
<dimitern> rogpeppe, updated https://codereview.appspot.com/76910044 - is it better now?
<rogpeppe> dimitern: looking
<natefinch> rogpeppe: ok, now the tests are saying environment destroyed, which is progress, sorta.
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe, cheers
 * dimitern is away for 2h
<rogpeppe> mgz: a little workaround for juju's misbehaviour: http://paste.ubuntu.com/7125888/
<rogpeppe> mgz: i've named it "juju-set"
<voidspace> anyone using goimports with vim?
<natefinch> voidspace: probably a lot of people, but not me :)
<voidspace> hah
<voidspace> I'm currently failing to get it working
<natefinch> voidspace:  I had thought it was a drop in replacement for gofmt
<natefinch> voidspace: haven't actually used it
<voidspace> natefinch: you have to tell vim to use goimports instead
<voidspace> and the docs say: For vim, set "gofmt_command" to "goimports":
<natefinch> voidspace: or just rename it and put it ahead in the path? :)
<voidspace> but not actually showing how to do that
<voidspace> natefinch: hah
<bodie_> I was just thinking about goimports
<natefinch> voidspace: linux people are bad at directions.  I don't know why
<bodie_> would have made my first commit a lot lazier
<bodie_> but I don't trust it
<bodie_> I had to insert an import by hand into 20 files :P
<arosales> mgz: fwereade: do you guys have time to sync up with dstroppa on the joyent provider?
<natefinch> bodie_: I trust it, but I prefer to see the information about imports going into and out of the code... it's important information at times
<arosales> mgz: fwereade I have a conflict but it would be nice to see where the joyent provider currently is.
<rogpeppe> mgz: how do i get ssh access to gobot node 0 (so i can run debug-log)?
<voidspace> ah, to set a variable in .vimrc you don't use set, you use let
<voidspace> now it works
<rogpeppe> mgz: hmm,
<rogpeppe> 2014-03-17 11:53:25 INFO juju.worker.uniter context.go:255 HOOK # cd /home/tarmac/gwacl-trees/src/launchpad.net/gwacl; bzr pull --overwrite
<rogpeppe> 2014-03-17 11:53:25 INFO juju.worker.uniter context.go:255 HOOK Unable to obtain lock  held by go-bot@bazaar.launchpad.net on taotie (process #31845), acquired 3 seconds ago.
<_mup_> Bug #31845: Debian sync too soon renders uninstallable in dapper <pdp (Ubuntu):Fix Released> <https://launchpad.net/bugs/31845>
<rogpeppe> 2014-03-17 11:53:25 INFO juju.worker.uniter context.go:255 HOOK See "bzr help break-lock" for more.
<rogpeppe> 2014-03-17 11:53:25 INFO juju.worker.uniter context.go:255 HOOK bzr: ERROR: Could not acquire lock "(remote lock)": bzr+ssh://bazaar.launchpad.net/%2Bbranch/gwacl/
<perrito666> brb, lunch
<rogpeppe> mgz: i'm seeing lots of this kind of thing on the 'bot: http://paste.ubuntu.com/7126134/
<rogpeppe> mgz: are the "could not acquire lock" messages expected, or do i need to manually break the lock?
<sinzui> hi rogpeppe, natefinch : Do either of you have a minute to review this branch that increments the version: https://codereview.appspot.com/78320043
<rogpeppe> sinzui: LGTM
<sinzui> thank you rogpeppe
<rogpeppe> natefinch: how's it going?
<natefinch> rogpeppe: I think it's working, our code just expects an instance to have been created, so I have to figure out how to get one of those in the environment (our code calls env.Instances)
<rogpeppe> natefinch: you should be able to call StartInstance on the Environ
<rogpeppe> natefinch: you could perhaps do that directly after you call Prepare
<natefinch> rogpeppe: I'm surprised at how hard this is.  Is there something that'll return the right StartInstanceParams that'll give me a generic machine?  I don't know what fields to fill out, and an empty one just fails.
<rogpeppe> natefinch: your best bet is to look inside the dummy.StartInstance implementation, i'm afraid
<rogpeppe> natefinch: i could tell you what to do, but i'd have to do that first...
<natefinch> rogpeppe: that's fine
<wwitzel3> rogpeppe: aww, but I like hand holding, makes me feel safe
<rogpeppe> wwitzel3: :-)
<natefinch> rogpeppe, wwitzel3: from the package docs: The configuration YAML for the testing environment must specify a "state-server" property with a boolean value. If this is true, a state server will be started the first time StateInfo is called on a newly reset environment.
<rogpeppe> natefinch: i think you'll find that there is a state-server property already there
<natefinch> rogpeppe: yep, it's there and defaulted to true.  But calling it returns "dummy environment has no state configured."  sigh
<natefinch> it's like no one's ever actually tried to *use* this package
<rogpeppe> natefinch: calling what returns that?
<natefinch> rogpeppe: env.StateInfo() after calling Prepare().  Docs made it sound like it would start up a dummy state machine.
<rogpeppe> natefinch: why are you calling StateInfo?
<natefinch> rogpeppe: "If this is true, a state server will be started the first time StateInfo is called on a newly reset environment."
<rogpeppe> natefinch: (BTW i'm pretty sure the code i gave you (largely inherited from before) sets state-server to false)
<rogpeppe> natefinch: why do you want a state server?
<natefinch> rogpeppe: you're right, I missed that it was setting state server to false
<natefinch> rogpeppe: setting that to true makes StateInfo return "environment is not bootstrapped" which at least makes sense.
<rogpeppe> natefinch: why are you calling StateInfo?
<rogpeppe> natefinch: FWIW the docs are out of date - the state server is now started when Bootstrap is called
<natefinch> rogpeppe: I was trying to find a shortcut around having to write out a whole StartInstanceParam with MachineConfig, since I had no clue as to how to populate them correctly
<rogpeppe> natefinch: the dummy environment never creates instances unless you call StartInstance
<rogpeppe> natefinch: if you look in dummy.environ.StartInstance, it looks pretty clear which fields need to be set
<natefinch> rogpeppe: yes, but it's dumb that I have to look at the implementation to figure out what to set all those things to.  Why not just set them for me if they're not set?  Like I said, it's like no one has ever actually used this package.
<rogpeppe> natefinch: looks like MachineNonce, StateInfo.Tag, APIInfo.Tag
<rogpeppe> natefinch: mostly StartInstance is not called directly, but by higher level code that is expected to set those fields
<rogpeppe> natefinch: what you're doing is unusual, but i don't think you'll find it's that hard
<rogpeppe> natefinch: MachineConfig{MachineNonce: "bootstrap-nonce", StateInfo: &state.Info{Tag: "machine-0"}, APIInfo: &api.Info{Tag: "machine-0"}} might do the job
<rogpeppe> natefinch: oh, and MachineId: "0"
<natefinch> rogpeppe: it needs machineId, too.  Yes, it's not hard, I guess I just would have assumed someone would have made a helper method that would just set some reasonable defaults for people who don't care about the particulars of the bootstrap node.
<rogpeppe> natefinch: if it was done in more than one place, it would be worth doing
<rogpeppe> natefinch: once upon a time i seriously considered making the dummy environment create an instance at bootstrap time, but it broke loads of tests, so i didn't
<rogpeppe> natefinch: i guess you could add it as a config option
<natefinch> rogpeppe: now that I know what to look for, I see several tests are doing this, actually
<rogpeppe> natefinch: which ones, out of interest?
<natefinch> rogpeppe: worker/provisioner/kvm-broker_test.go   has a startInstance method that does it.  same with the lxc broker
<natefinch> rogpeppe: environs/jujutest/livetests  does the same sort of "make up all the fake info and call startinstance"  though, modifying it to fail on purpose
<natefinch> rogpeppe: I guess that's not several, but it's a few
<rogpeppe> natefinch: the broker tests aren't calling the dummy environ
<rogpeppe> natefinch: and in the end, there's no really good default for those parameters - they are genuine parameters to the StartInstance call
<natefinch> rogpeppe: I can make a function that takes a single string for machine id and defaults all the rest of it... that seems pretty defaultable.
<rogpeppe> natefinch: i guess if you want fake values for APIInfo and StateInfo, and the same nonce for all instances
<natefinch> rogpeppe: you just make the nonce "foobar-" + machineId
<natefinch> rogpeppe: Though to be fair, I don't really know what we use the nonce for.
<rogpeppe> natefinch: it's only used to guard against an extremely unlikely case
<rogpeppe> natefinch: that will probably never actually happen
<rogpeppe> natefinch: if the provisioner dies just after it's started the instance but before it's recorded the instance id in the state
<rogpeppe> natefinch: then when it starts again, it'll start another instance
<rogpeppe> natefinch: the nonce means that we can know when that old instance connects, that it's not the one we're expecting
<rogpeppe> natefinch: so for testing purposes, it can be anything tbh
<rogpeppe> natefinch: i wouldn't be against a testing.FakeMachineConfig(machineId string) *cloudinit.MachineConfig function FWIW
<natefinch> rogpeppe: I see. Thanks for the explanation.  And yeah, that's basically all I really wanted.  And maybe something in dummy to start up an environment easily.  I just don't want anyone to have to go through my pain again.
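A sketch of the testing.FakeMachineConfig helper rogpeppe says he wouldn't be against, filling in the fields identified above with fake values (the struct shapes are approximate and the real types carry more fields):

    // FakeMachineConfig builds a MachineConfig good enough for calling
    // dummy.StartInstance in tests; the nonce only guards against a rare
    // provisioner race, so any value works here.
    func FakeMachineConfig(machineId string) *cloudinit.MachineConfig {
        return &cloudinit.MachineConfig{
            MachineId:    machineId,
            MachineNonce: "fake-nonce-" + machineId,
            StateInfo:    &state.Info{Tag: "machine-" + machineId},
            APIInfo:      &api.Info{Tag: "machine-" + machineId},
        }
    }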
<rogpeppe> natefinch: BTW i'm seeing replicaset test failures in the 'bot: https://code.launchpad.net/~rogpeppe/juju-core/521-peergrouper-publish/+merge/211785/comments/500728/+download
<natefinch> rogpeppe: that's annoying and sucky.  I haven't really had a chance to go back and try to make them more reliable.
 * rogpeppe is done for the day
<perrito666> such is my luck, found a bug by fixing another :p
<natefinch> rogpeppe: btw, there's juju/testing/instance.go which has a StartInstance method that takes an environment and a machineid :)   It doesn't quite work,  but it's close
<natefinch> s/method/function/
<rogpeppe> natefinch: ah, i guess it needs tools
<natefinch> rogpeppe: yep
<rogpeppe> natefinch: i think you'll find that's more hassle than it's worth
<natefinch> rogpeppe: probably
<perrito666> hey what is under cmd/plugins, is it made by us too?
<perrito666> I see that restore backup seems to be re-implementing ssh and scp which are under utils
<natefinch> perrito666: yeah, it's made by us.  I'm not entirely sure why it's reimplementing those things
<perrito666> natefinch: apparently it is reimplementing scp without using the identity file from ~/.juju which fails, at least on ec2 with a very sad and undescriptive error :p
<natefinch> perrito666: bzr blame to figure out who to complain to ;)
<mfoord> rogpeppe: well done on fixing the bot :-)
<perrito666> natefinch: roger.p but I am not sure if he is the author or just someone who moved stuff
<natefinch> perrito666: I think roger wrote at least some of it.  send an email to juju-dev@lists.ubuntu.com if you want more information, I think most of the UK guys are out for the day by now.
<perrito666> natefinch: what would roger.p's name be, expanded? :p
<natefinch> perrito666: sorry, Roger Peppe, aka rogpeppe on irc
<natefinch> perrito666: he's in the UK.
<perrito666> ah I should have figured that on my own (who he is, not where)
<natefinch> perrito666: it's ok, there's a lot of people on the team
<wwitzel3> done for the day
<thumper> o/
<natefinch> thumper: morning
<wwitzel3> hey thumper
 * thumper is off to a physio appt
<thumper-physio> mramm: ping
<thumper> sinzui: how close are you to writing the release notes?
<thumper> sinzui: I should write up something on the proxy support
<sinzui> thumper, streams.canonical.com did not update, so the release is stalled.
<sinzui> you can add to https://docs.google.com/a/canonical.com/document/d/1CAN-tmQYGLdy1Dd6Ra13EjzqYDivfVZnwRhTfjAdlOQ/edit
<sinzui> and I can revise if needed
<thumper> ok, will do now
<thumper> sinzui: can you make it so I can edit?
<sinzui> oops
<sinzui> thumper, reload
<thumper> ta
<bodie_> I want to make sure I understand this bit about dynamic type.
<bodie_> http://golang.org/ref/spec#Type_assertions
<bodie_> the phrasing of this is kind of unclear.
<bodie_> "x.(T) => "asserts that the dynamic type of x is identical to the type T.""
<bodie_> I thought Go didn't have dynamic types at all
<bodie_> therefore, isn't this really saying: "interfaces are a container type; empty interfaces are satisfied by all types, and so can contain any type"
<bodie_> so it's not REALLY dynamic, just.... virtually dynamic.
<thumper> bodie_: x needs to be an interface
<bodie_> yeah
<bodie_> the phrase "dynamic type of x" threw me off, I guess
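For the record, the "dynamic type" is simply the concrete type an interface value holds at runtime; Go itself stays statically typed. A small runnable example:

    package main

    import "fmt"

    func main() {
        // x's static type is interface{}; its dynamic type is string.
        var x interface{} = "hello"

        s := x.(string) // succeeds: the dynamic type is string
        fmt.Println(s)

        n, ok := x.(int) // comma-ok form avoids the panic on mismatch
        fmt.Println(n, ok) // prints: 0 false
    }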
<thumper> sinzui: can you look over the addition there?
<thumper> sinzui: actually, time to write some more
<sinzui> I will
<thumper> sinzui: look ok? also added a section for lxc-clone
<sinzui> I see
<sinzui> thumper, this is all good
<thumper> cool
<perrito666> sinzui: ping?
<sinzui> perrito666, hello
<perrito666> have a moment for me?
<sinzui> Maybe in 15 minutes
<perrito666> sinzui: no hurry, it can wait
<perrito666> thank you
<sinzui> hi perrito666
<perrito666> hi, sorry for the impolite ping instead of a hello, I was paying attention to something else
<perrito666> say, I am working on https://bugs.launchpad.net/juju-core/+bug/1291022
<_mup_> Bug #1291022: Cannot restore a state-server on ec2 and openstack <backup-restore> <ec2-provider> <hp-cloud> <regression> <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1291022>
 * sinzui nods
<perrito666> sinzui: yet, I am getting completely different errors :|
<perrito666> http://pastebin.ubuntu.com/7127660/
<perrito666> can you tell me the specs of the installation where you ran the restore that you pasted on the ticket?
<perrito666> It would seem that it goes through machine 0 setup and blows up on machine 1
 * perrito666 wonders if anyone ever managed to restore an ec2 using this
<sinzui> perrito666, 1. you get the crucial error I see when working with hp or aws
<sinzui> perrito666, The instrumentation of failure might be different because of the credentials and my version of euca/nova?
<sinzui> perrito666, the specs? do you mean the env.yaml I used
<perrito666> sinzui: 1) ah I thought that was part of the test given the message on line 155
<fwereade> thumper, you know the uniter/debug timeouts -- I have a paste that says where they are when they timeout, in case you don't and it rings any bells: http://paste.ubuntu.com/7127717/
<sinzui> perrito666, This came from juju. I saw it on my command line for hp and ec2. I put the test into juju CI anyway wondering if I had a local setup problem:
<sinzui> error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: machine "0" has no public address
<sinzui> perrito666, juju did stand up a state server though. you can even get the status of it.
<perrito666> sinzui: I encountered some other issues (besides yours), which is what puzzled me
<sinzui> maybe juju is impatient. the public address will be available if it waits a little longer
<perrito666> sinzui: I think that is the problem, perhaps a race condition, although after that there is another bug which I solved on my local version :)
<perrito666> so you would not have arrived much further
<perrito666> perhaps it works here bc of the lag to the server
<perrito666> sinzui:  http://pastebin.ubuntu.com/7127730/
<sinzui> that's promising
<fwereade> perrito666, sinzui: fwiw we obviously *do* have that public address *somewhere* because we just sshed to it -- can we get it from there, instead of whatever we're using to look it up that's racy/faily?
<perrito666> fwereade: I think the issue is different
<perrito666> fwereade: I am in south america and my connection to amazon is rather slow :) that gives the machine time to be provisioned and get an ip
<fwereade> perrito666, fwiw I have certainly run backup/restore against ec2, and seen it work, in the past
<perrito666> fwereade: my guess is that even if we could get hold of the ip (which I think we can) we would need to wait anyway
<perrito666> fwereade: well, running 1.17.6 here I found out that restore reimplements ssh/scp but omits the identity file
<perrito666> at least in the context of the test, where an id_rsa is created for it
<fwereade> perrito666, ha, good catch
<fwereade> perrito666, that's worth a fix independent of anything else
<perrito666> other parts use utils/ssh which does use the right id_rsa and work
<perrito666> so for the sake of fixing this particular issue I hardcoded a few things on my local version to see if I could get to the end of the actual restore
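A minimal sketch of the fix implied here: shell out to scp with the juju-generated identity file instead of reimplementing the transfer without it. The key path and option list are assumptions for illustration, not restore's actual code.

    package restore

    import (
    	"os"
    	"os/exec"
    	"path/filepath"
    )

    // scpWithIdentity copies src to dst using the identity file juju writes
    // for the environment, so the copy authenticates the same way the
    // utils/ssh-based code paths do. (The key location is an assumption.)
    func scpWithIdentity(src, dst string) error {
    	identity := filepath.Join(os.Getenv("HOME"), ".juju", "ssh", "juju_id_rsa")
    	cmd := exec.Command("scp", "-i", identity,
    		"-o", "StrictHostKeyChecking=no", src, dst)
    	cmd.Stdout = os.Stdout
    	cmd.Stderr = os.Stderr
    	return cmd.Run()
    }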
<fwereade> perrito666, am I right in thinking that it's the rsyslog stuff that's failing there on machine 1?
<fwereade> perrito666, but regardless, that "need to wait anyway" is interesting,expand please?
<perrito666> sorry I was tending the laundry
<perrito666> fwereade: well, I was not able to reproduce the actual error reported by sinzui
<sinzui> perrito666, Since I have never seen a pass of my test, there are some lines that I assume will work. If the exit code of the restore is 0, the script calls status until it sees all the machines and unit agents have started. That has a 10 minute timeout.
<perrito666> I need to look into it but I guess that -> cannot get public address of bootstrap machine: machine "0" has no public address
<sinzui> perrito666, you have an error nonetheless. juju-backup exited and stated the update failed
<perrito666> means that it has to wait a bit more since the only difference between me and the test machines is an extremely poor connection
<perrito666> I will take a deeper look
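A sketch of the "wait a bit more" idea: poll for the public address with a deadline rather than failing on the first empty lookup. The interface is a stand-in for whatever state/API call restore actually uses.

    package restore

    import (
    	"errors"
    	"time"
    )

    // addressGetter stands in for the machine entity being queried (assumed).
    type addressGetter interface {
    	PublicAddress() (string, error)
    }

    // waitForPublicAddress retries until the provider publishes an address,
    // instead of returning `machine "0" has no public address` immediately
    // while the instance is still being provisioned.
    func waitForPublicAddress(m addressGetter, timeout time.Duration) (string, error) {
    	deadline := time.Now().Add(timeout)
    	for {
    		if addr, err := m.PublicAddress(); err == nil && addr != "" {
    			return addr, nil
    		}
    		if time.Now().After(deadline) {
    			return "", errors.New("timed out waiting for a public address")
    		}
    		time.Sleep(5 * time.Second)
    	}
    }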
<sinzui> perrito666, this is the test run from a few hours ago: http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/functional-backup-restore-devel/86/console
<perrito666> sinzui: I do get the connection error, which is strange because after that everything marks success
<perrito666> well back to the research then :) I guess I'll tackle those one by one. I only wish the test would not take so long
<perrito666> sinzui: thank you, you have been very helpful
<perrito666> fwereade: btw, aren't you on holiday
<perrito666> or was that yesterday?
<fwereade> perrito666, that was yesterday :)
<perrito666> agh, that always happens to me when I work from home
<fwereade> perrito666, it's ok, I'm actually programming this evening, and that makes me happy
<fwereade> perrito666, it must be late for you too ;p
<perrito666> well, I am debugging, which makes me happy, so we are all happy
<perrito666> fwereade: yes it is almost 8pm
<perrito666> got caught with this bug
<fwereade> perrito666, if you're still around when axw shows up he might have something useful to say about the rsyslog complaints on machine 1
<fwereade> perrito666, maybe drop him an email when you stop if he's not around yet
<perrito666> fwereade: sure, can you translate axw into a real name.lastname :p ?
<fwereade> perrito666, ha, sorry, it's andrew wilkins
 * fwereade thinks, and has a sudden crisis of faith
 * fwereade was right
<davecheney> pop quiz: is floating point a valid configuration value type
<davecheney> marco-traveling: and I think no
<davecheney> would anyone care to refute that point ?
<davecheney> thumper: https://bugs.launchpad.net/juju-core/+bug/1295420
<_mup_> Bug #1295420: local environment does not survive reboot on ppc64el <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1295420>
<davecheney> ^ i'm confused about this one
<davecheney> the units appear to be up
<davecheney> and there is no indication that they are looping trying to reconnect
<davecheney> in their logs
#juju-dev 2014-03-21
<davecheney> thumper: yup, this environment is now broken
<davecheney> it doesn't respond to any cli commands
<perrito666> fwereade: still there?
<thumper> davecheney: hmm... interesting
<thumper> perrito666: I'd expect not, fwereade is probably sleeping, and if he isn't, he should be
<davecheney> thumper: normally i'd be expecting the unit agents to be freaking out and restarting like crazy
<davecheney> but they are all connected, and just sitting there
<thumper> weird
<davecheney> thumper: do you want to take a look
<thumper> sure
<davecheney> what is your lp id name
<davecheney> ie, ssh-copy-id $WHO
<thumper> thumper
<thumper> :-)
<davecheney> ubuntu@winton-02:~/charms/trusty$ ssh-copy-id thumper
<davecheney> /usr/bin/ssh-copy-id: ERROR: No identities found
<davecheney> maybe i'm doing it wrong
<davecheney> yup
<davecheney> i was
<davecheney> thumper: machine is winton-02
<davecheney>   Hostname 10.245.67.2
<davecheney> copy your .ssh/config stanza and replace the hostname
 * thumper sshes in
<thumper> davecheney: something here is lying
<thumper> davecheney: I can status
<thumper> and it tells me that the machine agent for 0 is down
<thumper> which it isn't
<thumper> because it responded to status
<thumper> :-)
<thumper> davecheney: you rebooted about an hour ago?
<thumper> the three lxc machines are all showing as started
<thumper> davecheney: juju ssh 1 works, and both the machine and unit agent are running according to upstart
<davecheney> thumper: yeah, but they don't do anything
<davecheney> i did juju remove-unit mysql/1
<davecheney> and it's still there
 * thumper is reading logs
<davecheney> it's like status is jammed at some point in the past
<davecheney>  1016  juju status
<davecheney>  1017  juju remove-unit mysql/1
<davecheney>  1018  juju remove-machine 3
<davecheney>  1019  juju status
<davecheney> ^ did jack
<davecheney> thumper: try to do stuff with that environment
<thumper> hmm...
<thumper> davecheney: none of the watchers are firing
<davecheney> urk
<davecheney> but they are poll driven, right ?
<thumper> kinda
<davecheney> thumper: being told I have to go to the shops to get food for our family
<davecheney> afk for a bit
<thumper> waigani, wallyworld_: axw has power issues, will be online later
<wallyworld_> ok
<thumper> power as in electricity, not ppc64
<marcoceppi> davecheney: hey
<marcoceppi> what's that concept that's not bundles but is bundles
<marcoceppi> but is like bundles
<thumper> stacks
<marcoceppi> thanks thumper
<thumper> np
<thumper> marcoceppi: I had to turn aufs off by default with lxc-clone
<thumper> marcoceppi: too many weird edge-cases
<thumper> marcoceppi: it is however, btrfs aware
<marcoceppi> thumper: yeah, saw the release, but it will still use lxc-clone
<thumper> so will use fast snapshots
<marcoceppi> woo
<thumper> yes, still use lxc-clone by default
<thumper> but lots of i/o to create a machine as it copies ~800M
 * thumper goes to make a coffee
<waigani> agent-state on new machines (except 0) is stuck on pending
<waigani> trying to ssh into machine one fails:
<waigani> ERROR machine "1" has no public address
<waigani> all-machines log logs this error:
<waigani> ERROR juju runner.go:220 worker: exited "environ-provisioner": no state server machines with addresses found
<thumper> waigani: how long did you wait?
<waigani> lxc-ls and "uvt-kvm list" list no containers
<thumper> waigani: if you haven't done things before, it takes a while
<thumper> waigani: you need to do 'sudo lxc-ls'
<waigani> thumper 10min?
<thumper> 'sudo lxc-ls --fancy'
<waigani> still not up
<waigani> ah, I'll try that
<thumper> is it downloading the image?
<thumper> if you haven't run the local provider before, it is downloading the cloud image
<waigani> ah, there is a lot of network activity
<thumper> tail the log
<waigani> all-machines ?
<thumper> sure
<waigani> thumper: tailing and seeing activity - cheers
<davecheney> thumper: back
<davecheney> are you still using winton-02 ?
<thumper> davecheney: I have wallyworld_ and waigani looking in to replicating this locally while I finish off a much needed patch
<thumper> no, I'm out of winton-02
<davecheney> thumper: ok
<davecheney> can I manually destroy that environment
<davecheney> or do you guys still need it
<thumper> davecheney: no, kill it
<thumper> I think we have enough info to work from
<davecheney> kk
<waigani_> thumper, davecheney: lxc-start/stop machine 1 - agent-status showed started/down as expected. Restarted my machine, machine 0 agent-status: started.
<davecheney> waigani_: which mongo are you using ?
<thumper> waigani_: are you on trusty yet?
<waigani_> no :( (busted)
<davecheney> waigani_: recommendation: spin up an environment on ec2
<thumper> waigani_: can I get you to cause a change to the machine that has started up?
<davecheney> deploy cs:ubuntu/trusty
<davecheney> and use that to test
<waigani_> davecheney: MongoDB shell version: 2.4.6
<thumper> waigani_: just to confirm that the bits are hooked up
<davecheney> waigani_: dpkg -l | grep mongo
<waigani_> davecheney: 1:2.4.6-0ubuntu5
<davecheney> waigani_: wrong version
<davecheney> you need juju-mongodb
<waigani_> ah, how do I get that?
<davecheney> waigani_: 1. use trusty
<waigani_> hehe, okay
<davecheney> 2. sudo apt-get install juju-mongodb
<waigani_> tried that
<waigani_> could not find it
<davecheney> waigani_: it won't if you aren't using trusty
<davecheney> it may be available in the cloud archive if you are using a cloud image on hp cloud or ec2
<waigani_> oh these are steps, not options
<waigani_> right
<waigani_> okay, so I need to update to trusty to debug
<davecheney> waigani_: i'd recommend deploying the ubuntu charm on a cloud
<davecheney> it's faster and less likely to ruin your afternoon debugging upgrade problems
<waigani_> okay, do we have ec2 creds we can use?
<davecheney> waigani_: i can ask for a new ppc vm for you
<davecheney> probably take longer than today
<davecheney> then you can debug the problem at the source
<waigani_> davecheney: :D
<davecheney> waigani_: someone (not me) can give you the hp cloud credentials
<davecheney> that might be another solution
<waigani_> hp, okay
<waigani_> thumper: ?
<davecheney> i only say ec2 because I *KNOW* they have working trusty images
<davecheney> hp cloud, less certain
 * thumper has no hp cloud stuff
<davecheney> wallyworld_: ?
<wallyworld_> yes?
<davecheney> c'mon folks, we're developing a tool to manage public clouds, and nobody has the credentials to test on the clouds?
 * wallyworld_ reads backscroll
<davecheney> wallyworld_: do you have HP creds you can share with waigani_
<thumper> waigani_: you need to be on trusty to get the juju-mongodb
<waigani_> hehe
<wallyworld_> i do.
<waigani_> okay, I need to update to trusty anyway...
<wallyworld_> davecheney: except it's my own user name with my password
<wallyworld_> best to get a new account added by asking antonio
<wallyworld_> he has a master account he can create sub accounts from i believe
<davecheney> i think antonio is still online
<waigani_> sorry, I was off downloading trusty, what is antonio's nick?
<davecheney> arosales:
<waigani_> msg hi arosales, I'm part of tim's team. I'm told you are the keeper of cloud credentials. Would I be able to get some for ec2 please?
 * davecheney sadtrombone
<waigani_> doh forgot the /
<davecheney> waigani_: good thing you didn't include your payment details
<waigani_> lol - facepalm
<wallyworld_> thumper: on trusty, i can restart via juju run reboot and also via lxc-stop machine 1 and it shows as down and then started again and config changes seem to be propagated
<wallyworld_> i haven't retried rebooting the host yet
<davecheney> wallyworld_: the test case is
<davecheney> 1. use trusty
<wallyworld_> tick
<davecheney> 2. use juju-mongodb
<davecheney> 3. reboot
<wallyworld_> ok, 2 out of 3 ain't bad
<wallyworld_> i'll reboot
<wallyworld_> good bye cruel world
 * davecheney plays reveille
<waigani_> wallyworld_: you're going to win! hmph
<wallyworld> davecheney: well that didn't go so well, no bootstrap agents restarted after reboot
<wallyworld> well
<wallyworld> they are running but juju status fails
<wallyworld> can't connect to state api port
<wallyworld> actually, machine 1 agents are running but not machine 0
<jam> wallyworld: so the containers came back up correctly, but not the host agent. What does "service juju-agent-wallyworld-local status" say?
<jam> as well as "service juju-db-wallyworld-local status"
<wallyworld> jam: i checked those, db was running, agent was not
<wallyworld> i started agent by hand
<wallyworld> but status still fails, looking into it
<jam> wallyworld: hopefully there is something in machine-0.log ?
<thumper> waigani_: hey
<wallyworld> nope :-(
<waigani_> hello
<thumper> waigani_: sorry tab fail
<thumper> was curious about wallyworld's status
<jam> wallyworld: http://askubuntu.com/questions/207143/how-to-diagnose-upstart-errors says "/var/log/upstart/JOBNAME.log"
<wallyworld> ah found it
 * thumper goes back to write more tests
<thumper> wallyworld: found the failure
<thumper> or the log file
<jam> mine has: /bin/sh: 1: /bin/sh: cannot create /var/log/juju-jameinel-local/machine-0.log: Directory nonexistent
<wallyworld> the upstart script has the wrong log dir
<jam> which doesn't bode well, I don't know why it would be trying to log there.
<wallyworld> so the job fails
<thumper> WTF?
<wallyworld> mine also says
<wallyworld> /home/ian/jujulocal/tools/machine-0/jujud machine --data-dir '/home/ian/jujulocal' --machine-id 0 --debug >> /var/log/juju-ian-local/machine-0.log
<wallyworld> sad trombone
<thumper> and /var/log/juju-ian-local doesn't exist?
<wallyworld> nope
<wallyworld> i have a ~/jujulocal/log
<wallyworld> which is where the logs were written before i rebooted
<wallyworld> so its just the upstart script that is wrong
<thumper> ah...
<thumper> I know what it is
<thumper> the upstart script is being rewritten
<thumper> with the wrong log dir
<wallyworld> yep :-)
<thumper> i fucking knew it
<thumper> it shouldn't use the log dir from the agent
<thumper> as that is wrong
<jam> wallyworld, thumper: I see the same thing in my upstart, it is trying to redirect to /var/log/$STUFF but we should only be writing to /home/jameinel/.juju/local/$STUFF
<thumper> it isn't what is used by the local provider
<thumper> I have to finish this branch
<jam> thumper: you mean it is using the $DIR that is bind mounted inside the LXC ?
<jam> well, machine-0 can't be bind mounted
<thumper> jam, wallyworld: hangout? and I can explain what I think is the problem
<thumper> then I can go back to work
<wallyworld> sure, i think i can find it anyway
<wallyworld> but let's talk to be sure
<jam> thumper: technically, I'm not working and I have to go have breakfast, but feel free to chat with wallyworld
<thumper> haha
<thumper> kk
<wallyworld> thumper: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig?authuser=1
<thumper> calendar?
 * thumper puts head back down
<wallyworld> davecheney: sooo, there is one showstopper - the juju machine agent upstart script is wrong for the local provider, so after a reboot, no machine 0 agent for you. that doesn't explain everything you are seeing perhaps, but that's the focus right now to be fixed
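To make the failure mode concrete: the generated upstart job redirected jujud's output into a /var/log/juju-<user>-local directory that was never created, so the job's shell exited with "Directory nonexistent" before jujud ran. A hedged sketch of the shape of the fix (function and parameter names assumed):

    package local

    import (
    	"fmt"
    	"os"
    	"path/filepath"
    )

    // machineExecLine builds the upstart exec line for the local provider's
    // machine agent using the log dir the agent actually writes to, and
    // creates that directory first so the redirection cannot fail.
    func machineExecLine(jujud, dataDir, logDir, machineId string) (string, error) {
    	if err := os.MkdirAll(logDir, 0755); err != nil {
    		return "", err
    	}
    	logFile := filepath.Join(logDir, fmt.Sprintf("machine-%s.log", machineId))
    	return fmt.Sprintf("%s machine --data-dir %q --machine-id %s --debug >> %s 2>&1",
    		jujud, dataDir, machineId, logFile), nil
    }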
<thumper> wallyworld:  https://codereview.appspot.com/78660043
<wallyworld> looking
<wallyworld> thumper: i've fixed the issue, i think. gotta write a test and test live
<thumper> ugh
<thumper> I've missed a test
<thumper> wallyworld: ack, and very cool, thanks
 * wallyworld will wait for test to be added
<thumper> lboxing now
 * thumper waits...
<wallyworld> thumper: did you know destroy env for local doesn't remove the upstart scripts?
<thumper> wallyworld: it should
<wallyworld> well, it didn't just now
<thumper> that's a bug...
<wallyworld> yeah
<wallyworld> i'll file 2 bugs
<thumper> it should use the same mechanism that manual provider uses
<thumper> to remove the script
<thumper> it is possible that manual agent removal is broken too
<thumper> wallyworld: test updated and pushed
<wallyworld> 1 bug for upstart script creation for local, one for removal
<wallyworld> looking
<thumper> wallyworld: the only drive-by there is a change of an existing job
<thumper> from "host units" to "all"
<wallyworld> ok
<thumper> as the lock dir is needed on all machines
 * thumper thinks
<thumper> oh... ick
 * thumper thinks some more and looks at code
<thumper> ugh
<thumper> I remember why we did it... but it is icky
<thumper> the only machine that is not "host units" is a local provider state machine
<thumper> as normal state machines also host units
<thumper> so we don't try to run it on machine 0 for local
<thumper> which is right
<thumper> hmm...
<thumper> actually doesn't hurt
<thumper> because it uses agent dir
<thumper> and it only tries to chown if there exists a /home/ubuntu
<thumper> so I'd rather keep it as "all machines" as it better describes what is intended
<wallyworld> sounds reasonable
<wallyworld> thumper: my fix works and we now have a valid all-machines-log symlink :-D
<wallyworld> it wasn't valid before
<wallyworld> so looking good, Vern
<wallyworld> thumper: sadly, i changed the code and no tests failed :-(
<thumper> :(
<thumper> possible to write a test to save us next time?
<wallyworld> yep, that's the plan
<wallyworld> i always write a test when fixing a bug
<thumper> that's because you're AWESOME!
 * wallyworld blushes
<wallyworld> thumper: that destroy env thing - i used --force cause machine agent wasn't running. it left behind upstart scripts as well as mongo process etc. so not really a show stopper i guess
<thumper> wallyworld: we already have a bug for making --force clean up more
<thumper> lets just make sure we do it next week
<wallyworld> yep
 * thumper has hit EOW
<thumper> I've approved that branch and hope it lands
<thumper> later folks...
<arosales> waigani_, hello
<arosales> sorry I missed your ping
<waigani_> arosales: hello :)
<waigani_> I was looking for some ec2 creds. Are you the right person to talk to?
<davecheney> waigani_: hp man, hp
<waigani_> arosales: s/ec2/hp
<arosales> waigani_, hp I can do let me get those to you.
<bloodearnest> ok, lxc-clone: true is like, the best thing EVAR. You guys have made me a non-trival % more productive. 2 months before review time too! Thanks! :D
<axw> no power and no coffee makes axw go something something
<axw> sigh, gotta restart - hotplugging display link doesn't seem to work too well since I upgraded
<waigani_> axw: what's going on?
<axw> waigani_: with hotplugging?
<waigani_> thumper said you lost power or something today?
<axw> waigani_: oh yeah, power outage all morning
<waigani_> ha, that sucks for you
<waigani_> pen and paper coding...
<axw> I had like 10 minutes left on my laptop before it came back on
<waigani_> oh nice
<axw> couldn't do much without coffee though ;p
<waigani_> haha
<axw> caffeine dependence is the price for flavoursome mornings
<wallyworld> fwereade: i'm off to soccer. here's a small mp that fixes a critical 1.17.7 issue to do with local provider logging and upstart config https://codereview.appspot.com/78730043
<axw> wallyworld: what makes you think it's incorrect? thumper found that rsyslog has an apparmor profile that only allows it to write in /var/log/...
 * axw plays with it
<wallyworld> axw: the symlink was wrong
<wallyworld> the branch fixes it
<axw> ok, I'll take a look.
<wallyworld> the symlink in ~/.juju/local/log didn't point to /var/log/...
<wallyworld> the upstart file was also wrong so that a reboot didn't restart the local machine agent
 * wallyworld -> soccer
<axw> later
<rogpeppe> mornin' all
<axw> morning
<vladk> good morning
<axw> morning vladk
<axw> waigani: I've already done the local provider Destroy vs. broken environments (except for one last thing which I'm doing now), so reassigning the card to myself
<waigani> kk
<wwitzel3> hello
 * davecheney waves to wwitzel3 
<axw> hey wwitzel3
<axw> davecheney: did you get to the bottom of why local on ppc doesn't survive reboot?
<davecheney> axw: not yet
<davecheney> it's hard to reproduce the problem
<davecheney> it might be juju-mongodb
<axw> davecheney: ok, just wondering if it was connected to the bug wallyworld raised
<axw> (#1295501)
<_mup_> Bug #1295501: local provider upstart script broken <local-provider> <logging> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1295501>
<dimitern> mgz, hey
<dimitern> mgz, how is it going with the state changes?
<perrito666> good morning
<dimitern> fwereade, hey
<fwereade> dimitern, heyhey
<dimitern> fwereade, re state changes for vlans
<fwereade> dimitern, yeah
<dimitern> fwereade, you're thinking of having 2 new collections - serviceNetworks and machineNetworks?
<dimitern> fwereade, or the latter will just be a couple of fields in the machine doc?
<fwereade> dimitern, yeah -- serviceNetworks needs NoNetworks, machineNetworks just needs Networks
<dimitern> morning perrito666
<fwereade> dimitern, I'm generally against extending entity documents
<fwereade> dimitern, we have a history of screwing up watcher behaviour by doing so
<dimitern> fwereade, why just networks for machines?
<perrito666> fwereade: turns out the rsyslog thing has actually changed, apparently restore is broken, but since tests didn't pass before we did not know
<dimitern> fwereade, we should list both included and excluded ones i think
<perrito666> axw: thanks for the mail :)
<fwereade> dimitern, because the machine stuff is the record of reality, while the service stuff is the specification
<fwereade> dimitern, it's like hardware characteristics vs constraints
<dimitern> fwereade, so to get both we need to fetch the service's excluded networks and use the machine's included networks
<axw> perrito666: no worries
<fwereade> dimitern, ah-ha, thank you, something has crystallised in my mind
<dimitern> fwereade, yeah?
<fwereade> dimitern, so, looking forward, we'll want to be able to add machines with net/nonet specifications
<dimitern> fwereade, not so forward even
<dimitern> fwereade, i thought that was one of the basic features we're aiming to have for maas
<fwereade> dimitern, at the moment we take that info purely from the assigned units
<fwereade> dimitern, we kinda elided that in favour of service-only
<fwereade> dimitern, but the forces in play are actually the same as for constraints
<fwereade> dimitern, are you familiar with machine constraints?
<dimitern> fwereade, hmm.. but how about the networker worker - where will it record what networks it started/not started for a machine? and where to get which ones to process in the first place? the service?
<dimitern> fwereade, not that much
<fwereade> dimitern, ok, so when we create a new machine for a unit to live on, we record the constraints in play (env/service combination) and subsequently use those when provisioning the machine
<fwereade> dimitern, this is a bit different to the model we thought we'd have for networks, but it shouldn't be, I think
<dimitern> fwereade, yeah - so we compute the effective set and save it with the machine
<fwereade> dimitern, exactly
<fwereade> dimitern, same deal
<fwereade> dimitern, and this means that it's trivial to create a machine without units, but with net/nonet specification, and store that directly
<fwereade> dimitern, so in fact my "call it serviceNetworks" thing on mgz's review was wrong
<dimitern> fwereade, why would you do that?
<dimitern> fwereade, net/nonet spec should always go with a service
<dimitern> fwereade, well... except in case "i know what i'm doing, just give me a machine like that"
<fwereade> dimitern, people do sometimes like to create machines ahead of time and leave them idle, even if they know what they will want to do with them in future
<fwereade> dimitern, yeah
<dimitern> fwereade, why was your comment about serviceNetworks wrong?
<fwereade> dimitern, because we need to store net/nonet data for both services and machines
<dimitern> fwereade, true
<fwereade> dimitern, (mgz, perrito666): and this means we want globalKey ids, not serviceName keys
<dimitern> fwereade, what will the key be? either serviceName or machineId ?
<fwereade> dimitern, (mgz, perrito666): and *then* subsequently we want to store, separately, what-the-machine-actually-got, analogous to HardwareCharacteristics
<fwereade> dimitern, it should be the entity's globalKey
<fwereade> dimitern, like constraints/settings/any other collection that associates with multiple entity types
<dimitern> fwereade, i see
<dimitern> fwereade, sgtm
<dimitern> fwereade, i'll look some more into constraints / hardwarecharacteristics
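A rough sketch of the agreed shape (collection and field names are guesses): one document per entity, keyed by its globalKey exactly as the constraints collection does, so services and machines share one storage pattern.

    package state

    // networksDoc records the requested network inclusions/exclusions for an
    // entity. The _id is the entity's globalKey (e.g. "s#mysql" for a service
    // or "m#0" for a machine), mirroring how constraints documents associate
    // with multiple entity types.
    type networksDoc struct {
    	Id              string   `bson:"_id"`
    	IncludeNetworks []string `bson:"include"`
    	ExcludeNetworks []string `bson:"exclude"`
    }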
<rogpeppe> dimitern, fwereade, mgz: standup?
<rogpeppe> wallyworld: ^
<dimitern> jam, ^^
<dimitern> oh, he's probably off today right
<perrito666> hey, can I bzr switch to a tag?
<wwitzel3> natefinch: I merged in trunk and fixed some conflicts on my copy of 030-MA-HA and I'm pushing that up now
<wwitzel3> natefinch: I need a copy of your latest fixes though, I don't have the environment fixes
<voidspace> rogpeppe: grabbing coffee, stretching my limbs and returning
<wwitzel3> natefinch: actually you should probably merge your stuff with trunk before pushing it up, since you will actually have all the changes and I'll just have to merge again when I pull your branch
<natefinch> wwitzel3: ok
<wallyworld> axw: you still around?
<axw> wallyworld: I am now
<wallyworld> axw: the reason the upstart script failed is that the output file it is being redirected to doesn't exist
<wallyworld> jam had the same problem
<axw> wallyworld: why doesn't it exist?
<wallyworld> cause local provider creates a log file elsewhere
<wallyworld> not in /var/log
<wallyworld> my changes fix the issue and also repair the broken symlink
<axw> wallyworld: see my comment in launchpad - it's not broken on my system. so there's something more at play here
<wallyworld> ie all-machines.log -> /var/log/juju-ian-local/all-machines.log
<dimitern> wallyworld, it's /var/log/juju-<namespace>/
<wallyworld> full path: <juju local root dir>/log/all-machines.log -> /var/log/juju-ian-local/all-machines.log
<wallyworld> the above was broken
<axw> wallyworld: machine-0.log is written into ~/.juju/local/log. all-machines.log is written into /var/log/juju-<namespace>
<axw> wallyworld: and symlinks in either direction
<wallyworld> yes
<dimitern> mgz, i'll take over your lp:~gz/juju-core/networks_state_doc branch to finish it off and land it, cause it's blocking 2 of my other branches
<wallyworld> i agree with the first bit
<wallyworld> rsyslog writes to /var/log/juju-blah/all-machines.log
<mgz> dimitern: it's done, I'll land
<wallyworld> the symlink in juju local log points to that
<dimitern> mgz, ah, good morning :)
<wallyworld> before my changes the symlink pointed somewhere that didn't exist (can't recall where now)
<axw> well that is not the case on my machine
<wallyworld> and my changes also fix the broken upstart script
<axw> I don't know what you mean about the upstart script being broken
<wallyworld> wonder why it's different for you? it was broken for me, john and dave
<axw> that is also not broken on my machine
<mgz> dimitern: hm, I need an lgtm still
<wallyworld> the upstart script is broken because it redirects output to a nonexistent log file
<wallyworld> hence it fails
<wallyworld> and machine agent doesn't start
<fwereade> mgz, dimitern: can you scroll back? I diuscussed some stuff that needs to be a touch different with dimitern
<axw> I don't know, but we need to get to the bottom of it. because the change you proposed will break in another way when we do the agent.conf LogDir->rsyslog worker change
<dimitern> mgz, can you propose your changes from last reviews, i'll take a look
<fwereade> mgz, if you would land that joyent branch instead I would be most grateful
<axw> wallyworld: right, so the same issue in both cases
<axw> wallyworld: i.e. the root cause is that the log file doesn't exist
<dimitern> mgz, yeah, and look at the scrollback
<axw> wallyworld: sounds suspiciously like permissions. did you change root-dir at all?
<mgz> k
<wallyworld> not that i know of
<bodie_> morning all
<axw> morning
<mgz> fwereade: ah, okay
<wallyworld> axw: i'm not recalling the rsyslog worker change you mention
<axw> wallyworld: is this on your machine, or on ppc?
<wallyworld> my machine
<mgz> er... so, I'm not sure how to untie it from service
<axw> wallyworld: the bit about propagating MachineConfig.LogDir -> agent.Config -> worker/rsyslog
<wallyworld> i think it failed for dave on ppc, not sure
<axw> wallyworld: atm worker/rsyslog hard codes /var/log/juju-<namespace>
<wallyworld> ok, i didn't realise we were doing that
<axw> wallyworld: it will need to, to support debug-log on local
<wallyworld> makes sense to change it
<dimitern> axw, actually it's worse - it hardcodes agent.DefaultLogDir, which is /var/log/juju/
<axw> dimitern: yeah, I think log/syslog tacks on the namespace
 * axw checks
<wallyworld> so, weird that it works for some but not others it seems
<dimitern> mgz, i have some ideas and we discussed the way forward with fwereade, so if you're willing, just propose what you have so far and I can take it over and finish it, so you can do the joyent change
<mgz> fwereade: in fact, apart from undoing the renaming I've done, I'm not sure what you want, it already uses the same keys as constraints etc
<wallyworld> axw: so this was the upstart bit that was wrong
<wallyworld> /home/ian/jujulocal/tools/machine-0/jujud machine --data-dir '/home/ian/jujulocal' --machine-id 0 --debug >> /var/log/juju-ian-local/machine-0.log
<wallyworld> machine 0 log should not be in /var/log
<wallyworld> it is in juju local log
<dimitern> wallyworld, it's not, it's symlinked there from local log dir
<wallyworld> that's what my change does
<axw> wallyworld: right, but there's a symlink...
<wallyworld> not on the broken systems
<wallyworld> the only symlink i had/have is the all machines one
<wallyworld> which was also wrong
<dimitern> wallyworld, and it really should be in /var/log/juju-<namespace>/, because rsyslog only has access to /var/log/
<axw> wallyworld: so you have stuff going to /home/ian/jujulocal rather than /home/ian/.juju/local, which suggests root-dir has been changed
<axw> wallyworld: I'll see if that has anyhting to do with it
<wallyworld> yep
<wallyworld> i did change root dir
<wallyworld> but john didn't
<wallyworld> and it broke for him the same way
<dimitern> wallyworld, was there a 1.16->1.17 upgrade involved with this local env by any chance?
<wallyworld> no
<wallyworld> i just started a local provider from trunk
<dimitern> ah, ok
<mgz> fwereade: the joyent storage branch has conflicts
<axw> wallyworld: still works when I do that...
<wallyworld> hmmm
 * wallyworld shrugs
<wallyworld> dimitern: axw: i just got back from soccer and need to go eat. i'll bbiab
<mgz> the bigger issue is I'm not sure if dstroppa wants the move of gojoyent in or not
<bodie_> question
<bodie_> juju-mongodb is broken on 14.04
<axw> wallyworld: nps, I'll keep investigating
<bodie_> I'm thinking of building from source just so I can make use of my workstation
<mgz> I guess I just try to land as is, and we can fiddle with deps later
<bodie_> do I need to remove juju-mongo from my path until things are cleared up?
<bodie_> rather, juju is broken wrt juju-mongo...
<wwitzel3> fwereade: I am unable to replicate bug 1294776 on my local MAAS even after upgrading the provider and node to 14.04
<_mup_> Bug #1294776: No debug-log with MAAS, 14.04 and juju-core 1.17.5 <logging> <regression> <juju-core:Triaged by wwitzel3> <https://launchpad.net/bugs/1294776>
<wwitzel3> jamespage: ^
<axw> dimitern: btw, worker/rsyslog does do the appending of namespace to logdir (at the bottom of newRsyslogConfigHandler - a little bit non-obvious)
<jamespage> wwitzel3, I see that on openstack as well btw
<jamespage> just updated that bug
<axw> jamespage: did you see my comment?
<jamespage> axw, it does have rsyslog-gnutls installed
<jamespage> (the paste was from 1.17.6 openstack deploy)
<axw> jamespage: thanks, hadn't seen the paste
<jamespage> axw, only just did it :-)
<axw> jamespage: and the ls -l please, if you didn't see that
 * jamespage is testing 1.17.6 prior to asking the release team for an ack on an upload
<jamespage> axw, that paste is foobar
<jamespage> (the apt-cache policy one)
<jamespage> wrong box
<axw> heh, so I see
<jamespage> axw, pasted right one this time!
<axw> hrm, ok, doesn't really shed any light unfortunately
<axw> wallyworld: when you get back, can you please destroy your env, remove any /var/log/juju*, remove ~/jujulocal, and bootstrap with --debug?
<axw> then pastebin that and ~/jujulocal/log/cloud-init-output.log
<axw> wallyworld: just gonna dist-upgrade and see if that's it
<wwitzel3> jamespage: thanks for the update
<jamespage> wwitzel3, np - if you want me to poke anything else just ping me here - don't always look at bugs straight away
<axw> wallyworld: no difference for me after dist-upgrading, so it's not a new policy...
<wallyworld> axw: trying again now, just deleting stuff
<wallyworld> axw: deleting all that stuff and trying again, seems to have fixed it. there's now a /var/log/juju-ian-local/machine-0.log symlink and the all machines one is correct
<wallyworld> must have been left over root stuff
<axw> hmm
<axw> strange
<wallyworld> but bad that we didn't error
<wallyworld> i think we need to do a local 1.16 or 1.17 set up and destroy and then try again with 1.18
<axw> I'd like to repro... there's obviously something that needs to be fixed
<axw> ok
<axw> I'll try that in a bit
<wallyworld> ok, i'm a bit too weary to do much more tonight
<axw> nps, I will dig in and let you know how I get on
<wallyworld> ok, i'll update the bug
<bodie_> confirming juju-core works on 14.04 with source build of mongodb
<dimitern> wallyworld, axw, btw I have this handy cleanup-juju script to obliterate mercilessly any remnants of a local environment: http://paste.ubuntu.com/7130453/ (call as sudo cleanup-juju and change localenv to your envname - preferably unique for pgrep's sake, and obviously dimitern to your username)
<wallyworld> cool, will be handy thanks
<dimitern> sometimes you need to run it twice in a row, if the agents/mongodb are doing stuff, to clean everything up
<axw> thanks dimitern
 * dimitern thinks it's about time for another "snippet" type blog post
<mgz> gah, damn criss-crosses
<mgz> I think I'm screwed
<rogpeppe> lunch
<rogpeppe> mgz: if you have a moment at some point, i'd really like to find out why the 'bot keeps getting "lock held" messages, which break the config-changed hook
<rogpeppe> mgz: sample message: Unable to obtain lock  held by go-bot@bazaar.launchpad.net on taotie (process #21172), acquired 4 seconds ago.
<mgz> rogpeppe: because it's doing things
<mgz> we talked about this the other day, you can manually take the lock on the machine, but it should mostly be harmless
<mgz> fwereade: so, I'm being partly hosed on the joyent storage landing because ian/daniele landed some of the history, without the changes, as an unrelated bug fix in r2401
<mgz> which is requiring some major unpicking
<fwereade> mgz, ouch
<fwereade> mgz, btw, anything significant in the call yesterday?
<mgz> no, didn't happen as neither antonio nor the other joyent guy could make it
<mgz> wanted to catch up with daniele but missed him
<sinzui> jamespage, do you have any advice for Bug #1295609
<_mup_> Bug #1295609: Unable to bootstrap local provider, missing juju-mongodb dependency <mongodb> <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1295609>
<dimitern> I was just hit by that after the last update - had to install juju-mongodb to be able to bootstrap
<sinzui> dimitern, I was misguided. I restored the previous packaging rule believing juju *preferred* juju-mongodb, but would fall back to mongodb-server
<sinzui> Without fallback we need to create different packaging rules for the series.
<dimitern> sinzui, yeah seems so
<sinzui> well, maybe jamespage has a clever idea
<jamespage> sinzui, I've switched to have only juju-mongodb in the packaging to enforce the dependency
<jamespage> but I've not got a release team ack on that yet
<natefinch> wwitzel3: I made the fix William suggested in the meeting and I merged from main, rerunning the tests now
<sinzui> jamespage, you did that for just the trusty packaging?
<jamespage> sinzui, yes
<wwitzel3> natefinch: great, is that pushed up?
<jamespage> sinzui, I've not got an ack for the upload just yet
<sinzui> hmm, in my case mongodb-server would be listed as removable, but I shouldn't do that if I have an env up
<jamespage> sinzui, obviously that breaks on older releases so its not a clean backport
<natefinch> wwitzel3: the fix william suggested is (though there's one syntax error that I missed somehow.... but it'll be obvious).  After the merge I'm getting compile errors in the build, so I'm figuring those out now
<jamespage> sinzui, indeed it would
<sinzui> jamespage, I can update the packaging in a few hours for trusty users of the ppa.
<jamespage> sinzui, OK - hoping for an ack on the 1.17.6 upload
<jamespage> I tested on local, maas and openstack OK
<sinzui> jamespage, oh, I was pondering using this on HP cloud to create a 5 node maas for Juju CI http://manage.jujucharms.com/~virtual-maasers/precise/virtual-maas
<sinzui> jamespage, Do you think it would work?
<jamespage> sinzui, ah - we might have a good plan on that
<jamespage> sinzui, rharper in my team has hacked together a openstack integration for MAAS
<jamespage> sinzui, so he can have MAAS control openstack instances that netboot and install like regular nodes
<sinzui> jamespage, He has been updating http://manage.jujucharms.com/~canonical-ci/precise/virtual-maas
<jamespage> sinzui, yeah - he started looking at that stuff
<jamespage> sinzui, and then he and smoser hacked on this approach as well
<jamespage> lemme ping him
<jamespage> sinzui, the nice thing about this is we could just charm it all up and deploy multiple different series on serverstack as test environments
<smoser> sinzui, hp cloud doesn't expose neutron does it?
<smoser> the limiting factor for all of this is that you have to get a second network from the primary.
<smoser> because you're not going to successfully run a dhcp server on your primary interface.
<natefinch> rogpeppe: what's with this error?  It's not the normal syntax error string... there's no filename or anything:  src/launchpad.net/juju-core/cmd/juju$ go test
<natefinch> # testmain
<natefinch> launchpad.net/juju-core/testing/testbase.PatchEnvPathPrepend(0): not defined
<natefinch> type.launchpad.net/juju-core/testing/testbase.Restorer(0): not defined
<natefinch> FAIL	launchpad.net/juju-core/cmd/juju [build failed]
<smoser> it will work on canonistack or server stack, but i couldn't find a public cloud that exposed both neutron and used kvm.
<smoser> perhaps you could make this work on xen hvm (rackspace).
<sinzui> smoser, yeah, the network has been a factor in other things I need to test.
<axw> natefinch: waigani was getting that, had to delete the pkg dir
<natefinch> axw: I thought it might be something like that. Thanks.
<axw> nps
<smoser> but what rharper has done is really REALLY cool.
<smoser> if we could get a public cloud that had kvm and neutron, then anyone with a credit card could trivially deploy maas
<smoser> and it tests pretty much all of maas (from pxe boot to power control)
<sinzui> smoser, :/ so I probably need to wait, but with some research I might find a cloud that will work
<sinzui> smoser, I am not familiar with joyent's setup, might that be viable when they are available?
<smoser> i went looking a few days ago.
<smoser> and didn't find anything really.
<smoser> the closest to having the necessary pieces are EC2 and rackspace that i know of
<smoser> and both of those are xen.
<smoser> which adds a wrinkle.
<smoser> and you can't specify '--kernel' on EC2, so that's another wrinkle
<sinzui> thank you very much smoser. At least I know the criteria to pull this off. Not next week though as I hoped
 * sinzui copies conversation
<smoser> on a "full openstack" like you get on canonistack or server stack, its doable.
<smoser> and really very cool to see.
<dimitern> mgz, you've got a review on https://codereview.appspot.com/77270046/
<rharper> jamespage: sinzui  re:virtual-maas charm; I'm currently using the non-juju mode of it; I've got an update to setup-maas and pulled in a few scripts (partially based on the maas-to-libvirt-tools) to automate the creation/registration of the nodes
<sinzui> rharper, great news. That would be more helpful to me
<sinzui> rharper, I hesitate to run a 5 node vmaas on canonistack. I moved most of juju-ci off that cloud because we were using most of the resources
<rharper> sinzui: I've got a deadline I'm working toward next week, but after that, I can push a branch against virtual-maas charm with the changes for review -- I have a branch, but not pushed against upstream maas; though I'm currently  applying the nova power type patch in-line in virtual-maas charm since we're working with 1.4.X maas out of the cloud-archive
 * sinzui nods
<natefinch> wwitzel3: well, frig, the fakeuploadtools thing was working before I merged from main :/
<wwitzel3> natefinch: I'm in a similar boat, I can't bootstrap a 14.04 node on maas with trunk .. it was working before I pulled
<natefinch> wwitzel3: I just pushed up a merge from main.
<natefinch> wwitzel3: it doesn't fix anything, but at least we'll both be broken with the same code
<wwitzel3> natefinch: k, will pull it down .. is the fakeuploadtools broken in a test as well? or just when trying to use local provider?
<natefinch> wwitzel3: the fake upload tools stuff is not related to the local provider.  There seemed to be some confusion about that in the standup.  The local provider is broken.  What was fixed yesterday was just being able to run the bootstrap tests we'd been working on.  Of course, now those tests are broken again
<wwitzel3> natefinch: ok, that was my fault
<perrito666> ah I finally caught the bug :D
<perrito666> rogpeppe: have a sec?
<wwitzel3> natefinch: I will take a peek at these tests, see what I see
<natefinch> wwitzel3: there's a panic in bootstrapsuite not being able to find the tools , even though the first thing we do is upload the tools.  I tested that before I merged from main and it worked.  I don't know why it's different now
<axw> rogpeppe: part one of removing things we don't need to do now that we have synchronous bootstrap :)
<wwitzel3> natefinch: tried to do a bisect for more details, but sadly it doesn't know how to step through the commits that make up the merge, so it just tells me that r2288 is bad and gives me the diff between that and r2286 .. lol, not helpful
<mgz> dimitern: on your review, I can revert the naming easily enough, but I think I need your help understanding some state subtleties
<mgz> if you have a mo to walk me through some things
<mgz> dimitern: er, you probably missed that, I'll requery
<dimitern> mgz, yep, my internet started acting funny
<wwitzel3> natefinch: fixed the BootstrapSuite
<natefinch> wwitzel3: wow sweet.  What did you do?
<wwitzel3> natefinch: we were uploading the tools to a newly created stor and then never doing anything with it. So I changed it to just upload the tools to env.Storage()
<wwitzel3> natefinch: pushed
<wwitzel3> natefinch: that was a lie .. I can't type my password right .. pushing now
<wwitzel3> natefinch: ok, you can grab it lp:~wwitzel3/juju-core/030-MA-HA
<natefinch> wwitzel3: you shouldn't need to type your password every time.... I haven't typed mine in months... not even sure what my password is, to be honest
<wwitzel3> natefinch: i know, i know, I just haven't setup an agent yet
<perrito666> natefinch: I type mine every time, default ubuntu server install lacks an agent and I am too lazy to set up one
<natefinch> can't you just get it to use your launchpad key?  maybe I set up an agent and it's been too long, I don't know
<perrito666> natefinch: I get a decent linux work machine this afternoon :) Ill take the time to set up my workspace correctly
<wwitzel3> natefinch: my lp key has a passphrase, I'm not typing my lp password
<natefinch> wwitzel3: oh, I see. Yeah, I don't think I put a passphrase on mine.  Too lazy.
<perrito666> natefinch: ah, yes, same here than wwitzel3
<perrito666> natefinch: we are different kinds of lazy
<natefinch> I only use my LP key for lp, so no one could do anything with it except commit as me ;)
<natefinch> wwitzel3: thanks for the fix. I must have messed it up when I converted to using upload tools, I swear it worked at one point, but maybe I'm just crazy :)
<wwitzel3> natefinch: could have been an artifact of the merge
<mgz> eval `ssh-agent`; ssh-add
<mgz> is not very hard...
 * perrito666 crafts a particularly egregious commit that angers all of the most violent devs... using natefinch key
<perrito666> amazon restored
<perrito666> restore missing state server
<perrito666> fwereade: :D it worked
<wwitzel3> mgz: thanks, done :)
 * fwereade cheers at perrito666
<wwitzel3> natefinch: ok, so now it is just local provider?
<mgz> dimitern, fwereade: state networks branch should be good to go, you may just want to look over last changes/comments
<natefinch> wwitzel3: sorry, had to go for a bit for some family stuff.  back now.  Yes, just local provider
<vladk> dimitern: do you have a moment to look over https://codereview.appspot.com/78660045/ ?
<natefinch> wwitzel3: well, I found the source of one bug, possibly several... environs/cloudinit had its own constant for the mongo service name
<dimitern> vladk, reviewed
<natefinch> wwitzel3: er environs/cloudinit.go
<natefinch> and this is why you only define constants in *one* place
<natefinch> wwitzel3: local provider was also creating its own mongo service name
<natefinch> wwitzel3: in provider/local/environ.go
<natefinch> wwitzel3: I haven't fixed anything yet, just finding some problems so far
<wwitzel3> natefinch: ok wil start poking around from there
<natefinch> wwitzel3: I'm working on basically removing all references to MongoServiceName everywhere, no one needs to know it except the thing creating the upstart service.  I added a RemoveService() function to the mongo package to replace one place where we were using the name outside that package
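The shape of that change, sketched with assumed helpers: the service name becomes an unexported detail of the mongo package, and callers get an operation instead of the name.

    package mongo

    import "fmt"

    // serviceName now lives in exactly one place; the duplicated constants in
    // environs/cloudinit and provider/local were the source of the bug.
    const serviceName = "juju-db"

    // service abstracts the upstart job (a stand-in for the real helper).
    type service interface {
    	StopAndRemove() error
    }

    // newService is a hook so tests can substitute a fake (an assumption,
    // not necessarily the real package's mechanism).
    var newService func(name string) service

    // RemoveService stops and removes the mongo upstart job for a namespace,
    // so no caller outside this package ever needs to know the name.
    func RemoveService(namespace string) error {
    	return newService(fmt.Sprintf("%s-%s", serviceName, namespace)).StopAndRemove()
    }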
<perrito666> hey, could anyone with bash fu take a look at https://codereview.appspot.com/78870043 ?
<perrito666> thank you
<mgz> perrito666: I do not fulfill the requirement :)
<perrito666> mgz: my fault, I just committed go code and yet I request a bash dev to check it :p
<natefinch> perrito666: sorry, my bash is possibly worse than my spanish, and I haven't taken spanish in 20 years.
<mgz> we need scott or someone :)
<perrito666> natefinch: well I havent taken spanish in about 13 yrs and I am pretty good at it :p
<mgz> living in south america is cheating
<mgz> then again, living in the US should be cheating too
<natefinch> perrito666: I can ask where the bathroom is, and that's about it.  If bash involves more than calling commands and optionally piping into other commands, I'm out
<perrito666> well `find . -iname "bathroom"`
<natefinch> I never understood why they didn't write the command so I could just say find bathroom like a normal person.
<natefinch> or possibly find bathroom .
<rogpeppe> perrito666: looking
<perrito666> natefinch: normal people dont use find to find out where the bathroom is
<natefinch> perrito666: I've never been accused of being normal :)
<rogpeppe> perrito666: reviewed
<rogpeppe> natefinch: bash is a ridiculous shell
<perrito666> rogpeppe: lovely thank you
<natefinch> rogpeppe: this is why I don't write shell scripts.  Why write in some wacky language when you can write in a real language?
<rogpeppe> natefinch: 'cos the basic shell syntax is almost perfect for human use
<rogpeppe> natefinch: and pipes are awesome
<perrito666> rogpeppe: yes, just humans who are not suitable for bash
<perrito666> rogpeppe: [[ $agent = unit-* ]] && [ -d "$agent/state/relations" ] does not do the same as my test
<natefinch> rogpeppe: the only part of the shell syntax that seems at all logical is pipes
<bodie_> some people practically only use bash
<rogpeppe> i only use rc
<perrito666> relations will exist, its the folders under it that wont
<natefinch> bodie_: some people sleep on beds of nails, too.
<bodie_> I worked at digitalocean with a guy from IBM who pretty much only knew Bash, but muddled along with other languages when necessary
<bodie_> i tried to reason with him
<bodie_> :(
<rogpeppe> perrito666: i think my test is equivalent to yours
<rogpeppe> perrito666: (your test only checked that ls could read the relations directory, AFAICS)
<natefinch> pretty much any time I need a conditional statement, I drop into a real language (python or go, pretty much, and my python's getting rusty)
<bodie_> who needs conditionals when you can have one-line perl maps from hell?  heh
<natefinch> bodie_: I also never write regexes unless someone is twisting my arm
<natefinch> bodie_: which pretty much puts perl off limits, which is fine with me
<perrito666> rogpeppe: nope, ls -A folder/ lists all inside it excepting . and ..
<bodie_> haha, avoiding regex is always good.
<rogpeppe> perrito666: yes; if the relations directory is empty, your test will succeed
<rogpeppe> perrito666: it's possible you want if ls -d $agent/state/relations/*/* 2> /dev/null; then
<rogpeppe> perrito666: or maybe go the same route as the other one and do: find $agent/state/relations -type f | xargs sed -i ...
<perrito666> yeah find is more elegant in that case too, thank you
<rogpeppe> voidspace: https://codereview.appspot.com/78890043
<voidspace> EOW
<voidspace> g'night all, have a good weekend
<natefinch> wwitzel3: ug, accidentally deleted one extra line when I was removing references to MongoServiceName, and it took forever to figure out what was making things blow up in weird ways.
<rogpeppe> review appreciated, if anyone has some time to spare: https://codereview.appspot.com/78890043/
<rogpeppe> and that's me for the week
<rogpeppe> happy weekends all!
<natefinch> rogpeppe: happy weekend
<wwitzel3> natefinch: I just got back from lunch, I dug a little bit on the local provider stuff before but no breakthroughs
<natefinch> wwitzel3: yeah, me either.  I managed to break a lot of tests by changing the MongoServiceName stuff (mostly I think it's just that I need to mock out the new RemoveService() method, since I'm getting access denied errors)
<natefinch> wwitzel3: so, it looks like it's hanging on the line where we create transaction log collection.
<wwitzel3> natefinch: ok
<wwitzel3> natefinch: we had a problem there before too, before we were initializing the replicaset properly
<wwitzel3> natefinch: probably related somehow
<natefinch> wwitzel3: yeah, that's my thought, I'm looking now to see if maybe the local provider is somehow missing that step
<sinzui> hi natefinch wwitzel3 , last juju devs awake. is rsyslog-gnutls a hard dependency for juju-local?
<sinzui> I ask because it does not exist for arm64
<natefinch> wwitzel3: figured out at least part of it.  The logic in ensure mongo server was bailing out before it actually initiated the service
<wwitzel3> sinzui: with not great confidence, but based on my initial poking around, it looks like yes, it is a hard dep for juju-local
<wwitzel3> natefinch: I thought we told it to wait there?
<wwitzel3> natefinch: well, I guess .. can we tell it to wait there? :)
<natefinch> wwitzel3: if it's already installed, but not running, we just run it, but we don't check to see if it's initiated.  I'm guessing the local provider is installing the service early or something
<wwitzel3> natefinch: ohhh
<wwitzel3> natefinch: good find
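A sketch of the control-flow bug just described, with assumed names: the early return for an already-installed service skipped replica set initiation entirely.

    package mongo

    // svc abstracts the mongo upstart job; initiate is the replica set
    // initiation step. Both are stand-ins for illustration.
    type svc interface {
    	Installed() bool
    	Install() error
    	Start() error
    }

    // ensureServer makes sure mongo is installed, running, *and* has its
    // replica set initiated. The buggy version returned right after starting
    // an already-installed service, never reaching initiate().
    func ensureServer(s svc, initiate func() error) error {
    	if !s.Installed() {
    		if err := s.Install(); err != nil {
    			return err
    		}
    	}
    	if err := s.Start(); err != nil {
    		return err
    	}
    	return initiate()
    }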
<sinzui> wwitzel3, natefinch I am also looking into packaging/publishing issues. The package might be available but the test images cannot see it
<natefinch> sinzui: we definitely try to install the gnutls bits, but I have no idea how critical their use is. Definitely things currently will break if it's not there
<sinzui> natefinch, I suspect the problem is ec2/ami. I don't think any of the universe packages are being seen
<natefinch> sinzui: ahh, huh.  I have no idea if that's normal or not.
<sinzui> natefinch, nothing about the arm64 image is normal
<wwitzel3> hah
<natefinch> wwitzel3: I pushed up my code.  it doesn't actually fix things yet, but I think it's better. Mongo doesn't seem to like localhost as its hostname in the replicaset, which is something I remember from when I was twiddling with replicasets locally earlier.  I think it needs to use the local machine's hostname
<wwitzel3> natefinch: yeah you can only use localhost/127.0.0.1 if all of the members of the replicaset use localhost
<natefinch> wwitzel3: well, you'd think a replicaset of one would be ok with localhost then
<wwitzel3> natefinch: in theory it should be
<wwitzel3> natefinch: http://docs.mongodb.org/manual/reference/replica-configuration/#local.system.replset.members[n].host
<natefinch> wwitzel3: 2014-03-21 20:00:21 ERROR juju.cmd supercommand.go:300 failed to initiate mongo replicaset: couldn't initiate : can't find self in the replset config my port: 37017
<natefinch> wwitzel3: I think I remember there needs to be a command line flag set for it to accept localhost
<wwitzel3> natefinch: can you see what host it is trying to use? is it for sure trying to use localhost?
<natefinch> wwitzel3: double checked, definitely is "localhost"
<wwitzel3> natefinch: ok
<wwitzel3> natefinch: looks like the bind_ip would have to be 127.0.0.1 for the localhost in replicaset to work.
<wwitzel3> natefinch: so you're right, we need to use the actual machine hostname
<natefinch> wwitzel3: I tried setting the bind_ip to 127.0.0.1 too
<wwitzel3> natefinch: well that *should* have worked
<wwitzel3> natefinch: I was able to do it on my local machine that way anyway, which means squat
<natefinch> wwitzel3: the trick looks to be that you need to include the port in the hostname when you use localhost
<natefinch> wwitzel3: then bootstrap finishes on local
<wwitzel3> natefinch: ahh, I did do that
<natefinch> wwitzel3: we were just passing in the hostname to initiate
<wwitzel3> natefinch: nice so local is worky now too?
<natefinch> wwitzel3: sort of, I hard coded adding the port to see if it would work, so I have to find a way in real code to do it.  But at least we know what the fix is
<natefinch> wwitzel3: and bind_ip can stay 0.0.0.0, that works fine
<natefinch> wwitzel3: actually, not that hard.  if address == "localhost", append :port.
<wwitzel3> natefinch: :)
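A minimal sketch of the fix natefinch describes, written as a hypothetical helper (the real call site in juju's replicaset code is shaped differently): only a bare localhost member needs the port appended so mongo can find itself in the replset config.

    package main

    import (
        "fmt"
        "net"
        "strconv"
    )

    // replicaSetMemberAddress builds the host string handed to
    // replSetInitiate. A bare "localhost" makes mongo fail with
    // "can't find self in the replset config", so the port is
    // appended explicitly in that case.
    func replicaSetMemberAddress(hostname string, port int) string {
        if hostname == "localhost" || hostname == "127.0.0.1" {
            return net.JoinHostPort(hostname, strconv.Itoa(port))
        }
        return hostname
    }

    func main() {
        fmt.Println(replicaSetMemberAddress("localhost", 37017)) // localhost:37017
    }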
<natefinch> wwitzel3: gah, bootstrap works but then juju can't connect to the API, so like juju status returns connection refused
<natefinch> wwitzel3: pushed, at least
<wwitzel3> natefinch: ok, I will take stab at it for a bit here
<natefinch> EOD for me.  Have a good weekend everyone.
<wwitzel3> same here
<bits3rpent> Hello everyone, I just uploaded a fix for the password logging bug. I would appreciate if anyone would review it. https://code.launchpad.net/~jwharshaw/juju-core/fixlogbuild/+merge/211655
#juju-dev 2014-03-22
<stokachu> are all the charms being actively synced over to github or should i still use lp.net for the latest stuff?
<rick_h_> stokachu: most of them are over in LP and loaded into the store.
<rick_h_> stokachu: so you'll find the full repos there
<stokachu> rick_h_: cool thanks
#juju-dev 2014-03-23
<hazmat> anyone know what happened to the api for logs
<rick_h_> hazmat: it got back burnered and didn't end up landing yet
<rick_h_> hazmat: I'm trying to recall what it got pulled to, but I can't remember off the top of my head
<hazmat> rick_h_, yeah.. not seeing in the review queue or tree.. but i remember the discussions and thought it was imminent
<rick_h_> hazmat: yea, we were tracking it to support debug-log in the gui, but it got handed off and then that person ended up getting pulled to a higher priority
<rick_h_> hazmat: so it was still yellow close to red on the planning spreadsheet
<hazmat> yeah.. looks like the latest was this mp https://code.launchpad.net/~rogpeppe/juju-core/themue-058-debug-log-api/+merge/202909
<hazmat> but that implementation is broken with ha
<rick_h_> yea, there were a lot of implementation discussions and a couple of stabs at it; my understanding is it needed some more work.
<stokachu> is there a way to copy a non core juju plugin to the deployed system?
<stokachu> to be made available to a charm
<hazmat> stokachu, not sure if you got an answer.. but no re non-core plugin
<stokachu> hazmat: ok thanks
<hazmat> stokachu, the closest thing you can get is api access a la the gui, via config or login with credentials
<stokachu> hazmat: ok i was trying to write a parser for charm development but couldnt think of a clean way to access the library
<hazmat> stokachu, if its just a binary you can totally include it in a charm or pass it via juju scp
<hazmat> stokachu, but that's a little different than modifying the env
<stokachu> hazmat: so if i wrote it in go i could pass it through with scp
<hazmat> stokachu, its a fairly trivial go program to validate charms..
<stokachu> was trying to do a parser for a 1 file charm deployment
<hazmat> stokachu, you mean like a bundle archive? or
<stokachu> yea
<hazmat> stokachu, but why do you want the validation in the service/unit?
<stokachu> its more of a parser like install => [ names => ['nginx'] ]
<stokachu> translate to apt-get install ..
<hazmat> stokachu, deployer has a good amount of the logic for validation and checking.. i'd be game for seeing bundle archive support there... but i'm not sure we're talking about the same thing
<hazmat> stokachu, if you want a dsl for charms, use charm-helpers..
<hazmat> stokachu, we already have ansible, salt helpers there..
<stokachu> but you still have to symlink all hooks to a hooks.py file
<hazmat> stokachu, still not clear on the use case but bug  1009687 would cover that
<_mup_> Bug #1009687: charms should be able to provide a 'missing-hook' script <pyjuju:Confirmed> <https://launchpad.net/bugs/1009687>
<stokachu> hazmat: ah ok that helps
<hazmat> just linked it to core as well
<stokachu> just didnt like the fact i had to symlink files everywhere
<hazmat> stokachu, so charm-helpers + missing-hook + 1 file should be close to what you're saying..
<stokachu> would be easy if i could alter the cloud-init as well
<stokachu> hazmat: yea exactly
<stokachu> thinking more along the lines of a Dockerfile or Vagrantfile
<hazmat> stokachu, you need to use the api for provisioning if you want to mod cloud-init
<hazmat> stokachu, i do it in some of my manual provider based plugins/tools
<stokachu> hazmat: yea i was looking at your judo stuff
<stokachu> hazmat: do you have it in there?
 * hazmat tries to remember judo
<hazmat> so long ago.. before gui and before azure.. before core.. there was ju-jitsu ..
<stokachu> haha i remember that
<hazmat> stokachu, no.. that stuff is way old and pyjuju based..
<hazmat> stokachu, i'm on a plane atm.. (bandwidth bites).. but checkout add.py on github in my juju-lxc project
<hazmat> re using the provisioning api
<hazmat> you can get a script, shove it into cloudinit and tweak the rest as needed
<hazmat> stokachu, what's the use case?
<hazmat> for modifying cloudinit?
<stokachu> hazmat: so basically i was attempting to create a Dockerfile syntax like for one file
<stokachu> but required installing a few packages
<stokachu> and i didnt want to install the packages in an install-hook and then re-eval after the fact
<hazmat> stokachu, dockerfiles have no orchestration.. they're basically a simple makefile..
<hazmat> with snapshots for each command
<stokachu> yea this was for charms
<stokachu> rather than writing pure shellscript
<hazmat> stokachu, iow.. how is this dsl going to do relations?
<hazmat> stokachu, use ansible
<stokachu> well i was attempting to do it in perl :)
<stokachu> just to hack at it
 * hazmat holds back the lol
<hazmat> stokachu, what you want is already possible today using charm-helpers ansible support
<hazmat> single file dsl, with relation support
<stokachu> yea
<stokachu> but you still have the symlinks until that bug gets fixed right?
<hazmat> stokachu, yes
<stokachu> ok thats not so bad then
<hazmat> stokachu, have fun
#juju-dev 2015-03-16
<anastasiamac> axw: wallyworld: about cmd output
<anastasiamac> axw: wallyworld: m going to add status to list table :D
<axw> anastasiamac: thanks
<anastasiamac> axw: wallyworld: m going to rename test data to avoid confusion
<anastasiamac> axw: wallyworld: m think that if we want to diff btw unit/service in output, should do it in this pr...
<anastasiamac> axw: wallyworld: thoughts?
<axw> anastasiamac: IMO, we shouldn't care whether or not it's owned by a service or a unit, just whether it's attached to any units
<axw> anastasiamac: by "we", I mean the user
<anastasiamac> axw: wallyworld: k :D so besides adding status and renaming test data, is there anything else that u think should b addressed in this pr on output? :D
<axw> anastasiamac: a few things, I'm commenting now
<anastasiamac> axw: thx :D
<wallyworld> anastasiamac: agree with axw about service vs units fwiw
<anastasiamac> wallyworld: tyvm :D
<anastasiamac> wallyworld: good to know that this part of output is *kind of* done, axw comment pending :D
<axw> anastasiamac wallyworld: I'm free for a hangout whenever. I've commented on the diff
<wallyworld> i'll take a look
<anastasiamac> wallyworld: do u still want to discuss output?
<wallyworld> maybe we should, just so we're all on the same page
<wallyworld> we can go to the standup hangout
<wallyworld> axw: did you want to join us quickly?
<axw> wallyworld: omw
<axw> wallyworld: tomorrow I'll need to head out in the morning for a while, to sign the transfer of land
<wallyworld> sure, np
<axw> settlement is next week :o
<wallyworld> \o/
<gsamfira> here's a nice treat for whomever is curious: http://paste.ubuntu.com/10607588/
<gsamfira> http://paste.ubuntu.com/10607590/
<thumper> ha
<thumper> interesting
<gsamfira> trying out a noop charm now
<gsamfira> see if it actually deploys and runs hooks
<gsamfira> http://paste.ubuntu.com/10597232/ <-- one other treat :)
<axw> gsamfira: nice :)
<axw> gsamfira: that's a state server on jessie?
<gsamfira> axw: yep :)
<axw> neato
<gsamfira> took longer to generate the jessie image for maas than to get juju to run on it
<axw> hehe :)
<gsamfira> has a few bugs, but they should be easy fixes :)
<thumper> wallyworld: got a few minutes?
<wallyworld> sure
<thumper> 1:1 hangout?
<wallyworld> yup
<axw> wallyworld: when you're free, PTAL: https://github.com/juju/juju/pull/1841
<wallyworld> sure
<anastasiamac> axw:wallyworld: shall i filter out stroage without Unit ?
<anastasiamac> axw: wallyworld: storage even... from output
<axw> anastasiamac: I would prefer we leave it there until we have a
<axw> flag
<anastasiamac> axw: k :D thnx!
<anastasiamac> axw: wallyworld: PR is cleaned up :D plz revisit.. m taking baby to the doc and will check l8r :D tyvm
<wallyworld> axw: what about AttachmentTag instead of EntityTag on params.MachineStorageId ?
<axw> wallyworld: sorry, was afk. hmmm I guess so
<axw> yeah ok, will change
<wallyworld> just a suggestion
<wallyworld> seems a little more meaningful
<axw> wallyworld: I wasn't very happy with EntityTag, that seems slightly better
<axw> anastasiamac: if you use a string, then old clients will be able to see new status values. with an int, they'll get things they don't understand
<axw> anastasiamac: IOW, using a string means the client doesn't need to interpret the value.
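A sketch of the compatibility point, with illustrative type names (not juju's real params shapes): a string-typed status passes through an old client unchanged, where an int enum would render as a bare number the client cannot interpret.

    package params

    // StorageStatus travels over the wire as a plain string, so an old
    // client can still display a value that didn't exist when it was
    // compiled.
    type StorageStatus string

    const (
        StatusAttached  StorageStatus = "attached"
        StatusAttaching StorageStatus = "attaching"
    )

    // StorageDetails is an illustrative payload shape, not juju's real one.
    type StorageDetails struct {
        StorageTag string        `json:"storagetag"`
        Status     StorageStatus `json:"status"`
    }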
<wallyworld> axw: if you have time, here's an initial PR to start adding support for persistent volumes
<wallyworld> http://reviews.vapour.ws/r/1169/
<axw> wallyworld: cool, looking
<axw> wallyworld: reviewed
<wallyworld> ty
<dimitern> wallyworld, hey, thanks for landing my fix
<wallyworld> dimitern: np, still waiting on si
<wallyworld> ci
<dimitern> yeah, we'll see
<jam> dimitern: I'll be there in just a sec, need to use the restroom
<dimitern> jam, sure, omw as well
<dimitern> wallyworld, build-revision was disabled so we might have waited whole day for nothing - so I've enabled it and that will kick off all the rest
<jam> dimitern: I can't hear you at lal
<jam> al
<dimitern> jam, i've rejoined
<wallyworld> dimitern: oh, ffs, i wonder why it was disabled. i saw CI jobs running during the day
<dimitern> wallyworld, yeah, so I can see the tests are running now
<wallyworld> axw: re persistent machine scoped volumes. you can get those if you hulk-smash units onto the same machine, and when the unit is destroyed, the volume remains. So i was taking persistent to pertain to the lifecycle of the unit. In most cases, that will match the lifecycle of the machine, but doesn't have to
<wallyworld> dimitern: so i must have been seeing a subset of the tests or something
<axw> wallyworld: by that definition, all storage is persistent? I'm pretty sure it's meant to be about whether or not it outlives the machine...
<dimitern> wallyworld, I guess so - the industrial and charm test jobs most likely
<axw> wallyworld: see "Data persistence" in the spec
<wallyworld> axw: fair enough, i was thinking it might be considered to be a bit limiting
<TheMue> morning o/
<dimitern> TheMue, o/
<wallyworld> axw: the reason i put persistent on volumeDoc was that access to volumeparams is no longer available when SetVolumeInfo is called. I take the point about being able to derive it from DeleteOnTermination but are we always going to be able to do that with other providers
<axw> wallyworld: in my current branch, I have code to transfer info from params to info when provisioned. I don't understand your question there - the provider *has* to be able to determine whether or not it just created a persistent volume
<axw> wallyworld: i.e. all providers will have to know whether or not they're creating persistent volumes. if they are, then they *must* set Persistent:true, if they are not, then they *must* set it to false
<axw> otherwise we'll end up with volumes floating around costing people $$
<wallyworld> axw: yes, agreed. my point wasn't whether we could create persistent/non-persistent volumes, but whether we'd have a way post creation to query/access that info to pass on to SetVolumeInfo. But I guess we will always have the ability to do that
<axw> wallyworld: we call SetVolumeInfo with the information the storage provider returns from CreateVolumes
<axw> wallyworld: so CreateVolumes needs to record whether or not each volume it created was persistent
<wallyworld> i'll wait for your current branch to land before i finally re-propose this one
<wallyworld> ok, if that's in the contract, fair enough
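A sketch of that contract, with illustrative type names (juju's real storage interfaces differ in detail): whatever CreateVolumes returns must carry an explicit Persistent flag, which is what later gets passed on to SetVolumeInfo.

    package storage

    // VolumeParams describes a volume to be created (illustrative).
    type VolumeParams struct {
        Size uint64 // MiB
    }

    // Volume is the result of creating a volume. Persistent must be
    // set truthfully by every provider: true if the volume outlives
    // the machine, false otherwise -- leaving it wrong means volumes
    // floating around costing people $$.
    type Volume struct {
        VolumeId   string
        Size       uint64 // MiB
        Persistent bool
    }

    // VolumeSource captures the contract under discussion.
    type VolumeSource interface {
        CreateVolumes(params []VolumeParams) ([]Volume, error)
    }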
<axw> wallyworld: my next branch -- https://github.com/axw/juju/compare/watch-machine-storage...axw:storageprovisioner-api-attachments -- will propose after the other one lands
<axw> wallyworld: FYI, this commit copies bits between params/info: https://github.com/axw/juju/commit/196ad48a0ec4633f545979530e1e40d3e4529487
<axw> wallyworld: so you can cherry-pick that if you want to repropose your branch in the mean time
<wallyworld> axw: sure, will look soon, hopefully trunk will be unblocked rsn
<axw> wallyworld: that branch isn't very interesting, just adding a pile of methods to storageprovisioner API for the worker changes
<axw> just an FYI
<axw> feel free to ignore until I propose it
<wallyworld> ty
<dimitern> dooferlad, voidspace, standup?
<perrito666> morning
<TheMue> perrito666: heya o/
<dimitern> voidspace, are you around?
<voidspace> dimitern: sorry guys - omw
<voidspace> got distracted
<dimitern> jamespage, jam, fwereade, hey guys, are we having the call now to discuss jamespage's trip to germany?
<dimitern> jam, fwereade, alexisb, me and jamespage are in the hangout now
<voidspace> dimitern: do you think I'm building my asserts correctly?
<voidspace> dimitern:  Assert: append(isAliveDoc, unknownOrSame)
<voidspace> dimitern: I'm sure I saw that pattern elsewhere in our code
 * voidspace checking
<dimitern> voidspace, I think so - looks fine on initial glance
<voidspace> dimitern: hmm... yes, we do exactly the same elsewhere
<voidspace> dimitern: e.g. state/machine.go
<voidspace> dimitern: Assert: append(isAliveDoc, bson.DocElem{"nonce", ""}),
<dimitern> voidspace, only txn.DocExists cannot be appended like this (and txn.DocMissing)
<jam> dimitern: I'm there
<voidspace> dimitern: thanks
<dimitern> voidspace, so when you check what caused ErrAborted in this case you'll need to also consider the doc was removed, in addition to life != alive, and state != unknown || same
<voidspace> dimitern: we already do that
<voidspace> dimitern: looking for errors.IsNotFound on Refresh
<dimitern> voidspace, I think the issue is around line 209 in SetState
<dimitern> voidspace, ErrAborted will always be the case when you enter the if attempt > 0 block
<dimitern> voidspace, but on line 209 you're checking the error from Refresh
<dimitern> voidspace, I think instead you should check i.State() != AddressStateUnknown && i.State() != newState
<dimitern> voidspace, and the same applies to line 245 in AllocateTo - check i.State() != AddressStateUnknown instead
<voidspace> dimitern: isn't it the other way round, State() == AddressStateUnknown
<voidspace> ah, right
<voidspace> dimitern: yeah, I see
<dimitern> voidspace, :) yeah
<voidspace> dimitern: and now it works, so just need to add those missing tests...
<voidspace> dimitern: thanks
<dimitern> voidspace, awesome!
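A sketch of the assert pattern under discussion, using gopkg.in/mgo.v2 types; isAliveDoc, the collection and field names, and the extra condition are illustrative rather than juju's actual definitions.

    package state

    import (
        "gopkg.in/mgo.v2/bson"
        "gopkg.in/mgo.v2/txn"
    )

    // isAliveDoc matches documents whose life field is "alive".
    var isAliveDoc = bson.D{{"life", "alive"}}

    func setStateOp(id, newState string) txn.Op {
        // Extra conditions append cleanly onto a bson.D assert.
        // txn.DocExists and txn.DocMissing are sentinel values, not
        // bson.D documents, so they cannot be combined this way.
        unknownOrSame := bson.DocElem{
            Name:  "state",
            Value: bson.D{{"$in", []string{"", newState}}},
        }
        return txn.Op{
            C:      "ipaddresses",
            Id:     id,
            Assert: append(isAliveDoc, unknownOrSame),
            Update: bson.D{{"$set", bson.D{{"state", newState}}}},
        }
    }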
<mup> Bug #1432577 was opened: lxc containers on AWS can not be exposed <juju-core:New> <https://launchpad.net/bugs/1432577>
<dimitern> voidspace, I'm omw to our 1:1
<voidspace> dimitern: cool
<voidspace> dimitern: branch updated with test
<voidspace> well, push in progress
<dimitern> voidspace, great!
<mup> Bug #1432577 changed: lxc containers on AWS can not be exposed <juju-core:New> <https://launchpad.net/bugs/1432577>
<mup> Bug #1432577 was opened: lxc containers on AWS can not be exposed <juju-core:New> <https://launchpad.net/bugs/1432577>
<voidspace> dimitern: still can't land my branch, critical bug :-)
<voidspace> bug 1431888
<mup> Bug #1431888: Juju cannot be deployed on a restricted network <ci> <deploy> <network> <regression> <juju-core:Fix Committed by dimitern> <juju-core 1.23:Fix Committed by dimitern> <https://launchpad.net/bugs/1431888>
<voidspace> dimitern: ah, it's fix committed
 * TheMue is at lunch
<dimitern> voidspace, yeah, I know - I've set the job to retest with the fix, so it should be released soon I hope
<voidspace> dimitern: cool
<voidspace> dimitern: for testing the upgrade step defined in the state package I need some IP addresses that don't have a Life field
<voidspace> dimitern: is there a better way than manually constructing them as bson.D{...} and doing the insert?
<voidspace> dimitern: if I insert them using state then they get a Life field of course
<voidspace> dimitern: alternatively I can add them and then *remove* the Life field
<dimitern> voidspace, well, older versions of juju with added ip addresses will not have a life field
<dimitern> voidspace, it's fairly common to insert documents directly in state to setup a scenario
<dimitern> voidspace, as part of an upgrade step
<voidspace> dimitern: ok, manual insert it is
<dimitern> voidspace, (I mean to test a step - in reality, those ip addresses will already be in state)
<voidspace> of course
<voidspace> dimitern: and if I insert without an _id then mongo adds it for me?
<dimitern> voidspace, hmm.. depends - if it's ObjectId it will
<dimitern> voidspace, otherwise you need to set it manually
<voidspace> dimitern: ok, thanks
<voidspace> dimitern: found a good test in upgrades_test as a template
<dimitern> voidspace, sweet!
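A sketch of such a fixture, assuming direct access to the raw collection and gocheck-style assertions (the names are illustrative):

    package state_test

    import (
        gc "gopkg.in/check.v1"
        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // addLegacyIPAddress inserts a doc directly, bypassing the state
    // layer, so no life field is written -- mimicking what an older
    // juju left behind. The _id is a string here, not an ObjectId, so
    // it has to be set manually.
    func addLegacyIPAddress(c *gc.C, coll *mgo.Collection, value string) {
        err := coll.Insert(bson.D{
            {"_id", value},
            {"value", value},
            // deliberately no "life" field: the upgrade step under
            // test is expected to backfill it
        })
        c.Assert(err, gc.IsNil)
    }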
<mfoord> that was fun
<mfoord> a brief power cut
<mfoord> on which note
 * mfoord lurches to lunch
<dimitern> FYI, master is unblocked, 1.23 not yet; I've re-queued all recent merges which bounced on master due to the blocker
<mfoord> dimitern: cool, re-queuing merge then
<mfoord> dimitern: ah, you've done it :-)
<mfoord> thanks
<dimitern> mfoord, :)
 * fwereade forgot he has to be out for a few hours
<mup> Bug #1431888 changed: Juju cannot be deployed on a restricted network <ci> <deploy> <network> <regression> <juju-core:Fix Released by dimitern> <juju-core 1.23:Fix Committed by dimitern> <https://launchpad.net/bugs/1431888>
<dimitern> 1.23 is unblocked as well - I couldn't find any PRs that bounced to re-queue though - if you have any, feel free to re-queue them
<mup> Bug #1432652 was opened: upgrade_test.go failing on PPC64el <ci> <ppc64el> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1432652>
<mup> Bug #1432654 was opened: tracker_test.go failing on ppc64el <ci> <ppc64el> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1432654>
 * dimitern steps out for a while
<abentley> dimitern, mfoord: It would be good if you waited until we get a bless before landing a bunch of new code that could well cause a curse.
<mup> Bug #1431444 changed: juju run results contain extraneous newline <juju-core:Invalid by cherylj> <https://launchpad.net/bugs/1431444>
<mup> Bug #1431685 changed: juju nova-compute charm not enabling live-migration via tcp with auth set to none <juju-core:Invalid> <nova-compute (Juju Charms Collection):New> <https://launchpad.net/bugs/1431685>
<mup> Bug #1432577 changed: lxc containers on AWS can not be exposed <juju-core:New> <https://launchpad.net/bugs/1432577>
<mfoord> abentley: if that's the general rule we need then trunk should be blocked waiting on a bless
<abentley> mfoord: Yes, that works for me.
<mfoord> :-)
 * perrito666 throws a couple of liters of holy water in the direction of jenkins
<abentley> perrito666: It's not looking good: http://juju-ci.vapour.ws/job/run-unit-tests-trusty-ppc64el/2557/console
<perrito666> well that is what water will do to servers :p
<natefinch> wwitzel3: shouldn't the converter be running on the machine agent, not the unit agent?
<mfoord> dimitern: for you: https://medium.com/on-coding/programmer-passion-considered-harmful-5c5d4e3a9b28
<wwitzel3> natefinch: probably? it seems to restart just fine when the host ports are updated.
<wwitzel3> natefinch: but I guess we won't be able to issue machine level updates in a unit level watcher.
<wwitzel3> natefinch: I will push it up the stack
<natefinch> wwitzel3: cool
<dimitern> mfoord, :)
<mfoord> dimitern: wait, does this mean the upgrade step should be in steps124 not steps123?
<dimitern> abentley, we did wait for the failing test in question to pass
<mfoord> dimitern: would like to know if you think this test is sufficient?
<mfoord> dimitern: https://github.com/voidspace/juju/compare/address-life...voidspace:address-life-upgrade
<mfoord> dimitern: TestIPAddressesLife I mean
<dimitern> mfoord, will have a look in a bit
<mfoord> dimitern: ok
<mfoord> dimitern: I'll do a proper PR for it, I think it's done - so long as the test is sufficient
<dimitern> mfoord, so the upgrade steps need to be in steps123 I think
<mfoord> cool, that's where it is
<mfoord> dimitern: in the task for the address lifecycle watcher you state: "add a state lifecycle watcher (which is a notify watcher) monitoring ipaddressesC Life changes."
<mfoord> dimitern: should it be a NotifyWatcher rather than a lifecycleWatcher
<mfoord> dimitern: a lifecycleWatcher is a commonWatcher not a NotifyWatcher AFAICS
<mfoord> dimitern: ah no, my mistake
<mfoord> dimitern: lifecycleWatcher also implements Changes
<dimitern> mfoord, it's the same thing - just implementation detail
<mfoord> making it a NotifyWatcher
<mfoord> yep, understood
<dimitern> mfoord, a notify watcher reports empty changes
<dimitern> mfoord, ok :)
<mfoord> dimitern: AFAICS lifecycleWatcher is currently unused...
<dimitern> mfoord, hmm.. ok so what's behind machine.Watch then?
<mfoord> where's that defined? not in state/machine.go
<mfoord> dimitern: it's an EntityWatcher
<mfoord> or entityWatcher rather
<mfoord> dimitern: do a grep for lifecycleWatcher in the codebase
<mfoord> I'm happy to use it, just sayin'...
<dimitern> mfoord, ok, then it might have changed since I knew that part of the code
<dimitern> mfoord, entity watcher it is then
<mfoord> dimitern: there is a lifecycleWatcher
<mfoord> dimitern: it should probably be deleted if it's unused
<dimitern> mfoord, no actually, wait a sec
<mfoord> I think lifecycleWatcher is what we want
<dimitern> mfoord, there is a whole bunch of lifecycleWatchers
<mfoord> entityWatcher watches for more than just lifecycle changes (I think)
<mfoord> are there?
<mfoord> there's a test that claims there are...
<dimitern> mfoord, yeah - check state/watcher.go
<mfoord> ah, newLifecycleWatcher
<mfoord> case change, that's why my grep failed
<mfoord> :-)
<mfoord> fair enough
<dimitern> mfoord, yeah :)
<mfoord> just keeping you on your toes...
<dimitern> mfoord, :D
<dimitern> mfoord, an entity watcher would've worked, but it will fire for any changes in the collection, not just life values
<mfoord> yep
<dimitern> mfoord, so I'm looking at the diff for your upgrade step
<mfoord> dimitern: cool
<dimitern> mfoord, what immediately springs to mind is that we should only add a life field with value "alive" to addresses allocated to machines which themselves are still alive, otherwise - it should be "dead"
<mfoord> dimitern: causing us to release the dead addresses
<mfoord> dimitern: cool
<dimitern> mfoord, that's right
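A sketch of the amended step, assuming a raw collection handle and an illustrative "machineid" field (the real step lives in the upgrades package and the schema may differ):

    package upgrades

    import (
        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // addLifeToIPAddresses backfills life: addresses on machines that
    // are still alive become "alive", everything else becomes "dead"
    // so the addresser worker will release it. The $exists selector
    // skips docs that already have the field, which is what keeps a
    // second run (the TestXYZIdempotent pattern) a no-op.
    func addLifeToIPAddresses(coll *mgo.Collection, aliveMachineIds []string) error {
        noLife := bson.D{{"life", bson.D{{"$exists", false}}}}
        _, err := coll.UpdateAll(
            append(noLife, bson.DocElem{Name: "machineid", Value: bson.D{{"$in", aliveMachineIds}}}),
            bson.D{{"$set", bson.D{{"life", "alive"}}}},
        )
        if err != nil {
            return err
        }
        _, err = coll.UpdateAll(
            append(noLife, bson.DocElem{Name: "machineid", Value: bson.D{{"$nin", aliveMachineIds}}}),
            bson.D{{"$set", bson.D{{"life", "dead"}}}},
        )
        return err
    }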
<dimitern> mfoord, and only one other issue I could see off hand - other steps usually have a test like TestXYZIdempotent - which runs the step twice to ensure it's ok
<mfoord> dimitern: ok
<mfoord> dimitern: however...
<mfoord> dimitern: if the watcher only gets notified about changes after the watcher starts (i.e. probably not during an upgrade) then the dead addresses will probably never be removed
<mfoord> dimitern: unless you want starting the watcher to check for already dead addresses
<mfoord> dimitern: which it can do
<dimitern> mfoord, well think about it this way - nothing runs before the upgrade is complete, then everything restarts;
<dimitern> mfoord, also, workers watching other entities' life cycle will get an automatic change when the watcher is started, so will go and fetch all dead ips in our case and try to release them
<dimitern> mfoord, therefore, it should all work out eventually I think
<mfoord> dimitern: "workers watching other entities' life cycle will get an automatic change when the watcher is started, so will go an fetch all dead ips in our case and try to release them"
<mfoord> dimitern: why will "other entities life cycle" watchers cause all dead ips to be fetched?
<mfoord> dimitern: why would other watchers cause ips to be fetched *at all* anyway, let alone dead ones
<dimitern> mfoord, :) ok, I started out speaking in general then moved to our specific case
<mfoord> dimitern: do you mean a lifecycleWatcher *is* notified of new dead entities when it starts?
<dimitern> mfoord, I mean our ips watcher will do the same as the other life watchers
<dimitern> mfoord, and the worker which uses the watcher likewise - just react when a change happens (our worker reacts by getting all Dead ips and releasing them one by one)
<mfoord> dimitern: ah, you mean "changes" includes *all dead entities*
<mfoord> dimitern: so the next change (i.e. the restart) will include them
<dimitern> mfoord, well technically the "changes" are always empty for notify watchers, they just signify "something has changed"
<mfoord> ok
<dimitern> mfoord, so you'll need to go fetch the actual docs to see what changed, in our case - we'll just fetch all Dead ips
<mfoord> right
<mfoord> I (wrongly) assumed that the notification would include the dead ips
<mfoord> so that ones that *start dead* would be missed
<mfoord> that's fine then
<dimitern> yep
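A sketch of the consumer pattern just agreed on; NotifyWatcher is reduced here to the one method that matters, and deadAddresses/release stand in for the real state calls:

    package addresser

    // NotifyWatcher events carry no payload; they only signal
    // "something may have changed".
    type NotifyWatcher interface {
        Changes() <-chan struct{}
    }

    // releaseLoop reacts to every event -- including the initial one a
    // watcher sends on startup, which is what catches addresses that
    // were already Dead before the worker started.
    func releaseLoop(w NotifyWatcher, deadAddresses func() ([]string, error), release func(addr string) error, dying <-chan struct{}) error {
        for {
            select {
            case <-dying:
                return nil
            case <-w.Changes():
                addrs, err := deadAddresses()
                if err != nil {
                    return err
                }
                for _, addr := range addrs {
                    if err := release(addr); err != nil {
                        return err
                    }
                }
            }
        }
    }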
<mfoord> dimitern: http://pyfound.blogspot.co.uk/2015/02/john-pinner.html
<mfoord> dimitern: lovely guy, I worked with him on EuroPython for two years (when it was in Birmingham) and PyCon UK since pretty much the start
<dimitern> mfoord, I see - he looks like a nice guy
<mfoord> he was :-/
<dimitern> mfoord, those sort of occasions are never welcome or expected :/
<mfoord> yeah, we hoped he'd make it to the next PyCon UK - but wasn't to be
<dimitern> was he sick for some time?
<mfoord> dimitern: he had cancer, but he was expected to last longer
<dimitern> mfoord, oh I see.. terrible
<abentley> dimitern: That's a start, but we still don't have something we can release.
<dimitern> mfoord, you've got a review on the upgrade step btw
<dimitern> abentley, was it because I enabled the job or for some other reason?
<abentley> dimitern: I don't know.  What revision was it?
<mfoord> dimitern: thanks
<dimitern> abentley, 2448 was the last one I saw for 1.23, 2449 - for master
<dimitern> abentley, and for both of these I manually restarted the restricted network job, as commented on the bug
<abentley> dimitern: So you landed cbaacb83e10f7757362f06e11c392ad3388ddf23 and 33de6a0b87bb7db23749c9a8a4f1a17dbb72f014 ?
<dimitern> abentley, wallyworld landed the cbaacb8 for me
<dimitern> abentley, I forced the other one to land as it was fixing a regression around kvm containers not being addressable under maas
<abentley> dimitern: So functional-restricted-networks failed for 2448.  Was that the test you were watching?
<redelmann> hi!
<redelmann> one simple question!
<dimitern> abentley, it did fail initially because the instance had termination protection on it
<redelmann> should "juju scp -r service/0:/path/to/dir/ ." work?
<redelmann> it say: error: flag provided but not defined: -r
<dimitern> abentley, after that I restarted it manually with the same rev 2448 - http://juju-ci.vapour.ws:8080/job/functional-restricted-network/1317/
<abentley> dimitern: I don't see any runs that passed.  just 1313, 1314, 1315.
<redelmann> sorry
<abentley> 1316 was against 2449.
<redelmann> i forget scp -- -r
<dimitern> abentley, last two - 1316 and 1317 passed http://juju-ci.vapour.ws:8080/job/functional-restricted-network/
<dimitern> abentley, yes, and 1317 was against 2448
<abentley> dimitern: You can't run against a previous build-revision when the next build-revision has started.
<abentley> dimitern: The streams for 1317 were for 2449.
<abentley> dimitern: Because we overwrite the streams in the "publish-revision" step.
<dimitern> abentley, ok, I see
<dimitern> abentley, I wasn't going to do it anyway, jumping the line like this, but it was sitting there since yesterday
<dimitern> abentley, any idea why build-revision was disabled in the first place?
<dimitern> abentley, I suspect sinzui left it so I can connect to the ec2 instance to investigate the networking issue
<abentley> dimitern: That would make sense.
<dimitern> abentley, ok, so there was a bit of a stir with the reports due to my intervention, but I don't believe I did anything too bad - as the next rev is tested (already underway) things will fall into place
<abentley> dimitern: The rev currently being tested is master, not 1.23.
<dimitern> abentley, so then to trigger a re-test of 1.23 another change needs to land?
<abentley> No, we can do it manually.
<dimitern> abentley, ok, good
<dimitern> abentley, I hope that won't be too much trouble
<abentley> dimitern: But by default, juju-ci will want to test 772cb769e6277403f0f6ac6e41241a52d102badc
<abentley> dimitern: So I'll have to disable build-revision.
<dimitern> abentley, to get 1.23 ahead of master?
<abentley> dimitern: Yes, so that I have a window when it's not testing where I can start a manual re-test.
<dimitern> abentley, ok, that sounds good - i won't touch anything more today
<mup> Bug #1432759 was opened: Transient error on status while running deployments via quickstart <juju-core:New> <https://launchpad.net/bugs/1432759>
<perrito666> ericsnow_: I might have found a small bug in http://reviews.vapour.ws/r/1172/
<perrito666> I added an issue
<ericsnow_> perrito666: thanks
<mrpoor> hi
<natefinch> Hello :)
 * natefinch just got bit by that whole "This thing says it's State but it's really not State-state"
<ericsnow_> perrito666: thanks for catching that; I've updated the patch
<perrito666> ericsnow_: cool :D
<perrito666> natefinch: well it is A State, not The State
<davechen1y> natefinch: imagine me grinning, with that kind of evil, sadistic grin
<natefinch> davechen1y: do you have another kind of grin?
<davechen1y> sometimes it looks slightly less creepy
 * perrito666 was going to mock natefinch when he found himself between 3 different kinds of state
<davechen1y> if it's dark
<davechen1y> and i'm not looking directly at you
<perrito666> ok ok, we might need to rename a few states
<perrito666> or make grep a lot smarter :p
<natefinch> wwitzel3: gotta run, kids are going crazy. I'll try to get a test run on my code later... right now, enabling the watcher I had made makes my local provider fail before it even creates the ~/.juju/local/ folder... which I would have hoped would be impossible, but evidently not.
<natefinch> s/watcher/worker/
<wwitzel3> rgr
<thumper> niemeyer: ping
<ericsnow_> could I get a review on http://reviews.vapour.ws/r/1173/
<mattyw> thumper, as you're around I have a fairly low priority ping for you - if you're busy feel free to ignore
<mattyw> calling it a night all, nighty night
<perrito666> menn0: there are errors displaying your last pr
<menn0> perrito666: I know. it's a long running feature branch. all the changes have already been reviewed. you can ignore. I just need to get the merge commit in.
<menn0> perrito666: while we're talking...
<menn0> perrito666: if you run the state tests on current master do you see this too? :
<menn0> $ go test ./state
<perrito666> menn0: dont look at me, my reviews carry the weight of a feather
<menn0> # github.com/juju/juju/state
<menn0> state/unit.go:1787: assignment count mismatch: 2 = 1
<menn0> # github.com/juju/juju/state
<menn0> state/unit.go:1787: assignment count mismatch: 2 = 1
<menn0> FAIL	github.com/juju/juju/state [build failed]
<menn0> perrito666:  :)
<perrito666> menn0: lemme look
<ericsnow_> menn0: that cleared up for me when I ran godeps -u
<menn0> ericsnow_: ok, let me try that. I have a hook which runs godeps automatically...
<menn0> ericsnow_: ...which apparently didn't work. all good now.
<menn0> perrito666: never mind. thanks anyway.
<ericsnow_> menn0: cool :)
<perrito666> menn0: ok, I was running them
<menn0> perrito666: if they're running then you're not seeing the problem. the compile fails straight away.
<perrito666> menn0: ok
<perrito666> wallyworld: you make me feel so roman when you call me horatio :p
<perrito666> wallyworld: ping me when you are around pls
<wallyworld> :perrito666 in meeting, talk soon
<perrito666> wallyworld: no hurry at all
 * perrito666 tries to get more upload and discovers that his internet provider punishes permanence as a client
<davechen1y> in related news
<davechen1y> a googler i know has got himself locked out of his GCE account
<davechen1y> because his daemon is consuming all the request quota
<davechen1y> if you've used AWS
<davechen1y> you know that feel
<perrito666> davechen1y: I have never been there, apparently I haven't used it enough
<perrito666> also, what happened to your nickname
<davechen1y> internets
<wallyworld> perrito666: hi, free now
<perrito666> wallyworld: segfault, now was not allocated
<perrito666> sorry, that was a really bad joke
<wallyworld> groan
<davechen1y> yellow card
<perrito666> oh cmon, really? I didn't even hit the guy
<davechen1y> let's go to the video replay
<davechen1y> 09:55 < perrito666> wallyworld: segfault, now was not allocated
<davechen1y> 09:55 < perrito666> sorry, that was a really bad joke
<davechen1y> 09:55 < wallyworld> groan
<davechen1y> the referees decision is final
<perrito666> oh, he is just acting, Ill complain to the league
<ericsnow_> davechen1y: lol
<perrito666> ok so, apparently my ISP will raise my bill by 30% on average every 6 months if I stay for 1.5 years more
<perrito666> is this a common practice over the world?
<menn0> perrito666: not in my experience
 * perrito666 gives a tour around all the possible ISPs and finds everyone has the same behavior, that is... mad
<perrito666> so it is more convenient for me to actually drop the service every 6 months and get a new membership
 * perrito666 calls to drop the service and get a new connection
<ericsnow_> perrito666: PTAL http://reviews.vapour.ws/r/1173/
<perrito666> ericsnow_: going
<ericsnow_> perrito666: thanks
<menn0> perrito666: in my experience, if that kind of thing happens, you can usually get the deal that new subscribers get when you threaten to leave
<perrito666> menn0: trying to, Ill call after ericsnow_ 's patch is reviewed
<ericsnow_> wallyworld: I have a followup to your suggestion to comment individual constants: http://reviews.vapour.ws/r/1173/
<wallyworld> ericsnow_: will look after meeting
<ericsnow_> wallyworld: ta
<perrito666> ericsnow_: lemme know if my comment makes sense
<ericsnow_> perrito666: k
<perrito666> I was a bit lazy to re-write the whole thing
<perrito666> anyone has anything else to review? it seems Ill be on hold for a long time and the music is quite soothing, ideal for reviewing
<ericsnow_> perrito666: I've updated that patch; see if it looks okay to you now
<perrito666> ericsnow_: creative :)
<ericsnow_> perrito666: hopefully easier to follow :)
<perrito666> ericsnow_: there you go, fix then shipped
<ericsnow_> perrito666: thanks
<axw_> wallyworld: FYI, http://reviews.vapour.ws/r/1176/ -- would be good if you could take look later
<wallyworld> sure
<perrito666> ericsnow_: btw, add please's and sorry's wherever it might apply, I am not being impolite, my eyes are just tired
<ericsnow_> perrito666: :)
#juju-dev 2015-03-17
<ericsnow_> davechen1y: PTAL http://reviews.vapour.ws/r/1163/ and http://reviews.vapour.ws/r/1164/
 * perrito666 hears a kid receiving express education in the house next door
<perrito666> sitting in my office during the night makes for the most interesting show
<jw4> perrito666: is that Argentinian for FastMath (tm) ?
<jw4> s/FastMath/FasttMath/
<perrito666> jw4: I dont know what FasttMath ™ is, but I was referring to a hand against a kid's bottom
<jw4> perrito666: :-p
<jw4> I figured as much - I was trying too hard to be funny
<perrito666> and of course kids screaming reply
<ericsnow_> perrito666: did http://reviews.vapour.ws/r/1172/ look okay now?
<perrito666> ericsnow_: yup, you need a senior rubber stamp though
<ericsnow_> perrito666: thanks
<menn0> thumper, waigani, davechen1y, wallyworld : refactoring of api.Open: http://reviews.vapour.ws/r/1179/
 * thumper looks
<thumper> menn0: done
<menn0> thumper: thanks
<perrito666> thumper: hey, you owe me an emails answer :) could I collect on that?
<thumper> perrito666: um... sure
<perrito666> thumper: the one about the mongo admin user
<thumper> yeah, I know the one
<thumper> just busy right ow
<thumper> will get to it shortly
<perrito666> thumper: well you have a lot of time, I am going to sleep
<davechen1y> can anyone else get to the ppc vm's ?
<davechen1y> sinzui, can you get to your ppc vm's anymore ?
<davechen1y> looks like batuan has lost all our keys
<wallyworld> axw_: reviewed, just a few questions
<axw_> wallyworld: cheers
<axw_> wallyworld: PTAL
<wallyworld> looking
<wallyworld> axw_: lgtm
<axw_> thanks
<redelmann> Hi, I have a noob question about hooks
<redelmann> somebody can give me a hand?
<jw4> redelmann: what's your question - we'll try :)
<redelmann> want to prevent django migration when a new unit is added
<redelmann> jw4, want to prevent django migration when a new unit is added
<redelmann> jw4, can i set a variable inside the charms?
<jw4> redelmann: hmm; that's fairly specific to the charm you're using I think
<redelmann> jw4, im making a custom charm
<jw4> ah, I see
<redelmann> jw4, can i change a charm config variable from inside the charm?
<jw4> redelmann: yes you should be able to - I'll need to dig a bit to see exactly how
<redelmann> jw4, something like migrate: true?
<redelmann> jw4, charmhelper does not have something to do that?
<jw4> redelmann: it's outside my experience, but I can see if I can find more info in the source
<redelmann> jw4, could use a file, but it is not the most elegant solution
<redelmann> jw4, ok dont worry. i was asking because i thought there was a direct solution
<jw4> redelmann: okay - someone else on this channel or on #juju may know off the top of their head
<mup> Bug #1373236 changed: apiserver/backups: backups must be migrated off legacy provider storage <backup-restore> <juju-core:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1373236>
<redelmann> jw4, with direct solution i meant a solution implemented in juju
<redelmann> jw4, thanks
<jw4> :)
<jw4> redelmann: '$ juju help set' if you haven't tried that yet
<wallyworld> axw_: RE persistent attribute, we could use the provider ValidateConfig() to reject attempts to set persistent for providers which don't support it?
<redelmann> jw4, but i cant run "juju set" inside charm hooks. or can i?
<jw4> redelmann: that's what I don't know
<redelmann> jw4, i will try tomorrow at work
<jw4> kk
<redelmann> jw4, i'm looking at the postgresql charm, i think they use some "state file". Maybe that's the elegant solution
<redelmann> jw4, thank you.
<jw4> cool
<jw4> good luck redelmann
<wallyworld> axw_: i've updated the pr to add extra provider validation
<thumper> gah...
<thumper> trying to work out why my test wasn't working
<thumper> hadn't set the feature flag
<thumper> ...
<thumper> yay?
<jw4> thumper: at least you know they work
<thumper> yeah :)
<menn0> thumper: it is not my day. due to other branches landing db-log now has merge conflicts again.
<thumper> :(
<menn0> thumper: prepare for another PR for you to merge
 * thumper braces himself
 * thumper is still braced
<alexisb> thumper, menn0 is helping you with your gym routine ;)
<thumper> alexisb: heh... kinda
<menn0> ha :)
<menn0> thumper: https://github.com/juju/juju/pull/1857
<thumper> menn0: merged
<thumper> menn0: I don't see much point actually looking at the diff
<thumper> it isn't like I'd be able to spot anything :)
<menn0> thumper: there was a conflict due to waigani's most recent change
 * thumper nods wisely
<menn0> thumper: I guess this is one downside of big feature branches. you end up with a much larger "conflict surface" :)
<thumper> true
<menn0> thumper: not a showstopper (they're still worth it) but something to be aware of
 * thumper sighs
<thumper> my magic branch to stop serving the initial environment at the root of the api now breaks a shed ton of tests
<thumper> because guess where all the tests connect to...
<waigani> :(
<menn0> thumper: woot! db-log branch merged
<thumper> \o/
<menn0> thumper: and I have a fix for the intermittent upgrade test failure
<thumper> double \o/
<thumper> \o/\o/
<thumper> a ':' bites me again
<thumper> ok, down to four test failures, and I'm sure at least one of them is about logging
<thumper> FARK!!!
<thumper> shitty shitty shitt tests
 * thumper stabs it through the heart
<thumper> I'll work out how to write it properly tomorrow
<menn0> thumper: new logging stuff, or old?
<thumper> well, I'm working around the 'response redacted' bit
<thumper> implementing it (I think) a little better
<thumper> but also, we have tests that assert that we call version 1 of Login
<thumper> now I have version 2
<thumper> I want to rewrite the test
<thumper> not just change 1 to 2
<thumper> the test is broken
<thumper> and testing the wrong behaviour
<thumper> it obviously doesn't care that I'm logging in with v1 or v2
<thumper> it actually cares about ipv4 vs ipv6
<thumper> so it should be testing that
<thumper> not the version of the login
<thumper> bah humbug
<menn0> totally. understood.
<thumper> ok, I'm done
<thumper> laters peeps
<axw_> wallyworld: just got back. will take a look now
<wallyworld> sure, had to rebase as well
<menn0> wallyworld: if you have a moment: http://reviews.vapour.ws/r/1184/
<wallyworld> sure
<menn0> wallyworld: thanks
<axw_> wallyworld: still feels a little fragile, as it's opt-out rather than opt-in, so default behaviour would be to "support" the persistent attribute but ignore it. we can iterate on it, what you did with the scope checking is a good step
<axw_> wallyworld: shipit
<wallyworld> axw_: yeah, we need to improve validation a bit, plus add schema support etc
<menn0> wallyworld: i'm done for the day. if you're happy with that PR would you mind attempting a $$merge$$ on it?
<wallyworld> menn0: looks good, did you change the api call for any reason?
<wallyworld> menn0: just did a +1
<wallyworld> i can merge
<menn0> wallyworld: cool thanks... i'll do the merge
<wallyworld> ok
<menn0> wallyworld: i changed the API because it seemed silly to trigger a possible environment destruction when we still need the env for the test
<wallyworld> fair point
<menn0> wallyworld: so I chose a non-destructive yet restricted API call
<wallyworld> +1
<menn0> wallyworld: cheers
<axw_> wallyworld: FYI, https://github.com/axw/juju/tree/worker-mounter
<axw_> need to do some more testing before I propose it
<axw_> it appears that "storage list" is broken, looking into it atm
<wallyworld> ok
<axw_> wallyworld: still around? http://reviews.vapour.ws/r/1185/
<mattyw> morning all
<voidspace> grrr... someone else landed an upgrade step
<voidspace> now I have conflicts to resolve
<voidspace> *really easy* conflicts, but still...
<voidspace> dimitern: "addresser"?
<voidspace> dimitern: or networker/addresses ?
<dimitern> voidspace, you mean the name of the worker?
<voidspace> dimitern: yep
<dimitern> voidspace, addresser I think - not in networker though - top level
<voidspace> dimitern: yep, cool - thanks
<TheMue> Hi folks, greetings from the CeBIT
<dimitern> TheMue, hey! how's it going there?
<TheMue> First informational talks for me, so far few interested people
<TheMue> But optically this area looks more like a recreation space than a real booth
<voidspace> dimitern: just you and me for standup I think :-)
<TheMue> Not well designed
<dimitern> TheMue, it will pick up as it goes
<dimitern> voidspace, omw
<TheMue> I hope so
<natefinch> Anyone seen a message like this before?  I'm getting this when destroying a local environment: ERROR while stopping mongod: exec ["stop" "--system" "juju-db-nate-local"]: exit status 1 (stop: Method "Get" with signature "ss" on interface "org.freedesktop.DBus.Properties" doesn't exist)
<dimitern> natefinch, I've seen this
<dimitern> natefinch, is this on vivid?
<natefinch> dimitern: utopic
<dimitern> natefinch,  inside a container?
<dimitern> natefinch, perhaps this is relevant - http://ubuntuforums.org/showthread.php?t=1594566
<dimitern> I have seen this error recently, but can't quite recall the circumstances - what I remember is that it has nothing to do with dbus missing or otherwise
<natefinch> it's just the local provider on my laptop.  Laptop is utopic.  *shrug*
<natefinch> connman is not installed, so that link is evidently not applicable
<voidspace> natefinch: my laptop arrives today, it will have windows on it
<voidspace> natefinch: is installing ubuntu straightforward?
<natefinch> voidspace: IIRC it's just a matter of making an ubuntu USB stick and booting with it in
<natefinch> voidspace: I don't remember having to twiddle with the bios or anything
<voidspace> natefinch: cool, that's what I'm hoping :-)
<perrito666> morning everyone
<voidspace> perrito666: morning
<perrito666> natefinch: fell off the bed?
<natefinch> perrito666: nah, I been getting up at 5am for a couple weeks now, trying to get this HA stuff done.. my day is pretty broken up with kids stuff, though, so I'm usually AFK from 7-9:30
<voidspace> dimitern: do you think "juju add-machine lxc:0" should be sufficient to allocate an IP address?
<voidspace> dimitern: 1.22, maas
<dimitern> voidspace, in 1.22 it won't allocate anything
<voidspace> dimitern: gah, dammit
<dimitern> voidspace, you need 1.23 - then it will work
<voidspace> dimitern: I should have done 1.23
<voidspace> yep
<dimitern> :)
<voidspace> heh, ah well - at least I've got maas working
<voidspace> dimitern: took a bit of effort - it has a dependency on python-apt that isn't automatically installed
<voidspace> so everything was broken until I worked it out
<dimitern> voidspace, which one? maas?
<natefinch> Anyone have an idea of why this worker, when started in MachineAgent.Run() would cause juju to simply not run?  https://github.com/natefinch/juju/blob/ha-to/cmd/jujud/agent/converter.go
<voidspace> dimitern: yep, maas
<voidspace> dammit, screwed up my git checkout and now have juju 1.22 local branch named juju-1.23.0
<dimitern> voidspace, weird.. I haven't seen this with maas before
<voidspace> I guessed the tag wrong - should have remembered it wasn't released yet
<voidspace> dimitern: maybe you had python-apt already installed?
<dimitern> voidspace, ah :/
<voidspace> it's quite common
<voidspace> I just have a kvm image *just* for the maas controller - so only maas dependencies installed
<dimitern> voidspace, I've installed whatever packages the maas ppa brought over I guess
<perrito666> natefinch: without any ouput?
<voidspace> dimitern: me too
<wallyworld> axw_: was at soccer, will look soon
<wwitzel3> that was weird
<perrito666> wwitzel3: ?
<wwitzel3> my bnc acted up, can't tell if what I typed was actually sending
<wwitzel3> did you guys see my comments about using the termination work as a reference piece?
<perrito666> nope
<perrito666> I havent seen any comment from you
<wwitzel3> natefinch: I ran into the same issue
<wwitzel3> natefinch: I am currently making a barebones working version using the termination worker as a reference
<wwitzel3> natefinch: then I was going to make it watch the jobs collection
<wwitzel3> natefinch: and just get that starting without hanging
<natefinch> perrito666: yeah, no output, no logs at all
<wwitzel3> natefinch: I'll let you know if that works in a second
<natefinch> perrito666: like, /var/log/juju/ is empty
<natefinch> wwitzel3: cool
<natefinch> I gotta go afk for a couple hours.  Kids are up.
<voidspace> dimitern: how long should it take for a new container to switch from "instance-id: pending" ?
<dimitern> voidspace, if it's the first one, a few minutes until it downloads the image
<voidspace> dimitern: ah
<voidspace> dimitern: ok
<perrito666> wallyworld: if you are still around I have answered you on the review and would appreciate an answer to that
<wallyworld> perrito666: ok, looking
<dimitern> voidspace, you could ssh into the host, sudo su -, then look in /var/lib/juju/containers/*/*.log and /var/lib/lxc/*
<voidspace> dimitern: it's started :-)
<voidspace> dimitern: and has a dns-name
<dimitern> voidspace, it's a great feeling when it works, isn't it? :)
<perrito666> dimitern: only if you know why
<wallyworld> perrito666: hmmmm, i think we might want to retain the Stopped state
<wallyworld> for that legacy agent-state
<voidspace> dimitern: upgrade worked fine
<voidspace> dimitern: and I appear to be able to deploy into the container
<voidspace> dimitern: so the upgrade didn't break the container
<dimitern> perrito666, :)
<perrito666> wallyworld: works for me
<dimitern> voidspace, so what was your scenario pre-upgrade?
<wallyworld> perrito666: let's do it and we'll validate the decision
<voidspace> dimitern: I just added the container
<voidspace> dimitern: bootstrap then add container, wait for that to complete, upgrade
<voidspace> dimitern: I have to go!
<voidspace> dimitern: see you in a bit
<dimitern> voidspace, ok, that's great - need to check with a destroyed container as well
<perrito666> wallyworld: I believe Ill go with Unknown->Error and Terminated->Terminated
<dimitern> voidspace, ping me when you're back
<wallyworld> perrito666: i don't think Unknown->Error is quite right. we know if start hook has run
<wallyworld> so unknown maps to pending or started
<wallyworld> also, Terminated should ->Stopped
<wallyworld> to keep using the legacy terminology
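A sketch of that mapping as a hypothetical helper (the real translation code may differ):

    package state

    // legacyAgentState maps the newer status vocabulary back onto the
    // legacy agent-state one: Unknown splits on whether the start hook
    // has run, and Terminated keeps the legacy name Stopped.
    func legacyAgentState(status string, startHookRan bool) string {
        switch status {
        case "terminated":
            return "stopped"
        case "unknown":
            if startHookRan {
                return "started"
            }
            return "pending"
        default:
            return status
        }
    }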
<axw_> wallyworld: I'm thinking of simplifying storage provisioning a bit. I'm thinking of extending the storage.Provider interface to indicate whether or not it supports dynamic provisioning. If it does, provisioning will be done by the storage provisioner, and if it doesn't it'll be done by the machine provisioner
<axw_> wallyworld: so then there'll be exactly one thing responsible for provisioning a volume
<axw_> wallyworld: atm there's a race between storage provisioner and machine provisioner, and it's manifesting in the storage provisioner in my testing
<wallyworld> axw_: that aspect was starting to occupy my thoughts also
<wallyworld> not from anything concrete
<wallyworld> but just to eliminate unnecessary complexity
<axw_> wallyworld: the other thing I was thinking was removing Attachment from VolumeParams and FilesystemParams
<wallyworld> that i hadn't been thinking about
<axw_> wallyworld: so they're always created separately, rather than "maybe with the volume, maybe later on"
<wallyworld> ok
<axw_> wallyworld: we'd still do both at the same time in the machine provisioner
<axw_> so there'd be a different type for that
<wallyworld> ok, let's see how it plays out
<axw_> I'll spike on it tonight/tomorrow, I think I can get something in that makes the storage provisioner more reliable
<wallyworld> \o/
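A sketch of the simplification axw describes, assuming a capability flag on storage.Provider (the method name is illustrative): each volume then has exactly one owner, which removes the race between the two provisioners.

    package storage

    // Provider gains a capability flag: dynamic providers are handled
    // by the storage provisioner after the machine exists; everything
    // else is created by the machine provisioner alongside the
    // instance.
    type Provider interface {
        // Dynamic reports whether volumes can be provisioned once the
        // machine is already running.
        Dynamic() bool
        // ...existing methods elided...
    }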
<wallyworld> axw_: i also ran into that attached vs pending bug
<wallyworld> in this http://reviews.vapour.ws/r/1186/
<wallyworld> added a todo
<wallyworld> could you look when you get a chance, no hurry
<axw_> wallyworld: will do
<wallyworld> ta
<mattyw> fwereade, got a moment - irc is fine
<fwereade> mattyw, sure
<jw4> perrito666: thanks for the review - I think those issues can be dropped... thoughts?
<perrito666> jw4: if you can say why :p I can certainly agree (but i was meaning to add them as comments not issues)
<axw_> wallyworld: reviewed
<wallyworld> ty
<jw4> perrito666: sure - see if my responses make sense and if not we can keep them as issues
<perrito666> jw4: ill answer in rb so its easier to follow up
<jw4> perrito666: cool, tx!
<jw4> thanks perrito666 I'll update the PR with your feedback :)
<perrito666> jw4: the ones where I said nothing i am fine with dropping
<jw4> perrito666: cool
<wwitzel3> natefinch: well I got past juju just outright hanging, now I at least am getting "unknown object type Converter"
<wwitzel3> natefinch: with an error of not implemented .. but it doesn't tell me what is not implemented
<wwitzel3> natefinch: but getting closer :)
<wwitzel3> There is a way to tell logging not to redact responses right?
<natefinch> wwitzel3: dunno about redacting responses
<wwitzel3> natefinch: ok, well I'm trying to figure out what is "not implemented"
<wwitzel3> natefinch: right now I've got it starting from the machine agent with StartWorker but it is just in a restart loop with that error
<wwitzel3> natefinch: but again, that is better than hanging completely
<wwitzel3> natefinch: I can push what I have if you want to look
<natefinch> wwitzel3: sure
<natefinch> fwereade: got any time to give us a hand?  We're trying to wire up a watcher, and it's causing the whole of jujud to blow up for some reason.
<wwitzel3> natefinch: ok, pushed
<wwitzel3> natefinch: well it isn't blowing up anymore, now it is just in an isolated restart loop :)
<wwitzel3> improvement!
<natefinch> wwitzel3: repeated smaller explosions better than one big one?  I guess so :)
<alexisb> perrito666, you available for cloudbase call?
<fwereade> natefinch, go on
<fwereade> natefinch, first guess is that it's returning an error that is somehow triggering the isFatal check in its runner or something?
<natefinch> fwereade: could be, I'm just surprised that it would kill jujud before it could even write a log file
<natefinch> fwereade: https://github.com/natefinch/juju/blob/ha-to/cmd/jujud/agent/converter.go
<fwereade> natefinch, that ErrTerminateAgent looks to me like the problem
<natefinch> fwereade: this is a worker for machine agent for when we convert a normal server to a state server
<fwereade> natefinch, ErrTerminateAgent means "ok clean yourself up and never run again"
<fwereade> natefinch, returning just-some-error will exit nonzero and put ourselves in the hands of the init system, which should then bounce us
<fwereade> natefinch, so (assuming you do want to restart jujud, which seems sane) create some error that triggers isFatal, but don't use ErrTerminateAgent
<natefinch> ahh ok
<wwitzel3> oh well that's my fault
<wwitzel3> reading the code it sure does look like ErrTerminateAgent is handled like the other IsFatal errors
<wwitzel3> right along side the upgrade error
<fwereade> wwitzel3, and it is, at the runner level -- but cmd/jujud/agent/machine.go:360
<fwereade>     case worker.ErrTerminateAgent:
<fwereade>         err = a.uninstallAgent(agentConfig)
<wwitzel3> ahh
<wwitzel3> yeah, we don't want that
<fwereade> natefinch, wwitzel3: and FWIW it's my fault for not fixing ErrTerminateAgent when it became clear it was a problem waiting to happen
<fwereade> natefinch, wwitzel3: it's a Very Bad Thing that we return it from the Uniter, for example
<fwereade> natefinch, wwitzel3: it's not the uniter's job to decide whether the whole process should die
<fwereade> natefinch, wwitzel3: it's the uniter's job to inform whatever started it that the unit doesn't need/want to exist any more
<fwereade> natefinch, wwitzel3: and that code can then itself make a sober judgement as to whether uninstalling itself is really justified
<fwereade> natefinch, wwitzel3: and, if so, itself return ErrTerminateAgent to the top level directly
<fwereade> natefinch, wwitzel3: sane?
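A sketch of the suggestion, with an assumed error name: fatal enough that the runner stops and the init system bounces jujud, but distinct from worker.ErrTerminateAgent so the uninstall branch is never taken.

    package agent

    import "errors"

    // errRestartAgent makes jujud exit non-zero, putting it in the
    // hands of the init system, which then restarts it. Unlike
    // worker.ErrTerminateAgent it never reaches uninstallAgent.
    var errRestartAgent = errors.New("agent must restart to run state server jobs")

    func isFatal(err error) bool {
        // worker.ErrTerminateAgent would also be fatal, but it
        // additionally triggers uninstall -- exactly what the
        // converter must avoid.
        return err == errRestartAgent
    }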
<wwitzel3> fwereade: I'm working on a bit of a different problem than natefinch .. I'm refactoring the worker so it properly uses the api facade instead of calling state directly as in the proof of concept.
<wwitzel3> fwereade: but I'm running into a restart loop with the new converter work, getting back not implemented, but all the params and responses are redacted, so it is a bit hard to tell what's happening
<wwitzel3> fwereade: peppering logging bits throughout now :)
<fwereade> wwitzel3, ha
<fwereade> wwitzel3, that's a good thing in general
<wwitzel3> fwereade: yeah, well we wanted to get a proof of concept working first, make sure we were on the right path
<fwereade> wwitzel3, yeah, indeed
<fwereade> wwitzel3, I don't have any immediate insight there, though, I'm afraid
<wwitzel3> fwereade: think I found the problem
<wwitzel3> fwereade: it helps if you declare the facadeVersion in api/facadeversions.go when using a facade
<wwitzel3> :P
<fwereade> wwitzel3, haha
<wwitzel3> I could've sworn I did that
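[Sketch — roughly the declaration wwitzel3 means, assuming the map shape of api/facadeversions.go at the time; the "Converter" entry and version are illustrative, not actual juju code.]

    package api

    // A facade absent from this map surfaces client-side as the
    // "not implemented" errors described above, since the client has
    // no version to request for it.
    var facadeVersions = map[string]int{
    	// ...existing facades elided...
    	"Converter": 1, // hypothetical: each new facade must be declared here
    }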
<mup> Bug #1420049 changed: ppc64el - jujud: Syntax error: word unexpected (expecting ")") <deploy> <openstack> <regression> <uosci> <juju-core:Fix Released by axwalk> <juju-core 1.22:Fix Released by axwalk> <https://launchpad.net/bugs/1420049>
<mup> Bug #1428692 changed: cannot boostrap vivid: Operation failed: No such file or directory <ci> <local-provider> <regression> <vivid> <juju-core:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1428692>
<perrito666> natefinch: ?
<fwereade> by the way, can I trouble someone for another review of http://reviews.vapour.ws/r/1165/ please?
<fwereade> ideally someone who's worked in the uniter, but if not I can answer questions
<fwereade> wwitzel3, perhaps :) ^^
<dimitern> fwereade, I'll have a look
<fwereade> dimitern, tyvm
 * dimitern learned a new word today - "impending" :)
<fwereade> dimitern, that word goes very well with "doom" :)
<wwitzel3> haha
<dimitern> fwereade, I bet it does :D
<dimitern> fwereade, in the comment starting at line 56 in modes.go, you mention we shouldn't try to accept leadership if we already did, but the code below doesn't seem to check for that
<fwereade> dimitern, canAcceptLeader := !opState.Leader
<fwereade> dimitern, the ordering of the statements in the comment is misleading though
<dimitern> fwereade, also the check for Kind != Continue looks like we won't try if we're in any hook currently, not just pending hook
<fwereade> dimitern, hence "eg (= for example) pending hook" not "ie (= that is specifically) pending hook"
<dimitern> fwereade, right, I'll leave a comment to rephrase the comment a bit to match the impending code below it :)
<fwereade> dimitern, haha, thanks
<dimitern> fwereade, ok, fair enough
<fwereade> dimitern, I should rephrase that too though
<fwereade> dimitern, "if we're busy doing anything else at all, don't try to accept leadership"
<dimitern> fwereade, yeah, that'll be much better, cheers
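[Sketch — the guard as rephrased above, in self-contained form; the type and constant names approximate the review's code rather than quoting it.]

    package main

    import "fmt"

    // Kind approximates the operation kinds in the review; Continue means
    // "idle, nothing in progress".
    type Kind int

    const (
    	Continue Kind = iota
    	RunHook
    )

    type opState struct {
    	Leader bool
    	Kind   Kind
    }

    // "If we're busy doing anything else at all, don't try to accept
    // leadership" -- and don't re-accept if we already hold it.
    func canAcceptLeadership(s opState) bool {
    	return !s.Leader && s.Kind == Continue
    }

    func main() {
    	fmt.Println(canAcceptLeadership(opState{Kind: Continue}))               // true
    	fmt.Println(canAcceptLeadership(opState{Kind: RunHook}))                // false: e.g. pending hook
    	fmt.Println(canAcceptLeadership(opState{Leader: true, Kind: Continue})) // false: already leader
    }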
<mup> Bug #1433116 was opened: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1433116>
<sinzui> dimitern, natefinch We need someone to fix this critical ^. CI aborted testing because it cannot build all the packages. I am going to remove 386 from building to allow other tests to run
<dimitern> sinzui, natefinch, I'll have a look, but unfortunately won't be able to help much about it today
<jw4> sinzui: can you remind me how to run the tests using the gccgo compiler again?
<dimitern> jw4, add -compile gccgo
<jw4> dimitern: ta
<dimitern> jw4, sorry, -compiler gccgo
<jw4> kk
<mup> Bug #1433116 changed: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1433116>
<jw4> fwiw, go 1.4 seems incompatible with gccgo
<natefinch> jw4: yeah, I ran into that.... brought it up as a problem on golang-dev, and they weren't really interested in fixing it
<jw4> natefinch: lol, yeah, just noticed you were the reporter :)
<natefinch> jw4: heh
<natefinch> jw4: you can install go 1.2 side by side with 1.4 and use 1.2 as needed with gccgo... it's a little annoying, but not the end of the world
<jw4> natefinch: yeah - I just switched to my laptop with 1.2.1 installed
<dimitern> perrito666, natefinch, trivial review http://reviews.vapour.ws/r/1187/ - please take a look
<mup> Bug #1433116 was opened: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1433116>
<natefinch> dimitern: ship it!
<dimitern> natefinch, cheers!
<natefinch> also: good god gomaasAPI is a horrible cluster of a go package
<dimitern> it really is
<dimitern> it's written like a c library
<natefinch> it's written like they just translated the REST API directly into Go calls, so you have to understand the REST API in order to use the package, rather than making the package abstract away the REST API.
<dimitern> natefinch, another review - hopefully fixing bug 1433116 - http://reviews.vapour.ws/r/1188/
<mup> Bug #1433116: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1433116>
<perrito666> dimitern: add a comment there, it looks like one of those things one would remove on first sight thinking it's a noob error
<dimitern> perrito666, good point, will do
<dimitern> perrito666, updated
<dimitern> perrito666, and I have another one for you - http://reviews.vapour.ws/r/1189/ :) - last one for today I promise
<perrito666> dimitern: looking
<dimitern> perrito666, thanks!
<perrito666> dimitern: btw, int in i386 is int32?
<perrito666> somehow now I am not sure that your fix fixes the fix as you intend it to be fixed
<dimitern> perrito666, it is
<dimitern> perrito666, you mean because it will be negative perhaps?
<perrito666> dimitern: https://play.golang.org/p/2LEwN2pZ9X
<perrito666> dimitern: it blows
<perrito666> at least there
<perrito666> I would expect the value to become math.MaxInt32
<perrito666> actually I don't see a good reason for that value not being uint32 all the way (or in any case, an explicit int size)
<natefinch> dimitern, perrito666 : is there a reason it needs to be typed?
<dimitern> perrito666, good point - we could constrain it to math.MaxInt32 for i386 or use the 4GB default otherwise
<natefinch> +1 for using a math.Maxsomething
<dimitern> natefinch, not that I can think of
<perrito666> natefinch: yes, I believe it's good practice to be explicit when the code is going to run on archs with different int sizes
<perrito666> or you are just moving the overflow to the next guy
<dimitern> perrito666, natefinch, ok, I'll tweak it a bit by initializing the default depending on the arch
<natefinch> perrito666: constants never overflow, though
<perrito666> there is explicit information in the type "this number needs a box this big to fit"
<perrito666> natefinch: so says the doc
<perrito666> natefinch: you cannot guarantee the context that constant is going to be used in
 * perrito666 loves typing things
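[Sketch — a self-contained reproduction of the bug 1433116 overflow and the explicitly-sized fix. Go constants are untyped and arbitrary-precision, so the error only appears on assignment to a too-small type.]

    package main

    import (
    	"fmt"
    	"math"
    )

    // 4 GiB as an untyped constant; the constant itself never overflows.
    const logSizeBytes = 4294967296

    func main() {
    	// var n int = logSizeBytes
    	// ^ compiles on amd64 but fails on i386 with:
    	//   "constant 4294967296 overflows int"

    	var n int64 = logSizeBytes // explicit size: same behaviour on every arch
    	fmt.Println(n, "cap of a 32-bit int:", math.MaxInt32)
    }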
<voidspace> dimitern: back
<voidspace> dimitern: took longer than expected, sorry
<voidspace> dimitern: I'll have to work tonight
<voidspace> my new laptop arrived
<voidspace> have to set that up later :-)
<perrito666> voidspace: congrats, what did you get?
<voidspace> perrito666: Dell XPS 15
<voidspace> perrito666: it's basically as powerful as my desktop
<perrito666> nice
<voidspace> quad-core i7, 16gb ram
<dimitern> voidspace, hey, no problem
<dimitern> voidspace, I've discovered a nasty bug in our address allocation for maas - see http://reviews.vapour.ws/r/1189/
<voidspace> dimitern: ah, ok
<dimitern> voidspace, can you approve that btw?
<voidspace> dimitern: was it our bug then, and not theirs?
<voidspace> dimitern: will look
<voidspace> dimitern: hah, ouch
<dimitern> voidspace, yeah, it was - but not because their API docs were clear enough to follow :)
<voidspace> dimitern: LGTM
<dimitern> voidspace, thanks, setting to land
<voidspace> dimitern: ah, requested_address is actually the param name for claimstickyaddress op
<dimitern> voidspace, and for ipaddresses op=reserve as well apparently
<voidspace> dimitern: ah, I see
<voidspace> it's that way round
<voidspace> dimitern: ip is for release
<jw4> OCR PTAL http://reviews.vapour.ws/r/1181 (got a round of reviews from perrito666 already, just need graduated reviewer approval)
<dimitern> jw4, will look shortly
<dimitern> voidspace, yeah :)
<jw4> tx dimitern
<dimitern> perrito666, natefinch, voidspace - this should be a better fix for bug 1433116 I guess - http://reviews.vapour.ws/r/1188/
<mup> Bug #1433116: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1433116>
<dimitern> natefinch, perrito666, voidspace, no sorry, that's not good enough still - updated http://reviews.vapour.ws/r/1188/ again
<natefinch> dimitern: I wonder how much we care about the one extra byte on amd64 :/
<mup> Bug #1433161 was opened: feature request: support virtual services <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1433161>
<dimitern> natefinch, you mean 2 bytes I guess?
<dimitern> if not actually four
<dimitern> natefinch, re int vs int32?
<natefinch> dimitern: nevermind, I was thinking MaxUint32... actually it's a factor of 2
<dimitern> natefinch, ah, well 4GB limit for logs seems too much to me even on x86_64
<natefinch> #1433116
<mup> Bug #1433116: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1433116>
<voidspace> dimitern: LGTM
<dimitern> voidspace, thanks!
<natefinch> dimitern: is there a reason we couldn't change whatever is using int to int64?  I mean...  seems like what we intend, anyway
<dimitern> natefinch, that's not for me to say - I intend to fix the build failure on i386, not to change the assumptions
<natefinch> dimitern: the assumption is that we wqanted 4GB
<natefinch> dimitern: the architecture problem just shows why using "int" was a bad idea
<natefinch> dimitern: having the log half as big because you're on a different architecture does not seem like a design decision anyone would intentionally make
<perrito666> natefinch: we are punishing users for actually still using i386
<natefinch> perrito666: heh
<voidspace> perrito666: yep, that sounds reasonable
<natefinch> dimitern: seriously, though.  The fix to just change LogPruneParams to int64 seems way more simple and obvious
<voidspace> dimitern: you wanted me to repeat the upgrade test with a destroyed container
<voidspace> dimitern: and then see if the IPAddress is marked as Dead?
<perrito666> voidspace: of course it is; no savvy user should use archs older than the one my mom uses, and she has x86_64
<dimitern> natefinch, ok, can I ask you to comment on the proposal then and I won't land it as is, will leave it for thumper or menn0 to decide
<dimitern> voidspace, yes please
<voidspace> dimitern: I'll have to use direct mongo access to see I guess...
<voidspace> dimitern: unless you have a better idea
<dimitern> voidspace, not really
<dimitern> voidspace, you could just add more logging :)
<voidspace> dimitern: hah, well if we're going to trust *code* to tell us
<voidspace> dimitern: we could just trust the test
<voidspace> dimitern: but that's not a bad idea
<dimitern> voidspace, I forgot to mention that earlier actually - it's good to have logs during upgrade steps lest it go wrong
<voidspace> dimitern: I'll create a new branch and add logging
<dimitern> natefinch, thanks!
<dimitern> voidspace, sounds good
<natefinch> dimitern: welcome, sorry to be a pain in the butt :)
<dimitern> natefinch, not at all - I prefer a better solution than a quick and dirty fix :)
<dimitern> jw4, reviewed
<jw4> dimitern: gratzie
<perrito666> ok ill be OoO for a moment, be back later
<jw4> dimitern: I think the first two issues can be dropped?  I'll work on the third.
<jw4> dimitern: and yes, I'm also waiting for a sanity check from fwereade
<dimitern> jw4, let me have another look
<jw4> dimitern: ta
<dimitern> jw4, yeah, sounds good re the first two issues
<jw4> dimitern: cool
<dimitern> jw4, sorry, so do you intend to keep the second assert there?
<jw4> dimitern: it seems reasonable to me - it's asserting a subtly different thing than the first assert
<dimitern> jw4, if you need to verify the cause as well and there's a specific error you can check for, better use jc.Satisfies
<jw4> dimitern: oh, interesting
<jw4> dimitern: since I only touched this because wording changed elsewhere, I'm not keen to change the tests too much.
<jw4> dimitern: but I'm willing if you feel strongly about it
<dimitern> jw4, not that strong :)
<jw4> dimitern: k
<dimitern> voidspace, I guess you're afk now
<voidspace> dimitern: no, here
<dimitern> voidspace, ah, ok :)
<frankban> ocr could you please take a look at https://github.com/juju/charm/pull/94 when you have time? thanks!
<voidspace> hmmm... so the bad news is that maas seems to have crapped out pretty badly and looks like I need to reinstall
<voidspace> the good news is that I've started the ubuntu install on the xps 15...
<voidspace> hah, although looks like I need a mouse as the trackpad isn't working
<voidspace> or maybe I can proceed without it
<mattyw> voidspace, any idea if dimitern has gone for good?
<mattyw> (for good meaning - today)
<voidspace> screen is lovely
<voidspace> mattyw: yes, gone for the day
<mattyw> voidspace, ok thanks - screen as in < tmux?
<mattyw> perrito666, you available to do a review for me?
<perrito666> mattyw: sure
<perrito666> mattyw: shoot
<mattyw> perrito666, http://reviews.vapour.ws/r/1005/
<mattyw> perrito666, thanks very much
<mattyw> perrito666, it has an LGTM but I made some largeish changes since
<perrito666> mattyw: it has a ship it
<perrito666> ah ok
 * perrito666 wonders how mattyw did large changes in such a short patch
<mattyw> perrito666, I try to keep my patches short and focused
<mattyw> perrito666, it was largish in that it was > 60% of the patch changed
<perrito666> so to make a change large you add a lot and then remove it? :p
<voidspace> mattyw: screen as in - the physical screen on my new laptop (Dell XPS 15 - currently installing ubuntu on it)
<mattyw> voidspace, oh right, I saw you twittering about that I think
<mattyw> voidspace, be interested to see how you get on, I like dells - and I think I'm due a new one at the end of the year
<ericsnow_> voidspace, mattyw: natefinch and I have that same laptop, I believe
<ericsnow_> voidspace, mattyw: and axw too
<mattyw> ericsnow_, is that the one you had in brussels?
<ericsnow_> mattyw: yep
<mattyw> ericsnow_, ah yes - on the laptop with the insane resolution :)
<ericsnow_> mattyw: to which I forgot the power cord and had to beg hits off the other two guys :)
<perrito666> mattyw: I have some concerns; I reviewed it
<mattyw> perrito666, ok thanks
<mattyw> perrito666, yeah - that type thing is the thing that's keeping me from landing this branch - everyone has a different idea for how it should look - and it started off not being part of my change :)
<mattyw> perrito666, good shout about the maps - I'd forgotten about that
<perrito666> mattyw: in my pov this opens the possibility of garbage strings coming down the line; the specific type sends a message of what you are expecting to come down the line
<perrito666> mattyw: that is the justification for my objection, in case it helps
<mattyw> perrito666, that's right, but so does type MeterStatusCode string
<mattyw> perrito666, I don't disagree with you at all
<mattyw> perrito666, I'm just trying to find a way of fixing it
<perrito666> mattyw: well a function expecting to receive (or return) a string is much less useful in terms of information than one expecting MeterStatusCode
<perrito666> you are making a statement in terms of the content of that string :)
<perrito666> natefinch: you can break the tie?
<perrito666> anyway, i need to get afk for a moment, bbl
<mattyw> perrito666, no problem
<natefinch> perrito666:  reading scrollback
<natefinch> wwitzel3: any luck?
<mup> Bug #1433254 was opened: manual provider on trusty/precise syntax error near unexpected token `then' <manual-provider> <ppc64el> <systemd> <juju-core:In Progress by ericsnowcurrently> <juju-core 1.23:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1433254>
<marcoceppi> I've got a general golang question
<marcoceppi> I want to get the http status code of a URL, I'm guessing this is the http library but I'm not sure how to check response code
<jw4> marcoceppi: golang.org/pkg/net/http/#Get
<jw4> when you issue the Get on the url you get back a *Response
<jw4> which has Status among other things
<marcoceppi> jw4: awesome, thanks! I'll give that a /go/
<marcoceppi> bwhahah
<jw4> lol
<natefinch> you can tell a newbie gopher by all the bad jokes
<jw4> natefinch: lq - marcoceppi is a ninja so I'm going to lq (laugh quietly) instead of lol
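[Sketch — roughly what jw4 describes, using only the standard net/http API; the URL is illustrative.]

    package main

    import (
    	"fmt"
    	"log"
    	"net/http"
    )

    func main() {
    	resp, err := http.Get("https://example.com") // any URL; this one is just an example
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer resp.Body.Close()

    	fmt.Println(resp.Status)     // e.g. "200 OK"
    	fmt.Println(resp.StatusCode) // e.g. 200
    }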
<mup> Bug #1433244 was opened: MAAS should handle RAM upgrades without decommissioning / recommissioning existing nodes <openstack> <uosci> <juju-core:New> <MAAS:Won't Fix> <https://launchpad.net/bugs/1433244>
<mup> Bug #1433244 changed: MAAS should handle RAM upgrades without decommissioning / recommissioning existing nodes <openstack> <uosci> <juju-core:New> <MAAS:Won't Fix> <https://launchpad.net/bugs/1433244>
<ericsnow_> OCR: PTAL http://reviews.vapour.ws/r/1190/
<ericsnow_> perrito666: ^^^
<ericsnow_> good morning, thumper
<thumper> o/
<ericsnow_> thumper: when you get settled, I have a quick question for you
<thumper> better be quick
<thumper> stand up is now
<ericsnow_> thumper: I can wait
<mup> Bug #1433244 was opened: MAAS should handle RAM upgrades without decommissioning / recommissioning existing nodes <openstack> <uosci> <juju-core:New> <MAAS:Won't Fix> <https://launchpad.net/bugs/1433244>
<ericsnow> menn0, perrito666: PTAL http://reviews.vapour.ws/r/1190/
<menn0> ericsnow: i'll take a look after i've finished with the review i'm currently looking at
<ericsnow> menn0: thanks!
<perrito666> ericsnow: I added one comment to the bash stuff but my brain is jello; I cannot in good conscience review embedded bash and produce useful output
<ericsnow> perrito666: np :)
<thumper> cmars: chat today?
<rick_h_> thumper: he's out this week
<thumper> rick_h_: cheers, I guess that is a no then :)
<thumper> rick_h_: my daughter is wanting me to help with a school thing on Friday
<thumper> rick_h_: any chance we can reschedule our meeting?
<rick_h_> thumper: we can try tomorrow but have to ask urulama|afk as it's late late his time
<thumper> rick_h_: ok
<thumper> rick_h_: alternatively I could try to make it earlier
<thumper> rick_h_: say... if I got up at 6am for the call
<thumper> rick_h_: it would make it earlier for urulama|afk and you
<thumper> as long as it doesn't clash with other calls you may have
<thumper> rick_h_: so... 3 or 3.5 hours earier than currently specified
<rick_h_> thumper: well I'm booked honestly. You can see if there's a slot that works for you but my wed/thurs are call days
<thumper> ah
 * thumper tries to see rick_h_'s calendar
<rick_h_> thumper: if you can find a slot happy to move it around
<thumper> rick_h_: I see a slot there...
 * thumper calculates timezones
<thumper> rick_h_: are you EDT?
<rick_h_> thumper: yep
<rick_h_> feel free to take it then
<thumper> 1pm? your time
<rick_h_> thumper: wfm
<rick_h_> and urulama|afk looks open
<thumper> I think that would be 6 or 7pm for ubuntulog2
<rick_h_> so happy to move it earlier then
<rick_h_> lol
<thumper> ok, lets do it for this week
 * thumper moves
 * thumper sighs
<thumper> perhaps ubuntulog2 doesn't want to come
<thumper> unless it can automatically minute our meeting
<thumper> rick_h_: moved
<rick_h_> thumper: yay, I can't wait :)
<thumper> heh
<menn0> ericsnow: done. lots of problems with the bash code unfortunately.
<ericsnow> menn0: k, thanks
<mup> Bug #1433336 was opened: juju deploy tags are persistent accross juju deploys <cts> <juju-core:New> <https://launchpad.net/bugs/1433336>
<ericsnow> menn0: are you sure about `[[ ! $? ]]` ?
<ericsnow> menn0: I'm pretty sure you're right in the case of a single bracket, but the double bracket works
<menn0> ericsnow: yep. I just included an example in reply to your comment.
<ericsnow> menn0: weird. your example works for me
<ericsnow> menn0: ah, it's the ! that is messing things up
<thumper> ericsnow: the thing I found with bash is to always use -ne or -eq
<thumper> never test unadorned
<thumper> also, I always quote args, "$1"
<thumper> never just $1
<ericsnow> thumper: yep
<thumper> in [[ ]] at least
<ericsnow> bash is a minefield
<thumper> also... I hate bash scripts
<thumper> true that
<ericsnow> menn0's got my back :)
<alexisb> wallyworld, ping
<wallyworld> alexisb: hey
<alexisb> heya
<alexisb> I just sent you a mail
<wallyworld> ok, looking
<alexisb> the error looks familiar
<alexisb> does it to you as well?
<menn0> ericsnow: strange that it works differently for you
<menn0> ericsnow: could be a bash version thing or a shopt thing
<menn0> ericsnow: i'd still play it safe and use the more conventional check
<ericsnow> menn0: no, it was only without ! that it worked the way I expected
<ericsnow> menn0: agreed
<wallyworld> alexisb: the series string is missing from the binary version
<wallyworld> what are they running juju on?
<alexisb> a maas setup on a VM
<alexisb> and I can't repro it on my env
<ericsnow> menn0: all fixed
<wallyworld> alexisb:  no, not surprised, it's a setup issue of some sort, but i'll need to dig a little to find root cause
<alexisb> I don't want to bother you, but can you give me some pointers on what to look at?  It is on a partner's test setup in class and it is bugging me that I can't figure out what he has done different
<menn0> ericsnow: looking again
<ericsnow> menn0: ta
<wallyworld> alexisb: no problem at all. can we run sync-tools with --debug
<wallyworld> so we can see where it's looking for the tools tarballs
<wallyworld> it could be a badly named file
<thumper> ericsnow, menn0: what is the first line of the discovour script?
<thumper> by default, it runs through /bin/sh right?
<thumper> which is dash on some servers
<thumper> not bash
<menn0> #!/usr/bin/env bash
<thumper> just a query, but I do wonder if bash is really bash on some servers
<thumper> it may not be
<menn0> it would be pretty awful if it wasn't
<thumper> yes
<thumper> perhaps it was changing sh for dash not bash
<thumper> that I recall
<ericsnow> thumper: that sounds plausible
 * menn0 nods
<menn0> what you get with /bin/sh is a little "flexible"
<alexisb> wallyworld, getting logs for you
<menn0> but is supposed to be posix compatible sh
 * thumper sighs
<thumper> apiserver tests take over two minutes
<wallyworld> alexisb: ty, and then pastebin please
<thumper> I wish this was faster
<menn0> directly asking for bash should be safe enough though
<alexisb> o crap wallyworld I just emailed
<wallyworld> alexisb: never mind, email works :-)
 * thumper swears at the tests
<ericsnow> davecheney: could you take another look at http://reviews.vapour.ws/r/1172/?
<wallyworld> alexisb: just for kicks, can you please try "sudo apt-get install distro-info-data"
<wallyworld> on the machine running sync-tools
<alexisb> yep one sec
<wallyworld> i suspect the machine doesn't know about vivid
<alexisb> wallyworld, already installed
<alexisb> didn't make a difference
<menn0> ericsnow: done. just one question but ship it otherwise.
<ericsnow> menn0: thanks
<wallyworld> alexisb: ok, i'll dig into the code and get back to you
<wallyworld> alexisb: also, bug 1433336 - it's working as designed, constraints used at bootstrap time become default constraints from then on
<mup> Bug #1433336: juju deploy tags are persistent accross juju deploys <cts> <juju-core:New> <https://launchpad.net/bugs/1433336>
<alexisb> wallyworld, thanks, not urgent, just bugging us :)
<wallyworld> so if you want to bootstrap to a machine using tags, you'll need to set the constraints again afterwards
<alexisb> wallyworld, can you add a note to bug 143336
<mup> Bug #143336: __bobo_traverse__ and ftp/webdav <bug> <zope> <Zope 2:Invalid> <https://launchpad.net/bugs/143336>
<wallyworld> yep
<alexisb> thanks
<wallyworld> i'll find this current issue first
<alexisb> wallyworld, I know you guys are busy, so really not urgent stuff, but we got lots of feedback from classmates on juju stuff (from core to gui) so I was trying to get info back to the team
<wallyworld> alexisb: that sort of feedback is very much desired
<wallyworld> was just giving you a heads up on the reason for the bug
<wallyworld> i agree the usability kinda sucks in that case
<thumper> oh for the love of <insert deity>
 * thumper headdesks
<wallyworld> alexisb: i think that the maas machine needs to have that distro info package installed as well
<wallyworld> alexisb: i suspect the standard maas images don't have it
<alexisb> wallyworld, I installed distro info on both the maas machine and the bootstrapped machine that has the maas image, and it makes no difference.
<wallyworld> alexisb: trouble is, juju will now have cached its knowledge of the distro info, so you will need to bounce the agent
<wallyworld> sudo service juju-agent restart   <-- i can't recall OTTOMH what the exact service name is
<wallyworld> look in /etc/init
<menn0> ericsnow: I just added one more comment. Summary: you should use utils.ShQuote instead of %q or "%s"
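[Sketch — the difference menn0 is pointing at, assuming ShQuote from github.com/juju/utils, which emits a single-quoted sh literal; %q emits Go-style double quotes, inside which the shell still expands $ and backticks.]

    package main

    import (
    	"fmt"

    	"github.com/juju/utils"
    )

    func main() {
    	path := `/tmp/it's $HOME`

    	// %q produces Go syntax: cat "/tmp/it's $HOME" -- the shell
    	// would still expand $HOME inside those double quotes.
    	fmt.Printf("cat %q\n", path)

    	// ShQuote produces a single-quoted sh literal, passed verbatim.
    	fmt.Printf("cat %s\n", utils.ShQuote(path))
    }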
<wallyworld> alexisb: how'd you get on? ping if you're still stuck
<alexisb> still not working
<alexisb> but they are going to kick us out soon
<alexisb> wallyworld, I will try to repo tomorrow
<wallyworld> alexisb: ok, let me know how you go etc
<ericsnow> menn0: would you mind also taking a look at http://reviews.vapour.ws/r/1172/
<wallyworld> thumper: perrito666 says you owe him an email?
<thumper> yeah...
 * thumper has been distracted
<thumper> ohh... shiney
<menn0> ericsnow: will do. i just have to take care of a few other things first.
<ericsnow> menn0: no worries :)
#juju-dev 2015-03-18
<thumper> fark
<thumper> I thought this was a small branch
<thumper> $ git diff master | wc -l
<thumper> 973
<thumper>  15 files changed, 467 insertions(+), 174 deletions(-)
<thumper> hmm... menno is reviewer
<thumper> no worries then
 * thumper goes to write more tests :)
<menn0> thumper: screw you hippie :)
 * thumper bows
<thumper> oh fuck
<thumper> hmm...
 * thumper found an interesting bug^Wfeature
<thumper> FINE
<thumper> IF THAT IS THE WAY IT'S GOING TO BE...
 * thumper beats the code into submission
<jw4> fwereade: if you get a chance in the morning could you review http://reviews.vapour.ws/r/1181/ ?
<jw4> fwereade: I've updated with good feedback from perrito666 and dimitern
<thumper> menn0: got a few minutes to discuss a testing approach?
 * thumper grumbles
<thumper> davecheney: can I pass around an unbound method as a func to call on a target?
 * thumper recalls the following
<thumper> type foo struct {}
<thumper> func (f *foo) method(param string) error {...}
<thumper> foo.method -> func (*foo, string) error
<thumper> ...
<thumper> that'll work
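[Sketch — what thumper is recalling is a Go method expression: referencing a method through its type yields a plain function whose first parameter is the receiver. Note it is written (*foo).method when the receiver is a pointer.]

    package main

    import "fmt"

    type foo struct{ name string }

    func (f *foo) method(param string) error {
    	fmt.Println(f.name, param)
    	return nil
    }

    func main() {
    	// Method expression: the receiver becomes an explicit first argument.
    	var fn func(*foo, string) error = (*foo).method
    	_ = fn(&foo{name: "target"}, "hello")
    }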
<menn0> thumper: back now. standup hangout?
<thumper> yea
<mup> Bug #1433384 was opened: utopic unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1433384>
<mup> Bug #1433384 changed: utopic unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1433384>
<mup> Bug #1433384 was opened: utopic unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1433384>
<thumper> menn0: what do you think about having the function api.OpenWithLoginV1 exported publicly?
<thumper> menn0: that way we can call it from the apiserver package
<menn0> thumper: it's a little ugly but it doesn't bug me that much
 * thumper nods
<menn0> thumper: perhaps it should be OpenWithVersion
<thumper> but that implies it takes an arg
<thumper> but it doesn't
<menn0> thumper: would that make sense? then there'd be just one call to let people get at any version.
<thumper> hmm...
<thumper> perhaps
<menn0> thumper: but could it take an arg
<menn0> ?
<thumper> yes
<thumper> we could switch on the version
<thumper> and error on unknown
<menn0> thumper: why do you really need to export it from apiserver?
<thumper> it is an api client function right now
<thumper> and I want to use it in the apiserver package
 * thumper hunts for some books of equal height
<menn0> thumper: right I see.
 * thumper is trying standing up
<thumper> Beginning GIMP is the same thickness as Generative Programming
 * thumper is winning
<menn0> :)
<thumper> hmm... that's better
<thumper> keyboard was a little low before
<menn0> thumper: just thought some more about it
<thumper> yus...
<menn0> thumper: and it seems ok
<thumper> which bit?
<menn0> thumper: esp if there's a docstring saying that it's mainly for testing
<thumper> exporting one that takes a version
 * thumper nods
<thumper> will do
<menn0> thumper: either is ok
<thumper> and I'll have it take a version number
<thumper> I think it is a little more flexible
<menn0> thumper: but i prefer the version taking form
<thumper> ack
 * thumper takes a deep breath and ignores a little mess
 * thumper makes a note to clean it up next
<thumper> menn0: I'm not sure I'm going to get all this done before 6 today
<thumper> menn0: as I have a parent-teacher interview at 4
<thumper> and a class at 5
<thumper> but I'll finish it off tonight
<thumper> menn0: I don't expect you to review it today (maybe tomorrow) :)
<menn0> thumper: np
<axw> wallyworld: http://reviews.vapour.ws/r/1193/
<wallyworld> ok
<menn0> anyone able to review this? http://reviews.vapour.ws/r/1192/
<menn0> it's a fix for a problem I introduced that is preventing compilation on i386
<wallyworld> axw: i've done the go-amz mods to support volumes, writing tests. had to update v4-unstable branch with latest v3 changes, conflicts due to stupid paths in imports. that shits me no end with Go
<wallyworld> menn0: looking
<thumper> menn0: class is at 6 not 5
<thumper> so I'm about to push this branch up
<menn0> thumper: ok, bring it on
<thumper>  20 files changed, 588 insertions(+), 192 deletions(-)
 * menn0 warms up his review issue fingers
<wallyworld> menn0: i just left a suggestion, not that important
<menn0> wallyworld: good idea. i'll do that.
<thumper> hmm...
<thumper> why is reviewboard not picking that up
<thumper> menn0: https://github.com/juju/juju/pull/1868
<menn0> thumper: cool. i'll do that next.
<menn0> looks like CI is blocked at the moment
<menn0> bug 1433384
<mup> Bug #1433384: utopic unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1433384>
<menn0> thumper: your PR didn't make it to RB?
 * thumper shrugs
<thumper> NFI
<menn0> thumper: i'll review on GH
<thumper> this standing desk thing has my shoulders aching
<thumper> less hunching and more hanging I guess
<rick_h_> thumper: time to go to walking desk :)
<ericsnow> thumper: FYI, that RB problem is a bug in my hook code where it cannot handle unicode properly (yay 7 year old Python version)
<ericsnow> thumper: from time to time Github sends unicode in the request (or we have some in our code somewhere)
<ericsnow> thumper: once the vivid stuff settles down I'm planning on fixing it
<wallyworld> axw: reviewed, sorry about delay, pool man was here
<axw> wallyworld: nps, thanks
<thumper> ericsnow: ah... I was wondering what it was
<ericsnow> thumper: yeah, it's the most common reason why the hook fails
<thumper> ok, handy to know
<thumper> I'm going to head off to my boxing class now
<thumper> time to hit things hard
<menn0> thumper: review done
<ericsnow> thumper: the other is when a branch get's rebased against an incompatible base revision to the original
<thumper> menn0: awesome thanks
<thumper> menn0: I was wondering about the restricted root ordering
<thumper> it seemed weird to do work that you were going to throw away
<thumper> but I'll look for the method first
<thumper> for consistency
<thumper> with the other methods
<ericsnow> menn0: thanks for pointing out filepath.EvalSymlinks
<ericsnow> menn0: I've updated the patch
<menn0> thumper: yeah it does looks weird but I did put some thought into it for upgradingRoot
<thumper> laters peeps
<thumper> menn0: ack, will address tomorrow
<menn0> ericsnow: no problems
<ericsnow> thumper: o/
<menn0> ericsnow: i've got to finish up in a sec so i might not be able to get back to your PR until tomorrow
<ericsnow> menn0: no worries
<ericsnow> menn0: thanks for the reviews
<ericsnow> menn0: we kept you busy today
<ericsnow> menn0: I'll probably have someone else wrap up the review so that we can maybe release 1.23b1 tomorrow
<jw4> menn0: thanks for the review, and the kind words :)
<menn0> ericsnow: np
<menn0> jw4: np
<menn0> ericsnow: i just managed to sneak another review in
<menn0> ericsnow: one problem, but otherwise ship it
<ericsnow> \o/
<ericsnow> menn0: a pity review :)
<wwitzel3> ericsnow: what are you still doing here?
<ericsnow> wwitzel3: just checking on reviews before going to bed :)
<axw> wallyworld: I've proposed another couple of branches. I'll now look at the last bits required to populate filesystem info for filesystems-on-volumes
<axw> can't land anything atm, as trunk is blocked
<wallyworld> axw: ok, i'm still plowing through writing tests for the go-amz work, have to go to soccer soon, may look after i get back. i expect tomorrow we'll have EBS volume source working
<axw> wallyworld: awesome :)
<wallyworld> axw: oh, and that will include attach, detach
<axw> great
<axw> wallyworld: I haven't removed VolumeParams.Attachment yet, will defer that for now
<axw> but it's still necessary to have separate attach/detach anyway
<wallyworld> axw: sure, np, yep
<wallyworld> and it's not much extra work to do it up front
<dimitern> wallyworld, hey there, do you know if thumper's on leave?
<wallyworld> dimitern: he was here before, but it's past his EOD now
<dimitern> wallyworld, yeah, I was wondering about that i386 issue, but I've seen menn0 proposed a fix for it.. and now we have another blocker
<wallyworld> i saw the i386 fix (reviewed it) but hadn't seen the other blocker
<dimitern> #1433384 - some leadership fallout it seems
<mup> Bug #1433384: utopic unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1433384>
<dimitern> fwereade, hey
<fwereade> dimitern, heyhey
<dimitern> fwereade, I wonder if you can clarify the logic around worker/leadership/tracker_test.go - the durations chosen
<dimitern> fwereade, seems like refreshes() picks a halfRefreshes on the order of a few nanoseconds, which leads to frequent failures in TestWaitMinionBecomeMinion
<dimitern> fwereade, that's related to the current blocker for 1.23 - bug 1433384
<mup> Bug #1433384: utopic unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1433384>
<fwereade> dimitern, hmm, did I screw up the logic? it ought to always pick (2n+1)/2 * coretesting.ShortWait for integer n
<dimitern> fwereade, hmm so that looks like it needs to yield a duration around 1.5 - 2.5 x shortWait
<fwereade> dimitern, yeah, it shouldn't ever return anything less than half shortWait
<fwereade> dimitern, and looking at the code I can't immediately see how it's not doing that
<dimitern> fwereade, depending on n ofc; the trouble is if you run the worker/leadership tests a few times - that same test (and only that one, as far as I can see) fails - it happens to me about 50% of the time
<dimitern> fwereade, hmm, actually it's a lot less than 50%
<dimitern> fwereade, can you try running this on 1.23 in worker/leadership?: time go test -c && for i in `seq 50`; do ./leadership.test & sleep 0.01; done
<fwereade> dimitern, yeah, it looks like I *can*induce failures, but I'm not seeing the one you mentioned above
<dimitern> fwereade, but is the ultimate failure the same - got unexpected readiness: true ?
<mattyw> morning all
<dimitern> mattyw, o/
<fwereade> dimitern, afraid not -- I'm seeing occasional failures to unblock the release goroutine in TestFailGainLeadership
<fwereade> mattyw, o/
<fwereade> dimitern, and they're not guaranteed to trigger, whether running with seq 50 or seq 500
<dimitern> fwereade, :) ok, there goes my one clue of the cause of the failure
<dimitern> fwereade, I'll experiment on i386 to see if it makes a difference
<fwereade> dimitern, hmm, seq 5000 seems to be stressing it a bit more, I managed to get 21 failures -- but *all* of them are in unblockRelease :/
<dimitern> fwereade, are you using go 1.2.1 from the archive?
<fwereade> dimitern, hmm, looks like I'm on 1.2 actually
<fwereade> dimitern, (I think the unblockRelease *is* an issue actually -- we ought to have a generous timeout there instead of the default: case)
<dimitern> fwereade, right, sounds fair
<fwereade> dimitern, yeah, with coretesting.LongWait in unblockRelease I can do seq 5000 without failures
<dimitern> fwereade, nice! well, if I have success with that other test on i386 I can propose a fix for both
<fwereade> dimitern, cool, thanks
<dimitern> fwereade, it gets weirder.. on a i386 machine even with seq 5000 I couldn't reproduce the TestWaitMinionBecomeMinion failure
<fwereade> dimitern, blargh
<dimitern> fwereade, I had ~25 failures and all of them related to unblockRelease
 * fwereade out, laura's school, bbs
<dooferlad> dimitern, voidspace, TheMue: hangout?
<voidspace> dooferlad: yes
<dimitern> dooferlad, sorry, omw
<natefinch> I need to be able to alias juju destroy-enviroment to juju destroy-environment
<natefinch> ...or learn to type/spell
<mattyw> natefinch, juju burn-more-fossil-fuels?
<mattyw> natefinch, would you mind taking another look at http://reviews.vapour.ws/r/1005/? I'm going to fix the test up - but I'll do it in a follow up
<mup> Bug #1433566 was opened: Precise unit test failure discoverySuite.TestDiscoverInitSystemScript  <ci> <precise> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1433566>
<voidspace> virt-manager changes the owner of the iso images you create kvm images from!
<dimitern> voidspace, yeah, because they're added to the libvirt storage pool
<voidspace> dimitern: the iso image?
<voidspace> that seems unfriendly
<dimitern> voidspace, indeed
<voidspace> dimitern: I can now create a storage pool - but can't read the iso image. Which is odd. I think it's because the iso image is behind a symlink
<voidspace> well, copying elsewhere worked - for whatever reason
<dimitern> voidspace, :) go figure
<voidspace> dimitern: I think the type of access it uses doesn't like symlinks
<voidspace> block level
<dimitern> voidspace, why are there symlinks in there?
<voidspace> dimitern: I moved the directory and left a symlink in the old location
<voidspace> dimitern: ~/Downloads
<voidspace> dimitern: it's where the iso image is
<voidspace> dimitern: for the storage volumes I used bind mount to move them
<voidspace> dimitern: which may or may not be necessary - but works
<voidspace> dimitern: I just needed to set the right owner/group on the files as well
<dimitern> voidspace, hmm bind mounts should work better than symlinks
<dimitern> voidspace, can't you specify default owner/perms at mount time of the bind?
<voidspace> dimitern: the bind mount is working fine
<voidspace> dimitern: the ~/Downloads symlink is also fine normally
<voidspace> dimitern: it's just that because it's a symlink virt-manager can't get block level access to iso images there
<voidspace> dimitern: I should have specified the full path instead of via the symlink - just habit
<dimitern> voidspace, ah, I got you
<dimitern> fwereade, are you around?
<mup> Bug #1433577 was opened: Vivid unit tests need to pass <ci> <test-failure> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1433577>
<mup> Bug #1433577 changed: Vivid unit tests need to pass <ci> <test-failure> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1433577>
<mup> Bug #1433577 was opened: Vivid unit tests need to pass <ci> <test-failure> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1433577>
<natefinch> mattyw: that severity is different than the severity you were using before... is that right?
<mattyw> natefinch, it's changed a little bit; the logic changed along with it, yes
<mattyw> natefinch, I think this order makes more sense
<ericsnow> dimitern: FYI, bug 1433577 looks like the same issue as bug 1433384, but on vivid
<mup> Bug #1433577: Vivid unit tests need to pass <ci> <test-failure> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1433577>
<mup> Bug #1433384: unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:In Progress by fwereade> <https://launchpad.net/bugs/1433384>
<dimitern> ericsnow, well part of it is
<ericsnow> dimitern: right
<dimitern> ericsnow, fwereade is already on #1433384 btw
<mup> Bug #1433384: unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core:In Progress by fwereade> <https://launchpad.net/bugs/1433384>
<ericsnow> dimitern: ah, great
<dimitern> ericsnow, I wanted to check something with you re systemd support
<ericsnow> dimitern: sure
<dimitern> ericsnow, looking at https://github.com/juju/juju/blob/master/container/lxc/clonetemplate.go#L129-L172
<ericsnow> dimitern: yep
<dimitern> ericsnow, and the corresponding test - https://github.com/juju/juju/blob/master/container/lxc/lxc_test.go#L811-L841
<dimitern> ericsnow, do you think this will work for systemd? I mean for upstart it does (it's live tested), but I wasn't sure if you can have multiple commands in ExecStart (provided all of them have abs. paths)
<ericsnow> dimitern: systemd only allows a single command for ExecStart (or any Exec*)
<ericsnow> dimitern: however, the systemd code in juju turns a multi-line ExecStart into a separate shell script that ExecStart then calls
<dimitern> ericsnow, so it will work?
<ericsnow> dimitern: it should work fine; I'll look more closely at it in a minute
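[Sketch — the approach ericsnow describes, as a hypothetical helper; the function name and path are illustrative, not juju's actual service/systemd API.]

    package main

    import (
    	"fmt"
    	"io/ioutil"
    	"strings"
    )

    // normalizeExecStart is a hypothetical helper: systemd allows exactly
    // one command per ExecStart, so a multi-line command is written out as
    // a shell script and ExecStart points at that single script instead.
    func normalizeExecStart(dir, cmds string) (string, error) {
    	if !strings.Contains(cmds, "\n") {
    		return cmds, nil // already a single command
    	}
    	script := dir + "/exec-start.sh" // illustrative location
    	body := "#!/usr/bin/env bash\n\n" + cmds + "\n"
    	if err := ioutil.WriteFile(script, []byte(body), 0755); err != nil {
    		return "", err
    	}
    	return script, nil
    }

    func main() {
    	execStart, err := normalizeExecStart("/tmp", "mkdir -p /srv/demo\nexec /usr/bin/demo")
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println("ExecStart=" + execStart)
    }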
<perrito666> ericsnow: wwitzel3 ?
<dimitern> ericsnow, thanks!
<mup> Bug #1433615 was opened: Manual provider fails to bootstrap: failed to check provisioned status ... /: Is a directory <bootstrap> <ci> <manual-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1433615>
<alexisb> natefinch, ericsnow, fwereade, dimitern thank you all for tackling critical bugs for 1.23 quickly
<ericsnow> alexisb: blame my feeling of guilt :)
<alexisb> ericsnow, given the number of features we landed in 1.23 we will be seeing a fair amount of guilt :)
<alexisb> we will just need to tackle the bugs quickly and get things hardened and stable, ready for the release
<dimitern> ericsnow, :) guilt usually works for me as well
<dimitern> alexisb, thanks!
<ericsnow> fwereade: you free to meet briefly?
<fwereade> ericsnow, oops, sorry, with you in 2
<ericsnow> fwereade: no worries
<fwereade> mgz, sinzui: FATAL: Unable to delete script file /tmp/hudson7105716500738528055.sh ?
<natefinch> gah, why the heck is machine 0 running without the environmanager job when it first boots up? :/
<natefinch> oops, nevermind... figured out what I was doing wrong :)
<natefinch> man I hate it when people type assert stuff and then if the type assert fails... don't tell me what type it actually was in the error - https://github.com/juju/juju/blob/master/apiserver/watcher.go#L90
<jw4> natefinch: at least log the type right?
<jw4> natefinch: not sure if it's a potential security issue to reveal the type in the error?
 * jw4 out for a couple hours
<fwereade> dimitern, can you clarify http://reviews.vapour.ws/r/1165/diff/3/?file=42926#file42926line396 ?
<fwereade> dimitern, that's a comment I added (in what I thought was the right place) to address my remembering someone had once asked why we did that, and that it wasn't immediately clear from context
<fwereade> dimitern, if I explained it nearer the top of the method it'd be further away from the code it's explaining and likely to rot even faster than most comments
<natefinch> fwereade: I hate that this code doesn't differentiate between "not found" and "wrong type" :/  https://github.com/juju/juju/blob/master/apiserver/watcher.go#L90
<fwereade> natefinch, I think the error is reasonable for both cases, but yes, when debugging you'd probably want to know more
<fwereade> abentley, if you're around I appear to have broken the bot with http://juju-ci.vapour.ws:8080/job/github-merge-juju/2552/
<abentley> fwereade: looking...
<abentley> fwereade: do you have an idea what the cause is here?  I don't understand why it can't delete the script.  I can (try to) delete the script, but I imagine the same thing will happen with the next run.
<fwereade> abentley, I have no idea, I'm afraid
<abentley> fwereade: AFAICT, the script is being deleted by jenkins, and it has 0644 with ownership jenkins:jenkins, and /tmp (the parent directory) is 0777
<abentley> fwereade: I might as well try
<abentley> fwereade: I've deleted the script and triggered a rebuild.
<abentley> fwereade: It seems to be getting further: http://juju-ci.vapour.ws:8080/job/github-merge-juju/2553/console
<fwereade> abentley, cool, thanks
<voidspace> dimitern: ping
<voidspace> dimitern: you shouldn't be here even if you are :-)
<voidspace> dimitern: but just in case...
<voidspace> dimitern: running upgrade with loggging shows that the life field is correctly set to "dead" for a destroyed container IP address
<voidspace> dimitern: so I'm proposing the logging branch (just two logging lines)
<voidspace> dimitern: and moving onto the worker
<natefinch> you know it's bad when I can't even tell WTF is going on with a full traceback.
<natefinch> this watcher code is a twisted mess
<perrito666> natefinch: there are worse, beware of what you wish
<perrito666> I have been around watcher when changin state, need a hand?
<natefinch> perrito666: yeah... just trying to debug why this watcher is causing an error
<natefinch> perrito666: brb
<perrito666> heh, seems my help causes people to flee :p
<natefinch> perrito666: back
<natefinch> perrito666: let's jump in moonstone
<natefinch> wwitzel3: you around?
<wwitzel3> natefinch: yep
<natefinch> wwitzel3: was the watcher working for you?  I keep getting a problem where the state server says it's trying to find a watcher with an empty id... but I can't figure out why, because the watcher code is all so twisty
<wwitzel3> wwitzel3: nope. that is the same error I left it at last night, I'm working on that now, trying to get it fixed.
<wwitzel3> natefinch:
<wwitzel3> natefinch: the setup runs, the handle runs (which is our usual throw away the first run stuff), then I get that unknown watcher id
<natefinch> wwitzel3: ok, cool (sorta).  I made a branch with more logging on my github called nate-ww3-ha-to
<natefinch> it also has some of the work to actually do stuff inside the watcher when it triggers... but yeah, I haven't been able to figure out the unknown watcher ID thing either
<perrito666> juju local behaves much like the ring from lotr, when you most need it it leaves you stranded
<natefinch> wwitzel3: I know that it's because it's somehow getting an empty id in newNotifyWatcher, but can't figure out why
<wwitzel3> natefinch: right, and the Next attempt throws the error
<niedbalski> abentley, ping
<niedbalski> abentley, are you able to target https://bugs.launchpad.net/juju-core/+bug/1423936 to 1.22 ?
<mup> Bug #1423936: Juju backup fails when journal files are present <backup-restore> <cts> <juju-core:Triaged by niedbalski> <https://launchpad.net/bugs/1423936>
<abentley> niedbalski: I can.
<alexisb> o perrito666 look a backup and restore bug, your favorite
<wwitzel3> lol
<perrito666> alexisb: it is my favorite
<perrito666> because natefinch has a patch for it
<perrito666> :p
<alexisb> looks like niedbalski has fixed it for you though
<perrito666> and by natefinch I meant niedbalski
<natefinch> haha I was going to say... news to me
<alexisb> heh exactly
<perrito666> natefinch: you should have seen your face
<wwitzel3> natefinch frequently codes up patches and just keeps them in his back pocket, JIC
<perrito666> lol
<alexisb> abentley, that bug will need to be discussed in the release call, so that xwwt and team can decide how/when the point release will go out
<abentley> alexisb: ACK.  With 1.23 in beta, 1.22 may soon be obsolete anyway.
<alexisb> abentley, we will be pushing it to trusty
<niedbalski> abentley, thanks
<abentley> niedbalski: np
<ericsnow> dimitern: FYI, I wrote a patch to test systemd in containers/lxc: http://reviews.vapour.ws/r/1203/
<bogdanteleaga> is there any way to debug go build hanging completely?
<thumper> bogdanteleaga: umm... best person to suggest stuff would be davecheney I think
<thumper> wallyworld: you up?
<wallyworld> thumper: yeah, in meeting, talk soon
<thumper> ack
<mup> Bug #1433615 changed: Manual provider fails to bootstrap: failed to check provisioned status ... /: Is a directory <bootstrap> <ci> <manual-provider> <regression> <juju-core:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1433615>
<wallyworld> thumper: zup
<thumper> wallyworld: quick hangout?
<wallyworld> sure
<ericsnow> bbl (running some errands)
<mup> Bug #1433384 changed: unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core 1.23:Fix Committed by fwereade> <https://launchpad.net/bugs/1433384>
<mup> Bug #1433384 was opened: unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core 1.23:Fix Committed by fwereade> <https://launchpad.net/bugs/1433384>
<mup> Bug #1433384 changed: unit-test failure: TrackerSuite.TestWaitMinionBecomeMinion <ci> <i386> <regression> <test-failure> <utopic> <juju-core 1.23:Fix Committed by fwereade> <https://launchpad.net/bugs/1433384>
<ericsnow> wallyworld: FYI, those failing tests on precise just passed: http://juju-ci.vapour.ws:8080/job/run-unit-tests-precise-amd64/2538/console
<wallyworld> ericsnow: awesome. i'm still waiting for the manual provider tests to pass also
<ericsnow> wallyworld: when you have a minute could you take another look at http://reviews.vapour.ws/r/1164/?
<ericsnow> wallyworld: I haven't done much manual testing on it yet (due to more pressing concerns)
<wallyworld> ericsnow: looking
<ericsnow> wallyworld: ta
#juju-dev 2015-03-19
<menn0> davecheney: I just answered your question about why I didn't just use a 64-bit int
<thumper> wallyworld: I've just marked https://bugs.launchpad.net/juju-core/+bug/1433566 fixed released
<thumper> ci jobs passed
<mup> Bug #1433566: Precise unit test failure discoverySuite.TestDiscoverInitSystemScript  <ci> <precise> <regression> <test-failure> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1433566>
<wallyworld> thumper: awesome, just need to wait for manual CI job now
<wallyworld> oh, looks like that's gone too
<thumper> wallyworld: https://bugs.launchpad.net/juju-core/+bugs?field.tag=ci+regression&field.tags_combinator=ALL is clear of critical
<wallyworld> so unblocked
<wallyworld> \o/
<wallyworld> gentlemen, start your engines
 * thumper tries to land his branch
<thumper> while waiting for the bot, merge master and run all tests...
 * thumper hopes no one added a test that'll break with his changes
<thumper> oh FFS
<thumper> build fails
<gsamfira> thumper: I find linden tea helps during merges :D
<wallyworld> axw: i'll move the go-amz to propose against v3, not sure if you started looking yet
<gsamfira> does not help the actual merge, but hey...its good tea :P
<davecheney> menn0: that's an annoying issue with the mgo driver
<thumper> davecheney: which annoying issue is that?
<davecheney> thumper: menn0 's
<thumper> ah
<thumper> the db size issue
<thumper> anyone know how to check the diff between a branch and master for just the changes the branch adds?
<thumper> master is 5824 revisions ahead
<thumper> and I wanted to see what this old branch of mine did
<thumper> effectively what are the changes that would be merged in
<mup> Bug #1433566 changed: Precise unit test failure discoverySuite.TestDiscoverInitSystemScript  <ci> <precise> <regression> <test-failure> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1433566>
<ericsnow> wallyworld: \o/
<wallyworld> yeah
<anastasiamac> wallyworld: gentlemen?.. :(
<wallyworld> anastasiamac: it's a car racing term
<anastasiamac> wallyworld: doesnt fly
<wallyworld> no, cars drive
<anastasiamac> wallyworld: m smiling in disappointment
<wallyworld> :-)
<thumper> heh
<axw> damnit, had to unblock while I was afk :)
<axw> wallyworld: you said there's a typo, I don't see one
<axw> the "if when" is intentional, if that's what you were thinking of
<wallyworld> axw: from memory "if when"
<wallyworld> oh ok
<axw> maybe not very common grammar, but I don't think it's incorrect
<wallyworld> axw: when you are free, i've proposed against V3 https://github.com/go-amz/amz/pull/41
<axw> wallyworld: thanks, will look now
<axw> wallyworld: done
<wallyworld> axw: ty, will look
<thumper> \o/ merge bot doing my branch now
<thumper> axw: your one failed BTW
<axw> doh, thanks thumper
<axw> thumper: will be interested to hear how that goes, I've been thinking of trying a standing desk. my knees and back don't like all this sitting
<natefinch> I wish ubuntu wouldn't reset my laptop's brightness to "staring directly at the sun" every time I reboot.
<thumper> axw: I've been waking with a sore back for the last few weeks
<axw> natefinch: heh yeah :(
<thumper> axw: this morning, it was all good
<thumper> and that was just after about using it four hours yesterday
<perrito666> ok, sleep, bye
<thumper> o. perrito666
<axw> thumper: nice
<natefinch> there's like a million studies out there that say that sitting all day is really bad for you, even if you work out regularly and are in shape.
<thumper> natefinch: yeah, this is one driving force for me trying it
<thumper> and I've been thinking about it for a while
<axw> natefinch: yeah I know, just need to get my arse into gear and do it
<thumper> and I decided to hack something up to trial it before I spent any money on a 'custom' solution
<thumper> so laptop is on a step-stool thingy
<natefinch> thumper: good plan, since most of those standing desks are crazy expensive
<thumper> and keyboard is balanced on a few drawers stacked on six books
<axw> heh
<thumper> not the most robust or stable
<thumper> but enough to try it out
 * thumper goes to make a coffee
<jw4> thumper: I assume you found your answer about your old branch but: 'git checkout <old branch>; git log master..' (assuming old branch was off of master)
<jw4> or: git cherry -v master
<natefinch> anyone online any good with watchers?  Wayne and I are trying to get one working for the HA --to feature, but the state machine keeps giving an error: "unknown watcher id"   (adding some logging shows the id is somehow an empty string)
<dimitern> natefinch, do you have some code to look at?
<jw4> natefinch: sounds like the watcher wasn't registered ?
<jw4> is this on a private branch?
<natefinch> dimitern: https://github.com/natefinch/juju/tree/nate-ww3-ha-to    specifically https://github.com/natefinch/juju/blob/nate-ww3-ha-to/apiserver/converter/converter.go#L44
<dimitern> natefinch, looking
<natefinch> jw4: quite possibly not registered.  Not sure what I need to do to register one
<dimitern> natefinch, where are you getting the error from?
<natefinch> dimitern: apiserver/watcher.go  newNotifyWatcher
<wallyworld> axw: when did you see Attachment being remove from VolumeParams? we still doing that?
<natefinch> dimitern: at least, that's where the error is returned by machine-0... it gets logged on machine-1 (the one calling the watcher)
<menn0> changes ready. now let's see if thumper's recent merge broke mine.
<axw> wallyworld: not sure yet. why do you ask?
<dimitern> natefinch, right, so I think I see the issue
<wallyworld> axw: because i need to handle it in CreateVolume() to do the attaching there
<wallyworld> if specified
<natefinch> dimitern: oh good
<axw> wallyworld: can you just ignore the VolumeParams.Attachment in your branch, and handle it in AttachVolumes please?
<dimitern> natefinch, hmm.. no sorry - there's nothing apparently wrong with the apiserver code
<wallyworld> axw: sure, i thought that might be the case, hence asking :-)
<axw> wallyworld: nps. we'll need to change the existing providers before it gets removed
<dimitern> natefinch, except maybe you don't need to call watcher.EnsureErr(watch) on line 66
<axw> wallyworld: i.e. so loop would create the file in CreateVolumes, and then losetup in AttachVolumes
<wallyworld> yep, sounds good
<dimitern> natefinch, this is it I think - comparing to other cases (e.g. uniter.WatchConfigSettings) if there's any error, just set the Error part of the result
<jw4> natefinch: are you running a specific test?
<jw4> go test ./apiserver/converter/... ?
<jw4> I guess not
<natefinch> jw4: sorry, no, the test is "juju bootstrap && juju add-machine"  and then watch the log for error messages
<jw4> kk
<jw4> natefinch: fwiw I see the call to resources .Register(watcher)
<natefinch> jw4: right right, yeah I saw it too
<menn0> axw: thanks for the review
<axw> menn0: nps
<axw> menn0: what're the headers for btw?
<menn0> axw: the /log API (used by debuglog) and /logsink (new, used by agents to send logs) use HTTP basic auth
<axw> menn0: ah, right. still using HTTPS, right?
<menn0> axw: they use TLS websockets but without the RPC layer that most of our API uses on top
<jw4> natefinch, dimitern fwiw I'm pretty sure the NotifyWatcherId on the params.NotifyWatchResult is an empty string somehow
<axw> right, just wanted to know that the basic auth is encrypted
<axw> menn0: thanks
<menn0> axw: definitely still encrypted
<dimitern> jw4, no, I think you're mixing up the Id parameter of the RPC Request
<dimitern> jw4, watchers are the only entities that actually use the Id
<menn0> axw: my recent changes in the api package are all about sharing as much of the connection setup code between normal API connections and lower level websocket only API connections
<menn0> axw: and that includes the TLS setup
<axw> sounds good
<jw4> dimitern: yeah I think you're right
<jw4> exit
 * jw4 in wrong window
<natefinch> jw4, dimitern: I don't know what is going wrong, but this code was supposed to be in over a week ago, and this is one of the main problems that's been tripping us up.
<jw4> natefinch: yeah - I'm still betting on either an RPC serialization issue or an unchecked param being sent down
<dimitern> natefinch, ok, let me try that branch locally, might think of something
<dimitern> natefinch, got a link?
<natefinch> dimitern: I appreciate it
<jw4> dimitern: I can repro with the branch and steps natefinch gave in the scrollback
<natefinch> dimitern:   https://github.com/natefinch/juju/tree/nate-ww3-ha-to   or just git remote add nate https://github.com/natefinch/juju    branch nate-ww3-ha-to
<dimitern> jw4, oh, yeah - I should've looked
 * dimitern thinks it's still only 5:36 am
<jw4> :)
<dimitern> natefinch, thanks, pulling it now
<jw4> natefinch: bingo
<jw4> api/converter/converter.go
<jw4> line 41
<jw4> (actually 38 or so, I added logging)
<jw4> result.NotifyWatcherId:""
<jw4> gah
<jw4> ignore me
<natefinch> jw4: heh
<jw4> natefinch, dimitern actually...
<natefinch> jw4, dimitern: maybe resources.Register is somehow returning an empty string
<natefinch> hmm.. that shouldn't be possible
<jw4> natefinch yeah - it must be a serialization issue?
<jw4> apiserver WatchForJobsChanges is certainly returning "" for the NotifyWatcherId
<jw4> and that's wrong
<jw4> whether it's an RPC issue or not we'll know in a minute when my new logging takes effect
<jw4> boom
<jw4> it's gotta be RPC serialization
<jw4> on the apiserver side it's populated
<jw4> on the api side it's ""
<natefinch> jw4: interesting
<jw4> but that doesn't make sense 'cause it's just a string
<jw4> natefinch, dimitern oooh
<jw4> api uses params.NotifyWatchResult, apiserver uses params.NotifyWatchResults
<dimitern> natefinch, I see you're returning params.NotifyWatchResults but in the client-side api you're passing a single NotifyWatchResult
<jw4> the slice is getting lost
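(The mismatch, roughly: the apiserver method returns the plural params.NotifyWatchResults, while the client was decoding into a bare params.NotifyWatchResult, so the Results slice was silently dropped. A sketch of the corrected client side, assuming juju's base.FacadeCaller of the era:)
    package converter

    import (
        "github.com/juju/errors"
        "github.com/juju/juju/api/base"
        "github.com/juju/juju/apiserver/params"
    )

    // watchForJobsChanges decodes into the *plural* type to mirror the
    // server; decoding into params.NotifyWatchResult compiles fine but
    // leaves NotifyWatcherId as "".
    func watchForJobsChanges(facade base.FacadeCaller, args params.Entities) (params.NotifyWatchResult, error) {
        var results params.NotifyWatchResults
        if err := facade.FacadeCall("WatchForJobsChanges", args, &results); err != nil {
            return params.NotifyWatchResult{}, err
        }
        if n := len(results.Results); n != 1 {
            return params.NotifyWatchResult{}, errors.Errorf("expected 1 result, got %d", n)
        }
        return results.Results[0], nil
    }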
<dimitern> jw4, ah :) you're faster
<natefinch> OH FFS
<jw4> dimitern: well I was working on it longer and it's 9pm here not 6 am
<dimitern> jw4, :) fair point
<natefinch> thanks a lot, static typing :/
<dimitern> :D
<dimitern> natefinch, well that part of the code is decidedly not all static typing
<dimitern> reflect magic
<natefinch> dimitern: I know... it's totally all reflection BS
<natefinch> dimitern: seems like we could be a little smarter about recognizing when we're deserializing into something completely wrong, though.   Also, the whole "two types with the same name except for an S on the end" is pretty shit
<natefinch> that fixed it
<natefinch> thanks, guys, that's awesome
<dimitern> natefinch, \o/
<jw4> fun debugging
<natefinch> wayne and I have been pulling our hair out for a couple days now
<jw4> fresh eyes usually help
<dimitern> natefinch, well, i'll just add it to the growing list of beers for nuremberg *wink*
<natefinch> dimitern: certainly :)
<jw4> yep confirmed it works here too nice
<dimitern> jw4, indeed
 * jw4 eod
<dimitern> jw4, have a good one! :)
<natefinch> whelp bedtime.  in 5 hours I can get up and try to get the rest of this working.  Thanks guys.
<jw4> dimitern: thanks, you too - your day is off to a great start
<dimitern> jw4, oh I hope so hehe
<jw4> lol
<mup> Bug #1433116 changed: 386 compilation error: dblogpruner/worker.go:32: constant 4294967296 overflows int <i386> <test-failure> <juju-core:Fix Released by menno.smits> <https://launchpad.net/bugs/1433116>
<menn0> axw: review of a 2-line fix please: http://reviews.vapour.ws/r/1208/
<axw> looking
<menn0> axw: I just noticed that the tests were still failing on i386
<axw> menn0: was the state package already updated?
<axw> to accept MB
<menn0> axw: yes
<menn0> axw: that const was there from when it worked in MB
<menn0> axw: it just needs to be high enough that pruning by size doesn't come into play for that test
<menn0> axw: sorry
<menn0> axw: I meant to say "when it worked in bytes"
<axw> that makes more sense :)
<axw> thanks
<menn0> axw: sorry. it's EOD and I'm feeling frazzled.
<axw> menn0: nps :)  LGTM
<axw> wallyworld: when you have some time, http://reviews.vapour.ws/r/1209/
<axw> wallyworld: some bits required for watching filesystem attachments
<wallyworld> axw: sure, will do it after school pickup soon
<axw> cheers
<wallyworld> axw: implementing DescribeVolumes() - we pass in provider ids, but there's no easy way to get the juju volume tags to populate the storage.Volume items with
<axw> wallyworld: ah yeah, we should pass both in I guess.
<axw> hmm
<axw> brain fart
<wallyworld> axw: for now, i'm just finishing off tests and will propose the working ebs volume source, i'll add a todo
 * axw looks at interface
<axw> wallyworld: SGTM
<wallyworld> axw: these sorts of things always happen writing infrastructure first :-)
<axw> wallyworld: a couple of options; we could not require the Volume to have a VolumeTag, or we could pass it in with the volume ID
<axw> wallyworld: probably the former -- we should really separate identity from other details
<wallyworld> axw: offhand, i think we should always generate fully populated storage.Volume items in case they are passed around, hence we should pass in tags? agree?
<axw> I've had in the back of my mind that we should have a storage.VolumeInfo, which describes properties of a volume without the tags and so on
<wallyworld> VolumeInfo works too
<axw> wallyworld: we don't need to implement Describe yet anyway
<wallyworld> we could embed it in Volume
<wallyworld> it's easy to implement, already done, i'll just add a todo
<axw> ok
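(A tiny sketch of the VolumeInfo idea being floated here; the field set is illustrative rather than the final storage types:)
    package storage

    import "github.com/juju/names"

    // VolumeInfo would hold provider-level properties of a volume,
    // independent of juju's identity for it.
    type VolumeInfo struct {
        VolumeId string // provider-assigned id, e.g. an EBS volume id
        Size     uint64 // size in MiB
    }

    // Volume pairs the juju tag with the provider info; embedding means
    // existing callers can still write v.VolumeId directly.
    type Volume struct {
        Tag names.VolumeTag
        VolumeInfo
    }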
<wallyworld> right, off to school pickup, bbiab
<mattyw> morning all
<dimitern> morning
<wwitzel3> morning
<wallyworld> axw: i have to have dinner, i'll review your branch after, i've just finished this one http://reviews.vapour.ws/r/1210/ , i have a small tweak i need to propose for go-amz also, will do that after dinner
<axw> wallyworld: nps. I may have to rethink this stuff again :/    I think either the diskformatter needs to be folded into the storageprovisioner, or otherwise we need to do the mounting in the diskformatter after all
<axw>  will take a look at yours
<mramm> is there a link to online docs about what you can do with placement?
<mramm> hey, anybody else getting bad gateway errors on the juju docs site?
<mramm> urulama: ^^^
<wwitzel3> mramm: for /docs yes
<urulama> mramm: it's being fixed, deploy on a way
<wwitzel3> mramm: but https://jujucharms.com/docs/commands works
<mramm> refreshing seems to help
<urulama> mramm: getting started affected
<urulama> mramm: getting started: aws, azure and testing your setup are broken currently
<urulama> (and a bunch others)
<urulama> mramm: sorry, didn't find any user oriented placement doc with examples :(
<mramm> urulama: no problem thanks for looking
<wallyworld> axw: did you still want that branch reviewed, or do you need to think about it?
<axw> wallyworld: still review please. it needs to be there either way, it's just a question of how filesystems-on-volumes are handled
<wallyworld> ok
<axw> wallyworld: I'm thinking of chucking out the diskformatter, and creating a new filesystem provider. the storage provisioner will still need some special handling for that though, I think
<wallyworld> ok
<axw> wallyworld: I realised I didn't add tests for api/storageprovisioner... doing that now, at the same time as adding the rest of the filesystem methods
<axw> wallyworld: I've updated the branch with the rest of the methods in api/storageprovisioner, and added tests to that package
<wallyworld> ok, ta
<axw> server-side will come in another branch
<wallyworld> axw: +1, was fairly easy to review because there's so much similar boilerplate, not just in this stuff but everywhere
<axw> wallyworld: which missing tests are you referring to in the message?
<wallyworld> apiserver
<wallyworld> unless i missed them
<axw> wallyworld: ah yeah, that code isn't really exercised yet. will do
<axw> thanks
<wallyworld> sure
<dooferlad> dimitern: hangout?
<voidspace> dimitern: stdup?
<wallyworld> axw: here's that small go-amz fix
<wallyworld> https://github.com/go-amz/amz/pull/42
<axw> wallyworld: lgtm
<wallyworld> ty
<axw> wallyworld: if you're still around, another fairly straight forward one: http://reviews.vapour.ws/r/1211/
<wallyworld> sure
<axw> wallyworld: I'm going to now focus on getting tmpfs working, so we can test filesystems. getting filesystems in volumes is probably another couple of days work, due to some required changes I hadn't thought about
<axw> I mean, tmpfs invoked via dynamic storage provisioner
<wallyworld> axw: sgtm
<wallyworld> axw: lgtm on rb
<wallyworld> just an import fix
<axw> wallyworld: thanks
<axw> wallyworld: last one for tonight http://reviews.vapour.ws/r/1212/
<wallyworld> ok
<wallyworld> axw: done
<axw> wallyworld: tyvm. good point about todo, will fix now
<wallyworld> ta
<axw> wallyworld: if you've got any ideas about how we can unify params and so on for volumes and filesystems, let me know... lots of very similar code in that package :/
<wallyworld> indeed, we need to step back and re-evalutate the model a bit
<wallyworld> axw: last one if you're still around https://github.com/go-amz/amz/pull/43
<wallyworld> dimitern: if you had a few minutes, this is a small change to the amz test server ^^^^^
<dimitern> wallyworld, sure, will have a look
<wallyworld> ty
<dimitern> wallyworld, LGTM
<wallyworld> ty :-)
<mup> Bug #1433254 changed: manual provider on trusty/precise syntax error near unexpected token `then' <manual-provider> <ppc64el> <systemd> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1433254>
<mup> Bug #1434070 was opened: upgrades are broken in master 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1434070>
<natefinch> wwitzel3: how goes?
<ericsnow> fwereade: is this what you had in mind? http://reviews.vapour.ws/r/1206/
<ericsnow> fwereade: I waited to land it so that you could look it over first
<mup> Bug #1434092 was opened: updateSeriesVersions() gets called too late in the initialization process <juju-core:New> <https://launchpad.net/bugs/1434092>
<dimitern> jam, fwereade, I'd appreciate a review on this http://reviews.vapour.ws/r/1118/ when you have time
 * dimitern needs to go out for a while, might not come back until late
<perrito666> natefinch: ericsnow wwitzel3 that was intended to be unmute but I instead used hang
<perrito666> because my mouse does those things when running out of mem, sorry
<natefinch> lol
<natefinch> we just all hung up anyway
<dooferlad> ericsnow: Could I have a very quick review? https://github.com/juju/testing/pull/56
<ericsnow> dooferlad: you bet
<ericsnow> dooferlad: wrapping up another one first
<dooferlad> ericsnow: thanks!
<perrito666> natefinch: we could go for a more explicit OneNotifyWatcherResult, ManyNotifyWatcherResults :p
<natefinch> perrito666: I actually don't think that's a horrible idea.  I was going to give suggestions in the email, but didn't really want to bike shed it
<natefinch> perrito666: it doesn't help that they have to match up across files
<perrito666> we could make a test that checks that two variables on a same namespace have a certain Levenshtein distance :p
<natefinch> perrito666: that's a very interesting linter rule... I kinda like it, actually.
<perrito666> uff, wfh is terrible at lunch time when you live in a neighborhood with too many old ladies
<natefinch> ......explain?
<perrito666> natefinch: young people eat at work
<perrito666> old ladies cook from 11AM and a really tempting smell comes through the window
<perrito666> natefinch: lunch here is quite a heavy meal since our breakfast is mostly tea/coffee and crackers
<perrito666> I don't even have breakfast, I only eat a couple of fruits
<perrito666> so I have to work for two hours thinking about my relatively poor lunch while smelling stew, roasted meat, pizza, etc
<ericsnow> dooferlad: FYI, while we don't have hooks set up (yet) for the testing repo, you can still use rbt to post a review request
<dooferlad> ericsnow: ack
<perrito666> omg, someone is doing a wine reduction... how cruel
<natefinch> perrito666: lol
<natefinch> wwitzel3: you around?
<natefinch> anyone know if we already have something in machine agent that watches for changes to the machine?  Seems like we would, but it's a little hard to figure out through all the levels of abstraction in this code
<mup> Bug #1434092 changed: updateSeriesVersions() gets called too late in the initialization process <juju-core:New> <https://launchpad.net/bugs/1434092>
<natefinch> oh my god, the watcher code... jesus
<natefinch> there's like four levels of indirection that could get reduced to one or two lines of code in most cases :/
<mattyw> natefinch, I've just had my fingers in the watcher code - if there's anything I've done you don't like I love to learn :)
<mattyw> natefinch, also, I'm pushing more changes to the meter status branch - I've marked it WIP in rb because there are still things to be cleaned up before its reviewed again...
<mattyw> natefinch, but how does this test grab you? https://github.com/mattyw/juju/blob/metricsmanager-meterstatus/state/meterstatus_test.go#L266
<natefinch> mattyw: I don't actually know what the combinations mean, but it is nice and readable
<natefinch> mattyw: you don't really need the var block, though, they can each just be := instead
<mattyw> natefinch, if I did that gofmt made sure they were all nicely lined up for me :)
<natefinch> mattyw: ahh, interesting point
<natefinch> mattyw: that never would have occurred to me, but yes, I think that helps readability, so, cool, nice work.
<mattyw> natefinch, I'm going to call it a day
<mattyw> natefinch, but if you're around tomorrow I will pester you if you don't mind :)
<natefinch> mattyw: have a good evening
<natefinch> mattyw: I'll be around :)
<mup> Bug #1434246 was opened: Destroying environment takes down others with similar name <juju-core:New> <https://launchpad.net/bugs/1434246>
<natefinch> hatch: on that bug about destroy-environment, how exactly did you invoke destroy environment?  i.e. did you use -e or rely on the "current" environment?
<hatch> natefinch: umm it was scripted, lemme take a peek
<hatch> natefinch: I reproduced it once but I will do it again so I can capture the exact steps for the bug
<natefinch> hatch: also please note if you used --force
<hatch> natefinch: juju_env = lambda: os.getenv('JUJU_ENV') is the line it uses to get the environment
<hatch> but I'll try manually as soon as this test run is done
<hatch> probably 10m now or so
<natefinch> hatch: I knew that if I asked if you did it way X or Y that you'd come back and say Z ;)
<mup> Bug #1331505 changed: destroy-environment shuts down joyent machines <ci> <destroy-environment> <joyent-provider> <juju-core:Won't Fix> <https://launchpad.net/bugs/1331505>
<hatch> natefinch: isn't that the way it goes? :D
<natefinch> hatch: but, still... how does it use juju_env... to do -e?
<hatch> natefinch: the command executor adds -e to the commands using the value returned from the above line
<natefinch> hatch: *shrug* works correctly for me
<natefinch> I'm on master, of course, but I doubt we've changed that any time recently
<hatch> yeah ok I'll try again once this run is done - these tests take a while :)
<natefinch> hatch: no problem :)
<thumper> sinzui: ping
<sinzui> hi thumper
<thumper> o/
<thumper> just talking with abentley about ci tests
<thumper> in standup now
<thumper> ping after
<thumper> sinzui: do you have a few minutes to chat?
<sinzui> I do
<thumper> awesome
 * thumper makes a hangout
<thumper> sinzui, waigani: https://plus.google.com/hangouts/_/canonical.com/core-ci
<wallyworld> thumper: you seen blocker bug 1434070 which is suggested in a comment may be jes related?
<mup> Bug #1434070: upgrades are broken in master 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1434070>
<thumper> will look shortly, otp
<wallyworld> ok
<hatch> natefinch: so, sorry, I'm no longer able to reproduce that bug
<hatch> I'm going to have to investigate a little further because I was definitely able to reproduce it before
<thumper> wallyworld: looking now
<natefinch> hatch: ok, let me know
<hatch> curtis triaged it and linked it to a security group bug as well
<hatch> so they may be related
<thumper> wallyworld: I know what it is...
 * thumper goes to fix
<wallyworld> \o/
<thumper> wallyworld: wouldn't have happened with 1.22 :)
<thumper> too many assumptions
 * thumper thinks
<thumper> man, this is harder than I thought...
<wallyworld> must. not. comment
<ericsnow> anastasiamac: ping
<thumper> wallyworld: got a minute?
<thumper> wallyworld: I have a fix but unsure if or how to add a test for it
<anastasiamac> ericsnow: hi?
<thumper> wallyworld: http://reviews.vapour.ws/r/1215/diff/#
<ericsnow> anastasiamac: just wanted to encourage you about your patch after leaving so many review comments :)
<anastasiamac> ericsnow: k :D - encourage away :D
<anastasiamac> ericsnow: 7am here, m about to drop off kids at school bbl :D
<wallyworld> thumper: sorry, was dealing with C8H10N4O2
<wallyworld> looking
<thumper> caffeine is my favourite drug
<thumper> followed closely by C2H6O
<thumper> although not in a pure form
<wallyworld> indeed
<wallyworld> thumper: you probably could or should write a test but it's an edge case and i want to get landings unblocked :-)
<thumper> the thing is, you can't really tell
<thumper> without log scraping
<thumper> and that is truly horrible
 * thumper thinks
<ericsnow> thumper: I'm looking at that agent login patch
<wallyworld> thumper: i hit the big red button
<ericsnow> thumper: is there a way for an upgrade step to handle that?
<thumper> ericsnow: no, because the upgrade needs to log into the api first
<wallyworld> thumper: as per policy, we need to unblock asap or revert, so if the current work unblocks, we can land and then revise later with a test if needed
<ericsnow> thumper: ah, makes sense
<thumper> ericsnow: the agents already handle the case which adds the env uuid to the agent config
<thumper> but it is in 1.23
<thumper> 1.24 now stops serving the environment at the root of the api by default
<thumper> so if you don't have an env uuid, you need to log in with an older version
<thumper> agents NEVER want the root of the api
<thumper> sorry
<thumper> they NEVER want just user manager/ env manager
<thumper> they always want to talk to an environment
<thumper> so the place to force it is in the agent login connection
<thumper> wallyworld: oh, you already added the merge line
<thumper> ta
<wallyworld> thumper: yeah, i need to land stuff for storage, so keen :-)
<thumper> heh
<thumper> also, if one of the upgrade ci tests pass, this fix will have worked
<thumper> 'cause it broke all
<wallyworld> yup
<wwitzel3> there is an error you can raise from a worker to restart jujud correct?
<perrito666> ErrTerminateAgent iirc
<perrito666> wwitzel3: ^
<wwitzel3> perrito666: no, that calls uninstallAgent
<perrito666> ouch
<wwitzel3> perrito666: fwereade pointed that out to me the other day, I thought we had one for just restarting, but maybe not.
<perrito666> do you not have access to the tomb?
<wwitzel3> perrito666: somehow I'm sure
#juju-dev 2015-03-20
 * perrito666 is waiting to land changes
<axw> wwitzel3: worker.ErrRebootMachine
<anastasiamac> ericsnow: thnx for review! it's amazing what ppl with fresh eyes can see! I really appreciate ur patience with such a huge PR too :D
<wallyworld> perrito666: looks like trunk will be blocked for a while sadly
<perrito666> wallyworld: it's ok, I don't need sleep
 * perrito666 wants to be like dimitern 
<ericsnow> anastasiamac: glad to do it; I try to be thorough and honest :)
<ericsnow> anastasiamac: and I like to learn a thing or two in the process :)
<wallyworld> perrito666: it won't be fixed till your SOD tomorrow because a fix landed but the tests are failing on CI even though they pass locally and there's no build artifacts
<wallyworld> so we need QA team input
<ericsnow> wallyworld: I made some broad changes on http://reviews.vapour.ws/r/1164/
<ericsnow> wallyworld: when you have a few minutes could you take a look?
<wallyworld> ok
<ericsnow> wallyworld: ta
<wallyworld> ericsnow: btw, pool config attributes may be string, int, bool, whatever
<ericsnow> wallyworld: FYI, I'm manually testing while doing other stuff
<wallyworld> ok
<ericsnow> wallyworld: I was confused because at the CLI the attrs are always strings
<wallyworld> well of course!
<ericsnow> :)
<perrito666> wallyworld: well one of the things I wanted to land is for 1.23 :p
<perrito666> so I might still be in luck
<wallyworld> ericsnow:  they have to be parsed into the right type - we don't use a schema yet though
<wallyworld> perrito666: yeah, 1.23 is open
<ericsnow> wallyworld: np, just trying to make sense of a bunch of new-to-me code :)
<perrito666> now I just need a review
 * perrito666 looks at ericsnow  :)
<ericsnow> wallyworld: to be honest, very little was hard to grok
<ericsnow> wallyworld: that's certainly to anastasiamac's credit
<perrito666> one of my favorite code lines --> // TODO(fwereade) GAAAAAAAAAAAAAAAAAH this is LUDICROUS.
<wallyworld> ericsnow: indeed, i feel the same looking at other work myself
<wallyworld> perrito666: that's hilarious
 * wallyworld checks to see if it is his code
<ericsnow> perrito666: I probably won't be able to give you a review on the until tomorrow, sorry
<anastasiamac> ericsnow: i appreciate ur consideration :D
<perrito666> wallyworld: its a test that compiles jujud
<wallyworld> yeah, not me. whew :-)
<perrito666> wallyworld: lol
<perrito666> :p lol it is the person that first taught me how to code in juju
<wallyworld> ericsnow: that's done
<ericsnow> wallyworld: thanks :)
<wallyworld> but trunk is blocked anyway :-(
<perrito666> natefinch: kids asleep driven development?
<natefinch> perrito666: yep
<natefinch> wwitzel3: you around?
<natefinch> whelp, when things start randomly breaking that were working for the last two weeks, it's time for bed.
<anastasiamac> axw: wallyworld: cleared PR \o/ PTAL http://reviews.vapour.ws/r/1213/
<wallyworld> ok
<anastasiamac> ericsnow: I was not just dropping issues... I just was not publishing all comments in hope to minimise volumes of email. plz let me know if there r still some issues that i have not addressed or answered
<ericsnow> anastasiamac: will do, thanks for taking the time to work through all that :)
<anastasiamac> ericsnow: thnx for review - really appreciated ur time and input
<anastasiamac> wallyworld: so with respect to your last comment about "called".... r u happy with my amendments - i.e. removing the "called" from places where I *know* my implementation was called using other checks?...
<wallyworld> anastasiamac: yes, so long as there are other side effects to ensure the code was called
<anastasiamac> wallyworld: tyvm :D there are other means!
<wallyworld> in the past, people have had blocks of code with c.Assert() that were not called
<anastasiamac> wallyworld: good to know :) will keep in mind...
<axw> wallyworld: rootfs attachment via storageprovisioner: http://paste.ubuntu.com/10631966/
<axw> wallyworld: hooks firing: http://paste.ubuntu.com/10631970/
<davecheney> oh gawd -- autotools!
<axw> davecheney: ?
 * axw enjoys not thinking about build configuration anymore
<davecheney> ../gcc/trunk/configure --prefix=/opt/gccgo --enable-languages=c,c++,go
<axw> fun times
<davecheney> for increasingly small values of fun
<davecheney> protip, building gcc requires more than 7gb of disk space
<axw> davecheney: my LLVM/Clang/llgo build dir is 15GB :\    that is with full debug and all targets, but still...
<wallyworld> axw: awesome on the storage provisioning :-)
<axw> wallyworld: just reworking the tmpfs provider now, then I'll write some tests and propose
<axw> wallyworld: I think I'll remove the FilesystemParams.Attachment field while I'm at it
<axw> but leave volume alone for now
<wallyworld> sgtm
<axw> wallyworld: are you around?
<wallyworld> yeah
<axw> wallyworld: hey, I'm just thinking about how rootfs should work...
<axw> wallyworld: I think it should be using "mount --bind" to put things in place properly
<axw> but that won't work on LXC
<axw> OOTB
<wallyworld> we can create an lxc.conf that will allow that though
<wallyworld> and it's a use case we want too, right - bind mount a host dir
<axw> wallyworld: yeah, but not by default. it'd be nice if we could at least get rootfs to work OOTB
<wallyworld> hmmm, yeah
<axw> wallyworld: so, the alternative is we symlink on LXC
<wallyworld> i wonder if we can do it different just for lxc, sounds messy
<axw> wallyworld: which would fail if the path already exists... but it's an option
<wallyworld> what is the likelihood of the path existing - on boot, not very high
<wallyworld> on storage add, we can just error
<axw> wallyworld: the charm can specify any path, so it depends on what the charm specifies
<axw> wallyworld: i.e. a charm could say mount the filesystem at "/var/lib/foo", and it *should* work even if that directory exists
<wallyworld> true
<wallyworld> would be easier if users just let juju decide and inform via the attached hook
<axw> wallyworld: so I could have it try to "mount --bind", and if that fails then try to symlink if the dir doesn't exist
<axw> indeed
<wallyworld> and if that fails, then what's the fallback
<axw> no fallback, can't do anything else
<wallyworld> i *think* that's reasonable, maybe if the dir is empty we remove it
<axw> wallyworld: actually, if the dir exists *and* it's on the same filesystem as /, then we could just carry on
<wallyworld> and only fail if dir contains data
<wallyworld> s/remove/use
<wallyworld> yes, just carry on
<axw> hrm, I feel a bit nervous about removing existing things
<axw> even if they're empty
<wallyworld> sorry, i mistyped
<wallyworld> if the dir is empty, we just carry on
<wallyworld> as to the charm it's the same thing
<axw> yep
<wallyworld> but fail if there's data there
<wallyworld> for safety
<axw> ok, that sounds good
<wallyworld> great
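(A hedged sketch of the fallback order just agreed: bind-mount; else symlink if the target is absent; else accept an empty existing dir; else refuse. All helper names are invented for illustration:)
    package rootfs

    import (
        "fmt"
        "io"
        "os"
        "os/exec"
    )

    // bindMount shells out to "mount --bind"; this is the step that
    // fails out of the box inside an unprivileged LXC container.
    func bindMount(source, target string) error {
        return exec.Command("mount", "--bind", source, target).Run()
    }

    // dirIsEmpty reports whether dir contains no entries.
    func dirIsEmpty(dir string) (bool, error) {
        f, err := os.Open(dir)
        if err != nil {
            return false, err
        }
        defer f.Close()
        if _, err := f.Readdirnames(1); err == io.EOF {
            return true, nil
        } else if err != nil {
            return false, err
        }
        return false, nil
    }

    // placeFilesystem applies the fallback order discussed above.
    func placeFilesystem(source, target string) error {
        if err := bindMount(source, target); err == nil {
            return nil
        }
        if _, err := os.Stat(target); os.IsNotExist(err) {
            return os.Symlink(source, target)
        }
        empty, err := dirIsEmpty(target)
        if err != nil {
            return err
        }
        if empty {
            return nil // same result as far as the charm is concerned
        }
        return fmt.Errorf("%q exists and contains data; refusing to touch it", target)
    }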
<dimitern> morning o/
<mup> Bug #1434437 was opened: juju restore failed with "error: cannot update machines: machine update failed: ssh command failed: " <juju-core:New> <https://launchpad.net/bugs/1434437>
<dimitern> fwereade, hey, are you around?
<dimitern> fwereade, also jam if here - please have a look at this http://reviews.vapour.ws/r/1118/
<fwereade> dimitern, heyhey
<dimitern> fwereade, hey, that's the PR which will hopefully fix some flaky filter tests
<fwereade> dimitern, did we have intermittent failures in all the ones you touched?
<dimitern> fwereade, yes, esp. depending on the stress the machine running them is under
<fwereade> dimitern, the thing is I'm a bit worried about the AssertReceiveBetween stuff
<dimitern> fwereade, yeah?
<fwereade> dimitern, particularly the (0, 2) case -- won't that pass *every time* whatever the behaviour is?
<dimitern> fwereade, it's unavoidable to allow for some flexibility
<fwereade> dimitern, yeah, but... the flexibility is the problem
<dimitern> fwereade, in the normal case it passes with the lower bound
<fwereade> dimitern, right, but we have no way of knowing that
<dimitern> fwereade, I've verified it by looking at the logs, but I guess it's not obvious
<fwereade> dimitern, yeah, I believe it works, I just don't think it gives us protection going forward
<fwereade> dimitern, I suspect the mocked-out-api approach is the one we need to take to actually fix this
<dimitern> fwereade, the right way going forward is to make all the watchers in the filter mockable
<dimitern> :) yeah
<fwereade> dimitern, yeah, exactly
<dimitern> fwereade, so you'd rather leave the tests flaky for now and not land my "fix" until a proper one can be done?
<fwereade> dimitern, I think that false positives are worse than false negatives, yes
<fwereade> dimitern, if you consider a test failure to be a positive, that is
<fwereade> ;p
<fwereade> er wait I think I have that the wrong way round
<fwereade> dimitern, a test suite that sometimes fails for code that works is inconvenient; a test suite that always passes for code that doesn't work is deadly
<dimitern> fwereade, I agree :)
<dimitern> fwereade, ok, I'd ask you to comment on it at least, before I close it then
<fwereade> dimitern, will do
<dimitern> fwereade, cheers
<natefinch> wwitzel3: are/were you up early, or late? :)
 * natefinch notes the email wayne sent him an hour and a half ago... at 4am
<perrito666> natefinch: morning
<natefinch> perrito666: morning
<voidspace> natefinch: are you using nvidia-prime?
<voidspace> natefinch: or are you using the intel graphics, or using something else to enable optimus?
<natefinch> voidspace: I twiddled with it for a while to try to get nvidia to work, I think I'm just using intel graphics, but honestly don't remember where I ended up with that.  Linux + graphics = pain
<voidspace> natefinch: ok, thanks
<natefinch> sorry :)
<voidspace> natefinch: I have an issue with a kvm image not booting (black screen) and I wonder if it's a graphics driver issue
<voidspace> natefinch: it went through the install fine and then black screen on reboot
<voidspace> natefinch: I'm trying again with installing trusty into kvm instead of utopic to see if that makes a difference
<voidspace> natefinch: it *shouldn't*
<natefinch> I know vivid containers on non-vivid hosts was having problems, but haven't heard of utopic being a problem
<voidspace> utopic is the host, so really shouldn't be an issue
<voidspace> I couldn't find anyone else with the same issue either - which is what makes me suspect driver issues
<voidspace> anyway, trying trusty
<voidspace> natefinch: these instructions make it seem simple to enable nvidia...
<voidspace> natefinch: http://www.webupd8.org/2013/08/using-nvidia-graphics-drivers-with.html
<voidspace> natefinch: for 14.04 but should be the same for utopic I guess...
<natefinch> "Multiple monitors don't work out of the box"
<voidspace> well, you can't have *everything*
<natefinch> ...... see.. multiple monitors works for me now, no way am I going to risk spending 4 hours fiddling with it if I accidentally break that
<voidspace> :-)
<voidspace> coffee
<natefinch> wget https://raw.githubusercontent.com/kovidgoyal/calibre/master/setup/linux-installer.py | sudo python   .....sure, why not! :/
<voidspace> natefinch: dimitern: trusty install works
<natefinch> voidspace: weird... well, good enough for now I guess
<voidspace> natefinch: dimitern: so either a problem with utopic or just a problem with that particular install I did
<dimitern> voidspace, awesome!
<voidspace> natefinch: dimitern: yep, trusty fine for this
<voidspace> hah, seems like the virsh network I configured isn't working though
<voidspace> I'll look at it later
<dimitern> voidspace, cool
<voidspace> dimitern: as far as I can tell in workers the standard way to get an environ is to use WatchForEnvironConfigChanges
<voidspace> dimitern: does that sound right?
<voidspace> dimitern: it's what the provisioner and firewaller do
<dimitern> voidspace, in general - yeah, but you should have it simpler as you have access to state directly, like the cleaner, right?
<voidspace> dimitern: I didn't see the cleaner using envron
<voidspace> cleaner  worker does next to nothing - it calls state.Cleanups
<voidspace> st.Cleanup
<dimitern> voidspace, yeah, but my point was - it takes a *state.State in its ctor
<voidspace> dimitern: right, we have a state.State
<dimitern> voidspace, from st, you could always construct an environ
<voidspace> dimitern: so should I get the config from state and open a new environment
<voidspace> right
<dimitern> voidspace, get the EnvironConfig and use config.New
<voidspace> yep
<dimitern> voidspace, you'll be running on the state servers only, so it's ok
<dimitern> voidspace, the difference for other (normal) workers running on other machines, is they need to use the api
<voidspace> right
<voidspace> understood
<dimitern> voidspace, hence that WatchForEnvironConfigChanges (which is not even needed just to get the config for some time now IIRC)
<dimitern> voidspace, ok :) cheers
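(A minimal sketch of that approach: on a state server the config can be read straight from state and turned into an environ. API names per juju of the time:)
    package addresser

    import (
        "github.com/juju/juju/environs"
        "github.com/juju/juju/state"
    )

    // environFromState builds an Environ directly from state, which is
    // only appropriate for workers running on the state servers.
    func environFromState(st *state.State) (environs.Environ, error) {
        cfg, err := st.EnvironConfig()
        if err != nil {
            return nil, err
        }
        return environs.New(cfg)
    }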
<voidspace> dimitern: worker is again "done", so back to testing
<voidspace> dimitern: using an interface for the environ (a "releaser") for testing - and getting a NetworkingEnviron from the state
<dimitern> voidspace, great! have you pushed anything?
<voidspace> and properly releasing
<voidspace> dimitern: https://github.com/juju/juju/compare/master...voidspace:address-life-worker
<dimitern> voidspace, looking
<voidspace> dimitern: addressWorker.removeIPAddresses is essentially the code copied from the provisioner api - with added logic to get the instance ID from the machine ID
<sinzui> sorrry perrito666 : your HA branch broke Windows and OS X builds bug 1434544
<mup> Bug #1434544: backups broke non-linux builds <ci> <osx> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1434544>
<perrito666> sinzui: WHAT?
<perrito666> sinzui: hold, I have that change in my local machine :| wtf did happen there, sending a fix right away, apologies
<voidspace> dimitern: damn, some calls to errors.Annotatef that should be logging instead
<voidspace> dimitern: friend arrived, so breaking for lunch
<sinzui> perrito666, I think a function signature changed
<dimitern> voidspace, ok, I have some comments - ping me when back please
<perrito666> sinzui: It did, I just stashed that change instead of committing it
<mup> Bug #1434544 was opened: backups broke non-linux builds <ci> <osx> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1434544>
<perrito666> yes, yes mup I know
<sinzui> perrito666, we've all done that
<perrito666> I should stop working on two branches at once
<perrito666> can I get an amen? http://reviews.vapour.ws/r/1217/
<dimitern> perrito666, ship it!
<perrito666> merging
<perrito666> btw sinzui we need to discuss new CI testing for new restore
<perrito666> I am pretty sure that the next spot in your agenda is somewhere around 2018 :p so save it for me plz
<sinzui> :)
<perrito666> this patch, besides breaking windows and osx, fixes the issue we had when we first tried to deprecate old restore but somehow changes a bit the behavior of ha restore and also now we support systemd
<mup> Bug #1434544 changed: backups broke non-linux builds <ci> <osx> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1434544>
 * perrito666 looks at the pr to see if it merges faster
<perrito666> wee, merged
<mup> Bug #1434555 was opened: ppc64el unit test timeout <blocks-release> <ci> <ppc64el> <regression> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1434555>
 * fwereade out to laura's school again, bbl
<ericsnow> perrito666, wwitzel3: standup?
<perrito666> ericsnow: going
<voidspace> dimitern: back
<dimitern> voidspace, hey, so a few comments for the worker implementation
<voidspace> dimitern: shot
<voidspace> *shoot
<dimitern> voidspace, 1) you could do it much simpler if you implement it as a StringsWorker - no need to do most of the things for a full worker, e.g. implement the loop
<voidspace> dimitern: ah, ok
<dimitern> voidspace, e.g. return NewStringsWorker in the ctor and just do the SetUp and Handle implementation
<voidspace> dimitern: yeah, I see - looking at StringsWorker now
<voidspace> nice, thanks
<dimitern> voidspace, another thing - it needs to be a singular worker as well
<voidspace> dimitern: right - I thought that was done in the way we start it
<voidspace> dimitern: don't we wrap singular workers?
<dimitern> voidspace, hmm let me check
<dimitern> voidspace, awesome - you're correct - the state worker is already a singular
<dimitern> voidspace, and a final thing - I'd rather define an interface with all state methods you need and take that in the ctor
<dimitern> voidspace, this way it's much easier to mock and test later
<voidspace> dimitern: so an interface for state.State
<voidspace> dimitern: ok, I was just following the other workers
<voidspace> easy to do though, we don't use many methods
<voidspace> dimitern: so no problem
<dimitern> voidspace, yeah, only a subset of it - the methods you need
<dimitern> voidspace, great, thanks!
<voidspace> dimitern: thank you
<voidspace> dimitern: so in my worker I want to kick off a goroutine to remove the initial dead addresses
<voidspace> dimitern: and the way that's done currently is in a loop watching to see if the worker has died
<voidspace> dimitern: selecting on the dying channel
<voidspace> dimitern: which is really tomb.Dying()
<dimitern> voidspace, yeah, that sounds correct
<voidspace> dimitern: if I use a StringsWorker I don't have access to that
<voidspace> dimitern: as tomb is private to the StringsWorker
<voidspace> just looking to see if there's anything else
<voidspace> in a handler I would do it in SetUp
<voidspace> dimitern: not that I can see
<voidspace> dimitern: we don't necessarily need to worry about the worker dying
<voidspace> dimitern: if the *state* dies then a call will error out and we'll bail immediately
<voidspace> dimitern: ditto on the connection to the environ
<dimitern> voidspace, right, well you do have TearDown()
<dimitern> voidspace, which could be used to stop that goroutine
<voidspace> dimitern: so write to a channel in TearDown and select on that?
<voidspace> cool, that'll do
<voidspace> dimitern: thanks :-)
<dimitern> voidspace, np, hope it looks nicer this way :)
<voidspace> well I deleted a bunch of code which is always good
<dimitern> voidspace, indeed!
<voidspace> dimitern: hmm... except StringsWorker is designed to work with api watchers
<voidspace> dimitern: SetUp must return an api watcher
<voidspace> and we're using EnvironObserver
<voidspace> well
<voidspace> I'm not sure that we even need that
<voidspace> and we are using a watcher
<voidspace> let me look into it
<dimitern> voidspace, hmm that's true - but api.StringsWatcher is just an interface
<dimitern> voidspace, and there's the equivalent state.StringsWatcher interface which is the same
<dimitern> voidspace, except the api one does not have Wait and Kill, only Stop
<dimitern> voidspace, but Stop is calling Wait(Kill(nil)) internally, so that's fine
<voidspace> dimitern: so I can call and use WatchIPAddresses
<voidspace> dimitern: which is the point I guess...
<dimitern> voidspace, yeah
<voidspace> :-)
<dimitern> voidspace, btw why did you need EnvironObserver?
<voidspace> dimitern: pretty sure now that we didn't
<voidspace> dimitern: it was code I "borrowed" and didn't trim correctly...
<voidspace> it's gone
<dimitern> voidspace, sweet :)
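(A rough sketch of where this lands: a SetUp/Handle/TearDown handler wrapped by worker.NewStringsWorker, with SetUp returning the state watcher through the api-level interface. Names follow juju's worker and api/watcher packages of the time; the narrow state interface is an assumption:)
    package addresser

    import (
        apiwatcher "github.com/juju/juju/api/watcher"
        "github.com/juju/juju/state"
        "github.com/juju/juju/worker"
    )

    // stateAddresses is the narrow slice of *state.State the worker
    // needs, per the "interface with the methods you use" advice above.
    type stateAddresses interface {
        WatchIPAddresses() state.StringsWatcher
    }

    type addressHandler struct {
        st stateAddresses
    }

    // SetUp returns the watcher whose changes drive Handle; the state
    // watcher satisfies the api-level interface, as noted above.
    func (h *addressHandler) SetUp() (apiwatcher.StringsWatcher, error) {
        // the one-shot dead-address cleanup goroutine would be kicked
        // off here; stopping it is hashed out just below
        return h.st.WatchIPAddresses(), nil
    }

    // Handle is called with each batch of changed IP address ids.
    func (h *addressHandler) Handle(ids []string) error {
        // look up each address and release it if it is Dead, etc.
        return nil
    }

    // TearDown stops anything SetUp started.
    func (h *addressHandler) TearDown() error {
        return nil
    }

    // NewWorker wraps the handler; NewStringsWorker supplies the loop.
    func NewWorker(st stateAddresses) worker.Worker {
        return worker.NewStringsWorker(&addressHandler{st: st})
    }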
<voidspace> dimitern: so I now have a dying channel that my goroutine is selecting on
<voidspace> dimitern: in TearDown shall I just write something arbitrary to the channel?
<voidspace> in the tomb package I can't see the dying channel ever written to (probably I just haven't found it)
<voidspace> in Kill it is just closed
<dimitern> voidspace, I think it's better to just close that channel in TearDown - no need to send (and potentially block)
<voidspace> dimitern: but if we close it, does that trigger the select listening on it?
<dimitern> voidspace, yes it does
<voidspace> ah, that'll be how the code works then
<voidspace> I tried googling but didn't find that particular fact
<voidspace> thanks
<dimitern> voidspace, e.g. case _, ok := <-someChan: if !ok { closed .. }
<voidspace> dimitern: right, our particular select is just  case <-a.dying:
<voidspace> dimitern: that won't be enough, we'll need the ok
<dimitern> voidspace, giving a chan to a goroutine and closing it to signal something to it is pretty common :)
<dimitern> voidspace, I'm not even sure you'll need to check for !ok if that's the only reason for the select case
<dimitern> voidspace, try it out :)
<voidspace> dimitern: yeah, I'll try it in the playground
<voidspace> dimitern: yep, close triggers it without needing ok
<voidspace> dimitern: http://play.golang.org/p/x8O2HdTgOm
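(A self-contained version of what that playground snippet demonstrates: closing a channel wakes every pending receive, so no ok-check is needed when close is the only signal:)
    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        dying := make(chan struct{})
        done := make(chan struct{})

        go func() {
            defer close(done)
            for {
                select {
                case <-dying: // fires once dying is closed; nothing is ever sent
                    fmt.Println("goroutine: dying closed, exiting")
                    return
                case <-time.After(50 * time.Millisecond):
                    fmt.Println("goroutine: working")
                }
            }
        }()

        time.Sleep(120 * time.Millisecond)
        close(dying) // what TearDown would do; close never blocks
        <-done
    }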
<dimitern> voidspace, nice :)
<alexisb> wwitzel3, ericsnow ping
<ericsnow> alexisb: hi
<alexisb> and happy friday!
<alexisb> heya ericsnow
<alexisb> can you work with sinzui and xwwt to get release notes together for gce provider?
<ericsnow> alexisb: I started to add an entry the other day and wasn't sure what more to say than "GCE is now supported as a provider" :)
<ericsnow> alexisb: so what else should be there?
<alexisb> I am sure that evilnickveitch and sinzui can give you better details then I
<sinzui> ericsnow, I am editing https://docs.google.com/a/canonical.com/document/d/1V6AU2mEbTOXQygsn-9eZg-DHjxcVpMylyq017KS78mU/edit and writing everything based on what I think is needed reading juju init --show
<alexisb> however at minimum we should have details on what is supported and how it is used
<ericsnow> sinzui, alexisb: got it
<ericsnow> alexisb, sinzui: I'll add that
<alexisb> ericsnow, thanks
<perrito666> sinzui: did my fix unlock 1.23?
<sinzui> perrito666, CI is still testing the previous revision
 * perrito666 headbutts the desk
<jw4> perrito666: you've been learning from thumper
<perrito666> jw4: I think he headbutts other people's hands
<jw4> perrito666: it's okay - you'll get there too
<alexisb> katco, ping
<katco> alexisb: hi hi
<alexisb> hey there katco are you in full force today?  or our you out?
<alexisb> s/our/are
<katco> alexisb: i took today off to celebrate the equinox (plus i happen to be sick :( ) you caught me checking email though. what can i do for you?
<alexisb> nothing, you are off, go!
<katco> alexisb: had i not been sick we would be at the butterfly house with our daughter =/
<alexisb> :(
<ericsnow> natefinch: ping
<alexisb> ok ericsnow given you are already in the release notes and my internete sucks and google docs hates me I am volunteering you :)
<alexisb> I need someone to add leader elections to the 1.23 release notes, noting that it is behide a feature flag
<alexisb> william, jam and I can add details later
<ericsnow> alexisb: yikes
<ericsnow> alexisb: FWIW, I know nearly nothing about it
<alexisb> just need a placeholder
<ericsnow> alexisb: I'm glad to give it a go though
<ericsnow> alexisb: k
<alexisb> ericsnow, please don't spend any time on the details
<ericsnow> alexisb: got it
<dimitern> voidspace, dooferlad, as OCRs can you have a look at this huge, but mostly mechanical refactoring branch? http://reviews.vapour.ws/r/1219/
<dooferlad> dimitern: on it
<dimitern> dooferlad, ta
<mattyw> natefinch, ping?
<perrito666> mattyw: he is most likely OoO
<mattyw> perrito666, ok thanks
<natefinch> back everyone
<natefinch> wwitzel3: you around?
<sinzui> natefinch, dimitern do you have a minute to review http://reviews.vapour.ws/r/1220/
<dimitern> sinzui, looking
<dimitern> sinzui, ship it!
<sinzui> thank you dimitern
<mup> Bug #1431918 changed: gce minDiskSize incorrect <tech-debt> <juju-core:Fix Released by wwitzel3> <juju-core 1.23:Fix Released by wwitzel3> <https://launchpad.net/bugs/1431918>
<natefinch> what's the metadata to turn off apt-get upgrade, anyone remember?
<hazmat> os-update: false?
<hazmat> natefinch: per the latest cloud init docs package_upgrade: false
<hazmat> http://cloudinit.readthedocs.org/en/latest/topics/examples.html#run-apt-or-yum-upgrade
<hazmat> oh.. you mean the juju syntax
<natefinch> hazmat: heh, yeah, I think it's apt_upgrade: false  ... at least that seems to be what the code is saying, if I'm in the right place
<hazmat> natefinch: enable-os-upgrade: false
<hazmat> https://github.com/juju/docs/blob/master/src/en/config-general.md
<natefinch> hazmat: wow, I didn't know we had all that documented, that's awesome.  Thanks for the pointer
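(For reference, that option sits per-environment in environments.yaml; a minimal stanza, with the environment name and type as placeholders:)
    environments:
        amazon:
            type: ec2
            enable-os-upgrade: false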
<perrito666> it is amazing how noticeable it is when the mosquito repellent's effect has worn off
<natefinch> haha
<perrito666> it seems to work like a switch; from one moment to the next, everything itches
<natefinch> yuup
 * natefinch contemplates ignoring all this "return a special error to get jujud to restart" and just calling os.Exit(1)
<perrito666> natefinch: It took me 6 months to get approval on a patch that does that same thing and has a very good excuse :p hope you have patience
<natefinch> perrito666: I don't need patience, I have internal customers that need this in 1.23
<natefinch> perrito666: the annoying thing is, there's a few different things that LOOK like they should do the right thing.... and don't.
<perrito666> well as fwereade would certainly say, "have you read the doc about that?"
<perrito666> and, me too, was just 10 mins looking at an error that arose from plural vs singular
<natefinch> ug
<natefinch> gah... now jujud isn't restarting at all. Sonofa
<natefinch> like, it stays dead
<natefinch> thanks a lot, upstart
<perrito666> natefinch: what did you do?
<natefinch> os.Exit(1)
<perrito666> oh, that is odd, upstart should restart you on 1
<natefinch> and somehow the logging statements right before that aren't being flushed to disk.  Thanks a lot, loggo.
<perrito666> natefinch: well you are exiting :)
<perrito666> which most likely prevents all other routines to finish whatever they where doing
<natefinch> perrito666: it should be flushing the log messages.. what if I log something important right before a crash?
<jw4> OCR PTAL : http://reviews.vapour.ws/r/1221/ <--- reviewed previously and merged to 1.23 - this is to forward port it to master
<natefinch> it's in the same goroutine, too
<perrito666> can you not force loggo to flush?
<natefinch> perrito666: I don't immediately see a way to... the problem is that it just takes writers
<natefinch> so there's no flush, even if the underlying thing can flush
<perrito666> you are having a classic friday problem
<natefinch> been having friday problems for two weeks
<sinzui> hi natefinch bug 1434680 is super critical
<mup> Bug #1434680: 1.22.0 cannot upgrade to 1.23-beta1 or 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1434680>
<sinzui> natefinch, We are in the middle of a release of 1.23-beta1. we have a choice of aborting or continuing with the caveat that upgrades from 1.22.0 are broken
<mup> Bug #1434070 changed: upgrades are broken in master 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1434070>
<mup> Bug #1434680 was opened: 1.22.0 cannot upgrade to 1.23-beta1 or 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1434680>
<jw4> sinzui, natefinch I don't have access to the upgrade logs for those bugs - can you verify whether they are related to the uniter stopped state upgrades I just merged into 1.23 ?
<perrito666> I can see this happening a lot: machine-0: 2015-03-20 15:19:51 ERROR juju.rpc server.go:554 error writing response: EOF
<perrito666> that was from allmachines
<perrito666> machine 0 is full of
<perrito666> 2015-03-20 15:43:16 DEBUG juju.mongo open.go:122 TLS handshake failed: x509: certificate is valid for localhost, juju-apiserver, not juju-mongodb
<sinzui>  natefinch has access to everything
<jw4> thanks sinzui - I'll eagerly ^H^H^H^H anxiously await natefinch 's verdict
 * natefinch crackles with power.
<jw4> hehe
<jw4> is that kragle?
<natefinch> oh wait, no, I was just sitting on a bag of chips
<jw4> haha
 * jw4 assumes *everyone* has watched the LEGO movie
<natefinch> I have not.... we got it, but it was too intense for our 3 year old
<natefinch> someday I should throw it in after they go to bed
<jw4> yeah; 3 is a bit young for it - my 9 year old liked it though
<sinzui> natefinch, as a canonical employee you can see every log at http://reports.vapour.ws/releases/2466
<jw4> although it RUINED watching The Hobbit with him - he kept associating Gandalf with the 'Gandalf' in LEGO movie
<natefinch> sinzui: ima lookin
<natefinch> jw4: haha... always watch the old movies first
<sinzui> natefinch, and maybe jw4, as project members of juju-core, you must have permission to see hidden comments in Lp...otherwise there is a regression
<jw4> kk
<natefinch> sinzui: is there a trick to seeing the hidden ones?
<sinzui> they should just be dark grey
 * sinzui reviews project
<natefinch> sinzui: btw, the comments are hidden, but I still see the attachments in the lower right
<sinzui> ha ha. Launchpad really sucks
<natefinch> lol
<sinzui> natefinch, I need to review them now :(
<jw4> perrito666: oh - you were talking about the upgrade regression when you mentioned the TLS handshake failures
<sinzui> natefinch, I deleted the attachments because Juju still puts certs in DEBUG when juju is not run as DEBUG
<natefinch> sinzui: sorry about that.... juju sucks, too
<jw4> I'm seeing those too in my local upgrade test
<sinzui> natefinch, you are an admin of ~juju. as the project maintainer you should be seeing every hidden message on the juju-core project? Are you logged in?
<perrito666> jw4: I was, I thought that since natefinch was not answering I could throw you a line
<natefinch> sinzui: it says i'm logged in
<jw4> perrito666: thank you! :)  I was trying to figure out whether you were referring to your previous conversation with natefinch
<sinzui> natefinch, jw4, I uploaded redacted logs to https://bugs.launchpad.net/juju-core/+bug/1434680
<mup> Bug #1434680: 1.22.0 cannot upgrade to 1.23-beta1 or 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1434680>
<jw4> sinzui: thank you!!
<jw4> okay, those seem to be the same errors I'm getting that perrito666 reported
<jw4> it's like somehow the state server started presenting the wrong TLS certificate?
<natefinch> sinzui: I can get to the logs on the vapour link
<jw4> or more like the client is trying to connect using the wrong server name?
<sinzui> jw4, Those might also be what dimitern reported on https://bugs.launchpad.net/juju-core/+bug/1434070
<mup> Bug #1434070: upgrades are broken in master 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1434070>
<jw4> hmm; looks likely
<ericsnow> sinzui: could it be a network issue?
<ericsnow> sinzui: not the network but juju networking
<sinzui> ericsnow, for every substrate?
<jw4> hmm I take it back - the logs reported by dimitern don't include the TLS handshake error
<perrito666> jw4: sure they do
<ericsnow> sinzui: is there a machine-0.log somewhere from a successful run that we could compare?
<perrito666> https://bugs.launchpad.net/juju-core/+bug/1434070/comments/8
<mup> Bug #1434070: upgrades are broken in master 1.24-alpha1 <ci> <regression> <upgrade-juju> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1434070>
<sinzui> ericsnow, no upgrade passed so no
<sinzui> ericsnow, I think a downgrade to 1.21.3, is required. that is what was stable until yesterday
<jw4> perrito666: ah I see - the initial attachements didn't include machine-0
<jw4> maybe this is normal, but when I try to do 'juju status' with my borked upgrade it looks like 'juju status' is starting a new mongod process under my uid ?
<natefinch> jw4: that doesn't sound normal to me
<jw4> it's running using my UID and the PPID is 1
<jw4>  /usr/lib/juju/bin/mongod
<natefinch> 0.o
<perrito666> well it is local provider
<perrito666> and your machine IS machine-0
<perrito666> jw4: care to pastebin you ps faxu?
<jw4> perrito666: sure
<jw4> http://paste.ubuntu.com/10637523/
<jw4> pid 33016 is the mongod that appears when I use 'juju status'
<perrito666> so
<perrito666> weldon   33016  0.5  0.5 250800 57240 pts/3    Sl   12:43   0:17 /usr/lib/juju/bin/mongod --dbpath /tmp/test-mgo229055436 --port 59450 --nssize 1 --noprealloc --smallfiles --nohttpinterface --oplogSize 10 --ipv6 --nounixsocket --nojournal --sslOnNormalPorts --sslPEMKeyFile /tmp/test-mgo229055436/server.pem --sslPEMKeyPassword xxxxxxx
<perrito666> that is part of a testrun
<jw4> gah!
<perrito666> that is why it has your uid
<jw4>  sudo redact-last-20-lines
<jw4> :)
<perrito666> look into your /tmp you might also have some garbage from old tests
<perrito666> brb
<jw4> tx perrito666
<natefinch> dumb test mongo sticking around
<jw4> yeah, sorry for the noise
<natefinch> it's happened to all of us
<perrito666> jw4: local provider happens
<jw4> hehe
<mup> Bug #1313016 changed: allow annotations to be set on charms <api> <charms> <improvement> <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1313016>
<mup> Bug #1389326 changed: juju-backup is not a valid plugin <backup-restore> <plugins> <juju-core:Fix Released by marcoceppi> <https://launchpad.net/bugs/1389326>
<mup> Bug #1403955 changed: DHCP's "Option interface-mtu 9000" is being ignored on bridge interface br0 <cts> <kvm> <lxc> <network> <juju-core:Fix Released> <isc-dhcp (Ubuntu):Confirmed> <https://launchpad.net/bugs/1403955>
<mup> Bug #1409639 changed: juju needs to support systemd for >= vivid <hs-arm64> <systemd-boot> <juju-core:Fix Released by ericsnowcurrently> <juju-core (Ubuntu):Triaged> <juju-core (Ubuntu Vivid):Triaged> <https://launchpad.net/bugs/1409639>
<mup> Bug #1415671 changed: Joyent provider uploads user's private ssh key by default <joyent-provider> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1415671>
<mup> Bug #1415693 changed: Unable to bootstrap on cn-north-1 <bootstrap> <ec2-provider> <online-services> <juju-core:Fix Released by cox-katherine-e> <https://launchpad.net/bugs/1415693>
<mup> Bug #1421237 changed: DEBUG messages show when only INFO was asked for <ci> <security> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1421237>
<mup> Bug #1423454 changed: cloud-image-utils needs to be installed <tech-debt> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1423454>
<mup> Bug #1424069 changed: juju resolve doesn't recognize error state <regression> <resolved> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1424069>
<mup> Bug #1424590 changed: juju status --format=tabular <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1424590>
<mup> Bug #1427840 changed: ec2 provider unaware of c3 types in sa-east-1 <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1427840>
<mup> Bug #1428117 changed: EC2 eu-central-1 region not in provider <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1428117>
<mup> Bug #1428119 changed: EC2 provider does not include C4 instance family <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1428119>
<mup> Bug #1428430 changed: AllWatcher does not remove last closed port for a unit, last removed service config <api> <juju-core:Fix Released by themue> <https://launchpad.net/bugs/1428430>
<mup> Bug #1431130 changed: make kvm containers addressable (esp. on MAAS) <addressability> <kvm> <maas-provider> <network> <juju-core:Fix Released by dooferlad> <https://launchpad.net/bugs/1431130>
<mup> Bug #1431134 changed: fix container addressability issues with cloud-init, precise, when lxc-clone is true <addressability> <cloud-init> <ec2-provider> <lxc> <maas-provider> <network> <precise> <usability> <juju-core:Fix Released by dimitern> <https://launchpad.net/bugs/1431134>
<perrito666> daf...
<perrito666> sinzui: is that you, or did mup go bonkers again?
<sinzui> perrito666, someone subscribed mup to useless bug information
<jw4> perrito666: well I just got 15 emails too so I suspect it's not mup's fault
<perrito666> jw4: i did not, or I have those filtered
<sinzui> Team ~juju should be getting emails about bugs, but I don't think anyone cares about closing except the reporter
<jw4> yeah, it was just the Fix Released notification
<natefinch> perrito666: probably filtered.  Mine are...
<natefinch> sinzui: I gotta run, have company and they're expecting me to get dinner.    Afraid I've not been much use anyway (other than pointing out limitations in Launchpad)
<sinzui> natefinch, okay, I wasn't really expecting a fix today
<ericsnow> sinzui:  I just bootstrapped local provider with a fresh-built 1.22 (tip), and then did the same upgrade command from the CI logs
<ericsnow> sinzui: and it worked fine
<ericsnow> sinzui: maybe
<ericsnow> sinzui: the command finished successfully but it looks like it didn't do much, so I'm not sure I have much to add yet
<sinzui> ericsnow, I bootstrapped with 1.22.0 locally, then upgraded to 1.23-beta1 and lost control
<ericsnow> sinzui: which revision is 1.22.0?
<sinzui> ericsnow, 1.22.0 the package we ask users to use
<sinzui> ericsnow, per https://docs.google.com/a/canonical.com/document/d/1ILRWMChkqZ7YeXNCNsmaF89ewqAL_j0qVeF3E9BMwys/edit#heading=h.dp1wyrj1wujg, It was commit 44caaac
<ericsnow> sinzui: if I build from that revision it also works
<ericsnow> (I'm on trusty)
<sinzui> ericsnow, I suspect you are tainted. We run more machines in more environments without developer tools installed.
<sinzui> ericsnow, 14 tests means 14 substrates, machines, archs, and series. it isn't a single dirty machine
<ericsnow> sinzui: sure
 * sinzui is bootstrapping again
<perrito666> sinzui: which might be the reason why no one broke this in its own machine
<sinzui> ericsnow, I think I got a volunteer to test vivid with 1.23-beta1 on monday. maybe he can explain the lxc breakage
<ericsnow> sinzui: sweet
<ericsnow> sinzui: also I don't think the unit test failures on vivid are me :)
<sinzui> ericsnow, no, I don't think so either. though something is very odd about two of them, because I only see ping failures when running tests in lxc
<sinzui> ericsnow, I cannot upgrade
<sinzui> I will attach my personal log
<ericsnow> oh how I wish local provider ran entirely in VMs
<sinzui> ericsnow, I think we all do.
<ericsnow> sinzui: I put it on the agenda for Nuremberg (and thumper agreed to own it)
<perrito666> you might have the most attended session of all
<perrito666> beware
<sinzui> ericsnow, indeed thumper has ideas to make it work right
<ericsnow> sinzui: we've had the ideas for a while, just not the resources
<sinzui> yeah. I know the pain.
<ericsnow> sinzui: speaking of vivid, how do I re-run local-deploy-vivid-amd64 with --DEBUG?
<sinzui> ericsnow, we can add --debug to the command line arg of the test. then rerun the test with the current packages. CI only retests current packages
<ericsnow> sinzui: that's fine, I just want DEBUG logging
<sinzui> ericsnow, bugger --debug was lost this week in an upgrade...I will have it back in a few minutes
<ericsnow> sinzui: no worries
<jw4> in the TestPrune* dblogpruner tests I keep getting an error : "failed to retrieve log counts: no such cmd: scale"
<jw4> running 'scale' has ubuntu advising me to install csound-utils which doesn't seem right
<jw4> hmm interesting "scale" seems to refer to a key in a bson.M{} struct
<jw4> holy moly it's a bug
<jw4> :)
<jw4> OCR PTAL : Fix for bug 1434741 http://reviews.vapour.ws/r/1222/
<mup> Bug #1434741: PruneLogs suffers from an indeterminate map iteration order bug <juju-core:New> <https://launchpad.net/bugs/1434741>
<mup> Bug #1434741 was opened: PruneLogs suffers from an indeterminate map iteration order bug <juju-core:New> <https://launchpad.net/bugs/1434741>
#juju-dev 2015-03-21
<redelmann> hi, simple question: can I access my juju environment from a different machine than the one I created it from? just by having a juju backup?
<redelmann>  the bootstrap instance is in aws
#juju-dev 2015-03-22
<menn0> davecheney, mwhudson: you probably know that waigani and thumper are out today due to holiday in otago
<menn0> davecheney, mwhudson: what do you want to do re the standup?
<jw4> thanks menn0 (and mgz) :)
<menn0> jw4: I didn't realise there was this limitation with Database.Run. I can understand why now that I think about it more.
<jw4> menn0: yeah, it was an interesting bug
<menn0> jw4: mgo should probably error/panic if you give a bson.M to Run given that it's not going to work reliably. oh well.
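A minimal sketch of the pitfall jw4 hit, assuming mgo as used above (the collStats command below is illustrative; the real PruneLogs query may differ). The MongoDB server treats the first key of a command document as the command name; bson.M is a Go map, so its marshalled key order is random and "scale" can land first, producing "no such cmd: scale". bson.D preserves key order, so the command name always comes first.

    package example

    import (
        mgo "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    // collectionStats runs a server command whose first key must be the
    // command name, so the command document must be ordered.
    func collectionStats(db *mgo.Database, name string) (bson.M, error) {
        var result bson.M
        // Unreliable: db.Run(bson.M{"collStats": name, "scale": 1}, &result)
        // may marshal "scale" first, yielding "no such cmd: scale".
        err := db.Run(bson.D{{"collStats", name}, {"scale", 1}}, &result)
        return result, err
    }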
<jw4> I guess I'll wait til CI is unblocked to merge... back to my sunday afternoon :)
<davecheney> menn0: stndup ? i just went to breakfast as normal
<davecheney> :)
<davecheney> i'm working on gccgo crap all day
<davecheney> EOM
<menn0> davecheney: all good. mwhudson and I had a quick chat and that was it.
<davecheney> blah blah, logging is complicated, mongo blows, also, shared libraries
<davecheney> got it
<menn0> davecheney: :)
<wallyworld_> anastasiamac: standup?
#juju-dev 2016-03-21
<thumper> wallyworld: added a comment to that review...
<wallyworld> ok
<wallyworld> thumper: it needs to be a file, not a dir
<thumper> filepath.Join(c.MkDir(), "somefile"),
<wallyworld> doh
<wallyworld> sigh
<wallyworld> fair point
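For context, a sketch of what thumper's one-liner buys in a gocheck suite (the suite and assertion below are illustrative, not the code under review): c.MkDir() returns a fresh per-test temp directory, so joining a file name onto it yields a path the code under test can treat as a file rather than a directory.

    package example

    import (
        "io/ioutil"
        "path/filepath"

        gc "gopkg.in/check.v1"
    )

    type someSuite struct{}

    var _ = gc.Suite(&someSuite{})

    func (s *someSuite) TestWantsAFile(c *gc.C) {
        // c.MkDir() is a per-test temp dir; the join yields a file path.
        path := filepath.Join(c.MkDir(), "somefile")
        err := ioutil.WriteFile(path, []byte("contents"), 0644)
        c.Assert(err, gc.IsNil)
    }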
<mup> Bug #1559280 changed: creating hosted model config: opening model: endpoint: expected string, got nothing <bootstrap> <ci> <juju-core:New> <juju-core admin-controller-model:Triaged> <https://launchpad.net/bugs/1559280>
<mup> Bug #1559285 changed: creating hosted model config: opening model: storage-endpoint: expected string, got nothing <bootstrap> <ci> <juju-core:New> <juju-core admin-controller-model:Triaged> <https://launchpad.net/bugs/1559285>
<wallyworld> thumper: that bug is fix committed, i wonder if we could remove the blocker tag to unblock master
<cherylj> katco: can you review?  http://reviews.vapour.ws/r/4247/
<axw> wallyworld: guessing wildly, I'm wondering if the --config CI is passing has an "authorized-keys" with an empty string value
<axw> wallyworld: in controller/modelmanager we don't update the attrs if there's an existing key
<axw> wallyworld: we don't check if the value is empty tho
<wallyworld> axw: i suspect the same - i am updating the code to account for that - i have passing tests but it is still failing in live testing, but i am close
<axw> wallyworld: ok cool
<davecheney> lucky(~/src/github.com/juju/juju) % juju kill-controller testing
<davecheney> error: controller testing not found
<davecheney> lucky(~/src/github.com/juju/juju) % juju bootstrap --upload-tools testing aws/ap-southeast-2
<davecheney> ERROR cannot create controller "local.testing" info: model info already exists
<davecheney> is it deleted or not !?!?
<cherylj> there's already a bug for that, davecheney.  I've had to manually delete things out of cache.yaml to work around it
<davecheney> thanks
<wallyworld> axw: it's failing because jujud bootstrap command is getting passed a controller model config which still has authorized-keys-path and it creates a cfg using NoDefaults which means this is not stripped out and causes havoc further down the line
<axw> wallyworld: argh, of course
<wallyworld> axw: i wonder if bootstrap jujud should be using WithDefaults
<axw> wallyworld: where?
<wallyworld> jujud/bootstrap.go Run()
<axw> wallyworld: i don't think so. by that stage, it should have a complete config
<wallyworld> but there i think it is simply using what was passed from the bootstrap cmd
<wallyworld> so may not be complete
<wallyworld> i could be wrong
<axw> wallyworld: pretty sure it shouldn't be, possibly broken during changes. will look
<wallyworld> axw: so what we should be doing is stripping authorized-keys-path on the bootstrap client side when we read the keys, i'm guessing we don't do that
<axw> wallyworld: yep
<axw> wallyworld: atm on server we'll ignore authorized-keys if authorized-keys-path is set
<axw> wallyworld: so we should remove authorized-keys-path after reading
<wallyworld> yep
<axw> from hosted model config
<axw> wallyworld: actually we do remove after reading, just not from hosted model config
<wallyworld> axw: that is done if UseDefaults is used
<wallyworld> in my case though host model config does not have that key i don't think
<wallyworld> hosted
<axw> wallyworld: if bootstrap config has authorized-keys-path, it must've come from --config. everything in --config will be added to hosted model config
<wallyworld> yeah
<wallyworld> axw: here's a fix http://reviews.vapour.ws/r/4252/
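The shape of the fix being discussed, as a hedged sketch (the helper below is an assumption, not the code in the review): read the key file once on the client, inline the keys, and delete the -path attribute so a config later built with NoDefaults never carries it into the hosted model.

    package example

    import "io/ioutil"

    // inlineAuthorizedKeys replaces authorized-keys-path with the file's
    // contents under authorized-keys, then removes the path attribute.
    func inlineAuthorizedKeys(attrs map[string]interface{}) error {
        path, ok := attrs["authorized-keys-path"].(string)
        if !ok || path == "" {
            return nil // nothing to inline
        }
        keys, err := ioutil.ReadFile(path)
        if err != nil {
            return err
        }
        attrs["authorized-keys"] = string(keys)
        // Drop the path so it cannot leak into the hosted model config.
        delete(attrs, "authorized-keys-path")
        return nil
    }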
<thumper> wallyworld: what was your thinking on unblocking?
<wallyworld> thumper: i think we should
<thumper> wallyworld: have you personally run the tests on windows?
<wallyworld> no, don't have windows
<thumper> I'd like to wait until at least someone has run the windows tests
<wallyworld> ok
<axw> wallyworld: reviewed
<wallyworld> ta
<natefinch> wallyworld: is the timestamp in a bzr commit ref reliable?  e.g. ian.booth@canonical.com-20141121040613-ztm1q0iy9rune3zt ?  can I parse that timestamp and trust that if I compare two of them, the later one definitely comes from a later ref?
<natefinch> wallyworld: context - I'm trying to write an automated dependencies.tsv merge tool
<wallyworld> natefinch: i think so
<wallyworld> natefinch: for bzr, the rev id is monotonically increasing
<wallyworld> so simpler to use that i think
<natefinch> I wasn't sure if the id was reliable... either one is fine, I just want to be able to tell which one is later
<natefinch> since basically every time I merge dependencies.tsv I just pick whichever commit is later
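A hypothetical helper for the merge tool natefinch describes (the function below is illustrative, not his code; as wallyworld notes, comparing the monotonically increasing rev id may be simpler): the timestamp is the second-to-last dash-separated field of a bzr revision id.

    package main

    import (
        "fmt"
        "strings"
        "time"
    )

    // bzrRevidTime extracts the UTC timestamp from a bzr revision id of
    // the form user@host-YYYYMMDDHHMMSS-randomsuffix.
    func bzrRevidTime(revid string) (time.Time, error) {
        parts := strings.Split(revid, "-")
        if len(parts) < 3 {
            return time.Time{}, fmt.Errorf("unexpected bzr revid %q", revid)
        }
        // The timestamp is the second-to-last dash-separated field.
        return time.Parse("20060102150405", parts[len(parts)-2])
    }

    func main() {
        t, err := bzrRevidTime("ian.booth@canonical.com-20141121040613-ztm1q0iy9rune3zt")
        fmt.Println(t, err) // 2014-11-21 04:06:13 +0000 UTC <nil>
    }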
<axw> wallyworld: I've repro'd the "cannot obtain provisioning script" bug, digging in now
<wallyworld> axw: what did you do to repro?
<axw> wallyworld: bootstrapped lxd provider, added another lxc machine OOB and did "add-machine ssh:.."
<wallyworld> axw: hmmm, ok. i basically did the same thing except my bootstrap machine was manual
<axw> wallyworld: yeah I think I did that last time I tested and that worked
<wallyworld> ok, at least then it is explainable
<axw> wallyworld: hrm, CI definitely is bootstrapping with manual tho
<wallyworld> yeah, go figure
<axw> wallyworld: anyway, *seems* to be related to ControllerInstances not returning any instances
<davecheney> thumper: http://reviews.vapour.ws/r/4254/
<wallyworld> axw: i'm also at a loss as to why show-controller is failing in the restore job
<wallyworld> i've asked for more info like the yaml files and args passed to show-controller
<axw> wallyworld: ok, haven't gotten to that one yet. I think I see the problem with manual
<axw> wallyworld: we're setting use-sshstorage too late now
<axw> wallyworld: (dumb config attr name, but still required)
<wallyworld> axw: does that explain why it works sometimes?
<menn0> thumper: http://reviews.vapour.ws/r/4255/ pls
<axw> wallyworld: it would work if the controller machine were able to ssh to itself as ubuntu
<wallyworld> which i think it could for me
<wallyworld> as i copied my ssh keys across
<axw> wallyworld: that'll do it. also, the machine agent will write the "system" public key
<axw> but asynchronously, so it's racy
<wallyworld> joy
<wallyworld> axw: i've gone through the python CI scripts for the restore job and everything appears ok. i thought we may have had an issue where the default-model name was the same as the controller name, which is what the scripts do, but i tested that and show-controller worked
<wallyworld> so it appears the scripts bootstrap, do some stuff, then attempt to call show-controller and that fails
<axw> wallyworld: is the script swallowing stderr? I don't see any error messages, just the backtrace
<wallyworld> yeah, that's part of the issue, i haven't checked but it must be
<wallyworld> there's not much to go on
<wallyworld> axw: can we rename that use-sshstorage attribute then, if you are in the area, since we need to retain it for 2.0?
<axw> wallyworld: yup
<wallyworld> \o/
<axw> wallyworld: the issue with lxd is that we're filtering machines by hosted model UUID, and not controller model UUID
<wallyworld> axw: for which issue?
<axw> wallyworld: so ControllerInstances fails if you call it with the hosted model config
<wallyworld> ah the ssh-storage one
<axw> wallyworld: no, related, but different
<axw> wallyworld: lxd as controller vs manual as controller
<axw> you can't add-machine ssh with either
<wallyworld> right, both causes affect the ProvisioningScript
<axw> yup
<axw> wallyworld: I think I might simplify the ProvisioningScript code a bit. we get the Environ to get the addresses ... but we already store addresses in state
<axw> so I'll just have it grab them out of state
<wallyworld> sgtm
<axw> wallyworld: https://github.com/juju/juju/pull/4816
<wallyworld> looking
<wallyworld> axw: how does that simplification work around the bug?
<axw> wallyworld: environs.APIInfo calls ControllerInstances. ControllerInstances is broken for lxd and manual when called with hosted model config
<axw> possibly other providers too
<wallyworld> ah right, that's the bit i was missing, that APIInfo called ControllerInstances
<mup> Bug #1559706 changed: TestFinalizeCredentialInvalidFilePath fails on windows <blocker> <ci> <regression> <unit-tests> <windows> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1559706>
<axw> wallyworld: no rush, PR to drop use-sshstorage: https://github.com/juju/juju/pull/4817
<wallyworld> ok, will look in 5
<wallyworld> axw: small one also http://reviews.vapour.ws/r/4260/
<axw> wallyworld: LGTM
<wallyworld> ta
<mup> Bug #1555744 changed: kill-controller / destroy-controller prevents reuse of controller name <docteam> <juju-release-support> <juju-core:Invalid by wallyworld> <https://launchpad.net/bugs/1555744>
<mup> Bug #1559844 opened: InvalidVolumme.NotFound error destroying aws environment <juju-core:Triaged> <https://launchpad.net/bugs/1559844>
<axw> wallyworld: last one: https://github.com/juju/juju/pull/4819
<wallyworld> ok
<wallyworld> axw: lgtm
<axw> wallyworld: ta
<axw> wallyworld: see my reply to your comment?
<wallyworld> no, looking
<wallyworld> yeah probably
<wallyworld> axw: do you have a recollection of destroying an aws environment and getting an InvalidVolume.NotFound error as per the bug above? i seem to recall we had seen that and maybe it was supposed to be fixed
<axw> wallyworld: I don't recall anything specific, sorry
<wallyworld> sure, np
<wallyworld> axw: our cleanup after failed bootstrap still could leave inconsistent metadata; this pr partially addresses that http://reviews.vapour.ws/r/4262/
<axw> wallyworld: you can't environs.Prepare with an existing controller name, so most of that doesn't make sense. it works because you're reinstating old stuff, but it would be simpler to just check if the environs.Prepare error is not "AlreadyExists", and revert only then
<axw> I mean, it makes sense, but it's not very straight forward
<axw> wallyworld: actually, it doesn't do the right thing in that case. I'll just comment on the PR
<axw> wallyworld: just need to write some tests for login itself, then I can propose the basics
<wallyworld> axw: sgtm, admin controller branch currently in CI
<TheMue> morning
<dimitern> voidspace, ping
<mup> Bug #1542206 changed: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Invalid> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<voidspace> dimitern: got someone with me, be free about 11'ish
<dimitern>  voidspace ok, ping me then please
<voidspace> dimitern: ping
<voidspace> dimitern: actually, gimme 5  - gonna reboot
<dimitern> voidspace, sure
<voidspace> dimitern: half my windows have disappeared and won't return *grrr*
<voidspace> dimitern: aaaaand back
<voidspace> dimitern: you wanna hangout?
<dimitern> voidspace, hey, omw
<jamespage> dimitern, morning - do we have any other flags on network-get other than primary-address?
<jamespage> I could really do with understanding the cidr for a binding as well
<jamespage> that is discoverable - but just wondered...
<dimitern> jamespage, not yet, but more args are planned
<dimitern> jamespage, well, it wasn't really possible to report back anything but the address until the multi-nic work
<dimitern> jamespage, so now I guess we can give you the full proposed YAML/JSON output, as discussed in the network-get spec?
<jamespage> dimitern, well we can parse a yaml OK :-)
<jamespage> dimitern, is that already accessible?
<jamespage> or is that work for you?
<dimitern> jamespage, now almost all the information about a machine's interfaces and addresses is available in state and to network-get
<jamespage> dimitern, ok - so I just need to parse that
<jamespage> right-oh
<dimitern> jamespage, just a sec, let me paste a straw-man proposal of what network-get can return without --primary-address
<dimitern> jamespage, how does this look ? http://paste.ubuntu.com/15455682/
<voidspace> dimitern: https://github.com/juju/juju/pull/4822
<voidspace> dimitern: http://reviews.vapour.ws/r/4264/
<jamespage> dimitern, is cidr-address the ip address with the netmask? or the subnet cidr?
<dimitern> voidspace, LGTM
<dimitern> jamespage, yeah, like in /e/n/i: address 10.20.19.100/24
<dimitern> jamespage, please note there are potentially multiple devices that will be returned for a given binding
<voidspace> dimitern: thanks
<jamespage> dimitern, why would multiple bindings be returned?
<jamespage> sorry devices, not bindings?
<dimitern> jamespage, well, if you have 2 addresses from the same subnet of the same device
<dimitern> jamespage, or alternatively - 2 devices each with an address from the same space the binding is on
<axw> wallyworld: replied to your PR if you want to take another look
<wallyworld> sure
<wallyworld> axw: so it ensures that current account value is correct for example?
<axw> wallyworld: yep. current account is per controller
<axw> wallyworld: remove the controller and it doesn't make sense anymore, so we remove that info
<wallyworld> right, but then we should ensure it is reset to what it was
<wallyworld> i guess that happens when we reset the current controller
<axw> wallyworld: there's two possibilities: you just created the controller, or you didn't. if you created it, there was no value. if you didn't, attempting to create the controller again fails
<axw> (and you don't update models/accounts/bootstrap-config)
<wallyworld> and if decorateAndWrite fails half way through, resetting current controller should be enough
<axw> wallyworld: resetting current-controller and removing the controller from ClientStore
<wallyworld> yes
 * axw nods
<wallyworld> we remove current controller but don't currently reset
<axw> wallyworld: so: move defer up, like you did; reset current-controller; don't remove controller if errors.IsAlreadyExist(err)
<axw> wallyworld: agree?
<wallyworld> axw: yep, i've already fixed the error check
<axw> cool
<wallyworld> axw: should be good to go
<axw> wallyworld: still a bit weird: cannot destroy newly created controller details "my-controller" info
<axw> wallyworld: "controller %q details" would be less weird to me
<axw> dropping info
<wallyworld> ffs, i'm tired
<wallyworld> fixing
<axw> no doubt
<axw> wallyworld: LGTM, thanks
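The cleanup pattern axw and wallyworld converge on, as a rough sketch (the store interface and names below are assumptions, not the juju code): register the rollback before writing anything, reset on failure, and skip removal when the controller already existed, since those details on disk are not ours to delete.

    package example

    import "github.com/juju/errors"

    type clientStore interface {
        RemoveController(name string) error
    }

    func prepareController(store clientStore, name string) (err error) {
        // Register the rollback before any writes happen.
        defer func() {
            if err == nil || errors.IsAlreadyExists(err) {
                return // success, or a pre-existing controller we must keep
            }
            // Best-effort rollback of any details written so far.
            _ = store.RemoveController(name)
        }()
        // ... environs.Prepare: write controller, account and model details ...
        return nil
    }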
<wallyworld> so as of now, i can't quite see the failure mode that would lead to people not being able to bootstrap with the same controller name after a failed attempt. there's a bug or two on that. maybe we have fixed stuff the past week or two
<axw> wallyworld: you mean with that fix?
<axw> wallyworld: if environs.Prepare fails half way through then you're buggered (without that fix)
<wallyworld> yeah, once this goes in, plus i think there's been other small cleanups
<wallyworld> i was more hoping out loud we'd be ok now
<axw> wallyworld: yeah, otherwise it's down to hitting ctrl-c or power out at the wrong time
<axw> wallyworld: I think kill-controller should be expected to work if environs.Prepare returns successfully
<wallyworld> axw: i think so too
<axw> wallyworld: it would be nice if we had some way to do the prepare transactionally/atomically
<wallyworld> yup
<axw> wallyworld: write to a separate directory and slide the symlink or something
<wallyworld> juju 2.1 :-)
<axw> wallyworld: yeah, inclined not to worry about that for now
<wallyworld> we got to get this farking adminbranch landed
<jamespage> dimitern, which golang version do I need for lxd support?
<mgz> jamespage: 1.3 or later
<jamespage> mgz, ta
<natefinch> jamespage: One thing to keep in mind - if you're using the client built for trusty, it won't have LXD, since it wasn't built with 1.3+
<jamespage> natefinch, yes I just hit that
<jamespage> natefinch, rebuilding with 1.3
<jamespage> its in backports afterall
<natefinch> cool
<dimitern> jamespage, on trusty you'll need to add trusty-backports to /etc/apt/sources.list and then do apt-get update
<axw> wallyworld: for the morning: https://github.com/juju/juju/pull/4823
<wallyworld> ok, ty
 * axw away
<katco> morning all
<natefinch> morning
<rick_h_> morning katco
<katco> natefinch: how's the family?
<katco> rick_h_: heyo
<katco> rick_h_: hey, i was wondering this over the weekend: are you a full stack guy, or a unit test guy?
<natefinch> katco: good.  snow day today
<katco> natefinch: uh oh
<rick_h_> katco: both!
<katco> natefinch: we woke up sat. morning to heavy snow, 1" on the ground. by noon it was all gone and sunny.
<rick_h_> katco: I cringe when there's a PR that doesn't address both ends imo
<rick_h_> katco: hah, gotta love first day of sprint
<rick_h_> spring
<rick_h_> damn, been working here too long to s/sping/sprint
<katco> rick_h_: where do you feel full-stacks should live? ci?
<katco> rick_h_: lol
<natefinch> katco: yeah, it's like 5 inches here... gonna be 40° later, but still gotta snowblow or it'll take forever to melt
<rick_h_> katco: yes, in a perfect world they're in CI and part of gating landing
<katco> rick_h_: cool
<rick_h_> katco: only when they take too long to run, should they not be gating landings, and then there's a push to figure out why/speed up
<rick_h_> katco: imo and all that :)
<katco> rick_h_: hey, everyone's got an elbow ;)
<rick_h_> e.g. can we gate on a test on a single platform/etc so that they can be part of the gating process still
<rick_h_> but yea, I've seen too many API breaks slip past unit tests unnoticed, because nothing full stack was run that used the API.
<rick_h_> katco: kind of like updating the mock, and missing that updating the mock means the real things needs an update
<katco> rick_h_: yeah, that is definitely an antipattern
<rick_h_> katco: that being said, I like it when unit tests break right at the point of function X and say "oh, yea...that got an unexpected arg" and fixing things is much simpler/etc
<rick_h_> katco: and with tons more of those, it's much more bitesize/managable to work on a smaller chunk of code
<katco> rick_h_: yeah me too. i like it even better when the unit test that tests the actual api meets your interfaces breaks when you change your interface ;)
<rick_h_> :)
<katco> rick_h_: so when do you get the new camper?
<rick_h_> katco: hopefully April sometime
<rick_h_> katco: my luck, right before sprints start :P
<katco> rick_h_: doh... well maybe you can get it prepped and then go for a get-away when you get back ;)
<rick_h_> katco: actually, so I think I figured out my camper guys' trip this summer. Figure on heading to St Louis
<rick_h_> my wife doesn't have an interest, but taking the boy to the arch would be cool, and we go by chicago
<katco> rick_h_: dude! if you make the trip, definitely keep me in the loop!
<katco> rick_h_: no pun intended
<katco> rick_h_: we have some fantastic natl. parks around here
<rick_h_> katco: yea, I think we'll head past chicago, stay somewhere there, hit up chicago for a day, crash, go on to your area. Crash, see the arch/etc. And head back. It's like 8hrs so split between two days makes for a relaxing week trip.
<katco> rick_h_: yeah that sounds like a fun trip
<jamespage> dimitern, force of habit - I keep creating lxc containers, not lxd ones...
<jamespage> dimitern, also noted that lxd in a bundle always creates trusty machines
<jamespage> dimitern, resulting in a mismatch when the service is then deployed to them...
<natefinch> jamespage: the always trusty thing may just be a problem with configuration.  You need to import non-trusty images and alias them as ubuntu-<series>
<natefinch> jamespage: so, like from https://jujucharms.com/docs/devel/config-LXD#images - lxd-images import ubuntu xenial amd64 --sync --alias ubuntu-xenial
<jamespage> natefinch, with the maas provider?
<jamespage> I'm running from a branch right now....
<natefinch> jamespage: sorry, no, I was talking about the lxd provider, sorry, I may have misunderstood your comment.
<jamespage> natefinch, I keep doing "juju add-unit --to lxc:1"
<jamespage> rather than lxd:1
 * jamespage faceplants
<natefinch> jamespage: oh, I see
<natefinch> jamespage: heh yeah
<dimitern> jamespage, sorry, was afk - did you manage to sort it out?
<jamespage> dimitern, yeah
<jamespage> dimitern, ok - so seeing some oddness with lxd containers
<jamespage> specifically
<jamespage> I've had a container come up without a default route
<jamespage> and two subsequent ones come up without any interfaces...
<mup> Bug #1559293 changed: show-controller fails <ci> <show-controller> <juju-ci-tools:Triaged> <juju-core:Invalid> <juju-core admin-controller-model:Invalid> <https://launchpad.net/bugs/1559293>
<dimitern> jamespage, so we've seen issues like that and that's why we do ifup -a || true at the end of the runcmds generated for container userdata
<dimitern> jamespage, so I suspect those containers with the issues will recover after cloud-init has completed?
<jamespage> dimitern, it never completes afaict
<jamespage> as it can't download juju tools...
<dimitern> jamespage, is the container on trusty?
<jamespage> dimitern, no its all on xenial
<dimitern> jamespage, anything useful in /var/lib/lxd/ ? e.g. cloud-init-output.log?
<dimitern> jamespage, I've seen an issue with the xenial image my vmaas had (apt-get update was failing with unsigned index or something like that), so I did update the images in my vmaas and retried - this time was ok
<jamespage> dimitern, I'm not having trouble with physical xenial hosts - just the containers...
<dimitern> jamespage, so the provisioning does not complete for any lxd containers?
<jamespage> dimitern, it managed one but that was then missing a default route
<jamespage> dimitern, then other ones did not have any configured network interfaces - and nothing in the lxd profile either - let me recheck that
<dimitern> jamespage, if you can paste some logs from the host machine I'll have a look
<Guest_98765> Allah is doing
<jamespage> dimitern, http://paste.ubuntu.com/15463715/
<jamespage> dimitern, http://paste.ubuntu.com/15463718/
<jamespage> lxd profile for machine
<dimitern> jamespage, ok, that was useful - it fails to store br-eth1.2667 for machine-2  :(
<dimitern> jamespage, can you also paste the log of machine-0 please ?
<jamespage> dimitern, sure one sec
<dimitern> jamespage, also a paste of "ifconfig -a" on machine-2 will hopefully explain why br-eth1.2667 was not added
<katco> natefinch: so you're working on the rest of that bug now... eta?
<natefinch> katco: this afternoon.  Mostly propagated all the changes needed, just need to throw in some more tests
<katco> natefinch: awesome, ty.
<rick_h_> hmm, no one has ops here eh?
<katco> rick_h_: it's never come up
<natefinch> heh
<rick_h_> katco: yea, reaching out to IS on how to fix it, what practice we have there
<natefinch> I just ignored the guy
<katco> same
<natefinch> but it would be nice to be able to punt people so we don't all have to do that
<TheMue> It's no guy, it's just a bot.
<jamespage> dimitern, http://paste.ubuntu.com/15463811/
<jamespage> dimitern, and http://paste.ubuntu.com/15463817/
<dimitern> jamespage, thanks, unfortunately the logging level is not high enough to understand the problem - can you please try this:
<dimitern> jamespage, first run $ juju set-model-config logging-config='<root>=TRACE'
<dimitern> jamespage, and then bounce all jujud processes on machine-0, and paste the log again?
<mup> Bug #1560061 opened: Eth0 device type determined to be "" <ci> <deploy> <juju-core:Incomplete> <juju-core maas-spaces-multi-nic-containers-with-master:Triaged> <https://launchpad.net/bugs/1560061>
<katco> ericsnow: natefinch: btw, https://github.com/juju/juju/pull/4805 very good work guys. you should be super proud :)
<ericsnow> katco, natefinch: +1
<natefinch> my baby's all grown! *sniff*
<katco> ericsnow: natefinch: finishing up the tests for the charm list-resources and then we're just doing bug fixes!
<dimitern> jamespage, any luck?
<katco> cherylj: speaking of which, tyvm for landing that
<ericsnow> cherylj: yeah, thanks!
<cherylj> katco, ericsnow my pleasure!
<natefinch> cherylj: ditto! :)
<natefinch> katco: reading the card I'm working on... it seems like the scope is a little vague.  "add channels wherever we use charm.URL"  I had only been adding channels to the calls required for list-resources.  I can add it elsewhere as well, but it's the sort of change that spiders out to a lot of code.
<cherylj> I wanted to get it in there while there were no conflicts :)
<natefinch> good thinking
<katco> cherylj: hehe
<katco> natefinch: does the stand-alone charm command support formatting flags?
<natefinch> katco: looks like it, yes
<natefinch> katco: looks like they use the formatter stuff just like juju uses
<ericsnow> natefinch: really all that matters for channels is the code that interacts with the charm store (relative to resources)
<ericsnow> natefinch: so the touches to that code will bleed through the API too
<natefinch> ericsnow: yeah
<natefinch> ericsnow: and stuff like add pending resources
<ericsnow> the only other bit is the updates poller, which we still haven't sorted out
<ericsnow> natefinch: sorry the card wasn't more clear
<natefinch> ericsnow: right, so I have changes to use a charmstore.Charm type (url + channel) propagated through a lot of the code here: https://github.com/juju/juju/pull/4790/files
<ericsnow> natefinch: I'll take a look
<natefinch> of course, maybe we should just be using the channel value on charm.URL :/
<natefinch> rogpeppe: charm.URL has a Channel field.. should we just use that for tracking the channel a URL is assigned to?
<rogpeppe> natefinch: no, you should never use it
<rogpeppe> natefinch: it's going to be deleted
<ericsnow> rogpeppe: I had a hunch :)
<natefinch> rogpeppe: well, so, Eric and I were lamenting that we have to change the signature of 1000 methods that used to take charm.URL to now take a URL + channel... why would we not just use the channel field on URL?
<rogpeppe> natefinch: you only need to change that for methods that can take an unresolved url
<natefinch> rogpeppe: yes, but it propagates out through the code more than you might think
<rogpeppe> natefinch: for example?
<rogpeppe> natefinch: this is in cmd/juju/service?
<natefinch> rogpeppe: everywhere we use charmstore.Charm here (and some API methods I haven't fixed yet, but that's not going to be affected by changing charm.URL): https://github.com/juju/juju/compare/feature-resources...natefinch:list-resources-channels?expand=1
<natefinch> rogpeppe: basically any time we have a URL that needs to be passed to the charmstore, we need the channel as well
<jamespage> dimitern, sorry been otp
<rogpeppe> natefinch: any time you have an unresolved URL, right?
<ericsnow> rogpeppe: my "unresolved url" you mean those without a revision (e.g. revision -1)?
<dimitern> jamespage, np
<natefinch> rogpeppe: we need it even for specifically revisioned urls, because different channels can have different resources associated with the same revision of the charm
<rogpeppe> ericsnow: yes
<jamespage> dimitern, http://paste.ubuntu.com/15464291/
<ericsnow> rogpeppe: I was under the impression that a "resolved" url may still be available in different channels
<rogpeppe> natefinch: oh really? oh no.
<natefinch> rogpeppe: and we'd need to store the channel for a service, so we know which channel to look for updates to the charm
<ericsnow> (aside from "unpublished")
<rogpeppe> this model is totally broken
<natefinch> rick_h_: ^
<dimitern> jamespage, cheers, looking
 * rick_h_ tries to parse the traceback
<natefinch> rick_h_: rogpeppe is surprised that the same rev of a charm can have different resources in different channels
<rick_h_> natefinch: ugh
<natefinch> rogpeppe: you can publish mysql-3 with resource-1 to dev and with resource-2 to stable, in theory
<rogpeppe> natefinch: that goes against the whole notion of dev vs stable AFAICS
<natefinch> rogpeppe: well, think about it.. .you publish mysql-3 & resource-1 to stable... then you want to update the resource, so you publish mysql-3 & resource-2 to dev for CI to test....
<rick_h_> rogpeppe: dev vs stable is just a tag on a revision
<rick_h_> rogpeppe: every charm revision can have the latest blessed set of resources/charm.
<rick_h_> rogpeppe: the whole point of resources is you update them, and get a new stable set, without needing to update the charm revision
<rogpeppe> rick_h_: and if an old revision doesn't work with one of the newly published resources?
<natefinch> rogpeppe: resources are keyed off channel + charm-rev
<rick_h_> rogpeppe: older revision of the charm?
<rick_h_> rogpeppe: publishing will require the resource revisions
<rick_h_> rogpeppe: so updating resources requires a new publish with the previous charm revision + the new resource revisions
<rick_h_> rogpeppe: it's a forward rolling set, if the older revision doesn't work, it's ok because it was published with a different revisioned resource that at one time did work
<rogpeppe> rick_h_: i am still deeply concerned about these semantics
<rick_h_> rogpeppe: then let's chat about them please, because they're getting delivered, and quickly, as we get the feature out the door
<rogpeppe> rick_h_: i would be very happy to have a chat about them
<rick_h_> natefinch: are you free in 20?
<rick_h_> natefinch: will invite you into the chat I've got with the ui team then please
<natefinch> rick_h_: I can be. yes
<ericsnow> katco: ^^^
<dimitern> jamespage, can you also paste the /e/n/i on machine-0 ?
<katco> rick_h_: yeah toss me an invite as well if you don't mind
<rick_h_> katco: rgr
<katco> rick_h_: ta
<jamespage> dimitern, erm 'destroy-controller' just executed - have to free up the env for my eod
<ericsnow> rick_h_: me too :)
<rick_h_> ericsnow: :)
<dimitern> jamespage, ah, ok then
<dimitern> jamespage, well, something's odd about the created bridges there - br-eth1 does not seem to appear, and that's causing the issues
<redir> morning katco, cherylj
<cherylj> hey there redir!  Welcome aboard :)
<redir> thanks
<cherylj> redir: how are you finding your first day?
<rick_h_> redir: hey, you've not started yet, have you? I thought it was april?
<redir> first order of business is to fill out the new starter stuff
<cherylj> fun stuff
<katco> redir: hey there!
<redir> rick_h_: it was. but things went much more smoothly than expected
<rick_h_> redir: awesome to hear, welcome back to the party
<redir> thanks!
<katco> ericsnow: natefinch: perrito666: this is a new member of tanzanite, Reed Simpson, aka redir
<ericsnow> redir: welcome!
<redir> cherylj: so far, 2 minutes in it is awesome:)
<katco> :)
<redir> thanks, ericsnow:)
<katco> redir: be sure to reach out if you need any help with the new starter task
<redir> katco: will do
<katco> ericsnow: natefinch: perrito666: wow, i misremembered his last name. what a welcome :( this is Reed O'Brien
<alexisb> redir, welcome!
<redir> no worries, I always wanted to be a famous hockey player
<katco> :)
<alexisb> :)
<redir> thanks alexisb, everyone:)
<lazyPower> o/ redir
<lazyPower> welcome aboard
<redir> thanks lazyPower
<katco> redir: lazyPower is on our ecosystems team. they do a lot of charming and community outreach. he's a great person to know and generally awesome
<lazyPower> woo
<lazyPower> i don't know that i can live up to that description, but hey o/ :D
<lazyPower> have a charm question? i'll help ya whip it into shape or get you in touch with the ppl that can make it happen
 * redir gets notepad
<lazyPower> katco: is elnode finished? :D
<katco> lazyPower: it is not... last problem i ran into was emacs specific though. getting emacs to start with no stdin or something
<lazyPower> next sprint we should make that a bounty and have a hack session on it
<redir> how many teams are there?
<redir> quick what are they?
<lazyPower> tanzanite, sapphire, moonstone, onyx and uhh
<katco> lazyPower: bzzzzz! incorrect
<rick_h_> someone doesn't read the juju mailing list :P
<katco> lazyPower: redir: there are 2: tanzanite (your team) and onyx
<lazyPower> oh right
<rick_h_> :)
<lazyPower> the great restructuring of the minerals
<redir> diamonds await
<lazyPower> welp nothing to do here
 * lazyPower jetpacks away
<redir> so 2 teams: black and blue
<natefinch> welcome redir!
<redir> :)
<redir> thanks natefinch
<redir> which color is ecosystems?
<katco> redir: that's an entirely different macro-team
<katco> redir: tanzanite and onyx are sub-teams of juju-core
<redir> is there a map?
<redir> :)
<rick_h_> only in our heads :)
<natefinch> there used to be moonstone and sapphire, but they've been merged into tanzanite and onyx
<mgz> I think eco uses colourless mana
<katco> mgz: +1
<redir> I see
 * redir imagines ecosystems to be earth, wind, and fire.
<natefinch> redir: mostly fire :)
 * natefinch doesn't even know what that means.
 * redir doesn't either but they are colourless
<hatch> wb redir :D
<redir> thanks, hatch
<rick_h_> natefinch: ericsnow katco and optionally redir please join https://plus.google.com/hangouts/_/canonical.com/ui-daily?authuser=1
<rick_h_> updating authuser as required
<mup> Bug #1559844 changed: InvalidVolumme.NotFound error destroying aws environment <juju-core:Triaged> <https://launchpad.net/bugs/1559844>
<mup> Bug #1560107 opened: juju client doesn't pass version and other useful metadata to api calls <juju-core:New> <https://launchpad.net/bugs/1560107>
<mup> Bug #1542206 opened: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Triaged> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<mgz> has someone added redir to github/lp groups yet?
<alexisb> mgz, probably not yet, there will be some delay in official registration stuff for redir, given his start date got bumped up
<mgz> alexisb: I was just doing it for niedbalski and wondered if I could save effort :)
<mgz> redir: what's your github username?
<cherylj> hey redir, I'm working on getting you a room for the QA sprint next week.  Would you want to attend all 5 days?  (I personally won't fly in until Monday evening)
<redir> mgz reedobrien
<redir> cherylj: whatever you want me there for
<redir> I can do all five days or partial if you prefer
<mgz> redir: github has sent you an email, accept that, then go to github.com/orgs/juju/people and set your membership to public
<redir> I am trying to reset my lp account. I have 2FA setup on it, but it doesn't like my 2FA codes
<redir> any way to get someone to clear that?
<redir> mgz: done
<katco> ericsnow: natefinch: https://github.com/juju/charmstore-client/pull/6
<ericsnow> katco: k
<katco> ericsnow: natefinch: note that i've had to subvert the full-stack tests since the charmstore doesn't yet support resources
<mgz> redir: did you do the reset sequence?
 * katco heats up some lunch
<mup> Bug #1542206 changed: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Triaged> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<mgz> redir: if you tried that and it didn't work, er... I always bugged selene or someone, I think the right channel is...
<mgz> redir: #canonical-support on the canonical network
<redir> mgz: I tried to resync
<redir> mgz: I don't see a way to reset the sequence
<redir> mgz: not on the canonical network yet
<mgz> alexisb: bug 1542206
<mup> Bug #1542206: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Triaged> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<redir> I'll try a little more then bail until I have access there
<alexisb> mgz, looking
<mup> Bug #1542206 opened: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Triaged> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<ericsnow> katco: PR reviewed (mostly LGTM)
<natefinch> katco, ericsnow: well, I was reviewing the PR until.... No server is currently available to service your request.
<katco> natefinch: getting that too :(
<natefinch> you know it's bad when you can't even get to the status page
<katco> natefinch: status.github.com is even down
<natefinch> katco: heh
<natefinch> katco: http://i.cubeupload.com/D6bRPx.png
<redir> buh
<katco> natefinch: status seems to be up. looking at several graphs with a shelf in the last minute or so :p
<natefinch> oh yeah, so it is
 * natefinch puts away his sword.
<cherylj> natefinch: back in my day, that used to say "Compiling!"
<cherylj> ;)
<cherylj> and when I worked in AIX, it was a legitimate excuse
<natefinch> cherylj: for Juju it should say "running unit tests"
<cherylj> took over an hour to build a kernel on our POS machines
<cherylj> natefinch: ha, no kidding
<katco> ericsnow: natefinch: fyi, i copy/pasted the help text from core's help text
<natefinch> katco: hmm... I wonder if our help text is wrong, too, then
<redir> cherylj: point of sale machines? :)
<cherylj> redir: no, no IBM sold that off long ago
<cherylj> ;)
<cherylj> toshiba - those suckers
<redir> so the other PoS machines
<cherylj> redir: so, should I book your room for a checkin  Sunday, checkout Friday?
<redir> cherylj: sounds perfect
<redir> thanks
<cherylj> redir: kk, I'll submit a request to the sprint organizer
<rick_h_> redir: ok, does your login work?
<alexisb> dimitern, ping
<dimitern> alexisb, pong
<alexisb> heya dimitern happy monday :)
<alexisb> on bug 1542206, cheryl's last comment...
<mup> Bug #1542206: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Triaged> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<alexisb> it looks like juju is failing the bootstrap
<alexisb> could that potentially be a bug in the spaces work for providers that dont currently support spaces?
<dimitern> alexisb, :)
<redir> rick_h_: yes!
<cherylj> alexisb, dimitern : in all fairness, it could also be completely unrelated to spaces :)
<cherylj> and that error is a red herring
<dimitern> alexisb, I suspect that's an issue with joyent
<alexisb> dimitern, ok, can we please work with cheryl to dig a bit
<dimitern> alexisb, as there's no other issue recently related to that on any other cloud
<alexisb> it is important to get beta3 out this week
<dimitern> cherylj, alexisb, sure
<alexisb> thanks dimiter
<tvansteenburgh> hey guys, can the value for 'series' in metadata.yaml be a string, or must it be a list?
<cherylj> oh hell.  mgz I see the model migration test failed in a different spot that had passed in the previous run
<mgz> cherylj: hm, but still a 20m timeout
<cherylj> yes
<mgz> cherylj: it's the how-good-are-you-at-reading-deadlock-gopanics test!
<mgz> is it the same one, or is it a different one? :)
<natefinch> tvansteenburgh: always a list I believe
<tvansteenburgh> natefinch: ta
<mgz> cherylj: well, that sucks
<cherylj> mgz: yeah.  They're in completely separate packages
<mgz> latest one is useless, it's in the parent process
<mgz> so, no clue to what caused the actual test binary to fail
<cherylj> yeah
<mgz> (fail to exit)
<dimitern> mgz, can you get the cloud-init-output.log and machine-0.log of a joyent bootstrap instance that failed?
<dimitern> mgz, with --keep-broken I guess..
<mgz> dimitern: yeah, with the issue it's intermittent
<dimitern> mgz, hmm I'm now beginning to suspect that was caused by the MADE-state-workers changes
<cherylj> oh boy
<cherylj> mgz, dimitern I think it's related to the same change that broke restore
<cherylj> https://github.com/juju/juju/commit/104e75494e9fc97cb2a0084b4384936abcee9622
<dimitern> cherylj, uh oh that definitely looks promising as a cause
<mgz> ...yup, that seems likely
<cherylj> so, we can revert just the joyent part of that change
<cherylj> or we can figure out why sometimes the filter doesn't work in joyent
<dimitern> cherylj, it looks like one attempt passed ok though: http://reports.vapour.ws/releases/3791/job/joyent-deploy-trusty-amd64/attempt/1929
<mgz> dimitern: it is intermittent
<dimitern> an that's without a bundle - just deploy
<cherylj> dimitern: right, I suspect that there's some state where if we try to get the joyent instances that match our filter, we don't get a complete list
<mgz> so likely an ordering issue or similar in the bootstrap process
<dimitern> ah, I see
<mgz> did the machine we just created appear in our list run after or something
<mgz> sometimes yea, sometimes nay
<natefinch> rogpeppe: so... about that Channel field on charm.URL.....
<rogpeppe> natefinch: make a new type
<cherylj> mgz ... so revert the joyent part?
<ericsnow> rogpeppe: could we include the charm store macaroon on that type?  It would be optional.
<natefinch> rogpeppe: Why not use the field?
<rogpeppe> natefinch: because it's not part of the URL
<rogpeppe> natefinch: it's very wrong to use it. please don't.
<natefinch> rogpeppe: it not being part of the URL makes sense.  I guess this is a consequence of the URL no longer being a sufficient unique identifier
<rogpeppe> natefinch: yes that's right
<mgz> cherylj: I would be tempted to for beta3, but this probably needs some more serious bug fixing, admin-controller likely depends on storage being gone?
<rogpeppe> natefinch: that's part of the reasons for my rant earlier
<natefinch> rogpeppe: we were headed down that road already, I just wanted to make sure there wasn't a shortcut
<natefinch> rogpeppe: yeah, I totally understand.
<cherylj> mgz: we haven't completely removed storage - maas still uses it
<rogpeppe> natefinch: BTW I don't think that the channel is sufficient either
<rogpeppe> natefinch: you probably want to pass around the full tuple
<rogpeppe> natefinch: because a channel + a URL doesn't uniquely identify things
<ericsnow> rogpeppe: what about the charm store macaroon on that new type?
<rogpeppe> ericsnow: i guess it depends what you call the type
<ericsnow> rogpeppe: the charm ID and the macaroon are pretty tightly coupled on the core side
<ericsnow> rogpeppe: maybe it should just be a core type
<natefinch> rogpeppe: I've made a github.com/juju/juju/charmstore.Charm type that contains the URL and the channel.  Probably adding the macaroon is a good idea.  I'm less certain about the resources.. channel+URL is unique-enough to get the expected resources (whether or not store resources are overridden is stored elsewhere)
<rogpeppe> natefinch: i don't think the macaroon should be there really.
<redir> rogpeppe!
<rogpeppe> redir: yo!
<rogpeppe> redir: you made it back!
<rogpeppe> redir: welcome again :)
<rogpeppe> natefinch: what are you going to use it for?
<mgz> cherylj: good/bad news on ppc64 failure... just more timing prone issues
<natefinch> rogpeppe: authenticating with the charmstore.  It probably doesn't need to be on the type, since it's not really part of the identity of the charm
<rogpeppe> natefinch: channel+URL is not unique enough to get the expected resources over time
<rogpeppe> natefinch: sorry, my question was really: what are you going to use the Charm type for?
<redir> thanks rogpeppe
<cherylj> mgz: we could also have this "ErrNotBootstrapped" error be one of the ones we retry in the bootstrap (just like upgrade in progress)
<natefinch> rogpeppe: mainly right now for querying the charmstore for updates to the charm and/or associated resources
<rogpeppe> natefinch: and please don't call it Charm - it's not a charm but some kind of identifier for a charm
<ericsnow> rogpeppe: +1
<rogpeppe> natefinch: do all resources get pulled into the environment when you deploy a charm?
<rogpeppe> s/environment/model/
<mup> Bug #1560152 opened: Local charm fails to deploy if it's too big <juju-core:New> <https://launchpad.net/bugs/1560152>
<natefinch> rogpeppe: we grab all the metadata pre-emptively, grab bytes on-demand
<rogpeppe> natefinch: so if you do juju deploy somecharm-5, then juju deploy somecharm-5 again and the default resources have changed, what will happen?
<cherylj> perrito666: did you see bug 1557726?
<mup> Bug #1557726: Restore fails on some openstacks <backup-restore> <openstack-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1557726>
<natefinch> rogpeppe: you get different resources and when you look at juju list-resources, you'll be told the first deploy has updates available for its resources
<rogpeppe> natefinch: so the service holds the whole tuple of charm + resources?
<natefinch> rogpeppe: basically the same as if you deploy wordpress twice, and the charm gets updated in between
<natefinch> rogpeppe: correct
<mup> Bug #1560152 changed: Local charm fails to deploy if it's too big <juju-core:New> <https://launchpad.net/bugs/1560152>
<cherylj> hey mgz, I see that bug 1542206 was originally opened in Feb, which was before the change to joyent to use tags for controller instances
<mup> Bug #1542206: space discovery still in progress <ci> <intermittent-failure> <joyent-provider> <precise> <regression> <juju-core:Triaged> <juju-core maas-spaces-multi-nic-containers:Invalid> <https://launchpad.net/bugs/1542206>
<cherylj> mgz: the link to the failures in the bug seem to start on Mar 15
<rogpeppe> natefinch: so i guess you could just call it CharmURLWithChannel.
<rogpeppe> natefinch: type CharmURLWithChannel struct {charm.URL; Channel Channel}
<mgz> cherylj: I think we have some wrong symptom association
<cherylj> mgz: yeah, I was hoping that was the case
<rogpeppe> natefinch: bulky but accurate
<natefinch> rogpeppe: I'd rather call it something that is somewhat more extensible, on the thought that we may add/change data in it later (if we decide we want to give a single number to a charm + resource tuple, for example). CharmID I think would work.
<mgz> cherylj: I rejigged the rule to make it more specific
<mgz> cherylj: so, it should now match just on this issue
<cherylj> mgz: k, thanks.  Should I update the bug to be more specific?  (That the bootstrap fails with "model is not bootstrapped")
<mgz> cherylj: yeah, that's a good idea
<rogpeppe> natefinch: yeah, that would probably be ok.
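A rough sketch of the identifier type being agreed on here (the field types below are assumptions; the semantics come from the discussion above): a charm store URL alone no longer uniquely identifies what to fetch, because one revision can be published to several channels, each with its own set of resources.

    package charmstore

    import charm "gopkg.in/juju/charm.v6-unstable"

    // CharmID identifies a charm in the store well enough to fetch its
    // resources: the resolved URL plus the channel it was deployed from.
    type CharmID struct {
        // URL is the fully resolved charm store charm URL.
        URL *charm.URL

        // Channel is the store channel, e.g. "stable" or "development";
        // it selects which published resources apply to this revision.
        Channel string
    }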
<perrito666> cherylj: hi, just coming in today, was a bit ill. I see it, I seem to have introduced that with k3 support
<cherylj> perrito666: :(  hope you're feeling better
<cherylj> sorry to make things worse with that bug :)
<perrito666> cherylj: sort of, lots of white rice
<perrito666> simple fix, actually
<perrito666> openstack should actually be returning 300 and it's returning 200
<perrito666> cherylj: how blocking is this? (do you need me to fix it now or can I fix it tomorrow?)
<mgz> perrito666: we are probably talking about two bugs
<cherylj> perrito666: it is not blocking
<mgz> one is rackspace/keystone 3 the other is joyent/storage
<perrito666> mgz: the response pasted in the bug is about openstack returning an unexpected http code
<cherylj> perrito666: yeah, I'm talking about the keystone 3 bug
<mgz> also *hugs*
<perrito666> mgz: dont hug me or ill have to run to the bathroom
<mup> Bug #1560152 opened: Local charm fails to deploy if it's too big <juju-core:New> <https://launchpad.net/bugs/1560152>
<perrito666> mgz: the links on the bug are 404ing - well, the first one; i haven't checked the second
<katco> fwereade_: hey, is it true you prefer exported test suites?
<mgz> perrito666: from the log?
<perrito666> http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/functional-backup-restore-jujuqa-stack/
<perrito666> mgz: that one
<mgz> perrito666: are you logged in with developer credentials from lp:cloud-city?
<mgz> ...not lp: but lp:etcetc/
<katco> marcoceppi: hey, our final patch to the charm cmd is under review: https://github.com/juju/charmstore-client/pull/6
<katco> marcoceppi: it will be landed soon
<perrito666> mgz: meh, I wasn't logged in
<perrito666> mgz: I would have expected more smarts from jenkins
<mgz> it's Security
<natefinch> katco: we should get rogpeppe to review that, though it's past EOD for him
<rogpeppe> natefinch: i'm working late today
<rogpeppe> natefinch: trying to get the deploy-with-channels stuff landed for you
<mgz> everything has to be a huh-dunno not a no-way or you leak the super-sekret test names
<natefinch> rogpeppe: I figured that would be a big change.  I have some code that may help / complicate it for you
<marcoceppi> katco: thanks, we'll need this and the charm-help pull req to complete this (and the store updated)
<rogpeppe> natefinch: it's only big because i couldn't resist cleaning things up a bit
<perrito666> mgz: ok, give me some guidance here :p
<katco> marcoceppi: what is required in the store?
<perrito666> I see a build history whose newest build is feb 27
<marcoceppi> katco: the latest charmstore deployed, it failed friday
<katco> marcoceppi: ahh
<rogpeppe> natefinch: i might have got the wrong end of the stick and not done nearly as much as i need to though
<katco> marcoceppi: and what do you mean by a PR against charm-help?
<mgz> perrito666: you don't have the creds to hand? lp:~juju-qa/+junk/cloud-city
<marcoceppi> katco: https://github.com/juju/charmstore-client/pull/2
<natefinch> rogpeppe: here's what I have, which has the CharmID type: https://github.com/juju/juju/pull/4790/files
<katco> marcoceppi: ah ok. i thought you were asking me to do other things :)
<natefinch> rogpeppe: I only did the part that was needed for juju charm list-resources, so there's a lot of other touches that would need to be made for general charm/channel usage, like deploy
<natefinch> cherylj, katco: so if resources is merged, does that mean any new changes should go against master, or should we still be doing PRs versus the feature branch?
<katco> natefinch: cherylj: good question. i think since they're bug fixes they can go against master?
<rogpeppe> natefinch: ha, you've duplicated quite a bit of what i've already done today...
<natefinch> rogpeppe: I was afraid of that
<natefinch> rogpeppe: sorry I didn't ping you earlier today... I was still thinking of my change as being quite focused, though of course it spiraled out from the original intent of just supporting channels in one call.
 * redir goes to grab lunch. Taking some of the getting started reading ianb sent to me.
<redir> bbiab
<rogpeppe> natefinch: this what i'm on currently (just the dependency update) https://github.com/juju/juju/pull/4807/files
<rogpeppe> natefinch: just running the tests (all passing so far, but probably 3 hours to go until they're finished...)
<rogpeppe> natefinch: i've added some ericsnow TODOs :)
<ericsnow> rogpeppe: :)
<natefinch> rogpeppe: 3 hours until the work on the tests is finished, or 3 hours until the tests finish running?
<rogpeppe> natefinch: the latter
<rogpeppe> natefinch: probably a slight exaggeration
<rogpeppe> natefinch: but my machine does seem to be slow today
<natefinch> rogpeppe: ok, good. I was gonna say, maybe compiling on your smart phone is not the best idea
<natefinch> s/compiling/running tests
<rogpeppe> natefinch: i should get a sodding great water cooled monster to run the tests on
<rogpeppe> natefinch: ... alternatively maybe i should rig up a charm for parallelising the test running on ec2
<natefinch> rogpeppe: my 2.5 year old core i7 does ok... like 15-20 minutes
<natefinch> rogpeppe: I do have a really fast (for 2.5 years ago) SSD which I think helps
<rogpeppe> natefinch: mine's probably not that much longer than that tbh. it just feels like 3h.
<natefinch> rogpeppe: certainly
<cherylj> natefinch, katco, yes if you have new changes / bug fixes for resources, please make them against master
<natefinch> rogpeppe: have you tried Russ' gt?
<rogpeppe> natefinch: i have a lovely 1TB SSD sitting on the shelf waiting for the day that I can waste on upgrading to it
<rogpeppe> natefinch: it didn't work so well for me
<natefinch> rogpeppe: it worked at times for me, but was definitely flaky at times, too
<katco> cherylj: cool, ty. ericsnow: natefinch: bug fixes go against master now
<katco> cherylj: should we delete the feature branch?
<ericsnow> katco: k
<rogpeppe> natefinch: i'd very much appreciate a review of the above PR BTW - what was originally reviewed was very different to what it is now
<natefinch> rogpeppe: absolutely
<rogpeppe> natefinch: thanks
<rogpeppe> i'm sure the ec2 tests didn't use to take 2 1/2 minutes to run
<rogpeppe> i wonder what's going on there
<rogpeppe> ooh, we're on w
<rogpeppe> natefinch: yay! tests passed.
<rogpeppe> natefinch: did you wanna take a look before I push it?
<natefinch> rogpeppe: I'm looking
<redir> back
<cherylj> thumper: http://reviews.vapour.ws/r/4267/
<thumper> cherylj: not bad, +8,634 −146
<thumper> smaller than I thought
<cherylj> heh
<natefinch> rogpeppe: lgtm... very nice to have all that bakery nonsense wrapped up in a reusable place
<katco> cherylj: hey, do you know if we should delete our feature branch?
<cherylj> katco: not required, but you can if you'd like
<katco> ericsnow: natefinch: any objections?
<cherylj> katco: I imagine we're going to do a bit of pruning after we ship 2.0.0
<natefinch> katco: seems like a good idea so no one twiddles against it by accident
<ericsnow> katco: sounds good
<thumper> +1 to deleting the merged feature branch
<katco> natefinch: also going to delete nate-minver
 * cherylj is secretly paranoid that the merge didn't actually happen properly
<katco> cherylj: looks like you can restore branches. plus i guarantee you we have copies floating around on people's local machines ;)
<natefinch> katco: glad to have it done with. :)
<cherylj> haha, ok thanks :)
<katco> thumper: is there anything we can do to help out onyx?
<thumper> yes
<thumper> katco: do you have some time now?
<katco> thumper: 15m
<thumper> you can't do much in 15 minutes :)
<thumper> I mean, does your team have spare capacity
<katco> thumper: ericsnow has more time :) and i'll have time after
<cherylj> har har
<natefinch> thumper: lol, spare capacity
<katco> thumper: er strike that. i won't have time after. ericsnow does though
<thumper> katco: I think we need to think carefully about what we have committed to over the whole team
<thumper> we have already said that model migration won't be fully functional
<rogpeppe> natefinch: thanks
<thumper> is there something else that we have fully committed to that needs work?
 * katco scans feature list
<natefinch> thumper: all the charmstore part of resources :/  I mean, we plan to do it, but it's not at all done yet.
<thumper> katco: probably best to sync up with alexisb and wallyworld
<katco> thumper: planned on doing so in this afternoon's release standup
<thumper> natefinch: that would be a first focus then I guess
<thumper> kk
<katco> thumper: i was just referring to today's deadline
<thumper> oh, just for today?
<thumper> no
<thumper> we're good
<natefinch> sorry, I should butt out, katco has it all under control.
<katco> thumper: yes if you were trying to push for something
 * natefinch goes back to teh coding
<thumper> we are... but nothing that more hands will help in the short timeframe
<katco> thumper: ok, just thought we'd check :)
<thumper> cheers
<rogpeppe> aargh, i'd forgotten you can't reply to reviewboard comments directly in the summary section
<katco> rogpeppe: you can i think
<rogpeppe> katco: i always do "Add comment" (thinking it's gonna reply) and it opens a new issue instead.
<katco> rogpeppe: yeah i think you have to click on the comment in the gutter, then there's like an "add reply" or something
<rogpeppe> katco: i'll look harder next time
<rogpeppe> ericsnow: i've made some replies to your review comments but they're all in the same place.
<ericsnow> rogpeppe: k
<rogpeppe> s/same place/wrong place/
<rogpeppe> katco: i don't think you can click the comment in the gutter from the overview mode, unless i'm missing something
<rogpeppe> ericsnow: are you ok with me landing it?
<katco> rogpeppe: maybe i'm misinterpreting what overview mode is
<rogpeppe> katco: the page you get by following the original review link
<ericsnow> rogpeppe: yep
<rogpeppe> katco: not the page with all the complete diffs on
<rogpeppe> ericsnow: ta
<natefinch> katco: I'm going to be a bit late on proposing my PR.  I spent a bunch of time doing code reviews and meetings etc.  But it should be done tonight.
<mup> Bug #1560191 opened: kill-controller is hinky without a model-controller behind it <juju-core:New> <https://launchpad.net/bugs/1560191>
<mup> Bug #1560192 opened: Restoring admin: model configuration has no admin-secret <backup-restore> <ci> <juju-core:Incomplete> <juju-core admin-controller-model:Triaged> <https://launchpad.net/bugs/1560192>
<mup> Bug #1560201 opened: The Client.WatchAll API command never responds when the model has no machines <juju-core:New> <https://launchpad.net/bugs/1560201>
<mup> Bug #1560203 opened: stringForwarderSuite.TestRace sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1560203>
<rogpeppe> ericsnow: while i'm looking at the code, can I ask you what the "Spec" in CharmstoreSpec stands for?
<ericsnow> rogpeppe: specification
<ericsnow> rogpeppe: really for the client though :/
<rogpeppe> ericsnow: ok. that's... not what i expected.
<rogpeppe> ericsnow: i thought it might have been Specialization
<thumper> davecheney: wondering about your take on this bug: https://github.com/lxc/lxd/issues/1781
<thumper> davecheney: due to a pathological juju bug, juju created a very long path
<mup> Bug #1560201 changed: The Client.WatchAll API command never responds when the model has no machines <juju-core:New> <https://launchpad.net/bugs/1560201>
<mup> Bug #1560203 changed: stringForwarderSuite.TestRace sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1560203>
<cherylj> thumper, menn0 model-migration is merged!
<thumper> \o/
 * thumper does a little dance
 * fwereade_ cheers
<fwereade_> thumper, I'm in the hangout
<alexisb> :)
<thumper> ƪ(˘⌣˘)┐ ƪ(˘⌣˘)ʃ └(˘⌣˘)ʃ
<cherylj> hey perrito666, looking back on bug 1536792
<mup> Bug #1536792: Some providers release wrong resources when destroying hosted models <juju-core:In Progress> <https://launchpad.net/bugs/1536792>
<cherylj> perrito666: why was joyent not fixed yet?
<cherylj> perrito666: was it waiting for the change to use tags instead of storage?
<natefinch> gah, people need to stop calling things resources when they're not *Resources* ... messes me up every time
<natefinch> man I hate external tests
<menn0> cherylj, thumper : woot!
<perrito666> cherylj: sorry I am back, I am not sure I know what you are talking about
<perrito666> cherylj: oh, joyent was going to be done by an external
<cherylj> hmm
<perrito666> natefinch: you need to stop picking names extremely common in computers and repurposing them perhaps?
<natefinch> perrito666: well, *someone* needs to stop picking those names... :)
<redir> natefinch: perrito666 they need to be more resourceful
<ericsnow> redir: I see what you did there <wink>
<redir> ;)
<davecheney> umm, ++ /var/lib/jenkins/juju-ci-tools/ec2-run-instance-get-id
<davecheney> euca-run-instances: error (InstanceLimitExceeded): Your quota allows for 0 more running instance(s). You requested at least 1
<davecheney> CI just told me to go rewrite it in erlang
<mup> Bug #1559310 changed: bootstrap fails: "model is not bootstrapped" <bootstrap> <ci> <juju-core:Incomplete> <juju-core admin-controller-model:Triaged> <https://launchpad.net/bugs/1559310>
<rogpeppe> ericsnow: ping
<ericsnow> rogpeppe: hey
<rogpeppe> ericsnow: so, when deploying with a channel, i can deploy the charm ok, but presumably we need some way to say what the channel was in the deploy API request, right? have you done that bit?
<ericsnow> rogpeppe: we've done nothing like that
<rogpeppe> ericsnow: ah, so your resources stuff is assuming that the channel is in the state, but so far there's nothing to put it there?
<ericsnow> rogpeppe: I believe that natefinch is still working on the bit that passes the channel for the resource-related API call
<rogpeppe> natefinch: ok
<rogpeppe> ericsnow: ok
<rogpeppe> natefinch: i'd be interested to hear your plans here
<ericsnow> rogpeppe: I expect he'll be back later
<rogpeppe> ericsnow: do you know what the status of making changes to the API currently is?
<rogpeppe> ericsnow: am I allowed to add new API parameters to existing calls without creating a new version?
<ericsnow> rogpeppe: not positive, but I believe so relative to the 2.0 release
 * ericsnow checks docs
<ericsnow> rogpeppe: best I've found: https://github.com/juju/juju/blob/master/doc/design/juju-api-implementation-guide.md#versioning
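The change being weighed amounts to adding an optional field to an existing request struct, which is the kind of addition the versioning guide linked above treats as backwards-compatible; a hypothetical sketch, with struct and field names invented for illustration:

    package params

    // DeployArgs is a hypothetical API request type. Adding an optional,
    // omitempty field like Channel is backwards-compatible: old clients
    // simply never set it, and the server treats the empty value as the
    // previous default (the stable channel).
    type DeployArgs struct {
        CharmURL string `json:"charm-url"`
        Channel  string `json:"channel,omitempty"`
    }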
<katco> cherylj: available for a call?
<urulama> rogpeppe: need a channel as a parameter?
<katco> rogpeppe: ah i see the scrollback now. natefinch is assuming the channel will be there and is assuming published for now. no plans to do anything more than consume what's stored
<mup> Bug #1560237 opened: worker/peergrouper: intermittent data race <juju-core:New> <https://launchpad.net/bugs/1560237>
<axw> wallyworld: my desktop has decided now would be a good time to start cutting out again
<axw> so may be late to or drop out of standup
<anastasiamac> axw: ack
<rogpeppe> ericsnow, natefinch, katco: here's the PR (sketchy tests I'm afraid but I think worth it anyway) https://github.com/juju/juju/pull/4833
<rogpeppe> axw: fancy a review? ^
<axw> rogpeppe: in standup, a bit later sorry
<rogpeppe> axw: np
<rogpeppe> please, can someone review this branch? it's almost midnight here and I really need to get this landed. https://github.com/juju/juju/pull/4833
<rogpeppe> ericsnow, natefinch, katco: ^
<rogpeppe> wallyworld: ^
<ericsnow> rogpeppe: looking
<rogpeppe> ericsnow: thanks
<rogpeppe> ericsnow: please don't wait until the end of the review if you have some comments, so i can get on with fixing things
<ericsnow> rogpeppe: nothing so far
<rogpeppe> ericsnow: cool
<ericsnow> rogpeppe: just getting my bearings relative to some of the refactors you did (nice ones :)
<rogpeppe> ericsnow: in 1m20s i'm gonna turn into a pumpkin
<ericsnow> rogpeppe: LGTM
<ericsnow> ha, what timing
<rogpeppe> ericsnow: phew!
<ericsnow> rogpeppe: thanks for the hard work on this
<rogpeppe> ericsnow: saved from pumpkinhood
<rogpeppe> ericsnow: np
<ericsnow> rogpeppe: there are still a few things we have to sort out
#juju-dev 2016-03-22
<rogpeppe> ericsnow: i've pushed $$merge$$ - perhaps you could take it over if there are some silly errors from the bot?
<rogpeppe> ericsnow: indeed
<ericsnow> rogpeppe: sure
<rogpeppe> ericsnow: but this should actually work reasonably for most use cases
<rogpeppe> ericsnow: thanks
<urulama> ty, ericsnow
 * rogpeppe beds headward
<ericsnow> rogpeppe: np
<rogpeppe> g'night all
<mup> Bug #1560262 opened: relation visibility rules different between service/service and service/subordinate relations <canonical-is> <juju-core:New> <PostgreSQL Charm:Triaged> <postgresql (Juju Charms Collection):Triaged> <https://launchpad.net/bugs/1560262>
<wallyworld> axw: with the restore issue - reading the code leads me to believe the issue is because we no longer store admin-secret in bootstrap config. we could add it back but wouldn't it be better to store it in the backup metadata?
<axw> wallyworld: admin-secret is the password for admin@local, should just grab it out of accounts.yaml instead
<wallyworld> ah, yes, good point
<wallyworld> that does tie the restore to the user who did the backup
<axw> wallyworld: (it would be better to pull something out of backup metadata tho)
<wallyworld> agreed
<wallyworld> i may fix the easy way for now
<axw> wallyworld: +1
<wallyworld> i'd rather the backup be stand alone
<axw> wallyworld: needs an overhaul, so yeah
<wallyworld> axw: joy, we also need ca-cert which is fine, but also ca-private-key which we don't currently store with the controller metadata
<axw> wallyworld: what do we need that for?
<axw> I thought we would generate new certs
<wallyworld> i'll have to look deeper - the current code attempts to create an env config and it complains
<davecheney> thumper: cherylj https://github.com/juju/juju/pull/4836
<wallyworld> axw: atm we don't prepare or anything - the code just assumes we had a complete config available via bootstrap config, so it needs to be reworked
<anastasiamac> wallyworld: axw: last change for openstack virt type http://reviews.vapour.ws/r/4276/
<axw> anastasiamac: looking
<anastasiamac> axw: tyvm \o/
<axw> wallyworld: yeah, it does assume we have a bootstrap config already. it always did, so the only potential issue is to do with info we may have thrown away, like ca-private-key
<axw> wallyworld: only immediate problem I mean
<wallyworld> axw: yeah, not having the same ca-private-key seems problematic i think?
<wallyworld> maybe not
<axw> wallyworld: sorry, I don't know what's newly generated, and what needs to be the same
<wallyworld> i guess if we are rebootstrapping, there's no existing machines out there
<wallyworld> so no need for existing agents to be able to reconnect
<wallyworld> which i think would be the main issue
<axw> wallyworld: part of the restore process involves SSHing to the agents and fixing them up, IIANM
<axw> wallyworld: the controller's going to have a new IP, after all
<wallyworld> right but doesn't the restore assume everything is gone if we use the -b option
<wallyworld> maybe not
<wallyworld> i'll have to read the code
<axw> anastasiamac: just a couple of comment changes, otherwise LGTM
<anastasiamac> axw: awesome \o/ thnx :D
<axw> anastasiamac: BTW, "code is kinda weird" is not your fault, just how it is / has to be due to openstack limitation
<axw> wallyworld: you said there was another bug about permission denied? I don't see it on LP against admin-controller-model. did you say you still had to add it?
<wallyworld> axw: it's just not targeted to that branch, let me get it
<wallyworld> https://launchpad.net/bugs/1461561
<mup> Bug #1461561: juju run fails with "Permission denied (publickey)" on manual provider <intermittent-failure> <manual-provider> <run> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1461561>
<axw> ta
<mup> Bug #1441302 changed: Vivid unit tests are not reliable enough <test-failure> <vivid> <juju-core:Won't Fix> <https://launchpad.net/bugs/1441302>
<axw> wallyworld: I suspect it's just a race with the worker to update authorized_keys
<wallyworld> sounds likely yeah
<mup> Bug #1496032 changed: backups restore won't create bootstrap on GCE <backup-restore> <docteam> <gce-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1496032>
<wallyworld> axw: reviewed
<axw> wallyworld: thanks
<axw> wallyworld: please see reply about type name
<wallyworld> ok
<wallyworld> axw: no reply yet, did you publish?
<axw> wallyworld: derp. published now
<wallyworld> axw: GenericResource I think is ok, but I'm not strongly opinionated either way
<axw> wallyworld: in that case I'll leave it. if it becomes a pattern, we should strongly consider changing the framework to allow injecting things directly
<wallyworld> sgtm
<axw> wallyworld: do you want me to wait before merging to admin-controller-model?
<axw> wallyworld: wait for CI to be a bit happier?
<wallyworld> may as well merge now, what could possibly go wrong :-)
<wallyworld> we should have a run in say 5 hours
<wallyworld> still time to fix stuff before tomorrow and better to find issues early at this stage
<wallyworld> we could merge the other 2 fixes first i guess
<cherylj> gah, I see the problem with the joyent provider
<cherylj> they're ignoring the tags passed into startinstance
<anastasiamac> cherylj: :D
<davecheney>   thumper mwhudson https://github.com/golang/go/issues/14904
<davecheney> i'm detouring into this bug this arvo
<davecheney> cmd/juju/status trips it up 100% of the time
<wallyworld> axw: this backup thing is messy - apart from old bootstrap config, the only place we store the admin-secret aka oldpassword is in the agent.conf file, and the same is true for the ca-private-key. So assuming we have that file, which the backup does include in the archive, we can parse it on restore, but i think it would be better to do that stuff server side and include it in the json metadata for the backup. but that sucks because
<wallyworld> nothing so far in apiserver needs to know about agentconfig and i'd rather keep it that way. so it's all a bit yuck
<wallyworld> axw: maybe we should just include the private key and admin secret in bootstrap config
<axw> wallyworld: it would be nice not to, but I guess we could as a stop gap at least
<wallyworld> axw: yeah, i don't really want to. so what would we prefer - parsing the conf files on restore? or extracting the info server side on backup
<axw> wallyworld: we definitely can't generate new ones at restore time?
<wallyworld> i'm not sure. the private key i think needs to be paired with the cacert
<axw> wallyworld: yeah, we would need to update that too.
<thumper> davecheney: interesting...
<wallyworld> axw: and a new admin secret would not be the end of the world i guess. but we do have all the info in the backup archive
<wallyworld> so we'd really need to look to use that info if available
<axw> wallyworld: for a quick solution, I'd go with parsing agent.conf on restore. then we can look at improving.
<axw> wallyworld: we also need to think about how we're going to persist user auth details to log back into the controller after restore, in lieu of an accounts.yaml that's valid with the creds at time of backup
<axw> wallyworld: options are to store in the backup metadata, or to require the user to login. but these are things we could look into at the same time as the rest
<wallyworld> ok, was thinking the same thing, also we only need this if we are rebootstrapping
<axw> wallyworld: I can't make heads or tails of the SSH thing. the system key is added to authorized-keys at bootstrap, and I've confirmed they're in the hosted model config; and that is present in cloud-config when machines are started
<axw> so there's no race AFAICT
<wallyworld> fark
 * axw disappears to make lunch
<wallyworld> axw: i think adding admin-secret to bootstrap config is ok. it's used for the gui login; having it accessible like that makes sense to me
<axw> wallyworld: I don't understand why we need it there. it's already in accounts.yaml.
<axw> wallyworld: admin-secret == password for admin@local
<wallyworld> ah true
<wallyworld> there's no nice way to get it server side that avoids yuck, so i'll read from there for now
<wallyworld> parsing the agent conf files client side is also no go
<wallyworld> so i'll get that info server side and include it with the backup metadata
<axw> wallyworld: why is it no go?
<wallyworld> all the code to open the archive and extract the files is all server side. there's a lot of it. and there's no guarantee what you'll get in terms of file names - it depends on machine id.
<wallyworld> i'd have to refactor a lot of code
<axw> wallyworld: ok
<wallyworld> restore works by uploading the archive to state server and server side code does the work
<wallyworld> so none of that code is client side
<axw> wallyworld: ah, I think I figured out when the race could happen with authorized-keys. if you specify it explicitly in --config
<axw> wallyworld: that'll go into hosted model config, whereas admin model's config will have the system's public key added
 * axw works on a patch
<wallyworld> ah i see
<wallyworld> bbiab, school pickup
<anastasiamac> axw: wallyworld: tag fixes for joyent http://reviews.vapour.ws/r/4282/
<axw> anastasiamac: is there some significance to the "tag." prefix?
<axw> anastasiamac: do we need to add that to the prepopulated tags?
<anastasiamac> axw: I am fixing the broken code. whatever was there, or whatever it was copied from, i've kept
<anastasiamac> axw: but i would have to say that probably not..
<axw> anastasiamac: I guess if you tested and it worked, there's no significance. alrighty
<axw> anastasiamac: LGTM, thanks
<anastasiamac> axw: thnx. I'll re-google tag names in joyent but I'm pretty sure it's an arbitrary prefix :D
<mup> Bug #1560331 opened: juju-br0 fails to be up when no gateway is set on interface <juju-core:New> <https://launchpad.net/bugs/1560331>
<jam> axw: are you around for a bit? I'd like to figure out what we need for storage and LXD containers
<jam> especially for normal deployments (maas/ec2) but also for LXD provider.
<axw> jam: sorry, didn't see notification - not on usual machine (desktop's on the fritz)
<axw> jam: still around for a while
<jam> axw: so the LXC code has something about passing through the loop device
<jam> and I need to know what we actually use, and how we need to do that with LXD
<jam> AIUI if we want a device, then we talk directly to LXD and ask it to pass a device through (might need to be in the lxd profile, I need to investigate that side)
<jam> axw: but I'd like to know what we're using and how I can measure that I've done it correctly.
<axw> jam: the existing LXC storage code wasn't great. it was simply allowing access to all loop devices to all containers
<jam> axw: who was creating the loop device to then pass in?
<axw> jam: the container does it internally. they have access to the /dev/loopX files
<axw> jam: (it would be much better if the host passed it in)
<jam> axw: but who on the outside would create the file to then mount?
<jam> you'd ask the container broker ?
<axw> jam: there's several places where we can create storage: dynamically on the machine (e.g. loop); dynamically on the controller (e.g. ebs); at machine provisioning time (e.g. maas)
<axw> jam: the last one might fit with creating in the container broker
<axw> jam: but it would be nice if there's some way we could do it dynamically through the LXD API
<jam> axw: LXD would let us pass the device through (supposedly), but it wouldn't do the mkfs, etc. stuff
<jam> I don't believe
<axw> jam: all we need is a device. juju will create filesystems as necessary
<axw> jam: does LXD have an API for allocating block devices? I don't see it in the REST API doc, but that may just be outdated
<axw> wallyworld: https://github.com/juju/juju/pull/4844
<wallyworld> looking
<wallyworld> axw: what if the user has explicitly set their auth keys for the hosted model
<axw> wallyworld: how?
<wallyworld> ah, i was thinking we had a capability for host model config on cli
<axw> wallyworld: you can only specify the config for both admin and hosted model
<wallyworld> one day we should add that separate host model config perhaps
<wallyworld> axw: lgtm
<wallyworld> axw: off to soccer, restore tests are a bitch, need to refactor a bit, will finish when i get back and hopefully land tonight
<axw> wallyworld: okey dokey
<wallyworld> axw: i'm so not happy with the solution, but there's little other option right now
<wallyworld> that we can do quickly
<axw> jam: so thinking about it a bit more, if we were to pass disks through to the containers, I think we'd need to do that in two places: an environment-level storage provider for the lxd provider; and a machine-level storage provider for lxd-type containers
<axw> jam: for the latter, we'd need to update machine-level storage to manage disks for containers as well as disks for the machine itself
<axw> jam: AFAICT there's no help from LXD for dynamically allocating volumes from ZFS or btrfs for use in containers (apart from the root volume), which would be ideal
<axw> at worst we could do what we were doing with LXC: open up loop devices to all containers and let them have at them. not ideal tho
<axw> jam: I'd prefer if we could allocate any volumes the host supports to containers, though
<TheMue> aaargh, a shell script building a shell script but then doing the escaping wrong. wonderful, cost me some time.
<TheMue> morning btw
<mup> Bug #1560391 opened: apt-mirror is not used in containers with MAAS provider <juju-core:New> <https://launchpad.net/bugs/1560391>
<fwereade> don't suppose anyone who did backup/restore work is online?
<fwereade> because apparently I have perpetrated lp:1559712 and I need a bit of context about why this is an error
<fwereade> perrito666, ericsnow perhaps? ^
<perrito666> Well i am having breakfast and chatting from a phone
<perrito666> But shoot
<perrito666> Fwereade^
<perrito666> In short, prepare-restore puts juju in a pseudo read-only mode for that, iirc; it writes info in mongo, so I presume that is what is failing, hence the transaction aborted
<voidspace> dooferlad: standup?
<perrito666> Fwereade gtg need to take my wife to work + not feeling so well
<mup> Bug #1560428 opened: cmd/juju/common depends on gopkg/check.v1 <juju-core:New> <https://launchpad.net/bugs/1560428>
<mup> Bug #1560428 changed: cmd/juju/common depends on gopkg/check.v1 <juju-core:New> <https://launchpad.net/bugs/1560428>
<fwereade> perrito666, np, let me know if/when you're available
<fwereade> again :)
<dimitern> fwereade, davecheney, FYI - updated the pprof wiki page with instructions on how to access it from a browser
<dimitern> awesome tool btw
<dimitern> https://github.com/juju/juju/wiki/pprof-facility
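For anyone who hasn't used it, the stock standard-library wiring for browser-accessible pprof looks like this; juju's own hookup differs, so see the wiki page above for the real instructions:

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers the /debug/pprof/* handlers
    )

    func main() {
        // Then browse to http://localhost:6060/debug/pprof/.
        log.Fatal(http.ListenAndServe("localhost:6060", nil))
    }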
<mup> Bug #1560457 opened: help text for juju bootstrap needs improving <juju-core:New> <https://launchpad.net/bugs/1560457>
<wallyworld> axw: not sure if you have a moment to look at http://reviews.vapour.ws/r/4287/
<wallyworld> axw: you may need to hold your nose, but we need to re-write backup/restore somewhat for multi-model
<rick_h_> wallyworld: hold on that though. I want to talk through that and what's in 2.0 vs 2.1 and how we go about that please
<wallyworld> rick_h_: oh, no fear, we have no much for anything right now :-)
<wallyworld> s/much/time
<rick_h_> wallyworld: k
<wallyworld> rick_h_: the cloud credentials stuff now removes the need to store an entire bootstrap config, so what we need to do to get 2.0 out the door leaves a few things we'll need to clean up for 2.1
<rick_h_> wallyworld: rgr
<marcoceppi> katco natefinch how does deploying a charm with resources local work?
<marcoceppi> like, what's the command line work like?
<marcoceppi> s/work/look/
<natefinch> marcoceppi: yep
<natefinch> marcoceppi: juju help deploy ;)
<natefinch> marcoceppi: juju deploy foo --resource bar=/some/file.tgz --resource baz=./docs/cfg.xml
<natefinch> marcoceppi: you just specify the resources you want to have uploaded when you deploy the charm.   You can do that for store resources too
<marcoceppi> natefinch: I DON'T HAVE TIME FOR YOUR REASONABLE SUGGESTIONS
<natefinch> marcoceppi: er, store charms that is
<natefinch> marcoceppi: lol
<marcoceppi> natefinch: yeah, but I'm waiting for charmstore update, etc
<natefinch> marcoceppi: right, just wanted to clarify that it's not just for local charms
<marcoceppi> natefinch: yeah, looking forward to that, but I've got a charm now with resources
<marcoceppi> natefinch: I know there's a filename param, etc
<marcoceppi> natefinch: and I remember that it's mostly ignored
<natefinch> marcoceppi: the extension of the file you upload needs to match the extension of the filename in the charm metadata, it'll complain if you don't have them match
<marcoceppi> natefinch: okay, one file is literally `python-jujusvg` with no ext
<marcoceppi> what happens then?
<marcoceppi> as in, it's just a binary
<natefinch> marcoceppi: then we enforce that what you upload doesn't have an extension
<marcoceppi> cool
<marcoceppi> <3
<natefinch> marcoceppi: yeah, the filename in the metadata is really just used for the extension for uploads, and it's what we store the file data as on the units... though since we return the full path from resource-get, it shouldn't matter too much to the charm itself
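A minimal sketch of the extension rule natefinch describes, using a hypothetical checkExtension helper; the real validation lives in juju's resource code and may differ in detail:

    package resource

    import (
        "fmt"
        "path/filepath"
    )

    // checkExtension verifies that an uploaded file's extension matches the
    // extension of the filename declared in the charm metadata; a metadata
    // filename with no extension requires an upload with no extension.
    func checkExtension(metaFilename, uploadPath string) error {
        want := filepath.Ext(metaFilename)
        if got := filepath.Ext(uploadPath); got != want {
            return fmt.Errorf("upload has extension %q, charm metadata expects %q", got, want)
        }
        return nil
    }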
<marcoceppi> natefinch: perfect, I remember that being shaken out in Cape Town; happy to see it implemented as such
<natefinch> marcoceppi: yeah, I'm mostly really happy with how the feature turned out.  I think it'll be super useful.
<marcoceppi> natefinch: a few days ago the instance containing svg.juju.solutions died, so I'm charming that up with resources. I'll let you know how it goes
<natefinch> marcoceppi: cool cool.  Please let us know any pain points or other sharp edges
<mup> Bug #1560487 opened: local provider fails to create lxc container from template <juju-core:New> <https://launchpad.net/bugs/1560487>
<rogpeppe> does anyone know the difference between jujuclient.Controller.Servers and jujuclient.Controller.APIEndpoints ?
<rogpeppe> it's a new type, so *presumably* they both have a role to play, but in my limited experimentation they both hold exactly the same thing.
<rogpeppe> fwereade, dimitern, natefinch, axw, wallyworld: ^
<natefinch> rogpeppe: no idea, I haven't used that package
<rogpeppe> natefinch: np
<wallyworld> rogpeppe: they mean the same as when used with the jenv stuff - servers are the host names, api endpoints the ip addresses
<rogpeppe> wallyworld: what's the difference? both seem to contain host names and port numbers
<rogpeppe> wallyworld: ah, you mean one has resolved IP addresses + port and the other has the equivalent hostnames?
<wallyworld> rogpeppe: as i understand it, if the hostnames are known they are used
<wallyworld> yes
<rogpeppe> wallyworld: that is *really* confusing!
<rogpeppe> wallyworld: just the naming really
<wallyworld> rogpeppe: that stuff goes back a looong time
<natefinch> we're really bad at naming
<rogpeppe> wallyworld: it's a new type!
<wallyworld> which we ported
<wallyworld> copied even
<rogpeppe> wallyworld: hrmph
<wallyworld> we didn't want to change the semantics
<wallyworld> as people would have grokked the meaning by now
<rogpeppe> wallyworld: the semantics perhaps didn't need to change, but at least it could be documented that a) they both hold host:port pairs and b) one is resolved and the other isn't
<natefinch> wallyworld: ...or not :)
<wallyworld> rogpeppe: it was probably around your time on juju when this was first done :-)
<rogpeppe> wallyworld: i still can't work out which is resolved and which isn't
<wallyworld> it's as documented now as it ever was
<natefinch> wallyworld: you're just digging your hole deeper
<wallyworld> why?
<wallyworld> we can't change the world in a day
<natefinch> wallyworld: if it's been the same for forever and it's been this well documented for forever, that just means it's bad and should be fixed.
<rogpeppe> wallyworld: just about everything else has changed about that stuff
<wallyworld> yes it should, let me get 25 hours in a day and i'll do it
<rogpeppe> wallyworld: so i'd've thought that aspect could probably change too
<wallyworld> not everything has changed, that bit hasn't :-)
<rogpeppe> how is anyone meant to guess the intended relationship between these two things?
<rogpeppe> 	// Servers contains the addresses of hosts that form the Juju controller cluster.
<rogpeppe> 	// APIEndpoints is the collection of API endpoints running in this controller.
<rogpeppe> they sound very different from one another.
<wallyworld> guess so, that doc was as we found it
<wallyworld> a lot of this stuff has been ported, not reimplemented
<wallyworld> you don't need to understand everything to the nth degree if you are porting it
<rogpeppe> wallyworld: it's actually got worse since the original, which was this: http://paste.ubuntu.com/15472129/
<rogpeppe> wallyworld: that at least explained the situation
<wallyworld> we should copy that doc back across
<rogpeppe> wallyworld: +1
<rogpeppe> wallyworld: BTW both the field names *and* the comments have changed from the original
<wallyworld> yes, can't recall why now, may have been to comply with the spec we were given actually
<katco> morning all
<katco> wallyworld: evening o/
<wallyworld> hi
<katco> wallyworld: doing ok?
<wallyworld> no, life sucks
<cherylj> someone's not living the dream today :/
<katco> wallyworld: understandable atm... but the sun will rise on a better day :)
 * wallyworld is grumpy
<katco> wallyworld: anything i can do to help?
<wallyworld> wave a magic wand and solve all the merge conflicts and test failures
<rogpeppe> wallyworld: sorry if i exacerbated the situation :)
<wallyworld> rogpeppe: np, i have copied the text across already
<katco> wallyworld: i can work on merges today if you'd like
<rogpeppe> wallyworld: thanks
<rogpeppe> wallyworld: changing the names would be good too. Something like: APIEndpoints and ResolvedAPIEndpoints ?
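Putting wallyworld's explanation and rogpeppe's rename suggestion together, the clarified type might read roughly like this; field names and comments are illustrative, not what actually landed:

    package jujuclient

    // Controller records how to reach a Juju controller. Both fields hold
    // host:port pairs; they differ only in whether hostnames have been
    // resolved to IP addresses.
    type Controller struct {
        // APIEndpoints holds unresolved hostname:port addresses.
        APIEndpoints []string
        // ResolvedAPIEndpoints holds the same endpoints as ip:port pairs.
        ResolvedAPIEndpoints []string
    }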
<wallyworld> katco: thanks, but there's a lot of ingrained knowledge needed to resolve the conflicts
<dooferlad> frobware: is there no maas call today?
<wallyworld> rogpeppe: i'll see what we can do
<katco> wallyworld: ok. well, lmk
<wallyworld> will do
<wallyworld> katco: you could review http://reviews.vapour.ws/r/4287/ - it's a temporary fix to get restore working, still a long way to go to fix it all properly
<katco> wallyworld: tal
<rogpeppe> wallyworld: thanks, that's appreciated
<cherylj> rogpeppe: is your 070-use-charmstore-v5-api branch something we need to get into the beta this week?
<rogpeppe> cherylj: no, sorry, it was a temporary hack to get a PR in a mutually dependent repo landed. I can delete it now.
<cherylj> rogpeppe: ah, ok, thanks!
<rogpeppe> cherylj: deleted
<cherylj> thanks!
<rogpeppe> cherylj: and BTW if you ever need to land something in a repository which both depends on an is depended on by juju-core, that's the way to do it and still have godeps work.
<rogpeppe> s/on an is/on and is/
<cherylj> good to know.  Thanks :)
<ericsnow> rogpeppe: FYI, that merge went through on the first try :)
<rogpeppe> ericsnow: yeah, i saw in the morning and was happy
<ericsnow> :)
<rogpeppe> ericsnow: thanks for keeping an eye on it
<ericsnow> rogpeppe: np
<katco> rogpeppe: hey thanks for working so hard to land that patch :) work continues i guess?
<fwereade> katco, if you know backup: ISTM that creating a RestoreInfoSetter should not create a document; or, if it should, that it ought to handle concurrent creations of the same type
<fwereade> katco, is there some synchronisation mechanism somewhere in restore that I've missed?
<katco> fwereade: eh? is this regarding wallyworld's patch?
<rogpeppe> katco: i'm leaving as is for the time being, as i have to make progress in other areas.
<fwereade> katco, not at all I'm afraid
<rogpeppe> katco: it should be sufficient for most purposes. there may be some bugs :)
<katco> fwereade: ah. i hardly know anything about backup/restore i'm afraid :( but ericsnow and natefinch should
<fwereade> katco, np, thanks
<katco> fwereade: i'm reviewing a patch now, so maybe i'll become an expert in the next bit of time ;)
<wallyworld> i don't recall doing anything with RestoreInfoSetter in my patch?
<rogpeppe> katco: hope that's ok. did you have any other particularly pressing things?
<fwereade> ericsnow, natefinch: ^ ? (RestoreInfoSetter)
<ericsnow> rogpeppe, katco: there are a few things left (like sending the channel through the AddCharm API endpoint)
<fwereade> wallyworld, nothing to do with you
<natefinch> perrito666 is Mr. Restore
<rogpeppe> ericsnow: file a bug
<ericsnow> fwereade: perrito666 did all the work on restore, though I did review much of the work
<fwereade> wallyworld, my problem entirely :)
<katco> rogpeppe: the only horse i have in that race is natefinch being able to determine what channel a charm was deployed from
<wallyworld> \o/
<ericsnow> rogpeppe: k
 * katco head explodes from the different threads of convo
<fwereade> perrito666, last I heard, was not feeling well
<natefinch> gah.. people talk about sending channels around in go and I get all confused
<ericsnow> fwereade: :(
<fwereade> ericsnow, I will do my best and ping you for a review
<ericsnow> fwereade: sounds good; thanks!
<ericsnow> natefinch: ha
<katco> ericsnow: ty
<rogpeppe> katco: yeah, currently it'll probably always use stable.
<perrito666> fwereade: I am back fully restored
<perrito666> fwereade: how can I help you?
<ericsnow> rogpeppe, katco: we still need to sort out the long-lived macaroon situation, but I don't think it's as urgent
<tvansteenburgh> where can i find a list of all the valid Request types for each api facade?
<ericsnow> tvansteenburgh: I don't believe that is cataloged anywhere, meaning you have to read through the facades code and apiserver/params/*.go
<natefinch> tvansteenburgh: yeah, unfortunately, what ericsnow said.  We really should document all that stuff.
<ericsnow> tvansteenburgh: godoc might help make it more manageable though
<bogdanteleaga> do we have something like this for stderr? https://github.com/juju/cmd/blob/master/output.go#L157
<tvansteenburgh> ericsnow, natefinch: ok, well if i know the name of the facade, where in the source should i go to find the Request types for it?
<ericsnow> tvansteenburgh: all the API data types are lumped together in several files in apiserver/params.go
<ericsnow> tvansteenburgh: so you have to see what types are in the API methods and then find them in those params files
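To make that concrete: a facade method's signature names the request and result structs, and those structs are what live in apiserver/params/*.go. Everything below is invented for illustration, not an actual juju facade:

    package frobber

    // FrobArgs and FrobResults are the kinds of structs that live in
    // apiserver/params/*.go; the facade method signature tells you which
    // ones to look up for a given request type.
    type FrobArgs struct {
        Entities []string `json:"entities"`
    }

    type FrobResults struct {
        Errors []string `json:"errors,omitempty"`
    }

    // API is a stand-in facade.
    type API struct{}

    // Frob takes its request type and returns its result type, both of
    // which would be defined in the params package.
    func (api *API) Frob(args FrobArgs) (FrobResults, error) {
        return FrobResults{}, nil
    }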
<fwereade> perrito666, so the heart of it is that creating a RestoreInfoSetter will sometimes fail
<perrito666> it being the bug you mentioned early today?
<fwereade> perrito666, which is presumably happening now because of timing changes in my branch
<fwereade> perrito666, yeah
<fwereade> perrito666, and I am worried that just fixing the current symptom is going to leave us unstable all the same
<perrito666> fwereade: see priv msg
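For context on fwereade's concurrency worry: the usual pattern for safely creating a singleton document is an mgo/txn insert that asserts the document is missing and treats a concurrent creation as success. A rough sketch, with the collection and field names invented:

    package state

    import (
        "gopkg.in/mgo.v2/bson"
        "gopkg.in/mgo.v2/txn"
    )

    // ensureRestoreInfoDoc creates the restore-info document if it is
    // missing, and treats a concurrent creation as success instead of an
    // error, which is the behavior the discussion above is asking for.
    func ensureRestoreInfoDoc(runner *txn.Runner) error {
        ops := []txn.Op{{
            C:      "restoreInfo",
            Id:     "current",
            Assert: txn.DocMissing,
            Insert: bson.M{"status": "UNKNOWN"},
        }}
        err := runner.Run(ops, "", nil)
        if err == txn.ErrAborted {
            // Another writer created it first; the document now exists.
            return nil
        }
        return err
    }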
<dimitern> wallyworld, ping
<wallyworld> yo
<dimitern> hey, I've found a storage issue with bootstrapping on maas now
<dimitern> http://paste.ubuntu.com/15472360/
<dimitern> looks like it's related to removing provider storage?
<tvansteenburgh> ericsnow: i don't see a apiserver/params.go?
<ericsnow> tvansteenburgh: apiserver/params/params.go
<ericsnow> (in core)
<wallyworld> dimitern: maas still has provider storage
<wallyworld> it can't use tags the same way as the other providers, so we still need it
<wallyworld> are you referring to this line? DEBUG juju.provider.common state.go:36 putting "provider-state" to bootstrap storage *maas.maasStorage
<dimitern> wallyworld, nope - it fails like joyent - see the paste above - just after it connected once client logins were unblocked
<perrito666> dimitern: didnt remove storage in maas
<perrito666> oh wallyworld said that already
<wallyworld> dimitern: so why do you think it's a storage issue?
<dimitern> oh :/ I was hoping it was something known.. I'll keep digging
<wallyworld> dimitern: cherylj had a tgeory
<wallyworld> theory
<wallyworld> can't recall exactly what now
<dimitern> wallyworld, joyent precise jobs failed very similarly and intermittently and it was due to missing tags
 * cherylj reads backscroll
<perrito666> dimitern: joyent iirc was a problem with tags not being populated as fast as they were requested
<cherylj> perrito666 - the problem with joyent was that they were not setting tags passed in as StartInstance args
<perrito666> ah, I got it wrong then :)
<cherylj> perrito666: no worries, I didn't discover that until adding in the logic to retry the api connection in bootstrap if the error was "not bootstrapped"
<cherylj> I saw that even after several minutes, we never got tagged instances
<cherylj> so, I had to look elsewhere :)
<dimitern> perrito666, cherylj, wallyworld, false alarm, I found out why it fails - not related to storage
<cherylj> dimitern: ah, good :)
<mup> Bug #1560511 opened: The AddCharmWithAuthorization API endpoint needs to respect channels. <juju-core:New> <https://launchpad.net/bugs/1560511>
<mup> Bug #1560520 opened: Charm channels must be used on the controller. <juju-core:New> <https://launchpad.net/bugs/1560520>
<mup> Bug #1560520 changed: Charm channels must be used on the controller. <juju-core:New> <https://launchpad.net/bugs/1560520>
<mup> Bug # opened: 1560520, 1560525, 1560527, 1560531
<mup> Bug #1560525 changed: Juju 2.0-beta3 stabilization  <blocker> <juju-core:Triaged> <https://launchpad.net/bugs/1560525>
<mup> Bug #1560527 changed: juju get should be able to take a key argument <juju-core:New> <https://launchpad.net/bugs/1560527>
<mup> Bug #1560531 changed: Charm store macaroons must be used on the controller. <juju-core:New> <https://launchpad.net/bugs/1560531>
<mup> Bug #1560525 opened: Juju 2.0-beta3 stabilization  <blocker> <juju-core:Triaged> <https://launchpad.net/bugs/1560525>
<mup> Bug #1560527 opened: juju get should be able to take a key argument <juju-core:New> <https://launchpad.net/bugs/1560527>
<mup> Bug #1560531 opened: Charm store macaroons must be used on the controller. <juju-core:New> <https://launchpad.net/bugs/1560531>
<redir> morning
<perrito666> morning redir
<redir> feeling better perrito666 ?
<perrito666> redir: much, thank you
<redir> good good
<perrito666> redir: did you get your mail and other stuff?
<dimitern> dooferlad, ping
<dooferlad> dimitern: pong
<dimitern> dooferlad, hey, I wasn't sure you're around today
<dimitern> dooferlad, meet babbageclunk - Christian
<dooferlad> dimitern: I am, but was at the hospital this morning for a scan (baby #2)
<dooferlad> hello babbageclunk!
<dimitern> I see, ok
 * dimitern needs to go out, but will be back later
<TheMue> dooferlad: baby #2?
<dooferlad> TheMue: yep :-)
<TheMue> dooferlad: great news, grats
<dooferlad> TheMue: due early August.
<dooferlad> TheMue: Thanks!
<TheMue> dooferlad: so enough time left for the preparation. both of ours may have left home by the end of the year, depends on universities
<dooferlad> TheMue: ah, happy and sad at the same time.
<TheMue> dooferlad: exactly, but it makes you proud to see how they grow up.
<redir> perrito666: not yet
<TheMue> dooferlad: hehe, and we already think about downsizing the house, when we don't need so much room anymore
<TheMue> dooferlad: means selling the big one and buying a new, smaller, and more modern one
<dooferlad> TheMue: heh, for me the only larger thing we may buy is a car!
<TheMue> dooferlad: here we already changed to a smaller one (but with a little more luxury). the large station wagon soon wasn't needed anymore
<cherylj> frankban: ping?
<frankban> cherylj: on call, will ping you asap
<cherylj> k, thanks!
<ericsnow> redir: did you get onto ReviewBoard yet?
<voidspace> alexisb: ping
<redir> ericsnow: I don't believe so.
<redir> ianb said he'd work on it.
<redir> going to listen to the people and culture orientation thing in a minute
<redir> Then I'll ping alexisb about the paperwork stuff.
<ericsnow> redir: go to http://reviews.vapour.ws/ and click on the github button
<ericsnow> redir: see https://github.com/juju/juju/blob/master/CONTRIBUTING.md#code-review
<redir> ericsnow: to be sure, would I receive an email re: reviewboard?
<redir> ericsnow: will do
<ericsnow> redir: also see https://github.com/juju/juju/blob/master/doc/contributions/reviewboard.md
<redir> ericsnow: great thanks
<ericsnow> redir: np
<cherylj> ericsnow: for bug 1560531 and bug 1560520, are these blockers for the next beta?
<mup> Bug #1560531: Charm store macaroons must be used on the controller. <juju-core:New> <https://launchpad.net/bugs/1560531>
<mup> Bug #1560520: Charm channels must be used on the controller. <juju-core:New> <https://launchpad.net/bugs/1560520>
<ericsnow> cherylj: they prevent the correct behavior when folks use channels, so I'd say so
<ericsnow> cherylj: rick_h_ could probably say more; and I know it's on urulama's radar
<ericsnow> cherylj: oh, and regarding the macaroons one, I'm not sure but I'd be inclined to call it a blocker too
<rick_h_> ericsnow: didn't we say for the beta we'd get the initial deploy in and come back with follow ups to fix the bugs in that behavior after deploy?
<ericsnow> cherylj: it will keep private charms from doing the right thing or using resources
<ericsnow> rick_h_: the initial deploy stuff isn't finished relative to channels
<ericsnow> rick_h_: basically just the --channel flag is added
<rick_h_> ericsnow: ok, and it won't deploy the right one when used?
<ericsnow> rick_h_: right; it will always use stable
<ericsnow> currently
<rick_h_> ericsnow: oh...then boooo
<ericsnow> cherylj: and by "next beta" you mean the one after the one we're wrapping up this week, right?
<redir> :)
<cherylj> ericsnow: we're still trying to wrap up the one for this week
<ericsnow> cherylj: those bugs just need to be fixed in the final release, however that works out
<cherylj> okay, thanks
<ericsnow> cherylj: I do not anticipate they will be resolved this week
<cherylj> ericsnow: cool, thanks
<ericsnow> cherylj: np
<katco> cherylj: best to talk to urulama for status/eta
<urulama> otp
<voidspace> alexisb: meet babbageclunk :-)
<alexisb> babbageclunk, heya man, welcome!
<babbageclunk> alexisb: hi!
<babbageclunk> Thanks!
<frankban> cherylj: ping
<urulama> katco, cherylj, ericsnow: yes, that's not a full implementation, but it's also not as problematic as you've described. once you have a fully qualified url from the channel pointer, the logic to resolve ACLs should allow you to deploy it, not just the stable one
<urulama> so, i'd like to have a QA pass performed to see when this actually breaks
<urulama> it's a bug as in not fully implemented, but it might not be critical
<alexisb> frankban, I am pestering cherylj so her responses will be slow
<frankban> alexisb: np
<ericsnow> urulama: thanks for clarifying
<urulama> so, the plan is to test it and, in case it's critical, we'll work on it tomorrow
<ericsnow> urulama: how can the store determine the channel from the charm URL?
<urulama> today is eod for eu
<ericsnow> (resolved URL)
<urulama> ericsnow: so, the case where this will break is when the stable channel has more rights than the development channel.
<urulama> ericsnow: so, you'd get a resolved url, then the CS logic checks if you can deploy it from stable first, then from development, then unpublished
<urulama> ericsnow: if you have access to stable but not development, then you can deploy it
<ericsnow> urulama: in the case where the user explicitly specified the channel they want
<katco> urulama: that doesn't seem quite right... what if the charm was deployed from development to begin with?
<urulama> ericsnow: but the revision is the same
<ericsnow> urulama: right
<urulama> i said it's not fully implemented
<urulama> it's just not as critical to be a blocker
<katco> urulama: ah ok, that's the edge-case
<ericsnow> urulama: k
<urulama> well, unless someone else call it a blocker :)
<katco> urulama: i don't think we should implement the fallback algo you described while we wait for the full implementation
<katco> urulama: it's just a guess at a path that may or may not be correct
<rick_h_> right, it's a feature in a beta. It can not work properly in the beta
<rick_h_> katco: urulama ^
<urulama> katco: that's already in the charmstore ... once the channel is passed to addcharm, everything will be ok
<ericsnow> katco: for download all we need is the resolved URL, which is the same for the different channels
<urulama> ericsnow: yes
<katco> ericsnow: how does that account for resources?
<ericsnow> katco: so the edge case is that the requested channel doesn't actually have that revision, but that is checked when we resolve the URL
<ericsnow> urulama: I think I get it now
<ericsnow> katco: for resources we still need the channel
<katco> rick_h_: understood... i'm just trying to connect urulama with cherylj/you so the right calls can be made. again, my team does not have a horse in this race.
<rick_h_> katco: rgr
<katco> rick_h_: in fact, i would encourage direct communication b/t those parties :)
<katco> ericsnow: we are just adding confusion. let's remove ourselves from the conversation going forward
<urulama> cherylj: seems we need to talk ... gimme an hour of my so-called life, will be back later :)
<cherylj> pfft...  "life"
<cherylj> isn't this job your life?!
<urulama> it is :)
<urulama> cherylj: so, in 1h ... is that ok for you?
<cherylj> urulama: yeah, I'll be here :)
<cherylj> just ping me
<urulama> ok, i do want to verify this, tbh, i feel more at ease with proper QA than a bunch of guesses :)
<katco> ericsnow: have you looked at natefinch's http://reviews.vapour.ws/r/4269 ?
<ericsnow> katco: I reviewed it yesterday but haven't looked since
<ericsnow> katco: will take a look
<katco> ericsnow: it seems like there's a lot of resources functionality that should live in the component? but unsure
<cherylj> damn, I was just about to ping rogpeppe about breaking master!!
<cherylj> anyone know why he would've removed github.com/gabriel-samfira/sys from dependencies.tsv?
<cherylj> ericsnow: ^^  I see you reviewed the PR.  Any thoughts?
<ericsnow> cherylj: I expect it was accidental
<cherylj> ericsnow: okay, thanks.  Just wanted to sanity check before I add it back in
<ericsnow> cherylj: np
 * natefinch is back
<katco> natefinch: wb
<katco> natefinch: i was mentioning to ericsnow: for your patch, it looked like there was some resources functionality in there that maybe belonged in the component? but wasn't sure
<katco> ericsnow: how hard would it be to keep http://reviews.vapour.ws/r/4272 going? i don't think we should land it before 2.0 because i don't want it to be a huge source of conflicts for incoming branches
<natefinch> katco: hmm... good question.
<ericsnow> katco: what do you mean "keep it going"?
<katco> ericsnow: keep the patch updated with changes as they come in
<ericsnow> katco: as to conflicts, it would only be a problem for new HTTP endpoints, not new facades
<ericsnow> katco: I doubt there will be much churn in that bit of code
<katco> ericsnow: are you aware of any new endpoints coming in with the 2.0 feature branches?
<ericsnow> katco: no, though I could imagine model migrations potentially having something
<katco> ericsnow: alright... i suppose we can land this code if we can also assist in any merge conflicts
<katco> natefinch: btw what are you working on now?
<ericsnow> katco: k
<redir> ericsnow: yes
<natefinch> katco: I'd spent a little time trying to clean up my patch from last night, so there wasn't so much duplication and so many unnecessary levels of abstraction... but that's not super pressing right now.  I was going to ask what I should do next.
<redir> whoops, was scrolled. nm ericsnow
<ericsnow> redir: :)
<katco> natefinch: for now focus on reviews to unblock ericsnow
<natefinch> katco: happy to
<katco> natefinch: i'm peeking at the backlog, but open to any pets you or ericsnow feel need addressing
<ericsnow> katco: we do need to sort out the long-lived macaroon issue in the relative "soon" timeframe
<katco> ericsnow: meaning before the apr. 8th deadline?
<natefinch> katco, ericsnow: do we clean up old resources when we remove a service?
<katco> natefinch: i think ericsnow landed that patch
<ericsnow> natefinch: yep
<katco> natefinch: we took care of all the critical bugs
<natefinch> cool... I thought I remembered that, but just wanted to make sure
<katco> natefinch: so now we're onto uh... "bothersome"? bugs? and charmstore implementation
<katco> natefinch: but we should take the rest of the iteration to just do bugs and stay agile for the release
<ericsnow> katco: the macaroon issue will keep private charms from working with resources
<natefinch> katco" I think there's some gap between the channel work that roger did and what I did, I'd have to look to know for sure.
<ericsnow> natefinch: FYI, I'm reviewing your patch right now
<katco> natefinch: there is, and that will remain until they have a full implementation up. nothing to do for now.
<katco> ericsnow: natefinch: so sounds like the long-lived macaroon should come next. i'll start drafting that email
<natefinch> ericsnow: yeah, I'm interested to hear your thoughts... looking at it with fresh eyes this morning, I think there's a bunch of cleanup I could do to avoid duplicate abstraction
<natefinch> katco: cool
<ericsnow> natefinch: yeah, I'm jotting down my thoughts right now :)
<natefinch> ericsnow: I think we can collapse Client and BaseClient and just have my clientWrapper interface to abstract away the ugly details of csclient.Client and the charmstore repo dance.
<natefinch> ericsnow: but I may be missing some externalities that you were taking into consideration
<mup> Bug #1560593 opened: debian lintian reports mis-spellings <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1560593>
<mup> Bug #1560595 opened: help text for juju show-cloud needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1560595>
<ericsnow> natefinch: baseClient *is* that wrapper
<ericsnow> natefinch: baseClient and Client represent different things
<natefinch> ericsnow: except with my code, there's a lot more logic inside baseClient than basically anywhere else in the file
<natefinch> ericsnow: maybe that was just me putting the logic in the wrong layer, though
<ericsnow> natefinch: it's partly a consequence of not having that logic living under the charmrepo repo
<ericsnow> natefinch: and of dealing with the way channels are handled in csclient.Client
<natefinch> ericsnow: right.. so we need our own wrapper to make using csclient/charmrepo less awful
<ericsnow> natefinch: and that's what baseClient is for
<natefinch> ericsnow: but that's really the point of the entire package, so why do we need Client?
<ericsnow> natefinch: let me finish my review and we can discuss it some more
<natefinch> ericsnow: sure :)  Sorry :)
<redir> what's JFDI?
<cherylj> redir: it's a keyword that lets you try to merge changes into a branch, even if it's blocked by a bug
<redir> just do it. got it, tx
<cherylj> redir: you can see if branches are blocked here:  http://juju.fail/index.html
<redir> marked, thanks
<cherylj> hey rogpeppe, was there a reason https://github.com/juju/juju/pull/4807/ removed github.com/gabriel-samfira/sys from dependencies.tsv?  or was it an accident?
<rogpeppe> cherylj: oh bother
<rogpeppe> cherylj: i forgot to do GOOS=windows godeps.
<cherylj> so accident, then :)
<rogpeppe> cherylj: yes
<rogpeppe> cherylj: oops
<cherylj> rogpeppe: http://reviews.vapour.ws/r/4299/
<rogpeppe> cherylj: reviewed
<rogpeppe> cherylj: sorry for the inconvenience
<cherylj> rogpeppe: it happens :)  You were up rather late getting these things in
<rogpeppe> cherylj: no excuse though :)
<natefinch> rogpeppe: we should import _ "github.com/gabriel-samfira/sys/unix" somewhere just to avoid that
<rogpeppe> natefinch: godeps should be better about tagged imports.
<natefinch> rogpeppe: if only we had a good relation with its author ;)
<cherylj> heh
<ericsnow> natefinch: my review is up
<rogpeppe> natefinch: how would that help?
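A minimal sketch of natefinch's suggestion, assuming a hypothetical file location in the juju tree: a blank import in a file that compiles on every platform keeps the Windows-only dependency visible to godeps even when it is run without GOOS=windows.

```go
// deps.go (hypothetical location) - pins build-tagged dependencies so that
// dependency tools scanning only the current GOOS still record them.
package juju

import (
	// Imported for side effects only: keeps github.com/gabriel-samfira/sys
	// in dependencies.tsv regardless of the GOOS godeps runs under.
	_ "github.com/gabriel-samfira/sys/unix"
)
```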
<ericsnow> natefinch: as to merging baseClient and Client, baseClient is a wrapper around csclient.Client
<natefinch> ericsnow: but isn't that what Client is, too?
<ericsnow> natefinch: and Client is a wrapper around BaseClient, thus keeping a tighter control on what functionality we depend on in Juju
<mgz> rogpeppe: you already looked at the build issues with your charmv5 api branch right?
<ericsnow> natefinch: Client also provides some Juju-specific functionality
<rogpeppe> mgz: yeah, cherylj pointed out that I'd mucked up the deps
<natefinch> ericsnow: if it lives under github.com/juju/juju, it's all juju-specific
<ericsnow> natefinch: whereas baseClient adapts csclient.Client (and charmrepo.CharmStore) to a more sensible API
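The layering ericsnow describes might look roughly like this; the two type names come from the discussion, while the import path and fields are assumptions for illustration.

```go
package charmstore

import "gopkg.in/juju/charmrepo.v2-unstable/csclient"

// baseClient adapts csclient.Client (and the charmrepo.CharmStore dance)
// to a more sensible API, hiding channel handling and other ugly details.
type baseClient struct {
	csc *csclient.Client
}

// Client wraps baseClient, keeping tight control over exactly which charm
// store functionality Juju depends on, and adds Juju-specific behaviour.
type Client struct {
	base *baseClient
}
```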
<rogpeppe> mgz: i'm hoping that's the only issue
<arosales> aside from 'kill-controller', do any folks have hints on how to reclaim a juju 2.0 environment?
 * arosales stuck in this loop http://paste.ubuntu.com/15473933/
<cherylj> arosales: you gotta manually delete the info from the cache.yaml
<cherylj> I've been doing that all week :/
<arosales> cherylj: ah, ok thanks
<cherylj> we should totally fix that for beta3
<mgz> rogpeppe: oh, so you've not looked at reports.vapour.ws/releases/3799 directly?
<rogpeppe> mgz: nope
<mgz> rogpeppe: not sure the other build problems all come from windows dep
<rogpeppe> mgz: I'm eod
<cherylj> mgz: that was a temporary branch that rogpeppe already deleted
<ericsnow> natefinch: there are many things in the core repo that aren't Juju-specific, but no one has taken the time to pull them out
<cherylj> I pinged him about it this morning :)
<mgz> cherylj: okay, all I want to know is if I need to report bugs, sounds like not?
<cherylj> mgz: no, not for that branch
<cherylj> mgz: and for the windows build failure on master, I've already submitted a fix for dependencies.tsv
<mgz> excellent
<rogpeppe> mgz: sorry, we needed a temporary branch to make godeps work in another repo with a circular dependency with juju-core
<rogpeppe> mgz: see mattyw for details
<mgz> ah, but didn't actually want it tested?
<ericsnow> natefinch: ideally baseClient would go away and we'd use csclient.Client (or some external surrogate) directly
 * rogpeppe thinks that "base" is an insidious qualifier for any name.
<mattyw> mgz, yeah, we have some horrific circular dependency thing going on :(
<arosales> cherylj: safe to just rm ~/.local/share/juju/models/cache.yaml and start anew?
<ericsnow> rogpeppe: it's a base class <wink>
<cherylj> arosales: as long as you don't have any other controllers you need to talk to
<natefinch> rogpeppe: +1
<arosales> cherylj: nope
<natefinch> ericsnow: blech
<natefinch> ericsnow: (re: base classes)
<cherylj> arosales: you may also need to clean up ~/.local/share/juju/controllers.yaml
<ericsnow> natefinch: lol
<arosales> cherylj: ok, thanks for the the help.
<cherylj> np, I'll see if we can get that bug fixed on beta3
<cherylj> too many people are hitting it
<mgz> mattyw: you can probably push the rev to the repo without actually creating a branch
<mgz> then reference the rev directly
<mgz> this is all rather corner-casey though
<mgz> you could also ask someone in CI to blacklist a branch you don't want tested
<natefinch> katco, ericsnow: so about the problem with GetResource not having all the data... can we just have the charmstore return the rest of the metadata?  The only things it's missing is Origin, Type, and Description, which seem like a trivial amount of data to return along with the bytes themselves.
<ericsnow> natefinch: the charm store client is supposed to be returning that info (see https://docs.google.com/document/d/1T_7XQ-pmE4gFiD2SSaZnQUqlcLkA7dk0c0PB_ebEkrI)
<natefinch> ericsnow: ok, we need to adjust GetResource on csclient.Client, then
<natefinch> ericsnow: are we doing a multipart body, or dumping the extra data into headers?  Right now there's a header for revision and hash, we'd need additional headers for type, path, description, and origin.
<natefinch> ericsnow: (I presume the latter, since we haven't dealt with multipart bodies anywhere else yet)
<katco> natefinch: i would do headers as the metadata is metadata about what's in the body
<ericsnow> natefinch: origin is strictly inferred and the fingerprint should be in a header
<ericsnow> natefinch: the type, path, and description come from the charm metadata, which may or may not help depending on our use of charmrepo.CharmStore
<natefinch> ericsnow: current use does not make that helpful.  It seems like it makes the API friendlier if we just return all the metadata along with the bytes, given that the payload is tiny compared to the bytes themselves.
<ericsnow> natefinch: the correct way would be to use a multipart body, but it's a pain
<natefinch> ericsnow: agreed on both points
<ericsnow> natefinch: at least take a look to see how much work it would be to do a multipart body
<natefinch> ericsnow: there is a library for it... I'll have to check how much of it exists in 1.2, though
<ericsnow> natefinch: well, we did multi-part for backups with 1.2, so...
<mup> Bug #1560618 opened: Cloud types are unknown <docteam> <juju-core:New> <https://launchpad.net/bugs/1560618>
<mup> Bug #1560624 opened: cmd supercommand.go:448 failed to bootstrap model: no matching tools available <cdo-qa> <juju-core:New> <https://launchpad.net/bugs/1560624>
 * thumper goes to find the cat, another trip to the vet for us, yay
<natefinch> ericsnow: so, multipart seems fine, I'll get that adjusted, and then that'll fix the metadata problem for that PR
<ericsnow> natefinch: sweet
<TheMue> thumper: oh, hope not too bad.
 * TheMue loves cats
<thumper> TheMue: she has diabetes, and we are getting the insulin levels checked
<natefinch> ericsnow: I'm just going to assume the first part is the metadata and the second is the bytes... that seem reasonable?
<ericsnow> natefinch: yep
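A minimal sketch of the shape being agreed here, assuming a hypothetical endpoint and illustrative field names: the response is a multipart body whose first part is JSON metadata and whose second part is the resource bytes, streamed rather than buffered.

```go
package main

import (
	"encoding/json"
	"io"
	"mime"
	"mime/multipart"
	"net/http"
	"os"
)

// resourceMeta holds the metadata part; field names are illustrative.
type resourceMeta struct {
	Name        string `json:"name"`
	Type        string `json:"type"`
	Path        string `json:"path"`
	Description string `json:"description"`
	Revision    int    `json:"revision"`
}

// getResource fetches a two-part response: part one is the metadata
// document, part two is the payload. The caller streams the payload and
// must close resp.Body when done (elided here for brevity).
func getResource(url string) (resourceMeta, io.Reader, error) {
	var meta resourceMeta
	resp, err := http.Get(url)
	if err != nil {
		return meta, nil, err
	}
	_, params, err := mime.ParseMediaType(resp.Header.Get("Content-Type"))
	if err != nil {
		return meta, nil, err
	}
	mr := multipart.NewReader(resp.Body, params["boundary"])
	part, err := mr.NextPart()
	if err != nil {
		return meta, nil, err
	}
	if err := json.NewDecoder(part).Decode(&meta); err != nil {
		return meta, nil, err
	}
	blob, err := mr.NextPart() // second part: the resource bytes
	return meta, blob, err
}

func main() {
	_, blob, err := getResource("http://localhost:8080/resource") // placeholder URL
	if err != nil {
		panic(err)
	}
	io.Copy(os.Stdout, blob)
}
```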
<natefinch> ericsnow: so much cleaner and nicer this way.
<TheMue> thumper: oh, possible for cats too? our last one sadly had troubles with his kidneys. leaving us after being a family member for more than 14 years.
<ericsnow> natefinch: yep :)
<thumper> TheMue: yeah, was a surprise to us too
<thumper> only found out a month ago
<thumper> hence the regular visits now
<TheMue> thumper: so it's good they found it and you can take care for it
<thumper> yeah
<natefinch> my wife was a vet tech for a while, she says kidney problems and diabetes are super common for cats
<TheMue> natefinch: about kidney problems we then learned too. so we're now more careful and try to detect troubles earlier
<TheMue> natefinch: but our new cat, we've got it since some weeks now, already trains us. she owns us already. :D
<natefinch> TheMue: :)  We're cat people, too.  We had one cat, and decided he needed a friend, so we went to the shelter to get one more cat... and came home with two more :D
<TheMue> natefinch: *rofl* but you're not only cat people. when I see the pics of your zoo.
<natefinch> TheMue: the goats and chickens are actually not much harder to take care of than the cats... we do have to trim the goats' hooves every 6 weeks, and clean out the chicken coop once a month or so... but otherwise it's just like cats, make sure they have food and water, and they're fine
<TheMue> natefinch: I love it when I see your kids, ok, the girls, take care for them. and the junior surely soon will too
<natefinch> rogpeppe: how important is it to print out the data in this error message? https://github.com/juju/charmrepo/blob/v2-unstable/csclient/csclient.go#L669   seems like we could use json.Decoder to parse directly off the reader, rather than copying into a buffer first
<natefinch> TheMue: yeah, it's great having the animals for the kids, and they're fun for us too
<TheMue> natefinch: can imagine. I've grown up on countryside with many animals around me. always have been great (ok, sometime much work *g*)
<rogpeppe> natefinch: i've found those error messages extremely useful in the past
<rogpeppe> natefinch: particularly when the problem comes from a proxy not the service itself
<mup> Bug #1560665 opened: help text for juju status needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1560665>
<mup> Bug #1560667 opened: help text for juju list-clouds needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1560667>
<natefinch> rogpeppe: fair enough... was trying to avoid reading the full data into memory is all
<alexisb> wallyworld, thumper, fearless juju managers, when you have a moment I would like to steal you guys
<thumper> alexisb: when and for what?
<TheMue> hehe, stealing
<alexisb> review priorities as we get down to the wire
<alexisb> thumper, I am flexible
<alexisb> I can put something on the calendar for this afternoon if that is easier
<katco> alexisb: fearless juju managers are now only wallyworld & thumper, yeah?
<alexisb> katco, yep
<alexisb> feel free to crash but I only need them, lucky souls
<katco> :)
 * TheMue continues to extend his tomb-alike loop to allow also hierarchically monitored and restarted goroutines
<alexisb> thumper, I am going to send an invite, let me know if the time doesnt work for you
<alexisb> cherylj, ping
<natefinch> ericsnow: standup?
<cherylj> alexisb: wrapping up a meeting now
<natefinch> wallyworld: you up?
<ericsnow> natefinch: coming
<wallyworld> natefinch: have a meeting clash :-(
<thumper> alexisb: I can tell when I'm not wanted ;-|
<alexisb> :)
<thumper> http://reviews.vapour.ws/r/4300/
<thumper> menn0: ^^ trade for yours
<menn0> thumper: looking
<menn0> thumper: done
<urulama> katco, ericsnow: just verified and was able to deploy a development charm from charm store
<urulama> with --channel development
<urulama> wallyworld: ping you around
<wallyworld> urulama: i am, just finishing a meeting
<urulama> np, i'll wait
<wallyworld> won't be long
<urulama> wallyworld: forgot we have sync in a few hours ... in between, take a look at these https://pastebin.canonical.com/152562/ https://pastebin.canonical.com/152560/ ... last one is multiseries charm
<wallyworld> urulama: looking in 30 seconds
<cherylj> well this is exciting:  "go1: internal compiler error: in fold_binary_loc, at fold-const.c:10124"
<mgz> cherylj: yeah, I didn't get as far as filing a bug for that but filled in the issue
<cherylj> I suspect the answer is going to be "use go 1.6"
<cherylj> mwhudson: how goes 1.6 in trusty?
<mwhudson> cherylj: slangasek promised he'd look at it today
<cherylj> cool
<mwhudson> cherylj: is that on ppc64el, or some other platform?
<cherylj> mwhudson: yes, ppc64el
<cherylj> so, ppc64 seems to go okay, just ppc64el falls over
<mwhudson> ppc64?
<cherylj> mgz:  but the build-binary-trusty-ppc64el succeeded?
 * cherylj is confused
<mgz> cherylj: it may only be a tests compile issue
<mgz> building binaries doesn't compile *_test.go
<cherylj> ah
<cherylj> ok
<axw> wallyworld: sorry I was away from computer last night. have started reviewing, will finish after standup
<mgz> cherylj: so, it's not test only, for some reason the way dpkg-buildpackage does the build just doesn't hit the issue
<mgz> probably because it does itsybitsy steps and then links everything at the end
<mgz> rather than doing go build .../...
<mgz> but I can repro the failure building just that package in charmstore so nfc
<mup> Bug #1560732 opened: Azure endpoint ACLs disappear after machine-0 restart <juju-core:New> <https://launchpad.net/bugs/1560732>
<mup> Bug #1560732 changed: Azure endpoint ACLs disappear after machine-0 restart <juju-core:New> <https://launchpad.net/bugs/1560732>
<mgz> well this works as a fix...
<mup> Bug #1560732 opened: Azure endpoint ACLs disappear after machine-0 restart <juju-core:New> <https://launchpad.net/bugs/1560732>
#juju-dev 2016-03-23
<mup> Bug #1560732 changed: Azure endpoint ACLs disappear after machine-0 restart <juju-core:New> <https://launchpad.net/bugs/1560732>
<perrito666> man, lint really hates some juju files
 * arosales can't get update-clouds to pick up changes to .local/share/juju/credentials/cred.yaml
<arosales> so list-credentials doesn't see the latest
<perrito666> creds
<perrito666> isnt it?
<arosales> is there a way to force juju to pick up the update?
<arosales> perrito666: are you saying cred.yaml should be creds.yaml?
<arosales> I can bootstrap with the current config
<perrito666> nah, I got confused, ignore me
<arosales> ok
<arosales> and what is the recommended way to set a different region for a cloud
<arosales> set-default-region?
<wallyworld> redir: your email should be fixed now
<redir> looks like it wallyworld, many thanks
<arosales> I think set-default-region is what I am looking for . . .
<wallyworld> arosales: yes, that is the command you are looking for
 * wallyworld likes jedis
<wallyworld> arosales: as of beta 3, there will be no need to edit credentials.yaml by hand - you have commands like add-credential and set-default-region etc to do it all
<wallyworld> arosales: you can also list/show credentials contents with/without secrets
<arosales> wallyworld: well I edited the cred.yaml to add another user and show-credentials didn't pick up my changes
<wallyworld> redir: could you subscribe to the canonical lists using your canonical email address. they will reject requests from your personal address
<wallyworld> arosales: pastebin it if you like and i'll help
<redir> wallyworld: I will, they did
<perrito666> wallyworld: so, I have the manta removal ready too
<perrito666> should I propose it against master too?
<wallyworld> perrito666: does it depend on anything ?
<anastasiamac> redir: r u in #canonical?
<perrito666> wallyworld: on this http://reviews.vapour.ws/r/4303/ but I am not sure how to tell reviewboard that
<perrito666> the depends-on field is a text field
<perrito666> ericsnow: ?
<wallyworld> perrito666: the reason for asking - can it be proposed against admin-controller-model
<perrito666> mm, I can try
<wallyworld> perrito666: you don't do it in rb - you propose against a branch in gh
<perrito666> wallyworld: I wanted the fancy version
<arosales> wallyworld: http://paste.ubuntu.com/15476275/
<perrito666> wallyworld: ill propose against admin-controller-model
<wallyworld> perrito666: rb will pick it up automatically
<wallyworld> just propose in gh and rb will do the right thing
<perrito666> wallyworld: rb has fancy depends; sadly I don't know how to use it, I'll propose against admin-...
<wallyworld> arosales: file should be called credentials.yaml
<wallyworld> arosales: are you running tip of master? or a packaged beta? if tip, you can use the interactive add-credential and also autoload-credentials
<arosales> wallyworld: beta2
<wallyworld> beta3 will add tools to better manage credentials
<arosales> I'll rename to credentials.yaml
<wallyworld> ok, ping if that doesn't work
<arosales> wallyworld: so should I update https://jujucharms.com/docs/devel/getting-started
<arosales> or does this not mater in juju 2.0 ga?
<wallyworld> if it says cred.yaml then yeah, let me look
<wallyworld> arosales: it looks ok at first glance but is a little out of date eg list-clouds does not include lxd, manual or maas
<arosales> wallyworld: no dice, http://paste.ubuntu.com/15476290/
<arosales> wallyworld: well there is no specific mention of the credentials.yaml location or the exact naming requirement
<arosales> only "juju add-credential aws -f mycreds.yaml"
<wallyworld> arosales: no! you want ~/.local/share/juju/credentials.yaml
<arosales> ok
<wallyworld> arosales: the release notes contain that info
 * perrito666 's kingdom for a straight merge
<wallyworld> arosales: but for beta3 - no need to deal with this stuff anymore thankfully
<wallyworld> it's all very manual in beta2
<arosales> wallyworld: ok that works
<wallyworld> arosales: awesome, sorry about the hassle with it, it's wip :-)
<arosales> wallyworld: I do like setting my defaults in a yaml file so I don't always have to specify them at the command line
<wallyworld> yep
<arosales> wallyworld: no worries
<arosales> glad to be trying it out, and thanks for the help
<wallyworld> sure, thanks for testing for us :-)
<wallyworld> beta3 will be awesome
<perrito666> wallyworld: https://github.com/juju/juju/pull/4857
<arosales> wallyworld: I am looking forward to it
<perrito666> going for dinner, bbl
<mgz> don't promise too big wallyworld :P
<wallyworld> thanks perrito666
<wallyworld> mgz: i'm optimistic :-)
<perrito666> wallyworld: and for the other one http://reviews.vapour.ws/r/4303/
<wallyworld> perrito666: ta, will look
<redir> how does one undeploy? i.e. a failed deploy of a charm?
<mgz> redir: depends on the reason
<mgz> redir: machine didn't come up? retry-provisioning
<redir> mgz: correct on failure and thanks
<mgz> redir: install hook failed? resolved --retry
<redir> didn't come up on aws
<mgz> for debug purposes it's often useful to use the underlying cloud commands to see what juju is seeing
<redir> mgz such as ec2-provider? or other commands?
<mgz> I tend to use euca2ools against ec2, there are lots of options
<redir> mgz oic. Thanks
<redir> retry-provisioning worked
 * redir eod
<cherylj> wth happened on the last CI run?
<anastasiamac> cherylj: hurdles?
<mgz> cherylj: which last one...
<cherylj> mgz: http://reports.vapour.ws/releases/3803
 * thumper wonders how much he just broke...
<cherylj> everything.  You broke everything, thumper
 * cherylj slow clap
<cherylj> just kidding :)
<thumper> :)
<mgz> jenkins is intermittently returning that atm
<mgz> I hadn't seen it'd affected quite so many tests
<mgz> cherylj: I did something dangerous 1 hour ago, but I see that's from 2 hours ago so am much pleased
<cherylj> haha
<mgz> I may just restart the frontend machine
<mgz> and look at its juju logs
<mgz> /apache logs
<cherylj> mgz: so from the last full run on master, we had the windows build failures, which I already fixed.
<cherylj> mgz: and bug 1558901
<mup> Bug #1558901: TestAddLocalCharmSuccess read has been closed <ci> <go1.5> <go1.6> <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1558901>
<cherylj> mgz: and that charmstore v5 ppcel failure
<mgz> cherylj: I have a fix for that
<cherylj> oh?
<cherylj> you're my hero, mgz
<mgz> I am just doing three things at once so haven't proposed yet
<cherylj> mgz what's the fix
<mgz> cherylj: we also had secgroup exhaustion on canonistack
<mgz> cherylj: rather than doing approx
<mgz> switch obj:
<mgz> case someTagType{}
<mgz> instead do:
<mgz> switch obj.(type)
<mgz> case someTagType
<mgz> which doesn't confuse some gcc const optimiser
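A minimal sketch of the change mgz is describing, with a hypothetical tag type: the value switch compares against a composite literal, while the type switch dispatches on the dynamic type and sidesteps the gcc const-optimiser bug seen on ppc64el.

```go
package main

import "fmt"

type someTagType struct{}

func describe(obj interface{}) string {
	// Before (approximately): a value switch against a literal, which
	// tripped the optimiser:
	//
	//	switch obj {
	//	case someTagType{}:
	//		return "tag"
	//	}

	// After: a plain type switch.
	switch obj.(type) {
	case someTagType:
		return "tag"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(describe(someTagType{})) // tag
	fmt.Println(describe(42))            // other
}
```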
<cherylj> ah
<thumper> boo
<thumper> allWatcherStateSuite.TestStateWatcherTwoModels
<thumper> intermittent test failure
<axw> wallyworld: sorry for delay, reviewed
<wallyworld> np, ty
<wallyworld> axw: a small one also http://reviews.vapour.ws/r/4305/
<mup> Bug #1560757 opened: allWatcherStateSuite.TestStateWatcherTwoModels intermittent failure <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1560757>
<axw> wallyworld: forgot to mention, cmd/modelcmd merge looked fine
<wallyworld> axw: gr8, ty
<axw> wallyworld: one other thing we're missing, "juju logout"
<wallyworld> axw: yeah, i have indicated that will slip to next week
<wallyworld> along with tidying up controller listing
<axw> wallyworld: ok
<wallyworld> to show logged in controllers etc
<axw> wallyworld: tidying up controller listing?
<axw> ok
<wallyworld> axw: if you get time, here's a charm deployment bug fix http://reviews.vapour.ws/r/4308/
<axw> wallyworld: lgtm. should probably check with uros to make sure the charm store should be allowing that
<axw> seems a bit weird to allow in charm store charms
<axw> local case is sane tho
<wallyworld> axw: yeah, he was the one who pinged me about the issue; my comments in the bug are a guess, he just sent me a pastebin; i have a meeting this arvo to follow up
<wallyworld> but the fix is good regardless
<axw> ok
<axw> wallyworld: for list-controllers, I'm going to add cloud type to controllers.yaml. the rest is already in bootstrap config, and can't be expected to exist in controllers.yaml anyway (e.g. if you do "juju register", most of that info is gone)
<wallyworld> axw: what about when we start to support heterogeneous controllers?
<wallyworld> cloud type should be on the model?
<wallyworld> for future-proofing
<wallyworld> we could include the admin model cloud type or something for now when we list
<axw> wallyworld: I'd prefer not to special case some model like that
<axw> wallyworld:
<axw> wallyworld: tho I suppose we could just check the UUID
<axw> hrmmmmm
<wallyworld> it would just be the default for now for list controllers - i meant yeah check the uuid
<wallyworld> if model uuid == controller uuid
<axw> wallyworld: what if the user doesn't have access to the admin model though?
<wallyworld> we can still tell them what type of cloud it is though - this is client side right?
<wallyworld> i'm going from memory, need to read the bug again
<axw> wallyworld: how...? they can't list the models they don't have access to
<wallyworld> hmmm, right we may need to refresh models
<wallyworld> if they don't have access, can we not report the cloud type then
<axw> wallyworld: I'm saying if you don't have permissions to do it. but yes, that too
<wallyworld> i guess though people would want to know cloud type regardless of if they have admin model access
<axw> most of the time, yeah
<wallyworld> but that holds only for homogeneous controllers
<wallyworld> since it doesn't make sense otherwise
<axw> wallyworld: well, there's the cloud that the controller runs in
<axw> wallyworld: and the cloud(s) that it manages
<axw> currently one and the same, but later on cloud-that-it-runs-in is still valid
<wallyworld> yes but if you don't have admin access, then what do you want to know that for
<axw> true.
<wallyworld> if you can create a model of your own in whatever cloud you want
<axw> wallyworld: I wonder if we shouldn't just use bootstrap config for this
<axw> wallyworld: all this info is in bootstrap config already
<wallyworld> it is yes
<wallyworld> but not on everyone's machine
<wallyworld> i do like adding it to model metadata for that reason
<wallyworld> since that gets refreshed
<wallyworld> axw: so the person who asked for this would be a controller admin, we could do it per model so long as you have admin access, and get feedback
<wallyworld> it will satisfy that initial use case
<wallyworld> we can easily change / extend to get from bootstrap config. but even then, if you have bootstrap config you'd be admin
<axw> wallyworld: I think I could live with having it on model details, but I'm not so keen on requiring a model refresh to get the admin model
<wallyworld> axw: sgtm, you would expect to have that
<axw> wallyworld: what sounds good? expect what?
<wallyworld> not needing a model refresh, because you'd expect to already have those model details
<wallyworld> axw: i was just suggesting having it on model for the reasons outlined, if there's a show stopper then we can think of something else
<wallyworld> but to me it makes sense anyway :-)
<axw> wallyworld: seeing as adam wants the maas-server and other info anyway, I'm treating them as orthogonal. we'll render bootstrap config if we have it, which addresses his immediate needs. we can add type to model if/when it's needed. people can already do "juju get-model-config type" if they need to
<wallyworld> axw: sgtm
<wallyworld> axw: later when you are free, here's a pr to remove manta from joyent. credentials entry nicer now http://reviews.vapour.ws/r/4309/
<axw> wallyworld: will take a look after lunch
<wallyworld> axw: np, there's a queue anyway
<axw> wallyworld: done
<wallyworld> ta
<wallyworld> axw: i had one question/suggestion
<wallyworld> thoughts?
<axw> wallyworld: yeah, I was thinking that too, for the type anyway. the name is pretty much useless, because it's always going to be "admin"
<axw> wallyworld: how about this: I'll remove name, take type up to the top level, and if that's all that's in config, omit config
<wallyworld> axw: sgtm
<wallyworld> axw: a small fix http://reviews.vapour.ws/r/4314/
<axw> wallyworld: looking
<axw> wallyworld: LGTM, just one request
<wallyworld> sure ty
<axw> wallyworld: are any of the other cards more important to you than others?
<axw> oh, there was the list-models one
<wallyworld> axw: yeah, i retested that but couldn't repro
<wallyworld> axw: i reckon we should print the current controller when listing models
<wallyworld> axw: as for cards, anything usability related, so the text when registering if there's no current model etc, or warning when --config has unknown values
<wallyworld> axw: also, roger suggested renaming controller apiendpoints and servers to APIEndpoints and ResolvedAPIEndpoints
<wallyworld> well i guess that's one rename
<axw> wallyworld: yeah, the field names there already aren't great
<axw> ok, I'll just pick off some stuff
<wallyworld> ta
<wallyworld> axw: have started manually testing restore, something looks broken. i manually kill the controller, the only machine, and restore -b, and it detects that there's no controllers and then proceeds to try and connect continually to the (killed) controller ip
<wallyworld> i have soccer but will look later, unless you are able to take a look while i'm gone
<wallyworld> the original admin-secret issue is fixed, this occurs after that
<urulama> wallyworld: multi-series charms are ok now
<wallyworld> urulama: yay, awesome
<urulama> wallyworld: indeed, thanks for the quick fix
<wallyworld> np, i'm an "expert" :-)
<urulama> LOL
<wallyworld> urulama: still a bit concerning that the stored charm metadata *seems* to lack series
<wallyworld> we'll need to dig into that
<urulama> wallyworld: you're using charmstore v5 to get the blob, right? if you're using v4, that would indeed be the case, you'd get a blob where metadata doesn't have series
<wallyworld> urulama: not sure tbh, i think it's v5 since i think i merged master after those changes landed
<urulama> wallyworld: ok, let's dig in for beta4
<wallyworld> yep, i'll add it to the list
<wallyworld> axw: yeah, so it finally timed out
<wallyworld> 1-85dd-bc5bff067743/api: dial tcp 54.237.98.19:17070: getsockopt: connection timed out
<wallyworld> 2016-03-23 08:56:58 ERROR cmd supercommand.go:448 getting API info: addresses for [] not found
<wallyworld> off to soccer, will check back later
<voidspace> babbageclunk: heya, morning
<voidspace> babbageclunk: how was your first day?
<babbageclunk> voidspace: Hi, great thanks!
<voidspace> babbageclunk: did you get company email and access to HR systems working?
<TheMue> babbageclunk: heya, greetings from a former team member
<voidspace> babbageclunk: and yay, you're working with us - that's great :-)
<babbageclunk> voidspace: Yup yup.
<babbageclunk> TheMue: Cheers!
<voidspace> babbageclunk: we have standup meetings at 10am - can you see them on your calendar?
<voidspace> babbageclunk: if not I can give you the link
<voidspace> TheMue: o/ morning
<TheMue> voidspace: o/ heya
<voidspace> babbageclunk: I'm going to have a chat with Tim about work changes on the MAAS 2.0 provider in about half an hour
<voidspace> babbageclunk: you can join us if you like, so you can help with the work
<babbageclunk> voidspace: yup, Dimiter invited me to them. I guess I should go into one of the chill areas for that.
<voidspace> babbageclunk: getting your HR stuff done *and* getting a dev environment setup is pretty good for day 1
<babbageclunk> voidspace: :)
<voidspace> babbageclunk: getting MAAS setup in KVM will probably take a chunk of day 2...
<babbageclunk> voidspace: Yeah, I'd like to join in the MAAS talk
<babbageclunk> voidspace: ok - any tips on where to start? I've got virt-manager installed.
<voidspace> babbageclunk: this guide is helpful
<voidspace> https://insights.ubuntu.com/2013/11/15/interested-in-maas-and-juju-heres-how-to-try-it-in-a-vm/
<voidspace> babbageclunk: I think that's the one I used
<voidspace> babbageclunk: the important thing is to setup a network in virt-manager for the instances to communicate
<babbageclunk> voidspace: ok, cool - I'll start working through that.
<voidspace> babbageclunk: and so that maas can manage dhcp for that network
<voidspace> babbageclunk: once you get it installed you can repeat with MAAS 2.0...
<voidspace> babbageclunk: for this work we really need *both* unfortunately
<babbageclunk> voidspace: send me a link for the MAAS chat with Tim?
<voidspace> babbageclunk: yep, when he lets me know where :-)
<babbageclunk> voidspace: cool cool
<voidspace> babbageclunk: when setting up the KVM instances for MAAS I generally give the MAAS controller a 20gb disk and the nodes 10gb
<voidspace> babbageclunk: that link is a little old for setting up maas - setting it up through the web ui is pretty easy
<voidspace> babbageclunk: the important thing is getting the network setup, and then when you install ubuntu on the maas controller you need to manually configure networking
<voidspace> babbageclunk: i.e. pick an ip address on the subnet of the network you created in virt-manager
<voidspace> babbageclunk: for me, the virt-manager network is 172.16.0.0/24 and I give the controller 172.16.0.2
<babbageclunk> voidspace: ok, thanks.
<babbageclunk> voidspace: stupid question: despite that link saying 13.10, there's nothing wrong with using 15.10, right?
<voidspace> babbageclunk: for maas 1.9 I use trusty (14.04)
<voidspace> babbageclunk: for maas 2.0 you will need to use xenial (16.04)
<voidspace> babbageclunk: 15.10 would be fine for maas 1.9 though - possibly better, not sure :-)
<babbageclunk> voidspace: thanks
<babbageclunk> voidspace: And just grab the current daily desktop image for xenial from here? Or is there a better source?
<babbageclunk> voidspace: oops http://cdimage.ubuntu.com/daily-live/current/
<voidspace> babbageclunk: yeah, start with the daily - update/upgrade dance will pull in changes anyway
<voidspace> babbageclunk: except use the server image not desktop
<babbageclunk> voidspace: Where can I find the server one?
<voidspace> ah
<voidspace> hang on
<voidspace> let me google that for you...
<thumper-afk> o/ voidspace
<thumper-afk> voidspace: calling
<babbageclunk> voidspace: ok, I did that instead, sorry!
<voidspace> thumper-afk: I couldn't find the tab that was making the noise!
<thumper-afk> heh
<voidspace> thumper-afk: can we do it in a hangout so babbageclunk can join?
<thumper-afk> ack
<thumper> voidspace: was trying
<thumper> voidspace: make one
<voidspace> thumper: babbageclunk: https://plus.google.com/hangouts/_/canonical.com/juju-maas2
<babbageclunk> voidspace: incoming
<voidspace> babbageclunk: where are you punk?
<babbageclunk> voidspace: "requesting to join video call"
<voidspace> babbageclunk: can you not join with your canonical id?
<menn0> babbageclunk: hi!
<menn0> babbageclunk: on the first screen when you join a hangout, the Google account in use is shown in a small font at the bottom. It's often wrong when you click an arbitrary link to a hangout.
<menn0> but you can switch it
<babbageclunk> menn0: Hi! Yeah, I worked that out after a bit.
<menn0> babbageclunk: it took me weeks, so you're doing well :)
<babbageclunk> menn0: On now, not understanding a whole lot! Reminds me of the first meetings I went to at BATS.
<menn0> babbageclunk: that's quite normal
<dooferlad> voidspace, dimitern, babbageclunk: standup?
<voidspace> dooferlad: in a meeting with thumper right now
<dooferlad> voidspace: everyone?
<voidspace> dooferlad: me and babbageclunk
<dooferlad> voidspace: is it relevant to me, or should I wait until you are done?
<voidspace> dooferlad: dimitern: postpone a little bit please
<voidspace> dooferlad: it's about 2.0
<mup> Bug #1560888 opened: bootstrap aws fails but leaves instance alive <juju-core:New> <https://launchpad.net/bugs/1560888>
<voidspace> dooferlad: maas 2.0 I mean
<voidspace> dooferlad: so not relevant to you I don't think
<dooferlad> voidspace: ack
<voidspace> dooferlad: dimitern is on holiday now
<dooferlad> voidspace: ah
<mup> Bug #1560920 opened: State.RestoreInfoSetter code/tests woefully inadequate <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1560920>
<babbageclunk> voidspace, thumper - where'd everyone go?
<voidspace> babbageclunk: we're there...
<voidspace> babbageclunk: you're a square
<voidspace> babbageclunk: we're done
<babbageclunk> voidspace: ok, won't join back in!
<voidspace> babbageclunk: liar
<thumper> right, off to bed
<thumper> later folks
<voidspace> thumper: g'night and thanks
<wallyworld> axw: found restore issue, fixing
<jam> perrito666: so because of the earlier problems I went ahead and added the push hook to my work area. But it turns out that go-1.6 is slow enough with "vet" && "build" that half the time the push fails because it takes too long to run verify.bash
<jam> yay
<jam> so it tends to be more of a "git push" ^C, ./scripts/verify.bash, "git push --no-verify"
<axw> wallyworld: what was it?
<axw> never mind, I see your PR
<wallyworld> axw: found another issue - i think i need to set the current model to admin also
<wallyworld> or else bootstrap eventually fails after ages
<wallyworld> still need to diagnose that
<axw> wallyworld: ok. your changes LGTM, I'll be around for a little longer if you want me to review
<axw> not a lot longer tho
<wallyworld> axw: np, ty
<wallyworld> axw: it gets all the way through bootstrap and eventually times out after starting jujud agent, so hopefully is a small problem
<wallyworld> axw: retesting, i think i need to set the original controller uuid into the cfg used to create the environ
<axw> wallyworld: is NewGetBootstrapConfigFunc not already doing that?
<axw> wallyworld: when you restore, it's expected that your controller details on disk match what's in the backup
<wallyworld> let me check
<axw> wallyworld: (one of the limitations I'd like to get away from, by making backups portable)
<wallyworld> axw: i was just looking at the symptoms in the log info messages, which were that the code to open the admin model after the agent starts simply fails
<wallyworld> claiming the model cannot be found
<wallyworld> was hoping to avoid detailed debugging, but seems like i lost that bet
<axw> wallyworld: double check that the ca-cert and private key extracted from metadata are valid?
<wallyworld> axw: it finally completed this time after i set current model to "admin", yeah i missed where we were setting the controller uuid
<axw> wallyworld: huh, ok
<wallyworld> so it didn't like having current model set to a model with an invalid uuid
<axw> wallyworld: I see. because you were on the default model I guess?
<wallyworld> yup
<axw> and it didn't match what was in the new hosted model
<wallyworld> but jeez, it took a long time after the agent started
<wallyworld> yep
<wallyworld> cause we generate a new one
<wallyworld> hence my todo to clean that up
<wallyworld> i'd like to support multi-model properly
<axw> wallyworld: yeah, needs an overhaul. we shouldn't need anything on the client except the backup file
<wallyworld> axw: sadly CI is part way through an admin-controller-model branch run, so then it will run master etc, so it will be ages before we see another run
<wallyworld> yep
<axw> joy
<wallyworld> needs to be rewritten :-(
<wallyworld> for 2.1 i guess
<wallyworld> axw: nearly http://reports.vapour.ws/releases/3807
<wallyworld> not sure why the failing joyent tests aren't listed
<wallyworld> axw: TestAddUserAndRegister seems to be failing regularly on windows sadly
<axw> wallyworld: doh. so that and restore are the last things it appears?
<voidspace> babbageclunk: back by the way - just making tea for the carpet cleaner man
<wallyworld> axw: yeah, i need to check the joyent logs since those tests are shown as failing
<voidspace> babbageclunk: and then seeing how bad it is trying to merge master onto drop-maas-1.8
<wallyworld> axw: that windows test could be this branch, need to check
<voidspace> babbageclunk: if dimitern landed the multi-nic support on master it may be very bad...
<voidspace> babbageclunk: in which case I'll postpone
<babbageclunk> voidspace: Having trouble getting the maas controller set up - can install and configure everything, but then when it reboots post install I just get a weirdly corrupted screen.
<wallyworld> axw: ah ha
<wallyworld> validating "credentials" credential for cloud "joyent": unknown key "manta-key-id"
<axw> wallyworld: doh :)
<wallyworld> we need to get the creds updated
<axw> wallyworld: I've gotta go. I have a bunch of changes to make the error messages for no-current-controller and no-current-model more readable, but need to update tests before I can propose
<wallyworld> axw: np, ty, see you tomorrow
<axw> bonne nuit
<wallyworld> Gute Nacht
<axw> well that's not a very nice thing to say
 * axw actually goes
<jamespage> jam, anastasiamac: hey - so smoser and I are discussing what the virt attribute should be set to for the multi-hypervisor openstack cloud stuff
<jamespage> 'lxd' and 'kvm' is my preference - what do you think? it's in line with virt terms used by juju already...
<jamespage> but that is a deviation from anastasiamac changes which are currently lxd and qemu in goose I think...
<anastasiamac> jamespage: I have not changed goose
<anastasiamac> jamespage: in juju we are expecting "lxd" and "qemu"
<anastasiamac> jamespage: if u'd like different values, it's a simple change but change nonetheless (on our end) \o/
<jamespage> anastasiamac, yeah I understand that - just trying to see if we can get some consensus on what these values should be
<anastasiamac> jamespage: note that you can specify anything you like in simple streams
<jamespage> as we already use virt types in other parts of juju
<jamespage> anastasiamac, how does juju map to the stream data ?
<anastasiamac> jamespage: but if u want to filter and use it in constraints, these are the values :D
<jamespage> anastasiamac, right
<anastasiamac> jamespage: each provider can have different values
<anastasiamac> jamespage: for ec2, it's "pv" and "hvm"
<anastasiamac> jamespage: for azure, "Hyper-V", etc..
<jam> anastasiamac: http://www.innervoice.in/blogs/2014/03/10/kvm-and-qemu/
<jam> kvm != qemu AFAICT
<anastasiamac> jamespage: so, to answer ur q, we don't really "map" so much as "compare" what we see in simple streams with what we expect on instance type and what was given as a constraint
<anastasiamac> jam: jamespage: I have only used the values as "lxd" and "qemu" since this is what you gave :D I understood it as a requirement :P
<jam> jamespage: virtmanager seems to treat them differently, though it looks like its how it runs it, not the actual file format.
<jamespage> yes that is the case
<jam> so as anastasiamac says, we're probably agnostic
<jam> name it what you want, and ask Juju to run the one that you named it.
<jam> jamespage: as for LXD, its support for image formats seems to actually want to have a root.tar.gz + some extra metadata
<anastasiamac> jam: this would b true if we did not validate constraints...
<jam> anastasiamac: ah, we restrict what you can pass in?
<jam> http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:download.sjson
<jam> you can see there are 2 files
<jam> root.tar.gz and lxd.tar.xz are what the LXD codebase reads
<anastasiamac> jam: we validate constraints if they are given
<jamespage> jam: well that's not quite true for lxd in openstack - we just use the root.tar.gz
<jamespage> jam: the other part of this change is populating stream data in the openstack cloud with appropriate virt attributes
<jam> jamespage: so *I* don't have any clue what the extra data is, just that if you do "lxc launch ubuntu:trusty" it grabs both files, combines them somehow and validates the LXD hash matches the "combined_sha256"
<anastasiamac> jam: jamespage: if u populate virt in simple streams with "lxd" and "qemu" now, juju tip is ready for it :D
<jamespage> anastasiamac, yes I understand that
<jamespage> sorry - my question is about what this should be, not what we have right now...
<jamespage> jam, anastasiamac: we do juju deploy --to lxd:0 or --to kvm:0
<jam> jamespage: kvm does feel like it is going to match up better with what other people are going to be using.
<jamespage> should this be consistent with that....
<jam> jamespage: correct "kvm:0" and "lxd:0"
<jamespage> yup
<jam> jamespage: but that is for containers inside the machine
<jamespage> so this is a 'what is the juju experience question...'
<jam> if Openstack has multiple hypervisors
<jam> I have no idea what the exact syntax is
<jamespage> juju deploy mysql --constraints="virt-type=lxd"
<jamespage> juju deploy mysql --constraints="virt-type=kvm"
<jamespage> I think
<rick_h_> jam: jamespage anastasiamac I think it should be kvm please
<jam> juju deploy --constraints virt=kvm
<jam> jamespage: I wonder if it is confusing to have both syntaxes
<rick_h_> jam: jamespage anastasiamac we're talking and folks are asking for kvm and trying to map qemu isn't a win for clarity/etc.
<jamespage> rick_h_, that was my take...
<jamespage> but wanted to chat :-)
<jam> jamespage: :). I would tend to agree, but I would actually focus on what people think of in Openstack terms vs Juju terms.
<jamespage> I think that they think kvm or lxd
<jamespage> not lxc or qemu
<rick_h_> jam: exactly, all our RFI and such ask for KVM support
<jamespage> LXD vs KVM after-all
<jamespage> ;-)
<jam> virt-type=cgroup-machine-containers ?
<rick_h_> jam: and our whole discussion is kvm, lxd, and hyper-v
<jamespage> ohh now that is a good question....
<rick_h_> lol
<jamespage> anastasiamac, jam: the test cloud is currently populated with 'lxd' and 'kvm' for virt attributes
<natefinch> rogpeppe, ericsnow: https://github.com/juju/charmrepo/pull/80 - changed GetResource to return bytes + full metadata, using multipart body.
<jamespage> if we're agreed that's all good then I'll ask smoser to land my simplestreams changes...
<anastasiamac> jamespage: "lxd" and "kvm" are reasonably awesome. I'll change values on juju side  \o/
<rick_h_> ty anastasiamac
<jamespage> anastasiamac, +1 great
<voidspace> babbageclunk: it wasn't multi-nic support that had landed, just a trivial conflict
<rogpeppe> natefinch: do you really think the multipart thing is less ugly than using headers?
<babbageclunk> voidspace: oh, nice.
<anastasiamac> jamespage: can I keep access to my wonderful cloud for a while longer - like a week more :D?
<jamespage> anastasiamac, sure
<voidspace> babbageclunk: I've pushed and run tests, about to create PR
<anastasiamac> jamespage: tyvm!
<rogpeppe> natefinch: you do know that encoding/json always reads all the bytes of the object into its buffer before parsing, right?
<babbageclunk> voidspace: I recreated my controller, but I still get weird graphical corruption after reboot
<babbageclunk> voidspace: sent you a screenshot
<voidspace> babbageclunk: not got the screenshot, but...
<natefinch> rogpeppe: gah, no, I didn't realize it just reads the whole thing..
<voidspace> babbageclunk: I sometimes get that and a "force reset" cures it
<voidspace> babbageclunk: however, I usually don't bother - because I boot without opening the display and then ssh in
<natefinch> rogpeppe: I hoped it was smarter than that... though it may be a technical limitation, I guess
<natefinch> rogpeppe: and yes, I think the multipart thing is way less ugly
<voidspace> babbageclunk: yeah, that looks like what I sometimes get - see if a "force reset" helps
<voidspace> babbageclunk: alternatively wait until CPU activity settles down and try ssh'ing in anyway
<rogpeppe> natefinch: well, we're gonna read everything into memory anyway, right?
<rogpeppe> natefinch: so we're O(n) space anyway
<natefinch> rogpeppe: I figured if we accidentally try to deserialize a 100 meg zip file, it would bail early and not load 100 megs into RAM
<rogpeppe> natefinch: tbh I like endpoints that I can easily use with command-line tools, and multipart responses are awkward
<rogpeppe> natefinch: hopefully we're talking to the server we expect to be talking to
<rogpeppe> natefinch: we could accidentally try to deserialize a 100 meg JSON object...
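A minimal sketch of the trade-off under discussion: json.NewDecoder reads straight off the reader, but still buffers the entire JSON value before parsing, so on its own it does not bound memory. Wrapping the body in an io.LimitReader (cap chosen arbitrarily here) is one way to bail out early on an oversized response, though it gives up the buffered raw bytes that the existing csclient error messages rely on.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

const maxErrorBody = 1 << 20 // 1 MiB cap; an illustrative choice

// decodeError parses a JSON error document without first copying the whole
// body into a buffer, refusing to read more than maxErrorBody bytes.
func decodeError(body io.Reader) error {
	var perr struct {
		Code    string `json:"code"`
		Message string `json:"message"`
	}
	if err := json.NewDecoder(io.LimitReader(body, maxErrorBody)).Decode(&perr); err != nil {
		return fmt.Errorf("cannot unmarshal error response: %v", err)
	}
	return fmt.Errorf("%s: %s", perr.Code, perr.Message)
}

func main() {
	body := strings.NewReader(`{"code":"not found","message":"no such charm"}`)
	fmt.Println(decodeError(body))
}
```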
<natefinch> rogpeppe: it may be more difficult to use some CLI tools with the endpoint, but it makes the code - which is what we maintain - way, way simpler, and I think that's well worth it.
<rogpeppe> natefinch: +79 -51
<rogpeppe> natefinch: really simpler?
<natefinch> rogpeppe: I didn't add headers for all the other fields we were going to need to support the full resource
<rogpeppe> natefinch: wouldn't it be better just to have another endpoint to return resource metadata?
<rogpeppe> natefinch: which would be consistent with the way we treat charm archives, for example
<natefinch> rogpeppe: I was just saving us a round trip
<rogpeppe> natefinch: i don't think it's worth it
<rogpeppe> natefinch: send 'em both concurrently if you care
<babbageclunk> voidspace: gah, why does my irc connection keep dropping?
<natefinch> rogpeppe: I'm still unclear on why you think this is worse.  Just because of the multipart stuff?
<rick_h_> natefinch: is there an api call to just get the binary and one to just get the metadata?
<rick_h_> natefinch: e.g. how can the gui load the metadata and then offer a download link for just the bytes to the browser through the api?
<rogpeppe> natefinch: mostly API consistency. it's something we don't do currently, and i'm not sure there's enough reason to set a precedent here.
<rogpeppe> natefinch: and yes, the multipart stuff seems ugly to me.
<natefinch> rick_h_: that's a good point. Certainly having an endpoint for just the metadata seems like it would generally be useful
<rick_h_> natefinch: yes, clients will want to get at that without the bytes please
<voidspace> babbageclunk: no idea - did you get my messages?
<voidspace> babbageclunk: can you ssh in despite the corruption?
<natefinch> rick_h_: we have an endpoint that returns all the metadata for all resources for a charm... that's likely what the GUI would use, btw.  But we can add an endpoint for a one-by-one call, too
<babbageclunk> voidspace: I got some messages...
<babbageclunk> voidspace: yup, I can ssh in
<rogpeppe> natefinch: BTW I think it might be unnecessary to provide a way to get the resource without specifying a revision.
<rick_h_> natefinch: yes, should be filterable at least as we attempt to do things like allow upload/current data, etc. on specific resources
<voidspace> babbageclunk: you don't need the display anyway
<rick_h_> natefinch: at some point in time we'll want to show the charmstore history of a resource
<natefinch> rogpeppe: yes, I assumed we'd always be requesting metadata for a specific revision of a resource
<babbageclunk> voidspace: Sweet, not blocked on that anymore.
<rogpeppe> natefinch: the client code doesn't seem to assume that
<rogpeppe> natefinch: (lines 279, 293 in the old file)
<voidspace> babbageclunk: if you've installed maas you should be able to get to it via the web ui
<natefinch> rogpeppe: oh, good point.. that's actually old code that should have been removed.  At one point I had toyed with allowing -1 for a revision, but we decided it wasn't really necessary
<voidspace> babbageclunk: probably at http://172.16.0.2/MAAS
<natefinch> rogpeppe: (there was more surrounding code to support it, I just forgot to remove that part evidently)
<rogpeppe> natefinch: that also removes the need for the revision header
<anastasiamac> mattyw: ping
<mattyw> anastasiamac, hey there, have a review for me?
<natefinch> rogpeppe: yep
<rogpeppe> natefinch: probably good to leave the hash header in there to match what archives do
<anastasiamac> mattyw: read my mind \o/ trivial plz - https://github.com/juju/juju/pull/4875
<anastasiamac> mattyw: oops, here is rb link http://reviews.vapour.ws/r/4324/
<natefinch> rogpeppe: also useful to be able to double check that you got the right data
<rogpeppe> natefinch: exactly.
<mattyw> anastasiamac, I normally ignore people that don't send rb links, but I'll make an exception ;)
<rogpeppe> natefinch: and a header seems like a good fit for that
<natefinch> rogpeppe: definitely
<anastasiamac> mattyw: I sent both just to be special :D
<mattyw> anastasiamac, if you're sure that's the fix then LGTM
<natefinch> rogpeppe: so I'm going to rename GetResource to DownloadResource, to make it more obvious, and make a new GetResource that returns metadata
<rogpeppe> natefinch: I think GetResource is still right
<rogpeppe> natefinch: it's consistent with GetArchive
<anastasiamac> mattyw: tyvm \o/
<rogpeppe> natefinch: the resource info will probably be a meta endpoint, right?
<natefinch> rogpeppe: yes
<natefinch> rogpeppe: that doesn't help me name the function in the client wrapper though :)
<rogpeppe> natefinch: and hence amenable to Meta
<natefinch> rogpeppe: GetResourceMeta?
<rogpeppe> natefinch: no, I don't think you need a new API entry point
<rogpeppe> natefinch: the existing Meta method should be up to the job
<rogpeppe> natefinch: well... maybe
<rogpeppe> natefinch: probably not actually
 * rogpeppe tries to think of a decent interface that allows bulk getting of heterogenous meta endpoints with dynamically specified paths
<rogpeppe> natefinch: failing that, ResourceInfo would work OK as a name
<natefinch> rogpeppe: ResourceMeta perhaps
<natefinch> rogpeppe: to run against id/meta/resource
<rogpeppe> natefinch: yeah, maybe
<rogpeppe> natefinch: or just don't make an entry point yet
<rogpeppe> natefinch: and use Get directly
<rogpeppe> natefinch: then you can tailor your call to how you actually end up using it
<katco> rogpeppe: i'm afraid i've kidnapped him into a meeting :)
<rogpeppe> natefinch: because then you have freedom to get as much metadata on as many charms as you like in one request, including arbitrary resources
<rogpeppe> katco: how could you?! :)
<katco> rogpeppe: hehe ;p
<babbageclunk> voidspace: ok, I've created an admin and connected to the web gui
<babbageclunk> voidspace: the doc says to run maas-import-pxe-files, but that doesn't exist. Are they already downloaded?
<dooferlad> babbageclunk: hey, did you get anywhere with a KVM MAAS? https://github.com/dooferlad/kvm_maas is a helper script that me and jam put together with a link to some instructions to get going
<dooferlad> babbageclunk: they are imported by MAAS via the GUI
<voidspace> babbageclunk: they aren't, but you can start importing boot images from the web ui
<voidspace> babbageclunk: there should be a tab in the UI for images
<dooferlad> babbageclunk: http://maas.ubuntu.com/docs/install.html#post-install-tasks
<babbageclunk> dooferlad: Ooh, that looks useful - just setting up the controller at the moment, then I'll have a go with this.
<voidspace> babbageclunk: great, my controller hard disk is corrupted
<ericsnow> natefinch, rogpeppe: +1 on ResourceInfo (and leave GetResource as-is)
<rogpeppe> ericsnow: i'm suggesting not doing ResourceInfo for now
<rogpeppe> ericsnow: leave it like most of the other endpoints
<ericsnow> rogpeppe: a meta endpoint for a specific resource revision isn't an option, no?
<babbageclunk> voidspace: doh.
<rogpeppe> ericsnow: if we end up with lots of places getting resource info, then let's do it then
<rogpeppe> ericsnow: as then we'll know what the common pattern is
<babbageclunk> voidspace, dooferlad - ok, thanks - was trying to follow the old (but KVM-specific) doc rather than the newer one
<rogpeppe> ericsnow: i'm sorry that'll mean you need to write less code :)
<jam> cherylj: tych0: two new reviews for LXD related patches http://reviews.vapour.ws/r/4318/ and http://reviews.vapour.ws/r/4323/ and if cherylj you could let me know if it is ok to land on Master
<ericsnow> rogpeppe: lol
<jam> they are primarily testing fixes for now.
<cherylj> jam: I'd really like to hold off until after we ship beta3
<ericsnow> rogpeppe: TBH, we'll still write a ResourceInfo method either way--either on the client or on a wrapper around the client
<jam> cherylj: so should I create a feature branch for LXD stuff? I'd like to build off my work, though I guess if we get beta3 out Friday it won't be sitting particularly long
<rogpeppe> ericsnow: the thing to avoid is doing that and then looping through lots of resources calling that method
<cherylj> jam: we're trying to get things out tomorrow
<tych0> jam: both look fine to me
<cherylj> jam: you can create a fb if you'd like.  It won't get a CI run until after beta3
<ericsnow> rogpeppe: we only need it for downloading, which isn't something we'll do for many-at-once
<ericsnow> rogpeppe: you're saying we should support a bulk call for getting resource info at a specific revision?
<rogpeppe> ericsnow: tbh, if you're just doing it in one place, why not just do it in that one place?
<ericsnow> rogpeppe: makes sense
<rogpeppe> ericsnow: well, there are many possibilities
<rogpeppe> ericsnow: and i *think* at the moment my inclination is to make the user of csclient decide how best to do it
<ericsnow> rogpeppe: regardless, there isn't a point to doing the multi-part GetResource stuff :)
<rogpeppe> ericsnow: indeed
<ericsnow> rogpeppe: so your suggestion is that users use the Get endpoint to do bulk calls that accomplish the same thing as a hypothetical ResourceInfo endpoint?
<rogpeppe> ericsnow: yes, until it seems that there's a pattern that can be factored out
<ericsnow> rogpeppe: I'll have to take another look at it but I didn't see how the Get endpoint could accommodate resources at specific revisions
<rogpeppe> ericsnow: one mo
<ericsnow> rogpeppe: np
<babbageclunk> dooferlad: Do I need to install a rack controller on my maas controller?
<dooferlad> babbageclunk: I haven't looked at MAAS 2 really. voidspace should be able to answer that one.
<babbageclunk> dooferlad: ahh - am I getting confused between docs for 2.0 and 1.x?
<voidspace> babbageclunk: yes
<voidspace> babbageclunk: for 2.0 you do
<voidspace> babbageclunk: and probably you are confused, yes
<ericsnow> babbageclunk: welcome, BTW :)
<ericsnow> babbageclunk: I'm hoping things have gone smoothly for you :)
<babbageclunk> voidspace: ok, and for 1.x how do I set up DHCP?
<voidspace> babbageclunk: let me look at the docs - my MAAS is down at the moment
<babbageclunk> ericsnow: Hi! Yes, so far. Although I'm getting confused by MAAS stuff at the moment
<voidspace> babbageclunk: you have to tell MAAS to manage DHCP, but I can't recall how to do that for MAAS 1.9
<voidspace> I can remember for 2.0
<voidspace> babbageclunk: if you get MAAS installed and setup in one day you're doing better than most of us managed the first time
<voidspace> although to be fair it has improved since I did it the first time
<ericsnow> babbageclunk: be careful; if you figure it out then you're responsible for explaining it to people <wink>
<voidspace> babbageclunk: http://maas.ubuntu.com/docs1.9/cluster-configuration.html
<voidspace> babbageclunk: so, under the "clusters" tab you should see the interface with the subnet
<voidspace> babbageclunk: you need to edit that
<rogpeppe> ericsnow: something like this could work, though I'm not sure of the exact proposed endpoints: http://paste.ubuntu.com/15479784/
<voidspace> babbageclunk: and put in "appropriate values" for everything - including changing it from unmanaged to having maas manage dhcp
<voidspace> babbageclunk: for the dynamic range do something like 172.16.10 -> 172.16.128
<voidspace> babbageclunk: and then for static range 172.16.0.129 -> 172.16.0.255
<rogpeppe> ericsnow: then the Meta map would have an entry for each requested resource
<mup> Bug #1560624 changed: cmd supercommand.go:448 failed to bootstrap model: no matching tools available <cdo-qa> <juju-release-support> <juju-core:Triaged> <https://launchpad.net/bugs/1560624>
<ericsnow> rogpeppe: ah, meta endpoints can use a path (in the includes)
<voidspace> babbageclunk: you still here?
<rogpeppe> ericsnow: yeah, the idea is that you can ask for info on a particular meta endpoint with $id/meta/resource/$name-$rev
<rogpeppe> ericsnow: it's the same deal as it is now with the extra-info endpoint
<babbageclunk> voidspace: sorry, was reading docs - none of this matches what I see in the gui
<voidspace> babbageclunk: what do you have in the GUI?
<voidspace> babbageclunk: what version are you running?
<voidspace> babbageclunk: are you setting up 1.9 or 2.0 - I just gave you the instructions for 1.9
<ericsnow> rogpeppe: that's specific to a single charm URL though, right?  how would a single call for multiple charms, each with different resources, work?
<babbageclunk> voidspace: on the clusters tab, I see 1 cluster called "Cluster master"
<rogpeppe> ericsnow: one mo
<babbageclunk> voidspace: 1.8.3
<babbageclunk> voidspace: :(
<voidspace> babbageclunk: nice :-)
<rogpeppe> ericsnow: ah, you can't do that
<rogpeppe> ericsnow: well, you kinda could
<voidspace> babbageclunk: you probably need to add a ppa and update maas to 1.9
<ericsnow> rogpeppe: not that *we* need that (right now)
<voidspace> babbageclunk: it should be a clean upgrade though
<babbageclunk> voidspace: but even after changing the url to have 1.8, the docs don't match
<voidspace> babbageclunk: what OS did you go for?
<babbageclunk> voidspace: 15.10
<voidspace> babbageclunk: can you click on the cluster master?
<babbageclunk> voidspace: Ah, ok - should I edit the interface in there?
<voidspace> babbageclunk: yes, I missed that step out - sorry
<voidspace> babbageclunk: I think the ppa you want is ~maas/stable
<rogpeppe> ericsnow: if the meta endpoint doesn't exist, it's just omitted from the result. so a request like /meta/any?id=$id1&id=$id2&include=resource/$r1&include=resource/$r2 where the resources are the union of all the required resources would work
<voidspace> babbageclunk: sudo add-apt-repository ppa:maas/stable
<rogpeppe> ericsnow: you'd potentially get some extra data in the results though
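For illustration, a minimal Go sketch of building the bulk meta/any query described above; the helper and the base URL are assumptions for the example, not csclient code:

    package main

    import (
        "fmt"
        "net/url"
    )

    // bulkMetaURL builds a charm store meta/any query that asks for the
    // given include paths across several charm ids, matching the URL
    // shape in the discussion.
    func bulkMetaURL(base string, ids, includes []string) string {
        v := url.Values{}
        for _, id := range ids {
            v.Add("id", id)
        }
        for _, inc := range includes {
            v.Add("include", inc)
        }
        return base + "/meta/any?" + v.Encode()
    }

    func main() {
        // Charm ids and resource names here are placeholders.
        fmt.Println(bulkMetaURL("https://api.jujucharms.com/charmstore/v5",
            []string{"wordpress", "mysql"},
            []string{"resource/website", "resource/db"}))
    }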
<voidspace> babbageclunk: then apt-get update/upgrade dance should work
<ericsnow> rogpeppe: well, the "resource" meta endpoint could support identifying the charm in the meta path
<rogpeppe> ericsnow: no
<ericsnow> rogpeppe: :)
<rogpeppe> ericsnow: well, not if it's like all the other meta endpoints
<ericsnow> rogpeppe: k
<ericsnow> rogpeppe: we can cross that bridge later if it comes up :)
<rogpeppe> ericsnow: 'cos a meta endpoint is directly associated with an entity
<rogpeppe> ericsnow: in the end, just make several concurrent requests...
<ericsnow> rogpeppe: sounds good
<ericsnow> rogpeppe: thanks for all the help
<rogpeppe> ericsnow: we *should* support HTTP2 in the near future so that becomes not a great deal worse than making a single request
<rogpeppe> ericsnow: np
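A minimal sketch of the "several concurrent requests" approach, using plain net/http and a WaitGroup; the URL is a placeholder and a real client would collect errors rather than drop them:

    package main

    import (
        "fmt"
        "io/ioutil"
        "net/http"
        "sync"
    )

    // fetchAll GETs every URL concurrently and returns the bodies that
    // succeeded, keyed by URL.
    func fetchAll(urls []string) map[string][]byte {
        var (
            mu      sync.Mutex
            wg      sync.WaitGroup
            results = make(map[string][]byte)
        )
        for _, u := range urls {
            wg.Add(1)
            go func(u string) {
                defer wg.Done()
                resp, err := http.Get(u)
                if err != nil {
                    return // a real client would report this
                }
                defer resp.Body.Close()
                body, err := ioutil.ReadAll(resp.Body)
                if err != nil {
                    return
                }
                mu.Lock()
                results[u] = body
                mu.Unlock()
            }(u)
        }
        wg.Wait()
        return results
    }

    func main() {
        bodies := fetchAll([]string{
            "https://api.jujucharms.com/charmstore/v5/wordpress/meta/any",
        })
        fmt.Println(len(bodies), "responses")
    }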
<babbageclunk> voidspace: ok - upgrading to 1.9 now
<voidspace> babbageclunk: cool, are you writing this up by the way?
<babbageclunk> voidspace: yup, keeping notes
<voidspace> babbageclunk: great, thanks
<rogpeppe> ericsnow: i'm thinking of adding a more general metadata request API to csclient; something like this perhaps: http://paste.ubuntu.com/15479991/
<ericsnow> rogpeppe: I like how that encapsulates the functionality clearly
<babbageclunk> voidspace: ok, on 1.9 now, that should reduce confusion!
<voidspace> babbageclunk: cool
<voidspace> babbageclunk: I'm going to reboot and fsck in the hope of recovering my maas 2.0 install
<voidspace> babbageclunk: hmmm... maybe I have an alternative 2.0 install
<voidspace> babbageclunk: I still need to fsck, but maybe not immediately
<voidspace> babbageclunk: I have two versions of MAAS 2.0 - one from their next-proposed ppa and one from an experimental ppa
<voidspace> it's my next-proposed one that is hosed, looks like experimental is still running
<voidspace> at least with that I can make API calls
<voidspace> which I need for the design doc I'm writing
<rogpeppe> ericsnow: then your bulk resource request could look like this: http://paste.ubuntu.com/15480026/
<ericsnow> rogpeppe: nice
<voidspace> nope, dammit - that's hosed and can't even apt update
<rogpeppe> ericsnow: going to lunch now
<voidspace> nor see any boot images or boot sources
<voidspace> fsck
<ericsnow> rogpeppe: good luck <wink>
<mup> Bug #1561023 opened: charmstore v5.WillIncludeMetadata gccgo build failure <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1561023>
<babbageclunk> voidspace: stink
<voidspace> and after a reboot I can update
<voidspace> fair enough
<voidspace> babbageclunk: once you have MAAS running you need to enlist and commission a couple of nodes
<voidspace> babbageclunk: you don't need to install an OS on them as commissioning does that anyway
<voidspace> hah, except after a reboot maas doesn't seem to be running
<voidspace> will upgrade and re-reboot
<babbageclunk> voidspace: do I need to set up DHCP/DNS management?
<voidspace> babbageclunk: I will have to go get my daughter from school soon
<voidspace> babbageclunk: yes
<voidspace> babbageclunk: that's needed for the nodes to pxe boot
<mgz> rogpeppe, cherylj: review please, https://github.com/juju/charmstore/pull/582
<voidspace> babbageclunk: do that via the cluster controller
<babbageclunk> Cool
<cherylj> mgz: out of curiosity, were you able to verify that it fixes the issue on one of the ppc slaves?
<mgz> cherylj: yeah, it compiles with that change
<cherylj> mgz: sweet
<marcoceppi> katco: natefinch I tried resources! I got errors
<natefinch> marcoceppi: doh
<katco> marcoceppi: :( otp but what happened? ericsnow ^^^
<marcoceppi> katco natefinch false alarm!
<natefinch> marcoceppi: not an error, a feature? ;)
<marcoceppi> oh man
<marcoceppi> it worked
<marcoceppi> but I goofed
<marcoceppi> stupid github
<marcoceppi> hehehe this is so awesome
<natefinch> marcoceppi: awesome! :)
<katco> marcoceppi: :D
<voidspace> brb
<natefinch> marcoceppi: https://www.youtube.com/watch?v=9cQgQIMlwWw&t=6
<marcoceppi> dis mai jam
<babbageclunk> voidspace: Hmm, my new node doesn't want to PXE boot.
<mgz> cherylj: https://github.com/juju/juju/pull/4878
<babbageclunk> voidspace: It gets a sensible-looking ip address and shows some messages including "Booting under MAAS direction"...
<babbageclunk> voidspace: but then there's a message saying: Loading ubuntu/amd64/generic/trusty/no-such-image/boot-kernel... failed: No such file or directory
<marcoceppi> natefinch katco it's a bit unwieldy, but works pretty sweet: juju deploy ../../build/charm-svg --series trusty --resource webapp=$HOME/Projects/svg.juju.solutions.tar.gz --resource python-jujusvg=$HOME/.go/bin/python-jujusvg
<babbageclunk> voidspace: which is weird, because the image I downloaded on the cluster was 15.10 (wily) rather than trusty
<natefinch> marcoceppi: nice.
<babbageclunk> voidspace: trying with a trusty image downloaded as well
<voidspace> babbageclunk: you need trusty downloaded for enlisting I think
<babbageclunk> voidspace: that would do it
<voidspace> babbageclunk: I have maas 2.0 with an enlisted node
<voidspace> so I can query the machines api (via the command line interface) and see the result
<voidspace> which gets me unblocked for the moment
<babbageclunk> voidspace: yay, enlisted!
<voidspace> babbageclunk: cool, next commission
<voidspace> babbageclunk: and then you have a successful install
<voidspace> babbageclunk: normally for juju you'll need two machines - one to bootstrap the juju controller to and then one to deploy something to
<voidspace> babbageclunk: but for this work one machine will be enough as it will be a while before we even get a successful bootstrap
<voidspace> babbageclunk: you could setup the power type on the enlisted node so that maas can power it on & off
<voidspace> babbageclunk: but I usually just manually power them on / off
<redir> morning
<babbageclunk> voidspace: what should the power type be?
<voidspace> babbageclunk: virsh
<voidspace> babbageclunk: followed by a connect string and password
<voidspace> babbageclunk: which I can never *remember* how to construct (the connect string)
<voidspace> so I usually don't bother
<voidspace> not hard though
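(For reference, the virsh power address MAAS expects here is a libvirt connection URI; for KVM guests it is typically of the form qemu+ssh://<user>@<host>/system, where the user and host are whatever can reach the hypervisor over SSH.)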
<rogpeppe> mgz: have you raised a gccgo issue about the problem fixed by https://github.com/juju/charmstore/pull/582 ?
<babbageclunk> voidspace: Unfortunately I didn't set the power type before commissioning it. Now the node is off, but VMM can't talk to it
<voidspace> babbageclunk: that's fine if you bootstrap to it you can just manually power it on
<voidspace> babbageclunk: you should be able to add the power details with it switched off though
<babbageclunk> voidspace: bah, scrapped that node and tried again
<mgz> rogpeppe: I was going to try and catch mwhudson about it
<mgz> this should not be a problem for much longer :hope:
<rogpeppe> mgz: if you can make a tiny test case, just report it on golang.org/issue
<babbageclunk> voidspace: yeah, it was just because of the way it turned off after enlisting left <something, either VMM or libvirt> confused about its state - it thought the node was on, so I couldn't start it, but it was actually off, so trying to stop it failed.
<babbageclunk> voidspace: there's probably a proper way to get them back in sync
<voidspace> cool
<babbageclunk> voidspace: no, I'm still stuck. What do I need to do to get the node to move from Commissioning to Ready?
<babbageclunk> voidspace: Is it the power?
<babbageclunk> voidspace: If I just power up the node it just says No bootable device
<voidspace> babbageclunk: ah, that's a maas bug :-/
<voidspace> babbageclunk: it's not commissioned then
<voidspace> babbageclunk: there is a workaround, let me find it
<voidspace> babbageclunk: they're supposed to have fixed that last week
<babbageclunk> voidspace: :(
<babbageclunk> voidspace: in 1.9?
<voidspace> oh
<voidspace> no that's in 2.0
<voidspace> it's commissioning but says no bootable device?
<voidspace> that's weird, it should pxe boot
<babbageclunk> voidspace: that's right
<voidspace> babbageclunk: let me boot 1.9 and see what I get
<babbageclunk> voidspace: aha - the nic is turned off in the boot options.
<voidspace> that would do it
<babbageclunk> voidspace: sweet, that seems to be doing it now
<voidspace> cool
<voidspace> babbageclunk: worth adding a note to your doc
<voidspace> babbageclunk: bbiab
<babbageclunk> voidspace: ok, I'mma try dooferlad's script for adding nodes now.
<mup> Bug #1561088 opened: EnsurePasswordSuite.SetUpTest on windows <blocker> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1561088>
<bogdanteleaga> mgz: http://reviews.vapour.ws/r/4328/
<mgz> bogdanteleaga: ta
<mgz> rogpeppe: I filed https://github.com/golang/go/issues/14931
<mgz> cherylj: ^see bogdan's change
<mgz> bogdanteleaga: you can $$fixes-1561088$$ land that
<bogdanteleaga> mgz, done
<mgz> we'll get a blessed master yet.
<bogdanteleaga> mgz, haha
<mgz> bogdanteleaga: I got it, didn't have the bug high enough prio
<babbageclunk> dooferlad: to run your kmaas script, what should the value of maas_name be?
<cherylj> bogdanteleaga: thank you very much for picking up that bug!
<bogdanteleaga> cherylj, np I almost didn't see the 2nd test suite in the same file
<tych0> ahoy mates
<tych0> what's the status of 2.0 final? i think we have one more non-trivial LXD change coming down the pipe
<tych0> just want to be sure to get it in before 2.0 final is tagged
<rick_h_> tych0: so beta3 is getting put together, final is still a little bit out
<rick_h_> tych0: what kind of change?
<tych0> rick_h_: s/lxcbr0/lxdbr0, basically
<tych0> the actual code change for juju should be pretty minor
<tych0> but the UX will suffer
<tych0> at least for the lxd provider
<rick_h_> tych0: k, when would that go out? I'd just want to keep things in sync so that the 'released' betas of both work together for folks beta testing right now
<tych0> rick_h_: well, we're not going to do it until there is a patch for juju
<tych0> i haven't had time to write the patch yet
<rick_h_> tych0: ok, can you email jam and andrew and have them in sync please?
<tych0> andrew?
<rick_h_> foo...
 * rick_h_ goes to look
<rick_h_> tych0: froboware
<tych0> sure
<tych0> will do
<rick_h_> ty
 * mwhudson waves at mgz
<mgz> mwhudson: wotcha. I filed an upstream bug, apparently it's fixed in gcc 5 and my google fu just sucks
<mgz> was easy enough to work around anyway (bug 1561023)
<mup> Bug #1561023: charmstore v5.WillIncludeMetadata gccgo build failure <ci> <gccgo> <ppc64el> <juju-core:Fix Committed by gz> <https://launchpad.net/bugs/1561023>
<redir> cherylj: ping
<cherylj> redir: hey there
<redir> Hi, is now a good time?
<redir> ^^ if not, ping me when it is cherylj
<cherylj> sure, I have a few minutes
<redir> tx
<mup> Bug #1561212 opened: register logic can lead to user lockout <docteam> <juju-core:New> <https://launchpad.net/bugs/1561212>
<thumper> wallyworld: you around?
<wallyworld> thumper: yeah, in release standup
<thumper> wallyworld: chat when you're done?
<wallyworld> sure
<thumper> rick_h_: you around?
<wallyworld> thumper: 1:1?
<thumper> yeah
<voidspace> thumper: hey, hi
<voidspace> thumper: do you need me to send you an email as well?
<thumper> voidspace: nah, got the docs
<thumper> will attempt to get vmaas setup
<thumper> and poke some ideas around gomaasapi
<thumper> voidspace: do you have anything in a gomaasapi branch yet?
<voidspace> thumper: not yet I'm afraid
<thumper> voidspace: that's fine, I'll start one
<voidspace> thumper: by the end of the day I was trying to sketch out interface design and went through looking at the endpoints we actually used
<voidspace> thumper: less than I thought
<voidspace> thumper: a proper interface design would need to look at the parameters and the data we used from the returned values
<voidspace> thumper: but really that would need knowing the 2.0 endpoints we need to use instead and the structure of the data it returns
<voidspace> thumper: and of course that's not documented so it's a lot of work
<voidspace> thumper: if you get vmaas setup (did you see Christian's doc?) and start on the gomaasapi infrastructure
<voidspace> thumper: I can pick that up in the morning
<voidspace> thumper: and we can fill gomaasapi out as we start using it
<thumper> yep
<thumper> we'll see how far I get today
<voidspace> thumper: the open question is whether to convert to the new API layer for devices/subnets/spaces
<voidspace> thumper: yep, cool - thanks
<voidspace> thumper: right, I won't send this email then - I'm signing off
<voidspace> g'night and good luck :-)
<thumper> voidspace: ack
<voidspace> I burned too much time on one of my maas 2.0 vms which is borked
<voidspace> fortunately I had a spare and that was only partly borked :-/
<thumper> hmm...
<thumper> how do you bork a vm?
 * thumper wonders
<mgz> easiest review evar for https://github.com/juju/juju/pull/4882 plz
<anastasiamac> mgz: LGTM :D
<mgz> anastasiamac: thanks!
<anastasiamac> mgz: and u too for the fix \o/
<mgz> he who didn't notice in review can fix it :D
<anastasiamac> mgz: yep, it's in juju commandments \o/
#juju-dev 2016-03-24
<thumper> http://reports.vapour.ws/releases/3811/job/run-unit-tests-trusty-ppc64el/attempt/4913
<thumper> so need to get off gccgo
<rick_h_> thumper: around now
<mgz> thumper: master merge, dat bug fixed I see bug 1561023
<mup> Bug #1561023: charmstore v5.WillIncludeMetadata gccgo build failure <ci> <gccgo> <ppc64el> <juju-core:Fix Committed by gz> <https://launchpad.net/bugs/1561023>
<thumper> rick_h_: hey, wondering if you wanted to chat about maas2 or not
<thumper> rick_h_: I forwarded you my sentiments
<rick_h_> thumper: yea, saw that while at the boy's school. Haven't given it a careful read yet.
<thumper> rick_h_: so... chat or not?
<rick_h_> thumper: sure
<thumper> https://plus.google.com/hangouts/_/canonical.com/tim-rick?authuser=1
<wallyworld> axw_: we should probably land that rename Server->ResolvedAPIEndpoints so it is locked in
<anastasiamac> wallyworld: axw_: so I want to have 2 controllers, each with a hosted model
<anastasiamac> wallyworld: axw_: so that models in both have the same name
<wallyworld> anastasiamac: juju now creates a hosted model when bootstrapping
<anastasiamac> wallyworld: axw_: (to see if I can kill one and still connect to the other)...
<wallyworld> it defaults to the name "default"
<wallyworld> imaginative right
<wallyworld> but you can also use the --default-model <name> arg
<anastasiamac> wallyworld: exactly what I need!
<wallyworld> the controller model is always called "admin"
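(So, roughly, something like "juju bootstrap test1 joyent --default-model shared" followed by "juju bootstrap test2 joyent --default-model shared" would give two controllers whose hosted models share a name; the controller and model names here are made up, and the flag is as quoted above.)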
<anastasiamac> wallyworld: but I cannot really add more models right as we r updating create-model?
<wallyworld> but you need latest master
<wallyworld> no, you can also use create model
<anastasiamac> yep, the tip :D
<wallyworld> create-model has been enhanced
<wallyworld> it now no longer *requires* credentials if you are admin
<wallyworld> if you are admin the new model inherits the credentials from the admin model
<anastasiamac> wallyworld: really?! even better :D
<wallyworld> but you can also specify different ones
<anastasiamac> wallyworld: i'll give it a spin right now! tyvm
<wallyworld> you used to *have* to always specify credentials each time
<wallyworld> and non-admins still need to
<menn0> thumper: this is something that came out of a discussion I had with fwereade this morning: https://github.com/nylas/stress-tester
<menn0> not that :)
<menn0> this: http://reviews.vapour.ws/r/4332/
<menn0> thumper: ^
<thumper> oh hai
<menn0> wow, that means I hadn't copied anything into the clipboard since last night :)
<thumper> menn0: a question
<menn0> thumper: yes?
<thumper> on the pr
<thumper> well, review
<menn0> ok
<menn0> thumper: responded
<thumper> shipit
<menn0> cheers
<wallyworld> anastasiamac: what's the status of bug 1536792
<mup> Bug #1536792: Some providers release wrong resources when destroying hosted models <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1536792>
<anastasiamac> wallyworld: this is what m testing now... I think it's all good but need to confirm
<anastasiamac> I suspect most of it would have been fixed with tag fixes in joyent :D
<wallyworld> anastasiamac: awesome, i'll mark as resolved in the release notes
<anastasiamac> wallyworld: not yet
<wallyworld> i'm optimistic
<anastasiamac> wallyworld: i'll mark as resolved when m 100% sure
<anastasiamac> gimme an hr or two? :P
<anastasiamac> wallyworld: i have 2 controllers, both with default models
<anastasiamac> i have added a machine to both, so each default model has a machine 0
<anastasiamac> now i try to
<anastasiamac> juju remove-machine 0 -m=default
<anastasiamac> ERROR model local.tags:admin@local:=default not found
<anastasiamac> if i do not set -m, the command runs fine
<anastasiamac> but I can still see the machine in status :(
<wallyworld> ok, we'll need to test and see if there's an issue
<wallyworld> i wonder if it is getting confused because of two model names the same on different controllers
<mup> Bug #1558333 changed: juju's logging the literal "$cmd" instead of value of $cmd <juju-log> <logging> <juju-core:Invalid> <https://launchpad.net/bugs/1558333>
<anastasiamac> oh and m on joyent.. so it could be joyent resources - the bug u quoted above is not quite there yet..
<wallyworld> axw_: have you seen the above issue recently? ^^^^^^
<anastasiamac> machine in neither controller is removed...
<anastasiamac> let me kill one controller...
<anastasiamac> however, maybe it's a non-issue as both machines are shown as "pending" still..
<anastasiamac> so i think that this a joyent issue...
<anastasiamac> i've killed one controller
<anastasiamac> and now, i can still list-controllers
<anastasiamac> but both status and list-models hangs..
<wallyworld> it may be that the current controller is still set to the one killed?
<anastasiamac> wallyworld: so please do not mark this bug as resolved - it clearly isnt
<anastasiamac> no, i've switched controllers once one was successfully killed!
<wallyworld> depends - may just be joyent
<wallyworld> we can put the caveat on there
<wallyworld> we'll need to test with other providers to be sure
<anastasiamac> wallyworld: i will assume it's joyent and will work on this bug 1536792
<mup> Bug #1536792: Some providers release wrong resources when destroying hosted models <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1536792>
<wallyworld> ok, tyvm
<wallyworld> maybe worth a test on aws to be sure
<wallyworld> in case we need to fix modelcmd or something
<anastasiamac> k. i'll run the same on aws just to make sure
<wallyworld> axw_: ping?
<thumper> wallyworld: well... in a relatively short time I have a vmaas setup with one machine commissioned
<thumper> with xenial and maas2
<wallyworld> awesome sauce
<wallyworld> thumper: used those notes?
<thumper> I annotated the doc with my findings
<thumper> yeah
<thumper> pretty trivial
<wallyworld> thumper: well that's another thing - you can't annotate bloody videos
<thumper> :)
<wallyworld> i'll have to do the same
<wallyworld> anastasiamac: probably already on your todo list - i've added a heading in What's New, can you remember to add a section further down with details? New Openstack machines can be provisioned based on virtualisation type
<anastasiamac> wallyworld: sure, so r release notes ready from ur perspective?
<wallyworld> more or less
<wallyworld> still need to do a little more
<anastasiamac> wallyworld: excellent. tyvm
<mup> Bug #1561293 opened: remove machine model flag <juju-core:New> <https://launchpad.net/bugs/1561293>
<wallyworld> axw: hey
<axw> wallyworld: hey
<wallyworld> axw: there's a section already added to release notes on CLI login (for SSO). can you add the local macaroon stuff there or in a subsection as appropriate?
<axw> wallyworld: got the link to notes handy?
<wallyworld> https://docs.google.com/document/d/1ID-r22-UIjl00UY_URXQo_vJNdRPqmSNv7vP8HI_E5U/edit#
<wallyworld> axw: also see if i've forgotten something. we've added a lot
<axw> wallyworld: sure
<wallyworld> axw: lgtm also, much nicer, ty
<axw> wallyworld: thanks
<wallyworld> axw: did you want to do that rename as a drive by?
<wallyworld> so it lands together in one go
<axw> wallyworld: can do
<wallyworld> would be good i think
<wallyworld> one less thing to worry about later
<wallyworld> axw: and if you land soon, it will get included in the next CI run, current one looking ok so far except that log rotation still broken
<axw> wallyworld: ok. can you please take a look at the paragraph I added to "command-line login"?
<wallyworld> sure
<wallyworld> axw: lgtm, ty
<axw> wallyworld: I forget what the suggestion was. UnresolvedAPIAddresses and APIAddresses?
<wallyworld> ResolvedAPIEndpoints (instead of Servers) I think
<wallyworld> resolved-api-endpoints for yaml
<axw> wallyworld: Servers is the unresolved set
<wallyworld> and then we also still have api-endpoints for the ip addresses
<wallyworld> oh, did i get that wrong way around
<axw> wallyworld: I think I'd rather say Unresolved for that, since it's the one that should typically not be used
<wallyworld> sgtm
<wallyworld> yep, just re-read my notes, i got it backwards
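As a sketch of where the naming discussion lands, the controller details might look like this; the field and yaml names are inferred from the conversation, not copied from the merged code:

    package jujuclient // illustrative package name

    // ControllerDetails sketches the renamed endpoint fields.
    type ControllerDetails struct {
        // UnresolvedAPIEndpoints holds the configured server addresses
        // before DNS resolution; callers should normally avoid these.
        UnresolvedAPIEndpoints []string `yaml:"unresolved-api-endpoints,omitempty"`
        // APIEndpoints holds the resolved IP addresses.
        APIEndpoints []string `yaml:"api-endpoints,omitempty"`
    }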
<mup> Bug #1561293 changed: remove machine model flag <juju-core:Invalid> <https://launchpad.net/bugs/1561293>
<mup> Bug #1561300 opened: juju debug-log does honor current model aside from the admin model <conjure> <juju-core:New> <https://launchpad.net/bugs/1561300>
<axw> wallyworld: we currently render both servers and api-endpoints in show-controller. do you think we should remove the unresolved ones?
<axw> I don't think they have any value to the user
<wallyworld> axw: we can do and put them back if there's complaints, now is the time
<axw> wallyworld: ok, doing that now
<mup> Bug #1561315 opened: Ensure availability uses wrong constraints <juju-core:Triaged> <https://launchpad.net/bugs/1561315>
<axw> wallyworld: can you take a quick look at the PR again?
<axw> please
<wallyworld> yup
<wallyworld> axw: lgtm, let's hope we beat the next CI run :-)
<wallyworld> axw: did you see the test failure?
<axw> wallyworld: went to make lunch, not yet
<axw> doh
<wallyworld> np, just a small one
<axw> wallyworld: there's new simplestreams in the works for azure already? does it include windows and centos?
<wallyworld> axw: yup
<wallyworld> not far off being ready
<axw> wallyworld: ubuntu too?
<wallyworld> didn't want to distract :)
<wallyworld> axw: so yeah, believe so. the windows and centos streams will be maintained by us on streams.canonical.com, the ubuntu streams on cloud-images
<wallyworld> simplestreams search will look in 2 search paths
<axw> wallyworld: ok
<wallyworld> axw: i added the extra search path and signing key back in dec at oakland
<wallyworld> has taken till now to get the streams done
 * axw nods
<wallyworld> all needs to be tested etc
<wallyworld> might be issues, who knows
<davecheney> allwatcher_internal_test.go:3066: tw.c.Assert(tw.NumDeltas(), jc.GreaterThan, 0)
<davecheney> ... obtained int = 0
<davecheney> ... expected int = 0
<davecheney> golf clap
<mup> Bug #1561339 opened: environs/sync: test failure <juju-core:New> <https://launchpad.net/bugs/1561339>
<davecheney> menn0: https://github.com/juju/juju/pull/4887
<davecheney> ^ urgent
<davecheney> axw: https://github.com/juju/juju/pull/4887
<axw> davecheney: LGTM
<davecheney> danka, I'll hulk smash that in
<davecheney> axw: as mwhudson noted, Juju's compiler breaking bug came early this development season
<axw> heh :)
<davecheney> fwiw, I don't think working around that bug made the code any worse
<davecheney> in fact, I think it made it better
<axw> davecheney: agreed
<davecheney> but I make no comment on the name 'processedStatus'
<davecheney> i was going to change it to just status
<davecheney> but then we'd have shit like
<davecheney> status.Status.Status =
<axw> heh
<davecheney> and I couldn't see through the tears to keep typing
<axw> juju/juju/juju/Status.Status.Status
<mwhudson> go 1.6 is in trusty-proposed
<mwhudson> https://launchpad.net/ubuntu/+source/golang-1.6/1.6-0ubuntu1~14.04
<davecheney> 16:50 < axw> juju/juju/juju/Status.Status.Status
<axw> mwhudson: woohoo, thank you :)
<davecheney> mwhudson: fantastic
<davecheney> I think that deserves a toot of your horn on juju-dev@
<mwhudson> one thing to note is that this package doesn't install /usr/bin/go
<mwhudson> (rather it's /usr/lib/go-1.6/bin/go)
<mwhudson> ah yeah
<wallyworld> axw: doh, next CI run started without your change
<axw> indeed
<wallyworld> anastasiamac: my theory is correct
<wallyworld> https://apidocs.joyent.com/cloudapi/
<anastasiamac> \o/
<anastasiamac> m fixing now :D
<wallyworld> axw: the reason joyent is farked up - we need to prefix tags with "tag."
<wallyworld> we pass in tags from ResourceTags()
<wallyworld> eg model uuid etc
<axw> ah, so it is a special prefix
<wallyworld> appears so
<axw> okey dokey
<wallyworld> i just had an educated guess, i think it's correct
<wallyworld> we'll find out soon enough, but the api doc appears to confirm it
<axw> wallyworld: yeah, `An arbitrary set of tags can be set at provision time, but they must be prefixed with "tag."`
<wallyworld> axw: but what i did see was an extraordinarily long time between instance start up and the log entry "spaces discovery complete, client connections now allowed"
<wallyworld> 10 minutes or thereabouts
<axw> yikes
<wallyworld> and also worker machines not able to talk back to the state server
<wallyworld> but one thing at a time - we'll get the tags fixed first
<wallyworld> anastasiamac: looks like it worked :-)
<anastasiamac> yes.. m just running a couple of other tests and then that's it!
<davecheney> 16:55 < mwhudson> one thing to note is that this package doesn't install /usr/bin/go
<davecheney> 16:55 < mwhudson> (rather it's /usr/lib/go-1.6/bin/go)
<davecheney> ^ this is a good thing, it's why we have update-alternatives
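(Wiring that up by hand would be something along the lines of "sudo update-alternatives --install /usr/bin/go go /usr/lib/go-1.6/bin/go 10"; the priority is arbitrary, and as noted below the package does not ship an alternative itself.)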
<mwhudson> i guess you can use that too
<mwhudson> the package could even include an alternative, but it doesn't
<mwhudson> alternatives are a bit iffy for things that end up as build-depends
<mwhudson> (alternatives for go went away in xenial yesterday)
<mup> Bug #1561375 opened: state: TestRegisterNoSecretKey unreliable <juju-core:New> <https://launchpad.net/bugs/1561375>
<dooferlad> voidspace: sorry, just noticed the time. On my way to the hangout.
<voidspace> dooferlad: me too
<voidspace> no babbageclunk yet
<fwereade> voidspace, he's in there already
<voidspace> fwereade: ah, just not on irc
<voidspace> babbageclunk: hey, hii
<voidspace> babbageclunk: so you're on maas 2.0 for the morning then
<babbageclunk> voidspace: yup yup
<babbageclunk> voidspace: might drop off the standup then
<voidspace> babbageclunk: yeah
<voidspace> babbageclunk: my merge of master onto the "drop-1.8" branch landed and I'm now landing that onto our maas2 branch
<voidspace> babbageclunk: which we still haven't done anything with yet...
<babbageclunk> voidspace: should be an easy merge then!
<voidspace> babbageclunk: :-)
<mgz> so, we nearly got a blessed master, one test failure off. then admin-controller-model merged.
<voidspace> mgz: I had a load of "unknown" failures on my drop-1.8-support branch, all seemingly joyent related
<voidspace> mgz: http://reports.vapour.ws/releases/3811
<voidspace> mgz: plus a couple of test timeouts and a couple with known bug numbers
<voidspace> mgz: is the joyent problem just a spurious thing, known, or something else? (do you think)
<rogpeppe1> does anyone here understand how the new model/controller/account hierarchy works?
<mgz> voidspace: what version of master did that branch last merge... ah, my rev, so includes admin-controller
<voidspace> mgz: ah, they're all failing on master as well
<voidspace> mgz: so my branch has consistently been "no worse than master"... :-)
<mgz> voidspace: three things
<mgz> voidspace: one, you have the new regressions from master (timeout etc across a lot of tests)
<mgz> voidspace: two, you mis-resolved conflicts in dependencies.tsv and dropped a bug fix
<voidspace> mgz: misresolved dependencies.tsv? I didn't resolve a conflict there
<voidspace> mgz: and dropped a bug fix - you mean calling the test setup twice was a bug fix?
<mgz> voidspace: three, the joyent config is borked for reasons unclear
<voidspace> unless that was a previous conflict resolution
<mgz> voidspace: well, it's not clear how exactly, and talking in shas makes life complicated
<voidspace> mgz: can you point me to the dependencies.tsv issue
<mgz> but the rev you merged includes the fix for bug 1561023
<mup> Bug #1561023: charmstore v5.WillIncludeMetadata gccgo build failure <ci> <gccgo> <ppc64el> <juju-core:Fix Committed by gz> <https://launchpad.net/bugs/1561023>
<voidspace> mgz: also the bugfix I dropped - I'll fix those
<mgz> but the dependencies.tsv in your branch at that rev doesn't have the charmstore dep bump
<mgz> so... something happened
<mgz> it's gone on master
<mgz> I wonder if the joyent cred issue is similar?
<voidspace> I'm tempted to blame git, until someone can prove me wrong...
<mgz> voidspace: well, it's certainly not wrong to blame git
<voidspace> :-)
<voidspace> mgz: I do have that old version in dependencies.tsv - but I don't have anything that would have conflicted with it
<voidspace> mgz: that's an easy fix though
<voidspace> mgz: you said screwed up dependencies.tsv *and* dropped a bug fix
<voidspace> mgz: was that two things or one thing?
<mgz> voidspace: merging that branch into master produces a sane diff at least
<mgz> so, nothing too borked happened git-wise
<mgz> voidspace: one thing, and I'm not actually sure what's up
<voidspace> mgz: yeah, I just wonder how that dependencies.tsv line got reverted
<voidspace> mgz: I'll update it manually in my branch - it's just concerning if things are missing
<voidspace> mgz: the diff doesn't show it as a change against master, so it looks like it just hasn't been pulled into that branch
<mgz> voidspace: the rev tested doesn't include the change from 1abf825dcb71f198c3895e24fecc99537454c64e ... which predates rev 9e2a02b17a92c90b524b33938b5b32bb74451538 which... aha
<mgz> voidspace: is *not* the rev you merged
<mgz> voidspace: so actually, just merging master again would resolve that, and likely the joyent creds issue
<voidspace> mgz: right, and pull in new problems instead :-)
<voidspace> mgz: maybe I'll wait until there's a bless and pull that in
<mgz> voidspace: that would be reasonable
<voidspace> the timeouts are worrying but I don't *think* they're consistent against test runs on my branch
<voidspace> I'll check
<voidspace> if they are I'll need to look at them
<mgz> I certainly don't see any failures that are obviously from changes in that branch
<voidspace> the changes are all in the maas provider so it would be "interesting" if it caused failures elsewhere
<wallyworld> voidspace: what joyent creds issue are you having?
<mgz> wallyworld: ERROR cmd supercommand.go:448 validating "credentials" credential for cloud "joyent": manta-user: expected string, got nothing
<wallyworld> and btw joyent has been quite sick - they were upgrading their data centre, and now we have issues which are still being diagnosed, maybe network. one instance took 10 or more minutes from agent start up to get the "maas spaces all discovered" message
<wallyworld> network spaces i mean
<wallyworld> mgz: you need to remove the manta items
<wallyworld> they are no longer needed
<voidspace> wallyworld: thanks
<wallyworld> we added the instructions to the release notes, but have not advertised yet
<wallyworld> didn't think anyone besides QA was using joyent :-)
<mgz> wallyworld: this is just a sync issue between different versions of the code and those changes
<wallyworld> ah ffs {\\\"code\\\":\\\"NotAuthorized\\\",\\\"message\\\":\\\"QuotaExceeded:
<wallyworld> that's the latest reason for joyent failures
<wallyworld> sigh, we're having no luck
<mgz> wallyworld: yeah, that's some of them
<wallyworld> mgz: the joyent provider was also misusing tags
<wallyworld> that was done in master a while back i think
<wallyworld> fixed in latest CI run
<wallyworld> the machines were not being tagged properly with uuid
<wallyworld> so ControllerInstances() was all messed up
<wallyworld> as was bootstrap finalisation
<babbageclunk> voidspace: ok, I've got a maas2 cluster going, I think. I might try doing the setup to get power addresses working so that I can try dooferlad's script
<babbageclunk> voidspace: unless there's something else I should do?
<wallyworld> mgz: voidspace: i sent an email - we have a routing issue with joyent machines, nfi what's wrong
<mgz> wallyworld: thank you
<wallyworld> mgz: any ideas welcome :-)
<mgz> wallyworld: did you also look at the windows test run?
<wallyworld> the unit test failure?
<mgz> with admin-controller it's now hitting timeouts
<wallyworld> i have only seen one test failure in the builds i have looked at, and it says there's an existing file handles bug
<wallyworld> i'll look at the latest run
<mgz> wallyworld: that does sometimes happen anyway, but the last three runs have all hit it
<mgz> *** Test killed: ran too long (10m0s).
<mgz> FAIL	github.com/juju/juju/api	609.327s
<wallyworld> mgz: nothing has changed in the api package at all lately, and the last time in the admin controller branch it was the file handles bug
<wallyworld> i'll look at the logs though, maybe something changed i don't know about
<wallyworld> but i would have expected the same pass or fail now in master as in the latest admin controller runs
<mgz> wallyworld: also http://reports.vapour.ws/releases/3813/job/run-unit-tests-race/attempt/1216#highlight
<wallyworld> as what was run before the merge into master
<wallyworld> i have not looked at the races at all
<wallyworld> they have been happening for a while in master, have not been on the radar for us
<mgz> wallyworld: http://reports.vapour.ws/releases/3810 <- this is my baseline
<mgz> wallyworld: that's just before admin-controller-model merged, followed by a trivial win fix
<wallyworld> mgz: right, but in the latest admin controller model runs we are not seeing the same issues
<wallyworld> the only windows test failure i recall was the file handle one'
<mgz> compare 3812 3813 3814 with the merge in
<wallyworld> mgz: here's the latest admin controller run http://reports.vapour.ws/releases/3809
<wallyworld> all known issues from master
<mgz> which all timeout on the windows tests, and have a new data race
<wallyworld> ?
<wallyworld> bug 1521699 is not a timeout
<mup> Bug #1521699: windows unit tests fail because handles are not available <ci> <intermittent-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1521699>
<wallyworld> that's the only windows test failure
<wallyworld> in that last admin controller run
<wallyworld> what are you referring to when you say timeout?
<wallyworld> so just before we merged admin controller into master - we got a clean run apart from a few known inherited issues, so we got agreement from qa team to merge
<wallyworld> one of the issues is a know CI script issue - the log rotation one
<mgz> wallyworld: http://juju-ci.vapour.ws/job/run-unit-tests-win2012-amd64/buildTimeTrend
<wallyworld> the azure arm deploy is also known - an lxc issue on xenial
<mgz> master was failing (with one small issue) in 30 mins
<mgz> it's now 2 hrs
<mgz> wallyworld: anyway, lunch...
<wallyworld> right, so we need to root cause this - it is wrong to straight away blame admin controller model when that branch was clean
<wallyworld> before merging
<wallyworld> mgz: is the machine under load perhaps, there's things like this Panic: cannot create index for logs collection: WSARecv tcp 127.0.0.1:55607: i/o timeout (PC=0x41597B)
<wallyworld> that's quite low level, can't immediately see the issue is caused by juju
<voidspace> babbageclunk: ah sorry, missed your tweet
<voidspace> babbageclunk: where are you at now?
<babbageclunk> voidspace: no worries, unless you're going to say "Nooooo, you should have been working on something else!"
<babbageclunk> voidspace: juuuuust about got the last piece of the power management setup done - will try using the kvm_maas script to add nodes.
<voidspace> babbageclunk: cool
<voidspace> babbageclunk: I'm not sure how much you can help on the API spelunking stuff I'm doing
<voidspace> babbageclunk: getting some stuff deployed with juju on maas 1.9  might be useful
<voidspace> babbageclunk: get familiar with the juju command line and concepts
<babbageclunk> Ok, I'll start on that once I do this and get it written up.
<babbageclunk> voidspace: (Also, just about to pop out and pick up some monitors.)
<voidspace> babbageclunk: ok, cool
<mup> Bug #1561526 opened: api/usermanager: no way to find if a user doesn't exist <juju-core:New> <https://launchpad.net/bugs/1561526>
<cherylj> morning, everyone.
 * cherylj reads backscroll
<mgz> cherylj: wotcha
<cherylj> hey mgz, how are things?  :)
<cherylj> so ,looks like we need a bug for the github.com/juju/juju/worker/environ data race failure
<cherylj> and we need to get someone on the stringforwarder failure for ppc
<cherylj> and someone on the windows test timeouts
<cherylj> and the joyent networking stuff
<cherylj> I'll open a bug for the environ data race, unless you already did mgz ?
<mgz> cherylj: not yet
<rogpeppe1> anyone know of a handy mongodb key sanitizer function? nice if it was reversible.
<cherylj> k, I'll do it now
<cherylj> katco: do you have someone who can look at bug 1560203?
<mup> Bug #1560203: stringForwarderSuite.TestRace sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1560203>
<cherylj> katco: right now it's blocking our ability to release
<cherylj> we seem to hit it every time
<mgz> ah, 3814 is done now, ace
<cherylj> blergh, katco is out
<cherylj> natefinch, ericsnow, can either of you take bug 1560203?
<mup> Bug #1560203: stringForwarderSuite.TestRace sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1560203>
<mgz> good to bug jam on that too as it's his code :)
<cherylj> cmars: can you spare people to help us get the release out?
<mgz> I couldn't get the test to fail on its own, but seems as part of a full run on ppc64 it comes up a lot
<natefinch> cherylj: katco is out today and tomorrow btw
<mgz> so I think the 'race' assumptions are just not conservative enough
<natefinch> cherylj: oops, sorry, I see you realized that already
<cherylj> :)
<mgz> cherylj: the dumb option is just skip the test
<cherylj> hey dooferlad, how are things going with the proxy bug?  (I'm sure it will come up in the cross team call this morning)
<natefinch> cherylj: that's some gnarly code
<dooferlad> cherylj: moving along. Have had some useful discussions with fwereade on how to fix it properly and have some code that I am reasonably happy with.
<cherylj> dooferlad: excellent.  Is early next week still a sane target?
<cherylj> like Monday / Tuesday?
<mgz> cherylj: 3814/joyent-deployer-bundle failed due to routing issues when trying to fetch tools from the state server
<dooferlad> cherylj: yes
<natefinch> gah, why is this code under juju/juju?
<mgz> could be our manual firewall cleanup rules not working any more?
<dooferlad> cherylj: though remember that Friday and Monday are public holidays (at least in the UK)
<cherylj> dooferlad: d'oh, that's right
<cherylj> thanks :)
<dooferlad> cherylj: no problem :)
<cherylj> dooferlad: you'll want to touch base with mgz to discuss CI testing with proxies for 1.25.5
<cherylj> dooferlad: make sure that we can get a test set up for it
<mgz> yeah, I've looked at that a little this week
<dooferlad> cherylj: sure.
<cherylj> muchas gracias
<cherylj> natefinch: are you looking at that stringforwarder bug?
<natefinch> cherylj: sort of.
<cherylj> heh
<natefinch> cherylj: both my managers are out, so I'm a little unclear on whether or not I should take this bug over what I was otherwise working on
<natefinch> cherylj: on the upside (for you), it looks interesting and the code seems to need some cleanup, so it makes me want to fix it.
<cherylj> sweet!
<rick_h_> natefinch: what are you working on today?
<natefinch> rick_h_: getting the resources juju code to actually support channels
<natefinch> rick_h_: it's a "bug" that we don't :D
<alexisb> natefinch, today we need help with bugs
<alexisb> everyone
<rick_h_> natefinch: ok, +1 ^
<natefinch> alexisb: ok, thanks for the clarification
<rick_h_> :)
<cherylj> I'm making the bugs that are currently blocking us actual blockers, so they'll show up on juju.fail
<natefinch> juju.fail is probably my favoritest thing that marcoceppi has ever done :)
<cherylj> yeah, it's nice :)
<marcoceppi> natefinch: I also have juju.qa - still waiting for a good idea to come to me
<mgz> cherylj: I'm unsure what to do over the win2012 unit test timeout,
<mgz> ian is right that the last run of the feature branch didn't have issues
<mgz> but nothing else in master particularly affects that
<cherylj> mgz: the last run of the feature branch may have been masking it
<mgz> cherylj: my best bet may be restart the windows slave and rerun?
<cherylj> because of the setuptest
<mgz> that's one small test later in the process than where the timeout happens
<cherylj> I'm suspicious of this because of the change it could have brought in, namely around utils.OutgoingAccessAllowed
<natefinch> Oh man, that is so confusing. StringForwarder has a Receive() method that actually sends a message. :/
<cherylj> lol
<natefinch> I think I'll rename it "Forward"... since that's what it's doing
<cherylj> bogdanteleaga: would you be able to help with bug 1561566?
<mup> Bug #1561566: Many windows tests fail with "WSARecv tcp 127.0.0.1:53731: i/o timeout (PC=0x41593B)" <blocker> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1561566>
<alexisb> ericsnow, ping
<bogdanteleaga> cherylj, looks really weird, but I have a feeling I've seen it before, I think it was transient though
<bogdanteleaga> I'll ask gabriel too
<cherylj> thanks, bogdanteleaga!
<mgz> cherylj: I may have some good news on that
<cherylj> oh?
<cherylj> that would be awesome
<mgz> cherylj, bogdanteleaga: http://paste.ubuntu.com/15487526/
<bogdanteleaga> mgz, if those were still around you probably wanna check the tempdir folder as well
<mgz> yeah, rm -rf $TMP/* running
<mgz> bogdanteleaga: what do I run to restart the (cloud) machine cleanly?
<mup> Bug #1561555 opened: Data race in github.com/juju/juju/worker/environ <blocker> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1561555>
<mup> Bug #1561566 opened: Many windows tests fail with "WSARecv tcp 127.0.0.1:53731: i/o timeout (PC=0x41593B)" <blocker> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1561566>
<bogdanteleaga> mgz, shutdown -r -t 0 should do
<wallyworld> cherylj: a small one https://github.com/juju/juju/pull/4889
<ericsnow> alexisb: pong
<alexisb> ericsnow, we have a push on bugs to get a very important beta3 out
<alexisb> ericsnow, can you please work with cherylj on any needs she may have of you today
<wallyworld> ericsnow: a +1 of this small data race fix would be great https://github.com/juju/juju/pull/4889
<ericsnow> alexisb: k
<natefinch> wallyworld: you should go to bed.  it's after midnight on the first day of your vacation
<wallyworld> natefinch: soon, just one more fix :-)
<ericsnow> alexisb: keep in mind that we still have to land the implementation of resources in the charm store in the short term
<alexisb> ericsnow, yes I know
<ericsnow> cherylj: keep me posted on how I can help
<natefinch> alexisb: you should make wallyworld take an extra day of vacation for every hour he works past midnight, that would teach him ;)
<alexisb> natefinch, agreed ;)
<ericsnow> natefinch, alexisb: lol
<alexisb> ericsnow, natefinch this is just to support beta3 over the next 2 days
<ericsnow> wallyworld: ship-it
<wallyworld> tyvm
<alexisb> ericsnow, natefinch I will work with katco and wallyworld on adjusting schedules with the 2 day impact
<alexisb> but we need to get beta3 out
<ericsnow> alexisb: the nice thing is that the charm store patches don't have quite the same deadlines :)
<alexisb> ericsnow, yep :)
<alexisb> and sorry for the shift in priority but this one is important
<ericsnow> alexisb: not that we can afford to be complacent about them!
<ericsnow> alexisb: np
<wallyworld> ericsnow: ok, that should be landing, can you monitor for me? i've asked abentley to pause CI pending any more short term work to address any remaining issues; we'll need to keep him in the loop
<ericsnow> wallyworld: k
<ericsnow> wallyworld: should I ping him once that's merged?
<abentley> ericsnow: yes, please.
<ericsnow> abentley: will do
<wallyworld> ericsnow: just need to check what else there is - mgz did we think a windows slave restart would help the windows tests?
<wallyworld> there's also a potential log rotation fix
<mgz> wallyworld: I belive so, I'm doing that now
<wallyworld> haven't looked at that
<wallyworld> i think the only other issue is joyent?
<wallyworld> and its networking issue
<wallyworld> ericsnow: mgz: it's possible TestAddUserAndRegister has an issue on windows cleaning up the logsink.log file at the end of the test. nfi why
<mgz> wallyworld: the other big thing is joyent routing
<wallyworld> there's nothing special in that test
<wallyworld> so maybe we c.Skip() on windows with a todo for now
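The skip being floated is only a couple of lines in the test; a sketch, with an illustrative suite name and message:

    package apiserver_test // illustrative

    import (
        "runtime"

        gc "gopkg.in/check.v1"
    )

    type registerSuite struct{} // stand-in for the real suite

    var _ = gc.Suite(&registerSuite{})

    func (s *registerSuite) TestAddUserAndRegister(c *gc.C) {
        if runtime.GOOS == "windows" {
            c.Skip("TODO: logsink.log cleanup fails on windows; reason unknown")
        }
        // ... the real assertions would follow ...
    }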
<wallyworld> mgz: yeah, i don't know enough to fix joyent routing issues
<mgz> http://data.vapour.ws/juju-ci/products/version-3814/joyent-deploy-trusty-amd64/build-1965/machine-0.log.gz
<mgz> that's not actually the most helpful
<mgz> basically... the provisioned machines can't talk back to admin:machine-0 to get tools
<mgz> this is similar to what we added hack firewall cleanup rules to work around
<wallyworld> mgz: yep, but only if they use the cloud address
<mgz> it's possible those hacks need updating to the new world of admin controllers
<wallyworld> the public address is fine
<wallyworld> mgz: admin controller should make 0 difference
<wallyworld> it's just a controller
<mgz> the hack being:
<mgz> $RELEASE_TOOLS/joyent-curl.bash /cpcjoyentsupport/fwrules | sed -e 's/[\[\{]/\n\0/g;' | grep $JOB_NAME | sed -e 's/.*"id":"\([^"]*\)".*/\1/' | xargs -I{} $RELEASE_TOOLS/joyent-curl.bash /cpcjoyentsupport/fwrules/{} -X DELETE || true
<wallyworld> i'll take your word for it :-)
<mgz> but given all the joyent downtime and other misc fallout from the last few days it could also be something else
<wallyworld> ericsnow: mgz: one difference in that TestAddUserAndRegister test is that it does a c.Assert(api.Close(), jc.ErrorIsNil) at the end whereas other tests just do an api.Close(). we could try that or just skip the test for now
<wallyworld> but that's just clutching at straws
<wallyworld> trying to find something wrong
<wallyworld> the test itself doesn't do anything with logsink.log
<wallyworld> it's all setup by JujuConnSuite
<wallyworld> under the covers
<mgz> windows machine restarted. rerunning 3814 win2012 unit tests.
<abentley> wallyworld, mgz: cherylj and I think the joyent issues are caused by the admin model and the default model being in different regions.
<wallyworld> hmmm, that may explain it
<wallyworld> would need to check the code to be sure
<ericsnow> abentley: in case you didn't notice, this merged: https://github.com/juju/juju/pull/4889
<mgz> abentley: actual different regions? not just different network zones (or whatever joyent calls that)
<cherylj> mgz: yeah
<cherylj> mgz: I see that my controller is in us-east-3
<ericsnow> abentley: first try too <wink>
<cherylj> mgz: but when I deploy to the default model
<wallyworld> AFAIR, there's no attempt to be in different regions, but also no attempt to be in the same region
<abentley> wallyworld: Should we test now or hold for more fixes?
<cherylj> mgz: it gets put in us-east-1
<abentley> wallyworld: We specify region in the bootstrap config, so we'd expect the same one to be used for both.
<cherylj> mgz, abentley, deploying a machine in the admin model properly puts the machine in the same region as the controller
<wallyworld> abentley: depends on whether that is passed through, i'd need to check. but have we looked at the joyent instances to confirm?
<cherylj> wallyworld: yes
<wallyworld> so seems like we need to explicitly constrain the region of any hosted models
<cherylj> wallyworld: yeah, seems to be
<wallyworld> i'd need to check the code
<wallyworld> can't recall exactly the setup behaviour ottomh
<cherylj> wallyworld: I can take this, go to bed!
<wallyworld> let me take a quick look
<wallyworld> abentley: maybe just hold off on a new run for a bit longer
<abentley> wallyworld: Okay, let me know.
<abentley> wallyworld: The functional-log-rotation-unit issue also seems to be the same joyent issue.  I re-ran it on AWS and it passed.
<wallyworld> abentley: oh good :-)
<abentley> cherylj: ^^
<wallyworld> so really one more issue
<cherylj> yay
<abentley> wallyworld: Does this cross-region apply to other providers?  We saw a bunch of admin machines all by themselves in aws earlier this week.
<abentley> Maybe their default model machines were in a different region.
<cherylj> ahhhh, I can check on AWS
<cherylj> that reminds me I need to check my dashboard for other regions!!
<wallyworld> abentley: not sure, from what i can see, the region for joyent comes from the sdc url which is the same for both admin and hosted
<cherylj> what's the default amazon region?
<wallyworld> us-east-1
<abentley> wallyworld: On bootstrap, we're not allowed to pass in sdc-url.  Only region.
<wallyworld> abentley: SDC-URL COMES FROM CREDENTIALS
<wallyworld> caps lock fail
<wallyworld> sorry
<ericsnow> natefinch: if you're working on bug #1560203, would you mind assigning it to yourself?
<cherylj> I was like woah!
<mup> Bug #1560203: stringForwarderSuite.TestRace sometimes fails <blocker> <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1560203>
<cherylj> didn't know you felt so strongly about SDC-URL
<wallyworld> lol
<wallyworld> 'A' key too close to caps lock
<natefinch> ericsnow: oops, yep, thanks
<cherylj> lol
<natefinch> wallyworld: you should remap caps lock to something useful... I made it the compose key so I can easily make things like ™ :)
<wallyworld> :-)
<wallyworld> cherylj: https://github.com/juju/juju/blob/master/provider/joyent/environ_instance.go#L99
<wallyworld> the region for start instance appears to be the same for all models based on https://github.com/juju/juju/blob/master/provider/joyent/config.go#L170
<wallyworld> and sdc-url comes from credentials
<wallyworld> which is the same for both admin and default model
<wallyworld> with create-model  the user can pass in different credentials so could shoot themselves in the foot
<natefinch> ahhh.... stupid gocheck
<natefinch> always put your func TestPackage(t *testing.T) { gc.TestingT(t) } in the internal package, not _test package, otherwise it doesn't actually run any internal tests :/
<natefinch> some might call that a feature ;)
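A minimal sketch of the gotcha natefinch describes; the package name is a placeholder, and the point is that the gocheck hook lives in the internal package rather than the external _test one:

    package foo // note: not "package foo_test"

    import (
        "testing"

        gc "gopkg.in/check.v1"
    )

    // TestPackage wires gocheck into "go test". Placed in the internal
    // package it also runs suites registered by internal test files;
    // placed in the _test package it would silently skip them.
    func TestPackage(t *testing.T) {
        gc.TestingT(t)
    }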
<wallyworld> cherylj: so it seems confusing based on the above how machines in hosted models can be in different regions for joyent
<cherylj> wallyworld: I'm not specifying sdc-url and I see this issue
<wallyworld> cherylj: let me see where sdc-url comes from
<cherylj> wallyworld:  I'm using bootstrap joyent joyent/us-east-3
<wallyworld> cherylj: right, so it comes from the clouds.yaml file; will be the same for all models
<wallyworld> so i don't see off hand how machines get created in different regions
<cherylj> There's nothing in my clouds.yaml for joyent
<wallyworld> cherylj: fallback-public-clouds.yaml
<wallyworld> in the code base
<cherylj> ah
<cherylj> wallyworld: I'll keep looking into this
<wallyworld> ok, it's annoying me that the regions are different and they shouldn't be
<wallyworld> according to the code as i see it, but i need sleep
<wallyworld> i'll find out i guess at the release standup :-)
<wallyworld> cherylj: the only other thing is to consider skipping that 1 failing test on windows as per backscroll a bit ago
<wallyworld> cherylj: func (joyentProvider) RestrictedConfigAttributes() []string {
<wallyworld> add sdc-url
<wallyworld> to the result
<wallyworld> that should fix it
<wallyworld> that will force hosted model config to share the sdc-url value
<wallyworld> with the admin model
<cherylj> nice, thanks wallyworld
<wallyworld> you may need to add additional ones, i think that's all that's needed
<cherylj> ok, will test
<wallyworld> tl;dr: add any attrs there that must be duplicated between admin and hosted model
<wallyworld> cherylj: eg for cloudsigma, the result is return []string{"region"}
<wallyworld> so for joyent return []string{"sdc-url"}
<wallyworld> hopefully a one line fix :-D
<cherylj> awesome, thanks :)
<wallyworld> yep, also "region" for ec2 :-)
<wallyworld> joyent provider is the red headed step child
<wallyworld> that no one loves
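Paraphrasing wallyworld's one-liner as Go, a sketch of the shape rather than the patch that actually landed:

    package joyent

    // joyentProvider stands in for the real provider type in provider/joyent.
    type joyentProvider struct{}

    // RestrictedConfigAttributes names config attributes that hosted models
    // must share with the admin model; pinning sdc-url keeps hosted models
    // in the controller's region.
    func (joyentProvider) RestrictedConfigAttributes() []string {
        return []string{"sdc-url"}
    }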
<rick_h_> wallyworld: fyi, cancelled our call. Obviously you're here late and off so want to make sure you don't expect to have it :)
<wallyworld> rick_h_: i don't mind, i need to be up 30 minutes beforehand for the release standup
<wallyworld> happy to still have it
<rick_h_> wallyworld: nope, I'm not :P
<wallyworld> rick_h_: ok, hopefully my status email filled you in
<rick_h_> wallyworld: rgr
<rick_h_> wallyworld: email me if you need anything
<wallyworld> will do
<wallyworld> i am happy i think we got joyent sorted out
<wallyworld> beta3 will rock
<wallyworld> ttyl
<wallyworld> abentley: one last thing before i go - we think there's a one line fix for joyent - cherylj will ping you when it's landed
<redir> morning
<natefinch> someday reviewboard will pick this up: https://github.com/juju/juju/pull/4890
<natefinch> if anyone wants to review it
<abentley> cherylj: Here is a list of attrs that may occur in environments.yaml that we have blacklisted from the bootstrap config in 2.0: https://pastebin.canonical.com/152723/
<cherylj> ericsnow: can you review natefinch's PR?  https://github.com/juju/juju/pull/4890
<ericsnow> cherylj: will do
<jam> mgz: bug me about what?
<mup> Bug #1561611 opened: Joyent machines deployed to hosted models use wrong region <blocker> <ci> <juju-core:Triaged by cherylj> <https://launchpad.net/bugs/1561611>
<mgz> jam: look at nate's pr ^
<mgz> bug you about that
<cherylj> mgz: I see the windows tests are still failing :(
<mgz> cherylj: well, good news bad news...
<mgz> right, that
<mgz> got tests to run again, still super unreliable compared to before on master
<mgz> two things under apiserver hitting 600s timeout, strongly suggests actual deadlock
<mgz> bogdanteleaga: can you normally get a clean test run locally? if so, can you try with current master?
<jam> natefinch: so the first thing that jumps out at me is that if you call 'Stop()' twice you'll get a panic. It's often hostile to not make cleanup actions reentrant
<natefinch> jam: ahh you're here, great
<bogdanteleaga> mgz, I stopped trying to run the whole thing on windows a while ago, they seem to be way slower on windows
<natefinch> jam: I wanted to talk to you about my change, but figured you were out
<bogdanteleaga> I remember hitting something like 299.x seconds with a 300s timeout on a suite
<bogdanteleaga> but I'll give it a shot
<mgz> bogdanteleaga: yeah, can do it in 30 mins on a big cloud machine but it's not much use for narrowing down problems
<mgz> bogdanteleaga: apiserver seems the interesting package
<natefinch> jam: the reason I did it is to make it more obvious that it's not thread-safe to call stop from multiple threads. When I first looked at the code for Stop, I figured the nil check was an attempt to make it threadsafe (which obviously it's not)
<mgz> master as-of before admin controller and HEAD, if they behave differently
<bogdanteleaga> mgz, I'll try head first, you're saying apiserver times out?
<natefinch> jam: and given that we don't actually need to call stop multiple times in production, that seems ok
<bogdanteleaga> also what go version are you using?
<mgz> bogdanteleaga: yeah
<jam> natefinch: So i think it isn't all that uncommon to want to do something like defer (Stop()), and then end up with logic that might also Stop early.
<jam> generally re-entrant cleanups are going to play nicer
<jam> I'd rather just put a Mutex in there
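A minimal sketch of the reentrant Stop jam is asking for; the type and field names are illustrative:

    import "sync"

    type forwarder struct {
        mu   sync.Mutex
        done chan struct{}
    }

    // Stop is safe to call more than once (say, a deferred Stop plus an
    // early Stop on an error path): the mutex serialises callers and the
    // nil check turns later calls into no-ops instead of a double-close panic.
    func (f *forwarder) Stop() {
        f.mu.Lock()
        defer f.mu.Unlock()
        if f.done != nil {
            close(f.done)
            f.done = nil
        }
    }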
<natefinch> jam: yeah, the defer and then also call stop is true
<mgz> bogdanteleaga: urk, that thar is a reasonable question, not sure if we've updated the windows build to 1.6 yet
<mgz> # C:/go/bin/go version
<mgz> go version go1.2.2 windows/amd64
<mgz> soo... that's also on our list to get done
<bogdanteleaga> 2507 files changed, 160673 insertions(+), 117711 deletions(-)
<bogdanteleaga> :)
<natefinch> jam: I just kind of hate writing code for conditions that we don't actually need or care about
<mgz> bogdanteleaga: :)
<bogdanteleaga> mgz, are the binaries used for deployment built using 1.6 though?
<natefinch> jam: anyway, I can add the mutex if you think that's the right fix.
<jam> natefinch: done vs stopch is fine, New vs NewFoo is good, and I'm happy with the rest
<jam> natefinch: yeah, most is just spelling that I'm agnostic about, I'd just like Stop() to be reentrant
<natefinch> jam: thanks... sorry for stomping all over your code
<jam> natefinch: if it's clearer for someone that isn't me, then it's all good
<mgz> bogdanteleaga: not yet I think, cross builds also on go 1.2
<mgz> we're about to move all the remaining bits to go 1.6 though
<natefinch> jam: do you agree with the changes to TestRace?  That's the one that was failing.  It seemed like checking that the goroutine stops before running out of runway was not actually the point of the test.
<mgz> bogdanteleaga: basically, I'm fine signing off windows test failures for the release until we've done some of that upgrade work,
<mgz> if you can confirm that master as of now is not vastly more borked than it was a few days ago
<cherylj> jam, do you think that this PR could have anything to do with these windows failures?   https://github.com/juju/juju/pull/4798
<voidspace> alexisb: you back?
<jam> cherylj: yes, with the caveat that it is exposing brokenness in the test suite.
<jam> cherylj: we were doing stuff like not calling SetUpSuite
<jam> or calling PatchValue before calling SetUpTest
<jam> which would cause the patched value to never be cleaned up.
<jam> cherylj: IIRC mgz had a patch that changed at least one of them to correctly call SetUpSuite
<alexisb> voidspace, I am and will be free shortly
<alexisb> will ping
<voidspace> alexisb: cool
<cherylj> natefinch, ericsnow, can I get a review? http://reviews.vapour.ws/r/4340/
 * ericsnow reviews
<ericsnow> cherylj: LGTM
<natefinch> jam, ericsnow: the reason I made loop into a standalone function is that it's a goroutine... it doesn't have any state it's storing, and it shouldn't be implied that it is.  There's two separate concerns, a goroutine looping over a channel, and a type that can send messages to that channel.  Linking them to the same object is conceptually incorrect, and unnecessary.  Plus it means the goroutine has receive-only halves of done and messages, making it
<natefinch> clear that it should not (and cannot) be the one closing them.
<ericsnow> natefinch: my point was that it is confusing that way
<natefinch> ericsnow: hmm, ok.
<natefinch> ericsnow: my default is to make a goroutine a standalone function, since that's what it is.  I actually found it confusing that it was a method, since it didn't really need to be :)
<natefinch> ericsnow: I can hit a middle ground I think... just share the done and messages channels in the struct
<ericsnow> natefinch: even putting that loop function inside New() as a closure would help, though jam's point about other loop methods is still correct
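natefinch's middle ground, sketched with illustrative names: loop stays a standalone function, and the receive-only channel halves make it impossible for the goroutine to close what it reads:

    // loop owns no state beyond its arguments; the <-chan types document
    // (and enforce) that closing done and messages is the sender's job.
    func loop(done <-chan struct{}, messages <-chan string, forward func(string)) {
        for {
            select {
            case <-done:
                return
            case msg := <-messages:
                forward(msg)
            }
        }
    }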
<natefinch> ericsnow: what was your oops comment about?
<ericsnow> natefinch: you had pushed up some typos
<natefinch> ericsnow: oh, did I fix it?
<ericsnow> natefinch: apparently you caught them anyway :)
<natefinch> heh
<natefinch> ericsnow: also, the fix was actually just removing the limited for loop inside TestRace, and removing the error that happened if the goroutine went through the for loop before the test called stop
<ericsnow> natefinch: k
<bogdanteleaga> mgz, it was successful, but it almost timed out with several packages over 550s
<mgz> bogdanteleaga: thanks
<mgz> cherylj: ^I'm fine signing off windows tests for beta3
<bogdanteleaga> mgz, trying 1.6 now
<redir> cherylj: review please http://reviews.vapour.ws/r/4341/
<bogdanteleaga> mgz, oh and it was apiserver only :)
<mgz> bogdanteleaga: that should do, other things timed out in CI as well but seems like we just made juju slower on windows tests
<natefinch> W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty-updates/main/source/Sources  Hash Sum mismatch
<natefinch> cute
<natefinch> team meeting anyone?
<cherylj> alexisb: you going to make this call?
<alexisb> cherylj, sorry looks like I missed it
<cherylj> yeah, there wasn't anything to talk about anyway.  Unless you had something
<alexisb> nope
<mgz> natefinch: we failed...
<natefinch> mgz: indeed
<redir> ericsnow, natefinch review please: http://reviews.vapour.ws/r/4341/
<ericsnow> redir: will do
<ericsnow> redir: done
<redir> ericsnow: tx, I'll have a couple questions -- after a reboot.
<ericsnow> redir: k
<natefinch> redir: I reviewed as well.
<natefinch> ericsnow: it's just you and me for the standup today, mind if we meet early?
<ericsnow> natefinch: sounds good
<ericsnow> natefinch: now?
<natefinch> ericsnow: yep
<redir> holler when you're off ericsnow
<ericsnow> redir: k
<alexisb> cherylj, I am finding my first issue testing out beta3
<alexisb> I had a controller I created with beta2
<alexisb> trying to kill it with beta3 fails
<alexisb> but when I kill it with the old beta2 it works
<cherylj> alexisb: that doesn't surprise me
<cherylj> there have been problems like that between the other betas
<cherylj> the expectation is that you create and destroy with the same version
<alexisb> we should note it in known issues though, dont you think?
<alexisb> or somewhere in the release notes
<cherylj> alexisb: yeah, could say something about the different betas not guaranteed to be compatible with each other
<alexisb> cherylj, if you are good with saying something I can add
<cherylj> alexisb: yeah, probably worth noting.  Thanks :)
<wallyworld> alexisb: beta3 is not compatible with beta2
<wallyworld> we make no guarantees of compatibility
<wallyworld> the release notes will say as much when i update them
<alexisb> wallyworld, you are not allowed to correct me on your vacation day
<wallyworld> alexisb: will disappear after release standup
<wallyworld> alexisb: we changed the format of controllers.yaml
<wallyworld> better to do it now than after release
<alexisb> wallyworld, agreed,  I am just trying to be a 'new' user
<wallyworld> sure np, we just need to be clear that folks need to "start again" between betas
<redir> Am I supposed to 'resolve' issues in review board when I fix them or leave them for the original reviewer to resolve?
<redir> somebody say new user?
<ericsnow> redir: feel free to mark them as resolved
<redir> ericsnow: one outstanding
<ericsnow> redir: if you aren't sure if you've satisfied the reviewer though, it sometimes pays to leave it alone until you get more feedback
<redir> ericsnow: your first one, I am not sure...
<ericsnow> redir: I've dropped that one
<redir> OK.
<ericsnow> redir: so you should be ready to go :)
<redir> So then does it automatically merge?
<redir> sorry this is my first bug
<redir> so first time through.
<redir> Or do I merge it myself?
<ericsnow> redir: you have to add a comment in the PR with $$merge$$ in it
<redir> k. tx.
<redir> And the buildbots don't run the tests before the merge?
 * redir wonders where the build dashboard is.
<ericsnow> redir: there's a merge bot that adds a link to the PR for the merge request and then runs the tests and does the merge for you
<redir> awesome. so if the tests fail the merge should too. thanks ericsnow
<ericsnow> redir: congrats on your first patch merged :)
<ericsnow> redir: yep
<ericsnow> redir: np
<redir> ...insert fancy ascii art in here...
<alexisb> i would like to point out that wallyworld did not ask about mongo
<alexisb> in the release call
<wallyworld> alexisb: no point :-( i saw the email
#juju-dev 2016-03-25
<redir> sigh
<redir> ericsnow: you still around?
<deanman> Hi, i would like to write a docker charm utilising the layer-docker charm but it is not clear to me what's the best way to do this. Do i simply clone layer-docker or can i use something from charm-tools to pass this as an argument?
<mup> Bug #1561959 opened: juju kill-controller fails with latest admin/default model changes <conjure> <juju-core:New> <https://launchpad.net/bugs/1561959>
 * fwereade is feeling a growing sense of dread that the dummy provider is somehow involved in the insanity I've seen since merging master
<fwereade> ...yes, it is. destruction of a *hosted* environ is resetting the mgo server
<fwereade> f7u12
<bogdanteleaga> anybody using go1.6 yet?
<natefinch> bogdanteleaga: yeah, many people are building with it
<bogdanteleaga> natefinch, ever seen provider/openstack/firewaller.go:434: arg cfg.UUID() for printf verb %s of wrong type: (string, bool) when running go vet on pushes?
<bogdanteleaga> cfg.UUID returns only string, but this seems to happen since I got 1.6
<natefinch> bogdanteleaga: you may want to delete your $GOPATH/pkg directory... could have an outdated library in there
<natefinch> bogdanteleaga: sometimes that happens when you switch go version, and it'll try to build with an outdated library, which can result in the signature that is in the code not reflecting the signature of what is being linked, which is what this sounds like
<bogdanteleaga> natefinch, right, I forgot about that one
<bogdanteleaga> natefinch, seems to work, thanks
<natefinch> bogdanteleaga: sweet
<alexisb> morning all, happy friday
<natefinch> alexisb: morning :)
<natefinch> how the heck do I have jujud running from a binary that doesn't exist on disk?
<mup> Bug #1562052 opened: juju bootstrap doesn't create secgroups when there are environment naming conflicts <juju-core:New> <https://launchpad.net/bugs/1562052>
<redir> morning:)
<natefinch> redir: morning :)
<natefinch> ericsnow: lol - from cmd/juju/charmcmd: // Command is the top-level command wrapping all backups functionality.
<natefinch> ericsnow: https://pbs.twimg.com/media/B3rqN7EIMAEFdJ-.jpg
<bogdanteleaga> cmars, I addressed the comments, overall I'm not sure what's preferred for the CLI
<bogdanteleaga> cmars, I'm slightly in favor of just waiting, but people had concerns with that
<cmars> bogdanteleaga, thanks for the changes.. what's the targeted release for this change? is it intended for juju 2.0 or after?
<bogdanteleaga> cmars, 2.0
<redir> speaking of commands
<redir> you around cherylj ?
<cherylj> redir: yep, what's up?
<redir> got a minute to chat about those usage text bugs?
<cherylj> sure.  hangout?
<cherylj> or irc?
<redir> HO
<cherylj> let me grab my headset
<redir> voice would probably be much quicker
<redir> k
<cherylj> redir: https://plus.google.com/hangouts/_/canonical.com/cheryl-reed?authuser=0
<mup> Bug # changed: 1319890, 1472711, 1490865, 1520247, 1524064, 1524297, 1536792, 1542206, 1545057, 1547898, 1551857, 1553915, 1554044, 1554721, 1555430, 1555585, 1557714, 1558191, 1559233, 1560237, 1561023, 1561088, 1561555, 1561611
<ericsnow> natefinch-afk: ha, backups
<mup> Bug #1562088 opened: flexible /etc/rsyslogd.d/25-juju.conf configuration <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1562088>
<natefinch> oh man...  charmrevisionupdater imports juju/testing, which imports provider/dummy, which imports apiserver, which imports apiserver/charmrevisionupdater
<natefinch> I just can't even
<voidspace> g'night all
<voidspace> happy easter
<natefinch> man, I really hate it when a test in one file depends on hard-coded values defined in another file.  Like... how am I supposed to know where this magic value comes from?
<redir> How does one typically mark a bug in lp? I don't see a way to add tags. Should I just add a comment? Something else?
<natefinch> redir: there are tags down at the bottom
<natefinch> redir: depends on how you're marking it, though...
<redir> natefinch: thanks.
<redir> natefinch: good point, needs update, needs review
<redir> needs edit
<redir> What is the equivalent of @mentioning someone?
<natefinch> redir: that's probably more just a comment.  tags are generally used for permanent or semi-permanent categorization
<natefinch> redir: lol... uh, email them a link?
<redir> k, comment it is.
<redir> and emailing a link to the bug it is
<redir> tx natefinch
<natefinch> redir: lp is still pretty oldschool, though people will get emailed who have already indicated an interest (forget exactly what gets you subscribed to a bug)
<natefinch> redir: no problem.  LP is an... interesting piece of software
<redir> How do I manually destroy-controller? Seems I updated before killing a controller and now it can't destroy it.
<redir> it is lxd
<natefinch> redir: easiest way might be to de-update and rebuild to kill it.  certainly that's the most reliable way to make sure everything is cleaned up correctly.
<natefinch> redir: we should probably figure out a more manual solution, since it's not always possible to know what code to revert to.... I have an LXD environment running that's been around for a few weeks that I don't yet know how to clean up
<redir> natefinch: I see. I don't know what version to back out to either as I was on tip...
<redir> thanks
<natefinch> ericsnow: you around? I could use some help with these charmrevisionupdater tests
<ericsnow> natefinch: sure
<ericsnow> natefinch: in moontstone
<mup> Bug #1560203 changed: stringForwarderSuite.TestRace sometimes fails <blocker> <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1560203>
<mup> Bug #1560525 changed: Juju 2.0-beta3 stabilization  <juju-core:Invalid> <https://launchpad.net/bugs/1560525>
<alexisb> natefinch, you still on by chance?
#juju-dev 2016-03-26
<redir> ciao juju-dev
<redir> have a nice easter weekend
#juju-dev 2017-03-20
<menn0> axw, babbageclunk: so.... https://github.com/juju/juju/pull/7123
<babbageclunk> menn0: trying to get something finished off so I can get some other people's opinions. Also I think wallyworld wanted me to take a look at a PR of his, so I might need to do that first - it's probably smaller than this one too!
<menn0> babbageclunk: could be :)
<babbageclunk> menn0: But I'll try to get to it soon!
<menn0> wallyworld: easy one: https://github.com/juju/cmd/pull/48
<wallyworld> menn0: sure. am also looking at resources one. will be awesome to get that all fixed up at least
<menn0> wallyworld: after the resources one comes the big apiserver facade registration cleanup i've been sitting on
<wallyworld> sounds painful
<menn0> wallyworld: the payloads and resources PRs were needed before that could be finished
<wallyworld> yep
<menn0> wallyworld: yep, not exactly a soft cushion :)
<axw> menn0: re "unnecessarily exposed? Backend": I'd prefer if the apiserver facade constructors that map state.State -> Backend were outside the package, and the package *only* dealt with Backend. not in this PR of course, but I don't think Backend is unnecessarily exposed
<menn0> axw: yeah fair enough... I was 50/50 on that one
<axw> menn0: I think we can neaten things up more when the global registry goes away
<menn0> axw: I think so too.
<menn0> axw: I need to do that for my big project
 * axw nods
<menn0> axw: (which is why I'm doing all this)
<axw> wallyworld: lol @ MiG comment. I was thinking the same thing :)
<wallyworld> :-)
<axw> wallyworld: can we branch yet?
<wallyworld> axw: i think real soon, anastasia is coordinating it; i thought it was going to be to coincide with beta1
<wallyworld> i'm guessing you need to do breaking stuff
<axw> wallyworld: ok
<axw> wallyworld: yes
<axw> I want to land my changes to detach storage
<axw> a few changes bundled up together which can't land until you can detach storage
<wallyworld> hmmm, let's discuss first thing tomorrow when she's back onboard
<axw> there's still one or two other things I can do in the mean time
<wallyworld> ok
<axw> wallyworld: I ended up cheating a little bit, and not having storage attachment removal gate on volume/filesystem attachment removal. instead, just fail in "attach-storage" if the volume/filesystem is attached to another machine
<axw> more or less comes to the same effect, but simpler
<wallyworld> simple is good
<axw> wallyworld: was it the lifecycle watcher that you were looking at, when you found that the filter is ignored if a members query is specified?
<axw> wallyworld: I'm going to make it unconditional - any concerns about that?
<wallyworld> axw: i think it was. off hand, i can't see an issue with what you want to do, but we will need to be careful that there's no unintended consequences. from memory, i needed to add a unit test to explicitly catch the issue so i'm not 100% sure our test coverage is complete
<wallyworld> i can't think of any bad consequences off hand
<axw> wallyworld: okey dokey. all tests pass, and all the existing usages look to be safe - but will double check before I go ahead with changes.
<wallyworld> ta, yeah i think it will be ok, just be a little more cautious
<jam> axw: trivial review? https://github.com/juju/juju/pull/7121
<axw> looking
<jam> and a less trivial but hopefully useful one https://github.com/juju/juju/pull/7122
<axw> jam: how's gogland so far?
<jam> axw: the one thing it does nicer than vim is "find callers of this function" and "find implementations of this function" vs just "find definition"
<jam> I'm not sold on its VIM integration, as a lot of the nice "jump to this" in the standard UX overlaps with VIM's commands
<jam> (eg, ^B is jump to definition in gogland, but also previous-page in VIM)
<axw> jam: do you use vim-go? it has integration for go-oracle, which can find callers/callees, etc.
<jam> fortunately the important ones, like "gd" to jump to definition in VIM, work in gogland
<jam> axw: I do, but I've often found it hard to get it set up correctly on all machines
<jam> I use Trusty, Xenial, Windows and Mac
<axw> okey dokey
<jam> getting them all set up nicely would be great, I should probably just try to start a fresh install
<axw> jam: yeah, little things like that (re key bindings) often mean I find other IDEs with "vim mode" difficult to use
<jam> I thought at least 'vim-go' had a "I'll automatically install dependencies" which it didn't seem to actually do
<axw> jam: I've started reviewing hte other, gotta pick up my daughter from school shortly. I'll continue when I return
<jam> axw: thanks
<wallyworld> axw: can you peek at this small PR? I wanted to rename remote application "registered" to something hopefully more meaningful https://github.com/juju/description/pull/4
<axw> wallyworld: sure, just after I finish with jam's
<wallyworld> no rush
<axw> jam: done
<axw> wallyworld: LGTM. normally that'd require a version bump, but I'm assuming we don't care about compat yet
<wallyworld> axw: yeah, usages are behind a flag
<wallyworld> ty
<babbageclunk> jam: would you be able to take a look at https://github.com/juju/juju/pull/7126?
<jam> looking
<babbageclunk> jam: Thanks - there's some stuff in there I'm not really sure about - GCE networking seems much simpler than what we support so a lot of the fields on SubnetInfos and InterfaceInfos are kind of fudged.
<jam> babbageclunk: I have the feeling those structs are heavily influenced by AWS and MAAS and not so much by other providers, so we likely need a common ground. I'll look at it
<jam> babbageclunk: actually, having comments that link to the GCE docs on networking would probably also be good
<babbageclunk> jam: yeah, that's a really good call - I'll add that.
<jam> "Any communication between instances in different networks, even within the same project, must be through external IP addresses."
<jam> from https://cloud.google.com/compute/docs/networking
<jam> sounds like we can't really do private=>private networking on GCE either.
<jam> wallyworld: ^^
<wallyworld> hmmmm
<wallyworld> awesome
<babbageclunk> d'oh, didn't spot that
<wallyworld> we'll have to come up with a solution sooner rather than later it seems
<babbageclunk> Although all the machines in models managed by a controller will be on the same network.
<babbageclunk> At the moment there's no way to get a machine on a non-default network in GCE
<babbageclunk> (I hacked something in for my testing.)
<babbageclunk> wallyworld: ^
<wallyworld> babbageclunk: interesting, i didn't realise we had that restriction. but the external ip address restriction will still trip us up
<wallyworld> regardless of what network (single or otherwise) machines come up on
<babbageclunk> I don't see how, until they can be on different networks they're all on the same network and can talk on the cloud-local address right?
<babbageclunk> wallyworld: It's a much looser restriction than the azure one.
<wallyworld> babbageclunk: ah, ignore me, i misread the text
<babbageclunk> (As far as I understand)
<wallyworld> it should work then. easy enough to test
<babbageclunk> yay
<babbageclunk> duh
<wallyworld> we'd need to check that the firewaller discovers at least one subnet
<wallyworld> or else it will default to open to the world
<babbageclunk> wallyworld: yeah, true - I'll give it a try now
<wallyworld> to be sure just comment out the default 0.0.0.0/0 in the firewaller
<babbageclunk> ok
<jam> babbageclunk: don't we have to map 1 model == 1 project in GCE, which means a multi-model controller is spanning projects and thus networks?
<jam> babbageclunk: at the very least, if you have >1 credential, then you certainly can't have 1 controller == 1 project
<jam> which would hint that you have to span projects and thus networks
<babbageclunk> jam: well, at the moment we couldn't have controllers with models in different projects.
<jam> so if we're using https://cloud.google.com/compute/docs/networking#legacy_non-subnet_network
<jam> babbageclunk: that sounds like a *severe* problem with our GCE implementation
<jam> given we support multiple auth on other providers
<babbageclunk> jam: The qa account uses legacy, but it depends how the default network has been set up.
<jam> back to my thread, https://cloud.google.com/compute/docs/networking#legacy_non-subnet_network seems to say everything talks on the same single-subnet across all regions for a project
<babbageclunk> jam: not sure about the multi-auth bit
<jam> babbageclunk: regardless the default network, the goal has always been to support "juju add-model --credential=FOO" which would mean the controllers would span projects
<jam> it sounds like we should have modeled 1 Model == 1GCE Project, though I can imagine we didn't do that
<babbageclunk> jam: ok. I haven't seen that, but I think you're right.
<jam> babbageclunk: its possible we put "Project" into part of the GCE credentials
<babbageclunk> jam: We definitely don't have 1 model = 1 project at the moment. :(
<babbageclunk> jam: it is in the credentials - I haven't seen add-model --credential=FOO
<wallyworld> babbageclunk:  it seems you are saying the GCE provider is lacking the ability to create models with different credentials
<wallyworld> that is quite a bad problem then
<babbageclunk> wallyworld: I haven't tried, I don't know for sure that it doesn't
<babbageclunk> wallyworld: but all of the testing I've been doing is with one set of creds.
<wallyworld> ok, we'll need to check that tomorrow
<jam> fwiw, there has been a strong request for us to request 'static' external IP addresses on GCE as well
<jam> I haven't had time to dig into it
<jam> but apparently IP addresses are even-more dynamic on GCE
<jam> on AWS you only get new addresses if you "Stop" the machine explicitly and then "Start" it again
<jam> apparently on GCE just doing a "shutdown -r" will give you a new address
<mattyw> jam, you around to do a review?
<jam> mattyw: currently otp and doing another review, but i can add it to the queue if you like
<mattyw> jam, thanks very much https://github.com/juju/juju/pull/7109
<babbageclunk> wallyworld: hmm - looks like there's a nil pointer panic happening in the firewaller - could be the upgrade of the GCE client package. I'll chase it in the morning.
<wallyworld> babbageclunk: i've already started looking. it's happening for me too on LXD
<wallyworld> i'll look a bit more tomorrow as well if i don't find the cause tonight
<babbageclunk> wallyworld: ah, ok - I thought it was my fault but I guess it could be yours!
<wallyworld> :-)
<wallyworld> probably
<babbageclunk> catch you tomorrow
<wallyworld> a *lot* of refactoring has been happening
<wallyworld> ttyl
 * babbageclunk exits pursued by a bear.
<wallyworld> \o/
 * wallyworld is having a whiskey
<wallyworld> oh you said bear
<wallyworld> i thought you said beer
<perrito666> morning
<mattyw> perrito666, morning morning
<jam> babbageclunk: ping
<jam> morning perrito666
<perrito666> jam: I am supposed to have a meeting with you now :)
<jam> hi perrito666, sorry caught up in a review brt
<perrito666> tx
<tasdomas> perrito666, hi - could I get a review: https://github.com/juju/juju/pull/7114
<perrito666> tasdomas: not soon, if you are in a hurry I suggest you find a backup reviewer :)
<tasdomas> perrito666, thanks ;-]
<perrito666> jam: I feel like we are cheating you, you have little day left so our "have a good day" is not as effective for you
<jam> perrito666: I carry it over for the next day :)
<jam> matty
<jam> sorry, was trying to search for your name
<jam> mattyw: https://github.com/juju/juju/pull/7109 reviewed
<mattyw> jam, ack thanks
<wallyworld> hml: hey, i'm not sure what happened before, but i re-added our 1:1 to the calendar. does the time suit?
<hml> wallyworld: the time works for me - our previous time was in the middle of your night.  :-)  most likely lost off your calendar
<wallyworld> hml: great. it is on the back of the release call so if i'm a minute late i won't be long
<perrito666> hml: wallyworld does not need to sleep, he is a robot of sorts, the only activity he has is soccer :p it's like an aussie terminator
<wallyworld> i need something to keep me sane
<wallyworld> i also need coffee
<perrito666> it's as if Terminator had been made with Paul Hogan as the main character
<wallyworld> lol
<hml> perrito666: ha!  I've seen him online and talking around lunch in the US and wondered.
<perrito666> hml: paul hogan or wallyworld ?
<hml> perrito666: wallyworld - haven't seen paul in a while
<menn0> gah! httptest.ResponseRecorder.Result() doesn't exist in Go 1.6 :(
<menn0> and i've used it quite a bit
<perrito666> and we are using go 1.6 for?
<menn0> that's the minimum Go we support but I'm using Go 1.8 on my machines
<menn0> the official Go in our ubuntu releases is Go 1.6
<perrito666> I thought we stopped caring about that with snaps
<menn0> perrito666: the Go snap only started working properly very recently (and still has a GOROOT issue which mwhudson is fixing)
<wallyworld> and our tests fail with go 1.8
<wallyworld> go 1.8 is stricter on url parsing for example
<perrito666> I have not tried juju on 1.8 yet, I still use 1.7
<wallyworld> we want to move to go 1.8 - the process is underway
<wallyworld> juju works fine with 1.8, just the unit tests need fixing
<perrito666> wallyworld: have you seen plugins? they are very cool, still very beta, but looks like a nice thing to look for the future
<wallyworld> haven't seen them yet
<perrito666> wallyworld: here, have some shameless plug by me :p https://perri.to/2017/03/go-plugins-and-content-delivery/
<wallyworld> interesting
<babbageclunk> wallyworld: reviewed the first one - looks good other than what I think is a logic bug in resolving the filters.
<babbageclunk> wallyworld: looking at the other one now
<wallyworld> babbageclunk: tyvm, will look after my meetings this morning
<babbageclunk> wallyworld: cool
<babbageclunk> wallyworld: oh my god, I was expecting the second one to be smaller.
<wallyworld> babbageclunk: are you just looking at 2nd commit?
<wallyworld> it was branched off the first one
<babbageclunk> wallyworld: oh, yay!
<wallyworld> it's in the description :-)
<babbageclunk> wallyworld: I was thinking some of the changes looked familiar.
<perrito666> babbageclunk: that did sound like a Cards Against Humanity white card :p
<babbageclunk> perrito666: yeah, I thought that as I was typing it.
<wallyworld> babbageclunk: what i'll do is make all fixes to the second branch and just do the one landing
<babbageclunk> wallyworld: ok
<anastasiamac> wallyworld: could u plz reply to question on bug 1567169 - tyvm
<mup> Bug #1567169: juju deploy bundle does not honor existing machine definitions <conjure> <deploy> <native-deploy-gap> <s390x> <uosci> <OpenStack Charm Test Infra:Confirmed> <juju:Triaged> <juju-core:Invalid> <https://launchpad.net/bugs/1567169>
<wallyworld> anastasiamac: it's more a matter of when will the logical mapping be done. i'm not sure if it's on the radar yet
<wallyworld> i had thought someone was looking at it but not sure
<anastasiamac> wallyworld: the question was whether there is a workaround and there is none.
<wallyworld> outside of a bundle, you can use placement directives to some extent but i don't know off hand what bundles support in that area
<anastasiamac> wallyworld: my understanding - bundles do not support deploying to existing machines. no workaround. plz add a politically correct response since u've fielded previous query.. It looks to me like the user is asking your opinion ;)
<wallyworld> no but bundle may support placement directives which could be used in some way. i can't answer until that is known
<anastasiamac> wallyworld: right now, the answer that they want is to the question "whether there is a workaround". Current answer is "no workaround at this stage"
<wallyworld> no, there may be a workaround, that's my point
<wallyworld> placement directives might be able to be used if supported
<anastasiamac> wallyworld: placement directives are not supported in bundle AFAIK
<anastasiamac> wallyworld: however
<anastasiamac> wallyworld: if u know otherwise, plz indicate who is best person to ask, if not u
<wallyworld> just looking at the code, the bundle struct does have them
<wallyworld> but the comments are unclear
<wallyworld> anastasiamac: more investigation is needed since TestDeployBundleMachinesUnitsPlacement and the bundle code seem to indicate that the placement they want should still work even though that contradicts our stated position and the bug report
 * babbageclunk runs
#juju-dev 2017-03-21
<axw> wallyworld: I'm going to have to take off for a while after standup, gotta take my brother to the GP
<wallyworld> no worries
<wallyworld> hml: standup if you're free?
<menn0> thumper: quick one pls? https://github.com/juju/juju/pull/7129
 * thumper looks
<thumper> approved
<menn0> thumper: cheers
<anastasiamac> thumper: wallyworld: menn0: coming?
<menn0> anastasiamac: yep
<wallyworld> jam: if you had time later to talk about CMR, can you ping me?
<wallyworld> axw: you have time for a smallish review? https://github.com/juju/juju/pull/7130
<axw> wallyworld: sorry, ended up having to take my brother to the hospital ED, had to get a cast on his broken foot
<axw> so today was a write off
<wallyworld> axw: hope he's ok! how did he break it?
<axw> wallyworld: skateboarding
<wallyworld> kids these days :-)
<ashipika> he should stick to cricket ;)
<axw> wallyworld: yes... kids... this is my older brother :p
<wallyworld> lol
<wallyworld> still younger than me
<ashipika> any of you core gents care to have a look at https://github.com/juju/juju/pull/7109 ?
<wallyworld> sure
<ashipika> wallyworld: thanks
<wallyworld> ashipika: there's an issue in that a new facade method is added to uniter facade without bumping the version to 4
<wallyworld> and the uniter client needs to check that the version it is talking to is 4 and fail gracefully if not
<wallyworld> there's a BestFacadeVersion method or similar that is used to see what facade version is available
<wallyworld> the client calls that and if it gets 3 back, it knows the new SLA method is not available
<wallyworld> s/the client/the caller
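A sketch of that caller-side check; apiRoot stands in for the api root, and the error return is illustrative (errors is github.com/juju/errors):

    // Ask the controller which Uniter facade versions it serves before
    // attempting the new method; old controllers get a graceful failure.
    vers := apiRoot.BestFacadeVersion("Uniter")
    if vers < 4 {
        return errors.NotSupportedf("SLA calls on uniter facade v%d", vers)
    }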
<axw> wallyworld: I'm going to be out a few hours tomorrow too. dentist for myself, then brother back to the GP. le sigh
<wallyworld> axw: no worries, whatever you need
<wallyworld> dentist = wallet pain :-(
<ashipika> wallyworld: ah, yes.. forgot about versions.. )
<wallyworld> ashipika: yeah, PITA i know
<ashipika> wallyworld: well.. there's pain and there's the dentist :) i'll take core over the dentist..
<wallyworld> :-)
<jam> wallyworld: were there other CMR things, or just what you sent in the email?
<wallyworld> jam: for now the email. was my last comment where i restated the semantics for network-get as per your thinking?
<wallyworld> ie in my reply
<jam> wallyworld: yeah, context implies "what do you want for that machine" and no relation context is "what do I need for local configuration"
<tasdomas> wallyworld, could I ask you to take a look at this PR as well: https://github.com/juju/juju/pull/7114 ?
<wallyworld> jam: i really want to try and avoid using public addresses where possible and supported by the provider and deployment etc, but it will be hard to do that without a lot more development
<wallyworld> tasdomas: sure, ok
<wallyworld> jam: also, we'll need to see if we can allow charms to work unchanged via unit-get; not sure if that's feasible yet. but i'd prefer not to invest a lot of time into making unit-get work with CMR if possible or if we can justify avoiding it and putting the effort into making the charms more modern and use network-get
<jam> wallyworld: you're looking more for 'relation-get' to give an appropriate 'private-address' for the remote side
<jam> 'unit-get' is about local config
<wallyworld> tasdomas: you already have 2 lgtms :-) did you still want me to look?
<jam> eg "mysql$ relation-get wordpress" should give the same thing as "wordpress: network-get mydb -r mysql"
<tasdomas> wallyworld, well, if you could quickly glance over the PR, I'd really appreciate it
<wallyworld> tasdomas: ok
<wallyworld> jam: relation-get is for the whole relation databag, right. there in general won't be a "wordpress" key. it will be something like "private-address" from memory
<axw> wallyworld: reviewed your PR
<wallyworld> axw: ty
<jam> wallyworld: relation-get private-address is what they generally call
<jam> private-address is the one bit of the data bag that Juju puts automatically
<wallyworld> jam: right, that's my understanding also
<jam> (more interesting is the call in the other direction, to be fair, but I started writing before thinking all that through)
<wallyworld> jam: so the issue is juju doesn't really know explicitly what the charm wants to use "private-address" for
<wallyworld> whereas when it calls network-get we can make a more informed decision
<wallyworld> as to "my local ddress" or "what i advertise"
<jam> wallyworld: so *relation-get* is fairly clear, because it is asking for "give me the IP address of the other application"
<jam> so that I can talk to it
<jam> the issue is that Wordpress and Mysql might actually coordinate on a different key
<jam> like say
<jam> URL
<wallyworld> right, like postgres does
<jam> so we want 'network-get' (instead of unit-get) so that when MySQL populates the URL field
<wallyworld> but postgres would make that url by calling network-get
<jam> it can give the context for what IP address it wants to put in there
<wallyworld> exactly
<wallyworld> that's why i want to not use unit-get at all
<wallyworld> and migrate the charms we care about
<jam> wallyworld: my understanding is that most charms actually use 'relation-get private-address' over custom URLs
<jam> I don't think I've advocated fixing unit-get
<jam> just 'relation-get private-address' and 'network-get'
<wallyworld> my issue with the former is we are still guessing a bit at the purpose of why the charm wants to use private address rather than being explicit
<wallyworld> use of network-get ensures we don't guess anything
<wallyworld> but we can see how it plays out i guess
<wallyworld> axw: will comment - list vs find - list is used by model admin, it exposes connection count, charm name, other details. find is used to discover what endpoints are available. they both read from the same part of the model, hence shared code but different facade apis
<wallyworld> tasdomas: sorry, got caught up, looking now
<jam> wallyworld: again, we know that when Wordpress calls relation-get private-address for MySQL there isn't anything *Wordpress* could do but connect to MySQL
<jam> which is different from "maybe I need to set a bind address for MySQL in the MySQL charm, or I need to give the public address because someone is calling me from outside my network"
<wallyworld> jam: yeah, maybe i'm being overly paranoid about thinking there could be another reason
<wallyworld> jam: the fun bit will be figuring out what to provide for that private-address value. eg for aws with models in same region etc, it can be the cloud local address. but for azure it will need to be the public address etc
<wallyworld> and that then affects how the firewall ingress rules are done
<jam> wallyworld: so IMO, whatever you would put in "network-get -r relation --primary-address" on one side would be what you would put in "private-address" to be seen from the other side
<jam> if we started with CMR == public-address that would be a reasonable starting point given everything we've seen so far
<jam> in fact, only in the very special "its all my account in AWS" is it possible to use private address
<jam> cause otherwise its a "different network" even if they are both private, etc.
<rick_h> jam: anything to chat over today?
<jam> hi rick, I'll brt
<wallyworld> jam:  true, but i was hoping to at least be able to reasonably short circuit where possible
<wallyworld> but yeah, public address would work, so long as a public address is available
<wallyworld> not always the case though
<wallyworld> eg openstack
<wallyworld> tasdomas: quick comment - the tables in the CLI output should no longer use CAPS for headers
<wallyworld> "Title case" instead
<tasdomas> wallyworld, ah, ok
<tasdomas> wallyworld, thanks
<wallyworld> np
<wallyworld> tasdomas: we made that change for 2.0 as per Mark's guidance
<wallyworld> so best to change the budget commands also :-)
<tasdomas> wallyworld, will do - thanks for the heads up
<wallyworld> np
<mup> Bug #1674655 opened: juju kill-container fails <juju-core:New> <https://launchpad.net/bugs/1674655>
<ashipika> wallyworld: the current uniter facade version is v4.. having added that method, do i need to bump it up to v5?
<ashipika> wallyworld: or just fail gracefuly if talking to v3..
<ashipika> or lower
<wallyworld> ashipika: the issue is that 2.0.x and 2.1.x will be using v4, right? a hacky way would be to make the api call and handle the NotImplemented error as an indication the api is not supported
<wallyworld> CodeNotImplemented i think
<wallyworld> that would be less strictly correct but still work
<wallyworld> bumping to v5 shouldn't take much effort though
<perrito666> just bump the version, is almost free
<wallyworld> just a few lines to register the facade
<wallyworld> we do that for the storage facade for example
<ashipika> ack.. v5 then
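The registration thumper and wallyworld describe as just a few lines, sketched with an illustrative constructor name and assuming the apiserver/common helper of this era:

    // Registering v5 alongside the existing versions leaves v4 intact
    // for 2.0.x/2.1.x clients and exposes the new method only on v5.
    common.RegisterStandardFacade("Uniter", 5, NewUniterAPIv5)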
<jam> hi perrito666
<perrito666> jam: hi
<mup> Bug #1674655 changed: juju kill-container fails <juju-core:Invalid> <https://launchpad.net/bugs/1674655>
<frankban> wallyworld, perrito666: do you remember what constraints can be applied to lxd and to kvm containers when creating a machine in juju?
<perrito666> frankban: no
<wallyworld> frankban: for lxd at least, i don't think we do much. we may honour mem for kvm
<wallyworld> not 100% sure though
<frankban> wallyworld: my guess was mem and disk for both?
<frankban> I'll go with that
<wallyworld> could do, yeah, sounds plausible
<frankban> wallyworld: ty
<wallyworld> frankban: i just checked
<wallyworld> mem, cores, disk for kvm
<frankban> wallyworld: ah cool thanks
<frankban> cores
<wallyworld> frankban: and nothing that i can see for lxd
<frankban> wallyworld: ok the GUI will reflect that
<wallyworld> frankban: we do need to introduce max constraints for lxd so the contaners don't use all the memory for example
<wallyworld> right now. constraints are min
<frankban> wallyworld: interesting, that will be a good challenge for UX
<wallyworld> yes
<wallyworld> we're thinking of new, separate constraints. max-mem etc
<frankban> ah, cool
<frankban> wallyworld: so, when using the lxd provider, do we support any constraints instead?
<frankban> wallyworld: for top level machines (lxd containers in that case)
<wallyworld> frankban: as far as i am aware, any lxd instance created ignores constraints. the host machine on which the lxd container runs is a separate issue
<frankban> wallyworld: ok thanks
<frankban> wallyworld: last question, I know that the provider type for local models in juju2 is "lxd". was it "local" or "lxc" in juju1?
<wallyworld> frankban: local i think from memory
<frankban> wallyworld: I think I remember it was local yeah
<frankban> thanks
<wallyworld> that was a lot of wine ago
<ashipika> wallyworld: had a discussion with mattyw.. and decided not to bump the facade version just because of one added method.. the api method will fail gracefully when talking to an older facade though
<ashipika> wallyworld: otherwise we'll exceed maxInt too soon :)
<wallyworld> ashipika: it's only a couple of lines of code - probably more to fail gracefully
<wallyworld> literally one register call
<wallyworld> ashipika: see apiserver/application/application.go
<wallyworld> the only other change is to call BestFacadeVersion in the caller
<ashipika> wallyworld: thy will be done..
<wallyworld> vers := root.BestFacadeVersion("Uniter")
<wallyworld> ty :-)
<wallyworld> if vers < 5 {} else {}
<ashipika> BestAPIVersion or BestFacadeVersion?
<ashipika> wallyworld: ^
<wallyworld> ashipika: i use BestFacadeVersion
<wallyworld> ashipika: but
<wallyworld> BestAPIVersion() can be used directly on the client
<wallyworld> both work
<wallyworld> so if you already have the client created, BestAPIVersion
<wallyworld> if you just have the api root, BestFacadeVersion
<mattyw> wallyworld, any ideas on the best way to test this?
<mattyw> wallyworld, doesn't look like there are any existing mocks we can use?
<wallyworld> mattyw: you mean unit test or real world test? for unit test, there is a way but i forget, i can check
<ashipika> wallyworld: unit test
<mattyw> wallyworld, unit test, we're trying to find examples, but looks like we might have to write the mocks ourselves
<wallyworld> mattyw: i can't find any tests for existing things like this. we used to have a way. for this PR, i would ok it so long as the new stuff works
<ashipika> wallyworld: it appears that if i create a uniter.State with api/base/testing.APICallerFunc.. the testing.APICallerFunc returns BestFacadeVersion 0..
<ashipika> wallyworld: so a glitch in the testing.APICallerFunc allows me to test that
<wallyworld> ashipika: that could well be. i've not dug into that code in a while. if you have found a way then awesome
<ashipika> wallyworld: fwereade has a todo on the BestFacadeVersion in the APICallerFunc.. but it has never been done.. long term backlog
<wallyworld> yeah, we have a lot of tech debt
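A sketch of the trick ashipika lands on, assuming the api/base/testing stub (aliased basetesting here): its BestFacadeVersion always reports 0 (fwereade's TODO), which conveniently stands in for an old controller:

    apiCaller := basetesting.APICallerFunc(
        func(objType string, version int, id, request string, arg, result interface{}) error {
            // The graceful-fallback path under test should never reach
            // the API, so any call landing here fails the test.
            return errors.New("unexpected API call")
        },
    )
    // apiCaller.BestFacadeVersion("Uniter") reports 0 here, so a client
    // built over it must take its "facade too old" branch.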
<perrito666> jam: ?
<jam> omw
<perrito666> o my wod?
<ashipika> wallyworld: pushed fixes for https://github.com/juju/juju/pull/7109 ..
<ashipika> wallyworld: no i didn't
<wallyworld> ashipika: so no version bump?
<ashipika> wallyworld: i.was.changing.the.wrong.branch.please.don't.ask.
<ashipika> wallyworld: there.. i pushed.. this time the right branch
<wallyworld> ah, that's better
<anastasiamac> jam: feature tst PR for 'get-config' and 'juju config' as discussed: https://github.com/juju/juju/pull/7132
<wallyworld> ashipika: looks good with a rename suggestion. thanks for the extra work on the versioning
<perrito666> ok, while my tests re-run on CI ill go run a couple of errands, bbl
<ashipika> wallyworld: renamed.. thanks for all the help!
<jam> anastasiamac: reviewed
<perrito666> annyone in need of a review? this is THE time to ask
<mattyw> thumper, ping?
<thumper> mattyw: hey, otp just now
<mattyw> thumper, no problem will you be around in about 45 minutes to talk model migration?
<mattyw> thumper, irc will be fine - unless you really want to see my face
<thumper> mattyw: I have about 9 minutes now
<bdx> how's it going all?
<thumper> well... not everything is on fire
<bdx> I'm wondering, how a new dependency is added to dependencies.tsv?
<bdx> eehh ... good .. I guess
<bdx> sorry
<bdx> ha
<rick_h> thumper: is so full of happy thoughts
<rick_h> thumper: bdx wants to play with newrelic and juju and is trying to see how to inject the newrelic bits
<rick_h> thumper: any hints on how to add stuff is <3
<thumper> bdx: new library dependencies normally have to be approved by the tech board
<thumper> but how I add things is do write the code that uses the libraries
<thumper> then run `godeps ./... | grep 'newlibname'`
<thumper> then add that line into the dependencies.tsv file in alphabetical order
<thumper> that way you don't override the whole file
<thumper> but just add new bits
<thumper> bdx: make sense?
<bdx> thumper: yes, perfect. thank you!
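For reference, dependencies.tsv rows are godeps' four tab-separated columns (project, vcs, revision, revision time); a hypothetical line for the library bdx mentions, with placeholder hash and timestamp, would look like:

    github.com/newrelic/go-agent	git	0123456789abcdef0123456789abcdef01234567	2017-03-01T12:00:00Z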
<bdx> thumper: looking at the newrelic go apm agent installation instructions, "Import the github.com/newrelic/go-agent package in your application. Create a config and application in your main function or in an init block." - where would be the best place to place this in the juju codebase?
<bdx> thumper: in here -> https://github.com/juju/juju/blob/staging/cmd/jujud/main.go#L39 ?
<thumper> no
<thumper> I'd keep all newrelic stuff isolated in the new relic provider
<bdx> thumper: newrelic provider - what is this?
<thumper> oh...
<thumper> what is newrelic exactly?
<bdx> thumper: its a monitoring/alerting platform
<bdx> thumper: they have an apm product for go applications
<rick_h> thumper: you should test out the free newrelic platform on your django app
<bdx> I was trying to play around with seeing what types of metrics we can get for juju via the go agent
<rick_h> thumper: <3 getting hints at slow queries, etc
<bdx> thumper: (sorry about the acronyms) APM - appication performance monitor
<bdx> http://imgur.com/ysLJimx
<thumper> otp again...
<bdx> thumper: speaking of acronyms ... otp?
<bdx> one true pairing?
<bdx> on the plane?
<bdx> :)
<thumper> on the phone
<niedbalski> anastasiamac, thumper I can't locate the 1.25 backport of lp#1613855
<bdx> quick question
<bdx> is the jujud binary that runs on controllers the same jujud binary that runs on juju deployed machines?
<bdx> I can't seem to identify any difference between the two ...
<menn0> bdx: the controller machine agents, workload machine agents and unit agents all use the same jujud binary
<menn0> bdx: the behaviour of the binary changes depending on the context
<menn0> bdx: internally, it's the workers that run that change the behaviour the most
<bdx> menn0: thanks for that
<bdx> the context provided by what is in mongo?
<bdx> menn0: "it's the workers that run that change the behaviour the most" - does this mean that the workers (that exist in jujud) will run in the controller context, but not outside of that?
<menn0> bdx: sorry, on a call. will get back to you.
<bdx> menn0: np
<rick_h> thumper: sent you an email with details, test it running and added your ssh key so you can sshuttle to the network and view the grafana as it runs.
 * rick_h goes back to dishes
<stokachu> im going to try and get a macOS version of conjure-up ready before the beta release on thursday
<stokachu> is someone able to push juju 2.2 beta to brew?
#juju-dev 2017-03-22
<anastasiamac> veebers: sinzui: balloons: ^^
<veebers> anastasiamac, stokachu: I believe that only either sinzui or anastasiamac can do that due to having macs?
<anastasiamac> veebers: my mac is dual boot but does not boot into macOS since Oct ;( sinzui maybe the only one \o/
<stokachu> ive got a mac now too so i can do a build if need be
<stokachu> anastasiamac: just have sinzui sync up with me if he doesn't think he'll get to a mac build for the beta
<anastasiamac> stokachu: \o/ as long as u don't bootstrap controller on mac - we don't support it
<stokachu> anastasiamac: just localhost right?
<stokachu> anastasiamac: im going to blacklist localhost in conjure-up on a mac anyway
<anastasiamac> stokachu: AFAIK we only support clients on Mac... sounds good :D
<stokachu> hmm
<thumper> rick_h: let me know if you come back and have a few minutes
<thumper> rick_h: I think we have this one solved
<rick_h> thumper: what's up?
<thumper> rick_h: looking at that dashboard, memory seems to be holding steady
<thumper> rick_h: also, I'd like you to run something for me on the apiserver
<thumper> rick_h: juju-statepool-report
<rick_h> k, sec
<rick_h> thumper: https://pastebin.canonical.com/183289/
<thumper> perfect
<thumper> rick_h: fixed that leak
<thumper> hazaah
<rick_h> thumper: yea, looks like it
<rick_h> thumper: ty!
<thumper> happy?
 * rick_h will find something else to be displeased about :P
<rick_h> thumper: very much so, I'll let this run for a while more and then get screenshots/etc to let folks know we've tested it
<thumper> sure
<rick_h> thumper: I'd also like to chat with your QA folks to see how we add some sort of regular check like this
<rick_h> thumper: it's pretty scriptable at this point with the setup here, but I assume we'll want to do this somewhere maybe on a cloud or something
<rick_h> actually, I should run this on gce or aws and see if it has the same issue w/o the fix
 * thumper nods
<thumper> may well do
<menn0> thumper, wallyworld, axw, jam: due to kids stuff, tech board may be tricky for me today. i'll try to make it if I can.
<wallyworld> ok
<thumper> menn0: ack
<menn0> wallyworld, axw: here's the big apiserver cleanup: https://github.com/juju/juju/pull/7133
<menn0> no more global facade registry
<wallyworld> menn0: will look after standup
<menn0> wallyworld: no rush
<axw> menn0: looks awesome at a first glance. have to go shortly, or I'd be all over it
<menn0> axw: no rush I have to pick up kids anyway
<axw> thumper: your PR is against staging
<thumper> bollocks
 * thumper fixes
<thumper> changed
<thumper> $ juju bootstrap --agent-version 2.1.0 dev leak-test-2.1
<thumper> error: requested agent version major.minor mismatch
<thumper> wat?
<thumper> why can't my 2.2-beta1 local juju deploy a 2.1 controller?
<thumper> that's dumb
<thumper> ugh...
<thumper> we do that because our tool checking code is shite
<thumper> what a POS
<wallyworld> thumper: from memory the policy was to require the bootstrap client and agent versions to match down to the minor version number
<wallyworld> it never used to be like that a long time ago, but due to potential incompatibilities a policy decision was made
<jam> thumper: I've been trying to draw out your ping vs pong and ReadDeadline stuff, I think we might have a small problem with timing
<wallyworld> menn0: love the pr
<menn0> wallyworld: \o/
<jam> thumper: as for minor version stuff, we did it because we broke bootstrap at least once between minor versions and we never wrote code to say "if you're this version controller, use *this* bootstrap code"
<thumper> jam: working on interactive stuff
<jam> if we change the arguments to "bootstrap-state" for example (which is what we did)
<jam> thumper: specifically, you only increment 'readdeadline' when you get a Pong
<thumper> yeah
<jam> so if you have a Ping and an immediate Pong, then if your next Ping happens and its pong is delayed longer than the first
<jam> you might hit timeout
<jam> It's easier if it is drawn out, but I have to get the kid ready for school.
<menn0> bdx: just remembered I still owed you an answer. sorry.
<thumper> jam: yeah, I'm hitting something in my testing, and adding logging to work it out
<jam> thumper: also, we need to update the Write deadline every time we do any writes, not just with the Pinger
<thumper> but yes, I'm hitting an unexpected close
<wallyworld> menn0: thumper: jam: i may well miss tech board, i forgot i need to drive my son to see a surgeon
<jam> thumper: because Deadline is for *everything* on the socket
<menn0> bdx: the distinction between machine and unit agent is determined by the jujud command line
<thumper> jam: but these only write control messages
<jam> thumper: where are we writing things like the actual log messages themselves?
<thumper> we only read
<jam> or is the ping initiated only on the reader side
<thumper> yes
<menn0> bdx: there are further distinctions if the jujud is acting as a machine agent. the agent finds out its role(s) over the API (ultimately from Juju's MongoDB) and runs the workers required for each role.
<jam> thumper: k. I think we might need to set some sort of Read delay whenever we send a Ping, possibly smaller than we set it when we get a Pong
<jam> there should be a "how often do we Ping" and a "how long can a Pong" take. and getting a Pong would probably set the Deadline to Now()+time-to-next-ping+time-for-response
<thumper> jam: I cribbed this from all the online docs and examples around this handling
<jam> and sending a Ping would set it to Now()+time-for-response
<jam> say it might take 10s for a Pong to respond, but it could respond in 1ms.
<jam> and our ping interval is 20s
<thumper> jam: yeah, I was wondering about that delay, so that is why I'm doing some interactive testing with additional logging
<jam> 0s, Deadline=20s, Ping, 0.1s Pong, Deadline=20.1s, 20s Ping, 20.1s Timeout, 22s Pong
<jam> thumper: ^^ check my work, but I think that's the failure case I came up with
<thumper> I think the key is to make sure the ping frequency is sufficiently smaller than the pong delay added
<thumper> perhaps we set pong delay to 90s and ping frequency to 60s
<thumper> we ping every 60 seconds
<thumper> and that gives a 30s window for pong to come back
<thumper> even though we expect a subsecond pong
<thumper> each pong pushes the read window out 90s
<jam> thumper: so I think they are functionally equivalent, I'm just not sure which is easier for people to understand
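A sketch of the scheme thumper and jam converge on, using gorilla/websocket and the 60s/90s numbers above; conn is a *websocket.Conn, and imports (time, github.com/gorilla/websocket) plus error handling are trimmed:

    const (
        pingInterval = 60 * time.Second // how often the reader pings
        pongWindow   = 90 * time.Second // how long a pong may take
    )

    // Every pong pushes the read deadline out a full window, so a pong
    // only needs to arrive within 30s of its ping to keep the socket open.
    conn.SetReadDeadline(time.Now().Add(pongWindow))
    conn.SetPongHandler(func(string) error {
        return conn.SetReadDeadline(time.Now().Add(pongWindow))
    })

    go func() {
        ticker := time.NewTicker(pingInterval)
        defer ticker.Stop()
        for range ticker.C {
            if err := conn.WriteControl(websocket.PingMessage, nil,
                time.Now().Add(10*time.Second)); err != nil {
                return // write failed; the blocked read will hit its deadline
            }
        }
    }()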
<jam> thumper: menn0: tech board?
<babbageclunk> jam: when you get a chance can you take another look at https://github.com/juju/juju/pull/7126?
<babbageclunk> jam: I'll do separate PRs to store the networks in state and tag subnets with the vpc-id in ec2.
<anastasiamac> babbageclunk: axw: wallyworld: PTAL https://github.com/juju/names/pull/78
<wallyworld> anastasiamac: lgtm, ty!
<axw> wallyworld: just got back, need to eat then I'm free to chat
<wallyworld> axw: rightio. i'm about to go get lachie from the train, but will be home in say 20 or 25 minutes
<axw> ok, sounds good
<babbageclunk> wallyworld, axw: I'm here too!
<wallyworld> babbageclunk: awesome, time for a chat in the standup HO?
<wallyworld> axw: ^^?
<axw> wallyworld: sure, brt
<anastasiamac> babbageclunk: axw: wallyworld: dependencies update PTAL : https://github.com/juju/juju/pull/7135
<wallyworld> anastasiamac: lgtm
<jam> babbageclunk: reviewed
<jam> rick_h: did you restart your testing? I tried logging into Grafana again, but admin/testings didn't work
<rick_h> jam: yes, remove the s
<rick_h> jam: just sent an email on the new run starting
<rick_h> jam: there's a race in the grafana charm: if you set the password too close to the deploy (before it's up), the password isn't set right, so you end up changing it again (add an s to the end, heh)
<jam> rick_h: ah, it doesn't respect config at boot, only if it gets set once the app is actually up?
<jam> sucky
<rick_h> jam: rgr, seems that way.
<rogpeppe> here's a refactoring of the juju login command. would really appreciate a quick review as this is critical for us. https://github.com/juju/juju/pull/7136
<rogpeppe> jam: any chance you might be able to take a look?
<niedbalski> thumper, anastasiamac wallyworld https://bugs.launchpad.net/juju-core/+bug/1675154 , any idea about this one?
<mup> Bug #1675154: Upgrade not possible from 1.25.6 to 1.25.10  <sts> <juju-core:New> <https://launchpad.net/bugs/1675154>
<mup> Bug #1675154 opened: Upgrade not possible from 1.25.6 to 1.25.10  <sts> <juju-core:New> <https://launchpad.net/bugs/1675154>
<thumper> niedbalski: not at this stage, but I need to step away to eat breakfast
<mattyw_> thumper, let me know when you're back from breakfast, I have reviews for you if you have time
<thumper> mattyw_: here
<mattyw_> thumper, so I *think* descriptions might be right now? https://github.com/juju/description/pull/5
 * thumper looks
<mattyw_> thumper, and after that we should talk about my core pull request - I think I know why that test was passing (when it shouldn't have) but we should talk about it
<thumper> ok
<thumper> mattyw_: review done
<mattyw_> thumper, awesome thanks, I'll fix it up now and hassle you again. Would you have time to discuss https://github.com/juju/juju/pull/7128 in the meantime or would you rather I get descriptions/5 out of the way first?
<thumper> we can discuss now
<thumper> mattyw_: noticed why it was passing
<mattyw_> thumper, I think the reason the test was passing without me having done the serialisation code is that it all comes down to s.importModel. That calls state.Export and state.Import. But at no point does that hit yaml
<thumper> because the import/export tests don't serialize/deserialize the model
<thumper> right
<mattyw_> awesome
<thumper> it assumes that the model package is tested
<thumper> as it should
<mattyw_> so if I get the description package sorted is the rest of the stuff in that pr the right approach?
<thumper> mattyw_: so when the description PR is done, we are good
<thumper> yep
<mattyw_> awesome, thanks very much for the help. I'll fix it up now
<mup> Bug #1675154 changed: Upgrade not possible from 1.25.6 to 1.25.10  <sts> <juju-core:Invalid> <https://launchpad.net/bugs/1675154>
<mattyw_> thumper, ptal https://github.com/juju/description/pull/5/files
<mattyw_> thumper, sorry to hassle you - but are you able to take another look?
<thumper> mattyw_: sucked into calls now
<thumper> I have it up
<mattyw_> thumper, ok - I'll go afk for a bit if you could take a look at that and https://github.com/juju/juju/pull/7128/files when you can. I'll keep checking back every so often in case there are things to do
<babbageclunk> wallyworld: you about? (I mean, you probably shouldn't be, but if you are...)
<wallyworld> babbageclunk: in release call
<wallyworld> will be done soon
<babbageclunk> wallyworld: ok thanks
<wallyworld> babbageclunk: free now
<babbageclunk> wallyworld: hey - so jam approved my PR, but in this comment (https://github.com/juju/juju/pull/7126#pullrequestreview-28347395) he suggested deploying to an imported subnet.
<wallyworld> looking
<babbageclunk> wallyworld: I'd like to do that test, but I'm not sure how to - should I use some kind of placement directive?
<wallyworld> babbageclunk: there is a placement directive for subnets, yes
<wallyworld> --to subnet=<blah>
<wallyworld> i think
<wallyworld> where blah is a subnet id or cidr
<babbageclunk> wallyworld: I don't think that's implemented in GCE though. It's probably not that much work to add...
<wallyworld> babbageclunk: in that case my view is we land what you have
<babbageclunk> ok, doing that
<wallyworld> babbageclunk: we can add a card to the board to remind us to come back and check
<babbageclunk> ok - I'll do that
<wallyworld> hml: did you end up using the standalone mysql charm for your testing? i assume that's all gone ok?
<hml> wallyworld: yes, i used the standalone mysql - and everything worked.
<wallyworld> yay
<hml> wallyworld: added the tests, one final sanity check and i'll do the PR
<wallyworld> let's make sure a bug is filed against the percona cluster charm
<hml> wallyworld: okay, will file - anything specific i should say beyond the obvious
<wallyworld> not really
<babbageclunk> wallyworld: should I move on to storing network for the subnet in state? or something else
<wallyworld> babbageclunk: would be good to round out the current gce work, agree?
<babbageclunk> wallyworld: yeah, I think so
<wallyworld> should hopefully be fairly straightforward
<babbageclunk> ok, doing that.
<thumper> mattyw_: description PR merged
<babbageclunk> wallyworld: hmm - should the network name be part of the subnet's document ID?
<babbageclunk> wallyworld: It should, right? We could have the same cidr in different networks.
<wallyworld> babbageclunk: it depends what's needed to make the id unique and what asserts are used
<wallyworld> what do we do for ec2?
<babbageclunk> wallyworld: the id at the moment is just the CIDR.
<wallyworld> right, that sounds wrong
<babbageclunk> at the moment we don't capture vpc-id in ec2
<wallyworld> or maas?
<babbageclunk> No, jam was saying maas doesn't have that concept. (Although what are fabrics in maas?)
<babbageclunk> oh, no - fabrics aren't the same as different networks. https://docs.ubuntu.com/maas/2.1/en/intro-concepts#fabrics
<mattyw_> thumper, many thanks, how's the core one now? https://github.com/juju/juju/pull/7128
<thumper> mattyw_: need to update description hash for dependencies
<thumper> mattyw_: rest looks good
<wallyworld> babbageclunk: it looks like the state model is wrong - it assumes cidr is unique
<wallyworld> babbageclunk: which may be true for maas
<wallyworld> but you are saying it's not true for gce?
<mattyw_> thumper, I think the old hash was ok - but I've updated it now anyway
<babbageclunk> wallyworld: I don't think it's true for GCE no, hang on, checking docs
<thumper> mattyw_: well, you should always refer to the mainline hash :)
<mattyw_> true
<thumper> mattyw_: LGTM
<mattyw_> thumper, many thanks
<babbageclunk> wallyworld: https://cloud.google.com/compute/docs/networking#ip_ranges
<babbageclunk> wallyworld: ranges must be unique and non-overlapping within a network
<wallyworld> babbageclunk: hence cidr as doc id is ok then, right?
<babbageclunk> wallyworld: but couldn't we have subnets from different networks?
<babbageclunk> Or should I not worry about that for now?
<wallyworld> maybe, but i'm not familiar enough with the juju network data model. they did the current model for a reason i assume
<wallyworld> i'd just stick with the convention that's been done for other existing providers
<wallyworld> for now
<babbageclunk> wallyworld: I'm talking about in state - I don't think it's anything to do with providers at that point, is it?
<wallyworld> right, i'm talking about in state too - but the providers set the data to go into state
<wallyworld> so what's in state needs to be able to model what the providers provide
<babbageclunk> ha
<babbageclunk> Ok, so I won't worry about trying to incorporate the network id into the doc id for now, we can do that later. More important to capture it for now.
<babbageclunk> wallyworld: ^
<wallyworld> babbageclunk: i'm still not sure why you think we need to do that just for gce?
<babbageclunk> I don't think we do just for GCE.
<wallyworld> ok, so i don't think we should be second guessing the network model at this point. let's implement for gce something consistent with what's already been done. the model needs to be looked at holistically perhaps
<babbageclunk> ok
<wallyworld> as part of the ongoing work to implement spaces etc
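For reference, a sketch of what a network-qualified subnet document could look like in state — hypothetical field and helper names, not Juju's actual schema:

    package state

    import "fmt"

    // subnetDoc is an illustrative subnet document. With the CIDR alone as
    // the _id, the same CIDR in two provider networks (legal on GCE, and on
    // EC2 across VPCs) would collide.
    type subnetDoc struct {
        DocID      string `bson:"_id"`
        CIDR       string `bson:"cidr"`
        NetworkID  string `bson:"network-id"` // provider network, e.g. a VPC id
        ProviderID string `bson:"providerid,omitempty"`
    }

    // subnetDocID builds a unique key from network + CIDR so identical
    // CIDRs in different networks get distinct documents.
    func subnetDocID(networkID, cidr string) string {
        return fmt.Sprintf("%s#%s", networkID, cidr)
    }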
<anastasiamac> hml: ping
<hml> anastasiamac: ack
<anastasiamac> hml: could u do a quick hangout?
<anastasiamac> hml: we could hog ur team's standup one ;D
<hml> anastasiamac: sure
<anastasiamac> hml: k. m there :)
#juju-dev 2017-03-23
<anastasiamac> wallyworld: thumper:axw:babbageclunk: PTAL https://github.com/juju/juju/pull/7143, fixes a bug :D
<anastasiamac> more like a gap than broken functionality...
<babbageclunk> anastasiamac: just having lunch, but I'll take a look afterwards (if someone else doesn't!)
<anastasiamac> babbageclunk: \o/
<wallyworld> anastasiamac: babbageclunk: i'm not sure we've agreed to the approach yet. it's under discussion via email
<wallyworld> we need to get agreement what they want to do is supported and correct first
<anastasiamac> wallyworld: obviously not all parties are on this email trail.... maybe discussing it in the bug will be a better/more effective means of communication
<axw> anastasiamac: reviewing
<wallyworld> anastasiamac: agreed in principle. but we don't want to pollute the bug with lots of back and forth. the summarised outcomes can be added though
<axw> wallyworld: this change is unintrusive, I don't think we need to have any more discussion before landing this particular change
<axw> wallyworld: this is an alternative to what was being proposed, which was adding pre-populated relation settings
<wallyworld> axw: that is true. but i was concerned it is a solution to a problem they didn't need to have if things were deployed the recommended way. i'd have to re-read the email trail
<axw> wallyworld: ehh except there's some weirdness in the PR, so maybe I'm still misunderstanding the issue.
<wallyworld> right
<wallyworld> i'm hesitant because i'm not sure we're all on the same page
<axw> wallyworld: yeah I'll continue discussions.
<anastasiamac> wallyworld: sure but now that the PR is up, email communication is not best, especially if it excludes ppl that r blocked by the bug and r proposing solutions
<wallyworld> anastasiamac: the bug is not the place to have detail design discussions. status updates etc sure
<wallyworld> or steps to repro etc etc
<axw> anastasiamac wallyworld: I'm taking the discussion to the PR.
<wallyworld> yup, +1
<axw> and will take it back to email if that becomes unwieldy
<wallyworld> axw: FYI, as discussed, this is the CDO vsphere bug https://bugs.launchpad.net/juju/+bug/1669483
<mup> Bug #1669483: [2.1] juju bootstrap fails when there is no VM in vsphere datacenter or all VMs are in folders <cdo-qa-blocker> <oil> <oil-2.0> <vsphere> <juju:Triaged> <https://launchpad.net/bugs/1669483>
<axw> wallyworld: okey dokey
<axw> wallyworld: might be hard to test, since larry is away
<axw> I don't think I should be deleting existing VMs
<wallyworld> yeah, best effort and all that
<cmars> hi, i have a fix for LP:#1675214. can i get a review? https://github.com/juju/juju/pull/7147
<mup> Bug #1675214: destroy-model failed because could not determine model SLA level <ci> <destroy-model> <regression> <juju:In Progress by cmars> <https://launchpad.net/bugs/1675214>
<thumper> jam: to be honest, I think we don't care for the use case you mentioned
<thumper> not really, not now
<thumper> however, if you think we do, I could add a txn-revno assert back in
<thumper> jam: we do do a txn-revno assert in 2.x
<wallyworld> cmars: looking
<cmars> wallyworld, thanks!
<wallyworld> cmars: yeah, or account for the facade version for each model - won't be too hard to do that
<wallyworld> lgtm for the temp fix - just code deletion really
<cmars> wallyworld, ok thanks!
<wallyworld> babbageclunk: not sure if you are able to squeeze in a review before your EOD https://github.com/juju/juju/pull/7144
<stokachu> first cut of conjure-up on macOS done \o/
<babbageclunk> wallyworld: that's a tiny one! looking now
<wallyworld> babbageclunk: that's what all the girls say too :-)
<wallyworld> stokachu: well done!
<stokachu> wallyworld: hopefully that'll increase our userbase by a lot
<stokachu> i need to look into xhyve though for localhost deployments
<stokachu> like docker does
<wallyworld> stokachu: maybe? how many of our target audience uses macs? i personally hate them
<stokachu> wallyworld: good question, im not sure
<stokachu> word is though all the conferences people use macs
<stokachu> kubecon etc
<wallyworld> interesting
<wallyworld> we should collect stats on that
<stokachu> we do
<babbageclunk> definitely true at Python conferences I've been to
<stokachu> nice, yea i think having a localhost deployment mechanism will be handy
<stokachu> basically xhyve to boot ubuntu
<stokachu> then do some magic there to wire it up
<stokachu> or vsphere would be nice here too i think
<babbageclunk> wallyworld: reviewed
<jam> babbageclunk: so I think we do need to have a better primary key than CIDR for subnets, but probably not worth you fixing right now. (you can't fix all the bugs in networking on your own :)
<babbageclunk> jam: :) fair enough!
<wallyworld> jam: that's exactly what i said too :-)
<wallyworld> babbageclunk: thanks for review
<jam> babbageclunk: and my other point was "we won't really know if we have the right data until we actually go to *use* the data", (we're only recording it with your patch)
<jam> but a rough glance looks like it's the right thing, and we can iterate from here.
 * jam takes the dog out
<wallyworld> babbageclunk: i replied to one of your comments, see if it makes sense?
<axw> jam: 1:1? wallyworld is here too
<mup> Bug #1468752 changed: "juju ssh" adds an additional strings to all commands when used on Windows, in interactive mode <ssh> <windows> <juju:Fix Released by gz> <juju 2.0:Fix Released by gz> <juju 2.1:Fix Released by gz> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1468752>
<mup> Bug #1616149 changed: Incompatible protocol between older client and candidate server <ci> <regression> <test-failure> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1616149>
<mup> Bug #1637267 changed: Juju fails to restore state-server on xenial <ci> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1637267>
<mup> Bug #1616832 changed: stuck rsyslogd causes mongodb to block <eda> <needs-ci-test> <sts> <juju:Triaged> <juju-ci-tools:Triaged> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1616832>
<wallyworld> axw: you happy to go with "not supported" initially for nova networks and subnets in heather's PR ?
 * axw looks
<axw> wallyworld: hm. not sure, I'll need to check the behaviour of the code that calls Subnets
<wallyworld> axw: yeah, if the interface claims support and then the other methods error, might not be good
<axw> wallyworld: have the changes to discover subnets already been made?
<wallyworld> axw: in the worker, yeah
<axw> oh yeah
<axw> wallyworld: we at least need to look for NotSupported in the discoverspaces worker, in discoverSubnetsOnly
<wallyworld> axw: yeah, i'd rather the provider not lie to the worker though
<axw> wallyworld: welp, the only other option I guess is to return separate Environ implementations when nova and neutron are supported. I'm not sure if that's feasible though, because opening an Environ is not supposed to make any API calls
<wallyworld> yeah, also true. maybe handle not supported in the worker with a todo to fix the openstack provider to return subnets for nova
<axw> wallyworld: yep
<axw> wallyworld: FWIW, I'd prefer if we had an Environ.Networker method, which every Environ was required to implement. it would return an Environ or a NotSupported error
<wallyworld> no argument from me there
<axw> wallyworld: and then Networker would have methods for checking support for subnets, spaces, etc.
<wallyworld> yep
<wallyworld> i don't like the current implementation
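A sketch of the shape axw is proposing — a design idea only, not Juju's actual environs API:

    package environs

    import "errors"

    // ErrNotSupported is returned by Networker when the cloud (or the
    // credential mode in use, e.g. nova-only OpenStack) has no networking
    // support at all.
    var ErrNotSupported = errors.New("networking not supported")

    // Networker groups the networking capability checks, so callers probe
    // support per feature instead of the provider lying to the worker.
    type Networker interface {
        SupportsSubnets() (bool, error)
        SupportsSpaces() (bool, error)
    }

    // Environ is the fragment of the provider interface under discussion:
    // every Environ would be required to implement Networker.
    type Environ interface {
        Networker() (Networker, error)
    }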
<jam> wpk: ping
<wpk> jam: pong
<jam> hi wpk, I wanted to go over some of the key bugs, though if you want it can wait until standup I suppose
<jam> I'm going to be gone over the weekend, so I was hoping to chat with you about some next steps
<wpk> I'm ready, should we talk here or maybe just join standup hangout now?
<jam> sure, brt
<jam> wpk: i'm there now
<jam> wpk: https://github.com/juju/juju/pull/7119 reviewed
<jam> sinzui: what triggers someone to get an "affiliated" icon in Launchpad?
<jam> I'd like to get WPK a Juju sticker so that I don't accidentally pick some other Witold
<sinzui> wow you challenge me
<sinzui> jam: I think it is shown when the user is a member of the owner or driver teams.
<jam> well, he's in '~juju-core' afaict
<sinzui> jam, we forgot to add him to ~juju
<jam> but it looks like there is ~juju-hackers he wasn't part of
<jam> I just did
<jam> "Juju Hackers" aka ~juju
<jam> sinzui: thanks for the pointer
<jam> sinzui: btw wpk's current branch https://github.com/juju/juju/pull/7119 is likely to want a feature test
<jam> or an update to whatever test we currently have that checks 'apt-proxy' works
<jam> anyway I'm out for a bit
<sinzui> jam, yep, that was the one I was curious about. I reported bug 1635633 and am eager to close it
<mup> Bug #1635633: assess_proxy does not test apt proxies <gap> <proxy> <juju-ci-tools:Triaged> <https://launchpad.net/bugs/1635633>
<wallyworld> babbageclunk: did you see bug 1675546?
<mup> Bug #1675546: container networking broken in GCE <ci> <gce-provider> <lxd> <network> <regression> <juju:Triaged by 2-xtian> <https://launchpad.net/bugs/1675546>
<rick_h> wallyworld: anything to chat on today?
<wallyworld> rick_h: not as such. very happy about the latest memory profile
<rick_h> wallyworld: +1
<babbageclunk> wallyworld: yup, chasing it, but had to rush off to the doctor - back now.
<wallyworld> ty
<wallyworld> we will release beta1 regardless
<wpk> I'm having problems with bootstrapping on maas, I'm constantly getting
<wpk> /var/lib/juju/nonce.txt does not exist
<wpk> 23:23:13 DEBUG juju.utils.ssh ssh.go:292 using OpenSSH ssh client
<wpk> packet_write_wait: Connection to 10.2.15.254 port 22: Broken pipe
<wpk> any ideas on what I might be doing wrong?
<menn0> wpk: can you give a bit more context? is this with real hardware or vmaas?
<babbageclunk> wallyworld: the bug is because the juju-qa env is a legacy network, so network interfaces are returned with no subnet. It looks like I can get everything I need off the network in that case - doing that.
<sinzui> babbageclunk: interesting. We/me also has an ancient aws account
<wpk> menn0: real hardware, PC + 2 NUCs with AMT as nodes. I'm doing juju bootstrap maastiff, and as I understand it SSHs to the node, installs all the stuff, reboots it, and then SSHs to it again just to fail on the above, then decommissions it
<babbageclunk> sinzui: yeah, that's part of the problem - new accounts wouldn't have this kind of network as the default
<babbageclunk> wpk: can you deploy a node in maas without juju?
<menn0> wpk: seems something is up with SSH on the host causing it to drop the SSH connection attempt. is it possible to get the sshd logs on that host?
<wpk> babbageclunk: yes, no problem with that
<wpk> menn0: it gets decommissioned before I have a chance to get to it :/
<menn0> wpk: try adding --keep-broken to the juju bootstrap command
<menn0> wpk: that should keep the system around
<wpk> menn0: ok
<wpk> https://pastebin.canonical.com/183603/ that's the whole log
<sinzui> wpk: your localhost cannot ssh to that address. As menn0 says try --keep-broken to keep it up
<sinzui> wpk: while juju is bootstrapping, you *should* be able to ssh in as ubuntu@<ip>
<wpk> I am
<sinzui> wpk: you are sshed in?
<sinzui> if so, use -v to see what key works.
<sinzui> you can force the juju key using -i $JUJU_DATA/ssh/juju_id_rsa
<wallyworld> wpk: there's a bug or 2 where this issue is discussed; usually the cause is a set up issue, see last comment or 2 on https://bugs.launchpad.net/juju-core/+bug/1314682
<mup> Bug #1314682: Bootstrap fails, missing /var/lib/juju/nonce.txt (containing 'user-admin:bootstrap') <bootstrap> <juju> <maas-provider> <juju:Expired> <juju-core:Won't Fix> <https://launchpad.net/bugs/1314682>
<wallyworld> wpk: also https://ask.openstack.org/en/question/59760/juju-bootstrap-failed/
<wpk> sinzui: I was able to ssh in, using /home/wpk/.local/share/juju/ssh/juju_id_rsa key, and /var/lib/juju/nonce.txt exists on this node
<wpk> sinzui: the node has Internet access and dns resolution working
<wpk> wallyworld: nothing worrying in cloud-init.log
<wpk> hm, except for this:
<wpk> 2017-03-23 23:28:12,171 - util.py[DEBUG]: Running module apt-configure (<module 'cloudinit.config.cc_apt_configure' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_apt_configure.py'>) failed
<wpk> (...)
<wpk> ValueError: Old and New apt format defined with unequal values True vs False @ apt_preserve_sources_list
<wallyworld> that would do it
<wpk> wallyworld: and what's the cause and how to fix it?
<wallyworld> wpk: i have no idea off hand - it's an issue outside of juju and maas from what i can see. you'd need to google the error and fix your apt config
<wallyworld> but if cloud init doesn't complete, you'll see the nonce error
<wallyworld> so you need to fix whatever is stopping cloud init from running properly
<wpk> wallyworld: but why is there a nonce error if the file is there?
<wpk> https://bugs.launchpad.net/cloud-init/+bug/1646571
<mup> Bug #1646571: apt failures non fatal, but cloud the log <landscape> <cloud-init:Incomplete> <MAAS:Confirmed> <https://launchpad.net/bugs/1646571>
<wpk> (and as it says - it's not fatal, cloud-init continues after that)
<wallyworld> without digging into the code and analysing logs etc, there's no way to tell. we'll need to devote time to looking into it
<wallyworld> the bug you link above indicates a maas issue
<wallyworld> can you add extra info from your situation to the bug?
<wpk> wallyworld: I will, tomorrow (1AM here)
<wallyworld> thanks
#juju-dev 2017-03-24
<anastasiamac> wallyworld: plz comment on bug 1672879 when u get a chance
<mup> Bug #1672879: [2.1.1] Controller does not look at agent_stream in model-config - it uses agent_stream that it booted with <oil> <oil-2.0> <juju:Triaged> <https://launchpad.net/bugs/1672879>
<babbageclunk> wallyworld: can I get a review for my fix for the GCE legacy network bug? https://github.com/juju/juju/pull/7150
<wallyworld> babbageclunk: sure
<babbageclunk> wallyworld: fanks
<menn0> axw: ping
<axw> menn0: pong
<menn0> axw: regarding the rev mentioned in bug 1669180, that's quite an old rev for a recent ticket. how sure are you that it's the right one?
<mup> Bug #1669180: proxy-ssh/juju ssh --proxy is ignored <cdo-qa-blocker> <juju:In Progress by menno.smits> <https://launchpad.net/bugs/1669180>
<menn0> axw: it seems like jam's more recent work is what Gabriel/you are referring to
 * axw looks
<wallyworld> babbageclunk: minor nits only
<menn0> axw: it does look like jam may have done something weird with proxy handling in his recent changes though
<axw> menn0: hmm yeah doesn't look like it was changed in that rev, sorry
<menn0> axw: ok cool... just making sure i'm not going crazy
<axw> menn0: oh wait
<axw> it is
 * axw gets the line
<axw> menn0: https://github.com/juju/juju/commit/d25d100f3c04eb6dc22c58db51e87c1d947a6836#diff-f75a59a4d41206aca31bb545d00e3e31R326
<menn0> axw: ha, that's the line I was looking at too.
<menn0> axw: for some reason I had thought that was part of jam's recent changes
<menn0> axw: I agree that it's fishy
<menn0> axw: thanks i'll dig further
<axw> menn0: no worries
<babbageclunk> wallyworld: I called those network initially but that clashed with the network package I use just further down. Do you think I should rename the import?
<wallyworld> babbageclunk: nah, tis ok
<wallyworld> leave as is then
<babbageclunk> wallyworld: I agree that it's a bit awkward though.
<babbageclunk> wallyworld: just checking against a non-legacy network to be sure.
<wallyworld> ty
<thumper-headdown> anyone... https://github.com/juju/description/pull/6
<thumper-headdown> juju branch to follow
<anastasiamac> thumper-headdown: i'll swap ur PR for thoughts on bug 1675048 :) way forward or solution would b nice \o/
<mup> Bug #1675048: mongod doesn't return when killed <mongodb> <juju:Incomplete> <https://launchpad.net/bugs/1675048>
 * anastasiamac looking now
<axw> menn0: your change seems sane to me
<menn0> axw: did you see my email?
<axw> oh I see email, /me reads
<menn0> axw: just more detail
<menn0> axw: I'd like jam's feedback before proceeding
<menn0> axw: we might end up going with this for 2.2 and something more sophisticated for 2.3
<axw> menn0: I suppose an alternative to classifying addresses would be to find an address that's routable from the controller's perspective. that might be as simple as finding addresses in the same space
<axw> menn0: but yeah, I think this is probably fine for now
<axw> and now I see I basically just said the same thing as your last sentence, which I hadn't got to yet ;:p
<thumper-headdown> ah fuck...
 * thumper-headdown headdesks gently
<thumper-headdown> hmm...
<thumper-headdown> maybe not
<thumper> well shit
<thumper> axw added model status back in april 2016 and migration missed it
<thumper> and here I thought it was something new
<anastasiamac> it was new in april 2016 :)
<thumper> not helpful
<anastasiamac> :(
<thumper> on the plus side, can make it required in the description rather than optional
<thumper> which is cleaner
<anastasiamac> it seems to me that we need to have better testing that description pkg is in sync with what we have in juju....
<thumper> well... this was missed when it was all together, so that isn't the issue
<anastasiamac> exactly.. what else could have been missed in the same manner?..
<babbageclunk> wallyworld: I'm trying to test the add-subnet part of my provider-network-id change, but I can't run it with GCE because add-subnet requires a space name and spaces aren't supported on GCE. Should I add a card to make add-subnet accept an empty space name?
<babbageclunk> wallyworld: PR for storing network id: https://github.com/juju/juju/pull/7152
<wallyworld> babbageclunk: looking
<babbageclunk> wallyworld: ta
<wallyworld> babbageclunk: empty space name is what we want, yes
<axw> wallyworld: I have pods running in kubernetes in vsphere
<axw> I'll do some more testing, but looking good
<wallyworld> whoot!
<wallyworld> proxy did the trick?
<axw> wallyworld: yep
<wallyworld> babbageclunk: my OCD kicked in with naming consistency, see what you think
<babbageclunk> wallyworld: the problem is there are already inconsistent names, so you're never going to be happy. :(
<wallyworld> true. i was hoping not to make it worse :-)
<wallyworld> but, your call
<babbageclunk> wallyworld: This is definitely better - there are heaps of other ones on network.InterfaceInfo that are all in this direction.
<wallyworld> righto
<babbageclunk> wallyworld: So, I could make them consistent with the other names in the same structure, but then we're going to have weird inconsistencies between the types. Maybe that's better?
<wallyworld> hmmm, maybe best to stick with something that's more consistent overall
<wallyworld> lessen the tech debt
<babbageclunk> wallyworld: Yeah, that was where I ended up, I think.
<wallyworld> sgtm then
<babbageclunk> wallyworld: sorry, kind of annoying.
<wallyworld> yeah, can't be helped, it's what was there
<babbageclunk> wallyworld: I might quickly get add-subnet working with no space, then?
<wallyworld> why not
<babbageclunk> wallyworld: since it's a positional arg (with optional availability zones after it) I think it needs to be required as "" though. Too crufty?
<wallyworld> oooh, yuk
<wallyworld> but what other choice is there that retains compatibility
<wallyworld> maybe we add a --space arg
<wallyworld> and deprecate the positional arg
<wallyworld> or if --space is used, don't interpret the first positional arg as a space
<wallyworld> just assume zones
<wallyworld> so we support old syntax but add better syntax
<babbageclunk> Ok, so it would be: if there's a --space option, the positional args are cidr-or-provider-id [zone1 zone2...]
<babbageclunk> if more than one positional arg and no --space, the 2nd arg is space name (preserving backwards compat)
<babbageclunk> and we allow just 1 arg (and --space not specified) meaning add this subnet not in a space?
<babbageclunk> I think that works.
<babbageclunk> There's a possibility of putting something into a space accidentally if someone has a space named the same as an AZ and they don't know about the legacy mode? That would be pretty weird though.
<babbageclunk> wallyworld: ^?
<wallyworld> babbageclunk: yeah +1 but i think 1 arg should  error as it does now
<wallyworld> but the error would mention the preferred syntax
<babbageclunk> so the preferred syntax for adding a subnet without a space would be `add-subnet <provider-id> --space=""` ? Seems janky.
<babbageclunk> wallyworld: ^
<axw> wallyworld: confirmed, kubernetes is happy now. ran the microbot smoke test, and it's responding happily
 * axw moves on to the vswitch bug
<wallyworld> babbageclunk: or add-subnet --space foo mysubnet
<wallyworld> axw: quick review? https://github.com/juju/names/pull/79
<axw> looking
<wallyworld> axw: awesome about k8
<babbageclunk> wallyworld: right, but the no-space version would be add-subnet --space "" mysubnet?
<wallyworld> babbageclunk: yeah :-( maybe just keep as is
<wallyworld> and can be bikeshedded with folks next week
<wallyworld> babbageclunk: or propose and we can ask for input on the pr
<babbageclunk> I'll send an email to jam and cc you - think I should add anyone else on?
<babbageclunk> wallyworld: ^ (I'm still trying to get the hang of when to put someone's nick at the start of messages)
<wallyworld> babbageclunk: sorry, i keep switching back to IDE
<wallyworld> include witold as well since he's in that space (pun intended)
<babbageclunk> wallyworld: ha, good friday afternoon joke!
<wallyworld> dad joke
<jam> babbageclunk: you're adding subnets to the "undefined" space, thus "" is perfectly appropriate for now
<jam> babbageclunk: that's concretely the "what do we want to call the otherwise-unnamed space"
<jam> right now it is exactly called ""
<babbageclunk> jam: ok, thanks - so I understand that as: we still require the space to be passed positionally, but we allow "". Right?
<babbageclunk> jam: I guess in general, we'd want people to be using named spaces. So this is enough of a special case that it's ok for the command to do it to be a little awkward?
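A sketch of the backwards-compatible argument handling discussed above, using the stdlib flag package (Juju itself uses a different flag library; command name, output, and helper are illustrative): --space wins if given, otherwise a second positional arg is treated as the space name, and "" means the undefined space.

    package main

    import (
        "flag"
        "fmt"
    )

    func main() {
        space := flag.String("space", "", `space for the subnet ("" = the undefined space)`)
        flag.Parse()
        args := flag.Args()
        if len(args) == 0 {
            fmt.Println("usage: add-subnet <cidr-or-provider-id> [space] [zone...]")
            return
        }
        subnet, zones := args[0], args[1:]
        spaceName := *space
        if !flagWasSet("space") && len(args) > 1 {
            // legacy form: second positional arg is the space, zones follow
            spaceName, zones = args[1], args[2:]
        }
        fmt.Printf("subnet=%q space=%q zones=%v\n", subnet, spaceName, zones)
    }

    // flagWasSet reports whether the named flag was passed explicitly,
    // so --space="" can be told apart from --space being absent.
    func flagWasSet(name string) bool {
        set := false
        flag.Visit(func(f *flag.Flag) {
            if f.Name == name {
                set = true
            }
        })
        return set
    }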
<wpk> wallyworld: I've updated https://bugs.launchpad.net/juju/+bug/1314682 with what I'm getting
<mup> Bug #1314682: Bootstrap fails, missing /var/lib/juju/nonce.txt (containing 'user-admin:bootstrap') <bootstrap> <juju> <maas-provider> <juju:Expired> <juju-core:Won't Fix> <https://launchpad.net/bugs/1314682>
<wpk> sinzui: ^^^
<sinzui> wpk: did you setup the dns/addresses to be ip4
<sinzui> wpk: did the instructions in comment 28 not work?
<wpk> sinzui: cloud-init finishes, apt-get update takes ~1.5 seconds
<wpk> there is an error in cloud-init.log but it's non-fatal and unrelated
#juju-dev 2017-03-26
<babbageclunk> Can someone with merge privs on the juju repos merge this? https://github.com/juju/description/pull/7 thumper's offline and the mergebot isn't set up.
<wallyworld> babbageclunk: it's large, but most of these changes are new tests and boilerplate. i will be basing new work off it for the backend stuff. there's no real urgency as i can branch off what I have https://github.com/juju/juju/pull/7158
<wallyworld> babbageclunk: the followup will also add the feature flag to the grant/revoke changes
<wallyworld> i'll look to get that done today, well before beta2
<babbageclunk> wallyworld: ok, looking now
<babbageclunk> wallyworld: actually, might hold off and make a start on the private-address stuff first.
<wallyworld> sure
#juju-dev 2018-03-19
<veebers> thumper: I misspoke earlier, it wasn't charmstore that was failing w/ go 1.10 but a dep of charmstore. The newer revs of the deps build, but the diff for updating charmstore to take into account the updated apis etc. is getting bigger and more guess-work-like :-). My intention is to ask assistance from Roger
<thumper> ack
<thumper> babbageclunk: got a few minutes?
<babbageclunk> thumper: sure - in 1:1?
<thumper> ack
<thumper> veebers: seems the centos unit tests are broken with the go 1.9 -> 1.10 change
 * veebers looks
<veebers> thumper: fixed, was old config on the jenkins slave
<thumper> coolio
<jam> test suite review: https://github.com/juju/juju/pull/8503
<wallyworld> jam: here's a PR for that unit provisioner worker bug https://github.com/juju/juju/pull/8504
<manadart> jam: Unearthed a bug in my prior patch for space validation.
<manadart> In QA testing I satisfied myself that one could not set a space in which controller machines had no address(es).
<manadart> But it checks machine addresses, which are not decorated with spaces. Only the provider addresses are; at least for current MAAS behaviour.
<manadart> So the question is whether we should check all/provider/machine addresses...
<manadart> Or should the machine addresses be space-aware?
<manadart> jam: NVM. Addresses() merges both.
<jam> manadart: was making coffee, will chat with you in 1:1
<manadart> Small one for review if anyone is inclined: https://github.com/juju/juju/pull/8506
<thumper> balloons: got 5 minutes?
<balloons> thumper, sure
<balloons> thumper, in our 1:1
<thumper> balloons: jump in the 1:1
<thumper> ack
<balloons> obligatory "great minds think alike; no great minds think for themselves!"
<babbageclunk> thumper: looks like you're doing 1:1 with balloons - I guess we basically had ours yesterday anyway?
<thumper> babbageclunk: no, I'll be with you shortly
<babbageclunk> okeyas
<babbageclunk> I don't know what that is
#juju-dev 2018-03-20
<wallyworld> veebers: i made some comments on the PR, we probably need to chat about them. let me know if you have a minute at some stage
<veebers> wallyworld: ack, in process of removing the suggested deps.
<veebers> wallyworld: I have time now if that suits
<wallyworld> ok, let's
<wallyworld> babbageclunk: the tracker workers need to offer the full Environ or CAAS Broker as these are used in several places and the downstream workers need the full interface
<babbageclunk> wallyworld: oh, sure, but they could have another clause in their type-switches for CloudDestroyer
<wallyworld> babbageclunk: i'm not sure I follow. Those tracker workers (originally just the one - EnvironTracker) are to provide the full Environ to downstream, and the guarantee is that any config changes will be reflected in the supplied Environ. Narrowing the type at the Tracker output is the wrong thing to do IMO. But maybe I'm missing something. If it's easier to talk 1:1 we can do that?
<babbageclunk> yeah, lets
<wallyworld> 1:1
<thumper> OMG...
<thumper> found it
<thumper> was digging through the code looking for how we get ports assigned for tests.
<thumper> a lot of digging
<thumper> hmm...
<thumper> while part of me thinks we can wrap this... another part says I'm just going to be introducing a race condition
<thumper> not one we care about, but one that will trigger the race detector
<thumper> ffs
<wallyworld> babbageclunk: i pushed the change we discussed but sadly testing fails.... juju.worker.dependency "undertaker" manifold worker returned unexpected error: expected *environs.Environ, got *undertaker.cloudDestroyer
<wallyworld> so it seems the Get() does strict type checking
<wallyworld> so i think we need to use the original approach
<babbageclunk> wallyworld: No, you just need to add a clause to the manifold output funcs for CloudDestroyer
<babbageclunk> in the environ tracker and broker
<babbageclunk> wallyworld: here https://github.com/juju/juju/blob/develop/worker/environ/manifold.go#L58
<babbageclunk> and here https://github.com/juju/juju/blob/develop/worker/caasbroker/manifold.go#L58
<wallyworld> ah i see, so support both clouddestroyer and environ
<babbageclunk> yup
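A sketch of the output-func change babbageclunk points at: the tracker keeps handing out the full Environ, and the manifold's output function grows one clause to narrow it. Types are simplified and import paths approximate the era; see the linked manifold.go files for the real code.

    package environ

    import (
        "github.com/juju/errors"
        "github.com/juju/juju/environs"
        worker "gopkg.in/juju/worker.v1"
    )

    // manifoldOutput narrows the tracked Environ to whichever interface a
    // downstream worker declared as its input.
    func manifoldOutput(in worker.Worker, out interface{}) error {
        tracker, ok := in.(*Tracker)
        if !ok {
            return errors.Errorf("expected *environ.Tracker, got %T", in)
        }
        switch result := out.(type) {
        case *environs.Environ:
            *result = tracker.Environ()
        case *environs.CloudDestroyer:
            // the new clause: the same tracked Environ, seen through the
            // narrower interface the undertaker asks for
            *result = tracker.Environ()
        default:
            return errors.Errorf("unexpected output type %T", out)
        }
        return nil
    }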
<thumper> wallyworld: the test port change is going to have to be a change in the agent itself
<thumper> otherwise there are races setting the config and getting other workers to notice
<thumper> so won't be doing it today...
<wallyworld> ok
<thumper> it is possible, just icky
<wallyworld> babbageclunk: changes pushed. manually tested on both caas and iaas models
<babbageclunk> cool - looking now
<babbageclunk> wallyworld: approved
<wallyworld> yay
<wallyworld> ty
<jam> while reviewing other code, I found a bunch of panic() calls in normal code paths. Anyone care to review https://github.com/juju/juju/pull/8510
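A minimal, self-contained illustration of the refactor jam describes — replacing a panic in a normal code path with a returned error (the port-parsing example is hypothetical, not from the PR):

    package main

    import (
        "fmt"
        "strconv"
    )

    // before: a panic in a normal code path takes down the whole process
    func mustPort(s string) int {
        p, err := strconv.Atoi(s)
        if err != nil {
            panic(err)
        }
        return p
    }

    // after: surface the error so the caller can decide what to do
    func parsePort(s string) (int, error) {
        p, err := strconv.Atoi(s)
        if err != nil {
            return 0, fmt.Errorf("invalid port %q: %v", s, err)
        }
        return p, nil
    }

    func main() {
        if _, err := parsePort("not-a-port"); err != nil {
            fmt.Println("handled gracefully:", err) // no crash
        }
    }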
<jam> veebers: balloons: ^^ somehow we have both go vet and go fmt issues on that branch. Did we stop calling scripts/verify.bash as part of the pre-commit check ?
<manadart> jam: Looking at #8510 now. Just pushed to #8501 with sanitised address scenarios.
<balloons> jam, Chris mentioned some needed changes to the check; but it was to ensure things didn't pass that shouldn't
<balloons> Did you figure or
<jam> balloons: I just wanted to note that I ran into it, I didn't dig in at all
<jam> balloons: why is it that everytime I check ci-run, I see red, I click on it, and I see *lots* of green, but the one that matters is red.. :(
<jam> :)
<balloons> ;)
<balloons> I like clicking the view tabs at the top. Unit tests are tantalizingly close
<jam> balloons: we need a similar cleanup of $TEMP on Windows, thoughts?
<jam> balloons: I'm guessing we don't have cygwin/msys there, right?
<jam> balloons: i finally worked out: ssh developer-win-unit-tester powershell.exe 'Set-Location "$Env:TEMP"; Dir' | vim -R -
<jam> and we have  ~11k files in TEMP
<balloons> jam, wow
<balloons> jam, the intent was always to spawn a new instance, but never could get the experience right; where we could ssh in on a fresh instance
<jam> balloons: well, we have a weird symlink or something in %TEMP%, but I've deleted 10k or so files
<balloons> jam, windows unit tests are now having trouble it seems after running brilliantly for a long time
<cmars> good morning. could someone review this metrics-related PR, https://github.com/juju/juju/pull/8484 ?
<cmars> would be much appreciated :)
<externalreality> manadart, really small https://github.com/juju/juju/pull/8512
<manadart> externalreality: Ack. Got 1:1 with balloons. Shouldn't be long.
<externalreality> manadart, someone seems to have been making an effort to migrate all these retry attempts to juju/retry. I wonder if that is still a thing
<balloons> jam, we use https://github.com/PowerShell/Win32-OpenSSH
<balloons> no cygwin nonsense needed that way
<manadart> externalreality: Approved.
<cmars> hi manadart, are you the on call reviewer today? (Is that still a thing?)
<externalreality> manadart, as far as testing for the upgrade implicit space names in 2.3 -> 2.4 - would you be able to provide more context around that.
<externalreality> manadart, the upgrade step is pretty clear, but I can't tell what is missing from the trello card description.
<balloons> cmars, there really isn't an on-call reviewer anymore. The Pr looks pretty straightforward though, just adding labels to the struct
<manadart> externalreality: Want to jump on a hangout?
<cmars> balloons: ok, yeah, that's it. Ive got a followup that adds the labels to 'juju metrics' output
<balloons> cmars, I don't see a test for checking the error raised in the hook context
<balloons> follow-up pr ready?
<cory_fu> Is there a way to tell what feature flags are enabled for a controller / unit?
<agprado> Question, has application Facade V6 landed? I see V5 as normal, and V6 with a CAAS flag.
<agprado> I'm implementing `resolv --all` and I will need a new application Facade, but I truly, truly don't want to jump from V5 to V7. That just looks wrong.
<cmars> balloons: updated, is that what you had in mind?
<balloons> cmars, looking
<balloons> cmars, thank you. I approved. The PR displaying labels should be of more interest
<cmars> balloons: thanks for the review. the followup's in progress, i'm basically just adding a labels field/right-most column to the output
<hml> Is there an OpenStack version of this bug: https://bugs.launchpad.net/juju/+bug/1699930 - duplicate subnets cause spaces to fail during bootstrap?
<mup> Bug #1699930: Subnets are not associated with VPCs when using AWS <juju:Fix Released by wpk> <juju 2.2:Fix Released by wpk> <https://launchpad.net/bugs/1699930>
<hml> i just can't find the bug
<balloons> cory_fu, I think there is -- presumably you've looked at show-controller and model details?
<hml> found it
<cory_fu> balloons: I have, and nothing
<cory_fu> balloons: I also checked controller and model config, and didn't see anything there, either
<cmars> balloons: could you review https://github.com/juju/utils/pull/298 or find someone to take a look? it's a prereq for displaying metric labels.
<cmars> and sorting on them..
<jam> balloons: I realize no need for cygwin for ssh, but then I need to learn Powershell instead of just doing bash
<balloons> jam, yea, powershell or cmd.exe
<balloons> cory_fu, i guess perhaps the db, but then that may be more than desired.
<balloons> cmars, a slice isn't inherently unique -- what am I missing in your PR?
<cmars> balloons: the source, a map, will have unique keys
<cmars> balloons: it's probably academic at this point anyway.. it looks like juju/juju at develop's HEAD doesn't build with juju/utils current master
<balloons> cmars, ahh sure
<balloons> cmars, fixing that would be appreciated I'm sure
<cmars> balloons: that's probably too heavy for me to lift, but i can ping rog about it
<balloons> cmars, rogpeppe has a couple outstanding PR's I wish he would land already
<cmars> balloons: meantime i'll use local private functions to join the string maps.. maybe that's safer anyway in the long run.. less chance of keyvalues.Join getting changed and breaking things elsewhere
<cmars> sometimes a little code duplication isn't such a bad thing, when the intents are different
<balloons> cmars, ok. But I was kind of reviewing with that in mind.. since you put it into utils, it should have secondary use
<cmars> sorting vs rendering vs ...
<cmars> yeah, maybe i should close it
<balloons> right.. I'm ok with it
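A sketch of the map-joining helper under discussion — it illustrates cmars's point that a slice built from a map has unique entries by construction. Package and function names are illustrative, not the juju/utils keyvalues API:

    package keyvalues

    import "sort"

    // JoinSorted flattens string maps into a sorted "key=value" slice.
    // Within each source map the keys are unique by construction; across
    // maps, later values win on duplicate keys.
    func JoinSorted(maps ...map[string]string) []string {
        merged := make(map[string]string)
        for _, m := range maps {
            for k, v := range m {
                merged[k] = v
            }
        }
        out := make([]string, 0, len(merged))
        for k, v := range merged {
            out = append(out, k+"="+v)
        }
        sort.Strings(out)
        return out
    }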
<balloons> veebers, are you going to bring juju/utils up to date?
<balloons> I'm actually a little surprised it's not
<veebers> balloons: update juju/juju to use newest juju/utils? yes currently working on it (rolled into the task of moving to charmstore.v5 from -unstable)
<balloons> cmars, so I guess it is part of the big update ^^
<balloons> veebers, ack, I didn't realize utils was caught up in it as well
<cmars> balloons: yes, i think that's what i uncovered :)
<veebers> balloons: fyi https://github.com/juju/juju/pull/8498
<wallyworld> thumper: did we want to chat about the interview?
<thumper> yep
<babbageclunk> thumper: hey, the failure we sometimes see when the API server starts up is because we try to find a free port, but by the time we open the API server listener it's been taken, is that right?
<babbageclunk> thumper: so if I'm adding code to open a listener (which needs to be able to switch ports because it should be run in tests where 80 can't be bound), I should open the listener in the test and pass it in to avoid another case of that problem?
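A minimal sketch of the pattern babbageclunk describes: bind the listener first (port 0 lets the kernel pick a free port atomically) and hand the already-bound listener to the server, so there is no find-a-port/open-the-port window for another process to win. Server and handler are illustrative:

    package main

    import (
        "fmt"
        "net"
        "net/http"
    )

    func main() {
        // Binding to port 0 reserves a free port atomically — no race.
        listener, err := net.Listen("tcp", "127.0.0.1:0")
        if err != nil {
            panic(err)
        }
        fmt.Println("test server on", listener.Addr())

        // The server under test consumes the listener rather than a port number.
        srv := &http.Server{Handler: http.NotFoundHandler()}
        go srv.Serve(listener)

        // ... run test assertions against listener.Addr() here ...

        listener.Close()
    }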
#juju-dev 2018-03-21
<thumper> babbageclunk: let's chat
<babbageclunk> thumper: sure - 1:1
<thumper> ok
<wallyworld> anastasiamac: not sure if you have time for a review? https://github.com/juju/juju/pull/8514
<anastasiamac> wallyworld: i can try :)
 * anastasiamac looking while lunching :D
<wallyworld> ty, spreading the love around
<wallyworld> have lunch first
<wallyworld> there's no rush
<balloons> jam, have to reboot the PC
<jam> balloons: k
<jam> manadart, externalreality: btw, I added a card on the ha-space stuff that Tim pointed out. Namely "juju enable-ha --to X" should check that X is in the ha-space and prevent you from doing so if it isn't
<manadart> jam: Yes, saw it; thanks.
<manadart> jam externalreality: PR for the multi-address/no-ha-space-config is up.
<manadart> https://github.com/juju/juju/pull/8517
<manadart> Only the second commit is relevant, as per description.
<jam> manadart: reviewed
<cmars> hi balloons, could i get a review of https://github.com/juju/juju/pull/8513 ? this exposes labels in the `juju metrics` command output
<thumper> morning
<wpk> evening
<hml> review anyone?  https://github.com/go-goose/goose/pull/62
<hml> wpk: on pr8407, for createDefaultBridgeInDefaultProfile() why only ipv6 settings and not ipv4?
<wpk> hml: because we officially support only ipv4 on LXD, we even have special error messages for that occasion :)
<wpk> hml: we should do it for v6 too, but I wouldn't make it in this PR
<hml> wpk: something isn't parsing in my brain... we write ipv6 settings because we only support ipv4?
<wpk> hml: we disable v6 explicitly
<wpk> hml: that's what we've been doing previously, we could 'fix' that bug, again, not this PR
<hml> wpk: duh - ipv6.address: none
<hml> does the network already contain ipv4 information?
<wpk> hml: by default
<hml> or the profile
<hml> ah
<hml> i missed the word none  :-)
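For reference, the shape of the bridge config being discussed, as a Go literal — keys are standard LXD bridge network config, values and the package name are illustrative:

    package lxdsetup

    // bridgeConfig: ipv6.address is explicitly set to "none" (the word hml
    // missed) to disable IPv6, rather than simply omitting the key, since
    // only IPv4 is supported on LXD here.
    var bridgeConfig = map[string]string{
        "ipv4.address": "auto",
        "ipv4.nat":     "true",
        "ipv6.address": "none",
        "ipv6.nat":     "false",
    }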
<balloons> hml, I left some comments for you
<hml> balloons: ty
<hml> --keep-broken only applies to bootstrap yes?
<balloons> veebers, appears you are correct. We can't use our awesome versioning anymore it seems for snaps
<balloons> I take that back, our version is missing something
<wpk> balloons: btw
<wpk> balloons: I had a weird 'hang' in CI run today
<wpk> balloons: let me find the run...
#juju-dev 2018-03-22
<thumper> wallyworld: can we chat now?
<thumper> start a bit early?
<thumper> I have a feeling I'm going to have to run a bit early too
<babbageclunk> Does anyone know why the pre-push script was deleted?
<wallyworld> babbageclunk: yeah, i was wondering why my commits were going so fast
<jam> manadart: so it is worker/machiner that transitions from Dying to Dead, which then tells worker/provisioner that it can StopInstance
<jam> and machiner is calling EnsureDead which gives us a chance to make sure we don't go to Dead as long as we HaveVote.
<jam> we'd still need something to remove this machine from votingmachineids and machineids.
<manadart> jam: Could we make votingmachineids derived for a start?
<jam> manadart: so 'votingmachineids' is supposed to be the denormalization of... WantsVote ?
<jam> manadart: afaict, 'machineids' is denormalizing JobManageModel, and 'votingmachineids' is denormalizing !novote
<manadart> jam: Do we gain more from the denormalisation than we hurt when needing to tear-down?
<manadart> jam externalreality: https://github.com/juju/juju/pull/8517 accommodates feedback to-date including new error messages following our discussion.
<jam> manadart: so I think at least one of those denormalizations is useful, as it means peergrouper can watch for changes in that set, rather than having to watch all possible machines
<jam> but that's probably only 'machineids'
<jam> manadart: votingmachineids doesn't seem to be valuable (to me)
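A sketch of what "derived" would mean here — computing the voting set from the machine documents on demand instead of maintaining votingmachineids as a separate denormalized list. The types are stand-ins, not Juju's real state schema:

    package state

    // controllerMachine stands in for the machine document fields that matter.
    type controllerMachine struct {
        ID        string
        WantsVote bool // the normalized source of truth (!novote)
    }

    // votingMachineIDs derives the voting set when needed, so there is no
    // second copy to keep in sync during tear-down.
    func votingMachineIDs(machines []controllerMachine) []string {
        var ids []string
        for _, m := range machines {
            if m.WantsVote {
                ids = append(ids, m.ID)
            }
        }
        return ids
    }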
<externalreality> manadart, that error message doesn't seem to be too different from what it was before.
<manadart> invalid config value for "juju-mgmt-space": machines with no addresses in space "mgmt-space": "0"
<manadart> versus
<manadart> invalid config "juju-mgmt-space"="mgmt-space": machines with no addresses in this space: 0
<manadart> I didn't do the <count> machines have no...
<manadart> You have the singular vs plural thing with that.
<externalreality> manadart, "there are controller machines that are not in space <space>: 0, 1, 2"
<externalreality> That would leave little room for any confusion or ambiguity, no?
<externalreality> Its not a hill to die on however, LGTM
<jam> balloons: I put together an ugly hack, but it lets me bootstrap on bionic with mongodb-server-core.
<jam> balloons: some of it may survive, but I thought you might be interested in testing it.
<jam> I sent you an email about it.
<balloons> jam, awesome, thanks
<hml> one line review anyone?  https://github.com/juju/juju/pull/8522
<manadart> hml: Approved.
<hml> manadart: ty!
<cmars> hi, can i get a review of https://github.com/juju/juju/pull/8513 ?
<cmars> balloons: ^^ ?
<balloons> it's on the list cmars
<cmars> balloons: awesome, thanks!
<thumper> I'm happy now that the team calls have changed times, I can now make lunch time gym class
 * thumper rolls
<anastasiamac_> anyone keen to review https://github.com/juju/juju/pull/8519
<anastasiamac_> thumper: babbageclunk: wallyworld ^^
<babbageclunk> anastasiamac_: yup, looking
<anastasiamac_> babbageclunk: my hero! thnx \o/
<anastasiamac_> babbageclunk: i agree with difficulty of catching everywhere -  we just need to be cautious when reviewing changes to provider/adding new functionality
<anastasiamac_> babbageclunk: adding the conversion on the other side does not help however...
<anastasiamac_> babbageclunk: we wrap errors on our side and if we'd wrap "cred invalid" error, we'd lose its type anyway...
 * anastasiamac_ commenting on PR
<babbageclunk> anastasiamac_: I think you can't add it higher up (since you need to look for it in some places in the code), but maybe we could add something to the library - everything ends up going through the query method (I think after not very much reading).
<anastasiamac_> babbageclunk: again, the problem is that we wrap errors in our code (after the library calls) and will lose the type.... hence, why StartInstance was not an easy change -  we sometimes expect a typed ZoneIndependent error but for credential we actually want it to b a different type...
<babbageclunk> anastasiamac_: ? I'm suggesting you convert the problem ones before that though.
<anastasiamac_> babbageclunk: can we talk in a-team standup?..
<babbageclunk> yes!
#juju-dev 2018-03-23
 * thumper is back
<thumper> anastasiamac_: I left a comment on the pr asking if possible to split up
<thumper> 4k reviews are really hard
<anastasiamac_> thumper: thnx. i have broken it into separate commits (I thought) but I'll look at ur suggestion :D
<thumper> anastasiamac_: I think we may be talking about different PRs :)
<thumper> anastasiamac_: I was talking about oracle one :)
<anastasiamac_> thumper: oooh, yes, this one is hard :) it also has conflicts/broken CI run.... thank you for looking at it..
<thumper> veebers: got some time to chat?
<veebers> thumper: for you? Always
<thumper> 1:1?
<veebers> thumper: ack, omw
 * thumper afk to collect kid from school
<wallyworld> anastasiamac_: if you're so inclined, here's that PR https://github.com/juju/juju/pull/8524
<anastasiamac_> wallyworld: looking, altho m variously inclined :D
<balloons> cmars, reviewed. Sorry it took so long
<cmars> balloons: thanks!
#juju-dev 2018-03-24
<wpk> https://twitter.com/the_sttts/status/976382985640529920 it's so true it's painful..
#juju-dev 2018-03-25
<jam> 800 line diff, but it is strictly a cut and paste of code from a couple giant files into small focused files: https://github.com/juju/juju/pull/8525
