[00:57] <davecheney> wallyworld_: https://code.launchpad.net/~dave-cheney/juju-core/134-tag-release-1.11.2/+merge/174087
[00:57] <davecheney> ^ tagging, mark II
[00:58]  * wallyworld_ takes a look
[00:58] <davecheney> thumper still got man flu ?
[01:08] <wallyworld_> davecheney: out of curiosity, i listed the current tags on trunk and there's some weird ones there. nothing resembling a juju release
[01:11] <davecheney> they look like old juju 0.6/7 tags
[01:11] <davecheney> blame gustavo
[01:12] <wallyworld_> ok. and we haven't tagged 1.10 etc?
[01:13] <wallyworld_> seems kinda strange not to have tagged the releases
[01:13] <davecheney> wallyworld_: yes, this is suckful
[01:14] <davecheney> i'm trying to redeem myself
[01:14] <wallyworld_> you don't need to try too hard. you are always pushing to get proper processes in place which i fully agree with
[01:15]  * davecheney blushes
[01:15] <wallyworld_> davecheney: sorry for making your pocket all wet
[01:15] <davecheney> oh, i thought I had a drinking problem
[01:16] <wallyworld_> that's not mutually exclusive :-)
[01:21] <davecheney> wallyworld_: can I try submitting this tag change ?
[01:21] <davecheney> see if it works this time ?
[01:22] <wallyworld_> sure. i've not used bzr to create tags before. i think it will work
[01:24] <davecheney> wallyworld_: related, https://docs.google.com/a/canonical.com/document/d/1eeHzbtyt_4dlKQMof-vRfplMWMrClBx32k6BFI-77MI/edit
[01:43] <davecheney> wallyworld_: https://code.launchpad.net/~dave-cheney/juju-core/134-tag-release-1.11.2/+merge/174087
[01:43] <davecheney> merged fine, but no tag :(
[01:44] <wallyworld_> bigjools: any hints? ^^^^^^
[02:14] <bigjools> wallyworld_: wat?
[02:14] <wallyworld_> see the backscroll. davecheney wants to tag a rev for the 1.11.2 release. he added a tag locally, pushed up an mp, merged, but the tag didn't get copied in
[02:15]  * bigjools looks
[02:20] <bigjools> it's not merged
[02:21] <wallyworld_> bigjools: davecheney: the mp says merged though, doesn't it?
[02:22] <bigjools> it does, but if you look at the branch history, it's not
[02:22] <wallyworld_> so what would cause that?
[02:23] <bigjools> because not all the revisions were copied - the tag was not on the top revision
[02:23] <bigjools> more to the point there's no tag on r1415 but there is one on r1414 which is dimiter's rev
[02:24] <bigjools> ie the tag was put on a revno that is already merged
[02:24] <bigjools> rather than the empty commit
[02:24] <davecheney> bigjools: how would you do it ?
[02:24] <davecheney> i used
[02:24] <davecheney> bzr tag ; bzr commit --unchanged
[02:24] <wallyworld_> i assume the command run locally was bzr tag -r 1214 blah?
[02:24] <bigjools> commit first then tag
[02:25] <bigjools> tag works on committed revnos
[02:26] <bigjools> you can see what you did on here: http://paste.ubuntu.com/5863584/
[02:26] <bigjools> can you bzr tag lp:juju-core directly?
[02:27] <bigjools> just tag the revision you just landed
[02:27] <davecheney> bigjools: i don't think we can, Go bot owns everything
[02:27] <wallyworld_> davecheney: didn't you want to tag rev 1214?
[02:27] <davecheney> wallyworld_: no
[02:28] <wallyworld_> davecheney: if your ssh keys are registered, you can commit directly to juju-core
[02:28] <bigjools> davecheney: should work though, your key is on it
[02:28] <davecheney> wallyworld_: ok, i'll do that
[02:28] <bigjools> don't commit again
[02:28] <bigjools> just tag
[02:29] <bigjools> bzr tag -d lp:juju-core
[02:29] <bigjools> with a -r of course
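A sketch of the flow bigjools describes, as shell commands (the revision number, tag name, and commit message here are illustrative, not the actual values used):

```shell
# Tag an already-landed revision on the target branch directly:
# "bzr tag" works on committed revnos, so pass -r explicitly.
bzr tag -d lp:juju-core -r 1415 juju-1.11.2

# The earlier failure mode: tagging before committing attaches the tag to
# the previous (already-merged) revision. Locally, commit first, then tag:
bzr commit --unchanged -m "release 1.11.2"
bzr tag juju-1.11.2
```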
[02:31] <davecheney> bigjools: right-o, thanks
[02:31] <bigjools> nae prob
[03:12] <fwereade> davecheney, oh hell, I thought mgz was handling 1.11.2... there are a couple of open issues aimed for that milestone
[03:12] <fwereade> wallyworld_, sorry I fell asleep, I'll do your reviews now
[03:13] <davecheney> fwereade: ok, how should I fix this ?
[03:13] <wallyworld_> fwereade: no hurry. it's the middle of the night for you. just do them later!
[03:14] <wallyworld_> fwereade: btw, will have add-unit with force machines up for review soon
[03:15] <fwereade> davecheney, depends how far the release has progressed -- if it's done, but just includes bugs we hoped it didn't, I'm fine doing another for 1.11.4 tomorrow I guess?
[03:15] <fwereade> wallyworld_, yeah, but I've just woken up, may as well do something useful
[03:15] <davecheney> fwereade: i've been trying to release this fucker for two weeks
[03:15] <davecheney> we can always do another release
[03:15] <davecheney> we can do as many as we like
[03:16] <fwereade> davecheney, yeah, indeed
[03:16] <davecheney> the next release number is 1.11.3
[03:16] <davecheney> but see https://docs.google.com/a/canonical.com/document/d/1eeHzbtyt_4dlKQMof-vRfplMWMrClBx32k6BFI-77MI/edit#
[03:32] <fwereade> wallyworld_, crap, what "containertype" is set on a new environ machine?
[03:32] <wallyworld_> fwereade: ""
[03:32] <fwereade> wallyworld_, ah ok, that does make sense, I expected it to be "none" for a moment
[03:32] <wallyworld_> all new machines which are not containers get ""
[03:33] <wallyworld_> none is only for constraints
[03:33] <wallyworld_> add-unit force machine is up https://codereview.appspot.com/10962046
[03:45] <fwereade> wallyworld_, cool, I have a review of https://codereview.appspot.com/11019044/ up, I may have misunderstood some things
[03:45]  * wallyworld_ looks
[03:48] <wallyworld_> fwereade: afaik, deployment constraints for stuff like mem are not yet supported - there is a todo in the deploy command
[03:48] <wallyworld_> hence there's no work done with those in the assignment policies right now
[03:49] <fwereade> wallyworld_, please, what is the difference between a "deployment" constraint and a "provisioning" constraint?
[03:49] <fwereade> wallyworld_, you keep talking as if there's a distinction but I don't see one
[03:49] <wallyworld_> sorry, i might be confused. isn't provisioning related to starting a machine?
[03:49] <wallyworld_> and deployment related to deploying a charm
[03:50] <fwereade> wallyworld_, sure -- but what's the distinction from the POV of the constraints?
[03:50] <fwereade> wallyworld_, (deployment is a very hairy and overloaded word in juju land :/)
[03:51] <fwereade> wallyworld_, constraints are still constraints and still expected to apply, whether to a new machine or an existing one
[03:51] <wallyworld_> not sure what is meant by POV of constraints. right now, when a unit is deployed, constraints are ignored
[03:51] <fwereade> wallyworld_, sure, but that's just a bug, right?
[03:51] <wallyworld_> yes
[03:51] <fwereade> wallyworld_, not a fundamental distinction
[03:51] <wallyworld_> but this mp is not about fixing that bug
[03:52] <wallyworld_> since when it was started, no hardware characteristics were there
[03:52] <wallyworld_> or so i recall
[03:52] <wallyworld_> the rest of the constraints is coming
[03:52] <wallyworld_> this mp is just for the container stuff
[03:53] <fwereade> wallyworld_, yeah, I see... I worry that this puts the cart before the horse a little
[03:53] <wallyworld_> only for a short time, and it is not advertised functionality - it's an interim step that is useful to us as developers
[03:53] <wallyworld_> helps greatly to progress the container work
[03:54] <wallyworld_> as soon as the hardware for bootstrap machine is done, i'm onto this next step
[03:55] <wallyworld_> it allows us to more easily sort out the addressability work
[03:59] <wallyworld_> fwereade: with the mongo query you questioned - you suggested an alternative but i did reply saying i didn't think it would work. the machineDoc doesn't have any "children" attribute and mongo doesn't support cross document queries
[03:59] <wallyworld_> hence i need to read in the containers and do a $nin
[04:00] <wallyworld_> it sucks balls but that's mongo for you
[04:00] <fwereade> wallyworld_, ha, that would make sense then
[04:01] <wallyworld_> sadly it appears so :-(
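A minimal sketch of the two-step approach wallyworld_ describes (the field name and helper are hypothetical, not juju-core's actual schema): since MongoDB cannot query across documents, a first query collects the ids of machines that already host containers, and a second query excludes them with $nin.

```go
package main

import "fmt"

// buildCleanMachineQuery is a hypothetical helper: given the ids of machines
// that already host containers (gathered by a first read), it builds the
// $nin filter the second query would use to find "clean" machines.
func buildCleanMachineQuery(containerHostIDs []string) map[string]interface{} {
	return map[string]interface{}{
		"machineid": map[string]interface{}{"$nin": containerHostIDs},
	}
}

func main() {
	// Pretend the first query found that machines 1 and 3 host containers.
	q := buildCleanMachineQuery([]string{"1", "3"})
	fmt.Println(q["machineid"].(map[string]interface{})["$nin"]) // [1 3]
}
```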
[04:06] <wallyworld_> fwereade: replied to your comments
[04:06] <wallyworld_> also, the AddMachineParams is used outside of constraints for add-machine etc
[04:07] <wallyworld_> so its ContainerType attribute makes sense
[04:16] <fwereade> wallyworld_, ok, that's seeming clearer now
[04:16] <wallyworld_> \o/
[04:16] <fwereade> wallyworld_, I have a couple of comments on https://codereview.appspot.com/10777044/ -- thoughts?
[04:16]  * wallyworld_ looks
[04:17] <fwereade> wallyworld_, (there probably is a good reason, I just can't see it, and because I can't see it the drawbacks seem stark)
[04:20] <wallyworld_> fwereade: the "--force-machine lxc" syntax is analogous to "add-machine lxc". it means to create a new container on a new instance
[04:20] <wallyworld_> note i got rid of the leading "/" for add-machine lxc
[04:21] <fwereade> wallyworld_, yeah, I think the problem is in the implicit creation via these paths, not just the syntax chosen to do it
[04:21] <wallyworld_> but we want to support "--force-machine 25/lxc" ?
[04:22] <wallyworld_> which creates a new container on machine 25
[04:22] <fwereade> wallyworld_, and ctype is ignored if mid == "" in AddUnits anyway
[04:22] <fwereade> wallyworld_, yeah, I think so
[04:23] <wallyworld_> ok, i'll fix that. sorry, i've got so many branches on the go with tweaks in each i may have lost track a bit
[04:23] <fwereade> wallyworld_, I know the feeling :)
[04:24] <fwereade> wallyworld_, sorry, though, I'm not quite clear which bit you'll be fixing
[04:24] <wallyworld_> 1st branch (clean assignment policy): pass in entire constraints struct and for now just use container type
[04:25] <wallyworld_> instead of just passing in container type. less churn later
[04:25] <fwereade> +1
[04:25] <wallyworld_> 3rd branch (add unit force machine): disallow "lxc" and defer splitting up of force machine spec
[04:26] <fwereade> wallyworld_, I think that's the second branch
[04:26] <fwereade> wallyworld_, but that sounds perfect :)
[04:27] <wallyworld_> i think its the 3rd
[04:27] <wallyworld_> let me check, just gotta switch back to it
[04:27] <wallyworld_> pipeline is
[04:27] <wallyworld_>    clean-policy-container
[04:27] <wallyworld_>    force-machine-containers
[04:27] <wallyworld_> *  add-unit-force-machine
[04:27] <fwereade> wallyworld_, it's force-machine-containers; I haven't looked at the 3rd branch yet but so long as it follows what we just discussed I'll be happy :)
[04:28] <wallyworld_> ah sorry, was getting ahead of myself
[04:28] <wallyworld_> 3rd one adds force machine to add-unit
[04:28] <wallyworld_> similar to 2nd
[04:28] <fwereade> wallyworld_, cool
[04:29] <fwereade> wallyworld_, if you don't mind I will not look at that one right away because the most obvious comment will be "make it like the second one"
[04:29] <wallyworld_> sure, np. thanks for the input
[04:29] <fwereade> wallyworld_, cheers
[04:30]  * fwereade is just glad he's ~up to date with your reviews before your eod, roughly
[04:32]  * fwereade will be back a bit later
[06:47] <jam> fwereade: when do you sleep? :)
[06:47] <jam> dimitern: care to chat about when the password stuff triggers? I'm trying to sort out why the api is spinning after an upgrade. I have a thought, but I don't have as much depth here.
[06:48] <dimitern> jam: sure, as far as I can help
[06:49] <jam> dimitern: so we have a field OldAPIPassword, but that field is not referenced anywhere in the code. Can you confirm it has never been used?
[06:49] <jam> I think there was a discussion as to whether we have 2 passwords or just 1 for API vs direct State access
[06:49] <jam> I don't quite know what the current state is.
[06:50] <jam> or what it was in 1.10
[06:50] <dimitern> jam: if it's not used now that's new to me - it used to be used for the initial connection / password changing dance
[06:50] <jam> dimitern: agent/agent.go line 227 has "info.Password = info.OldPassword"
[06:50] <jam> Nothing in there touches APIPassword or OldAPIPassword
[06:51] <dimitern> jam: it's passed in agent conf on bootstrap i think
[06:51] <dimitern> jam: and then just changed
[06:52] <jam> dimitern: so, a small bit of confusion before clarity. There is agent.Conf which has OldAPIPassword in it, which is never used (that I can see)
[06:52] <jam> there is also an *optional* APIInfo pointer
[06:52] <jam> which can contain something which holds Password and OldPassword
[06:53] <dimitern> jam: if you take a look at machine.SetPassword - roger's recent changes made it call SetMongoPassword as needed
[06:53] <jam> api.Info doesn't have a spot for "OldPassword"
[06:54] <dimitern> jam: because at bootstrap time there's no api yet
[06:55] <jam> dimitern: so on fresh install of trunk, in agent.conf, I have: oldpassword: "", stateinfo: password: XXX, apiinfo:password: XXX where XXX is the same
[06:56] <jam> dimitern: on an upgrade, I have: oldpassword: Something, stateinfo: password: somethingdifferent, apiinfo:password: ""
[06:56] <jam> I have the feeling we need to use stateinfo.password for the apiinfo.password
[06:56] <jam> because the code is looping trying to connect with oldpassword and failing.
[06:56] <dimitern> jam: seems correct yes - api and state passwords should match
[06:59] <jam> dimitern: so what I *think* happened, in 1.10 we would bootstrap, create the agent.conf without an api password. The State connection would connect, and then immediately change the password, but it doesn't record that fact in the API password, only the stateinfo.password
[07:00] <dimitern> jam: seems likely, because iirc at that time api password changing was not working correctly
[07:01] <fwereade> jam, dimitern: as far as I'm aware stateinfo and apiinfo duplicate the same information, apart from the addresses
[07:01] <dimitern> fwereade: yes, or they should at least
[07:01] <jam> fwereade: as in, we should just ignore APIInfo.Password and always use StateInfo.Password ?
[07:01] <fwereade> dimitern, yeah, this is why I bitch and moan about these sorts of OAOO violations
[07:02] <fwereade> jam, i suspect that's actually worth a try... but probably not sane long-term
[07:02] <jam> what about the vestigial Conf.OldAPIPassword
[07:02] <dimitern> oaoo ?
[07:02] <fwereade> dimitern, Once And Only Once
[07:02] <dimitern> aah
[07:04] <fwereade> jam, on balance I think I would not have a problem with a tweak to agent.Conf that effectively dropped all of apiinfo apart from the addresses
[07:05] <jam> fwereade: is the thought that eventually we don't want to allow direct state access, so we actually want to drop the StateInfo side of that file?
[07:05] <fwereade> jam, but agent.Conf itself is kinda weird, what with how it serializes itself and all
[07:05] <fwereade> jam, yeah, basically
[07:06] <fwereade> jam, stateinfo is now authoritative but will one day be very infrequently used
[07:07] <fwereade> jam, using stateinfo solves a problem today but we'd need to do something like always read from stateinfo, and write to both, and still liberally apply fancy footwork to dance through the next upgrade
[07:08] <fwereade> jam, so... I dunno, I said yes above but probably not for pure or defensible reasons
[07:08] <fwereade> jam, some of this is academic because apparently dave did a release last night
[07:09] <fwereade> jam, I think we should at least be aiming for another one, imminently, that *can upgrade* from 1.10 without devastation
[07:09] <fwereade> jam, and consider that one for promotion to 1.12
[07:10] <jam> fwereade: so it isn't hard to change agent.OpenAPI to notice that Conf.APIInfo.Password isn't set but Conf.StateInfo.Password is set
[07:10] <jam> though my immediate result is that it may not be enough
[07:10] <fwereade> jam, yeah, I feel like there is probably no shortage of edge cases in play
[07:11] <jam> fwereade: is there a way to get juju to re-read its agent.conf without killing it?
[07:11] <fwereade> jam, heh, it doesn't do that anyway? well that sucks
[07:12] <jam> fwereade: afaict I can change it, and it keeps trying with the old content
[07:12] <fwereade> jam, oh, ffs
[07:13] <fwereade> jam, the original plan was that we store the separate bits in separate files, and make agent.Conf an accessor for those files, precisely so that agents could update parts of their own conf without falling over
[07:14] <fwereade> jam, eg so that we can write out state info gleaned from the api and then have our state worker come up because, yay, valid information just appeared
[07:15] <jam> fwereade: so ATM, the only place that reads the agent conf is at the beginning of cmd/jujud/machine.go: MachineAgent.Run()
[07:15] <jam> so if there is something that restarts the Agent.Run() it will reread it
[07:16] <jam> I *think* restarting the API is being done by the runner
[07:16] <fwereade> jam, IIRC the justification for not doing that was that the possible errors from reading the conf repeatedly were a hassle and dirtied up the api or something
[07:16] <fwereade> jam, yeah, if the conf is just read once in Run() it's useless for that purpose
[07:18]  * fwereade has already used up his daily quota of heavy sighs
[07:19] <dimitern> fwereade: can lend you some, if you need ;)
[07:19] <fwereade> haha :)
[07:20] <fwereade> ok, it could be worse; there could be snakes in here with me
[07:20] <dimitern> :D - with that weather?
[07:21] <fwereade> dimitern, I seem to have had the uk's two weeks of nice weather since I arrived
[07:21] <fwereade> jam, so, ok, some sort of change to agent conf has to be in the offing
[07:21] <dimitern> fwereade: ah, cool - summer time then
[07:21] <fwereade> jam, because sooner or later we *will* want to rewrite parts of it
[07:22] <jam> fwereade: I know that today it has a fair amount of cruft in it. But as you say, lets get something that *can* upgrade from 1.10 :)
[07:22] <fwereade> jam, and when we do that we either have to mess around locking access to it, or we need to break it up into parts that can be read independently
[07:23] <fwereade> jam, so I leave it up to your judgment what to actually do about it today
[07:34] <fwereade> jam, dimitern, tyvm for reviews
[07:34] <dimitern> fwereade: yw
[07:36] <jam> fwereade: happy to, you always return the favor.
[07:36] <fwereade> jam, the reason for the repeated startsyncs is the separate state objects... the agentState.Sync ensures that the appropriate event has been delivered to the apiserver's watcher, and at some point after that it will change in the local state
[07:36] <jam> fwereade: I *really* wish we encoded agent cert differently in the file, it is so hard to read manually.
[07:36] <fwereade> jam, tell me about it
[07:36] <jam> fwereade: a simple base64 into a string
[07:36] <jam> so nice
[07:37]  * fwereade knows, he really does
[07:38] <fwereade> jam, you know it'd be even nicer if we just put the cert and key files up directly and manipulated them on the instance
[07:38] <fwereade> jam, we put that same data up in far too many ways
[07:38] <fwereade> jam, I think there's another copy with the cert and key concatenated for use by the mongodb server itself
[07:45] <jam> fwereade, dimitern: so without touching juju-1.11* I tried just editing agent.conf so that it had the same password for apiinfo.password as stateinfo.password. I can see it try to login, fail, fall back to oldpassword, which also fails.
[07:45] <jam> Is there a reason to believe that StateInfo.password should be different from APIInfo.password ?
[07:46] <jam> would there be some other field that didn't get set properly?
[07:46] <dimitern> jam: i'd look in the agent conf rendering in cloud init
[07:46] <fwereade> jam, last time I asked roger (because I get nervous about this stuff periodically) he said they should be the same
[07:46] <jam> fwereade: on a fresh trunk install, they are the same
[07:46] <jam> and it works
[07:46] <jam> I'm trying to figure out if that password is stored somewhere
[07:47] <jam> differently in the db, for example
[07:47] <fwereade> jam, but, upgraded: the state one works, the api one is missing, and the state password doesn't work for the api?
[07:47] <jam> internally we check entity.PasswordValid which compares utils.PasswordHash(password) to entity.PasswordHash
[07:47] <jam> fwereade: correct
[07:48] <jam> fwereade: so I'm thinking that the "I'm connecting to the DB" password is stored in a mongo config, vs the "I'm a Machine-0 agent" is stored inside the DB
[07:48] <fwereade> jam, that should indeed be the case
[07:48] <jam> and 1.10 is setting the Mongo password, but not the contents of the entity in the DB
[07:48] <fwereade> jam, perhaps we were at one stage setting only the mongo password?
[07:48] <jam> so, essentially, the entity *has* no password in the db.
[07:48] <jam> inside the db
[07:49] <fwereade> jam, that sounds very plausible
[07:49] <jam> and we don't know what the password could be, because we only store password hashes.
[07:49] <jam> fwereade: so how do we unbreak it?....
[07:49] <jam> if entity.PasswordHash == "" any password is valid?
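The check jam describes can be sketched like this (passwordHash here is a stand-in using stdlib sha512; juju's actual utils.PasswordHash differs, but the shape of the comparison is the same — only hashes are stored, so a lost plaintext password cannot be recovered):

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// passwordHash is a hypothetical stand-in for utils.PasswordHash.
func passwordHash(p string) string {
	sum := sha512.Sum512([]byte(p))
	return base64.StdEncoding.EncodeToString(sum[:])
}

// passwordValid mirrors the entity.PasswordValid check: hash the candidate
// password and compare it against the stored hash.
func passwordValid(password, storedHash string) bool {
	return passwordHash(password) == storedHash
}

func main() {
	stored := passwordHash("secret")
	fmt.Println(passwordValid("secret", stored)) // true
	fmt.Println(passwordValid("guess", stored))  // false

	// The upgrade bug above: an entity written by 1.10 has storedHash == "",
	// and no plaintext hashes to "", so no login can ever succeed.
	fmt.Println(passwordValid("anything", "")) // false
}
```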
[07:50] <jam> fwereade: as an aside, did we sort out how to install lxc on upgrade?
[07:50] <jam> as the other 'we can't upgrade' bug
[07:50] <fwereade> jam, that strikes me as insanity
[07:50] <fwereade> jam, we did not, because I fell asleep yesterday before talking to thumper
[07:51] <jam> fwereade: and he's been sick anyway, I believe. Maybe we'll catch him at the standup.
[07:51] <fwereade> jam, ah ok
[07:51] <jam> fwereade: here are some issues: We can't change the 1.10 upgrade code. So even if we said "while you upgrade, set these values" it only works when upgrading from trunk, not to trunk.
[07:51] <jam> So we need a hook at first run of a new binary?
[07:51] <fwereade> jam, I'm having a ciggie before the standup, brb
[07:52] <jam> np
[07:52] <jam> mostly just thinking out loud
[07:58] <jam> fwereade: confirmed. m.doc.PasswordHash == ""
[07:58] <fwereade> jam, balls
[07:58] <fwereade> jam, ok we can fix that, we have a state connection
[07:58] <jam> fwereade: where does that happen?
[07:59] <jam> MachineAgent.Run() calls MachineAgent.FixAllTheUpgradeBreaks() ?
[07:59] <fwereade> jam, basically yeah :/
[08:05] <jam> mgz: team meeting? https://plus.google.com/hangouts/_/bf3f4cfe715017bf60521d59b0628e5873f2a1d3
[08:47] <wallyworld_> fwereade: so if you are able to +1 the current 3 branches (or indicate what needs fixing), i'll get tim to do another +1 tomorrow so i can land
[08:49] <fwereade> wallyworld_, I think what we agreed this morning is fine, I'll add to those CLs
[08:50] <wallyworld_> fwereade: ty. i've updated the first 2. i need to port the changes forward to the 3rd
[09:01] <fwereade> wallyworld_, awesome, tyvm
[09:01] <wallyworld_> np
[09:22] <mgz> yeay, we have a whole bunch of new public ips for canonistack!
[09:22] <mgz> it breaks a few things in juju, but hey
[09:50] <dimitern> fwereade, jam, TheMue: https://codereview.appspot.com/10949046 - deployer facade (done properly this time, i hope)
[09:51] <TheMue> fwereade: I added the race detection report to the issue. to me it looks like the calls to MgoSuite.SetUpTest() and MgoSuite.TearDownTest() dislike running in parallel.
[09:51] <jam> fwereade: just to get your pre-implementation approval before I spend a lot of time on it. We want to add a cmd/jujud/agent.go function which is called something like EnsureValidConfiguration(agent, st *state.State) and that gets called at the beginning of MachineAgent.Run() and ?UnitAgent?.Run()
[10:03] <TheMue> dimitern: you've got a review
[10:03] <dimitern> TheMue: thanks
[10:03] <TheMue> dimitern: yw
[10:04] <jam> dimitern: reviewing, I'll have some comments.
[10:06] <dimitern> jam: sure
[10:13] <dimitern> fwereade: ping
[10:17] <jtv1> Is there an easy way to verify in gocheck that a multi-line text contains a particular substring?  I tried Matches in multi-line mode but it only seems to work if my pattern occurs in the *first* line of the text.
[10:18] <dimitern> jtv: prepend '(*|\n)' before the regexp
[10:18]  * jtv tries
[10:19] <jtv> dimitern: that fails to parse...missing argument to repetition operator: `*`
[10:20] <dimitern> jtv: sorry, see this: 	c.Assert(err, ErrorMatches, "(.|\n)*cannot allocate a public IP as needed(.|\n)*")
[10:20] <dimitern> jtv: ErrorMatches (and Matches I think) prepend '^' and append '$' to the regexp, so this is a way around it
[10:21] <jtv> Oh ffs...
[10:22] <jtv> Thanks for pointing that one out...  pretty horrible to leave that kind of subtlety undocumented!
[10:22] <dimitern> jtv: tell me about it!
[10:22] <dimitern> jtv: spent half a day discovering that at the time
[10:23] <jtv> Well then I'm very, very glad that I ran into you.  :-)
[10:23] <dimitern> ;) np
[10:26] <jam> dimitern: I'm going to publish my comments, though I want to chat with you about them because I realize I'm *slightly* off in my statements.
[10:26] <dimitern> jam: ok
[10:27] <jam> dimitern: so the fundamental thing, is you are generating the set-of-all-units at construction of DeployerAPI time.
[10:27] <jam> which I think is too early.
[10:27] <jam> A MachineAgent will be running a long time.
[10:28] <jam> and it will get units added and removed as time goes on
[10:28] <dimitern> jam: getAuthFunc gets called every time an entity is about to be accessed, so we get a fresh list of units that are relevant
[10:28] <jam> it won't be re-connecting to the API during that time.
[10:28] <jam> dimitern: your getAuthFunc only checks the set I think
[10:28] <jam> ah maybe you're right. It looked like it was canned, but I guess the outer function returns the inner.
[10:29] <dimitern> jam: no the function it returns only checks the set, but the set is constructed in the beginning of each Remove, SetPasswords or Life calls
[10:29] <jam> dimitern: so... can we add some tests in deployer_test about the specifics of removing
[10:29] <jam> so that we know we got our getAuthFunc right?
[10:29] <dimitern> jam: sounds reasonable, will do
[10:30] <dimitern> jam: and no, probably WatchUnits is not the best way to get them, but it certainly is the easiest
[10:30] <dimitern> jam: otherwise, i have to get all principals, for each one get all subs, and then construct the list
[10:31] <dimitern> jam: william suggested using the watcher as a simpler way to do the same
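The closure pattern being discussed looks roughly like this (names are illustrative, not the real juju-core API): the outer function runs at the start of each API call and builds a fresh set, and the returned closure only checks membership.

```go
package main

import "fmt"

// getAuthFunc sketches the pattern: called per API method (Remove,
// SetPasswords, Life), it fetches the current unit list, builds a set,
// and returns a closure that only tests membership in that set.
func getAuthFunc(fetchUnits func() []string) func(tag string) bool {
	allowed := make(map[string]bool)
	for _, u := range fetchUnits() {
		allowed[u] = true
	}
	return func(tag string) bool { return allowed[tag] }
}

func main() {
	units := []string{"unit-wordpress-0"}
	fetch := func() []string { return units }

	auth := getAuthFunc(fetch)
	fmt.Println(auth("unit-wordpress-0")) // true
	fmt.Println(auth("unit-mysql-0"))     // false

	// After deployments change, the next API call sees a fresh list:
	units = append(units, "unit-mysql-0")
	auth = getAuthFunc(fetch)
	fmt.Println(auth("unit-mysql-0")) // true
}
```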
[10:32] <jam> dimitern: k. It still feels a bit strange to me. I think the timeout comment still applies.
[10:32] <dimitern> jam: i can try relying on the initial changes set and not timeout, but blocking seems even worse
[10:32] <dimitern> jam: in production code especially
[10:33] <jam> dimitern: (a) a timeout of 50ms is way too slow in a real deployment, (b) given our assumptions about the initial event on watchers, blocking doesn't seem that bad.
[10:33] <jam> If it actually blocks, we've got code level problems, right?
[10:33] <jam> not "oh the database is busy now".
[10:34] <dimitern> jam: not sure, that's why i'd like fwereade opinion on that
[10:34] <dimitern> jam: i can change it to block, no problem
[10:35] <fwereade> dimitern, sorry, breakfasting
[10:35] <fwereade> dimitern, reading back
[10:35] <jam> fwereade: you know that you're not allowed regular body functions like eating sleeping or going to the bathroom, right? You're barely allowed to smoke :)
[10:35] <dimitern> :D
[10:36] <jam> dimitern: so if the initial Changes event blocks (hard, as in never completes), it would be because our assumption about how Changes works is wrong
[10:36] <jam> dimitern: If it just takes a while (2s, 5min) it is because the database is heavily loaded.
[10:37] <dimitern> jam: so you propose to use LongWait if we need timeouts?
[10:37] <fwereade> jam, dimitern: ...and so if that's blocking hard, probably so would an explicit get-and-collect
[10:37] <jam> I could argue for some "unreasonably long" timeout, but that would probably be minutes.
[10:37] <jam> not even 500ms
[10:37] <jam> fwereade: I would expect so, yes.
[10:38] <jam> fwereade: I'm roughly ok using changes as a shortcut for the get+collect because we know changes already has that logic.
[10:38] <fwereade> jam, dimitern: do you feel it's unreasonable to depend on a guaranteed first event (which may ofc be a channel close in case of error...)?
[10:38] <dimitern> fwereade: so what's it gonna be - blocking read or longer timeout?
[10:38] <jam> I just think we shouldn't timeout, or should take *much* longer to timeout.
[10:38] <fwereade> jam, dimitern: yeah, I would prefer to trust the underlying watcher to do what it should
[10:38] <jam> dimitern: I slightly prefer blocking read in a "less chance we got this tunable parameter wrong"
[10:38] <dimitern> fwereade, jam: ok, will change it to blocking then
[10:40] <jam> dimitern: I hope I'm getting the concept of when to timeout across. Timeout is useful for detecting when you have bad *code*. That means you want to do it in testing. In production you should have already caught the bad code and fixed it, so you can just trust the thing is written correctly (and if it isn't, write more tests :)
[10:40] <jam> dimitern: and then on things that are genuinely uncertain
[10:40] <jam> like network access
[10:41] <jam> does that make sense?
[10:41] <jam> fwereade: you're welcome to correct my thought if you feel differently.
[10:41] <jam> (I'm certainly willing to debate it)
[10:42] <dimitern> jam: the worst thing that can happen is, we'll just return late from the method calling getAuthFunc, with an error
[10:42] <fwereade> jam, that makes sense to me at least in this case
[10:42] <jam> (13:51:49) jam: fwereade: just to get your pre-implementation approval before I spend a lot of time on it. We want to add a cmd/jujud/agent.go function which is called something like EnsureValidConfiguration(agent, st *state.State) and that gets called at the beginning of MachineAgent.Run() and ?UnitAgent?.Run()
[10:42] <jam> since you were out
[10:42] <fwereade> jam, ah sorry
[10:42] <fwereade> jam, that sounds plausible to me though
[10:43] <jam> k
[10:43] <jam> fwereade: manager meeting in 15, right?
[10:43] <jam> Can I get anything productive done in that time...
[10:43] <fwereade> jam, oh, god, already, bah
[10:46] <dimitern> guys, since I'm the OCR today, I'll take a look at the review queue after I land this branch of mine
[10:59] <jtv> dimitern: I've got some fun ones lined up for you.  :)
[10:59] <dimitern> jtv: cool :) keep 'em coming
[11:06] <jam> jtv: do we need to update gwacl on go-bot for your patch to land?
[11:06] <jam> https://code.launchpad.net/~jtv/juju-core/az-customdata/+merge/174157
[11:11] <rvba`> jam: yes
[11:20] <rvba> jam: but don't update right now, you need the fix in jtv's branch to cope with the changes in gwacl.
[11:23] <jtv> rvba: I figured it'd be either yours or mine, doesn't matter which lands first.  :)
[11:24] <jam> rvba: do they need to land in lockstep?
[11:24] <rvba> jtv: it will be yours.
[11:24] <jtv> jam: rvba's branch and mine?  No, they're decoupled otherwise.
[11:24] <jam> (the branch can't land without the update but the update can't be done without the branch)
[11:24] <jam> jtv: gwacl + juju-core
[11:24] <jtv> But whichever comes first will also necessitate a gwacl update.
[11:25] <rvba> jam: that's right.
[11:25] <jtv> So yes, as soon as mine lands, we'll need the gwacl update.
[11:25] <jam> jtv: actually, as soon as yours is scheduled for landing
[11:26] <jtv> That works for me too.  :)
[11:27] <jam> jtv: which means I (or someone like me) needs to be pinged when you are ready to mark it Approved
[11:27] <jam> jtv: that's my question about lock-step
[11:27] <jam> if you can land the code, and then update gwacl, or the reverse it is much easier to do
[11:28] <jtv> Yes...  unfortunately gwacl upstream is already updated.
[11:28] <jam> jtv: dependency changes are roughly equivalent to db schema changes in LP
[11:28] <jtv> Yes I understand.
[11:37] <jtv> dimitern: allenap` had a better way to get multi-line regex matching...  The "s" flag.  It makes newlines match "."  For example, "(?s).*foo.*" looks for any instance of "foo" in the input.
[11:44] <dimitern> jtv: if it works, great
[11:44] <dimitern> jtv: with the ^$ already in place
[11:57] <jam> fwereade, dimitern: Beta1 freeze is August 29th, so 4 months into the cycle.
[11:57] <jam> Alpha 2 will be July 25th.
[11:58] <dimitern> cool, soon then
[13:05] <dimitern> fwereade: ping
[13:10] <dimitern> abentley: hey, after the changes rvsubmit works fine, just the approvers list could still be duplicated
[13:10] <abentley> dimitern: I don't understand that.  The approvers are represented as a set, so they can't have duplicates.
[13:11] <fwereade> dimitern, pong
[13:11] <abentley> dimitern: Could you point me at the case where this happened?
[13:11] <dimitern> abentley: I don't know how, but my previous MP had that - 2 LGTMs from 2 people and the list in the commit message contained twice each reviewer name
[13:11] <dimitern> fwereade: can you take a look please? https://codereview.appspot.com/10949046/
[13:12] <abentley> dimitern: Can you please give me a link so I can check them out?
[13:12] <dimitern> abentley: http://bazaar.launchpad.net/~go-bot/juju-core/trunk/revision/1417
[13:13] <abentley> dimitern: Please give me a link to the merge proposal or the Rietveld code review, not to the branch.
[13:14] <dimitern> abentley: it's in the commit message, sorry: https://codereview.appspot.com/10944044/
[13:20] <abentley> dimitern: When I run "bzr rv-submit https://code.launchpad.net/~dimitern/juju-core/066-apiserver-stringswatcher/+merge/173896", it wants to append R=fwereade, jameinel to the bottom, which doesn't give any duplicates.  Did you maybe run rv-submit against this branch before I fixed the duplicate problem?
[13:21] <dimitern> abentley: ah, sorry - this was before, haven't tested for duplicates after the fix, just ensured the approved revision is set and the bot is happy
[13:21] <dimitern> abentley: will try it out by adding an extra lgtm next time i land
[13:22] <abentley> dimitern: Sorry, what?  Were you saying the bug still exists without testing whether the bug exists?
[13:23] <abentley> dimitern: I fixed the duplicates bug yesterday because you pointed it out.
[13:23] <dimitern> abentley: no, i looked at the patch and it seemed so, sorry
[13:24] <abentley> dimitern: Okay.
[13:24] <dimitern> abentley: didn't claim i've seen the bug, was just asking if it was fixed as well
[13:25] <dimitern> abentley: thanks again for the quick fix
[13:25] <abentley> dimitern: I thought you were saying the bug still existed when you said "the approvers list could still be duplicated".
[13:26] <dimitern> abentley: bad phrasing on my part
[13:27] <abentley> dimitern: Okay.
[13:45] <fwereade> dimitern, reviewed, let me know your thoughts
[13:46] <dimitern> fwereade: cheers
[13:48] <dimitern> fwereade: i don't get why block the removal of alive units?
[13:49] <fwereade> dimitern, because the deployer gosh-darn-well ought to be deploying them, not trashing them
[13:49] <fwereade> dimitern, and the provisioner should be doing the same thing in the same situation
[13:50] <dimitern> fwereade: and if we want to remove something that's stuck?
[13:50] <fwereade> dimitern, it's not the deployer's responsibility to make a unit dying, but it's the deployer's responsibility to lead it to the grave once it is
[13:50] <jam> dimitern: fwereade: I'm proposing a patch now which fixes bug #1199915 . It isn't perfectly tested, but I did do a live upgrade test, and it does fix the existing bug
[13:50] <_mup_> Bug #1199915: api cannot connect, fills deployed machine log with spam <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1199915>
[13:50] <fwereade> dimitern, we tell the system we want to remove it by making it dying, not alive
[13:51] <jam> ugh, wrong one
[13:51] <jam> I fixed the LXC bug
[13:51] <jam> bad me
[13:51] <jam> bug #1199913
[13:51] <_mup_> Bug #1199913: upgrade from 1.10 to trunk cannot deploy containers <juju-core:Triaged> <https://launchpad.net/bugs/1199913>
[13:51] <fwereade> dimitern, stuckness is I think orthogonal here
[13:51] <dimitern> fwereade: ok, fair enough
[13:52] <dimitern> fwereade: i'd prefer not to do the AssignedUnitNames in this CL though
[13:52] <dimitern> fwereade: (and I thought you said you wouldn't mind the watcher approach ;)
[13:53] <jam> fwereade, dimitern: https://codereview.appspot.com/10986044/
[13:58] <fwereade> dimitern, yeah, I think I reiterated there that I don't *mind* it but it would probably still be better as a direct state method; twas a rumination not really a request
[13:58] <dimitern> fwereade: ok ;)
[14:03] <fwereade> jam, reviewed
[14:08] <dimitern> jam: reviewed as well
[14:28] <rvba> dimitern: Hi! May I add another review to your queue? https://codereview.appspot.com/11166043
[14:29] <dimitern> rvba: sure, will get to it a bit later
[14:29] <rvba> dimitern: ta!
[14:30] <jam> fwereade: so there is fslock, but the actual path to lock is pretty hard to get at. Thoughts?
[14:30] <jam> fwereade: and yes, MachineAgent will be broken on non-debian-based platforms, but it already is today.
[14:31] <jam> And whatever we do to make Container a bit happier there *should* give us a guide for apt-get install.
[14:31] <jam> (I'm hoping )
[14:31] <jam> Sure RH will be different
[14:31] <jam> but until we actually support RH.
[14:31] <jam> mgz mentioned trying to have cloud-init do the work, but we looked at it, and it isn't really designed around calling as a script
[14:33] <fwereade> jam, the path isn't that hard is it? it's relative to dataDir I thought
[14:35] <jam> fwereade: "what" dataDir?
[14:36] <fwereade> jam, *the* dataDir :)
[14:36] <fwereade> jam, it's a global lock for all units
[14:37] <fwereade> jam, they all get to it relative to dataDir, ie, usually /var/lib/juju
[14:41] <fwereade> jam, this may be messy actually :/ we do *not* want the machine agent to go down and wait on that lock while a unit finishes running a long hook. blast
[14:43] <dimitern> fwereade: should I change the Remover to refuse removing alive entities?
[14:44] <fwereade> dimitern, yes please, it should just be an additional check once you've got the unit
[14:44] <dimitern> fwereade: I never have a unit - I need to amend the Remover to have Life() as well
[14:44] <dimitern> fwereade: (the interface)
[14:46] <fwereade> dimitern, ha, good point
[14:47] <dimitern> fwereade: i'm embedding a lifer into the remover
[14:47] <fwereade> dimitern, I don't think that's too much of an issue in practice? perfect
[14:47] <dimitern> fwereade: but since both embed tagger, would that be an issue?
[14:48] <fwereade> dimitern, just embed lifer instead of tagger
[14:48] <fwereade> dimitern, it's coherent
[14:48] <dimitern> fwereade: ah! yes
[14:48] <fwereade> dimitern, remove only currently makes sense for lifecycle entities anyway
[14:48] <jam> fwereade: we could use it in a strictly advisory way.
[14:48] <jam> Try to grab it
[14:48] <jam> if we fail, still try to install lxc
[14:49] <jam> worst case, the apt-get install fails
[14:49] <jam> and we are in the same boat
[14:49] <jam> best case, we prevent a race
[14:49] <jam> so try for eg 1 minute
[14:49] <fwereade> jam, I forget how fslock works, is it plausible that we could grab it before restarting?
[14:49] <fwereade> jam, and release it after handling upgrade changes on startup?
[14:49] <jam> fwereade: no, because we don't control the old code.
[14:50] <fwereade> doh
[14:50] <jam> fwereade: remember 1.10 is the thing triggering the upgrade
[14:50] <jam> fwereade: if we had hooks during upgrade, we could tell it to install lxc before restarting :)
[14:50] <fwereade> jam, ok, sgtm
[14:50] <fwereade> jam, indeed :)
[14:58] <jam> fwereade: I updated the CL: https://codereview.appspot.com/10986044/  I'm testing it now, but since I'm "not working in the evening" I figured I would solicit feedback before I finish.
[14:59] <jam> dimitern: ^^
[15:01] <dimitern> jam: will look soon
[15:04] <jam> dimitern, fwereade: confirmed that it solves bug #1199913
[15:04] <_mup_> Bug #1199913: upgrade from 1.10 to trunk cannot deploy containers <juju-core:Triaged> <https://launchpad.net/bugs/1199913>
[15:04] <dimitern> jam: great!
[15:06] <jam> I'm off again (for dinner), I will definitely try to check back in tonight. (I only have 3 more working days before I go on vacation, so I'll try to put in some overtime :)
[15:08] <fwereade> jam, LGTM
[15:09] <dimitern> jam: LGTM with 1 comment
[15:14] <dimitern> fwereade, jam: updated https://codereview.appspot.com/10949046
[15:42] <fwereade> dimitern, need to take a break for a bit I'm afraid
[15:42] <dimitern> fwereade: sure, when you can; i can use a break myself
[16:57] <teknico> marcoceppi: hi, some improvements (hopefully :-) ) to your discourse charm: https://code.launchpad.net/~teknico/charms/precise/discourse/add-cheetah-pkg-misc-fixes/+merge/174248
[17:05] <marcoceppi> teknico: thanks! I've been hoarding progress on the charm, so this is a good excuse to update the remote branch
[17:06] <teknico> marcoceppi: you're welcome!
[17:07] <teknico> marcoceppi: if you don't mind me asking, any specific reason the charm is written in shell script, rather than something saner? :-)
[17:08] <marcoceppi> teknico: I hack faster in shell :)
[17:09] <teknico> marcoceppi: oh, I see. It must be painful ;-)
[17:37] <hazmat> mramm, so re current state of containers
[17:37] <mramm> hazmat: sure...
[17:38] <hazmat> mramm, we can create them with add-machine 0/0 but how do we deploy to them or is that post local provider?
[17:38] <hazmat> is that just going to be a constraint for containerized deploys
[17:39] <mramm> juju deploy wordpress --force-machine
[17:39] <mramm> with syntax like 0/lxc1
[17:39] <mramm> the spelling of all that is up for debate with william and mark tomorrow morning
[17:40] <hazmat> ic
[17:40] <hazmat> thanks
[20:37] <fwereade> mgz, ping
[20:38] <fwereade> mgz, if you're around, do you know offhand how we push tools to hp cloud on release?
[21:31] <_thumper_> morning
[21:36] <thumper> mramm: hi there, I'm semi-functional
[21:36] <thumper> mramm: so working, but not up to full speed
[21:36] <mramm> gotcha
[21:37] <mramm> glad to see you are feeling better!
[21:45]  * thumper sighs
[21:45] <thumper> fwereade: ping
[21:45] <thumper> I guess I wasn't able to articulate why I didn't want the agent stuff merged with tools
[21:45] <thumper> while it is useful in some way
[21:45] <thumper> it again brought in all the state and api dependencies
[21:46] <thumper> which we were trying to get away from
[21:46] <thumper> the whole point of tools.Tools was to be able to use it *without* importing state
[21:47]  * thumper thinks on how to untangle
[21:48]  * thumper wonders how to defer this horrible problem
[21:49] <thumper> we suck as a team at dependency management
[21:49] <thumper> the whole interface segregation principle is violated throughout the code
[21:49] <thumper> with willful abandon
[22:07]  * thumper has import loops again...
[22:07] <thumper> due to this lack of segregation
[22:07]  * thumper thinks
[22:09]  * thumper gives up on this
[22:24]  * fwereade hears thumper loud and clear
[22:27] <fwereade> thumper, hi, btw, hope you're on the mend
[22:27] <thumper> I think I am
[22:27] <thumper> just not fully there yet
[22:27] <fwereade> thumper, glad to hear it, we've been missing you
[22:27] <thumper> fuzzy head and all that
[22:27] <fwereade> thumper, don't overdo it :)
[22:27]  * thumper smiles
[22:27] <thumper> I'm going to try to work out how to get local tools synced for the local provider without having to build them
[22:28] <thumper> I have a plan
[22:28] <fwereade> thumper, frank is not currently working on auto-sync-tools but may do very soon, so please let him know if you start work on it
[22:29] <fwereade> thumper, sounds like it may be at risk of collision
[22:29] <thumper> ack
[22:36] <m_3_> does anybody know what happened to update-alternatives in juju-core-1.11.2-3~1414~raring1?
[23:36] <thumper> m_3_: nope, perhaps dave does, he should be around soon
[23:38] <m_3_> ack... thanks