/srv/irclogs.ubuntu.com/2012/07/17/#juju-dev.txt

niemeyerdavecheney: Proposal in the list02:41
rogdavecheney: mornin'06:33
davecheneyrog: howdy06:35
=== philipballew__ is now known as philballew
fwereade_rog, heyhey07:20
rogfwereade_: mornin' guv07:22
fwereade_rog, how's it going?07:22
rogfwereade_: am finally looking at relation-units-watcher BTW...07:22
fwereade_rog, cool, if you see any obvious simplifications I will be most pleased07:22
fwereade_rog, it's an awful lot better now it has watch-presence-children to depend on07:23
rogfwereade_: the main simplifications would be gained, i believe (although i haven't properly verified), by applying the CL i proposed yesterday07:23
rogfwereade_: i may be wrong, but all that tomb checking seems unnecessary.07:25
fwereade_rog, so how do I avoid sending changes after departs?07:25
rogfwereade_: instead of keeping two maps, one with tombs and one with names, i'd keep a single map with a locally defined relationUnit type.07:27
fwereade_rog, maintaining state on the loop goroutine feels kinda hacky to me, like saying "we can't make this do the right thing so patch it up later"07:27
rogfwereade_: then it would be trivial to mark that so that you don't issue any updates after a depart07:27
rogfwereade_: you're already maintaining state with the tombs07:27
fwereade_rog, (state to filter the changes, I mean)07:27
rogfwereade_: the unitLoop code would be much simpler, for one07:28
rogfwereade_: you'd have a couple of extra lines to maintain the state, but about 30 less (total guess!) lines overall07:28
fwereade_rog, and, hmm, I don;t think I need names at all anyway, do I?07:30
fwereade_rog, oh bother, yes I do, for removes07:31
rogfwereade_: and the same applies to TheMue's firewaller code. i saw your suggestion to add tombs and i thought, yes you're right, the way things are, we do. but... noooo, it's already much bigger than it needs to be this is so unnecessary if we just changed our invariants a little07:31
rogoops07:31
* rog hates pressing return when he meant to press backspace07:31
rogwereade_: and the same applies to TheMue's firewaller code. i saw your suggestion to add tombs and i thought, yes you're right, the way things are, we do. but... noooo, it's already much bigger than it needs to be  and unnecessary if we just changed our invariants a little07:32
rogis what i meant to say07:32
rogfwereade_: all the code looks very reasonable given our current rules BTW. i'm just thinking that we could make things simpler if we wanted to.07:34
fwereade_rog, I think making those guarantees makes our lives simpler overall07:34
fwereade_rog, your mileage clearly varies on this ;)07:34
rogfwereade_: clearly :-)07:34
rogfwereade_: how does it make our lives simpler?07:35
rogfwereade_: given that it's adding lots of code (IMHO obviously)07:35
fwereade_rog, it's easier to reason about the components when we use them07:35
fwereade_rog, a Stop that doesn't really stop feels kinda weak to me07:35
rogfwereade_: if you want a Stop that really stops, it's trivial to do.07:36
rogfwereade_: you just stop and wait for eof on the channel.07:36
rogfwereade_: think of it as flushing the pipeline07:36
rogfwereade_: where the pipeline consists of components like unitLoop07:37
fwereade_rog, indeed, we can't do it when we have components like unitLoop; we have to add that deadness state07:38
rogfwereade_: actually, i think we could do it without the deadness state07:39
rogfwereade_: you *already* do t.Kill then t.Wait07:39
rogfwereade_: which is the moral equivalent07:39
fwereade_rog, I thought you were saying the tombs were unnecesssary?07:40
rogfwereade_: they are07:40
fwereade_rog, and to make that work I need the selects that you don't like07:40
rogfwereade_: i'm saying that you could close the unit's stop channel, then wait for eof on its update channel.07:40
fwereade_rog, we have one updates channel for all unit loops07:41
rogfwereade_: oh yeah, good point :-)07:41
rogfwereade_: i knew there was a reason07:41
fwereade_rog, as I said I'm really +-0 on the change -- it feels to me like the things it makes easier are balanced by the things it makes harder07:42
fwereade_rog, either would be a reasonable way to start out but I don;t feel like a switch in midstream really gains us much07:43
rogfwereade_: from my point of view, we are just starting out on the real uses of watchers.07:43
fwereade_rog, I think I'm just less bothered by all the selects than you are ;)07:43
rogfwereade_: which is why i'm suggesting the change now07:43
fwereade_rog, in the end it's niemeyer's call and I don't think he's convinced... but maybe he will be by looking at relationUnitsWatcher :)07:45
fwereade_rog, I don't feel it's a win in this context but I am probably too close to the current implementation to be enitirely objective07:45
rogfwereade_: i'll try and rephrase relation units watcher to try and show the advantage. i'll probably fail :-)07:46
fwereade_rog, either way one of us will learn something :)07:47
rogfwereade_: i suppose it's coming from my experience with programming with channels and that working with one-way pipelines is much more elegant than two-way comms.07:47
fwereade_rog, makes sense; I personally feel that a one-shot 1-bit pathway in the second direction gives us some explicit clarity that I find helpful07:49
fwereade_rog, I may well come to change this view in time :)07:50
rogfwereade_: here's how i might expect unitLoop to look, BTW: http://paste.ubuntu.com/1096214/07:54
* fwereade_ remains unconvinced :)07:56
* fwereade_ is also going to pop out for some breakfast, he was up late reviewing last night07:57
rogfwereade_: enjoy. i haven't broken my fast yet...07:58
fwereade_rog, cheers07:58
Arammoin.08:18
fwereade_rog, do you recall there was a dicsussion somewhere about journalling for workflow state transitions in the UA?09:00
rogfwereade_: yeah09:00
fwereade_rog, I was thinking of doing something like that and thought I should probably catch up with the consensus09:00
rogfwereade_: i made some notes, which ended up in one of the google docs09:01
rogfwereade_: one mo, i'll find 'em09:01
fwereade_rog, awesome, tyvm09:01
rogfwereade_: they'd been removed from that doc, but i found them anyway... http://paste.ubuntu.com/1096304/09:04
fwereade_rog, cheers09:04
rogfwereade_: i'd had a few more thoughts further along those lines; i'll see if i can find some more notes09:05
fwereade_rog, hmm, ok, this is for hook executions specifically?09:06
rogfwereade_: yes09:07
rogfwereade_: well, anything that takes any time really09:07
rogfwereade_: so that we don't miss updates09:07
* fwereade_ ponders09:07
Aramis the meeting in one hour or two hours?09:08
* Aram is confused about TZ again.09:08
fwereade_rog, it kinda feels like it should be for state transitions rather than hook executions09:08
fwereade_Aram, I *think* it's 2h09:08
rogAram: 2 hours, i think09:08
Aramthanks.09:09
* Aram heads for a snack.09:09
rogfwereade_: perhaps09:09
rogfwereade_: i wanted to avoid writing all state data to the log file09:10
fwereade_rog, if you mean the settings of every unit, I agree09:10
fwereade_rog, but I think in general we do need to keep track of known membership per-relation and latest settings version per-related-unit09:11
rogfwereade_: my idea was that each thing that can trigger a hook corresponds to a single event from some watched thing, and each watched thing has a version.09:13
fwereade_rog, not true09:13
fwereade_rog, install/start don't09:13
fwereade_rog, config-changed does09:13
rogfwereade_: install/start i think can be treated as special cases.09:14
fwereade_rog, more critically, relation unit changes are not 1:1 with hook execution09:14
rogfwereade_: agreed.09:14
rogfwereade_: it's 1:N, right?09:14
fwereade_rog, I think the only quibble is that we always need to do a -changed after a -joined09:15
fwereade_rog, so in practice IMO that actually means we should always precede a -changed of a hitherto unknown unit with a -joined09:15
rogfwereade_: that should work ok09:15
fwereade_rog, yeah, the code seems to want it :)09:15
rogfwereade_: that was the idea behind "for all outstanding intentions"09:16
fwereade_rog, ahhh, got you09:16
rogfwereade_: because you might excecute a joined but not its associated changed09:16
fwereade_rog, except, hmm, consider -broken09:17
* rog considers broken.09:17
fwereade_rog, that overrides *everything*09:17
fwereade_rog, clear the queue, just break the relation09:17
rogfwereade_: remind me of the semantics of broken09:17
fwereade_rog, called when the unit itself leaves the relation09:17
fwereade_rog, doesn't have access to any useful state really09:18
fwereade_rog, similarly, actually, not sure how queue reduction fits in09:18
rogfwereade_: sorry, i don't remember at all how -broken works.09:18
fwereade_rog, it's just the hook for "this specific is no longer part of the relation"09:19
rogfwereade_: queue reduction is easy i think09:19
fwereade_s/specific/specific unit/09:19
rogfwereade_: assuming i understand what you mean by that09:19
fwereade_rog, stuff like "a -changed followed by a -departed" should just be a -departed09:20
rogfwereade_: the idea is that we only add to intentions when we are just about to execute the hook09:20
rogfwereade_: so the queue reduction can happen before the intentions make it to disk09:20
fwereade_rog, ok, cool09:21
fwereade_rog, so it's just for what in python is the HookScheduler, rather than the workflow stuff09:22
rogfwereade_: the workflow stuff?09:22
rogfwereade_: the stuff that works out what hooks to execute given what state changes?09:22
fwereade_rog, keeping track of the states of the unit and its relations09:23
fwereade_rog, more or less yeah09:23
rogfwereade_: i'm imagining that the workflow could be a goroutine pipeline component09:25
rogfwereade_: which takes in any changes and spits out intentions09:25
fwereade_rog, yeah, I'm having leanings in that direction but not quite sure how it will all fit together09:26
rogfwereade_: but i'm not entirely sure how that would work, it's just an initial gut feeling09:26
fwereade_rog, it needs to store its own state in ZK for sure09:26
rogfwereade_: what state does it need to store?09:26
fwereade_rog, er, its state -- charm_upgrade_error, or installed, or running, or whatever09:27
fwereade_rog, so status can see it09:27
fwereade_rog, I think state maintenance and actual hook execution are distinct problems09:28
rogfwereade_: i'm wondering if it would be better that the central loop did that09:28
rogfwereade_: i see hook execution and state maintenance as two sides of the same coin09:28
fwereade_rog, no question that they're intimately connected09:29
fwereade_TheMue, morning09:29
rogTheMue: hiya09:29
TheMuerog, fwereade_ : Morning09:29
rogfwereade_: thing is, some of the state maintenance transitions come from the results of hook executions, i think09:30
fwereade_rog, some but not all09:31
rogfwereade_: which seems to argue in favour of having the central hook-execution loop responsible for all of them, that way there's only one thing in charge of the state.09:31
fwereade_rog, I don't think that's right... hook execution is a single and relatively simple responsibility09:32
fwereade_rog, having the single hook executor goroutine responsible for keeping the separate workflows of the unit and all its relation up to date, especially when not all workflow transitions necessarily correspond to hook executions, sounds like a nightmare09:33
fwereade_rog, *some* external state changes imply workflow state transitions, *some* of which imply hook executions09:34
rogfwereade_: hmm, it seems like there's a two way flow, which i hadn't appreciated before. i.e. a different sequence of hooks will be executed depending on the results of previous hooks09:34
fwereade_rog, yeah, for example if we're in an error state we won't be worrying about any other transitions that we otherwise might be expected to pay attention to09:35
rogi'd been imagining: {external state changes -> workflow state transitions -> hook executions} as a one-way pipeline flow09:35
fwereade_rog, I *think* that still applies, what did I miss?09:36
rogfwereade_: but perhaps an error state just implies throwing all workflow state transitions away, which would be easy09:36
fwereade_rog, I *think* it means treating that workflow as though the UA was not executing at all09:37
rogfwereade_: right09:37
fwereade_rog, and then when it's active again doing a big-bang diff against the last known state when the workflow was sane09:37
fwereade_rog, and executing whatever changes come from that09:37
rogfwereade_: "having the single hook executor goroutine responsible for keeping ...". that wasn't my intention09:38
fwereade_rog, ah, sorry, I guess I'm just not clear what the intentions correspond to if not hook executions09:38
rogfwereade_: i'd thought {external state changes -> workflow state transitions -> (hook executions + state maintenance)}09:38
fwereade_rog, I just can't yet figure out how many places we actively need journalling, and of what sort09:41
fwereade_rog, I'm not sure the hook *executor* needs it at all but I haven't yet even convinced myself09:41
rogfwereade_: the question is: if we crash half way through executing a hook, how do we know that we need to execute it again when restarting?09:42
rogfwereade_: actually perhaps the pipe line could be extended: {external state changes -> workflow state transitions -> hook executions -> state maintenance}09:44
fwereade_rog, I need to refresh my memory with the python it seems :)09:48
rogfwereade_: because AFAICS all state maintenance is done as a result of hook executing (and charm upgrade, presumably, which i think could be considered similar)09:49
fwereade_rog, ok, a given workflow transition is associated with N lifecycle operations which may or may not execute hooks, any of which may in general cause an error and thereby a different workflow transition and set of lifecycle operations09:53
rogfwereade_: does an error ever in fact cause anything more than the cessation of lifecycle operations?09:54
fwereade_rog, it usually leads to a new state transition IIRC09:54
fwereade_workflow transition09:54
rogfwereade_: i'm not sure that answers my question09:55
fwereade_rog, well, it leads to a different workflow transition than the one we expected09:56
fwereade_rog, but in general all the workflow is responsible for is telling the lifecycle to do stuff and writing new state when it has done so09:56
fwereade_rog, so the answer kinda depends on your perspective09:57
rogfwereade_: that's the way the python code is done, yeah, but i'm wondering if we need that generality09:57
rogfwereade_: because that generality is the thing that means that it's not a one-way pipeline09:58
fwereade_rog, I think it's reasonable to suggest that the way the workflow and lifecycle can affect one another is confusing and couldprobably be done better09:59
rogfwereade_: and i'm wondering if the error state is sufficiently special that we don't mind hard-coding that09:59
fwereade_rog, we have a bunch of error states specific to what transition failed, and how we recover from them differs09:59
fwereade_rog, I'm not certain this is an essential property of the system though10:00
rogfwereade_: interesting10:00
rogfwereade_: what sort of recovery are we talking about here?10:00
rogfwereade_: i've got an idea10:01
fwereade_rog, what hook to (maybe, depending on resolved mode) run again10:01
* fwereade_ listens10:01
rogfwereade_: when we get an error, we treat that as a break in the pipeline, so we tear down the workflow goroutines, do whatever we need to get the show back on the road, then start it up again10:02
fwereade_rog, yeah, I think so10:03
rogfwereade_: that means that we've still got a nice clean one way flow, apart from termination, which is an abnormal condition, and therefore requires exceptional handling10:03
fwereade_rog, from a relation unit's POV, recovering from an error state should be exactly the same as dealing with process restart10:04
rogfwereade_: interesting10:04
fwereade_rog, so we never watch anything excet resolved when we're in an error state10:05
rogfwereade_: and then restart everything10:05
fwereade_rog, (these are tentative statements of opinion rather than fact)10:05
fwereade_rog, but it feels like it might be a good way to go10:05
rogfwereade_: i'd taken that for granted :-)10:05
fwereade_rog, I suspect if anything the critical thing is to break up the lifecycle so that bits-that-affect-workflow are separate from bits-affected-by-workflow10:07
davecheneyevening lads10:12
rogfwereade_: definitely10:14
rogdavecheney: hiya10:14
fwereade_davecheney, heyhey10:14
davecheneymiss anything good ?10:15
rogdavecheney: a wee discussion about the unit agent stuff10:15
davecheneyrog: this channel is logged somewhere right ?10:15
davecheneyanyone know the url ?10:15
rogdavecheney: one mo10:15
rogdavecheney: i just google for "ubuntu irc logs"10:16
rogdavecheney: http://irclogs.ubuntu.com/2012/07/17/%23juju-dev.txt10:16
davecheneylmgtfy10:16
davecheneynoice10:16
rogdavecheney: sadly it's updated very slowly10:16
davecheneyprobably monitoring a tonne of channels10:17
TheMuedavecheney: Heya10:17
rogdavecheney: in this case it's only about 12 minutes out of date10:17
TheMuedavecheney: I added http://irclogs.ubuntu.com/ to my bookmarks, so the access is easy.10:18
rogfwereade_: i'm sure this won't convince you, but for the record: https://codereview.appspot.com/6399052/diff/2001/state/watcher.go10:18
rogTheMue: yeah, but you still have to manually navigate to the right date and the right channel, right? :-)10:19
fwereade_rog, err and stop and stopped kinda feel like an ad-hoc tomb to me ;)10:20
TheMuerog: To the one I want to have, yes. Multiple access paths (e.g. by channel) and search would be nice.10:20
TheMuerog, fwereade_: Btw, thx for your reviews.10:21
fwereade_TheMue, yw, hope they're useful10:21
rogfwereade_: no more than: var err error; ... if err != nil { return } is an ad hoc tomb, IMHO...10:21
TheMuerog, fwereade_ : I'll move all watchers into the loop and maybe used a shared environ from the provisioner.10:22
rogfwereade_: what do you mean by "move all watchers into the loop"?10:22
fwereade_rog, aren't they pretty much giving us all that tomb does, which just returning doesn't?10:22
fwereade_TheMue, and I'm not sure the provisioner should actually be responsible for the environ anyway10:23
fwereade_TheMue, feels to me like the agent process should get a State and an Environ, and be looking after the changes to the Environ, and provisioner/firewaller should both just be dumbly using the environ10:23
rogfwereade_: +1, i think10:24
fwereade_davecheney, thoughts? ^^10:24
rogfwereade_: because we can just call SetConfig (or whatever it's called) on the environ10:25
TheMuefwereade_: Sounds fine too10:25
fwereade_rog, that's the idea10:25
fwereade_rog, I *think* it should be safe10:25
fwereade_rog, AIUI it's meant to be ;)10:25
rogfwereade_: i think it was designed tobe10:25
davecheneyfwereade_: yes, the provisioniner needs to be responsible for the environ10:25
davecheneythat was an explitic request from gustabo10:25
davecheneygustavo10:25
fwereade_davecheney, hmm, I'm saying it shouldn't10:25
davecheneyfwereade_: i think you have a good case10:26
fwereade_davecheney, or rather, I'm saying the agent should be but the worker should not10:26
davecheneybut that was an extended and painful review process10:26
davecheneyI would be loathe to change it without guidence10:26
fwereade_davecheney, heh :)10:26
fwereade_davecheney, I know the feeling :)10:26
davecheneyfwereade_: the reason the firewaller, in my proposals, maintains an independent connection to the environ and the state10:28
davecheneywas based on a conversation at UDS-q10:28
davecheneywhere it was mentioned that the firewaller service may not live with the provisioner10:28
davecheneyit only cohabits currently for convenient access to the secrets10:28
fwereade_davecheney, hmm, interesting10:28
fwereade_davecheney, not sure that's a reason for them to cohabit, is it?10:29
fwereade_davecheney, it's not like *anything* is restricted from looking at the secrets ;)10:29
davecheneyfwereade_: indeed10:30
davecheneyso with that in mind the provisioner, machiner et al, all operate independantly10:31
fwereade_davecheney, STM like the agent processes should be responsible for setting up the bits their workers need, and then firing off the workers, but I guess we could make it work either way without duplication10:33
fwereade_davecheney, it's the duplication in firewaller/provisioner that currently bugs me10:33
fwereade_davecheney, and the fact that watching the environs complicates each of their main loops10:34
TheMuefwereade_: Yep, see your point, sounds reasonable to me.10:34
rogfwereade_: i think i agree. i don't see why the environment setting needs to be in the provisioner's main loop.10:35
TheMue*: So in general, whoever uses a worker is responsible to pass an environment.10:35
fwereade_rog, TheMue: cool, cheers10:35
fwereade_TheMue, yeah, I think so, and is also responsible for keeping the Environ up to date, because any worker ought to be able to just deal with it10:36
davecheneyfwereade_: i'm surprised your more worried about the environ than the state10:36
rogfwereade_: it means, i think, that when the PA starts, it would do three things: start the environment settings agent and wait for an environ; start the provisioner and the firewaller with the environment thus obtained10:36
davecheneyas in, duplicates in process10:36
fwereade_davecheney, I'm worried about both :)10:36
fwereade_davecheney, I have comments on separate reviews whining about each of those things ;)10:36
davecheneyfwereade_: good, just checking10:36
fwereade_rog, yeah, think so10:37
=== Aram2 is now known as Aram
davecheneyso, who has the hangout invite ?10:56
niemeyermramm: I sent an invite a few minutes ago already11:03
mrammhttps://plus.google.com/hangouts/_/79b690ed61262ab9e0bbaa673cd718b15127573311:03
mrammahh11:03
mrammsorry11:03
Aramyou did?11:03
mrammI'll join that one11:03
niemeyerAram: Yep11:04
Aramdamn flash11:04
Arammeh11:07
Aramwtf11:07
AramA connection error occurred while loading this page. Please try refreshing the page.11:07
niemeyerhttps://bugs.launchpad.net/juju-core/+bug/102295411:14
niemeyer"un-in-progressing"11:18
niemeyerfwereade's new verb11:18
Arammilestones++11:45
fwereade_davecheney, were my reviews clearish?11:59
* fwereade_ tries to remember what they were, all I can currently remember is that there were things I wasn't sure I'd said effectively12:00
fwereade_davecheney, niemeyer: ah, yeah, that was it12:01
fwereade_davecheney, niemeyer: doing authorized-keys "properly", ie responding to env-sets, feels to me like it's the MA's job12:01
fwereade_davecheney, niemeyer: sane?12:01
niemeyerfwereade_: Does sound sane to me12:02
fwereade_davecheney, niemeyer: this is something we don't handle at all in the python but feels like it's necessary for env-set authorized-keys=BLAH to be useful12:02
davecheneyfwereade_: sorry, i'm unclear what you are talking about ?12:03
fwereade_davecheney, https://codereview.appspot.com/6405046/diff/2001/environs/ec2/cloudinit.go#oldcode11612:03
niemeyerfwereade_: Agreed12:03
niemeyerfwereade_: But it sounds like a bug we can file and postpone12:03
davecheneyfwereade_: right, i'm with you now12:03
niemeyerfwereade_: After all, it's not handled at all in Python :-)12:04
davecheneyas per my reply, it's important, but those comments should go somewhere else12:04
fwereade_niemeyer, yeah, I'm not suggesting it's *high* priority, but I think it's something without which we can't reasonably call env-set "done"12:04
niemeyerfwereade_: +112:04
fwereade_davecheney, sorry, I saw no replies12:04
davecheneyfwereade_: with your permission i'll copy that text into a bug for the backlog12:04
davecheneyis that ok ?12:04
* fwereade_ goes and pokes at his email client12:04
fwereade_davecheney, that's great, thank you12:05
fwereade_davecheney, it's a point I was raising as I thought of it rather than a request for immediate fixing12:05
davecheneyfwereade_: i have a very broad view of how bug trackers work12:05
davecheneyto me, they are simply places where you put things you don't want to loose12:05
davecheneythe rest is just semantics12:05
fwereade_davecheney, yeah, makes sense12:06
davecheneyOMG, i just found a bug in LP; if you alter the field on the show bugs screen12:07
davecheneythe report bug link stops working12:07
davecheneyfwereade_: done, bug raised12:08
davecheneyfwereade_: if I can also draw your attention to http://codereview.appspot.com/6408047/#msg5, which may have been lost in the either12:09
davecheneyand http://codereview.appspot.com/6408047/#msg412:09
davecheneyerr, ether12:09
fwereade_davecheney, was just coming to that12:09
fwereade_davecheney, I'm still -112:10
fwereade_davecheney, I really think that a half-initialized machine is  a Bad Thing12:10
niemeyerfwereade_: I've replied to all your comments on the config branch.. can you please have a look and see what is sensible and what is not?12:10
niemeyerfwereade_: I'll step out for breakfast and will come back to act on it12:10
fwereade_davecheney, if initzk falls over half way through but everybody else keeps going we will get confused12:11
niemeyerfwereade_: Will ping you before starting in case you'd like to discuss12:11
davecheneyfwereade_: yup, cloud init is not checking the return status of any of the commands12:11
fwereade_davecheney, if we have a machine but the instance-id is not set, the PA will go ahead and prvision it12:11
davecheneywitness the hilarity of the ec2 apt mirror snafu a few weeks ago12:11
fwereade_davecheney, so I really think it should go inside Initialize12:11
fwereade_davecheney, I *also*think Initialize should not *require* it12:12
fwereade_davecheney, otherwise all our tests will break, because there will be 1 machine where there were once 012:12
davecheneyfwereade_: I have to call it a night, i'm alread in the doghouse12:12
fwereade_davecheney, sorry :)12:12
davecheneycould you please raise this as a bug12:12
fwereade_I still think it's a blocker on the CL12:12
davecheneyyup, assign it to me12:13
fwereade_but I'll raise the other as a bug12:13
davecheneyi'll do it tomorrow12:13
fwereade_niemeyer, looking now12:13
fwereade_davecheney, cheers, awesome12:13
davecheneynight all12:13
niemeyerfwereade_: if path != "" || keys == "" {12:52
niemeyer<fwereade_> What happens when both path and keys are empty?12:52
niemeyerfwereade_: So, tell me.. what happens? :-)12:53
roglunch12:55
fwereade_niemeyer, doh :(12:56
fwereade_niemeyer, but shouldn't there be a "whoa, you specified both" error path?12:56
niemeyerfwereade_: I'm happy to define that when both are provided, the path takes over.. or to concatenate them12:57
niemeyerfwereade_: This is addressing the point you brought up yesterday related to set-env12:57
fwereade_niemeyer, I'm easy tbh12:57
fwereade_niemeyer, I think I have overthought that particular functionality :/12:58
fwereade_niemeyer, my perspective is as likely to be warped as it is to be correct12:58
niemeyerfwereade_: I understand.. IMO it's fine to have a well defined behavior that makes the implementation simpler12:59
fwereade_niemeyer, SVGTM13:00
niemeyerfwereade_: I suspect concatenating them is the least-surprising behavior13:00
fwereade_niemeyer, agreed in the small, feels maybe tricky to do the Right Thing on env-set13:01
niemeyerfwereade_: Ah, yes13:01
niemeyerfwereade_: The current behavior handles that properly I think13:01
TheMuefwereade_: Do you have a quick link into your relation-unit branches? I would like to see how you do it with tombs in the sub-goroutines.13:48
fwereade_TheMue, https://codereview.appspot.com/6405044/13:49
TheMuefwereade_: Thx13:49
fwereade_TheMue, given that you're using little types for the sub-goroutines, the tombs would probably be better placed on the types13:49
fwereade_TheMue, but I *think* the genral idea is sound13:50
TheMuefwereade_: I'll place them in those types, yes.13:50
TheMuefwereade_: Your deferred finish() is neat.13:52
fwereade_TheMue, cheers :13:53
fwereade_)13:53
niemeyerrog: LGTM on https://codereview.appspot.com/6344113/14:07
rogniemeyer: thanks a lot14:09
rogniemeyer: we could return ports sorted, but i don't really see the point, as only the testing code is concerned about that, i think.14:09
niemeyerrog: Thank you!14:09
niemeyerrog: I'd be glad to see ports sorted whenever looking at that information, personally14:09
rogniemeyer: if we do that, we'll want to implement something like state.PortSlice, otherwise every provider will need to do its own sort.Interface implementation for ports.14:12
rogniemeyer: or state.SortPorts14:12
rogi suppose14:12
rogniemeyer: i'd be ok doing that if you'd like14:12
rogniemeyer: state.SortPorts, that is14:13
niemeyerrog: +114:13
niemeyerrog: We already have the implementation anyway14:13
rogniemeyer: sure. will do.14:13
niemeyerrog: Thanks!14:13
niemeyerfwereade_: https://codereview.appspot.com/6354045/ is still deleting .lbox14:14
fwereade_niemeyer, gaah, I guess I forgot that :/14:16
niemeyerfwereade_: Would you have a moment to talk about the logic in there?14:19
fwereade_niemeyer, ofc14:19
niemeyerfwereade_: Looking at line 509 on presence.go14:19
fwereade_niemeyer, here?14:19
fwereade_niemeyer, yep14:19
niemeyerfwereade_: It stikes me as odd that we're firing another watch before we even cared to observe what the previous watch said14:19
niemeyerfwereade_: I may just be missing the underlying logic, though14:20
fwereade_niemeyer, the underlying idea is that AliveW returns a bool and a chan that sends the other bool14:20
fwereade_niemeyer, if the bool we get out of the new AliveW is not the same as what was fired by the original watch, state has changes again since the original watch fired14:21
niemeyerfwereade_: Well, if you look a third time, it may have changed again.. we may do that ad infinitum14:21
fwereade_niemeyer, but I looked and got a watch which should be guaranteed to fire next time it changes14:22
fwereade_niemeyer, the point is that if it differs, the latest know state is no different to the last one we notified of14:22
fwereade_niemeyer, and therefore we should not send a spurious "hey dude, still alive (or dead)" event, and should just start again with the new watch14:23
niemeyerfwereade_: Sorry, I still don't get it14:23
niemeyerfwereade_: Why are we not doing that a third time, just in case?14:23
niemeyerfwereade_: (it's not a tricky question.. it would help me understand)14:23
fwereade_niemeyer, because the watch is guaranteed to fire when the value we get out changes14:24
niemeyerfwereade_: Right.. that's true for the first watch too, right/14:24
niemeyer?14:24
fwereade_niemeyer, the first watch has fired14:24
fwereade_niemeyer, we know something has changed14:24
niemeyerfwereade_: Exactly14:24
fwereade_niemeyer, but it's just like using ZK normally14:24
niemeyerfwereade_: We were just told something has changed.. why are we asking again?14:25
fwereade_niemeyer, because it might have changed any number of times again between our being notified that something changed and our starting a new watch14:25
fwereade_niemeyer, we need a new watch, right?14:25
niemeyerfwereade_: Yes, and it may have changed any number of times again after the second watch14:25
fwereade_niemeyer, the point of the watch is that if it has we'll immediatey get notified next time through the loop14:26
niemeyerfwereade_: Ah, I see14:26
niemeyerfwereade_: So your point is that we need this watch anyway, so we may as well do it ahead of time and verify the second result already14:26
niemeyerfwereade_: Right?14:27
fwereade_niemeyer, yes, we need the watch; and the important current value is the one that comes out when we get the watch; and *if* that value differs from the one we got out of the first watch, the state has changed an odd number of times in the interim14:28
fwereade_niemeyer, and therefore the latest state is the same as the last one we notified the client of14:28
fwereade_niemeyer, and therefore we should not notify them14:28
niemeyerfwereade_: Understood, makes sense, thanks for explaining14:28
fwereade_niemeyer, a pleasure14:28
fwereade_niemeyer, I imagine it could use an extra comment or two ;)14:28
niemeyerfwereade_: Yeah, I'll suggest that14:29
fwereade_niemeyer, I'm pretty sure I readded the lbox as it should have been but I'm getting "error: Failed to send patch set to codereview: can't upload base of .lbox: ERROR: Checksum mismatch." -- once I have your review I think I will just clone it onto a fresh branch and link it to the original for contnuity of discussion14:43
niemeyerfwereade_: I'd prefer to fix it instead, if we manage to14:44
niemeyerfwereade_: Every time we recreate a CL we lose all the context for the previous comments14:44
niemeyerfwereade_: and I can't review the delta anymore14:44
niemeyerfwereade_: There's probably a bug in lbox14:45
niemeyerfwereade_: If needed, let's just remove and readd it14:45
fwereade_niemeyer, that's what I thought I'd done14:47
fwereade_niemeyer, codereview now seems to think that there are two .lbox files, one of which was deleted and another of which was added14:47
niemeyerfwereade_: Can you uncommit the re-add?14:49
niemeyerfwereade_: Or is it not tip anymore?14:49
fwereade_niemeyer, done14:50
niemeyerfwereade_: Ok, you'll probably have to force-push now14:50
niemeyerfwereade_: Since the revision is already up14:50
niemeyerfwereade_: Try "bzr push" just to confirm14:50
niemeyerfwereade_: If that doesn't work, just make sure the branch URL is right (and not trunk!) and then do bzr push --overwrite14:51
fwereade_niemeyer, ok, yep, needed to --overwrite14:53
niemeyerfwereade_: You have a review15:12
niemeyerfwereade_: It's really just cosmetic stuff, except for the fact the pinger has changed since the last review, and it's not clear why15:12
niemeyerLunch time.. biab15:14
rogniemeyer: quick once-over before i submit? https://codereview.appspot.com/634411315:40
niemeyerrog: Awesome, thank you!15:41
rogniemeyer: i take it that LGTY15:41
niemeyerrog: Definitely, thanks15:42
TheMuerog, fwereade_ : the firewallers first part is at https://codereview.appspot.com/6374069 in again, will continue this way with the second part.15:43
TheMuerog, fwereade_ : looks simpler now, and uses tombs for each machine.15:43
rogTheMue: what happens to a machine units watcher when the machine it's watching is removed?15:48
rogTheMue: i'm concerned that we might receive an error from that watcher before we see the machine removal, and thus cause the whole thing to fall over15:50
TheMuerog: that's in the next branch. I think I'll add there a tomb and a finish() for all tombs too. but I'm not yet exactly shure.15:54
rogTheMue: i'm concerned that something that i'd envisaged as about 6 lines of code has expanded to 50 lines. i don't think it should have to be this complex, but i don't quite see how to avoid it with our current conventions.15:55
roglc15:55
TheMuerog: yeah, it's getting bigger again. *sigh*15:58
TheMuerog: I would like to only have one range loop for machine and co too.15:58
TheMuerog: but we also have to react on the firewall, on depending goroutines etc.15:59
rogTheMue: i don't see a problem with doing that15:59
TheMuerog: I'm listening16:00
TheMuerog: btw, the code is now smaller than the one you lgtm'ed yesterday16:01
TheMuerog: only the tomb is new, so we really now when the machine instance has ended working16:02
rogTheMue: yesterday's code was 144 lines. today's is 165. how is that smaller?16:03
rogTheMue: we already know when the machine instance has been removed - we can just flag it as dead and ignore it if it sends any events after we've killed it.16:05
rogTheMue: that would only take about 4 lines of code.16:05
rogTheMue: but i know fwereade_ has opinions about this too :-)16:05
TheMuerog: the additional lines are two helpers, a constructor, a forgotten statement and comments for types and methods.,16:06
fwereade_rog, I'm still looking :)16:07
TheMuerog: could you show me your 4 lines? last time they contained much pseudo code and error testing had to be added16:08
rogTheMue: http://paste.ubuntu.com/1096832/16:09
rogTheMue: no error checking is needed in that loop, i think.16:09
rogTheMue: (it can be done by the central loop when it receives the nil change notification16:10
rog)16:10
fwereade_rog, I think I'm happier with more lines of code but clearer boundaries16:10
rogfwereade_: personally, i find the logic with the myriad of tombs and watchers quite hard to follow16:11
rogfwereade_: i like to keep concurrent code as minimal as possible16:12
TheMuerog: and when the firewaller dies or stops, or a machineUnitsWatcher?16:12
rogfwereade_: as it's quite easy to get wrong and hard to test fully16:12
rogTheMue: if the firewaller stops, it kills the machineUnitsWatcher which gives EOF on its changes channel, which propagates through the machine loop and back to the firewaller which exits when all its sub-watchers have terminated.16:13
niemeyerTheMue: Sent comments16:14
rogTheMue: if a machineUnitsWatcher dies, it gives EOF as above.16:14
TheMueniemeyer: thx16:14
niemeyerrog: This looks sensible.. the only disadvantage is that any errors on the watcher are lost16:15
niemeyerrog: It could be fixed by adding an err field to the Change though16:15
niemeyerrog: Or someone else has to Stop() the watcher16:15
rogniemeyer: you're talking about the above branch, presumable?16:16
rogpresumably16:16
niemeyerrog: I'm talking about the above paste16:16
rogniemeyer: ah, thanks16:17
niemeyerrog: I think the logic in there works correctly, though16:17
niemeyerrog: It'd probably be wise to move forward with it and let that kind of simplification for a follow up16:17
rogniemeyer: the errors can later be retrieved because the central loop can interrogate the watcher itself.16:17
rogniemeyer: or, as you say, add an err to the change type16:17
niemeyerrog: It can, it has to interrogate all the watchers, and Stop them16:17
niemeyerrog: Also must be careful not to close machinesChanges16:18
rogniemeyer: definitely.16:18
fwereade_TheMue, I'm starting to wonder what we get from the machine type that couldn't be got from (say) a (*Firewaller)machineLoop(*Machine, *tomb.Tomb)16:18
rogniemeyer: but i don't think that's hard to ensure.16:18
niemeyerIt's not hard.. it's just another approach that TheMue will have to get right16:18
niemeyerHe seems to have gotten that one approach almost correctly16:19
niemeyerSo I'm tempted to suggest moving forward with it, and simplifying in a follow up16:19
rogniemeyer: seems ok.16:19
fwereade_TheMue, but then Firewaller has machines and machines has ports and neither of those seem to be used a great deal, and I think my issue is that I haven't closely followed the original discussion so I'm not quite sure of the plan16:19
TheMuefwereade_: and then go that loop per machine? would be possible, but I think more complex in the end (there will be more types later). take a look at the Py code, all in one type with callbacks.16:20
rogfwereade_: did you see my original sketch?16:20
fwereade_rog, not closely enough that it stuck in the mind, I'm afraid16:21
rogfwereade_: http://paste.ubuntu.com/1096860/16:21
TheMueniemeyer: the small types at the end of the file are left from rogs paste. they grow with each new branch. the follow-up (already in) has a working unit (will rename it to unitTracker).16:22
fwereade_rog, ok, I do remember that; not fully seeing the path from here to something-like-there at the moment16:24
niemeyerTheMue: Okay16:24
fwereade_rog, but, honestly, I has an EOD sleepy on and I think I should take a rest16:24
rogfwereade_: TheMue's branch is the beginning of that16:24
niemeyerTheMue: I suggest keeping the machineTracker with the current mechanism for now16:25
rogfwereade_: it implements the "start machine units watcher for m; add it to machines" piece16:25
niemeyerTheMue: To avoid delaying it much further with new logic16:25
fwereade_rog, there's something I can't quite put my finger on about having ports on both machine and unit16:26
rogfwereade_: i played around with a few configurations, but that one seemed to work best16:27
TheMueniemeyer: sounds reasonable16:27
rogfwereade_: because they're distinct things16:28
rogfwereade_: the machine ports are a union of the unit ports on that machine16:28
fwereade_rog, fair enough16:30
rogfwereade_: and we need to keep track of the machine ports so that we know which ports to close when a unit's ports change.16:30
fwereade_rog, I guess we don't have anything stopping two units on the same machine from attempting to mess with the same ports?16:31
rogfwereade_: i discussed this with niemeyer16:31
TheMueSo, have to step out, Carmen calls for dinner. But will return later.16:31
rogfwereade_: we thought it was reasonable if each port was considered "owned" by a given unit16:32
rogfwereade_: then the open-port command should give an error if the port is owned by another unit16:32
rogfwereade_: note that machine.ports in my sketch is a map from state.Port to *unit16:33
fwereade_rog, yeah, that sounds sensible... and, ah-ha; yeah, I'm sleepy :(16:33
rogfwereade_: ok, happy snoozes!16:33
fwereade_rog, cheers16:33
fwereade_might be on a bit later, not sure16:34
niemeyerfwereade_: Have a good one16:35
niemeyerDoc appointment.. back in ~40mins16:50
* rog is off for the night. have fun, see y'all tomorrow17:17
niemeyerBreak time. biab19:48
mrammniemeyer: have fun!19:50
=== Aram2 is now known as Aram
fwereade__niemeyer, ping20:36
fwereade__niemeyer, so, I did a bit of archaeology and figured out what happened with those Pinger changes: long story short, they were part of the testing change that I thought I'd proposed among the other changes (and were therefore ok, because they were *changed* code but not *moved* and changed code), and it turns out they're not anywhere in the changesets in https://codereview.appspot.com/6348053/ ...so, I thought I'd proposed them, but I actually hadn't, a20:43
fwereade__nd committed them anyway, which is a pretty monumental screwup :(20:43
fwereade__niemeyer, I am very much aware that  this is Not Ok, and I guess it must have happened via a bedtime `lbox propose` whose results I didn't check20:44
fwereade__niemeyer, but that's an explanation not an excuse20:45
fwereade__niemeyer, and I guess we're just lucky that, actually, it *does* seem to be more stable now and by sheer luck (or maybe judgment, who knows) I didn't screw it up really badly20:46
fwereade__niemeyer, I think20:46
fwereade__niemeyer, um, anyway, flagellation over, but if you want to take over for a bit I wouldn't feel it was unjustified20:46
fwereade__niemeyer, I guess it's good that presence.ChildrenWatcher has such a venerable history, nobody would have noticed if I'd reproposed from scratch as I was tempted to do :/20:48
davecheneyfwereade_: thank you for raising that issue, i'll take a swipe at it today21:25
fwereade_davecheney, cool, cheers21:26
fwereade_davecheney, I suspect it will be best to start by changing Initialize and leaving bootstrap alone for a bit, given that we expect some churn in environ config21:27
davecheneyi'm concerned that this change will be delayed by the other churn in that area21:27
davecheneywhich was my motivation for proposing it is addressed later21:27
fwereade_davecheney, your instincts are probably better than mine: I was only thinking about that code because we need default-series for deploy, and I think you're more deeply involved than I am21:29
davecheneyfwereade_: i wouldn't say deeply :) i'm motivated primarily by the goal of having deploying somethign21:29
fwereade_davecheney, but because I wanted the default-series to be a quick fix before going back to the UA, I have been somewhat scattered about it21:30
davecheneyerr, having something to deploy21:30
fwereade_davecheney, oh, me too ;)21:30
davecheneyanyway, i'll take a look now21:30
fwereade_davecheney, but implementing deploy ended up feeling like more trouble than I had hoped :/21:30
davecheneybtw, juju status, the output, is that actul yaml, or just formatted to look similar ?21:30
fwereade_davecheney, should be yaml21:31
fwereade_davecheney, takes --format, defaults to yaml, also allows json21:32
davecheneyfwereade_: right, that is what i thought, but I was confused by the doc strings that appeared to dictate a format21:32
davecheneyor a layout, to be more exact21:32
fwereade_davecheney, hmm, also dot, svg, png, it seems21:33
davecheneymap's gonna map21:33
fwereade_haha :)21:34
fwereade_davecheney, hmm, I just wanted to create a Hooker type :/21:35
fwereade_davecheney, I can probably think of a better name21:36
davecheneyfirst bug logged against launchpad, do I get a badge or achivement award ?21:40
davecheneyfwereade_: forgive my thickness21:49
davecheneybut reading the bug you raised21:49
davecheneyi don't see how that is (directly) related to the problem at hand21:49
davecheneywhich I though was the population of machine/0 into the state after we had returned from initalise21:50
niemeyerfwereade_: So what's the history there?22:11
niemeyerfwereade_: Sorry, that's not the right question22:11
niemeyerfwereade_: What do we want? :-)22:11
davecheneyniemeyer: two secs22:15
davecheneyniemeyer: http://codereview.appspot.com/6408047/#msg3, which became, https://bugs.launchpad.net/juju-core/+bug/102565622:16
niemeyerdavecheney: Sorry, I'm a bit out of context22:22
niemeyerdavecheney: I was asking about the stuff fwereade_ said earlier22:22
davecheneyok, ignore me then :)22:22
niemeyerdavecheney: But that's interesting too :-)22:22
niemeyerdavecheney: What is potentially racy, more precisely?22:25
niemeyerdavecheney: When you have a few minutes, I'd like to understand better what's going on thre22:27
niemeyerthere22:27
davecheneyniemeyer: racy because we are setting /initalised, then doing some more fudging of the state before the command exits22:28
niemeyerdavecheney: That part is totally fine22:28
niemeyerdavecheney: that node handles the lack of atomicity in zk22:29
davecheneyyes, i think so too, because the only consumers of those machine entries, are started by cloud init after that process has exited22:29
niemeyerdavecheney: It prevents code from trying to e.g. create other nodes before the fundamental parents even exist22:29
niemeyerdavecheney: That's irrelevant, IMO22:29
davecheneywhat is irrelevant22:29
niemeyerdavecheney: AddMachine in State should work22:29
niemeyerdavecheney: That's a standalone assumption22:30
niemeyerdavecheney: Do I misunderstand?22:30
davecheneyno, we are in agreement22:30
davecheneyso, can I drop that comment block ?22:30
niemeyerdavecheney: Yeah, the comment seems to assume /initialized is something it's not22:31
davecheneyniemeyer: will do22:31
niemeyerdavecheney: Now, for the second part of the question: why are we adding a machine by hand like this rather than doing it the way we did in Python?22:32
* davecheney checks the python22:33
niemeyerdavecheney: I can describe, sorry.. I thought it was a conscious decision22:34
niemeyerdavecheney: This isn't hard-coded22:34
niemeyer    sub_parser.add_argument(22:35
niemeyer        "--instance-id", required=True,22:35
niemeyer        help="Provider instance id for the bootstrap node")22:35
niemeyerdavecheney: Hmmm.. nevermind22:35
niemeyerdavecheney: Clearly my memories are failing me22:36
davecheneyniemeyer: yup, but the current state.Initialize provides no way to pass that value in22:36
davecheneymaybe it should22:36
niemeyerdavecheney: Don't worry.. I'm on crack22:36
niemeyerdavecheney: i was incorrectly complaining about something else22:36
davecheneywhich is the bug I thought that william was going to raise for me last night22:36
davecheneyniemeyer: np22:36
niemeyerdavecheney: I'm surprised to see machine/0 hardcoded around initialize, but it *is* hardcoded in Python too22:37
davecheneyniemeyer: :)22:37
niemeyerdavecheney: Which clearly means we shouldn't worry about that now22:37
davecheneyniemeyer: which was my point22:37
niemeyerdavecheney: I can see why William was worried too, because in Python /initialize was guarding the machine creation22:37
davecheneyit could be better, but at the moment I think a todo in the code and bug in the issue tracker for the backlog should suffice22:37
niemeyerdavecheney: But that's not it's reason of existence really.. I do recall that part :-)22:38
davecheneyniemeyer: hence my comemnt about a possible race22:38
niemeyerdavecheney: The problem was that it was extremely boring to guard every single action against the lack of existence of the critical parents22:38
davecheneybut given the consumer of that piece of information in the state is not started till jujud initzk returns22:38
niemeyerLike /charms, /services, etc22:38
davecheneyyeah, that sounds dull22:38
niemeyerdavecheney: So /initialized came into play22:38
niemeyerdavecheney: But the machine creation is clearly outside of that need22:39
niemeyerdavecheney: So +1 on dropping the comment, and +1 on getting this in22:39
niemeyerdavecheney: The TODO should also not mention Initialize.. seems fine where it is22:39
niemeyerdavecheney: Some day we do want that to move into Bootstrap, though.. in the not upcoming future.. :-)22:40
davecheneyi'll raise a bug for the backlog22:40
niemeyerdavecheney: Thanks!22:42
davecheneyniemeyer: thanks for the review23:05
niemeyerdavecheney: np23:05
niemeyerI'm also addding a comment about the stuff we just talked about23:06

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!