/srv/irclogs.ubuntu.com/2012/09/28/#juju-dev.txt

davecheney	niemeyer: scp & scp reproposed, some nice reuse in there	00:14
niemeyer	davecheney: Sweet, will have a quick look	00:49
niemeyer	davecheney: http://play.golang.org/	00:52
niemeyer	Erm	00:52
niemeyer	davecheney: http://play.golang.org/p/cPscJ6RuoX	00:52
davecheney	niemeyer: yes, i had to check that as well	00:53
niemeyer	davecheney: Sorry, I don't get it?	00:53
niemeyer	davecheney: "This will panic if len(c.Args) == 1. I've redone the logic to be less crackful."	00:53
davecheney	http://play.golang.org/p/okbxoy-UI2	00:53
niemeyer	davecheney: c.Target, c.Args = c.Args[0], c.Args[1:]	00:53
davecheney	niemeyer: http://play.golang.org/p/TObeRIa8wL	00:53
davecheney	yes, you are right	00:54
davecheney	but I don't understand why http://play.golang.org/p/Fd_jNi6mVe	00:54
niemeyer	davecheney: Because that's out of bounds	00:57
niemeyer	davecheney: s[1:] works when len(s) == 1 for the same reason that s[:1] works.	00:58
davecheney	i don't think that is correct	00:59
davecheney	but this isn't the right place to argue about it	01:00
davecheney	i'll fix my code	01:00
davecheney	ok, someone explained it to me in the channel	01:02
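The slicing rule under discussion can be checked with a small sketch. The helper names splitTarget and indexPanics are made up for illustration and are not from the juju code; the point is only that a slice expression may use a bound equal to len(s), while an index expression may not.

```go
package main

import "fmt"

// splitTarget mirrors the one-liner quoted above: the first argument
// becomes the target and the remainder stays as the argument list.
// It still panics on an empty slice, because args[0] is an index expression.
func splitTarget(args []string) (target string, rest []string) {
	return args[0], args[1:]
}

// indexPanics reports whether evaluating s[i] panics at run time.
func indexPanics(s []string, i int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = s[i]
	return false
}

func main() {
	// With exactly one element, args[1:] is a valid empty slice.
	target, rest := splitTarget([]string{"host"})
	fmt.Println(target, len(rest))

	// Indexing at len(s) is out of bounds, unlike slicing at len(s).
	fmt.Println(indexPanics([]string{"host"}, 1))
}
```

So c.Args[1:] is safe when len(c.Args) == 1; only len(c.Args) == 0 needs a guard.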
niemeyer	davecheney: Stepping out for the night.. have a good day!	03:27
davecheney	enjoy	03:54
rogpeppe	davecheney, fwereade_: mornin'	06:05
davecheney	rogpeppe: howdy	06:05
fwereade_	davecheney, rogpeppe: heyhey	06:06
rogpeppe	fwereade_: it'd be great if you could have a glance at the uniter upgrade branch, if you have a moment sometime this morning: https://codereview.appspot.com/6561063/	06:06
fwereade_	rogpeppe, sure	06:07
fwereade_	rogpeppe, (still thinking)	06:14
rogpeppe	fwereade_: np	06:15
fwereade_	rogpeppe, ISTM that it would be simpler (not to mention less conflicty, and mildly kinder to the network) to get the Unit only when you actually need to run an upgrade	06:18
fwereade_	rogpeppe, doing that removes the need to change runOnce and Uniter	06:19
fwereade_	rogpeppe, is there a deeper motivation in play than "I need a unit"?	06:19
* rogpeppe has a look to remind himself		06:20
rogpeppe	fwereade_: we need to unit immediately	06:20
rogpeppe	fwereade_: because we use it to announce the current agent version	06:20
rogpeppe	fwereade_: but that's not the reason we change runOnce and Uniter	06:21
rogpeppe	fwereade_: the reason for that is that we want to upgrader to be as independent as possible of uniter bugs	06:21
rogpeppe	fwereade_: so we do the absolute minimum necessary before starting the upgrader	06:22
rogpeppe	fwereade_: hence it's important that the uniter factory method doesn't return an error - early errors should not take down the upgrader.	06:22
rogpeppe	s/need to unit/need the unit/	06:23
rogpeppe	fwereade_: does that make sense?	06:23
fwereade_	rogpeppe, hmm, apart from the ones which should, but I see it's tricky	06:23
rogpeppe	fwereade_: which errors should take down the upgrader?	06:24
rogpeppe	fwereade: last thing i was was [07:23:38] <fwereade_> rogpeppe, hmm, apart from the ones which should, but I see it's tricky	06:25
rogpeppe	s/was/saw/	06:26
fwereade	rogpeppe, cheers	06:26
fwereade	rogpeppe, I'm not quite convinced that deferring the error helps you much though, 1 mo	06:26
fwereade	rogpeppe, surely an early error return from the Uniter will hose the upgrader just as badly, because runTasks will terminate it	06:27
rogpeppe	fwereade: no, because we've got some special case logic in the upgrader for just such an eventuality	06:28
rogpeppe	fwereade: if it's killed early on, it waits until it has at least had a squizz at the proposed version	06:28
rogpeppe	fwereade: and if that's changed, it doesn't exit until it has actually downloaded the upgrade	06:28
rogpeppe	fwereade: (well, with some timeout too)	06:29
rogpeppe	fwereade: it's given 5 minutes	06:29
* fwereade continues to think		06:30
fwereade	rogpeppe, something about all that does make me a little queasy... I feel that if I ask an upgrader to stop, it should jolly well stop, by criminy	06:37
rogpeppe	fwereade: this was discussed at the time	06:37
rogpeppe	fwereade: i think the upgrader is special	06:37
rogpeppe	fwereade: because it's the only way we can escape bad s/w	06:37
fwereade	rogpeppe, I agree, but IMO that means it should have control of the tasks rather than being antisocial :)	06:38
rogpeppe	fwereade: it will stop... when it's made good and sure that someone is not trying to upgrade us	06:38
rogpeppe	fwereade: well, as you know, that's how i started and that was deemed incorrect	06:39
rogpeppe	fwereade: so this is what we're doing. and it doesn't seem bad to me.	06:39
fwereade	rogpeppe, I do feel your pain there... but I think I need to talk to niemeyer about this	06:40
rogpeppe	fwereade: ok	06:40
fwereade	rogpeppe, sorry :(	06:40
rogpeppe	fwereade: if we change this now, BTW, it knocks everything off	06:41
rogpeppe	fwereade: because everything is built in this way currently.	06:41
rogpeppe	fwereade: the upgrader is "just" another task	06:41
rogpeppe	fwereade: so, given that we're not swimming in free time, and this architecture will work for the time being, perhaps we could move forward as we are and consider a change later?	06:42
fwereade	rogpeppe, the collision with my own uniter changes is not entirely trivial	06:43
fwereade	rogpeppe, and there is a value I return from NewUniter that really should block upgrades	06:43
rogpeppe	fwereade: which is?	06:44
fwereade	ErrUnitDead	06:44
rogpeppe	fwereade: i don't think it matters too much tbh. it's an edge case - it doesn't matter much if we do upgrade in that case, it's just an extra 5s delay	06:45
TheMue	morning	06:46
rogpeppe	fwereade: FWIW i think i'd probably structure it in a similar way even if the upgrader was in control - we want the upgrader to be independent right up until the moment it decides that now is the time to upgrade.	06:47
rogpeppe	TheMue: hiya	06:47
TheMue	rogpeppe: hiya	06:48
rogpeppe	fwereade: if you like, i'll merge your most recent uniter branch and include it as a prerequisite	06:48
fwereade	rogpeppe, surely the right thing to do there is to have the upgrader in charge of runTasks? the wait & retry business could be handled internally, surely?	06:48
fwereade	rogpeppe, it's not the code conflict that bothers me so much as the lack of clarity	06:49
rogpeppe	fwereade: it's not that straightforward. you want to start the upgrader independently, regardless of what the other tasks are doing. then the wait and retry isn't logic that's in the outer loop (because it's dependent on what gets downloaded) but in a separate goroutine.	06:50
rogpeppe	fwereade: so the structure starts to look similar to what we've currently got	06:50
rogpeppe	fwereade: i think the notion of a task being able to delay shutdown until it's done what it needs to do is a reasonable one, and one of the reasons we use tombs like we do.	06:52
fwereade	rogpeppe, yeah, I see that	06:53
rogpeppe	fwereade: last seen: [07:53:09] <fwereade> rogpeppe, yeah, I see that	06:54
fwereade	rogpeppe, I think I'm convinced, although I still don't quite entirely like something about it	06:54
rogpeppe	fwereade: i understand.	06:56
fwereade	rogpeppe, that said, I think I am coming around to your perspective that something outside the uniter should handle death-watching	06:56
rogpeppe	fwereade: wasn't there a load of mode-specific stuff that you needed to do when killed?	06:57
fwereade	rogpeppe, the uniter's Dying response is not interesting from your perspective, I think (although there's little point upgrading a dying unit)	06:57
fwereade	rogpeppe, I think you probably should be watching for Dead and crash-stopping though	06:58
rogpeppe	fwereade: "me" being...?	06:58
rogpeppe	fwereade: ah, i see	06:58
rogpeppe	fwereade: you mean i should crash-stop when something returns ErrDead	06:59
fwereade	rogpeppe, almost certainly yes	06:59
rogpeppe	fwereade: maybe. although i think it's a very rare edge case tbh	06:59
fwereade	rogpeppe, but what I was trying to say is that the upgrader itself should know not to bother watching a unit once it's dying	07:00
rogpeppe	fwereade: the upgrader doesn't watch a unit	07:00
fwereade	rogpeppe, I submit that it should, so that binary upgrades work within the same framework as anything else -- I don't think they transcend entity lifetime ;)	07:01
rogpeppe	fwereade: there's always a delay between entity being killed and entity actually dying. this just makes it a little longer in some edge cases.	07:01
rogpeppe	fwereade: if you upgrade a system, there's a possibility that some dying units might linger for a few seconds more as they download a new version. i don't think that's too bad a price.	07:03
fwereade	rogpeppe, well, it means that it continues to act alive -- in some, but not all, respects -- for arbitrarily longer	07:03
rogpeppe	fwereade: does it?	07:04
fwereade	rogpeppe, it may be that this is actually correct behaviour	07:04
rogpeppe	fwereade: what is the upgrader keeping alive?	07:04
rogpeppe	fwereade: after all, the Unit is marked dead	07:04
fwereade	rogpeppe, wait, I'm talking about bad behaviour on Dying more than ugliness on Dead here	07:05
fwereade	rogpeppe, I don't like the ugliness on Dead but I could live with it	07:05
rogpeppe	fwereade: i'm not sure it changes the Dying behaviour at all, does it?	07:05
fwereade	rogpeppe, the question that currently exercises me is "should Dying entities upgrade their code?"	07:05
fwereade	rogpeppe, maybe, actually, they should	07:06
rogpeppe	fwereade: i think so	07:06
fwereade	rogpeppe, yeah, I can imagine a bug blocking clean shutdown that can be resolved by an upgrade	07:06
rogpeppe	fwereade: i think it's nice to have upgrading totally divorced from any of the other logic	07:06
fwereade	rogpeppe, the fact that they stop watching for charm upgrades on Dying then becomes potentially problematic	07:07
rogpeppe	fwereade: i think charm upgrades are a different thing - they're at a higher level.	07:07
rogpeppe	fwereade: and they really do relate directly to the unit	07:07
rogpeppe	fwereade: so it makes sense not to upgrade a charm when the unit is dying	07:08
fwereade	rogpeppe, hm, the bug-blocking-clean-shutdown I guess does not apply because the user can already use juju resolved	07:10
rogpeppe	fwereade: for charm upgrade, yeah - we're providing the always-available layer on top of a charm.	07:11
fwereade	rogpeppe, ok, all sounds reasonable, I'll finish the review :)	07:11
rogpeppe	fwereade: tyvm	07:11
rogpeppe	fwereade: there's also this fairly trivial: https://codereview.appspot.com/6564063/	07:12
rogpeppe	everything in my window system has just start breaking. i'm gonna reboot	07:28
rogpeppe	started	07:28
rog	anyone here use multiple monitors under ubuntu?	07:33
fwereade	rog, ok, I have another thought	07:33
fwereade	rog, sorry no	07:33
rog	fwereade: ok, listening	07:33
fwereade	rog, I will be comfortable with this if there is some very basic life handling at the top level	07:33
fwereade	rog, I think we need:	07:34
fwereade	rog, 1) drop the unit return from runOnce, because it's potentially panic-inducing anyway	07:34
fwereade	rog, 2) when we get the unit in runOnce, return a special error on NotFound, and that same special error if the unit exists but is Dead	07:35
fwereade	rog, 3) if we see that error in Run, return nil	07:36
fwereade	rog, 4) return the UpgradedError when we're doing an upgrade restart	07:36
fwereade	rog, 5) add a "normal exit 0" stanza to the upstart conf	07:36
fwereade	rog, and I think that's it	07:36
fwereade	rog, that I think conveys my intent as best I can	07:37
fwereade	rog, the precise details of implementation are ofc approximate	07:37
davecheney	the disparity between *ConfigNode.Map() and Charm.Config().Option is a pain in the balls	07:38
fwereade	rog, does that sound sane	07:38
fwereade	davecheney, ouch, I bet	07:38
davecheney	map[string]string vs map[string]interface{}	07:38
davecheney	mix in some json or yaml and it's a royal pain	07:39
rog	fwereade: i don't know what you mean about dropping the unit return from runOnce - i don't think it can induce panic	07:39
rog	fwereade: if we can't get the unit in runOnce, we'll never start the upgrader	07:39
fwereade	rog, bah, true	07:39
rog	fwereade: we could also check for dead if you like.	07:39
rog	fwereade: that would be trivial	07:39
* rog wishes we used map[string]string throughout		07:41
rog	bbs	07:42
fwereade	rog, leaving unit stuff aside for a moment, what is your opinion of the "normal exit 0" thing?	07:42
fwereade	rog, (when you return ofc)	07:42
rog	fwereade: maybe the agent should just remove its own upstart conf	07:44
fwereade	rog, that feels icky to me	07:45
rog	fwereade: who else is going to remove it?	07:45
fwereade	rog, the machine agent?	07:45
rog	fwereade: ... or the principal unit agent, right?	07:46
fwereade	rog, and I think it's fine to have a unit agent run on startup and exit immediately without error to indicate that it's done all it has to do	07:46
fwereade	rog, yeah, whoever deployed it	07:46
fwereade	rog, or a machine agent for that matter	07:46
rog	fwereade: yeah, i think you're right	07:47
rog	fwereade: presumably it is possible to get upstart to never start something again after it's exited ok	07:47
rog	fwereade: i'm slightly dubious though - if we reboot, surely it'll start again anyway	07:47
fwereade	rog, AIUI the stanza "normal exit 0" should be sufficient	07:48
rog	fwereade: ok, sounds fine.	07:48
fwereade	rog, it will; and the UA will cleanly observe its deadness, exit without error, and never trouble the machine again that run	07:48
rog	fwereade: interesting point: currently if one of the workers dies without an error, runTasks will just let it die, but continue on. i wonder if actually it should kill everything as usual in that case, so the tasks are always tied together.	07:49
fwereade	rog, hmmmmmmmmmm	07:50
rog	fwereade: then the unit agent can exit with a nil error when the unit is dead	07:50
rog	fwereade: and everything gets shut down cleanly.	07:50
fwereade	rog, I think that in code I would prefer an explicit error return	07:51
rog	fwereade: a nil error seems a fine way of saying "i'm shutting down with no error" to me	07:51
fwereade	rog, an an error seems to me to be a fine way to signal "what you asked me to do ain't gonna happen"	07:52
rog	fwereade: it's done exactly what we asked it to do, surely?	07:52
rog	fwereade: but... given that it wants other things to die along with it, perhaps ErrDead is good.	07:53
fwereade	rog, gaah, sorry	07:54
rog	fwereade: last thing i saw was: [08:52:16] <fwereade> rog, an an error seems to me to be a fine way to signal "what you asked me to do ain't gonna happen"	07:55
fwereade	rog, well, we get to choose how we define it -- I see the condition of lacking a unit as fundamentally an error condition for the uniter, because it means it can't do anything	07:55
fwereade	rog, the client code may be in a position to handle that specific error in a different way	07:55
rog	fwereade: zero is a ok value :-)	07:55
rog	fwereade: last thing i said before you went:	07:56
rog	[08:53:45] <rog> fwereade: but... given that it wants other things to die along with it, perhaps ErrDead is good.	07:56
fwereade	rog, a Uniter needs a viable Unit to do its job -- if it cannot get the unit, or the unit is dead, that is an error :)	07:56
fwereade	rog, ah, I missed that, sorry	07:56
fwereade	rog, +1 on ErrDead causing nil return from Run	07:56
rog	fwereade: ok, will do	07:57
fwereade	rog, btw, would you put it somewhere easily accessible, like state, so I can also return it from the Uniter please?	08:00
rog	fwereade: it could even go in the Uniter if we wanted	08:01
rog	fwereade: i mean, in worker/uniter	08:01
fwereade	rog, +1	08:01
fwereade	rog, ok, sorry to keep banging on about the unit return, but it really doesn't feel right	08:06
fwereade	rog, and it's a little goroutine-icky, even if not technically unsafe	08:06
rog	fwereade: it's never used two goroutines simultaneously	08:06
fwereade	rog, but it crosses my mind that the upgrader already has the unit, and can surely send down its PathKey in the UpgradedError itself?	08:06
fwereade	rog, that's why I said not unsafe	08:07
rog	fwereade: i'm not sure i see the issue	08:07
rog	fwereade: runOnce only returns when none of its sub-tasks are running	08:07
rog	fwereade: therefore there can be no problem with goroutine ickiness	08:07
rog	fwereade: i may be on crack	08:08
fwereade	rog, I guess I just don't like a return value that is present-but-useless when no error, or most errors; present-and-useful on one specific error, and not-present on some other errors	08:09
rog	fwereade: it's always present :-)	08:10
rog	oh no it's not	08:10
fwereade	rog, so you somehow return a unit when a unit is not found?	08:10
rog	indeed	08:10
fwereade	rog, ISTM that the PathKey is the important thing, not the context, and that that would be a fine thing to send down with the error	08:10
fwereade	rog, sorry, "not the context" is kinda meaningless	08:12
rog	fwereade: that does assume that the agent is always going to be named after the PathKey.	08:12
rog	fwereade: otherwise the upgrader wouldn't be able to send it	08:12
rog	fwereade: i know what you mean about the ickiness though	08:13
fwereade	rog, hmm, AIUI that was the only point of PathKey in the first place	08:13
fwereade	rog, does Machine have one too?	08:13
rog	fwereade: yeah i know - that's why i wanted to call it "AgentName" ...	08:13
rog	fwereade: it does	08:13
fwereade	rog, feels to me like the way to go	08:14
rog	fwereade: the ironic thing is that we already have the means to make the agent name, without involving unit or machine	08:14
fwereade	rog, ha!	08:14
fwereade	rog, but then, meh, we (currently) always need the unit anyway, we may as well ask it what it thinks	08:14
fwereade	rog, +1 on AgentName	08:15
rog	fwereade: been there, it was wrong	08:15
fwereade	rog, understood, but it may just have been an idea before its time :)	08:15
rog	fwereade: maybe. perhaps i'll start with PathKey and see whether the incongruence might change niemeyer's mind	08:18
fwereade	rog, SGTM	08:21
rog	fwereade: the other thing that occurred to me is the UpgradedError could contain a closure which would do the upgrade (i.e. call ChangeAgentTools)	08:22
rog	fwereade: but that's probably a bit icky	08:23
fwereade	<fwereade> rog, the only thing that stopped me proposing that was a feeling that UpgradedError was weird enough already	08:24
fwereade	rog, but I think it would probably actually be the cleanest solution	08:24
rog	[09:23:16] <rog> fwereade: but that's probably a bit icky	08:25
fwereade	rog, not sure -- it's already not quite right, because it's not an upgraded error	08:26
fwereade	rog, it's actually an UpgradeReadyError	08:26
rog	fwereade: i dunno - it's "someone has upgraded me"	08:26
fwereade	rog, and if you think of it like that, a RunUpgrade method makes sense	08:26
fwereade	rog, the actual upgrade of the agent does not take place until the agent calls ChangeAgentTools though	08:27
rog	fwereade: UpgradeReadyError doesn't sound much like an error	08:27
fwereade	rog, in what way is it worse than UpgradedError?	08:27
rog	fwereade: i agree	08:27
rog	fwereade: you're probably right, it's just about the same	08:27
fwereade	rog, indeed -- the whole error idea is what still feels mildly abusive	08:28
fwereade	rog, but I think at least it's worth a try	08:28
fwereade	rog, https://codereview.appspot.com/6561063/ reviewed anyway	08:32
fwereade	rog, https://codereview.appspot.com/6564063/ LGTM	08:37
rog	fwereade: thanks	08:37
rog	fwereade: dammit, the upgrader can't do the actual upgrade easily	08:49
rog	fwereade: it doesn't have agent.Conf.DataDir	08:49
fwereade	rog, really? blast	08:49
rog	fwereade: and i'm slightly reluctant to pass that in just for this	08:49
fwereade	rog, doesn't it use DataDir somehow to figure out the place to put the real tools dir in the first place?	08:50
fwereade	rog, I thought the upgrader put the tools, and then just told the client that they can now symlink to the tools' known location	08:50
rog	fwereade: yes, it does indeed have dataDir around, you're right	08:50
fwereade	rog, cool	08:51
rog	fwereade: ok, np then	08:51
Aram	hello.	09:18
rog	Aram: mornin'	09:20
TheMue	Aram: moin	09:29
rog	TheMue: hiya	09:34
TheMue	rog: didn't we already seen 3h ago? ;)	09:35
rog	TheMue: i thought maybe i hadn't :-)	09:35
TheMue	rog: it's ok, we both are 40+. so i know that. :P	09:36
rog	fwereade: do you think all upstart scripts we produce should have the "normal exit 0" stanza?	09:36
rog	TheMue: :-)	09:36
fwereade	rog, hmm, it demands a similar change in the machine agent, but -- for our purposes at least -- that might be a sensible way to go	09:37
fwereade	rog, it will not be unbearably hard to make it configurable later if we want to	09:37
rog	fwereade: i've already made the change to the machine agent	09:38
fwereade	rog, well then go for it :)	09:39
TheMue	release meeting time	11:02
* davecheney waits		11:02
TheMue	i ping the boss	11:03
Aram	aah, meeting.	11:04
Aram	I forgot about this.	11:04
TheMue	hmm, no answer	11:04
rog	fwereade: this is still WIP, but hopefully is closer to what you'd like: https://codereview.appspot.com/6561063	11:07
mramm2	pong	11:14
TheMue	ah, the master	11:14
mramm2	hahah	11:14
TheMue	just wanted to collect the state of all to send it to you, but now you're here. fine.	11:15
rog	davecheney, fwereade: invites are out	11:18
Aram	davecheney: fwereade: https://plus.google.com/hangouts/_/9dde8fe32d77795910d89838149fce4ccddea664?authuser=1&hl=en	11:18
rog	i lost conn	11:27
rog	now "we're having trouble connecting with the plugin"	11:28
rog	fwereade: presumably your net connection isn't good enough for a hangout this morning?	11:36
fwereade	rog, oh hell I completely forgot and was eating lunch	11:36
fwereade	rog, joining	11:36
rog	fwereade: https://plus.google.com/hangouts/_/9dde8fe32d77795910d89838149fce4ccddea664?authuser=1&hl=en	11:37
davecheney	all: http://paste.ubuntu.com/1247363/	11:44
davecheney	^ hook, sadness	11:44
TheMue	fwereade: is the current ServiceRelationWatcher what you expected with issue #1032539?	11:48
fwereade	TheMue, just a mo, let me take a look; if it has Added/Removed, no	11:49
rog	davecheney: have you looked at the log output?	11:49
rog	davecheney: that's where the hook output will go, which should be diagnostic	11:49
rog	bbs	11:51
rog	fwereade: done, i think: https://codereview.appspot.com/6561063	12:06
rog	(live tests pass)	12:07
fwereade	rog, awesome, I'll take a look in a sec	12:07
rog	fwereade: thanks	12:07
rog	fwereade: trivial? https://codereview.appspot.com/6568064	12:32
rog	lunch	12:32
fwereade	rog, LGTM on UpgradeReadyError... oh wait there was something I meant to check	12:41
fwereade	rog, what's the deal with changing the arg to the machine agent?	12:41
rog	fwereade: hmm, let me check	12:41
rog	fwereade: i'm not sure what you mean. which argument?	12:42
fwereade	rog, I can't find it, I may be remembering from an older version	12:43
fwereade	rog, yeah, it doesn't exist -- sorry	12:44
fwereade	rog, https://codereview.appspot.com/6568064/ looks trivial to me	12:46
rog	fwereade: cool, will submit	12:46
rog	fwereade: (thanks!)	12:47
niemeyer	Hello all!	13:04
niemeyer	Sorry, my phone warned me about the early meeting a bit late.. did we have one?	13:06
Aram	we did	13:06
niemeyer	Aram: Nice	13:10
niemeyer	Aram: Have you seen the machines watcher I've pushed?	13:10
Aram	niemeyer: still wip, found a bug and making final adjustments now.	13:11
Aram	niemeyer: no, I'll take a look.	13:11
niemeyer	Aram: https://codereview.appspot.com/6566066/	13:11
niemeyer	Aram: Thanks	13:11
niemeyer	Aram: Since we're both working on that stuff, it's good to cross-review	13:11
niemeyer	Aram: Btw, I've talked about a similar pattern to the watcher you're pushing with William in the sprint, and now it occurred to me that you were not around in the conversation	13:12
niemeyer	Aram: is it using a single goroutine, or a goroutine per unit?	13:12
Aram	a goroutine for the machine plus one goroutine for each principal unit (not for subordinates).	13:13
niemeyer	Aram: I don't think we need more than a single goroutine	13:15
niemeyer	Aram: The logic ends up simpler rather than more complex with a single goroutine	13:16
niemeyer	Aram: Please have a look at RelationUnitsWatcher	13:16
niemeyer	Aram: It has a similar problem in which it has to decide what to watch as it goes	13:16
Aram	niemeyer: yeah, it can be done with only one goroutine. I'll change it after I fix this one bug. I did it this way because it matched the way I thought of the problem, but it's easy to change now.	13:17
niemeyer	Aram: It started with the same design of one-per-subcontext, and then after a conversation it got refactored to be a single goroutine	13:17
niemeyer	Aram: Yeah, I totally understand that.. it feels "right", which is why I figured I should ask	13:18
rog	niemeyer: yo!	13:27
niemeyer	rog: Yo!	13:28
rog	niemeyer: after discussion with william this morning, i made quite a few changes to the uniter upgrade branch	13:28
rog	niemeyer: i hope you find them ok...	13:28
rog	niemeyer: https://codereview.appspot.com/6561063/	13:28
rog	niemeyer: am just running the first live test on the machine agent with provisioner and firewaller in-built	13:29
rog	niemeyer: if that works, we're basically there for upgrading	13:29
niemeyer	rog: It looks like you've made changes to the upgrader itself?	13:30
rog	niemeyer: as requested, yes	13:30
niemeyer	rog: Why did it affect the branch, specifically?	13:30
rog	niemeyer: because fwereade was not happy about the fact that runOnce returned the unit	13:31
rog	niemeyer: because it was only valid sometimes	13:31
rog	niemeyer: so the fix was to make the upgrader responsible for doing the actual upgrade itself.	13:32
niemeyer	rog: Yeah, I'm not against it for sure.. sounds like a great idea	13:32
niemeyer	rog: I just wish this was done by itself	13:32
niemeyer	rog: Rather than on top of another big branch	13:32
rog	niemeyer: yes, perhaps i should have done that. it seemed like i'd have got into a twisty mess, but it may have been ok	13:33
niemeyer	rog: I think that's where I am right now :)	13:33
rog	niemeyer: ok, i'll back out the changes and create another branch	13:34
niemeyer	rog: It's fine, I'm already on it	13:34
niemeyer	rog: if err == uniter.ErrDead {?	13:36
niemeyer	rog: Was this merged with William's change?	13:36
rog	niemeyer: no, this was another thing that william suggested	13:36
rog	niemeyer: perhaps i should have left that for later too	13:36
niemeyer	rog: How's that related to upgrading?	13:37
niemeyer	82 if state.IsNotFound(err) || err == nil && unit.Life() == state.Dead {	13:37
niemeyer	?	13:37
niemeyer	rog: Gosh man..	13:37
rog	niemeyer: it's related to the logic around upgrading.	13:37
niemeyer	rog: That's related to unit lifecycle	13:38
rog	niemeyer: william asked me to change it, so i did. i'll change it back, np	13:38
niemeyer	rog: Which is not handled now, and as far as I understand is completely unrelated to upgrading	13:38
niemeyer	rog: Okay, can we please go back to applying the upgrading pattern that was already in place to the unit?	13:39
niemeyer	rog: Or, optionally, refactor the upgrading out without adding it to the unit	13:39
niemeyer	rog: and then add it to the unit in a separate step	13:39
niemeyer	rog: The branch as it is seems somewhat out of scope in whichever angle we look	13:39
rog	niemeyer: if i just take out the lifecycle changes, would that be enough?	13:40
niemeyer	rog: I have no idea.. right now I have a branch that has at least: 1) Add upgrade support to unit; 2) Refactor the upgrade mechanism; 3) Add lifecycle support to the unit	13:40
niemeyer	rog: I'd do them as (2), (1), and keep (3) out entirely.. William is working on it	13:41
rog	niemeyer: ok	13:41
niemeyer	rog: I'd also be fine with (1) + (2), though, as that's the order you did and might be easier to get back onto that state	13:42
rog	niemeyer: that would indeed be much easier, as i did the lifecycle thing last	13:43
niemeyer	rog: Right	13:43
rog	niemeyer: thanks	13:43
rog	niemeyer: i should know better when to stop what i'm doing and make a new independent branch	13:43
niemeyer	rog: I have a hard time understanding it, to be honest	13:44
niemeyer	rog: It's so pleasing to get small branches in..	13:44
niemeyer	rog: Fast, painless	13:44
niemeyer	rog: Easy for people to look at and "Hah, of course we want that"	13:44
rog	niemeyer: yeah, but in this case, the incentive was "if i just make these few small changes, then william will be happy"	13:44
niemeyer	rog: You can always make William happy in another branch :-)	13:45
rog	niemeyer: yeah	13:45
niemeyer	rog: Even more when the change is already huge	13:45
TheMue	so, late lunchtime today, afterwards an appointment outside	13:50
Aram	niemeyer: you got a review.	13:51
TheMue	niemeyer: maybe i come back to you in the evening regarding the security groups, but most is already clear	13:51
niemeyer	Aram: Thanks!	13:51
Aram	niemeyer: I'll mark my branch as WIP and do some more changes.	13:51
niemeyer	TheMue: Super, that branch Aram reviewed has your test, btw	13:51
niemeyer	Aram: Cool	13:51
TheMue	niemeyer: yes, i've revied it too and like it	13:52
TheMue	reviewed	13:52
niemeyer	Aram: Can you clarify this bit: "to check that when we get an event, the lifecycle of the entity is really what we expect it to be."	13:53
Aram	niemeyer: we get []int{2,3,5}, it would be nice if we could check that 5 is alive because it was just added, 2 is dead and 3 is dying.	13:54
niemeyer	Aram: Hmm.. what are we testing?	13:54
niemeyer	Aram: This feels like testing the test itself? I mean, it's the test itself that is putting the unit to Dead	13:55
Aram	the test puts the unit to dead, but perhaps the watcher misfired for whatever wrong reason, still delivering the correct event. by checking that the unit is dead we make sure that the watcher fired for the correct reason.	13:56
niemeyer	Aram: Sorry, I don't think that's the case. The test is saying "u.EnsureDead()".. it doesn't make sense for this specific test to ask "Is u actually dead?"..	13:57
niemeyer	Aram: We have tests for EnsureDead elsewhere	13:57
niemeyer	Aram: Regarding this: "I pondered about this. If we use Select anyway, why don't we use the real document, machineDoc in this case, and ignore the fields we don't care about?"	13:58
niemeyer	Aram: I was on the fence..	13:58
niemeyer	Aram: I think you're right.. we should just use the machine doc	13:58
rog	niemeyer: i hope this is more digestible: https://codereview.appspot.com/6561063	14:20
niemeyer	rog: Thanks very much, looking	14:20
rog	fwereade: i've taken out the life-cycle-related changes - they'll go in another CL	14:20
fwereade	rog, ok, SGTM, so long as they come soon I won't fret :)	14:20
niemeyer	rog: At these times I <3 Rietveld	14:22
rog	niemeyer: yeah, being able to diff against the different stages is invaluable	14:23
niemeyer	rog: Yeah, and diffing across the back and forth produces clean diffs	14:23
rog	niemeyer: ah yes	14:23
Aram	at previous job we had the worst review tool ever.	14:23
Aram	custom made in house, probably 15 years ago.	14:24
rog	fwereade, niemeyer: factored-out branch: https://codereview.appspot.com/6567067/	14:33
niemeyer	rog: Reviewed	14:39
niemeyer	rog: The first one, that is	14:39
rog	niemeyer: tyvm	14:39
niemeyer	Aram: Nasty. I think I never had anything close to Rietveld either, to be honest	14:40
niemeyer	So much lifetime wasted :-)	14:40
rog	niemeyer: the cloudinit change	14:42
rog	is necessary because the upgrader uses the machine's PathKey for the agent	14:42
rog	name. I could roll that back and explicitly pass in an agent name to NewUpgrader	14:42
rog	instead, if you like.	14:42
Aram	dy = -dx/(x + c1)	14:44
Aram	y = -I[dx/(x + c1)]	14:44
Aram	y = -ln|x + c1| + c2' = -ln|x + c1| + ln(exp(c2')) = -ln(exp(c2')|(x + c1)|) = -ln|c2(x + c1)|	14:44
Aram	meh, sorry, not here.	14:44
Aram	damn paste.	14:44
niemeyer	rog: No, sounds sensible then, thank you	14:44
rog	niemeyer: cool.	14:44
niemeyer	Aram: Curious	14:45
rog	niemeyer: BTW for the nil error, i think maybe it would be best as: if err == nil {err = fmt.Errorf("tasks finished with no error") } or something	14:45
niemeyer	rog: "uniter error: %v" feels very terse and clear	14:46
rog	niemeyer: rather than "uniter error: nil" which is still not great	14:46
rog	niemeyer: ok, will do	14:46
rog	niemeyer: seems funny saying "error" when there's no error, that's all	14:46
Aram	niemeyer: I found an awesome hacky way of solving the not so complex differential equation x * y' + 1 = exp(y).	14:47
niemeyer	Aram: What does the equation mean?	14:48
Aram	can't remember where I first encountered it, particle physics for sure. A new solution just sprang to mind randomly and I had to write it down.	14:52
niemeyer	rog: Fair enough regarding the error message, you're right	14:52
niemeyer	Aram: That's how it generally goes :-)	14:53
rog	niemeyer: ok, i'll go with my suggestion above, thanks	14:53
niemeyer	rog: Thank you	14:53
rog	ah, this will be why my combined machine and provisioning agent isn't working!	14:53
rog	2012/09/28 14:47:51 JUJU loaded invalid environment configuration: no registered provider for "ec2"	14:53
rog	not too hard to fix :-)	14:54
Aram	I must specify that I've probably encountered this equation more than 7 years ago though, heh.	14:54
Aram	at a particular physics olympiad.	14:54
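For reference, the equation Aram mentions, x * y' + 1 = exp(y), can be solved with a standard substitution. This is not necessarily the "hacky" route he had in mind; it is a textbook-style sketch, with u = exp(-y) and the constant c introduced here for illustration.

```latex
\begin{align*}
x\,y' + 1 &= e^{y} \\
x\,y'\,e^{-y} + e^{-y} &= 1
  && \text{multiply by } e^{-y} \\
-x\,u' + u &= 1
  && \text{with } u = e^{-y},\ u' = -y'\,e^{-y} \\
\frac{du}{u - 1} &= \frac{dx}{x} \\
u - 1 &= c\,x \\
y &= -\ln(1 + c\,x)
  && \text{valid where } 1 + c\,x > 0
\end{align*}
```

Substituting back gives y' = -c/(1 + c x), so x y' + 1 = 1/(1 + c x) = exp(y), as required. The pasted result -ln|c2(x + c1)| has the same form when c1 * c2 = 1.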
niemeyer	rog: Follow up reviewed	14:54
rog	niemeyer: thanks!	14:55
rog	niemeyer: i wondered about worker.ErrDead but considered that it might be useful for a client to be able to distinguish which worker had given the ErrDead. but maybe that's overthinking it.	14:56
niemeyer	Aram: Differential equations are surprisingly useful.. I wish I could keep the background math in my head for longer.. but haven't really ever used them in anger, so it remains just as a spare tool	14:57
=== hazmat` is now known as hazmat
niemeyer	rog: The only agent that can be dead is the one that is running	14:58
rog	niemeyer: ok.	14:59
niemeyer	rog: e.g. we can't have a dead provisioning worker without a dead machine	14:59
Aram	I have to step out early today guys, but I'll finish work in the weekend on the watcher.	15:00
rog	Aram: have fun!	15:00
niemeyer	Aram: Have a pleasant EOD	15:00
Aram	thanks.	15:00
rog	machiner/provisioner/firewaller upgrade worked live, yay!	15:15
niemeyer	rog: WOAH	15:17
niemeyer	I guess we're closing September in pretty good shape	15:18
rog	niemeyer: in worker: var ErrDead = errors.New("agent object is dead") ?	15:20
rog	pwd	15:20
rog	niemeyer: better than we were a week ago, no question :-)	15:21
niemeyer	rog: s/object/entity/, otherwise LGTM	15:22
rog	niemeyer: cool	15:22
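A shared sentinel error of the kind agreed on above might look like the sketch below. Only the variable name and the message (with niemeyer's s/object/entity/ applied) come from the discussion; `runWorker` and `shouldStop` are hypothetical names used for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDead is the sentinel discussed above, with the agreed wording.
var ErrDead = errors.New("agent entity is dead")

// runWorker is a hypothetical stand-in for any worker loop; it returns
// ErrDead when the entity it serves has been marked dead.
func runWorker(entityDead bool) error {
	if entityDead {
		return ErrDead
	}
	return nil
}

// shouldStop reports whether the agent should exit rather than restart
// its workers. Comparing against one shared value is enough because,
// as noted above, only the running agent's own entity can be dead.
func shouldStop(err error) bool {
	return err == ErrDead
}

func main() {
	fmt.Println(shouldStop(runWorker(true)))
	fmt.Println(shouldStop(runWorker(false)))
}
```

A single package-level value keeps the check a plain equality comparison, at the cost of not recording which worker reported it.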
rog	niemeyer: https://codereview.appspot.com/6570063/	15:51
rog	niemeyer: the final piece of the puzzle	15:51
niemeyer	rog: Looking	15:51
rog	fwereade: https://codereview.appspot.com/6570063/	15:51
fwereade	rog, ack	15:52
fwereade	rog, that is awesome	15:54
fwereade	rog, LGTM	15:55
rog	fwereade: cheers!	15:55
rog	fwereade: i need to do another branch to pass the state into the Machiner directly, i think.	15:56
rog	fwereade: or maybe niemeyer will call that out in this branch	15:56
rog	fwereade: actually, nope, it's not a problem, cool.	15:59
niemeyer	rog: Done	15:59
rog	niemeyer: thanks!	15:59
niemeyer	rog: Very cool	15:59
rog	niemeyer: it was very easy and almost worked first time...	16:00
niemeyer	rog: Great stuff	16:00
niemeyer	Lunch is calling	16:01
niemeyer	biab	16:01
rog	right, upgrades are GO!	16:11
rog	mramm2: ^	16:11
mramm2	rog: awesome	16:20
rog	mramm2: the only thing we don't have currently is the --bump-version functionality.	16:21
rog	niemeyer: stage 1 of --bump-version: https://codereview.appspot.com/6560066	16:43
niemeyer	rog: Looking	16:43
niemeyer	rog: Done	16:47
rog	niemeyer: thanks!	16:48
niemeyer	rog: My pleasure	16:48
rog	niemeyer: BTW, about --bump-version:	16:48
rog	niemeyer: i'm wondering when we don't want it enabled	16:48
rog	niemeyer: if we're uploading tools to private storage	16:48
niemeyer	rog: Any time checking out a release and uploading it	16:49
rog	niemeyer: why would we want to do that without bumping the build version? nothing will see the new release.	16:50
niemeyer	rog: Bumping the version is a hack.. uploading tools to private storage is not	16:51
niemeyer	rog: People can upload tools to private storage in their own cloud, with a real juju release	16:51
rog	niemeyer: ok. should we make it an error if the tools already exist in the storage then?	16:51
niemeyer	rog: How's that connected to the above?	16:52
rog	niemeyer: because otherwise it's likely to be a mistake, and people might be surprised when nothing happens.	16:52
niemeyer	rog: When you copy a file to a location that already contains the same file, nothing happens other than the same file being in the same place	16:52
niemeyer	rog: That's not surprising	16:53
rog	niemeyer: i'm not sure. in this case, it may well feel like we're asking juju to use a particular version.	16:53
rog	niemeyer: which it will be, just not the version we've just uploaded	16:54
niemeyer	rog: It will be the version we just uploaded if that's the version that is being used	16:54
niemeyer	rog: I don't see the problem, really	16:54
niemeyer	rog: --upload-tools uploads tools.. that's all	16:54
rog	niemeyer: well, the problem there is we've got two "versions" of the s/w with the same version number	16:54
niemeyer	rog: If people expect more than this, they'll be wrong	16:55
rog	niemeyer: it's actually "upgrade-juju --upload-tools" and the problem i see is in the first word there.	16:55
niemeyer	rog: That means you've changed the release	16:55
rog	niemeyer: yeah - we've changed the release but the version number hasn't changed. that feels a bit like a mistake to me.	16:56
rog	niemeyer: i'm trying to imagine a situation where it's useful to overwrite the existing tools for a given version	16:57
niemeyer	rog: If you checkout 2.2.0, that's 2.2.0	16:57
niemeyer	rog: Not 2.2.0.100, not 2.2.0.1001	16:57
rog	niemeyer: ok, that's fine.	16:58
niemeyer	rog: We're doing development, and we have tools to do what we want when we have to	16:58
niemeyer	rog: Thus --bump-version	16:58
rog	niemeyer: if someone has already checked out and uploaded 2.2.0, is it useful to them to be able to overwrite the existing 2.2.0 in storage?	16:58
niemeyer	rog: Don't know.. doesn't look like a problem we should worry about right now	16:59
rog	niemeyer: (i'm not suggesting automatic --bump-version BTW)	16:59
rog	niemeyer: ok	16:59
niemeyer	rog: If people ask to upload, and it uploads, sounds fine	16:59
niemeyer	rog: We can fine tune that behavior over time too.. anything we think right now will likely be wrong in the real world	17:00
rog	niemeyer: ok, seems reasonable	17:00
rog	niemeyer: submitted. ... and with that, it's time for me to call it a week. oh, i might sign off a few bugs first :-)	17:07
niemeyer	rog: Super, thanks for the great stuff this week	17:07
rog	niemeyer: got there in the end :-)	17:08
niemeyer	rog: Yeah	17:08
rog	fwereade, Aram, niemeyer, everyone else: have a great weekend!	17:14
fwereade	rog, and yourself :)	17:14
niemeyer	rog: Thanks, you too!	17:14
fwereade	niemeyer, btw, I just put a couple of trivials in the queue	17:14
niemeyer	fwereade: I'll look right now	17:14
niemeyer	fwereade: Done	17:19
niemeyer	fwereade: Done	17:21
fwereade	niemeyer, lovely, thanks	17:21
niemeyer	fwereade: My pleasure.. have a couple of questions on the second one	17:22
fwereade	niemeyer, the bits you're asking about are basically the same	17:22
fwereade	niemeyer, if you do a plain waitHooks{}, it means, "I expect no further hooks to have been run"	17:22
fwereade	niemeyer, the initial StartSync is to poke the uniter into responding to events that might cause spurious events to happen	17:23
fwereade	niemeyer, the subsequent ones are to make sure that it responds to state changes in a timely way	17:23
fwereade	niemeyer, sound sensible?	17:24
niemeyer	fwereade: Ah, yeah, thanks.. I suggest dropping it down to, say, 100ms then, since this is the happy case	17:25
fwereade	niemeyer, SGTM	17:25
niemeyer	fwereade: We're getting a second every 5 of those, otherwise	17:25
fwereade	niemeyer, indeed, dropping it will not hurt at all	17:25
fwereade	niemeyer, ok to submit with that change?	17:26
niemeyer	fwereade: LGTM	17:26
fwereade	niemeyer, cheers	17:26
fwereade	hmm, got to go out in a mo, actually, but they should be in soon enough :)	17:27
niemeyer	fwereade: Have a good weekend, and thanks for the hard work too.. feels like we're on the runway for a fully working implementation	17:30
* niemeyer steps out for a while		19:41