wrtp | davecheney, niemeyer: you'll be glad to know that the recent cloudinit changes and traffic auth changes mean that we can actually run a uniter live. | 00:27 |
---|---|---|
niemeyer | wrtp: WOAH | 00:27 |
niemeyer | wrtp: No kidding! | 00:27 |
* niemeyer grabs some port wine to celebrate :) | 00:27 | |
wrtp | davecheney: in a few minutes, i *should* see the first live uniter upgrade | 00:27 |
wrtp | niemeyer: ^ | 00:27 |
wrtp | niemeyer: the first time failed because i'd forgotten to add the UpgradedError check... | 00:28 |
* wrtp does the same | 00:28 | |
wrtp | ah, interesting case. it *would* work, but it doesn't get as far as starting the upgrader because the uniter is returning an error when initialised. | 00:37 |
wrtp | hmm, not easily fixed either. | 00:38 |
wrtp | niemeyer: this is an interesting case that i hadn't anticipated; i'm not quite sure what to do | 00:48 |
niemeyer | wrtp: Hmm | 01:08 |
niemeyer | wrtp: What's the error? | 01:08 |
niemeyer | wrtp: and at which point? | 01:08 |
wrtp | niemeyer: it actually doesn't really matter so much what the error is in fact | 01:08 |
wrtp | niemeyer: although it's actually ModeInit: bad http response 404 Not Found | 01:09 |
wrtp | niemeyer: the problem is that if one of the workers consistently dies early, the task exits, and all the other tasks (including the upgrader) get killed off | 01:10 |
wrtp | niemeyer: so the upgrader never has a chance to upgrade | 01:10 |
wrtp | niemeyer: i've worked out what i think is a decent solution though | 01:10 |
niemeyer | wrtp: If the uniter is consistently dying early, sounds like something is wrong? | 01:10 |
wrtp | niemeyer: indeed it is, but our code should be able to cope with that, even if it is wrong. | 01:11 |
wrtp | niemeyer: otherwise we have a situation where we're running some bad code and we can't upgrade out of that, even though nothing has crashed | 01:11 |
niemeyer | wrtp: If it's dying consistently before even starting, I don't see how we'd be able to do anything? | 01:12 |
niemeyer | wrtp: We can't go "Hey, I think I'll upgrade anyway, just in case!" | 01:12 |
wrtp | niemeyer: it was, but my first thought was: put the checks in *after* we've started the uniter | 01:12 |
wrtp | niemeyer: so we're running the upgrader concurrently. | 01:13 |
wrtp | niemeyer: but then you run into the case above. | 01:13 |
wrtp | niemeyer: but as i said, i think i have a nice solution. | 01:13 |
niemeyer | wrtp: Oh/ | 01:16 |
niemeyer | ? | 01:16 |
wrtp | niemeyer: 1) we change the runTasks function so that it gives UpgradedError priority over other errors | 01:17 |
wrtp | niemeyer: 2) we change the upgrader so that if it's stopped while downloading, it waits a while for the download to complete before actually dying | 01:17 |
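To make step (1) concrete, here is a minimal sketch, assuming an `UpgradedError` type and an error-per-task shape as implied by the conversation (none of this is quoted from the juju-core source of the time):

```go
package main

import "fmt"

// UpgradedError stands in for the error type named in the chat; the
// real definition is not quoted here.
type UpgradedError struct{}

func (e *UpgradedError) Error() string { return "must restart: tools upgraded" }

// firstError sketches step (1): when several concurrent tasks fail,
// report an upgrade request in preference to any other error, so a
// consistently-crashing worker cannot mask a pending upgrade.
func firstError(errs []error) error {
	var first error
	for _, err := range errs {
		if err == nil {
			continue
		}
		if _, ok := err.(*UpgradedError); ok {
			return err // the upgrade takes priority over other failures
		}
		if first == nil {
			first = err
		}
	}
	return first
}

func main() {
	errs := []error{
		fmt.Errorf("ModeInit: bad http response 404 Not Found"),
		&UpgradedError{},
	}
	fmt.Println(firstError(errs)) // must restart: tools upgraded
}
```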
niemeyer | wrtp: Hmm | 01:17 |
niemeyer | wrtp: Feels easy to fall onto races | 01:28 |
niemeyer | wrtp: I wish we had a more deterministic way to say that | 01:28 |
wrtp | niemeyer: i think it's not too difficult | 01:28 |
wrtp | niemeyer: obviously we can't wait forever because then a download that hung up would mean that an upgrader would never be killable, but i reckon a minute would be fine | 01:29 |
niemeyer | wrtp: Feels like guess work | 01:29 |
niemeyer | wrtp: It could also take 10 in a slow network, or 20 | 01:30 |
wrtp | niemeyer: point of reference: in ec2 it takes 4s | 01:30 |
wrtp | niemeyer: we could add something to the downloader that signals that some progress is being made | 01:31 |
niemeyer | wrtp: So S3 to EC2 is your reference of "slow network"!? :-) | 01:31 |
wrtp | niemeyer: it'd still be guesswork, but not quite so wild | 01:31 |
niemeyer | wrtp: That's probably as good as it gets | 01:31 |
wrtp | niemeyer: fair enough | 01:31 |
wrtp | niemeyer: we could make the timeout 1 day if we liked | 01:31 |
niemeyer | wrtp: I don't like any of that.. :( | 01:31 |
niemeyer | wrtp: We need a more deterministic way to define whether we want to stop *now* or whether we think we should check for an upgrade | 01:32 |
wrtp | niemeyer: when do we ever want to stop *now* ? | 01:32 |
niemeyer | wrtp: When we say "stop".. when there's an unrecoverable error in the state connection, .. | 01:33 |
wrtp | niemeyer: yes, we could add code to enable stopping immediately, but that doesn't solve our problem | 01:33 |
wrtp | niemeyer: ah yes, if we could know when we had such an error, that would be good | 01:34 |
wrtp | niemeyer: we'd have to lose lots of errorContextfs though | 01:34 |
wrtp | niemeyer: or actually, maybe errorContextf could be changed to deal with it | 01:35 |
niemeyer | wrtp: Hm? | 01:36 |
wrtp | niemeyer: ignore me | 01:36 |
wrtp | niemeyer: i'm on crack | 01:36 |
wrtp | niemeyer: we just need a way of asking whether the state connection is still ok | 01:36 |
niemeyer | wrtp: Hmm | 01:38 |
niemeyer | wrtp: Did you get to the point of seeing the upgrader behavior when the issue was happening? | 01:39 |
niemeyer | wrtp: Was it bailing out before ever checking for any upgrade? | 01:39 |
wrtp | niemeyer: i'm not sure. it doesn't currently print a log message when a download starts | 01:40 |
niemeyer | wrtp: I was just having a look at it, a twist on the first idea you had sounds like an interesting direction | 01:41 |
wrtp | niemeyer: i've actually finished writing the code for the first idea already BTW | 01:42 |
niemeyer | wrtp: It's not just about "if a download is in progress", though | 01:43 |
wrtp | niemeyer: it's also "have we seen an initial environ config" | 01:43 |
niemeyer | wrtp: Right | 01:43 |
wrtp | niemeyer: i've already got that logic too | 01:43 |
niemeyer | wrtp: We have to ignore Dying and, if we have a valid environ, *go check for an upgrade*.. only then, put the Dying() consideration back on the table | 01:44 |
niemeyer | wrtp: Oh, you rock :) | 01:44 |
wrtp | niemeyer: http://paste.ubuntu.com/1201690/ | 01:44 |
niemeyer | dying = time.After(1 * time.Minute) | 01:46 |
niemeyer | !? | 01:46 |
wrtp | niemeyer: oops, well obviously we can't do *that* :-) | 01:46 |
wrtp | niemeyer: but something that sends on dying after a minute would do the job | 01:46 |
niemeyer | wrtp: I'd say we can ignore dying on the first round, and go try an upgrade (facilitated by factoring out somethings onto a function). | 01:48 |
niemeyer | wrtp: If we have a timeout, that timeout should be on the downloader itself, not on that loop | 01:49 |
niemeyer | wrtp: It's the downloader that should be responsible for complaining that things are stuck | 01:49 |
niemeyer | wrtp: if we don't do that, we have problems either way (what if a charm download stops, what if ...) | 01:50 |
wrtp | niemeyer: what about waiting for the initial config? | 01:50 |
niemeyer | wrtp: Expand please? | 01:51 |
wrtp | niemeyer: if we haven't yet received the initial environ config, there's no download to complain about | 01:52 |
wrtp | niemeyer: so we would need a top-level timeout anyway | 01:52 |
wrtp | niemeyer: but that's not hard to arrange | 01:53 |
wrtp | niemeyer: we can use both | 01:53 |
niemeyer | wrtp: We can ask the state for an environment upfront, without using WaitForEnviron | 01:53 |
niemeyer | wrtp: and run one attempt to upgrade | 01:54 |
niemeyer | wrtp: By calling a function explicitly that contains the block that is within the select today | 01:54 |
wrtp | niemeyer: we actually don't need to use WaitForEnviron at all. we don't use the environ. | 01:54 |
wrtp | erm, maybe we do | 01:55 |
niemeyer | tools, err := environs.FindTools(environ, binary, flags) | 01:55 |
niemeyer | We need it | 01:55 |
wrtp | niemeyer: ah yes, good point. | 01:56 |
niemeyer | wrtp: Either way, your plan is a great quick solution | 01:57 |
niemeyer | wrtp: dying = u.tomb.Dying + 5 min delay | 01:57 |
niemeyer | wrtp: Once we know there are no downloads available, reset it to pure Dying | 01:58 |
wrtp | niemeyer: once we know there are no downloads available, we can just return | 01:58 |
niemeyer | wrtp: Uh? | 01:59 |
wrtp | i think | 01:59 |
niemeyer | wrtp: What if there is a new upgrade? | 01:59 |
niemeyer | wrtp: It should still work as usual | 01:59 |
niemeyer | wrtp: The loop waits for environ changes | 02:00 |
wrtp | niemeyer: this is *after* we've already received a dying signal, right? | 02:00 |
niemeyer | wrtp: No, I was talking about upgrader behavior at all times | 02:00 |
wrtp | niemeyer: ok, i'm not sure what you mean by "dying = u.tomb.Dying + 5 min delay" then | 02:00 |
niemeyer | wrtp: On entrance, a dying channel is built that will fire if u.tomb.Dying fires, but with a 5 mins delay | 02:00 |
wrtp | niemeyer: ahhh | 02:01 |
niemeyer | wrtp: Everything works as usual then.. we get into the loop, Changes fires with the first env config, | 02:01 |
niemeyer | wrtp: Then, when we get the first download-not-found, we reset dying to take off that delay | 02:01 |
niemeyer | wrtp: From then on it's fine for it to die at any point | 02:02 |
wrtp | niemeyer: we still want to try to complete a download if we're killed, no? | 02:02 |
niemeyer | <niemeyer> wrtp: Then, when we get the first download-not-found, we reset dying to take off that delay | 02:02 |
niemeyer | wrtp: By definition, to get a download-not-found, we've checked if there was a download, and noticed there wasn't any | 02:03 |
niemeyer | wrtp: Before that, dying is Dying + 5 mins delay | 02:03 |
niemeyer | wrtp: Which means we have the 5 mins to complete the download | 02:03 |
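The mechanism being agreed here is small enough to sketch in full. A hedged illustration, with `delayedDying` as a hypothetical helper (juju's tomb package exposes a `Dying()` channel that closes when the tomb is killed):

```go
package main

import (
	"fmt"
	"time"
)

// delayedDying returns a channel that closes `delay` after the real
// dying channel does, giving an in-flight download that long to finish
// before the main loop observes the stop request. Note the goroutine
// stays blocked until the real channel fires -- the "hangs forever"
// point raised further down in this conversation.
func delayedDying(real <-chan struct{}, delay time.Duration) <-chan struct{} {
	ch := make(chan struct{})
	go func() {
		<-real
		time.Sleep(delay)
		close(ch)
	}()
	return ch
}

func main() {
	stop := make(chan struct{})
	dying := delayedDying(stop, 100*time.Millisecond)
	close(stop)
	<-dying
	fmt.Println("stopped, after the grace period")
}
```

In the upgrader loop, `dying` would start out as `delayedDying(u.tomb.Dying(), 5*time.Minute)` and be reset to the plain `u.tomb.Dying()` once the first check finds nothing to download.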
wrtp | niemeyer: yes, that sounds good | 02:03 |
wrtp | niemeyer: because it's only the first download opportunity we really care about | 02:04 |
niemeyer | wrtp: We can tweak as we go, but it *sounds* like this is a pretty trivial change on top of what we have today | 02:04 |
niemeyer | wrtp: Right | 02:04 |
niemeyer | wrtp: If we break in the middle later, we'll loop and get into that again in either case | 02:04 |
wrtp | niemeyer: yeah, that sounds great actually | 02:04 |
niemeyer | wrtp: If there's no upgrade to do we reset it as well, of course | 02:05 |
wrtp | niemeyer: ah, i thought that's what you meant by download-not-found actually | 02:06 |
wrtp | niemeyer: but it's all good | 02:06 |
niemeyer | wrtp: Yeah, both that and actual failures in download (we just break right now) | 02:06 |
wrtp | god i love how easy it is to reason about this stuff with channels | 02:08 |
niemeyer | wrtp: +1! | 02:09 |
wrtp | niemeyer: e.g. http://paste.ubuntu.com/1201718/ | 02:09 |
wrtp | niemeyer: a little better: http://paste.ubuntu.com/1201720/ | 02:12 |
niemeyer | wrtp: Yeah, that's nice | 02:17 |
niemeyer | wrtp: A bit unfortunate that it hangs a goroutine forever for the 5 mins delay | 02:18 |
niemeyer | wrtp: But I guess that's minor | 02:18 |
wrtp | niemeyer: well, it can be 5 minutes only if the channel is buffered | 02:18 |
niemeyer | wrtp: Hm? | 02:19 |
wrtp | niemeyer: oh no, i'm talking rubbish | 02:19 |
wrtp | niemeyer: it hangs a goroutine for 5 minutes, as you say, minor. | 02:19 |
niemeyer | wrtp: No, it hangs forever | 02:19 |
wrtp | niemeyer: oh yeah, i see what you mean | 02:20 |
wrtp | niemeyer: i can't get very worked up about it :-) | 02:21 |
niemeyer | wrtp: If we use a local var t *Tomb assigning to the u.tomb, we could use something like this: http://paste.ubuntu.com/1201731/ | 02:21 |
niemeyer | wrtp: So we Kill the delayed tomb after we have the first round acknowledged | 02:22 |
niemeyer | wrtp: and set t back to the real tomb | 02:22 |
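A sketch of that variant as a fragment of the loop setup (the `Kill`/`Done`/`Dying` calls match the tomb package juju used; the wiring is an assumption about the paste's content, which is no longer available):

```go
// proxy mirrors u.tomb but with a grace period. Because the helper
// goroutine also selects on proxy.Dying(), killing the proxy after the
// first round releases it, instead of leaving it blocked until the
// process exits.
proxy := new(tomb.Tomb)
go func() {
	defer proxy.Done()
	select {
	case <-u.tomb.Dying():
		time.Sleep(5 * time.Minute) // grace period for an in-flight download
	case <-proxy.Dying():
		// first round acknowledged; no grace period needed any more
	}
	proxy.Kill(nil)
}()
t := proxy // the main loop selects on t.Dying()

// ...after the first check finds nothing to download:
proxy.Kill(nil) // releases the helper goroutine
t = &u.tomb     // honour Dying immediately from now on
```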
wrtp | niemeyer: i see what you're doing, but i think it's overkill | 02:22 |
wrtp | niemeyer: we really couldn't care less about one goroutine | 02:22 |
niemeyer | wrtp: Up to you.. the two functions have the same size, are ready, and one of them doesn't leak memory | 02:23 |
niemeyer | Alright, it's time for me to take some shower | 02:25 |
wrtp | niemeyer: "leaking memory" is a little strong. it cleans up when the upgrader quits | 02:25 |
niemeyer | wrtp: Dude, and it's super late for you too | 02:25 |
wrtp | niemeyer: indeed | 02:26 |
niemeyer | wrtp: Yes, when the process dies, it will release the memory :-) | 02:26 |
wrtp | niemeyer: or 5 minutes after the upgrader returns. | 02:26 |
niemeyer | wrtp: Which is done when the process dies!? | 02:27 |
niemeyer | wrtp: Otherwise we'll always have an upgrader running | 02:27 |
niemeyer | wrtp: Either way.. I'm not keen on trashing memory like that when it's trivial to avoid, but I'm not keen on bikeshedding on it either.. have a great sleep there | 02:27 |
wrtp | niemeyer: will do, thanks | 02:28 |
wrtp | niemeyer: enjoy your shower... | 02:28 |
fwereade | hey, anyone who's around, cath is not well and I'm popping out to get some medicine... and I may in general be a bit sketchier than usual today | 07:03 |
davecheney | right o | 07:05 |
wrtp | fwereade: mornin' | 08:00 |
fwereade | wrtp, heyhey | 08:06 |
fwereade | wrtp, looks like you got a lot done last night -- nice :D | 08:07 |
wrtp | fwereade: you'll be glad to know i got the uniter running live last night | 08:07 |
wrtp | fwereade: it broke immediately, but that's not the point :-) | 08:07 |
wrtp | fwereade: it was started and connected to the state etc | 08:07 |
fwereade | wrtp, oh, bugger, I didn't read everything I missed | 08:07 |
fwereade | wrtp, what fell over? | 08:07 |
wrtp | fwereade: it failed to download the charm | 08:07 |
wrtp | fwereade: it's probably something i'm doing wrong in the test | 08:08 |
fwereade | wrtp, huh, weird | 08:08 |
wrtp | fwereade: one mo, i'll show you its log | 08:08 |
wrtp | fwereade: it repeats forever doing this: http://paste.ubuntu.com/1202116/ | 08:09 |
wrtp | fwereade: (it would be nice if we could see the url that's failed, don't you think?) | 08:09 |
wrtp | fwereade: it's actually quite good that it failed because it exposed me to an upgrader issue that i hadn't considered before | 08:09 |
fwereade | wrtp, huh, very odd | 08:10 |
fwereade | wrtp, but ModeInit is I think before install time, isn't it? | 08:10 |
fwereade | wrtp, that should be ModeInstalling | 08:10 |
fwereade | wrtp, most likely something to do with getting the unit address | 08:11 |
wrtp | fwereade: this is a classic example of why errorcontextf is not always sufficient BTW | 08:14 |
fwereade | wrtp, I'm not denying your general point about ErrorContextf, but I think that had it been used in the provider methods it would have been fine... surely? | 08:15 |
fwereade | wrtp, your point that it makes it inconvenient to return specific error values holds far more water IMO :) | 08:16 |
wrtp | fwereade: well, yeah, it must be used in every single function | 08:16 |
fwereade | wrtp, it must be used mindfully, at any rate :) | 08:17 |
fwereade | wrtp, IMO it comes down to a question of who has responsibility for producing sane errors -- the thing that fails, or its client | 08:17 |
fwereade | wrtp, despite the drawbacks, of which I am aware, I still come down on the thing-that-failed side | 08:18 |
wrtp | fwereade: it depends what you mean by a sane error i think | 08:18 |
fwereade | wrtp, probably, yeah :) | 08:18 |
wrtp | fwereade: with the current scheme, we're always passing the buck - when we see a function like the above, we say it's fine because it adds error context. but it's only fine if all the things it calls do the same. | 08:19 |
wrtp | fwereade: and in many of those functions it doesn't really look like a context is necessary. | 08:20 |
wrtp | fwereade: so nobody calls it out in review | 08:20 |
fwereade | wrtp, hmm, I think that statement is flawed in exactly the same way as "if it's up to the client, it's only ok if every single error everywhere is annotated" would be | 08:20 |
fwereade | wrtp, in either case it's essentially advocating tracebacks | 08:20 |
wrtp | fwereade: i think some kind of (fairly minimal) traceback is exactly what we need for diagnosing problems like we've just encountered | 08:21 |
wrtp | fwereade: we need just enough to walk us through the tree of possible error paths. | 08:22 |
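For reference, the annotation pattern being debated looks roughly like this; the signature and body are inferred from how `errorContextf` is used in this conversation, not quoted from the tree:

```go
package main

import "fmt"

// errorContextf wraps a function's returned error, via defer, with a
// description of the operation that failed. Each level that uses it
// adds one step of the "tree of possible error paths".
func errorContextf(err *error, format string, args ...interface{}) {
	if *err != nil {
		*err = fmt.Errorf(format+": %v", append(args, *err)...)
	}
}

// fetchCharm is a hypothetical illustration, not juju-core code.
func fetchCharm(url string) (err error) {
	defer errorContextf(&err, "cannot fetch charm %q", url)
	return fmt.Errorf("bad http response 404 Not Found")
}

func main() {
	// Prints: cannot fetch charm "cs:precise/dummy": bad http response 404 Not Found
	fmt.Println(fetchCharm("cs:precise/dummy"))
}
```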
wrtp | fwereade: what i'm going to have to do now is find out which of the functions that ModeInit is calling might be able to generate the error we saw. that's slow and error prone in itself. | 08:23 |
fwereade | wrtp, AFAICS there are only two things that could have failed there, and the actual problem is the same in either case, right? | 08:23 |
fwereade | wrtp, one is PrivateAddress, the other is PublicAddress, and (almost certainly) they'd each need to be fixed in the same way | 08:24 |
fwereade | wrtp, I totally understand there are cases where it could be much less pleasant | 08:24 |
wrtp | fwereade: ah yeah, that's true. i'd forgotten they accessed an http link | 08:24 |
fwereade | wrtp, and such a case is certainly evidence for inadequate annotation | 08:25 |
fwereade | wrtp, but I *think* I'm comfortable with the tradeoff | 08:25 |
wrtp | fwereade: i'll change their error messages to be a little better for a start | 08:25 |
fwereade | wrtp, the number of cases where I've actually been baffled by an error is pretty small | 08:25 |
wrtp | fwereade: just wait until we've encountered more of these kinds of errors reported by people in the wild :-) | 08:26 |
fwereade | wrtp, fair point -- but all the same, I'm uncomfortable with the idea that the way to handle errors right is to hand-hack traceback functionality everywhere | 08:27 |
wrtp | fwereade: i don't think of it as traceback functionality. i think of it as describing the error :-) | 08:27 |
fwereade | wrtp, it's the "everywhere" that bugs me more than the precise term we use for the mechanism by which we describe the error (but, yes, good descriptions will be better than tracebacks) | 08:38 |
wrtp | fwereade: i really feel there's no shortcut. but i'm rolling with it for now. if we decide to change the policy in the future, it won't be too hard. | 08:39 |
fwereade | wrtp, yeah :) | 08:40 |
fwereade | wrtp, btw, are your current branches messing with PATH in upstart? | 08:52 |
wrtp | fwereade: no | 08:53 |
fwereade | wrtp, awesome :) | 08:53 |
fwereade | wrtp, but I do have to fix something now then | 08:53 |
wrtp | fwereade: at least... there might be a dreg in the tests. i'll check. | 08:53 |
fwereade | wrtp, yeah, the problem is that the uniter tests mess with PATH | 08:54 |
fwereade | wrtp, it's not too bad though, I can do a trivial and notsonice fix quickly, and then sort out the long-running-hook-server thing which will let me do it nicely | 08:54 |
wrtp | fwereade: i've found the metadata problem BTW | 08:54 |
fwereade | wrtp, oh yes? | 08:54 |
wrtp | fwereade: it's using the version 1.0, but public-hostname didn't exist in that version | 08:55 |
fwereade | wrtp, ha :) | 08:55 |
wrtp | fwereade: tbh i think it should probably use "latest". | 08:55 |
fwereade | wrtp, mmmmmmaybe, I'm just not sure what guarantees they make re "latest" compatibility | 08:55 |
wrtp | fwereade: grr | 08:56 |
wrtp | 04.15.979 ... value *errors.errorString = &errors.errorString{s:"cannot put charm: cannot write file \"local_3a_precise_2f_dummy-1\" to control bucket: remote error: handshake failure"} ("cannot put charm: cannot write file \"local_3a_precise_2f_dummy-1\" to control bucket: remote error: handshake failure") | 08:56 |
fwereade | wrtp, yeah, those piss me off :/ | 08:56 |
wrtp | fwereade: ok, i'll use 2012-01-12 then | 08:56 |
fwereade | wrtp, brb cig | 08:56 |
fwereade | wrtp, cool | 08:56 |
wrtp | fwereade: actually i think "latest" should be fine | 08:59 |
fwereade | wrtp, I'm happy to trust your research/judgment :) | 09:03 |
wrtp | fwereade: for metadata categories it talks about "version introduced" which to me implies that a category can't be retracted once introduced | 09:04 |
wrtp | fwereade: and all their examples use "latest" | 09:05 |
fwereade | wrtp, sounds sane to me :) | 09:05 |
Aram | hello. | 09:12 |
fwereade | Aram, heyhey | 09:16 |
wrtp | fwereade: holy shit, it worked! | 09:17 |
* fwereade cheers | 09:17 | |
fwereade | wrtp, running unit agent? | 09:17 |
wrtp | fwereade: unit agent upgraded | 09:17 |
fwereade | wrtp, sweeeet | 09:18 |
fwereade | wrtp, and I'm just about to land charm upgrades too | 09:18 |
fwereade | wrtp, although I haven't actually implemented upgrade-charm yet | 09:18 |
wrtp | fwereade: i need to think of a nice test so we can live test that the uniter can run a charm | 09:18 |
wrtp | fwereade: how about a trivial charm that starts a web server? then we can poll for a while after the charm has started to see if the web server comes up | 09:19 |
wrtp | Aram: morning! | 09:19 |
fwereade | wrtp, yeah, sounds sensible, just checking for unit status is not adequate to tell that the *charm* is working | 09:20 |
* wrtp is quite happy now | 09:20 | |
wrtp | fwereade: indeed. i want to check that a hook has been executed for real | 09:20 |
wrtp | fwereade: this can lead into other tests that test more sophisticated functionality, i think | 09:20 |
fwereade | wrtp, sgtm | 09:20 |
fwereade | wrtp, will also test expose, which is nice | 09:21 |
fwereade | wrtp, maybe just an echo server? | 09:21 |
wrtp | fwereade: in fact, can i tell from the status that a charm has finished executing its start hook? | 09:21 |
fwereade | wrtp, yeah, once unit status is started | 09:21 |
wrtp | fwereade: sweet. that means that i won't have to wait too long after that | 09:22 |
wrtp | fwereade: and i've just implemented the unit watcher, so i can use that to watch the status | 09:22 |
wrtp | fwereade: lovely jubbly | 09:22 |
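A test-side sketch of the check being planned (the URL, banner, and helper name are placeholders, not juju-core test code): once the unit reports started, poll the charm's port until a known response appears or a deadline passes.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"
)

// waitForCharm polls the URL the test charm is expected to serve.
// Getting the expected body back proves a hook really ran, rather
// than just the unit agent having started.
func waitForCharm(url, want string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if resp, err := http.Get(url); err == nil {
			body, _ := ioutil.ReadAll(resp.Body)
			resp.Body.Close()
			if string(body) == want {
				return nil
			}
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("no %q from charm server at %s", want, url)
}

func main() {
	fmt.Println(waitForCharm("http://localhost:12345/", "hello", 10*time.Second))
}
```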
fwereade | wrtp, you could write an echo server with plain/pirate/3117 transforms, to test config-changed as well :) | 09:22 |
wrtp | fwereade: i think i'll just write a server that executes a specified jujuc callback and sends back the result | 09:23 |
fwereade | wrtp, how do you plan to do that? | 09:23 |
wrtp | fwereade: hmm, good point! | 09:23 |
fwereade | wrtp, :p | 09:23 |
wrtp | fwereade: darn, out of context callbacks | 09:24 |
fwereade | wrtp, yeah, I'm strongly inclined to do the preliminary work for that today, just because it makes the uniter much cleaner (and will, I *think*, improve performance a little) | 09:24 |
fwereade | wrtp, but you shouldn't count on the full thing being available any time soon at all | 09:25 |
fwereade | wrtp, but even a trivial echo server will verify that open-port works, and by extension (probably) the rest of the jujuc stuff | 09:25 |
wrtp | fwereade: maybe it's silly, but i'd prefer something that provides some actual info from the other side, so we *know* that we're talking to the right thing. | 09:27 |
fwereade | wrtp, well, anything you want to send back from jujuc can be grabbed and stored when you run the start hook, right? | 09:28 |
wrtp | fwereade: hmm, can i assume that python will be installed? | 09:29 |
fwereade | wrtp, but ISTM that config-changed is a useful mechanism for checking end-to-endness | 09:29 |
fwereade | wrtp, sorry I don't know | 09:29 |
wrtp | fwereade: yeah, i was thinking that | 09:29 |
wrtp | fwereade: what's the simplest web server i can write? | 09:29 |
wrtp | fwereade: that doesn't need to pull in any more resources than we already have | 09:30 |
fwereade | wrtp, if you have python, SimpleHTTPServer is pretty damn trivial | 09:31 |
wrtp | fwereade: yeah, if | 09:31 |
fwereade | wrtp, *but* the default python version is changing, so you want to be careful of that, | 09:31 |
wrtp | fwereade: oh pish | 09:31 |
fwereade | wrtp, I actually think you probably can guarantee *some* python | 09:31 |
fwereade | wrtp, but yeah, exactly | 09:31 |
wrtp | fwereade: perl is probably installed | 09:31 |
wrtp | fwereade: (although i've never written a line of perl in my life) | 09:32 |
fwereade | wrtp, whatever you need, it's one line in the install hook, surely? | 09:32 |
fwereade | wrtp, in my very limited experience you're not missing much ;p | 09:32 |
wrtp | fwereade: i firmly believe that to be the case :-) | 09:32 |
wrtp | fwereade: i miss the inferno shell where i could write: listen 'tcp!*!12345' {echo hello there} | 09:37 |
wrtp | fwereade: and instantly have a working server | 09:37 |
wrtp | fwereade: i think we must have python because cloud-init is written in python. | 09:38 |
wrtp | fwereade: i'll just have to be a teeny bit careful of version changes | 09:38 |
wrtp | fwereade: does hook output get logged into /var/log/juju/unit*.log ? | 09:41 |
fwereade | wrtp, nope, but juju-log will | 09:41 |
wrtp | fwereade: hmm. where *does* hook output go? | 09:41 |
fwereade | wrtp, I suspect it goes nowhere at the moment | 09:42 |
wrtp | fwereade: i think that's wrong | 09:42 |
fwereade | wrtp, fair point, we just want to be careful not to do it how we did in python, where every line on stderr gets prefixed with ERROR: ;) | 09:42 |
wrtp | fwereade: just trying to deploy the real wordpress charm, just to see what happens... | 09:43 |
wrtp | fwereade: http://paste.ubuntu.com/1202263/ | 09:45 |
wrtp | fwereade: also: http://paste.ubuntu.com/1202265/ | 09:45 |
wrtp | fwereade: it's looking good! | 09:46 |
wrtp | fwereade: no way of knowing why the install hook failed though | 09:46 |
fwereade | wrtp, that's pretty cool :D | 09:46 |
fwereade | wrtp, I kinda would like to separate hook output from uniter output, but that's just an instinct, not sure it holds water | 09:47 |
fwereade | wrtp, yeah, debug-hooks would be useful for that | 09:47 |
wrtp | fwereade: i'm not sure it does. they do need to be readily distinguishable though | 09:47 |
wrtp | fwereade: what does debug-hooks do again? | 09:48 |
fwereade | wrtp, yeah, that's the heart of it | 09:48 |
fwereade | wrtp, whenever the uniter runs a hook it just gives the client a shell instead | 09:48 |
wrtp | fwereade: | 09:49 |
wrtp | 2012/09/13 09:42:35 JUJU HOOK some hook output | 09:49 |
wrtp | ? | 09:49 |
wrtp | fwereade: i don't think we need to know if it was written to stdout or stderr | 09:50 |
fwereade | wrtp, sounds reasonable | 09:50 |
wrtp | fwereade: i might just go and do that to scratch my itch. | 09:51 |
fwereade | wrtp, +1 | 09:51 |
fwereade | wrtp, btw, I think I have some idea why I wanted the agent tools to go in the agent tools dir -- because the uniter does write to it | 09:53 |
wrtp | fwereade: i don't think that's a problem. a given version of the uniter will always write the same thing, no? | 09:54 |
fwereade | wrtp, yeah, indeed, I'm not sure it's really a big deal | 09:54 |
wrtp | fwereade: ah, is it likely the reason for the failure is that the PATH isn't set up yet? | 10:04 |
fwereade | wrtp, ha! | 10:04 |
fwereade | wrtp, yes, very likely indeed | 10:04 |
wrtp | fwereade: hook output logging is done BTW, with the small matter of a few tests to go | 10:05 |
wrtp | fwereade: very happy with how easy it was to navigate the uniter code | 10:05 |
fwereade | wrtp, sweet :D | 10:24 |
fwereade | wrtp, sorry, I think I'm being dense... what's the use case for the unit watcher? | 10:46 |
wrtp | fwereade: the test uses it to watch for the agent tools version changing | 10:46 |
wrtp | fwereade: it can also be used to watch for the unit status changing | 10:47 |
wrtp | fwereade: otherwise we'll have to poll, which seems wrong. | 10:47 |
fwereade | wrtp, hmm, I'm -1 on test-only code polluting the API, where we can help it | 10:47 |
fwereade | wrtp, I'm comfortable polling in tests (although ofc I do appreciate it when I don't have to) | 10:47 |
wrtp | fwereade: i don't think it's strictly test only. we could easily have some client functionality that wants to watch some aspect of a unit. | 10:48 |
fwereade | wrtp, however, it doesn't cover status changes properly (down?) | 10:48 |
fwereade | wrtp, I dunno, I feel we have too much speculative code in state anyway | 10:49 |
wrtp | fwereade: FWIW we've already got MachineWatcher which is basically the same thing | 10:50 |
fwereade | wrtp, is MachineWatcher used? (or does it have a concrete use case in mind?) | 10:51 |
wrtp | fwereade: it's used to watch the instance id for one | 10:51 |
fwereade | wrtp, ok, so it has a use, and I'm fine with that (although *really* that STM like a job for a more specialized watcher) | 10:51 |
wrtp | fwereade: this was at niemeyer's request. i started with a more specialized watcher. | 10:52 |
fwereade | wrtp, ha, ok | 10:52 |
wrtp | fwereade: i think the UnitWatcher is sufficiently general that it doesn't really count as test-only code, even though that's how we're using it currently. | 10:53 |
fwereade | wrtp, I kinda feel a watcher that doesn't fire on certain important changes is a bit of a bad thing though | 10:53 |
wrtp | fwereade: we will eventually use it to do major-version upgrades | 10:53 |
fwereade | wrtp, if it's meant to watch status, it ought to actually watch status | 10:53 |
wrtp | fwereade: yeah, maybe it should watch the alive status too. | 10:54 |
fwereade | wrtp, a ToolsVersionWatcher that can handle both units and machines, and strips out unwanted changes, feels much cleaner to me | 10:54 |
wrtp | fwereade: i agree, but i've been told that's wrong. so here we are. | 10:55 |
fwereade | wrtp, (btw, tangentially related: with the provisioner bool in machine, can we drop the PA now and just run the appropriate workers in the MA?) | 10:56 |
fwereade | wrtp, bah | 10:56 |
fwereade | wrtp, this is the trouble with the "we should do more reviews ourselves" thing | 10:56 |
wrtp | fwereade: yes, that's on the todo list, maybe today | 10:56 |
wrtp | fwereade: i know | 10:56 |
fwereade | wrtp, sweet | 10:56 |
wrtp | fwereade: what do you think is the right behaviour for logging output from hooks that leave something printing in the background after they exit? | 11:23 |
fwereade | wrtp, hum, I rather feel that hooks probably shouldn't do that, but good question | 11:24 |
wrtp | fwereade: i'm tempted to throw away any output after the hook exits | 11:24 |
wrtp | fwereade: that way at least it's clean and we can't get any intermingling | 11:24 |
fwereade | wrtp, yeah, I'd assumed it would just be a CombinedOutput | 11:24 |
fwereade | wrtp, wait a mo | 11:24 |
wrtp | fwereade: i don't want to use CombinedOutput as i want to log lines as they happen | 11:25 |
fwereade | wrtp, won't plain hook output go to logdir/agentname.out anyway? | 11:25 |
wrtp | fwereade: why would it? | 11:26 |
* fwereade goes to peer at docs | 11:26 | |
fwereade | wrtp, ok, it wouldn't | 11:27 |
fwereade | wrtp, FWIW, log-until-it-ends matches the python | 11:30 |
fwereade | wrtp, but I agree that logging lines as they happen is a good thing | 11:30 |
fwereade | wrtp, (that is also what py does) | 11:30 |
fwereade | wrtp, bbs, updates need restart | 11:30 |
fwereade | rogpeppe, btw, when you said the machine/unit watchers would be used for major version upgrades... how? | 12:00 |
fwereade | rogpeppe, ah ok, sorry, now I see | 12:00 |
rogpeppe | fwereade: cool, np | 12:00 |
fwereade | rogpeppe, except | 12:00 |
* fwereade thinks | 12:00 | |
fwereade | rogpeppe, ok, so we upgrade everything, and everything knows not to hit state (apart from that bit?) until after the state itself has been upgraded? | 12:02 |
rogpeppe | fwereade: that's right | 12:02 |
rogpeppe | fwereade: well, actually we probably make a new state server | 12:03 |
fwereade | rogpeppe, doesn't that put quite a major restriction on the sort of state changes we can actually make? | 12:03 |
rogpeppe | fwereade: i'm not sure of the details | 12:03 |
rogpeppe | fwereade: how do you mean? | 12:03 |
fwereade | rogpeppe, well, where's it going to write the change? in the old document (created/managed by old code)? or in the new document, which doesn't exist yet and is on a different server? | 12:06 |
rogpeppe | fwereade: where is what going to write the change? | 12:07 |
rogpeppe | fwereade: the upgrader? | 12:07 |
fwereade | rogpeppe, yeah | 12:07 |
fwereade | rogpeppe, oh, wait, it writes the code version before it's running the new code, right? | 12:07 |
fwereade | rogpeppe, feels a bit off somehow | 12:07 |
rogpeppe | fwereade: it uploads the new tools, writes the tools version, waits until all agents are in "pending" state, transforms the database, then triggers all the agents to have at it | 12:09 |
rogpeppe | fwereade: by pending, i mean "major-upgrade-pending" of course | 12:09 |
fwereade | rogpeppe, ok, thanks :) | 12:10 |
fwereade | Aram, pre-review delivered on https://codereview.appspot.com/6492110/ -- but I'm worried I've got completely the wrong end of the stick | 12:18 |
fwereade | Aram, I thought the one thing we *weren't* meant to be notified of was a remove? (except when the watcher happens to have missed a Dead, in which case we should get a Dead change..?) | 12:19 |
Aram | fwereade: I only implemented current behavior, not what we want it to be in the future. | 12:21 |
Aram | you are right about the future. | 12:21 |
fwereade | Aram, my concern is that the future is *now* | 12:22 |
fwereade | Aram, AIUI we dropped the life stuff in state so we could focus on it in mstate | 12:23 |
Aram | we can never switch, nor test everything if the API is not the same though. | 12:24 |
fwereade | Aram, but this API is useless | 12:24 |
fwereade | Aram, as is the one in state | 12:24 |
fwereade | Aram, actually, how many of these watchers have non-test clients currently? | 12:27 |
fwereade | Aram, I presume that at least the firewaller/provisioner/machiner does use some of them, maybe most of them | 12:28 |
Aram | white:juju-core$ lsr | egrep 'go$' | egrep -v 'state1|mstate|test' | xargs egrep 'Watch[A-Z][A-Za-z]*' | wc | 12:28 |
Aram | 64 394 5470 | 12:28 |
Aram | white:juju-core$ lsr | egrep 'go$' | egrep -v 'state1|mstate|test' | xargs egrep '[A-Z][A-Za-z]*Change' | lsr | egrep 'go$' | egrep -v 'state1|mstate|test' | xargs egrep '[A-Z][A-Za-z]+Change[^A-Za-z]' | wc | 12:30 |
Aram | 206 1304 18589 | 12:30 |
Aram | nah, that's wrong | 12:31 |
Aram | white:juju-core$ lsr | egrep 'go$' | egrep -v 'state|mstate|test' | xargs egrep '[A-Z][A-Za-z]*Change[^A-Za-z]' | wc | 12:31 |
Aram | 8 52 792 | 12:31 |
Aram | white:juju-core$ | 12:31 |
Aram | white:juju-core$ | 12:31 |
Aram | white:juju-core$ lsr | egrep 'go$' | egrep -v 'state|mstate|test' | xargs egrep 'Watch[A-Z][A-Za-z]*' | wc | 12:31 |
Aram | 22 78 1799 | 12:32 |
fwereade | Aram, ok, so the plan is to implement broken watchers and only fix them once we've finished the switch (this is not *necessarily* a criticism, but I am disappointed if that's actually our only path to where we need to be) | 12:33 |
Aram | I don't see how it can be any other way. we use watchers, therefore we can't change our client code because we don't have "fixed" watchers. I can't do "fixed" watchers first since we would not have an easy path to transition to mstate, since client code would want the "broken" watchers. | 12:34 |
Aram | to make the transition fluid the APIs have to be the same. | 12:35 |
Aram | the real fix would have been to have fixed watchers in state as well. | 12:35 |
fwereade | Aram, yeah, I guess so, I can't come up with anything better :( | 12:35 |
* fwereade quietly freaks out in a corner somewhere | 12:36 | |
* fwereade tries not to disturb anyone else | 12:36 | |
fwereade | Aram, well, hmm, actually I'm starting to feel it *would* be better to break what we currently have working in the interest of getting some real code running against mstate | 12:37 |
fwereade | Aram, but... I suspect this is the cowboy-nature coming to the fore, and I should probably shut up | 12:38 |
rog | fwereade: https://codereview.appspot.com/6494132 | 12:43 |
rog | fwereade: i'm becoming convinced that stopping logging at the end of a hook is not right | 12:43 |
fwereade | rog, I'm becoming convinced that if we get output after the end of a hook we should spam the user with UR DOIN IT WRONG messages | 12:44 |
rog | fwereade: i think there's potentially great benefit in having all logs funnel into one place | 12:44 |
rog | fwereade: as it is, every server that anyone starts logs to a.n.other log file somewhere | 12:44 |
rog | fwereade: i remember having difficulty with that when writing a charm | 12:45 |
fwereade | rog, and I think that any sane service will have its own logging, and we are not in the business of indulging craziness like expecting hook child processes to work properly :) | 12:45 |
fwereade | rog, isn't that a job for logging charms? | 12:45 |
fwereade | rog, if I'm looking in the juju logs I care about juju | 12:46 |
rog | fwereade: i think logging should be more fundamental to the system than that | 12:46 |
fwereade | rog, if I care about what (say) riak is doing, surely I can look at riak's logs | 12:46 |
rog | fwereade: if i'm a deploying a charm, i care about juju *and* what my stuff is doing, because the two interact | 12:46 |
rog | fwereade: we want to be able to see a time line of the whole system if possible | 12:46 |
rog | fwereade: i'm pretty sure that doing this right is a major key to getting juju easier to use and debug. | 12:48 |
fwereade | rog, that is fine by me :) | 12:57 |
rog | fwereade: there are various web services that do that at low cost | 12:57 |
fwereade | rog, having juju agents themselves mediate that sending directly feels bad though | 12:57 |
rog | fwereade: perhaps you're right. | 12:58 |
fwereade | rog, I'm not actually against logging, I just feel like there is more than one concern here (even if not quite as many as 2, IYSWIM) | 13:02 |
rog | fwereade: although when a daemon dies 1 second after you've started it in the background, saying "cannot do something", i didn't think so | 13:02 |
rog | fwereade: perhaps we could provide a standard way to say "watch this log file please" | 13:02 |
rog | fwereade: though i suppose logger(1) isn't bad for that | 13:02 |
fwereade | rog, feels to me like a feature tailor-made for a relation with eg rsyslog | 13:02 |
rog | fwereade: sadly subordinate charms don't work too well for logging currently | 13:02 |
fwereade | rog, ha, ok, I haven't investigated :( | 13:02 |
fwereade | rog, why not? | 13:02 |
rog | fwereade: you can't guarantee that the subordinate has started before the principal | 13:02 |
rog | fwereade: which i think you want if you want to set up a logging destination so that every message from the principal is logged. | 13:02 |
fwereade | rog, it doesn't sound like it would be *too* hard for a logging charm to catch up with messages from the past, but maybe that's crazy, it could certainly have pathological behaviour in some situations | 13:04 |
rog | fwereade: FWIW i think that we *should* guarantee that if a service has subordinates, their start hooks should complete before the principal's install hook gets run. | 13:04 |
rog | fwereade: depends if the messages are going onto disk or straight through the network | 13:04 |
fwereade | rog, the best we can do is guarantee that happens *some* of the time, though, right? | 13:04 |
fwereade | rog, why would they be going across the network before the subordinate is configured to send them there? | 13:05 |
rog | fwereade: of course if you add a subordinate after deploying a unit of service we can't guarantee it :-) | 13:05 |
rog | fwereade: it might not be possible to switch destinations mid-flow. ok, i admit i don't know anything about syslog :-) | 13:05 |
fwereade | rog, nor do I really :) | 13:06 |
rog | fwereade: but as a general point, i think it would make subordinates much more useful | 13:06 |
rog | fwereade: because they could be used to configure some aspect of the system for the principal, *before* the principal starts | 13:06 |
rog | fwereade: for instance, to tweak kernel parameters | 13:07 |
fwereade | rog, I still don't really see why it matters for the principal to run in an unhelpful mode for a few seconds before being bounced by the subordinate when it comes up | 13:07 |
fwereade | rog, and the subordinates will always have to be written so as to be able to deal with suddenly being deployed against a running unit anyway, surely? | 13:08 |
rog | fwereade: you're assuming that whatever it is the subordinate is setting up can be changed *after* the principal starts | 13:08 |
rog | fwereade: sure - some subordinates might not be able to work in that scenario | 13:08 |
fwereade | rog, I dunno, that level of interference feels to me like it would be better addressed by letting people launch on custom images | 13:09 |
rog | fwereade: it's a combinatorial problem - there are n-dimensions of tweaks, and we can't provide images that address all points in the space. | 13:10 |
fwereade | rog, I'm not suggesting that *we* should, I'm just suggesting that allowing people a mechanism for choosing their own images addresses this use case without introducing a whole new class of subordinate charms (ie these ones that *cannot* be started after the principal) | 13:11 |
rog | fwereade: i did think of a way around it using the current system primitives BTW | 13:12 |
rog | fwereade: i might have mentioned it before | 13:12 |
* fwereade isn't sure whether or not he can hear a bell ringing | 13:12 | |
fwereade | rog, go on :) | 13:12 |
rog | fwereade: a "sync" interface | 13:13 |
* fwereade can hear a bell now, but can't remember how it was meant to work | 13:14 | |
rog | fwereade: a principal provides it; a subordinate requires it; the principal only starts when it gets a "sync done" message from the subordinate | 13:14 |
fwereade | rog, how does the unit know whether or not it's got a subordinate when it runs its start hook in the first place | 13:15 |
fwereade | ? | 13:15 |
fwereade | rog, and can we have 2 synced charms? | 13:15 |
rog | fwereade: you'd set "required-subordinate-count" in the service config, or something like that | 13:15 |
fwereade | rog, hmmm, and how do we deal with people deploying syncing subordinates when the services are already running? | 13:16 |
fwereade | rog, sorry, I'm way off on a derail here | 13:17 |
rog | fwereade: i don't think that's a problem actually, but yeah | 13:17 |
rog | fwereade: it's an interesting example, i think | 13:17 |
fwereade | rog, yeah, but I should be writing actual code right now ;p | 13:18 |
rog | fwereade: indeed | 13:18 |
rog | fwereade: so i'll leave the logging as it is for now. review appreciated :-) | 13:18 |
fwereade | rog, oh yeah, that was why I started talking about this ;P | 13:18 |
fwereade | rog, LGTM | 13:23 |
rog | fwereade: thanks | 13:23 |
rog | niemeyer: morning! | 13:29 |
niemeyer | rog: Morning! | 13:29 |
rog | niemeyer: i got the uniter one stage closer this morning - the unit upgrading worked. | 13:29 |
niemeyer | rog: Woah! | 13:30 |
rog | niemeyer: and it failed because the PATH was not set up rather than anything else, i believe | 13:30 |
niemeyer | rog: Aw.. almost there then :) | 13:30 |
rog | niemeyer: although charm stdout was lost, so i don't know (hence this morning's CL: https://codereview.appspot.com/6494132) | 13:30 |
niemeyer | rog: Cool, looking | 13:31 |
niemeyer | rog: Hmm | 13:33 |
rog | niemeyer: hmm? | 13:33 |
niemeyer | rog: Why are we seeing this issue? When a process exits we can generally determine deterministically that we have all its output | 13:34 |
rog | niemeyer: no we can't | 13:34 |
rog | niemeyer: i know this because i see the issue all the time | 13:34 |
niemeyer | rog: Ah, I see the issue, backgrounding.. ok | 13:35 |
rog | niemeyer: actually it happens with no backgrounding at all | 13:35 |
fwereade | hey, I just noticed: we just hit r500 :) | 13:37 |
rog | niemeyer: or... it's possible that os/exec works around the problem for us actually | 13:37 |
niemeyer | rog: http://paste.ubuntu.com/1202638/ | 13:38 |
niemeyer | rog: How does that work? | 13:38 |
rog | niemeyer: good question; i'm looking in the source | 13:38 |
niemeyer | rog: How does it know to stop reading, more specifically | 13:38 |
niemeyer | rog: Cool | 13:38 |
niemeyer | rog: My understanding is that if a process has exited, we necessarily have all output that came from *it* into buffers | 13:39 |
niemeyer | rog: So we can simply read | 13:39 |
niemeyer | rog: If there's more, there's backgrounding going on that may be considered misbehaving | 13:39 |
rog | niemeyer: do we specifically disallow starting any background process from a hook without redirecting its stdout & stderr? | 13:40 |
niemeyer | rog: This is bad practice, but I don't think we care in this case | 13:41 |
niemeyer | rog: What we care, I think, is that we're logging everything the hook ever gave us by itself | 13:42 |
rog | niemeyer: we *do* still have the issue, i think | 13:42 |
niemeyer | rog: So what *is* the issue? :) | 13:42 |
rog | niemeyer: the issue is that the process can write some stuff into the pipe, exit, we see the exit, but the pipe still has data in | 13:43 |
niemeyer | rog: I think that's exactly what I mentioned above | 13:43 |
niemeyer | niemeyer> rog: My understanding is that if a process has exited, we necessarily have all output that came from *it* into buffer | 13:43 |
rog | niemeyer: it happens because we're using a pipe rather than sending the data into a local buffer | 13:43 |
rog | niemeyer: in this case, we're not using a local buffer. | 13:43 |
rog | niemeyer: an alternative would be to block forever waiting for EOF, which is what would happen if we used CombinedOutput, for example | 13:44 |
rog | niemeyer: i prefer to be a little more robust, i think. | 13:44 |
=== hazmat is now known as 45PAA1GWO | ||
niemeyer | rog: Hm.. I don't understand.. those two things are exactly the same | 13:46 |
niemeyer | rog: The only way to send data into local buffer is by using a pipe | 13:46 |
niemeyer | rog: We certainly have to finish reading after the process has exited, not before | 13:46 |
niemeyer | rog: Okay.. (?) | 13:46 |
niemeyer | rog: I'd prefer to be robust, but meaningful.. (?) | 13:46 |
niemeyer | rog: What is the actual symptom of the problem? | 13:46 |
rog | niemeyer: we'd see log messages produced by the hook after RunHook had returned | 13:46 |
rog | niemeyer: for a few microseconds at most | 13:47 |
niemeyer | rog: How can the hook produce output after it has returned? | 13:47 |
niemeyer | rog: Sorry, I'm pretty confused by now | 13:47 |
rog | niemeyer: because there's data still in the pipe | 13:47 |
niemeyer | rog: It's not *producing* output | 13:47 |
niemeyer | rog: There's data in buffers.. that's normal unix behavior | 13:47 |
rog | niemeyer: no, it's produced it, but we haven't read it yet | 13:48 |
niemeyer | rog: Exactly.. that's handled without arbitrary delays | 13:48 |
rog | niemeyer: so we'll probably get EOF immediately and everything is hunky dory | 13:48 |
rog | niemeyer: but if we always wait for EOF then we get bitten if there *is* a background process that's keeping the pipe open | 13:48 |
rog | niemeyer: we could make the delay 2s or so if that would make you happier | 13:49 |
niemeyer | rog: We don't get bitten, because we don't have to wait.. | 13:49 |
rog | niemeyer: i'm afraid we do. i'm just writing some code to demo the issue. | 13:49 |
niemeyer | rog: Thanks, that'd be awesome | 13:49 |
rog | niemeyer: essentially the OS doesn't guarantee that the pipe has been drained when wait(2) returns | 13:51 |
niemeyer | rog: The pipe may never get drained if there's a process holding it | 13:52 |
rog | niemeyer: exactly | 13:52 |
niemeyer | rog: We only care that the process has exited, and we can't read anything from it anymore | 13:52 |
rog | niemeyer: the process can exit *and* there still be data in the pipe. | 13:53 |
rog | niemeyer: which we can read. | 13:53 |
rog | niemeyer: anyway, one mo | 13:53 |
niemeyer | rog: Then we read it.. (!?) | 13:53 |
rog | niemeyer: yeah, but we need to wait for it to be read, hence the <-logger.done | 13:54 |
niemeyer | rog: What I mean is that we don't have to put a boundary on the reader to output its data in 50ms | 13:56 |
niemeyer | rog: There's a deterministic event happening.. | 13:56 |
rog | niemeyer: only if there's no background process | 13:56 |
niemeyer | rog: Always | 13:56 |
niemeyer | rog: "I can't read data now" is deterministic | 13:57 |
rog | niemeyer: how do we know when that event has completed? | 13:57 |
niemeyer | rog: No matter if there's something in background or not | 13:57 |
rog | niemeyer: we know there *might* be some data, but we don't have any way of knowing for sure, or how much there is | 13:57 |
niemeyer | rog: Hence, "I can't read data now" | 13:57 |
niemeyer | rog: If we can, doesn't matter how much there is, or how much time has passed, we can read it | 13:58 |
rog | niemeyer: how do we know "I can't read data now"? | 13:58 |
rog | niemeyer: there's no non-blocking read | 13:58 |
niemeyer | rog: Because the system call tells us that? | 13:58 |
niemeyer | rog: Oh, there isn't? | 13:58 |
rog | niemeyer: i don't think so | 13:58 |
* niemeyer looks | 13:58 | |
rog | niemeyer: we could probably use syscall.SetNonblock | 14:00 |
rog | niemeyer: but i'm reluctant to do that | 14:00 |
niemeyer | This sucks :( | 14:00 |
rog | niemeyer: tbh i don't mind waiting a bit longer if someone starts a process in the background and that triggers the timeout | 14:02 |
rog | niemeyer: given they're not meant to be doing that anyway | 14:03 |
niemeyer | rog: Well, the consequence isn't great.. besides the lack of determinism (we can be closing the door on the logger before we've read the *real output*) we're holding resources, perhaps consecutively, for things that may not exit | 14:04 |
rog | niemeyer: that's all true | 14:04 |
niemeyer | rog: Anyway, thanks for the coverage | 14:06 |
niemeyer | rog: It's a great workaround and we should definitely move on with it for now | 14:06 |
rog | niemeyer: np. i'm not sure what you'd like to do about the issue though. | 14:07 |
rog | niemeyer: ok, cool | 14:07 |
niemeyer | rog: I'll finish the review so we can integrate this | 14:07 |
rog | niemeyer: thanks | 14:07 |
rog | niemeyer: FWIW i verified that CombinedOutput does block until EOF, even if the process has exited. | 14:10 |
niemeyer | rog: Cheers. That's indeed the right thing for it to do | 14:10 |
rog | niemeyer: yup | 14:10 |
rog | niemeyer: but not what we want, i think | 14:11 |
niemeyer | ( echo foo; sleep 5 & echo bar) | cat | 14:11 |
rog | niemeyer: yup | 14:11 |
niemeyer | This is the issue, in summary :) | 14:11 |
niemeyer | rog: Agreed | 14:11 |
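The workaround being agreed on, in rough code (the helper is hypothetical; only the `os`, `os/exec`, `bufio`, and `time` calls are real APIs; the exact grace period is discussed just below):

```go
package main

import (
	"bufio"
	"log"
	"os"
	"os/exec"
	"time"
)

func runHook(path string) error {
	cmd := exec.Command(path)
	r, w, err := os.Pipe()
	if err != nil {
		return err
	}
	cmd.Stdout = w
	cmd.Stderr = w // stdout and stderr interleaved; we don't distinguish them
	if err := cmd.Start(); err != nil {
		r.Close()
		w.Close()
		return err
	}
	w.Close() // keep only the child's copy of the write end
	done := make(chan struct{})
	go func() {
		defer close(done)
		br := bufio.NewReader(r)
		for {
			line, _, err := br.ReadLine()
			if err != nil {
				return // EOF once every writer has gone away
			}
			log.Printf("JUJU HOOK %s", line)
		}
	}()
	err = cmd.Wait()
	// The hook has exited, but its last writes may still sit in the
	// pipe -- and a backgrounded child may hold the write end open
	// forever -- so wait briefly for EOF rather than indefinitely.
	select {
	case <-done:
	case <-time.After(100 * time.Millisecond):
	}
	r.Close() // a reader still blocked on the pipe now returns an error
	return err
}

func main() {
	if err := runHook("./hooks/start"); err != nil {
		log.Printf("hook failed: %v", err)
	}
}
```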
niemeyer | Gosh, and the bank calls me again.. the government is asking for new docs for the exchange contracts | 14:30 |
niemeyer | Just what I was asking for | 14:30 |
niemeyer | rog: Done, sorry for the delay | 14:39 |
rog | niemeyer: tyvm | 14:39 |
rog | niemeyer: do you think we should just truncate long lines? | 14:40 |
rog | niemeyer: or should we send them as several log messages? | 14:41 |
niemeyer | rog: I don't think we need to do much, to be honest | 14:43 |
niemeyer | rog: ReadLine already truncates them | 14:43 |
rog | niemeyer: so you'd go with that? | 14:43 |
rog | niemeyer: (and ignore continuation lines) | 14:44 |
niemeyer | rog: No, just log them | 14:44 |
rog | niemeyer: ok, so we'll split lines | 14:44 |
rog | niemeyer: fair enough | 14:44 |
niemeyer | rog: Yeah, ReadLine already does it.. | 14:45 |
rog | niemeyer: yeah - it's always a difficult call whether to discard continuations of a split line though | 14:45 |
rog | niemeyer: i'm mildly tempted to print [%d bytes truncated at end of line] | 14:46 |
rog | niemeyer: in case someone cats an executable or something | 14:46 |
niemeyer | rog: It's not truncated | 14:47 |
rog | niemeyer: but... if someone's printing a large lump of json, you probably want all of it, split or not | 14:47 |
niemeyer | rog: ReadLine will return on the next line | 14:47 |
niemeyer | rog: Ah, I see, you're thinking about changing the behavior | 14:47 |
rog | niemeyer: yeah, i'm concerned about changing the meaning of log messages | 14:48 |
rog | niemeyer: but i can make the buffer pretty big, so it's really not much of an issue | 14:48 |
niemeyer | rog: Change the meaning of log messages? | 14:48 |
niemeyer | rog: What's the meaning of log messages? :) | 14:48 |
rog | niemeyer: well, if we're grepping out messages by prefix (not uncommon), then we'll miss continuation lines | 14:49 |
rog | niemeyer: but i think it's common to have some line-length restriction, and better that than lose information | 14:50 |
niemeyer | rog: Grepping for prefixes would still work fine, I believe | 14:50 |
rog | niemeyer: i'll use a buffer size of 128K or something | 14:51 |
niemeyer | rog: Why? Please just keep the default | 14:51 |
rog | niemeyer: not if the prefix is part of the message logged, i believe | 14:51 |
niemeyer | rog: The prefix will end up in the same place | 14:51 |
rog | niemeyer: yes, but if we grep for it, the continuation lines won't have it | 14:51 |
niemeyer | rog: Yes, they won't, because they are continuation lines that don't have the prefix.. duh? | 14:52 |
rog | niemeyer: so our grep output won't have all messages with that prefix | 14:52 |
niemeyer | rog: Man | 14:52 |
niemeyer | rog: Okay, please do whatever | 14:52 |
rog | niemeyer: it depends how people are using the logging. 4K is probably well sufficient for almost all purposes, i'd guess | 14:53 |
rog | niemeyer: so i'll leave the default | 14:53 |
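The splitting behaviour relied on here comes straight from bufio; a minimal demonstration (the 16-byte buffer is only to force a split):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

func main() {
	// ReadLine reports isPrefix==true when a line overflows the buffer,
	// so an over-long line arrives as several consecutive log messages
	// instead of being dropped or blocking the reader.
	br := bufio.NewReaderSize(strings.NewReader("a very long line indeed\n"), 16)
	for {
		line, isPrefix, err := br.ReadLine()
		if err != nil {
			break
		}
		fmt.Printf("JUJU HOOK %s (continued: %v)\n", line, isPrefix)
	}
}
```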
fwereade | niemeyer, https://codereview.appspot.com/6496120 should be near-enough trivial, assuming you're comfortable with the approach in the short/medium term | 14:57 |
rog | niemeyer: "s/0.1/0.2/ probably, given the internal change?" - is there a timeout change you intended to suggest but didn't? | 14:58 |
niemeyer | rog: Huh.. indeed | 15:03 |
niemeyer | rog: I typed it, but it got eaten somehow | 15:03 |
niemeyer | rog: I was suggesting a small bump to 100ms | 15:04 |
rog | niemeyer: sounds good | 15:04 |
niemeyer | rog: Cheers | 15:04 |
niemeyer | fwereade: Looking | 15:04 |
fwereade | brb, cig | 15:04 |
fwereade | b | 15:09 |
niemeyer | fwereade: Seems mostly sane | 15:11 |
niemeyer | fwereade: There's a small bit that is feeling contrived that sounds easy to simplify | 15:11 |
fwereade | niemeyer, ha, now that's faint praise ;p | 15:11 |
fwereade | niemeyer, cool | 15:11 |
fwereade | niemeyer, I kept finding myself wanting to extract a whole new type but it doesn't quite feel like the right time yet | 15:12 |
niemeyer | fwereade: We're sending (dataDir, agentName) aaaaall the way down to EnsureJujuSymlinks, so it can call AgentToolsDir, and then we send that information aaaaaall the way up so we can use it | 15:12 |
niemeyer | fwereade: When all these functions really care about is the directory, not the dataDir, not the agentName | 15:13 |
fwereade | niemeyer, fair point... so, make EJC just take a dir then? or do I misapprehend? | 15:14 |
niemeyer | fwereade: Yeah, both ensureFs and EJC can take the toolsDir themselves | 15:15 |
niemeyer | fwereade: and we can generate that info at the top level by calling it explicitly from NewUniter | 15:15 |
niemeyer | fwereade: Neither of them, I think, care about dataDir or agentName | 15:15 |
fwereade | niemeyer, I had considered dropping ensureFs entirely, how would you feel about that? -- just ensure the state dir, and EJC the toolsDir directly? | 15:16 |
niemeyer | fwereade: +1 | 15:16 |
fwereade | niemeyer, cool, on it :) | 15:16 |
niemeyer | fwereade: The other thing I quickly pondered was whether we should just stick filepath.Dir(os.Args[0]) in the PATH rather than passing toolsDir into RunHooks, but I'm not so sure about that | 15:17 |
fwereade | niemeyer, not so keen really | 15:18 |
niemeyer | fwereade: Cool | 15:20 |
rog | niemeyer: i exported var LineBufferSize to make it easy for the tests to check line splitting. hope that seems ok, before i submit. | 15:24 |
rog | niemeyer: | 15:24 |
rog | // LineBufferSize holds the maximum length of a line read | 15:24 |
rog | // from a hook. Lines longer than this will overflow into | 15:24 |
rog | // subsequent log messages. | 15:24 |
rog | var LineBufferSize = 4096 | 15:24 |
rog | niemeyer: oops, i meant to paste this: https://codereview.appspot.com/6494132 | 15:24 |
rog | actually, it could easily be const | 15:24 |
niemeyer | rog: It doesn't have to be public as well | 15:26 |
rog | niemeyer: ok, i'll add export_test.go | 15:26 |
niemeyer | rog: Ugh.. why? | 15:26 |
rog | niemeyer: because the testing code needs access to the value | 15:26 |
rog | niemeyer: so it can verify the split easily | 15:26 |
niemeyer | rog: Just hardcode the number in the test as well.. this is not nearly as interesting as it sounds :) | 15:27 |
rog | niemeyer: ok; thought you might not like that :-) | 15:27 |
rog | fwereade: i'm seeing sporadic uniter test failures: http://paste.ubuntu.com/1202839/ | 15:28 |
fwereade | rog, aw poo, I'll take a look | 15:28 |
* fwereade is baffled, offhand | 15:30 | |
fwereade | rog, I'll investigate more closely in a mo | 15:30 |
rog | fwereade: i only get it when i do go test ./... (i.e. it seems like a race of some kind) | 15:30 |
fwereade | rog, thanks, good to know | 15:30 |
* niemeyer => lunch! | 15:33 | |
rog | fwereade: i've just reproduced the behaviour in trunk (i was slightly concerned it was a change i'd made). it happens about 30% of the time for me. | 15:43 |
fwereade | rog, excellent -- I have not seen that one myself, but I'll see if I can figure it out | 15:44 |
fwereade | rog, ...but... maybe tomorrow :( cath kinda needs support asap, she's been ill all day and laura's been around too | 15:45 |
rog | fwereade: np | 15:45 |
rog | fwereade: go tend your flock :-) | 15:45 |
fwereade | rog, I'm hanging on for dear life to repropose one branch, then that's probably it for me | 15:45 |
rog | fwereade: do you mind if i make the PATH change? | 15:46 |
fwereade | rog, kinda, because that's what I'm proposing ;) | 15:46 |
rog | fwereade: ah, do it, DO IT! | 15:46 |
fwereade | rog, just forcing myself to witness a full test run ;) | 15:46 |
rog | fwereade: aw, i know that feeling very well | 15:47 |
rog | fwereade: live? | 15:47 |
fwereade | rog, afraid not, I should probably do that too, but for *that* I definitely don't have time :( | 15:48 |
rog | fwereade: 'sok, i'm sure local is fine | 15:48 |
fwereade | rog, https://codereview.appspot.com/6496120 and goodnight :) | 15:53 |
rog | fwereade: nnight. hope cath gets better before you have to leave... | 15:56 |
niemeyer | rog: ping | 16:53 |
rog | niemeyer: pong | 16:53 |
niemeyer | rog: Oh, hmm | 16:53 |
niemeyer | rog: I guess we'll need to talk anyway, given fwereade's comment | 16:53 |
niemeyer | rog: Was just looking at https://codereview.appspot.com/6498124/ | 16:53 |
niemeyer | rog: and pondering about where mstate is | 16:54 |
niemeyer | rog: But we'll need further info it seems | 16:54 |
rog | niemeyer: i thought about doing mstate but thought i'd leave it until there were more watchers and the pattern was fully established | 16:54 |
niemeyer | rog: I think it'd be fairly equivalent to the machine one | 16:55 |
rog | niemeyer: yeah. i wondered if there was going to be some equivalent of the generic machinery that was in state, where each watcher took fewer lines | 16:55 |
niemeyer | rog: I'd prefer to add it together.. hopefully we can transition entirely next week during the sprint (fingers crossed) | 16:55 |
niemeyer | rog: But we'll need to talk to fwereade anyway | 16:55 |
rog | niemeyer: what do you think about the UnitWatcher as a concept? | 16:56 |
niemeyer | rog: Probably, but we'll need to evolve a bit before that, I suspect at least | 16:56 |
rog | niemeyer: shall i just poll for the time being? | 16:56 |
niemeyer | rog: I personally think it's fine | 16:56 |
niemeyer | rog: We'll need to watch it anyway for other reasons, I believe (e.g. life) | 16:56 |
rog | niemeyer: yeah | 16:56 |
rog | niemeyer: given fwereade's remarks, i wondered about making the unit watcher watch the unit's presence too | 16:57 |
rog | niemeyer: then at least it would signal all status changes | 16:57 |
niemeyer | rog: Initial feeling is that this would cross two unrelated issues together | 16:58 |
niemeyer | rog: I'd be curious to know more about his feelings | 16:58 |
rog | niemeyer: the difficulty is that you watch *almost* all unit status changes. | 16:58 |
rog | niemeyer: but yes, i tend to agree. | 16:59 |
niemeyer | rog: Why did you need it again, for reference? | 17:00 |
rog | niemeyer: for watching the agent version | 17:00 |
rog | niemeyer: so that i can tell when the unit agent upgrades | 17:00 |
niemeyer | rog: Ah, for testing only, ok | 17:00 |
rog | niemeyer: yes. although when we do major version upgrades, we'll probably need it too | 17:00 |
niemeyer | rog: Interestingly, I notice we have a bunch of unrelated watchers on the unit.. ports, resolved, .. | 17:04 |
niemeyer | rog: and a bunch of properties that can't be observed | 17:05 |
niemeyer | rog: All of those are very rarely changed.. I wonder if we should move towards observing the unit itself | 17:05 |
rog | niemeyer: yeah. that has made me a bit uncomfortable. we can watch some properties of the unit because they happen to be stored in a certain way. that might not be true in mstate tho' | 17:05 |
rog | niemeyer: what do you mean by that? | 17:06 |
niemeyer | rog: It's even the opposite, which kind of sucks | 17:06 |
niemeyer | rog: We stored them in a certain way because we wanted to observe them | 17:06 |
niemeyer | rog: let's say we then decide to observe an existing property with state (pre-mstate).. we'd be forced to observe the unit node itself | 17:07 |
rog | niemeyer: i'm not that familiar with mstate. what do you mean by the unit "node" ? | 17:07 |
rog | oh, sorry | 17:07 |
rog | pre-mstate | 17:07 |
niemeyer | rog: I mean the document (in mstate) or node (in zk) that contains most unit information | 17:08 |
niemeyer | rog: We've split certain bits out, originally, because we felt it'd be more efficient to observe it that way | 17:08 |
niemeyer | rog: It now feels like this is a poor design | 17:08 |
rog | niemeyer: most but not all (thinking of the presence state, for example) | 17:08 |
niemeyer | rog: Because we're attempting to anticipate every use case | 17:09 |
niemeyer | rog: That's a very particular one | 17:09 |
niemeyer | rog: That will always be special | 17:09 |
rog | niemeyer: true | 17:09 |
niemeyer | rog: Because we'll have a significant volume of small bits of information going on and off regularly | 17:09 |
rog | niemeyer: so you think the unit watcher (in mstate) should observe the entire unit document? | 17:11 |
rog | niemeyer: (i don't even know if that makes sense actually!) | 17:11 |
niemeyer | rog: Right, we have already unified all settings in a single document even | 17:11 |
niemeyer | rog: Because the cost of loading the extra settings is nothing compared to the cost of everything else | 17:12 |
niemeyer | rog: (make query, transfer it, lookup index, lookup data, transfer back, unmarshal in, unmarshal out, blah blah) | 17:12 |
rog | niemeyer: so in fact the unit watcher in mstate *would* automatically see all unit changes, e.g. ports etc? | 17:12 |
niemeyer | rog: Exactly, and so will every other watcher that watches the unit, even if we do have separate watchers | 17:13 |
niemeyer | rog: We can even filter out the specific settings, if we do want that, but again it's pretty irrelevant | 17:13 |
niemeyer | rog: The cost of the operation is not there | 17:13 |
niemeyer | rog: In summary, my point is that I think we could do pretty well with a single unit watcher, instead of tons of per-setting watchers for the unit | 17:14 |
rog | niemeyer: i think it's still worth having typed channels rather than always passing around an object that you must invoke a method on to get the value | 17:14 |
rog | niemeyer: because channels are generic. i had to jump through hoops to watch both the machine agent tools and the unit agent tools with the same code. | 17:15 |
niemeyer | rog: I don't know.. it doesn't feel worth it.. I believe these watchers will each have a single use case in the whole code | 17:15 |
rog | "generic" is maybe a bad way of saying it. | 17:15 |
rog | niemeyer: it also means that every client needs to check if the setting it's after has actually changed | 17:16 |
rog | niemeyer: i think part of the problem is that our watchers are so darn heavyweight | 17:16 |
rog | niemeyer: if we did it right, a new watcher should only be 6 or 7 lines of code, i think. | 17:16 |
niemeyer | rog: Indeed, but instead of checking it, we're doing a lot of heavy lifting to do exactly the same in a generic way | 17:17 |
niemeyer | rog: Even then, it doesn't feel great | 17:17 |
niemeyer | rog: We need a new method on the type, a new type to put on the channel, a new function to handle it, etc etc, for *each* detail we want to watch | 17:18 |
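A rough sketch of the single-unit-watcher idea under discussion, for concreteness: one watcher delivers the whole unit document on any change, and each client inspects the field it cares about. unitDoc and its fields are invented for illustration, not the real mstate schema.

    package main

    import "fmt"

    // unitDoc is a hypothetical stand-in for the single unit document;
    // the field names are illustrative only.
    type unitDoc struct {
        Ports      []int
        Resolved   string
        AgentTools string
    }

    func main() {
        // A single watcher delivers the whole document on any change...
        changes := make(chan unitDoc, 2)
        changes <- unitDoc{AgentTools: "1.0"}
        changes <- unitDoc{AgentTools: "1.1", Resolved: "retry"}
        close(changes)

        // ...and each client checks whether the setting it's after has
        // actually changed, as rog notes above.
        last := ""
        for doc := range changes {
            if doc.AgentTools != last {
                last = doc.AgentTools
                fmt.Println("agent tools changed to", doc.AgentTools)
            }
        }
    }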
rog | niemeyer: part of the problem for me is that interfaces don't work when you've got chan T, but only when you've got T | 17:19 |
niemeyer | rog: That is unfortunate indeed.. | 17:20 |
niemeyer | rog: at the same time, it's not reaaally an issue in this case.. we have type-specific channels in both cases | 17:21 |
rog | niemeyer: if i've got a chan *state.Unit, it's type specific, but it doesn't really represent what i'm watching. | 17:26 |
niemeyer | rog: Well, if you're watching the unit it does represent what you're watching | 17:26 |
rog | niemeyer: this is what i had to do in livetests: http://paste.ubuntu.com/1203085/ | 17:27 |
rog | niemeyer: so i could treat units and machines in the same way | 17:27 |
rog | niemeyer: for the purposes of waiting for them to upgrade | 17:28 |
niemeyer | rog: That seems pretty involved | 17:30 |
niemeyer | rog: Isn't it just a matter of having a function that looks like | 17:30 |
niemeyer | rog: func machineToolsHolder(ch <-chan *Machine) <-chan ToolsHolder | 17:32 |
niemeyer | rog: ? | 17:32 |
niemeyer | rog: To overcome exactly the issue you described above? | 17:32 |
niemeyer | rog: The logic itself is exactly the same for both | 17:32 |
rog | niemeyer: that's essentially what i pasted | 17:33 |
niemeyer | rog: Heh | 17:33 |
niemeyer | rog: Except completely unrelated, right? :) | 17:33 |
niemeyer | rog: I'm saying converting a *channel type* into *another channel type* | 17:33 |
niemeyer | rog: The *logic* is the same | 17:34 |
niemeyer | rog: You're converting a *watcher* into another *watcher* | 17:34 |
rog | niemeyer: well, it stores the machine too, so that the generic code can inspect the error if it wants | 17:34 |
rog | niemeyer: i'm not convinced it would be less code overall | 17:35 |
rog | niemeyer: ah, i see what you mean actually. | 17:36 |
niemeyer | rog: It removes the duplication that you have there, with two watchers that look exactly the same.. | 17:36 |
rog | niemeyer: where MachineToolsHolder is something like: | 17:37 |
rog | type MachineToolsHolder interface { | 17:37 |
rog | AgentTools() (*state.Tools, error) | 17:37 |
rog | } | 17:37 |
niemeyer | Yep | 17:37 |
rog | i mean ToolsHolder of course | 17:37 |
rog | niemeyer: good enough for testing anyway | 17:38 |
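Putting the two halves of this exchange together, a self-contained sketch of the channel-adapter idea: Tools, Machine, and Unit are stand-ins for the state types, and the adapter goroutine's shape is an assumption; only the machineToolsHolder signature comes from the log above.

    package main

    import "fmt"

    // Tools stands in for state.Tools.
    type Tools struct{ Binary string }

    // Machine and Unit stand in for *state.Machine and *state.Unit;
    // both satisfy the same interface.
    type Machine struct{ tools *Tools }

    func (m *Machine) AgentTools() (*Tools, error) { return m.tools, nil }

    type Unit struct{ tools *Tools }

    func (u *Unit) AgentTools() (*Tools, error) { return u.tools, nil }

    // ToolsHolder is the shared interface from the discussion.
    type ToolsHolder interface {
        AgentTools() (*Tools, error)
    }

    // machineToolsHolder adapts a machine-watcher channel to a
    // ToolsHolder channel; its goroutine exits when the source closes,
    // so the same waiting logic can serve machines and units alike.
    func machineToolsHolder(ch <-chan *Machine) <-chan ToolsHolder {
        out := make(chan ToolsHolder)
        go func() {
            defer close(out)
            for m := range ch {
                out <- m
            }
        }()
        return out
    }

    func main() {
        src := make(chan *Machine, 1)
        src <- &Machine{tools: &Tools{Binary: "juju-1.0"}}
        close(src)
        for h := range machineToolsHolder(src) {
            t, _ := h.AgentTools()
            fmt.Println(t.Binary)
        }
    }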
rog | niemeyer: i still think it's a pity that we require that every stage in a channel pipeline have its own tomb. | 17:40 |
rog | niemeyer: but i lost that one ages ago. | 17:40 |
niemeyer | rog: I don't think we require that | 17:45 |
rog | niemeyer: i think we do if we want to maintain the invariant that calling Stop means no more values will be produced on a channel. | 17:45 |
niemeyer | rog: In fact, the mstate watcher mechanism explicitly does not have a per-channel tomb | 17:46 |
niemeyer | rog: What we require in the design, since the beginning, is that we deterministically stop background activity | 17:46 |
rog | niemeyer: are you talking about the MachineWatcher? i should have a look then | 17:46 |
niemeyer | rog: That sounds sane | 17:46 |
niemeyer | rog: No, I'm talking about the watcher foundation and the way that MachineWatcher uses it | 17:47 |
niemeyer | rog: We do not have per-channel tombs there | 17:47 |
rog | niemeyer: so the MachineWatcher doesn't have a tomb? cool! | 17:47 |
niemeyer | rog: Heh | 17:47 |
niemeyer | rog: The MachineWatcher has a tomb, because it has a goroutine that we want to stop deterministically | 17:48 |
rog | niemeyer: i think we'll usually find that every stage in a pipeline is like that | 17:48 |
niemeyer | <niemeyer> rog: No, I'm talking about the watcher foundation and the way that MachineWatcher uses it | 17:48 |
niemeyer | <niemeyer> rog: We do not have per-channel tombs there | 17:48 |
rog | niemeyer: a machine watcher is one stage down from the watcher foundation. | 17:48 |
rog | niemeyer: i don't see the watcher foundation as a pipeline. | 17:48 |
niemeyer | rog: Okay, sorry | 17:49 |
rog | niemeyer: we *can* deterministically stop goroutines without a tomb per-stage if we're prepared to wait on the channel for the goroutine to acknowledge that it's been stopped (by closing the channel) | 17:49 |
rog | niemeyer: i know that was rejected ages ago, but i still think it's a nice way to go about things. | 17:50 |
niemeyer | rog: We can't close a channel from the consumer side | 17:50 |
rog | niemeyer: it means the stop message can percolate deterministically down the pipeline | 17:50 |
rog | niemeyer: each stage in the pipeline can delegate the stop handling to the stage upstream | 17:51 |
rog | niemeyer: when that sees a stop, it closes its output channel, and the close then flows stage-by-stage down to the most downstream channel | 17:52 |
niemeyer | rog: Doesn't work.. it means we have background activity because closing the channel leaves things running | 17:52 |
niemeyer | rog: Deterministically stopping means Stop() returned, nothing else is happening | 17:52 |
niemeyer | rog: It's one of the things we've fixed from Python, and it makes me very happy to be honest | 17:53 |
rog | niemeyer: a stop would be a two-stage process - we close the stop channel, then wait to see the downstream channel closed. | 17:53 |
rog | niemeyer: there's nothing left running | 17:53 |
niemeyer | rog: Heh.. I'm sure you can create many other mechanisms to do the same | 17:53 |
niemeyer | rog: We have one, that works, and works well | 17:54 |
rog | niemeyer: it means it's really heavyweight to do things that should be throwaway pieces of code, IMHO | 17:54 |
rog | niemeyer: the only thing that *needs* to do a select is the head of the stream. | 17:55 |
niemeyer | rog: Sorry, but this isn't helpful at all.. you're building a new design as I mention the requirements, and saying that what we do is heavyweight despite the fact it *works*. | 17:55 |
rog | niemeyer: anyway, sorry, derail | 17:55 |
rog | niemeyer: yes it does | 17:55 |
niemeyer | rog: If you want to reimplement juju again, I won't be around | 17:55 |
rog | :-) | 17:55 |
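For context, a hand-rolled sketch of the deterministic-stop property niemeyer describes: Stop returns only once the worker goroutine has exited, so nothing is left running in the background. The real code uses the tomb package; this stand-in only mimics the pattern, and the names are invented.

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    type watcher struct {
        changes chan int
        dying   chan struct{}
        done    sync.WaitGroup
    }

    func newWatcher() *watcher {
        w := &watcher{
            changes: make(chan int),
            dying:   make(chan struct{}),
        }
        w.done.Add(1)
        go w.loop()
        return w
    }

    func (w *watcher) loop() {
        defer w.done.Done()
        defer close(w.changes)
        for i := 0; ; i++ {
            select {
            case <-w.dying:
                return
            case w.changes <- i:
                time.Sleep(10 * time.Millisecond)
            }
        }
    }

    // Stop returns only after the loop goroutine has exited; after it
    // returns, no more values will appear on w.changes.
    func (w *watcher) Stop() {
        close(w.dying)
        w.done.Wait()
    }

    func main() {
        w := newWatcher()
        fmt.Println(<-w.changes)
        w.Stop()
    }

rog's alternative would replace the per-stage select with close propagation down the pipeline, at the cost of the two-stage stop he describes above.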
rog | niemeyer: i'm off for the evening. see ya tomorrow! | 18:06 |
niemeyer | rog: Have a good time | 18:06 |
rog | niemeyer: when do you set off? | 18:06 |
niemeyer | rog: Sometime Saturday only, I'll be around tomorrow still | 18:07 |
rog | niemeyer: cool | 18:07 |
rog | niemeyer: if you manage to review fwereade's toolsDir branch tonight, that would be awesome. i reckon we can get it running actual commands tomorrow possibly. | 18:08 |
niemeyer | rog: I'm sure it must be ready to go in.. it was really a trivial detail, and it was great | 18:09 |
niemeyer | jimbaker: ping | 18:14 |
jimbaker | niemeyer, hi | 18:14 |
niemeyer | jimbaker: Heya | 18:14 |
niemeyer | jimbaker: I'm wondering about format 2 stuff | 18:14 |
niemeyer | jimbaker: Were you the one pushing it? | 18:14 |
jimbaker | niemeyer, yes | 18:14 |
jimbaker | it's in trunk now | 18:14 |
jimbaker | in terms of the raw string changes | 18:15 |
niemeyer | jimbaker: Okay.. I've noticed there's a "-Not set-" value being used by "config get" | 18:15 |
niemeyer | jimbaker: Have you killed that? | 18:15 |
jimbaker | niemeyer, i didn't notice that in the code or usage... i'll see if it's still there | 18:15 |
niemeyer | jimbaker: Ouch | 18:15 |
niemeyer | jimbaker: config_get.py | 18:16 |
niemeyer_ | jimbaker: Sorry, connection broken | 18:18 |
jimbaker | niemeyer, yes that's unfortunate... i didn't touch juju get in the merged branch, but we should definitely address so it is also raw | 18:19 |
niemeyer_ | jimbaker: It should be nil or something.. "-Not set-" is a perfectly valid *set* value | 18:20 |
jimbaker | niemeyer_, agreed | 18:20 |
jimbaker | i understand that the attempt here is to provide some human-readable info on how the settings have been made (such as the default flag). i do think it would be better if it simply round-tripped with juju set, but that's too late i suppose | 18:23 |
niemeyer_ | jimbaker: Can you please fix it so it's "None" instead? | 18:24 |
jimbaker | niemeyer_, np. so should we also have the new behavior respect format 2 for that service? | 18:24 |
=== niemeyer_ is now known as niemeyer | ||
niemeyer | jimbaker: No, this is client side | 18:25 |
niemeyer | jimbaker: format 2 is about the charm | 18:25 |
jimbaker | niemeyer, i understand | 18:25 |
niemeyer | jimbaker: This is a bug in the juju get command, that I suggest actually fixing | 18:25 |
niemeyer | jimbaker: I'll suggest the same in the Go side | 18:25 |
niemeyer | jimbaker: yaml has a perfectly valid idea of "nil" | 18:26 |
niemeyer | jimbaker: Which is what this means | 18:26 |
jimbaker | ok, i'll propose this fix | 18:26 |
niemeyer | jimbaker: Cheers | 18:26 |
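The analogous fix niemeyer mentions for the Go side might look like the sketch below: an unset value is represented as nil and rendered as YAML null, rather than a fake string like "-Not set-", which is itself a perfectly valid set value. The setting names are invented, and the import is today's gopkg.in/yaml.v2 standing in for the goyaml package of the time (same Marshal API).

    package main

    import (
        "fmt"

        "gopkg.in/yaml.v2"
    )

    func main() {
        settings := map[string]interface{}{
            "title":       "my title",
            "skill-level": nil, // unset: round-trips as YAML null
        }
        out, err := yaml.Marshal(settings)
        if err != nil {
            panic(err)
        }
        fmt.Print(string(out))
        // Output:
        // skill-level: null
        // title: my title
    }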
=== steev_ is now known as steev | ||
fwereade | niemeyer, heyhey, sorry I missed the conversation above | 19:10 |
fwereade | niemeyer, I did repropose the toolsDir PATH branch, maybe you missed it | 19:10 |
fwereade | niemeyer, https://codereview.appspot.com/6496120 | 19:11 |
jimbaker | niemeyer, juju get is now fixed in trunk r577, and the change will be part of the 0.6 release | 19:11 |
niemeyer | fwereade: Ah, sorry, I was postponing it because I didn't think you'd do it right now, but if you're active I'll have a look right away.. I'm sure it's good | 19:12 |
niemeyer | jimbaker: Woohay, cheers | 19:12 |
fwereade | jimbaker, cool | 19:12 |
fwereade | (I wondered about that in the golang implementation :)) | 19:13 |
jimbaker | fwereade, no need to port that bug :) | 19:13 |
* fwereade suddenly has first-job flashbacks... bug-compatible reimplementation of directx... fun :) | 19:14 | |
fwereade | we called it indirectx, of course | 19:14 |
niemeyer | LOL | 19:21 |
niemeyer | fwereade: Done, thanks! | 19:22 |
fwereade | niemeyer, sweet, tyvm | 19:27 |
fwereade | niemeyer, btw, I'm not 100% sure whether this is just me seeing patterns where none exist... but ISTM that the 50ms spinning while waiting for the uniter makes things a lot flakier than the 200ms spinning when I'm running full test suites, and I have a theory that this is *because* I'm spending more time spinning rather than sitting back and allowing the uniter to do its thing | 19:32 |
niemeyer | fwereade: Hmm.. this sounds a bit like we don't know what's actually going on | 19:33 |
fwereade | niemeyer, how would you feel about dialing them back a little, at least to, say, 100ms... (until I decide that I'm seeing just as many just-missed timeouts, and change it back) | 19:33 |
fwereade | niemeyer, ha, yes, that is a good way of putting it | 19:33 |
niemeyer | fwereade: I'd rather dial them to 25ms so we can actually find out what's going on :) | 19:34 |
fwereade | niemeyer, haha :) | 19:34 |
niemeyer | fwereade: The delay is preceding a pull.. if returning earlier makes it break anyhow, something is necessarily not ok | 19:34 |
niemeyer | fwereade: In some cases this reflects background activity that is missing a Stop, for example | 19:35 |
fwereade | niemeyer, hmmm, my theory was no more sophisticated than "the more I do on the test goroutine, the less time is spent letting real things actually happen" | 19:37 |
fwereade | niemeyer, (I'm definitely not contradicting your suggestion) | 19:37 |
niemeyer | fwereade: I find that unlikely.. 50ms is an eternity in computing time | 19:38 |
niemeyer | fwereade: I find it more likely that we're simply lucky enough to have spotted a way to observe a problem | 19:38 |
fwereade | niemeyer, you are probably right... many uniter things take longer than I would naively expect them to | 19:39 |
niemeyer | fwereade: Can I help you to debug the issue anyhow? | 19:40 |
fwereade | niemeyer, not obviously, except perhaps to guess at why this chunk of code might take 200ms: | 19:41 |
fwereade | if err = u.deployer.Deploy(u.charm); err != nil { | 19:42 |
fwereade | return err | 19:42 |
fwereade | } | 19:42 |
fwereade | if err = u.sf.Write(reason, Done, hi, url); err != nil { | 19:42 |
fwereade | return err | 19:42 |
fwereade | } | 19:42 |
fwereade | } | 19:42 |
fwereade | log.Printf("charm %q is deployed", url) | 19:42 |
fwereade | niemeyer, from the final deferred log in Deploy to the log at the bottom | 19:42 |
fwereade | niemeyer, but, yeah, that's not a helpful thing to ask | 19:42 |
niemeyer | fwereade: I'll have a look at the code | 19:43 |
fwereade | niemeyer, specifically in the context of the jujud test | 19:43 |
fwereade | niemeyer, but I'll be staring at verbose runs of that on its own and seeing if I can spot anything, too | 19:43 |
fwereade | niemeyer, hmm, yeah, that's more like 10ms when run on its own | 19:44 |
niemeyer | fwereade: Ah, ok | 19:44 |
niemeyer | fwereade: That sounds about right for a disk seek | 19:45 |
fwereade | niemeyer, hmm, just got a couple of ~100ms delays in a row | 19:46 |
niemeyer | fwereade: Ok, that may be the machine working.. keep in mind that we're fsyncing on the writes | 19:47 |
fwereade | niemeyer, ha! yes, I totally forgot that | 19:47 |
fwereade | niemeyer, ok, I feel a little less insane now, and in that case actually more comfortable just bumping timeouts across the board | 19:48 |
niemeyer | fwereade: Maybe I misunderstand the original issue then | 19:48 |
niemeyer | fwereade: I thought the 50ms was inside a loop | 19:48 |
fwereade | niemeyer, yeah, just when polling for things to have happened | 19:48 |
niemeyer | fwereade: That would wait further for a few seconds until the desired state actually arrived | 19:48 |
niemeyer | fwereade: It'd certainly be too little to be deterministically waiting *just* that, but it's not too little if it's inside a loop | 19:49 |
niemeyer | fwereade: Do I misunderstand what you're actually covering? | 19:49 |
fwereade | niemeyer, the issue is really just that, when things fail, I see a lot of "waiting..." spam from inside the tests; specifically, much more than I intuitively expected | 19:51 |
fwereade | niemeyer, being reminded that we're fsyncing every write makes that behaviour seem a lot less weird | 19:52 |
fwereade | niemeyer, ie I now have a mechanism for the uniter taking its own sweet time over things | 19:52 |
fwereade | niemeyer, rather than the half-baked "overeager tests + gc pauses + handwaving" theory | 19:53 |
niemeyer | fwereade: LOL | 19:53 |
niemeyer | fwereade: This is cool :) | 19:53 |
fwereade | niemeyer, and I'm now comfortable just bumping the top-level timeouts a little and relaxing | 19:53 |
niemeyer | fwereade: We can certainly disable the fsyncing for tests later | 19:55 |
niemeyer | fwereade: But we should make sure things actually work first :) | 19:55 |
fwereade | niemeyer, v good point | 19:55 |
fwereade | niemeyer, quite so :) | 19:56 |
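A minimal sketch of the polling pattern being tuned in this exchange, assuming a helper along these lines (the names and values are illustrative): the test polls at a coarse interval inside an overall timeout, so bumping the interval from 50ms toward 100ms means less time spinning and more time letting the uniter, and its fsyncing writes, get on with real work.

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // waitFor polls cond at the given interval until it returns true or
    // the overall timeout expires.
    func waitFor(cond func() bool, interval, timeout time.Duration) error {
        deadline := time.Now().Add(timeout)
        for time.Now().Before(deadline) {
            if cond() {
                return nil
            }
            time.Sleep(interval)
        }
        return errors.New("timed out waiting for condition")
    }

    func main() {
        start := time.Now()
        // e.g. poll every 100ms (the bumped interval) for up to 5s.
        err := waitFor(
            func() bool { return time.Since(start) > 300*time.Millisecond },
            100*time.Millisecond, 5*time.Second)
        fmt.Println("done:", err)
    }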
=== 45PAA1GWO is now known as hazmat | ||
* niemeyer steps out for a while | 20:57 |