[05:09] bzr: ERROR: Cannot lock LockDir(chroot-93404944:///%2Bbranch/gocheck/.bzr/branch/lock): Transport operation not possible: readonly transport [05:09] ^ any ideas ? [05:12] screw it, hit it with the rm hammer [05:13] nope [05:14] that hasn't worked either [05:24] davecheney: i've not seen that issue before, sorry, i'll see if i can find out [05:26] davecheney: actually, it aappears to be because you may not have write permissions to the gocheck project codebase? [05:27] wallyworld_: yeah [05:27] i guess I havent' submitted anything to that previously [05:27] i've written to gustavo [05:27] nice easy error message :-) [06:40] Morning [09:45] morning. [09:47] Aram: Hi [09:48] hi aram and TheMue [09:48] jam: Heya [09:55] mornin' all [10:04] rogpeppe: Morning [10:05] Anyone interested in reviewing the Firewaller 2.0? ;) https://codereview.appspot.com/6875053/ [10:06] rogpeppe, TheMue: morning [10:06] dimitern: Hello [10:20] TheMue: i'm having a look [10:20] rogpeppe: Great, thanks. [10:25] TheMue: line 184: copy(ports, unitd.ports) [10:25] TheMue: how can that ever work? [10:25] TheMue: because ports is a zero length slice [10:27] rogpeppe: Oh, funnily it works. Have to look into copy(). [10:28] TheMue: it can't do - it will never copy any elements [10:28] TheMue: so unitd.watchLoop will always be started with no ports [10:28] rogpeppe: Good catch, will change. [10:29] TheMue: make a test that fails before making the change [10:30] rogpeppe: Any idea on how that test should look like? [10:31] TheMue: i haven't looked too deeply, but the question is: why do you start watchLoop with some ports? does it make any difference to the externally visible semantics? [10:31] TheMue: if it doesn't make a difference, maybe you should just pass nil to watchLoop and delete those two lines [10:32] rogpeppe: Has been the result of longer discussions to do every startup with the current state. [10:33] TheMue: so i guess it'll just set the same ports twice, right? [10:34] Ruetobas: Pardon? [10:34] Oops [10:34] rogpeppe: ??? [10:35] rogpeppe: Passing them to the watchLoop() just avoids a first change event. [10:36] TheMue: that's what i mean. won't that first change event trigger a redundant setting of the ports? (i actually meant "i guess it'll just set the ports redundantly") [10:37] rogpeppe: Yes, it would do so. [10:38] TheMue: hmm, and because you don't see environment port-setting events directly, you can't observe the difference [10:40] rogpeppe: It's job of the both reconcileXyz() methods to compare the initial state with that of the environment or the instances. And then needed changes are done. [10:41] rogpeppe: Everything later is then based on change events. [10:41] rogpeppe: And the initial state is taken from the first events of the regarding watchers. [10:42] TheMue: yes, that all seems fine. i was just wondering on whether it was possible to test that broken code. [10:44] rogpeppe: With exporting internal states of the firewaller maybe. But I don't think it's worth it, your capture has been great and it would only demonstrate that you're right. :D === asavu is now known as asavu_ === asavu_ is now known as asavu [11:40] GOod morning juju-devs [11:41] niemeyer: Hiya [11:42] niemeyer: morning :) [11:42] niemeyer: yo! [12:08] Lunchtime, BIAB [13:20] niemeyer: ping [13:20] TheMue: Hey [13:21] niemeyer: Hi, could you please take a look at https://bugs.launchpad.net/juju-core/+bug/1084867 which Mark assigned to me yesterday. [13:21] <_mup_> Bug #1084867: conn: conn connections should pause between retry attempts < https://launchpad.net/bugs/1084867 > [13:21] niemeyer: It's in state.Open() and I've got a question. Regarding mgo. [13:23] niemeyer: If I'm right mgo.DialWithInfo() uses a func for dialing, but the timing if it's failing is controlled inside the method. You only may specify a timeout (here 10 minutes). Am I right? [13:24] TheMue: Where are the messages in that bug being printed? [13:24] TheMue: That code must necessarily be repeated for the message printing to be repeated, right? [13:24] niemeyer: In the func passed to DialWithInfo() [13:25] niemeyer: Yes, until the conn is established or the timeout happens. [13:26] niemeyer: Is there a fixed time between those retries? [13:28] TheMue: There are retries, there are timeouts, there are delays [13:30] niemeyer: Yes. IMHO Dave recomments a kind of exponential delay behavior. Alternatively to static delays. [13:30] niemeyer: But maybe I misunderstand the comment. [13:30] TheMue: Yes, I can see in the bug that he recommends that [13:31] TheMue: Where is the repetition happening, for how long does it repeat quickly, how long does it wait until the quick repetitions? [13:34] niemeyer: Sorry, no more details so far than written in the issue. So I'll try to talk to Dave tomorrow morning. Just wanted to know first if I'm missing something about the mgo dialing behavior. [13:34] TheMue: You don't have to talk to Dave, you have to understand the problem instead [13:34] niemeyer: And will start some test runs here. [13:34] TheMue: Sorry, I definitely don't intend to sound harsh, but you have to do your home work when a bug is assigned to you [13:35] niemeyer: No problem. [13:36] TheMue: This sounds like a trivial problem of a missing delay in the right place [13:38] niemeyer: This would have been the next where I would have talked to you about. Such a kind of delay is possible in the dial func, but it may also be interesting for you as the creator of mgo to make it configurable via DialInfo in a next release. [13:39] TheMue: Where is the repetition happening? Who's dialing so quickly? [13:39] TheMue: You seem to have no idea [13:39] TheMue: Putting sleeps in the code when you have no idea about exactly how work is taking place is a terrible plan [13:40] TheMue: It can be in mgo, it can be in Open, it can be in the call site of Open, etc [13:40] TheMue: You have the whole code base at your hand [13:41] niemeyer: It looks to me that the dial func passed to DialWithInfo() is repeated quickly. Am I right? It contains no own loop. And it is passed to DialWithInfo with the DialInfo. [13:42] TheMue: Exactly.. it contains no loop.. things that contain no loop do not lead to repetition [13:42] TheMue: How can it possibly be repeated? [13:42] TheMue: Staring at it and wondering how to control repetition is going to be ineffective [13:42] niemeyer: It returns an error and no connection, so DWI() retries. [13:43] TheMue: Where? [13:44] niemeyer: Aaargh, I think I've got it. [13:46] niemeyer: I'm currently walking the call tree upwards. [13:46] niemeyer: Have been too fixed on that Open(). [13:46] TheMue: Don't guess.. put debug statements in, watch the timings, watch call traces [13:47] niemeyer: Yes, will do so. [14:40] Three branches off the queue.. 10 to go [14:40] Lunch first, though [16:06] rogpeppe2: https://codereview.appspot.com/6902070/ is reviewed [16:06] rogpeppe2: Either I'm on crack, or the logic there isn't entirely sound === rogpeppe2 is now known as rogpeppe [16:06] niemeyer: you may well be right. let me check. [16:07] niemeyer: the first add is because we can't do an add before the websocket handler function is invoked [16:08] niemeyer: the websocket handler uses a different goroutine for each call (like the http server) [16:09] rogpeppe: Yeah, even then it doesn't seem entirely okay [16:09] rogpeppe: There are four different goroutines there [16:10] rogpeppe: With adds all around [16:10] rogpeppe: and it's not clear why [16:10] rogpeppe: There is also a channel blocking at the end of each handled connection, which also doesn't sound okay [16:10] niemeyer: we need to wait until everything has cleaned up before dying [16:10] rogpeppe: We need to do a lot of things [16:11] rogpeppe: But that doesnt' mean we need a new goroutine for each one of those things [16:11] niemeyer: i'm not sure how to avoid the goroutines [16:11] rogpeppe: I suggest starting from the principle that you don't need them, rather than from the principle that you do need them [16:12] niemeyer: the only way to get the http listener to quit is to close its Listener [16:12] rogpeppe: Also note how run(lis) never returns an error, despite the fact it has an error return.. which means the outside goroutine, doing a Kill, never does it [16:12] rogpeppe: There's surprisingly little logic for that ton of logic [16:13] niemeyer: yes, run never returns an error, because the error that it almost always gets is "use of closed listener" [16:13] niemeyer: (if we used the error from http.Serve, that is) [16:14] rogpeppe: Sorry, I'm not sure I understand what you mean [16:14] niemeyer: what error could Server.run return? [16:14] rogpeppe: Things that never return an error should not have an error return, and the error return from things that never return an error should not be used [16:14] rogpeppe: Both of these are false [16:14] In the branch [16:14] niemeyer: that's true. i should probably remove the error return from run [16:14] rogpeppe: Yes, that's a start [16:17] rogpeppe: What about that blocking channel at the end of every single handled connection? [16:18] niemeyer: hmm, yes, that does seem a bit crackful [16:18] niemeyer: i think it needs to be a select [16:19] niemeyer: and wait for the st.run() to finish as well as the dying tomb [16:19] rogpeppe: Why? Why do we have a goroutine being created within the per-connection goroutine? [16:19] niemeyer: i'm not sure i can use any less goroutines though [16:19] niemeyer: because we need somewhere to wait for the dying tomb [16:20] niemeyer: so that we can kill the connection [16:20] rogpeppe: One of the reasons why we have a tomb is precisely to not shut the door on running logic [16:21] niemeyer: we're not shutting the door on running logic, we're waiting for websocket.JSON.Receive to receive an EOF, and quit the run loop [16:21] niemeyer: that's the only way to terminate that run loop AFAIK [16:21] rogpeppe: That's not what conn.Close does [16:21] niemeyer: what does it do? [16:22] rogpeppe: It closes the connection, doesn't it? [16:22] rogpeppe: Irrespective of whatever is going on within the goroutine [16:22] niemeyer: yes. hmm, that will stop replies being sent, yesah [16:24] niemeyer: there probably needs to be something so that we can tell when the message receiver is blocked on a message receive. [16:24] niemeyer: (pity we can't half-close the connection) [16:26] niemeyer: perhaps before it does a message receive, it could send the connection to another goroutine which will close it if it sees Dying. [16:27] niemeyer: once it's got the message, it can ask for the connection back again [16:28] niemeyer: it would still be the same number of goroutines, but the receiver would be more in control. [16:30] rogpeppe: The point isn't closing, I think.. the point is stopping and returning.. Close can be deferred on spot [16:31] func(conn ...) { [16:31] defer conn.Close() [16:31] } [16:31] niemeyer: the loop in run is blocked on receive. someone calls Server.Stop. how do we stop the run loop? [16:31] rogpeppe: Exactly.. [16:32] rogpeppe: That's the issue we must solve: how to stop and return.. that decision must be taken with context, unlike what's being done right now [16:32] niemeyer: i think closing the conn is the only way to do that [16:32] niemeyer: but you're right that it shouldn't be done outside of run [16:32] rogpeppe: Of course not.. that's taking a cheap route from outside to knock logic we don't control down [16:33] niemeyer: ok, so how do we get the run loop to unblock, if we can't close the conn? [16:34] rogpeppe: "if we can't close the Conn" is a red-herring [16:34] rogpeppe: Where would you close the Conn? That's where you get the loop to unblock [16:35] rogpeppe: You don't want to close the Conn without context [16:35] niemeyer: we need to close the conn *somewhere* right? [16:35] rogpeppe: The point isn't closing, I think.. the point is stopping and returning.. Close can be deferred on spot [16:35] func(conn ...) { [16:35] defer conn.Close() [16:35] rogpeppe: That's where you close the Conn [16:35] niemeyer: i don't understand that code without further context [16:36] rogpeppe: handler := websocket.Handler(func(conn *websocket.Conn) { defer conn.Close() [16:36] rogpeppe: The connection must be closed when the handler is done running [16:37] rogpeppe: We don't close the connection to interrupt.. we interrupt to close the connection [16:37] niemeyer: how can that work? the handler blocks receiving a message. if it's blocked receiving a message, how can it know that the server has been killed? [16:37] rogpeppe: That's the right question [16:37] rogpeppe: Answering it is what we must find out [16:38] niemeyer: we could receive messages in a separate goroutine [16:38] rogpeppe: That's not solving the problem [16:38] niemeyer: (that's still the same number of goroutines though) [16:38] niemeyer: isn't it? [16:38] rogpeppe: We can have 10 goroutines.. we still must find a way to interrupt in the right spot [16:38] niemeyer: i mean we could read messages in a goroutine, and send them on a channel. [16:39] niemeyer: then select on that channel and Dying. [16:39] rogpeppe: It's not.. you're saying "we can move the problem to a separate goroutine".. surely we can. The problem is still in the other goroutine, though. [16:39] niemeyer: we can't do it with one goroutine. period. [16:42] rogpeppe: That's an irrelevant invariant.. we should look for the solution, rather than how many goroutines we need to put an untold solution in place. [16:42] niemeyer: this would work, but you'll probably think it's messy: http://paste.ubuntu.com/1430008/ [16:46] rogpeppe: How does conn.SetDeadline work? [16:46] niemeyer: i think it hooks into net.TCPConn.SetDeadline, but i'll check [16:47] rogpeppe: It does.. but what happens when the deadline expires? [16:47] niemeyer: you get an error that satisfies IsTimeout (IsTemporary?) AFAIR [16:48] rogpeppe: Sure, but the interesting question is what happens to the connection itself and the data that was being received when that takes place [16:49] niemeyer: i *think* it may be left intact. but i don't think that helps us. [16:50] niemeyer: if something is doing a ReadFull and it gets a temporary error, it is likely to be fatal [16:51] rogpeppe: Yeah, it'd likely introduce that kind of weird issue indeed [16:52] niemeyer: something like this is fairly clean: http://paste.ubuntu.com/1430030/ [16:53] niemeyer: then the close can go in the function as you suggest [16:54] rogpeppe: Yeah, that looks like a nice directoin [16:54] rogpeppe: The error must be taken care of, etc, but looks nice [16:55] niemeyer: that's what i was thinking of when i said [16:55] [16:38:48] niemeyer: i mean we could read messages in a goroutine, and send them on a channel. [16:55] [16:39:02] niemeyer: then select on that channel and Dying. [16:55] niemeyer: i'm not sure what the best thing to do with the errors is though [16:56] niemeyer: we don't want to take the whole server down because a single client sends an bogus request [16:56] rogpeppe: Indeed, I guess that's fine [16:56] niemeyer: i think that a simple log message might be the best way [17:11] rogpeppe: +1 [17:16] niemeyer: https://codereview.appspot.com/6902070 [17:18] rogpeppe: srv.run still returns an error that is never used.. its result is still used on Kill despite the fact it never changes [17:18] niemeyer: ah, i'd forgotten that. will remove. [17:19] rogpeppe: Add and Done within websocket.Handler is still very awkward [17:19] niemeyer: yes [17:19] niemeyer: got a better solution? [17:19] rogpeppe: It is pretty much never sensible to call those in the same goroutine [17:20] niemeyer: unfortunately we don't have control over the moment that http starts a new goroutine AFAIK [17:21] niemeyer: i'm pretty sure my ErrStillAlive check makes the logic sound, but it is definitely awkward. [17:22] rogpeppe: Where is Wait? [17:22] niemeyer: ha [17:22] niemeyer: i'm sure it used to be *somewhere* :-) [17:25] niemeyer: i was contemplating putting in a long-running request (some kind of waiter, perhaps) so that i could test these issues better [17:25] rogpeppe: Why do we need that goroutine at the start of run? [17:26] niemeyer: it's either that or http.Serve [17:27] niemeyer: hmm, and since we're not using the error returned by http.Serve, maybe we should just do "go http.Serve(...)" [17:27] niemeyer: oh, but we can't [17:27] niemeyer: it would look just the same as the other go statement [17:27] rogpeppe: Why? [17:27] rogpeppe: Why? [17:28] niemeyer: we want to block until it's completed, i think [17:31] rogpeppe: Indeed, sounds like a good idea [17:35] rogpeppe: This also sounds a bit bogus, btw: [17:35] 71 » » // If we've got to this stage and the tomb is still [17:35] 72 » » // alive, we know that any tomb.Kill must occur after we [17:35] 73 » » // have called wg.Add, so we avoid the possibility of a [17:35] 74 » » // handler goroutine running after Stop has returned. [17:35] niemeyer: PTAL (waitgroup added) [17:35] rogpeppe: Stop won't return before Done is called [17:35] rogpeppe: and we should not call Done while there's stuff running in background.. that's the invariant on tomb [17:35] niemeyer: that's fine [17:36] rogpeppe: [17:36] 47 » go func() { [17:36] 48 » » defer srv.tomb.Done() [17:36] 49 » » srv.run(lis) [17:36] 50 » }() [17:36] rogpeppe: How about go srv.run(lis)? [17:37] rogpeppe: The Done there isn't right [17:37] niemeyer: the Tomb.Done in the above code? [17:37] rogpeppe: Sorry, I guess it is now that you've added wg.Wait [17:37] niemeyer: i *think* it's ok [17:37] rogpeppe: Still, we don't need a closure just to call a function [17:38] niemeyer: good point. [17:38] niemeyer: it was nice when it was returning an error, but now a bit redundant [17:38] niemeyer: will make smaller [17:38] rogpeppe: Agreed on both points [17:41] rogpeppe: Line 152 will wedge on Dying [17:42] niemeyer: argh, yes. it probably needs to select on dying too [17:44] niemeyer: alternatively, we could close the conn and wait for readRequests to die, which might be cleaner. [17:44] niemeyer: at least then we know we've cleaned up *all* resources [17:46] rogpeppe: closing under a sender? [17:46] niemeyer: no [17:46] niemeyer: http://paste.ubuntu.com/1430107/ [17:49] rogpeppe: Doesn't look cleaner [17:49] rogpeppe: That means we'll have to Close twice, and wait for messages we don't care about just to unblock the other side [17:50] niemeyer: it's cleaner in that no log messages can be produced after the server has been stopped [17:50] niemeyer: i don't think we'll have to close twice [17:50] rogpeppe: So that paste already has a bug [17:50] rogpeppe: There are two returns.. one of them doesn't close the connection [17:50] niemeyer: that's right [17:50] rogpeppe: !? [17:51] niemeyer: does that invalidate the general approach? [17:52] rogpeppe: Sorry, I'm a bit lost.. it's right to not close the connection? [17:53] niemeyer: no. but the piece i was trying to demonstrate was the bit that closes the connection and waits for the readRequests goroutine to finish [17:53] rogpeppe: All of the comments above were made about this [17:54] niemeyer: i think this is different, as the readRequests goroutine has no side effects [17:54] niemeyer: (this is a slightly different phrasing of the idea: http://paste.ubuntu.com/1430129/) [17:55] rogpeppe: I have no idea about what side effects has to do with the above points [17:55] rogpeppe: Either the paste had a missing Close, or it was Closing twice [17:55] niemeyer: oh, sorry, i thought you were referring to much earlier comments [17:55] niemeyer: "above" is, erm, ambiguous, in IRC context :-) [17:55] rogpeppe: You said it doesnt [17:56] niemeyer: i said it doesn't what? [17:56] rogpeppe: So I don't know what you have in mind.. either way, all of that makes it sound like a bad idea. [17:56] niemeyer: i don't think we'll have to close twice [17:56] niemeyer: did you look at my most recent paste? [17:56] rogpeppe: No, I'm still trying to understand what we're talking about [17:56] rogpeppe: We're clearly talking across each other [17:56] rogpeppe: And looking at more pastes won't help [17:57] niemeyer: you were concerned about readRequests blocking forever when run returns, right? [17:57] rogpeppe: I made a very specific statement, saying that there was a missing Close or that it was closing twice [17:58] rogpeppe: You said "I don't think so", and didn't explain why, and ignored further comments [17:59] niemeyer: my paste was supposed to be an explanation, sorry [17:59] niemeyer: and the most recent paste shows how we can close once only, and reliably. [18:00] rogpeppe: Your different paste has different logic, and ignores my questioning [18:00] rogpeppe: "Yes, I wanted it to close twice"; "Yes, there was a bug".. those are all trivial to say [18:00] rogpeppe: and keep us on the same page [18:00] niemeyer: your only question has been "it's right not to close the connection?" [18:01] niemeyer: which sounds rhetorical to me - of course it's not right not to close the connection [18:01] rogpeppe: Sorry, that's not working at all [18:02] rogpeppe: I make a comment, and you say I'm wrong, and ignore it, and then say it's obvious that I'm right [18:02] rogpeppe: Just send the code you think is right for review then [18:02] niemeyer: will do [18:07] niemeyer: PTAL https://codereview.appspot.com/6902070 [18:15] rogpeppe: Done [18:16] niemeyer: you're probably right that it's not worth saving one allocation per message :-) [18:16] niemeyer: i wondered if you'd jump on that [18:17] rogpeppe: There's no allocation being saved in either case [18:18] rogpeppe: Both send the address of a local variable across to another function, and both send a copy of it over the channel. There's no reason for one to allocate and the other not. [18:18] niemeyer: i think that the pointer passed to json.Unmarshal counts as escaping, so the local variable will be allocated each time round the loop [18:18] rogpeppe: See above [18:19] niemeyer: it makes a difference if the variable is declared inside or outside the loop [18:20] niemeyer: if it's declared outside, it will be allocated only once for the lifetime of the function [18:20] niemeyer: if it's declared inside, it will be allocated each time it comes into scope (because it escapes) [18:20] niemeyer: but still, it's only one allocation out of many, who cares? :-) [18:21] rogpeppe: Doesn't sound correct.. "escaping" is not defined by looping [18:21] rogpeppe: If you call f(&foo), the question is whether f keeps &foo or not [18:21] niemeyer: no, "escaping" is defined by being passed to json.Unmarshal, in this case [18:21] rogpeppe: But really, it doesn't matter [18:21] rogpeppe: Yep, which both do [18:22] rogpeppe: I don't care, though.. this branch has been reviewed enough, and that's an early optimization anyway [18:22] niemeyer: true [18:22] niemeyer: i'll go with that [18:28] niemeyer: for illustration: http://play.golang.org/p/9vsPOfjOmJ [18:31] rogpeppe: That's totally unrelated to escape analysis.. both are escaping. [18:31] niemeyer: it illustrates the difference between the old version of the code and the new version of the code [18:31] niemeyer: with "escape" standing in for some function that lets its argument escape [18:32] niemeyer: (in our case JSON.Receive) [18:32] rogpeppe: Sure, I misunderstood what you meant.. it's obvious that it'll do more allocation *IF IT ESCAPES* [18:33] rogpeppe: rogpeppe: If you call f(&foo), the question is whether f keeps &foo or not [18:33] niemeyer: sure. and i'm absolutely certain that the argument the JSON.Unmarshal will escape. so we will allocate more. [18:33] niemeyer: well, to be more accurate: [18:34] niemeyer: the compiler can't tell that the argument doesn't escape [18:34] rogpeppe: There's zero intrinsic reason for it to escape [18:34] rogpeppe: Since it's side-effects free [18:34] rogpeppe: But.. I don't mind either way. Glad we've cleared the issue, though, thanks. [18:34] niemeyer: yes, but it goes into reflect-based code, so the compiler throws up its hands and says "i dunno" [18:35] rogpeppe: Quite possibly [18:36] niemeyer: just checked: http://play.golang.org/p/Wz7Z94cKaV [18:38] rogpeppe: Nice, feel free to revert the code if you want. [18:44] niemeyer: submitted, thanks for the very useful review [18:44] rogpeppe: My pleasure [18:44] niemeyer: time for me to go. [18:45] niemeyer: and i guess we won't be seeing you for a little while [18:45] niemeyer: good luck with everything! [18:45] niemeyer: (i'm off tomorrow BTW) [18:45] rogpeppe: Ahh, cool, enjoy the holiday, and see you soon [18:45] rogpeppe: I'll certainly be around every now and then [18:55] g'night all === sidnei` is now known as sidnei