[00:02] svz90: Easiest thing to do would be to configure it in your OpenSSH .ssh/config file
[00:04] maxb: Thanks.
[05:32] i've just done a "bzr add foo", but before i committed the change i changed my mind and i no longer want to add "foo"
[05:32] how can i tell bzr i don't want to add "foo" anymore?
[07:31] rm
[07:31] --keep or whatever
[14:06] english question: in a commit message like "reverted changeset xyz", reverted is past participle, not simple past, right? that is, changeset was reverted, not I reverted it.
[14:09] As a commit message it would make more sense to say "Reverts changeset xyz" as the message should say what that changeset does. But in the above either works really, not sure which way it would be interpreted by a grammar perfectionist though.
[14:24] I'd go with "revert changeset" myself. Imperative FTW.
[14:28] can it be read, is it usually read, as past participle?
[14:29] is it "undone commit 123.", "commit 123 undone.", or "undid commit 123."?
[14:37] 3rd IMO
[14:51] there's one guy saying 1st could be read as "the commit has undone commit 123"
[14:52] and other guys saying "commit 123 undone" ("commit 123 has been undone") is ok, even though 3rd one is more usual
[15:35] vila: Around?
[16:01] fullermd: briefly, may I help ?
[16:02] I wondered if you've ever seen oddities of selftest on your FreeBSD vm with it locking itself up waiting on a semaphore.
[16:03] this doesn't ring any bell, FreeBSD (the babune slave) is one of the most stable (if not *the* most)
[16:03] fullermd: is it something you've seen with an explicit output or only a weird behavior
[16:03] ?
[16:04] Well, for instance, http://babune.ladeuil.net:24842/job/selftest-freebsd/lastCompletedBuild/testReport/bzrlib.tests.blackbox.test_branch/TestSmartServerBranching/test_branch_from_trivial_branch_streaming_acceptance/?
[16:04] That one.
[16:04] while being stable it's also quite slow (compared to the others), but I don't have a good explanation for this
[16:04] If I run it via 'bzr selftest branch_streaming_acceptance', it runs fine.
[16:05] If I just like 'bzr selftest' hit it, it just stops dead. python had 'usem' as a wchan in ps/top. I've let it sit for 4 or 5 minutes, and gets nowhere.
[16:05] haaa interesting
[16:05] (runs in <3 seconds standalone)
[16:05] I've tried running a full selftest a half dozen times, and I keep finding new places further on that it locks down.
[16:05] Except last try around, where I found another place _early_ on that it started locking. Gruuh.
[16:06] so, IIRC, that's a test which runs both a client and a server in the same process
[16:06] * fullermd nods.
[16:06] All the ones that lock are of that ilk AFAICS.
[16:07] I've been suspecting a weird interaction between python and BSDs (including OSX) but other OSes too (but quite differently) but I've never been able to diagnose precisely enough
[16:07] weird interaction around sockets
[16:08] some double handling of socket state (especially when both sides of the socket are used in the same process)
[16:08] but that's only a theory so far... and pretty weak :-/
[16:09] Mmm. And you're using --parallel and somehow don't it it?
[16:09] (hit it)
[16:09] I do use --parallel yes
[16:09] well babune that is
[16:09] Annoyingly it seems like sometimes it sneaks past it fine. Weird timing stuff; always works when it's part of a short list of tests being run.
[16:10] ha, that rings a bell
[16:10] Would make it easy enough to work around by running a couple times, if it didn't happen 5 or 10 minutes into a freakin' test run...
[16:10] * fullermd is now trying `./bzr selftest -x branch_streaming_acceptance -x test_create_clone_on_transport_use_existing_dir -x RemoteBranch -x RemoteBzr`. Furrfu.
[16:10] my suspicion began when there was a bunch of sockets waiting to die which seem to slow the overall run
[16:11] which led me to suspect some select() call
[16:12] a bit like if python was relying on the OS to sort things out while.... somehow checking until it was happy with sleeps() intermixed... as you can see nothing very concrete
[16:13] * vila backtracks
[16:13] fullermd: you encounter hangs ?
[16:13] Except for me, python ends up waiting on a semaphore of some sort, so it never checks again.
[16:13] Well... halting problem. It COULD just be being very very slow, for 4 or 5 minutes, using no CPU and never waking up.
[16:13] But it sure smells like... oh, look.
[16:14] * fullermd tries a-fscking-gain...
[16:14] * vila holds his breath...
[16:14] look what ?
[16:14] :D
[16:15] Oh, it locked itself again a thousand or so tests in.
[16:15] what's the flag... -Dthreads -Ethreads ?
[16:16] -Ethreads
[16:17] It was quite useful when debugging the leaks, but I think it was a bit too intrusive and led to some failures (changing the output that some tests are checking), so be careful while encountering failures
[16:18] Well, we're 1600-some in to this attempt...
[16:18] Aaaand, it does.
[16:19] oh yes, lsof !
[16:20] lsof was also useful combined with -Ethreads whose output gives the socket references by tests
[16:20] both client and server side
[16:20] % fstat -p59868 | grep -c tcp
[16:20] 212
[16:20] That's a couple TCP sockets open...
[16:20] that's bad
[16:21] so, the other factor here is paramiko which leaves pending sockets
[16:21] Not a factor here. I've had paramiko uninstalled for, like, a year and a half.
[16:21] but they can't be easily collected as they are internal to pa...
[16:21] damn
[16:21] one more theory dead
[16:22] * fullermd , slayer of theories!
[16:24] When you say -Ethreads, does that mean some sort of python internal threads, or does it use OS-level threads?
[16:25] python threads
[16:25] -Ethreads outputs debug statements at various "interesting" points
[16:26] There are 3 threads in the hung process. One seems to be sitting in the 'accept' wchan, which I WAG may mean it's sitting in accept(2)...
[16:26] * vila hates vbox a bit more every day especially when killing one VM sometimes kills *all* the running VMs
[16:26] WAG ?
[16:26] Wild-Add Guess
[16:26] :)
[16:27] one thread for the accept(), one thread for the serving, one thread for the client (usually the main one)
[16:29] and by the way, the selftest in running with BZR_CONCURRENCY=4 and the VM has 2 processors configured
[16:29] s/in/is/
[16:32] Well, I'm not doing any --parallel'ing.
[16:32] I got that, just mentioning
[16:32] * fullermd nods.
[16:33] I'm running a straight selftest right now
[16:34] Oh, hey, it locked up only 197 in this time.
[16:35] with -Ethreads ?
[16:35] Only 16 TCP sockets open this time.
[16:35] Trying that out now.
[16:36] this... is surprising, I was pretty sure we collect almost all sockets now, except for the paramiko ones (which aren't relevant here)
[16:36] you're running bzr.dev there right ?
[16:36] Well, locked up, but I guess I should use -v too so's to have some idea where it is...
[16:37] Well, technically, this is bzr.dev+someotherchanges. But nothing that would mess with network.
[16:38] Well, it dumps a pile of information, but I don't see how it's much useful...
[16:38] 2700 tests and running and fstat reports only 'internet stream tcp' lines, that's sockets right ?
[16:38] Yah.
[16:38] 'sockstat | grep python' should tell you what they are (well, unless you've got a lot of other stuff python running; then grep for the pid)
[16:38] it helps making sure the sockets are for past tests not the current ones and it also helps in finding which tests are leaking
[16:40] * fullermd frowns.
[16:40] Hm.
[16:40] I'm trying t selftest --no-plugins now. That seems to clean up after itself pretty fast; I'm in the 500's now and nothing hanging around...
[16:41] right, ~4000 here and still no socket leaks (only 3 displayed)
[16:41] Oh, well, that was a good theory while it lasted. Locked up.
[16:42] out of curiosity, what changes do you keep there ? Are you sure they aren't some gems worth proposing ? ;D
[16:42] Well, I WANT to propose them. That's why I want selftest to run :p
[16:42] Here's an interesting thing:
[16:42] USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
[16:42] ha great :)
[16:42] fullermd python 60430 4 tcp4 127.0.0.1:39758 *:*
[16:42] fullermd python 60430 5 tcp4 127.0.0.1:39758 127.0.0.1:16483
[16:43] That's presumably the socket of the current test. One of the threads is waiting in accept. The socket is listening. But there's also a connection made to it.
[16:44] That's on a test_pull_smart_stacked_[something]
[16:44] yes, could be the client and the accept() or the serving one, -Ethreads should help there no ?
[16:44] also, you should certainly see the mirrored one most of the time but if you happened to list while they are killed ?
[16:45] OK...
[16:45] ...nching.test_branch_from_trivial_branch_streaming_acceptanceServer thread ('127.0.0.1', 31415) started
[16:45] Client thread ('127.0.0.1', 10740) -> ('127.0.0.1', 31415) started
[16:45] fullermd python 60607 4 tcp4 127.0.0.1:31415 *:*
[16:45] fullermd python 60607 5 tcp4 127.0.0.1:31415 127.0.0.1:10740
[16:45] fullermd python 60607 6 tcp4 127.0.0.1:10740 127.0.0.1:31415
[16:46] 7000 and fstat is still clean as a baby, your 212 number above is still very weird
[16:46] Only tcp sockets open for the process.
[16:46] So it's got both sides of the connection open. But one thread is still sitting in accept.
[16:46] sounds fine
[16:47] And it's dead there.
[16:47] there is one thread waiting in the accept, spawning a thread for serving each connection and accept'ing again
[16:47] dead and staying dead right now ?
[16:47] keep it alive !!!!
[16:48] Well, it's been a couple minutes...
[16:48] seriously,
[16:48] this is the dirty bit when shutting down a test server
[16:48] what's the test name ?
[16:49] ...nching.test_branch_from_trivial_branch_streaming_acceptance
[16:49] let me see
[16:49] In this case. I don't think it matters though; it's semi-random.
[16:49] * fullermd just fired off a selftest in a virgin bzr.dev, just for kicks.
[16:50] could be, but I want to look at what kind of test server is involved and show you the relevant code so you may have your own take on it
[16:50] (not asking to debug it, but look at the code in a context where it seems to be failing )
[16:50] Man, the relevant code is python. My take will be "Hey, I can just rewrite that in perl..." :p
[16:51] hehe, no I mean as socket code, whatever the language, we're almost doing C there :)
[16:52] Remember, any test it's halted on yet, it works just peachy if I run it alone.
[16:52] ok, so that's a TestCaseWithTransport so it should be an http server
[16:52] Or in a small group. It's only when I run a big enough group (e.g., a full selftest) that it semi-arbitrarily picks a time to lock itself up.
[16:52] I've seen very very very weird failures when chasing the leaks
[16:52] exactly
[16:52] Oh, there went virgin bzr.dev.
[16:52] | [1516/24890 in 3m27s, 1 failed] per_branch.test_branch.TestBranch.test_comm..
[16:54] bah, showing you the code is a bad idea, too many classes involved
[16:54] * fullermd has no class.
[16:55] ha, no, have a look at bt.test_server.TestingTCPServerInAThread.stop_server
[16:56] the dirty bit I was referring to is defined there: the server is blocked in an accept call, so we 1) tell him to stop accepting connections (after getting out of its current call), 2) give him a dummy connection
[16:57] there is also an if debug_threads():... call that you could copy/adapt/add in various points
[16:57] I suspect the case you encounter should be in this area but exhibit an unexpected behavior
[16:58] 14000 tests and still clean
[16:59] hmm, IPV6 ?
[17:00] All those connections are over 127.0.0.1
[17:00] naah, you're running py2.6 right and anyway, I'm pretty sure I've used a variation of the right python code
[17:00] we've seen weird things even when using only ipv4 for us, as long as the host is configured for ipv6, (paramiko at least has a bug for this kind of config)
[17:01] most of the time we force ipv4 by using 127.0.0.1 but some tests may still use localhost and went unnoticed
[17:02] we have some fugitives like that here and there ;D
[17:02] Just switch it to force ipv6 by using ::1 instead. If their system doesn't support v6 yet, set it on fire; better for everyone that way :p
[17:04] hehe
[17:05] 18000 and still clean, babune:~ :) $ fstat -p1004 |wc -l
[17:05] 12
[17:06] but if you encounter the problem with bzr.dev and --no-plugins that should rule your proposal out :)
[17:08] 20000 and raising briefly at ~40 (paramiko), back at 12
[17:08] It doesn't seem to [generally] do any notable accumulation of stuff over time. Most of the locks have no sockets except the current ones.
[17:09] but when it blocks it's on a socket right ?
[17:09] and one in the accept() state ? (with no foreign address ?)
[17:10] Yah.
[17:10] or are you unsure about that ?
[17:10] Well, I'm not _sure_ insofar as I've dug into the stack. But that's what everything looks like.
[17:11] so, I encounter weird things when trying the kill-the-socket-with-shutdown approach IIRC
[17:12] I tried various ways with mixed results until I ended up giving it a dummy connection and even there I think I tried various tricks before settling on a simple close()
[17:13] that's why I suggested adding some sys.stderr.write in this area but if you're blocked on a case where there is no foreign address...
[17:13] it kind of means that this last_conn didn't succeed ?
[17:14] or did you even not reach the point where 'Server thread %s will be joined' ? Or is this message not flushed ?
[17:16] I don't recall ever seeing joins on the locks.
[17:16] oook
[17:16] See the one I pasted ~half hour ago. The "Client thread started" is the last thing that shows up.
[17:17] As if the server thread sat in accept() waiting for the client thread (which the OS sees as connected), and never got out of accept().
[17:17] that part is expected, the server thread has spawned another thread
[17:18] what isn't expected is that this spawned thread can't finish (or its related client thread)
[17:18] * fullermd sighs.
[17:18] the client thread being the main thread
[17:18] It sucks trying to figure out why changes were made when you can't ask the maker :|
[17:18] so, this kind of hang is the hardest
[17:18] which changes ?
[17:19] The ones I'm working on cleaning up, which led me to trying selftest in the first place.
[17:19] which changes ? :D
[17:20] upgrade enhancements.
[17:20] ha :-/
[17:21] err, maybe I misinterpreted here, you mean in upgrade.py ?
[17:21] Yah.
[17:22] qblame mentions only living people, so you should be able to reach them no ?
[17:22] Yes, that would be for the existing merged code, not the outstanding unmerged.
[17:23] Which is igc :(
[17:23] ha, I didn't misinterpret :-/
[17:23] you're digging an old mp ?
[17:23] https://code.launchpad.net/~bzr/bzr/smooth-upgrades/+merge/8921
[17:26] * vila reading
[17:27] by the way, the selftest succeeded here with only: bzrlib.tests.test_smart_transport.TestServerHooks.test_server_started_hook_memory is leaking threads among 2 leaking tests.
[17:27] 1 non-main threads were left active in the end.
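(Editor's sketch.) The shutdown sequence vila describes for bt.test_server.TestingTCPServerInAThread.stop_server (tell the server to stop accepting, then give it a dummy connection so the blocked accept() returns) can be illustrated roughly as follows. This is not bzrlib's code: the TinyServer class, its method names and the join timeout are all hypothetical, chosen only to show the pattern and where a hang of the kind fullermd reports would become visible.

```python
import socket
import threading


class TinyServer:
    """Hypothetical stand-in for a test server; NOT bzrlib's actual class."""

    def __init__(self):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        self.sock.listen(5)
        self.address = self.sock.getsockname()
        self.stopping = threading.Event()
        self.thread = threading.Thread(target=self._accept_loop)

    def start(self):
        self.thread.start()

    def _accept_loop(self):
        while not self.stopping.is_set():
            conn, _peer = self.sock.accept()  # blocks until someone connects
            # A real server would hand conn to a per-connection thread here.
            conn.close()
        self.sock.close()

    def stop(self, timeout=5.0):
        # 1) Ask the loop to exit once it returns from its current accept().
        self.stopping.set()
        # 2) Make a dummy connection so that accept() actually returns.
        last_conn = socket.create_connection(self.address, timeout=timeout)
        last_conn.close()
        # 3) Join with a timeout so a stuck thread is reported instead of
        #    being waited on forever.
        self.thread.join(timeout)
        return not self.thread.is_alive()


if __name__ == "__main__":
    server = TinyServer()
    server.start()
    print("stopped cleanly:", server.stop())
```

Joining with a timeout is a choice made for this sketch only; it turns a silent lockup into a False return value that a caller could log along with the state of the sockets.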
[17:32] hmm, lots of stuff there ;-/ Nice to see you working on it, I should leave now, but I'll be pp next week and will look at anything you'll submit, so feel free to begin with high level questions or even re-start the discussion on the ML
[17:33] --pack should be useless, --cleanup I think has been implemented elsewhere, so it's worth having a look at that while it still makes sense to propose it there (I went with a bunch of upgrades lately and could have used it)
[17:36] You're thinking of pack --clean-obsolete-packs I think.
[17:42] fullermd: that's an interesting mp
[17:44] Yeah, it should be a real nice addition to 2.0.0 :|
[17:50] :-P
[17:56] 's one of the things whose continued absence pisses me off. I don't really have time or inclination to work on it, but I have even less of both to keep not having it. Sigh.
[19:23] vila: Just dropped what I could do on it. Enjoy your piloting :)
[19:26] if revision x in branch a adds file c, and revision y in branch b adds file d, where d is actually descendant of c... is there a simple way to tell that to bzr after both commits have been made? (both mentioned revisions containing other unrelated changes as well)
[19:27] I'm not sure what you mean by "descendant". But the answer will be "no". Once you've made a revision in the current World Order, it's too late to go back and tell bzr more about it.
[19:30] except for uncommit?
[19:31] Well, semantics. Even uncommit doesn't change a revision, it just pops things off the top of the history and throws them away.
[19:34] There is occasional discussion about ways to _annotate_ information after the fact (specifically to this case, things like equivalences between separate files). But it's all speculative and far-future, which doesn't help anything now.
[19:47] well... how about cherry-picking creation of one file from a distant branch?
[19:47] You could do that. Being a cherry-pick, it wouldn't record any of the revision info. But it would use the same file-id.
[21:38] ah, was going to ask about the 'local oddities' fullermd ran into with the test suite, but I see it's in the history so I'll just read up.
[21:41] I'd be interested if lp:~gz/bzr/cleanup_testcases_by_collection_613247 helped though, I recall freebsd has some lower default resource limits than other nixes so leaks could hurt
[21:45] hi mgz
[21:46] mgz: if you're interested in testrepository, I would love to know if trunk still works on win32
[21:49] hm, looks more like 's just the client-server tests being dodgy still from the log
[21:49] lifeless: I'll pull and test.
[21:51] thanks!
[21:51] mgz: it's specifically 'testr run' I'm hacking on
[21:52] mgz: I've added a SIGPIPE fixup for unix
[21:52] but that may bork win32 subprocess invocations
[21:52] hm, some kind of import issue
[21:52] I wish the framework didn't hide that stuff
[21:52] ?
[21:52] ... File "C:\Python24\lib\unittest.py", line 532, in loadTestsFromName
[21:52] parent, obj = obj, getattr(obj, part)
[21:52] AttributeError: 'module' object has no attribute 'tests'
[21:52] not useful.
[21:52] ah yes
[21:52] I filed a bug on python
[21:52] and there is one on testtools too
[21:52] workaround
[21:53] python
[21:53] import testrepository
[21:53] yup.
[21:53] import testrepository.tests
[21:53] ...
[21:54] * lifeless wags on testrepository.ui.cli
[21:54] what are the dependencies explicitly?
[21:55] I'm failing to find them documented anywhere.
[21:55] cat INSTALL.txt
[21:55] doh.
[21:55] okay, I'll fix up my python path and go again.
[21:55] the autotools have kindof devalued that file
[21:55] which is a bit sad
[22:01] ...
[22:01] File "C:\bzr\testscenarios\lib\testscenarios\scenarios.py", line 26, in ?
[22:01] from itertools import (
[22:01] ImportError: cannot import name product
[22:01] will fix that and see what else I hit.
[22:18] so, your sigpipe fixup also wasn't valid 2.4 syntax, but that and a few other things look like easy fixes.
[22:18] 17 failures, a bunch look quoting related.
[22:19] heh
[22:19] some rename-over-existing-file ones.
[22:19] feel like filing bugs?
[22:19] only if I can't fix 'em.
[22:19] cool
[22:20] well I'm mid rewrite of run to do parallel test invocation
[22:24] hm, so, could either fix the quoting in commands.run, or make it use a subprocess pipe rather than a shell pipe. you got any strong feelings there?
[22:25] that's one of the things I'm mid rewrite on
[22:25] okay, will leave that one for the mo.
[22:25] a test with a tempdir that needs quoting in would be worth adding though.
[22:25] the shell=True is for shell expansion
[22:26] mgz: please do
[22:32] mgz: Yeah, it's not a resource limit. More something squirrel with the sockets. Race or sumfin'.
[23:38] wow python-dev has had a lot of long pointless threads of late.
[23:39] yeah
[23:39] win
[23:39] lifeless: I hate DocTestMatches, it's awful. Down to six failures though, four of which are quoting related.
[23:40] mgz: :(
[23:40] mgz: I'd welcome a more powerful full text matcher
[23:40] fullermd: I'd be interested if you give that branch a run anyway, though it probably won't help. I'll do yours in turn to check you've not regressed anything.
[23:41] lifeless:
[23:41] FAIL: testrepository.tests.test_testr.TestExecuted.test_runs_and_returns_run_argv_no_args
[23:41] AssertionError: Match failed. Matchee: "True True True
[23:41] ['b:\\temp\\tmprbtwer\\testr']
[23:41] "
[23:41] Matcher: DocTestMatches('True True True\n[...]\n', flags=8)
[23:41] Difference: Expected:
[23:42] I don't see why that's a mismatch, and the output sucks.
[23:42] it possibly has an extra newline off the end? who knows.
[23:43] the test_bdist was even worse because it prints pages of text every time.
[23:45] I don't think the Matcher scheme works well with more than one line of text, the top thingy needs to know about every leaf object to not spam you.
[23:51] mgz: it should have printed a difflike thing
[23:53] but the matchee is always printed anyway.
[23:54] `assertThat(some_giant_thing, WhateverMatcher)` brings pain.
[23:54] mgz: that can be changed
[23:54] mgz: OTOH it's good to be able to see definitional errors
[23:55] it's good to be able to see more than the tail of the last failing test's output.
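(Editor's sketch.) The mismatch mgz pasted at 23:41 can be probed outside the test suite with the standard library alone, under the assumption that DocTestMatches compares text the way doctest's OutputChecker does. The flags=8 in the paste is the value of doctest.ELLIPSIS. The expected string below is copied from the paste; the three candidate outputs are guesses for illustration, including the extra-newline theory from 23:42, and the helper name probe is made up.

```python
import doctest

# Expected text and flag value as shown in the pasted failure.
WANT = "True True True\n[...]\n"
FLAGS = doctest.ELLIPSIS  # numerically 8, matching "flags=8" in the paste

# Candidate outputs are hypotheses, not data from the failing run.
CANDIDATES = {
    "as printed": "True True True\n['b:\\\\temp\\\\tmprbtwer\\\\testr']\n",
    "extra trailing newline": "True True True\n['b:\\\\temp\\\\tmprbtwer\\\\testr']\n\n",
    "CRLF line endings": "True True True\r\n['b:\\\\temp\\\\tmprbtwer\\\\testr']\r\n",
}


def probe(want=WANT, flags=FLAGS, candidates=CANDIDATES):
    """Return {label: bool} saying whether each candidate matches want."""
    checker = doctest.OutputChecker()
    return {
        label: checker.check_output(want, got, flags)
        for label, got in candidates.items()
    }


if __name__ == "__main__":
    for label, matched in probe().items():
        print("%-24s %s" % (label, "match" if matched else "MISMATCH"))
```

Only the first candidate matches; either an extra trailing newline or CRLF line endings is enough to break the comparison. Which of these, if any, happened on mgz's machine is not established by the log.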