[07:29] <mgz> morning people!
[07:32]  * fullermd obviously isn't one   ;p
[07:33] <mgz> you're a category to yourself perhaps fullermd?
[07:37] <fullermd> I'm autoontological!
[08:21] <jam> mgz, vila, jelmer, jml: I put up a branch to handle: bug# 1046284
[08:21] <jam> bug #1046284
[08:21] <jam> In doing so, I'm starting a new branch for bug #1046697
[08:21] <jam> Which so far has found at least 2 bugs in lightweight checkouts with remote repositories.
[08:21] <jml> jam: cool :)
[08:21] <jam> So I'm trying to decide how much to split this up.
[08:22] <jam> Right now, I'm thinking to do 1 branch per small fix, and then have an overall integration branch that runs the full permutation tests.
[08:22] <jam> That should leave us with easy to review small steps, and a test suite that ensures all those are correct when we're done.
[08:22] <jam> vila,mgz,jelmer: Would you prefer I wait until I've fixed all the holes and have one major branch to review?
[08:22] <jam> also, the initial fix is targeting 2.5
[08:23] <jam> however, all these cleanup fixes could just be done in bzr.dev if we feel that is a 'safer' way to do it.
[08:23] <jam> Thoughts?
[08:24] <vila> out of the blue, I'd say: target a final branch with all permutations passing and do as many intermediate branches as you feel are small enough to review ? So basically, you're already doing that, go for it ;)
[08:24] <vila> now, if you hesitate about 2.5 vs dev:
[08:25] <vila> if 2.5 is deployed on lp, you'll get bug reports soon enough and would need to fix those bugs anyway, so go for 2.5
[08:25] <vila> the alternative being rolling back from 2.5 on lp, I don't think you'll hesitate long ;)
[08:26] <mgz> I don't mind seeing small intermediate branches, even if there's a rollup that's what lands in the end
[08:26] <jam> vila: so far they are all client side fixes
[08:26] <jam> so while the initial bug is exposed by bzr-2.5 on the server, it is only the client that needs updating.
[08:26] <jam> Anyway, I'm happy to improve our test coverage and fix the holes that we've had a long time.
[08:26] <jam> They aren't strictly regressions.
[08:27] <jam> But meh, targeting 2.5 is cheap and easy.
[08:27] <jam> If it was "we should target 2.1" I might fuss a bit.
[08:27] <vila> client only but uncovered because lp has been upgraded, so to me, that's still part of the 2.5 experience, even for the clients
[08:28] <vila> if < 2.5 clients encounter the issue, well, they'll have to upgrade
[08:28] <jam> vila: so for 'get_file_text' it won't impact anyone who is running <2.5 because they won't try the new rpc
[08:29] <jam> for the other bugs... I've only uncovered 1, which is that "WT.bzrdir.sprout()" is broken if you have a RemoteBranch.
[08:29] <jam> but I'm sure we've had it for a while.
[08:29] <vila> pretty good then
[08:29] <jam> And upgrading LP would have triggered it.
[08:29] <jam> I'm assuming I'll run into more bugs, though, as part of updating the test suite
[08:29] <vila> clarification: lp has been upgraded or not ?
[08:30] <jam> vila: lp has been upgraded
[08:30] <jam> we added 1 new rpc, which triggers the bug in a bzr-2.5.1+ client calling WT.get_file_text()
[08:30] <vila> to which version ? lp:bzr/2.5 or still with lp specific additional fixes ?
[08:30] <jam> in a lightweight checkout of a remote repo.
[08:30] <jam> vila: have to ask jelmer, but I think just stock 2.5.1
[08:30] <mgz> it's a pretty specific thing, that most people haven't done because the performance was terrible
[08:31] <mgz> now the performance isn't so terrible, but it breaks
[08:33] <vila> mgz: but isn't it the recommended emacs setup ?
[10:40] <mgz> have we got docs for the new way of doing :policy = appendpath stuff?
[10:40] <mgz> an example locations.conf that sets the push_location of a lp branch sensibly would be nice
[10:44] <mgz> ah, it's all under configuration-help
[11:05] <jelmer> grmbl, xchat seems to remove highlighting indication after disconnects
[11:05] <jelmer> jam: yes, 2.5.1
[11:08] <mgz> bug 1046773 confuses me
[11:08] <mgz> he claims this is against launchpad as a smart server, but I fixed bug 722416 in 2.4b1
[11:21] <jam> mgz: note that 'error' is any generic "We didn't know about this error in advance". It may not be MemoryError.
[11:21] <jam> mgz: also note that he is using 'bzr-2.5b1' so still a beta release.
[11:22] <jam> I would ask him to update to bzr-2.5.1 and confirm that the bug still happens.
[11:22] <jam> we could have changed an RPC verb between b1 and 2.5-final or something like that.
[11:27] <mgz> but as part of that fix, I made the smart server return the full error class name
[11:27] <mgz> which is all server side.
[11:28] <mgz> does launchpad actually keep bzr logs for smart server access? I can't find any.
[11:31] <mmrazik> hi
[11:31] <mmrazik> could somebody help with this stacktrace:
[11:31] <mmrazik> http://pastebin.ubuntu.com/1188779/
[11:31] <mmrazik> maybe a known error?
[11:31] <jam> mmrazik: so we recently upgraded bzr on Launchpad, such that it now closes the connection if you leave it idle for more than 5 minutes.
[11:32] <jam> I think if you either upgrade bzrlib locally, it will auto-reconnect.
[11:32] <jam> Or just reconnect yourself.
[11:32] <mmrazik> jam: thanks! let me try
[11:32] <jam> I think sidnei was mentioning earlier that tarmac opens a connection, runs the test suite, and then tries to commit it.
[11:33] <mmrazik> jam: yes. That's this case as well
[11:33] <mmrazik> jam: do you know what version of bzrlib I should use?
[11:33] <mmrazik> the box is something old (oneiric?)
[11:37] <jam> mmrazik: I'm sure the connection retry logic is in bzr-2.5
[11:37] <jam> I thought there was an argument at the time that we were going to backport it to bzr-2.1 (stable series)
[11:37] <jam> but I then was on rotation, so maybe that never happened?
[11:37] <jam> (i don't see a release-notes indicating the retry logic was in any older stable release)
[11:38] <jam> mgz, jelmer, vila: Do you remember what happened with the retry logic? Is my statement sound? (it is in 2.5, and we never backported it to a 2.1 release?)
[11:38] <jelmer> jam: Yeah, I remember us backporting it to at least 2.2 too
[11:38] <jelmer> but I'm not sure if we ever did a release with those changes
[11:39] <mmrazik> how can I reconnect manually? Can you point me to some docs?
[11:39] <jam> jelmer: Yeah, I think I did the work, but we wanted to wait to see how it sorted out.
[11:39] <jam> mmrazik: https://launchpad.net/~bzr/+archive/ppa can get you a newer bzr for oneiric
[11:39] <mmrazik> jam: thanks
[11:39]  * mmrazik tries
[11:39] <jam> ppa:bzr/ppa I believe
[11:39] <jam> mmrazik: for the manual reconnect, I would just say re-open the branch object.
[11:39] <jam> But I don't know the internals of tarmac all that well.
[11:40] <mmrazik> I'll try the ppa first
[11:40] <jam> ah, I think I know what is happening, and what we didn't expect.
[11:40] <jam> the server is closing the ssh connection, and the ssh subprocess notices, cleans up and goes away
[11:40] <jam> and then bzr tries to talk to the ssh process that isn't there anymore.
[11:41] <jam> which is why we are getting EPIPE instead of ConnectionReset.
[11:41] <mmrazik> that explains the EPIPE
[11:41] <mmrazik> yep
[11:41] <jam> mmrazik: if you want to do a quick test, you might try forcing "BZR_SSH=paramiko" in the environment.
[11:41] <jam> And see if that triggers ConnectionReset instead.
[11:41] <jam> Just as a 'my hypothesis is sound' :)
[11:42] <jam> then again, you may have credentials with openssh, and nothing will work with paramiko
[11:42] <mmrazik> what is paramiko btw?
[11:42] <mmrazik> I'm trying it right now...
[11:44] <jam> mmrazik: paramiko is a python ssh implementation.
[11:45] <jam> (we especially use it on Windows where openssh is likely not to be present.)
[11:53] <mmrazik> jam: with paramiko it seems to be stuck at this stage:
[11:54] <mmrazik> 26057kB     0kB/s | Revert phase:Apply phase:adding file 0/1
[12:04] <mmrazik> mhm... the same stack trace with bzrlib 2.5.1 :-/
[12:07] <jam> mmrazik: still EPIPE? strange, as there shouldn't be a subprocess for paramiko.
[12:07] <jam> ah, you mean with 2.5.1
[12:07] <mmrazik> jam: yes, EPIPE with 2.5.1
[12:07] <jam> mgz, jelmer: https://code.launchpad.net/~jameinel/bzr/2.5-remote-wt-tests-1046697/+merge/123061 is a rollup of all the fixes for lightweight checkouts and a remote repository.
[12:08] <jam> it ended up only 459 lines, so you might just review that.
[12:14] <mmrazik> jam: if I have bzrlib.branch object, what can I do to re-connect
[12:14]  * mmrazik is trying to read the API but doesn't find anything
[12:14] <jam> mmrazik: I think there are 2 things to try, depending on how comfortable you are hacking python code and testing it.
[12:15] <mmrazik> let's try :)
[12:15] <jam> I think one issue is that we are using a socket_pair rather than a pipe to communicate to the subprocess, so we didn't expect EPIPE.
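
A minimal sketch of why EPIPE can show up even when no pipe is involved, assuming a Unix-like platform and Python 3 (this is not bzrlib code, and the function name is made up for illustration). Writing to one end of a `socket.socketpair()` after the other end has been closed fails with a broken-pipe error, much as bzr would see once the ssh subprocess has gone away:

```python
import errno
import socket


def write_after_peer_close():
    """Return the errno raised when writing to a socketpair whose peer is gone.

    Hypothetical helper for illustration only.
    """
    ours, theirs = socket.socketpair()
    try:
        # Stand-in for the ssh subprocess exiting and closing its end.
        theirs.close()
        try:
            ours.sendall(b"hello")
        except OSError as e:
            return e.errno
        # No error was raised (not expected on a Unix-like platform).
        return None
    finally:
        ours.close()


if __name__ == "__main__":
    err = write_after_peer_close()
    # Print the symbolic name of the error code, if it has one.
    print(errno.errorcode.get(err, err))
```

Code that only checks for `errno.ECONNRESET` would not recognise this failure as a dropped connection.
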
[12:15] <jam> mmrazik: so to just reconnect manually, you can do "mybranch = mybranch.bzrdir.open_branch()" I think.
[12:16] <jam> that would be at the tarmac level.
[12:16] <mmrazik> yep
[12:16] <mmrazik> just trying that
[12:16] <mmrazik> I need at least some quick hack to make the system work
[12:18] <jam> at the bzrlib level, I would probably do something like: http://pastebin.ubuntu.com/1188861/
[12:18] <jam> I think the issue is that the retry logic is there, but it isn't treating EPIPE as a connectionreset failure.
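
The idea of treating EPIPE like a connection reset can be sketched as below. This is a generic Python 3 illustration, not the contents of the pastebin or the actual bzrlib retry code; `call_with_reconnect`, `RESET_ERRNOS`, `request` and `reconnect` are all hypothetical names:

```python
import errno

# Assumption: the errno values to treat as "the connection went away".
# The set bzrlib actually uses is not shown in this conversation.
RESET_ERRNOS = frozenset([errno.ECONNRESET, errno.EPIPE])


def call_with_reconnect(request, reconnect, max_retries=1):
    """Run request(); on a reset-like OSError, call reconnect() and retry.

    `request` and `reconnect` are hypothetical callables supplied by the
    caller. Any other error, or a failure after max_retries, is re-raised.
    """
    attempts = 0
    while True:
        try:
            return request()
        except OSError as e:
            if e.errno not in RESET_ERRNOS or attempts >= max_retries:
                raise
            attempts += 1
            reconnect()
```

At the tarmac level, `reconnect` might be a function that re-opens the branch along the lines jam suggests, though that is untested here.
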
[12:20] <jam> mmrazik: I would guess the traceback is slightly different (the line numbers at least don't match up) can you paste the new traceback?
[12:21] <mmrazik> jam: http://pastebin.ubuntu.com/1188864/
[12:23] <jam> mmrazik: hmm... it looks like it is actually retrying at that point...
[12:23] <jam> the failure is occurring at line 13 of: http://pastebin.ubuntu.com/1188866/
[12:23] <jam> note that we've caught a ConnectionReset and are retrying the request.
[12:25] <jam> mmrazik: my wife is giving me the "I've been ready to leave for 15 minutes" look, so I have to go now, but I'm happy to work with you more tomorrow.
[12:25] <mmrazik> jam: thanks!
[12:25] <jam> mgz, jelmer: ^^ if you can get a chance to follow up on this. We *might* need to rollback bzr-2.5 but hopefully we can do a client side fix instead.
[12:25] <jelmer> jam: *nod*
[12:50] <mgz> ...I'm confused by the status of this reconnect bug,
[12:50] <mgz> it's paramiko only, or for any ssh connection?
[12:50] <mmrazik> mgz: I tried paramiko but it didn't really help. The stack trace went away but tarmac "hung" at a certain stage
[12:51] <mmrazik> so it's any ssh connection
[12:51] <mmrazik> I'm just trying paramiko with the newest bzrlib
[12:51] <mmrazik> to see if there is a change