=== r0bby is now known as robbyoconnor | ||
mgz | morning people! | 07:29 |
---|---|---|
* fullermd obviously isn't one ;p | 07:32 | |
mgz | you're a category to yourself perhaps fullermd? | 07:33 |
fullermd | I'm autoontological! | 07:37 |
jam | mgz, vila, jelmer, jml: I put up a branch to handle: bug# 1046284 | 08:21 |
jam | bug #1046284 | 08:21 |
ubot5 | Launchpad bug 1046284 in Bazaar "TooManyConcurrentRequests when committing to lightweight checkout" [Undecided,In progress] https://launchpad.net/bugs/1046284 | 08:21 |
jam | In doing so, I'm starting a new branch for bug #1046697 | 08:21 |
ubot5 | Launchpad bug 1046697 in Bazaar "missing integration tests of lightweight checkout and remote repository" [Medium,Triaged] https://launchpad.net/bugs/1046697 | 08:21 |
jam | Which so far has found at least 2 bugs in lightweight checkouts with remote repositories. | 08:21 |
jml | jam: cool :) | 08:21 |
jam | So I'm trying to decide how much to split this up. | 08:21 |
jam | Right now, I'm thinking to do 1 branch per small fix, and then have an overall integration branch that runs the full permutation tests. | 08:22 |
jam | That should leave us with easy to review small steps, and a test suite that ensures all those are correct when we're done. | 08:22 |
jam | vila,mgz,jelmer: Would you prefer I wait until I've fixed all the holes and have one major branch to review? | 08:22 |
jam | also, the initial fix is targeting 2.5 | 08:22 |
jam | however, all these cleanup fixes could just be done in bzr.dev if we feel that is a 'safer' way to do it. | 08:23 |
jam | Thoughts? | 08:23 |
vila | out of the blue, I'd say: target a final branch with all permutations passing and do as many intermediate branches as you feel are small enough to review? So basically, you're already doing that, go for it ;) | 08:24 |
vila | now, if you hesitate about 2.5 vs dev: | 08:24 |
vila | if 2.5 is deployed on lp, you'll get bug reports soon enough and would need to fix those bugs anyway, so go for 2.5 | 08:25 |
vila | the alternative being rolling back from 2.5 on lp, I don't think you'll hesitate long ;) | 08:25 |
mgz | I don't mind seeing small intermediate branches, even if there's a rollup that's what lands in the end | 08:26 |
jam | vila: so far they are all client side fixes | 08:26 |
jam | so while the initial bug is exposed by bzr-2.5 on the server, it is only the client that needs updating. | 08:26 |
jam | Anyway, I'm happy to improve our test coverage and fix the holes that we've had a long time. | 08:26 |
jam | They aren't strictly regressions. | 08:26 |
jam | But meh, targeting 2.5 is cheap and easy. | 08:27 |
jam | If it was "we should target 2.1" I might fuss a bit. | 08:27 |
vila | client only but uncovered because lp has been upgraded, so to me, that's still part of the 2.5 experience, even for the clients | 08:27 |
vila | if < 2.5 clients encounter the issue, well, they'll have to upgrade | 08:28 |
jam | vila: so for 'get_file_text' it won't impact anyone who is running <2.5 because they won't try the new rpc | 08:28 |
jam | for the other bugs... I've only uncovered 1, which is that "WT.bzrdir.sprout()" is broken if you have a RemoteBranch. | 08:29 |
jam | but I'm sure we've had it for a while. | 08:29 |
vila | pretty good then | 08:29 |
jam | And upgrading LP would have triggered it. | 08:29 |
jam | I'm assuming I'll run into more bugs, though, as part of updating the test suite | 08:29 |
vila | clarification: lp has been upgraded or not ? | 08:29 |
jam | vila: lp has been upgraded | 08:30 |
jam | we added 1 new rpc, which triggers the bug in a bzr-2.5.1+ client calling WT.get_file_text() | 08:30 |
vila | to which version ? lp:bzr/2.5 or still with lp specific additional fixes ? | 08:30 |
jam | in a lightweight checkout of a remote repo. | 08:30 |
jam | vila: have to ask jelmer, but I think just stock 2.5.1 | 08:30 |
mgz | it's a pretty specific thing, that most people haven't done because the performance was terrible | 08:30 |
mgz | now the performance isn't so terrible, but it breaks | 08:31 |
vila | mgz: but isn't it the recommended emacs setup ? | 08:33 |
mgz | have we got docs for the new way of doing :policy = appendpath stuff? | 10:40 |
mgz | an example locations.conf that sets the push_location of a lp branch sensibly would be nice | 10:40 |
mgz | ah, is all under configuration-help | 10:44 |
jelmer | grmbl, xchat seems to remove highlighting indication after disconnects | 11:05 |
jelmer | jam: yes, 2.5.1 | 11:05 |
mgz | bug 1046773 confuses me | 11:08 |
ubot5 | Launchpad bug 1046773 in Bazaar "can't push into new project" [Undecided,New] https://launchpad.net/bugs/1046773 | 11:08 |
mgz | he claims this is against launchpad as a smart server, but I fixed bug 722416 in 2.4b1 | 11:08 |
ubot5 | Launchpad bug 722416 in Bazaar "Smart server transmits MemoryError as ('error', '')" [High,Fix released] https://launchpad.net/bugs/722416 | 11:08 |
jam | mgz: note that 'error' is any generic "We didn't know about this error in advance". It may not be MemoryError. | 11:21 |
jam | mgz: also note that he is using 'bzr-2.5b1' so still a beta release. | 11:21 |
jam | I would ask him to update to bzr-2.5.1 and confirm that the bug still happens. | 11:22 |
jam | we could have changed an RPC verb between b1 and 2.5-final or something like that. | 11:22 |
mgz | but as part of that fix, I made the smart server return the full error class name | 11:27 |
mgz | which is all server side. | 11:27 |
mgz | does launchpad actually keep bzr logs for smart server access? I can't find any. | 11:28 |
mmrazik | hi | 11:31 |
mmrazik | could somebody help with this stacktrace: | 11:31 |
mmrazik | http://pastebin.ubuntu.com/1188779/ | 11:31 |
mmrazik | maybe a known error? | 11:31 |
jam | mmrazik: so we recently upgraded bzr on Launchpad, such that it now closes the connection if you leave it idle for more than 5 minutes. | 11:31 |
jam | I think if you either upgrade bzrlib locally, it will auto-reconnect. | 11:32 |
jam | Or just reconnect yourself. | 11:32 |
mmrazik | jam: thanks! let me try | 11:32 |
jam | I think sidnei was mentioning earlier that tarmac opens a connection, runs the test suite, and then tries to commit it. | 11:32 |
mmrazik | jam: yes. That's this case as well | 11:33 |
mmrazik | jam: do you know what version of bzrlib should I use? | 11:33 |
mmrazik | the box is something old (oneiric?) | 11:33 |
jam | mmrazik: I'm sure the connection retry logic is in bzr-2.5 | 11:37 |
jam | I thought there was an argument at the time that we were going to backport it to bzr-2.1 (stable series) | 11:37 |
jam | but I then was on rotation, so maybe that never happened? | 11:37 |
jam | (i don't see a release-notes indicating the retry logic was in any older stable release) | 11:37 |
jam | mgz, jelmer, vila: Do you remember what happened with the retry logic? Is my statement sound? (it is in 2.5, and we never backported it to a 2.1 release?) | 11:38 |
jelmer | jam: Yeah, I remember us backporting it to at least 2.2 too | 11:38 |
jelmer | but I'm not sure if we ever did a release with those changes | 11:38 |
mmrazik | how can I reconnect manually? Can you point me to some docs? | 11:39 |
jam | jelmer: Yeah, I think I did the work, but we wanted to wait to see how it sorted out. | 11:39 |
jam | mmrazik: https://launchpad.net/~bzr/+archive/ppa can get you a newer bzr for oneiric | 11:39 |
mmrazik | jam: thanks | 11:39 |
* mmrazik tries | 11:39 | |
jam | ppa:bzr/ppa I believe | 11:39 |
jam | mmrazik: for the manual reconnect, I would just say re-open the branch object. | 11:39 |
jam | But I don't know the internals of tarmac all that well. | 11:39 |
mmrazik | I'll try the ppa first | 11:40 |
jam | ah, I think I know what is happening, and what we didn't expect. | 11:40 |
jam | the server is closing the ssh connection, and the ssh subprocess notices, cleans up and goes away | 11:40 |
jam | and then bzr tries to talk to the ssh process that isn't there anymore. | 11:40 |
jam | which is why we are getting EPIPE instead of ConnectionReset. | 11:41 |
mmrazik | that explains the EPIPE | 11:41 |
mmrazik | yep | 11:41 |
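To illustrate the EPIPE explanation above, here is a minimal Python 3 sketch. It assumes a local socket pair stands in for the channel between bzr and the ssh subprocess; the function name is hypothetical and none of this is bzrlib code. Once the peer end has been closed, a write on the surviving end typically fails with EPIPE on Linux.

```python
import errno
import socket


def errno_after_peer_exit():
    """Write to a socket pair whose other end has gone away.

    Returns the symbolic errno name of the failure, or None if the
    write unexpectedly succeeded.
    """
    ours, theirs = socket.socketpair()
    # Stands in for the ssh subprocess noticing the disconnect and exiting.
    theirs.close()
    try:
        ours.send(b"hello")
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        ours.close()
    return None


if __name__ == "__main__":
    print(errno_after_peer_exit())
```

The point of the sketch is only that the local write fails because the local peer is gone, which is a different failure from a reset reported by the remote server.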
jam | mmrazik: if you want to do a quick test, you might try forcing "BZR_SSH=paramiko" in the environment. | 11:41 |
jam | And see if that triggers ConnectionReset instead. | 11:41 |
jam | Just as a 'my hypothesis is sound' :) | 11:41 |
jam | then again, you may have credentials with openssh, and nothing will work with paramiko | 11:42 |
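A small Python sketch of forcing the ssh vendor as suggested above. The `BZR_SSH` variable and the `paramiko` value come from the chat; the helper name `env_with_ssh_vendor` is hypothetical.

```python
import os


def env_with_ssh_vendor(vendor, base=None):
    """Return a copy of the environment with BZR_SSH set to `vendor`.

    `base` defaults to the current process environment; the original
    mapping is left untouched.
    """
    env = dict(os.environ if base is None else base)
    env["BZR_SSH"] = vendor
    return env
```

The returned mapping can be passed as the `env=` argument of `subprocess.call` when launching whatever process drives bzr, so the setting applies to that run only.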
mmrazik | what is paramiko btw? | 11:42 |
mmrazik | I'm trying it right now... | 11:42 |
jam | mmrazik: paramiko is a python ssh implementation. | 11:44 |
jam | (we especially use it on Windows where openssh is likely not to be present.) | 11:45 |
mmrazik | jam: with paramiko it seems to be stuck at this stage: | 11:53 |
mmrazik | 26057kB 0kB/s | Revert phase:Apply phase:adding file 0/1 | 11:54 |
mmrazik | mhm... the same stack trace with bzrlib 2.5.1 :-/ | 12:04 |
jam | mmrazik: still EPIPE? strange, as there shouldn't be a subprocess for paramiko. | 12:07 |
jam | ah, you mean with 2.5.1 | 12:07 |
mmrazik | jam: yes, EPIPE with 2.5.1 | 12:07 |
jam | mgz, jelmer: https://code.launchpad.net/~jameinel/bzr/2.5-remote-wt-tests-1046697/+merge/123061 is a rollup of all the fixes for lightweight checkouts and a remote repository. | 12:07 |
jam | it ended up only 459lines, so you might just review that. | 12:08 |
mmrazik | jam: if I have bzrlib.branch object, what can I do to re-connect | 12:14 |
* mmrazik is trying to read the API but doesn't find anything | 12:14 | |
jam | mmrazik: I think there are 2 things to try, depending on how comfortable you are hacking python code and testing it. | 12:14 |
mmrazik | lets try :) | 12:15 |
jam | I think one issue is that we are using a socket_pair rather than a pipe to communicate to the subprocess, so we didn't expect EPIPE. | 12:15 |
jam | mmrazik: so to just reconnect manually, you can do "mybranch = mybranch.bzrdir.open_branch()" I think. | 12:15 |
jam | that would be at the tarmac level. | 12:16 |
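A hedged sketch of how the manual reconnect might look at the tarmac level. The only call taken from the chat is `branch.bzrdir.open_branch()`; the class name, the 4-minute margin below the 5-minute idle timeout, and the injectable clock are assumptions made for illustration. Whether reopening really establishes a new connection is jam's suggestion and is not verified here. The sketch is written for Python 3; bzrlib of this era ran on Python 2, where `time.time` would replace `time.monotonic`.

```python
import time

# Assumed safety margin below the 5-minute idle timeout mentioned in the chat.
DEFAULT_IDLE_LIMIT = 4 * 60


class ReopeningBranch(object):
    """Hand out a branch object, reopening it after a long idle period."""

    def __init__(self, branch, idle_limit=DEFAULT_IDLE_LIMIT,
                 clock=time.monotonic):
        self._branch = branch
        self._idle_limit = idle_limit
        self._clock = clock
        self._last_used = clock()

    def get(self):
        """Return a branch that has not been idle longer than the limit."""
        now = self._clock()
        if now - self._last_used >= self._idle_limit:
            # Re-open through the control directory, as suggested in the chat.
            self._branch = self._branch.bzrdir.open_branch()
        self._last_used = now
        return self._branch
```

A caller that runs a long test suite between opening the branch and committing would call `get()` just before the commit instead of holding on to the original object.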
mmrazik | yep | 12:16 |
mmrazik | just trying that | 12:16 |
mmrazik | I need at least some quick hack to make the system work | 12:16 |
jam | at the bzrlib level, I would probably do something like: http://pastebin.ubuntu.com/1188861/ | 12:18 |
jam | I think the issue is that the retry logic is there, but it isn't treating EPIPE as a connectionreset failure. | 12:18 |
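The idea of treating EPIPE like a connection reset can be sketched in plain Python 3. This is not the content of the pastebin above and not bzrlib's actual retry code; `call_with_reconnect`, `RESET_ERRNOS`, and the `request`/`reconnect` callables are hypothetical names introduced for illustration.

```python
import errno

# errno values treated as "the connection went away". Including EPIPE is the
# hypothesis discussed in the chat, not confirmed bzrlib behaviour.
RESET_ERRNOS = frozenset([errno.ECONNRESET, errno.EPIPE])


def call_with_reconnect(request, reconnect, retries=1):
    """Run request(); on a reset-like OSError, reconnect() and try again.

    Errors with any other errno, or failures after `retries` reconnects,
    are re-raised unchanged.
    """
    attempt = 0
    while True:
        try:
            return request()
        except OSError as exc:
            if exc.errno not in RESET_ERRNOS or attempt >= retries:
                raise
            attempt += 1
            reconnect()
```

In the tarmac case, `reconnect` could be the branch reopen jam suggested earlier and `request` the commit, but that wiring is an assumption.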
jam | mmrazik: I would guess the traceback is slightly different (the line numbers at least don't match up) can you paste the new traceback? | 12:20 |
mmrazik | jam: http://pastebin.ubuntu.com/1188864/ | 12:21 |
jam | mmrazik: hmm... it looks like it is actually retrying at that point... | 12:23 |
jam | the failure is occurring at line 13 of: http://pastebin.ubuntu.com/1188866/ | 12:23 |
jam | note that we've caught a ConnectionReset and are retrying the request. | 12:23 |
jam | mmrazik: my wife is giving me the "I've been ready to leave for 15 minutes" look, so I have to go now, but I'm happy to work with you more tomorrow. | 12:25 |
mmrazik | jam: thanks! | 12:25 |
jam | mgz, jelmer: ^^ if you can get a chance to follow up on this. We *might* need to rollback bzr-2.5 but hopefully we can do a client side fix instead. | 12:25 |
jelmer | jam: *nod* | 12:25 |
mgz | ...I'm confused by the status of this reconnect bug, | 12:50 |
mgz | it's paramiko only, or for any ssh connection? | 12:50 |
mmrazik | mgz: I tried paramiko but it didn't really help. The stack trace went away but tarmac "hung" at a certain stage | 12:50 |
mmrazik | so it's any ssh connection | 12:51 |
mmrazik | I'm just trying paramiko with the newest bzrlib | 12:51 |
mmrazik | to see if there is a change | 12:51 |
=== deryck is now known as deryck[lunch] | ||
=== deryck[lunch] is now known as deryck | ||
=== yofel_ is now known as yofel | ||