/srv/irclogs.ubuntu.com/2012/09/07/#bzr.txt

=== jordan_ is now known as jordan
=== spm is now known as stevemci
=== stevemci is now known as spm
jammmrazik: feel free to ping me when you come back online06:55
jamI'm guessing paramiko + tarmac is actually trying to prompt for a password/etc which is why it is hanging. (Since you probably have credentials set for openssh that paramiko might not know about.)06:56
jamwe can debug that if we want, but given it looks like the connection is being retried and is *still* failing, I think we need to dig deeper.06:56
jamSo I'd like to debug it in a bit more hands-on way.06:57
mmrazikjam: ok. Let me try to re-create the setup. Will take me a few minutes.07:10
mmrazikjam: so this is where I'm right now: http://pastebin.ubuntu.com/1190380/08:01
mmrazikjam: but I'll be on a phone for a while now08:01
mgzmorning!08:02
vilahi mgz08:03
mgzhey vila08:07
jammmrazik: k, I'll be doing lunch and digging into some lp stuff for a bit, but I'll try to be responsive when you get back.08:16
mmrazikjam: is there something I should try now?08:16
mmrazikI'm a bit stuck TBH08:16
jammmrazik: do you know what version of tarmac you are running (just to try to set things up similarly here)08:17
jammmrazik: line numbers in your traceback don't quite match up to tarmac trunk, but you might be able to do something like: http://paste.ubuntu.com/1190405/08:35
jamah, you might need to do both branches, 1 sec08:36
jamhttp://paste.ubuntu.com/1190407/08:39
jammmrazik: ^^ should re-open both branches, creating new connections. at least as a stop-gap. I'd like to fix bzrlib, though, if you don't mind helping me investigate.08:39
=== lifeless_ is now known as lifeless
mmrazikjam: I'm on it now08:43
mmrazikjam: regarding tarmac -- it's unfortunately a custom tarmac extension I didn't even write08:43
mmraziklet me check if it is somewhere on bzr08:43
mmrazikbut the setup is fairly complex and requires jenkins08:43
mmrazikthe extension is some jenkins pre-commit logic and that is also why it fails. It waits for the jenkins job to finish and only then commits.08:44
mmrazikjam: I think the easiest way to reproduce will be to create some custom "sleep 420" pre-commit hook08:44
jammmrazik: is this the same one that sidnei was looking at recently?08:45
jam(not sure which team you're on)08:45
mmrazikjam: I don't know but we are different teams (and this one was written by yet another team)08:45
mmrazikfor me this tarmac stuff is almost end of life and I want to get rid of it08:45
mmrazikit's just some legacy I had to maintain08:46
jammmrazik: what are you switching to?08:46
mmrazikjam: more jenkins driven approach. where the logic is in jenkins.08:46
mmrazikit also scales better because jenkins can schedule build slaves08:46
mmrazikright now tarmac must be running on the same node where the jenkins job runs08:46
mmrazikanyway... going to patch tarmac with the patch you provided08:47
mmrazikpatched/running08:49
mmrazikjam: I believe it is this one: https://code.launchpad.net/~didrocks/tarmac/tarmac-jenkins08:50
mmrazikbut as I said there should be a simpler way to reproduce08:50
jammmrazik: seeing if I can reproduce it trivially.08:57
mmrazikjam: the tarmac patch you provided didn't help :-/09:00
mmrazikhttp://pastebin.ubuntu.com/1190430/09:01
mmrazikAFAICT it now fails in the  "source.bzr_branch = source.bzr_branch.bzrdir.open_branch()" which I just added09:02
=== mmrazik is now known as mmrazi|otp
jammmrazi|otp: http://paste.ubuntu.com/1190441/09:08
jamis another patch you can try when you get back.09:08
jammgz: poke09:09
mmrazi|otpjam: running it09:32
=== mmrazi|otp is now known as mmrazik|lunch
=== mmrazik|lunch is now known as mmrazik
mmrazikjam: still no luck :-/ http://pastebin.ubuntu.com/1190559/10:43
jammmrazik: the traceback shows it isn't the new code: source.bzr_branch = source.bzr_branch.bzrdir.open_branch()10:43
mmrazikjam... argh... sorry. I didn't apply it correctly10:44
mmrazikjam: yes. Just looking at it10:44
mmrazikjam: looks better now. There is still a stacktrace but I think it's because the tarmac user is not allowed to push into the branch10:54
mmrazikI'm now trying with the real thing10:58
mmrazikjam: ack. it works with the tarmac patch.11:02
jammmrazik: so that at least gets you up and running again.11:02
jamI'm trying to see if I can reproduce here. The 5-min wait to test is a bit annoying.11:02
mmrazikjam: yep. Many thanks for the help.11:02
jamI think I tried paramiko, and found it hangs at the point of reconnect.11:02
jamwhich might be what you saw.11:02
mmraziklet me know if you need some more help with this11:03
jamwell, I should know in about 200 more seconds if it reproduces locally.11:06
jammmrazik: :( it doesn't reproduce here, the retry works: http://paste.ubuntu.com/1190598/11:11
jam(that is seconds *10)11:11
jamat 5 min it gets the 'you're disconnected' from the server.11:11
jamat 35s, the client notices, and retries the connection.11:12
jamand successfully gets Branch.last_revision()11:12
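A minimal sketch of the retry behaviour jam describes here: the call fails because the server dropped the idle connection, the client reconnects, and the call is made again. `ConnectionReset`, `call_with_retry`, `call` and `reconnect` are hypothetical names for illustration, not bzrlib's actual API.

```python
class ConnectionReset(Exception):
    """Hypothetical stand-in for a 'connection was dropped' error."""


def call_with_retry(call, reconnect, retries=1):
    """Run call(); on ConnectionReset, reconnect and try again.

    `call` and `reconnect` are caller-supplied callables (assumptions
    for this sketch). After `retries` failed retries the error is
    re-raised so the caller sees the real failure.
    """
    for attempt in range(retries + 1):
        try:
            return call()
        except ConnectionReset:
            if attempt == retries:
                raise
            reconnect()
```

The retry count is kept small on purpose: if the reconnect itself keeps failing, it is more useful to surface the error than to loop.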
mgzhm. I wonder what's different.11:13
mmrazik:-/11:13
jammgz: well offhand I wouldn't expect EPIPE from a *socket* object, but the traceback clearly looks like it is failing while retrying, not failing in the initial request (and then failing to retry)11:38
jammgz: hmm.. right now I'm running on Windows, which uses actual pipes, rather than socketpair. I wonder if that matters.11:46
jammgz: can you run this on your machine: http://paste.ubuntu.com/1190654/11:46
jamand maybe you as well mmrazik ^^11:50
mgzjam: sure11:51
jamI can see that if I run BZR_SSH=paramiko, I don't see the stderr 'you have been disconnected' message.11:51
mmrazikjam: running11:53
mmrazikso far so good. just numbers11:53
mmrazikoh..11:53
mmrazikthat's expected :)11:53
jammmrazik: well, expected for 350s :)11:53
jammgz, mmrazik: weird, when running with paramiko, we end up looping on a socket.sendall trying to send 119 bytes, and we just keep failing.11:54
* mmrazik shour read the code before copy&pasting&running something11:54
mmraziks/shour/should/11:54
jamit gives us a "sent 0 bytes" in response, but doesn't actually give an error.11:54
jamI think we should probably have a check for 'if bytes sent == 0: EOF'11:54
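The check jam proposes could look roughly like the sketch below: a send loop that treats a 0-byte send as end-of-file instead of retrying forever. This is an illustration only; `ConnectionReset` and `send_all` are hypothetical names, and the real bzrlib fix may differ.

```python
class ConnectionReset(Exception):
    """Hypothetical stand-in for a 'connection was dropped' error."""


def send_all(sock, data):
    """Send all of `data` through sock.send(), one chunk at a time.

    A return value of 0 from send() is treated as a closed connection,
    so the loop cannot spin endlessly re-sending the same bytes.
    """
    view = memoryview(data)
    while len(view):
        sent = sock.send(view)
        if sent == 0:
            raise ConnectionReset("send() reported 0 bytes written")
        # Drop the bytes that were accepted and continue with the rest.
        view = view[sent:]
```

Raising immediately is one design choice; retrying a couple of times before giving up, as jam wonders about later, would be another.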
mmrazikjam: the code can reproduce the error11:58
mmrazikhttp://pastebin.ubuntu.com/1190681/11:58
jammmrazik: so... progress of a sort.11:58
=== mmrazik is now known as mmrazik|otp
jambug #104730912:03
ubot5Launchpad bug 1047309 in Bazaar "ssh paramiko loops endlessly sending 0 bytes" [High,Confirmed] https://launchpad.net/bugs/104730912:03
jammgz, jelmer: can you think if sock.send() can legitimately say "I couldn't send any content right now" without raising EINTR?12:04
jamI realize it returns the number of bytes written, but if it can't write *any* bytes, should we treat that as EOF immediately or should we try a couple times.12:04
jelmerjam: couldn't there be a buffer that's full, or something like that?12:07
jamjelmer: man send says: http://paste.ubuntu.com/1190698/12:09
jamit will block until it can send what you asked12:09
jamunless you are in non-blocking mode12:09
jambut then send should fail with EWOULDBLOCK12:09
jamMSG_NOSIGNAL (since Linux 2.2)12:10
jam       Requests not to send SIGPIPE on errors on stream oriented sockets when the other end breaks the connection.  The EPIPE error is still returned.12:10
jaminteresting.12:10
jamand we use blocking sockets (because when you set nonblocking it causes the smart server tests to fail)12:11
jammmrazik|otp: ok, in this particular case, it looks like it is getting EPIPE during the first send, not during the retry, so I think our code just isn't handling EPIPE as a connection reset12:12
jamI'll try to dig some more.12:12
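One way to handle EPIPE as a connection reset, sketched under assumptions: `ConnectionReset` and `send_or_reset` are hypothetical names, and treating exactly EPIPE and ECONNRESET as "reset" is a choice made for this example, not a description of the actual bzrlib patch.

```python
import errno


class ConnectionReset(Exception):
    """Hypothetical stand-in for a 'connection was dropped' error."""


# errno values this sketch treats as "the other end went away".
_RESET_ERRNOS = (errno.EPIPE, errno.ECONNRESET)


def send_or_reset(sock, data):
    """Call sock.sendall(data), mapping EPIPE/ECONNRESET to ConnectionReset.

    Any other OSError is re-raised unchanged, so unrelated failures
    are not silently turned into retries.
    """
    try:
        sock.sendall(data)
    except OSError as e:
        if e.errno in _RESET_ERRNOS:
            raise ConnectionReset(str(e)) from e
        raise
```

With the error mapped this way, the caller's existing retry-on-reset path can handle a broken pipe on the first send the same way it handles a reset.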
jammgz: can you confirm that it fails for you?12:12
mgzonesec12:22
mgzokay, running remotely, will tell you when it returns12:23
=== mmrazik|otp is now known as mmrazik
mgz3012:28
mgzConnection Timeout: disconnecting client after 300.0 seconds12:28
mgzand traceback at loop end.12:29
mgzsame as mmrazik.12:29
jammgz: k, I think I know the bug, and i'll put up a fix, can you run the fixed code in a sec.12:36
jammgz, mmrazik: If you are comfortable running bzr from source: lp:///~jameinel/bzr/2.5-conn-reset-socket-pipe-104732512:39
jamit doesn't have a test, but it should fix the problem12:39
jam(if it is that we aren't retrying at all.)12:39
mgzsure, I'll test that.12:39
mgzprobably just want the builddeps on this..12:39
jammgz: did you get a chance to test the branch?13:18
jamI also have: https://code.launchpad.net/~jameinel/bzr/2.5-unending-sendall-1047309/+merge/12326813:18
jamup for review.13:18
mgzjam:13:20
mgzConnection Timeout: disconnecting client after 300.0 seconds13:20
mgz3113:20
mgz3213:20
mgz3313:20
mgz3413:20
mgzConnectionReset calling 'Branch.last_revision_info', retrying13:20
jammgz: did it print the revision_id at the end?13:28
mgzso, will review other branch, and that fix looks good13:28
mgzjam: yup, the lack of traceback was the main thing :)13:29
jammgz: I think the fix is good, I'd like a test for it, so if you have ideas, I'm listening.13:30
jamI might get to it over the weekend, and then we should do 2.5.213:31
mgzI do wonder about if we've got the exception wrapping at the right level13:32
mgzthere are some tests that try to check connection reset stuff, but are a little unreliable as terminating a connection from one thread in a process to another thread is not actually the same as what really happens13:33
mgzthe short answer is you replace the underlying call to raise an exception we've observed it raising and make sure it propagates wrapped up netly13:34
mgz*neatly13:34
mgzbut a more real world test would be grand...13:34
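The testing approach mgz outlines (replace the underlying call with one that raises the observed exception, then check it comes out wrapped) might be sketched as follows. `ConnectionReset`, `call_with_wrapping` and the test class are hypothetical; this is not taken from the bzr test suite.

```python
import errno
import unittest


class ConnectionReset(Exception):
    """Hypothetical stand-in for a 'connection was dropped' error."""


def call_with_wrapping(send, data):
    """Code under test: wrap EPIPE/ECONNRESET from `send` in ConnectionReset."""
    try:
        return send(data)
    except OSError as e:
        if e.errno in (errno.EPIPE, errno.ECONNRESET):
            raise ConnectionReset(str(e)) from e
        raise


class TestWrapping(unittest.TestCase):

    def test_epipe_is_wrapped(self):
        # Stub for the underlying call: raises the error seen in the wild.
        def broken_send(data):
            raise OSError(errno.EPIPE, "Broken pipe")
        self.assertRaises(
            ConnectionReset, call_with_wrapping, broken_send, b"hello")

    def test_other_errors_pass_through(self):
        def bad_send(data):
            raise OSError(errno.EINVAL, "Invalid argument")
        self.assertRaises(OSError, call_with_wrapping, bad_send, b"hello")
```

A stub like this only checks the wrapping logic; as mgz notes, it does not show that a real dropped connection produces the same exception.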
awilkinsGah, why did I ever set things up with NTLM auth (answer : because most of my users are noobs and it's easier when it works...)13:34
awilkinsIn the position where I have a tree that SVN can check out fine (anonymously) but Bazaar can't branch it (fails the NTLM auth)13:35
awilkinsDoes Bazaar just use PyCurl if it's installed?13:38
awilkinsHmm, maybe not13:38
mgedminevery time I see "Aborting commit due to empty commit message." I feel that I ♥git13:38
mgedminyou're missing an opportunity here with that interactive roadblock13:38
mgzmgedmin: I'm not sure what you're referring to, but every time, and you've never got as far as just sending a patch?13:46
awilkinsEvery time I see a commit without a log message, I feel that I ☠☢☹ the annoying sod that committed it.14:21
jmlyou guys are going to force me to implement 'bzr branches --merged' aren't you?15:09
mgzare we?15:11
fullermdLook, I never _said_ I'd kill your puppy if you didn't...15:13
=== deryck is now known as deryck[lunch]
=== yofel_ is now known as yofel
=== deryck[lunch] is now known as deryck
mark06is it possible to make bazaar recognize mac newlines?23:05
mark06it's considering the whole file changed when no newline conversion happened in fact23:06
