=== ara is now known as Guest5294 | ||
=== smb` is now known as smb | ||
=== txspud|away is now known as txspud | ||
=== lool- is now known as lool | ||
shadeslayer | rbasak: hi there, it seems like I'm hitting the same issue as you were here http://irclogs.ubuntu.com/2013/08/01/%23ubuntu-kernel.html | 14:55 |
---|---|---|
shadeslayer | rbasak: was there any fix for it? | 14:55 |
shadeslayer | I have a very reliable way of reproducing it in my schroot, when unpacking the firefox tar | 14:55 |
rbasak | shadeslayer: only fix is to not use overlayfs | 15:03 |
shadeslayer | :( | 15:03 |
shadeslayer | but I need it :( | 15:03 |
rbasak | You can configure schroot to clone the entire tree instead I think? | 15:03 |
rbasak | Or something like that. | 15:03 |
shadeslayer | that sounds expensive | 15:04 |
rbasak | Indeed. | 15:04 |
shadeslayer | Well, the workaround I added seems to work, which is basically: keep retrying the tar command | 15:05 |
rbasak | It's a race, so that might work eventually for you. It's not guaranteed to, though. | 15:05 |
shadeslayer | aha | 15:05 |
rbasak | IIRC, the code in tar makes an assumption that doesn't hold true on overlayfs. | 15:05 |
rbasak | I can't remember the details. Something about fstat after the file has been written or something. | 15:06 |
rbasak | Probably inode number related. | 15:06 |
shadeslayer | well, the exact combination is eatmydata + overlayfs + tar, so I can imagine things going wrong | 15:08 |
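A minimal sketch of the retry workaround shadeslayer describes: re-run the tar extraction until it succeeds. The archive name, target directory, and retry count are illustrative assumptions, and as rbasak notes this only papers over the overlayfs race rather than fixing it.

```sh
# Hypothetical retry wrapper around the failing extraction (names are
# placeholders, not from the log). Retries a few times because the overlayfs
# failure is a race and usually succeeds on a later attempt.
for attempt in 1 2 3 4 5; do
    if tar xf firefox.tar.bz2 -C "$BUILD_DIR"; then
        break
    fi
    echo "tar failed (attempt $attempt), retrying..." >&2
    sleep 1
done
```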
=== jdstrand_ is now known as jdstrand | ||
Diziet | Hi. I'm having a problem with HOME=/ strace -ttfot git-clone -q git://kernel.ubuntu.com/ubuntu/linux.git | 16:04 |
Diziet | It falls over after almost exactly 2 minutes. | 16:05 |
Diziet | Without -q it works. | 16:05 |
Diziet | The strace shows the server sending EOF while the client is still waiting. | 16:05 |
Diziet | Where should I report this ? It's quite inconvenient as it's making something in our CI system not work... | 16:05 |
apw | Diziet, which version of git is that ? | 16:10 |
Diziet | Debian wheezy | 16:10 |
Diziet | 1.7.10.4 | 16:10 |
apw | why does that ring a bell | 16:10 |
Diziet | (Obviously the failure happens without the strace.) | 16:11 |
Diziet | The fact that it works without -q suggests to me that something is deciding that this process or this connection is "stuck", using a timer which gets reset by output (which is fairly copious without -q) | 16:11 |
apw | yep, that is something like the bug, and it is tickling my "this is familiar" noddle | 16:12 |
Diziet | As a workaround, perhaps the timeout could be increased to an amount sufficient to make "git clone -q linux" work ? | 16:13 |
henrix | apw: a grep in my irc logs (because it *does* ring a bell) shows a ref to bug #1228148 | 16:13 |
ubot5 | bug 1228148 in git (Ubuntu Precise) "git clone -q ends with "early EOF" error on large repositories" [Undecided,Confirmed] https://launchpad.net/bugs/1228148 | 16:13 |
apw | henrix, thanks ... i knew i'd seen something | 16:14 |
Diziet | I am experiencing this problem with our git caching proxy. Naturally I can't really sensibly tell the proxy not to use -q in its own git invocations. | 16:14 |
apw | henrix, oh yeah, i think we shelved this till zinc got upgraded to P | 16:14 |
apw | which i assume it already has | 16:14 |
henrix | yeah, it looks like it has been upgraded | 16:15 |
Diziet | That LP bug has a workaround (sending KA packets) but I'm pretty sure the timeout isn't in git itself. | 16:16 |
Diziet | It is probably in some proxy or wrapper you have. | 16:16 |
Diziet | So I don't think it is necessary for you to patch your git (although that would help people with broken^W NAT). | 16:16 |
Diziet | For me it would suffice to increase your timeout. (I know it is at your end because I have repro'd the problem from my colo.) | 16:17 |
apw | Diziet, what is the config for the timeout | 16:20 |
apw | Diziet, as i can likely get that changed quickly | 16:20 |
Diziet | IDK. I'm not sure it is actually in git. | 16:20 |
Diziet | There's a --timeout option to git-daemon which might be relevant. | 16:21 |
Diziet | And also a --timeout option to git-upload-pack. | 16:21 |
Diziet | I think the latter may be the one. | 16:21 |
Diziet | Are you running vanilla git-daemon out of inetd, or what ? | 16:22 |
Diziet | I'm going to try adding a --timeout to my own git-daemon here to see if I can repro the bug. | 16:23 |
apw | Diziet, i am pretty sure it is going to be timing out, so 1) i am gc'ing the repo you are cloning, and 2) looking to change the timeout | 16:26 |
Diziet | apw: Thanks. If you bear with me 10 mins or so I can probably confirm what to do to git to increase the timeout. | 16:27 |
Diziet | Are you running git from inetd, or from upstart ? Is it git-daemon or something else ? | 16:28 |
apw | Diziet, great, pretty sure it is from xinetd | 16:28 |
Diziet | Can you check the command line you're giving it ? I think it's probably git-daemon blah blah blah --timeout=120 blah blah | 16:28 |
Diziet | But before you change that 120 I am going to try to repro the fault here. | 16:29 |
apw | Diziet, yeah looks to be ... bah ... i'll wait on your test to confirm, and then can set those wheels in motion | 16:29 |
apw | Diziet, though the bigger wheels would be to add these patches | 16:30 |
Diziet | Great, will get back to you. | 16:30 |
Diziet | Heh. | 16:30 |
Diziet | col tells me that it's fixed (as in, the KA patch is included) in trusty. | 16:30 |
apw | Diziet, yeah and i am sure that box is on P right now | 16:35 |
Diziet | I can confirm that --timeout=30 makes my own git server produce the same problem. | 16:36 |
Diziet | So I think adjusting your --timeout=120 to (say) --timeout=2000 will probably help. | 16:36 |
Diziet | Actually-stuck processes consume very little resource so I think you should be fine with a big timeout. | 16:36 |
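A sketch of the reproduction and workaround being discussed, assuming a locally served repository under /srv/git (the path, repository name, and exact daemon flags here are illustrative, not the actual kernel.ubuntu.com configuration):

```sh
# Reproduce locally: serve a large repo with a short timeout, then clone with
# -q so no progress data flows back to reset the server-side timer.
git daemon --export-all --base-path=/srv/git --timeout=30 &
git clone -q git://localhost/linux.git        # fails with "early EOF"

# Workaround on the server side: raise --timeout in the git-daemon invocation
# (e.g. in its xinetd stanza), for instance from --timeout=120 to
# --timeout=2000 as suggested above.
```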
apw | Diziet, yeah, i'll see what i can get fixed | 16:37 |
Diziet | Thanks. Should I expect a change soon (eg today) ? | 16:38 |
Diziet | I have sent an email to the bug suggesting increasing --timeout as a workaround. | 16:48 |
Diziet | (Of course that won't help people behind a NAT with a short timeout.) | 16:48 |
apw | Diziet, i'd hope, but not expect it to change quite that quick | 16:52 |
Diziet | OK, thanks. | 16:59 |
=== pgraner-afk is now known as pgraner-food | ||
=== pgraner-food is now known as pgraner | ||
apw | Diziet, ok i've requested the timeout change, and am working on the fixes, we shall see what occurs | 19:41 |
smoser | hey | 21:34 |
smoser | wonder if someone has an idea... not necessarily kernel, but somewhat related. | 21:34 |
smoser | from trusty: http://paste.ubuntu.com/9887959/ | 21:35 |
smoser | from utopic: http://paste.ubuntu.com/9887962/ | 21:35 |
arges | smoser: it's like one of those 'spot the differences' pictures. So you're wondering why the estimated minimum size is less in 3.16? | 21:37 |
smoser | yes. i had more info coming. but good job spotting the diff :) | 21:37 |
smoser | I'm not terribly concerned about the difference, but it's causing a failure on a build of maas images | 21:38 |
arges | smoser: so without looking at the code, I wonder if some meta-data structures changed size and thus the minimal size changed. What's the failure? | 21:39 |
smoser | after doing a bunch of apt-get installs and such to a loop mounted image, on trusty I try to 'resize2fs' to 1400M (arbitrary historic size) and it fits fine. | 21:40 |
=== pgraner is now known as pgraner-gym | ||
arges | smoser: one thing to try is to use the older e2fsprogs with the newer kernel to see if there are any calculation differences there | 21:47 |
smoser | do you think that resize2fs is able to use anything from the kernel ? | 21:47 |
smoser | i would not have thought it would. | 21:47 |
smoser | but your suggestion would give more info on that | 21:48 |
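A sketch of the experiment arges suggests: compare the minimum-size estimate from the trusty and utopic e2fsprogs on identical copies of the same unmounted image under the same running kernel. The chroot names and image path are assumptions for illustration.

```sh
# resize2fs -P prints the estimated minimum filesystem size without resizing.
# If the two estimates differ under the same kernel, the change is in
# e2fsprogs itself rather than in the kernel.
cp root-image root-image.test
schroot -c trusty -- resize2fs -P root-image.test    # trusty e2fsprogs (1.42.9)
schroot -c utopic -- resize2fs -P root-image.test    # utopic e2fsprogs
```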
arges | smoser: there is a patch that adjusts the inode struct which means it could be padded differently and have a different size too.. haven't exhaustively searched | 21:48 |
arges | the difference is 8114 bytes, but we have 145152 inodes... so maybe it's a superblock change? | 21:49 |
arges | smoser: another thing to look at is config differences between 3.13/3.16 to see if something was turned on that could affect structure sizes | 21:51 |
smoser | arges, but the file isn't mounted. | 21:51 |
smoser | so i'm confused as to how the kernel would be involved. | 21:52 |
smoser | isn't resize2fs just opening a file and looking around at it? | 21:52 |
arges | smoser: not sure. strace and find out | 21:52 |
arges | downloading the image here and looking at it | 21:53 |
smoser | strace output: http://paste.ubuntu.com/9888176/ | 21:54 |
arges | smoser: so you see a few ioctls before 'Minimum' gets printed | 21:56 |
arges | i'm not sure exactly how minimum size is calculated though. | 21:56 |
smoser | what i'm doing is in lp:maas-images. something like this ends up getting run: | 22:01 |
smoser | time maas-cloudimg2eph2 -vv --kernel=linux-generic --arch=arm64 \ | 22:01 |
smoser | $url root-image.gz \ | 22:01 |
smoser | --krd-pack=linux-generic,boot-kernel,boot-initrd 2>&1 | tee out.log | 22:01 |
smoser | and on utopic, it doesn't fit into 1400M and on trusty it does. | 22:01 |
smoser | and i think (doing a more controlled test now) that it's significantly different. | 22:01 |
arges | smoser: do you know what size it ends up being overall? | 22:02 |
smoser | i'll have that later tonight. just started a run on utopic and one on trusty. | 22:03 |
smoser | i'll keep the images around too so i can grab more debug info on them. | 22:04 |
smoser | but i have to go afk for a while. thanks for your thoughts. | 22:04 |
arges | smoser: hmm running this in vivid makes me run 'e2fsck -f *' first, then I get 290304 (this is on 3.19) | 22:04 |
arges | smoser: sure. this might be worth filing a bug at some point. feel free to ping me again | 22:05 |
smoser | hm.. | 22:05 |
smoser | the downloaded image is dirty ? | 22:05 |
arges | that's what resize2fs 1.42.12 says | 22:05 |
smoser | hm.. | 22:06 |
arges | md5 sum matches | 22:06 |
smoser | that could be relevant. | 22:06 |
smoser | well, you see the output that i got, it doesn't complain. | 22:06 |
smoser | maybe i'll try more liberally sprinkling 'e2fsck -fy the-image' | 22:06 |
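A minimal sketch of that "sprinkling", using the placeholder name the-image from the line above and the 1400M target from earlier in the discussion; as arges saw on vivid, newer e2fsprogs wants a forced fsck before it will estimate or resize a filesystem it considers not clean.

```sh
# Force a clean fsck, then ask for the minimum size and do the actual shrink.
e2fsck -fy the-image
resize2fs -P the-image        # print the estimated minimum size
resize2fs the-image 1400M     # shrink to the historic target size
```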
arges | 1.42.12 vs 1.42.9 wonder if there is some fix that exposes this... ugh | 22:07 |
* arges tries a trusty image for the heck of it | 22:09 | |
arges | trusty image on vivid doesn't complain that i need to run e2fsck. size on 3.19/vivid 236103, 3.13/trusty 230329 so still a difference | 22:12 |
arges | smoser: if you want i can do a bisect and figure out which commit made it blow up in size. | 22:20 |
smoser | arges, | 23:34 |
smoser | trusty: http://paste.ubuntu.com/9889226/ | 23:34 |
smoser | utopic: http://paste.ubuntu.com/9889235/ | 23:34 |
smoser | basically, note that 'df' reports used space differing only from '980200' to '980216' | 23:35 |
smoser | but resize2fs on trusty reports a min size of ~1073M versus ~1582M on utopic. | 23:36 |
smoser | resize2fs does claim "estimated" size, but you'd hope it could estimate a bit closer than that. | 23:36 |
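One way to dig into where the extra ~500M in the estimate comes from is to compare the filesystem metadata behind both figures directly (a sketch; root-image is a placeholder for the built image on each release):

```sh
# dumpe2fs -h prints only the superblock summary: block/inode counts, free
# blocks and enabled features, which is where a metadata-driven difference in
# the resize2fs estimate would show up.
dumpe2fs -h root-image | egrep -i 'block count|free blocks|inode count|features'
resize2fs -P root-image
```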
Diziet | apw: (git-daemon timeout) Thanks. | 23:38 |