[00:17] xnox: if this is blocking anything we should override the milestone freeze, not let ourselves be blocked by it
[00:19] slangasek: i'm blindly test rebuilding stuff against llvm 3.5 =) if you wish, you can unblock it and then i'd test it when it propagates. All other autopackage tests passed, so yeah, the freeze block is the only one.
[00:20] slangasek: but it's like on all images, thus it is a respin-the-world trigger =/
[00:25] xnox: no, it is not
[00:25] xnox: we do *not* guarantee images are up-to-date with respect to the archive during alpha milestones
=== psivaa is now known as psivaa-brb
=== psivaa-brb is now known as psivaa
[09:36] I am just jumping a bit through the kernel source at lxr.free-electrons.com... the container_of macro (which is a heavily used macro) is defined in 10 files. Are there many redundancies like this one?
[09:40] vroomfondel: The drivers/staging/ ones are probably because those drivers haven't been cleaned up yet after being out-of-tree; the tools/ and scripts/ ones are for separate compilation units, which leaves the gpu/radeon and include/linux ones to explain :)
[09:42] vroomfondel: looking at gpu/radeon, that is also a stand-alone tool ... leaving just include/linux :)
[14:22] hey.
[14:22] http://paste.ubuntu.com/7914933/
[14:22] is this the same power woes we've seen before ?
[14:28] smoser, doesn't look familiar to me, but it's also a little sparse on details. apw ?
[14:29] rtg, well, what you see is all i know. that and "it died"
[14:29] :)
[14:29] smoser, is this bare metal ? I'm just not that familiar with P8
[14:30] it's a guest.
[14:30] kvm guest.
[14:31] smoser, prolly something we need to get the IBM guys to look at
[14:32] rtg, i'll file a bug.
[14:37] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1350889
[14:37] Launchpad bug 1350889 in linux (Ubuntu) "kernel crash kvm guest on power8" [Undecided,New]
[14:50] yeah not something i have noticed so far
[15:00] apw: stgraber: on 3.16 in utopic, in an unprivileged container, all of /proc is owned by nobody:nogroup
[15:00] dat aint gonna fly
[15:38] I can now reliably reproduce this kernel hang with BTRFS: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349711
[15:38] Launchpad bug 1349711 in linux (Ubuntu) "Machine lockup in btrfs-transaction" [High,Incomplete]
[15:39] The machine I'm using for testing won't necessarily be around for much longer, so if there are any further immediate diagnostic things I should do, it would be good to know
[15:41] pwaller, can you reproduce this on bare metal ? or is it just a Xen guest issue ?
[15:42] rtg: It's an Amazon EC2 instance. I don't have bare metal available to test on. Or rather, I can't get the partition where I can reproduce it onto a bare metal machine.
[15:42] (It's 750GB)
[15:47] arges, since you're just plowing through virt issues this morning, how about having a look at this one ^^
[15:48] I should add, my test case involves using my existing partition. I haven't tried reproducing from scratch
[15:48] Mainly because I suspect a problem with the free space cache
[15:49] It took 12 days before first manifesting. And then 2 hours of solid writes before I ended up in a situation where I could quickly reproduce again.
[15:49] pwaller, have you talked to upstream about it ? IIRC you said 3.16 still has the issue
[15:50] rtg: I have mailed upstream and had no response. I've had some interest from #btrfs who suggested trying 3.16, as did the person who responded in the ubuntu bug.
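(Aside on the container_of question above: a quick way to reproduce the "defined in 10 files" count is to search a kernel git checkout. This is a sketch, not from the conversation; it assumes a full source tree of roughly that era.)

    # list the files that carry their own definition of container_of
    git grep -l '#define container_of('
    # count them
    git grep -l '#define container_of(' | wc -l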
[15:57] * arges looking
=== psivaa is now known as psivaa-bbl
[15:59] pwaller: so you said you can rapidly reproduce. Is that documented in the bug already? (still reading)
[15:59] Yes arges, at the end
[15:59] pwaller: ahh there it is : )
[16:00] pwaller: so you pretty much just copy a large number of files and get this error...
[16:00] arges: that's basically it
[16:01] arges: I haven't bothered trying to reproduce on a fresh BTRFS because I figure that would be covered by other testing done to this thing
[16:01] arges: and I'm also not sure what would constitute "interesting enough files"
[16:02] pwaller: ok so at this point you can only reproduce on this specific instance. so if you created a new instance and ran your test it may not necessarily reproduce
[16:02] arges: I reproduced it on a new instance with a snapshot of the same volume on a new EBS device
[16:03] smb: ^^^ more xen fun, have you seen something like this
[16:04] arges, no, nothing btrfs related that I can remember
[16:05] arges: smb: I'm migrating to the pub, but I will reappear on IRC in about 10 minutes in case you have further questions.
[16:05] pwaller: ok i think on this end we'll try to digest the logs a bit
[16:06] arges: thanks very much for looking into this :)
[16:06] np
[16:06] arges: I can also do further testing on the spot if you have more ideas for experiments.
[16:06] arges, The fact it runs on ec2 may be just coincidental in this case
[16:06] smb: yea i wonder if it can be reproduced in a non-ec2 xen environment, or even non-virt
[16:20] I'm back arges/smb
[16:20] pwaller: so has this happened with other btrfs volumes you've used?
[16:21] arges: I've only observed this on this one filesystem (but I copied the filesystem to another device and observed it there too)
[16:21] We're running 5 systems which are configured the same, but only one other has a similar workload
[16:21] pwaller: how did you copy the filesystem?
[16:21] arges: EC2 snapshot restore to a new volume
[16:22] Even on the original system we haven't observed the crash since we upgraded the kernel from 3.13 to 3.15
[16:22] (which is now at 2 days 1h uptime)
[16:22] but the workloads aren't very steady.
[16:23] pwaller: but a copy of ~400GB of small files seems to cause the soft lockups
[16:23] Though from the logs we see it is less of a crash than a lockup
[16:23] arges: it's probably more like ~275GB maximum
[16:24] arges: I started copying all 275GB, and 2h into the ~6 hour operation, it locked up. I then resumed from the point of the lockup after the reboot, and then it will reliably lock up within minutes, or even within 30 seconds sometimes.
[16:25] pwaller: you resumed what exactly, and how?
[16:26] arges: the volume contains ~23,000 sqlite files of varying sizes (4kB - 12GB). I copy them all from $PATH to ${PATH}.new
[16:26] I observe the machine hang at ${GIVEN_PATH}, and then if I choose files that are after ${GIVEN_PATH} in the list, I can rapidly reproduce the lockup
[16:27] pwaller: so you observe the hang. then stop the cp process?
[16:28] then resume it after the point where it was hanging?
[16:28] arges: the cp process hangs. If I terminate the cp process, a kernel thread is using 100% CPU
[16:28] arges: If I leave the machine idle (except for the pinned CPU on a kernel thread) and unattended for 5-10 minutes, it locks up totally and becomes unresponsive to network traffic
=== psivaa-bbl is now known as psivaa
[16:28] ah
[16:29] If I restart the machine and then resume the cp from the last file that was printed, it locks up fairly rapidly at that point
[16:29] arges: one more detail: I run 2 cp's in parallel
[16:29] arges: this is the literal command I'm running
[16:29] cat sqlite-files.txt | xargs -n1 -I{} -P2 sudo sh -c 'rm -f {}.new; cat {} > {}.new; echo {}'
[16:29] (then I resume with `tail -n+18082 | xargs`)
[16:30] pwaller: have you been able to reproduce on a single cpu instance?
[16:30] arges: that is something I can try
[16:35] 1) create around ~300GB of small files (sqlite files for example), put the files into a list sqlite-files.txt
[16:35] 2) Start the copy:
[16:35] cat sqlite-files.txt | xargs -n1 -I{} -P2 sudo sh -c 'rm -f {}.new; cat {} > {}.new; echo {}'
[16:35] 3) When it hangs, identify where it hung as $NUM and resume with the following:
[16:35] tail -n+$NUM sqlite-files.txt | xargs -n1 -I{} -P2 sudo sh -c 'rm -f {}.new; cat {} > {}.new; echo {}'
[16:35] pwaller: ^^^ does that sum up the test case?
[16:35] pwaller, where does the 87% full info actually come from? df info?
[16:35] du -sch /volume, smb
[16:35] arges: looks about right
[16:36] arges: unfortunately the problem didn't manifest until 12 days of running originally
[16:36] arges: though the write workload will not have been as pathological as what you just described
[16:36] pwaller: what kind of workload were you running for those 12 days?
[16:36] pwaller, thanks... maybe try "btrfs fi df /mountpoint"
[16:37] smb: https://gist.github.com/pwaller/ce4312f5e16147847a65
[16:37] arges: user generated workload accessing those sqlite files for arbitrary reads/writes
[16:52] pwaller: have you looked at this wiki https://btrfs.wiki.kernel.org/index.php/Balance_Filters
[16:53] one thing to note is 'if you are getting out of space errors' try 'btrfs balance ...' (i'm not sure if you already tried this)
[16:54] arges: we just reproduced it on a single core machine in < 15s
[16:54] (an AWS EC2 m3.medium)
[16:55] rtg, re bug 1350373, do you have any idea why I can't describe commit b7dd0e in Linus' tree?
[16:55] bug 1350373 in linux (Ubuntu Trusty) "Kernel BUG in paravirt_enter_lazy_mmu when running as a Xen PV guest" [Medium,Triaged] https://launchpad.net/bugs/1350373
[16:55] rtg: jsalisbury@salisbury:~/src/linux$ git describe --contains b7dd0e350e0bd4c0fddcc9b8958342700b00b168
[16:55] fatal: cannot describe 'b7dd0e350e0bd4c0fddcc9b8958342700b00b168'
[16:55] rtg, am I just missing some git knowledge?
[16:55] arges: it sprang back into life
[16:55] jsalisbury, there is likely no subsequent tag. is it after -rc7 ?
[16:55] rtg, ahh, that makes sense
[16:56] arges: probably just observed a volume warming up
[16:56] rtg, so it will work in -rc8 or whenever the next tag is added
[16:56] jsalisbury, I think so
[16:56] pwaller: ok so at this point 1) test a bit with the single cpu machine, and 2) look at the wiki i mentioned, and see if 'btrfs balance ...' changes anything
[16:56] it's lunchtime here, brb
[16:56] rtg, got it, thanks
[16:58] arges: I don't understand the rebalance thing. It's a single device volume; is rebalance even relevant if it isn't RAID?
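(Pulling pwaller's reproduction recipe together as one sketch. The sqlite-files.txt name, the xargs copy command, and the tail resume come from the log; the /volume mount point is taken from the du invocation above, and the '*.sqlite' glob plus the way $NUM is found are assumptions for illustration, not part of the original report.)

    # 1) build the list of small files to copy (~300GB of sqlite files in the original case)
    find /volume -name '*.sqlite' > sqlite-files.txt

    # 2) copy every file to ${file}.new, two copies in parallel, echoing progress
    cat sqlite-files.txt | xargs -n1 -I{} -P2 sudo sh -c 'rm -f {}.new; cat {} > {}.new; echo {}'

    # 3) after the hang and a reboot, resume from line $NUM of the list
    #    ($NUM = line number of the last filename echoed before the hang)
    tail -n +"$NUM" sqlite-files.txt | xargs -n1 -I{} -P2 sudo sh -c 'rm -f {}.new; cat {} > {}.new; echo {}'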
[17:07] smb, arges: one interesting feature of the single CPU instance is that at times, the write load drops to 0 during the copy but the sys cpu goes to 100%
[17:08] which is spending ~10% in rcu_sched, 50% in kworker/u30:1 and some fraction in some btrfs processes (btrfs-transaction and btrfs-endio-met)
[17:09] pwaller, It would kind of make sense under the assumption that something in the kernel is desperately trying to do something but never giving up.
[17:09] the 100% sys load remains even after aborting the cp
[17:09] a "sync" from the command line isn't completing
[17:10] I would read the straces you posted as: the cp you do causes some background writes which are done by a kworker
[17:11] "echo l > /proc/sysrq-trigger" causes the machine to freeze for quite a long time
[17:13] dumping a stack trace via sysrq is considerably less interesting on the single-CPU machine; the stack just has "write_sysrq_trigger" in it
[17:13] having said that, there is another stack trace with "generic_write_sync" at the top
[17:13] my sync is still going
[17:13] and the CPU is still stuck at 100% sys
[17:16] tbh not really surprised by that
[17:16] arges: someone from #btrfs (I don't know their credentials) said that rebalance was unlikely to have any effect in my circumstances
[17:17] smb: the sync finally finished after 8 minutes
[17:17] smb: so I guess this could just be explained by a cold block store
[17:17] the machine finally went down to 0% sys
[17:17] now that is more surprising if that was a lockup
[17:18] Hm... is this a case where io for some reason is slow as hell...
[17:18] smb: well, the machine was effectively unusable given that the kernel was consuming all of the CPU. I wouldn't expect that if IO was just "running a bit slow"
[17:18] (but then I don't know if my expectations are "reasonable")
[17:22] It certainly should not be that drastic. I mean the guest has around 4G of memory and sure, that gets used up as cache. But it sounds a bit like the writeout was done in a polling fashion...
[17:35] smb: "time echo l > /proc/sysrq-trigger" takes 18 seconds
[17:35] smb: why would that be?
[17:38] Hm, cannot say for sure. Maybe it's done in some soft-interrupt, and something prevents the cpu from processing them quickly...
[17:40] no... apparently ipi, but I think for xen those were using event channel(s)...
[18:09] arges: smb: I'm hitting the road so will appear/disappear from here out. Thanks for your help looking into things. If you have any further suggestions I guess I'll hear via the launchpad bug. Have a good day/evening :)
[18:09] cya, thanks
[18:09] I should add I'm also interested in ways to mitigate this since it affects production systems
=== iri- is now known as pwaller`
=== DalekSec_ is now known as DalekSec
[19:26] smb`, I see you've been busy. I showed my coworker your "I had a dream" comment
=== hatch__ is now known as hatch
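(Recap of the diagnostics mentioned during the session, as a sketch. The /volume mount point is carried over from the earlier du invocation; sysrq 'l' writes its backtraces to the kernel log, and reading them with dmesg is an assumption about how pwaller inspected the output rather than something stated in the log.)

    # per-allocation-type usage, as smb suggested ("btrfs fi df")
    sudo btrfs filesystem df /volume

    # dump backtraces of active CPUs into the kernel log, then read them
    sudo sh -c 'echo l > /proc/sysrq-trigger'
    dmesg | tail -n 100

    # time how long a full writeback flush takes while the stall is in progress
    time sync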