[11:16] <pwaller> Hi Folks, I just hit this kernel bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349711
[11:17] <pwaller> The advise of the BTRFS developers is to update to 3.15 or above
[11:17] <pwaller> s/advise/advice/
[11:17] <pwaller> I tried adding https://launchpad.net/~kernel-ppa/ppa and/or https://launchpad.net/~kernel-ppa/pre-proposed to my trusty system, but both resulted in 404 when doing `apt-get update`
[11:18] <pwaller> What is a good way to get a more recent kernel on 14.04?
[11:29] <apw> pwaller, that isn't exactly the most helpful advice
[11:29] <pwaller> apw: ah. Any better suggestions?
[11:30] <apw> well they could suggest a fix instead of a blanket upgrade to a newer kernel
[11:30] <apw> when utopic releases there will be a 3.16 kernel available in 14.04 but that is not yet ready for production use, nor available in 14.04
[11:30] <pwaller> hm, OK.
[11:31] <pwaller> apw: It was advice from a random on IRC, and I suppose they were trying to help me hit the ground running again.
[11:31] <pwaller> apw: can you think of any ways I can help the bug along or is it best to sit on my hands for now?
[11:31] <apw> pwaller, are you seeing any ill effects or only these warnings
[11:31] <pwaller> apw: I've determined that these are warnings
[11:31] <pwaller> apw: they all appear to be relating to caches which can be discarded
[11:32] <pwaller> (according to notes found on mailing lists and the advice from #btrfs)
[11:32] <apw> yeah that appears to match my understanding, i don't expect any of these to stop things working
[11:32] <apw> the soft lockups all clear before 46s or whatever the second check is
[11:32] <pwaller> apw: a cursory check suggests things are working
[11:32] <pwaller> apw: I can't parse your last statement
[11:33] <pwaller> apw: the soft lockups are happening after >48h running
[11:33] <apw> the softlockup warnings all say 22/23s which indicates they did not continue to be locked up, they resolved each individual lockup
[11:33] <pwaller> ah! I misread that.
[11:33] <apw> those are serious when they say like 23s, 46s, 90s, 200s, ...
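The distinction apw draws here can be sketched in code: a CPU that only ever reports about 22-23s recovered from each lockup, while reported durations that keep growing (46s, 90s, ...) mean it stayed stuck. The following is a minimal sketch, assuming the kernel log uses the usual "BUG: soft lockup - CPU#N stuck for Ns!" wording; the function name and the 40-second cut-off are illustrative assumptions, not something stated in the discussion.

```python
import re
from collections import defaultdict

# Matches kernel log lines such as:
#   BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:1234]
LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s")


def classify_lockups(log_lines, escalation_s=40):
    """Label each CPU's soft lockups as "transient" or "persistent".

    escalation_s is a hypothetical cut-off: a reported duration at or
    above it is taken to mean the CPU was still stuck at a later check.
    """
    durations = defaultdict(list)
    for line in log_lines:
        match = LOCKUP_RE.search(line)
        if match:
            durations[int(match.group(1))].append(int(match.group(2)))
    return {
        cpu: "persistent" if max(seen) >= escalation_s else "transient"
        for cpu, seen in durations.items()
    }
```

Feeding it the output of dmesg (one line per list element) gives a quick per-CPU summary; repeated 22s/23s reports come back as "transient".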
[11:33] <pwaller> gotcha.
[11:34] <pwaller> Except that it is preventing the machine from being useful
[11:34] <pwaller> but I get what you mean.
[11:34] <apw> ok so there are ill effects, which are?
[11:34] <pwaller> apw: http and ssh stop responding
[11:35] <pwaller> apw: my interim "fix" was going to be to configure a watchdog which rebooted the system, but maybe that wouldn't work
[11:35] <xnox> depends on the workload and how full the fs is. Do you rebalance regularly?
[11:35] <pwaller> apw: I guess the BTRFS FS wasn't accepting writes.
[11:35] <pwaller> xnox: stupid question from me: do I need to rebalance if it is just on one block device?
[11:35] <pwaller> xnox: FS is 87% full
[11:35] <apw> hmmm, i don't think those warnings are necessarily even related.  but anyhow, you should certainly file a bug against linux if it's getting hung up
[11:36] <apw> and we can suggest some debug kernels to try etc and see if we can figure out if it is fixed in later versions
[11:36] <pwaller> apw: on the kernel tracker?
[11:36] <apw> on launchpad, run "ubuntu-bug linux"
[11:36] <pwaller> apw: the link above is on launchpad
[11:36] <pwaller> Oops, no it isn't!
[11:37] <pwaller> Oh, yes - the link I sent is here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349711
[11:37] <pwaller> apw: so I did already file one there
[11:39] <apw> ok as btrfs upstream are suggesting v3.15 I would first suggest you test with a v3.15 mainline kernel;
[11:39] <apw> https://wiki.ubuntu.com/Kernel/MainlineBuilds
[11:40] <apw> though these are manual installs and do not upgrade automatically; so they are only useful for testing
[11:40] <pwaller> ah, I missed the /mainline of ~kernel-ppa
[11:40] <apw> if that works try v3.14, if it does not try v3.16-rc7
[11:40] <apw> and report that back to the bug, if that shows it is fixed somewhere already we can start a bisect for the fix
[11:41] <pwaller> apw: it's a bit difficult to bisect if it isn't triggerable, isn't it?
[11:41] <apw> very much so, sadly
[11:41] <apw> but if we have a bracket, we could at least look at the btrfs changes and see if anything "sticks out"
[11:42] <pwaller> okay
[11:43] <pwaller> apw: we reached more than 12 days uptime before the last problem I think
[11:43] <pwaller> so it is going to be difficult to observe
[11:46] <pwaller> apw: woah. Just looked in the nginx log
[11:46] <pwaller> apw: which is not on the BTRFS drive - it's full of null characters at the point of the fault
[11:47] <pwaller> There are ~2280 null characters
[11:48] <pwaller> apw: and indeed there were no HTTP requests serviced for the duration of the fault (whilst the kernel was reporting 20s soft lockups)
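To pin down where in a log the null characters sit and how many there are, a short script can report each run of NUL bytes with its byte offset. This is a minimal sketch; the function name and the minimum run length are assumptions chosen for illustration.

```python
import re


def find_nul_runs(data: bytes, min_len: int = 16):
    """Return (offset, length) for each run of at least min_len NUL bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]
```

Typical use would be to read the log in binary mode, for example open(path, "rb").read() with whatever path the nginx log lives at, and pass the bytes in. The offsets can then be compared with the timestamps of the log lines on either side of the gap.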
[11:49] <pwaller> afk for lunch, back in <1h.
[12:37] <pwaller> back
[13:16] <pwaller> Does anyone know how I can run apport-collect but check that the output doesn't contain company secrets?
[13:49] <pwaller> Does anyone know where /dev/watchdog comes from? My reading is that it should be there by standard, but I can't find it
[13:51] <pwaller> Ah, have to load softdog
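Once softdog is loaded and /dev/watchdog exists, the interim "reboot when stuck" idea can be sketched as a small feeder loop. This is a sketch under stated assumptions, not a tested setup: it relies on the standard Linux watchdog device behaviour (any write counts as a keepalive, and writing "V" just before closing disarms the timer on drivers that support magic close), while is_healthy, interval_s and max_feeds are hypothetical names introduced here. If the machine locks up hard enough that this process cannot run, or the health check fails, the feeds stop and the watchdog timer is left to expire.

```python
import time


def feed_watchdog(device_path, is_healthy, interval_s=10.0, max_feeds=None):
    """Write keepalives to a watchdog device while is_healthy() is true.

    Returns the number of keepalives written. When the health check fails,
    the device is closed WITHOUT the magic "V", so a driver that supports
    magic close keeps the timer armed and the system reboots on timeout.
    """
    feeds = 0
    dev = open(device_path, "wb", buffering=0)
    try:
        while max_feeds is None or feeds < max_feeds:
            if not is_healthy():
                return feeds  # stop feeding; timer is left running
            dev.write(b"\0")  # any write resets the watchdog timer
            feeds += 1
            time.sleep(interval_s)
        dev.write(b"V")  # magic close: disarm before a clean exit
        return feeds
    finally:
        dev.close()
```

On a real system this would be run as root with device_path="/dev/watchdog" and an is_healthy callable that checks something meaningful, such as whether the local HTTP server answers. Be aware that opening the real device arms the timer, so killing the script will reboot the machine.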
[14:02] <bjf> smb, did you see that bug 727459 seems to have come back for a user?
[14:03] <smb> bjf, I saw some updates to the bug but have not gotten to that, yet
[14:27] <smb> bjf, I am very doubtful this is just the same bug, so I asked to open a new report
[14:57] <bjf> smb, ack
[14:59] <jsalisbury> **
[15:00] <jsalisbury> ** Ubuntu Kernel Team Meeting - Today @ 17:00 UTC - #ubuntu-meeting
[15:00] <jsalisbury> **
[15:20] <Joe_CoT> bjf, unless I'm missing something, I think your automated script might be faulty: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349883 It asked me to run apport to collect logs because logs were missing, but the logs apport attached were the same as ubuntu-bug attached in the first place. Is there some log that apport didn't pick up that it should've?
[15:23] <pwaller> How do I specify that a kernel module should be loaded? Is it via /etc/modules? Is there a /etc/modules.d equivalent or do I have to edit that one file?
[15:23] <smb> Joe_CoT, Maybe the bot acts too quickly. Seems like status changed by apport right after
[15:23] <pwaller> (I want to make a salt state which causes a module to be updated on boot so I don't particularly want to mess with /etc/modules)
[15:23] <smb> It's now confirmed so it should be ok.
[16:36] <Joe_CoT> smb, I think the issue is probably that ubuntu-bug collects the same logs, but doesn't set the tag apport-collected, which the bot cues off of
[16:37] <smb> Joe_CoT, That tag seemed to be there by the time I looked. 
[16:39] <Joe_CoT> 10:53 I put the bug in with ubuntu-bug, which had all the logs attached. 11:00 the bot replied, saying the logs were missing, and to run apport-collect. 11:10 I ran apport-collect, which attached the same logs, but added the tag apport-collected
[16:40] <Joe_CoT> apport-bug was already there, apport-collected was not
[16:46] <smb> Hm, ok. bjf, so I am not sure why this happened ^. 
[16:46] <smb> Joe_CoT, Anyway I am looking into it
[16:55] <jsalisbury> ##
[16:55] <jsalisbury> ## Kernel team meeting in 5 minutes
[16:55] <jsalisbury> ##
[17:12] <_`_> wait that was it, jsalisbury ?
[17:24] <ppisati> rtg: git://kernel.ubuntu.com/ppisati/ubuntu-embedded.git
[17:24] <ppisati> rtg: i'm still working on it, so it's pretty rough right now
[17:24] <ppisati> rtg: but it works
[17:24] <ppisati> rtg: for your mirabox board run as:
[17:25] <ppisati> rtg: sudo ./make_img.sh -b mirabox -d 14.10
[17:26] <ppisati> rtg: dd the .img file that it creates to an sd, pop it in your mirabox and follow the instructions in "mirabox-uboot-env.txt"
[18:03] <rtg> ppisati, I'll give it a run
[18:05] <hallyn> 3.13 most certainly did not resolve the btrfs sync performance issues :)
[18:05] <hallyn> uh, :(
[21:34] <cantstanya> lol
[21:56] <dannf> kamal: i'm seeing a build regression w/ trusty master-next that i've bisected to a 3.13.11.5 patch
[21:56] <dannf> kamal: is there a 3.13.11.y git tree i could use to demonstrate whether or not it is a problem outside of ubuntu?
[23:16] <kamal> dannf, http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.13.y-queue
[23:17] <kamal> dannf, what's the problem patch?
[23:17] <dannf> kamal: a ptrace patch - see my e-mail to kernel-team@
[23:18] <dannf> kamal: not at all an obvious problem (at least to me), i haven't debugged it yet
[23:18] <dannf> macro fun no doubt
[23:23] <kamal> dannf, yeah, it's not clear why that patch would cause that build failure
[23:27] <kamal> dannf, so yeah, if you can (or can't) reproduce the failure with that 3.13.y stable repo, that would be interesting to know
[23:29] <kamal> dannf, and/or I can try to reproduce it -- what config are you using? -- fwiw, I didn't see any build failure with the few ARM configs I test (tegra, omap2plus, imx_v6_v7).
[23:31] <dannf> yeah, weird, can't reproduce with pristine
[23:32] <dannf> but bisecting definitely led to that one, then going back to top of trusty master-next and reverting continued to show that it was the breaker