/srv/irclogs.ubuntu.com/2014/07/29/#ubuntu-kernel.txt

pwallerHi Folks, I just hit this kernel bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/134971111:16
ubot5Launchpad bug 1349711 in linux (Ubuntu) "Machine lockup in btrfs-transaction" [Undecided,Incomplete]11:16
pwallerThe advise of the BTRFS developers is to update to 3.15 or above11:17
pwallers/advise/advice/11:17
pwallerI tried adding https://launchpad.net/~kernel-ppa/ppa and/or https://launchpad.net/~kernel-ppa/pre-proposed to my trusty system, but both resulted in 404 when doing `apt-get update`11:17
pwallerWhat is a good way to get a more recent kernel on 14.04?11:18
apwpwaller, that isn't exactly the most helpful advice11:29
pwallerapw: ah. Any better suggestions?11:29
apwwell they could suggest a fix instead of a blanket upgrade to a newer kernel11:30
apwwhen utopic releases there will be a 3.16 kernel available in 14.04 but that is not yet ready for production use, nor available in 14.0411:30
pwallerhm, OK.11:30
pwallerapw: It was advice from a random on IRC, and I suppose they were trying to help me hit the ground running again.11:31
pwallerapw: can you think of any ways I can help the bug along or is it best to sit on my hands for now?11:31
apwpwaller, are you seeing any ill effects or only these warnings11:31
pwallerapw: I've determined that these are warnings11:31
pwallerapw: they all appear to be relating to caches which can be discarded11:31
pwaller(according to notes found on mailing lists and the advice from #btrfs)11:32
apwyeah that appears to match my understanding, i don't expect any of these to stop things working11:32
apwthe soft lockups all clear before 46s or whatever the second check is11:32
pwallerapw: a cursory check suggests things are working11:32
pwallerapw: I can't parse your last statement11:32
pwallerapw: the soft lockups are happening after >48h running11:33
apwthe softlockup warnings all say 22/23s which indicates they did not continue to be locked up, they resolved each individual lockup11:33
pwallerah! I misread that.11:33
apwthose are serious when they say like 23s, 46s, 90s, 200s, ...11:33
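apw's point above is that the number of seconds in each soft lockup warning shows whether a stall resolved (every warning near 22s) or kept going (reported durations climbing). A minimal sketch of pulling those durations out of a kernel log follows. The regular expression assumes the usual "BUG: soft lockup - CPU#N stuck for Ns!" wording; the sample lines, the function names and the 30-second threshold are illustrative assumptions and are not taken from the bug report.

```python
import re

# Assumed message wording: "BUG: soft lockup - CPU#1 stuck for 22s! [task:pid]"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s!")


def lockup_durations(log_text):
    """Return a list of (cpu, seconds) for every soft lockup warning found."""
    return [(int(cpu), int(secs)) for cpu, secs in LOCKUP_RE.findall(log_text)]


def looks_persistent(durations, first_threshold=30):
    """True if any warning reports a stall longer than first_threshold seconds.

    The threshold is a hypothetical cut-off chosen to separate the first
    report (around 22s) from later, escalating ones (46s, 90s, ...).
    """
    return any(secs > first_threshold for _, secs in durations)


# Hypothetical log excerpt, for illustration only.
sample = """\
[171234.5] BUG: soft lockup - CPU#0 stuck for 22s! [btrfs-transacti:1024]
[171290.1] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:1024]
"""

durations = lockup_durations(sample)
print(durations)
print(looks_persistent(durations))
```

Running the same functions over the real dmesg or kern.log text would show at a glance whether any warning went past the first interval.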
pwallergotcha.11:33
pwallerExcept that it is preventing the machine from being useful11:34
pwallerbut I get what you mean.11:34
apwok so there are ill effects, which are?11:34
pwallerapw: http and ssh stop responding11:34
pwallerapw: my interim "fix" was going to be to configure a watchdog which rebooted the system, but maybe that wouldn't work11:35
xnoxdepends on the workload and how full the fs is. Do you rebalance regularly?11:35
pwallerapw: I guess the BTRFS FS wasn't accepting writes.11:35
pwallerxnox: stupid question from me: do I need to rebalance if it is just on one block device?11:35
pwallerxnox: FS is 87% full11:35
apwhmmm, i don't think those warnings are necessarily even related.  but anyhow, you should certainly file a bug against linux if it's getting hung up11:35
apwand we can suggest some debug kernels to try etc and see if we can figure out if it is fixed in later versions11:36
pwallerapw: on the kernel tracker?11:36
apwon launchpad, run "ubuntu-bug linux"11:36
pwallerapw: the link above is on launchpad11:36
pwallerOops, no it isn't!11:36
pwallerOh, yes - the link I sent is here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/134971111:37
ubot5Launchpad bug 1349711 in linux (Ubuntu) "Machine lockup in btrfs-transaction" [Undecided,Incomplete]11:37
pwallerapw: so I did already file one there11:37
apwok as btrfs upstream are suggesting v3.15 I would first suggest you test with a v3.15 mainline kernel;11:39
apwhttps://wiki.ubuntu.com/Kernel/MainlineBuilds11:39
apwthough these are manual installs and do not upgrade automatically; so they are only useful for testing11:40
pwallerah, I missed the /mainline of ~kernel-ppa11:40
apwif that works try v3.14, if it does not try v3.16-rc711:40
apwand report that back to the bug, if that shows it is fixed somewhere already we can start a bisect for the fix11:40
pwallerapw: it's a bit difficult to bisect if it isn't triggerable, isn't it?11:41
apwvery much so, sadly11:41
apwbut if we have a bracket, we could at least look at the btrfs changes and see if anything "sticks out"11:41
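The "bracket" apw describes is the pair of adjacent tested kernels where the behaviour changes from broken to working, which narrows down where a fix landed. A rough sketch of that bookkeeping follows. The version list mirrors the kernels named in the conversation, but every test result shown is hypothetical; nothing in this log records an actual outcome.

```python
def find_bracket(results):
    """Return (last_bad, first_good) for the first bad-to-good transition.

    `results` is a list of (version, status) pairs in release order, where
    status is "bad", "good", or None for a kernel that was not tested.
    Returns None when no such transition exists in the tested kernels.
    """
    last_bad = None
    for version, status in results:
        if status is None:
            continue  # untested kernels do not narrow the bracket
        if status == "bad":
            last_bad = version
        elif status == "good" and last_bad is not None:
            return last_bad, version
    return None


# Hypothetical outcomes, for illustration only.
results = [
    ("v3.13", "bad"),
    ("v3.14", None),
    ("v3.15", "good"),
    ("v3.16-rc7", None),
]

print(find_bracket(results))
```

With a fault that takes days to appear, each "good" entry is only as trustworthy as the uptime behind it, so the bracket is a hint for reading the btrfs changes rather than proof.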
pwallerokay11:42
pwallerapw: we reached more than 12 days uptime before the last problem I think11:43
pwallerso it is going to be difficult to observe11:43
pwallerapw: woah. Just looked in the nginx log11:46
pwallerapw: which is not on the BTRFS drive - it's full of null characters at the point of the fault11:46
pwallerThere are ~2280 null characters11:47
pwallerapw: and indeed there were no HTTP requests serviced for the duration of the fault (whilst the kernel was reporting 20s soft lockups)11:48
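A run of null characters like the one pwaller found can be located and measured programmatically, which makes it easier to line the gap up with the timestamps on either side. The sketch below is one way to do that; the function name and the sample bytes are made up for illustration, and the real log would be read with open(path, "rb").

```python
def longest_nul_run(data):
    """Return (offset, length) of the longest run of NUL bytes in `data`.

    Returns (-1, 0) when the data contains no NUL bytes.
    """
    best_off, best_len = -1, 0
    i, n = 0, len(data)
    while i < n:
        if data[i] == 0:
            j = i
            while j < n and data[j] == 0:
                j += 1
            if j - i > best_len:
                best_off, best_len = i, j - i
            i = j  # skip past this run
        else:
            i += 1
    return best_off, best_len


# Hypothetical log content: two request lines around a block of NUL bytes.
sample = b"GET / 200\n" + b"\x00" * 2280 + b"GET /x 200\n"

offset, length = longest_nul_run(sample)
print(offset, length)
```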
pwallerafk for lunch, back in <1h.11:49
pwallerback12:37
pwallerDoes anyone know how I can run apport-collect but check that the output doesn't contain company secrets?13:16
pwallerDoes anyone know where /dev/watchdog comes from? My reading is that it should be there by standard, but I can't find it13:49
pwallerAh, have to load softdog13:51
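Since /dev/watchdog only appears once a watchdog driver such as softdog is loaded, a quick check of both conditions can confirm the setup before relying on it. This is a small sketch under assumptions: it reads the standard /proc/modules listing, the helper names are invented here, and the sample module lines are illustrative.

```python
import os


def module_loaded(name, proc_modules_text):
    """Return True if `name` is the first field of any /proc/modules line."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def watchdog_status(proc_modules_path="/proc/modules", device="/dev/watchdog"):
    """Report whether softdog is loaded and whether the device node exists."""
    try:
        with open(proc_modules_path) as fh:
            text = fh.read()
    except OSError:
        text = ""  # treat an unreadable listing as "nothing loaded"
    return {
        "softdog_loaded": module_loaded("softdog", text),
        "device_present": os.path.exists(device),
    }


# Hypothetical /proc/modules excerpt, for illustration only.
sample = (
    "softdog 13319 0 - Live 0x0000000000000000\n"
    "btrfs 840205 1 - Live 0x0000000000000000\n"
)

print(module_loaded("softdog", sample))
```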
bjfsmb, did you see that bug 727459 seems to have come back for a user?14:02
ubot5bug 727459 in linux (Ubuntu Lucid) "TSC is not reliable under Xen on some Intel CPUs" [Medium,Triaged] https://launchpad.net/bugs/72745914:02
smbbjf, I saw some updates to the bug but have not gotten to that, yet14:03
smbbjf, I am very doubtful this is just the same bug, so I asked to open a new report14:27
bjfsmb, ack14:57
jsalisbury**14:59
jsalisbury** Ubuntu Kernel Team Meeting - Today @ 17:00 UTC - #ubuntu-meeting15:00
jsalisbury**15:00
Joe_CoTbjf, unless I'm missing something, think your automated script might be faulty https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349883 It asked me to run apport to collect logs because logs were missing, but the logs apport attached were the same as ubuntu-bug attached in the first place. Is there some log that apport didn't pick up it should've?15:20
ubot5Launchpad bug 1349883 in linux (Ubuntu) "dmesg time wildly incorrect on paravirtual EC2 instances." [Undecided,Confirmed]15:20
pwallerHow do I specify that a kernel module should be loaded? Is it via /etc/modules? Is there a /etc/modules.d equivalent or do I have to edit that one file?15:23
smbJoe_CoT, Maybe the bot acts too quickly. Seems like status changed by apport right after15:23
pwaller(I want to make a salt state which causes a module to be updated on boot so I don't particularly want to mess with /etc/modules)15:23
smbIt's now confirmed so it should be ok.15:23
=== psivaa is now known as psivaa-bbl
Joe_CoTsmb, I think the issue is probably that ubuntu-bug collects the same logs, but doesn't set the tag apport-collected, which the bot cues off of16:36
smbJoe_CoT, That tag seemed to be there by the time I looked. 16:37
Joe_CoT10:53 I put the bug in with ubuntu-bug, which had all the logs attached. 11:00 the bot replied, saying the logs were missing, and to run apport-collect. 11:10 I ran apport-collect, which attached the same logs, but added the tag apport-collected16:39
Joe_CoTapport-bug was already there, apport-collected was not16:40
smbHm, ok. bjf, so I am not sure why this happened ^. 16:46
smbJoe_CoT, Anyway I am looking into it16:46
jsalisbury##16:55
jsalisbury## Kernel team meeting in 5 minutes16:55
jsalisbury##16:55
=== jsalisbury changed the topic of #ubuntu-kernel to: Home: https://wiki.ubuntu.com/Kernel/ || Ubuntu Kernel Team Meeting - Tues August 12th, 2014 - 17:00 UTC || If you have a question just ask, and do wait around for an answer! If the question is should I file a bug for something, likely you can assume yes.
_`_wait that was it, jsalisbury ?17:12
ppisatirtg: git://kernel.ubuntu.com/ppisati/ubuntu-embedded.git17:24
ppisatirtg: i'm still working on it, so it's pretty rough right now17:24
ppisatirtg: but it works17:24
ppisatirtg: for your mirabox board run as:17:24
ppisatirtg: sudo ./make_img.sh -b mirabox -d 14.1017:25
ppisatirtg: dd the .img file that it creates to an sd, pop it in your mirabox and follow the instructions in "mirabox-uboot-env.txt"17:26
rtgppisati, I'll give it a run18:03
hallyn3.13 most certainly did not resolve the btrfs sync performance issues :)18:05
hallynuh, :(18:05
=== psivaa-bbl is now known as psivaa
cantstanyalol21:34
dannfkamal: i'm seeing a build regression w/ trusty master-next that i've bisected to a 3.13.11.5 patch21:56
dannfkamal: is there a 3.13.11.y git tree i could use to demonstrate whether or not it is a problem outside of ubuntu?21:56
kamaldannf, http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.13.y-queue23:16
kamaldannf, what's the problem patch?23:17
dannfkamal: a ptrace patch - see my e-mail to kernel-team@23:17
dannfkamal: not at all an obvious problem (at least to me), i haven't debugged it yet23:18
dannfmacro fun no doubt23:18
kamaldannf, yeah, it's not clear why that patch would cause that build failure23:23
kamaldannf, so yeah, if you can (or can't) reproduce the failure with that 3.13.y stable repo, that would be interesting to know23:27
kamaldannf, and/or I can try to reproduce it -- what config are you using? -- fwiw, I didn't see any build failure with the few ARM configs I test (tegra, omap2plus, imx_v6_v7).23:29
dannfyeah, weird, can't reproduce with pristine23:31
dannfbut bisecting definitely led to that one, then going back to top of trusty master-next and reverting continued to show that it was the breaker23:32
