[08:22] cking, ping
[08:24] apw: curious if there is an ETA for the patch you mentioned in bug 1100386
[08:24] Launchpad bug 1100386 in linux (Ubuntu) "Server installations on VMs fail to reboot after the installations" [High,Confirmed] https://launchpad.net/bugs/1100386
[08:26] psivaa, hmmm, that rings a bell, but i suspect i have lost track of that
[08:27] diwic, pong
[08:27] cking, hi! I've heard you've done some system debugging in your days ;-)
[08:27] apw: the issue (the apparent hang) can be seen in saucy as well, so whenever you could find some time .. :)
[08:27] psivaa, i think it was smb who had the setup to confirm the fix, i will circle with him on it
[08:27] apw: ack, thanks
[08:28] cking, in short, I'm testing some code but for some reason it causes the i915 module (I think, not sure) to hang for ~50 seconds on boot
[08:28] cking, is there a way to figure out *where* it's stuck in those 50 seconds?
[08:29] cking, I've done some attempts with "perf top" and "operf" but I can't get them to run properly at boot time
[08:30] cking, perf top seems to require an ncurses thing and operf just does not seem to want to start, don't know why
[08:31] diwic, so at least you have init up and running, hrm.. let me think
[08:32] cking, so far, I was able to use tracing, I enabled tracing/events/module/* and the last entry before the 50 second delay is "load_module i915"
[08:34] diwic, have you tried taking a boot chart, and have you tried enabling initcall_debug (kernel command line option) as this also gives you begin end of initcalls for module loads regardless of when
[08:35] diwic, or maybe seeing what the kernel is doing with LTTng
[08:35] diwic, finally have you got a dmesg of a boot which suffered this pause
[08:35] we may see something in it
[08:35] cking, LTTng requires a special kernel, right?
[08:35] "in flight recorder mode" ?
[08:36] diwic, you may get by using lttng-modules-dkms
[08:36] s/get/get by/
[08:36] apw, hmm, next attempt will be "drm.debug=0xf initcall_debug=1", that could indeed give something
[08:36] i think it is just initcall_debug no =1
[08:36] diwic, so when did this start going wrong?
[08:37] yeah what does your change do
[08:37] cking, it's development of new code
[08:37] in the kernel, in ?
[08:37] cking, apw in short, Intel/we are trying to build bridges between the i915 and snd-hda-intel driver for better power saving
[08:37] diwic, maybe finding out what bit of new code breaks it would be useful to see why it's triggering the hang
[08:38] diwic, are you adding new deps between the two modules, could we have some kind of udev loop here ?
[08:38] cking, well, I know that request_module("i915") is causing the hang, now trying to understand why
[08:39] diwic, you might want to install pitti's new udev and see if the issue is still there before trying too hard
[08:39] in case it is one of those things the new udev avoids in its cleverness
[08:40] diwic, perhaps for deeper tracing use http://lwn.net/Articles/365835/
[08:49] * diwic tries the function tracer too
[08:51] apw, i use http://www.speedtest.bbmax.co.uk/
[08:55] okay, so the function tracer gives too much data, so I don't see the actual hang, because it's being overwritten by later data
[09:28] diwic, when it hangs do you have any other modprobes running, it could be a deadlock that way
[09:29] apw, yeah, I'm thinking something like that
[09:30] you should be able to see that with ps pretty easy
[09:30] or in the udev log it keeps during boot maybe, perhaps pastebin /var/log/udev and dmesg so we can look
[09:30] apw, what determines if two modprobes can run in parallel or not?
[09:30] diwic, that is a good question, there used to be some periods during which only one could run, but i did think they got rid of most of them now
[09:58] * apw wonders if diwic is even here
[09:58] apw, I am
[09:58] apw, indeed there are modprobes in parallel
[09:59] can we see the udev log and dmesg, would give us something to look at, and the ps please
[09:59] what are they
[09:59] apw, I'm using this as a method to learn kernel debugging, that's why I'm trying to analyze it myself, but in short
[10:00] apw, udev kills the i915 modprobe after ~50 s with a syslog message (or was it dmesg)
[10:01] cool for learning, if you show us those things we promise to tell you what we find and why it is relevant :)
[10:01] what was the other modprobe ?
[10:01] apw, snd-hda-intel is in a modprobe itself, and it tries to load i915 from its own modprobe
[10:01] apw, which starts a new modprobe
[10:02] ok, did you add that yourself ?
[10:02] ie does your _init do like a request_module ?
[10:02] apw, it's the .probe that does a request_module
[10:03] if that occurs in the context of the _init (and from the hang i assume it does) i think that is not allowed (tm)
[10:03] you presumably want to load that module to run things in it, i wonder if you can just reference them directly
[10:04] so that the module dependency loader figures it out ... or are you trying to make it independant ?
[10:04] s/independant/dependant on h/w found
[10:04] apw, maybe switch to a private channel as this is prerelease hw
[10:04] diwic, ack
[12:28] FYI: http://lists.randombit.net/pipermail/cryptography/2013-May/004225.html
[12:29] [cryptography] skype backdoor confirmation
[12:29] ok, this one has more details:
[12:29] http://lists.randombit.net/pipermail/cryptography/2013-May/004224.html
[12:44] quick question : has kexec been disabled on secureboot kernels ?
[12:44] well, signed kernels
[12:48] caribou, not deliberately, it doesn't need to be because before we get to that point we have closed boot services
[12:48] caribou, that said it is not clear that kexec can work as well in that environment as we have closed boot services so you can only have a normal style boot from there
[12:49] apw: ok, I vaguely remembered a discussion where kexec wouldn't work
[12:49] apw: but maybe this is done in kexec itself, not a kernel build option
[12:50] caribou, well it cannot call the 'efi secure mode' entry point, so we cannot do all the same quirking you might use in the normal run of the boot
[12:50] apw: just testing kdump on Quantal on a vmlinuz...signed
[12:55] apw: ok, thanks
[12:57] np
[13:00] jsalisbury: i've updated bug 1181315 with the result of using the upstream kernel that you asked to test
[13:00] Launchpad bug 1181315 in linux (Ubuntu) "unregister_netdevice: waiting for lo to become free. Usage count = 2' is reported and causing kernel hang when floodlight tests are run using utah" [Medium,Confirmed] https://launchpad.net/bugs/1181315
[13:01] psivaa, great, thanks. I'll take a look shortly.
[13:01] jsalisbury: ack, thanks
[13:10] jsalisbury, is that a base issue that smb has looked at, i have a real feeling of deja-vu with that symptom, and a feeling smb was looking at something related to it in the past (some time ago)
[13:11] apw, I had trouble with this when entering and exiting containers, but that was a few versions ago
[13:11] apw, hmm, it could be. I'll check with smb when he's online
[13:11] there looks to be some openvswitch in the mix here, so ick
[13:11] [ 480.492344] [] copy_net_ns+0x8c/0x130
[13:12] so it is entirely possible they are using namespace containers here
[13:13] * apw lunches
[13:59] jsalisbury, do we have any specific wiki docs on disabling the abi checks ?
[14:00] apw, not that I know of off the top of my head
[14:00] apw, we do but i think they are buried on one of the pages
[14:00] apw, looking
[14:01] apw, https://wiki.ubuntu.com/KernelTeam/KernelMaintenance ABI section
[14:02] apw, https://wiki.ubuntu.com/KernelTeam/KernelMaintenance#ABI
[14:06] bjf, ahh thanks
[14:11] bah why do kernel questions get asked on #ubuntu-devel more often than on here
=== kentb-out is now known as kentb
[14:43] apw, uploaded linux_3.10.0-0.1 to ppa:kernel-ppa/pre-proposed for DKMS testing
[15:04] **
[15:04] ** Ubuntu Kernel Team Meeting - Today @ 17:00 UTC - #ubuntu-meeting
[15:04] **
[15:33] http://reqorts.qa.ubuntu.com/reports/kernel-bugs/reports/_kernel_stable_hot_.html
[15:48] jsalisbury, hrm i assume my fookage occurred at just the right time to miss this
[15:49] jsalisbury, 1100386 -- Server installations on VMs fail to reboot after the installations -- that one is one i am actually working on today
[15:49] apw, heh, yeah. Don't worry, only one bug was assigned to you
[15:50] jsalisbury, anyhow i hope to have fixes out for that today, just having some "fun" because we have just today shifted from udev to systemd packaging for udev binaries
[15:50] apw, cool, thanks
[15:56] jsalisbury, just put an update in the bug and nomed it all over the shop
[15:56] ##
[15:56] ## Kernel team meeting in 5 minutes
[15:56] ##
[15:57] apw, thanks, nom nom
[16:02] jsalisbury: i'm confused, are you sure it's already 17:00 *UTC*?
[16:02] henrix: 1600 UTC.
[16:02] henrix, sorry, I'm daylight savings challenged
[16:02] jpds: yep, that's what i thought. thanks :)
[16:02] The meeting is in an hour
[16:02] jsalisbury: o/
[16:22] jsalisbury, just uploaded pciutils for saucy. try installing saucy on that SDP in awhile to see if lspci recognizes the Broadcom gizmo.
[16:32] rtg_, will do
[16:32] rtg_, Will pciutils be included if I just build the latest Saucy tree?
[16:33] jsalisbury, it is a separate package from the kernel
[16:33] jsalisbury, i think he is saying you can just upgrade pci-utils once it is built
[16:33] rtg_, apw, ahh right
[16:37] rtg_, my external usb keyboard/mouse breaks on that system with 3.10-rc1 and rc2, but works on 3.9, so I'll also bisect that
[16:56] ##
[16:56] ## Kernel team meeting in 5 minutes
[16:56] ##
=== jsalisbury changed the topic of #ubuntu-kernel to: Home: https://wiki.ubuntu.com/Kernel/ || Ubuntu Kernel Team Meeting - Tues May 28th, 2013 - 17:00 UTC || If you have a question just ask, and do wait around for an answer!
[17:19] * rtg_ -> lunch
[17:39] BenC: *nudge*
[17:40] BenC: So, all that's left to do on #1181305 is for you to do some install/reboot smoketesting with the binaries in -proposed on a couple of machines and let me know if they're horrifically broken.
[17:41] BenC: I'll test powerpc64 on my POWER5. Not sure I want to disrupt my firewall to test powerpc32, but if 3 out of 4 flavours are good, we're probably in good shape.
[17:42] BenC: Then we twiddle the regress-testing task, and wait patiently for master to pass its more extensive testing. Once master is ready to promote, the bot with automagically mark ppc ready too (if we've done all our bits).
[17:42] s/with/will/
[18:04] rtg_, ogasawara, i have just merged kernel-wedge from debian, minimal changes, but just in case fyi
[18:04] apw, ack
[18:06] * apw relocates to a more comfy location
[18:38] infinity: works for me on ppce500mc
[18:46] hi
[18:47] does anything speak against enabling ARCH_TEGRA in 3.10?
[18:47] it has been converted to multiarch
[18:47] marvin24, can't see why not.
[18:48] well, it could harm, but there is still plenty of time to test ...
[18:49] looks like there are plenty of tegra models supported
[18:49] yes, all reference boards and some others also
[18:50] so we get tegra2/3 and 4 support and ~10 boards
[18:50] or 15
[18:55] BenC: Kay. I'm doing a glibc build on my POWER5 to look for testsuite regressions, I figure that's a reasonably sane kernel test. :P
[19:12] marvin24, It does not appear that you can enable CONFIG_TEGRA with trashing CONFIG_ARCH_MULTIPLATFORM
[19:12] without*
[19:26] rtg_: urg
[19:26] rtg_: will check
[20:12] Hey guys I've been working on https://bugs.launchpad.net/ubuntu/+source/ncpfs/+bug/1035226
[20:12] Ubuntu bug 1035226 in ncpfs (Ubuntu) "Can't remove directory from NCP server" [Undecided,Confirmed]
[20:12] which is actually an upstream kernel bug as well.
[20:12] I've bisected the commit using lucid to 1d2ef5901483004d74947bbf78d5146c24038fe7
[20:13] using mainline-build-one 1d2ef5901483004d74947bbf78d5146c24038fe7 lucid.
[20:13] * rtg_ -> EOD
[20:14] however when I run mainline-build-one 1d2ef5901483004d74947bbf78d5146c24038fe7 precise, and test on precise rmdir works, but rm -rf fails with device busy.
[20:14] i have /proc/locks showing me 1: FLOCK ADVISORY WRITE 20035 00:12:10525 0 EOF. task 20035 which locked inode 10525 no longer exists. how can i tell which task is holding that fd open?
[20:15] hallyn task 20035
[20:15] but it no longer exists.
[20:15] locks need to be unlocked.
[20:15] iirc
[20:15] but when the task exits the fd should be closed, and then (flock manpage claims) flock is released
[20:16] unless the process gets killed or exits badly.
[20:17] so long as it isn't sticking around as a zombie, i would expect the cleanup gets done in the kernel no matter how the task dies...
[20:17] or does flock cleanup happen in userspace?
[20:18] * hallyn goes rooting through fs/locks.c
[20:18] the lock release should probably get initiated from userspace.
[20:18] or even the close for that matter
[20:19] so anyone have any clue why building the same commit id for lucid works where building it for precise fails?
[20:25] jsalisbury: bug 1180513 has been updated
[20:25] Launchpad bug 1180513 in linux (Ubuntu) "lid close actions are ignored laptop always suspends" [Medium,Confirmed] https://launchpad.net/bugs/1180513
[20:36] hallyn does lsof or fuser return anything useful?
[21:06] chiluk: oh, i think i've found it. debootstrap seems to be firing off a running udev
[21:07] stgraber: ^ this is bad. presumably due to the saucy udev change...
[21:07] * hallyn does one more run to verify...
[21:09] hallyn: hmm, yeah, that could have been triggered by the new udev, blame pitti :)
[21:10] stgraber: trivial to reproduce now: ( flock -x 200; debootstrap saucy xxx; ) 200>/tmp/zzz
[21:35] stgraber: marked it as affecting udev. as per usual, i expect to get smacked over it before it's all over :)
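
Editor's sketch of the flock behaviour discussed above: a rough Python illustration of why /proc/locks can name a PID that no longer exists. It assumes the cause is the one hallyn suspects, namely a long-lived process (the stray udev) inheriting the locked fd from the subshell. A flock belongs to the open file description, so it lasts until every process holding a copy of that fd has closed it. The function names, the temp file, and the pipe used to stop the stand-in daemon are all hypothetical. Nothing here runs debootstrap or udev.

```python
import fcntl
import os
import tempfile
import time


def try_lock(path: str) -> bool:
    """Return True if an exclusive flock on path can be taken right now."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False
    finally:
        os.close(fd)  # closing our private fd also drops the lock if we got it


def demo_inherited_flock() -> dict:
    tmpdir = tempfile.mkdtemp()
    lock_path = os.path.join(tmpdir, "zzz")  # hypothetical stand-in for /tmp/zzz
    open(lock_path, "w").close()
    release_r, release_w = os.pipe()  # closing release_w tells the daemon to exit

    holder = os.fork()
    if holder == 0:
        try:
            # "Holder": plays the subshell that takes the lock (flock -x 200).
            os.close(release_w)
            fd = os.open(lock_path, os.O_RDWR)
            fcntl.flock(fd, fcntl.LOCK_EX)
            if os.fork() == 0:
                # "Daemon": stands in for the stray udev. It inherited fd, so
                # the open file description and its lock outlive the holder.
                os.read(release_r, 1)  # blocks until the parent closes release_w
        finally:
            os._exit(0)  # holder exits at once; daemon exits after the read

    os.close(release_r)
    os.waitpid(holder, 0)  # the PID that took the lock is now gone

    held_after_holder_exit = not try_lock(lock_path)

    os.close(release_w)  # let the daemon exit, closing the last copy of the fd
    deadline = time.monotonic() + 5.0
    released = try_lock(lock_path)
    while not released and time.monotonic() < deadline:
        time.sleep(0.05)
        released = try_lock(lock_path)

    os.remove(lock_path)
    os.rmdir(tmpdir)
    return {
        "held_after_holder_exit": held_after_holder_exit,
        "released_after_daemon_exit": released,
    }


if __name__ == "__main__":
    print(demo_inherited_flock())
```

If that assumption holds, listing the processes that still have the file open (chiluk's lsof or fuser suggestion) should show the inheriting process rather than the PID printed in /proc/locks.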