/srv/irclogs.ubuntu.com/2016/03/19/#ubuntu-kernel.txt

=== martink3_ is now known as martink3
kernelbugproblems with bisection...13:36
kernelbugwhat if the kernel is not listed here? http://people.canonical.com/~kernel/info/kernel-version-map.html13:36
kernelbughow i'm supposed to map anything then? :)13:36
kernelbugi ran mainline-build-one and got errors...13:37
kernelbugfatal: invalid reference: Ubuntu-3.18.0-7.813:38
kernelbugvivid-amd64: chroot not found (::,)13:38
mamarleyapw: So I figured out the cause of my problem.  It wasn't actually a regression in the kernel, it is the "watchdog" service.  If I stop that service, suspend works perfectly.  Odd...13:40
ogra_wow13:42
ogra_there is an option in /etc/default/watchdog to disable it permanently 13:43
ogra_(and please file a bug too)13:44
mamarleyI want to try it on another box first to see if it has the same problem there.13:44
ogra_good idea13:44
mamarleyI actually had to add an [Install] section with a WantedBy to get it to work at all, so I am afraid any bug I file would be invalid anyway.13:47
kernelbugso? :)13:48
kernelbugwhy errors?13:48
kernelbug:<13:52
mamarleyI just tried another system with the iTCO_wdt device and watchdog enabled.  It does not seem to suffer from the same issue.13:53
kernelbugam i ignored or are you just busy? :)13:57
apwor perhaps, maybe, it is the weekend?  the first error implies you are not in the appropriate git repo, the second that you do not have build chroots created14:11
kernelbugapw, hmm14:17
kernelbugapw, maybe it's morning somewhere ;)14:17
kernelbugapw, well i followed the instructions14:18
kernelbughttps://wiki.ubuntu.com/Kernel/KernelBisection14:19
kernelbugapw, "HEAD is now at 32ac5b4... UBUNTU: Ubuntu-3.19.0-56.62"14:49
kernelbugapw, why 3.19? :)14:49
kernelbugwell, see you later...14:59
=== DevBox|2 is now known as DevBox
alkisgHi guys, post the 3.2 kernel I'm getting a 10-second delay at boot: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/125986117:01
ubot5Launchpad bug 1259861 in linux (Ubuntu) "5-10 second delay in kernel boot" [Medium,Confirmed]17:01
alkisgThe weird thing is that it happens in some real hardware and under virtualbox, but I was not able to reproduce it under KVM17:02
alkisgI'm guessing there's some "wait up to 10 seconds for that event" code somewhere, but how can I pinpoint it to help in solving the issue?17:03
alkisgIt's Ubuntu specific, I haven't seen it in any vanilla kernels so far17:03
TJ-alkisg: does a boot with 'debug' indicate where the delay is by the last messages before the delay?17:17
alkisgTJ-: no, I tried with debug and it made no difference in dmesg17:19
alkisgAnd booting with various clients, the messages I see around the delay are not the same17:19
alkisgTJ-, it's possible that it happens on your PC as well, you could run dmesg and see...17:20
apwalkisg, hmmmm, there must be a pattern17:20
apwcertainly i do not have a 10s that i can see17:20
TJ-alkisg: which exact Ubuntu kernel version do you see it on first (after 3.2) - we can go back through the commits from that17:20
TJ-alkisg: no, it doesn't happen for me :)17:21
alkisgI think that if I run the stock ubuntu live cds in virtualbox, it always happens17:21
alkisgSo I can test with e.g. 12.04.1 (it doesn't), 12.10, 12.04.2...17:21
alkisgWill that help?17:21
* alkisg has something running in kvm and can't use virtualbox right now, but will try it in half an hour or so when it finishes17:26
alkisgOne dmesg from 16.04 on some i5, netbooted with LTSP: http://paste.debian.net/417051/17:31
alkisgThe delay is before the initramfs gets loaded, at 2 => 12 sec17:31
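A jump like the 2 => 12 sec one described above can be located mechanically by scanning the dmesg timestamps for the largest pause between two consecutive lines. The following is a minimal sketch, assuming the default "[seconds.micros] message" prefix that dmesg prints; the function name and the sample lines are made up for illustration and are not taken from the paste.

```python
import re

# Matches the default dmesg prefix, e.g. "[   12.345678] some message".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gap(dmesg_text):
    """Return (gap_seconds, message_before, message_after) for the longest pause.

    Lines without a timestamp prefix are skipped.
    """
    previous = None
    best = (0.0, None, None)
    for line in dmesg_text.splitlines():
        match = TIMESTAMP.match(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if previous is not None and stamp - previous[0] > best[0]:
            best = (stamp - previous[0], previous[1], message)
        previous = (stamp, message)
    return best


if __name__ == "__main__":
    # Hypothetical sample, not real output from the machine discussed here.
    sample = "\n".join([
        "[    0.000000] first line",
        "[    2.000000] last line before the pause",
        "[   12.000000] first line after the pause",
        "[   12.500000] later line",
    ])
    gap, before, after = largest_gap(sample)
    print("%.1f s between %r and %r" % (gap, before, after))
```

Feeding it the saved output of `dmesg` from several clients would show whether the pause always sits between the same two messages.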
ogra_different compressions ?17:32
infinityThat kinda looks like you have a sleep 10 in your initrd. :P17:33
infinityWhich would be a local thing (or a weird package), cause I've never seen it.17:34
infinityalkisg: rgrep sleep /usr/share/initramfs-tools/ | pastebinit17:34
alkisginfinity: the delay is before the initramfs gets loaded17:35
alkisgBut weird as it sounds, I tried removing the "ip=" parameters from the cmdline17:35
alkisgAnd the delay seems to go away17:35
infinityOn what are you basing "before the initramfs"?17:35
alkisgip= is normally processed by the initramfs, but in this case it's also causing 10 sec delay before the initramfs17:35
alkisginfinity: because the delay happens before e.g. break=top17:36
ogra_you mean "before the initramfs writes to logs" ... 17:37
TJ-alkisg: have you considered it's due to system delays in loading/decompressing the initrd.img, as ogra_ hinted at?17:37
ogra_how do you know it doesnt operate 17:38
infinityTJ-: No way that machine would take 10s to load an initrd unless it was a several hundred megs.17:38
ogra_... it simply doesnt print for 10sec ... it might as well process something 17:38
alkisgTJ-: I think it is indeed because somehow ip=xxx is processed by the kernel (so same initrd and compression methods etc)17:38
alkisgGive me 5 mins to check17:38
TJ-alkisg: if the mass storage device has bad sectors there may be I/O errors going on17:40
ogra_infinity, well, a 25MB xz compressed initrd (typical ubuntu initrd nowadays) on a 600MHz single core CPU can surely take a while 17:41
ogra_(indeed this isnt a 600MHz single core :) )17:41
alkisgYup, ip= is what's causing it17:42
alkisgSo, my test so far is:17:42
infinityIndeed.  And it's obviously not the issue if removing cmdline args "fixes" it.17:42
infinityNor is it I/O issues, etc.17:42
ogra_yeah17:42
alkisgI put break=top. I get to the initramfs prompt in 2 secs.17:42
alkisgI put ip=dhcp break=top. I get to the initramfs prompt in 12 secs.17:42
alkisgAnd of course ip= is processed long after "top" by the initramfs17:43
ogra_is the NIC driver builtin ? 17:43
alkisgNo idea, but udev hasn't run yet at that point...17:43
alkisgLet me reboot to see which nic it has17:43
* ogra_ was more interested in modprobe than udev17:44
infinityIndeed, nothing *should* be doing much with IP before break=top17:45
infinityBut that doesn't stop people from being silly.17:45
infinity(note that conf/conf.d is sourced, so one can have code in there, for instance)17:45
alkisgI can put a custom init inside the initramfs if that'll help in proving that initramfs isn't to blame17:46
alkisgHmm or I could just not use an initramfs and check the time of the kernel panic17:46
infinitySure, just 'ln sh init'17:46
alkisgogra_: 03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)17:46
alkisgStock 16.04, I suppose that's not builtin, right?17:46
infinityOr drop the initrd, yes.17:46
ogra_right,, not builtin ... 17:47
ogra_could be the module loading (or the firmware loading) that slows you down17:47
infinityNah.17:47
infinityThe modules are loaded after his delay.17:47
infinitydmesg is pretty clear on that.17:47
infinityI suppose the kernel could be attempting to configure a nonexistent device for 10s, but that seems pretty amazingly silly if it is (and seems like a bug people would have yelled about earlier)17:48
ogra_how would ip= work without a network device ?17:48
infinityogra_: The initrd processes it later.17:48
ogra_so it times out til it notices there is no NIC yet ? 17:49
alkisgSo, without initrd. (1) without ip= → kernel panic in 0.58. (2) with ip= → kernel panic in 12 sec.17:49
alkisgI think that rules out the initramfs issues17:50
* alkisg tries various ip=xxxx parameters, to see if "none" or something bypasses it...17:50
infinityIt's entirely possible the kernel is blocking with ip=dhcp because there's no interface yet, but man, that seems braindead.17:51
alkisgThere's a way to have the kernel assign an ip17:51
alkisgIt needs some CONFIG_XXX lines and the module to be included17:51
alkisgBut those aren't there by default in Ubuntu kernel builds17:51
TJ-net/ipv4/ipconfig.c is responsible for processing the "ip=" internally, dhcp is going to result in a delay if the DHCP server doesn't respond. Have you tcpdump-ed the network link?17:51
infinityTJ-: The dhcp server can't respond, there's no interface. :)17:52
infinity(nothing to tcpdump)17:52
alkisgTJ-: and also in my initial try, I was putting a static ip=x:x:x:x: there, that too caused a delay17:52
alkisg(IPAPPEND 3 in pxelinux)17:52
TJ-infinity: that'll not help :)17:53
ogra_just go for https://sourceforge.net/projects/ubuntubsd/ ... i heard BSD is a lot better for network stuff *g*17:55
alkisgHaha17:56
TJ-alkisg: try enabling dynamic debug during boot for the ipconfig handler to begin with:   ... "ddebug_query=file net/ipv4/ipconfig.c +pflm" ...18:00
alkisgTJ-: I just put that part in the cmdline? Thanks, trying...18:02
TJ-alkisg: depending on the age of the kernel that may need some modification since things have changed a lot with both the key and the pr_debug call sites18:03
alkisgStock 16.04, vmlinuz-4.4.0-14-generic i38618:03
alkisgLinux 4.4.0-14-generic #30-Ubuntu SMP Tue Mar 15 13:02:52 UTC 2016 i686 i686 i686 GNU/Linux18:04
TJ-ok, you may need to replace 'ddebug_query=' with 'dyndbg='18:05
TJ-see Documentation/dynamic-debug-howto.txt for more detail18:07
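The boot parameter suggested above can be assembled from a list of source files. This is a rough sketch: the helper name `dyndbg_param` is hypothetical, and it assumes the `file <path> <flags>` query form with `;` between queries as described in the dynamic debug howto. It only has an effect on a kernel built with dynamic debug support and on files that actually contain pr_debug() call sites.

```python
def dyndbg_param(source_files, flags="+pflm"):
    """Build a dyndbg= kernel command line argument for the given source files.

    Each file becomes one "file <path> <flags>" query; queries are joined
    with "; " and the whole value is quoted because it contains spaces.
    """
    queries = ["file {} {}".format(path, flags) for path in source_files]
    return 'dyndbg="{}"'.format("; ".join(queries))


if __name__ == "__main__":
    print(dyndbg_param(["net/ipv4/ipconfig.c"]))
```

How the quotes survive depends on the boot loader configuration, so the printed string may need extra escaping there.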
alkisgI tried both, but I didn't see any changes. Would I be seeing more output in the screen, or does it go to some internal files e.g. under /sys, /proc or whatever?18:08
TJ-you will see it in the dmesg output18:09
alkisgNothing there... are you sure that's the correct file? Isn't that from klibc which goes inside the initramfs? Is that same file included in the kernel as well?18:10
TJ-there are a lot of pr_debug() sites in ipconfig.c so if you've set "ip=dhcp" i'd expect to see something. I seem to recall the kernel will stop processing the options after a "--" so make sure it's before that if it occurs18:10
TJ-net/ipv4/ipconfig.c is the source file where the code for 'ip=' lives18:11
alkisgI have ip=some:static:ip, and no "--"... I'll try with ip=dhcp as well18:12
TJ-it is possible, if there's no network device, that code never gets called. in which case you'd need to identify which part of the code is responsible for 'looking' for network devices and looking for pr_debug() call sites there, then putting them on the command line as well18:13
alkisgNo difference with ip=dhcp.18:14
alkisgThe good thing is that now it's very easily reproducible18:14
alkisgAnyone can just put ip=dhcp in his cmdline and reproduce it18:15
alkisg(maybe not in kvm though, maybe the kernel handles virtualized networking differently)18:19
infinityalkisg: VIRTIO_NET is builtin on that kernel.18:22
infinityalkisg: Which helps the argument that the kernel is blocking when there's no device to configure.18:22
alkisgLooks like it... there are also several timeouts in ipconfig.c that may match what I'm seeing18:25
alkisgip=none doesn't cause the issue18:25
alkisgip=10.161.254.61:10.161.254.11:10.161.254.1:255.255.255.0:pc61::none does cause it18:29
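To read a long static value like the one above, it helps to split it into its colon-separated fields. The sketch below assumes the first seven fields in the order given by the kernel's nfsroot documentation (client IP, server IP, gateway, netmask, hostname, device, autoconf); the function name and dictionary keys are chosen for illustration only.

```python
# Field order as assumed from the kernel's nfsroot documentation.
IP_FIELDS = (
    "client_ip", "server_ip", "gw_ip", "netmask",
    "hostname", "device", "autoconf",
)


def parse_ip_param(value):
    """Split the value of an ip= boot parameter into named fields.

    A value without colons (for example "dhcp" or "none") is treated as
    the autoconf method. Empty fields are left out of the result.
    """
    if ":" not in value:
        return {"autoconf": value}
    parts = value.split(":")
    return {name: part for name, part in zip(IP_FIELDS, parts) if part}


if __name__ == "__main__":
    fields = parse_ip_param(
        "10.161.254.61:10.161.254.11:10.161.254.1:255.255.255.0:pc61::none"
    )
    for name, part in fields.items():
        print(name, "=", part)
```

For that value the device field is empty and autoconf is "none", so the delay shows up even though no DHCP request is asked for.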
TJ-alkisg: you might want to add to that existing dyndbg this: ... "; file drivers/net/virtio_net.c +pflm"...18:37
alkisgThank you TJ-, trying...18:38
TJ-basically, in the source tree once you identify a location (in the built-in source at this point) do a "git grep 'pr_debug' path/to/files" to see if there are dyn-debug sites to make use of18:39
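Where git is not at hand, the same search for pr_debug() call sites can be approximated with a short script. This is only a sketch of the idea: the function name is hypothetical, it looks at .c and .h files only, and it matches the literal text `pr_debug(` without understanding macros that wrap it.

```python
import os
import re

# A literal pr_debug( call, allowing whitespace before the parenthesis.
PR_DEBUG = re.compile(r"\bpr_debug\s*\(")


def pr_debug_sites(root):
    """Map each .c/.h file under root to the line numbers that call pr_debug()."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                lines = [
                    number
                    for number, text in enumerate(handle, 1)
                    if PR_DEBUG.search(text)
                ]
            if lines:
                hits[path] = lines
    return hits
```

A file that does not appear in the result has no direct pr_debug() sites, so a dyndbg query for it would not be expected to print anything.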
alkisgI think that I will need to debug ipconfig.c though, and that only has "ifdef IPCONFIG_DEBUG" there, no pr_debug...18:40
alkisgAnd since the driver is realtek, I'm not sure that debugging virtio will help18:41
alkisgMaybe all that means I'll have to build the kernel myself, while setting IPCONFIG_DEBUG there? :-/18:41
alkisgVirtualbox with virtio instead of intel nic emulation, doesn't have the delay18:44
alkisg(it takes 3 minutes to load the kernel/initrd via the network with virtio under vbox, but ok that's completely unrelated, it just makes debugging that way suck)18:45
alkisgHrm, I can't see any debug messages with virtio either18:58
alkisgSo yesterday booting LTSP clients in 16.04 took 45 seconds, now with this and 2 other delays I got rid of caused by the initramfs, it got down to 20 seconds :)19:03
kbuganyone awake? :)20:59
kbugstill issues with the bisection...20:59
=== hallyn81 is now known as hallyn
