[00:00] bug: https://bugs.launchpad.net/ubuntu/+source/linux-hwe-edge/+bug/1852581
[00:00] Ubuntu bug 1852581 in linux (Ubuntu) "hwe-edge kernel 5.3.0-23.25 kernel does not boot on Precision 5720 AIO" [Critical,In progress]
[00:00] fix: https://lists.ubuntu.com/archives/kernel-team/2019-November/105544.html
[00:00] sarnold, shibboleth: ^
[00:01] tyhicks: heh, bummer the simple simple fix can't actually be used by anyone but us
[00:01] ty
=== cjwatson_ is now known as cjwatson
[08:40] Hi, I found a way to get a kernel oops by using tcpdump to write to a file on a USB stick and then unplugging the USB stick.
[08:40] what is the best way to report this?
[08:41] I was able to reproduce this on multiple platforms and kernel versions.
=== fling is now known as goffee
=== goffee is now known as fling
=== henrix_ is now known as henrix
[13:11] I installed crashdump tools (kexec\kdump) to acquire a crashed Linux kernel dump and enabled the “kernel.panic = 60”, “kernel.softlockup_panic = 1” and “kernel.hardlockup_panic = 1” variables. I triggered some kind of misbehavior in the kernel by writing to the /sys/kernel/debug/provoke-crash/DIRECT file. I see that my system just
[13:11] reboots without copying the crash dump to /var/crash.. Can someone help me to collect the coredump?
[13:11] *crash dump
[13:18] My system has linux-crashdump\kexec-tools\crash deb packages
[13:18] and kdump-tools
[14:13] m90s, the best place to report that is by filing a bug on launchpad: https://bugs.launchpad.net/ubuntu/+source/linux
[14:21] connor_k: I guess that person is gone from IRC already
[14:22] sub526: do you have console access? a virtual terminal would suffice (tty1)
[14:22] cascardo: yeah I suppose so :-/ maybe all the join/leaves aren’t present in the history from my bouncer to my phone. That’s okay, still good advice in general lol
[14:23] cascardo: yes, I have a monitor connected
[14:23] sub526: and which version of kernel and kdump-tools do you have?
[14:24] sub526: okay, I mean a console as opposed to a graphical environment. at least, that could help finding out if the panic kernel is executed at all
[14:24] cascardo: kdump-tools : 1:1.6.3-2~16.04.1 and 5.0 kernel
[14:25] sub526: that seems to be xenial, what is uname -r ?
[14:26] For debugging purposes, I compiled and installed the vanilla kernel 5.0 on an Ubuntu 16.04.4 machine
[14:27] well, that version of makedumpfile probably does not support 5.0. at least, you would get a very large dumpfile on /var/crash/
[14:27] but at worst, it wouldn't be able to dump the kernel at all
[14:27] sub526: by debugging purposes, do you mean debugging the kdump/crash situation or something else?
[14:30] cascardo: no, I'm facing a system hang issue for the actual test case... So to debug this I enabled a few debug options like KASAN etc. on the plain kernel and then rebuilt it.
[14:31] cascardo: Before executing the actual test case, I'm trying to validate whether my system supports collecting the crashdump or not...
[14:31] sub526: well, I would suggest you get a more recent makedumpfile/kdump-tools too, then, if that's possible
[14:32] sub526: and which config did you use? the same one as Ubuntu's? and which kernel source?
[14:33] cascardo: I downloaded the kernel source from https://mirrors.edge.kernel.org/pub/linux/kernel/v5.x/
[14:34] linux-5.0.1.tar.gz
[14:35] okay, so that doesn't contain Ubuntu patches
[14:35] what about the config?
[14:35] Regarding .config, I added certain debug options under the 'kernel hacking' menu
[14:35] added in respect to what base?
[14:35] I ran make defconfig and then make menuconfig
[14:37] CONFIG_KEXEC=y, CONFIG_KEXEC_FILE=y, CONFIG_ARCH_HAS_KEXEC_PURGATORY=y, CONFIG_KEXEC_JUMP=y, CONFIG_KEXEC_CORE=y
[14:37] my .config has the above stuff related to kexec
[14:38] also CONFIG_CRASH_DUMP=y, CONFIG_COREDUMP=y
[14:39] do you see any issues in this approach?
=== ben_r_ is now known as ben_r
[14:41] sub526: what about CONFIG_CRASH_CORE ?
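The kexec and crash-dump options named in this exchange can be checked against a .config mechanically rather than by eye. The following is a minimal sketch, not a definitive checker: the function name `missing_kdump_options` and the `REQUIRED_OPTIONS` tuple are hypothetical, and the option set is only the subset quoted in the conversation, not a complete list of what kdump needs on every architecture.

```python
import re

# Options quoted in the conversation; an illustrative subset, not exhaustive.
REQUIRED_OPTIONS = (
    "CONFIG_KEXEC",
    "CONFIG_KEXEC_CORE",
    "CONFIG_CRASH_CORE",
    "CONFIG_CRASH_DUMP",
)


def missing_kdump_options(config_text, required=REQUIRED_OPTIONS):
    """Return the options in `required` that are not set to 'y' in a .config."""
    enabled = set()
    for line in config_text.splitlines():
        # Built-in options look like "CONFIG_FOO=y"; disabled ones appear
        # as "# CONFIG_FOO is not set" and are skipped by this pattern.
        match = re.match(r"^(CONFIG_[A-Za-z0-9_]+)=y\s*$", line.strip())
        if match:
            enabled.add(match.group(1))
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    sample = (
        "CONFIG_KEXEC=y\n"
        "CONFIG_KEXEC_CORE=y\n"
        "# CONFIG_CRASH_DUMP is not set\n"
    )
    print(missing_kdump_options(sample))
```

For the sample text the sketch reports `CONFIG_CRASH_CORE` and `CONFIG_CRASH_DUMP` as missing. Against a real tree, the text of the build's `.config` would be passed in instead of the sample string.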
[14:42] cascardo: CONFIG_CRASH_CORE=y
[14:43] sub526: okay, so what do you get on your console after panic?
[14:43] sub526: and by the way, can you update kexec-tools and makedumpfile/kdump-tools ?
[14:43] yeah, kexec-tools could be an issue
[14:44] by update, I mean you should get a version from bionic, at least, but better if newer than that, like the one from eoan
[14:44] cascardo: sure, I will update those tools. sudo apt-get update and then install, or any other method?
[14:46] On the console I see the corresponding crash dump and after the kernel.panic timeout it reboots, but /var/crash is empty.. also no crash log in /var/log/kern.log
[14:48] sub526: what do you mean by corresponding crash dump ?
[14:48] Dumb question: console log means whatever is displayed on the monitor, right? What exactly does console mean?
[14:48] sub526: yeah, if you see reboot logs and crash dump logs on the monitor, that's sufficient
[14:49] cascardo: I triggered the crash via a SYSRQ key press and I can see the corresponding log on the monitor
[14:49] sub526: what is the version of kexec-tools?
[14:50] let me check
[14:50] 1:2.0.16-1ubuntu1~16.04.1
[14:50] I checked that xenial-updates has a reasonably recent version
[14:50] yeah, 2.0.16, not too old
[14:53] Cascardo: As per https://help.ubuntu.com/lts/serverguide/kernel-crash-dump.html , I understood that a manual intervention is required in order to capture the memory for 'machine exceptions'. What exactly does ‘manual intervention’ mean?
[14:54] sub526: I haven't seen that guide before, I am not certain what that means. but you are doing the right thing in testing that it's working, because kdumping is not a certain success as you can see by yourself
[14:55] sub526: I need to attend a meeting and get out after it, can you open a bug? I can promise I can work on it, unless you can reproduce it with the Ubuntu kernel? you should try linux-hwe, you will get a 4.15 kernel on xenial
[14:56] cascardo: Sure, thanks for your support.. Bye for now..
[14:56] sub526: and do you see makedumpfile being called at all? is that what you mean by corresponding log? or only the panic stack trace?
[14:57] I did not check for makedumpfile being called, but I see the panic stack trace.
[14:58] what exactly needs to be checked to see whether makedumpfile is called or not?
[14:59] sub526: well, there should be two reboots
[15:01] cascardo: /sbin/reboot is part of systemd-sysv - does this get called? what is the other reboot?
[15:29] sub526: so, what I mean is that after panic, the system should start executing the kdump kernel, which will trigger the capture of the dump followed by a cold reboot
[15:29] ok
[15:30] if you don't see the logs for the kdump kernel after the panic, but a cold reboot, then there will never be an opportunity for makedumpfile to execute
[15:31] if you see those logs, but that kdump kernel panics itself, or there is an OOM on that kernel, then those logs would be useful to understand what is happening. in the case of an OOM, increasing the memory for crashkernel allocation should fix it
[15:32] cascardo: what should kernel.panic be set to?
[15:34] sub526: do you see the "Rebooting in 60 seconds..." message? if you do, then crash kernel is not being executed?
[15:34] what is the result of kdump-config status ?
[15:34] and kdump-config show ?
[15:36] cascardo: current state : ready to kdump
[15:38] cascardo: https://pastebin.com/Lb0tADcZ
[15:46] cascardo: kdump-config show, looks ok?
[15:53] sub526: that looks ok
[15:53] sub526: I need to go out now, just let me know if you see the "Rebooting in 60 seconds..." or not
[15:54] cascardo: I did not see that log
[15:54] What does it mean?
[16:07] cascardo: Thanks a lot for your support. I need to leave now... will catch you next week.
=== himcesjf_ is now known as him-cesjf
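The triage described in this exchange (look for kdump kernel logs after the panic, watch for an OOM in that kernel, and treat a bare "Rebooting in ..." message as a sign the crash kernel never ran) can be written down as a rough heuristic over captured console text. This is only a sketch under stated assumptions: the function names and result labels are hypothetical, the matched strings are illustrative guesses at what appears on the console, and real output varies with kernel and kdump-tools versions. The second helper extracts the `crashkernel=` value from a kernel command line string such as the contents of /proc/cmdline.

```python
import re

# Hypothetical labels chosen for this sketch.
NO_CRASH_KERNEL = "no-crash-kernel"
CRASH_KERNEL_OOM = "crash-kernel-oom"
DUMP_ATTEMPTED = "dump-attempted"
INCONCLUSIVE = "inconclusive"


def triage_console_log(log_text):
    """Classify post-panic console text using the rules from the discussion."""
    text = log_text.lower()
    # An OOM in the kdump kernel suggests the crashkernel reservation is too small.
    if "out of memory" in text or "oom-killer" in text:
        return CRASH_KERNEL_OOM
    # Any mention of makedumpfile means the kdump kernel got far enough to try a dump.
    if "makedumpfile" in text:
        return DUMP_ATTEMPTED
    # Only the panic timeout message: the crash kernel was probably never executed.
    if "rebooting in" in text:
        return NO_CRASH_KERNEL
    return INCONCLUSIVE


def crashkernel_reservation(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


if __name__ == "__main__":
    console = "Kernel panic - not syncing: sysrq triggered crash\nRebooting in 60 seconds.."
    print(triage_console_log(console))
    print(crashkernel_reservation("root=/dev/sda1 ro crashkernel=384M-:128M quiet"))
```

The check order is a design choice: evidence that the kdump kernel ran (an OOM or a makedumpfile mention) takes precedence over the timeout message, since a kdump kernel that fails may itself end in a timed reboot.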