=== smeso_ is now known as smeso
=== broder_ is now known as broder
=== Alives_ is now known as Alives
=== frickler_ is now known as frickler
=== himcesjf_ is now known as him-cesjf
[14:08] cking, yo ... zfs ... specifically zfs-dkms looks to have a Recommends of zfs-dkms which is trying to pull that into main, i suspect it should be a Suggests: as we have it in kernel
[14:09] apw, ah, good catch, I'll fix that
[14:29] hello, I have an issue with the HWE kernel for xenial "linux-image-4.13.0-37-generic", it's freezing my server some minutes after the boot, is this a known issue?
[14:30] no issue with 4.13.0-32-generic
[14:31] Slashman, hard hangs which are repeatable, i'd not say that was something i at least am aware of
[14:31] Slashman: have you tried disabling the spectre and meltdown fixes? "nospectre_v2" and "nopti"
[14:31] apw: We've seen a few recently reported in #ubuntu but as it's a freeze there's very little in the way of evidence
[14:32] I'll try to disable the spectre and meltdown fixes
[14:32] There's bug 1746806 which I think is a current reported hard hang.
[14:32] bug 1746806 in linux-aws (Ubuntu) "sssd appears to crash AWS c5 and m5 instances, cause 100% CPU" [Critical,In progress] https://launchpad.net/bugs/1746806
[14:33] rbasak, i thought that one exposed as an sssd burning cpu and not doing its work
[14:33] apw: 100% CPU reported via the host AIUI, but still a hard hang.
[14:34] Slashman, also you say -32 is good, have you tried any of the ones between that and -37? there is definitely a -36
[14:34] rbasak, i know someone or other is looking at that one
[14:34] TJ-: I'm on 4.13.0-32-generic currently and all spectre and meltdown are already in: https://paste.ubuntu.com/p/zGCqvwsPzb/
[14:35] Slashman, there is also a -38 in -proposed which is worth confirming is also broken
[14:35] Slashman: right, I'm suggesting you *turn it off* to find out if those fixes are causing the issue
[14:35] apw, I have -36, I can try that, but I can't test everything, the server takes some time to reboot and I will need it online and running
[14:38] Slashman, a problem indeed if you don't have a development environment to test updates before deployment
[14:40] TJ-: is there any launchpad report about a similar issue so I can look if it's similar?
[14:41] Slashman: you can search for 'freeze' but that's so generic I doubt it'll be helpful
[14:42] Slashman, in a perfect world you would be trying to get a crashdump off the box
[14:43] i am sure that is what support would recommend in this case
[14:46] hm, I think the issue may be related to a BIOS update in the end, I'll test with a rollback of the BIOS
[14:46] Slashman, what is in the new bios update
[14:46] one sec
[14:47] i hope that isn't the new supposedly ok intel microcode
[14:48] The syslog/dmesg shows this multiple times: https://paste.ubuntu.com/p/SxTjMbCR5T/
[14:49] that is -32 which is 'ok'
[14:49] that's microcode from intel indeed...
[14:50] http://www.dell.com/support/home/en/us/frbsdt1/drivers/driversdetails?driverId=NMF8F
[14:51] apw: those fuckers, is there any source I can send to dell with the issue that the microcode update does?
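Note: the suggestion above was to boot with "nospectre_v2" and "nopti" to see whether the mitigations are behind the freeze. A minimal sketch of how one might check what the running kernel actually has, assuming a Linux system; has_param is a hypothetical helper written for this note, and the /sys files only exist on kernels that expose them:

```shell
#!/bin/sh
# has_param CMDLINE PARAM: hypothetical helper, succeeds if PARAM appears
# as a whole word in the kernel command line string CMDLINE.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# The command line the running kernel was booted with (empty if unreadable).
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

for p in nopti nospectre_v2; do
    if has_param "$cmdline" "$p"; then
        echo "$p: present on the running kernel's command line"
    else
        echo "$p: not set"
    fi
done

# Mitigation status as reported by the kernel, where these files exist.
for f in /sys/devices/system/cpu/vulnerabilities/*; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    fi
done

# Microcode revision of the first CPU, if /proc/cpuinfo reports one.
grep -m1 microcode /proc/cpuinfo 2>/dev/null || true
```

On Ubuntu the parameters would normally be added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by "sudo update-grub" and a reboot. Comparing the microcode line before and after a BIOS change is one way to see whether the firmware brought a different revision.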
[15:03] apw, re the zfs-dkms recommends, I can't see what you are referring to in the kernel (as I was just cross referencing what you said about the Suggests in the kernel package)
[15:07] cking, i am referring to the Recommends present on zfsutils-linux (i think it is) which is pulling zfs-dkms into main
[15:08] apw, gotcha
[15:08] * apw watches the auto test-builds suffer because of our recent -backlog policy
[15:10] and worse i cannot fix the worst egregious issues without the queue draining
[16:32] hello. I was wondering if I can get some help with understanding the kernel to my system better
[16:57] TJ- , apw: postmortem: it was the BIOS with the intel microcode, latest kernel works perfectly with the rolled-back BIOS, thanks for your help
[16:58] Slashman: Ouch! Is that the revised Intel microcode ?
[16:58] Slashman: the earlier update in January was withdrawn, so if this is the new release, that isn't good
[16:58] TJ-: I'm not sure, the last update of the BIOS is 26 Feb 2018, so I'll say yes
[16:59] Slashman: have you notified Dell?
[16:59] I've called dell because I have an issue on another server where I have a memory issue and they told me to update the bios and it seems like they already know, but that's not very clear
[17:00] those technicians are outsourced in Morocco (I'm in France) so they are not the brightest
[17:08] Slashman: it might be worth sending an email FAO one of the Dell kernel engineers, asking them to pass on the info/your contact details to whoever is responsible internally. E.g: Mario Limonciello
[17:10] TJ-: you are right, I'll send an email to him
[17:11] Slashman, hrm, let's hope it was the old microcode, and not the new
[17:11] but if you do find out, it would be good to find out
[17:12] we'll see... it's either that or some other undocumented changes in their BIOS I guess
[17:15] From the various Dell updates it appears to be the late January / early February revision
[17:26] sounds like there better be an easy way to prevent microcode loading when the new ones get included in the OS...
[17:28] I dumped and extracted the update, its own Version.txt says "January 08, 2018"
[17:28] which to me sounds like the /original/ microcode
[17:34] yeah, who says newer ones won't crash some "random" hardware/software combinations which Intel didn't test?
[17:34] but who*
[17:35] (if even some quite popular combinations aren't tested...)
[17:35] These are Dell updates; I'd expect Dell would test them
[17:36] right, but maybe they did and it worked with some other OS/kernel version, etc.
[17:38] and when you include it in the OS, Canonical can't test more than a fraction of all hardware out there
[17:39] TJ-: incredible, thank you for looking inside the BIOS archive itself
=== kees_ is now known as kees
[17:46] TJ-: how did you extract the "R730-020701C.hdr" file to find the readme?
[17:48] Slashman: "mkdir Hack; cd Hack; wget https://downloads.dell.com/FOLDER04787858M/1/BIOS_NMF8F_LN_2.7.1.BIN; chmod +x BIOS_NMF8F_LN_2.7.1.BIN; fakeroot BIOS_NMF8F_LN_2.7.1.BIN --extract ./ ;"
[17:51] "/usr/bin/fakeroot: line 178: BIOS_NMF8F_LN_2.7.1.BIN: command not found"
[17:51] last command seems incorrect
[17:52] ok, I just did "./BIOS_NMF8F_LN_2.7.1.BIN --extract ./"
[17:57] I used fakeroot to avoid needing to be root user
[17:58] TJ-: I extracted it inside a temporary lxc container
[17:59] Slashman: :D
[18:00] Slashman: I'm disassembling the UEFI image now, might be able to isolate the actual microcode update itself and use iucode-tool on it
[18:30] what is fakeroot?
[18:36] prophecy04: from what I understood, it's to make a program use the current dir as "/", avoiding in this case the binary to extract stuff at /opt/dell and instead extract at ./opt/dell
[19:01] Slashman: no, that's chroot; fakeroot makes programs think they're running as uid 0
[19:01] oh ok
[19:11] https://paste.ubuntu.com/p/t9TXwjNZNR/
[19:12] Sorry. I left out that I had a successful bionic build recently, and now this fail from fresh clone.
[19:12] All up to date.
[20:38] https://paste.ubuntu.com/p/FFf6qD9dnk/
[21:33] ycyclist, yep, those tests are too stringent right now ... feel free to ignore them
[21:33] we should have a better abi check for that in the next couple of days
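
Note: two points from the fakeroot exchange above can be tried locally without downloading anything. This is only a sketch; demo.bin is a hypothetical stand-in for the self-extracting BIOS package, and the fakeroot part is skipped if fakeroot is not installed.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing is left behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# demo.bin: hypothetical stand-in for the vendor's self-extracting package.
printf '#!/bin/sh\necho "running as uid $(id -u)"\n' > demo.bin
chmod +x demo.bin

# A bare file name is looked up in $PATH, which normally does not contain
# the current directory, so this fails with "command not found" (status 127),
# the same failure as the quoted "fakeroot BIOS_... --extract" line.
bare_status=0
sh -c 'demo.bin' 2>/dev/null || bare_status=$?
echo "bare name exit status: $bare_status"

# An explicit ./ path bypasses the $PATH search.
./demo.bin

# fakeroot does not change the root directory (that is chroot); it makes
# the wrapped process believe it is running as uid 0.
if command -v fakeroot >/dev/null 2>&1; then
    fake_uid=$(fakeroot ./demo.bin) || fake_uid="fakeroot failed to run"
    echo "under fakeroot: $fake_uid"
else
    echo "fakeroot is not installed; skipping that part"
fi

cd / && rm -rf "$workdir"
```

By the same reasoning, "fakeroot ./BIOS_NMF8F_LN_2.7.1.BIN --extract ./" would presumably have been the working form of the original command, though that was not confirmed in the channel.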