=== smeso_ is now known as smeso | ||
=== broder_ is now known as broder | ||
=== Alives_ is now known as Alives | ||
=== frickler_ is now known as frickler | ||
=== himcesjf_ is now known as him-cesjf | ||
apw | cking, yo ... zfs ... specifically zfs-dkms looks to have a Recommends of zfs-dkms which is trying to pull that into main, i suspect it should be a Suggests: as we have it in kernel | 14:08 |
---|---|---|
cking | apw, ah, good catch, I'll fix that | 14:09 |
Slashman | hello, I have an issue with the HWE kernel for xenial "linux-image-4.13.0-37-generic", it's freezing my server some minutes after the boot, is this a known issue? | 14:29 |
Slashman | no issue with 4.13.0-32-generic | 14:30 |
apw | Slashman, hard hangs which are repeatable, i'd not say that was something i at least am aware of | 14:31 |
TJ- | Slashman: have you tried disabling the spectre and meltdown fixes? "nospectre_v2" and "nopti" | 14:31 |
TJ- | apw: We've seen a few recently reported in #ubuntu but as it's a freeze there's very little in the way of evidence | 14:31 |
Slashman | I'll try to disable the spectre and meltdown fixes | 14:32 |
rbasak | There's bug 1746806 which I think is a current reported hard hang. | 14:32 |
ubot5` | bug 1746806 in linux-aws (Ubuntu) "sssd appears to crash AWS c5 and m5 instances, cause 100% CPU" [Critical,In progress] https://launchpad.net/bugs/1746806 | 14:32 |
apw | rbasak, i thought that one exposed as an sssd burning cpu and not doing its work | 14:33 |
rbasak | apw: 100% CPU reported via the host AIUI, but still a hard hang. | 14:33 |
apw | Slashman, also you say -32 is good, have you tried any of the ones between that and -37 ? there is definitely a -36 | 14:34 |
apw | rbasak, i know someone or other is looking at that one | 14:34 |
Slashman | TJ-: I'm on 4.13.0-32-generic currently and all spectre and meltdown are already in: https://paste.ubuntu.com/p/zGCqvwsPzb/ | 14:34 |
apw | Slashman, there is also a -38 in -proposed which is worth confirming is also broken | 14:35 |
TJ- | Slashman: right, I'm suggesting you *turn it off* to find out if those fixes are causing the issue | 14:35 |
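A minimal sketch of how one might confirm whether the `nopti` and `nospectre_v2` parameters mentioned above are actually in effect after a reboot. The helper name `has_param` is hypothetical, and the sysfs directory is only present on kernels that carry the fixes. On Ubuntu the parameters would typically be added to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by `update-grub` and a reboot; that step is not performed here.

```shell
#!/bin/sh
# has_param CMDLINE PARAM: succeed if PARAM appears as a whole word in CMDLINE.
# (Hypothetical helper, written for this illustration only.)
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read the command line of the running kernel (empty if /proc is unavailable).
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

for p in nopti nospectre_v2; do
    if has_param "$cmdline" "$p"; then
        echo "$p: present on kernel command line"
    else
        echo "$p: not set"
    fi
done

# Kernels with the fixes report per-issue mitigation status in sysfs.
for f in /sys/devices/system/cpu/vulnerabilities/*; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    fi
done
```

This only reports state; it changes nothing on the machine, so it is safe to run on the production server before and after the test boot.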
Slashman | apw, I have -36, I can try that, but I can't test everything, the server takes some time to reboot and I will need it online and running | 14:35 |
apw | Slashman, a problem indeed if you don't have a development environment to test updates before deployment | 14:38 |
Slashman | TJ-: is there any launchpad report about a similar issue so I can look if it's similar? | 14:40 |
TJ- | Slashman: you can search for 'freeze' but that's so generic I doubt it'll be helpful | 14:41 |
apw | Slashman, in a perfect world you would be trying to get a crashdump off the box | 14:42 |
apw | i am sure that is what support would recommend in this case | 14:43 |
Slashman | hm, I think the issue may be related to a BIOS update in the end, I'll test with a rollback of the BIOS | 14:46 |
apw | Slashman, what is in the new bios update | 14:46 |
Slashman | one sec | 14:46 |
apw | i hope that isn't the new supposedly ok intel microcode | 14:47 |
Slashman | The syslog/dmesg shows this multiple times: https://paste.ubuntu.com/p/SxTjMbCR5T/ | 14:48 |
apw | that is -32 which is 'ok' | 14:49 |
Slashman | that's microcode from intel indeed... | 14:49 |
Slashman | http://www.dell.com/support/home/en/us/frbsdt1/drivers/driversdetails?driverId=NMF8F | 14:50 |
Slashman | apw: those fuckers, is there any source I can send to dell with the issue that the microcode update does? | 14:51 |
cking | apw, re the zfs-dkms recommends, I can't see what you are referring to in the kernel (as I was just cross referencing what you said about the Suggests in the kernel package) | 15:03 |
apw | cking, i am referring to the Recommends present on zfsutils-linux (i think it is) which is pulling zfs-dkms into main | 15:07 |
cking | apw, gotcha | 15:08 |
* apw watches the auto test-builds suffer because of our recent -backlog policy | 15:08 | |
apw | and worse i cannot fix the worst egregious issues without the queue draining | 15:10 |
prophecy04 | hello. I was wondering if I can get some help with understanding the kernel to my system better | 16:32 |
Slashman | TJ- , apw: postmortem: it was the BIOS with the intel microcode, latest kernel works perfectly with the rolled-back BIOS, thanks for your help | 16:57 |
TJ- | Slashman: Ouch! Is that the revised Intel microcode ? | 16:58 |
TJ- | Slashman: the earlier update in January was withdrawn, so if this is the new release, that isn't good | 16:58 |
Slashman | TJ-: I'm not sure, the last update of the BIOS is 26 Feb 2018, so I'll say yes | 16:58 |
TJ- | Slashman: have you notified Dell? | 16:59 |
Slashman | I've called dell because I have a memory issue on another server and they told me to update the bios, and it seems like they already know, but that's not very clear | 16:59 |
Slashman | those technician are outsourced in Morocco (I'm in France) so they are not the brightest | 17:00 |
TJ- | Slashman: it might be worth sending an email FAO one of the Dell kernel engineers, asking them to pass on the info/your contact details to whoever is responsible internally. E.g: Mario Limonciello <mario.limonciello@dell.com> | 17:08 |
Slashman | TJ-: you are right, I'll send an email to him | 17:10 |
apw | Slashman, hrm, let's hope it was the old microcode, and not the new | 17:11 |
apw | but if you do find out, it would be good to find out | 17:11 |
Slashman | we'll see... it's either that or some other undocumented changes in their BIOS I guess | 17:12 |
TJ- | From the various Dell updates it appears to be the late January / early February revision | 17:15 |
JanC | sounds like there better be an easy way to prevent microcode loading when the new ones get included in the OS... | 17:26 |
TJ- | I dumped and extracted the update, its own Version.txt says "January 08, 2018" | 17:28 |
TJ- | which to me sounds like the /original/ microcode | 17:28 |
JanC | yeah, who says newer ones won't crash some "random" hardware/software combinations which Intel didn't test? | 17:34 |
JanC | but who* | 17:34 |
JanC | (if even some quite popular combinations aren't tested...) | 17:35 |
TJ- | These are Dell updates; I'd expect Dell would test them | 17:35 |
JanC | right, but maybe they did and it worked with some other OS/kernel version, etc. | 17:36 |
JanC | and when you include it in the OS, Canonical can't test more than a fraction of all hardware out there | 17:38 |
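On the question of preventing microcode loading from the OS: as a hedged sketch, the x86 kernel accepts the `dis_ucode_ldr` boot parameter to disable its own microcode loader, and the revision currently running can be read from `/proc/cpuinfo`. The helper name `ucode_revs` below is hypothetical. Note that this only affects microcode shipped by the OS; microcode delivered inside a BIOS update, as in the case above, is applied by the firmware before the kernel starts.

```shell
#!/bin/sh
# ucode_revs: read /proc/cpuinfo-style text on stdin and print each
# distinct microcode revision once. (Hypothetical helper for illustration.)
ucode_revs() {
    awk -F': *' '/^microcode/ { print $2 }' | sort -u
}

if [ -r /proc/cpuinfo ]; then
    echo "running microcode revision(s):"
    ucode_revs < /proc/cpuinfo
fi

# Report whether the kernel's microcode loader was disabled at boot.
if grep -qw dis_ucode_ldr /proc/cmdline 2>/dev/null; then
    echo "kernel microcode loader: disabled (dis_ucode_ldr)"
else
    echo "kernel microcode loader: not disabled on the command line"
fi
```

Comparing the reported revision before and after a BIOS or package update is one way to tell which microcode a given boot is actually running.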
Slashman | TJ-: incredible, thank you for looking inside the BIOS archive itself | 17:39 |
=== kees_ is now known as kees | ||
Slashman | TJ-: how did you extract the "R730-020701C.hdr" file to find the readme? | 17:46 |
TJ- | Slashman: "mkdir Hack; cd Hack; wget https://downloads.dell.com/FOLDER04787858M/1/BIOS_NMF8F_LN_2.7.1.BIN; chmod +x BIOS_NMF8F_LN_2.7.1.BIN; fakeroot BIOS_NMF8F_LN_2.7.1.BIN --extract ./ ;" | 17:48 |
Slashman | "/usr/bin/fakeroot: line 178: BIOS_NMF8F_LN_2.7.1.BIN: command not found" | 17:51 |
Slashman | last command seems incorrect | 17:51 |
Slashman | ok, I just did "./BIOS_NMF8F_LN_2.7.1.BIN --extract ./" | 17:52 |
TJ- | I used fakeroot to avoid needing to be root user | 17:57 |
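The "command not found" error above comes from how a bare command name is resolved: it is looked up only in `$PATH`, which normally does not include the current directory, so the file needs an explicit `./` prefix (with fakeroot that would presumably be `fakeroot ./BIOS_NMF8F_LN_2.7.1.BIN --extract ./`). A small sketch of the lookup behaviour, using a hypothetical stand-in script `demo.BIN` rather than the real Dell installer:

```shell
#!/bin/sh
# Hypothetical stand-in for the downloaded installer; not the real Dell binary.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/sh\necho "extracting to $2"\n' > demo.BIN
chmod +x demo.BIN

# Bare name: searched for in $PATH only, so the shell reports
# "command not found" and the exit status is 127.
bare_status=0
( PATH=/usr/bin:/bin; demo.BIN --extract ./ ) 2>/dev/null || bare_status=$?

# Explicit relative path: runs the file in the current directory.
path_output=$(./demo.BIN --extract ./)

echo "bare name exit status: $bare_status"
echo "with ./ prefix: $path_output"
```

Running the real installer inside a throwaway container, as done above, remains a sensible precaution regardless of how it is invoked.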
Slashman | TJ-: I extracted it inside a temporary lxc container | 17:58 |
TJ- | Slashman: :D | 17:59 |
TJ- | Slashman: I'm disassembling the UEFI image now, might be able to isolate the actual microcode update itself and use iucode-tool on it | 18:00 |
prophecy04 | what is fakeroot? | 18:30 |
Slashman | prophecy04: from what I understood, it's to make a program use the current dir as "/", avoiding in this case the binary to extract stuff at /opt/dell and instead extract at ./opt/dell | 18:36 |
TJ- | Slashman: no, that's chroot; fakeroot makes programs think they're running as uid 0 | 19:01 |
Slashman | oh ok | 19:01 |
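The distinction can be seen directly: under fakeroot a program is told it has uid 0, while the filesystem root stays where it was (changing that is what chroot does). A minimal sketch, assuming the `fakeroot` package may or may not be installed; if it is missing or cannot run, the script records `unavailable` instead of failing.

```shell
#!/bin/sh
# Real uid of the invoking user.
real_uid=$(id -u)

# Under fakeroot, id should report uid 0 even though no privilege is gained.
if fake_uid=$(fakeroot id -u 2>/dev/null); then
    :
else
    fake_uid=unavailable
fi

echo "real uid:           $real_uid"
echo "uid under fakeroot: $fake_uid"
```

Files created under fakeroot appear root-owned only to processes inside that session; on disk they still belong to the real user.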
ycyclist | https://paste.ubuntu.com/p/t9TXwjNZNR/ | 19:11 |
ycyclist | Sorry. I left out that I had a successful bionic build recently, and now this fail from fresh clone. | 19:12 |
ycyclist | All up to date. | 19:12 |
ycyclist | https://paste.ubuntu.com/p/FFf6qD9dnk/ | 20:38 |
apw | ycyclist, yep, those tests are too stringent right now ... feel free to ignore them | 21:33 |
apw | we should have a better abi check for that in the next couple of days | 21:33 |