[00:28] is there a fix for meltdown coming out today for linux-hwe-16.04?
[00:29] vieira: no patches for v4.10, you need hwe-16.04-edge (4.13)
[00:29] Ok, I just installed linux-virtual-hwe-16.04-edge and rebooted
[00:30] vieira: 4.10 kernel from 17.04 reaches end of life on Jan 13th, so wasn't worth the effort to backport patches
[00:30] but have 4.13.0-21-generic
[00:30] I think it should be 4.13.0-25.29?
[00:30] vieira: waiting for it to appear in the -security archive
[00:31] ahh ok! Thanks :)
[01:13] Over in #ubuntu-server it was mentioned that I should mention this here: unable to boot past the RAM image loading on 4.4.0-108.131, meanwhile 4.4.0-104.127 continues to work fine. CPU is an Intel i5-4670K, motherboard is an ASUS Z87-A.
[01:16] thanks keithzg
[01:17] possibly related Ask Ubuntu, with stack-trace photo: https://askubuntu.com/questions/994067/kernel-panic-after-spectre-meltdown-update-16-04
[01:31] keithzg, are you still around?
[01:31] bjf: Yup
[01:31] bjf: He's been active in... there he is.
[01:32] keithzg, ok, may have a new kernel for you to try "soon"
[01:32] bjf: we're hoping keithzg can capture some log messages in the meantime ... got your camera handy, keithzg ?
[01:33] Yup, camera's ready although it's not much to see currently!
[01:33] hehehe, common problem
[01:37] What little I have so far is now at https://photos.app.goo.gl/nqKrJHknd11NuCeO2
[01:39] one last test, try adding "nopti" on that kernel's command line
[01:39] TJ-: I haven't tried adding "debug" yet, should I try both, or just "debug" first, or?
[01:41] keithzg: both, why not!
[01:41] keithzg: it looks like 'debug' on its own won't capture anything; looks like the kernel's initial setup is failing
[01:42] TJ-: hehe, shall do
[01:47] keithzg, sorry, i won't have a new kernel for you to try. please file a bug and then come back here and tell me what the bug id is
[01:47] TJ-: No dice, I just get the loading lines and nothing more.
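Adding options such as "debug" and "nopti" to a kernel's command line amounts to appending space-separated tokens to the end of the boot entry's "linux ..." line. A minimal sketch of that step, assuming a GRUB-style line; the function name add_kernel_options is hypothetical, and the sample path and root device are made up for illustration:

```python
def add_kernel_options(linux_line, options):
    """Append kernel options to a GRUB 'linux' line, skipping any already present.

    Hypothetical helper: it only manipulates a string and does not touch
    GRUB or any file on disk.
    """
    tokens = linux_line.split()
    for opt in options:
        if opt not in tokens:
            tokens.append(opt)
    return " ".join(tokens)


# Illustrative entry only; the kernel path and root device are assumptions.
line = "linux /boot/vmlinuz-4.4.0-108-generic root=/dev/sda1 ro quiet splash"
print(add_kernel_options(line, ["debug", "nopti"]))
```

The options go at the end of the line as bare tokens, and calling the helper twice with the same options leaves the line unchanged.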
[01:47] bjf: Fair enough, shall do
[01:48] keithzg: that's with 'nopti' ?
[01:48] TJ-: Yeah.
[01:48] keithzg: hmmmph, that IS strange
[01:48] Well, that *and* debug
[01:49] It's conceivable I don't know what I'm doing with editing GRUB entries on the fly
[01:49] keithzg: which suggests some of the other x86/mm sub-system patches not directly related to the pti patch-set may be implicated
[01:50] keithzg: you're editing the kernel command line via GRUB? pressing E on a highlighted entry, navigate to the line starting "linux ...", add the options towards the end of the line, e.g. " debug nopti" then IMPORTANTLY press Ctrl+X to boot with that change - don't return to the menu to start the entry else changes will be lost
[01:52] TJ-: For whatever reason Ctrl-x just writes x to the editor, so I went with F10 as the alternate option, yeah. But now I'm wondering if there were quotation marks around the whole thing and I didn't notice and put my options outside of them? The much smaller window you get when running GRUB in a non-graphical mode makes things harder to visually parse, heh. I'm just going to try adding nopti to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, regenerate
[01:52] grub, and then try another reboot.
[01:53] keithzg: no quotation marks required when editing the GRUB lines
[01:56] TJ-: Yeah I was just wondering in retrospect. And yeah, I verified that it's definitely trying to boot with that option, and definitely still hanging there. Bug report time!
[01:57] keithzg, ok, i will shortly have something for you to try...
i'm waiting for LP to complete copying something
[01:59] keithzg, https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/pti
[01:59] keithzg, in that ppa there will be "real soon now" a newer 4.4 kernel
[02:00] keithzg, it's telling me the kernel is there now
[02:02] * bjf is stepping away for a bit
[02:03] Well it's a banner day for errors for me, I just tried opening a bug on Launchpad and got Timeout Error ((Error ID: OOPS-809eb4fff20040fcf023ecf9bda0f5d3))
[02:04] bjf: Only newer package I'm seeing is linux-libc-dev?
[02:05] keithzg: maybe the LP server just got its PTI patches :p
[02:05] TJ-: hah!
[02:06] keithzg, 4.4.0-109.132 .. you probably need to pull down the .debs and dpkg -i them
[02:06] see https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/pti/+sourcepub/8710767/+listing-archive-extra 4.4.0-109-132
[02:07] apt-cache policy shows it Candidate: 4.4.0-109.132
[02:09] Hmm, apt-cache policy is still just showing me "4.4.0.108.113" as the latest for linux-image-generic. Manual .deb time then I guess!
[02:09] keithzg: did you "apt update" ?
[02:09] keithzg: presumably you added the ppa via apt-add-repository ?
[02:09] TJ-: Yes and yes
[02:10] keithzg: strange; I did the update and it shows
[02:10] keithzg: try "apt-cache policy linux-image-4.4.0-109-lowlatency"
[02:11] TJ-: Yeah, I see it there then, as I do too with `apt-cache policy linux-image-4.4.0-109-generic`. I guess it makes sense to just apt install *that*.
[02:12] Although I might need linux-signed-* correspondingly?
[02:13] keithzg: if using Secure Boot, yes.
[02:15] TJ-: I remain unsure, heh, been too long since I set up this server and I don't remember if I disabled that or not. Running `apt install linux-signed-image-4.4.0-109-generic linux-image-extra-4.4.0-109-generic linux-headers-4.4.0-109-generic` to be sure.
[02:20] Installed, let's give this a shot.
[02:25] Booted with 4.4.0-109-generic!
That's with "nopti" still on the kernel command line, I'm going to try removing that and seeing if it's still fine.
[02:31] Excellent, that seems to have done the trick. Many thanks, bjf and TJ- :)
[02:32] keithzg: so do we know what was wrong?
[02:33] TJ-: I certainly have no idea, all I know is that the newer kernel packages from the PPA have resulted in everything appearing to be fine now on my system.
[02:34] keithzg: Of course, I'd forgotten you've just updated to the very new build! I'm getting confused with the regression I'm investigating here
[02:35] TJ-: It's a busy time :)
[02:38] \o/
[02:38] keithzg, thanks!
[02:50] bjf: :)
[02:52] Worth mentioning, I ran into this again on another server, also using an ASUS motherboard (different, however: P8Z77-V LX in this case) and an older Intel CPU (i5-3550).
[02:53] Going to check in a bit if the newer PPA packages fix this one too (it does also reboot fine with the -104 kernel, so odds seem good). First I'm going to pick up a pizza, though, since this may be a long night ;)
[03:23] keithzg, we are rushing out the kernel you tested for us
[05:03] bjf: Good to hear :) Hopefully there aren't any more issues lurking!
[07:16] TJ-, bjf: that keithzg issue was one I pointed out twice in here already (including the fix), apw said he'll look at it on Monday already :-/ seems like I should have filed an LP bug earlier? I was under the impression that nobody would look at those atm while being busy backporting the huge patch sets.. going forward, I assume LP is the way to go again to report improvements/regressions?
[07:54] f_g: is that the _ds panic?
[07:55] yep
[07:55] (I know it's already fixed in the latest 4.4)
[07:56] if so reporting that here was the right thing, iirc you were going to test if that fix was the fix, and we had people scrambling to test too but positive reports didn't make it in time to hit the first updates
[07:56] reporting it here was worthwhile
[07:59] okay.
you're right, I did not explicitly state that it actually fixed the issue, I just pinged a second time about it not yet being applied in the pti branch. I think I just assumed that it would get in in any case since it was an obviously incomplete backport - could have stated it more clearly.
[08:01] going ahead, should I file bugs at LP in general, or just for already released kernels? are there any plans for Spectre / IBRS / RETPOLINE / .. already (or ways to contribute to that?)
[08:02] (FWIW, we haven't received any other negative feedback, so things are looking good atm)
[08:03] just bad luck on timing, we were juggling all the series at once
[08:03] it will be in the next update, and that will be soon, we are in free release mode
[08:04] Spectre is in progress for sure
[08:18] f_g, oh indeed it looks like there is a build in proposed with that fix right no
[08:18] w
[08:19] f_g, and i hear it was already approved for -updates, so it should be there 'soon'
[08:21] yes :) we only had to juggle two kernel releases here, and still missed some stuff in the first iteration. that's just how it is ;) gotta catch up on LKML now - if there is anything that can be done from over here regarding Spectre backporting/testing/.. just ping
[08:40] ack
[11:52] * ogra didn't notice this channel became +r
[11:53] it seems the just released kernel doesn't boot on my desktop ...
[11:53] i'm greeted with an oops right after grub loads it
[11:54] ogra: which kernel / distro?
[11:54] heh, ubuntu 16.04
[11:54] 4.4.0-108
[11:55] 4.4.0-109 likely fixes it, if the oops looks like https://imgur.com/a/vVbR0
[11:55] i had to go back to 4.4.0-104-generic
[11:55] see LP#1741934
[11:56] ah, i'm just behind one version then
[11:56] * ogra upgrades
[11:56] thx!
[11:59] apw et al. thanks for the hard work
[12:49] * alkisg also had an issue with 4.4.0-108 today...
4.4.0-104 booted, but -108 displayed a black screen without even loading the initramfs
[12:50] I solved it by installing the -hwe stack, maybe I should have filed a bug report instead...
[12:55] alkisg, if you get a chance try -109
[12:55] apw: thanks, if they call me again from that school I will do so
[13:40] any known issues with 4.13.0.26.46 linux-image-generic-hwe-16.04 and kernel panics?
[13:46] thresh: best to just tell us the symptoms you are seeing
[13:52] apw, as it's a hetzner server, I have limited visibility (and it seems IP KVM has issues, too). I'm trying to boot it into rescue now to maybe have more info.
[13:52] here's what I got from the support: "But it seems as the server shows no video output anymore (or responds to key strokes) once the boot process is nearly finished and the login prompt should show up"
[13:52] it's an AMD Ryzen system, too.
[13:53] I'm sorry I don't have a lot of info atm, still waiting for it to boot into rescue system.
[13:53] And it turns out I don't have a netconsole set up on that server :|
[13:55] the kernel version I've got installed is 4.13.0-26.29~16.04.2
[14:38] I was able to boot into the rescue system, and there are no logs, sadly.
[14:45] changed grub default back to the 4.8.0-56 and system booted up just fine. I understand that that does not help.
[15:04] ok, tried booting back to 4.13.0-26.29~16.04.2, after some kernel initialization it freezes, and goes to "no signal" on KVM again. Sadly, it seems too early for netconsole to kick in.
[15:05] and no panic stack trace on the screen either.
[16:09] On 16.04, xtables-addons dkms fails to build with the new 4.13 kernel.
[16:10] I do have another AMD system locally, although it's AMD Turion(tm) II, not Ryzen. I'm going to test the upgrades tomorrow.
[16:10] (with physical access, too)
[16:24] thresh: given the 'had to go back to a 4.8 kernel', I'm guessing you hadn't run any of the 4.10 linux-hwe kernels? Do any of the 4.10 linux-hwe kernels work for you?
Also you could manually install the linux-hwe-edge 4.13.0-21.24~16.04.1 to try to narrow it down to just being the meltdown/kpti patchset that's breaking things for your hardware.
[16:25] sbeattie, yeah, I've been using 4.8.0 since, well, it ran fine for me for quite some time.
[16:25] I'm going to try those kernels now.
[16:36] sbeattie, linux-image-4.13.0-21-generic=4.13.0-21.24~16.04.1 booted up fine (except did not bring up the network interfaces).
[16:38] sorry for the question if it's already been addressed, but the MeltdownAndSpectre wiki page says the linux-aws kernel version for Xenial is 4.4.0-1047.56 whereas the linked 3522-1 says it is 4.4.0.1047.49. Any clarity on which is correct?
[16:38] USN-3522-1 that is
[16:41] thresh: likely need linux-image-extra-4.13.0-21-generic=4.13.0-21.24~16.04.1 too
[16:42] yeah, r8169
[16:42] but, I guess, that kinda sorta means that kpti patches are the problem
[16:43] ctennis: 4.4.0.1047.49 is the linux-meta version, which update-manager uses to know to pull in the specific linux-image 4.4.0-1047.56 package.
[16:44] that's seemingly really confusing
[16:45] (more correctly, linux-meta-aws version, and linux-image-4.4.0-1047-aws)
[16:46] confusing> a bit yes, but it's what allows you to have multiple kernel images installed so that you can rollback if a specific version fails to work for you.
[16:47] right, it's just the different #s between the two pages
[16:48] I had 1047.49 installed, but it did NOT have 4.4.0-1048.57 with it
[16:48] 4.4.0.1047.49 I mean
[16:48] but just 30 minutes ago it became available, so now i have 4.4.0.1048.50 and the corresponding 4.4.0-1048.57
[16:50] anyway, looks like its sorted now, and I was just in a small window of time between updates
[16:54] ctennis: uhhhh, it should not be possible to have linux-image-aws 4.4.0.1047.49 installed without also having linux-image-4.4.0-1047-aws (4.4.0-1048.57) installed, as the former has a direct dependency on the latter.
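The relationship between the two version strings can be shown with a small sketch. It assumes the five-field layout seen in the meta versions quoted here (for example 4.4.0.1047.49 and 4.4.0.108.113), where the fourth field is the ABI number that appears in the image package name; the function name image_package_from_meta is hypothetical, and the layout is inferred from this log rather than taken from any packaging specification:

```python
def image_package_from_meta(meta_version, flavour):
    """Derive the versioned linux-image package name from a meta-package version.

    Assumes a five-field dotted version: kernel major.minor.patch, then the
    ABI number, then a meta-package upload number that is not part of the name.
    """
    parts = meta_version.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected five dot-separated fields, got {meta_version!r}")
    major, minor, patch, abi, _upload = parts
    return f"linux-image-{major}.{minor}.{patch}-{abi}-{flavour}"


print(image_package_from_meta("4.4.0.1047.49", "aws"))
print(image_package_from_meta("4.4.0.108.113", "generic"))
```

The last field of the meta version (.49) and the last field of the image package version (.56) are separate counters, which is why the two pages show different numbers for the same kernel.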
[16:55] also, to be explicit, both are referred to in USN-3522-1
[16:56] This is what I had about 30 minutes ago: linux-aws is already the newest version (4.4.0.1047.49)
[16:57] The Meltdown page just says "4.4.0-1047.56"
[16:57] so I recognize maybe that's a different package, but at first glance it looks like...I'm not up to date enough
[17:01] anyway, thanks for the response, and don't worry about me, I just wanted to get clarity. I still think it's confusing, but it sounds like you're all on top of it.
[17:03] well, apparently, linux-image-extra-4.13.0-21-generic=4.13.0-21.24~16.04.1 failed to boot.
[17:03] hmmm.
[17:05] could be r8169 which crashes it, then, too..
[17:14] thresh, would it be possible for you to open a launchpad bug? This will allow all this info to be kept together and searchable by others.
[17:14] jsalisbury, sure - I'm just trying to single out the exact problem which causes the issue.
[17:15] thresh, thanks! I can work with you on the problem. Can you post the bug number once you have it?
[17:16] yup, but I need to go home soon - will hopefully be able to work on it after a couple of hours.
[17:17] thresh, ok, thanks
[17:17] the fact that it's on hetzner does not help a lot to debug, too :-(
=== JanC is now known as Guest37549
=== JanC_ is now known as JanC
[17:54] klebers, hi, I noticed that the kernel metapackage for raspi2 depends on linux-firmware which seems not right and linux-firmware-raspi2 would be more fitting
[19:52] is linux-firmware a metapackage that points to linux-firmware-raspi2?
[19:53] nope