* enyc meows | 04:24 | |
apw | enyc, not heard anything against 5.4 this cycle so far | 08:37 |
---|---|---|
enyc | apw: indeed may be complete misnomer, turned out that user 5.4.0-47 still | 09:02 |
enyc | apw: ooi any idea where there is a 'gap' in kernel cycle announced on kernel.ubuntu.com ? | 09:02 |
apw | enyc, i think it is mentioned at least on the front page | 09:02 |
enyc | apw: sorry i mean WHY there is ... as announced on ... | 09:03 |
* enyc thinks... maybe working on ubuntu 20.10 release kernel | 09:03 | |
apw | enyc, the cycle would have been badly aligned against the 20.10 release which isn't a great plan, and there is some infrastructure work going on which makes it hard to realign that cycle; so it was decided to pause over that period | 09:04 |
enyc | apw: i see =) | 09:04 |
enyc | hrrm groovy 5.8 kernel | 09:05 |
apw | enyc, nothing earth shattering or concerning; we will be right back at it the cycle after | 09:05 |
enyc | =) | 09:06 |
ira | On 20.04 I have a kernel hanging crypto tasks during a ceph install using ceph-ansible, to set up encrypted volumes. | 17:10 |
ira | Where's the best place to send info, and whatnot. | 17:10 |
ira | I've also run the proposed kernel, shows the same issues. I have not tried mainline yet. | 17:11 |
=== Eighth_Doctor is now known as Conan_Kudo | ||
=== Conan_Kudo is now known as Eighth_Doctor | ||
sarnold | ira: which tasks? does strace show them hung? if so, where does the /proc/pid/stack for the processes say they are hung? | 19:35 |
ira | I'm getting kernel wait messages. I wiped the machines and I'm installing the OEM kernel + 20.04 to see if things change. | 19:36 |
ira | The whole setup is MAAS + ansible, so reconfiguring it isn't the end of the world. | 19:38 |
ira | And yeah they are hung, the install hangs. | 19:40 |
ira | It is using ceph-volume to create dm-crypt encrypted disks. | 19:41 |
sarnold | if it's moments after boot I wonder if you're hitting the /dev/random entropy stuff. can you shove in a random seed from the maas server to the cloud-init on those things? | 19:44 |
ira | It's way after boot. | 19:45 |
sarnold | days? or minutes? | 19:46 |
ira | minutes. | 19:46 |
sarnold | still plausibly randomness, a machine you never touch has limited ability to gather its own | 19:47 |
ira | Except it installs fine on 18.04. | 19:47 |
sarnold | oh interesting :) | 19:47 |
ira | Same containers, scripting etc... | 19:48 |
ira | Rebooting the boxes right now to pick up 5.6. I | 19:48 |
ira | 'll do the install there, and see how it goes... if it flies through it takes about 20-30m. | 19:49 |
ira | If it all fails, I'll reset it to stock 20.04 and be glad to do a debug session. | 19:50 |
ira | (Even if not, I'd like to get this fixed, so we can use 20.04 and not 18.04 here.) | 19:50 |
ira | @sarnold the 5.6 kernel does it. | 20:34 |
ira | So I got machines hanging for interrogation :) | 20:34 |
sarnold | sweet :) well, I mean, ugh :) but you know.. a reproducer is handy :) | 20:35 |
sarnold | can you spot which processes look hung? what does strace say they are doing? /proc/pid/stack? | 20:35 |
ira | What's the same thing as fpaste on fedora here? | 20:36 |
ira | (old red hat engineer here... sorry man :) ) | 20:36 |
ira | (ex-red hat) | 20:36 |
tomreyn | paste.ubuntu.com | 20:37 |
tomreyn | pastebinit as a CLI | 20:37 |
ira | https://paste.ubuntu.com/p/X5S2brG89F/ | 20:38 |
tomreyn | or just echo foobar | nc termbin.com 9999 | 20:38 |
ira | And 5.6 locked faster than 5.4 it looks like. | 20:39 |
ira | https://paste.ubuntu.com/p/zYVxKr8QPP/ | 20:40 |
ira | (the stack of that process.) | 20:40 |
ira | Thank you for helping :). | 20:41 |
ira | Anything else off the machines you'd like? | 20:56 |
ira | Looks like it doesn't happen every time, but it does happen, I see it made a few OSDs on that host successfully. | 21:12 |
ira | I've also tried the non-containerized built for ubuntu ceph octopus packages, and they don't work either, in the exact same way. In case there's a question there :) | 21:16 |
ira | https://github.com/NixOS/nixpkgs/issues/40282 | 21:25 |
ira | Reinstalling the machines, with blacklisting intel_qat, and also blocking install. | 21:49 |
sarnold | ira: oh wow looks like you've been very productive :D | 22:03 |
sarnold | ira: I think you're right, blacklisting qat looks promising | 22:03 |
ira | here's to hoping. | 22:04 |
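
sarnold's question above (which processes look hung, and what does /proc/pid/stack say) can be answered across a whole machine with a short script. The following is a minimal sketch, not something anyone in the channel ran: it assumes a Linux /proc filesystem, treats state `D` (uninterruptible sleep) as "looks hung", and the function names `parse_stat_state` and `find_uninterruptible` are made up for illustration. Reading `/proc/<pid>/stack` generally requires root.

```python
import os


def parse_stat_state(stat_text):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' instead of splitting on whitespace.
    left, _, rest = stat_text.rpartition(")")
    comm = left.partition("(")[2]
    state = rest.split()[0]
    return comm, state


def find_uninterruptible(proc_root="/proc"):
    """Yield (pid, comm, stack) for every task currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat_state(f.read())
        except (OSError, IndexError):
            continue  # process exited (or stat was empty) while scanning
        if state != "D":
            continue
        try:
            with open(os.path.join(proc_root, entry, "stack")) as f:
                stack = f.read()
        except OSError:
            stack = "(stack unreadable; reading it usually needs root)\n"
        yield int(entry), comm, stack


if __name__ == "__main__":
    for pid, comm, stack in find_uninterruptible():
        print(f"{pid} {comm}\n{stack}")
```

A task that is briefly in state D is normal, so running this a few times and looking for PIDs that stay in the list is more telling than a single snapshot.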
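
ira's last step above ("blacklisting intel_qat, and also blocking install") maps onto two modprobe.d directives: `blacklist` stops the module from being auto-loaded through its aliases, and an `install` override makes an explicit `modprobe` of it run a command that fails instead. The sketch below only renders those two lines; the helper name `render_modprobe_block` and the destination file mentioned in the comment are assumptions for illustration, and the log does not show the exact file ira used.

```python
def render_modprobe_block(module):
    """Return modprobe.d lines that blacklist a module and block explicit loads."""
    return (
        f"blacklist {module}\n"           # no alias-based auto-loading
        f"install {module} /bin/false\n"  # explicit modprobe runs /bin/false instead
    )


if __name__ == "__main__":
    # Hypothetical destination: /etc/modprobe.d/blacklist-qat.conf
    print(render_modprobe_block("intel_qat"), end="")
```

If the module is also packed into the initramfs, the initramfs probably needs regenerating before the change takes effect at early boot.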