/srv/irclogs.ubuntu.com/2021/11/12/#ubuntu-server.txt

=== genii is now known as genii-core
williamo	I have multiple u20.04 VMs that refuse to boot kernel versions after 5.4.0-80, what do?	03:36
lotuspsychje	could try !HWE williamo	03:36
williamo	!HWE	03:36
ubottu	The Ubuntu LTS enablement stacks provide newer kernel and X support for existing LTS releases, see https://wiki.ubuntu.com/Kernel/LTSEnablementStack	03:36
lotuspsychje	williamo: or, investigate why exactly the -80 kernels don't wanna boot	03:37
williamo	every kernel up to and including 5.4.0-80 boots and works. kernels after that seem to fail to read the disks.	03:37
lotuspsychje	!info linux-image-generic focal	03:38
ubottu	linux-image-generic (5.4.0.90.94, focal): Generic Linux kernel image. In component main, is optional. Built by linux-meta. Size 3 kB / 18 kB. (Only available for amd64, armhf, arm64, powerpc, ppc64el, s390x.)	03:38
williamo	running amd64	03:38
lotuspsychje	even the latest -90 one doesn't boot williamo ?	03:39
williamo	the furthest I've gotten is a busybox prompt after tossing a 'break' into the kernel boot line.	03:39
williamo	correct. 5.4.0-90 does not boot	03:39
lotuspsychje	that's indeed a weird one	03:39
lotuspsychje	williamo: can you still grab a dmesg from one of the failed boots?	03:40
williamo	not sure how I would. can't read/write/touch the disks without it dying, don't have networking as far as I can tell	03:41
williamo	best I can do is screenshot from ESX console	03:41
lotuspsychje	williamo: or if you can textboot F1 at boot and see how far you can go	03:42
lotuspsychje	maybe we get lucky and see at which point it gets stuck	03:42
williamo	I do have VMs that do boot up after -80, the only difference on the esx side is if there are a mixture of encrypted/nonencrypted disks	03:42
williamo	if all the disks are either encrypted or unencrypted in esx, it boots.	03:43
williamo	it looks like it gets stuck at the point where it is trying to talk to the disks and get them sorted	03:44
lotuspsychje	if you can log or screenshot something, that could be helpful for the volunteers to help trace what's the bottleneck	03:45
williamo	https://imgur.com/dUz2966	03:46
lotuspsychje	udev database hmm	03:48
williamo	https://imgur.com/crPuuVl It gets about this far, and it just sits there till about 240s for those timeouts to occur	03:50
williamo	and I think I remember hearing from someone else that 18.04 is having the same issue	03:51
lotuspsychje	not sure myself williamo, don't think i saw that udev error before, don't find any related bugs right away either	03:52
lotuspsychje	you think you could try a !hwe kernel for a test?	03:53
lotuspsychje	see if the 5.11 and higher series influence this	03:53
williamo	sure	03:53
williamo	I'll wait for the full timeout, but I'm getting the same issue	03:57
lotuspsychje	ouch	03:59
williamo	https://imgur.com/mOJaGQI https://imgur.com/A9hLpMh	04:04
lotuspsychje	williamo: maybe we should file a new !bug on this, and attach all your logs you shared to it	04:05
lotuspsychje	williamo: can you reproduce this on 1 machine only or several?	04:06
williamo	I can reproduce this on other VMs that have a similar configuration with 1 of 3 disks esx encrypted. as far as the vm guest is concerned, this should be 100% transparent and shouldn't matter, but for some reason it does.	04:07
williamo	or 2/3 disks encrypted	04:07
lotuspsychje	try ubuntu-bug linux to start your bug file, then add your story & logs to it and i'll ping some volunteers about it later, see if they find something	04:09
williamo	why can't we just encrypt/unencrypt all? we have stupid amounts of data that has to be encrypted, and even more dumb amounts of data that is too large to be encrypted.	04:09
williamo	and now to type this uuid url in by hand.	04:11
williamo	hooray, first bug report. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1950712	04:23
ubottu	Launchpad bug 1950712 in linux (Ubuntu) "ESX u20.04 VM refuses to boot after upgrading to kernel 5.4.0-81 and up" [Undecided, New]	04:23
williamo	now to go bug that other person for info about 18.04 that might be doing the same stuff	04:23
williamo	in 8 hours	04:24
=== ^wuseman is now known as Guest6898
lotuspsychje	great work williamo !	04:32
lotuspsychje	hang around a bit and we can see if more volunteers are awake, if we can trace some more	04:34
lotuspsychje	tomreyn: can you take a look for williamo see if you got any ideas to try? ^	04:56
tomreyn	lotuspsychje: i can try ;)	05:03
tomreyn	williamo: since this seems to be esx related, have you verified that you're running the latest / a supported, fully patched esx server version?	05:04
tomreyn	williamo: have you tried booting without those extra kernel parameters? why are you using them, are they actually needed?	05:04
tomreyn	there's a kernel oops on your screenshot, but (a) it may not be the first, and thus a side effect of a former one, (b) we can't see the actually failing module there, since it already scrolled off the screen. you may want to look into some kernel debugging options to get a better idea of what's failing there.	05:08
tomreyn	https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks	05:08
tomreyn	the easiest will likely be to set up a serial console	05:10
tomreyn	searching the web for the err message printed on your screen, "device [..] not initialized in udev database even after" points to a bug in lvm2, but those are from around 2019, and this bug has since been fixed in debian, and probably ubuntu, too: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=925247 ; additionally, those are first triggered by grub-mkconfig during OS install, not during boot.	05:20
ubottu	Debian bug 925247 in fai-client "setting up lvm2 hangs for a long time" [Normal, Open]	05:20
tomreyn	in the 5.4.0-80 boot dmesg, you have multiple "BAR 13: no space for [io size 0x1000]" errors. that's I/O port space allocation failing for the VMware PCI Express Root Ports.	05:38
=== PC_ is now known as deksar
=== calcmandan_ is now known as calcmandan
jamespage	icey, coreycb: doko fixed up greenlet/eventlet sufficiently for packages to build and unit test OK so I'm dealing with the openstack packages that need to drop depends on python3-crypto	09:33
icey	fantastic	09:35
=== cpaelzer_ is now known as cpaelzer
coreycb	jamespage: ah thanks for doing that! it was on my list and hadn't gotten to it.	13:29
icey	jamespage: see that ceph risc failure? best fail reason ever: "/usr/bin/ar: unable to copy file '../../lib/librgw_a.a'; reason: Success"	13:53
jamespage	coreycb: np	13:54
jamespage	icey: gotta live a risc failure after 23 hours of building...	13:54
jamespage	live/love rather	13:54
icey	jamespage: and only 85% done :-P	13:55
xnox	/usr/bin/ar: unable to copy file '../../lib/librgw_a.a'; reason: Success => i feel like retrying ceph riscv64 build	14:59
ginggs	nothing succeeds like success	15:01
=== genii-core is now known as genii
williamo	tomreyn, only 1/3 disks has LVM, the other 2 are also showing issues with accessing data	15:42
williamo	frustrating, esx will not output serial console to text files for encrypted VMs	16:31
tomreyn	williamo: did you verify that esx is up to date, though?	22:32
=== genii is now known as genii-core
