/srv/irclogs.ubuntu.com/2020/02/17/#ubuntu-server.txt

aloini	I am seeing some problems with an Ubuntu Server 18.04.3 instance where, on bootup, it cannot start the network interface, causing the entire machine to not start. If I check /etc/network/interfaces, I just see a blank file and am not sure where it is failing.	02:11
aloini	Actually, I take that back, it's Ubuntu 18.04.4 (I didn't know there was a more recent upgrade that occurred)	02:34
aloini	This is what I eventually see if I wait long enough: https://imgur.com/HW4ozvM	02:49
tomreyn	aloini: ubuntu server 18.04 uses netplan with the systemd-networkd renderer for network configuration by default, /etc/network/interfaces would be legacy.	03:23
tomreyn	do you read release notes?	03:23
tomreyn	!releasenotes	03:23
ubottu	For release notes of a given Ubuntu release, please refer to the 'Docs' column on the 'List of releases' table at https://wiki.ubuntu.com/Releases	03:23
aloini	I do yes, my issue is that this occurred after a reboot of an already functioning network configuration and the server was working. The server is hosted in esxi, and the other servers I have have no problems on the host.	03:24
tomreyn	the screenshot you posted does not explain what failed about bringing up systemd-networkd, you'll need to refer to the log files as indicated.	03:25
aloini	Which log files? I attempted to look at /var/log/syslog, lastlog, kernel, and others but couldn't find any relevant info in any log file.	03:25
tomreyn	quoting your screen shot: "See systemctl status systemd-networkd.service for details."	03:26
aloini	I can't do that if the system does not boot or drop to a shell though.	03:26
aloini	If I reboot into recovery mode, there is no relevant information there.	03:26
tomreyn	so it does not continue to boot after 1min30s are reached?	03:27
aloini	No, it just continually cycles through this process of trying to start the network interface for an unlimited amount of time.	03:27
tomreyn	i see. in this case you may want to boot to recovery	03:27
aloini	I rebooted the server this morning around noon, and then came back to it at 5 and it was still cycling through.	03:28
tomreyn	!recovery	03:28
ubottu	If your system fails to boot normally, it may be useful to boot it into recovery mode. For instructions, see https://wiki.ubuntu.com/RecoveryMode	03:28
tomreyn	other than syslog there's also journalctl for accessing log files.	03:28
tomreyn	well, not log files, but logs	03:29
tomreyn	to me, cloud-init is the culprit there	03:30
aloini	Where would you start from here then? If I boot into recovery and tell it to drop to root shell it does that successfully. But, I am honestly, not familiar with cloud-init so I am not sure what I need to do here to resolve cloud-init or another service to have it start working again.	03:31
aloini	But one second, let me see if there is a way to get something from journalctl.	03:32
tomreyn	journalctl -b -1 -e would let you inspect the end (-e) of what was logged during the previous (-b -1 ) boot	03:33
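[Editor's sketch] The journalctl invocation above can be narrowed further. The helpers below are a minimal sketch, assuming the journal is stored persistently so that records from the previous boot still exist; the function names are hypothetical and only wrap standard journalctl options (-b, -p, -u, --list-boots, --no-pager).

```shell
#!/bin/sh
# Error-priority (and worse) messages from the previous boot.
prev_boot_errors() {
    journalctl -b -1 -p err --no-pager
}

# Everything one unit logged during the previous boot,
# e.g. prev_boot_unit_log systemd-networkd.service
prev_boot_unit_log() {
    journalctl -b -1 -u "$1" --no-pager
}

# Boots the journal has records for; offset 0 is the current boot, -1 the one before.
list_boots() {
    journalctl --list-boots
}
```

From a recovery root shell, prev_boot_unit_log systemd-networkd.service would be the closest equivalent to the "systemctl status" hint shown on the boot screen.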
aloini	I see a call trace in the log right after starting, but nothing more than that. I am unable to copy and paste things, so photos will be the only way to achieve this... one second.	03:38
aloini	https://imgur.com/lzIrhHo	03:38
tomreyn	a (virtual) serial console would enable you to copy and paste	03:39
aloini	For what it is also worth, in recovery, I can ping out to IP addresses, but am unable to use DNS (IE: can't ping google.com but can ping google's dns servers, 8.8.8.8)	03:40
tomreyn	since i know practically nothing about this system, guessing on the lower end of a kernel call trace is not going to get us very far. this trace refers to "fuse", which may suggest your system makes use of a fuse file system, where the driver fails somehow.	03:41
tomreyn	since you have networking, you could post the full log to termbin, if that's acceptable in terms of company policies / regulations, to share those	03:43
aloini	Yeah it would be, this is a personal system, not a company system.	03:43
tomreyn	journalctl -b -1 | nc 5.39.93.71 9999	03:43
tomreyn	post the url it returns	03:44
aloini	https://termbin.com/f6ad	03:45
tomreyn	"pci 0000:00:15.3: BAR 13: no space for [io size 0x1000]" and "pci 0000:00:15.3: BAR 13: failed to assign [io size 0x1000]" is the first problem, try a web search for this	03:47
tomreyn	so this does not seem to really hint on why the ens160 network interface fails to get configured. maybe you can share the network configuration?	03:51
tomreyn	maybe using: cat /etc/netplan/* | nc 5.39.93.71 9999	03:53
tomreyn	the systemd-timesyncd task gets hung somehow. this could be due to problems with the hwclock provided by the (vmware) virtualization	03:54
aloini	https://termbin.com/fk0l	03:54
tomreyn	you were not running the latest kernel package at the time, though	03:54
tomreyn	does the same still happen on the latest kernel?	03:54
aloini	It's mainly a DHCP configuration, and the DHCP server is up and running as far as I can see (other clients are receiving addresses without issue)	03:55
tomreyn	dhcp would happen after the network interface is brought up, so it's indeed not a dhcp issue	03:56
aloini	Not sure tomreyn, I can't run apt update due to the lack of dns, it seems that the file that /etc/resolv.conf symlinks to (../run/systemd/resolve/stub-resolv.conf) is missing in recovery	03:56
tomreyn	you can either mount a tmpfs at /run and create the expected directories and the file there, with some public resolvers or your preferred ones, or you can delete the symlink at /etc/resolv.conf and place the file there, then delete it later on.	03:59
tomreyn	(or just move it aside)	04:00
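[Editor's sketch] The "move it aside" variant of this workaround could look roughly like the following. This is a sketch under stated assumptions, not what aloini actually ran: the function names, the ROOT argument (there so the steps can be tried on a scratch directory first) and the ".symlink-backup" suffix are all hypothetical, and 8.8.8.8 is used as the default only because it was reachable from recovery mode earlier in the conversation.

```shell
#!/bin/sh
# Replace a dangling resolv.conf symlink with a temporary static file.
# $1 = root prefix (hypothetical; "" means the real system, run as root)
# $2 = resolver address (defaults to 8.8.8.8; substitute your preferred one)
use_static_resolv_conf() {
    root="$1"
    nameserver="${2:-8.8.8.8}"
    conf="$root/etc/resolv.conf"
    # Move the symlink aside rather than deleting it, so it can be restored later.
    if [ -L "$conf" ]; then
        mv "$conf" "$conf.symlink-backup"
    fi
    printf 'nameserver %s\n' "$nameserver" > "$conf"
}

# Undo the change once the system boots normally again.
restore_resolv_conf() {
    root="$1"
    conf="$root/etc/resolv.conf"
    if [ -L "$conf.symlink-backup" ]; then
        rm -f "$conf"
        mv "$conf.symlink-backup" "$conf"
    fi
}
```

On the real system this would be called as use_static_resolv_conf "" from the recovery root shell, followed by restore_resolv_conf "" after the underlying problem is fixed.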
aloini	So, I don't see any potential upgrades for the kernel, https://termbin.com/b3g7	04:04
tomreyn	good. all i know is that when it was creating the logs you posted at https://termbin.com/f6ad it was running 5.3.0-26-generic #28	04:08
tomreyn	but 5.3.0-28-generic #30 is available now	04:08
tomreyn	i had you post the log from the last but one boot there, though	04:10
tomreyn	i suggest you start by looking for a vmware upgrade first of all, since this can be a virtualization issue	04:10
aloini	Ah, yeah, I booted into .28 to verify.	04:10
aloini	Ah, yeah, I booted into .26 to verify if a previous kernel would fix it. *	04:10
tomreyn	the log we were looking at was produced between Wed 2019-11-06 03:27:04 UTC (when it booted) and Mon 2020-02-17 03:34:53 UTC (when the log ends, due to reboot or shutdown), though.	04:11
tomreyn	the log is probably also not posted completely, but cut off towards the end (or the system froze / power cycled there)	04:12
tomreyn	it may be useful to review a log of a current kernel boot after you've worked out the vmware side of things	04:13
aloini	So if I boot to recovery, remove the resolv.conf file, run init 5, I can then boot the system perfectly fine.	04:13
aloini	I am sure there are things that are not necessarily working correctly however.	04:14
tomreyn	so no more pci errors?	04:14
aloini	Does seem like fuse might be causing it.	04:14
tomreyn	and does systemd-timesyncd work then?	04:14
aloini	What is the latest linux 4 kernel?	04:14
tomreyn	upstream? kernel.org would tell.	04:15
aloini	user@plex:~$ which systemd-timesyncd	04:16
aloini	user@plex:~$ command -v systemd-timesyncd	04:16
aloini	There is no output of that command	04:16
tomreyn	it's a systemd service	04:16
tomreyn	timedatectl can query it	04:17
aloini	https://termbin.com/3gto	04:18
aloini	If I run timedatectl nothing happens however	04:18
tomreyn	i'll be happy to continue looking into this once you have convincingly stated that you've reviewed available vmware updates	04:19
tomreyn	also discuss how you use fuse file systems	04:21
tomreyn	and show a journalctl -b for a current kernel boot	04:22
tomreyn	in this order	04:22
aloini	Updating to https://docs.vmware.com/en/VMware-vSphere/6.7/rn/esxi670-201912001.html right now, but, using fuse to mount a Google Drive File System mount via rclone and cache. Once the esxi upgrade is complete, I will get back to you on the other stuff.	04:24
aloini	So this is the current boot log if I do the following: recovery, init 5: https://termbin.com/486f	04:40
aloini	If I just have the system boot up, it still goes through the continuous loop of starting networking services	04:40
aloini	I also do have the latest version of vmware tools installed into the guest OS as well	04:51
aloini	ii open-vm-tools 2:11.0.1-2ubuntu0.18.04.2 amd64 Open VMware Tools for virtual machines hosted on VMware (CLI)	04:51
tomreyn	unfortunately the previously problematic PCI 15ad:07a0 vmware device triggering the "no space for [io size 0x1000]" messages is still problematic. maybe a newer version of vmware's guest additions (provided by them/the virtualization host) may help.	04:53
tomreyn	how do you mount the fuse file system in fstab?	04:54
aloini	Ah, thanks, you made me remember a change I made several weeks ago to a systemd file.	04:59
aloini	Fixing that actually caused the system to boot again properly.	04:59
aloini	I am mounting fuse with a systemd script that waits on the network to mount due to Google Drive requiring a valid network connection.	05:01
tomreyn	aloini: don't keep me dumb - which change did you make and revert now?	05:02
aloini	I basically left off a \ for the script.	05:03
aloini	One second.	05:03
aloini	https://paste.ubuntu.com/p/CFnzvfGQry/	05:04
aloini	Line 19 was missing the \	05:04
aloini	Adding that resolved the problem	05:04
tomreyn	i see, so just a syntax error in a systemd service file, i'd hoped this to be reported by systemd when you enabled the service.	05:06
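[Editor's sketch] The kind of mistake described here, a multi-line ExecStart= with one missing trailing backslash, can be illustrated with a made-up unit. Everything below is hypothetical and is NOT aloini's pasted file: the unit name, the "gdrive:" remote, the mount point and the rclone options are placeholders. The script only writes the example into a temporary directory so it can be inspected safely.

```shell
#!/bin/sh
# Write an illustrative unit file into a scratch directory (nothing is installed).
unit_dir="$(mktemp -d)"
unit="$unit_dir/gdrive-mount.service"

cat > "$unit" <<'EOF'
[Unit]
Description=Example rclone mount (illustrative only)
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
# Every line of a multi-line ExecStart except the last must end in a backslash.
ExecStart=/usr/bin/rclone mount \
    gdrive: /mnt/gdrive \
    --allow-other
ExecStop=/bin/fusermount -u /mnt/gdrive

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit"
```

If one of the two backslashes were dropped, systemd would stop reading the command at that line and treat the remainder as separate, malformed lines. Running systemd-analyze verify on the file is one way that may surface such problems before the next reboot.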
tomreyn	you can and should use the _netdev mount option in /etc/fstab for network devices	05:07
lordievader	Good morning	07:05
=== Wryhder is now known as Lucas_Gray
charolastra	in the process of an LTS -> LTS upgrade it stopped at the question of a modified file and the options to keep it, view difference, etc. but then didn't take any input anymore. dpkg process is still running and i see a process called 'xenial'. how to best debug the current situation? just kill dpkg?	16:19
blenderartist18	I'm trying to do a headless install of Ubuntu 19.10 through serial console using these instructions: https://askubuntu.com/questions/250869/how-can-i-install-ubuntu-on-a-device-without-a-screen-nor-a-keyboard/260469#260469	16:19
blenderartist18	But these files don't exist: syslinux.cfg or text.cfg	16:20
blenderartist18	Any ideas how to get this to work for Ubuntu 19.10?	16:20
=== teward_ is now known as teward
