[07:25] schopin: ack
[15:40] Hi. I'm having some issues with a Hetzner server: after an apt update && apt upgrade && reboot, netplan doesn't set the default gateway anymore -> I can only access it via a KVM. Any ideas what I can do or how to debug this? I compared the config to another server and I don't see any differences. Actually I never touched the default netplan config
[15:40] from Hetzner until now ;-)
[15:42] This command brought the machine back online (the "onlink" was required, otherwise it gave me an error): sudo ip route add default via 1.2.3.4 (gateway IP) dev enp7s0 onlink
[15:43] I started systemd-networkd in debug mode according to the docs here and the machine was disconnected and the default gateway is gone again: https://wiki.ubuntu.com/DebuggingSystemd
[15:44] Now I'm connected via the KVM and have no idea how to fix that and what causes that behavior...
[15:44] can you show your config from /etc/netplan/*.yaml?
[15:45] slyon: sure... give me a minute
[15:49] slyon: here: https://pastebin.com/HmnUcHVL
[15:49] It also matches the Hetzner docs here: https://docs.hetzner.com/de/robot/dedicated-server/network/net-config-debian-ubuntu/
[15:52] Thank you, could you please also show the output of `networkctl status enp7s0` just to see if that is passed through correctly
[15:52] sure, one sec
[15:52] and maybe "route" and "route -6"
[15:53] err. "ip route" and "ip -6 route" that is
[15:54] route is empty, route -6 says "/proc/net/ipv6_route: No such file or directory \ INET6 (IPv6) not configured in this system"
[15:55] networkctl isn't so easy, I can't copy & paste anything and it includes public IPs... but I see "State: routable (failed)"
[15:57] the route output is interesting. "route" being empty should not be the case, as there is a default route defined in netplan, and also some ipv6 addresses and a gateway, so it should exist
[15:57] ip route and ip -6 route are both empty
[15:58] networkctl: failed, is interesting, too.
in this case you should check your journalctl -u systemd-networkd (you put that already into debug mode which is good)
[15:58] Somehow it fails to set that route... I did it manually last time and the machine was online immediately again until I started systemd-networkd in debug mode
[15:59] It's Ubuntu 20.04 HWE (Hetzner installimage) btw and everything patched as of yesterday.
[15:59] lemme try journalctl...
[16:01] Btw. when I run /lib/systemd/systemd-networkd it shows "Enumeration completed \ enp7s0: Could not set address: Operation not supported \ enp7s0: Failed"
[16:06] that's probably the problem.. but I'm not sure what "Operation not supported" means. are there any more hints in journalctl?
[16:06] you didn't put any custom networkd configuration or systemd override, right?
[16:07] In journalctl I see that the error message first appeared yesterday after the reboot... So it seems that something is broken
[16:08] Nope, I didn't touch that file until today when I started to investigate it, and even then I didn't modify anything. I was running netplan generate and apply but nothing (to my knowledge) should have modified anything here
[16:08] And WireGuard was set up, which I disabled, just in case
[16:09] I saw that netplan was updated at the end of March... I would almost say that it broke something -> I have a few other servers but I don't want to reboot them now until this is solved because they are in production ;-)
[16:12] to me it looks like netplan is doing its job. but networkd is somehow failing to communicate with the kernel... so it might be a kernel or systemd-networkd issue :-/ did you get updates for those packages as well?
[16:12] could you try booting an older kernel image?
[16:13] Here is a screenshot of journalctl -> the upper part before the empty line is the last successful one: https://imgur.com/BfCgsOL
[16:14] could you maybe paste a (redacted) version of /run/systemd/network/10-netplan-enp7s0.network ?
[16:14] I've updated everything available yesterday, ~24h ago. I also checked for new updates today after setting the route manually -> nothing new
[16:14] sure, one sec
[16:14] I could try to apply that on a 20.04 machine and see if it produces the same issue..
[16:17] https://imgur.com/7UH7TW9
[16:18] The IP addresses match the ones from /etc/netplan/01... and I also couldn't see any difference on another server (besides different IP addresses, ofc)
[16:19] Guest9223 your ip address has netmask /32 so it can't reach your gateway, of course
[16:19] if you actually want the /32 netmask for some reason then you need to add a specific route to the gateway before actually adding the default route
[16:19] Regarding kernel updates: I just used apt update && apt upgrade, nothing was installed manually, no modules etc.
[16:21] but you almost certainly don't actually want a /32 subnet
[16:22] ddstreet: well it is mentioned and explained like this in the Hetzner docs: https://docs.hetzner.com/de/robot/dedicated-server/network/net-config-debian-ubuntu/ + another server has the same netmask and that one is working fine. So maybe something was changed here with a recent update and the Hetzner docs are outdated?
[16:22] ok it's fine to do that - i.e.
the 'point to point' stuff they talk about - but you do have to define the route to the gateway
[16:22] What I mean: they set/recommend it like that, I think to protect from (accidental or malicious) IP changes and duplicate IP addresses with other customers - that's how I understand their docs
[16:23] that's what the 'on-link' does, it says 'ok kernel i know you have no route to this ip, but don't worry it really is on your link local so just pretend you can route to it'
[16:24] if systemd isn't properly handling the setting there might be a systemd bug
[16:24] Yeah that explains why it errored out without the "onlink" parameter, and it was working like that for a few months -> I think since January 24th 2022 or so
[16:24] ah
[16:24] Guest9223 in your .network file the param is misspelled
[16:24] it's GatewayOnLink not GatewayOnlink
[16:24] change the L capitalization and it should work ok
[16:24] not sure if netplan is using the wrong cap?
[16:25] Lemme check...
[16:26] Is it enough to edit the .network file? Because I get the same error and on another (not yet rebooted) server it is also lowercase
[16:26] no, it looks like there's a bug in netplan where it's using the wrong spelling
[16:27] So something is case sensitive and not recognizing the key you mean?
[16:28] until that's fixed, i guess you could add a drop-in file, e.g. if the netplan-generated file is named '10-netplan-enp7s0.network' then create a file '/etc/systemd/network/10-netplan-enp7s0.network.d/override.conf' and make its content:
[16:28] [Route]
[16:28] GatewayOnLink=true
[16:28] (just those lines)
[16:28] then i think if you reboot it should work
[16:28] ddstreet: netplan is rendering it as "GatewayOnlink=true".. but that hasn't changed since 2018... and it was actually changed from GatewayOnLink -> GatewayOnlink in that commit d419c7b8
[16:28] ah ok that's interesting then
[16:29] Got disconnected... did my GitHub message make it?
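Note on the /32 point raised at [16:19] and the 'on-link' explanation at [16:23]: the following is a minimal Python sketch, not anything that was run in the channel, of why a /32 address has no route to its gateway unless the route is marked on-link. The function name gateway_needs_onlink is hypothetical, and the addresses are documentation placeholders (192.0.2.0/24), not the server's real ones.

```python
import ipaddress


def gateway_needs_onlink(address_with_prefix: str, gateway: str) -> bool:
    """Return True if the gateway lies outside the interface's own subnet.

    In that case the kernel has no connected route covering the gateway,
    so a default route via it is only accepted with "onlink"
    (or after adding an explicit host route to the gateway first).
    """
    iface = ipaddress.ip_interface(address_with_prefix)
    gw = ipaddress.ip_address(gateway)
    if gw.version != iface.version:
        raise ValueError("address and gateway must be the same IP version")
    return gw not in iface.network


# Placeholder addresses, assumed for illustration only.
print(gateway_needs_onlink("192.0.2.10/32", "192.0.2.1"))  # True
print(gateway_needs_onlink("192.0.2.10/24", "192.0.2.1"))  # False
```

With a /32 the interface's network contains only the host itself, so any gateway is outside it; with a /24 the same gateway is directly reachable and no on-link flag is needed.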
[16:31] Guest92: no, i don't think so
[16:32] https://github.com/canonical/netplan/search?q=GatewayOnLink shows GatewayOnlink only, lowercase
[16:32] netplan is rendering it as "GatewayOnlink=true".. but that didn't change since 2018... and it was actually changed from GatewayOnLink -> GatewayOnlink in that commit d419c7b8
[16:32] Guest92 i'm wrong, as slyon said either spelling is ok
[16:33] so ignore me, might be the kernel as he said :)
[16:33] Seems that systemd-networkd is accepting both: https://github.com/systemd/systemd/search?q=GatewayOnLink
[16:33] I got to drop now. I think this problem is most probably related to networkd or kernel (as the error message told us: "Enumeration completed \ enp7s0: Could not set address: Operation not supported \ enp7s0: Failed")
[16:33] That one is correct: https://github.com/systemd/systemd/blob/2afb2f4a9d6a497dfbe1983fbe1bac297a8dc52b/src/network/networkd-route.c#L2348
[16:34] Hm... I'm not a kernel dev... any idea what to do now? ;-)
[16:34] I can set the route manually via ip route add... but that lasts only until the next reboot
[16:39] slyon fyi systemd is only keeping the 'Onlink' spelling for backwards compat, the correct usage is 'GatewayOnLink' per upstream commit 9cb8c5593443d24c19e40bfd4fc06d672f8c554c
[16:41] Ok, so a little bug discovered (it should be GatewayOnLink in the .network file) but that probably isn't the issue here? :-)
[16:42] right it's definitely not the issue for you
[16:44] can you share the output of 'ip a' after it fails? it's unable to set your static ip, which is strange
[16:51] and check your kernel logs, e.g. 'journalctl -b -k'
[16:51] One sec... I enabled IPv6 again in /etc/default/ufw and enabled more logging for systemd-networkd
[16:51] Just rebooting and seeing what will happen now...
[16:54] https://imgur.com/lhVCC4R
[16:54] Still dead after reboot...
[17:00] but you still get the 'Could not set address' error?
[17:00] the address looks set
[17:02] I see some ACPI Errors in the journalctl -b -k output and the last line is "enp7s0... Link is Up, Full Duplex" etc.
[17:02] no errors from networkd?
[17:03] Nope... but still the same error in journalctl -u systemd-networkd
[17:03] But much more debug output now... shall I upload that? I can set the route manually to get the machine online
[17:03] sure
[17:10] There we go: https://pastebin.com/8KUYndZw
[17:10] I just replaced the first 3 parts of the IP, the last part is the original
[17:18] Guest92 do you have ipv6 disabled?
[17:18] like, in the kernel cmdline?
[17:19] AFAIR I disabled it only via /etc/default/ufw but not sure... how to check that?
[17:19] well first check /proc/cmdline
[17:20] to make sure you didn't disable ipv6 globally
[17:20] Yeah, looks disabled: BOOT_IMAGE=/vmlinuz-5.13.0-40-generic root=/dev/mapper/vg0-root ro ipv6.disable=1 nomodeset consoleblank=0
[17:21] Ah I remember... seems I disabled it via grub: GRUB_CMDLINE_LINUX="ipv6.disable=1"
[17:21] ok...and you're telling networkd to set up an ipv6 address?
[17:22] Ok, lemme check... I'll enable it and reboot...
[17:26] Holy moly! That was the problem! ;-)
[17:27] yeah, hard for networkd to add the ipv6 addr you asked it to add, when ipv6 is disabled :)
[17:27] Besides that mistake on my part... shouldn't it set up IPv4 at least or somehow detect that? I found another issue here with a similar case: https://github.com/systemd/systemd/issues/12656
[17:28] And... do you guys accept coffee or donations somehow? ;-)
[17:28] networkd doesn't do 'partially configured', if part of the interface setup fails it's considered 'failed'
[17:29] so since setting (one of) the addresses failed, networkd didn't bother to continue with the next step, adding the route for the interface
[17:34] I see... BIG BIG BIG thank you for helping me figure this out! :-)
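Note on the root cause: the conflict found at [17:20] to [17:27] (ipv6.disable=1 on the kernel command line while the netplan config still lists an IPv6 address) could be caught before rebooting with a small check. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the address list is a placeholder (documentation ranges) rather than the real config, and the command line is the one quoted at [17:20]. On a live host the string would come from /proc/cmdline.

```python
import ipaddress


def ipv6_disabled(cmdline: str) -> bool:
    """True if the kernel command line disables IPv6 globally."""
    return "ipv6.disable=1" in cmdline.split()


def blocked_ipv6_addresses(cmdline: str, addresses: list[str]) -> list[str]:
    """Return the configured IPv6 addresses that cannot be set.

    As explained at [17:28], networkd marks the whole interface as failed
    when one address cannot be set, so the default route is never added.
    """
    if not ipv6_disabled(cmdline):
        return []
    return [a for a in addresses if ipaddress.ip_interface(a).version == 6]


# Command line as quoted in the log at [17:20].
cmdline = ("BOOT_IMAGE=/vmlinuz-5.13.0-40-generic root=/dev/mapper/vg0-root "
           "ro ipv6.disable=1 nomodeset consoleblank=0")
# Placeholder addresses standing in for the netplan "addresses:" list.
configured = ["192.0.2.10/32", "2001:db8::2/64"]

print(blocked_ipv6_addresses(cmdline, configured))  # ['2001:db8::2/64']
```

A non-empty result means either the IPv6 addresses should be removed from the netplan config or ipv6.disable=1 should be dropped from GRUB_CMDLINE_LINUX; the second option is what fixed the server in this log.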