StyXman | schopin: ack | 07:25 |
---|---|---|
Guest9223 | Hi. I'm having some issues with a Hetzner server: after the apt update && apt upgrade && reboot netplan doesn't set the default gateway anymore -> I can only access it via a KVM. Any ideas what I can do or how to debug this? I compared the config to another server and I don't see any differences. Actually I never touched the default netplan config | 15:40 |
Guest9223 | from Hetzner until now ;-) | 15:40 |
Guest9223 | That command brought the machine back online (the "onlink" was required, otherwise it gave me an error): sudo ip route add default via 1.2.3.4 (gateway IP) dev enp7s0 onlink | 15:42 |
Guest9223 | I started systemd-networkd in debug mode according to the docs here and the machine was disconnected and the default gateway is gone again: https://wiki.ubuntu.com/DebuggingSystemd | 15:43 |
Guest9223 | Now I'm connected via the KVM and have no idea how to fix that and what causes that behavior... | 15:44 |
slyon | can you show your config from /etc/netplan/*.yaml? | 15:44 |
Guest9223 | slyon: sure... give me a minute | 15:45 |
Guest9223 | slyon: here: https://pastebin.com/HmnUcHVL | 15:49 |
Guest9223 | It also matches the Hetzner docs here: https://docs.hetzner.com/de/robot/dedicated-server/network/net-config-debian-ubuntu/ | 15:49 |
slyon | Thank you, could you please also show the output of `networkctl status enp7s0` just to see if that is passed through correctly | 15:52 |
Guest9223 | sure, one sec | 15:52 |
slyon | and maybe "route" and "route -6" | 15:52 |
slyon | err. "ip route" and "ip -6 route" that is | 15:53 |
Guest9223 | route is empty, route -6 says "/proc/net/ipv6_route: No such file or directory \ INET6 (IPv6) not configured in this system" | 15:54 |
Guest9223 | networkctl isn't so easy, I can't copy & paste anything and it includes public IPs... but I see "State: routable (failed)" | 15:55 |
slyon | the route output is interesting. "route" being empty should not be the case, as there is a default route defined in netplan, and also some ipv6 addresses and a gateway, so it should exist | 15:57 |
Guest9223 | ip route and ip -6 route are both empty | 15:57 |
slyon | networkctl: failed, is interesting, too. in this case you should check your journalctl -u systemd-networkd (you put that already into debug mode which is good) | 15:58 |
Guest9223 | Somehow it fails to set that route... I did it manually last time and the machine was online again immediately, until I started systemd-networkd in debug mode | 15:58 |
Guest9223 | It's Ubuntu 20.04 HWE (Hetzner installimage) btw and everything patched as of yesterday. | 15:59 |
Guest9223 | lemme try journalctl... | 15:59 |
Guest9223 | Btw. when I run lib/systemd/systemd-networkd it shows "Enumeration completed \ enp7s0: Could not set address: Operation not supported \ enp7s0: Failed" | 16:01 |
slyon | that's probably the problem.. but I'm not sure what "Operation not supported" means. are there any more hints in journalctl? | 16:06 |
slyon | you didn't put any custom networkd configuration or systemd override, right? | 16:06 |
Guest9223 | In journalctl I see that error message the first time appeared yesterday after the reboot... So seems that something is broken | 16:07 |
Guest9223 | Nope, I didn't touch that file until today when I started to investigate it but still I didn't modify anything. I was running netplan generate and apply but nothing (to my knowledge) should have modified anything here | 16:08 |
Guest9223 | And WireGuard was set up which I disabled, just in case | 16:08 |
Guest9223 | I saw that netplan was updated end of March... I would almost say that it broke something -> I have a few other servers but I don't want to reboot them now until this is solved because they are in production ;-) | 16:09 |
slyon | to me it looks like netplan is doing its job. but networkd is somehow failing to communicate with the kernel... so it might be a kernel or systemd-networkd issue :-/ did you install updates for those packages as well? | 16:12 |
slyon | could you try booting an older kernel image? | 16:12 |
Guest9223 | Here is a screenshot of journalctl -> the upper part before the empty line is the last successful one: https://imgur.com/BfCgsOL | 16:13 |
slyon | could you maybe paste a (redacted) version of /run/systemd/network/10-netplan-enp7s0.network ? | 16:14 |
Guest9223 | I've updated everything available yesterday, ~24h ago. I also checked for new updates today after setting the route manually -> nothing new | 16:14 |
Guest9223 | sure, one sec | 16:14 |
slyon | I could try to apply that on a 20.04 machine and see if it produces the same issue.. | 16:14 |
Guest9223 | https://imgur.com/7UH7TW9 | 16:17 |
Guest9223 | The IP addresses match the ones from /etc/netplan/01... and I also couldn't see any difference on another server (besides different IP addresses, ofc) | 16:18 |
ddstreet | Guest9223 your ip address has netmask /32 so it can't reach your gateway, of course | 16:19 |
ddstreet | if you actually want the /32 netmask for some reason then you need to add a specific route to the gateway before actually adding the default route | 16:19 |
Guest9223 | Regarding kernel updates: I just used apt update && apt upgrade, nothing was installed manually, no modules etc. | 16:19 |
ddstreet | but you almost certainly don't actually want a /32 subnet | 16:21 |
Guest9223 | ddstreet: well it is mentioned and explained like this in the Hetzner docs: https://docs.hetzner.com/de/robot/dedicated-server/network/net-config-debian-ubuntu/ + another server has the same netmask and that one is working fine. So maybe something was changed here with a recent update and the Hetzner docs are outdated? | 16:22 |
ddstreet | ok it's fine to do that - i.e. the 'point to point' stuff they talk about - but you do have to define the route to the gateway | 16:22 |
Guest9223 | What I mean: they set/recommend it like that, I think to protect from (accidental or malicious) IP changes and duplicate IP addresses with other customers - that's how I understand their docs | 16:22 |
ddstreet | that's what the 'on-link' does, it says 'ok kernel i know you have no route to this ip, but don't worry it really is on your link local so just pretend you can route to it' | 16:23 |
ddstreet | if systemd isn't properly handling the setting there might be a systemd bug | 16:24 |
Guest9223 | Yeah that explains why it errored out without the "onlink" parameter and it was working for a few months like that -> I think January 24th 2022 or so | 16:24 |
ddstreet | ah | 16:24 |
ddstreet | Guest9223 in your .network file the param is misspelled | 16:24 |
ddstreet | it's GatewayOnLink not GatewayOnlink | 16:24 |
ddstreet | change the L capitalization and it should work ok | 16:24 |
ddstreet | not sure if netplan is using the wrong cap? | 16:24 |
Guest9223 | Lemme check... | 16:25 |
Guest9223 | Is it enough to edit the .network file? Because I get the same error and on another (not yet rebooted) server it is also lowercase | 16:26 |
ddstreet | no, it looks like there's a bug in netplan where it's using the wrong spelling | 16:26 |
Guest9223 | So something is case sensitive and not recognizing the key you mean? | 16:27 |
ddstreet | until that's fixed, i guess you could add a drop-in file, e.g. if the netplan-generated file is named '10-netplan-enp7s0.network' then create a file '/etc/systemd/network/10-netplan-enp7s0.network.d/override.conf' and make its content: | 16:28 |
ddstreet | [Route] | 16:28 |
ddstreet | GatewayOnLink=true | 16:28 |
ddstreet | (just those lines) | 16:28 |
ddstreet | then i think if you reboot it should work | 16:28 |
slyon | ddstreet: netplan is rendering it as "GatewayOnlink=true".. but that didn't change since 2018... and it was actually changed from GatewayOnLink -> GatewayOnlink in that commit d419c7b8 | 16:28 |
ddstreet | ah ok that's interesting then | 16:28 |
Guest92 | Got disconnected... did my GitHub message make it? | 16:29 |
slyon | Guest92: no, i don't think so | 16:31 |
Guest92 | https://github.com/canonical/netplan/search?q=GatewayOnLink shows GatewayOnlink only, lowercase | 16:32 |
slyon | netplan is rendering it as "GatewayOnlink=true".. but that didn't change since 2018... and it was actually changed from GatewayOnLink -> GatewayOnlink in that commit d419c7b8 | 16:32 |
ddstreet | Guest92 i'm wrong, as slyon said either spelling is ok | 16:32 |
ddstreet | so ignore me, might be the kernel as he said :) | 16:33 |
Guest92 | Seems that systemd-networkd is accepting both: https://github.com/systemd/systemd/search?q=GatewayOnLink | 16:33 |
slyon | I got to drop now. I think this problem is most probably related to networkd or kernel (as the error message told us: "Enumeration completed \ enp7s0: Could not set address: Operation not supported \ enp7s0: Failed") | 16:33 |
Guest92 | That one is correct: https://github.com/systemd/systemd/blob/2afb2f4a9d6a497dfbe1983fbe1bac297a8dc52b/src/network/networkd-route.c#L2348 | 16:33 |
Guest92 | Hm... I'm not a kernel dev... any idea what to do now? ;-) | 16:34 |
Guest92 | I can set the route manually via ip route add... but that lasts only until the next reboot | 16:34 |
ddstreet | slyon fyi systemd is only keeping the 'Onlink' spelling for backwards compat, the correct usage is 'GatewayOnLink' per upstream commit 9cb8c5593443d24c19e40bfd4fc06d672f8c554c | 16:39 |
Guest92 | Ok, so a little bug discovered (it should be GatewayOnLink in the .network file) but that probably isn't the issue here? :-) | 16:41 |
ddstreet | right it's definitely not the issue for you | 16:42 |
ddstreet | can you share the output of 'ip a' after it fails? it's unable to set your static ip, which is strange | 16:44 |
ddstreet | and check your kernel logs, e.g. 'journalctl -b -k' | 16:51 |
Guest92 | One sec... I enabled IPv6 again in /etc/default/ufw and enabled more logging for systemd-networkd | 16:51 |
Guest92 | Just rebooting and seeing what will happen now... | 16:51 |
Guest92 | https://imgur.com/lhVCC4R | 16:54 |
Guest92 | Still dead after reboot... | 16:54 |
ddstreet | but you still get the 'Could not set address' error? | 17:00 |
ddstreet | the address looks set | 17:00 |
Guest92 | I see some ACPI Errors in the journalctl -b -k output and the last line is "enp7s0... Link is Up, Full Duplex" etc. | 17:02 |
ddstreet | no errors from networkd? | 17:02 |
Guest92 | Nope... but still the same error in journalctl -u systemd-networkd | 17:03 |
Guest92 | But much more debug output now... shall I upload that? I can set the route manually to get the machine online | 17:03 |
ddstreet | sure | 17:03 |
Guest92 | There we go: https://pastebin.com/8KUYndZw | 17:10 |
Guest92 | I just replaced the first 3 parts of the IP, the last part is the original | 17:10 |
ddstreet | Guest92 do you have ipv6 disabled? | 17:18 |
ddstreet | like, in the kernel cmdline? | 17:18 |
Guest92 | AFAIR I disabled it only via /etc/default/ufw but not sure... how to check that? | 17:19 |
ddstreet | well first check /proc/cmdline | 17:19 |
ddstreet | to make sure you didn't disable ipv6 globally | 17:20 |
Guest92 | Yeah, looks disabled: BOOT_IMAGE=/vmlinuz-5.13.0-40-generic root=/dev/mapper/vg0-root ro ipv6.disable=1 nomodeset consoleblank=0 | 17:20 |
Guest92 | Ah I remember... seems I disabled it via grub: GRUB_CMDLINE_LINUX="ipv6.disable=1" | 17:21 |
ddstreet | ok...and you're telling networkd to set up an ipv6 address? | 17:21 |
Guest92 | Ok, lemme check... I'll enable it and reboot... | 17:22 |
Guest92 | Holy moly! That was the problem! ;-) | 17:26 |
ddstreet | yeah, hard for networkd to add the ipv6 addr you asked it to add, when ipv6 is disabled :) | 17:27 |
Guest92 | Besides that mistake on my behalf... shouldn't it set up IPv4 at least or somehow detect that? I found another issue here with a similar case: https://github.com/systemd/systemd/issues/12656 | 17:27 |
Guest92 | And... do you guys accept coffee or donations somehow? ;-) | 17:28 |
ddstreet | networkd doesn't do 'partially configured', if part of the interface setup fails it's considered 'failed' | 17:28 |
ddstreet | so since setting (one of) the addresses failed, networkd didn't bother to continue with the next step, adding the route for the interface | 17:29 |
Guest92 | I see... BIG BIG BIG thank you for helping me figure this out! :-) | 17:34 |
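
The root cause found above (an IPv6 address configured for enp7s0 while the kernel was booted with ipv6.disable=1) can be checked for up front. The following is only a minimal sketch in Python, not part of netplan or systemd: the function names are hypothetical, the address list is a made-up stand-in using documentation prefixes, and on a real machine the command line would be read from /proc/cmdline.

```python
import ipaddress


def ipv6_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line contains ipv6.disable=1."""
    return "ipv6.disable=1" in cmdline.split()


def ipv6_addresses(addresses):
    """Return the configured CIDR addresses that are IPv6."""
    return [a for a in addresses if ipaddress.ip_interface(a).version == 6]


def find_conflict(cmdline, addresses):
    """Return the IPv6 addresses that cannot be set because IPv6 is disabled."""
    if not ipv6_disabled(cmdline):
        return []
    return ipv6_addresses(addresses)


if __name__ == "__main__":
    # Command line as pasted in the log; on a live system read /proc/cmdline.
    cmdline = (
        "BOOT_IMAGE=/vmlinuz-5.13.0-40-generic root=/dev/mapper/vg0-root "
        "ro ipv6.disable=1 nomodeset consoleblank=0"
    )
    # Hypothetical addresses standing in for the redacted netplan config.
    addresses = ["192.0.2.10/32", "2001:db8::2/64"]
    for addr in find_conflict(cmdline, addresses):
        print(f"IPv6 is disabled in the kernel, but {addr} is configured")
```

Because networkd treats one failed address as a failed interface, any address reported here would also prevent the default route from being installed, which matches the symptom in the log.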
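
ddstreet's point about the /32 netmask can be illustrated with Python's standard ipaddress module: a gateway that lies outside the interface's own subnet is not directly reachable, so the default route needs the on-link flag. This is only a sketch with a hypothetical helper name and made-up documentation addresses, not the actual logic of the kernel or networkd.

```python
import ipaddress


def gateway_needs_onlink(address: str, gateway: str) -> bool:
    """Return True if the gateway is outside the subnet of the given
    interface address (CIDR notation), so no connected route covers it."""
    iface = ipaddress.ip_interface(address)
    return ipaddress.ip_address(gateway) not in iface.network


if __name__ == "__main__":
    # A /32 address contains only itself, so any gateway is off-subnet.
    print(gateway_needs_onlink("192.0.2.10/32", "192.0.2.1"))
    # With a /24 the same gateway is inside the subnet.
    print(gateway_needs_onlink("192.0.2.10/24", "192.0.2.1"))
```

With the /32 setup from the Hetzner docs the first call reports that on-link is required, which is why the manual `ip route add` in the log failed without `onlink`.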