=== esv_ is now known as esv | ||
holmanb | minimal: sorry, missed your comment before. Adding support in eni seems reasonable, at least without digging into details too much. Are you looking for any input/thoughts in particular? | 17:10 |
minimal | holmanb: I guess I'm wondering about the impact of this on other renderers and why things like IPv6 Privacy Extensions are not currently catered for in network config v2 | 17:36 |
minimal | I guess I'm thinking about IPv6 network configurations that are perhaps not typical in cloud environments but may be in VM or physical host environments | 17:55 |
axino | hey there, cloud-init is giving me https://pastebin.ubuntu.com/p/mNBzMpkRdP/ on a physical machine that's getting commissioned by MAAS. Apparently it's not breaking anything though. | 21:09 |
axino | happy to provide debugging output to help understand this | 21:09 |
blackboxsw | axino: sorry to have missed this. Does `sudo cloud-init schema --system` present an error due to invalid YAML being provided as user-data? | 21:41 |
axino | blackboxsw: hi o/ looking | 21:41 |
axino | blackboxsw: Cloud config schema errors: format-l1.c1: File None needs to begin with "#cloud-config" | 21:45 |
axino | blackboxsw: that's all this command gives me | 21:45 |
blackboxsw | axino: meh, ok, the user-data provided to this instance isn't #cloud-config specifically (could be b64encoded, or scripts, or something else). You can see it with: sudo cloud-init query userdata | 21:52 |
blackboxsw | that said, the traceback you mentioned is making me think that either the network-config.yaml provided to cloud-init on this instance or the user-data is bogus YAML. | 21:53 |
axino | blackboxsw: yeah it's a shell script with a payload | 21:54 |
blackboxsw | +1 ok we should have a bug for cloud-init query being smarter/more informative about non-#cloud-config user-data. | 21:54 |
blackboxsw | that said, the mac address provided would likely need to be quoted to avoid YAML parse errors I think | 21:55 |
axino | blackboxsw: so I can extract this, and it gives me a curtin directory with subdirs | 21:57 |
axino | blackboxsw: and the MAC address generating the messages appears only in curtin/configs/config-003.cfg | 21:58 |
axino | blackboxsw: which looks like a netplan config with a "network_commands" block appended | 21:58 |
axino | blackboxsw: the error appears 3 times, and the MAC address as well, so it looks like a good candidate | 22:00 |
blackboxsw | axino: is there a /etc/cloud/cloud.cfg.d/50-curtin-networking.cfg on the system? | 22:07 |
blackboxsw | and is that content shareable (not sensitive)? | 22:07 |
blackboxsw | at least I think that's where curtin via MAAS leaves network config artifacts for cloud-init to apply | 22:08 |
blackboxsw | if network config version provided by MAAS is `version: 2` recent cloud-init should actually pass through that content without trying to load the YAML at all, so I wouldn't have expected that error | 22:09 |
axino | blackboxsw: OK I can't answer that now, all the nodes are deployed | 22:14 |
axino | blackboxsw: I'll try to circle back | 22:14 |
blackboxsw | axino: the other error in that log is reminiscent of the dbus race w/ netplan apply per this bug https://bugs.launchpad.net/cloud-init/+bug/1997124 | 22:15 |
-ubottu:#cloud-init- Launchpad bug 1997124 in cloud-init "Netplan/Systemd/Cloud-init/Dbus Race" [High, In Progress] | 22:15 | |
blackboxsw | James is already toying with systemd dependency changes to avoid the activators.py[WARNING]: Running ['netplan', 'apply'] resulted in stderr output: Failed to connect system bus: No such file or directory | 22:16 |
blackboxsw | in this PR https://github.com/canonical/cloud-init/pull/1937 | 22:16 |
-ubottu:#cloud-init- Pull 1937 in canonical/cloud-init "Order cloud-init.service after dbus.socket on Ubuntu" [Open] | 22:16 | |
blackboxsw | but I think there is a secondary problem with the YAML processing as you mentioned | 22:17 |
axino | OK | 22:17 |
waldi | bah, can't comment. but how does ordering after dbus.socket help? | 22:18 |
waldi | even if, this should be sockets.target. but dbus.socket does not mean it can run dbus yet | 22:19 |
waldi | and on debian/ubuntu, there is a cycle between dbus.service and cloud-init if you hold it wrong enough | 22:21 |
blackboxsw | systemd-networkd talks to dbus I believe and "netplan apply" on ubuntu systems writes out configuration that is ultimately rendered by networkd | 22:22 |
waldi | systemd-networkd, as all systemd services, can connect to dbus late, so does not depend on dbus | 22:23 |
waldi | as dbus.service depends on sysinit.target, this reorders the whole startup | 22:25 |
* blackboxsw wonders about cloud-init services only blocking on the presence of an active dbus.socket when we happen to need to run `netplan apply` (which happens only for datasources that are detected after the network is already up and that use netplan as their network config 'backend'). | 22:29 | |
blackboxsw | waldi: we don't want After=dbus.service but were thinking about After=dbus.socket ... trying to see if that dep-chain still pulls us into After=sysinit.target | 22:30 |
blackboxsw | meh same prob | 22:33 |
blackboxsw | root@dev-f:~# systemctl show -p Requires,Wants,Before,After dbus.socket | 22:35 |
blackboxsw | Requires=-.mount system.slice sysinit.target | 22:35 |
blackboxsw | Wants= | 22:35 |
blackboxsw | Before=shutdown.target dbus.service systemd-logind.service polkit.service sockets.target accounts-daemon.service | 22:35 |
blackboxsw | After=-.mount system.slice sysinit.target | 22:35 |
waldi | you have a socket, try to connect to it and systemd schedules a transaction to start the associated service. at this stage it can only kill the connection or wait until the service is up and running. dbus.service depends on sysinit.target | 22:36 |
waldi | the only thing that fixes all those problems is to move dbus.socket/.service before sysinit.target | 22:41 |
waldi | debian currently have that problem with firewalld, because it wants to start before network, but after dbus. so sysinit -> dbus -> firewalld -> network -> cloud-init -> sysinit | 22:42 |
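The ordering cycle waldi describes can be checked mechanically. The following is a minimal sketch, not how systemd itself detects cycles: it models the chain from the message above as a plain dictionary of "starts after" edges (the unit names are shortened and the helper `find_cycle` is hypothetical) and runs a depth-first search for a loop.

```python
def find_cycle(after):
    """Return one ordering cycle as a list of unit names, or None.

    `after` maps a unit to the units it must start after.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(unit):
        state[unit] = GREY
        path.append(unit)
        for dep in after.get(unit, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: the loop is closed.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[unit] = BLACK
        return None

    for unit in after:
        if state.get(unit, WHITE) == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Edges taken from the chain in the log: each unit lists what it starts after.
ordering = {
    "dbus": ["sysinit"],
    "firewalld": ["dbus"],
    "network": ["firewalld"],
    "cloud-init": ["network"],
    "sysinit": ["cloud-init"],
}
cycle = find_cycle(ordering)
print(" -> ".join(cycle) if cycle else "no cycle")
```

Dropping any single edge, for example the one that makes `sysinit` wait for `cloud-init`, makes `find_cycle` return `None`.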
blackboxsw | thanks waldi for chiming in. hrm. right, ok it looks like the option of re-ordering dependencies for very early cloud-init boot stages (cloud-init.service or cloud-init-local.service) to depend on dbus.socket/service is high risk and impactful to overall general boot times and systemd ordering cycles (I had missed the After=sysinit.target in those cases). | 22:49 |
blackboxsw | the alternative we could take in cloud-init for the specific case where the datasource is detected after network is up (in init-network stage via "cloud-init.service") | 22:50 |
blackboxsw | is to actually just defer the 'netplan apply' call (to apply network configuration) to cloud-config.service | 22:51 |
blackboxsw | which is scheduled After=sysinit.target | 22:52 |
waldi | why does netplan apply need dbus? | 22:52 |
waldi | hmpf, even service reload is implemented via "networkctl reload" | 22:53 |
blackboxsw | I don't think it's netplan per se, but rather systemd-networkd is complaining about inability to talk to the socket | 22:54 |
blackboxsw | but I'm digging in both netplan && systemd to double-check that | 22:55 |
waldi | unlikely. systemd-networkd.service does not depend on dbus.*. this is networkctl | 22:55 |
waldi | anyway, sleep | 22:56 |
blackboxsw | yep https://github.com/canonical/netplan/blob/main/netplan/cli/utils.py#L116 | 22:57 |
blackboxsw | thx again. will poke at a more reasonable solution for this. | 22:58 |
blackboxsw | or, simplest interim solution, we could retry netplan apply in cases where dbus.socket isn't up yet and cloud-init hits this specific error condition | 23:14 |
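The interim idea in the last message, retrying `netplan apply` when the bus is not reachable yet, could look roughly like the sketch below. This is an illustration only, not cloud-init's implementation: the function name `apply_with_retry`, the attempt count, and the delay are assumptions. The error marker is taken from the warning quoted earlier in the log. Because that warning reports stderr output without saying the command exited non-zero, the sketch inspects stderr and does not rely on the exit code alone.

```python
import subprocess
import time

# Marker taken from the warning quoted earlier in the log.
DBUS_NOT_READY = "Failed to connect system bus"


def apply_with_retry(cmd=("netplan", "apply"), attempts=5, delay=1.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying only while stderr shows the bus-not-ready error.

    Returns the number of attempts used. Raises RuntimeError if the bus never
    becomes reachable, and CalledProcessError for any other failure.
    `run` and `sleep` are injectable so the logic can be exercised without
    netplan installed (hypothetical parameters, for illustration).
    """
    for attempt in range(1, attempts + 1):
        result = run(list(cmd), capture_output=True, text=True)
        if DBUS_NOT_READY not in result.stderr:
            result.check_returncode()  # unrelated failures are not retried
            return attempt
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError(
        f"system bus still unavailable after {attempts} attempts"
    )
```

Matching on the specific error string keeps the retry narrow: any other failure surfaces immediately and is not masked by repeated attempts.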