[17:10] <holmanb> minimal: sorry, missed your comment before. Adding support in eni seems reasonable, at least without digging into details too much. Are you looking for any input/thoughts in particular?
[17:36] <minimal> holmanb: I guess I'm wondering about the impact of this on other renderers and why things like IPv6 Privacy Extensions are not currently catered for in network config v2
[17:55] <minimal> I guess I'm thinking about IPv6 network configurations that are perhaps not typical in cloud environments but may be in VM or physical host environments
[21:09] <axino> hey there, cloud-init is giving me https://pastebin.ubuntu.com/p/mNBzMpkRdP/ on a physical machine that's getting commissioned by MAAS. Apparently it's not breaking anything though.
[21:09] <axino> happy to provide debugging output to help understand this
[21:41] <blackboxsw> axino: sorry to have missed this. Does `sudo cloud-init schema --system` present an error due to invalid YAML being provided as user-data?
[21:41] <axino> blackboxsw: hi o/ looking
[21:45] <axino> blackboxsw: Cloud config schema errors: format-l1.c1: File None needs to begin with "#cloud-config"
[21:45] <axino> blackboxsw: that's all this command gives me
[21:52] <blackboxsw> axino: meh, ok, the user-data provided to this instance isn't #cloud-config specifically (could be b64encoded or scripts or something else). You can see it with: sudo cloud-init query userdata
[21:53] <blackboxsw> that said, the traceback you mentioned is making me think either the network-config.yaml or the user-data provided to cloud-init on this instance is bogus YAML.
[21:54] <axino> blackboxsw: yeah it's a shell script with a payload
[21:54] <blackboxsw> +1 ok we should have a bug for cloud-init query being smarter/more informative about non-#cloud-config user-data.
[21:55] <blackboxsw> that said, the mac address provided would likely need to be quoted to avoid YAML parse errors, I think
[21:57] <axino> blackboxsw: so I can extract this, and it gives me a curtin directory with subdirs
[21:58] <axino> blackboxsw: and the MAC address generating the messages appears only in curtin/configs/config-003.cfg
[21:58] <axino> blackboxsw: which looks like a netplan config with a "network_commands" block appended
[22:00] <axino> blackboxsw: the error appears 3 times, and the MAC address as well, so it looks like a good candidate
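
The chat does not confirm what actually tripped the parser, so the following is only a sketch of one known way an unquoted MAC address can be misread. YAML 1.1 defines a base-60 integer form, and a MAC made up entirely of decimal digit pairs can match it and be loaded as a number instead of a string. The pattern below is the base-60 form from the YAML 1.1 int type; the helper names and the sample addresses are hypothetical.

```python
import re

# Base-60 ("sexagesimal") integer form from the YAML 1.1 int type.
SEXAGESIMAL_INT = re.compile(r"[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+")


def needs_quoting(mac: str) -> bool:
    """True if a YAML 1.1 loader could read this unquoted MAC as an integer."""
    return SEXAGESIMAL_INT.fullmatch(mac) is not None


def yaml_safe_mac(mac: str) -> str:
    """Hypothetical helper: always quote, since quoting a MAC is never wrong."""
    return f'"{mac}"'


# Sample addresses are made up for illustration.
at_risk = needs_quoting("52:54:00:12:34:56")   # digits only, pairs below 60
harmless = needs_quoting("52:54:00:ab:cd:ef")  # hex letters force a string
```

Quoting unconditionally is the simpler rule for whatever generates the config; the check is only there to show which addresses are affected.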
[22:07] <blackboxsw> axino: is there a /etc/cloud/cloud.cfg.d/50-curtin-networking.cfg on the system?
[22:07] <blackboxsw> and is that content shareable (not sensitive)?
[22:08] <blackboxsw> at least I think that's where curtin via MAAS leaves network config artifacts for cloud-init to apply
[22:09] <blackboxsw> if network config version provided by MAAS is `version: 2` recent cloud-init should actually pass through that content without trying to load the YAML at all, so I wouldn't have expected that error
[22:14] <axino> blackboxsw: OK I can't answer that now, all the nodes are deployed
[22:14] <axino> blackboxsw: I'll try to circle back
[22:15] <blackboxsw> axino: the other error in that log is reminiscent of the dbus race w/ netplan apply per this bug https://bugs.launchpad.net/cloud-init/+bug/1997124
[22:15] -ubottu:#cloud-init- Launchpad bug 1997124 in cloud-init "Netplan/Systemd/Cloud-init/Dbus Race" [High, In Progress]
[22:16] <blackboxsw> James is already toying with systemd dependency changes to avoid the activators.py[WARNING]: Running ['netplan', 'apply'] resulted in stderr output: Failed to connect system bus: No such file or directory
[22:16] <blackboxsw> in this PR https://github.com/canonical/cloud-init/pull/1937
[22:16] -ubottu:#cloud-init- Pull 1937 in canonical/cloud-init "Order cloud-init.service after dbus.socket on Ubuntu" [Open]
[22:17] <blackboxsw> but I think there is a secondary problem with the YAML processing as you mentioned
[22:17] <axino> OK
[22:18] <waldi> bah, can't comment. but how does ordering after dbus.socket help?
[22:19] <waldi> even if, this should be sockets.target. but dbus.socket does not mean it can run dbus yet
[22:21] <waldi> and on debian/ubuntu, there is a cycle between dbus.service and cloud-init if you hold it wrong enough
[22:22] <blackboxsw> systemd-networkd talks to dbus I believe and "netplan apply" on ubuntu systems writes out configuration that is ultimately rendered by networkd
[22:23] <waldi> systemd-networkd, as all systemd services, can connect to dbus late, so does not depend on dbus
[22:25] <waldi> as dbus.service depends on sysinit.target, this reorders the whole startup
[22:29]  * blackboxsw starts wondering about cloud-init services only blocking on the presence of an active dbus.socket when we happen to need to run `netplan apply` (which happens only for datasources that are detected after the network is already up and that have netplan as their network config 'backend').
[22:30] <blackboxsw> waldi: we don't want After=dbus.service but were thinking about After=dbus.socket ... trying to see if that dep-chain still pulls us into After=sysinit.target
[22:33] <blackboxsw> meh same prob
[22:35] <blackboxsw> root@dev-f:~# systemctl show -p Requires,Wants,Before,After dbus.socket
[22:35] <blackboxsw> Requires=-.mount system.slice sysinit.target
[22:35] <blackboxsw> Wants=
[22:35] <blackboxsw> Before=shutdown.target dbus.service systemd-logind.service polkit.service sockets.target accounts-daemon.service
[22:35] <blackboxsw> After=-.mount system.slice sysinit.target
[22:36] <waldi> you have a socket, try to connect to it and systemd schedules a transaction to start the associated service. at this stage it can only kill the connection or wait until the service is up and running. dbus.service depends on sysinit.target
[22:41] <waldi> the only thing that fixes all those problems is to move dbus.socket/.service before sysinit.target
[22:42] <waldi> debian currently has that problem with firewalld, because it wants to start before network, but after dbus. so sysinit -> dbus -> firewalld -> network -> cloud-init -> sysinit
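
The chain waldi quotes is an ordering cycle, and it can be checked mechanically. Below is a minimal sketch, assuming a plain dict of "ordered before" edges; the short names are the ones used in the chat, not real systemd unit names, and `find_cycle` is a hypothetical helper (a depth-first search), not anything systemd or cloud-init ships.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_cycle(ordered_before: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one ordering cycle as a list of names, or None if there is none."""
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(unit: str) -> Optional[List[str]]:
        state[unit] = GREY
        path.append(unit)
        for nxt in ordered_before.get(unit, []):
            if state.get(nxt, WHITE) == GREY:
                # nxt is already on the current path: that closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if state.get(nxt, WHITE) == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[unit] = BLACK
        return None

    for unit in list(ordered_before):
        if state.get(unit, WHITE) == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Edges read "key is ordered before each listed name", per the chain above.
ORDERED_BEFORE = {
    "sysinit": ["dbus"],
    "dbus": ["firewalld"],
    "firewalld": ["network"],
    "network": ["cloud-init"],
    "cloud-init": ["sysinit"],
}
cycle = find_cycle(ORDERED_BEFORE)
```

Dropping the last edge (cloud-init before sysinit) makes the same graph acyclic, which is why ordering an early cloud-init stage behind anything that itself sits after sysinit.target is the risky part.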
[22:49] <blackboxsw> thanks waldi for chiming in. hrm. right, ok it looks like the option of re-ordering dependencies for very early cloud-init boot stages (cloud-init.service or cloud-init-local.service) to depend on dbus.socket/service is high risk and impactful to overall general boot times and systemd ordering cycles (I had missed the After=sysinit.target in those cases).
[22:50] <blackboxsw> the alternative we could take in cloud-init for the specific case where the datasource is detected after network is up (in init-network stage via "cloud-init.service") 
[22:51] <blackboxsw> is to actually just defer the 'netplan apply' call (to apply network configuration) to cloud-config.service
[22:52] <blackboxsw> which is scheduled After=sysinit.target
[22:52] <waldi> why does netplan apply need dbus?
[22:53] <waldi> hmpf, even service reload is implemented via "networkctl reload"
[22:54] <blackboxsw> I don't think it's netplan per se, but rather systemd-networkd is complaining about inability to talk to the socket
[22:55] <blackboxsw> but I'm digging in both netplan && systemd to double-check that
[22:55] <waldi> unlikely. systemd-networkd.service does not depend on dbus.*. this is networkctl
[22:56] <waldi> anyway, sleep
[22:57] <blackboxsw> yep https://github.com/canonical/netplan/blob/main/netplan/cli/utils.py#L116
[22:58] <blackboxsw> thx again. will poke at a more reasonable solution for this.
[23:14] <blackboxsw> or, simplest interim solution, we could retry netplan apply in cases where dbus.socket isn't up yet and cloud-init hits this specific error condition
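
The interim idea in the last message could look roughly like the sketch below. It is not cloud-init's actual code: `run_cmd`, `apply_with_retry`, the attempt count and the delay are all hypothetical, and the only string taken from the log is the "Failed to connect system bus" text from the warning. The log reports that text as stderr output, so the sketch retries on the stderr match regardless of exit code.

```python
import subprocess
import time
from typing import Callable, Sequence, Tuple

# Text from the warning quoted earlier in the log.
DBUS_ERROR = "Failed to connect system bus"


def run_cmd(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stderr)."""
    proc = subprocess.run(list(cmd), capture_output=True, text=True, check=False)
    return proc.returncode, proc.stderr


def apply_with_retry(
    cmd: Sequence[str] = ("netplan", "apply"),
    runner: Callable[[Sequence[str]], Tuple[int, str]] = run_cmd,
    attempts: int = 5,            # assumed value, not from the chat
    delay: float = 1.0,           # assumed value, not from the chat
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry only while stderr shows the D-Bus connection failure."""
    for attempt in range(1, attempts + 1):
        returncode, stderr = runner(cmd)
        if DBUS_ERROR not in stderr:
            # Any other outcome is final: success or an unrelated failure.
            return returncode == 0
        if attempt < attempts:
            sleep(delay)
    return False
```

Passing `runner` and `sleep` as parameters keeps the retry logic testable without a real netplan or system bus.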