[13:44] smoser: when you get a chance, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1636912/ has some details re: networkd and cloud-init.service that pitti tracked down. I'm going to see about testing out the change to cloud-init.service to include an After=systemd-networkd.service and see if that resolves the ordering issue we're seeing
[15:06] smoser: playing with the networkd target and cloud-init.service; I noticed that we say Before=network-online.target ; however, don't we want networking to be *up* when cloud-init.service runs (it may scan for network metadata services) ?
[15:12] it's weird.
[15:12] network-online is a target
[15:12] that other things would wait for
[15:12] to know that the network is up
[15:13] and we want to block those things running until after cloud-init has run
[15:13] hrm
[15:13] ok
[15:13] we want to force that bottleneck in boot
[15:13] so, how do we handle the race between 'networking.service' being complete and the devices being UP
[15:13] and when cloud-init is running
[15:13] and checking if networking is up (in the route output info)
[15:13] you're gonna make me think, eh
[15:14] we have walked through this, and the stuff is sane. but let me look
[15:14] so, in my case, networkd runs, kicks off DHCP, takes 2 to 3 seconds; cloud-init.service runs almost the same time as networkd (just right after)
[15:14] but races the DHCP response
[15:14] ok
[15:14] i'd not be completely sure, but pitti has helped me to be sure in the past
[15:14] ok, I updated the bug
[15:14] but obviously this is all really complex (and unfortunately brittle)
[15:15] so, if you recall any discussion about the ordering, add it there so we can clarify
[15:15] I wonder if the serial execution of ifup in networking.service naturally delays things
[15:16] vs. networkd going daemon mode and async bringing up networking
[15:16] so in ifupdown world, there is no race on 'networking.service'
[15:16] if it's done ('After') then stuff that is supposed to be up is up
[15:16] which kind of makes sense.
[15:16] right, but we have no daemon
[15:16] rather there are two Exec=
[15:16] statements
[15:16] both have to run
[15:17] systemd-networkd can just spawn itself and report that it's done
[15:17] it gets out of the way faster
[15:17] well it shouldn't do that
[15:17] but it takes roughly 3 seconds for the DHCP response to complete (according to the journal)
[15:17] but
[15:17] there's a separate systemd-networkd-wait-online.service
[15:17] which waits
[15:17] but again, that's a *start* of a unit
[15:17] not a completion
[15:18] so things are "up" after -wait-online exits
[15:18] which is roughly network-online.target
[15:18] well, it says (/lib/systemd/systemd-networkd-wait-online)
[15:18] which *says* "Block until network is configured":
[15:18] so maybe we can say Requires systemd-networkd-wait-online
[15:18] so maybe we just want to be After systemd-networkd-wait-online
[15:18] yeah
[15:18] lemme try that
[15:19] except... if that is going to cause issues when networkd is *not* used.
[15:20] maybe
[15:20] I'll test both
[15:24] /lib/systemd/systemd-networkd-wait-online
[15:24] is blocking on my desktop right now
[15:24] :-(
[15:29] hrm
[15:29] may need to generate that
[15:29] based on if networkd is going to run
[15:29] or something like that
[15:50] ah
[15:50] [ INFO ] Network Service is not active.
[15:50] [DEPEND] Dependency failed for Wait for Network to be Configured.
[15:50] [DEPEND] Dependency failed for Initial cloud... job (metadata service crawler).
[15:50] it's disabled by default if ifupdown is installed IIRC
[16:32] bleh, I have it activated now, but it blocks continuously; and no idea why
[17:02] Anyone know how I can reference an EC2 instance ID in a cloud config on CentOS6?
[17:03] I've tried just using $INSTANCE_ID, want to use write_files to put it in a file
[17:08] been suggested to set a variable in bootcmd section, but not a fan of how it looks, wasn't sure if that's the best/only way
[17:31] NerdyBiker: you can readlink -f /var/lib/cloud/instance
[17:32] that's a symlink to the instances/$INSTANCE_ID of the current session
[17:42] NerdyBiker: it also looks like /var/lib/cloud/data/instance-id includes the current instance id value
[19:26] smoser: well, after most of the day here; I'd say we're in a wedge; networkd really wants dbus.service; otherwise it just adds boot time; in Xenial, we don't have resolved service; I suspect there we could drop the requirement; but it means divergent config and possible behavior
[19:30] :-(
[19:32] rharper, but well what about the After when networkd is not running
[19:32] that would block, right ?
[19:56] I've not tested the non-networkd setup yet
[19:56] just trying to figure out how to get networkd and cloud-init to run in the right order (and without blocking excessively)
=== Hazelesque_ is now known as Hazelesque
[20:18] ok.
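The two suggestions at 17:31 and 17:42 for finding the instance ID can be combined into a small shell helper. This is only a sketch: the function name `instance_id` and the `CLOUD_DIR` override are made up for illustration, and it assumes the /var/lib/cloud layout described in the channel is present on the image in question (not verified here for CentOS6).

```shell
# Hypothetical helper, not part of cloud-init.
# CLOUD_DIR is an illustrative override; the default is the path quoted in the log.
cloud_dir="${CLOUD_DIR:-/var/lib/cloud}"

instance_id() {
    if [ -r "$cloud_dir/data/instance-id" ]; then
        # The 17:42 suggestion: the file holds the current instance ID.
        cat "$cloud_dir/data/instance-id"
    else
        # The 17:31 suggestion: 'instance' is a symlink to instances/<id>,
        # so the last path component of its target is the ID.
        basename "$(readlink -f "$cloud_dir/instance")"
    fi
}
```

Something like `instance_id > /etc/instance-id` could then be run from a bootcmd or runcmd entry; the target path is just an example.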
[20:36] I wish there was an easy way to find out why a unit was skipped
[20:52] ok; I think I have something; if we drop the Requires , but use After=networking.service and After=systemd-networkd.service; then there's no hang
[20:52] in the case of networkd/nplan, the netplan generator will add systemd-networkd as a wanted target if /etc/netplan/*.yaml is present
[20:53] the one catch here is that, networking service still runs as well (ie, if you have both ifupdown and networkd/netplan installed)
[20:53] I did have to update the cloud-config.target to Require cloud-init.service ; it wasn't getting picked up after we dropped the networking requirement
[20:54] this stuff is just painful
[20:54] in the bug, still discussing with pitti and slangasek as the UC16 image is "unique"
[20:56] rharper, i don't know why you'd need 'Require' rather than just 'After' for cloud-config.target
[20:56] the difference (since they're clearly always installed together) is that Require means only run if the other succeeds
[20:57] we want cloud-config.target to run anyway, as it may be needed.
[20:57] it's kind of failsafe.
[20:57] but that is my understanding of Require and After
[21:01] smoser: I agree
[21:01] there unfortunately isn't a systemctl --who-wants=cloud-init.service
[21:01] so I can't tell *why* it isn't run
[21:01] it just _isn't_
[21:01] if I add just that line to the cloud-config.target
[21:01] it works
[21:02] there's not much reason to run cloud-config if cloud-init.service fails though, right?
[21:02] smoser: note that cloud-init.service is *already* in the After= list for cloud-config.target
[21:03] * rharper fiddles some more for the smallest set of changes
[21:03] right.
[21:03] so, I'm left wondering why it didn't run
[21:03] look at the journal
[21:03] it says *nothing* about what didn't run
[21:03] for 'reaking'
[21:03] there wasn't a loop
[21:03] (not freaking)
[21:03] ok
[21:03] it's just not *wanted*
[21:03] which is bizarre
[21:03] breaking. that is what i was looking for.
[21:03] cloud-init.target wants it
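The ordering change described at 20:52 (After=networking.service plus After=systemd-networkd.service, with no Requires) could be tried out as a systemd drop-in rather than by editing the unit in place. The sketch below is an assumption about how one might test it, not the change that was actually made to cloud-init: the helper name `write_ordering_dropin`, the `DROPIN_DIR` override, and the file name `10-network-ordering.conf` are all hypothetical. A drop-in can only add dependencies, so removing an existing Requires= would still mean editing the unit file itself.

```shell
# Hypothetical test helper, not the cloud-init packaging change.
# DROPIN_DIR is an illustrative override; the default is the standard
# drop-in location for cloud-init.service.
dropin_dir="${DROPIN_DIR:-/etc/systemd/system/cloud-init.service.d}"

write_ordering_dropin() {
    mkdir -p "$dropin_dir"
    # After= only orders units that are started anyway; unlike Requires=,
    # it does not pull the named units in or fail when they are absent.
    cat > "$dropin_dir/10-network-ordering.conf" <<'EOF'
[Unit]
After=networking.service
After=systemd-networkd.service
EOF
}
```

After writing the file, `systemctl daemon-reload` makes systemd re-read the unit. For the "why wasn't it wanted" question, `systemctl list-dependencies --reverse cloud-init.service` lists the units that pull it in, which may come close to the missing --who-wants.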