[13:44] <rharper> smoser: when you get a chance, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1636912/ has some details re: networkd and cloud-init.service that pitti tracked down.  I'm going to see about testing out the change to cloud-init.service to include an After=systemd-networkd.service and see if that resolves the ordering issue we're seeing
[15:06] <rharper> smoser: playing with the networkd target and cloud-init.service;  I noticed that we say Before=network-online.target ; however, don't we want networking to be *up* when cloud-init.service runs (it may scan for network metadata services) ?
[15:12] <smoser> it's weird.
[15:12] <smoser> network-online is a target
[15:12] <smoser> that other things would wait for
[15:12] <smoser> to know that the network is up
[15:13] <smoser> and we want to block those things running until after cloud-init has run
[15:13] <rharper> hrm
[15:13] <rharper> ok
[15:13] <smoser> we want to force that bottleneck in boot
[15:13] <rharper> so, how do we handle the race between 'networking.service' being complete and the devices being UP
[15:13] <rharper> and when cloud-init is running
[15:13] <rharper> and checking if networking is up (in the route output info)
[15:13] <smoser> you're gonna make me think, eh
[15:14] <smoser> we have walked through this, and the stuff is sane. but let me look
[15:14] <rharper> so, in my case, networkd runs, kicks off DHCP, takes 2 to 3 seconds; cloud-init.service runs at almost the same time as networkd (just right after)
[15:14] <rharper> but races the DHCP response
[15:14] <rharper> ok
[15:14] <smoser> i'd not be completely sure, but pitti has helped me to be sure in the past
[15:14] <rharper> ok, I updated the bug
[15:14] <smoser> but obviously this is all really complex (and unfortunately brittle)
[15:15] <rharper> so, if you recall any discussion about the ordering, add it there so we can clarify
[15:15] <rharper> I wonder if the serial execution of ifup  in networking.service naturally delays things
[15:16] <rharper> vs. networkd going daemon mode and async bringing up networking
[15:16] <smoser> so in ifupdown world, there is no race on 'networking.service'
[15:16] <smoser> if it's done ('After') then stuff that is supposed to be up is up
[15:16] <smoser> which kind of makes sense.
[15:16] <rharper> right, but we have no daemon
[15:16] <rharper> rather there are two Exec=
[15:16] <rharper> statements
[15:16] <rharper> both have to run
[15:17] <rharper> systemd-networkd can just spawn itself and report that it's done
[15:17] <rharper> it gets out of the way faster
[15:17] <smoser> well it shouldn't do that
[15:17] <rharper> but it takes roughly 3 seconds for the DHCP response to complete (according to the journal)
[15:17] <rharper> but
[15:17] <rharper> there's a separate systemd-networkd-wait-online.service
[15:17] <rharper> which waits
[15:17] <rharper> but again, that's a *start* of a unit
[15:17] <rharper> not a completion
[15:18] <rharper> so things are "up" after -wait-online exits
[15:18] <rharper> which is roughly network-online.target
[15:18] <smoser> well, it says (/lib/systemd/systemd-networkd-wait-online)
[15:18] <smoser> which *says* "Block until network is configured":
[15:18] <rharper> so maybe we can say Requires systemd-networkd-wait-online
[15:18] <smoser> so maybe we just want to be After systemd-networkd-wait-online
[15:18] <smoser> yeah
[15:18] <rharper> lemme try that
[15:19] <smoser> except... if that is going to cause issues when networkd is *not* used.
[15:20] <rharper> maybe
[15:20] <rharper> I'll test both
[15:24] <smoser>  /lib/systemd/systemd-networkd-wait-online
[15:24] <smoser> is blocking on my desktop right now
[15:24] <smoser> :-(
[15:29] <rharper> hrm
[15:29] <rharper> may need to generate that
[15:29] <rharper> based on if networkd is going to run
[15:29] <rharper> or something like that
[15:50] <rharper> ah
[15:50] <rharper> [ INFO ] Network Service is not active.
[15:50] <rharper> [DEPEND] Dependency failed for Wait for Network to be Configured.
[15:50] <rharper> [DEPEND] Dependency failed for Initial cloud... job (metadata service crawler).
[15:50] <rharper> it's disabled by default if ifupdown is installed IIRC
[16:32] <rharper> bleh, I have it activated now, but it blocks continuously; and no idea why
[17:02] <NerdyBiker> Anyone know how I can reference an EC2 instance ID in a cloud config on CentOS6?
[17:03] <NerdyBiker> I've tried just using $INSTANCE_ID, want to use write_files to put it in a file
[17:08] <NerdyBiker> been suggested to set a variable in bootcmd section, but not a fan of how it looks, wasn't sure if that's the best/only way
[17:31] <rharper> NerdyBiker: you can readlink -f /var/lib/cloud/instance
[17:32] <rharper> that's a symlink to the instances/$INSTANCE_ID of the current session
[17:42] <rharper> NerdyBiker: it also looks like /var/lib/cloud/data/instance-id includes the current instance id value
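The two lookups rharper mentions can be combined in a short script. This is only a sketch: the function name `current_instance_id` and the `cloud_dir` parameter are hypothetical, and the only paths assumed are the two named above (`/var/lib/cloud/data/instance-id` and the `/var/lib/cloud/instance` symlink). It does not address how to feed the value into `write_files`; it only shows how a script running on the instance could read it.

```python
import os


def current_instance_id(cloud_dir="/var/lib/cloud"):
    """Return the instance ID recorded by cloud-init, or None if not found.

    Hypothetical helper: tries data/instance-id first, then falls back to
    the name of the directory that the 'instance' symlink points to.
    """
    id_file = os.path.join(cloud_dir, "data", "instance-id")
    try:
        with open(id_file) as fh:
            value = fh.read().strip()
        if value:
            return value
    except OSError:
        # File missing or unreadable: fall through to the symlink lookup.
        pass

    link = os.path.join(cloud_dir, "instance")
    if os.path.islink(link):
        # Equivalent in spirit to: basename "$(readlink -f /var/lib/cloud/instance)"
        return os.path.basename(os.path.realpath(link))
    return None


if __name__ == "__main__":
    print(current_instance_id())
```

Passing a different `cloud_dir` makes it possible to try the function against a scratch directory rather than a live instance.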
[19:26] <rharper> smoser: well, after most of the day here; I'd say we're in a wedge;  networkd really wants dbus.service; otherwise it just adds boot time;  in Xenial, we don't have resolved service; I suspect there we could drop the requirement; but it means divergent config and possible behavior
[19:30] <smoser> :-(
[19:32] <smoser> rharper, but well what about the After when networkd is not running
[19:32] <smoser> that would block, right ?
[19:56] <rharper> I've not tested the non-networkd setup yet
[19:56] <rharper> just trying to figure out how to get networkd and cloud-init to run in the right order (and without blocking excessively)
[20:18] <smoser> ok.
[20:36] <rharper> I wish there was an easy way to find out why a unit was skipped
[20:52] <rharper> ok; I think I have something;    if we drop the Requires, but use After=networking.service and After=systemd-networkd.service;  then there's no hang
[20:52] <rharper> in the case of networkd/nplan, the netplan generator will add systemd-networkd as a wanted target if /etc/netplan/*.yaml is present
[20:53] <rharper> the one catch here is that, networking service still runs as well (ie, if you have both ifupdown and networkd/netplan installed)
[20:53] <rharper> I did have to update the cloud-config.target to Require cloud-init.service ;  it wasn't getting picked up after we dropped the networking requirement
[20:54] <rharper> this stuff is just painful
[20:54] <rharper> in the bug, still discussing with pitti and slangasek as the UC16 image is "unique"
[20:56] <smoser> rharper, i don't know why you'd need 'Require' rather than just 'After' for cloud-config.target
[20:56] <smoser> the difference (since they're clearly always installed together)  is that Require means only run if the other succeeds
[20:57] <smoser> we want cloud-config.target to run anyway, as it may be needed.
[20:57] <smoser> its kind of failsafe.
[20:57] <smoser> but that is my understanding of Require and After
[21:01] <rharper> smoser: I agree
[21:01] <rharper> there unfortunately isn't a systemctl --who-wants=cloud-init.service
[21:01] <rharper> so I can't tell *why* it isn't run
[21:01] <rharper> it just _isn't_
[21:01] <rharper> if I add just that line to the cloud-config.target
[21:01] <rharper> it works
[21:02] <rharper> there's not much reason to run cloud-config if cloud-init.service fails though, right?
[21:02] <rharper> smoser: note that cloud-init.service is *already* in the After= list  for cloud-config.target
[21:03]  * rharper fiddles some more for the smallest set of changes
[21:03] <smoser> right.
[21:03] <rharper> so, I'm left wondering why it didn't run
[21:03] <smoser> look at the journal
[21:03] <rharper> it says *nothing* about what didn't run
[21:03] <smoser> for 'reaking'
[21:03] <rharper> there wasn't a loop
[21:03] <smoser> (not freaking)
[21:03] <smoser> ok
[21:03] <rharper> it's just not *wanted*
[21:03] <rharper> which is bizarre
[21:03] <smoser> breaking. that is what i was looking for.
[21:03] <rharper> cloud-init.target wants it
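The "who wants this unit" question can be roughly approximated by scanning unit files on disk. The sketch below is an assumption-laden illustration, not how systemd resolves dependencies: `find_references`, `DEP_KEYS`, and the default directory list are hypothetical, and it ignores drop-in `.d` directories, generator output, and line continuations. On a running system, `systemctl list-dependencies --reverse cloud-init.service` reports reverse dependencies as systemd itself sees them.

```python
import glob
import os

# Directives checked in unit files (a hypothetical, deliberately small set).
DEP_KEYS = ("Wants", "Requires", "After", "Before")


def find_references(unit_name,
                    unit_dirs=("/etc/systemd/system", "/lib/systemd/system")):
    """Return sorted (referencing_unit, directive) pairs that mention unit_name."""
    found = set()
    for unit_dir in unit_dirs:
        # 1. Directive lines such as "After=cloud-init.service" in unit files.
        for path in glob.glob(os.path.join(unit_dir, "*")):
            if not os.path.isfile(path):
                continue
            with open(path, errors="replace") as fh:
                for line in fh:
                    key, sep, value = line.strip().partition("=")
                    key = key.strip()
                    if sep and key in DEP_KEYS and unit_name in value.split():
                        found.add((os.path.basename(path), key))
        # 2. Entries (usually symlinks) in foo.wants/ and foo.requires/ dirs.
        for kind, directive in (("wants", "Wants"), ("requires", "Requires")):
            for dep_dir in glob.glob(os.path.join(unit_dir, "*." + kind)):
                if os.path.lexists(os.path.join(dep_dir, unit_name)):
                    owner = os.path.basename(dep_dir)[: -len(kind) - 1]
                    found.add((owner, directive))
    return sorted(found)


if __name__ == "__main__":
    for unit, directive in find_references("cloud-init.service"):
        print(f"{unit}: {directive}")
```

A unit that shows up only under `After` or `Before` is ordered relative to cloud-init.service but does not pull it into the boot transaction, which would be consistent with a unit that is simply never started.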