[00:12] <agentnoel> Hi there, I'm looking to run some scripts per-instance, and pass them environment variables via user-data.
[00:12] <agentnoel> This is in the context of AWS EC2
[00:14] <agentnoel> I've found boothooks can set environment variables early in the process, but the syntax is not completely user friendly. If I supply a "bash script" as user data, it is easier to read, but executes too late in the process.
[00:14] <agentnoel> Do you have any suggestions for injecting environment variables early in the cloud-init process?
[00:15] <agentnoel> Also, the scripts are "baked" into the Amazon Machine Image at /var/lib/cloud/scripts/per-instance/. Is this the correct location? (I couldn't get the vendor scripts folder working).
[00:15] <agentnoel> Thank you :-)
[17:54] <blackboxsw> rharper: so, would you expect if cloud-init didn't rename interfaces that we'd avoid the wait-online  timeout?
[18:35] <rharper> blackboxsw: well, I would say no, but I also don't know why we're not matching any interface to configure either
[18:37] <blackboxsw> rharper: I'm wondering if the dual rename is introducing a race in networkd wait-online. like the data gets cached at some point post cold-plug rename and pre cloud-init rename
[18:37] <rharper> no, rename happens before networkd starts
[18:37] <rharper> it's 1) boot 2) generate which creates /run/systemd/network/*.{link,network} files 3) cold-plug (which fires on .link files) 4) cloud-init local
[18:38] <smoser> rharper: i dont think order of 2 and 3 is guaranteed
[18:38] <rharper> yes
[18:38] <rharper> it is
[18:38] <rharper> systemd generators run way before any units are processed
[18:38] <smoser> ? cloud-init's invocation of generate ?
[18:38] <rharper> no
[18:38] <rharper> generators
[18:39] <rharper> netplan gets called as a generator by systemd itself
[18:39] <rharper> this is reboot, so we already have an existing /etc/netplan/*.yaml file
[18:39] <smoser> ah. yeah.
[18:40] <blackboxsw> as it stands, operations look like the following: cold-plug renames eth1 -> rename3 because of our existing /run/systemd/network/10-netplan-eth0.link file, which contains a Name=eth0 and matching mac. But since azure presents an existing eth0 we fall back to rename3.
[18:41] <blackboxsw> then cloud-init does two renames, eth0 -> cirename0 and rename3 -> eth0;
[18:41] <rharper> oh, interesting
[18:41] <rharper> moves the "new" eth0 out of the way; and then pushes the right eth0 into position
[18:41] <blackboxsw> yeah cloud-init's is a little smarter, move the existing out of the way
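[editor's note] The two-stage rename described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names, the dict-of-NICs model, and the MAC addresses are all hypothetical, and neither function reflects the actual udev or cloud-init code. It only shows the net effect of "fall back when the name is taken" versus "move the existing holder out of the way first".

```python
def coldplug_rename(nics, want_name, mac, fallback):
    """Hypothetical model of the cold-plug step: rename the NIC with `mac`
    to `want_name`, or to `fallback` if another NIC already holds that name."""
    current = next(name for name, m in nics.items() if m == mac)
    taken = want_name in nics and current != want_name
    target = fallback if taken else want_name
    nics[target] = nics.pop(current)
    return target


def cloudinit_rename(nics, want_name, mac, tmp_name):
    """Hypothetical model of the cloud-init step: move whatever currently
    holds `want_name` to `tmp_name`, then rename the NIC with `mac`."""
    current = next(name for name, m in nics.items() if m == mac)
    if current == want_name:
        return
    if want_name in nics:
        nics[tmp_name] = nics.pop(want_name)
    nics[want_name] = nics.pop(current)


# Made-up MACs: the config expects ...:aa to be eth0, but the platform
# presented ...:bb as eth0 on this boot.
nics = {"eth0": "00:00:00:00:00:bb", "eth1": "00:00:00:00:00:aa"}

coldplug_rename(nics, "eth0", "00:00:00:00:00:aa", "rename3")
after_coldplug = dict(nics)

cloudinit_rename(nics, "eth0", "00:00:00:00:00:aa", "cirename0")
after_cloudinit = dict(nics)

print(after_coldplug)
print(after_cloudinit)
```

In this model the NIC with the expected MAC ends up as eth0 and the other NIC is parked under cirename0, which matches the sequence blackboxsw describes.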
[18:41] <rharper> so, this sounds just fine
[18:41] <rharper> w.r.t getting the "right" eth0 in place
[18:41] <rharper> which means it won't be optional
[18:42] <rharper> we'll wait for it and config matches;  so why the stall then on wait-online ?
[18:42] <blackboxsw> right, but networkd-wait-online may have been waiting for the orig eth0 device to come online (by what logic I'm uncertain). if new rename3 ->eth0 is moved into eth0 place prior to a udev rule saying online maybe that's why we timeout?
[18:43] <rharper> networkd-wait-online can't run until cloud-init-local has finished, we block the network.target
[18:43] <blackboxsw> hrm ok... <drums-fingers-on-desk>
[18:48] <blackboxsw> btw, order of the nics is properly rendered by azure metadata, I can see the ips and macs get pushed to the proper index in the 'network' -> 'interface' key
[18:50] <rharper> blackboxsw: so I don't know how much we can play with it and look at the console, but if possible, get a networkctl status --all dump prior to invoking systemd-networkd-wait-online; I'd typically do this with an extra ExecStart=/bin/sh -x -c 'cmd1 here' in the networkd.service file
[18:51] <rharper> the alternative is to recreate in, say, qemu where we can manually swap the mac addrs on the underlying nics to trigger the scenario and still have a serial console to get in and debug