[13:03] <falcojr> I was looking at the rhel hostname PR and got a failure when running the integration test for it. I did a little digging and found that this is giving us a problem: https://github.com/canonical/cloud-init/pull/859/files#diff-2902e7f3123dd1e701482d6c60d6ca2e51d2dcbe3f643cb30be70727fde3fbfbR262
[13:03] <falcojr> The configuration there doesn't contain the userdata so we can't make any decisions based on it. Is this normal/expected for distro objects to not receive userdata?
[13:04] <falcojr> if so, implementing this will require a different approach
[15:19] <rharper> falcojr: I don't think distro objects ever get user-data, that's bound to the datasource object, and the *cloud* object is what binds a distro and a datasource ;
[15:21] <rharper> ah, _cfg is system_config IIRC , let me dig a bit more;
[15:24] <rharper> right, system_config, in cloudinit/stages.py:Init object, the distro property, you can see distro_cls is fed system_config from _extract_cfg('system') which crawls through /etc/cloud for "system" config, the static on-disk conf files; and is pulling out the 'system_info' section of the on-disk config;
[15:33] <falcojr> thanks for that. Back to the drawing board then I guess... :/
[15:49] <rharper> there's nothing that requires the config to be present within the system_config, only that the method could be called with a parameter (which would default to system_config value, if not specified) ...
[15:49] <rharper> I think we do something like this re ntp
[15:49] <rharper> falcojr: ^
[15:53] <falcojr> rharper: I'm not sure I understand that last sentence..."that the method could be called with a parameter"?
[16:20] <ananke> I'm trying to determine the order of operation of various parts of cloud-init on an ec2 debian image, and I was hoping somebody could give me a clue. I want to know whether user-data is executed before or after scripts-per-boot. I can't figure out what the name of the user-data module is
[16:20] <ananke> there's rightscale_userdata, but I'm not sure if that's it
[16:25] <rharper> falcojr:  right, sorry.  there are interactions in cc_set_hostname.py between the userdata config (the cfg that's passed to handle) and the distro object (Hanging off cloud object which binds the datasource and distro object together).  I'm suggesting that in the method we are invoking on the distro object to set the hostname, it can take a value provided by cc_set_hostname which came from user-data;  then as a unittest, that's
[16:25] <rharper> split into two parts,  1) distro method handles various values of the parameter  2) config test which tests calling distro method in various ways;
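[editor's note] A minimal sketch of the pattern rharper describes, assuming hypothetical names throughout: `SketchDistro`, `choose_hostname`, and the `prefer_fqdn` key are illustrations only, not cloud-init's real classes or config keys. The distro method takes an optional parameter, falls back to its on-disk system config when none is given, and the module's `handle` passes in the value that came from user-data.

```python
class SketchDistro:
    """Hypothetical stand-in for a distro object (not cloud-init's class)."""

    def __init__(self, system_cfg):
        # Only the static on-disk system config is available here;
        # user-data never reaches the distro object directly.
        self._cfg = system_cfg

    def choose_hostname(self, hostname, fqdn, prefer_fqdn=None):
        # Fall back to the system config only when the caller gave no value.
        if prefer_fqdn is None:
            prefer_fqdn = self._cfg.get("prefer_fqdn", False)
        return fqdn if (prefer_fqdn and fqdn) else hostname


def handle(cfg, distro, hostname, fqdn):
    """Hypothetical module entry point; `cfg` is the user-data config."""
    return distro.choose_hostname(
        hostname, fqdn, prefer_fqdn=cfg.get("prefer_fqdn")
    )


distro = SketchDistro({"prefer_fqdn": False})
print(handle({"prefer_fqdn": True}, distro, "web", "web.example.com"))
print(handle({}, distro, "web", "web.example.com"))
```

This also matches the two-part test split suggested above: one set of tests can exercise `choose_hostname` with different parameter values, and another can check that `handle` forwards the user-data value correctly.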
[16:29] <ananke> ahh, it's user-scripts module
[16:29] <rharper> ananke: user-data isn't a module, it's the config in which you pass data which is consumed by one or more config modules;  most config modules have a specific namespace, like ntp: { config under here for ntp}  but some config modules may read from multiple namespaces ...   scripts-per-boot is one of the config modules that runs in cloud-init's "final" stage, which runs after the "config" modules stage;   there are some more subtle
[16:29] <rharper> orderings ... so it may help if you ask specifically about what you're trying to do or what's not working
[16:39] <ananke> rharper: it appears that what I'm looking at is 'user-scripts (i.e. shell scripts passed as user-data)' (from cloud-init docs) , otherwise described here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
[16:40] <ananke> and my particular issue is figuring out if that runs before, or after, the scripts-per-boot. Looks like it runs after
[16:44] <Odd_Bloke> ananke: Yep, that is "scripts-user" and in the default configuration runs after "scripts-per-boot": you can see your specific config in /etc/cloud/cloud.cfg.
[16:46] <ananke> Odd_Bloke: thanks! my issue was trying to figure out the correct name for it, because /etc/cloud/cloud.cfg didn't have much in terms of 'user-data'
[16:48] <rharper> Odd_Bloke: thanks ... that's an unfortunate name swap (scripts-user vs user-scripts in the docs)
[16:57] <Odd_Bloke> https://cloudinit.readthedocs.io/en/latest/topics/modules.html#scripts-user does have it the right way around
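[editor's note] A small sketch of the ordering check discussed above. The list below is an assumed excerpt of `cloud_final_modules` from a typical /etc/cloud/cloud.cfg; your image may list the modules differently, so treat it as illustrative. `runs_before` is a hypothetical helper, not part of cloud-init.

```python
# Assumed excerpt of cloud_final_modules; check your own /etc/cloud/cloud.cfg,
# since distributions can reorder or trim this list.
CLOUD_FINAL_MODULES = [
    "scripts-vendor",
    "scripts-per-once",
    "scripts-per-boot",
    "scripts-per-instance",
    "scripts-user",
]


def runs_before(modules, first, second):
    """Return True if `first` is listed (and so runs) ahead of `second`."""
    return modules.index(first) < modules.index(second)


print(runs_before(CLOUD_FINAL_MODULES, "scripts-per-boot", "scripts-user"))
```

With this excerpt, the check agrees with Odd_Bloke's answer: "scripts-per-boot" is listed ahead of "scripts-user".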
[17:54] <hamalq> hi can i get +1 on this https://github.com/canonical/cloud-init/pull/859
[18:55] <blackboxsw> hamalq: I think you may have missed the backlog conversation on your branch from 5 hours ago. falcojr: "I was looking at the rhel hostname PR and got a failure when running the integration test for it. I did a little digging and found that this is giving us a problem: https://github.com/canonical/cloud-init/pull/859/files#diff-2902e7f3123dd1e701482d6c60d6ca2e51d2dcbe3f643cb30be70727fde3fbfbR262"
[18:56] <blackboxsw> folks are talking back and forth about it a bit
[18:58] <blackboxsw> https://irclogs.ubuntu.com/2021/04/13/%23cloud-init.html for IRC log reference
[19:04] <hamalq> blackboxsw: but this says it passed :)
[19:04] <hamalq>  https://github.com/canonical/cloud-init/pull/859/checks?check_run_id=2282060461
[20:04] <ananke> hmm, this is nuts. Despite scripts-per-boot being clearly set up to run before scripts-user, the files they create indicate otherwise
[20:05] <ananke> we're talking less than 1 second apart, but it's consistent: my scripts-per-boot is able to use data produced by scripts-user
[20:10] <ananke> the only explanation would be that perhaps AWS EC2 user-data is actually executed as the rightscale_userdata not scripts-user, but the content of it is in /var/lib/cloud/instance/user-data.txt
[20:11] <rharper> how are you testing?  new instance launches?
[20:13] <ananke> same instance & rebooting. I have user-data provided script which creates a file, and another script in /var/lib/cloud/scripts/per-boot that consumes it to configure resolver. Simply put, the per-boot script would fail flat if the file was not present
[20:14] <rharper> are you certain you're not using the file from the previous boot ?
[20:14] <ananke> I can overwrite the file, remove it, etc: yet it always appears before the per-boot script runs
[20:16] <rharper> you likely want to use cloud-init clean --logs; this allows the same instance to behave more like a new instance launch (cloud-init cleans up itself, but it can't undo all actions, like user-add, setting passwords, or side-effects from running scripts);
[20:16] <rharper> there's not a lot of magic in those scripts; the config modules convert the user data into a shell script, and then call runparts on their respective directories
[20:17] <rharper> if you look at /var/log/cloud-init.log, you should see cc_scripts_user being called after cc_scripts_per_boot ;
[20:18] <ananke> ohh, I should mention that I've also created a dozen new instances, and all of them behave the same way. Coincidentally, the behavior is in my favor, as I want to consume the file created by user-data via a per-boot script
[20:18] <ananke> I just don't like to rely on luck, especially if on paper it shouldn't work
[20:18] <rharper> I wouldn't rely on that either
[20:19] <ananke> rharper: I looked at cloud-init analyze show, and the order is as you indicate
[20:21] <ananke> hmm, I think I may have found my culprit
[20:33] <hamalq> falcojr: hi can i get +1 on https://github.com/canonical/cloud-init/pull/859 thanks
[20:36] <falcojr> hey hamalq, unfortunately no. Scroll up a bit to find some discussion around an issue I found with it. The TLDR is that the config object attached to the distro doesn't have the userdata information we want. It will need to be passed in via the module instead.
[20:36] <hamalq> but the tests pass, how did u find the issue
[20:37] <falcojr> I can help make that happen, but it'll require some more reworking
[20:37] <falcojr> The integration test passed for you? It didn't pass for me
[20:37] <hamalq> what does that mean? it ran in a job, i ran it more than once
[20:38] <hamalq> how did it fail for u, what if ur ENV is wrong
[20:38] <hamalq> :)
[20:39] <hamalq> i mean is there another integration test other than this https://github.com/canonical/cloud-init/pull/859/checks?check_run_id=2282060461
[20:46] <falcojr> ooph, it actually failed but reported as passed because of a bug in how we set up CI
[20:46] <falcojr> see the bottom of the log at https://travis-ci.com/github/canonical/cloud-init/jobs/496557277
[20:46] <falcojr> (I'll fix that ASAP)
[20:50] <hamalq> oh it says passed but its failing, ok thanks
[20:50] <falcojr> yeah sorry, I know that's annoying
[22:07] <hamalq> falcojr: done https://travis-ci.com/github/canonical/cloud-init/jobs/498272024
[22:10] <fowl> currently I use cloud-init to install a bunch of software and then use ssh to start the processes so that the secret values I pass in aren't in the instance metadata, is there a better way to do it with just cloud-init?
[22:17] <blackboxsw> fowl, offhand I could think of maybe runcmd in #cloud-config to git clone some repo that contains your secrets and operate on that to finalize your setup so #cloud-config (and instance-data.json) won't contain your secrets. Alternatively,  /run/cloud-init/instance-data.json is world readable and redacted for non-root users for certain sensitive metadata keys.
[22:17] <blackboxsw> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/__init__.py#L195-L197
[22:19] <blackboxsw> fowl, if  you control your cloud's metadata exposed to cloud-init, then maybe #cloud-config could provide a switch to allow you to extend the sensitive_metadata_keys to allow redacting some keys you know will contain such sensitive information.
[22:22] <blackboxsw> user-data exposed to the instance is only accessible as root user currently, so sensitive information provided at instance launch time via user-data should be "safe" from non-root users. Though you have to trust the transport/cloud you are using which is accepting that user-data and sending it to the instance at launch time (as well as that the secrets aren't exposed on that cloud's IMDS (instance metadata
[22:22] <blackboxsw> service/API))
[22:25] <blackboxsw> while cloud-init does redact /run/cloud-init/instance-data.json, I still think the IMDS in most clouds makes that data readable by any user on the instance with a typical curl <your_userdata_route>, so it has limited benefit for unsecured cloud IMDS endpoints.
[22:33] <fowl> that is really useful looking, a perfect solution if the IMDS could be turned off
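[editor's note] A rough sketch of the redaction idea blackboxsw describes, under stated assumptions: `redact`, the `REDACTED` placeholder, and the sample key paths are hypothetical and are not cloud-init's implementation. It replaces the values at a list of '/'-separated key paths in a nested dict, which is roughly what a non-root view of instance data would need.

```python
import copy

# Hypothetical placeholder text, chosen for this sketch only.
REDACTED = "redacted for non-root user"


def redact(data, sensitive_paths):
    """Return a deep copy of `data` with each '/'-separated path replaced."""
    cleaned = copy.deepcopy(data)
    for path in sensitive_paths:
        *parents, leaf = path.split("/")
        node = cleaned
        # Walk down to the dict holding the leaf; give up if the path is absent.
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict) and leaf in node:
            node[leaf] = REDACTED
    return cleaned


sample = {
    "ds": {"meta_data": {"token": "s3cret", "region": "us-east-1"}},
    "userdata": "#!/bin/sh",
}
public_view = redact(sample, ["ds/meta_data/token", "userdata"])
print(public_view["ds"]["meta_data"])
```

As noted in the conversation, redacting a local file does not help if the same values can still be fetched from the cloud's IMDS by any user on the instance.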
[23:23] <hamalq> falcojr: https://travis-ci.com/github/canonical/cloud-init/jobs/498303397