[15:53] <thaddeus> I'm experiencing some weird issues on a SLES 15 SP4 machine I'm trying to configure with cloud-init 21.4; perhaps someone can help, please. I'm sending over user-data, and config such as hostname, users, groups and write_files all works as expected, but nothing from runcmd seems to be executed. /etc/cloud/cloud.cfg does have an entry for
[15:53] <thaddeus> runcmd in cloud_config_modules. In runcmd I have 1 entry with "- sh /usr/local/custom_scripts/runme.sh" but I've even tried to echo to a file but with no joy
[15:53] <thaddeus> Weirdly, the exact same user-data works on a different distro (Ubuntu)
[15:59] <minimal> thaddeus: have you tried setting cloud-init logging to debug? That way you'll see what each module is doing
[16:02] <aciba> thaddeus: to collect even more information: https://cloudinit.readthedocs.io/en/21.4/topics/cli.html#collect-logs
[16:09] <thaddeus> Looks like debug logging is enabled. I've collected those logs and I can see it notes |`->config-runcmd ran successfully @2230.17900s +00.00300s
[16:11] <thaddeus> I can also see util.py[DEBUG]: Writing to  /usr/local/custom_scripts/runme.sh &  Changing the ownership of /usr/local/custom_script/install.sh to 0:0
[16:12] <minimal> thaddeus: I was thinking more of looking in cloud-init.log (with debug) to see what happened when it logged "Running module runcmd"
[16:13] <minimal> that would be the write_files module logging rather than runcmd module logging though...
[16:13] <thaddeus> "Running module runcmd" doesn't even appear in cloud-init.log :blink:
[16:14] <minimal> ok, search for "runcmd" in general then in the log
[16:14] <thaddeus> Nothing
[16:15] <minimal> that doesn't make sense then as you said "- runcmd" was present in your /etc/cloud/cloud.cfg file's "cloud_config_modules:" section
[16:15] <thaddeus> There's 14 instances of running module $something, but not runcmd
[16:15] <thaddeus> That's correct
[16:17] <aciba> could you run cloud-init query merged_cfg.cloud_config_modules ?
[16:17] <aciba> and see if runcmd is in the output
[16:17] <minimal> you'd either see "Running module runcmd" or "Skipping modules 'runcmd' because no applicable config is provided." if runcmd was enabled
[16:17] <minimal> sounds like a problem with your cloud.cfg file
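The log check minimal describes can be sketched as a small shell helper. This is only a sketch under assumptions: `module_status` is a hypothetical name, the default log path /var/log/cloud-init.log is assumed, and the match strings are the two messages quoted above, so they may differ between cloud-init versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init): report whether a config
# module was run, skipped, or never mentioned in the cloud-init log.
#   $1 = module name, e.g. runcmd
#   $2 = log file (assumed default: /var/log/cloud-init.log)
module_status() {
    mod=$1
    log=${2:-/var/log/cloud-init.log}
    if grep -qE "Running module ${mod}( |\$)" "$log" 2>/dev/null; then
        echo "ran"
    elif grep "Skipping modules" "$log" 2>/dev/null | grep -qE "[',]${mod}[',]"; then
        echo "skipped"
    else
        # Neither message present: the stage that owns the module
        # probably never ran at all.
        echo "not mentioned"
    fi
}
```

Calling `module_status runcmd` on the affected machine would be expected to print "not mentioned", since thaddeus found no trace of runcmd in the log.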
[16:17] <thaddeus> cloud-init query merged_cfg.cloud_config_modules shows runcmd
[16:18] <thaddeus> Although I've not made any manual changes to cloud.cfg
[16:21] <minimal> what's the 1st line of your install.sh script (the shebang line) ?
[16:22] <blackboxsw> thaddeus: given that you can see /usr/local/custom_scripts/runme.sh we know at least part of your user-data is correct (maybe per write-scripts). Can you try running `sudo sh /usr/local/custom_scripts/runme.sh` from a terminal on your system? I'm guessing we are erroring out or something and the output of that failure is in /var/log/cloud-init-output.log
[16:24] <minimal> I'm wondering if the shebang line is something like "#!/bin/bash" and "sh" on the system points to /bin/dash or similar
[16:24] <thaddeus> #!/bin/sh
[16:24] <minimal> ok, and "sh" is /bin/sh also? "which sh"
[16:25] <thaddeus> "/usr/bin/sh"
[16:25] <minimal> hmm, and are /bin/sh and /usr/bin/sh the same file? (probably at least 1 is a softlink)
[16:25] <thaddeus> if I run sh /usr/local/custom_script/install.sh it works as expected
[16:26] <thaddeus> Correct minimal
[16:26] <minimal> ls -l /bin/sh /usr/bin/sh ?
[16:27] <thaddeus> "/bin/sh -> /usr/bin/sh & /usr/bin/sh -> bash"
[16:28] <thaddeus> on the working ubuntu system they both point to dash though
[16:28] <minimal> on Ubuntu AFAIK /bin/sh is dash, not bash
[16:28] <minimal> snap! ;-)
[16:28] <thaddeus> Indeed
[16:28] <thaddeus> That wouldn't be why runcmd couldn't run though?
[16:29] <minimal> hmm, seems the opposite way around to what I expected, I'd guessed your script has some Bashisms that "normal" shell might not handle
[16:29] <blackboxsw> thaddeus: on your system did cloud-init write out /var/lib/cloud/instance/scripts/runcmd ? that should be where the "shellify" function in cloudinit manipulates your run command and wraps it in a shell script
[16:30] <blackboxsw> take a look at /var/lib/cloud/instance/scripts/runcmd   and also see if you can run that successfully directly
[16:30] <thaddeus> It did. /var/lib/cloud/instance/scripts/runcmd contains "#!/bin/sh" on the first line and "sh /usr/local/custom_script/install.sh" on the second
[16:31] <thaddeus> And that works fine
[16:33] <blackboxsw> ok and that runs fine for me too. and runs fine for me as well with silly sample config like this: https://paste.opendev.org/show/bYcTj6vpb01BOwi6jtVO/
[16:34] <thaddeus> Yeah I get no issues with debian based systems like Ubuntu, can you try that with opensuse / centos, please?
[16:41] <minimal> thaddeus: is your shellscript doing anything "funky"?
[16:43] <thaddeus> Define "funky"
[16:43] <thaddeus> :D
[16:43] <minimal> something that has a 50/50 chance of working across various distros with different shells ;-)
[16:45] <thaddeus> Hah, no. Weirdly it has worked in the past on SLES but I rebuilt my VM image and I know it's running a slightly outdated cloud-init binary compared to Ubuntu, but going through the changelogs I couldn't find anything that may explain why
[16:45] <thaddeus> Ignoring the script I can't even do things like echo to a file from runcmd
[16:48] <thaddeus> I wonder if it's some AppArmor / SELinux shenanigans
[16:53] <thaddeus> It's like cloud_init_modules all run, but nothing from cloud_config_modules does
[16:54] <minimal> I haven't been near SLES for several years, I forget any specifics about it
[16:54] <waldi> thaddeus: which cloud-init stages are running?
[16:54] <waldi> redhat got some modifications when cloud-init runs in comparison to debian/ubuntu
[16:55] <blackboxsw> you can check stages run on latest boot w/ `cloud-init analyze show` or cat /run/cloud-init/status.json
[16:55] <minimal> yeah, are all the cloud-init init.d/service files enabled?
[16:58] <thaddeus> cloud-init analyze show only shows init-local and init-network being executed
[16:58] <blackboxsw> ahh nice, ok so config-modules is skipped. thx waldi 
[16:59] <blackboxsw> thaddeus: it might be a systemd ordering cycle issue with cloud-init systemd services vs something 'new' installed in your opensuse image
[17:00] <blackboxsw> typically you can check journalctl -b 0 | grep "ordering cycle" to see if systemd found a funky unresolvable dependency chain and punted a cloud-init service out of the boot target/goal
[17:00] <blackboxsw> https://unix.stackexchange.com/questions/193714/generic-methodology-to-debug-ordering-cycles-in-systemd for context
[17:04] <thaddeus> No ordering cycle issues from the journal, going to compare module configs between ubuntu and sles
[17:11] <blackboxsw> thaddeus: just validated on opensuse 15.3 (on lxc images I had to: lxc launch images:opensuse/15.3 -c user.user-data="$(SAMPLE .YAML)" zypper install cloud-init; systemctl enable cloud-init.service; systemctl enable cloud-init-local.service; systemctl enable cloud-config.service; systemctl enable cloud-final.service; cloud-init clean --reboot --logs)
[17:12] <blackboxsw> version: /usr/bin/cloud-init 21.4-150100.8.58.1
[17:13] <thaddeus> Interesting, if I run systemctl status cloud-config.service it shows as disabled: Loaded: loaded (/usr/lib/systemd/system/cloud-config.service; disabled; vendor preset: disabled)
[17:13] <thaddeus> I've never had to manually enable them in the past, wasn't aware there were individual cloud-init unit files for different stages
[17:13] <thaddeus> Going to rebuild my template adding your systemctl enable commands blackboxsw
[17:15] <blackboxsw> thaddeus: generally you shouldn't have to enable those services as they *should* have been enabled in the stock distribution cloud image
[17:16] <blackboxsw> I only suggested that as you had mentioned "custom image" which carries a lot of baggage for me about the potential of a derivative image which has been toyed with a bit.
[17:18] <thaddeus> Yeah it's enabled on my ubuntu image. I'm going to modify my packer script and see if that works, will take a few mins to rebuild and I'll report back. But thank you blackboxsw, minimal, waldi, aciba for all your help so far, much appreciated.
[17:52] <thaddeus> Yup, that was it blackboxsw! Thank you!
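The fix that worked here, enabling the per-stage units in the image, could be sketched roughly as below for an image build script. It assumes the four unit names blackboxsw listed for cloud-init 21.4 (other versions or distro packages may name them differently); `disabled_units` and `enable_missing_units` are hypothetical helper names, not cloud-init or systemd commands.

```shell
#!/bin/sh
# Stage units as named in the log above (assumed for cloud-init 21.4).
CLOUD_INIT_UNITS="cloud-init-local.service cloud-init.service cloud-config.service cloud-final.service"

# Print every stage unit that `systemctl is-enabled` does not report
# as "enabled" (covers disabled, masked, or missing units).
disabled_units() {
    for unit in $CLOUD_INIT_UNITS; do
        state=$(systemctl is-enabled "$unit" 2>/dev/null)
        [ "$state" = "enabled" ] || echo "$unit"
    done
}

# Enable whatever is missing (needs root); does nothing if all
# four units are already enabled.
enable_missing_units() {
    for unit in $(disabled_units); do
        systemctl enable "$unit"
    done
}
```

Running `disabled_units` first shows what would change. On thaddeus's image it would presumably have listed at least cloud-config.service, which matches `cloud-init analyze show` reporting only the init-local and init-network stages.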
[17:54] <minimal> currently looking into network config v2 as I'm wanting to set some IPv6 related settings in eni such as "privext", "request_prefix", "autoconf". Seems this is currently only defined/catered for via netplan pass-through
[17:55] <minimal> So before I start work on a PR to add eni support for this sort of stuff wanted to get some thoughts/input from people