/srv/irclogs.ubuntu.com/2024/04/10/#cloud-init.txt

anankehow can I identify which cloud-init module is erroring out? cloud-init status --long shows 'degraded done', but 'errors' shows 'errors: []'14:06
anankethe exit code I get is '2', while the full message is https://dpaste.org/pbAAp14:15
anankewhile cloud-init.log ends with main.py[DEBUG] Ran 12 modules with 0 failures 14:26
anankeand  handlers.py[DEBUG] finish: modules-final: SUCCESS: running modules for final14:26
holmanananke: the exit 2 relates to the information you see after 'recoverable_errors'14:37
holmanananke: does your userdata try to use the `cc_refresh_rmc_and_interface` key?14:38
anankeit doesn't. frankly, that's the first time I see that module. I'll look more in the logs to see what's going on14:39
holmanananke: the other thing going on is deprecation warnings due to using keys that will go away14:39
holmanananke: which distro?14:40
anankekali linux, rolling, which is debian based these days14:40
anankebehind the scenes the process is a bit convoluted. I'm building an AMI from the official upstream AMI, using packer. then I take the resulting AMI and I do the same thing with a trimmed down packer/cloud-init configuration14:42
holmangotcha14:42
anankeboth packer recipes start with cloud-init status --wait, and the first build process is fine. so it's something I introduce during the first stage, or that I feed during the second stage. second stage cloud-init contains bare minimum, and I've trimmed it down to the point of being just nothing14:43
anankeso now I'm dissecting things more. the problem is that the logs provide very little clue, there's nothing with 'error', etc14:44
holmanananke: can you tell me what `grep rmc /etc/cloud/cloud.cfg` shows?14:45
anankeif I could scope it down to a module, it would be ideal. sure, just a sec. starting a new run14:45
anankeon an instance running AMI from first stage, that returns: └─# grep rmc /etc/cloud/cloud.cfg14:46
ananke - reset_rmc14:46
ananke - refresh_rmc_and_interface14:46
holmanananke: did that config come stock with the image?14:47
anankeahh, and cloud-init status produces exit code 2 there too14:47
anankeholman: yes, it did14:47
anankewe don't touch /etc/cloud/cloud.cfg14:47
anankecloud-init 24.1.1-114:48
holmanananke: that specific module was removed in 23.214:49
holmanso the distro provider needs to drop that key from the config14:49
anankeI don't suppose you have an idea what module this is? I can't find it in the documentation14:49
anankeohh, I see. thank you. and this explains, I was looking at 24.1 documentation14:49
holmanyeah14:50
holmanso probably a debian change?14:50
anankewouldn't surprise me. I can file a bug with Kali, and see if they'll file it upstream14:50
holmanananke: perfect, thanks!14:50
holmanananke: could you link me the bug once it's filed?14:51
anankecertainly. one quick question: as a temporary workaround can I override the inclusion of this module via a cfg.d file, or will I have to remove it from cloud.cfg?14:51
holmanjust remove it from cloud.cfg and that should go away14:52
anankecheers. I'll test that14:53
holmanand once you update your cloud-config keys to use the non-deprecated ones, I would expect status to exit 014:53
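
A minimal sketch of the workaround holman describes, assuming the stale entries appear as list items exactly as in the grep output above. It works on a scratch file (`cloud.cfg.sample`, a hypothetical name, with an illustrative section header and one illustrative extra entry) rather than the real /etc/cloud/cloud.cfg. Point `CFG` at the real file only after checking the result.

```shell
# Scratch copy by default so the sketch is safe to run anywhere.
CFG="${CFG:-./cloud.cfg.sample}"

# Illustrative sample only: the section header and the final_message entry
# are assumptions, not taken from the Kali image.
if [ ! -f "$CFG" ]; then
    printf '%s\n' 'cloud_final_modules:' ' - reset_rmc' \
        ' - refresh_rmc_and_interface' ' - final_message' > "$CFG"
fi

# Keep a backup, then drop the two list items for the removed modules.
cp "$CFG" "$CFG.bak"
sed -e '/^[[:space:]]*-[[:space:]]*reset_rmc[[:space:]]*$/d' \
    -e '/^[[:space:]]*-[[:space:]]*refresh_rmc_and_interface[[:space:]]*$/d' \
    "$CFG.bak" > "$CFG"

# Same check as in the conversation; matches nothing once the entries are gone.
grep rmc "$CFG" || echo "no rmc entries left in $CFG"
```

The backup copy keeps the original around in case the distro later ships a corrected cloud.cfg and the two need to be compared.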
holmanananke: just to reiterate: all that you need to debug an exit 2 error code should be visible in the recoverable errors key15:17
holmanananke: and here's some bedtime reading material if you want more background on it -> https://cloudinit.readthedocs.io/en/latest/explanation/return_codes.html15:18
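
As a rough sketch of how a build step might act on the exit codes holman mentions: the return-codes page linked above describes 0 as success, 1 as an unrecoverable error, and 2 as recoverable errors. The helper name `describe_status_rc` is hypothetical, and the `cloud-init` call is guarded so the snippet does nothing on a machine without it.

```shell
# Map a `cloud-init status` exit code to a short hint (hypothetical helper).
describe_status_rc() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "unrecoverable error" ;;
        2) echo "recoverable errors: check recoverable_errors in the status output" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}

# Only query cloud-init where it is actually installed.
if command -v cloud-init >/dev/null 2>&1; then
    rc=0
    cloud-init status --long || rc=$?
    describe_status_rc "$rc"
fi
```

In a packer recipe the same pattern would let the build log say why the status step returned non-zero instead of just failing.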
anankethank you. I'm currently rebuilding the image in stage 1 to see if removing those rmc modules will suffice. I'll also read that documentation. once I'm certain this is the issue, I'll file bug with Kali and pass you the info15:21
holmansounds good :-)15:22
holmanlooking forward to hearing back from you15:22
holmanminimal: ping (when you're around)15:23
holmanminimal: I know you took a peek at #5120 and had some questions around gdate / etc15:55
holmanminimal: that PR is approved by falcojr but I'll wait to merge in case you have any outstanding concerns 15:56
holmanminimal: I added context related to the usage of gnu date to the PR context15:56
minimalholman: thanks, having a read of the comments now16:02
holmanminimal: thanks :)16:03
minimalBTW yes Busybox ash is not necessarily 100% POSIX compliant (depending on its config when built) but from memory "shellcheck -s sh ..." does complain about $'..' not being POSIX16:03
holmanyeah I haven't found anything that is 100% POSIX compliant16:04
holmandash isn't either16:04
holmangood to know on the shellcheck -s sh16:05
minimalwell ash *CAN* be if you don't enable any of the non-POSIX compile options16:05
holmangood to know16:05
minimali.e. one of the links you referenced mentioned the ENABLE_ASH_BASH_COMPAT compile option - turn that off and you're 1 step closer to POSIX compliance16:06
holmanbut I honestly don't care about 100% compliance without a real use-case16:06
holman"local" is really nice16:06
minimalAlpine's ash turns that on: https://git.alpinelinux.org/aports/tree/main/busybox/busyboxconfig#n114316:07
minimalyeah, "local" is probably the only non-POSIX feature I tend to use in my shellscripts16:07
holmanI recently discovered the freebsd man page for sh (also an Almquist derivative)16:10
minimalyou can never have enough shell options eh? ;-)16:10
holmanit's really good16:11
holmanhttps://man.freebsd.org/cgi/man.cgi?sh16:11
holmanhehe, so many options16:11
holmanminimal: one more thing16:15
holmanminimal: do you have any pointers for installing python3.12 on edge?16:15
holmanI can just wait for 3.12 to become available if not16:16
minimalthere are no "released" py3.12 packages for Edge - basically as a lot of python packages are breaking with 3.12 ncopa hasn't yet pushed a 3.12 package as that would mess up Edge16:17
holmangotcha16:17
holmanI'd have to build from source then to repro that issue?16:18
minimalhowever he does have a "personal" repo with the 3.12 packages he created and there were notes posted on IRC how to make use of this for others working on fixing 3.12-related issues16:18
minimalI'll dig up those notes...16:18
holmanthanks16:19
minimalcloud-init already has some testing for py3.12, right?16:21
minimalI thought I saw a reference to 3.12 a while ago in some of the github workflow stuff16:21
holmanminimal: 3.13 even :-)16:42
holmanhttps://github.com/canonical/cloud-init/blob/93f5a0165069603b2eb45ec20983393170fe78a9/.github/workflows/unit.yml#L2816:43
minimalholman: so where would the 3.12 logs be visible?16:52
minimalBTW re the $'...' thing, https://www.shellcheck.net/wiki/SC203916:52
holmanPython 3.12 pytest logs should be visible in every PR under the "Checks" tab -> open the "Unit tests" dropdown 16:57
holmanex: https://github.com/canonical/cloud-init/pull/5162/checks16:58
-ubottu:#cloud-init- Pull 5162 in canonical/cloud-init "Deprecate the users ssh-authorized-keys property" [Open]16:58
holmanwhich triggered this ci job: https://github.com/canonical/cloud-init/actions/runs/8626580327/job/23665266600?pr=516216:58
minimalso then that ci unittest passing points to the issue likely being specific to musl and py3.12. I wonder if freebsd would have similar issues with python 3.1217:03
holmanminimal: agreed17:50
holmanokay just figured out a reproducer17:51
minimaloh? interesting18:08
anankehmm, more deprecated stuff: Invalid user-data /var/lib/cloud/instances/i-0b277ba864e274251/cloud-config.txt18:40
anankeError: Cloud config schema errors: system_info: Additional properties are not allowed ('system_info' was unexpected)18:40
anankewhat was system_info replaced with?18:41
minimalananke: system_info isn't user-data, it's specified in /etc/cloud/cloud.cfg18:50
minimalused to specify things like the distro name, default user name, etc18:51
minimalhttps://cloudinit.readthedocs.io/en/latest/reference/base_config_reference.html#system-info-keys18:52
anankeuhmm, user-data can provide cloud-init config. this wasn't an issue in earlier versions of cloud-init18:52
minimalfrom that page: "Anything under system_info cannot be overridden by vendor data, user data, or any other handlers or transforms."18:53
anankehah. so we've had it all wrong for years, and cloud-init just started enforcing it18:53
* ananke facepalms18:53
minimal"use the docs Luke..." ;-)18:54
anankeminimal: that's easy to say when you know what docs to look in. I kept looking18:55
minimalto find that I just used the "search" functionality on the RTD site18:55
anankeI did too, but missed that section18:55
minimalthat was the 3rd result returned for "system_info": "Base configuration"18:56
anankebeen fighting this stuff too long today. the fact that it didn't become an issue until now is what was throwing me off18:56
anankeliterally the same config at the start of the process is valid, and it becomes invalid at the end of it, while cloud-init is updated18:57
minimalit doesn't become "invalid" at the end, it was always invalid, it just wasn't validated automatically in (some) previous releases18:57
anankeso I'm wondering how in the world this is going to work moving forward18:57
minimalinvalid config is always invalid config, even if you don't realise/aren't told so18:58
minimalum, you validate any user-data before using it?18:59
anankeminimal: validate _how_?18:59
minimalusing the cli validation?18:59
anankeno, I don't. like I said, this particular configuration was never a problem until the new version of cloud-init19:00
minimalit wasn't a "problem" but it was still incorrect/invalid19:00
anankeproblems with the syntax in the past were more apparent19:00
minimal"cloud-init schema ..." to check19:00
anankeminimal: part of the problem is that I can't pre-validate it before feeding it as user-data. chicken & egg: there's no cloud-init in place to validate it19:01
minimali.e. "cloud-init schema -t cloud-config -c myconfig.yml"19:01
minimalyou use another machine/VM to run the validation on?19:02
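
A small sketch of the pre-flight check minimal suggests, using the command exactly as quoted above. The wrapper name `validate_user_data` and the return code 3 for "cloud-init not available" are hypothetical choices. The guard reflects ananke's point that the build host may not have cloud-init at all.

```shell
# Validate a cloud-config file with the local cloud-init, if there is one.
# Returns 3 (an arbitrary, hypothetical code) when cloud-init is missing,
# otherwise whatever `cloud-init schema` returns.
validate_user_data() {
    if ! command -v cloud-init >/dev/null 2>&1; then
        echo "cloud-init not found; cannot validate $1 here" >&2
        return 3
    fi
    cloud-init schema -t cloud-config -c "$1"
}
```

Usage would be `validate_user_data myconfig.yml`. Since what counts as valid user-data can change between releases, the check is most useful when run on a machine or VM carrying the same cloud-init version as the target image.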
anankenot for user-data. and I'd need to somehow have cloud-init of a given version available. this process uses a single container image with packer/aws tools/etc. it's used to build images for many different distros, each one with different version of cloud-init19:03
minimalwell as what represents valid user-data can change between cloud-init releases your infra should take this into account19:04
anankeeasier said than done19:05
minimalok but that doesn't change the fact19:05
anankenot sure why you're beating the dead horse at this point19:05
minimali.e. if you're creating user-data using some form of templating then that template could be written to cater for differing versions19:06
anankeas to being invalid in previous versions, I find it dubious. the default user/gecos fields I'm feeding via user-data _are_ used on first boot. 19:14
anankeor rather, they were with an earlier version19:15
anankeI'll do some more digging later19:15
minimalhow are you providing them? inside a top-level "users:" section?19:15
anankesystem_info: default_user:19:16
minimalwhich earlier cloud-init version?19:16
anankesorry, had to drive to the office. let me check19:34
anankeremoving ability to specify default_user via user-data would be fairly problematic. we rely on it, and has worked just fine on ubuntu/centos/debian/etc for years. 19:38
anankeso it worked fine on 23.3.1, and schema check returns no issues there. after cloud-init is updated to 24.1.1 and system is rebooted, schema is no longer accepted19:40
minimal23.3.1? I'm not seeing that in the github cloud-init releases list for some reason, there is 23.3.2 and 23.3 but not 23.3.119:44
anankecloud-init     23.3.1-1     all          initialization system for infrastructure cloud instances19:44
minimalI'm guessing that release was "pulled" for some reason19:45
anankeand this would mean anything previous was accepting these keys too19:45
anankeour 'default_user' fed via user-data hasn't changed in years19:46
minimalanyway, the 23.2.2 docs state the same about system_info not being overridden by user-data, so then I'd say the behaviour you see with 23.3.1 is a bug as it was not intended to work19:46
minimalso what led you to expect a "system_info" section in cloud-config to work? Was there a document at some time that showed this as valid?19:47
anankemust have been. and it clearly worked across multiple distros/versions. we've had this for the past 5 years19:49
anankeubuntu 16/18/20/22, centos 7, debian 10/11, kali, etc19:49
minimalok, but it seems likely that was unintended behaviour19:49
holmanananke: that used to be supported, apparently back in 0.7.2: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/109048219:49
-ubottu:#cloud-init- Launchpad bug 1090482 in cloud-init (Ubuntu Raring) "over-riding distro config still broken" [High, Fix Released]19:49
minimalnormally to change aspects of the default user in cloud-config you'd specify: "users:\n  - default\n" and then some attributes19:50
anankehah. https://cloudinit.readthedocs.io/en/22.1_a/topics/examples.html?highlight=default_user19:50
anankeso it was in the sample config in documentation19:51
holmanananke: but it looks like that was reverted in 18.4 in this commit: https://github.com/canonical/cloud-init/commit/f0ff194054da90b7b49620b5658342e52156d68e19:52
-ubottu:#cloud-init- Commit f0ff194 in canonical/cloud-init "stages: Fix bug causing datasource to have incorrect sys_cfg."19:52
holmanas a fix for this bug https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/178745919:52
-ubottu:#cloud-init- Launchpad bug 1787459 in cloud-init (Ubuntu) "datasource.sys_cfg gets different values in local stage and after." [Medium, Fix Released]19:52
anankeholman: so I wonder how it continued to work for us, until now19:53
holmanananke: did it actually work? or did it just fail silently?19:53
holmanananke: schema validation isn't blocking configurations, it's just warning about invalid ones19:53
minimalthe cloud-init.log with debug should show exactly what happens or doesn't happen with that cloud-config19:54
anankeholman: it worked. we've built hundreds of images, with our custom default username fed via user-data's default_user option19:54
anankeminimal: the users - default doesn't allow you to specify what the default user will be. just includes the default username configured via default_user19:55
minimalananke: I believe you can do "users:\n  - default\n    name: whatyouwant\n" to override the default name19:56
anankeI can certainly try and see if that works19:56
minimal"name:" being one of the attributes that can be specified19:56
minimalI'm fairly sure I've used that in the past19:57
Odd_Bloke"default" is a string not a mapping, so I don't think that's valid YAML.20:06
minimalhmm, I could have sworn I'd overridden aspects of the default user in the past20:10
Odd_BlokeI think perhaps `user:` can be used to override the default default user?20:10
Odd_BlokeOr the default user defaults, perhaps more accurately (if not less confusingly :p).20:11
minimalwell yeah, if you specify "users:" but don't also specify "- default" then it won't be created and then you could have a "full" user definition instead20:11
anankeso how would one translate this config to something using the user module? https://dpaste.org/XdcZx/raw20:11
minimal"users:\n  - name: student\n    gecos: Student\n    shell: /bin/bash" would definitely work20:12
Odd_BlokeI _think_ the content of your `default_user:` block under `user:` at the top level.20:12
anankethe nice thing about using this approach was that one would just override the bare minimum needed: username/gecos/shell. The rest was inherited from whatever the distro provides20:12
minimalas by NOT specifying "- default" when you specify "users:" then the default is not created20:13
minimalthat's mentioned in the docs for the users_groups module20:13
minimaloh, you want to inherit also.....hmm20:14
anankewell, inheritance is a secondary goal, at this juncture I just need to figure out how to make this work with whatever is the correct syntax20:14
anankethe problem is though, distro provides its own system_info with default_user: wonder how this is going to work20:15
Odd_Blokehttps://github.com/canonical/cloud-init/blob/main/cloudinit/distros/ug_util.py#L172-L173 sets old_user to cfg["user"], and https://github.com/canonical/cloud-init/blob/main/cloudinit/distros/ug_util.py#L207 merges that with distro.get_default_user().20:16
anankewonder if I'm a canary, and other people will come out with the same problem :)20:17
minimalah, there is "user:" to override the default20:17
minimalit's shown in "Example7" for the users_groups module20:17
minimalso "user:\n  name: whatever\n"20:18
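
Written out as a file, the `user:` form minimal and Odd_Bloke converge on might look like the sketch below. The name, gecos, and shell values are the ones from minimal's earlier `users:` example, and the filename `user-data-default-user.yml` is hypothetical. Per the ug_util.py lines linked above, fields not listed here should be merged in from the distro's default_user, though that is worth confirming on the target release.

```shell
# Write a cloud-config that overrides only selected fields of the default user.
# "user" (singular) is the key that modifies the distro's default_user.
cat > user-data-default-user.yml <<'EOF'
#cloud-config
user:
  name: student
  gecos: Student
  shell: /bin/bash
EOF

# Show the result for review before feeding it to an instance.
cat user-data-default-user.yml
```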
anankewill try both and see what comes out20:19
anankegotta get kid from practice, will finish this tonight20:22
minimalOdd_Bloke: I'd missed that you referenced "user:" earlier20:23
Odd_BlokeHaha, I did wonder, all good!20:35
Odd_BlokeAt least we reached the same conclusion.20:35
falcojrhmmm, after trying it looks like you actually can override the default_user in system_info using user data...but that's not going to be true for most keys in system_info and using `user` as already mentioned here is the supported way to go21:08
holmanananke: thanks for reporting on this21:31
holmanit sounds like we have some docs to update and deprecation(s) to add to schema21:32
holmanananke: we've warned on invalid keys for a while now I think, I assume you're digging into this because of the exit 2 error code?21:32
holmanunfortunately there's been some spurious things like this to fix21:36
holmanwhich is one reason why we haven't made cloud-init hard error on invalid config, but rather warn more loudly when it thinks it's got something that isn't right21:37
holmancloud-init's configuration was never fully documented, and it's difficult to audit all of the places that the config is accessed, but we're getting closer :-)21:37
anankeholman: yes, it all started after running into that exit code 2. when building images with packer we leverage cloud-init as much as possible, so we can have more uniform build recipes, and the first build step is to run cloud-init status --wait. up until now this step never complained about this particular schema21:50
holmanananke: gotcha21:51
anankebut yeah, back when we started documentation wasn't anywhere near as complete, so we relied on example configs. once it was working, that part wasn't touched, and we were none the wiser21:55
anankenow it's a matter of moving to whatever syntax is valid and replicates previous functionality. will have to use the serial console, because packer can't seem to be able to connect anymore21:58
anankeohh, this is interesting. looks like ssh_authorized_keys for that user may be wiping packer's ssh key22:11
anankeso a couple observations: 1) no, omitting keys for a user via the users: config does NOT merge it with the ones for default_user:. this includes things like groups, sudoers, and so on22:20
anankewhich leads me to believe that's not how you can specify the default user, because more importantly 2) this new user does not have ssh keys specified in AWS metadata22:21
minimalananke: "user:" config or "users:" config? from the earlier chat the consensus was to use "user:" for changing default user stuff22:21
anankeminimal: I must have misread it. I've been using 'users:'22:22
minimalI mentioned "Example7" for the users_groups module22:23
anankeI haven't figured out what Example7 means. I've been looking at https://cloudinit.readthedocs.io/en/latest/reference/examples.html22:23
anankeahh, I see, https://cloudinit.readthedocs.io/en/latest/reference/modules.html#users-and-groups22:24
minimalgo to https://cloudinit.readthedocs.io/en/latest/reference/modules.html22:24
minimalthen go to the "Users and Groups" section22:24
minimalthen click on the "Examples" tab in that section22:24
minimalthen scroll down to "Example7"22:25
anankeyep, found it, thank you22:25
minimalalso click on the "Config schema" tab in that section and look at "user"22:25
minimal"The user dictionary values override the default_user configuration from /etc/cloud/cloud.cfg."22:26
minimal"The user dictionary keys support for the default_user are the same as the users schema" - so you can specify things like "ssh_authorized_keys"22:27
anankeit's certainly been a long day. user: vs users: blends in22:27
anankeminimal: the problem wasn't with what was supplied via ssh_authorized_keys, rather what's provided via AWS metadata service22:28
minimal"users:" is the general way to create additional users, whereas "user:" is to modify the default user's settings22:28
minimalyou're referring to metadata? or to metadata/user-data/network config provided via the metadata *service* (IMDS)?22:29
minimalas I'd expect ssh keys coming via IMDS to be from user-data provided (by you) to AWS cli/API when a VM is created22:31
anankethese keys are created automatically by packer, and accessible via metadata service under public-keys/, presumably that's where cloud-init pulls them from22:32
anankepoint being, when I tried using the 'users:' section to specify our username, that user was created, but the packer ssh key wasn't present in that account22:33
anankemoving to user: fixed it22:33
minimalright, as "users:" was creating a new user (with no ssh_authorized_keys specified)22:35
anankebut interestingly enough, there seemed to be _no_ default user22:36
minimalin which scenario?22:37
anankein users:22:37
minimalok, as mentioned earlier, when "users:" is used (i.e. to create NEW users) unless you specify "- default" then the default user will NOT be created22:37
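
For contrast with `user:`, here is a sketch of a `users:` list that keeps the default user by listing `default` first and then adds one more account. The `builder` account and the filename `user-data-extra-user.yml` are hypothetical. As ananke observed, the extra account does not inherit groups, sudo rules, or metadata-provided ssh keys from the default user.

```shell
# "users" (plural) creates accounts; "default" as the first item keeps the
# distro's default user, and dropping it skips that user's creation.
cat > user-data-extra-user.yml <<'EOF'
#cloud-config
users:
  - default
  - name: builder
    gecos: Build User
    shell: /bin/bash
EOF

# Show the result for review before feeding it to an instance.
cat user-data-extra-user.yml
```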
minimalthis is mentioned in the docs I referred you to22:37
anankewhich is what made the mistake of 'users:' vs 'user:' confusing. 22:38
anankeright, but all along I was trying to manipulate the default user22:38
minimalIn the "Summary" tab: "Omission of default as the first item in the users list skips creation [typo] the default user."22:39
anankethat's not the point :)22:39
minimalthat explains why the default user was not created when you used "users:"22:40
anankethe explanation is much simpler: I kept mistaking 'users:' for 'user:'. didn't realize there was a separate distinct config option 'user:'22:40
minimalthat's why you should check "Config schema" for the relevant module in the docs ;-)22:41
anankedoesn't help in case where you think user: == users:22:42
minimalwouldn't it help as that part of the docs clearly lists "user:" and "users:" *separately* one after the other?22:44
minimalthat section (in my opinion) makes it very clear that they are 2 different things22:45
anankenot sure there's much more point in telling me I made a mistake22:46
anankeI realize that. I was merely explaining what happened22:47
minimalI wasn't doing that, I'm just saying that (in my opinion) the relevant section of the docs, if consulted, is quite clear22:48
minimalobviously if it is not consulted that's a different matter22:48
anankethe fact that user manipulation is spread across 'default_user' 'user' and 'users' doesn't help22:49
anankebut I digress. the issue in this particular case is solved, thank you everybody for help and patience22:49
minimal"user" and "users" are documented in the same place, "default_user" is documented in base config which, from memory, mentions this is config setting for distros/vendors, not for "end users"22:50
