[00:35] <minimal> running/testing cloud-init main git on Alpine Linux I noticed warning after reboot about groups (defined in user-data) unable to be created as they exist, however whilst digging into this I noticed in /var/log/cloud-init.log that all the various cc_* modules are logged (in debug) with "frequency none" rather than always/once/once-per-instance
[13:40] <minimal> Hi. Any devs around?
[13:51] <minimal> 2 issues regarding 22.2 I wanted to discuss, (1) cc_set_passwords change on non-systemd OS causing Warning message on console (note added to #1450), (2) modules all appear to have frequency "none" (I'm seeing "add group" warnings on subsequent boots)
[14:22] <aciba> Hello, minimal: I have answered you about (1), I'm going to purpose a PR to improve that case. Thanks!
[14:25] <minimal> aciba: perhaps make it a Debug message rather than a Warning so it does not appear on the console?
[14:27] <minimal> also the 2nd issue I mentioned above looks generally problematic (either the Debug messages for all modules are incorrectly logging "frequency none" but actually using the right frequency, or all the modules are actually using an incorrect frequency which is likely to cause problems)
[14:42] <aciba> minimal: cloud you please share the logs of (2)?
[14:42] <falcojr> minimal: I can take a look into #2. We did a bunch of schema work this past cycle, and with it some changes to how modules are detected and run
[14:43] <falcojr> But yeah, like aciba says, we could use some logs too. 'cloud-init collect-logs' attached to a launchpad bug would probably be easiest
[14:43] <minimal> falcojr: I'll spin up a VM to get some logs, but I'd expect you'd see the same yourself. I just grep'ped "frequency " from cloud-init.log with debug enabled and all the cc_* modules showed "none"
[14:44] <falcojr> minimal: on Alpine?
[14:46] <minimal> falcojr: I'm testing on Alpine, I'm not testing on other distros
[14:49] <minimal> falcojr: I see the expected logs when running cloud-init 22.1 on Alpine, it seems to be a git "main" change in behaviour
[14:50] <falcojr> minimal: that'd make sense. My knowledge of alpine is ...minimal... . How do you build your cloud-init packages for alpine?
[14:52] <blackboxsw> falcojr: minimal I confirm behavior of printed frequency None in logs on jammy
[14:52] <minimal> falcojr: Using the standard Alpine packaging mechanisms. You can see here: https://git.alpinelinux.org/aports/tree/community/cloud-init/APKBUILD
[14:52] <blackboxsw> using ppa:cloud-init-dev/daily 
[14:52] <falcojr> blackboxsw: you're seeing this on Ubuntu too?
[14:52] <blackboxsw> falcojr: if you are looking at %h rendering I can poke that this now
[14:52] <blackboxsw> 2022-05-16 14:51:43,335 - modules.py[DEBUG]: Running module power-state-change (<module 'cloudinit.config.cc_power_state_change' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py'>) with frequency None
[14:52] <minimal> blackboxsw: yeah I'd assumed it was not an Alpine-specific issue.
[14:53] <blackboxsw> yeah I see the logs ^ reporing None
[14:53] <blackboxsw> whether that's actually the case or just our log reporting (I think it is the latter)
[14:53] <blackboxsw> I'll double check and make sure cc_* modules aren't automatically running each boot.
[14:53] <falcojr> blackboxsw: thanks, I'll go back to the rendering bug
[14:53] <minimal> I noticed this as when I rebooted a Alpine VM after 1st c-i run the console showed it trying to create the same groups again defined in user-data - so it seems to be running cc_user_groups more than just once-per-instance
[14:53] <blackboxsw> if they are still gated properly on per-instance, then it's just a log formmating issue
[14:54] <blackboxsw> because we relocated where frequency for cc_* modules was defined
[14:54] <minimal> so I don't expect it to be a log formatting issue
[14:54] <blackboxsw> :/
[14:54] <blackboxsw> ok will confirm now and hopefully post a fix today on this. thanks minimal for the heads up her .
[14:54] <blackboxsw> here
[14:55] <minimal> blackboxsw: shouldl be easy for you to test wqith ubuntu - in user-data create some groups and see what happens on 2nd boot
[14:56] <blackboxsw> yeah +1 , and since we migrated `freq = PER_INSTANCE` out of some cc_* modules assuming this was the default when unspecified, I think it's any cc_* module that no longer has a `freq`  module attribute 
[14:56] <minimal> I'm seeing "__init__.py[WARNING]: Skipping creation of existing group 'admin'
[14:57] <minimal> which again doesn't occur with c-i 22.1
[14:59] <blackboxsw> affected modules: power_state_change... which actually has a module attribute `cloudinit/config/cc_power_state_change.py:frequency = PER_INSTANCE`. Ok, it's something other than my hypothesis. Ok, will figure it out and squash it.
[15:01] <minimal> blackboxsw: sorry to point this out so late in the day - generally I test forthcoming releases frequently during the dev process but got caught up recently with cloud-utils and other stuff so only got to testing the past few days
[15:03] <falcojr> minimal: much better now than in 2 weeks :)
[15:26] <blackboxsw> minimal: falcojr holmanb yep and confirmed what minimal said, the frequency None means we'll be running these modules evey single boot
[15:27] <blackboxsw> minimal: this is perfect timing. thx
[15:27] <blackboxsw> it costs twice as much later :)
[15:28] <blackboxsw> time to extend our integration tests a bit more as this regression fell through the cracks
[15:28] <minimal> great, looking forward to a fix, that only leaves the matter of the Warning appearing on non-systemd systems regarding ssh_pwauth
[15:29] <minimal> aciba is working on that - I hope its change can prevent the console message rather than just change its text
[15:33] <falcojr> blackboxsw: filed https://github.com/lxc/lxd/issues/10422 , need any help with the fix or testing?
[15:33] <falcojr> (asking about help with the issue you're looking at, not what I filed)
[15:43] <blackboxsw> falcojr: can't hurt to have more eyes. It  we adapted behavior of fixup_modules and may have dropped functionaliy from fixup_module which sets a sensible default (PER_INSTANCE)  when "frequencies" is not provided in the module attribute itself. Looks like the commit 8a6be66156af57fb76211083d283ee104f2e9381  introduced this change. Just double checking now.
[15:43] <blackboxsw> *It looks like we adapted*...
[15:46] <blackboxsw> +            freq = raw_mod.get("freq") is almost always None now (unless someone changes it in /etc/cloud/cloud.cfg via supplemental settings (which _read_modules picks up ). Otherwise our logic passes that None through into mostly_mods.  I think we just need to source the official frequency from the metaschema. working that up now
[15:59] <blackboxsw> ok have a fix, working on the supplemental integration test
[16:36] <blackboxsw> thanks for the review on https://github.com/canonical/cloud-init/pull/1457 I'll grab your additioanal integratoin test and maybe add an asset "frequency None" not in logs too
[16:46] <blackboxsw> incorporated those test changes. will let CI run on this while I build/run locally.
[16:47] <aciba> I have created a PR for the console message: https://github.com/canonical/cloud-init/pull/1456
[16:48] <blackboxsw> nice aciba will grab that review now. Thanks!
[16:48] <aciba> We have an issue in the CI running the tests in 3.10: https://app.travis-ci.com/github/canonical/cloud-init/jobs/570350390
[16:49] <aciba> I am having a look
[16:53] <blackboxsw> when looking over CI failures, I don't really like how our verify_clean_logs integration test fails on warning counts. It's totally not a helpful traceback
[16:54] <blackboxsw> `>       assert warning_count == expected_warnings
[16:54] <blackboxsw> 5834E       AssertionError`
[16:54] <blackboxsw> aciba: falcojr or holmanb if we  are touching integration tests in the future maybe we can change that assertion message to actually contain the unexpected warning content when the count != expected
[16:55] <blackboxsw> example failure here https://app.travis-ci.com/github/canonical/cloud-init/jobs/570265179
[16:55] <blackboxsw> as another rough edge around CI behavior
[17:04] <falcojr> blackboxsw: yep, problem for that particular check is that we're checking that the log doesn't have additional warnings beyond those...so we can't really check for specific text that exists in the log
[17:05] <falcojr> it can be done, but just requires some more complicated logic...especially if it's around tracebacks
[17:12] <aciba> I have created a PR to solve the CI issue: https://github.com/canonical/cloud-init/pull/1458
[17:42] <blackboxsw> ack falcojr, yeah I was thinking minimally just dumping warnings when expected warnings don't match https://github.com/canonical/cloud-init/pull/1424/commits/dde4bb1f816f5701c4a3b42cdb8d3a132e496352
[17:43] <blackboxsw> tracebacks'd be a bit more complex getting a multi-line regex 
[17:51] <blackboxsw> falcojr: you are going to spoil me. I write two lines of code and you contribute 70 lines of test coverage
[18:16] <blackboxsw> turns out I actually need to package the json schema in order to pass integration tests :) https://github.com/canonical/cloud-init/pull/1424 ready for re-review
[18:17] <blackboxsw> since the cloudinit.config.schemas didn't have an __init__.py it wasn't included when packaging, which lead to a lot of warnings about inability to find valid schema