blackboxsw | orndorffgrant: if you are doing any UA testing for systemd daemon, add-apt-repository ppa:cloud-init-dev/daily on Jammy will get you /run/cloud-init/cloud-id-* | 00:29 |
---|---|---|
blackboxsw | for tomorow | 00:30 |
blackboxsw | s/r/rr | 00:30 |
blackboxsw | woo hoo thx for the reviews folks Jammy update is accepted for tip of cloud-init [ubuntu/jammy-proposed] cloud-init 21.4-119-gdeb3ae82-0ubuntu1~22.04.1 (Accepted) | 03:34 |
orndorffgrant | blackboxsw: Great! Yes that will make testing much simpler | 14:10 |
apple-corps[m] | blackboxsw: I had removed the sudo commands from my cloud-init script. I tried adding a sleep statement regarding the filesystem initialization concerns. I'm seeing the same behavior. From the cloud-init.log it looks like the runcmd is written, but not executed as shown previously. Then I assume I will need to collect the logs to file a bug report. | 19:39 |
blackboxsw | apple-corps[m]: yes I would have expected your logs to contain "Running command ['/var/lib/cloud/instance/scripts/runcmd'] with allowed return codes [0] (shell=False, capture=False)" | 19:42 |
blackboxsw | to show that cloud-init actually ran it. Has your system actually completed running cloud-init? You can check via `cloud-init status --wait --long` | 19:42 |
blackboxsw | apple-corps[m]: I'm also presuming that you are using nitro system and maybe IPv6 only networking? | 19:43 |
apple-corps[m] | blackboxsw: when I run `sudo cloud-init status --wait --long` it appears I get repeated periods ... displayed so does that suggest it has not completed? | 19:44 |
blackboxsw | apple-corps[m]: yeah, it suggests cloud-init hasn't been able to get far enough to complete on the system. probably because something is happening in networking that is preventing cloud-init from moving to the next stage where it actually runs the runcmd | 19:45 |
apple-corps[m] | Regarding nitro system, I am not aware regarding nitro-system, then I think not. Running a z1d.large ec2 instance type | 19:46 |
blackboxsw | apple-corps[m]: runcmd module is only executed at the end of the "config" boot stage https://cloudinit.readthedocs.io/en/latest/topics/boot.html#config. So if we don't get that far on your system you'll never see that command execute. | 19:47 |
apple-corps[m] | blackboxsw: should I go ahead and provide the log and file a bug report then? The last lines of the log. util.py[DEBUG]: cloud-init mode 'modules' took 4.127 seconds(4.12) | 19:48 |
blackboxsw | apple-corps[m]: what's in /run/cloud-init/status.json? It should show a start and end time for each boot stage cloud-init has completed. If you have any stages which don't show an end time you know you are 'stuck' there. `cloud-init analyze show` might also help discern how far in boot you have gotten | 19:48 |
apple-corps[m] | handlers.py[DEBUG]: finish: modules-config: SUCCESS: running modules for config | 19:49 |
blackboxsw | apple-corps[m]: +1 on bug filing. if you have the ability to get /var/log/cloud-init*log and or `cloud-init collect-logs` it'd make this way easier | 19:49 |
blackboxsw | if you had network connectivity from the instance I'd say run `ubuntu-bug cloud-init` | 19:49 |
apple-corps[m] | I probably need to scrub the logs of any sensitive information. | 19:50 |
blackboxsw | apple-corps[m]: +1 on a scrub, generally watch out for the included ./run/cloud-init/instance-data-sensitive.json ./run/cloud-init/instance-data.json files. The cloud-init.log itself should have "REDACTED" passwords from user-data for instance | 19:52 |
apple-corps[m] | I see finished and start times on 3 modules. The other modules look like they have null values for both. | 19:52 |
blackboxsw | my example status.json has start/end times for init, init-local, modules-config, modules-file | 19:53 |
blackboxsw | my example status.json has start/end times for init, init-local, modules-config, modules-final | 19:53 |
blackboxsw | if you don't get across modules-final something is holding things up. `cloud-init analyze show` or `cloud-init analyze blame` may help when looking at your first boot on the system | 19:54 |
apple-corps[m] | mine has init, init-local, modules-config all show start / end times. modules-final and modules-init are both null | 19:54 |
apple-corps[m] | for start / end times | 19:55 |
apple-corps[m] | Ok, I will work on collecting and scrubbing the logs then. | 19:56 |
falcojr | I believe systemd-analyze critical-chain should tell you if something is blocking the service from starting | 19:56 |
blackboxsw | one other thing to double check on your image is whether cloud-init's systemd targets have all executed (sometimes systemd "ordering cycles" could have been introduced with other software on the node, resulting in cloud-init systemd units being dropped/disabled from the boot target). | 19:58 |
blackboxsw | for unit in cloud-init-local.service cloud-init.service cloud-config.service cloud-final.service; do systemctl status $unit; done | 19:58 |
apple-corps[m] | I'll take a look. I can see at least some of the modules are running. The script was shellified by cloud-init, etc | 19:58 |
blackboxsw | confirm they were all run. and confirm you don't see "Breaking ordering cycle" in journalctl -b 0 | 19:58 |
apple-corps[m] | When I run systemd-analyze critical-chain it says Bootup is not yet finished. Please try again later | 19:59 |
blackboxsw | and `systemctl status network-online.target` | 19:59 |
blackboxsw | as that'd block a number of cloud-init jobs | 20:00 |
blackboxsw | I'm also thinking of checking active services still running/blocking with `systemctl --type=service --state=running` or something | 20:01 |
apple-corps[m] | cloud-final service is loaded, enabled and dead from the for loop | 20:01 |
apple-corps[m] | cloud-init.local.service doesn't exist. The others are enabled and active | 20:02 |
blackboxsw | ok. good and it is a typo cloud-init.local.service should be cloud-init-local.service | 20:03 |
blackboxsw | but yes given your status.json, I'd expect init-local to be "healthy" | 20:04 |
blackboxsw | `systemctl list-jobs` may also point to something unfinished on your system boot | 20:06 |
blackboxsw | hrm my cloud-final.service is Active: active (exited) since Thu 2022-02-10 01:50:08 UTC; 1 day 18h ago | 20:07 |
blackboxsw | again I think minimally your cloud-init.log will help us deduce what's happened here if possible | 20:07 |
apple-corps[m] | Looking at cloud-init-local.service "my bad", I see it's active, enabled. From the journalctl logs for this I see cloud-init warning. dhcp.py[WARNING]: dhclient did not produce expected files: dhclient.pid, dhclient.leases . I see some logging related to a dhclient process. But I do see dhclient provided the leases | 20:08 |
apple-corps[m] | looks like there were some permissions errors creating dhclient.leases and dhclient.pid. But it's not clear whether that is blocking the service or not. | 20:09 |
apple-corps[m] | I think if we expect cloud-final.service to execute the runcmd command, because it is dead and its last run was much before today, then we know there is an issue there | 20:10 |
apple-corps[m] | collecting the logs then and will file | 20:28 |
apple-corps[m] | attached to bug 1960678 | 21:18 |
ubottu | Bug 1960678 in cloud-init "cloud-config script not executing" [Undecided, New] https://launchpad.net/bugs/1960678 | 21:18 |
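
The status.json check discussed above (look for boot stages with no start or end time) can be sketched in a few lines of Python. This is only an illustration, not part of cloud-init: the helper name `unfinished_stages` is hypothetical, and the layout it assumes (a top-level "v1" key holding one entry per stage with "start" and "finished" fields) is inferred from the stage names and the "finished and start times" mentioned in the conversation, so verify it against the file on your own instance.

```python
import json
from pathlib import Path

# Stage names as mentioned in the conversation, in assumed boot order.
STAGES = ("init-local", "init", "modules-init", "modules-config", "modules-final")


def unfinished_stages(status):
    """Return the stages that lack a start or a finished timestamp.

    Assumes the layout {"v1": {"<stage>": {"start": ..., "finished": ...}}};
    this layout is an assumption, so check it against your own status.json.
    """
    v1 = status.get("v1", {})
    pending = []
    for stage in STAGES:
        info = v1.get(stage) or {}
        if info.get("start") is None or info.get("finished") is None:
            pending.append(stage)
    return pending


if __name__ == "__main__":
    path = Path("/run/cloud-init/status.json")
    if path.exists():
        print(unfinished_stages(json.loads(path.read_text())))
    else:
        print(f"{path} not found; is cloud-init installed on this system?")
```

A stage listed in the output has not recorded both timestamps, which matches the "null for start / end times" situation described in the log. It does not say why the stage is missing, so `cloud-init analyze show` and the systemd checks above are still needed.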