/srv/irclogs.ubuntu.com/2022/10/27/#cloud-init.txt

=== janitha78 is now known as janitha7
andrew49Hello! I'm using cloud-init 22.3.4 and am encountering a problem where it remains in "status: running" forever (hours) with no indication in the logs about where it is stuck or why; "cloud-init analyze show" indicates that init-local and init-network have finished the states successfully; moreover, if I run "cloud-init clean" followed by12:39
andrew49"cloud-init --debug init" and "cloud-init --debug modules", everything completes successfully. What else can I do to determine why/where it is hanging forever?12:39
meenahi andrew49 13:39
meenaandrew49: do a full clean of logs and state. enable logging properly: basically, just make sure a file that looks like https://github.com/canonical/cloud-init/blob/main/config/cloud.cfg.d/05_logging.cfg exists and is named .cfg, so it will be included13:41
meenathen do a reboot and see what the logs say13:41
falcojrandrew49: actually, don't do that yet (unless you see no logs at all). You said 'cloud-init analyze show' shows init-local and init-network. In /var/log/cloud-init.log, is there a "running 'modules:config'" or "running 'modules:final'" message in the logs?13:49
falcojrif not there's likely something else that was blocking cloud-init from running its final stages13:49
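
[Editor's sketch] The log check falcojr describes can be scripted. This is a minimal sketch, not part of cloud-init: the helper names are hypothetical, and it assumes each stage announces itself in /var/log/cloud-init.log with a banner containing "running '<stage>'" (the network stage is assumed to log as 'init'). The sample lines are illustrative, not taken from andrew49's system.

```python
import re

# Assumption: a stage start is logged as e.g.
# "Cloud-init v. 22.3.4 running 'modules:config' at ...".
STAGE_RE = re.compile(r"running '(init-local|init|modules:config|modules:final)'")

EXPECTED = ["init-local", "init", "modules:config", "modules:final"]


def stages_started(log_text):
    """Return the stages whose start banner appears in log_text, in order."""
    seen = []
    for match in STAGE_RE.finditer(log_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def missing_stages(log_text):
    """Return the expected stages that never logged a start banner."""
    started = stages_started(log_text)
    return [stage for stage in EXPECTED if stage not in started]


# Illustrative log excerpt (made up for this sketch).
SAMPLE = (
    "util.py[DEBUG]: Cloud-init v. 22.3.4 running 'init-local' at ...\n"
    "util.py[DEBUG]: Cloud-init v. 22.3.4 running 'init' at ...\n"
)

if __name__ == "__main__":
    print(missing_stages(SAMPLE))
```

If both module stages come back as missing, that matches the case falcojr describes, where something else on the system is keeping the later services from starting.
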
falcojrdoes "systemctl --failed" or "systemd-analyze critical-chain" show anything unusual?13:50
andrew49falcojr I do not see either of those entries in /var/log/cloud-init.log; "systemctl --failed" shows systemd-remount-fs.service as "failed" - maybe that is the problem?13:57
andrew49it looks like the error in the unit is "mount: /: can't find LABEL=cloudimg-rootfs."13:58
falcojrandrew49: I'd be surprised if that was the problem (unless literally mount '/' failed...which...would give you bigger problems I think :P ), but cloud-init actually runs 4 separate times on boot, so once the first service starts, status will say 'running' until the final service has completed. cloud-config.service has some dependencies during boot, so if those don't complete or are taking forever for some reason, it will look like cloud-init is...14:03
falcojr... never completing14:03
andrew49does it seem likely that the fact that this systemd-remount-fs.service is "failed" is the thing holding it up then?14:03
andrew49"systemd-analyze critical-chain" reports "Bootup is not yet finished" so not a lot of extra info there14:04
falcojrandrew49: try 'systemctl list-jobs' ?14:06
falcojranything 'waiting' there?14:06
andrew49yes a number of things - snapd.autoimport.service, cloud-init.target, cloud-config.service, snapd.seeded.service, cloud-final.service, multi-user.target, ubuntu-advantage.service, graphical.target, and systemd-update-utmp-runlevel.service14:08
itjamie-tempquestion: when using the nocloud-net datasource, is it possible for cloud-init to supply the sysserial/mbserial to the datasource http server?14:09
itjamie-tempeg something like `ds=nocloud-net;s=http://10.10.0.1:8000/$sysserial/`14:10
falcojrandrew49: cloud-config.service relies on snapd completing. I would check the snap logs to see if anything is hanging there14:11
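
[Editor's sketch] The 'systemctl list-jobs' check above can be filtered for stuck units. This is a minimal sketch assuming the usual four-column output (JOB, UNIT, TYPE, STATE); the function name, the job ids and the states in the sample are hypothetical, not captured from andrew49's container.

```python
def waiting_units(list_jobs_output):
    """Return the unit names whose job state is 'waiting'."""
    units = []
    for line in list_jobs_output.splitlines():
        fields = line.split()
        # Data rows have four fields and start with a numeric job id;
        # the header and the trailing "N jobs listed." line are skipped.
        if len(fields) == 4 and fields[0].isdigit() and fields[3] == "waiting":
            units.append(fields[1])
    return units


# Illustrative output (job ids and states are made up for this sketch).
SAMPLE_JOBS = """\
JOB UNIT                 TYPE  STATE
101 snapd.seeded.service start running
102 cloud-config.service start waiting
103 cloud-final.service  start waiting

3 jobs listed.
"""

if __name__ == "__main__":
    print(waiting_units(SAMPLE_JOBS))
```

Units still in 'waiting' are blocked on something ahead of them, which is why the next place to look is whichever job is actually running.
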
falcojritjamie-temp: I think that could work, but then cloud-init will try contacting http://10.10.0.1:8000/$sysserial/user-data for user data and similarly for vendor-data and meta-data14:14
andrew49falcojr I see several errors, e.g.14:14
andrew49"[change 142 "Setup snap \"snapd\" (17029) security profiles" task] failed: cannot reload udev rules: exit status 1"14:14
andrew49and14:14
andrew49"error trying to compare the snap system key: system-key missing on disk"14:14
falcojrandrew49: Unfortunately, I don't know enough about snap to meaningfully help debug that14:15
andrew49falcojr okay thanks, I have enough to go on for now so I'll work on figuring it out and report back what I find14:15
itjamie-temp@falco14:16
itjamie-tempfalcojr is there a list of variables cloud-init has available? eg i just guessed $sysserial14:16
andrew49falcojr I think the issue I'm hitting is https://bugs.launchpad.net/snapd/+bug/1712808; this container does have 'security.privileged: "true"' as described there14:22
-ubottu:#cloud-init- Launchpad bug 1712808 in snapd "udev interface fails in privileged containers" [Medium, Confirmed]14:22
falcojritjamie-temp: oh sorry, I think I misunderstood your initial question. I don't think there's any variable substitution that can be applied there14:28
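
[Editor's sketch] To illustrate falcojr's two answers: per his description, cloud-init requests user-data, vendor-data and meta-data under the seed URL, and no variable substitution is applied. The helper below is hypothetical and only models that description; it is not cloud-init code.

```python
def seed_urls(seed):
    """Return the URLs fetched under a nocloud-net seed URL, as described above.

    The seed string is used verbatim: a placeholder such as $sysserial
    is not expanded.
    """
    return [seed + name for name in ("user-data", "vendor-data", "meta-data")]


if __name__ == "__main__":
    for url in seed_urls("http://10.10.0.1:8000/$sysserial/"):
        print(url)
```

With the seed from the question, every request path contains the literal text $sysserial, so the HTTP server cannot tell machines apart by that path segment.
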
itjamie-tempdamn. that would have been really useful...14:28
meenafalcojr: so, i just realized something, and I don't know why it took me almost a week: I can mock the ifconfig -a output as '' in cases where it has no bearing on the outcome.14:29
falcojrmeena: yeah, were you trying to feed it realistic results before?14:29
meenafalcojr: pretty much everywhere where it was failing14:30
falcojrdoh...14:30
meenafalcojr: i just did my thing of readResource('assets/netinfo/freebsd-ifconfig-output')14:30
meenabut that's a lot of work, for: we're just checking for … something completely different.14:30
meenabut, tbf, i was searching for generic solutions14:31
itjamie-tempwhere would i open a feature request for cloud-init ?14:32
meenaitjamie-temp: link is in the /topic14:33
meenawell, it says bugs, but, still14:34
itjamie-tempok so it's fine to open a bug for an FR?14:34
meenaitjamie-temp: yes14:34
meenaI open at least one per week :P14:34
itjamie-temphttps://bugs.launchpad.net/cloud-init/+bug/1994980 well fingers crossed.14:38
-ubottu:#cloud-init- Launchpad bug 1994980 in cloud-init "FR for variable substitution in nocloud-net urls (eg system serial number)" [Undecided, New]14:38
meenasooooo close14:58
meenafalcojr: all bugs, no, all tests fixed15:49
meenajrm: i think our package should patch whatever is causing cloud-init to print its version to… our version (in the net/cloud-init-devel package)16:25
andrew49falcojr following up on my issue, it looks like as noted in https://discuss.linuxcontainers.org/t/snap-inside-privileged-lxd-container/13691 that adding "security.nesting: true" in addition to "security.privileged: true" (the latter being what I really want) is sufficient to avoid this problem and allow snapd to successfully install (and thus16:35
andrew49cloud-init to finish); thanks again for the help!16:35
falcojrah, great. Glad you found it16:48
meenafalcojr: thanks for the review. Will adapt how flags are parsed, and actually add OpenBSD and NetBSD outputs to our assets for testing20:53
meenatomorrow.20:54
falcojrSounds good20:55
itjamieRe https://bugs.launchpad.net/bugs/1994980 if i take a stab at a pr is there any guidance you'd like to give on what vars should be available for substitution?22:47
-ubottu:#cloud-init- Launchpad bug 1994980 in cloud-init "FR for variable substitution in nocloud-net urls (eg system serial number)" [Wishlist, Triaged]22:47
