/srv/irclogs.ubuntu.com/2020/10/14/#cloud-init.txt

=== tds0 is now known as tds
catphishmorning, i'm almost there with my datasource, but i have a problem that i don't understand, on first boot, my local stage populates netplan with my network config, but the network never comes up. if i then reboot, the network *does* come up11:23
catphishit feels like cloud-init is suppressing netplan, but not actually configuring the network itself11:23
catphishthis is (i think) the relevant log output: https://paste.ubuntu.com/p/zV8ndvcf7k/11:26
catphishthe documentation for the local stage says "Cloud-init then exits and expects for the continued boot of the operating system to bring network configuration up as configured." but it would seem that for some reason, netplan is not being applied *after* this local stage11:34
catphishoh, the netplan config is written during the main stage, not the local stage, hmm11:41
meenacatphish: glad you figured all that out without any help11:43
catphishthank you, though i'm still not clear about how the network should get configured at first boot :(11:44
catphishmy datasource fetches and returns network config (yay), but this doesn't get written to the netplan config file until after netplan has already executed, and hence the network is never actually brought up during the first boot, i hope i'm missing something simple, but i'm not sure what it is11:46
otuboHey guys, any chance to take a look at my PR before it gets stale? https://github.com/canonical/cloud-init/pull/58612:01
otuboAlso, a quick question about logging: why is multi_log() in util.py and not in log.py? I just got an IOError on multi_log()'s flush call over here.12:04
otuboI see that log.flushLoggers() is the single point of flushing (except for log._resetLogger()) and it passes on IOError exception. Wondering if we should pass on util.multi_log() as well or move multi_log inside log and reroute the flush to log.flushLoggers()12:08
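[editor's sketch] The option otubo raises, tolerating an IOError on flush, could look roughly like the following. This is a minimal sketch and not cloud-init code: safe_flush, multi_log_sketch and BrokenStream are hypothetical names invented for illustration, and the real util.multi_log() has a different signature.

```python
import io
import sys


def safe_flush(stream):
    """Flush a stream and report success instead of raising.

    Hypothetical helper: mirrors the described behaviour of passing on
    IOError during a flush. In Python 3, IOError is an alias of OSError.
    """
    try:
        stream.flush()
        return True
    except (IOError, OSError):
        return False


class BrokenStream(io.StringIO):
    """Test double whose flush always fails, standing in for a bad console."""

    def flush(self):
        raise IOError("simulated flush failure")


def multi_log_sketch(text, stream=None):
    """Write text to a stream, then flush without letting IOError escape."""
    stream = stream if stream is not None else sys.stderr
    stream.write(text)
    return safe_flush(stream)
```

Returning a boolean rather than silently passing lets a caller decide whether a failed flush is worth noting elsewhere; whether that belongs in util.py or log.py is the open question in the conversation.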
otuboThe bug in question is this one: https://bugzilla.redhat.com/show_bug.cgi?id=183110712:10
ubot5bugzilla.redhat.com bug 1831107 in cloud-init "[RHV] cloud-init with empty fields injects configurations" [Low,New]12:10
=== tds6 is now known as tds
Odd_Blokemeena: pickle is a way of serialising Python objects: https://docs.python.org/3/library/pickle.html.  cloud-init uses it to persist instance state between boots.13:42
Odd_Blokemeena: Yep, I discovered you can do the mocking after I'd written that concrete subclass implementation.13:43
rharpercatphish: is your datasource configured to run at local time?   2020-10-14 11:21:08,512 - main.py[DEBUG]: [local] Exiting. datasource DataSourceKatapult not in local mode.  It sounds like it is not, the message your post ends with is in cloudinit/cmd/main.py where it checks the current datasource.mode with the mode;    What does your datasources =  line look like in your new datasource?  if you look at DataSourceConfigDrive,  you14:27
rharpercan see how a datasource class is bound to a (sources.DEP_FILESYSTEM,) tuple; this tells cloud-init that ConfigDrive runs at "local time"; access to local filesystem vs dependency on the network (the datasource needs networking to fetch its data)14:27
catphishrharper: i was fairly sure my data source runs at local time. it runs if i execute "cloud-init init --local", the dependencies are defined as follows: datasources = [ (DataSourceKatapult, (sources.DEP_FILESYSTEM, )) ]14:47
catphishrharper: this is the complete (WIP) datasource: https://paste.ubuntu.com/p/s9F4xth7Yc/14:47
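[editor's sketch] The dependency binding discussed above can be illustrated without importing cloud-init. In this sketch the DEP_* constants and list_from_depends are local stand-ins modelled on the names in cloudinit.sources (an assumption, not the library code), DataSourceKatapult is an empty placeholder for the class in the paste, and DataSourceNeedsNetwork is hypothetical.

```python
# Stand-ins for sources.DEP_FILESYSTEM / sources.DEP_NETWORK (assumed names).
DEP_FILESYSTEM = "FILESYSTEM"
DEP_NETWORK = "NETWORK"


class DataSourceKatapult:
    """Placeholder for the datasource class from the conversation."""


class DataSourceNeedsNetwork:
    """Hypothetical datasource that must fetch its data over the network."""


# Each entry binds a class to the dependencies that must be available.
datasources = [
    (DataSourceKatapult, (DEP_FILESYSTEM,)),
    (DataSourceNeedsNetwork, (DEP_FILESYSTEM, DEP_NETWORK)),
]


def list_from_depends(depends, ds_list):
    """Return the classes whose declared dependencies equal `depends`."""
    wanted = set(depends)
    return [cls for cls, deps in ds_list if set(deps) == wanted]
```

Under these assumptions, asking for only the filesystem dependency (the "local time" case) selects DataSourceKatapult, while asking for filesystem plus network selects the other class.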
catphishi will check whether "cloud-init init --local" actually populates the netplan config file or not14:48
rharpercatphish: your log suggested it did,  2020-10-14 11:21:08,250 - netplan.py[DEBUG]: V2 to V2 passthrough14:50
rharper2020-10-14 11:21:08,252 - util.py[DEBUG]: Writing to /etc/netplan/50-cloud-init.yaml - wb: [644] 662 bytes14:50
rharperand it invoked netplan generate14:50
rharper2020-10-14 11:21:08,254 - subp.py[DEBUG]: Running command ['netplan', 'generate'] with allowed return codes [0] (shell=False, capture=True)14:50
rharperso, the questions that remain are  1) what's in /etc/netplan/50-cloud-init.yaml 2) what's in /run/systemd/network/*   3) is systemd-networkd.service enabled?  4) if so, did it run before or after cloud-init-local.service (journalctl -b 0 -o short-monotonic -u cloud-init-local.service -u systemd-networkd.service)14:52
catphishthank you, i will go back through this now, what i know for sure is that after the first boot, /etc/netplan/50-cloud-init.yaml is populated, but not applied, after a subsequent reboot, it's applied14:53
catphishso, a --local run definitely populates netplan: https://paste.ubuntu.com/p/hMDgNHjPrW/15:09
catphishso, after the initial boot, /etc/netplan/50-cloud-init.yaml is populated, but systemd-networkd.service is "dead", if i run "systemctl start systemd-networkd.service", the network comes up fine15:14
rharperdead usually means it's not enabled by default in your system15:15
catphishhmm, yes it does seem like it may be disabled15:16
rharperon Ubuntu, we have /etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service15:16
catphishbut on a subsequent boot, it runs15:16
catphishnb. this is ubuntu 18.04 with cloud-init installed after a regular installation15:16
rharperbecause netplan during boot will now parse the yaml and emit system want targets15:16
rharperoh15:16
rharperdesktop ?15:16
catphishserver15:17
rharperfrom the legacy installer15:17
catphishyes15:17
rharperdo you have the wants I mentioned in /etc/systemd/system/network-online.target.wants ?15:17
catphishi'm happy to abandon this weird installation and start from a new cloud image, i just assumed it would work the same15:18
catphishi don't have anything called /etc/systemd/system/network-online-target.wants15:19
rharperyeah, I suspect this is part of the cloud images that may not be present without15:19
catphishit's very much my suspicion that netplan is starting before cloud-init-local15:20
rharpernetplan only runs at generator time; and it writes config files15:20
rharpermkdir -p /etc/systemd/system/network-online.target.wants; cd /etc/systemd/system/network-online.target.wants && ln -s /lib/systemd/system/systemd-networkd-wait-online.service15:20
rharperthat should ensure that systemd-networkd is part of the boot target; such that when cloud-init calls netplan generate and those files are created in /run/systemd/network/ then networkd should run15:21
rharperalso, systemctl status systemd-networkd  should say: Loaded: loaded (/lib/systemd/system/systemd-networkd.service; enabled; vendor preset: enabled)15:21
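[editor's sketch] rharper's mkdir/ln one-liner can be written as a small idempotent Python function. This is an illustrative sketch only: add_wait_online_want and its root parameter are hypothetical, the root argument exists so the function can be tried against a scratch directory, and running it against "/" would need root privileges.

```python
from pathlib import Path

UNIT = "systemd-networkd-wait-online.service"


def add_wait_online_want(root="/"):
    """Create the network-online.target.wants symlink described above.

    Equivalent in intent to:
      mkdir -p /etc/systemd/system/network-online.target.wants
      ln -s /lib/systemd/system/systemd-networkd-wait-online.service
    """
    wants_dir = Path(root) / "etc/systemd/system/network-online.target.wants"
    wants_dir.mkdir(parents=True, exist_ok=True)
    link = wants_dir / UNIT
    # Skip creation if the link already exists, so repeated runs are safe.
    if not link.is_symlink():
        link.symlink_to(Path("/lib/systemd/system") / UNIT)
    return link
```

The link target stays an absolute path under /lib/systemd/system, as in the original command, regardless of which root the link itself is created under.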
catphishrharper: that's fixed it, thanks!15:21
catphish   Loaded: loaded (/lib/systemd/system/systemd-networkd.service; disabled; vendor preset: enabled)15:22
catphishinterestingly "disabled" but it works15:22
catphishyay - https://paste.ubuntu.com/p/Bg59SWTvc4/15:23
rharpercatphish: \o/15:24
catphishanyway, that was probably a waste of time, but at least now i know my datasource isn't the problem, i'll try to get some actual cloud images onto my platform now, thank you for your assistance15:24
catphishhopefully the cloud images will work out of the box15:25
rharperyeah, I hope so15:26
catphishthere's still a lot i need to understand, but getting there15:27
meenaOdd_Bloke, that warning sounds…dangerous. do we trust our data??15:27
Odd_Blokemeena: I was thinking about that earlier: if you have the ability to write nasty data into obj.pkl, then you almost certainly have the ability to do substantially worse things in a much less indirect fashion.15:28
Odd_Blokeobj.pkl is root:root, 400.15:29
Odd_BlokeAnd if you can feed user-data which would cause a vulnerability in, then you may as well just give yourself root more directly, without exploiting some pickling bug.15:30
Odd_Bloke(If anyone has something more concrete than that, then please follow our security process! https://cloudinit.readthedocs.io/en/latest/topics/security.html)15:34
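[editor's sketch] Odd_Bloke's argument rests on obj.pkl being root:root with mode 400. A loader that checks exactly that before unpickling might look like the sketch below. It is not something cloud-init is stated to do: load_if_private and its expected_uid parameter are hypothetical names, and the check is only an illustration of the ownership and mode reasoning in the conversation.

```python
import os
import pickle
import stat


def load_if_private(path, expected_uid=0):
    """Unpickle `path` only if it is owned by expected_uid and has no
    group/other permission bits set (for example mode 0400).

    Hypothetical guard: pickle.load can execute code embedded in the
    file, so the file must only be writable by a trusted owner.
    """
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    if st.st_uid != expected_uid or mode & 0o077:
        raise PermissionError(
            "refusing to unpickle %s: uid=%d mode=%o" % (path, st.st_uid, mode)
        )
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

As the conversation notes, anyone able to get past such a check already has root-level write access, which is why the pickle warning is considered low risk here.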
meenaOdd_Bloke, aye16:10
meenai need to fix the Azure tests, and get this pr done16:34
meenai say this, while sitting here watching buddi https://www.netflix.com/title/8099359016:36
vijayendrarharper, LP: 1893770. I tried your suggestion on adding one more datasource NoCloud along with ConfigDrive but I still see it resets to fallback(dhcp)16:57
ubot5Launchpad bug 1893770 in cloud-init "Cloudinit resets network scripts to default configuration DHCP once Config Drive is removed after sometime" [Undecided,Incomplete] https://launchpad.net/bugs/189377016:57
rharpervijayendra: sure, update the bug with logs from that run;  I suggest that you interactively work with ds-identify and your cloud.cfg until you see ds-identify report disabled;  sudo DI_LOG=stderr /usr/lib/cloud-init/ds-identify --force17:04
vijayendrarharper, updated logs on bug. Sure. Currently doing the same by running /usr/libexec/cloud-init/ds-identify --force17:10
vijayendrarharper, currently tried by doing this change https://paste.ubuntu.com/p/Ywqv58nGxs/. Looks working with this change17:15
rharpervijayendra: ok, but you shouldn't need this; I'll look at the logs18:02
vijayendrarharper, Sure18:02
rharperthis is what it looks like for me; what I expect it should do for you as well;  https://paste.ubuntu.com/p/vfZ98k3VRm/18:04
vijayendrarharper, In my case last line looks like No ds found [mode=search, notfound=enabled]. Enabled cloud-init [0]. https://paste.ubuntu.com/p/zqfxXfnQXy/18:16
vijayendrarharper, cloudinit is disabled for you but for me it's enabled18:17
rharpervijayendra: I updated the bug18:19
rharpersomething has the ds-identify policy set to notfound=enabled ; which means if you don't find any datasources, enable cloud-init anyway18:19
rharperthat's not the upstream default, or how we run it in Ubuntu18:19
rharperI don't think RHEL sets that as default either18:19
rharperbut it's possible18:19
rharperin any case though, smoser already mentioned that a ds-identify change won't help since any cloud-init operation that was supposed to run on every boot (user-data may have enabled these things) wouldn't get run since cloud-init disabled itself;18:20
rharperso the path forward is either looking at the datasource fallback PR; or deal with manual-cache-clean issues you've found in any backup/template script;18:21
vijayendrarharper, True. Let me work on the fallback PR you suggested.  I don't see much of an impact on our platform if subsequent boot cloudinit gets disabled. So wanted to give a try18:26
=== tds2 is now known as tds
rharperyou don't accept user-data ?18:34
vijayendrarharper, yes, that will not work, but after reboot the dhcp network on the guest seems to be a bigger problem than that; i was just trying to borrow some time to deal with this problem before we address the issue completely.18:46
rharpervijayendra: I see; it's definitely an improvement in the short term18:49
rharpervijayendra: ah, I see what's up; since power does not have DMI, the default policy is to enable cloud-init even if DS is not found; primarily because we can't rule some datasources out without looking at DMI data;   so, since you're already writing a datasource_list: []  you can also include a ds-identify policy to set notfound=disabled;  you can write to /etc/cloud/ds-identify.cfg with the content: policy: search,found=all,maybe=all,notfound=disabled18:53
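[editor's sketch] The policy line rharper suggests is a mode followed by comma-separated key=value settings. The parser below only illustrates that structure; ds-identify itself is a shell script and parse_policy is a hypothetical helper, not part of cloud-init.

```python
def parse_policy(line):
    """Split a 'policy: mode,key=value,...' line into a dict.

    Illustrative only: shows how the string from the conversation
    breaks down, not how ds-identify actually reads its config.
    """
    key, _, value = line.partition(":")
    if key.strip() != "policy":
        raise ValueError("not a policy line: %r" % line)
    parts = [p.strip() for p in value.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty policy: %r" % line)
    result = {"mode": parts[0]}
    for part in parts[1:]:
        name, _, setting = part.partition("=")
        result[name] = setting
    return result
```

Applied to the line from the conversation, the notfound entry comes out as "disabled", which is the setting that stops cloud-init from being enabled when no datasource is found.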
vijayendrarharper, ah! ok. Sure. Let me try this18:56
vijayendrarharper, yes. This change did not reset to fallback19:08
vijayendrarharper, this helps in short term, I will work on the PR suggested and update you. Thanks for the support.19:11
rharpercool19:17
meenahttps://discourse.ubuntu.com/t/path-to-a-commit-bit-proposal/18770 ⬅️ i commented on rick_h's commit bit proposal20:52
meenaaaaaand now, i sleep20:52
=== MAbeeTT_ is now known as MAbeeTT
=== tds4 is now known as tds
johnsonshiOdd_Bloke: I think the Azure datasource report_diagnostic_event refactor can now be merged since I've addressed all of the comments: https://github.com/canonical/cloud-init/pull/56323:54
