/srv/irclogs.ubuntu.com/2019/08/26/#cloud-init.txt

tribaalblackboxsw: Odd_Bloke: Nice! I'll review the SRU for what I can today (enabling -proposed)07:49
Nick_Awhat is the correct way to pass a fqdn via user data?14:54
Nick_Anvm found it14:56
marlincNick_A, what was it?15:05
tribaalblackboxsw: it seems like one of my colleagues found a problem with our new datasource, but we can't really understand exactly what is happening. runcmd: in the #cloud-config doesn't seem to be picked up. Current hypothesis is that our merge of set_passwords to run always is triggering a version of https://bugs.launchpad.net/cloud-init/+bug/1532234 (list of cloud_config_modules is *overwritten*, not15:10
ubot5Launchpad bug 1532234 in cloud-init "Merging with data in /etc/cloud/cloud.cfg does not work as expected" [High,Confirmed]15:10
tribaalmerged. Could you help confirm?15:10
Nick_Amarlinc just fqdn: ____15:14
Nick_AIt's not necessary though - default hostname sets it as a fqdn if specified that way15:14
chillysurferhow are boot records "split" for cloud-init? looking through cloud-init analyze i see that a machine i just provisioned and booted up has 3 different boot records16:13
chillysurferhow did cloud-init determine to have multiple boot records in this case?16:13
blackboxswchillysurfer: boot records are based on each time cloud-init sees an init-local 'start' log in /var/log/cloud-init.log.   That log line has the format:16:44
blackboxswCloud-init v. <version> running 'init-local'16:45
blackboxswso analyze counts the number of those logs in your log file as they are emitted each boot16:45
chillysurferblackboxsw: ahh ok i see16:46
chillysurferthanks for the explanation!16:46
chillysurferblackboxsw: so if you run `cloud-init init --local` manually though it'll create another boot record even though there was no reboot?16:47
blackboxswyeah it's all handled in either cloudinit/analyze/dump.py:parse_ci_logline which creates event json objects and cloudinit/analyze/__main__.py:analyze_show which parses those parsed log events by type/name and formats the boot output messages16:48
blackboxswchillysurfer: true. that would emit that log line and analyze would then lie I believe16:48
blackboxsws/lie/be wrong16:48
chillysurferblackboxsw: got it makes total sense! thanks!16:49
blackboxswchillysurfer: just validated that behavior16:49
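The counting rule blackboxsw describes can be sketched as below. This is a minimal illustration, not the code in cloudinit/analyze: the function name `count_boot_records` and the sample log lines (timestamps, logger prefix, version) are made up for the example; only the `Cloud-init v. <version> running 'init-local'` format comes from the discussion above.

```python
import re

# Matches the init-local start line quoted in the chat. The prefix before
# "Cloud-init" (timestamp, logger name) is not assumed, so we search
# anywhere in the line.
BOOT_START = re.compile(r"Cloud-init v\. \S+ running 'init-local'")


def count_boot_records(log_text):
    """Count init-local start lines in the text of a cloud-init log."""
    return sum(1 for line in log_text.splitlines() if BOOT_START.search(line))


# Hypothetical log excerpt: two init-local starts and one plain 'init' start.
sample = "\n".join([
    "2019-08-26 16:00:01,000 - util.py[DEBUG]: Cloud-init v. 19.2 running 'init-local' at ...",
    "2019-08-26 16:00:03,000 - util.py[DEBUG]: Cloud-init v. 19.2 running 'init' at ...",
    "2019-08-26 16:40:00,000 - util.py[DEBUG]: Cloud-init v. 19.2 running 'init-local' at ...",
])

print(count_boot_records(sample))
```

Because the count depends only on that log line, a manual `cloud-init init --local` run adds a record without any reboot, which is the behavior validated above.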
tribaalblackboxsw: I think the use of mergemanydict is wrong :/ Instead, we should look for the particular key we are interested in ("cloud_config_modules"), and merge the list by hand (if we find that key, look for set-passwords; if you find it, replace it with the two-element list ["set-passwords", "always"])16:50
blackboxswtribaal: I assume you are looking at the -proposed SRU content for Exoscale :/16:51
tribaalblackboxsw: correct. Well, I found the problem in Eoan...16:51
blackboxswtribaal: this is *good*, as we can try to resolve that on Eoan quickly and get that into the current SRU16:51
tribaalbut I'm afraid I don't know enough about the internals of cloud-init there16:52
blackboxswwhich just started friday16:52
blackboxswso no big 'loss' of test/devel time. I was just starting to test clouds today16:52
blackboxswtribaal: if you could post me access to an instance (ssh-import-id chad.smith) I can ssh into it and add rharper and we can poke around16:52
tribaalblackboxsw: sure thing16:53
tribaalblackboxsw: I created this instance with the following userdata: https://gist.github.com/chrisglass/fb0cf860be8cf01f456dfff8e162e00416:54
blackboxswtribaal: can you also file a bug against cloud-init (as we'll need one to get it fixed in the SRU/Eoan)16:54
tribaalblackboxsw: ack16:54
tribaalblackboxsw: actually let me file the bug first and link all the relevant stuff there16:54
blackboxswthat'd be great tribaal thx16:54
=== blackboxsw changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting Sept 02 16:15 UTC | cloud-init v 19.2 (07/17) | https://bugs.launchpad.net/cloud-init/+filebug
tribaalblackboxsw: https://bugs.launchpad.net/cloud-init/+bug/184145417:02
ubot5Launchpad bug 1841454 in cloud-init "Exoscale datasource overwrites *all* cloud_config_modules" [Undecided,New]17:02
tribaalblackboxsw: I imported your pubkeys in "ssh ubuntu@159.100.241.237"17:05
tribaalblackboxsw: it's a test machine on our preprod, so you can break anything you want17:05
tribaalbonus points if you manage to break anything more than the instance itself :)17:05
blackboxswhaha! thanks, tribaal ok, yeah something going on with datasource config merging order in stages.py:_read_cfg. I'll refresh on why that merge is being overridden instead of merged there.17:08
blackboxswtribaal: as Azure does the same type of thing :/17:09
blackboxswahh your builtin is setting all of cloud_config_modules to your list. as you were supposing, we need to only augment  the ds config on disk with your defaults. I'll work up something.17:13
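The "merge the list by hand" idea tribaal proposed earlier, which this diagnosis confirms, can be sketched as follows. It is an illustration under stated assumptions, not the fix that landed in cloud-init: `force_module_frequency` is a hypothetical helper, and the three-entry module list is a shortened stand-in for the real list in /etc/cloud/cloud.cfg.

```python
import copy


def force_module_frequency(sys_cfg, module, frequency):
    """Return a copy of sys_cfg with one cloud_config_modules entry
    replaced by [module, frequency]; all other entries are kept."""
    merged = copy.deepcopy(sys_cfg)
    modules = merged.get("cloud_config_modules", [])
    for i, entry in enumerate(modules):
        # An entry is either a bare name or a [name, frequency] list.
        name = entry[0] if isinstance(entry, (list, tuple)) else entry
        if name == module:
            modules[i] = [module, frequency]
    return merged


# Hypothetical, shortened on-disk config.
sys_cfg = {"cloud_config_modules": ["runcmd", "set-passwords", "ssh-import-id"]}

# What went wrong: a builtin that carries its own list replaces the whole
# value, so runcmd silently disappears.
overridden = dict(sys_cfg)
overridden.update({"cloud_config_modules": [["set-passwords", "always"]]})

# What was intended: only the one entry changes.
augmented = force_module_frequency(sys_cfg, "set-passwords", "always")

print(overridden["cloud_config_modules"])
print(augmented["cloud_config_modules"])
```

The first printed list contains only the set-passwords entry, which matches the symptom of runcmd not being picked up; the second keeps every module and changes only the frequency.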
flipsaHi! I ran into a weird problem / edge-case, where ds-identify does or does not correctly detect a NoCloud datasource, depending on which version of util-linux (blkid) is installed on the system. Could somebody spare a few minutes to have a look and decide if this should be fixed in cloud-init or in the third party software I am using (xen orchestra)? I described my findings in more detail here: https://17:22
flipsagithub.com/vatesfr/xen-orchestra/issues/4449 Thanks!17:22
rharperflipsa: thanks for reporting it; can you run a cloud-init collect-logs in the failing case?  and ideally open a bug in launchpad ?  I'd like to see what udevadm info --query=all /sys/class/block/xvdXX shows so we can see what sorts of properties were on the device;18:03
blackboxswtribaal: thanks for access to the system. I have a fix for #1841454 . I  can get it to run the modules each boot now https://pastebin.ubuntu.com/p/wTtf9JYDHs/18:45
blackboxswtribaal: rharper I'll put up a branch for this fix shortly (SRU-regression/ Eoan bug)18:45
rharpernice18:46
blackboxswrharper: also openstack v2 related. fixing idempotent normalize_route works wonders for fixing unit test issues18:46
rharper\o/18:46
rharperI thought it might; just prevents mutation in paths which call it multiple times18:47
blackboxswhttps://paste.ubuntu.com/p/kgxN8qJfcM/18:47
rharperI'd like an eye-catcher though so we can see where we're getting multipath passes through18:47
rharperoh18:47
rharpernasty18:47
blackboxswyeah I think this mutation only happened when prefix was /0  for default routes18:47
blackboxswneeded a none check instead18:48
rharperindeed18:48
blackboxsw+1 on tracking the multiple callers for normalization as well will add some debug18:48
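The class of bug discussed here (a truthiness test that treats a /0 prefix as "missing", so repeated calls keep changing the route) can be illustrated with a small sketch. This is not cloud-init's normalize_route: the function body, the route keys (`network`, `netmask`, `prefix`, `gateway`) and the addresses are assumptions chosen only to show why `is None` matters and what idempotent means in this context.

```python
import ipaddress


def normalize_route(route):
    """Return a new route dict with an integer 'prefix'.

    The input dict is never modified, and calling the function on its own
    output returns an equal dict (idempotent).
    """
    normalized = dict(route)
    prefix = normalized.get("prefix")
    # `if not prefix:` would wrongly treat prefix 0 (a default route) as
    # missing, so compare against None explicitly.
    if prefix is None:
        netmask = normalized.pop("netmask", None)
        if netmask is not None:
            prefix = ipaddress.ip_network("0.0.0.0/" + netmask).prefixlen
    if prefix is not None:
        normalized["prefix"] = int(prefix)
    return normalized


# Hypothetical default route with a /0 prefix.
default_route = {"network": "0.0.0.0", "prefix": 0, "gateway": "192.0.2.1"}
once = normalize_route(default_route)
twice = normalize_route(once)
print(once == twice)
```

Returning a copy instead of editing the argument in place is what makes it safe for several code paths to normalize the same route.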
tribaalblackboxsw: \o/19:11
flipsarharper: https://bugs.launchpad.net/cloud-init/+bug/184146619:27
ubot5Launchpad bug 1841466 in cloud-init "ds-identify fails to detect NoCloud datastore with LABEL_FATBOOT instead of LABEL (change introduced recently in util-linux-2.33-rc1)" [Undecided,New]19:27
flipsarharper: if you need anything else, ping me...19:27
flipsarharper: the logs are almost all empty / non-existent, because ds-identify bails with exit code 1 and cloud-init doesn't even run...19:28
flipsabut i guess the problem / cause of the problem are clear anyway. But if not I can clarify / do more testing19:29
rharperflipsa: do you have the command that creates the partitionless disk with FAT16?19:39
rharperflipsa: it seems reasonable to me to support both, but I'd like to be able to create one of these type of disks so we can verify the behavior and the fix19:40
flipsa@rharper: I am not familiar with the Xen Orchestra code base, but this should be it: https://github.com/vatesfr/xen-orchestra/blob/master/packages/xo-server/src/fatfs-buffer.js19:48
flipsait's all done in a web app with nodejs. The external npm library they use is commented out on line 3319:49
rharperflipsa: ok, my reading of the util-linux code seemed to indicate to me that the filesystem label detection in blkid used to fall back to reading the boot record label as well, but blkid stopped doing that;  this is almost certainly going to cause more widespread failures where FAT16 labels were provided but now they aren't; instead, all of the tools which used to get a LABEL value no longer get that.19:51
flipsayeah, as soon as people upgrade some will get bitten, quite sure19:51
rharpernow util linux is more accurate separating the two labels out; but now everyone else gets to fix their stuff as well.  I wonder what case really broke where blkid reported the boot label as the fs label19:52
flipsano clue19:53
rharperflipsa: would you be able to duplicate the original disk?  is it something really small we could attach to the bug ?19:55
rharperlooks like dosfstools writes both boot record and volume label with the same value19:55
rharperhttps://github.com/dosfstools/dosfstools/blob/master/src/boot.c#L75519:56
rharperflipsa: I wonder if the fatfs would do that as well19:56
rharperand we can certainly check if LABEL_FATBOOT is present as well19:56
flipsarharper: 10MiB19:57
rharperyeah, I bet if you xz it, that'll drop smaller, either way if you don't mind attaching that to the bug would be great19:58
flipsarharper: see the bottom of my initial bug report where i did some experiments... dosfslabel incorrectly reads from the LABEL_FATBOOT field (if LABEL is not present), but it writes to the LABEL field but does not over-write LABEL_FATBOOT. seems like a bug as well imho19:59
rharperwell, it's not _incorrect_ by my reading, rather it checks in _both_ locations19:59
flipsarharper: will try. what's the size limit for bugs.launchpad?20:00
rharpermuch bigger than 10MiB20:00
rharperflipsa: we'll handle this on the cloud-init side with a fix to ds-identify to support checking LABEL_FATBOOT for cidata20:01
rharperI would suggest also mentioning writing both label values to orchestra20:01
rharperso they can generate nocloud data for images which won't yet have cloud-init with the fix20:01
rharperI see no reason not to write the same value in both places for the cloud-init use-case20:02
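The check rharper proposes (accept the cidata label from either LABEL or LABEL_FATBOOT) can be sketched in a few lines. ds-identify itself is a shell script, so this Python version only illustrates the logic; the helper names, the sample device values, and the case-insensitive comparison are assumptions for the example. The input is assumed to be KEY=value text of the kind `blkid -o export` prints.

```python
def parse_blkid_export(text):
    """Parse KEY=value lines into a dict, ignoring lines without '='."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def has_cidata_label(fields):
    """True if either the filesystem label or the FAT boot-record label
    identifies a NoCloud seed disk."""
    labels = (fields.get("LABEL"), fields.get("LABEL_FATBOOT"))
    return any(label is not None and label.lower() == "cidata" for label in labels)


# Hypothetical output for a disk as newer util-linux would report it:
# the label appears only under LABEL_FATBOOT.
new_blkid = "DEVNAME=/dev/xvdb\nLABEL_FATBOOT=cidata\nTYPE=vfat\n"
# Hypothetical output with the label reported as LABEL.
old_blkid = "DEVNAME=/dev/xvdb\nLABEL=cidata\nTYPE=vfat\n"

print(has_cidata_label(parse_blkid_export(new_blkid)))
print(has_cidata_label(parse_blkid_export(old_blkid)))
```

Checking both keys covers images built by tools that write only the boot-record label, while writing the same value to both places on the producer side covers guests whose cloud-init predates the fix.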
flipsarharper: awesome!20:02
flipsayeah, think they were hoping that upstream fixes it, but you are right, until this is pushed to distros will take enough time to create problems for people...20:02
flipsarharper: will a dd image of the virtual disk do? don't think there's a supported way to export it from the Xen Orchestra interface20:14
rharperis there only a single disk or do you have a rootfs disk and a separate cloud-init data disk ?20:15
flipsaonly one disk, no partitions20:16
rharperand the disk is FAT16 ?20:16
rharperthat's surely not the Operating System disk ?20:17
flipsaI am sure it's not the OS disk... but you might be right, i assumed that it's FAT16 from reading the source code of Xen Orchestra. cfdisk tells me: Label: dos, identifier: 0x00000000, no partitions only "free space". blkid says type "vFAT"20:22
flipsais there a diff between FAT16 and vFAT?20:25
flipsabtw, this is the whole content:20:25
flipsa.20:26
flipsa├── meta-data20:26
flipsa├── network-config20:26
flipsa├── openstack20:26
flipsa│   └── latest20:26
flipsa│       ├── meta_data.json20:26
flipsa│       └── user_data20:26
flipsa└── user-data20:26
rharperthat's just the metadata disk20:29
rharperso yeah, just dd that20:29
flipsarharper: https://bugs.launchpad.net/cloud-init/+bug/1841466/+attachment/5284769/+files/xvdb.img.tar.gz20:36
ubot5Launchpad bug 1841466 in cloud-init "ds-identify fails to detect NoCloud datastore with LABEL_FATBOOT instead of LABEL (change introduced recently in util-linux-2.33-rc1)" [Undecided,New]20:36
rharperflipsa: thanks!20:36
blackboxswtribaal: rharper  I have a branch up that fixes #184145420:37
blackboxswrharper: tribaal https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/37182320:37
blackboxswit allows for overriding existing sys_cfg options20:38
rharperok20:39
rharperblackboxsw: so, why didn't it work ...20:39
rharpervs. azure's built-in ds config merging ? because of the list type ?20:40
tribaalI guess yes20:42
tribaalblackboxsw: I'll give it a try20:43
tribaalblackboxsw: yep, building a local deb and installing to a fresh Eoan instance makes it work indeed20:57
tribaalruncmd runs, and set-passwords runs with frequency "always" as was initially expected.20:58
rharperso, azure's built-in ds doesn't update or modify the modules list, so that's why it's not an issue;21:01
rharperin azure, the ds.activate() method is used to clear out disk_setup/mount semaphores in the case that they need to reinitialize the disks due to migration;  exoscale could clean up the set_passwords semaphore that way.21:04
tribaalrharper: instead of forcing the frequency?21:06
rharperyes ;  in your template that you loaded, did you just copy the current value of modules and replace the one entry ?21:07
tribaalrharper: I'm not sure I understand the question21:08
rharperbefore I suggested that the datasource indicate the frequency, you said there was an in-image file which set this value21:08
rharperwhich meant that stock images wouldn't run password on each boot21:08
tribaalah yes21:08
tribaalso the in-image file does it "wrong" as well it turns out21:09
rharperheh21:09
rharperthat's not surprising; the config is awkward in that you have to somehow know or read the current list;21:09
tribaalwe did not copy the list and change the value, we assumed it would be merged and therefore just added the one entry there21:09
tribaalactually that's how we detected the problem - with a bionic image in our prod. But the fix there is easy enough for us - we'll just copy the full list and tweak the frequency until we can forget about it all and use the proper datasource instead that should do the right thing for us21:10
tribaalin other words as a workaround we'll copy the full list in our in-image file, until we can get rid of the in-image file :)21:11
blackboxswrharper: yeah sorry was on an errand. because of the list type, full list value overrode the entire cloud_config_modules list item22:10
blackboxswrharper: per overriding sysconfig values from the file system, I didn't see any existing cases where we did that in datasources after that initial /etc/cloud/config.d merge normally our ds's pull only ds_cfg under datasource: <dsname>: key/values22:15
