/srv/irclogs.ubuntu.com/2019/07/26/#cloud-init.txt

factorDidn't know this existed until now.05:02
factorCan I use git clone inside a runcmd ?05:02
factorand use said repo05:03
naccis it expected that 'selinux_user' in a cloud config on ubuntu would cause an exception ("useradd: -Z requires SELinux enabled kernel")13:12
Odd_Blokefactor: o/ There shouldn't be any problem doing that; are you seeing issues?13:14
Odd_Blokenacc: \o I'm not sure, TBH, but a bug would be appreciated!13:15
=== Odd_Bloke changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting July 22 16:15 UTC | cloud-init v 19.2 (07/17) | https://bugs.launchpad.net/cloud-init/+filebug
naccOdd_Bloke: ack will get it filed13:16
Odd_BlokeThanks!13:16
Odd_Blokerharper: Do you know why cloud-init doesn't emit its logs to the journal?  When trying to debug interactions between multiple units, it's quite annoying to have to keep two log files open and mentally interleave them.13:28
Odd_Blokerharper: I filed https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1838032, so we can carry out the conversation there.13:34
ubot5Ubuntu bug 1838032 in cloud-init (Ubuntu) "cloud-init should emit its logs to the systemd journal" [Undecided,New]13:34
rharperOdd_Bloke: it does log to the journal... journalctl -u cloud-init-local.service  ?14:06
* rharper reads bug 14:07
Odd_Blokerharper: Clarifying in there now.14:14
rharperOdd_Bloke: I understand now14:15
rharperand we can discuss the defaults in upstream, in ubuntu, and I've added a comment to make a one line config change to get what you want14:15
rharperthe latter which is immediately useful to you14:16
Odd_Blokerharper: Ack, thanks; I've clarified that I mean _all_ logs.14:16
factor<Odd_Bloke> Thanks for the response. Going to test it; have not yet. So I can git clone a puppet script to run as part of cloud-init.15:08
Odd_Blokefactor: You can definitely `git clone` the puppet script from a runcmd, yes.  And if you're asking about using the puppet cloud-init module, then that runs after runcmd so whatever you do should be on the system by the time it runs.15:20
factorOdd_Bloke, thanks15:24
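The `git clone` from `runcmd` discussed above could be generated roughly as follows. This is a minimal sketch, not something posted in the channel: the repository URL, the checkout path and the `build_user_data` helper are hypothetical, git is assumed to be installed on the image already, and the sketch assumes cloud-init will accept a JSON document where it expects YAML (worth verifying on a test instance).

```python
import json


def build_user_data(repo_url, dest):
    """Render cloud-config user-data whose runcmd clones repo_url into dest."""
    config = {
        # Each runcmd entry given as a list is executed as an argv,
        # so no shell quoting is needed for the URL or path.
        "runcmd": [
            ["git", "clone", repo_url, dest],
        ],
    }
    # The "#cloud-config" header line marks the payload as cloud-config.
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    # Hypothetical repository and destination, for illustration only.
    print(build_user_data(
        "https://example.com/hypothetical/puppet-scripts.git",
        "/opt/puppet-scripts",
    ))
```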
Odd_Blokerharper: So it looks like for that Azure growpart bug the udev event is missing information: https://bugs.launchpad.net/cloud-init/+bug/1834875/comments/2416:17
ubot5Ubuntu bug 1834875 in cloud-init "cloud-init growpart race with udev" [Medium,In progress]16:17
Odd_Blokerharper: Do you have any suggestions as to where I should go looking next?16:17
rharperyes, so if you look at growpart, you can see what it's doing with sgdisk,16:18
rharperwhich is updating the table and setting values16:18
rharperit looks like whatever is watching the disk (systemd) sees the write/close on the device, and fires an uevent where the partition data isn't present,16:19
rharperOdd_Bloke: look at growpart at line 47616:20
rharperOdd_Bloke: so I wonder if the version of sgdisk matters here16:20
rharperit16:21
rharperis effectively deleting the pt table, rewriting it, then calling partx --update16:21
rharperto which, I think one *could* add a udevadm settle --exit-if-exists /path/to/partition16:22
rharperbut I thought your change added a settle right before the check on new size value;16:23
Odd_BlokeSo --exit-if-exists only short-circuits the regular behaviour, AIUI.16:24
rharperif update was triggered (partx --update) then a settle after that certainly should have already re-read pt table16:25
Odd_BlokeSo if the queue is empty, it won't _wait_ for that file.16:25
rharperOdd_Bloke: right; it just means to not wait the default 120 seconds if the path already exists16:25
rharperit checks if it exists, then checks if the queue has anything in it16:25
rharperand then will wait up to 120 seconds for queue to empty16:26
Odd_BlokeRight, which means `udevadm settle` will do whatever it would do; and it didn't address the issue.  That said, I don't think tobijk used my patched cloud-init for these latest tests.16:26
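The `--exit-if-exists` behaviour described in the exchange above can be modelled in a few lines. This is a sketch of the logic as stated in the channel, not udevadm's actual implementation: `settle`, `queue_is_empty` and `path_exists` are hypothetical names, and only the 120 second default comes from the discussion.

```python
import time


def settle(queue_is_empty, path_exists=None, timeout=120.0, poll=0.1,
           clock=time.monotonic, sleep=time.sleep):
    """Return True when the path exists or the queue is empty, False on timeout."""
    deadline = clock() + timeout
    while True:
        if path_exists is not None and path_exists():
            return True   # short-circuit: the path is already there
        if queue_is_empty():
            return True   # nothing pending, so we do NOT wait for the path
        if clock() >= deadline:
            return False  # queue never drained within the timeout
        sleep(poll)
```

The second `return True` is the point Odd_Bloke makes: with an empty queue the call returns immediately even though the path is still missing.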
rharpereffectively, sgdisk writes the new table info, and partx --update will issue the syscall to re-read the partition table on the disk16:26
rharperif we then call settle; the updated pt will generate change events16:27
rharperand *those* should have the updated partition data in the event16:27
Odd_BlokeWhy would settle cause more events to be generated?16:27
rharperotherwise I don't see why a change event would occur;  though when the partition is _deleted_ via sgdisk command, that will create a change event16:27
rharpersettle won't cause any events16:27
rharperonly sgdisk command will generate events16:27
rharperthe partx --update is a kick to ask the kernel to reread the partition table data (and then the kernel may emit update events if the table has changed since it last read it)16:28
rharperso, a settle after those two commands have run, should ensure that once a CHANGE event for the updated ptable is emitted, and rules have run, that the symlink would be present16:29
rharperit's not clear to me if the captured change event you're seeing is from when sgdisk is "clearing things out" or after it's written the new data and called partx --update16:30
rharperOdd_Bloke: I can hangout if you want16:30
Odd_BlokeOn both failing and succeeding instances, we see the same number of sda1 udev events: one add when the system comes up, one change when we resize, and one change later in boot (I think when systemd is reloaded?).  After that second "change", both systems are in a good state.16:33
Odd_BlokeThe order in which the sda{1,14,15} udev events are logged is different between the two systems; sda1 is first on the failing system, and last on the good system.16:34
Odd_BlokeSo my theory is that the events are prompted by the clear-out, and we're then racing against the partition table being rewritten.16:35
rharperand the timing of the second change from passing to failing ? timestamp wise; does this occur before or after the read ?16:35
Odd_BlokeIn both cases the second change is 20-30s after the resize has happened.16:35
rharperoh my16:35
rharperso, let's test this if we can: in the check_size path, invoke 'partx --update' or blockdev --rereadpt /dev/sda; then udevadm settle16:36
rharperthis will *force* a reread of the table before we check size16:36
rharperand settle should enforce rules complete if update detected a table change16:36
rharperif that works, then it's likely that the partx --update after sgdisk returns isn't effective due to the processing of the change of removal16:37
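The experiment proposed above (force a re-read of the partition table, then settle, before checking the size) might be scripted like this. It is a sketch under assumptions: `reread_then_settle` is a hypothetical helper, not cloud-init's code, the commands need root, and `/dev/sda` is just the device named in the discussion.

```python
import subprocess


def reread_then_settle(disk, run=subprocess.check_call):
    """Ask the kernel to re-read disk's partition table, then wait for udev."""
    commands = [
        ["partx", "--update", disk],  # kick the kernel to re-read the table
        ["udevadm", "settle"],        # wait for the udev event queue to drain
    ]
    for cmd in commands:
        run(cmd)  # check_call raises CalledProcessError on a non-zero exit
    return commands


if __name__ == "__main__":
    reread_then_settle("/dev/sda")
```

Passing `run` as a parameter keeps the helper testable without touching a real disk.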
Odd_BlokeAs an aside, those second change events come after "systemd[1]: Reloading."16:39
chillysurferit's not jumping out at me by looking at bddeb, but is there a way to tell the build to dump debs and other distribution artifacts in a `dist` dir?16:53
chillysurferi see that it is gitignored so wondering if i'm just not seeing something16:53
Odd_Blokerharper: Further confirmation that we aren't seeing the event due to the partx command; the growpart call specifies a single partition, and we're seeing events for all partitions.16:54
Odd_Blokerharper: I'm heading to lunch; could you take a look at https://bugs.launchpad.net/cloud-init/+bug/1834875/comments/25 and let me know if that matches your understanding?16:56
ubot5Ubuntu bug 1834875 in cloud-init "cloud-init growpart race with udev" [Medium,In progress]16:56
rharperOdd_Bloke: that looks right;  reading partx source-code (https://github.com/karelzak/util-linux/blob/master/disk-utils/partx.c) I wonder if what we really want instead of partx --update /dev/sda 1 ; is just blockdev --rereadpt /dev/sda ;17:22
rharperpartx --update looks like it does more than just asking the kernel to reread the partition table on the disk.17:23
Odd_Blokerharper: When I tried a `sudo blockdev --rereadpt /dev/sda` on a running system, I get "blockdev: ioctl error on BLKRRPART: Device or resource busy"17:28
Odd_BlokeSo I don't know if we _can_ use that.17:28
rharperright17:28
rharperthat's why they used partx; uses a different syscall17:29
rharperOdd_Bloke: I think it would be interesting to see partx --show /dev/sda  ; before the --update and after17:30
Odd_BlokeOK, will include that.17:30
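Capturing `partx --show` before and after the update, as suggested above, could be automated along these lines. This is a hedged sketch: `partx_show_diff` and its parameters are hypothetical, and the output format of `partx --show` is simply diffed as text rather than parsed.

```python
import difflib
import subprocess


def partx_show_diff(disk, update, show=None):
    """Return unified-diff lines of `partx --show` output around update()."""
    if show is None:
        def show():
            return subprocess.check_output(
                ["partx", "--show", disk], universal_newlines=True)
    before = show()
    update()  # e.g. a callable that runs `partx --update` on the disk
    after = show()
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm=""))
```

An empty result means the kernel's view of the partition table did not change across the update.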
Odd_Blokechillysurfer: Sorry, skipped over your question before.  If I had to guess, I think dist/ is ignored for Python packaging reasons, not because that's where we expect deb/rpm packages to end up.17:53
Odd_Blokechillysurfer: I think adding an option for output location would probably be a reasonable proposal, though, defaulting to the current behaviour.17:53
chillysurferOdd_Bloke: yep that makes sense. i have a one-liner git and xargs command that just deletes those untracked files anyways17:55
chillysurferso not a huge deal (as long as the build files are my only untracked ones)17:55
bitfehleri was wondering - is there any "official" definition of what constitutes a variant "other" in the cloud.cfg template?18:27
bitfehlerthere are several uses like `{% if variant in ["ubuntu", "unknown", "debian"] %}` or even `{% if variant in ["ubuntu", "unknown"] %}`18:28
bitfehlere.g. enabling the apt-related modules18:28
bitfehlerwhich seems odd, i figured "other" would be a distro... well, _other_ than the ones that have specific values18:29
Odd_BlokeUbuntu and Debian aren't the only distros that use debs (e.g. Mint).18:30
bitfehlertrue, but still seems a far-reaching assumption? also, i thought mint was in the list as well?18:33
bitfehlerah, not quite, but it seems mint is mapped to ubuntu already18:33
bitfehlersystem_info() has something like18:34
bitfehler        elif linux_dist in ('ubuntu', 'linuxmint', 'mint'):18:34
bitfehler            var = 'ubuntu'18:34
Odd_BlokeFair enough.18:35
bitfehleri mean, i am also not even trying to prove anything, i was just wondering. the apt thing was just an example. if it really is meant to be "anything else" that's file18:35
bitfehlerfine :)18:35
rharperbitfehler: no strict definition; I would say, other  is something not currently defined18:36
rharperwe're opt-in here; the default being Ubuntu; and as distros and variants of a particular distro have need to deviate; those get introduced/added18:36
Odd_BlokeI guess the thinking is that if we know that we _don't_ know what distro we're on, we shouldn't remove things from the list.18:37
bitfehlerinteresting point :)18:38
Odd_BlokeSo `{% if variant in ["ubuntu", "unknown", "debian"] %}` is more like "we shouldn't run this if we know definitively we aren't on Ubuntu or Debian"18:38
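The opt-out reading of that template condition can be expressed in plain Python. This is only an illustration of the reasoning above, not the template's real rendering code: `modules_for` and the base module list are hypothetical, while the variant list is the one quoted from the template.

```python
# Variants for which the apt-related modules are kept, as quoted above.
APT_VARIANTS = ("ubuntu", "unknown", "debian")


def modules_for(variant):
    """Return a module list, dropping apt modules only for known non-apt variants."""
    modules = ["write-files"]  # hypothetical base list for illustration
    if variant in APT_VARIANTS:
        # "unknown" stays in: when the distro is not known, nothing is removed.
        modules += ["apt-pipelining", "apt-configure"]
    return modules
```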
bitfehlerthat is good guidance for me, thanks. i guess it's clear that "other" shouldn't be anything to rely on anyways, so i am just trying to understand how to best go about supporting another distro there18:40
rharperbitfehler: is the distro a variant of existing ones? or something altogether different ?18:40
Odd_BlokeYeah, my first reaction to that is that we should be defining this distro in cloud-init. :)18:41
rharperI think the best way to answer that is looking at the distro classes and their implementation of things like package install, and default file locations18:41
bitfehleras a fun fact, the variant of my distro (arch) as returned by system_info is actually "linux", which is not a valid input when using --distro= for ./setup.py18:41
bitfehlerhowever, when not setting anything, that's what gets passed to the template18:41
rharperarch doesn't have an os-release file ?18:41
rharperhttps://gist.github.com/natefoo/814c5bf936922dad97ff18:42
bitfehlerit does, and that is reflected somewhere in system_info result, but the variant is "linux"18:42
rharperthe python 'dist' info just says linux then ?18:43
rharperif there is extra info in either os-release or another file then we can update util.py:get_linux_distro() to parse that bit out18:44
bitfehlerno, dist is set correctly, but the variant is determined by checking for a few known values and everything else is "linux"18:44
bitfehler    if system == "linux":18:44
bitfehler        linux_dist = info['dist'][0].lower()18:44
bitfehler        if linux_dist in ('centos', 'debian', 'fedora', 'rhel', 'suse'):18:44
bitfehler            var = linux_dist18:44
bitfehler        elif linux_dist in ('ubuntu', 'linuxmint', 'mint'):18:44
bitfehler            var = 'ubuntu'18:44
bitfehler        elif linux_dist == 'redhat':18:44
bitfehler            var = 'rhel'18:44
bitfehler        elif linux_dist in (18:44
bitfehler                'opensuse', 'opensuse-tumbleweed', 'opensuse-leap', 'sles'):18:44
bitfehler            var = 'suse'18:44
bitfehler        else:18:44
bitfehler            var = 'linux'18:44
rharperbitfehler: we'd certainly accept a patch to help set the variant to arch18:44
bitfehlercool, can do18:45
rharperif that's found with get_linux_distro() method;18:45
rharperwhich it sounds like it does18:45
bitfehlertotally18:45
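A standalone version of the mapping pasted above, with an Arch branch added, might look like the following. This is a sketch, not the patch that was eventually sent: `variant_for` is a hypothetical function name, and the assumption that the detected distro name for Arch is `'arch'` is not confirmed in the channel.

```python
def variant_for(linux_dist):
    """Map a detected distro name to a template variant."""
    linux_dist = linux_dist.lower()
    if linux_dist in ('centos', 'debian', 'fedora', 'rhel', 'suse'):
        return linux_dist
    if linux_dist == 'arch':  # hypothetical addition, assumed distro name
        return 'arch'
    if linux_dist in ('ubuntu', 'linuxmint', 'mint'):
        return 'ubuntu'
    if linux_dist == 'redhat':
        return 'rhel'
    if linux_dist in (
            'opensuse', 'opensuse-tumbleweed', 'opensuse-leap', 'sles'):
        return 'suse'
    return 'linux'  # fallback for anything not listed above
```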
bitfehlerwhich leads me to my last question :) i figured it would be good to add some arch-specifics into the template generation as well then18:46
rharperfor sure18:46
bitfehlerand i was wondering what the approach should be there? run every module that can be reasonably expected to work on the distro?18:46
rharperand likely revisit the distros value in the config modules18:46
rharperbitfehler: yeah, you can force all modules to run or you could run each one via cloud-init single --force --name18:47
rharperhttps://wiki.archlinux.org/index.php/Cloud-init18:47
rharpereven mentions using unverified_modules18:48
bitfehleryeah, i guess it would be nice to get to the point where that is no longer necessary. i wasn't aware that the modules declare this as well, thanks for pointing that out18:50
bitfehlercool, thanks a lot for all that, i'll send a few patches your way soon :)18:51
