[05:02] Didn't know this existed until now.
[05:02] Can I use git clone inside a runcmd?
[05:03] and use said repo
[13:12] is it expected that 'selinux_user' in a cloud config on ubuntu would cause an exception ("useradd: -Z requires SELinux enabled kernel")
[13:14] factor: o/ There shouldn't be any problem doing that; are you seeing issues?
[13:15] nacc: \o I'm not sure, TBH, but a bug would be appreciated!
=== Odd_Bloke changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting July 22 16:15 UTC | cloud-init v 19.2 (07/17) | https://bugs.launchpad.net/cloud-init/+filebug
[13:16] Odd_Bloke: ack will get it filed
[13:16] Thanks!
[13:28] rharper: Do you know why cloud-init doesn't emit its logs to the journal? When trying to debug interactions between multiple units, it's quite annoying to have to keep two log files open and mentally interleave them.
[13:34] rharper: I filed https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1838032, so we can carry out the conversation there.
[13:34] Ubuntu bug 1838032 in cloud-init (Ubuntu) "cloud-init should emit its logs to the systemd journal" [Undecided,New]
[14:06] Odd_Bloke: it does log to the journal... journalctl -u cloud-init-local.service ?
[14:07] * rharper reads bug
[14:14] rharper: Clarifying in there now.
[14:15] Odd_Bloke: I understand now
[14:15] and we can discuss the defaults in upstream, in ubuntu, and I've added a comment to make a one line config change to get what you want
[14:16] the latter which is immediately useful to you
[14:16] rharper: Ack, thanks; I've clarified that I mean _all_ logs.
[15:08] Thanks for the response. Going to test it, have not yet. So I can git a puppet script to run as part of cloud-init.
[15:20] factor: You can definitely `git clone` the puppet script from a runcmd, yes.
And if you're asking about using the puppet cloud-init module, then that runs after runcmd so whatever you do should be on the system by the time it runs.
[15:24] Odd_Bloke, thanks
[16:17] rharper: So it looks like for that Azure growpart bug the udev event is missing information: https://bugs.launchpad.net/cloud-init/+bug/1834875/comments/24
[16:17] Ubuntu bug 1834875 in cloud-init "cloud-init growpart race with udev" [Medium,In progress]
[16:17] rharper: Do you have any suggestions as to where I should go looking next?
[16:18] yes, so if you look at growpart, you can see what it's doing with sgdisk,
[16:18] which is updating the table and setting values
[16:19] it looks like whatever is watching the disk (systemd) sees the write/close on the device, and fires a uevent where the partition data isn't present,
[16:20] Odd_Bloke: look at growpart at line 476
[16:20] Odd_Bloke: so I wonder if the version of sgdisk matters here
[16:21] it
[16:21] is effectively deleting the pt table, rewriting it, then calling partx --update
[16:22] to which, I think one *could* add a udevadm settle --exit-if-exists /path/to/partition
[16:23] but I thought your change added a settle right before the check on new size value;
[16:24] So --exit-if-exists only short-circuits the regular behaviour, AIUI.
[16:25] if update was triggered (partx --update) then a settle after that certainly should have already re-read pt table
[16:25] So if the queue is empty, it won't _wait_ for that file.
[16:25] Odd_Bloke: right; it just means to not wait the default 120 seconds if the path already exists
[16:25] it checks if it exists, then checks if the queue has anything in it
[16:26] and then will wait up to 120 seconds for queue to empty
[16:26] Right, which means `udevadm settle` will do whatever it would do; and it didn't address the issue. That said, I don't think tobijk used my patched cloud-init for these latest tests.
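The `--exit-if-exists` behaviour described in the exchange above can be modelled in a few lines of Python. This is only a toy model of what the conversation describes, not code taken from udev: the function name `settle`, the `queue_is_empty` callback and the polling interval are all hypothetical, and the 120-second default is the figure quoted in the chat.

```python
import os
import time
from typing import Callable, Optional


def settle(queue_is_empty: Callable[[], bool],
           exit_if_exists: Optional[str] = None,
           timeout: float = 120.0,
           poll_interval: float = 0.1) -> bool:
    """Toy model of `udevadm settle` as described in the chat (hypothetical).

    Returns True if the wait ended before the timeout, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # The path check only adds an early exit; it never adds a wait.
        if exit_if_exists is not None and os.path.exists(exit_if_exists):
            return True
        # An empty event queue ends the wait even if the path is still missing.
        if queue_is_empty():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

In this model, calling `settle(lambda: True, exit_if_exists="/dev/disk/by-partuuid/...")` returns immediately whether or not the symlink exists, which is the point being made: an empty queue means nothing waits for the file.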
[16:26] effectively, the sgdisk write new table info, partx --update will release the syscall to re-read partition table on the disk
[16:27] if we then call settle; the updated pt will generate change events
[16:27] and *those* should have the updated partition data in the event
[16:27] Why would settle cause more events to be generated?
[16:27] otherwise I don't see why a change event would occur; though when the partition is _deleted_ via sgdisk command, that will create a change event
[16:27] settle won't cause any events
[16:27] only sgdisk command will generate events
[16:28] the partx --update is a kick to ask the kernel to reread the partition table data (and then the kernel may emit update events if the table has changed since it last read it)
[16:29] so, a settle after those two commands have run, should ensure that once a CHANGE event for the updated ptable is emitted, and rules have run, that the symlink would be present
[16:30] it's not clear to me if the captured change event you're seeing is from when sgdisk is "clearing things out" or after it's written the new data and called partx --update
[16:30] Odd_Bloke: I can hangout if you want
[16:33] On both failing and succeeding instances, we see the same number of sda1 udev events: one add when the system comes up, one change when we resize, and one change later in boot (I think when systemd is reloaded?). After that second "change", both systems are in a good state.
[16:34] The order in which the sda{1,14,15} udev events are logged is different between the two systems; sda1 is first on the failing system, and last on the good system.
[16:35] So my theory is that the events are prompted by the clear-out, and we're then racing against the partition table being rewritten.
[16:35] and the timing of the second change from passing to failing ? timestamp wise; does this occur before or after the read ?
[16:35] In both cases the second change is 20-30s after the resize has happened.
[16:35] oh my
[16:36] so, let's test this if we can; in the check_size path, invoke 'partx --update' or blockdev --rereadpt /dev/sda; then udevadm settle
[16:36] this will *force* a reread of the table before we check size
[16:36] and settle should enforce rules complete if update detected a table change
[16:37] if that works, then it's likely that the partx --update after sgdisk returns isn't effective due to the processing of the change of removal
[16:39] As an aside, those second change events come after "systemd[1]: Reloading."
[16:53] it's not jumping out at me by looking at bddeb, but is there a way to tell the build to dump debs and other distribution artifacts in a `dist` dir?
[16:53] i see that it is gitignored so wondering if i'm just not seeing something
[16:54] rharper: Further confirmation that we aren't seeing the event due to the partx command; the growpart call specifies a single partition, and we're seeing events for all partitions.
[16:56] rharper: I'm heading to lunch; could you take a look at https://bugs.launchpad.net/cloud-init/+bug/1834875/comments/25 and let me know if that matches your understanding?
[16:56] Ubuntu bug 1834875 in cloud-init "cloud-init growpart race with udev" [Medium,In progress]
[17:22] Odd_Bloke: that looks right; reading partx source-code (https://github.com/karelzak/util-linux/blob/master/disk-utils/partx.c) I wonder if what we really want instead of partx --update /dev/sda 1 ; is just blockdev --rereadpt /dev/sda ;
[17:23] partx --update looks like it does more than just asking the kernel to reread the partition table on the disk.
[17:28] rharper: When I tried a `sudo blockdev --rereadpt /dev/sda` on a running system, I get "blockdev: ioctl error on BLKRRPART: Device or resource busy"
[17:28] So I don't know if we _can_ use that.
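The experiment proposed at 16:36 (force a partition table reread, then settle, before checking the size) could be sketched roughly as below. This is a minimal sketch under stated assumptions and is not cloud-init's actual code: `reread_commands` and `force_reread` are hypothetical names, the disk path is supplied by the caller, and the `partx` variant is used because `blockdev --rereadpt` returned "Device or resource busy" on a running system in the log.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def reread_commands(disk: str) -> List[List[str]]:
    """Commands for the proposed test: kick the kernel, then wait for udev."""
    return [
        ["partx", "--update", disk],  # ask the kernel to re-read the table
        ["udevadm", "settle"],        # wait for the udev event queue to drain
    ]


def _run_checked(cmd: Sequence[str]) -> None:
    # Raises CalledProcessError if the command exits non-zero.
    subprocess.run(list(cmd), check=True)


def force_reread(disk: str,
                 run: Optional[Callable[[Sequence[str]], object]] = None
                 ) -> List[List[str]]:
    """Run the reread commands in order (hypothetical helper).

    `run` is injectable so the sequence can be inspected without root.
    """
    runner = run if run is not None else _run_checked
    cmds = reread_commands(disk)
    for cmd in cmds:
        runner(cmd)
    return cmds
```

Calling `force_reread("/dev/sda")` with the default runner needs root and a real disk; passing `run=print` just shows the two commands in the order they would execute.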
[17:28] right
[17:29] that's why they used partx; uses a different syscall
[17:30] Odd_Bloke: I think it would be interesting to see partx --show /dev/sda ; before the --update and after
[17:30] OK, will include that.
[17:53] chillysurfer: Sorry, skipped over your question before. If I had to guess, I think dist/ is ignored for Python packaging reasons, not because that's where we expect deb/rpm packages to end up.
[17:53] chillysurfer: I think adding an option for output location would probably be a reasonable proposal, though, defaulting to the current behaviour.
[17:55] Odd_Bloke: yep that makes sense. i have a one-liner git and xargs command that just deletes those untracked files anyways
[17:55] so not a huge deal (as long as the build files are my only untracked ones)
[18:27] i was wondering - is there any "official" definition of what constitutes a variant "other" in the cloud.cfg template?
[18:28] there are several uses like `{% if variant in ["ubuntu", "unknown", "debian"] %}` or even `{% if variant in ["ubuntu", "unknown"] %}`
[18:28] e.g. enabling the apt-related modules
[18:29] which seems odd, i figured "other" would be a distro... well, _other_ than the ones that have specific values
[18:30] Ubuntu and Debian aren't the only distros that use debs (e.g. Mint).
[18:33] true, but still seems a far-reaching assumption? also, i thought mint was in the list as well?
[18:33] ah, not quite, but it seems mint is mapped to ubuntu already
[18:34] system_info() has something like
[18:34] elif linux_dist in ('ubuntu', 'linuxmint', 'mint'):
[18:34]     var = 'ubuntu'
[18:35] Fair enough.
[18:35] i mean, i am also not even trying to prove anything, i was just wondering. the apt thing was just an example.
if it really is meant to be "anything else" that's file
[18:35] fine :)
[18:36] bitfehler: no strict definition; I would say, other is something not currently defined
[18:36] we're opt-in here; the default being Ubuntu; and as distros and variants of a particular distro have need to deviate; those get introduced/added
[18:37] I guess the thinking is that if we know that we _don't_ know what distro we're on, we shouldn't remove things from the list.
[18:38] interesting point :)
[18:38] So `{% if variant in ["ubuntu", "unknown", "debian"] %}` is more like "we shouldn't run this if we know definitively we aren't on Ubuntu or Debian"
[18:40] that is good guidance for me, thanks. i guess it's clear that "other" shouldn't be anything to rely on anyways, so i am just trying to understand how to best go about supporting another distro there
[18:40] bitfehler: is the distro a variant of existing ones? or something altogether different ?
[18:41] Yeah, my first reaction to that is that we should be defining this distro in cloud-init. :)
[18:41] I think the best way to answer that is looking at the distro classes and their implementation of things like package install, and default file locations
[18:41] as a fun fact, the variant of my distro (arch) as returned by system_info is actually "linux", which is not a valid input when using --distro= for ./setup.py
[18:41] however, when not setting anything, that's what gets passed to the template
[18:41] arch doesn't have an os-release file ?
[18:42] https://gist.github.com/natefoo/814c5bf936922dad97ff
[18:42] it does, and that is reflected somewhere in system_info result, but the variant is "linux"
[18:43] the python 'dist' info just says linux then ?
[18:44] if there is extra info in either os-release or another file then we can update util.py:get_linux_distro() to parse that bit out
[18:44] no, dist is set correctly, but the variant is determined by checking for a few known values and e/t else is "linux"
[18:44] if system == "linux":
[18:44]     linux_dist = info['dist'][0].lower()
[18:44]     if linux_dist in ('centos', 'debian', 'fedora', 'rhel', 'suse'):
[18:44]         var = linux_dist
[18:44]     elif linux_dist in ('ubuntu', 'linuxmint', 'mint'):
[18:44]         var = 'ubuntu'
[18:44]     elif linux_dist == 'redhat':
[18:44]         var = 'rhel'
[18:44]     elif linux_dist in (
[18:44]             'opensuse', 'opensuse-tumbleweed', 'opensuse-leap', 'sles'):
[18:44]         var = 'suse'
[18:44]     else:
[18:44]         var = 'linux'
[18:44] bitfehler: we'd certainly accept a patch to help set the variant to arch
[18:45] cool, can do
[18:45] if that's found with get_linux_distro() method;
[18:45] which it sounds like it does
[18:45] totally
[18:46] which leads me to my last question :) i figured it would be good to add some arch-specifics into the template generation as well then
[18:46] for sure
[18:46] and i was wondering what the approach should be there? run every module that can be reasonably expected to work on the distro?
[18:46] and likely revisit the distros value in the config modules
[18:47] bitfehler: yeah, you can force all modules to run or you could run each one via cloud-init single --force --name
[18:47] https://wiki.archlinux.org/index.php/Cloud-init
[18:48] even mentions using unverified_modules
[18:50] yeah, i guess it would be nice to get to the point where that is no longer necessary. i wasn't aware that the modules declare this as well, thanks for pointing that out
[18:51] cool, thanks a lot for all that, i'll send a few patches your way soon :)
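The patch discussed above (mapping Arch to its own variant instead of falling through to "linux") might look something like the following. This is a sketch only, not the change that was actually submitted: `variant_for` is a hypothetical standalone function built from the snippet pasted at 18:44, and it assumes the detected distro name for Arch is the string 'arch', which the log does not confirm.

```python
def variant_for(linux_dist: str) -> str:
    """Map a detected distro name to a template variant (hypothetical helper)."""
    linux_dist = linux_dist.lower()
    # 'arch' is the only addition to the list pasted in the log; it assumes
    # the detected distro name for Arch Linux is 'arch'.
    if linux_dist in ('arch', 'centos', 'debian', 'fedora', 'rhel', 'suse'):
        return linux_dist
    if linux_dist in ('ubuntu', 'linuxmint', 'mint'):
        return 'ubuntu'
    if linux_dist == 'redhat':
        return 'rhel'
    if linux_dist in (
            'opensuse', 'opensuse-tumbleweed', 'opensuse-leap', 'sles'):
        return 'suse'
    # Anything unrecognised keeps the generic fallback.
    return 'linux'
```

With this shape, `variant_for('Arch')` gives `'arch'` while unrecognised distros still fall back to `'linux'`, so existing template checks keep their current behaviour.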