[00:01] right, I guess if all interfaces are dhcp, we would drop the dns on the floor in those cases
[00:02] yes
[00:02] expecting dhcp to provide the dns
[00:04] right, ok that seems reasonable
[00:06] so rharper, does it sound reasonable to emit a warning during network_data.json -> v2 conversion if we have global dns and no applicable interfaces?
[00:07] at least it points to an impossible config presented to the instance
[00:09] blackboxsw: yeah, I think if we find we have dns in services, but no interface to attach it to, then a warning makes sense
[00:19] ok will do
[00:19] thx
[01:10] bug 1841697 probably needs attention
[01:10] bug 1841697 in cloud-init (Ubuntu) "Upgrade corrupts 90_dpkg.cfg" [Undecided,New] https://launchpad.net/bugs/1841697
[01:15] db_get cloud-init/datasources
[01:34] sed -n -e /^datasource_list:/!d -e s/datasource_list:[ \[]*// -e s, \]$,, -e p /etc/cloud/cloud.cfg.d/90_dpkg.cfg
[01:34] smoser: that doesn't like it when there are no spaces between the leading [ and trailing ]
[01:34] so: [ NoCloud, None ] is OK, but [NoCloud, None] is not
[08:26] has anyone encountered a bug where write_files creates a dir instead of a file, regardless of whether there is content?
[12:51] rharper: My only reason for raising it here is that it happened as a result of an SRU. I don't know if anything changed in that area (I thought a new datasource or two was added), so I thought it might be a regression as a result.
[12:52] if it is just a matter of a person editing that file and writing stuff that the crappy sed parser didn't like, then it's nothing new, and while it should be fixed it isn't terribly important.
[14:05] smoser: it's not an sru issue
[14:05] it was local config
[14:06] ivve: no, if you have user-data that you can share to demonstrate, please file a bug and we can look at it
[14:06] smoser: thanks for the heads-up
[14:10] rharper: yeah, i saw that and your mp. responded to the mp with a suggestion.
[14:10] oh, thanks
[14:10] rharper: ok thanks
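Picking up the 01:34 sed: the failure mode with unspaced brackets is easy to reproduce. This is a minimal sketch (the temp-file paths are invented for the demo); the final 's, \]$,,' expression only strips the closing bracket when a space precedes it, which is why [NoCloud, None] leaks a trailing "]".

    printf 'datasource_list: [ NoCloud, None ]\n' > /tmp/90_dpkg.spaced
    printf 'datasource_list: [NoCloud, None]\n' > /tmp/90_dpkg.tight
    for f in /tmp/90_dpkg.spaced /tmp/90_dpkg.tight; do
        sed -n -e '/^datasource_list:/!d' \
               -e 's/datasource_list:[ \[]*//' \
               -e 's, \]$,,' -e p "$f"
    done
    # prints "NoCloud, None" for the spaced file, but "NoCloud, None]" for the tight one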
[14:14] ivve: thinking about it, write_files in cloud-init does an effective mkdir -p on the dirname of the path to the file; and if the write failed, you would see the dir but not the content; there should be an error in /var/log/cloud-init.log
[14:15] that's exactly what happens
[14:15] 4 out of 8ish files become directories. i think it started when i fiddled with the default user, but i'm not sure.. this template is 2500 lines with well over 100k chars
[14:17] well, not entirely true
[14:17] it creates a dir in place of the file
[14:17] say i have /path/to/file, it creates /path/to/file/
[14:17] so instead of a file called file, i have a dir called file
[14:18] yeah, there should be an error in cloud-init.log, and if you have just the write_files bits, that'd be helpful to reproduce (I don't need the actual content of the files, just a version with similar paths, etc)
[14:19] aye, i think i can provide originals
[14:19] i'll just make a last check
[14:25] ok, think i found it
[14:26] File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1418, in chownbyname
[14:26] guess i'll try with root:root
[14:27] i think i had that before and it worked
[14:36] ivve: I see; if you don't provide a user:group in the write_files content, then it will use the default user and default permissions, which is root:root and 0644
[14:40] rharper: i provided elasticsearch:elasticsearch, which is created in the users section in cloud-init along with the default
[14:42] ivve: ok, please file a bug and we can sort out what's going on
[14:42] the log should show a write failure, but it's written by root (cloud-init) and then chmod'ed to match the config specified, so it's quite strange to have a missing file here
[14:42] i was checking for spelling errors in elasticsearch but couldn't find any, i will try deploying with root:root and see if that's the problem
[14:43] perhaps the group isn't created
[14:43] damn myself for doing too many changes without commits
[14:45] hmm, think i found the error now
[14:45] no_user_group: true
[14:45] and write_files is using the group of the user
[14:49] will verify, recreating the stack
[14:50] but it's odd, because the group exists in groups
[14:50] elasticsearch:x:1000:elasticsearch
[14:52] interesting
[14:52] so initially i think i am at fault here
[14:52] but i'm surprised as well :)
[14:53] trying root:root now anyway, just to rule out the error
[14:53] i can always chown in runcmd
[14:56] blackboxsw: I saw your branch landed - nice! Thanks. Do you know when the code is planned to hit eoan, out of curiosity?
[14:59] rharper: looks like that was the problem. however, the group and user do exist
[15:05] and now i know what creates the directories
[15:05] it's docker-compose
[15:05] mounting stuff that doesn't exist
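For readers following the write_files thread above, a hypothetical user-data sketch of the pattern being described (names, paths and content are invented here, not ivve's actual template). In the default module order write_files runs before users-groups, so an owner created in the users section may not exist yet when the file is written; the file itself is still written by root, and it is the later chownbyname step that can fail. The schema command at the end is just a local syntax sanity check.

    cat > /tmp/user-data.yaml <<'EOF'
    #cloud-config
    users:
      - default
      - name: elasticsearch
        no_user_group: true        # no group named after the user gets created
    write_files:
      - path: /path/to/file
        owner: elasticsearch:elasticsearch   # chownbyname fails if user/group is missing
        permissions: '0644'
        content: |
          placeholder content
    EOF
    cloud-init devel schema --config-file /tmp/user-data.yaml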
[16:01] tribaal: I expect I'm uploading today
[16:01] probably within the next 2 hours. there are another 1-2 branches of ryan's that I'd like to get in too
[16:02] one of them should land in ~15 mins and I'm reviewing the second now
[16:02] tribaal: once uploaded, it won't be in official cloud images until tonight, as you know
[16:02] ~tonight :)
[16:03] blackboxsw: sure, I'll run an apt upgrade in our image thingy for preprod
[16:03] it won't be *official* but that should be good enough
[16:03] (for preprod at least)
[16:03] +1
[16:09] https://paste.ubuntu.com/p/Xx4NpDN4SB/
[16:09] blackboxsw: ^^
[16:10] so yeah, v2 to sysconfig is going to write out the DNS0 values, so we may want to confirm if we need to have a resolvconf-only flag
[16:10] +1 rharper
[16:23] rharper: the master ifcfg template @ https://github.com/openSUSE/sysconfig/blob/master/config/ifcfg.template doesn't mention a supported DNS* config option.
[16:23] that code I referenced can't possibly be the source of truth (too old)
[16:24] but it was the best reference I'd found so far
[16:25] also, cat /etc/sysconfig/network/ifcfg.template on my opensuse leap 15.1 lxc is showing basically the same: no DNS* config options allowed per interface
[16:26] so I'm going with your suggestion, we'll need a resolvconf-only flag for suse to handle net config v2 'global' dns
[16:27] I don't see robjo around
[16:27] yeah, I was trying to tab-complete his nick :/
[16:28] might send a message to the mailing list once the v2 branch is up
[16:28] get some input if he has time
[16:31] y
[16:44] I'm installing cloud-init on OL 7.7; is there any reason that the user home directory wouldn't have the permissions needed to be created?
[17:00] gcstang: sorry; cloud-init runs as root, so it has permissions to create directories in /home
[17:01] what do you expect to happen, and what is actually happening?
[17:02] I have the user set up in cloud.cfg, and during boot the cloud-init.service shows an error that it couldn't run mkdir(name, perm) when creating my user's home directory in /home/
[17:03] gcstang: I think, if possible, we'd like to see your userdata reproducing this failure in a bug. Run the following on your system: sudo cloud-init query userdata (and make sure it doesn't have any secrets). If possible, file a bug at https://bugs.launchpad.net/cloud-init/+filebug and attach the tar.gz file emitted by "sudo cloud-init collect-logs"
[17:09] blackboxsw the query command returns empty
[17:10] gcstang: ok, so on your vm you didn't provide any custom user-data then?
[17:10] like providing #cloud-config to the vm
[17:10] at launch time
[17:10] like the metadata?
[17:11] gcstang: correct, I was wondering if you were trying to change default behavior on the vm you are launching to add new (non-default) users
[17:11] by providing additional configuration metadata to the instance (we call that user-data)
[17:12] the metadata should be supplied by our URL http://169.254.169.254/opc/v1/instance/metadata/ but I don't think cloud-init is finding that
[17:13] the only thing provided is the ssh_authorized_keys
[17:13] blackboxsw otherwise the only configuration is in /etc/cloud/cloud.cfg
[17:14] gcstang: ok, so sudo cloud-init collect-logs will help the most I think, as it'll scrape /var/log/cloud-init.log for us (which should have logged the 169.254 crawl cloud-init tried).
[17:15] it will also show the tracebacks of the user dir creation. (I'm wondering if it's the opc default user on ubuntu that is causing an issue here)
[17:26] blackboxsw tried submitting a bug several times and I get
[17:26] Timeout error: Sorry, something just went wrong in Launchpad. We've recorded what happened, and we'll fix it as soon as possible. Apologies for the inconvenience. Trying again in a couple of minutes might work. (Error ID: OOPS-44fe7d797c5d2d2342e5ea3320e3fc05)
[17:28] interesting, gcstang. I've relayed that to our IS department to check it out. gcstang can you see/edit this bug? https://bugs.launchpad.net/cloud-init/+bug/1841816
[17:28] Launchpad bug 1841816 in cloud-init "OL 7.7 fails to create a user" [Undecided,New]
[17:29] blackboxsw I can
[17:30] all yours gcstang, edit at will :)
[17:30] won't let me attach, same failure
[17:35] blackboxsw I was able to attach the logs and my cfg
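The triage steps suggested in the OL 7.7 exchange above, collected in one place; all of these are standard cloud-init CLI, and the metadata URL is the Oracle Cloud endpoint quoted at 17:12 (run them on the affected instance):

    sudo cloud-init query userdata     # any #cloud-config supplied at launch (empty in this case)
    sudo cloud-init collect-logs       # bundles /var/log/cloud-init*.log and config into a tarball
    curl -s http://169.254.169.254/opc/v1/instance/metadata/   # the OPC metadata endpoint cloud-init should crawl
    grep -i -A10 traceback /var/log/cloud-init.log             # surface the mkdir/user-creation failure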
[19:50] rharper: followup sed for https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/371919
[19:50] thx
[19:50] see what you think. I want to avoid a python dependency in that shell script, and avoid a heavy copy/paste/hoist of 37+ lines of code from ds-identify
[19:51] blackboxsw: I don't see an update on the MP, did you update the bug or the MP?
[19:51] haha, unsaved comments can't be seen.
[19:51] fixed
[19:51] cool
[19:51] I agree w.r.t. python or the bigger shell fix
[19:52] the resulting sed should support datasource_list
[19:53] I can't deny I like the simplicity of the python -c =)
[19:54] the following examples: ds_list : [1,2,3]  ds_list: [ 1,2,3 ]  \s*ds_list: [1 ,2, 3]  etc
[19:54] yeah I agree rharper on the python call, just didn't want to have to sort python2 vs python3 deps in the script too
[19:54] or have to fall back to another option if the dep wasn't available at the time for some reason.
[19:54] yaml wasn't always standard lib, right?
[20:10] rharper: right, it wasn't initially. ~2013 maybe?
[20:23] I just tested that sed suggestion on lxc, purging cloud-init, adding datasource_list: [1,2,3] to /etc/cloud/cloud.cfg.d/90_dpkg.cfg, and it gets correctly reformatted to datasource_list: [ 1, 2, 3 ]
[20:30] rharper: I was hoping to hold restarting the cloud-init SRU on the inclusion of this: https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/371919 .
[20:30] Do you think we should wait for it, or just restart the SRU with the exoscale fixes?
[20:31] hrm
[20:31] my initial thought is to not wait, as tonight or tomorrow will bring another thing to pick up
[20:31] I also don't think the error is that common, it requires manual manipulation of the file
[20:32] ok, will kick off an upload to Eoan of cloud-init tip then
[20:33] * blackboxsw wants one quick manual ec2 test on current SRU, just to make sure the world
[20:33] blackboxsw: yeah, looks like just two commits, the exoscale and the oracle vnics bits
[20:33] didn't break
[20:33] yeah
[20:49] rharper: [ubuntu/eoan-proposed] cloud-init 19.2-24-ge7881d5c-0ubuntu1 (Accepted)
[20:49] tribaal: ^
[20:49] that'll have the exoscale fix
[20:50] * blackboxsw now tries to regen an SRU including this content too
[20:51] great
[20:55] and the xenial/bionic/disco -proposed SRU upgrade tests on ec2 don't error out (there are no specific SRU bugs/code related to EC2 in this SRU)
[21:15] rharper: here's what I was going to do debian/changelog-wise for updating the existing SRU
[21:15] https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/371959
[21:17] so I've added a second debian/changelog stanza for the incremented new version. both reference the current SRU bug number
[21:17] I'm not referencing the upstream bug id, because we have an SRU exception to handle validating that ourselves
[21:18] "the upstream bug id" == for the exoscale fix
[21:18] #1841454
[21:18] bug #1841454
[21:18] bug 1841454 in cloud-init "Exoscale datasource overwrites *all* cloud_config_modules" [Undecided,Fix committed] https://launchpad.net/bugs/1841454
[21:19] blackboxsw: hrm, so should the changelog have two entries, each with the SRU bug #?
[21:19] I guess we already pushed to ubuntu/disco
[21:20] rharper: I think I only pushed to chad.smith/ubuntu/disco. I thought steve suggested that on a previous sru-regression fix we performed. I can separate it out, but then we have 2 changelog entries in the most recent commit that don't have bugs
[21:20] blackboxsw: you may be right, I'm just asking
[21:20] I can double-check in #ubuntu-devel
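A rough sketch of the "python -c" approach weighed above at 19:53 (illustrative only, not the code in the MP): it reads the list with a real YAML parser, so any of the bracket spacings listed at 19:54 parse the same way. The catch is exactly the dependency concern raised, since it needs a python3 with PyYAML available on the target.

    # Prints the datasource list regardless of bracket spacing, unlike the sed.
    python3 -c 'import sys, yaml
    cfg = yaml.safe_load(open(sys.argv[1])) or {}
    print(", ".join(str(ds) for ds in cfg.get("datasource_list", [])))' \
        /etc/cloud/cloud.cfg.d/90_dpkg.cfg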
[21:29] trying to build an rpm from cloud-init source, i'm getting the error KeyError: u'RPM_BUILD_ROOT'. i've tried setting the env var to a path on my machine, but it's still not working. any thoughts?
[21:31] chillysurfer: I suggest using our tools/run-container
[21:32] chillysurfer: if you want to do it outside, then you want to use packages/brpm
[21:32] rharper: yep, that's the script i kicked off, packages/brpm
[21:32] and i get that key error
[21:33] and you have rpm-build and such installed?
[21:33] nope!
[21:33] i'm guessing that's the problem
[21:33] is there a list of deps for brpm?
[21:34] chillysurfer: not in there, but in tools/run-container
[21:34] rharper: the reason i'm doing this exercise is that it looks like our rpm doesn't inject the PACKAGED_VERSION into version.py
[21:34] ah, it uses tools/read-dependencies
[21:35] chillysurfer: I would suggest you use tools/run-container
[21:35] rharper: so it falls back to just VERSION without the package information
[21:35] that's how we create our srpms for upload to copr
[21:35] and trying to unwind why that's happening
[21:35] ok cool, i'll do tools/run-container
[21:35] so, you can look at tools/run-centos and see how that calls tools/run-container
[21:36] our ci does this: ./tools/run-centos 7 --srpm --artifact
[21:36] which sets up a centos7 lxd container, puts in the cloud-init source, runs all of the steps to prep the container environment, then calls brpm with the right flags, and will pull out the srpm to your dir when done
[21:37] rharper: awesome, i'll definitely do that
[21:37] rharper: dumb question... what controls the rhel repos? if this is for dumping to copr for centos/fedora, what handles the pkg distribution for rhel regarding cloud-init?
[21:38] chillysurfer: we have a copr project repo for centos; we've no access to the downstream repos
[21:39] our CI publishes daily rpms for centos7; I've a branch to fix up our py3 build so we can publish on newer fedoras and centos8, but that needs to get reviewed/landed
[21:39] https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/368845
[21:40] we maintain two other repos on copr, el-stable and el-testing; el-stable hasn't been updated in quite some time, mostly for centos6 support; el-testing gets any of our new releases (19.1, 19.2) and whenever we do an SRU
[21:42] chillysurfer: the downstreams decide what to pick up and when; depending on the downstream, some pick up newer cloud-inits at the point releases (19.1, 19.2, etc.), others keep an older release and then backport fixes.
[21:42] ah, i see
[21:42] that makes sense
[21:42] rharper: do they pick up source packages?
[21:43] I suspect they check out the release tag from git
[21:43] we tag and sign point releases
[21:43] rharper: ahhh, i see. so in that case you aren't surprised that packages/redhat/cloud-init.spec.in isn't applied to the pkg?
[21:44] rharper: the reason i'm bringing this all up is that `sed -i "s,@@PACKAGED_VERSION@@,%{version}-%{release}," $version_pys` doesn't seem to be happening, which is specified in cloud-init.spec.in for the rpm
[21:45] the spec file is a template, so it gets rendered during the install, IIRC
[21:46] chillysurfer: see packages/brpm; it calls tools/read-version and then runs generate_spec_contents()
[21:46] rharper: right, exactly. but if the downstream maintainer of cloud-init isn't calling packages/brpm, that would explain why the version isn't getting injected, right?
[21:46] they don't use our spec file
[21:47] rharper: they use their own spec file? that would explain why PACKAGED_VERSION remains unchanged
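For anyone retracing the rpm-build exchange above (21:31-21:36), the CI path rharper describes reduces to the commands below, run from a cloud-init source checkout on a host with LXD available. The yum line is an assumption about the local build host, included because the missing rpm-build package was the suspected cause of the RPM_BUILD_ROOT KeyError.

    ./tools/run-centos 7 --srpm --artifact   # builds inside a centos7 lxd container, copies the .src.rpm out
    # Building locally with packages/brpm instead needs the rpm toolchain installed first:
    yum install -y rpm-build                 # assumption: a yum-based build host
    ./packages/brpm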
[21:48] chillysurfer: sounds like it could be a downstream bug in their build spec
[21:48] it's not always clear they want cloud-init --version to match what the package version value is
[21:48] ok rharper, vorlon confirmed on reusing the existing SRU bug
[21:48] will push xenial and bionic for SRU re-review
[21:48] for example, if they're backporting features into an older release, they may not want to bump the version value that cloud-init returns
[21:49] blackboxsw: ok, thanks
[21:52] blackboxsw: I'm going to run an errand in just a few minutes; when I get back, I'll review the bionic/xenial MPs if you've got them up
[21:54] thanks rharper. bionic: https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/371962 xenial: https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/371963
[21:57] rharper: dugtrio running out of space? https://jenkins.ubuntu.com/server/job/cloud-init-ci/1088/console
[21:57] maybe
[21:57] couldn't create chroot
[21:57] * blackboxsw logs in to see what we need to clean up
[21:57] no, maybe just not configured
[21:58] that job typically ran on tork, but the queue has been sprayed across additional nodes
[21:58] so this looks like setup fallout
[21:58] blackboxsw: send paride a note
[21:58] maybe you can reschedule that job on tork
[21:58] rharper: yeah, I'll try a reschedule, and will send a note
[22:04] rharper: ah ok, that's really good information
[22:21] blackboxsw: ack. I'll spin an Eoan template now and will torture it tomorrow
[22:26] rharper: what's the copr repo for cloud-init?
[22:26] chillysurfer: daily builds are https://copr.fedorainfracloud.org/coprs/g/cloud-init/cloud-init-dev/
[22:27] testing: https://copr.fedorainfracloud.org/coprs/g/cloud-init/el-testing
[22:28] we'll be updating testing today/tomorrow, but generally we don't update it except when we are performing an ubuntu stable release update into xenial, bionic, disco (>= monthly)
[22:44] blackboxsw: great, thanks!
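To actually consume the copr repos listed at 22:26-22:27, something along these lines should work; the plugin invocation is from memory and the exact syntax can vary by distro release, so treat it as a sketch rather than the documented install path.

    # Fedora / CentOS 8 style, using the dnf copr plugin:
    dnf install -y dnf-plugins-core
    dnf copr enable @cloud-init/cloud-init-dev   # daily builds
    # or: dnf copr enable @cloud-init/el-testing # release/SRU candidate builds
    dnf install -y cloud-init
    # CentOS 7 uses yum-plugin-copr and "yum copr enable" for the same thing.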