[10:08] <_NiC> If I have an existing user 'debian' in my image, and in /etc/cloud/cloud.cfg I have system_info: -> default_user: -> name: debian, will the settings from cloud.cfg not be applied because the user already exists? Because the settings there are not applied. :-\ (debian 9, cloud-init 0.7.9-2)
[10:23] <_NiC> oh, perhaps it's skipped because I'm specifying another user in my user-data, and not also explicitly telling it to create the default user as well?
[12:40] _NiC: perhaps! is the user from user-data created?
[12:40] <_NiC> meena, yes. and I didn't have "- default" listed under users:
[12:41] <_NiC> meena, so from what I could gather, that meant to *not* care about the default_user as defined in cloud.cfg. So when I added "- default" it worked as expected again. :)
[12:41] 👍
[12:42] <_NiC> When running the openstack create command, you get sent back an 'adminPass' though, and I tried to figure out if that could be injected somehow, but not easily it seems, so I ignored that as well. I have my custom user, so it's all good. :)
[14:07] hello everyone: we've got a pipeline that builds custom cloud images for openstack... I've added centos8 support and I'm having trouble with adduser for our custom user... on all the other builds, stamping in an updated cloud.cfg that replaces the centos user with our user works ok, but with centos8 it doesn't... that user never gets added and our pipeline fails... any ideas on what's going on?
[14:11] Hello
[14:13] I have a VMware instance where I spawn RancherOS with Terraform. The issue is that in this environment DHCP is disabled, so I need to provide network configuration. The question is: should I do this by metadata or by cloud-config? What happens if I have it defined in both? Which one is taken?
[14:26] Hi :) I want to deploy some cloud servers at Hetzner. I plan to run 3-4 PrestaShops on them. I think nginx+php-fpm is the best and then a separate mysql server.
So my question: does anyone have a good cloud-init file for nginx + php-fpm that would be suitable for running prestashop? PHP 7.2 is needed
[14:41] <_NiC> onklmaps, my knowledge of cloud-init is pretty limited, but I suspect you don't want to try to do *all* of your config using cloud-init
[14:41] <_NiC> onklmaps, rather use cloud-init to get a good base, and use something else to configure the rest. (such as ansible, puppet, chef, or similar)
[14:42] Ahh. That makes sense.. You know any good base for nginx + php-fpm?
[14:44] <_NiC> I'm not familiar with prestashop either, but unless its requirements are very specific, any standard setup should work
[14:44] <_NiC> if you need high performance etc etc you should probably look into various forms of caching, but that's a topic for another channel I think.. :-)
[14:44] It's pretty standard, yes.. I have some custom things that need to go into the server block, but other than that it's pretty standard
[14:45] Yeah - caching... I know.. It's a big subject
[14:45] I just installed nginx as reverse proxy in front however. I might do static cache there?
[14:46] <_NiC> that's possible
[14:47] PrestaShop is much easier to set up with apache.. I might just serve the prestashops using apache + php-fpm, then use nginx reverse proxy with static cache in front
[14:47] What do you think?
[14:52] <_NiC> Not sure I have an opinion about that :-)
[15:19] onklmaps: I think php is a bad idea.
[17:23] hey folks. let's do this....
[17:23] #startmeeting Cloud-init bi-weekly status
[17:23] Meeting started Tue Feb 4 17:23:28 2020 UTC. The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.
[17:23] Available commands: action commands idea info link nick
[17:23] morning, afternoon and evening folks.
Time for another cloud-init community status meeting
[17:24] #chair rharper
[17:24] Current chairs: blackboxsw rharper
[17:24] #chair Odd_Bloke
[17:24] Current chairs: Odd_Bloke blackboxsw rharper
[17:24] Cloud-init upstream uses this meeting as a platform for community updates, feature/bug discussions, and an opportunity to get some extra input on current development.
[17:24] The next scheduled status meeting is always listed in the topic of this channel, so feel free to drop in on the next session if you miss this one
[17:25] while we're at it I'll update for next status meeting.
=== blackboxsw changed the topic of #cloud-init to: pull-requests https://git.io/JeVed | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting February 16 17:15 UTC | 19.4 (Dec 17) drops Py2.7 : origin/stable-19.4 | 20.1 (Feb 18) | https://bugs.launchpad.net/cloud-init/+filebug
[17:26] 2 weeks from today, same bat time, same bat channel
[17:26] Our previous meeting minutes line here:
[17:26] #link https://cloud-init.github.io/
[17:26] *live here* rather
[17:27] the topics we cover in this meeting are the following: Previous Actions, Recent Changes, In-progress Development, Community Charter, Upcoming Meetings, Office Hours (~30 mins).
[17:27] new topics or interjections are always welcome
[17:27] #topic Previous Actions
[17:27] From last meeting we had no unresolved actions so we can jump to the next section
[17:27] #topic Recent Changes
[17:29] found from tip of master with `git log --since 01/21/2020`
[17:29] - sysconfig: distro-specific config rendering for BOOTPROTO option (#162)
[17:29] [Robert Schweikert] (LP: #1800854)
[17:29] - cloudinit: replace "from six import X" imports (except in util.py) (#183)
[17:29] - run-container: use 'test -n' instead of 'test !
-z' (#202)
[17:29] [Paride Legovini]
[17:29] - net/cmdline: correctly handle static ip= config (#201)
[17:29] [Dimitri John Ledkov] (LP: #1861412)
[17:29] Launchpad bug 1800854 in cloud-init "BOTOPROTO handling between RHEL/Centos/Fedora and SUSE distros is different" [Medium,Triaged] https://launchpad.net/bugs/1800854
[17:29] - Replace mock library with unittest.mock (#186)
[17:29] - HACKING.rst: update CLA link (#199)
[17:29] Launchpad bug 1861412 in cloud-init (Ubuntu) "cloud-init crashes with static network configuration" [Undecided,Fix committed] https://launchpad.net/bugs/1861412
[17:29] - Scaleway: Fix DatasourceScaleway to avoid backtrace (#128)
[17:29] [Louis Bouchard]
[17:29] - cloudinit/cmd/devel/net_convert.py: add missing space (#191)
[17:29] - tools/run-container: drop support for python2 (#192) [Paride Legovini]
[17:29] - Print ssh key fingerprints using sha256 hash (#188) (LP: #1860789)
[17:29] - Make the RPM build use Python 3 (#190) [Paride Legovini]
[17:29] Launchpad bug 1860789 in cloud-init (Ubuntu) "ssh_authkey_fingerprints must use sha256 not md5" [Undecided,Fix committed] https://launchpad.net/bugs/1860789
[17:29] thought we were going to use pastebin :P
[17:29] heh, that is a good point (I wondered if anyone would call me on that)
[17:30] #link https://paste.ubuntu.com/p/3jQdKZVPcM/
[17:31] generally speaking, dropping use of six since our code base is now python3-only, tooling dropping py2, sysconfig rendering flavors for opensuse, doc fixes and read the docs fixups
[17:32] thanks all for the contributions over the last couple weeks
[17:32] #topic In-progress Development,
[17:32] #topic In-progress Development
[17:32] Any existing PRs are up for review at the following url:
[17:33] #link https://github.com/canonical/cloud-init/pulls
[17:33] generally speaking we are in the 'long tail' part of a couple of feature-sets:
[17:34] * we are trying to wrap up tooling for our automated CI, publishing processes and documentation for the shift to
github from launchpad
[17:34] * we are in progress on cloud-init handling network hotplug for a couple of datasources
[17:35] * in progress on boot speed improvements for various platforms
[17:36] We also recently validated and released cloud-init v 19.4.33 to Xenial, Bionic and Eoan (1/9/2020)
[17:38] Hi @blackboxsw I'm no longer in the provisioning team, but there's an urgency for the cloud test to be resilient. Have you looked at those issues? I can dedicate as much time as needed to this. If you have time, can we tackle this today?
[17:38] there are also a number of PRs in flight for FreeBSD, NetBSD, OpenSUSE and CentOS that need attention so we can better enable those distros
[17:38] azurecloudtest that is
[17:39] hi ahosmanMSFT I can spend some time on office hours here to peek more at it. my individual runs didn't hit the timeouts again, so we might need a reproducer cmdline from you in a new bug maybe?
[17:39] So you're able to run all tests successfully without timeout and image not building?
[17:40] ahosmanMSFT: but yes I can spend a little time on this today. and I think ultimately we'll have to find the tox command line that exhibits this error. I'll go check out my test run again and see. I don't think I saw the failure. but I might be invoking tests differently than you
[17:41] blackboxsw: hmm that's interesting, thanks let me know
[17:41] same here ahosmanMSFT, can you file a bug with the traceback you see and the tox cmdline you are running?
[17:41] then I know exactly what to look for
[17:42] Sure, will do now
[17:43] cool.
[17:43] ok next topic
[17:43] #topic Community Charter
[17:43] ok this section is reserved to raise general community work/goals.
[17:44] At last cloud-init summit we raised a couple of general themes of improvements cloud-init would like to achieve
[17:45] These themes fell into two categories for this year: datasource documentation updates and cloud-init json schema validation for the 50+ config modules in cloudinit/config/cc_*.py so that we can better raise user-config errors and remove some of cloud-init's "sharp edges"
[17:45] we converted a number of these feature requests into bugs which can be searched here:
[17:46] #link https://bugs.launchpad.net/cloud-init/+bugs?field.tag=bitesize
[17:46] tasks in this list should be fairly easy one-time bugs for folks with a little time available to help improve cloud-init.
[17:47] we'll revisit this set of bugs/features and the community charter goals near the end of 2020 at the next cloud-init summit
[17:47] #topic Office Hours (next ~30 mins)
[17:48] this time is spent with cloud-init upstream dev eyes on this channel for any cloud-init feature, bug or implementation discussions. In the absence of such discussions, we'll review the active PRs to try to tidy up the review queue and unblock developers
[17:49] for the moment, I'll look over some Azure test timeouts ahosmanMSFT is seeing
[17:49] any other topics, concerns, bugs, questions are welcome and someone should be around to field them
[17:50] ahosmanMSFT: so timeouts running integration tests, you said you are getting them about half the time?
[18:48] blackboxsw: Yes, I tracked it down to platforms/instance._wait_for_system
[18:49] I invoke it after initializing vm in platform/azurecloudtest/instance.start
[18:49] when removed, everything works as expected
[18:50] looks like it's needed for cloud tests so thought I'd leave it to you, since I don't know how ec2/lxd/... rely on it
[18:50] ahosmanMSFT, can you file a bug please with the cli example?
[18:50] that would help us triage and make a proper decision on what change to make
[18:51] powersj, yes, was in the middle of that, got sidetracked by the meeting. On it now
[18:51] thanks!
[19:07] aaaand, I should probably wrap the meeting for the day.
[19:08] Thanks all for the time and energy you put into improving cloud-init! See you next time, or anytime in between
[19:08] #endmeeting
[19:08] Meeting ended Tue Feb 4 19:08:14 2020 UTC.
[19:08] Minutes: http://ubottu.com/meetingology/logs/cloud-init/2020/cloud-init.2020-02-04-17.23.moin.txt
[19:27] blackboxsw, powersj: submitted the bug https://bugs.launchpad.net/cloud-init/+bug/1861921
[19:27] Ubuntu bug 1861921 in cloud-init "cloud tests ssh time out" [Undecided,New]
[19:28] ahosmanMSFT, can you give us the full CLI you are using to launch?
[19:29] sure
[19:36] powersj: added full log to the bug
[19:36] ahosmanMSFT, perfect thanks
[19:37] powersj: I found it most reproducible when going idle
[19:39] for some reason that repro's it for me, for example launching the tests then checking on something else outside the vm where I am running it
[19:39] ahosmanMSFT, "reproduce this is by going idle." what do you mean by that?
[20:31] i missed another one!
[20:31] this time my excuse is that I'm visiting family in Ireland
[20:34] meena: =)
[20:35] powersj, I run cloud tests on a vm, so when I'm not on the vm itself, i.e. using the browser, it typically hits that timeout error. This doesn't occur though when removing that wait function.
[20:35] hrm, do you think the VM itself is suspending ?
[20:36] the wait has a timeout which is likely what it's hitting
[20:36] if you remove the timeout, then when the VM resumes, it just continues executing
[20:37] when you resume, the VM time will catch up, that will trip timers, i.e., if you've suspended for longer than the 300 seconds, then when it resumes the timer expires immediately
[20:37] ahosmanMSFT: can you confirm that if you stay active in the VM that it does not fail ?
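The suspend theory in the messages above can be sketched in a few lines of Python. This is a minimal illustration, not the real `_wait_for_system` from cloud_tests: `wait_for_system`, `JumpingClock` and `ready_after` are hypothetical names, and the fake clock stands in for a wall clock that leaps forward when a suspended machine resumes. Whether a real clock jumps like this depends on the clock source and on how the host pauses the guest, which the log does not settle.

```python
import time


def wait_for_system(is_ready, timeout=300, clock=time.time,
                    sleep=time.sleep, poll=5):
    """Poll is_ready() until it is True or `timeout` seconds pass on `clock`."""
    deadline = clock() + timeout
    while clock() < deadline:
        if is_ready():
            return True
        sleep(poll)
    # The clock passed the deadline before the target reported ready.
    return False


class JumpingClock:
    """Fake clock (hypothetical) that leaps forward once, like a resume."""

    def __init__(self, jump_on_call=None, jump_seconds=0):
        self.now = 0.0
        self.calls = 0
        self.jump_on_call = jump_on_call
        self.jump_seconds = jump_seconds

    def __call__(self):
        self.calls += 1
        if self.calls == self.jump_on_call:
            self.now += self.jump_seconds  # time "catches up" in one step
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def ready_after(polls):
    """Return a check that reports ready on the given poll number."""
    state = {"n": 0}

    def is_ready():
        state["n"] += 1
        return state["n"] >= polls

    return is_ready


# Same target, same 300 s timeout; only the clock behaviour differs.
steady = JumpingClock()
suspended = JumpingClock(jump_on_call=3, jump_seconds=400)
ok_steady = wait_for_system(ready_after(3), clock=steady, sleep=steady.sleep)
ok_suspended = wait_for_system(ready_after(3), clock=suspended,
                               sleep=suspended.sleep)
```

With the steady clock the wait succeeds on the third poll. With the jumping clock the deadline is already in the past when the loop next checks the time, so the wait gives up even though the target would have been ready on that poll.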
[20:38] alternatively, launch an Ubuntu instance in Azure; and run cloud-tests from within the Azure instances (instead of a local VM)
[20:39] This was on an Ubuntu VM in Azure, rharper
[20:39] but, yes, it's less likely to happen when staying on VM
[20:39] quite strange
[20:39] it still smells like some sort of suspend
[20:40] you could spawn some other cpu bound task in the background ?
[21:13] meena: you have to disavow all family and friends like the rest of us... that's the only way this is going to work ;)
[21:17] yeah ahosmanMSFT my full ci tests against azure are still running and I'm not seeing any of those failures yet. but I'm running from an Ubuntu lxc on Ubuntu host OS. If memory serves you may be running on an Ubuntu lxc running on an OSX host system? How the 'vms' behave on OSX when focus shifts to other services (like browsers etc) could likely be the culprit
[21:18] and I'd follow rharper's suggestion of "alternatively, launch an Ubuntu instance in Azure; and run cloud-tests from within the Azure instances (instead of a local VM)"
[21:19] blackboxsw: I think ahosmanMSFT is doing that
[21:19] oops just read that
[21:19] blackboxsw: I suppose we could try that
[21:19] but still not sure what "going idle" means from that perspective
[21:20] sure I can kick off an instance to do that. I was just thrown by the "browser" comment
[21:20] running from within a screen/tmux or something else
[21:20] blackboxsw: as was I
[21:20] it's more random than anything
[21:20] ssh timeout
[21:20] so I suspect there's still something else tooling-wise we're missing
[21:22] ahosmanMSFT: sure, but there are really only two causes here; 1) bad connection between the cloud_test client and target instance 2) the instance is taking longer than the timeout to boot up to ssh
[21:22] (1) and (2) can be intermittent
[21:23] I agree, but the second is usually not the case.
It usually never goes past the boot timeout of 300s
[21:25] just looking over those pastes more closely, I *am* running basically the same command as ahosmanMSFT : .tox/citest/bin/python -m tests.cloud_tests run --verbose --os-name bionic --data-dir results --preserve-data --platform azurecloud
[21:27] ahosmanMSFT: do we not have the console-log configured ?
[21:27] rharper: can you elaborate?
[21:28] serial_console_log_blob_uri
[21:28] that should be created
[21:28] looks like we enable the serial console, so cloud_tests should capture the serial-console log on failure
[21:28] I'd like to see the serial console log of the failing instance
[21:29] and comparing the failing serial-console log to the passing one likely will reveal some differences which may explain what's going on
[21:29] I'm also a bit surprised to see a 3:45 min gap between the following logs
[21:30] 2020-02-04 19:11:03,628 - tests.cloud_tests - DEBUG - image not found, launching instance with base image
[21:30] 2020-02-04 19:14:44,950 - tests.cloud_tests - DEBUG - executing "wait for instance start"
[21:30] That is stored in a storage account in your resource group when running, but the resource group is deleted after the cloud-tests
[21:30] in my runs, typically that gap should only be around the time to launch and get a response back from Azure API.
which is about 30 seconds on 'az vm create'
[21:31] ahosmanMSFT: ah you mentioned that bug; we should dump the value before tearing down
[21:32] blackboxsw: the compute_client.images.get() may be taking a long time
[21:32] preserve should keep it, but it doesn't; working on fixing that too
[21:32] ahosmanMSFT: I think --preserve-instance and --preserve-data should keep that around post failure
[21:32] depending on the REST query; I've seen *long* REST calls to list of images
[21:32] hrm :/
[21:32] though 3 minutes seems excessive
[21:33] Out of three of these bugs, the current biggest flaw isn't even the timeout, it's that images aren't created properly due to azurecloud/image._img_instance staying None despite re-assignment, but this is more on my side, though I can use your help. The ssh timeout doesn't happen 100% of the time; this issue does.
[21:33] the preserve call should be a simple fix
[21:34] ssh timeout I think is more machine/system dependent now that I'm seeing you're not getting it
[21:35] ahasenack: the default timeout is 300, it wouldn't be a terrible thing to bump that on azure platform to say 600 and see if that runs more reliably to completion
[21:36] ahosmanMSFT: ^
[21:36] :)
[21:36] ahasenack: sorry
[21:36] np :)
[21:36] lack of extra tab
[21:38] rharper: I agree, I don't think timeout is the issue
[21:38] default timeout that is
[22:07] rharper, blackboxsw: Are you still in today?
[22:07] yep
[22:08] Mind taking a look at azurecloud/image._img_instance
[22:08] it's not re-assigning correctly, which means images don't build correctly
[22:09] it works on ec2, right?
[22:09] I placed some print statements and it stays instantiated as None
[22:10] ahosmanMSFT: sure, do you have more details on "images aren't created properly due to azurecloud/image._img_instance staying None";
[22:12] rharper: yes...
[22:13] so when cloud-tests starts it hits image.py, checks if there is an image and if there isn't it creates one via streams parameters.
Then in the next test it checks, is there a clean image, in our case the variable that specifies this never changes and always stays None for some reason
[22:14] because of that, it always launches a base image, not a clean image
[22:14] which negates the whole image process
[22:14] of creating a clean image for all future tests
[22:15] ahosmanMSFT: are you specifying a cloud-init.deb or is one getting built ?
[22:16] One gets built, looking at the code, seems to be abstracted
[22:16] The fault comes when _img_instance is used in snapshot, it never passes the initial conditional
[22:16] https://paste.ubuntu.com/p/nwgm8KjXJk/
[22:17] rharper
[22:24] ahosmanMSFT: just looking at some differences, in azurecloud/image.py:_instance(), you call platform.create_images(); but you don't start it;
[22:27] rharper: hmm I changed it and added some logging and still get that it's None
[22:27] https://paste.ubuntu.com/p/VPgqXZXkkX/
[22:28] ahosmanMSFT: so, if it's not started, then when snapshot gets called there's no self._img_instance set and it returns the AzureCloudSnapshot() rather than doing the boot clean script and such
[22:30] rharper: running again, I'll show you the logs and all the logging I added when trying to debug
[22:31] I modified _instance(), as shown above
[22:31] right, so the call path goes from platform.get_image() -> AzureCloudImage() -> platform.get_snapshot() -> snapshot.launch(); where you create an instance, but don't start it either ()
[22:33] rharper: that might be it, I didn't spot that
[22:33] I'll make the change and run it again, keep you updated
[22:34] yeah, I'm just walking through cloud_tests/platforms/__init__.py cloud_tests/bddeb.py:setup_build() and comparing platform/ec2/* to platform/azurecloud/*
[22:34] are you running this in a container
[22:40] no, I'm just reading the code
[22:41] I mean where the tests are run
[22:43] no, just under tox on a jenkins node;
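The `_img_instance` behaviour discussed above can be pictured with a small Python sketch. It rests on an assumption drawn only from the conversation: that `snapshot()` falls back to the base image whenever no image instance has been created and started. `SketchInstance` and `SketchImage` are hypothetical stand-ins, not the classes in platform/azurecloud or platform/ec2.

```python
class SketchInstance:
    """Hypothetical stand-in for a cloud instance used to customise an image."""

    def __init__(self):
        self.started = False
        self.commands = []

    def start(self):
        self.started = True

    def run(self, command):
        if not self.started:
            raise RuntimeError("instance was created but never started")
        self.commands.append(command)


class SketchImage:
    """Hypothetical image wrapper: lazily boots one instance, then snapshots."""

    def __init__(self, base_image_id):
        self.base_image_id = base_image_id
        self._img_instance = None

    def _instance(self):
        # Create *and start* the instance the first time it is needed.
        if self._img_instance is None:
            instance = SketchInstance()
            instance.start()
            self._img_instance = instance
        return self._img_instance

    def execute(self, command):
        self._instance().run(command)

    def snapshot(self):
        # Nothing was ever booted: fall back to the untouched base image.
        if self._img_instance is None:
            return ("base", self.base_image_id)
        # Otherwise clean the booted instance and snapshot it.
        self._img_instance.run("boot clean script")
        return ("clean", self.base_image_id + "-snapshot")


untouched = SketchImage("bionic")
kind_without_boot = untouched.snapshot()[0]

customised = SketchImage("bionic")
customised.execute("install cloud-init deb")
kind_with_boot = customised.snapshot()[0]
```

In this sketch the "clean" branch is only reachable after something has gone through `_instance()`, which matches the symptom in the log of every run launching from the base image. Whether that is the actual cause in the Azure platform code is exactly what the participants were still checking.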