/srv/irclogs.ubuntu.com/2018/01/22/#cloud-init.txt

mazzyHi there. I14:14
mazzyhttps://gist.github.com/mazzy89/340dae524474ca01d4e8aa1ee647259814:14
mazzyI have this use case. any suggestion?14:14
kholkinaHi! Does cloud-init update vendor-data on instance reboot?14:47
mazzydoes cloud-init overwrite an existing file if the file is created in write_files?14:49
=== stelucz_ is now known as stelucz
ajorg\(´O`)/15:53
rharpernice one15:56
blackboxswmazzy: yes, write_files will overwrite pre-existing files with content provided in your write_files section15:58
mazzyblackboxsw thank you15:59
smosero/16:00
blackboxswmorning all16:00
blackboxswok I think it's time for our bi-weekly meeting. probably going to be a short one this week.16:07
ajorgmore time for office hours then?16:08
blackboxsw#startmeeting Cloud-init bi-weekly status meeting16:08
meetingologyMeeting started Mon Jan 22 16:08:22 2018 UTC.  The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.16:08
meetingologyAvailable commands: action commands idea info link nick16:08
blackboxswcertainly ajorg :) (on office hours)16:08
blackboxswWelcome to another episode of cloud-init bi-weekly status. We'll chat about cloud-init updates and in-progress work, and we'll drop into office hours for ongoing discussions/bug work etc.16:09
blackboxsw#topic Recent changes16:10
blackboxswJust walking through git-log for what we have committed in the last couple of weeks, here's the brief summary16:11
blackboxswthx smoser16:12
blackboxsw   - shorten the message in the exception per powersj feedback16:12
blackboxsw    - Use the same botocore session so the patched changes stick.16:12
blackboxsw    - fix bad use of %16:12
blackboxsw    - Fix console_log, improve comments and raise PlatformError on.16:12
blackboxsw    - tests: Fix EC2 Platform to return console output as bytes.16:12
blackboxsw    - tests: remove zesty as supported OS to test [Joshua Powers]16:12
blackboxsw    - Do not log warning on config files that represent None. (LP: #1742479)16:12
ubot5Launchpad bug 1742479 in cloud-init (Ubuntu) "setting manual_cache_clean causes warning" [Medium,Fix released] https://launchpad.net/bugs/174247916:12
blackboxsw    - tests: Use git hash pip dependency format for pylxd.16:12
blackboxsw    - tests: add integration requirements text file [Joshua Powers]16:12
blackboxsw    - MAAS: add check_instance_id based off oauth tokens. (LP: #1712680)16:12
blackboxsw    - tests: update apt sources list test [Joshua Powers]16:12
blackboxsw    - tests: clean up image properties [Joshua Powers]16:12
blackboxsw    - tests: rename test ssh keys to avoid appearance of leaking private keys.16:12
ubot5Launchpad bug 1712680 in maas-images "cloud-init re-generates network config every reboot overwriting manual admin changes on CentOS." [Undecided,New] https://launchpad.net/bugs/171268016:12
blackboxsw      [Joshua Powers]16:12
blackboxsw    - tests: Enable AWS EC2 Integration Testing [Joshua Powers]16:12
blackboxsw    - cli: cloud-init clean handles symlinks (LP: #1741093)16:12
ubot5Launchpad bug 1741093 in cloud-init "cloud-init clean traceback on instance dir symlink" [Low,Fix committed] https://launchpad.net/bugs/174109316:12
ajorgWhat's being patched in botocore?16:13
blackboxswSo a number of changes went into integration test related work, separating out requirements files.16:13
blackboxswMAASDatasource now also has smarter cache handling based on oauth token renewal from the maas server16:14
blackboxswso botocore is used by integration tests only as a mechanism to talk to the instance under test... looking back at the specifics here16:14
blackboxswit might have just been shuffling out how and where we define the dependency16:14
smoserblackboxsw: (my 'paste' to you was bad... http://paste.ubuntu.com/26438113/ is better, showing only those on master, not my local branch that was currently checked out )16:14
blackboxswheh, oopsie daisy let's paste again inline then16:15
blackboxsw    - tests: remove zesty as supported OS to test [Joshua Powers]16:15
blackboxsw    - Do not log warning on config files that represent None. (LP: #1742479)16:15
blackboxsw    - tests: Use git hash pip dependency format for pylxd.16:15
blackboxsw    - tests: add integration requirements text file [Joshua Powers]16:15
blackboxsw    - MAAS: add check_instance_id based off oauth tokens. (LP: #1712680)16:15
blackboxsw    - tests: update apt sources list test [Joshua Powers]16:15
blackboxsw    - tests: clean up image properties [Joshua Powers]16:15
blackboxsw    - tests: rename test ssh keys to avoid appearance of leaking private keys.16:15
blackboxsw      [Joshua Powers]16:15
blackboxsw    - tests: Enable AWS EC2 Integration Testing [Joshua Powers]16:15
blackboxsw    - cli: cloud-init clean handles symlinks (LP: #1741093)16:15
ubot5Launchpad bug 1742479 in cloud-init (Ubuntu) "setting manual_cache_clean causes warning" [Medium,Fix released] https://launchpad.net/bugs/174247916:15
ubot5Launchpad bug 1712680 in maas-images "cloud-init re-generates network config every reboot overwriting manual admin changes on CentOS." [Undecided,New] https://launchpad.net/bugs/171268016:15
ubot5Launchpad bug 1741093 in cloud-init "cloud-init clean traceback on instance dir symlink" [Low,Fix committed] https://launchpad.net/bugs/174109316:15
blackboxswok the real deal, that looks better16:15
blackboxswahh ajorg that interim commit message on botocore was about integration tests caching the session information during testing so we don't recreate that session with every ssh connection to the instance16:16
blackboxswjust a little time savings per review comments on powersj branch I believe16:16
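The session caching described above can be sketched roughly as follows. This is a minimal illustration, not the integration-test code itself: `FakeSession` and `get_session` are hypothetical names standing in for whatever expensive session object the tests create, and no botocore call is made.

```python
import functools


class FakeSession:
    """Hypothetical stand-in for an expensive-to-create API session."""

    created = 0  # counts how many sessions were actually constructed

    def __init__(self, region):
        FakeSession.created += 1
        self.region = region


@functools.lru_cache(maxsize=None)
def get_session(region):
    """Return one shared session per region instead of a new one per call."""
    return FakeSession(region)


# Two "connections" to the same region reuse the same session object.
first = get_session("us-east-2")
second = get_session("us-east-2")
print(first is second, FakeSession.created)
```

Reusing the object also means any patching applied to the first session is still in effect for later callers, which matches the "so the patched changes stick" commit message.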
ajorgokay, so nothing that needs to get upstreamed to botocore?16:17
blackboxswI don't think so, powersj smoser I have vague recollection of someone filing an upstream botocore issue. did we have to do that for something else though?16:17
powersjhttps://github.com/boto/botocore/issues/135116:18
powersjthat was the issue smoser put in ^16:18
blackboxswnice recall powersj thanks.16:18
blackboxsw#link https://github.com/boto/botocore/issues/135116:18
smoserajorg: you can read that bug.  imo they have a data loss error, but not one that they can easily fix without causing failures in places that previously ran fine.16:20
ajorgI'll ask them to re-open it.16:20
ajorgAt the very least they should answer your last.16:21
smoserthanks.16:21
blackboxswGenerally anything significant that we have landed (and any inprogress work) should be available at the following link.16:22
blackboxsw#link https://trello.com/b/hFtWKUn3/daily-cloud-init-curtin16:22
blackboxswanything else we should note over the last couple weeks?16:23
blackboxswotherwise I'll switch to ongoing work topic16:23
blackboxsw#topic In-progress Development16:24
blackboxswAs you may have seen last week, we've gotten through a few passes and discussions around dojordan's branch to define pre-provisioning16:25
blackboxsw#link https://code.launchpad.net/~dojordan/cloud-init/+git/cloud-init/+merge/33434116:25
blackboxswsome of that discussion resulted in a new context manager: EphemeralDHCPv4 to support a sandboxed dhclient request on an instance.16:26
blackboxswthis context manager affects Ec2 datasource a bit as it encapsulates all of the dhcp request -> EphemeralIPV4Network calls that Ec2 was doing16:27
blackboxswthere may be a couple other datasources that follow suit with this type of sandboxed dhcp request in weeks to come16:28
ajorgglad it turned out to be generally useful rather than only specifically to ec216:28
blackboxswabsolutely16:28
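The idea of wrapping a temporary DHCP setup in a context manager can be sketched like this. It is only an illustration of the pattern under stated assumptions: `ephemeral_network`, `fake_lease`, and the `events` list are hypothetical, and the real EphemeralDHCPv4 class in cloud-init is not reproduced here.

```python
import contextlib


@contextlib.contextmanager
def ephemeral_network(iface, obtain_lease, events):
    """Bring up temporary networking from a lease, and always tear it down."""
    lease = obtain_lease(iface)  # assumption: caller supplies the DHCP step
    events.append(("up", iface, lease["address"]))
    try:
        yield lease
    finally:
        # Teardown runs even if the body raises, leaving no stale config.
        events.append(("down", iface, lease["address"]))


def fake_lease(iface):
    """Hypothetical lease provider; a real one would run a DHCP client."""
    return {"interface": iface, "address": "192.0.2.10"}


events = []
with ephemeral_network("eth0", fake_lease, events) as lease:
    # A datasource would query its metadata service inside this block.
    events.append(("query-metadata", lease["interface"], lease["address"]))

print(events)
```

A datasource then only needs the `with` block; the lease request and the cleanup are handled in one place rather than repeated per datasource.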
blackboxswSome other in-progress bits look like we might try focusing a bit more on chrony support and getting robjo's branches some more eyes.16:30
blackboxswand some work on Ubuntu snappy support per the snappy and snap config modules.16:31
smoserdojordan: i just put one comment on your mp. /me thanks dojordan again for his patience.16:31
blackboxswrharper: smoser powersj anything more in the immediate pipeline that I'm missing/16:32
blackboxsw?16:32
smoserblackboxsw: we should get the EphemeralDHCP thingy into the digital ocean datasource also.16:32
rharperblackboxsw: a reply to the network discussion on the list from the azure folks and robjo16:32
ajorgI took another look at https://code.launchpad.net/~yeazelm/cloud-init/+git/cloud-init/+merge/331897 and saw that origin/master seems to be failing some of the integration tests too.16:32
ajorg(at least for me, locally, on a 16.04 instance)16:32
blackboxswahhh right forgot about all your work there rharper, thanks!16:32
smoserajorg: https://jenkins.ubuntu.com/server/view/cloud-init/job/cloud-init-ci-nightly/16:33
smoserthat is the nightly run of trunk16:33
blackboxsw#link https://jenkins.ubuntu.com/server/view/cloud-init/job/cloud-init-ci-nightly/16:33
ajorgI'll try blackholing IMDS on my instance. Could be that's interfering with something.16:34
smoserit is red, but 218 (green) and 219 (red) used the same git hash on trunk (5cc0b19b8).16:35
ajorgI'll follow up during office hours16:35
smosercan you give me example of your failures ? we had "disk full" errors recently on our jenkins, so that might be the cause of the issue for 291.16:36
smosers/291/219/16:36
blackboxswI don't remember seeing that traceback recently. w/ warning messages present in cloud-init16:36
smoserpowersj: ? can you explain lxc timeout failure at16:36
smoser https://jenkins.ubuntu.com/server/view/cloud-init/job/cloud-init-ci-nightly/219/consoleFull16:36
powersjsmoser: we discovered that our qemu-migration test was installing lxd from the archive and causing conflicts with the snap installed lxd16:37
powersjI have a message to christian to prevent it, and I have already cleaned it up16:37
powersjso new runs should pass16:37
ajorg2018-01-22 16:19:03,550 - tests.cloud_tests - WARNING - test case: modules/ssh_import_id failed TestSshImportId.test_no_stages_errors with: AssertionError: 1 != 0 : errors ['(\'ssh-import-id\', ProcessExecutionError("Unexpected error while running command.\\nCommand: [\'sudo\', \'-Hu\', \'ubuntu\', \'ssh-import-id\', \'gh:powersj\', \'lp:smoser\']\\nExit code: 1\\nReason: -\\nStdout: -\\nStderr: -",))'] were encountered in stage m16:37
smoserhm.. well, that will hit launchpad.net over https16:38
smosercloud-init-output.log probably has more info (should be collected)16:38
powersjthe actual error is:   File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-ci-nightly/tests/cloud_tests/platforms/instances.py", line 142, in _wait_for_system16:38
powersj    raise OSError('timeout: after {}s system not started'.format(time))16:38
powersjit is because when the qemu tests installed lxd it didn't initialize lxd networking16:38
powersjso no IP is received16:39
smoserajorg: would you have had outbound access to launchpad https ? if not, then that'd be expected failure.16:39
smoseroh, and i guess 'gh:powersj' (github)16:40
ajorgsmoser: I'll check some things, but in short yes. Maybe lxc is being weird?16:40
smoseri dont like our user names in that test though...16:40
powersjsmoser: we could use the bot instead16:40
ajorgsmoser: it's a public ec2 instance with no special outbound rules, and I can connect to public https sites from a normal session.16:42
blackboxswhrm, ok let's chat about what we can do to anonymize or drop that type of test data if we can16:44
blackboxswprobably time to kick over to office hours16:44
blackboxsw#topic Office Hours (next 30 minutes)16:45
smoserpowersj: well, i think i'd prefer some public key that we state "no one has the private key for this."16:45
smoserobviously we could lie about that, but one would *expect* that you and I would gain access to the system using our public keys.16:46
smoserit doesn't make me feel a lot better that a bot could/can.16:46
ajorgIs there a way to limit integration testing to a specific test?16:47
blackboxswFeel free to bring up any topic/bugs/branches/features you'd like discussion on. We can also continue our discussion on the ssh key imports in testing16:47
ajorg(takes a long time to run the full suite)16:47
blackboxswajorg: yes16:48
blackboxsw(reverse-i-search)`cloud_t': python3 -m tests.cloud_tests run --os-name=artful --platform=nocloud-kvm --preserve-data --data-dir=../results --verbose -t modules/locale -t modules/set_password16:48
ajorgthanks, that should help16:48
blackboxswajorg: you can specify the test names (like modules/set_password) and modules/locale  in this test16:48
blackboxswyeah those are short ones I frequently test with16:48
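As a small sketch of the command above, the helper below assembles the same argument list with one `-t` per selected test. The flags are copied from the command blackboxsw pasted; `build_citest_command` itself is a hypothetical convenience function, not part of cloud-init.

```python
import shlex


def build_citest_command(os_name, platform, data_dir, tests):
    """Build an integration-test command limited to the named tests."""
    cmd = [
        "python3", "-m", "tests.cloud_tests", "run",
        "--os-name=" + os_name,
        "--platform=" + platform,
        "--preserve-data",
        "--data-dir=" + data_dir,
        "--verbose",
    ]
    for name in tests:
        cmd += ["-t", name]  # repeat -t once per test to run
    return cmd


cmd = build_citest_command(
    "artful", "nocloud-kvm", "../results",
    ["modules/locale", "modules/set_password"],
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` from the top of a cloud-init checkout.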
smoserhttp://paste.ubuntu.com/26438334/16:49
smoserthat is what i use. and yeah... we've discussed that integration test could be easier to run :)16:49
blackboxsw#link http://paste.ubuntu.com/26438334/16:49
blackboxswnice 116:50
blackboxswsmoser: to have a public key we know nobody has a private key for, would that mean we'd need a separate github account (or maybe just an additional key associated w/ our bot account in gh)?16:51
* blackboxsw checks github for authorizing multiple keys.16:51
blackboxswhrm, that wouldn't work as we need gh:ubuntu-server-bot   (one key) n/m16:51
ajorgI've got meetings most of today, so I'll have to follow up later. thanks everyone!16:52
blackboxswthanks ajorg16:52
blackboxswso, bot account for the time being is better than powersj owning the testing world ;)16:59
blackboxswbut I'm not too concerned about it as this are supposed to be throw away instances16:59
blackboxswbut I'm not too concerned about it as there instances under test are supposed to be throw away instances16:59
blackboxsw*these instances*.... anyway17:00
smoserblackboxsw: right. it would require users on both those services .17:04
=== hrybacki is now known as hrybacki_mtg
blackboxswalrighty. think we're at the close of office hours. Last call?17:14
blackboxswThanks for your time and contributions to cloud-init folks!17:16
blackboxsw#endmeeting17:16
meetingologyMeeting ended Mon Jan 22 17:16:43 2018 UTC.17:16
meetingologyMinutes:        http://ubottu.com/meetingology/logs/cloud-init/2018/cloud-init.2018-01-22-16.08.moin.txt17:16
rharperblackboxsw: thanks17:26
blackboxswnp, just posting the notes to cloud-init.github.io now17:26
=== hrybacki_mtg is now known as hrybacki
smoserhttps://hackmd.io/OwBgpgxgRlwIwFooBMBMAzBAWYECcCAHIbgtIXgGyEDMIElcEQA=18:45
smoserblackboxsw, powersj rharper . i just put that together in order to go fresh system to functional "test my branch".18:46
smosermainly for ajorg's request of "does trunk work".18:46
blackboxswreading18:46
rharpersmoser: -t <release> is super fancy sauce18:48
* rharper has learned something new for today 18:48
blackboxswis it worth folding a bit of that into readthedocs content for cloudinit? I'm not sure where to 'host' that info (as it's a good top-level view)18:49
blackboxswyet it references tools that aren't cloudinit proper18:49
powersjsmoser: why not use lxd via snap?18:50
powersjthat way those instructions are not tied to xenial18:50
smoserwell, i first tried not getting a new one.18:50
smoserjust using what was in the image.18:50
smoserbut something failed... a container didnt get an IP address.18:50
smoserso i thought "ok, just get something newer".18:50
smoserapt still seems easier and since we already reference other apt things, seems just easier to recommend apt18:51
smoserand i honestly didnt know how apt-installed versus snap installed get along18:51
smoserrharper: you can also18:51
smoser apt-get install lxd/xenial-backports18:51
powersjok18:52
powersjalso someone could use tree_run to build their local tree and run the tests all in one18:52
smoseryeah, i knew there was something for that.18:52
smoseri'm fine to change that.18:52
smoserbut "how to build a deb" seems useful doc anyway. and as it is right now it already recommended installing all these deps18:53
smoserso... might as well use them. rather than launching a container and putting them in there.18:53
rharpersmoser: also nifty18:53
smoseralong the way i find18:53
smoser http://paste.ubuntu.com/26438886/18:53
smoser:-(18:53
powersjgoodness18:55
rharpersmoser: so, according to this: https://github.com/systemd/systemd/issues/2912  and just verifying this in bionic;  systemd-networkd (configured via cloud-init/netplan); on ip link set down, the networkd v4 dhcp client will re-acquire leases;  that's certainly new behavior w.r.t say xenial/ifupdown (confirming that now in a xenial image)18:55
rharpers/set down/set down; set up/18:56
rharperappears that dhclient does this just fine as well;19:06
smoserrharper: ? you're saying that isc-dhcp does that ?19:15
smoserpowersj: ubuntu@ec2-18-218-147-181.us-east-2.compute.amazonaws.com19:16
smoserthere is failure there right now on ntp test.19:16
smoser$ time ./tools/tox-venv citest python3 -m tests.cloud_tests run -v         --os-name=xenial         --preserve-data --data-dir=/tmp/results.short.d --test tests/cloud_tests/testcases/modules/ntp_servers.py19:16
smoseri'm looking at it.19:16
rharpersmoser: yeah, what's in xenial appears to bring the interface back up.19:21
rharperah, but routing info isn't updated19:21
rharperlemme compare that to networkd19:21
smoserrharper: you're saying if i run:19:22
smoser ip link set eth0 down19:22
smoserthat something will magically bring it back up?19:22
smoseri dont see that in a container here.19:22
rharperno, that bouncing the link state will restore the connectivity19:22
rharperthe unplug (set down); and replug (set up) restores interface config (but not routing in isc-dhclient);  in networkd, I can see the networkd dhclient re-acquire a lease and apply it19:23
smoserrharper: well, in isc-dhcp its not so much that it "restores interface config", its just that nothing removed that config. so putting it back up keeps it.19:24
smoserbut the routes get dropped and do not get restored.19:24
smoserare you suggesting that 'link set down eth0' is the same thing as if the hypervisor pulled the cable ?19:25
rharperyeah, just walking through the link state change so we can capture what does and does not work here w.r.t the need for something19:25
rharperthat's what the azure email is suggesting, that they can toggle the link state19:26
rharperyou can look at the code they posted which watches the interface oper/link state and, I think to work around the isc-dhcp client not updating the route-interface, launches another dhclient instance19:26
rharperyeah, under networkd; it does bring the route for the bounced interface back up;19:27
* smoser wishes ajorg would show up.19:53
dojordan@smoser, I answered your comment. During the re-init of the EphemeralDHCPv4 context manager we do actually want to look for the fallback nic again. It shouldn't change names but if it does we would never find it if we only run find_fallback_nic once.20:20
smoserdojordan: it wont change names. and you'd see different errors for sure if it did.20:25
smoseri hadn't considered the nic getting renamed ... it's a bug if it does get renamed under us20:26
smoserbut i had considered only another nic coming online20:26
smoserlook once... first nic is eth120:26
smoserlook again, first nic is eth020:26
rharpersmoser: once we've renamed it, the specific nic cannot get renamed by udev; it would have to be some other program doing that20:26
smoserwhen do we rename ?20:26
dojordanus being cloud init, or ubuntu?20:26
rharperin local netconfig, apply_nic_names20:26
smoserin the init (non network) no renaming has taken place.20:26
smoserrharper: yeah, we're running before that here.20:27
dojordangot it, so during local we can assume it wont change20:27
rharperI think it happens before networking stage otherwise the network config can come up20:27
dojordan?*20:27
smoserdojordan: in this case "us" == cloud-init.20:27
rharpersmoser: are we "paused" before stages calls apply_network_config() ?20:27
smoserdojordan: we're kind of all sorts of foobarred if other things are changing nic names on us.20:28
smoserat that point.20:28
smoserthe pause for IMDS (polling network metadata)  is before we have rendered fallback networking.20:28
dojordanbut what's the harm of looking for the fallback nic multiple times? It is up to the caller (of the context manager) to decide20:28
smoserhappening all in cloud-init-local ("pre-network")20:28
smoserdojordan: i just think it'd be harder to figure it out. and opens up a failure case.20:29
smoserboot, dhcp on eth0, hit MD, poll a bit...20:29
smoser then get a 40420:29
smoserdhcp accidentally on 'anic0'20:30
dojordan404 won't call find_fallback_nic20:30
smoserand then dead20:30
dojordanor dhcp or anything20:30
smoseri thought it was 404 that caused it to re-try dhcp20:30
dojordanonly non 200-299 / 400 or unhandled exception (timeout, socket error, etc)20:30
smoserbut whatever it is that causes that re-try dhcp. if the second time it goes off on another nic, then we're done.20:30
dojordannope, 404 means we hit the metadata server, but nothing is available for us yet20:30
dojordan" if the second time it goes off on another nic, then we're done." - why20:31
smoseri assumed that only one nic is connected to the network that has the MD on it.20:31
smoserif all of them are, and a dhcp on any would be sufficient, then you're right.20:31
smoserbut it still is odd.20:31
smoserto use one nic, and then use another the next time.20:31
dojordanso question, if we unplug a nic and plug it back in from hyper-v, we would need to guarantee it will keep the iface name20:32
dojordanotherwise i am fine with that change20:33
dojordanerr, assuming disconnect / connect doesnt change the name20:33
smoseroh right. i'd forgot that you might do that.20:33
smoserdisconnect as in "pull the cable" ?20:33
smoseror as in "pull the nic"20:33
smoserpull the cable wont rename.20:34
smoserpull the nic, maybe.20:34
smoserdojordan: so... will all nics ever plugged into the system be able to reach the MD if they dhcp ?20:35
smoserie, is that a feature of the platform ?20:35
dojordanI will confirm with the team doing that work. don't want to assume yes but i think so20:35
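The polling behaviour discussed above can be sketched as follows, as one possible reading of dojordan's description: a 404 means the metadata server was reached but has nothing yet, so polling continues on the existing lease, while a timeout or socket error triggers a fresh DHCP attempt. The function names, and the choice to treat every other status code as a reason to re-DHCP, are assumptions made for illustration and are not taken from the branch under review.

```python
def poll_metadata(fetch, do_dhcp, max_attempts=10):
    """Poll a metadata endpoint, re-running DHCP only after hard failures."""
    dhcp_runs = 0
    need_dhcp = True
    for _ in range(max_attempts):
        if need_dhcp:
            do_dhcp()  # assumption: this also (re)selects the nic to use
            dhcp_runs += 1
            need_dhcp = False
        try:
            status = fetch()
        except OSError:  # timeouts and socket errors are OSError subclasses
            need_dhcp = True
            continue
        if 200 <= status <= 299:
            return status, dhcp_runs
        if status == 404:
            continue  # server reachable, data not ready: keep the lease
        need_dhcp = True  # assumption: any other status forces a new lease
    return None, dhcp_runs


# Simulated server: not ready twice, then a timeout, then success.
responses = iter([404, 404, OSError("timeout"), 200])


def fake_fetch():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


leases = []
result = poll_metadata(fake_fetch, lambda: leases.append("lease"))
print(result, len(leases))
```

In this simulation DHCP runs twice: once at the start and once after the timeout, which is the point at which a different nic could be picked if the fallback lookup is repeated.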
smoserdojordan: http://paste.ubuntu.com/26440131/21:24
smoserwhat do you think about that ?21:24
smoserif i *did* get some debug messages there, that'd clearly indicate what nic we were using.21:24
dojordanI like it. sounds good to me. I'll fix some indentation errors too21:25
smoser:)21:26
dojordan@smoser, pushed, running to lunch. Take a look when you get a chance21:41
smoserhttps://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/33645821:59
smoserthat took longer than it should have to diagnose.21:59
powersjhmm ci isn't happy again22:00
dojordanseems like some lxd issue22:04
powersjI'm looking at it22:06
dojordanthanks22:07
powersjdojordan: run is looking better this time. sorry about that22:33
powersjwe had some lxd issues over the weekend where another project's tests messed with the config22:33
blackboxswhrm hitting timeouts while pulling packages in ec2 testing the ec2 ntp fix Failed to fetch http://us-east-2.ec2.archive.ubuntu.com/ubuntu/pool/main/n/ntp/sntp_4.2.8p10+dfsg-5ubuntu3.1_amd64.deb  504  Gateway Time-out [IP: 52.15.107.13 80]22:35
smoserhey.22:37
smoserto get this out of my buffer on a ec2 system22:38
smoser http://paste.ubuntu.com/26440452/22:38
smoserpylxd returns string not bytes from execute.22:38
smoserjoy22:38
smoserso i tried to have it collect systemd  journal, but can't.22:38
smoseri'll file bug on pylxd tomorrow.22:38
smoserwe can work around by lxc cmdline as shown in that paste.22:38
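For callers that expect bytes from an execute call, a defensive normalisation could look like the sketch below. `to_bytes` is a hypothetical helper and no pylxd API is called here. Note that this only helps for text output: binary data that has already been decoded to a string may not round-trip, so it does not replace the lxc command-line workaround mentioned above.

```python
def to_bytes(data, encoding="utf-8"):
    """Return data as bytes, encoding it only if it arrived as str."""
    if isinstance(data, bytes):
        return data
    return data.encode(encoding)


print(to_bytes("console output"))
print(to_bytes(b"\xff\x00 already bytes"))
```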
dojordanThanks for restarting @powersj. @smoser, anything else, or can we ship it?23:17
