blackboxsw | thanks smoser | 00:37 |
---|---|---|
smoser | blackboxsw, tox && git pushing now | 00:37 |
smoser | then i'll upload again | 00:38 |
smoser | blackboxsw, ok. that is uploaded. i go away for the night now. | 00:46 |
smoser | blackboxsw, https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/330954 | 00:49 |
smoser | and now i really *am* gone. | 00:49 |
smoser | later. | 00:49 |
smoser | uploaded | 00:50 |
rharper | smoser: do we know of any public cloud where we can test DataSourceOVF? | 16:13 |
smoser | well you can test it locally | 16:14 |
smoser | by providing an ovf iso. doc shows how to create one | 16:14 |
rharper | ok | 16:15 |
smoser | sankaradita has one | 16:15 |
rharper | ok | 16:16 |
rharper | https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/330995 this is what I'm going to test; I think it's a good cleanup w.r.t reducing the number of devices we actually probe (and re-using blkid cache) | 16:17 |
dpb1 | rharper: for a workaround for the case? or a medium term fix? | 16:18 |
rharper | long term | 16:18 |
dpb1 | k | 16:19 |
rharper | it's only looking for devices with iso9660 filesystem, blkid can do that quite quickly (and cloud-init respects the blkid cache); ds-identify will probe this early | 16:19 |
rharper | so, this should speed up OVF probing dramatically for systems with more than just one block device | 16:19 |
smoser | rharper, i'm pretty sure the mounting we're doing in the default case is doing a mount -t iso9660 | 16:27 |
smoser | which i'd think would fail unless you had iso9660 | 16:27 |
smoser | so i'm not sure how this would cause races with other mounts. | 16:27 |
smoser | not sure why we're failing https://code.launchpad.net/~cloud-init-dev/+recipe/cloud-init-daily-xenial | 16:34 |
blackboxsw | test_openstack_on_non_intel_is_maybe hmm | 16:36 |
blackboxsw | checking maybe it's a leaked mock thing? | 16:36 |
blackboxsw | or unmocked leak I mean | 16:36 |
smoser | yeah. thats all i can think of. | 16:40 |
smoser | i can't reproduce it on the branch (ubuntu/xenial) though | 16:41 |
smoser | even with patches applied | 16:41 |
smoser | so trying in a container | 16:41 |
smoser | blackboxsw, https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/331012 | 18:32 |
seven-eleven | hi | 18:50 |
seven-eleven | how can i run something after cloud-init is done? | 18:50 |
seven-eleven | basically i want my custom systemd unit file "openvpn2.service" to run after cloud-init sets up the hostname | 18:50 |
blackboxsw | smoser: approved | 18:51 |
seven-eleven | what i tried and didn't work: (1) appending After=cloud-init.target to openvpn2.service, (2) supplying user-data via droplets **user-data kwargs "systemctl daemon-reload; systemctl restart openvpn2" | 18:54 |
blackboxsw | seven-eleven: :n | 18:55 |
blackboxsw | seven-eleven: sorry I'm looking now. I thought systemd chaining would do that for you. but I'm checking which service you need to be after | 18:56 |
seven-eleven | maybe i can tell cloud-init in /etc/cloud/cloud.cfg @ cloud_final_modules: to restart my openvpn2, so it grabs the new hostname :] | 18:56 |
seven-eleven | blackboxsw, let me paste my service | 18:56 |
seven-eleven | blackboxsw, my customized openvpn2.service unit http://dpaste.com/1YF2AV0 which uses %H for hostname | 18:57 |
seven-eleven | it's actually starting my service fine, but when i want to restart the service i have to run `systemctl daemon-reload`, else it fails. i can leave it like that, but it's not perfect :-) | 19:15 |
seven-eleven | it fails with the error "Failed to start OpenVPN connection to bus1." and bus1 is the hostname of the vm i created the snapshot from | 19:16 |
rharper | smoser: re mount -t iso9660, we passed mtype to the util.mount_cb, all this does is change the list of candidate devices from os.listdir(dev) to blkid -odevice -tTYPE=iso9660; we still pass mtype into mount_cb; | 19:18 |
blackboxsw | seven-eleven: not sure if this helps, but what about After=cloud-final.service | 19:19 |
blackboxsw | that's the last stage to run | 19:19 |
blackboxsw | http://cloudinit.readthedocs.io/en/latest/topics/boot.html#final | 19:19 |
blackboxsw | seven-eleven: you could also add a runcmd: ['systemctl daemon-reload', 'systemctl restart <yourservice>'] and that should be run in final stage | 19:20 |
smoser | rharper, right. | 19:20 |
smoser | but two things i was suggesting | 19:20 |
seven-eleven | blackboxsw, let me try. i was also thinking to maybe add Before= to cloud-final | 19:21 |
smoser | a.) you can/should further limit the results through the regex that was already being done. otherwise you risk extending the breadth inadvertently. | 19:21 |
smoser | b.) the fact that we were passing '-t iso9660' makes me think that this was not actually the bug | 19:21 |
smoser | because there is basically no way that 'mount -t iso9660' was going to work on any of the entries in the fstab provided. | 19:22 |
smoser | so i dont know how it would have caused issues. | 19:22 |
rharper | a) already limits to block devices with iso9660; the regex is likely too narrow but practically I see your point; if there were possibly a block device that's outside of the regex, OVF claims to not want to look at it even if it has an iso9660; that's debatable but it certainly increases the "scope", however unlikely | 19:23 |
rharper | b) is more interesting; I wonder if mount's -t opens and peeks at the filesystem type | 19:24 |
rharper | if that's done via exclusive-open, it could race with a .mount unit | 19:25 |
rharper | which expects an exclusive-open | 19:25 |
smoser | right. so we should just avoid inadvertently extending the scope. | 19:25 |
smoser | (and by doing so actually further *limit* this questionable search, which is good) | 19:25 |
rharper | smoser: yes; I can add the regex check back into the search; | 19:25 |
rharper | practically it won't matter but it's certainly possible | 19:25 |
smoser | so yeah, it could be doing an exclusive open and a read-check for the fs magic | 19:25 |
smoser | but that'd seem *so* fast | 19:26 |
smoser | that i cant imagine we'd actually hit the issue | 19:26 |
smoser | open; read(4096); check; close() | 19:26 |
smoser | that is really really really fast | 19:26 |
rharper | races are races, however small | 19:26 |
rharper | the block device can be slow to respond | 19:26 |
smoser | but you wouldn't have seen someone report this | 19:26 |
smoser | or if you did, they couldnt reproduce | 19:26 |
smoser | thats my feeling. | 19:27 |
rharper | right, and | 19:27 |
smoser | i dont deny that it *could* happen | 19:27 |
rharper | its mostly only reproducible with large sets of mounts | 19:27 |
rharper | agreed | 19:27 |
smoser | just very unlikely that it is the source of the bug if i understand it right. | 19:27 |
rharper | the data from wolsen was that after disabling OVF datasource, some multi-hundred reboots never reproduces, where with OVF in, it reproduces every 3 or so | 19:28 |
rharper | this on a contrived instance with like 26 ebs volumes | 19:28 |
smoser | that sounds plausible | 19:30 |
smoser | rharper, also limiting the mount-callback-umount | 19:33 |
smoser | we are also passing -o ro | 19:33 |
rharper | it was also doing util.peek | 19:35 |
rharper | which does an open and a read | 19:35 |
rharper | well, filtering the set of devices we poke to zero (unless they have an iso9660) will surely work | 19:35 |
rharper | I generally expect OVF to run much faster now that we're not checking all the regex-matched devices, doing an open/read and also a mount_cb on each one | 19:36 |
smoser | rharper, well, blkid is going to also do an open | 19:38 |
smoser | fwiw | 19:38 |
rharper | no | 19:38 |
rharper | it's cached | 19:38 |
rharper | ds-identify runs blkid first | 19:38 |
rharper | then we just read the cache | 19:38 |
smoser | i'm not sure that is the case | 19:38 |
rharper | why do you think not? | 19:39 |
smoser | we do call find_devs_with with no_cache=False. but i have never actually been able to determine what blkid does wrt its cache. | 19:39 |
smoser | when it determines something is valid and when not | 19:39 |
smoser | run: | 19:40 |
smoser | sudo strace -o /tmp/out blkid | 19:40 |
smoser | and then run it again | 19:40 |
rharper | with -c/dev/null | 19:40 |
smoser | we only pass -c/dev/null if no_cache == true | 19:41 |
smoser | ie, thats supposed to be the "re-read everything" | 19:41 |
smoser | it should re-use its cache if *not* given that | 19:41 |
rharper | right, I just meant to compare the syscalls | 19:41 |
rharper | so you can see what the cache buys you | 19:41 |
rharper | w.r.t the cost; for the instance with 26 volumes or so, we have a total of 1.5 seconds of time spent doing 1) read /proc/mounts 2) regex match 3) peek file 4) mount_cb; | 19:43 |
rharper | thats about 60 ms per device; the filtering helps eliminate all of those; that's pretty nice; | 19:43 |
smoser | so i just never really understood what -c/dev/null does | 19:47 |
smoser | it seems to still do stuff | 19:47 |
rharper | -c means to ignore the cache; I suspect it depends on the query to some degree, | 19:49 |
rharper | smoser: around? have time for a ho on the mount race? I've got some questions I wanted to bounce off you if you're available | 21:00 |
smoser | quick. yeah. i got 10 minutes | 21:03 |
rharper | k | 21:05 |
rharper | smoser: https://hangouts.google.com/hangouts/_/canonical.com/cloud-init?authuser=1 | 21:05 |
smoser | in | 21:06 |
seven-eleven | blackboxsw, sorry i was afk and could test only now. After=cloud-final.service didn't help. but by using run_cmd it worked. I used digitalocean's API and provided run_cmd through the user_data attribute: http://dpaste.com/29JDA2Z | 21:24 |
akik | firewall-cmd --reload on centos 7 eats DOCKER-USER iptables chain. is that a problem? | 21:39 |
akik | sorry wrong channel | 21:39 |
blackboxsw | :) | 21:51 |
=== blackboxsw changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Next status meeting: Monday 10/2 14:00 UTC | ||
=== blackboxsw changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Meeting minutes: https://goo.gl/Ts2k8t | Next status meeting: Monday 10/2 14:00 UTC | ||
=== blackboxsw changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting: Monday 10/2 14:00 UTC | ||
blackboxsw | seven-eleven: good work! | 22:20 |
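
Note on the OVF probing thread (16:17 to 19:49): the approach rharper and smoser converge on is to take the candidate list from `blkid -odevice -tTYPE=iso9660` and then still apply the device-name regex so the search scope is not widened. A minimal Python sketch of that idea follows. The function names are hypothetical, and the regex is an illustrative assumption, since the log does not quote the pattern the datasource actually uses.

```python
import re
import subprocess

# Illustrative pattern only (assumption): the log mentions "the regex that was
# already being done" but never quotes it.
CANDIDATE_NAME_RE = re.compile(r"^(sr[0-9]+|hd[a-z]|xvd.*)$")


def filter_candidates(device_paths, name_re=CANDIDATE_NAME_RE):
    """Keep only devices whose basename matches name_re, preserving order."""
    kept = []
    for path in device_paths:
        name = path.rsplit("/", 1)[-1]
        if name_re.match(name):
            kept.append(path)
    return kept


def list_iso9660_devices(use_cache=True):
    """Ask blkid for block devices whose filesystem type is iso9660.

    With use_cache=False, '-c /dev/null' is passed, as discussed in the log.
    """
    cmd = ["blkid", "-odevice", "-tTYPE=iso9660"]
    if not use_cache:
        cmd[1:1] = ["-c", "/dev/null"]
    # blkid exits non-zero when nothing matches, so do not raise on failure.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.split()
```

Something like `filter_candidates(list_iso9660_devices())` would then yield the devices worth passing on to the mount callback. Whether blkid really avoids opening devices when its cache is used is the open question in the log, and this sketch does not settle it.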
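
Note on seven-eleven's question (18:50 to 21:24): what worked in the end was passing commands through user-data so they run in cloud-init's final stage, after the hostname is set. The paste itself is not reproduced in the log, so the following is only a hedged Python sketch of building such a payload. It assumes the standard `runcmd` cloud-config key and relies on cloud-config being parsed as YAML, of which this JSON is a valid subset. The helper name `build_user_data` is hypothetical.

```python
import json


def build_user_data(service_name):
    """Return a #cloud-config document that reloads systemd and restarts a unit."""
    config = {
        "runcmd": [
            # Re-read unit files so the unit is re-evaluated after the hostname change.
            "systemctl daemon-reload",
            "systemctl restart {}".format(service_name),
        ]
    }
    # JSON is emitted here only to avoid a third-party YAML dependency.
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    print(build_user_data("openvpn2"))
```

The resulting string would be supplied as the `user_data` attribute when creating the droplet, which is the mechanism seven-eleven describes.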