[00:37] <blackboxsw> thanks smoser
[00:37] <smoser> blackboxsw, tox && git pushing now
[00:38] <smoser> then i'll upload again
[00:46] <smoser> blackboxsw, ok. that is uploaded. i go away for the night now.
[00:49] <smoser> blackboxsw, https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/330954
[00:49] <smoser> and now i really *am* gone.
[00:49] <smoser> later.
[00:50] <smoser> uploaded
[16:13] <rharper> smoser: do we know of any public cloud where we can test DataSourceOVF?
[16:14] <smoser> well you can test it locally
[16:14] <smoser> by providing an ovf iso. doc shows how to create one
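(For reference, a rough sketch of building such an OVF ISO locally; DataSourceOVF looks for an ovf-env.xml on an iso9660 filesystem, but see cloud-init's doc/sources/ovf/ for the maintained example and helper script:)

    # build an ISO carrying the OVF environment file at its root
    genisoimage -o ovf-transport.iso -r -J ovf-env.xml
    # attach ovf-transport.iso to the VM as a CD-ROM and boot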
[16:15] <rharper> ok
[16:15] <smoser> sankaradita has one
[16:16] <rharper> ok
[16:17] <rharper> https://code.launchpad.net/~raharper/cloud-init/+git/cloud-init/+merge/330995    this is what I'm going to test; I think it's a good cleanup w.r.t. reducing the number of devices we actually probe (and re-using the blkid cache)
[16:18] <dpb1> rharper: for a workaround for the case?  or a medium term fix?
[16:18] <rharper> long term
[16:19] <dpb1> k
[16:19] <rharper> it's only looking for devices with an iso9660 filesystem; blkid can do that quite quickly (and cloud-init respects the blkid cache); ds-identify will probe this early
[16:19] <rharper> so, this should speed up OVF probing dramatically for systems with more than just one block device
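(Roughly what the new probe amounts to; on a many-volume system this returns only the actual ISO devices instead of everything matching a device-name regex:)

    # list only block devices carrying an iso9660 filesystem,
    # answering from the blkid cache when one is available
    blkid -o device -t TYPE=iso9660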
[16:27] <smoser> rharper, i'm pretty sure the mounting we're doing in the default case is doing a mount -t iso9660
[16:27] <smoser> which i'd think would fail unless you had iso9660
[16:27] <smoser> so i'm not sure how this would cause races with other mounts.
[16:34] <smoser> not sure why we're failing https://code.launchpad.net/~cloud-init-dev/+recipe/cloud-init-daily-xenial
[16:36] <blackboxsw> test_openstack_on_non_intel_is_maybe   hmm
[16:36] <blackboxsw> checking; maybe it's a leaked mock thing?
[16:36] <blackboxsw> or an unmocked leak, I mean
[16:40] <smoser> yeah. that's all i can think of.
[16:41] <smoser> i can't reproduce it on the branch (ubuntu/xenial) though
[16:41] <smoser> even with patches applied
[16:41] <smoser> so trying in a container
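(A plausible shape for that container check; the package set and invocation are assumptions, not from the log:)

    lxc launch ubuntu:xenial ci-repro
    lxc exec ci-repro -- sh -c '
      apt-get update && apt-get install -y git tox
      git clone -b ubuntu/xenial https://git.launchpad.net/cloud-init
      cd cloud-init && tox'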
[18:32] <smoser> blackboxsw, https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/331012
[18:50] <seven-eleven> hi
[18:50] <seven-eleven> how can i run something after cloud-init is done?
[18:50] <seven-eleven> basically i want my custom systemd unit file "openvpn2.service" to run after cloud-init sets the hostname
[18:51] <blackboxsw> smoser: approved
[18:54] <seven-eleven> what i tried and didn't work: (1) appending After=cloud-init.target to openvpn2.service, (2) supplying user-data via the droplet's **user_data kwarg: "systemctl daemon-reload; systemctl restart openvpn2"
[18:55] <blackboxsw> seven-eleven: :n
[18:56] <blackboxsw> seven-eleven: sorry I'm looking now. I thought systemd chaining would do that for you. but I'm checking which service you need to be after
[18:56] <seven-eleven> maybe i can tell cloud-init in /etc/cloud/cloud.cfg @ cloud_final_modules: to restart my openvpn2, so it grabs the new hostname :]
[18:56] <seven-eleven> blackboxsw, let me paste my service
[18:57] <seven-eleven> blackboxsw, my customized openvpn2.service unit http://dpaste.com/1YF2AV0 which uses %H for hostname
[19:15] <seven-eleven> it's actually starting my service fine, but when i want to restart the service i have to run `systemctl daemon-reload`, else it fails. i can leave it like that, but it's not perfect :-)
[19:16] <seven-eleven> it fails with the error "Failed to start OpenVPN connection to bus1." and bus1 is the hostname of the vm i created the snapshot from
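(That matches how systemd resolves specifiers: %H expands to the hostname at the point the unit configuration is loaded, so a unit loaded before cloud-init changed the hostname keeps the old value, bus1, until a daemon-reload. A minimal sketch of such a unit; the pasted original isn't preserved, so names and paths here are illustrative:)

    [Unit]
    Description=OpenVPN connection to %H
    After=network-online.target

    [Service]
    # %H is expanded when the unit is loaded, not when it starts
    ExecStart=/usr/sbin/openvpn --config /etc/openvpn/%H.conf

    [Install]
    WantedBy=multi-user.target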
[19:18] <rharper> smoser: re mount -t iso9660, we passed mtype to util.mount_cb; all this does is change the list of candidate devices from os.listdir(dev) to blkid -odevice -tTYPE=iso9660; we still pass mtype into mount_cb
[19:19] <blackboxsw> seven-eleven: not sure if this helps, but what about After=cloud-final.service
[19:19] <blackboxsw> that's the last stage to run
[19:19] <blackboxsw> http://cloudinit.readthedocs.io/en/latest/topics/boot.html#final
[19:20] <blackboxsw> seven-eleven: you could also add a runcmd: ['systemctl restart <yourservice>']   and that should run in the final stage
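(A sketch of that user-data, folding in the daemon-reload seven-eleven found necessary so the unit re-expands %H against the new hostname:)

    #cloud-config
    runcmd:
     - [systemctl, daemon-reload]
     - [systemctl, restart, openvpn2.service]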
[19:20] <smoser> rharper, right.
[19:20] <smoser> but two things i was suggesting
[19:21] <seven-eleven> blackboxsw, let me try. i was also thinking to maybe add Before= to cloud-final
[19:21] <smoser> a.) you can/should further limit the results through the regex that was already being done. otherwise you risk extending the breadth inadvertently.
[19:21] <smoser> b.) the fact that we were passing '-t iso9660' makes me think that this was not actually the bug
[19:22] <smoser> because there is basically no way that 'mount -t iso9660' was going to work on any of the entries in the fstab provided.
[19:22] <smoser> so i don't know how it would have caused issues.
[19:23] <rharper> a) already limits to block devices with iso9660; the regex is likely too narrow, but practically I see your point; if there were a block device outside of the regex, OVF claims to not want to look at it even if it has an iso9660; that's debatable, but it certainly increases the "scope", however unlikely
[19:24] <rharper> b) is more interesting;  I wonder if mount's -t opens and peeks at the filesystem type
[19:25] <rharper> if that's done via exclusive-open, it could race with a .mount unit
[19:25] <rharper> which expects an exclusive-open
[19:25] <smoser> right. so we should just avoid inadvertently extending the scope.
[19:25] <smoser> (and by doing so actually further *limit* this questionable search, which is good)
[19:25] <rharper> smoser: yes; I can add the regex check back into the search;
[19:25] <rharper> practically it won't matter but it's certainly possible
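(Roughly the combined filter being agreed on here; the device-name pattern is illustrative, not the exact regex from the source:)

    # start from devices that actually carry iso9660, then re-apply
    # the CD-ROM-ish device-name restriction to keep the scope narrow
    blkid -o device -t TYPE=iso9660 | grep -E '/dev/(sr[0-9]+|hd[a-z]|xvd[a-z])$'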
[19:25] <smoser> so yeah, it could be doing an exclusive open and a read-check for the fs magic
[19:26] <smoser> but that'd seem *so* fast
[19:26] <smoser> that i can't imagine we'd actually hit the issue
[19:26] <smoser> open; read(4096); check; close()
[19:26] <smoser> that is really really really fast
[19:26] <rharper> races are races, however small
[19:26] <rharper> the block device can be slow to respond
[19:26] <smoser> but you wouldn't have seen someone report this
[19:26] <smoser> or if you did, they couldn't reproduce
[19:27] <smoser> that's my feeling.
[19:27] <rharper> right, and
[19:27] <smoser> i don't deny that it *could* happen
[19:27] <rharper> it's mostly only reproducible with large sets of mounts
[19:27] <rharper> agreed
[19:27] <smoser> just very unlikely that it is the source of the bug if i understand it right.
[19:28] <rharper> the data from wolsen was that after disabling the OVF datasource, some multi-hundred reboots never reproduced it, whereas with OVF enabled it reproduces every 3 or so
[19:28] <rharper> this on a contrived instance with like 26 EBS volumes
[19:30] <smoser> that sounds plausible
[19:33] <smoser> rharper, also limiting the mount-callback-umount
[19:33] <smoser> we are also passing -o ro
[19:35] <rharper> it was also doing util.peek
[19:35] <rharper> which does an open and a read
[19:35] <rharper> well, filtering the set of devices we poke to zero (unless they have an iso9660) will surely work
[19:36] <rharper> I generally expect OVF to run much faster now that we're not regex-matching all those devices, opening/reading, and also mount_cb'ing each one
[19:38] <smoser> rharper, well, blkid is going to also do an open
[19:38] <smoser> fwiw
[19:38] <rharper> no
[19:38] <rharper> it's cached
[19:38] <rharper> ds-identify runs blkid first
[19:38] <rharper> then we just read the cache
[19:38] <smoser> i'm not sure that is the case
[19:39] <rharper> why do you think not?
[19:39] <smoser> we do call find_devs_with with no_cache=False. but i have never actually been able to determine what blkid does wrt its cache.
[19:39] <smoser> when it determines something is valid and when not
[19:40] <smoser> run:
[19:40] <smoser> sudo strace -o /tmp/out blkid
[19:40] <smoser> and then run it again
[19:40] <rharper> with -c/dev/null
[19:41] <smoser> we only pass -c/dev/null if no_cache == true
[19:41] <smoser> i.e., that's supposed to be the "re-read everything" mode
[19:41] <smoser> it should re-use its cache if *not* given that
[19:41] <rharper> right, I just meant to compare the syscalls
[19:41] <rharper> so you can see what the cache buys you
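(One way to run that comparison; the scratch paths are arbitrary:)

    sudo strace -f -o /tmp/cached blkid
    sudo strace -f -o /tmp/uncached blkid -c /dev/null
    # compare how many block devices each variant actually opens
    grep -c 'open.*"/dev/' /tmp/cached /tmp/uncached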
[19:43] <rharper> w.r.t. the cost: for the instance with 26 volumes or so, we have a total of 1.5 seconds spent doing 1) reading /proc/mounts, 2) regex matching, 3) peeking at the file, 4) mount_cb
[19:43] <rharper> that's about 60 ms per device; the filtering eliminates all of that, which is pretty nice
[19:47] <smoser> so i just never really understood what -c/dev/null does
[19:47] <smoser> it seems to still do stuff
[19:49] <rharper> -c means to ignore the cache; I suspect it depends on the query to some degree
[21:00] <rharper> smoser: around?  have time for a hangout on the mount race?  I've got some questions I wanted to bounce off you if you're available
[21:03] <smoser> quick. yeah. i got 10 minutes
[21:05] <rharper> k
[21:05] <rharper> smoser: https://hangouts.google.com/hangouts/_/canonical.com/cloud-init?authuser=1
[21:06] <smoser> in
[21:24] <seven-eleven> blackboxsw, sorry i was afk and could only test now. After=cloud-final.service didn't help, but using runcmd worked. I used digitalocean's API and provided runcmd through the user_data attribute: http://dpaste.com/29JDA2Z
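(The general shape of that call; the paste isn't preserved, so the droplet parameters below are placeholders, and only the user_data mechanism is the point:)

    curl -X POST "https://api.digitalocean.com/v2/droplets" \
      -H "Authorization: Bearer $DO_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"name": "vpn-node", "region": "nyc3", "size": "s-1vcpu-1gb",
           "image": "ubuntu-16-04-x64",
           "user_data": "#cloud-config\nruncmd:\n - systemctl daemon-reload\n - systemctl restart openvpn2.service"}'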
[21:39] <akik> firewall-cmd --reload on centos 7 eats DOCKER-USER iptables chain. is that a problem?
[21:39] <akik> sorry wrong channel
[21:51] <blackboxsw> :)
[22:20] <blackboxsw> seven-eleven: good work!