[14:18] <Odd_Bloke> smoser: Are you going to be able to look at __builtin__ (i.e. https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/310432) this week, or shall I pick up work on it?
[14:19] <smoser> Odd_Bloke, if you can make tests pass, then i'm good with it in trunk
[14:19] <smoser> but, no. i wont get there. right now i'm workingon
[14:19] <smoser> https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1611074
[14:19] <Odd_Bloke> smoser: OK, cool, I'll look at the tests.
[14:19] <smoser> and i'll hope to have some review from you on that...
[14:19] <smoser> its kind of a pita
[14:19] <Odd_Bloke> smoser: Ugh, that bug.
[14:19] <smoser> the code that was there has always been racy
[14:20] <smoser> i do think that i have a reasonable plan
[14:20] <Odd_Bloke> smoser: FYI, I'm out tomorrow, Monday and Tuesday.
[14:20] <smoser> you're on UTC hours or are you on this continent now.
[14:21] <Odd_Bloke> smoser: UTC at the moment, flying to that continent tomorrow and back on Monday.
[14:21] <smoser> fun weekend.
[14:22] <Odd_Bloke> Yeah, I'm having to do a quick trip to activate my permanent residency.
[14:39] <smoser> Odd_Bloke, that doesnt really make sense.
[14:39] <smoser> :)
[14:39] <Odd_Bloke> :)
[14:39] <Odd_Bloke> Being with my Canadian wife (anywhere in the world) counts as "resident".
[14:40] <Odd_Bloke> And my medical expires in December, so I'm having to make a trip before that.
[15:58] <Odd_Bloke> smoser: Oops, you'll just have got a huge email from LP because of me; proposed merging my tests to your branch having forgotten that I rebased on to master. ¬.¬
[15:58] <Odd_Bloke> smoser: https://code.launchpad.net/~daniel-thewatkins/cloud-init/+git/cloud-init/+merge/311164 is your commit with an additional commit fixing the tests.
[15:59] <smoser> Odd_Bloke, nice. thank you!
[16:43] <rharper> Odd_Bloke: smoser: the "ephemeral0" disk in azure is mapped how? /dev/disk/by-label? by-id ?  ie, what sort of kpath would such a device have ?
[16:44] <Odd_Bloke> rharper: cloud-init ships udev rules that make it appear at /dev/disk/cloud/azure_resource
[16:45] <rharper> ok, lemme read those rules
[16:45] <Odd_Bloke> rharper: That's only post-trusty, before xenial walinuxagent ships rules which name them slightly differently.
[16:46] <Odd_Bloke> (Well, I think walinuxagent still ships those rules)
[16:46] <rharper> Odd_Bloke: ok, would you be able to get me access to a trusty and xenial instance in azure?  I'm poking at a disk formatting and mounting bug there, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1642383
[16:48] <Odd_Bloke> rharper: On it.
[16:48] <rharper> thanks
[16:51] <Odd_Bloke> rharper: ubuntu@xenial-161117-1647.cloudapp.net
[16:52] <Odd_Bloke> rharper: And ubuntu@trusty-161117-1648.cloudapp.net
[16:52] <Odd_Bloke> I'll leave you to work out which one is which suite. ;)
[16:53] <rharper> Odd_Bloke: thanks!
[16:53] <Odd_Bloke> rharper: I'm heading off for vacation, so could you shutdown the machines when you're done with them?
[16:54] <rharper> Odd_Bloke: yes!
[16:54] <Odd_Bloke> Danke. :)
[18:58] <rharper> smoser: looks like we'll need this https://code.launchpad.net/~daniel-thewatkins/cloud-init/lp1460715  to fix disk_setup partitioning under xenial (sfdisk changes in 16.04)
[19:02] <smoser> rharper, yeah, there are other issues... i hope/expect to fix that bug with what i have for the other
[19:02] <smoser> part of my motivation to have you look at that one was so you'd understand my changes coming here.
[19:02] <smoser> and why they should work for both
[19:03] <smoser> but yea... using that dynamic path is part of this.
[19:03] <rharper> k
[19:03] <smoser> as well as making sure we wait for the device to appear
[19:03] <rharper> right, that's a 16.04 ism w.r.t kernel partition awareness under systemd
[19:03] <rharper> that sounded like the kpartx thingy
[19:06] <rharper> smoser: would really like to replace the cc_mounts/cc_disk_setup with curtin backends  but that's not going to happen right now
[19:10] <smoser> rharper, yeah. that is absolutely true
[19:11] <smoser> and additionally we need to be able to take disk config in curtin like format.
[19:11] <rharper> y
[19:11] <rharper> another thing for the roadmap =)
[19:12] <rharper> the 33, 66, 88 thingy isn't non-parseable unless you read the help documentation, so moving to a more helpful format would be nice
[19:53] <smoser> rharper,
[19:53] <smoser> https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/311205
[19:53] <smoser> Odd_Bloke, ^ also
[19:54] <rharper> +            if os.path.isfile(devpath + suff + "2"):
[19:55] <rharper> that seems off, what if there are 3 ?
[19:55] <rharper> also, I think it's possible to have un-ordered partitions (gpt) for example
[19:55] <smoser> well, it woudl fail if there was a partition 1 and partition 3 but *not* a partition 2
[19:55] <rharper> which is possible
[19:55] <smoser> (you can do that with mbr also)
[19:55] <smoser> yeah, it is hueristic for sure.
[19:55] <smoser> i could probably use a function in disk_setup
[19:56] <smoser> but this was what i did initially
[19:56] <rharper> we have a sysfs device partition counting in curtin
[19:56] <smoser> yeah, but that wont work on non-linux
[19:57] <smoser> the sfdisk stuff in disk_ probably would
[19:57] <rharper> not with -L Linux
[19:57] <smoser> so yeah, thats not the greatest bit of code, but it is better than what *was* there.
[19:57] <rharper> oh sure; I just don't know how far we want to go here
[19:57] <smoser> -L linux isn't "linux specific stuff"
[19:57] <rharper> hehe
[19:57] <smoser> its just "dont complaina bout stuff linux doesnt care about"
[19:58] <rharper> well, it's deprecated anyhow
[19:58] <smoser> the code that was there before didn't even check for a second partition at all
[19:58] <rharper> but, it *is* linux specific
[19:58] <smoser> !
[19:58] <smoser> ?
[19:58] <smoser> i suspect that sfdisk -L works on freebsd
[19:58] <rharper> Stderr: "sfdisk: --Linux option is unnecessary and deprecated\nsfdisk: unsupported unit 'M'\n"
[19:58] <rharper> in 16.04
[19:59] <rharper> upstream util-linux changed
[19:59] <rharper> in 2014 or so
[19:59] <smoser> ah. right
[20:01] <rharper> https://bugs.launchpad.net/cloud-init/+bug/1460715  is still needed for partitioning on xenial
[20:03] <rharper> the can_dev_be_reformatted is specifically there to help reprevent dataloss if someone "used" the disk ...
[20:04] <rharper> certainly it's possible to populate it the ntfs existing partition though?  we can't protect against that?
[20:11] <smoser> rharper, if there are files on it, we leave it alone
[20:11] <smoser> (other tahn the DATALOSS_ file)
[20:11] <rharper> ok, I'm posting comments, the msg on the return True, is confusing
[20:17] <smoser> rharper, addressed that.. you're right. i
[20:17] <smoser> it should have said no important files
[20:17] <rharper> I thought so after reading the rest of the code
[20:18] <rharper> and I switch to util.log_time should be easy and announce the wait on the log
[20:18] <smoser> i think i  might be racing systemd and mount now.
[20:18] <rharper> I've had several folks ask about making sure cloud-init announces when it's blocking ( which I think it does most of the time) but we need to do that here too
[20:19] <smoser> rharper, ubuntu@40.78.152.32
[20:19] <smoser> er... smoser@
[20:19] <rharper> ok
[20:20] <smoser> byobu or /var/log/cloud-init.log
[20:20] <smoser> mount -a failed as already mounted.
[20:20] <smoser> which should not happen until
[20:20] <smoser> defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig
[20:20] <rharper> urg
[20:21] <smoser> maybe in new systemd world, a mount-a is going to fail in that scenario... i'm not sure, but i'm pretty sure it should *not* fail with already mounted
[20:21] <rharper> Nov 17 20:11:02 repro3 systemd[1]: Mounting /mnt...
[20:21] <rharper> Nov 17 20:11:02 repro3 systemd[1]: Mounted /mnt.
[20:21] <rharper> Warning: mnt.mount changed on disk. Run 'systemctl daemon-reload' to reload units.
[20:22] <rharper> $ journalctl -o short-precise --unit mnt.mount
[20:22] <rharper> -- Logs begin at Thu 2016-11-17 20:10:56 UTC, end at Thu 2016-11-17 20:22:01 UTC. --
[20:22] <rharper> Nov 17 20:11:01.513996 repro3 systemd[1]: Unmounted /mnt.
[20:22] <rharper> Nov 17 20:11:02.345932 repro3 systemd[1]: Mounting /mnt...
[20:22] <rharper> Nov 17 20:11:02.820894 repro3 systemd[1]: Mounted /mnt.
[20:22] <rharper> check the timestamps with cloud-init log
[20:22] <smoser> well, the timestamps suck
[20:22] <rharper> Nov 17 20:11:02.868532 repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed
[20:22] <rharper> journalctl -o short-precise --unit cloud-init.service
[20:22] <rharper> will get you real timestamps
[20:23] <rharper> so, we were too late
[20:23] <smoser> you want short-monotonic here
[20:23] <rharper> as you said
[20:23] <rharper> sure
[20:23] <rharper> that too
[20:23] <smoser> as clock runs all over :)
[20:23] <rharper> hrm
[20:23] <rharper> that shows something different then
[20:24] <rharper> ~$ journalctl -o short-monotonic --unit cloud-init.service | grep mount
[20:24] <rharper> [   15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed
[20:24] <rharper> smoser@repro3:~$ journalctl -o short-monotonic --unit mnt.mount
[20:24] <rharper> -- Logs begin at Thu 2016-11-17 20:10:56 UTC, end at Thu 2016-11-17 20:22:01 UTC. --
[20:24] <rharper> [   14.406073] repro3 systemd[1]: Unmounted /mnt.
[20:24] <rharper> [   19.680401] repro3 systemd[1]: Mounting /mnt...
[20:24] <rharper> [   19.750265] repro3 systemd[1]: Mounted /mnt.
[20:24] <rharper> something is lieing
[20:24] <rharper> but it does seem like if we update fstab, we may need to systemctl daemon-reload
[20:24] <rharper> so generators trigger
[20:24] <rharper> which would do the fstab generator
[20:24] <rharper> and do the mount for us
[20:24] <rharper> IIUC
[20:25] <smoser> right. thats possible.
[20:26] <smoser> but both the old and the new fstab entry had not before cloud-init.service
[20:26] <smoser> so it really shuld not be doing anything until that is finished.
[20:27] <smoser> holy moley
[20:27] <smoser> the monotonic is not in order
[20:27] <smoser> journalctl -o short-monotonic
[20:27] <smoser> needs to sort
[20:30] <rharper> ew
[20:30] <rharper> that seems *wrong*
[20:30] <smoser> yeah. its probably because its just timestampping the different threads as the come in
[20:32] <rharper> so,  something umounts /mnt right after cloud-init.service starts ( that's expected, right?)
[20:32] <rharper> [   14.713559] repro3 cloud-init[1037]: Cloud-init v. 0.7.8 running 'init' at Thu,
[20:32] <rharper> [   14.406073] repro3 systemd[1]: Unmounted /mnt.
[20:33] <rharper> actually before
[20:33] <rharper> not sure why
[20:33] <smoser> well, cloud-init *does* do that. or could
[20:33] <rharper> -local ?
[20:33] <smoser> if it was ntfs, then cloud-init mounts, reads, unmounts
[20:33] <rharper> well, this is the mnt.service saying
[20:33] <rharper> so, that happens after fstab generator runs
[20:33] <rharper> otherwise we wouldn't have a mnt.mount unit
[20:33] <smoser> probably not -local (looking for the datasource disk) but it coudl be
[20:34] <rharper> bbiab, picking up the kiddos, then back to help debug
[20:34] <smoser> yeah, everything happens after the fstab generator runs
[20:58] <rharper> smoser: back
[21:00] <smoser> i poked slangasek, he might be looking too
[21:02] <smoser> its kind of interesting/annoying that 'mount -a' failed
[21:02] <smoser> as i've never seen that before
[21:02] <smoser> "everything is already mounted" normally exits 0
[21:03] <rharper> that seems new
[21:03] <rharper> if we run mount -a now, do we see ?
[21:03] <rharper> no
[21:03] <rharper> it returns 0
[21:03] <smoser> right.
[21:03] <rharper> so, something is very strange
[21:03] <rharper> what's the RC?
[21:03] <smoser> so it could just be a really unlucky race
[21:03] <rharper> well, the return code should give us a hint
[21:03] <smoser> 32
[21:04] <rharper> maybe it's ntfs return code
[21:04] <rharper> 32
[21:04] <rharper> mount failure
[21:04] <rharper> that's so helpful
[21:04] <rharper> =(
[21:05] <rharper> https://linux.die.net/man/8/ntfs-3g.probe doesn't list 32
[21:05] <rharper> so unlikely ntfs error
[21:13] <rharper> ov 17 20:11:02.868532 repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed
[21:13] <rharper> Nov 17 20:11:02.868575 repro3 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[21:14] <rharper> smoser: that does look like a race , mount -a got the OK to mount it, but something mounted it before it could ?
[21:14] <rharper> that's what 50 ms?
[21:15] <smoser> thats way shorter than that
[21:16] <rharper> 6 places isn't nano, ?
[21:16]  * rharper sucks at subsecond placement 
[21:16] <smoser> oh. but you're looking at the timestamp of the message that it failed.
[21:16] <rharper> sure, but I still think it's related
[21:16] <smoser> 20:11:01,652
[21:17] <smoser> is 'Running'
[21:17] <rharper> [   14.425648] repro3 systemd[1]: Stopped File System Check on /dev/disk/cloud/azure_resource-part1.
[21:17] <rharper> [   14.484319] repro3 systemd[1]: Starting File System Check on /dev/disk/cloud/azure_resource-part1...
[21:17] <rharper> [   14.560400] repro3 systemd-fsck[1140]: /dev/sdb1: clean, 11/25688 files, 8896/102400 blocks
[21:17] <rharper> [   15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed
[21:17] <smoser> yeah, that just seems completely bogus
[21:18] <rharper> [   15.440692] repro3 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[21:18] <rharper> something was "using" sdb1 (systemd-wise)
[21:18] <rharper> when cloud-init attempted to mount it
[21:18] <rharper> if fsck was still running, it would be seen as busy
[21:18] <rharper> ie, an open on the device
[21:18] <rharper> systemd-fsck
[21:18] <smoser> :-(
[21:18] <smoser> yeah
[21:19] <rharper> These services are started at boot if passno in /etc/fstab for the file system is set to a value greater than zero.
[21:19] <rharper> whichi s set to 2
[21:19] <rharper> which triggers the service
[21:21] <rharper> we could try not setting the passno in the fstab entry and see if things just work
[21:22] <rharper> if so then we may need to mask systemd-fsck service prior to the mount (or as you see) it gets mounted automatically
[21:22] <rharper> ie, if we're system_is_systemd
[21:22] <rharper> we could skip the mount -a
[21:24] <smoser> right, on ubuntu we do not in theory need the mount -a now.
[21:24] <smoser> mask systemd-fsck service ?
[21:28] <smoser> :-(
[21:28] <smoser> another instance, i see:
[21:28] <smoser> 2016-11-17 21:20:04,125 - DataSourceAzure.py[DEBUG]: Azure ephemeral disk: All files appeared after ['/dev/disk/cloud/azure_resource'] seconds: 0
[21:28] <smoser> 2016-11-17 21:20:04,125 - DataSourceAzure.py[DEBUG]: reformattable=False: device /dev/disk/cloud/azure_resource is not a file
[21:29] <smoser> which is very odd.
[21:31] <rharper> it's alink
[21:31] <rharper> right ?
[21:31] <smoser> right. but the check is os.exists
[21:31] <rharper> and it's /dev/disk/azure/resource
[21:31] <rharper> not azure_resource
[21:31] <smoser> err.. isfile
[21:31] <smoser> which isvalid
[21:32] <smoser> no its both
[21:32] <rharper> your instance says no
[21:32] <smoser> and cloud-init is the /dev/disk/cloud/azure_resource one. the other is from walinuxagent
[21:32] <rharper> ah
[21:33] <rharper> two paths
[21:33] <rharper> is suspect that's udev racing with symlink creation
[21:34] <smoser> but it shoudlnt ever stop existing
[21:34] <smoser> after it exists
[21:35] <smoser> i have to go... to teacher conferences
[21:35] <rharper> k
[21:35] <smoser> i'm using http://paste.ubuntu.com/23492436/
[21:35] <smoser> to make it think its a resize
[21:35] <smoser> (just partitions it as 1 parittion and mkfs ntsf)
[21:35] <rharper> and then reboot ?
[21:35] <smoser> and then reboot
[21:35] <smoser> yeah
[21:35] <rharper> ok
[21:36] <smoser> here, i'll give you another instance
[21:36] <smoser> smoser@smoser1117x.cloudapp.net
[21:37] <smoser> shoot
[21:37] <smoser> isfile is just wrong
[21:37] <smoser> fix that
[21:37] <smoser> os.path.isfile("/dev/disk/cloud/azure_resource")
[21:37] <smoser> False
[21:37] <rharper> heh
[21:37] <rharper> islink
[21:37] <rharper> will work
[21:37] <rharper> >>> os.path.islink('/dev/disk/cloud/azure_resource')
[21:37] <rharper> True
[21:38] <smoser> well i dont want to know if its a link
[21:38] <smoser> cause that says it can be danling
[21:38] <rharper> you need readlink -f
[21:38] <rharper> I'll get a fix
[21:38] <smoser> i just changed to exists
[21:38] <smoser> and pushed
[21:38] <smoser> so that should be fine really
[21:38] <rharper> well, yeah
[21:38] <smoser> going afk
[21:38] <rharper> k
[21:39] <rharper> >>> os.path.realpath('/dev/disk/cloud/azure_resource')
[21:39] <rharper> '/dev/sdb'
[21:39] <rharper> >>> os.path.exists(os.path.realpath('/dev/disk/cloud/azure_resource'))
[21:39] <rharper> True
[21:51]  * rharper gives it a go 
[22:37] <rharper> smoser: getting further;  with the realpath and exists going, now we're looking to find sdb1 .. it's not immediately present when we're detecting if the devices can be reformatted ..
[23:01] <rharper> smoser: currently, sdb1 isn;t being detected as ntfs  via the blkid util.find_devs_with(); debugging that
[23:29] <rharper> ah , looks like it's working, but we tried to mount /dev/sdb, instead of /dev/sdb1 (real_partpath)
[23:35] <rharper> and, wrong name of the semaphor file, config_disk_setup, not config_disk_config
[23:35] <rharper> almost there