=== shardy_afk is now known as shardy | ||
=== shardy is now known as shardy_mtg | ||
=== shardy_mtg is now known as shardy | ||
Odd_Bloke | smoser: Are you going to be able to look at __builtin__ (i.e. https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/310432) this week, or shall I pick up work on it? | 14:18 |
---|---|---|
smoser | Odd_Bloke, if you can make tests pass, then i'm good with it in trunk | 14:19 |
smoser | but, no. i wont get there. right now i'm workingon | 14:19 |
smoser | https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1611074 | 14:19 |
Odd_Bloke | smoser: OK, cool, I'll look at the tests. | 14:19 |
smoser | and i'll hope to have some review from you on that... | 14:19 |
smoser | its kind of a pita | 14:19 |
Odd_Bloke | smoser: Ugh, that bug. | 14:19 |
smoser | the code that was there has always been racy | 14:19 |
smoser | i do think that i have a reasonable plan | 14:20 |
Odd_Bloke | smoser: FYI, I'm out tomorrow, Monday and Tuesday. | 14:20 |
smoser | you're on UTC hours or are you on this continent now. | 14:20 |
Odd_Bloke | smoser: UTC at the moment, flying to that continent tomorrow and back on Monday. | 14:21 |
smoser | fun weekend. | 14:21 |
Odd_Bloke | Yeah, I'm having to do a quick trip to activate my permanent residency. | 14:22 |
=== cpaelzer_ is now known as cpaelzer | ||
smoser | Odd_Bloke, that doesnt really make sense. | 14:39 |
smoser | :) | 14:39 |
Odd_Bloke | :) | 14:39 |
Odd_Bloke | Being with my Canadian wife (anywhere in the world) counts as "resident". | 14:39 |
Odd_Bloke | And my medical expires in December, so I'm having to make a trip before that. | 14:40 |
Odd_Bloke | smoser: Oops, you'll just have got a huge email from LP because of me; proposed merging my tests to your branch having forgotten that I rebased on to master. ¬.¬ | 15:58 |
Odd_Bloke | smoser: https://code.launchpad.net/~daniel-thewatkins/cloud-init/+git/cloud-init/+merge/311164 is your commit with an additional commit fixing the tests. | 15:58 |
smoser | Odd_Bloke, nice. thank you! | 15:59 |
rharper | Odd_Bloke: smoser: the "ephemeral0" disk in azure is mapped how? /dev/disk/by-label? by-id ? ie, what sort of kpath would such a device have ? | 16:43 |
Odd_Bloke | rharper: cloud-init ships udev rules that make it appear at /dev/disk/cloud/azure_resource | 16:44 |
rharper | ok, lemme read those rules | 16:45 |
Odd_Bloke | rharper: That's only post-trusty, before xenial walinuxagent ships rules which name them slightly differently. | 16:45 |
Odd_Bloke | (Well, I think walinuxagent still ships those rules) | 16:46 |
rharper | Odd_Bloke: ok, would you be able to get me access to a trusty and xenial instance in azure? I'm poking at a disk formatting and mounting bug there, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1642383 | 16:46 |
Odd_Bloke | rharper: On it. | 16:48 |
rharper | thanks | 16:48 |
Odd_Bloke | rharper: ubuntu@xenial-161117-1647.cloudapp.net | 16:51 |
Odd_Bloke | rharper: And ubuntu@trusty-161117-1648.cloudapp.net | 16:52 |
Odd_Bloke | I'll leave you to work out which one is which suite. ;) | 16:52 |
rharper | Odd_Bloke: thanks! | 16:53 |
Odd_Bloke | rharper: I'm heading off for vacation, so could you shutdown the machines when you're done with them? | 16:53 |
rharper | Odd_Bloke: yes! | 16:54 |
Odd_Bloke | Danke. :) | 16:54 |
rharper | smoser: looks like we'll need this https://code.launchpad.net/~daniel-thewatkins/cloud-init/lp1460715 to fix disk_setup partitioning under xenial (sfdisk changes in 16.04) | 18:58 |
smoser | rharper, yeah, there are other issues... i hope/expect to fix that bug with what i have for the other | 19:02 |
smoser | part of my motivation to have you look at that one was so you'd understand my changes coming here. | 19:02 |
smoser | and why they should work for both | 19:02 |
smoser | but yea... using that dynamic path is part of this. | 19:03 |
rharper | k | 19:03 |
smoser | as well as making sure we wait for the device to appear | 19:03 |
rharper | right, that's a 16.04 ism w.r.t kernel partition awareness under systemd | 19:03 |
rharper | that sounded like the kpartx thingy | 19:03 |
rharper | smoser: would really like to replace the cc_mounts/cc_disk_setup with curtin backends but that's not going to happen right now | 19:06 |
smoser | rharper, yeah. that is absolutely true | 19:10 |
smoser | and additionally we need to be able to take disk config in curtin like format. | 19:11 |
rharper | y | 19:11 |
rharper | another thing for the roadmap =) | 19:11 |
rharper | the 33, 66, 88 thingy isn't non-parseable unless you read the help documentation, so moving to a more helpful format would be nice | 19:12 |
smoser | rharper, | 19:53 |
smoser | https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/311205 | 19:53 |
smoser | Odd_Bloke, ^ also | 19:53 |
rharper | + if os.path.isfile(devpath + suff + "2"): | 19:54 |
rharper | that seems off, what if there are 3 ? | 19:55 |
rharper | also, I think it's possible to have un-ordered partitions (gpt) for example | 19:55 |
smoser | well, it woudl fail if there was a partition 1 and partition 3 but *not* a partition 2 | 19:55 |
rharper | which is possible | 19:55 |
smoser | (you can do that with mbr also) | 19:55 |
smoser | yeah, it is hueristic for sure. | 19:55 |
smoser | i could probably use a function in disk_setup | 19:55 |
smoser | but this was what i did initially | 19:56 |
rharper | we have a sysfs device partition counting in curtin | 19:56 |
smoser | yeah, but that wont work on non-linux | 19:56 |
smoser | the sfdisk stuff in disk_ probably would | 19:57 |
rharper | not with -L Linux | 19:57 |
smoser | so yeah, thats not the greatest bit of code, but it is better than what *was* there. | 19:57 |
rharper | oh sure; I just don't know how far we want to go here | 19:57 |
smoser | -L linux isn't "linux specific stuff" | 19:57 |
rharper | hehe | 19:57 |
smoser | its just "dont complaina bout stuff linux doesnt care about" | 19:57 |
rharper | well, it's deprecated anyhow | 19:58 |
smoser | the code that was there before didn't even check for a second partition at all | 19:58 |
rharper | but, it *is* linux specific | 19:58 |
smoser | ! | 19:58 |
smoser | ? | 19:58 |
smoser | i suspect that sfdisk -L works on freebsd | 19:58 |
rharper | Stderr: "sfdisk: --Linux option is unnecessary and deprecated\nsfdisk: unsupported unit 'M'\n" | 19:58 |
rharper | in 16.04 | 19:58 |
rharper | upstream util-linux changed | 19:59 |
rharper | in 2014 or so | 19:59 |
smoser | ah. right | 19:59 |
rharper | https://bugs.launchpad.net/cloud-init/+bug/1460715 is still needed for partitioning on xenial | 20:01 |
rharper | the can_dev_be_reformatted is specifically there to help reprevent dataloss if someone "used" the disk ... | 20:03 |
rharper | certainly it's possible to populate it the ntfs existing partition though? we can't protect against that? | 20:04 |
smoser | rharper, if there are files on it, we leave it alone | 20:11 |
smoser | (other tahn the DATALOSS_ file) | 20:11 |
rharper | ok, I'm posting comments, the msg on the return True, is confusing | 20:11 |
smoser | rharper, addressed that.. you're right. i | 20:17 |
smoser | it should have said no important files | 20:17 |
rharper | I thought so after reading the rest of the code | 20:17 |
rharper | and I switch to util.log_time should be easy and announce the wait on the log | 20:18 |
smoser | i think i might be racing systemd and mount now. | 20:18 |
rharper | I've had several folks ask about making sure cloud-init announces when it's blocking ( which I think it does most of the time) but we need to do that here too | 20:18 |
smoser | rharper, ubuntu@40.78.152.32 | 20:19 |
smoser | er... smoser@ | 20:19 |
rharper | ok | 20:19 |
smoser | byobu or /var/log/cloud-init.log | 20:20 |
smoser | mount -a failed as already mounted. | 20:20 |
smoser | which should not happen until | 20:20 |
smoser | defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig | 20:20 |
rharper | urg | 20:20 |
smoser | maybe in new systemd world, a mount-a is going to fail in that scenario... i'm not sure, but i'm pretty sure it should *not* fail with already mounted | 20:21 |
rharper | Nov 17 20:11:02 repro3 systemd[1]: Mounting /mnt... | 20:21 |
rharper | Nov 17 20:11:02 repro3 systemd[1]: Mounted /mnt. | 20:21 |
rharper | Warning: mnt.mount changed on disk. Run 'systemctl daemon-reload' to reload units. | 20:21 |
rharper | $ journalctl -o short-precise --unit mnt.mount | 20:22 |
rharper | -- Logs begin at Thu 2016-11-17 20:10:56 UTC, end at Thu 2016-11-17 20:22:01 UTC. -- | 20:22 |
rharper | Nov 17 20:11:01.513996 repro3 systemd[1]: Unmounted /mnt. | 20:22 |
rharper | Nov 17 20:11:02.345932 repro3 systemd[1]: Mounting /mnt... | 20:22 |
rharper | Nov 17 20:11:02.820894 repro3 systemd[1]: Mounted /mnt. | 20:22 |
rharper | check the timestamps with cloud-init log | 20:22 |
smoser | well, the timestamps suck | 20:22 |
rharper | Nov 17 20:11:02.868532 repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed | 20:22 |
rharper | journalctl -o short-precise --unit cloud-init.service | 20:22 |
rharper | will get you real timestamps | 20:22 |
rharper | so, we were too late | 20:23 |
smoser | you want short-monotonic here | 20:23 |
rharper | as you said | 20:23 |
rharper | sure | 20:23 |
rharper | that too | 20:23 |
smoser | as clock runs all over :) | 20:23 |
rharper | hrm | 20:23 |
rharper | that shows something different then | 20:23 |
rharper | ~$ journalctl -o short-monotonic --unit cloud-init.service | grep mount | 20:24 |
rharper | [ 15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed | 20:24 |
rharper | smoser@repro3:~$ journalctl -o short-monotonic --unit mnt.mount | 20:24 |
rharper | -- Logs begin at Thu 2016-11-17 20:10:56 UTC, end at Thu 2016-11-17 20:22:01 UTC. -- | 20:24 |
rharper | [ 14.406073] repro3 systemd[1]: Unmounted /mnt. | 20:24 |
rharper | [ 19.680401] repro3 systemd[1]: Mounting /mnt... | 20:24 |
rharper | [ 19.750265] repro3 systemd[1]: Mounted /mnt. | 20:24 |
rharper | something is lieing | 20:24 |
rharper | but it does seem like if we update fstab, we may need to systemctl daemon-reload | 20:24 |
rharper | so generators trigger | 20:24 |
rharper | which would do the fstab generator | 20:24 |
rharper | and do the mount for us | 20:24 |
rharper | IIUC | 20:24 |
smoser | right. thats possible. | 20:25 |
smoser | but both the old and the new fstab entry had not before cloud-init.service | 20:26 |
smoser | so it really shuld not be doing anything until that is finished. | 20:26 |
smoser | holy moley | 20:27 |
smoser | the monotonic is not in order | 20:27 |
smoser | journalctl -o short-monotonic | 20:27 |
smoser | needs to sort | 20:27 |
rharper | ew | 20:30 |
rharper | that seems *wrong* | 20:30 |
smoser | yeah. its probably because its just timestampping the different threads as the come in | 20:30 |
rharper | so, something umounts /mnt right after cloud-init.service starts ( that's expected, right?) | 20:32 |
rharper | [ 14.713559] repro3 cloud-init[1037]: Cloud-init v. 0.7.8 running 'init' at Thu, | 20:32 |
rharper | [ 14.406073] repro3 systemd[1]: Unmounted /mnt. | 20:32 |
rharper | actually before | 20:33 |
rharper | not sure why | 20:33 |
smoser | well, cloud-init *does* do that. or could | 20:33 |
rharper | -local ? | 20:33 |
smoser | if it was ntfs, then cloud-init mounts, reads, unmounts | 20:33 |
rharper | well, this is the mnt.service saying | 20:33 |
rharper | so, that happens after fstab generator runs | 20:33 |
rharper | otherwise we wouldn't have a mnt.mount unit | 20:33 |
smoser | probably not -local (looking for the datasource disk) but it coudl be | 20:33 |
rharper | bbiab, picking up the kiddos, then back to help debug | 20:34 |
smoser | yeah, everything happens after the fstab generator runs | 20:34 |
rharper | smoser: back | 20:58 |
smoser | i poked slangasek, he might be looking too | 21:00 |
smoser | its kind of interesting/annoying that 'mount -a' failed | 21:02 |
smoser | as i've never seen that before | 21:02 |
smoser | "everything is already mounted" normally exits 0 | 21:02 |
rharper | that seems new | 21:03 |
rharper | if we run mount -a now, do we see ? | 21:03 |
rharper | no | 21:03 |
rharper | it returns 0 | 21:03 |
smoser | right. | 21:03 |
rharper | so, something is very strange | 21:03 |
rharper | what's the RC? | 21:03 |
smoser | so it could just be a really unlucky race | 21:03 |
rharper | well, the return code should give us a hint | 21:03 |
smoser | 32 | 21:03 |
rharper | maybe it's ntfs return code | 21:04 |
rharper | 32 | 21:04 |
rharper | mount failure | 21:04 |
rharper | that's so helpful | 21:04 |
rharper | =( | 21:04 |
rharper | https://linux.die.net/man/8/ntfs-3g.probe doesn't list 32 | 21:05 |
rharper | so unlikely ntfs error | 21:05 |
rharper | ov 17 20:11:02.868532 repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed | 21:13 |
rharper | Nov 17 20:11:02.868575 repro3 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null) | 21:13 |
rharper | smoser: that does look like a race , mount -a got the OK to mount it, but something mounted it before it could ? | 21:14 |
rharper | that's what 50 ms? | 21:14 |
smoser | thats way shorter than that | 21:15 |
rharper | 6 places isn't nano, ? | 21:16 |
* rharper sucks at subsecond placement | 21:16 | |
smoser | oh. but you're looking at the timestamp of the message that it failed. | 21:16 |
rharper | sure, but I still think it's related | 21:16 |
smoser | 20:11:01,652 | 21:16 |
smoser | is 'Running' | 21:17 |
rharper | [ 14.425648] repro3 systemd[1]: Stopped File System Check on /dev/disk/cloud/azure_resource-part1. | 21:17 |
rharper | [ 14.484319] repro3 systemd[1]: Starting File System Check on /dev/disk/cloud/azure_resource-part1... | 21:17 |
rharper | [ 14.560400] repro3 systemd-fsck[1140]: /dev/sdb1: clean, 11/25688 files, 8896/102400 blocks | 21:17 |
rharper | [ 15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed | 21:17 |
smoser | yeah, that just seems completely bogus | 21:17 |
rharper | [ 15.440692] repro3 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null) | 21:18 |
rharper | something was "using" sdb1 (systemd-wise) | 21:18 |
rharper | when cloud-init attempted to mount it | 21:18 |
rharper | if fsck was still running, it would be seen as busy | 21:18 |
rharper | ie, an open on the device | 21:18 |
rharper | systemd-fsck | 21:18 |
smoser | :-( | 21:18 |
smoser | yeah | 21:18 |
rharper | These services are started at boot if passno in /etc/fstab for the file system is set to a value greater than zero. | 21:19 |
rharper | whichi s set to 2 | 21:19 |
rharper | which triggers the service | 21:19 |
rharper | we could try not setting the passno in the fstab entry and see if things just work | 21:21 |
rharper | if so then we may need to mask systemd-fsck service prior to the mount (or as you see) it gets mounted automatically | 21:22 |
rharper | ie, if we're system_is_systemd | 21:22 |
rharper | we could skip the mount -a | 21:22 |
smoser | right, on ubuntu we do not in theory need the mount -a now. | 21:24 |
smoser | mask systemd-fsck service ? | 21:24 |
smoser | :-( | 21:28 |
smoser | another instance, i see: | 21:28 |
smoser | 2016-11-17 21:20:04,125 - DataSourceAzure.py[DEBUG]: Azure ephemeral disk: All files appeared after ['/dev/disk/cloud/azure_resource'] seconds: 0 | 21:28 |
smoser | 2016-11-17 21:20:04,125 - DataSourceAzure.py[DEBUG]: reformattable=False: device /dev/disk/cloud/azure_resource is not a file | 21:28 |
smoser | which is very odd. | 21:29 |
rharper | it's alink | 21:31 |
rharper | right ? | 21:31 |
smoser | right. but the check is os.exists | 21:31 |
rharper | and it's /dev/disk/azure/resource | 21:31 |
rharper | not azure_resource | 21:31 |
smoser | err.. isfile | 21:31 |
smoser | which isvalid | 21:31 |
smoser | no its both | 21:32 |
rharper | your instance says no | 21:32 |
smoser | and cloud-init is the /dev/disk/cloud/azure_resource one. the other is from walinuxagent | 21:32 |
rharper | ah | 21:32 |
rharper | two paths | 21:33 |
rharper | is suspect that's udev racing with symlink creation | 21:33 |
smoser | but it shoudlnt ever stop existing | 21:34 |
smoser | after it exists | 21:34 |
smoser | i have to go... to teacher conferences | 21:35 |
rharper | k | 21:35 |
smoser | i'm using http://paste.ubuntu.com/23492436/ | 21:35 |
smoser | to make it think its a resize | 21:35 |
smoser | (just partitions it as 1 parittion and mkfs ntsf) | 21:35 |
rharper | and then reboot ? | 21:35 |
smoser | and then reboot | 21:35 |
smoser | yeah | 21:35 |
rharper | ok | 21:35 |
smoser | here, i'll give you another instance | 21:36 |
smoser | smoser@smoser1117x.cloudapp.net | 21:36 |
smoser | shoot | 21:37 |
smoser | isfile is just wrong | 21:37 |
smoser | fix that | 21:37 |
smoser | os.path.isfile("/dev/disk/cloud/azure_resource") | 21:37 |
smoser | False | 21:37 |
rharper | heh | 21:37 |
rharper | islink | 21:37 |
rharper | will work | 21:37 |
rharper | >>> os.path.islink('/dev/disk/cloud/azure_resource') | 21:37 |
rharper | True | 21:37 |
smoser | well i dont want to know if its a link | 21:38 |
smoser | cause that says it can be danling | 21:38 |
rharper | you need readlink -f | 21:38 |
rharper | I'll get a fix | 21:38 |
smoser | i just changed to exists | 21:38 |
smoser | and pushed | 21:38 |
smoser | so that should be fine really | 21:38 |
rharper | well, yeah | 21:38 |
smoser | going afk | 21:38 |
rharper | k | 21:38 |
rharper | >>> os.path.realpath('/dev/disk/cloud/azure_resource') | 21:39 |
rharper | '/dev/sdb' | 21:39 |
rharper | >>> os.path.exists(os.path.realpath('/dev/disk/cloud/azure_resource')) | 21:39 |
rharper | True | 21:39 |
* rharper gives it a go | 21:51 | |
rharper | smoser: getting further; with the realpath and exists going, now we're looking to find sdb1 .. it's not immediately present when we're detecting if the devices can be reformatted .. | 22:37 |
rharper | smoser: currently, sdb1 isn;t being detected as ntfs via the blkid util.find_devs_with(); debugging that | 23:01 |
rharper | ah , looks like it's working, but we tried to mount /dev/sdb, instead of /dev/sdb1 (real_partpath) | 23:29 |
rharper | and, wrong name of the semaphor file, config_disk_setup, not config_disk_config | 23:35 |
rharper | almost there | 23:35 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!