/srv/irclogs.ubuntu.com/2016/11/17/#cloud-init.txt

=== shardy_afk is now known as shardy
=== shardy is now known as shardy_mtg
=== shardy_mtg is now known as shardy
Odd_Blokesmoser: Are you going to be able to look at __builtin__ (i.e. https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/310432) this week, or shall I pick up work on it?14:18
smoserOdd_Bloke, if you can make tests pass, then i'm good with it in trunk14:19
smoserbut, no. i wont get there. right now i'm working on14:19
smoserhttps://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/161107414:19
Odd_Blokesmoser: OK, cool, I'll look at the tests.14:19
smoserand i'll hope to have some review from you on that...14:19
smoserits kind of a pita14:19
Odd_Blokesmoser: Ugh, that bug.14:19
smoserthe code that was there has always been racy14:19
smoseri do think that i have a reasonable plan14:20
Odd_Blokesmoser: FYI, I'm out tomorrow, Monday and Tuesday.14:20
smoseryou're on UTC hours or are you on this continent now.14:20
Odd_Blokesmoser: UTC at the moment, flying to that continent tomorrow and back on Monday.14:21
smoserfun weekend.14:21
Odd_BlokeYeah, I'm having to do a quick trip to activate my permanent residency.14:22
=== cpaelzer_ is now known as cpaelzer
smoserOdd_Bloke, that doesnt really make sense.14:39
smoser:)14:39
Odd_Bloke:)14:39
Odd_BlokeBeing with my Canadian wife (anywhere in the world) counts as "resident".14:39
Odd_BlokeAnd my medical expires in December, so I'm having to make a trip before that.14:40
Odd_Blokesmoser: Oops, you'll just have got a huge email from LP because of me; proposed merging my tests to your branch having forgotten that I rebased on to master. ¬.¬15:58
Odd_Blokesmoser: https://code.launchpad.net/~daniel-thewatkins/cloud-init/+git/cloud-init/+merge/311164 is your commit with an additional commit fixing the tests.15:58
smoserOdd_Bloke, nice. thank you!15:59
rharperOdd_Bloke: smoser: the "ephemeral0" disk in azure is mapped how? /dev/disk/by-label? by-id ?  ie, what sort of kpath would such a device have ?16:43
Odd_Blokerharper: cloud-init ships udev rules that make it appear at /dev/disk/cloud/azure_resource16:44
rharperok, lemme read those rules16:45
Odd_Blokerharper: That's only post-trusty, before xenial walinuxagent ships rules which name them slightly differently.16:45
Odd_Bloke(Well, I think walinuxagent still ships those rules)16:46
rharperOdd_Bloke: ok, would you be able to get me access to a trusty and xenial instance in azure?  I'm poking at a disk formatting and mounting bug there, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/164238316:46
Odd_Blokerharper: On it.16:48
rharperthanks16:48
Odd_Blokerharper: ubuntu@xenial-161117-1647.cloudapp.net16:51
Odd_Blokerharper: And ubuntu@trusty-161117-1648.cloudapp.net16:52
Odd_BlokeI'll leave you to work out which one is which suite. ;)16:52
rharperOdd_Bloke: thanks!16:53
Odd_Blokerharper: I'm heading off for vacation, so could you shutdown the machines when you're done with them?16:53
rharperOdd_Bloke: yes!16:54
Odd_BlokeDanke. :)16:54
rharpersmoser: looks like we'll need this https://code.launchpad.net/~daniel-thewatkins/cloud-init/lp1460715  to fix disk_setup partitioning under xenial (sfdisk changes in 16.04)18:58
smoserrharper, yeah, there are other issues... i hope/expect to fix that bug with what i have for the other19:02
smoserpart of my motivation to have you look at that one was so you'd understand my changes coming here.19:02
smoserand why they should work for both19:02
smoserbut yea... using that dynamic path is part of this.19:03
rharperk19:03
smoseras well as making sure we wait for the device to appear19:03
rharperright, that's a 16.04 ism w.r.t kernel partition awareness under systemd19:03
rharperthat sounded like the kpartx thingy19:03
rharpersmoser: would really like to replace the cc_mounts/cc_disk_setup with curtin backends  but that's not going to happen right now19:06
smoserrharper, yeah. that is absolutely true19:10
smoserand additionally we need to be able to take disk config in curtin like format.19:11
rharpery19:11
rharperanother thing for the roadmap =)19:11
rharperthe 33, 66, 88 thingy isn't parseable unless you read the help documentation, so moving to a more helpful format would be nice19:12
smoserrharper,19:53
smoserhttps://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/31120519:53
smoserOdd_Bloke, ^ also19:53
rharper+            if os.path.isfile(devpath + suff + "2"):19:54
rharperthat seems off, what if there are 3 ?19:55
rharperalso, I think it's possible to have un-ordered partitions (gpt) for example19:55
smoserwell, it would fail if there was a partition 1 and partition 3 but *not* a partition 219:55
rharperwhich is possible19:55
smoser(you can do that with mbr also)19:55
smoseryeah, it is heuristic for sure.19:55
smoseri could probably use a function in disk_setup19:55
smoserbut this was what i did initially19:56
rharperwe have a sysfs device partition counting in curtin19:56
smoseryeah, but that wont work on non-linux19:56
smoserthe sfdisk stuff in disk_ probably would19:57
rharpernot with -L Linux19:57
smoserso yeah, thats not the greatest bit of code, but it is better than what *was* there.19:57
rharperoh sure; I just don't know how far we want to go here19:57
smoser-L linux isn't "linux specific stuff"19:57
rharperhehe19:57
smoserits just "dont complain about stuff linux doesnt care about"19:57
rharperwell, it's deprecated anyhow19:58
smoserthe code that was there before didn't even check for a second partition at all19:58
rharperbut, it *is* linux specific19:58
smoser!19:58
smoser?19:58
smoseri suspect that sfdisk -L works on freebsd19:58
rharperStderr: "sfdisk: --Linux option is unnecessary and deprecated\nsfdisk: unsupported unit 'M'\n"19:58
rharperin 16.0419:58
rharperupstream util-linux changed19:59
rharperin 2014 or so19:59
smoserah. right19:59
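The partition check discussed above only tests whether a node ending in "2" exists, which misses layouts such as partitions 1 and 3 without a 2. A minimal sketch of an alternative that enumerates whichever partition numbers are present follows. The name `partition_numbers` and the suffix list are assumptions made for illustration, not code from the merge proposal, and the helper still only inspects path names, so it remains a heuristic.

```python
import glob
import os
import re
import tempfile


def partition_numbers(devpath, suffixes=("", "p", "-part")):
    """Return the sorted partition numbers whose nodes exist next to devpath.

    Hypothetical helper: it looks at path names only, but it does not
    assume that partitions are numbered contiguously.
    """
    found = set()
    for suff in suffixes:
        # A bare numeric suffix is ambiguous when the disk name itself ends
        # in a digit (nvme0n1 vs nvme0n10), so skip it in that case.
        if suff == "" and devpath[-1:].isdigit():
            continue
        prefix = devpath + suff
        pattern = re.compile(re.escape(prefix) + r"(\d+)$")
        for candidate in glob.glob(glob.escape(prefix) + "[0-9]*"):
            match = pattern.match(candidate)
            if match:
                found.add(int(match.group(1)))
    return sorted(found)


# Demo on plain files standing in for device nodes: partitions 1 and 3, no 2.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sdb", "sdb1", "sdb3"):
        open(os.path.join(tmp, name), "w").close()
    gapped = partition_numbers(os.path.join(tmp, "sdb"))

print(gapped)  # [1, 3]
```

A caller that only cares whether there is more than one partition could then test `len(partition_numbers(devpath)) > 1` instead of probing for "2" specifically.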
rharperhttps://bugs.launchpad.net/cloud-init/+bug/1460715  is still needed for partitioning on xenial20:01
rharperthe can_dev_be_reformatted is specifically there to help prevent dataloss if someone "used" the disk ...20:03
rharpercertainly it's possible to populate the existing ntfs partition though?  we can't protect against that?20:04
smoserrharper, if there are files on it, we leave it alone20:11
smoser(other than the DATALOSS_ file)20:11
rharperok, I'm posting comments, the msg on the return True, is confusing20:11
smoserrharper, addressed that.. you're right. i20:17
smoserit should have said no important files20:17
rharperI thought so after reading the rest of the code20:17
rharperand a switch to util.log_time should be easy, and would announce the wait in the log20:18
smoseri think i  might be racing systemd and mount now.20:18
rharperI've had several folks ask about making sure cloud-init announces when it's blocking ( which I think it does most of the time) but we need to do that here too20:18
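The point about announcing a blocking wait can be illustrated with a small polling loop that logs before it starts waiting and again when it finishes. This is a sketch under stated assumptions: `wait_for_paths`, its parameters, and the log wording are hypothetical, and it uses only the standard library rather than cloud-init's own `util.log_time`.

```python
import logging
import os
import tempfile
import time

LOG = logging.getLogger("wait_demo")


def wait_for_paths(paths, timeout=10.0, interval=0.1):
    """Poll until every path exists or the timeout expires.

    Returns (missing, elapsed): the paths still absent and the seconds spent.
    """
    LOG.info("Waiting up to %.1fs for %s to appear", timeout, list(paths))
    start = time.monotonic()
    missing = list(paths)
    while True:
        missing = [p for p in missing if not os.path.exists(p)]
        elapsed = time.monotonic() - start
        if not missing or elapsed >= timeout:
            break
        time.sleep(interval)
    if missing:
        LOG.warning("Still missing after %.2fs: %s", elapsed, missing)
    else:
        LOG.info("All paths appeared after %.2fs", elapsed)
    return missing, elapsed


# Demo: one path that exists and one that never appears.
with tempfile.TemporaryDirectory() as tmp:
    present_missing, _ = wait_for_paths([tmp], timeout=1.0)
    absent = os.path.join(tmp, "never-created")
    absent_missing, absent_elapsed = wait_for_paths([absent], timeout=0.2)
```

Logging the path list and the elapsed time as separate arguments keeps the message readable regardless of how many devices are being waited on.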
smoserrharper, ubuntu@40.78.152.3220:19
smoserer... smoser@20:19
rharperok20:19
smoserbyobu or /var/log/cloud-init.log20:20
smosermount -a failed as already mounted.20:20
smoserwhich should not happen until20:20
smoserdefaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig20:20
rharperurg20:20
smosermaybe in new systemd world, a mount-a is going to fail in that scenario... i'm not sure, but i'm pretty sure it should *not* fail with already mounted20:21
rharperNov 17 20:11:02 repro3 systemd[1]: Mounting /mnt...20:21
rharperNov 17 20:11:02 repro3 systemd[1]: Mounted /mnt.20:21
rharperWarning: mnt.mount changed on disk. Run 'systemctl daemon-reload' to reload units.20:21
rharper$ journalctl -o short-precise --unit mnt.mount20:22
rharper-- Logs begin at Thu 2016-11-17 20:10:56 UTC, end at Thu 2016-11-17 20:22:01 UTC. --20:22
rharperNov 17 20:11:01.513996 repro3 systemd[1]: Unmounted /mnt.20:22
rharperNov 17 20:11:02.345932 repro3 systemd[1]: Mounting /mnt...20:22
rharperNov 17 20:11:02.820894 repro3 systemd[1]: Mounted /mnt.20:22
rharpercheck the timestamps with cloud-init log20:22
smoserwell, the timestamps suck20:22
rharperNov 17 20:11:02.868532 repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed20:22
rharperjournalctl -o short-precise --unit cloud-init.service20:22
rharperwill get you real timestamps20:22
rharperso, we were too late20:23
smoseryou want short-monotonic here20:23
rharperas you said20:23
rharpersure20:23
rharperthat too20:23
smoseras clock runs all over :)20:23
rharperhrm20:23
rharperthat shows something different then20:23
rharper~$ journalctl -o short-monotonic --unit cloud-init.service | grep mount20:24
rharper[   15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed20:24
rharpersmoser@repro3:~$ journalctl -o short-monotonic --unit mnt.mount20:24
rharper-- Logs begin at Thu 2016-11-17 20:10:56 UTC, end at Thu 2016-11-17 20:22:01 UTC. --20:24
rharper[   14.406073] repro3 systemd[1]: Unmounted /mnt.20:24
rharper[   19.680401] repro3 systemd[1]: Mounting /mnt...20:24
rharper[   19.750265] repro3 systemd[1]: Mounted /mnt.20:24
rharpersomething is lying20:24
rharperbut it does seem like if we update fstab, we may need to systemctl daemon-reload20:24
rharperso generators trigger20:24
rharperwhich would do the fstab generator20:24
rharperand do the mount for us20:24
rharperIIUC20:24
smoserright. thats possible.20:25
smoserbut both the old and the new fstab entry had not before cloud-init.service20:26
smoserso it really should not be doing anything until that is finished.20:26
smoserholy moley20:27
smoserthe monotonic is not in order20:27
smoserjournalctl -o short-monotonic20:27
smoserneeds to sort20:27
rharperew20:30
rharperthat seems *wrong*20:30
smoseryeah. its probably because its just timestamping the different threads as they come in20:30
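Since the `short-monotonic` output is not guaranteed to arrive in timestamp order, merging output from several units needs an explicit sort. A minimal sketch, using lines pasted earlier in this log as input; the function name `sort_monotonic` is an assumption, and lines without a leading `[seconds]` stamp are simply kept with the preceding stamped line.

```python
import re

# Matches the leading "[   14.406073]" stamp of journalctl -o short-monotonic.
MONO = re.compile(r"^\[\s*(\d+\.\d+)\]")


def sort_monotonic(lines):
    """Return lines ordered by their monotonic timestamp (stable sort)."""
    keyed = []
    last = float("-inf")
    for index, line in enumerate(lines):
        match = MONO.match(line)
        if match:
            last = float(match.group(1))
        # The index keeps equal timestamps in their original order.
        keyed.append((last, index, line))
    return [line for _, _, line in sorted(keyed)]


journal = [
    "[   15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - "
    "util.py[WARNING]: Activating mounts via 'mount -a' failed",
    "[   14.406073] repro3 systemd[1]: Unmounted /mnt.",
    "[   19.680401] repro3 systemd[1]: Mounting /mnt...",
    "[   19.750265] repro3 systemd[1]: Mounted /mnt.",
]
ordered = sort_monotonic(journal)
for line in ordered:
    print(line)
```

Sorted this way, the unmount at 14.4 comes first, the failed `mount -a` at 15.7 second, and systemd's own mount of /mnt only at 19.7.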
rharperso,  something umounts /mnt right after cloud-init.service starts ( that's expected, right?)20:32
rharper[   14.713559] repro3 cloud-init[1037]: Cloud-init v. 0.7.8 running 'init' at Thu,20:32
rharper[   14.406073] repro3 systemd[1]: Unmounted /mnt.20:32
rharperactually before20:33
rharpernot sure why20:33
smoserwell, cloud-init *does* do that. or could20:33
rharper-local ?20:33
smoserif it was ntfs, then cloud-init mounts, reads, unmounts20:33
rharperwell, this is the mnt.service saying20:33
rharperso, that happens after fstab generator runs20:33
rharperotherwise we wouldn't have a mnt.mount unit20:33
smoserprobably not -local (looking for the datasource disk) but it could be20:33
rharperbbiab, picking up the kiddos, then back to help debug20:34
smoseryeah, everything happens after the fstab generator runs20:34
rharpersmoser: back20:58
smoseri poked slangasek, he might be looking too21:00
smoserits kind of interesting/annoying that 'mount -a' failed21:02
smoseras i've never seen that before21:02
smoser"everything is already mounted" normally exits 021:02
rharperthat seems new21:03
rharperif we run mount -a now, do we see ?21:03
rharperno21:03
rharperit returns 021:03
smoserright.21:03
rharperso, something is very strange21:03
rharperwhat's the RC?21:03
smoserso it could just be a really unlucky race21:03
rharperwell, the return code should give us a hint21:03
smoser3221:03
rharpermaybe it's ntfs return code21:04
rharper3221:04
rharpermount failure21:04
rharperthat's so helpful21:04
rharper=(21:04
rharperhttps://linux.die.net/man/8/ntfs-3g.probe doesn't list 3221:05
rharperso unlikely ntfs error21:05
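For reference, the exit status of mount(8) is a bitmask, and 32 is the generic "mount failure" bit. A small decoder sketch follows; the table is transcribed from the util-linux mount(8) man page as best recalled, and the name `explain_mount_rc` is hypothetical.

```python
# Exit status bits as documented in the util-linux mount(8) man page.
MOUNT_EXIT_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def explain_mount_rc(rc):
    """Return the list of reasons encoded in a mount(8) exit status."""
    if rc == 0:
        return ["success"]
    return [text for bit, text in sorted(MOUNT_EXIT_BITS.items()) if rc & bit]


print(explain_mount_rc(32))  # ['mount failure']
```

For `mount -a` the same man page describes 0 as all succeeded, 32 as all failed, and 64 as a mix, so 32 by itself does not say which filesystem or why.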
rharperNov 17 20:11:02.868532 repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed21:13
rharperNov 17 20:11:02.868575 repro3 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)21:13
rharpersmoser: that does look like a race , mount -a got the OK to mount it, but something mounted it before it could ?21:14
rharperthat's what 50 ms?21:14
smoserthats way shorter than that21:15
rharper6 places isn't nano, ?21:16
* rharper sucks at subsecond placement 21:16
smoseroh. but you're looking at the timestamp of the message that it failed.21:16
rharpersure, but I still think it's related21:16
smoser20:11:01,65221:16
smoseris 'Running'21:17
rharper[   14.425648] repro3 systemd[1]: Stopped File System Check on /dev/disk/cloud/azure_resource-part1.21:17
rharper[   14.484319] repro3 systemd[1]: Starting File System Check on /dev/disk/cloud/azure_resource-part1...21:17
rharper[   14.560400] repro3 systemd-fsck[1140]: /dev/sdb1: clean, 11/25688 files, 8896/102400 blocks21:17
rharper[   15.757407] repro3 cloud-init[1037]: 2016-11-17 20:11:01,686 - util.py[WARNING]: Activating mounts via 'mount -a' failed21:17
smoseryeah, that just seems completely bogus21:17
rharper[   15.440692] repro3 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)21:18
rharpersomething was "using" sdb1 (systemd-wise)21:18
rharperwhen cloud-init attempted to mount it21:18
rharperif fsck was still running, it would be seen as busy21:18
rharperie, an open on the device21:18
rharpersystemd-fsck21:18
smoser:-(21:18
smoseryeah21:18
rharperThese services are started at boot if passno in /etc/fstab for the file system is set to a value greater than zero.21:19
rharperwhich is set to 221:19
rharperwhich triggers the service21:19
rharperwe could try not setting the passno in the fstab entry and see if things just work21:21
rharperif so then we may need to mask systemd-fsck service prior to the mount (or as you see) it gets mounted automatically21:22
rharperie, if we're system_is_systemd21:22
rharperwe could skip the mount -a21:22
smoserright, on ubuntu we do not in theory need the mount -a now.21:24
smosermask systemd-fsck service ?21:24
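The passno idea above can be made concrete with a small fstab parser that reports whether an entry would trigger a boot-time fsck, plus a check for a systemd-booted system. This is a sketch only: `parse_fstab_line` and `booted_with_systemd` are hypothetical names, the `auto` filesystem type and the `0` dump field in the sample entry are assumptions, and the `/run/systemd/system` test is the check described in sd_booted(3).

```python
import os


def parse_fstab_line(line):
    """Split one fstab entry into named fields; None for blanks/comments."""
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return None
    if len(fields) < 4:
        raise ValueError("fstab entry needs at least four fields: %r" % line)
    # fs_freq and fs_passno default to 0 when omitted (see fstab(5)).
    fields = fields + ["0"] * (6 - len(fields))
    spec, mountpoint, fstype, options, freq, passno = fields[:6]
    return {
        "spec": spec,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options.split(","),
        "freq": int(freq),
        "passno": int(passno),
    }


def booted_with_systemd():
    """True when systemd is the running init, per sd_booted(3)."""
    return os.path.isdir("/run/systemd/system")


entry = parse_fstab_line(
    "/dev/disk/cloud/azure_resource-part1 /mnt auto "
    "defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig"
    " 0 2"
)
# A passno greater than zero is what makes systemd start a fsck for the device.
wants_fsck = entry["passno"] > 0
```

Writing the entry with a passno of 0, or skipping `mount -a` when `booted_with_systemd()` is true, are the two experiments suggested in the conversation; neither is shown here as a tested fix.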
smoser:-(21:28
smoseranother instance, i see:21:28
smoser2016-11-17 21:20:04,125 - DataSourceAzure.py[DEBUG]: Azure ephemeral disk: All files appeared after ['/dev/disk/cloud/azure_resource'] seconds: 021:28
smoser2016-11-17 21:20:04,125 - DataSourceAzure.py[DEBUG]: reformattable=False: device /dev/disk/cloud/azure_resource is not a file21:28
smoserwhich is very odd.21:29
rharperit's a link21:31
rharperright ?21:31
smoserright. but the check is os.exists21:31
rharperand it's /dev/disk/azure/resource21:31
rharpernot azure_resource21:31
smosererr.. isfile21:31
smoserwhich is valid21:31
smoserno its both21:32
rharperyour instance says no21:32
smoserand cloud-init is the /dev/disk/cloud/azure_resource one. the other is from walinuxagent21:32
rharperah21:32
rharpertwo paths21:33
rharperI suspect that's udev racing with symlink creation21:33
smoserbut it shouldnt ever stop existing21:34
smoserafter it exists21:34
smoseri have to go... to teacher conferences21:35
rharperk21:35
smoseri'm using http://paste.ubuntu.com/23492436/21:35
smoserto make it think its a resize21:35
smoser(just partitions it as 1 partition and mkfs ntfs)21:35
rharperand then reboot ?21:35
smoserand then reboot21:35
smoseryeah21:35
rharperok21:35
smoserhere, i'll give you another instance21:36
smosersmoser@smoser1117x.cloudapp.net21:36
smosershoot21:37
smoserisfile is just wrong21:37
smoserfix that21:37
smoseros.path.isfile("/dev/disk/cloud/azure_resource")21:37
smoserFalse21:37
rharperheh21:37
rharperislink21:37
rharperwill work21:37
rharper>>> os.path.islink('/dev/disk/cloud/azure_resource')21:37
rharperTrue21:37
smoserwell i dont want to know if its a link21:38
smosercause that says it can be dangling21:38
rharperyou need readlink -f21:38
rharperI'll get a fix21:38
smoseri just changed to exists21:38
smoserand pushed21:38
smoserso that should be fine really21:38
rharperwell, yeah21:38
smosergoing afk21:38
rharperk21:38
rharper>>> os.path.realpath('/dev/disk/cloud/azure_resource')21:39
rharper'/dev/sdb'21:39
rharper>>> os.path.exists(os.path.realpath('/dev/disk/cloud/azure_resource'))21:39
rharperTrue21:39
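The difference between the checks tried above can be reproduced without an Azure disk. The sketch below uses a symlink to `os.devnull` as a stand-in for a link to a device node, plus a dangling link; the helper name `describe` and the temporary paths are assumptions for illustration, and it assumes a POSIX system where symlinks can be created.

```python
import os
import tempfile


def describe(path):
    """Report how the common os.path checks see a path."""
    return {
        "islink": os.path.islink(path),      # does not follow the link
        "isfile": os.path.isfile(path),      # follows; regular files only
        "exists": os.path.exists(path),      # follows; False if dangling
        "realpath": os.path.realpath(path),  # resolved target path
    }


with tempfile.TemporaryDirectory() as tmp:
    # Link to a device node, like /dev/disk/cloud/azure_resource -> /dev/sdb.
    good = os.path.join(tmp, "azure_resource")
    os.symlink(os.devnull, good)
    # Link whose target does not exist.
    dangling = os.path.join(tmp, "dangling")
    os.symlink(os.path.join(tmp, "missing"), dangling)

    good_info = describe(good)
    dangling_info = describe(dangling)

print(good_info)
print(dangling_info)
```

`isfile` is False for the device link because the target is not a regular file, `islink` is True even for the dangling link, and only `exists` distinguishes a usable link from a dangling one.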
* rharper gives it a go 21:51
rharpersmoser: getting further;  with the realpath and exists going, now we're looking to find sdb1 .. it's not immediately present when we're detecting if the devices can be reformatted ..22:37
rharpersmoser: currently, sdb1 isn't being detected as ntfs  via the blkid util.find_devs_with(); debugging that23:01
rharperah , looks like it's working, but we tried to mount /dev/sdb, instead of /dev/sdb1 (real_partpath)23:29
rharperand, wrong name of the semaphore file, config_disk_setup, not config_disk_config23:35
rharperalmost there23:35
