/srv/irclogs.ubuntu.com/2018/01/03/#cloud-init.txt

=== shardy is now known as shardy_mtg
=== shardy_mtg is now known as shardy
Odd_Blokesmoser: rharper: Does https://bugs.launchpad.net/cloud-images/+bug/1740176 look at all familiar as a cloud-init bug?15:54
ubot5Launchpad bug 1740176 in cloud-images "Disksize is only 2GB, expected in 10GB" [Undecided,Confirmed]15:54
Odd_BlokeWe haven't triaged it yet, but thought it might look like something you know how to handle.15:54
smoserOdd_Bloke: did you recreate that ?16:07
Odd_Blokesmoser: I think Tribaal did.16:16
Tribaalwell, some user reported when using the artful Vagrant image16:17
Tribaaland I could reproduce it16:17
dpb1Tribaal: Uppercase T?16:17
Tribaaldpb1: says the guy with a "1" appended to his nick :)16:18
dpb1touche!16:18
Tribaaldpb1: I think lowercase was already taken16:18
dpb1good ol irc16:19
=== Tribaal is now known as tribaal
tribaalthere! :)16:19
dpb1hi tribaal16:19
dpb1welcome16:19
blackboxswtribaal: yo. we were just talking about that bug16:27
tribaaloh?16:28
blackboxswsmoser: thought it might be related to https://bugs.launchpad.net/cloud-images/+bug/172681816:28
ubot5Launchpad bug 1726818 in linux (Ubuntu) "vagrant artful64 box filesystem too small" [High,In progress]16:28
blackboxswand happy new year BTW16:28
smoseryeah, i marked as a dupe. it certainly smells that way.16:28
tribaalthat looks like a total dupe indeed16:28
smosertribaal: it would be good to know if this is present in bionic16:29
tribaalsmoser: give me a sec, should be easy to try16:29
smoseras it really needs to be fixed in bionic.16:30
tribaal+10016:30
smosertribaal and then, it looks like there is a 4.14 in proposed.16:31
smoseri suspect that this is probably still present in bionic in the 4.13 that is in the release pocket16:31
smoserbut has a chance of being fixed in -proposed 4.1416:31
smoserhttps://launchpad.net/ubuntu/+source/linux16:31
tribaalsmoser: indeed, bionic is affected as well (well, as of the 20180101 image)16:35
tribaalsmoser: I'll enable -proposed in the machine and reboot - could you remind me what to nuke to make cloud-init think it's running for the first time? /var/lib/cloud/*?16:38
smosertribaal: cloud-init clean16:39
tribaalsmoser: TIL! Sweet16:39
smosertribaal: i'd wonder though if the image might be 'dirty' at that point16:42
smoseri dont know how easy it is to supply your own image to vagrant16:42
smoserbut you might be able to modify16:42
smoser https://github.com/cloud-init/qa-scripts/blob/master/scripts/get-proposed-cloudimg16:42
smoseror basically do what it does16:42
smoserbetter to create yourself a "clean" image16:43
tribaalsmoser: yeah, I could just build a vagrant image with -proposed enabled but that takes much longer16:43
smoserblackboxsw:16:43
smoserroot@b2:~# cloud-init clean16:43
smoserERROR: Could not remove instance: Cannot call rmtree on a symbolic link16:43
blackboxswboo!16:43
blackboxswwill work up a fix16:43
tribaal(and I'm almost EOD)16:43
smoserexits non-zero too, and that is clearly not a error.16:43
smosertribaal: well, essentially i suspect you may be able to do16:44
smoser  mount-image-callback your.orig.image --system-resolvconf -- /bin/bash16:45
smoserthen inside, just enable proposed, apt-get update, apt-get install linux-image (or whatever package that is)16:45
smoserthen exit16:45
tribaalthe dirty way didn't produce the clear case, so I'll have to do something like this, yes (or just build an image with -proposed enabled)16:50
blackboxswsmoser: how'd you reproduce that cloud-init clean. I launched a bionic container, by default it doesn't hit this symlink error17:01
blackboxswI assumed bionic because your instance was named b217:02
blackboxswin either case, I can fix it easily enough. but wondered how we got there17:02
powersjblackboxsw: https://paste.ubuntu.com/26314129/17:19
blackboxsw+1 powersj I  have a fix, just adding unit tests, will try to reproduce here too17:20
smoserblackboxsw: hm.17:27
smoseri just fresh launched bionic-daily container17:28
blackboxswcsmith@uptown:~$ lxc --version17:31
blackboxsw2.0.1117:31
blackboxswheh. powersj17:31
blackboxswxenial17:31
blackboxswfor the loss17:31
powersjexcept I wouldn't expect that to make a difference in this case :\17:32
powersjright? should only be the version of cloud-init in the image17:32
blackboxswone would think. I'm testing from zesty17:32
smoserhm.17:33
blackboxswhttps://pastebin.ubuntu.com/26314191/17:34
blackboxswyeah from zesty, still no problem on my side17:34
blackboxswanyway we can easily test for link and unlink if needed17:34
smoserthis is weird17:35
blackboxswbut not quite sure why that shows up in some cases17:35
smoseri reproduced again, but then not17:35
powersj299e803c9fe1 | no     | ubuntu 18.04 LTS amd64 (daily) (20180101)17:36
powersjthat's the bionic image I used17:36
smoserhttp://paste.ubuntu.com/26314202/17:36
blackboxswhttps://pastebin.ubuntu.com/26314219/17:38
blackboxswseemingly the same, but I get success (I would've expected it to always fail17:38
blackboxswon instance symlink)17:38
smoseryeah. hmm.17:38
blackboxswI'm adding a debug print17:39
smoserblackboxsw: i am guessing.17:40
smoserbut i suspect thatlistdir()17:40
smoserlistdir('.')17:40
smoseris not any specific order17:40
smoserand that when it does instances first its ok but when instance first it is not17:41
smoseror reverse17:41
blackboxswyeah, I could have sorted() the dir list and then we would've always seen it. I bet because is_dir returns false when a symlink target is already broken17:43
smoserhttp://paste.ubuntu.com/26314255/17:43
blackboxswyeah that's the fix I have17:43
blackboxswsame one. just wanted to know why17:44
smosersorted is arbitrary17:44
blackboxswit seems dirlist is indeterminate apparently. as you see the issue only sometimes17:44
smoseryeah, its not guaranteed sorted.17:44
smoserit is just traversing the dirent17:45
smoserand even if it was, its arbitrary that 'instance' would sort before 'instances'17:45
blackboxswhttps://pastebin.ubuntu.com/26314266/17:47
blackboxswI would've thought sorting would have put instance before instances too, but yeah the dirent iterator isn't sorted17:48
blackboxswok anyway pushing the fix and unit test17:48
blackboxswI would've thought that even a sorted list would have 'instances' > 'instance'17:49
blackboxswbut maybe that's arbitrary too as you point out17:49
blackboxswsorry was typing on a different keyboard.17:50
smoserblackboxsw: sorting would fix it i think, but just seems arbitrary from the perspective of it will start to fail again if we had a link named zz to aa17:56
blackboxswright, yeah it was a fragile fix to sort(and would have been wrong)17:58
blackboxswbecause it ignored the problem (that we weren't handling symlinks)17:58
blackboxswhttps://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/335671 has the fix17:58
blackboxswdidn't realize I was doing something a bit different than your suggestion.17:58
blackboxswutils util.is_link instead17:58
blackboxsws/utils/using/17:58
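The symlink handling discussed above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the merge proposal: the function name remove_dir_contents is hypothetical, and it uses os.path.islink where the fix is described as using util.is_link. The point is that a link is unlinked before any is-directory check, so the result no longer depends on the order listdir() happens to return.

```python
import os
import shutil


def remove_dir_contents(path):
    """Delete everything under path without ever calling rmtree on a symlink.

    Hypothetical helper for illustration; not cloud-init's actual code.
    """
    for name in os.listdir(path):  # listdir() order is not guaranteed
        entry = os.path.join(path, name)
        if os.path.islink(entry):
            # Handles links to directories and dangling links alike.
            os.unlink(entry)
        elif os.path.isdir(entry):
            shutil.rmtree(entry)
        else:
            os.unlink(entry)
```

With this ordering of checks, it does not matter whether 'instance' (the link) or 'instances' (the real directory) is visited first.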
dojordan@blackboxsw - happy new year! is there anything blocking checking in my PR to master? https://code.launchpad.net/~dojordan/cloud-init/+git/cloud-init/+ref/azure-preprovisioning18:13
blackboxswdojordan: I think there was a side discussion I had with smoser that we might hit an issue with systemd unit timeouts. if we attempt to block indefinitely in a polling loop in cloud-init's unit systemd might timeout at 5 minutes??? which could cause the behavior you are looking for to fail.18:17
blackboxswI'm not sure about the 5 minute auto-timeout in systemd, lemme find a reference to see if I can dig up a doc on it18:18
dojordanI've tested in 16.04 with polling for much longer than that, and systemd didn't kill cloud init or anything18:18
dojordanwhat is the service name?18:19
blackboxswcloud-init.service I believe18:21
dojordandoug@dojordandev:~$ systemctl show cloud-init.service -p TimeoutStopUSec TimeoutStopUSec=infinity18:23
dojordanthat's why it worked :)18:23
blackboxswor cloud-init.target; ahh there you go18:24
blackboxswooops18:25
blackboxswahh I mean18:25
blackboxswlooks like 'we' (cloud-init) don't explicitly set that timeout to infinity, but maybe an OS-specific setting18:26
blackboxswohh wait18:26
blackboxswTimeoutSec=018:26
dojordanthat infinity timeout was on the azure 16.04 LTS image18:27
blackboxswlooks like we set that for the systemd/cloud-final.service.tmpl18:27
dojordangot it18:27
blackboxswok this might not really be an 'issue' then, though it *may* be worth us explicitly configuring that infinity timeout on azure images... I'm not certain.18:28
dojordanis 0 == infinity?18:28
* blackboxsw thinks, as TimeoutSec == setting for both TimeoutStart and timeoutStop18:30
blackboxswbut trying to confirm18:30
blackboxswsd_notify(3)).18:30
blackboxswTimeoutSec=18:30
blackboxswA shorthand for configuring both TimeoutStartSec= and TimeoutStopSec= to the specifie18:30
blackboxswfrom https://www.freedesktop.org/software/systemd/man/systemd.service.html18:30
smoserdojordan: i think there is still general issues with timeouts18:31
blackboxswthough I don't see a reference that "0" == 'infinity' for Timeout(Stop/Start)Sec18:31
dojordanit just says "Pass "infinity" to disable the timeout logic." on the timeoutstopsec18:32
smoserhm..18:32
dojordanbut are we currently doing that...18:32
smoserbut i dont know how other things in boot handle it18:32
smoserso cloud-init-local or cloud-init.service may have TimeoutSec set correctly18:32
smosermaybe i'm wrong.18:33
dojordandoug@dojordandev:~$ systemctl show cloud-init.local -p TimeoutStopUSec TimeoutStopUSec=1min 30s18:33
smoserbut i swear i've seen boot just go on when i didn't think it should.18:33
dojordanso local is showing 1:30, but service is showing infinite18:33
smoserdojordan: well you typo'd there.18:34
smosercloud-init.local is not a service18:34
smosercloud-init-local.service18:34
dojordand'oh....18:34
dojordan"cloud-init-local.service" == TimeoutStopUSec=infinity18:35
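The unit-by-unit check above is easy to get wrong by hand (as the cloud-init.local typo shows), so here is a small sketch that runs the same systemctl show query for several units. It is an assumption-laden illustration: the helper names parse_show_output and stop_timeout are made up, and the only systemd interface relied on is the `systemctl show UNIT -p PROPERTY` form already used in the conversation.

```python
import subprocess


def parse_show_output(text):
    """Turn 'Key=Value' lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def stop_timeout(unit):
    """Return the TimeoutStopUSec value systemd reports for unit, or None."""
    out = subprocess.run(
        ["systemctl", "show", unit, "-p", "TimeoutStopUSec"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_show_output(out).get("TimeoutStopUSec")
```

On a systemd host, calling stop_timeout("cloud-init-local.service") and stop_timeout("cloud-init.service") would give the two values being compared here.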
smoserdojordan: so how does /var/lib/waagent/poll_imds get written?18:42
smosernewer ubuntu and cloud-init i think do not rely on waagent at all18:42
smoserand i'm not really interested in adding such a dependency back18:42
dojordanthere is no dep on waagent18:42
dojordanwe can change the path - it should probably be in the instance directory18:43
smoserwhat creates it ?18:43
dojordanthe azure data source18:43
smoseroh. ok. yeah. i see that. the waagent threw me off.18:43
dojordanalso, I confirmed why the timeout is infinity - since we are using type=oneshot the timeout is disabled18:44
smoseri suspect that 0 did mean infinity at some point18:44
smoserand probably still does18:45
smoserREPROVISION_MARKER_FILE, why do we need that?18:45
dojordanthe marker file (/var/lib/waagent/poll_imds) is needed in case the VM reboots for whatever reason before it is reused by a customer18:46
smoserrather than internal state or something.18:46
dojordanwe report ready to the fabric which means we detach the provisioning ISO18:46
smoserwhy would it reboot?18:46
dojordanhardware, software updates, etc18:46
dojordanunderlying platform18:46
dojordanand we don't write the Ovf since we don't want to persist any azure specific data since the real ovf will come from the customer18:47
dojordanwe will be in this polling loop for a while18:47
dojordanand if there is an unexpected reboot we want to keep polling when the vm comes back up18:47
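The marker-file idea described above can be sketched roughly as follows. This is a generic illustration of the pattern, not the Azure datasource code: poll_until_reused, should_resume_polling and the fetch_new_config callback are hypothetical names, and the marker path is passed in rather than hard-coded.

```python
import os
import time


def poll_until_reused(marker_path, fetch_new_config, interval=1.0):
    """Poll fetch_new_config() until it returns data, keeping a marker while waiting."""
    # Written before the first poll so an unexpected reboot mid-wait leaves a trace.
    with open(marker_path, "w") as marker:
        marker.write("polling\n")
    while True:
        config = fetch_new_config()
        if config is not None:
            # Real configuration arrived; the marker is no longer needed.
            os.unlink(marker_path)
            return config
        time.sleep(interval)


def should_resume_polling(marker_path):
    """On boot, a leftover marker means the previous wait was interrupted."""
    return os.path.exists(marker_path)
```

The marker lives on disk rather than in memory precisely because the state has to survive a reboot of a VM that nobody owns yet.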
smoserit seems odd that the platform would choose to reboot such a machine18:49
dojordanin azure, all of our VMs are backed by remote storage. so when hardware issues occur, we simply move the VM to a new machine since the data is persisted and reboot it18:49
dojordanby data i mean the os vhd18:51
smoserseems like maybe you could just kill machines in this state. as they're not owned by anyone while maybe everything "should work" if you just kill a machine while booting and then re-start it, i suspect that in reality there are lots of issues.  but thats not really important here.18:52
paulmeysmoser: true, but that would require some rearchitecting on a different level which is not going to happen any time soon...18:55
dojordanthe same behavior is true today for all VMs. if they reboot before cloud-init finishes we just move them18:58
dojordan@smoser these are all valid points. the underlying platform is not perfect, and there are cases today where we don't get an ACK from our remote storage layer, then the VM will be busted. I think the important thing about the marker file is that allows us to keep the pre provisioned vms around for longer which enables us to have a higher hit rate for reuse and therefore increase boot performance. the availability won't be any w19:30
smoserdojordan: i responded on mp there.19:37
smosersorry for taking so long to take a look at your proposal19:37
smoserdojordan: please don't take offense. over all, you've done a good job.19:37
dojordanno worries, appreciate the feedback. Will address the PR comments later today19:38
blackboxsw pushed a couple changes to the review-mps script in qa-scripts repo for landing branches20:03
blackboxswslowly building it into something useful/working20:03
blackboxswthx for the review btw. landed20:03
ivveheya guys, im doing a really ugly hack with write_files but been getting lots of trouble getting write_files to work at all, i have a hard time writing two files as well. first example https://hastebin.com/otubucuqin.pl , second example https://hastebin.com/egoxufopab.js20:07
ivvelike the simplest example with only a path + content works... other than that i have a really hard time getting it to execute20:08
ivveis my syntax way off?20:09
blkadderYou might find it easier just to base64 it.20:10
ivveok20:11
ivveso - encoding: b64 ?20:11
ivveand then just content: |20:11
blkadderOne sec.20:11
blkadderwrite_files:20:12
blkadder   - encoding: b6420:12
blkadder     content: T3JpZ2luOiByZXBvLnFiaXMuY28KTGFiZWw6IHJlcG8ucWJpcy5jbwpDb2RlbmFtZTogeGVuaWFsCkFyY2hpdGVjdHVyZXM6IGkzODYgYW1kNjQgc291cmNlCkNvbXBvbmVudHM6IG1haW4KRGVzY3JpcHRpb246IHFiaXMgcmVwbwpTaWduV2l0aDogZGVmYXVsdCAK20:12
blkadder     path: /tmp/distributions20:12
blkadder     owner: root:root20:12
blkadder     permissions: '0644'20:12
blkadderSorry that may not have come through very well.20:12
ivveno worries20:12
ivveso i have to encode it then20:13
blkadderhttps://paste.ubuntu.com/26315006/20:13
blkadderYes base64 -w020:13
blkadderCopy/paste20:13
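For anyone generating the content value from a script instead of pasting it, the `base64 -w0` step can be reproduced in Python. A minimal sketch; the function name encode_for_write_files is made up for illustration.

```python
import base64


def encode_for_write_files(text):
    """Return text as one unwrapped base64 line, like `base64 -w0`."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")
```

The result goes in the content: field of an entry that also sets encoding: b64, as in the example above.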
ivvealright20:13
blkadderI don’t think your second example will work.20:16
blkadderhttp://cloudinit.readthedocs.io/en/latest/topics/examples.html20:17
blkadderIt’s very picky about where you put “-“ and spaces, etc.20:17
ivveaye20:23
ivvedoesn't seem to like multiple files at all :P20:26
ivvei got it to work once or twice20:26
ivvethis isn't working either20:26
ivve:(20:26
smoserivve:  i suspect that you have general yaml issue.20:29
ivveaye, i keep getting [   18.299202] cloud-init[861]: 2018-01-03 20:29:00,205 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: '#cloud_config...'20:30
smoseroh. yeah.20:30
smosercloud-init will ignore that20:30
smoserif it does not start with '#cloud-config'20:30
ivveit does20:30
smoseror is not declared as cloud-config with multipart20:31
smoserjust add '#cloud-config' as first line.20:31
ivveoh shit20:31
ivve_ not -20:31
ivve:D20:31
ivveoh man20:31
smoserah. yeah. funny.20:31
ivvei used a heat template first and just shortcutted20:31
ivveadded # and removed the :20:32
ivvejeez what an idiot i am20:32
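The header mix-up above ('#cloud_config' instead of '#cloud-config') is cheap to catch before launching anything. A tiny sketch, with a made-up function name, that only checks the one rule stated in the conversation: plain user-data is treated as cloud-config when it starts with '#cloud-config'.

```python
def looks_like_cloud_config(user_data):
    """True if plain (non-multipart) user-data starts with the '#cloud-config' header."""
    return user_data.startswith("#cloud-config")
```

It says nothing about whether the YAML after the header is valid; it only guards against the underscore-for-hyphen kind of slip.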
smoserivve: whenever someone shows me something like the hastebin there, the first thing i do is use 'yaml-dump'20:33
smoser http://paste.ubuntu.com/26315078/20:33
smoserjson output is much more clear and identifies errors to a human more clearly.20:33
smoser(you didn't mess up that way, this is just fyi)20:34
ivveaye its a good pointer, thanks20:34
ivvehowever i can't read json even if my life depended on it20:35
ivvewell it works now20:35
ivvei guess i will be encoding stuff now20:35
ivveit helps writing stupid expect scripts :P20:35
blkadderUse IntelliJ20:36
blkadderIt’s a life saver.20:36
blkadderAnd it has a vi/vim mode too which is nice.20:38
ivvestack completed, music to my ears :P20:41
ivvethanks a bunch guys20:41
blackboxswwhile it's a little late in the process, your machine with cloud-init  installed can run 'cloud-init devel schema -c your-configfile.yaml     to validate the yaml https://pastebin.ubuntu.com/26315122/20:42
smoser:)20:42
blackboxswit at least gives you a quick once over on the yaml file once you've discovered something didn't work as you expected20:43
blkadderIs that in mainline now?20:44
blackboxswyep. should be in xenial and greater20:44
blackboxswand on trunk20:44
blkadderCool.20:44
ivveaye its a good pointer as well and i did think about it but never came to that point since i just added a file and stuff stopped working20:45
blackboxswneeds a lot of work as it'll eventually support all attributes of each cloud-config module and --annotate20:45
blackboxswto report specific errors20:45
ivveruncmd: also doesn't like " or ;20:45
ivveor : for that matter20:45
blackboxswmy new year's resolution is to make "cloud-init schema" a first-class citizen with support for reporting schema errors in all 54 cloud-config modules20:47
blkadderivve: https://paste.ubuntu.com/26315137/20:47
blkadderSomeone here helped me with that… :-)20:47
blackboxswwe'll see if that holds (hopefully better than the "exercising daily" new year's resolution)20:47
blkadderYay exercise.20:48
blkadderI highly recommend: https://stronglifts.com/5x5/ Only 3 or 4 days a week. No two hours at the gym. :-)20:49
ivveah yes ofc you can type it that way to get it right20:50
ivvehowever as you are pointing out help is needed to write it and read it :P20:50
blkadderWell that gives you some syntax to pattern off of.20:50
blkadderNow that I have that I can generally manage on my own.20:50
ivvei just preferred using write_files to write the script and runcmd: - bash/expect /path/to/script20:51
ivveits just a hack for a demo, nothing proper20:51
ivvei'd use ansible for proper stuff20:51
