blackboxsw | hrm... ok seeing we get through cloud-init modules:config stages which means the datasource succeeded | 00:00 |
---|---|---|
blackboxsw | hrm have to step away a bit. sorry for the moment dojordan_ will check it out | 00:01 |
blackboxsw | I'll have something on this later | 00:01 |
blackboxsw | sorry I should have tested this again this morning | 00:01 |
dojordan_ | no worries | 00:01 |
dojordan_ | one thing would be great if you could do, would be to change logging level so we can see more info on the serial port | 00:02 |
* blackboxsw clicks enable boot diagnostics logging in the UI and clicks reboot on this instance | 00:07 | |
blackboxsw | and bailing for dinner | 00:19 |
dojordan_ | same problem on artful | 00:37 |
blackboxsw | smoser: just pushed a merge proposal into bionic for today. I need a bit more time to triage what gives on Azure :/ | 01:20 |
blackboxsw | https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/336513 | 01:20 |
blackboxsw | ok I'm out for the night. gotta do the bedtime routine with the kiddos. more on azure first thing in my morning | 01:20 |
dojordan_ | thanks for all the help, sounds good | 01:21 |
smoser | blackboxsw: fudge | 01:47 |
smoser | https://jenkins.ubuntu.com/server/job/cloud-init-ci/725/console | 01:47 |
smoser | :-( | 01:47 |
* smoser fail | 01:47 | |
smoser | i'm fixing and pushing. http://paste.ubuntu.com/26448094/ | 01:58 |
smoser | rharper, powersj blackboxsw if you're still around, to disagree or +1 that. | 01:58 |
rharper | looking | 01:58 |
smoser | running tox + centos build && git push upstream HEAD | 01:58 |
rharper | looks sane | 01:59 |
smoser | thanks | 01:59 |
rharper | btw, it's a royal pain to launch multi-subnet/ip instances via the console; also it would be of great help if the ec2 docs would tell you what the format of the instance data is, for example local-ipv4s is a list of some sort of ipv4 addresses; but is it comma separated, newline, space? I can't find any examples with my google-fu so trying to get an instance up to check | 02:17 |
rharper | *finally* | 02:46 |
rharper | vpc is timeconsuming | 02:46 |
blackboxsw | hab | 03:00 |
blackboxsw | bah | 03:00 |
blackboxsw | resubmitting the merge proposal | 03:04 |
blackboxsw | with a new snapshot from master | 03:07 |
rharper | ok, have crude ec2 network metadata to v1 config | 03:09 |
* rharper calls it a night | 03:09 | |
blackboxsw | nice | 03:14 |
blackboxsw | ok new MP against bionic up. thanks for the fix smoser | 03:18 |
blackboxsw | https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/336514 | 03:19 |
=== shardy is now known as shardy_afk | ||
=== shardy_afk is now known as shardy | ||
=== Guest28399 is now known as mgagne | ||
smoser | blackboxsw: on azure... | 15:33 |
smoser | you there? | 15:33 |
smoser | dojordan_: west coasters. | 15:34 |
smoser | (they stay up late.. https://finance.yahoo.com/news/exclusive-fitbits-6-billion-nights-sleep-data-reveals-us-110058417.html ) | 15:36 |
blackboxsw | here | 16:03 |
blackboxsw | :) | 16:03 |
blackboxsw | ok azure triage time | 16:19 |
=== hrybacki is now known as hrybacki_mtg | ||
dojordan_ | here now @blackboxsw | 17:05 |
blackboxsw | good to know you come to work at a reasonable time like the rest of us :) I'm walking through failure path again, as smoser surmised it's likely the cdrom disappearing on us before I cloud-init clean --reboot.... so I'm adding logs etc now and going through that to confirm | 17:06 |
* blackboxsw wasn't sure yet why this seemed to work with tip of master though too. | 17:07 | |
dojordan_ | hmm, interesting idea. doesn't waagent copy the ovf_env.xml off of the cd ? | 17:07 |
dojordan_ | FWIW we remove the CD as soon as we get a provisioning message | 17:09 |
dojordan_ | question, shouldn't we be keeping around the ovf-file before rebooting? | 17:44 |
dojordan_ | i think clean --reboot deletes it | 17:44 |
dojordan_ | err, nvm, it lives in /var/lib/waagent/ovf-env.xml | 17:45 |
dojordan_ | @blackboxsw, same problem on xenial | 17:58 |
smoser | dojordan_: i had asked blackboxsw to edit /etc/cloud/cloud.cfg.d/05_logging.cfg and change the console logging from WARN to DEBUG | 18:00 |
smoser | and then collect console log | 18:01 |
smoser | are you able to easily do that too ? | 18:01 |
dojordan_ | yeah | 18:01 |
smoser | dojordan_: for boot diagnostics | 18:02 |
smoser | which storage account type do you need ? | 18:02 |
* blackboxsw is already mid reboot/test on my bionic instance with debug console logs enabled, an azure storage account created and | 18:03 | |
blackboxsw | boot logs enabled | 18:03 |
blackboxsw | ubuntu@40.70.46.88 | 18:04 |
smoser | ssh-import-id smoser ? | 18:04 |
blackboxsw | checking boot logs now to make sure cloud-init reported correctly on this last clean boot | 18:04 |
blackboxsw | already done for dojordan and smoser | 18:04 |
blackboxsw | i'm in byobu term | 18:04 |
smoser | permission denied | 18:05 |
smoser | in | 18:05 |
blackboxsw | added again | 18:05 |
blackboxsw | must've typod | 18:05 |
dojordan_ | denied | 18:06 |
smoser | try again | 18:06 |
dojordan_ | cool | 18:07 |
blackboxsw | ok let's see here..... checking azure cli now to make sure I could see boot logs | 18:08 |
blackboxsw | before rebooting | 18:08 |
dojordan_ | worst case i can always get them :) | 18:08 |
blackboxsw | az vm boot-diagnostics get-boot-log --ids /subscriptions/12aad61c-6de4-4e53-a6c6-5aff52a83777/resourceGroups/SRUGRP10/providers/Microsoft.Compute/virtualMachines/my-b1 | 18:09 |
blackboxsw | 'ascii' codec can't decode byte 0xe2 in position 40610: ordinal not in range(128) | 18:09 |
blackboxsw | hrm oops az cli | 18:09 |
blackboxsw | checking UI | 18:09 |
blackboxsw | serial log in UI is working for me | 18:09 |
blackboxsw | ok | 18:09 |
blackboxsw | old log | 18:10 |
blackboxsw | http:pastebin.ubuntu.com/26453160 | 18:10 |
blackboxsw | http://pastebin.ubuntu.com/26453160 | 18:10 |
smoser | blackboxsw: 'ordinal not in range' | 18:11 |
smoser | ? | 18:11 |
blackboxsw | yeah az cli couldn't decode the boot logs on the machine | 18:11 |
smoser | is that because az is trying to .decode() the console log ? | 18:11 |
blackboxsw | yeah | 18:11 |
smoser | :-( | 18:11 |
blackboxsw | so something to file against azure cli when I dig into it :/ | 18:11 |
blackboxsw | but UI works | 18:11 |
dojordan_ | ugh, ill make a bug report | 18:11 |
blackboxsw | thanks dojordan_ | 18:12 |
blackboxsw | lemme get az cli version | 18:12 |
dojordan_ | can you pastebin the ui logs? | 18:12 |
blackboxsw | http://paste.ubuntu.com/26453179/ | 18:12 |
smoser | dojordan_: you're in good company. this week, we've hit. | 18:12 |
smoser | https://github.com/lxc/pylxd/issues/268 | 18:12 |
smoser | and | 18:12 |
smoser | https://github.com/boto/botocore/issues/1351 | 18:12 |
blackboxsw | dojordan_: ui logs is http://pastebin.ubuntu.com/26453160 | 18:12 |
dojordan_ | second boot? | 18:13 |
smoser | blackboxsw: yeah, go for it. | 18:13 |
blackboxsw | dojordan_: smoser 2nd rebooting now | 18:13 |
blackboxsw | ok | 18:13 |
=== hrybacki_mtg is now known as hrybacki | ||
blackboxsw | hrm any way to show in cli what power state is on node | 18:16 |
dojordan_ | let me see | 18:16 |
blackboxsw | dojordan_: ok it's looping | 18:18 |
blackboxsw | just got logs smoser dojordan_ | 18:18 |
blackboxsw | looping on reprovisiondata | 18:18 |
blackboxsw | copying now | 18:18 |
blackboxsw | new boot log http://pastebin.ubuntu.com/26453225 | 18:19 |
blackboxsw | I'm looking now | 18:20 |
blackboxsw | yeah it's looping on 404 from reprovisioning | 18:20 |
=== shardy is now known as shardy_afk | ||
blackboxsw | so, something triggered that poll which shouldn't have | 18:21 |
dojordan_ | DataSourceAzure.py[INFO]: Creating a marker file to poll imds | 18:21 |
dojordan_ | yup | 18:21 |
smoser | well, that part seems like it is functioning as designed. | 18:21 |
blackboxsw | hahah | 18:21 |
blackboxsw | :) | 18:21 |
smoser | dojordan_: logging seems extremely verbose if you're expecting this to sit up for 24 hours before use | 18:21 |
dojordan_ | but we won't log debug by default right? | 18:22 |
smoser | debug does go to log file, but not to console | 18:22 |
smoser | looks like < 1k/second. but that'd add up. | 18:22 |
smoser | but thats not the issue. | 18:23 |
smoser | why did we get into the imds | 18:23 |
blackboxsw | so cfg.PreprovisionedVm == True | 18:23 |
blackboxsw | something in _extract_preprovisioned_vm_setting returns True | 18:24 |
blackboxsw | we need to look over that ovf file again | 18:24 |
blackboxsw | I think | 18:24 |
dojordan_ | my guess is the refactoring broke something. the weird thing is it should have been covered by ut | 18:24 |
blackboxsw | yeah I thought so too | 18:24 |
dojordan_ | im re reading my code now and no idea... | 18:25 |
blackboxsw | I'm starting up a 2nd vm now and will run _extract_... on the doc | 18:26 |
dojordan_ | smart | 18:27 |
dojordan_ | we didnt see any of those debugs though... | 18:28 |
smoser | blackboxsw: smoser@52.151.23.91 | 18:29 |
smoser | if you want | 18:29 |
smoser | take it | 18:29 |
blackboxsw | 40.79.65.171 | 18:30 |
blackboxsw | as well | 18:30 |
blackboxsw | 40.79.65.161 rather | 18:30 |
blackboxsw | <ns1:PreprovisionedVm>false</ns1:PreprovisionedVm> | 18:31 |
blackboxsw | ok,,,, so that should've been interpreted as false | 18:31 |
dojordan_ | oh no | 18:32 |
dojordan_ | bool("false") is true | 18:32 |
blackboxsw | ahhahhha | 18:32 |
blackboxsw | ohhh right | 18:32 |
blackboxsw | didn't translate from string type | 18:32 |
dojordan_ | ill push a fix | 18:33 |
dojordan_ | thanks for all the help | 18:33 |
blackboxsw | cheers. gotta go pickup a kiddo from school | 18:33 |
blackboxsw | see ya in a bit | 18:33 |
dojordan_ | @smoser, @blackboxsw, I pushed a fix, and added another UT that would have caught it. Testing now in azure. | 19:37 |
dojordan_ | my thoughts on removing the verbose logging: maybe just log a byte every request or something. Also, do we have a log level that goes to the console by default? | 19:47 |
blackboxsw | dojordan_: warning level is configured to the console by default. | 19:58 |
* blackboxsw wonders about us adding a param in a subsequent branch to url_helper.readurl(quiet=(False|True)); then callers handling retries outside of that could turn down the volume of logs | 20:04 | |
blackboxsw | testing your latest branch now too | 20:10 |
dojordan_ | same here, *fingers crossed* | 20:18 |
dojordan_ | i got permission denied using password auth but at least the ECDSA host key changed on me | 20:21 |
smoser | ssh auth shouldnt be affected. if you get there, it really should let you in | 20:23 |
smoser | nothing would have deleted your keys | 20:23 |
dojordan_ | password would have been redacted in the ovf-env.xml, not sure what that changes | 20:23 |
blackboxsw | I know it's a nit, but changing the log message Start polling IMDS from debug -> warning feels like it really shouldn't be a warning level log | 20:24 |
blackboxsw | maybe I'm wrong (I know you are probably just trying to get it to show up in default console log configuration) | 20:25 |
dojordan_ | right, im open to other options, but it would be nice to get to the console | 20:26 |
smoser | i can understand wanting to see something on the console (from the azure platform perspective) | 20:27 |
smoser | but it kind of stinks from the users' perspective. | 20:27 |
smoser | they have a right to expect WARN in the logs to mean "something went wrong" | 20:27 |
smoser | but here nothing in their control actually went wrong. | 20:28 |
dojordan_ | true... | 20:28 |
dojordan_ | im fine reverting it now that we found this bug | 20:28 |
smoser | yeah. i think that is best for now. | 20:28 |
blackboxsw | dojordan_: one more thing while you are in there. | 20:28 |
smoser | i have said many times i think python logging lacks level granularity | 20:28 |
smoser | and cloud-init usage of what *is* there is bad. | 20:28 |
blackboxsw | there's a util.translate_bool that might be of use in checking that truthy value from ovf file | 20:29 |
smoser | it seems to me that this should qualify as INFO level | 20:29 |
smoser | and at some point maybe a concerted effort could get INFO to the console | 20:29 |
blackboxsw | btw smoser and dojordan_ success ubuntu@40.79.65.161 | 20:33 |
dojordan_ | sweet! | 20:33 |
dojordan_ | what distro? | 20:33 |
blackboxsw | dojordan_: bionic, running through xenial now | 20:34 |
dojordan_ | Y | 20:34 |
dojordan_ | cool, I got back in on xenial too | 20:37 |
dojordan_ | just pushed those two changes (correct log level, and util.translate_bool) | 20:38 |
blackboxsw | great, xenial looks good for me too. | 20:47 |
smoser | \o/ | 20:48 |
blackboxsw | ok, I'll land this when ci completes its vote dojordan_ | 20:50 |
dojordan_ | thanks! | 20:51 |
blackboxsw | thanks for "dotting the i's and crossing the t's" | 20:57 |
robjo | smoser: As touched on in previous discussion platform.linux_distribution() is deprecated in upstream Python and as of version 3.7 is expected to go away, in 3.6 on SUSE it returns an empty tuple, thus useless | 20:59 |
robjo | I take it in Ubuntu you guys patched Python | 20:59 |
robjo | anyway, I think we should make a decision if we expand the dependencies to python-distro or if cloud-init gets its own function to determine the distribution | 21:00 |
robjo | thoughts? | 21:00 |
robjo | https://github.com/nir0s/distro#distro---a-linux-os-platform-information-api | 21:01 |
smoser | hm. | 21:01 |
smoser | i think i'd just want to build my own. | 21:02 |
smoser | s/my/own/ | 21:02 |
smoser | s/my/our/ | 21:02 |
smoser | i dont want an external dependency for something as seemingly simple as "figure out if you are on ubuntu, suse, redhat, ...". | 21:03 |
smoser | :-( | 21:03 |
robjo | fair enough, something like this? | 21:03 |
robjo | if os.path.exists('/etc/os-release'): | 21:04 |
robjo | use it and determine the distro | 21:04 |
robjo | else: | 21:04 |
robjo | try: | 21:04 |
robjo | platform.linux_distribution() | 21:05 |
robjo | except: | 21:05 |
robjo | ...... | 21:05 |
robjo | Sound reasonable? | 21:05 |
smoser | yeah i guess. i'd also like to let the packager easily just set it | 21:05 |
robjo | well that's the other option, just punt and make the person running setup set the distro then we save the code altogether | 21:08 |
smoser | well, i think i'd like it to do the right thing, but if the logic that is there doesn't "do the right thing", then let the packager set it. | 21:12 |
smoser | i want trunk to "just work" though | 21:13 |
robjo | OK, I'll see what I can come up with | 21:13 |
robjo | https://bugs.launchpad.net/cloud-init/+bug/1745235 | 21:24 |
ubot5 | Ubuntu bug 1745235 in cloud-init "distribution detection" [Undecided,New] | 21:24 |
dojordan_ | @smoser and @blackboxsw, thank you for all your help landing this PR. When will the nightly azure images contain these changes? | 23:15 |
blackboxsw | heh, was going to ping you that I just landed it :) | 23:15 |
blackboxsw | should be in bionic tomorrow | 23:15 |
blackboxsw | I'm thinking we will probably SRU in February.... so xenial, artful would have it our next SRU | 23:16 |
dojordan_ | bionic will work for me :) | 23:21 |
blackboxsw | dojordan_: oopsie, sorry I need to propose for merging into bionic | 23:22 |
blackboxsw | I'll put up another merge proposal tonight. we can probably land that tomorrow and it'll be published friday | 23:22 |
blackboxsw | just landing robjo's btrfs branch too | 23:22 |
dojordan_ | gotcha. Is the bionic branch just a delayed mirror of master? | 23:22 |
blackboxsw | dojordan_: yeah the way we structure bionic publishing is just to mirror all content from master tip | 23:24 |
blackboxsw | for SRUs into xenial, zesty, artful releases we take a snapshot of tip as well and if some significant behavior change requires attention to retain backward compatibility we carry a small patch to retain behavior in xenial. | 23:26 |
blackboxsw | since bionic is not officially in feature freeze until March 2018, any change in behavior of cloud-init is given the go-ahead, so snapshots are easy https://wiki.ubuntu.com/BionicBeaver/ReleaseSchedule | 23:28 |
blackboxsw | SRUs into ubuntu series that are 'stable/released' require a bit more work on our end with testing/verification | 23:29 |
blackboxsw | https://wiki.ubuntu.com/CloudinitUpdates for our SRU process (TMI I know) | 23:29 |
dojordan_ | got it, this explains a lot. (not TMI :) ) | 23:30 |
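
The `bool("false")` problem found at 18:32 is easy to reproduce outside cloud-init. The sketch below is only an illustration: `parse_bool` is a hypothetical helper written for this note, not cloud-init's `util.translate_bool`, and the set of accepted strings is an assumption.

```python
def parse_bool(value, default=False):
    """Hypothetical helper: turn XML/YAML-style text into a real bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes", "on"):
        return True
    if text in ("false", "0", "no", "off", ""):
        return False
    # Unrecognised text falls back to the caller's default.
    return default


# Text content of <ns1:PreprovisionedVm> as read from ovf-env.xml.
raw = "false"

# bool() only checks for emptiness, so any non-empty string is truthy.
print(bool(raw))
# Comparing against known spellings gives the intended answer.
print(parse_bool(raw))
```

A unit test that feeds the literal string `"false"` through the parsing path, as dojordan_ describes adding at 19:37, is what catches this class of bug.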
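
robjo's pseudocode from 21:04 (prefer `/etc/os-release`, fall back to `platform.linux_distribution()`) combined with smoser's request that a packager can simply set the value might look roughly like the sketch below. The function names, the `override` parameter and the `"unknown"` fallback are assumptions made for illustration, not what cloud-init ended up shipping.

```python
import os
import platform
import shlex


def parse_os_release(text):
    """Parse KEY=value lines from os-release content into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex strips the optional shell-style quoting around values.
        parts = shlex.split(value)
        info[key] = parts[0] if parts else ""
    return info


def detect_distro(path="/etc/os-release", override=None):
    """Return a distro id; 'override' lets a packager set it explicitly."""
    if override:
        return override
    if os.path.exists(path):
        with open(path, encoding="utf-8") as stream:
            distro_id = parse_os_release(stream.read()).get("ID")
        if distro_id:
            return distro_id
    # platform.linux_distribution() is absent on newer Pythons,
    # so look it up defensively instead of calling it directly.
    legacy = getattr(platform, "linux_distribution", None)
    if legacy is not None:
        try:
            name = legacy()[0]
        except Exception:
            name = ""
        if name:
            return name.lower()
    return "unknown"


print(detect_distro())
```

Keeping the parser separate from the file access makes it testable with literal strings, without needing a particular distro on the test machine.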