[02:09] Hi, several months ago I reported https://bugs.launchpad.net/cloud-init/+bug/1722992 . Since then, we've been running an outdated version of cloud-init. Are there plans to fix this soon?
[02:10] Ubuntu bug 1722992 in cloud-init "On the latest centos 7 release, we are unable to resize our instances filesystems" [Medium,Confirmed]
=== r-daneel_ is now known as r-daneel
=== shardy is now known as shardy_afk
=== shardy_afk is now known as shardy
[15:09] smoser: blackboxsw: Is https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1757176 a known issue?
[15:09] Ubuntu bug 1757176 in cloud-init (Ubuntu) "TypeError: get_hostname() got an unexpected keyword argument 'metadata_only' on GCE bionic boot" [Undecided,New]
[15:10] wow Odd_Bloke, I'll look at that. thanks. That feels like a version mismatch (as we just grew the metadata_only arg)... but will glance after this meeting I'm in.
[15:11] blackboxsw: Thanks!
[15:11] Odd_Bloke: have access to cloud-init collect-logs on the CLI there? it'll dump a *log.tar in the cwd
[15:12] blackboxsw: I can't get in to the machine, unfortunately.
[15:12] no prob. will see if I can reproduce the prob
[15:12] SSH host keys weren't created, so I can't SSH in.
[15:14] Odd_Bloke: confirmed, and I can fix that today
[15:14] Thanks!
[15:14] I see what happened
[15:14] blackboxsw: Should we expect it to only happen on GCE?
[15:15] working on it right now Odd_Bloke. I'm checking other datasources now
[15:17] Odd_Bloke: CloudSigma, AliYun, OpenNebula look affected too
[15:17] but fallout is contained to those
[15:18] Ack, thanks.
[15:18] easy fix, should be up, reviewed and posted to bionic today
[15:39] smoser: rharper here's the fix for the bug above http://paste.ubuntu.com/p/RVC6FQbwTB/
[15:40] I'm trying to decide how much unit testing we should have to cover it
[15:41] surprised pylint didn't scream on that.
[15:41] well, maybe not
[15:41] to exercise this, I could have a suite of tests that initializes each fake datasource only to run the base class get_hostname method passing metadata_only to make sure it is accepted... but that seems overkill.
[15:41] different signatures on subclasses.
[15:41] yeah. i agree.
[15:42] i don't know.
[15:42] i don't want to say "no unit tests are OK".
[15:42] yeah smoser, I was surprised about the different named param, especially in the b/cloudinit/sources/DataSourceAliYun.py case which renames resolve_ip to _resolve_ip
[15:42] but agree on the pain.
[15:42] well, that is probably fallout *from* pylint
[15:42] I feel like I should fix the aliyun param name too so it matches the base class
[15:43] it complains if you don't use a variable sometimes
[15:43] i think it's fine to change the aliyun param name
[15:44] maybe I add a unit test per datasource I've changed to make sure get_hostname accepts the proper param names as defined in the base class?
[15:45] at least then the subclass tests that it's implementation provides the same 'api' as the base class
[15:45] s/it's/its/
[15:46] though if the parent class implementation moves on, these unit tests would still incorrectly succeed
[15:46] I've got an idea.
=== pickle is now known as dhill_
[16:01] blackboxsw: did we have a general solution for json.dumps({dictionary that might have some binary keys})
[16:03] hrm, we have process_base64_metadata in cloudinit.sources.__init__
[16:03] and json_dumps uses json_serialize_default in cloudinit.util
[16:04] which prefixes base64 content with ci-b64
[16:04] nice. thanks
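
As an aside on the [16:01]-[16:04] exchange: the log names cloudinit.util's json_dumps/json_serialize_default and the "ci-b64" prefix. The snippet below is only a minimal standalone sketch of that idea, not cloud-init's actual code; note that json.dumps consults a default= hook for values only, so genuinely binary dict keys would still need converting before the dump.

    import base64
    import json


    def serialize_default(obj):
        # Fallback for values json can't handle natively: base64-encode bytes
        # and tag them, mirroring the "ci-b64" prefix idea mentioned above.
        if isinstance(obj, bytes):
            return "ci-b64:" + base64.b64encode(obj).decode("ascii")
        raise TypeError("Cannot serialize %r" % (obj,))


    metadata = {"hostname": "node-1", "user-data": b"\x1f\x8b binary blob"}
    print(json.dumps(metadata, default=serialize_default, indent=1,
                     sort_keys=True))
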
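On the per-datasource signature test floated at [15:44]-[15:46] (and the "inspect, nifty" remark later in the log): here is a minimal sketch of that idea using inspect.signature, with hypothetical toy classes standing in for the real datasources. inspect.signature is Python 3 only, which may be related to the log's later note that the unit-test approach initially failed on py2.6/py2.7.

    import inspect


    class DataSource(object):
        """Toy stand-in for cloudinit.sources.DataSource (illustrative only)."""

        def get_hostname(self, fqdn=False, resolve_ip=False, metadata_only=False):
            return "base-hostname"


    class DataSourceAliYun(DataSource):
        """Toy subclass; a renamed keyword (e.g. _resolve_ip) would break
        callers that pass the base class's keyword names."""

        def get_hostname(self, fqdn=False, resolve_ip=False, metadata_only=False):
            return "aliyun-hostname"


    def assert_same_get_hostname_api(subclass):
        # Every parameter advertised by the base class must be accepted by the
        # subclass, so base-class keywords like metadata_only never TypeError.
        base = set(inspect.signature(DataSource.get_hostname).parameters)
        sub = set(inspect.signature(subclass.get_hostname).parameters)
        missing = base - sub
        assert not missing, "%s.get_hostname is missing params: %s" % (
            subclass.__name__, sorted(missing))


    assert_same_get_hostname_api(DataSourceAliYun)
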
[17:35] blackboxsw: http://paste.ubuntu.com/p/8Hxy67h5X2/
[17:36] smoser: interesting, will fix now.
[17:37] smoser: care if I land that branch in the process of fixing review-mps?
[17:38] blackboxsw: fine with 'em
[17:38] my regex was bad
[17:39] didn't allow for digits in the lp_user
[17:39] strangely we hadn't hit that type of user until now
[17:41] ok, tox is running against it prior to merge. should land in a couple mins
[17:43] review-mps pushed
[17:43] OpenNebula should land shortly
[17:45] ...done
[18:25] smoser: rharper for review and landing today to unblock GCE bionic
[18:25] https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/341757
[18:26] looking
[18:26] thanks
[18:27] inspect, nifty
[18:29] blackboxsw: fyi, c-i says go away
[18:29] i suspect due to 'inspect' :)
[18:30] no. py27
[18:32] * blackboxsw is testing on py2.6 now too and no go on that unit test approach
[18:32] hrm
[18:34] ahh, got it smoser rharper, 30 seconds
[18:34] my last change with the lint fixes busted it
[18:35] will push a fix
[18:38] smoser: rharper fd165e92b0e22245d41e7f55a5f89de4b6b522a6 pushed
[18:38] just tested on py2.6
[18:38] k
[18:38] as well
[18:39] giving it a GCE deployment to confirm success
[19:00] validated on GCE by upgrading xenial to my branch and cloud-init clean --logs --reboot
[19:00] * blackboxsw validates failure case on GCE now using cloud-init-tip without the get_hostname fix
[19:01] sure enough, here's the failure on GCE without my changeset https://pastebin.ubuntu.com/p/NpMNDVMhz6/
[19:03] ok gotta run. I still need to sort that lint error on my branch, looks like
[19:03] back in 30
[19:44] back
[19:49] @blackboxsw, any update on the UrlError handling discussion?
[20:00] dojordan: saw your changes at least to readurl yesterday. hadn't gotten through a follow-up discussion on UrlError handling.
[20:00] let's see if we can hash it out here.
[20:00] so UrlError can be raised by readurl in two cases:
[20:00] Can someone remind me what the new cloud-init SRU process/cadence is?
[20:01] 1. IMDS service is currently unavailable (during a service upgrade or something). This would raise a UrlError with the string "[Errno 111] Connection refused"
[20:02] 2. No network is configured on the instance, with the e.cause string containing '[Errno 101] Network is unreachable'
[20:02] right now cloud-init's UrlError doesn't surface errno on the exception raised
[20:03] should it handle that programmatically when raising a UrlError? I'm not certain we really need to do that or that Azure should really care that much about Errno 101
[20:04] Odd_Bloke: cadence is currently at-will, when we've decided there is enough content to warrant an SRU. Generally I'd like to keep it minimally at quarterly (or at least shortly after we cut a cloud-init upstream release). The higher the frequency the better
[20:05] hmm, where does timeout fit into the picture
[20:05] because in reality that is what is used in the happy path
[20:05] blackboxsw: let's just simplify that... "after we release"?
[20:05] Odd_Bloke: for the short term, since we have a cut of cloud-init 18.2 next week, I'd like to start an SRU of that to xenial within a week after the release.
[20:06] blackboxsw: "the release" == the 18.2 release, or bionic?
[20:06] +1 dpb1 ... /me is rambling a bit too much on that.
so, post 18.2 cut next week, we'll kick off an SRU
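
For the two UrlError cases dojordan lists above (Errno 111 connection refused when IMDS is down, vs Errno 101 network unreachable when nothing is configured), here is a hedged sketch of the kind of classification being discussed. classify_url_error is hypothetical, not a cloud-init API, and it assumes the wrapped exception is reachable via a .cause attribute, as mentioned in the log.

    import errno


    def classify_url_error(exc):
        # Hypothetical helper: the log says UrlError does not surface errno
        # directly, so inspect the wrapped cause and fall back to matching on
        # the message string.
        cause = getattr(exc, "cause", None) or exc
        err = getattr(cause, "errno", None)
        if err == errno.ECONNREFUSED or "Errno 111" in str(cause):
            return "imds-unavailable"     # service refusing connections; retry later
        if err == errno.ENETUNREACH or "Errno 101" in str(cause):
            return "network-unreachable"  # no usable NIC config; (re-)dhcp first
        return "unknown"
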
[20:07] OK, cool; I'm interested in https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1752391 ending up in xenial, and it sounds like we're on the path to that being true pretty soon.
[20:07] Ubuntu bug 1752391 in cloud-init "cloud-init does not recognize initramfs provided network config in all cases" [Medium,Fix committed]
[20:07] Odd_Bloke: SRU to xenial after cloud-init 18.2 is cut next week. I'd estimate 2 weeks until SRU publish
[20:08] Ack, thanks. :)
[20:10] dojordan: I think the timeout you provide to readurl is the cap on the initial readurl request before it raises a TimeoutException; it's passed into requests directly
[20:11] so readurl will sit 1 second in your use case before raising that timeout exception
[20:11] if IMDS is not functional for some reason
[20:11] and then does the timeout exception get wrapped in a UrlError? (sorry, it's been a little while since I wrote this code)
[20:12] I believe it does
[20:12] in testing, that's what it looks like happened on my end. I set up a SimpleHTTPService and killed it
[20:13] I don't think cloud-init needs to do too much special casing here - but maybe I'm wrong
[20:13] we tried to communicate, and didn't get a response
[20:14] vs got a response we don't like (5xx)
[20:15] right, the only difference in the first case was that you are re-EphemeralDHCPv4'ing when you don't really need to.
[20:15] but I think that's probably an unlikely corner case
[20:15] except with a timeout exception we do need to re-dhcp
[20:16] which was the first case
[20:16] ahh right, much concern about nothing then
[20:17] but i agree, if we got a 101 we wouldn't necessarily need to re-dhcp, but it doesn't really harm anything IMO
[20:18] dojordan: yeah, and generally I can't imagine what case would actually get you down the path of Errno 101, because that means you got a temporary dhcp lease and something managed to get in there at that moment and ifdown your interface
[20:18] .... not going to happen.
[20:20] right, since systemd-networkd is not up yet
[20:24] ok I'm good with this approach. will mark approved after I get one test run on EC2
[20:24] thanks for your patience
[20:31] sounds good, thanks!
[21:00] smoser: rharper finally sorted my CI woes https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/341757
[21:07] dojordan: in other threads about using netlink, there is some mention of reducing as much downtime as possible; if we can avoid DHCP'ing when we don't need to, that will keep the waiting overhead to a minimum, no?
[21:08] blackboxsw: rharper https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/341774
[21:08] i have to run now. blackboxsw, i can come back in ~4 hours and upload anything that you've got landed.
[21:09] +1 smoser, thanks, will have something up
[21:09] just approved your branch there.
[21:10] just approved your Hetzner enable branch too smoser https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/341679
[22:43] smoser: Odd_Bloke put up https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/341778 for the bionic GCE fix
[22:43] that's to publish our fix which is in tip
[22:43] if it lands tonight it should be in bionic cloud images tomorrow morning
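
To illustrate the [20:15]-[20:17] point that a timeout simply leads to re-establishing an ephemeral DHCP lease and retrying the metadata service, here is a rough sketch of that pattern. EphemeralDHCPv4, readurl and UrlError are cloud-init names mentioned in the log, but the endpoint, headers and overall flow below are illustrative assumptions, not the Azure datasource's actual code.

    from cloudinit.net.dhcp import EphemeralDHCPv4
    from cloudinit.url_helper import UrlError, readurl

    # Example endpoint and header only; the real datasource's values differ.
    IMDS_URL = "http://169.254.169.254/metadata/instance"


    def fetch_imds(max_tries=3):
        for _ in range(max_tries):
            try:
                # Bring up a throwaway DHCP lease just long enough to query IMDS.
                with EphemeralDHCPv4():
                    return readurl(IMDS_URL, timeout=1,
                                   headers={"Metadata": "true"}).contents
            except UrlError:
                # Timeout/refused/unreachable all land here; the lease is torn
                # down on context exit, so looping simply re-dhcp's and retries,
                # which the log notes is harmless even when not strictly needed.
                continue
        return None
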