[18:01] <rharper> falcojr: blackboxsw: for 1910552: I suspect the issue is that the MAAS seed is network-based, so we do cache the ds so that we don't need to crawl the datasource on subsequent boots, *but* MAAS configures a reporting endpoint, so each of the cloud-init messages we send to the reporter attempts a POST to an apparently dead MAAS. I believe the reporter configuration uses url_helper to post, the timeout is likely high, and there are 10s if not 100s of messages we post during cloud-init stages.
[18:05] <falcojr> rharper: yeah, I figured that log spam was reporting endpoints, but that shouldn't be blocking anything, right?
[18:06] <rharper> how long does each message take?  it's not async in cloud-init
[18:06] <rharper> so, each post has to fail before cloud-init can run the next module, etc 
[18:07] <rharper> for each close of the reporting context manager, it builds its POST and pushes to the endpoint ... that can hang ...
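(Editor's note: a minimal sketch of the synchronous pattern rharper describes, showing why a dead endpoint serializes the whole boot: the POST on context-manager exit must fail or time out before the next module can run. Class and field names here are illustrative, not cloud-init's actual API.)

```python
import json
import urllib.request
import urllib.error


class ReportEventStack:
    """Hypothetical reporting context manager: POSTs an event on exit."""

    def __init__(self, name, endpoint, timeout=10):
        self.name = name
        self.endpoint = endpoint
        self.timeout = timeout

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        payload = json.dumps({"event": self.name}).encode()
        req = urllib.request.Request(
            self.endpoint, data=payload,
            headers={"Content-Type": "application/json"})
        try:
            # Blocking call: with timeout=None this could hang indefinitely
            # if the endpoint accepts the connection but never responds.
            urllib.request.urlopen(req, timeout=self.timeout)
        except (urllib.error.URLError, OSError):
            pass  # a reporting failure should not abort the boot
        return False
```

Each module wrapped in such a context manager pays the full connect/fail cost of its POST before the next module starts, which matches the serialized delays falcojr observed.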
[18:09] <rharper> the Azure reporter uses threads to submit events async, which should be generalized, but we never got time to refactor that into the general case.
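(Editor's note: a sketch of the threaded approach rharper says the Azure reporter takes, generalized: events go onto a queue and a background worker performs the POSTs, so module execution never blocks on a slow or dead endpoint. This is illustrative, not the actual cloud-init implementation.)

```python
import queue
import threading


class AsyncReporter:
    """Queue events and POST them from a background worker thread."""

    def __init__(self, post_fn):
        self.post_fn = post_fn  # callable that performs the actual POST
        self.queue = queue.Queue()
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def _drain(self):
        while True:
            event = self.queue.get()
            if event is None:        # sentinel: shut down the worker
                break
            try:
                self.post_fn(event)  # failures stay inside the worker
            except Exception:
                pass
            finally:
                self.queue.task_done()

    def publish(self, event):
        self.queue.put(event)        # returns immediately, never blocks boot

    def flush(self):
        self.queue.put(None)
        self.worker.join()
```

With this shape, a dead endpoint costs only the worker thread's time; `publish()` stays O(1) for the caller.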
[18:14] <rharper> falcojr: yeah, so in the WebHookHandler no timeout value is set, the OauthHelper (which MAAS uses) does not set one either, and the upstream requests module docs say that if you don't set a timeout it will just hang. Which *sounds* like what they're seeing ... one could set the timeout in WebHookHandler to something other than None to have the POST time out after some number of seconds.
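(Editor's note: a sketch of the suggested fix under the assumptions above: requests treats `timeout=None`, its default, as "wait forever", so giving the webhook POST a finite timeout makes a dead MAAS fail fast. The function name and defaults here are illustrative, not the actual WebHookHandler code.)

```python
import requests


def post_event(endpoint, payload, timeout=(10, 30)):
    """POST a reporting event with a bounded (connect, read) timeout.

    With requests' default timeout of None, a peer that accepts the
    connection but never answers would block this call indefinitely.
    """
    try:
        return requests.post(endpoint, json=payload, timeout=timeout)
    except requests.RequestException:
        return None  # reporting must never break the boot
```

Passing a `(connect, read)` tuple bounds both phases of the request separately, which is usually preferable to a single number when the concern is an unreachable endpoint.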
[19:14] <falcojr> ahhh, thanks for that context...that's helpful
[19:14] <falcojr> I couldn't find any specific call that hangs indefinitely...but there were a lot of 30-ish second calls being made between all of the various modules
[19:15] <falcojr> I was suspecting that same call...so it makes sense to try putting a timeout on it