| smoser | rharper: i think i probably paste-binned you a git-propose-merge once ? | 12:12 |
|---|---|---|
| caribou | smoser: good morning; I'm about to push the new version of my MR in our infrastructure as it fixes an IPv6 regression | 12:36 |
| caribou | hopefully it'll be adequate for merging, otherwise I'll push a new version | 12:36 |
| caribou | btw, I took ownership of LP: 1662345 as it is a showstopper for us to deploy cloud-init on arm64 | 12:37 |
| ubot5 | Launchpad bug 1662345 in qemu (Ubuntu Xenial) "smbios parameter settings not visible in guest" [Medium,Confirmed] https://launchpad.net/bugs/1662345 | 12:37 |
| smoser | caribou: ok. i will look. | 14:08 |
| caribou | smoser: fun thing is that your suggestion fixed an issue we have with IPv6 following our recent deployment | 14:09 |
| rharper | smoser: hrm | 14:22 |
| smoser | rharper: i found it. | 14:31 |
| smoser | thanks to irc logs. | 14:31 |
| rharper | heh | 14:31 |
| smoser | git-propose-merge is http://paste.ubuntu.com/p/gMBWKvD42W/ | 14:31 |
| rharper | what was it again? oh, like sparkie's git to MP | 14:31 |
| rharper | right? | 14:31 |
| rharper | for launchpad ? | 14:31 |
| smoser | yeah, but sparkiegeeks' requires launchpadlib and auth. | 14:31 |
| rharper | which is sort of the right way to do it, but sure , we're already logged in anyhow | 14:32 |
| smoser | this is much faster, and works in a small subset of places . but places that i use. | 14:32 |
| rharper | yep | 14:32 |
| smoser | well, this just opens up the browser for you | 14:32 |
| smoser | and you hit 'merge' | 14:32 |
| smoser | his actually creates it. | 14:32 |
| smoser | there are other issues i have with his. nothing that couldn't be fixed. that is definitely the right place (or more right than hacky smoser scripts) | 14:33 |
| danMS_ | smoser: i am working on https://bugs.launchpad.net/cloud-init/+bug/1779207 | 14:40 |
| ubot5 | Ubuntu bug 1779207 in cloud-init "Failed mount of '/dev/sdb1' with Swap File cloud-init config" [Medium,Triaged] | 14:40 |
| danMS_ | Do you know if other cloud platforms have hit this issue? | 14:41 |
| smoser | danMS_: by "platform" you mean cloud platform (not linux distro) | 14:56 |
| smoser | right? | 14:56 |
| danMS_ | yes | 14:57 |
| smoser | the issue is probably present everywhere. | 14:57 |
| smoser | however it is dramatically worse on azure | 14:57 |
| smoser | because of "redeploy" with same instance id. | 14:57 |
| smoser | er... maybe not relevant to instance id. | 14:57 |
| smoser | let me try again | 14:58 |
| smoser | the problem exists any time there are "stale" entries in /etc/fstab | 14:58 |
| danMS_ | ok, on Azure they will get a new ntfs formatted ephemeral on redeploy or deallocate | 14:58 |
| smoser | on Azure, stale entries occur in a regular lifespan of an instance (with 'redeploy'... that might not be the right word) | 14:58 |
| smoser | on other platforms, the issue can occur on a snapshot -> new-instance | 14:59 |
| danMS_ | so it sounds like, when they deallocate their VM, they present the previously attached ephemeral drive | 14:59 |
| danMS_ | whereas we are not | 14:59 |
| mgerdts | blackboxsw: Trying to update DataSourceSmartOS to use EventType.BOOT to get network reconfiguration on boot and hitting a snag. The change is simple enough, I think: https://hastebin.com/vahizocidi.diff | 15:00 |
| smoser | danMS_: well no one else "deallocates" to my knowledge. | 15:00 |
| smoser | i dont know actually. | 15:01 |
| danMS_ | i have not done a deep dive, but i did not see this issue on Ubuntu vms | 15:01 |
| mgerdts | But when I install the new deb networking is not configured properly. The ENI file has dhcp, and it looks like the saved configuration is for the wrong datasource. | 15:01 |
| mgerdts | >>> pickle.load(open('obj.pkl', 'rb')) | 15:02 |
| mgerdts | <cloudinit.sources.DataSourceNone.DataSourceNone object at 0x7fc92c313c50> | 15:02 |
| mgerdts | That was in /var/lib/cloud/intance. | 15:02 |
| mgerdts | *instance | 15:02 |
| smoser | mgerdts: blackboxsw is out for a while | 15:03 |
| mgerdts | oh, ok. Any idea what could be causing the wrong datasource data to be cached? That is, does this sound familiar? | 15:03 |
| === r-daneel_ is now known as r-daneel | ||
| rharper | mgerdts: hrm; that's curious; what's the recreate scenario ? new instance boot, upgrade deb , reboot ? | 15:20 |
| mgerdts | install xenial, upgrade to bionic, reboot, install new cloud-init, reboot (ssh host key changed - bad) (did not look closely at networking config), poweroff vm, modify network config in host, restart host metadata service to be sure it picked it up, booted VM. | 15:23 |
| rharper | mgerdts: ok; I suppose it's best to pkl.load and print after each state to see where it changes | 15:24 |
| mgerdts | Since then, I did cloud-init clean -l, reboot. It put DataSourceNone on obj.pkl | 15:24 |
| rharper | ah | 15:24 |
| rharper | that sounds like unidentified change | 15:24 |
| rharper | clean wipes the previous instance info | 15:24 |
| rharper | so next boot ds-identify needs to run and pick; what does the log look like after that reboot ? | 15:25 |
| mgerdts | https://hastebin.com/axaguxusit.txt | 15:27 |
| rharper | no local data found from DataSourceSmartOS | 15:28 |
| rharper | so the SmartOS DS didn't say "yes" I have metadata/user-data | 15:28 |
| rharper | when cloud-init called .get_data() on it | 15:28 |
| rharper | which means that the fallback is DatasourceNone | 15:28 |
| rharper | so the question is , why did the SmartOS DS say it had no local metadata ? | 15:29 |
| rharper | isn't that over the serial interface ? | 15:29 |
| mgerdts | yeah, | 15:29 |
| mgerdts | I'll try some debug statements in _get_data | 15:29 |
| rharper | yeah, I don't see a return without a boolean, and the False path has logging =( | 15:31 |
| mgerdts | that's what I was thinking | 15:31 |
| rharper | and some debugging in sources/__init__.py | 15:32 |
| rharper | we now do this metadata caching; | 15:32 |
| smoser | mgerdts: you should be able to use the main too | 15:32 |
| smoser | python -m cloudinit.sources.DataSourceSmartOS | 15:32 |
| smoser | might be easier to debug that way | 15:33 |
| mgerdts | that dumps a bunch of metadata | 15:33 |
| smoser | so i think the pickled object must have a method that is getting in the way. method or attribute i guess. | 15:35 |
| rharper | IIUC, cloud-init clean was run | 15:35 |
| rharper | which wipes the object | 15:36 |
| smoser | oh. hm.. | 15:36 |
| mgerdts | yes, and verified that clean worked | 15:36 |
| smoser | so then this is essentially fresh boot ? | 15:36 |
| smoser | or as close as clean can get us ? | 15:37 |
| mgerdts | commenting the change to DataSourceSmartOS caused ENI/50-cloud-init.cfg to get static network config. Oddly, not the right network config. | 15:38 |
| mgerdts | https://hastebin.com/aviqacabew.txt | 15:38 |
| mgerdts | ip should be .223 per sdc:nics, but is .222 | 15:39 |
| rharper | "ip":"10.88.88.223","ips":["10.88.88.222/24"] | 15:39 |
| rharper | your data disagrees | 15:39 |
| mgerdts | huh. I guess I missed one of the ip's in the zonecfg. | 15:40 |
| rharper | I think we only look at ips since it was a superset ? | 15:40 |
| mgerdts | notice 222 and 223 in there | 15:40 |
| rharper | yeah | 15:40 |
| mgerdts | ok, so something in the get_data() path is unhappy with update_events = {'network': [EventType.BOOT]} in DataSourceSmartOS. | 15:41 |
| mgerdts | I'll go hunting | 15:42 |
| mgerdts | Looks like the comment in class DataSource is wrong. This seems to work: | 15:57 |
| mgerdts | update_events = {'network': [EventType.BOOT_NEW_INSTANCE, EventType.BOOT]} | 15:57 |
| mgerdts | Apparently BOOT_NEW_INSTANCE is not a subset of BOOT. | 15:58 |
| rharper | oh, yeah | 15:58 |
| rharper | that must have been aspirational | 15:58 |
| mgerdts | :) | 15:59 |
| rharper | Don't we log if we skip reading it ? | 15:59 |
| mgerdts | I think DataSourceSmartOS should do: update_events['network'].append(EventType.BOOT) | 16:00 |
| mgerdts | Doesn't look like it. | 16:01 |
| rharper | update_metadata could use some logging in the negative path I think | 16:01 |
| rharper | otherwise you get return False and nothing | 16:01 |
| mgerdts | yeah, I'll add something there along with updating the aspirational comment. | 16:02 |
| rharper | thanks for debugging that | 16:05 |
| mgerdts | Yeah, no problem. Thanks for the nudges in the right direction. | 16:08 |
| mgerdts | hopefully this addresses the changed ssh host key after upgrade + reboot. | 16:09 |
| mgerdts | would you like the fix for blackboxsw's change in a separate changeset from the smartos changes? | 16:10 |
| mgerdts | likely: https://hastebin.com/evokubiluj.diff | 16:11 |
| rharper | mgerdts: the ssh won't regen if the instance-id hasn't changed; | 16:18 |
| rharper | separate is best if that's not too much trouble | 16:19 |
| mgerdts | so maybe dpkg -i cloud-init_all.deb caused it to get clobbered. | 16:19 |
| mgerdts | sure, easy enough. | 16:19 |
| rharper | I suspect we should also add a unittest on the derived Datasource | 16:19 |
| rharper | I wonder if the actual unittests do the .append() like you did | 16:19 |
| mgerdts | sadly unittests fail now. So probably need some fixes there too. | 16:21 |
| === akik_ is now known as akik | ||
| mgerdts | I think I managed to sort this out. https://code.launchpad.net/~mgerdts/cloud-init/+git/cloud-init/+merge/350374 then https://code.launchpad.net/~mgerdts/cloud-init/+git/cloud-init/+merge/350375 | 19:18 |
| rharper | mgerdts: thanks, reviewing | 19:27 |
| smoser | mgerdts: you just want the first to land first ? | 19:27 |
| smoser | the second is a superset right ? but not separate commits. | 19:27 |
| smoser | i think.. ? | 19:27 |
| mgerdts | yeah, the second will break without the first. | 19:27 |
| mgerdts | due to list vs. set | 19:27 |
| mgerdts | I tried to set dependencies in the merge proposal, but not sure if that is actually hooked into anything. | 19:28 |
| mgerdts | awesome. python 2.6 strikes again. | 20:50 |
| smoser | https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/350381 | 21:16 |
| smoser | mgerdts: ./tools/run-container is fairly easily usable from ubuntu if you have lxc to get you a centos/6 | 21:17 |
| smoser | set notation? | 21:17 |
| mgerdts | yeah | 21:17 |
| mgerdts | should be fixed now. will CI bot automatically re-run or does it need to be nudged? | 21:22 |
| smoser | rharper: https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/350382 will fix tip-pylint | 21:22 |
| rharper | auto reruns | 21:22 |
| rharper | looking | 21:22 |
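
Note on the `update_events` thread (15:41 to 16:02 above): the sketch below is one way to picture what mgerdts found, not the actual cloud-init code. `EventType`, `DataSource`, and `DataSourceSmartOS` here are simplified stand-ins, and the string values and log messages are made up for illustration. It shows three things from the discussion: `BOOT` is not matched unless it is listed explicitly, the negative path logs instead of silently returning `False`, and the subclass builds its own set rather than appending to the inherited one (the list vs. set detail mentioned at 19:27).

```python
import logging

LOG = logging.getLogger(__name__)


class EventType:
    """Hypothetical stand-in for cloud-init's EventType constants."""
    BOOT = "boot"
    BOOT_NEW_INSTANCE = "boot-new-instance"


class DataSource:
    # Base default: only a new instance's first boot triggers a network update.
    update_events = {"network": {EventType.BOOT_NEW_INSTANCE}}

    def update_metadata(self, source_event_types):
        """Return True if any incoming event is listed in update_events."""
        incoming = set(source_event_types)
        matched = {
            scope: incoming & set(events)
            for scope, events in self.update_events.items()
            if incoming & set(events)
        }
        if not matched:
            # Negative path: say why nothing happened instead of a silent False.
            LOG.debug(
                "Skipping metadata update: events %s not in %s",
                sorted(incoming), self.update_events,
            )
            return False
        LOG.debug("Updating metadata for scopes: %s", sorted(matched))
        return True


class DataSourceSmartOS(DataSource):
    # Build a new dict and set; calling .append()/.add() on the inherited
    # value would change DataSource.update_events for every datasource.
    update_events = {
        "network": DataSource.update_events["network"] | {EventType.BOOT},
    }
```

With these stand-ins, `DataSource().update_metadata([EventType.BOOT])` takes the logged negative path, while `DataSourceSmartOS().update_metadata([EventType.BOOT])` matches, and the base class default is left untouched.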