/srv/irclogs.ubuntu.com/2018/07/20/#cloud-init.txt

smoserrharper: i think i probably paste-binned you a git-propose-merge once ?12:12
caribousmoser: good morning; I'm about to push the new version of my MR in our infrastructure as it fixes an IPv6 regression12:36
caribouhopefully it'll be adequate for merging, otherwise I'll push a new version12:36
cariboubtw, I took ownership of LP: 1662345 as it is a showstopper for us to deploy cloud-init on arm64 12:37
ubot5Launchpad bug 1662345 in qemu (Ubuntu Xenial) "smbios parameter settings not visible in guest" [Medium,Confirmed] https://launchpad.net/bugs/1662345 12:37
smosercaribou: ok. i will look.14:08
caribousmoser: fun thing is that your suggestion fixed an issue we have with IPv6 following our recent deployment14:09
rharpersmoser: hrm14:22
smoserrharper: i found it.14:31
smoserthanks to irc logs.14:31
rharperheh14:31
smosergit-propose-merge is http://paste.ubuntu.com/p/gMBWKvD42W/14:31
rharperwhat was it again?  oh, like sparkie's git to MP14:31
rharperright?14:31
rharperfor launchpad ?14:31
smoseryeah, but sparkiegeeks' requires launchpadlib and auth.14:31
rharperwhich is sort of the right way to do it, but sure, we're already logged in anyhow14:32
smoserthis is much faster, and works in a small subset of places . but places that i use.14:32
rharperyep14:32
smoserwell, this just opens up the browser for you14:32
smoserand you hit 'merge'14:32
smoserhis actually creates it.14:32
smoserthere are other issues i have with his. nothing that couldn't be fixed. that is definitely the right place (or more right than hacky smoser scripts)14:33
danMS_smoser: i am working on https://bugs.launchpad.net/cloud-init/+bug/1779207 14:40
ubot5Ubuntu bug 1779207 in cloud-init "Failed mount of '/dev/sdb1' with Swap File cloud-init config" [Medium,Triaged]14:40
danMS_Do you know if other cloud platforms have hit this issue?14:41
smoserdanMS_: by "platform" you mean cloud platform (not linux distro)14:56
smoserright?14:56
danMS_yes14:57
smoserthe issue is probably present everywhere.14:57
smoserhowever it is dramatically worse on azure14:57
smoserbecause of "redeploy" with same instance id.14:57
smoserer... maybe not relevant to instance id.14:57
smoserlet me try again14:58
smoserthe problem exists any time there are "stale" entries in /etc/fstab14:58
danMS_ok, on Azure they will get a new ntfs formatted ephemeral on redeploy or deallocate14:58
smoseron Azure, stale entries occur in a regular lifespan of an instance (with 'redeploy'... that might not be the right word)14:58
smoseron other platforms, the issue can occur on a snapshot -> new-instance14:59
danMS_so it sounds like, when they deallocate their VM, they present the previously attached ephemeral drive14:59
danMS_whereas we are not14:59
mgerdtsblackboxsw: Trying to update DataSourceSmartOS to use EventType.BOOT to get network reconfiguration on boot and hitting a snag.  The change is simple enough, I think: https://hastebin.com/vahizocidi.diff15:00
smoserdanMS_: well no one else "deallocates" to my knowledge.15:00
smoseri don't know actually.15:01
danMS_i have not done a deep dive, but i did not see this issue on Ubuntu vms15:01
mgerdtsBut when I install the new deb networking is not configured properly.  The ENI file has dhcp, and it looks like the saved configuration is for the wrong datasource.15:01
mgerdts>>> pickle.load(open('obj.pkl', 'rb'))15:02
mgerdts<cloudinit.sources.DataSourceNone.DataSourceNone object at 0x7fc92c313c50>15:02
mgerdtsThat was in /var/lib/cloud/intance.15:02
mgerdts*instance15:02
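A minimal sketch of the inspection mgerdts does above, wrapped in a helper so it can be repeated after each step of the recreate scenario. The helper name `describe_pickle` is hypothetical. Loading the real obj.pkl also needs the cloudinit package to be importable, because pickle resolves the stored class by its module path.

```python
import pickle


def describe_pickle(path):
    """Return 'module.ClassName' for the object stored in a pickle file."""
    with open(path, "rb") as stream:
        obj = pickle.load(stream)
    cls = type(obj)
    return "%s.%s" % (cls.__module__, cls.__name__)
```

Called on the obj.pkl under /var/lib/cloud/instance, this would report the same class name that the repr above shows.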
smosermgerdts: blackboxsw is out for a while15:03
mgerdtsoh, ok.  Any idea what could be causing the wrong datasource data to be cached?  That is, does this sound familiar?15:03
=== r-daneel_ is now known as r-daneel
rharpermgerdts: hrm;  that's curious;   what's the recreate scenario ? new instance boot, upgrade deb , reboot ?15:20
mgerdtsinstall xenial, upgrade to bionic, reboot, install new cloud-init, reboot (ssh host key changed - bad) (did not look closely at networking config), poweroff vm, modify network config in host, restart host metadata service to be sure it picked it up, booted VM.15:23
rharpermgerdts: ok;  I suppose it's best to pkl.load and print after each state to see where it changes15:24
mgerdtsSince then, I did cloud-init clean -l, reboot.  It put DataSourceNone on obj.pkl15:24
rharperah15:24
rharperthat sounds like unidentified change15:24
rharperclean wipes the previous instance info15:24
rharperso next boot ds-identify needs to run and pick; what does the log look like after that reboot ?15:25
mgerdtshttps://hastebin.com/axaguxusit.txt15:27
rharperno local data found from DataSourceSmartOS15:28
rharperso the SmartOS DS didn't say "yes" I have metadata/user-data15:28
rharperwhen cloud-init called .get_data() on it15:28
rharperwhich means that the fallback is DataSourceNone15:28
rharperso the question is , why did the SmartOS DS say it had no local metadata  ?15:29
rharperisn't that over the serial interface ?15:29
mgerdtsyeah,15:29
mgerdtsI'll try some debug statements in _get_data15:29
rharperyeah, I don't see a return without a boolean, and the False path has logging =(15:31
mgerdtsthat's what I was thinking15:31
rharperand some debugging in sources/__init__.py15:32
rharperwe now do this metadata caching;15:32
smosermgerdts: you should be able to use the main too15:32
smoserpython -m cloudinit.sources.DataSourceSmartOS15:32
smosermight be easier to debug that way15:33
mgerdtsthat dumps a bunch of metadata15:33
smoserso i think the pickled object must have a method that is getting in the way. method or attribute i guess.15:35
rharperIIUC, cloud-init clean was run15:35
rharperwhich wipes the object15:36
smoseroh. hm..15:36
mgerdtsyes, and verified that clean worked15:36
smoserso then this is essentially fresh boot ?15:36
smoseror as close as clean can get us ?15:37
mgerdtscommenting the change to DataSourceSmartOS caused ENI/50-cloud-init.cfg to get static network config.  Oddly, not the right network config.15:38
mgerdtshttps://hastebin.com/aviqacabew.txt15:38
mgerdtsip should be .223 per sdc:nics, but is .222 15:39
rharper"ip":"10.88.88.223","ips":["10.88.88.222/24"]15:39
rharperyour data disagrees15:39
mgerdtshuh.  I guess I missed one of the ip's in the zonecfg.15:40
rharperI think we only look at ips since it was a superset ?15:40
mgerdtsnotice 222 and 223 in there15:40
rharperyeah15:40
mgerdtsok, so something in the get_data() path is unhappy with update_events = {'network': [EventType.BOOT]} in DataSourceSmartOS.15:41
mgerdtsI'll go hunting15:42
mgerdtsLooks like the comment in class DataSource is wrong.  This seems to work:15:57
mgerdtsupdate_events = {'network': [EventType.BOOT_NEW_INSTANCE, EventType.BOOT]}15:57
mgerdtsApparently BOOT_NEW_INSTANCE is not a subset of BOOT.15:58
rharperoh, yeah15:58
rharperthat must have been aspirational15:58
mgerdts:)15:59
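A small sketch of why listing only BOOT was not enough, assuming an event is honoured only when it appears literally in the scope's list. The `EventType` string values and the helper `scopes_for_event` are stand-ins for illustration, not the cloud-init source.

```python
class EventType(object):
    # Placeholder values; only the two names come from the discussion above.
    BOOT = "boot"
    BOOT_NEW_INSTANCE = "boot-new-instance"


def scopes_for_event(update_events, event):
    """Return the scopes whose event list explicitly contains `event`."""
    return sorted(
        scope for scope, events in update_events.items() if event in events
    )


# Only BOOT listed: a first boot of a new instance matches nothing.
only_boot = {'network': [EventType.BOOT]}
# Both listed: first boot and every later boot match the network scope.
both = {'network': [EventType.BOOT_NEW_INSTANCE, EventType.BOOT]}
```

With `only_boot`, `scopes_for_event(only_boot, EventType.BOOT_NEW_INSTANCE)` is empty, which matches the symptom of the datasource never being read on a clean first boot.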
rharperDon't we log if we skip reading it ?15:59
mgerdtsI think DataSourceSmartOS should do: update_events['network'].append(EventType.BOOT)16:00
mgerdtsDoesn't look like it.16:01
rharperupdate_metadata could use some logging in the negative path I think16:01
rharperotherwise you get return False and nothing16:01
mgerdtsyeah, I'll add something there along with updating the aspirational comment.16:02
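One way the negative-path logging could look, as a standalone sketch. This is not the cloud-init implementation: the function signature, the `get_data` callable and the log messages are assumptions made to keep the example self-contained.

```python
import logging

LOG = logging.getLogger("update_metadata_sketch")


def update_metadata(update_events, source_event_types, get_data):
    """Refresh metadata if any incoming event is listed for some scope.

    Logs a debug message on every path that returns False, so a skipped
    or failed refresh leaves a trace in the log.
    """
    matched = {}
    for event in source_event_types:
        for scope, events in update_events.items():
            if event in events:
                matched.setdefault(scope, []).append(event)
    if not matched:
        LOG.debug("Datasource not updated for events: %s",
                  ", ".join(source_event_types))
        return False
    if get_data():
        return True
    LOG.debug("get_data() found no data for matched scopes: %s",
              ", ".join(sorted(matched)))
    return False
```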
rharperthanks for debugging that16:05
mgerdtsYeah, no problem.  Thanks for the nudges in the right direction.16:08
mgerdtshopefully this addresses the changed ssh host key after upgrade + reboot.16:09
mgerdtswould you like the fix for blackboxsw's change in a separate changeset from the smartos changes?16:10
mgerdtslikely: https://hastebin.com/evokubiluj.diff16:11
rharpermgerdts: the ssh won't regen if the instance-id hasn't changed;16:18
rharperseparate is best if that's not too much trouble16:19
mgerdtsso maybe dpkg -i cloud-init_all.deb caused it to get clobbered.16:19
mgerdtssure, easy enough.16:19
rharperI suspect we should also add a unittest on the derived Datasource16:19
rharperI wonder if the actual unittests do the .append() like you did16:19
mgerdtssadly unittests fail now.  So probably need some fixes there too.16:21
=== akik_ is now known as akik
mgerdtsI think I managed to sort this out.  https://code.launchpad.net/~mgerdts/cloud-init/+git/cloud-init/+merge/350374 then https://code.launchpad.net/~mgerdts/cloud-init/+git/cloud-init/+merge/350375 19:18
rharpermgerdts: thanks, reviewing19:27
smosermgerdts: you just want the first to land first ?19:27
smoserthe second is a superset right ? but not separate commits.19:27
smoseri think.. ?19:27
mgerdtsyeah, the second will break without the first.19:27
mgerdtsdue to list vs. set19:27
mgerdtsI tried to set dependencies in the merge proposal, but not sure if that is actually hooked into anything.19:28
mgerdtsawesome.  python 2.6 strikes again.20:50
smoserhttps://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/350381 21:16
smosermgerdts: ./tools/run-container is fairly easily usable from ubuntu if you have lxc to get you a centos/6 21:17
smoserset notation?21:17
mgerdtsyeah21:17
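For context on the set notation problem: set literals were added in Python 2.7, so `{a, b}` is a SyntaxError on Python 2.6, while `set([...])` works on both. The element strings below are placeholders; the actual change in the merge proposal is not shown in this log.

```python
# Set literal: valid on Python 2.7 and 3.x, SyntaxError on Python 2.6.
literal = {"boot", "boot-new-instance"}

# Portable spelling that Python 2.6 also accepts.
portable = set(["boot", "boot-new-instance"])
```

Both spellings build the same set, so switching to the portable form does not change behaviour on newer interpreters.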
mgerdtsshould be fixed now.  will CI bot automatically re-run or does it need to be nudged?21:22
smoserrharper: https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/350382 will fix tip-pylint21:22
rharperauto reruns21:22
rharperlooking21:22
