[09:22] <Xat`> hello guys
[09:22] <Xat`> how is scripts-per-instance determined?
[09:23] <Xat`> how does cloud-init know about the first boot?
[09:28] <powersj> Xat`, the instance-id is read; if that changes, it is considered a new instance
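A minimal sketch of the mechanism powersj describes: compare the instance-id reported by the datasource against the one cached from the previous boot. The cache path and function name here are illustrative, not cloud-init's exact implementation.

```python
# Illustrative first-boot / new-instance detection: a new instance is
# declared when the current instance-id differs from the cached one.
from pathlib import Path

CACHE = Path("/var/lib/cloud/data/instance-id")  # illustrative persisted location

def is_new_instance(current_id: str, cache: Path = CACHE) -> bool:
    previous = cache.read_text().strip() if cache.exists() else None
    cache.parent.mkdir(parents=True, exist_ok=True)
    cache.write_text(current_id + "\n")  # record for the next boot
    return previous != current_id
```

Note the cache must live on persistent storage; a marker under /run would be lost (and look "new") on every reboot.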
[09:32] <Xat`> powersj: I am testing with a local instance on vbox, how is this implemented?
[09:35] <Xat`> On a cloud provider, I guess the instance-id is retrieved from instance metadata. How does it work with vbox or vmware?
[09:35] <Xat`> wait a min, maybe cloud-init provides it
[09:35] <Xat`> let me query the metadata url
[09:36] <Xat`> ok no it does not
[09:38] <Xat`> powersj: nvm, I'm gonna read about instance metadata with cloud-init ;)
[09:40] <Xat`> I changed the value in /run/cloud-init/.instance-id, then did a reboot, but the script in scripts-per-instance has not been executed
[15:53] <Odd_Bloke> blackboxsw: You have some changes requested on https://github.com/canonical/cloud-init/pull/70 if you want to take another look
[16:31] <blackboxsw> Odd_Bloke: otubo. https://github.com/canonical/cloud-init/pull/70 looks good. was there a Launchpad bug related to this commit set?
[16:32] <blackboxsw> I've approved pull 70, just didn't squash-merge yet in case we forgot to correlate it to a Launchpad (or Red Hat) bug
[16:38] <blackboxsw> ahh yes there was
[16:38] <blackboxsw> https://bugs.launchpad.net/cloud-init/+bug/1781781
[16:38] <blackboxsw> ok I'll tie that bug to the squashed commit message
[17:21] <blackboxsw> ohh interesting Odd_Bloke paride, on an azure vm where I've config'd ipv6 and ipv4 on nic0 and only ipv4 on nic1, IMDS is showing/allocating ipv6 addresses to nic1.
[17:21] <blackboxsw> cloud-init query ds.meta_data.imds.network | pastebinit
[17:21] <blackboxsw> http://paste.ubuntu.com/p/T2CSCQTRVC/
[17:22] <blackboxsw> I think we may have a minor issue to file for clarity with azure folks.
[17:31] <Odd_Bloke> Yeah, sounds like some clarification is necessary.
[17:45] <akik> i expected a non-zero exit code if the events log says container die
[17:45] <akik> uhh wrong channel
[18:02] <cyberpear> any chance of passing the cloud-init config via `-fw_cfg` option of qemu? kind of like how Fedora CoreOS does its ignition config? `--qemu-commandline="-fw_cfg name=opt/com.coreos/config,file=/path/to/example.ign"` https://docs.fedoraproject.org/en-US/fedora-coreos/getting-started/#_launching_with_qemu
[18:50] <Odd_Bloke> cyberpear: Using the firmware configuration like that isn't supported, so the two options I would suggest are using the kernel cmdline or a NoCloud metadata drive.  Both of those options are documented at https://cloudinit.readthedocs.io/en/latest/topics/datasources/nocloud.html#datasource-nocloud
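A sketch of the NoCloud seed-drive route Odd_Bloke points to: write `meta-data` and `user-data` files and pack them into a volume labeled `cidata` that qemu can attach. The file contents follow the NoCloud datasource docs; the `genisoimage` invocation shown in the comment is one common way to build the seed ISO, and the hostname/instance-id values are just examples.

```python
# Build the two NoCloud seed files; cloud-init finds them on a drive
# whose volume label is "cidata".
from pathlib import Path

seed = Path("seed")
seed.mkdir(exist_ok=True)
(seed / "meta-data").write_text(
    "instance-id: iid-local01\n"
    "local-hostname: testvm\n"
)
(seed / "user-data").write_text(
    "#cloud-config\n"
    "runcmd:\n"
    "  - echo booted > /var/tmp/booted\n"
)

# One way to pack the seed (then attach seed.iso as a cdrom/disk to qemu):
#   genisoimage -output seed.iso -volid cidata -joliet -rock \
#       seed/user-data seed/meta-data
```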
[18:51] <Odd_Bloke> cyberpear: (You can also file a feature request using the bug link in the topic, if you'd like. :)
[19:53] <Odd_Bloke> blackboxsw: https://github.com/cloud-init/ubuntu-sru/pull/87 <-- for your review; in particular, review of the verification script before I start running it for all releases would be appreciated :)
[21:16] <blackboxsw> Odd_Bloke: looks good. The testing you are doing there is probably a bit deeper than needed, as we could have used `lxc exec test-$SERIES -- cloud-init devel net-convert --output-kind=netplan --directory /out.d --network-data=network.yaml --distro ubuntu` and validated the output results instead of having to set up an lxc and override configs. If you wanted to exercise the whole system instead, it is definitely
[21:16] <blackboxsw> more thorough to set up the lxc network on launch, and your test is valid
[21:24] <blackboxsw> comment and pointer added to https://github.com/cloud-init/ubuntu-sru/pull/87; take what you will.
[21:32] <ahosmanMSFT> Hi @blackboxsw, it's been a while; I've been in between teams. I noticed azurecloud integration has an issue with the function _wait_for_system(self, wait_for_cloud_init) in the base instance.py class. This function is called after the vm is booted and tries to run a script and ssh. When removing that function there are no ssh issues, but for now ssh'ing is 50/50. Can you help me look into this?
[21:33] <blackboxsw> hi ahosmanMSFT.
[21:34] <blackboxsw> is that failing due to timeout?
[21:34] <ahosmanMSFT> Yes
[21:36] <ahosmanMSFT> When I remove that function it's a 100% success
[21:38] <blackboxsw> so on a test run that did fail you'd probably want to pass --preserve-instance and see if ssh connectivity came up sometime later, after the default boot_timeout that you have set for azure, which is 300 seconds.
[21:40] <blackboxsw> on a test system that is retained (and exhibited the timeout failure) I'd be curious to run `cloud-init analyze blame` to see if cloud-init was spending an inordinate amount of time setting up
[21:40] <ahosmanMSFT> I did some tests and ssh connectivity is available, I think it has to do with either the scripts or something else in that function
[21:40] <blackboxsw> if cloud-init setup on Azure is < 30 seconds, then the issue is somehow that the initial ssh connection to the vm is timing out without connecting
[21:40] <ahosmanMSFT> hmm I haven't run blame on the system
[21:42] <ahosmanMSFT> I presume it has nothing to do with ssh'ing itself, but with the wait part, because it ssh's immediately when that function is removed
[21:42] <blackboxsw> ahosmanMSFT: also what that 'script' waits for is for a systemd enabled system to report either `systemctl is-system-running == 'running' or 'degraded'`
[21:42] <blackboxsw> so checking `systemctl is-system-running` on the system will tell you what state it is in
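The wait blackboxsw describes can be sketched as a poll loop on `systemctl is-system-running` that accepts either 'running' or 'degraded'. This is an illustrative sketch of the described behavior, not the actual _wait_for_system code; the timeout default mirrors the 300-second boot_timeout mentioned above, and the poll interval is an assumption.

```python
# Poll `systemctl is-system-running` until the system reports
# 'running' or 'degraded', or the timeout expires.
import subprocess
import time

def system_is_up(state: str) -> bool:
    # systemd considers both states "booted"; 'degraded' means some
    # units failed but the system finished starting.
    return state.strip() in ("running", "degraded")

def wait_for_system(timeout: int = 300, interval: int = 5) -> bool:
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.run(
            ["systemctl", "is-system-running"],
            capture_output=True, text=True,
        ).stdout
        if system_is_up(out):
            return True
        time.sleep(interval)
    return False  # timed out, mirroring the test harness failure mode
```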
[21:43] <blackboxsw> ahosmanMSFT: and a `systemd-analyze blame` on the timed-out system will also tell you where the boot process spent most of its time
[21:54] <ahosmanMSFT> Ok, I'll try that and let you know. Got a meeting soon though.
[22:04] <Odd_Bloke> blackboxsw: https://github.com/canonical/cloud-init/pull/185 <-- very small CI fix/change
[22:14] <Odd_Bloke> blackboxsw: And https://github.com/cloud-init/ubuntu-sru/pull/88
[22:20] <ahosmanMSFT> Do cloud tests run on every PR? I know they run every night @blackboxsw
[22:32] <Odd_Bloke> We run a subset of the lxd tests for each PR.
[22:32] <Odd_Bloke> But the full LXD test suite and the non-LXD test suites only run nightly.
[22:34] <ahosmanMSFT> how about azure/ec2?
[22:37] <Odd_Bloke> As I said, the non-LXD test suites only run nightly. :)
[22:37] <ahosmanMSFT> ok, that makes sense since those tests would consume more time
[22:50] <blackboxsw> more time and more $$  spinning up instances on the clouds :)
[22:51] <meena> Odd_Bloke: what about contextlib vs contextlib2?
[23:01] <blackboxsw> ahosmanMSFT: did you know the azure instance type which exhibits byte-swapping behavior?
[23:01] <blackboxsw> I'm trying to validate that your fix resolves the issue w/ incorrectly seeing 'new' instance-id across boots
[23:01] <blackboxsw> as that fix is part of this SRU
[23:02] <ahosmanMSFT> It was on all Azure gen2 VMs when switching nodes on azure
[23:02] <blackboxsw> thanks ahosmanMSFT
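The byte-swapping behavior under discussion can be sketched as follows: on some VMs the SMBIOS product UUID's first three fields are reported in little-endian byte order, so a robust instance-id comparison accepts both the literal and the byte-swapped form. The function names below are illustrative, not cloud-init's actual API.

```python
# Compare an instance-id UUID against a previous one, tolerating the
# endianness swap of the first three UUID fields seen on some platforms.
import uuid

def byte_swap_uuid(uuid_str: str) -> str:
    # Round-tripping through bytes_le swaps time_low, time_mid, and
    # time_hi_version (the first three dash-separated fields).
    return str(uuid.UUID(bytes=uuid.UUID(uuid_str).bytes_le))

def same_instance(previous: str, current: str) -> bool:
    return current.lower() in (previous.lower(), byte_swap_uuid(previous))
```

With this check, a node switch that flips the reported UUID's endianness no longer looks like a brand-new instance across boots.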
[23:15] <ahosmanMSFT> @blackboxsw I'm witnessing some weird behavior in azurecloud/image.py: two different if statements, one executes and one doesn't, yet they both have the same self._img_instance value of NONE. Can you verify this? This is why images aren't launching in azurecloud integration tests. https://paste.ubuntu.com/p/VW8SH8QXsj/
[23:19] <Odd_Bloke> meena: I haven't looked at it yet, but I assume it can go?
[23:25] <blackboxsw> ahosmanMSFT: is self._img_instance the string "NONE" instead of the python value of None?
[23:25] <blackboxsw> that would trigger one path to run, and the other not
[23:27] <ahosmanMSFT> They both have the same value; you can see where it's initialized in azurecloud/image.py __init__
[23:31] <blackboxsw> ahosmanMSFT: your `LOG.debug("self._img_instance: %s" % self._img_instance)` is down below `self.platform.create_instance(` and `self._img_instance.start(wait=True, wait_for_cloud_init=True)`
[23:32] <blackboxsw> so it's one of those two that isn't completing without error (which is why your logs don't show the `LOG.debug("self._img_instance: %s" % self._img_instance)`)
[23:32] <blackboxsw> so the logic paths are properly followed; just something bogus happening in the create_instance or instance.start() calls, right
[23:34] <blackboxsw> ahosmanMSFT: is there a specific cloud_test name that typically fails for you when things do fail?
[23:36] <ahosmanMSFT> blackboxsw it doesn’t fail individual tests, but when running multiple tests it fails to create a clean image for the rest of the tests, due to failing to create a snapshot
[23:37] <blackboxsw> ok will run a suite and see if I can get it to fail for me
[23:38] <blackboxsw> won't be able to kick that off though until I'm done with current SRU verification on Azure specifically (as I don't want to collide w/ my manual test runs in the same account)
[23:39] <blackboxsw> I just have one more manual SRU test to run (I had hit a configuration problem as I sent in email). But I *think* I've worked around it by creating a load balancer for the moment.
[23:40] <ahosmanMSFT> blackboxsw: Thanks, I’ll keep hacking at it too