/srv/irclogs.ubuntu.com/2020/04/22/#cloud-init.txt

=== cpaelzer__ is now known as cpaelzer
=== hjensas is now known as hjensas|afk
caribouHello everyone, did someone report issues with cloud-init cloud-config & cloud-final services being blocked by snapd in recent ubuntu cloud images ?12:30
caribouAll I can find is a recent discussion in linuxcontainers.org reporting similar issues12:31
caribouwell looks like it's more of a snapd problem than cloud-init12:36
=== hjensas|afk is now known as hjensas
caribouok, it IS snapd's fault; removing the package releases the cloud-config & cloud-final jobs12:50
caribouany snapd IRC channel around ?12:50
caribouFYI : https://bugs.launchpad.net/snapd/+bug/187424912:57
ubot5Ubuntu bug 1874249 in snapd "snapd service never completes on boot off focal cloudimg" [Undecided,New]12:57
Odd_Blokecaribou: I haven't seen that particular failure.  BTW, it looks like you have truncated lines in your journal output there, which might make it harder for the snapd folks to debug.13:33
cariboujust got it today with the new cloudimg13:35
caribouok, I'll check the logs & add a new set13:35
blackboxswOdd_Bloke: I see the same issue here that I'm seeing over on the ua-client repo: no travis links to jobs that are in progress. https://github.com/canonical/cloud-init/pull/323 have you noticed this before?17:55
blackboxswI'm finding as well on ua-client that even completed travis jobs are not firing a status response back to the source PR, so it remains unmergeable17:56
Odd_BlokeIt happens from time-to-time, yeah.17:56
Odd_BlokeMigrating to travis-ci.com should also fix this, I believe.17:57
Odd_Bloke(Because AIUI .org uses an older, now-deprecated GitHub API.)17:57
blackboxswsimilar to intermittent probs like this I think https://github.com/travis-ci/travis-ci/issues/736317:59
blackboxswgotcha17:59
blackboxswyeah I can see your travis run has completed with success https://travis-ci.org/github/canonical/cloud-init/builds/678274876 but no status update on your PR yet https://github.com/canonical/cloud-init/pull/32318:00
Odd_BlokeYep.18:00
blackboxswhttps://www.githubstatus.com/ github issue:  Update - We have implemented a fix and are processing a backlog of notifications.18:03
blackboxswApr 22, 01:26 UTC18:03
Odd_BlokeTravis reported they were operational after that.18:03
Odd_BlokeOh, not after that, but I think notifications are probably user-facing notifications?18:04
Odd_Blokehttps://www.traviscistatus.com/incidents/bj882gcyxh9v corresponded to https://www.githubstatus.com/incidents/dsf2qtzh4jpz18:04
AnhVoMSFTif I want cloudinit to write netplan yml file into /run/netplan, where is the right place to change the netplan_path?20:06
AnhVoMSFTI changed it in the datasource's __init__ (distro.renderer_configs['netplan']['netplan_path']) but I don't think it's being picked up20:07
Odd_BlokeAnhVoMSFT: I don't know off the top of my head.  What are you trying to achieve by doing this?20:27
AnhVoMSFTcloud-init, once disabled/removed, leaves behind the /etc/netplan/50-...yml configuration file that has a mac address hardcoded in it. This causes problems for customers who snapshot the VHD and want to boot it up as a separate VM.20:30
AnhVoMSFTsince network configuration is re-generated upon every boot on Azure anyway, it makes more sense to write the netplan configuration file in /run where it does not persist across boots20:31
AnhVoMSFTI'm trying to change the path of the netplan config from within the datasource so that it writes to /run/netplan instead20:32
Odd_BlokeShouldn't cloud-init run on those VMs and regenerate the correct configuration (with the appropriate MACs for that VM)?20:34
AnhVoMSFTA couple scenarios where that does not work: 1) Customers already disabled/removed cloud-init, 2) In some scenarios, the metadata source isn't available when booting these VHDs20:38
AnhVoMSFTSo I changed the distro's renderer_config that was passed to the datasource, but when I print it out from distros/__init__.py's _supported_write_network_config, it does not seem like the change was picked up20:42
Odd_BlokeAnd what would happen to an instance that was rebooted and had cloud-init fail for some reason?  I think it would fall off the network if its networking config was all in /run?20:51
AnhVoMSFTThat is a good point. I think it would depend on when/where cloud-init fails. Let me think about it a bit20:56
Odd_BlokeIn fact, (1) is a case where storing the network config in /run would fail too, isn't it?  `apt remove cloud-init; reboot` -> no network config21:00
AnhVoMSFTindeed. Would it work to write a netplan file into /etc/netplan without a mac address, then write one with the mac address into /run? thinking out loud21:02
AnhVoMSFTalthough that is probably as good as not writing mac address into /etc in the first place21:03
Odd_BlokeAnhVoMSFT: This feels quite complex, and I'm worried that we will miss/forget stuff if we discuss it in IRC.  Do you think you could file a bug for it so that we can make sure we all understand the requirements/problem statement?21:05
AnhVoMSFTlet me see if we had an existing bug on it21:06
AnhVoMSFTwe did talk about this with Ryan and Josh in one of our sync meetings and at the time the /run approach seemed reasonable, but you pointed out a pretty big gap21:07
=== mutantturkey is now known as old_joe
AnhVoMSFTI guess the main problem is cloud-init is leaving behind the netplan file with a hardcoded mac address in it21:07
=== old_joe is now known as mutantturkey
Odd_BlokeWell, it's "leaving it behind" so that it can apply network configuration correctly on the next boot, so it's not entirely a "problem". :)21:08
AnhVoMSFTI think what I meant was when it gets removed/uninstalled, etc...21:09
AnhVoMSFTbut the problem isn't so much leaving it behind, the problem is that it hardcodes the mac address in it, which can potentially become stale if there isn't an entity that updates it21:09
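The "stale MAC" concern above can be sketched in Python. This is a minimal illustration, not cloud-init or netplan code: the config dict mirrors the general shape of a version 2 network config with a `match`/`macaddress` entry, the MAC address is made up, and the helper names `present_macs` and `stale_entries` are hypothetical.

```python
from pathlib import Path

# Hypothetical example of a v2 network config, expressed as a Python dict.
# The MAC address is invented for illustration.
EXAMPLE_CONFIG = {
    "network": {
        "version": 2,
        "ethernets": {
            "eth0": {
                "dhcp4": True,
                "match": {"macaddress": "00:0d:3a:00:00:01"},
                "set-name": "eth0",
            }
        },
    }
}


def present_macs(sys_class_net="/sys/class/net"):
    """Return the lowercase MAC addresses of interfaces the kernel exposes."""
    macs = set()
    for addr_file in Path(sys_class_net).glob("*/address"):
        try:
            macs.add(addr_file.read_text().strip().lower())
        except OSError:
            # Skip interfaces whose address file cannot be read.
            continue
    return macs


def stale_entries(config, macs):
    """Return names of ethernets whose matched MAC is not in `macs`."""
    stale = []
    ethernets = config.get("network", {}).get("ethernets", {})
    for name, entry in ethernets.items():
        mac = entry.get("match", {}).get("macaddress")
        if mac is not None and mac.lower() not in macs:
            stale.append(name)
    return stale
```

On a restored VM that received a new MAC, `stale_entries(EXAMPLE_CONFIG, present_macs())` would list `eth0`, which is the situation described here: nothing is left on the system to rewrite the entry.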
Odd_BlokeRight, but I think hardcoding the MAC address is the correct thing to do in the general case.  Because if we don't do it then, potentially, on future boots, interfaces can be presented to userspace with different names (this can happen due to races in the kernel, so it's not platform-specific, or it can be the platform presenting them at different PCI addresses), and we'll apply incorrect configuration.21:12
Odd_Bloke(Do you already have a deprovisioning process that these customers are expected to follow?  Could that be expanded to include a step which calls cloud-init somehow?)21:13
AnhVoMSFTif there is only one nic there's no need for hardcoding mac. Or do we still need to hardcode it?21:14
AnhVoMSFTthe trouble is the backup/restore scenario where the customer takes a snapshot or backs up the OSDisk, then later restores it (as a different VM)21:14
AnhVoMSFTalthough backup/restore might not be as big of a problem if they provision it again as a normal VM, because cloud-init will run and perform network config21:15
Odd_BlokeAnd boots of those restored VMs don't run cloud-init?21:15
Odd_BlokeAha, we raced on the question and answer there. :)21:15
AnhVoMSFTonly if they attach the OS Disk as a specialized VM (which is the only way to boot up from a vhd today)21:15
AnhVoMSFTso there are some limitations of the platform there - when attaching a disk as a specialized vhd there isn't any provisioning information made available, and cloud-init fails at some point earlier on and doesn't really do network config if I remember correctly21:16
AnhVoMSFT(it would fail to find Azure datasource, because there's no provisioning ISO attached)21:17
Odd_BlokeTo answer a slightly earlier question: we wouldn't need to hardcode the MAC if we were sure there would only _ever_ be one NIC.  But instances could have NICs attached, or disk images could be restored to systems with multiple NICs, so we can't assume that.21:18
Odd_Bloke(Obviously the restore case would break with a hardcoded MAC, so perhaps that wasn't the best example.  Still, the attach case is valid.)21:19
AnhVoMSFTyeah, customers can add new NIC, reboot, and probably lose network :-)21:19
AnhVoMSFTactually in that case no, because when they reboot they will get new config with 2 NICs and we'll be writing network config correctly (hopefully)21:20
Odd_BlokeRight, this would be the case where cloud-init had been disabled, I guess.21:21
Odd_BlokeInstance booted with a single NIC, cloud-init persists MAC address, cloud-init is removed, NIC added, reboot -> the cloud-init generated config will still reliably apply to the original NIC21:22
Odd_Bloke(Right?)21:22
AnhVoMSFTright21:22
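The two cases just discussed (a NIC attached after cloud-init was removed, and a disk restored as a different VM) can be modelled with a small Python sketch. It is a deliberately simplified model of "apply the config to whichever NIC has the persisted MAC", not netplan's actual matching logic; the function name `select_by_mac` and all MAC addresses are made up for illustration.

```python
def select_by_mac(config_mac, nics):
    """Return the kernel name of the NIC whose MAC matches, or None.

    `nics` maps kernel interface name -> MAC address.
    """
    for name, mac in nics.items():
        if mac.lower() == config_mac.lower():
            return name
    return None


# MAC persisted in the generated config on first boot (invented value).
ORIGINAL_MAC = "00:0d:3a:00:00:01"

# A second NIC was attached and the kernel happened to swap the names:
# the persisted MAC still identifies the original NIC.
after_attach = {"eth0": "00:0d:3a:00:00:02", "eth1": ORIGINAL_MAC}

# The disk was restored as a different VM with a fresh MAC:
# no NIC matches, so the config applies to nothing.
after_restore = {"eth0": "00:0d:3a:00:00:99"}

print(select_by_mac(ORIGINAL_MAC, after_attach))   # eth1
print(select_by_mac(ORIGINAL_MAC, after_restore))  # None
```

Matching by name instead would pick `eth0` in both cases, which is wrong after the attach and only accidentally right after the restore; that trade-off is what makes the requirements worth pinning down.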
AnhVoMSFTthis is tricky...21:22
Odd_BlokeAgreed.21:23
Odd_Bloke:p21:23
AnhVoMSFTlet me look into the scenario where we boot up vhd and no provisioning ISO attached21:23
Odd_BlokeYeah, this definitely feels like we need to understand the exact requirements driving the change, because that could make a substantial difference to the solution.21:24
AnhVoMSFTperhaps we can do something there21:24
AnhVoMSFTyeah, we have these support cases from backup/restore customers whose VMs now fail to boot up due to the mac address in netplan. I will take a closer look and perhaps file a bug with better details so we can discuss21:25
Odd_BlokeOK, cool, thank you!21:25
AnhVoMSFTthanks Odd_Bloke21:26
=== tds1 is now known as tds
