[08:22] <otubo> smoser: I'd appreciate a quick last review on my PR if you have some time :)
[13:47] <smoser> blackboxsw: or Odd_Bloke https://github.com/canonical/cloud-init/pull/721#pullrequestreview-581355914
[13:47] <smoser> otubo: yeah... i just responded.
[13:48] <smoser> always feel free to ping me, don't feel like you're annoying me.
[13:59] <otubo> smoser: will keep that in mind :) thanks!
[14:38] <Vultr-EBenner> Hello, I work with Vultr and we are currently working on adding a Datasource to Cloud-Init for our platform. One issue we have come across is that Cloud-Init by default only brings up the first interface, and any additional ones require a reboot. We use two different interfaces, one for public, and one for private. We really want both to be up here.
[14:38] <Vultr-EBenner> What we currently have implemented is that we are using the net class to determine the names and then bringing them up with subp commands. However, this is happening in the datasource and we are kind of on the fence about it. I wanted to reach out and see how the developers feel about this and how they would like to see us handle this.
[14:54] <Odd_Bloke> Vultr-EBenner: Hi!  cloud-init allows platforms to provide network configuration which will be applied to the instance; this sounds like a perfect use case for it: https://cloudinit.readthedocs.io/en/latest/topics/network-config-format-v2.html
[14:56] <Odd_Bloke> Vultr-EBenner: Basically, your datasource should provide network config in either the v1 or v2 format (I linked to the latter, which is preferred) in the .network_config property, and cloud-init will take care of converting that to the configuration format of the running system and applying it.
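A minimal sketch of the kind of v2 network config a datasource's .network_config property could return for a public and a private interface. The function name, interface names, MAC addresses, and the private address are all hypothetical placeholders, not anything taken from the Vultr datasource; only the v2 keys (version, ethernets, match, set-name, dhcp4, addresses) come from the documented format.

```python
import json


def build_network_config(public_mac, private_mac, private_addr):
    """Return a v2-format network config dict for two NICs.

    All argument values are supplied by the caller; in a real datasource
    they would presumably come from the platform's metadata.
    """
    return {
        "version": 2,
        "ethernets": {
            # Public interface: matched by MAC, addressed via DHCP.
            "public0": {
                "match": {"macaddress": public_mac},
                "set-name": "eth0",
                "dhcp4": True,
            },
            # Private interface: matched by MAC, statically addressed.
            "private0": {
                "match": {"macaddress": private_mac},
                "set-name": "eth1",
                "dhcp4": False,
                "addresses": [private_addr],
            },
        },
    }


# Hypothetical example values, for illustration only.
config = build_network_config(
    "52:54:00:aa:bb:01", "52:54:00:aa:bb:02", "10.0.0.5/24"
)
print(json.dumps(config, indent=2))
```

Describing both interfaces in one config lets cloud-init render them for the OS network layer, rather than the datasource bringing links up itself.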
[15:20] <Vultr-EBenner> That is indeed what we are doing. We are providing the network config, and it is indeed configured; however, the second interface is not brought up as the first one is.
[15:23] <rharper> Vultr-EBenner: how is network config provided? is your metadata for the instance network based? if so, what cloud-init does is bring up one interface to collect instance metadata first, and if the provided metadata includes network config (for one or more interfaces), this will be written to the OS net config and then the OS network layer will bring up all configured interfaces
[15:25] <Vultr-EBenner> We are depending on filesystem and network, then providing the config via networking with our metadata endpoint. We provide the network config via the vendor config and via the network function in the datasource. The first and second interface are configured, the first one comes up but the second does not unless we explicitly configure it and bring
[15:25] <Vultr-EBenner> it up ourselves it seems.
[15:28] <smoser> Vultr-EBenner: you need to run at local time frame in order to provide network config.
[15:28] <smoser> otherwise it is simply too late to apply it.
[15:30] <smoser> even if you "manually" apply it in "cloud-init init" time frame, other parts of the system have already expected networking to be up.
[15:30] <rharper> Vultr-EBenner: if your network config is included in the metadata end point, then you'll want to make use of the EphemeralDHCP class, see Azure or EC2 Datasources ... this allows your datasource to collect the correct instance network config before OS networking is configured;
[15:30] <smoser> look at DigitalOcean's datasource, which does something like what you need.
[15:31] <rharper> yeah, that's likely a simpler class to review
[15:32] <smoser> it doesn't have to be ephemeraldhcp... it just has to deterministically identify that the platform is Vultr, and then bring up networking manually to get the datasource, and then tear it all down.
[15:32] <Vultr-EBenner> I think I may be explaining this poorly. We aren't having trouble creating the configs and getting them pushed. That's all good. The issue is that the second interface isn't being brought up, just configured after everything is said and done. We can connect to the metadata server and configure everything just fine, that's not an issue.
[15:32] <Vultr-EBenner> No errors in the logs; the second interface is just never brought up, no matter how we configure it, it seems.
[15:33] <Vultr-EBenner> But it is configured. On reboot it comes up as expected.
[15:33] <rharper> smoser: but ephemeral dhcp might be less work if you just need an interface up to talk to the metadata service; either way is fine.
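The "bring up networking, fetch the metadata, tear it all down" pattern described above can be sketched as a context manager. This is an illustration of the pattern only, not cloud-init's EphemeralDHCP implementation: ephemeral_network and the fake_* helpers are hypothetical stand-ins that just record what happened.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_network(bring_up, tear_down):
    """Bring up temporary networking and always tear it down afterwards.

    bring_up and tear_down are caller-supplied callables (hypothetical);
    a real datasource would do DHCP or static setup here.
    """
    lease = bring_up()
    try:
        yield lease
    finally:
        # Runs even if the metadata fetch raises, so no temporary
        # config is left behind for the OS network layer to trip over.
        tear_down(lease)


events = []


def fake_bring_up():
    events.append("up")
    return {"interface": "eth0"}


def fake_tear_down(lease):
    events.append("down:" + lease["interface"])


def fake_fetch_metadata(lease):
    events.append("fetch")
    return {"network": {"version": 2, "ethernets": {}}}


with ephemeral_network(fake_bring_up, fake_tear_down) as lease:
    metadata = fake_fetch_metadata(lease)

print(events)
```

The try/finally is the point of the design: the temporary link exists only for the duration of the metadata fetch, so the real network config can be applied afterwards from a clean state.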
[15:34] <rharper> Vultr-EBenner: right; so, if you have logs, cloud-init collect-logs would be good;  it sounds like your network-config property might not get populated at local time; in which case, you get a dhcp on one interface config, and then later the config is updated but OS networking has already run
[15:34] <Vultr-EBenner> Again, our issue isn't connecting to the metadata or anything in the config generation steps. It's that after the config is applied by cloud-init, the second interface is left down.
[15:34] <smoser> "how we configure it"
[15:34] <smoser> ?
[15:35] <smoser> you said that you are providing this at network time frame. "We are depending on filesystem and network"
[15:35] <smoser> that means it happens too late.
[15:35] <Vultr-EBenner> Ah, thank you, thats what I was wondering.
[15:35] <Vultr-EBenner> Assuming we will need to bring the link up on our own and do it before the networking step and stop depending on it then.
[15:36] <smoser> if you've correctly implemented check_instance_id, then a reboot will probably fix it. but yeah... you don't want that.
[15:36] <smoser> the fix is to run at local time frame.
[15:36] <Vultr-EBenner> Great, thanks. We were thinking that was the issue. We have a solution ready to fix this then. Wasn't sure
[15:48] <Vultr-EBenner> Hmm, apparently I missed the message about running local prior, haha, sorry for that. Yeah, that does indeed seem to be the issue. Thanks again for the clarification.
[15:57] <rharper> np
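The "local time frame" advice in the exchange above comes down to which dependencies a datasource declares. The sketch below is self-contained and does not import cloud-init: the DEP_* constants and list_from_depends are stand-ins modelled on the helpers of the same name in cloudinit.sources (an assumption about their behaviour, not a copy), and the DataSourceExample* classes are hypothetical.

```python
# Stand-ins for cloud-init's dependency markers (assumed values).
DEP_FILESYSTEM = "FILESYSTEM"
DEP_NETWORK = "NETWORK"


class DataSourceExampleLocal:
    """Hypothetical datasource that runs before OS networking is up."""


class DataSourceExampleNet:
    """Hypothetical datasource that waits for OS networking."""


# Each entry pairs a datasource class with the dependencies it needs.
datasources = [
    (DataSourceExampleLocal, (DEP_FILESYSTEM,)),
    (DataSourceExampleNet, (DEP_FILESYSTEM, DEP_NETWORK)),
]


def list_from_depends(depends, ds_list):
    """Return the classes whose declared deps exactly match `depends`."""
    wanted = set(depends)
    return [cls for cls, deps in ds_list if set(deps) == wanted]


def get_datasource_list(depends):
    return list_from_depends(depends, datasources)


# Local stage: only the filesystem is available, so only the
# filesystem-only datasource is selected.
local_stage = get_datasource_list((DEP_FILESYSTEM,))
# Network stage: both filesystem and network are available.
network_stage = get_datasource_list((DEP_FILESYSTEM, DEP_NETWORK))

print([cls.__name__ for cls in local_stage])
print([cls.__name__ for cls in network_stage])
```

A datasource that lists only the filesystem dependency is picked up before OS networking is configured, which is when its network config can still take effect on first boot.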
[16:25] <kryl> hi there, I'm searching for a service that can emulate EC2-like meta-data (for testing cloud-init with vagrant)
[16:40] <smoser> kryl: old. https://gist.github.com/smoser/1278651
[16:45] <kryl> smoser, thank you
[18:48] <jxtps> Hi! I'm trying to create an AWS AMI based on the AWS Deep Learning AMI (Ubuntu 16.04) v38.0. The provided AMI is quite slow to get training going due to a lot of "first run" type setup work that needs to be done (select framework (tensorflow), compile NVIDIA kernels (?), fetch transfer learning weights file, matplotlib needs to initialize some
[18:48] <jxtps> cache, ...).
[18:48] <jxtps> So I have created a warmup script that basically just runs a very limited training run and then I package up an AMI based on that. That works ok-ish - it's faster to get training going on a new instance than it would otherwise be. It's however still quite slow (~8 minutes), and I'm concerned that there are files stored to e.g. temporary directories
[18:48] <jxtps> that don't make it through the AMI-ification process.
[18:48] <jxtps> I'm aware of https://cloudinit.readthedocs.io/en/latest/topics/cli.html#cli-clean but the artifacts I'm concerned about are probably not part of the cloud-init state - is there any other cleaning that happens when packaging up an AMI on AWS?
[18:48] <jxtps> (I spawn an instance off of the base AMI, with a cloud-init script in the user-data that powers down the instance on completion, then an orchestration instance detects that shutdown and issues an ec2 create-image API call)
[18:48] <jxtps> Thanks!
[18:56] <Odd_Bloke> jxtps: My only experience is building the Ubuntu cloud images: those are constructed in a chroot without being booted, precisely to avoid having to clean up artifacts created by a boot.  You might be well-served by looking into image creation tools (e.g. Packer) which are likely to encapsulate best practice for you?
[18:57] <Odd_Bloke> (Other folks in here likely have more relevant experience, too. :)
[20:25] <andrewbogott> I maintain a custom debian base image with a bunch of things pre-installed for faster startup. It is… mostly not worth the trouble.  For real performance I'd probably make a snapshot post-cloud-init and use that.
[20:26] <andrewbogott> (not sure that's 100% applicable to your issue though)