[04:32] <johnjaye> is cloud-init useful to manage your personal vms? or would it be faster to just use the package manager or whichever on those systems?
[12:58] <meena> cjp256: for now, the plan is to keep it as is…
[12:59] <meena> I'm not sure how to have more than one NIC with the same MAC address
[13:04] <meena> cjp256: most of the refactor I'm doing is focused on BSDs, so for now, I would port the same code/bugs
[13:06] <meena> cjp256: if i see this right, the issue is mostly triggered by MAC 00:00:00:00:00:00;
[13:07] <meena> cjp256: so, I should exclude https://github.com/canonical/cloud-init/blob/1f43a83e15fd47fa2a6b5d3836bf6e055b956b89/cloudinit/distros/parsers/ifconfig.py#L155 empty_mac()s here, from being stored and subsequently overwritten.
[13:20] <meena> we don't have an empty_mac() function, it would seem, that's just a pattern in tests
[15:46] <cjp256> meena: on Azure, there are virtual accelerated network interfaces that are bonded with their non-accelerated counterparts.  They share the same mac (not zeros).  The root of the problem is get_interfaces_by_mac() assumes 1:1 (and their callers).  I think moving callers to get_interfaces() and dropping the _by_mac() variants would address it.  But get_interfaces_by_mac() is the current cross-platform variant, which is probably a focus of the refactor?
[19:01] <holmanb> cjp256: Agreed, lots of places could use just an iterable of non-virtual interfaces rather than specifically _by_mac(). That said, would working around the exception there cause a different race elsewhere? I'm curious when the kernel will be done bonding and the interface configurable.
[19:13] <AnhVoMSFT> holmanb: https://github.com/torvalds/linux/blob/81e7cfa3a9eb4ba6993a9c71772fdab21bc5d870/net/core/dev.c#L10057
[19:15] <AnhVoMSFT> line 10057 is when the device is registered to the kernel and will be visible in /sys/class/net - line 10090 is when the bonding happens in the case of netvsc. The window between the two is fairly small (the code in between doesn't seem like it depends on the host/hypervisor). Our thinking is that a simple sleep/retry should work here
[19:54] <AnhVoMSFT> taking a closer look at the call in netvsc where they're trying to find the parent to bond the nic to: https://github.com/torvalds/linux/blob/f2b220ef93ea34ff6ce48fec382689cf02099f39/drivers/net/hyperv/netvsc_drv.c#L2276
[19:56] <AnhVoMSFT> apparently they're matching it by two methods: either matching the serial, or matching the mac address. I checked with the netvsc team and the vf_serial property is not exposed as an attr so cloud-init won't be able to use that field. Since we're solving the issue of duplicate mac address, the mac address matching isn't useful to us. This means there's really no way for us in user mode to know when kernel has finished bonding, other than just wait and try again
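The "wait and try again" idea above could look roughly like the sketch below. It is not cloud-init code: `wait_for_unique_macs` and `find_duplicate_macs` are hypothetical names, and the sketch assumes the caller passes in a function whose listing stops reporting the duplicate MAC once the kernel has finished bonding.

```python
import time
from collections import Counter
from typing import Callable, Iterable, List, Tuple

Interface = Tuple[str, str]  # (device name, MAC address)


def find_duplicate_macs(interfaces: Iterable[Interface]) -> List[str]:
    """Return the MAC addresses that appear on more than one interface."""
    counts = Counter(mac for _name, mac in interfaces)
    return sorted(mac for mac, seen in counts.items() if seen > 1)


def wait_for_unique_macs(
    list_interfaces: Callable[[], Iterable[Interface]],
    attempts: int = 5,
    delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Interface]:
    """Re-read the interface list until no MAC address is duplicated.

    list_interfaces is a hypothetical callable supplied by the caller;
    attempts and delay are illustrative defaults, not tuned values.
    """
    for attempt in range(attempts):
        interfaces = list(list_interfaces())
        if not find_duplicate_macs(interfaces):
            return interfaces
        # Only sleep when another attempt will follow.
        if attempt < attempts - 1:
            sleep(delay)
    raise RuntimeError(
        "duplicate MAC addresses still present after %d attempts" % attempts
    )
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays.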
[22:10] <holmanb> AnhVoMSFT: Thanks for the context. That sounds less than ideal, but we might be able to do something like that. I'm hesitant about introducing that kind of logic on other clouds, though if others use hyperv like this we may not have a choice. Is there any plan/interest in exposing the serial to userspace in the future?
[22:13] <AnhVoMSFT> holmanb: Another option is to do a driver check: if two nics share a mac and one of them uses the hv_netvsc driver while the other does not, that's expected, and the nic that is not using the hv_netvsc driver should be ignored.
[22:14] <AnhVoMSFT> That's definitely better than blindly retrying. However, this will introduce a specific driver's behavior into networking module
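A minimal sketch of the driver check described above, written as a pure function over data the caller has already collected. `drop_netvsc_vf_duplicates` is a hypothetical name, not an existing cloud-init function, and the driver strings other than `hv_netvsc` are only example values.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

NETVSC_DRIVER = "hv_netvsc"

Nic = Tuple[str, str, str]  # (device name, MAC address, driver name)


def drop_netvsc_vf_duplicates(nics: Iterable[Nic]) -> List[Nic]:
    """Ignore the non-hv_netvsc NIC when it shares a MAC with an hv_netvsc NIC.

    Any other duplicate (for example two NICs with the same MAC and the
    same driver) is left untouched so the caller can still report it.
    """
    by_mac: Dict[str, List[Nic]] = defaultdict(list)
    for nic in nics:
        by_mac[nic[1]].append(nic)

    kept: List[Nic] = []
    for group in by_mac.values():
        drivers = [driver for _name, _mac, driver in group]
        expected_pair = len(group) == 2 and drivers.count(NETVSC_DRIVER) == 1
        if expected_pair:
            # Keep only the hv_netvsc side of the bonded pair.
            kept.extend(nic for nic in group if nic[2] == NETVSC_DRIVER)
        else:
            kept.extend(group)
    return kept
```

As noted in the discussion, this puts one driver's behaviour into generic networking code, so it may belong behind a platform-specific check.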
[22:27] <meena> fun… 
[23:42] <meena> What data structure should I be using rather than Dict to be able to store duplicate keys?
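One common Python answer to this question is a dict that maps each key to a list of values, for example via `collections.defaultdict(list)`. The sketch below is illustrative only; `group_by_mac` is a hypothetical helper and not part of cloud-init.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_mac(interfaces: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each MAC address to every device name that carries it.

    Takes (device name, MAC address) pairs. A plain dict keyed by MAC
    would silently keep only the last device for a duplicated address.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name, mac in interfaces:
        # Normalise case so "AA:BB" and "aa:bb" land in the same bucket.
        grouped[mac.lower()].append(name)
    return dict(grouped)


macs = group_by_mac([("eth0", "00:0D:3A:00:00:01"), ("enP1s1", "00:0d:3a:00:00:01")])
print(macs)
```

Callers that need a single interface per MAC can then decide explicitly what to do when a list has more than one entry.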