[12:47] powersj, hey! First time attending cloud-init summit this year. Just wanted to understand the format of the event. Should I (can I?) prepare some slides to talk about cloud-init and Red Hat?
[15:02] robjo: here now
[15:03] rharper: in a meeting will ping you after, thanks
[15:03] sure
[15:05] Odd_Bloke: I think I addressed all of your review points, if you have some time to have another look :)
[15:10] tribaal: Sure thing! Today's looking pretty busy (meetings and catch-up after a public holiday yesterday), so it may be tomorrow before I get to it, FYI.
[15:11] I am trying to install cloud-init v19.1 on Ubuntu 16.04. The install finishes with status: done. But when I check the logs for cloud-final.service, it shows http://paste.openstack.org/show/755571/. I tried live capturing and deploying another VM from the image, and it gets assigned the earlier VM's IP. Can someone please suggest something to resolve this?
[15:12] amansi26: cloud-init comes installed in the Ubuntu cloud images by default; when you say "trying to install", can you clarify what you mean?
[15:13] (I'm heading in to a block of meetings in a couple of minutes, FYI, so I won't be as responsive here.)
[15:14] I added a custom module and made a Debian package. Then I installed that package on an Ubuntu machine
[15:15] The same version works fine for RHEL.
[15:20] Odd_Bloke: no worries, thanks for your time!
[15:50] amansi26: I would suggest you file a cloud-init bug (the link is in the topic) describing the issue in detail, and make sure to attach the tarball generated by `cloud-init collect-logs` to it. :)
[15:51] rharper: only blocker to shifting generate_fallback_config from speaking network v1 to network v2 seems to be that when generating driver/device_id udev rules by NetworkState, the driver parameters are lost in a couple of cases. So, the rules get emitted like this: DRIVERS=="?*" instead of DRIVERS=="hv_netvsc".
I'm gonna track that down now
[15:51] Odd_Bloke: Sure
[15:52] amansi26: good to hear of people packaging custom plugin modules. I wasn't sure how often that 'feature' of cloud-init was used
[15:53] blackboxsw: yes, that sounds right; I think we need a way to hang that in the NetworkState so that a v2 -> eni does the right thing; for v2 -> v2, that should be converted to match: {'device_driver': 'hv_netvsc'}
[15:55] rharper: +1 though I thought the match section had 'driver' and 'device_id' keys. I'll double check.. maybe that's my misunderstanding due to cloudinit/net/__init__.py:extract_physdevs._version_2
[15:55] * blackboxsw checks netplan.io examples
[15:56] only driver, mac, or name
[15:56] driver works
[15:56] that's what we want
[15:56] the azure code wants to explicitly ignore the mlx4_core devices; and the dhcp should apply only to the devices with driver=hv_netvsc
[15:56] https://netplan.io/reference#common-properties-for-physical-device-types
[15:56] roger
[15:56] so, we should be able to match with both driver and mac
[15:57] +1
[15:58] rharper: the reference for netplan doesn't say anything about a device_id property in match (only driver). Should v2 emit a device_id 0x3 (or whatever?) or is there another reference that might document that
[15:59] driver is fine
[15:59] does the device_id even show up in the udev rule?
[16:00] don't think so SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:11:22:33:44:55", NAME="eth1"
[16:00] right
[16:00] ok omitting
[16:00] right, azure only calls into fallback with the driver name to blacklist
[16:01] agreed
[16:01] I think the fallback just stuffed the device_id in there because it was part of the device_driver() return
[16:08] yeah and if we don't need/use it, then no need to call device_devid() anymore either
[16:09] cyphermox: thanks for the confirmation on !device_id
[16:09] I'm not against adding new matching ways, but it largely depends on what networkd can realistically do, and that's a bit of a pain
[16:10] +1, I don't think we have a use case that currently requires device_id. If we do we'll file a bug/feature request
[16:10] ack
[16:25] rharper: still have to get back to the other network issue, but something new popped up yesterday
[16:25] refresh my mind on other issues
[16:26] openstack setup in "dual stack mode, i.e. ipv4 & 6"
[16:26] this is the new one
[16:26] the metadata server produces:
[16:26] curl http://169.254.169.254/openstack/2018-08-27/network_data.json
[16:26] {"services": [], "networks": [{"network_id": "4aae8709-b4e6-4cf7-84f7-b7cbddfe3ecb", "link": "tapc48a243a-e6", "type": "ipv4_dhcp", "id": "network0"}, {"network_id": "4aae8709-b4e6-4cf7-84f7-b7cbddfe3ecb", "type": "ipv6_slaac", "services": [], "netmask": "ffff:ffff:ffff:ffff::", "link": "tapc48a243a-e6", "routes": [{"netmask": "::", "network": "::", "gateway": "fd29:c112:2871::1"}], "ip_address": "fd29:c112:2871:0:f816:3eff:fe64:b0d6
[16:26] ", "id": "network1"}], "links": [{"ethernet_mac_address": "fa:16:3e:64:b0:d6", "mtu": 1450, "type": "ovs", "id": "tapc48a243a-e6", "vif_id": "c48a243a-e623-4ef6-a363-98564e59fade"}]}
[16:26] the openstack helper then produces:
[16:27] 2019-08-05 14:14:10,631 - stages.py[DEBUG]: applying net config names for {'version': 1, 'config': [{'mtu': 1450, 'type': 'physical', 'subnets': [{'type': 'dhcp4'},
{'type': 'static', 'netmask': 'ffff:ffff:ffff:ffff::', 'routes': [{'netmask': '::', 'network': '::', 'gateway': 'fd29:c112:2871::1'}], 'address': 'fd29:c112:2871:0:f816:3eff:fe64:b0d6'}], 'mac_address': 'fa:16:3e:64:b0:d6', 'name': 'eth0'}]}
[16:27] so we have a "static" and a "dynamic" subnet on the same interface
[16:28] these get processed in order and so the renderer clobbers "bootproto=dhcp" with "bootproto=static"
[16:28] that of course breaks ipv4 access to the system
[16:29] one fix is to declare dhcp the winner in the renderer, i.e.:
[16:29] is it ?
[16:30] yes because ifcfg-eth0 will have "bootproto=static", which means the dhcp client will not start to request an IP address, and thus the instance only has the static IPv6 address
[16:30] but dhcp will allow static ip assignment in addition to dhcp ?
[16:31] yes, but the interface is configured as "static" therefore no dhcp request will be issued
[16:31] ok, is that consistent on rhel/suse ?
[16:32] the fix is to pick dhcp as the winner, i.e.
in the renderer it should be:
[16:32] elif subnet_type == 'static':
[16:32]     if iface_cfg['BOOTPROTO'] != 'dhcp':
[16:32]         iface_cfg['BOOTPROTO'] = 'static'
[16:33] it's on openSUSE/SLES, but I would be surprised if RHEL behaves differently in this case
[16:34] https://paste.ubuntu.com/p/drPGpmW8KT/
[16:34] I can't reproduce with cloud-init master
[16:34] both suse/centos have bootproto=dhcp
[16:38] robjo: the dual_stack.yaml is the v1 config you pasted from the log
[16:38] OK, looking at the code, the issue was reported with 18.5
[16:39] master produces the expected results, I agree
[16:39] going fishing through the code ..... be back in a bit
[16:41] robjo: in your 18.5 branch, I think you can just repeat the net-convert command like I did and see if it renders differently
[16:47] OK, that's weird, the 18.5 renderer does not set bootproto to static either, more digging required, thanks for the help
[16:50] robjo: sure
[17:15] rharper: just force pushed generate_fallback_config -> talking network v2 to https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/370970 for azure v2 support
[17:16] cool
[18:00] Hi folks, I'm seeing this error from one customer's deployment
[18:01] 2019-07-31 17:09:16,279 - util.py[DEBUG]: Failed mount of '/dev/sr0' as 'auto': Unexpected error while running command. Command: ['mount', '-o', 'ro', '-t', 'auto', '/dev/sr0', '/run/cloud-init/tmp/tmphmsab0y2'] Exit code: 32 Reason: - Stdout: Stderr: mount: /dev/sr0 is already mounted or /run/cloud-init/tmp/tmphmsab0y2 busy
[18:01] looks like the command took about 2s and then failed
[18:02] there was a dump of /proc/mounts earlier but /dev/sr0 wasn't on there, so it definitely wasn't the "already mounted" scenario, so likely /run/cloud-init/tmp/... was busy - that was created with "with temp_utils.tempdir() as tmpd:" in mount_cb, how can it be busy ?
[18:06] AnhVoMSFT: Are you able to file a bug with `collect-logs` attached?
[18:07] what logs do collect-logs get?
We have cloud-init-output log and cloud-init log, kernel log. The instance was deleted though
[18:09] if it said 'already mounted', then i'd think the most likely scenario is that it was already mounted.
[18:09] collect-logs will also get /run/cloud-init which is useful.
[18:09] If that's what we have, that'll probably do. I'd recommend doing a `collect-logs` in future if possible, though, because it gathers a bunch of useful stuff.
[18:09] lack of /dev/sr0 in /proc/mounts that is dumped could be a race. cloud-init tried mount, failed, and then unmounted.
[18:11] no, the dump of /proc/mounts is part of the util.py mounting code, it checks before trying to mount. While there is a potential race issue, it's unlikely because other than cloud-init nothing else is mounting /dev/sr0 during the init-local phase
[18:11] (which is quite early in the boot process)
[18:26] 3.14
[20:01] rharper: Odd_Bloke powersj, just published tip of cloud-init to eoan. should be seeing tip in cloud images tomorrow
[20:04] \o/
[20:17] cloud-init 19.2-5-g496aaa94-0ubuntu1 (Accepted)
[20:19] nice
[20:37] rharper: if I'm parsing IMDS in azure and the vm has 3 nics, should I think about adding dhcp4-overrides: route-metric: 100 * so the larger the interface number, the higher the metric? or should all non-primary be 200?
[20:44] powersj: I'm going to validate that eoan-proposed systemd=243~rc1-0ubuntu1 fixes Azure multi-ip on primary nic
[20:44] excellent
[20:44] I expect it will.
[20:45] but that'll tie off azure multi-ip primary nic support
[21:39] If I run a `reboot` command as part of a runcmd, and I have subsequent commands after it, do all commands get run?
[21:39] example:
[21:39] ```
[21:39] `runcmd:`
[21:39] ` - echo 'hi' | tee /tmp/a-file`
[21:40] ` - reboot`
[21:40] ` - echo 'hee hee' | tee -a /tmp/a-file`
[21:40] I want both `hi` and `hee hee` to be written to the file `/tmp/a-file` across reboot. Is this possible?
[21:47] djhaskin987: no, the reboot is going to run and the remainder of your commands won't be run again; it defaults to running commands once per instance;
[21:49] rharper: https://code.launchpad.net/~daniel-thewatkins/cloud-init/+git/cloud-init/+merge/370927 <-- ready for re-review
[21:51] rharper thanks for the info, very helpful.
[21:54] djhaskin987: https://cloudinit.readthedocs.io/en/latest/topics/examples.html#reboot-poweroff-when-finished will allow you to take the reboot out, and I would suggest that you write a script with write_files to /var/lib/cloud/scripts/per-boot/XXXX ; that script will be called on every boot; and in your script, you can check for some marker file that your firstboot runcmd would touch;
[21:54] Odd_Bloke: ok
[21:54] rharper sounds good thanks
[21:55] i'll do that
[21:57] rharper and Odd_Bloke, also Azure support for route-metrics on secondary nics is up https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/370970
[21:58] just pushed changes. (I think I have lints to fix)
[22:00] blackboxsw: nice
[22:05] blackboxsw: so that one 'driver: hv_netvsc' ; that's in the example comment for the method right ? It's hard to see from the diff
[22:16] rharper: right yeah, I just extended the docstring on a method
[22:16] to represent expected method/function input
[23:22] What about things?
[23:23] Who knows about things
[23:27] I have some questions
[23:27] ask away
[23:28] in this path, cloud-init/cloudinit/sources
[23:28] I'm looking to add a new source
[23:28] yes!
[23:28] but I am pure dog scientist
[23:29] if I wanted to query a metadata URL, does it have to be via IP or will I be able to rock'n'roll with a domain?
[23:29] you can use a dns name, DataSourceGCE.py for example uses that
[23:30] so it does, thanks, will look there
[23:30] I've never used Launchpad before :|
[23:30] and uhh, I mean, I can barely write python, so this should be fun
[23:30] https://cloudinit.readthedocs.io/en/latest/topics/hacking.html
[23:31] that should have most of the steps you can use to get started on a contribution
[23:31] why spend ten minutes in the docs when I can spend ten hours toiling away?
[23:31] * kbZ looks
[23:31] >It assumes you have a Launchpad account
[23:31] assumptions already, yikes
[23:32] well, at least it told you
[23:32] lol, thanks :D
[23:32] in Launchpad's defense, I usually only end up here angry and at the end of a bug
[23:32] so we kinda got off on the wrong foot
[23:32] fair enough
[23:37] one thing I noticed
[23:37] DataSourceAzure.py has different permissions than the rest
[23:37] -rwxr-xr-x vs -rw-r--r--
[23:38] I want to make jokes
[23:38] Microsoft bully you into this? (the joke)
[23:38] hehe
[23:39] is that an accident? if I'm in here fixing things?
[23:39] the execute bit isn't needed on the Datasource, so sure, send in a fix
[23:39] cool
[23:40] well, apparently I have been an Ubuntu One member for a long time
[23:40] Member since: 2005-11-04
[23:40] awe
[23:40] nice
[23:40] someone sent me real Ubuntu 6 CDs in the mail a long time ago
[23:40] I've a few of those
[23:41] let's see if I can slip in this chmod thing and then I'll try taking a crack at the bigger issue
[23:42] >To contribute, you must sign the Canonical contributor license agreement
[23:42] who gets my first born?
[23:44] up to you =)
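
Note on the 16:27-16:32 dual-stack thread: the "dhcp wins over static" rule proposed there can be sketched in isolation as below. This is only an illustration of the ordering problem under stated assumptions, not cloud-init's actual sysconfig renderer. The function name pick_bootproto, the 'none' default, and the set of dhcp type names are hypothetical; only the elif/if shape comes from the snippet pasted in the log.

```python
def pick_bootproto(subnets):
    """Return a BOOTPROTO value for one interface.

    Hypothetical helper: subnets are processed in order, and a
    'static' subnet may not overwrite an earlier 'dhcp' decision.
    """
    bootproto = 'none'  # assumed default when no subnet sets it
    for subnet in subnets:
        subnet_type = subnet.get('type')
        if subnet_type in ('dhcp', 'dhcp4'):
            bootproto = 'dhcp'
        elif subnet_type == 'static':
            # Same guard as in the log: keep dhcp if already chosen.
            if bootproto != 'dhcp':
                bootproto = 'static'
    return bootproto


# Subnet list shaped like the v1 config from the log (fields trimmed).
dual_stack = [
    {'type': 'dhcp4'},
    {'type': 'static', 'address': 'fd29:c112:2871:0:f816:3eff:fe64:b0d6'},
]
print(pick_bootproto(dual_stack))
```

With this guard the result does not depend on whether the dhcp4 or the static subnet is listed first, which is the property the proposed fix is after.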