otubo | powersj, hey! First time attending cloud-init summit this year. Just wanted to understand the format of the event. Should I (can I?) prepare some slides to talk about cloud-init and Red Hat? | 12:47 |
---|---|---|
rharper | robjo: here now | 15:02 |
robjo | rharper: in a meeting will ping you after, thanks | 15:03 |
rharper | sure | 15:03 |
tribaal | Odd_Bloke: I think I addressed all of your review points, if you have some time to have another look :) | 15:05 |
Odd_Bloke | tribaal: Sure thing! Today's looking pretty busy (meetings and catch-up after a public holiday yesterday), so it may be tomorrow before I get to it, FYI. | 15:10 |
amansi26 | I am trying to install cloud-init v19.1 on Ubuntu 16.04. The install finishes with status: done, but when I check the logs for cloud-final.service, I see http://paste.openstack.org/show/755571/. I tried live-capturing the VM and deploying another VM from the image; it gets assigned the earlier VM's IP. Can someone please suggest how to resolve this? | 15:11 |
Odd_Bloke | amansi26: cloud-init comes installed in the Ubuntu cloud images by default; when you say "trying to install", can you clarify what you mean? | 15:12 |
Odd_Bloke | (I'm heading in to a block of meetings in a couple of minutes, FYI, so I won't be as responsive here.) | 15:13 |
amansi26 | I added some custom module and made a debian package. Then I install that package on a Ubuntu machine | 15:14 |
amansi26 | The same version works fine for RHEL. | 15:15 |
tribaal | Odd_Bloke: no worries, thanks for your time! | 15:20 |
Odd_Bloke | amansi26: I would suggest you file a cloud-init bug (the link is in the topic) describing the issue in detail, and ensuring to attach the tarball generated by `cloud-init collect-logs` to it. :) | 15:50 |
blackboxsw | rharper: only blocker to shifting generate_fallback_config from speaking network v1 to network v2 seems to be that when generating driver/device_id udev rules by NetworkState, the driver parameters are lost in a couple of cases. So, the rules get emitted like this: DRIVERS=="?*" instead of DRIVERS=="hv_netvsc". I'm gonna track that down now | 15:51 |
amansi26 | Odd_Bloke:Sure | 15:51 |
blackboxsw | amansi26: good to hear of people packaging custom plugin modules. I wasn't sure how often that 'feature' of cloud-init was used | 15:52 |
rharper | blackboxsw: yes, that sounds right; I think we need a way to hang that in the NetworkState so that a v2 -> eni does the right thing; for v2 -> v2; that should be converted to match: {'device_driver': 'hv_netvsc'} | 15:53 |
blackboxsw | rharper: +1 though I thought the match section had 'driver' and 'device_id' keys. I'll double check.. maybe that's my misunderstanding due to cloudinit/net/__init__.py :extract_physdevs._version_2 | 15:55 |
* blackboxsw | checks netplan.io examples | 15:55 |
cyphermox | only driver, mac, or name | 15:56 |
rharper | driver works | 15:56 |
rharper | that's what we want | 15:56 |
rharper | the azure code wants to explicitly ignore the mlx4_core devices; and the dhcp should apply only to the devices with driver=hv_netvsc | 15:56 |
blackboxsw | https://netplan.io/reference#common-properties-for-physical-device-types | 15:56 |
blackboxsw | roger | 15:56 |
rharper | so, we should be able to match with both driver and mac | 15:56 |
blackboxsw | +1 | 15:57 |
blackboxsw | rharper: the reference for netplan doesn't say anything about device_id property in match. (only driver). Should v2 emit a device_id 0x3(or whatever?) or is there another reference that might document that | 15:58 |
rharper | driver is fine | 15:59 |
rharper | does the device_id even show up in the udev rule ? | 15:59 |
blackboxsw | don't think so SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:11:22:33:44:55", NAME="eth1" | 16:00 |
rharper | right | 16:00 |
blackboxsw | ok omitting | 16:00 |
rharper | right, azure only calls into fallback with the driver name to blacklist | 16:00 |
blackboxsw | agreed | 16:01 |
rharper | I think the fallback just stuffed the device_id in there because it was part of the device_driver() return | 16:01 |
blackboxsw | yeah and if we don't need/use it, then no need to call device_devid() anymore either | 16:08 |
blackboxsw | cyphermox: thanks for the confirmation on !device_id | 16:09 |
cyphermox | I'm not against adding new matching ways, but it largely depends on what networkd can realistically do, and that's a bit of a pain | 16:09 |
blackboxsw | +1, I don't think we have a use case that currently requires device_id. If we do we'll file a bug/feature request | 16:10 |
cyphermox | ack | 16:10 |
robjo | rharper: still have to get back to the other network issue, but something new popped up yesterday | 16:25 |
rharper | refresh my mind on other issues | 16:25 |
robjo | openstack setup in "dual stack mode, i.e. ipv4 & 6" | 16:26 |
rharper | this is the new one | 16:26 |
robjo | the metadata server produces: | 16:26 |
robjo | curl http://169.254.169.254/openstack/2018-08-27/network_data.json | 16:26 |
robjo | {"services": [], "networks": [{"network_id": "4aae8709-b4e6-4cf7-84f7-b7cbddfe3ecb", "link": "tapc48a243a-e6", "type": "ipv4_dhcp", "id": "network0"}, {"network_id": "4aae8709-b4e6-4cf7-84f7-b7cbddfe3ecb", "type": "ipv6_slaac", "services": [], "netmask": "ffff:ffff:ffff:ffff::", "link": "tapc48a243a-e6", "routes": [{"netmask": "::", "network": "::", "gateway": "fd29:c112:2871::1"}], "ip_address": "fd29:c112:2871:0:f816:3eff:fe64:b0d6 | 16:26 |
robjo | ", "id": "network1"}], "links": [{"ethernet_mac_address": "fa:16:3e:64:b0:d6", "mtu": 1450, "type": "ovs", "id": "tapc48a243a-e6", "vif_id": "c48a243a-e623-4ef6-a363-98564e59fade"}]} | 16:26 |
robjo | the openstack helper then produces: | 16:26 |
robjo | 2019-08-05 14:14:10,631 - stages.py[DEBUG]: applying net config names for {'version': 1, 'config': [{'mtu': 1450, 'type': 'physical', 'subnets': [{'type': 'dhcp4'}, {'type': 'static', 'netmask': 'ffff:ffff:ffff:ffff::', 'routes': [{'netmask': '::', 'network': '::', 'gateway': 'fd29:c112:2871::1'}], 'address': 'fd29:c112:2871:0:f816:3eff:fe64:b0d6'}], 'mac_address': 'fa:16:3e:64:b0:d6', 'name': 'eth0'}]} | 16:27 |
robjo | so we have a "static" and a "dynamic" subnet on the same interface | 16:27 |
robjo | these get processed in order and so the renderer clobbers "bootproto=dhcp" with "bootproto=static" | 16:28 |
robjo | that of course breaks ipv4 access to the system | 16:28 |
robjo | one fix is to declare dhcp the winner in the renderer, i.e.: | 16:29 |
rharper | is it ? | 16:29 |
robjo | yes because ifcfg-eth0 will have "bootproto=static" which means the dhcp client will not start to request an IP address and thus the instance only has the static IPv6 address | 16:30 |
rharper | but dhcp will allow static ip assignment in addition to dhcp ? | 16:30 |
robjo | yes, but the interface is configured as "static" therefore no dhcp request will be issued | 16:31 |
rharper | ok, is that consistent on rhel/suse ? | 16:31 |
robjo | the fix is to pick dhcp as the winner, i.e. in the renderer it should be: | 16:32 |
robjo | elif subnet_type == 'static': | 16:32 |
robjo | if iface_cfg['BOOTPROTO'] != 'dhcp': | 16:32 |
robjo | iface_cfg['BOOTPROTO'] = 'static' | 16:32 |
robjo | it's on openSUSE/SLES, but I would be surprised if RHEL behaves differently in this case | 16:33 |
rharper | https://paste.ubuntu.com/p/drPGpmW8KT/ | 16:34 |
rharper | I can't reproduce with cloud-init master | 16:34 |
rharper | both suse/centos have bootproto=dhcp | 16:34 |
rharper | robjo: the dual_stack.yaml is the v1 config you pasted from the log | 16:38 |
robjo | OK, looking at the code, the issue was reported with 18.5 | 16:38 |
robjo | master produces the expected results, I agree | 16:39 |
robjo | going fishing through the code .....be back in a bit | 16:39 |
rharper | robjo: in your 18.5 branch, I think you can just repeat the net-convert command like I did and see if it renders differently | 16:41 |
robjo | OK, that's weird the 18.5 renderer does not set bootproto to static either, more digging required, thanks for the help | 16:47 |
rharper | robjo: sure | 16:50 |
blackboxsw | rharper: just force pushed generate_fallback_config -> talking network v2 to https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/370970 for azure v2 support | 17:15 |
rharper | cool | 17:16 |
AnhVoMSFT | Hi folks, I'm seeing this error from one customer's deployment | 18:00 |
AnhVoMSFT | 2019-07-31 17:09:16,279 - util.py[DEBUG]: Failed mount of '/dev/sr0' as 'auto': Unexpected error while running command.Command: ['mount', '-o', 'ro', '-t', 'auto', '/dev/sr0', '/run/cloud-init/tmp/tmphmsab0y2']Exit code: 32Reason: -Stdout: Stderr: mount: /dev/sr0 is already mounted or /run/cloud-init/tmp/tmphmsab0y2 busy | 18:01 |
AnhVoMSFT | looks like the command took about 2s and then failed | 18:01 |
AnhVoMSFT | there was a dump of /proc/mounts earlier but /dev/sr0 wasn't on there, so definitely it wasn't "already mounted" scenario, so likely /run/cloud-init/tmp/... was busy - that was created with "with temp_utils.tempdir() as tmpd:" in mount_cb , how can it be busy ? | 18:02 |
Odd_Bloke | AnhVoMSFT: Are you able to file a bug with `collect-logs` attached? | 18:06 |
AnhVoMSFT | what logs do collect-logs get? We have cloud-init-output log and cloud-init log, kernel log. The instance was deleted though | 18:07 |
smoser | if it said 'already mounted', then i'd think the most likely scenario is that it was already mounted. | 18:09 |
smoser | collect-logs will also get /run/cloud-init which is useful. | 18:09 |
Odd_Bloke | If that's what we have, that'll probably do. I'd recommend doing a `collect-logs` in future if possible, though, because it gathers a bunch of useful stuff. | 18:09 |
smoser | lack of /dev/sr0 in /proc/mounts that is dumped could be a race. cloud-init tried mount, failed, and then unmounted. | 18:09 |
AnhVoMSFT | no, the dump of /proc/mounts is part of util.py mounting code, it checks before trying to mount. While there is potential race issue it's unlikely because other than cloud-init nothing else is mounting /dev/sr0 during init-local phase | 18:11 |
AnhVoMSFT | (which is quite early in the boot process) | 18:11 |
ahosmanMSFT | 3.14 | 18:26 |
blackboxsw | rharper: Odd_Bloke powersj, just published tip of cloud-init to eoan. should be seeing tip in cloud images tomorrow | 20:01 |
Odd_Bloke | \o/ | 20:04 |
blackboxsw | cloud-init 19.2-5-g496aaa94-0ubuntu1 (Accepted) | 20:17 |
rharper | nice | 20:19 |
blackboxsw | rharper: if I'm parsing IMDS in azure and the vm has 3 nics, should I think about adding dhcp4-overrides: route-metric: 100 * <intf_num> so the larger the interface number, the higher the metric? or should all non-primary be 200? | 20:37 |
blackboxsw | powersj: I'm going to validate that eoan-proposed systemd=243~rc1-0ubuntu1 fixes Azure multi-ip on primary nic | 20:44 |
powersj | excellent | 20:44 |
blackboxsw | I expect it will. | 20:44 |
blackboxsw | but that'll tie off azure multi-ip primary nic support | 20:45 |
djhaskin987 | If i run a `reboot` command as part of a runcmd, then I have subsequent commands after it, do all commands get run? | 21:39 |
djhaskin987 | example: | 21:39 |
djhaskin987 | ``` | 21:39 |
djhaskin987 | `runcmd:` | 21:39 |
djhaskin987 | ` - echo 'hi' | tee /tmp/a-file` | 21:39 |
djhaskin987 | ` - reboot` | 21:40 |
djhaskin987 | ` - echo 'hee hee' | tee -a /tmp/a-file` | 21:40 |
djhaskin987 | I want both `hi` and `hee hee` to be written to the file `/tmp/a-file` across reboot. Is this possible? | 21:40 |
rharper | djhaskin987: no, the reboot is going to run and the remainder of your commands won't be run again; it defaults to running command once per instance; | 21:47 |
Odd_Bloke | rharper: https://code.launchpad.net/~daniel-thewatkins/cloud-init/+git/cloud-init/+merge/370927 <-- ready for re-review | 21:49 |
djhaskin987 | rharper thanks for the info, very helpful. | 21:51 |
rharper | djhaskin987: https://cloudinit.readthedocs.io/en/latest/topics/examples.html#reboot-poweroff-when-finished will allow you to take the reboot out, and I would suggest that you write a script with write_files to /var/lib/cloud/scripts/per-boot/XXXX ; that script will be called on every boot; and in your script, you can check for some marker file that your firstboot runcmd would touch; | 21:54 |
rharper | Odd_Bloke: ok | 21:54 |
djhaskin987 | rharper sounds good thanks | 21:54 |
djhaskin987 | i'll do that | 21:55 |
blackboxsw | rharper and Odd_Bloke, also Azure support for route-metrics on secondary nics is up https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/370970 | 21:57 |
blackboxsw | just pushed changes. (I think I have lints to fix) | 21:58 |
rharper | blackboxsw: nice | 22:00 |
rharper | blackboxsw: so that one 'driver: hv_netvsc' ; that's in the example comment for the method right ? It's hard to see from the diff | 22:05 |
blackboxsw | rharper: right yeah, I just extended the docstring on a method | 22:16 |
blackboxsw | to represent expected method/function input | 22:16 |
kbZ | What about things? | 23:22 |
kbZ | Who knows about things | 23:23 |
kbZ | I have some questions | 23:27 |
rharper | ask away | 23:27 |
kbZ | in this path, cloud-init/cloudinit/sources | 23:28 |
kbZ | I'm looking to add a new source | 23:28 |
rharper | yes! | 23:28 |
kbZ | but I am pure dog scientist | 23:28 |
kbZ | if I wanted to query a metadata URL, does it have to be via IP or will I be able to rock'n'roll with a domain? | 23:29 |
rharper | you can use a dns name, DataSourceGCE.py for example uses that | 23:29 |
kbZ | so it does, thanks, will look there | 23:30 |
kbZ | I've never used Launchpad before :| | 23:30 |
kbZ | and uhh, I mean, I can barely write python, so this should be fun | 23:30 |
rharper | https://cloudinit.readthedocs.io/en/latest/topics/hacking.html | 23:30 |
rharper | that should have most of the steps you can use to get started on a contribution | 23:31 |
kbZ | why spend ten minutes in the docs when I can spend ten hours toiling away? | 23:31 |
* kbZ | looks | 23:31 |
kbZ | >It assumes you have a Launchpad account | 23:31 |
kbZ | assumptions already, yikes | 23:31 |
rharper | well, at least it told you | 23:32 |
kbZ | lol, thanks :D | 23:32 |
kbZ | in Launchpad's defense, I usually only end up here angry and at the end of a bug | 23:32 |
kbZ | so we kinda got off on the wrong foot | 23:32 |
rharper | fair enough | 23:32 |
kbZ | one thing I noticed | 23:37 |
kbZ | DataSourceAzure.py has different permissions than the rest | 23:37 |
kbZ | -rwxr-xr-x vs -rw-r--r-- | 23:37 |
kbZ | I want to make jokes | 23:38 |
kbZ | Microsoft bully you into this? (the joke) | 23:38 |
rharper | hehe | 23:38 |
kbZ | is that an accident? if I'm in here fixing things? | 23:39 |
rharper | the execute bit isn't needed on the Datasource, so sure, send in a fix | 23:39 |
kbZ | cool | 23:39 |
kbZ | well, apparently I have been an Ubuntu One member for a long time | 23:40 |
kbZ | Member since: 2005-11-04 | 23:40 |
kbZ | awe | 23:40 |
rharper | nice | 23:40 |
kbZ | someone sent me real Ubuntu 6 CDs in the mail a long time ago | 23:40 |
rharper | I've a few of those | 23:40 |
kbZ | let's see if I can slip in this chmod thing and then I'll try taking a crack at the bigger issue | 23:41 |
kbZ | >To contribute, you must sign the Canonical contributor license agreement | 23:42 |
kbZ | who gets my first born? | 23:42 |
rharper | up to you =) | 23:44 |
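
Note on the 16:32 renderer discussion: robjo's three-line snippet lost its indentation in the log. The idea (a static subnet must not override a DHCP subnet already seen on the same interface) can be sketched roughly as below. This is only an illustration, not the actual cloud-init sysconfig renderer: the function name `resolve_bootproto`, the `'none'` default, and the set of DHCP subnet type names are assumptions made for this example.

```python
def resolve_bootproto(subnet_types):
    """Pick a BOOTPROTO value for one interface from its subnet types, in order.

    Hypothetical helper for illustration; not part of cloud-init.
    """
    bootproto = 'none'  # assumed default when no subnet sets a value
    for subnet_type in subnet_types:
        if subnet_type in ('dhcp', 'dhcp4'):
            bootproto = 'dhcp'
        elif subnet_type == 'static':
            # The guard robjo proposed: keep 'dhcp' if an earlier subnet set it.
            if bootproto != 'dhcp':
                bootproto = 'static'
    return bootproto


# Subnet order from the dual-stack config pasted at 16:27.
print(resolve_bootproto(['dhcp4', 'static']))
```

With the guard in place, the dual-stack ordering from the pasted config keeps `dhcp`, while an interface with only a static subnet still ends up as `static`. As rharper and robjo found later in the log, master already rendered `bootproto=dhcp` for this input, so the sketch only shows the intent of the proposal.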
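
Note on rharper's 21:54 suggestion: a per-boot script that checks for a marker file left by the first-boot runcmd might look something like the sketch below. Everything specific here is an assumption for illustration: the function name `run_post_reboot_step`, the three path arguments, and the extra "done" file that stops the step from repeating on later boots. rharper only described the marker check, so the "done" file is an addition of this sketch.

```python
#!/usr/bin/env python3
import os
import tempfile


def run_post_reboot_step(marker_path, done_path, output_path):
    """Append the post-reboot line once, and only after first boot left its marker.

    Hypothetical helper; all three paths are chosen by the caller.
    """
    if not os.path.exists(marker_path):
        return 'waiting'        # first-boot runcmd has not touched the marker yet
    if os.path.exists(done_path):
        return 'already-done'   # an earlier boot already ran this step
    with open(output_path, 'a') as out:
        out.write('hee hee\n')
    open(done_path, 'w').close()  # record completion so later boots skip it
    return 'ran'


if __name__ == '__main__':
    # Demo in a throwaway directory; a real script would use fixed paths.
    demo_dir = tempfile.mkdtemp()
    marker = os.path.join(demo_dir, 'firstboot-marker')
    done = os.path.join(demo_dir, 'post-reboot-done')
    output = os.path.join(demo_dir, 'a-file')
    print(run_post_reboot_step(marker, done, output))
```

In the setup rharper describes, a script like this would be written by write_files into /var/lib/cloud/scripts/per-boot/ and made executable, and the first-boot runcmd would touch the marker before the reboot. The marker and output files would need to live somewhere that survives a reboot.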