[00:20] Hi. I need to know, I installed cloud-init v19.1 on RHEL 8. I can see the IP at /etc/sysconfig/network-scripts/ifcfg-env2 but when I do ip a, the IP is not getting assigned.
[00:20] What can be the possible flow I can check?
[00:22] amansi26, https://cloudinit.readthedocs.io/en/latest/topics/faq.html#where-are-the-logs
[00:22] I would review the logs in the above link
[00:29] powersj: I checked the log, I can see the IP under "applying net config names for" and the datasource used is configdrive
[09:37] Hi all, is it possible to apply network config from datasource nocloud-net (for a vm with pre-configured temp network/dhcp) in either stage of cloud-init? asking after searching the docs and multiple tests with no success.
[09:45] shubas: can you define no success? by, say, posting your cloud-config, and your log output?
[09:49] meta-data and user-data are applied but network-config isn't, using the same files in an iso all configs are applied.
[09:52] using default centos config with : datasource_list: [ NoCloud ]
[09:52] datasource:
[09:52]   NoCloud:
[09:52]     seedfrom: http://192.168.5.5/
[09:53] okay, then let's dig into the logs, and find out why network-config isn't applied.
[10:20] here is the log: https://send.firefox.com/download/fe67a07c819f84ac/#xI5VGQGxaK6pYqTiWJ947w
[11:07] shubas: it says: 2020-01-16 09:57:49,299 - DataSourceNoCloud.py[DEBUG]: Seed from http://192.168.5.5/ not supported by DataSourceNoCloud [seed=None][dsmode=net]
[11:12] oh, wait, that was init-local
[11:14] 2020-01-16 09:57:58,999 - stages.py[DEBUG]: applying net config names for {'version': 1, 'config': [{'subnets': [{'type': 'dhcp'}], 'type': 'physical', 'name': 'eth0', 'mac_address': 'aa:ce:dc:b2:5c:06'}]}
[11:17] so, shubas this reads fairly okay / successful to me. What about the config that you get as a result is different from what you're expecting?
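Editor's note: the seed layout discussed above (meta-data, user-data and network-config served from the seedfrom URL) can be checked from the guest with a short script. This is a minimal sketch, assuming the seed base URL from the log and the three file names mentioned in the conversation; `seed_urls` and `probe` are hypothetical helpers, not part of cloud-init, and the result says nothing about which of these files cloud-init itself requests.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# File names as mentioned in the conversation above.
SEED_FILES = ("meta-data", "user-data", "network-config")


def seed_urls(seed_base):
    """Map each seed file name to its full URL under seed_base."""
    # urljoin drops the last path segment unless the base ends with "/".
    base = seed_base if seed_base.endswith("/") else seed_base + "/"
    return {name: urljoin(base, name) for name in SEED_FILES}


def probe(seed_base, timeout=5):
    """Fetch each seed file; return the HTTP status or the error text."""
    results = {}
    for name, url in seed_urls(seed_base).items():
        try:
            with urlopen(url, timeout=timeout) as resp:
                results[name] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[name] = str(exc)
    return results
```

Calling `probe("http://192.168.5.5/")` on the guest would show whether all three files are reachable, which is the same check as the curl suggested later in the log.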
[11:22] this is the fallback config, it does not fetch the network-config file from seed
[11:22] shubas: but it says it's doing that…
[11:23] shubas: what's curl http://192.168.5.5/meta-data look like?
[11:23] (or whatever the correct url is)
[11:23] also checked the access log for http://192.168.5.5/, only fetching meta-data and user-data files
[11:24] the curl is ok, i get the content
[11:30] is there a way to manually fetch and apply network config from seed through cloud-init?
[11:31] might provide more insight on what's happening
[11:37] shubas: what's the config look like?
[11:38] cloud.cfg?
[11:44] shubas: no, i meant the thing that you get from curl
[11:49] meta-data -
[11:49] instance-id: someid123
[11:49] user-data -
[11:49] #cloud-config
[11:49] hostname: somehostname
[11:49] network-config -
[11:49] network:
[11:49]   version: 1
[11:49]   config:
[11:49]     - type: physical
[11:49]       name: eth0
[11:49]       subnets:
[11:49]         - type: static
[11:49]           address: someIP
[11:49]           netmask: someMASK
[11:49]           gateway: someGW
[11:49]           dns_nameservers:
[11:49]             - 8.8.8.8
[11:49]             - 8.8.4.4
[11:55] not sure what, but something there isn't quite right
[11:56] would be cool if it said what, rather than just applying dhcp
[12:06] at this point my thought is that the network config method is determined in the initial stage before the network seed is checked and so ignored in later stages
[12:47] talking about configuration, meena shubas can you confirm network_state is populated by conf file?
[12:48] I'm seeing too much code lately, need a visual confirmation :-)
[13:28] @otubo can you clarify what you mean by network state and where it should be populated?
[13:31] shubas: sorry! I was able to figure it out by myself.
The answer is yes :)
[13:32] shubas: network_state is the variable filled with network configuration both read from cloud.cfg and from already existing network configuration present on the guest
[13:45] also, removing /var/lib/cloud/instance would be enough to emulate the first boot? Or should I need to do something else?
[14:02] "cloud-init clean" will do that
[14:05] cloud-init clean --logs --reboot \o/
[14:11] thanks people!
[15:30] otubo: yeah, reading the bug now
[15:36] rharper: _render_networkmanager_conf() is writing 99-cloud-init.conf *after* NM starts, and that's causing the problem
[15:37] otubo: so, the design for the network config by cloud-init is that at local time, we crawl imds for network config, and write out this config to the os dirs, *before* the networking service starts; so in your case, we've written sysconfig *and* resolv.conf values before NM starts
[15:37] otubo: so why does network-manager start before cloud-init-local.service ?
[15:38] Before=NetworkManager.service
[15:38] Before=network-pre.target
[15:38] we run local before NM and network-pre.target; this is when we write out all of our network config, so the 99 file *is* written before NM starts
[15:38] or something is very wrong with units
[15:40] So what you're saying is that on first boot, resolv.conf comes preconfigured for some reason, cloud-init writes dns=none and NM starts. And even though this happens NM wipes out the file on shutdown
[15:40] otubo: and " it does not change the resolv.conf file and this file is clean (reverted to clean state by NetworkManager during the first shutdown)."
seems to be where our conflict lies; you said that NM does this because it started before the 99-cloud-init.conf which tells NM *not* to do that
[15:41] otubo: I don't know the contents of your resolv.conf before boot; typically it is a symbolic link to the systemd-resolved local caching resolver; but in older images it may be an empty file
[15:42] I know on SUSE we've had to adjust whether cloud-init writes anything at all (we used to always write resolv.conf even if we didn't have dns values, which was fine for Ubuntu which had resolvconf managing the file)
[15:43] rharper: ok, good thoughts. I need to confirm the exact boot sequence and if NM is processing correctly the options on the config file.
[15:43] the correct sequence should be: 1. cloud-init local runs first, before NM, reads net config, writes out all sysconfig/NM config files 2. NM starts, and it should see the 99-cloud-init.conf which says "dns = none" which should prevent it cleaning it up on shutdown 3. cloud-init net runs after NM has started but before all things are online ...
[15:44] is the target OS systemd based? if so, I've found journalctl -o short-monotonic -b -u cloud-init-local.service -u NetworkManager.service -u cloud-init.service to be useful to see when they started
[15:45] otubo: your log seems to indicate that there may be a race here; but if you have a complete boot log from both scenarios, we can see if there's a different bug here;
[15:46] https://bugs.launchpad.net/cloud-init/+bug/1843334
[15:46] Ubuntu bug 1843334 in cloud-init "Change location of DHCP leases in CloudStack provider as it doesn't work for RHEL8" [Medium,Fix released]
[15:46] otubo: that landed very recently, maybe that's not present in your cloud-init yet ?
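Editor's note: step 2 of the sequence above depends on NetworkManager seeing "dns = none" in 99-cloud-init.conf. A minimal sketch of checking that setting, assuming the file uses an INI layout with the key under a `[main]` section (the section name and the sample text are assumptions; the log only quotes the "dns = none" line). `nm_dns_mode` is a hypothetical helper, not part of cloud-init or NetworkManager.

```python
import configparser


def nm_dns_mode(conf_text):
    """Return the dns value from the [main] section, or None if absent."""
    # strict=False tolerates repeated keys or sections in hand-edited files.
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(conf_text)
    return parser.get("main", "dns", fallback=None)


# Hypothetical file content, modelled on the "dns = none" line quoted above.
SAMPLE_CONF = """\
[main]
dns = none
"""
```

Reading the real file and passing its text to `nm_dns_mode` would confirm whether the setting was on disk at the time of the check; it cannot show whether NM had already started when the file was written.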
[15:47] which could explain the race
[15:53] rharper: it looks like the boot sequence is correct, cloud-init indeed starts before NM https://pastebin.com/y65yFajd
[15:54] otubo: yes, but I wonder if there's a race between when cloud-init-local finishes and NM starts ...
[15:54] rharper: but I think NM ignores dns=none configuration, along the logs I can see one single entry of NM updating DNS
[15:54] or it's not yet written
[15:55] the ordering ensures that a unit is started before or after, but not necessarily complete
[15:55] oh I see
[15:56] rharper: rhel tree doesn't contain this fix for the bug 1843334 you pointed to. I'll cherry pick and test.
[15:56] bug 1843334 in cloud-init "Change location of DHCP leases in CloudStack provider as it doesn't work for RHEL8" [Medium,Fix released] https://launchpad.net/bugs/1843334
[15:57] that may be enough to ensure there's time for local to write its config
[16:00] otubo: so I think the ordering is strong enough; however, in the case where cloud-init local is doing a dhcp for metadata and the dhcp/network response is slow; I'm wondering if we'd find that NM would start before cloud-init writes that file; if so; we *might* want to issue a 'systemctl try-restart NetworkManager.service' which would restart NM iff it was already started.
[16:01] rharper: Well, but i believe this is a good idea even if that fix solves my problem.
[16:02] rharper: I could write that feature in the near future
[16:02] yeah
[16:02] look at cloudinit/net/netplan.py which uses the _postcmds config; it allows a Distro class to specify commands to run after rendering a network config
[16:03] the freebsd renderer also makes use of this as another example
[16:08] rharper: yeah, didn't work.
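Editor's note: the try-restart idea proposed above can be sketched as a standalone post-render step. This is a minimal sketch, assuming it runs after the network config files have been written; it does not use cloud-init's _postcmds mechanism, and `try_restart_cmd` and `try_restart` are hypothetical names. The `runner` parameter exists only so the call can be exercised without systemd.

```python
import subprocess


def try_restart_cmd(unit="NetworkManager.service"):
    """Build the command; try-restart only restarts a unit that is running."""
    return ["systemctl", "try-restart", unit]


def try_restart(unit="NetworkManager.service", runner=subprocess.run):
    """Run the command and return the runner's result without raising."""
    # check=False: a non-zero exit is reported to the caller, not raised.
    return runner(try_restart_cmd(unit), check=False)
```

Because try-restart does nothing when the unit is not running, the normal case (NM not yet started when cloud-init-local finishes) is left untouched, which matches the intent stated in the log.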
[16:09] so
[16:10] local also does this: ExecStart=/bin/touch /run/cloud-init/network-config-ready
[16:11] hrm, we want logic like: if cloud-init ran, NM should wait until this file is touched before starting; but in the case that cloud-init doesn't run (it's not activated) then NM should run whenever it normally would;
[16:12] rharper: what about we reload configuration when cloud-init.final finishes? Like `pkill -HUP NetworkManager'
[16:13] hrm, well, we could PreExec the reload in cloud-init.service as a drop in
[16:14] until code changes to use the try-restart land and release
[16:25] rharper: ok, I'm gonna try to include something with `systemctl try-restart NetworkManager.service' on _postcmds tomorrow morning. Thanks for the help :-)
[16:28] sure
[16:32] https://paste.ubuntu.com/p/tGyGWJBq5t/
[16:32] otubo: that is a systemd unit drop-in config which would trigger the try-restart right before cloud-init.service
[17:52] Odd_Bloke: just pushed https://github.com/cloud-init/ubuntu-sru/pull/79 for SRU start
[17:52] -> errand
=== blackboxsw_ is now known as blackboxsw
[19:17] rharper: util.subp will raise a ValueError if target is anything but None. The docstring says this is for compatibility with curtin's subp. Do you think that's a goal we still want to retain?
[19:18] (I'm asking because nothing actually calls get_dpkg_architecture with a target that isn't None, so I was going to remove that parameter, then dug deeper and found this.)
[19:38] Odd_Bloke: I just added another note about what clouds SRU verification already covers https://github.com/canonical/cloud-init/pull/167
[19:41] SRU -proposed bits are up and accessible in xenial bionic and eoan-proposed. I'm writing up the notification email now
[20:34] blackboxsw: https://github.com/canonical/cloud-init/pull/167 reviewed, we're getting there!
[20:54] pushed changes Odd_Bloke thanks
[20:54] and accepted all suggestions
[21:17] blackboxsw: LGTM now.
I have another couple of minor nits (a missing : and the directive names), which I'll apply then merge.
[21:19] blackboxsw: Oh, no I'm not, I don't have permissions.
[21:23] ?
[21:24] blackboxsw: I can't commit to your branch, so you'll need to apply this last round of nits. :)
[21:24] Odd_Bloke: I just clicked allow edits
[21:24] Odd_Bloke: does it work now I forgot to click that
[21:26] I think in general Odd_Bloke if we are working through a bunch of nits it might be easier for you to review and me to grok the set of changes if you push over my branch, then I can diff to origin and walk through the full visual diff. walking through a bunch of accept suggestion links makes me feel I'm going to miss something.
[21:27] OK, we can bear that in mind next time.
[21:28] That's not how I prefer to receive changes, which is why I didn't try to do that. :)
[21:28] (And also, I want you to be happy with the changes being made before I make them, unless they really are trivial!)
[21:29] makes sense. I was just feeling a bit guilty for the number of review rounds you had to do Odd_Bloke
[21:30] Odd_Bloke, so do I need to accept your last round of review comments or are you able to push to my branch?
[21:30] Oh, I don't mind that at all. I've been feeling guilty about how many review rounds you had to do. :p
[21:31] blackboxsw: It's telling me I still don't have permissions, so why don't you accept them.
[21:31] We can figure that out next time.
[21:31] good we all have feelings of guilt to work out this year it seems :)
[21:31] will do
[21:33] rharper: blackboxsw: FYI, I've asked waveform (the Pi guy on the Foundations team) to file a bug for a mount issue they're seeing with cloud-init (I assume in Ubuntu Core), as they were talking about how to work around it.
[21:36] blackboxsw: You have a long line in there still, doc8 fails.
[21:37] doc/rtd/topics/debugging.rst:171: D001 Line too long
[21:39] d'oh
[21:39] The dangers of using the GH suggestions.
:)
[21:39] Odd_Bloke: oh, where was the discussion re: mount issues ?
[21:39] bug is best though
[21:39] rharper: In the Foundations channel that I never /part'd. :p
[21:40] it's more likely run with systemd but cloud-init plays a role in writing those fstab entries out
[21:40] gotcha
[21:40] Odd_Bloke: force pushed the sru doc branch
[21:40] should be good2go
[21:41] community notice: I also just pushed a origin/stable-19.4 branch which was our last version of cloud-init to claim support for py2.7
=== blackboxsw changed the topic of #cloud-init to: cloud-init pull-requests https://git.io/JeVed | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting January 21 17:15 UTC | 19.4 (Dec 17) | 20.1 DROP py2.7 (Feb 18: origin/stable-19.4) | https://bugs.launchpad.net/cloud-init/+filebug
=== blackboxsw changed the topic of #cloud-init to: cloud-init pull-requests https://git.io/JeVed | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting January 21 17:15 UTC | 19.4 (Dec 17) | 20.1 DROP py2.7 (Feb 18) | https://bugs.launchpad.net/cloud-init/+filebug
[21:42] rharper: It's something to do with the cloud-init seed being on a drive that's already being mounted for some other reason; when we go to mount it, it's already mounted so *sad trombone*
[21:43] But a bug is incoming.
[21:43] I suspect this is core20 fun
[21:43] Yeah, I don't see why a Pi would have cloud-init (in a weird configuration) on it if it weren't.
[21:43] the writable partition includes the seed so it can be modified outside of the read-only image;
[21:44] oh, non-core pi image has cloud-init to auto generate keys, import them and do that outside of the base image;
=== blackboxsw changed the topic of #cloud-init to: cloud-init pull-requests https://git.io/JeVed | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting January 21 17:15 UTC | 19.4 (Dec 17) drops Py2.7 : origin/stable-19.4 | 20.1 (Feb 18) | https://bugs.launchpad.net/cloud-init/+filebug
[21:44] Aha, OK, well we'll find out!
[21:44] core just makes things more complicated w.r.t not having a writable filesystem
[21:44] it was a nice improvement to allow cloud-init to read the boot directory of the pi's to look for cloud-config; so they can do a dd of the image and then write a cloud.cfg in the boot dir
[21:45] yes, we shall see =)
[21:46] blackboxsw: Did you overwrite the last set of changes you applied from the GH UI?
[21:46] Looks like the directives have reverted.
[21:46] geez man, you're probably correct.
[21:46] I'll re-apply
[21:50] Odd_Bloke: tried to reapply, erased the code-block suggestions plus and opening cloud-init:
[21:52] blackboxsw: Approved. \o/
[21:54] thanks!
[22:01] Hi, whenever I run tox -e py27, I get an error about test-requirements.txt
[22:01] Requirement already satisfied: setuptools in ./.tox/py27/lib/python2.7/site-packages (from -r /source/test-requirements.txt (line 13)) (45.0.0)
[22:01] ERROR: Package 'setuptools' requires a different Python: 2.7.12 not in '>=3.5'
[22:01] ERROR: could not install deps [-r/source/test-requirements.txt]; v = InvocationError('/source/.tox/py27/bin/pip install -r/source/test-requirements.txt (see /source/.tox/py27/log/py27-1.log)', 1)
[22:02] I have not modified test-requirements.txt at all, and my tox tests were running fine until a few days ago
[22:02] Any possible reasons why this error is being thrown?
[22:19] johnsonshi: I believe setuptools have dropped support for Python 2 in the latest release, which is what's causing that error.
[22:20] johnsonshi: We've followed suit, however, and 19.4 was the last cloud-init release that supported Python 2.
[22:20] So you probably don't need to run `tox -e py27` any longer, and instead should be running `tox -e py3`.
[22:24] Odd_Bloke: Thanks for clearing this up!
[22:41] johnsonshi: Happy to help!
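Editor's note: the pip error above ("2.7.12 not in '>=3.5'") is a comparison of the running interpreter's version against the package's declared minimum. A minimal sketch of that comparison, assuming only plain dotted numeric versions and a single `>=` specifier; this is a simplification, pip itself applies the full version specifier rules, and `parse_version` and `satisfies_minimum` are hypothetical helpers.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '2.7.12' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies_minimum(python_version, specifier):
    """Check a version against a single '>=X.Y' style specifier."""
    if not specifier.startswith(">="):
        raise ValueError("only '>=' specifiers are handled in this sketch")
    # Tuples compare element by element, so (2, 7, 12) sorts before (3, 5).
    return parse_version(python_version) >= parse_version(specifier[2:])
```

With the values from the log, `satisfies_minimum("2.7.12", ">=3.5")` is false, which is why the py27 tox environment could no longer install its dependencies.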