[07:07] <hipolito> Hi, I'm trying the nocloud config drive example from the docs, but it doesn't work at all. Am I missing something?
[07:08] <hipolito> sadly I can't log into the image, it doesn't have any user/pw
[07:17] <meena> hipolito: let's start with a silly question: does your cloud support NoCloud?
[07:52] <hipolito> meena: I'm trying locally on virtualbox, attaching the seed iso
[07:55] <hipolito> I'm following the example from here: https://cloudinit.readthedocs.io/en/latest/topics/datasources/nocloud.html
[07:55] <hipolito> the only thing I can see on the console is "source ...NoCloud failed"
[08:01] <meena> hipolito:   okay, so we can dig into logs then
[08:02] <hipolito> meena: sadly the image doesn't have any user or password, I can't log in without cloud-init injecting ssh keys
[08:03] <meena> oh. ooohhhh
[08:04] <hipolito> is there a kernel parameter on cloud-init so it prints more logs to standard output? I think systemd will print it out to the console
[08:08] <meena> that i don't know, and I'm only on my phone so looking thru the source code is a bit tricky
[08:13] <hipolito> no worries, I'll figure a way to log in somehow
[12:36] <Odd_Bloke> meena: :D
[12:48] <meena> there's been an update from #NetBSD, and it won't work on OpenBSD either, and won't cover all edge cases (on freebsd?) but we could use that simpler version
[12:48] <meena> our OpenBSD support is very spotty right now anyway
[12:57] <meena> We should probably move ifconfig from netinfo(?)
[12:57] <meena> or is it just duplicated in netinfo?
[13:00] <Odd_Bloke> meena: We don't need a single implementation that works for all BSDs; we can add {Free,Open,Net,Dragonfly}BSDNetworking subclasses and have specific implementations in each of those.
[13:01] <meena> basically, ifconfig -C is needed everywhere and can live in BSD base ass
[13:04] <meena> "what instrument do you play?" - "the base ass"
[13:26] <meena> i wonder if there are any circumstances under which `ifconfig -C` output would change over the runtime of a machine… like if we load different kernel drivers
[15:58] <rharper> Odd_Bloke: blackboxsw:  on the nfs mount bug, https://bugs.launchpad.net/cloud-init/+bug/1870370 ;  We never captured a log with an error message.  I've tested bionic and focal images from daily; and fstab always has the correct entry present;  the messages mentioned are present in the cloud-init, but they do not prevent the entry from being added to fstab;  the stack trace related to the call to mount -a is due to the lack of an
[15:58] <rharper> nfs client ;  once one installs nfs-common; the remote mount succeeds;  so AFAICT this was never an actual bug (in practice);
[15:59] <blackboxsw> ahh geez
[16:02] <blackboxsw> so rharper was the bug really that if we detect is_network_device(path) cloud-init should be installing nfs-utils?
[16:04] <rharper> blackboxsw: well, I don't know;  the original submitter thought that the issue was the error message of 'ignoring entry'
[16:04] <rharper> and maybe there still is a bug w.r.t saying we ignore mount entries and *still* put them in fstab
[16:07] <rharper> At this point, AFAICT, there never was an issue with using NFS mounts, other than the error message it displayed ... the reason the mount -a fails is the missing nfs client;  the bug fix applied now does not emit the error message and nfs entries are considered "sanitized";
[16:08] <rharper> for nfs, we could install the client; prior to running mount -a ...  not sure for other remote filesystems ; doing such a thing is a non-trivial feature;  and thus far, I suspect users have been rolling their own image (or doing everything in their own runcmd to handle client installs and updates to fstab)
[16:09] <rharper> w.r.t the ec2 efs;  not clear to me what client is needed in the image for efs;
[16:10] <rharper> w.r.t the SRU;  I can verify that the error message is no longer emitted ...  and there's no regression in cloud-init behavior; so I don't think this disrupts the SRU at all; but the bug/fix is misleading in that it never was the barrier to enabling nfs mounts on first boot
[16:11] <rharper> blackboxsw: let me know how you want to proceed
[16:11] <blackboxsw> agreed, rharper . I think that is a sound approach for now. We'll ping the submitter on the bug and ask for confirmation at their convenience
[16:12] <blackboxsw> we can have them re-open the bug if they hit it again and fill in more details (as well as cloud-init logs)
[16:12] <blackboxsw> I think we can proceed with your verification of no regression in current behavior
[16:12] <rharper> blackboxsw: alright, I'll just verify the error message is not present any more
[16:39] <meena> i dunno folks, everybody should know what kind of environment they putting an image into. if you gonna need nfs, and you don't install nfs-utils into the image before putting it to use, then i don't know what to say
[16:40] <meena> (and since i don't even know what ec2 efs is, I'm not gonna say anything about that)
[16:42] <meena> we could check if the fs is supported, and error out, 🤷‍♀️
[16:45] <haderach> Hello! How to connect a local instance of cloud? I got the shell in the machine and instance id.
[16:48] <meena> hrm… checking if an fs is supported is probably as complex as installing support for it, aaaaand, highly distro specific.
[16:48] <haderach> Is running in Ubuntu 20.04
[17:02] <falcojr> blackboxsw : in doing the Oracle SRU, I'm seeing two tracebacks
[17:02] <falcojr> https://paste.ubuntu.com/p/tms3rBfsW3/
[17:02] <falcojr> it looks like in older SRUs, two tracebacks were found, but we didn't detail them, so it's probably the same thing, but curious if you know what they are
[17:03] <falcojr> this is for bionic
[17:03] <falcojr> also /etc/netplan is empty which wasn't the case for older SRUs
[17:03] <falcojr> but it's empty when I launch the instance, and empty after I installed proposed and reboot
[17:09] <blackboxsw> falcojr: the cloudinit.url_helper.UrlError: 404 Client Error: Not Found for url: http://169.254.169.254/latest/meta-data/ is known on Oracle, older series don't use the proper src/cloud-init/cloudinit/sources/DataSourceOracle.py   This was a cloudimage feature that we need to resolve with CPC team internally at some point. So that is known, the openstack datasource hits urls that Oracle IMDS doesn't actually
[17:09] <blackboxsw> support.
[17:12] <blackboxsw> the other trace is related to network already being up when trying to run EphemeralDHCP, which I think is ok because that means the network is already active because of iscsi root on Oracle during initial  datasource detection time in local timeframe. We probably aren't going to resolve this for Oracle specifically in the OpenStack datasource, because oracle should be using DataSourceOracle which checks for
[17:12] <blackboxsw> iscsi_root first https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/DataSourceOracle.py#L194-L198
[17:12] <blackboxsw> falcojr: does Oracle have a focal series image yet?
[17:13] <falcojr> yes
[17:13] <falcojr> I didn't have issues with the focal image
[17:13] <blackboxsw> did it detect Oracle datasource vs OpenStack
[17:14] <falcojr> ```
[17:14] <falcojr> {
[17:14] <falcojr>  "v1": {
[17:14] <falcojr>   "datasource": "DataSourceOracle",
[17:14] <falcojr>   "errors": []
[17:14] <falcojr>  }
[17:14] <falcojr> }
[17:14] <falcojr> ```
[17:14] <falcojr> lol...my brain can't keep two chat systems separate now
[17:15] <falcojr> any idea about the missing /etc/netplan config?
[17:20] <blackboxsw> falcojr: that's something concerning I think.   this is bionic right? grep renderer /var/log/cloud-init.log
[17:21] <falcojr> returns nothing
[17:23] <blackboxsw> yet all of the other fetches of  http://169.254.169.254/latest/ are working and `ip addr show` list valid active addresses on the network interfaces
[17:26] <falcojr> yes, I can pull down metadata and reach out to the internet...ip a shows 10.0.0.24 on ens3 and loopback
[17:26]  * blackboxsw relooks at the last oracle SRU runs. it may be worth putting up the full in progress PR . generally older Oracle SRUs did render /etc/netplan/50-cloud-init.yaml so something didn't fire on
[17:26] <blackboxsw> the instance.
[17:26] <falcojr> alright, I'll put up the full text in the PR and we can take a look there
[17:26] <falcojr> thanks
[17:27] <blackboxsw> probably want to attach the logs and. Yeah sorry. and as you mentioned this is probably generally a known condition (as each oracle SRU had 2 tracebacks in logs)
[17:27] <blackboxsw> just the netplan file not being present seems amiss
[17:28] <blackboxsw> also may want to grep  'Writing to /etc/net' /var/log/cloud-init.log to see if it rendered /etc/network/interfaces or /etc/netplan etc
[17:28]  * blackboxsw has to step away for kid lunch prep for a few
[17:58] <meena> blackboxsw: o/~ 💜
[18:07] <rharper> blackboxsw: https://github.com/cloud-init/ubuntu-sru/pull/135
[18:29] <meena> Odd_Bloke: i'm trying to contribute is_physical for BSD to your PR, and it would need to pull in get_interfaces_by_mac() (the underlying function that the BSDs use for get_interfaces() / get_devicelist())
[19:01] <Odd_Bloke> meena: Are you able to call `self.get_interfaces_by_mac()`?
[19:08] <meena> Odd_Bloke: hrm, so, the "problem" is that get_interfaces_by_mac() is a lot more output and parsing, and we only need it on OpenBSD, since ifconfig -l is all we want, but OpenBSD doesn't have that.
[19:09] <meena> i think i should probably start with FreeBSD (and NetBSD) and then do openbsd (or let someone who cares about OpenBSD lol have a go)
[19:09] <falcojr> blackboxsw PR here: https://github.com/cloud-init/ubuntu-sru/pull/136
[19:13] <Odd_Bloke> meena: Hmm, I'm not sure I'm following along, I'm afraid.
[19:14] <meena> Odd_Bloke: ifconfig -l output:
[19:14] <meena> meena@fbsd12-1:~ % ifconfig -l
[19:14] <meena> vtnet0 lo0
[19:14] <Odd_Bloke> meena: Bear in mind that we can have a `BSDNetworking.is_physical` and an `OpenBSDNetworking.is_physical` (and a PR which just has the latter raise NotImplementedError would be perfectly acceptable: it's still an improvement).
[19:14] <meena> or, or an actual server: vtnet0 lo0 bridge0 vnet0:1 vnet0:2
[19:14] <meena> aye.
[19:15] <Odd_Bloke> meena: (And, in fact, separate PRs for the two separate implementations would be much easier to review, too. :)
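A minimal sketch of the subclass layout Odd_Bloke describes, not the actual cloud-init code. It assumes that `ifconfig -C` prints the cloneable (virtual) interface drivers as whitespace-separated names, and that an interface counts as physical when its driver is not in that list. The helper `driver_name` and the method names `_run_ifconfig` and `cloners` are hypothetical.

```python
import re
import subprocess


def driver_name(devname):
    """Strip the unit number (and any ':N' suffix): 'vtnet0' -> 'vtnet'."""
    return re.sub(r"\d+(:\d+)?$", "", devname)


class BSDNetworking:
    """Hypothetical base class holding what every BSD can share."""

    def _run_ifconfig(self, flag):
        # Isolated in one method so tests can substitute canned output.
        return subprocess.check_output(["ifconfig", flag], text=True)

    def cloners(self):
        # Assumption: `ifconfig -C` lists cloneable drivers separated by
        # whitespace, e.g. "bridge lo vlan tun".
        return set(self._run_ifconfig("-C").split())

    def is_physical(self, devname):
        # Assumption: a device whose driver cannot be cloned is physical.
        # As noted in the discussion, this will not cover every edge case.
        return driver_name(devname) not in self.cloners()


class OpenBSDNetworking(BSDNetworking):
    """OpenBSD gets its own subclass; the method is explicitly unimplemented."""

    def is_physical(self, devname):
        raise NotImplementedError("is_physical is not implemented for OpenBSD")
```

Keeping the `ifconfig` call behind `_run_ifconfig` means each flavour can be reviewed and tested on its own, which fits the suggestion of separate PRs per implementation.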
[19:17] <blackboxsw> thanks gents for the PRs
[19:17] <blackboxsw> falcojr: ahh oracle has disabled networking :)  2020-06-25 18:58:05,467 - stages.py[DEBUG]: network config disabled by system_cfg
[19:17] <blackboxsw> sooo, yes that would be expected that cloud-init doesn't emit /etc/netplan :)
[19:17] <blackboxsw> falcojr: and I believe that is due to the fact that network config is setup for iscsi root
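The two log checks from this exchange (grepping for 'Writing to /etc/net' and spotting "network config disabled by system_cfg") could be combined into one pass over the log. This is only a sketch: the function name `summarize_network_log` is hypothetical, and it assumes the written path is the whitespace-delimited token right after "Writing to ".

```python
def summarize_network_log(lines):
    """Return (disabled, written_paths) for an iterable of log lines.

    disabled is True if any line reports that network config was disabled
    by system_cfg; written_paths lists every /etc/net* path the log says
    was written (e.g. under /etc/netplan or /etc/network).
    """
    disabled = False
    written_paths = []
    prefix = "Writing to "
    for line in lines:
        if "network config disabled by system_cfg" in line:
            disabled = True
        idx = line.find(prefix + "/etc/net")
        if idx != -1:
            # Assumption: the path ends at the next whitespace character.
            written_paths.append(line[idx + len(prefix):].split()[0])
    return disabled, written_paths
```

To try it on an instance, open /var/log/cloud-init.log and pass the file object as `lines`. A result of `(True, [])` would match what was seen on the bionic Oracle image.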
[19:19] <meena> Odd_Bloke: aye.
[19:20] <blackboxsw> falcojr: to confirm that the machine is iscsi root you can python3 -c 'from cloudinit.net.cmdline import read_initramfs_config; print(read_initramfs_config())'
[19:21] <blackboxsw> that should give you iscsi_root network configuration I believe
[19:22] <falcojr> None
[19:22] <blackboxsw> falcojr: at least that's what the Oracle proper datasource uses to confirm netcfg is up. also worth confirming that focal on Oracle doesn't emit the log "network config disabled by system_cfg"
[19:23] <blackboxsw> but I think that's implied (the focal log check)  because you said focal rendered /etc/netplan/*cloud-init.yaml
[19:24] <falcojr> right
[19:25] <falcojr> so should I just update the procedure with a different comment as to why I'm || true there, remove the logs, and call it a day?
[19:25] <blackboxsw> falcojr: I think so. if we have to sort more we can do it in review.
[19:27] <blackboxsw> and it looks like we'll have a little time on this because solutionsQA verification run still isn't started, it has 4 CI jobs queued in front of it (which may take 8 hrs each).
[19:27] <falcojr> sounds good
[19:27] <blackboxsw> so we are in the camp of waiting  on CI approval for cloud-init SRU until that solutionsQA test run is actually executed.
[19:28] <blackboxsw> despite being queued a couple days ago
[21:47] <blackboxsw> falcojr: paride finally have a cloud-config fix that avoids the lxc console <VM> interaction to fix lxd on launch
[21:47] <blackboxsw> https://paste.ubuntu.com/p/54WcQWrn4H/
[21:47] <blackboxsw> I might even be able to simplify more
[21:48] <blackboxsw> by adding that vendor data to the vm profile
[21:48] <blackboxsw> rharper: too ^
[21:48] <blackboxsw> sorry will wrap up your remaining sru PRs today
[21:55] <blackboxsw> yep profile https://paste.ubuntu.com/p/pxtbd4fjph/
[21:55] <falcojr> great!
[21:55] <blackboxsw> ok lucasmoura I'm going to rework the ua-client PR for vm support
[21:58] <lucasmoura> blackboxsw, ack. I have reviewed it this afternoon, but I just had a couple of minor comments
[22:00] <rharper> blackboxsw: lemme look
[22:00]  * rharper has been rather annoyed at lxd --vm 
[22:00] <rharper> also, super not happy about the lxd agent not working in ubuntu-daily:$release ; and then the images:ubuntu/$release/cloud  which is not an official cloud image, but something else;  also has no ssh server installed.
[22:01]  * rharper finish mini rant 
[22:02] <rharper> blackboxsw: I see, your comment in the second paste is most helpful;  ISTR there was some issue with the reboot needed due to difficulty wrangling systemd units starting soon enough
[22:03] <blackboxsw> rharper: I think it's just that the systemd units we add don't start properly without the reboot
[22:04] <rharper> yes
[22:04] <blackboxsw> the install.sh run comments about avoiding the reboot. but it failed when I tried
[22:04] <blackboxsw> I'm following this https://discuss.linuxcontainers.org/t/running-virtual-machines-with-lxd-4-0/7519
[22:04]  * blackboxsw has to head on an errand for a few
[22:04] <rharper> nice
[23:41] <rharper> blackboxsw: , the install says you can skip the reboot; To start it now, unmount this filesystem and run: systemctl start lxd-agent-9p lxd-agent