=== vrubiolo1 is now known as vrubiolo
[13:48] smoser: I wonder if you have any thoughts on https://github.com/canonical/cloud-init/pull/369? I don't love the idea of pushing more info into the log which will be unused in almost every situation. I am considering suggesting writing out to a separate file (perhaps in /run/?) to avoid this. What do you think?
[13:49] (Asking because I know you have Opinions about our logging. ^_^)
[13:49] mruffell: Of course, thank you for your work!
[13:51] meena: Being able to run CI is a big step up from commit emails IMO, but I agree that a lot of the actual review functionality isn't much different (there's only so many different ways to comment on patches that are ultimately just text files, lol).
[13:54] Hey everyone, does anyone have an idea how to manually test this feature? https://git.launchpad.net/cloud-init/commit/?id=87cd040e
[13:55] My understanding is that we must fake the endpoint to return 403, but I could not find a way to properly test this.
[14:01] lucasmoura: I don't see why you would need 403 for that commit (it looks like it should affect all interactions with the AWS IMDS, to me), is that definitely the one you meant to link to?
[14:02] Odd_Bloke, Nope, my bad here. This is the right commit https://git.launchpad.net/cloud-init/commit/?id=1f860e5a
[14:04] lucasmoura: Right, that one makes more sense! :p Yeah, I think all we can test is the happy path (i.e. we still come up on AWS); we do sometimes ask the clouds to test changes that require messing with their IMDS, so we might be able to do that in this case.
[14:05] blackboxsw: Can you add anything about asking clouds to test changes that require IMDS manipulation?
[14:07] Odd_Bloke: i had a partial comment from a week ago or so saying something to that effect.
[14:08] it's very easy to have a knee-jerk reaction that the log should have shown you exactly the information that you found would have helped out in this very rare scenario
[14:08] i've recently come more and more to agree with https://dave.cheney.net/2015/11/05/lets-talk-about-logging
[14:09] i think the real solution is
[14:09] a.) cloud-init log a lot less by default
[14:09] b.) easily turn on debug logging, which is then a firehose
[14:21] Yeah, agreed, I like that as a long-term goal.
[14:23] smoser: So is your position that we shouldn't include this change at all because it's log spam, or that we should put it in a separate file, (or that we should allow it as-is, because it's a drop in the logging pond as things stand)?
[14:24] i'd say it's log spam
[14:24] (which ... 90% of cloud-init log is, so it's a hard position to hold)
[14:25] hindsight is 20/20
[14:25] but i like the separate-file even less than i like the log spam
[14:26] really we need to be able to enable debug logs easily
[14:26] OK, fair enough; I was at least thinking that that would go away across boots, but it's not great either.
[14:26] So I wonder if the thing to do is to accept this one as-is, and then come up with a Plan For Logging which we can add to HACKING.rst and then expect future submissions to follow?
[14:28] it's hard to tell someone "no that can't go in the log" when there is so much crap in the log.
[14:28] Definitely.
[14:28] so, mostly i agree with you. let this in, and then future policy.
[14:29] but people have wanted to (and have) put stuff in the log as a result of kernel bugs in one release or uninitialized memory usage in a subprogram.
[14:30] stuff that is so very unlikely to ever have another 'hit' on usefulness
[14:33] So I guess, at a minimum, we need to reach an agreement on what log levels should be used (and for what), and determine the mechanism(s?) by which you can opt into the firehose of debug messages.
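The "log a lot less by default, easily turn on debug logging" idea from the 14:09 messages can be pictured with a minimal Python sketch. This is not cloud-init's logging setup: the `EXAMPLE_DEBUG` variable, the `example` logger name, and the `configure_logging` helper are all hypothetical, chosen only to illustrate a quiet default with an opt-in switch.

```python
import logging
import os


def configure_logging(env=None):
    """Return a logger that is quiet by default and verbose on opt-in.

    EXAMPLE_DEBUG is a hypothetical switch, not a cloud-init setting.
    """
    env = os.environ if env is None else env
    debug_requested = env.get("EXAMPLE_DEBUG", "") == "1"

    logger = logging.getLogger("example")
    # Default: only warnings and errors. Opt-in: the full debug firehose.
    logger.setLevel(logging.DEBUG if debug_requested else logging.WARNING)
    return logger


if __name__ == "__main__":
    logging.basicConfig()
    log = configure_logging()
    log.debug("only emitted when EXAMPLE_DEBUG=1")
    log.warning("always emitted")
```

An environment variable is just one possible mechanism; it would not by itself cover the case of opting in without modifying the image.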
[14:34] (I think we want users to be able to opt-in without image modification, for example, else there'll be a lot of bugs that we can't get enough info on.)
[14:35] yeah. :-(
[14:35] which leads you to "log a ton of non-important information by default in the event that sometime it might be useful"
[14:35] :-(
[14:36] i've said this before, but as i do things now, i'm much more strict than i was when cloud-init started.
[14:36] fail loudly and exit failure.
[14:37] because cloud-init's behavior of "well... stuff might have gone wrong, check the log for WARN messages"
[14:37] results in missing *more* bugs
[14:40] Yeah, there's definitely a fine line to balance between "break very obviously" and "do enough that people can at least access the instance to debug".
[14:41] Because if you don't do the latter, then you _need_ to log a bunch of stuff to the serial console to be sure that people will be able to debug the problem well enough.
[15:30] falcojr: thanks for the note at standup. I'll scrub your sru verification results now too
[15:31] cool, thanks
[15:35] Odd_Bloke: lucasmoura agreed we have sometimes asked for assistance from the specific cloud author on certain functionality that we land. In the case of https://git.launchpad.net/cloud-init/commit/?id=1f860e5a the author was fred-lefebvre. i have in the past emailed microsoft folks notifying them that their patch landed and is in testing and that it could be validated by following
[15:35] https://cloudinit.readthedocs.io/en/latest/topics/debugging.html?highlight=proposed#manual-sru-verification-procedure
[15:38] I mentioned microsoft, but the same applies to aws. If we can notify them of the change in -proposed, they can get a chance to test. We could choose to highlight @fred-lefebvre in a comment on the merged PR https://github.com/canonical/cloud-init/pull/216 that it is available for testing (then we are pretty sure the author gets a notification of this)
[15:57] python 3.5 is cool
[15:57] https://imgur.com/a/DalMKsc
[16:21] https://www.youtube.com/watch?v=3epfRPCtGJA
[16:24] it's mad about an unprintable unicode character
[16:24] if I try to print the variable in the middle log line I get
[16:24] UnicodeEncodeError: 'ascii' codec can't encode character '\u03b5' in position 74: ordinal not in range(128)
[16:25] not sure why it's trying to encode ascii when system locale stuff all says utf-8, and if I open a python shell manually I can print it fine
[16:25] only happens on xenial though
[16:26] Is it because it's writing to a file that's opened with ascii encoding?
[18:07] Is there any way to detect os (linux or windows) in cloud-init?
[18:10] punkgeek, hey - cloud-init does not support Windows at this time. Inside cloud-init we do have various methods for determining which Linux or bsd distro we might be running on
[18:10] punkgeek: for windows you'd probably be using cloudbase-init instead of cloud-init as they are separate projects. But cloud-init upstream lets you detect what linux/bsd distribution you are running on via `cloud-init query distro` or in jinja templates in your #cloud-config userdata file
[18:10] ^ nice
[18:11] punkgeek: https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html#using-instance-data shows examples of providing #template: jinja\n#cloud-config
[18:11] in that you can use {{ v1.distro }} or even the short alias {{ distro }}
[18:11] cool
[18:13] yeah, `cloud-init query --all` will give you a list of any keys which could be provided in #template: jinja\n#cloud-config user-data
[18:13] Aha thank you, what are the differences between cloudinit and cloudbase?
[18:16] punkgeek: my understanding is https://github.com/cloudbase/cloudbase-init is windows-only and only supports a subset of the config modules that cloud-init upstream supports https://cloudinit.readthedocs.io/en/latest/topics/modules.html
[18:16] this channel is related to upstream cloud-init, which, as powersj mentioned, does not support windows.
[18:18] from their splashscreen https://cloudbase.it/cloudbase-init/ it looks like they have plans to support bsd at some point, but generally very windows focused.
[18:19] + looks like they support 4 datasources (cloud platforms) instead of upstream cloud-init's 21 https://cloudinit.readthedocs.io/en/latest/topics/datasources.html
[18:20] Thank you so much
[18:20] no worries
[18:31] falcojr: thanks for https://github.com/cloud-init/ubuntu-sru/pull/107#pullrequestreview-428284569 2 minor comments and we can land it
[18:39] I want to deploy a shell script onto a machine that does not have an internet connection, what is the best way? Here is my shell script: https://github.com/autovmnet/tools/blob/master/vm_config.sh
[18:43] Hi everyone, I am trying to manually test this PR https://github.com/canonical/cloud-init/pull/234, but I am still experiencing issues with it, so I think I am missing something
[18:43] I am trying to replicate the exact same case that rharper used in the discussion
[18:44] Although I can see the error he describes when running cloud-init, I see another error when updating to the fixed version
[18:46] It states the following error: Exec format error. Missing #! in script?'
[18:47] Which makes sense, since the example in the PR is not a shell script per se. So is the example in this PR supposed to work? The one rharper uses to reproduce the error with a lxc container
[19:03] Odd_Bloke: you should rebase https://github.com/canonical/cloud-init/pull/391 and… we should merge it, or i should start sending you patches
[19:28] meena: There are some comments on there that I need to read through and address, and unfortunately it's dropped down my list (for now). :(
[19:35] lucasmoura: here, what's not working ?
[19:36] lucasmoura: if you have a newer cloud-init with the fix, when cloud-init processes the mime-type of the payload (which is a cloud-config, not a shell-script) then it will get merged correctly into user-data, *instead* of being treated as a shell-script; which as you see, it isn't and it cannot be executed
[19:36] lucasmoura: for verifying that; can you reproduce the failure with the steps in https://github.com/canonical/cloud-init/pull/234#issuecomment-604033345 ?
[19:38] rharper, yes, I can reproduce the error perfectly. But the problem is that an error still happens when I try to run the same user-data on the newer cloud-init version
[19:38] Just give me a couple of minutes and I will add the script I am using and the error I am receiving
[19:38] I think it will be easier to explain the issue
[19:39] lucasmoura: are you using the lxc-proposed-snapshot to create a new image with the updated cloud-init ?
[19:40] No, but I am manually installing the new version in the lxc container. First I reproduce the error, then I manually add the ppa where the newer cloud-init version is and try to run it again to see if no error is raised
[19:41] and you run cloud-init clean --logs --reboot ?
[19:41] Oh no, I just run cloud-init init
[19:42] you need to clean
[19:42] otherwise it's not a "first boot" any more
[19:42] so the already parsed user-data is written out as a shell-script and runparts will see the old file
[19:42] you can skip the reboot
[19:42] but you do need clean
[19:42] That makes total sense
[19:42] and I suggest; cloud-init clean --logs && cloud-init init --local && cloud-init init
[19:43] https://cloudinit.readthedocs.io/en/latest/topics/faq.html#how-can-i-re-run-datasource-detection-and-cloud-init
[19:43] Let me update the script
[19:43] the safer way is to always use the lxc-proposed-snapshot; and run a completely new container
[19:43] thanks powersj
[19:44] rharper, do you mean reproduce the error first and then launch a container with lxc-proposed-snapshot to verify the fix ?
[19:52] lucasmoura: yeah; https://github.com/cloud-init/ubuntu-sru/pull/100/files
[19:53] in there, I have a recreate() and then a verify()
[19:56] rharper, cool. Thanks for the suggestion. I will start using it :)
[20:03] I am having a cloud-init/netplan networking problem with Ubuntu 20.04. Cloud-init runs with local cloud-init datasource and sets up the networking configuration (static IP, vlan, etc) without any errors. Networking is configured but not active after cloud-init completes. I did notice that cloud-init finds all the Ethernet links down when it runs (Up column in Net device info). I have discovered that doing a 'netplan apply'
[20:03] after cloud-init has run will make the network configuration active. The Ethernet HW in question is an Intel 10Gb ixgbe interface. Is this expected behavior? Is my cloud-init config missing something?
[20:17] haybill: not expected; couple of things 1) if you can, share your net config, ensuring you've a match on the ixgbe nic in your config to ensure that it's brought up 2) there was a netplan bug around bringing things online without an apply; though it was related to wifi; might be related; are you using -daily images or the released image (which will have -updates available) ?
[20:25] rharper, #2 I am using released 20.04 with a fresh apt update && apt upgrade. Should I be trying something newer/etc?
[20:29] haybill: no, I just knew there was a release to -updates for netplan for the wifi issue;
[20:30] rharper, I am having trouble getting you a sample networking config right now, but when I deploy 18.04 with the same tools networking is fine
[20:30] haybill: you have a server install? or using a cloud-image ?
[20:33] followup on https://github.com/cloud-init/ubuntu-sru/pull/105#pullrequestreview-428359896 for you falcojr
[20:33] sorry lucasmoura missed the conversation
[20:33] thx rharper
[20:35] blackboxsw: yw
[20:35] rharper, it is a Dockerfile generated image (like https://github.com/packethost/packet-images). Based on this conversation I am beginning to be worried that I am missing some new 20.04 packages for networking
[20:39] haybill: ok, are you booting a container ? and passing in the nic ? or booting a vm ?
[20:40] either case, I would suggest testing with the Ubuntu cloud images, https://cloud-images.ubuntu.com/daily/server/focal/current/ there are VM images and root.tar.gz for containers ; testing with this can help you track down if it's related to your image build or some other issue (or cloud-init/netplan bug)
[20:44] rharper, thx I will look into the cloud images
[20:45] sure
[20:52] blackboxsw, no worries
[20:53] rharper, powersj thanks for the help, just confirming that the proposed fix solved the issue
[20:54] lucasmoura: nice!
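The re-run sequence suggested at 19:42 (`cloud-init clean --logs && cloud-init init --local && cloud-init init`) can be scripted roughly as below. This is only a sketch: the three commands come from the conversation, while the `rerun_first_boot` helper and its `dry_run` and `runner` parameters are hypothetical names introduced for illustration. It defaults to a dry run so nothing is executed unless asked.

```python
import subprocess

# The three steps quoted in the 19:42 message, in order.
RERUN_STEPS = [
    ["cloud-init", "clean", "--logs"],   # forget the previous "first boot"
    ["cloud-init", "init", "--local"],
    ["cloud-init", "init"],
]


def rerun_first_boot(dry_run=True, runner=subprocess.run):
    """Run (or just list) the steps, stopping at the first failure.

    check=True mirrors the shell's `&&`: a non-zero exit raises
    CalledProcessError and the remaining steps are skipped.
    """
    planned = []
    for cmd in RERUN_STEPS:
        if not dry_run:
            runner(cmd, check=True)
        planned.append(" ".join(cmd))
    return planned


if __name__ == "__main__":
    for line in rerun_first_boot(dry_run=True):
        print(line)
```

As noted at 19:43, launching a completely new container from lxc-proposed-snapshot remains the safer way to verify a fix; a script like this only covers the in-place case.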
[21:00] lucasmoura: ec2 sru pr reviewed https://github.com/cloud-init/ubuntu-sru/pull/102/files#
[21:00] blackboxsw, ack
[21:20] lucasmoura: another for you https://github.com/cloud-init/ubuntu-sru/pull/106/files#
[21:22] https://github.com/cloud-init/ubuntu-sru/pull/107 merged
[21:28] lucasmoura: same minor change request on https://github.com/cloud-init/ubuntu-sru/pull/108/files#
[21:28] and we'll merge both
[21:29] blackboxsw: thanks for the reviews! I followed up to your followup on https://github.com/cloud-init/ubuntu-sru/pull/105
[21:32] np and strange falcojr ... I thought my eyes were seeing exactly the opposite (no system_info via cloud-config). I'll double check again. I probably botched something
[21:32] it's possible I'm doing something crazy too! :D
[21:33] * blackboxsw doesn't mean falcojr is strange... I wouldn't say that out loud.... erm, I mean.
[21:33] * blackboxsw walks away
[21:33] I won't deny being strange ;)
[21:33] haha
[21:44] wow falcojr ok, I swear I didn't see the Resolving logs before on my side,
[21:45] ok. I'm rerunning. and I think I'll take the debt of documenting the system_info via cloud-config example in cloudinit.readthedocs.io . so I actually remember that
[21:45] I can see the pre-upgrade 'failure' case now
[21:45] and verifying the fix per your script
[21:45] thx
[21:51] falcojr: ok +1 on your branch if we can add some valid URL and verify that we see the Resolving http://valid.com in logs and not foo.com
[21:53] https://github.com/cloud-init/ubuntu-sru/pull/110 merged
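Going back to the 16:24 UnicodeEncodeError: the guess at 16:26 (a file opened with ascii encoding) describes a mechanism that is easy to reproduce in isolation. The sketch below does not confirm that this was the cause on xenial; it only shows that writing '\u03b5' to a text stream whose encoding is ascii fails with the same reason string, while a utf-8 stream accepts it. The `try_write` helper is a hypothetical name.

```python
import io


def try_write(text, encoding):
    """Write text to an in-memory text stream with the given encoding.

    Returns the encoded bytes on success, or a short error string if the
    stream's codec cannot represent the text.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return "error: " + exc.reason
    return raw.getvalue()


if __name__ == "__main__":
    print(try_write("\u03b5", "ascii"))
    print(try_write("\u03b5", "utf-8"))
```

The stream's own encoding, not the system locale, decides whether the write succeeds, which would be consistent with an interactive shell printing the character fine while a log handler fails.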