/srv/irclogs.ubuntu.com/2021/03/05/#cloud-init.txt

=== seednode7 is now known as seednode
=== sh1bumi is now known as shibumi
prologicHi all 👋 I'm having a lot of difficulty trying to run a simple script per instance (i.e: cloud-init-per once) via cloud-init's runcmd module. It looks like (but I can't prove it) that cloud-init silently kills my script for running too long?02:46
prologicIs this at all the case? I can find no documentation on this.02:46
prologicMy script does: wait for an ip to come up (from dhcp) excluding a few interfaces we don't care about, once that ip is known, reconfigure some pre-installed software (from the packer image) and restart some services.02:46
prologicAlternative question; Am I abusing cloud-init here?02:47
=== MAbeeTT_ is now known as MAbeeTT
stevenmOdd_Bloke, so I read ... what you wrote11:52
stevenmThat it really - they were words11:54
stevenmThey did at least enter my brain11:54
stevenmI really hate the word cloud11:55
meenaprologic: how long is that script running for when you run it without cloud-init? and, why can't you do that configuration via cloud-init's Network… things… netplan lets you do fairly complex configuration scenarios, and that's basically what the v2 network config format is12:13
prologicOh I see so I am abusing cloud-init's runcmd(s)12:14
prologicCan you point me to where I can read more about this network / netplan stuff?12:15
prologicAs for how long, well as long as it takes dhcp to assign an ip to the interface I'm interested in12:15
prologicso not long12:15
prologicmultiple seconds I guess, I can't get it to work in cloud-init via runcmd so 🤷‍♂️12:16
unix_prologic, https://paste.unix-comp-airnet.net/paste/5eHuuzBg#2jnV-G+fjpy5l+uT8P5IhV9vp9mxvzyr8vtdcTzD+Xc12:45
unix_i do the same on a smartos system12:46
unix_host: smartos12:46
unix_vm: centos12:46
unix_"user-script" section12:46
prologicI don't understand what you're showing me12:48
prologicthis just looks like you're configuring the network12:48
prologicI'm trying to configure a piece of (questionable) software _after_ the network is up12:48
prologicor more precisely after a particular interface has an address I know is routable12:48
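A minimal sketch of the wait step prologic describes, assuming a Linux guest with iproute2 whose `ip -j` JSON output carries `ifname` and `addr_info` entries with `family`, `scope`, and `local` keys. The function names, the skip list, and the timeout values are hypothetical and not part of cloud-init.

```python
import json
import subprocess
import time


def first_global_ipv4(links, skip=("lo",)):
    """Return (ifname, address) for the first global-scope IPv4 address, or None."""
    for link in links:
        if link.get("ifname") in skip:
            continue  # interface we don't care about
        for info in link.get("addr_info", []):
            if info.get("family") == "inet" and info.get("scope") == "global":
                return link["ifname"], info["local"]
    return None


def wait_for_address(skip=("lo",), timeout=60.0, interval=1.0):
    """Poll `ip -j -4 addr show` until a usable address appears or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["ip", "-j", "-4", "addr", "show"],
            capture_output=True, text=True, check=True,
        )
        found = first_global_ipv4(json.loads(result.stdout), skip)
        if found:
            return found
        time.sleep(interval)
    raise TimeoutError("no global IPv4 address appeared within %.0fs" % timeout)
```

A script run from runcmd could call `wait_for_address(skip=("lo", "docker0"))` and only then reconfigure and restart the dependent services; the explicit timeout makes a hang show up as an error in the logs.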
austinKGood day, is anyone able to help answer some questions?13:52
Odd_Blokefalcojr: https://github.com/canonical/cloud-init/pull/829 is now un-WIP'd and ready for full review.14:58
Odd_Blokestevenm: Clouds, clouds everywhere, nor any drop to drink.14:59
Odd_Blokeprologic: I'd be surprised if `runcmd` was being killed due to timing, but something strange may be going on.  Could you pastebin cloud-init.log from an affected instance?15:00
stevenmOdd_Bloke, :)15:15
stevenmWe (and when I say we.. I mean who I work for) don't really want to have an ongoing 'connection' with the software our customers are choosing to run inside the VM's we host for them - at all really :)15:16
stevenmSo I was just hoping to use cloud-init ready images and cloud-init (via Proxmox VE) to just pre-seed certain things (e.g. to get the network working, that's mostly it) on first time boot *only*15:17
Odd_Blokestevenm: Do you (and when I say you.. ;) have an image capture story for your users?  By which I mean: can they launch a VM, then capture its filesystem and launch new VMs from that captured image?15:26
stevenmI don't think our users care about having that functionality15:28
Odd_Blokestevenm: Right, but if it's available and they were to use it and you've completely disabled cloud-init, then their new VMs will behave very unexpectedly (and insecurely: SSH host keys will not be rotated, for example).  I don't know enough about Proxmox to know if such capacity is available by default.15:47
Odd_Blokestevenm: But, also, cloud-init does only perform most of its actions on first boot, so I'm not exactly sure what issue you're seeing that we're trying to address here. :)15:48
stevenmI was hoping that this would just be a channel (in the form of a virtual CD-ROM drive) to communicate certain first-time setup information only15:53
stevenmAnd that is it.15:53
stevenmNo reliance on cloud-init support from the hypervisor afterwards15:53
stevenmI'd rather give customers a blank VM and some space for them to upload their own ISOs15:54
stevenmWe want that little involvement in what they run inside the VMs15:54
Odd_BlokeI would expect Proxmox to handle that for you: cloud-init will use DMI data to determine if the instance ID has changed.15:54
stevenmstop calling them "instances" :P15:54
stevenmThe customer can run BeOS in them for all I care :P  They're VMs - plain and simple.15:55
Odd_BlokeYou can have BeOS cloud instances, so I'm not sure what your point there is. :p15:56
Odd_BlokeBut, sure, cloud-init will use DMI data to determine if the VM ID has changed. :)15:57
stevenmHere is my Windows 95 Cloud Instance...15:58
stevenmhttps://i.snipboard.io/H8mKTq.jpg15:58
stevenmWho knew they were ahead of their time.15:58
Odd_BlokeI'd call that a cloud image, not an instance. ;)15:58
stevenmNah I want cloud-init OUT OF IT :) Well... after that initial first time setup anyway.15:58
stevenmSo maybe this isn't for us.15:59
Odd_BlokeMaybe: will users be able to upload and use their own images?15:59
stevenmCertainly ISO's... not sure about anything else.15:59
Odd_Bloke(Most VM images are built with cloud-init included already.)15:59
stevenmPersonally I don't mind if they upload disk images or indeed anything else like templates (cloud-init ready or not)16:01
stevenmand apparently the customer-facing front end we were going to buy in... supports it too16:03
stevenm(cloud-init)16:03
blackboxswstevenm: meena Odd_Bloke, I'm probably missing the point here (and risk of cloud-init not fixing a network across a machine/image that has changed across reboots), but cloud-init running once could be done providing the "#cloud-config\nruncmd: [touch /etc/cloud/cloud-init.disabled]" with initial userdata or config in /etc/cloud/cloud.cfg.d/ which would get you a system that'll do its cloud-init thing 1 time.16:14
blackboxswthat system though would be static and never reprovision again as long as  /etc/cloud/cloud-init.disabled exists, so the image would likely be fragile if moved from one network to another16:15
blackboxswagain, just a drive by comment without full context16:15
Odd_BlokeAnhVoMSFT: We're chatting about testing for the upcoming SRU and we were wondering if https://github.com/canonical/cloud-init/pull/709 is something that we can reproduce as regular users, or if that's only an issue in internal deployments?17:12
xscoriOdd_Bloke well, it used to work fine until 18.5. Our current RHEL ssh keys are where we expect them (i.e. what we specified in /etc/ssh/sshd_config for "authorizedkeysfile"). It just seems to be broken afterwards. Here is my bug report on it: https://bugs.launchpad.net/cloud-init/+bug/191781718:45
ubot5Ubuntu bug 1917817 in cloud-init "sshd_config authorizedkeysfile setting is not honored after v18.5" [Undecided,New]18:45
xscoriI actually tried runcmd: [bin/cp, /home/%u/.ssh/authorized_keys, /etc/ssh/auth_keys/%u] thinking I could copy the file once it is inserted, but that did not seem to work either, I could not tell why from logs.18:46
meenaxscori: why would runcmd know how to resolve %u?19:54
xscoriyou mean it does not?20:02
xscorimeena I was looking at the code base and saw https://github.com/canonical/cloud-init/blob/master/cloudinit/ssh_util.py#L237 I assumed runcmd would understand %u and resolve it as well.20:05
meenaI'm fairly certain it does not20:06
meenawhat would be the context of %u?20:06
Odd_Blokexscori: %u is templating that sshd uses when reading its configuration, so we mirror that in our SSH handling.  By using runcmd, you'd be circumventing our SSH handling entirely (because it doesn't do what you want) and so you'd need to handle it yourself.20:06
xscoriwell, that would explain the reason it did not work :)20:06
Odd_Bloke(That said, I'm looking into why you might have seen this regress.)20:07
xscorioh, thank you very much!20:07
meenawait, this used to work???20:08
xscoriyes20:09
xscoriwe have production systems running with 18.5 and it inserts the ssh keys correctly into the file specified in sshd-config > authorizedkeysfile20:10
meenawhich users? how??20:11
xscoriI mean, we launch an ec2, say create a new key-pair and attach it to instance, then we ssh into the instance, and when we check the ssh keys, we see them not under /home/ec2-user but under /etc/ssh/auth_keys/ec2-user20:12
Odd_BlokeWe've definitely had some changes in this area since 18.5.20:12
xscoridefault user for rhel is, clouduser, I think but we switch that to be 'ec2-user'20:12
Odd_Blokehttps://github.com/canonical/cloud-init/commit/f1094b1a539044c0193165a41501480de0f8df14 was between 18.5 and 19.4, so is the most likely culprit.20:13
xscoriyeah, I looked, a lot of changes actually...and unfortunately could not figure out what broke20:13
xscoriyes, I saw that20:13
xscoriI did a diff on that commit...and got lost :)20:14
Odd_Blokehttps://github.com/canonical/cloud-init/commit/b0e73814db4027dba0b7dc0282e295b7f653325c landed in 20.4 and was intended to handle a bug in that previous one (perhaps this bug?) but was implemented in a way that opened up a vulnerability, so was reverted in 20.4.1.20:14
xscoriI am not sure, I cloned the repo and looked at the ssh related changes since 18.5 in git history and change logs, I just could not understand the logic of the code to follow up20:15
Odd_BlokeThe one thought I have, though, is that IIRC otubo brought these upstream from the Red Hat packaging, so it's possible that the Red Hat packages you're using have these changes even if they weren't upstream for that version.20:15
xscoriidk... I thought about filing a case with rhel, but thought they might simply point me back to cloud-init devs, so started there instead20:17
Odd_BlokeYeah, not trying to palm you back off on them (yet ;), just thinking aloud.20:17
xscoriwe have enterprise support with them, so I can definitely open a case with them if it is something they did20:17
xscorisure, I appreciate your time20:18
Odd_BlokeSo I think the problem is probably https://github.com/canonical/cloud-init/commit/f1094b1a539044c0193165a41501480de0f8df14#diff-8978d79f04e525de3011b92f7b141a7bd6dae4b6d0a70f9b9ea923bbd1451a43L23920:18
Odd_BlokeThat went from returning `auth_key_fn` to returning `default_authorizedkeys_file`.20:19
xscoriyes, and I manually tried to simulate the situation20:19
Odd_BlokeWhich is consistent with what you've described, I think.20:19
xscorifor example 'extract_authorized_keys' func returned ['/home/ec2-user/.ssh/auh_keys', {}] when I provided the second param to it '/etc/ssh/sshd_config', which is a CONSTANT anyway20:20
xscoriI did not expect that20:20
Odd_BlokeAnd, indeed, the (since-reverted) fix moved that to returning `auth_key_fns[0]`.20:20
xscoribut I also did not know what the return values should be20:20
xscorithat second {}, I though, should be /etc/ssh/auth_keys/ec2-user to match the experience we have with 18.520:21
xscorithought*20:21
Odd_BlokeIt should return ("/path/to/store/the/keys/in", ["list", "of", "the", "keys", "to write there", "which will look more like", "ssh-rsa AAAAAAAAA....etc"])20:22
xscoriagain, I did not understand the whole logic, so... was not sure what I was seeing is unexpected20:22
xscorioh20:22
xscoriin that case, it is definitely returning the wrong thing20:22
xscoriit is reading sshd_config, so not sure why it does not use it once it sees the authorizedkeysfile specified20:23
Odd_BlokeYeah, that's the bug, from the line I linked to.20:23
Odd_BlokeIt unconditionally returns `default_authorizedkeys_file` which is `default_authorizedkeys_file = os.path.join(ssh_dir, 'authorized_keys')` (and ssh_dir is `os.path.join(pw_ent.pw_dir, '.ssh')`).20:24
Odd_BlokeAnd so it disregards the setting, hence the regression.20:25
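The regression discussed above can be illustrated with a small sketch. This is not cloud-init's code: `render_keys_path` and `default_keys_path` are hypothetical helpers, and only the `%u`, `%h`, and `%%` tokens of sshd's AuthorizedKeysFile setting are handled.

```python
import os


def default_keys_path(home):
    """The per-user fallback location: <home>/.ssh/authorized_keys."""
    return os.path.join(home, ".ssh", "authorized_keys")


def render_keys_path(pattern, username, home):
    """Expand an AuthorizedKeysFile pattern the way sshd would for one user."""
    # Protect literal "%%" first so it is not mistaken for a token.
    path = (
        pattern.replace("%%", "\0")
        .replace("%h", home)
        .replace("%u", username)
        .replace("\0", "%")
    )
    # sshd treats a non-absolute result as relative to the user's home.
    if not os.path.isabs(path):
        path = os.path.join(home, path)
    return path
```

With `AuthorizedKeysFile /etc/ssh/auth_keys/%u`, honoring the setting gives `/etc/ssh/auth_keys/ec2-user`, whereas returning the default unconditionally gives `/home/ec2-user/.ssh/authorized_keys`, which matches the behavior xscori reported.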
xscoriI see20:26
xscorioops did not notice channel kicked me out20:29
Odd_BlokeNo worries, you didn't miss anything.20:30
xscori:)  ok20:30
xscoriso, this is not a rhel issue and needs to be fixed in the source....hmm, even if it is fixed, then it will need to make its way to rhel, to an rpm etc.:(  oh boy....20:31
xscoriOdd_Bloke does any workaround pop into your mind?20:32
xscoriI guess copying might be one, or convincing our cyber to temporarily allow ssh keys in /home/{user}/.ssh20:33
xscorinone too pretty20:33
Odd_Blokexscori: Unfortunately, you're correct.  Your best bet is likely to iterate over users (in a runcmd) and move the keys, as you suggest.20:35
Odd_BlokeApologies that we don't have a better answer. :/20:35
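A rough sketch of the workaround suggested above: walk the user list and copy each `authorized_keys` file to the directory sshd is configured to read. The helper names, the destination layout (`<dest_dir>/<user>`), and the 0644 mode are assumptions for illustration; the ownership and permissions sshd accepts depend on the local configuration.

```python
import os
import pwd
import shutil


def system_users():
    """Yield (name, home) for every account in the password database."""
    for entry in pwd.getpwall():
        yield entry.pw_name, entry.pw_dir


def copy_authorized_keys(users, dest_dir):
    """Copy each user's ~/.ssh/authorized_keys to dest_dir/<user>; return the names copied."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name, home in users:
        src = os.path.join(home, ".ssh", "authorized_keys")
        if not os.path.isfile(src):
            continue  # this account has no keys to copy
        dest = os.path.join(dest_dir, name)
        shutil.copyfile(src, dest)
        os.chmod(dest, 0o644)  # assumed mode; adjust to local policy
        copied.append(name)
    return copied
```

Run as root from a runcmd entry, this would be called as `copy_authorized_keys(system_users(), "/etc/ssh/auth_keys")`. It copies rather than moves, so the original files would still need to be removed if policy forbids keys under the home directory.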
xscoriok, I guess I need to learn about runcmd and figure out how to do that20:35
xscoriit is what it is. cloud-init is awesome, and bugs are a fact of life, I mean code :)20:36
Odd_Bloke:)20:36
xscoriok, thinking about this a bit... I read somewhere in documentation, I guess, that ssh is run in boot stage before sshd_config is read and sshd service is run20:40
xscoriis that b/c keys have to be in place before the service starts or is it ok to insert the keys at any point? b/c runcmd is running at a later stage, right?20:40
xscoriagain thinking aloud, if that's true, copying the keys alone won't do it, I will have to recycle the service as well?20:41
Odd_Blokexscori: IIRC, you need to have _host_ keys in place before SSH starts.  I think it's just an implementation detail that we write authorized keys at the same time: it's certainly the case that sshd will pick up new authorized keys without a restart.21:07
Odd_BlokeI think you would have a window between SSH coming up and keys being installed that isn't present in the default configuration.21:08
Odd_Blokerharper: I'm planning on landing https://github.com/canonical/cloud-init/pull/829 Monday morning so we can kick off an SRU process; LMK if I haven't addressed your concerns sufficiently. :)21:10
xscorigood to know, thank you! @o21:41
xscorigawd! I meant,  Odd_Bloke21:42
rharperOdd_Bloke: +1 on landing22:04
Odd_Blokerharper: Thanks!22:57
Odd_BlokeHave a good weekend!22:57
rharperOdd_Bloke: Thanks , you as well22:58
beantaxiDo folks here tend to stick to US business hours?23:19
rharperbeantaxi: I think there may be some Euro timezone but mostly US23:22
rharperif you leave your client up, it's worth just asking any time, we'll usually reply when we can23:22
beantaxiThanks! I think in the past I've thought weekends were a good time to eg get a PR done and dusted, but it looks like business hours actually works better23:28
beantaxiSo does that imply folks work on cloud-init fulltime? Or just that they tend to do it while at work23:29
rharperthere's a cloud-init team at Canonical, and many cloud partners also have folks who work cloud-init upstream;  and community folks maintaining distros or OS specific sections23:31
beantaxiThat makes sense23:32
beantaxiHow frowned upon is it to say "hey can you look at my PR?" Speaking purely from antsiness rather than urgency23:44
beantaxiThe Real Falcon was good enough to give it quite a bit of attention today, and I'm afraid I've got my very last change in just after he's taken off. There's no harm at all in it waiting till Monday if that's how it is.23:45
beantaxiBut life's too short to never be annoying.23:45
falcojrbeantaxi: yeah, I probably won't get to it again until Monday23:48
falcojrYou can feel free to ask for a review anytime here, but sometimes it takes us a while to get around to it just because of other competing priorities23:49
beantaxifalcojr: That makes sense. It's literally my first contribution to anything this size, and my excitement for it is truly laughable.23:55
rharperbeantaxi: IMO, it's always OK to ping for reviews on your PR (either on the PR itself or here in irc)23:55
