/srv/irclogs.ubuntu.com/2020/02/27/#cloud-init.txt

otuborharper: I was going over irc logs and found out we agreed to do the NM dropin as a downstream fix, then we would have some time to work on the _postcmd fix on the sysconfig renderer. I'm gonna close my PR, then.09:37
otuborharper: I actually already have a draft using _postcmd, I'll be pushing my branch soon.09:37
DanyChi folks, i came across https://bugs.launchpad.net/cloud-init/+bug/1020695 and while i see it was never implemented, does anyone have another suggestion on how i can achieve the same but in an early cloud-init stage - ie: not final stage (runcmd/ scripts-user) ?10:04
ubot5Ubuntu bug 1020695 in cloud-init "Add variable for local IP address to /etc/hosts manager" [Low,Triaged]10:04
meenaotubo: cool.10:26
meenaDanyC: there's some ideas here in smoser's (unmerged) branch: https://code.launchpad.net/~smoser/cloud-init/lp1020695/+merge/163216 10:29
meenaDanyC: i think i'd revive that bug / patch10:31
meenaDanyC: but, can you explain why you need it super early?10:31
DanyCmeena: sure thing i can. First let me give you the full picture (you might have seen i've asked various q recently) ;) . All i'm trying is to snapshot an EC2 and create an AMI from it with an application configured so i can later create multiple dev env from it. Now i have an application which doesn't cope very well with changing the IP and at the same time it does also look in /etc/hosts file (in addition to its own "cache").10:35
meenaooff. I've seen those…10:36
DanyCand because of all that my final cloud-init stage doesn't kick/ run due to cloud-final.service not being up which is being held by my app service10:36
meena(my opinion is, like always, a bit different. because i have a config management fetish, in particular puppet)10:37
meenaI'd keep /etc/hosts as created by cloud-init in the AMI, and ensure that the application is *not* auto-started (in the AMI)10:38
DanyCso i can't use runcmd, i can't use scripts-user. I tried bootcmd but it doesn't seem to work if i'm trying to "update" the IP in the /etc/hosts file with the info from ec2metadata10:38
meenathen have cfg-mgmt run, fix up /etc/hosts, and whatever else the app needs. and only then enable and start the app.10:38
DanyChence my assumption that maybe ec2metadata is not available and so i was looking for s'thing else.10:39
DanyCmeena: but with a cfg-mgmt that will need to be run in user-data no? and if the final stage doesn't kick in then i fail to see how it will help me10:39
meenauuuh… there's a switch somewhere to disable ec2metadata after the first boot, actually…10:40
DanyCi don't need to be switched, i need to work in the bootcmd stage10:40
DanyC*to be switched off10:41
meenai know, it just occurred to me, that you're trying to access it, while it's maybe switched off.10:42
meenaanyway… let's collect all the things we know so far.10:42
meenafirst up: what do you think of my idea of only enabling & starting the app once everything is in place?10:42
DanyCmeena: i wish i could do that, sadly i have 10 other services depending on the one which is holding cloud-init10:50
DanyCthe only option i see is to be able to update the /etc/hosts file with the current IP of the new EC2 and bounce the service so i can then let cloud-init other stages kicking in10:51
DanyCbut to do that it seems is harder than i initially thought, is a catch 22. Not to mention i can't change the silly app :facepalm10:52
meenaDanyC: then all 10 services are disabled and stopped until everything is fixed up.10:52
meenaWhat's the point of having them running, if they don't run correctly?10:52
meenaheck, you could even make it a systemd service that all of them depend on!10:53
meenaafter networking, you run, update_etc_hosts.service, it fixes up /etc/hosts, and *then* all services can be started.10:55
meenathink outside the box :P and inside another box!10:55
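[Editor's sketch] The step meena describes could be a small script run by a hypothetical update_etc_hosts.service, ordered after networking and before the dependent services. The function name, the hostname, and how the current IP is obtained are all assumptions for illustration; nothing here comes from cloud-init itself.

```python
def rewrite_hosts(hosts_text, hostname, ip):
    """Return hosts_text with exactly one entry mapping hostname to ip."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking at the names on the line.
        fields = line.split("#", 1)[0].split()
        # fields[0] is the address, the rest are names. An old entry naming
        # this host is dropped whole (other aliases on it are lost in this sketch).
        if len(fields) > 1 and hostname in fields[1:]:
            continue
        kept.append(line)
    # Append the fresh mapping for the instance's current address.
    kept.append(f"{ip} {hostname}")
    return "\n".join(kept) + "\n"
```

A real script would read /etc/hosts, pass the text through this function with the instance's current IP, and write the result back before the application services are started.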
DanyCinteresting, and i guess you're saying i should have in the cloud_init_modules stage a step to run before the update_etc_hosts module ? or have i misunderstood you ?10:57
meenai don't really know what the best way is to bring that service onto the AMI.11:03
meenahow are you bringing the other services onto the AMI?11:03
rharperotubo: ah, ok;  I do still wonder why the ordering is not enough;  my reading of systemd documentation suggests that NM should not be started before cloud-init-local.service has run to completion;   but I don't have the journal of when you saw the failure.15:32
Goneriblackboxsw, I think I've addressed all your comments here: https://github.com/canonical/cloud-init/pull/62 15:46
Goneriblackboxsw, up to date prebuilt images are available here: https://bsd-cloud-image.org/ 15:46
blackboxswGoneri: Thanks for the ping.  Out of curiosity, how often are the prebuilt images updating cloud-init?18:27
blackboxswas in, I wonder if that's a good point of reference which you control that could be referenced if people find bugs/issues on bsd.18:29
blackboxswonce bsd changes are all integrated upstream in cloud-init I guess we could discuss that further18:29
Goneriblackboxsw, my goal is to autobuild them as frequently as necessary (e.g: after every merge, or even every PR), but it's still a work in progress18:36
Gonerifor now, I still trigger the build manually.18:37
blackboxswrharper: we can land this branch now right? https://github.com/canonical/cloud-init/pull/54 19:00
* blackboxsw is scrubbing PRs since I'm in the mood :)19:00
rharperblackboxsw: yes19:00
rharperit lands to ubuntu/xenial19:00
blackboxswok will get that landed today19:01
blackboxswsorry Goneri, one more pull request landed on master I saw your force push :/19:35
blackboxswwill wrap up review on that next19:35
Goneriawesome :-)19:36
meenablackboxsw: you can probably close some of mine19:38
blackboxswyeah, it's about time PRs don't age as well as fine wine.19:39
meenabtw, can someone who's better at python than me, explain why i did this: https://github.com/canonical/cloud-init/blob/master/cloudinit/util.py#L1824 and how can i do this (in a thread-safe way? without spawning / forking?) so that i just say: find me a libc.19:48
sarnoldwhat problem are you really trying to solve?19:49
meenasarnold: make this code work on NetBSD, and on the next version of FreeBSD if it changes that .719:56
sarnoldmeena: aha; I think I'd try a loop over several potential libc pathnames, and populate that list with the paths to current and future libc libraries19:59
blackboxswGoneri: quick volley of comments {% if variant in ["freebsd", "netbsd"] %}20:04
blackboxswoops20:04
blackboxswGoneri: on https://github.com/canonical/cloud-init/pull/62 I mean20:04
blackboxswsomething concerns me a bit with shifting cloudinitlocal to after networking in startup scripts, as it diverges from upstream behavior and that could impact our future work20:05
blackboxswas we'd have to take into account that netbsd is different in this regard20:05
Goneriblackboxsw, this is totally a mistake. I'm actually surprised it just works20:06
meenai could use https://docs.python.org/3/library/ctypes.html#ctypes.util.find_library20:06
GoneriI will fix the order. thanks blackboxsw for the review!20:06
blackboxswGoneri: yeah me too a bit, I would've thought it would have introduced a startup service broken dependency loop or something20:07
sarnoldmeena: oh that looks a lot nicer20:07
meenabut i think i'm gonna have to read the actual code to see what that does on the systems before using it.20:07
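[Editor's sketch] The two ideas above (sarnold's loop over candidate names and meena's ctypes.util.find_library) can be combined without spawning anything from the calling code itself. This is a minimal sketch, not the code in cloudinit/util.py; the function name, the injectable loader, and the candidate names are assumptions chosen for illustration.

```python
import ctypes
import ctypes.util


def find_libc(candidates=("libc.so.7", "libc.so.6"), loader=ctypes.CDLL):
    """Return a loaded libc handle, or None if nothing could be loaded."""
    names = []
    # Ask the platform first; find_library returns None when it finds nothing.
    found = ctypes.util.find_library("c")
    if found:
        names.append(found)
    # Fall back to an explicit list of known or expected library names.
    names.extend(candidates)
    for name in names:
        try:
            return loader(name)
        except OSError:
            # This name is not loadable on this system; try the next one.
            continue
    return None
```

As meena notes, find_library behaves differently per platform (on some systems it runs external tools internally), so it is worth reading its implementation before relying on it.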
blackboxswmeena: have a url for me of the PR that you'd like looked at first?20:08
blackboxswto refresh my memory. I'll try to get a pass on it today20:08
meenablackboxsw: nope.20:08
blackboxswheh, will grab one of 'em and work through it20:08
meenablackboxsw: all i want is the networking stuff on the Mailinglist to get a response :P20:08
blackboxswmeena: I'll try to get an update to that then. I think the direction that robjo is going with current prs (with flavors on sysconfig renders) is probably the best approach at the moment, but I think sysconfig renderer is the most contentious/dirty of our cases because suse and rhel differ so much in network config flag support. I'll try saying something smart about that on the mailinglist, what blocked me was examples & suggestions there.20:10
lachesishi all, i'm having a problem with ssh_pwauth setting on DigitalOcean. I am shipping an image that has its own /etc/ssh/sshd_config file with `PasswordAuthentication no` already set, but also with a `MatchUsers` section that enables PasswordAuthentication for some specific users (for somewhat silly reasons that are hard to fix). however, cloud-init is doing a broad string replacement, so it's disabling ssh password auth even for that particular user.20:15
lachesishow can i prevent this? also, where can i open an issue about this bug?20:15
lachesisi'd really like to mask that setting out in the image... i suppose i could do this by editing the /usr/lib/python3/dist-packages/cloudinit/config/cc_set_passwords.py file to just disable that check20:16
meenablackboxsw: the examples & suggestions provided, or the ones missing?20:16
meenablackboxsw: re PRs, i think this one can probably be closed: https://github.com/canonical/cloud-init/pull/69 from what i understand from rharper . and i'm wondering if we shouldn't revert the previous work i did there.20:17
lachesisi tried putting `ssh_pwauth: unchanged` in `/etc/cloud/cloud.cfg` but that didn't help20:18
rharperlachesis: could you file a bug and include the tarball from 'cloud-init collect-logs'  and any provided user-data ?   we can look into the issue and see what's going on20:22
blackboxswlachesis: you can file a bug here: https://bugs.launchpad.net/cloud-init/+filebug to give a bit more context about the problem.20:23
blackboxswsorry, would help if I hit enter20:23
lachesis:) yeah i'll file there... i am not providing any user data but it is possible that DO is without me... let me see if i can track down where cloud-init fetches that, probably some magic 169 address20:24
rharperlachesis: right, I see you were adding the config via /etc/cloud/cloud.cfg vs. user-data; that's good enough20:28
blackboxswlachesis: if you ever need to check what userdata and vendordata cloud-init sees:   `sudo cloud-init query userdata` or `sudo cloud-init query vendordata` or `sudo cloud-init query --all`20:30
lachesisok yeah it looks like ssh_pwauth is coming in from vendor data20:32
lachesisbug report: https://bugs.launchpad.net/cloud-init/+bug/1865082 20:36
ubot5Ubuntu bug 1865082 in cloud-init "ssh_pwauth: no disables PasswordAuthentication in MatchUsers block as well as globally" [Undecided,New]20:36
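[Editor's sketch] The behaviour lachesis is asking for, changing an option only in the global part of sshd_config and leaving Match blocks alone, could look roughly like this. It is not cloud-init's ssh_util code; the function name is hypothetical, and it assumes that everything from the first Match line onward belongs to a Match block and that options are written as "Key value".

```python
def set_global_sshd_option(config_text, key, value):
    """Set key to value in the global section only, before any Match block."""
    out = []
    in_match = False
    replaced = False
    for line in config_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "match":
            # Entering the first Match block: if the key was never seen
            # globally, add it here so it still applies globally.
            if not replaced:
                out.append(f"{key} {value}")
                replaced = True
            in_match = True
        elif not in_match and words and words[0].lower() == key.lower():
            # Keyword names are case-insensitive; rewrite the global setting.
            line = f"{key} {value}"
            replaced = True
        out.append(line)
    if not replaced:
        # No Match block and no existing setting: append at the end.
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"
```

With this approach a per-user `PasswordAuthentication yes` inside a Match block survives a global change to `no`.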
rharperblackboxsw: man, we really should have collect logs pull in  /etc/cloud/cloud.cfg, /etc/cloud/cloud.cfg.d/*.cfg ...20:41
blackboxswyeah we really should20:41
blackboxswit is a pain point when we have to debug/triage20:41
rharperlachesis: replied; was this run from an instance with the /etc/cloud/cloud.cfg including the added 'ssh_pwauth: unchanged' ?20:41
blackboxswthough we omitted it because it could contain sensitive info20:41
blackboxswbut maybe we lump it into the user-data question in apport20:42
rharperI mean, any of them could;  so can user-data to some degree20:42
rharperright20:42
blackboxswyeah20:42
lachesisnegative, there was no cloud.cfg in this case. i can rebuild the image and regenerate with that set, but it'll take a few min20:42
rharperlachesis: you can add that to the existing instance20:42
rharperand then run: sudo cloud-init clean --logs --reboot20:42
lachesisand reboot it?20:42
lachesiswill do20:42
rharperthat will run like "new instance"20:42
rharperthe code reads that it should exit out without calling the ssh_utils path which reads in sshd ;20:43
lachesishmm ok that restarted my box, and it hasn't come back up yet :/20:45
lachesisit's a cow, not a pet, so i can just destroy and rebuild it, no big worries, but it will prevent me from getting the logs this time :)20:46
rharperwell, that's not nice of DO ...20:46
rharperlachesis: so DO does provide a *lot* of vendor-data scripts; it's possible that they are including a ssh_pwauth: no  setting by default20:46
lachesisthey definitely are... i included the query --all result in the tar file in that bug20:47
rharperin which case, that can override system config; which leaves you with having to fight them or disabling vendor-data in your image; so you can set system-config;20:47
lachesismm i see, ideally i'd just be able to disable that particular setting and/or that setting would be smart enough to avoid messing up my MatchUsers20:48
lachesisi am pretty strongly tempted to patch the python file and be done with it lol20:48
rharperheh20:48
rharper2020-02-27 20:50:04,362 - util.py[DEBUG]: Writing to /var/lib/cloud/instances/f1/sem/config_set_passwords - wb: [644] 24 bytes20:50
rharper2020-02-27 20:50:04,362 - helpers.py[DEBUG]: Running config-set-passwords using lock (<FileLock using file '/var/lib/cloud/instances/f1/sem/config_set_passwords'>)20:50
rharper2020-02-27 20:50:04,363 - cc_set_passwords.py[DEBUG]: Leaving SSH config 'PasswordAuthentication' unchanged. ssh_pwauth=unchanged20:50
rharper2020-02-27 20:50:04,363 - handlers.py[DEBUG]: finish: modules-config/config-set-passwords: SUCCESS: config-set-passwords ran successfully20:50
rharperlachesis: so that20:50
rharperwhat I expect to see if you20:50
rharperyour value makes it into the combined cloud config20:50
lachesissry how do i get logs from that cloud-init clean run?20:51
blackboxswlachesis: logs live in /var/log/cloud-init.log20:53
lachesisobviously, thx :)20:53
lachesis2020-02-27 20:47:12,316 - util.py[DEBUG]: Read 2964 bytes from /etc/ssh/sshd_config20:53
lachesis2020-02-27 20:47:12,317 - ssh_util.py[DEBUG]: line 97: option PasswordAuthentication already set to no20:53
lachesis2020-02-27 20:47:12,317 - ssh_util.py[DEBUG]: line 103: option PasswordAuthentication updated yes -> no20:53
lachesis2020-02-27 20:47:12,317 - util.py[DEBUG]: Writing to /etc/ssh/sshd_config - wb: [644] 2963 bytes20:53
blackboxswthe cloud-init clean --logs  removes your old /var/log/cloud-init.log so it'll only contain the current boot20:53
lachesis$ cat /etc/cloud/cloud.cfg20:53
lachesis# The top level settings are used as module20:54
lachesis# and system configuration.20:54
lachesisssh_pwauth: unchanged20:54
lachesiswait is that a space?20:54
lachesisdoh20:54
lachesisit's not a space, the _ just got lost somehow?20:54
lachesis`ssh_pwauth: unchanged`20:54
lachesismaybe my xchat font is borked... but there is an _ showing in vim20:54
lachesisbut i imagine the vendor-data just overrode my config there20:55
rharperthat's what I'm thinking20:55
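[Editor's sketch] Why a value in /etc/cloud/cloud.cfg can lose to vendor-data: a deliberately simplified model of layered config, where later layers win. The function name and the flat dict merge are assumptions; cloud-init's real merging handles nested structures and is more elaborate than this.

```python
def effective_config(system_cfg, vendor_data, user_data):
    """Merge config layers; later layers override earlier ones key by key."""
    merged = {}
    # Assumed precedence, lowest to highest: system config, vendor-data, user-data.
    for layer in (system_cfg, vendor_data, user_data):
        merged.update(layer)
    return merged
```

Under this model, `ssh_pwauth: unchanged` in the system config is replaced by the vendor's `ssh_pwauth: no` unless user-data sets the key again or vendor-data is disabled in the image.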
lachesisok im gonna go with the dumb patch cc_set_passwords.py option :)21:01
lachesisthanks for your help folks, i love an active open-source IRC channel :)21:01
rharperlachesis: yw21:05
=== hggdh is now known as hggdh-msft
blackboxswrharper: just repushed  https://github.com/canonical/cloud-init/pull/214 with doc updates21:56
rharperblackboxsw: cool, did you see my comments in the review re: SRU blocker text / system_info() json encoding potential issues ?21:56
blackboxswrharper: I had and responded to both21:57
rharperk21:58
* rharper reviews 21:58
blackboxswI think SRU blocker is a no because we added new fields before across SRU boundary for platform/subplatform.21:58
rharperand we mark the fields experimental correct ?21:58
blackboxswrharper: I did in the ds in the base, but we can/should add those good thought21:58
blackboxswdoing that now21:58
rharperso we're not baking things in; though I suspect telling folks they can now use this in their jinja template might make it less useful if it will change on them21:59
rharperblackboxsw: well, I do think things like system_info/variant, etc21:59
blackboxswthat's true21:59
rharperwon't be going away21:59
rharperso those don't have to be experimental, ie, they are fixed values at this point21:59
blackboxswright it's already been a requested  feature here once or twice21:59
rharperwe already have runtime code that looks at os.variant etc21:59
blackboxswrharper: correct, nothing changed there, just surfaced that key in instance data22:00
rharperyeah; but once added and SRU'ed, it can't change without potentially regressing user scripts22:00
blackboxswthat's all part of stock util.system_info22:00
rharperso, we should be happy with them;22:00
blackboxswright, we would be unable to change names once SRU'd22:00
blackboxswmaybe it's worth bikeshedding on key names?22:01
* rharper was just reviewing them 22:01
blackboxswas we certainly don't  change v1 keys22:01
blackboxswthe rest of the dict doc is up for changes in general as we don't promise that part won't change.... just v1 keys22:01
blackboxswas v1 is the generalized output22:02
blackboxsws/generalized/standardized22:02
rharperthe 'cfg' head is going to be free-form, no?  if someone modified their /etc/cloud/cloud.cfg (or added a sub file) then there's going to be whatever in there ...22:02
rharperblackboxsw: reviewed22:16
blackboxswthanks rharper yeah, that cfg probably should be root-readonly22:21
blackboxswit could have just about anything22:21
fredlef1I've been looking at the life cycle of instance_link.  I see it's mostly managed in Init::_reflect_cur_instance(), called in stage6, after a valid data source has been found. But it's also preemptively deleted in Init::_get_data_source(), before initiating a search for available data sources.22:24
fredlef1The preemptive deletion was added in 2016 in 0x0964b42e5 (quickly check to see if the previous instance id is still valid). Does anyone remember what made the deletion outside of reflect_cur_instance() necessary/desirable?22:24
rharperfredlef1: hi, I believe the goal is to not trust the on disk object cache (which is only present on datasources which implement check_instance_id());  unless we can confirm the current instance id is the same as what's on disk (either in cache, or via the symlink);  this deletion happens during cloud-init-local time;  on subsequent stages (init, config, final) they all use the trust flag23:19
rharperfredlef1: do you have a particular bug you're looking at or something else?23:19
fredlef1rharper: thanks. That's useful information. Unless I'm mistaken, the disk object cache is always created but it only gets used if check_instance_id() is implemented or if manual_cache_clean is set to True.23:22
fredlef1rharper: I'm looking at ways to gracefully handle a reboot where the datasource is not available anymore23:22
rharperfredlef1: yes, I believe that's true, we do write it but it won't be used unless the ds implements the function ; it's disabled by default in base source class (cloudinit/sources/__init__.py)23:22
rharperfredlef1: this is manual-cache-clean: True23:23
fredlef1rharper: not quite.  I don't want to reuse the cache if the datasource is available, so as not to miss updates/changes to user-data and network interfaces.23:24
fredlef1Basically, if the datasource is available, we should always crawl it but if it is not, I want a way out23:24
rharperinteresting23:24
rharperwell23:25
fredlef1I'm testing a patch that modifies _get_data_source to forcefully load the cached data if we failed to find a valid data source and the cached datasource still matches the config.23:26
fredlef1I should probably hide that behind a configuration option to make it more palatable23:26
fredlef1Does that sound senseless ?23:27
rharpernot senseless;  I generally like the idea of using cached datasource object under the following criteria 1) the platform is the same, 2) the datasource is the same 3) metadata service is down;  4) and I think we can compare the object's ds.instance_id value to what's written to /var/lib/cloud/data/instance-id23:30
rharpercurrently we won't re-use the object unless the source implements check_instance_id() ;  normally this is done via some non-network verification, some platforms encode instance-id in platform data (like dmi system uuid etc);  the ec2 instance-id is not present in system info that I'm aware of;   but you could attempt to fetch instance id via metadata service; and return true and we'd use the object;  in your "metadata service is down scenario";  then you could do the other fallback checks I mentioned (1, 2, 3) and return True only if those hold23:33
rharperthis would restore the obj from cache23:33
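[Editor's sketch] rharper's criteria for falling back to the cached datasource object, written out as a single predicate. Every name here (the dataclass, its fields, the function and its parameters) is hypothetical and only mirrors the conversation; none of it is cloud-init's actual API.

```python
from dataclasses import dataclass


@dataclass
class CachedDatasource:
    platform: str      # platform the cached object was created on
    dsname: str        # name of the datasource that produced it
    instance_id: str   # instance id stored in the cached object


def may_reuse_cache(cached, current_platform, configured_dsnames,
                    metadata_reachable, instance_id_on_disk):
    """Return True only when falling back to the cached object looks safe."""
    # If the metadata service answers, always crawl it instead of the cache,
    # so changes to user-data or network config are not missed.
    if metadata_reachable:
        return False
    return (
        cached.platform == current_platform             # 1) same platform
        and cached.dsname in configured_dsnames         # 2) same datasource
        and cached.instance_id == instance_id_on_disk   # 4) ids agree
    )
```

The instance_id_on_disk argument stands for the value written to /var/lib/cloud/data/instance-id, which rharper suggests comparing against.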
rharperbeyond that, we'd need to sort through the other hits to imds, like the .network_config property23:34
fredlef1rharper: I looked at implementing check_instance_id as we do have a way to get the instance id on newer instance types but it turns out that implementing check_instance_id for the EC2 datasource would conflict with public documentation from AWS.  In fact, I'm rather planning to implement a DataSourceEc2::check_instance_id() that returns false as a place to document why it should not be implemented.23:42
rharperok,23:42
fredlef1I'll keep your criteria for reusing the cached datasource object in mind. Thanks for that.23:43
rharperthat's reasonable;   so, then I suspect we'd need to modify _get_data() to handle this scenario23:43
rharperbut possibly deal with the removal of the object cache , which is why you're asking =)23:43
