/srv/irclogs.ubuntu.com/2020/09/16/#cloud-init.txt

=== ijohnson is now known as ijohnson|lunch
blackboxswparide: Odd_Bloke rharper I can't remember if we no longer support CentOS/7 source rpm builds17:41
blackboxswto upload to copr repo17:41
blackboxsw./tools/run-container --source-package centos/8 works17:50
blackboxswbut dependency issues trying centos/7 . I think I recall paride mentioning a few weeks back that we can no longer support centos/7 builds there17:50
blackboxswfalcojr: for the moment, I think building centos/8 --source-package and uploading and requesting a build for el8 should suffice18:00
=== ijohnson|lunch is now known as ijohnson
rharperblackboxsw: master cannot build centos7 since we dropped py2 support; I discussed with paride that we'd want the copr builds for centos7 to happen against 19.4-stable;18:37
chillysurferare the cloudinit.util dict merging functions for general use? i'm assuming those are for merging cloud config. but is it normal to use them for typical dict merging in the rest of the code base?18:48
Odd_Blokechillysurfer: If you grep for mergemanydict in the codebase, you can see it's used for a few different things; it might depend a little on what you consider "typical" dict merging as to whether or not it's appropriate in your case. :)18:52
chillysurferOdd_Bloke: yep that's the one i'm specifically looking at. we even use it in our azure data source18:57
chillysurferi might try to get around that though if possible. like most things, it doesn't seem free :)18:58
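For comparison with the "typical dict merging" chillysurfer has in mind, here is a minimal sketch of a plain recursive merge. It is not cloud-init's mergemanydict, whose precedence and list-handling rules may differ; `merge_dicts` and the sample keys are hypothetical names used only for illustration.

```python
from copy import deepcopy


def merge_dicts(base, override):
    """Return a new dict where override wins and nested dicts merge recursively.

    Hypothetical helper for illustration; not part of cloudinit.util.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold a dict under this key: merge them key by key.
            merged[key] = merge_dicts(merged[key], value)
        else:
            # Scalars, lists, and mismatched types are simply replaced.
            merged[key] = deepcopy(value)
    return merged


# Made-up sample data.
defaults = {"retries": 3, "network": {"timeout": 5, "dhcp": True}}
overrides = {"network": {"timeout": 30}}
result = merge_dicts(defaults, overrides)
print(result)
```

The deep copies keep the inputs untouched, which is part of why this kind of merge "doesn't seem free" on large structures.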
blackboxsw+1 rharper I had forgotten the context around that.19:13
blackboxswthanks19:13
meenaOdd_Bloke: so twitter actually worked!19:17
Odd_Blokemeena: Yep!19:20
Odd_BlokeThat explains why the checkmarks on that PR are all blue now.19:20
Ac-townhmm not seeing anything on the docs for this, how can I return an error during a runcmd step? so that the error is passed down into data/status.json?21:44
Ac-townor even scripts-per-boot21:46
Ac-townideally I want to collect that file and parse it to see if we failed any boot steps and if so what21:46
blackboxswAc-town: cloud-init status --long should report errors. https://paste.ubuntu.com/p/mZmwvqMpVW/  I get that error when I lxc launch ubuntu-daily:xenial  -c user.user-data="$(cat bogus_runcmd.yaml)"22:37
blackboxswwhere bogus_runcmd.yaml contains "#cloud-config\nruncmd:\n  - 1/0"22:37
blackboxswone thing to wonder/check is whether your runcmd content is actually valid cloud-config yaml on the system: https://cloudinit.readthedocs.io/en/latest/topics/faq.html#how-can-i-debug-my-user-data22:39
Ac-towndo these errors only report when cloud-init fails to parse? or when those commands fail22:40
Ac-townfor one example, we have our first puppet run in scripts-per-boot, and if that fails for some reason the host is going to fail, I'm looking for a sane way to report back what failed so I can watch the status across our whole deploy22:41
Ac-townbut there are no errors in status.json under modules-final22:42
blackboxswthat doc url I linked only validates user-data cloud-config (which actually produces some semblance of error through cloud-init status --long, though the errors surfaced are wildly different depending on what problems there are in the user-data YAML). I'll have to check scripts-per-(boot|instance|always) to see how cloud-init behaves there on failed "external" scripts22:48
blackboxswI would want cloud-init to try processing and running all scripts in those directories and not stop after the first error, but I'd also want cloud-init to collect any and all errors and surface them from cloud-init status --long (which reads /run/cloud-init/status.json and result.json)22:49
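Since Ac-town wants to collect and parse that file, a rough sketch of doing so follows. It assumes status.json has a top-level "v1" mapping whose per-stage entries (such as modules-final) each carry an "errors" list; that layout is an assumption to check against the file on your own cloud-init version. `collect_stage_errors` and `load_status` are hypothetical helper names, and the sample reuses the error string quoted later in this log.

```python
import json


def load_status(path="/run/cloud-init/status.json"):
    """Read the status file from disk (path taken from the discussion above)."""
    with open(path) as handle:
        return json.load(handle)


def collect_stage_errors(status):
    """Map stage name -> list of errors, for stages that reported any.

    Assumes a {"v1": {stage: {"errors": [...]}}} layout; non-dict entries
    (for example a datasource string) are skipped.
    """
    failures = {}
    for stage, info in status.get("v1", {}).items():
        if isinstance(info, dict) and info.get("errors"):
            failures[stage] = list(info["errors"])
    return failures


# Hypothetical sample shaped like the assumed layout.
sample = {
    "v1": {
        "datasource": "DataSourceNoCloud",
        "init": {"errors": []},
        "modules-final": {
            "errors": [
                "('scripts-per-boot', RuntimeError('Runparts: 1 failures "
                "in 1 attempted commands',))"
            ]
        },
        "stage": None,
    }
}
failures = collect_stage_errors(sample)
print(failures)
```

On a real host, `collect_stage_errors(load_status())` would be the entry point; an empty result means no stage recorded an error, not that every script exited zero.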
blackboxswAc-town: I'm trying to think of how best to handle this, if cloud-init isn't surfacing an error in a scripts-per-boot maybe that's worth a bug and we can discuss on the bug whether that is out of scope for cloud-init.22:51
Ac-townI can easily see the errors section to be in the case that there was a cloud-init level error, but when I have puppet and commands in runcmd that are critical to boot it would be nice to avoid parsing logs22:52
blackboxswI'm trying to launch a broken instance now with a bad script to see22:52
blackboxswAc-town: especially parsing cloud-init logs to be frank (as they are way too busy)22:52
blackboxswAc-town: your puppetmaster doesn't represent puppet apply's that fail on given nodes?22:53
blackboxswor maybe you are running masterless22:53
Ac-townit's not an ideal puppet in any sense22:53
Ac-townno puppetdb22:53
Ac-townbut there's still a puppetmaster22:54
Ac-townpuppet is also managed by an external team, so getting those tools isn't trivial22:54
blackboxsw+1  right, understood. I'm trying to think of the best approach to take into account custom script failure (which doesn't really fail cloud-init proper)22:55
blackboxswI was toying with minor abuse/use of the  existing call home functionality  https://cloudinit.readthedocs.io/en/latest/topics/examples.html#call-a-url-when-finished22:56
blackboxswwhich could post your custom script success/failure exit codes maybe?22:56
blackboxswto some url of your choosing22:56
Ac-townthat would be cool actually22:56
blackboxswbut again you want an easy way to "see" that failure from cloud-init I suppose22:56
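The idea of posting exit codes to a URL could look roughly like the sketch below. It is a standalone script, not cloud-init's phone-home module; the endpoint, the hostname, the payload fields, and the `build_report_request` helper are all hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_report_request(url, hostname, exit_codes):
    """Build (but do not send) a JSON POST describing per-step exit codes.

    Hypothetical helper; the payload shape is an assumption for illustration.
    """
    payload = json.dumps(
        {"hostname": hostname, "exit_codes": exit_codes}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up endpoint and values; .invalid is a reserved, non-resolving TLD.
request = build_report_request(
    "https://status.example.invalid/report",
    "vm-01",
    {"puppet": 0, "rsync": 23},
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request, timeout=10)
```

This pushes data from the host, which is the opposite direction from the pull model Ac-town describes next.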
Ac-townthe idea of having it on the machine is that it's better for us to reach out to a host than for a host to reach out22:57
Ac-townsince I was going to write a small service to allow us to collect this data, or ansible22:57
Ac-towna vm workflow is this:22:58
Ac-townscale out, vm is created, puppet runs in scripts-per-boot and sets up all our basic host things and some packages used in runcmd, runcmd uses those packages to setup our data and start the service22:58
Ac-townmy ideal would be to look across the infra to either follow the status or to see if there is a problem across the stack22:59
blackboxswok, validated: I dumped the same bogus script from runcmd into /var/lib/cloud/scripts/per-boot/1.sh22:59
blackboxswand I see the error ('scripts-per-boot', RuntimeError('Runparts: 1 failures in 1 attempted commands',))22:59
blackboxswacross reboots22:59
Ac-townwhat about runcmd?23:00
Ac-town(it's possible we're not passing down puppet's exit code)23:00
blackboxswand same-ish from runcmd: https://paste.ubuntu.com/p/mZmwvqMpVW/23:00
Ac-townI see a rsync step that fails and runcmd says success23:00
blackboxswnote 'scripts-user' instead of 'scripts-per-boot'23:01
blackboxswinteresting, so I imagine you see that rsync failure in cloud-init-output.log instead of cloud-init.log?23:01
Ac-townI'm looking at the cloud-final journalctl23:01
Ac-townfor some reason we're not writing out the logs in /var/log, why I'm not sure yet23:02
Ac-townbut it's always been the case23:02
blackboxswinteresting. ok, might be custom rsyslog configuration on the system23:02
Ac-towncould be, either way I see that scripts-per-boot returns "SUCCESS" and same with scripts-user23:03
blackboxswlog configuration can/does get manipulated in  /etc/cloud/cloud.cfg.d/05_logging.cfg on a lot of systems23:03
blackboxswrecent versions of cloud-init can show you what cloud-init sees as logging configuration via `cloud-init query merged_cfg._log`23:05
Ac-townI see a cloud-init.log file managed there and the file does get made but it has a size of zero23:05
blackboxswas root user23:05
blackboxswahh interesting23:05
Ac-townin what version should that query work? I got an error from that23:05
blackboxswcould it be puppet manifest clearing it out?23:05
Ac-townwe're on 18.523:05
blackboxswv. 20ish23:05
blackboxswwhat disto?23:05
blackboxswdistro?23:06
Ac-towncentos7 atm23:06
blackboxswahh, best we can get you is 19.4 via our copr repos. cloud-init upstream dropped python2 support end of 201923:06
blackboxswwhich lives @ https://copr.fedorainfracloud.org/coprs/g/cloud-init/el-testing/23:08
blackboxswour daily upstream builds (py3-only support) live at https://copr.fedorainfracloud.org/coprs/g/cloud-init/cloud-init-dev/23:09
blackboxswwhich will get you el8 rpm packages23:10
Ac-townswapped out info for generic names, but this is the runcmd section https://paste.ubuntu.com/p/HQ6WhxhCkt/23:10
blackboxswinteresting, here's my same section with a broken runcmd script https://paste.ubuntu.com/p/mDCwCXSyX5/23:16
blackboxswI also see the failure noted in cloud-init-output.log for that runcmd script that was created23:16
blackboxswso it may be easier to parse WARNING out of /var/log/cloud-init-output.log for that case if upgrading to 20.3 of cloud-init is not possible on those images23:17
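A minimal sketch of that fallback: pull every line containing WARNING out of the output log. The log path is the one named above; the sample lines are invented for illustration and do not reproduce real cloud-init output, and `warning_lines` and `read_log` are hypothetical helper names.

```python
def warning_lines(log_text):
    """Return the lines of log_text that contain the word WARNING."""
    return [line for line in log_text.splitlines() if "WARNING" in line]


def read_log(path="/var/log/cloud-init-output.log"):
    """Read the whole log; errors="replace" tolerates stray non-UTF-8 bytes."""
    with open(path, errors="replace") as handle:
        return handle.read()


# Invented sample text, not real cloud-init output.
sample_log = "\n".join(
    [
        "step one finished",
        "WARNING: hypothetical failure message for step two",
        "step three finished",
        "WARNING: another hypothetical failure",
    ]
)
found = warning_lines(sample_log)
print(len(found), "warning line(s)")
```

On a host, `warning_lines(read_log())` would do the scan. As noted earlier in the log, this only helps if the file is actually being written.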
Ac-townwhat if multiple commands are listed in the runcmd? we have 10 steps in ours and at least one of those steps should always work23:17
Ac-townI can see if we can get it updated, but that's also managed by another team as they make the images :(23:18
blackboxswright. so the runcmd section is written into a single consolidated script I think. that makes for a very long script that would have to take into account your separate steps/scripts running.23:20
blackboxswmakes me think maybe you'd want to emit separate per-boot scripts in cloud-config write_files section maybe so they can operate independently but I probably don't understand the problem you are trying to solve.23:21
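One possible reason a failing step can still end in "SUCCESS", assuming the consolidated script is run by a POSIX shell without `set -e` (an assumption about the setup, not something confirmed in this log): a shell script's exit status is that of its last command, so an earlier failure is masked. The sketch below demonstrates only that shell behaviour; `run_script` is a hypothetical helper and `sh` must be on PATH.

```python
import subprocess


def run_script(script):
    """Run a script body with sh -c and return its exit status."""
    return subprocess.run(["sh", "-c", script]).returncode


# A failing step followed by a succeeding one: the failure is masked.
masked = run_script("false\ntrue\n")

# With set -e the shell stops at the first failing command.
strict = run_script("set -e\nfalse\ntrue\n")

print("without set -e:", masked)
print("with set -e:", strict)
```

Splitting the ten steps into separate per-boot scripts, as suggested above, sidesteps this because each script then has its own exit status.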
Ac-townthat's a decent option23:22
Ac-towndidn't think about that23:22
blackboxswI did a sample write_files puppet setup here earlier for a feature test https://github.com/cloud-init/ubuntu-sru/blob/master/bugs/d3e71b5e.txt23:23
blackboxswyou could b64 your content to make it easier to provide  your big scripts23:24
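A sketch of the b64 suggestion: encode a script body and build one write_files entry for it. The keys used (path, permissions, encoding, content) follow the write_files module as I understand it, so check them against the docs for your cloud-init version; the file name, the script body, and the `write_files_entry` helper are hypothetical.

```python
import base64


def write_files_entry(path, script_text, permissions="0755"):
    """Return a dict shaped like one item of a cloud-config write_files list."""
    encoded = base64.b64encode(script_text.encode("utf-8")).decode("ascii")
    return {
        "path": path,
        "permissions": permissions,
        "encoding": "b64",
        "content": encoded,
    }


# Hypothetical script dropped into the per-boot directory mentioned earlier.
script = "#!/bin/sh\necho hello\n"
entry = write_files_entry(
    "/var/lib/cloud/scripts/per-boot/10-hypothetical-step.sh", script
)
print(entry)
```

The resulting dict would still need to be serialized under a `write_files:` key in a `#cloud-config` document with a YAML library of your choice.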
Ac-townso use write_files to write out scripts into scripts-user?23:25
Ac-townyep, we do that for some things already23:25
Ac-townI guess per-instance would make sense, it's after per-boot23:26
blackboxswproblem is if users try to provide user-data to their instance launch (if that's a thing you support) it would override that23:26
blackboxswgotta run.23:26
blackboxswsee folks later23:26
Ac-townty23:26
blackboxswI'll check in later/tomorrow Ac-town good luck :)23:26
