=== tds6 is now known as tds
=== waxfire7 is now known as waxfire
=== frickler_pto is now known as frickler
=== vrubiolo1 is now known as vrubiolo
[14:16] Hi all. Is this channel for cloud-init development only? Or is it ok for general cloud-init questions?
[14:17] beantaxi: always ask away ... someone might be able to help
[14:19] Haha thanks! I'm _very_ new to cloud-init, though I'm very happy my EC2 startup actually is following an open standard. Anyway, I've excitedly moved to launch template & user data based startup, since why not just use that instead of learning terraform or what have you.
[14:19] heh, cool. welcome
[14:20] Trouble is, it appears (perhaps misleadingly) that my userdata is not being executed to completion. Almost as though at some point, cloud-init says "ok, you've had long enough" and kills my script and decides to finish booting.
[14:20] I'd be very surprised if that's what's happening, so I'm trying to dig in and get some more detail
[14:22] For example, my user data is basically a bunch of apt installs, then a few mounts + writes to fstab, then some downloads from S3 and some systemctl enables. But I keep getting these no-good instances, because e.g. in cloud-init-output.log it appears to die somewhere in the middle, e.g. after my first mount
[14:23] right, typically we look at the cloud-init logs; if you can get into your system, then cloud-init collect-logs will create a tarball of cloud-init logs and state. It will package up /var/log/cloud-init* and /run/cloud-init* and include user-data, so if it's sensitive, you can edit that out and just paste cloud-init.log
[14:23] beantaxi: is your script run via runcmd: in user-data?
[14:27] I'm not sure about runcmd. But everything's in userdata.
I have a launch template, where the base image is just EC2's Ubuntu 18.04 Server, and the userdata is my base64 encoded script
[14:31] Ultimately I was able to 'fix' my instances, by uploading and running the script by hand, from a sudo -i shell. There only seems to be an issue during startup.
[14:31] ok, so you should be able to find your decoded script in /var/lib/cloud/instance/scripts/
[14:31] I'd first confirm that the decoded script looks the way you expect
[14:31] I've fired up a new instance, so I can grab the logs with collect-logs as you described. Thanks for that! That sounds useful. And thanks for the decoded script path! That'll be a great next step.
[14:32] second, you can try to re-run it like cloud-init would with: cloud-init --debug single --name cc_scripts_user --frequency=always ; cloud-init will call run-parts on that scripts dir
[14:32] Yesssss that sounds perfect
[14:33] and lastly, if you use a #!/bin/bash -x for your shebang in your script, then you can see the execution tracing output in /var/log/cloud-init-output.txt
[14:34] That's the one thing I've actually done from the beginning. Is it /var/log/cloud-init-output.txt or .log?
[14:36] Backstory - a buddy has started a new job, with runaway k8s issues. k8s for everything. Unsurprisingly nothing works, and no one knows how it's even supposed to work. I told him 'have you looked at cloud-init? I think that's 99% of what you need.' So I'm hoping to demonstrate that (and perhaps get a little contract out of it.)
[14:37] /var/log/cloud-init-output.log
[14:38] Ok good. That's what I've been looking at. It's unclear what its relation is to what AWS makes available in the console for 'Get System Log', but I presume that's some very AWS specific stuff going on.
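The debugging steps suggested above can be collected into a small helper script. This is only a sketch: the function names are hypothetical (they are not part of cloud-init), the cloud-init commands are the ones quoted in the conversation, and it assumes a root shell on the affected instance. Nothing runs until a function is called.

```shell
#!/bin/bash
# Sketch of the debugging steps from the conversation above.
# Function names are hypothetical helpers, not cloud-init commands.

# Directory where cloud-init stores the decoded user scripts (path from the chat).
SCRIPTS_DIR="${SCRIPTS_DIR:-/var/lib/cloud/instance/scripts}"

# Step 1: print each decoded user script so it can be compared
# with what was actually uploaded as user data.
show_user_scripts() {
    for f in "$SCRIPTS_DIR"/*; do
        [ -f "$f" ] || continue
        echo "=== $f ==="
        cat "$f"
    done
}

# Step 2: re-run the user scripts the way cloud-init would.
rerun_user_scripts() {
    cloud-init --debug single --name cc_scripts_user --frequency=always
}

# Step 3: bundle cloud-init logs and state into a tarball.
# The tarball includes user-data, so review it before sharing.
collect_logs() {
    cloud-init collect-logs
}
```

After sourcing the file, calling show_user_scripts, then rerun_user_scripts, then checking /var/log/cloud-init-output.log follows the order suggested in the chat.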
[14:38] beantaxi: speaking of k8s and cloud-init, https://bugs.launchpad.net/cloud-init/+bug/1888822
[14:38] Ubuntu bug 1888822 in cloud-init "cloud-init does not respect declared MIME types in multipart archives" [Critical,Triaged]
[14:39] this was just worked on last week; and it had to do with some k8s bootstrapping of secret-user-data ... may not be related but figured I'd pass it along in case that was the issue
[14:42] Thanks! It was a good read. Among other things, it demonstrates people are successfully using cloud-init for much more elaborate scenarios than mine.
[14:43] I was a little afraid my issue was 'don't use cloud-init for anything over a dozen lines or so; that's not what cloud-init is for'
=== vrubiolo1 is now known as vrubiolo
[14:55] beantaxi: hehe, no, there are some very elaborate and long scripts to set up hosts with cloud-init
[15:05] Murphy's Law: I just built a new image, and then launched a new VM from the new image, and both came up flawlessly. And I'd terminated the bad guys so I couldn't run the above scripts. But those are fantastic to have for future use.
[15:06] =)
[15:08] Actually in looking at my successful run, I notice I have an rsync in there, to sync a local disk up from a volume, and perhaps that's not really part of 'system startup'.
[15:09] Do you guys have a recommendation, on whether to put that in a separate cloud-init step to run on start, or to use systemd, or other?
[15:15] cloud-init will run every boot, but not every cloud-init operation runs every boot; you can create a script which cloud-init will run every boot, or only once, or once-per-instance
[15:16] cloud-init can run things quite early (a boot hook); user-scripts/runcmd typically run fairly late (by design, after networking is up and users created, files written, etc)
[15:16] so it really depends on when you need to run the rsync, how often, etc.
[15:20] That's actually how I found out about cloud-init.
I wanted something to run on every VM start, not just VM creation, and I came across an AWS thing saying I could use a multi-part MIME file etc etc.
[15:21] I'm new to systemd as well, so I'm musing if I want to go the multi-part MIME route or the systemd route. I'm happy to know any technical pros/cons if there's more to it than personal taste.
[15:22] cloud-init only runs during boot; so after the bootup is finished, it's not active; of course with a systemd unit you can start/restart it trivially; having cloud-init re-run a script is also doable but likely means more overhead from spinning up cloud-init to exec a script; if it's meant to run more frequently than boot, I'd probably use write_files to create a systemd unit with my program being called from that
[15:26] Ah - this is the bit where you use cloud-init / cloud-config directives, instead of pure bash
[15:28] Right, write_files and runcmd; you could use write_files to write out both your unit and the script, and runcmd to invoke the script and the service if you like
[15:35] I saw that ... I was a bit hesitant to learn that, instead of just writing bash, largely because I wasn't sure how I'd troubleshoot my cloud-config or see exactly what was going on. Of course I'd get the benefit of any error checking etc -- all the stuff I _should_ be doing but probably am not.
[15:36] What's the implementation of write_files etc ... is it all little python functions?
[15:41] almost all of cloud-init is written in Python; the syntax for the user-data input is YAML, we have examples on our docs page, https://cloudinit.readthedocs.io/en/latest/topics/modules.html#write-files
[15:42] for debugging/troubleshooting, we typically use LXD to run a system container with user-data attached to it; that's faster than launching an image (if you don't have a dev setup with lxd, you can launch an ubuntu instance and use lxd from there)
[15:44] That lxd maneuver sounds incredibly helpful.
Every time I need to debug an image startup issue I lose half a day, just waiting for VMs to start.
[15:44] alternatively, if you deploy into an instance, you can test your configs with: cloud-init --debug --file my-cloud-config.cfg single --name cc_write_files --frequency=always; write the cloud-config that you want to test into the file and then repeatedly call cloud-init single; the --frequency=always means it will always execute that module
[15:45] yeah, lxd part is nice; we do something like: lxc init ubuntu-daily:bionic b1; lxc config set b1 user.user-data "$(cat my-user-data.cfg)"; lxc start b2;
[15:45] s/b2/b1;
[15:45] then you can lxc exec b1 bash; and run cloud-init status --wait (this blocks until all of cloud-init is done); and check your results
[15:47] lxd has seemed like black magic to me for quite some time. It's been bugging me, but I've never had a 'way in' to demystify it and actually use it as a productivity tool. This sounds perfect.
[18:40] So, I've been musing if cloud-init could be used to deploy containers on separate AWS regions, or even across cloud providers.
[18:41] And now, in an LXD YouTube video I'm watching from 2015, this guy talks about using LXD to migrate cloud_init _running containers_ from host to host. Wow.
=== vrubiolo1 is now known as vrubiolo
=== tds8 is now known as tds
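The write_files plus runcmd approach suggested in the conversation might look roughly like the cloud-config below. This is a sketch under stated assumptions, not a tested configuration: the script path, unit name, and rsync source and destination directories are all hypothetical placeholders for the rsync job mentioned in the chat.

```yaml
#cloud-config
# All file names, the unit name, and the rsync paths are hypothetical.
write_files:
  # The script that does the actual work.
  - path: /usr/local/bin/sync-data.sh
    permissions: '0755'
    content: |
      #!/bin/bash -x
      # Hypothetical source (attached volume) and destination (local disk).
      rsync -a /mnt/volume/data/ /srv/data/
  # A oneshot systemd unit that calls the script.
  - path: /etc/systemd/system/sync-data.service
    permissions: '0644'
    content: |
      [Unit]
      Description=Sync local disk from attached volume (example)
      After=network-online.target
      Wants=network-online.target

      [Service]
      Type=oneshot
      ExecStart=/usr/local/bin/sync-data.sh

      [Install]
      WantedBy=multi-user.target
runcmd:
  # Pick up the new unit file, enable it for later boots, start it now.
  - systemctl daemon-reload
  - systemctl enable sync-data.service
  # --no-block avoids waiting on the job from inside cloud-init.
  - systemctl start --no-block sync-data.service
```

With this split, cloud-init only writes the files and enables the unit on first boot of the instance, and systemd runs the sync on each later boot. A file like this can be tried in an LXD container with the lxc commands quoted above before it goes into a launch template.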