[15:16] <minfrin> A quick question on AWS and cloud-init, to see whether anyone has seen this behaviour before.
[15:16] <minfrin> I have an EC2 instance with an EBS volume attached to it, and I've asked cloud-init to format it like so:
[15:16] <minfrin> fs_setup:
[15:16] <minfrin>   - label: data
[15:16] <minfrin>     device: /dev/xvdh
[15:16] <minfrin>     filesystem: ext4
[15:16] <minfrin> When cloud-init starts, mke2fs fails as follows:
[15:16] <minfrin> 2016-02-10 15:03:42,593 - util.py[WARNING]: Failed during filesystem operation
[15:16] <minfrin> Failed to exec of '['/sbin/mkfs.ext4', '/dev/xvdh', '-L', 'data']':
[15:17] <minfrin> Unexpected error while running command.
[15:17] <minfrin> Command: ['/sbin/mkfs.ext4', '/dev/xvdh', '-L', 'data']
[15:17] <minfrin> Exit code: 1
[15:17] <minfrin> Reason: -
[15:17] <minfrin> Stdout: ''
[15:17] <minfrin> Stderr: 'mke2fs 1.42.9 (4-Feb-2014)\nCould not stat /dev/xvdh --- No such file or directory\n\nThe device apparently does not exist; did you specify it correctly?\n'
[15:17] <minfrin> Later on, after sshing into the machine, I am able to run mke2fs manually and the formatting of the block device works fine.
[15:17] <minfrin> Are there any delays that people are aware of when deploying AWS EBS drives that would cause cloud-init to fail?
[15:22] <waldi> check the kernel log
[15:23] <minfrin> Not following - what would I check the kernel log for? (I assume you mean dmesg)
[15:25] <rcj> smoser, I'm debugging a new DS where the config module fails during boot "Can not apply stage config, no datasource found!" but after boot I can run it 'service cloud-config start' just fine.  Any tips on debugging or ideas of what is happening?
[15:30] <waldi> minfrin: it should list when xvdh appeared. why do you have _eight_ disks?
[15:38] <minfrin> The only mention in dmesg of the xvdh disk is the very last lines, which could be warnings triggered during the successful mkfs.ext4:
[15:38] <minfrin> [   40.226724] blkfront: xvdh: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[15:38] <minfrin> [   40.233951]  xvdh: unknown partition table
[15:38] <minfrin> What I was hoping to find was if anyone had any AWS experience, and whether there were any known issues with AWS running cloud-init before all resources were ready?
[15:40] <smoser> rcj, probably log says WARN somewhere
[15:40] <smoser> and also /run/cloud-init/*.json
[15:41] <smoser> minfrin, you're probably right. the disk probably was not there when cloud-init ran.
[15:42] <smoser> i'd not seen that, but the no such file or directory is pretty clear
[15:42] <minfrin> Some digging has found this patch to something called "rubber" from 2012 which referred to bugs with AWS volumes not being ready: https://github.com/rubber/rubber/pull/156/files
[15:42] <minfrin> The patch seems to pause in a loop until the device exists, and then continues.
[15:47] <smoser> that's definitely a sane path for some situations.
[15:47] <smoser> but in others it's not sane.
[15:47] <smoser> it's possible we could let you define the behavior to cloud-init.
[15:48] <smoser> the issue is that if it sits there and waits forever for that device to appear, and it's not there....
[15:48] <smoser> then it's not going to get to the point where it gets your ssh keys or runs your user-data
[15:48] <smoser> and you're not going to ever be able to get in.
[15:51] <rcj> smoser, no WARN.  This is a container environment where we're emitting some missing signals for network and mounting FSes.  Could be that I've inadvertently changed the order the cloud-init jobs are run.  is cloud-config run before cloud-init-local?
[15:51] <smoser> do you have something in /run/cloud-init ?
[15:52] <rcj> yes
[15:52] <rcj> http://paste.ubuntu.com/15009184/
[15:55] <rcj> smoser, ^ there is results.json (prior run) and here is cloud-init-output.log from the latest run http://paste.ubuntu.com/15009195/
[15:57] <smoser> rcj, config definitely did start before init
[16:07] <rcj> smoser, should it?
[16:13] <smoser> no.
[16:46] <minfrin> smoser: An option to work around the device-not-ready problem is a "wait" option, giving the longest time we're prepared to wait for the drive to become available.
[16:48] <smoser> minfrin, well, not if you're trying to mkfs
[16:48] <smoser> oh. i see, yeah.
[16:48] <smoser> sorry. i thought you were saying upstart or systemd wait for the filesystem
[16:48] <smoser> but yeah
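The "wait" option discussed above can be sketched as a bounded polling loop. This is only an illustration of the idea, not cloud-init code: the function name `wait_for_block_device` and its `timeout` and `interval` parameters are hypothetical. It gives up after the timeout, so boot is never blocked forever if the device does not appear.

```python
import os
import stat
import time


def wait_for_block_device(path, timeout=60.0, interval=1.0):
    """Return True once `path` exists as a block device, False on timeout.

    Hypothetical helper: the name and parameters are assumptions made for
    illustration and are not part of cloud-init.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if stat.S_ISBLK(os.stat(path).st_mode):
                return True
        except FileNotFoundError:
            pass  # the device node has not been created yet
        if time.monotonic() >= deadline:
            return False  # give up so the caller can continue booting
        time.sleep(interval)
```

A caller would run mkfs.ext4 on /dev/xvdh only if the function returns True, and log a warning and move on otherwise.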
[18:00] <minfrin> smoser: Still waiting for word from AWS support about this, need to work out how widespread the problem is and how it will affect us.
[18:06] <rcj> smoser, thanks.  this is an issue with this particular container running things out of order due to upstart overrides that change emit timing for cloud-init prereqs.
[18:53] <jmccann> I was wondering if someone could help me finish getting a CCA signed so I could try contributing to cloud-init
[18:53] <jmccann> I'm on page http://www.ubuntu.com/legal/contributors/submit and not sure what to fill for "Please add the Canonical Project Manager or contact"
[18:56] <harlowja_at_home> smoser, ^
[18:56] <harlowja_at_home> scott moser i think is said contact still
[18:57] <harlowja_at_home> mr.scott
[18:57] <harlowja_at_home> lol
[18:57] <jmccann> thanks much!
[18:57] <smoser> jmccann, yeah. you can put me that is fine
[18:58] <jmccann> thanks!
[19:54] <rcj> smoser, http://paste.ubuntu.com/15010669/ is what I need in this container environment (or perfectly replicating the FS-related emits) because right now I have a race.
[19:54] <rcj> smoser, but I don't know that is acceptable for Trusty SRU
[20:04] <smoser> rcj, something else is getting broken
[20:06] <rcj> smoser, how so?
[20:08] <smoser> so 'filesystem' should not be emitted until /  is mounted rw
[20:08] <smoser> and / should not be mounted rw until cloud-init is stopped
[20:08] <rcj> but it's a container so when it's started all the filesystems are mounted and ready.  they have a mountall.override in this environment to emit / /run local-filesystems,... filesystems.
[20:09] <rcj> so they all land at the same time
[20:09] <smoser> lxc works fine. this works there.
[20:09] <rcj> smoser, I understand that.
[20:09] <smoser> upstart will not pass 'mounted mountpoint=/' until things that 'start on' that are done
[20:10] <smoser> ie, that's a blocking event.
[20:10] <rcj> smoser, it's not happy on an lx-branded solaris zone
[20:10] <smoser> that make sense?
[20:10] <smoser> can you boot init --verbose or --debug ?
[20:10] <smoser> to /sbin/init ?
[20:10] <rcj> no, I don't have a kernel cmdline to alter
[20:11] <smoser> but how is /sbin/init invoked ?
[20:16] <rcj> smoser, https://github.com/joyent/illumos-joyent/blob/master/usr/src/lib/brand/lx/lx_init/lxinit.c
[20:16] <rcj> Line 831
[20:16] <smoser> well, fun
[20:16] <smoser> you can wrap it
[20:17] <smoser> http://paste.ubuntu.com/15010796/
[20:18] <rcj> yes
[20:59] <tmartins> hey guys... When using cloud-init with OpenStack Heat, is it possible to have 2 "user_data" sections at the same time?
[20:59] <tmartins> Like this: http://pastebin.com/7HiDpjWr ?
[20:59] <tmartins> Or something similar ?
[21:00] <smoser> tmartins, cloud-config-archive is basically that.
[21:01] <smoser> oh. wait... i don't know about heat and if you can actually add the 2 sections there.
[21:01] <smoser> but cloud-config-archive basically allows you to put multiple into one user-data
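As a rough sketch of what a cloud-config-archive might look like for the "create a user, then run a script" case, the snippet below builds one user-data document from two parts. The helper `build_cloud_config_archive`, the `deploy` user, and the shell script body are all hypothetical. It emits the list as JSON on the assumption that cloud-init's YAML loader accepts it, since JSON of this shape is also valid YAML. Whether Heat passes such a document through unchanged is not confirmed here.

```python
import json


def build_cloud_config_archive(parts):
    """Render (mime_type, content) pairs as one #cloud-config-archive document.

    Hypothetical helper for illustration; not part of cloud-init or Heat.
    """
    entries = [
        {"type": mime_type, "content": content}
        for mime_type, content in parts
    ]
    # The header line tells cloud-init how to interpret the rest of the body.
    return "#cloud-config-archive\n" + json.dumps(entries, indent=2) + "\n"


# Example parts (assumed values): a cloud-config that creates a user,
# followed by a shell script that would go on to call Ansible.
user_data = build_cloud_config_archive([
    ("text/cloud-config", "#cloud-config\nusers:\n  - name: deploy\n"),
    ("text/x-shellscript", "#!/bin/sh\necho 'run Ansible here'\n"),
])
```

The resulting `user_data` string would then be supplied wherever a single user-data payload is expected.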
[21:02] <tmartins> Mmm...
[21:02] <tmartins> basically, initially, I'm creating a user, then, I'll run a script, that will call Ansible...
[21:02] <tmartins> simpler -> complex
[21:04] <tmartins> Maybe a mix of: "#cloud-config", which is simpler, then a shell / Ansible, which is more complex...
[21:05] <tmartins> Mmm... I think I know how to do it!
[21:06] <tmartins> The "#cloud-config" supports write_files and runcmd...