#cloud-init 2013-12-30
<harlowja> harmw sean hopefully getting me an image today :-P
<harmw> ah nice harlowja 
<harmw> first things first though, time to fix my broken havana upgrade
<harlowja> ah, good luck :-/
<harmw> god how I hate openstack for being such a pig to debug in case of failures :| especially since it now just works
<harmw> oh, haha
<harmw> ^^
<harlowja> just works??
<harlowja> lol
<harlowja> u so funny
<harlowja> we are working through the havana upgrade as we speak afaik
<harmw> it was bugging me about not being able to find drive.conf, even though i configured force_config_drive=false
<harlowja> saw on the ML that some migration is possibly busted
<harlowja> drive.conf?
<harmw> yeah, whats it called
<harlowja> odd
<harmw> in the instance dir
<harmw> or just 'disk'
<harlowja> ya, thats config drive stuff
<harlowja> *at least disk.config is 
<harmw> yes , indeed
<harmw> and which I told nova to have disabled
<harlowja> ya, :-/
<harlowja> and u restarted nova processes?
<harmw> it also complained about timeouts with neutron, causing bad behaviour as well
<harmw> ofc :)
<harlowja> k, odd
<harmw> well apart from that, it now 'just' works
<harmw> grizzly to havana
<harlowja> ya
<harlowja> i haven't heard anything bad from the guys that are working on that upgrade here
<harlowja> just takes time to make sure :-P
<harlowja> and once u go forward u never go back (sadly)
<harmw> having to do db_sync for nova/cinder/glance/etc. was the only real bug i guess
<harmw> lol, interestingly enough, my newly created instance has a disk.config file :|
<harmw> force_config_drive = false
<harlowja> :-/
<harmw> 2013-12-30 22:49:00.693 25436 INFO nova.virt.libvirt.driver [req-51d97fd0-50b5-4c9c-be1d-c4767df31cf5 30f8cf93273345d1b7dc4e22da05536d 04826ef02a00466da4af5e70ddc67038] [instance: 87b2cf14-bd06-475c-a5a2-293969d31960] Using config drive
<harmw> wtf
<harlowja> ya
<harlowja> odd
<harlowja> https://github.com/openstack/nova/blob/stable/havana/nova/virt/libvirt/driver.py#L2419
<harmw> ah, and here we go again: 2013-12-30 22:49:53.105 25436 TRACE nova.compute.manager [instance: 87b2cf14-bd06-475c-a5a2-293969d31960] ConnectionFailed: Connection to neutron failed: timed out
<harlowja> https://github.com/openstack/nova/blob/stable/havana/nova/virt/configdrive.py#L180
<harlowja> instance.get('config_drive') or CONF.force_config_drive
<harlowja> so maybe your instance thinks it needs the config_drive
<harlowja> and thats overriding force setting
<harmw> hm, well nova boot isn't telling it to use it
<harlowja> ya, someone is setting that i think
<harmw> i was hoping to -not- have to go through source code to get this fixed...  :p
<harlowja> ya
<harlowja> can u paste the debug request logs
<harlowja> https://github.com/openstack/nova/blob/stable/havana/nova/api/openstack/compute/servers.py#L763
<harlowja> my guess is that code is getting activated, which is saving something in the instance
<harlowja> a favorite saying of a person on my team
<harlowja> 'its openstack relaxxxxx'
<harlowja> *long x, lol
<harmw> :p
<harlowja> does seem odd that it isn't turned off though
<harlowja> when booting use --debug and see what the heck nova is sending
<harmw> "name": "dummy", "created": "2013-12-30T22:02:04Z", "tenant_id": "04826ef02a00466da4af5e70ddc67038", "OS-DCF:diskConfig": "MANUAL", "os-extended-volumes:volumes_attached": [{"id": "07c70f2f-9aa5-46fc-ba40-768b1cc412e4"}], "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 0, "config_drive": "", "metadata": {}}}
<harmw> though now its hitting a neutron timeout
<harlowja> odd, ya, "" is falsy in python
<harlowja> its openstack rellaxxxx
<harlowja> lol
<harmw> openstack should invest some time in making stuff like this easier to debug :)
<harmw> ffs, i've just run the build-me-a-new-instance script again, but this time on the controller
<harmw> and voila, no neutron timeout :|
<harmw> difference is fedora workstation with grizzly cmdline tools, controller has havana
<harmw> though its probably just dumb luck
<harmw> its still creating a config_drive though
<harmw> 2013-12-30 23:24:08.190 25862 DEBUG nova.openstack.common.service [-] force_config_drive             = false log_opt_values /usr/lib/python2.6/site-packages/oslo/config/cfg.py:1945
<harmw> from nova-compute startup
<harlowja> :-/
<harlowja> did u try setting no value for force_config_drive
<harlowja> harmw ya, i think i see it
<harlowja> lol
<harlowja> https://github.com/openstack/nova/blob/master/nova/virt/configdrive.py#L46
<harlowja> cfg.StrOpt
<harlowja> haha
<harlowja> so leave it empty, not 'false'
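The root cause here boils down to Python string truthiness; a minimal sketch (variable names are illustrative, not nova's actual code):

```python
# Why force_config_drive = false (in nova.conf) still enabled the config
# drive: the option is a StrOpt, so the value arrives as the *string* "false".
CONF_force_config_drive = "false"  # what oslo.config hands back for a StrOpt

# Any non-empty string is truthy in Python:
assert bool("false") is True
assert bool("") is False

# The check in nova (paraphrased from configdrive.py) is roughly:
instance_config_drive = ""  # what the API stored for this instance
required = instance_config_drive or CONF_force_config_drive
# "false" is truthy, so the config drive gets built anyway:
assert bool(required) is True

# Leaving the option empty gives an empty (falsy) value, which disables it:
CONF_force_config_drive = ""
assert bool(instance_config_drive or CONF_force_config_drive) is False
```

This is exactly why "leave it empty, not 'false'" fixes the behavior.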
 * harlowja not really funny :-/
<harmw> +1 for the ever consistent openstack
<harlowja> def
<harlowja> force_config_drive is a string option, to allow for future behaviors
<harlowja> lol
<harlowja> 'future behaviors'
<harlowja> lol
<harmw> and empty is "" or just really NULL ?
<harlowja> i'd just do
<harlowja> force_config_drive             =
<harmw> yea
<harmw> ok
<harlowja> i guess that would be null
<harlowja> or should be
<harmw> 2013-12-30 23:38:59.843 26681 WARNING nova.virt.disk.api [req-9093c6c8-eca4-4107-9fea-deae1409d02a 30f8cf93273345d1b7dc4e22da05536d 04826ef02a00466da4af5e70ddc67038] Ignoring error injecting data into image ([Errno 2] No such file or directory: '/data/openstack/nova//instances/0f6f6e41-cc37-4bf0-983b-a48dff16c9db/disk')
<harmw> 2013-12-30 23:39:02.071 26681 INFO nova.compute.manager [-] Lifecycle event 0 on VM 0f6f6e41-cc37-4bf0-983b-a48dff16c9db
<harmw> 2013-12-30 23:39:02.388 26681 INFO nova.virt.libvirt.driver [-] [instance: 0f6f6e41-cc37-4bf0-983b-a48dff16c9db] Instance spawned successfully.
<harmw> there we go
<harmw> thanks harlowja 
<harlowja> np
<harmw> +infinity for opensource :)
<harlowja> :)
<harlowja> mostly works some of the time, ha
<harlowja> ^ thats opensource, lol
<harmw> :)
<harmw> a little over 24hours away from nye :)
<harlowja> def
<harlowja> 2014, the year of linux
<harlowja> lol
<harmw> lol
<harlowja> or maybe the year of openstack, idk
<harlowja> woot, got a freebsd to locally run
<harlowja> devfs           1.0K    1.0K      0B   100%    /dev
<harlowja> procfs          4.0K    4.0K      0B   100%    /proc
<harlowja>  /dev/vtbd0p2    9.6G    1.4G    7.5G    16%    
<harlowja> harmw do u have any simple instructions i can follow to make it active, since i'm guessing its not just installing an rpm :-P
#cloud-init 2013-12-31
<harmw> harlowja: 
<harmw> pkg install python27 py27-yaml py27-requests py27-prettytable py27-cheetah py27-boto dmidecode e2fsprogs gpart sudo
<harlowja> thx
<harmw> and a python setup.py build with this one: https://pypi.python.org/packages/source/j/jsonpatch/jsonpatch-0.6.tar.gz
<harmw> then get cloud-init source and setup.py that as well
<harmw> its bedtime here though 
<harmw> :)
<harlowja> np
<harlowja> then some rc.d stuff i guess?
<harmw> ive only run it manually
<harmw> after installing it
<harmw> python setup.py build
<harmw> python setup.py install -O1 --skip-build --root /
<harlowja> k
<harlowja> thx harmw 
<harmw> np
<harlowja> will see how far we can get
#cloud-init 2014-01-02
<harlowja> arg, harmw so close, except need to modify the ufs image and rhel doesn't come with ufs module that seems to work, lol
<SnoFox> No topic ...? I didn't realize that was possible.
<harlowja> seems like even ubuntu creates a RO ufs module, arg
<harmw> harlowja: what do you try to do?
<harlowja> need to put some files in /tmp (jsonpatch...), but instance under libvirt doesn't have networking yet, and durn RH and ubuntu ship read-only ufs modules
<harmw> why doesn't it have networking?
<harlowja> ya, not sure, haven't investigated that much yet :-P
<harmw> i can imagine rc.conf lacking some networking settings, though running dhclient should just give you an adress
<harlowja> kk, let me try
<harlowja> ah, it works
<harlowja> sweet
<harlowja> arg, of course no wget, lol
<harmw> fetch :)
<harmw> and perhaps even curl, not sure though
<harlowja> hmmm
<harlowja> this should let me download jsonpatch and stuff
<harlowja> pita, lol
<harmw> this being fetch?
<harlowja> ah, i got wget installed :-P
<harmw> in usr/local/sbin perhaps?
<harlowja> let me check
<harlowja> crazy bsd, lol
<harlowja> ;)
<harlowja> oh, i guess i can put all the needed files that aren't installed on a fat disk, and then avoid all this stuff, hmmm
<harmw> or you just wget/fetch the jsonpatch and cloudinit tarballs and unzip those ;)
<harlowja> ya ya :)
<harlowja> gonna get this to work :-P
<harmw> harlowja: http://paste.openstack.org/show/59693/ I'm seeing this on occasion, while creating/destroying instances
<harlowja> ah, good ole neutron
<harlowja> :-/
<harmw> any ideas where/how to debug this? since it doesn't always happen 
<harlowja> hmmm
<harlowja> i bet thats your keystone timing out
<harmw> lol, i thought the same thing
<harlowja> how are u setting up keystone?
<harmw> mysql+memcached
<harlowja> multi-process?
<harlowja> 1 process?
<harmw> no, just 1
<harmw> keystone-all
<harlowja> ya, i bet thats causing it
<harmw> hm, is multi-process something new in havana?
<harlowja> nah, its been an ongoing fight for that
<harmw> because this only started since after the upgrade from grizzly
<harlowja> https://www.mail-archive.com/openstack@lists.openstack.org/msg03715.html
<harmw> keystone token-get is slow as well, sometimes it returns in <1s while it can also take an average of 10s
<harlowja> hmmm
<harmw> (I should mention its a rather old Atom cpu though)
<harlowja> do u have a script running that cleans up the token database?
<harmw> keystone is running 100%cpu most of the time
<harmw> my token table is empty
<harlowja> k
<harmw> since memcached takes care of that
<harlowja> right
<harlowja> 100%cpu  not good
<harlowja> http://docs.openstack.org/developer/keystone/apache-httpd.html
<harlowja> u can try that
<harlowja> so much stuff hits keystone that its a good idea to scale it > 1
<harmw> true
<harlowja> its the first thing to fall over really
<harmw> though why didn't this happen with grizzly?
<harlowja> unsure
<harmw> mind you, I'm running a tiny tiny cloud with just 2 nodes 
<harlowja> :-/
<harlowja> and its running at 100%
<harmw> indeed
<harlowja> that seems off :-/
<harlowja> bb, food
<harlowja> keystone logs useful?
<harmw> :)
<harmw> ahno
<harmw> i had verbose+debug on, nothing interesting
<harmw> i think this Atom just sucks bigtime at doing all the PKI stuff, causing it to max at 100
<harmw> bedtime now harlowja, let me know how your fbsd stuff ends
<harlowja> ends, probably never ends, haha :-P
#cloud-init 2014-01-03
<harmw> harlowja_away: :)
<harlowja> let me see how much farther i can get today harmw , nearly there i think
<mbarr> I'm trying to find the code for the puppet module in cloud-init.  part of it is just what variables are available for interpolation into the puppet.conf file..  can someone point me to it?
<mbarr> I've looked in the bzr repo, and don't see anything, but it clearly works...   
<harlowja> mbarr http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/config/cc_puppet.py
<mbarr> and there's nothing about modules in the current docs site.
<harlowja> mainly line 51 and on
<mbarr> Hmm... i wouldn't have expected it inside config.
<harlowja> :)
<mbarr> that's why i didn't find it :)
<harlowja> back in the day i tried to name that differently
<harlowja> *name the folder differently
<harlowja> i failed, haha
<mbarr> ahh, there is only the %f & %i.  got it.
<mbarr> What's interesting is that I'm trying to change the hostname of the system early enough to be picked up by puppet, and it gets cranky.. 
<mbarr> Thus, the code is helpful.
<harlowja> hmmm
<mbarr> i suspect it's starting the puppet instance before the runcmd actually changes the hostname.
<harlowja> are u using the sethostname module?
<harlowja> http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/config/cc_set_hostname.py
<mbarr> That was what i was *just* looking at.
<mbarr> It's not documented at present in the main docs.
<mbarr> so thus.. the code.  it is helpful :)
<harlowja> ah
<harlowja> http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/config/cloud.cfg#L40
<harlowja> runcmd happens after puppet
<mbarr> I could switch it to bootcmd, and it'd be OK, it looks like.
<harlowja> ya, i suspect so
<mbarr> I was figuring i would just read code, and figure it out :)
<harlowja> although set_hostname module runs after bootcmd
<mbarr> or i could just use the set_hostname and it'd be fine too.
<harlowja> so it might get overwritten
<mbarr> preserve_hostname would prevent that, or i could just do the puppet command by hand.
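The pieces mbarr is juggling can be sketched as one cloud-config (the hostname and puppet server are placeholders; %i/%f substitution per cc_puppet.py):

```yaml
#cloud-config
# set_hostname runs early (before the puppet module), unlike runcmd,
# so puppet sees the new name on its first run.
hostname: webserver-prod-01
puppet:
  conf:
    agent:
      server: "puppet.example.com"
      # cc_puppet.py expands %i to the instance id and %f to the fqdn
      certname: "%i.%f"
```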
<mbarr> however, it's annoying to have to edit the defaults file.
<mbarr> So i was looking to see how you dealt with it.
<harlowja> hmmm, i haven't use the puppet stuff personally :)
<mbarr> That's oK.
<mbarr> it's just using the autostart().
<mbarr> I could not bother w/ the init script and just run it, and mostly, it'd be fine.
<harlowja> :)
<mbarr> i just really wanted to see how this thing worked, and seeing the order was actually key :)
<harlowja> def
<mbarr> I'm attempting to give more info to things like logs, so i need my hostnames to have some info as to role, but also still be unique, and auto-scalable...
<mbarr> oh what fun autoscaling is
<mbarr> I end up w/ hostnames like webserver-prod-us-east-1c-i-2828384.ec2.......
<mbarr> thank you, I'm glad to see that the project was included in RHEL 6.4.  That'll make this a fairly standard option now.
<harlowja> def
<mbarr> Thanks again! wonderful job :)
<harlowja> np
<harlowja> ha
<harmw> harlowja: any luck?
<harlowja> ah, haven't had time yet
<harmw> ok
<harmw> harlowja: you work at yahoo, right?
<harlowja> correct
<harmw> you're behind the ads.yahoo.com malware? :p
<harlowja> lol
<harlowja> nope
<harmw> :) what is it you do, software developer?
<harlowja> yuppers, openstack software developer guy
<harmw> cool
<harmw> developer or also responsible for the yahoo cloud running smooth and gentle?
<harlowja> a little of all the things :-P
<harlowja> more developer, less smooth running person
<harlowja> i try to not get involved there
<harmw> haha
<harmw> and yahoo pays you to 'just work on openstack'? 
<harlowja> pretty much
<harmw> thats pretty cool
<harmw> just you or a team of several?
<harlowja> openstack too big for 1 person ;)
<harlowja> so more than 1
<harlowja> less than 50
<harlowja> :-P
<harmw> thats quite a few :)
<harmw> and something similar to keep the yahoo cloud running?
<harmw> and please forgive me for being curious :)
<harlowja> np
<harlowja> well its almost the same team, we have to be pretty well connected with the people running it
<harmw> obviously
<harmw> how large is your deployment anyway?
<harlowja> ah
<harlowja> now u getting into the interesting questions :-P
<harmw> :>
<harlowja> ya, that one i can't easily say
<harlowja> but in the thousands
<harlowja> thats the estimate i can say
<harmw> instances?
<harmw> (vm's)
<harlowja> lets say both to that
<harlowja> lol
<harmw> ok
<harmw> compute nodes?
<harmw> and what kind of storage do you use?
<harmw> something iscsi-ish? or a ceph cluster?
<harlowja> hmmmm
<harlowja> lol
<harlowja> ya, not sure if i can answer that one :-P
<harlowja> *not yet at least*
<harmw> lol
<harmw> because you don't know or because it's a secret :p
<harlowja> well lets say its a WIP
<harlowja> lol
<harlowja> on something, lol
<harmw> lol, do tell :>
<harlowja> ha
<harlowja> someday when i can :-P
<harmw> haha, though by then it's probably on yahoo.com/cloud/solution/bla
<harlowja> link not work :-P
<harmw> :>
<harmw> btw, is the yahoo cloud private or public?
<harmw> since there's no easy reference on yahoo.com
<harlowja> private
<harlowja> http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/yahoo-case-study :-P
<harlowja> with me
<harlowja> lol
<harmw> lol nice
<harmw> you're the bald guy from the still? :p
<harlowja> lol
<harlowja> 2 bald guys are in that video
<harlowja> but yes
<harmw> oh haha
<harlowja> :)
<harlowja> bb food
<harmw> :) nice talk
<harlowja> ha
<harlowja> thx
<harlowja> alright, harmw finally getting back to freebsd
#cloud-init 2014-01-04
<harlowja> harmw nearly there, seanb is gonna work on the rc.d script adjustments
<harlowja> make them work there, and then it should just work
<harlowja> ha
<harmw> harlowja_away: cool
<harmw> i haven't taken the time to work out initscripts, and if seanb is working on that ill keep my hands off :)
#cloud-init 2014-12-30
<vhosakot> Hi cloud-init team, I am seeing bug #1315501 (https://bugs.launchpad.net/cloud-init/+bug/1315501), and cannot access the Ubuntu 14.04 VM. eth0 of the Ubuntu 14.04 VM does not get its IP address when the VM boots. Hence, I cannot ping and SSH into the Ubuntu VM. I have emailed about this issue. Is there a work-around/fix/patch for this bug please?
<vhosakot> could anyone please help me ?
<vhosakot> helloooo :)
<vhosakot> :)
<Akshat> Hi
<Akshat> need some help in cloud-init
<Akshat> I want to understand is it possible to use multiple datasources in a single cloudinit config
<wayne> hi. why isn't there a way to include unit files?
<wayne> i imagine some kind of cloud-init.tar.gz that has the main yaml file and potential included files
<wayne> or even pulling from a repo somewhere
<JayF> You can inject arbitrary files with cloud-init.
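JayF's point about injecting arbitrary files covers the unit-file question; a hedged cloud-config sketch (the unit name, path, and contents are made up for illustration):

```yaml
#cloud-config
# write_files lands arbitrary file content on disk early in boot
write_files:
  - path: /etc/systemd/system/myapp.service
    permissions: '0644'
    content: |
      [Unit]
      Description=Example unit injected via cloud-init
      [Service]
      ExecStart=/usr/bin/true
      [Install]
      WantedBy=multi-user.target
# runcmd runs late enough to enable the unit just written above
runcmd:
  - [ systemctl, enable, --now, myapp.service ]
```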
#cloud-init 2015-01-02
<jmreicha_> Anybody around?  I'm having issues with runcmd on a debian EC2 instance.  The same command works on ubuntu 14.04
<jmreicha_> cloud-init version is 0.7.2 on debian and 0.7.6 on ubuntu, seems to be the only difference I can find so far
<jmreicha_> I think I am having issues with syntax but don't know where it is failing :/
#cloud-init 2016-01-04
<smoser> danielbruno, i'm not certain that you can use the user syntax on 12.04 / 0.6.3
<harlowja> okkkkk so what's been happening with cloud-init
<harlowja> :-P
<harlowja> happy 2016!!!
<harlowja> holy crap i'm getting old
<harlowja> lol
#cloud-init 2016-01-05
<cornfeedhobo> does growpart support btrfs?
<cornfeedhobo> nvm. found it does
#cloud-init 2016-01-06
<rhys> iâve spent a day trying to get this to work, but can anyone tell me why the resolv-conf module does not work on centos 7 in AWS ?
<rhys> iâve notice that systemd is configured to execute init, config, and final all separately. i use the init chunk to install the newest version of cloud-init, and to add the âresolv-confâ module to cloud_config_modules. i then pass it user-data that includes manage-resolv-conf: true, exactly from the examples. all other directives i pass in user-data execute successfully.
<rhys> i turn off the auto-updating of /etc/resolv.conf in /etc/sysconfig/network-scripts/ifcfg-eth0, so that dhclient doesnât override it. but no matter what i do resolv.conf remains untouches.
<rhys> oh. is this not the same cloud-init thats version cloud-init-0.7.6-2.el7 ? openstack and canonical have different cloud-inits? :(
<smatzek> rhys:  I believe the samples are wrong here.  It needs to be manage_resolv_conf note the underscore character.
<rhys> smatzek: also the module is named 'resolv-conf' with a dash yes?
<smatzek> for module names in cloud.cfg they can be either dash or underscore, they get normalized in the code, but for directives that the individual modules look for in userdata, they need to match what the code is looking for.
<smatzek> in this case the code is looking for 'manage_resolv_conf'.   http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/config/cc_resolv_conf.py#L97
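smatzek's distinction between module names and user-data keys can be sketched in a few lines (the helper below is an illustrative reimplementation of the behavior, not cloud-init's actual function):

```python
# Module names from cloud.cfg are canonicalized: dashes become underscores
# and the cc_ prefix is added, so "resolv-conf" and "resolv_conf" both
# resolve to the cc_resolv_conf module.
def canonicalize_module_name(name):
    name = name.replace("-", "_")
    if not name.startswith("cc_"):
        name = "cc_" + name
    return name

assert canonicalize_module_name("resolv-conf") == "cc_resolv_conf"
assert canonicalize_module_name("resolv_conf") == "cc_resolv_conf"

# User-data keys, by contrast, are looked up literally: cc_resolv_conf
# reads cfg.get("manage_resolv_conf"), so a dashed key is silently ignored.
userdata = {"manage-resolv-conf": True}
assert userdata.get("manage_resolv_conf") is None  # the module sees nothing
```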
<rhys> smatzek: welp there was a day of my life gone
<rhys> thank you so much
<smatzek> Odd_Bloke:  are you there?
<smatzek> rhys:  I've opened this bug to try and address the change in the documentation:  https://bugs.launchpad.net/cloud-init/+bug/1531582
<rhys> thanks. i really appreciate it. i should have noticed the disparity
<rhys> because i was using manage_etc_hosts too
<smatzek> the resolv.conf one and the lock_passwd one have sucked many hours of my time and others time in the past.
<rhys> smatzek, the interesting thing is that while its now working, its writing the resolv.conf template raw
<rhys> its not evaluating the template
<rhys> actually thats now true of all modules.
<rhys> like etc/hosts now has 127.0.0.1 {{fqdn}} {{hostname}}
<rhys> in the logs. Jan  6 18:54:48 localhost cloud-init: 2016-01-06 18:54:48,271 - templater.py[WARNING]: Jinja not available as the selected renderer for desired template, reverting to the basic renderer.
<rhys> package build problem. found the bug
<harlowja> smoser did u guys meetup in england yet?
#cloud-init 2016-01-07
<openstackgerrit> Merged openstack/cloud-init: It seems like httppretty 0.8.11 and 0.8.12 are broken  https://review.openstack.org/264464
<smoser> harlowja, no. got cancelled.
<SuperLag> I'm attempting to make a VMware VM that we can use anywhere. Currently "anywhere" means either a vSphere box, or AWS. When I export the VM to OVA, and then import to EC2, keys don't work. I've learned that cloud-init handles the piece of injecting the configuration when you bring up the instance the first time. I'm creating a user data file. I'm just not sure a. if I'm doing it right, and b. where to
<SuperLag> put it / how to make sure the config gets applied, when I start the instance. I'm using RHEL 7.2 Server, btw.
<SuperLag> I've got a start of the user-data file at http://pastebin.com/egBRbR7G
<larsks> SuperLag: amazon has some docs on this at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
<larsks> SuperLag: Also of interest: http://stackoverflow.com/questions/22204001/how-does-ec2-install-the-public-key-from-your-keypair
<smatzek> SuperLag:  heads up, change lock-passwd: false to lock_passwd: false
<smatzek> SuperLag:  https://bugs.launchpad.net/cloud-init/+bug/1531582
<SuperLag> smatzek: thank you
<SuperLag> still going through the config file
<SuperLag> So cloud-init is run once on first boot, right? Do I understand that right? Is there a way to reset its status to "never ran", so I don't have to spin up multiple instances for testing?
<larsks> SuperLag: Sure.  You can just remove /var/lib/cloud/instances/*
<larsks> You can also run individual modules directly, but I always need to look up the syntax for that.
<SuperLag> larsks: so far, I just want to set up a user, and make sure SSH works.
<SuperLag> larsks: couldn't find mkpasswd to do a password hash, on OS X.
<larsks> You could (arguably, should) just rely on ssh keys rather than worrying about passwords.  I don't think I even have mkpasswd on my linux box.
<larsks> For just getting started, you should be able to rely on the existing user, and just use key-based login.  In this case, you don't even need to provide a user-data file.
<SuperLag> that's the goal, no password login, only keys... but I'm afraid to enable it yet, for fear that I'll get locked out (again)
<SuperLag> larsks: I haven't gotten it working, on multiple attempts
<SuperLag> but I wasn't using cloud-init at that point
<SuperLag> so how do I populate keys, if not with the user-data file?
<larsks> You create (or import) ssh keys into amazon, where cloud-init will find them and use them for the default user (which differs by distro...on rhel7, should be "cloud-user").  http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html
<larsks> When you launch an instance, you tell it which keypair to use.
<SuperLag> gah
<SuperLag> I didn't know you could import your own keypair. *sigh*
<SuperLag> I'm going to go bury my head in the sand somewhere.
<SuperLag> Nope. See it's still asking for a password.
<SuperLag> Okay. Confused.
<SuperLag> you said the user for RHEL should be cloud-user
<SuperLag> and *that* works
<SuperLag> and even if I copy the authorized_keys file from ~cloud-user/.ssh/ to ~my-user/.ssh/ it *still* doesn't work
<SuperLag> I don't get it
<SuperLag> Do I need anything more involved than this? http://pastebin.com/3tDptjJC
<SuperLag> Am I leaving something out?
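For comparison, a minimal valid cloud-config for the user-plus-keys case SuperLag describes (the username and key are placeholders; keys as in cloud-init's documented examples):

```yaml
#cloud-config
users:
  - default
  - name: my-user
    sudo: ALL=(ALL) NOPASSWD:ALL
    # lock_passwd (underscore!) disables password login; keys only
    lock_passwd: true
    ssh_authorized_keys:
      - ssh-rsa AAAAB3NzaC1yc2E... my-user@example
```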
#cloud-init 2016-01-08
<bassamt> hi, cloud-init is failing on init for me with the following error UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8d in position 0: invalid start byte
<bassamt> version 0.7.7 from an Ubuntu 15.10 AMI.
<bassamt> any ideas for how I can debug the issue? user-data in AWS does look weird (and not something I can change)
<SuperLag> this is all I have in my cloud.cfg file http://pastebin.com/FrWTnHLf and when I try to SSH, it still asks for a password. What am I doing wrong? :/
<SuperLag> I've changed it to be the stock file with only the stuff I needed included. It still doesn't work.
<SuperLag> I clearly don't have a clue, and I'm at the end of my rope.
<SuperLag> I'd be *happy* to pay someone to help me figure this out.
<xael> hello all, i am Alex from Bigstep, currently working on adding a new datasource for the Bigstep cloud service and to also port our templates into using cloudinit
<Odd_Bloke> xael: Hola!  (I'm Dan Watkins from Canonical. :)
<xael> currently, I am having a problem with changing the password for the root user on a CentOS 6.6 machine running cloud-init 0.7.5
<xael> by sending it in a hashed form, not plaintext
<xael> the way in which i managed to do it was to add an 'hashed_passwd' option in distros/__init__ and use the existing set_password with the hashed parameter set
<Odd_Bloke> xael: Can you pastebin the patch?  (Probably easier for me to grok that way :)
<xael> http://pastebin.com/1j1yzxYB here you go, i pasted the whole function
<Odd_Bloke> xael: That looks pretty sensible. :)
<xael> what do you mean by sensible?:))
<Odd_Bloke> xael: I mean: That looks like a change we should include in cloud-init. ^_^
<xael> :D
<xael> Odd_Bloke: did you manage to look over the datasource I sent you before the holidays?
<Odd_Bloke> xael: I didn't, I think that must have got lost in the shuffle, I'm afraid.
<Odd_Bloke> xael: Could you send it over again?
<xael> of course
<Odd_Bloke> xael: Sorry about that!
<xael> no problem :D
<xael> sent, could you please tell me if you received it?
<Odd_Bloke> xael: Received; thanks. :)
<Odd_Bloke> xael: I doubt I'll be able to get to it today, but I'll try to have a look on Monday. :)
<xael> no rush
<Odd_Bloke> :)
<atonal> Hello, is user-data supposed to merge with stuff from /etc/cloud/cloud.cfg?
<atonal> I have merge_type: 'list(append)+dict(recurse_array)+str()' in the user-data, but merge doesn't seem to happen
<larsks> atonal: what in particular are you trying to merge?
<atonal> larsks: some users
<atonal> If I have two separate configs in the heat template with merge_type, they merge together fine, but stuff from /etc/cloud/cloud.cfg is missing
<atonal> And if I remove the configs from the heat template, I get the stuff from /etc/cloud/cloud.cfg
<smoser> atonal, can you show what you have in user data ?
<larsks> smoser: I think I can replicate the problem.  I'm using this: http://chunk.io/f/6d69899a5f95470a8248256b731848bf
<larsks> smoser: that seems to be *replacing* the users: block in /etc/cloud/cloud.cfg, rather than merging.
<atonal> yes, that's close to what I have
<larsks> (on my test systems, the users: block in cloud.cfg contains only "- default"
<larsks> )
<larsks> atonal: for what it's worth, I've always found the merging support in cloud-init to be confusing at best.  smoser and I have had several discussions where the outcome was "huh, I guess that won't work".
<atonal> Confusing it definitely is.
<smoser> larsks, i agree with you
<smoser> :)
<smoser> the merging is fairly straight forward for dict
<smoser> and generally does the right thing
<smoser> but for lists, its weird. and by default it just replaces lists.
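The default merge semantics smoser describes can be sketched like this (an illustrative reimplementation, not cloud-init's actual merger classes):

```python
# Dicts merge recursively; lists (and scalars) are simply replaced.
def default_merge(base, overlay):
    if isinstance(base, dict) and isinstance(overlay, dict):
        merged = dict(base)
        for key, value in overlay.items():
            merged[key] = default_merge(base[key], value) if key in base else value
        return merged
    return overlay  # a list in user-data clobbers the list in cloud.cfg

cloud_cfg = {"users": ["default"], "ssh_pwauth": False}
user_data = {"users": [{"name": "bob"}]}
merged = default_merge(cloud_cfg, user_data)

# The users: list from cloud.cfg is gone, matching larsks' observation:
assert merged == {"users": [{"name": "bob"}], "ssh_pwauth": False}
```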
<atonal> But in general, merging user data and /etc/cloud/cloud.cfg _should_ work? Or is this some undefined-behavior-land?
<smoser> yeah.. but there are some caveats
<smoser> atonal, for testing you can do this
<smoser> PYTHONPATH=$PWD ./tools/ccfg-merge-debug config/cloud.cfg cc.cfg  > out.1
<smoser> just from trunk. it will output a merged file.
<larsks> atonal: I have never been able to successfully merge something with cloud.cfg, although i was trying to modify things like cloud_init_modules which if I recall are simply parsed too early in the process or something like that.
<smoser> i did have to change the config/cloud.cfg to have '#cloud-config' at the top or it doesn't get considered cloud-config.
<smoser> init_modules is parsed too early
<smoser> but config_modules and final_modules can work..
<larsks> ...although those are lists, like users:, and since the users merge doesn't seem to be working even with list(append), I'm wondering if those will work.  I guess that's a quick test...
 * larsks tests.
<smoser> actually... init_modules should be able to be re-written too.
<larsks> smoser: you and I had looked at something like that in the past and discovered that I think it wouldn't work.  I thought.
<larsks> I'd have to see if I still have the irc logs...
<smoser> yeah. i think there is an issue with merging over builtin configs.
<smoser> but i thought i would have had a bug on it and i don't see it.
<larsks> smoser: so, e.g, this appears to just override cloud_config_modules rather than appending: http://chunk.io/f/f3325b6193234a87a8ac4c1fa9604966
<SuperLag> I am still having issues getting my config to work. I just want to specify a user, and set up specific SSH keys for that user, and have that go into each instance I spin up. My config is at http://pastebin.com/ekjhZAEr.
<SuperLag> Do you need to include every section in your cloud.cfg, and just leave the non-relevant parts blank? or do you only include the parts that you need?
<smoser> larsks, yeah, thats the issue. the merges only apply to config coming in via user-data.
<smoser> which would also be why the users dont work as expected.
<smoser> could you open a bug, just point to your data. i'd like to fix that.
<larsks> smoser: sure.
<larsks> smoser: https://bugs.launchpad.net/cloud-init/+bug/1532234
<smoser> SuperLag, the data you have there is not valid yaml
<smoser> there is probably a WARN message to this effect in /var/log/cloud-init.log
<smoser> but you can easily test yaml just with:
<smoser>  python -c 'import yaml, sys; yaml.load(open(sys.argv[1]).read())' foo.yaml
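Expanded version of smoser's one-liner (safe_load used here; plain yaml.load also works on the cloud-init versions discussed, and the sample documents are made up):

```python
import yaml

good = "#cloud-config\nusers:\n  - default\n"
bad = "users:\n\t- default\n"  # tabs are not valid YAML indentation

yaml.safe_load(good)  # parses fine

try:
    yaml.safe_load(bad)
    parsed = True
except yaml.YAMLError:
    parsed = False

# Invalid user-data like SuperLag's fails at this parse step, which is
# why cloud-init silently applies none of it.
assert parsed is False
```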
<smoser> now, that said, i'm not certain that ssh keys are added to a user if the user is not created. let me check
<smoser> it should.
<smoser> larsks, that is a well-written bug report
<smoser> thank you.
<openstackgerrit> Joshua Harlow proposed openstack/cloud-init: Change stackforge to openstack  https://review.openstack.org/256684
<openstackgerrit> Joshua Harlow proposed openstack/cloud-init: Update .gitreview for new namespace  https://review.openstack.org/236289
<openstackgerrit> Joshua Harlow proposed openstack/cloud-init: Update stackforge to openstack  https://review.openstack.org/237452
<openstackgerrit> Merged openstack/cloud-init: py26 is no longer supported by Infra's CI  https://review.openstack.org/261718
<openstackgerrit> Merged openstack/cloud-init: Deprecated tox -downloadcache option removed  https://review.openstack.org/256694
<openstackgerrit> Merged openstack/cloud-init: Update stackforge to openstack  https://review.openstack.org/237452
#cloud-init 2017-01-03
<powersj> magicalChicken: when are you around this week?
<magicalChicken> powersj: I'm full time this week and next
<powersj> magicalChicken: oh cool :) care if I set something up for tomorrow morning?
<magicalChicken> powersj: sure, tomorrow morning is good
<magicalChicken> I have debian/centos support pretty much finished
<powersj> magicalChicken: sweet!
#cloud-init 2017-01-04
<cpaelzer> rharper: hi
<rharper> cpaelzer: here
<rharper> smoser: should be online in just a few
<rharper> we can three-way sync
<cpaelzer> rharper: why did you start without telling me - and wrote the same fixes as I had :-P
<rharper> heh
<rharper> it was sporadically done during the morning with getting the kids ready
<rharper> so I wasn't "officially" at work yet
<cpaelzer> rharper: fyi the test numbers are not referring to any other doc's test numbers - to explain why I confused you
<rharper> the other thing that changed though was that cloud-init doesn't look for other datasources
<rharper> cpaelzer: I saw that after I read the appendix
<rharper> so I understand now
<rharper> the big change this morning was that we need to specify the datasource_list; otherwise cloud-init will search the "Default" list of sources as well
<rharper> even if we've specified one directly
<cpaelzer> I consider it a special kind of review for each of us that we both decided to write out to datasource_list
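The drop-in both branches converged on can be sketched as below. This is a minimal sketch, not the exact POC file: the drop-in name `90_dpkg.cfg`, the `NoCloud` choice, and the scratch directory standing in for /etc/cloud/cloud.cfg.d are all assumptions.

```shell
# CFG_D stands in for /etc/cloud/cloud.cfg.d; drop-in name and datasource
# choice are illustrative.
CFG_D=$(mktemp -d)
mkdir -p "$CFG_D"                      # the safety-create mentioned above
cat > "$CFG_D/90_dpkg.cfg" <<'EOF'
# Pin cloud-init to a single datasource; without datasource_list the
# "Default" list of sources is searched even when one source is intended.
datasource_list: [ NoCloud ]
EOF
cat "$CFG_D/90_dpkg.cfg"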
<rharper> smoser may show the error of my ways but this way works
<rharper> ahah
<rharper> nice
<cpaelzer> rharper: I really had issues with the missing mkdir
<rharper> sorry
<cpaelzer> rharper: and it makes sense logically
<rharper> that was annoying
<cpaelzer> rharper: to safety-create the dir
<cpaelzer> rharper: so if you could pick up that change that would be nice
<rharper> it should be in my branch
<cpaelzer> ah fine
<cpaelzer> rharper: I didn't check what you pulled/changed yet
<cpaelzer> rharper: about the "report full log" change - I liked it for debugging
<rharper> I'm testing a xenial image with my latest c-i deb in it
<cpaelzer> rharper: what do you think on adding that to your branch as well?
<rharper> yes, that should go in too
<cpaelzer> "The updated branch will be faster, if you look at cloud-init.log the current bits you tested still check for local seeds and such."
<cpaelzer> rharper: is that still true^^ or was that part of the misunderstanding
 * cpaelzer is checking my logs
<rharper> it's not quite straightforward in sh to capture the execution trace of the program
<cpaelzer> true, but that is just to move the output to the log
<rharper> right
<cpaelzer> for the slow/fast question - my logs show it only checking for the one local DS I set in datasource_list
<cpaelzer> anyway I think we concluded on the same set of things and are good now
<rharper> ah, I didn't realize you set _list
<rharper> now it makes sense
<rharper> which is good
<cpaelzer> I did in my branch
<cpaelzer> that is what I meant with "special review"
<cpaelzer> two people coming to the same conclusion
<rharper> I have an appt I have to go to shortly; smoser should be online in a bit;
<rharper> indeed
<cpaelzer> I'm out a bit, then meeting and I should be able to chat while in the meeting
<rharper> I think today is a matter of building the UC16 image with the right bits
<cpaelzer> we will see
<rharper> sure
<cpaelzer> yes I agree to the UC16 building
<smoser> cpaelzer, here.
<cpaelzer> one thing - it might be worth building one with the worst case where cloud-init is enabled and polling all
<smoser> rharper, here.
<rharper> ah, perfect
<cpaelzer> and then the "good one" where we can configure with NoCloud or ConfigDrive
<rharper> let's sync quickly
<smoser> k
<cpaelzer> smoser: howdy - happy NY and such
<cpaelzer> as happy as it can be - you should still be on vac
<smoser> https://hangouts.google.com/hangouts/_/canonical.com/hangout-smoser
<cpaelzer> https://git.launchpad.net/~raharper/cloud-init/
<cpaelzer> +ds-identify
<smoser> cpaelzer, ok... so after looking at changes in rharper branch what should i then look at
<cpaelzer> smoser: ??
<smoser> i'll look at rharper branch, and the changes he made there, understand them... maybe offer some changes.
<smoser> then what
<cpaelzer> smoser: if you don't have huge changes next step is ryan building a UC16 image with it
<cpaelzer> smoser: in that we will try the test cases for the POC show next week
<cpaelzer> smoser: feel free to devel some configs for the use cases we had in mind, but given that this should be a free day for you ... you might as well drop off if you are ok with the current changes
<cpaelzer> smoser: if you want you can do the silly go-hello world comparison in KVM and uncache before starting (echo 3 > /proc/sys/vm/drop_caches)
<cpaelzer> and send the results to the mail thread
<cpaelzer> the next major work on verifying and creating testcases can only start when the custom UC16 image is done by rharper
<smoser> rharper, when do you get back here ?
<rharper> smoser: here
<smoser> k.
<smoser> chat a bit ?
<rharper> yeah
<smoser> https://hangouts.google.com/hangouts/_/canonical.com/hangout-smoser
<smoser> rharper, ^
<smoser> https://git.launchpad.net/~smoser/cloud-init/log/?h=feature/ds-identify
<smoser> http://paste.ubuntu.com/23740449/
<smoser> https://git.launchpad.net/~smoser/cloud-init/commit/?h=feature/ds-identify&id=b092a5cfa167afff57e186a3f28049f4b83077bd
<rharper> http://paste.ubuntu.com/23740492/
<rharper> smoser: I merged your branch and retested, things look good;  I'm going to build a UC16 image with that cloud-init and exercise the UC16 image locally and in OpenStack;  thanks again for coming in today
<cpaelzer> rharper: hey - checking if anything unexpected came up before final EOD for me?
<rharper> only in the snap builds
<rharper> I've got a working cloud-init updated from smoser
<rharper> now just debugging snap builds; but I expect things to be fully working once I work through getting the right cloud-init into a UC16 image
<cpaelzer> rharper: ok, great - continuing tomorrow morning then
<rharper> yep, I'll send an update
<cpaelzer> rharper: touching all kind of wood for your image builds to succeed
<rharper> hehe
<rharper> smoser: around ?  I can't seem to find out why the Z99-cloud-init-locale.test does or does not get included in a build....
#cloud-init 2017-01-05
<cpaelzer> thanks rharper for your mail
<cpaelzer> rharper: I was creating wrappers accordingly to show our user-stories
<cpaelzer> so far working fine
<cpaelzer> I also made some time analysis with cloud-init-analyze on case 2 I wanted to discuss with you
<cpaelzer> Also I'm not sure if we want/need to "show" user story #4, I'd more expect that to be the story that leads over to the discussion on how to stage it into Xenial
<cpaelzer> rharper: now lunch, then creating case 3 but that is essentially 2 on Openstack which should be fine
<cpaelzer> rharper: please ping me once you are around for the discussion on the #2 timings
<cpaelzer> smoser: you would have enjoyed this lunch http://johorkaki.blogspot.com/2015/10/samyang-extremely-spicy-chicken-flavor.html
<cpaelzer> rharper: comparing times on the Openstack based execution is just as non-helpful
<cpaelzer> rharper: I need to discuss if/where I should see something stable enough for the Demo other than the huge ec2 conf timeout
<cpaelzer> that one I have and I also like how we can show off the "snap config and install via cloud-init"
<cpaelzer> but everything I threw at cloud-init analyze so far didn't give me a good showcase for the timing at least
<cpaelzer> on top of that it seems the older ci-enabled image doesn't yet have the output improvements for cloud-init analyze to differentiate the stages more easily
<cpaelzer> rharper: ping me, I can send you data or let you onto my bastion, whatever we need
<cpaelzer> other than that the tech part is ready creating a few raw slides to guide along
<cpaelzer> rharper: smoser: jgrimm: I've made a draft for the ds-identify show and fully automated the demo part of it based on ryan's work
<cpaelzer> rharper: thanks
<cpaelzer> I'll share the draft slide deck so you can review/modify
<cpaelzer> if nothing is fatally broken jgrimm and I will adapt depending what comes up on Monday
<cpaelzer> shared
<cpaelzer> rharper: the only blind spot that is left is my lack of a good case to show the timing - waiting for you to show up
<cpaelzer> maybe I even have all I need but just don't find it impressive enough :-)
<rharper> cpaelzer: not quite in yet, but I think this should capture the delta we need (the first being , no searching, then latter how cloud-init on OpenStack looks without identify)  http://paste.ubuntu.com/23746042/
<cpaelzer> rharper: forgot to reply - I think I see what you mean - gathering that on my demo env now and seeing if it shows something nice
<cpaelzer> hrm
<cpaelzer> rharper: smoser: I found every now and then (about 1/8 of the cases) that the final stage hangs
<cpaelzer> thought it is part of my ssh setup, but now got in and I find it hanging
<cpaelzer> oh I see
<cpaelzer> not "us" but snap it seems
<rharper> cpaelzer: ok, here now
<rharper> yes
<rharper> if you use my user-data
<rharper> it does snap installs
<rharper> which may take some time
<cpaelzer> which I want for some of the cases
<cpaelzer> but, what I mean is that this sometimes hangs
<cpaelzer> like really for minutes
<rharper> snaps do that
<cpaelzer> how unfriendly
<rharper> download, verify, unpack, squash mount, etc
<cpaelzer> is there anything cloud-init should do to unlock that?
<cpaelzer> the process is sleeping
<rharper> apt isn't fast either
<cpaelzer> I really think it is dead some way
<rharper> possible, in general, we've discussed the idea of doing some config modules in parallel but it's a non-trivial problem if one has dependencies between them
<cpaelzer> http://paste.ubuntu.com/23746560/
<cpaelzer> what I want would be a timeout and retry
<cpaelzer> this is now crossing the 10 minute mark - it won't ever succeed
<cpaelzer> (likely)
<rharper> no
<rharper> see the mount
<rharper>  systemctl start snap-part\x2dcython-3.mount
<rharper> that's not us
<cpaelzer> the x2dcython-3.mount one
<rharper> right
<cpaelzer> ok let me rephrase
<rharper> I understand what you suggest;
<cpaelzer> I think it is a snap issue, but should the snap module of cloud-init take care to detect and recover in those cases
<rharper> config modules can certainly have timeouts
<cpaelzer> oh ok
<cpaelzer> sorry for feeling misunderstood then
<rharper> we do in other places (like searching for data sources)
<rharper> no
<rharper> no worries
<cpaelzer> but when scripts start a bunch of them 1/8 hits often enough to feel bad :-)
<rharper> yeah, maybe use different snaps
<cpaelzer> one instead of three should already help to mitigate most I hope
<rharper> yeah
<rharper> hello is going to be fast and easy
<rharper> it's the #1 snap in the store  =)
<cpaelzer> another issue seems to be that if I check /var/lib/cloud/instance/boot-finished too early, that seems to conflict with the final stage (re)starting ssh - that makes the final stage hang as well, it seems
<cpaelzer> which still is part of config snappy I just see
<cpaelzer>  |`->running config-snappy with frequency once-per-instance @02.00000s +912.00000s
<cpaelzer> that is in the analyze after I hard restarted the ssh on the UC16 openstack guest
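One way to sidestep that race is to poll for the marker with a bound instead of probing it once. A hedged sketch: `wait_boot_finished` is a made-up helper name and the timeout is arbitrary; only the marker path /var/lib/cloud/instance/boot-finished comes from the conversation.

```shell
# Hypothetical helper: on a real guest, pass
# /var/lib/cloud/instance/boot-finished as the marker path.
wait_boot_finished() {
    marker=$1
    tries=${2:-150}                       # ~5 minutes at 2s per attempt
    while [ ! -f "$marker" ] && [ "$tries" -gt 0 ]; do
        sleep 2
        tries=$((tries - 1))
    done
    [ -f "$marker" ]                      # succeed only once cloud-init is done
}
```

Polling like this avoids ssh-ing in while the final stage may still be restarting sshd.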
<rharper> strange
<cpaelzer> that is in the case without user-data even
<rharper> not following final stage restarting ssh ?
<cpaelzer> the cloud-analyze output is linear as well right?
<rharper> sure, it's just sorted by event timestamp
<cpaelzer> rharper: http://paste.ubuntu.com/23746681/ line 212
<cpaelzer> the pastebin has three cases with-data, no-data-ds-identify, no-data-old
<rharper> hrm
<rharper> I'm not seeing that huge time
<rharper> which image?  recent or the old one ?
<cpaelzer> old one
<cpaelzer> you think it might just be an old issue?
<rharper> possibly
<rharper> it certainly has older packages
<rharper> which may get refreshed/updated
<rharper> which could block time
<rharper> let me upload a newer  image without the updated cloud-init
<rharper> so it's more apples to apples w.r.t boot time
<cpaelzer> no hurry ping me or write mail once you rebuilt one - so I can exchange the image I use
<cpaelzer> thanks for the "policy can be overridden" statement - missed to add that
<cpaelzer> rharper: FYI - I have all the timing data if discussion comes up, but I think there is nothing with enough "bang" in it to get a slide
<cpaelzer> I can pull them out whenever needed - or not if not
<cpaelzer> too much data without the need could lead to deep dives on unimportant numbers
<rharper> well, I think a 10x factor in time to run cloud-init-local is nice enough
<rharper> the biggest win will be in the ec2 image (we don't have those times) simply because ec2 runs *last*
<cpaelzer> of course
<cpaelzer> just at least on openstack old and new image both are rather fast
<cpaelzer> or my timing granularity is bad
<cpaelzer> it's basically 00.0000 for init-local and init-network on OLD
<rharper> well, you need to use the journalctl method to get subsecond resolution
<cpaelzer> I had both
<cpaelzer> ah yeah
<cpaelzer> it is so small
<cpaelzer> fine
<rharper> I suggest trying again with the journalctl -o short-precise -u cloud-init-local | cloudinit-analyze show -i -
<cpaelzer> do I miss something, or is init-network not in the journal method?
<cpaelzer> I already have the journal data rharper
<rharper> I just did one unit
<cpaelzer> 00.56115 -> 00.08126
<rharper> you can string them all
<rharper> in total time it's small
<cpaelzer> I'll do so
<rharper> but it's a rather huge reduction
<rharper> -u cloud-init-local -u cloud-init -u cloud-config -u cloud-final
<rharper> are the unit names to append to the journalctl command
<cpaelzer> now that I spent the automation effort for all showcases it is a minor change to get it with that on top :-)
<rharper> that'll get you all stages
<rharper> k
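Strung together, the per-unit calls above become one pipeline. A sketch: it composes the command rather than running it, so it also works off-guest; on the instance you would drop the echo and execute the pipeline directly (this assumes the `cloudinit-analyze` helper from the branch under test; `-o short-precise` is journalctl's sub-second output mode).

```shell
# All four cloud-init stages in one journalctl invocation, fed to the analyzer.
UNITS="-u cloud-init-local -u cloud-init -u cloud-config -u cloud-final"
CMD="journalctl -o short-precise $UNITS | cloudinit-analyze show -i -"
echo "$CMD"
```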
<rharper> for the images using config-drive, the change will be small, ones using Openstack datasource, will be larger win as it probes local + other clouds then OpenStack datasource
<rharper> so the further from the "head" of the list, the greater the win w.r.t time reduction;  and you're right, it's not huge in terms of wallclock time
<rharper> but the improvement scales with the speed of the system
<cpaelzer> ok, then I'm good
<rharper> a follow up is that the POC only prevents cloud-init from probing other sources
<cpaelzer> it means I read the data correctly and it isn't impressive until underlining the sweet spots :-)
<rharper> we will further have cloud-init skip the specific datasource probe as well
<rharper> yes, that's true
<cpaelzer> but - as we said when you showed cloud-init analyze - no matter if big or small, having the data is the important point for the discussions
<rharper> yep
<powersj> magicalChicken: https://paste.ubuntu.com/23747530/
<magicalChicken> powersj: needs --upgrade
<magicalChicken> I had meant to do that yesterday, I'll get that done now
<magicalChicken> I added a config flag to automatically do --upgrade on linuxcontainers.org images
<magicalChicken> since they don't ship with cloud-init
<powersj> ok
<powersj> let me know when I can pull again (no rush) and I can continue playing with it :)
<magicalChicken> powersj: sure, should have that fixed in just a bit
<aixtools> o/
<aixtools> before I get started again - a git question. How do I update 'my' copy - aka aixtools (aixtools        ssh://aixtools@git.launchpad.net/~aixtools/cloud-init (push)) - from origin  (https://git.launchpad.net/cloud-init (fetch))
<nacc> aixtools: do you have changes in your branch relative to what you currently have from cloud-init origin?
<magicalChicken> aixtools: you can add the origin as a remote and pull
<nacc> and then you'd rebase your branch(es) onto the updated origin/master, normally
<nacc> it depends on your workflow, and what, if any changes, you have locally
<aixtools> i have a separate branch I am working on (an AIX port I hope); I have 'aixtools' that is my clone of master.
<aixtools> what I would like to do is: 1) get my launchpad "clone" up to date; 2) use that to update (i think fetch?) my 'local' copy of 'master';
<nacc> aixtools: sorry, can you pastebin the output of `git remote; git branch; git branch -r` ?
<aixtools> finally,. 3:
<aixtools> update/merge the changes of the current status into my 'changes' for aix-port
<aixtools> moment
<aixtools> http://pastebin.com/8wfzSqTq
<nacc> aixtools: ok, so this is how *I* would do it, you can choose to take/leave what you want :)
<aixtools> i am a noob - I shall live and learn :)
<nacc> aixtools: what i would first do (for your own sanity/helpfulness) is make sure that your ~/.gitconfig has:
<nacc> [log]
<nacc>       decorate = short
<nacc> that will put in `git log` output, things like tags and head names if they are in the history
<nacc> if you do that, in your current branch (aix-port), `git log` should indicate that an ancestor of HEAD is origin/master (I'm guessing, presuming you have not fetched yet)
<nacc> so the way to update your tree would be then:
<nacc> 1) git fetch origin
<nacc> 2) git rebase -i origin/master
<nacc> iiuc, (it does depend on how many commits you have relative to the old upstream you were using), this will present you with an $EDITOR window, which allows you to specify how to treat the commits that are to be carried forward
<aixtools> did that also update my copy i.e., ssh://aixtools@git.launchpad.net/~aixtools/cloud-init , or is that an update of my local disks?
<aixtools> (step 1, that is)
<nacc> just local
<aixtools> and step 2 - starting now...
<nacc> all step 1) did was to fetch from the 'origin' remote any branches and commits
<nacc> so then 3) would be `git push aixtools aix-port` (presuming your remote branch is also called aix-port, if it's master there, you would say aix-port:master)
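The three steps can be played out end to end in a throwaway pair of repos. Only the branch name `aix-port` comes from this conversation; the repo names, identities, and commit messages are invented, and the final force-push is left commented since it needs the real `aixtools` remote.

```shell
set -e
# Scratch 'upstream' plus a clone of it, standing in for Launchpad and the
# local checkout.
tmp=$(mktemp -d) && cd "$tmp"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream -c user.email=u@x -c user.name=u commit -q --allow-empty -m "upstream base"
git clone -q upstream work && cd work
git checkout -q -b aix-port
git -c user.email=m@x -c user.name=m commit -q --allow-empty -m "aix port work"
# meanwhile upstream moves on, leaving aix-port based on an old tip
git -C ../upstream -c user.email=u@x -c user.name=u commit -q --allow-empty -m "upstream moves on"
git fetch -q origin                                        # 1) update remote-tracking refs only
git -c user.email=m@x -c user.name=m rebase -q origin/master  # 2) replay our commits on the new tip
# 3) git push -f aixtools aix-port   # force, since the history was rewritten
```

After the rebase, `git log` shows the aix work sitting on top of the new upstream tip, which is exactly the "remember, fast-forward, replay" sequence described above.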
<aixtools> step 2 - what is that 'trying' to do. looks like my changes coming into 'master' which I do not want.
<nacc> aixtools: look at the pictures in `man git-rebase`
<aixtools> i want to move 'master' into aix-port and/or see conflicts, so I can review them
<nacc> basically you have a topic branch (aix-port)
<nacc> oh
<nacc> wait, what?
<aixtools> maybe my thought process is 'wrong',
<aixtools> yes - topic branch aka aix-port
<nacc> so git is just storing a DAG, right?
<nacc> directed acyclic graph
<aixtools> DAG was shorter, still do not know the fancy verb
<aixtools> adjective i should say
<nacc> ok, let's gloss it for now
<aixtools> hence, i wanted my 'local' master to be equal to the project master.
<nacc> if you look at the first example in `man git-rebase` (around line 68)
<nacc> aixtools: right, but you don't *need* that at all for git
<aixtools> git is turning into the new trick this old dog cannot learn
<aixtools> i read somewhere git is keeping three copies: the 'master', a local copy of the master, and then the local changes
<nacc> aixtools: i mean, yes your repository's master branch can track origin/master, but you also already have origin/master :)
<nacc> i feel like this would be way faster to explain on the phone :)
<nacc> but let me keep going
<aixtools> my thought is to keep my topic-branch as close to master as I can.
<nacc> yep
<nacc> that's smart
<nacc> you do that with regular rebases
<aixtools> ok, so all i have done now is step 1, step 2 I aborted.
<nacc> aixtools: do you have hangouts? i can explain this quickly if you have the time?
<aixtools> for the rebase - would I go back to my branch and then execute 'rebase'
<aixtools> (imho - git has a lot of features - I will someday see the benefit(s) - but for now they just confuse.)
<nacc> right, i didn't realize you had left your branch
<nacc> as your output before was that you were on the aix-port branch
<nacc> and `git fetch` doesn't move your checked out state at all
<aixtools> well, I did git checkout master before starting irc
<nacc> ah
<nacc> don't do that :)
<nacc> wasn't in my list of steps :)
<aixtools> no, it was in someone elses list (long-live google)
<aixtools> seemed to be the way to prepare for a merge
<aixtools> which is what I thought I needed to do
<nacc> so, here's my opinion
<nacc> you have no need for a local master branch
<aixtools> nods
<nacc> it is of no use to you, as you're always doing topic branches forked from upstream's master
<nacc> so let's just ignore the local master :)
<aixtools> :)
<nacc> you can delete it, but git will complain sometimes, so it's easier to leave it around, but ignore it
<nacc> so we're going to only work on your aix-port branch
<nacc> (git checkout aix-port)
<nacc> we're going to run the rebase step here, which is basically telling git (long typing to follow)
<nacc> git rebase -i origin/master
<nacc> (implicitly the commit to rebase is HEAD)
<nacc> I was based off something in the history of origin/master
<nacc> but now origin/master has moved on without me
<nacc> I want git to 'remember' all the stuff that i've done from that historical fork-point (called the merge-base) and save it
<nacc> then I want to fast-forward the branch I'm on (which HEAD is on) to the updated state of origin/master
<nacc> and then I want to replay the 'remembered' stuff, as new commits
<aixtools> so, just save the file with all the 'picks' in it.
<nacc> aixtools: in your case, yeah
<nacc> aixtools: as you don't want to drop anything
<aixtools> not yet :)
<aixtools> Successfully rebased and updated refs/heads/aix-port.
<nacc> aixtools: you can, in the future, probably, drop the -i. I like to always see what git-rebase is going to do, but your case should be a quick rebase each time, esp. if you do it often
<aixtools> On branch aix-port
<aixtools> Your branch and 'aixtools/aix-port' have diverged,
<aixtools> and have 9 and 10 different commits each, respectively.
<nacc> right
<aixtools> so, rather than the pull suggested, I would do a push?
<aixtools> to put the local copy on the server?
<nacc> yep, i said that earlier, `git push aixtools aix-port`
<nacc> now, if i had to guess, that will complain saying it's not a fast forward
<aixtools> forgot that... my apologies
<nacc> np!
<aixtools> few lines too many... http://pastebin.com/289U3mCR
<nacc> non-fast-forward is the important bit
<nacc> so the reasoning here is
<nacc> imagine someone was using your branch as the basis for their work
<nacc> they, just like you did, want to be able to do a `git fetch origin` (except their origin is your repository)
<nacc> and have it make sense and be a linear history
<nacc> but you just 'moved' your history by rebasing it
<nacc> so for your topic branch, it's relatively likely you'll need to tell `git-push` the '-f' flag (to force), *if* you know your local branch is correct
<aixtools> so, just add -f
<nacc> after verifying you want your local aix-port branch to be what is on the server
<nacc> (by looking at git-log, diffing against origin/master, etc)
<aixtools> well, as I am working solo - it is either a mess and I get to start over again, or it is okay.
<aixtools> I'll vote (read hope) for the latter.
<nacc> ack, it's not a big deal for topic branches that are one-offs
<aixtools> Total 56 (delta 40), reused 0 (delta 0)
<aixtools> To ssh://git.launchpad.net/~aixtools/cloud-init
<aixtools>  + f1dee34...be633b8 aix-port -> aix-port (forced update)
<nacc> it's a bigger deal for origins, masters, etc
<aixtools> So, maybe even a good way to learn the ropes.
<nacc> note that until you do a MR, there's not really even a reason to push
<nacc> except if you develop in multiple places, or if you are worried about your system dying
<nacc> there's also not a reason *not* to push, admittedly
<aixtools> thanks very much - the boss (wife) called. time to go...
<aixtools> well, I am also trying to learn git.
<aixtools> later, or tomorrow. thx.
<nacc> aixtools: np! i'll be around
<aixtools> afk
<magicalChicken> powersj: I got the setup_overrides working so lxd images can force --upgrade even if it isn't specified
<magicalChicken> powersj: so 'run -n xenial' should work now without upgrade
<powersj> magicalChicken: tests appear to be running, thank you!
<powersj> magicalChicken: I just got the too many open files on my laptop
<magicalChicken> powersj: what image was it on?
<powersj> python3 -m tests.cloud_tests run -v -n xenial
<powersj> didn't think you mentioned that with the ubuntu image
<magicalChicken> powersj: i've never seen it with xenial
<powersj> https://paste.ubuntu.com/23748172/
<magicalChicken> i only get it on centos 7/6.6, debian wheezy, and ubuntu precise
<magicalChicken> powersj: you're getting it in a different place too
<magicalChicken> i've always seen the stacktrace come from inside pylxd
<magicalChicken> not sure what's going on yet, must be resources leaking somehow
<rharper> something is leaking file descriptors, likely the exec bits in pylxd ?
<magicalChicken> yeah, it must be something like that
<magicalChicken> this only started after the switch to pylxd 2.2 i think
<magicalChicken> i'm going to make a branch with centos/debian support but using pylxd 2.1, see if it happens there
<rharper> https://bugs.launchpad.net/juju/+bug/1602192
<rharper> Apparently the relevant limit is /proc/sys/fs/inotify/max_user_instances. This is "128" by default. When increasing it with
<rharper>    sudo sysctl fs.inotify.max_user_instances=256
<rharper> looks to be the tl;dr
<rharper> https://github.com/lxc/lxd/blob/master/doc/production-setup.md
<rharper> that looks helpful here
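The tl;dr above as commands. The read is live; the privileged writes are shown commented since they need root, and the drop-in file name is illustrative (256 is the value from the bug report, not a magic number).

```shell
# Read the current per-user inotify instance limit (default 128 on Linux).
cur=$(cat /proc/sys/fs/inotify/max_user_instances)
echo "fs.inotify.max_user_instances = $cur"
# Runtime bump, and a persistent drop-in for e.g. a jenkins runner:
# sudo sysctl fs.inotify.max_user_instances=256
# echo 'fs.inotify.max_user_instances = 256' | sudo tee /etc/sysctl.d/90-lxd-tests.conf
```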
<magicalChicken> that definitely looks related
<magicalChicken> from the last comment on there, it sounds like the init system being used by the container has an effect on whether or not the bug occurs
<magicalChicken> which could explain why i'm only seeing it with some distros
<rharper> for sure
<rharper> systemd new enough spawns cgroups per unit
<rharper> including inotify watches on each one
<magicalChicken> aah
<magicalChicken> that make sense
<rharper> so, pre-systemd like 2XX or something like that isn't affected
<magicalChicken> I'm hesitant about just bumping the limit though
<magicalChicken> because it may still hit the limit eventually
<rharper> but there's nothing to do about that other than wait
<rharper> right?
<magicalChicken> Maybe minimizing the calls to execute() would help, as right now it polls for the system being up using execute()
<rharper> it's a global limit
<rharper> well, sounds like systemd execution in the guest is consuming them
<rharper> not exec (I was thinking of leaking fd's, we saw that like more than a year ago)
<magicalChicken> oh, yeah if its systemd itself that's using all the fd then there's not much to help
<rharper> right, other than watching the global limit and raising it before running
<rharper> we can raise it up, watch the count during a run
<rharper> and see how close we get
<rharper> and then in jenkins (at least) ensure we run with a limit high enough to handle things with some head room
<magicalChicken> yeah, that should work
<magicalChicken> hopefully there'll be either systemd or kernel changes to fix this eventually though
<rharper> there's no fix
<rharper> it's just a global resource that's being used
<rharper> sorta like file descriptors; if you make that many opens, it has to be tracked;
<powersj> magicalChicken: also https://github.com/lxc/pylxd/issues/209
<rharper> systemd is a heavy inotify user
<powersj> and 211 from the link at the bottom
<rharper> ha!
<rharper> double whammy
<magicalChicken> yay :)
<rharper> time to poke  rockstar in #lxd
<magicalChicken> well, that one is fixable
<magicalChicken> haha yeah
<magicalChicken> temp fix in the test suite is to go back to polling inside the instance with a single call to execute() though
<rharper> we could even test it
<rharper> either way, for sure
<magicalChicken> funny enough, I changed to doing it in python right after the pylxd 2.2 switch
<rharper> hehe
<magicalChicken> yeah, doing that switch should show which bug broke us
<magicalChicken> or it could be both too :)
<rharper> well, I suspect the exec fd is the big one, but it was hard to find without systemd consuming a bunch of stuff too
<magicalChicken> the systemd limit is still an issue without exec though too
<magicalChicken> because ideally I had wanted to get all of this going in parallel
<rharper> yeah, but I think raising the limit on a jenkins instance is reasonable, and we can add that to testsuite docs
<rharper> yeah =)
<magicalChicken> yeah, that makes sense
<erick3k> hi
<erick3k> anyone here?
<erick3k> or everyone idling?
<rharper> best to just ask, folks will answer when they can
<erick3k> oh nice
<erick3k> just making sure there is ppl
<erick3k> i go to other rooms and they are full, yet it is like there is nobody
<rharper> happens here too
<rharper> =)
<erick3k> oh
<erick3k> hehe
<erick3k> question
<erick3k> i am trying to expire root password after launch
<erick3k> is it possible to do it within the vm?
<erick3k> in cloud.cfg?
<erick3k> although this doesn't work: # System and/or distro specific settings # (not accessible to handlers/transforms) system_info:    # This will affect which distro class gets used    distro: ubuntu    # Default user name + that default users groups (if added/used)    default_user:      name: root      lock_passwd: false 	 chpasswd:       expire: True
<rharper> let's look at the user config
<erick3k> something like that?
<erick3k> like the cloud.cfg?
<rharper> http://cloudinit.readthedocs.io/en/latest/topics/modules.html#users-and-groups   shows you can set expire to a date;
<rharper> lemme look at the code to see what get's passed around
<rharper> another option is to use the chage command as a run_cmd
<erick3k> sorry rharper am not familiar with terms
<erick3k> by user config you mean cloud.cfg on /etc/cloud?
<rharper> there's a linux command called 'chage'
<rharper>  /etc/cloud is the default config, typically one passes in user-data into the instance in addition to the default
<rharper> at least on debian/ubuntu, there's no root password set, so nothing to expire
<erick3k> problem is i would like to do it within the vm
<erick3k> instead of passing it through user-data
<erick3k> i did set a password
<rharper> and your image already has a password set for root, right
<erick3k> but upon launching i want it to expire so customers have to reset it
<rharper> if you're authoring the image
<rharper> then when you set the root password, you can use the chage command to expire it immediately
<rharper> rather than doing it in cloud-init (which is just going to run the 'chage' command anyhow)
<erick3k> i tried both
<erick3k> passwd --expire root and  chage -d 0 root
<erick3k> before powering off and running cloud-init
<erick3k> but doesn't work
<rharper> you want to look at the 'mount-image-callback' command; this lets you run commands inside the filesystem of the vm
<erick3k> it applies whatever password you set but upon login it is not expired
<rharper> you can use chage --list root to see what got set
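Put together, the image-prep steps being discussed look roughly like this. A sketch only: 'changeme' is a placeholder password, `expire_root` is a made-up helper name, and on a real image these commands run as root inside the image (e.g. via mount-image-callback), so the call is guarded here.

```shell
# Set a root password and expire it immediately so the first console login
# forces a reset.
expire_root() {
    echo 'root:changeme' | chpasswd   # placeholder password, set via chpasswd
    chage -d 0 root                   # expire now; passwd --expire root also works
    chage --list root                 # verify what got set
}
if [ "$(id -u)" -eq 0 ]; then
    expire_root
else
    echo "skipped: run as root inside the image chroot"
fi
```

Note the caveat that follows in the log: sshd will refuse an expired-password root login, so this flow really only forces the reset on the console.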
<erick3k> ok gonna try that again
<erick3k> ummm
<erick3k> rharper it worked
<erick3k> but
<erick3k> i get this https://i.imgur.com/xPgaxcO.png
<erick3k> until i change the expired password i can't ssh
<rharper> are you trying to ssh in as root ?
<erick3k> yes
<erick3k> i had to login through the vm console and set the expired password
<rharper> yeah; I've never set a root password or forced it to expire; I suspect there's something at play with sshd config with root
<rharper> sorry I'm not more help here
<erick3k> thanks for trying
<rharper> http://askubuntu.com/questions/427153/change-expired-password-via-ssh
<rharper> maybe not; -1 disables expiration
<erick3k> quick question
<erick3k> does the ubuntu user needs to exist for cloud-init to run?
<rharper> no
<rharper> but many modules expect there to be a default non-root user
<rharper> so, things like 'add ssh keys' only work if you use the default config (which supplies a non-root user for your distro type)
<erick3k> is it safe to delete the ubuntu directory in /home?
<erick3k> ubuntu home user directory
<rharper> only affects the ubuntu user
#cloud-init 2017-01-06
<erick3k> can someone help me with a problem
<jgrimm> rharper, https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1645644
<rharper> jgrimm: hrm, so pre-existing ntp.conf .. when dpkg installs, it will divert the package conf to a new file (whereas cloud-init moves the original out of the way and replaces).   Do you suppose we should do the same here?
<jgrimm> not sure, i just wasn't sure if you'd noticed that bug yet
<rharper> I had not
<rharper> so, they're clearly adding 'ntp' key to trigger the install
<rharper> but not using the existing pool or server list
<rharper> so, not quite sure why;  though possible that they want to tweak things...
<jgrimm> indeed
<powersj> magicalChicken: what was the decision re: to many files open?
<magicalChicken> powersj: It was caused by polling for the system to be up using lots of calls to execute()
<magicalChicken> it went away once i switched back to polling in the instance
<powersj> ah ok so you are going to go the route of updating your code
<magicalChicken> there's a new bug in centos with setting hostname, but it looks like its a cloud-init issue
<magicalChicken> powersj: yeah, it is a bug in pylxd most likely, but the fix in the test suite is simple, so no sense in waiting until it can be fixed in pylxd
<powersj> ok!
<magicalChicken> I've pushed the fix
<powersj> is the fix committed such that I can keep testing?
<powersj> ah :)
<powersj> thank you!
<magicalChicken> The centos tests will still fail due to hostnamectl timing out when cloud-init tries to run cc_set_hostname though
<vans163> hello. Is there a general way to disable cloud-init after first init?
<vans163> Im having a problem where if I take a snapshot of a cloud disk, boot up another Vm with it, it freezes upon boot because cloud init is looking for things again
<rharper> if you're booting a snapshot with the same UUID/instance-id, then it won't attempt to re-init; but if you change the instance-id then it will try to run again.  That's by design as you don't want the same ssh keys and other generated data to be the same between different instances.
<rharper> if you want, you can append cloud-init=disabled to the kernel command line,  or if you modify your image, you can touch /etc/cloud/cloud-init.disabled which will prevent cloud-init from running
<vans163> rharper: is there something without touching the system much.  If I understood correctly I can add  touch /etc/cloud/cloud-init.disabled  to the end of my cloud-init script?
<vans163> i saw tutorials that say to uninstall cloud-init, but for different distros the command will be different so this is not as portable
<vans163> rharper: im booting the snapshot WITHOUT attaching the ISO again
<vans163> im using the ISO method to use cloud init
<vans163> so I launch the fedora_24_cloud.raw disk image with a cloud-init.iso attached.  Everythings fine.  now I snapshot the .raw disk, and spin up a new VM with it.  WITHOUT the cloud-init.iso
<vans163> and now it hangs indefinitely
<rharper> and if you boot your image without a cdrom in the first place, it will likely do the same (it's looking for a datasource) and takes quite a while to timeout;  I suspect, somehow the instance_id in your second boot is different than the first;  the cloud-init.log in the second boot will confirm;  you should be able to just wait it out (like 10 or so minutes) or even just boot the snapshot image with the cdrom
<rharper> bbiab
<vans163> rharper: no cdrom attached yea.  Hum.. okay i need to try touching the cloud-init.disabled
<vans163> yea no go
<vans163> touching that does nothing
<vans163> its hanging at boot
<vans163> this is fedora24 cloud image
<rharper> do you see any boot messages to serial console ?
<rharper> I usually do a  "-serial telnet:localhost:2446:,nowait,server"  to my qemu command, then when you launch it, you can telnet localhost 2446 and see boot messages and interact with serial console of the guest
<vans163> rharper: humm i had VNC opened and I see nothing except the first line of the boot
<rharper> if the image isn't booting properly, maybe its not cloud-init related
<vans163> image boots fine if I attach the cloud-init.iso to it
<vans163> i think its just looking for the iso/trying to connect to EC2 ip
<vans163> i need to find how to tell it to not do that
<vans163> (once its booted the first time in its life)
<vans163> yum erase cloud-init at the end of the config script seems to be the solution but this is not portable
<vans163> (havent tried it, just read blog about it)
<rharper> well, it will eventually timeout trying to reach EC2; having the cloud-init.log of the second boot can help understand what's going on;
<vans163> hum let me get that then sec
<vans163> well not sec.. 10 min :po
<rharper> hehe
<rharper> yeah, it's a long timeout
<vans163> brb going to store
<vans163> not so slow actually
<vans163> got dressed and its done
<vans163> it spent 2 minutes trying to reach /latest/meta-data/instance-id on the gateway IP DHCP gave it
<vans163> maybe more
<vans163> DataSourceEc2.py[CRITICAL]: Giving up on md from ['http://169.254.169.254/2009-04-04/meta-data/instance-id'] after 21630 seconds
<vans163> first that, then trying to reach the gateway ip 2 minutes, then it booted
<rharper> yeah, I was interested in /var/log/cloud-init.log   and /var/lib/cloud/instances/
<rharper> trying to recreate here
<vans163> cloud-init.log is empty
<rharper> that seems very wrong
<vans163> (i added the touch .../.disabled)
<rharper> oh
<rharper> the first boot one should have run though
<rharper> and it wouldn't have tried ec2 if it was disabled
<vans163> instances has 1 UUID folder, and iid-datasource-none
<rharper> it's like it never ran at all
<rharper> your cdrom image would have included a meta-data file with an instance-id in it
<rharper> that should be in /var/lib/cloud/instances/<instance id>
<rharper> from the first boot
<vans163> df -h shows me  /dev/vda1        50G  446M   47G  so the drive was expanded
<rharper> if that's not there, then your snapshot isn't getting captured
<vans163> or..
<rharper> yeah, cloud-init does that
<vans163> yea it ran right the first time, because the SSHkeys i set are there
<vans163> then i powerdown the instance + snapshot the disk
<rharper> maybe I don't know what the fedora cloud-init is doing but the instance dir should be persistent
<vans163> so basically the same thing as booting the instance again without the cloud-init.iso cdrom attached
<rharper> right but it's not recording the instance data where cloud-init looks (at least on ubuntu)
<rharper> inside the instance dir, includes a pickled datasource object that it re-uses on subsequent boots
<vans163> obj.pkl is there i see it
<rharper>  then you have to have a /var/lib/cloud/instances/  ...
<vans163> yes instances folder has 2 folders in it
<vans163> 1 is a UUID folder, the other is iid-datasource-none
<rharper> ok, that makes sense
<rharper> what I suspect is that without a way to specify the instance-id to cloud-init, it won't know to use the previous uuid as a datasource
<rharper> I'll confirm locally
<vans163> something strange I see my cloud-config is truncated
<vans163> cloud-config.txt in that uuid folder is missing a few things
<vans163> ... it shows at the end
<vans163> let me show you sec in gist
<rharper> so unless we have a way to persist the instance-id (via a permanent datasource) I don't think cloud-init can know that it should use it;
<rharper> you may want to look at using the nocloud seed (write out your user-data/metadata to /var/lib/cloud/seed/nocloud-net dir in the image itself)
<vans163> well upon the snapshot boot, it should ignore cloud init
<vans163> as the instance is already configured
<rharper> that's not how cloud-init works
<vans163> ah
<rharper> it only knows it's configured if it finds that it's the same instance
<rharper> in real clouds, the meta-data provides the instance-id;  that's ephemeral , each time you launch a new vm, it's a new instance
<vans163> and the ISO or webserver it tries to query tells it the id
<rharper> it had a meta-data file
<rharper> you can embed the same config from the iso in a dir inside the image
<rharper> that will achieve what you want
<vans163> i think that is exactly what I need
<vans163> thanks for explaining
<rharper> http://cloudinit.readthedocs.io/en/latest/topics/datasources/nocloud.html
<rharper> sure
<vans163> but I have this 1 weird thing
<rharper> that link shows it via iso, but if you just put the contents of the iso at /var/lib/cloud/seed/nocloud-net/{user-data,meta-data}
<vans163> that solves a lot of problems yea
<vans163> https://gist.github.com/anonymous/5e47847afa56cdd01f88ef0d27aa62a4  this gist, the top is the cloud-config i passed the ISO,  the bottom is what was in the instances  /var/lib/cloud/instances/d2ffab6d-7ac9-423a-bc8a-023402a2d7a0/cloud-config.txt
<vans163> why are there so many spaces in the root password, guessing its fine tho?
<vans163> brb
<rharper> not sure; possible your original cloud-config has trailing white space ?
#cloud-init 2017-01-07
<vans163> rharper: only that line break. going to play with it.  the password works fine but other distros like ubuntu fail
<vans163> fedora works great, same config ubuntu fails
#cloud-init 2018-01-03
<Odd_Bloke> smoser: rharper: Does https://bugs.launchpad.net/cloud-images/+bug/1740176 look at all familiar as a cloud-init bug?
<ubot5> Launchpad bug 1740176 in cloud-images "Disksize is only 2GB, expected in 10GB" [Undecided,Confirmed]
<Odd_Bloke> We haven't triaged it yet, but thought it might look like something you know how to handle.
<smoser> Odd_Bloke: did you recreate that ?
<Odd_Bloke> smoser: I think Tribaal did.
<Tribaal> well, some user reported when using the artful Vagrant image
<Tribaal> and I could reproduce it
<dpb1> Tribaal: Uppercase T?
<Tribaal> dpb1: says the guy with a "1" appended to his nick :)
<dpb1> touche!
<Tribaal> dpb1: I think lowercase was already taken
<dpb1> good ol irc
<tribaal> there! :)
<dpb1> hi tribaal
<dpb1> welcome
<blackboxsw> tribaal: yo. we were just talking about that bug
<tribaal> oh?
<blackboxsw> smoser: thought it might be related to https://bugs.launchpad.net/cloud-images/+bug/1726818
<ubot5> Launchpad bug 1726818 in linux (Ubuntu) "vagrant artful64 box filesystem too small" [High,In progress]
<blackboxsw> and happy new year BTW
<smoser> yeah, i marked as a dupe. it certainly smells that way.
<tribaal> that looks like a total dupe indeed
<smoser> tribaal: it would be good to know if this is present in bionic
<tribaal> smoser: give me a sec, should be easy to try
<smoser> as it really needs to be fixed in bionic.
<tribaal> +100
<smoser> tribaal and then, it looks like there is a 4.14 in proposed.
<smoser> i suspect that this is probably still present in bionic in the 4.13 that is in the release pocket
<smoser> but has a chance of being fixed in -proposed 4.14
<smoser> https://launchpad.net/ubuntu/+source/linux
<tribaal> smoser: indeed, bionic is affected as well (well, as of the 20180101 image)
<tribaal> smoser: I'll enable -proposed in the machine and reboot - could you remind me what to nuke to make cloud-init think it's running for the first time? /var/lib/cloud/*?
<smoser> tribaal: cloud-init clean
<tribaal> smoser: TIL! Sweet
<smoser> tribaal: i'd wonder though if the image might be 'dirty' at that point
<smoser> i dont know how easy it is to supply your own image to vagrant
<smoser> but you might be able to modify
<smoser>  https://github.com/cloud-init/qa-scripts/blob/master/scripts/get-proposed-cloudimg
<smoser> or basically do what it does
<smoser> better to create yourself a "clean" image
<tribaal> smoser: yeah, I could just build a vagrant image with -proposed enabled but that takes much longer
<smoser> blackboxsw:
<smoser> root@b2:~# cloud-init clean
<smoser> ERROR: Could not remove instance: Cannot call rmtree on a symbolic link
<blackboxsw> boo!
<blackboxsw> will work up a fix
<tribaal> (and I'm almost EOD)
<smoser> exits non-zero too, and that is clearly not an error.
<smoser> tribaal: well, essentially i suspect you may be able to do
<smoser>   mount-image-callback your.orig.image --system-resolvconf -- /bin/bash
<smoser> then inside, just enable proposed, apt-get update, apt-get install linux-image (or whateve rpackage that is)
<smoser> then exit
<tribaal> the dirty way didn't produce the clear case, so I'll have to do something like this, yes (or just build an image with -proposed enabled)
<blackboxsw> smoser: how'd you reproduce that cloud-init clean. I launched a bionic container, by default it doesn't hit this symlink error
<blackboxsw> I assumed bionic because your instance was named b2
<blackboxsw> in either case, I can fix it easily enough. but wondered how we got there
<powersj> blackboxsw: https://paste.ubuntu.com/26314129/
<blackboxsw> +1 powersj I have a fix, just adding unit tests, will try to reproduce here too
<smoser> blackboxsw: hm.
<smoser> i just fresh launched bionic-daily container
<blackboxsw> csmith@uptown:~$ lxc --version
<blackboxsw> 2.0.11
<blackboxsw> heh. powersj
<blackboxsw> xenial
<blackboxsw> for the loss
<powersj> except I wouldn't expect that to make a difference in this case :\
<powersj> right? should only be the version of cloud-init in the image
<blackboxsw> one would think. I'm testing from zesty
<smoser> hm.
<blackboxsw> https://pastebin.ubuntu.com/26314191/
<blackboxsw> yeah from zesty, still no problem on my side
<blackboxsw> anyway we can easily test for link and unlink if needed
<smoser> this is weird
<blackboxsw> but not quite sure why that shows up in some cases
<smoser> i reproduced again, but then not
<powersj> 299e803c9fe1 | no     | ubuntu 18.04 LTS amd64 (daily) (20180101)
<powersj> that's the bionic image I used
<smoser> http://paste.ubuntu.com/26314202/
<blackboxsw> https://pastebin.ubuntu.com/26314219/
<blackboxsw> seemingly the same, but I get success (I would've expected it to always fail
<blackboxsw> on instance symlink
<smoser> yeah. hmm.
<blackboxsw> I'm adding a debug print
<smoser> blackboxsw: i am guessing.
<smoser> but i suspect that listdir()
<smoser> listdir('.')
<smoser> is not any specific order
<smoser> and that when it does instances first its ok but when instance first it is not
<smoser> or reverse
<blackboxsw> yeah, I could have sorted() the dir list and then we would've always seen it. I bet because is_dir returns false when a symlink target is already broken
<smoser> http://paste.ubuntu.com/26314255/
<blackboxsw> yeah that's the fix I have
<blackboxsw> same one. just wanted to know why
<smoser> sorted is arbitrary
<blackboxsw> it seems dirlist is indeterminate apparently. as you see the issue only sometimes
<smoser> yeah, its not guaranteed sorted.
<smoser> it is just traversing the dirent
<smoser> and even if it was, its arbitrary that 'instance' would sort before 'instances'
<blackboxsw> https://pastebin.ubuntu.com/26314266/
<blackboxsw> I would've thought sorting would have put instance before instances too, but yeah the dirent iterator isn't sorted
<blackboxsw> ok anyway pushing the fix and unit test
<blackboxsw> I wouldn't have thought that even a sorted list would have 'instances' > 'instance'
<blackboxsw> but maybe that's arbitrary too as you point out
<blackboxsw> sorry was typing on a different keyboard.
<smoser> blackboxsw: sorting would fix it i think, but just seems arbitrary from the perspective of it will start to fail again if we had a link named zz to aa
<blackboxsw> right, yeah it was a fragile fix to sort(and would have been wrong)
<blackboxsw> because it ignored the problem (that we weren't handling symlinks)
<blackboxsw> https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/335671 has the fix
<blackboxsw> didn't realize I was doing something a bit different than your suggestion.
<blackboxsw> utils util.is_link instead
<blackboxsw> s/utils/using/
<dojordan> @blackboxsw - happy new year! is there anything blocking checking in my PR to master? https://code.launchpad.net/~dojordan/cloud-init/+git/cloud-init/+ref/azure-preprovisioning
<blackboxsw> dojordan: I think there was a side discussion I had with smoser that we might hit an issue with systemd unit timeouts. if we attempt to block indefinitely in a polling loop in cloud-init's unit systemd might timeout at 5 minutes??? which could cause the behavior you are looking for to fail.
<blackboxsw> I'm not sure about the 5 minute auto-timeout in systemd, lemme find a reference to see if I can dig up a doc on it
<dojordan> I've tested in 16.04 with polling for much longer than that, and systemd didn't kill cloud init or anything
<dojordan> what is the service name?
<blackboxsw> cloud-init.service I believe
<dojordan> doug@dojordandev:~$ systemctl show cloud-init.service -p TimeoutStopUSec TimeoutStopUSec=infinity
<dojordan> thats why it worked :)
<blackboxsw> or cloud-init.target
<blackboxsw> ahh there you go
<blackboxsw> ooops
<blackboxsw> ahh I mean
<blackboxsw> looks like 'we' (cloud-init) don't explicitly set that timeout to infinity, but maybe an OS-specific setting
<blackboxsw> ohh wait
<blackboxsw> TimeoutSec=0
<dojordan> that infinity timeout was on the azure 16.04 LTS image
<blackboxsw> looks like we set that for the systemd/cloud-final.service.tmpl
<dojordan> got it
<blackboxsw> ok this might not really be an 'issue' then, though it *may* be worth us explicitly configuring that infinity timeout on azure images... I'm not certain.
<dojordan> is 0 == infinity?
 * blackboxsw thinks, as TimeoutSec == setting for both TimeoutStart and TimeoutStop
<blackboxsw> but trying to confirm
<blackboxsw> TimeoutSec=
<blackboxsw> A shorthand for configuring both TimeoutStartSec= and TimeoutStopSec= to the specified value
<blackboxsw> from https://www.freedesktop.org/software/systemd/man/systemd.service.html
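For reference, the shape of the relevant unit settings (an illustrative excerpt, not the actual cloud-init template): per systemd.service(5), TimeoutSec= sets both TimeoutStartSec= and TimeoutStopSec=, and Type=oneshot services default to no start timeout, which is why dojordan saw TimeoutStopUSec=infinity:

```ini
[Service]
# oneshot services get no start timeout by default; TimeoutSec=0
# additionally disables the configurable timeout logic entirely
Type=oneshot
TimeoutSec=0
```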
<smoser> dojordan: i think there is still general issues with timeouts
<blackboxsw> though I don't see a reference that "0" == 'infinity' for Timeout(Stop/Start)Sec
<dojordan> it just says "Pass "infinity" to disable the timeout logic." on the timeoutstopsec
<smoser> hm..
<dojordan> but are we currently doing that...
<smoser> but i dont know how other things in boot handle it
<smoser> so cloud-init-local or cloud-init.service may have TimeoutSec set correctly
<smoser> maybe i'm wrong.
<dojordan> doug@dojordandev:~$ systemctl show cloud-init.local -p TimeoutStopUSec TimeoutStopUSec=1min 30s
<smoser> but i swear i've seen boot just go on when i didn't think it should.
<dojordan> so local is showing 1:30, but service is showing infinite
<smoser> dojordan: well you typo'd there.
<smoser> cloud-init.local is not a service
<smoser> cloud-init-local.service
<dojordan> d'oh....
<dojordan> "cloud-init-local.service" == TimeoutStopUSec=infinity
<smoser> dojordan: so how does /var/lib/waagent/poll_imds get written?
<smoser> newer ubuntu and cloud-init i think do not rely on waagent at all
<smoser> and i'm not really interested in adding such a dependency back
<dojordan> there is no dep on waagent
<dojordan> we can change the path - it should probably be in the instance directory
<smoser> what creates it ?
<dojordan> the azure data source
<smoser> oh. ok. yeah. i see that. the waagent threw me off.
<dojordan> also, I confirmed why the timeout is infinity - since we are using type=oneshot the timeout is disabled
<smoser> i suspect that 0 did mean infinity at some point
<smoser> and probably still does
<smoser> REPROVISION_MARKER_FILE, why do we need that?
<dojordan> the marker file (/var/lib/waagent/poll_imds) is needed in case the VM reboots for whatever reason before it is reused by a customer
<smoser> rather than internal state or something.
<dojordan> we report ready to the fabric which means we detach the provisioning ISO
<smoser> why would it reboot?
<dojordan> hardware, software updates, etc
<dojordan> underlying platform
<dojordan> and we don't write the Ovf since we don't want to persist any azure specific data since the real ovf will come from the customer
<dojordan> we will be in this polling loop for a while
<dojordan> and if there is an unexpected reboot we want to keep polling when the vm comes back up
<smoser> it seems odd that the platform would choose to reboot such a machine
<dojordan> in azure, all of our VMs are backed by remote storage. so when hardware issues occur, we simply move the VM to a new machine since the data is persisted and reboot it
<dojordan> by data i mean the os vhd
<smoser> seems like maybe you could just kill machines in this state, as they're not owned by anyone. while maybe everything "should work" if you just kill a machine while booting and then re-start it, i suspect that in reality there are lots of issues. but thats not really important here.
<paulmey> smoser: true, but that would require some rearchitecting on a different level which is not going to happen any time soon...
<dojordan> the same behavior is true today for all VMs. if they reboot before cloud-init finishes we just move them
<dojordan> @smoser these are all valid points. the underlying platform is not perfect, and there are cases today where we don't get an ACK from our remote storage layer, then the VM will be busted. I think the important thing about the marker file is that it allows us to keep the pre provisioned vms around for longer which enables us to have a higher hit rate for reuse and therefore increase boot performance. the availability won't be any w
<smoser> dojordan: i responded on mp there.
<smoser> sorry for taking so long to take a look at your proposal
<smoser> dojordan: please don't take offense. over all, you've done a good job.
<dojordan> no worries, appreciate the feedback. Will address the PR comments later today
<blackboxsw>  pushed a couple changes to the review-mps script in qa-scripts repo for landing branches
<blackboxsw> slowly building it into something useful/working
<blackboxsw> thx for the review btw. landed
<ivve> heya guys, im doing a really ugly hack with write_files but been getting lots of trouble getting write_files to work at all, i have a hard time writing two files as well. first example https://hastebin.com/otubucuqin.pl , second example https://hastebin.com/egoxufopab.js
<ivve> like the simplest example with only a path + content works... other than that i have a really hard time getting it to execute
<ivve> is my syntax way off?
<blkadder> You might find it easier just to base64 it.
<ivve> ok
<ivve> so - encoding: b64 ?
<ivve> and then just content: |
<blkadder> One sec.
<blkadder> write_files:
<blkadder>    - encoding: b64
<blkadder>      content: T3JpZ2luOiByZXBvLnFiaXMuY28KTGFiZWw6IHJlcG8ucWJpcy5jbwpDb2RlbmFtZTogeGVuaWFsCkFyY2hpdGVjdHVyZXM6IGkzODYgYW1kNjQgc291cmNlCkNvbXBvbmVudHM6IG1haW4KRGVzY3JpcHRpb246IHFiaXMgcmVwbwpTaWduV2l0aDogZGVmYXVsdCAK
<blkadder>      path: /tmp/distributions
<blkadder>      owner: root:root
<blkadder>      permissions: '0644'
<blkadder> Sorry that may not have come through very well.
<ivve> no worries
<ivve> so i have to encode it then
<blkadder> https://paste.ubuntu.com/26315006/
<blkadder> Yes base64 -w0
<blkadder> Copy/paste
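The encoding step blkadder describes can be done in a few lines (a sketch; the file body is a stand-in, and the YAML mirrors his paste):

```python
import base64

# example file body (stand-in for the 'distributions' file in the paste)
content = "Origin: repo.qbis.co\nLabel: repo.qbis.co\nCodename: xenial\n"

# with 'encoding: b64', write_files expects the payload base64-encoded
# (equivalent to `base64 -w0 distributions` on the shell)
b64 = base64.b64encode(content.encode("utf-8")).decode("ascii")

cloud_config = """#cloud-config
write_files:
  - encoding: b64
    content: %s
    path: /tmp/distributions
    owner: root:root
    permissions: '0644'
""" % b64
```

Base64 sidesteps the YAML quoting and indentation pitfalls that multi-line content blocks run into.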
<ivve> alright
<blkadder> I don't think your second example will work.
<blkadder> http://cloudinit.readthedocs.io/en/latest/topics/examples.html
<blkadder> It's very picky about where you put '-' and spaces, etc.
<ivve> aye
<ivve> doesn't seem to like multiple files at all :P
<ivve> i got it to work once or twice
<ivve> this isn't working either
<ivve> :(
<smoser> ivve:  i suspect that you have a general yaml issue.
<ivve> aye, i keep getting [   18.299202] cloud-init[861]: 2018-01-03 20:29:00,205 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: '#cloud_config...'
<smoser> oh. yeah.
<smoser> cloud-init will ignore that
<smoser> if it does not start with '#cloud-config'
<ivve> it does
<smoser> or is not declared as cloud-config with multipart
<smoser> just add '#cloud-config' as first line.
<ivve> oh shit
<ivve> _ not -
<ivve> :D
<ivve> oh man
<smoser> ah. yeah. funny.
<ivve> i used a heat template first and just shortcutted
<ivve> added # and removed the :
<ivve> jeez what an idiot i am
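The hyphen-vs-underscore trap ivve just hit can be caught with a trivial check before submitting user-data (an illustrative helper, not part of cloud-init):

```python
def is_cloud_config(user_data):
    """cloud-init only treats plain user-data as cloud-config when the
    very first line is exactly '#cloud-config' (hyphen, not underscore);
    anything else is logged as unhandled non-multipart and ignored,
    which matches the WARNING in ivve's boot log above."""
    lines = user_data.splitlines()
    return bool(lines) and lines[0].strip() == "#cloud-config"
```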
<smoser> ivve: whenever someone shows me something like the hastebin there, the first thing i do is use 'yaml-dump'
<smoser>  http://paste.ubuntu.com/26315078/
<smoser> json output is much more clear and identifies errors to a human more clearly.
<smoser> (you didn't mess up that way, this is just fyi)
<ivve> aye its a good pointer, thanks
<ivve> however i can't read json even if my life depended on it
<ivve> well it works now
<ivve> i guess i will be encoding stuff now
<ivve> it helps writing stupid expect scripts :P
<blkadder> Use IntelliJ
<blkadder> It's a life saver.
<blkadder> And it has a vi/vim mode too which is nice.
<ivve> stack completed, music to my ears :P
<ivve> thanks a bunch guys
<blackboxsw> while it's a little late in the process, your machine with cloud-init installed can run 'cloud-init devel schema -c your-configfile.yaml' to validate the yaml https://pastebin.ubuntu.com/26315122/
<smoser> :)
<blackboxsw> it at least gives you a quick once over on the yaml file once you've discovered something didn't work as you expected
<blkadder> Is that in mainline now?
<blackboxsw> yep. should be in xenial and greater
<blackboxsw> and on trunk
<blkadder> Cool.
<ivve> aye its a good pointer as well and i did think about it but never came to that point since i just added a file and stuff stopped working
<blackboxsw> needs a lot of work as it'll eventually support all attributes of each cloud-config module and --annotate
<blackboxsw> to report specific errors
<ivve> runcmd: also doesn't like " or ;
<ivve> or : for that matter
<blackboxsw> my new year's resolution is to make "cloud-init schema" a first-class citizen with support for reporting schema errors in all 54 cloud-config modules
<blkadder> ivve: https://paste.ubuntu.com/26315137/
<blkadder> Someone here helped me with that… :-)
<blackboxsw> we'll see if that holds (hopefully better than the "exercising daily" new year's resolution)
<blkadder> Yay exercise.
<blkadder> I highly recommend: https://stronglifts.com/5x5/ Only 3 or 4 days a week. No two hours at the gym. :-)
<ivve> ah yes ofc you can type it that way to get it right
<ivve> however as you are pointing out help is needed to write it and read it :P
<blkadder> Well that gives you some syntax to pattern off of.
<blkadder> Now that I have that I can generally manage on my own.
<ivve> i just preferred using write_files to write the script and runcmd: - bash/expect /path/to/script
<ivve> its just a hack for a demo, nothing proper
<ivve> i'd use ansible for proper stuff
#cloud-init 2018-01-04
<smoser> powersj: did you test console output ?
<smoser> in your ec2
<powersj> I added the call to grab it, however as you will note from my comment it isn't always ready to get pulled
<smoser> seems like it should return a base64 encoded string
<smoser> http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.Instance.console_output
<powersj> the latest commit got it from: https://git.launchpad.net/~powersj/cloud-init/commit/?id=677c60738d026a9bc1272c3b6e2366ce4bdc98f8
<smoser> ie, if you *did* get something i think it will be base64
<powersj> also the base64 encoding doesn't apply since "If using a command line tool, the tool decodes the output for you."
<powersj> when using the aws-cli == command line tool
<smoser> right
<smoser> you're *not* using it
<smoser> so you need to decode
<powersj> or using the python library seems to cover that as well
<smoser> that seems wrong
<powersj> same for user-data... we don't encode it
<smoser> https://eucalyptus.atlassian.net/browse/TOOLS-407
<smoser> err.. the redux
<smoser>  https://eucalyptus.atlassian.net/browse/TOOLS-675?attachmentViewMode=gallery
<smoser> basically, though, i'd expect that the content that went to the console of the device is not necessarily anything that can be represented losslessly as a string
<smoser> it is bytes
<smoser> so if ec2 did an encode to utf8, that'd be lossy
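smoser's point about lossiness, sketched with made-up bytes (not real EC2 console output): base64 round-trips bytes exactly, while forcing a UTF-8 str out of them drops information:

```python
import base64

# stand-in console bytes: escape sequences plus bytes that are not UTF-8
raw = b"[    0.000000] booting...\x1b[1m\xff\xfe"

encoded = base64.b64encode(raw)       # shape of what the EC2 API returns
decoded = base64.b64decode(encoded)   # bytes-in, bytes-out: lossless

# producing a str is where data gets lost: a strict decode would raise,
# and errors="replace" substitutes U+FFFD for each non-UTF-8 byte
text = decoded.decode("utf-8", errors="replace")
```

This is why returning the decoded bytes, rather than a str, is the safe API behaviour for console output.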
<powersj> smoser: your comments stem from my use of .encode() on the output?
<smoser> have you tested this?
<powersj> yes ran it through all the tests, only one fails and we need to fix that test
<powersj> let me try again in case I just never got any output at all
<smoser> because i think what you will be '.encode()'ing is b'aGVsbG8gd29ybGQK'
<smoser> so sure, it will work, but the console log will be base64 encoded string
<smoser> er,...bytes.
<smoser> i have to run, i'll take a further look later.
<powersj> smoser: it produces a string
<powersj> https://paste.ubuntu.com/26321229/
<powersj> here is also an example of no output: https://paste.ubuntu.com/26321219/
<powersj> here is part of the raw string: https://paste.ubuntu.com/26321243/
<blackboxsw> powersj minor comments on my last pass of your ec2 branch posted
<blackboxsw> approved on my end once those nits are reviewed and handled or rejected by you
<powersj> blackboxsw: thank you, I'm +1 on all of those
<dojordan_> hey smoser, I'm working on using url_helper instead of raw requests, but I need infinite retry support so I was planning on using max_wait==None. I wanted to make sure that is okay, as well as wanted to point out that right now, using the default parameter of None will trigger a somewhat unexpected exception
<dojordan_> def timeup(max_wait, start_time): return ((max_wait <= 0 or max_wait is None) or (time.time() - start_time > max_wait))
<dojordan_> the problem is the short circuit check for a None parameter should happen before the none to int comparison. (this is in url_helper)
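A corrected sketch of the ordering dojordan_ describes (not the exact url_helper code; here None is taken to mean infinite retry, which is the behaviour he wants):

```python
import time

def timeup(max_wait, start_time):
    """Return True once max_wait seconds have passed since start_time.

    The None test must short-circuit before any numeric comparison,
    otherwise Python 3 raises TypeError on `None <= 0` -- the unexpected
    exception noted above.  In this sketch None means "never time up"
    (infinite retry); <= 0 keeps the "time is already up" behaviour.
    """
    if max_wait is None:
        return False
    return max_wait <= 0 or (time.time() - start_time) > max_wait
```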
<powersj> smoser: in addition to the above, take a look at: https://paste.ubuntu.com/26321285/
<powersj> blackboxsw: changes pushed
<powersj> spelling fix, docstring format, try/except change, and more details in debug messages
<blackboxsw> powersj: looks good, just saw another highlighted in your latest changest.   'deleting secuirty' in tests/cloud_tests/platforms/ec2/platform.py
<blackboxsw> all good otherwise.
<powersj> blackboxsw: thx are you using a spell checker of some sort?
<blackboxsw> I have two of them, and no prescription glasses ;)
<powersj> lol
<powersj> mine don't work very well
<blackboxsw> nope just reading through things generally. that's why I missed the typos first time around (I should use a spell checker) ispell or something
<powersj> blackboxsw: https://paste.ubuntu.com/26321529/ may have some test escapes again
<powersj> that was on my local branch, let me see if the nightly run shows the same
<powersj> actually ignore me... I am looking at the decimal wrong doh
<powersj> coffee time
<blackboxsw> powersj: hrm, yeah that is taking a bit long..... I thought when we saw bad escapes we'd see > 1 second (but that was probably with retries etc)
<powersj> yeah I saw that and went 5 seconds?!
<powersj> but it isn't 5 seconds :P
<blackboxsw> powersj: yeah I went on a date to zquila over break and had a decimal place issue. we almost ordered 2 margs that were $75 each. thank goodness the waitress asked us to double check
<powersj> hahaha
<blackboxsw> I can't believe enough people would actually order a $75 marg to warrant putting something like that on the menu
<blackboxsw> #wrongchannel
<blackboxsw> :)
<blkadder> LOL
<smoser> powersj: then the bug is in boto3 :-(
<powersj> bug?
<smoser> i'll file it upstream.
<powersj> you believe it should return bytes over a string?
<smoser> it is taking a base64 encoded blob and decoding it into a string
<smoser> and losing data along the way
<powersj> https://github.com/boto/botocore/blob/develop/botocore/handlers.py#L142
<smoser> yeah, thats the bug :)
<smoser> https://github.com/boto/botocore/issues/1351
<smoser> thanks for digging, powersj
<powersj> smoser: heh thanks for calling it out :)
<powersj> I am always amazed by your ability to pull out bugs from years ago heh :P
<smoser> firefox history
<smoser> i think euca must not use boto3
<smoser> otherwise it would never have seen the bug that i reported (twice)
<smoser> i know it didn't originally use boto3, but i figured it might have moved by now. but the project is kind of dead now.
<smoser> dojordan_: yes, that seems sane. 0 might also make sense.
<smoser> powersj: fwiw i'm not just making this stuff up. the data that systemd writes to /dev/console is almost all the time *not* utf-8 during a boot cycle on ubuntu at least.
<smoser> i only know this because of tools that assumed it was, not working :)
<powersj> ok good to know, my quick look at the data made me go looks like normal output and move on
<smoser> powersj: i hit 'submit' on review
<smoser> i think i might be missing something though... i just went through my last round of comments, and it didn't look like you addressed many of them.
<powersj> smoser: hmm thought I did
<powersj> I'll look
<powersj> smoser: looks like I misunderstood your comment about snapshots, otherwise hit all your previous requests
<powersj> I'll get these fixes done
<powersj> smoser: pushed updates, with a question on the review
#cloud-init 2018-01-05
<redguy> hi, I am restarting systemd-journald as a part of my cloud-init final scripts and this seems to break the execution of cloud-final.service on Debian jessie (exit status 1). Is this known and is there some workaround I could use?
<redguy> or rephrased: what is the cannonical way of managing journald config with cloud-init ?
<redguy> s/cannonical/recommended/
<smoser> redguy: hm..
<smoser> that's not something i've done or considered.
<smoser> how does it 'break execution' ?
<smoser> as in it doesnt run ?
<redguy> as in the execution stops
<redguy> trying to do it via runcmd now, but I think this will also break
<smoser> that seems bad.
<redguy> https://pastebin.com/0m8QVEZq
<smoser> bad that restarting systemd-journald would stop a service
<redguy> yeah, possibly this is a bug in systemd :-)
<smoser>  can you get /var/log/cloud-init-output.log and /var/log/cloud-init.log ?
<smoser> and fyi, https://hastebin.com/ doesn't hurt your eyes to look at :)
<redguy> oh, nice
<redguy> will get logs in couple of secs, but this time with runcmd approach
<redguy> hmm, runcmd didn't fail, but it seems that it wasn't executed at all (as in the mkdir in there didn't create the directory I wanted it to create). This is most likely a config problem on my part. Going back to the previous setup with restarting systemd-journald in a shell script
<smoser> systemd doesn't do as well as upstart did with restarting things otherwise being "dynamic" during boot
<smoser> we had to move the installation of packages to final rather than cloud-config stage because of that
<redguy> https://hastebin.com/fanahigiwo.php
<redguy> so systemd says status 1, but the cloud-init.log says 141
<redguy> which is EPIPE afair
<redguy> no, a result of SIGPIPE
<redguy> so my current theory is that this is caused by syslog log handler: cloud-init has /dev/log socket open, journald is stopped and closes the socket, cloud-init breaks
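redguy's theory would produce exactly this status: a write to a pipe or socket whose reader has gone away delivers SIGPIPE (signal 13), and a process killed by it is reported by the shell as 128 + 13 = 141. A minimal sketch of that mechanism (pure Python, simulating the vanished reader with a pipe; this is not cloud-init's actual code path):

```python
import signal
import subprocess
import sys

# Child: restore SIGPIPE's default disposition (CPython ignores it at
# startup so writes raise BrokenPipeError instead), then write to a pipe
# whose read end is already closed -- the kernel kills it with signal 13.
code = (
    "import os, signal\n"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL)\n"
    "r, w = os.pipe()\n"
    "os.close(r)\n"
    "os.write(w, b'x')\n"
)
proc = subprocess.run([sys.executable, "-c", code])

# subprocess reports death-by-signal-N as returncode -N ...
assert proc.returncode == -signal.SIGPIPE
# ... which a shell (and the cloud-init.log here) reports as 128 + 13.
assert 128 + signal.SIGPIPE == 141
```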
<redguy> can I override log_cfgs with user_data ?
<smoser> redguy: i think that paste didnt work ?
<smoser> redguy: i think you should be able to write to log configs.
<smoser> but your diagnosis could be wrong i think
<smoser> no.. systemd can't be that broken
<smoser> i was thinking that it might be just SIGPIPE issue because journal is catching cloud-init's output
<smoser> hm.
<redguy> somehow hastebin doesn't work for me :/
<redguy> https://pastebin.com/5ev0Z7La
<redguy> https://pastebin.com/raw/5ev0Z7La ? :-)
<powersj> blackboxsw: smoser: https://git.launchpad.net/~powersj/cloud-init/commit/?id=3793d3d24500d5f9a6ee03cd28bd9f3e6182554c
<smoser> powersj: just looking at the diff it looks like you changed the param name in the doc string
<smoser> but not in the usage
<powersj> oh shoot
<smoser> i don't think it needs _ on it.
<smoser> its the only 'image' thing in that class.
<powersj> until the snapshot function
<powersj> but yes only class var
<powersj> I can revert it back to just image_ami
<smoser> powersj: where does the snapshot get removed ?
<smoser> the "Amazon EC2 Snapshot"
<smoser> that backs the ami
<powersj> the backing instance is removed still in image.py (that was already there from last night)
<smoser> no..
<smoser> hm,..
<powersj> lol
<smoser> ok. so 'create_image' does
<smoser>  a.) stop instance
<smoser>  b.) snapshot volume
<smoser>    (which creates a snap-XXXXXX thing)
<smoser>  c.) register an ami with snap-XXXXX as its root volume. (creating a ami-XXXXX thing)
<smoser> then in CII-EC2Snapshot.destroy, you deregister the image
<smoser> which deletes from AWS the ami-XXXX thing
<powersj> yes
<smoser> but does the snap-XXXXX thing get removed ?
<powersj> to my knowledge of how the deregistering of an AMI works yes, looking at the console there are no longer anything listed under images or snapshots
<smoser> interesting.
<smoser> that does make sense, and is much nicer than making the user clean up the snapshot themselves.
<smoser> which i thought you'd have to do.
<smoser> oh... i know why.
<smoser> in ubuntu publishing, we would register the snapshot and then create an ami from that snapshot separately
<smoser> so that we could register multiple ami from the same snapshot
<smoser> (one as 'released' and one as 'daily')
<smoser> so... good. thank you.
<powersj> smoser: np, I am double checking now though :)
<powersj> smoser: ugh I'm wrong, there are ec2 snapshots listed. I was looking at wrong region (didn't change my web browser to match my config)
<powersj> there is a snap-03ac... left over
<powersj> smoser: thanks for comments I have a clean up for the snap done and will make your last change
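The cleanup being discussed can be sketched like this. This is a hypothetical helper, not the merged code; the EC2 client calls (`describe_images`, `deregister_image`, `delete_snapshot`) are real boto3 APIs, and the ordering matters because EC2 refuses to delete a snapshot that still backs a registered AMI.

```python
def destroy_image(ec2, image_id):
    """Deregister an AMI, then delete the EBS snapshots that backed it.

    `ec2` is any object exposing the boto3 EC2 client methods used below,
    e.g. boto3.client("ec2").
    """
    image = ec2.describe_images(ImageIds=[image_id])["Images"][0]
    # Record the backing snapshots before deregistering, since the
    # mapping is only reachable while the AMI is still registered.
    snap_ids = [
        bdm["Ebs"]["SnapshotId"]
        for bdm in image.get("BlockDeviceMappings", [])
        if "SnapshotId" in bdm.get("Ebs", {})
    ]
    # Deregister first: EC2 won't delete a snapshot still backing an AMI.
    ec2.deregister_image(ImageId=image_id)
    for snap_id in snap_ids:
        ec2.delete_snapshot(SnapshotId=snap_id)
    return snap_ids
```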
<powersj> blackboxsw: smoser: final commit with Scott's fixes to removing the snapshot + variable naming: https://git.launchpad.net/~powersj/cloud-init/commit/?id=14beefdcc7d848f0b5852a797753f4758ff647e1
<blackboxsw> sorry powersj was in a hole on landscape stuff for a bit. looking now
<blackboxsw> and reading backlog
<powersj> blackboxsw: no worries smoser was fixing my invalid ways
<smoser> i think seems good.
<blackboxsw> spare the rod, spoil the child
<smoser> at least good enough.
<smoser> we do not support instance store right now
<smoser> should probably doc that
<smoser> as '
<smoser> 'create_image' doesn't work on instance store instances
<smoser> and our Destroy of the ami will fail for such a thing too
<smoser> powersj: do we shut down the booted system that we collect ?
<smoser> i was thinking that we should do that.
<smoser> i had this in my buffer
<smoser>  http://paste.ubuntu.com/26327283/
<powersj> smoser: we terminate during destroy
<smoser> right
<smoser> but we dont shut the instance down
<smoser> ie, '/sbin/poweroff'
<blackboxsw> powersj: do we need to loop through snapshotids per +            snapshot_id = image.block_device_mappings[0]['Ebs']['SnapshotId']
<smoser> if we did that, then we could collect the log *after* doing so
<smoser> and we'd have a more complete log on every platform i think
<blackboxsw> or are we saying that we only really support one ebs snapshot which we will ultimately destroy during deregister
 * blackboxsw adds a pdb in there and tries to look at what we get from ec2
<smoser> blackboxsw: well, the block device mapping is something like:
<powersj> blackboxsw: for now we could make that assumption, if we start adding disks then no
<smoser>  {'root-disk': 'snap-XXXXXXX', 'disk1': 'snap-XXXXXXX', 'disk2'....}
<smoser> the snapshots we create with create_ami will most probably have the snap-XXX as the '0' element
<smoser> in the block device mapping
<smoser> ie, the root device
<blackboxsw> yeah I was just wondering about whether it was worth a snapshot_ids = set([dev_map.get('Ebs', {}).get('SnapshotId') for dev_map in  image.block_device_mappings])
<blackboxsw> but doesn't matter for this use case
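blackboxsw's set-comprehension idea, collecting every EBS SnapshotId instead of assuming index 0 is the root volume, would look roughly like this. A sketch only: `block_device_mappings` mirrors the dict shape boto3 returns, and ephemeral disks carry no 'Ebs' entry so they are filtered out.

```python
def snapshot_ids(block_device_mappings):
    """Return the set of EBS SnapshotIds referenced by an AMI's mapping."""
    return {
        dev_map["Ebs"]["SnapshotId"]
        for dev_map in block_device_mappings
        if dev_map.get("Ebs", {}).get("SnapshotId")
    }

# Example mapping: a root EBS volume plus an ephemeral (non-EBS) disk.
mappings = [
    {"DeviceName": "/dev/sda1", "Ebs": {"SnapshotId": "snap-0123456789abcdef0"}},
    {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
]
print(snapshot_ids(mappings))  # {'snap-0123456789abcdef0'}
```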
<powersj> smoser: so attempt the console log capture twice, and console_log_final would return the existing console data on platforms like lxd and nocloud-kvm?
<blackboxsw> and maybe my approach is wrong anyway
<smoser> powersj: well that was what my paste did, yeah. but it is silly.
<smoser> blackboxsw: i think you probably would not want to do that.
<smoser> hm.. maybe. i dont know.
<blackboxsw> no worries, we can dig into it if we actually get a use case
<powersj> blackboxsw: this I got everything from yesterday https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/335774
<powersj> smoser: any further thoughts? or are you trying to get the console log going?
<smoser> i was trying to get console log going
<powersj> :)
<smoser> and now noticed that with recent (bionic) lxc it is broken
<smoser> :-(
<smoser> so ... can't test that way :)
<blackboxsw> powersj: looks good, I just noticed that epel repo on centos doesn't contain awscli package "yet/ever?" so we'll have an issue there as far as dependencies, but generally the renames are good for the rest
<blackboxsw> s/centos/centos 6/
<blackboxsw> centos7 has awscli
<powersj> blackboxsw: see my comment about bzr+lp?
<blackboxsw> yeah I liked it.  the simplestreams comment?
<powersj> well if you run read-dependencies on that it tries to install bzr+lp:simplestreams
<powersj> so should I do a) a rename to specify python[3]-simplestreams or b) add something to read-dependencies to filter out the 'bzr+lp:' prefix?
<blackboxsw> ohh you mean per read-dependencies behavior. no I didn't see that comment. hrm
<blackboxsw> hrm, might have a new rename rule. lemme think during lunch
<blackboxsw> we have a couple other lp cases right?
<powersj> hmm I thought this was the only one
<blackboxsw> hmm maybe not
<blackboxsw> yeah I thought we were doing something ahh, it was just simplestreams in tox.ini
<blackboxsw> ok that was the only case, and you moved it.
<blackboxsw> hmm ok will toy with a rename rule for bzr+lp
<blackboxsw> powersj: WDYT? I like calling out the needs
<blackboxsw> oops
<blackboxsw> http://paste.ubuntu.com/26327747/   rather
<blackboxsw> add the ability for pip dependency exclusions if system packages aren't available
<blackboxsw> at least we'd be noisy on unsupported platforms
<blackboxsw> ok really grabbing lunch
<powersj> blackboxsw: well if simplestreams isn't even available then I question why we should even add integration requirements to the read-depens file
<powersj> if you don't have simplestreams forget running integration tests, yes you could run lxd backend, but that would be it
<powersj> from a rhel user who is just hacking on cloud-init and wants to run tests, this would also install aws-cli which is pointless
<blackboxsw> true powersj, so simplestreams system package is avail on ubuntu... but nothing else. so we could just do something like the following:  http://paste.ubuntu.com/26327790/
<blackboxsw> then no need to worry about integration deps for non ubuntu
<powersj> agreed
<powersj> so that last pastebin + the rename for simplestreams for "debian" should do it
<blackboxsw> we'd still need the        "renames" : {
<blackboxsw> +         "bzr+lp:simplestreams": {
<blackboxsw> +             "2": "python-simplestreams",
<blackboxsw> +             "3": "python3-simplestreams"
<blackboxsw> +        },
<powersj> yeah
<blackboxsw> just for 'debian' in pkg-deps.json
<powersj> ok I'll get that worked up in a bit
<blackboxsw> thanks sir
<powersj> thank you
<smoser> blackboxsw: should the package deps even have been referencing simplestreams ?
<blackboxsw> hrm .... package itself shouldn't... ohh hold the phone... right, forgot we generate all test deps into package control file deps.
<blackboxsw> right will work up a tiny tweak ... make ci-deps-ubuntu should install those system packages for integration dependencies, but it shouldn't be included when generating the debian/control file. one min
<blackboxsw> smoser I *think* that is safe... packages/bddeb calls specifically       'read-dependencies',
<blackboxsw>         ['--requirements-file', 'test-requirements.txt',
<blackboxsw> so it doesn't see integration-requirements.txt
 * blackboxsw runs bddeb and checks the pkg created
<smoser> so what path was exposing the need for a change ?
<blackboxsw> http://paste.ubuntu.com/26327885/ yeah build-deps don't get integration test dependencies
<blackboxsw> smoser: if we want to continue to support installing system packages for ci via 'make ci-deps-ubuntu' I'd like to have us also install the integration test dependencies via this mechanism
<smoser> well, that makes sense. but some things i think will not be resolvable that way.
<blackboxsw> make ci-deps-ubuntu calls  ./tools/read-dependencies -d ubuntu -v 3  --test-distro to install system packages instead of python packages
<smoser> such as softlayer python library
<smoser> or azure library
<blackboxsw> right, there are integration python package requirements  that won't fit this mold.... and cases like powersj mentioned for awscli
<smoser> powersj: http://paste.ubuntu.com/26327905/
<blackboxsw> where it's snapped, so we have a different vehicle for the installation of those deps that would need to be sorted
<smoser> that works for me in nocloud
<smoser> i'd be interested in if it works on ec2
<smoser> and i guess i could find a xenial system to test lxd
<blackboxsw> but I feel like that step will need to be grown when we actually include azure or softlayer integration support
<smoser> hm.
<smoser> maybe
<blackboxsw> as it is currently, the python pip vs. system-package logic 'works' for our current integration needs and doesn't preclude the option of growing an option to install a non-system-package dependency like azure's or softlayer's  python packages
<blackboxsw> it just doesn't support that level of granularity yet
<powersj> smoser: I'll try ec2 shortly, be aware our test systems use lxd from ppa so it will be on par with bionic
<blackboxsw> it does beg something slightly smarter than a simple translation from pip-pkg-name -> system-package-name which is what we currently do wholesale
<powersj> Which reminds me we need to move to the snap
<smoser> powersj: well, it works in that it doesn't fail
<smoser> it just doesn't get a console log
<smoser> stgraber proposed fix for lxd at https://github.com/lxc/lxd/pull/4140
<powersj> smoser: do you want this change as a part of ec2 or a follow on?
<smoser> i was just going to ask.
<smoser> i can either propose it as stand alone
<smoser> or as part of yours.
<smoser> powersj: i think pull it into yours
<powersj> smoser: ok I'll pull and test a bit, that means hopefully merge monday? ;)
<smoser> powersj: yeah.
<powersj> \o/ an ec2 console log
<smoser> woot!
<powersj> now I just need my coffee shop connection to get a boost to download an image for nocloud
<powersj> pylxd doesn't like the snap lxd
<powersj> smoser: do you want me to squash all my commits?
<powersj> otherwise once this lxd test is done I think we are good to go
<powersj> smoser: pushed a version I confirmed working console log with lxd, nocloud-kvm, and ec2
<powersj> I'm calling it done :D
<powersj> thanks for the help
#cloud-init 2018-01-06
<smoser> powersj: if you can rebase to trunk and squash i'll pull it. not sure what i'm doing wrong but... it didn't squash without conflicts.
<powersj> ok
<smoser> actually, i think i have it.
<powersj> there was a log message that conflicted during squash
<smoser> not sure why my rebase with squash didnt work right, but i'm doing a tox on this and centos build and will push
<smoser> and then there is one other.. but i didnt see what the error was.
<smoser> i just rebased squashed and diffed against your branches tests/cloud_tests
<smoser> and no changes, so i'm about to push if it passes tox and centos build
#cloud-init 2018-01-07
<ybaumy> heho
<ybaumy> how to build a custom nocloud provider
<ybaumy> is there some example
<ybaumy> we wanted to use foreman. but are now on the way to build our own provisioning services
<ybaumy> part of that is a cloud-init nocloud provider
<ybaumy> i need help
#cloud-init 2020-01-02
<gnulnx> I'm trying to debug an issue with chef, in my cloud init script.  when I run `cloud-init -d single --name chef` I get `Ran chef but it failed!
<gnulnx> `.  Is there a way to turn on more verbose debugging?
<meena> gnulnx: what does the log files say?
<meena> both, cloud-init's, and chef's
#cloud-init 2020-01-03
<SnoFox> Hello. I have an Amazon Linux 2 VM that seems to have removed the cloud-init binary and systemd units. I feel like I'm misunderstanding the purpose of the tool - I don't see anything funny in userdata to cause that nor am I finding anything in the docs about persistence. Is this expected or is Amazon doing weird things?
<MKS2020> Hello, i'd like to report a bug at line https://github.com/canonical/cloud-init/blame/8116493950e7c47af0ce66fc1bb5d799ce5e477a/cloudinit/ssh_util.py#L260 not sure if this IRC channel is the right place.
<MKS2020> cloud-init's code by default makes the wrong assumption that all users use a private .ssh folder for authorized_keys.
<gnulnx> meena: Chef isn't even being installed.  The initial directories are created, but there is no chef-client binary, and the logs just say "running module chef ... failed", and "Ran chef but it failed!"
<gnulnx> The cloud-init logs ^
<MKS2020> But the `AuthorizedKeysFile` directive in sshd_config could be used to have a system-wide folder with user keys which are managed by configuration management systems, for example puppet. The idea is to have configuration like `AuthorizedKeysFile /etc/ssh/authorized_keys/%u` to prevent users managing their authorized_keys, or to prevent a user with sudo permissions on the host from rootkitting other users. The folder `/etc/ssh/authorized_keys/` and all files inside are owned by
<MKS2020> root because users shouldn't be able to modify these files. When cloud-init changes the mode to 700 for such a folder it breaks the whole concept.
<MKS2020> so before applying 700 to the folder defined in `AuthorizedKeysFile`, cloud-init needs to validate that the folder is located within the user's HOME folder.
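The deployment MKS2020 describes relies on sshd_config pointing at a root-owned, system-wide key directory. A sketch of that configuration (paths from the discussion; the modes shown are the conventional ones for this setup, not quoted from it):

```
# /etc/ssh/sshd_config -- one root-owned key file per user, so users
# (even those with sudo) cannot edit authorized_keys themselves.
AuthorizedKeysFile /etc/ssh/authorized_keys/%u

# The directory and files stay root-owned and world-readable, e.g.:
#   drwxr-xr-x root root /etc/ssh/authorized_keys/
#   -rw-r--r-- root root /etc/ssh/authorized_keys/alice
# cloud-init's blanket chmod 700 on the directory makes the keys
# unreadable when sshd reads them with the target user's privileges.
```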
<meena> gnulnx: how is chef being installed?
<gnulnx> meena: `install_type: "omnibus"`
<meena> MKS2020: do you think you can patch that?
<meena> gnulnx: ooohhh, aah? okay?
<gnulnx> One sec, I'm pasting my config
<MKS2020> meena: yes, https://github.com/canonical/cloud-init/pull/149
<meena> MKS2020: i haven't looked all day at GitHub, i'm in a new paid job… and i…'ve mostly been busy setting up my laptop
<meena> MKS2020: did you look at https://github.com/canonical/cloud-init/blob/master/HACKING.rst ?
<MKS2020> hehe, it can wait. right now we're fixing this issue with runcmd: "awk  '/^AuthorizedKeysFile/ {print $2}' /etc/ssh/sshd_config | xargs dirname | xargs chmod 755" in our code but it's really hard to use our own AMIs across different accounts and departments :)
<gnulnx> meena: https://gist.github.com/kylejohnson/e44a1d72b634dd7fade4fc830f2a7ae6 is what I have
<MKS2020> meena: yep, i'm in the middle of https://ubuntu.com/legal/contributors/agreement now
<meena> MKS2020: oh. that was simpler than i thought… also: can you show us your cloud-init config that lets you do / break that? (as a comment to the PR perhaps)
<MKS2020> meena: ok, i'll describe steps to reproduce the bug
<MKS2020> meena: what should i write in `Please add the Canonical Project Manager or contact` ?
<meena> MKS2020: that would be powersj
<meena> gnulnx: we don't even capture if anything goes wrong in the installer: https://github.com/canonical/cloud-init/blob/master/cloudinit/config/cc_chef.py#L308
<meena> gnulnx: so, i'd start by toggling that Flag, and seeing if you get more useful output.
<gnulnx> Just toggle Capture and re-run?
<meena> gnulnx: set capture=True; run cloud-init clean --logs --reboot ; and enjoy the show
<gnulnx> thank ya
<gnulnx> I've been doing rm -rf sem; cloud-init -d single --name chef
<gnulnx> Would that get me (close to) the same result?
<gnulnx> meena: https://gist.github.com/kylejohnson/1a157062f1bccc7106e9de2ed5cd639e
<gnulnx> That's interesting.  It doesn't like any of my chef keys
<MKS2020> meena: i've submitted the agreement and added steps to reproduce into the MR. Let me know if some information is needed/missing from my side.
<meena> 15:51 <gnulnx> Would that get me (close to) the same result? ⬆️ no. clean nukes /var/lib/cloud-init and /run/cloud-init
<meena> https://cloudinit.readthedocs.io/en/latest/topics/modules.html#chef this documentation seems to be incomplete, and, confusing
<gnulnx> meena: Yeah, that's what I found too.  The documentation doesn't look up to date.
<meena> gnulnx: which version are you on, btw?
<gnulnx> meena: 19.3-41
<meena> let's openâ¦ at least one bug, gnulnx .
<gnulnx> Oh what's that?
<gnulnx> So I added `validation_name: test` and it actually installed chef this time
<gnulnx> Installed, daemonized and forked
<meena> gnulnx: so that's that then
<gnulnx> yup
#cloud-init 2020-01-05
<meena> why are tests on master failing… oh… on 3.8
