#cloud-init 2014-01-06
<smoser> kwadronaut, wrt your question on ssh keys 
<smoser> you can just configure that portion off
<smoser> just add something like this to a '*.cfg' in /etc/cloud/cloud.cfg.d/
<smoser> http://paste.ubuntu.com/6703527/
<smoser> and take out the '-ssh'
<smoser> do that on "first boot" and subsequent ones won't run it.
<smoser> you won't/wouldn't get ssh keys pulled from the metadata service into your user's .ssh/authorized_keys, but that probably isn't a huge deal for you in that case.
<smoser> you could also use harmw's suggestion of
<smoser> http://cloudinit.readthedocs.org/en/latest/topics/examples.html#configure-instances-ssh-keys
<smoser> and just write that stanza explicitly from the files in /etc/ssh after first boot.
<smoser> (so each boot would just explicitly re-write them)
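[The paste above is no longer available. A minimal sketch of the kind of drop-in being described, assuming it overrides the default cloud_init_modules list with the 'ssh' entry removed; the filename and the abbreviated module list are illustrative:]

    # /etc/cloud/cloud.cfg.d/99-no-ssh.cfg  (hypothetical name)
    # Copy the full cloud_init_modules list from /etc/cloud/cloud.cfg,
    # minus 'ssh', so cloud-init neither regenerates host keys nor writes
    # authorized_keys from the metadata service.
    cloud_init_modules:
     - bootcmd
     - write-files
     - set_hostname
     - update_hostname
     - update_etc_hosts
     - rsyslog
     - users-groups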
<harlowja> harmw ya, i think he is, will bug him today to see how far he got
<harmw> ok harlowja 
#cloud-init 2014-01-07
<pquerna> smoser: your proposal for requiring a matching label is good;  I'll get the CLA done in a bit.
<smoser> pquerna, thanks.
<harlowja> harmw just fyi, sean got sucked into some y! mail thing yesterday, will be getting around to sysvinit scripts soon
<kwadronaut> thanks smoser, i will have to look at it later. 
<lotia> greetings all.
<lotia> trying to figure out how to prevent filesystems and mount points from being configured on ubuntu 12.04 instances running on ec2. would i just put the required config files in /etc/cloud/cloud.cfg.d/
<harlowja> lotia that would work
#cloud-init 2014-01-08
<smoser> lotia, yes, you can do that or give user-data to do it.
<smoser> ie, something that isn't well understood / known is that cloud-config provided by userdata == cloud-init configuration files in /etc/cloud/cloud.cfg.d/
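[To illustrate that equivalence for the question above, a hedged sketch: the same stanza can be passed as EC2 user-data or dropped into a file such as /etc/cloud/cloud.cfg.d/99-disable-mounts.cfg (filename illustrative) to keep cloud-init from configuring the ephemeral and swap mounts:]

    #cloud-config
    # Null entries tell cloud-init to leave these mount points alone.
    mounts:
     - [ ephemeral0, null ]
     - [ swap, null ]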
<harmw> harlowja: I read yahoo switching to https on e-mail, probably related :)
<harlowja> harmw i think so :-P
<alevy> HI all, my ssh key does not seem to get imported from openstack when cloud-init runs. How would I debug this?
<harlowja> alevy which datasource are u using with openstack?
<harlowja> config drive?
<harlowja> other?
<harlowja> what version of cloud-init 
<alevy> harlowja: im not sure what data source, how do I check? 0.7.3
<harmw> cloudinit will tell you which sources it tried
<harlowja> do u know how the openstack u are using is setup?
<harlowja> that will affect which datasource cloud-init should try
<harlowja> but as harmw said, the console log of openstack should also tell u what was being tried
<harmw> you didn't forget to pass --key mykeyname on booting the instance alevy :)
<harlowja> and if not /var/log/cloud-init.log usually has more
<alevy> harmw: started it from the console
<alevy> harlowja: it is a nebula one openstack cloud
<alevy> one image is working the other isn't...
<harmw> nova boot --flavor bla [..] --key-name thisismykey 
<harmw> ah
<harlowja> alevy hmmm, then it could vary, do u know what the nebula people recommend for images?
<smoser> alevy, you can check to see if your key is there in 'ec2metadata'
<harlowja> is that image known to work (that they are providing)
<harmw> which image works, which one doesn't?
<alevy> the one someone downloaded but I am trying to build my own (for certain reasons)
<smoser> i think nebula use ubuntu cloud images (even pull them in by default)
<alevy> both centos 6.4 with the same cloud-init package installed and same config
<harlowja> alevy build your own, hmmm
<alevy> Can i run it interactively to see what is happening, I dont understand what is in the logs...
<harlowja> alevy is it possible for u to pastebin the logs somewhere (filter out anything u don't want to show?)
<alevy> smoser: how do I check that
<alevy> harlowja: sure
<smoser> alevy, in ubuntu you'd have a package 'ec2metadata'
<smoser> just run it and it will crawl the metadata
<smoser> from inside the system (assuming you got in, but you clearly might not be able to)
<alevy> smoser: bash: ec2metadata: command not found
<alevy> smoser: centos
<smoser> ah. well then
<harlowja> logs at /var/log/cloud-init.log should be useful here
<harlowja> *if any
<smoser> $ curl -q http://169.254.169.254/latest/meta-data/public-keys; echo
<smoser> 0=brickies
<alevy> curl -q http://169.254.169.254/latest/meta-data/public-keys;
<alevy> 0=alevy
<alevy> looks ok there...
<alevy> does cloud-init care about selinux?
<harlowja> define 'care'?
<smoser> it should handle it.
<smoser> i have to run
<smoser> later.
<alevy> harlowja: i noticed one image has it enabled and the other doesn't, just looking for differences...
<harlowja> kk, can u also check the diff between the cloud.cfg files @  /etc/cloud/cloud.cfg 
<harlowja> a diff there might be part of the issue
<alevy> harlowja: nothing commented out matters right?
<harlowja> right
<alevy> ok they are identical then
<harlowja> hmmm, k, thats pretty odd
<alevy> harlowja: is there a way to run cloud-init and see if it is puking or something?
<harlowja> ya, u can run cloud-init just via $ cloud-init
<alevy> do indents matter?
<alevy> i.e. "-" vs. " -"
<harlowja> potentially
<harlowja> yaml is white space sensitive
<alevy> gr
<alevy> ok i see what may be the problem then
<harlowja> kk
<alevy> yaml is valid and nothing changed when I re-ran cloud-init
<harlowja> valid yaml could still mean the yaml isn't right, if the spacing is right thats usually valid, but it might still be off
<harlowja> u should be able to run $ cloud-init single 'module-name'
<harlowja> and then see if one is dying
<harlowja> perhaps run just the ssh one
<alevy> harlowja: cloud-init single --name ssh ?
<harlowja> i think so
<alevy> cloud-init single --name 'ssh' --frequency once
<alevy> that did something...
<alevy> generated public private keypair
<alevy> how do I test the fetch of keys from openstack?
<harlowja> so i think u should be able to run $ cloud-init init
<harlowja> and that will rerun the fetching part
<alevy> does not seem to
<harlowja> any output at all?
<alevy> just prints out the networking stuffs
<harlowja> k, its probably noticing u already fetched the data
<harlowja> can u check /var/lib/cloud/
<harlowja> if u temporarily move that directory to somewhere else, it should re-run
<harlowja> that directory is where cloud-init stores a lot of its data
<harlowja> especially under /var/lib/cloud/instance
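[A condensed sketch of the re-run sequence being described, using the paths and commands from the log (run as root):]

    mv /var/lib/cloud /var/lib/cloud.bak    # make cloud-init forget it already ran
    cloud-init init                         # re-runs datasource discovery and metadata fetch
    cloud-init single --name ssh            # then re-run just the ssh module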
<alevy> ok moved it and now it prints the networking and generates the keys again but says Failed to generate ecdsa key
<harlowja> ok, afaik rhel has issues with the ecdsa key, but the rest of the keys should be getting made, 
<harlowja> the question i guess is did it put your keys in place
<harlowja> maybe try the cloud-init single --name ssh  again
<alevy> it makes the ones in /etc/ssh/ for sure i just checked the fingerprint
<harlowja> k
<alevy> is the ssh module the one that fetches the key from openstack?
<alevy> http://cloudinit.readthedocs.org/en/latest/topics/modules.html
<alevy> that is totally empty.. ha ha
<harlowja> hmmm, ya, thought that had some data in it
<harlowja> the modules are @ http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/files/head:/cloudinit/config/
<alevy> https://gist.github.com/8326025
<alevy> that is the log btw, gist came back up
<harlowja> ah, k
<harlowja> hmmm, ya, its using 'iid-datasource-none' which confuses me
<alevy> what does that mean?
<harlowja> its supposed to find the ec2 one
<harlowja> datasources provide where cloudinit gets info from, ec2 being one
<harlowja> the none one is like a fallback
<alevy> do i need to set that in cloud.cfg?
<alevy> let me paste that too
<alevy> https://gist.github.com/8326105
<harlowja> u can, it might help reduce the set of ones it will try
<harlowja> adding the following will help reduce the options
<harlowja> # Only these datasources will be attempted (in order)
<harlowja> datasource_list:
<harlowja>  - ConfigDrive
<harlowja>  - Ec2
<harlowja>  - None
<harlowja> or something like that
<harlowja> u are running the '- disable-ec2-metadata' module though
<harlowja> that alters iptables, so that means u probably can't run cloud-init twice
<harlowja> without unblocking that iptables filter
<harlowja> probably for testing disable that module running
<alevy> ok i disabled that module
<alevy> and ran again 
<alevy> still not fetching any keys
<alevy> https://gist.github.com/8326270
<alevy> https://gist.github.com/8326272
<harlowja> did u make a new instance? or the same one?
<alevy> same one. should I make a new one?
<harlowja> ya, if that module already ran, it turned on an iptables rule that stops the metadata from being fetched
<alevy> can i just flush iptables?
<harlowja> probably
<alevy> seems to still be using None
<alevy> ok I have to run
<alevy> i guess i just need to keep looking?
<harlowja> ya, make sure u are removing /var/lib/cloud each time u try to re-run
<harlowja> i'd restrict what datasources are allowed, too
<harlowja> to avoid all these other ones being activated
<alevy> ok
#cloud-init 2014-01-09
<smoser> alevy, running 'single' doesn't actually hit the datasource.
<smoser> harlowja, 
<smoser> it just uses what it already found
<smoser> you have to re run 'cloud-init init' to have it search
<harlowja> ya, moving /var/lib/cloud should make that work out right
<smoser> single wont re-run the search though
<smoser> so it wont have pulled the data.
<smoser> i dont think
<smoser> this use case is one i want to solve with 'ci-tool'
<harlowja> ya, k, i think alevy was running init also
<alevy> harlowja: hey man if you have time I am going to roll a new instance and try to finish debugging this? your input is appreciated.
<harlowja> sure, i'll have a little time today, today i have more meetings (which sux)
<harlowja> smoser can probably also jump in too (or others)
<alevy> ha ha
<alevy> yeah understood
<alevy> thx
<smoser> i can help. sure.
<harlowja> all smoser's fault anyway alevy, lol
<harlowja> :-P
<harlowja> ha
<alevy> thanks guys
<alevy> let me get it up and see where i am at
<alevy> smoser: I assume i need console=ttyS0 to make sure I can see cloud-init output at boot?
<smoser> ubuntu?
<alevy> smoser: centos
<smoser> i don't know for sure, but quite possibly.
<smoser> it depends on the sysvinit scripts
<smoser> if they send output to their stdout (which i'd expect them to do)
<smoser> so yeah, you probably should have that. and should log that serial console
<alevy> ok
<pquerna> smoser: have you been following the systemd-networkd work?  They kinda have a proposal rolling for network config files.
<pquerna> smoser: ps sorry i'm bad at bzr; made it a new merge proposal :-/
<smoser> pquerna, no. not at all. network config files ?
<pquerna> smoser: http://lists.freedesktop.org/archives/systemd-devel/2013-November/014077.html was the first patch, trying to find the bigger spec doc on the .network and .link file formats.
<pquerna> smoser: i know the CoreOS guys are interested in getting the .link and .network files onto a ConfigDrive, rather than having to parse out the interface file.  Not sure in the end its much different though.
<smoser> well, i think thats wrong.
<smoser> network/interfaces format is equally wrong
<smoser> but openstack is stuck with it for legacy
<smoser> they shouldn't go inserting some other arbitrary format in.
<pquerna> i'm all for something better.
<smoser> instead use something like the netcfg format or something.
<pquerna> is anyone actually adopting netcfg though?
<smoser> picking up a "standard" that is all of 2 months old isn't a good idea.
#cloud-init 2014-01-10
<alevy> smoser: so i rolled one out again and i am still getting that cloud-init does not find the ec2 data source despite the fact that I can curl the API from the command line.
<alevy> I keep falling back to data source none when Ec2 appears to be working by hand...
<pquerna> smoser: https://docs.google.com/document/d/1ODIsSVyppkkEAun9abqazX_r8o71pVG0JHeHVCYurtI/pub has more but is apparently partly out of date. *shrug*
<pquerna> i don't have a horse in it, i just write out config drives :x
<alevy> smoser: python-boto was missing
<alevy> that might be part of it
<smoser> alevy, that'd do it.
<smoser> i do plan on dropping python-boto here soon.
<smoser> sorry for your lost time on that.
<harlowja> smoser where is that code that i did to replace that
<harlowja> i can't remember, lol
<smoser> in bzr history somewhere.
<harlowja> :-P
<smoser> yeah, i was just going to go there.
<harlowja> k
<harlowja> its somewhere, lol
<harlowja> harmw i'll bug sean again tomorrow, i think the mail stuff is calming down (ssl seems to be on and working)
<smoser> hm.. its not even there.
<smoser> i thought we'd merged it and then dropped it.
<harlowja> hmmm
<harlowja> not sure, i think it was in a branch somewhere at least
<harlowja> unless i deleted the branch :-/
<harlowja> smoser http://bazaar.launchpad.net/~harlowja/cloud-init/url-ssl-fixings/view/698/cloudinit/ec2_utils.py
<harlowja> there u go
<smoser> you want to propose that as a drop for python-boto ?
<harlowja> hmmm
<smoser> i'm happy to take that now i think
<harlowja> sure, will see if i can do that
<harlowja> http://bazaar.launchpad.net/~harlowja/cloud-init/url-ssl-fixings/view/707/cloudinit/ec2_utils.py seems even better (the final version before we said screw it, just use boto)
<smoser> :)
<smoser> and use requests rather than urllib i think
<harlowja> def
<harlowja> smoser will see if i can pull that off tomorrow
<subway> Is there a good way to set an instance hostname to something-instanceid via cloud-init?
<subway> foo-%i didn't work the way I'd hoped..
<subway> I guess I can just do it in a script instead of using a cloud-config yaml user data
#cloud-init 2014-01-11
<harlowja> smoser https://code.launchpad.net/~harlowja/cloud-init/boto-ec2-replace/+merge/201269, will add some tests and all (not done)
<harmw> ok harlowja_away 
#cloud-init 2015-01-05
<smoser> o/
<tennis> hi guys, I need to use a resolv_conf config like the one here http://cloudinit.readthedocs.org/en/latest/topics/examples.html#configure-an-instances-resolv-conf   
<tennis> But where should the file be located and what should be its name? 
<tennis> The docs are vague about where files starting with "#cloud-config" should be located and what their names should be.
<tennis> Bueller? Bueller? 
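[The question goes unanswered in the log; per the equivalence noted earlier (cloud-config user-data == files in /etc/cloud/cloud.cfg.d/), a hedged sketch is that the stanza from the linked docs example can be supplied as user-data or saved under any *.cfg name in that directory, e.g.:]

    # /etc/cloud/cloud.cfg.d/99-resolv.cfg  (illustrative filename)
    # Requires the resolv_conf module to be enabled for the distro in use.
    manage_resolv_conf: true
    resolv_conf:
      nameservers: ['8.8.8.8', '8.8.4.4']
      searchdomains:
        - example.com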
#cloud-init 2015-01-06
<mhroncok> smoser: ping
<smoser> bah.
<smoser> mhroncok, tennis, you can't just ask questions and then leave.
 * smoser is still on reduced computer hours
<ericsnow> PTAL https://bugs.launchpad.net/cloud-init/+bug/1404311
<smoser> i'll look at that now. :)
<ericsnow> thanks!
<wwitzel3> smoser: o/
<smoser> o/
<smoser> looking at your MP for bug 1404311
<smoser> https://code.launchpad.net/~wwitzel3/cloud-init/gce/+merge/245209
<smoser> the LOG messages on line 29 of the diff there look wrong.
<smoser> LOG.warn...
<smoser> you have None, None
<smoser> but do not have a '%s' at all in the string
<wwitzel3> smoser: yep, you are right
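[A hypothetical illustration of the mismatch being discussed, not the actual GCE datasource code: the arguments passed to the logger need matching placeholders in the message string:]

    import logging

    LOG = logging.getLogger(__name__)
    address, url = "metadata.google.internal", None   # illustrative values

    # broken: two arguments but no '%s' placeholders, so formatting fails at emit time
    LOG.warn("address or url was not resolvable", None, None)
    # fixed: placeholders match the arguments
    LOG.warn("address %s or url %s was not resolvable", address, url)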
<wwitzel3> smoser: I will correct that
<wwitzel3> smoser: correction pushed up
<smoser> wwitzel3, ok. i merged. please do check what is in trunk now just to be sure but it should be good.
<smoser> you had a pep8 error like this: {'key':'value'}
<smoser> which requires white space after the ':' but wasn't being caught.
<wwitzel3> smoser: hrmm, surprised my vim didn't warn or auto correct that
<smoser> and i didn't understand why ./tools/run-pep8 didn't catch it. now it does.
<smoser> that's the reason for most of the noise in my commits before the merge
<wwitzel3> smoser: I probably messed something up when I install all the go-vim stuff
<wwitzel3> smoser: thank you, so for testing this against GCE live .. how would I go about doing that?
<smoser> that is kind of hard isnt it. can you re-bundle an instance?
<smoser> ie, launch one, patch it, create a snapshot and then from there?
<wwitzel3> smoser: hrmm, maybe? .. is there an easy way to re-run the cloud-init process on an instance?
<wwitzel3> smoser: I could create an instance, patch, and just re-run the part of the script that consumes the metadata and generates the file
<wwitzel3> I actually don't know anything about when cloud-init actually runs
<wwitzel3> I can probably figure it out :)
<smoser> wwitzel3, yeah, you can do that.
<smoser> actually  if you just
<smoser> rm -Rf /var/lib/cloud; reboot
<smoser> that should be good
<smoser> after installing your new deb
<wwitzel3> smoser: ok, awesome I will give that a shot
<wwitzel3> smoser: thanks again
<wwitzel3> smoser: confirmed the new behavior, thanks again :)
<wwitzel3> smoser: so what will be the next steps to get the new update in to live gce? is there a schedule for releases?
#cloud-init 2015-01-07
<wwitzel3> smoser: if you happen to be lurking again, was curious if you read my question from yesterday about getting the GCE update live and cloud-init release cycle.
<smoser> i did.
<smoser> we can get a new version of cloud-init into vivid.
<smoser> and then we have to SRU that fix to trusty, and utopic
<wwitzel3> ok cool. that is what I needed to know. Was asked about it yesterday and wasn't sure :) thanks again
<wwitzel3> smoser: though I think GCE only runs trusty images right?
<smoser> i actually dont know.
<smoser> i'd hope that at least at some point i would be able to run vivid there.
<smoser> if you can't test "interim" releases, you're virtually guaranteed that next LTS is not going to work as you hoped :)
<ericsnow> smoser: do you know what is the process (or timeframe) for getting that update into cloud-images and then into the GCE store of the images?
<ericsnow> smoser: I expect it is mostly automated so nothing would need to be done but wait
<wwitzel3> smoser: that's a good point, I guess I was just thinking right now, but it makes sense to have interim releases up there too
<ericsnow> smoser: mostly I'm curious about how long to expect to wait :)
<smoser> ericsnow, https://wiki.ubuntu.com/StableReleaseUpdates
<smoser> that's the process of getting changes into stable releases.
<ericsnow> wwitzel3, smoser: see https://cloud.google.com/compute/docs/operating-systems/
<smoser> http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:gce.json
<ericsnow> wwitzel3, smoser: they support "all" the releases
<wwitzel3> ahh so they do
<wwitzel3> cool
<smoser> we do have 'utopic' there. unfortunately, it appears:
<smoser>  http://cloud-images.ubuntu.com/daily/streams/v1/
<smoser> there is no 'daily' 
<smoser> anyway, the way this generally works...
<smoser> a.) it gets uploaded to vivid
<smoser> b.) it gets SRU'd to trusty-proposed
<smoser> c.) someone writes the SRU report, and verifies it
<smoser> d.) it gets into images (normally into daily images, which are then promoted)
<smoser> e.) it gets into a "released" image
<ericsnow> smoser: re: timeframe, thanks for the link; I just want to get the ball rolling now so that we can release the new GCE provider :)
<Odd_Bloke> smoser: wwitzel3: We publish images for precise, trusty and utopic to GCE.
<Odd_Bloke> We do have dailies also, for the same.
<Odd_Bloke> Though not streams for same. *files bug*
<ericsnow> Odd_Bloke: what's the process for getting the images on GCE updated (e.g. with the cloud-init fix we need in juju)?
<smoser> Odd_Bloke, is there a reason there isn't a vivid ?
<Odd_Bloke> smoser: We do have vivid, actually.
<smoser> ok. cool.
<Odd_Bloke> Must have happened in the week before the break, which I seem to have completely forgotten. :p
#cloud-init 2016-01-11
<atonal> Hi, are the logs for this channel stored anywhere?
<kwadronaut> used to
<kwadronaut> http://irclogs.ubuntu.com/2016/01/11/%23cloud-init.html yep, they still are.
<atonal> kwadronaut: thanks!
<Odd_Bloke> smoser: xael is writing a data source that uses a metadata URL delivered by the DHCP server.  The current proposed implementation gets the DHCP client to write it out to a file in /etc/cloud/.  Does that seem OK to you, or might there be a better place to put it?
<Odd_Bloke> smoser: (Side note: we should look at doing this for Azure, where we currently just parse the leases file, with varying levels of success)
<smoser> i dont like /etc/cloud
<smoser> really shouldn't write variable data there.
<smoser>  /var/lib/cloud/<somewhere> would be preferable i think.
<Odd_Bloke> xael: ^ :)
<xael> noted
<xael> is /var/lib/cloud/cloudinit_data_url or something similar ok?
<Odd_Bloke> xael: I'd make it clear it's a BigStep thing, so people don't expect it to exist on any other cloud.
<xael> I send the URL in option 67 (filename), as it is not normally used inside an OS, only in pxe booting environments as a replacement for the filename header
<xael> so, /var/lib/cloud/cloudinit_bigstep_url?
<Odd_Bloke> LGTM.
<smoser> is /var/lib/cloud/data/seed/bigstep/ ?
<smoser> would that work ?
<smoser> its not entirely a 'seed' as in it doesnt provide full information
<xael>  /var/lib/cloud/data/seed/bigstep/url ?
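[A rough sketch of the DHCP-hook approach being discussed, assuming an ISC dhclient exit-hook; the hook filename, the variable carrying option 67, and the final path are assumptions rather than the actual BigStep implementation:]

    # /etc/dhcp/dhclient-exit-hooks.d/bigstep-url  (hypothetical)
    # option 67 / "filename" is assumed to show up as $new_bootfile_name here
    if [ -n "$new_bootfile_name" ]; then
        mkdir -p /var/lib/cloud/data/seed/bigstep
        echo "$new_bootfile_name" > /var/lib/cloud/data/seed/bigstep/url
    fi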
<harlowja>  smoser so boss sir, is stuff still merging into launchpad for 0.7.x :(
<harlowja> can we perhaps do github this year :)
<harlowja> i'll fix up the 0.7.x branch on gerrit, since it seems busted
<harlowja> *unless someone else wants to
<smoser> harlowja, lets do it man!
<harlowja> okies
<harlowja> smoser do u want to ask the infra folks to resync from a converted https://code.launchpad.net/~cloud-init-dev/cloud-init/trunk github into 0.7.x branch
<harlowja> or shall i?
<harlowja> sync script @ https://gist.github.com/harlowja/8bfe7e9a19214379684f
<harlowja> (if u lost it)
#cloud-init 2016-01-13
<openstackgerrit> melissaml proposed openstack/cloud-init: Put py34 first in the env order of tox  https://review.openstack.org/266753
<george_> hi all, is there a recommended way of adding additional config modules
<george_> other than copying them into /usr/lib/python2.7/site-packages/cloudinit/config or the equivalent?
<george_> bump from earlier in case it was missed: is there a recommended way of adding additional config modules?
<george_> other than copying them into /usr/lib/python2.7/site-packages/cloudinit/config or the equivalent?
<harlowja> george_ currently that will be the way u'll have to do it
<harlowja> https://github.com/openstack/cloud-init/blob/0.7.x/cloudinit/stages.py#L648 only looks in the cloudinit.config module
<george_> yeah, that's the only thing i saw
<george_> thanks!
<harlowja> np
<harlowja> smoser yo, https://github.com/harlowja/cloud-init/tree/0.7.x-final-final-final-really is the most up to date bzr 0.7.x sync
<harlowja> if u want i can ask the infra folks to sync ^ and replace https://github.com/stackforge/cloud-init/tree/0.7.x with that
<harlowja> sorry i mean https://github.com/openstack/cloud-init/tree/0.7.x
<smoser> i was afraid you'd bug me again :)
<harlowja> ;)
<harlowja> i am in your basement
<harlowja> lol
<harlowja> u can not escape
<smoser> AH!
<harlowja> :-P
<harlowja> soooo how about that, lol
<harlowja> sooo
<harlowja> lol
<kasey-al`> Hi all, I am trying to debug an issue on a VM (C6.5 image) where the cloud-init script is running and the routes are not properly added.  This is on a cluster running Openstack Icehouse, C6.5 hypervisors, and OpenContrail SDN.  The image boots just fine on 60+ different hypervisors, but fails on a couple.
<kasey-al`> tcpdump shows repetitive dhcp discover, offer, request, ack cycles, but the console-log of the VM just ends with the route table.
<kasey-al`> and it never gets to the login prompt.
<kasey-al`> Is there a way to get more verbose output out of cloud-init?
#cloud-init 2016-01-14
<harlowja> kasey-al` ya, there is a way
<harlowja> kasey-al` try https://gist.github.com/harlowja/398a66974f08a2ac3a3e as the user-data
<harlowja> that should start logging at DEBUG (from what i remember)
<kasey-al`> Thank you!  I will give that a shot
<gord0> hello
<gord0> trying to figure out if there's something wrong with my cloud init file -> http://pastebin.com/Ygj1Yyhi
<gord0> nova boot isn't applying it for some reason
<harlowja_at_home> gord0, seems ok, any idea what nova boot is applying (if anything at all?)
<gord0> harlowja_at_home: not applying anything. i did some testing and took a simple file that applies only /etc/environment and yum.conf and that works fine. i have to figure out where its failing
<gord0> is there a syntax checker or some such?
<harlowja_at_home> well that file is just yaml, so yaml syntax
<harlowja_at_home> but cloud-init logs stuff, so u might check in var/log/cloud
<harlowja_at_home> see if anything useful there
<gord0> this is on the node where i run it right? there is no /var/log/cloud
<gord0> ah nvm i see it
<gord0> there are no errors there
<harlowja_at_home> ah
<harlowja_at_home> can u paste?
<gord0> ok i'm trying once more
<guacer> hey folks, I'm curious if anyone knows of a way to add runcmd to vendor-data which can't be overridden by user-data. The docs say no, with exception of using `[{ "op": "add", "path": "/runcmd", "value": ["my", "command", "here"]}]` from the user-data side. I'm wondering if the inverse is true?
<guacer> i.e. can the vendor-data side merge onto the user side?
<guacer> also, not trying to do anything "evil"
<guacer> as per https://github.com/number5/cloud-init/blob/master/doc/vendordata.txt
<harlowja> gord0 did u have any luck getting that paste (i was offline for a while)
#cloud-init 2016-01-15
<smoser> guacer, user-data merges over vendor data in the same way that user-data merges over user-data.
<smoser> that is by design.  user wins.
<smoser> you should be able to put a cloud archive format into vendor-data though
<smoser> and in it have a user-script.
<smoser> that way it is much less likely to accidentally get overwritten by the user.
<smoser> the general recommendation is that vendor-data not use runcmd for that reason.
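[A sketch of the arrangement smoser suggests, assuming the #cloud-config-archive format with an x-shellscript part for the vendor script; the script body is illustrative:]

    #cloud-config-archive
    - type: text/x-shellscript
      content: |
        #!/bin/sh
        # vendor-provided script: delivered as its own part, so a user's
        # cloud-config runcmd does not replace it
        echo "vendor setup ran" > /var/log/vendor-setup.log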
#cloud-init 2017-01-09
<powersj> magicalChicken: still getting too many open files https://paste.ubuntu.com/23771548/
<powersj> is this more in line with what you were seeing?
<magicalChicken> powersj: yeah, especially first ones there where the stacktrace is from in pylxd
<powersj> magicalChicken: ok, how can I help here?
<magicalChicken> i guess switching back to polling inside the instance lets the test suite get further before it hits it
<magicalChicken> powersj: pylxd is broken, there isn't really anything that can be done here
<magicalChicken> I'm going to look into the pylxd bug sometime today or tomorrow I guess and see if i might be able to patch it
<powersj> ok - should I fall back to pylxd 2.1?
<magicalChicken> I'll have to modify the test suite again to get it working with 2.1, the response from execute() is different
<powersj> ok then let's leave it at 2.2 and get pylxd fixed
<magicalChicken> We'd also lose all of the error handling for setup image
<magicalChicken> Yeah, fixing pylxd shouldn't be too difficult. I've read through the code a couple times before when I was starting on the test suite
<magicalChicken> The pylxd bug will probably cause issues for other projects as well at some point, so good to fix it
<powersj> magicalChicken: ok sounds like a plan. I am going to send you a list of scenarios that I am using today with no issues and what I am using to test your merge. Hopefully that helps
<magicalChicken> Sure
<magicalChicken> Raising the open file limit might buy the test suite enough space, but it probably isn't needed
<magicalChicken> powersj: I've also traced down the last centos issue to a systemd bug in the centos lxd images, so the test suite is doing the right thing there
<powersj> oh nice!
<rharper> powersj: magicalChicken: question on using integration tests... in the conf yaml which supplies the cloud-init, I've got a specific IP and hostname that I'd like to reference ... in the test data collections, there aren't any variable references; I just need to make sure I copy the value into the collect script ?
<magicalChicken> rharper: you can access the cloud-config provided to the test from the test script with self.cloud_config
<magicalChicken> oh, wait you mean the collect scripts themselves
<magicalChicken> I haven't added in any variables there, its definitely doable though
<rharper> so I have user-data with server: 192.168.12.23;   was thinking I'd set a variable, but that's probably too much trouble;
<rharper> if I didn't use shell to run the collect program, I could use python instead and import the user-data configuration to fetch the values
<magicalChicken> Most of the current collect scripts are like that
<magicalChicken> The script itself just grabs a file or the output of a command, then the python verifier script checks if its correct
<magicalChicken> In the python verify scripts there's some helper utilities as well to read through the cloud-config the test ran with
<rharper> ah, yeah, that's fair
<rharper> I forgot things were spread apart
<magicalChicken> I figured this way its easier to debug the verification, because downloading the images and all is slow
<rharper> hrm, most of the collect scripts don't redirect output to a file ... I suppose that's where I was confused; most of them run a test (grep $this $that) and return zero or non-zero
<rharper> ah, i see, output is stored via the script name
<magicalChicken> I don't actually have the test suite looking at the return code from a collect script
<rharper> sorry
<magicalChicken> oh, yeah stdout is saved
 * rharper is coming up to speed on adding one
<rharper> or rather modifying
<magicalChicken> There's also a 'create' command that can give you a template
<rharper> magicalChicken: powersj:  I added a test_ method to one of the ntp_server ones ... what's the top-level command I'd use to kick off say just the 'ntp_servers' integration test ?
<powersj> python3 -m tests.cloud_tests run -v -n xenial -t tests/cloud_tests/configs/modules/user_groups.yaml
<powersj> is an example of running on a xenial based one with the user_groups test
<rharper> sweet
 * rharper fires away and watches the world burn 
<powersj> :)
<rharper> http://paste.ubuntu.com/23772570/
<rharper> powersj: magicalChicken ^^
<powersj> rharper: what version of pylxd? `pip show pylxd
<rharper> pip?
<rharper> % apt-cache policy python3-pylxd
<rharper> python3-pylxd:
<rharper>   Installed: 2.0.5-0ubuntu1.1
<magicalChicken> rharper: also, which version the test suite
<rharper> I'm on xenial
<rharper> magicalChicken: how do I know ?
<rharper> I branched cloud-init from trunk last week
<rharper> for this bug fix
<magicalChicken> I haven't tested the version in trunk
<rharper> what's running on jenkins ?
<magicalChicken> also, what's in trunk will only work with pylxd 2.1
<magicalChicken> the version in trunk
<powersj> In my experience only 2.1.x of pylxd has worked sufficiently well.
<rharper> boo
<rharper> smoser:  ^^
<magicalChicken> there's a bug in pylxd 2.2
<magicalChicken> the version based on 2.2 is much better, but can't be used until that bug is fixed
<rharper> do you have a cloud-init-testing ppa with 2.1.x for xenial ?
 * rharper isn't going to do the pip thing here
<magicalChicken> rharper: that's probably a good call, pip has messed up my system pretty bad
<rharper> hehe
<magicalChicken> there isn't a ppa afaik
<powersj> The jenkins slaves have to use pip to grab a few things, so grabbing pylxd wasn't to big of a deal, but I agree a ppa would be nice...
<magicalChicken> the version in repos is 2.0 which is also incompatible
<rharper> powersj: cheater =P
<magicalChicken> a ppa would be good
<rharper> yeah
<powersj> rharper: I know :P
<rharper> is 2.1 in yak or zesty ?
<rharper> we can just copy archive into any old ppa
<magicalChicken> rharper: its in yakkety
 * rharper rmadisons
<rharper> cool
<rharper> yeah, y and z are 2.1
<magicalChicken> the problem with 2.1 is there's no way for me to do error checking at all during setup, thats why the switch to 2.2 was needed
<rharper> sure
<smoser> wait...
<smoser> we should run the tests in a tox virtual env
<smoser> powersj, and i talked about that.
<smoser> then we can get the pylxd that we want/need
<magicalChicken> smoser: no longer possible with the newer version of the test suite as it requires shell access
<smoser> >
<smoser> ?
<magicalChicken> smoser: i couldn't get support for other distros going without it
<smoser> i dont follow that.
<magicalChicken> util.subp() doesn't work inside tox env right?
<smoser> why wouldnt it?
<magicalChicken> isn't tox inside a chroot?
<rharper> smoser: I'm fine with tox; but I'd like a jenkins-runner like top-level entry point so I don't have to figure out how to call it
<smoser> tox is not a chroot, no.
<magicalChicken> smoser: oh, nvm, sorry
<magicalChicken> smoser: tests should run fine inside tox then
<magicalChicken> i can write a wrapper script
<smoser> tox kind of complains in some cases if you run a command (through it directly) that is not in the virtual env
<smoser> but, you'd run 'tox -e integration-tests'
<smoser> or something.
<smoser> the issue with this...
<powersj> well we have a whole host of arguments we need to pass
<smoser> is that pylxd requires c interfaces
<smoser> which sucks
<magicalChicken> oh, yeah that does suck
<smoser> meaning 'pip install pylxd' compiles code with gcc (and requires python-dev and the like)
<magicalChicken> i remember when we were trying to test curtin with python parted
<magicalChicken> I'm almost considering just writing a wrapper around lxc command line then
<magicalChicken> Since the current hold up on the test suite is a pylxd bug
<magicalChicken> And the previous one was also a pylxd bug
<smoser> i have one question...
<smoser> when i saw powersj running this... i have to run this more locally...
<smoser> each of the shells into a container took like 1 second to run
<smoser> or more
<magicalChicken> blame pylxd :)
<smoser> so collecting things was painful to watch
<magicalChicken> it creates a websocket interface then shuts it down each execute() call
<smoser> if i blame pylxd, then i'm quite open to pitch pylxd
<magicalChicken> yeah, there's no reason that should take that long
<smoser> but that doesn't seem like it really should be that slow.
<powersj> smoser: after upgrading to a newer version of pylxd that fixed auth issue it is far faster
<rharper> powersj: newer version being ?
<powersj> recall my updates about 50 mins -> 12 mins (or something of that magnitude)
<magicalChicken> I'm not sure why it is, but compare instance.execute() to 'lxc execute cloud_test_... "cat /var/log/cloud-init.log"
<powersj> 2.1.3
<powersj> or newer
<magicalChicken> powersj: that's interesting, maybe that's part of why tests were taking so much longer on jenkins than my laptop for a while
<smoser> hm..
<smoser> $ time sh -c 'for i in "$@"; do lxc exec x1 /bin/true; done' -- $(seq 1 10)
<smoser> real	0m1.705s
<smoser> user	0m0.240s
<smoser> sys	0m0.116s
<powersj> that plus zfs backend :)
<magicalChicken> powersj: haha yeah that definitely helped
<powersj> yeah - everything is running as fast as you were seeing now, so speed isn't an issue imo
<magicalChicken> the current devel version should be even faster, image download times are less than half what they were before
<smoser> magicalChicken, yeah, your instance.execute() is what i'm asking about.  above, there i did that in ~ 0.17 seconds per each
<smoser> (which is still slow)
<smoser> but i recall that being like 3 seconds when watching it on powersj's machine
<rharper> powersj: magicalChicken:  if I want to pip install python3-pylxd which version string should I use ?
<smoser> if those are much more similar with the lxc cmdline, then i'd say we keep pylxd
<smoser> and maybe we should anyway
<smoser> rharper, really... dont do that :)
<smoser> just use tox or virtual env to get you waht you need
<magicalChicken> rharper: for the version of the test suite in master, 2.1.3
<magicalChicken> but yeah pip is terrible
<magicalChicken> i have 4 or 5 different versions of every python library installed on my system
<magicalChicken> smoser: all my instance.execute() method does is call pylxd's execute()
<rharper> maybe a diff to tox.ini with a cloud-test group with the right defs ?
<magicalChicken> I am thinking that dropping pylxd may be good just so it doesn't break the test suite again
<smoser> i don't know
<magicalChicken> pylxd does keep the code clean though
<smoser> yeah... and honestly it shouldnt be that bad to bring up a web socket
<smoser> so it *shouldnt* be that slow
<smoser> and we happen to work with the people who write it :)
<smoser> i dont like that i have to have gcc for it in a virtual env...
<magicalChicken> i think the speed issue is mostly fixed
<magicalChicken> the current issue is https://github.com/lxc/pylxd/issues/209
<magicalChicken> its leaking file descriptors, so the test suite can't get through a full run without running out
<smoser> well, we can ulimit -a for now
<smoser> and the speed thing shoudl be fixable for sure...
<magicalChicken> smoser: there's a half finished fix for that bug, I just need to get it to pass tests and ping someone to pull it
<smoser> and we can/should  improve our collection and such to do fewer execs
<smoser> as over other arches, those will be  more expensive
<smoser> ie, ssh as the transport
<magicalChicken> i think 1 exec per collect script isn't too unreasonable
<magicalChicken> it may be possible to reduce the number of default collect scripts though
<smoser> sure its possible.
<smoser> we will just need to work things differently.
<smoser> but thats fine
<smoser> not a big deal now
<magicalChicken> Yeah, on kvm execute() should be pretty fast as well
<magicalChicken> its only when we get to remote instances that it'll be slow
<rharper> http://paste.ubuntu.com/23772854/
<rharper> smoser: powersj: magicalChicken: that lets me run it under tox on my xenial host
 * rharper will figure out how to pass the specific test as args and just include the main run command eventually 
<magicalChicken> rharper: nice
<magicalChicken> rharper: default behavior if -t is not specified is to just run everything
<rharper> right
<powersj> rharper: sweet... btw is that list of dependencies just a cut and paste of everything you had? Can we just specify pylxd and still run?
<rharper> that was the list of deps from apt-cache show python3-pylxd
<rharper> it's from the xenial testenv
<powersj> ok!
<rharper> and then the last 5 or below pylxd are package deps
<rharper> 2017-01-09 16:08:06,209 - tests.cloud_tests - DEBUG - running collect script: ntp_conf_servers
<rharper> 2017-01-09 16:08:06,373 - tests.cloud_tests - DEBUG - running collect script: cloud-init-output.log
<rharper> 2017-01-09 16:08:06,534 - tests.cloud_tests - DEBUG - running collect script: cloud-init-version
<rharper> smoser, I'm seeing sub-second collects on 2.1.3 pylxd
<rharper> also on zfs backend, but I think that looks reasonable;  but will wait until we see the whole thing run
<magicalChicken> rharper: on my system its usually ~8-10 seconds per test case, of which 6 of that is booting the system
<rharper> yeah
<rharper> why do we delete the base image each time ?
<rharper> that's somewhat annoying since I already use ubuntu-daily:xenial
<magicalChicken> rharper: save disk space
<rharper> had it on my system
<magicalChicken> there's a modified version of the image that the tests actually run from, which should be deleted
<rharper> right
<magicalChicken> i could set it to leave the base image behind
<rharper> but the base images could stay
<rharper> sorta like sync-images in curtin
<magicalChicken> the issue i run into is i only have 2G given to zfs on my system
<rharper> ideally we'd have a sync stage, and run stage which uses what's present
<magicalChicken> so I can only have 1 image at a time
<magicalChicken> but yeah, the test suite doesn't need to do that
 * rharper hands magicalChicken external SSD disk 
<magicalChicken> lol
<magicalChicken> i have an external 1T disk, but no power cable :(
<rharper> but ssd ?
<magicalChicken> No, not ssd
<rharper> this is usb3 128G ssd, should be helpful
<rharper> powered via usb bus which is nice
<magicalChicken> nice
<rharper> I'll pass it along at the next meet up if you like
<magicalChicken> serious? thanks :)
<magicalChicken> that'll actually help a lot
<rharper> yeah
<magicalChicken> I can add in an image-sync command that'll download every image needed
<rharper> yeah
<rharper> so when using tox -e foo ;;; it injects a cloud-init-0.7.9.zip ;;; how and when is that made?  ie, how do I know that it includes what's committed or what's uncommitted ?
<rharper> powersj: ^^
<rharper> wondering if the cloud-init that's running during the cloud_tests is my modified version or something else?
<magicalChicken> rharper: the cloud-init in the cloud-tests is either from a repo, a ppa or a .deb
<powersj> rharper: unless you specify a specific deb to inject... what he said :)
<magicalChicken> the test suite doesn't modify the cloud-init itself
<rharper> but how does it get injected into tox
<powersj> it's not tox... it's the lxc image
<rharper> hrm
<rharper> don't we have a curtin-like pack where we package up what's in-tree and inject it ?
<magicalChicken> rharper: Oh, the version that gets copied into tox is probably just a clone of the current version
<magicalChicken> rharper: but i don't think its used
<magicalChicken> There isn't a pack equivalent just because the way cloud-init gets built and what version of python it uses all depends on the release
<rharper> so, I know at least when we run the tox command, we're using the right bits since I see a change in the server values
<rharper> but when the cloud_tests runner launches an lxc container, it starts with the base image, and then doesn't it inject the current git tree ?
<magicalChicken> It can install the current git tree if you want it to with a setup script
<rharper> or are we not yet to using mount-image-callback lxd:container-name
<rharper> well, if I'm developing a new test, how do I validate it ?
<magicalChicken> rharper: the way setup works right now is you can execute a script or use one of the setup args
<powersj> rharper: I typically build the deb and specify it
<rharper> powersj: urg, really ?
<magicalChicken> rharper: I normally use --ppa "ppa:cloud-init-dev/daily"
<rharper> that's a long round-trip
<rharper> magicalChicken: well, you were creating tests for existing code
<magicalChicken> rharper: you don't have to build a new deb just to test a new test case
<rharper> I'm fixing a bug
<rharper> so I need to run my fixed code; would prefer not to have a ppa build in the iteration loop, no ?
<magicalChicken> I'd just push to a ppa
<magicalChicken> Oh
<rharper> -1 =P
<magicalChicken> I can set a script up to do a build automatically
<powersj> I get the need for fast turn around, but these aren't unit tests so I guess I can accept some time required. Maybe a full ppa is too much then?
<rharper> so, we're doing this via tox; so let me figure out what cloud-init-0.7.9.zip contains
<magicalChicken> I think with tox, that zip will be the current contents of cloudinit/
<rharper> if that's the current tree bits, then it's a matter of having the runner pick that up and inject that into the snapshot
<magicalChicken> Since that's used for unittests as well
<rharper> magicalChicken: I hope so
<rharper> yeah
<rharper> that makes sense
 * rharper checks .tox 
<magicalChicken> rharper: the tricky part is automatically building though
<magicalChicken> I've never managed to build cloud-init on my system
<rharper> looks like the zip is installed but not kept
<magicalChicken> Although I guess using setup.py rather than building a deb would work
<magicalChicken> rharper: If you modify a file then run tox -e flake8, it sees the change immediately
<magicalChicken> So I think its just the current tree
<rharper> yeah
<rharper> ok, yeah, so pip has tree installed;  but we don't yet have a way of putting tree into the container
<magicalChicken> rharper: I can add that to setup_image
<rharper> magicalChicken: so with proper deps on the host, package/bddeb should create a deb from current tree ... you're saying that doesn't work for you ?
<magicalChicken> I've not managed to get bddeb working yet
<rharper> possibly due to your pip cruft ;  but on a clean setup (like in a container ) it should work
<magicalChicken> Yeah, it works fine in a ppa, so that should work
<rharper> so, we could 1) copy in source into the container and then build cloud-init via tools/bddeb
<rharper> then inject that deb for the snapshot
<magicalChicken> That should work fine assuming that bddeb works
<rharper> it should, and if we run bddeb in a container, that should make it repeatable
<magicalChicken> The other option is setup.py install, but that could be messy
<rharper> it's probably pretty close though it will miss any pre-post stuff that dpkg did; (which I don't think is that much )
<magicalChicken> Yeah, I think dpkg mainly just does systemctl enable cloud-init
<magicalChicken> I guess if I add the image sync feature then firing up containers is fast
<magicalChicken> So there could be a 'build container' that runs before the rest of the tests start
<magicalChicken> If the test suite is told to use current tree
<rharper> right, I'd definitely sync the images needed;
<magicalChicken> That should be pretty straight forward, I guess general policy could be always leave around unmodified versions of images
<rharper> then you could copy in the src, into a build container;  once it's built and installed, snapshot that;  use that snapshot as the base for each test
<rharper> we might need to purge the build-deps and other things (that's going to add some startup cost)
<rharper> but I think for the developer path; that's not bad
<magicalChicken> The build container could even be separate from the test containers
<rharper> for the normal ci path, pointing at PPA or repo is fine
<rharper> right
<magicalChicken> Just build, copy the deb back into the host, then nuke it
<rharper> that's also possible
<rharper> build once, copy out, then run your normal 'use this cloud-init.deb'
<magicalChicken> Yeah, for ci it should work well to just use a ppa
<rharper> cool
<magicalChicken> That would also make it possible to just build 1 deb for all ubuntu images too
<rharper> yeah
<rharper> it's python
<magicalChicken> Only issue is trusty would need py2
<rharper> bddeb takes flags
<magicalChicken> but trusty can just be run separately
<rharper> for python2 or python3
<rharper> you could run the build twice
<rharper> and generate the 2 and 3 versions, copyout etc
<magicalChicken> Simplest way might be just to add a new command to test suite to build a deb in a clean container
<magicalChicken> and then a script that does that, then runs the test suite with the given args
<rharper> sure;  do we have a way of chaining these commands?  right now, I just call the 'run' command
<magicalChicken> You can tell 'run' to use multiple distros in 1 go
<magicalChicken> like 'run -n xenial -n yakkety -n zesty -n stretch'
<magicalChicken> also multiple tests and multiple platforms
<powersj> rharper: thanks for the tox starter :) here is a simpler version that works for me and an example of passing in arguments https://paste.ubuntu.com/23773398/
<powersj> now as you suggested if we combine that with building the local checkout in a 'build' container and you should have fast tests then for easier development ;)
<magicalChicken> powersj: nice, I'll pull that + rharper's tox config into the devel branch
<powersj> magicalChicken: cool, really there are only 2 differences 1) is you only need pylxd specified to run, if we want to lock down other requirements we can and 2) the default arguments
<magicalChicken> I'm going to try getting the build container + 'run from current tree' script put together tonight
<powersj> in mine {posargs:-n xenial} basically means run with -n xenial unless you get something else and in that case run that
<powersj> ok
<magicalChicken> Right yeah, passing in args is definitely nice there
<powersj> I'm actually surprised tox worked so well... I could swear you and I tried it and it failed terribly
<magicalChicken> I'll try to base that off of the pylxd 2.1.3 version so that it doesn't have to wait on pylxd being fixed
<magicalChicken> powersj: I'm pretty sure I had tried it before too
<magicalChicken> I may have just done something wrong with the environment though
<powersj> magicalChicken: any concerns about doing this for when we add KVM backend?
<powersj> starting to look at that is what I hope to do tomorrow
<magicalChicken> hmm
<powersj> make sure we don't back ourselves into a corner now
<magicalChicken> well, its not going to affect the actual cli
<powersj> (you can think about it) ;)
<magicalChicken> so we can always just switch bach
<magicalChicken> We're going to have to have root permission to modify the images for kvm
<magicalChicken> Although I may be able to rig something up where that happens inside of lxd :)
<powersj> lol
<magicalChicken> I have wanted to do that for a while, so vmtests can stop needing root
<magicalChicken> Something like a fuse mount of a directory inside lxd where the image has been mount-image-callbacked
#cloud-init 2017-01-10
<rharper> powersj: magicalChicken:  so when a test fails, I get a nice dir in /tmp which includes the collect data ... if I want to modify that data (instead of recollecting it) and then re-run the unittest, is that possible?
<powersj> sure
<powersj> Instead of running "run" you can run "verify"
<powersj> python3 -m tests.cloud_tests verify -d <your dir>
<powersj> I believe is all you need.
<rharper> sweet
<powersj> I did this when writing the tests so I didn't have to collect each time :)
<powersj> you were the one who pushed for this too :P
<rharper> btw, I figured out how to pass all the args
<rharper> yes, yes I did
<rharper> pain in curtin
<powersj> I can imagine
<rharper> waiting 120 seconds per change in the unittest
<rharper> this is faster
<rharper> not by much (2m15 seconds) to run a single ntp_server test; that's mostly image download
<rharper> which magicalChicken said he'd look at getting a sync-images
<rharper> as we just throw them away
<magicalChicken> rharper: I'm hoping to get to that today
<magicalChicken> rharper: Still working on finishing up deb build/run from tree script
<rharper> cool
<magicalChicken> rharper: I'm basing this + saving image on the pylxd 2.1.3 version so it can merge soon
<rharper> currently I'm passing --deb to the run command after doing a package/bddeb locally
<rharper> so that works nicely with the tox env
<rharper> tox -e citest -- run -v -n xenial --deb cloud-init_0.7.9-2-gca191a2-1~bddeb_all.deb -t tests/cloud_tests/configs/modules/ntp_servers.yaml
<rharper> updated the citest tox command to just have {posargs}
<magicalChicken> rharper: This is pretty much same thing, just building in a container to keep the host clean
<rharper> yep
<rharper> nice
<magicalChicken> rharper: I pulled in tox with {posargs}
<magicalChicken> probably going to have a citest_run env and a citest env separate
<magicalChicken> where citest_run is shortcut to just run with local tree on default system
<rharper> yeah
<rharper> nice
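[The pasted tox configurations above are no longer available; a minimal sketch of the citest environment being discussed, with the dependency pin and settings as assumptions:]

    [testenv:citest]
    basepython = python3
    deps = pylxd==2.1.3
    passenv = HOME
    commands = {envpython} -m tests.cloud_tests {posargs}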
<rharper> smoser: when maas deploys, does it typically write a 'user_data.sh' script ? or is that something a maas user would add ?
#cloud-init 2017-01-11
<smoser> rharper, ?
<rharper> here
<smoser> what is user_data.sh ? context ?
<smoser> above, tox -e citest == tox-venv citest
<rharper> maas bug
<rharper> Bug 1644229
<rharper> Read from http://10.2.0.50/MAAS/metadata/2012-03-01/user-data (200,
<rharper> 18443b) after 1 attempts
<rharper> util.py[DEBUG]: Writing to /var/lib/cloud/instance/scripts/user_data.sh - wb: [448] 13399 bytes
<rharper> I was wondering if maas always sends a "user_data.sh" during deployment, or if that's custom user-data from the maas user
<rharper> the bug is that the user_data.sh script doesn't exit 0; so cloud-init says the deployment failed
<smoser> i'm not sure what woudl write user_data.sh
<smoser> how do i open a x-7z ?
<magicalChicken> smoser: 7z program, it has args like tar
<magicalChicken> it doesn't do directories right though
<rharper> smoser: p7zip-full
<rharper> then 7z x <file>
<smoser> rharper, i really dont know what wrote that.
<smoser> that too is obnoxious
<smoser> but anyway... i don't know what wrote that file.
<smoser> cloud-init in user-data is getting a multi-part input
<rharper> ok, I don't think it's "normal"
<smoser> and one of the parts has the file name user_data.sh
<rharper> right
<smoser> i dont see it in maas though
<rharper> right
<rharper> I suspect it's something they added for debugging or something but that's the cause of the failure (the script)
<rharper> if they remove it,  things should work.
<smoser> ok. there it is
<smoser>  generate_user_data
<rharper> in maas?
<smoser> yeah
<rharper> what data does it pull in ?
<rharper> looks like power info for one
<rharper> hrm, but that's config, not a script
<smoser> no. its a mime multipart
<smoser> one part x-shellscript
<smoser> which gets put in there
<smoser> i really would like to get the contents of /var/lib/cloud/
<smoser> that'd be sufficient
<rharper> ok
<smoser> it's weird that it doesn't output anything
<smoser> it just exits fail
<smoser> i have to run
<magicalChicken> powersj, rharper: 'run tests on current code' functionality is done, PR is at:
<magicalChicken> https://code.launchpad.net/~wesley-wiedenmeier/cloud-init/+git/cloud-init/+merge/314496
<rharper> magicalChicken: sweet!
<powersj> magicalChicken: thanks! will test it out later this morning
<powersj> magicalChicken: fyi - I like the previous layout of having an inheritance model with common interfaces and I wish to continue using that model. My hope was to determine the flow and methods we use inside the KVM interface.
<powersj> The issue we may have is the model we use to interact with a VM versus a container. With the container we can execute commands directly, with the VM we would need SSH or turn the system off and mount.
<magicalChicken> powersj: An additional setup option to set a root password and enable ssh could be used, something like setup_image.backdoor
<magicalChicken> Then we could just ssh in with those credentials for everything
<magicalChicken> Since the vm would be running on our local host, or at least on the same LAN there wouldn't be much cost to ssh
<magicalChicken> The execute method could still get stdout and exit code over ssh
<powersj> magicalChicken: as long as smoser is fine with us modifying the image like that, ok :) I was trying to avoid modifying the image in a way that may affect a test. However, when we add additional platforms like the clouds, we will have to add SSH anyway it seems.
<rharper> I think smoser is right with exec;
<rharper> I think we want to use exec over whatever transport
<rharper> and the substrate layer should do what it needs to enable exec (remote ssh, if needed)
<magicalChicken> Yeah, I think for everything other than lxd, ssh is the easiest way to handle exec
<powersj> ok
<magicalChicken> I think its also a fair assumption that after a certain timeout, if the system isn't accessible over the network, it can just be stopped
<magicalChicken> For kvm, we could just send SIGKILL to qemu for shutdown
<rharper> why not use a trailing collect script to shutdown like we do in vmtest ?
<rharper> at least in powersj proposal, we're injecting the collect scripts anyhow;  no reason not to also have a boiler plate to shutdown the instance at the end of collection
<magicalChicken> the platform objects are used with a context manager that shuts down using instance.shutdown()
<magicalChicken> i don't think it makes sense to inject collect scripts
 * smoser has to read
<rharper> magicalChicken: that's fair (no inject) if we're settled on exec
<magicalChicken> push_file() can be implemented with execute
<magicalChicken> yeah, so run_script() is just push_file() + execute('/bin/bash file')
<powersj> yeah it sounds like using exec means using ssh over injecting files + running + shutdown + pull files
<rharper> well, pull_files and then shutdown ?
<magicalChicken> pull_files() can be done over ssh as well
<rharper> the alternative is using a secondary disk image which can be accessed directly offline
<magicalChicken> and shutdown is just execute('shutdown')
<rharper> ala vmtests 'collect disk'
<magicalChicken> that would require collect behavior to change based on platform though
<magicalChicken> it wouldn't be too bad, just for part of collect, but its still more work
<rharper> sure but that's why we're having platform abstraction ?
<magicalChicken> the abstraction was originally just over how we get to the point where we have something we can call execute() on
<magicalChicken> it may also be nice to have ssh set up so we can get in for debugging
<rharper> I think file pull via execute is fine;  we can avoid adding the secondary disk if possible;
<rharper> I would like a 'keep the instance on failure' flag like we have in vmtest
<rharper> specifically for live inspection
<rharper> in which case, it could emit the ssh info to connect
<magicalChicken> Yeah, 'keep instance on failure' would be nice
<magicalChicken> I've been meaning to do that for lxd as well
<magicalChicken> And have it configurable via cmdline
<magicalChicken> Yeah, there could be a debug message with ssh info during setup
<powersj> magicalChicken: rharper: sounds like the current model then is: get the image, modify it by adding an ssh backdoor, then for each test exec runs the specified commands via ssh and the output is collected then, not after the instance is turned off.
<magicalChicken> Also, for the most part, run_script() is the only consumer of execute(), since pull_file() is only used by 'bddeb' and push_file() is only used if running with --deb or --rpm
<magicalChicken> powersj: Yeah, I think it makes sense to do it like that
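To make the flow just summarized concrete, here is a minimal sketch of an instance object built entirely on execute() over ssh and used as a context manager that shuts the instance down on exit. The class and method names are hypothetical, not the actual cloud_tests API, and ssh argument quoting is glossed over (it comes up a few lines below):

    import subprocess


    class SSHInstance(object):
        """Hypothetical instance wrapper: everything goes through execute()."""

        def __init__(self, address, user='backdoor'):
            self.target = '%s@%s' % (user, address)

        def execute(self, cmd):
            # run as root via sudo, per the discussion above; note that ssh
            # flattens the argument list into one string (quoting is discussed
            # a few lines below)
            proc = subprocess.Popen(['ssh', self.target, 'sudo'] + list(cmd),
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.PIPE)
            out, err = proc.communicate()
            return out, err, proc.returncode

        def push_file(self, local_path, remote_path):
            # push_file() over the same transport: feed the file via ssh stdin
            with open(local_path, 'rb') as stream:
                proc = subprocess.Popen(
                    ['ssh', self.target, 'sudo', 'tee', remote_path],
                    stdin=stream, stdout=subprocess.DEVNULL)
                proc.wait()

        def run_script(self, local_script):
            # run_script() is just push_file() + execute('/bin/bash <file>')
            self.push_file(local_script, '/tmp/test-script.sh')
            return self.execute(['/bin/bash', '/tmp/test-script.sh'])

        def shutdown(self):
            self.execute(['shutdown', '-h', 'now'])

        def __enter__(self):
            return self

        def __exit__(self, *exc_info):
            self.shutdown()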
<magicalChicken> For the image modification and setup, images.execute() could just be a call through to mount-image-callback
<rharper> there's possibly a missing step of uploading the modified image
<rharper> on clouds, we'll need to create the backdoored image, upload it, then use the uploaded image
<rharper> but at least for the 'local' case lxd and kvm, mic is fine
<magicalChicken> There's a snapshot object in between image modification and launching instances
<magicalChicken> It should work fine for the snapshot to represent a remote image that can be launched right away
<magicalChicken> So the upload could happen during snapshot.__init__
<smoser> i'm pretty happy with the backscroll there. :)
<smoser> magicalChicken, yes, snapshot was intended to do the upload... basically that takes an "image" and turns it into something that can be started.
<smoser> wrt ssh... do we have a test that disables ssh ?
<smoser> if we do not, then at the moment we can punt on backdooring the image
<smoser> and just use port 22
<magicalChicken> default root password might differ between distros though
<smoser> well, we're not going to go in as root.
<smoser> well, maybe you would
<powersj> don't believe we have a disable ssh test at the moment. just a number of ssh key generation tests
<smoser> but if you backdoor, you just add a user that can sudo
<magicalChicken> right, shutdown could be a 'sudo shutdown'
<magicalChicken> i don't think any of the collect scripts really need to be root
<smoser> well, execute() assumes root
<smoser> it should at least
<magicalChicken> it could just always sudo then
<smoser> that can be done over ssh easily enough
<magicalChicken> cmd is a list, so just ['sudo'] + cmd should work fine
<smoser> almost
<smoser> :)
<smoser> http://smoser.brickies.net/git/?p=snippits.git;a=blob;f=bash/ssh-attach;hb=HEAD
<smoser> more context on that basic path at
<smoser>  https://gist.github.com/smoser/88a5a77ab0debf268b945d46314ea447
<magicalChicken> Oh, that's really nice
<magicalChicken> So it doesn't flatten everything into 1 cmd
<magicalChicken> s/cmd/arg
<smoser> well, yeah. the wrapper unflattens
<smoser> if we're doing ssh, we probably do want to use a python library for it...
<smoser> paramiko
<smoser> but the command execution wrapper business still can be managed.
<magicalChicken> Yeah, that should be cleaner than using subp
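As a rough illustration of the flatten/unflatten problem discussed just above: ssh joins the remote command's arguments with spaces, so each argument has to be re-quoted on the client side to survive the trip. A hedged sketch (paramiko, mentioned above, would replace the subprocess call, but the quoting concern is the same):

    import shlex
    import subprocess

    try:
        quote = shlex.quote          # python 3
    except AttributeError:
        from pipes import quote      # python 2 fallback


    def ssh_execute(target, cmd, sudo=True):
        """Run a command list on a remote host, preserving each argument."""
        if sudo:
            cmd = ['sudo'] + list(cmd)
        # re-quote every argument so ssh's flattening into one string is harmless
        remote = ' '.join(quote(arg) for arg in cmd)
        proc = subprocess.Popen(['ssh', target, remote],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = proc.communicate()
        return out, err, proc.returncode


    # example: ssh_execute('ubuntu@10.0.0.5',
    #                      ['sh', '-c', 'echo "hi there" > /tmp/x'])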
<magicalChicken> The new img_conf format in the devel version of the test can automatically apply certain setup options to images on certain platforms, so that handles enabling backdoor on kvm
<magicalChicken> There's a lot of general cleanup in that branch other than just enabling debian/centos, so its probably best to build this off of there
<powersj> magicalChicken: didn't see any comments from you on image sourcing. rharper suggested reusing curtin's sync. smoser any comments there?
<magicalChicken> powersj: We'd need to have something url based as well for other distros, but that could be files thrown into the same directory with a separate index for them
<magicalChicken> I think curtin image sync makes sense
<smoser> powersj, well, we want to sync yes.
<smoser> but curtin syncs maas images
<rharper> what we don't have AFAIK, is any streams data for other distro images
<smoser> which require a kernel external
<smoser> no boot loader
<rharper> it may be we need a sync per 'distro'
<magicalChicken> rharper: that's what i meant with having a raw download url as well
<rharper> well, that's not what I mean
<magicalChicken> oh
<rharper> streams publishes where the URL is
<smoser> well... http://smoser.brickies.net/image-streams/
<magicalChicken> just for ubuntu though
<rharper> ie, if they're building new ones, you can't just blindly wget $URL which may not get you what you want
<rharper> magicalChicken: even for ubuntu, we use the streams data to figure out what we want from what's available
<smoser> so that is just  manually maintained and updated
<magicalChicken> rharper: I thought at least for debian there was a kind of 'latest build' url
<rharper> smoser: sneaky =)
<smoser> but, it works. and we could do something similar
<rharper> magicalChicken: right but that's still not what we want
<magicalChicken> rharper: right o
<rharper> if you can only recreate on a specific image or release; you  sort of want history
<magicalChicken> yeah that makes sense
<magicalChicken> but: http://smoser.brickies.net/image-streams/ would work
<rharper> yeah
<magicalChicken> thats actually pretty nice
<smoser> we do want to at least be able to easily see that something changed between two runs
<smoser> my images are currently synced into serverstack
<rharper> nice
<rharper> would be nice to see if there are other sources of published images that are newer
<rharper> like AMIs ?
<rharper> surely there are newer centos7 images
<rharper> smoser: in general do you want to host the pulling of images ?
<smoser> so.. in my design i really just pushed this all off to the "platform"
<rharper> we might need to mirror that service to prodstack
<smoser> i forget what i called it, but essentially you ask the platform to get you an image that you can modify
<rharper> yeah
<smoser> platform.get_me_an_image("ubuntu/foo")
<rharper> I think that's a good abstraction
<rharper> since we'll need to poke at each substrate differently
<magicalChicken> smoser: that had to be modified a bit
<magicalChicken> there's basically an image config with information about how to locate the image
<magicalChicken> which can be different on each platform
<magicalChicken> so 'xenial' -> 'os=ubuntu release=xenial arch=amd64'
<smoser> thats more an 'alias' than a 'config'
<magicalChicken> theres more information there as well
 * smoser looks
<magicalChicken> like timeouts for stuff and setup options that may be required
<magicalChicken> don't look at master, its broken
<magicalChicken> look in wesley-wiedenmeier/cloud-init:integration-testing
<magicalChicken> That's the main reason I want to base the kvm development off of the current version of the tests, the new img_conf format is much cleaner
<smoser> hm... i'll read some. i'm not convinced :)
<magicalChicken> smoser: the version in master is pretty bad, it shouldn't be used
<magicalChicken> but there is always going to have to be a kind of alias system, since we want the same os_name to refer to the same release on every platform
<smoser> i still dont follow really.
<magicalChicken> so whether the identification info is in 1 place or inside the platform or inside releases.yaml doesn't really matter, its the same thing
<magicalChicken> smoser: each release has a name, so 'xenial', or 'stretch' or 'centos70'
<magicalChicken> and the new img_conf maps that name and the platform name to all config needed to locate and use that image on that platform
<magicalChicken> so that platform.get_image() can just be passed config.load_os_config('platform_name', 'image_name')
<powersj> so saying xenial in a run of the integration tests knows which AMI to pick on AWS or what lxc command line option to use or what simplestreams command to run to get the image for kvm
<smoser> i agree that something needs to translate 'ubuntu/xenial' into an image, and that takes some additional information.
<smoser> a.) 'ubuntu' is the os, and 'xenial' means 16.04 ...
<smoser> b.) where to get this image or create access to it (get an ami or use lxc or ... )
<smoser> i think though, that i really consider the details of that to be a platform thing
<smoser> possibly even configurable through the platform
<magicalChicken> smoser: it is configurable for each platform separately
<magicalChicken> in the new format
<magicalChicken> the main reason for having the image config and per-platform image location information together
<magicalChicken> is that some of the image config may change based on the platform
<magicalChicken> i.e. the timeout for booting xenial on lxd is not necessarily the same as the timeout for booting xenial on aws
<magicalChicken> or setup_image options that are enabled by default are not the same for all platforms
<magicalChicken> the actual implementation of downloading the image is handled by the platform object, the img_conf information is just used by that
<smoser> thats fine and makes sense. but you've made the 'Platform.get_image()' much less easily usable
<smoser> you can't do anything without a platform
<magicalChicken> getting an image is just platform.get_image(config.load_os_config('lxd', 'xenial'))
<smoser> so having that platform thing be easily usable is important.
<magicalChicken> I could also do the config.load_os_config silently inside the platform
<magicalChicken> so it could be platform.get_image('xenial')
<magicalChicken> or even change os names to be in the 'ubuntu/xenial' format
<smoser> i dont have strong feelings on 'ubuntu/xenial'. it is obvious what that means to you and me ('xenial').
<smoser> but it is not so obvious if you just use '7'
<smoser> rather than centos/7
<magicalChicken> the centos ones are 'centos70' and 'centos66' rn
<smoser> i think having a delimiter there makes sense.
<smoser> but... meh. not all that important.
<smoser> you're right in that something has to take that string and make sense of it.
<smoser> i think we're mostly in agreement.
<magicalChicken> yeah, the delimiter might look nicer
<magicalChicken> the name is just used as a dictionary key, so its pretty easy to replace
<smoser> its ok if  we have some alias thing that turns one into another.
<smoser> for  now, i think we should just not bother with non-ubuntu on kvm. dont halt yourself on that. we'll find images, and then we'll enable other os there.
<magicalChicken> that makes sense
<magicalChicken> if the sstreams mirror is per distro, it would be no trouble to add another one once a source is found
<magicalChicken> There's also a spreadsheet I have going to track all this at:
<magicalChicken> https://docs.google.com/spreadsheets/d/1DAzBlh-wk-rv-WRjllNRG6nnHtAmD0EFBLEYtu8weII/edit#gid=0
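To illustrate the img_conf idea discussed above, a hedged sketch of how a release name plus a platform name could resolve to everything needed to locate and boot an image, including per-platform overrides like boot timeouts. The schema, field names, and values here are invented for illustration, not the real releases.yaml/img_conf format:

    # Hypothetical img_conf data; keys and values are illustrative only.
    IMG_CONF = {
        'xenial': {
            'default': {'os': 'ubuntu', 'release': 'xenial', 'arch': 'amd64',
                        'boot_timeout': 120},
            'lxd': {'alias': 'ubuntu:xenial'},
            'kvm': {'stream_url': 'http://example.com/image-streams',
                    'boot_timeout': 180},
        },
    }


    def load_os_config(platform_name, image_name):
        """Merge the shared config with the per-platform overrides."""
        conf = dict(IMG_CONF[image_name]['default'])
        conf.update(IMG_CONF[image_name].get(platform_name, {}))
        return conf


    # platform.get_image() would then consume something like:
    #   load_os_config('kvm', 'xenial')
    #   -> {'os': 'ubuntu', 'release': 'xenial', 'arch': 'amd64',
    #       'boot_timeout': 180,
    #       'stream_url': 'http://example.com/image-streams'}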
<smoser> something that comes to my mind...
<smoser> the kvm 'platform'
<smoser> lxd is a cloud platform. because it handles metadata for us (and puts that stuff into /var/lib/cloud/seed/)
<smoser> kvm is not a cloud platform
<smoser> kvm+NoCloud is analogous to lxd in that sense.
<smoser> i think when we say 'kvm', we're really meaning "kvm+NoCloud" and even then probably kvm+NoCloud-attachedDisk (versus seeding noCloud).
<magicalChicken> yeah, i think kvm + seed disk makes the most sense to do
<magicalChicken> since that's used by some cloud setups
<smoser> well, seed disk differs.
<smoser> no cloud, to my knowledge, uses NoCloud other than uvt
<smoser> ConfigDrive is different
<smoser> how do i deal with root...
<magicalChicken> we could try basing this on ConfigDrive then
<magicalChicken> to test openstack support
<smoser> i think nocloud is fine and probably best for now. configdrive is a bit more involved.
<smoser> and we can get that easily enough from a real-ish openstack
<smoser> https://gist.github.com/smoser/b32bb1c33564d1d46971cd9ded2e8477
<smoser> magicalChicken, ^ that is a failsafe ssh that i had set up.
<smoser> i think there are some bugs in it, but it is a starting point
<smoser> and https://code.launchpad.net/~smoser/+junk/backdoor-image
<magicalChicken> smoser: nice, that looks pretty easy to use
<smoser> that backdoors an image, adding a user that can sudo
<smoser> it might be nice to hook in the failsafe root console too
<smoser> for debugging
<magicalChicken> yeah, I could add that as a setup_image option
<smoser> you add that, and then hit 'alt-f2' and 'enter' and root prompt
<smoser> :)
<magicalChicken> smoser: nice, would be good for vmtests too
<powersj> magicalChicken: when I check out your branch I need to create a tag before I build it looks like
<magicalChicken> powersj: the integration-testing-invocation-cleanup branch?
<powersj> do you run something like `git tag -a 0.7.9 -m "my test"`
<powersj> yeah
<magicalChicken> no
<magicalChicken> it should just work
<magicalChicken> it commits inside the build container, so it doesn't mess with the main repo
<powersj> well running the tox citest_run fails because git describe fails
<magicalChicken> I'm not sure why that's happening
<magicalChicken> is git describe failing inside the build or on the host?
<powersj> Is this the proper way to check out your branch?
<powersj> git clone -b integration-testing-invocation-cleanup https://git.launchpad.net/~wesley-wiedenmeier/cloud-init
<powersj> host
<magicalChicken> I'm not even sure what would be calling git describe in the host other than tox
<powersj> well that is where it fails
<magicalChicken> and the zip built by tox isn't used for anything really, it shouldn't affect anything
<magicalChicken> does tox -e flake8 work?
<powersj> here is an example of what I was doing but via jenkins: https://jenkins.ubuntu.com/server/job/cloud-init-citest-run/1/console
<powersj> and no, flake8 or just 'tox' doesn't work until I create a tag with 0.7.9 in it
<magicalChicken> its failing before cloud_tests are even called
<magicalChicken> something broke in the git clone
<powersj> after cloning I don't have any tags from your branch
<magicalChicken> ?
<magicalChicken> why
<powersj> no idea but git tag -l shows nothing
<magicalChicken> Let me try to clone in a clean environment, I have no idea how that would happen
<powersj> (this is where my git foo is lacking)
<magicalChicken> powersj: I'm seeing the same issue with git describe from cloning with that url
<nacc> what repo (i can try and help resolve the git side, at least)?
<magicalChicken> nacc: ~wesley-wiedenmeier/cloud-init:integration-testing-invocation-cleanup
<magicalChicken> nacc: I think it must be something to do with cloning via https on launchpad, because using my ssh key it works
<nacc> magicalChicken: hrm, i see no tags over in your repo?
<nacc> https://git.launchpad.net/~wesley-wiedenmeier/cloud-init/refs
<nacc> seems to only list branches?
<magicalChicken> nacc: that is really strange, i see tags on my working copy
<nacc> magicalChicken: with a fresh clone? let me also try locally
<magicalChicken> maybe I'm only seeing the tags from my upstream remote and they didn't get pulled in
<magicalChicken> I'm going to try with a fresh clone again, maybe my repo just doesn't have tags at all
<nacc> magicalChicken: it's possible you only pushed your branches and not tags by refspec? (or with --tags)
<magicalChicken> nacc: I might have, would 'git push --tags' resolve?
<nacc> magicalChicken: presuming that's what you want to do (push all your local tags) (and you might need to specify a remote, depending on your git configuration for that repository)
<magicalChicken> nacc: I have my repo set as default remote, I think it worked
<magicalChicken> There's the same tags as upstream at https://git.launchpad.net/~wesley-wiedenmeier/cloud-init/refs now
<magicalChicken> I must have just messed up when I set my repo up originally
<nacc> yep i see tags now
<magicalChicken> nacc: thanks for the help, I'm still not great with git
<magicalChicken> powersj: clone + describe is working now
<powersj> magicalChicken: ok
<powersj> nacc: thank you!
<magicalChicken> http://paste.ubuntu.com/23783539/
<nacc> magicalChicken: np! i think by default, unless you specify a push refspec in your git config, `git push` only pushes your current branch (see `man git-push` for the defaults)
<magicalChicken> i guess that makes sense as default behavior since you may have tags just for your own reference
<nacc> yep
<smoser> powersj, you dont have the tags locally
<smoser> nacc knows that sort of stuff
<smoser> powersj should be relying on the upstream tag
<powersj> smoser: yeah I believe nacc got us all sorted out now :)
<smoser> ah. i see.
<powersj> no more little hack
<powersj> magicalChicken: https://paste.ubuntu.com/23783582/ on my laptop things timed out, on jenkins it is running just dandy :\
<powersj> jenkins run so far: https://jenkins.ubuntu.com/server/job/cloud-init-citest-run/2/consoleText
<powersj> is there a way to triage where it is getting stuck or slowing down?
<magicalChicken> this is all running on old pylxd so stuff may be failing silently
<magicalChicken> looks like jenkins run is working perfectly though
<powersj> yeah
<magicalChicken> I'm not sure what caused timeout on your laptop
<powersj> hmm
<magicalChicken> looks like it failed before the first instance ever booted
<powersj> I will disconnect from VPN and make sure that isn't killing me again
<magicalChicken> you may want to try increasing the timeout a bit for bddeb
<magicalChicken> i thought about adding a flag to adjust it but it didn't seem needed on a decent internet connection
<magicalChicken> because the initial boot for bddeb installs devscripts and that has a ton of deps
<magicalChicken> also, looking at this, I should have used run_stage for the tree_* commands, since it tried to go and do the actual run even though build failed
<magicalChicken> I'll switch to using that real quick, it'll be cleaner too
<powersj> ok and which timeout should I bump?
<magicalChicken> xenial boot timeout
<powersj> ok
<powersj> ah ok so the generic timeout for a release
<magicalChicken> with the old img_conf there's only one
<powersj> ah that's right
<magicalChicken> it takes ~80 seconds for me to do initial boot including installing devscripts for bddeb, so I could see it taking 120s if you're on vpn
<powersj> yep it took just under 3 mins
<magicalChicken> :(
<magicalChicken> that's way slower than it should be, but i guess its just network speed
<powersj> I don't have the best connection when I'm in WA
<magicalChicken> the problem is devscripts has py2 deps
<magicalChicken> so tons of stuff gets pulled in
<smoser> nothing you can really do about it.
<smoser> you cant test a deb without buildling a deb
<magicalChicken> its fine for jenkins since the servers have good network
<smoser> still not really ok.
<smoser> its still a *ton* of io
<smoser> but, dont know what we can really do about it.
<magicalChicken> I think the bddeb/tree_run paths are really only for testing stuff in local branches anyway
<smoser> bddeb could probably use dpkg-buildpackage
<magicalChicken> Jenkins can just build 1 deb and use it for all tests
<smoser> which has a bunch less
<magicalChicken> yeah, debuild may be overkill
<smoser> well, it can also just use the daily build ppa
<smoser> and not build anything
<magicalChicken> yeah, that's probably the cleanest way to do it
<magicalChicken> ppa support works well, I pretty much only use that for local testing
<powersj> magicalChicken: so the use case for this was a local developer (e.g. rharper) creating a test and wanting to try it out without needing a whole build env.
<rharper> smoser: in general, when I'm working on a fix that adds a test-case change, it'd be nice to be able to push the current tree into the image and run that; like we have with curtin;  a close second is building out of tree, which is what magicalChicken was doing;  at least for me, that's a useful workflow for iterating on code/testcases
<powersj> ^^^ that :P
<rharper> powersj: well, waiting on ppa build sucks
<powersj> right, so help you
<powersj> so waiting for this is a small price to pay versus a ppa
<rharper> and the followup for magicalChicken was that his package/bddeb didn't work so do it in a "clean" environment
<smoser> rharper, of course it is.
<rharper> I think it would be nice if we had a pack equivalent since that's even faster but
<smoser> i wasnt saying that it wasnt.
<magicalChicken> Yeah, my python installation is most likely the cause
<rharper> smoser: ah, sorry
<smoser> we could add an "install from trunk"
<smoser> but then you end up building yourself a package manager or all the stuff that is being put into the package already.
<magicalChicken> The bddeb route can run completely (bddeb + 1 test case) in 3 minutes for me, which isn't too bad
<magicalChicken> s/3/4
<magicalChicken> Once we're preserving images as well instead of downloading each time it'll be closer to 2, so I think that's fast enough
<rharper> magicalChicken: and we re-use the base-image-download + inject bddeb
<rharper> so, we shouldn't pay that cost more than one per base-image
<rharper> right ?
<magicalChicken> rharper: yeah, we'd just keep all the base images downloaded
<magicalChicken> then make a copy (which uses zfs copy on write) for the snapshot+instance used to build the deb in and the snapshot+instances for tests
<magicalChicken> so only 1 download, and possibly none if we already had an up to date image from running tests before
<rharper> y
<powersj> magicalChicken: https://paste.ubuntu.com/23783735/ a timing example for you
<magicalChicken> powersj: that's pretty slow, but 4 minutes of that were downloading images
<powersj> let me hop off vpn and try again brb
<magicalChicken> I just ran with 'tox -e citest_run -- -n xenial -t modules/final_message' and got 'real    5m5.283s' for time
<magicalChicken> not much better, but a bit
<powersj> well that made no difference
<magicalChicken> probably limit is local isp, not the vpn then
<magicalChicken> I'm working on config for keeping images right now, going to cherry pick new img_conf format out of devel branch back to version in master, add it on there, then rebase bddeb on that
<magicalChicken> that'll be the biggest speed increase possible
<powersj> magicalChicken: ok I'm about to comment on the merge with the tests I have run so far
<magicalChicken> cool
#cloud-init 2017-01-12
<adi1992> how to use cloud-init on ubuntu 14.04 to configure a vm ubuntu.qcow2 image with the kvm hypervisor? Not able to virt-install with the cdrom option with user-data and meta-data .iso files. any help?
<adi1992> where should I post queries on cloud-init , I am new to IRC , if there is a separate page or something please provide the link .
<rharper> smoser: on the ntp fix, did you want me to drop the tox changes altogether from the branch? or just rebase into collapsed comments (tox change, test-case update,  cloud-init module changes, unittests) ?
<rharper> s/comments/commits
<smoser> drop the tox
<smoser> changes
<smoser> they shoudl be in another pull request
<rharper> ok
<rharper> that'll get updated whenever you merge in magicalChicken updated branch;
#cloud-init 2017-01-13
<adi1992> Hi , I have some doubts on using cloud-init without a  cloud platform like openstack with a kvm hypervisor on ubuntu 14.04 machine ? Where should I post it ?
<adi1992> http://blog.oddbit.com/2015/03/10/booting-cloud-images-with-libvirt/   : In this link the " virt-install -n example -r 512 -w network=default \   --disk vol=default/fedora-21-cloud.qcow2 --import \   --disk path=config.iso,device=cdrom " command did not work for me .
<adi1992> the command that I tried virt-install -n cld_int -r 512 -w network=default --disk vol=/home/aditya/scienaptic/Fedora-Atomic-25-20170106.0.x86_64.qcow2 --import --disk path=config.iso,device=cdrom gave this error : ERROR    Error with storage parameters: Couldn't lookup volume object: Storage pool not found: no storage pool with matching name ''
<Raboo> hello
<Raboo> if i specify datasource_list: [  NoCloud, None ] and don't have any seedfrom or fs_label
<Raboo> where will cloud init try to get user- & metadata?
<Raboo> will it look at /user-data /meta-data?
<Raboo> or hmm
<Raboo> is it possible for cloud init to use user-data and meta-data from local filesystem?
<Raboo> perhaps i can just create a file with meta-data and user-data in /etc/cloud/cloud.cfg.d?
<rharper> Raboo: NoCloud will look for either a directory in the image (/var/lib/cloud/seed/nocloud-net/{user-data,meta-data}) or if you have an ISO attached with the label 'cidata' ;
<rharper> Raboo: on ubuntu, the 'cloud-image-utils' package provides a tool, 'cloud-localds' which helps create such an iso and shows how to attach to a local kvm instance
<Raboo> rharper i'm thinking baremetal, not kvm
<rharper> either works
<Raboo> ok
<Raboo> rharper i also just realized i can just put the user-data and meta-data in /etc/cloud/cloud.cfg.d i think.
<rharper> some config can be set there;  I suggest populating /var/lib/cloud/seed/nocloud-net/
<smoser> you can put user-data and meta-data in /etc/cloud/cloud.cfg.d/ also
<smoser> the NoCloud's datasource config allows you to just put it there.
<rharper> cool
<Raboo> rharper i built an image with openstack's disk-image-create:
<Raboo> DIB_CLOUD_INIT_DATASOURCES=NoCloud disk-image-create ubuntu baremetal dhcp-all-interfaces -o my-image
<Raboo> and the qcow2 image didn't have a /var/lib/cloud folder
<rharper> you have to populate that;  you can use mount-image-callback from cloud-image-utils to mount up and modify;  or as smoser suggested, you can put the config in /etc/cloud/cloud.cfg.d/  if that dir is already there
<rharper> mkdir -p /var/lib/cloud/seed/nocloud-net etc.
<smoser> or cloud-localds also
<smoser> withoutput tar
<Raboo> ok
<Raboo> why is there 3 different hostname in meta-data?
<smoser> ?
<Raboo> hostname, local-hostname, public-hostname
<Raboo> and which one do i need?
<Raboo> in my previous experience i didn't need to build a meta-data file. but that was with kvm and opennebula, most likely opennebula already took care of that.
<smoser> you dont need any.
<smoser> i am confused.
<smoser> if you're building a bare metal image for openstack
<Raboo> no, for on-premise
<smoser> then when you deploy it with openstack, openstack should deal with providing meta-data
<Raboo> i don't have openstack.
<Raboo> i'm using foreman and trying to build a image deploy function for it.
<Raboo> is meta-data needed?
<smoser> the easiest thing to do would probably be just to disable cloud-init . by removing it, or (if new enough cloud-init) touch /etc/cloud/cloud-init.disabled
<Raboo> but i want cloud-init to configure the machine first boot.
<Raboo> set hostname, install chef, run chef..
<Raboo> the user-data contains hostname and fqdn. So i'm wondering if i need meta-data also.
<Raboo> https://github.com/number5/cloud-init/blob/master/doc/examples/cloud-config-datasources.txt#L28
<rharper> Raboo: AFAIK, you only need an instance-id: XXXX in meta-data
<Raboo> rharper ok
<Raboo> taken our discussions, this is what i ended up with: http://pastebin.com/PmRcY7hK
<Raboo> rharper, smoser thanks for the input.
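For reference, a minimal sketch of what populating the NoCloud seed amounts to, assuming (per the discussion above) that instance-id is the only strictly required meta-data key. The hostname values and paths are examples only; in the deploy flow described below they would be written under the mounted target (e.g. /mnt/var/lib/...), and this needs to run as root:

    import os

    SEED_DIR = '/var/lib/cloud/seed/nocloud-net'

    META_DATA = 'instance-id: iid-node001\nlocal-hostname: node001\n'
    USER_DATA = '''#cloud-config
    hostname: node001
    fqdn: node001.example.com
    '''

    os.makedirs(SEED_DIR, exist_ok=True)  # python 3
    with open(os.path.join(SEED_DIR, 'meta-data'), 'w') as f:
        f.write(META_DATA)
    with open(os.path.join(SEED_DIR, 'user-data'), 'w') as f:
        f.write(USER_DATA)

    # Changing instance-id on a later deploy makes cloud-init re-run its
    # per-instance configuration, as noted further down in the discussion.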
<smoser> Raboo, but how will you give it user-data ?
<rharper> looks like writing out to cloud.cfg.d/
<smoser> are you planning on writing user-data to the image and then deploying it?
<Raboo> my plan is this:
<Raboo> 1. unknown node boots via pxe and loads a discovery image.
<Raboo> 2. node pops up in foreman, and i press provision.
<Raboo> 3. node reboots with a small image the runs a bash script that curls |dd of=/dev/sda
<Raboo> 3b. the small image mounts /dev/sda to /mnt, writes /mnt/etc/cloud/cloud.cfg.d/[that file from pastebin, but rendered with hostname..]
<Raboo> 3c. reboot
<Raboo> 4. node boots from hdd and runs cloud init to configure the machine and run chef.
<smoser> Raboo, generally speaking that should work.
<smoser> Raboo, does foreman provide any way to provide user-data ?
<smoser> without the ability for the user to provide some customization information themselves, you are kind of very limited.
<Raboo> smoser foreman has so-called Provisioning templates, intended for preseed, kickstart etc etc..
<Raboo> basically you can have a dynamic cloud-init file as a provisioning template.
<smoser> so how would that tie in here.
<smoser> hmm. ok.
<rharper> smoser: magicalChicken:  were either of your looking to bring in a curtin.net module update to cloudinit.net ?
<Raboo> so my step 3 is basically a small linux image that writes the ubuntu os image to disk and then downloads the cloud-init provisioning template from foreman and puts it in /etc/cloud/cloud.cfg.d/99-magicstuff.
<smoser> Raboo, how does the small  linux image get booted ?
<smoser> just curious. is it initramfs ?
<Raboo> smoser yea, via pxe, will use this image https://theforeman.org/plugins/foreman_discovery/8.0/index.html#2.3ForemanDiscoveryImage
<Raboo> will dual purpose it.
<Raboo> discovery and deploy
<Raboo> will cloud-init know that it is initialized (or will it try to run /etc/cloud/cloud.cfg.d/99-magicstuff every boot)?
<smoser> Raboo, well, you have an instance-id in that.
<smoser> if the instance-id changes, it will do the per-instance things again
<magicalChicken> rharper: yes
<magicalChicken> I was planning on working to combine the two modules into one code base
<magicalChicken> I am still writing a doc for that, I'll send it out this weekend
<rharper> ok
#cloud-init 2017-01-14
<chinychinchin> is it possible to bake cloud init scripts into an image
<chinychinchin> rather than specify the script via user-data
#cloud-init 2018-01-08
<redguy> so for posterity - in Debian jessie (systemd 215) when you restart journald as a part of cloud init it makes the cloud-init (and possibly other systemd units) fail when they want to write something into stdout/stderr. Debian stretch (systemd 232) is not affected.
<ajorg> meeting today?
<blackboxsw> ajorg: morning. just in time
<blackboxsw> was just determining who would kick it off
<blackboxsw> <--- winner
<ajorg> morning
<blackboxsw> #startmeeting Cloud-init bi-weekly status meeting
<meetingology> Meeting started Mon Jan  8 16:03:53 2018 UTC.  The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.
<meetingology> Available commands: action commands idea info link nick
<blackboxsw> Happy  2018 cloud-initers!   Thanks ajorg for helping kick us off.
<blackboxsw> Welcome back from break hope the holidays were good for folks.
<blackboxsw> It
<blackboxsw> It's been a while since we've held the meeting due to holidays and vacation time. So, not a ton of content to report for the last bit. Digging up those details now
<blackboxsw> Testing of 17.2 on EC2, Azure, and GCE and release to Ubuntu Bionic
<blackboxsw> Complete 17.1.46 SRU to Ubuntu Xenial, Zesty, and Artful
<blackboxsw> Fix documentation around 'init' mode for modules subcommand (LP: #1736600)
<blackboxsw> Tooling to merge community authored branches into master
<ubot5> Launchpad bug 1736600 in cloud-init "CLI: cloud-init modules -h documents unsupported --mode init" [Low,Fix committed] https://launchpad.net/bugs/1736600
<blackboxsw> So the canonical side of the team worked a bit on getting the latest SRU updates 17.1.46 into Xenial, Zesty and artful. The testing and verification of that release took a bit of time, but we are getting better(faster)
<blackboxsw> I think this last SRU only took us 2 weeks instead of 4 weeks. so that frees up more time on upstream reviews and increasing cloud-init's velocity
<ajorg> great
<blackboxsw> we also added team tools for streamlining community authored branches. so that we stop slowing folks down :/
<blackboxsw> then the only problem is the reviewer :)
<blackboxsw> #topic Recent changes
<blackboxsw> ^ should have been that topic.
<blackboxsw> Also 17.2 release was 'cut' prior to Christmas break, this opened master up for more changes to land. so we've pulled in good fixes for VMware, NoCloud, and SLES
<blackboxsw> digging up the changests now.
<blackboxsw> Also, keep in touch with our active development and the "done" lane on trello. It's our bookkeeper for anything we are working on, and Done represents anything landed
<blackboxsw> #link https://trello.com/b/hFtWKUn3/daily-cloud-init-curtin
<blackboxsw> so high-level content that landed between 17.1.46 and 17.2:
<blackboxsw> * CLI added the clean and status subcommands
<blackboxsw> * Support for identifying OVF datasource provided by VMware
<blackboxsw> * NoCloudKVM tests now run in continuous integration
<blackboxsw> * Formalize DataSource get_data and related properties
<blackboxsw> * Remove prettytable dependency and introduce simpletable
<blackboxsw> * VMWare pre and post-customization script support
<blackboxsw> Thanks ajorg I think you were the author of note on simpletable stuff, it's nice to drop dependencies where we can to increase speed of cloud-init
<ajorg> it was done selfishly
<ajorg> we dislike taking on new dependencies :-)
<blackboxsw> and thanks to robjo(suse)  maitree(vmware) too and dojordan and Ryan McCabe(redhat) for recent branches too
<blackboxsw> :)
<blackboxsw> Post our 17.2 release we've started work on improved integration..... I think we just got powersj's ec2 integration tests landed right johs?
<blackboxsw> josh even
<powersj> \o/ yep!
<ajorg> nice
<blackboxsw> sweet, so an extra security blanket for us when we have significant changesets landed in master to ensure ec2 is happy.
<blackboxsw> powersj: what are our plans for continuous integration frequency
<blackboxsw> with ec2 specifically
<ajorg> Can those integration tests be run by others with EC2 accounts?
<blackboxsw> ajorg: yes they can
<powersj> I am working on the jenkins jobs this week and hope to have a weekly run as well as a manual run for backport testing
<blackboxsw> I'll get the cmdline
<ajorg> thanks!
<blackboxsw>  tox -e citests -m tests.cloud_tests run --os-name=artful --platform=ec2 --preserve-data --data-dir=../results --verbose
<blackboxsw> or something like that
<ajorg> got it
<ajorg> thanks!
<blackboxsw> powersj: documented it too I think
<blackboxsw> getting link
<powersj> https://cloudinit.readthedocs.io/en/latest/topics/tests.html#ec2
<blackboxsw> #link https://cloudinit.readthedocs.io/en/latest/topics/tests.html#ec2
<blackboxsw> :)
<blackboxsw> excellent work  Josh
<powersj> thanks for all the reviews :)
<blackboxsw> anything else I'm missing about landed work?    rharper powersj smoser1 ?
<blackboxsw> otherwise next topic
<rharper> blackboxsw: nothing from me
<blackboxsw> #topic In-progress Development
<blackboxsw> So we've got a fairly healthy review queue that we need to get through as we get the year started....
<blackboxsw> we also have a few things we are in flight currently:
<blackboxsw> - continuous integration improvements per powersj
<blackboxsw> - dropping dependence on ifup ifdown utils where possible as that's not supported (or installed in some cases) in systemd world
<smoser1> blackboxsw: wow. sorry, missing.
<blackboxsw> who is that smoser1 guy anyway
<smoser1> yeah, i didnt see anything missing sorry.
<smoser> wonder how that happened.
<blackboxsw> welcome ;)
<blackboxsw> - netplan improvements per rharper and jinja template support for all cloud-config modules
<blackboxsw> - and softlayer support per smoser
<blackboxsw>  know the Azure guys are also posting a couple branches on getting a pre-provisioning setup going for thier datasource which looks pretty exciting
<blackboxsw> I can't think of anything else off the top of my head.
<robjo> chrony support
<ajorg> we're only talking feature work in this topic?
<blackboxsw> any in progress development to highlight is fair game. bug work. refactoring, feature etc
<blackboxsw> +10 robjo and again thanks for working with us getting all those branches up and (hopefully soon) landed
<ajorg> what does "jinja template support for all cloud-config modules" mean?
<ajorg> I'd guess most modules don't need templating?
<blackboxsw> ajorg: two things. 1. since we have now landed /run/cloud-init/instance-data.json to store metadata/userdata, #cloud-config can now be specified with a ## template: jinja header and could leverage anything jinja has to offer plus sourcing any of the instance-data.json metadata fields
<ajorg> Ah, right. Is that not being done above the module level?
<blackboxsw> so if people have repetitive or template-driven content in the runcmd or write_files portion of their #cloud-config they'd be able to leverage jinja templates etc
<smoser> ajorg: yes, above the module level.
<blackboxsw> ajorg: not anywhere in cloud-config currently
<blackboxsw> one sec I misunderstood the question
<blackboxsw> smoser: can you clarify what you mean?
<ajorg> I mean, shouldn't #cloud-config template expansion happen before the module sees the config?
<smoser> blackboxsw: we could/should also allow other part types to be rendered
<smoser> ttps://trello.com/c/xyqxyOxg
<smoser> er... bad url. in 2 ways
<ajorg> Then the part handler would be the one to do that expansion.
<smoser> https://trello.com/c/AYaCdQyT
<blackboxsw> ahh ok, right that makes sense. I think the cut I made was limited in focus to cloud-config modules and custom scripts supporting the ## template:jinja header.. but nothing would preclude handling other parts
<blackboxsw> so the link to my WIP branch was
<blackboxsw> #link https://trello.com/c/xyqxyOxg
<blackboxsw> and the general feature per smoser
<blackboxsw> #link https://trello.com/c/AYaCdQyT/21-cloud-init-query-standardized-json-information
<ajorg> Is there a design doc of some kind of this?
<blackboxsw> not yet.. but we probably should have a spec as it'd be a good template for the docs we'll need to write
<blackboxsw> scott captured most of the use cases we'd be going for in that last trello link above
<ajorg> Small example of where some clarity is needed: if Jinja is interpreting {foo} in a user-script, what will it do when it sees a shell variable ${foo}
<ajorg> ?
<smoser> you declare that the content is a jinja template
<smoser> if you provide it something that is not renderable as a jinja template
<smoser> then it will fail
<smoser> it requires input to explicitly say "this is jinja". it does not just attempt to render anything.
<smoser> (unless explicitly told to)
<blackboxsw> some brief working examples are in the description of the branch @ https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/334030
<ajorg> Sure. But as a content author, I need to know if Jinja is going to try to render ${foo} or not.
<smoser> then as a content author you can read jinja docs :)
<blackboxsw> jinja would try to render {{ foo }}
<ajorg> :-
<smoser> ajorg: we'll document a simple case, and we can even document "for shell, you'll have to be aware that ...."
<smoser> but we're not going to document all of jinja
<ajorg> I see.
<ajorg> My understanding was that Jinja was highly customizable in what it interpreted and how, so that it's important to document how you've configured it to work.
<blackboxsw> and since the burden is on the #cloud-config or script writer to provide the header ## template: jinja\n#cloud-config\n they *should* understand what they are doing
<blackboxsw> we won't implicitly run the #cloud-config through jinja
<ajorg> I get that, no problem, what I'm saying is that Jinja is an engine that you configure to do something, not a markup that always does the same thing for everyone.
<ajorg> Am I making any sense?
<blackboxsw> understood (though I thought it was fairly constrained in its application and functionality). We'll make sure that the mechanism by which jinja operates is well documented and confined as best we can... for our own sanity we don't want that template engine to be too flexible... too many tough support use cases
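A rough sketch of the opt-in mechanism being described, assuming a '## template: jinja' first line and an instance-data.json style variable dict. This is only an illustration of the idea, not cloud-init's actual renderer, and the header string and path are taken from the discussion above:

    import json

    from jinja2 import Template

    JINJA_HEADER = '## template: jinja'


    def maybe_render(user_data,
                     instance_data_path='/run/cloud-init/instance-data.json'):
        """Render user-data with jinja only when it explicitly opts in."""
        lines = user_data.splitlines()
        if not lines or lines[0].strip() != JINJA_HEADER:
            return user_data  # not declared as a template: leave untouched
        with open(instance_data_path) as f:
            variables = json.load(f)
        # everything after the header is the real payload (e.g. #cloud-config)
        return Template('\n'.join(lines[1:])).render(variables)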
<blackboxsw> ok anything else for "In progress development"  otherwise we can move to Office hours for 30 mins
<blackboxsw> #topic Office Hours (next 30 minutes)
<blackboxsw> robjo: you've got quite a few branches of goodness up for us to review. Any prioritization on those branches or just take them as we can?
<rharper> I don't think  there are issues w.r.t jinja and shell; they use different variable escape methods, jinja uses {{ variable/expression }}; and it doesn't consume $  AFAIK, ajorg do you know differently ?
<blackboxsw> #link https://code.launchpad.net/~rjschwei/cloud-init/+git/cloud-init/+merge/334992
<blackboxsw> I'm guessing is top of the list
<ajorg> I saw {instance_id} at https://trello.com/c/AYaCdQyT/21-cloud-init-query-standardized-json-information so I assumed it was being customized to look for { instead of {{
<robjo> blackboxsw: The chrony support should probably be the last as it will take longer over all and more back and forth
<ajorg> (for one thing)
<ajorg> rharper: also, there's the whole question of the "extends" feature
<ajorg> We integrated Jinja into an internal tool a few years back and we spent a very long time making sure the loaders did the right thing.
<blackboxsw> ajorg: I thought I read somewhere that you couldn't extend jinja for custom functions. maybe I was mistaken
<robjo> I am also not certain that the "re-write everything" on the first go around for chrony is really what we want to do initially
<ajorg> blackboxsw: I don't think I'm referring to custom functions
<robjo> That's probably where we want to end up, but I am not certain that a "step function" approach is in order
<rharper> ajorg: hrm, I've always seen {{ variable }} or {% expression %};  so maybe blackboxsw can just update the templates;
<rharper> the examples in the cards
<ajorg> rharper: sure, that would have helped in this case.
<robjo> If we do go down the route of the step function I'll need more guidance then in rharper's comments
<ajorg> blackboxsw: I was referring to the ability of one template to extend another.
<ajorg> blackboxsw: and the question of where does the engine look when it's asked to extend another template. It can be tricky.
<blackboxsw> yeah I honestly hadn't gotten past step one of handling the template markup within an existing single template. so this may need a bit of thought/work
<ajorg> Personally, I'd be a lot happier with limiting things to Python format() templates, even though it means you can't have loops, but I won't get in the way as long as we're cognizant of the problems we can run into by accepting the full power of an advanced engine like Jinja.
<smoser> i'm not opposed to allowing ## template: python-format
<ajorg> heh
<smoser> honestly.
<smoser> you can pick a different name if you dont like that one.
<smoser> but we already use jinja, so it makes sense to support jinja
 * smoser has to run. sorry.
<rharper> I do feel that supplying the template means the user is opting in;  and specifically if we've got a good way to provide dry-run based on a instance.json and a script; that certainly can help folks work out the kinks in the template of their choosing
<ajorg> I'm really not opposed so much as wary of the extensive power of the thing
<rharper> ajorg: that's a fair warning; given you've experience here; help drawing the line is most welcome
<ajorg> I'm trying to think of a way to read in /etc/shadow using Jinja, you know?
<rharper> well, cloud-init is root anyhow; so, what's the deal with that ?
<blackboxsw> ajorg: heh, right though you can read that with your runcmd section in #cloud-config :)
<ajorg> If I can come up with a way to do it that doesn't make it look obvious that I'm doing it, and then post that as something others can copy, or use with #include <url> then I win.
<rharper> I don't think jinja makes that any more troublesome
<rharper> folks already wget | bash with shell they don't understand either
<ajorg> I suspect Jijna makes it more opaque.
<ajorg> The answer to "what file does Jinja read when I use {% extends foo %}" is a very lengthy "it depends"
<ajorg> anyway, I've said my piece
 * ajorg is a bit of a template naysayer.
<blackboxsw> +1, there's one in every group. We'll try to keep that in mind as this feature evolves
<blackboxsw> :)
<ajorg> nice
<ajorg> :-)
<blackboxsw> any pet bugs, new features or burning reviews that need mention?
<blackboxsw> ajorg: we could do something simple like disable the extends option via policies
<blackboxsw> it looks like
<blackboxsw> #link http://jinja.pocoo.org/docs/2.10/api/#policies
<blackboxsw> or maybe I'm misunderstanding the issue   I'll read up more on it
<ajorg> thanks
<ajorg> It looked like https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/334074 was blocking https://code.launchpad.net/~yeazelm/cloud-init/+git/cloud-init/+merge/331897 but shouldn't be anymore.
<ajorg> I'll remind Matt to try it again now.
<blackboxsw> thanks good dela
<blackboxsw> dela
<blackboxsw> deal
<blackboxsw> geez
<blackboxsw> on that note. I think it's time for coffee
<blackboxsw> and time to end the meeting
<blackboxsw> Happy New Year again folks. Good to be back in the office.
<blackboxsw> thanks again for the chat, until next time..
<blackboxsw> #endmeeting
<meetingology> Meeting ended Mon Jan  8 17:15:30 2018 UTC.
<meetingology> Minutes:        http://ubottu.com/meetingology/logs/cloud-init/2018/cloud-init.2018-01-08-16.03.moin.txt
<ajorg> bye now
<blackboxsw> see you ajorg and robjo  have a good one
<ajorg> oh, actually before I leave, but not necessarily before you leave, what conferences are folks attending this year?
<blackboxsw> powersj: mind promoting me to opp again?
<blackboxsw> or rharper
<blackboxsw> thx rharper
<rharper> np
<blackboxsw> I hadn't chosen yet.  I was thinking about LISA or debconf.
<blackboxsw> cloud foundry looks interesting too.
<blackboxsw> actually, no it doesn't, to me.
<ajorg> I went to LISA when it was in Seattle. I'd probably go again if it was in Seattle again.
<powersj> blackboxsw: debconf is in taiwan this year
<dpb1> chad and I went in San Diego, it was a good conference, IMO
<powersj> blackboxsw: smoser: some fun for you https://paste.ubuntu.com/26348213/
<blackboxsw> powersj: guess that narrows down that choice
<blackboxsw> anyone heard much about http://devopssummit.sys-con.com/general/papers2018east.htm ?
<blackboxsw> cloud/devops expo
<blackboxsw> ?
<blackboxsw> I think marco ceppi went
<blackboxsw> not sure if it's the 'right' venue
<smoser> powersj: i really should upload simplestreams to pypi
<smoser> and do a release of that.
<powersj> smoser: yes you should
<smoser> powersj: responded to all of them.
<powersj> smoser: thanks!
<powersj> after lunch I'll go through 'em
<smoser> rharper: or blackboxsw https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335108 review woudl be appreciated
<blackboxsw> will do. just posted archive of meeting notes https://cloud-init.github.io/status-2018-01-08.html#status-2018-01-08
* blackboxsw changed the topic of #cloud-init to: Reviews: http://bit.ly/ci-reviews | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting: Monday 1/22 16:00 UTC | cloud-init 17.2 released (Dec 14, 2017)
<dojordan> @blackboxsw @smoser can you guys take a look at the new changes I made? https://code.launchpad.net/~dojordan/cloud-init/+git/cloud-init/+merge/334341
<blackboxsw> smoser: quick discussion point on your maas oauth branch
<blackboxsw> looking dojordan
<smoser> blackboxsw: k
<smoser> blackboxsw: ?
<blackboxsw> sorry, I missed something?
<blackboxsw> was working on review comments ATM
<smoser> "smoser: quick discussion point on your maas oauth branch"
<smoser> i expected follow up on that
<blackboxsw> ohh posted on your branch mso
<blackboxsw> smoser:
<blackboxsw> just wondered about pulling the required oauth token logic down into OauthUrlHelper
<blackboxsw> so that other callers (as they grow) could benefit from same behavior... warning and proceed w/out oauth
<smoser> blackboxsw: yeah, i kind of saw that duplicated also... i think that sounds reasonable.
#cloud-init 2018-01-09
<dojordan> @smoser, gentle bump on my PR. I believe I've addressed all of your comments
<smoser> dojordan: ok.
<smoser> powersj: t2.micro cost 0.0116/60-minutes
<smoser> i suspect that our average api-start -> terminate is probably < 10 minutes .
<powersj> agreed
<smoser> so each test would end up costing us in *instance-time* ~ 0.001
<smoser> still have other charges (possibly network and ebs volume time, but ... cheap)
<smoser> blackboxsw: i left a question for you at https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335108
<blackboxsw> smoser: checking now
<blackboxsw> smoser: so by dropping silentish downgrading from oauth to non-oauth.... you mean just fail and fail loudly?
<blackboxsw> that's a fair and explicit behavior
<sushant> hi guys, i work for Microsoft azure-networking and wanted to discuss the possibility of adding a networking module specific to azure.
<sushant> This is to support networking scenarios for VMs in Azure. To begin with, it will listen for media disconnect/connect and issue re-DHCP.
<sushant> This will help us in moving virtual machines from one azure virtual network to another.
<sushant> Over time, we plan to add support for more advance networking scenarios in azure.
<sushant> Please let me know if this channel is the best place to discuss, or I can also start an email thread (let me know who should I include in the email).
<blackboxsw> sushant: saw your comments on dojordan's branch and figured we would probably start a discussion at some point. While the initial discussion can start here. I think it best to email to cloud-init@lists.launchpad.net   so that other viewers can participate if needed.
<blackboxsw> I think initially something to be aware of is the systemd changes for ubuntu releases > Xenial which might make driving dhcp rediscovery a bit tougher
<blackboxsw> s/tougher/different
<blackboxsw> sushant: also there is a mechanism in  cloudinit now to quickly (and temporarily) interact with dhcp if needed without writing any system lease files or producing artifacts via /sbin/dhcp-script. So you might be able to interact with dhcp if you need to discover new service endpoints etc.
<blackboxsw> cloudinit/net/dhcp.py:maybe_perform_dhcp_discovery() is something we added to assist in dhcp interactions
<blackboxsw> but even that helper will have to be adapted in a systemd-only work which doesn't contain a packaged 'dhclient' utility
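A rough usage sketch of the helper just mentioned; the exact signature and the keys in the returned lease dictionaries are assumptions here and should be checked against cloudinit/net/dhcp.py before relying on them:

    # Assumed shape of cloud-init's ephemeral dhcp helper; verify the actual
    # signature and lease keys in cloudinit/net/dhcp.py.
    from cloudinit.net.dhcp import maybe_perform_dhcp_discovery

    leases = maybe_perform_dhcp_discovery()  # no system lease files written
    if leases:
        lease = leases[-1]
        print('got address %s on %s' % (lease.get('fixed-address'),
                                        lease.get('interface')))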
<dojordan> @blackboxsw, I looked at that yesterday, but I wanted to confirm on xenial the current solution using bounce will work
<blackboxsw> s/systemd-only work/systemd-only world/
<dojordan> with systemd-networkd we may have another solution using link state disconnect and connect (think unplugging and re plugging the ethernet cable)
<dojordan> that solution doesn't work on xenial as the networking stack doesn't retrigger DHCP for some reason. we are currently testing on 17.10 but wanted to check in the PR for xenial in the mean time
<blackboxsw> dojordan: yes I believe for Xenial-only the bounce will continue to work. it's just newer series where this bounce will fall over.
<blackboxsw> on xenial I believe you are correct that the bounce is required as systemd isn't driving re-dhcp on hostname changes... I *think*
<dojordan> that is the behavior we saw
 * blackboxsw has to relook at the changes I just landed related to bounce ifup/down being absent . to re-remember what's going on there
<blackboxsw>     In artful and bionic ifupdown package is no longer installed in default
<blackboxsw>     cloud images. As such, Azure can't use those tools to bounce the network
<blackboxsw>     informing DDNS about hostname changes. This doesn't affect DDNS updates
<blackboxsw>     though because systemd-networkd is now watching hostname deltas and with
<blackboxsw>     default behavior to SendHostname=True over dhcp for all hostname updates
<blackboxsw>     which publishes DDNS for us.
<blackboxsw> sorry commit related to this was b05b9972d20ec3ea699d1691b67314d04e852d2f
<blackboxsw> so, yeah calls to perform_hostname_bounce on xenial will continue to use the ifdown ifup logic to talk to dhcp again
<blackboxsw> as ifupdown deb package will continue to be delivered in xenial images
<dojordan> yup, that's the plan
<dojordan> thanks for confirming
<blackboxsw> ok; I feel okay about that as we won't break backward compat in xenial and remove ifupdown pkg
<blackboxsw> and even if you continued to call perform_hostname_bounce on artful, bionic, C-series etc. it'll just no-op and log a warning message
<blackboxsw> not even warning... a debug message: Skipping network bounce: ifupdown utils aren't present
<blackboxsw> so if systemd-networkd doesn't do what you want sushant or dojordan's network module will have to do the lifting you mention to make that happen
<dojordan> exactly. but for xenial we are safe
<blackboxsw> I believe that is true. /me re-reads any of smoser's concerns there to see if I missed something
<smoser> dojordan: i was just reading
<smoser> blackboxsw: but will bounce do *anything* on bionic ?
<blackboxsw> smoser, not it no-ops and  adds a debug message "Skipping network bounce: ifupdown utils aren't present."
<blackboxsw> s/not/nope/
<blackboxsw> cloudinit/sources/DataSourceAzure.py:616-ish
<blackboxsw> in that case, we blindly rely on systemd-networkd to do its job and automatically update dhcp on every hostname change
<blackboxsw> which admittedly is a configurable default behavior which *could* be turned off on custom images
<dojordan> from what I read the only way to retrigger dhcp if the hostname flag isn't set is to restart systemd-networkd, which is less than idea
<dojordan> ideal*
<blackboxsw> unrelated minor comment  on https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335108 for you Scott. I'll give this a test on our MAAS to watch it save time :)
<blackboxsw> +1 dojordan yeah, not sure about the other fallouts of restarting systemd-networkd..... it feels like it begs an api/interface/knob from systemd to make that possible/simple if needed, but I don't have a lot of faith in that feature showing up.
<smoser> hmm
<smoser> blackboxsw: http://paste.ubuntu.com/26354996/ ?
<blackboxsw> +1 smoser
<sushant> @blackboxsw Thanks a lot, I will start an email thread with cloud-init@lists.launchpad.net
<blackboxsw> can't remember smoser do we know yet whether bionic cloud-images might drop 'dhclient'?
<blackboxsw> it's currently still in bionic daily images per my lxc spin ups
<smoser> they can't easily at the moment
<smoser> the thing protecting them is initramfs-tools and ubuntu-minimal
<smoser> http://paste.ubuntu.com/26355073/
<dojordan> doesn't systemd-networkd have it's own dhcp client?
<smoser> dojordan: yes it does.
<dojordan> so is dhclient sticking around as a no-op?
<smoser> well, at the moment yes.
<smoser> its possible that foundations team would change initramfs-tools dependency on it.
<smoser> and i suspect that'd allow them to drop it from 'minimal'
<smoser> cloud-init does use it at the moment on ec2.
<smoser> i kind of suspect that it will live in bionic unless someone goes pushing on it.
<smoser> blackboxsw: somewhat related to above
<smoser> bug 1739516
<ubot5> bug 1739516 in cloud-init "networking comes up before hostname is set" [Medium,Confirmed] https://launchpad.net/bugs/1739516
<smoser> mwhudson says that nothing he found re-dhcp's on hostname change.
<blackboxsw> hrm..... though it's documented here. https://www.freedesktop.org/software/systemd/man/systemd.network.html#SendHostname=
<blackboxsw> and I see logic in systemd-229:src/network/networkd-dhcp4.c which calls sd_dhcp_client_set_hostname if link->network->dhcp_send_hostname
<blackboxsw> ahh but that looks to be just  on dhcp4_configure
<blackboxsw> hmm wonder where/when that's triggered
<blackboxsw> as you mentioned earlier, might have just been lease expiration etc.
<smoser> dojordan: i just hit 'submit' on a review.
<smoser> dojordan: do you know the lease time that azure responds with ?
<smoser> blackboxsw: did you verify that my maas thing works in a real maas ?
<smoser> deploy, upgrade, reboot ?
<smoser> and if so, can you let me into one ?
 * smoser looks for creds he knows he has
<blackboxsw> smoser: I still see tracebacks on artful. but, unrelated to your maas branch. looking deeper
<blackboxsw> found a bug in cloud-init status just now too.
<dojordan> @smoser, our leases are 2^32 - 1 seconds...
<smoser> dojordan: wow!
<blackboxsw>     "('apt-configure', ValueError('Old and New apt format defined with unequal values True vs False @ apt_preserve_sources_list',))"
<blackboxsw> the full traceback
<blackboxsw> https://pastebin.ubuntu.com/26355603/
<blackboxsw> this was maas 2.3 with master + your change . I think master is doing the same thing.
<blackboxsw> our maas provides an empty ntp: cloud-config
<blackboxsw> n/n empty pools: [] and servers: [<maas-ip>]
<blackboxsw> in vendor-data
<blackboxsw> and I don't see much else as far as apt config
<dojordan> @smoser, We are exploring an alternate solution to bounce the nic from hyper-v, but in the meantime we would like to get this checked in. So an alternate solution for bionic would be to simply change the hostname. This way, systemd-networkd will keep re-triggering DHCP. Once we get the final ovf_env.xml from IMDS, we will actually apply the real, customer-provided hostname. If you guys are okay with this approach I will code 
<blackboxsw> smoser: oops missed you creds request
<blackboxsw> adding
<blackboxsw> ssh ubuntu@10.5.1.18
<blackboxsw> seeing the restored from cache messages as expected due to  check_instance_id() returning True
<blackboxsw> ok that apt error I got was a duplicate of https://bugs.launchpad.net/maas/+bug/1735950
<ubot5> Launchpad bug 1735950 in MAAS 2.3 "ValueError: Old and New apt format defined with unequal values True vs False @ apt_preserve_sources_list" [Critical,Triaged]
<blackboxsw> ok added comments/pastes to that bug, it's targeted to 2.3 and 2.4, so we'll see
#cloud-init 2018-01-10
<smoser> blackboxsw: thank you. i'm not sure why that appears sometimes and is not crippling.
<smoser> blackboxsw: https://bugs.launchpad.net/cloud-init/+bug/1742494
<ubot5> Launchpad bug 1742494 in cloud-init "network reporters generate WARNING in local stage" [Medium,Confirmed]
<blackboxsw> ahh thx smoser
<blackboxsw> smoser: maas code claims it's adding apt configuration (even though they set preserve_sources=False) for earlier cloud-init versions...hmm
<blackboxsw>     # Add APT configuration for new cloud-init (>= 0.7.7-17)
 * blackboxsw checks how  0.7.7-17 behaved regarding apt.... maybe this is an issue on trusty for some reason?
<smoser> the newer apt format is more expressive
<smoser> and allows them to say some things that they could not say previously
<smoser> but curtin understands the new format
<smoser> and writes it into the target for everything (i'm pretty sure)
<blackboxsw> landed https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335108
<powersj> blackboxsw: if you are looking at merges, want to respond to smoser on https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/335774
<smoser> powersj: yeah, that was on his list.
<powersj> ok :)
<blackboxsw> umm, https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/335774   so, I'm forgetting something here josh. I thought I had seen a changeset at some point that did still provide a mechanism by which we could install system packages via 'make ci-ubuntu-deps'
<blackboxsw> but looks like maybe we rebased the cii-requirements branch or something
<powersj> read my latest comment
<blackboxsw> powersj: I've conveniently forgotten our conversation. and will read the latest comment
<blackboxsw> hrm.. yeah I guess I thought at the end of our conversation you were working up a solution to handle the outliers which couldn't be installed via pypi or system packages. looks like I misunderstood/misremembered
<blackboxsw> powersj: you talked at standup about maybe pulling a commit rev or git hash on certain dependencies.
<powersj> yeah I should update pylxd==2.2.4 to be something like git://github.com/lxc/pylxd.git@0722955260a6557e6d2ffde1896bfe0707bbca27
<blackboxsw> were you thinking about that as 'futures'? we might be able to extend read-dependencies script to handle some of that lookup for us.  /me digs back into context here to look for shortsighted/quick-wins o
<blackboxsw> because I'm all about the short-sighted solutions
<powersj> and yes I was thinking about doing something special for integration tests, but it just didn't make sense because 1) it is only ubuntu and 2) I would need some mix of system packages + pypi packages
<powersj> and I don't think I want to go installing pypi packages on anyone's system
<blackboxsw> powersj: what if we left that packages/pkg-deps.json rename and the bzr+lp system package translation in tools/read-dependencies... then at least our "make ci-ubuntu-deps" could still additionally call "read-dependencies --requirements-file integration-requirements.txt"
<blackboxsw> ... with the exception of pylxd right?
<blackboxsw> or you feel that's muddying the waters too much in read-dependencies script?
<powersj> yeah trying to remember why we have all these exceptions seems like asking for a nightmare
<powersj> don't get me wrong though, having a install-deps for integration tests would be nice, and I would like it a lot, especially since I want to get two new jenkins systems up, I don't want to have to worry about deps for everything
<smoser> i was just fine to push 'install c-i requirements' to a later date.
 * smoser has to run
<blackboxsw> yeah that's fine there too. will land powersj as is. we can bikeshed on making make ci-deps-ubuntu work for integration tests in a subsequent branch
<blackboxsw> powersj: landed, https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/335774. Thanks for the rehash. We'll go in circles about this again later :)
<powersj> blackboxsw: thanks :) and yes we will
<blackboxsw> powersj: did you say pylxd was snapped?
<powersj> blackboxsw: no, lxd is
<blackboxsw> ahh ok. gotcha
<powersj> and to use the lxd snap we needed newer version of pylxd
<blackboxsw> which we can only easily get from here right? https://github.com/lxc/pylxd
<powersj> blackboxsw: correct
<blackboxsw> powersj: tox env change for pylxd? http://paste.ubuntu.com/26363025/
<blackboxsw> I can put up a branch unless you already have that
<blackboxsw> https://www.irccloud.com/pastebin/uIisHdgJ/
<blackboxsw> just tested with: tox -r -e citest -- run --os-name=artful --platform=ec2 --preserve-data --data-dir=../results --verbose -t modules/runcmd
<blackboxsw> sorry post that irccloud got in the way
<blackboxsw>  44.9 78.1  47:21.26 chrome
<blackboxsw> geez man.
<powersj> lol
<blackboxsw> 78 % MEM for chrome
<blackboxsw> my machine is almost locking up
<blackboxsw> #toomanytabs
<powersj> I found chrome recently was taking all my swap too
<powersj> I was out of memory, locking up
<powersj> anyway... trying your patch, but I think that works
<powersj> yeah that looks good, can you change the comment to https://paste.ubuntu.com/26363038/
<powersj> blackboxsw: that way I know why and can update as necessary
<blackboxsw> funny I had realized that after I posted the pastebin.... I had # lxd backend master tip from 01/10/2018  ..... I like yours better
<blackboxsw> pushing a silly branch for this, just so we all get karma (I mean so it can be tracked)
<blackboxsw> https://code.launchpad.net/~chad.smith/cloud-init/+git/cloud-init/+merge/335963 powersj
<powersj> blackboxsw: thx!
<powersj> +1ed
<blackboxsw> thx
<blackboxsw> will land it
<blackboxsw> fastest branch ever
<blackboxsw> if only I'd review your branches that quickly
 * blackboxsw waits for ci
<blackboxsw> just because
<powersj> :)
#cloud-init 2018-01-11
<kholkina> hi all! I have a question regarding user-data update. I have updated the user-data via openstack nova and now I see the new value when I check it via http://169.254.169.254/2009-04-04/user-data. Also I added [scripts_users, always] in the config file. But it still doesn't work on reboot. When I delete obj.pkl manually and reboot it works fine. But I think that's not the best way
<niluje> smoser: <3
<niluje> just noticed you merged the scaleway datasource for xenial on December
<niluje> thanks for the xmas gift :)
<smoser> niluje: \o/
<niluje> "caribou" joined us on monday :p
<dojordan> @smoser, can you take a look at my latest iteration? I proposed and implemented a solution that works post xenial: https://code.launchpad.net/~dojordan/cloud-init/+git/cloud-init/+merge/334341
<ajorg> I found what I think is an interesting (bug?) behavior...
<ajorg> If you are using NoCloud (via an ISO attached to a VM) and you remove the ISO, the next time you boot it will decide you're using None instead of NoCloud and it will re-initialize.
<ajorg> I'm not sure how to improve this behavior, but I wonder if using the DMI UUID or serial for both None and NoCloud would fix it.
<ajorg> I suspect similar behavior would be shown if you denied access to the instance metadata on an EC2 instance early enough that cloud-init couldn't read it.
<smoser> ajorg: that is correct.
<smoser> yeah, we could use the dmi uuid (as some other clouds do).  and keep that.
<smoser> there is also 'manual_cache_clean' that you can set
<smoser> and then you can remove the disk.
<ajorg> If we use the same ID for both NoCloud and None will it detect that cloud-init has already run even though it has changed clouds?
<ajorg> A note that says "please don't remove the ISO or this will happen" in the docs would also be a good start.
<smoser> blackboxsw:
<smoser> https://git.launchpad.net/~dojordan/cloud-init/commit/?id=7f23e5c4808a9c647cd4d5277625a723a58b132b
<smoser> "on ubuntu release > xenial we rely on systemd-networkd"
<smoser> do you know if that is correct ?
<blackboxsw> reading
<blackboxsw> smoser: for artful and bionic I know that's true... not sure about zesty. I don't recall, but I'll spin one up now
<blackboxsw> I left nearly the same log message on dropping ifupdown Azure stuff in my commit
<blackboxsw> "    In artful and bionic ifupdown package is no longer installed in default
<blackboxsw>     cloud images. As such, Azure can't use those tools to bounce the network
<blackboxsw>     informing DDNS about hostname changes."
<smoser> blackboxsw: i just commented there
<smoser> mwhudson says that you and i are blowing smoke
<blackboxsw> " the data thre is new and will get documented more as we go.
<blackboxsw> That function will make its way back to 16.04 in a cloud-init SRU in the next month or so." .... smoser I'll make a card for that.
<blackboxsw> I should have shoved something into RTD when we added instance-data.json, but we can get it in soon
<blackboxsw> I kinda figured we needed to do it with the jinja-template handling too
<dojordan> @blackboxsw unrelated, but do you know where the SendHostname=true flag lives?
<blackboxsw> smoser: yeah you pointed me at his comment there that systemd isn't handling publishing dhcp info on hostname change..... I thought I ran on azure artful and bionic and the update did happen, but I don't know what triggered it. hrm.
<dojordan> I can't find that flag anywhere actually being set
<dojordan> Im looking at an artful azure VM now, and I checked /lib/systemd/network, /run/systemd/network, /etc/systemd/network
<blackboxsw> dojordan: cloud-init doesn't set it, I only read the docs on it as default behavior if unset. I can see in systemd during dhcp client config refresh they check for said flag... but I can't confirm whether something sets that up.
 * blackboxsw needs to look at this again it seems. my recall is dusty on what I originally tested/saw on artful and bionic. I'll spin up azure vms for testing now
<dojordan> Gotcha, ill enable the flag and look for a dhcp request
<dojordan> what's the policy in cloud-init about restarting another service? Manual testing seems to confirm what michael said in bug 1739516 that changing the hostname doesn't appear to actually send a DHCP request
<ubot5> bug 1739516 in cloud-init "networking comes up before hostname is set" [Medium,Confirmed] https://launchpad.net/bugs/1739516
<smoser> dojordan: well if we have to bounce it, we can
<blackboxsw> dojordan: so systemd-networkd docs say the SendHostname=true needs to be in the [DHCP] section of the configs. I see that DHCP section in /run/systemd/network/10-netplan-eth0.network.... but I'm guessing we could provide a /etc/systemd/network/11-something.network file containing [DHCP]\nSendHostname=true if needed? (shot in the dark as I'm working off docs at the moment)
<smoser> as we have before
<smoser> it'd be nicer if you could bounce *that* interface
<smoser> rather than restarting the service
<smoser> i'm somewhat concerned about restarting the service during boot, but we will see.
<blackboxsw> that's just me RTFMing though. poking at an azure instance in a min here to confirm
<blackboxsw> smoser: rharper do we know if netplan configs allow you to post arbitrary dhcp config options, or just dhcp4: true dhcp6: true etc
 * blackboxsw doesn't see anything that looks like it at http://people.canonical.com/~mtrudel/netplan/
<blackboxsw> not sure if that's the 'canonical' source for netplan docs though
<rharper> blackboxsw: netplan does not expose arbitrary dhcp options; I think accept-ra is the exception at this time
<smoser> what was the flag we were interested in ?
<rharper> SendHostname; but I do wonder if we can figure out what's going on before we start exposing these sorts of things in the top-level yaml;
<rharper> is this some sort of dynamic dns update based on hostname from a dhclient request ? or is it doing something else with this specific mechanism
<dojordan> I think the flag is simply whether or not to send the hostname when sending a dhcp request, but not actually sending dhcp every time the hostname is changed
<cyphermox> smoser, rharper: SendHostname is default in systemd.
<rharper> it's just sent when the client attempts to obtain (or renew) a lease IIUC
<cyphermox> (so is UseHostname=, which is meant to use the hostname that DHCP hands out)
<dojordan> gotcha, so we won't be able to rely on that to force a new dhcp
<cyphermox> what is this about?
<dojordan> basically we (azure) need a way to force a dhcp request to acquire a new IP in certain circumstances
<dojordan> we used to use ifdown/ifup but post xenial we no longer have those binaries installed by default
<cyphermox> dojordan: wouldn't the "right way" be to have the DHCP server send a DHCPNAK?
<cyphermox> assuming that works "out of line"
<dojordan> or the other right way would be to use leases less than 2^32 - 1 seconds... but unfortunately in our stack the dhcp server is not aware of when to send the DHCPNAK
<cyphermox> right
<cyphermox> short leases obviate the issue of "we need to force the client to re-configure now", at the cost of more packets happening on the network
<cyphermox> whereas DHCPNAK or doing stuff on the client means there needs to be code that decides it's time to reconfigure outside of the lease expiry time
<cyphermox> is the client or infrastructure more likely to know what set of circumstances the client should do DHCP again?
<dojordan> client
<cyphermox> (my guess is the server should always be authoritative, but I don't know of your use case)
<cyphermox> ah, interesting
<dojordan> i guess we could just run dhclient -r?
<dojordan> i can elaborate on the scenario:
<cyphermox> dojordan: dhclient isn't what's doing DHCP when using systemd-networkd.
<dojordan> oh right... and systemd-network doesn't expose the same api?
<cyphermox> I don't think it exposes that, but I'm not sure :)
<dojordan> thinking outside the box, can we delete the dhcp lease files? does that retrigger it?
<smoser> dojordan: this might become easier (and more easily use dhclient sandboxed) if we get to running in local time.
<dojordan> sorry didn't quite catch that
<nazarewk> hlelo
<smoser> oh. hm.
<nazarewk> *hello
<smoser> wait.
<nazarewk> how does this project relate to CoreOS cloud-init and RancherOS cloud-init?
<smoser> now i'm confused.
<cyphermox> dojordan: I don't think it would trigger it, but you might be able to just restart systemd-networkd without adverse consequences
<smoser> nazarewk: no. coreos is not here.
 * smoser googles rancheros
<nazarewk> smoser: i'm researching cloud operating systems
<dojordan> that was my thought too, just wanted to make sure it would be safe
<smoser> dojordan: we run azure datasource at local
<smoser> meaning proper system configured networking isn't up yet.
<nazarewk> after going through coreos (deprecated cloud-init), RancherOS (forked from CoreOS cloud-init) i stumbled upon configuring Project Atomic with cloud-init
<smoser> now i'm confused.
<nazarewk> and saw that is some wide standard
<cyphermox> smoser: again?
<cyphermox> ;)
<nazarewk> any idea how those 3 (or 2 i guess) relate to each other?
<smoser> well, project atomic to my knowledge uses this cloud-init
<smoser> (as contributors from redhat have added that support)
<nazarewk> i already know that, i am more interested in how cloud-init came to be
<nazarewk> i can't find any info on where it came from
<nazarewk> ok looks like i found this
<nazarewk> https://github.com/coreos/coreos-cloudinit#configuration-with-cloud-config
<cyphermox> dojordan: so, I guess what you would need to do depends on what the circumstances are in which a client needs to restart DHCP, if it's because new information was received from the datasource, maybe the best is for cloud-init to restart systemd-networkd when it finds out, if systemd-networkd is even running
<cyphermox> OTOH, if it's potentially hours after boot, then some agent would need to do it, if it's safe.
<dojordan> so the reason this matters is if we hit our instance metadata service with a stale IP, it doesn't recognize it and the client will throw an exception
<dojordan> so we catch it and bounce the nic (on xenial)
<cyphermox> so this isn't actually doing any real DHCP?
<cyphermox> otherwise you'd have the lease times to tell you when to renew, so you theoretically never have a stale IP
<dojordan> our leases are infinite
<dojordan> but it is doing real DHCP in the sense we are getting a new ip,dns,subnet, etc
<dojordan> the reason it could be stale is our platform is moving a vm from one vnet to another
<cyphermox> yes, but I mean even if the IP belongs to you forever, a short lease will have you ping the DHCP server periodically and possibly get new information
<cyphermox> ah
<dojordan> @smoser, not entirely sure when systemd-networkd is running, but I have confirmed in testing that in my current PR we do successfully hit the instance metadata server in our data source. So I guess some parts of the networking stack are setup?
<smoser> dojordan: right.
<smoser> thats what is confusing to me
<smoser> i think you are actually ending up "bouncing" the network that wasn't up yet
<smoser> but not sure. i'm looking at that.
<rharper> what stage is cloud-init at when it's time to bounce? has cloud-init net already run? is that what we're trying to determine ?
<dojordan> yeah, this is during _get_data within the AzureDataSource, so if it is in local then network has yet to run
<rharper> then it won't yet be up; yeah
<smoser> right
<rharper> and then the resume *shouldn't* need to kick dhcp since it never leased anything anyhow; except how are we polling the metadata service to know when it's time to come up again?
<smoser> right
<smoser> i think it came up because we bounced it
<smoser> so it ran 'ifdown; ifup' and it came up. since the ifdown didn't do anything
<smoser> that's my hypothesis
<smoser> on xenial
<dojordan> i think you're right
<dojordan> so theoretically it wouldn't work > xenial as there is no ifup
<rharper> w.r.t. the systemd path; if we're in the same boat, I think an ip link set down on the interface that needs to bounce may be enough for systemd-networkd to allow a restart to kick DHCP again
<rharper> that's something testable
<dojordan> yes i am actually testing that later this week or early next
<dojordan> waiting for the networking team...
<dojordan> but they claimed changing the link state of the interface (removing and re adding the nic from our hypervisor) didn't trigger dhcp in the vm.
<smoser> dojordan: since we're running at local time frame, we don't have to "bounce"
<smoser> and we can use something more like the ec2 datasource
<dojordan> sounds good
<rharper> dojordan: that make sense as it never DHCP'd in the first place;
<smoser> which does a dhclient, gets its network info, then we can use the 'EphemeralIPV4Network' context manager
<smoser> to hit the datasource, and timeout and do it all again
<dojordan> beauty
<dojordan> i like it
<rharper> but what remains is, while it's sleeping, how can it know when it's time to wake-up  if we don't have a network interface up to poll a URL ?
<dojordan> we bring up a temporary one
<smoser> rharper: right now, if you're looking at his branch it runs
<smoser> self.bounce_network_with_azure_hostname
<smoser> in _reprovision
<smoser> and that does
<dojordan> (just to confirm) the ec2 ephemeralipv4network hits the dhcp server?
<smoser>  sh -c 'ifdown eth0; ifup eth0'
<smoser> which ends up bringing it up the first time through
<smoser> and relying on "stale" networking configuration to do so
<smoser> dojordan: look at the ec2 code, you should be able to see.
<rharper> but, the ifup eth0 isn't sufficient in local; there's no eth0 network config to indicate it needs to DHCP
<rharper> at least in > xenial
<rharper> or at least I'm not sure we write out the network config before it starts "sleep waiting"
<smoser> this hunk in _get_data
<smoser> http://paste.ubuntu.com/26368525/
<smoser> we'd just do
<rharper> oh, EC2 does, but not in Azure at this time
<smoser> with net.EphemeralIpv4Network(**net_params):
<smoser> right
<smoser> so i'm saying something like:
<smoser> while True:
<smoser>     dhcp_leases = dhcp.maybe_perform_dhcp_discovery(self.fallback_interface)
<smoser>     if not dhcp_leases:
<smoser>         something_bad
<smoser>       with net.Ephem....
<smoser>         hit MD service
<smoser>         on happy path break
<smoser> that is terrible i realize, but i think it maybe explains it.
<rharper> right; the net.Eph dhcp stuff; how did we work around requiring dhclient? or do we keep that present for Artful + Bionic?
<smoser> we require dhclient
<blackboxsw> nazarewk: I believe cloud-init originated here, with these folks, originally a Canonical internal product that garnered broad adoption and was picked up and supported by other OSes and clouds. :) https://www.podcastinit.com/cloud-init-with-scott-moser-episode-126/
<blackboxsw> :)
<blackboxsw> smoser: per the suggestion about the poll_imds looping.... right we should be able to call maybe_perform_dhcp_discovery which attempts dhclient queries using the fallback nic, and returns an empty list if dhclient doesn't exist or can't get an ip address
<blackboxsw> your pastebin explains what we do currently in ec2 which could apply here in azure too https://paste.ubuntu.com/26368525/
<smoser> i hope dojordan follows. i think blackboxsw and rharper do, but i have to run.
<smoser> i'll look in tomorrow on it a bit dojordan
<blackboxsw> dojordan: (just to confirm) the ec2 ephemeralipv4network hits the dhcp server?.... Ephemeral ipv4 network uses the response for maybe_perform_dhcp_discovery
<blackboxsw> you provide it with the params (interface, ip, prefix_or_mask, broadcast and router), all of which maybe_perform_dhcp_discovery returns
<blackboxsw> the specific use (which could be nearly the same for Azure) is here https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceEc2.py?id=78372f16d2711812793196aa8003ad51693ca472#n105
<blackboxsw> though within the with net.EphemeralIPv4Network context you could do your polling of IMDS
<blackboxsw> as that EphemeralIPv4Network context manager serves only to temporarily bring up a network interface to allow you to hit an external URL. Then it tears that interface back down
<blackboxsw> so it basically performs whatever static network setup is required (including routes) on a given interface  for your context and then any setup it needed to perform it tears down upon __exit__
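
A rough Python sketch of the flow described above: dhclient discovery on the fallback NIC, a temporary EphemeralIPv4Network context, polling IMDS inside it, and retrying on failure. Illustrative only; the lease key names, helper signatures and retry logic are assumptions pieced together from this conversation, not a copy of the EC2 or Azure datasource code:

    # Illustrative sketch only -- not the actual datasource code.
    # maybe_perform_dhcp_discovery() is assumed to return a list of
    # dhclient-style lease dicts, and EphemeralIPv4Network is assumed to take
    # the parameters listed above; treat both as assumptions.
    from cloudinit import net, url_helper
    from cloudinit.net import dhcp

    def poll_imds_with_ephemeral_net(fallback_nic, imds_url, max_tries=10):
        for _attempt in range(max_tries):
            leases = dhcp.maybe_perform_dhcp_discovery(fallback_nic)
            if not leases:
                continue  # no dhclient, or no lease obtained; try again
            lease = leases[-1]
            params = {
                'interface': lease['interface'],
                'ip': lease['fixed-address'],
                'prefix_or_mask': lease['subnet-mask'],
                'broadcast': lease['broadcast-address'],
                'router': lease.get('routers'),
            }
            try:
                # Bring the interface up just long enough to hit the metadata
                # service; EphemeralIPv4Network tears it down again on exit.
                with net.EphemeralIPv4Network(**params):
                    return url_helper.readurl(imds_url, timeout=5).contents
            except Exception:
                continue  # unhappy path: tear down and retry
        return None
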
<dojordan> yeah I will play around with it this afternoon, I might make a modification to add retry support though
<blackboxsw> I'm +1 on retries, I like those more than while Trues :)
<dojordan> haha yeah a little scary typing that
#cloud-init 2018-01-12
<blackboxsw> confirmed, Azure: zesty hostname updates on the command line don't get auto-posted to dhcp, as such nslookup of your hostname just after setting it is unresolvable. Artful and bionic it works within < 1/2 second
<blackboxsw> ok was just a drive by. I'm off
<dojordan> @blackboxsw, did you have waagent running? it also has some functionality to trigger dhcp on hostname change: https://github.com/Azure/WALinuxAgent/blob/master/azurelinuxagent/ga/env.py#L39
<blackboxsw> good point dojordan forgot to check settings on the waagent
<blackboxsw> will spin up zesty & artful again to check both
<blackboxsw> smoser: manual_cache_clean branch: do we really care about logging dicfg == None? or could we just return with "no di_report found in config"
<blackboxsw> minor comment left on https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335956
<blackboxsw> question really
<blackboxsw> I'll actually test that in a minute here. but I don't think we can get to that "di_report is None" logic
<smoser> you are correct
<smoser> blackboxsw: no. you can get there.
<smoser> oh. wait. no.
<smoser> blackboxsw: http://paste.ubuntu.com/26373092/ ?
<blackboxsw> dojordan: zesty azure, waagent is running on zesty, but I see systemd-networkd is dead
<dojordan> interesting, during cloud init or after boot?
<dojordan> is it using networkmanager?
<dojordan> do ifdown/ifup exist?
<smoser> no.
<blackboxsw> smoser: sure. per your pastebin if we want to differentiate from no di_report and empty di_report I'm +1
<smoser> blackboxsw: http://paste.ubuntu.com/26373186/
<smoser> no reason to set a default since we've already checked if it was there.
<smoser> dojordan: ubuntu cloud images don't use networkmanager anywhere
<smoser> dojordan: you might be right on walinuxagent triggering the hostname update.
<smoser> i suspect you are
<blackboxsw> smoser: +1 on di_report 2nd paste because we also check for None in your branch and non-dict
<smoser> i'd forgotten about that.
<blackboxsw> dojordan: how do I tell walinuxagent monitor frequency?
<dojordan> let me check
<dojordan> @blackboxsw, fwiw if you want to disable there is a config flag you can set to disable it
<dojordan> in /etc/waagent.conf, set Provisioning.MonitorHostName=y
 * blackboxsw didn't see anything in /var/log/waagent messaging 'Detected hostname change' on zesty or artful. but the hostname update on artful did update dns
<dojordan> and it is sleeping 5s
<blackboxsw> Provisioning.MonitorHostName=n on artful
<blackboxsw> so it's gotta be systemd-networkd
<blackboxsw> Provisioning.MonitorHostName=n on zesty too
<dojordan> can you run sudo tcpdump -i eth0 port 67 or port 68 -e -n?
<dojordan> in a different screen
<dojordan> while you change hostname
<blackboxsw> definitely
<blackboxsw> sudo tcpdump -i eth0 port 67 or port 68 -e -n
<blackboxsw> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
<blackboxsw> listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
<blackboxsw> nothing came across the wire across multiple sudo hostname newname calls
<blackboxsw> yet nslookup myjunk3
<blackboxsw> Server:		127.0.0.53
<blackboxsw> Address:	127.0.0.53#53
<blackboxsw> Non-authoritative answer:
<blackboxsw> Name:	myjunk3
<blackboxsw> Address: 10.0.0.5
<dojordan> gotcha, so it's not actually re-dhcp-ing
<blackboxsw> doesn't look like it.
<blackboxsw> w/out dhcp in azure, how could dns have been updated?
<blackboxsw> I thought that's the mechanism by which that happened
<blackboxsw> w/out re-dhcp
<dojordan> that would be the iDNS
<dojordan> so if you have another VM in the same vnet, can it hit the new hostname?
<dojordan> (iDNS being the instance dns server). I'm curious about the DNS thing though...
<smoser> dojordan: were you looking at using the dhclient and ephemeral ivp4 ?
<dojordan> yeah, I wrote up some quick and dirty code and am testing it now
<smoser> powersj: https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335956
<smoser> do you know why that failed?
<smoser> it doesn't seem like my change
<smoser>  https://git.launchpad.net/~smoser/cloud-init/commit/?id=6f59e49b4ece8eceb43b01a3d7c063ee7899a051
<smoser> could have gone from green to red
<smoser> and i just ran c-i here with it.
<powersj> smoser: you may need to rebase to pick up the pylxd change
<powersj> yeah I don't see integration-requirements.txt, which will be required to grab the pylxd of the necessary version to support lxd as snap
<smoser> k
<smoser> hm.
<smoser> it worked here though
<smoser> odd
<smoser> blackboxsw: if you want to OK https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335956
<smoser> officially, that'd be good
<smoser> then we can pull
<dojordan> stupid question, but I am having problems mocking cloudinit.util.is_freeBSD()...https://pastebin.com/qSpEtASK
<dojordan> for some reason when i log the return value in the code to be tested, it is coming back as a magicmock object and not False
<rharper> dojordan: I think the patch decorator order is backwards, the first line above the function is the first parameter
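
A minimal, self-contained illustration of that decorator-order gotcha: stacked @mock.patch decorators are applied bottom-up, so the decorator closest to the test function maps to the first mock argument. The patched targets below are just examples picked from the functions mentioned here, assuming cloud-init is importable:

    # Example of @mock.patch parameter ordering; patched targets are examples.
    import unittest
    from unittest import mock

    from cloudinit import util

    class TestPatchOrder(unittest.TestCase):

        @mock.patch('cloudinit.util.is_container')  # top decorator -> last parameter
        @mock.patch('cloudinit.util.is_FreeBSD')    # closest to function -> first parameter
        def test_order(self, m_is_freebsd, m_is_container):
            m_is_freebsd.return_value = False
            m_is_container.return_value = False
            # If the parameter names were swapped, is_FreeBSD() would still
            # return a bare MagicMock -- the symptom described above.
            self.assertFalse(util.is_FreeBSD())
            self.assertFalse(util.is_container())
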
<blackboxsw> merged https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/335956
<blackboxsw> +1 on mocking param order suggestion
<dojordan> awesome, that worked. thanks!
#cloud-init 2020-01-06
<meena> And so it begins again! (I hope)
<tribaal> hi all! I would need some pointers on how to get a more recent cloud-init version on CentOS (eg. CentOS 7.6). Is using a COPR repository the expected way to get the latest binaries or is there some kind of process to refresh the cloud-init package in CentOS base similar to Ubuntu's SRU?
<meena> tribaal: which version are you getting from COPR?
<meena> or, "would you be getting" â¦ if you haven't dared to try yet.
<tribaal> meena: the idea would be to have a centOS 7.6 with 19.3+
<tribaal> but I don't know what the best way to do this would be (we'd probably update our default templates to include the right binaries ourselves if there's no SRU-like process for CentOS)
<meena> tribaal: https://copr.fedorainfracloud.org/coprs/g/cloud-init/el-testing/ you'd get 19.4 if you https://docs.pagure.org/copr.copr/how_to_enable_repo.html#how-to-enable-repo `dnf copr enable @cloud-init/el-testing`
<tribaal> meena: ack, thanks. That's not really "production" builds though, is it?
<meena> tribaal: i don't know (anything about CentOS / Fedora etc… i haven't used them in a long time, and in $$jobs$$ we use RHEL)
<tribaal> meena: ack - same here (but we use Ubuntu :) )
<tribaal> so our Ubuntu templates have the latest and greatest but we'd need our centOS templates to have the latest/greatest as well
<tribaal> and I'm not too sure how/where to look :)
<Odd_Bloke> meena: tribaal: o/ Happy new year!
<Odd_Bloke> I can't remember exactly the arrangement of our COPR repos, we'll have to wait for Ryan/Chad to help, I'm afraid.
<meena> Odd_Bloke: happy monday
<meena> Odd_Bloke: i'll need to quiz your brain some more wrt net refactor
<meena> but, IIRC, i've written the ideas / questions down. now lemme just find the URL…
<meena> 23:47 <meena> updates: https://hackmd.io/3-YBj1t9TAeKhmfLBQUjXQ?view#Revisiting-modules
<meena> 27th of December.
<Odd_Bloke> I haven't even opened my inbox yet, so I don't know how soon I'll get to it, but it's on my list now! :p
<tribaal> Odd_Bloke: heya! happy new year to you too :)
<blackboxsw> happy 2020 cloud-initers. Time to dig out from holiday emails
<blackboxsw> tribaal: we (upstream cloud-init) don't work with distros directly, other than Ubuntu, to push upstream changes into the distribution. We rely on the distro vendors to vet cloud-init themselves. That said, I believe otubo is the right person to ask about how to coordinate updates into CentOS.   The cloud-init upstream team only provides our COPR repos as a facility for CentOS users to check if more recent
<blackboxsw> upstream versions of cloud-init help resolve bugs seen on the current (older) cloud-init in CentOS stock images.  Also note, upstream only publishes to https://copr.fedorainfracloud.org/coprs/g/cloud-init/el-testing/  when we are undergoing an SRU for ubuntu, so expect that the "el-testing" repo is fairly stable.
<blackboxsw> we SRU cloud-init into Ubuntu every month or two.
<tribaal> blackboxsw: ack, so I guess building our CentOS templates with a newer binary (for instance built in your COPR repo - or building one ourselves) might be what we end up doing
<tribaal> blackboxsw: also: happy new year :)
<blackboxsw> tribaal: definitely, for the short term and otubo may have pointers for feature-requesting an update of cloud-init into CentOS. I'm not sure where to start there.
<blackboxsw> same to you ð
<tribaal> otubo: out of curiosity - do you know/can you explain what the process of updating cloud-init in CentOS looks like? is it periodic? is it "whatever was current at the time of the CentOS release"?
<blackboxsw> emoji snap for the win
<tribaal> I mean updating in the base CentOS
<ananke> I'm trying to locate where in the official documentation various config options are documented, specifically for 'users' section, and I must be missing something obvious. where can I find the canonical info on items such as the ones described here? https://www.zetta.io/en/help/articles-tutorials/cloud-init-reference/
<smoser> it's more than a bit obnoxious that you cannot attach tarballs to pull requests
<smoser> https://github.com/canonical/cloud-init/pull/128
<rharper> ananke: https://cloudinit.readthedocs.io/en/19.2/topics/modules.html#users-and-groups  and https://cloudinit.readthedocs.io/en/19.2/topics/examples.html#including-users-and-groups
<smoser> oh. maybe ignore my noise
<ananke> rharper: ahh, thanks! didn't realize the modules section expanded
<rharper> ananke: well, the latest version doesn't do that, we have a bug to revert that back to the other releases
<ananke> ohh, that would explain
<blackboxsw> rharper: yeah I was finding the same.
<rharper> https://bugs.launchpad.net/cloud-init/+bug/1852456
<ubot5> Ubuntu bug 1852456 in cloud-init "doc: list of modules is no longer present" [Medium,Triaged]
<blackboxsw> user_groups docs get generated but not included for some reason
<blackboxsw> thanks ryh
<rharper> low hanging fruit
<blackboxsw> yeah
<ananke> yep, no wonder. I've been blindly fumbling through the search and menu
<rharper> ananke: yeah, sorry for the trouble
<rharper> blackboxsw: I don't think we should wait for powersj to sort out a better solution  as mentioned in the bug; we should re-add the Left-hand-side toc  as it was; it's *much* friendlier
<powersj> rharper, blackboxsw +1
<blackboxsw> smoser, I thought I saw github could allow tar.gz attachments https://github.com/dear-github/dear-github/issues/150#issuecomment-369744370
<blackboxsw> testing that theory
<blackboxsw> yeah test.tar.gz works, test.tar is rejected due to invalid file type
<blackboxsw> and yeah, ridiculous that they'd support tar.gz, but not tar :P
<blackboxsw> Just published to focal [ubuntu/focal-proposed] cloud-init 19.4-16-gf8950d63-0ubuntu1 (Accepted)
<blackboxsw> rharper:/Odd_Bloke I think we'll need to fix build-and-push script as we can no longer push branches direct to upstream (as they require review) https://trello.com/c/MdP53rtp/1239-github-upstream-release-script-build-and-push-needs-to-push-up-a-pr-for-review
<smoser> blackboxsw: yeah, ignore my noise
<blackboxsw> I've put up a PR for the upstream release (which was already dput and accepted by my build-and-push run) https://github.com/canonical/cloud-init/pull/151
<blackboxsw> smoser:  no problemo..... but just this time my friend
<smoser> i wont mess up again
<blackboxsw> heh
<Odd_Bloke> blackboxsw: Should we just disable branch protection on those branches?
<Odd_Bloke> I'd prefer to have exactly the git commit used for the upload as tip of those branches, and I don't think any of GH's merging strategies would do that.
<Odd_Bloke> (GH's rebase never fast-forwards even when it could IIRC.)
<blackboxsw> Odd_Bloke: that's a good idea on branch protection disabling for ubuntu/devel at least. We can talk at a planning meeting in 1.5 hrs
<Odd_Bloke> blackboxsw: Do we need to wait for a meeting?
<Odd_Bloke> rharper: powersj: Thoughts on disabling branch protection on the ubuntu/* branches, so we can push the uploaded git revision directly?
<powersj> Odd_Bloke, I'm good with that
<powersj> not hearing any objections
<blackboxsw> +1 Odd_Bloke
<rharper> Odd_Bloke: +1 on allowing direct landing
<blackboxsw> and Odd_Bloke earns a card :) https://trello.com/c/MdP53rtp/1239-github-upstream-release-script-build-and-push-needs-to-push-up-a-pr-for-review
<blackboxsw> rharper: ananke powersj https://github.com/canonical/cloud-init/pull/153  for review of the doc fix in modules https://github.com/canonical/cloud-init/pull/153
<blackboxsw> I added a TOC to the top of modules page. make doc seems to show all modules now, with links
<Odd_Bloke> blackboxsw: Done. :)
<ananke> thanks!
<ananke> and looks like trello is back in working order too
<blackboxsw> starting out the year right :). New year's resolutions and all
<blackboxsw> Odd_Bloke: thanks, just pushed ubuntu/19.4-16-gf8950d63-0ubuntu1 to origin
<blackboxsw> ubuntu/devel should represent the commitish that was uploaded
<Odd_Bloke> Nice
<ananke> am I just confused, or is the example backwards? 'Update apt database on first boot'
<ananke> from https://cloudinit.readthedocs.io/en/latest/topics/examples.html shows 'package_update: false'
<Odd_Bloke> ananke: It's not _wrong_ per se, in the sense that it's just illustrating how you would give this configuration option, but it is _definitely_ confusing, so we should address it.
<ananke> Odd_Bloke: it's certainly very confusing, considering the very next example shows an option _adjusted_ to match the description
<Odd_Bloke> Yep, agreed.
<Odd_Bloke> ananke: Interested in proposing a fix?
<ananke> certainly, I'm just not familiar with the procedure. do I have to actually do a PR?
<ananke> on a side note, is there a way to use variables in cloud-init configs? not sure if YAML permits that
<Odd_Bloke> ananke: https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html#using-instance-data details how to use Jinja for templating, is that what you're looking for?
<Odd_Bloke> ananke: You'd need to submit a PR, and sign the CLA (which, unfortunately, involves having a Launchpad account).
<ananke> indeed, it's close. I am looking for using it with runcmd, since I have to refer to a version number in multiple commands. I wasn't prepared for the entire config to be a potential jinja2 template, but I think it will work
<ananke> Odd_Bloke: I see. I'd love to, but I'm short on time to complete few other things, and getting those items lined up would put me behind. maybe later this week
<ananke> hmm, I don't think the instance data will work though, since all I'm doing is supplying a single cloud init config file
<Odd_Bloke> ananke: What do you mean by "config file"?
<Odd_Bloke> cloud-init has several things that might be. :p
<ananke> Odd_Bloke: good point. it's what I can provide with packer & multipass when an instance is launched
<Odd_Bloke> ananke: That's user-data (at least for multipass), so I think you should be good.  But your best bet would be to try and see, if you can. :)
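
If the goal is just to reuse a version number across several runcmd entries in a single user-data file, one option is to render the file locally (outside cloud-init) before handing it to packer or multipass; a minimal sketch using jinja2, with hypothetical template content and variable names:

    # Sketch: render a version number into user-data before passing the result
    # to packer/multipass. Purely a local pre-processing step; the template
    # content and variable name are hypothetical.
    from jinja2 import Template

    VERSION = '1.2.3'
    user_data_template = """\
    #cloud-config
    runcmd:
      - ["sh", "-c", "echo installing tool {{ version }}"]
      - ["sh", "-c", "echo fetching tool-{{ version }}.tar.gz"]
    """

    print(Template(user_data_template).render(version=VERSION))
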
<ananke> forgive my ignorance, still trying to wrap my head around how cloud-init works, and how we can leverage it
<Odd_Bloke> No forgiveness necessary, we're more than happy to help. :)
<ananke> is there an architectural diagram of cloud-init that would show the stages and what's being ingested at each one? or something that would differentiate between user and instance data (and if there are any other ones)
<Odd_Bloke> ananke: https://github.com/canonical/cloud-init/pull/154 :)
<Odd_Bloke> I'm not aware of such a diagram.
<ananke> thanks!
<Odd_Bloke> Once again, I agree that it would be very useful! :p
<blackboxsw> just descriptive docs that I know of https://cloudinit.readthedocs.io/en/latest/topics/boot.html about what each stage does. differentiation of user/meta/instance data https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html
<ananke> the diagram here is somewhat helpful: http://fbrnc.net/blog/2015/11/how-to-provision-an-ec2-instance
<blackboxsw> ananke: a brief overview of cloud-init from a presentation we gave in OSSEU https://events19.linuxfoundation.org/wp-content/uploads/2017/12/cloud-init-The-cross-cloud-Magic-Sauce-Scott-Moser-Chad-Smith-Canonical.pdf
<ananke> blackboxsw: thank you, this looks great. exactly the high level stuff I need to get me oriented. I've been suffering from 'can't see the forest for the trees'
<blackboxsw> ananke: also I forgot it's linked here too with other context https://cloudinit.readthedocs.io/en/latest/topics/faq.html#where-can-i-learn-more
<ananke> cheers!
<blackboxsw> surely
<meena> packer can use cloud-init?
<Odd_Bloke> meena: It depends on which backend you're using, I think, but yes.
<ananke> meena: yes. you can pass the cloud-init config via its user_data or user_data_file options
<ananke> it's certainly not something easy to figure out if you're not familiar with that nomenclature. none of its documentation mentions cloud-init outright
<Odd_Bloke> rharper: On the Scaleway PR, is your intent that Louis step back and find a way to log better messages?  (It looks, to me, like a more general statement about what we should do as a project, but might read as a more specific request.)
<rharper> Odd_Bloke: the latter; for now;
<rharper> I won't -1 the current change to the datasource; but we have other variants, DatasourceSmartOS does a if self._network_config is UNSET: self._network_config = None;
<rharper> which might match better in Scaleway since they __init__() to self._network_config = None
<rharper> it works out the same, but I was sort of ranting that we've not given datasources common methods for this non-datasource-specific scenario
<rharper> I was quite surprised that it was set to UNSET until I walked through the logs and code
<rharper> so I kind of want to make it clear in the future via the logs; and potentially have better methods around manipulation/checking of ds.network_config state
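
For clarity, a bare-bones sketch of the UNSET-sentinel pattern being described (the class and sentinel here are stand-ins, not the real SmartOS or Scaleway datasource code):

    # Hypothetical datasource illustrating the UNSET-sentinel guard above.
    UNSET = object()  # local stand-in for cloud-init's sentinel

    class ExampleDataSource(object):

        def __init__(self):
            # "not computed yet" is distinct from "computed, and it is None"
            self._network_config = UNSET

        @property
        def network_config(self):
            if self._network_config is UNSET:
                # Nothing datasource-specific was found; fall back to None so
                # the caller uses the default/fallback network configuration.
                self._network_config = None
            return self._network_config
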
<Odd_Bloke> Ack
<meena> ananke: aaah. i was looking for it under the provisioning methods, and could not find it
<ananke> meena: bingo. and it runs before any provisioners kick in
<meena> cool cool
<ananke> though I wish it had better integration with cloud-init, for example an easy way to tell packer to wait until cloud-init is done
#cloud-init 2020-01-07
<powersj> ananke, https://github.com/hashicorp/packer/issues/2639
<powersj> workaround in the last comment
<ananke> powersj: indeed, that's exactly what I found earlier today and used in my implementation
<powersj> there's also this one https://www.packer.io/docs/other/debugging.html#issues-installing-ubuntu-packages
<ananke> not as clean as I'd like, but it does give one ability to set a timeout if things go south
<ananke> my eventual goal is to use gitlab CI to kick off an EC2 instance, deploy packer on it with cloud-init, then use packer and cloud-init to build an AMI based on a vanilla one + our add-ons
<ananke> so I'm trying to debug why a very basic set of runcmd directives do not appear to be working. I've gone as far as cleaning out everything from a basic user data and using the example section from https://cloudinit.readthedocs.io/en/latest/topics/examples.html
<ananke> cloud-init analyze show indicates that it should have been executed: |`->config-runcmd ran successfully @17.71300s +00.00100s
<ananke> yet the log file shows this: cloud-init.log:2020-01-07 00:27:04,770 - cc_runcmd.py[DEBUG]: Skipping module named runcmd, no 'runcmd' key in configuration
<powersj> ananke, common thing to do is to validate your YAML to ensure the syntax is good
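
One low-tech way to do that before handing user-data to multipass or packer is a quick local parse; a minimal sketch (the file name and the runcmd check are only examples):

    # Minimal pre-flight check of a cloud-config file; path and the specific
    # key check are examples only.
    import sys

    import yaml  # PyYAML

    path = sys.argv[1] if len(sys.argv) > 1 else 'user-data.yaml'
    with open(path) as stream:
        try:
            cfg = yaml.safe_load(stream)
        except yaml.YAMLError as err:
            sys.exit('not valid YAML: %s' % err)
    if not isinstance(cfg, dict) or 'runcmd' not in cfg:
        # Matches the "no 'runcmd' key in configuration" symptom in this log.
        sys.exit("parsed, but no 'runcmd' key found")
    print('runcmd entries:', cfg['runcmd'])

The in-tree 'cloud-init devel schema' subcommand, which ananke uses further down, does a stricter, schema-aware check.
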
<ananke> sorry, had to deal with a small emergency
<ananke> powersj: YAML seems to be correct. multipass accepts it, and the running instance has said YAML in /var/lib/cloud/instances/test/user-data.txt
<ananke> http://dpaste.com/28XMSTQ shows the input YAML and the resulting YAML
<ananke> I also tried running it explicitly such as:
<ananke> $ sudo cloud-init single --name runcmd --frequency always
<ananke> the CLI output appears without any issues: Cloud-init v. 19.3-41-gc4735dd3-0ubuntu1~18.04.1 running 'single' at Tue, 07 Jan 2020 01:04:43 +0000. Up 2283.14 seconds.
<ananke> but the log still has: 2020-01-07 01:04:43,096 - cc_runcmd.py[DEBUG]: Skipping module named runcmd, no 'runcmd' key in configuration
<ananke> this is output of tail -f in /var/log after said execution: http://dpaste.com/2H1NDZN
<ananke> powersj: well, color me surprised. after testing the yaml with cloud-init devel schema I learned that the YAML was actually broken. seems double quotes were removed from one of the runcmd items, which meant the http:// URL was breaking YAML
<ananke> now the question is, what removed said quotes. cloud-init or multipass. seems they were removed on one line, but not on another, wtf
<ananke> so it appears to be due to yaml-cpp used by multipass, https://github.com/canonical/multipass/issues/1097
<Odd_Bloke> Huh, weird.
<Odd_Bloke> Let me go and chat to the multipass folks and see if we can (in the long-term) improve that.
<meena> Odd_Bloke: hello, good morning
<Odd_Bloke> o/
<ananke> k, I think I've given up. there seems to be no way to quote things with multipass for runcmd to function correctly. semi-related, I am surprised that cloud-init doesn't have a 'fetch file from url' module
<Odd_Bloke> You can do includes, let me find the docs.
<Odd_Bloke> ananke: https://cloudinit.readthedocs.io/en/latest/topics/format.html#include-file
<ananke> Odd_Bloke: that appears to be only for fetching its own configs, not some random data
<Odd_Bloke> Oh, in that case I don't think I understand what you're looking for, could you expand?
<ananke> yes, I'm trying to fetch a binary from URL, since it's not available in repos as a package. my hope was to use wget/curl, but that's mangled with the existing multipass problems
<blackboxsw> ananke: some more complex runcmd examples (could use sh -xc "something" ) or single quote around the outside of the curl etc https://cloudinit.readthedocs.io/en/latest/topics/examples.html#run-commands-on-first-boot and https://www.digitalocean.com/community/questions/help-with-cloud-init-syntax-for-runcmd
<ananke> blackboxsw: doesn't work, because yaml-cpp used by multipass mangles that. see https://github.com/canonical/multipass/issues/1097
<ananke> it's a bit ironic, considering cloud-init's documentation recommends multipass as means for validating YAML :) things were going too well for me with multipass & cloud-init, figures I'd have to run into something eventually
<roberto-sanchez> Hello guys, got a question, I'll try to make it short: can I format and mount an N number of EBS raw volumes (may be 1, may be 20, externally defined) with cloud-init? As in: it would need to check for unformatted disks (lsblk, fdisk, or parted), and if found, format them with mkfs.ext4 and mount them to /data[N] (/data01, /data02 and so on). If
<roberto-sanchez> someone reads this and can help me, thank you.
<roberto-sanchez> I know I can do this in the final stage, my question was thinking of the fs_setup and disk_setup modules.
<rharper> roberto-sanchez: yes, those are the right modules
<roberto-sanchez> but how can I get it to iterate on the unformatted disks? That's what I don't get.
<roberto-sanchez> and to dynamically act on the N amount of volumes, so it makes N amount of dirs (data01, data02).
<roberto-sanchez> and then to mount them.
<rharper> oh I understand, an instance may launch with a variable number of volumes but you want a consistent process
<rharper> those modules for now, require direct mapping in the config, it supports some abstraction (ephemeral0, 1, 2) and it will look up the name, but you really want a pattern matching ...
<roberto-sanchez> right, sorry for my poor explanation, you got it clearly
<Odd_Bloke> ananke: Yeah, it's frustrating!
<roberto-sanchez> so this is better resolved by standard bash scripting then, @rharper?
<blackboxsw> ananke: hrm nice issue will peek a bit today to see if I can reproduce the issue today
<ananke> blackboxsw: you can see some more attempts here: https://paste.ofcode.org/6vQAzp2LncyVRuh98Gf6qq
<rharper> roberto-sanchez: likely;  we don't currently have support for a syntax that would pattern match N devices by some attribute; which is the first part.  if you know the number ahead of time, it's possible to generate a cloud-config using disk_setup + fs_setup and mount config to do what you're wanting to do; for now, it's likely more portable to write a program to do it and you can execute it early via bootcmd: or later in boot via runcmd cloud-config
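
A rough sketch of the "generate the cloud-config ahead of time" approach rharper mentions, for the case where the device list is known before launch. Device names, labels and mount options are examples, and the exact disk_setup/fs_setup/mounts key layout should be double-checked against the module docs:

    # Sketch: emit a #cloud-config that formats and mounts a known list of
    # volumes. Device names and options are examples; verify key names against
    # the disk_setup/fs_setup/mounts module documentation before relying on it.
    import yaml

    devices = ['/dev/xvdb', '/dev/xvdc']  # known ahead of launch in this sketch

    cfg = {'disk_setup': {}, 'fs_setup': [], 'mounts': []}
    for idx, dev in enumerate(devices, start=1):
        label = 'data%02d' % idx
        cfg['disk_setup'][dev] = {
            'table_type': 'gpt', 'layout': True, 'overwrite': False}
        cfg['fs_setup'].append({
            'label': label, 'filesystem': 'ext4',
            'device': dev, 'partition': 'auto'})
        cfg['mounts'].append([dev, '/' + label, 'ext4', 'defaults,nofail'])

    print('#cloud-config')
    print(yaml.safe_dump(cfg, default_flow_style=False))
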
<roberto-sanchez> thank you very much for your help and time rharper, logging off; have a nice day everyone!
<meena> Odd_Bloke: have you had any time to look at cloudinit/net?
<Odd_Bloke> meena: Not yet, I'm afraid!
<usrdev> hey all! curious, is it intended behavior that at reboot cloud-init would reject/refuse to see particular partition sizes and resize the boot partition?
<blackboxsw> looks like it's that time again. +15 :)
<blackboxsw> #startmeeting Cloud-init bi-weekly status
<meetingology> Meeting started Tue Jan  7 17:30:28 2020 UTC.  The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.
<meetingology> Available commands: action commands idea info link nick
<blackboxsw> #chair Odd_Bloke
<meetingology> Current chairs: Odd_Bloke blackboxsw
<blackboxsw> #chair rharper
<meetingology> Current chairs: Odd_Bloke blackboxsw rharper
<blackboxsw> Welcome to the first cloud-init community status meeting of 2020. cloud-init upstream uses this meeting as a platform for community updates, feature/bug discussions, and an opportunity to get some extra input on current development.
<Odd_Bloke> usrdev: I'm not 100% sure from that description, could you file a bug using the link in the topic and attach the output of `cloud-init collect-logs` on an affected instance?
<blackboxsw> We generally have this meeting every 2 weeks (outside of intermittent holidays)... You can always find the next scheduled meeting in the topic of this channel
<blackboxsw> Let's schedule the next meeting now as well
<blackboxsw> Any objections to Jan 21 ?
* blackboxsw changed the topic of #cloud-init to: cloud-init pull-requests https://git.io/JeVed | Meeting minutes: https://goo.gl/mrHdaj | Next status meeting January 21 17:15 UTC | 19.4 (Dec 17) | 20.1 DROP py2.7 (Jan 2020) | https://bugs.launchpad.net/cloud-init/+filebug
<robjo> Look I'm not late ;)
<blackboxsw> ok topic set for next meeting
<blackboxsw> nope, just me robjo :) welcome to the party
<blackboxsw> as always previous meeting minutes are here.
<blackboxsw> #link https://cloud-init.github.io/status-2019-12-10.html#status-2019-12-10
<blackboxsw> topics for this round: Feel free to interject/suggest other topics at any time. Our typical format is the following: Previous Actions, Recent Changes, In-progress Development, Community Charter, Upcoming Meetings, Office Hours (~30 mins).
<robjo> The move to Tuesday creates a conflict for me for the last 15 minutes of the meeting. Generally I don't think that's an issue as we are often done in less than 1 hour, just pointing out that usually I have to leave 15 minutes early
<robjo> not today ;)
<blackboxsw> +1 robjo. We'll try to keep it snappy :) and if others have conflicts we can certainly touch on shifting the schedule a bit. We generally have a conflict at 1 hr before this meeting, which is the only reason it isn't 1 hr earlier
<blackboxsw> #topic Previous Actions
<blackboxsw> last round: rharper to confirm https://github.com/canonical/cloud-init/pull/42 can land. COMPLETED
<blackboxsw> action2:  upstream core-devs to decide about whether a PR can land if any upstream dev still has 'requested changes'
<blackboxsw> Odd_Bloke: started writing up a spec/procedure for PR review and he is currently working on adding a documentation addition PR to http://cloudinit.readthedocs.io that will describe the workflow for a PR to get from proposed -> merged.
<blackboxsw> that PR should likely be up this week for review if folks are watching our review queue
<blackboxsw> #link https://github.com/cloud-init/cloud-init/pulls
<blackboxsw> No other actions from the previous meeting in December.
<blackboxsw> #topic Recent Changes
<blackboxsw> recent commits that made it into tip: found via git log --since 12-10-2019
<blackboxsw> let's see if I get throttled for spam
<blackboxsw>     - freebsd: fix create_group() cmd (#146) [Gonéri Le Bouder]
<blackboxsw>     - doc: make apt_update example consistent (#154)
<blackboxsw>     - doc: add modules page toc with links (#153) (LP: #1852456)
<blackboxsw>     - Add support for the amazon variant in cloud.cfg.tmpl (#119)
<blackboxsw>       [Frederick Lefebvre]
<ubot5> Launchpad bug 1852456 in cloud-init "doc: list of modules is no longer present" [Medium,Triaged] https://launchpad.net/bugs/1852456
<blackboxsw> heh
<blackboxsw>     - freebsd: fix create_group() cmd (#146) [Gonéri Le Bouder]
<blackboxsw>     - doc: make apt_update example consistent (#154)
<blackboxsw>     - doc: add modules page toc with links (#153) (LP: #1852456)
<blackboxsw>     - Add support for the amazon variant in cloud.cfg.tmpl (#119)
<blackboxsw>       [Frederick Lefebvre]
<blackboxsw>     - ci: remove Python 2.7 from CI runs (#137)
<blackboxsw>     - modules: drop cc_snap_config config module (#134)
<blackboxsw>     - migrate-lp-user-to-github: ensure Launchpad repo exists (#136)
<blackboxsw>     - docs: add initial troubleshooting to FAQ (#104) [Joshua Powers]
<blackboxsw>     - doc: update cc_set_hostname frequency and descrip (#109)
<blackboxsw>       [Joshua Powers] (LP: #1827021)
<ubot5> Launchpad bug 1827021 in cloud-init "SSH Documentation should mention "Host Key"" [Medium,Triaged] https://launchpad.net/bugs/1827021
<blackboxsw>     - ci: emit names of tests run in Travis (#120)
<blackboxsw>     - Release 19.4 (LP: #1856761)
<ubot5> Launchpad bug 1856761 in cloud-init "Release 19.4" [Undecided,Fix released] https://launchpad.net/bugs/1856761
<blackboxsw>     - rbxcloud: fix dsname in RbxCloud [Adam Dobrawy] (LP: #1855196)
<blackboxsw>     - tests: Add tests for value of dsname in datasources [Adam Dobrawy]
<blackboxsw>     - apport: Add RbxCloud ds [Adam Dobrawy]
<blackboxsw>     - docs: Updating index of datasources [Adam Dobrawy]
<ubot5> Launchpad bug 1855196 in cloud-init "RBXCloud has no dsname defined, so datasource cannot be properly detected." [Low,Triaged] https://launchpad.net/bugs/1855196
<blackboxsw>     - docs: Fix anchor of datasource_rbx [Adam Dobrawy]
<blackboxsw>     - settings: Add RbxCloud [Adam Dobrawy]
<blackboxsw>     - doc: specify _ over - in cloud config modules
<blackboxsw>       [Joshua Powers] (LP: #1293254)
<ubot5> Launchpad bug 1293254 in cloud-init "style guide on dashes vs underscores in cloud-init" [Low,Fix released] https://launchpad.net/bugs/1293254
<blackboxsw>    - tools: Detect python to use via env in migrate-lp-user-to-github
<blackboxsw>       [Adam Dobrawy]
<blackboxsw>     - Partially revert "fix unlocking method on FreeBSD" (#116)
<blackboxsw>     - tests: mock uid when running as root (#113)
<blackboxsw>       [Joshua Powers] (LP: #1856096)
<blackboxsw>     - cloudinit/netinfo: remove unused getgateway (#111)
<blackboxsw>     - docs: clear up apt config sections (#107) [Joshua Powers] (LP: #1832823)
<ubot5> Launchpad bug 1856096 in cloud-init "unittest failure when running tests as root: no such file or dir: 'ud'" [High,Fix released] https://launchpad.net/bugs/1856096
<blackboxsw>     - doc: add kernel command line option to user data (#105)
<blackboxsw>       [Joshua Powers] (LP: #1846524)
<ubot5> Launchpad bug 1832823 in cloud-init "docs: confusing heading "Add apt repositories"" [Low,Fix released] https://launchpad.net/bugs/1832823
<ubot5> Launchpad bug 1846524 in cloud-init "docs: cloud-init user-data docs should mention kernel cmdline options" [Wishlist,Fix released] https://launchpad.net/bugs/1846524
<blackboxsw>     - config/cloud.cfg.d: update README [Joshua Powers] (LP: #1855006)
<blackboxsw>     - azure: avoid re-running cloud-init when instance-id is byte-swapped
<blackboxsw>       (#84) [AOhassan]
<blackboxsw>     - fix unlocking method on FreeBSD [Igor Galić] (LP: #1854594)
<blackboxsw>     - debian: add reference to the manpages [Joshua Powers]
<blackboxsw>     - ds_identify: if /sys is not available use dmidecode (#42)
<blackboxsw>       [Igor Galić] (LP: #1852442)
<ubot5> Launchpad bug 1855006 in cloud-init "config/cloud.cfg.d/README says "All files" rather than "*.cfg"" [Low,Fix released] https://launchpad.net/bugs/1855006
<blackboxsw>     - docs: add cloud-id manpage [Joshua Powers]
<blackboxsw>     - docs: add cloud-init-per manpage [Joshua Powers]
<ubot5> Launchpad bug 1854594 in cloud-init "lock passwd implemented wrong on FreeBSD" [Medium,Fix released] https://launchpad.net/bugs/1854594
<blackboxsw>     - docs: add cloud-init manpage [Joshua Powers]
<blackboxsw>     - docs: add additional details to per-instance/once [Joshua Powers]
<blackboxsw>     - Merge pull request #96 from fred-lefebvre/master [Joshua Powers]
<blackboxsw>     - Update doc-requirements.txt [Joshua Powers]
<ubot5> Launchpad bug 1852442 in cloud-init "ds-identify uses the /sys filesystem which is linux specific and non-portable" [Undecided,Fix released] https://launchpad.net/bugs/1852442
<blackboxsw>     - doc-requirements: add missing dep [Joshua Powers]
<blackboxsw> Ok that should do it.
<blackboxsw> maybe best to just pastebin next time
<robjo> yup
<blackboxsw> lots of doc changes as you can see. dropping python 2.7 automatic testing
<blackboxsw> some additional FreeBSD enablement work landed too (thanks Goneri && meena )
<blackboxsw> total changelog since last meeting:
<blackboxsw> #link https://paste.ubuntu.com/p/Cwnn3SbmWQ/
<blackboxsw> much better
<blackboxsw> #topic In-progress Development
<blackboxsw> We've dusted off our shoes and will get back into using our Trello board more frequently for immediate updates on what we are currently working on.
<blackboxsw> New Year's resolution and all
<blackboxsw> #link https://trello.com/b/hFtWKUn3/daily-cloud-init-curtin
<blackboxsw> expect to see more cloud-init cards migrating through the lanes of the board. Expectation as well is that we'll drop the backlog and ideas lanes and keep the board a simple kanban of what is in progress, review and done
<blackboxsw> Also note I'm going to drop the community charter lane and create bugs for each item, tagging them 'bitesize' so that quick drivebys of developers that want to contribute can search bugs for those straightforward tasks
<blackboxsw> that said, some high level goals upstream is working on:
<blackboxsw> - cloud-init one-shot daemon work
<blackboxsw> - cloud-init network hotplug handling
<blackboxsw> - boot performance improvements
<blackboxsw> - github automation and tooling improvements for expedited reviews and process
<blackboxsw> I think that plus reviewing the PR active review queues will keep folks busy for the next 2 weeks :)
<blackboxsw> we will likely be adding a cloud-init SRU into xenial, bionic, disco, eoan into the mix as well
<blackboxsw> #topic Community Charter
<blackboxsw> So generally I'd be pointing to the trello lane "Community low hanging fruit" but I hope to convert those cards to bugs today. So let's say community ongoing efforts fall into two camps:
<blackboxsw> 1. add json schema validation to missing cloudinit/config/cc_*.py modules. (I think there are about 45 remaining modules that need json schema for syntax validation)
<blackboxsw> 2. doc scrub and update for datasources in read the docs
<blackboxsw> All of these items can easily be worked in parallel, which is why they are a good set of tasks for the greater community
<blackboxsw> Expect to find them by searching cloud-init bugs for bitesize tag
<robjo> With bugs remaining in launchpad, would it be a good idea to have things like the schema validation not as bugs but issues in GitHub?
<robjo> that would make them more visible IMHO
<robjo> and those are not really bugs nor is it pressing
<blackboxsw> #link https://bugs.launchpad.net/cloud-init/?field.tag=bitesize
<blackboxsw> robjo: good suggestion.  I think we were trying to avoid the confusion of having two places for bugs (launchpad bugs and github issues). That is a good point though, and maybe it's worth a mailing list discussion to get others to weigh in.
<Odd_Bloke> I would be -1 on enabling issues, we would spend our entire lives telling people to report in Launchpad instead.
<Odd_Bloke> I totally understand wanting to separate "bugs" and "development tasks", though.
<Odd_Bloke> But I don't think we have a _great_ way of doing that which doesn't end up with a confusing experience for bug reporters.
<robjo> True that people will equate issues in GitHub with bugs and thus file problems there rather than launchpad, it's a two edged sword
<blackboxsw> right, I think designation is there. We could also add a link to community charter bugs to the top-level README.md for the github project. Just so there is a close breadcrumb in github to get to those items
<Odd_Bloke> Our plan is to assess how this is working in a month or two, so if it's not working well then we can figure something else out.
<blackboxsw> I think the designation of "community development tasks" is there by using bitesize tag or some equivalent
<blackboxsw> #ACTION bbsw seed initial community charter bitesize bugs
<meetingology> ACTION: bbsw seed initial community charter bitesize bugs
<blackboxsw> #topic Office Hours (next ~30 mins)
<robjo> Well, "community development tasks" is a bit mis-leading, after all the core team should be part of the "community" right?
<robjo> So everything is really a "community development task", just that some things are easier than others ;)
<blackboxsw> robjo: yes absolutely. right... I've seen some projects use 'goodfirstbug' or something like that  too
<blackboxsw> just something to reduce the barrier to involvement for anyone wanting to contribute
<blackboxsw> and yes, core team should be accountable to work on some of those community charter tasks when time permits
<robjo> Yes, I think it is important to label the "easy" stuff to help people find a place to get started
<blackboxsw> so that hopefully next cloud-init summit we can set a charter for something else
<robjo> just based on experience there are a lot of people that are sensitive to wording and we don't really want to get into the bikeshedding that comes along with such situations
<blackboxsw> for those reading, office hours is a time of open and unstructured discussion. core cloud-init devs will have eyes on the channel to field questions, concerns, feature or bug discussions. Participate at will. In the absence of any ongoing discussions, upstream will groom/review the active review queue @ https://git.io/JeVed
<Odd_Bloke> Honestly losing my mind over this bug: https://bugs.launchpad.net/cloud-init/+bug/1858615
<ubot5> Ubuntu bug 1858615 in cloud-init "Fail to boot when NoCloud datasource is included" [Undecided,New]
<Odd_Bloke> The board reboots if you use dmidecode!
<Odd_Bloke> smoser: As you said, that's a regression.  Do you think it follows that the fix should be in cloud-init?
<Odd_Bloke> Because I don't know how you deal with something that broken from where we are in the stack. :/
<Odd_Bloke> (Unless we think this is enough evidence that we can't reliably use dmidecode on aarch64, then I guess it is on us to stop doing that. :( )
<robjo> This was probably in the e-mail by rharper I have not yet read, but I'll ask anyway ;)
<robjo> I think I had some pending merge proposals in launchpad and patches, did these "magically" make their way into GitHub? Do I need to sort out where things were?
<smoser> i've heard "board reboots if you use dmidecode" before.
<smoser> and maybe even cloud-init skipped calling dmidecode on aarch64 to avoid that.
<smoser> but that is sheer nonsense
<Odd_Bloke> Very glad that boards like this are going to be in the walls of every building in 5 years. ;)
<smoser> umm..... fix your hardware ?
<blackboxsw> other dmidecode issues on other hardware here too https://bugs.launchpad.net/qemu/+bug/1243287
<ubot5> Ubuntu bug 1243287 in QEMU "[KVM/QEMU][ARM][SAUCY] fails to boot cloud-image due to host kvm fail" [Undecided,Fix released]
<smoser> it's more forgivable because dmidecode is privileged but i swear that all it does is *read* /dev/mem
<blackboxsw> robjo: for your pending merge proposals we'd like to see you propose against github if possible. Looking for a run of ./tools/migrate-lp-user-to-github robjo <your_GITHUB_USERNAME> to get your github user included as a CLA signer
<blackboxsw> then we have Contributor License Agreement accountability and can start merging those branches on the github side
<robjo> Yeah I haven't migrated to the GitHub repo.... even in 2020 the 24 hour/day limitation remains, darn it ;)
<robjo> I'll get at least my migration to GitHub done this week, possibly even this afternoon
<blackboxsw> heh, absolutely, and actually I mistyped your migrate cmd:  ./tools/migrate-lp-user-to-github rjschwei <YOUR_GITHUB_USERNAME>
<blackboxsw> ok think that about wraps the meeting for today. Happy new year folks! Thanks for dropping in!
<blackboxsw> #endmeeting
<meetingology> Meeting ended Tue Jan  7 18:39:55 2020 UTC.
<meetingology> Minutes:        http://ubottu.com/meetingology/logs/cloud-init/2020/cloud-init.2020-01-07-17.30.moin.txt
<meena> smoser: didn't we have an actual dmidecode fix that would be a better candidate for causing this? given how much care we have taken that my code never gets executed on Linux
<smoser> meena: yeah, i incorrectly blamed you i think
<smoser> but on such a system ds-identify would probably run dmidecode
<meena> does that System have /sys/class?
<meena> then, no
<meena> well, rather: then, no, it shouldn't
<meena>         # if `/sys/class/dmi/id` exists, but not the object we're looking for,
<meena>         # do *not* fallback to dmidecode!
<meena>         return
<meena> we should ask the person if that path exists
<anarcat> hello!
<Odd_Bloke> Hi!
<anarcat> i'm trying to figure out if cloud-init is a good fit for improving our install processes here
<anarcat> we have a heterogeneous environment, with machines distributed across "cloud" (Hetzner) and baremetal hosts, in different locations, usually without local network authority/trust
<anarcat> we have a common trunk of commands we run (by hand!!) on each machine when we set it up, and a set of hosting-specific setup procedures, of course
<anarcat> i'm thinking the common set might be done better by cloud-init than trying to write my own installer
<anarcat> i also looked at FAI, to give you an idea
<anarcat> and i'm wondering if I shouldn't just write Ansible stuff instead
<Odd_Bloke> How do the bare metal hosts get provisioned?
<blackboxsw> Is this fai? https://fai-project.org/
<anarcat> blackboxsw: yes
<anarcat> Odd_Bloke: blood, sweat and tears
<anarcat> it depends.
<anarcat> on Hetzner "robot" (bare metal) it's https://help.torproject.org/tsa/howto/new-machine-hetzner-robot/
<anarcat> it's horrendous
<anarcat> cloud is https://help.torproject.org/tsa/howto/new-machine-hetzner-cloud/
<anarcat> and the common trunk is in https://help.torproject.org/tsa/howto/new-machine/
<anarcat> my MVP is to replace the common trunk
<Odd_Bloke> anarcat: Are you looking to configure Puppet as part of this, or remove Puppet entirely?
<anarcat> Odd_Bloke: we keep puppet
<blackboxsw> most of your https://help.torproject.org/tsa/howto/new-machine/ steps look like they could be replaced with existing cloud-config modules (puppet support, hostname setting, package installation, etc.)
<Odd_Bloke> OK, cool.
<Odd_Bloke> I figured, but just in case.
<anarcat> blackboxsw: yeah, that's what i gathered as well
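For context, the cloud-config modules blackboxsw mentions map onto user-data roughly like the sketch below; the hostname, package list, and puppet server are placeholders, not values taken from the discussion.

    #cloud-config
    hostname: node01
    fqdn: node01.example.org
    manage_etc_hosts: true
    package_update: true
    packages:
      - vim
      - tmux
    # cc_puppet can install the agent and point it at an existing puppet master
    puppet:
      install: true
      conf:
        agent:
          server: "puppet.example.org"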
<anarcat> blackboxsw: how about the partitioning stuff in https://help.torproject.org/tsa/howto/new-machine-hetzner-robot/ ?
<anarcat> does cloud-init support LUKS and raid?
<anarcat> and lvm? and btrfs and zfs and weirdofs? :)
<anarcat> the only compromise i'd be ready to do on puppet would be to run ansible to bootstrap puppet
<anarcat> we have too much puppet stuff written to switch away
<meena> Odd_Bloke: https://github.com/canonical/cloud-init/pull/143 merge merge merge
<Odd_Bloke> Yeah, I totally understand!
<blackboxsw> anarcat: some basic disk formatting and config https://cloudinit.readthedocs.io/en/latest/topics/modules.html#disk-setup
<Odd_Bloke> meena: done done done
<meena> anarcat: if luks had documentation, we could consider supporting it
<blackboxsw> With MAAS (maas.io) the curtin project also handles bare metal disk formatting and provisioning and hands the configured machine over to cloud-init
<blackboxsw> anarcat: ^
<Odd_Bloke> MAAS isn't really practical for Hetzner though?
<blackboxsw> good pt.
<anarcat> blackboxsw: this doesn't seem to say which filesystem types are supported...
<anarcat> i must say i find the documentation a bit confusing :/
<anarcat> meena: luks is definitely documented... i am not sure what you mean
<anarcat> i mean its documentation isn't great either
<anarcat> but it's documented
<Odd_Bloke> So I don't know specifics of LUKS and raid off the top of my head.
<anarcat> maas looks something like FAI that assumes you control DHCP and have a PXE server, not our case
<anarcat> right
<anarcat> cloud-init is, after all, ... er... cloud :p
<anarcat> RAID is abstracted away
<meena> https://dev.glitch.social/@hirojin/103382178381991566
<anarcat> meena: heh
<anarcat> meena: then i think you meant "good" documentation :p
<blackboxsw> anarcat:  here are a couple of other contextual examples for disk setup filesystem type etc: https://cloudinit.readthedocs.io/en/latest/topics/examples.html#create-partitions-and-filesystems  and  https://cloudinit.readthedocs.io/en/latest/topics/examples.html#create-partitions-and-filesystems
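For reference, the disk_setup/fs_setup/mounts modules linked above express basic partitioning and filesystem creation roughly as in this sketch; the device names, label, and mount point are placeholders, and LUKS/RAID are not covered by these modules.

    #cloud-config
    disk_setup:
      /dev/sdb:
        table_type: gpt
        layout: true          # single partition spanning the disk
        overwrite: false
    fs_setup:
      - label: data
        filesystem: ext4
        device: /dev/sdb1
    mounts:
      # device, mount point, fstype, options, dump, pass
      - [ /dev/sdb1, /srv/data, ext4, "defaults,nofail", "0", "2" ]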
<anarcat> most documentation sucks, to be honest :p
<meena> anarcat: does that answer your question
<Odd_Bloke> But it has fairly decent support for multiple filesystems, and partitioning
<anarcat> Odd_Bloke: yeah, i saw those thanks
<anarcat> meena: i guess it does
<meena> apparently not. go read some openbsd documentation
<anarcat> meena: sorry, say again?
<anarcat> meena: did you just RTFM me?
 * anarcat confused
<meena> also, luks vs luks2 feels like a php function bad, use this new php function
<meena> anarcat: no, i meant just for comparison
<meena> openbsd docs are really good
<anarcat> from here on, you can assume i have read other documentation than the cryptsetup manpage
<anarcat> meena: actually, i find the FreeBSD docs much better than openbsd :p
<anarcat> but i agree openbsd has good docs :)
<Odd_Bloke> anarcat: So I think cloud-init can do a chunk of what you want natively.
<anarcat> Odd_Bloke: right :)
<anarcat> it doesn't seem reasonable to expect it to do funky disk partitioning though
<Odd_Bloke> And when it comes down to it, you can pass shell scripts as additional configuration.
<anarcat> right
<Odd_Bloke> So hopefully it lets you structure some of the stuff, and only have to have scripting for the Complicated Bits.
<anarcat> maybe i could reuse FAI's setup-storage thing to do my funky stuff
<anarcat> i created those monstrosities already https://gitweb.torproject.org/admin/tsa-misc.git/tree/installer
<anarcat> so maybe those could be hooked up
<meena> anarcat: yeah, like, you can read man make in freebsd top to bottom and actually understand it, and you can read gnu's make manual, and not understand anything
<anarcat> bsd make is simpler than gnu make, though
<anarcat> but yeah, the info system might have been a mistake
<Odd_Bloke> Oh, I just recite the four software freedoms three times while thinking about a GNU make option, and the help text for it just enters my head.
<anarcat> Odd_Bloke: that's surprisingly disturbing
<Odd_Bloke> Is that not how it works for everyone else?
<anarcat> no
<anarcat> i start by using make
<anarcat> then end up writing a shell script in make
<anarcat> then realize what the fuck am i doing
<anarcat> and rewrite it in python
<meena> i have never once successfully read anything in info
<Odd_Bloke> i always yell about variable assignment for a while
<anarcat> god that's insane
<Odd_Bloke> definitely some pacing up and down
<anarcat> oh, i have another actual question :)
<anarcat> so it seems cloud-init is designed to be automatically hooked up in cloud images
<meena> i used to use make before i knew puppet, cuz it's idempotent!
<anarcat> so it's kind of pre-installed or something... e.g. on the youtube video in the frontpage, you just copy-paste the config in a text field when you create an amazon instance
<Odd_Bloke> Yep, that's how it's expected to be installed.
<anarcat> but i'm a weirdo that hate clouds and love the sun
<anarcat> so i'm bare bones
<anarcat> if i'm lucky, i have a decent debian install to start from
<anarcat> how does that work from there?
<anarcat> apt install cloud-init?
<anarcat> then call cloud-init do-magic?
<Odd_Bloke> That's a good question.  We would _generally_ expect cloud-init to be included in the image mastering (or customisation) process, and cloud-init is generally invoked by systemd units in a few phases.
<Odd_Bloke> I'm not 100% sure what happens when you install it on a system without it, because there are no such Ubuntu systems and I work on Ubuntu. :p
<anarcat> uh!
<Odd_Bloke> But I see no reason why it wouldn't work, off the top of my head.
<anarcat> so an Ubuntu desktop comes with cloud-init pre-installed?
<Odd_Bloke> Oh no, I meant servers, sorry.
<blackboxsw> anarcat: ubuntu cloud-images all come with cloud-init installed.
<anarcat> how do you pass the config in that context? https://cloudinit.readthedocs.io/en/latest/topics/datasources/nocloud.html ?
<anarcat> blackboxsw: yeah i figured as much, but i meant more generically
<anarcat> assume that i'm building a cloud
<anarcat> can cloud-init be used to setup the cloud? :)
<anarcat> maybe i'm just picking the wrong hammer here, that would be fine too :)
<Odd_Bloke> anarcat: So the NoCloud data source looks for a CD drive/ISO with a particular label attached to the system, and reads data off of it.
<meena> what *is* the cloud?
 * blackboxsw just tested in a debian stretch (9) lxc that didn't have cloud-init: apt-get install cloud-init (0.7.9, waaay old) worked fine and ran automatically on reboot
<anarcat> meena: other people's computers
<Odd_Bloke> That data is the stuff that on, for example, EC2 would be fetched from the metadata service.
<anarcat> Odd_Bloke: a *CD* drive?
<anarcat> like the plastic coasters you put your beer on? :)
<Odd_Bloke> A virtual one!
<anarcat> yuck!
<anarcat> it must be all sticky! ;)
<anarcat> okay, got it
<Odd_Bloke> :)
<anarcat> can i get away with not doing that? :)
<Odd_Bloke> Things like hostname, network configuration, and the user-data/custom data that you would type into a box.
<anarcat> like fetch over https://i-trust-this-domain-because-i-love-dnssec.com?
<Odd_Bloke> That's a good question, I believe you can.
<anarcat> is that the "nocloud" data source?
<meena> https://github.com/canonical/cloud-init/pull/144 merge merge merge
<Odd_Bloke> That would be the nocloud-net data source.
<rharper> NoCloud also reads from /var/lib/cloud/seed/nocloud-net/{user-data,meta-data,network-config}
<rharper> or from any filesystem with the label 'cidata'
<Odd_Bloke> anarcat: So if you're going to be installing cloud-init yourself, you could consider having a simple script which populates /var/lib/cloud/seed/nocloud-net/... from https://its-not-paranoia-if-theyre-really-out-to-get.you/ before cloud-init runs.
<blackboxsw> robjo: https://github.com/canonical/cloud-init/pull/158 will land when CI passes thanks
<Odd_Bloke> Or you could write the appropriate configuration to /etc/cloud/cloud.cfg.d/... to point at your server.
<anarcat> i see
<anarcat> you can hook datasources in /etc/cloud?
<blackboxsw> anarcat: datasources can be set with something like this:
<blackboxsw> root@dev-x:~# cat /etc/cloud/cloud.cfg.d/90_dpkg.cfg
<blackboxsw> # to update this file, run dpkg-reconfigure cloud-init
<blackboxsw> datasource_list: [ ConfigDrive, NoCloud, OpenNebula, DigitalOcean, Azure, AltCloud, OVF, MAAS, GCE, OpenStack, CloudSigma, SmartOS, Bigstep, Scaleway, AliYun, Ec2, CloudStack, Exoscale, None ]
<Odd_Bloke> anarcat: Yeah, so you would just hard-code the specific datasource you want, and the config to kick it off.
<blackboxsw> if your image provides just one datasource type in the list, cloud-init will default to use that datasource. otherwise ds-identify script will try detecting the datasource
<Odd_Bloke> That long list is intended for images which will run across multiple platforms; cloud-init performs local-only discovery to find which of those data sources are applicable, and then uses that one.
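A minimal sketch of that cloud.cfg.d approach, assuming the NoCloud datasource and its seedfrom option; the file name and URL below are placeholders.

    # /etc/cloud/cloud.cfg.d/99-nocloud-net.cfg
    datasource_list: [ NoCloud, None ]
    datasource:
      NoCloud:
        # URL (trailing slash) under which user-data and meta-data are served
        seedfrom: https://config.example.org/seed/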
<anarcat> okay
<anarcat> i don't quite understand the details, but it seems i'll be able to do the magic i need
<anarcat> do i need to reboot for cloud-init to do its stuff, or can i run it interactively?
<Odd_Bloke> I expect you can run it interactively (and depending on what the units do on installation, may have to do something to stop it happening :p), but a reboot might give you a more consistent experience with other cloud-init users.
<blackboxsw> anarcat: right and you can optionally provide other specific datasource config params with a /etc/cloud/cloud.cfg.d file that looks like  https://paste.ubuntu.com/p/kD4SgnMm7J/
<blackboxsw> I *think*, though I haven't played with overriding NoCloud config in a while. (just looked over the ds_cfg reading in cloudinit/sources/DataSourceNoCloud.py)
<blackboxsw> and yes you *could* run it directly without reboot during development with sudo cloud-init init --local; sudo cloud-init init; sudo cloud-init modules --mode=config; sudo cloud-init modules --mode=final
<anarcat> how do you remember all that stuff :)
<anarcat> thanks!
<blackboxsw> it's what the systemd units/jobs do for each cloud-init boot stage per https://cloudinit.readthedocs.io/en/latest/topics/boot.html
<blackboxsw> anarcat: I don't remember it. I fgrep -r cloud-init /lib/systemd/ | grep Exec
<blackboxsw> that told me the individual cloud-init commands that are run by systemd during boot :)
<blackboxsw> fgrep -ir to save swap space in my head
<anarcat> awesome
<anarcat> you guys rock
<Odd_Bloke> rharper: I'd be interested to hear your thoughts on the LUKS/RAID storage configuration anarcat discussed above.
<blackboxsw> btw anarcat if you are creating your own debian image, I'd recommend building/getting a more recent version of cloud-init than 0.7.9 it's about 4 years old.
<blackboxsw> ubuntu 16.04 (Xenial) and later are currently running cloud-init 19.3 (released late in 2019)
<anarcat> versions in debian: https://tracker.debian.org/pkg/cloud-init
<blackboxsw> buster & bullseye look good
<anarcat> blackboxsw: yeah, don't worry, we're on stable and up for new machines, so 18.3 at least
<blackboxsw> +1
<anarcat> i'm trying to avoid building images, because i don't always control the initial setup
<anarcat> i am trying to avoid building an installer
<anarcat> building images means writing an installer
<anarcat> building an image is like installing debian
<anarcat> i wish we would just converge
<rharper> Odd_Bloke: right, anarcat: there is limited support in cloud-init for filesystem creation and partitioning; for advanced storage I would point toward curtin, https://curtin.readthedocs.io/en/latest/index.html, which has extensive support for creating and customizing storage, including bcache, raids, luks, zfs, filesystems, lvm, etc;
<anarcat> curtin, intersting thanks
<anarcat> not in debian though
<rharper> it's an installer, and under the MAAS product, it runs curtin from cloud-init to install ubuntu or other OSes to physical (or virtual) systems;  it's all command-line based, so it can be incorporated into whatever workflow you have;  it's ubuntu centric, so on the debian path, there are some known issues where it doesn't quite match debian (like kernel package names and other defaults)
<rharper> no
<anarcat> so MAAS does the basic PXE bootstrap, then calls cloud-init which calls curtin?
<blackboxsw> MAAS does PXE bootstrap to call curtin which pre-provisions, then calls cloud-init
<anarcat> oic
<rharper> blackboxsw: not quite, PXE boot into a live image, which runs cloud-init, pulling cloud-config from MAAS, which invokes curtin with config provided by maas;
<rharper> MAAS boots the cloud image ephemerally; and uses that environment to run curtin to deploy any target OS to the platform, curtin customizes the storage layout, unpacks the payload, then runs a "post-install" we call curthooks to enable additional packages, configurations, bootloader etc;
<anarcat> such a mess
<rharper> bare metal provisioning is overkill for customizing images; for sure.
<rharper> that said, re-using curtin to customize storage may be helpful depending on how much you want to do yourself vs generating config for curtin to consume;
<meena> can we get cloud-init or something else sensible, like pxe for installing android phones?
<blackboxsw> rharper: doesn't curtin's post-install step pass reporting #cloud-config to properly configure the node to talk to maas etc. Or is that earlier?
 * blackboxsw was thinking curthooks: handle_cloudconfig function
<blackboxsw> but maybe that is only for during ephemeral provisioning  process as you stated
<rharper> blackboxsw: right, so we configure the baremetal target machine to use cloud-init as well, it will contact maas for further configuration; this first boot on baremetal is just like a cloud instance; first time booting, finds  a datasource, fetches config etc
<rharper> we emit cloud-config to set the datasource to the MAAS controller with creds to talk to MAAS, etc
#cloud-init 2020-01-08
<otubo> blackboxsw: tribaal well, we don't have a specific plan right now, but we should have one pretty soon. I'm working together with the Fedora maintainer so we can have a better policy for updating the package, so that could be extended to CentOS as well.
<otubo> I'll send an email to the group once we have a plan. Thanks for asking about that :-) And sorry for the delay on the reply, I was on vacation :-)
<tribaal> otubo: oh, that's good to hear! in the meantime, what would you suggest to be the best course of action? For context, we're a cloud provider maintaining several OS base images - CentOS being one of them - and we recently (in 19.3) got our own cloud-init datasource that we'd like to use instead of the more generic former one
<tribaal> so, ubuntu images have their cloud-init version SRU'd, and for debian 10 we will inject a deb from testing and do our own QA pass over it, but I'm not too sure what the best course of action is here for CentOS/Fedora (having personally had less exposure to those ecosystems)
<tribaal> otubo: also - "an email to the group" -> I'd be very interested to be part of that group if I'm not there already via e.g. the cloud-init mailing list :)
<otubo> tribaal: https://launchpad.net/~cloud-init there's a mailing list section at the bottom :)
<otubo> tribaal: if I understand you correctly, all you need is more or less an updated package version for CentOS so you can use your DataSource?
<tribaal> otubo: yes, correct. We could build a new package ourselves (of course) but would rather help test/contribute if appropriate
<tribaal> we don't (currently) maintain public archives for rpms either, so skipping the cost of setting that up would be welcome. But again, if needed, we can.
<otubo> tribaal: what are the issues when using your DS with current available packages?
<tribaal> otubo: so, in 18.5 (or generally, before 19.3) the datasource we use is racy in specific conditions, resulting in a portion of cloud-init runs to fail to bring up networking. Note: that's specific to our platform.
<tribaal> since we want to have the liberty to diverge from our previously-used datasource (cloudstack), we created an Exoscale specific one, the first fix being that specific race condition
<otubo> tribaal: I understand. Well I can't promise you any dates, but I'm positive we're going to update at least Fedora in February.
<otubo> tribaal: I still need to work on CentOS
<tribaal> otubo: understood - let me know if/how we can help
<otubo> tribaal: what about a small vps on your infrastructure for running tests? :-)
<robjo> rharper: ping
<rharper> robjo: here
<robjo> rharper: received a bug w.r.t. mixing "static" and DHCP IP addresses, i.e. IPv4 may be static and IPv6 dynamic or vice versa
<robjo> I think we have reached the breaking point of trying a non-differentiated write of ifcfg-eth*
<robjo> OK with introducing a "flavor" into the sysconfig dict that configures the renderer?
<robjo> BOOTPROTO setting needs to be handled differently between SLES and RHEL?
<rharper> reading
<robjo> https://bugs.launchpad.net/cloud-init/+bug/1858808
<ubot5> Ubuntu bug 1858808 in cloud-init "On SUSE mixed static and dhcp setups are no properly configured" [Undecided,New]
<robjo> RHEL will complain about BOOTPROTO setting of dhcp4 and dhcp6, I think
<rharper> yes
<rharper> I'm updating bug with doc details
<robjo> I'll add the SLES doc
<rharper> Ive done that
<robjo> lol
<rharper> currently looking for why we set BOOTPROTO="static"
<rharper> as that doesn't seem to have any effect; I suspect none is the right setting instead
<robjo> In newer versions, per comment in the code we do not set static
<rharper> ok
<robjo> Well "none" is implied "static", I am not a fan of this being implicit
<rharper> sure; but I suspect it's better to follow documentation of the distro
<robjo> Well, "none" if we write that actually has a meaning:
<robjo> none
<robjo>                      For bonding slaves, to .....Note:  Do not use to just skip the IP setup -- use
<robjo>                      BOOTPROTO="static"  without  any  addresses in the IPADDR
<robjo>                      variables (or routes) instead.
<rharper> hrm
<rharper> that's troublesome; I don't see any code that checks the static value
<robjo> we really do want "static" when that's what we get from the config ;)
<rharper> nm, I totally missed the first line
<ananke> so this is a fun error. running '/usr/bin/cloud-init status --wait' on amazon linux 2 results in a 'NameError: global name 'PermissionError' is not defined'
<ananke> full error: http://dpaste.com/3NEB0AS
<rharper> https://bugs.launchpad.net/cloud-init/+bug/1831146
<ubot5> Ubuntu bug 1831146 in cloud-init "PermissionError undefined on python 2.7" [Medium,Triaged]
<blackboxsw> ananke: what distro again CentOS 7?
<Odd_Bloke> blackboxsw: "on amazon linux 2"
<blackboxsw> which is python3 ?
<Odd_Bloke> Is it?
<blackboxsw> I thought it wasn't as we heard in cloud-init summit I think?
<blackboxsw> heh, and right (I was oblivious to the "on amazon linux 2" and jumped straight to the error
<blackboxsw> https://forums.aws.amazon.com/message.jspa?messageID=870220
<blackboxsw> AL2  default looks to be py2 per that thread and the faq for core packages maintained https://aws.amazon.com/amazon-linux-2/faqs/
<blackboxsw> sooo Odd_Bloke with us dropping py2.7 in trunk, where does that leave AL2 and its python2 default
<blackboxsw> s/trunk/tip
<Odd_Bloke> It leaves Amazon needing to get into this^Wlast decade. :3
<blackboxsw> maintaining their own cloud-init py2 compat support I suppose
<blackboxsw> yeppers, unfortunately
<ananke> figures that I'd dig up another issue :)
<Odd_Bloke> For a non-sarcastic answer, I'm not entirely sure.
<blackboxsw> yeah and not one this time that upstream would be likely to address ananke :/ it's still worth a bug https://bugs.launchpad.net/cloud-init/+filebug
<Odd_Bloke> Well, the bug is in 18.5, right?
<ananke> /bin/cloud-init 18.5-2.amzn2
<Odd_Bloke> Which was advertised as supporting 2.7.
<ananke> yep
<blackboxsw> Odd_Bloke: ahh right. :/
<blackboxsw> so that'd be us patching the last py2 support branch
<ananke> this is whatever the latest amazon linux 2 btw
<Odd_Bloke> Yeah, though there's not much benefit to doing that if Amazon aren't going to use that branch.
<blackboxsw> minor changeset to fix that for python2 :/
<Odd_Bloke> So we probably need to reach out to Amazon and talk to them about how they want to handle this sort of issue, as they're one of the last places that has Python 2 in their most recent release.
<ananke> with amazon pushing cloud-init, I imagine they'd be receptive
<Odd_Bloke> blackboxsw: rharper: BTW, realised I hadn't actually done the thing: https://github.com/canonical/cloud-init/pull/159
<blackboxsw> haha Odd_Bloke
<blackboxsw> man
<blackboxsw> oops broke protocol
<blackboxsw> btw rharper https://github.com/canonical/cloud-init/pull/77 has a nit if you'd like to add it to the comment
<blackboxsw> we can land that then ^
<Odd_Bloke> Haha, was just about to point that out.
<Odd_Bloke> blackboxsw: If you want it fixed, should it be Approved?
<blackboxsw> since you are making the rules Odd_Bloke you can break them :)
<blackboxsw> Odd_Bloke: I guess that falls into "Approve (with nits)"
<blackboxsw> I can make the change then :)
<rharper> blackboxsw: I think so, I have allow edits enabled
<blackboxsw> will try that out now
<blackboxsw> hrm so should I be able to git push to ryan's github repo fix/network-static6-rendering
 * blackboxsw reads the docs
<blackboxsw> as I don't think I can add a comment to a file that wasn't touched by the original PR via the github PR web UI
<blackboxsw> got it
<blackboxsw> git checkout <rh-remote>/<pr_branch_name> -b <local_branch>; add nits; git commit -am 'nits' ; git push <rh-remote> <local_branch>:<pr_branch_name>
<blackboxsw> oops, clicked the merge button before fully scrubbing the squashed commit message. Odd_Bloke rharper when we have Co-authored-by references in a squash merge commit, should we be using our @canonical or personal emails?
<blackboxsw> it seems Dan and I have a default email set as personal
<blackboxsw> not sure it matters much as our merge markers are all authored as @canonical
<blackboxsw> I've switched my primary email address in my github profile to chad.smith@canonical.com, but then I need to remember to change/override that for any external opensource projects I would touch.
<blackboxsw> meena: I think the only thing blocking one of your PRs is dropping the integration test test_boottime from the PR (and maybe addressing an integration test as a separate PR) https://github.com/canonical/cloud-init/pull/53/files#r360061164
<Odd_Bloke> blackboxsw: I don't think it particularly matters, TBH.
<blackboxsw> Odd_Bloke: as in we can land as is, or maybe add a comment #TODO(move to cloud_tests) for that test?
<Odd_Bloke> blackboxsw: No, I meant the thing you pinged me about. :)
<blackboxsw> ahh email
<blackboxsw> right
<meena> friends, it's late here( 20:38)so I'm in bed already and very sleepy
<meena> I'll clean this up tomorrow
<wyoung> nn
<ananke> is there a recommended solution for running 'runcmd' items as a specific user?
<ananke> say I have just a few commands, so not worth creating a discrete script. I could prepend each one with sudo, but it would seem there would be an easier way
<Odd_Bloke> runcmd doesn't have support for it, from looking at the code.
<Odd_Bloke> We could consider adding such support, though we would need to think about the security/correctness implications of not using sudo (or, I suppose, we could just call sudo).
<Odd_Bloke> ananke: You could do `sudo sh -c "...; ..."`, which simplifies things a little.
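For illustration, a runcmd stanza along those lines; the 'deploy' user, repository URL, and commands are placeholders, and runcmd entries themselves still start out running as root.

    #cloud-config
    runcmd:
      # each entry runs as root; drop privileges per command with sudo
      - sudo -u deploy git clone https://example.org/app.git /home/deploy/app
      - sudo -u deploy sh -c 'cd /home/deploy/app && ./setup.sh'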
<ananke> Odd_Bloke: thanks for the suggestions!
<Odd_Bloke> :)
<smoser> ananke: yaml anchors are useful
<smoser> http://paste.ubuntu.com/p/35DFhjsGYP/
<smoser> thats the kind of stuff that smoser wastes time on.
<robjo> smoser: What does this mean in Ubuntu? in a netplan
<robjo> eth1:
<robjo>                         match:
<robjo>                             macaddress: cf:d6:af:48:e8:80
<robjo>                         set-name: eth1
<robjo> does the interface get brought up but get no IP address?
<robjo> or does dhcp magically run?
<powersj> hmm I thought you had to have a dhcp4: true or a addresses:
<blackboxsw> yowsa smoser that is a hefty bit of cloud-config. me saves it to grok a bit later
<blackboxsw> I don't use anchors nearly enough (and probably need to check it out via the cloud-init devel schema annotation stuff to make sure we can annotate properly)
<blackboxsw> s/nearly enough/at all/
#cloud-init 2020-01-09
<ananke> smoser: thanks! that's exactly what I was considering. what are those key words with & in front of them? I've never seen that nomenclature
<ananke> eg: write_launch_info
<ananke> ahh, it's some kind of labels
<ananke> ohh, this is nifty
<ananke> smoser: what's the significance of 'sm_misc'? I can't find any reference to that in cloud-init's docs
<ananke> nevermind, that was too obvious :)
<smoser> ananke: they're called 'anchors'
<smoser> sm_misc means nothing, just a namespace.
<smoser> it is hard to search for
<ananke> smoser: thanks for the hints. I've been able to use similar approach and get it working. I do have one question: why are those labels/anchors sometimes referred as '*something' and sometimes as 'something'?
<ananke> I'll try to read more about anchors & aliases, perhaps I'm missing something obvious
<smoser> ananke: always need the & to refer to it.
<smoser> err... sorry. the * to refer to it, the & to name it.
<smoser> fwiw, 'sm_misc' is just 'smoser misc'
<ananke> smoser: yeah, my 'too obvious' comment was related to figuring out what sm_misc stood for :) took me too long to figure that out.
<ananke> smoser: as to the alias reference, I was just confused by some of the nomenclature where I saw the same things appear twice. eg: - [sh, -xc, *write_exe, write_apt_go_fast, apt-go-fast, *apt_go_fast]
<ananke> I have a hard time figuring out how this maps out exactly, what is 'apt-go-fast' doing there as a list item
<Odd_Bloke> ananke: That's a `sh` thing.  *write_exe is the command_string, write_apt_go_fast is the command_name, and then everything else (apt-go-fast, *apt_go_fast) are arguments to command_string.
<Odd_Bloke> ananke: (Those names come from the dash man page, I didn't just know them off the top of my head :p https://linux.die.net/man/1/dash)
<Odd_Bloke> The command name throws me almost every time.
<ananke> ahh, thanks. the naming convention was throwing me off, and made me question what I thought was a YAML reference and what wasn't
<ananke> on an unrelated note, is there a way to access the name of the cloud provider from within the user-data config? I'd like to have some conditional logic for when a given recipe is executed against 'nocloud' platform
<ananke> I'm already writing the recipe as a jinja template
<smoser> yeah, sh command thing is a pain.
<smoser> with sh -c 'shell commands blob here'
<smoser> the *next* argument will be the "command name".
<smoser> if you don't give it one, '$*' and '$1' and the like aren't going to work as expected (they'll be off by one)
<smoser> so i try to give it one that would make some sense in a 'ps'
<smoser> other times you'll see people use '--' or '-' ... but those seem no more descriptive to me
<smoser> maybe 'this-is-argument-zero-just-ignore-it'
<ananke> ahh, that's good to know. thank you
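Putting the anchor and command-name pieces together, a small sketch; the holding key, anchor name, and echo text are made up, and only the &/* mechanics and the sh -c positional behaviour described above are the point.

    #cloud-config
    # arbitrary holding key (like sm_misc above); '&name' defines an anchor,
    # '*name' refers back to it later in the same YAML document
    my_misc:
      - &greet |
        echo "command name (\$0) is: $0"
        echo "first argument (\$1) is: $1"
    runcmd:
      # with sh -c <string>, the next word becomes $0 (a name visible in the
      # sh process arguments) and the remaining words become $1, $2, ...
      - [sh, -xc, *greet, greet-label, hello]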
<robjo> lunch
<robjo> back
<meena> hello, i'm alive; i'll be writing some patches / fix some of my prs
<blackboxsw> saving the world one PR at a time :)
<meena> cloning freebsd and freebsd-ports
<meena> at the same time.
<meena> my computer is unimpressed
<meena> so, import distro patches: https://github.com/canonical/cloud-init/pull/161
<meena> rebase https://github.com/canonical/cloud-init/pull/53 and throw out the integration test. @skipUnlessFreebsd stays, however. in case it might be needed later.
<meena> soooooooooooooo… has anyone had any time to give my thoughts over at https://hackmd.io/3-YBj1t9TAeKhmfLBQUjXQ?view
<meena> the silence is deafening
<rharper> meena: I left some comments on your hackmd doc, and reviewed your PR
<meena> uninstalled DejaVu and now my emoji are displayed
<meena> rharper: i'm not sure i understand this: https://github.com/canonical/cloud-init/pull/53#pullrequestreview-340801051
<meena> ah
<ProfFalken> Hey all, I'm running Cloud-init 18.5 from EPEL on a custom CentOS 7 box in DigitalOcean, and I don't appear to have access to the network interface information.
<ProfFalken> I'd expect it to be at ds.meta_data.interfaces or similar, but that key doesn't even exist
<ProfFalken> (but it does exist for an Ubuntu server)
<ProfFalken> Any ideas what might be causing this?
<Odd_Bloke> ProfFalken: Are they both using the same datasource?  (`cloud-init status --long` will tell you the DS in use.)
<blackboxsw> ProfFalken: and `cloud-init query --all` will show all the instance metadata obtained.     `cloud-init query --format '{{ds.meta_data.keys()}}'`  will show all keys present by  datasource's scraped metadata.
<robjo> rharper: More fun with networking https://github.com/canonical/cloud-init/pull/162
<blackboxsw> ProfFalken: from https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/DataSourceDigitalOcean.py#L68L74 it looks to me like it attempts to pull that metadata content if present regardless of distribution type.
<Odd_Bloke> blackboxsw: If they aren't using the same data source, that doesn't matter. :)
<blackboxsw> right. good pt
<robjo> This should get us to "clean" ifcfg-*" files, i.e. only options understood on the given distro RHEL or SUSE get written to the files, it solves the BOOTPROTO handling bug, and gets the SUSE ifcfg-* files closer to what they should look like for bonding, vlan and bridge configs
<blackboxsw> Odd_Bloke: also, if it didn't exist, and they were actually using the DO datasource, network_config  would have died horribly:             raise Exception("Unable to get meta-data from server....")
<rharper> robjo: thanks
<robjo> After we get this through I'll come back with a proposal for route writing and then DNS handling after that
<ProfFalken> Odd_Bloke / blackboxsw: sorry, got dumped out there for a minute, anyway, it looks like it's because I'm running my own custom images, not ones provided by DO, because a DO CentOS box is running the DO datasource
<ProfFalken> I'm talking to them now, if they manage to fix it then I'll let you know, but I'm going to assume it's a DO issue, not a Cloud-init one. Thanks for the pointers :)
<meena> rharper: fixed
<rharper> meena: excellent
<blackboxsw> yeah ProfFalken, I think Odd_Bloke was coming to that conclusion per his suggestion about status --long. Your custom images should probably derive from latest DO CentOS images, or you could alternately try adding a /etc/cloud/cloud.cfg.d/99-mandate-digitalocean-datasource.cfg  file  which would provide "datasource_list: [DigitalOcean]"
<blackboxsw> that way cloud-init would default to DigitalOcean on your custom images (if you were rolling your own for digital ocean specifically)
<blackboxsw> I *think*
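i.e. a file roughly like the following; per blackboxsw's earlier note, a single-entry datasource_list makes cloud-init default to that datasource instead of running discovery.

    # /etc/cloud/cloud.cfg.d/99-mandate-digitalocean-datasource.cfg
    datasource_list: [ DigitalOcean ]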
<ProfFalken> blackboxsw: oh, that's a neat trick, I'll have a look at that, thanks.  And yes, we are rolling our own for DO (and will be for other cloud providers as well) - we need a specific disk layout amongst other things, so this could be awesome if it works, I'll try it now...
<blackboxsw> ProfFalken: generally cloud-init should be able to detect the right datasource, so you shouldn't have to 'pin' a specific one for each image you create. So something else may be going on there (check your /var/log/cloud-init.log and /run/cloud/ds-identify.log  to see why detection of DigitalOcean datasource didn't work for cloudinit)
<ProfFalken> I'm being told it's something to do with the way DO have their networking configured, I'm trying to get more details now...
<blackboxsw> +1. may end up being a cloud-init bug/feature then
<ProfFalken> interestingly, curl http://169.254.169.254/metadata/v1/interfaces/private/0/ipv4/address returns the expected data
<ProfFalken> so it's when I'm trying to use the YAML for Cloud-Config, or querying via the CLI that it's not being seen
<Odd_Bloke> ProfFalken: The cloud-config and `query` command use cloud-init's cached state.
<Odd_Bloke> So that suggests to me that it isn't using the DO datasource, so cloud-init never goes to fetch the DO-specific configuration (which is then, of course, not present in cloud-init's cache).
<ProfFalken> apparently the "solution" is to use a bash-script rather than the YAML, which makes the config more complicated, but if it works... :(
<ProfFalken> ah, ok
<rharper> I don't believe we expose the datasource network_config in the instance data; further, in instance metadata that we persist, the datasource must have a 'network_json' value
<Odd_Bloke> rharper: The assertion is that it works on Ubuntu but not on CentOS.  And it's under the meta_data key, so I think that's whatever the datasource wants to put in there?
<rharper> well, Ubuntu vs Centos almost always works out to cloud-init version differences
<Odd_Bloke> (i.e. I believe that's DO network configuration, not v1 or v2 YAML)
<ProfFalken> Odd_Bloke: FWIW, DO appear to be in agreement with you
<rharper> Odd_Bloke: I guess what I was getting at (which may not be ProfFalken's issue) is we currently don't consistently persist network metadata from platforms in a consistent place; which would be a nice feature improvement to the instance-metadata.json
<Odd_Bloke> rharper: Agreed.
<ProfFalken> I'm trying a switch over to using bash instead of using cloud-config, I'll see how it goes
<Odd_Bloke> ProfFalken: If it's working on Ubuntu but not on CentOS, I don't believe you should have to switch, but if you'd prefer to stick with DO's advice then I'm happy to defer.
<Odd_Bloke> (AFAICT, the DO datasource hasn't changed since 18.5, which was the version you're using on CentOS, I believe?)
<rharper> Odd_Bloke: https://github.com/canonical/cloud-init/pull/53  ;  I'm happy with this branch, it has a "Request changes" from you;  could you take a peek and see if you're happy as well ?
<Odd_Bloke> Sure thing.
<ProfFalken> Odd_Bloke: I'll give it a go and see what happens, but there does seem to be something "not quite right" on the DO side of things here
 * rharper grumbles about "vendor-data" 
<Odd_Bloke> meena: Sorry, one last request on #53.
<meena> Odd_Bloke: fair enough
<meena> Odd_Bloke: done
<Odd_Bloke> Thank you!
<meena> i've also re-reviewed goneri's netbsd branch, maybe someone with more python / cloudinit experience might wanna rein in my review
<ProfFalken> Odd_Bloke / blackboxsw: The bash-script approach has got me up and running again, thanks again for all your help, I'll take it up with DO and see if they can get it sorted on their end! :)
<meena> always a good approach…
 * meena has spent about a year or more to get FreeBSD running on Hetzner
<meena> somebody pls merge all my patches
<meena> This pull request can be automatically merged by project collaborators
<meena> anyway, bed time.
#cloud-init 2020-01-10
<Odd_Bloke> blackboxsw: Ugh, this Actions PR stuff is a real PITA.
<meena> Odd_Bloke: so you should look at something more pleasant likeâ¦ how to refactor cloudinit/net
 * meena has a call with a customer in minus two minutes, and hopefully i won't fall asleep until it actually happens
<Odd_Bloke> meena: Oh, I thought you'd got feedback from Ryan?
<Odd_Bloke> I can still take a look, of course. :)
<meena> Odd_Bloke: he added > [] - and that's all i've seen
<Odd_Bloke> What more could you want? ;)
<meena> Odd_Bloke: yeah, i dunno either, sometimes i just feel like i have too high expectations which i don't know how to communicate
<Odd_Bloke> meena: So I'll take a look at that doc today.  And let's be explicit: what feedback are you seeking?
<Odd_Bloke> blackboxsw: Just dropped a wall of text regarding next steps for CLA actions: https://github.com/canonical/cloud-init/pull/164#issuecomment-573075057
<rharper> meena: I hope all of the comments made it
<rharper> meena: in view mode, on the right, there is a 💬 mark with a number to indicate comments, clicking that will show them on the right; I added comments to sections: Background, Code duplication
<meena> rharper: aahh. so I discovered @goner's comment, and you appear to have answered it
<rharper> meena: yes
<rharper> meena: it took me a little bit to realize I could comment without making changes to the content; that's why you saw my [] edit ...
<blackboxsw> Odd_Bloke: ok I responded to the wall of text and refactored the CLA signing PR to still get triggered on pull_request, and fail the status check with the CLA failure comment. If we still think we need to add a comment or label to PRs, we could leave in a top-level PR grooming action that'd walk open PRs and add labels.
<Odd_Bloke> Ugh, not even being able to comment _sucks_.
<Odd_Bloke> Really feel like "you won't be able to use this for the most basic GitHub action" should be more prominent in their docs somewhere. ¬.¬
<meena> rharper: i'll try to clarify re your comments.
<rharper> meena: great;  generally I think Odd_Bloke 's suggestion is the direction to go;  but I wanted to make sure we understand the end goal of reuse;  maybe "more portable to other BSDs" is  what you mean vs.  cross distro (which to me refers within the linux family variants)
<meena> rharper: nah, on pretty much all BSDs most of these things will work the same… except DHCP.
<meena> s/BSDs/Linuxes/
<meena> i hope i have clarified. it a bit now: i reckon we can remove the code duplication section?
<meena> rharper: my issue with Odd_Bloke's suggestion is how that if we put the functions into the Distro classes, then, uh, how do i get a distro object in cloudinit/net?
<rharper> meena: re: code dupe; yes; though maybe it's worth a bug/PR to comment/change the naming so it's not as confusing ?
<meena> rharper: they are private functions, so, yah
<meena>  rharper: if i started with *that* PR, that would be the ULTIMATE YAK SHAVE.
<Odd_Bloke> meena: https://github.com/canonical/cloud-init/blob/master/cloudinit/distros/__init__.py#L85-L94 <-- this is where the Renderer is instantiated, so I think you could just pass `self` in there?
<Odd_Bloke> (I'll defer to rharper on if that makes sense though.)
<rharper> Odd_Bloke: for renderers, yes; for cloudinit.net ; which is the bulk of the porting work;  I don't think we need Distro there;  net module could likely get by with use of is_freebsd()  variants;
<rharper> with some import log, we likely could have 'from cloudinit import net'   determine linux vs. bsd and  pull in named methods
<meena> rharper: https://github.com/canonical/cloud-init/pull/147/files
<rharper> s/log/logic
<meena> rharper: i did that … well, i started that a year ago, but i sucked at python / cloud-init. and so i, very naïvely, tried to translate all functions instead of just the public functions
<meena> but from the layout of cloudinit/net it's not clear what's public and what's not
<meena> perhaps, as i mentioned, a good first step would be to mark the private functions first.
<rharper> meena: everything is public that's not named with leading _
<meena> rharper: i know
<meena> we need to mark the things that aren't used anywhere other than cloudinit/net with a _
<meena> i don't suck that bad at python!
<meena> i literally cannot remember what i was trying to do
<meena> (that is usually a good place to quit, and sleep instead)
#cloud-init 2020-01-11
<amansi26> Hi.. can anyone tell me what difference it makes if cloud-init status returns running vs done?
<meena> cloud-init is a oneshot service. so if you're asking systemd for a status while it's actively doing stuff, it'll say running, otherwise, done
<amansi26> meena: For rhel and sles when I install the cloud-init package, the status shows done (by default) but for ubuntu it is showing disabled (by default)
<meena> amansi26: you probably have to enable it in /etc/default/cloud*
<meena> i say, not having used cloud-init on Linux in almost two years
<meena> on *BSD we have /etc/rc.conf for the same purposes, but no one has even looked at my patch! https://github.com/canonical/cloud-init/pull/161
<amansi26> When we install cloud-init on a system, what will be the default value for cloud-init status?
<amansi26> How to change cloud-init status from running to done?
<blackboxsw> amansi26: status starts as 'not run'
<amansi26> But in my case it is coming as disabled
<blackboxsw> It expects to see all cloud-init stages completing in /run/cloud-init/status.json
<blackboxsw> https://github.com/canonical/cloud-init/blob/master/cloudinit/cmd/status.py#L101
<blackboxsw> And if disabled it'll report that
<blackboxsw> https://github.com/canonical/cloud-init/blob/master/cloudinit/cmd/status.py#L73
<blackboxsw> amansi26: running cloud-init status --long  will tell you why disabled
<amansi26> blackboxsw: It says "Cloud-init disabled by cloud-init-generator". But then I checked for /etc/cloud/cloud-init.disabled (it does not exist) and /proc/cmdline (does not contain cloud-init=disabled)
<blackboxsw> amansi26: so this is a fresh pkg install, cloud-init's generator hasn't run yet to create init files which will run cloud-init on next boot.
<blackboxsw> Generator is at /lib/systemd/system-generaties
<blackboxsw> Oops /lib/systemd/system-generators/cloud-init-generator is what enables cloud init
<amansi26> blackboxsw: There is no such file on the system
<blackboxsw> amansi26: what distibution again?
<blackboxsw> ubuntu?
<amansi26> blackboxsw: yes
<blackboxsw> if that ubuntu image didn't have cloud-init already installed, it is probably not a certified cloud image from ubuntu from https://cloud-images.ubuntu.com/
<amansi26> blackboxsw: I can tell you the steps I performed. 1.Took an iso from http://old-releases.ubuntu.com/releases/16.04.4/ubuntu-16.04-server-ppc64el.iso and installed the ubuntu system. 2. I have a custom cloud-init debian package, I tried installing that. It got installed. But the status is disabled(by-default, as discussed).
<blackboxsw> if I `lxc launch ubuntu-daily:bionic mybionic; lxc exec mybionic ls /lib/systemd/system-generators/` I can see cloud-init-generator file there
<blackboxsw> amansi26: if you are trying to boot cloud-image with cloud-init you really should source from official cloud-image isos
<blackboxsw> https://cloud-images.ubuntu.com/xenial/current/
<amansi26> But there is no iso file out there
<amansi26> blackboxsw: When will this file /lib/systemd/system-generators/cloud-init-generator get generated?
<blackboxsw> amansi26: it should be part of the deb package built
<blackboxsw> packages/bddeb could help with that
<blackboxsw> it defaults to the systemd init system type which is what packages the cloud-init-generator from systemd/*.tmpl
<blackboxsw> './tools/run-container' could also perform a build of your custom cloud-init dir in an lxc container on centos/debian/ubuntu and emit a built binary package to your $CWD
#cloud-init 2020-01-12
<turova>  hello! I'm trying to use cloud-init in proxmox to set my IP on a fresh install of Ubuntu 18.04 and while the username/password do get updated, for networking changes, it just changes my old netplan file to a blank one
<turova> I changed the setting to disable predictable network interface names because I saw that that was required, but didn't change anything else
