[14:07] <rharper> smoser: I'm going to merge the policy branch into my net-passthrough and then see if this all still works =)
[14:07] <smoser> rharper, ok... i just did something, tried to test your branch and lxc and it didn't seem to do what i expected.
[14:08] <smoser> putting together something to show you what i tried.
[14:08] <rharper> ok
[14:13] <rharper> smoser: in the policy feature; you've not wired it into the distro yet, right?  also not poked at what the systemconfig key/val would be yet?
[14:20] <rharper> smoser: so, we don't have a common config dictionary for renderers; that makes the distro part awkward: after it fetches the first renderer via policy, it still needs to know which renderer it got so it can construct possibly different configs; unless we come up with a common config structure
[14:21] <smoser> rharper, well, at first i dont think we *need* any configuration on nit.
[14:26] <rharper> renderers today all take a config dict; and they're all different; that's just the code today without any new code;
[14:26] <rharper> some things are common, for example, eni and sysconfig take a 'netrules_path' key for udev rule hooks
[14:27] <rharper> but eni has an eni_header (which we could truncate to header, and reuse for netplan as well, we use the same one anyhow);
[14:27] <smoser> rharper, right. so we do have to know how to call each of the renderers, which kind of sucks
[14:27] <rharper> sysconfig uses dns_path  and sysconf_dir,
[14:27] <rharper> yeah
[14:28] <smoser> but we do not need to necessarily have that configured
[14:28] <smoser> the datasource can just raise a RuntimeError if it doesn't know how to call the selected_renderer
[14:28] <smoser> but even then, i think the renderers all take config={}
[14:28] <rharper> they all take the config; I'm saying it's annoying in the distro that supports multiple renderers
[14:28] <smoser> and probably do something sane-ish... they can  just be improved to better dtrt
[14:29] <rharper> the logic should be 1) render_cls = load_net_render_by_policy();  2) render  = render_cls(config)
[14:30] <rharper> but instead, we have to ask well, if render_cls == 'eni' then X; elif render_cls == 'netplan' Y
[14:30] <rharper> where the config dict looks different
[14:30] <rharper> that's what I don't like;
[14:31] <smoser> i understand.
[14:31] <smoser> but 2 things
[14:31] <smoser> a.) maybe you don't have to do that.... you just call renderer(config={})
[14:31] <smoser>   (why can't that do the right thing)
[14:32] <smoser> i dont remember 'b'
[14:33] <rharper> yeah, looking at the constructors; they all have sane defaults;  save the HEADER isn't enabled by default;
[14:33] <rharper> and the header is common, so that could work
[14:34] <rharper> alternatively, the distro could pack both sets of config into a single dict; they do not overlap
[14:34]  * rharper plays with some code and unittests
[14:36] <smoser> rharper, right. i considered that too
[14:36] <smoser> (putting both configs in)
[14:36] <smoser> but honestly that isn't any more difficult than this:
[14:37] <smoser> configs = {'eni': {'eni_data_for_config_param': 1}, 'netplan': {'netplan stuff'}}
[14:37] <smoser> name, renderer = load_net_render_by_policy()
[14:37] <smoser> renderer(config=configs[name])
[14:38] <smoser> or config=configs.get(name, {})
[14:40] <rharper> well, load_renderer instantiates the class, you pass the config in
[14:41] <rharper> but I'll play around with something
[14:41] <rharper> we could pass the config dict as you show with the class name configs and have the loader pull out the config
[14:41] <rharper> that seems sane
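The approach settled on above (per-renderer configs keyed by name, with the loader pulling out the matching entry) could look roughly like the sketch below. The class names, the RENDERERS table, and both function names are hypothetical stand-ins, not cloud-init's actual API; 'netrules_path' is the key mentioned in the chat, and its value here is only illustrative.

```python
# Hypothetical stand-ins for the real renderer classes.
class EniRenderer:
    def __init__(self, config=None):
        self.config = config or {}


class NetplanRenderer:
    def __init__(self, config=None):
        self.config = config or {}


RENDERERS = {"eni": EniRenderer, "netplan": NetplanRenderer}


def select_renderer(priority, available):
    """Return (name, class) for the first renderer in priority that is available."""
    for name in priority:
        if name in available:
            return name, RENDERERS[name]
    raise RuntimeError("no usable renderer among: %s" % ", ".join(priority))


def load_renderer(priority, available, configs):
    """Instantiate the selected renderer with its own config, or {} if none given."""
    name, cls = select_renderer(priority, available)
    return name, cls(config=configs.get(name, {}))


# The caller packs every renderer's config into one dict, keyed by name.
configs = {
    "eni": {"netrules_path": "etc/udev/rules.d/70-persistent-net.rules"},
    "netplan": {},
}
name, renderer = load_renderer(["eni", "netplan"], {"netplan"}, configs)
```

With this shape the distro never branches on the renderer name; an unknown name simply falls back to an empty config.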
[14:53] <smoser> rharper, http://paste.ubuntu.com/24189094/
[14:53] <smoser> that doesnt seem to get networking. it ends up rendering yaml, but with only 'lo' in the 50-cloud-init
[14:53] <smoser> trying with:
[14:53] <smoser>  /tmp/try-lxc-v2 ubuntu-daily:zesty /tmp/cloud-init_all.deb
[14:54] <rharper> k
[14:55] <rharper> unrelated, I merged your feature branch in and I'm getting unittest errors with 'is_exe' is not defined
[14:56] <smoser> rharper, i think i might not have committed that... let me check. i did see that.
[14:56] <smoser> thought i had fixed it.
[14:56] <smoser> will fix now
[14:56] <rharper> k
[14:56] <smoser> so that paste above... i verify in logs that you're rendering netplan
[14:56] <smoser> and there is a netplan config, and there is no 50-cloud-* (i was wrong)
[14:57] <smoser> but it does not get networking
[14:57] <rharper> ok, it's likely related to other systemd unit changes
[14:58] <rharper> you can check: systemctl status systemd-networkd (it should be active) and systemctl status systemd-networkd-wait-online.service (should be active too)
[14:58] <rharper> it's likely that one or the other didn't trigger which means we write the file but nothing runs
[14:58] <rharper> also /run/systemd/network/* should have 10-netplan-* files if cloud-init called netplan generate (which it should if it rendered the 50-cloud-init.yaml )
[14:59] <rharper> this is the "fun" of systemd unit magic
[15:29] <smoser> rharper,  i pushed my renderes branch... i had the is_exe change locally but not pushed. sorry
[15:29] <smoser> and confirmed that lxd works after reboot
[15:29] <smoser> with your branch network comes up after reboot
[15:32] <rharper> smoser: ok; I hacked in my own is_exe, but will switch
[15:33] <rharper> I need to update my unittests now that the default policy is to use eni; my netplan paths where I mocked out which('netplan') require a bit more work since it finds 'eni' first
[15:34] <rharper> smoser: do you have an idea on what we'd put in the sysconfig dict to override the default policy and pass it ?  I think the Distro object would do config.get('network_render_policy') or something like that
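A minimal sketch of the override being asked about, assuming the system config is a plain dict. The key 'network_render_policy' is only the name floated in the chat, not a documented setting, and the function and constant names are hypothetical.

```python
# Assumption: eni is tried before netplan by default, since the chat says
# the default policy finds 'eni' first.
DEFAULT_RENDER_PRIORITY = ["eni", "netplan"]


def render_priority(system_config):
    """Return the renderer search order, honoring an optional override.

    'network_render_policy' is a hypothetical key; it may be a single
    renderer name or a list of names in priority order.
    """
    override = system_config.get("network_render_policy")
    if not override:
        return list(DEFAULT_RENDER_PRIORITY)
    if isinstance(override, str):
        override = [override]
    return list(override)
```

A test that wants netplan selected could then pass {'network_render_policy': 'netplan'} rather than mocking the executable lookup.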
[15:37] <smoser> rharper, looking
[17:04] <szb> hi all, I'm getting errors from apt 'Could not get lock /var/lib/apt/lists/lock' when trying to add a source. It looks like the initial apt-get update/upgrade is still running? Why doesn't this block the rest of the script?
[17:20] <paulmey> I'm thinking about the scenario where we make the Azure datasource a local datasource...
[17:21] <paulmey> do local datasource get to execute any code at the end of cloud-init provisioning?
[17:30] <smoser> \o/
[17:30] <smoser> paulmey, yes.
[17:30] <smoser> well, as much as network datasources do
[17:31] <smoser> szb, what is "the rest of the script" ?
[17:31] <paulmey> ok, so even a local data source gets to signal successful provisioning over the network in the end...?
[17:31] <smoser> everything is serial currently in cloud-init. perhaps you had an old lock file there that had got captured?
[17:32] <smoser> paulmey, where does that happen for azure now ?
[17:33] <paulmey> let me look...
[17:33] <smoser> so the answer is... we'll make it work and fix what we have to to do so.
[17:35] <szb> @smoser: I have "package_upgrade: true" at the top, and then farther down I have an "apt: sources:" section and it breaks there because apt is still locked upgrading.
[17:38] <smoser> szb, can you show me what you have for apt: sources: ?
[17:38] <smoser> but fwiw, this is not run "top down".
[17:40] <szb> @smoser: http://pastebin.com/BURKWmZs
[17:45] <smoser> szb, can you paste a /var/log/cloud-init.log ?
[17:45] <smoser> (and fwiw, xenial ubuntu images have 'pastebinit') so you can just run 'pastebinit /var/log/cloud-init.log'
[17:47] <smoser> szb, i dropped the 'fs_setup' section, and it worked for me here.
[17:53] <szb> @smoser: http://paste.ubuntu.com/24190140/
[17:54] <szb> @smoser: Hmm, interesting. Is this not a good use for cloud-init, setting up mongo? I'm not sure but I like setting things up this way rather than coding a shell script.
[17:56] <smoser> unrelated suggestion: even though it's much longer, i suggest that you specify the key, not the keyid.  that way you remove the dependency on a gpg keyserver.
[17:56] <smoser> szb, it seems sane to me.
[17:56] <smoser> can you paste /var/log/cloud-init-output.log ?
[17:57] <smoser> i suspect that is what showed you the lock complaint
[17:57] <smoser> but i dont know what would cause apt to be running that early.
[17:57] <paulmey> @smoser, it's currently done at the end of the func that gets the metadata
[17:57] <paulmey> that's not the right place anyway
[17:58] <paulmey> :-/
[17:58] <smoser> well, it doesn't seem terrible at the end of that func. as it's deciding that it's "done" at that point.
[17:58] <smoser> i think it might fit in 'activate'
[17:59] <smoser> activate will be called when networking is configured.
[17:59] <szb> @smoser: http://paste.ubuntu.com/24190179/
[18:02] <smoser> szb, i'm not sure what has that lock
[18:05] <szb> @smoser: Ah, well ty for looking
[18:06] <smoser> so, i'm pretty sure that cloud-init is not doing it.
[18:06] <smoser> i've never seen this error before.
[18:06] <smoser> is this a stock ubuntu image ?
[18:07] <smoser> szb, ^ ?
[18:08] <szb> base 16.04 LTS AMI with some customizations, built from packer.
[18:17] <szb> @smoser, could this in our APT config be it? APT::Periodic::Update-Package-Lists "1"; APT::Periodic::Unattended-Upgrade "1";
[18:18] <rharper> harlowja: on the sysconfig path, in rhel.py if we use the 'apply_network_config' path which uses the renderer instead of the _write_network_config() path I noticed that the section that writes out 'etc/sysconfig/network' file isn't rendered;  I suspect that should be common ?
[18:19] <harlowja> i would agree
[18:19] <rharper> https://git.launchpad.net/cloud-init/tree/cloudinit/distros/rhel.py#n100
[18:19] <rharper> in practice, is that already configured?
[18:20] <harlowja> no afaik
[18:20] <harlowja> *not
[18:20] <rharper> I would have thought that fedora/rhel images would have failed if they were using the apply_network_config path if those values aren't set
[18:21] <harlowja> ya perhaps cut off the 'if dev_names'
[18:21] <harlowja> and just let it happen
[18:21] <rharper> y
[18:22] <rharper> it needs a bit more to make it common between the two methods; but I'll see what I can do
[18:22] <rharper> something like a _enable_networking() method
[18:22] <rharper> which takes the v6 boolean
[18:22] <rharper> then we can call it from both methods
[18:22] <harlowja> wfm
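The shared helper proposed above might be shaped like this sketch. It only builds and formats the settings, so both code paths could call it. The function names are hypothetical, and the NETWORKING / NETWORKING_IPV6 / IPV6_AUTOCONF keys are an assumption about what the existing rhel.py code writes to etc/sysconfig/network; check against the linked source before relying on them.

```python
def enable_networking_settings(ipv6=False):
    """Build the settings for etc/sysconfig/network (keys are assumed)."""
    settings = {"NETWORKING": True}
    if ipv6:
        # Assumed behavior: turn on IPv6 networking, leave autoconf off.
        settings["NETWORKING_IPV6"] = True
        settings["IPV6_AUTOCONF"] = False
    return settings


def render_sysconfig(settings):
    """Format settings as KEY=value lines, mapping booleans to yes/no."""
    lines = []
    for key in sorted(settings):
        value = settings[key]
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append("%s=%s" % (key, value))
    return "\n".join(lines) + "\n"
```

Keeping the helper free of any 'if dev_names' guard matches the suggestion to "just let it happen" on both paths.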
[18:29] <rangerpb> heya Odd_Bloke , you around? possible to talk about https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceAzure.py#n77 ?
[18:29] <rangerpb> looking for a little background in it
[18:32] <Odd_Bloke> rangerpb: I'm in a meeting ATM, but should be out soon; if you ask me some questions I'll answer once I have a minute. :)
[18:33] <rangerpb> Odd_Bloke, well , lets start with what is the purpose of that method?  can you remember why it is needed?
[18:35] <rangerpb> then the next question is that I have had to recently patch the non-agent provisioning path in DataSourceAzure to perform the whole set hostname and bounce of the network to propagate things to DDNS.  it duplicates a lot of code like https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceAzure.py#n115 and lines following (including the call to the bounce method).
[18:35] <rangerpb> smoser has suggested I see if I can refactor some of the code between both paths to reduce duplication ... but that contextlib method seems like it is going to be problematic
[18:43] <smoser> rangerpb, hey.
[18:43] <smoser> so on azure, i thought you use NetworkManager rather than sysconfig ?
[18:43] <smoser> did i make that up?
[18:44] <smoser> ah. never mind. i understand what i was going to ask. i was wondering how we render networking. but i guess on your images there, fallback networking gets rendered, but nothing uses it.
[18:46] <rangerpb> not sure what you mean
[18:46] <smoser> you use network manager in some images right?
[18:47] <rangerpb> in some yes
[18:47] <rangerpb> but i guess that decision is made by who makes the image
[18:50] <smoser> well, as it is right now, cloud-init does not support reading network configuration from a datasource and rendering it to NetworkManager.
[18:51] <smoser> so it works for you just because cloud-init will render the sysconfig networking on fedora, but your image doesn't pay any attention to it.
[18:54] <rangerpb> I believe the only thing that is read is the hostname
[19:01] <rharper> smoser: maybe time for netplan
[19:01] <rharper> it writes NetworkManager configs
[19:02] <smoser> :)
[19:02] <smoser> rangerpb, on azure that is true.
[19:03] <smoser> there, cloud-init selects a "fallback network configuration" that equates to "run dhcp on the first network interface".
[19:03] <smoser> but on other clouds (digital ocean, openstack, lxd, smartos) the cloud provider provides information on how the network should be configured
[19:07] <szb> @smoser: Fixed, I think the APT unattended upgrades in the base AMI were the issue. I removed those and rebuilt the AMI and now it's working. ty for your help!
[19:15] <smoser> szb, yeah... you want to file a bug ?
[19:15] <smoser> that is quite obnoxious...
[19:15] <Odd_Bloke> rangerpb: Hmm, let me see if I can remember.
[19:15] <smoser> generally speaking, i think that apt unattended upgrades thing is a pita. as it can cause *anything* to fail
[19:18] <Odd_Bloke> rangerpb: https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1375252 was the bug I was fixing.
[19:18] <Odd_Bloke> rangerpb: IIRC, the Azure fabric expects you to use the hostname that it has recorded for your instance.
[19:18] <Odd_Bloke> rangerpb: So cloud-init was always resetting to that hostname on boot.
[19:19] <Odd_Bloke> rangerpb: Whereas temporary_hostname just sets back to that hostname while the agent is talking to the fabric, and then returns it to the previously-set one.
[19:19] <Odd_Bloke> (So that users can modify hostnames.)
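The behavior described here (switch to the fabric's hostname for the duration of the exchange, then put the user's hostname back) maps naturally onto a context manager. The sketch below illustrates that pattern only; it is not the DataSourceAzure code, and the get_hostname/set_hostname callables are hypothetical stand-ins for whatever actually reads and sets the system hostname.

```python
import contextlib


@contextlib.contextmanager
def temporary_hostname(temp_hostname, get_hostname, set_hostname):
    """Set temp_hostname while the block runs, then restore the old name.

    Yields the previous hostname, or None if no change was needed.
    """
    previous = get_hostname()
    if previous == temp_hostname:
        yield None
        return
    set_hostname(temp_hostname)
    try:
        yield previous
    finally:
        # Restore even if the block raised, so the user's name survives.
        set_hostname(previous)


# Usage with an in-memory stand-in for the system hostname.
state = {"hostname": "user-chosen"}
seen = []
with temporary_hostname(
    "fabric-name",
    lambda: state["hostname"],
    lambda name: state.update(hostname=name),
):
    seen.append(state["hostname"])
```

Inside the block the instance reports the name the fabric expects; afterwards the user-modified hostname is back in place.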
[19:58] <smoser> powersj, around ?
[19:58] <smoser> https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/319878
[19:58] <powersj> smoser: yes
[19:58] <smoser> jane's entry in /etc/shadow should have $1$xyz$sPMsLNmf66Ohl.ol6JvzE.
[19:58] <smoser> right?
[19:59] <powersj> yes
[19:59] <powersj> want a simple test of that one?
[20:00] <smoser> yea
[20:00] <smoser> do you know what that password is ?
[20:01] <powersj> I don't, I think I grabbed a string and messed it up more
[20:02] <smoser>  ?
[20:02] <smoser>  !$1$ssisyfpf$YqvuJLfrrW6Cg/l53Pi1n1
[20:02] <smoser> https://git.launchpad.net/cirros/commit/?id=95f4ffa2f5339aa04226718895a780f2994817b9
[20:02] <powersj> oh well then maybe I didn't lol
[20:03] <powersj> want me to change it?
[20:07] <smoser> yeah, lets use that.
[20:07] <smoser> just because its good to use something i think that we know.
[20:07] <smoser> and put a comment in there.
[20:09] <smoser> or we can just put their names for each one
[20:11] <smoser> powersj, i'll do this and give you a suggestion
[20:12] <powersj> smoser: ok, are you fine with the Random -> RANDOM changes as well? I kind of lumped those in and was not sure if that should be another merge or not
[20:14] <smoser> its fine. its supposed to be RANDOM to indicate a random password, right?
[20:14] <powersj> yes
[20:17] <rangerpb> Odd_Bloke, but it appears to me in the code it sets the hostname to what azure expects by getting the hostname from the metadata...
[20:18] <Odd_Bloke> rangerpb: Right, because walinuxagent does a DHCP bounce which includes the hostname.
[20:18] <rangerpb> so ... why is the method needed then ?
[20:19]  * rangerpb must be misunderstanding something
[20:21] <Odd_Bloke> rangerpb: I think it's this: (1) User boots instance, (2) user changes hostname of instance, (3) user reboots.  At that second boot, we still need to report the original hostname.
[20:22] <Odd_Bloke> These assumptions may no longer hold; I was modifying a strategy that was already in place.
[20:23] <rangerpb> so this covers an instance when the user changes the hostname for some ungodly reason ?
[20:23] <rangerpb> interesting ...
[20:23] <rangerpb> not something I had considered
[20:24] <Odd_Bloke> rangerpb: I believe so, yes.
[20:25] <rangerpb> paulmey, do you know if this still holds water?
[20:25] <rangerpb> thanks Odd_Bloke !
[20:26] <Odd_Bloke> :)
[20:31] <rangerpb> Odd_Bloke, one reason I am questioning this method is that if the hostname is not set up correctly and the network bounces, then things like DNS may not work correctly
[20:31] <rangerpb> specifically resolving the VMs DNS name <> IP
[20:32] <Odd_Bloke> rangerpb: What do you mean by "the hostname is not set up correctly"?  When might that happen?
[20:32] <rangerpb> meaning, the hostname is not what azure wants it to be
[20:33] <Odd_Bloke> Ah, you mean if the network bounces outside of cloud-init's control?
[20:33] <Odd_Bloke> (Rather than the intentional bounce on boot?)
[20:34] <Odd_Bloke> I think it only needs to be configured as Azure expects when walinuxagent starts.
[20:42] <smoser> Odd_Bloke, well, in the "builtin"  path, cloud-init is not currently bouncing the interface.
[20:43] <smoser> the goal for rangerpb was to get the hostname from the environment file and then "publish" it to azure (where "publish" means "dhcp with set-hostname")
[20:43] <smoser> which is why we were ever bouncing the hostname
[21:02] <rangerpb> Odd_Bloke, sure, but frankly everyone wants off of the agent for provisioning
[21:02] <rangerpb> and what smoser said
[21:03]  * rangerpb runs out of gas
[21:11] <smoser> powersj, how do i save the collect stuff ?
[21:11] <smoser> so that i can verify --data-dir= ?
[21:11] <smoser> magicalChicken, ?^
[21:11] <powersj> smoser: instead of a run you want to do a collect
[21:12] <powersj> collect -n xenial -d /tmp/collection for example
[21:12] <magicalChicken> smoser: yeah, if you run collect you can then use verify afterwards on the collected data
[21:14] <smoser> so just instead of 'run' i 'collect' ?
[21:14] <magicalChicken> yeah
[21:14] <smoser> i think i'd like an option to 'run' to take --data-dir
[21:14] <magicalChicken> the rest of the args are the same, but you have to add --data-dir
[21:14] <powersj> https://cloudinit.readthedocs.io/en/latest/topics/tests.html#collect ;)
[21:15] <magicalChicken> smoser: i have plans for adding a --preserve=always for run and a --data-dir option there too
[21:15] <magicalChicken> just haven't gotten to that yet, too much other stuff first
[21:16] <magicalChicken> and that would be based on the tmpdir stuff from the bddeb branch which needs to be merged as well
[21:19] <smoser> powersj, i'm about to leave
[21:19] <smoser> but
[21:19] <smoser> http://paste.ubuntu.com/24191229/
[21:19] <smoser> is what i have
[21:19] <powersj> smoser: thx I'll take a look at it
[21:21] <smoser> hmm
[21:21] <smoser> test_shadow_passwords (tests.cloud_tests.testcases.get_suite.<locals>.tmp) ... ok
[21:21] <smoser> '<locals>'
[21:21] <smoser> powersj, that just passed a 'run' for me. so i think its good.
[21:22] <smoser> the password -> passwd  change i found when one of my asserts failed.
[21:23] <powersj> interesting is that worth a doc change?
[21:23] <smoser> doc is right :)
[21:24] <powersj> https://cloudinit.readthedocs.io/en/latest/topics/modules.html#set-passwords says password
[21:24] <smoser> its correct there.
[21:24] <powersj> I made all these tests based off of the readthedocs
[21:24] <powersj> oh
[21:24] <smoser> in a users: it is 'passwd'
[21:24] <smoser> so.. based on that quite reasonable misunderstanding of yours...
[21:25] <smoser> maybe we should make the code take 'password' or 'passwd'
[21:25] <powersj> ah right
[21:25] <smoser> https://cloudinit.readthedocs.io/en/latest/topics/modules.html#users-and-groups
[21:25] <smoser> thats what you were writing to. 'passwd'
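The idea of accepting either spelling in a users: entry could be sketched as a small normalization step, shown below. This is an illustration of the suggestion only, not existing cloud-init behavior; the function name is hypothetical.

```python
def normalize_user_config(user_cfg):
    """Return a copy of a users: entry with 'password' treated as 'passwd'.

    If both keys are present, the explicit 'passwd' value wins.
    """
    cfg = dict(user_cfg)
    if "password" in cfg:
        alias = cfg.pop("password")
        cfg.setdefault("passwd", alias)
    return cfg
```

For example, normalize_user_config({'name': 'jane', 'password': 'x'}) yields an entry keyed on 'passwd', so the rest of the code only has to look at one name.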
[21:25] <smoser> i got to run.
[21:25] <smoser> later
[21:25] <smoser> .
[21:25] <powersj> thx o/