=== shardy is now known as shardy_lunch
=== shardy_lunch is now known as shardy
=== rangerpbzzzz is now known as rangerpb
[14:07] smoser: I'm going to merge the policy branch into my net-passthrough and then see if this all still works =)
[14:07] rharper, ok... i just did something, tried to test your branch and lxc and didn't seem to do what i expected.
[14:08] putting together something to show you what i tried.
[14:08] ok
[14:13] smoser: in the policy feature; you've not yet wired it into the distro yet, right? also not poked at what the systemconfig key/val would be yet?
[14:20] smoser: so, we don't have a common config dictionary for renderers; that makes the distro part where it fetches the first renderer via policy, it still needs to know which renderer it is so it can construct possibly different configs; unless we come up with a common config structure
[14:21] rharper, well, at first i dont think we *need* any configuration on nit.
[14:26] renderers today all take a config dict; and they're all different; that's just the code today without any new code;
[14:26] some things are common, for example, eni and sysconfig take a 'netrules_path' key for udev rule hooks
[14:27] but eni has a eni_header (which we could truncate to header, and reuse for netplan as well, we use the same one anyhow);
[14:27] rharper, right. so we do have to know how to call each of the renderers, which kind of sucks
[14:27] sysconfig uses dns_path and sysconf_dir,
[14:27] yeah
[14:28] but we do not need to necessarily have that configured
[14:28] the datasource can just raise a RuntimeError if it doesn't know how to call the selected_renderer
[14:28] but even then, i think the renderers all take config={}
[14:28] they all take the config; I'm saying it's annoying in the distro that supports multiple renderers
[14:28] and probably do something sane-ish...
they can just be improved to better dtrt
[14:29] the logic should be 1) render_cls = load_net_render_by_policy(); 2) render = render_cls(config)
[14:30] but instead, we have to ask well, if render_cls == 'eni' then X; elif render_cls == 'netplan' Y
[14:30] where the config dict looks different
[14:30] that's what I don't like;
[14:31] i understand.
[14:31] but 2 things
[14:31] a.) maybe you don't have to do that.... you just call renderer(config={})
[14:31] (why can't that do the right thing)
[14:32] i dont remember 'b'
[14:33] yeah, looking at the constructors; they all have sane defaults; save the HEADER isn't enabled by default;
[14:33] and the header is common, so that could work
[14:34] alternatively, the distro could pack both sets of config into a single dict; they do not overlap
[14:34] * rharper plays with some code and unittests
[14:36] rharper, right. i considered that too
[14:36] (putting both configs in)
[14:36] but honestly that isn't any more difficult than this:
[14:37] configs = {'eni': {'eni_data_for_config_param': 1}, 'netplan': {'netplan stuff'}}
[14:37] name, renderer = load_net_render_by_policy()
[14:37] renderer(config=configs[name])
[14:38] or config=configs.get(name, {})
[14:40] well, load_renderer instantiates the class, you pass the config in
[14:41] but I'll play around with something
[14:41] we could pass the config dict as you show with the class name configs and have the loader pull out the config
[14:41] that seems sane
[14:53] rharper, http://paste.ubuntu.com/24189094/
[14:53] that doesnt seem to get networking. it ends up rendering yaml, but with only 'lo' in the 50-cloud-init
[14:53] trying with:
[14:53] /tmp/try-lxc-v2 ubuntu-daily:zesty /tmp/cloud-init_all.deb
[14:54] k
[14:55] unrelated, I merged your feature branch in and I'm getting unittest errors with 'is_exe' is not defined
[14:56] rharper, i was i might not have committed that... let me check. i did see that.
[14:56] thought i had fixed it.
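The per-renderer config idea discussed between 14:29 and 14:41 can be sketched roughly as follows. This is only an illustration under stated assumptions: the two renderer classes, the RENDERERS table, select_renderer() and build_renderer() are hypothetical stand-ins, not cloud-init's actual API, and the 'netplan_path' key is made up for the example. Only the 'netrules_path' and 'eni_header' keys and the configs.get(name, {}) lookup come from the conversation.

```python
# Hypothetical stand-ins for the real renderers; names and keys other than
# 'netrules_path' and 'eni_header' are assumptions made for this sketch.
class EniRenderer:
    def __init__(self, config=None):
        config = config or {}
        self.netrules_path = config.get("netrules_path")
        self.header = config.get("eni_header", "")


class NetplanRenderer:
    def __init__(self, config=None):
        config = config or {}
        # 'netplan_path' is an illustrative key, not taken from the log.
        self.netplan_path = config.get("netplan_path")


RENDERERS = {"eni": EniRenderer, "netplan": NetplanRenderer}


def select_renderer(priority, available):
    """Return (name, cls) for the first renderer in priority that is available."""
    for name in priority:
        if name in available and name in RENDERERS:
            return name, RENDERERS[name]
    raise RuntimeError("no usable renderer found in %s" % (list(priority),))


def build_renderer(priority, available, configs):
    """Pick a renderer by policy and hand it only its own config section."""
    name, cls = select_renderer(priority, available)
    # Renderers with no section fall back to their constructor defaults.
    return name, cls(config=configs.get(name, {}))
```

With this shape the distro never branches on the renderer name; it passes one dict keyed by renderer name and the loader pulls out the matching section, which is the variant described at 14:41.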
[14:56] will fix now
[14:56] k
[14:56] so that paste above... i verify in logs that you're rendering netplan
[14:56] and there is a netplan config, and there is no 50-cloud-* (i was wrong)
[14:57] but it does not get networking
[14:57] ok, it's likely related to other systemd unit changes
[14:58] you can check: systemctl status systemd-networkd (it should be active) and systemctl status systemd-networkd-wait-online.service (should be active too)
[14:58] it's likely that one or the other didn't trigger which means we write the file but nothing runs
[14:58] also /run/systemd/network/* should have 10-netplan-* files if cloud-init called netplan generate (which it should if it rendered the 50-cloud-init.yaml )
[14:59] this is the "fun" of systemd unit magic
[15:29] rharper, i pushed my renderes branch... i had the is_exe change locally but not pushed. sorry
[15:29] and confirmed that lxd works after reboot
[15:29] with your branch network comes up after reboot
[15:32] smoser: ok; I hacked in my own is_exe, but will switch
[15:33] I need to update my unittests now that the default policy is to use eni; my netplan paths which I mocked out a which('netplan') require a bit more work since it finds 'eni' first
[15:34] smoser: do you have an idea on what we'd put in sysconfig dict to override the default policy and pass it ? I think the Distro object would config.get('network_render_policy') or something like that
[15:37] rharper, looking
[17:04] hi all, I'm getting errors from apt 'Could not get lock /var/lib/apt/lists/lock' when trying to add a source. It looks like the initial apt-get update/upgrade is still running? Why doesn't this block the rest of the script?
[17:20] I'm thinking about the scenario where we make the Azure datasource a local datasource...
[17:21] do local datasources get to execute any code at the end of cloud-init provisioning?
[17:30] \o/
[17:30] paulmey, yes.
[17:30] well, as much as network datasources do
[17:31] szb, what is "the rest of the script" ?
[17:31] ok, so even a local data source gets to signal successful provisioning over the network in the end...?
[17:31] everything is serial currently in cloud-init. perhaps you had an old lock file there that had got captured?
[17:32] paulmey, where does that happen for azure now ?
[17:33] let me look...
[17:33] so the answer is... we'll make it work and fix what we have to to do so.
[17:35] @smoser: I have "package_upgrade: true" at the top, and then farther down I have an "apt: sources:" section and it breaks there because apt is still locked upgrading.
[17:38] szb, can you show me what you have for apt: sources: ?
[17:38] but fwiw, this is not run "top down".
[17:40] @smoser: http://pastebin.com/BURKWmZs
[17:45] szb, can you paste a /var/log/cloud-init.log ?
[17:45] (and fwiw, xenial ubuntu images have 'pastebinit') so you can just run 'pastebinit /var/log/cloud-init.log'
[17:47] szb, i dropped the 'fs_setup' section, and it worked for me here.
[17:53] @smoser: http://paste.ubuntu.com/24190140/
[17:54] @smoser: Hmm, interesting. Is this not a good use for cloud-init, setting up mongo? I'm not sure but I like setting things up this way rather than coding a shell script.
[17:56] unrelated suggestion, i suggest even though its much longer that you specify the key, not the keyid. that way you remove the dependency on a gpg server.
[17:56] szb, it seems sane to me.
[17:56] can you paste /var/log/cloud-init-output.log ?
[17:57] i suspect that is what showed you the lock complaint
[17:57] but i dont know what would cause apt to be running that early.
[17:57] @smoser, it's currently done at the end of the func that gets the metadata
[17:57] that's not the right place anyway
[17:58] :-/
[17:58] well, it doesnt seem terrible at the end of that func. as its deciding that its "done" at that point.
[17:58] i think it might fit in 'activate'
[17:59] activate will be called when networking is configured.
[17:59] @smoser: http://paste.ubuntu.com/24190179/
[18:02] szb, i'm not sure what has that lock
[18:05] @smoser: Ah, well ty for looking
[18:06] so, i'm pretty sure that cloud-init is not doing it.
[18:06] i've never seen this error before.
[18:06] is this a stock ubuntu image ?
[18:07] szb, ^ ?
[18:08] base 16.04 LTS AMI with some customizations, built from packer.
[18:17] @smoser, could this in our APT config be it? APT::Periodic::Update-Package-Lists "1"; APT::Periodic::Unattended-Upgrade "1";
[18:18] harlowja: on the sysconfig path, in rhel.py if we use the 'apply_network_config' path which uses the renderer instead of the _write_network_config() path I noticed that the section that writes out 'etc/sysconfig/network' file isn't rendered; I suspect that should be common ?
[18:19] i would agree
[18:19] https://git.launchpad.net/cloud-init/tree/cloudinit/distros/rhel.py#n100
[18:19] in practice, is that already configured?
[18:20] no afaik
[18:20] *not
[18:20] I would have thought that fedora/rhel images would have failed if they were using the apply_network_config path if those values aren't set
[18:21] ya perhaps cut off the 'if dev_names'
[18:21] and just let it happen
[18:21] y
[18:22] it needs a bit more to make it common between the two methods; but I'll see what I can do
[18:22] something like a _enable_networking() method
[18:22] which takes the v6 boolean
[18:22] then we can call it from both methods
[18:22] wfm
[18:29] heya Odd_Bloke , you around possible to talk about https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceAzure.py#n77 ?
[18:29] looking for a little background in it
[18:32] rangerpb: I'm in a meeting ATM, but should be out soon; if you ask me some questions I'll answer once I have a minute. :)
[18:33] Odd_Bloke, well , lets start with what is the purpose of that method? can you remember why it is needed?
[18:35] then the next question is that I have had to recently patch the non-agent provisioning path in DataSourceAzure to perform the whole set hostname and bounce of the network to propagate things to DDNS. it duplicates a lot of code like https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceAzure.py#n115 and lines following (including the call to the bounce method).
[18:35] smoser has suggested I see if I can refactor some of the code between both paths to reduce duplication ... but that contextlib method seems like it is going to be problematic
[18:43] rangerpb, hey.
[18:43] so on azure, i thought you use NetworkManager rather than sysconfig ?
[18:43] did i make that up?
[18:44] ah. never mind. i understand what i was going to ask. i was wondering how we render networking. but i guess on your images there, fallback networking gets rendered, but nothing uses it.
[18:46] not sure what you mean
[18:46] you use network manager in some images right?
[18:47] in some yes
[18:47] but i guess that decision is made by who makes the image
[18:50] well, as it is right now, cloud-init does not support reading network configuration from a datasource and rendering it to NetworkManager.
[18:51] so it works for you just because cloud-init will render the sysconfig networking on fedora, but your image doesnt pay any attention to it.
[18:54] I believe the only thing that is read is the hostname
[19:01] smoser: maybe time for netplan
[19:01] it writes NetworkManager configs
[19:02] :)
[19:02] rangerpb, on azure that is true.
[19:03] there, cloud-init selects a "fallback network configuration" that equates to "run dhcp on the first network interface".
[19:03] but on other clouds (digital ocean, openstack, lxd, smartos) the cloud provider provides information on how the network should be configured
[19:07] @smoser: Fixed, I think the APT unattended upgrades in the base AMI were the issue. I removed those and rebuilt the AMI and now it's working. ty for your help!
[19:15] szb, yeah... you want to file a bug ?
[19:15] that is quite obnoxious...
[19:15] rangerpb: Hmm, let me see if I can remember.
[19:15] generally speaking, i think that apt unattended upgrades thing is a pita. as it can cause *anything* to fail
[19:18] rangerpb: https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1375252 was the bug I was fixing.
[19:18] rangerpb: IIRC, the Azure fabric expects you to use the hostname that it has recorded for your instance.
[19:18] rangerpb: So cloud-init was always resetting to that hostname on boot.
[19:19] rangerpb: Whereas temporary_hostname just sets back to that hostname while the agent is talking to the fabric, and then returns it to the previously-set one.
[19:19] (So that users can modify hostnames.)
[19:58] powersj, around ?
[19:58] https://code.launchpad.net/~powersj/cloud-init/+git/cloud-init/+merge/319878
[19:58] smoser: yes
[19:58] jane's entry in /etc/shadow should have $1$xyz$sPMsLNmf66Ohl.ol6JvzE.
[19:58] right?
[19:59] yes
[19:59] want a simple test of that one?
[20:00] yea
[20:00] do you know what that password is ?
[20:01] I don't I think I grabbed a string and messed it up more
[20:02] ?
[20:02] !$1$ssisyfpf$YqvuJLfrrW6Cg/l53Pi1n1
[20:02] https://git.launchpad.net/cirros/commit/?id=95f4ffa2f5339aa04226718895a780f2994817b9
[20:02] oh well then maybe I didn't lol
[20:03] want me to change it?
[20:07] yeah, lets use that.
[20:07] just because its good to use something i think that we know.
[20:07] and put a comment in there.
[20:09] or we can just put their names for each one
[20:11] powersj, i'll do this and give you a suggestion
[20:12] smoser: ok, are you fine with the Random -> RANDOM changes as well? I kind of lumped those in and was not sure if that should be another merge or not
[20:14] its fine. its supposed to be RANDOM to indicate a random password, right?
[20:14] yes
[20:17] Odd_Bloke, but it appears to me in the code it sets the hostname to what azure expects by getting the hostname from the metadata...
[20:18] rangerpb: Right, because walinuxagent does a DHCP bounce which includes the hostname.
[20:18] so ... why is the method needed then ?
[20:19] * rangerpb must be misunderstanding something
[20:21] rangerpb: I think it's this: (1) User boots instance, (2) user changes hostname of instance, (3) user reboots. At that second boot, we still need to report the original hostname.
[20:22] These assumptions may no longer hold; I was modifying a strategy that was already in place.
[20:23] so this covers an instance when the user changes the hostname for some ungodly reason ?
[20:23] interesting ...
[20:23] not something I had considered
[20:24] rangerpb: I believe so, yes.
[20:25] paulmey, do you know if this still holds water?
[20:25] thanks Odd_Bloke !
[20:26] :)
[20:31] Odd_Bloke, one reason I am questioning this method is that if the hostname is not set up correctly and the network bounces, then things like DNS may not work correctly
[20:31] specifically resolving the VMs DNS name <> IP
[20:32] rangerpb: What do you mean by "the hostname is not set up correctly"? When might that happen?
[20:32] meaning, the hostname is not what azure wants it to be
[20:33] Ah, you mean if the network bounces outside of cloud-init's control?
[20:33] (Rather than the intentional bounce on boot?)
[20:34] I think it only needs to be configured as Azure expects when walinuxagent starts.
[20:42] Odd_Bloke, well, in the "builtin" path, cloud-init is not currently bouncing the interface.
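The behaviour described at 19:19 and 20:21 (set the fabric's hostname only while talking to the fabric, then restore whatever the user had set) maps naturally onto a context manager. The following is a minimal sketch, not the code in DataSourceAzure.py: the get_hostname and set_hostname callables are hypothetical injected helpers so the example runs without touching the real system hostname.

```python
import contextlib


@contextlib.contextmanager
def temporary_hostname(temp_hostname, get_hostname, set_hostname):
    """Use temp_hostname inside the with-block, then restore the old one.

    get_hostname and set_hostname are hypothetical callables supplied by
    the caller. Yields the previous hostname, or None if nothing changed.
    """
    previous = get_hostname()
    if previous == temp_hostname:
        # Already what the fabric expects; nothing to set or restore.
        yield None
        return
    set_hostname(temp_hostname)
    try:
        yield previous
    finally:
        # Restore the user's hostname even if the block raised.
        set_hostname(previous)
```

Anything that must see the fabric's hostname (for example a DHCP bounce that publishes it) would run inside the with-block; a user-modified hostname is back in place once the block exits.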
[20:43] the goal for rangerpb was to get the hostname from the environment file and then "publish" it to azure (where "publish" means "dhcp with set-hostname")
[20:43] which is why we were ever bouncing the hostname
[21:02] Odd_Bloke, sure, but frankly everyone wants off of the agent for provisioning
[21:02] and what smoser said
[21:03] * rangerpb runs out of gas
=== rangerpb is now known as rangerpbzzzz
[21:11] powersj, how do i save the collect stuff ?
[21:11] so that i can verify --data-dir= ?
[21:11] magicalChicken, ?^
[21:11] smoser: instead of a run you want to do a collect
[21:12] collect -n xenial -d /tmp/collection for example
[21:12] smoser: yeah, if you run collect you can then use verify afterwards on the collected data
[21:14] so just instead of 'run' i 'collect' ?
[21:14] yeah
[21:14] i think i'd like an option to 'run' to take --data-dir
[21:14] the rest of the args are the same, but you have to add --data-dir
[21:14] https://cloudinit.readthedocs.io/en/latest/topics/tests.html#collect ;)
[21:15] smoser: i have plans for adding a --preserve=always for run and a --data-dir option there too
[21:15] just haven't gotten to that yet, too much other stuff first
[21:16] and that would be based on the tmpdir stuff from the bddeb branch which needs to be merged as well
[21:19] powersj, i'm about to leave
[21:19] but
[21:19] http://paste.ubuntu.com/24191229/
[21:19] is what i have
[21:19] smoser: thx I'll take a look at it
[21:21] hm.m
[21:21] test_shadow_passwords (tests.cloud_tests.testcases.get_suite..tmp) ... ok
[21:21] ''
[21:21] powersj, that just passed a 'run' for me. so i think its good.
[21:22] the password -> passwd change i found when one of my asserts failed.
[21:23] interesting is that worth a doc change?
[21:23] doc is right :)
[21:24] https://cloudinit.readthedocs.io/en/latest/topics/modules.html#set-passwords says password
[21:24] its correct there.
[21:24] I made all these tests based off of the read the docs
[21:24] oh
[21:24] in a users: it is 'passwd'
[21:24] so.. based on that quite reasonable misunderstanding of yours...
[21:25] maybe we should make the code take 'password' or 'passwd'
[21:25] ah right
[21:25] https://cloudinit.readthedocs.io/en/latest/topics/modules.html#users-and-groups
[21:25] thats what you were writing to. 'passwd'
[21:25] i got to run.
[21:25] later
[21:25] .
[21:25] thx o/
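The 21:25 suggestion of accepting either 'password' or 'passwd' in a users: entry could look something like the sketch below. It is an assumption about one possible shape, not cloud-init code: normalize_user_password() is a hypothetical helper, and raising on conflicting values is a design choice made here for illustration.

```python
def normalize_user_password(user_cfg):
    """Return a copy of a users: entry with 'password' folded into 'passwd'.

    Hypothetical helper: treats 'password' as an alias for 'passwd' and
    refuses entries where the two keys disagree.
    """
    cfg = dict(user_cfg)  # leave the caller's dict untouched
    if "password" in cfg:
        alias = cfg.pop("password")
        if "passwd" in cfg and cfg["passwd"] != alias:
            raise ValueError("conflicting 'passwd' and 'password' values")
        cfg["passwd"] = alias
    return cfg
```

Rejecting a mismatch rather than silently preferring one key keeps a typo in user-data from quietly setting the wrong password hash.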