/srv/irclogs.ubuntu.com/2014/05/22/#cloud-init.txt

=== harlowja_ is now known as harlowja_away
=== zz_gondoi is now known as gondoi
=== gondoi is now known as zz_gondoi
=== zz_gondoi is now known as gondoi
=== sauce_ is now known as sauce
=== gondoi is now known as zz_gondoi
=== zz_gondoi is now known as gondoi
=== shardy is now known as shardy_afk
=== harlowja_away is now known as harlowja_
r-daneelsmoser, so I was wondering if I had anything more to do for patch https://bugs.launchpad.net/cloud-init/+bug/1275098 18:02
smoserhey, r-daneel 18:02
smoserso the reason it languished is probably18:03
smoserthat i just didn't have focused time to think about it.18:03
smoserits very non-trivial.18:03
r-daneelyou mean the logic is non-trivial ?18:04
smoserin that bouncing network adapters early in boot feels strange.18:04
smoserand may have unintended side effects.18:05
smoseri suspect the reason the interfaces came up was because they were "left over" from a previous instance ? ie, in a "capture" ?18:06
r-daneelwell, I experienced only 2 cases: 1. I have no proper networking, so bring_down/bring_up has no side-effect, 2. I have wrong IP info, so  I anyway will break networking by replacing the IP18:07
r-daneelthe case arises when we clone a volume18:07
r-daneeland then try booting a new VM from the clone18:07
r-daneelthe OS has already an IP (the previous - wrong - one) and has it configured before cloud-init runs18:08
r-daneelof course, cloud-init does all is needed in the config files18:08
r-daneelbut it's too late18:08
r-daneelas interfaces are set up18:08
r-daneelnow doing an ifup, either adds the new IP or fails totally18:09
harlowja_smoser u alive!18:09
smoseryeah, i dont know why my bip proxy didn't join here.18:09
smosernow it will.18:10
smoserr-daneel, so i'm admittedly not all that knowledgeable about how boot works on centos and cloud-init.18:12
smoserbut in ubuntu, networking comes up in parallel to the local datasource.18:12
r-daneelsmoser, as far as I could understand from the code, we end up calling the _bring_up_interface() method18:12
smoserif thats the case on centos (or even if its not, because i want to solve this correctly),18:12
smoserthen the 'ifdown' could fail with "interface not up"18:13
smoseror between cloud-init taking it down and then back up, the OS could bring it up.18:13
r-daneelsmoser, ifdown failing does not seem to be a real issue to me18:13
smoserand the cloud-init's "ifup" would fail18:13
smoserr-daneel, it may not seem to be an issue.18:14
smoserbut you can't blindly ignore it.18:14
r-daneelmaybe try/catch that call and pass in case of failure 618:14
r-daneel?18:14
smoserfailure 6 ?18:15
smoserbasically you have a fairly straightforward failure path, with a fairly straightforward workaround.18:16
smosersimply remove non 'eth0' (or all) interfaces before you "capture" and snapshot.18:16
r-daneel(sorry '6' is the lower case for '?' on my keyboard :p)18:16
smoserbut fixing it by just going willy nilly with 'ifdown && ifup' seems to be racy18:16
smoserand i'd rather have a guaranteed failure with a straightforward workaround18:17
r-daneelsmoser, this is a volume clone, the OS is not aware of being 'frozen'18:17
smoserthan sometimes-it-doesnt-work situation18:17
smoserr-daneel, understood.18:17
smoserbut you could easily "prep" before "capture".18:17
smosergenerally, cloud-init has made you not have to "prep" (clean). and i've wanted to make that always "just work"18:18
smoserbut there are some weird cases, where I don't know what the right behavior is.18:18
r-daneelwell, I understand that proper 'prepping' would be the right thing to do18:19
r-daneelfor instance debian/ubuntu have that udev file you'd need to cleanup 18:20
smoserdebian/ubuntu should not have udev files18:21
smoserunless you're using an odd MAC range18:21
r-daneelbut obviously, when we boot on a clone, the only issue we get is with IP configuration 18:21
smoseryeah. i do understand this is an issue. and i'd like to have it fixed properly.18:22
r-daneelif we try to do that very cleanly, we should check if there is a pre-existing setup or at least check if the network is set up as expected by our freshly installed config18:23
smoserits really hard in ubuntu, and i suspect it might be in centos too (if not now, then it might be later with move to systemd)18:23
r-daneelwe could then only ifdown if we really already have a wrong setup18:23
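The check r-daneel describes, taking an interface down only when it already carries a wrong setup, could be sketched roughly as below. This is a minimal illustration and not code from the patch under discussion: the function name needs_reset and the idea of passing address lists in as plain strings are assumptions made for the example.

```python
def needs_reset(current_addrs, expected_addrs):
    """Return True only when the interface already carries addresses
    that differ from what the freshly written config expects.

    Both arguments are iterables of address strings (hypothetical inputs;
    how they are collected is outside this sketch).
    """
    current = set(current_addrs)
    expected = set(expected_addrs)
    if not current:
        # Nothing is configured yet, so a plain ifup is enough.
        return False
    # Something is configured: reset only if it is not what we expect.
    return current != expected
```

With this shape, an unconfigured interface and a correctly configured one are both left alone, and only a stale address from a cloned volume would trigger the ifdown.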
smoserthe ordering of boot is just very much not guaranteed18:24
r-daneelwhen cloud-init does ifup it already assumes an ordering18:24
r-daneelit assumes no interface is set, and that no one will fiddle with it18:25
smoserwell, sort of.18:25
r-daneeleven worse, on centos it adds the IP to any existing config18:25
r-daneelnot enforcing the setup18:25
smoserif there was no interface configuration, for ethX, and cloud-init writes an interface configuration for ethX , at least in current ubuntu (and i think centos) nothing is magically going to bring it up18:25
smoserso that case is safe to assert "it was not up"18:25
smoserubuntu/debian 'ifdown' is terribly annoying.18:26
r-daneelbut cloud-init happily overwrites the existing config files18:27
smoserif you remove ethX from /etc/network/interfaces and then ifdown an interface that was already up, it says "not configured"18:27
smoserr-daneel, yeah, agreed. its not perfect now.18:27
r-daneelif interface is non-existent, not being able to ifdown it has no effect at worst18:27
r-daneeland in ubuntu/debian the call is ifup ALL, not event per interface18:28
r-daneels/event/even/18:28
smoserthe sanest thing to me seems to be to block any network events from occurring while you're going looking for "local" data sources18:29
r-daneelagain, ifdown on a non-setup interface or one that has wrong info wouldn't bother me, ... if I do changes by hand then reboot, ifdown will fail on shutdown and OS doesn't care18:30
smoserthen, correctly reading "existing config" and merging it with config from config drive.18:30
r-daneelmerging is ok, but you're likely to conflict with what is set18:30
smoserbut ignoring things that might fail sometimes is not helpful to anyone.18:30
smoserwell, if you conflict and you're blocking all networking coming up, then you set it right.18:31
smoserie, config-drive would "win".18:31
smoser(possibly allowing for cloud.cfg configuration that modifies that behavior)18:31
smoserdo you know if you could block all networking from coming up on centos ?18:32
r-daneelbut the cloud-init service in the OS comes after the OS's own script for network setup, as it seems18:32
smoseri suspect i can do it on ubuntu, pretty sure you can do it fairly easily on sysvinit.18:32
r-daneelso how will you prevent the OS from setting the network info ?18:32
smoserthere are 2 cloud-init services18:33
smoserlocal and 'init'.18:33
smoserthe local can read only local datasources18:33
r-daneelok, let me check ordering ...18:33
smoserand cannot presume networking, but can set up networking.18:33
smoserthen... there is the other thing that adds a twist here.18:33
smoserin reality, i suspect that config drive is not long for this world.18:34
r-daneelhmm ? configdrive will be removed 618:34
r-daneel?18:34
smoserits static nature is just too limiting. i suspect in time to come, config-drive will disappear as the metadata service is able to provide dynamic data.18:34
smoserare you familiar with how hot-plug of network interfaces works on amazon ?18:35
smoserits really nice.18:35
smoserinterface added, causes udev rules to fire that create and name the interface.18:35
smoserthose same rules say "oh, look, i'm on EC2, and that means the metadata service might tell me what IP address I should get".18:35
smoser(this doesn't work on ubuntu but on Amazon Linux AMI it does)18:36
r-daneelok, I see18:36
smoserhttps://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1153626 18:36
smoserthat is a much more sane way of doing things18:37
smoserand going that route would still mean we potentially have the issue that you were seeing (already existing config)18:37
=== praneshp_ is now known as praneshp
smoserr-daneel, i'm not opposed to getting this working better.18:38
smoserand don't mean to sound stand-offish18:38
smoserits just not as easy as it might first appear.18:38
r-daneelI see your point, we use configdrive because of inherent reduction of complexity and security18:39
smoseri don't really know that anything is more secure.18:40
smoserif your host network is compromised, i think you have significant issues.18:41
smoserand as for complexity, i think that openstack networking generally being difficult to get right is what led to people wanting config drive (and its initial popularity), but at least in my recent experience, that is much better sorted now.18:41
smoserie, you dont have "no route to 169.254.169.254" issues so much.18:42
r-daneelexposing a common service to all tenants seems much more risky than giving access to a file on the host. Someone escaping his VM is a much bigger problem but less likely to happen18:45
alexpilottismoser: hi there18:46
smoserr-daneel, but you've already exposed dozens of common services to your guests ;)18:47
r-daneelsmoser, such as ?18:47
smoserneutron-api, nova-api, dhcp, dns18:47
r-daneelsmoser, these are control planes not the VM infrastructure18:48
smoserdhcp and dns are on vm infrastructure18:48
r-daneelwe have no dhcp and dns is external18:48
smoseri dont know. if you can't securely route traffic from one vm to an endpoint specific to that vm, then you cant actually do networking securely between 2 vms of a single tenant.18:49
smoserso yes, i agree, it is more complex, but its not complexity that you can actually do without i think in the end.18:50
smoseralexpilotti, whats up?18:50
alexpilottismoser: waiting for an answer on which is the minimum / recommended cloud-init version with MaaS metadata support :-)18:51
smoseroh. sorry.18:52
smoserour 12.04 images use 0.7.0-0ubuntu2 18:53
smoserso thats surely known-working18:53
alexpilottismoser: great, so with 0.7.4 we are good 18:53
r-daneelsmoser, when trying to keep things as secure and non-complex as possible, we found it useful to use configdrive. I agree that at some point in time we may need more flexibility and may walk the metadata-server way. For now I find it useful to have the choice between statically set information in configdrive and dynamic setup through metadata-server. 18:55
smoserr-daneel, and you're certainly not alone in that decision :)18:56
alexpilottismoser: tx, doing some tests18:57
r-daneelsmoser, so back to our initial topic ;) as far as I remember, the 'local' and 'init' phases both ran after the OS had finished setting the IPs. Am I wrong ? 19:01
smoserr-daneel, on ubuntu, local happens [possibly] in parallel with any 'auto' interfaces.19:04
smoser'[possibly]' is very complex. 19:04
smoserbut for almost all cases i can think of they're in parallel, and nothing forces them to be serialized.19:05
smoseri do not know about centos.19:05
smoserwith sysvinit it is easy to do these sorts of things :)19:05
r-daneelcentos service ordering is S10network, S50cloud-init-local, S51cloud-init, S52cloud-config, S53cloud-final19:05
r-daneelfor ubuntu, I did not fully check. experience showed (in the logs) that we were always doing ifup on an already  up interface and ifup refused to override19:11
smoserr-daneel, yeah, so on centos, it should be possible to just put cloud-init-local before S1019:11
smoserthe one thing that that breaks, which i dont think is a real issue is network mounted filesystems (ie, /usr/ on nfs)19:12
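The ordering point can be checked mechanically. The sketch below sorts the start links r-daneel listed by their numeric priority and asks whether one service starts before another. The helper names start_order and runs_before are made up for this example, and the S09 priority used at the end is purely hypothetical: the log only says cloud-init-local would need to go "before S10".

```python
import re


def start_order(links):
    """Return service names sorted by sysvinit start priority.

    Each link looks like 'S10network': 'S', a numeric priority, then the name.
    Entries that do not match that pattern are skipped.
    """
    parsed = []
    for link in links:
        match = re.match(r"^S(\d+)(.+)$", link)
        if match:
            parsed.append((int(match.group(1)), match.group(2)))
    return [name for _, name in sorted(parsed)]


def runs_before(links, first, second):
    """True if `first` is started earlier than `second`."""
    order = start_order(links)
    return order.index(first) < order.index(second)


# Ordering quoted in the log for centos.
centos = ["S10network", "S50cloud-init-local", "S51cloud-init",
          "S52cloud-config", "S53cloud-final"]
print(runs_before(centos, "network", "cloud-init-local"))

# Hypothetical reordering: any priority below 10 would do, S09 is an assumption.
moved = ["S09cloud-init-local"] + [l for l in centos if "local" not in l]
print(runs_before(moved, "cloud-init-local", "network"))
```

As quoted, network starts first, which matches r-daneel's observation that the OS has already configured the IP by the time cloud-init-local runs.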
r-daneelif you have the wrong IP, our setup prevents you from communicating (anti-spoofing); if it could, we'd mess up things because of IP collision19:13
r-daneelsmoser, then we still have no better solution for ubuntu19:16
r-daneelsmoser, would it be more acceptable to do the ifdown/ifup cycle only on failure of the initial ifup ?19:17
smosermaybe, yeah.19:18
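A rough sketch of that compromise, cycling the interface only when the first ifup fails, might look like the following. It is an illustration under assumptions, not the eventual per-distro code: the function name bring_up and its injectable run parameter are invented here, and it assumes ifup reports failure through a non-zero exit status, which may not hold on every distro. The ifdown result is reported rather than silently dropped, in line with smoser's objection to blindly ignoring failures.

```python
import subprocess


def bring_up(iface, run=subprocess.call):
    """Try ifup; only if it fails, cycle the interface with ifdown then ifup.

    `run` takes an argv list and returns an exit code. It is a parameter so
    the logic can be exercised without touching real interfaces.
    Returns True if the final ifup reported success.
    """
    if run(["ifup", iface]) == 0:
        return True
    # First attempt failed: take the interface down, but report a failing
    # ifdown instead of ignoring it.
    down_rc = run(["ifdown", iface])
    if down_rc != 0:
        print("ifdown %s exited with %d" % (iface, down_rc))
    return run(["ifup", iface]) == 0
```

This still leaves the race smoser describes, since the OS could bring the interface up between the ifdown and the second ifup.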
smoseri'm guessing one way or another we can force cloud-init local to run before and block networking coming up19:19
smoserbut it will probably be tricky19:19
smoser(since as it is, networking comes up on udev events)19:20
smoser(network-interface-added)19:20
smosersorry.. net-device-added19:21
r-daneelso for those relying on udev, cloud-init should already be hooked-in to 'know' what to do19:22
smoseryeah. is complex though... cloud-init explicitly actually emits the net-device-added when its inside a container19:24
smoser(as lxc instances don't get those events)19:24
r-daneelsmoser, would it help to implement that ifdown/ifup in the platform specific files ? maybe only on ifup failure ? I understand that we're trying to march toward a future-proof solution but this will require a lot more code-diving on my part :)19:32
smoseri dont mind if ifup/down is in 'distro'.19:33
smosererr.. in per-distro code.19:33
r-daneelok, I'll try to come up with distro-specific code. Will get back to you for a review once done :) 19:39
r-daneelsmoser, thank you for your help19:40
=== gondoi is now known as zz_gondoi
=== oobx_ is now known as oobx
=== zz_gondoi is now known as gondoi
=== gondoi is now known as zz_gondoi
