smoser | mgagne, hm.. | 11:50 |
---|---|---|
smoser | your 'is_bonding_slave' idea is fine, but i really don't understand why it should be needed. | 11:50 |
smoser | you mentioned upstart, are you using upstart somewhere ? | 11:51 |
smoser | you dont want to mess with dsmode really. | 11:56 |
smoser | in current trunk, dsmode=local would allow you to make init_modules run earlier (without access to network) | 11:57 |
=== rangerpbzzzz is now known as rangerpb | ||
smoser | mgagne, around ? | 15:44 |
mgagne | smoser: I am now | 15:56 |
smoser | hey. | 15:58 |
smoser | so, are you using upstart ? | 15:58 |
mgagne | smoser: I'm not using upstart, I was reading the source code and commenting about it. cloud-init is running twice as per the bug description. Running it a second time fails because cloud-init doesn't expect bonding to be configured at this point; in fact, all code and tests were done without bonding support, so a lot of assumptions were made which aren't true anymore. | 15:58 |
mgagne | I'm booting on ubuntu 16.04, it's systemd afaik | 15:59 |
smoser | cloud-init does run twice for sure, but only the first time should set the networking up. | 15:59 |
smoser | oh. but we rename on every boot, so maybe we're doing that twice. | 15:59 |
smoser | hm.. | 15:59 |
mgagne | ok, well that's not the case on my side, bug 3.1) and 3.2) were caused by this double network config run | 15:59 |
smoser | i'll have a look in a bit. your is_bonding_slave change seems to make sense. | 16:00 |
mgagne | no no, I boot ONCE and it fails, I'm not even testing reboot at this point | 16:00 |
smoser | right. | 16:00 |
mgagne | because of this mac/link/device mapping, 2nd run fails because of how bonding behaves, it changes the mac of the bonding slaves, hence the added logic for is_bonding_slave. | 16:01 |
mgagne | I didn't do extensive tests, just boot, ping, ssh (with sshkey) and check hostname | 16:02 |
smoser | right | 16:02 |
harlowja | smoser u ever get a chance to look over https://code.launchpad.net/~harlowja/cloud-init/+git/cloud-init/+merge/302609 | 18:02 |
harlowja | its the future! | 18:02 |
harlowja | ha | 18:02 |
smoser | harlowja, i've not looked at it yet. | 18:02 |
harlowja | np | 18:03 |
smoser | so .. | 18:04 |
smoser | Instead of looking in a very specific location for | 18:04 |
smoser | cloudinit config modules; which for those adding their | 18:04 |
smoser | own modules makes it hard to do without patching that | 18:04 |
smoser | location instead use entrypoints and register all | 18:04 |
smoser | current cloudinit config modules by default with that | 18:04 |
smoser | new entrypoint (and use that same entrypoint namespace | 18:04 |
smoser | for later finding needed modules). | 18:04 |
smoser | -- | 18:04 |
smoser | how does registering the entry points help "those adding their own modules" | 18:05 |
smoser | rharper, what shall i do for mgagne's auto-bringup of bond. | 18:06 |
smoser | did you have work on that that i didn't see ? | 18:06 |
rharper | smoser: the fix is what i had | 18:06 |
rharper | but in general, we need to think about v4 vs v6 | 18:07 |
smoser | what fix ? | 18:07 |
rharper | in eni.py | 18:07 |
smoser | i didn't see, sorry. | 18:07 |
rharper | he posted patches, basically adds the if 'bond-master' or 'bond-slaves' in iface, then emit auto | 18:07 |
harlowja | smoser so they still need to add a entry to cloud.cfg (either at packaging time, or at userdata/runtime) | 18:07 |
rharper | smoser: <mgagne> rharper: all patches: http://paste.ubuntu.com/23059836/ | 18:07 |
harlowja | i didn't go into the path of discovering and creating a cloud[init,config,final] sections of that config | 18:07 |
harlowja | because though i could, its umm, non-trivial :-P | 18:08 |
harlowja | and likely requires more metadata on modules to define their ordering (not via cloud.cfg at that point) | 18:08 |
rharper | we probably should instead check if iface['type'] in ['bond', 'vlan'] and possibly 'bridge' ; | 18:08 |
smoser | rharper, so you're just assuming all bonds (or vlans or bridges) then are 'auto' | 18:09 |
harlowja | so that kind of stuff seems like a larger change, vs just attempting to find modules that are already defined in cloud.cfg via entrypoints (leaving the change to be just a different way to find modules) | 18:09 |
rharper | smoser: we default to auto if an interface has a subnet | 18:09 |
rharper | in this case, it's a bond with no subnets | 18:09 |
rharper | as it's being assembled but not with a subnet; | 18:10 |
smoser | ie, those default to 'auto' while others (even with 'subnets') default to non-auto | 18:10 |
smoser | we do default to auto if a subnet ? | 18:10 |
rharper | no we always default to auto unless 'control' is set in subnet | 18:10 |
rharper | yes | 18:10 |
smoser | hm.. you're saying that is true after your change or before | 18:11 |
rharper | there are a few known cases where config explicitly wants subnet + control=manual (aka iscsiroot) | 18:11 |
rharper | if iface has subnet, control=auto for the iface/index pair | 18:11 |
rharper | if you do not include any subnet, then no auto (except for bond-slaves) | 18:11 |
rharper | that really should be any interface with a nested config (master/slave); I'm pretty sure | 18:12 |
smoser | if iface has subnet and no control= | 18:12 |
rharper | then control is set to auto | 18:12 |
rharper | for iscsiroot, we specify control: manual | 18:12 |
rharper | override the default; | 18:12 |
smoser | right, so you're not actually checking for bond-master. | 18:15 |
smoser | you're just turning auto on | 18:15 |
rharper | no, we check for bond-master in the case if iface with no subnets | 18:16 |
rharper | and then auto it, *if* it's a slave (slaves point to their master with bond-master key) | 18:16 |
rharper | but, if the bond master itself (bond0) doesn't configure a subnet, it doesn't get an auto | 18:17 |
rharper | I suspect the code in ifupdown/if-pre-up.d/ifenslave could be fixed to raise the bond master independent of whether it's marked auto or not; but it currently does *not* bring up the master unless listed in allow-auto (or marked auto) | 18:18 |
rharper | if bond0 doesn't come up then the rest of the config won't succeed (we timeout waiting on bond0 to be created via slave ifup hook) | 18:18 |
rharper | a bond-specific solution/workaround is to also include the bond master (indicated by key bond-slaves in iface) to be marked auto; | 18:19 |
rharper | that might be enough, but I'd like to test/check bridges without subnets and vlans without subnets to see if we generally need to mark non-subnet interfaces with auto by default; that is, I don't yet know of a config where we want a manual bond/vlan/bridge | 18:20 |
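The defaulting rules rharper lays out above could be sketched roughly as follows. This is a hypothetical helper, not the actual eni.py code from the patch; the `bond-master`, `bond-slaves`, and `control` keys are taken from the discussion, and the iscsiroot `control: manual` case is the override rharper mentions.

```python
# Hypothetical sketch of the 'auto' defaulting being discussed, not the
# real cloud-init eni.py renderer.
def wants_auto(iface):
    """Decide whether an interface dict should be emitted as 'auto'."""
    subnets = iface.get('subnets') or []
    if subnets:
        # with subnets, default to auto unless 'control' overrides it
        # (e.g. control: manual for iscsiroot)
        return all(s.get('control', 'auto') == 'auto' for s in subnets)
    # no subnets: slaves name their master via 'bond-master', and the
    # master itself carries 'bond-slaves'; both need auto so ifenslave
    # actually assembles the bond
    return 'bond-master' in iface or 'bond-slaves' in iface

print(wants_auto({'name': 'eth0', 'bond-master': 'bond0'}))
print(wants_auto({'name': 'eth2'}))
```

A broader check on `iface['type'] in ['bond', 'vlan', 'bridge']`, as rharper suggests, would replace the key test in the no-subnets branch.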
smoser | ok. for now i'm good with the fix as you all had. | 18:56 |
smoser | it is kind of weird and possibly wrong that we are renaming devices in the 'init' stage (in addition to init-local). | 18:58 |
smoser | harlowja, what i dont understand is how you are making the problem of adding a config thing any easier. | 18:59 |
smoser | the cc_foo.py can now be placed in some additional directory ? | 18:59 |
harlowja | smoser i can put the config modules in my own library, expose a named entrypoint, then just update cloud.cfg to reference that module | 18:59 |
harlowja | so cc_blahblah no longer needs to be patched into cloud-init | 19:00 |
smoser | how do you expose a named entry point ? | 19:00 |
harlowja | same way as the modification to cloud-init setup.py | 19:00 |
harlowja | so library would just need to add a entrypoints entry (like in that setup.py) in their own module | 19:01 |
harlowja | so in said libraries setup.py there would be an entry like | 19:03 |
harlowja | entry_points={ | 19:03 |
harlowja | 'cloud.config': [ | 19:03 |
harlowja | 'my_thing = my_thing.my_cloud_handler', | 19:03 |
harlowja | ], | 19:03 |
harlowja | }, | 19:03 |
harlowja | so when cloudinit looks for a way to call 'my_thing' (assuming its in a cloud.cfg listing somewhere) then it can go out and try to find it (and load this library to get at it) | 19:03 |
harlowja | (or if nobody registered that module, then die as usual) | 19:04 |
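The lookup harlowja describes might look roughly like this sketch. The function name is hypothetical, and the `cloud.config` group name is taken from his setup.py example above.

```python
# Hypothetical sketch of resolving a cloud.cfg module name via entry
# points; not actual cloud-init code.
import pkg_resources

def find_module(name, group='cloud.config'):
    """Return the object registered under `name`, or None if nothing
    registered it (caller can then fall back to the legacy cc_<name>
    lookup, or die as usual)."""
    for ep in pkg_resources.iter_entry_points(group, name):
        return ep.load()
    return None

# with no package registering 'my_thing' in that group, lookup fails
print(find_module('my_thing'))
```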
smoser | harlowja, so.. | 19:16 |
smoser | | 19:16 |
smoser | http://paste.ubuntu.com/23062532/ | 19:16 |
smoser | that is what i dont like about entry points | 19:16 |
smoser | takes ~0.01s to bring up python, 0.03s to bring up python3 on a reasonably current SSD | 19:16 |
smoser | (with '0' as first arg) | 19:17 |
smoser | importing the pkg_resources takes 0.3 seconds on python, and 0.25-ish on python3 | 19:17 |
smoser | it does look like it caches stuff as 10 runs take about the same as 1 | 19:17 |
smoser | i'm guessing that python3 is faster in my test only because i have fewer entry points or packages installed on the system in python3 compared to python2 | 19:18 |
smoser | so its doing less work. | 19:18 |
smoser | this is also embarrassing: | 19:20 |
smoser | http://paste.ubuntu.com/23062545/ | 19:20 |
smoser | and it needs fixing | 19:20 |
smoser | but i'm somewhat hesitant to add something like that. | 19:20 |
harlowja | so thats just because u imported 'pkg_resources' ? | 19:31 |
smoser | the pkg resources import takes quite some time (~.1 seconds) | 19:32 |
smoser | the enumerating of some non-existent namespace takes .2 seconds | 19:32 |
smoser | obviously very scientific data there. | 19:32 |
harlowja | :-P | 19:33 |
smoser | i should have done a -1 | 19:33 |
smoser | lets re-do that paste | 19:33 |
smoser | http://paste.ubuntu.com/23062572/ | 19:36 |
smoser | there. -1 is just cost of bringing up python | 19:36 |
smoser | fiddle | 19:37 |
smoser | http://paste.ubuntu.com/23062577/ | 19:37 |
smoser | there ^ | 19:37 |
smoser | -1 is cost of python | 19:37 |
smoser | 0 is cost of import pkg_resources | 19:38 |
smoser | 1 is cost of one call to 'iter_entry_points' | 19:38 |
smoser | 10 is cost of 10 calls | 19:38 |
harlowja | k | 19:38 |
smoser | with revised my.py at http://paste.ubuntu.com/23062581/ | 19:38 |
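The my.py pastes have since expired, so here is a rough reconstruction of that kind of measurement, on the assumption that it simply timed the `pkg_resources` import and repeated `iter_entry_points` calls (the interpreter-startup case, -1, can only be measured from outside python):

```python
# Hypothetical reconstruction of the timing harness discussed; the real
# my.py paste is no longer available.
import time

t0 = time.perf_counter()
import pkg_resources  # the expensive import being measured
import_cost = time.perf_counter() - t0

t0 = time.perf_counter()
for _ in range(10):
    # pkg_resources caches its working set, so calls after the first
    # are nearly free -- consistent with "10 runs take about the same
    # as 1" above
    list(pkg_resources.iter_entry_points('nonexistent.namespace'))
scan_cost = time.perf_counter() - t0

print('import: %.3fs, 10 scans: %.3fs' % (import_cost, scan_cost))
```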
harlowja | seems like they need to better optimize that entrypoint 'catalog' lol | 19:38 |
smoser | yeah, it is stat crazy | 19:38 |
smoser | those openstack cli programs do that. | 19:39 |
harlowja | right | 19:39 |
smoser | they do cache well | 19:39 |
smoser | since 10 runs takes basically nothing more than 1 | 19:39 |
harlowja | but assuming a entrypoint catalog existed, in the core python, then i'd assume that stuff wouldn't take forever | 19:39 |
harlowja | aka a tiny sqlite db | 19:39 |
harlowja | lol | 19:39 |
harlowja | wonder why such a thing doesn't exist | 19:40 |
smoser | yeah, but i think the entry points are stuffed in that egg-info right ? | 19:40 |
smoser | thats how those are loaded ? | 19:40 |
smoser | so python goes looking in any possible directory in sys.path for a file egg.info or something and then goes reading it and such. | 19:40 |
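smoser's description of the scan can be illustrated with a rough sketch. This only approximates what pkg_resources actually does (real discovery also parses each metadata directory's entry_points.txt, among other files), but it shows why the cost scales with sys.path:

```python
# Rough illustration of the filesystem walk being described: package
# metadata (.egg-info / .dist-info) is scattered across sys.path, so
# finding entry points means listing every path entry on disk.
import os
import sys

metadata_dirs = []
for entry in sys.path:
    if os.path.isdir(entry):
        for name in os.listdir(entry):
            if name.endswith(('.egg-info', '.dist-info')):
                metadata_dirs.append(os.path.join(entry, name))

print('metadata dirs scanned: %d' % len(metadata_dirs))
```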
harlowja | thats one location of it, but u'd think that pip could update a sqlite db or something | 19:40 |
harlowja | i wonder if the python community is working on anything like that | 19:41 |
harlowja | seems pretty obvious to do that | 19:41 |
harlowja | then X people wouldn't be making their own entrypoint-thing due to this | 19:42 |
smoser | not too long ago i had a spinning disk | 19:43 |
smoser | (more embarrassment) | 19:43 |
harlowja | whats that crap | 19:43 |
harlowja | ha | 19:43 |
smoser | and running 'nova' on it took like 3 seconds to load. | 19:44 |
smoser | nova as in the cli tool, not the service :) | 19:44 |
harlowja | :-P | 19:44 |
harlowja | so ya, the other option is that we make our own loader slightly more advanced | 19:48 |
harlowja | so that say in cloud.cfg u could have fully specified modules + functions | 19:48 |
harlowja | then i could have a entry like | 19:48 |
harlowja | godaddy_ci.handlers:basic_handler | 19:48 |
harlowja | though that starts to just make our own entrypoint like thing :-/ | 19:49 |
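The "fully specified modules + functions" alternative harlowja floats could be a small importlib resolver, roughly like this sketch. `godaddy_ci.handlers:basic_handler` is his example; `os.path:join` below is just a stand-in target that actually exists.

```python
# Hypothetical sketch of loading a 'pkg.module:attr' spec named in
# cloud.cfg directly, avoiding pkg_resources entirely.
import importlib

def load_target(spec):
    """Resolve 'pkg.module:attr' to the attribute object."""
    module_name, _, attr = spec.partition(':')
    module = importlib.import_module(module_name)
    return getattr(module, attr)

joiner = load_target('os.path:join')
print(joiner('a', 'b'))  # 'a/b' on POSIX
```

As he notes, this is effectively a homegrown subset of what entry points already provide, just without the filesystem scan.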
smoser | ok. one more thing.. http://paste.ubuntu.com/23062607/ | 19:54 |
smoser | for my reference mostly. | 19:54 |
harlowja | lol | 19:54 |
smoser | that just runs it with strace too, and counts stats or opens | 19:54 |
harlowja | nice | 19:54 |
harlowja | stats or opens: 2561 | 19:54 |
harlowja | lol | 19:54 |
harlowja | ya, idk why they aren't backing that crap via sqlite | 19:54 |
harlowja | afaik entrypoints are all 'static' | 19:54 |
harlowja | in that they are all defined by packaging (in setup.py or other) | 19:55 |
harlowja | seems dumb to rescan the filesystem to find them | 19:55 |
harlowja | it'd seem like a win for most of python if it wasn't so scan happy | 19:56 |
harlowja | though of course any change to do that would probably hit the people that will say its all in cache and such and blah blah | 19:56 |
harlowja | and gets into the question, of make our own thing, or just work with the python stuffs | 19:59 |
smoser | so i had a start of my own thing | 20:06 |
smoser | that took a list of directories | 20:06 |
smoser | and would look in those. | 20:06 |
smoser | cloud-init needs lots of performance improvements for sure | 20:06 |
harlowja | why not at that point just explicitly name full modules in cloud.cfg ? | 20:08 |
harlowja | i'd rather not make our own full entrypoint thing :( | 20:08 |
harlowja | or just at least, try to talk to python-devs, asking what's a solution (is there any, is sqlite db possible, or a static file that everyone updates or ...) | 20:09 |
smoser | ok... i'm just saying this out loud for my own logs and such. | 20:22 |
smoser | http://paste.ubuntu.com/23062681/ | 20:22 |
smoser | that is a bzr revno to git hash mapping that seems correct for right now. | 20:22 |
harlowja | woah | 20:30 |
harlowja | ha | 20:30 |
mgagne | so a coworker tested the "fixed" cloud-init and had some form of race condition, the gateway and routes weren't properly configured. rebooting fixed the issue. | 20:54 |
=== rangerpb is now known as rangerpbzzzz | ||
mgagne | will do more tests tomorrow | 21:17 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!