=== fginther` is now known as fginther [03:00] can anyone share a link with steps to install juju-gui charm to openstack liberty? thanks! [04:32] stokachu usually that was a symptom of something being sick in my env, like the security group not opening the port to the mongo instance on the model controller, or maybe my maas bootstrap timeout wasn't long enough [04:33] Gil - hey there, not really sure what you're asking. Do you have an openstack liberty provider available to you that you would like to consume in juju, and additionally deploy the juju gui? [04:33] s/consume/model your applications/ [06:14] marcoceppi: hey marco [09:32] hi everybody, may i use git launchpad repo to publish charms? [09:33] yo lazyPower, just looking over [09:33] https://docs.google.com/presentation/d/1a5l1bKX8dPwx21LkMQmp-zVjzOsgoTE_iQ2urD9znxk/edit?pref=2&pli=1#slide=id.g70d4533c6_2_140 [09:33] and have some questions when you are about [10:00] hi everyone. I am manually provisioning an IPv6 only VM. Deployment of charms there fail with cannot get archive: Get https://api.jujucharms.com/charmstore/v4/trusty/mysql-35/archive: dial tcp 162.213.33.121:443: network is unreachable [10:00] Does juju handle this case or is it all up to the admin? [12:02] hello [12:03] what could be the cause of [12:03] $ juju deploy --repository deployment/charms local:trusty/langpack-o-matic [12:03] ERROR charm not found in "/home/martin/ubuntu/langpack-o-matic/deployment/charms": local:trusty/langpack-o-matic [12:03] the charm definitively exists: [12:03] $ ls /home/martin/ubuntu/langpack-o-matic/deployment/charms/trusty/langpack-o-matic/ [12:03] config.yaml hooks metadata.yaml README.md [12:03] (juju 1.25 in xenial) [12:04] I have another charm in that dir ("bootstrap-node"), and deploying that works fine [12:04] $ ls /home/martin/ubuntu/langpack-o-matic/deployment/charms/trusty/bootstrap-node/ [12:04] hooks metadata.yaml [12:07] ah, nevermind! typo in "name:" in metadata.yaml [12:07] typical "you have to ask, and then you'll figure it out" situation [14:06] yo bloodearnest o/ [14:07] lazyPower, hey [14:07] hows things? [14:08] good, sprint next week, so lots of prep [14:08] and trying to understand this new systemd world... [14:08] gennadiy - (super late reply) - you can warehouse the code in git, however bzr is still the only ingestion method right now. There's a new feature coming that decouples your charms from DVCS which in turn provides instant publishing [14:09] bloodearnest - i hear ya man! [14:09] i upgraded to xenial on my primary workhorse and its been slow going getting the system stood back up. i need another weekend on it [14:09] so, tls layer [14:09] still totally a thing :) [14:09] yeah [14:10] anything specific i can help with? [14:10] my usage is very different though [14:10] i have found bzr-sync module. it will sync my code from git to bzr [14:10] so, some thoughts: [14:10] thanks for your response [14:10] gennadiy - its a nice solution for a short term problem :) [14:10] gennadiy also o/ great to see you in here [14:10] 1) a standard layer for generating certs would be useful [14:11] for us to use easyrsa, it would need to be installed by system packages [14:11] but we could use [14:11] it [14:11] yeah? [14:12] so, we need to put easy-rsa in a PPA or is that still a red-flag? [14:12] 2) there are 2 distinct uses of tls certs here: intra service comms (your layer/interface), and public service comms [14:14] kjackal: most charms have no support for ipv6. 
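A quick aside on the local-deploy "charm not found" error resolved above: with the juju 1.25 local repository layout the charm lives at <repository>/<series>/<name>/, and the name: in metadata.yaml has to match that directory name exactly. A minimal sketch (summary and description text here is illustrative only):

    # deployment/charms/trusty/langpack-o-matic/metadata.yaml
    name: langpack-o-matic    # must match the directory name, or deploy fails with "charm not found"
    summary: Builds Ubuntu language packs
    description: |
      Illustrative text only; the real charm's description goes here.

    # then, from the directory containing deployment/charms:
    juju deploy --repository deployment/charms local:trusty/langpack-o-matic

Note there is no series: field; on 1.25 the series comes from the repository directory, and (as noted further down in this log) a series key in metadata upsets the 1.x-era tooling.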
Its not that they don't support them but they've never been tested there, so they often do things that do not support ipv6. [14:14] bloodearnest - yeah, matt and I have talked about this, the public facing ssl bits. we dont have a path forward with any time alotted to get that done [14:14] heyo [14:14] but we've been kicking around ideas. we started with an idea to wrap lets-encrypt as a layer [14:14] kjackal: e.g. take an ip address from juju and curl http://thatip/ which doesn't work because it needs to be wrapped in [] to be a valid ipv6 url. [14:15] bloodearnest - if you were going to put public facing ssl infrastructure in your modeling language. what would be your preferred method to do so? [14:15] lazyPower, ppa won't work, but we could add a package to our archives, perhaps. I'm also not sure how much control it provides. Can it do SubjectAlternativeName for DNS *and* IP? [14:16] yeah [14:16] it already does add SAN for DNS and IP [14:16] k [14:16] i think we can tune the config to include a config option for additional SAN [14:16] right now we're kind of lazy about what we stuff in the SAN, ip and hostname [14:16] but it supports both entry styles [14:16] lazyPower, so, the 2 uses are different enough to warrent different approach, and different interfaces, I suspext [14:17] oh for sure [14:17] self signed certs vs ca signed sergs [14:17] s/sergs/certs/ [14:17] jrwren: True. So to have juju on an ipv6 setup we need a translation service to ipv4 [14:17] mbruzek o/ morning [14:18] heyo [14:18] mbruzek we're talking about our baby [14:18] > re: layer-tls [14:18] is my baby ugly? [14:18] its our ugly babby [14:18] say it isn't so! [14:18] :P nah [14:18] bloodearnest was just riffing about how we can make it more useful to more ppl [14:18] http://bazaar.launchpad.net/~bloodearnest/charms/trusty/x509-cert/trunk/view/head:/lib/selfsigned.py [14:19] is what we want, in terms of self signed cert [14:19] they need a deb package of easyrsa. aparently fetching it from where we're grabbing it is basically out of sorts [14:19] the DNS/IP thing is an openssl/gotls thing [14:19] bloodearnest we can do these alt_names no prob [14:20] gotls is stricter and wants proper SANs [14:20] cool [14:20] right on [14:21] so, matt and i are spiking on k8s this week [14:21] if you want to file some bugs @ the repo for layer-tls we can start planning and try to get it on our board [14:21] so, I think what we want is a subordinate charm, so we can configure multiple certs for one service (e.g. apache, haproxy) [14:21] lazyPower, ack [14:22] we talked about a subordinate charm that is a good idea, I wonder if both can be built from the same layer so we don't have to maintain two different codebases [14:23] mbruzek, you can build 2 charms from 1 layer? 
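An aside on the SubjectAlternativeName question above: a certificate that satisfies both a hostname and an IP address needs explicit DNS: and IP: SAN entries, which is what easyrsa and the selfsigned.py linked above both arrange. A minimal self-signed sketch with plain openssl, just to show the shape of the extension (the hostname and address are placeholders, not anything layer-tls actually emits):

    cat > san.cnf <<'EOF'
    [req]
    prompt = no
    distinguished_name = req_dn
    x509_extensions = v3_req
    [req_dn]
    CN = etcd0.internal
    [v3_req]
    subjectAltName = DNS:etcd0.internal, IP:10.0.3.101
    EOF
    openssl req -new -x509 -nodes -newkey rsa:2048 -days 365 \
        -keyout server.key -out server.crt -config san.cnf

The gotls strictness mentioned here is, roughly, that Go validates a connection made to an IP address against the certificate's IP SANs rather than falling back to the CN, so certs without proper SAN entries get rejected.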
[14:24] yes [14:24] bloodearnest: I would see a subordinate layer that imports tls, and just has metadata that makes it a subordinate [14:24] Then add the functionality you and lazypower were discussing to the tls-layer [14:24] I suspect the interface types will be different (1 for peer negotiation, 1 for simple path communication) [14:25] right [14:25] path of least resistance [14:26] bloodearnest: The subordinate could have a tls provides relation and or requires, and you would have to extend our tls interface which *only* deals with the peer relation at this point [14:26] right [14:26] Again those could be done in the reusable tls layer and interface [14:26] Your subordinate layer would be extremely small, just making it a subordinate and using the provided functionality [14:27] so they'd be differentiated on relation type (provides, peer) rather than name? [14:27] bloodearnest: yes [14:27] wfm [14:27] the beauty of layers! [14:27] reusable components [14:28] so, the issue is, how best to generate certs, preferably using just system packages [14:28] or python deps [14:28] the python version above works fine in xenial, fwiw, python-cryptography is in main [14:29] but not trusty :( [14:29] well the current tls layer uses easyrsa (as lazypower) pointed out, if that is not sufficient you can suggest alternate methods. [14:30] In Juju 2.0 you can in the metadata.yaml specify what release your charm supports. [14:30] any method is fine, as long as it's a) vendored or b) system packaged [14:30] grabing from git is a no-no [14:31] bloodearnest how about resources? [14:31] that would work also [14:31] but require some manual prepping [14:31] what if easyrsa were exposed as a charm resource, you the deployment engineer stuff @ resources in your model-controller, and when you deploy boom its all offline. [14:31] nicer if it could work OOTB [14:32] the self signed stuff is really only for devel [14:33] plus, we are a ways away from being on 2.0 [14:33] bloodearnest: I used easyrsa from github because it had some bug fixes I needed. if you can get the repo one working submit a pull request for that. [14:33] mbruzek, ok, I will try that [14:35] easy-rsa latest github release 3.0.1, vs the latest easy-rsa in Xenial is 2.2.2 [14:36] I know you have rules against github, but I question if our rules move at the speed of modern software [14:36] mbruzek, not my rules [14:37] bloodearnest: I know [14:38] my point is you are 4 releases behind upstream, and the rules are supposed to be for "security", I would want the latest release if I were doing it for myself. [14:38] It would be great if we could create a snap of the latest release and put that in the charm. If snaps can be trusted like archive [14:39] i have this appliance docker image i use to print some certs sometimes [14:39] mbruzek, can't we just vendor it in some other form [14:39] caveate: you install docker on every host you want to generate certs [14:39] it's just a cli wrapper around openssl cli, [14:40] right? [14:40] pretty much [14:40] but its based on busybox so its stupid small [14:40] thats the only saving grace here [14:45] lazyPower, mbruzek: easy-rsa 3.0.1 is 140k of text files (40k of docs) [14:45] seems reasonable to vendor into the layer? [14:45] kilobytes? [14:45] yep [14:46] bloodearnest: yeah I think we are OK there, I would get worried about gb [14:46] bloodearnest: you were going to change it to the package manager anyway... I am sure that one is less kilobytes right? 
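To make the subordinate-built-from-the-tls-layer idea discussed above concrete, a rough sketch of the two files such a layer might add (the charm and relation names are assumptions, not an existing charm):

    # layer.yaml
    includes: ['layer:tls']

    # metadata.yaml
    name: tls-cert-sub          # hypothetical name
    subordinate: true
    requires:
      host:
        interface: juju-info    # attach to any principal (apache, haproxy, ...)
        scope: container        # container scope is what makes it a subordinate
    provides:
      certs:
        interface: tls          # the existing peer-only tls interface would need
                                # extending to cover a provides/requires pair

Everything else (cert generation, SAN handling) would come from the reused tls layer, as mbruzek describes.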
[14:47] probably about the same [14:48] bloodearnest: I get it, our current layer does it wrong by grabbing from github. [14:48] I think w/o docs and extras, you're talking about 60kb [14:49] kjackal: I don't know if we'd need a translation layer really, we just need to audit charms and start testing them in IPv6 environments to ensure they're coded to be IPv6 aware. That said, your error seems like Juju couldn't connect to the charm store via IPv6 which seems even more fundamental than charms supporting IPv6. [14:49] mbruzek, not objectivelywrong, perhaps, but wrong for us :) [14:49] I wonder if rick_h_ can chime in on whether there are issues between Juju and the charm store in IPv6 [14:49] bloodearnest: so help us fix it so it is _more_ useful [14:49] mbruzek, on it [14:49] cory_fu: yes! :) [14:50] Yes you can chime in, or yes there are issues? ;) [14:50] cory_fu: was just talking with kjackal about this in another channel and asked him to kick off an email because we expect problems with the store and the charms in them to be honest [14:50] Ah, I see [14:50] cory_fu: the charmstore is fronted by apache2 with SSL terminitaion and only works on IPV4, we'll have to work with IS on how to add IPV6 support [14:50] cory_fu: but there's a bigger issue as to what/how charms would work? [14:51] can wordpress, exposed, work with IPV6 ootb behind haproxy? [14:51] hi lazyPower thanks. never had any problems when deploying juju-gui on trusty. When I try deploy on liberty, I get this: "ERROR cannot resolve charm URL "cs:wily/juju-gui": charm not found". in my environments.yaml I have "default-series: wily" in the maas section. I needed wily and liberty to deploy this successfully: https://jujucharms.com/u/openstack-charmers-next/openstack-lxd [14:51] same with all services that can be exposed, how many don't support binding to an IPV6 addr [14:51] True [14:51] Gil: do you need the GUI running on wily? [14:51] Gil: are you colocating it with another Wily services or something? [14:52] rick_h_: and the inverse too, a cloud may be ipv4 on CGN or private IP, but support ipv6 public addresses [14:52] jrwren: yea, there's a whole can of worms here we've not worked through to my knowledge [14:53] mbruzek, so, the bugfixes you needed are not in 3.0.1, correct? [14:54] bloodearnest: I don't recall what version was needed, but the github version fixed the error I was getting [14:54] right [14:54] I will attempt a PR to use a vendored version [14:55] rick_h my main goal is to work with the nova lxd which is why I'm deploying that bundle. yes, i'd like to run the juju-gui on the liberty deploy if it's possible. [14:57] Gil: right, but you only need wily GUI if it's on a wily host. You can deploy the trusty juju-gui onto liberty without a problem [14:57] Gil: the things you deploy don't all have to be on the same series. [14:59] ok gtk. when I deployed the bundle: https://jujucharms.com/u/openstack-charmers-next/openstack-lxd it errored out and complained about the charms not matching the series so I went to "all -wily". So what I need to do then is change in environments.yaml back to "default-series: trusty" I guess which I will try now. [15:01] Gil - that or juju deploy trusty/juju-gui [15:05] tvansteenburgh - got a moment for a quick review? https://github.com/juju-solutions/jujubox/pull/2 [15:05] lazyPower: yeah gimme a min and i'll take a look [15:06] juju deploy --to 0 cs:trusty/juju-gui; gives: Added charm "cs:trusty/juju-gui-48" to the environment. 
+ ERROR cannot assign unit "juju-gui/1" to machine 0: series does not match [15:07] ah thats because the state-server is wily, gotchya. [15:07] i didnt think you were colocating === firl_ is now known as firl [15:08] lazyPower: won't this break 1.25 users? maybe we should put these changes in a 2.0 branch? [15:08] tvansteenburgh - its for :devel flavored [15:08] this doesn't change :latest [15:08] right, but this will eventually become latest i expect [15:08] one 2.0 lands [15:08] and then we'll have nothing for 1.25 [15:09] current :latest: will move to a tag for 1.25 [15:09] roger [15:09] and :dev will supplant :latest, and :dev moves to whatever is in the :devel ppa [15:09] lgtm then [15:10] sweet \o/ progressss [15:10] lazyPower: yeah thanks for doing that [15:10] https://hub.docker.com/r/jujusolutions/jujubox/builds/bke9qasy38rcy4s98ve2c9o/ [15:11] we were solid with no modifications for 8 months [15:11] thats kind of impressive man. it wasn't until beta-1 landed that i had to dig in here and change some things [15:14] hey rick_h_ - when i'm bootstrapping with 2.0 beta-1, i get that env vars make it simple but is there an option for me to pass --config=path/to/aws.yaml to get my cloud keys? [15:14] the cloud credentials file i use for create-model dont seem to work for bootstrap :| [15:15] lazyPower: yea, you have to write out a .local/share/juju/credentials.yaml file with named credentials in it [15:15] lazyPower: will get you an example in a sec [15:15] ta [15:16] "In order to deploy a cs: Trusty charm to an alternate series machine, the charm must be locally branched to a / directory, then juju deployed from that local repo." from link https://github.com/Ubuntu-Solutions-Engineering/openstack-installer/issues/791 [15:17] is that what I would need to do at this point? [15:17] thats an option, or re-bootstrap with a different default-series [15:45] lazyPower changed environments.yaml to "default-series: trusty" then bootstrapped and successfully (as expected) deployed juju-gui. But then when the bundle "juju-deployer -c https://api.jujucharms.com/charmstore/v4/~openstack-charmers-next/bundle/openstack-lxd-50/archive/bundle.yaml -S -d" is deployed get "Added charm "cs:~openstack-charmers-next/wily/ceph-osd-15" to the environment. + ERROR cannot assign unit "ceph-osd/0" to machine 0: [15:46] Gil you'll have to modify the bundle to change the placement of ceph-osd [15:46] some solutions would be 1 extra machine for juju-gui or just use the local repo method [15:46] you'll need to co-locate it with another wily based service [15:46] ah [15:48] jamespage, fyi - updated the syncs and gh repos yesterday and they're syncing ok. https://github.com/openstack-charmers/migration-tools/blob/master/charms.txt +lxd +ceph-osd +percona-cluster [15:49] beisner, ack - have the git review ready to push again [15:49] sweet [15:50] +ceph-mon that is [15:50] osd was already good [15:50] yah [16:21] can any unit do a leader set or can only the leader perform that? [16:29] ChrisHolcombe: "Only the leader can write to the bucket" [16:29] roadmr, darn i was hoping you weren't going to say that haha [16:30] ChrisHolcombe: sorry :) straight from the docs: https://jujucharms.com/docs/1.25/authors-charm-leadership [16:31] roadmr, ah yeah i missed that line. 
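On the leadership question that closes the exchange above: a hook can guard writes with the is-leader tool, while reads work from any unit. A minimal sketch (the setting name is just an example):

    #!/bin/bash
    # e.g. hooks/leader-elected
    if [ "$(is-leader --format=json)" = "true" ]; then
        # only the elected leader may write to the leadership bucket
        leader-set shared-secret="an-example-value"
    fi
    # any unit, leader or not, may read
    leader-get shared-secret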
thanks :) [18:02] beisner, hey - could you take a look at https://code.launchpad.net/~james-page/charms/trusty/neutron-gateway/tox8-test-refactor/+merge/286933 [18:02] needed for migration - also some prep for my neutron explosion branches [18:05] beisner, there was some fairly nasty lack of isolation between unit tests... [18:05] mainly due to massaging of CONFIG_FILES directly - moved to deepcopy + modification now [18:07] jamespage, yah i've been had by copies of dicts in py too. deepcopy ftw. [18:16] jamespage, want to flip `make test` to the tox method on that? [18:16] and lint [18:53] bdx: cargonza tells me you have an active issue that needs looking at? [18:53] thedac: hey, yeah, do you mind? [18:53] no, what's up? [18:54] I have been experiencing an issue where my nova-metadata-api seems to become unavailable after service restart ... [18:54] my instances can talk to 169.254.169.254 initiall after creating tenant networks [18:55] following that, if I restart the api-metadata service or reboot the node 169.254.169.254 becomes unavailable to the instances [18:56] hmm, Could be MTU. Do these hosts have jumbo frames on? [18:56] Espectially after a reboot [18:56] especially [18:57] ok, I'm not using jumbo frames. The issue presents itself w/o instance reboot [18:58] So that is my first suggestion. We definitely see problems with metadata when using default MTU of 1500. If at all possible set it on the neutron-gateway and the nova-compute nodes, restart the nova-api-metadata service and check [18:58] thedac: after some introspection, I'm not seeing the 169.254.169.254 address or interface on my compute nodes ... [18:58] ip netns list show only qrouter-283684d1-6e4d-4704-a72e-6fe6acc8e9a6 [18:59] bdx: they don't it is a special address space [18:59] It uses multicast [19:00] ok, gotcha. Do you know of ways to test for its existance from outside of the instance? [19:01] The best test is on the instnace. But verify nova-api-metadata is listening on 8775 [19:01] let me double check that port. That is off the top of my head [19:03] thedac: also, I'm not seeing the metadata api service show itself here http://paste.ubuntu.com/15182540/ [19:05] Yeah, it does not show up in the service list. So check it on neutron-gateway or if you are doing metadata on the compute nodes check there. 8775 is correct [19:06] sudo service nova-api-metadata status [19:06] nova-api-metadata start/running, process 432790y [19:06] y [19:06] ok [19:06] bdx: And this shows up in console-logs as failed access to metadata correct? [19:07] yes. [19:08] thedac: `ps aux | grep metadata` -> http://paste.ubuntu.com/15182600/ [19:09] ok, so I am still thinking MTU [19:09] it seems there is a wealth of metadata processes running [19:09] ok [19:09] You can test this by running ping with larger and larget packets sizes on the qrouter netns [19:09] entirely, ok [19:09] ping -s 1472 I think [19:10] `sudo ip netns exec q-router<#> ping -s 1472 169.254.169.254` ? [19:11] yes [19:11] and then with 1473 or higher [19:11] oh, sorry [19:11] no the IP of the instance [19:11] not the 169.254 address [19:12] ohh.. ping an instance? [19:12] yes [19:12] bdx: and regardless our best practice advice is to use jumbo frames in all openstack deploys [19:15] thedac: good to know [19:16] thedac: my pings are successful [19:16] with > 1472? [19:16] yea [19:16] ok let's check /var/log/nova/nova-api-metadata.log for any tracebacks [19:17] I'll check again ... 
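One detail worth adding to the MTU test suggested above: without the don't-fragment flag the kernel will silently fragment oversized pings, so the test can pass even when path MTU is the real problem. A sketch of the check from the gateway's qrouter namespace (the namespace ID and instance address are placeholders):

    # 1472 data bytes + 8 ICMP + 20 IP header bytes = one full 1500-byte frame
    sudo ip netns exec qrouter-<uuid> ping -c 3 -M do -s 1472 <instance-ip>
    # then retry with -s 1473 and larger; errors such as "Frag needed and DF set"
    # or "Message too long" point at an MTU mismatch along the path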
my logs have been clean though [19:17] ok [19:17] so [19:18] ERROR oslo.messaging._drivers.impl_rabbit [req-85055fc4-7de0-46c9-8cc2-119c0eda3430 - - - - -] AMQP server on is unreachabl [19:19] rabbit was actually my initial suspect because I am seeing stale notifications in the rabbit queue [19:19] It is always rabbit ;) [19:19] but hadn't seen any errors yet .. [19:20] lovely, ok so sounds like a rabbitmq problem. From a networking perspective can you nc -vz $RABBIT_IP 5672 from the nova-api-metadata host? [19:20] Then we can check rabbitmq-server logs [19:20] nc -vz 10.16.100.59 5672 [19:20] Connection to 10.16.100.59 5672 port [tcp/amqp] succeeded! [19:21] ok, let's hope on the rabbit instance and check logs [19:21] s/hope/hop but also hope [19:23] ok, just launched an instance, that failed communicating with 169.254.169.254, rabbit logs show -> accepting AMQP connection <0.3547.1> (10.16.100.133:39614 -> 10.16.100.59:5672) [19:24] Is rabbit clustered or singleton? [19:24] singleton [19:24] ok [19:25] You might keep the tail on the rabbit log and restart the nova-api-metadata service and neutron-metadata service and see what we get [19:25] ok [19:25] omp [19:25] It could have been temporary failure [19:27] yea, I got a bunch of warning reports for about a second [19:28] rabbit seems to be talkin to both services [19:28] What were the warning messages [19:28] ? [19:28] =WARNING REPORT==== 23-Feb-2016::19:27:38 === [19:28] closing AMQP connection <0.19995.0> (10.16.100.157:42079 -> 10.16.100.59:5672): [19:28] connection_closed_abruptly [19:29] That could have been the stop of the metadata service depending on timing [19:29] so you might test another instance deploy [19:29] And watch the nova-api-metadata log as well as the rabbit log [19:29] it was ... rabbit logs got spammed with that at the time I restart [19:29] on it [19:33] thedac: yea, no errors in any logs [19:33] ok, fingers crossed for the instance [19:34] neutron-api, neutron-gateway, nova-cloud-controller, nova-compute [19:34] all show no errors [19:34] instance gets stuck reaching out for metadata while booting [19:35] ok, so I am going to keep pushing the MTU issue. metadata is suseptible to it. [19:35] I can use the config drive as a workaround to get my user-data on to my instances for the time being ... I just feel this is really fragile though [19:36] thedac: so .... If I create two new tenant networks, the instances get metadata just fine [19:36] that are deployed to the new nets [19:36] oh? [19:36] yea [19:36] or [19:36] If I neutron net-delete [19:36] and recreate the tenant networks that are affected, metadata works again [19:37] hmm, ok, that is interesting [19:37] right [19:37] do they subsequently stop working or work indefinitely? [19:38] after a re-create? [19:38] thedac: metadata works until I restart the respective services, then stops working until I destroy and recreate again [19:39] thedac: I'm suspicious this might be a permissions thing ... [19:39] ok, and is this liberty? [19:39] yea [19:39] is the 169.254.169.254 a unix socket? [19:39] I'll see if I can recreate this and get back to you [19:40] thanks === blr_ is now known as blr [21:25] kjackal, lazyPower: https://github.com/juju-solutions/layer-basic/pull/37 [21:26] cory_fu - how can i not include some of the project meta files from base like Makefile and such? [21:26] i can override that in layer.yaml right? 
[21:27] * lazyPower makes a note to look at the builder readme [21:28] lazyPower: I don't think there's a way to say "remove this file" (bcsaller might be able to correct me), other than just overriding it with an empty file (which isn't the same as deleting it) [21:28] ah, ok. [21:28] lazyPower: Maybe we need an "excludes" section in layer.yaml? [21:28] I think thats a swell idea [21:29] What is your use-case for that, though? [21:29] just so i can say " excludes: readme, makefile, hacking.md - things like that so when i assemble my charm, if i dont have a hacking.md file, i dont have one floating around for one of the runtime layers [21:30] Why would you not want those files, though? [21:31] You definitely need a README [21:31] And I can't see why you wouldn't also want a HACKING.md and Makefile [21:31] i sorta agree with that, i built a charm the other day and committed a load of shit for no reason other than i didn't notice it was in the output [21:32] admittedly I used bzr ignore, but still, it seems like a valid usecase, or maybe bzr ignore is the way to go! :) [21:33] Actually, it looks like layer.yaml might already support an "ignore" list [21:34] Yeah, you can give an ignore list that will do what you want [21:34] lazyPower: ^ [21:34] cory_fu - ballin. I guess you found that in charm-tools docs? [21:35] And it looks like it will work per-layer, so each layer can ignore things from the layer below (if you stack more than once) [21:35] lazyPower: If by "docs" you mean "source" [21:35] ah, right [21:35] lol [21:35] one and the same, the tome of charm keeper knowledge [21:35] Yeah, that should be documented, for sure [21:36] ya know cory_fu - i just realized we didnt put in a reference guide to any of the layer stuff [21:37] Yeah, that would be a good place for this [21:37] There's a lot of functionality built in to layer.yaml alone [21:37] can we card this and bring it up later this week? [21:37] * lazyPower has k8s stuff thats stale and needs cooked [21:38] Where would that card go? [21:38] i'll take it and put your face on it [21:38] ha [21:38] that..sounded way creepier than i intended [21:38] Indeed [21:38] anywho, incoming notice [22:51] so, word to the wise. series in metadata will make the -stable tooling (bundletester, proof) quite angry as i just found out. [23:07] it seems when I deploy my same stack in kilo, then in liberty, my nova-api-metadata changes its location from the dhcp port (kilo), to the network:router_interface_distributed port (liberty). Was this intended? Do you know about it? [23:07] thedac, openstack-charmers:^ [23:07] bdx: hey [23:07] thedac: whats up [23:07] I just had a liberty deploy up and was testing. I saw no change. [23:07] bdx: are you doing DVR for this? [23:07] thedac: yeah [23:07] thedac: and local dhcp/metadata [23:07] Ok, that is what I need to test next. I could not re-create the failure. 
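For reference, the ignore list cory_fu dug out of the charm-tools source sits in the charm's own layer.yaml; roughly like this (the file names are just the metadata files being discussed):

    # layer.yaml of the top-level charm layer
    includes:
      - 'layer:basic'
    ignore:
      - 'Makefile'
      - 'HACKING.md'

As noted above it applies per layer, so each layer can filter out files coming from the layers below it when charm build assembles the charm.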
So I will stand up a DVR deploy and keep trying [23:07] right [23:07] thedac: so yea, whats going on here is this -> I deploy my same stack in kilo, then in liberty, nova-api-metadata changes its location from the dhcp port (kilo), to the network:router_interface_distributed port (liberty) === philipballew is now known as Guest44851 [23:07] ok [23:08] So in my spinup bash script for creating tenant networks, I did not know/had not updated the network params to reflect the change of nova-api-metadata [23:08] i.e for kilo -> neutron subnet-update vlan110-subnet \ [23:08] --host_routes type=dict list=true \ [23:08] destination=10.0.20.0/24,nexthop=10.16.110.1 \ [23:08] destination=10.15.0.0/16,nexthop=10.16.110.1 \ [23:08] destination=10.10.0.0/16,nexthop=10.16.110.1 \ [23:08] destination=10.16.100.0/24,nexthop=10.16.110.99 \ [23:08] destination=10.16.111.0/24,nexthop=10.16.110.99 \ [23:08] destination=10.16.112.0/24,nexthop=10.16.110.99 \ [23:08] destination=169.254.169.254/32,nexthop=10.16.110.101 [23:09] but for liberty --> neutron subnet-update vlan110-subnet \ [23:09] --host_routes type=dict list=true \ [23:09] destination=10.0.20.0/24,nexthop=10.16.110.1 \ [23:09] destination=10.15.0.0/16,nexthop=10.16.110.1 \ [23:09] destination=10.10.0.0/16,nexthop=10.16.110.1 \ [23:09] destination=10.16.100.0/24,nexthop=10.16.110.99 \ [23:09] destination=10.16.111.0/24,nexthop=10.16.110.99 \ [23:09] destination=10.16.112.0/24,nexthop=10.16.110.99 \ [23:09] destination=169.254.169.254/32,nexthop=10.16.110.99 [23:10] bdx: ok, so is that working for you when you changed the route nexthop? [23:10] thedac: yea [23:11] ok, great. I'll figure out why things changed. [23:12] thedac: an interesting difference --> in kilo, when you update your subnet, you must include the destination,nexthop for the 169.254 address because the host route for metadata is not automatically appended to the list of new host routes [23:13] I was going to ask. We do not add the 169.254 route in our testing. [23:13] so in kilo, I can update my subnet, and lose my 169.254 static route if I do not add it to the update [23:14] in liberty, the nexthop/destination for metadata is appended to static routes automatically [23:15] unless you override it by specifying a user defined route for 169.254 as I was [23:15] wow [23:15] got it so, may be a bug in kilo [23:16] totally [23:17] thanks for your help on this [23:18] no problem
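For completeness, the config-drive workaround mentioned during the metadata debugging above maps to a flag at instance boot time; cloud-init then reads its user-data from a small attached drive instead of the 169.254.169.254 service. A sketch (image, flavor, network and file names are placeholders):

    nova boot --image trusty-cloudimg --flavor m1.small \
        --nic net-id=<vlan110-net-uuid> \
        --config-drive true \
        --user-data cloud-init.yaml test-instance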