[00:18] ec0, can't you use an overlay?
[01:12] would overlays still require stub applications to be present for relations to applications which exist in the model, which do not exist in the overlay?
[01:41] no
[01:42] search for 'rabbit' here: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-manila-ganesha.html
[04:21] kelvinliu: last core network piece https://github.com/juju/juju/pull/12011
[04:24] looking
[04:37] wallyworld: lgtm ty
[04:38] tyvm
[06:19] wallyworld: tlm hpidcock any of u free to take a look this pr for adding service updating stuff in broker level plz https://github.com/juju/juju/pull/12013 thank you
[06:19] kelvinliu: on it
[06:19] ty
[06:20] this one is the 1/3 split from the big PR for svc stuff
[06:29] kelvinliu: looks good, just one comment about container ports
[06:40] hpidcock: tyrv
[06:41] currently we define the svc ports in container in podspec, which means we always expose the pod in cluster if there is svc
[06:44] kelvinliu: "List of ports to expose from the container. Exposing a port here gives the system additional information about the network connections a container uses, but is primarily informational. Not specifying a port here DOES NOT prevent that port from being exposed. Any port which is listening on the default "0.0.0.0" address inside a container will be accessible from the network."
[06:47] only time I think it's used is when the podspec has hostNetworking set to true "Host networking requested for this pod. Use the host's network namespace. If this option is set, the ports that will be used must be specified."
[06:47] hostNetwork*
[06:50] right.
[06:53] and it's bad to update podspec when we have a port mapping change as well
[07:09] hpidcock: wallyworld tlm could anyone free to take a look this pr for adding facades for the new caasfirewaller worker plz https://github.com/juju/juju/pull/12014 thanks
[07:10] sure
[08:00] stickupkid: got two small PRs for you: https://github.com/juju/juju/pull/12005 and https://github.com/juju/bundlechanges/pull/68
[08:50] ta
[08:54] flxfoo, you around?
[09:53] stickupkid: we seem to be using dashes for other changes (e.g. RelationsDiff) so I will make them all use dashes
[09:56] stickupkid: pushed a commit to rename num_units; can you take another look?
[11:25] stickupkid: ok, reverted last commit and changed the dash to _ (retained it for the ExposedEndpointDiff struct fields though since we use the same pattern for other structs)
[11:25] yeah tick
[11:26] manadart, I did a bit of digging on https://bugs.launchpad.net/juju/+bug/1895954
[11:26] Bug #1895954: SEGV when jujud restarts after upgrade-controller 2.8.1 -> 2.8.2
[11:26] a quick fix would just be to trap for charmDoc.Meta is nil
[11:26] I think
[11:27] jam: Ack.
[11:27] jam, that doc is a requirement
[11:27] jam, we should pretty much die there
[11:27] stickupkid, we have a clear database that failed with a panic nil dereference during upgrade
[11:28] stickupkid, failing during upgrade is *not* the place to fail, because then you can't get out of the failure
[11:30] sure sure, ok fair
[12:01] manadart, a thought. instead of trying to get him to 2.8.1 just get his charm.meta data fixd
[12:01] fixed
[12:01] and then get a patch into 2.8.3
[12:27] jam: Yes, pursuing this line presently.
[12:29] hi!
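(Editor's note: the panic discussed above is a nil dereference on charm documents whose `meta` field is empty. As a minimal illustration only, the mongo-shell sketch below shows how such documents could be inspected on a controller; it assumes access to the controller's `juju` database, and the field names `meta` and `model-uuid` are taken from the commands quoted later in this log.)

```
// Minimal diagnostic sketch (assumption: run from a controller machine,
// connected to the "juju" database with suitable credentials).
// List charm documents whose metadata is missing; these are the docs that
// trip the nil charmDoc.Meta dereference when the agent restarts.
db.charms.find({meta: null}, {_id: 1, "model-uuid": 1}).pretty()

// For comparison, count the charm documents that do carry valid metadata.
db.charms.find({meta: {$ne: null}}).count()
```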
looking for help with [Bug 1895954] Re: SEGV when jujud restarts after upgrade-controller 2.8.1 -> 2.8.2
[12:29] Bug #1895954: SEGV when jujud restarts after upgrade-controller 2.8.1 -> 2.8.2
[12:30] that is me
[12:30] routergod: Hey. Need any help with running the query I posted in Discourse?
[12:31] ah no I missed that...
[12:38] oh that produces a lot of docs (one per application?)
[12:38] all have "meta" : null,
[12:40] Can give me a couple of examples of charms that you have deployed - just the names will do.
[12:42] https://pastebin.ubuntu.com/p/j6JJ39Q4WN/
[12:45] there are three models in this controller, this all seems to relate to one of them
[12:46] it is one i didn't touch for a while - an early focal-ussuri openstack-base
[12:54] routergod: Indeed. Is that model (0ea37312-4df6-49a6-8137-f6b872def5dd) functional?
[13:00] In any case, I think running this will allow the upgrade to proceed: `db.charms.update({meta: null}, { $set: {"meta": {}} }, false, true)`
[13:03] the model machines are running and they look normal. i don't care about that model though.
[13:03] let me try that, it will only impact that one model either way
[13:09] routergod, can you confirm if you have a significant number of entries that *aren't* null? I'm wondering if "the things that are working" are the ones with valid data, and the "things that are nil" are just cruft that we failed to prune correctly.
[13:10] how do i say !null to mongo?
[13:12] @mandart sorry i am a mongo idiot. are the backticks part of the command?
[13:12] routergod, {$exists: true}
[13:12] routergod: No, just for code formatting :)
[13:12] ah sorry, {$ne: null}
[13:12] that explains something... :-)
[13:14] ok it is doing something other than crashing now
[13:14] routergod, do you feel comfortable giving just all the contents of that collection? We have the feeling that these are charms that you started deploying, but it failed to get from the store, and then you would have a later charm that was properly populated
[13:14] and thus these could just be removed from the DB
[13:15] yes I can share just tell me the command please
[13:15] db.charms.find().pretty()
[13:15] routergod, ^
[13:16] If we want filtered, then it would be: db.charms.find({meta: {$ne: null}, "model-uuid" : "0ea37312-4df6-49a6-8137-f6b872def5dd"}).pretty()
[13:19] manadart: Just tried your suggestion and it seems that the controllers come up again
[13:20] soumplis_: Great news. We have a fix, but are still strategising around re-release/pull.
[13:21] do you want it filtered? it is quite large
[13:21] nah i'll post the lot :-)
[13:22] https://pastebin.ubuntu.com/p/ydyp3smGMY/
[13:26] brilliant, controller seems back up here now
[13:26] getting a lot of this in the machine log though;
[13:26] 2020-09-17 13:24:58 ERROR juju.core.raftlease store.go:265 timeout waiting for Command(ver: 1, op: extend, ns: application-leadership, model: 0ea373, lease: glance, holder: glance/0) to be processed
[13:29] `routergod@juju:~/.local/share/juju$ juju switch controller
[13:29] 14:28:26+01:00 upgrade in progress since "2020-09-17T13:13:43Z"
[13:29] Deployed
[13:30] how to sync the other databases? do I just repeat that command there?
[13:31] routergod: That will have happened automatically. Writes will have been redirected to the primary, and the replicas sync'd.
[13:31] Controllers being down should not affect that.
[13:34] yeah that what i expected.
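(Editor's note: for readability, here is the workaround from the exchange above consolidated into one mongo-shell sketch. The update command is quoted verbatim from the log; the verification query afterwards is an illustrative addition, assuming you are connected to the `juju` database on the primary controller.)

```
// Workaround sketch, run against the primary controller's "juju" database.
// Legacy update(query, update, upsert, multi) form, as pasted in the log:
// upsert=false, multi=true so every matching charm document across models
// gets an empty metadata object instead of null.
db.charms.update({meta: null}, { $set: {"meta": {}} }, false, true)

// Confirm nothing is left with a null meta before restarting the agents.
db.charms.find({meta: null}).count()
```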
but the other two are like this in the machine log
[13:34] `2020-09-17 13:31:26 ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [b5fb8e] "machine-2" cannot open api: unable to connect to API: dial tcp 127.0.0.1:17070: connect: connection refused`
[13:34] looks like the upgrade is complete on one controller but not the other two
[13:38] manadart, https://github.com/juju/juju/pull/12016
[13:40] manadart if I repeat your original mongo foo on one of the 'down' controllers i get this;
[13:40] `juju:SECONDARY> db.charms.find({meta: null}).pretty()
[13:40] "signature" : {
[13:44] jam: Commented on it.
[13:44] routergod, it looks like you got rate limited
[13:48] oh :-(
[13:50] manadart I tried your mongo stanza on one of the 'down' controllers and got an "errmsg" : "not master and slaveOk=false"
[13:50] jujud is not starting on those two
[13:51] routergod: What is in /var/log/juju/machine-x.log on those?
[13:51] 2020-09-17 13:31:26 ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [b5fb8e] "machine-2" cannot open api: unable to connect to API: dial tcp 127.0.0.1:17070: connect: connection refused
[13:52] manadart 2020-09-17 13:31:26 ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [b5fb8e] "machine-2" cannot open api: unable to connect to API: dial tcp 127.0.0.1:17070: connect: connection refused
[13:56] manadart, updated PR
[13:57] routergod: had the same issue with mongo. try to stop all but one mongos, and then start them again. this seems to let me execute the command
[13:58] routergod, there should be an earlier failure about the apiserver not starting (that is the other side saying it is failing to talk to itself). We could also look at "juju_engine_report" but that would need pastebin
[13:58] routergod: after you apply the mongo "hack" you have to restart jujud-machineX
[14:12] soumplis_
[14:13] soumplis_ many thanks that works after restart. manadart , others many thanks for your help!
[14:14] Great.
[14:16] routergod, your second paste of "everything" doesn't have the ones that the first past with "meta: null" contained
[14:20] jam err.. ?
[14:24] jam I can't repeat that original because we changed the values to something else, but I will search for the ids or something...
[14:25] https://pastebin.ubuntu.com/p/j6JJ39Q4WN/ has "0ea37312-4df6-49a6-8137-f6b872def5dd:cs:ntp-41" but https://pastebin.ubuntu.com/p/ydyp3smGMY/ doesn't seem to have the string "ntp-41" anywhere
[14:25] routergod, it *does* have ntp-39 with valid metadata
[14:26] manadart, so this doesn't fix the potential issue with the model cache, but does fix pylibjuju https://github.com/juju/juju/pull/12017
[14:39] jam did I screw up the paste?
[14:39] https://pastebin.ubuntu.com/p/QgXvpZqJ3C/
[14:40] routergod, ok, I see ntp-41 in your latest paste
[14:40] jam that just that specific model uuid
[14:41] sure. I see a ntp-39 that is all happy, and a ntp-41 which is still a placeholder
[14:45] jam juju status show model 0ea37312-4df6-49a6-8137-f6b872def5dd running ntp-39
[14:46] manadart, also this fixes the issue where we have a custom value in application status https://github.com/juju/python-libjuju/pull/444
[14:46] manadart, walk through it quickly, before you EOD?
[14:47] stickupkid: OK. Tally-ho.
[14:48] routergod, sounds like an "juju upgrade-charm" that never completed because it couldn't get the charm from the store.
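(Editor's note: the "not master and slaveOk=false" error above simply means the shell was attached to a SECONDARY replica. As a hedged sketch of the remaining steps discussed, the block below shows how one might locate the primary and then restart the agents; the `jujud-machine-<n>` systemd unit name is an assumption for illustration, not quoted from the log.)

```
// Writes must go to the PRIMARY member of the controller replica set.
rs.status()   // identify which controller is currently PRIMARY, connect there and run the update

// After the update succeeds, restart the machine agent on each controller
// (run outside the mongo shell; the service name below is an assumption):
//   sudo systemctl restart jujud-machine-<n>
```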
[14:49] jam i will try upgrade-charm in that model to see what happens
[14:49] There does appear to be an NTP 41 available: https://jaas.ai/ntp
[14:49] yeah i am running that everywhere else
[14:54] stickupkid: do you have time to review 12008 ad 12009?
[14:57] petevg, would you be able to move the existing bugs targeting 2.8.3 to a 2.8.4 milestone except for the one we are doing the release for/
[14:57] ?
[14:58] routergod, can you confirm that it gets to ntp-41 correctly?
[14:58] @jam: ha! That sounds fun. Sure. I'll go ahead and do that.
[15:01] manadart, how comfortable are you with suggesting the db.update() workaround that you pasted above?
[15:01] @jam i started with upgrade-model there because it was on 2.8.0. this has not completed, i will try to find what is wrong...
[15:02] there is one unit in blocked in the model
[15:02] actually that might be it
[15:02] routergod, that shouldn't prevent a model upgrade (IIRC)
[15:02] hmm
[15:02] jam: now I just have to remember how to create a milestone in launchpad. It's more annoying/obscure than it should be.
[15:03] petevg, I start here: https://launchpad.net/juju/
[15:03] click on the 2.8 series link in the "Series and milestones"
[15:03] then on https://launchpad.net/juju/2.8
[15:03] Create milestone
[15:04] @petevg, but we shouldn't close the 2.8.3 milestone (if you did)
[15:04] Since I still need to target the SEGV bug to 2.8.3
[15:05] ah, maybe you renamed 2.8.3 to 2.8.4 and are creating a new 2.8.3 that would work
[15:05] jam: Reasonably :) It seems to me such records are generally going to be orphans.
[15:05] jam: that is exactly what I'm doing :-)
[15:05] If they were place-holders awaiting fulfilment, I assume you'd have other things in flux preventing an upgrade.
[15:12] jam where do i look for logs about an upgrade-model command please?
[15:20] routergod, I would look at machine-X.log on the controllers, or at least start there
[15:21] hml, looking now
[15:22] can someone validate my proposed steps: https://bugs.launchpad.net/juju/+bug/1895954/comments/7
[15:22] Bug #1895954: SEGV when jujud restarts after upgrade-controller 2.8.1 -> 2.8.2
[15:22] I have done them here, but I want to make sure it isn't JustMe
[15:23] I also added a comment about stop/start of the controllers
[15:34] jam, workaround is good
[15:34] *text
[15:42] stickupkid, can you try actually copy & paste the commands with a test environ?
[15:48] jam, I can confirm, I inserted a null meta charm and ran the commands
[15:48] jam, WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
[15:48] stickupkid, thanks
[15:48] jam I had neglected to restart jujud on that one controller. After restart the upgrade-model works
[15:48] routergod, yay. ok, so can you upgrade the charm as wel?
[15:50] jam upgrade-charm ntp seemed to work ok, now on ntp-41
[15:50] many thanks!
[15:52] \o/
[15:59] hml, petevg I have kicked off $$merge$$ and need to run some errands for the family. I will be back to check on it. I will be available on Telegram/Mattermost if it is urgent.
[16:00] jam: sounds good. I'm gonna step away for lunch, and plan on starting a release when I get back.
[16:52] petevg, looks like it merged successfully, so ab69570b38fbc746e54184e4c3274612bcbb8327 should be the hash
[17:09] jam: cool cool. I'm gonna start the release process.