[04:26] <wallyworld> kelvinliu: next azure PR if you get a chance at some point (another step along the way)  https://github.com/juju/juju/pull/11998
[04:44] <kelvinliu> looking
[04:59] <kelvinliu> wallyworld: lgtm
[04:59] <kelvinliu> ty
[05:02] <wallyworld> tyvm
[06:22] <flxfoo> hi all
[06:22] <flxfoo> having trouble with pylibjuju, can I use something else? like `curl` to retrieve data from a model?
[06:25] <wallyworld> what trouble? curl will be messy, using a client is better. what's the issue?
[06:27] <flxfoo> wallyworld:I posted on discourse, but I have `RPC: Connection closed, reconnecting`
[06:28] <flxfoo> then on controller side the `logsink.log` has `ERROR juju.apiserver apiserver.go:939 error serving RPCs: codec.ReadHeader error: error receiving message: websocket: close 1009 (message too big)`
[06:28] <flxfoo> of course, nothing changed on those machines, the day before the script was working fine, then the next I have this
[06:29] <wallyworld> i've not used pylibjuju much, i assume the backup action is configured to leave the backup on the controller and not try to stream it down to the client?
[06:30] <flxfoo> yeah, it mainly just runs the action: login, run action, do some server stuff, ...
[06:30] <flxfoo> steps are
[06:30] <flxfoo> instantiating the controller is fine
[06:30] <wallyworld> oh, so a charm action
[06:30] <flxfoo> yes
[06:31] <wallyworld> sorry, my head went to a juju backup
[06:31] <flxfoo> the error is raised on the `get_model()` python call
[06:31] <flxfoo> it is well before any action is called
[06:32] <flxfoo> the RPC cuts out when getting model data
[06:32] <wallyworld> i know we had to increase the msg frame size for the juju client, but that was ages ago, perhaps there's a pylibjuju tweak needed
[06:33] <flxfoo> I tried from 512 until 65535 using max_frame_size, with no luck
[06:34] <flxfoo> I am not sure this is actually about the python lib, but maybe about what the controller answers
[06:35] <wallyworld> what version is the controller
[06:35] <flxfoo> 2.8.0
[06:36] <wallyworld> i can't recall exactly what's been fixed since 2.8.0, but upgrading to 2.8.2 would be something to consider
[06:36] <flxfoo> ok, I would need to create a new controller and migrate model yes?
[06:36] <wallyworld> no, you can just upgrade
[06:37] <wallyworld> juju upgrade-controller
[06:37] <wallyworld> you can use --dry-run to see what it would do
[06:37] <flxfoo> it is a production platform... not sure...
[06:38] <wallyworld> migration is a good option then
[06:38] <flxfoo> k thanks
[06:38] <wallyworld> you can even deploy a test model on 2.8.2
[06:38] <wallyworld> and see if the issue goes away
[06:39] <flxfoo> I found in the logs a previous model with the same name, which raised some errors. Could that be a related issue?
[06:39] <wallyworld> juju doesn't allow 2 models with the same name, so any logs would be for a previously deleted model, I would expect
[06:39] <flxfoo> like this `ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [619092] "machine-0" cannot open api: model cache: model "61909253-939f-48c0-8452-389111410e43" did not appear in cache timeout`
[06:40] <wallyworld> that could be something else, i don't know off hand
[06:40] <flxfoo> k
[06:40] <wallyworld> 2.8.2 did fix some model cache issues i think
[06:41] <flxfoo> and this `ERROR juju.worker.dependency engine.go:671 "mgo-txn-resumer" manifold worker returned unexpected error: cannot resume transactions: The update path 'settings.' contains an empty field name, which is not allowed.`
[06:41] <flxfoo> (sorry to bother, and I don't have more) :P
[06:41] <flxfoo> p
[06:43] <wallyworld> not seen that before. we'd need to see a bit more info, like what led up to that error, any previous errors etc., and maybe even a sanitised chunk of the relevant db records. worth filing a bug
[06:45] <flxfoo> that's from the logsink.log file (appears as well in machine-0.log); that error is looping together with the previous one, around every 3 minutes.
[06:46] <flxfoo> last one, more general: in the case of spawning a new controller and migrating the model, would previous non-existent models, or objects in general, be cleaned up?
[06:48] <wallyworld> a new controller starts empty. only the migrated model is copied across, so any old cruft is left behind
[06:49] <wallyworld> stuff restarting means something has gone wrong and needs looking into, likely a bug that needs fixing
[06:49] <flxfoo> talking about the RPC?
[06:50] <wallyworld> no the error being logged every 3 minutes
[06:50] <flxfoo> k
[06:51] <flxfoo> the uuid in front of that log line is the model uuid?
[06:51] <flxfoo> probably not
[06:51] <flxfoo> the whole line is eg.
[06:52] <wallyworld> the uuid is the model uuid
[06:52] <flxfoo> 66c9ba24-69af-40f0-842f-8613777c1491: machine-0 2020-09-15 06:45:04 ERROR juju.worker.dependency engine.go:671 "mgo-txn-resumer" manifold worker returned unexpected error: cannot resume transactions: The update path 'settings.' contains an empty field name, which is not allowed.
[06:52] <wallyworld> something has really got itself into a funny state
[06:52] <flxfoo> how do I know the uuid of models?
[06:53] <wallyworld> juju models --format yaml
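(Editor's note: `juju models` can also emit JSON, which is easier to consume from a script when attributing UUID-prefixed log lines to a model. A minimal stdlib-only sketch — the sample output below is illustrative and trimmed, and the exact field names are an assumption about the JSON format, not verified output:)

```python
import json

# Illustrative (trimmed) sample of `juju models --format json` output;
# field names here are an assumption, not verified controller output.
SAMPLE = '''{"models": [
  {"name": "controller", "model-uuid": "66c9ba24-69af-40f0-842f-8613777c1491"},
  {"name": "default", "model-uuid": "61909253-939f-48c0-8452-389111410e43"}
]}'''

def model_uuids(models_json: str) -> dict:
    """Map model name -> UUID, so UUID-prefixed log lines can be attributed."""
    data = json.loads(models_json)
    return {m["name"]: m["model-uuid"] for m in data["models"]}
```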
[06:53] <flxfoo> yeah ... I might have done something weird, but except creating and removing models or apps, really nothing special
[06:53] <flxfoo> k
[06:54] <flxfoo> ok so that controller model for sure
[06:54] <wallyworld> juju should not error for that
[06:54] <flxfoo> mgo I suppose this is mongodb related
[06:55] <flxfoo> platform is on aws
[06:55] <wallyworld> more likely juju messing up when a model was removed
[06:57] <flxfoo> I see... so yeah best option would be to respawn a new controller and migrate the current model...
[06:58] <flxfoo> how can I debug or get into the mongodb db?
[06:59] <wallyworld> you can dump a model as yaml by exporting JUJU_DEV_FEATURE_FLAGS=developer-mode and then running juju dump-model or juju dump-db
[07:00] <flxfoo> is output heavy?
[07:00] <flxfoo> there is not a lot of nodes
[07:00] <flxfoo> 17
[07:01] <flxfoo> sorry 13
[07:02] <wallyworld> the size is more related to how many apps / units
[07:04] <flxfoo> 7 apps, 13 units
[07:04] <wallyworld> it won't be much
[07:08] <flxfoo> done
[07:08] <flxfoo> sequences: there are a few application entries that no longer exist
[07:19] <flxfoo> wallyworld:thanks ... I will try to look into that...
[07:33] <wallyworld> the sequence entries are ok, they will be ignored. but they should have been cleaned up
[08:02] <flxfoo> wallyworld:ok, so definitely something did not get right...
[08:15] <flxfoo> wallyworld:sorry to bother, port 37017 should be accessible to juju client (cli) right?
[08:17] <flxfoo> because sniffing network I have traffic coming from the vm public interface to the private interface on port 37017...
[08:28] <stickupkid> flxfoo, that's the mongo port
[08:29] <stickupkid> flxfoo, the default API port is 17070
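(Editor's note: a quick way to confirm which controller ports are reachable from a client host is a plain TCP probe. This is a generic sketch, not juju-specific — per the above, a juju client only needs the API port, 17070 by default, while 37017 is mongo's port and internal to the controller:)

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("controller-ip", 17070) should succeed from a client host,
# while 37017 need not be reachable at all from outside the controller.
```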
[08:43] <flxfoo> ok good news I found in https://pythonlibjuju.readthedocs.io/en/latest/api/juju.client.html#module-juju.client.connection
[08:43] <flxfoo> that the default MAX_FRAME_SIZE is 4194304
[08:43] <flxfoo> when I doubled it, RPC error is gone
[08:45] <stickupkid> flxfoo, over 4MB of data, that is some impressive data
[08:45] <flxfoo> stickupkid:right
[08:46] <flxfoo> now I dumped the model and the yaml file is about 8022129 bytes (~8MB)
[08:46] <flxfoo> which fits
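(Editor's note: the arithmetic above can be sketched as a small helper. The 4194304 default comes from the pylibjuju docs linked earlier; `pick_frame_size` is a hypothetical helper name, and the value it returns is what one would pass as `max_frame_size` when connecting with pylibjuju:)

```python
DEFAULT_MAX_FRAME_SIZE = 4194304  # pylibjuju's documented default (4 MiB)

def pick_frame_size(payload_bytes: int,
                    default: int = DEFAULT_MAX_FRAME_SIZE) -> int:
    """Double the default websocket frame size until the payload fits.

    Note: values like 512..65535 are far *below* the default, which is
    why passing them earlier could not fix the "message too big" error.
    """
    size = default
    while size < payload_bytes:
        size *= 2
    return size
```

A model dump of 8022129 bytes fits once the default is doubled to 8388608, matching what worked above.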
[08:46] <stickupkid> flxfoo, classic
[08:46] <flxfoo> now , there might be some extra data in there
[08:46] <flxfoo> is there is way to "cleanup"
[08:47] <flxfoo> ?
[08:47] <stickupkid> flxfoo, without seeing it, I'm unsure
[08:48] <flxfoo> ok, I mean there are no gc-type tools?
[08:49] <flxfoo> stickupkid:I manipulated the models quite a bit (from a testing pov), like adding/removing apps/units etc...
[08:49] <flxfoo> I suppose that comes from that
[08:49] <flxfoo> If there are not many more changes, that size should not grow, right?
[08:49] <flxfoo> and the only way would be to respawn a model? or migrate maybe?
[08:50] <stickupkid> flxfoo, so I would hope we would clean up if you removed a unit/application
[08:50] <stickupkid> flxfoo, but yeah, they should be stable size if your model is stable
[08:51] <flxfoo> stickupkid:right now (production) there is 13 units 7 apps not much
[08:52] <flxfoo> stickupkid:before that "stable" model, I played quite a while adding/removing etc probably a lot of units and apps... and maybe some pieces haven't been removed properly or so...
[08:52] <flxfoo> stickupkid:does that sound plausible?
[08:52] <stickupkid> yeah, but do keep an eye on it and open a bug if you believe we should be cleaning up when we're not
[08:53] <flxfoo> stickupkid:sure will do... I'll try to localize the issue to be able to file something useful... controller version is 2.8.0, and wally already told me that a few things were fixed in the model since
[09:01] <flxfoo> stickupkid:do unit (instance) connect to 37017 (mongodb) directly?
[09:02] <flxfoo> does not look like it (from sniffing)
[09:02] <stickupkid> flxfoo, they should all go via an API
[09:03] <flxfoo> k
[09:04] <flxfoo> stickupkid:still don't understand why private interface tries to connect to public interface port 37017 on the controller
[09:06] <flxfoo> stickupkid:what is connecting to 37017? a client (cli) would or not? or is it internal to ctrl?
[09:07] <stickupkid> flxfoo, internal to ctrl, achilleasa any thoughts on above?
[09:09] <achilleasa> flxfoo: is that a HA controller?
[09:11] <flxfoo> stickupkid:not yet :p
[09:32] <flxfoo> stickupkid:no it is not HA
[11:27] <achilleasa> stickupkid: are you aware of any existing watcher for space topology changes?
[11:31] <achilleasa> or for subnet doc changes
[11:32] <achilleasa> the latter is what I actually need
[11:32] <stickupkid> achilleasa, not that I'm aware of
[11:35] <achilleasa> stickupkid: there is a lifecycle watcher for subnets
[13:17] <stickupkid> achilleasa, on develop if you run make static-analysis do you get this issue?
[13:17] <stickupkid> go/src/github.com/juju/juju/apiserver/facades/client/applicationoffers/state.go:58:2: UserPermission redeclared
[13:17] <stickupkid> not sure how that's not being picked up, it really should be
[13:17] <achilleasa> can check in 5'
[13:18] <stickupkid> wicked
[14:27] <stickupkid> hml, updated my q&a steps https://github.com/juju/juju/pull/11994
[14:27] <hml> stickupkid:  rgr
[14:53] <stickupkid> hml, approved
[14:53] <hml> stickupkid:  ta
[15:51] <stickupkid> why is CharmID.Metadata an untyped map[string]string
[15:51] <stickupkid> ?
[15:51] <stickupkid> just mystery meat
[15:56] <pmatulis> lol
[16:03] <tychicus> https://jaas.ai/apache2 — it does not appear that the apache2 charm supports the certificates:tls-certificates relation to obtain TLS certificates from vault, is that correct?
[21:21] <tychicus> If you set bluestore-block-db-size in ceph-osd post deployment, what additional steps are necessary to update the db partition size on all of the osd servers?
[21:23] <tychicus> do you have to remove and re-add the unit to pick up the new setting, or can you use osd-out, zap-disk and then add-disk?