/srv/irclogs.ubuntu.com/2020/09/15/#juju.txt

wallyworldkelvinliu: next azure PR if you get a chance at some point (another step along the way)  https://github.com/juju/juju/pull/11998 04:26
kelvinliulooking04:44
kelvinliuwallyworld: lgtm04:59
kelvinliuty04:59
wallyworldtyvm05:02
flxfoohi all06:22
flxfoohaving trouble with pylibjuju, can I use something else? like `curl` to retrieve data from a model?06:22
wallyworldwhat trouble? curl will be messy, using a client is better. what's the issue?06:25
flxfoowallyworld:I posted on discourse, but I have `RPC: Connection closed, reconnecting`06:27
flxfoothen on controller side the `logsink.log` has `ERROR juju.apiserver apiserver.go:939 error serving RPCs: codec.ReadHeader error: error receiving message: websocket: close 1009 (message too big)`06:28
flxfooof course, nothing changed on those machines, the day before the script was working fine, then the next I have this06:28
wallyworldi've not used pylibjuju much, i assume the backup action is configured to leave the backup on the controller and not try to stream it down to the client?06:29
flxfooyeah, mainly that just run the action. login, run action, do some server stuff,...06:30
flxfoosteps are06:30
flxfooinstantiating the controller is fine06:30
wallyworldoh, so a charm action06:30
flxfooyes06:30
wallyworldsorry, my head went to a juju backup06:31
flxfoothe error is raised on the `get_model()` python call06:31
flxfooit is much before an action is called06:31
flxfoothe RPC cuts, when getting model data06:32
wallyworldi know we had to increase the msg frame size for the juju client, but that was ages ago, perhaps there's a pylibjuju tweak needed06:32
flxfooI tried from 512 until 65535 using max_frame_size, with no luck06:33
flxfooI am not sure this is about the python lib actually but maybe what the controller answers06:34
wallyworldwhat version is the controller06:35
flxfoo2.8.0 06:35
wallyworldi can't recall exactly what's been fixed since 2.8.0, but upgrading to 2.8.2 would be something to consider06:36
flxfoook, I would need to create a new controller and migrate model yes?06:36
wallyworldno, you can just upgrade06:36
wallyworldjuju upgrade-controller06:37
wallyworldyou can use --dry-run to see what it would do06:37
flxfooit is a production platform... not sure...06:37
wallyworldmigration is a good option then06:38
flxfook thanks06:38
wallyworldyou can even deploy a test model on 2.8.2 06:38
wallyworldand see if the issue goes away06:38
flxfooI have found errors in the logs from a previous model with the same name, could that be related?06:39
wallyworldjuju doesn't allow 2 models with the same name, so any logs would be for a previously deleted model i would expect06:39
flxfoolike this `ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [619092] "machine-0" cannot open api: model cache: model "61909253-939f-48c0-8452-389111410e43" did not appear in cache timeout`06:39
wallyworldthat could be something else, i don't know off hand06:40
flxfook06:40
wallyworld2.8.2 did fix some model cache issues i think06:40
flxfooand this `ERROR juju.worker.dependency engine.go:671 "mgo-txn-resumer" manifold worker returned unexpected error: cannot resume transactions: The update path 'settings.' contains an empty field name, which is not allowed.`06:41
flxfoo(sorry to bother, and I don't have more) :P06:41
flxfoop06:41
wallyworldnot seen that before. we'd need to see a bit more info like what led up to that error, any previous errors etc. and maybe even a sanitised chunk of the relevant db records. worth filing a bug06:43
flxfoothat's from logsink.log file (appear as well in machine-0.log) , that error is looping together with the previous one. around every 3 minutes.06:45
flxfoolast one, more general, in case of spawning a new controller and migrating the model, would previous non-existent models or objects in general be cleaned up?06:46
wallyworlda new controller starts empty. only the migrated model is copied across, so any old cruft is left behind06:48
wallyworldstuff restarting means something has gone wrong and needs looking into, likely a bug that needs fixing06:49
flxfootalking about the RPC?06:49
wallyworldno the error being logged every 3 minutes06:50
flxfook06:50
flxfoothe uuid in front of that log line is the model uuid?06:51
flxfooprobably not06:51
flxfoothe whole line is eg.06:51
wallyworldthe uuid is the model uuid06:52
flxfoo66c9ba24-69af-40f0-842f-8613777c1491: machine-0 2020-09-15 06:45:04 ERROR juju.worker.dependency engine.go:671 "mgo-txn-resumer" manifold worker returned unexpected error: cannot resume transactions: The update path 'settings.' contains an empty field name, which is not allowed.06:52
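The log line quoted above appears to follow a fixed layout: model UUID, agent, timestamp, level, module, source location, then the message. A minimal Python sketch for splitting such a line into fields follows. The pattern is inferred from this single quoted line only, so other logsink.log lines may not match it, and `LOGSINK_LINE` and `parse_logsink_line` are hypothetical helper names, not part of juju.

```python
import re

# Assumed layout, inferred from the one logsink.log line quoted in the chat:
# "<model-uuid>: <agent> <date> <time> <LEVEL> <module> <file:line> <message>"
LOGSINK_LINE = re.compile(
    r"^(?P<model_uuid>[0-9a-f-]{36}): "
    r"(?P<agent>\S+) "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<module>\S+) "
    r"(?P<location>\S+:\d+) "
    r"(?P<message>.*)$"
)


def parse_logsink_line(line):
    """Return a dict of fields, or None if the line does not fit the assumed layout."""
    match = LOGSINK_LINE.match(line.strip())
    return match.groupdict() if match else None


# The line pasted in the chat, without the trailing chat timestamp.
SAMPLE_LINE = (
    "66c9ba24-69af-40f0-842f-8613777c1491: machine-0 2020-09-15 06:45:04 "
    "ERROR juju.worker.dependency engine.go:671 "
    '"mgo-txn-resumer" manifold worker returned unexpected error: '
    "cannot resume transactions: The update path 'settings.' contains "
    "an empty field name, which is not allowed."
)

if __name__ == "__main__":
    fields = parse_logsink_line(SAMPLE_LINE)
    print(fields["model_uuid"], fields["level"], fields["module"])
```

Grouping parsed lines by `model_uuid` and `message` is one way to confirm which model the error that repeats every 3 minutes belongs to.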
wallyworldsomething has really got itself into a funny state06:52
flxfoohow do I know the uuid of models?06:52
wallyworldjuju models --format yaml06:53
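To match a UUID from the logs against a model name, the output of the command above can be read programmatically. The sketch below uses `--format json` rather than yaml so that only the Python standard library is needed. It assumes the output has a top-level `models` list whose entries carry `name` and `model-uuid` keys; check that against real output before relying on it. The sample document is illustrative, not captured from a controller, and `model_uuids` and `fetch_model_uuids` are hypothetical helper names.

```python
import json
import subprocess


def model_uuids(models_json):
    """Map model name -> model UUID from `juju models --format json` text.

    Assumption: the document has a top-level "models" list and each entry
    has "name" and "model-uuid" keys.
    """
    data = json.loads(models_json)
    return {m["name"]: m["model-uuid"] for m in data.get("models", [])}


def fetch_model_uuids():
    """Run the juju CLI (must be on PATH and logged in) and parse its output."""
    result = subprocess.run(
        ["juju", "models", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return model_uuids(result.stdout)


# Illustrative sample only; the UUID is the one seen in the log line above.
SAMPLE_MODELS_JSON = json.dumps(
    {
        "models": [
            {
                "name": "admin/controller",
                "model-uuid": "66c9ba24-69af-40f0-842f-8613777c1491",
            }
        ]
    }
)

if __name__ == "__main__":
    print(model_uuids(SAMPLE_MODELS_JSON))
```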
flxfooyeah ... I might have done something weird, but except creating and removing models or apps, really nothing special06:53
flxfook06:53
flxfoook so that controller model for sure06:54
wallyworldjuju should not error for that06:54
flxfoomgo I suppose this is mongodb related06:54
flxfooplatform is on aws06:55
wallyworldmore likely juju messing up when a model was removed06:55
flxfooI see... so yeah best option would be to respawn a new controller and migrate the current model...06:57
flxfoohow can I debug or get into the mongodb db?06:58
wallyworldyou can dump a model as yaml by export JUJU_DEV_FEATURE_FLAGS=developer-mode and then juju dump-model or juju dump-db06:59
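A hedged Python sketch of that step: it sets the `developer-mode` feature flag only for the child process, runs `juju dump-model`, and reports the size of the dump. It assumes the flag variable is spelled `JUJU_DEV_FEATURE_FLAGS`, that it takes a comma-separated list, and that `juju dump-model` writes the YAML to stdout. `developer_mode_env` and `dump_model` are hypothetical helper names.

```python
import os
import subprocess

FLAG_VAR = "JUJU_DEV_FEATURE_FLAGS"


def developer_mode_env(base_env=None):
    """Return a copy of the environment with developer-mode enabled.

    Assumption: the variable holds a comma-separated list of flags, so any
    flags that are already set are preserved.
    """
    env = dict(os.environ if base_env is None else base_env)
    flags = [f for f in env.get(FLAG_VAR, "").split(",") if f]
    if "developer-mode" not in flags:
        flags.append("developer-mode")
    env[FLAG_VAR] = ",".join(flags)
    return env


def dump_model(output_path):
    """Run `juju dump-model`, save its stdout, and return the size in bytes."""
    result = subprocess.run(
        ["juju", "dump-model"],
        env=developer_mode_env(),
        check=True,
        capture_output=True,
    )
    with open(output_path, "wb") as fh:
        fh.write(result.stdout)
    return len(result.stdout)


if __name__ == "__main__":
    # Only shows the environment tweak; dump_model() needs a live controller.
    print(developer_mode_env({})[FLAG_VAR])
```

Passing the flag through `env=` keeps it out of the calling shell, which may be preferable on a production host.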
flxfoois output heavy?07:00
flxfoothere is not a lot of nodes07:00
flxfoo17 07:00
flxfoosorry 13 07:01
wallyworldthe size is more related to how many apps / units07:02
flxfoo7 apps, 13 units07:04
wallyworldit won't be much07:04
flxfoodone07:08
flxfoosequences: have a few application entries that are no more07:08
flxfoowallyworld:thanks ... I will try to look into that...07:19
wallyworldthe sequence entries are ok, they will be ignored. but they should have been cleaned up07:33
flxfoowallyworld:ok, so definitely something did not get right...08:02
flxfoowallyworld:sorry to bother, port 37017 should be accessible to juju client (cli) right?08:15
flxfoobecause sniffing network I have traffic coming from the vm public interface to the private interface on port 37017...08:17
stickupkidflxfoo, that's the mongo port08:28
stickupkidflxfoo, the default API port is 17070 08:29
flxfoook good news I found in https://pythonlibjuju.readthedocs.io/en/latest/api/juju.client.html#module-juju.client.connection 08:43
flxfoothat the default MAX_FRAME_SIZE is 4194304 08:43
flxfoowhen I doubled it, RPC error is gone08:43
stickupkidflxfoo, over 4MB of data, that is some impressive data08:45
flxfoostickupkid:right08:45
flxfoonow I dumped the model and the yaml file is about 8022129 bytes08:46
flxfoowhich fits08:46
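The arithmetic behind "which fits" can be sketched in Python: the default of 4194304 bytes (2**22) is smaller than the 8022129-byte dump, while double the default (8388608) is larger. The helper `frame_size_for` is hypothetical. `connect_model` assumes that pylibjuju's `Model.connect` accepts `model_name` and `max_frame_size` keyword arguments and forwards the latter to the websocket connection, which should be checked against the installed version. The dump size is only a rough proxy for the size of the RPC response.

```python
DEFAULT_MAX_FRAME_SIZE = 4194304  # 2**22, the default quoted in the chat


def frame_size_for(payload_bytes, default=DEFAULT_MAX_FRAME_SIZE):
    """Double the default frame size until payload_bytes fits."""
    size = default
    while size < payload_bytes:
        size *= 2
    return size


async def connect_model(model_name, payload_bytes):
    """Connect to a model with a frame size large enough for payload_bytes."""
    # Imported lazily so frame_size_for() works without python-libjuju installed.
    from juju.model import Model

    model = Model()
    # Assumption: Model.connect passes max_frame_size through to the
    # underlying websocket connection.
    await model.connect(
        model_name=model_name,
        max_frame_size=frame_size_for(payload_bytes),
    )
    return model


if __name__ == "__main__":
    dump_size = 8022129  # size of the model dump reported in the chat
    print(frame_size_for(dump_size))
```

Doubling until the payload fits mirrors what was done by hand in the chat. A fixed larger limit would also work, but it would have to be revisited if the model data keeps growing.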
stickupkidflxfoo, classic08:46
flxfoonow , there might be some extra data in there08:46
flxfoois there a way to "cleanup"08:46
flxfoo?08:47
stickupkidflxfoo, without seeing it, I'm unsure08:47
flxfoook, I mean there is no gc type of tools?08:48
flxfoostickupkid:I quite manipulate (for testing pov) the models, like adding removing apps/units etc...08:49
flxfooI suppose that comes from that08:49
flxfooIf there are not many more changes, that size should not keep growing, right?08:49
flxfooand the only way would be to respawn a model? or migrate maybe?08:49
stickupkidflxfoo, so I would hope we would clean up if you removed a unit/application08:50
stickupkidflxfoo, but yeah, they should be stable size if your model is stable08:50
flxfoostickupkid:right now (production) there is 13 units 7 apps not much08:51
flxfoostickupkid:before that "stable" model, I played quite a while adding/removing etc probably a lot of units and apps... and maybe some pieces haven't been removed properly or so...08:52
flxfoostickupkid:does that sound plausible?08:52
stickupkidyeah, but do keep an eye on it and open a bug if you believe we should be cleaning up when we're not08:52
flxfoostickupkid:sure will do... I'll try to localize the issue so I can file something useful... controller version is 2.8.0, and wally told me already that a few things were fixed in models since08:53
flxfoostickupkid:do unit (instance) connect to 37017 (mongodb) directly?09:01
flxfoodoes not look like it (sniffing)09:02
stickupkidflxfoo, they should all go via an API09:02
flxfook09:03
flxfoostickupkid:still don't understand why private interface tries to connect to public interface port 37017 on the controller09:04
flxfoostickupkid:what is connecting to 37017? a client (cli) would or not? or is it internal to ctrl?09:06
stickupkidflxfoo, internal to ctrl, achilleasa any thoughts on above?09:07
achilleasaflxfoo: is that a HA controller?09:09
flxfoostickupkid:not yet :p09:11
flxfoostickupkid:no it is not HA09:32
achilleasastickupkid: are you aware of any existing watcher for space topology changes?11:27
achilleasaor for subnet doc changes11:31
achilleasathe latter is what I actually need11:32
stickupkidachilleasa, not that I'm aware of11:32
achilleasastickupkid: there is a lifecycle watcher for subnets11:35
stickupkidachilleasa, on develop if you run make static-analysis do you get this issue?13:17
stickupkidgo/src/github.com/juju/juju/apiserver/facades/client/applicationoffers/state.go:58:2: UserPermission redeclared13:17
stickupkidnot sure how that's not being picked up, it really should be13:17
achilleasacan check in 5'13:17
stickupkidwicked13:18
=== hallback_ is now known as hallback
=== vern_ is now known as vern
=== marosg_ is now known as marosg
=== mskalka_ is now known as mskalka
=== beisner_ is now known as beisner
=== nicolasbock_ is now known as nicolasbock
=== coreycb_ is now known as coreycb
=== skay__ is now known as skay_
=== skay__ is now known as skay
stickupkidhml, updated my q&a steps https://github.com/juju/juju/pull/11994 14:27
hmlstickupkid:  rgr14:27
=== xnox1 is now known as xnox
stickupkidhml, approved14:53
hmlstickupkid:  ta14:53
stickupkidwhy is CharmID.Metadata an untyped map[string]string15:51
stickupkid?15:51
stickupkidjust mystery meat15:51
pmatulislol15:56
tychicushttps://jaas.ai/apache2, it does not appear that the apache2 charm supports the certificates:tls-certificates relation, to obtain TLS certificates from vault, is that correct?16:03
tychicusIf you set bluestore-block-db-size in ceph-osd post deployment, what additional steps are necessary to update the db partition size on all of the osd servers21:21
tychicusdo you have to remove and re-add the unit to pick up the new setting or can you use osd-out zap-disk and then add-disk21:23
=== mwhudson_ is now known as mwhudso
=== mwhudso is now known as mwhudson
