/srv/irclogs.ubuntu.com/2020/09/15/#juju.txt

wallyworld	kelvinliu: next azure PR if you get a chance at some point (another step along the way) https://github.com/juju/juju/pull/11998	04:26
kelvinliu	looking	04:44
kelvinliu	wallyworld: lgtm	04:59
kelvinliu	ty	04:59
wallyworld	tyvm	05:02
flxfoo	hi all	06:22
flxfoo	having trouble with pylibjuju, can I use something else? like `curl` to retrieve data from a model?	06:22
wallyworld	what trouble? curl will be messy, using a client is better. what's the issue?	06:25
flxfoo	wallyworld:I posted on discourse, but I have `RPC: Connection closed, reconnecting`	06:27
flxfoo	then on controller side the `logsink.log` has `ERROR juju.apiserver apiserver.go:939 error serving RPCs: codec.ReadHeader error: error receiving message: websocket: close 1009 (message too big)`	06:28
flxfoo	of course, nothing changed on those machines, the day before the script was working fine, then the next I have this	06:28
wallyworld	i've not used pylibjuju much, i assume the backup action is configured to leave the backup on the controller and not try to stream it down to the client?	06:29
flxfoo	yeah, mainly that just run the action. login, run action, do some server stuff,...	06:30
flxfoo	steps are	06:30
flxfoo	instantiating the controller is fine	06:30
wallyworld	oh, so a charm action	06:30
flxfoo	yes	06:30
wallyworld	sorry, my head went to a juju backup	06:31
flxfoo	the error is raised on the `get_model()` python call	06:31
flxfoo	it is much before an action is called	06:31
flxfoo	the RPC cuts, when getting model data	06:32
wallyworld	i know we had to increase the msg frame size for the juju client, but that was ages ago, perhaps there's a pylibjuju tweak needed	06:32
flxfoo	I tried from 512 until 65535 using max_frame_size, with no luck	06:33
flxfoo	I am not sure this is about the python lib actually but maybe what the controller answers	06:34
wallyworld	what version is the controller	06:35
flxfoo	2.8.0	06:35
wallyworld	i can't recall exactly what's been fixed since 2.8.0, but upgrading to 2.8.2 would be something to consider	06:36
flxfoo	ok, I would need to create a new controller and migrate model yes?	06:36
wallyworld	no, you can just upgrade	06:36
wallyworld	juju upgrade-controller	06:37
wallyworld	you can use --dry-run to see what it would do	06:37
flxfoo	it is a production platform... not sure...	06:37
wallyworld	migration is a good option then	06:38
flxfoo	k thanks	06:38
wallyworld	you can even deploy a test model on 2.8.2	06:38
wallyworld	and see if the issue goes away	06:38
flxfoo	I have found in the logs a previous model with the same name, which raises some errors; could that be related?	06:39
wallyworld	juju doesn't allow 2 models with the same name, so any logs would be for a previously deleted model i would expect	06:39
flxfoo	like this `ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [619092] "machine-0" cannot open api: model cache: model "61909253-939f-48c0-8452-389111410e43" did not appear in cache timeout`	06:39
wallyworld	that could be something else, i don't know off hand	06:40
flxfoo	k	06:40
wallyworld	2.8.2 did fix some model cache issues i think	06:40
flxfoo	and this `ERROR juju.worker.dependency engine.go:671 "mgo-txn-resumer" manifold worker returned unexpected error: cannot resume transactions: The update path 'settings.' contains an empty field name, which is not allowed.`	06:41
flxfoo	(sorry to bother, and I don't have more) :P	06:41
flxfoo	p	06:41
wallyworld	not seen that before. we'd need to see a bit more info like what led up to that error, any previous errors etc. and maybe even a sanitised chunk of the relevant db records. worth filing a bug	06:43
flxfoo	that's from the logsink.log file (it appears in machine-0.log as well), that error is looping together with the previous one, around every 3 minutes.	06:45
flxfoo	last one, more general: when spawning a new controller and migrating the model, would previous non-existent models or objects in general be cleaned up?	06:46
wallyworld	a new controller starts empty. only the migrated model is copied across, so any old cruft is left behind	06:48
wallyworld	stuff restarting means something has gone wrong and needs looking into, likely a bug that needs fixing	06:49
flxfoo	talking about the RPC?	06:49
wallyworld	no the error being logged every 3 minutes	06:50
flxfoo	k	06:50
flxfoo	the uuid in front of that log line is the model uuid?	06:51
flxfoo	probably not	06:51
flxfoo	the whole line is eg.	06:51
wallyworld	the uuid is the model uuid	06:52
flxfoo	66c9ba24-69af-40f0-842f-8613777c1491: machine-0 2020-09-15 06:45:04 ERROR juju.worker.dependency engine.go:671 "mgo-txn-resumer" manifold worker returned unexpected error: cannot resume transactions: The update path 'settings.' contains an empty field name, which is not allowed.	06:52
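
The log line above can be split into its fields to pull out the model UUID, which can then be compared with the output of `juju models --format yaml`. This is a minimal sketch: the field layout is inferred from that single example line only, and the names `LOGSINK_LINE` and `parse_logsink_line` are hypothetical helpers, not part of juju or pylibjuju.

```python
import re

# Assumed layout, inferred from the one example line in the chat:
# "<model-uuid>: <entity> <date> <time> <LEVEL> <module> <file>:<line> <message>"
LOGSINK_LINE = re.compile(
    r"^(?P<model_uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}): "
    r"(?P<entity>\S+) "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<module>\S+) "
    r"(?P<location>\S+:\d+) "
    r"(?P<message>.*)$"
)


def parse_logsink_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOGSINK_LINE.match(line.strip())
    return match.groupdict() if match else None


# The example line quoted in the chat.
example = (
    "66c9ba24-69af-40f0-842f-8613777c1491: machine-0 2020-09-15 06:45:04 "
    "ERROR juju.worker.dependency engine.go:671 "
    '"mgo-txn-resumer" manifold worker returned unexpected error: '
    "cannot resume transactions: The update path 'settings.' contains an "
    "empty field name, which is not allowed."
)

fields = parse_logsink_line(example)
print(fields["model_uuid"], fields["level"], fields["module"])
```

Lines that do not follow the assumed layout return `None`, so the helper can be run over a whole file and the non-matching lines skipped.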
wallyworld	something has really got itself into a funny state	06:52
flxfoo	how do I know the uuid of models?	06:52
wallyworld	juju models --format yaml	06:53
flxfoo	yeah ... I might have done something weird, but except creating and removing models or apps, really nothing special	06:53
flxfoo	k	06:53
flxfoo	ok so that controller model for sure	06:54
wallyworld	juju should not error for that	06:54
flxfoo	mgo I suppose this is mongodb related	06:54
flxfoo	platform is on aws	06:55
wallyworld	more likely juju messing up when a model was removed	06:55
flxfoo	I see... so yeah best option would be to respawn a new controller and migrate the current model...	06:57
flxfoo	how can I debug or get into the mongodb db?	06:58
wallyworld	you can dump a model as yaml by export JUJU_DEV_FEATURE_FLAGS=developer-mode and then juju dump-model or juju dump-db	06:59
flxfoo	is output heavy?	07:00
flxfoo	there is not a lot of nodes	07:00
flxfoo	17	07:00
flxfoo	sorry 13	07:01
wallyworld	the size is more related to how many apps / units	07:02
flxfoo	7 apps, 13 units	07:04
wallyworld	it won't be much	07:04
flxfoo	done	07:08
flxfoo	sequences: have a few application entries that are no more	07:08
flxfoo	wallyworld:thanks ... I will try to look into that...	07:19
wallyworld	the sequence entries are ok, they will be ignored. but they should have been cleaned up	07:33
flxfoo	wallyworld:ok, so definitely something did not get right...	08:02
flxfoo	wallyworld:sorry to bother, port 37017 should be accessible to juju client (cli) right?	08:15
flxfoo	because sniffing network I have traffic coming from the vm public interface to the private interface on port 37017...	08:17
stickupkid	flxfoo, that's the mongo port	08:28
stickupkid	flxfoo, the default API port is 17070	08:29
flxfoo	ok good news I found in https://pythonlibjuju.readthedocs.io/en/latest/api/juju.client.html#module-juju.client.connection	08:43
flxfoo	that the default MAX_FRAME_SIZE is 4194304	08:43
flxfoo	when I doubled it, RPC error is gone	08:43
stickupkid	flxfoo, over 4MB of data, that is some impressive data	08:45
flxfoo	stickupkid:right	08:45
flxfoo	now I dumped the model and the yaml file is about 8022129 bytes	08:46
flxfoo	which fits	08:46
stickupkid	flxfoo, classic	08:46
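
The arithmetic behind "which fits" can be checked directly. This is a sketch using only the two numbers quoted in the chat; it assumes the size of the dumped YAML is a rough stand-in for the size of the RPC message (the real message is encoded differently, so the sizes will not match exactly). The helper names are hypothetical.

```python
DEFAULT_MAX_FRAME_SIZE = 4194304  # default quoted in the chat (2**22 bytes, 4 MiB)
DUMPED_MODEL_BYTES = 8022129      # size of the dumped YAML quoted in the chat


def fits_in_frame(payload_bytes, max_frame_size):
    """True if a payload of this size is within the frame limit."""
    return payload_bytes <= max_frame_size


def smallest_power_of_two_limit(payload_bytes):
    """Smallest power-of-two limit that is at least payload_bytes."""
    limit = 1
    while limit < payload_bytes:
        limit *= 2
    return limit


doubled = 2 * DEFAULT_MAX_FRAME_SIZE  # 8388608 bytes, 8 MiB

print(fits_in_frame(DUMPED_MODEL_BYTES, DEFAULT_MAX_FRAME_SIZE))  # False
print(fits_in_frame(DUMPED_MODEL_BYTES, doubled))                 # True
print(smallest_power_of_two_limit(DUMPED_MODEL_BYTES))            # 8388608
```

The doubled limit leaves only about 360 KB of headroom over the dumped size, so a model that keeps growing could hit the same error again.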
flxfoo	now , there might be some extra data in there	08:46
flxfoo	is there a way to "cleanup"	08:46
flxfoo	?	08:47
stickupkid	flxfoo, without seeing it, I'm unsure	08:47
flxfoo	ok, I mean there is no gc type of tools?	08:48
flxfoo	stickupkid:I manipulate the models quite a lot (for testing purposes), like adding and removing apps/units etc...	08:49
flxfoo	I suppose that comes from that	08:49
flxfoo	If there are not many more changes, that size should not keep growing, right?	08:49
flxfoo	and the only way would be to respawn a model? or migrate maybe?	08:49
stickupkid	flxfoo, so I would hope we would clean up if you removed a unit/application	08:50
stickupkid	flxfoo, but yeah, they should be stable size if your model is stable	08:50
flxfoo	stickupkid:right now (production) there is 13 units 7 apps not much	08:51
flxfoo	stickupkid:before that "stable" model, I played quite a while adding/removing etc probably a lot of units and apps... and maybe some pieces haven't been removed properly or so...	08:52
flxfoo	stickupkid:does that sound plausible?	08:52
stickupkid	yeah, but do keep an eye on it and open a bug if you believe we should be cleaning up when we're not	08:52
flxfoo	stickupkid:sure will do... I will try to pin down the issue to be able to file something useful... controller version is 2.8.0, and wally already told me that a few things were fixed in models since	08:53
flxfoo	stickupkid:do units (instances) connect to 37017 (mongodb) directly?	09:01
flxfoo	does not look like it (sniffing)	09:02
stickupkid	flxfoo, they should all go via an API	09:02
flxfoo	k	09:03
flxfoo	stickupkid:still don't understand why private interface tries to connect to public interface port 37017 on the controller	09:04
flxfoo	stickupkid:what is connecting to 37017? a client (cli) would or not? or is it internal to ctrl?	09:06
stickupkid	flxfoo, internal to ctrl, achilleasa any thoughts on above?	09:07
achilleasa	flxfoo: is that a HA controller?	09:09
flxfoo	stickupkid:not yet :p	09:11
flxfoo	stickupkid:no it is not HA	09:32
achilleasa	stickupkid: are you aware of any existing watcher for space topology changes?	11:27
achilleasa	or for subnet doc changes	11:31
achilleasa	the latter is what I actually need	11:32
stickupkid	achilleasa, not that I'm aware of	11:32
achilleasa	stickupkid: there is a lifecycle watcher for subnets	11:35
stickupkid	achilleasa, on develop if you run make static-analysis do you get this issue?	13:17
stickupkid	go/src/github.com/juju/juju/apiserver/facades/client/applicationoffers/state.go:58:2: UserPermission redeclared	13:17
stickupkid	not sure how that's not being picked up, it really should be	13:17
achilleasa	can check in 5'	13:17
stickupkid	wicked	13:18
=== hallback_ is now known as hallback
=== vern_ is now known as vern
=== marosg_ is now known as marosg
=== mskalka_ is now known as mskalka
=== beisner_ is now known as beisner
=== nicolasbock_ is now known as nicolasbock
=== coreycb_ is now known as coreycb
=== skay__ is now known as skay_
=== skay__ is now known as skay
stickupkid	hml, updated my q&a steps https://github.com/juju/juju/pull/11994	14:27
hml	stickupkid: rgr	14:27
=== xnox1 is now known as xnox
stickupkid	hml, approved	14:53
hml	stickupkid: ta	14:53
stickupkid	why is CharmID.Metadata a untyped map[string]string	15:51
stickupkid	?	15:51
stickupkid	just mystery meat	15:51
pmatulis	lol	15:56
tychicus	https://jaas.ai/apache2, it does not appear that the apache2 charm supports the certificates:tls-certificates relation to obtain TLS certificates from vault, is that correct?	16:03
tychicus	If you set bluestore-block-db-size in ceph-osd post deployment, what additional steps are necessary to update the db partition size on all of the osd servers	21:21
tychicus	do you have to remove and re-add the unit to pick up the new setting or can you use osd-out zap-disk and then add-disk	21:23
=== mwhudson_ is now known as mwhudso
=== mwhudso is now known as mwhudson
