/srv/irclogs.ubuntu.com/2020/09/25/#juju.txt

[02:34] <hpidcock> wallyworld: https://github.com/juju/systems/pull/1 please take a look
[02:34] <wallyworld> sure
[02:40] <hpidcock> wallyworld: thanks, if we are happy with how this works I can move forward
[02:58] <wallyworld> hpidcock: did you copy the channel code from snapd?
[03:01] <hpidcock> wallyworld: mostly, excluding architecture stuff
[03:07] <wallyworld> hpidcock: +1 but with fixes
[03:07] <wallyworld> architecture stuff included
[03:07] <wallyworld> plus system string parsing
[03:28] <hpidcock> wallyworld: thanks for the thorough review
[03:28] <wallyworld> np
[04:00] <hpidcock> wallyworld or kelvinliu: please review this PR https://github.com/juju/charm/pull/321
[04:00] <wallyworld> otp with tom, soon
[04:01] <hpidcock> wallyworld: no rush
[04:03] <kelvinliu> looking
[04:25] <kelvinliu> hpidcock: lgtm ty
[04:26] <hpidcock> kelvinliu: much appreciated. wallyworld can you take a quick look too?
[04:33] <wallyworld> ok
[04:37] <wallyworld> hpidcock: since we've gone to v8, we can drop dependencies.tsv IMO
[04:37] <hpidcock> sounds good
[04:38] <hpidcock> wallyworld: doneski
[04:49] <wallyworld> hpidcock: mainly missing tests
[04:49] <wallyworld> since it is a library, we should be careful to have coverage
=== salmankhan1 is now known as salmankhan
[19:31] <tychicus> with the mysql-innodb-cluster is there a way to get logged in to the mysql instance?
[19:31] <tychicus> with the regular mysql charm you could do something like mysql -u root -p`sudo cat /var/lib/mysql/mysql.passwd`
[20:35] <tychicus> also, is it expected that if you remove a mysql-innodb-cluster unit you will get "MySQL InnoDB Cluster not healthy: None"
[20:35] <tychicus> when you add a new unit but it receives the same ip address as the unit that was previously removed?
[20:36] <tychicus> it is also known that if you accidentally juju add-unit mysql-innodb-cluster --to 1
[20:37] <tychicus> and then do a juju remove-unit mysql-innodb-cluster/1
[20:37] <tychicus> it will not uninstall mysql or remove that node from the mysql cluster
[20:38] <tychicus> I guess I'm wondering if these are known/expected behaviors or should be filed as bugs against the charm
[21:28] <petevg[m]> tychicus: those do sound like bugs. Please file them!
[21:29] <tychicus> petevg[m]: thanks, will do
[21:42] <pmatulis> tychicus, the innodb charm does expect 3 units, so the 'not healthy' msg seems expected
[21:43] <tychicus> it happens when 1 unit has been removed and you attempt to add a new unit back
[21:44] <tychicus> if you deploy to, say, a lxd container and that lxd container receives an ip address that was previously a member of the cluster, it will never join the cluster
[21:44] <tychicus> in this case MaaS is handing out the IP addresses
[21:45] <tychicus> if you add another unit, the next unit with an IP address that has never been a member of the cluster will join just fine
[21:45] <pmatulis> i'm curious as to why you removed a unit to begin with
[21:46] <pmatulis> also, there are some actions for removing/adding cluster members
[21:46] <pmatulis> https://opendev.org/openstack/charm-mysql-innodb-cluster/src/branch/master/src/actions.yaml
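
For reference, the charm's actions can be listed and invoked through the juju CLI. A minimal sketch of the general pattern only; the placeholder action and parameters are illustrative, and the real names are in the actions.yaml linked above:

    juju actions mysql-innodb-cluster
    juju run-action --wait mysql-innodb-cluster/leader <action> [key=value ...]
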
[21:46] <tychicus> all of the instances were taken down
[21:47] <tychicus> then came back up, but did not seem to recluster once they were all back online
[21:49] <tychicus> so I ran juju run-action mysql-innodb-cluster/leader reboot-cluster-from-complete-outage
[21:49] <tychicus> this brought up 2 of the 3 nodes
[21:50] <tychicus> I attempted to do a rejoin-instance address=10.x.x.x
[21:52] <tychicus> GTID set check of the MySQL instance at '10.x.x.x:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster
[21:52] <tychicus> so then I did a remove-instance address=10.x.x.x
[21:53] <tychicus> which told me that force was null and expected a bool
[21:53] <tychicus> so I retried with force
[21:54] <tychicus> then attempted both rejoin-instance and add-instance
[21:54] <tychicus> but neither of those worked
[21:54] <tychicus> so I figured that I would remove the unit and add a new one
[21:56] <tychicus> but I accidentally added it as a container to a host that already had a mysql-innodb-cluster container running; not wanting to have 2 of my 3 mysql instances on the same physical host, I removed that unit
[21:56] <thedac> tychicus: Hi, trying to follow all that went down here. When a new node has an old IP address it is a bit complicated.
[21:57] <thedac> The cluster-status action is your friend here. That will tell you what the cluster *thinks* is in its metadata.
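
A minimal sketch of that cluster-status call, assuming the same run-action syntax used elsewhere in this log; it reports the cluster's own view of its members:

    juju run-action --wait mysql-innodb-cluster/leader cluster-status
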
[21:57] <tychicus> yes,
[21:57] <thedac> In normal operations you would remove a unit; run the remove-instance action with the old address; then add a new unit
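
Sketched as commands, that normal sequence would look roughly like the following (the unit number and address are placeholders):

    juju remove-unit mysql-innodb-cluster/1
    juju run-action --wait mysql-innodb-cluster/leader remove-instance address=10.x.x.x
    juju add-unit mysql-innodb-cluster
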
[21:58] <tychicus> yes, that is what I did
[21:58] <tychicus> but when I added a new unit, MaaS gave me the same IP address as the old unit
[21:59] <thedac> What I would have been interested in is what cluster-status showed before you added the new unit, as well as what it says now
[22:00] <tychicus> I'll see if I can get that into a pastebin for you
[22:00] <tychicus> do you just want cluster-status, or also cluster-rescan?
[22:01] <thedac> cluster-status, with some idea about which IP addresses are the nodes that have been in the cluster the whole time
[22:03] <thedac> To answer your original question, you can get the passwords to access the local mysql and the cluster (clusteruser) from leader-get: `juju run --unit mysql-innodb-cluster/leader leader-get`
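
A hedged sketch of putting those credentials to use; the key names leader-get prints and whether the cluster user is reachable from outside the unit are assumptions, so adjust to what leader-get actually shows:

    juju run --unit mysql-innodb-cluster/leader leader-get
    # then, from a host with the mysql client that can reach a cluster node
    # (the user name comes from the log above; the password key name is illustrative):
    mysql -u clusteruser -p<cluster-password-from-leader-get> -h <unit-ip>
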
[22:04] <tychicus> oh that is fantastic
[22:05] <tychicus> I'll include some of that output as well since it will have more IP address "history"
[22:05] <thedac> Just make sure to sanitize the output for public consumption
[22:11] <tychicus> https://pastebin.ubuntu.com/p/tkFyCDDfvk/
[22:11] <thedac> Ok, give me a sec to try and understand this.
[22:12] <tychicus> :)
[22:12] <tychicus> thanks
[22:20] <thedac> tychicus: OK, I think we can handle this. But I do want to make sure you have taken a backup. Anytime we are hitting the remove-instance button there is the possibility of human error.
[22:20] <thedac> You have a backup, right? :)  And please double and triple check what I type.
[22:20] <thedac> First we will remove the 33.1 node. It is actually good that it is still running: `juju run-action --wait mysql-innodb-cluster/leader remove-instance address=10.100.33.1`
[22:20] <thedac> In theory we don't need `force` because the node is currently running.
[22:20] <thedac> Once that is complete, do another cluster-status to confirm 33.1 is NOT in the cluster; at which point you can stop mysql on the physical host and manually uninstall.
[22:23] <tychicus> I do have a backup, but the last time I tried to restore from backup I ran into some other GTID issue, but that's another story
[22:23] <tychicus> https://pastebin.ubuntu.com/p/Sgx2Y3XVd4/
[22:24] <thedac> OK, the required force may be another bug we can file. Let's try with force=True
[22:24] <tychicus> should I try first with force=False?
[22:25] <thedac> Oh, good point. Yes. Let's try that first
[22:25] <tychicus> so it says outcome success, status completed
[22:26] <tychicus> so now I have 3 nodes in cluster status
[22:26] <thedac> and no 33.1 node?
[22:26] <tychicus> correct
[22:27] <thedac> You are safe to stop mysql on the physical node and use apt to remove the mysql packages, and lastly you can remove the /var/lib/mysql dir if you want.
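
A rough sketch of that cleanup on the old node; the exact package names are an assumption, so check what is installed first:

    sudo systemctl stop mysql
    dpkg -l | grep -i mysql                    # confirm which mysql packages are installed
    sudo apt remove --purge mysql-server-8.0   # package name is an assumption
    sudo rm -rf /var/lib/mysql                 # optional, only once you are sure the data is not needed
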
[22:27] <thedac> Two potential bugs we can file here. I agree "MySQL InnoDB Cluster not healthy: None" isn't helpful; it should display the statusText: "Cluster is ONLINE and can tolerate up to ONE failure." The other is why the original reboot-cluster-from-complete-outage failed, which may be harder to ascertain at this point.
[22:28] <tychicus> ok I have the answer to why it failed
[22:28] <tychicus> GTID conflict
[22:29] <tychicus> for some reason the unit that had previously been the RW was no longer the RW
[22:30] <thedac> Ok, I had thought (naively) that mysql was smart enough to use the node with the newest data. But that is more to do with mysql itself than our charms. Still, it might be worth documenting what happened so we can keep an eye on it.
[22:30] <tychicus> I don't think that I have those logs any more, but I do have the output of when I attempted to get it to rejoin the cluster
[22:30] <thedac> OK
[22:31] <tychicus> https://pastebin.ubuntu.com/p/jn92h2xQgk/
[22:32] <thedac> Thanks
[22:33] <tychicus> I may be wrong here, but I don't think that the add-instance option supports recoverymethod=clone
[22:34] <tychicus> one thought that I did have was to dump the database from the new RW and restore it to the instance that would not rejoin
[22:34] <tychicus> but alas I didn't try that
[22:36] <thedac> I'll see if we can use the recoverymethod=clone option in add-instance or even rejoin-instance. I'll have to dig into the mysql docs.
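
For comparison, outside the charm the underlying MySQL Shell AdminAPI call would look roughly like this; a sketch only, the URI, addresses, and password handling are illustrative, and going through mysqlsh directly bypasses the charm's own bookkeeping:

    mysqlsh --js clusteruser@<rw-node-ip>:3306 \
      -e "dba.getCluster().addInstance('clusteruser@10.x.x.x:3306', {recoveryMethod: 'clone'})"
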
[22:44] <tychicus> thanks again for your help
