/srv/irclogs.ubuntu.com/2020/09/25/#juju.txt

[02:34] <hpidcock> wallyworld: https://github.com/juju/systems/pull/1 please take a look
[02:34] <wallyworld> sure
[02:40] <hpidcock> wallyworld: thanks, if we are happy with how this works I can move forward
[02:58] <wallyworld> hpidcock: did you copy the channel code from snapd?
[03:01] <hpidcock> wallyworld: mostly, excluding architecture stuff
[03:07] <wallyworld> hpidcock: +1 but with fixes
[03:07] <wallyworld> architecture stuff included
[03:07] <wallyworld> plus system string parsing
[03:28] <hpidcock> wallyworld: thanks for the thorough review
[03:28] <wallyworld> np
[04:00] <hpidcock> wallyworld or kelvinliu: please review this PR https://github.com/juju/charm/pull/321
[04:00] <wallyworld> otp with tom, soon
[04:01] <hpidcock> wallyworld: no rush
[04:03] <kelvinliu> looking
[04:25] <kelvinliu> hpidcock: lgtm ty
[04:26] <hpidcock> kelvinliu: much appreciated. wallyworld can you take a quick look too?
[04:33] <wallyworld> ok
[04:37] <wallyworld> hpidcock: since we've gone to v8, we can drop dependencies.tsv IMO
[04:37] <hpidcock> sounds good
[04:38] <hpidcock> wallyworld: doneski
[04:49] <wallyworld> hpidcock: mainly missing tests
[04:49] <wallyworld> since it is a library, we should be careful to have coverage
=== salmankhan1 is now known as salmankhan
[19:31] <tychicus> with the mysql-innodb-cluster is there a way to get logged in to the mysql instance?
[19:31] <tychicus> with the regular mysql charm you could do something like mysql -u root -p`sudo cat /var/lib/mysql/mysql.passwd`
[20:35] <tychicus> also, is it expected that if you remove a mysql-innodb-cluster unit you will get "MySQL InnoDB Cluster not healthy: None"
[20:35] <tychicus> when you add a new unit but it receives the same ip address as the unit that was previously removed?
[20:36] <tychicus> it is also known that if you accidentally juju add-unit mysql-innodb-cluster --to 1
[20:37] <tychicus> and then do a juju remove-unit mysql-innodb-cluster/1
[20:37] <tychicus> it will not uninstall mysql or remove that node from the mysql cluster
[20:38] <tychicus> I guess I'm wondering if these are known/expected behaviors or should be filed as bugs against the charm
[21:28] <petevg[m]> tychicus: those do sound like bugs. Please file them!
[21:29] <tychicus> petevg[m]: thanks, will do
[21:42] <pmatulis> tychicus, the innodb charm does expect 3 units, so the 'not healthy' msg seems expected
[21:43] <tychicus> it happens when 1 unit has been removed and you attempt to add a new unit back
[21:44] <tychicus> if you deploy to, say, a lxd container and that lxd container receives an ip address that was previously a member of the cluster, it will never join the cluster
[21:44] <tychicus> in this case MaaS is handing out the IP addresses
[21:45] <tychicus> if you add another unit, the next unit with an IP address that has never been a member of the cluster will join just fine
[21:45] <pmatulis> i'm curious as to why you removed a unit to begin with
[21:46] <pmatulis> also, there are some actions for removing/adding cluster members
[21:46] <pmatulis> https://opendev.org/openstack/charm-mysql-innodb-cluster/src/branch/master/src/actions.yaml
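
For reference, the charm's actions can be listed and invoked through the juju CLI. A minimal sketch of the general pattern only; the placeholder action and parameters are illustrative, and the real names are in the actions.yaml linked above:

    juju actions mysql-innodb-cluster
    juju run-action --wait mysql-innodb-cluster/leader <action> [key=value ...]
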
[21:46] <tychicus> all of the instances were taken down
[21:47] <tychicus> then came back up, but did not seem to recluster once they were all back online
[21:49] <tychicus> so I ran juju run-action mysql-innodb-cluster/leader reboot-cluster-from-complete-outage
[21:49] <tychicus> this brought up 2 of the 3 nodes
[21:50] <tychicus> I attempted to do a rejoin-instance address=10.x.x.x
[21:52] <tychicus> GTID set check of the MySQL instance at '10.x.x.x:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster
[21:52] <tychicus> so then I did a remove-instance address=10.x.x.x
[21:53] <tychicus> which told me that force was null and expected a bool
[21:53] <tychicus> so I retried with force
[21:54] <tychicus> then attempted both rejoin-instance and add-instance
[21:54] <tychicus> but neither of those worked
[21:54] <tychicus> so I figured that I would remove the unit and add a new one
[21:56] <tychicus> but I accidentally added it as a container to a host that already had a mysql-innodb-cluster container running; not wanting to have 2 of my 3 mysql instances on the same physical host, I removed that unit
[21:56] <thedac> tychicus: Hi, trying to follow all that went down here. When a new node has an old IP address it is a bit complicated.
[21:57] <thedac> The cluster-status action is your friend here. That will tell you what the cluster *thinks* is in its metadata.
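
A minimal sketch of that cluster-status call, assuming the same run-action syntax used elsewhere in this log; it reports the cluster's own view of its members:

    juju run-action --wait mysql-innodb-cluster/leader cluster-status
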
[21:57] <tychicus> yes,
[21:57] <thedac> In normal operations you would remove a unit; run the remove-instance action with the old address; then add a new unit
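
Sketched as commands, that normal sequence would look roughly like the following (the unit number and address are placeholders):

    juju remove-unit mysql-innodb-cluster/1
    juju run-action --wait mysql-innodb-cluster/leader remove-instance address=10.x.x.x
    juju add-unit mysql-innodb-cluster
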
[21:58] <tychicus> yes, that is what I did
[21:58] <tychicus> but when I added a new unit, MaaS gave me the same IP address as the old unit
[21:59] <thedac> What I would have been interested in is what cluster-status showed before you added the new unit, as well as what it says now
[22:00] <tychicus> I'll see if I can get that into a pastebin for you
[22:00] <tychicus> do you just want cluster-status, or also cluster-rescan?
[22:01] <thedac> cluster-status, with some idea about which IP addresses are the nodes that have been in the cluster the whole time
[22:03] <thedac> To answer your original question, you can get the passwords to access the local mysql and the cluster (clusteruser) from leader-get: `juju run --unit mysql-innodb-cluster/leader leader-get`
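
A hedged sketch of putting those credentials to use; the key names leader-get prints and whether the cluster user is reachable from outside the unit are assumptions, so adjust to what leader-get actually shows:

    juju run --unit mysql-innodb-cluster/leader leader-get
    # then, from a host with the mysql client that can reach a cluster node
    # (the user name comes from the log above; the password key name is illustrative):
    mysql -u clusteruser -p<cluster-password-from-leader-get> -h <unit-ip>
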
[22:04] <tychicus> oh that is fantastic
[22:05] <tychicus> I'll include some of that output as well since it will have more IP address "history"
[22:05] <thedac> Just make sure to sanitize the output for public consumption
[22:11] <tychicus> https://pastebin.ubuntu.com/p/tkFyCDDfvk/
[22:11] <thedac> Ok, give me a sec to try and understand this.
[22:12] <tychicus> :)
[22:12] <tychicus> thanks
[22:20] <thedac> tychicus: OK, I think we can handle this. But I do want to make sure you have taken a backup. Anytime we are hitting the remove-instance button there is the possibility of human error.
[22:20] <thedac> You have a backup, right? :)  And please double and triple check what I type.
[22:20] <thedac> First we will remove the 33.1 node. It is actually good that it is still running: `juju run-action --wait mysql-innodb-cluster/leader remove-instance address=10.100.33.1`
[22:20] <thedac> In theory we don't need `force` because the node is currently running.
[22:20] <thedac> Once that is complete, do another cluster-status to confirm 33.1 is NOT in the cluster; at which point you can stop mysql on the physical host and manually uninstall.
[22:23] <tychicus> I do have a backup, but the last time I tried to restore from backup I ran into some other GTID issue, but that's another story
[22:23] <tychicus> https://pastebin.ubuntu.com/p/Sgx2Y3XVd4/
[22:24] <thedac> OK, the required force may be another bug we can file. Let's try with force=True
[22:24] <tychicus> should I try first with force=False?
[22:25] <thedac> Oh, good point. Yes. Let's try that first
[22:25] <tychicus> so it says outcome success, status completed
[22:26] <tychicus> so now I have 3 nodes in cluster status
[22:26] <thedac> and no 33.1 node?
[22:26] <tychicus> correct
[22:27] <thedac> You are safe to stop mysql on the physical node and use apt to remove the mysql packages, and lastly you can remove the /var/lib/mysql dir if you want.
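
A rough sketch of that cleanup on the old node; the exact package names are an assumption, so check what is installed first:

    sudo systemctl stop mysql
    dpkg -l | grep -i mysql                    # confirm which mysql packages are installed
    sudo apt remove --purge mysql-server-8.0   # package name is an assumption
    sudo rm -rf /var/lib/mysql                 # optional, only once you are sure the data is not needed
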
[22:27] <thedac> Two potential bugs we can file here. I agree "MySQL InnoDB Cluster not healthy: None" isn't helpful; it should display the statusText: "Cluster is ONLINE and can tolerate up to ONE failure." The other is why the original reboot-cluster-from-complete-outage failed, which may be harder to ascertain at this point.
[22:28] <tychicus> ok I have the answer to why it failed
[22:28] <tychicus> GTID conflict
[22:29] <tychicus> for some reason the unit that had previously been the RW was no longer the RW
[22:30] <thedac> Ok, I had thought (naively) that mysql was smart enough to use the node with the newest data. But that is more to do with mysql itself than our charms. Still, it might be worth documenting what happened so we can keep an eye on it.
[22:30] <tychicus> I don't think that I have those logs any more, but I do have the output of when I attempted to get it to rejoin the cluster
[22:30] <thedac> OK
[22:31] <tychicus> https://pastebin.ubuntu.com/p/jn92h2xQgk/
[22:32] <thedac> Thanks
[22:33] <tychicus> I may be wrong here, but I don't think that the add-instance option supports recoverymethod=clone
[22:34] <tychicus> one thought that I did have was to dump the database from the new RW and restore it to the instance that would not rejoin
[22:34] <tychicus> but alas I didn't try that
[22:36] <thedac> I'll see if we can use the recoverymethod=clone option in add-instance or even rejoin-instance. I'll have to dig into the mysql docs.
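
For comparison, outside the charm the underlying MySQL Shell AdminAPI call would look roughly like this; a sketch only, the URI, addresses, and password handling are illustrative, and going through mysqlsh directly bypasses the charm's own bookkeeping:

    mysqlsh --js clusteruser@<rw-node-ip>:3306 \
      -e "dba.getCluster().addInstance('clusteruser@10.x.x.x:3306', {recoveryMethod: 'clone'})"
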
[22:44] <tychicus> thanks again for your help
