/srv/irclogs.ubuntu.com/2022/05/25/#ubuntu-server.txt

teward	bryceh: around?	00:12
teward	oh nvm i can't read	00:13
teward	carry on	00:13
=== xispita is now known as Guest5697
=== xispita_ is now known as xispita
cpaelzer	good morning	05:12
jamespage	coreycb: hey - would you have time to complete the submitter information for https://bugs.launchpad.net/ubuntu/+source/jaraco.context/+bug/1975600	06:22
ubottu	Launchpad bug 1975600 in jaraco.context (Ubuntu) "[MIR] jaraco.context" [Undecided, New]	06:22
jamespage	then I can complete the MIR team review for you	06:22
=== y0sh- is now known as y0sh_
=== xispita is now known as Guest7996
=== xispita_ is now known as xispita
=== thegodsq- is now known as thegodsquirrel
lvoytek	Good morning	12:39
ahasenack	kanashiro: I started this discussion: https://lists.clusterlabs.org/pipermail/users/2022-May/030296.html	13:03
ahasenack	there is a node remove command that works, but I'm kind of leaning towards a full cluster removal when making changes. Depending on what you change, you may get away with it, or you will get phantom data	13:04
ahasenack	`pcs cluster destroy` does a lot of things, it goes over /var/lib/pcsd, /var/lib/pacemaker and removes many files	13:05
kanashiro	ahasenack, maybe the charms would be better using the high-level cluster management tools like pcs and crmsh instead of doing all of this manually (?)	14:20
ahasenack	maybe	14:20
ahasenack	but a bigger change	14:20
ahasenack	I think the key thing is changing nodeid, not just the name	14:20
ahasenack	if you keep the nodeid the same, and then change the name, all is fine (testing that now)	14:20
kanashiro	right, it makes sense, but from one of the answers in the thread I think if you restart first corosync and then pacemaker all should be fine	14:22
kanashiro	did you test that?	14:22
ahasenack	the phantom node is all about pacemaker, yes	14:22
ahasenack	I've been doing "systemctl restart pacemaker corosync", unsure if the order in that command line affects things	14:23
ahasenack	but after the package is installed, both are running, nothing that can be done about that (easily, other than policy-rc.d)	14:23
ahasenack	so the "contamination" with node1 happens right after install	14:24
kanashiro	so a possible minimum change to fix this would be to create a dependency between the pacemaker and corosync systemd services(?)	14:24
ahasenack	I have vague recollections of nish doing that in the past, and suffering a lot	14:24
ahasenack	it involved creating a file in one maintainer script and checking for that file in another maintainer script	14:25
ahasenack	inter-package RPC :)	14:25
kanashiro	if we think this is too much we can at least document this in the server guide, so once we see this happening we can point users to it	14:26
ahasenack	even in the case where you keep the nodeid the same, and crm status is clean, the "node1" node is still referenced in old cib files	14:26
ahasenack	which seems right, if I understand it correctly	14:26
ahasenack	what I don't get yet is, let's say I deploy 3 nodes	14:27
ahasenack	all 3 get node1, nodeid=1 (default pkg install)	14:27
ahasenack	then in node1 I change name to be hostname, keep nodeid=1, adjust ring0_addr	14:27
ahasenack	and add the other 2 nodes to the config, with ids 2 and 3	14:27
ahasenack	and send the config to them via scp, and restart everything	14:27
ahasenack	I don't get why changing nodeid from 1 to 2 and 3 in the other nodes doesn't introduce the same problem	14:28
ahasenack	maybe because nodeid 1 is still around, it just has another name, and is no longer myself	14:28
ahasenack	I go from node1/id1, node1/id1, node1/id1 to f1/id1, f2/id2, f3/id3	14:29
ahasenack	(fN being the new names)	14:29
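The renaming described above (node1/id1 on every host becoming f1/id1, f2/id2, f3/id3) can be sketched as a small Python helper that renders a corosync.conf nodelist section. This is only an illustration: the `render_nodelist` helper and the 10.0.0.x addresses are assumptions, not taken from the discussion; only the names and node ids come from the log.

```python
def render_nodelist(nodes):
    """Render a corosync.conf nodelist section.

    `nodes` is an iterable of (name, nodeid, ring0_addr) tuples.
    """
    lines = ["nodelist {"]
    for name, nodeid, addr in nodes:
        lines += [
            "    node {",
            f"        name: {name}",
            f"        nodeid: {nodeid}",
            f"        ring0_addr: {addr}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


# Names and ids follow the f1/id1, f2/id2, f3/id3 layout from the log;
# the addresses are placeholders (assumption).
nodes = [
    ("f1", 1, "10.0.0.1"),
    ("f2", 2, "10.0.0.2"),
    ("f3", 3, "10.0.0.3"),
]
print(render_nodelist(nodes))
```

The same rendered section would then be copied to every node, as in the scp step mentioned above.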
kanashiro	I think that in this case the cluster has quorum and they vote to make sure that node does not exist. In a single-node cluster I am not sure when to consider it quorate	14:30
ahasenack	if I change node1/id1 to f1/id101, then node1 is still in the list, but offline, even with the 3 nodes	14:30
ahasenack	f1/id101 does not replace node1/id1	14:30
ahasenack	and in reality, id1 really disappeared from the cluster in that case, no other node assumed id1	14:31
ahasenack	hence it shows offline	14:31
ahasenack	by "disappeared" I mean there is no host anymore responding to pings on id1	14:31
ahasenack	ok, I may be starting to get this	14:31
ahasenack	the charm does change the node ids too	14:31
ahasenack	from 1 to 1001 or something like that	14:32
ahasenack	2 to 1002, and so on	14:32
ahasenack	the approach they took to fix it might be the simplest one after all. Pre-seed a config file	14:32
ahasenack	it's like one of the responses in the thread, don't start pacemaker until the config is final	14:32
ahasenack	achieves the same	14:32
kanashiro	I think that's the main takeaway here: do not start pacemaker until everything in corosync is set	14:35
ahasenack	each cib-N.raw file in /var/lib/pacemaker/cib/ is like a state, right. I can diff between them to see what changed	14:45
ahasenack	there is probably a corosync/pacemaker (or crmsh/pcs?) command to show that, I've seen some "diff" commands in some help output	14:45
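Until the dedicated command is tracked down, the idea of diffing two CIB state files can be sketched with Python's standard difflib. This is a plain text diff, not a pacemaker-aware one; the `diff_text` helper and the sample file names in the usage note are assumptions for illustration.

```python
import difflib
import sys
from pathlib import Path


def diff_text(old, new, old_name="old", new_name="new"):
    """Return a unified diff between two strings (e.g. two CIB XML dumps)."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=old_name,
            tofile=new_name,
        )
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py OLD_FILE NEW_FILE
    old_path, new_path = sys.argv[1], sys.argv[2]
    print(diff_text(Path(old_path).read_text(), Path(new_path).read_text(),
                    old_path, new_path), end="")
```

It could be run against two files from /var/lib/pacemaker/cib/, for example cib-1.raw and cib-2.raw (hypothetical file numbers), to see what changed between two states.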
ahasenack	messing with these attributes in a live cluster is dangerous	14:55
ahasenack	May 25 14:54:59 f3 pacemaker-controld[6239]: warning: Node 'node1' and 'f1' share the same cluster nodeid: 1 f1	14:56
ahasenack	May 25 14:54:59 f3 pacemaker-controld[6239]: error: crm_find_peer: Forked child 6391 to record non-fatal assert at membership.c:590 : member weirdness	14:56
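The warning above is about two node names claiming the same nodeid. A rough sketch of the same check, done offline on a list of (name, nodeid) pairs, could look like this; the `shared_nodeids` helper is hypothetical and is not how pacemaker implements the check.

```python
from collections import defaultdict


def shared_nodeids(entries):
    """Return {nodeid: [names]} for every nodeid claimed by more than one name."""
    by_id = defaultdict(set)
    for name, nodeid in entries:
        by_id[nodeid].add(name)
    return {nodeid: sorted(names) for nodeid, names in by_id.items() if len(names) > 1}


# 'node1' is the leftover default name, 'f1' the new name with the same id.
print(shared_nodeids([("node1", 1), ("f1", 1), ("f2", 2), ("f3", 3)]))
# → {1: ['f1', 'node1']}
```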
sergiodj	kanashiro: hey, is https://bugs.launchpad.net/ubuntu/+source/openvpn/+bug/1975574 the bug you mentioned you were going to take a look during our housekeeping call today?	19:43
ubottu	Launchpad bug 1975574 in openvpn (Ubuntu Kinetic) "OpenSSL 3.0 support in OpenVPN 2.5" [High, Confirmed]	19:43
ahasenack	sounds like it	19:43
sergiodj	I will mark it as server-todo and bump its priority to high, just in case	19:44
sergiodj	ah, sorry	19:44
sergiodj	Lucas already did that, but I had opened the URL before his update	19:44
sergiodj	kanashiro: nevermind :)	19:44
kanashiro	:)	19:57
giu--	hi to all	21:55
