[00:13] sarnold: yo be alive
[00:14] i need to pick your brain
[00:15] heya teward
[00:15] incoming PM flood
[14:34] ahhh :-( bind9 keeps segfaulting and segfaulting, no idea why
[14:35] and it creates a very big "core" file
[14:35] 800M
[14:35] 800M on master
[14:35] 3.6G on slave!
[14:35] dmesg/kern.log will usually give you the immediate reason
[14:36] how do I pip stderr to file?
[14:37] pipe
[14:37] your-command 2>filename
[14:41] Aison0: pastebin output of dmesg -T, or at least the relevant lines
[14:41] dmesg shows simply nothing
[14:45] but this is the whole bind9 output from start until crash
[14:45] https://paste.ubuntu.com/p/vQZqkMxZKN/
[14:46] how long does it take?
[14:46] 10-15 seconds?
[14:46] but it depends
[14:46] sometimes longer
[14:54] I suspect you have filesystem corruption or a hardware fault there
[14:54] Especially if dmesg output really is blank
[14:55] it's not segfaulting but tripping on an assert()
[14:55] Aison0: I'd start out with monitoring the process with top or htop and look for memory usage. with the sizes of those dumps, it smells like a memory leak
[14:56] if those bind servers are caching resolvers, this could explain the size of the core dumps. The assertion is worrying though
[14:57] still, if it takes so short a time for them to crash, it should be easy to just follow the mem usage for both process and system (and swap, of course)
[14:57] but then, if that happened, there should be an OOM showing up in dmesg
[14:57] RoyK, it really depends. Now it works for several minutes
[14:57] it also worked overnight
[14:58] Aison0: how many clients do you have, using that server?
[14:58] before my message, it started crashing every few seconds
[14:58] or those
[14:58] RoyK, around 200
[14:58] not a lot, then
[14:58] no
[14:58] it also worked for a long time now
[14:59] this setup is not new
[14:59] has there been a bind update recently?
[14:59] rbasak, I don't think it is a hardware problem. It happens on primary and secondary server, which are completely different
[15:00] agreed - this does *not* smell hardware issues
[15:06] what is this "core" file good for?
[15:07] understanding the state of the system at the time the problem occurred
[15:10] Hello, is there a chance to use LVM raid during installation?
[15:12] Aison0: you can run 'gdb bind core.xxx' and then run a backtrace to check where it failed. it'll normally require symbols, though, which may not be there
[15:13] Aison0: I guess that'll be gdb named core.xxx, though
[15:46] *pokes rbasak* got a few minutes?
[15:52] are there any ppa with newer versions of bind that I can try?
[15:56] Aison0: we usually advise to use packages from the repos on ubuntu, specific for your ubuntu version
[15:56] and/or snaps
[15:58] Aison0: https://launchpad.net/~isc/+archive/ubuntu/bind seems reputable enough
[15:59] lotuspsychje: I agree with you but in this case, bind9 (9.16.1) is tripping on assert() and ISC upstream fixed a bunch of assertions in later 9.16.X
[15:59] ah nice, yeah some cases might be useful
[16:00] nice find sdeziel
[16:02] sudo add-apt-repository ppa:isc/bind
[16:02] Cannot add PPA: 'ppa:~isc/ubuntu/bind'.
[16:02] ERROR: '~isc' user or team does not exist.
[16:02] :P
[16:04] rofl, bind just crashed again
[16:04] that's why it is not working ^'^'^
[16:25] make sure you have software-properties-common installed, ppa:isc/bind is working here.
[16:32] teward: o/
=== coconut__ is now known as coconut
=== StathisA_ is now known as StathisA
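
The 14:37 answer (`your-command 2>filename`) can be illustrated with a small sketch. This is not what was run in the channel: `noisy` is a hypothetical stand-in for the real command, and the file names are made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for the real command: writes one line to stdout
# and one line to stderr.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

workdir=$(mktemp -d)

# 2>file sends only stderr to the file; stdout still goes to the terminal.
noisy 2>"$workdir/stderr.log"

# >file 2>&1 sends both streams to the same file. The order matters:
# stdout is redirected first, then stderr is pointed at the same place.
noisy >"$workdir/all.log" 2>&1
```

The second form is probably the more useful one when pasting a daemon's full startup output, since assertion messages and ordinary log lines then end up in one file.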
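
The suggestion at 14:55 to watch memory usage with top or htop could also be done non-interactively, so that the numbers survive a crash. The following is a rough sketch under assumptions: `log_mem` is a hypothetical helper, the demo samples the shell itself, and on the server the PID would have to come from something like `pidof named`.

```shell
#!/bin/sh
# Log resident and virtual memory (in KiB) of one process at fixed intervals.
# Arguments: PID, number of samples, seconds between samples, log file.
log_mem() {
    pid=$1; samples=$2; interval=$3; logfile=$4
    i=0
    while [ "$i" -lt "$samples" ]; do
        # "rss=" and "vsz=" suppress the header line; ps exits non-zero
        # once the process is gone, which ends the loop.
        usage=$(ps -o rss= -o vsz= -p "$pid") || break
        echo "$(date +%H:%M:%S) $usage" >>"$logfile"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Demo: sample this shell itself three times, one second apart.
memlog=$(mktemp)
log_mem "$$" 3 1 "$memlog"
cat "$memlog"
```

A steadily growing RSS column shortly before each crash would support the memory-leak theory; a flat one would point back at the assertion itself.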
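
The gdb advice from 15:12 and 15:13 can be scripted so that the backtrace is easy to paste. This is only a sketch: `backtrace` is a hypothetical wrapper, `/usr/sbin/named` is assumed to be where the binary lives, `core.xxx` is the placeholder name used in the channel, and without debug symbols the output may be of limited use.

```shell
#!/bin/sh
# Print a backtrace from a core file without entering interactive gdb.
# Arguments: path to the binary that crashed, path to the core file.
backtrace() {
    binary=$1; core=$2
    command -v gdb >/dev/null 2>&1 || {
        echo "gdb is not installed" >&2
        return 127
    }
    [ -r "$core" ] || {
        echo "core file not readable: $core" >&2
        return 1
    }
    # -batch exits after running the commands; -ex runs one gdb command.
    gdb -batch -ex "bt" "$binary" "$core"
}

# Example call (assumed paths, adjust to the real core file name):
# backtrace /usr/sbin/named core.xxx 2>&1 | tee backtrace.txt
```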