[00:06] <cmaloney> Oh that's fun
[02:41] <Mooncairn> I found a site with S20 specs. Have no idea if they're definitive.
[02:42] <Mooncairn> Supposedly, "ECC [is] supported by processor" and it comes with ECC memory.
[02:42] <Mooncairn> Haven't been able to track down a BIOS guide yet.
[02:49] <Mooncairn> Can confirm ECC mention in user guide. Unfortunately, only a single reference with no info on turning it on or off.
[02:50] <Mooncairn> With all cores turned on and almost 6 hours of testing (1 pass complete and 2nd is not quite done), no errors.
[02:50] <Mooncairn> Unfortunately, turning all cores on makes the CPU run much hotter. I've got my window open just to keep CPU temp down to 75C.
[02:52] <Mooncairn> Whatever's going on, it's sporadic enough that I didn't really notice issues until, what?, a week or two after ditching KDE and switching to GNOME.
[04:26] <cmaloney> ye gads
[14:32] <Mooncairn> I'm at a loss as to what to do with my computer.
[14:35] <jrwren> have you reinstalled OS or tried a different OS? maybe part of libc is corrupted? Have you installed thermal sensors, maybe it is overheating?
[14:35] <Mooncairn> Yes, I switched from Debian to Ubuntu. Completely fresh install.
[14:36] <Mooncairn> The CPU has thermal sensors. What other kind is there that I could install?
[14:37] <jrwren> no idea, but 75C does seem high for idle. what kind of CPU is it?
[14:37] <Mooncairn> The CPU can get close to 80C when all 8 virtual cores are running on a task. I try to configure things to not use all 8 cores at once.
[14:37] <Mooncairn> The 75C is during a test performed by memtest.
[14:37] <Mooncairn> Normally, the CPU idles around 38C.
[14:39] <Mooncairn> Some of memtest's tests actively use only one core and keep the rest in a busy loop, which I suspect generates a lot of heat.
[14:40] <Mooncairn> Last night I disabled hyperthreading in the BIOS and restarted memtest with only the 4 primary cores. Max temp is 77C during tests.
[14:40] <Mooncairn> With the bedroom window closed. ;-)
[14:40] <jrwren> I've no idea. Sounds like a bummer to me.
[14:41] <Mooncairn> Yes, it is.
[14:43] <Mooncairn> I'll have to try booting into Ubuntu's emergency mode and see if some of the errors I was seeing Monday were recorded in the logs.
[14:43] <Mooncairn> They all looked memory-related, IIRC.
[14:48] <jrwren> maybe memory is bad but memtest just isn't finding it?
[14:49] <jrwren> are there multiple dimms? have you tried eliminating one of them?
[14:49] <Mooncairn> There's 6 dimms. I haven't tried removing any of them yet.
[14:51] <Mooncairn> Nothing in kern.log. In syslog, a few failed assertions in gnome-terminal close to the time of the problems.
[14:52] <Mooncairn> I can see Ubuntu gearing up for a crash report on gnome-terminal, which checks out with what I remember happening.
[14:52] <Mooncairn> The thing in the log immediately before the crash notification is systemd starting the chezmoi snap.
[14:57] <Mooncairn> Nothing about the other GNOME errors that were popping up, and no ~/.xsession-errors file to check.
[14:58] <Mooncairn> So... Pull a DIMM (or two? not sure if these are paired), wait up to two weeks to see if anything else happens?
[15:20] <Mooncairn> Ok, this is a wild shot in the dark, but I've got nothing better: Could a font cause gnome-terminal and gnome-shell to go nuts? I had very recently installed some fonts in my .local/fonts dir around the time things started to go wild.
[15:20] <Mooncairn> So, (malicious?) software problem after all? (Yea, I know it's a bit of a reach.)
[15:24] <jrwren> yes, it is possible, but you said it was messed up even with previous distro.
[15:26] <Mooncairn> There are LOTS of mentions of the fonts in the syslog. Failure to create cairo scaled font and out of memory errors for gnome-terminal.
[15:26] <cmaloney> The OOM issues are a bit worrying
[15:26] <Mooncairn> I never was able to pin down what was wrong with the previous distro. Just lots of small random things breaking.
[15:27] <Mooncairn> "scaled_font status is: out of memory"
[15:27] <Mooncairn> A bunch of that in the log
[15:27] <cmaloney> Do you have a spare drive to reinstall onto?
[15:28] <Mooncairn> Not really.
[15:28] <cmaloney> try this: create a new user
[15:28] <cmaloney> and start with that
[15:28] <cmaloney> copy things over from your old home dir as needed
[15:28] <cmaloney> see if that fixes things
[15:29] <cmaloney> barring that, move your home dir to old_home and start fresh
[15:30] <Mooncairn> That's what I've been doing with the new install.
[15:30] <Mooncairn> I haven't really moved anything from my old home yet.
[15:31] <Mooncairn> I was slowly building up my new home from scratch. I intended to keep as much config as possible in git with scripts to install the packages I use.
[15:35] <Mooncairn> Okay, I'm fully booted up and logged in. I did rename the fonts files with a .bak extension. One of the ttf files was invalid anyway. It had the diagnostic output from either curl or wget (I forgot which I used) rather than the actual font data.
[15:35] <Mooncairn> First thing off the bat is I get another system warning/crash.
[15:36] <Mooncairn> "aptitude-curses has stopped unexpectedly"
[15:36] <Mooncairn> I hope that's just left over from Monday. I did see that error about aptitude-curses in the syslog.
[15:38] <Mooncairn> I'm going to assume that's a leftover for the moment.
[15:39] <Mooncairn> I'm thinking about sliding the new font(s) back into place and seeing if everything goes to hell again.
[15:40] <Mooncairn> Ok, that was wishful thinking. Ubuntu is reporting errors with gnome-shell.
[15:44] <Mooncairn> SIGSEGV, strangely my desktop is still functioning (so far).
[15:50] <Mooncairn> syslog and the timestamp on the bogus ttf file show that the creation of the file and things going to hell started within the same 60 second period.
[15:53] <Mooncairn> So, correlation but not necessarily causation? The problems noted by gnome-shell started about two minutes afterward. I had had enough time at that point to search duckduckgo for information on signal 7 (BUS).
[15:53] <Mooncairn> Perhaps as much as 6 minutes, actually.
[15:56] <Mooncairn> The "out of memory" errors are exclusively related to the fonts. I can find no other OOM messages in Ubuntu's logs or in my old Debian install's logs.
[16:01] <Mooncairn> Ok, it's definitely the fonts.
[16:02] <Mooncairn> I just slid one of the TTF files back into place, changed gnome-terminal's font to it, and BOOM...
[16:07] <Mooncairn> The gnome-shell errors on Monday still don't necessarily make sense, though. The ttf file that gnome-terminal blew up on today had been installed a full 13 minutes before things went sideways.
[16:10] <Mooncairn> Okay, I guess all I can do right now is to keep monitoring things w/o the offending font files and hope the fonts were the problem. (I'm not sure if I fully believe that, but it's all I have right now w/o further evidence.)
[16:11] <Mooncairn> Thanks for bearing with me on this.
[16:16] <cmaloney> No problem.
[16:17] <cmaloney> Hoping that fixes things
[16:17] <cmaloney> OOM and corrupted files are no fun to diagnose
[18:45] <Mooncairn> Okay, 
[18:46] <Mooncairn> Reinstalled the fonts and installed the remaining, this time downloading through Firefox and then installing through GNOME's font manager rather than manually downloading to ~/.local/share/fonts.
[18:47] <Mooncairn> So far, so good. gnome-terminal has not crashed.
[19:22] <cmaloney> nice
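
The bogus ttf file mentioned at 15:35 (downloader output saved under a font name) is the kind of thing a quick signature check could catch before the file ever reaches the font directory. What follows is only a minimal sketch under stated assumptions: the function names `sniff_font` and `find_suspect_fonts` are hypothetical, the signature table covers only the common sfnt container tags, and a matching signature does not prove the rest of the file is a valid font.

```python
from pathlib import Path

# First four bytes of common font containers (a subset, not exhaustive).
FONT_MAGICS = {
    b"\x00\x01\x00\x00": "TrueType",
    b"OTTO": "OpenType (CFF outlines)",
    b"true": "TrueType (Apple)",
    b"ttcf": "TrueType/OpenType collection",
}

FONT_SUFFIXES = {".ttf", ".otf", ".ttc"}


def sniff_font(path):
    """Return a container name if the file starts with a known font tag, else None."""
    with open(path, "rb") as f:
        head = f.read(4)
    return FONT_MAGICS.get(head)


def find_suspect_fonts(font_dir):
    """List files with a font extension whose first bytes are not a known font tag."""
    suspects = []
    for p in sorted(Path(font_dir).expanduser().rglob("*")):
        if p.is_file() and p.suffix.lower() in FONT_SUFFIXES:
            if sniff_font(p) is None:
                suspects.append(p)
    return suspects


if __name__ == "__main__":
    # Directory taken from the log; adjust if fonts live elsewhere.
    for path in find_suspect_fonts("~/.local/share/fonts"):
        print(f"does not look like a font: {path}")
```

A file holding an HTML error page or curl/wget progress text would be reported here, since its first four bytes match none of the tags. Truncated or otherwise damaged fonts with an intact header would still pass, so this is a first filter rather than a validator.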