[06:32] PR snapd#7424 closed: fixme: move snapfrompid into osutils
[07:38] PR snapd#7424 opened: fixme: move snapfrompid into osutils
[09:29] hi all, a raspberry pi that i maintain went into a reboot loop recently. i reflashed the sd card and it now works, so i'm guessing it's probably not a h/w problem. any ideas why this might have happened? it was previously running the same s/w for months on end without incident.
[09:31] it's annoying because it's a remote system running a domestic power control system and i don't have direct access to it (physical access requires either a 900 mile drive or having the pi sent in the post)
[09:33] i suspect that snappy/core auto-update might have been to blame, but it's also possible that the sd card has become unreliable
[10:36] rogpeppe: hey
[10:37] rogpeppe: do you still have the card contents from the reboot loop?
[10:37] rogpeppe: we could perform analysis to determine that
[10:37] zyga: no, sorry, i had to overwrite them
[10:37] zyga: i shoulda copied the card contents
[10:37] rogpeppe: I see, it's too bad because it might have been interesting to understand the cause
[10:37] zyga: yeah, i thought so too
[10:38] rogpeppe: next time please "dd" the card to disk for safe-keeping
[10:38] rogpeppe: if this happens again please let us know
[10:38] we saw a few events like that but in all cases those bugs were eventually fixed
[10:38] zyga: will do. i wanted to look at syslog but i'm not sure how to view journalctl logs via an sd card
[10:39] rogpeppe: I'm not sure either but there's probably a way somehow
[10:39] zyga: it's quite possible this hardware is unreliable - i've just seen another random reboot
[10:39] rogpeppe: in one case we found a bug in the FAT encoding between the bootloader, linux and some dos editing tools from linux
[10:39] zyga: i'm moving onto another rpi (a rpi 3 this time) and keeping fingers crossed
[10:39] rogpeppe: so the range of possible problems is wide
[10:40] rogpeppe: make sure to use a good power adapter, PI 3B+ uses way more energy and needs a reliable source
[10:40] zyga: ok, that's interesting. how many amps does it need?
[10:40] rogpeppe: from the top of my head you need 2.5 or 3A
[10:41] rogpeppe: I ordered a new dedicated power supply for this reason, just waiting for the package to arrive
[10:41] zyga: one thing that concerns me about my current setup: the snap that's doing the work prints quite a bit to stdout (logging messages) and that ends up churning the sd card quite a bit over the course of months.
[10:41] rogpeppe: mmm
[10:41] rogpeppe: it's a common complaint actually,
[10:41] zyga: there have been a couple of occasions when i think an unexpected power outage has resulted in a corrupted sd card that's unable to boot
[10:42] there used to be some settings to disable rsyslog and make journal only log to memory but I think those never got released
[10:42] zyga: at least, that's my tentative diagnosis
[10:42] zyga: i'm thinking of installing some kind of battery-backed UPS device to avoid that somewhat
[10:43] rogpeppe: maybe just a powerbank?
[10:43] some powerbanks can be charged and used at the same time
[10:43] it's a low cost solution for sure :)
[10:45] zyga: yeah, maybe. it needs to work reliably over a period of years
[10:46] zyga: and deal ok with power outages on the order of days
[10:46] days?
[10:46] I have a pi zero wireless
[10:46] that's powered from a power bank with USB-C power delivery
[10:46] zyga: yeah, this is in a fairly remote area of scotland and power outages aren't uncommon
[10:46] a larger one actually
[10:46] it lets the pi run for about a week
[10:47] I didn't try it with a more power-hungry device
[10:47] it's neat because I can use USB-C to charge the battery
[10:47] zyga: the other thing i'm considering is a small pi zero or something to act as a monitor and reboot if needed, because i've seen some hangups where the pi doesn't autoreboot on crash
[10:47] while powering the PI or other devkit from the other connector
[10:47] rogpeppe: yeah, the pi doesn't have a watchdog
[10:47] so it's not reliable in this sense unfortunately
[10:48] zyga: i remember seeing a UPS device that also included a real time clock, which would be ideal, but i can't seem to find it now
[10:48] gee, I wish USB battery packs could convey their charge over USB
[10:49] ha, that would be cool
[10:50] yeah
[10:50] too bad USB is such a mess
[10:50] I think USB-PD is better in this regard
[10:50] at least the pi doesn't seem to go back in time any more - that was a real pain for my app
[10:52] rogpeppe: that's another bug
[10:52] a combination of bugs actually
[10:52] pi has no RTC with battery power
[10:52] hmm, that's twice now that this pi has scheduled a reboot and not managed to come out of it
[10:52] so each reboot it's back to 1970s
[10:52] so we used some timestamps to establish "less insane wrong time"
[10:53] like the stamp on some files in FAT
[10:53] yes, that's definitely better
[10:53] but you really only get working time after you ntp
[10:53] so offline you are still in the past on reboot
[10:53] we also changed things so that time is saved back to fat on reboot
[10:53] so it's not perfect but much better
[10:53] rogpeppe: if you can get any logs that would be good
[10:53] and it just failed again after powering off/on, but then succeeded the next time. i'm a bit concerned about reliability here.
[10:53] if you can image the SD card and help us with forensics that would be perfect
[10:53] zyga: what logs would you look at?
[10:54] many things
[10:54] zyga: it's booted ok again now
[10:54] snap changes to understand what snapd was doing
[10:54] journal to see what the system was doing
[10:54] the full card image to look for FAT corruption or other magic
[10:54] it all depends on initial analysis
[10:54] one small tip
[10:54] if you enable persistent journal
[10:54] you can see what happened in the past better
[10:55] mkdir /var/log/journal
[10:55] and reboot
[10:55] zyga: the first time it rebooted, i got it to reboot again after a couple of tries, and the snap was still half way through installing
[10:55] rogpeppe: that's normal
[10:55] zyga: i think that it decided to reboot again after it had installed (and it failed again)
[10:55] zyga: oh really? ok i guess.
[10:55] rogpeppe: when snapd / kernel or other essential snap updates some updates are postponed until after that one fully finished
[10:55] which may include a reboot
[10:56] but if it reboots like 3 times that's curious
[10:56] if you can ssh and check "snap changes" that would help
[10:56] zyga: i'd hope the card wasn't corrupted, as i've only just flashed it
[10:56] SD cards can permanently fail
[10:56] use a flashing tool that verifies each block
[10:56] like etcher does
[10:56] zyga: when the reboot fails, it fails quickly - i don't get the core splash screen
[10:56] it's useful to spot flaky cards outright
[10:57] do you have "hands on" access to the device now?
[10:57] zyga: yup
[10:57] I can recommend attaching a serial adapter to the pins on the PI
[10:57] zyga: for a few hours before i need to leave
[10:57] it's useful to see what's going on early
[10:57] you can plug that to another device for monitoring
[10:57] zyga: i haven't got a serial adaptor here i'm afraid
[10:57] e.g. even another PI
[10:57] rogpeppe: ah, too bad, it's useful
[10:57] zyga: i should get hold of one
[10:57] USB-TTL adapters, grab a few next time on ebay
[10:58] it's mostly useful to see early boot logic
[10:58] as we log what's going on
[10:58] especially kernel / rootfs selection
[10:58] and boot mode
[10:58] zyga: i presume that's not TTL as in TTL-logic
[10:58] (normal or trying new kernel or new rootfs)
[10:58] they are just called TTL to differentiate from "high" voltage real serial ports
[10:58] on pi it's 3.3V AFAIK
[10:59] i think i'm behind the times on acronyms :)
[10:59] :D
[10:59] FWIW when it fails to boot, it takes about a couple of seconds before the permanent red light comes on
[11:00] which PI variant is tihs?
[11:00] *this
[11:00] speaking of which, I need to place that backorder on Pi 4 finally
[11:00] the 4GB variant is never in stock, it seems
[11:01] if it shows anything is that pi 5 will have an 8GB version
[11:01] and a SATA port ;-)
[11:01] zyga: it's a pi 3, but i can't remember which exact variant. how do i tell?
[11:01] cat /proc/cpuinfo
[11:01] zyga: is this the etcher tool you were talking about? https://www.techspot.com/downloads/6931-etcher.html
[11:01] yeah
[11:01] now owned by balena
[11:02] https://www.balena.io/etcher/
[11:02] still FOSS, just a way to point to a company
[11:02] http://paste.ubuntu.com/p/3m8k66gtVS/
[11:02] it's remarkably good
[11:02] one sec
[11:03] that's PI 3B
[11:04] no, wait
[11:04] * zyga rechecks
[11:04] that could be 3B+
[11:04] does it have a metal can on the CPU?
[11:04] if so that's the 3B+ which needs way more power to work
[11:04] (but is also faster)
[11:05] not sure. i think the cpu might be covered up by the piglow device i've got plugged in
[11:06] oh, you have piglow?
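(Aside on the "how do i tell?" question above: on Raspberry Pi OS-style kernels, /proc/cpuinfo usually carries a "Revision" line, and for new-style codes the board type sits in bits 4-11. The following is a minimal sketch of decoding it, not something taken from the log: the function names are hypothetical, the type table is deliberately partial, and the paste linked above is not reproduced here, so whether its revision matches any entry is an assumption.)

```python
import re

# Board "type" field values (bits 4-11 of a new-style revision code).
# Partial table for illustration; unknown values are reported as such.
BOARD_TYPES = {
    0x04: "2B",
    0x08: "3B",
    0x0C: "Zero W",
    0x0D: "3B+",
    0x11: "4B",
}


def board_from_revision(revision_hex):
    """Decode the board type from a hex revision string such as 'a02082'."""
    code = int(revision_hex, 16)
    if not code & (1 << 23):
        # Bit 23 clear means an old-style code, which has no bit fields.
        return "old-style revision code"
    board_type = (code >> 4) & 0xFF
    return BOARD_TYPES.get(board_type, "unknown type 0x%02x" % board_type)


def board_from_cpuinfo(text):
    """Find the 'Revision' line in /proc/cpuinfo text and decode it."""
    match = re.search(r"^Revision\s*:\s*([0-9a-fA-F]+)\s*$", text, re.MULTILINE)
    if match is None:
        # No Revision line (for example, not running on a Pi).
        return None
    return board_from_revision(match.group(1))
```

(On the device itself this could be called as board_from_cpuinfo(open("/proc/cpuinfo").read()). A revision of a02082 decodes to "3B" and a020d3 to "3B+", which is the distinction being discussed here.)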
[11:06] let me share one thing I made
[11:06] https://github.com/zyga/snappy-pi2-piglow
[11:06] it's for the time when snapd interfaces were still called "snappy skills"
[11:07] ah, i've found the order: Raspberry Pi 3 Model B Quad Core CPU 1.2 GHz 1 GB RAM Motherboard
[11:07] mmm
[11:07] I should update that piglow repo to more recent snappy standards :)
[11:07] i built a nice piglow library in Go
[11:09] https://godoc.org/github.com/rogpeppe/misc/piglow
[11:09] it's designed to work ok if used concurrently
[11:09] and a command too: https://godoc.org/github.com/rogpeppe/misc/cmd/piglow
[11:10] I went to check out the gamma table :D
[11:11] i'd totally forgotten about that until now :)
[11:11] I used https://github.com/zyga/snappy-pi2-piglow/blob/master/sn3218.c#L30
[11:11] i think i probably nicked the gamma table from somewhere else
[11:11] I don't recall how I got those values now
[11:11] I _may_ have used something to measure the output at the time
[11:11] it was when I was still doing hardware fun projects
[11:12] ha, i even gave it credit: // Stolen from github.com/benleb/PyGlow.
[11:17] yeah :)
[11:17] I'm back to tinkering
[11:25] zyga: sadly it seems like my command has rotted and no longer works :(
[15:56] PR snapcraft#2704 closed: extensions: create the gnome-platform directory
[15:56] PR snapcraft#2705 closed: extensions: rename extension classes to known names