mup | PR snapd#7424 closed: fixme: move snapfrompid into osutils <Created by ardaguclu> <Closed by ardaguclu> <https://github.com/snapcore/snapd/pull/7424> | 06:32 |
---|---|---|
mup | PR snapd#7424 opened: fixme: move snapfrompid into osutils <Created by ardaguclu> <https://github.com/snapcore/snapd/pull/7424> | 07:38 |
rogpeppe | hi all, a raspberry pi that i maintain went into a reboot loop recently. i reflashed the sd card and it now works, so i'm guessing it's probably not a h/w problem. any ideas why this might have happened? it was previously running the same s/w for months on end without incident. | 09:29 |
rogpeppe | it's annoying because it's a remote system running a domestic power control system and i don't have direct access to it (physical access requires either a 900 mile drive or having the pi sent in the post) | 09:31 |
rogpeppe | i suspect that snappy/core auto-update might have been to blame, but it's also possible that the sd card has become unreliable | 09:33 |
zyga | rogpeppe: hey | 10:36 |
zyga | rogpeppe: do you still have the card contents from the reboot loop? | 10:37 |
zyga | rogpeppe: we could perform analysis to determine that | 10:37 |
rogpeppe | zyga: no, sorry, i had to overwrite them | 10:37 |
rogpeppe | zyga: i shoulda copied the card contents | 10:37 |
zyga | rogpeppe: I see, it's too bad because it might have been interesting to understand the cause | 10:37 |
rogpeppe | zyga: yeah, i thought so too | 10:37 |
zyga | rogpeppe: next time please "dd" the card to disk for safe-keeping | 10:38 |
zyga | rogpeppe: if this happens again please let us know | 10:38 |
zyga | we saw a few events like that but in all cases those bugs were eventually fixed | 10:38 |
rogpeppe | zyga: will do. i wanted to look at syslog but i'm not sure how to view journalctl logs via an sd card | 10:38 |
zyga | rogpeppe: I'm not sure either but there's probably a way somehow | 10:39 |
rogpeppe | zyga: it's quite possible this hardware is unreliable - i've just seen another random reboot | 10:39 |
zyga | rogpeppe: in one case we found a bug in the FAT encoding between the bootloader, linux and some dos editing tools from linux | 10:39 |
rogpeppe | zyga: i'm moving onto another rpi (a rpi 3 this time) and keeping fingers crossed | 10:39 |
zyga | rogpeppe: so the range of possible problems is wide | 10:39 |
zyga | rogpeppe: make sure to use a good power adapter, PI 3B+ uses way more energy and needs a reliable source | 10:40 |
rogpeppe | zyga: ok, that's interesting. how many amps does it need? | 10:40 |
zyga | rogpeppe: from the top of my head you need 2.5 or 3A | 10:40 |
zyga | rogpeppe: I ordered a new dedicated power supply for this reason, just waiting for the package to arrive | 10:41 |
rogpeppe | zyga: one thing that concerns me about my current setup: the snap that's doing the work prints quite a bit to stdout (logging messages) and that ends up churning the sd card quite a bit over the course of months. | 10:41 |
zyga | rogpeppe: mmm | 10:41 |
zyga | rogpeppe: it's a common complaint actually, | 10:41 |
rogpeppe | zyga: there have been a couple of occasions when i think an unexpected power outage has resulted in a corrupted sd card that's unable to boot | 10:41 |
zyga | there used to be some settings to disable rsyslog and make journal only log to memory but I think those never got released | 10:42 |
rogpeppe | zyga: at least, that's my tentative diagnosis | 10:42 |
rogpeppe | zyga: i'm thinking of installing some kind of battery-backed UPS device to avoid that somewhat | 10:42 |
zyga | rogpeppe: maybe just a powerbank? | 10:43 |
zyga | some powerbanks can be charged and used at the same time | 10:43 |
zyga | it's a low cost solution for sure :) | 10:43 |
rogpeppe | zyga: yeah, maybe. it needs to work reliably over a period of years | 10:45 |
rogpeppe | zyga: and deal ok with power outages on the order of days | 10:46 |
zyga | days? | 10:46 |
zyga | I have a pi zero wireless | 10:46 |
zyga | that's powered from a power bank with USB-C power delivery | 10:46 |
rogpeppe | zyga: yeah, this is in a fairly remote area of scotland and power outages aren't uncommon | 10:46 |
zyga | a larger one actually | 10:46 |
zyga | it lets the pi run for about a week | 10:46 |
zyga | I didn't try it with a more power-hungry device | 10:47 |
zyga | it's neat because I can use USB-C to charge the battery | 10:47 |
rogpeppe | zyga: the other thing i'm considering is a small pi zero or something to act as a monitor and reboot if needed, because i've seen some hangups where the pi doesn't autoreboot on crash | 10:47 |
zyga | while powering the PI or other devkit from the other connector | 10:47 |
zyga | rogpeppe: yeah, the pi doesn't have a watchdog | 10:47 |
zyga | so it's not reliable in this sense unfortunately | 10:47 |
rogpeppe | zyga: i remember seeing a UPS device that also included a real time clock, which would be ideal, but i can't seem to find it now | 10:48 |
zyga | gee, I wish USB battery packs could convey their charge over USB | 10:48 |
rogpeppe | ha, that would be cool | 10:49 |
zyga | yeah | 10:50 |
zyga | too bad USB is such a mess | 10:50 |
zyga | I think USB-PD is better in this regard | 10:50 |
rogpeppe | at least the pi doesn't seem to go back in time any more - that was a real pain for my app | 10:50 |
zyga | rogpeppe: that's another bug | 10:52 |
zyga | a combination of bugs actually | 10:52 |
zyga | pi has no RTC with battery power | 10:52 |
rogpeppe | hmm, that's twice now that this pi has scheduled a reboot and not managed to come out of it | 10:52 |
zyga | so each reboot it's back to 1970s | 10:52 |
zyga | so we used some timestamps to establish "less insane wrong time" | 10:52 |
zyga | like the stamp on some files in FAT | 10:53 |
rogpeppe | yes, that's definitely better | 10:53 |
zyga | but you really only get working time after you ntp | 10:53 |
zyga | so offline you are still in the past on reboot | 10:53 |
zyga | we also changed things so that time is saved back to fat on reboot | 10:53 |
zyga | so it's not perfect but much better | 10:53 |
zyga | rogpeppe: if you can get any logs that would be good | 10:53 |
rogpeppe | and it just failed again after powering off/on, but then succeeded the next time. i'm a bit concerned about reliability here. | 10:53 |
zyga | if you can image the SD card and help us with forensics that would be perfect | 10:53 |
rogpeppe | zyga: what logs would you look at? | 10:53 |
zyga | many things | 10:54 |
rogpeppe | zyga: it's booted ok again now | 10:54 |
zyga | snap changes to understand what snapd was doing | 10:54 |
zyga | journal to see what the system was doing | 10:54 |
zyga | the full card image to look for FAT corruption or other magic | 10:54 |
zyga | it all depends on initial analysis | 10:54 |
zyga | one small tip | 10:54 |
zyga | if you enable persistent journal | 10:54 |
zyga | you can see what happened in the past better | 10:54 |
zyga | mkdir /var/log/journal | 10:55 |
zyga | and reboot | 10:55 |
rogpeppe | zyga: the first time it rebooted, i got it to reboot again after a couple of tries, and the snap was still half way through installing | 10:55 |
zyga | rogpeppe: that's normal | 10:55 |
rogpeppe | zyga: i think that it decided to reboot again after it had installed (and it failed again) | 10:55 |
rogpeppe | zyga: oh really? ok i guess. | 10:55 |
zyga | rogpeppe: when snapd / kernel or other essential snap updates some updates are postponed until after that one fully finished | 10:55 |
zyga | which may include a reboot | 10:55 |
zyga | but if it reboots like 3 times that's curious | 10:56 |
zyga | if you can ssh and check "snap changes" that would help | 10:56 |
rogpeppe | zyga: i'd hope the card wasn't corrupted, as i've only just flashed it | 10:56 |
zyga | SD cards can permanently fail | 10:56 |
zyga | use a flashing tool that verifies each block | 10:56 |
zyga | like etcher does | 10:56 |
rogpeppe | zyga: when the reboot fails, it fails quickly - i don't get the core splash screen | 10:56 |
zyga | it's useful to spot flaky cards outright | 10:56 |
zyga | do you have "hands on" access to the device now? | 10:57 |
rogpeppe | zyga: yup | 10:57 |
zyga | I can recommend attaching a serial adapter to the pins on the PI | 10:57 |
rogpeppe | zyga: for a few hours before i need to leave | 10:57 |
zyga | it's useful to see what's going on early | 10:57 |
zyga | you can plug that to another device for monitoring | 10:57 |
rogpeppe | zyga: i haven't got a serial adaptor here i'm afraid | 10:57 |
zyga | e.g. even another PI | 10:57 |
zyga | rogpeppe: ah, too bad, it's useful | 10:57 |
rogpeppe | zyga: i should get hold of one | 10:57 |
zyga | USB-TTL adapters, grab a few next time on ebay | 10:57 |
zyga | it's mostly useful to see early boot logic | 10:58 |
zyga | as we log what's going on | 10:58 |
zyga | especially kernel / rootfs selection | 10:58 |
zyga | and boot mode | 10:58 |
rogpeppe | zyga: i presume that's not TTL as in TTL-logic | 10:58 |
zyga | (normal or trying new kernel or new rootfs) | 10:58 |
zyga | they are just called TTL to differentiate from "high" voltage real serial ports | 10:58 |
zyga | on pi it's 3.3V AFAIK | 10:58 |
rogpeppe | i think i'm behind the times on acronyms :) | 10:59 |
zyga | :D | 10:59 |
rogpeppe | FWIW when it fails to boot, it takes about a couple of seconds before the permanent red light comes on | 10:59 |
zyga | which PI variant is tihs? | 11:00 |
zyga | *this | 11:00 |
zyga | speaking of which, I need to place that backorder on Pi 4 finally | 11:00 |
zyga | the 4GB variant is never in stock, it seems | 11:00 |
zyga | if it shows anything, it's that pi 5 will have an 8GB version | 11:01 |
zyga | and a SATA port ;-) | 11:01 |
rogpeppe | zyga: it's a pi 3, but i can't remember which exact variant. how do i tell? | 11:01 |
zyga | cat /proc/cpuinfo | 11:01 |
rogpeppe | zyga: is this the etcher tool you were talking about? https://www.techspot.com/downloads/6931-etcher.html | 11:01 |
zyga | yeah | 11:01 |
zyga | now owned by balena | 11:01 |
zyga | https://www.balena.io/etcher/ | 11:02 |
zyga | still FOSS, just a way to point to a company | 11:02 |
rogpeppe | http://paste.ubuntu.com/p/3m8k66gtVS/ | 11:02 |
zyga | it's remarkably good | 11:02 |
zyga | one sec | 11:02 |
zyga | that's PI 3B | 11:03 |
zyga | no, wait | 11:04 |
zyga | * zyga rechecks | 11:04 |
zyga | that could be 3B+ | 11:04 |
zyga | does it have a metal can on the CPU? | 11:04 |
zyga | if so that's the 3B+ which needs way more power to work | 11:04 |
zyga | (but is also faster) | 11:04 |
rogpeppe | not sure. i think the cpu might be covered up by the piglow device i've got plugged in | 11:05 |
zyga | oh, you have piglow? | 11:06 |
zyga | let me share one thing I made | 11:06 |
zyga | https://github.com/zyga/snappy-pi2-piglow | 11:06 |
zyga | it's for the time when snapd interfaces were still called "snappy skills" | 11:06 |
rogpeppe | ah, i've found the order: Raspberry Pi 3 Model B Quad Core CPU 1.2 GHz 1 GB RAM Motherboard | 11:07 |
zyga | mmm | 11:07 |
zyga | I should update that piglow repo to more recent snappy standards :) | 11:07 |
rogpeppe | i built a nice piglow library in Go | 11:07 |
rogpeppe | https://godoc.org/github.com/rogpeppe/misc/piglow | 11:09 |
rogpeppe | it's designed to work ok if used concurrently | 11:09 |
rogpeppe | and a command too: https://godoc.org/github.com/rogpeppe/misc/cmd/piglow | 11:09 |
zyga | I went to check out the gamma table :D | 11:10 |
rogpeppe | i'd totally forgotten about that until now :) | 11:11 |
zyga | I used https://github.com/zyga/snappy-pi2-piglow/blob/master/sn3218.c#L30 | 11:11 |
rogpeppe | i think i probably nicked the gamma table from somewhere else | 11:11 |
zyga | I don't recall how I got those values now | 11:11 |
zyga | I _may_ have used something to measure the output at the time | 11:11 |
zyga | it was when I was still doing hardware fun projects | 11:11 |
rogpeppe | ha, i even gave it credit: // Stolen from github.com/benleb/PyGlow. | 11:12 |
zyga | yeah :) | 11:17 |
zyga | I'm back to tinkering | 11:17 |
rogpeppe | zyga: sadly it seems like my command has rotted and no longer works :( | 11:25 |
mup | PR snapcraft#2704 closed: extensions: create the gnome-platform directory <Created by sergiusens> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2704> | 15:56 |
mup | PR snapcraft#2705 closed: extensions: rename extension classes to known names <Created by sergiusens> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2705> | 15:56 |
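
In the log, the board variant is worked out by reading `cat /proc/cpuinfo` output by eye, with some back and forth between 3B and 3B+. As a minimal sketch of how that lookup could be automated, the Python below pulls the `Revision` field out of the cpuinfo text and decodes it using the Raspberry Pi "new-style" revision bit layout. The function names `find_revision` and `decode_revision` are hypothetical, and the board table is deliberately partial and written from memory, so treat it as an assumption to check against the official revision-code list rather than as a complete reference.

```python
import re

# Partial table of board type codes (bits 4-11 of a new-style revision code).
# Assumption: only a few common boards are listed; extend as needed.
BOARD_TYPES = {
    0x04: "2B",
    0x08: "3B",
    0x0C: "Zero W",
    0x0D: "3B+",
    0x0E: "3A+",
    0x11: "4B",
}

# Memory size codes (bits 20-22 of a new-style revision code).
MEMORY_SIZES = {0: "256MB", 1: "512MB", 2: "1GB", 3: "2GB", 4: "4GB", 5: "8GB"}


def find_revision(cpuinfo_text):
    """Return the Revision field of /proc/cpuinfo text as an int, or None."""
    match = re.search(
        r"^Revision\s*:\s*([0-9a-fA-F]+)\s*$", cpuinfo_text, re.MULTILINE
    )
    return int(match.group(1), 16) if match else None


def decode_revision(code):
    """Decode a new-style revision code into model, memory and board revision."""
    # Bit 23 marks the new-style encoding; older boards use plain lookup codes.
    if not code & (1 << 23):
        raise ValueError("old-style revision code; needs a lookup table instead")
    model = BOARD_TYPES.get((code >> 4) & 0xFF, "unknown")
    memory = MEMORY_SIZES.get((code >> 20) & 0x7, "unknown")
    return {
        "model": model,
        "memory": memory,
        "board_revision": "1.%d" % (code & 0xF),
    }


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        revision = find_revision(f.read())
    if revision is None:
        print("no Revision field found (probably not a Raspberry Pi)")
    else:
        print(decode_revision(revision))
```

With this sketch, a revision code of `a02082` decodes to model 3B with 1GB of memory, while `a020d3` decodes to 3B+, which is the distinction that matters for the power supply advice given in the log.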