mborzecki | morning | 05:15 |
---|---|---|
zyga | Good morning | 05:25 |
zyga | I'm doing my school run now. See you all later | 05:26 |
zyga | back now | 06:08 |
zyga | mborzecki: it's cold today | 06:08 |
zyga | 13C at most | 06:08 |
zyga | brrr | 06:08 |
zyga | I hope it won't rain later | 06:08 |
mborzecki | and rainy here | 06:08 |
zyga | mborzecki: still, all the cars are stuck in traffic | 06:09 |
zyga | biking to school is way more robust | 06:09 |
zyga | how are we doing today? | 06:09 |
zyga | tests were awful yesterday | 06:09 |
zyga | failing left and right on random stuff | 06:09 |
zyga | store, portals, you name it | 06:09 |
zyga | good morning mvo | 06:14 |
zyga | a little cold and rainy today :) | 06:15 |
zyga | how was your Monday? | 06:15 |
zyga | quick breakfast | 06:19 |
mvo | hey zyga | 06:19 |
mborzecki | mvo: hey | 06:21 |
mvo | hey mborzecki | 06:21 |
mup | PR snapd#7425 closed: channel: introduce Resolve and ResolveLocked <Created by pedronis> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/7425> | 06:25 |
mborzecki | zyga: can you take another look at #7412 ? looks like we could land it easily | 06:25 |
mup | PR #7412: tests: run dbus-launch inside a systemd unit <Test Robustness> <Created by mvo5> <https://github.com/snapcore/snapd/pull/7412> | 06:25 |
zyga | sure | 06:25 |
mvo | looks like we can merge 7342 too? | 06:26 |
mvo | and 7125 needs a second review (should be simple) | 06:26 |
zyga | +1 on 7412 | 06:27 |
zyga | mvo, do you want me to merge or do you want to do it yourself? | 06:27 |
zyga | mvo: +1 on 7342 | 06:28 |
mvo | I can do the merge, I just noticed it has no second +1 | 06:28 |
zyga | looking at 7125 now | 06:28 |
mup | PR snapd#7412 closed: tests: run dbus-launch inside a systemd unit <Test Robustness> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7412> | 06:29 |
mup | PR snapd#7342 closed: fixme: rename ubuntu*architectures to dpkg*architectures <Created by ardaguclu> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7342> | 06:31 |
zyga | mvo: reviewed 7125, +1 but please check my comment there | 06:42 |
mvo | zyga: thanks, looking now | 06:43 |
zyga | thank you | 06:50 |
zyga | mvo, mborzecki: I'd like to ask for a review of https://github.com/snapcore/snapd/pull/7435 | 06:52 |
mup | PR #7435: tests: explicitly restore after using LXD <Test Robustness> <Created by zyga> <https://github.com/snapcore/snapd/pull/7435> | 06:53 |
zyga | it's a blocker for progress on https://github.com/snapcore/snapd/pull/7168 | 06:53 |
mup | PR #7168: tests: measure testbed for leaking mountinfo entries <Test Robustness> <Created by zyga> <https://github.com/snapcore/snapd/pull/7168> | 06:53 |
zyga | I have one PR slot open so I'll work on finishing and proposing a mount-ns extension that involves a mimic, so that we can properly evaluate https://github.com/snapcore/snapd/pull/7436 later | 06:54 |
mup | PR #7436: many: make per-snap mount namespace MS_SHARED <Bug> <Created by zyga> <https://github.com/snapcore/snapd/pull/7436> | 06:54 |
zyga | mvo: offtopic, last night I was playing with raspberry pi | 06:57 |
zyga | and I think we can slightly improve our watchdog story there | 06:57 |
zyga | specifically around try mode boots | 06:57 |
mvo | zyga: oh? tell me more | 06:59 |
zyga | I read a little about the watchdog on the pi | 07:00 |
zyga | it's a bit weird, has fixed 15 second interval | 07:00 |
zyga | we could enable it from the boot loader | 07:00 |
zyga | so any try mode boot can recover | 07:00 |
zyga | I will poke around in free time over evenings | 07:00 |
zyga | maybe I will reach something that works | 07:00 |
abeato | hey, is anybody experiencing problems with the telegram snap? I've opened https://forum.snapcraft.io/t/telegram-snap-fails-to-start/13132 | 07:01 |
zyga | abeato: that's new to me | 07:02 |
zyga | abeato: ls -ld /mnt ? | 07:02 |
abeato | zyga, $ ls -ld /mnt | 07:03 |
abeato | drwxr-xr-x 2 root root 4096 jul 19 2016 /mnt | 07:03 |
zyga | ok, so regular directory, not a symlink | 07:03 |
=== pstolowski|afk is now known as pstolowski | ||
pstolowski | mornings | 07:03 |
zyga | abeato: can you please check how many files matching "*snap-confine*" glob are present in /etc/apparmor.d/ | 07:05 |
abeato | zyga, https://paste.ubuntu.com/p/xt7NyYgPKr/ | 07:06 |
zyga | that's that! | 07:06 |
zyga | mvo: ^^^^ | 07:06 |
abeato | sep 11? | 07:06 |
zyga | abeato: one of the files is wrong | 07:06 |
zyga | it's a bug in our postinst script I believe | 07:06 |
zyga | mvo: should that be fixed and released? | 07:06 |
abeato | hm, interesting | 07:06 |
zyga | abeato: what does "apt-cache policy snapd" say | 07:06 |
zyga | that will be most useful to mvo | 07:06 |
abeato | zyga, https://paste.ubuntu.com/p/G2JMwHJSJG/ | 07:07 |
zyga | abeato: the fix for this bug is in 61cc58dbb0a7a1a785e9e3c391b6f593df892839 | 07:08 |
zyga | Date: Wed Aug 14 09:43:41 2019 +0200 | 07:08 |
zyga | it may not be released yet, perhaps | 07:08 |
zyga | mvo: ^ can you confirm if 2.40 has this | 07:08 |
mvo | abeato: yeah, the /etc/apparmor.d/usr.lib.snapd.snap-confine should not be there :( | 07:09 |
mvo | zyga: yeah 2.40 should fix it | 07:09 |
zyga | if you want to fix your system please remove the file mvo mentioned and call sudo apparmor_parser -r /etc/apparmor.d/usr.lib.snapd.snap-confine.real | 07:09 |
abeato | mvo, zyga snapd journal: https://paste.ubuntu.com/p/wHsRR4R2xD/ | 07:09 |
zyga | mvo: in that case the fix doesn't work | 07:09 |
mvo | zyga: oh well | 07:09 |
mvo | zyga: let me look at this again | 07:09 |
zyga | thank you! | 07:09 |
abeato | do you need more data? | 07:10 |
zyga | abeato: can you please update the forum thread with the log from this conversation? | 07:10 |
mborzecki | zyga: can you take a quick look at https://github.com/snapcore/snapd/pull/7109 ? pushed some changes there yday | 07:10 |
mup | PR #7109: snap-confine: fallback gracefully on a cgroup v2 only system <Created by mvo5> <https://github.com/snapcore/snapd/pull/7109> | 07:10 |
mvo | abeato: a bugreport (super small, just the data you already pasted) | 07:10 |
abeato | zyga, sure | 07:10 |
mvo | abeato: then I will do a sledgehammer fix | 07:10 |
zyga | abeato: thank you | 07:10 |
zyga | mborzecki: sure, looking now | 07:10 |
abeato | mvo, launchpad? | 07:10 |
mvo | abeato: plus the content of the /etc/apparmor.d/usr.lib.snapd.snap-confine please | 07:10 |
mvo | abeato: yeah | 07:10 |
abeato | ok | 07:10 |
mvo | abeato: or if it's already in the forum that's fine | 07:10 |
mvo | abeato: just need a reference in the PR | 07:11 |
abeato | it is already in the forum, yes | 07:11 |
abeato | I will update then the forum post | 07:11 |
mvo | abeato: then that should be fine | 07:11 |
mvo | abeato: thank you! | 07:11 |
abeato | np | 07:11 |
mborzecki | anyone else seen this google:debian-9-64:tests/main/snap-service-watchdog to fail recently? | 07:12 |
mup | PR snapd#7438 opened: devicestate: add support for base->base remodel <Created by mvo5> <https://github.com/snapcore/snapd/pull/7438> | 07:12 |
mborzecki | for some reason the snap app gets SIGABRT https://paste.ubuntu.com/p/Z6k382RCrD/ | 07:13 |
abeato | mvo, zyga https://forum.snapcraft.io/t/telegram-snap-fails-to-start/13132 updated | 07:17 |
mborzecki | this one is interesting https://paste.ubuntu.com/p/kmGzQzZRHJ/ probably something for Chipaca or pedronis | 07:18 |
zyga | store woes? | 07:18 |
mborzecki | idk, nonce is logged in the POST request, so it is sent ;) | 07:24 |
mvo | abeato: could you please pastebin me /var/lib/dpkg/info/ubuntu-core-launcher.conffiles | 07:27 |
mvo | abeato: and snapd.conffile too ? | 07:27 |
abeato | mvo, there is nothing starting as ubuntu-core* in /var/lib/dpkg/info/ | 07:28 |
=== Greyztar- is now known as Greyztar | ||
mvo | abeato: uh, sorry, please see if there is "snap-confine.conffiles" | 07:29 |
zyga | mvo: are conffiles retained after a package is removed? | 07:29 |
zyga | mvo: that is, they remain until purged? | 07:30 |
mvo | zyga: yes | 07:30 |
mvo | zyga: correct | 07:30 |
abeato | mvo, that file is not there either | 07:30 |
abeato | mvo, snapd.conffiles exists | 07:30 |
abeato | mvo, https://paste.ubuntu.com/p/pTN8P7Mtt2/ - but note that I already removed the old file and run apparmor_parser to fix the problem | 07:31 |
mvo | abeato: any output from grep usr.lib.snapd.snap-confine /var/lib/dpkg/info/*.conffiles | 07:31 |
mvo | ? | 07:31 |
abeato | $ grep usr.lib.snapd.snap-confine /var/lib/dpkg/info/*.conffiles | 07:31 |
mvo | abeato: yeah, thats fine - I'm mostly trying to figure out if its still leftover in some dpkg files | 07:32 |
abeato | /var/lib/dpkg/info/snapd.conffiles:/etc/apparmor.d/usr.lib.snapd.snap-confine.real | 07:32 |
mvo | abeato: thats the only match? | 07:32 |
abeato | mvo, yes | 07:32 |
mvo | abeato: thanks! I'm slightly puzzled but that's fine, I think I know what to do (even though I'm not sure how this happens, i.e. dpkg should either know about the file or it should be gone) :/ | 07:33 |
abeato | right, it's weird... | 07:33 |
zyga | abeato: did you develop snapd on this machine? | 07:39 |
zyga | perhaps it came from some earlier hacking | 07:39 |
abeato | zyga, no | 07:39 |
zyga | mvo: I need to take a break, back-pain after last evening's longer session | 07:39 |
zyga | I'll stretch and be back in a few moments | 07:39 |
mvo | zyga: sure thing, get well! | 07:40 |
mup | PR snapd#7439 opened: packaging: remove obsolete usr.lib.snapd.snap-confine in postinst <Created by mvo5> <https://github.com/snapcore/snapd/pull/7439> | 07:43 |
mvo | abeato: -^ | 07:44 |
abeato | mvo, great! | 07:45 |
mborzecki | zyga: https://forum.snapcraft.io/t/significance-of-info-files-in-run-snapd-ns/12938 | 08:16 |
mborzecki | tests are red again :/ | 08:17 |
mborzecki | pedronis: hi, i think you weren't around when i liked it, interesting failure i stumbled upon today https://paste.ubuntu.com/p/kmGzQzZRHJ/ | 08:19 |
mborzecki | s/liked/linked/ | 08:20 |
pedronis | mborzecki: weird error given that we see the nonce in the requests log | 08:21 |
mborzecki | mhm | 08:21 |
pedronis | for some reason the nonce the store just gave us is considered invalid | 08:23 |
pedronis | if it repeats, it's worth poking the store people | 08:23 |
pedronis | but we haven't touched anything in that area since a while | 08:23 |
Chipaca | we do send the exact thing we get back | 08:28 |
Chipaca | fwiw | 08:28 |
pedronis | yes | 08:28 |
pedronis | so it's not missing | 08:28 |
Chipaca | this isn't the first time we've seen this error | 08:28 |
Chipaca | i think it's worth chasing down | 08:29 |
* Chipaca gets on it | 08:29 | |
mvo | mborzecki: anything changed that make the tests red? | 08:36 |
mborzecki | mvo: no, looks like the usual stuff, desktop portal, occasionally installing snapd deps from package archive or store hiccups | 08:37 |
mvo | :( | 08:37 |
zyga | mborzecki: thank you, replied | 08:46 |
mborzecki | snap/channel/channel_test.go:139:17: undefined: arch.UbuntuArchitecture on master | 08:50 |
mborzecki | opening a pr in a bit | 08:51 |
pedronis | mvo: I did a pass over the 2.42 PRs (except mine), they all need a little bit more work I fear | 08:52 |
pedronis | mborzecki: crossing merges ? | 08:52 |
mborzecki | pedronis: yes | 08:52 |
mvo | pedronis: ok, I have a look, thank you! | 08:53 |
pedronis | mvo: I'll try to tweak the test to check for systemd version in mine when I get a 2nd, worst case it will not make 2.42 | 08:54 |
mup | PR snapd#7440 opened: snap/channel: fix unit tests, UbuntuArchitecture -> DpkgArchitecture <Simple 😃> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/7440> | 08:54 |
mborzecki | also something weird with tests/unit/go on debian, gofmt is not installed (?) | 08:55 |
pedronis | Chipaca: also in #7411 I have two wonderings that maybe you can help with (I @chipaca-ed you on them) | 08:57 |
mup | PR #7411: cmd/model: output tweaks, add'l tests <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/7411> | 08:57 |
Chipaca | pedronis: ack | 08:57 |
Chipaca | pedronis: i noticed i didn't notice some of the things you pointed out | 08:58 |
Chipaca | will look in a bit | 08:58 |
pedronis | mvo: your PRs for 2.42 also need 2nd reviews | 09:32 |
rogpeppe | zyga: i wonder if this might have some relevance: Sep 09 19:27:54 localhost snapd[692]: handlers.go:459: Reported install problem for "core18" as f732dba4-d35a-11e9-a660-fa163e6cac46 OOPSID | 09:57 |
zyga | rogpeppe: it means that snapd fails to refresh core18 | 09:57 |
zyga | aborts the transaction | 09:57 |
zyga | and rolls back | 09:57 |
zyga | that explains your reboot loop | 09:57 |
zyga | this is very useful information | 09:57 |
rogpeppe | zyga: i'm not seeing a reboot loop currently FWIW | 09:58 |
zyga | mvo: ^ can we pull the log from that error tracker entry? | 09:58 |
zyga | rogpeppe: it will be re-attempted again | 09:58 |
zyga | rogpeppe: until it refreshes successfully | 09:58 |
zyga | that report has hints as to what went wrong | 09:58 |
rogpeppe | zyga: ah, so that's what explains the fact that it's rebooting periodically, i guess | 09:58 |
zyga | yes | 09:58 |
zyga | it's the transactional nature | 09:58 |
zyga | it's just not immune to problems | 09:58 |
* zyga found a bug (in what he was doing since morning) | 09:59 | |
rogpeppe | zyga: here's another one: Sep 09 07:40:40 localhost snapd[715]: handlers.go:459: Reported install problem for "core18" as fc12d9b2-d2e7-11e9-b568-fa163e102db1 OOPSID | 09:59 |
mvo | zyga: INFO Waiting for restart... | 10:01 |
mvo | ERROR cannot finish core18 installation, there was a rollback across reboot | 10:01 |
zyga | huh | 10:01 |
zyga | interesting | 10:01 |
zyga | nothing magic | 10:01 |
zyga | rogpeppe: what does snap version say? | 10:02 |
mvo | looks like the restart is not happening, I see a lot of Waiting for restart | 10:02 |
mvo | snapd version is 2.40 | 10:02 |
mvo | zyga: https://pastebin.canonical.com/p/WHVwWjRTVs/ | 10:03 |
zyga | this is snapd + core18 arrangement, right? | 10:03 |
rogpeppe | zyga: | 10:03 |
rogpeppe | rogpeppe@localhost:~$ snap version | 10:03 |
rogpeppe | snap 2.40 | 10:03 |
rogpeppe | snapd 2.40 | 10:03 |
rogpeppe | series 16 | 10:03 |
rogpeppe | kernel 4.15.0-1041-raspi2 | 10:03 |
zyga | rogpeppe: can you run snap changes | 10:05 |
zyga | and pastebin that? | 10:05 |
rogpeppe | zyga: http://paste.ubuntu.com/p/GbtJH9N6vv/ | 10:06 |
zyga | how about snap tasks 11 | 10:06 |
zyga | and snap tasks 13 | 10:06 |
rogpeppe | zyga: ? | 10:07 |
zyga | run the command: snap tasks 11 | 10:08 |
zyga | and pastebin the output please | 10:08 |
rogpeppe | zyga: http://paste.ubuntu.com/p/CnjtBWwrtr/ | 10:08 |
zyga | oh | 10:09 |
zyga | Error yesterday at 19:25 UTC yesterday at 19:27 UTC Automatically connect eligible plugs and slots of snap "core18" | 10:09 |
zyga | it failed on auto-connect?! | 10:09 |
mborzecki | off to school to pick up the kids, afk for a bit | 10:09 |
zyga | pstolowski: ^ can auto-connect prevent a reboot? | 10:09 |
rogpeppe | zyga: this is task 13: http://paste.ubuntu.com/p/SvvXP7rP5R/ | 10:09 |
zyga | same thing happened here | 10:09 |
zyga | I think we're getting somewhere now | 10:10 |
* Chipaca takes a break | 10:16 | |
pstolowski | zyga: as any other task handler that errors out and triggers undo. it's implemented to retry on conflicts (which it did a couple of times in that log), but then there is a bunch of things it looks up in the state that can error out. slightly weird we don't see what the error was for this task | 10:17 |
zyga | pstolowski: we can presumably ask rogpeppe for the state file | 10:17 |
zyga | do you think it would help | 10:17 |
pstolowski | yes it might help, maybe we will be able to track down what changed in the state that made task error out | 10:18 |
zyga | rogpeppe: can you please report a bug on bugs.launchpad.net/snapd | 10:20 |
zyga | rogpeppe: include a rough description of the problem how you see it | 10:20 |
pstolowski | rogpeppe: so yes, if you can grab and send us state.json that would be great (don't pastebin as it has your macaroon etc) | 10:21 |
zyga | rogpeppe: and then work with pstolowski to attach the logs there | 10:21 |
zyga | rogpeppe: and the state file | 10:21 |
rogpeppe | pstolowski: where does that file live? | 10:21 |
pstolowski | rogpeppe: /var/lib/snapd/ | 10:21 |
zyga | rogpeppe: please make sure to use a private bug (when reporting it) because as pawel said, the state file contains some shared secrets | 10:21 |
zyga | let us know if you need any help with reporting the bug | 10:21 |
zyga | we will use it for tracking and eventual regression testing | 10:22 |
zyga | mborzecki: found a small-ish bug just now, /etc/ssl is a special case, as you know | 10:22 |
zyga | mborzecki: as is /etc/alternatives | 10:22 |
zyga | mborzecki: and they don't play nicely with the trespassing detector | 10:22 |
rogpeppe | zyga: are the macaroons the only secret things in there? | 10:23 |
zyga | yes | 10:23 |
rogpeppe | zyga: ok, i've redacted them specifically. | 10:24 |
pstolowski | rogpeppe: ty | 10:26 |
rogpeppe | zyga, pstolowski: https://bugs.launchpad.net/snapd/+bug/1843417 | 10:43 |
mup | Bug #1843417: ubuntu core installation goes down regularly <snapd:New> <https://launchpad.net/bugs/1843417> | 10:43 |
zyga | thank you very much | 10:43 |
zyga | we'll try to get to the bottom of this | 10:43 |
zyga | pstolowski: can we use the state file to simulate a refresh somehow? | 10:43 |
rogpeppe | zyga: thanks. BTW if it can't be fixed fairly soon, i'll need to move to another distribution, because winter is coming and my parents need heat in the house :) | 10:44 |
zyga | rogpeppe: can you try one specific command that may help you | 10:44 |
zyga | rogpeppe: snap refresh core18 | 10:44 |
zyga | that will refresh the base snap | 10:44 |
rogpeppe | zyga: ok, trying | 10:44 |
zyga | then you can try to refresh snapd | 10:44 |
zyga | essentially one-by one | 10:44 |
zyga | one-by-one | 10:44 |
zyga | rather than all at once | 10:45 |
zyga | if there's some kind of conflict happening | 10:45 |
zyga | it might be averted this way | 10:45 |
rogpeppe | zyga: ok, it's refreshed and is rebooting | 10:45 |
rogpeppe | zyga: well, in a minute | 10:45 |
rogpeppe | zyga: ok, we'll see if it comes back up | 10:46 |
zyga | fingers crossed | 10:46 |
zyga | systems at scale is complex | 10:46 |
zyga | we wish to make this totally unattended | 10:46 |
zyga | but as reality shows, it's not trivial | 10:46 |
pstolowski | zyga: in theory possible but lots of mocking | 10:47 |
rogpeppe | zyga: i'm somewhat unhappy about the reboot just happening randomly, not under some sort of control, particularly if there's a possibility that the system might not recover from it | 10:48 |
rogpeppe | zyga: in this particular case, if this happened when people were away, it could result in the house not being heated enough and pipes freezing, leading to expensive damage | 10:48 |
zyga | rogpeppe: you can schedule updates | 10:49 |
zyga | there's a way to update predictably at very precise moments | 10:49 |
rogpeppe | zyga: oh? how would i do that? | 10:49 |
zyga | https://snapcraft.io/docs/keeping-snaps-up-to-date | 10:49 |
zyga | in doubt ask mborzecki, he knows this code very well | 10:49 |
zyga | or ask degville about the documentation for suggestions or improvements | 10:50 |
pstolowski | i'm looking into the state | 10:50 |
rogpeppe | zyga: is there some documentation for the actual refresh timer syntax that isn't just examples? | 10:50 |
mborzecki | re | 10:51 |
zyga | I don't believe there is, what would you like, a more formal syntax? | 10:51 |
rogpeppe | zyga: yup | 10:51 |
zyga | I believe it's a set of ranges | 10:51 |
zyga | but mborzecki can correct me on this | 10:51 |
rogpeppe | zyga: and an explanation of the semantics | 10:51 |
zyga | mborzecki: is there a more formal syntax of https://snapcraft.io/docs/keeping-snaps-up-to-date | 10:51 |
rogpeppe | zyga: with those docs, i'm left guessing | 10:51 |
rogpeppe | zyga: what does the "5" in "fri5" mean, for example? | 10:51 |
degville | rogpeppe: I think you're right - we should add a formal explanation of the syntax. | 10:52 |
zyga | rogpeppe: the semantics is that snapd will only attempt to refresh in time that fits that schedule | 10:52 |
rogpeppe | zyga: it's surely not the 5th friday in the month | 10:52 |
mborzecki | rogpeppe: just examples | 10:52 |
zyga | rogpeppe: it actually is | 10:52 |
zyga | rogpeppe: you can plan a monthly refresh this way | 10:52 |
rogpeppe | zyga: most months don't have a 5th friday | 10:52 |
zyga | perhaps the example is not the best but that is the intent | 10:52 |
zyga | mborzecki: what would happen if you pick the 5th Friday actually? | 10:53 |
mborzecki | rogpeppe: examples list: fri5,23:00-01:00 Last Friday of the month, from 23:00 to 1:00 the next day | 10:53 |
mborzecki | rogpeppe: it's day[<weeknum>], 5 means the last week basically | 10:53 |
rogpeppe | mborzecki: i don't understand why that's the last friday of the month - that's what i mean when i say that it would be good to actually document the semantics | 10:53 |
rogpeppe | mborzecki: if the digit "5" is special, then it should say so | 10:54 |
rogpeppe | mborzecki: for example, would fri9 mean the same thing? | 10:54 |
mborzecki | rogpeppe: some syntax was initially described here https://forum.snapcraft.io/t/refresh-scheduling-on-specific-days-of-the-month/1239/6 some changes were made along the way though | 10:55 |
rogpeppe | mborzecki: more unknowns: would you be allowed to do "22:00~23:00/2" ? | 10:55 |
rogpeppe | mborzecki: if so, what's the difference between that and "22:00-23:00/2" ? | 10:55 |
zyga | I think those are all good questions, thank you for engaging with us rogpeppe | 10:56 |
rogpeppe | mborzecki: basically, i'd like to see some actual description of the semantics, not just a set of examples where i'm left to try to infer the actual rules | 10:56 |
mborzecki | rogpeppe: there's this page too: https://forum.snapcraft.io/t/timer-string-format/6562 | 10:57 |
mborzecki | i guess it could use a little update with some details | 10:58 |
mup | PR snapd#7441 opened: asserts,seed/seedwriter: follow snap type sorting in the model assertion snap listings <Created by pedronis> <https://github.com/snapcore/snapd/pull/7441> | 10:58 |
rogpeppe | that design doc from niemeyer is a great start. it would be nice if some more of that made it into the actual docs. | 10:58 |
rogpeppe | it also mentions some stuff that isn't documented at all, such as `0:00~24:00/6:00` | 11:00 |
rogpeppe | but maybe that's not implemented, i guess | 11:01 |
rogpeppe | zyga: FYI the pi did not successfully reboot. | 11:02 |
rogpeppe | zyga: i'm gonna have to ask for it to be power cycled again | 11:02 |
zyga | rogpeppe: did it reboot but fail or failed to reboot? | 11:02 |
rogpeppe | zyga: i've got no way of knowing, i'm afraid | 11:03 |
zyga | huh, I see :/ | 11:03 |
rogpeppe | zyga: it said it was going down, my ssh connection was terminated, and it hasn't come back up again | 11:03 |
zyga | it has rebooted then | 11:03 |
rogpeppe | zyga: well, it shut down :) | 11:03 |
zyga | and after someone reboots it again it should roll back | 11:03 |
rogpeppe | zyga: and then i'm back in the same position as before? | 11:04 |
zyga | I was discussing a way to make that automatic in case of failure on the pi specifically | 11:04 |
zyga | yes | 11:04 |
rogpeppe | zyga: ok. :-( | 11:04 |
zyga | rogpeppe: you can set the schedule to avoid refreshes while we try to understand the cause of the failure | 11:04 |
rogpeppe | zyga: can i set the schedule so refreshes are turned off entirely? | 11:05 |
zyga | no, that's explicitly not available | 11:05 |
rogpeppe | zyga: AFAICS the minimum frequency is once per month | 11:05 |
zyga | rogpeppe: you can try one more thing | 11:05 |
zyga | you can refresh snapd itself | 11:05 |
zyga | that should not reboot | 11:05 |
zyga | but give you new software stack | 11:05 |
zyga | so that some bugs that were fixed since 2.40 can be applied | 11:05 |
zyga | well the fixes that is, not the bugs | 11:06 |
zyga | rogpeppe: you can try to refresh snapd to candidate channel to get it | 11:06 |
zyga | rogpeppe: with "snap refresh snapd --candidate" | 11:06 |
zyga | on core18 systems you no longer need to reboot to get new snapd, fortunately | 11:06 |
zyga | using the same strategy you could even refresh to a hotfix branch that contains a fix for your machine | 11:07 |
zyga | which would allow you to refresh the rest of the system correctly, once we understand the nature of the failure | 11:07 |
rogpeppe | zyga: ok, i'll try that when the system has been restarted, thanks | 11:09 |
pstolowski | zyga: so, auto-connect handler errors out because WaitRestart() reports a rollback error, we hit the "// TODO: make sure this revision gets ignored for automatic refreshes" case again, there is a revision mismatch there. this was discussed a few months ago when we hit a similar case. pedronis also did some work around reboots recently but i'm not sure if that's in play here | 11:12 |
pedronis | pstolowski: the checking for reboots has been added | 11:14 |
pedronis | maybe we didn't remove all the TODOs | 11:14 |
mborzecki | zyga: tbh, in situations like this, maybe we should have a mechanism to temporarily disable refreshes, some local assertion or whatnot | 11:14 |
pstolowski | zyga: so auto-connect is just a victim here, problem is elsewhere | 11:14 |
zyga | mborzecki: yeah | 11:14 |
pedronis | I don't remember when it landed though | 11:15 |
rogpeppe | zyga: do you think that the failure to reboot correctly is related to this problem here, or just another problem that happens to be exacerbating the issue? | 11:16 |
zyga | I think that it may be a separate problem | 11:16 |
zyga | perhaps it'd be good to look at snap boot environment and see what it says | 11:17 |
pedronis | mborzecki: you can set refresh.hold no | 11:17 |
mborzecki | aahh | 11:17 |
mborzecki | right | 11:17 |
pedronis | a bit annoying to set, we really need a command for it, but is there | 11:18 |
rogpeppe | zyga: ok, i'm back into the system | 11:18 |
pedronis | pstolowski: we should drop that todo, the check is now done, is done much earlier, in daemon itself | 11:19 |
rogpeppe | zyga: it's maybe interesting that it seems the system only reboots successfully after exactly two power cycles. | 11:19 |
zyga | pedronis: refresh.hold? what is that | 11:19 |
pedronis | zyga: https://forum.snapcraft.io/t/system-options/87#heading--refresh-hold | 11:20 |
pedronis | it's on same page as timer | 11:20 |
zyga | ah, it's not on https://snapcraft.io/docs/keeping-snaps-up-to-date | 11:21 |
mborzecki | zyga: afaiu it's not intended to be used by the user | 11:21 |
pedronis | well, it's annoying but it exists | 11:22 |
pedronis | I would not recommend to use it without a reason | 11:22 |
pedronis | but it can be used | 11:22 |
rogpeppe | am i right that there's no way to specify that the system will refresh on the first day of every month? | 11:24 |
zyga | I believe you're correct | 11:25 |
pstolowski | pedronis: should the restart check be dropped from auto-connect handler? | 11:25 |
rogpeppe | zyga: thanks | 11:25 |
pedronis | pstolowski: in which sense? | 11:25 |
pedronis | the answer is likely no | 11:26 |
pstolowski | pedronis: removing WaitRestart() from auto-connect | 11:26 |
pedronis | pstolowski: no | 11:26 |
pedronis | it will be hit for a bit until we restart/reboot | 11:26 |
pedronis | what I said is checked earlier if when a reboot is triggered | 11:26 |
pedronis | whether it happened | 11:26 |
pedronis | at some point we might be able to not use WaitRestart but that requires changes to taskrunner etc | 11:27 |
pstolowski | pedronis: i see. ok, we need to find out what triggers this check to fail sometimes | 11:31 |
pedronis | pstolowski: nowadays, if we reboot 3 times and fails it will trigger | 11:32 |
pstolowski | i've a power outage, need to power off soon before my ups runs out of battery | 11:32 |
pedronis | well, try to reboot 3 times | 11:32 |
zyga | 2019 is still the year when I wish launchpad to support markdown | 11:37 |
* pstolowski lunch | 11:41 | |
zyga | mborzecki: I reported two bugs that I found today https://bugs.launchpad.net/snapd/+bug/1843421 and https://bugs.launchpad.net/snapd/+bug/1843423 | 11:42 |
mup | Bug #1843421: snap-update-ns doesn't know about the special property of /etc/ssl and /etc/alternatives <snapd:Confirmed> <https://launchpad.net/bugs/1843421> | 11:42 |
mup | Bug #1843423: snap-update-ns fails to construct a layout in /etc/test-snapd/foo <snapd:Confirmed> <https://launchpad.net/bugs/1843423> | 11:42 |
zyga | ondra: ^ some useful bugs for you | 11:43 |
zyga | ogra: in case you run into something like that in the field | 11:43 |
zyga | I meant ondra twice but I think ogra may run into things like that as well | 11:44 |
mborzecki | zyga: btw. does s-u-n need to know about nsswitch.conf too? | 11:45 |
zyga | mborzecki: probably so | 11:45 |
ondra | zyga thank you :) | 11:46 |
zyga | I'll do my best to fix them obviously | 11:46 |
zyga | writing tests is useful | 11:46 |
Chipaca | imma go lunch | 11:56 |
zyga | enjoy :) | 11:57 |
Chipaca | i'll try | 11:57 |
zyga | is it just me or are things extra slow today | 12:11 |
zyga | setting up main test suite takes about 10 minutes to complete | 12:11 |
mborzecki | zyga: and spread test are failing | 12:13 |
zyga | how? | 12:13 |
zyga | I haven't seen any failures though I'm mostly writing new tests now | 12:14 |
zyga | (but no store related failures during that process either) | 12:14 |
=== ricab is now known as ricab|brb | ||
=== grumble is now known as _grumble_ | ||
* zyga quick lunch | 12:41 | |
=== ricab is now known as ricab|lunch | ||
zyga | jdstrand: hello, can you please enqueue https://github.com/snapcore/snapd/pull/7421 for a non-security review but rather a concept review of the idea | 13:25 |
mup | PR #7421: cmd/snap-confine: unmount /writable from snap view <Created by zyga> <https://github.com/snapcore/snapd/pull/7421> | 13:25 |
jdstrand | zyga: sure | 13:38 |
pedronis | mborzecki: I don't understand the spread failures, many of them don't even seem to have clear errors, or I'm not looking right (quite possible) | 13:42 |
Saviq | am I doing something dumb? | 13:59 |
Saviq | $ snap info multipass | grep installed | 13:59 |
Saviq | installed: 0.9.0-dev.171+g7a968814 (x4) 194MB classic | 13:59 |
Saviq | $ snap refresh multipass --revision 1125 --amend | 13:59 |
Saviq | error: local snap "multipass" is unknown to the store, use --amend to proceed anyway | 13:59 |
mup | PR core18#139 opened: hooks: add missing dosfstools to get fsck.fat <Created by mvo5> <https://github.com/snapcore/core18/pull/139> | 14:01 |
zyga | rogpeppe: hello | 14:01 |
zyga | rogpeppe: we have some more ideas | 14:01 |
rogpeppe | zyga: cool! | 14:02 |
rogpeppe | zyga: BTW ISTM that the fail-to-reboot issue is the main problem here - i'm not sure that the other issue would be a real problem if the reboot hadn't failed to restart | 14:03 |
zyga | rogpeppe: we looked some more and we suspect the boot partition that uses fat is corrupted, we found a bug related to absent fsck on core18 systems | 14:03 |
zyga | rogpeppe: we devised a way forward that you should be able to do remotely | 14:03 |
zyga | rogpeppe: if you have an app using "core" installed you should have access to fsck.vfat from /snap/core/current/usr/sbin/ | 14:03 |
zyga | rogpeppe: you can use that to fsck the boot partition | 14:03 |
zyga | rogpeppe: you can unmount it for the duration of the check as well | 14:04 |
rogpeppe | zyga: given that i re-flashed the card very recently, it seems slightly unlikely that it's corrupted already (within a few hours of first installing) but happy to try | 14:04 |
zyga | rogpeppe: mvo looked at some of the error tracker logs and found what I believe was kernel telling us about fs corruption of the FAT partition | 14:04 |
zyga | rogpeppe: so that's the first step, I think you know how to run that without hand-holding but please ask for help if you need any | 14:05 |
zyga | rogpeppe: try to run it in mode verbose enough for us to see if there were any errors there | 14:05 |
rogpeppe | rogpeppe@localhost:~$ ls -l /snap/core/current/usr/sbin/*fsck* | 14:05 |
rogpeppe | ls: cannot access '/snap/core/current/usr/sbin/*fsck*': No such file or directory | 14:05 |
rogpeppe | rogpeppe@localhost:~$ ls -l /snap/core/current/usr/sbin/*vfat* | 14:05 |
rogpeppe | ls: cannot access '/snap/core/current/usr/sbin/*vfat*': No such file or directory | 14:05 |
zyga | oh, silly me | 14:06 |
zyga | just /snap/core/current/sbin/ | 14:06 |
zyga | not /usr | 14:06 |
rogpeppe | zyga: so i'm planning to run these commands; do they seem right to you? | 14:08 |
rogpeppe | umount /boot/uboot | 14:08 |
rogpeppe | fsck -V /dev/mmcblk0p1 | 14:08 |
zyga | yes | 14:08 |
zyga | they look good | 14:08 |
zyga | (assuming PATH is set up to find the fsck.vfat) | 14:09 |
zyga | rogpeppe: as an extra remark, it's sometimes good to stop snapd.service during things like this (hands-on experiments) | 14:10 |
=== ricab|lunch is now known as ricab | ||
zyga | to avoid background activity | 14:10 |
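The check discussed above can be sketched as a small shell helper. This is only an illustration: `check_boot_fat` is a hypothetical name, the device and mount point are the ones rogpeppe mentions, and the `fsck.vfat` path is the corrected one from zyga. It uses `-n` so that nothing is written to the partition.

```shell
# Sketch: read-only check of an unmounted FAT boot partition.
# The fsck.vfat path, device and mount point come from the discussion;
# check_boot_fat is a hypothetical helper name.
FSCK_VFAT="${FSCK_VFAT:-/snap/core/current/sbin/fsck.vfat}"

check_boot_fat() {
    dev="$1"      # e.g. /dev/mmcblk0p1
    mnt="$2"      # e.g. /boot/uboot
    run="${3:-}"  # pass "echo" to print the commands instead of running them
    # Stop snapd first to avoid background activity, then unmount and check.
    # fsck.vfat -n: check only, change nothing; -v: verbose output.
    $run systemctl stop snapd.service &&
        $run umount "$mnt" &&
        $run "$FSCK_VFAT" -n -v "$dev"
}
```

Calling `check_boot_fat /dev/mmcblk0p1 /boot/uboot echo` prints the three commands without touching the system, which is a cheap way to review them before running as root.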
rogpeppe | zyga: http://paste.ubuntu.com/p/Fm3RpdC8d2/ | 14:10 |
zyga | interesting! | 14:11 |
rogpeppe | zyga: i'd definitely unmounted the fs | 14:11 |
zyga | perhaps uboot and kernel disagree on which boot sector to use and then something gets out of sync later | 14:11 |
zyga | for the purpose of experiment, copy original to backup | 14:11 |
zyga | I _believe_ | 14:11 |
zyga | that is what the kernel would use | 14:12 |
zyga | but I welcome the advice of mborzecki | 14:12 |
zyga | mborzecki: ^ | 14:12 |
rogpeppe | zyga: and remove dirty bit? | 14:12 |
zyga | yes, but please understand my POV of trying to fix the partition and see if that means you can correctly boot out of the problem | 14:12 |
zyga | one more idea | 14:12 |
zyga | perhaps tarball all of the boot partition | 14:12 |
zyga | or even dd the whole partition to ext4 somewhere | 14:13 |
zyga | for forensics | 14:13 |
zyga | dd is better | 14:13 |
rogpeppe | zyga: sure, i could send you a copy | 14:13 |
zyga | as you don't have to mount do do | 14:13 |
zyga | *to do | 14:13 |
zyga | yes, we can then look at it bit by bit in hexedit | 14:13 |
zyga | fortunately we don't keep that many files there | 14:13 |
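A raw copy of the partition for forensics, as suggested above, could look roughly like this. `image_partition` is a hypothetical helper, and the options assume GNU `dd` and a `gzip` that supports `-k`; the source is read as-is, so it does not need to be mounted.

```shell
# Sketch: copy a partition byte-for-byte into an image file, record a
# checksum, and keep a compressed copy for sending to someone else.
# image_partition is a hypothetical name; assumes GNU dd and gzip -k.
image_partition() {
    src="$1"   # block device, e.g. /dev/mmcblk0p1 (ideally unmounted)
    out="$2"   # destination image file on a different filesystem
    dd if="$src" of="$out" bs=1M conv=fsync status=none &&
        sha256sum "$out" > "$out.sha256" &&
        gzip -k "$out"
}
```

Writing the image back later is the same `dd` call with `if=` and `of=` swapped, which is what makes the experiments on the live partition recoverable.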
rogpeppe | zyga: ok, downloading disk image now; will upload to s3 | 14:15 |
zyga | thank you | 14:15 |
rogpeppe | zyga: i still find it very odd that it only boots up ok once every other time. i can't think of something that might be causing such predictable boot failures. | 14:17 |
zyga | I can offer one | 14:17 |
mborzecki | zyga: fwiw, i think it's worth checking whether the same incorrectly unmounted warning appears on a cleanly built image | 14:17 |
mborzecki | like out of the box | 14:17 |
zyga | mborzecki: good idea | 14:18 |
zyga | rogpeppe: if uboot reads the FAT differently (we saw that at least once in the past) and sees a different file than linux | 14:18 |
zyga | then snapd will configure the boot loader to boot kernel-1, core-1 in "trying" mode | 14:18 |
rogpeppe | zyga: BTW the s/w i'm running ran fine without any issues for months on end previously | 14:18 |
zyga | the boot loader will go but never see those values | 14:18 |
zyga | booting something else | 14:18 |
zyga | perhaps something that is removed now | 14:18 |
zyga | or perhaps something that is there but disagrees with what snapd expected | 14:19 |
zyga | so snapd will change boot configuration again | 14:19 |
zyga | and plan another reboot | 14:19 |
zyga | rogpeppe: but the point is that the oscillation may be kernel/uboot disagreeing on the contents of a specific file | 14:20 |
zyga | and snapd writing to that file in between boots | 14:20 |
rogpeppe | zyga: ok, interrresting | 14:22 |
mup | PR snapd#7440 closed: snap/channel: fix unit tests, UbuntuArchitecture -> DpkgArchitecture <Simple 😃> <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7440> | 14:23 |
rogpeppe | zyga: try this: https://rogpeppe-scratch.s3.amazon.com/bootcopy.gz | 14:25 |
zyga | grabbing | 14:25 |
zyga | dns issue? | 14:25 |
rogpeppe | zyga: i've probably forgotten how s3 urls work | 14:26 |
Chipaca | i thought they'd turned off that feature | 14:26 |
Chipaca | of being able to just url the stuff | 14:27 |
Chipaca | (but i didn't pay too much attention to that email because i don't use it) | 14:27 |
rogpeppe | zyga: just usual amazon eventual consistency issues; try again | 14:29 |
mborzecki | zyga: fsck doesn't complain with a pristine core image | 14:29 |
zyga | mborzecki: thank you for checking! | 14:29 |
zyga | rogpeppe: still nothing | 14:29 |
rogpeppe | zyga: ha, it works for me :-\ | 14:29 |
zyga | my dns may have cached it as gone? | 14:29 |
zyga | how big is it? | 14:30 |
zyga | can you email me@zygoon.pl | 14:30 |
rogpeppe | zyga: 48621287 bytes | 14:30 |
zyga | might be faster :) | 14:30 |
zyga | I'll go over the bits in the evening to see what's wrong | 14:30 |
zyga | meanwhile you can attempt to fix FAT | 14:30 |
zyga | or even remake it and copy the files back | 14:30 |
rogpeppe | zyga: for the time being, i've just slowed down updates so it'll only update on the first monday of the month | 14:31 |
zyga | excellent | 14:31 |
zyga | rogpeppe: as pedronis said, you can also use refresh.hold to delay up to 60 days | 14:31 |
rogpeppe | zyga: sent | 14:32 |
mup | PR snapd#7442 opened: tests: extend mount-ns test to handle mimics <Created by zyga> <https://github.com/snapcore/snapd/pull/7442> | 14:33 |
rogpeppe | zyga: i think 60 days isn't enough longer than 30 days to justify the unpredictability | 14:33 |
zyga | sure, just saying | 14:33 |
zyga | let's try to fix that FAT online | 14:33 |
zyga | apply all the fixes that fsck would normally do | 14:33 |
rogpeppe | zyga: you think that might fix the issue for good? | 14:34 |
zyga | I think it's likely | 14:34 |
zyga | but we don't have the data trends from the error tracker AFAIK so perhaps there's more but we're not aware of it | 14:34 |
rogpeppe | zyga: this is quite concerning BTW - this was an absolutely pristine image created following the instructions on the web to the letter | 14:34 |
rogpeppe | zyga: if i'm seeing this problem, then i'd guess that everyone else using ubuntu core is too | 14:35 |
zyga | indeed, we proposed that the boot partition is read only outside of the transactions that need to use it | 14:35 |
zyga | so that random power failures don't leave FAT in mounted state | 14:35 |
rogpeppe | zyga: that seems like a good idea. | 14:35 |
zyga | rogpeppe: there are some issues that can be specific to your device, e.g. the SD card may be just really failing | 14:35 |
zyga | I have a number of cards that reliably corrupt a fixed offset | 14:35 |
rogpeppe | zyga: it's a near-new SD card too, but i guess that's possible | 14:35 |
zyga | you can write zeros or ones, you keep reading that blob that they somehow store | 14:36 |
zyga | rogpeppe: the one I have is little-used sandisk pro 32GB card | 14:36 |
zyga | rogpeppe: it failed in a few weeks after purchase | 14:36 |
rogpeppe | zyga: that would make it both 32GB SD cards that are failing in a similar way then | 14:36 |
rogpeppe | zyga: because i saw a similar issue with the Pi 2 and assumed the sd card had gone | 14:37 |
rogpeppe | zyga: i actually have that pi with me in fact | 14:37 |
zyga | I got the boot image now, thank you | 14:37 |
zyga | rogpeppe: try something like 'etcher' to see if you can write a pristine image and read it back correctly if you want to check that | 14:38 |
rogpeppe | zyga: yeah, i'll try that | 14:38 |
rogpeppe | zyga: i wasn't surprised when the original sd card failed BTW - it had been in constant use for about 3 years. | 14:40 |
rogpeppe | zyga: ... if it did fail, of course | 14:40 |
zyga | indeed, analysis will reveal the cause | 14:40 |
zyga | there are some moving parts | 14:40 |
zyga | and some failures on our end | 14:40 |
zyga | sil2100: hey, there's one PR for core18 | 14:40 |
zyga | sil2100: it's related to what we are discussing now | 14:40 |
zyga | sil2100: we didn't seed fsck.vfat on core18 | 14:41 |
zyga | sil2100: do you think you could review it please? | 14:41 |
zyga | https://github.com/snapcore/core18/pull/139 | 14:41 |
mup | PR core18#139: hooks: add missing dosfstools to get fsck.fat <Created by mvo5> <https://github.com/snapcore/core18/pull/139> | 14:41 |
mvo | rogpeppe: thank you so much for reporting this - and yes, super concerning to us and we are digging into it | 14:48 |
* mvo hugs zyga for digging into it | 14:48 | |
rogpeppe | mvo: thanks for your interaction :) | 14:48 |
zyga | rogpeppe: this is what I see on the partition: https://pastebin.ubuntu.com/p/tVXVjCK99Q/ | 14:49 |
zyga | I will now check the contents of config.txt, cmdline.txt and uboot.env | 14:50 |
zyga | for one, I like that mtools exists | 14:50 |
zyga | and wish that there was an ext4 variant :) | 14:50 |
rogpeppe | zyga: i don't know about mtools... | 14:50 |
zyga | rogpeppe: it's a GNU tool for interacting with FAT offline | 14:50 |
sergiusens | hello folks, is there a fix upcoming for "- Download snap "crystal" (71) from channel "latest/stable" (stream error: stream ID 1; PROTOCOL_ERROR)" ? | 14:51 |
sergiusens | this is really slowing us down | 14:52 |
zyga | sergiusens: we don't have a fix, only a workaround mvo worked on | 15:00 |
sergiusens | zyga: does the workaround require work (a setting) on our side? | 15:01 |
ondra | zyga I found one cloud instance with 4 broken snaps I was not able to remove, with latest snapd, I was able to remove all them :) Great work! | 15:05 |
zyga | sergiusens: no, it's automatic | 15:05 |
zyga | ondra: thank you so much :) | 15:05 |
ondra | zyga thank you for fixing it :) | 15:06 |
zyga | ondra: fixing bugs is sometimes very draining, I'm very glad I was able to help you and others in FE | 15:06 |
zyga | rogpeppe: this is the hexyl dump of uboot.env, I'll check out what it says next https://paste.ubuntu.com/p/GgScB76RyT/ -- specifically to see if it is in agreement with snapd's state | 15:06 |
zyga | though I must say that the colorized output from hexyl is easier to read as it shows NUL bytes and other such stuff in clear, distinct color | 15:07 |
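As an alternative to reading a hex dump, the variables in `uboot.env` can be listed with standard tools. The sketch below assumes the file starts with a 5-byte header (a 4-byte CRC32 plus one flags byte, which is my understanding of the layout snapd writes) followed by NUL-separated `key=value` strings; a plain U-Boot environment without the flags byte would need a header size of 4. `uboot_snap_vars` is a hypothetical name.

```shell
# Sketch: print the snap_* variables stored in a U-Boot environment file.
# Assumption: header of 5 bytes (CRC32 + flags byte) by default; pass 4 as
# the second argument for an environment without the flags byte.
uboot_snap_vars() {
    env_file="$1"
    hdr="${2:-5}"
    # Skip the header, turn NUL separators into newlines, keep snap_* lines.
    tail -c +"$((hdr + 1))" "$env_file" | tr '\0' '\n' | LC_ALL=C grep -a '^snap_'
}
```

This does not verify the CRC, so it only shows what is stored, not whether U-Boot would accept the file.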
zyga | mvo: one other idea, just looking at this, is to have two uboot environment files: one for the fixed program and the other one for just the handful of actual variables we need | 15:08 |
zyga | oh, mvo is not online anymore | 15:08 |
zyga | rogpeppe: so looking here I see we have the following things: snap_core=core18_1076.snap, snap_kernel=pi-kernel_44.snap, snap_mode= (empty string), snap_try_core=core18_1100.snap, snap_try_kernel=pi-kernel_51.snap, | 15:12 |
zyga | I need to reference the boot logic for a second to understand snap_mode="" | 15:12 |
zyga | mborzecki: ^ unless you remember | 15:12 |
zyga | rogpeppe: do you have the snaps and revisions listed there in /var/lib/snapd/snaps/ | 15:12 |
zyga | they should all be mounted as well | 15:13 |
zyga | and visible in snap list --all | 15:13 |
zyga | rogpeppe: (output of snap list --all would help as well) | 15:13 |
mborzeck1 | zyga: this? https://github.com/snapcore/core-build/blob/master/initramfs/scripts/bootloader-script#L89 | 15:16 |
mborzeck1 | zyga: or this one https://github.com/snapcore/pi3-gadget/blob/16/uboot.env.in#L48 | 15:17 |
zyga | huh | 15:17 |
zyga | https://github.com/snapcore/core-build/blob/master/initramfs/scripts/bootloader-script#L104 | 15:17 |
rogpeppe | zyga: http://paste.ubuntu.com/p/ktw4XYgrvG/ | 15:18 |
zyga | I don't understand how that works | 15:18 |
zyga | ah, wait | 15:18 |
zyga | didn't notice nesting | 15:18 |
* zyga re-reads | 15:18 | |
zyga | rogpeppe: do you have core18 at revision 1100? | 15:19 |
zyga | I mean | 15:19 |
zyga | I don't see it | 15:19 |
zyga | so I assume that's why it fails | 15:19 |
zyga | it seems we are trying to get to core18 revision that's simply not here | 15:19 |
zyga | that would explain the immediate failure | 15:20 |
rogpeppe | zyga: i see core18 at 1076 | 15:20 |
zyga | though I didn't check what uboot script does if it cannot find that snap | 15:20 |
rogpeppe | zyga: what's special about 1100? | 15:20 |
zyga | rogpeppe: nothing, it's just referenced from your uboot.env but not present on the system | 15:20 |
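The mismatch zyga describes (a revision named in `uboot.env` but absent from disk) can be checked mechanically. A minimal sketch, assuming the variables arrive as `key=value` lines on stdin and the snap files live in `/var/lib/snapd/snaps/` as mentioned earlier; `missing_boot_snaps` is a hypothetical helper.

```shell
# Sketch: report boot variables that point at snap files not present on disk.
# Reads key=value lines on stdin; the directory defaults to the one
# mentioned in the discussion. missing_boot_snaps is a hypothetical name.
missing_boot_snaps() {
    dir="${1:-/var/lib/snapd/snaps}"
    while IFS='=' read -r key val; do
        case "$key" in
            snap_core|snap_kernel|snap_try_core|snap_try_kernel)
                if [ -n "$val" ] && [ ! -e "$dir/$val" ]; then
                    echo "$key=$val"
                fi
                ;;
        esac
    done
    return 0
}
```

An empty output means every referenced file exists; any printed line names a variable that disagrees with what is on disk.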
rogpeppe | zyga: oh, i see | 15:20 |
rogpeppe | zyga: i wonder how that happened | 15:21 |
zyga | indeed | 15:21 |
zyga | though snapd may have undone 1100 transaction | 15:21 |
zyga | removing the file from disk | 15:21 |
zyga | if you feel lucky, fix the boot partition with fsck | 15:22 |
zyga | snap refresh core18 | 15:22 |
zyga | and check if it manages | 15:22 |
zyga | one other lesson from this | 15:22 |
zyga | is for snapd to fix any boot variables that are inconsistent with reality | 15:22 |
zyga | we have one boot mode | 15:22 |
zyga | as in, one variable called snap_mode | 15:23 |
zyga | that impacts two variables "trying" | 15:23 |
rogpeppe | i'm not feeling very lucky currently :) | 15:23 |
zyga | and it's clear that in this case there's a chance one of them will fail | 15:23 |
zyga | rogpeppe: I'll collect this in a retrospective | 15:23 |
zyga | there's a lot for us to learn from this | 15:23 |
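To make the point about a single `snap_mode` governing both "try" variables concrete, here is a small model of the selection step. It reflects my reading of the boot script linked above and is not a copy of it: `try` switches to the try revisions and records `trying`, while a leftover `trying` at boot means the previous attempt did not complete, so the mode is cleared and the known-good pair is used. `select_boot` is a hypothetical name.

```shell
# Sketch: model of how one snap_mode value selects core and kernel together.
# This is an illustration of the logic as understood from the discussion,
# not the actual U-Boot script. select_boot is a hypothetical name.
select_boot() {
    mode="$1"; core="$2"; kernel="$3"; try_core="$4"; try_kernel="$5"
    case "$mode" in
        try)
            # First boot after a refresh: attempt the new revisions.
            mode="trying"
            if [ -n "$try_core" ]; then core="$try_core"; fi
            if [ -n "$try_kernel" ]; then kernel="$try_kernel"; fi
            ;;
        trying)
            # Previous attempt never confirmed: fall back to known-good.
            mode=""
            ;;
    esac
    echo "snap_mode=$mode snap_core=$core snap_kernel=$kernel"
}
```

With the values from the dump, `select_boot try core18_1076.snap pi-kernel_44.snap core18_1100.snap pi-kernel_51.snap` selects both try revisions at once, so a missing core18 revision would fail the attempt even if the kernel revision is fine.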
zyga | rogpeppe: I would suggest to fix the FAT partition | 15:23 |
zyga | that might be enough to fix other issues | 15:24 |
rogpeppe | zyga: ok, i'll try that | 15:24 |
zyga | I'll collect all of this for a retrospective and share with you | 15:24 |
zyga | I was thinking about breaking for dinner now | 15:25 |
rogpeppe | zyga: ok, dirty bit removed. it didn't give me an option to do anything else | 15:25 |
rogpeppe | zyga: i suspect i may have done the wrong thing there :-' | 15:25 |
rogpeppe | :-\ | 15:25 |
zyga | oh? | 15:25 |
zyga | note, you can always dd it back! | 15:25 |
rogpeppe | zyga: good point! | 15:26 |
zyga | what did you do? | 15:26 |
zyga | try fixing it, mounting it | 15:26 |
zyga | and looking at the files | 15:26 |
rogpeppe | zyga: http://paste.ubuntu.com/p/MtVGCFGGwk/ | 15:26 |
zyga | at least the boot.env | 15:26 |
rogpeppe | zyga: i suspect i should've said "no" to removing the dirty bit, i think | 15:27 |
zyga | ah, no | 15:27 |
zyga | I think that's fine | 15:27 |
zyga | it's just a bit | 15:27 |
zyga | fat has no journal | 15:27 |
zyga | so apart from top-down scan there's little to do | 15:27 |
rogpeppe | zyga: i thought i'd get the option to address the other issue too | 15:27 |
rogpeppe | zyga: so the dirty bit is the only thing that was wrong? | 15:28 |
zyga | ah right | 15:28 |
zyga | I honestly don't know | 15:28 |
zyga | can you fsck again? | 15:28 |
zyga | maybe with some --force option | 15:28 |
rogpeppe | zyga: a ran it again - it does nothing | 15:29 |
rogpeppe | *i* ran it again | 15:29 |
zyga | did you use -V? | 15:30 |
zyga | rogpeppe: so | 15:38 |
zyga | rogpeppe: writing the report I'm not sure we understand what really failed on boot | 15:38 |
zyga | rogpeppe: we know that the fat was slightly corrupt | 15:38 |
zyga | that it was not unmounted cleanly | 15:38 |
zyga | rogpeppe: we do know that core18 revision 1100 was missing from your disk | 15:38 |
zyga | rogpeppe: though perhaps it was removed by snapd in its undo path | 15:39 |
zyga | rogpeppe: I would like to know if you'd be willing to attempt another reboot, at your convenience, coupled with another reboot done by "snap refresh core18" | 15:39 |
rogpeppe | zyga: so one reboot without "snap refresh core18", then run "snap refresh core18", then let that reboot by itself? | 15:40 |
zyga | yes | 15:40 |
zyga | but only if you have confidence you can recover manually | 15:40 |
zyga | and not super inconvenient for you | 15:40 |
rogpeppe | zyga: ok, i'll try that. what should i do when it fails to restart after the first reboot? | 15:41 |
zyga | power cycle | 15:41 |
rogpeppe | zyga: ok | 15:42 |
zyga | if that fails we may be SOL but I don't think it will come to that | 15:42 |
zyga | you may want to mount the partition back | 15:42 |
zyga | mvo: hey welcome back | 15:42 |
rogpeppe | zyga: does that make a difference? | 15:42 |
zyga | rogpeppe: in case snapd wishes to write to it | 15:42 |
zyga | otherwise no | 15:42 |
rogpeppe | zyga: ok, i'll mount it again now | 15:42 |
mvo | hey zyga - whats new? | 15:43 |
zyga | mvo: we found some things, one sec, I'll share my notes | 15:43 |
ijohnson | hey folks, could I get another review on https://github.com/snapcore/snapd/pull/7429 ? mvo maybe if you're not EOD yet and have a couple minutes :-) | 15:47 |
mup | PR #7429: wrappers/services: add CurrentSnapServiceStates + unit tests <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/7429> | 15:47 |
mvo | ijohnson: heh, let me have a look | 15:55 |
ijohnson | thanks :-) | 15:57 |
zyga | rogpeppe: let us know what you find please | 15:58 |
* Chipaca goes for a run | 16:01 | |
mvo | ijohnson: you have feedback | 16:02 |
mvo | ijohnson: should be super simple | 16:02 |
ijohnson | yay thanks looking now | 16:02 |
rogpeppe | zyga: will do. it might be tomorrow. | 16:04 |
zyga | ack, thank you for the note | 16:04 |
ijohnson | mvo: thanks I fixed the out of date comment / for loop, but I think it's still nice to have the more verbose/complete systemctl script | 16:13 |
mvo | ijohnson: thats fine | 16:14 |
mvo | ijohnson: keep it if you prefer it :) | 16:14 |
ijohnson | okay, cool so with your and mborzecki's review am I good to merge? | 16:14 |
* ijohnson waits on the merge button | 16:15 | |
mvo | ijohnson: yes | 16:15 |
ijohnson | oh well I guess the tests have to pass too | 16:15 |
mvo | ijohnson: *cough* | 16:15 |
ijohnson | :-O | 16:15 |
mvo | ijohnson: :( | 16:15 |
ijohnson | they're not failing just haven't started running yet | 16:15 |
mvo | ijohnson: ok | 16:16 |
ijohnson | thanks mvo, I'll merge sometime in my afternoon then | 16:17 |
mvo | ijohnson: good luck! | 16:18 |
ijohnson | :-) | 16:18 |
=== pstolowski is now known as pstolowski|afk | ||
rogpeppe | anyone got any idea why `xdg-open` has stopped opening pdf files correctly for me (it opens them in an ebook viewer). I think it might have something to do with the fact that `xdg-mime query filetype anyfile.pdf` doesn't print anything, but I'm not sure how that works. | 17:25 |
rogpeppe | oops, wrong channel, sorry! | 17:25 |
mup | PR snapcraft#2644 closed: Release changelog for 2.44 <Created by sergiusens> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2644> | 17:44 |
mup | PR snapcraft#2709 opened: incorporate content provider snaps in dependency resolution <Created by cjp256> <https://github.com/snapcore/snapcraft/pull/2709> | 20:14 |