/srv/irclogs.ubuntu.com/2019/09/10/#snappy.txt

mborzeckimorning05:15
zygaGood morning05:25
zygaI’m doing my school run now. See you all later05:26
zygaback now06:08
zygamborzecki: it's cold today06:08
zyga13C at most06:08
zygabrrr06:08
zygaI hope it won't rain later06:08
mborzeckiand rainy here06:08
zygamborzecki: still, all the cars are stuck in traffic06:09
zygabiking to school is way more robust06:09
zygahow are we doing today?06:09
zygatests were awful yesterday06:09
zygafailing left and right on random stuff06:09
zygastore, portals, you name it06:09
zygagood morning mvo06:14
zygaa little cold and rainy today :)06:15
zygahow was your Monday?06:15
zygaquick breakfast06:19
mvohey zyga06:19
mborzeckimvo:  hey06:21
mvohey mborzecki06:21
mupPR snapd#7425 closed: channel: introduce Resolve and ResolveLocked <Created by pedronis> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/7425>06:25
mborzeckizyga: can you take another look at #7412 ? looks like we could land it easily06:25
mupPR #7412: tests: run dbus-launch inside a systemd unit <Test Robustness> <Created by mvo5> <https://github.com/snapcore/snapd/pull/7412>06:25
zygasure06:25
mvolooks like we can merge 7342 too?06:26
mvoand 7125 needs a second review (should be simple)06:26
zyga+1 on 741206:27
zygamvo, do you want me to merge or do you want to yourself?06:27
zygamvo: +1 on 734206:28
mvoI can do the merge, I just noticed it has no second +106:28
zygalooking at 7125 now06:28
mupPR snapd#7412 closed: tests: run dbus-launch inside a systemd unit <Test Robustness> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7412>06:29
mupPR snapd#7342 closed: fixme: rename ubuntu*architectures to dpkg*architectures <Created by ardaguclu> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7342>06:31
zygamvo: reviewed 7125, +1 but please check my comment there06:42
mvozyga: thanks, looking now06:43
zygathank you06:50
zygamvo, mborzecki: I'd like to ask for a review of https://github.com/snapcore/snapd/pull/743506:52
mupPR #7435: tests: explicitly restore after using LXD <Test Robustness> <Created by zyga> <https://github.com/snapcore/snapd/pull/7435>06:53
zygait's a blocker for progress on https://github.com/snapcore/snapd/pull/716806:53
mupPR #7168: tests: measure testbed for leaking mountinfo entries <Test Robustness> <Created by zyga> <https://github.com/snapcore/snapd/pull/7168>06:53
zygaI have one PR slot open so I'll work on finishing and proposing a mount-ns extension that involves a mimic, so that we can properly evaluate https://github.com/snapcore/snapd/pull/7436 later06:54
mupPR #7436: many: make per-snap mount namespace MS_SHARED <Bug> <Created by zyga> <https://github.com/snapcore/snapd/pull/7436>06:54
zygamvo: offtopic, last night I was playing with raspberry pi06:57
zygaand I think we can slightly improve our watchdog story there06:57
zygaspecifically around try mode boots06:57
mvozyga: oh? tell me more06:59
zygaI read a little about the watchdog on the pi07:00
zygait's a bit weird, has fixed 15 second interval07:00
zygawe could enable it from the boot loader07:00
zygaso any try mode boot can recover07:00
zygaI will poke around in free time over evenings07:00
zygamaybe I will reach something that works07:00
abeatohey, is anybody experiencing problems with the telegram snap? I've opened https://forum.snapcraft.io/t/telegram-snap-fails-to-start/1313207:01
zygaabeato: that's new to me07:02
zygaabeato: ls -ld /mnt ?07:02
abeatozyga, $ ls -ld /mnt07:03
abeatodrwxr-xr-x 2 root root 4096 jul 19  2016 /mnt07:03
zygaok, so regular directory, not a symlink07:03
=== pstolowski|afk is now known as pstolowski
pstolowskimornings07:03
zygaabeato: can you please check how many files matching "*snap-confine*" glob are present in /etc/apparmor.d/07:05
abeatozyga, https://paste.ubuntu.com/p/xt7NyYgPKr/07:06
zygathat's that!07:06
zygamvo: ^^^^07:06
abeatosep 11?07:06
zygaabeato: one of the files is wrong07:06
zygait's a bug in our postinst script I believe07:06
zygamvo: should that be fixed and released?07:06
abeatohm, interesting07:06
zygaabeato: what does "apt-cache policy snapd" say07:06
zygathat will be most useful to mvo07:06
abeatozyga, https://paste.ubuntu.com/p/G2JMwHJSJG/07:07
zygaabeato: the fix for this bug is in 61cc58dbb0a7a1a785e9e3c391b6f593df89283907:08
zygaDate:   Wed Aug 14 09:43:41 2019 +020007:08
zygait may not be released yet, perhaps07:08
zygamvo: ^ can you confirm if 2.40 has this07:08
mvoabeato: yeah, the /etc/apparmor.d/usr.lib.snapd.snap-confine should not be there :(07:09
mvozyga: yeah 2.40 should fix it07:09
zygaif you want to fix your system please remove the file mvo mentioned and call sudo apparmor_parser -r /etc/apparmor.d/usr.lib.snapd.snap-confine.real07:09
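A minimal sketch of the check zyga asked for above (how many files matching "*snap-confine*" sit in /etc/apparmor.d/), assuming, as mvo indicated, that usr.lib.snapd.snap-confine.real is the only profile that should be there. The function name is hypothetical, and it only lists candidates; removing the stale file and reloading the profile with apparmor_parser stays a manual step.

```python
from pathlib import Path

# The profile that is expected to exist, per the conversation.
EXPECTED_PROFILE = "usr.lib.snapd.snap-confine.real"


def find_unexpected_snap_confine_profiles(apparmor_dir="/etc/apparmor.d"):
    """Return sorted names of files matching *snap-confine* other than the expected one.

    Hypothetical helper for illustration; it never modifies anything on disk.
    """
    directory = Path(apparmor_dir)
    return sorted(
        path.name
        for path in directory.glob("*snap-confine*")
        if path.is_file() and path.name != EXPECTED_PROFILE
    )


if __name__ == "__main__":
    for name in find_unexpected_snap_confine_profiles():
        print("unexpected profile:", name)
```

On a system showing the problem described here, the list would be expected to contain usr.lib.snapd.snap-confine.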
abeatomvo, zyga snapd journal: https://paste.ubuntu.com/p/wHsRR4R2xD/07:09
zygamvo: in that case the fix doesn't work07:09
mvozyga: oh well07:09
mvozyga: let me look at this again07:09
zygathank you!07:09
abeatodo you need more data?07:10
zygaabeato: can you please update the forum thread with the log from this conversation?07:10
mborzeckizyga: can you take a quick look at https://github.com/snapcore/snapd/pull/7109 ? pushed some changes there yday07:10
mupPR #7109: snap-confine: fallback gracefully on a cgroup v2 only system <Created by mvo5> <https://github.com/snapcore/snapd/pull/7109>07:10
mvoabeato: a bugreport (super small, just the data you already pasted)07:10
abeatozyga, sure07:10
mvoabeato: then I will do a sledgehammer fix07:10
zygaabeato: thank you07:10
zygamborzecki: sure, looking now07:10
abeatomvo, launchpad?07:10
mvoabeato: plus the content of the /etc/apparmor.d/usr.lib.snapd.snap-confine please07:10
mvoabeato: yeah07:10
abeatook07:10
mvoabeato: or if it's already in the forum that's fine07:10
mvoabeato: just need a reference in the PR07:11
abeatoit is already in the forum, yes07:11
abeatoI will update then  the forum post07:11
mvoabeato: then that should be fine07:11
mvoabeato: thank you!07:11
abeatonp07:11
mborzeckianyone else seen this google:debian-9-64:tests/main/snap-service-watchdog to fail recently?07:12
mupPR snapd#7438 opened: devicestate: add support for base->base remodel <Created by mvo5> <https://github.com/snapcore/snapd/pull/7438>07:12
mborzeckifor some reason the snap app gets SIGABRT https://paste.ubuntu.com/p/Z6k382RCrD/07:13
abeatomvo, zyga https://forum.snapcraft.io/t/telegram-snap-fails-to-start/13132 updated07:17
mborzeckithis one is interesting https://paste.ubuntu.com/p/kmGzQzZRHJ/ probably something for Chipaca or pedronis07:18
zygastore woes?07:18
mborzeckiidk, nonce is logged in the POST request, so it is sent ;)07:24
mvoabeato: could you please pastebin me /var/lib/dpkg/info/ubuntu-core-launcher.conffiles07:27
mvoabeato: and snapd.conffile too ?07:27
abeatomvo, there is nothing starting as ubuntu-core* in /var/lib/dpkg/info/07:28
=== Greyztar- is now known as Greyztar
mvoabeato: uh, sorry, please see if there is "snap-confine.conffiles"07:29
zygamvo: are conffiles retained after a package is removed?07:29
zygamvo: that is, they remain until purged?07:30
mvozyga: yes07:30
mvozyga: correct07:30
abeatomvo, that file is not there either07:30
abeatomvo, snapd.conffiles exists07:30
abeatomvo, https://paste.ubuntu.com/p/pTN8P7Mtt2/ - but note that I already removed the old file and run apparmor_parser to fix the problem07:31
mvoabeato: any output from grep usr.lib.snapd.snap-confine /var/lib/dpkg/info/*.conffiles07:31
mvo ?07:31
abeato$ grep usr.lib.snapd.snap-confine /var/lib/dpkg/info/*.conffiles07:31
mvoabeato: yeah, thats fine - I'm mostly trying to figure out if its still leftover in some dpkg files07:32
abeato /var/lib/dpkg/info/snapd.conffiles:/etc/apparmor.d/usr.lib.snapd.snap-confine.real07:32
mvoabeato: thats the only match?07:32
abeatomvo, yes07:32
mvoabeato: thanks! I'm slightly puzzled but thats fine, I think I know what to do (even though I'm not sure how this happens, i.e. dpkg should either know about the file or it should be gone :/07:33
abeatoright, it's weird...07:33
zygaabeato: did you develop snapd on this machine?07:39
zygaperhaps it came from some earlier hacking07:39
abeatozyga, no07:39
zygamvo: I need to take a break, back-pain after last evening's longer session07:39
zygaI'll stretch and be back in a few moments07:39
mvozyga: sure thing, get well!07:40
mupPR snapd#7439 opened: packaging: remove obsolete usr.lib.snapd.snap-confine in postinst <Created by mvo5> <https://github.com/snapcore/snapd/pull/7439>07:43
mvoabeato: -^07:44
abeatomvo, great!07:45
mborzeckizyga: https://forum.snapcraft.io/t/significance-of-info-files-in-run-snapd-ns/1293808:16
mborzeckitests are red again :/08:17
mborzeckipedronis: hi, i think you weren't around when i liked it, interesting failure i stumbled upon today https://paste.ubuntu.com/p/kmGzQzZRHJ/08:19
mborzeckis/liked/linked/08:20
pedronismborzecki: weird error given that we see the nonce in the requests log08:21
mborzeckimhm08:21
pedronisfor some reason the nonce the store just gave us is considered invalid08:23
pedronisunless if it's repeats worth poking the store people08:23
pedronisbut we haven't touched anything in that area since a while08:23
Chipacawe do send the exact thing we get back08:28
Chipacafwiw08:28
pedronisyes08:28
pedronisso it's not missing08:28
Chipacathis isn't the first time we've seen this error08:28
Chipacai think it's worth chasing down08:29
* Chipaca gets on it08:29
mvomborzecki: anything changed that make the tests red?08:36
mborzeckimvo: no, looks like the usual stuff, desktop portal, occasionally installing snapd deps from package archive or store hiccups08:37
mvo:(08:37
zygamborzecki: thank you, replied08:46
mborzeckisnap/channel/channel_test.go:139:17: undefined: arch.UbuntuArchitecture on master08:50
mborzeckiopening a pr in a bit08:51
pedronismvo: I did a pass over the 2.42 PRs (except mine), they all need a little bit more work I fear08:52
pedronismborzecki: crossing merges ?08:52
mborzeckipedronis: yes08:52
mvopedronis: ok, I have a look, thank you!08:53
pedronismvo: I'll try to tweak the test to check for systemd version in mine when I get a 2nd, worst case it will not make 2.4208:54
mupPR snapd#7440 opened: snap/channel: fix unit tests, UbuntuArchitecture -> DpkgArchitecture <Simple 😃> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/7440>08:54
mborzeckialso something weird with tests/unit/go on debian, gofmt is not installed (?)08:55
pedronisChipaca: also in #7411 I have two wonderings that maybe you can help with (I @chipaca-ed you on them)08:57
mupPR #7411: cmd/model: output tweaks, add'l tests <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/7411>08:57
Chipacapedronis: ack08:57
Chipacapedronis: i noticed i didn't notice some of the things you pointed out08:58
Chipacawill look in a bit08:58
pedronismvo: your PRs for 2.42 also need 2nd reviews09:32
rogpeppezyga: i wonder if this might have some relevance: Sep 09 19:27:54 localhost snapd[692]: handlers.go:459: Reported install problem for "core18" as f732dba4-d35a-11e9-a660-fa163e6cac46 OOPSID09:57
zygarogpeppe: it means that snapd fails to refresh core1809:57
zygaaborts the transaction09:57
zygaand rolls back09:57
zygathat explains your reboot loop09:57
zygathis is very useful information09:57
rogpeppezyga: i'm not seeing a reboot loop currently FWIW09:58
zygamvo: ^ can we pull the log from that error tracker entry?09:58
zygarogpeppe: it will be re-attempted again09:58
zygarogpeppe: until it refreshes successfully09:58
zygathat report has hints as to what went wrong09:58
rogpeppezyga: ah, so that's what explains the fact that it's rebooting periodically, i guess09:58
zygayes09:58
zygait's the transactional nature09:58
zygait's just not immune to problems09:58
* zyga found a bug (in what he was doing since morning)09:59
rogpeppezyga: here's another one: Sep 09 07:40:40 localhost snapd[715]: handlers.go:459: Reported install problem for "core18" as fc12d9b2-d2e7-11e9-b568-fa163e102db1 OOPSID09:59
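The two journal lines rogpeppe quoted share one shape, so the error tracker ids can be pulled out mechanically. A small sketch, assuming the message format is exactly as it appears in those two lines; the function name and the regular expression are illustrative, not part of snapd.

```python
import re

# Pattern modelled on the two log lines quoted in the conversation.
OOPS_RE = re.compile(
    r'Reported install problem for "(?P<snap>[^"]+)" as '
    r"(?P<oops>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}) OOPSID"
)


def extract_oops_reports(lines):
    """Return (snap name, OOPS id) pairs for every matching log line."""
    reports = []
    for line in lines:
        match = OOPS_RE.search(line)
        if match:
            reports.append((match.group("snap"), match.group("oops")))
    return reports
```

Feeding it the output of the snapd journal would give the ids to look up in the error tracker.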
mvozyga: INFO Waiting for restart...10:01
mvoERROR cannot finish core18 installation, there was a rollback across reboot10:01
zygahuh10:01
zygainteresting10:01
zyganothing magic10:01
zygarogpeppe: what does snap version say?10:02
mvolooks like the restart is not happening, I see a lot of Waiting for restart10:02
mvosnapd version is 2.4010:02
mvozyga: https://pastebin.canonical.com/p/WHVwWjRTVs/10:03
zygathis is snapd + core18 arrangement, right?10:03
rogpeppezyga:10:03
rogpepperogpeppe@localhost:~$ snap version10:03
rogpeppesnap    2.4010:03
rogpeppesnapd   2.4010:03
rogpeppeseries  1610:03
rogpeppekernel  4.15.0-1041-raspi210:03
zygarogpeppe: can you run snap changes10:05
zygaand pastebin that?10:05
rogpeppezyga: http://paste.ubuntu.com/p/GbtJH9N6vv/10:06
zygahow about snap tasks 1110:06
zygaand snap tasks 1310:06
rogpeppezyga: ?10:07
zygarun the command: snap tasks 1110:08
zygaand pastebin the output please10:08
rogpeppezyga: http://paste.ubuntu.com/p/CnjtBWwrtr/10:08
zygaoh10:09
zygaError   yesterday at 19:25 UTC  yesterday at 19:27 UTC  Automatically connect eligible plugs and slots of snap "core18"10:09
zygait failed on auto-connect?!10:09
mborzeckioff to school to pick up the kids, afk for a bit10:09
zygapstolowski: ^ can auto-connect prevent a reboot?10:09
rogpeppezyga: this is task 13: http://paste.ubuntu.com/p/SvvXP7rP5R/10:09
zygasame thing happened here10:09
zygaI think we're getting somewhere now10:10
* Chipaca takes a break10:16
pstolowskizyga: as any other task handler that errors out and triggers undo. it's implemented to retry on conflicts (which it did a couple of times in that log), but then there is a bunch of things it looks up in the state that can error out. slightly weird we don't see what the error was for this task10:17
zygapstolowski: we can presumably ask rogpeppe for the state file10:17
zygado you think it would help10:17
pstolowskiyes it might help, maybe we will be able to track down what changed in the state that made task error out10:18
zygarogpeppe: can you please report a bug on bugs.launchpad.net/snapd10:20
zygarogpeppe: include a rough description of the problem how you see it10:20
pstolowskirogpeppe: so yes, if you can grab and send us state.json that would be great (don't pastebin as it has your macaroon etc)10:21
zygarogpeppe: and then work with pstolowski to attach the logs there10:21
zygarogpeppe: and the state file10:21
rogpeppepstolowski: where does that file live?10:21
pstolowskirogpeppe: /var/lib/snapd/10:21
zygarogpeppe: please make sure to use a private bug (when reporting it) because as pawel said, the state file contains some shared secrets10:21
zygalet us know if you need any help with reporting the bug10:21
zygawe will use it for tracking and eventual regression testing10:22
zygamborzecki: found a small-ish bug just now, /etc/ssl is a special case, as you know10:22
zygamborzecki: as is /etc/alternatives10:22
zygamborzecki: and they don't play nicely with the trespassing detector10:22
rogpeppezyga: are the macaroons the only secret things in there?10:23
zygayes10:23
rogpeppezyga: ok, i've redacted them specifically.10:24
pstolowskirogpeppe: ty10:26
rogpeppezyga, pstolowski: https://bugs.launchpad.net/snapd/+bug/184341710:43
mupBug #1843417: ubuntu core installation goes down regularly <snapd:New> <https://launchpad.net/bugs/1843417>10:43
zygathank you very much10:43
zygawe'll try to get to the bottom of this10:43
zygapstolowski: can we use the state file to simulate a refresh somehow?10:43
rogpeppezyga: thanks. BTW if it can't be fixed fairly soon, i'll need to move to another distribution, because winter is coming and my parents need heat in the house :)10:44
zygarogpeppe: can you try one specific command that may help you10:44
zygarogpeppe: snap refresh core1810:44
zygathat will refresh the base snap10:44
rogpeppezyga: ok, trying10:44
zygathen you can try to refresh snapd10:44
zygaessentially one-by one10:44
zygaone-by-one10:44
zygarather than all at once10:45
zygaif there's some kind of conflict happening10:45
zygait might be averted this way10:45
rogpeppezyga: ok, it's refreshed and is rebooting10:45
rogpeppezyga: well, in a minute10:45
rogpeppezyga: ok, we'll see if it comes back up10:46
zygafingers crossed10:46
zygasystems at scale is complex10:46
zygawe wish to make this totally unattended10:46
zygabut as reality shows, it's not trivial10:46
pstolowskizyga: in theory possible but lots of mocking10:47
rogpeppezyga: i'm somewhat unhappy about the reboot just happening randomly, not under some sort of control, particularly if there's a possiblity that the system might not recover from it10:48
rogpeppezyga: in this particular case, if this happened when people were away, it could result in the house not being heated enough and pipes freezing, leading to expensive damage10:48
zygarogpeppe: you can schedule updates10:49
zygathere's a way to update predictably at very precise moments10:49
rogpeppezyga: oh? how would i do that?10:49
zygahttps://snapcraft.io/docs/keeping-snaps-up-to-date10:49
zygain doubt ask mborzecki, he knows this code very well10:49
zygaor ask degville about the documentation for suggestions or improvements10:50
pstolowskii'm looking into the state10:50
rogpeppezyga: is there some documentation for the actual refresh timer syntax that isn't just examples?10:50
mborzeckire10:51
zygaI don't believe there is, what would you like, a more formal syntax?10:51
rogpeppezyga: yup10:51
zygaI believe it's a set of ranges10:51
zygabut mborzecki can correct me on this10:51
rogpeppezyga: and an explanation of the semantics10:51
zygamborzecki: is there a more formal syntax of https://snapcraft.io/docs/keeping-snaps-up-to-date10:51
rogpeppezyga: with those docs, i'm left guessing10:51
rogpeppezyga: what does the "5" in "fri5" mean, for example?10:51
degvillerogpeppe: I think you're right - we should add a formal explanation of the syntax.10:52
zygarogpeppe: the semantics is that snapd will only attempt to refresh in time that fits that schedule10:52
rogpeppezyga: it's surely not the 5th friday in the month10:52
mborzeckirogpeppe: just examples10:52
zygarogpeppe: it actually is10:52
zygarogpeppe: you can plan a monthly refresh this way10:52
rogpeppezyga: most months don't have a 5th friday10:52
zygaperhaps the example is not the best but that is the intent10:52
zygamborzecki: what would happen if you pick the 5th Friday actually?10:53
mborzeckirogpeppe: exmples list: fri5,23:00-01:00  Last Friday of the month, from 23:00 to 1:00 the next day10:53
mborzeckirogpeppe: it's day[<weeknum>], 5 means the last week basically10:53
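One way to read mborzecki's description (day[<weeknum>], with 5 meaning the last such weekday of the month) is sketched below. This is an interpretation of the chat, not snapd's parser: the function name is hypothetical, and rejecting week numbers outside 1..5 is an assumption made here, since the conversation does not say what something like fri9 would do.

```python
import datetime

WEEKDAYS = {"mon": 0, "tue": 1, "wed": 2, "thu": 3, "fri": 4, "sat": 5, "sun": 6}


def matches_day_spec(day, spec):
    """Check whether a date matches a spec such as 'fri' or 'fri5'.

    Assumed semantics: 1..4 select the n-th occurrence of the weekday
    in the month, 5 selects the last occurrence.
    """
    name, week = spec[:3], spec[3:]
    if name not in WEEKDAYS:
        raise ValueError(f"unknown weekday in {spec!r}")
    if day.weekday() != WEEKDAYS[name]:
        return False
    if not week:
        return True  # plain weekday, any week of the month
    number = int(week)
    if not 1 <= number <= 5:
        raise ValueError(f"week number out of range in {spec!r}")
    if number == 5:
        # Last occurrence: the same weekday one week later is in another month.
        return (day + datetime.timedelta(days=7)).month != day.month
    return (day.day - 1) // 7 + 1 == number
```

Under this reading, fri5 matches 27 September 2019 even though that month has only four Fridays, which is the behaviour the "Last Friday of the month" example describes.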
rogpeppemborzecki: i don't understand why that's the last friday of the month - that's what i mean when i say that it would be good to actually document the semantics10:53
rogpeppemborzecki: if the digit "5" is special, then it should say so10:54
rogpeppemborzecki: for example, would fri9 mean the same thing?10:54
mborzeckirogpeppe: some syntax was initially described here https://forum.snapcraft.io/t/refresh-scheduling-on-specific-days-of-the-month/1239/6 some changes were made along the way though10:55
rogpeppemborzecki: more unknowns: would you be allowed to do "22:00~23:00/2" ?10:55
rogpeppemborzecki: if so, what's the difference between that and "22:00-23:00/2" ?10:55
zygaI think those are all good questions, thank you for engaging with us rogpeppe10:56
rogpeppemborzecki: basically, i'd like to see some actual description of the semantics, not just a set of examples where i'm left to try to infer the actual rules10:56
mborzeckirogpeppe: there's this page too: https://forum.snapcraft.io/t/timer-string-format/656210:57
mborzeckii guess it could use a little update with some details10:58
mupPR snapd#7441 opened: asserts,seed/seedwriter: follow snap type sorting in the model assertion snap listings <Created by pedronis> <https://github.com/snapcore/snapd/pull/7441>10:58
rogpeppethat design doc from niemeyer is a great start. it would be nice if some more of that made it into the actual docs.10:58
rogpeppeit also mentions some stuff that isn't documented at all, such as `0:00~24:00/6:00`11:00
rogpeppebut maybe that's not implemented, i guess11:01
rogpeppezyga: FYI the pi did not successfully reboot.11:02
rogpeppezyga: i'm gonna have to ask for it to be power cycled again11:02
zygarogpeppe: did it reboot but fail or failed to reboot?11:02
rogpeppezyga: i've got no way of knowing, i'm afraid11:03
zygahuh, I see :/11:03
rogpeppezyga: it said it was going down, my ssh connection was terminated, and it hasn't come back up again11:03
zygait has rebooted then11:03
rogpeppezyga: well, it shut down :)11:03
zygaand after someone reboots it again it should roll back11:03
rogpeppezyga: and then i'm back in the same position as before?11:04
zygaI was discussing a way to make that automatic in case of failure on the pi specifically11:04
zygayes11:04
rogpeppezyga: ok. :-(11:04
zygarogpeppe: you can set the schedule to avoid refreshes while we try to understand the cause of the failure11:04
rogpeppezyga: can i set the schedule so refreshes are turned off entirely?11:05
zygano, that's explicitly not available11:05
rogpeppezyga: AFAICS the minimum frequency is once per month11:05
zygarogpeppe: you can try one more thing11:05
zygayou can refresh snapd itself11:05
zygathat should not reboot11:05
zygabut give you new software stack11:05
zygaso that some bugs that were fixed since 2.40 can be applied11:05
zygawell the fixes that is, not the bugs11:06
zygarogpeppe: you can try to refresh snapd to candidate channel to get it11:06
zygarogpeppe: with "snap refresh snapd --candidate"11:06
zygaon core18 systems you no longer need to reboot to get new snapd, fortunately11:06
zygausing the same strategy you could even refresh to a hotfix branch that contains a fix for your machine11:07
zygawhich would allow you to refresh the rest of the system correctly, once we understand the nature of the failure11:07
rogpeppezyga: ok, i'll try that when the system has been restarted, thanks11:09
pstolowskizyga: so, auto-connect handler errors out because WaitRestart() reports a rollback error, we hit the "// TODO: make sure this revision gets ignored for automatic refreshes" case again, there is a revision mismatch there. this was discussed a few months ago when we hit similar case. pedronis also did some work around reboots recently but i'm not sure if that's in play here11:12
pedronispstolowski: the checking for reboots has been added11:14
pedronismaybe we didn't remove all the TODOs11:14
mborzeckizyga: tbh, in situations like this, maybe we should have a mechanism to temporarily disable refreshes, some local assertion or whatnot11:14
pstolowskizyga: so auto-connect is just a victim here, problem is elsewhere11:14
zygamborzecki: yeah11:14
pedronisI don't remember when it landed though11:15
rogpeppezyga: do you think that the failure to reboot correctly is related to this problem here, or just another problem that happens to be exacerbating the issue?11:16
zygaI think that it may be a separate problem11:16
zygaperhaps it'd be good to look at snap boot environment and see what it says11:17
pedronismborzecki: you can set refresh.hold no11:17
mborzeckiaahh11:17
mborzeckiright11:17
pedronisa bit annoying to set, we really need a command for it, but is there11:18
rogpeppezyga: ok, i'm back into the system11:18
pedronispstolowski: we should drop that todo, the check is now done, is done much earlier, in daemon itself11:19
rogpeppezyga: it's maybe interesting that it seems the system only reboots successfully after exactly two power cycles.11:19
zygapedronis: refresh.hold? what is that11:19
pedroniszyga: https://forum.snapcraft.io/t/system-options/87#heading--refresh-hold11:20
pedronisit's on same page as timer11:20
zygaah, it's not on https://snapcraft.io/docs/keeping-snaps-up-to-date11:21
mborzeckizyga: afaiu it's not intended to be used by the user11:21
pedroniswell, it's annoying but it exists11:22
pedronisI would not recommend to use it without a reason11:22
pedronisbut it can be used11:22
rogpeppeam i right that there's no way to specify that the system will refresh on the first day of every month?11:24
zygaI believe you're correct11:25
pstolowskipedronis: should the restart check be dropped from auto-connect handler?11:25
rogpeppezyga: thanks11:25
pedronispstolowski: in which sense?11:25
pedronisthe answer is likely no11:26
pstolowskipedronis: removing WaitRestart() from auto-connect11:26
pedronispstolowski: no11:26
pedronisit will be hit for a bit until we restart/reboot11:26
pedroniswhat I said is checked earlier if when a reboot is triggered11:26
pedroniswhether it happened11:26
pedronisat some point we might be able to not use WaitRestart but that requires changes to taskrunner etc11:27
pstolowskipedronis: i see. ok, we need to find out what triggers this check to fail sometimes11:31
pedronispstolowski: nowadays, if we reboot 3 times and fails it will trigger11:32
pstolowskii've a power outage, need to power off soon before my ups runs out of battery11:32
pedroniswell, try to reboot 3 times11:32
zyga2019 is still the year when I wish launchpad to support markdown11:37
* pstolowski lunch11:41
zygamborzecki: I reported two bugs that I found today https://bugs.launchpad.net/snapd/+bug/1843421 and https://bugs.launchpad.net/snapd/+bug/184342311:42
mupBug #1843421: snap-update-ns doesn't know about the special property of /etc/ssl and /etc/alternatives <snapd:Confirmed> <https://launchpad.net/bugs/1843421>11:42
mupBug #1843423: snap-update-ns fails to construct a layout in /etc/test-snapd/foo <snapd:Confirmed> <https://launchpad.net/bugs/1843423>11:42
zygaondra: ^ some useful bugs for you11:43
zygaogra: in case you run into something like that in the field11:43
zygaI meant ondra twice but I think ogra may run into things like that as well11:44
mborzeckizyga: btw. does s-u-n need to know about nsswitch.conf too?11:45
zygamborzecki: probably so11:45
ondrazyga thank you :)11:46
zygaI'll do my best to fix them obviously11:46
zygawriting tests is useful11:46
Chipacaimma go lunch11:56
zygaenjoy :)11:57
Chipacai'll try11:57
zygais it just me or are things extra slow today12:11
zygasetting up main test suite takes about 10 minutes to complete12:11
mborzeckizyga: and spread test are failing12:13
zygahow?12:13
zygaI haven't seen any failures though I'm mostly writing new tests now12:14
zyga(but no store related failures during that process either)12:14
=== ricab is now known as ricab|brb
=== grumble is now known as \emph{grumble}
* zyga quick lunch12:41
=== ricab is now known as ricab|lunch
zygajdstrand: hello, can you please enqueue https://github.com/snapcore/snapd/pull/7421 for a non-security review but rather a concept review of the idea13:25
mupPR #7421: cmd/snap-confine: unmount /writable from snap view <Created by zyga> <https://github.com/snapcore/snapd/pull/7421>13:25
jdstrandzyga: sure13:38
pedronismborzecki: I don't understand the spread failures, many of them don't even seem to have clear errors, or I'm not looking right (quite possible)13:42
Saviqam I doing something dumb?13:59
Saviq$ snap info multipass | grep installed13:59
Saviqinstalled:   0.9.0-dev.171+g7a968814              (x4) 194MB classic13:59
Saviq$ snap refresh multipass --revision 1125 --amend13:59
Saviqerror: local snap "multipass" is unknown to the store, use --amend to proceed anyway13:59
mupPR core18#139 opened: hooks: add missing dosfstools to get fsck.fat <Created by mvo5> <https://github.com/snapcore/core18/pull/139>14:01
zygarogpeppe: hello14:01
zygarogpeppe: we have some more ideas14:01
rogpeppezyga: cool!14:02
rogpeppezyga: BTW ISTM that the fail-to-reboot issue is the main problem here - i'm not sure that the other issue would be a real problem if the reboot hadn't failed to restart14:03
zygarogpeppe: we looked some more and we suspect the boot partition that uses fat is corrupted, we found a bug related to absent fsck on core18 systems14:03
zygarogpeppe: we devised a way forward that you should be able to do remotely14:03
zygarogpeppe: if you have an app using "core" installed you should have access to fsck.vfat from /snap/core/current/usr/sbin/14:03
zygarogpeppe: you can use that to fsck the boot partition14:03
zygarogpeppe: you can unmount it for the duration of the check as well14:04
rogpeppezyga: given that i re-flashed the card very recently, it seems slightly unlikely that it's corrupted already (within a few hours of first installing) but happy to try14:04
zygarogpeppe: mvo looked at some of the error tracker logs and found what I believe was kernel telling us about fs corruption of the FAT partition14:04
zygarogpeppe: so that's the first step, I think you know how to run that without hand-holding but please ask for help if you need any14:05
zygarogpeppe: try to run it in mode verbose enough for us to see if there were any errors there14:05
rogpepperogpeppe@localhost:~$ ls -l /snap/core/current/usr/sbin/*fsck*14:05
rogpeppels: cannot access '/snap/core/current/usr/sbin/*fsck*': No such file or directory14:05
rogpepperogpeppe@localhost:~$ ls -l /snap/core/current/usr/sbin/*vfat*14:05
rogpeppels: cannot access '/snap/core/current/usr/sbin/*vfat*': No such file or directory14:05
zygaoh, silly me14:06
zygajust /snap/core/current/sbin/14:06
zyganot /usr14:06
rogpeppezyga: so i'm planning to run these commands; do they seem right to you?14:08
rogpeppeumount /boot/uboot14:08
rogpeppefsck -V /dev/mmcblk0p114:08
zygayes14:08
zygathey look good14:08
zyga(assuming PATH is setup to find the fsck.vfat)14:09
zygarogpeppe: as an extra remark, it's sometimes good to stop snapd.service during things like this (hand's on experiments)14:10
=== ricab|lunch is now known as ricab
zygato avoid background activity14:10
rogpeppezyga: http://paste.ubuntu.com/p/Fm3RpdC8d2/14:10
zygainteresting!14:11
rogpeppezyga: i'd definitely unmounted the fs14:11
zygaperhaps uboot and kernel disagree on which boot sector to use and then something gets out of sync later14:11
zygafor the purpose of experiment, copy original to backup14:11
zygaI _believe_14:11
zygathat is what the kernel would use14:12
zygabut I welcome the advice of mborzecki14:12
zygamborzecki: ^14:12
rogpeppezyga: and remove dirty bit?14:12
zygayes, but please understand my POV of trying to fix the partition and see if that means you can correctly boot out of the problem14:12
zygaone more idea14:12
zygaperhaps tarball all of the boot partition14:12
zygaor even dd the whole partition to ext4 somewhere14:13
zygafor forensics14:13
zygadd is better14:13
rogpeppezyga: sure, i could send you a copy14:13
zygaas you don't have to mount do do14:13
zyga*to do14:13
zygayes, we can then look at it bit by bit in hexedit14:13
zygafortunately we don't keep that many files there14:13
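zyga's suggestion to dd the whole partition for forensics can also be done from Python where dd is not convenient. A rough sketch, assuming the boot partition is /dev/mmcblk0p1 (the device rogpeppe named above), that it is not being written to during the copy, and that a gzip-compressed image is acceptable; the function name and output path are hypothetical.

```python
import gzip


def save_partition_image(device_path, image_path, chunk_size=1024 * 1024):
    """Copy a block device (or any file) byte for byte into a gzip image.

    Returns the number of uncompressed bytes copied. Reading the raw
    device means the filesystem does not have to be mounted.
    """
    copied = 0
    with open(device_path, "rb") as source, gzip.open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            image.write(chunk)
            copied += len(chunk)
    return copied


if __name__ == "__main__":
    # Needs root to read the device; paths are examples only.
    size = save_partition_image("/dev/mmcblk0p1", "/tmp/boot-partition.img.gz")
    print(f"copied {size} bytes")
```

The image should be written to a different filesystem than the one being copied, for example the ext4 location zyga mentions.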
rogpeppezyga: ok, downloading disk image now; will upload to s314:15
zygathank you14:15
rogpeppezyga: i still find it very odd that it only boots up ok once every other time. i can't think of something that might be causing such predictable boot failures.14:17
zygaI can offer one14:17
mborzeckizyga: fwiw, i think it's worth checking whether the same incorrectly unmounted warning appears on a cleanly built image14:17
mborzeckilike out of the box14:17
zygamborzecki: good idea14:18
zygarogpeppe: if uboot reads the FAT differently (we saw that at least once in the past) and sees a different file than linux14:18
zygathen snapd will configure the boot loader to boot kernel-1, core-1 in "trying" mode14:18
rogpeppezyga: BTW the s/w i'm running ran fine without any issues for months on end previously14:18
zygathe boot loader will go but never see those values14:18
zygabooting something else14:18
zygaperhaps something that is removed now14:18
zygaor perhaps something that is there but disagrees with what snapd expected14:19
zygaso snapd will change boot configuration again14:19
zygaand plan another reboot14:19
zygarogpeppe: but the point is that the oscillation may be kernel/uboot disagreeing on the contents of a specific file14:20
zygaand snapd writing to that file in between boots14:20
rogpeppezyga: ok, interrresting14:22
mupPR snapd#7440 closed: snap/channel: fix unit tests, UbuntuArchitecture -> DpkgArchitecture <Simple 😃> <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7440>14:23
rogpeppezyga: try this: https://rogpeppe-scratch.s3.amazon.com/bootcopy.gz14:25
zygagrabbing14:25
zygadns issue?14:25
rogpeppezyga: i've probably forgotten how s3 urls work14:26
Chipacai thought they'd turned off that feature14:26
Chipacaof being able to just url the stuff14:27
Chipaca(but i didn't pay too much attention to that email because i don't use it)14:27
rogpeppezyga: just usual amazon eventual consistency issues; try again14:29
mborzeckizyga: fsck doesn't complain with a pristine core image14:29
zygamborzecki: thank you for checking!14:29
zygarogpeppe: still nothing14:29
rogpeppezyga: ha, it works for me :-\14:29
zygamy dns may have cached it as gone?14:29
zygahow big is it?14:30
zygacan you email me@zygoon.pl14:30
rogpeppezyga: 48621287 bytes14:30
zygamight be faster :)14:30
zygaI'll go over the bits in the evening to see what's wrong14:30
zygameanwhile you can attempt to fix FAT14:30
zygaor even remake it and copy the files back14:30
rogpeppezyga: for the time being, i've just slowed down updates so it'll only update on the first monday of the month14:31
zygaexcellent14:31
zygarogpeppe: as pedronis said, you can also use refresh.hold to delay up to 60 days14:31
rogpeppezyga: sent14:32
mupPR snapd#7442 opened: tests: extend mount-ns test to handle mimics <Created by zyga> <https://github.com/snapcore/snapd/pull/7442>14:33
rogpeppezyga: i think 60 days isn't enough longer than 30 days to justify the unpredictability14:33
zygasure, just saying14:33
zygalet's try to fix that FAT online14:33
zygaapply all the fixes that fsck would normally do14:33
rogpeppezyga: you think that might fix the issue for good?14:34
zygaI think it's likely14:34
zygabut we don't have the data trends from the error tracker AFAIK so perhaps there's more but we're not aware of it14:34
rogpeppezyga: this is quite concerning BTW - this was an absolutely pristine image created following the instructions on the web to the letter14:34
rogpeppezyga: if i'm seeing this problem, then i'd guess that everyone else using ubuntu core is too14:35
zygaindeed, we proposed that the boot partition is read only outside of the transactions that need to use it14:35
zygaso that random power failures don't leave FAT in mounted state14:35
rogpeppezyga: that seems like a good idea.14:35
zygarogpeppe: there are some issues that can be specific to your device, e.g. the SD card may be just really failing14:35
zygaI have a number of cards that reliably corrupt a fixed offset14:35
rogpeppezyga: it's a near-new SD card too, but i guess that's possible14:35
zygayou can write zeros or ones, you keep reading that blob that they somehow store14:36
zygarogpeppe: the one I have is little-used sandisk pro 32GB card14:36
zygarogpeppe: it failed in a few weeks after purchase14:36
rogpeppezyga: that would make it both 32GB SD cards that are failing in a similar way then14:36
rogpeppezyga: because i saw a similar issue with the Pi 2 and assumed the sd card had gone14:37
rogpeppezyga: i actually have that pi with me in fact14:37
zygaI got the boot image now, thank you14:37
zygarogpeppe: try something like 'etcher' to see if you can write a pristine image and read it back correctly if you want to check that14:38
rogpeppezyga: yeah, i'll try that14:38
rogpeppezyga: i wasn't surprised when the original sd card failed BTW - it had been in constant use for about 3 years.14:40
rogpeppezyga: ... if it did fail, of course14:40
zygaindeed, analysis will reveal the cause14:40
zygathere are some moving parts14:40
zygaand some failures on our end14:40
zygasil2100: hey, there's one PR for core1814:40
zygasil2100: it's related to what we are discussing now14:40
zygasil2100: we didn't seed fsck.vfat on core1814:41
zygasil2100: do you think you could review it please?14:41
zygahttps://github.com/snapcore/core18/pull/13914:41
mupPR core18#139: hooks: add missing dosfstools to get fsck.fat <Created by mvo5> <https://github.com/snapcore/core18/pull/139>14:41
mvorogpeppe: thank you so much for reporting this - and yes, super concerning to us and we are digging into it14:48
* mvo hugs zyga for digging into it14:48
rogpeppemvo: thanks for your interaction :)14:48
zygarogpeppe: this is what I see on the partition: https://pastebin.ubuntu.com/p/tVXVjCK99Q/14:49
zygaI will now check the contents of config.txt, cmdline.txt and uboot.env14:50
zygafor one, I like that mtools exists14:50
zygaand wish that there was an ext4 variant :)14:50
rogpeppezyga: i don't know about mtools...14:50
zygarogpeppe: it's a GNU tool for interacting with FAT offline14:50
sergiusenshello folks, is there a fix upcoming for "- Download snap "crystal" (71) from channel "latest/stable" (stream error: stream ID 1; PROTOCOL_ERROR)" ?14:51
sergiusensthis is really slowing us down14:52
zygasergiusens: we don't have a fix, only a workaround mvo worked on15:00
sergiusenszyga: does the workaround require work (a setting) on our side?15:01
ondrazyga I found one cloud instance with 4 broken snaps I was not able to remove, with latest snapd, I was able to remove all them :) Great work!15:05
zygasergiusens: no, it's automatic15:05
zygaondra: thank you so much :)15:05
ondrazyga thank you for fixing it :)15:06
zygaondra: fixing bugs is sometimes very draining, I'm very glad I was able to help you and others in FE15:06
zygarogpeppe: this is the hexyl dump of uboot.env, I'll check out what it says next https://paste.ubuntu.com/p/GgScB76RyT/  -- specifically to see if it is in agreement with snapd's state15:06
zygathough I must say that the colorized output from hexyl is easier to read as it shows NUL bytes and other such stuff in clear, distinct color15:07
zygamvo: one other idea, just looking at this, is to have two uboot environment files: one for the fixed program and the other one for just the handful of actual variables we need15:08
zygaoh, mvo is not online anymore15:08
zygarogpeppe: so looking here I see we have the following things: snap_core=core18_1076.snap, snap_kernel=pi-kernel_44.snap, snap_mode= (empty string), snap_try_core=core18_1100.snap, snap_try_kernel=pi-kernel_51.snap,15:12
zygaI need to reference the boot logic for a second to understand snap_mode=""15:12
zygamborzecki: ^ unless you remember15:12
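[Editor's sketch] Reading the variables out of a hex dump by eye works, but the same list can be extracted programmatically. The sketch below is an assumption-laden illustration, not snapd's own parser: it assumes the common U-Boot environment layout of a CRC32 stored little-endian in the first 4 bytes, an optional flag byte (hence the hypothetical `header_size=5` default; use 4 for a layout without the flag byte), and then NUL-separated `key=value` entries ended by an empty entry. The function name `parse_uboot_env` is made up for this example.

```python
import zlib


def parse_uboot_env(blob, header_size=5):
    """Return (crc_ok, variables) for a raw U-Boot environment blob.

    Assumes a little-endian CRC32 in the first 4 bytes that covers
    everything after the header, then NUL-separated key=value entries.
    """
    stored_crc = int.from_bytes(blob[:4], "little")
    payload = blob[header_size:]
    crc_ok = (zlib.crc32(payload) & 0xFFFFFFFF) == stored_crc

    variables = {}
    for entry in payload.split(b"\0"):
        if not entry:
            break  # an empty entry marks the end of the variable list
        key, _, value = entry.partition(b"=")
        variables[key.decode("ascii", "replace")] = value.decode("ascii", "replace")
    return crc_ok, variables


# Hypothetical usage:
# with open("uboot.env", "rb") as f:
#     ok, env = parse_uboot_env(f.read())
# print(ok, env.get("snap_mode"), env.get("snap_try_core"))
```

An entry such as `snap_mode=` parses to an empty string value, which matches the empty snap_mode seen in the dump.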
zygarogpeppe: do you have the snaps and revisions listed there in /var/lib/snapd/snaps/15:12
zygathey should all be mounted as well15:13
zygaand visible in snap list --all15:13
zygarogpeppe: (output of snap list --all would help as well)15:13
mborzeck1zyga: this? https://github.com/snapcore/core-build/blob/master/initramfs/scripts/bootloader-script#L8915:16
mborzeck1zyga: or this one https://github.com/snapcore/pi3-gadget/blob/16/uboot.env.in#L4815:17
zygahuh15:17
zygahttps://github.com/snapcore/core-build/blob/master/initramfs/scripts/bootloader-script#L10415:17
rogpeppezyga: http://paste.ubuntu.com/p/ktw4XYgrvG/15:18
zygaI don't understand how that works15:18
zygaah, wait15:18
zygadidn't notice nesting15:18
* zyga re-reads15:18
zygarogpeppe: do you have core18 at revision 1100?15:19
zygaI mean15:19
zygaI don't see it15:19
zygaso I assume that's why it fails15:19
zygait seems we are trying to get to core18 revision that's simply not here15:19
zygathat would explain the immediate failure15:20
rogpeppezyga: i see core18 at 107615:20
zygathough I didn't check what uboot script does if it cannot find that snap15:20
rogpeppezyga: what's special about 1100?15:20
zygarogpeppe: nothing, it's just referenced from your uboot.env but not present on the system15:20
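[Editor's sketch] The check zyga did by hand here (does every snap named in the boot variables actually exist on disk?) can be written down as a few lines of Python. This is only a sketch under assumptions: the variable names are the ones quoted from uboot.env earlier in the log, the default directory is the /var/lib/snapd/snaps path mentioned above, and `missing_boot_snaps` is a hypothetical helper, not part of snapd.

```python
import os

# Boot variables that name a snap file, as quoted from uboot.env in the log.
BOOT_SNAP_VARS = ("snap_core", "snap_kernel", "snap_try_core", "snap_try_kernel")


def missing_boot_snaps(variables, snaps_dir="/var/lib/snapd/snaps"):
    """Return (variable, filename) pairs whose snap file is absent from snaps_dir."""
    missing = []
    for name in BOOT_SNAP_VARS:
        filename = variables.get(name, "")
        # Unset or empty variables reference nothing, so skip them.
        if filename and not os.path.exists(os.path.join(snaps_dir, filename)):
            missing.append((name, filename))
    return missing


# Hypothetical usage with the values read from the dump:
# env = {"snap_core": "core18_1076.snap", "snap_try_core": "core18_1100.snap"}
# print(missing_boot_snaps(env))
```

A non-empty result is exactly the kind of mismatch between boot variables and reality discussed here.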
rogpeppezyga: oh, i see15:20
rogpeppezyga: i wonder how that happened15:21
zygaindeed15:21
zygathough snapd may have undone 1100 transaction15:21
zygaremoving the file from disk15:21
zygaif you feel lucky, fix the boot partition with fsck15:22
zygasnap refresh core1815:22
zygaand check if it manages15:22
zygaone other lesson from this15:22
zygais for snapd to fix any boot variables that are inconsistent with reality15:22
zygawe have one boot mode15:22
zygaas in, one variable called snap_mode15:23
zygathat impacts two variables "trying"15:23
rogpeppei'm not feeling very lucky currently :)15:23
zygaand it's clear that in this case there's a chance one of them will fail15:23
zygarogpeppe: I'll collect this in a retrospective15:23
zygathere's a lot for us to learn from this15:23
zygarogpeppe: I would suggest to fix the FAT partition15:23
zygathat might be enough to fix other issues15:24
rogpeppezyga: ok, i'll try that15:24
zygaI'll collect all of this for a retrospective and share with you15:24
zygaI was thinking about breaking for dinner now15:25
rogpeppezyga: ok, dirty bit  removed. it didn't give me an option to do anything else15:25
rogpeppezyga: i suspect i may have done the wrong thing there :-'15:25
rogpeppe:-\15:25
zygaoh?15:25
zyganote, you can always dd it back!15:25
rogpeppezyga: good point!15:26
zygawhat did you do?15:26
zygatry fixing it, mounting it15:26
zygaand looking at the files15:26
rogpeppezyga: http://paste.ubuntu.com/p/MtVGCFGGwk/15:26
zygaat least the boot.env15:26
rogpeppezyga: i suspect i should've said "no" to removing the dirty bit, i think15:27
zygaah, no15:27
zygaI think that's fine15:27
zygait's just a bit15:27
zygafat has no journal15:27
zygaso apart from top-down scan there's little to do15:27
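[Editor's sketch] For readers wondering what "just a bit" refers to: as far as the editor understands the FAT on-disk layout, the "not cleanly unmounted" flag lives in bit 0 of a reserved byte in the boot sector, at offset 37 on FAT12/FAT16 and offset 65 on FAT32. The sketch below reads that flag from the first 512 bytes of an image. Treat the offsets and the FAT32 detection rule (16-bit sectors-per-FAT field at offset 22 equal to zero) as assumptions to verify against the FAT specification; `fat_dirty_flag` is a hypothetical name and this is not how fsck.fat is implemented.

```python
import struct


def fat_dirty_flag(boot_sector):
    """Return True if the boot sector's 'dirty' state bit is set.

    boot_sector must hold at least the first 512 bytes of the FAT volume.
    """
    if len(boot_sector) < 512 or bytes(boot_sector[510:512]) != b"\x55\xaa":
        raise ValueError("not a FAT boot sector (missing 0x55AA signature)")
    # The 16-bit sectors-per-FAT field is zero on FAT32 volumes (assumption).
    (fat16_length,) = struct.unpack_from("<H", boot_sector, 22)
    state_offset = 37 if fat16_length else 65
    return bool(boot_sector[state_offset] & 0x01)


# Hypothetical usage on a saved image of the boot partition:
# with open("boot-backup.img", "rb") as f:
#     print(fat_dirty_flag(f.read(512)))
```

Because FAT has no journal, clearing this flag does not repair anything by itself; it only records that the volume is considered clean.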
rogpeppezyga: i thought i'd get the option to address the other issue too15:27
rogpeppezyga: so the dirty bit is the only thing that was wrong?15:28
zygaah right15:28
zygaI honestly don't know15:28
zygacan you fsck again?15:28
zygamaybe with some --force option15:28
rogpeppezyga: a ran it again - it does nothing15:29
rogpeppe*i* ran it again15:29
zygadid you use -V?15:30
zygarogpeppe: so15:38
zygarogpeppe: writing the report I'm not sure we understand what really failed on boot15:38
zygarogpeppe: we know that the fat was slightly corrupt15:38
zygathat it was not unmounted cleanly15:38
zygarogpeppe: we do know that core18 revision 1100 was missing from your disk15:38
zygarogpeppe: though perhaps it was removed by snapd in its undo path15:39
zygarogpeppe: I would like to know if you'd be willing to attempt another reboot, at your convenience, coupled with another reboot done by "snap refresh core18"15:39
rogpeppezyga: so one reboot without "snap refresh core18", then run "snap refresh core18", then let that reboot by itself?15:40
zygayes15:40
zygabut only if you have confidence you can recover manually15:40
zygaand not super inconvenient for you15:40
rogpeppezyga: ok, i'll try that. what should i do when it fails to restart after the first reboot?15:41
zygapower cycle15:41
rogpeppezyga: ok15:42
zygaif that fails we may be SOL but I don't think it will come to that15:42
zygayou may want to mount the partition back15:42
zygamvo: hey welcome back15:42
rogpeppezyga: does that make a difference?15:42
zygarogpeppe: in case snapd wishes to write to it15:42
zygaotherwise no15:42
rogpeppezyga: ok, i'll mount it again now15:42
mvohey zyga - whats new?15:43
zygamvo: we found some things, one sec, I'll share my notes15:43
ijohnsonhey folks, could I get another review on https://github.com/snapcore/snapd/pull/7429 ? mvo maybe if you're not EOD yet and have a couple minutes :-)15:47
mupPR #7429: wrappers/services: add CurrentSnapServiceStates + unit tests <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/7429>15:47
mvoijohnson: heh, let me have a look15:55
ijohnsonthanks :-)15:57
zygarogpeppe: let us know what you find please15:58
* Chipaca goes for a run16:01
mvoijohnson: you have feedback16:02
mvoijohnson: should be super simple16:02
ijohnsonyay thanks looking now16:02
rogpeppezyga: will do. it might be tomorrow.16:04
zygaack, thank you for the note16:04
ijohnsonmvo: thanks I fixed the out of date comment / for loop, but I think it's still nice to have the more verbose/complete systemctl script16:13
mvoijohnson: thats fine16:14
mvoijohnson: keep it if you prefer it :)16:14
ijohnsonokay, cool so with your and mborzecki's review am I good to merge?16:14
* ijohnson waits on the merge button16:15
mvoijohnson: yes16:15
ijohnsonoh well I guess the tests have to pass too16:15
mvoijohnson: *cough*16:15
ijohnson:-O16:15
mvoijohnson: :(16:15
ijohnsonthey're not failing just haven't started running yet16:15
mvoijohnson: ok16:16
ijohnsonthanks mvo, I'll merge sometime in my afternoon then16:17
mvoijohnson: good luck!16:18
ijohnson:-)16:18
=== pstolowski is now known as pstolowski|afk
rogpeppeanyone got any idea why `xdg-open` has stopped opening pdf files correctly for me (it opens them in an ebook viewer). I think it might have something to do with the fact that `xdg-mime query filetype anyfile.pdf` doesn't print anything, but I'm not sure how that works.17:25
rogpeppeoops, wrong channel, sorry!17:25
mupPR snapcraft#2644 closed: Release changelog for 2.44 <Created by sergiusens> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2644>17:44
mupPR snapcraft#2709 opened: incorporate content provider snaps in dependency resolution <Created by cjp256> <https://github.com/snapcore/snapcraft/pull/2709>20:14
