[05:02] PR snapd#8592 opened: Add initial support for the hardware-control interface
[07:13] morning
[07:25] Hi
[07:58] hi. i get this when trying to install a snap in a fresh ubuntu:20.04 lxd container: error: too early for operation, device not yet seeded or device model not acknowledged
[07:59] Hey BjornT
[08:00] This is a rather unfortunate part of our current implementation. After installing snapd there is a brief moment when snapd responds on the socket but is otherwise unable to do anything yet, as it is setting up some stuff
[08:00] Normally it lasts a moment
[08:00] Does it still happen?
[08:00] zyga: yes, it doesn't seem to go away
[08:01] Can you run “snap changes” please
[08:01] zyga: https://paste.ubuntu.com/p/XmmVZfjgNG/
[08:02] Can you try doing what you wanted before again? It seems that initialization is now done
[08:03] zyga: ah, yes, now it works
[08:04] Cool
[08:13] Hi, if I have a snap refresh timer set to our maintenance windows, is there a way to check if there is actually a new snap that's going to be installed during that time?
[08:23] klaasvakie: I don't believe there is, we don't have "snap refresh --dry-mode"
[08:23] but there's a chance you could guesstimate that by looking at snap info
[08:23] and checking the revision numbers installed on your system
[08:23] and the channel you are tracking
[08:23] and checking if snap info reports a different revision in the channel that is being tracked
[08:24] if so, "snap refresh" will pick that revision
[08:27] pstolowski: challenge for today, figure out what nukes some packages so that we don't have user-session services
[08:27] ok, thanks zyga. I'll script something around this. It basically makes the difference whether I have to stay at work on Friday nights or if I get to go home :)
[08:27] the tests/main/session-tool-support test picks it up
[08:27] normally this test passes all the time but there's an interaction with another test where /{,usr/}lib/systemd/user/dbus.socket goes away
[08:28] we probably remove the package with --autoremove or something similar
[08:28] it's only occurring on deb-based systems so it may be using something hand-coded to apt-get remove stuff
[08:28] klaasvakie: have you thought of using an enterprise proxy?
[08:28] it's a fantastic way to manage your fleet
[08:28] no need to worry about anything
[08:28] it lets you control the revisions your systems see
[08:29] https://docs.ubuntu.com/snap-store-proxy/en/
[08:29] it's super nice IMO
[08:29] I have read about it, but I only started playing with 20.04 and snapd recently so I still have lots to learn
[08:30] klaasvakie: you install the proxy on your network and configure each device to use it
[08:30] klaasvakie: and then the proxy acts both as a cache
[08:30] and as a control point where you can pick which revisions are "current"
[08:30] lets you manage upgrades this way
[08:30] I'll consider the enterprise proxy, thanks for pointing it out
[08:32] pstolowski: check this out please
[08:32] https://github.com/snapcore/snapd/pull/8592/checks?check_run_id=641794081
[08:32] PR #8592: Add initial support for the hardware-control interface
[08:33] panic in overlord/ifacestate tests
[08:34] zyga: oh, not good, will check
[09:06] zyga: running ifacestate tests in a loop, no panic so far (will run hooktest too)
[09:08] is mvo off today?
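The manual check zyga describes to klaasvakie at 08:23 (compare the installed revision with the revision in the tracked channel) could be scripted roughly as below. This is only a sketch: the `snap info` text layout it assumes (a `tracking:` line, an `installed:` line, and per-channel lines carrying the revision in parentheses), the helper name `pending_revision`, and the sample text are all assumptions made for illustration, not a stable interface.

```python
import re

def pending_revision(snap_info_text):
    """Return (installed, available) revisions if they differ, else None.

    Assumes `snap info <name>` style text with 'tracking:', 'installed:'
    and channel lines that carry the revision number in parentheses.
    """
    tracking = None
    installed = None
    channels = {}
    for line in snap_info_text.splitlines():
        key, sep, rest = line.strip().partition(":")
        if not sep:
            continue
        rev = re.search(r"\((\d+)\)", rest)
        if key == "tracking":
            tracking = rest.strip()
        elif key == "installed" and rev:
            installed = int(rev.group(1))
        elif rev:
            # Any other "name: ... (NN)" line is treated as a channel entry.
            channels[key] = int(rev.group(1))
    available = channels.get(tracking)
    if installed is None or available is None or installed == available:
        return None
    return installed, available

# Hypothetical sample, shaped like the assumed output format.
SAMPLE = """\
name:      hello
tracking:  latest/stable
installed:       2.10 (29) 98kB -
channels:
  latest/stable:    2.10 2019-04-17 (38) 98kB -
  latest/edge:      2.10 2019-04-17 (40) 98kB -
"""

print(pending_revision(SAMPLE))
```

In real use the text would come from running `snap info <name>` on the device before the maintenance window; a `None` result would suggest nothing new is waiting in the tracked channel.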
[09:17] zyga, yes, according to hr.c.c at least
[09:17] ah, thank you
[09:17] np
[09:17] he should be back tomorrow according to the calendar
[09:22] zyga: i think they took a swap day for the sprint
[09:22] yeah
[09:22] well deserved
[10:22] PR snapd#8593 opened: Stumb implementation of image.Prepare for darwin
[10:25] zyga: ^
[10:25] oh :)
[10:26] zyga: it's fine as is in master but fails on my other PR, so making it a prerequisite to reduce diff there
[10:26] ok
[10:28] i've tweaked the description
[10:28] +1
[10:29] ty
[11:23] so, what's the most broken thing on master today
[11:24] actually the preseed test
[11:24] pstolowski: is the test fixed?
[11:24] pstolowski: IIRC you sent updates
[11:25] is it just unlucky, can it land now?
[11:25] let me check
[11:25] thank you
[11:26] ah no, it landed. where is it failing?
[11:26] zyga: ^
[11:26] ah, great
[11:26] I just need to rebase then
[11:26] older branch from last week
[11:26] ok
[11:33] zyga: i see tests/main/interfaces-audio-playback-record failure
[11:33] yeah
[11:33] it's interesting
[11:33] + rm -rf /run/user/12345 /home/test/.config/pulse
[11:33] 1476
[11:33] rm: cannot remove '/run/user/12345/doc': Is a directory
[11:33] https://github.com/snapcore/snapd/pull/8581
[11:33] :|
[11:34] maybe I could merge all the fix branches
[11:34] PR #8581: tests: port pulseaudio test to session-tool
[11:34] and just merge them together
[11:39] ah, that, right..
[11:44] I guess
[11:44] with some luck
[11:44] we can land them
[11:45] just reduce the failure ratio one by one
[11:46] pstolowski: https://github.com/snapcore/snapd/pull/8578 is rather simple if you want to look
[11:46] PR #8578: interfaces: add host-docs interface
[11:46] zyga: right, i'll
[12:10] forum.snapcraft.io appears to be down
[12:12] hey #snappy! What is the best way to build a snap, which uses "base:core20"? The "core20" release does not yet seem to be available inside multipass, which makes my build fail...
[12:13] slyon: that's a great question, I don't actually know
[12:13] PR snapcraft#3096 closed: pluginhandler: export SNAPCRAFT_BUILD_BASE to build environment
[12:13] slyon: core20 beta will be released soon,
[12:13] slyon: I suspect it will be generally available after that
[12:16] jdstrand, when you have a chance, can you comment on bug #1876442, please?
[12:16] Bug #1876442: [snap] chromium causing many audit messages in syslog
[12:17] zyga, hmm... thanks for the reply, anyway!
[12:17] slyon: you should be able to use core20 if using snapcraft from candidate/edge channels. there are rough edges still, so it's not yet in the stable channel.
[12:21] jamesh: hey
[12:22] jamesh: it's kind of late, are you around by any chance?
[12:36] morning folks
[12:36] hey Ian
[12:36] roses are red
[12:36] and so is master
[12:36] there is no rhyme
[12:36] it's a disaster
[12:36] just like there is no green
[12:36] # just made up on the spot
[12:36] :D
[12:37] is it just you and pstolowski today?
[12:37] so far
[12:37] hi ijohnson !
[12:37] o/ pstolowski
[12:37] I think Claudio said it was a public holiday there, not sure about Sergio
[12:37] ijohnson: one tiny nitpick to our lxd test PR, i'm happy to approve
[12:37] sure, let me look now
[12:38] pstolowski: thanks for spotting that, I missed it!
[12:38] my back hurts, I would like to exercise a little
[12:38] maybe after standup
[12:40] zyga: so what's all red, is it many things?
[12:40] zyga: is it anything related to the session-tool stuff we landed last week?
[12:40] ijohnson: random stuff we fixed last week that we couldn't land because $(other_thing)
[12:40] ah
[12:40] ijohnson: nothing new AFAIK
[12:40] ijohnson: I think we can push it all in with some luck in sequencing
[12:40] ijohnson: I'd love to land the pulse test first as it fails most often
[12:41] ijohnson: there _is_ something new, something picked up by one of the new tests (session-tool-support)
[12:41] ijohnson: something removes a debian package during execution
[12:41] and that breaks all session tests as systemd --user has no service definitions for dbus anymore
[12:41] ah right I remember you talking with Maciej about this a bit last week
[12:41] that's the dbus-user-session package
[12:41] I'm trying to find it but no luck yet (though I was distracted with general catch-up and the shutdown bug from Alex)
[12:42] ijohnson: if all else fails let's "apt-get install" it in per-test prepare
[12:42] or flag its absence in restore
[12:42] zyga: re the shutdown bug, did you see my notes on that in the SU doc?
[12:42] I'll try to get to that later today but if you beat me to it I'd love to
[12:42] oh, no
[12:42] let me look
[12:42] reading
[12:43] My gut feeling was
[12:43] it's the same bug we fixed recently in the context of lxd-in-classic
[12:43] it's a core system so perhaps this is a false trail
[12:44] but perhaps it's a "snap stop ..." command that runs in ExecStop=
[12:44] and the sequence is that core18 unmounted snapd
[12:44] but we have core
[12:44] and now there's a skew
[12:44] I wonder what happens on a core system
[12:44] where get core18 + snapd
[12:44] zyga: let's discuss the bug in the internal channel
[12:44] and core
[12:44] ok
[13:01] zyga: SU?
[13:01] joining
[13:42] pstolowski, https://travis-ci.org/github/snapcore/spread-cron/builds/682751904#L3799
[13:42] error in the preseed test
[13:43] cmatsuoka, https://travis-ci.org/github/snapcore/spread-cron/builds/682751904#L4031
[13:43] error in encrypt test
[13:45] cachio: thanks, i'll investigate
[13:45] pstolowski, thanks
[13:45] pstolowski: so journal test failed again
[13:46] makes me wonder about something you said recently
[13:46] that restarting journal restarts snapd
[13:46] do I remember that correctly?
[13:46] zyga: yes
[13:46] pstolowski: on core or on classic as well?
[13:46] zyga: but now i'm sending sigusr1. i added some debug to the test
[13:47] zyga: it's core-only config, i didn't check classic
[13:47] I tried various things on my system and journal restarts fine
[13:47] and snapd does *not* restart
[13:47] I wonder why it would behave differently on core
[13:47] zyga: i think it was only on core 16
[13:48] zyga: do you have a log from this failure?
[13:48] pstolowski: yeah,
[13:49] I just restarted journal on core16 - no snapd impact
[13:49] hmm
[13:50] * ijohnson -> quick break
[13:50] pstolowski: the failure was on
[13:50] https://github.com/snapcore/snapd/runs/642670950
[13:50] I need a break to stretch my back
[13:50] but I will look next
[13:50] don't worry :)
[13:58] zyga: my debug there shows: Process: 21744 ExecStart=/usr/lib/snapd/snapd (code=killed, signal=PIPE)
[13:58] oh :)
[13:58] hahah
[13:58] funny
[13:59] sigpipe is ignored in systemd in the future
[13:59] I bet we just need a small tweak
[13:59] let me check this, this is a great hint!
[13:59] zyga: but why is this?
[13:59] pstolowski: I will explain in a sec, just need to check it
[14:00] pstolowski: stdout/stderr are piped to journald
[14:00] pstolowski: https://github.com/cybozu-go/well/issues/13
[14:00] zyga: aaaah
[14:01] zyga: that's annoying, because we tell journald to just reload config :/
[14:01] not to restart
[14:01] software
[14:01] sigusr1
[14:02] must be broken somehow :)
[14:02] it may just do that, it's still closing the pipe apparently
[14:02] I wonder though
[14:02] if that leaves us without any logging
[14:02] because if we ignore it
[14:02] ...
[14:02] do we log anywhere?!
[14:03] https://bugs.freedesktop.org/show_bug.cgi?id=84923
[14:03] heh, indeed
[14:03] oh my god
[14:03] this is so bad
[14:03] I mean
[14:03] it's just literally killing snapd at _arbitrary_ point due to snap set
[14:03] well
[14:03] first thing is first
[14:03] let's fix the test not to break
[14:03] then let's figure out what needs to be done
[14:04] it's worse, it's killing it while in configure hook
[14:05] the test cannot be fixed really afaiu, snapd simply gets killed as snap set is being executed
[14:06] pstolowski: https://bugs.freedesktop.org/show_bug.cgi?id=84923#c9
[14:06] this is the killer
[14:06] pstolowski: essentially on core16 we cannot have this feature
[14:06] pstolowski: OR
[14:06] pstolowski: we need to architect it so that snapd saves state and organizes a controlled "semi shutdown", anticipating SIGPIPE
[14:07] pstolowski: we should escalate this
[14:09] yes
[14:10] i can prepare a PR that disables the feature & test for now (but not reverts it, just disables)
[14:10] and then we can think
[14:14] ah, we can disabled it just on core16
[14:14] *disable
[14:15] zyga: thanks for finding these bug reports!
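The failure mode discussed above (a process whose stdout is a pipe gets SIGPIPE once the reading side goes away) can be reproduced in isolation. The sketch below is not snapd or journald code; it is a small POSIX-only demonstration with a hypothetical helper `run_child`, where closing the read end of a pipe stands in for journald dropping its end of the log stream.

```python
import os
import signal
import subprocess
import sys

# Child program: writes one "log line" to stdout with SIGPIPE either at its
# default disposition (terminate the process) or ignored (write fails with EPIPE).
CHILD = """
import os, signal, sys
if sys.argv[1] == "default":
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
else:
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)
try:
    os.write(1, b"log line\\n")
except BrokenPipeError:
    sys.exit(3)  # survived the signal, but the log line is lost
"""

def run_child(mode):
    """Run the child with stdout connected to a pipe that has no reader."""
    read_end, write_end = os.pipe()
    os.close(read_end)  # the reader disappears before the child writes
    try:
        proc = subprocess.run([sys.executable, "-c", CHILD, mode], stdout=write_end)
    finally:
        os.close(write_end)
    return proc.returncode

if __name__ == "__main__":
    print("default disposition:", run_child("default"))
    print("SIGPIPE ignored:", run_child("ignore"))
```

With the default disposition the child is terminated by the signal, which matches the `code=killed, signal=PIPE` status quoted at 13:58. Ignoring SIGPIPE keeps the process alive but, as zyga wonders at 14:02, the write still fails, so the output goes nowhere.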
[14:17] pstolowski: https://github.com/snapcore/snapd/pull/8594
[14:17] PR #8594: tests: work around journald bug in core16
[14:17] pstolowski: I think we should consider disabling this *feature* on core16
[14:17] or consider a backport of journald fix - if feasible - from foundations
[14:18] PR snapd#8594 opened: tests: work around journald bug in core16
[14:21] zyga: given that core16 will eventually *cough* go away...
[14:21] pstolowski: in 2016 ;)
[14:22] zyga: thank you for the PR, looks great, i wonder if we shouldn't make retry conditional on core16 though, so we don't hide potential issue on other cores
[14:22] yeah, I thought about that for a sec
[14:22] it's testing now, if it passes I'll send a tweak
[14:22] sounds good
[14:25] pstolowski: ohh
[14:25] https://www.theverge.com/2020/5/4/21245940/macbook-pro-13-inch-apple-new-magic-keyboard-price-release-date
[14:30] ijohnson, pstolowski, cmatsuoka: please review https://github.com/snapcore/snapd/pull/8594 - mvo can merge this later today if it's +2
[14:30] PR #8594: tests: work around journald bug in core16
[14:31] zyga: looking
[14:31] zyga: ack, checking it
[14:31] to clarify, retry-tool there just gives us more chances
[14:31] snapd still restarts
[14:31] but snap the client, may get a chance to talk to it after reconnection more often
[14:31] and may eventually work
[14:31] that's all
[14:32] this test fails in master randomly but often enough to be annoying
[14:32] yeah, my only remark is to make retry conditional on core16
[14:39] zyga: reading backlog, IMHO I think this needs to be escalated to foundations for a core16 backport
[14:39] zyga: this feature was originally slated for a uc16 customer IIRC
[14:39] so we can't just not have this feature on uc16 I think
[14:41] also I agree with pstolowski I think we should only do this on uc16
[14:45] I agree on both
[14:46] Afk because back pain
[14:59] PR snapcraft#3104 opened: packaging: use git-based versioning for python package
[15:03] * cachio lunch
[15:07] I made the modifications and will push after round of local testing
[15:29] pushed now
[15:31] pstolowski, ijohnson: the relevant systemd commit is 13790add4bf648fed816361794d8277a75253410
[15:31] I will look if we can have a backport
[15:32] thanks for digging into that zyga, a LP bug is probably a good place to co-ordinate with foundations?
[15:32] yeah, I will file one today
[15:32] cool
[15:32] zyga: \o/
[15:36] ugggh
[15:36] ijohnson: it's hopeless
[15:36] we are on 219
[15:36] we need at least 236
[15:36] damn
[15:36] the infrastructure to support it is substantial
[15:37] journald passes a bunch of FDs to save to systemd
[15:37] restartes
[15:37] *restarts
[15:37] and gets them back
[15:37] there's no other way
[15:37] so I guess ... SOL
[15:37] I need to rest again, my back is killing me today :/
[15:37] if I forget, please escalate this tomorrow to mvo
[15:37] of course
[15:38] I will put some thoughts into the SU doc so we don't forget
[15:38] thank you!
[15:46] pstolowski: so when we go to change the persistence of journald, can we change a config file and then just reboot the whole system ?
[15:46] it wouldn't be great that we have to restart the system just to change logging but it's certainly better than having things all randomly die
[15:47] * ijohnson goes to read the implementation in snapd
[15:47] ijohnson: reboot is not needed, it's a problem only with core16
[15:48] pstolowski: no I mean for core16 could we reboot?
[15:48] ijohnson: ah, i see. well, yes we could i suppose
[15:48] pstolowski: because afaict the issue is that when we reload journald it kills everything which is bad, so maybe instead of reloading journald we just change some config file behind journald's back and then reboot the system so upon boot it would be fixed
[15:49] I presume that journald is not inotify'ing the config files, etc.
[15:49] ijohnson: heh, there is no config file
[15:49] pstolowski: so there's just the dir that's created in /var/log/journal?
[15:49] ijohnson: you just create /var/log/journal
[15:49] ijohnson: and reboot
[15:50] pstolowski: hmm maybe worth a try
[15:50] ijohnson: normally you send sigusr1 to notify journald about change
[15:51] right, so on uc16, instead of sending sigusr1, we would just request a restart
[15:52] ijohnson: yes
[15:52] pstolowski: would we need some kind of special code to make the change/tasks associated with this work, or could I just try putting in handleJournalConfiguration a call to restart snapd ?
[15:53] hmm although that's probably not the level we want to restart things
[15:53] err I mean reboot the system
[15:54] ijohnson: that's a fair point
[15:55] alright well I'll just leave this idea in the SU docs for folks to think about, I have a bit too much on my plate to dig more deeply on this
[16:08] pstolowski: did you see I assigned you a bug on Friday related to hotplug and auto-connect?
[16:08] https://bugs.launchpad.net/snapd/+bug/1876356
[16:08] Bug #1876356: greedy auto-connect doesn't work with hotplug
[16:08] just curious if you have any thoughts on why that doesn't "just work"
[16:08] ijohnson: ah, no, i didn't
[16:08] no worries, I think it's a lowish priority thing
[16:10] ijohnson: thanks for all the details in the report, i'll see if there is anything obvious
[16:10] missing
[16:20] Bug #1876478 changed: Automatic snap refreshes changes state of service from stopped to running
[16:41] PR snapcraft#3105 opened: build provider: clean incompatible build-environments
[17:11] it seems the portal test is leaking stuff
[17:11] | |-/run/user/12345 tmpfs tmpfs rw,nosuid,nodev,relatime,size=377196k,mode=700,uid=12345,gid=12345 shared
[17:11] | | `-/run/user/12345/doc /dev/fuse fuse rw,nosuid,nodev,relatime,user_id=12345,group_id=12345 shared
[17:11] oh well
[17:11] another test to fix
[17:14] zyga: could we just do `[ -d $XDG_RUNTIME_DIR/doc ] && umount $XDG_RUNTIME_DIR/doc` ?
[17:14] at the end of teardown_portals ?
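A leak like the one pasted at 17:11 (a fuse mount left under /run/user/12345) could be detected by scanning the mount table for anything at or below the user's runtime directory. The following is a sketch only: `mounts_under` is a hypothetical helper, the sample text is made up to resemble the two leaked mounts in the log, and the parsing assumes the `/proc/self/mountinfo` layout where the fifth whitespace-separated field is the mount point (paths containing escaped spaces are not handled).

```python
def mounts_under(mountinfo_text, prefix):
    """List mount points at or below prefix, deepest first.

    Deepest-first ordering is the order in which they would need to be
    unmounted.
    """
    prefix = prefix.rstrip("/")
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        mount_point = fields[4]
        if mount_point == prefix or mount_point.startswith(prefix + "/"):
            found.append(mount_point)
    return sorted(found, key=lambda path: path.count("/"), reverse=True)

# Hypothetical mountinfo excerpt modelled on the mounts shown in the log.
SAMPLE = """\
25 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
100 25 0:50 / /run/user/12345 rw,nosuid,nodev,relatime shared:60 - tmpfs tmpfs rw,size=377196k,mode=700,uid=12345,gid=12345
101 100 0:51 / /run/user/12345/doc rw,nosuid,nodev,relatime shared:61 - fuse /dev/fuse rw,user_id=12345,group_id=12345
"""

print(mounts_under(SAMPLE, "/run/user/12345"))
```

On a live system the text would be read from `/proc/self/mountinfo`; a non-empty result after a test's restore step would point at the kind of leftover that made `rm -rf /run/user/12345` fail earlier in the day.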
[17:16] ijohnson: perhaps but I doubt that is the problem
[17:16] the problem is more comple
[17:16] *complex
[17:16] we *activate* portals each time we invoke a program that seems to have a desktop plug (at all)
[17:16] so any test using snap run may trigger it
[17:16] Hmm really?
[17:16] yeah
[17:17] I suspect we run a program via su -u test
[17:17] and that runs the portal outside of the session
[17:17] so just happily in the space
[17:17] I'll add a small check and run all the tests
[17:17] what we really need is proper session teardown
[17:18] we may need to change session-tool --restore to wait for linger shutdown as well, I need to check
[17:18] but first, need to find that test that leaks this
[17:18] saving the log archive
[17:21] we ran this before: 2020-05-04T16:33:35.0630508Z 2020-05-04 16:33:35 Preparing google:ubuntu-19.10-64:tests/main/interfaces-desktop-document-portal (may041608-600544)...
[17:24] https://www.irccloud.com/pastebin/y3YiWMAY/
[17:24] it's one of those tests for sure :)
[17:25] ijohnson: oh my
[17:25] that test *badly* needs session-tool
[17:27] why is that test using snap-try?
[17:28] ijohnson: session-tool beats su not because of any technical choices but simply because it puts the damn username before the damn command
[17:28] Hahaha
[17:29] Yeah I will look into this some more after lunch if you haven't figured it out
[17:29] * ijohnson -> lunch
[17:32] done
[17:32] ijohnson: I'll send a PR shortly
[17:36] ijohnson: I wonder if I should grep for "su" in the code
[17:36] you might be afraid of what you find
[17:37] PR snapd#8595 opened: o/ifacestate/handlers.go: fix typo
[17:39] PR snapd#8589 closed: tests: port user-session-env to session-tool
[17:39] PR snapd#8594 closed: tests: work around journald bug in core16
[17:39] wooooot
[17:40] PR snapd#8581 closed: tests: port pulseaudio test to session-tool
[17:40] PR snapd#8587 closed: tests: session-tool allows preparing/restoring for many users
[17:47] yay mvo for the win
[17:55] interesting stuff I find
[17:55] I wonder if 19.10 specific
[17:55] maybe some fluke of bugs interacting
[17:56] so on 19.10 session-tool -u test --prepare
[17:56] seems to ... not make a session
[17:56] that's the only reason snaps work at all sometimes is that bugs interact and cancel each other out
[17:56] ??!
[17:56] hahaha
[17:56] so session-tool has become the-tool
[17:56] isn't that like half of current physics theories ;)
[17:56] on 19.10 at least
[17:56] all the infinities cancel each other ;)
[17:56] yep!
[18:01] ijohnson: totally off-topic, over weekend I got an i2c DS 1307 RTC add-on to my raspberry pi
[18:01] maybe next weekend I'll have a snap that manages time
[18:01] very nice
[18:01] though not perfectly (late boot)
[18:01] I wonder if there would be a way to run some super-special things earlier via snapd
[18:01] I have an RTC board too ... somewhere in my collection of random add-on things
[18:01] (with appropriate interfaces)
[18:01] cool
[18:02] true, it would be most useful to have a RTC available during early boot
[18:02] I wonder if it uses the same chip, I heard DS 1307 is popular
[18:02] ijohnson: sadly it's not a /dev/rtc kind of thing
[18:02] maybe it could be if device tree said something
[18:02] yeah most rpi things seem to not be quite as simple plug and play like that
[18:02] but I don't know how to do that
[18:02] yeah I haven't looked into that at all either
[18:03] https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/rtc/rtc-ds1307.txt <- it seems to be supported
[18:06] nice so I bet with your own pi gadget it would probably just work
[18:07] I don't know how to use DTBs though :)
[18:07] more to learn
[18:36] digging deeper
[18:36] but I will EOD soon
[18:37] really not well today
[18:37] and it's late
[18:38] PR snapd#8595 closed: o/ifacestate/handlers.go: fix typo
[18:48] drat
[18:48] my daughter turned off my laptop
[19:21] I see a pattern now, I wonder what's the cause
[19:21] but I also have a solution
[19:22] just cannot justify it apart from "it seems to work"
[19:22] I guess some systemd sources are to be read
[19:38] PR snapcraft#3106 opened: sources: enable git, local, and tar handlers for all platforms
[19:39] Bug #1875543 changed: Ubuntu 20.04 "A stop job is running for Snappy Daemon" during shutdown <20.04>
[19:42] Bug #1876478 opened: snap stop does not explain that automatic snap refreshes starts stopped, enabled svcs
[19:46] ijohnson: around? :)
[19:46] yep
[19:47] ijohnson: so here's what I learned: on 19.10 loginctl disable-linger leaves user@12345.service in user-12345.slice
[19:47] why? I don't know yet
[19:47] stopping the slice reliably stops everything and /run/user/12345 goes away
[19:48] on 20.04 I don't see it
[19:48] huh
[19:48] I wonder if there's an interplay with some workarounds
[19:48] that feels like a bug to me
[19:48] so I patched session-tool to add the stops and the extra check for /run/user/12345 (for the test user obviously)
[19:48] in systemd that is
[19:48] wait but shouldn't the systemd version be basically the same between 19.10 and 20.04 ?
[19:49] and I'm running a pass to see how it affects the "fleet" of systems running user-mounts test
[19:49] ijohnson: 242 vs 245
[19:49] ijohnson: we have a logind workaround for 242 (fixed in ... wait for it ... 243!)
[19:49] ijohnson: so maybe more bugs bugs bugs
[19:49] ijohnson: I've yet to look at logind source but that's my next plan
[19:50] I kind of want core 20 now
[19:50] at least so many systemd bugs are gone there
[19:50] haha
[19:51] ijohnson: with the change user-mounts with trivial restore (session-tool -u test --restore) leaves no junk behind
[19:51] and I bet this improves interplay with other tests
[19:51] but why? :D
[19:51] beats me
[19:51] bugs that's why
[19:52] it's super late and my wife looks at me with that very clear "STOP WORKING" eyesight
[19:52] so I'll defer that
[19:52] ttyl zyga :-)
[19:52] but I'll send the patch for user-mounts in today still
[19:52] now supper :)
[20:44] ijohnson: session-tool patch https://github.com/snapcore/snapd/pull/8596
[20:44] PR #8596: tests: session-tool --restore -u stops user-$UID.slice
[20:44] PR snapd#8596 opened: tests: session-tool --restore -u stops user-$UID.slice
[20:44] Nice
[20:45] it's not perfect because "Unknown reasons" but it's progress
[20:46] I also suspect this could have some impact on other tests, as they were effectively leaking more bits
[20:47] plus this is a PR for master after mvo landed all the good fixes
[20:47] so with a bit of hope it won't fail
[20:52] ijohnson: and this is the user-mounts test
[20:52] https://github.com/snapcore/snapd/pull/8597
[20:52] PR #8597: tests: port user-mounts test to session-tool
[20:52] and some typo fixes
[20:53] PR snapd#8597 opened: tests: port user-mounts test to session-tool
[20:53] I'll break for an hour and walk the dog
[21:05] zyga approved the session-tool one and commented on the other one
[21:05] * ijohnson EODs
[22:20] PR snapcraft#3106 closed: sources: enable git, local, and tar handlers for all platforms
=== diddledan7 is now known as diddledan