[00:09] n 20.04 many services (in particular kubelet) are packaged as snaps [00:09] Does anyone know how to correctly interact with services running as a snap? There doesn't seem to be a systemd unit [00:10] for example 'systemctl status kubelet' (or docker or whatever) fails with a not found error [01:29] dstathis: systemd units owned by snaps will have names like "snap.$snapname.$app.service". You should see them in the "systemctl list-units" output. [03:05] I've installed a few Snaps on Fedora 32, and those with tray icons have garbled menus... Anything obvious I'm missing? [03:09] For font issue, libpango can be a reason [03:09] or libfreetype [05:15] PR snapd#8301 closed: interfaces/many: deny arbitrary desktop files and misc from /usr/share [05:15] PR snapd#9183 closed: tests: use "set -ex" in prep-snapd-in-lxd.sh [05:54] morning [05:57] good morning mborzecki [05:57] mvo: hey [06:38] good morning [06:38] I will be around shortly [06:38] just need to send some paperwork for the insurance [06:41] mvo if you can, please merge https://github.com/snapcore/snapd/pull/9175 [06:41] PR #9175: tests: find -ignore_readdir_race when scanning cgroups [06:43] zyga: hey [06:45] hi [06:48] zyga: looking [06:49] just random test failure fix [06:51] zyga: thanks, what did you change in the force push, unfortunately GH does not show me a diff for that :/ [06:51] mvo I moved the -ignore_readdir_race after the path [06:51] originally it was the first argument but old find didn't like that [06:52] compare https://github.com/snapcore/snapd/commit/cd5b94776066bbe76359e912c960c9d4258abc9c and https://github.com/snapcore/snapd/commit/8c10ddfc32cf8f909ba73bfcfe691f174917c2e4 [06:53] zyga: ta [06:55] PR snapd#9175 closed: tests: find -ignore_readdir_race when scanning cgroups [06:56] thank you [06:56] uh, it's a meeting day [06:56] just meetings and meetings [07:03] morning! [07:05] pstolowski: hey [07:06] hey mborzecki, welcome back! 
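A note on the 01:29 answer about snap-owned systemd units: the naming pattern given there, "snap.$snapname.$app.service", can be sketched in Python as below. This is only an illustration of the pattern as quoted in the chat. The helper names and the example snap and app names are invented for this note and are not taken from snapd.

```python
"""Sketch of the snap.$snapname.$app.service naming pattern (helper names are hypothetical)."""


def snap_unit_name(snap_name, app_name):
    """Build the systemd service unit name for one app of a snap."""
    return f"snap.{snap_name}.{app_name}.service"


def parse_snap_unit(unit_name):
    """Split a unit name of that shape into (snap, app); return None otherwise."""
    parts = unit_name.split(".")
    if len(parts) != 4 or parts[0] != "snap" or parts[3] != "service":
        return None
    if not parts[1] or not parts[2]:
        return None
    return parts[1], parts[2]


if __name__ == "__main__":
    # "example-snap" and "daemon" are placeholder names, not real snap or app names.
    print(snap_unit_name("example-snap", "daemon"))
```

The resulting name is what would be passed to "systemctl status" in place of the bare service name; the real snap and app names have to be read from the "systemctl list-units" output, as suggested at 01:29.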
[07:14] good morning pstolowski [07:43] fwiw, I'm working on the lxd test failures currently [07:43] mvo thank you [07:44] yw - it's very painful as each iteration takes forever [07:44] but I hope I have a handle on it now (but I thought this the two previous runs too :( [07:44] mvo 300+ MB download is not free [07:44] mvo: thank you, i'm trying something as well on top of your existing PR but i'm not clear what root problem is and yes it is super slow [07:45] pstolowski: http://paste.ubuntu.com/p/p6pFKnRxrb/ is my best attempt so far, test is running [07:46] * mvo also wonders why wait_for_service seems to be not using the retr-tool [07:59] mvo: I noticed the desktop code review meeting isn't on my calendar for this week. Did you cancel it, or is that a glitch? [08:00] jamesh: I canceled it because we lack people but if you want to do it I can be available for you [08:00] jamesh: you also should have goten a mail about this by the calendar :/ [08:01] mvo: I don't see any email about the cancellation, which is why I asked. That's fine. [08:02] mvo: https://github.com/snapcore/snapd/pull/9150 can land too [08:02] PR #9150: gadget,kernel: add new kernel.{Info,Asset} struct and helpers (1/N) [08:02] mup: and https://github.com/snapcore/snapd/pull/9156 [08:02] mborzecki: In-com-pre-hen-si-ble-ness. [08:03] mvo: and https://github.com/snapcore/snapd/pull/9156 [08:03] PR #9156: boot: copy boot assets cache to new root [08:06] mvo: I think https://github.com/snapcore/snapd/pull/9168 is good to merge, but is failing on ubuntu-20.04-64 for some tests/main/lxd:snapd_cgroup_* tests [08:06] PR #9168: o/hookstate/ctlcmd: make is-connected check whether the plug or slot exists [08:07] jamesh: yes these tests have been investigated since yesterday [08:07] pstolowski: is it best just to wait til they get fixed then? [08:08] jamesh: if the failure is just on lxd I can force-merge [08:09] jamesh: merged [08:09] mvo: cheers! 
[08:09] mborzecki: landed 9150 now too [08:10] mborzecki: and 9156 [08:10] mvo: thanks! [08:10] PR snapd#9150 closed: gadget,kernel: add new kernel.{Info,Asset} struct and helpers (1/N) [08:10] PR snapd#9156 closed: boot: copy boot assets cache to new root [08:10] PR snapd#9168 closed: o/hookstate/ctlcmd: make is-connected check whether the plug or slot exists [08:11] fwiw, 9184 passed locally now for me on 20.04, let's see if it survies a full spread run [08:11] * mvo takes a short break while waiting for this [08:18] nice work [09:02] \o/ [09:23] so looks like 20.04 is now working but 16.04 is still failing :( *oh well* but with a very different error [09:28] mvo: what does 16.04 say? [09:28] zyga-x240: "Failed to execute operation: Connection timed out" [09:29] zyga-x240: in a meeting right now, I can paste the full error late but it's also in the CI i think [09:30] hmmm [09:41] ijohnson: https://github.com/snapcore/snapd/pull/9187#discussion_r473820780 [09:41] PR #9187: tests/lib/nested.sh: use more robust code for finding what loop dev we mounted <⚠ Critical> [09:52] hmmm [09:55] zyga-x240: hmm trouble getting self-hosted runners? https://github.com/snapcore/snapd/runs/1006992307 [09:57] hmm checking [09:57] the host is up [09:57] workers are up [09:58] I think I know what's going on [09:59] it seems that cla checks are hitting a time-out to grab a worker [09:59] I don't recall seeing that before, we have queueing for a reason after all [09:59] I restarted that and it passed instantly [09:59] so... no idea/ [09:59] hahah [10:00] well, maybe it's a one off occurrence [10:05] I hope so but I fear we will learn in time [10:05] time for coffee, I'm falling asleep here [10:06] maybe pressure is low or something [10:06] mvo: I looked and it looks like systemd is not responding, maybe the socket is gone somehow? [10:06] but no idea why [10:14] zyga-x240: so the nested lxd shows me a gazillion "permission denied", e.g. 
/bin/mount for / exited 1, mount: permission denied etc [10:14] on 16.04 [10:14] mvo: seems like lxd apparmor [10:15] mount is really disallowed [10:15] only bind may be allowed [10:15] yeah, strange that it's inside the nested though [10:15] maybe nesting is broken somehow [10:15] I wonder if we can stop doing something and get nested working without snapd [10:15] and then see what part breaks it [10:16] zyga-x240: https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1889318 is it because when run by lxc there's no apparmor namespace setup like lxd does? [10:16] Bug #1889318: install chromium in lxc container for 20.04 fails

[10:16] zyga-x240: it fails even before snapd it seems [10:16] zyga-x240: hm, well, maybe or maybe not [10:16] mborzecki: looking [10:17] I ... don't know [10:17] hm, "interesting" - it fails in systemctl daemon-reload but everything else seems to work [10:17] well [10:17] when systemd is not responding [10:17] it's not really a place where things work [10:18] the funny thing is - systemctl restart snapd works [10:18] systemctl --list works [10:18] afaict all the things I tried work [10:18] hmm [10:18] just not the daemon-reload [10:18] it's a bit strange [10:18] what does daemon-reload say? [10:19] I mean, there is journal [10:19] and I think this is broken since forever [10:19] we never saw it because that script did not have set -e [10:19] ohhhh [10:19] that's interesting [10:19] so it would fail on stable releases as well [10:19] mvo: do you have 5 minutes for a quick call? [10:24] mvo: hi! thanks for committing PR 8301! have you decided to merge master into release/2.46? if you aren't, I need to prepare a PR for 8301 (that would include 9167) for 2.46 (which is fine, just need to know that I should do it :) [10:24] PR #8301: interfaces/many: deny arbitrary desktop files and misc from /usr/share [10:26] mvo: all that is left for new PRs that small k8s-support one that I need to investigate. then I'll be doing 'needs security review' reviews [10:29] jdstrand: thank you [10:34] mvo: (also, PR 8920 needs final reviews) [10:34] PR #8920: interfaces: update cups-control and add cups for providing snaps [10:36] mvo: do note that I intentionally did not milestone PR 9186 for 2.46. that needs discussion. if there happened to be a 2.46 point release after that is merged, we could consider it, but it floating into 2.47 would be fine too [10:36] PR #9186: interfaces: add vcio interface [10:38] ok [10:42] mvo: I'm looking at https://github.com/snapcore/snapd/pulls?q=is%3Aopen+is%3Apr+label%3A%22Needs+security+review%22. nothing is milestoned for 2.46. 
is there anything in there that should be that you would like me to prioritize? [10:43] if not, I have a good idea of the priority [10:49] I keep getting vendor.json changes [10:49] I purged cache [10:49] purged the tree [10:49] it keeps changing [10:52] zyga: me too! I'd love to see that resolved. (my dev container is on bionic still. wonder if it is a focal vs bionic thing) [10:53] mvo: why is #9171 blocked? [10:53] PR #9171: [RFC] config: "virtual" configuration for timezone <⛔ Blocked> [10:54] zyga: jdstrand: perhapos a different version of govendor was used when vendor.json was last updated in the tree [10:54] hmm [10:54] the difference is a checksum only [10:54] (at least here) [10:54] perhaps but what's the version and who has it is interesting [10:55] i have 1.0.9 [10:56] I have 1.0.9 in /usr/bin and 1.0.8 in GOPATH [10:56] btw. i guess we all should be using the latest master govendor, since we're go getting it in run-checks [10:56] and get-deps uses that [11:01] mborzecki: totally offtopic: https://github.com/snapcore/snapd/pull/9189 [11:01] PR #9189: snap/snapenv: set SNAP_REAL_HOME [11:01] PR snapd#9189 opened: snap/snapenv: set SNAP_REAL_HOME [11:02] mborzecki: running 1.0.9 modifies my vendor.json [11:04] zyga: there's a couple of recent commits to vendor.json, mine was on 3.08 (and i'm using 1.0.9), the later commits were by claudio, samuele and jamesh [11:04] hmmmm [11:05] the change comes from 7a9cb154a0c (Claudio Matsuoka 2020-08-13 09:08:15 -0300 115) "revision": "68200eea7bdcb97e27fe8e5ff443776383908637", [11:05] so maybe claudio has older version [11:06] let's ping him when he's online [11:06] yeah [11:10] pstolowski: I think 9171 really needs samuele approval, I think it's okay otherwise, if the design gets approval I would like to tweak it a bit more with some helpers [11:22] mvo: i made a few comments, not sure what was already discussed and agreed, so maybe my comments make no sense [11:22] pstolowski: cool, happy for any feedback at this 
point [11:24] pstolowski: hm, great point about snap get -d [11:24] pstolowski: I think you are right, we should probably not bypass this for that [11:24] * mvo scratches head [11:25] mvo: yes i think we are breaking some high level assumptions here. and i fear it's going to be annoying to handle :/ [11:25] pstolowski: yeah, this needs more discussion for sure [11:26] pstolowski: your comment about the transactions is also interesting, maybe we need to hook into commit() here instead [11:27] lxd tests passed locally in 9184 [11:27] * mvo vanishes for lunch [11:27] enjoy [11:27] mvo: what if we do store in state but synchronize config state with system setup? or is it too terrible? [11:29] * pstolowski lunch as well [11:29] pstolowski: who wins? [11:29] pstolowski: if system was modified when snapd was down? [11:30] zyga: system always wins. we update system on snap set. [11:30] but maybe it's naive [11:30] just throwing ideas [11:31] pstolowski: so when would we use the value from state? [11:33] zyga: we would always update state from system before reading. that could simplify transaction logic without special-casing. 
but just brainstorming at this point [11:33] anyway, time for lunch [11:33] pstolowski: I see [11:34] I don't know either, just trying to understand your point better [11:41] zyga, hi [11:42] zyga, do you have any idea about what could be causing that when I test beta image and do "systemctl --user daemon-reload" as root I get "Failed to connect to bus: No such file or directory [11:42] " [11:42] If I do that as ubuntu user I dont see any error, but as root I see that error [11:42] and it makes fail the snapd-failover test [11:43] cachio: hi [11:43] cachio: yes, I explained that a few days ago [11:43] cachio: when we reload systemd-logind.service on older versions of systemd [11:43] cachio: we cause it to forget about the session of the root user [11:43] cachio: then subsequent test that prepares a session for the root user [11:44] cachio: enables linger, which starts services and so on [11:44] cachio: but then the restore disables linger [11:44] cachio: because systemd logind is no longer tracking the incoming ssh session of the root user [11:44] cachio: it shuts down systemd --user for root [11:46] zyga, do you know which is the difference between the test we run in github and the one I run in beta validation to explain why it works in github and fails with the beta image? [11:46] cachio: I don't know what version beta was forked form [11:47] cachio: please check if it contains changes to prepare-restore.sh [11:47] talking exactly about this issue in the commit message [11:51] zyga, I see the change [11:51] I need to see how to make it for external backend [11:51] thanks for the explanation [11:54] cachio: the fix I made is generic [11:54] cachio: if you have it and the bug persists then there's something more broken [11:54] zyga, yes [11:54] but in case of external backend we exit before that code [11:54] cachio: output from loginctl list-sessions would help [12:00] cachio: when do we exit then? 
[12:00] zyga, most of the prepare_project is not done for external backend [12:01] I am trying to move the linger configuration to see if it works [12:02] cachio: I see [12:02] cachio: yeah, you need to apply the fixes to systemd-logind [12:02] cachio: those are one-off [12:02] cachio: do you perform package upgrades? what's the target? [12:03] is that core16? [12:03] yes [12:04] core16 needs a special workaround [12:04] in essence, I repackage core [12:04] though recently ijohnson applied a fix to core so maybe that is no longer required [12:05] following that /var/lib/systemd/linger is bind-mounted to writable using a mount unit [12:05] look at the patches I apply to replicate that [12:05] I wasn't aware the external target skips all of that [12:05] zyga, no problem, I'll try to extend that to external [12:05] you may only need the 1) mount unit 2) change to logind configuration followed up by REBOOT [12:37] zyga: the core fix has not landed yet, PR is still open [12:37] also morning folks [12:37] hey mborzecki welcome back [12:45] pstolowski, hey, any idea about this error https://paste.ubuntu.com/p/SRTGRMHx45/ [12:45] it is happening in Core20-early-config test [12:46] I see this gadget.yaml parse error: Duplicate key: system [12:46] not sure if it could be the reason [12:48] looking [12:49] ijohnson: I see [12:49] * zyga was having pierogis for lunch [12:49] cachio: that was what I fixed in my PR [12:49] cachio: yeah, definitely [12:49] cachio: there's a duplicate key :) [12:49] cachio: that is fixed by #9187 [12:49] PR #9187: tests/lib/nested.sh: use more robust code for finding what loop dev we mounted <⚠ Critical> [12:50] right, i was just going to say that :), thanks ijohnson [12:50] ijohnson, ah, thanks!!! [12:58] ijohnson: should I force merge 9187? 
there are still failures in nested [12:59] mvo, the errors in nested are fixed in other pr [12:59] mvo: let me look quickly [12:59] in mine [12:59] mvo: the tests are still failing [12:59] also SU time now [12:59] pr: #9098 [12:59] PR #9098: tests: new organization for nested tests [13:00] mvo: sorry I meant the tests still seem to be running? [13:00] but mine fails sometimes because the error which is fixed on 9187 [13:00] ijohnson, I retriggered the tests [13:16] PR snapd#9081 closed: secboot,cmd/snap-bootstrap: cross-check partitions before unlocking, mounting [13:37] oh, and I'm making progress on unit testing bash, https://listed.zygoon.pl/ has the details [13:40] zyga-mbp, so I see we do https://github.com/snapcore/snapd/blob/master/tests/lib/prepare.sh#L662 [13:40] right [13:40] we need that for external as well? [13:40] you want that and the configuration change immediately below that [13:40] yes [13:41] ok, that plus the change https://github.com/snapcore/snapd/blob/master/tests/lib/prepare-restore.sh#L446 [13:41] cachio note that this is the static part, the dynamic part is where we decide systemd-logind needs reloading and REBOOT [13:42] yes [13:42] perfect [13:42] we try to enable linger for the test user, if that fails we know we need to restart [13:42] note that this does essentially the same change (configuration file edit) so make sure to just reboot in that case as the config file is already in place, just inactive [13:43] ok [13:43] I hope this works :) [13:43] cachio I left a small comment on v [13:43] https://github.com/snapcore/snapd/pull/8986/files [13:43] PR #8986: tests: new snaps-state command - part1 [14:26] cachio: any luck? 
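The linger setup discussed above (12:05 and 13:40 to 13:42) depends on /var/lib/systemd/linger being a bind mount onto writable. One rough way to check that precondition before trying "loginctl enable-linger" is to look the path up in /proc/self/mountinfo. The sketch below is Python, the function names are invented for this note, and it is not code from the snapd test suite.

```python
"""Sketch: is a directory listed as a mount point in /proc/self/mountinfo?"""
import os


def mount_points(mountinfo_text):
    """Return the set of mount points named in mountinfo-formatted text."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # The fifth field of each mountinfo line is the mount point.
        # Paths containing spaces are octal-escaped there; this sketch
        # does not decode them.
        if len(fields) >= 5:
            points.add(fields[4])
    return points


def is_mount_point(path, mountinfo_text):
    """True if path (ignoring a trailing slash) appears as a mount point."""
    normalized = path.rstrip("/") or "/"
    return normalized in mount_points(mountinfo_text)


if __name__ == "__main__":
    mountinfo = "/proc/self/mountinfo"
    if os.path.exists(mountinfo):
        with open(mountinfo) as f:
            print(is_mount_point("/var/lib/systemd/linger", f.read()))
```

Reading mountinfo rather than comparing device numbers is a deliberate choice here: a bind mount whose source lives on the same filesystem as its parent directory would otherwise go unnoticed.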
[14:26] zyga, not yet, trying in 5 minutes [14:26] ok [14:26] still making some changes [14:42] PR snapd#9188 closed: interfaces: misc policy updates xlvi [14:46] mvo: 9184 is super super green :-) [14:47] PR snapd#9190 opened: [RFC] cmd/s-b/initramfs-mounts: make recover -> run mode transition automatic [14:48] zyga-mbp, Could not enable linger: No such file or directory [14:48] I see that [14:49] cachio, ok do you have a shell [14:49] when I do loginctl enable-linger test [14:49] yes [14:49] is the logind.conf file modified? [14:49] yes [14:49] StateDirectory=systemd/linger [14:49] is /var/lib/systemd/linger a mount point to the corresponding directory in writable? [14:49] with this config [14:49] I created that in writable [14:50] that's not all, is a mount in place? [14:50] I can't write to /var/lib/systemd/linger [14:50] is there a mount unit that makes /var/lib/systemd/linger a bind mount to /writable/system-data/.... [14:50] it is protected [14:51] you need the linger directory to exist [14:51] and it must be a mount point as I've explained [14:51] no way around that [14:51] THEN you can reboot to reconfigure logind [14:51] (via REBOOT) [14:51] I can't create /var/lib/systemd/linger [14:51] and then it will work [14:51] cachio so merge the core change and rebuild the snap [14:52] which is the change to merge? [14:52] ian proposed a PR for core [14:53] zyga-mbp, ok, so no workaround until the change in core is merged [14:53] correct [14:53] unless you can repackage core [14:54] zyga-mbp, agree but on beta validation we don't repack core [14:54] so I'll push the change done by ian [14:55] ijohnson, is this the change? https://github.com/snapcore/core/pull/116 [14:55] PR core#116: extra-files/writable-paths: make all /var/lib/systemd writable [14:55] yes [14:56] waiting for a second review? [14:56] ijohnson, [14:56] ijohnson what happens when we remove entries? 
[14:57] cachio: I can actually merge it, not sure if I should wait though [14:57] zyga-mbp: what do you mean ? [14:57] ijohnson this looks good but may need follow ups for hacks in prepare-restore [14:57] ijohnson the removal of /var/lib/systemd/rfkill and random-seed lines [14:57] hmm [14:58] I'd +1 without those [14:58]