[02:48] PR snapd#9514 opened: Enforce the confined fwupd to align Ubuntu Core ESP layout
=== jamesh_ is now known as jamesh
[06:11] morning
[06:31] zyga: mvo: hey
[06:32] good morning mbo
[06:32] good morning mborzecki
[06:36] zyga: so I have a couple small fixes for PR #9509 but I noticed you said not to push anything more there - also I am pretty sure all the commits already use my @canonical.com email address... so I just wanted to see where things were at from your point-of-view
[06:36] PR #9509: multipass/docker-support: Mount /etc/apparmor* from the base snap <⚠ Critical>
[06:49] o/
[06:49] amurray go ahead and push them
[06:49] amurray I'll discuss one bigger issue with mvo shortly
[06:50] we may need to have a different implementation altogether, we'll see
[06:50] hey mborzecki, mvo
[06:50] sorry for starting late, not feeling good, I think I'm going to be sick this week :/
[06:50] mvo: can we briefly discuss what I tried and found on Friday
[06:50] to keep amurray in the loop?
[06:51] zyga: no worries, I would love to have samuele in this too
[06:52] maybe I can state what I found now while amurray is here
[06:52] and we can discuss more with Samuele later
[06:52] I don't have a solution yet, I think, only ideas
[06:53] amurray, mvo: the problem with the current implementation is that by the time we compute security profiles, the base snap is unlinked and (masked by the absent error handling on readlink) we get a non-sense profile definition
[06:53] amurray, mvo: I tried to fix this by using connected plug/slot version of the interface, after all, that seems natural
[06:53] zyga: +1
[06:53] amurray, mvo: and we know exactly the revision we are after, so that's even better
[06:54] amurray, mvo: but then realized this doesn't work because those interfaces can be provided by _either_ core or snapd, in neither case that is the base snap we are after
[06:54] amurray, mvo: from here on I was exploring ideas, each with their own flaws:
[06:54] amurray, mvo: 1)
leave the symlink in the mount profile and teach snap-update-ns to resolve the current symlink automatically
[06:55] this works on connect
[06:55] it works on disconnect
[06:55] when the base revision changes and we discard and re-generate, it also works just fine
[06:55] it's a bit of special case logic in snap-update-ns
[06:56] 2) have new mount point for the base snap explicitly, like /base and work from there (I like this because we should do /app later as well, this would fix so so so many issues)
[06:56] the /base mount is useful as it is the naked base snap, without anything covered
[06:57] all the ideas on exposing fixed mount points for key snaps are interesting to consider, because IMO they would fix some of the snaps-are-hard-and-require-loads-of-env-variables-to-work-around over time
[06:57] not as many as runtime + /app model in flatpak but I think that would be a step in the right direction
[06:57] 3) have a new variable in snap-update-ns that explicitly talks about base snap name and revision
[06:57] this is like 1) but not "magic" that some symlinks work while others are ignored
[06:58] the problem in 3) is, IIRC, snap-update-ns does not have the right information to provide those other than by reading YAML by itself and readlinking current files
[06:58] it's a variation of 1 that it might be regarded as 1b) perhaps
[06:59] the reason I find 2) interesting is that, over time, we could have non-shared, curated /etc
[06:59] and /etc/apparmor.d -> /base/etc/apparmor.d is nice
[06:59] or perhaps /snap/.base/
[06:59] (we could come up with a way to inject .base there easily)
[06:59] anyway
[07:00] those are my ideas
[07:00] amurray: does this make sense? I'm not sure if I'm making myself comprehensibly clear today
[07:00] mvo: same question
[07:02] morning
[07:02] zyga: it does make sense
[07:02] pstolowski: good morning
[07:03] pstolowski: hey
[07:03] zyga: I like (3) as it seems relatively easy? I like (2) better but that seems more work?
[07:03] zyga: or at least conceptually more change?
[07:03] mvo I agree,
[07:03] zyga: in (3) you say snap-update-ns does not have the information, can we just pass it somehow?
[07:04] mvo: we'd have to teach snap-update-ns about $SNAP_BASE_NAME and $SNAP_BASE_REVISION
[07:04] mvo: I need to check
[07:04] I mean, base is easy
[07:05] base is in the yaml of the snap we are operating on
[07:05] and we should be able to read that
[07:05] I hope
[07:06] * mvo nods
[07:07] my throat hurts today so I really prefer to type rather than talk
[07:07] but we can have a call about this later today
[07:07] I'll start fixing it
[07:11] zyga: sounds good, like I said, I would love to have Samuele's opinion because I'm worried that I may give you the "wrong" direction
[07:13] zyga: so is this only an issue on refresh of the base snap? I prefer 2) as well since either 1 or 3 needs to pass more info to snap-update-ns which I feel is a bit like the wrong place to be doing this - I don't love the idea of expanding variables in snap-update-ns either
[07:14] I think 2, since it involves new FS elements, may take longer to agree on
[07:14] while 1 and 3 we could do "locally" without introducing new elements that need design
[07:15] (the new paths in the filesystem that is)
[07:15] true... what if we did the snap-confine approach, would that also suffer problems on refresh of the base?
perhaps that is a better temporary fix that we can back out later once /base etc is bike-shedded
[07:16] amurray snap-confine based fix has the problem of not being able to handle interfaces at all, which is not too good in my opinion, we also would really like to shrink that list of fixed mount points it does
[07:16] amurray: technically it might work but I think we should not do it
[07:17] zyga: sure but I wonder (given ideally we want this fixed in a few days) whether it is the more pragmatic approach to take
[07:17] I think 1-3 are doable today
[07:17] it's not a big change
[07:19] if 1, how does snap-update-ns recognise that it is a symlink? currently we seem to try very hard to purposefully not resolve symlinks in snap-update-ns (and so fail if a mount point contains them)
[07:20] amurray we could really whitelist /snap/*/current
[07:20] or perhaps /snap/$base_snap_name/current only
[07:22] zyga: for $base_snap_name we would then still need to teach snap-update-ns what the base snap is (or how to find it) - so that is more like option 3) from what I can see.. at the end of the day I am not too fussed what option we go with, I just want to take an approach which is achievable given the very tight deadline we have
[07:23] yes
[07:23] 1-3 only differ in the presentation layer
[07:23] in both cases we need to know base snap name and revision in snap-update-ns
[07:23] * zyga nods
[07:23] hence my preference for snap-confine since it is the most minimal touch approach
[07:24] amurray: but then we are stuck with two apparmor-specific mount points in snap-confine and no clear way to undo that on snapd upgrade
[07:24] that's my main worry
[07:25] if we ever want to stop sharing /etc, handling that is a nightmare
[07:25] zyga: but we have that already for /etc/ssl.conf etc - so we would have to solve that then anyway right?
so adding these apparmor ones doesn't appear to increase the complexity to me
[07:26] amurray: yes, but also less nice IMO
[07:26] amurray: I didn't check what happens in fedora when those directories are gone
[07:27] s-u-n code handles that
[07:27] s-c will fail
[07:27] would require more hacking to get that to work
[07:27] * zyga breaks for breakfast
[07:29] PR snapd#9515 opened: cmd/snap-confine: update path to snap-device-helper in AppArmor profile
[07:34] There is an entirely new user-space API for GPIO lines. Naturally, this API is rigorously undocumented, but some information can be gleaned from this commit.
[07:34] from https://lwn.net/Articles/834157/
[07:35] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b53911aa872d
[07:35] zyga: only 5+ years from becoming available in every major vendor's BSPs
[07:43] zyga: got disconnected from my bouncer so not sure if my last msg got through - ok, cool, you know this better than me - I'll push those couple fixes - let me know if there is anything else you need me to look at
[07:46] amurray, your last message I got was about /etc/ssl.conf
[07:46] amurray: please push them, I'll rebase my changes on top
[07:46] amurray: and I see your point of view, I just want to try doing the less problematic over time thing instead
[07:54] zyga: thanks, I understand where you are coming from and appreciate your focus on avoiding more technical debt - I just want to make sure we get it fixed in time for release - I'll stop worrying now though, I've pushed my fixes (fwiw I managed to create a local arch qemu VM for testing... so that was a minor win 🏆)
[08:02] woot
[08:02] thanks!
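Editor's sketch of option 1) from the discussion above: resolve the base snap's `current` symlink to a concrete revision, and fail loudly if readlink fails, since the missing error handling on readlink was what masked the unlinked base snap. This is only an illustration in Python; snap-update-ns itself is not written this way, and the names `resolve_current_revision`, `apparmor_source_dir` and `BaseSnapUnlinkedError` are hypothetical.

```python
import os
import tempfile


class BaseSnapUnlinkedError(Exception):
    """Raised when the 'current' symlink of a snap cannot be resolved."""


def resolve_current_revision(snap_mount_dir: str, snap_name: str) -> str:
    """Return the revision that <snap_mount_dir>/<snap_name>/current points at."""
    link = os.path.join(snap_mount_dir, snap_name, "current")
    try:
        target = os.readlink(link)
    except OSError as err:
        # Report the failure instead of carrying on with an empty path,
        # which would yield a nonsense mount profile.
        raise BaseSnapUnlinkedError(f"cannot read {link}: {err}") from err
    revision = os.path.basename(target.rstrip("/"))
    if not revision:
        raise BaseSnapUnlinkedError(f"{link} points at unusable target {target!r}")
    return revision


def apparmor_source_dir(snap_mount_dir: str, base_name: str) -> str:
    """Build the revision-specific path to etc/apparmor.d inside the base snap."""
    revision = resolve_current_revision(snap_mount_dir, base_name)
    return os.path.join(snap_mount_dir, base_name, revision, "etc", "apparmor.d")


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than the real /snap.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "core20", "123"))
        os.symlink("123", os.path.join(root, "core20", "current"))
        print(apparmor_source_dir(root, "core20"))
```

As noted in the chat, the caller would still need to know the base snap name, which is why options 1) and 3) end up needing the same information.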
[08:05] ehh, power outage
[08:39] amurray your mode wins
[08:39] we cannot use snap-update-ns
[08:40] https://github.com/snapcore/snapd/pull/9516
[08:40] PR #9516: cmd/snap-confine: mask host's apparmor config
[08:40] PR snapd#9516 opened: cmd/snap-confine: mask host's apparmor config
[08:40] I'll take your test and adjust
[08:52] pedronis: could you please check 9516? if it looks good we should land it into groovy today to unblock the release team
[09:00] should it have a spread test? on top of docker one?
[09:00] yes, I will
[09:00] it's just something we drafted during the 1:1
[09:01] I'll adjust the mount ns test first to make it not red
[09:01] then convert the test Alex wrote
[09:03] I did a pass
[09:05] PR snapd#9515 closed: cmd/snap-confine: update path to snap-device-helper in AppArmor profile
[09:05] hmmm
[09:05] error: invalid tests/main/snap-env environment: invalid variable name: INSTANCE_KEY/regular
[09:05] did master spread regress?
[09:07] zyga: uh, where do you see that?
[09:08] may have been a fluke, I copied ~/go from x240 over to a pristine system and ran spread from there
[09:08] it goes away if I give the entry an empty string value: INSTANCE_KEY/regular: ""
[09:14] zyga: something strange on opensuse, we seem to be reloading s-c apparmor profile from the core snap there
[09:15] reloading?
[09:15] mborzecki note, that is possible
[09:15] those are separate profiles
[09:15] zyga: https://paste.ubuntu.com/p/fQFXkdN8tQ/
[09:16] is that within one test?
[09:16] perhaps snapd reload?
[09:17] zyga: this one type=AVC msg=audit(1603098317.172:596): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/snap/core/10251/usr/lib/snapd/snap-confine" pid=19468 comm="apparmor_parser"
[09:17] again, that is normal
[09:17] what's the concern?
[09:24] zyga: i think it messes up the profiles, and then s-c complains about elevated permissions
[09:24] mborzecki how can it mess up the profiles?
[09:24] zyga: https://paste.ubuntu.com/p/C3Xcrx82y2/
[09:24] which snap-confine is complaining (what is the path that is executed)
[09:25] can you go to /proc/pid/attr
[09:25] I think this could be a different bug
[09:25] what do you see there?
[09:25] mborzecki can you compare
[09:26] /proc/PID/attr/current and /proc/PID/attr/apparmor/current
[09:26] specifically, if you could gdb snap-confine invoked from that path
[09:26] break it somewhere
[09:26] and compare
[09:26] note that the absolute path of invoked snap-confine matters
[09:26] because it is that path alone that determines the confinement profile
[09:26] mborzecki tl;dr; version is that IIRC kernel has changed things
[09:26] and apparmor bits are in apparmor/current
[09:26] and not in current anymore
[09:26] zyga: it's opensuse, we are invoking /usr/libexec/snapd/snap-confine
[09:27] because of reasons
[09:27] ok
[09:27] that's easier to debug then
[09:27] and the profile you see there is not used
[09:27] as it's compiled for an absolute path /snap/core/number/usr/lib/snapd/snap-confine
[09:27] it's not, s-c reports it's unconfined (and proceeds to run as a user)
[09:27] I think snap-confine is broken :)
[09:27] as in, it's doing this check incorrectly
[09:27] pff, proceeds to run as root (complains when running as user)
[09:27] yep
[09:28] please look for what I mentioned
[09:28] and then we know
[09:28] if the profile is loaded then it's applied based on path
[09:28] snap-confine reads one of the two files
[09:28] perhaps it should read the other now
[09:28] it's done via libapparmor
[09:28] but perhaps it's also buggy
[09:28] mborzecki I remember reading there was some motivation for this change
[09:29] just not sure if it's some default or when it was merged
[09:29] probably very few apps check their own confinement
[09:29] zyga: fwiw, there's no /proc/pid/attr/apparmor/ at all
[09:29] mborzecki what's the kernel version?
[09:30] zyga: 5.3.18
[09:30] woo
[09:30] why so old?
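Editor's sketch of the comparison requested above: read both `/proc/PID/attr/current` and `/proc/PID/attr/apparmor/current` for one process and show them side by side. A file that cannot be read is reported as `None`, which covers the case reported here where the `apparmor/` directory does not exist. The helper names and the `proc_root` parameter are hypothetical, and this makes no claim about which of the two files snap-confine or libapparmor actually reads.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def read_label(path: Path) -> Optional[str]:
    """Return the confinement label stored in path, or None if unreadable."""
    try:
        raw = path.read_text()
    except OSError:
        # Missing file, missing directory, or a kernel that rejects the read.
        return None
    return raw.strip("\x00\n ")


def confinement_labels(pid: int, proc_root: str = "/proc") -> Dict[str, Optional[str]]:
    """Collect the labels from both attr locations for the given pid."""
    attr = Path(proc_root) / str(pid) / "attr"
    return {
        "current": read_label(attr / "current"),
        "apparmor/current": read_label(attr / "apparmor" / "current"),
    }


if __name__ == "__main__":
    # Inspect the current process; pass another pid to inspect snap-confine.
    for name, label in confinement_labels(os.getpid()).items():
        print(f"{name}: {label!r}")
```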
[09:30] i have 5.8.14 in a tumbleweed vm
[09:31] mborzecki: hah
[09:31] something is fiiishy
[09:31] maybe desync with kernel and libapparmor?
[09:31] kernel from GCE
[09:31] and library from actual suse
[09:31] mborzecki what's the full uname?
[09:31] zyga: looks like a leap 15.2 kernel but tumbleweed?
[09:31] :D
[09:32] maybe cachio invention
[09:32] 5.3.18-lp152.19-default
[09:32] dist-upgraded or something
[09:32] yep
[09:32] definitely
[09:32] * zyga hands maciek the pieces back
[09:32] not a
[09:32] not a good bug
[09:32] heh
[09:36] * zyga runs tests and picks up tea
[09:36] hah :P don't run away
[09:36] no no, just a moment
[09:36] mborzecki can we upgrade the kernel in-place?
[09:36] mborzecki maybe we do need a sanity test for kernel version
[09:37] zyga: hm, really feels like the updates should be done via tumbleweed-cli, i don't understand why it's not done like this
[09:37] mborzecki: I think it starts with the image
[09:37] they are almost all custom made
[09:37] in ways that nobody reviewed
[09:37] zyga: want a hand with 9516 or are you on it (the busy-work like pulling the tests from the other PR, adding comments/bug references)
[09:37] zyga: i'll poke cachie when he's around
[09:38] +1
[09:38] mvo sure: if you pick up the spread test from the other location that would be good
[09:38] mvo note that the only test we should run now is a 20.10
[09:38] we never ran it on 20.10 so we didn't notice the original fix was failing on the current symlink
[09:38] * zyga really goes to get tea
[09:45] re
[09:45] PR snapd#9517 opened: spread, tests: tweaks for openSUSE
[09:47] +1
[09:47] zyga: hm, the test we have in the other pr is a bit indirect, can we test more directly?
[09:47] mvo: I think the test should be
[09:48] on 20.10 we can compile a profile from within a snap using {multipass,docker}-support
[09:48] mvo: what did you had on your mind?
[09:48] *have
[09:48] zyga: I cherry picked all the bits now from the other PR and run it on 20.10
[09:48] zyga: I had in mind actually installing the problematic snaps
[09:49] zyga: well, we already have docker-smoke
[09:49] zyga: so that one should work once we have the fix
[09:49] mvo: real snaps are poor regression tests
[09:49] as they shift around
[09:49] zyga: was wondering if we should have a multipass-smoke
[09:49] we might have both
[09:49] yeah
[09:49] zyga: yeah, sorry, you are right
[09:49] smoke tests + simplified regression test
[09:49] +1
[09:50] mborzecki can we update the kernel to tw kernel and REBOOT?
[09:51] zyga: not sure it's worth meddling with the kernel at this point, i'd rather have cachio look into building proper images starting from some recent tw snapshot
[09:51] ok
[09:51] yeah
[09:51] I think nobody but him knows exactly what is in that image
[09:52] heh
[09:53] mvo: I'll push some more fixes there, (apparmor and mount ns changes)
[09:53] just doing one more iteration to make sure it's good
[09:54] zyga: pushed one more patch to #9517, we were not reloading the right apparmor profile file on install
[09:54] PR #9517: spread, tests: tweaks for openSUSE
[09:54] +1
[09:54] see it
[09:55] that's a fallout from the recent change, right?
[09:55] yes, lib -> libexec
[09:55] right
[09:55] zyga, amurray I cherry picked all the test fixes into pr#9516 please double check. also tweaked comments. test is running now
[09:56] heh, i'm sure we would have caught that earlier if the testing worked :/
[09:56] mvo: yeah
[09:56] mborzecki: yeah
[09:56] mvo: thanks, I'll check shortyl
[09:56] *shortly
[09:56] zyga: no rush
[09:58] curious to know if there is anyone else having trouble while booting into core20.
with the gadget.yaml file from here: https://github.com/snapcore/pc-amd64-gadget/blob/20/gadget.yaml
[09:58] i always get ubuntu-boot not found message, and device falls back to grub emergency
[09:59] * zyga defers clmsy's question to other team members
[10:00] could it be possible because my usecase is a bit different
[10:00] https://pastebin.com/su8AbaK0
[10:00] clmsy: this is the gadget we use for all our image tests so it *should* work (not saying it does :) what can I do to reproduce this?
[10:00] clmsy: i.e. how do you build the image etc?
[10:03] i build the image through ubuntu-image with a signed model assertion, here i posted the full thing with some parts obv redacted: https://pastebin.com/HvaENzMN
[10:03] i use the exact same gadget.yaml file
[10:04] image build is a success but when i boot, i get ubuntu-boot not found message then it drops to emergency grub section, then the-tool service has a fatal error where it says one of the following is not found [kernel core snapd]
[10:04] the thing that makes me question is setting both core20 and core18 as type: base
[10:04] but if i don't do that the ubuntu-image fails on me, saying that one of the snaps need core18 as base
[10:06] clmsy: what's the version of snapd on your build host?
[10:06] i do want to ask something related to this, i used to work on core16 for the same device, and when i did snapcraft build on my gadget it used to create staging folder etc, when i do it now with core20 as base:core20 in snapcraft file. there is no stage folder at all but still build succeeds
[10:07] this is the output for snap --version for my build host: snap 2.47
[10:07] snapd 2.47
[10:07] series 16
[10:07] ubuntu 20.04
[10:07] kernel 5.4.0-48-generic
[10:07] ah shit sorry for multiple messages :/
[10:08] clmsy: does it build successfully if you keep type: base for core20 only?
[10:08] no worries
[10:08] clmsy: when you boot, can you see it creating partitions and rebooting and then failing?
or does it not even make it through the "install" state of the boot?
[10:09] (and is my question comprehensible?)
[10:12] clmsy: a screenshot/serial log would be useful if you can get that
[10:12] mborzecki: if i change core18 from base to core and keep base for core20 only this is what happens during ubuntu-image: error: core snap has unexpected type: base
[10:13] yes let me get more information on the boot what exactly is happening
[10:13] clmsy: try completely dropping the `type: ...` line from the core18 entry in the gadget
[10:14] IIRC some people bumped into this before
[10:14] zyga: hm thought it was fixed in 2.47
[10:16] beforehand i was told it was related to this bug:
[10:16] https://bugs.launchpad.net/snapd/+bug/1883973
[10:16] Bug #1883973: cannot boot uc20 model with multiple base snaps
[10:19] mborzecki: this is what happens then: error: snap "core18" has unexpected type: base
[10:19] somehow defaults to it ?
[10:20] maybe just drop core18 entirely?
[10:21] btw a quick capture of what is happening at boot: https://streamable.com/hoqzfz
[10:23] morning folks
[10:24] ijohnson hello
[10:24] welcome back, how was your week off?
[10:24] o/ zyga
[10:24] it was nice, it's gotten a bit colder now but feeling well rested :-)
[10:26] ijohnson: hey, welcome back!
[10:27] hi mvo
[10:35] mvo, mborzecki: https://github.com/snapcore/snapd/pull/9518
[10:35] PR #9518: tests: add value to INSTANCE_KEY/regular
[10:35] weird, would love if you guys could confirm master fails on your machines, with current spread
[10:35] PR snapd#9518 opened: tests: add value to INSTANCE_KEY/regular
[10:38] zyga: what fails?
[10:38] mborzecki read the commit message please
[10:38] spread refuses to run
[10:38] master spread, master snapd
[10:47] zyga: https://build.opensuse.org/request/show/842527
[10:50] +1
[10:50] thx
[10:50] pedronis: can I merge https://github.com/snapcore/snapd/pull/9512?
[10:50] PR #9512: o/snapstate: move setting updated SnapState after error paths
[11:01] mvo: pushed trivial fixes to apparmor PR
[11:02] * zyga afk to make more tea
[11:05] zyga: haha, so tw is green now :P https://github.com/snapcore/snapd/pull/9517/checks?check_run_id=1274508678
[11:06] PR #9517: spread, tests: tweaks for openSUSE
[11:10] zyga: we should chat with mvo about it and #9422
[11:10] PR #9422: overlord: add link participant for linkage transitions
[11:16] pedronis: now?
[11:17] pedronis: or later today?
[11:17] mvo: I have a meeting in ~10, anyway when it works
[11:19] pedronis: if you could give me a quick status of your thoughts in the standup HO that would be great (unless you need the 10min to prepare or for a break or something else)
[11:25] zyga: did alex approve the approach btw? ideally we need a +1 from him too but I guess it's late
[11:26] mvo: it was his preference
[11:26] mvo: as it was simpler conceptually
[11:26] we discussed it here in #snappy
[11:27] both this morning and last week
[11:29] zyga: \o/ thank you
[11:29] zyga: just wanted to confirm
[11:29] I think this was worth exploring, only if to learn what the limitations of the mount systems are
[11:29] or perhaps more of the interface system
[11:36] Is there any way I can figure out why snapd on my laptop appears to be in a fairly tight loop using ~10% of a CPU? "snap changes" says "1566 Doing 6 days ago, at 18:52 BST - Auto-refresh snaps "lxd", "skype""
[11:37] cjwatson I think this is something that pstolowski was debugging
[11:37] cjwatson it's stuck trying to refresh lxd but fails on a mount namespace, perhaps
[11:37] pstolowski ^^
[11:38] mborzecki: do you think you could do a second review for 9516 ?
[11:39] mvo: what should I do about 9499
[11:40] I can adjust it to pass and add comments that it fails on core16 and core20
[11:40] I could dig and try to fix the problem
[11:40] zyga. cjwatson not really, but maybe related.
can you get a list of all tasks (snap change ) and journal log?
[11:41] pstolowski: https://paste.ubuntu.com/p/WBkysQcyRr/, and roughly what journalctl parameters?
[11:41] oh
[11:41] Doing 6 days ago, at 18:52 BST - Download snap "skype" (156) from channel "latest/stable" (40.33%)
[11:41] it's the other bug
[11:41] stuck download
[11:41] also Doing 6 days ago, at 18:52 BST - Download snap "lxd" (17738) from channel "latest/stable" (65.26%)
[11:42] cjwatson are the percentages moving?
[11:42] wow
[11:42] or are they stuck at 0
[11:42] at zero change I mean
[11:42] cjwatson: hit the elusive download issue
[11:42] zyga: not changing as far as I can see
[11:42] right
[11:42] cjwatson: journalctl -u snapd
[11:42] Also "snap change 1566" takes 16 seconds ...
[11:43] cjwatson that's because we hold the lock and save all the time apparently
[11:43] and there are only write locks, no read locks
[11:43] that's a known issue (CC mborzecki )
[11:44] pstolowski: adding -b to just get information from the current boot: https://paste.ubuntu.com/p/MYdjW8d6NJ/
[11:49] cjwatson: thanks, actually the changes are from 6 days ago, might be useful to have log from back then
[11:50] pstolowski: the log I gave you covers that time though?
[11:50] my last reboot was nearly 8 days ago
[11:50] cjwatson: ah yes, you're right, sorry
[11:53] cjwatson: could you please send me your /var/lib/snapd/state.json (you may want to obfuscate your private macaroon there)?
[11:53] ijohnson thanks for the review on fsck
[11:54] cjwatson: after that you can restart snapd service to recover from this
[11:54] zyga: yeah thanks for eventually catching my mistake!
[11:55] ijohnson: the means of landing this are unclear so if you can push the fixes for fsck that would be great
[11:55] zyga: ah ok, yeah the fix should be real simple, I just preferred to be able to demonstrate that the pr fixes the issue with that spread test already in master but sure I can open the presumptive fix to snap-bootstrap and manually verify with your spread test as it exists
[11:56] :D
[11:56] ijohnson if the fix is easy and in snapd itself, just go for it here
[11:56] zyga: sure that should be easy actually
[11:56] cool
[11:56] that will leave only core16 left
[11:56] and core20 in a much better state!
[11:56] woot, thanks for that ijohnson
[11:56] zyga: I will push in a bit still going through all the emails, etc. from last week
[11:56] sure
[11:59] pstolowski: Thanks, emailed to you and restarted
[11:59] cjwatson: thanks! got it
[12:01] and indeed it finished the refresh now
[12:03] cjwatson: peculiar... it's an issue we got reported by one of the customers. not reproducible and we had no logs and very little info so far... maybe the info from you will help progress on that, thank you!
[12:03] pstolowski: good luck
[12:04] I guess laptops are good at reproducing things that might happen in not very reliable network environments
[12:05] indeed. may be something related to suspending & resuming
[12:05] or actually network going down
[12:06] PR snapd#9519 opened: tests: moving the lib section from systems.sh helper to os.query tool
[12:07] both of which are reasonably frequent occurrences for me; this network is not 100% reliable, and I'm moving between two houses at the moment so I suspend and resume a lot
[12:08] zyga: should we close #9509 ?
[12:08] PR #9509: multipass/docker-support: Mount /etc/apparmor* from the base snap <⚠ Critical>
[12:08] yes, we should
[12:09] it was an interesting exploration but we cannot proceed that way
[12:11] PR snapd#9509 closed: multipass/docker-support: Mount /etc/apparmor* from the base snap <⚠ Critical>
[12:22] ijohnson: i've iterated on fsck test a bit, I'll add mkfs test as well but I'm going to have a quick peek if I can fix core16 fsck story
[12:34] mvo: the spread test needs some help
[12:34] + test-snapd-docker-support-app.test-snapd-docker-support
[12:34] 465
[12:35] Warning: unable to find a suitable fs in /proc/mounts, is it mounted?
[12:35] it is trying to load the apparmor profile
[12:35] but without apparmor, that won't work
[12:35] maybe we should merely attempt to compile the profile?
[12:35] zyga: did you see this bug report about livepatch problems? does it make sense to you? https://bugs.launchpad.net/snapd/+bug/1784474
[12:35] Bug #1784474: canonical-livepatch failing to enable
[12:35] mvo with: apparmor_parser --skip-kernel-load
[12:35] cmatsuoka: I did not, let me look
[12:36] interesting
[12:36] 18.04 but no freezer?
[12:37] cmatsuoka thanks again, I asked some questions that should clarify what's going on
[12:38] mvo: I can push that change if you don't mind
[12:38] zyga: thank you
[12:39] zyga: please do
[12:39] thank you
[12:41] PR snapd#9520 opened: tests: moving the core test suite from systems.sh to os.query tool
[12:42] mvo changed locally, will test in private before pushing to avoid spread churn
[12:43] zyga: this is 9516 ?
[12:43] zyga: it's also failing on !ubuntu
[12:43] zyga: or is this what you are talking about?
[12:44] mvo, oh, anything with !apparmor?
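Editor's sketch of the "merely attempt to compile the profile" idea above, using the `apparmor_parser --skip-kernel-load` invocation named in the chat. The wrapper functions are hypothetical, and the real spread test is a shell task, so this only illustrates the shape of the check. `try_compile` returns `None` when `apparmor_parser` is not installed, so it can be called on systems without AppArmor tooling.

```python
import shutil
import subprocess
from typing import List, Optional


def compile_only_command(profile_path: str, parser: str = "apparmor_parser") -> List[str]:
    """Build the argv that parses a profile without loading it into the kernel."""
    return [parser, "--skip-kernel-load", profile_path]


def try_compile(profile_path: str) -> Optional[bool]:
    """Run the compile-only check; None means apparmor_parser is unavailable."""
    if shutil.which("apparmor_parser") is None:
        return None
    result = subprocess.run(
        compile_only_command(profile_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

As the later messages show, a compile-only check can still fail on hosts where /etc/apparmor.d is missing, so it removes the kernel dependency but not the dependency on the profile includes.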
[12:45] zyga: and this one is also related to livepatch, but it was diagnosed as snapd being unable to launch the process in a sandbox: https://bugs.launchpad.net/snapd/+bug/1898473
[12:45] Bug #1898473: canonical-livepatch snap not working
[12:46] cmatsuoka looking
[12:48] quick restart
[12:48] zyga: yes, it's all very red, I have not yet looked at every one of them but on centos it's something like "Kernel needs AppArmor compatibility patch"
[12:48] zyga: i.e. the regression test is failing
[12:48] mvo: I bet that's exactly what this is
[12:48] mvo: my change should fix this, I'll make sure it passes
[12:49] zyga: cool, did you push it yet?
[12:49] no, just waiting for this iteration to see that it fixes fedora32
[12:49] I will push it if so
[12:49] zyga: ok
[12:52] ijohnson: have a look at the bottom of https://manpages.debian.org/stretch/systemd/systemd-fsck@.service.8.en.html
[12:52] ijohnson: the defaults are safe
[12:53] ijohnson but I was wondering if recovery mode should have different set?
[12:53] ijohnson e.g. fsck.repair=yes
[12:54] cmatsuoka replied as well
[12:55] zyga: thanks!
[12:55] thank you for bringing those to my attention :)
[12:56] mvo: oh well, it passes but then fails on AppArmor parser error for /snap/test-snapd-docker-support-app/x1/test-snapd-docker-support.profile in /snap/test-snapd-docker-support-app/x1/test-snapd-docker-support.profile at line 1: Could not open 'tunables/global'
[12:56] I wonder what's missing there
[12:56] it looks like a real problem
[12:57] aha
[12:57] mvo: it fails because /etc/apparmor.d is missing on fedora, obviously :|
[12:57] hmm hmm
[12:59] zyga: meh
[12:59] mvo: https://pastebin.ubuntu.com/p/6dfFSMjPyp/
[12:59] mvo so now we have a choice
[13:00] mvo: accept that multipass and docker do not work on fedora
[13:00] (unless they handle this error internally)
[13:00] or create those directories
[13:00] or ...
not sure
[13:00] go back to snap-update-ns
[13:00] and create more ideas
[13:00] standup time anyway
[13:11] PR snapd#9508 closed: tests: remove ausearch which fails during prepare
[13:12] mvo: pushed
[13:12] zyga: ta
[13:20] pedronis: to get a non-inherited keyring you need to
[13:21] keyring = keyctl(KEYCTL_JOIN_SESSION_KEYRING, 0, 0, 0, 0);
[13:21] then for shared keyring:
[13:21] if (keyctl(KEYCTL_LINK,
[13:21] KEY_SPEC_USER_KEYRING,
[13:21] KEY_SPEC_SESSION_KEYRING, 0, 0) < 0) {
[13:21] that's all
[13:21] keyctl is a syscall
[13:38] zyga: does it look sensible https://github.com/lxc/lxd-pkg-snap/blob/latest-edge/snapcraft/hooks/remove ?
[13:41] pstolowski: it's unmounted inside the mount ns :)
[13:41] it's not leaving the ns, is it?
[13:41] :D
[13:41] so it's not unmounted on the host
[13:42] zyga: but umount on the host works; is that expected?
[13:43] pstolowski: one sec
[13:46] pstolowski: well some more time please
[13:46] pstolowski but the point is
[13:46] pstolowski: the propagation is not letting that unmount event reach the host mount namespace
[13:46] pstolowski so it is really unmounted, from the hook point of view
[13:46] pstolowski: but it stays mounted from ours
[13:46] pstolowski: presumably because the mount is created by something that escapes the mount namespace
[13:47] pstolowski: but removed by something that does not
[13:47] pstolowski the fix is really simple
[13:47] pstolowski: prefix the unmount with nsenter -m/proc/1/ns/mnt
[13:47] that should be enough
[13:47] brb though
[13:51] pstolowski if you want I can explain in a HO
[13:51] zyga: yes please
[13:51] in standup
[13:51] ok
[13:56] PR snapd#9517 closed: spread, tests: tweaks for openSUSE
[13:57] uh, didn't notice there's only 1 review there, but it seems to work fine
[14:03] PR pc-amd64-gadget#51 opened: gadget: add ubuntu-save
[14:04] I'll go and debug fsck but first LUNCH as I'm starving
[14:27] mvo: pushed to https://github.com/snapcore/snapd/pull/9516
[14:27] PR #9516:
cmd/snap-confine: mask host's apparmor config <⚠ Critical>
[14:28] zyga: thank you
[14:50] zyga: 9516 has some strange unit test error that looks like a fluke - okay if I restart?
[14:50] yes please
[14:50] I tried to cancel things
[14:50] because of gpg error
[14:50] +1
[14:50] but I think cancel is not reliable with spread
[14:50] restarting
[14:50] I just looked at it a moment ago
[14:50] maybe cancel does work but takes a while to finish?
[14:50] zyga: yeah, but spread should only run if the unit tests were ok
[14:50] oh, indeed
[14:50] well
[14:51] maybe it's not spread but all of canceling :)
[14:59] zyga: sure, no worries - I looked at 9499 and it looks like we can almost merge it?
[14:59] looking
[15:00] well
[15:00] yes, except it doesn't work :)
[15:00] I mean core20 and core16 fail
[15:00] we could mask those failures but yeah
[15:00] apart from that it's good to go
[15:00] shall I mask them?
[15:00] I'm looking at 16 failure now
[15:00] as it, in theory at least, should work
[15:01] I can mask 16 and 20 quickly and push that
[15:01] 20 will be fixed with ijohnson's patch
[15:01] yes sorry will push later, need to go afk for a bit now actually
[15:01] ijohnson that's totally fine
[15:01] PR snapd#9518 closed: tests: add value to INSTANCE_KEY/regular
[15:01] thanks mvo!
[15:02] mvo: just let me know what we should do in your opinion, I'll make coffee and return here to push this forward [15:02] I must be missing something as, at least on paper, 16 should be good, it has to be something silly in the integration path [15:02] * zyga -> coffee [15:02] zyga, ijohnson no worries, let's not mask and simply wait for fixes [15:03] ijohnson: and no worries, no real rush at this point [15:03] will definitely get done today [15:03] zyga: 16 is surprising [15:07] PR snapcraft#3321 opened: meta: add error check for "command not found" [15:11] re [15:11] made tea instead [15:11] but still good :) [15:11] mvo: yeah, I guess, "gulp" but yeah [15:11] * zyga digs [15:12] mvo: do you think we could have a security review of https://github.com/snapcore/snapd/pull/9516 [15:12] PR #9516: cmd/snap-confine: mask host's apparmor config <⚠ Critical> [15:12] even a brief one? [15:13] jdstrand: I know you are extremely busy lately, do you think you could look at the C changes in ^ and comment [15:13] zyga: sure, if you feel it's important. time is of the essence though, the release team is waiting for this [15:13] alex +1 the design already [15:13] mvo: I understand, I hope we can merge this today [15:13] it should be green on this pass [15:16] fun fact, all our retry strategies share the same problem as download (maybe except for concurrent access) [15:16] ! [15:16] PR snapd#9521 opened: tests: moving main suite from systems.sh to os.query tool [15:16] pstolowski that's a fantastic find ahead of core20 [15:19] * cachio lunch [15:35] pstolowski: !!! [15:35] pstolowski: thanks for finding this issue [15:36] mvo: probably my bug, and 3-4 years old! [15:36] zyga: are you looking into the fsck failure on core16 ? 
if not I'm trying to look at this now [15:36] pstolowski: yeah [15:36] maybe except for download [15:36] yes I am [15:36] I can share what I have [15:36] that one may be new [15:36] I've enabled persistent journal [15:36] ran the test with debug [15:36] I'll have the improved log soon [15:36] stgraber: hey [15:37] PR snapd#9317 closed: [RFC] devicestate: keep log from install-mode on installed system [15:44] mvo found an embarrassing typo, iterating [15:45] zyga: fun, I just had the "NESTED_TYPE" error you had [15:46] I haven't seen that in ages [15:46] more info on fsck in 15 minutes [15:55] mvo good news [15:55] it may be okay [15:55] I'll know in 30 minutes [15:55] afk for now [16:05] pstolowski: hi [16:06] stgraber: hey, see my message on mattermost [16:07] woot [16:07] core16 passes [16:07] zyga: nice! what was it? [16:08] mvo: it was mounted twice [16:08] I really didn't look deeper [16:08] but now core20 is the only failure [16:08] ok [16:08] nice [16:09] ijohnson if you fix core20, we're good [16:12] PR snapd#9512 closed: o/snapstate: move setting updated SnapState after error paths [16:17] mvo: I pushed one, last I hope, thing to https://github.com/snapcore/snapd/pull/9516 [16:17] PR #9516: cmd/snap-confine: mask host's apparmor config <⚠ Critical> [17:05] zyga: mvo: I've got the fix for uc20, just waiting on spread locally before pushing it up [17:06] (assuming there are no other gotchas that pop up but seems to be straightforward I think) === ijohnson is now known as ijohnson|lunch === ijohnson|lunch is now known as ijohnson [17:51] ijohnson: \o/ thanks so much [17:54] one more try on the spread test, first one had an issue unmounting the snapd snap [18:13] PR snapcraft#3312 closed: spread tests: introduce electron-builder test [18:13] PR snapcraft#3322 opened: package repositories: drop $SNAPCRAFT_APT_HOST_ARCH variable [18:23] PR snapcraft#3315 closed: build(deps-dev): bump junit from 3.8.1 to 4.13.1 in 
/tests/spread/plugins/v1/maven/snaps/maven-hello/my-app [18:23] PR snapcraft#3316 closed: build(deps-dev): bump junit from 3.8.1 to 4.13.1 in /tests/spread/plugins/v1/maven/snaps/legacy-maven-hello/my-app [18:27] PR snapd#9516 closed: cmd/snap-confine: mask host's apparmor config <⚠ Critical> [18:29] I'm off but please telegram me if anything comes up about the snapd groovy fix that I just uploaded there [19:01] * ijohnson coffees quickly [19:48] PR snapd#9520 closed: tests: moving the core test suite from systems.sh to os.query tool [19:48] PR snapd#9521 closed: tests: moving main suite from systems.sh to os.query tool [20:03] PR snapcraft#3307 closed: setup.py: assert with helpful error when unable to determine version [23:09] * ijohnson EODs
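Note on the 13:41-13:47 unmount discussion: the suggested fix is to run the unmount from the host mount namespace, whose handle for PID 1 lives at /proc/1/ns/mnt. The sketch below shows roughly the same idea in C with setns(2) and umount2(2). It is an illustration under assumptions, not the hook's or snapd's code: mnt_ns_path and umount_in_mnt_ns_of are hypothetical names, the caller is assumed to be single-threaded and to hold sufficient privileges (CAP_SYS_ADMIN at least), and the switch of namespace affects the calling process itself.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* open, O_RDONLY, O_CLOEXEC */
#include <sched.h>      /* setns, CLONE_NEWNS */
#include <stdio.h>      /* snprintf */
#include <string.h>     /* strlen */
#include <sys/mount.h>  /* umount2 */
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* close */

/* Build "/proc/<pid>/ns/mnt" into buf. Returns 0 on success,
 * -1 if the buffer is too small. */
int mnt_ns_path(pid_t pid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/proc/%ld/ns/mnt", (long)pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Join the mount namespace of `pid`, then unmount `target` there.
 * Returns 0 on success, -1 on any failure (errno is set by the
 * failing call when one was made). */
int umount_in_mnt_ns_of(pid_t pid, const char *target)
{
    char path[64];
    int fd, rc;

    /* Reject an empty target before touching any namespace. */
    if (target == NULL || strlen(target) == 0)
        return -1;
    if (mnt_ns_path(pid, path, sizeof path) != 0)
        return -1;

    fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    /* From here on, mount operations act on the target namespace. */
    rc = setns(fd, CLONE_NEWNS);
    close(fd);
    if (rc != 0)
        return -1;

    return umount2(target, 0);
}
```

Calling umount_in_mnt_ns_of(1, "/some/mount/point") (the path is a placeholder) is roughly the in-process counterpart of prefixing umount with nsenter -m/proc/1/ns/mnt. Because the unmount then happens in the host namespace, it no longer depends on mount propagation carrying the event out of the snap's namespace.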