=== JanC is now known as Guest39672
=== JanC_ is now known as JanC
[05:39] morning
[05:54] Hey
[05:54] Taking the dog out
[05:54] zyga: found anything interesting about the lxd issue?
[06:03] zyga: reading through https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1871652 nice find!
[06:03] Bug #1871652: snap run hangs on system-key mismatch due to reexec and shutdown
[06:14] mvo: hey
[06:14] good morning mborzecki
[06:15] zyga: hm i was worried that maybe the client gets stuck, but looking at the backtrace the client timeout seems to work
[06:16] zyga: i think that the unfortunate part is that the client timeout (overall timeout for all retried requests) is 50s, then *12 retries in snap run, we're looking at 10 minutes after which snap run would fail eventually
[06:18] mborzecki: it's debugged
[06:18] :)
[06:18] mborzecki: if you are talking about the lxd issue
[06:18] mvo: good morning :)
[06:18] * zyga woke up in a good mood and just returned from a dog & bike ride
[06:18] zyga: good morning! you seem to be in a good mood :) ?
[06:18] indeed
[06:18] zyga: i know, just looking at the backtrace you posted there, https://github.com/snapcore/snapd/pull/8462 and the client.do() loop
[06:18] PR #8462: cmd/snap: don't wait for system key when stopping
[06:19] yesterday surely ended on a high note
[06:19] I have some more thoughts about how this problem is annoying
[06:19] but I think the fix is valid
[06:19] mborzecki: yeah, a timeout of 10min seems a bit excessive
[06:19] zyga: yeah, I like the idea to just check for shutdown
[06:20] mvo: there's this comment that isn't true anymore https://github.com/snapcore/snapd/pull/8462/files#diff-0ffbc404d8a8e3aaeca8cd9d066c3d71R160
[06:20] PR #8462: cmd/snap: don't wait for system key when stopping
[06:20] uhh it's `// connect timeout for client is 5s on each try, so 12*5s = 60s`
[06:21] mborzecki: uh, so it looks like we try to accommodate this situation already? is that check buggy?
[06:21] mborzecki: also if the retry timeout should be max 60s but in reality is 10min is there a different bug there too :( ?
[06:22] I can debug this further
[06:22] I wanted to check the solution in practice
[06:22] the socket is there
[06:22] but will never activate
[06:22] mvo: probably the client retry bits evolved separately
[06:22] maybe we just hang on connect?
[06:22] ahhh
[06:22] fun
[06:22] heh
[06:22] zyga: yeah
[06:22] ok, I'll get back to my coffee
[06:23] mborzecki: aha, yeah, that makes sense
[06:23] but I'm happy :)
[06:23] mborzecki: sorry, I see these lines are from an open PR
[06:23] if the socket wasn't there it would fail much earlier i believe
[06:23] zyga: woah, thanks so much for adding this PR so quickly
[06:23] :D
[06:23] after breakfast I'll verify this
[06:24] and write some tests
[06:24] zyga: yeah, having a test there would be great
[06:24] zyga: mvo: so actually a funny scenario, the socket we use to talk to snapd is there, but snapd may be inactive, how do you find out that the other end is inactive if poking the socket isn't reliable?
[06:24] any issues with spread?
[06:26] mborzecki: we should think about how to prevent the bug for real
[06:26] mborzecki: I realized it's much harder because the dependency is dynamic
[06:26] mborzecki: we depend on the active reexecution target that may be core or snapd and the revisions may change at runtime any number of times
[06:27] mborzecki: which is not great
[06:31] mvo: so 2.44... 4?
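(A side note on the timeout arithmetic above: a minimal Go sketch of the pattern being described, with invented names — this is not the actual snapd client code. The point is that an overall per-attempt client timeout multiplied by an outer retry loop yields the ~10-minute worst case, which is exactly what happens when the activated socket exists but the daemon behind it never answers:)

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    const (
        retries           = 12               // outer retries in "snap run"
        perAttemptTimeout = 50 * time.Second // overall client timeout per attempt
    )

    // talkToSnapd stands in for one request over the snapd socket; when the
    // socket exists but nothing ever accepts the connection, each attempt
    // burns the full per-attempt timeout before failing.
    func talkToSnapd() error {
        time.Sleep(perAttemptTimeout)
        return errors.New("request timed out")
    }

    func main() {
        start := time.Now()
        var err error
        for i := 0; i < retries; i++ {
            if err = talkToSnapd(); err == nil {
                break
            }
        }
        // 12 * 50s = 600s: "snap run" only gives up after ~10 minutes.
        fmt.Printf("gave up after %v: %v\n", time.Since(start).Round(time.Second), err)
    }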
[06:31] zyga: 2.44.3
[06:31] perfect
[06:31] thank you
[06:31] zyga: I hope I can upload that today
[06:31] tonight
[06:31] something like this :)
[07:00] hmm preseed reset is failing in an interesting way
[07:01] saw the same problem twice already
[07:01] yeah I noticed
[07:01] did you debug it more? I didn't look deeper
[07:01] the log https://paste.ubuntu.com/p/gTcdq4h8CF/
[07:14] morning
[07:15] good morning pawel
[07:17] pstolowski: hey
[07:17] pstolowski: preseed reset hangs on 20.04 https://paste.ubuntu.com/p/gTcdq4h8CF/
[07:17] pstolowski: but i'm not able to reproduce it manually
[07:17] pstolowski: good morning
[07:19] mborzecki: interesting, i'll take a look
[07:22] doing 'restart all jobs' in gh actions is actually very confusing
[07:23] looks like it's first restarting the unit tests job, then the canary jobs, and then the stable ones
[07:23] and E: Failed to fetch http://pkg.jenkins.io/debian-stable/binary/jenkins_2.222.1_all.deb Could not connect to pkg.jenkins.io:80 (52.202.51.185), connection timed out
[07:26] presumably it is respecting the job dependencies: the stable jobs can't restart until the restarted canary jobs have completed, which can't restart until the restarted unit tests have completed
[07:27] jamesh: exactly
[07:27] mborzecki: yeah, they don't invalidate past results until such jobs actually start
[07:27] mborzecki: if it's a one-off failure like that just ask mvo to override
[07:27] no use in burning money on this
[07:31] can I get a 2nd review on green https://github.com/snapcore/snapd/pull/8403
[07:31] PR #8403: sandbox/cgroup: avoid making arrays we don't use
[07:31] it's not much and I'd like to get it in and have one less
[07:35] jdstrand: thank you for the reviews
[07:35] I'll break for breakfast and then get back to work
[07:36] zyga: which PR is that? I can override if needed
[07:43] pstolowski: I reviewed #8414, thank you
[07:43] PR #8414: o/configstate: core config handler for persistent journal
[07:43] couple of small comments
[07:50] pedronis: ty
[07:52] mborzecki: yeah, hangs for me too when run on gc. will try to add some debug
[07:53] pstolowski: oh, you managed to reproduce it?
[07:54] zyga: https://github.com/snapcore/snapd/pull/8462#pullrequestreview-390565291
[07:54] PR #8462: cmd/snap: don't wait for system key when stopping <⚠ Critical>
[07:54] mborzecki: it seems so.. it's hanging on < /mnt/cloudimg/var/lib/snapd/desktop/applications, i'm waiting for spread to timeout
[07:55] pstolowski: ha, interesting, i ran with -shell and executed the test line by line
[07:55] pstolowski: btw. diff -up is easier to read there
[07:55] mborzecki: also it's interesting it found a diff
[07:58] PR snapd#8450 closed: selinux: export MockIsEnforcing; systemd: use in tests
[08:00] mvo: not sure, it was pawel
[08:00] mborzecki: ta
[08:00] * mborzecki wonders why it's showing `degraded` here
[08:01] mborzecki: systemctl --failed
[08:02] zyga: yeah, that's the mystery, shadow.service apparently failed :P
[08:02] shadow.service?
[08:02] what is that
[08:02] I don't have it
[08:02] is it related to homed?
[08:02] zyga: idk maybe https://paste.ubuntu.com/p/f7hkWx94vQ/
[08:03] which package ships that?
[08:04] zyga: surprise surprise.. `shadow` :P
[08:05] zyga: btw it runs the following /bin/sh -c '/usr/bin/pwck -r || r=1; /usr/bin/grpck -r && exit $r'
[08:09] pwck?
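(For context on that one-liner: pwck -r and grpck -r are read-only consistency checks of the passwd/shadow and group/gshadow databases. A rough Go rendering of the shell logic, as a sketch for illustration only — the real unit simply runs /bin/sh:)

    package main

    import (
        "os"
        "os/exec"
    )

    // runCheck runs one of the shadow-suite checkers in read-only mode (-r).
    func runCheck(path string) error {
        cmd := exec.Command(path, "-r")
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        return cmd.Run()
    }

    func main() {
        rc := 0
        if err := runCheck("/usr/bin/pwck"); err != nil {
            rc = 1 // remember that pwck failed ("|| r=1") ...
        }
        if err := runCheck("/usr/bin/grpck"); err != nil {
            os.Exit(1) // ... but a grpck failure exits non-zero on its own
        }
        os.Exit(rc) // grpck passed, so propagate pwck's result ("&& exit $r")
    }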
[08:09] wow
[08:09] I learned something today already
[08:09] YoU HaVe BeEn Hax0rEd
[08:16] re
[08:20] zyga: yeah, from what i managed to find it checks /etc/group against /etc/gshadow
[08:20] so is something corrupted on your system?
[08:21] zyga: nah, i had a `sudo` group at some point that was listed in gshadow, but got removed from /etc/group
[08:23] zyga: probably something went out of sync during one of my arch 'installs', that's actually rsyncing the whole sysroot from another arch system, wouldn't be surprised since the actual install from scratch was years ago
[08:43] mborzecki: can you look at https://github.com/snapcore/snapd/pull/8462 again please
[08:43] PR #8462: cmd/snap: don't wait for system key when stopping <⚠ Critical>
[08:50] zyga: thanks for adding the test to 8462
[09:01] mvo: I'll verify this in the machine where it is easy to reproduce now
[09:02] zyga: \o/
[09:14] I believe I can write a spread test for this as well
[09:29] zyga: extra brownie points if that is possible and not too much work
[09:29] mvo: I think so, just a moment to know
[09:30] zyga: \o/
[09:30] mvo: gotta justify a push to fix that silly typo :D
[09:45] test in progress
[09:51] * mvo hugs zyga
[10:01] morning folks
[10:01] hey zyga I saw this failure for session-tool on one of my PRs that is relatively up to date with master https://pastebin.ubuntu.com/p/tj9qTTN6Wf/
[10:01] looks like it's a gdm issue with 19.10?
[10:01] yeah
[10:01] I saw it, I asked sergio to remove gdm
[10:01] hi ijohnson !
[10:01] ok
[10:01] I'll send a patch to stop the gdm session
[10:01] o/ pstolowski
[10:01] though
[10:01] maybe I should remove that part of the test
[10:01] it's not like we are leaking our sessions
[10:02] it's just a goose chase
[10:02] ijohnson: what do you think?
[10:02] mmm it's a bit annoying, but in this instance also genuinely useful to have it tell us that something is on the image that is leaking state around
[10:02] I dunno
[10:03] is it related to the bug sergio raised recently? https://bugs.launchpad.net/snapd/+bug/1868857
[10:03] Bug #1868857: Installing evolution-data-server on test images pulls in GDM and the desktop
[10:04] pstolowski: yes
[10:04] after checking and checking I managed to convince him we have GDM :)
[10:07] brb
[10:10] mmm okay another random failure from overnight about being unable to connect to the systemd user session: https://pastebin.ubuntu.com/p/QdyVGHHDrJ/
[10:11] mborzecki: this preseed-reset hang issue is mysterious; i added debug that should show up right after the last diff line where it hangs, but it's quiet :/
[10:11] pstolowski: core-persistent-journal is failing on core16
[10:12] pedronis: yeah, i've seen this, didn't happen when run locally, investigating
[10:15] pedronis: mvo: i'm looking into the mount point rename
[10:27] mborzecki: thank you
[10:30] mvo: pedronis: i'll split the /run/mnt/host bind mount till after we have the directory in the core snap
[10:31] i mean the /host directory
[10:31] +1
[10:32] afk for another moment, sorry :/
=== Aavar_ is now known as Aavar
[10:42] mvo: could you use your magical powers to merge https://github.com/snapcore/snapd/pull/8451 ? It's been restarted numerous times and all the current failures there have either been reproduced and known by others, or have been reported
[10:42] PR #8451: osutil: mock proc/self/mountinfo properly everywhere
[10:44] ijohnson: sure
[10:44] PR snapd#8451 closed: osutil: mock proc/self/mountinfo properly everywhere
[10:44] thank you \/o
[10:44] oh whoops I was too excited
[10:45] \o/
[10:45] haha
[10:51] pstolowski: now, it passed, it seems flakey somehow
[10:55] pedronis: yes, maybe there is something flaky. i'm running it in a loop locally now
[11:03] re
[11:05] PR snapd#8464 opened: cmd/snap-boostrap, boot: use /run/mnt/data instead of ubuntu-data
[11:05] mvo: pedronis: ^^
[11:05] i did not add the /run/mnt/data -> /run/mnt/ubuntu-data bind mount too, let's see if the tests pass
[11:07] mborzecki: they won't, actually initramfs will need some changes
[11:07] hmm ah right, there's some hard-coded names there too
[11:07] mborzecki: https://paste.ubuntu.com/p/cF4NVBChbG/
[11:08] looks like the bind mount data -> ubuntu-data could make it work tho
[11:08] yes, but we do want to change the initrd then
[11:09] because otherwise it's a bit too many levels of mounts
[11:09] mborzecki: also my pastebin has type -d (not sure why I did that), so there's a couple more things actually
[11:10] pedronis: just ran it 10 times without failure. weird. will give it one more spin
[11:10] pedronis: mborzecki: suspicious that the initrd has things like this:
[11:10] echo 'LABEL=ubuntu-boot /run/mnt/ubuntu-boot auto defaults 0 0' >> /run/image.fstab
[11:11] that seems like it would defeat the purpose of our cross-checking no?
[11:11] ijohnson: it's optimizing some mounts
[11:11] ijohnson: you'll have to discuss what that means
[11:12] mmm yes
[11:12] pedronis: are the results of the discussion this morning summarized somewhere?
[11:13] ijohnson: seems you got the doc
[11:13] yes mborzecki PM'd it to me
[11:14] zyga: btw i've re-requested your review of #8414 as it changed substantially
[11:14] PR #8414: o/configstate: core config handler for persistent journal
[11:14] ack
[11:14] I'll look in 10 minutes
[11:18] pedronis: I left a comment in the doc, so will we now have /run/mnt/boot instead of (or in addition to) /run/mnt/ubuntu-boot ?
[11:18] ijohnson: no
[11:18] ok, so the changes are just for ubuntu-data really
[11:19] (and all alter egos of ubuntu-data)
[11:19] ijohnson: yes, we'll have temporarily both data and ubuntu-data until initramfs is fixed
[11:19] sure
[11:28] pstolowski: looking
[11:33] jdstrand: possibly I'm still broken? https://forum.snapcraft.io/t/snapcraft-and-strict-multipass-call-for-testing/16488/5
[11:33] diddledan: yeah that's not something we're doing
[11:33] strangely at least two snaps have started fine, but those are desktop apps and I spent some time cooking toast
[11:34] ... after reboot, so I was well past apparmor starting when I logged in
[11:35] Saviq, yeah LXD is also dead
[11:38] pstolowski: +1
[11:39] diddledan: have a look at https://github.com/ubuntu/zsys/issues/60#issuecomment-609729305 for what fixed things for me on zfs root - not snapd, but the overall problem may be the same - look in the journal for things refusing to mount due to target not being empty
[11:44] it's not that.. I have a correct set of files in /etc/zfs-list.cache and there are no failed mounts in the journal
[11:46] mvo: sorry for the lag, i have a test
[12:04] I'm running a few more iterations to recheck it fails without the fix and to remove redundant parts
[12:04] I'll push the final version before the standup
[12:05] most likely in 20 minutes, after the next run
[12:06] pedronis: 100 runs and no failure; i wonder if we were seeing a failure from before the USR1 commit
[12:06] pstolowski: maybe, let's see if it gets green and can land
[12:07] pedronis: doh.. it failed on 20.04
[12:08] pedronis: on preseed-reset, which is the other issue i'm investigating
[12:08] pstolowski: could you quickly recheck my #8436, I had to change the spread test because I remember it passing on core20 but actually the systemd there now uses a different property name
[12:08] PR #8436: configcore,tests: use daemon-reexec to apply watchdog config
[12:09] pstolowski: maybe we need an explicit journalctl --flush ? or do some activity that is none to produce logs?
[12:09] pedronis: looking
[12:09] s/none/known/
[12:10] pedronis: maybe, but it seems that enabling logging writes a single starting entry, so the only question is if flush is needed
[12:11] pedronis: but the problem now is the preseed-reset test, which breaks with master
[12:12] diddledan: can you perform: sudo systemd-analyze plot > ./1871148-vm-no-varlib-mount_diddledan.svg' and attach it to https://bugs.launchpad.net/apparmor/+bug/1871148?
[12:12] Bug #1871148: services start before apparmor profiles are loaded
[12:12] (without that trailing "'" of course
[12:12] )
[12:21] jdstranddone :-)
[12:21] jdstrand done :-)
[12:22] aaha, we started shipping var/lib/snapd/desktop/applications in the pkg, that's the primary reason for the preseed-reset test failure
=== pedronis_ is now known as pedronis
[12:38] diddledan: https://bugs.launchpad.net/snapd/+bug/1871148/comments/24
[12:38] Bug #1871148: services start before apparmor profiles are loaded
[12:39] mvo, zyga: ^ I added a snapd task. please see my comment. it seems that root on zfs is aggravating the condition that apparmor.service might start after snap services
[12:39] looking
[12:40] since we don't see it on non-root-on-zfs systems (even though the possibility is there)
[12:40] yeah, I think we need to think about how to handle this
[12:41] mvo: this is possibly another 2.44 point release. up to you to decide, but with focal making zfs an option in the installer, and that seems to push the system into this bug more than others, ...
[12:42] PR snapd#8465 opened: tests: update snap-preseed --reset logic to acommodate for 2.44 change <⚠ Critical>
[12:44] pedronis, hey
[12:44] zyga: I need to step away, but maybe this is the time to align with non-Ubuntu-but-apparmor-enabled systems? I forget the details, but iirc, there is an additional snap-apparmor unit or similar that can be After apparmor, and then snapd can add After snap-apparmor to the units. I defer to you, mvo, etc on the design and am happy to review a PR
[12:44] I see this error on uc20 nested tests
[12:44] https://paste.ubuntu.com/p/sBWXC4VZyG/
[12:44] is it something new?
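(A sketch of the mechanics behind that flush question — a hypothetical helper, not the actual #8414 handler: making the journal persistent amounts to creating /var/log/journal, and `journalctl --flush` then asks journald, via SIGUSR1, to move the runtime journal from /run/log/journal into it; flushing explicitly avoids racing a test that looks for the files right away:)

    package main

    import (
        "log"
        "os"
        "os/exec"
    )

    // enablePersistentJournal mirrors, in broad strokes, what a core config
    // handler has to do to turn on a persistent journal.
    func enablePersistentJournal() error {
        if err := os.MkdirAll("/var/log/journal", 0755); err != nil {
            return err
        }
        // Blocks until journald has finished moving the runtime journal over.
        return exec.Command("journalctl", "--flush").Run()
    }

    func main() {
        if err := enablePersistentJournal(); err != nil {
            log.Fatal(err)
        }
    }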
[12:44] first time I see this
[12:44] jdstrand: that's a brilliant idea
[12:44] jdstrand: we can just no-op if apparmor is of
[12:44] *off
[12:44] yeah
[12:44] and we can always put the dependency into the units
[12:44] yeah
[12:44] mvo: ^ that's a solution that's easy
[12:45] I can look at this after the current bug
[12:45] diddledan: thank you for your persistence :)
[12:45] * jdstrand -> steps away
[12:45] zyga, jdstrand works for me
[12:46] pstolowski, hey, I see also this error in nested test for uc20 Apr 09 12:08:53 ubuntu snapd[720]: hotplug.go:131: internal error: cannot get global device context: broken assertion storage, looking for model: broken assertion storage, cannot decode assertion: asser
[12:46] PR #9: Added the travis config file
[12:46] pstolowski, any idea?
[12:48] cachio: no, maybe core20 requires something new to be done in that area, i'll need to investigate
[12:48] pstolowski, thanks
[12:48] #8465 should unbreak master
[12:48] PR #8465: tests: update snap-preseed --reset logic to acommodate for 2.44 change <⚠ Critical>
[12:48] it sounds like some code is running before seeding is done
[12:48] mborzecki: ^
[12:48] I thought that code waits on seeding
[12:49] today is sponsored by tag
[12:49] heh ;)
[12:49] zyga: yeah, it totally is
[12:49] it's the N-days-before-the-release feeling
[12:50] pedronis: it doesn't wait
[12:50] pstolowski: it waits for the system snap to be there at least
[12:50] pedronis: right, that's true
[12:51] mborzecki: for clarity, i pinged you about 8465
[12:51] pstolowski: figured ;)
[12:52] mvo: pedronis: added the compatibility bind mount, i can successfully go through the install mode, but it hangs in initramfs in run mode
[12:52] zyga: so you are suggesting all our snap units have "After=apparmor.service", is that what you said earlier? are you on it? should I?
[12:52] mvo: pedronis: pushed a patch to #8464 anyway
[12:52] PR #8464: cmd/snap-boostrap, boot: use /run/mnt/data instead of ubuntu-data
[12:52] mborzecki: please do, maybe something for dimitri
[12:53] pstolowski: do we need 8465 for 2.44 as well as a cherry pick?
[12:53] mvo: we have no way to rewrite atm though
[12:53] to rewrite units
[12:53] pstolowski: do you know what blocked the test?
[12:54] pstolowski: it would actually hit the kill timeout
[12:54] pedronis: yeah, but at least all new focal installs with zfs will not be affected if we have it now
[12:54] pstolowski: that's really weird because 131 is definitely after we are getting events
[12:56] something is very broken
[12:57] mborzecki: not yet, as i said in the comment and standup notes i'm investigating; maybe qemu-nbd hangs when we leave execute. if i re-arrange the test to first unmount and clean up and then fail on diffing, it fails as expected and doesn't hang
[12:58] mvo: probably yes
=== hggdh is now known as hggdh-msft
[13:28] mvo: I've verified that the spread test fails without the fix and passes with the fix
[13:29] mvo: I'll jump into the apparmor issue in a moment, after the call
[13:30] zyga, I take it from that the fix doesn't work? I'm understanding how CI testing works, right?
[13:31] diddledan: ?
[13:31] :-p
[13:31] diddledan: hopefully not :)
[13:31] fail test == fix works; passing test == shruggy shoulders no idea
[13:32] maybe it works?
[13:32] zyga: \o/ thank you
[13:32] #shipit!
[13:38] mvo: I pushed the test now
[13:44] that was a nice bug
[13:45] jdstrand: I'm looking after apparmor now
[13:54] mvo: have a look at 8462 again please
[13:54] maybe mborzecki as well
[13:54] I'll get rid of my plate, grab coffee and jump into apparmor and zfs
[13:55] zyga: sure, will do
[13:56] pedronis: I merged master into 8424, the only thing missing there are tests for lsblk.go and the reimplementation of lsblk with sysfs + udev, which probably means we need to rename the file or maybe move it to its own package somewhere
[13:56] but all the logic in boot and cmd_initramfs_mounts should be there and that is tested and ready to review
[13:59] ijohnson: so I should focus on cmd_initramfs_mounts ? and a bit on boot , if I understand correctly
[13:59] pedronis: yes
[14:00] ok, having a break and then I will look
[14:00] thanks
[14:02] zyga: btw, I didn't say this in the standup but I definitely prefer our snap services' units to be after something we control (or well defined from systemd) than a 3rd party package's service, we can control the dep on that one in one place at least
[14:02] +1
[14:02] yeah, I strongly agree
[14:02] this is so much cleaner than a vague dependency on apparmor in each of the service files we write
[14:06] mvo: the script ijohnson linked requires greasemonkey which works on pretty much every browser
[14:08] zyga: hah, so when a workflow is successful, there's no way to restart it
[14:08] pedronis: should I cherry pick 8459 (omit many snap-ids) for 2.44.3 too?
[14:08] pstolowski: nice!
[14:10] mborzecki: yeah :)
[14:10] mborzecki: I have some ideas on that though
[14:10] omg, close/reopen didn't trigger anything
[14:10] oh w8, it did
[14:11] really?
[14:11] oh
[14:11] odd
[14:11] there's a way to trigger on more
[14:11] anyway
[14:11] ENOTIME
[14:14] PR snapd#8466 opened: tests: backport partition fixes to 2.44
[14:23] oh, preseed-reset fix failed on 19.10
[14:24] cachio: did you update 20.04 images but not 19.10?
[14:26] pstolowski, I updated all of them
[14:27] pstolowski, https://travis-ci.org/github/snapcore/spread-cron/builds/672701787
[14:30] pstolowski: hm, 8465 failed in 19.10 in preseed-reset it seems, can you please have a look?
[14:30] pstolowski: it's strange as it seems like the only place where it fails
[14:30] mvo: yeah, i just noted this above
[14:30] pstolowski: aha, sorry
[14:30] mvo: i'm confused
[14:30] mvo: did our deb packaging change make it to all ubuntu versions?
[14:31] pstolowski: maybe not, the SRUs are notoriously slow :/
[14:31] pstolowski: https://paste.ubuntu.com/p/SzwVMqT9GX/
[14:33] mvo: yes, plus it needs to be on the image we download. oh well, i need to relax this test check then
[14:33] ok
[14:33] pstolowski: or just limit it to 20.04 for now?
[14:33] mvo: good idea
[14:34] pstolowski, do you need another update?
[14:35] cachio: no, not for now, afaiu we need to snapd deb 2.44 to make it through
[14:35] *to wait
[14:43] mvo, I found that the nested machine is locked because it reaches 100% cpu
[14:43] and it has 1 cpu
[14:44] most of the cases are when snapd is installing or removing
[14:52] re
[14:52] am I online?
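(To make the ordering idea from earlier concrete — a minimal sketch with hypothetical unit names, not necessarily what the eventual PR ships: a small bridge unit is ordered after apparmor.service and no-ops when AppArmor is off, and every generated snap service is ordered after the bridge, so the dependency is controlled in exactly one place:)

    # snapd.apparmor.service (sketch)
    [Unit]
    Description=Load AppArmor profiles managed internally by snapd
    After=apparmor.service

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    # Hypothetical helper; it exits 0 without doing anything when AppArmor
    # is unavailable, so the unit is harmless on non-AppArmor systems.
    ExecStart=/usr/lib/snapd/snapd-apparmor start

    [Install]
    WantedBy=multi-user.target

    # ...and every generated snap.<name>.<app>.service then gains:
    [Unit]
    After=snapd.apparmor.service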
[14:53] zyga: no
[14:53] haha :)
[14:54] I managed to connect from my thinkpad
[14:56] mvo, I'll try some qemu configuration to avoid this
[15:03] zyga: fyi, I answered your question about offline and unknown in https://github.com/snapcore/snapd/pull/8462#discussion_r406269208
[15:03] thank you, looking
[15:03] PR #8462: cmd/snap: don't wait for system key when stopping <⚠ Critical>
[15:03] jdstrand: I'm wrapping up the fix for apparmor now
[15:04] zyga: but, then that got me thinking about: https://github.com/snapcore/snapd/pull/8462#discussion_r406270147
[15:04] jdstrand: hmmm
[15:04] zyga: these are not blockers imo
[15:04] jdstrand: maybe
[15:04] I have to step into a meeting
[15:04] I need to go AFK for 10 minutes now
[15:06] zyga: or maybe return nil (I mentioned that in a followup comment)
[15:07] * jdstrand fully attends meeting
[15:07] guys, i'm taking a day off tomorrow as well, ping me if there's anything urgent
[15:07] cachio: nice, thank you
[15:07] ijohnson: I did a high-level pass on 8424, let me know if you have questions
[15:07] mvo: ok, i pushed a fix. tested manually on 19.10 & 20.04, should work. fingers crossed
[15:10] pstolowski: cool, thank you
[15:11] thanks pedronis looking now
[15:40] I'm home now
[15:40] mvo: testing apparmor fix now
[15:41] it's so annoying we build-depend on gcc-multilib
[15:41] it clashes with the cross-compiler stack I use
[15:42] zyga: thanks, looking forward to the PR
[15:44] PR snapd#8403 closed: sandbox/cgroup: avoid making arrays we don't use
[15:45] mvo: https://github.com/snapcore/snapd/pull/8462 is green, except for preseed-reset on ubuntu 20.04
[15:45] PR #8462: cmd/snap: don't wait for system key when stopping <⚠ Critical>
[15:45] mvo: shall I merge it?
[15:45] it's one patch
[15:49] zyga: yes, sounds good
[15:49] k
[15:49] zyga: I will cherry pick then
[15:50] mvo: release branch CI overflows 50 minutes in travis
[15:50] mvo: actually, you must merge it
[15:50] required status check
[15:50] zyga: just noticed but I think it also hung in the preseed
[15:50] aha
[15:50] I didn't look deeper
[15:50] zyga: so hopefully once the preseed fix landed this is good again :)
[15:50] focusing on apparmor
[15:50] zyga: no worries
[15:50] hmm
[15:50] but
[15:50] ah ok
[15:50] zyga: yeah, looking forward to this fix
[15:55] PR snapd#8462 closed: cmd/snap: don't wait for system key when stopping <⚠ Critical>
[15:55] thanks!
[15:55] zyga: I backported the system-key lxd fix to 2.44 (cc stgraber) - I still plan an upload tonight/in the morning
[15:55] zyga: *thank you*
[15:55] pleasure :)
[16:04] excellent, thanks zyga and mvo
[16:04] stgraber: thank you for providing the perfect laboratory environment :)
[16:15] mborzecki: how did you manage to duplicate the number of tests on 8464 ? o_O
[16:15] > 37 successful and 3 failing checks
[16:18] ijohnson: clearly gh wanted to test that PR thoroughly
[16:18] it wanted to be double extra sure
[16:19] each +1 doubles the tests
[16:19] that would be kinda funny if gh just kept adding to the list, so if you have to restart tests like 4 times it would say "89 successful and 4 failing checks"
[16:21] PR snapd#8467 opened: many: fix loading apparmor profiles on Ubuntu 20.04 with ZFS
[16:25] mvo, jdstrand: ^
[16:26] tested locally on my focal install
[16:26] with lxd and systemd-analyze
[16:27] I'll break for coffee and be back later (mvo: tg to summon me please)
[16:42] 8465 failed on google:ubuntu-20.04-64:tests/main/interfaces-timeserver-control
[16:46] hello, is there something similar for after: on app: ? I know after is only for parts but there is something like that in app:
[16:46] Failed to restart systemd-timesyncd.service: Unit systemd-timesyncd.service is masked.
[16:46] I mean apps:
[16:48] pstolowski: check what masks that service
[16:49] zyga, hello why after: is not accepted by snapcraft in apps: ?
[16:50] zyga there is something like after: from parts: in apps:?
[16:50] I don't understand
[16:50] $ snapcraft
[16:50] Issues while validating snapcraft.yaml: The 'apps/ovs-vswitchd' property does not match the required schema: Additional properties are not allowed ('after' was unexpected)
[16:51] I still don't understand
[16:51] what do you mean after for parts?
[16:51] zyga parts: accept after:
[16:51] but it looks like u can use after: on apps:
[16:52] yes but the meaning is different
[16:52] what do you want to do?
[16:53] service or command orders after this do this
[16:56] alvesadrian: https://snapcraft.io/docs/snap-format documents "after" for apps
[16:56] alvesadrian: which version of snapcraft are you using?
[16:56] 2.44
[16:56] alvesadrian: are your apps meant to be services/daemons ?
[16:57] ijohnson yes
[16:57] alvesadrian: are you building in docker?
[16:57] alvesadrian: snapcraft is at 3.11
[16:57] alvesadrian: that version surely supports this construct
[16:58] alvesadrian: you probably need to add `daemon: simple` or something to make sure your apps are daemons and not CLI/GUI "apps"
[16:58] alvesadrian: snapcraft or snapd?
[16:59] (I mean 2.44)
[16:59] snapcraft
[16:59] alvesadrian: what's `snapcraft version` ?
[16:59] snapcraft version
[16:59] snapcraft, version 2.43.1+18.4
[16:59] alvesadrian: you should install snapcraft via the snap not the debian package
[16:59] bionic
[17:00] alvesadrian: `apt remove snapcraft && snap install snapcraft`
[17:00] is there a way with the new spread tests to re-run a specific job? it seems like twice now I only have 're-run all jobs'?
[17:00] I'm talking about the github interface
[17:01] jdstrand: not at present, ask mvo to override if this is a well-known failure that is fixed elsewhere
[17:01] jdstrand: we discussed github actions vs travis and while there are some shortcomings as compared to travis we decided to keep the experiment alive for now
[17:02] zyga: ok, so long as I'm not missing something. it seems that fedora is failing a lot.
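(On the snapcraft question above: `after:` at the app level only orders daemons relative to each other, so the apps need `daemon:` entries and a snapcraft recent enough to know the keyword. A hypothetical snapcraft.yaml fragment — the ovsdb-server sibling app is invented for illustration:)

    apps:
      ovsdb-server:
        command: bin/ovsdb-server
        daemon: simple
      ovs-vswitchd:
        command: bin/ovs-vswitchd
        daemon: simple
        after: [ovsdb-server]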
[17:03] something about reboot iirc
[17:03] jdstrand: it's more that a specific test is failing
[17:03] I'll keep an eye on it
[17:03] jdstrand: aha
[17:03] thanks, maybe it's something new
[17:03] there are a few fixes in flight that should get rid of most of those
[17:03] jdstrand: on the upside actions lets us really run those tests as soon as possible, much faster than travis offered
[17:03] zyga: it was a specific spread test related to core reboot *I think*, don't jump on it now ;)
[17:04] I won't, my goal is to focus on your feedback for the udev PR
[17:04] but I think tomorrow, today I just want to get the fixes ready
[17:04] zyga: I also commented on the apparmor pr
[17:05] (but only a comment at this point since you were still testing)
[17:05] I noticed, going through that now! :)
[17:08] popey: hey, I just noticed that the instructions for rhel are wrong: https://snapcraft.io/install/icq-im/rhel
[17:08] snapd has been available in EPEL 8 for a while now
[17:08] hey Eighth_Doctor :)
[17:09] good to see you again
[17:09] The CentOS instructions were updated, but apparently not the RHEL ones
[17:09] zyga: hello :D
[17:09] hello Eighth_Doctor
[17:09] how are you surviving?
[17:09] Eighth_Doctor: with three kids at home
[17:09] they're editable on the forum now, which makes life easier
[17:09] popey: hey
[17:09] Eighth_Doctor: and parents-in-law
[17:09] woo
[17:09] oh, or are they, no they are not
[17:09] Eighth_Doctor: and overeager police that fine you for taking your dog out on a bike
[17:09] not *those* ones
[17:09] Eighth_Doctor: splendid :)
[17:10] Eighth_Doctor: we are lucky I have a job
[17:10] zyga: I think I was wrong about suggesting ConditionPathExists: https://github.com/snapcore/snapd/pull/8467#discussion_r406351235
[17:10] PR #8467: many: fix loading apparmor profiles on Ubuntu 20.04 with ZFS
[17:10] I'm lucky I still have a job
[17:10] living alone and not being able to meet people regularly has sucked
[17:10] jdstrand: ack
[17:10] Eighth_Doctor is this correct? https://snapcraft.io/docs/installing-snap-on-red-hat
[17:11] popey: yes
[17:11] oh look https://github.com/canonical-web-and-design/snapcraft.io/issues/2646
[17:11] :D
[17:11] that's why I didn't notice for months :D
[17:11] thanks for noticing now, anyway :D
[17:12] I only noticed today when somebody asked me about adding snapd for EPEL 8, which I distinctly remember doing last year
[17:12] popey: no problem :)
[17:13] jdstrand: maybe we should not change the cache for the point release
[17:13] jdstrand: I'm happy to do this for +1
[17:13] not sure
[17:13] mvo: ^
[17:15] zyga: the loneliness and the fear of getting sick gets to me
[17:16] but I've been doing fine so far for the past month
[17:16] zyga: reading
[17:17] Eighth_Doctor: hey, great to hear that you are fine (but yeah, the loneliness part is depressing :(
[17:18] mvo: and my "community energy" is draining with nothing to refill it these days
[17:18] no SUSECON, no CentOS Dojo, no Red Hat Summit, etc.
[17:18] * mvo nods
[17:19] I'm just begging for Flock and oSC to not get canceled in the fall
[17:19] zyga: it is up to you
[17:20] zyga: your snapd-apparmor already assumes /var/cache/apparmor, so it is fine to just add /var/cache/apparmor to RequiresMountsFor
[17:20] zyga, #8468
[17:20] PR snapd#8468 opened: tests: adding option --no-install-recommends option also when install all the deps
[17:20] PR #8468: tests: adding option --no-install-recommends option also when install all the deps
[17:20] please could you take a quick look?
(ie, today, you aren't worrying about alternate locations and presumably not suffering bugs for it, so there is no need to change in the dot release)
[17:29] zyga: just to double check - the unit in 8467 does nothing if both are available? there will be no races or apparmor compiling the same profile in parallel or something (cc jdstrand)
[17:51] mvo: no because our service depends on the system service
[17:52] mvo: so in practice, if the system service loaded snapd profiles nothing new happens
[17:52] mvo: if it didn't we load the profiles
[17:52] mvo: *but* the new dependency from snap.foo.bar.service to snapd.apparmor.service means the boot race is over
[17:52] mvo: snapd.apparmor.service has After=apparmor.service, so they will be serialized
[17:55] PR snapd#8469 opened: snap: do not use os.Environ() in 2.44 <⚠ Critical>
[17:55] mvo: I'll adjust the PR once I get the cache recommendation from jamie and remove the changelog
[17:56] mvo: ah, thanks for that
[17:56] I wasn't sure
[17:56] zyga: nice
[17:56] mvo: fix your PR please
[17:56] zyga: which one?
[17:56] https://github.com/snapcore/snapd/pull/8469#pullrequestreview-391010494
[17:56] PR #8469: snap: do not use os.Environ() in 2.44 <⚠ Critical>
[17:56] =
[17:56] zyga: fixing now, sorry
[17:57] wow zyga you beat me to that by 10 seconds
[17:57] :D
[17:57] * zyga refrains from joking about stuff now
[17:58] * mvo hugs ijohnson and zyga - awesome team!
[17:58] :-)
[17:59] PR snapcraft#3023 closed: pluginhandler: move attributes to PluginHandler
[18:01] * mvo vanishes a bit while 2.44 builds are churning along
[18:02] mvo: today, apparmor is looking at /var/lib/snapd/apparmor/profiles. if you can do a focal upload of snapd that makes you control it, I can do a focal apparmor upload that undoes it. Also, the snap-apparmor service is After=apparmor so there is no race. running one after the other is 'ok' because the speed at which the parser will load the cache into the kernel is essentially as fast as it can read from
[18:02] the disk, unless it recompiles policy. the 2nd run will never recompile policy since it is After=apparmor
[18:03] we are *burning* through spread jobs today!
[18:03] mvo: we can consider an apparmor SRU to stop looking at /var/lib/snapd/apparmor/profiles at some future point if desired
[18:03] mvo: we would probably wait for a bigger SRU bug and piggyback on it though
[18:06] mvo: it would be good if you could ping me when you are going to do a focal upload so I can do the apparmor one after
[18:06] zyga: indeed, it is pretty responsive, and I even made the cardinal mistake of pushing an open PR branch 3 times in 5 minutes :-P
[18:13] re
[18:13] ok, back to branches
[18:15] jdstrand: I will do a focal upload tonight or tomorrow with the .3 fixes
[18:16] jdstrand, mvo: can we decide on the cache directory
[18:16] shall I change it from /var/cache/apparmor?
[18:19] zyga: I agree to not change it in this PR
[18:19] OK
[18:19] I'll adjust the RequiresMountsFor
[18:19] to add /var/cache/apparmor
[18:19] and nothing else
[18:19] yeah, just add /var/cache/apparmor to that
[18:19] is that ok? (I want to do only one push)
[18:19] OK
[18:20] zyga: actually
[18:20] yes?
[18:20] (I won't push for a few more minutes so I'm open to changes :)
[18:21] zyga: this will hit other releases of Ubuntu eventually. we should verify the cache locations of all of them
[18:21] zyga: otherwise snapd-apparmor will fail on those where the cache isn't in /var/cache/apparmor
[18:22] jdstrand: snapd.apparmor.service is a .deb-only feature so we can correct anything we want at the time we choose to dput to the archive
[18:22] zyga: so, snapd hardcodes /var/cache/apparmor when compiling policy, no?
[18:22] jdstrand: having said that, you are right,
[18:22] jdstrand: can we use the location?
[18:22] * zyga checks
[18:23] yes
[18:23] zyga: I think that is accurate (though, yes, please check :). if so, I think the only question would be if a release doesn't have /var/cache/apparmor, then we create it in the deb packaging rather than have snapd create it
[18:23] we hard-code /var/cache/apparmor
[18:24] ok, good
[18:24] jdstrand: given that we use this location in 14.04 already I _think_ we are safe
[18:24] jdstrand: or are you saying we should mkdir the directory from the service to be safe?
[18:24] ok, then yes, don't change it cause like you said, deb-only item and we can verify the other deb uploads
[18:24] zyga: no, for focal, that is the location that apparmor creates
[18:25] jdstrand: oh, one thing before I forget: "systemd-analyze security"
[18:25] I didn't know this before, perhaps it's new
[18:25] zyga: I can look at the packaging real quick for all the releases
[18:25] thank you
[18:27] mvo, jdstrand: I pushed the updates to https://github.com/snapcore/snapd/pull/8467
[18:27] PR #8467: many: fix loading apparmor profiles on Ubuntu 20.04 with ZFS
[18:27] zyga: that is interesting. I wonder if it considers AppArmorProfile=.... today, it seems to primarily (understandably) care about its own nspawn/etc directives
[18:28] jdstrand: I didn't look at what it checks but it's true that systemd has grown a considerable vocabulary of sandboxing features over the years
[18:28] zyga: ta
[18:28] jdstrand: I would say it rivals snapd in some ways so it's a nice selection
[18:29] Lucy has fever, oh buy
[18:29] boy*
[18:29] :/
[18:29] zyga: ok, the apparmor package creates /var/cache/apparmor all the way back to trusty
[18:29] good
[18:30] so we're good
[18:30] I think it's safe to keep this as-is for 2.44.3
[18:30] yeah. just normal QA
[18:30] we can do more in 2.45 without time pressure
[18:30] * jdstrand nods
[18:30] we are at 25/32 workers so the changes should process quickly
[18:31] but I anticipate a small queue for about an hour while we go through the stuff that will spawn stable and unstable tests
[18:31] mvo: I think we should aim for tomorrow morning
[18:31] mvo: unless you want to dput at ~ 22
[18:31] and I'll just get back to work on the feedback from jdstrand
[18:31] btw, jdstrand are you available tomorrow?
[18:32] zyga: I am working tomorrow, yes
[18:34] ok
[18:34] mvo: shall I stick around or are you OK for today?
[18:41] zyga: all good for today
[18:41] OK, signing off :)
[18:42] * zyga EODs
[18:43] PR snapd#8465 closed: tests: update snap-preseed --reset logic to acommodate for 2.44 change <⚠ Critical>
[18:47] PR snapd#8470 opened: tests: update snap-preseed --reset logic to acommodate for 2.44 change (2.44)
[18:48] PR snapd#8466 closed: tests: backport partition fixes to 2.44
[19:13] mvo: thanks for merging my test fix!
[19:16] mvo: and for the cherry pick! huh, what are all those failures there
[19:17] ah, i see another PR
[19:18] pstolowski: yeah, some fallout from the divergence of 2.44 and master
[19:19] pstolowski: I will wait for the tests and do the release tomorrow, I think it's getting a bit too late
[19:23] PR snapcraft#3024 opened: tests: remove usage of FakeApt fixtures in lifecycle
[19:32] PR snapcraft#3025 opened: tests: move FakeApt fixtures into deb tests
[19:52] PR snapd#8469 closed: snap: do not use os.Environ() in 2.44 <⚠ Critical>
[19:55] PR snapd#8471 opened: many: fix loading apparmor profiles on Ubuntu 20.04 with ZFS (2.44)
[20:21] PR snapd#8470 closed: tests: update snap-preseed --reset logic to acommodate for 2.44 change (2.44)
[20:21] PR snapd#8472 opened: tests: disable some problematic tests for 2.44
[20:24] PR snapd#8467 closed: many: fix loading apparmor profiles on Ubuntu 20.04 with ZFS
[21:05] PR snapcraft#3024 closed: tests: remove usage of FakeApt fixtures in lifecycle