[05:05] <mborzecki> morning
[07:06] <pstolowski> morning
[07:11] <mborzecki> pstolowski: hey
[07:13] <mvo> good morning pstolowski and mborzecki 
[07:13] <mvo> mborzecki: is 10803 ready or still needs security review?
[07:14] <mborzecki> mvo: does amurray's review count as +1? :)
[07:15] <mborzecki> i'm trying to track down all the method names, but some are generated
[07:16] <mup> PR snapd#10791 closed: gadget/gadget.go: LaidOutSystemVolumeFromGadget -> LaidOutVolumesFromGadget <Simple 😃> <Created by anonymouse64> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10791>
[07:17] <mvo> mborzecki: heh, I don't know if amurray gave a +1 in 10803 at least he did not remove the label so I assume not. if the method names are auto-generated that is annoying :/ maybe the "only know names" breaks down there?
[07:18] <mborzecki> mvo: i'll try to track down which methods are used, and if that proves to be futile effort we'll stick with allowing all of them
[07:21] <mborzecki> school run, brb
[07:21] <mardy> 'morning all
[07:24] <mardy> some more reviews on https://github.com/snapcore/snapd/pull/10739 and /10739/checks?check_run_id=3652775103 would be appreciated :-)
[07:24] <mup> PR #10739: mount-control: step 2 <Needs Samuele review> <Needs security review> <Created by mardy> <https://github.com/snapcore/snapd/pull/10739>
[07:31] <mardy> BTW, when a PR is tagged "Needs security review", does it mean that it's in the backlog of the security team, or do I still need to ping them?
[07:34] <amurray> mardy: if you want it to get looked at with some urgency, it doesn't hurt to ping me/us ;)
[07:36] <zyga-mbp> haha, the brutal honesty :)
[07:36] <amurray> hehe - hey zyga-mbp :)
[07:36] <zyga-mbp> I think pinging is useful to set priority
[07:36] <zyga-mbp> but everyone is busy as is
[07:36] <zyga-mbp> hey there alex :)
[07:37] <amurray> yep, the "needs security review" queue is pretty long, so it definitely helps to communicate priority/urgency by letting me know directly if you need something faster than "whenever I happen to get to it"
[07:39] <amurray> plus I am not across all the snapd release dates/milestones/targeted features etc - I'd like to have more time to devote to snapd security stuff but like zyga said, everyone is busy
[07:39] <mardy> amurray: thanks, it's not super urgent, but that mount-control PR (link 10 lines above) needs a pair of security-trained eyes :-)
[07:39] <zyga-mbp> or two security pirrrrrates, each with one eye
[07:39] <mardy> zyga-mbp: that would be even better
[07:40] <amurray> arrrr now yerr talkin matey
[07:40] <zyga-mbp> haha
[07:41] <mardy> mborzecki: have you seen a similar failure before? https://paste.ubuntu.com/p/JWJprHmsDD/ Do you think it's again a matter of adding a `sleep 1` before the last check? (https://github.com/snapcore/snapd/blob/master/tests/main/security-udev-input-subsystem/task.yaml#L82)
[07:41] <mup> PR snapd#10815 opened: fde: add HasDeviceUnlock() helper <Simple 😃> <Skip spread> <Created by mvo5> <https://github.com/snapcore/snapd/pull/10815>
[07:41] <mborzecki> re
[07:45] <mardy> mborzecki: in that test, you can see that there's the AppArmor denial in the logs.
[07:45] <mborzecki> mardy: yeah, EACCES is from apparmor lsm, iirc EPERM would be generated by device cgroup
[07:46] <mborzecki> although it was checking for EPERM and the test was successful so far
[07:53] <mardy> mborzecki: can I change the test to MATCH for both errors, or does that defeat the point of the test?
[07:53] <mardy> or do you think that a sleep 1 can help there, since cgroups are involved?
[07:56] <mborzecki> mardy: is this the only test that failed this way?
[07:59] <mborzecki> i suspect this may be about lsm evaluation order, although looking at the kernel devcgroup is checked first always, so the device must have been still allowed, maybe the tag did not get removed?
[07:59] <mborzecki> mardy: can you reproduce it?
[08:03] <mardy> mborzecki: in this test run, it was the only failed test: https://github.com/snapcore/snapd/pull/10739/checks?check_run_id=3660036956
[08:03] <mup> PR #10739: mount-control: step 2 <Needs Samuele review> <Needs security review> <Created by mardy> <https://github.com/snapcore/snapd/pull/10739>
[08:03] <mardy> mborzecki: I'll try to reproduce it
[08:08] <pstolowski> mardy: thanks for the review of #10814, i'm going to iterate over it a bit though, therefore it's a draft and probably approving it is premature at this point (and it's missing tests)
[08:08] <mup> Bug #10814: Parted made my partition table to overlap. <parted (Ubuntu):Fix Released by cjwatson> <https://launchpad.net/bugs/10814>
[08:08] <mup> PR #10814: [RFC] o/ifacestate: don't loose connections if snaps are broken <Created by stolowski> <https://github.com/snapcore/snapd/pull/10814>
[08:20] <mardy> pstolowski: it was for encouragement :-)
[08:20] <pstolowski> mardy: LOL
[08:20] <pstolowski> mardy: appreciate it :)
[08:21] <mup> PR snapd#10705 closed: tests: add minimal smoke test for microstack <Created by mvo5> <Closed by mvo5> <https://github.com/snapcore/snapd/pull/10705>
[08:23] <mborzecki> mvo: hm exporting keys failed on LP: https://paste.ubuntu.com/p/YB5nqD3dnb/
[08:26] <mvo> mborzecki: "can't connect to the agent: IPC connect call failed" - sounds like something funny is going on there
[08:26] <mborzecki> yeah
[08:27] <mvo> mborzecki: I wonder if we can run this outside of spread as have little control over the GH containers
[08:27] <mvo> (or maybe we do and just don't exercise it enough?)
[08:27] <mardy> mborzecki: nope, I'm running a test in a loop, it doesn't seem to fail
[08:28] <mborzecki> mvo: this failed on LP builder, so that should be an sbuild-like environment?
[08:28] <mardy> wait, I spoke too early
[08:32] <mardy> mborzecki: yes, I can reproduce it; happens maybe once out of 20 tries
[08:33] <mborzecki> mardy: do you have a debug shell?
[08:33] <mardy> let me add a sleep, and see...
[08:33] <mardy> yes
[08:33] <mardy> it's strange that there's no apparmor denial in the logs...
[08:34] <mvo> mborzecki: oh, interesting, this is inside lp :/
[08:34] <mvo> mborzecki: that we control even less, maybe we need to mock more (but in meetings a lot today :/
[08:35] <mborzecki> mardy: can you cat /sys/fs/cgroup/devices/snap.test-snapd-udev-input-subsystem.plug-with-time-control/devices.list
[08:35] <mardy> a "sleep 1" after connecting the time-control interface does not help
[08:36] <mardy> mborzecki: https://paste.ubuntu.com/p/pbBYG4ZTBC/
[08:38] <mborzecki> mardy: hm so evdev's major number is 13, i don't see anything with that major in the list
[08:38] <mborzecki> mardy: if you run `test-snapd-udev-input-subsystem.plug-with-time-control` does it fail with permission denied?
[08:38] <mborzecki> is there a new apparmor denial logged at that time?
[08:41] <mardy> mborzecki: I need to reproduce it again :-). But I added a cat of that sys/fs file you pasted above, and while running the test its contents are just "c 249:0 rwm"
[08:41] <mardy> even when it passes
[08:43] <mardy> and even when it fails, it's always just "c 249:0 rwm"
[08:43] <mardy> now let me check apparmor...
[08:43] <mardy> no denials
[08:45] <mardy> nevermind, there are denials -- for some reason dmesg doesn't show them, but journalctl does :-o
[08:46] <mardy> Sep 21 08:42:39 sep210805-077008 audit[12507]: AVC apparmor="DENIED" operation="open" profile="snap.test-snapd-udev-input-subsystem.plug-with-time-control" name="/dev/input/event2" pid=12507 comm="read-evdev-kbd" requested_mask="wrc" denied_mask="wrc" fsuid=0 ouid=0
[08:52] <mborzecki> mardy: maybe they are from before you started meddling in the debug shell?
[08:53] <mborzecki> mardy: anyways, with the dump you provided, the device is not listed in device cgroup, so EPERM is expected
[08:58] <mardy> mborzecki: but it's never there, not even when the test succeeds
[08:59] <mborzecki> mardy: hm not sure what the test is checking, judging by the comment about rtc, i suspect there's supposed to be an event device related to rtc? but i don't see anything like that on my host
[09:00] <mardy> mborzecki:  this is what I'm running in a loop: https://paste.ubuntu.com/p/2D8DbRCTwR/
[09:01] <mardy> mborzecki:  and the devices.list file has always "c 249:0 rwm", both when the test passes, and when it fails
[09:02] <mardy> though maybe I should look at it from inside the executed snap script, if cgroups are setup by snap-confine?
[09:02] <mborzecki> mardy: if it were to be blocked by apparmor, i would expect something like `c 13:.. rwm`
[09:03] <mardy> mborzecki: nope, the denial is the one I pasted above, and it's always a fresh one
[09:03] <mardy> it's the same denial you can see in the CI logs
[09:04] <mborzecki> quick errand, brb
[09:04] <mardy> ok, meanwhile I'll try modifying the snap script, to print the cgroups from there
[09:15] <mvo> mborzecki: nice, looks like 10803 is ready now
[09:33] <mborzecki> re
[09:34] <mborzecki> mvo: yeah, let's see how the tests fare and then we can land it :)
[09:51] <mardy> mborzecki: found something: when the test fails, /sys/fs/cgroup/devices/snap.test-snapd-udev-input-subsystem.plug-with-time-control/cgroup.procs is empty
[09:52] <mardy> I'm checking it from within the script run in the snap, and when the test passes I see that it contains two pids; one of them is the same pid as the shell script
[09:52] <mardy> mborzecki: could this be a real bug in snap-confine?
[09:54] <mborzecki> mardy: that file will list only live processes that exist in that cgroup, so i expect it to be empty after the program terminates
[09:56] <mborzecki> mvo: can you land https://github.com/snapcore/snapd/pull/10740 ?
[09:56] <mup> PR #10740: osutil: helper for injecting run time faults in snapd  <Run nested> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10740>
[09:58] <mardy> mborzecki: but I'm printing it from within the snap script: it does contain the PIDs, when the test passes
[09:59] <mardy> mborzecki: here's the current contents of the snap script: https://paste.ubuntu.com/p/h5YHKj9BVC/
[10:00] <mborzecki> mardy: ok, when you print it from the script that snap runs, and it's failing that file is empty right?
[10:00] <mardy> mborzecki: yes
[10:01] <mborzecki> mardy: ok, so there are likely no devices tagged for this snap then
[10:02] <mborzecki> mardy: can you add SNAP_CONFINE_DEBUG=1 when running the app? when it fails i would expect there will be no device cgroup setup in the logs
[10:02] <mardy> mborzecki: yes, that explains why the test then fails, but what is not clear to me is why there are no PIDs in the group
[10:03] <mborzecki> mardy: so the device cgroup is setup iff there are devices tagged for the snap application
[10:03] <mardy> mborzecki: where should I find the logs?
[10:05] <mardy> brb, need to go to lunch
[10:05] <mborzecki> mardy: the first thing to check would be debug log of s-c (that SNAP_CONFINE_DEBUG=1 thing when you run the app), if that confirms there are no devices, the next thing to check is whether there's /etc/udev/rules.d/70-snap.test-snapd-udev-input-subsystem.plug-with-time-control, its contents, and then find out which device the app was using and see `udevadm info <dev-path>`
[10:22] <mardy> mborzecki: I did a "export SNAP_CONFINE_DEBUG=1" in the terminal from where I run the snap, but it didn't seem to help in making the snap more verbose.
[10:28] <mborzecki> mardy: hm that's unexpected, where are you setting it exactly?
[10:29] <mardy> mborzecki: in the spread terminal, and then I run the snap
[10:31] <mup> PR snapd#10816 opened: fde: add new DeviceUnlock() call <Run nested> <Created by mvo5> <https://github.com/snapcore/snapd/pull/10816>
[10:41] <mup> PR snapd#10817 opened: libsnap-confine: use the pid parameter <Simple 😃> <Created by mardy> <https://github.com/snapcore/snapd/pull/10817>
[11:05] <mardy> mborzecki: ouch, the output was being redirected by the test script, nevermind :-D
[11:06] <mborzecki> haha
[11:13] <pstolowski> mborzecki: hey, can you take a look at https://github.com/snapcore/snapd/pull/10795 ?
[11:13] <mup> PR #10795: o/assertstate: check installed snaps when refreshing validation set assertions <validation-sets :white_check_mark:> <Created by stolowski> <https://github.com/snapcore/snapd/pull/10795>
[11:22] <mardy> can I connect to a spread machine via SSH?
[11:29] <mvo> mardy: cat .spread-* should give you details
[11:31] <mardy> mvo: thanks!
[12:20] <mardy> mborzecki: apart from the PID numbers, the snap-confine logs are exactly the same when the test fails and when it succeeds
[12:20] <zyga-mbp> heh
[12:20] <zyga-mbp> I can play the garden gnome
[12:20] <zyga-mbp> what's the problem?
[12:21] <zyga-mbp> maybe if you explain it to me I can ask a silly question that helps?
[12:23] <mborzecki> mardy: can you post the logs?
[12:26] <mardy> zyga-mbp: this test failure: https://paste.ubuntu.com/p/JWJprHmsDD/ :-)
[12:27] <zyga-mbp> looking
[12:27] <mardy> mborzecki: here it is: https://paste.ubuntu.com/p/bdtFCHQbhx/
[12:28] <zyga-mbp> what is /snap/test-snapd-udev-input-subsystem/x1/bin/read-evdev-kbd at line 22?
[12:28] <mardy> mborzecki: is it normal that the PID that triggers the apparmor denial is the same one (4813) as snap-confine (you'll see the same pid in the logs I just pasted)?
[12:28] <zyga-mbp> what does it do? open?
[12:29] <mardy> zyga-mbp: something like that, I guess :-) https://github.com/snapcore/snapd/blob/master/tests/main/security-udev-input-subsystem/test-snapd-udev-input-subsystem/bin/read-evdev-kbd#L22
[12:29] <zyga-mbp> does it fail if that's the first and only test that is executed?
[12:30] <zyga-mbp> mborzecki are we synchronizing with system's setup of the cgroup for the scope?
[12:30] <mardy> zyga-mbp: well, I've got a debug session in the spread, and if I repeat this script in a loop (https://paste.ubuntu.com/p/2D8DbRCTwR/) it fails maybe once out of 20 times
[12:31] <zyga-mbp> ha
[12:31] <zyga-mbp> is the app tracking feature on?
[12:31] <zyga-mbp> can you run forkstat in the background
[12:31] <zyga-mbp> and run it in a loop until it fails
[12:31] <zyga-mbp> and then collect the forkstat please
[12:32] <mup> PR snapd#10818 opened: tests: test for enforcing with prerequisites <⛔ Blocked> <validation-sets :white_check_mark:> <Created by stolowski> <https://github.com/snapcore/snapd/pull/10818>
[12:33] <zyga-mbp> please remind me, permission denied is EPERM, right?
[12:33] <mborzecki> mardy: that's not the failure case right? I see that device cgroup is set up
[12:35] <mborzecki> zyga-mbp: what do you mean about synchronizing with systemd setup?
[12:36] <zyga-mbp> mborzecki we spawn the scope via a systemd call in snap-run
[12:36] <zyga-mbp> last time I looked we didn't wait for that to finish fully
[12:36] <zyga-mbp> just to register 
[12:36] <zyga-mbp> maybe systemd is doing cgroup-y things that we race with?
[12:36] <zyga-mbp> I could be totally wrong, I'm not looking at how that code evolved since
[12:37] <mborzecki> zyga-mbp: nope, it's EACCES, EPERM is operation not permitted
[12:37] <zyga-mbp> ah
[12:37] <zyga-mbp> hmm
[12:37] <cmatsuoka> hey zyga-mbp 
[12:37] <mardy> mborzecki: it is the failure case, unfortunately :-(
[12:37] <mardy> mborzecki: as I wrote, the logs are the same, only difference is the pid numbers
[12:38] <mardy> zyga-mbp: that forkstat is a nice tool! Thanks! Here's the output, when the test failed: https://paste.ubuntu.com/p/CZn2Nrryn3/
[12:38] <mborzecki> zyga-mbp: we wait now
[12:38] <zyga-mbp> wait for the job to finish?
[12:39] <mborzecki> but this is v1, so we're actually using pids for tracking
[12:39] <mborzecki> mardy: what's the content of devices.list then?
[12:39] <zyga-mbp> ahh
[12:39] <mardy> the pid that triggered the apparmor failure is 12399
[12:41] <mardy> mborzecki: the same as before, no 66 or 13 in there
[12:41] <mborzecki> mardy: for the record, can you ls -l /dev/rtc* and ls -l /dev/input/event*
[12:45] <mardy> mborzecki: all looks fine there: https://paste.ubuntu.com/p/h78t2y6ypW/
[12:48] <mborzecki> mardy: ok, to sum up, 249:0 was in the list which is expected, 13:* were not, which is also expected, but we still got EPERM which in theory should not happen, because we expect that apparmor (lsm) would block access and errno would be EACCES
[12:49] <mardy> mborzecki: yep. How do I build snap-confine?
[12:49] <zyga-mbp> cd cmd
[12:49] <zyga-mbp> ./autogen.sh && make
[12:49] <zyga-mbp> maybe?
[12:49] <mardy> ah, no magic? :-)
[12:50] <mborzecki> and then `make hack`, if you're on ubuntu remember to set SNAPD_REEXEC=0 when running a snap so that your local copy is picked up
[12:57] <mardy> mmm... it's still not being picked up (though I can see that's correctly installed as /usr/lib/snapd/snap-confine)
[13:09] <mardy> ah, it's picking up the one from /snap/core/...
[13:22] <zyga-mbp> you have reexec 
[13:24] <mborzecki> mardy: set SNAPD_REEXEC=0 in your environment
[13:52] <zyga-mbp> mborzecki that breaks unit tests
[13:52] <zyga-mbp> I had that
[13:55] <mardy> mborzecki, zyga-mbp: so, I stopped snapd.{service,socket}, added SNAPD_REEXEC=0 in my env and restarted snapd (from the terminal); still, it's using the snap-confine from the /snap/core/... path
[13:56] <zyga-mbp> snapd is local though, right?
[13:56] <zyga-mbp> and snap-confine is not?
[13:56] <zyga-mbp> snap-confine is actually invoked by snap-run
[13:56] <zyga-mbp> so ... which version of snap run are you getting?
[13:56] <zyga-mbp> you probably need to put snap symlink next to snapd
[13:56] <zyga-mbp> IIRC there was some funny thing where it found where snap-run is running from
[13:56] <zyga-mbp> to find snap-confine
[13:57] <mardy> zyga-mbp: I only rebuilt snap-confine, so both snap and snapd are the unmodified ones
[13:57] <mardy> but OTOH, I didn't touch them...
[13:57] <zyga-mbp> where is your snap-confine?
[13:58] <zyga-mbp> my memory of this is rusty
[13:58] <mardy> in /usr/lib/snapd/snap-confine
[13:58] <zyga-mbp> but if you set the env var maciej mentioned
[13:58] <zyga-mbp> and copy snap-confine over
[13:58] <zyga-mbp> it should be used
[13:58] <zyga-mbp> there are exceptions
[13:58] <zyga-mbp> but very convoluted
[13:58] <zyga-mbp> can you forkstat and see if snap-run calls the wrong one?
[13:58] <zyga-mbp> if so, set SNAP_DEBUG=1 to see why
[13:59] <mardy> yes it calls the wrong one
[14:00] <zyga-mbp> the logic is in cmd/*/.go
[14:00] <mardy> SNAP_DEBUG does not seem to make a difference...
[14:00] <zyga-mbp> I forgot what the sub-package name was
[14:00] <zyga-mbp> (at least last Nov, it could have moved since)
[14:01] <mardy> I'm now building a local /usr/bin/snap, let's see if that helps
[14:01] <zyga-mbp> the stock one should have debug enabled to tell you what is wrong
[14:03] <mardy> nope it doesn't, and it still executes /snap/core/11893/usr/lib/snapd/snap-confine :-(
[14:03] <zyga-mbp> did you enable debug logs?
[14:03] <zyga-mbp> it should tell you why
[14:04] <zyga-mbp> what's the command line you are trying?
[14:04] <mardy> could it be SNAPD_DEBUG=1?
[14:05] <mardy> that prints something before running snap-confine, but does not seem super useful: https://paste.ubuntu.com/p/DmNv7rns5R/
[14:05] <zyga-mbp> yes
[14:06] <zyga-mbp> 2021/09/21 14:04:25.830781 tool_linux.go:204: DEBUG: restarting into "/snap/core/current/usr/bin/snap"
[14:06] <zyga-mbp> ok, look at the condition there
[14:06] <zyga-mbp> that's your next clue
[14:10] <mborzecki> mardy: sorry, that's SNAP_REEXEC=0
[14:11] <mborzecki> i always forget as I don't really have to deal with reexec
[14:12] <mup> PR snapd#10817 closed: libsnap-confine: use the pid parameter <Simple 😃> <Created by mardy> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10817>
[14:12] <mardy> mborzecki: aaaah, just found out the same looking at the sources :-)
[14:15] <mardy> mborzecki: ok, so it ran my snap-confine, but I'm not sure where the output from system() went; I thought it'd be merged with my stdout, but my C is a bit rusty
[14:15] <zyga-mbp> hahaha
[14:15] <zyga-mbp> oh my
[14:15] <zyga-mbp> nice
[14:19] <mborzecki> maybe the files were empty?
[14:19] <mborzecki> mardy: try adding echo around that
[14:20] <mborzecki> mardy: can you paste the diff also?
[14:22] <mardy> mborzecki: ah: Sep 21 14:21:40 sep211335-363145 audit[15140]: AVC apparmor="DENIED" operation="exec" profile="/usr/lib/snapd/snap-confine" name="/bin/dash" pid=15140 comm="snap-confine" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
[14:25] <mborzecki> meh
[14:26] <mborzecki> you can tweak snap-confine's profile and add `/bin/dash ixr,`
[14:26] <mardy> added a line for it, now it works
[14:26] <mardy> mborzecki: I made it Ux :-)
[14:29] <zyga-mbp> Ux provides the best UX ;0
[14:29] <zyga-mbp> apparmor joke
[14:29] <zyga-mbp> amurray might approve if it wasn't midnight for him
[14:30] <mardy> :-)
[14:31] <mardy> mborzecki: so, this confirms it: the devices.list is always the same, regardless of failure or success, but the procs file is empty when the test fails
[14:31] <mardy> whereas normally it has three PIDs in it
[14:31] <zyga-mbp> procs?
[14:31] <zyga-mbp> oh
[14:31] <zyga-mbp> well, it's racy
[14:31] <zyga-mbp> or 
[14:31] <zyga-mbp> are we reading it wrong?
[14:31] <mardy> zyga-mbp: I mean /sys/fs/cgroup/devices/snap.test-snapd-udev-input-subsystem.plug-with-time-control/cgroup.procs
[14:31] <zyga-mbp> that file is supposed to be read with a huge buffer
[14:31] <zyga-mbp> all in one go
[14:31] <zyga-mbp> right, thanks for confirming that
[14:32] <mardy> zyga-mbp: I'm reading it from within snap-confine, with a system("cat /sys/fs...")
[14:32] <zyga-mbp> (same thing applies to mountinfo btw, I think I never changed that
[14:32] <zyga-mbp> ok
[14:32] <zyga-mbp> can you strace cat it?
[14:32] <zyga-mbp> that file is naturally racy 
[14:32] <zyga-mbp> oh 
[14:32] <zyga-mbp> wait
[14:32] <zyga-mbp> this is devices!!!
[14:32]  * zyga-mbp is so dumb
[14:32] <zyga-mbp> mborzecki do you see it
[14:32] <zyga-mbp> mardy do you see it? :)
[14:32] <mardy> I don't :-)
[14:33] <zyga-mbp> what's the path?
[14:33] <zyga-mbp> it's the snap-specific hierarchy
[14:33] <zyga-mbp> find this process in a systemd managed one
[14:33] <mardy> the path is /sys/fs/cgroup/devices/snap.test-snapd-udev-input-subsystem.plug-with-time-control/cgroup.procs
[14:33] <zyga-mbp> we (incorrectly) steal a process from systemd
[14:33] <zyga-mbp> can you look at /proc/self/cgroups
[14:33]  * zyga-mbp double-checks the name
[14:34] <mardy> zyga-mbp: from within snap-confine, or is the terminal ok?
[14:34] <zyga-mbp> sorry, singular
[14:34] <zyga-mbp> mardy from within the process executed by snap-confine
[14:34] <zyga-mbp> we always did this part wrong
[14:34] <zyga-mbp> but most of the time systemd didn't care
[14:34] <zyga-mbp> we stole the process
[14:35] <zyga-mbp> we placed it in a device hierarchy we created
[14:35] <zyga-mbp> (that's the devices/snap.* namespace)
[14:35] <zyga-mbp> but it was already in a systemd provided cgroup perhaps
[14:35] <zyga-mbp> from within snap-confine, cat /proc/self/cgroup
[14:35] <zyga-mbp> I'm sure it will be interesting
[14:35] <zyga-mbp> and not what you expect
[14:35] <zyga-mbp> mborzecki do you see what I'm getting at?
[14:36] <mborzecki> zyga-mbp: hm not sure whether systemd cares unless you actually limit devices to begin with
[14:37] <mardy> zyga-mbp: normally it says "7:devices:/snap.test-snapd-udev-input-subsystem.plug-with-time-control", now let's wait for it to fail...
[14:37] <zyga-mbp> mborzecki no that's not the point
[14:37] <zyga-mbp> we got stolen back
[14:37] <zyga-mbp> if there are no processes
[14:37] <zyga-mbp> where are they?
[14:37] <zyga-mbp> all processes inhabit all cgroups
[14:37] <zyga-mbp> only the hierarchy changes
[14:37] <mardy> zyga-mbp: YES!!!
[14:37] <mardy> it's 7:devices:/user.slice/user-0.slice/user@0.service on failure
[14:38] <mardy> zyga-mbp: so, the solution is to run a kill(1, 9) before setting up the cgroups? ;-)
[14:39] <mborzecki> heh
[14:39] <mardy> zyga-mbp: do you mean that sometimes we are not writing the PIDS into cgroup.procs quickly enough, and systemd takes it over?
[14:40] <zyga-mbp> well
[14:40] <zyga-mbp> I mean we never cared up until this point
[14:40] <zyga-mbp> because systemd was not doing anything 
[14:40] <zyga-mbp> but well
[14:40] <zyga-mbp> apparently snap-confine needs to wait somewhere
[14:40] <zyga-mbp> do I get a beer? :)
[14:41] <mardy> zyga-mbp: this is more like grappa
[14:42] <mborzecki> mardy: what system is this?
[14:42] <mardy> mborzecki: xenial
[14:43] <mardy> I guess I need to do some reading of how systemd treats cgroups, to fully understand the problem
[14:43] <zyga-mbp> mardy it changes over time
[14:43] <zyga-mbp> (as in over systemd versions)
[14:44] <mardy> zyga-mbp: but are you sure that the problem is that we don't wait enough? my gut feeling is that we have too big of a time window between the time when we create the cgroup and when we assign the PIDs to it. But as I said, I might be completely off
[14:44] <mborzecki> hmmm why on earth the user scope is in device cgroup?
[14:44] <zyga-mbp> mardy it's a command running in a user session
[14:44] <zyga-mbp> so it gets shoved there by systemd
[14:44] <zyga-mbp> maybe it's some other bug as well, I'm surprised this is on xenial
[14:45] <zyga-mbp> which is ancient by now
[14:45] <zyga-mbp> are we backporting systemd?
[14:45] <mborzecki> yeah, as i wrote, systemd does not move it unless there's a deviceallow/deny set for the unit
[14:45] <mardy> I just happened to see this happening in xenial, maybe it happens on newer systems too -- I haven't tested
[14:45] <mborzecki> unless, something else went bad
[14:45] <zyga-mbp> mborzecki not quite
[14:45] <zyga-mbp> the scope is different
[14:45] <zyga-mbp> mardy does this fail with the app tracking feature off?
[14:46] <zyga-mbp> systemd creates a full scope and moves the process fully into it
[14:46] <zyga-mbp> perhaps that's what is going on
[14:46] <mardy> zyga-mbp: how do I disable that?
[14:46] <zyga-mbp> show me your snap set system experimental
[14:46] <zyga-mbp> I forgot the feature flag name
[14:46] <zyga-mbp> refresh-app-awareness?
[14:47] <zyga-mbp> (I meant snap get, not set)
[14:47] <mardy> error: snap "core" has no "experimental" configuration option
[14:48] <zyga-mbp> snap set system experimental.refresh-app-awareness=false
[14:48] <zyga-mbp> I hope I remember the syntax right
[14:48] <mardy> yep, it's correct
[14:49] <mardy> let me try the test again
[14:52] <mup> PR snapd#10769 closed: tests: update test nested tool part 2 <Run nested> <Created by sergiocazzolato> <Merged by sergiocazzolato> <https://github.com/snapcore/snapd/pull/10769>
[14:54] <mardy> zyga-mbp: yes, it fails too
[14:58]  * mvo switches network
[15:03] <mborzecki> mvo: can you land https://github.com/snapcore/snapd/pull/10740 ?
[15:03] <mup> PR #10740: osutil: helper for injecting run time faults in snapd  <Run nested> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10740>
[15:04] <mborzecki> mvo: and this one too: https://github.com/snapcore/snapd/pull/10803
[15:04] <mup> PR #10803: tests, interfaces/builtin: introduce 21.10 cgroupv2 variant, tweak tests for cgroupv2, update builtin interfaces <cgroupv2> <cgroupv2-impish> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10803>
[15:05] <mardy> need to leave now, I'll continue tomorrow morning :-)
[15:06] <mardy> mborzecki, zyga-mbp: thank you guys, without your help it would have taken me at least twice the time!
[15:08] <mborzecki> debugging is the fastest way to learn things :P
[15:09] <dbungert> about proposed migration.  snapd, squashfs-tools, and some kernels.  There is a fix in flight - https://github.com/snapcore/snapd/pull/10757 - do we anticipate that this will be available for autopkgtest sometime soon or should we do something else for squashfs-tools and the kernels?
[15:09] <mup> PR #10757: build-aux: stage libgcc1 library into snapd snap <⚠ Critical> <cherry-picked> <Created by mwhudson> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10757>
[15:41] <ijohnson[m]> dbungert: what is your question? Sorry I don't understand what you are asking about
[15:43] <dbungert> ijohnson[m]: I'm trying to see if I can help squashfs-tools and a few kernels migrate.  They are currently not migrating due to snapd autopkgtest failures which are resolved by adding libgcc to snapd.
[15:44] <ijohnson[m]> dbungert: where do the autopkgtest tests get snapd from ? that fix you mentioned is for the snapd snap, do autopkgtests use the snapd snap ?
[15:44] <dbungert> ijohnson[m]: they do, actually.  check out the end of this log  https://autopkgtest.ubuntu.com/results/autopkgtest-impish/impish/arm64/s/snapd/20210914_092631_abbf2@/log.gz
[15:45] <dbungert> I built for myself a custom snapd deb that forced that test to grab the snapd snap from the edge channel, and then that test passes.
[15:46] <ijohnson[m]> dbungert: ok, so if they are using the snapd snap, then the fix you mentioned was cherry-picked to the release/2.52 branch of snapd, but unfortunately after we already started releasing 2.52 which is now in beta. So that fix will not make it into the stable channel of the snapd snap until we do a 2.52.1 release, which I don't think we have planned yet since we haven't even finished getting 2.52 out, but we could push it through if it is
[15:46] <ijohnson[m]> important, but it would take at least 2 weeks after we do the 2.52.1 release before it is on stable
[15:47] <ijohnson[m]> dbungert: see https://forum.snapcraft.io/t/snapd-release-process/26628 for more details
[15:49] <ijohnson[m]> mvo: did you notice that the snapd snap went up by 10MB from 2.52 -> everything after 2.52 ? 
[15:49] <ijohnson[m]>   latest/beta:      2.52                 2021-09-04 (13270) 33MB -
[15:49] <ijohnson[m]>   latest/edge:      2.52+git881.gf3cd286 2021-09-21 (13460) 43MB -
[15:49] <ijohnson[m]> I wonder when that changed
[15:51] <ogra> added stage packages ? 
[15:51] <ogra> (and their deps)
[15:51] <ijohnson[m]> 10 MB of stage packages ?
[15:52] <ogra> well 🙂
[15:53] <dbungert> ijohnson[m]: to judge if libgcc should be pushed thru: are there cases where snapd would expect to be able to create a squashfs, or is that just a test scenario?  If so, today it's reliant on getting libgcc from the host system, which happens to have worked well until recently.
[15:53] <ijohnson[m]> huh it seems like the discrepancy is just between the build that runs for edge/master and the builds that run for release/<version>, the builds on edge/master are 10MB larger since at least a few months ago
[15:55] <ijohnson[m]> dbungert: well the main use case for creating squashfs' from snapd are really just for the `snap pack` command, otherwise snapd doesn't create squashfs's, it just hands squashfs's to systemd to be mounted
[15:55] <ijohnson[m]> dbungert: so this only affects impish or does it affect other releases as well ? I haven't seen or heard of anyone complaining about issues with `snap pack` except from you and mwhudson 
[15:56] <ijohnson[m]> and actually I don't even really know if the failures y'all are seeing are about `snap pack` or something else
[15:56] <dbungert> tip of the iceberg I think.  I expect that any distribution that picks up new glibc would see the same problem.
[15:56] <ijohnson[m]> hmm, I would imagine arch linux would pick it up too then, and we haven't had any such reports
[15:56] <dbungert> perhaps there is part of this I don't quite understand then
[15:57] <ijohnson[m]> yeah but that autopkg test failure you linked above is indeed about `snap pack`
[16:15] <dbungert> thanks for the info.  On the topic of 2.52, the roadmap linked from above just says TBD, are there even coarse projections on availability of 2.52?  cc ijohnson[m] 
[16:20] <ijohnson[m]> dbungert: for 2.52 itself that is expected within the next week, but we don't have any estimate at this time for 2.52.1, the libgcc_s issue you have mentioned would be the first such critical thing that we would consider doing a 2.52.1 for
[16:21] <dbungert> ijohnson[m]: cool.  I'm going to look at hinting things to allow squashfs-tools and kernels thru in the meantime.
[16:22] <mup> PR snapd#10819 opened: interfaces/builtin/opengl.go: add libOpenGL.so* too <Simple 😃> <Needs security review> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/10819>
[16:58] <mup> PR snapd#10820 opened: devicestate: use EncryptionType <Run nested> <Created by mvo5> <https://github.com/snapcore/snapd/pull/10820>
[18:08] <mup> PR snapd#10740 closed: osutil: helper for injecting run time faults in snapd  <Run nested> <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10740>
[18:08] <mup> PR snapd#10821 opened: interfaces/raw_usb: add write access required to support USB/IP <Created by jocado> <https://github.com/snapcore/snapd/pull/10821>
[18:13] <mup> PR snapd#10795 closed: o/assertstate: check installed snaps when refreshing validation set assertions <validation-sets :white_check_mark:> <Created by stolowski> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10795>
[18:53] <mup> PR snapd#10780 closed: tests: manually umount devices during reset to prevent invariant error <Simple 😃> <Created by sergiocazzolato> <Closed by sergiocazzolato> <https://github.com/snapcore/snapd/pull/10780>