[06:09] <mup> PR snapd#7665 closed:  devicestate: add support for gadget->gadget remodel  <Priority 🏇> <Remodel 🚋> <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7665>
[06:13] <mborzecki> morning
[06:14] <mborzecki> mvo: thanks for merging 7665, do you need help with other remodel PRs?
[06:14] <mvo> mborzecki: hey, good morning!
[06:14] <mvo> mborzecki: maybe, let me see what pawel wrote
[06:14] <mborzecki> mvo: ok
[06:15] <mvo> mborzecki: 7720 needs a second review, should be super simple
[06:15] <mborzecki> mvo: ok, on it
[06:24] <zyga> Hello
[06:24] <zyga> Starting soon
[06:29] <mvo> zyga: good morning
[06:30] <mvo> mborzecki: I updated 7715, there is one question about test improvements, I need to think about it but I think it's not a blocker, feel free to ponder over this, I will not attack this today (his question about testing the taskEdges)
[06:33] <mborzecki> zyga: hey
[06:33] <mborzecki> mvo: let me check that
[06:36] <mborzecki> zyga: did you dream about keyctl? :)
[06:37] <zyga> in the office now
[06:37] <zyga> mborzecki: no, I decided to really rest today
[06:37] <mborzecki> zyga: i went through https://www.kernel.org/doc/html/v4.13/security/keys/core.html and then ecryptfs key usage examples but I feel none the wiser really
[06:38] <zyga> mborzecki: I'm sure there are keys here
[06:38] <zyga> https://twitter.com/zygoon/status/1191776451995095041
[06:38] <zyga> but not like you know it ;)
[06:38] <zyga> mborzecki: I think given timing we need to investigate keys next week
[06:38] <mborzecki> zyga: still, it feels like a replacement for a cookie, so half of the problem could be addressed
[06:38] <zyga> this week I'd like to get the cgroup in place and adjust snapd
[06:38] <zyga> maybe get the UI proposed
[07:23] <mup> PR snapd#7687 closed: snap-bootstrap: check gadget versus disk partitions <Created by cmatsuoka> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/7687>
[07:34]  * zyga break, some back pain, may work from floor today
[08:08] <pstolowski> morning
[08:10] <mvo> good morning pstolowski
[08:15] <zyga> Breakfast
[08:41] <zyga> Implemented the cgroup idea
[08:41] <zyga> Running tests
[08:41] <zyga> Snapd side changes next
[08:48] <mup> PR snapd#7721 opened: gadget: add support for hybrid partitioning schemas <Simple 😃> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/7721>
[08:48] <mborzecki> pstolowski:  hey
[08:50] <mborzecki> trivial PR for gadget ^^
[09:06] <pstolowski> mborzecki: sure
[09:06] <mborzecki> pstolowski: thanks!
[09:07] <zyga> I really want snap set system reexec=no
[09:12] <mup> PR snapd#7720 closed: asserts: add "snapd" type to valid types in the model assertion <Needs Samuele review> <Simple 😃> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7720>
[09:14] <zyga> Some progress
[09:14] <zyga> Freezer freezing snap-confine is silly
[09:16] <mborzecki> hahah
[09:16] <mborzecki> zyga: you mean s-c froze itself? :)
[09:23] <mborzecki> google:ubuntu-core-18-64:tests/main/remodel-gadget failed on master :/
[09:24] <mvo> a second review for 7711 would be great, it has one already and is green. hopefully straightforward
[09:25] <mvo> mborzecki: oh no! in what way?
[09:25] <mborzecki> - Remove data for snap "pc" (36) (failed to remove snap "pc" base directory: remove /var/snap/pc: directory not empty)
[09:25] <zyga> mborzecki: just being silly
[09:25] <zyga> mborzecki: it works now
[09:25] <mborzecki> hm probably some other test leaving stuff behind
[09:25] <zyga> https://www.irccloud.com/pastebin/DUaJsF5k/
[09:25] <zyga> mborzecki: no, I put it in the wrong spot
[09:25] <zyga> mborzecki: lxd escapes us though
[09:26] <zyga> mborzecki: I'll send the patch with the function in a moment
[09:26] <zyga> mborzecki: need to clean up the commit tree
[09:28] <zyga> mborzecki: lxd escaping all cgroups  https://www.irccloud.com/pastebin/76oMP7KO/
[09:30] <zyga> mborzecki: the C diff https://paste.ubuntu.com/p/T9NmjppHrY/
[09:45] <dot-tobias> zyga: FYI, filed https://bugs.launchpad.net/snapd/+bug/1851480 (forgot to hit “submit” yesterday …). Will ping here if I get stuck in my fix attempt
[09:45] <mup> Bug #1851480: Hooks are not included in slot/plug label expressions <snapd:New> <https://launchpad.net/bugs/1851480>
[09:51] <zyga> mborzecki: have a look at https://github.com/snapcore/snapd/pull/7722
[09:51] <mup> PR #7722: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[09:51] <zyga> dot-tobias: hey
[09:51] <mup> PR snapd#7722 opened: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[09:51] <zyga> dot-tobias: thanks!
[09:52] <zyga> mborzecki: on top I have the move of cgroup handling so that it affects all snap kinds
[09:52] <dot-tobias> zyga: Welcome, hope I got half the snapd lingo right 😄
[09:52] <zyga> mborzecki: and a one-liner that enables this new logic
[09:52] <zyga> mborzecki: my thinking is we can keep the pids thing for a while
[09:52] <zyga> mborzecki: and nuke it later once I'm done with snapd-side change
[09:53] <zyga> dot-tobias: yeah, the bug report looks great
[09:54] <zyga> mborzecki: does this seem sensible?
[09:54] <mborzecki> zyga: the diff doesn't look too bad
[09:55] <mborzecki> zyga: it'd be nice to run the idea by jamie and mvo too before we spend too much time digging
[09:55] <mborzecki> zyga: i mean jdstrand
[09:57] <mborzecki> uhh, the way we restore the system in tests is broken
[09:57] <zyga> mvo, jdstrand: ^
[10:07] <mvo> zyga: whats the high level idea? I have the C diff loaded now
[10:07] <zyga> mvo: the high-level idea is that we use a sub-directory of the current unified/v2 cgroup as a tag
[10:07] <zyga> mvo: and move the process there to identify it as a process belonging to a snap
[10:08] <zyga> mvo: I discussed this with xnox yesterday in the context of being systemd-safe
[10:08] <zyga> mvo: it's safe to move to a deeper hierarchy
[10:08] <zyga> mvo: (moving up is not)
[10:08] <zyga> mvo: it uses v2 in all systems, giving us notification ability
[10:08] <zyga> mvo: the new idea is from maciek really, that cgroups can act as process tags
[10:09] <zyga> mvo: the idea to use v2 is because it is very much like name=snapd in v1 world (no controllers)
[10:09] <zyga> mvo: and in v2 world it doesn't fall apart as we don't have to steal a process entirely
[10:09] <zyga> mvo: and we can move it deeper wherever systemd had originally placed it
[10:09] <zyga> mvo: this is the key improvement: it works in v2 in a way that doesn't disable systemd's operation
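The tagging scheme zyga outlines above — move the process into a sub-directory of its current cgroup v2 path so the directory itself acts as the tag — can be sketched roughly as below. This is a minimal illustration, not snapd's actual implementation; it assumes the standard `0::` unified-hierarchy entry in /proc/&lt;pid&gt;/cgroup and an illustrative security tag name:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// currentV2Path extracts the unified-hierarchy (cgroup v2) path from the
// contents of /proc/<pid>/cgroup; the v2 entry is the one prefixed "0::".
func currentV2Path(procCgroup string) (string, error) {
	for _, line := range strings.Split(procCgroup, "\n") {
		if strings.HasPrefix(line, "0::") {
			return strings.TrimPrefix(line, "0::"), nil
		}
	}
	return "", fmt.Errorf("no cgroup v2 entry found")
}

// subGroupPath returns the tag sub-group *below* wherever systemd placed
// the process; moving a process deeper is safe, moving it up is not.
func subGroupPath(v2Path, securityTag string) string {
	return filepath.Join("/sys/fs/cgroup", v2Path, securityTag)
}

func main() {
	p, _ := currentV2Path("0::/user.slice/user-1000.slice/session-1.scope\n")
	dir := subGroupPath(p, "snap.foo.app")
	fmt.Println(dir)
	// Prints: /sys/fs/cgroup/user.slice/user-1000.slice/session-1.scope/snap.foo.app
	// Actually tagging the process would then (with privileges) be roughly:
	//   os.MkdirAll(dir, 0755)
	//   os.WriteFile(filepath.Join(dir, "cgroup.procs"),
	//       []byte(strconv.Itoa(os.Getpid())), 0644)
}
```

Enumeration then becomes listing these tag directories, and notification could use the v2 cgroup.events file, which reports populated/empty transitions.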
[10:10] <mvo> zyga: aha, ok
[10:10] <zyga> mvo: it works in lxd as well, I confirmed this with stgraber two days ago
[10:10] <mvo> zyga: that sounds promising!
[10:10] <zyga> mvo: notification is implemented differently
[10:10] <zyga> mvo: and enumeration is implemented differently
[10:10] <zyga> mvo: and both are coming as follow-ups to snapd
[10:10] <zyga> mvo: first enumeration, so that existing code can be moved over (along with all the spread tests for this)
[10:11] <zyga> mvo: notification should let us salvage the UI work that is waiting now
[10:11] <zyga> mvo: so maybe... just maybe... it will really work by Friday
[10:11] <zyga> mvo: no rush to merge it, I'll carry on implementing it, but it would be good to review and discuss with jamie
[10:12] <zyga> mvo: xnox raised a concern about security aspect of keeping an app open
[10:12] <zyga> mvo: because it would prevent refreshes
[10:12] <zyga> mvo: but I think this is really the feature design
[10:12] <zyga> mvo: we are also looking with mborzecki at kernel keyring as a way to prevent spoofing
[10:12] <zyga> mvo: where all our processes could hold something that cannot be forged
[10:13] <zyga> mvo: but all it does is make the feature more security tight, it doesn't change that anyone can run a snap and prevent it from refreshing using "snap run..."
[10:13] <zyga> mvo: so our existing snapd-side safeguards must suffice - that is the window of time after which we refresh anyway
[10:14] <mvo> zyga: right, agreed
[10:17] <zyga> mvo: there's enough priority items not to review it yet
[10:17] <zyga> mvo: but we might try to review all of it by end of week
[10:17] <zyga> mvo: just to be able to say "it's in master"
[10:19] <geekgonecrazy> is there a way to bypass the restriction on refreshing to a revision?
[10:19] <geekgonecrazy> somehow during a recent update we discovered several users skipped a revision
[10:20] <geekgonecrazy> and during some troubleshooting we really need them to refresh or revert to that revision
[10:20] <zyga> geekgonecrazy: can you explain how skipping a revision was a problem for your users?
[10:21] <zyga> geekgonecrazy: based on what you said it seems like snap epochs are the solution we had designed and implemented for this problem
[10:21] <zyga> geekgonecrazy: where you can force all users to migrate through a set of revisions
[10:21] <zyga> geekgonecrazy: e.g. everyone will see and run the revision that can migrate your data format from v1 to v2
[10:21] <zyga> geekgonecrazy: before allowing them to refresh to a revision that only has v2 support
[10:22] <geekgonecrazy> where is this magic?  we thought it was this way out of the box
[10:22] <zyga> geekgonecrazy: how did you think it operates?
[10:22] <zyga> geekgonecrazy: (it doesn't but I'd like to understand)
[10:22] <zyga> geekgonecrazy: let me refer to the docs on this feature, hold on
[10:23] <zyga> geekgonecrazy: https://snapcraft.io/docs/snap-epochs
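For reference, epochs are declared in snapcraft.yaml; a migration revision like the one zyga describes would carry a "star" epoch, which marks it as able to read the previous epoch's data (the snap name below is illustrative):

```yaml
name: example-db
# epoch 1* = this revision understands both the old (epoch 0) and new
# data formats; users still on epoch 0 are refreshed through a 1*
# revision before the store offers them a plain "epoch: 1" revision
# that only supports the new format.
epoch: 1*
```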
[10:23] <geekgonecrazy> it was exactly that.  we had to upgrade the db and introduced a revision, and gave it a week thinking all would be updated before introducing the next revision
[10:23] <zyga> geekgonecrazy: you need to use epochs to really make that a guarantee
[10:24] <zyga> geekgonecrazy: as to what to do in an existing situation, I don't know the details of what you did to answer
[10:24] <zyga> geekgonecrazy: perhaps you can refresh everyone to an intermediate revision that still understands old formats
[10:24] <zyga> geekgonecrazy: and publish the one that doesn't in a new epoch
[10:24] <zyga> geekgonecrazy: please read about this feature first
[10:26] <geekgonecrazy> so for us we were on mongo 3.4, then an intermediate revision (the one some have missed) sets compatibility mode, which has to be set before it can go to 3.6.  then the latest revision is 3.6
[10:26] <Chipaca> geekgonecrazy: epochs is meant to be used for that kind of thing
[10:27] <Chipaca> geekgonecrazy: https://snapcraft.io/docs/snap-epochs
[10:27] <Chipaca> heh, i see zyga linked there already
[10:27] <Chipaca> ok
[10:27]  * zyga hugs Chipaca for implementing this magic
[10:29] <geekgonecrazy> so in theory we could release another revision requiring both of those and if they already had all would be fine?
[10:29] <Chipaca> geekgonecrazy: both of which?
[10:30] <geekgonecrazy> or will we have to hack past and put this revision in another channel and have them refresh to that channel to run that revision and then back to the latest
[10:30] <geekgonecrazy> Chipaca: both required revisions
[10:30] <Chipaca> sorry, i don't follow
[10:32] <geekgonecrazy> ok.  so the problem is we have that revision some didn't hit.  now they can't refresh or revert there, so we need some way for them to get that revision
[10:32] <geekgonecrazy> we can't release a new revision, because it would take those that succeeded backwards in the database, breaking their installs
[10:34] <geekgonecrazy> so trying to figure out if there is a way we could have them manually refresh to it
[10:34] <geekgonecrazy> then back to stable
[10:36] <Chipaca> geekgonecrazy: ah
[10:37] <Chipaca> geekgonecrazy: can you detect a broken install programmatically?
[10:38] <geekgonecrazy> Chipaca: yes.  But... fixing requires the older database version.
[10:38] <Chipaca> geekgonecrazy: it does sound like your easiest way forward is to use a branch
[10:39] <Chipaca> and then in future start using epochs so it doesn't happen again
[10:39] <geekgonecrazy> really wish i would have known of epochs sooner :) would have been perfect
[10:39] <Chipaca> geekgonecrazy: a branch is like a temporary track that goes away on its own
[10:40] <Chipaca> so you push the snap that would fix it to latest/stable/fix-for-the-thing, and tell affected users to refresh to it
[10:41] <geekgonecrazy> oh dang.. didn't know you could do that either
[10:41] <geekgonecrazy> we still are trying to figure
[10:41] <geekgonecrazy> out the tracks :)
[10:41] <Chipaca> :)
[10:41] <Chipaca> branches are hard to explain until you need them
[10:42] <Chipaca> rather like epochs actually
[10:46] <geekgonecrazy> thanks Chipaca & zyga !
[10:48] <geekgonecrazy> btw Chipaca on the forum post regarding dig it seems it's another dependency and dig just won't run.  so will be digging after this fire and getting to the bottom of it.
[10:51] <Chipaca> geekgonecrazy: good luck!
[11:12] <mvo> sil2100: I pushed a tiny PR to ubuntu-image so that we can have /boot on the system-seed partition, this should bring us one step closer to a booting image, hopefully a trivial review (cc xnox)
[11:13] <mvo> Chipaca: do you think we should have a short catchup today on recovery.grub? I think we are getting to a point where it will be useful :)
[11:14] <Chipaca> mvo: I was just trying the image created with your steps from yesterday, and something seems off
[11:14] <mvo> Chipaca: i.e. we can create an image now but right now (due to a bug) it has no /boot but even when it has that we still don't have a recovery grub that will work.
[11:14] <Chipaca> mvo: but maybe i'm missing something :-)
[11:14] <Chipaca> so, yeah
[11:14] <mvo> Chipaca: tell me!
[11:14] <Chipaca> mvo: /systems/2019yadda/ is empty, and instead things are in /snaps/ ?
[11:15] <mvo> Chipaca: its fully empty?
[11:15] <mvo> Chipaca: so "/snaps" has all the snaps that are shared between recoveries that is ok
[11:16] <Chipaca> mvo: ah, i missed that
[11:16] <mvo> Chipaca: but /systems/20191106/ should not be empty
[11:16] <Chipaca> mvo: empty of snaps
[11:16] <mvo> Chipaca: thats ok :) assertions and the model should be there
[11:16] <Chipaca> yep that's there
[11:16] <mvo> Chipaca: there will only be local unasserted snaps iirc
[11:16] <mvo> Chipaca: cool
[11:17] <mvo> Chipaca: once we have https://github.com/CanonicalLtd/ubuntu-image/pull/177 we should be able to get /boot too with grub.cfg and grubenv
[11:17] <mup> PR CanonicalLtd/ubuntu-image#177: System seed boot dir <Created by mvo5> <https://github.com/CanonicalLtd/ubuntu-image/pull/177>
[11:17] <mvo> Chipaca: but we also need to tweak boot.MakeBootable to copy the right grub :)
[11:17] <mvo> Chipaca: but I'm excited, we are relatively close I think(?)
[11:19] <Chipaca> mvo: where's the bit that creates writable?
[11:21] <mvo> Chipaca: we create only "ubuntu-seed" nowadays, the recovery kernel then boots and the recovery initramfs will notice it runs in "install" mode and will run "snap-bootstrap" which will create all the missing partitions
[11:22] <Chipaca> hmmm, maybe i did something wrong
[11:22] <Chipaca> mvo: i thought i'd kicked that off, but it hung
[11:22] <Chipaca> mvo: i'll try again, paying more attention :)
[11:22] <mvo> Chipaca: no worries! you kicked what off?
[11:23] <Chipaca> mvo: booted the kernel, with the initramfs
[11:23] <Chipaca> and it got to the mounting step and hung
[11:23] <mvo> Chipaca: oh, woah - how did that happen?
[11:23] <Chipaca> mvo: the booting? or the hanging? :)
[11:23] <mvo> Chipaca: I mean, how did you manage to get a booting kernel?
[11:23] <Chipaca> mvo: i can walk you through that, it's not hard
[11:24] <mvo> Chipaca: a pastebin with the commands would be great
[11:24] <Chipaca> mvo: or i can write a working grub.cfg for the image as it is right now
[11:24] <mvo> Chipaca: but thats actually much better news than I anticipated
[11:24] <Chipaca> mvo: but, basically, grub is awesome (as long as you're patient)
[11:24]  * mvo hugs Chipaca really hard
[11:24] <mvo> Chipaca: s/grub/chipaca/
[11:24] <mvo> Chipaca: (and strike the patient part)
[11:25] <Chipaca> nah leave that one on
[11:25] <mvo> Chipaca: thats super exciting, can't wait to learn more
[11:27] <Chipaca> mvo: "ubuntu-seed"'s label is actually 'Recovery', fwiw
[11:27] <Chipaca> not sure if that's a bug or not :)
[11:29] <mvo> Chipaca: in the gadget.yaml ? or where is it set to Recovery?
[11:29] <mvo> Chipaca: it sounds like a bug
[11:30] <Chipaca> mvo: on the image
[11:30] <Chipaca> haven't looked at the yaml
[11:31] <mborzecki> quick errand, back in 30
[11:31] <mvo> Chipaca: ha! you are right
[11:31] <mvo> Chipaca: its in the gadget.yaml, should be trivial to fix
[11:32] <mvo> Chipaca: I have lunch now, let's sync on the grub stuff when I'm back
[11:32] <Chipaca> hm, again it reached "Running /scripts/local-premount" and gets stuck there
[11:32] <Chipaca> maybe i'm being impatient?
[11:33]  * Chipaca waits more
[11:34] <mvo> Chipaca: will probably timeout eventually - alternatively the usual "break=top" or so should work. *maybe* even "systemd.debug-shell=1"
[11:34] <mvo> Chipaca: this is the new stuff from xnox, we may have systemd already in initramfs :)
[11:34] <Chipaca> mvo: ah i was about t'ask
[11:35] <Chipaca> "the required kernel commandline snap_core is not set"
[11:35] <mvo> Chipaca: silly thing :)
[11:35] <Chipaca> that's not a thing in 20
[11:35] <mvo> Chipaca: I think we need quite a few updates to our initramfs
[11:35] <cachio> Chipaca, is this conversation related to the remodel-gadget test that is failing on master?
[11:35] <mvo> Chipaca: and the fact that we can boot a kernel now means we can actually do it (hurazzz!)
[11:36] <Chipaca> cachio: no :)
[11:36] <cachio> Chipaca, ok, I'll take a look to the test in that case
[11:36] <cachio> thanks!!
[11:36] <cachio> :)
[11:36] <mvo> cachio: AFAICT no one has looked deeper into this one yet :/ any hints appreciated (but maybe mborzecki has looked at the failing remodel-gadget on master, not sure)
[11:36]  * mvo gets lunch first
[11:37] <Chipaca> mvo: i'll get an editable workable grub.cfg into a pastebin so more people can play
[11:37] <mvo> Chipaca: \o/
[11:37] <mvo> Chipaca: or even into github.com/snapcore/pc-gadget in the 20 branch? :)
[11:38] <sil2100> mvo: ah! Nice catch! I forgot we're skipping that one
[11:38] <cachio> mvo, ok
[11:39] <sil2100> mvo: let's wait for at least one test run to finish and I'll merge it
[11:46] <Chipaca> mvo: something else is is snapcore/pc-gadget a thing?
[11:46] <Chipaca> er
[11:46] <Chipaca> mvo: sorry those were two separate things
[11:46] <Chipaca> mvo: second part first: where is this snapcore/pc-gadget?
[11:47] <Chipaca> mvo: first part: something else needs doing because it's not picking up grub.cfg from the obvious place
[11:47] <Chipaca> might be the old fat vs vfat names silliness
[11:47]  * Chipaca tries
[12:04] <zyga> hmm
[12:04] <zyga> Chipaca: without reading the backlog, is something failing and you are looking at pc-gadget?
[12:13] <Chipaca> zyga: not really no
[12:27] <mborzecki> re
[12:28] <zyga> mborzecki: making progress, some more complexity but nothing breaking
[12:31] <mborzecki> zyga: sounds like you're having fun :)
[12:32] <mborzecki> mvo: cachio: yes i have, and the way we restore the system state after the test is wrong or just lacking
[12:33] <cachio> mborzecki, what I see is that the test fails only when another test is executed before it
[12:33] <mborzecki> mvo: cachio: in short what happens is that we have a tar.gz with files that existed *before* the test, and in restore we restore that state, but things that are newly added during the test stay behind, they don't get cleaned up
[12:33] <mborzecki> mvo: cachio: imo what we need is something like rsync --delete, or to just fix the tests to do the cleanups
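The gap mborzecki describes — tar re-extracts the snapshot but never deletes files created during the test — comes down to computing a delete set. A minimal sketch of that logic, with hypothetical names, in Go:

```go
package main

import (
	"fmt"
	"sort"
)

// restoreActions computes what a faithful restore must do: everything in
// the snapshot gets re-extracted (tar already does this), and anything
// present only in the live tree is deleted -- the "rsync --delete" part
// that plain tar extraction lacks, which is why leftovers accumulate.
func restoreActions(snapshot, live []string) (extract, del []string) {
	inSnapshot := make(map[string]bool)
	for _, f := range snapshot {
		inSnapshot[f] = true
		extract = append(extract, f)
	}
	for _, f := range live {
		if !inSnapshot[f] {
			del = append(del, f) // created during the test; must be removed
		}
	}
	sort.Strings(del)
	return extract, del
}

func main() {
	_, del := restoreActions(
		[]string{"/var/snap/core/current"},
		[]string{"/var/snap/core/current", "/var/snap/pc/36/leftover"},
	)
	fmt.Println(del) // the files a tar-only restore would wrongly leave behind
}
```

This is exactly the "directory not empty" failure mode seen with /var/snap/pc: the leftover files survive a tar-based restore and break a later test's cleanup.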
[12:34] <cachio> mborzecki, nice, I'll try that
[12:34] <mborzecki> cachio: i'm fixing the tests for now, but maybe i can find a sensible way of detecting when stuff is left behind
[12:35] <mborzecki> sensible, as in not involving gruesome shell one-liners
[12:41] <zyga> - Remove data for snap "pc" (36) (failed to remove snap "pc" base directory: remove /var/snap/pc: directory not empty) <- Chipaca <- was that the topic?
[12:51] <mup> PR snapcraft#2792 closed: pluginhandler: use well-formed build package/snap lists <Created by cjp256> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2792>
[12:51] <Chipaca> zyga: no
[12:57] <mvo> mborzecki: nice, thanks for looking into this
[12:57] <mvo> Chipaca: re gadget> https://github.com/snapcore/pc-amd64-gadget/tree/20
[12:58] <mvo> Chipaca: that's the current place and we can update the 20 branch
[13:00] <Chipaca> mvo: where is EFI, on the resulting image?
[13:03] <mvo> Chipaca: a good question, I need to look
[13:03] <mvo> Chipaca: not sure if that needs special handling
[13:03] <mvo> Chipaca: aha, hmmmm, I *think* we did that using the gadget.yaml
[13:04] <mvo> Chipaca: maybe that needs updating
[13:04] <Chipaca> mvo: I mean, gadget.yaml talks about EFI, but there is no /EFI/ in the recovery partition afaict
[13:04] <mvo> Chipaca: hmm, that might need tweaks to ubuntu-image again
[13:05] <mvo> Chipaca: https://github.com/snapcore/pc-amd64-gadget/blob/20/gadget.yaml#L23 should populate the partition
[13:05] <mvo> Chipaca: but when I look I also see no content there
[13:05] <Chipaca> mvo: also, also, i'd like to be able to load a grub module :)
[13:05] <mvo> Chipaca: what does that involve?
[13:05] <Chipaca> is that doable?
[13:06] <Chipaca> mvo: in grub.cfg it's an insmod, and a file under EFI/
[13:06] <Chipaca> not sure how it interferes with secure boot though
[13:06] <Chipaca> maybe a question for cmatsuoka
[13:07] <mvo> Chipaca: should be ok - we may hardcode it into grub though, i.e. build grub with the module built-in
[13:07] <Chipaca> mvo: but insmod regexp gives me globs so i can loop over a directory and pick out stuff without having to have env vars for everything
[13:07] <Chipaca> mvo: so i think we want that :)
[13:10] <mvo> Chipaca: right
[13:10] <mvo> Chipaca: so I think we have two options, we can build it into grub or add the module to the gadget.yaml
[13:10] <mvo> Chipaca: I think the latter is quicker for now
[13:10] <Chipaca> mvo: only if adding it to gadget.yaml actually puts it somewhere :)
[13:11] <mvo> Chipaca: but maybe xnox has an opinion on regexp module vs built-in
[13:11] <mvo> Chipaca: right, that's indeed a bit of a mystery right now
[13:12] <xnox> Chipaca:  mvo: loading modules is prohibited under Secureboot
[13:12] <xnox> and i thought regex is a built-in already anyway
[13:12] <xnox> mvo:  also i see grub.conf in the pc-gadget, is that a bad symlink that should have been called grub.cfg?
[13:13] <xnox> also the generated image has duplicate full copies of snaps it seems, that's quite suboptimal.
[13:14] <xnox> note system-seed is vfat thus no hardlinks / symlinks available...
[13:14] <xnox> (unless my world view is wrong about vfat)
[13:15] <mvo> xnox: grub.conf vs grub.cfg is a long story
[13:15] <xnox> mvo:  i slowly back away. cool then.
[13:15] <mvo> xnox: I don't see duplicated snaps
[13:15] <mup> PR snapd#7723 opened: snap-bootstrap: create encrypted partition <Created by cmatsuoka> <https://github.com/snapcore/snapd/pull/7723>
[13:15] <mvo> xnox: I just see /snaps and a bunch of them there
[13:16] <Chipaca> xnox: AFAICT the regexp module is not built-in
[13:16] <xnox> so reading the source code i believe the regexp module is built into our grub efi image at least. On bios it needs to be loaded, which is allowed there.
[13:16] <xnox> Chipaca:  hm.
[13:16] <Chipaca> so maybe this is booting via the bios one?
[13:16] <xnox> Chipaca:  how are you booting it?
[13:16] <Chipaca> seems strange, because it loads the cfg from EFI
[13:16] <mvo> Chipaca: I will wrestle a bit with why ubuntu-image does not include the content on the recovery partition unless sil2100 has an idea, I will probably have to do some deep dive
[13:17] <xnox> Chipaca:  we are dual-boot/hybrid by default.....
[13:17] <Chipaca> xnox: you know that email from mvo saying how to get a pc.img ?
[13:17] <Chipaca> xnox: that
[13:17] <Chipaca> xnox: how can i know if it's booting in efi or bios?
[13:18] <mborzecki> we should really get rsync into the core snap
[13:18] <xnox> Chipaca:  are you using a VM? and have you provided OVMF firmware to do EFI boot and get a splash with TianoCore? or do you use seabios, i.e. the default.
[13:19] <Chipaca> xnox: ah, this is a vm but without the fancy ovmf stuff
[13:19] <sil2100> mvo: I think I know the reason
[13:19] <Chipaca> so i guess it's plain ol bios
[13:19] <xnox> Chipaca:  there is nothing in the email that tells people how to boot it. qemu binary defaults to bios boot, unless one provides EFI firmware. I use virt-manager, and use the gui to specify the ovmf stuff.
[13:19] <xnox> Chipaca:  which is not our primary target.  =)
[13:19] <mvo> mborzecki: we have a rsync snap, no?
[13:19] <xnox> we are racing to boot unencrypted with UEFI =)
[13:19] <sil2100> mvo: my expectation was that what snap prepare-image generates will already have all the necessary EFI/ directories
[13:19] <Chipaca> xnox: right, i'll get that rig up (not my main rig because it still refuses to upgrade out of 16.04)
[13:19] <sil2100> mvo: but if that's still required to be done by ubuntu-image, then I can do that
[13:20] <mvo> sil2100: aha, interesting
[13:20] <xnox> har har
[13:20] <zyga> mborzecki: another chunk https://github.com/snapcore/snapd/pull/7724
[13:20] <mup> PR #7724: cmd/snap-confine: tracking processes with classic confinement <Created by zyga> <https://github.com/snapcore/snapd/pull/7724>
[13:20] <mvo> sil2100: its a good question, in theory we have all the code to do that I think
[13:20] <zyga> mborzecki: this applies the tracking (still with pids) for classic confinement
[13:26] <Chipaca> ruh roh, ubuntulog fight!
[13:27] <mborzecki> mvo: right, but it's a snap, and i need to restore this on a core device
[13:29] <mvo> mborzecki: could we download it to /var/tmp and unpack there?
[13:29] <mvo> mborzecki: or is that too crazy
[13:30] <mborzecki> mvo: ha, this might even work! :)
[13:30] <Chipaca> mborzecki: ssh + tar dude
[13:30] <mborzecki> mvo: ok, let me finish up with this simple fix i have now and i can try that later
[13:30] <Chipaca> mborzecki: tar cz path/to/tar | ssh 'tar xvz'
[13:30] <mvo> mborzecki: \o/
[13:31] <mborzecki> Chipaca: i'd seriously just want rsync -a in prepare, then rsync -a --delete in restore
[13:31] <Chipaca> mborzecki: or, snap install rsync --devmode
[13:33] <mborzecki> Chipaca: right, but the cleanup is in restore, where we dump the mount units, the state and so on, that's why unpacking to /tmp/rsync sounds appealing
[13:34] <mborzecki> unpacking the snap ofc
[13:35] <Chipaca> mborzecki: should work as long as it's the same core
[13:36] <mborzecki> mhm
[13:39] <mup> PR snapd#7711 closed: seed: test and improve Core 20 seed handling errors <Priority 🏇> <Created by pedronis> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7711>
[13:42] <Chipaca> brb, reboot
[13:50] <mup> PR snapd#7725 opened: tests/lib/state: snapshot and restore /var/snap during the tests <Simple 😃> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/7725>
[13:51] <mborzecki> mvo: cachio: zyga: poor man's fix ^^
[13:52] <zyga> mborzecki: classic mount namespaces
[13:52] <zyga> mborzecki: is that why you picked /var/snap/* over /var/snap
[13:52] <zyga> ?
[14:09] <mvo> pstolowski: I updated 7715 with most of your comments
[14:28] <sil2100> mvo: ok, PR coming to you in a minute, just finishing running the unit tests
[14:28] <mvo> sil2100: \o/
[14:29] <Chipaca> ijohnson: do you know if mksquashfs lets you append to an existing squash using a different compression level?
[14:30] <ijohnson> Chipaca: you can set compression options like block size and dictionary size, but it doesn't significantly improve performance because those things don't correspond 1:1 to compression ratio, and the size differences are minimal between that and the default
[14:30] <ijohnson> Chipaca: I don't know I haven't looked at that
[14:31] <ijohnson> Chipaca: my gut guess is that no, you can only use a single compression level for the whole squash image because the compression options are written in exactly one place in the image for the different types (i.e. file data, file metadata, and something)
[14:36] <Chipaca> ijohnson: yeah, unless an append adds a new superblock, i don't see it happening either
[14:40] <popey> https://forum.snapcraft.io/t/support-for-man-pages/2299
[14:40] <popey> i saw somewhere that we might add man page support to snapd. or did I dream that?
[14:40] <ijohnson> Chipaca: yeah exactly
[14:40] <sil2100> mvo: hopefully this will do it: https://github.com/CanonicalLtd/ubuntu-image/pull/178
[14:40] <mup> PR CanonicalLtd/ubuntu-image#178: System seed boot content <trivial> <Created by sil2100> <https://github.com/CanonicalLtd/ubuntu-image/pull/178>
[14:41] <sil2100> I mean, if I didn't mix it up in my head
[14:42] <pstolowski> mvo: #7715 +1, with one final remark
[14:42] <mup> PR #7715: overlord: add base->base remodel undo tests and fixes <Priority 🏇> <Created by mvo5> <https://github.com/snapcore/snapd/pull/7715>
[14:42] <zyga> mborzecki: snapd side now works too
[14:42] <zyga> mborzecki: I need to adjust spread tests that measure exactly how refresh app awareness is implemented
[14:43] <zyga> mborzecki: wanna see a quick diff
[14:43] <zyga> mborzecki: I cannot propose it yet because it depends on the rest
[14:44] <zyga> popey: hey
[14:44] <zyga> popey: not a dream
[14:44] <zyga> popey: it's likely next cycle
[14:44] <zyga> popey: and based on what others said it doesn't look like a big task
[14:45] <zyga> mborzecki: minimal, tests-passing, patch to snapd to support this new scheme
[14:45] <zyga> https://paste.ubuntu.com/p/6gtpY8q4VG/
[14:46] <Chipaca> popey: its coming
[14:47] <Chipaca> popey: hoping to get it in as part of 20 work, it's above the line for now :)
[14:48] <zyga> mborzecki: running spread tests, also removed all pids cgroup usage from snap-confine now
[14:48] <popey> Great! Thanks chaps!
[14:51] <zyga> mborzecki: I'll structure the patches so that they are a linear progression
[14:51] <mborzecki> zyga: looks like we might need a helper that hides cgroup.PidsInFile and security tag
[14:52] <Chipaca> huh, tab completion for 'mount' is broken with the default awk in mate eoan
[14:52] <mborzecki> zyga: like sandbox.PidsForSnap(<tag>) ?
[14:52] <zyga> mborzecki: yeah, can do
[14:52] <zyga> mborzecki: it cannot take tag
[14:52] <zyga> mborzecki: it can take a snap info, return a map
[14:53] <zyga> mborzecki: it could take a tag but I don't see use of that, we'd need to scan the same set of files really, just ignore more, and derive the name anyway for locking
[14:53] <zyga> locking might need adjustment (will for sure)
[14:53] <zyga> once UI is glued correctly
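A rough sketch of the helper mborzecki suggests above; the name `sandbox.PidsForSnap` and the input shape are hypothetical, and a real implementation would read cgroup.procs files from the tag sub-groups rather than take their contents as a map:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pidsByTag groups PIDs by security tag -- the map shape a helper like
// the suggested sandbox.PidsForSnap(info) could return. The input maps
// each tag sub-group (e.g. "snap.foo.app") to the raw contents of its
// cgroup.procs file (one PID per line).
func pidsByTag(procsFiles map[string]string) (map[string][]int, error) {
	out := make(map[string][]int)
	for tag, contents := range procsFiles {
		for _, field := range strings.Fields(contents) {
			pid, err := strconv.Atoi(field)
			if err != nil {
				return nil, fmt.Errorf("cannot parse pid %q: %v", field, err)
			}
			out[tag] = append(out[tag], pid)
		}
	}
	return out, nil
}

func main() {
	m, _ := pidsByTag(map[string]string{
		"snap.foo.app":  "101\n202\n",
		"snap.foo.hook": "303\n",
	})
	fmt.Println(m["snap.foo.app"], m["snap.foo.hook"]) // [101 202] [303]
}
```

Taking the snap info and returning a map (rather than taking a single tag) matches the point above: the same set of files has to be scanned either way, so one pass can serve all of a snap's apps and hooks.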
[14:53] <Chipaca> mvo: so, once I add the missing files to the image, it boots in EFI (or is it UEFI) mode as well
[14:53] <mvo> Chipaca: nice!
[14:54] <mvo> sil2100: thank you, looking now
[14:54] <mvo> pstolowski: thank you, I'm looking now
[14:54] <Chipaca> xnox: and I can confirm 'regexp' exists in the (U)EFI grub
[14:54] <Chipaca> so globs will work, supposedly
[14:54]  * Chipaca tries
[14:55] <Chipaca> yep! :-) winning
[14:55] <mvo> Chipaca: \o/
[14:55] <mvo> Chipaca: the PR from lukasz should make this all work now
[14:56] <mvo> Chipaca: now we just need to expand gadget.yaml to include the extra partitions that are needed, but that's not in the critical path for booting just yet (it will be once we boot :))
[14:56]  * sil2100 hopes his PR fixes it
[14:56] <Chipaca> mvo: is the initramfs the core18 one right now?
[14:56] <sil2100> Later today I'll prepare my env to actually test this on the test model
[14:57] <mvo> Chipaca: pretty much AIUI
[14:57] <mvo> Chipaca: xnox told me he will use the spike one for now which is close to core18
[14:57] <mvo> sil2100: should be fine, we can test for you
[14:57] <mvo> sil2100: well, except $meetings :/ oh well
[15:06]  * cachio lunch
[15:13]  * zyga goes for lunch
[15:13] <zyga> mborzecki: tests updated, let's see if they pass
[15:13] <zyga> mborzecki: I used a little hacky logic to find the place where snap-confine will put things
[15:14] <zyga> mborzecki: ideas welcome https://www.irccloud.com/pastebin/L1e8JZxX/
[15:14] <zyga> but first food
[15:15] <zyga> cachio, mvo: I see red tests because of 403 from the store
[15:15] <zyga> search is 403ing
[15:15] <zyga> is that expected now?
[15:16] <mvo> zyga: might be worth raising with #snapstore
[15:16] <zyga> k
[15:16] <mborzecki> nothing like an occasional 403
[15:17] <xnox> Chipaca:  pc-kernel is built from eoan, with initrd from eoan, whatever that contains.
[15:17] <xnox> Chipaca:  i think it's not on-par with spike-tools, but better than just bionic.
[15:17] <xnox> mvo:  ^
[15:17] <xnox> bah
[15:17] <xnox> wrong
[15:17] <xnox> unwind
[15:17]  * Chipaca unwinds
[15:17] <xnox> Chipaca:  mvo: pc-kernel 20/* is built from FOCAL, with initrd from FOCAL, whatever that contains.
[15:18]  * mvo is in a meeting but will read backlog
[15:18] <xnox> which is not my stuff, and incomplete spike-tools version, as far as I can tell.
[15:31] <mvo> xnox: if it's not yours, who should work on it?
[15:28] <Chipaca> jdstrand: FWIW that weird X thing happens if you try to run snapped X apps in vncserver, per https://forum.snapcraft.io/t/gui-snaps-cant-connect-to-display-in-vnc/14028/3?u=chipaca
[15:28] <Chipaca> jdstrand: but only if you don't start vncserver when in an X session
[15:30] <Chipaca> jdstrand: that is: log in via the terminal, start vncserver, connect to the vncserver, try to start a snapped app and it'll fail
[15:30] <Chipaca> anyway, probably not a "you" issue
[15:31] <sil2100> mvo: thanks for the review! I'll be merging it and we can iterate a bit more in case that's not enough or something ;)
[15:41] <zyga> re
[15:44] <jdstrand> Chipaca: ok, thanks
[15:45] <mup> PR snapcraft#2794 opened: WIP: Add gnome 3 34 extension <Created by hellsworth> <https://github.com/snapcore/snapcraft/pull/2794>
[15:46] <zyga> jdstrand: hey
[15:47] <jdstrand> zyga: so, the basic idea is that you want to use the pids cgroup and move all snap processes into it?
[15:47] <hellsworth> yeah i'm still cleaning things up, but thought i would go ahead and open it up for comments.
[15:47] <jdstrand> zyga: but don't set the max or anything?
[15:47] <zyga> jdstrand: not pids
[15:47] <jdstrand> https://www.irccloud.com/pastebin/DUaJsF5k/
[15:47] <jdstrand> lists:
[15:48] <jdstrand> https://www.irccloud.com/pastebin/DUaJsF5k/
[15:48] <jdstrand> meh
[15:48] <jdstrand> 10:pids:/snap.travis.travis
[15:48] <zyga> https://github.com/snapcore/snapd/pull/7722
[15:48] <mup> PR #7722: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[15:48] <zyga> jdstrand: tl;dr; -> if cgroup2 is mounted we use that (even in hybrid mode)
[15:48] <zyga> jdstrand: if not we use cgroup name=systemd
[15:49] <zyga> jdstrand: the new idea is really that we dig deeper into _whatever_ systemd has placed us already
[15:49] <zyga> jdstrand: thereby not stealing the process
[15:49] <zyga> jdstrand: thereby not breaking systemd in v2 mode
[15:50] <zyga> jdstrand: I have two PRs and almost all the changes after that also ready
[15:50]  * jdstrand is wondering what https://www.irccloud.com/pastebin/DUaJsF5k/ was trying to show me
[15:50] <zyga> 0::/user.slice/user-1000.slice/session-2.scope/snap.travis.travis
[15:50] <zyga> this
[15:51]  * jdstrand also wonders why 'pids' is being used, but that is a different question
[15:51] <zyga> pids was used so far in the earlier iteration of the work on refresh app awareness
[15:51] <zyga> it's been like that for half a year roughly
[15:51] <zyga> but it's not _enabled_ in production yet
[15:51] <zyga> jdstrand: usage of pids is entirely replaced by the new idea
[15:52] <jdstrand> zyga: does that not happen with devmode? I just tried in a devmode snap and noticed that /proc/self/cgroup had this: 8:pids:/user.slice/user-1000.slice/user@1000.service
[15:52] <zyga> that's systemd
[15:52] <zyga> not us
[15:53] <jdstrand> I know that
[15:53] <zyga> the paste I shared was from my local machine
[15:53] <zyga> you won't see the changes elsewhere yet
[15:53] <jdstrand> why does the devmode snap use systemd, but your travis paste uses snap.travis...
[15:53] <jdstrand> ok. I was confused by your "it's been there half a year" comment then
[15:54] <zyga> jdstrand: I don't understand, travis was started in my desktop session so it was placed in /user.slice/user-1000.slice/session-2.scope/
[15:54] <jdstrand> ok, anyway
[15:54] <zyga> jdstrand: snap-confine moved it to a subdirectory
[15:54] <zyga> jdstrand: which is named after the security tag
[15:54] <zyga> jdstrand: hence 0::/user.slice/user-1000.slice/session-2.scope/snap.travis.travis
[15:54] <jdstrand> zyga: please look at this: https://www.irccloud.com/pastebin/DUaJsF5k/
[15:55] <jdstrand> zyga: *you* mentioned that paste in backscroll earlier today
[15:55] <zyga> right
[15:55] <zyga> are you asking about why pids= is like that?
[15:55] <zyga> or about something else?
[15:55] <jdstrand> 10:pids:/snap.travis.travis
[15:55] <jdstrand> yes :)
[15:55] <zyga> ah, forgive me :)
[15:55] <zyga> so, pids= implements stealing that we used for tracking
[15:55] <zyga> current WIP branch no longer does that
[15:55] <zyga> but
[15:55] <zyga> the stealing has been like that for a good amount of time
[15:55] <zyga> and we just want to stop doing that
[15:56] <jdstrand> ok
[15:56] <jdstrand> so I'll ignore that
[15:56] <zyga> sorry for making this confusing, I should have highlighted what I was trying to convey
[15:57] <jdstrand> that said, mvo and I had discussed unconditionally putting all snap processes into both a pids and a mem cgroup so that management snaps could place resource constraints on them
[15:57] <xnox> mvo:  "not mine" as in "not the new shiny systemd stuff" I think i'm still on the hook to maintain either initramfs-tools based initrd or the systemd based initrd =)
[15:57] <jdstrand> if we did that, couldn't we just look at those cgroups to see if there are any processes currently in them?
[15:58] <zyga> jdstrand: how do you want that to work in v2 world?
[15:58] <jdstrand> zyga: sorry, seeing 'pids' in that paste reminded me of this
[15:58] <zyga> jdstrand: yes but this doesn't exist in v2 world
[15:58] <zyga> jdstrand: this is why we're doing this
[15:59] <zyga> jdstrand: in v2 world my patch set already does what you want, in a way, because now you can put constraints on snaps regardless of which session they are in
[15:59] <zyga> jdstrand: or if they are system or user processes
[15:59] <mvo> sil2100: yeah, thanks so much for your help with u-i!
[15:59] <zyga> jdstrand: because _all_ snaps in v2 are in sub-directory of where systemd placed the workload initially
[15:59] <zyga> jdstrand: in v1 mode we can do the same thing for pids/memory
[15:59] <jdstrand> I'm sorry, I'm having a hard time context switching. also, I said unconditionally into a pids cgroup, I meant cpu and mem
[15:59] <mvo> xnox: aha, excellent! yes, sorry, I just misunderstood
[15:59] <zyga> right
[16:00] <jdstrand> also, the v2 world has cpu and mem cgroups based on https://www.kernel.org/doc/Documentation/cgroup-v2.txt
[16:00] <jdstrand> so, and this is the part I forget, the limitation in a v2 world is how systemd is using it, correct?
[16:01] <zyga> jdstrand: v2 world has only one tree
[16:01]  * jdstrand has not done much in a v2 only world yet, besides having abstract conversations about it
[16:02] <jdstrand> zyga: yes, but, as you mentioned in backscroll, you can go deeper, just not higher, right?
[16:02] <zyga> jdstrand: what you can do is to put constraints on a given point in the v2 tree
[16:02] <zyga> jdstrand: yes
[16:02] <zyga> jdstrand: so my patch set does that for v2 already
[16:02] <jdstrand> zyga: yes, I understand
[16:02] <zyga> jdstrand: note that the 0::/... line is the only thing you will see in pure v2 world
[16:02] <zyga> jdstrand: so if you already scope each snap workload to the snap security tag
[16:02] <zyga> jdstrand: you can put constraints now
[16:03] <zyga> jdstrand: we can expand this to v1 world as well
[16:03] <zyga> jdstrand: using a similar idea
[16:03] <zyga> jdstrand: and pretty much the same logic we have in my first patch
[16:03] <jdstrand> zyga: but I was then discussing, why not put snaps into a deep cpu and mem cgroup? and then use that for pid tracking?
[16:03] <zyga> jdstrand: because I wanted to use just one thing, not several for pid tracking
[16:03] <zyga> jdstrand: and that one thing is v2 if available (18.04+) or name=systemd as fallback
[16:04] <zyga> jdstrand: because we can rely on that as well
[16:04] <jdstrand> zyga: sure, but you could just pick one. if they are both unconditional, then just use cpu
[16:04] <jdstrand> (for example)
[16:04] <zyga> yes but it has semantics and this one does not, it's strictly for tracking and not resource allocation
[16:04] <zyga> jdstrand: up until earlier today I was not considering name=systemd fallback at all, as we hoped we could avoid that
[16:06] <jdstrand> zyga: yes... and that's fine. It's just that when I saw 'pids' I was reminded of this unconditional cpu/mem thing, which we don't have, but we want. snapd doesn't do anything with cpu and mem, so I lumped that in with tracking
[16:06] <zyga> I understand
[16:06] <jdstrand> zyga: but, I guess if we can do one for tracking, we can do two more for cpu/mem
[16:06] <zyga> yeah, I think this is safer
[16:07] <zyga> name=systemd subdirectory is only used by systemd
[16:07] <jdstrand> anyway, it is just a thought
[16:07] <zyga> sure, sorry if I seem defensive about this, I'm trying to keep this scoped to something that's realistically deliverable
[16:07] <zyga> I think it's a good idea to expand this to cpu/mem, not sure how people would configure that but if that's all the interface we give (guaranteed name) then I think that's okay
[16:08] <jdstrand> that was the only idea
[16:08] <jdstrand> I mean, we could then create an interface for managing snap* cpu/mem cgroups, etc
[16:08] <zyga> an interface, as in permissions, or an API via snapd?
[16:09] <jdstrand> but snapd itself wouldn't do anything cause it can't make an intelligent decision
[16:09] <jdstrand> interfaces/builtin
[16:09] <jdstrand> but I'm distracting us
[16:09] <zyga> jdstrand: we can do it if we have a driving use case, right now that part feels easy but making sure it's what someone wants is harder
[16:14] <jdstrand> zyga: yeah, there have been customer requests to make it easier to set cpu and mem limits on snaps
[16:14] <jdstrand> anyway, again, I'm distracting
[16:14] <zyga> thank you for sharing this, I will follow up with this idea soon
[16:14] <jdstrand> thanks!
[16:16] <jdstrand> zyga: is it documented anywhere in the forum or elsewhere all the cgroups stuff? like, v1 vs v2 vs unified within the context of systemd (perhaps by version) and how snapd currently fits, along with what other distros do and the recommendations?
[16:16] <zyga> no, I don't believe so
[16:16] <zyga> jdstrand: this idea was bounced internally between me and mborzecki for some time
[16:16] <zyga> jdstrand: and after name=snapd collapsed I picked it up as the only sensible alternative
[16:16] <jdstrand> zyga: I ask, cause, I get that all into my brain so I can somewhat talk about it, then personally don't play with it at all, so I lose it until the next time
[16:17] <jdstrand> zyga: I can't be the only one who is in a similar position. I mean, if one lives and breathes cgroups, that is one thing. that isn't me
[16:17] <zyga> jdstrand: I will write it down on the refresh app awareness thread
[16:18] <zyga> jdstrand: it's just that so much happened recently I didn't manage to settle
[16:18] <jdstrand> zyga: thank you. I think it would help everyone sync up and get on the same page for future changes in this area
[16:18] <jdstrand> zyga: totally understand. I just feel like I'm slowing things down having to reconstruct everything in my head :)
[16:19] <jdstrand> I guess there is some benefit to that. it means that the code, comments and spec need to be clear ;)
[16:20] <jdstrand> but we can manage that *and* have things documented :)
[16:20] <jdstrand> zyga: /sys/fs/cgroup/systemd is used in a v2 only world?
[16:20] <zyga> no, in v2 world there's just one directory /sys/fs/cgroup and it's always a v2 mount
[16:20] <zyga> it has all the roles combined
[16:21] <jdstrand> that's right. jeez
[16:21] <zyga> [zyga@fedora31 ~]$ mount | grep cgroup
[16:21] <zyga> cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate)
[16:21] <jdstrand> zyga: ok, dumb question, why is /sys/fs/cgroup/systemd empty on my eoan system?
[16:21] <zyga> empty as in empty directory?
[16:22] <zyga> I see plenty of stuff on eoan there
[16:22] <jdstrand> meh, I fat fingered it. sorry
[16:22] <zyga> 1K entries
[16:22] <zyga> ah, ok :)
[16:22] <zyga> uff :)
[16:22] <zyga> I checked with stgraber and xnox and I think this is something we can rely on
[16:23] <zyga> unified (hybrid), pure v2 or name=systemd
[16:23] <jdstrand> it is great to have checked with them :)
[16:25] <zyga> I should chop and push the rest of the changes
[16:25] <zyga> need to add some more tests to refresh.go as well
[16:26] <zyga> jdstrand: I have the entire refresh app awareness ported over to this now
[16:26] <zyga> maybe tomorrow I can glue the gui to it
[16:26] <zyga> and have the whole thing as a PR for vancouver
[16:27] <zyga> the GUI is ready in another branch but the critical element I need to think about and (re)write is how snapd notices it's safe to refresh now
[16:30] <sil2100> mvo: yw! If anything else pops up, just give me a poke
[16:31] <ijohnson> zyga: when you have branches ready for this stuff, please request reviews from me :-)
[16:31] <zyga> ijohnson: your wish is my command
[16:31] <zyga> ijohnson: please review two first PRs
[16:31] <zyga> https://github.com/snapcore/snapd/pull/7722
[16:31] <zyga> https://github.com/snapcore/snapd/pull/7724
[16:31] <mup> PR #7722: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[16:31] <mup> PR #7724: cmd/snap-confine: tracking processes with classic confinement <Created by zyga> <https://github.com/snapcore/snapd/pull/7724>
[16:31] <zyga> ijohnson: I'll add some more unit tests and commit the rest into several branches
[16:31] <ijohnson> zyga: ack, sounds good!
[16:31] <zyga> or maybe
[16:32] <zyga> one branch and several patches
[16:32] <zyga> it's better if the next stage is one PR as it's consistent this way
[16:32] <ijohnson> zyga: still finishing up pedronis' greedy plugs PR review then I can switch to yours I think
[16:32] <zyga> perfect
[16:33] <zyga> I will have the third one that builds on this today
[16:33] <zyga> code wise it's not big
[16:33] <zyga> removes some now-dead code
[16:33] <zyga> changes a few places, nothing huge
[16:47] <mvo> sil2100: is it easy for you to trigger a new ubuntu-image build into edge? it seems like we have the code all landed now?
[16:51] <mvo> Chipaca: silly question - with your grub.cfg and if we boot with qemu in UEFI mode - do we actually need anything to make the system bootable (i.e. do we need to tweak seed.go and friends at all?). it's an ESP partition and has the right binary names, so it should just work(tm), yes?
[16:51] <mvo> Chipaca: which would be pretty nice, so we may get a booting recovery today afterall :)
[16:52] <mup> PR snapd#7726 opened: RFC: change how snapd tracks processes <Created by zyga> <https://github.com/snapcore/snapd/pull/7726>
[16:53] <zyga> ijohnson, jdstrand: https://github.com/snapcore/snapd/pull/7726
[16:53] <mup> PR #7726: RFC: change how snapd tracks processes <Created by zyga> <https://github.com/snapcore/snapd/pull/7726>
[16:53] <zyga> This is the RFC with all the changes that are going in, I'll be splitting the "many" patch next but I need a break for coffee
[16:53] <ijohnson> nice
[16:54] <mup> PR snapd#7715 closed: overlord: add base->base remodel undo tests and fixes <Priority 🏇> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/7715>
[16:55] <mup> PR snapd#7438 closed: devicestate: add support for base->base remodel <Priority 🏇> <Remodel 🚋> <Created by mvo5> <Closed by mvo5> <https://github.com/snapcore/snapd/pull/7438>
[16:55] <zyga> mvo: ^ proooogress :)
[16:55] <zyga> mvo: it has feature parity with master but works in more places
[16:55] <zyga> I need to bring the patch that extended the LXD test to use this now
[16:55] <zyga> to have sane sleep tonight
[16:55] <zyga> but first
[16:55] <zyga> coffee
[16:55] <zyga> really
[16:56] <zyga> brb
[16:57] <mvo> zyga: woah, that sounds pretty great
[17:15] <ijohnson> hmm mvo, jdstrand, or Chipaca are y'all available to discuss pedronis's greedy plugs PR with me a little bit? I'm sorta confused on what some of these tests are doing cause it feels like a test there is testing greedy _slots_ when I thought this PR was strictly about greedy _plugs_
[17:16] <ijohnson> (this is about #7652 btw)
[17:16] <mup> PR #7652: o/ifacestate,interfaces,interfaces/policy: slots-for-plug: * <Priority 🏇> <Created by pedronis> <https://github.com/snapcore/snapd/pull/7652>
[17:29] <sil2100> mvo: ah, crap, let me poke it
[17:29] <sil2100> mvo: it operates on a github mirror on launchpad, probably stupid thing didn't update yet
[17:29] <sil2100> I always forget there's this stupid delay
[17:39] <popey> jdstrand: where is the bug tracker for snappy-debug?
[17:39] <popey> https://launchpad.net/snappy-hub looks right, but not configured
[17:55] <zyga> re
[17:57]  * cachio breack
[18:00] <zyga> did you guys notice that travis started showing the CPU column
[18:00] <zyga> interesting
[18:00] <zyga> (the CPU architecture)
[18:08] <jdstrand> popey: the forum currently. snappy-hub should have it configured, but I don't have the ability to do that
[18:13] <Chipaca> zyga: for about a month now, since they started offering arm builders (runners? workers?)
[18:13] <zyga> yeah, I think it makes sense given their overall direction
[18:13] <zyga> just cute with nice icon :)
[18:13] <jdstrand> popey: probably the snapcraft category
[18:13] <Chipaca> ijohnson|lunch: not today :-|
[18:14] <Chipaca> for me i mean; others may :)
[18:14] <Chipaca> mvo: I'm trying to determine that. I can probably make a grub.cfg that will set the boot vars up to boot in core18 mode, which seems to be needed for now
[18:15] <Chipaca> mvo: unless by "boot" you mean "reach initramfs" in which case yes, sure, done :)
[18:34] <zyga> https://blog.quarkslab.com/analysis-of-qualcomm-secure-boot-chains.html <- interesting
[18:34] <zyga> mvo: ^ share this with claudio if I forget
[18:36] <Chipaca> cmatsuoka: ^^
[18:36] <Chipaca> zyga: like that?
[18:36] <zyga> oh, is my memory failing me or did claudio change his irc nick
[18:37] <zyga> man I could just be tired
[18:37] <zyga> Chipaca: thank you sir :)
[18:37] <Chipaca> zyga: hey, maybe i'm wrong and i've been sharing stuff with the wrong irc persona
[18:58] <jdstrand> zyga, mvo (cc xnox): I did a first pass at pr 7722. the most important comment is my Delegate= observation wrt https://systemd.io/CGROUP_DELEGATION.html in https://github.com/snapcore/snapd/pull/7722#pullrequestreview-312613176
[18:58] <mup> PR #7722: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[18:58] <zyga> looking
[18:59] <mvo> Chipaca: "boot" means initramfs for now
[18:59] <mvo> Chipaca: no need to do anything uc18 compatible
[18:59] <zyga> jdstrand: we looked at scopes with mborzecki
[18:59] <mvo> Chipaca: I think we will figure out that we have no partitions and no run mode so we assume install and call snap-bootstrap
[18:59] <zyga> jdstrand: and AFAIR we cannot use them
[18:59] <jdstrand> zyga: scopes with Delegate?
[19:00] <zyga> delegate breaks service management
[19:00] <zyga> AFAIK
[19:00] <mvo> zyga: reading
[19:00] <zyga> if we carve a place and say that we are a delegate there
[19:00] <zyga> systemd doesn't handle any service ops anymore
[19:00] <zyga> we had a look at delegate earlier and it was deemed ok to run things like a container there
[19:00] <zyga> but not things that look like they are "on" the host
[19:00] <jdstrand> zyga: I'm not sure why that side-effect would be there. The Delegate as documented in that page is referring to cgroups
[19:00] <zyga> we talked to systemd upstream about it
[19:01] <jdstrand> zyga: that doesn't mean that side-effect is *not* there, just that the page didn't mention it (unless I missed it)
[19:01] <mvo> sil2100: no worries, if the mirror picks it up by itself by tomorrow thats totally fine, was just curious :)
[19:01] <zyga> it's late and maciek is not here so we'll pick it up first thing tomororw
[19:01] <mvo> sil2100: and it looks like it got built \O/
[19:02] <jdstrand> zyga: well, there are 3 options according to that page. I just have real concerns that systemd may decide to move things, since, well, in that page it said it could
[19:02] <mvo> Chipaca: so feel free to share the grub.cfg, I think we can then unblock xnox to hack on initramfs :)
[19:03] <jdstrand> zyga: if we are going against their recommendations, I think at the very least we need to document why and have some assurances that it won't just immediately break
[19:03] <jdstrand> zyga: again, I'm not saying it will break, I just have a lot of concerns based on that page
[19:03] <jdstrand> (which is why I left a comment rather than a request-changes)
[19:04] <zyga> yeah, I don't blame you for it
[19:04] <zyga> it's super tricky part
[19:04] <zyga> I don't think we can use delegate at all
[19:04] <zyga> because:
[19:04] <zyga> it seems to require a unit file, that's okay for services but not for hooks and apps
[19:04] <jdstrand> zyga: I guess it feels like we are trying to play nice with systemd in general, letting it do its thing, except we slide this bit in
[19:05] <zyga> for non-service things we could use the dbus API to have systemd spawn them but then we don't have the semantics we had before
[19:05] <zyga> (like having connected tty and things like that)
[19:05] <zyga> if we have a special unit for snaps then we no longer have the way to set any per-user constraints
[19:06] <zyga> so if a user has limit on some memory
[19:06] <cmatsuoka> Chipaca: even for just booting in run mode, core18-style, a few changes inside the initramfs were necessary in the spike. to support installation the changes were fairly extensive
[19:06] <zyga> all they need to do is to run a snap, that would be moved to a delegated place that is not in their user tree anymore
[19:06] <jdstrand> zyga: how I thought of it, and again, this is without trying anything, is there is a scope unit with Delegate= in it that is part of snapd install. that sets up a /sys/fs/cgroup/something/snapd directory. because Delegate is used, snapd will ignore that
[19:06] <zyga> jdstrand: I acknowledge the fact that all three options are suboptimal and we are bending the rules slightly
[19:07] <zyga> (systemd will ignore that)
[19:07] <jdstrand> zyga: then snap-confine can create the subdirs under that
[19:07] <zyga> we tried that approach earlier
[19:07] <Chipaca> mvo: sweet! will do that this pm
[19:07] <zyga> jdstrand: it breaks things like you log out and things get killed
[19:07] <Chipaca> cmatsuoka: that's a relief because it wasn't working :-)
[19:07] <zyga> jdstrand: and resource management associated with this
[19:07] <zyga> jdstrand: AFAIK, I could be wrong because layers-of-complexity, this also breaks things like systemctl stop
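For reference, the Delegate= arrangement jdstrand sketches above would look roughly like this as a unit file. This is a hypothetical fragment: the unit name and paths are made up, and per systemd.resource-control(5) Delegate= is only honored for service and scope units:

```ini
# Hypothetical unit for the delegated-subtree idea: systemd hands over
# this unit's cgroup subtree and promises not to manage anything below
# it; snap-confine would then create per-snap subdirectories there.
[Unit]
Description=Delegated cgroup subtree for snap tracking (sketch)

[Service]
ExecStart=/bin/sleep infinity
Delegate=yes
```

As the discussion notes, the catch is that processes moved into such a delegated subtree leave their user/session slices, which is exactly the logout-kill and resource-accounting breakage zyga describes.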
[19:08] <Chipaca> mvo: cmatsuoka: also i'm actually washing dishes, not working on grub.cfg, at this moment
[19:08] <zyga> I'll double check with maciek tomorrow
[19:08] <Chipaca> s/actually/supposedly/ :)
[19:08] <cmatsuoka> cmatsuoka: the labels are different now, so at least these will need to be handled differently
[19:09] <mvo> Chipaca: heh, thats fine - I try to bring the kids to bed too
[19:09] <zyga> jdstrand: some of those options also require snap-confine to make dbus calls as well, meaning that snapd depends on dbus for all "snap run" commands
[19:09] <mvo> Chipaca: just want something to play with in my morning ;)
[19:09] <Chipaca> zyga: what's the significance of the maciej → maciek change?
[19:09] <zyga> jdstrand: right now what we have is a compromise upon compromise
[19:09] <jdstrand> zyga: I mean, I could be wrong. *but* that page talks about Delegate and references resource controls, etc. it doesn't mention service control
[19:09] <zyga> Chipaca: which change specifically?
[19:10] <zyga> jdstrand: I could be wrong too but we explored this page, all three options, including discussing this with systemd developers
[19:10]  * cachio afk
[19:10]  * Chipaca afks too, before the dishes take over
[19:10] <jdstrand> zyga: yes, only option 2 is available then. I just don't have experience with Delegate=
[19:10] <zyga> jdstrand: I honestly think that we need a 4. expand systemd to have things like "portable services" delivered by things other than systemd
[19:10] <jdstrand> zyga: upstream said to use your current approach?
[19:11] <zyga> where there's a runtime and there are rules on how systemd and the runtime interact
[19:11] <popey> jdstrand: ok, will start a forum thread
[19:11] <zyga> jdstrand: no, they did not, we only evaluated the 1-3 options with them and it was an imperfect fit in all cases
[19:11] <zyga> jdstrand: what was picked was the current proposal, after experiments and a casual check with xnox
[19:12] <jdstrand> https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#Delegate= only is talking about control groups. it seems odd that service control would fall apart with it
[19:12] <zyga> jdstrand: it's based on control groups
[19:12] <jdstrand> again, not saying that there aren't side-effects, just they aren't documented in what I am seeing
[19:12] <zyga> I see
[19:13] <zyga> jdstrand: btw, I replied on one comment
[19:13] <zyga> https://github.com/snapcore/snapd/pull/7722#discussion_r343271853
[19:13] <mup> PR #7722: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[19:13] <zyga> I'll get to the rest later, I'll wrap up the day soon after adding some more unit tests
[19:13] <jdstrand> zyga: oh, you're saying that systemd's service management is all around how it configures the cgroups for tracking?
[19:13] <zyga> yes
[19:13] <jdstrand> ok, that makes sense
[19:13] <jdstrand> unfortunate, but makes sense
[19:14] <zyga> I think it requires upstream love as well
[19:14] <zyga> as in, us working upstream
[19:14] <zyga> on something that is not just a "doesn't break yet" hack
[19:14] <zyga> but something documented to work
[19:15] <jdstrand> zyga: so, in PR 7722, I was with you all the way until sc_cgroup_create_and_join(current_hierarchy_path, security_tag, pid);
[19:15] <mup> PR #7722: cmd/snap-confine: add sc_join_sub_group <Created by zyga> <https://github.com/snapcore/snapd/pull/7722>
[19:16] <zyga> hmm?
[19:16] <zyga> and what happened there?
[19:16] <zyga> as in what does this do?
[19:16] <zyga> or something else?
[19:16] <jdstrand> zyga: then, since I *just* read that page, I got worried
[19:17] <zyga> ah :)
[19:17] <zyga> I understand
[19:17] <jdstrand> I was going to ask a question based on that comment, but then need to think more
[19:17] <zyga> I think we could use help from foundations
[19:17] <zyga> to find _a_ way of doing what we want
[19:17] <zyga> all the approaches we found so far are stretching the protocol
[19:17] <zyga> or outright don't work in v2 world
[19:18] <zyga> jdstrand: oh and btw:
[19:18] <zyga> jdstrand: it works in lxd :) https://www.irccloud.com/pastebin/g3GPX3Nc/
[19:18] <zyga> brb, family calling
[19:19] <jdstrand> yes, I believe it would work, it is just, what is systemd going to do after we moved a pid in there
[19:20] <jdstrand> I read the cgroups man page and was highly optimistic
[19:20] <jdstrand> then I read the systemd cgroups delegation page and got worried
[19:22] <jdstrand> zyga: curious, with your patch as is and moving pids into the new leaf dir, does service management work ok?
[19:22] <jdstrand> zyga: like, if you move the pid, does systemd lose track of it?
[19:22] <zyga> re
[19:23] <zyga> jdstrand: yes
[19:23] <zyga> jdstrand: because it recursively scans the tree it created
[19:23] <jdstrand> yes, as in it works I guess. and, it doesn't put it back?
[19:24] <zyga> put it back?
[19:24] <zyga> no, it doesn't seem to move anything after the initial setup at startup
[19:25] <zyga> I guess it forks, sets up everything and execs
[19:25] <zyga> and the fact we move in our exec chain doesn't bother it
[19:25] <jdstrand> zyga: this is blackbox investigation, not documented/recommended/code inspected/whatever ?
[19:25] <zyga> yes
[19:26] <zyga> we did some code inspections
[19:26] <zyga> we did experiments and discussed stuff as well
[19:26] <zyga> but it's not documented proper behavior
[19:26] <jdstrand> note that https://systemd.io/CGROUP_DELEGATION.html says: "The fact that systemd keeps these hierarchies in sync means that the legacy and hybrid hierarchies are conceptually very close to the unified hierarchy. In particular this allows us to talk of one specific cgroup and actually mean the same cgroup in all available controller hierarchies. E.g. if we talk about the cgroup /foo/bar/ then we actually
[19:26] <jdstrand> mean /sys/fs/cgroup/cpu/foo/bar/ as well as /sys/fs/cgroup/memory/foo/bar/, /sys/fs/cgroup/pids/foo/bar/, and so on."
[19:28] <jdstrand> I'm concerned about /sys/fs/cgroup/unified/user.slice/user-1000.slice/user@1000.service/foo.service/snap.foo
[19:28] <jdstrand> != /sys/fs/cgroup/unified/user.slice/user-1000.slice/user@1000.service/foo.service
[19:28] <zyga> in some sense that is a problem, as you see it, but if you put constraints on /sys/fs/cgroup/unified/user.slice/user-1000.slice/user@1000.service/foo.service they apply underneath as well
[19:29] <zyga> the only difference is that cgroup.procs doesn't contain the service process there (no processes in intermediate nodes rule)
[19:30] <jdstrand> zyga: right, unless you reduce in foo.service/cgroup.subtree_control
[19:31] <jdstrand> anyway, it sounds like everyone has discussed this. I think there may be room for more investigation/confirmation that the approach should work with systemd as designed, and then document all this in code comments
[19:31] <zyga> jdstrand: it's not rock solid design, it's just either this or we have no way to deliver the feature at all; it's a fallback at this stage
[19:31] <jdstrand> ie, say why we can't use the recommendations in https://systemd.io/CGROUP_DELEGATION.html but why what we are doing should work
[19:32] <zyga> jdstrand: it might well be that someone comes along and says "it cannot work for this-and-that-reason" and they will be right
[19:32] <zyga> that's a good remark, I'll make sure this is documented
[19:32] <jdstrand> zyga: that is what I am afraid of :( I would hate to deliver this and then have it buggy, break down the line with no recourse, etc
[19:33] <zyga> jdstrand: I had one more idea that doesn't rely on cgroups
[19:33] <zyga> jdstrand: but it's bad for other reasons
[19:33] <zyga> jdstrand: I implemented a prototype for it
[19:33] <zyga> jdstrand: but the rough idea is that snap-confine is a service manager and watches a subtree of processes as a subreaper
[19:33] <jdstrand> zyga: I'll mention that there is something that works on systems with apparmor: look at the security label. it is not atomic but you will get what you want
[19:34] <zyga> jdstrand: that's true
[19:34] <zyga> jdstrand: in fact any pid-tagging system works
[19:34] <zyga> jdstrand: we effectively use this as pid tagging
[19:34] <jdstrand> zyga: on Ubuntu Touch the security label was key in process lifecycle
[19:35] <zyga> jdstrand: let me just briefly finish the problems I found with the mini-service-manager idea
[19:35] <zyga> jdstrand: to retain the feeling that snap-{run,confine,exec} chain just execs your program in the end
[19:35] <jdstrand> zyga: with pid tagging, are you talking about 'ptags'?
[19:36] <zyga> ptags?
[19:36] <zyga> I meant it as anything that can associate some information with a process
[19:36] <jdstrand> I just googled pid tagging and found https://lwn.net/Articles/702639/
[19:36] <zyga> I was told there's no such thing yet
[19:37] <zyga> jdstrand: (contd.) to retain the feeling that it's just a chain of exec, the service manager part forks off and forks off the application process
[19:37] <zyga> watching for deaths of all the processes in that subreaper tree
[19:37] <jdstrand> it doesn't seem committed, no. but, LSMs can do that (as mentioned, apparmor security label)
[19:38] <zyga> and relaying signals and exit status to the stub process
[19:38] <zyga> it's ugly and I would not want to ship it but it did the job somewhat
[19:38] <jdstrand> zyga: a fork or fork/exec?
[19:39] <zyga> jdstrand: indeed though scanning proc is even worse than scanning cgroup
[19:39] <jdstrand> zyga: it is not great, no, but if cgroups are a no go...
[19:39] <zyga> jdstrand: I can show you the rough idea on a piece of paper, hold on, it might be easier
[19:40] <jdstrand> I'll admit it isn't exciting thinking about snap-confine as a service
[19:40] <zyga> it would be closer to lxd model
[19:40] <zyga> where such things are used
[19:40] <zyga> https://usercontent.irccloud-cdn.com/file/ND62Rwl7/snap-confine-service-manager.jpeg
[19:41] <zyga> jdstrand: the surogate is what others will track as "main pid"
[19:41] <zyga> surrogate*
[19:41] <jdstrand> lxd could actually use the Delegate= idea since it does all its management
[19:41] <zyga> I'm sure it is doing that
[19:41] <zyga> it feels like a use case for that exactly
[19:42] <jdstrand> indeed
[19:42] <zyga> jdstrand: so the diagram has the entry at the top, the surrogate waits for the real application to exit or die to relay the wait status
[19:42] <jdstrand> LXD is a full container manager, which that page is talking about
[19:42] <zyga> jdstrand: the fork chain to the left reparents the service manager and application parts
[19:42] <zyga> the service manager becomes a subreaper
[19:43] <zyga> forks off to exec the app and loops on waitpid(-1, ...)
[19:43] <jdstrand> this is PR_SET_CHILD_SUBREAPER?
[19:43] <zyga> as long as there are things to wait for it will collect
[19:43] <zyga> yes
[19:43] <zyga> if it collects the app pid it relays that to the surrogate via eventfd
[19:43] <zyga> if it collects other things it just carries on
[19:43] <zyga> eventually nothing more is needed and it quits
[19:44] <zyga> it could also monitor the surrogate and kill the real app (I didn't implement that)
[19:44] <zyga> on the up side it doesn't use cgroups :)
[19:44] <jdstrand> aiui, it is going to lose zombies
[19:44] <zyga> in fact it doesn't touch the filesystem
[19:44] <zyga> why?
[19:46] <zyga> jdstrand: (for a moment I also considered setting timer slack nanosecond to a unique value as a cookie but that's an idea from hell)
[19:46] <zyga> jdstrand: waitpid will grab dead children from things forked by the app as well
[19:46] <jdstrand> waiting on a pid to die or exit. that pid forks off stuff and dies. the zombies get reassigned to init. but reading the man page for PR_SET_CHILD_SUBREAPER, I guess that isn't true
[19:46] <zyga> (that's what a subreaper does effectively)
[19:46] <jdstrand> right
[19:46] <zyga> yeah, that's the whole idea
[19:46] <zyga> and the fact that you can wait on your children's children this way
[19:46] <zyga> and know there are no more
[19:46] <zyga> (ECHILD)
[19:47] <zyga> I'm sure there are dragons in this idea
[19:47] <zyga> I just feel I ran out of anything resembling ideas at this stage :)
[19:47] <zyga> and that was the last one
[19:48] <jdstrand> zyga: we could probably make it 'ok' from a security perspective through dropping, changing profiles, etc, etc. but it does feel heavyweight since we always have two processes per started snap command
[19:48] <zyga> jdstrand: I also considered the manager to get the pid (pidfd) via socket
[19:48] <zyga> I didn't implement that
[19:48] <zyga> but I think this mode breaks subreaper
[19:48] <zyga> and waiting
[19:48] <zyga> it's only good for "have a handle to kill that guy"
[19:48] <zyga> not for anything useful
[19:48] <zyga> (beyond that)
[19:49] <zyga> it's really a shame that we cannot just treat processes as files all the way with the right mechanics
[19:49] <zyga> but I guess by the time I retire people will argue about it and it will be the problem we are at today + lots of new related syscalls
[19:49] <jdstrand> well, if snapd was the parent... but that is not desired due to design constraints
[19:49] <zyga> yes
[19:49] <zyga> we're between a hard place and initial design
[19:50] <zyga> or something like that
[19:50] <zyga> it's funny that in 2019, it's hard to answer "is this running" reliably because of how everything is interconnected
[19:51] <jdstrand> zyga: I'd like you to... reconsider is probably strong... but not rule out LSM labelling as a part of a potential solution. remember, on any system with any apparmor enabled, classic, devmode and strict all get it. it is possible to start snaps under selinux with a per-snap label as well
[19:52] <zyga> interesting
[19:52] <jdstrand> zyga: if the cgroup thing doesn't work out that is.
[19:52] <zyga> so are you saying that we could use selinux tags for a process that somehow map to snap security tags
[19:52] <zyga> (selinux labels, sorry)
[19:52] <zyga> I'm not ruling it out, if you had that impression
[19:53] <jdstrand> zyga: it's hand-wavy, but I was more saying that snapd is updated to launch every snap under its own label rather than under the shared label/unconfined
[19:53] <zyga> hmm
[19:53] <zyga> please refresh my memory
[19:53] <zyga> classic confined snaps get a label with a permissive profile, correct?
[19:54] <jdstrand> it would require work, but you can see we just define policy for each snap that is installed that is the same as what we have now
[19:54] <zyga> jdstrand: related to classic confinement: https://github.com/snapcore/snapd/pull/7724
[19:54] <mup> PR #7724: cmd/snap-confine: tracking processes with classic confinement <Created by zyga> <https://github.com/snapcore/snapd/pull/7724>
[19:54] <jdstrand> (selinux there)
[19:54] <jdstrand> zyga: re classic: *yes*
[19:54] <zyga> jdstrand: mm
[19:54] <zyga> jdstrand: well, I'll get over the whole design with mborzecki tomorrow
[19:54] <zyga> jdstrand: as long as we can answer our UI driven questions I won't complain :)
[19:55] <zyga> jdstrand: this feature looked so good on paper initially
[19:55] <zyga> jdstrand: it's such a maze in reality
[19:55] <jdstrand> $ ps auZ|grep test-classic
[19:55] <jdstrand> snap.test-classic.sh (complain) jamie     5593  0.1  0.0   9604  3436 pts/16   S    13:54   0:00 /bin/bash /snap/test-classic/x1/bin/sh
[19:55] <zyga> that's good
[19:55] <jdstrand> $ snap list|grep test-classic
[19:55] <jdstrand> test-classic                            0                                 x1     -          -                  classic
[19:56] <jdstrand> zyga: yes, that was part of the design so we'd have tracking throughout the system
[19:56] <zyga> jdstrand: I wonder if 20.10 will go to v2 mode
[19:56] <zyga> jdstrand: 20.04 is totally premature
[19:56] <zyga> but after, hmm
[19:57] <jdstrand> zyga: so, if cgroups works, yay, just please document. if not, bummer since that is a great way to do this, but could limit the feature to apparmor-enabled systems initially, then expand to selinux
[19:57] <zyga> jdstrand: yes, I think this is a good idea
[19:57] <zyga> have a look at the other PR
[19:57] <zyga> it's much smaller in scope and risk
[19:58] <zyga> and relatively tiny in size too
[19:58] <zyga> landing it will bring me one step closer to whatever we do in the end :)
[19:58] <jdstrand> zyga: I lost it in backscroll. number?
[19:58] <zyga> https://github.com/snapcore/snapd/pull/7724/files
[19:58] <mup> PR #7724: cmd/snap-confine: tracking processes with classic confinement <Created by zyga> <https://github.com/snapcore/snapd/pull/7724>
[19:58] <zyga> tl;dr version is it makes classic confinement use a pids= controller for tracking
[19:59] <zyga> with the intent of swapping out that for the one in the other branch
[19:59] <zyga> as can be seen in the WIP draft that has everything
[20:02] <jdstrand> zyga: sorry about my concerns and not having a lot of answers there. interactions between systemd, snapd (and container managers like k8s, docker, greengrass, etc, etc) is always... frustrating?
[20:02] <zyga> yes :)
[20:02] <zyga> no need to be sorry
[20:02] <jdstrand> zyga: but, at the very least, perhaps we can get some assurances that the proposed idea should work
[20:03] <jdstrand> and knowing sooner if it won't is definitely better than not :)
[20:05] <zyga> jdstrand: I wish I checked better earlier
[20:06] <jdstrand> well, this is a minefield
[20:07] <jdstrand> and hey, that's what reviews are for. we caught this stuff before it went live
[20:22] <zyga> ijohnson, jdstrand: I've updated https://github.com/snapcore/snapd/pull/7726 to have more tests for PidsOfSnap (in overlord/snapstate/refresh.go) and I will call it a day
[20:22] <zyga> I've spawned another test run
[20:22] <mup> PR #7726: RFC: change how snapd tracks processes <Created by zyga> <https://github.com/snapcore/snapd/pull/7726>
[20:23] <zyga> ijohnson: you will see that jdstrand did a thorough review of the main PR already but please do your review if time permits
[20:23] <zyga> with that, I'm wrapping up :)
[20:24] <zyga> in my free time I will write something easy, like a toy compiler maybe
[20:24] <zyga> at least that's like, rewind to 60s and read a book
[20:26] <zyga> jdstrand: oh, if we decide to pick a different hierarchy at some point
[20:26] <zyga> jdstrand: e.g. someone runs new systemd and it gives us unified or something
[20:26] <zyga> jdstrand: that's ok too
[20:27] <zyga> jdstrand: because we scan all of /sys/fs/cgroup
[20:27] <zyga> but I really need to wrap up and rest
[20:27] <zyga> ttyl and thank you for the focus jdstrand!
[20:30] <ijohnson> ack zyga
[20:30] <ijohnson> Have a good night and get good rest!
[20:38] <jdstrand> zyga: heh, take care :)
[21:07] <mup> PR snapd#7725 closed: tests/lib/state: snapshot and restore /var/snap during the tests <Simple 😃> <Created by bboozzoo> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/7725>
[22:34] <mup> PR snapcraft#2793 closed: tests: enable SNAPCRAFT_BUILD_INFO for spread <Created by cjp256> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2793>