/srv/irclogs.ubuntu.com/2019/07/09/#snappy.txt

scoopexthe rocketchat snap package is missing two rules (https://github.com/RocketChat/Rocket.Chat/issues/14562). how can i add them?04:52
scoopexsorry, i mean apparmor rules04:52
jameshscoopex: rebuilding the snap with the network-observe plug would probably do the trick.04:56
jameshthat exposes a lot more though.04:57
mborzeckimorning05:23
=== morphis2 is now known as morphis
zygaHello06:23
mborzeckizyga: hey06:24
zygaHow are you?06:31
zygaWe had first rain in weeks06:31
zygaIt is still raining now06:31
zygaHey mvo06:42
mvohey zyga06:42
mborzeckimvo: hey06:46
mvohey mborzecki ! thanks for your decompile of the bpf in the performance thread06:47
mborzeckimvo: sure, do you have the link to the image you used with volatility?06:49
mborzeckimvo: i'm inclined to try rebuilding a couple of versions of libseccomp, check the bpf they generate and how it impacts the performance06:50
mvomborzecki: i used http://amnesia.gtisc.gatech.edu/~moyix/ds_fuzz_hidden_proc.img.bz206:50
mvomborzecki: it was more or less a random choice in their example image section06:50
mvo(its one of the smaller ones though)06:51
mborzeckimvo: hah, i could probably write a small benchmark but we could equally well use volatility :)06:51
mvomborzecki: yeah, volatility is nice as it exhibits the issue easily06:55
zygaWimpress: hello07:05
zygaWimpress: please ping me when you are working, we need to sync07:05
zygahttps://github.com/snapcore/snapd/pull/7062 is green and needs a 2nd review (re-enabling test)07:07
* zyga is a little sleepy after uberlong day07:11
zygaon the upside the new bed is ok07:11
zygapstolowski|afk: neat stuff07:14
=== pstolowski|afk is now known as pstolowski
pstolowskimornings, hey zyga07:14
zygahey :)07:15
zygagood morning07:15
zygapstolowski: it was raining all night07:15
zygaevery bit of soil and plant around here needed that07:15
pstolowskizyga: it has been rainy and cloudy here as well for over a week.. and it's actually a little chilly07:18
zygapstolowski: I kind of miss that now07:19
zygabrb07:27
zygacoffee and back to work soon07:36
zygapstolowski: I'll reivew your PR first07:36
pstolowskity07:36
zygapstolowski: thanks!07:41
zygacoffee-in-hand, review mode :-)07:42
pstolowskiyw07:42
pedronisjamesh: hi,  could you work to try to land 6959 ?07:43
mborzeckipstolowski: little chilly, 16C right now here :/07:52
zygamborzecki: wow07:57
zygapstolowski: can you help me understand the code in the PR please07:57
zygapstolowski: what sets skip-setup-profiles?, I cannot see it07:57
zygaEDIT: d'oh07:57
zygasorry, typo in search07:57
zygaI should finish that coffee, I see it now07:57
pedronispstolowski: I moved some cards related to the interface connection work, doesn't seem you have a card in doing about it though07:58
pstolowskizyga: good08:03
pstolowskipedronis: indeed, thanks08:04
zygapstolowski: I'd like to run a scenario by you,08:10
zygapstolowski: (still assembling the scenario, I'll get back to you)08:14
mborzeckimvo: i'm experimenting a bit with versions of libseccomp, so the impact is marginal08:14
mborzeckimvo: i've checked 2.4.1, 2.3.3, latest master and the binary tree branch08:15
mvomborzecki: I wonder if we could do a lookup table or if bpf is too limited08:18
mborzeckimvo: with ebpf yes, cbpf is limited08:18
mborzeckimvo: sadly seccomp is cbpf08:18
mborzeckimvo: if you're interested, this is the profile with 2.4.1 https://paste.ubuntu.com/p/wRTDmq39Sw/ and this one is with binary tree patches https://paste.ubuntu.com/p/Q9VYyhchRR/08:21
Wimpresszyga: Morning08:40
zygaWimpress: hello08:41
pstolowskipedronis: i'm removing ...== "snapd" checks; wondering what to do about client-side filtering and special-casing around connections/interfaces09:06
pedronispstolowski: we are mostly interested about places that were using .Type for anything else09:16
pedronispstolowski: client side, not sure, we had a lot of back and forth there09:17
pedronispstolowski: what's the special casing around connections/interfaces ?09:17
pstolowskipedronis: it's mostly about system nickname afaiu09:18
pedronispstolowski: that goes through the mapper, no?09:18
pedronisthe mapper decision could use the type now09:18
pstolowskipedronis: maybe, but that may not solve everything, e.g. cmd_connections uses different format for system snaps, i.e. it omits snap name for system snaps, and uses just the name match09:20
pedronispstolowski: if we do things correctly, selectInterfaceMapper  can use the type09:21
pedronispstolowski: maybe I wasn't clear,  the issue is not that we compare with the name in general09:21
pedronisthe issue is places that were checking the type of snapInfo for most things09:21
pedronisbut the  name for snapd09:21
pedronisbut I'm probably still confused what the problem is09:22
pedronispstolowski: are you talking about places that use SystemSnapName ?09:23
pstolowskipedronis: isn't the goal to support snapd snap whatever its name might be in the future?09:23
pedronispstolowski: not really09:24
pedronisnever said that09:24
zygaNOTE: the client side checks were there because some parts of the API did not return enough information AFAIK09:24
pedronispstolowski: the goal is to make decision sensitive based on the type09:24
pstolowskipedronis: allright. wrong assumption on my side then. there is no issue then.09:24
mborzeckimvo: added a note about compiled bpf with 2.4.1 and the binary tree branch: https://forum.snapcraft.io/t/concerns-about-performance/12194/809:27
pedronispstolowski: you are talking about isSystemSnap in cmd_connections ?09:27
pstolowskipedronis: yes. and there is some special-casing in cmd_interfaces as well. although they're both irrelevant in the light of what you just clarified09:28
pedroniswe might want to do something about that, but is not a priority09:29
Chipacapedronis: what should I do wrt the code for snaps with no health?09:32
Chipacain info09:32
Chipacapedronis: I don't have a strong opinion on this :)09:33
Chipacahaving a code seems more uniform, but this being snap and not snapd it's purely cosmetic09:33
* Chipaca proposes snapd-health-is-a-state-of-complete-physical-mental-and-social-well-being-and-not-merely-the-absence-of-disease-or-infirmity09:38
mvomborzecki: thank you09:41
pedronisChipaca: no code please, it's the unknown unknown case09:43
pedronisChipaca: also, code are optional, we should embrace that, not give the impression they aren't09:45
pedronisChipaca: if you think the code really adds value, then we should always set in the api, and not leave the field empty09:46
pedronisthere09:46
pedronisbut I don't think it's worth it/makes a lot of sense09:46
Chipacaagreed then, no code in this case09:48
Chipacabreak time!09:53
* Chipaca dances09:53
Chipacahttps://www.youtube.com/watch?v=otCpCn0l4Wo obvs09:53
* zyga -> breakfast09:58
zygaback10:30
zygapstolowski: I sent a partial review earlier; I cannot poke any holes in it10:37
pstolowskizyga: ty, i saw you comment. i think it's no different (in terms of risk) to anything else we do; fwtw if we made any error in that logic, we would end up with less persmissions, not more (initial setup-profiles is always done before auto-connect and link-snap). we just need to be careful as always10:40
mborzeckipstolowski: left some comments under https://github.com/snapcore/snapd/pull/708110:52
pstolowskithanks10:54
zygaWalk11:31
zygare11:39
zygaoho12:21
zygasplits12:21
zygamborzecki: one last python patch https://github.com/snapcore/snapd/pull/708212:22
* zyga switches to reviews12:31
=== ricab is now known as ricab|lunch
jdstrandmborzecki: r cbpf> for some definition of sadly, ebpf would give us a lot of grief due to spectre (and its mitigations)12:56
jdstrand(cc mvo) ^12:57
mborzeckijdstrand: as in https://seclists.org/oss-sec/2019/q1/106 ?12:58
jdstrandmborzecki: if we're looking to ease the pain of the lookup, trying to push oracle's patch forward is likely the way forward. we could do experimental builds and determine if it is good enough, then vendor libseccomp with that patch if it seemed ok12:59
jdstrandmborzecki: yes. some are disabling ebpf entirely. everyone else says only root can use it (if not running under seccomp already iirc)13:02
jdstrandmborzecki: I see you post that bintree didn't help. we could try to tweak libseccomp to put common syscalls like openat sooner.13:07
jdstrandmborzecki: and then send that upstream13:08
mborzeckijdstrand: yeah, the cost seems to be better on average, but lower number syscall seem to fare worse13:08
jdstrandand vendor it in the meantime. before vendoring or sending upstream, we'd need to run through the tests I wrote for the daemon user though...13:09
jdstrandmborzecki: I suspect if you posted your findings to the bintree one, people would try to come up with a compromise approach13:09
mborzeckijdstrand: actually that is interesting whether we should aim for all syscalls be similarly impacted by seccomp filter, rather than favoring some and penalizing others13:09
jdstrandmborzecki: since the goal is spead up when it actually slowed down13:10
mborzeckijdstrand: right, but if you do say, recvmsg/sendmsg a lot you'd see gains :)13:10
jdstrandmborzecki: well, seccomp is always going to have a measurable impact ime. what we can do to soften that... it seems plausible to put massively used ones earlier (like recvmsg/sendmsg). one would think that openat would be at the end though-- read/write/lseek at the beginning since you tend do that more than opens.13:12
jdstrandmborzecki: which means that we're always going to be slow in some pathological cases13:13
jdstrandmborzecki: if you are right about python compiling modules, shouldn't it be shipping pyc files?13:14
jdstrandmborzecki: the fact that we do a lookup at all, for volatility, it seems difficult to imagine doing better than bintree. I mean, read is very high in the list with only 6 syscalls before it, and lseek only 14. they are highly prioritized already13:20
zygajdstrand: we can  LD_PRELOAD and translate all the syscalls  to open + mmap + caching  (don't close and reopen)13:22
zygajdstrand: we can  perhaps do some tricks with multiple profiles, hand written,  that use bitmasks to quickly ACK numbers that are high up the call  frequency and then offset the cost for the NACK syscalls, maybe some sort of bloom filter approach crafted out of multiple filters that are all attached together13:23
mborzeckijdstrand: sorry for delayed replies, we're in the standup currently13:24
mborzeckijdstrand: mvo posted some info about -m compileall, but I did not apply it locall13:24
jdstrandmborzecki: it is unclear to me the frequency of the syscalls for volatility. the table tells my with <2.4, read has 0 syscalls before RET and with 2.4, 6, but how often is read called? (we know read will be high, but perhaps there is some frequently used syscall that could reasonably moved up the list)13:26
jdstrandmborzecki: mvo mentioned lseek and mmap...13:27
mborzeckijdstrand: let me grab a summary from strace13:28
mborzeckijdstrand: http://paste.ubuntu.com/p/F8PSHcscDg/13:29
jdstrandzyga: I'm not clear on your LD_PRELOAD proposal. you are saying you want to modify program behavior to sidestep the issue?13:31
mborzeckijdstrand: so read, lseek, mmap, munmap, fstat should already be handled rather early13:31
mborzeckijdstrand: openat is probably an artifact of snap-confine and perhaps python module compilation (?)13:31
jdstrandmborzecki: how easy it for you to put lseek first in the list with 2.4 and rerun?13:32
mvomborzecki: just to double check - you also see the problem go away once you add @unrestricted into the .src file and compile that, right?13:32
mborzeckimvo: yes, @unrestricted is ~14s vs 20 with seccomp set up13:32
mborzeckijdstrand: where's the list?13:34
jdstrandmvo: it's always been known there is overhead with syscall filtering. a map would sure be nice here... lseek is already prioritized and towards the top of the list. we knew pathological cases would be problematic (eg, opening lots of files over and over, etc)13:36
jdstrandmborzecki: well, yes, then I guess that is already putting it towards 'not easy' (I don't know otoh)13:36
jdstrandmvo: but volatility is executing lseek nearly a million times...13:38
mvojdstrand: yeah, I think this forum post hit exactly this pathological case13:38
mborzeckijdstrand: i'll look into how libseccomp generates bpf, i see there's some syscall sorting happening early on so that's probably the right spot :)13:38
mvojdstrand: its fine, if we can't do anything we can't do anything :) I was mostly trying to understand the details13:38
jdstrandmborzecki: it looks like there is a priority field looking at gen_bpf.h13:40
jdstranderr gen_bpf.c13:40
jdstrandmborzecki: iirc, upstream said there was more that could be done in this area, separate from bintree13:41
zygaI think that given the number of system calls it is silly a seccomp program cannot have a small array that has “yes, no, compute” action. Vast majority of times a simple lookup is sufficient13:42
zygaMaybe we need to invest in a kernel patch13:43
mvozyga: I was wondering the same - it seems the bpf machine is very limited here (maciej mentioned this earlier here)13:43
jdstrandmborzecki: db.c talks about the priority field, and talks about 'user hints'13:44
jdstrandmborzecki: which is... interesting...13:44
zygaEbpf seems to have everything we want though. Maps arrays jumps. It is just out of reach :-(13:44
zygaBut even regular bot we could use some tricks with bit mask13:45
zygaa/bot/bpf13:45
mvoyeah, was wondering that too13:46
jdstrandzyga: once we have the daemon user tests in place, perhaps we could consider using ebpf on systems that have it available. I mean, snap-confine is running as root. it will have CAP_SYS_ADMIN for using bpf()...13:47
jdstrandzyga: it wouldn't be something we would want to rush and we'd have to really justify it in my mind, cause the entire industry is using libseccomp. if we go out on our own and get it wrong, that would be, let's shall we say, less than ideal13:48
jdstrand(cc mvo) ^13:49
mborzeckijdstrand: hm seccomp_syscall_priority()13:49
zygajdstrand: but seccomp does not do eBPF, or does it?13:49
mborzeckijdstrand: let's see if i can tweak the code to call it and make it prioritize lseek13:49
jdstrandI'd much prefer to see mborzecki's findings as feedback in the bintree pr13:50
jdstrandzyga: no, it would need to be a new LSM13:50
jdstrandzyga: there are some patches for seccomp to use ebpf. that would definitely need to be evaluated :)13:51
zygaUnderstood13:51
jdstrandanyway, I suspect bintree is going to help lift everything, even if some pathological cases remain13:53
jdstrandmborzecki: yes, look at that, a C api and everything. huh. so, assuming that can make things better, it is theoretically possible for snapd to expose some knobs to tune the filter13:56
jdstrandmborzecki: eg, seccomp_syscall_priority(ctx, SCMP_SYS(lseek), 220); seccomp_syscall_priority(ctx, SCMP_SYS(read), 219);13:58
jdstrandmborzecki: with knobs, we could perhaps make pathological cases not so bad...13:59
* jdstrand is a little concerned what that might do to the added tests for snap_daemon since that whole thing was so brittle prior to 2.4 (and its defaults)14:00
jdstrandmborzecki: I of course made up 220 and 219. I don't know what the current prio is for either. maybe for your 'quick' test, 255 and 254 is best since volatility is dominated by those two14:03
mborzeckijdstrand: heh, https://paste.ubuntu.com/p/CPZ6cGkt8q/ shaved off ~1s14:06
mborzeckijdstrand: fwiw scmp_bpf_sim tells me it's fewer instructions now for those syscalls14:08
jdstrandmborzecki: hmm, only one second. that suggests the act of filtering at all causes a fairly significant slowdown (at least when calling a function a million times), with the ordering being an additional factor14:10
mborzeckijdstrand: fwiw i could possibly come up with a trace to measure how much it takes14:10
mborzeckijdstrand: had to fix the unrestricted time in the forum note, idk why i had 14s instead of 18s14:13
mborzeckijdstrand: so fiddling with bintree branch or the regular branch, we can gain a bit, but not much tbh14:14
jdstrandmborzecki: it might be interestinng for completeness since we were talking about sort order so much and that a map would help (which these finding suggest perhaps not)14:14
jdstrandmborzecki: re gain> right14:14
mborzeckijdstrand: a map of what? or an eBPF map?14:17
mborzeckijdstrand: fwiw http://paste.ubuntu.com/p/DHV5tfChq6/ disassembly output with prioritized syscalls14:18
zygathat has to be optimizable14:20
=== pedronis_ is now known as pedronis
=== ricab|lunch is now known as ricab
jdstrandmborzecki: I was saying that your run might suggest that the filtering code in the kernel is what is slow. if lseek is at the top of the list then sort order is essentially taken out of the equation for that syscall. so a(n ebpf) map isn't going to help that14:23
jdstrandmborzecki: perhaps the kernel could be improved to do caching of a previous lookup... that is likely memory intensive though14:23
jdstrandmborzecki: however, your 18s:20.5s vs 14s:20.5 is a big deal. I think we can live with that overhead for the foreseaable future. we can pull back bintree in the fullness of time and lift everyone a bit. with higher priority items out of the way, we could consider adding seccomp_syscall_priority() to the mix14:25
mborzeckijdstrand: w8, that's 18s with @unrestricted, vs 20.<closer to 0> with bintree or priorized vs 20.<closer to 1> with 2.4.114:27
jdstrandmborzecki: even better14:27
jdstrandmborzecki: point is, the delta between unrestricted and restricted is not great14:27
mborzeckijdstrand: right, the penalty is there and it's measurable14:28
jdstrandit is there, it is measurable. we knew that (we just didn't measure it). my point is that a 10% slowdown for a pathological case is not terrible14:29
mborzeckijdstrand: also, we probably need a better benchmark14:29
jdstrandand that perhaps in the future, we can get that to 5% with seccomp_syscall_priority() if that bubbles up to a roadmap item14:30
jdstrandmborzecki: re benchmark> absolutely14:30
zygamborzecki: some of the rain got here :)14:39
zygaeven storm14:41
zygafinally colder though14:41
zygalots of thunder and rain here, if I don't respond it may that my network got knocked outu14:53
zyga*out14:53
jdstrandmborzecki: fyi, I see you are responding-- I just responded again with my thoughts (perhaps you want to consider/agree/refute them in your response)15:00
pedronismborzecki: did another pass on 6890, still not super clear why the calls to checkpointPrefix and rollbackPrefix are not more symmetric15:02
mborzeckipedronis: hm must have missed that in your previous review, will take a look15:14
mborzeckijdstrand: left a note with results from more runs, so only conclusions at this point are: @unrestricted has no overhead and we need a better benchmark15:15
jdstrandmborzecki: thanks! those are concrete conclusions. I think an important additional conclusion is that sort order isn't everything and there is overhead in mediating it at all. that is useful and can guide us going forward15:17
jdstrandmborzecki: I also think that there is reasonable hope that bintree will help non-pathological cases, but that needs measuring15:18
jdstrandmborzecki: it is interesting that bintree/prioritized is slower with the averages than straight 2.4.115:21
* jdstrand updates his math15:21
jdstrandmborzecki: I wonder if it makes sense for when bintree is available, to make it chooseable. ie, add an api call to opt into/out of bintree... it seems that the upstream patch doesn't make this tunable. it only looks at MIN_SYSCALLS_TO_USE_BINTREE15:31
jdstrandmborzecki: fyi, our default template uses 368 syscalls today15:33
pstolowskidegville: hey, thanks for debug timings doc tweaks! one typo there: "snapd stores up to 100 measurements with the oldest dropped to for new measurements"; did you mean "to make room for.." ?16:08
degvillepstolowski: no problem, and yes, you're right.16:10
* zyga food16:21
pstolowskimvo: btw, #7016 needs 2nd review16:31
mvopstolowski: yes, will do16:40
pstolowskimvo: thanks16:41
=== pstolowski is now known as pstolowski|afk
cmatsuokamvo: oh well it seems that govendor doesn't like pull requests19:00
cmatsuokahas anyone used govendor with an unmerged PR?19:08
zygacmatsuoka: what do you mean?20:26
cmatsuokazyga: I have code that depends on a go-tpm PR20:27
zygayou have to update govendor's  json thing20:27
zygawe rarely do that though20:27
zygayou may also need to check if debian has that as a package20:27
zygaor otherwise part of CI will fail20:27
zygasimilar thing for fedora (sorry)20:28
cmatsuokazyga: you mean, by hand? because the govendor utility ignores the package20:28
zygano, not by hand20:28
zygaI forgot how to useit20:28
zygaAFAIK you need to "go get" it20:28
zygaand tell govendor to add it20:28
zygawhich will update the json20:28
zygalook at the history for vendor.json, maybe there are some clues20:28
zygaI honestly don't remember how to use it20:28
cmatsuokaif you govendor add the package (that's already positioned at the right branch) it's ignored, govendor doesn't add it20:29
cmatsuokabut if it's a "normal" package it works20:29
cmatsuokaaha, I can tell govendor to use the downstream tree with the upstream name20:36
cmatsuokaok, that will solve the problem (hopefully)20:37
cmatsuokayay, it worked!20:51
cmatsuokathanks zyga for making me read the FAQ again :D20:54

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!