/srv/irclogs.ubuntu.com/2018/11/28/#snappy.txt

=== chihchun is now known as chihchun_afk
=== chihchun_afk is now known as chihchun
mupPR snapcraft#2354 closed: Preflight missing multipass <Created by evandandrea> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/2354>04:08
=== chihchun is now known as chihchun_afk
mborzeckimorning06:14
zyga|afkHey06:22
zyga|afkmborzecki: how are you06:22
mborzeckizyga|afk: hey, feeling better today06:23
mborzeckizyga|afk: noticed that removable-media allows accessing/writing to /mnt, i was always under the impression that only {/run}/media is allowed there07:08
mborzeckimvo: morning07:10
mvomborzecki: good morning! hope you feel better07:10
mvozyga|afk: hey, anything new from the aa bug?07:10
mborzeckimvo: yes, way better today07:11
mvomborzecki: great!07:14
mborzeckizyga|afk: when you're here, can you take a look at https://forum.snapcraft.io/t/the-removable-media-interface/7910 I've added /mnt there since we support it, but did not mention anything about mount propagation since the interface does not allow mounting anyway07:21
=== chihchun_afk is now known as chihchun
=== pstolowski|afk is now known as pstolowski
pstolowskihey07:58
mborzeckipstolowski: hey07:59
pstolowskimborzecki: hey, how are you, feeling better?07:59
mborzeckipstolowski: yeah, thanks07:59
mvozyga|afk: woah, I got "cannot create temporary directory for /var/lib/snapd mount point: Permission denied" now on first try of mvo5/run-fontconfig-2.3608:01
mwhudsonpopey: i took your advice and made a screenshot https://snapcraft.io/go08:05
mupPR snapd#6223 closed: cmd/libsnap: move apparmor-support to libsnap <Simple 😃> <Created by zyga> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/6223>08:07
mborzeckimwhudson: nice, should include go env output too08:07
mwhudsonmborzecki: you think? i most of it is pretty boring08:08
mwhudson+think08:08
pedronismvo: hi08:08
mborzeckimwhudson: maybe go env GOROOT GOTOOLDIR?08:09
pedronismvo: anything I can help with?08:09
mwhudsonmborzecki: yeah, might make sense08:10
mwhudsonthe whole process is a bit tedious tbh, but if i redo it for some reason :)08:10
mborzeckimwhudson: that's asciinema right?08:11
mvopedronis: good morning - we are still struggling with the apparmor failure in 2.36. I did some work on the --trace-exec in 6185, please have a look but I can do more just got distracted because of the aa issue08:11
mwhudsonmborzecki: yes, plus hacking to smooth out the typing08:11
mvopedronis: but 6185 should be cleaner than before08:11
popeymwhudson: I love that!08:11
pedronismvo: ok, do we need to have a chat about the aa issues with zyga as well?08:11
mborzeckimwhudson: mhm, very nice!08:12
mvopedronis: maybe, I think he had some new ideas since last night, but I don't know more details yet08:12
pedronisok08:12
popeymwhudson: I hope it's not so big as to trigger this though ;) https://gitlab.gnome.org/GNOME/gnome-software/issues/53208:13
mwhudsonpopey: heh no08:13
mvopedronis: 5845 needs a second review (the files interface) but thats probably in good hands with jamie and zyga08:13
pedronispstolowski: are you blocked atm?  my and mvo days are a bit full of meetings and we have the 2.36 troubles08:14
pedronispstolowski: I still would like the 3 of us to talk about hot plug next step this week though, but tomorrow might be better08:14
pstolowskipedronis: not really, it can wait till tomorrow (not sure i can get 2nd review of the hotplug disconnect branches from mvo under these circumstances anyway). i've also been working on one of the 2.36 blockers08:30
pedronispstolowski: ok, btw I answered in https://bugs.launchpad.net/snapd/+bug/1777121 it makes sense to pick it up, make a card and work on it when you have time08:32
mupBug #1777121: Remove is called after snap services are stopped  <snapd:Triaged> <https://launchpad.net/bugs/1777121>08:33
pstolowskipedronis: great, i'll take it08:33
pedronispstolowski: can I help somehow with that panic?09:05
pstolowskipedronis: no, thanks, i think that's fixed, except cannot land due to other random test failures09:06
mvopedronis: the panic is interesting, mocking the usr.lib.snapd.snap-confine.real makes the problem go away - it almost looks like there might be a connection to the other bug we are looking at09:07
zyga|afkYep09:07
zyga|afkHello09:07
mvopstolowski: could it be that setup profiles for some reason looks at the real /snap/core/current/... usr.lib.snapd.snap-confine.real and dies there?09:07
zyga|afkSorry for starting late. Daughter woes09:07
pstolowskihey zyga|afk09:07
mvozyga|afk: hey09:07
zyga|afkI’m taking bit out but heading home now09:08
zyga|afkI will read backlog09:08
pstolowskimvo: no, it's not looking at real one; it's for sure looking at /tmp/.../check-xyz/snap/core/<rev> (i had debug logs confirming this)09:08
mvopstolowski: ok09:08
pstolowskiit's curious why it wasn't failing before though, not sure what changed. or if we were plain lucky before09:09
zyga|afkHahhahhahh09:11
zyga|afkYou will have fun knowing why09:11
zyga|afkMan09:11
zyga|afkSorry still not at my kb09:12
=== chihchun is now known as chihchun_afk
zyga|afkThere09:14
zyga|afkSolved it09:14
mvozyga|afk: can't wait to hear the details09:16
mvozyga|afk: does it also shed some light on the other issue ?09:16
pstolowskican't wait to that as well :)09:17
pstolowskimvo: is there any other 2.36 issue i can help with?09:20
mvopstolowski: 6185 needs a second review, we could pull this into 2.3609:21
mvopstolowski: but it first needs a merge into master :)09:21
pstolowskimvo, zyga|afk https://github.com/snapcore/snapd/pull/6219 is green now ; needs 2nd review09:36
mupPR #6219: overlord/tests: fix panic in managers test <Created by stolowski> <https://github.com/snapcore/snapd/pull/6219>09:36
pstolowskinot sure how relevant it is with zyga|afk's findings about setup-profiles09:37
mvopstolowski: I'm looking at the exact error now, I have it in a VM and it seems to be reliable to reproduce09:38
pstolowskimvo: is it cosmic?09:38
zyga|afkre09:38
mvopstolowski: 16.0409:38
=== zyga|afk is now known as zyga
zygagive me a moment please09:38
zygaso, the 1000ft version is that we never ever mocked security backends09:41
zygaand that's not good09:42
zygaand stuff didn't blow up because we 100% ignored all errors09:42
zygaso running overlord tests would really run apparmor parser09:42
zygareload udev09:42
zygaand all the other stuff09:42
zygaincluding setting up all snaps on manager startup09:43
zygaincluding initializing the apparmor backend09:43
mvoerrors:[]state.taskError{state.taskError{task:"Setup snap \"some-snap\" (40) security profiles", error:"cannot setup apparmor for snap \"core\": cannot create host snap-confine apparmor configuration: cannot compute snap-confine profile: cannot open apparmor profile for vanilla snap-confine: open /tmp/check-2391705272889139391/162/snap/core/1/etc/apparmor.d/usr.lib.snapd.snap-confine: no such file or directory"09:43
zygaand setting up snap-confine profile when core was being setup09:43
mvozyga: so yes :)09:43
zygaso09:43
zygathe reason that pawel's branch fixed the panic09:43
zygais that it made part of the apparmor.Backend.Setup code happy09:44
pedroniszyga: where?  I'm quite sure we tried to mock aa parser09:44
zygaI realised while outside that the only reason this can have an effect09:44
zygais that we ran with the full backend09:44
zygapedronis: managers_test09:44
zygait just spawns the full interface manager and carries on09:44
pedronis ms.aa = testutil.MockCommand(c, "apparmor_parser", "")09:44
pedronisetc09:44
mvozyga: I think this is understood now, thank you! the next question is why http://paste.ubuntu.com/p/fv5fsh3PDy/ makes the error to setup the aa profile basically go away09:45
mvo  09:45
pstolowskizyga: we do mock apparmor_parser though (at least in these tests i worked on)09:45
pedronismaybe it doesn't work, but for sure it was trying09:45
zygathat's only part of the story, we should not run those tests with real backends09:45
pedroniszyga: ?09:45
pedronisthey are meant to be quite integrationy09:45
pedronisthat seems a bit of a broad statement09:45
zygathen we need much more preparation in that phase, right now a small tweak in backend needs to be reflected in extra setup or mocking in the overlord test09:46
mvopedronis: the error we get in the test is actually that it can't open the core template profile, so this happens before aa_parser (just for context)09:46
zygaanyway, please let me explore09:46
zygayep09:46
zygaso we need to either:09:46
pedronismvo: yes, so we have some code that doesn't respect dirs root09:46
zyga1) prepare a full blown environment that each backend expects09:46
pedronisor something09:46
pedronisI'm just trying to counter that it wasn't trying to mock things09:47
mvopedronis: it does respect them, thats the problem, there is nothing there so it fails to open the profile09:47
mvopedronis: yeah, that is correct - we do mock stuff09:47
zyga2) ask each backend to mock itself properly so that it can kind of run but have no real effect09:47
zyga3) not run with real backends at that stage09:47
zygawe mock things in the wrong place, the overlord doesn't know how to mock apparmor09:47
zygamocking one command is not enough09:47
pedroniszyga: assume I understand what you are saying09:48
zygathe responsibility of knowing how to mock is in the backend itself, not in a test far far away09:48
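[Editor's note: the third option zyga lists (not running with real backends) can be pictured with a minimal Go sketch. Everything in it is an assumption made for illustration: `SecurityBackend`, `fakeBackend`, `manager` and `setupProfiles` are hypothetical names, not snapd's types. It only shows the shape of the idea, a manager that is handed its backends so a test can supply fakes, and that returns setup errors rather than ignoring them.]

```go
package main

import (
	"errors"
	"fmt"
)

// SecurityBackend is a hypothetical, trimmed-down interface for illustration;
// it is not the interface snapd defines.
type SecurityBackend interface {
	Name() string
	Setup(snapName string) error
}

// fakeBackend records Setup calls and never touches the system.
// A non-empty failWith makes Setup return that message as an error.
type fakeBackend struct {
	name     string
	failWith string
	calls    []string
}

func (b *fakeBackend) Name() string { return b.name }

func (b *fakeBackend) Setup(snapName string) error {
	b.calls = append(b.calls, snapName)
	if b.failWith != "" {
		return errors.New(b.failWith)
	}
	return nil
}

// manager only knows the backends it was handed, so a test decides
// whether real or fake backends run.
type manager struct {
	backends []SecurityBackend
}

// setupProfiles stops at the first failing backend and returns its error
// instead of logging it and carrying on.
func (m *manager) setupProfiles(snapName string) error {
	for _, b := range m.backends {
		if err := b.Setup(snapName); err != nil {
			return fmt.Errorf("cannot setup %s for snap %q: %w", b.Name(), snapName, err)
		}
	}
	return nil
}

func main() {
	aa := &fakeBackend{name: "apparmor"}
	sc := &fakeBackend{name: "seccomp", failWith: "signal: terminated"}
	m := &manager{backends: []SecurityBackend{aa, sc}}

	fmt.Println(m.setupProfiles("core"))
	fmt.Println(aa.calls, sc.calls)
}
```

[With fakes injected this way, a test can assert on which snaps each backend was asked to set up without running any external command.]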
pedroniszyga: is there a clean way to turn off the backends? would those tests still pass as they are?09:50
zygae.g. calling backend.Initialize from apparmor interrogates the system about nfs and overlayfs09:50
zygathat's not mocked in any way09:50
zygapedronis: sure, we do that in the interface manager tests09:50
zygapedronis: I was surprised this is not done so09:50
zygapartially the question is what do we want to do in those tests09:51
zygaare they integration tests?09:51
zygaare they tests for the overlord?09:51
pedroniszyga: the name tells you what they try to do09:51
pedronisthey try to test the integration of more than one manager09:51
zygaif they are integration tests then running with real managers, even if their activity is mocked feels wasteful since we don't observe what they do09:51
zygapedronis: that's fine09:51
zygakeeping the manager is ok09:52
pedroniszyga: anyway I told you a very actionable thing, is there a clean way to turn off the backends? would those tests still pass as they are?09:52
zygabut if we don't check what is the impact of apparmor or seccomp part of the interface manager, perhaps we should not use real backends09:52
pedronisyou answered the first bit09:52
pedronislet's see about the 2nd, no?09:52
zygayes, checking that now09:52
zygahold on09:52
mvoI think this is interesting and we need to look at this but it does not (afaict) help with the apparmor_parser getting killed issue we have in 2.3609:52
pedronismvo: does it get killed on some distros or all distros?09:53
pedroniswhen does it get killed?09:53
mvopedronis: so far I only saw it on 16.0409:53
pedronisI mean in which tests09:53
mvopedronis: it gets killed when it tries to setup the snap-confine security profiles on the hosts09:53
pedronisin a spread test? unit test?09:53
pedronisrandomly?09:53
mvopedronis: I see it in tests/main/layout and tests/main/parallel-install-layout - spread tests09:54
pedronisok, spread test09:54
mvopedronis: let me try to find out if it happens in more09:54
pedronisanyway, yes then it is not related09:54
pedronisby def09:54
mvopedronis: making the error from a notify log to a real error changes the outcome for some reason, it's super strange but with http://paste.ubuntu.com/p/kksmSpN95H/ I cannot reproduce the issue anymore09:55
pedronismvo: there are probably two calls? and the first fails benignly and the 2nd dies?09:56
pedronisthat would explain why that would make a difference09:56
mvopedronis: also failure seems slightly random, in GH we see it also in parallel-install-interfaces-content and snap-env but all 16.0409:57
zygamvo: there is one more place09:57
zygamvo: actually two, you just have an older patch09:57
zygamvo: the key place is actually in interface manager initialize when we redo security when system key changes, this is still ignored in the older copy you have09:58
mvozyga: ok, where can I find the correct one?09:58
mvozyga: what is super strange is that literally on the first try of run-fontconfig-2.36 I hit the bug09:59
mvozyga: then I added the diff and run again and had ~4-5 runs since and nothing triggers it09:59
=== chihchun_afk is now known as chihchun
zygapedronis: all tests pass10:08
pedroniszyga: good, can you propose something that removes the bad mocking and does the right mocking? so we can look at it10:08
zygayep10:08
zygadoing that now10:09
pedronismvo: I added a couple of questions to #618510:12
mvopedronis: thank you, checking10:13
mupPR #6185: snap: add new `snap run --trace-exec` call <Performance 🚀> <Created by mvo5> <https://github.com/snapcore/snapd/pull/6185>10:13
zygamvo: sorry, missed your question10:14
zygamvo: there's a patch with logging -> errors on GH but I found one more place where that happens, haven't pushed that part10:14
zygapedronis: https://github.com/snapcore/snapd/pull/622610:20
mupPR #6226: overlord: mock security backends for testing <Created by zyga> <https://github.com/snapcore/snapd/pull/6226>10:20
zygalet's see how this runs10:20
mupPR snapd#6226 opened: overlord: mock security backends for testing (2.36) <Created by zyga> <https://github.com/snapcore/snapd/pull/6226>10:20
mvozyga: thank you!10:22
zygamvo: re errors vs logging, I checked my patches from last night and they are all up to date10:23
zygasorry for the noise, still drinking my first coffee10:23
mvozyga: no worries10:24
zygawhat remains a bit of a mystery is the no-output error from apparmor parser10:24
mvozyga: I reverted to see if I get the failure without the change or if I'm hunting a ghost10:24
zygacool10:24
mvozyga: yeah, especially since we set SNAPD_DEBUG afaik in the tests so we should see output10:25
zygaI restarted my tests, if I can get it to happen again I'd like to look10:25
zygaI wonder if it is possible that mocked no-op apparmor_parser is _somehow_ leaking from unit tests into the state of the machine where they execute10:26
zygamvo: I will now look at the profile compatibility issue on leap10:30
zygamvo: I think we should not compile snap-confine profile if snap-confine in the distribution has disabled apparmor10:30
pedroniszyga: MockCommand is based on setting PATH so it shouldn't go further than unit tests10:31
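[Editor's note: a rough Go sketch of why a PATH-based mock stays inside the test process. It is not snapd's `testutil.MockCommand`; `mockCommand`, `resolve` and `mustTempDir` are hypothetical helpers written for this note. The fake binary lives in a temporary directory that is prepended to PATH in the current process only, so other processes on the machine keep resolving the real `apparmor_parser`.]

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// mustTempDir returns a fresh temporary directory or panics.
func mustTempDir() string {
	dir, err := os.MkdirTemp("", "mock-bin-")
	if err != nil {
		panic(err)
	}
	return dir
}

// mockCommand writes a fake executable called name into dir and puts dir at
// the front of PATH for this process. It returns a function that restores
// the previous PATH. Hypothetical helper, for illustration only.
func mockCommand(dir, name, script string) (restore func(), err error) {
	fake := filepath.Join(dir, name)
	body := "#!/bin/sh\n" + script + "\n"
	if err := os.WriteFile(fake, []byte(body), 0o755); err != nil {
		return nil, err
	}
	oldPath := os.Getenv("PATH")
	newPath := dir + string(os.PathListSeparator) + oldPath
	if err := os.Setenv("PATH", newPath); err != nil {
		return nil, err
	}
	return func() { os.Setenv("PATH", oldPath) }, nil
}

// resolve reports which binary a bare command name would run right now,
// or "" when the name is not found on PATH.
func resolve(name string) string {
	p, err := exec.LookPath(name)
	if err != nil {
		return ""
	}
	return p
}

func main() {
	dir := mustTempDir()
	defer os.RemoveAll(dir)

	restore, err := mockCommand(dir, "apparmor_parser", "exit 0")
	if err != nil {
		panic(err)
	}
	fmt.Println("while mocked:", resolve("apparmor_parser"))
	restore()
	fmt.Println("after restore:", resolve("apparmor_parser"))
}
```

[Code that invokes the parser through an absolute path, or a daemon started outside the test process, would not see this mock at all, which is consistent with pedronis's point.]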
zygapedronis: yeah but maybe some magic sequence of bugs or something10:31
zygamvo: alternatively we can look at apparmor parser version and skip it this way, it should be the same outcome10:31
zygas/be/have/10:31
mvozyga: first run without the diff that makes it a real error and I hit the issue again10:32
mvozyga: "Nov 28 10:32:00 nov281009-194489 snapd[29337]: backend.go:312: cannot create host snap-confine apparmor configuration: cannot reload snap-confine apparmor profile: cannot load apparmor profiles: signal: terminated10:32
mvo"10:32
mvoNov 28 10:32:00 nov281009-194489 snapd[29337]: apparmor_parser output:10:32
zygacan you check one thing10:33
zygaif you have shell still10:33
mvoI do10:33
zygalook at what apparmor_parser is10:33
zygais it the real deal10:33
zygaalso look at journalctl10:33
mvozyga: https://paste.ubuntu.com/p/BvBvkxP4Mn/10:33
zygaif the parser is loading stuff into the kernel it will trigger an audit event10:34
zygaso you may match the timestamp above10:34
zygato an audit event10:34
zyga(looks allright)10:34
mvozyga: https://paste.ubuntu.com/p/KWYSKPVgJY/ - looks like audit is not showing that the snap-confine profile is loaded10:35
zygaindeed10:36
zygaanything around the timestamp of10:36
zyga "Nov 28 10:32:00 nov281009-19448910:36
mvozyga: https://paste.ubuntu.com/p/6wbYy8psFb/10:37
zygahum10:38
zygaas if nothing had happened10:38
mvozyga: I will try to run the apparmor_parser manually now10:38
zygamvo: crazy idea, apparmor_parser -> apparmor_parser.real, log all invocations10:38
zygacan you reliably reproduce it?10:39
zygawhich test was it that failed now10:39
mvozyga: the test is pretty random10:39
mvozyga: but without your diff I hit it 2/2 so far10:39
mvozyga: today10:39
mvozyga: via the mvo5/run-fontconfig-2.3610:39
mvozyga: I think plain upstream/release/2.36 will also work, takes ~100 tests and then it happens10:40
zygamvo: I will try your branch now10:42
mvozyga: silly question, what is our cache dir again?10:45
zygawhich cache?10:45
zygaapparmor?10:46
mvozyga: apparmor cache for the re-exec thing10:46
zygawe use /var/cache/apparmor10:46
zygafor all our profiles10:46
zygawe no longer use any other paths10:46
mvozyga: thanks10:46
zyganote: we still use the other place for that single profile that ships with the package10:47
mvozyga: and running it by hand - works :(10:47
zygayeah, I ran it by hand once in a debug session10:47
zygait's maddening10:47
zygawish I had a rollback machine to get a few seconds into the past10:47
zygamvo: so FYI, /etc/apparmor.d/cache/10:47
zygabecause cache is in etc10:47
pedronispstolowski: mvo: I created a meeting for tomorrow, let me know if it doesn't work10:49
pstolowskipedronis: thanks, it's fine10:50
mvopedronis: thank you10:50
zygabrb, it's super cold in the office today11:04
mvozyga: see you - I add the wrapper thing now11:05
=== cpaelzer_ is now known as cpaelzer
zygaback11:14
zygamvo: https://github.com/snapcore/snapd/pull/6226 is green!11:14
mupPR #6226: overlord: mock security backends for testing (2.36) <Created by zyga> <https://github.com/snapcore/snapd/pull/6226>11:14
zygapedronis: can you look please,11:14
zygaI think this is the dealbreaker11:14
zygathough it doesn't explain why things fail in non-unit tests11:14
zygathose also passed on 1st run :)11:15
zygavirtually unheard of :)11:15
pedroniszyga: do we have other places outside of overlord that do overlord.New ? and need the same mocking11:16
pedronisI think daemon might11:16
zygaI checked all of overlord but indeed, daemon might11:17
zygalooking now11:17
pedroniszyga: +1 with this kind of questions11:17
zygapedronis: yes, adding the same treatment to daemon now11:21
pedroniszyga: devicestate/firstboot_test.go seems also to create a full overlord11:22
pedronisfor some tests11:22
zygaI added a printf that logs the added backends, I'll check that we are good across the tree11:23
zygaactually, it seems daemon did this already11:24
* zyga double checks11:24
zygathough not for all suites11:25
mvozyga: oh, fun11:30
zygamvo: any luck with apparmor_parser?11:30
mvozyga: not yet11:31
mvozyga: still running11:31
mvozyga: green is also annoying - why is it green, it should fail with the same apparmor issue11:31
mvozyga: I mean, it should fail with permission denied at a random place when the apparmor_profile cannot be (re)loaded11:32
zygayes :/11:33
zygaodd observation11:33
zygago test11:33
zygaI see stdout11:33
zygago test ./...11:33
zygaI don't see stdout!11:33
mvozyga: I remember I discussed that with john a while ago, it was strange go testrunner setup iirc11:36
zygaeverything is mocked now, let's do a master version of that11:44
mupPR snapd#6227 opened: overlord,daemon: mock security backends for testing <Created by zyga> <https://github.com/snapcore/snapd/pull/6227>11:47
mvozyga: with wrapper -> no error11:49
zygaseriously11:49
zygahow can that be!?!11:49
zygamaybe just back luck?11:50
zyga*bad11:50
mvozyga: maybe, I run it again11:51
mvozyga: the mock security backends got cherry-picked, right?11:51
zygamvo: meaning?11:51
zygaI opened a 2.36 and master PRs11:51
zygasame patch11:51
mvozyga: I mean, between 2.36 and master it can be cherry picked?11:51
zygayes11:51
mvozyga: great11:51
zygatwo PRs are in flight now11:51
mvozyga: ok11:52
mvozyga: did you restart the 2.36?11:52
zygayes11:52
zygawell11:52
zygaI pushed to include daemon mocking and devicestate mocking11:52
zygaso same patch in both places, one was new one was force pushed over the old one11:52
mvozyga: ok11:53
mupPR snapd#6228 opened: snapstate,overlord: update fontconfig caches with overlord mocking (2.36) <Created by mvo5> <https://github.com/snapcore/snapd/pull/6228>11:58
mvozyga: second run with wrapper starts now12:01
zygamvo: I'm fixing leap bug12:08
zygamvo: I'm a bit unsure if we should land https://github.com/snapcore/snapd/pull/622112:08
mupPR #6221: interfaces: return security setup errors (2.36) <⛔ Blocked> <Created by zyga> <https://github.com/snapcore/snapd/pull/6221>12:08
zygathe risk is breaking snapd startup12:08
mvozyga: got the error: https://paste.ubuntu.com/p/r54VmSGf37/12:09
mvozyga: yeah, this is why I set it to blocked12:09
zygaperhaps we should still ignore this error: https://github.com/snapcore/snapd/blob/master/overlord/ifacestate/helpers.go#L18712:09
zygalooking12:09
mvozyga: I think this needs some more work, breaking on startup is not ideal12:09
zygaI checked my PR, it's not blocking startup12:10
mvozyga: hm, actually the real error is earlier, let me dig some more12:10
zygamemory of what I pushed last evening is rusty12:10
zygamvo: is that the only invocation?12:10
zygaor last one12:10
mvozyga: the last one, a gazillon before12:11
zygaaha12:11
mvozyga: I think `--replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap-confine.core.6022` is the failing one but the script needs to be smarter12:11
zygamaybe patch the wrapper to log the error too12:11
mvozyga: yeah12:11
zygacool12:11
zygasuggestion12:11
zygawhen it fails copy the non -- arguments to /tmp/WAT/12:11
zygafor inspection12:11
zygafingers crossed12:11
zygapstolowski, mvo: shall we close https://github.com/snapcore/snapd/pull/621912:14
mupPR #6219: overlord/tests: fix panic in managers test <⛔ Blocked> <Created by stolowski> <https://github.com/snapcore/snapd/pull/6219>12:15
pstolowskiclosed12:18
mupPR snapd#6219 closed: overlord/tests: fix panic in managers test <⛔ Blocked> <Created by stolowski> <Closed by stolowski> <https://github.com/snapcore/snapd/pull/6219>12:18
zygapstolowski: thank you for writing that, without that PR I would never think that that code is using apparmor backend12:19
mvoyeah, thanks pstolowski and zyga for this one - now we just need to figure the apparmor thing out to make me truly happy12:21
pstolowskiyep, that an interesting one ;)12:21
zygamvo: the branches are green12:38
zygashall we merge https://github.com/snapcore/snapd/pull/622612:38
mupPR #6226: overlord,daemon: mock security backends for testing (2.36) <Created by zyga> <https://github.com/snapcore/snapd/pull/6226>12:38
zygaand https://github.com/snapcore/snapd/pull/622712:38
mupPR #6227: overlord,daemon: mock security backends for testing <Created by zyga> <https://github.com/snapcore/snapd/pull/6227>12:38
zygapedronis: ^ ?12:46
pedronisone sec12:48
pedroniszyga: did you rebase it?12:49
zygapedronis: the one on master, yes12:49
pedroniswell seems you force pushed the one on 2.36 as well12:50
zygayes, with daemon and devicestate changes12:50
zygaI wanted one patch for master12:50
pedroniszyga: do we need the devicestate changes? it doesn't seem to use the interface manager12:53
zygapedronis: it spawns the overlord, that initializes apparmor backend doing some work12:54
pedroniswhere?12:54
zygajust creating the overlord is enough, then that adds the interface manager, that does the rest12:54
zygaI tested this with a printf next to AddBackend in the interface manager initialization function12:55
zyga(printing each security backend being added)12:55
pedroniszyga: it creates only a Mock overlord12:55
pedroniswe do that all over the place12:55
pedronisso we need to know because there might be more12:55
zygaMock is safe12:55
zygalet me double check12:55
pedronisI don't see a overlord.New there12:56
pedronisbut maybe I'm missing something12:56
pedroniszyga: I'm talking devicestate_test.go to be clear12:57
pedronisfirstboot_test.go does create a full overlord12:57
zygatesting...12:58
zygaand I think you are correct, perhaps I did one too many :)12:58
zygaI can drop that12:58
zygapushed13:00
pedronislooking in a sec13:00
zygainto the master version of the PR13:00
zygadevicestate tests are slooow13:01
pedroniszyga: :)13:01
zyga25 seconds on a 12 core VM13:02
pedronisthey got worse then13:02
pedronisanyway not a priority to improve that13:02
zygapushed into 2.36 PR as well13:02
pedronisright now13:02
pedronisthey take 4s here13:02
pedronisbtw13:03
zygaPASS: devicestate_test.go:899: deviceMgrSuite.TestDoRequestSerialErrorsOnNoHost20.213s13:03
zygamaybe OS differences matter?13:03
pedronisno, network setup/dns resolution13:03
pedronisdifferences13:03
zygayep13:03
mupPR snapd#6161 closed: tests: new test suite to run snapd tests on a google remote instance <Created by sergiocazzolato> <Closed by sergiocazzolato> <https://github.com/snapcore/snapd/pull/6161>13:04
zygamvo: any luck?13:06
zygahttps://github.com/snapcore/snapd/pull/6228 is green btw13:06
mupPR #6228: snapstate,overlord: update fontconfig caches with overlord mocking (2.36) <Created by mvo5> <https://github.com/snapcore/snapd/pull/6228>13:06
pedroniszyga: +1, need a 2nd review for master changes, also agree with mvo how land it in 2.36, will we get conflicts now?13:06
zygapedronis: not likely13:07
zygapedronis: the patches are the same13:07
zygain any case, we can manage if some conflicts happen13:07
zygaI have a fix for the leap issue13:22
zygamvo: I added a new apparmor level13:23
zygaancient13:23
zygawhen level is ancient, we don't enable the apparmor backend13:23
zygawhen the parser has insufficient features to compile our basic profiles I return ancient13:23
zyga(using the existing parser feature check)13:23
zygathis makes leap 43.2 work13:24
zygaon TW all is good, as is on 16.0413:24
zygaparser feature check is not version based, we actually try to use the parser13:24
zygaso this feels like a good way to fix this13:24
zygaI'll clean up the code slightly, add tests and propose13:25
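[Editor's note: a small Go sketch of the classification zyga describes here. It is an illustration under assumptions, not the patch he proposed: the `level` type, its three values, `classify` and `backendEnabled` are hypothetical names, and the parser check is represented by a caller-supplied function rather than a real `apparmor_parser` run. The sketch shows the two ideas from the chat: the parser is probed by trying it rather than by reading a version, and an "ancient" result means the apparmor backend is not enabled.]

```go
package main

import "fmt"

// level is a hypothetical classification used only in this sketch.
type level int

const (
	levelNone    level = iota // kernel side unavailable
	levelAncient              // kernel fine, parser cannot compile the basic profiles
	levelUsable               // kernel and parser are both good enough
)

func (l level) String() string {
	switch l {
	case levelNone:
		return "none"
	case levelAncient:
		return "ancient"
	default:
		return "usable"
	}
}

// classify runs parserProbe only when the kernel side is available, so the
// "try the parser" check is skipped when its result cannot matter.
func classify(kernelEnabled bool, parserProbe func() bool) level {
	if !kernelEnabled {
		return levelNone
	}
	if !parserProbe() {
		return levelAncient
	}
	return levelUsable
}

// backendEnabled reports whether the apparmor backend should be added.
func backendEnabled(l level) bool {
	return l == levelUsable
}

func main() {
	// Stand-in for a feature check that tries the parser and fails.
	oldParser := func() bool { return false }

	l := classify(true, oldParser)
	fmt.Println(l, backendEnabled(l))
}
```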
mvozyga: no luck yet, one full run worked13:28
mvozyga: second run is ongoing13:28
zyga:/13:28
zygabut at least we're crawling out of the 2.36 hole13:28
seb128pstolowski, hey, unsure bug #1754345 should have been closed, it was an understood problem/assigned and an user states he's still having the issue, could you reconsider?13:29
mupBug #1754345: Returns "invalid credentials" error while trying to refresh an invalid macaroon <snapd:Invalid by chipaca> <https://launchpad.net/bugs/1754345>13:29
pedroniszyga: pstolowski: mborzecki: btw both mvo and me have a bunch of meetings today so won't make the standup13:30
zygapedronis: ack13:30
mborzeckiack13:30
zygabrace, the meetings are coming :)13:30
pstolowskiseb128: sure, looking13:30
seb128pstolowski, thx13:30
pstolowskiseb128: right, re-opened, thanks; i didn't see the new comment yet13:31
mupPR snapd#6229 opened: release: probe apparmor features lazily <Created by zyga> <https://github.com/snapcore/snapd/pull/6229>13:31
seb128pstolowski, thx13:31
zygamvo: 1st of the several branches that lead towards leap fix: https://github.com/snapcore/snapd/pull/622913:32
mupPR #6229: release: probe apparmor features lazily <Created by zyga> <https://github.com/snapcore/snapd/pull/6229>13:32
zygapstolowski, mborzecki ^ if you can13:32
mborzeckizyga: looking13:32
zygafor context: we load apparmor even when its kernel features are sane but userspace is too old,13:32
zygathis is the first of that sequence: the next is caching of parser feature check, the last is introduction of the "ancient" apparmor "level" (level is how we classify apparmor) and usage of that level when parser doesn't have the feature we need13:33
zygadecided to split because smaller patches and we can see if something falls apart13:33
mborzeckizyga: lgtm, not that the user can switch off security=apparmor at runtime13:34
zygayep13:34
zygathat's handled13:34
zyga:)13:34
zyga(when you do that apparmor fs is not mounted)13:34
zygaand we treat it as apparmor none13:34
mborzeckithat selinux uhh, funny how all the information is scattered everywhere, i'm sure there's like a handful of people who know their way around it13:35
zygamborzecki: laughing their way to the bank ;)13:35
zygamvo: what do you want to do about 622813:36
zygamerge it or drop it?13:36
zygabrb, more tea, it's still freezing in here13:37
mborzeckii had this nice split into snapd, helpers with specific types, separate for snap-{update,discard}-ns, s-c and cli tools snap{,ctl} but was getting EACCES when snapd (snappy_t domain) tried to run s-u-n (snappy_mount_t domain), even though i seemingly had all the stuff set up properly (which apparently i didn't)13:38
pstolowskizyga: looking13:38
mvozyga: woah, it's green now 6228 - that's so strange13:39
mvozyga: I run it again, just to see what happens13:39
zygamvo: backends were messing up the system13:42
zyganote13:42
zygamvo: when backends were partially mocked13:42
mvozyga: well, maybe13:42
mvozyga: but how?13:43
zygae.g. not disabled13:43
mvozyga: I mean, I get the theory13:43
zygabut not mocking commands13:43
zygaand spread runs unit tests as root13:43
zygathings really affect the system13:43
zygaand silly tests that fake installs core13:43
mvozyga: yeah, but rootdir is set to something different13:43
zygammmm13:43
zygayeah13:43
zygathat's true13:43
zygawell13:43
zygaish13:43
mvozyga: I mean, I want to believe13:43
zygait may be set to whatever13:43
zygabut if we can call apparmor_parser /path/to/whatever13:43
zygait would still do stuff :)13:43
mvozyga: it will13:43
zygadid your overloaded parser command catch any like that?13:44
mvozyga: and yet the file is correct when I get a debug shell, apparmor parser is correct13:44
mvozyga: still no reproducer since the one I showed you earlier13:44
zygawhat do you have in your tree?13:44
zygaas in13:44
zygawhich patches13:44
* zyga does it fail on vanilla 2.36?13:44
mvozyga: this is run-fontconfig-2.36 with the extra apparmor_parser wrapper13:45
zygaaha13:45
zygawell13:45
zygamagic13:45
mvozyga: I have not tried vanilla13:45
mvozyga: and yes, magic - the bad kind13:45
zygathat should trigger it13:45
zygawhat you have13:45
zygaas long as you don't have the mocking patch13:45
zygawhere backends are nil13:45
mvoyeah, I don't have this patch13:45
xnoxmvo, have I broken systemd in disco on arm64... or? https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-disco/disco/arm64/s/snapd/20181127_120551_2b149@/log.gz13:46
xnox2018-11-27 11:33:48 Error executing autopkgtest:ubuntu-19.04-arm64:tests/main/interfaces-daemon-notify :13:46
xnox-----13:46
xnoxalso lots of apparmor denials13:46
xnox[Tue Nov 27 11:33:26 2018] audit: type=1400 audit(1543318407.221:18418): apparmor="DENIED" operation="exec" profile="snap.test-snapd-daemon-notify.notify" name="/bin/systemd-notify" pid=4946 comm="notify" requested_mask="x" denied_mask="x" fsuid=0 ouid=013:46
xnox[Tue Nov 27 11:33:26 2018] audit: type=1400 audit(1543318407.229:18419): apparmor="DENIED" operation="open" profile="snap.test-snapd-daemon-notify.notify" name="/bin/systemd-notify" pid=4946 comm="notify" requested_mask="r" denied_mask="r" fsuid=0 ouid=013:46
xnox[Tue13:46
mvozyga: I think you are right, something in this test is bleeding into the others but there are some holes still I think13:46
zygamvo: perhaps we should not run unit tests as root :)13:47
mvoxnox: thanks, I need to look at this13:47
zygain case they are less unit-y13:47
zygacall them13:47
zygapractical tests13:47
mvozyga: iirc they run with some su -l call, let me look13:47
zygapractical detonation tests13:47
zygain the spread task?13:47
zygaI didn't check13:47
mvozyga: they run as user "test"13:47
zygauh13:47
zygaok13:47
zygaso that theory goes out the window13:48
zygahow about package build13:48
zygado we do nocheck?13:48
mvozyga: at least the ones in tests/unit/go13:48
mvozyga: yeah, let me check package build13:48
mvozyga: dpkg-buildpackage is also run with su test and fakeroot13:48
zygaok, so all theories are out the window13:49
zygafeels like if we don't find this13:49
zygait will just come back :)13:49
mvozyga: yeah, it's very strange13:49
mvozyga: anyway, a shame, I really liked the theory :(13:50
* mvo weeps a bit in the corner13:50
zygamvo: can you ack https://github.com/snapcore/snapd/pull/6227 please13:58
mupPR #6227: overlord,daemon: mock security backends for testing <Created by zyga> <https://github.com/snapcore/snapd/pull/6227>13:58
zygamborzecki: can you please look at https://github.com/snapcore/snapd/pull/6149 after the standup13:59
mupPR #6149: cmd/snap-confine: capture initialized per-user mount ns <Per-user mount ns> <Created by zyga> <https://github.com/snapcore/snapd/pull/6149>13:59
zygait's the next part of the feature branches13:59
zygaholy moly14:00
zygamvo: noooooooo14:00
zyga:D14:00
zygahttps://api.travis-ci.org/v3/job/460760408/log.txt14:00
zygahttps://www.irccloud.com/pastebin/oxArPCqx/14:00
zygathis is from https://github.com/snapcore/snapd/pull/622614:00
mupPR #6226: overlord,daemon: mock security backends for testing (2.36) <Created by zyga> <https://github.com/snapcore/snapd/pull/6226>14:00
zygamvo: again failed on 16.04 only14:03
zygamvo: my take on this is that there are multiple things happening in parallel: the backends doing stuff turned out to be a fluke (they just affect the unit tests), the ignored error in setup is interesting and I would like to focus on reproducing that problem with the panic issue out of the way14:04
zygamvo: I added a restore-each check for "signal: terminated"14:14
zygafingers crossed (if that happens)14:15
zygausing the same seed that was in the failing log14:15
zygawow, hit the problem instantly14:19
zygalooking14:19
zygaooooooooh14:20
zygamvo: THEORY :)14:20
zyga2018-11-28T14:18:10Z INFO Requested daemon restart.14:20
mvozyga: tell me14:20
mvozyga: terminates as part of daemon restart?14:21
zygayes14:21
zygachecking14:21
zygahold on14:21
zygamvo:14:21
zygawowoooow14:21
zygaNov 28 14:18:44 nov281411-886866 snapd[29935]: helpers.go:187: cannot regenerate seccomp profile for snap "core": signal: terminated14:21
zygalook what this says14:21
mvozyga: nice, can't wait (and I'm in a meeting)14:21
zygasnap-seccomp14:21
zyga14:1814:21
mvozyga: ohhh14:22
zygathis is not even apparmor14:22
mvozyga: nice14:22
zygait's any child14:22
mvozyga: so a missing wait somewhere? when we shut down/restart?14:22
mvozyga: nice!14:22
mvozyga: nice nice nice14:22
zygaprobably14:22
zygabut man14:22
zygalooking at timing logs14:23
mvozyga: also means we had this forever14:23
zygayes14:23
mvozyga: we just now notice because the profile actually changed14:23
mvozyga: so far all the bits fit :)14:23
zygahttps://github.com/snapcore/snapd/pull/623014:24
mupPR snapd#6230 opened: spread: detect "signal: terminated" in journal logs <Created by zyga> <https://github.com/snapcore/snapd/pull/6230>14:24
mupPR #6230: spread: detect "signal: terminated" in journal logs <Created by zyga> <https://github.com/snapcore/snapd/pull/6230>14:24
zyganeed to run for lunch14:24
zygabut so far best theory14:24
zygaand man, we suck :)14:24
mvozyga: !!!14:24
* mvo hugs zyga 14:24
pstolowski /o\14:32
zygamvo: is it that fix for daemon shutdown that [c]ipaca did?14:35
zygaI'll try14:36
* cachio going to the bank14:37
cachioand lunch14:37
mvozyga: as a quick test - we can set KillMode=process and see if that helps14:39
mvozyga: I bet it does14:39
zygamvo: better, I'll cherry pick the two patches from chipaca that fix it14:40
zygait's super easy to trigger now14:40
zygawith that journalctl check14:40
mupPR snapd#6228 closed: snapstate,overlord: update fontconfig caches with overlord mocking (2.36) <Created by mvo5> <Closed by mvo5> <https://github.com/snapcore/snapd/pull/6228>14:40
zygamvo: running now14:42
mvozyga: cool, let me know14:43
zygafingers very much croessed14:43
zyga*crossed14:43
mvozyga: hopefully the fix from chipaca is enough14:43
zygamvo: I will look at what happens in the daemon restart sequence now14:43
mupPR snapd#6231 opened: data: set KillMode=process <Created by mvo5> <https://github.com/snapcore/snapd/pull/6231>14:43
zygathanks for that ^ mvo14:44
zygafeels like attacking many fronts at the same time yields results14:44
mvozyga: the critical bit is that we can't stop the daemon before all the subprocesses are finished14:44
mvozyga: or we need KillMode=process14:44
zygamvo: are you out of the meeting spree?14:45
zygacan you review https://github.com/snapcore/snapd/pull/622714:45
mupPR #6227: overlord,daemon: mock security backends for testing <Created by zyga> <https://github.com/snapcore/snapd/pull/6227>14:45
mvozyga: yeah, we are doing a pincer move on this (remember Cannae) - anyway, still meeting so only 1/4 brain available14:46
=== mborzeck1 is now known as mborzecki
zygarunning, no failures yet14:51
mvozyga: the one from john?14:51
zygayes14:51
mvozyga: that is definitely the best solution if that is enough - yay^214:51
zygaI took two patches from john and kept my journalctl test14:51
zygayes :)14:51
mvozyga: please propose to 2.3614:51
zygaabsolutely will14:51
zygaif this passes14:51
mvozyga: then hopefully when the meetings are over I can merge and get on with life14:51
mvozyga: thank you!14:52
zygathank you14:52
zygafor 2.36.2 I would like to fix leap too, so if time permits I will try14:52
zyga2.36.2 is tomorrow?14:52
mvozyga: or later tonight14:52
zygaok14:52
zygacan be tonight14:52
zygaI mean, I have the patches now14:52
mvozyga: I would rather do a .3 than to wait tbh14:52
zygaok14:52
zygaI can distro patch too14:52
mvozyga: I think it's fine, we can do .3 with --trace-exec14:52
mvozyga: and your fix14:52
zygaok14:52
mvozyga: etc14:52
zygalet's do .2 today if we can14:53
mvozyga: but yeah, depends on how long tests take and all that14:53
zygaand .3 tomorrow with extra leap fix14:53
zyga(leap is not critical since it doesn't break the use of apps)14:53
mvozyga: so including it is not off the chart, just that I would not want to wait for it14:53
zygaI agree14:53
zyga40/364, no errors in sight14:53
mvozyga: hurrazh!14:53
mvozyga: I close the KillMode=process then14:54
mvozyga: at least for now14:54
zygakeep it14:54
zygalet's see what happens14:54
mvozyga: you think so? ok14:54
zygamvo: nooo, broke again15:07
zygalooking15:07
zygaNov 28 15:07:08 nov281459-174393 snapd[30000]: helpers.go:187: cannot regenerate seccomp profile for snap "core": signal: terminated15:07
zygaI find this part interesting:15:08
zygahttps://www.irccloud.com/pastebin/WYfAENsz/15:08
zygathey always come in pairs, first this then the message that we're ready to go, in daemon15:08
zygaso it must be the profile regen code15:08
mvozyga: let's see if KillMode= helps15:09
zygajournald logs: http://paste.ubuntu.com/p/hv8y3mgQJF/15:09
zygaNov 28 15:06:18 nov281459-174393 snapd[29138]: main.go:82: DEBUG: Setting up sd_notify() watchdog timer every 2m30s15:09
zygasooooo15:10
zygatheory15:10
zygawe start up15:10
zygashit takes time15:10
zygasystemd says "bye dude" and kills us15:10
zygaI once told maciej that during startup15:10
zygawhen we regen profiles15:10
zygawe should tell systemd "gimme more time"15:10
zygaotherwise watchdog15:10
zygamay15:10
zygamay kill us on a slow system15:10
zygaviable?15:10
zygahttp://paste.ubuntu.com/p/Bp9jBsjN3S/ <- complete journal logs15:12
zygaso15:13
zygaI don't get it15:13
zygalook at that 2nd log please15:13
zygago to line 127215:13
zygawe just installed core rev 602215:13
zygaon line 1280 we restart15:13
mvozyga: still meeting will look in a wee bit15:13
zyganow comes up something weird15:13
zygawe do a bit of restarts15:14
zygaone after another15:14
zyga(sure, I'm just letting my mind flow)15:14
cwayneWow when did mvo become Scottish15:14
zygait's double plus interesting that the test this failed on is main/degraded15:15
mvozyga: oh, interesting - we could disable the watchdog to test this theory15:15
zygai.e. nothing special15:15
zygayep, I'll try that in a moment15:15
mvocwayne: haha - I think I read too much terry pratchett15:15
zygawe restart about a half a dozen times15:15
zygaand what's with CUPS and ACPI being restarted15:15
zygaare we doing something wrong in test prep?15:16
zygamvo: I don't think this is the watchdog15:17
zygathere's just one snap installed, core15:17
zygawould not take that long15:17
zygaon line 1375 test helpers say "new test starts here"15:17
zygagoing to look what prints that15:18
mvozyga: let's have a quick HO to catch up after my meeting(s)15:24
zygagladly15:24
zygaI'm out of ideas15:24
zygalooking at your kill mode branch15:24
mvozyga: it is running and still going strong afaict15:24
mvozyga: so fingers crossed, I have a theory that I would like to talk through (to see if it is really cohesive)15:24
zygalist of jobs that failed with "signal: terminated" present in the log https://www.irccloud.com/pastebin/ZQikNO6z/15:25
zygaI wonder if this list is stable15:25
zygaI wonder if there's a combination of test activity and something else15:25
zygayour killmode branch seems to suggest that systemd stops snapd15:25
zygaand things go south because children die too15:26
zygaquestion is why15:26
zygawhy are we being stopped15:26
zygaor more precisely: why this happens this way15:26
zygamvo: one more idea to fix this15:26
zygamvo: in regen all profiles code path, when anything fails, we don't write system key15:26
mvozyga: systemd15:26
zygathen next time we try again15:26
mvozyga: oh, interesting15:26
zygayes, systemd but why :-)15:27
zygayep15:27
zygaI'll make coffee15:27
zygawhen will your meetings finish?15:27
mvozyga: yeah, let's talk in the HO15:27
mvozyga: in 5-20min15:27
zygaok15:27
zygaI'll take a break now15:27
mvozyga: sounds great15:27
zygasee you in 1515:27
mvozyga: I have a theory15:27
mvozyga: so don't worry - and poke holes into it once we talk :)15:27
zygaback15:37
mvozyga: let me make a cup of tea, meeting is over, ready in 3min15:41
zygaok15:41
zygastarted another run with:15:44
zygachipaca's patches15:44
zygaterminated detector15:44
zygapropagated error from snap-confine apparmor setup15:44
zygaand system-key write skip if something fails15:44
zygalet's see what we get now15:44
pedroniszyga: mvo: let me know if I can help, want to discuss, I need a short break now though15:46
zygak15:46
pedronismvo: is the fontconfig stuff expected in #6231?15:51
mupPR #6231: data: set KillMode=process <Created by mvo5> <https://github.com/snapcore/snapd/pull/6231>15:51
mvopedronis: yes, let's set this to blocked15:51
mvopedronis: it's an experiment to see if that fixes the issue in the test15:51
mvos15:51
MmikeHi, lads. Can't snaps run in lxc containers?  I've created a fresh, empty container (lxc launch ubuntu:16.04 xen-snaptest), when it started I added my ssh key to ubuntu user (lxc exec xen-snaptest), sshed into the container, did apt update/upgrade mantra, then did: snap install hello. Snap installed, but when I try to run 'hello' command, this is what I get:15:54
Mmikeubuntu@xen-test:~$ hello15:54
Mmikecannot remount /tmp/snap.rootfs_z1wdjU/var/lib/snapd/lib/vulkan as read-only: Permission denied15:54
Mmikemy container is unprivileged15:55
Mmikeat least the lxd docs say defaults are unpriv containers, and also I can see that /proc inside the container is owned by nobody.nogroup15:55
mupPR snapd#6232 opened: overlord/snapstate: support for pre-remove hook <Created by stolowski> <https://github.com/snapcore/snapd/pull/6232>15:56
pedronismvo: zyga: so we are getting killed why running subprocesses?15:58
pedronisbut we don't know what is killing us?15:59
pedroniss/why/while/15:59
zygapedronis: hey16:05
zygacan you join the standup please16:05
zygait will be easier to explain this way16:05
mvozyga: and remember to listen to the free software song16:36
zygamvo: systemd uses kill(pid, SIGTERM)16:38
zygamvo: not -pid16:38
zyga(sorry, I didn't mean to say -SIGTERM earlier)16:38
zygaI will nuke system key and see what happens during a restart16:38
mvozyga: hm, does it send SIGTERM to anything else, i.e. a running snap-seccomp or similar?16:40
zygamvo: going through the log now16:40
mupPR snapd#6183 closed: sanity, spread, tests: add CentOS (2.36) <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/6183>16:42
mvo6195 and 6218 need a second review16:43
* mvo gets dinner while waiting for this16:44
zygamvo: this is what happens:16:53
zygasystemd killing everything in a cgroup https://www.irccloud.com/pastebin/2TpqMTd7/16:53
zygaimmediately after that there's a second loop that ensures the cgroup is now empty16:55
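A small Go sketch shows how a helper process that receives SIGTERM on its own surfaces in the parent as the `signal: terminated` error from the journal. This is an illustration under assumptions: `sleep` stands in for a helper such as snap-seccomp, and the direct `Signal` call stands in for systemd signalling each pid in the cgroup.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// runAndTerminate starts a long-running child, sends SIGTERM to that one
// pid, and returns whatever error Wait reports for the child.
func runAndTerminate() error {
	cmd := exec.Command("sleep", "60") // stand-in for a helper process
	if err := cmd.Start(); err != nil {
		return err
	}
	// Equivalent of kill(pid, SIGTERM): only this process is signalled.
	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
		return err
	}
	return cmd.Wait()
}

func main() {
	err := runAndTerminate()
	fmt.Printf("cannot run helper: %v\n", err)
}
```

On Linux the error text comes from the child's wait status, which is why the parent logs the signal even though the parent itself was not the process that died.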
zygamvo: that's settled16:56
zygabut one thing still bugs me16:56
zygaif we only tell systemd that we are ready after we are done starting the overlord16:56
zygahow can we ever be killed mid way?16:56
zygaare we not telling systemd about our readiness on 16.04?16:56
zygaon 14.04 we are type=notify16:59
zygaI wonder if this means we still don't fully know how this happens16:59
zygamvo: forkstat debugging  https://www.irccloud.com/pastebin/yeeIBEmw/17:11
zygajust to put my mind at ease17:11
zygamvo: ^ this is actually a pretty amazing way to see what was going on around each test17:11
zygaand have historical data if it fails17:12
zygasmall digression about our lives: https://youtu.be/sTdWQAKzESA17:14
zygaeh, forkstat -l is too new17:21
roadmrhi jdstrand !17:24
Saviqcachio: hey, are you guys testing on Fedora rawhide? or plan to?17:58
zygaSaviq: not testing now17:58
zygaSaviq: planning to is hard to say17:58
Saviqwhat I'm really asking is how to reliably get a rawhide env on spread+google... but I suppose that answers that ;)17:59
zygaI'll let cachio answer that :)17:59
Saviqwe're trying to upgrade from the latest image available, but that misses the "reliable" bit17:59
=== pstolowski is now known as pstolowski|afk
zygaNov 28 18:05:08 nov281757-030500 snapd[30252]: helpers.go:192: cannot regenerate seccomp profile for snap "core": signal: terminated18:06
zygahttp://paste.ubuntu.com/p/4X5KvfZB7W/ < forkstat18:06
zygamvo: ^18:07
zygamvo: this forkstat stuff may also help maciej with systemd / mount error18:10
zygae.g. would you guess we are calling probe-bcache on all those snaps?18:10
zyga18:05:08 exit  30278     15    2.510 /snap/core/6022/usr/lib/snapd/snap-seccomp compile /var/lib/snapd/seccomp/bpf/snap.core.hook.configure.src /var/lib/snapd/seccomp/bpf/snap.core.hook.configure.bin18:12
zygathis is snap-seccomp exiting after SIGTERM18:13
zygamvo: there are some more details here18:14
zygamvo: anyway, ping if you are around18:14
zygaotherwise let's chat later18:14
zygamvo: we should look at this tomorrow18:19
zygait seems that snapd was running long before18:19
zygamvo: tracing the snap-seccomp that was stopped18:19
mvozyga: hm?18:31
zygahey18:31
zygaif you want please look at the pastebin above18:31
mvozyga: looking18:31
zygafind the line where snap-seccomp process 30278 is killed18:32
zygathen look back in time18:32
zygato see when it is started18:32
zygaone thing that doesn't make sense in the current theory18:32
zygais that "systemctl restart snapd.service"18:32
zygathat will _wait_ until security is setup, right?18:32
mvozyga: I'm not sure18:33
mvozyga: I think we need to look at systemd and read how it behaves for things not fully started up18:33
zygain general, it waits until the service is ready18:33
zygamy point is that if we had started snapd  in the past18:33
zygait has completed the setup op18:33
zygawhat we're observing is disagreeing with that18:33
zygait seems that when we ask snapd to restart (from a synchronously running shell script)18:34
zygait is still starting up18:34
zygaI don't know how to explain that18:34
zygaapart from my misunderstanding of systemd18:34
zygabut snapd.service is service type "notify"18:34
zygaanyway, look at the log, at the branch18:34
zygayou can reproduce the failure yourself18:34
zygaand look at both forkstat data18:34
zygaand at journal18:34
zygaI think this is very very useful in general, for debugging all sort of issues18:34
mvozyga: hm, hm, I see18:36
zygadid I share the seed?18:36
mvozyga: I don't see in the forkstat when a process dies and why - or am I looking at the wrong thing?18:36
zygamvo: spread -debug -v -seed=1543419805 google:ubuntu-16.04-64:18:37
zygayou are not looking right18:37
zygalook at the output above:18:37
zyga18:05:08 exit  30278     15    2.510 /snap/core/6022/usr/lib/snapd/snap-seccomp compile /var/lib/snapd/seccomp/bpf/snap.core.hook.configure.src /var/lib/snapd/seccomp/bpf/snap.core.hook.configure.bin18:37
zygathis tells you that on 18:05:08 a process with pid 30278 exited due to signal 1518:37
zygataking 2.510 seconds of execution time18:37
zygayou can now look in past for pid 3027818:37
zygayou will find when the process was launched18:38
mvozyga: aha, nice18:39
mvozyga: sorry, just not used to read it18:39
zygayeah18:39
mvozyga: uh, just looking at daemon.go - we set READY=1 in Start() so maybe we do it too early18:39
zygatook me some time to figure it out18:39
zygaoooh18:39
zygayeah18:39
zygadefinitely!18:39
zygaactually18:39
zygalet me look :D18:39
zygaI think I'm easily excited18:39
mvozyga: ahahahaha18:40
mvozyga: well, looking now as well where exactly the security setup happens18:40
zygasecurity setup happens in overlord.New18:41
zygait's all packed there18:41
zygabut18:41
zygamaybe18:41
zygawhat I'm missing is that some of it runs in a goroutine?18:41
zygabut probably not18:41
zygamvo: if you follow overlord.New you will get to ifacestate.Manager()18:42
zygathat follows to m.initialize18:42
zygaand that to m.regenerateAllSecurityProfiles18:43
zygaone possible alternative is that there was a real refresh18:43
zygabut not sure that's even possible18:43
mvozyga: hm, indeed18:43
zygaone thing for sure: forkstat == invaluable for correlating logs and activity18:44
zygawe can look at snap changes, journal and forkstat18:44
zygaand see what happens18:44
zygaI won't get to the bottom of this tonight but we have all the means to conclusively say what the issue is18:44
zygamvo: as for systemd killing processes18:45
zygait seems to just read the cgroup .procs file18:45
zygaand kill one by one18:45
zyganot the parent18:45
zygaI would love to cross check that with code and docs18:45
zygaI have not yet18:45
zygathat's from my strace'ing session18:45
pedroniszyga: there shouldn't be snapd own goroutines going (except the watchdog one) until inside start I think18:45
pedronisStart18:45
zygammm18:45
zygainterface manager doesn't have any ensure logic that would do security setup18:46
zygaunless hotplug and autoconnect are considered18:46
pedronis?18:46
zygaI meant that if there are no things going on in the background outside of the state manager then this must be synchronously running from overlord.New18:47
mvozyga: do you have a current journalctl -u snapd log?18:47
zygaer, I did but I closed the session18:47
pedroniszyga: that's also what I said18:47
zygaI can restart and give you one in a moment18:47
zygapedronis: then we are in agreement and this suggests that snapd was stopped while it was still starting18:47
mvozyga: just curious what we see there (e.g. snapd starting and then the error right after that)18:47
zygamvo: started now18:47
zygaI will pastebin the three logs when I have them18:48
pedroniszyga: but regenerateAllSecurityProfiles is called by initialize18:48
pedronisand does security setup, no?18:48
zygayed18:48
zyga*yes it does18:48
pedronisthat's the bit I was confused18:48
zygait's on the direct call path from overlord.New18:48
pedronisyes18:48
pedronisanyway no Ensure is called before do Loop in Start18:49
cachioSaviq, we don't test on rawhide18:49
zygahmm, overlord.Start is called and then we tell systemd we're ready18:49
zygaI'm sorry, I meant overlord.Loop18:50
pedronisyes18:50
cachioSaviq, so far we test on f2818:52
cachioSaviq, and we are gonna test on f29 soon18:52
cachioSaviq, no plans for rawhide, why?18:53
Saviqcachio: we're ~testing Mir on rawhide, was wondering if you have a reliable solution18:54
Saviqso we're upgrading from f28 to rawhide in spread, but that's not really reliable...18:54
zygamvo: got it18:55
zygamvo: Nov 28 18:55:03 nov281847-525365 snapd[30231]: helpers.go:192: cannot regenerate seccomp profile for snap "core": signal: terminated18:55
zygamvo: forkstat http://paste.ubuntu.com/p/DqmTPympC6/18:55
zygamvo: journal http://paste.ubuntu.com/p/4h9gcyWHCH/18:56
zygamvo: snap changes http://paste.ubuntu.com/p/snbqrwBXMW/18:56
cachioSaviq, still not testing on f2918:57
cachioSaviq, this is blocking that https://forum.snapcraft.io/t/issue-with-repackaged-core-and-testing/839418:57
cachioSaviq, I'll let you know if there is a plan to test on rawhide18:57
zygamvo: (phone)18:57
zygamvo: if you are looking at this19:02
zygamvo: I would love to understand the origin of the restart frenzy from 18:54:58 till 18:55:0219:02
cachioSaviq, do you need an image of f rawhide?19:07
Saviqcachio: if it's not any trouble :)19:08
cachioSaviq, I'll create it today19:08
cachioI'll ping you once it is ready19:08
mvozyga: sorry, was distracted as well - maybe we should add a debug.log() before READY=119:08
pedronismvo: I gave my +1 to #6185, needs a full 2nd review, and there are already some interesting comments there from other reviewers19:33
mupPR #6185: snap: add new `snap run --trace-exec` call <Performance 🚀> <Created by mvo5> <https://github.com/snapcore/snapd/pull/6185>19:33
mvopedronis: thank you. yeah, second review for fontconfig needed19:35
mvopedronis: I will look at the comments from pawel in the morning19:36
zygamvo: can you look at https://github.com/snapcore/snapd/pull/622719:40
mupPR #6227: overlord,daemon: mock security backends for testing <Created by zyga> <https://github.com/snapcore/snapd/pull/6227>19:41
zygaI'd love to shrink the number of PRs19:41
mupPR snapd#6227 closed: overlord,daemon: mock security backends for testing <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/6227>19:43
mupPR snapd#6226 closed: overlord,daemon: mock security backends for testing (2.36) <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/6226>19:44
zygathanks19:44
mupPR snapd#6229 closed: release: probe apparmor features lazily <Created by zyga> <Merged by zyga> <https://github.com/snapcore/snapd/pull/6229>19:45
mvozyga: thank *you*19:45
zygaI'll prepare that final small PR with system key behavior19:45
mvozyga: 6195 needs a second review19:45
zygaaha19:45
zygalooking19:46
pedronismvo: zyga: I think I reviewed the pending things for 2.36 blocking a .2 right?19:47
zygaAFAIK yes19:47
mvopedronis: yeah, I think we are good19:47
pedroniszyga: mvo: thanks for chasing these issues btw19:47
* pedronis goes a bit afk19:49
mvopedronis: thank you, yeah, happy that we found the issue. enjoy the evening19:49
* mvo also needs to be afk19:49
zygapushed the syskey handling change20:29
mupPR snapd#6233 opened: overlord: don't write system key if security setup fails <Created by zyga> <https://github.com/snapcore/snapd/pull/6233>20:29
mupPR snapd#6234 opened: overlord: don't write system key if security setup fails (2.36) <Created by zyga> <https://github.com/snapcore/snapd/pull/6234>20:29
zygaI will add unit tests tomorrow, it's far too late anyway20:29
zygattyl!20:29
mupPR snapd#6235 opened: overlord,apparmor: new syskey behaviour + non-ignored snap-confine profile errors <Created by zyga> <https://github.com/snapcore/snapd/pull/6235>20:38
* roadmr repings jdstrand20:50
jdstrandroadmr: sorry, still going through backscroll from being off. what's up?21:03
roadmrjdstrand: wanted to ask! is it possible to make a plug in one snap automagically connect to another snap's slot? I know the IDs of both snaps.21:04
jdstrandroadmr: yes21:24
roadmr\o/ jdstrand how would I go about it? I'm happy to study existing examples if you point me to some21:25
zygahey jdstrand, long time no see22:14
jdstrandzyga: hey, yeah22:30
