=== Wouter01009 is now known as Wouter0100
=== jnsgruk01 is now known as jnsgruk0
=== TooLmaN_ is now known as TooLmaN
[07:08] morning
[07:59] morning
[08:09] good morning pstolowski and mborzecki
[08:10] hey
[08:10] mvo: hey
[08:10] heh didn't realize matrix irc bridge used the wrong nick
[08:11] oh well maybe it didn't, but i still got confused
[08:16] there's a new daemon cleanup PR up: PR#10039
[08:16] and hi
[08:17] #10005 needs 2nd reviews
[08:17] Bug #10005: network properties revert after reboot
[08:17] PR #10005: seed: ReadSystemEssentialAndBetterEarliestTime
[08:28] pstolowski: not sure if you looked more into this download issue, it's interesting, I can reproduce it here with just "snap refresh --edge core". I hacked the download handler to use a really slow "rate-limit" so that I have plenty of time to observe. what is interesting is that fnotifytool shows we keep reading/writing the state every second but also that the generic-classic assertion is accessed every second too. when I stop the refresh things are normal again
[08:29] mvo: interesting! not yet, store wouldn't work yesterday evening
[08:30] mvo: maybe our progress bar updating is too aggressive?
[08:32] pstolowski: yeah, I can poke at this. what is also strange is the constant access to the assertion, at this point in the download handler only the store should do its download thing and not poke at the asserts db
[08:32] (for the model)
[08:32] mvo: yes
[08:36] pstolowski: replacing meter with a progress.NullMeter does not change anything
[08:42] mvo: about the model assertion, maybe there is something going on with the auth context?
[08:44] pstolowski, pedronis interesting observation, if I change the reRefreshRetryTimeout from 1/10 sec to something bigger (like 10 sec) the cpu usage is completely tame
[08:44] mvo: are we inside downloadImpl all the time when this is observed?
[08:44] pedronis: yeah, I suspect the auth context but maybe a red-herring (or a bit of an auxiliary issue)
[08:45] so it seems the high cpu usage is re-retry being too aggressive which is strange because iirc this has not changed in a while (or anything around this). or am I missing something?
[08:45] * mvo really likes fnotifytool fwiw
[08:46] mvo: hmm reRefreshRetryTimeout seems to be pretty aggressive, that's a lot of state locking/unlocking on retry
[08:47] (just noticed you concluded the same)
[08:48] pstolowski: yeah, that seems to trigger the high cpu usage
[08:48] mvo: well we might have added something slowish to Ensure
[08:48] mvo: yeah, that didn't change for a while, but maybe no one noticed
[08:48] pedronis: oh, good idea
[08:48] mvo: so the issue is not re-refresh itself but it will also trigger an Ensure loop
[08:48] so I think the big change is somewhere there
[08:49] we might have grown some preambles to Ensures that are slowish
[08:49] that's probably where the model reading also comes from
[08:50] mvo: maybe you should try comparing with something like 2.47 or 2.46
[08:51] pedronis: great idea
[09:01] pedronis, pstolowski I can reproduce this all the way back to 2.38 (stopped there, took random samples in between). it's very confusing, either my test is flawed or we have had this bug forever (but that is strange, we would have noticed :/
[09:01] * mvo goes with "test flawed for now"
[09:11] mvo: thanks for chasing this! i think i'm more skeptical about noticing something like this.. maybe on slow boards and big downloads.
but then maybe these older bug reports that we attributed solely to network flakiness were really two problems (and we just fixed one of them - hopefully)
[09:14] pstolowski: 2.36 is the one I found to be not affected but again, maybe flawed methods
[09:18] mvo: 2.38 is where we added re-refresh
[09:29] pedronis: yeah, I bisected a bit more, it looks like it's really 6356 :/
[09:30] mvo: it's fine, at least it's not a regression, we can think a bit more
[09:31] anyone restarted the tests in https://github.com/snapcore/snapd/pull/10033 today?
[09:31] PR #10033: tests: run the reset.sh helper and check test invariants while the test is restored
[09:31] PR snapd#10039 closed: daemon: switch preexisting daemon_test tests to apiBaseSuite and .req
[09:31] mborzecki: i didn't
[09:32] looks like one of the jobs may be stuck since 12h ago
[09:32] pedronis: yeah, I added notes to the standup doc about it
[09:33] hm restarted it now, let's see if it's a fluke on github or something with our workers
[09:34] also, https://github.com/snapcore/snapd/pull/10038 seems to be suffering from the same problem
[09:34] PR #10038: tests: replace while commands with the retry tool
[10:31] PR snapd#10033 closed: tests: run the reset.sh helper and check test invariants while the test is restored
[10:52] PR snapd#10040 opened: daemon: switch api_test.go to daemon_test and various other cleanups
[12:37] PR snapd#10041 opened: interfaces/builtin: update unit tests to use proper distro's libexecdir
[12:37] a trivial fix ^^
[13:08] cachio_: https://github.com/snapcore/snapd/pull/10038#discussion_r595151112
[13:08] PR #10038: tests: replace while commands with the retry tool
[13:09] mborzecki, thanks
[13:09] I already tested failover
[13:09] so it should pass
[13:11] cachio_: great, thanks!
[14:36] mvo: we expose the pprof profiling endpoint, you can grab a cpu profile and see what's taking so long, and later collect a trace even
[14:37] mvo: there are some examples in tests/main/debug-pprof
[15:23] ijohnson: i've updated https://github.com/snapcore/snapd/pull/10006
[15:23] PR #10006: cmd/snap-bootstrap: refactor handling of ubuntu-save, do not use secboot's implicit fallback
[15:24] mborzecki: thanks I'll have a look today then
[15:24] ijohnson: great, thanks!
[15:26] mborzecki: actually if you have a couple minutes can you review https://github.com/snapcore/snapd/pull/9307 ?
[15:26] PR #9307: interfaces/tee: add TEE/OPTEE interface
[15:27] sure
[15:48] * cachio_ lunch
[16:12] PR snapd#10042 opened: snapstate: reduce reRefreshRetryTimeout to 1/2 second
[16:16] mborzecki: ijohnson: pstolowski: anything you need from me still today? otherwise I will eod a bit earlier
[16:16] pedronis: I'm good for now thanks
[16:16] pedronis: i don't, thanks
[16:16] pedronis: i'm good thanks
[16:17] ok, thx
=== msalvatore_ is now known as msalvatore
=== oer is now known as oerheks
=== ijohnson is now known as ijohnson|lunch
=== ijohnson|lunch is now known as ijohnson
[18:58] * cachio_ afk
[20:50] does snapcraft export-login's --channels argument support regexes or globs or anything like that?
[20:51] istr the underlying macaroon stuff does but it's been a while
[20:54] ah hah https://dashboard.snapcraft.io/docs/api/macaroon.html#request-a-macaroon says "fnmatch format"
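As a footnote to the pprof suggestion at 14:36, here is a minimal sketch of grabbing a CPU profile from Go's standard pprof handler. It is not how snapd wires up its endpoint (the log only points at tests/main/debug-pprof for real examples); it calls `net/http/pprof`'s `Profile` handler in-process through `httptest`, and the `captureCPUProfile` helper name is hypothetical.

```go
package main

import (
	"fmt"
	"net/http/httptest"
	"net/http/pprof"
)

// captureCPUProfile is a hypothetical helper: it invokes the standard
// library's pprof CPU profile handler in-process for the given number of
// seconds and returns the raw profile bytes (gzip-compressed protobuf).
func captureCPUProfile(seconds int) []byte {
	url := fmt.Sprintf("/debug/pprof/profile?seconds=%d", seconds)
	req := httptest.NewRequest("GET", url, nil)
	rec := httptest.NewRecorder()
	pprof.Profile(rec, req) // blocks for roughly the requested duration
	return rec.Body.Bytes()
}

func main() {
	data := captureCPUProfile(1)
	fmt.Printf("captured %d bytes of profile data\n", len(data))
}
```

The bytes can be written to a file and inspected with `go tool pprof`; against a live daemon the same handler would normally be reached over HTTP rather than called directly.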