/srv/irclogs.ubuntu.com/2020/04/08/#snappy.txt

=== mup_ is now known as mup
=== Eickmeyer is now known as Eickmeyer-Quasse
=== Eickmeyer-Quasse is now known as Eickmeyer[q]
mborzeckimorning05:50
zygao/06:20
mborzeckizyga: hey06:22
mborzeckizyga: some trouble with the cla-check job06:22
zygauc20-snap-recovery failed06:22
mborzeckizyga: where?06:22
zygabut it ran on 19.1006:22
zygahttps://github.com/snapcore/snapd/pull/8440/checks?check_run_id=56919382606:22
mupPR #8440: github: move spread to self-hosted workers <Created by zyga> <https://github.com/snapcore/snapd/pull/8440>06:22
mborzeckizyga: uh, merge master06:23
zygais that even expected?06:23
zygaknown issue?06:23
mborzeckizyga: yes, it's fixed already06:23
zygak06:24
zygahow did cla check fail?06:25
zygait passed on my branch just now06:25
zyga38seconds06:25
zygameanwhile, travis is broken06:26
zygahttps://t.co/h3UEAleWVW?amp=106:26
zygaI think I can just go back to bed06:27
mborzeckizyga: if you open a PR with a commit right on top of the master so that no merge commit is generated it will fail06:27
zygaI see06:27
mupPR snapd#8439 closed: secboot: import secboot on ubuntu, provide dummy on !ubuntu <UC20> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8439>06:57
pstolowskimorning07:01
mvogood morning pstolowski07:01
mvozyga: quick question, do we have a 32bit machine in travis actions?07:02
zygamvo: travis actions?07:03
mvozyga: sorry, gh actions07:03
zygamvo: as I said yesterday I didn't add a 32bit xenial machine to github actions07:03
zygamvo: though it's a one-liner in the matrix, it slipped through the cracks in the initial PRs07:03
zygagood morning :)07:04
zygalast night store went belly up07:04
zygaand everything running failed one way or another07:04
zygaso I just called it quits and went to sleep (too late anyway)07:04
mborzeckimvo: pstolowski: hey07:05
mupPR snapd#8455 opened: tests/lib/cla_check: expect explicit commit range <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8455>07:05
mborzeckizyga: can we skip the spread jobs?07:05
zygamborzecki: in principle yes but it's not something we coded, we should try that if: ... expression I pasted before07:05
zygaone sec07:05
zygamaybe add that to your PR07:05
mborzeckicontains(github.event.issue.labels.*.name, 'skip-spread') or somesuch?07:06
zygayes07:06
zygaif: !contains ...07:06
mborzeckiidk tho, just copied and pasted from the docs :P07:06
zyga:)07:06
zygaI tried to get https://github.com/snapcore/snapd/pull/8440 green07:06
mupPR #8440: github: move spread to self-hosted workers <Created by zyga> <https://github.com/snapcore/snapd/pull/8440>07:06
zygabut each time something random failed07:07
mupPR snapd#8456 opened: tests: add 32 bit machine to GH actions <Simple 😃> <Created by mvo5> <https://github.com/snapcore/snapd/pull/8456>07:07
zygasome desktop service, some store bits, some reboot tests07:07
zygaso tough luck07:07
zygamvo: could you please merge https://github.com/snapcore/snapd/pull/845407:18
mupPR #8454: tests/session-tool: session ordering is non-deterministic <Created by zyga> <https://github.com/snapcore/snapd/pull/8454>07:18
mborzeckizyga: hm the docs are kinda meh07:22
mupPR snapd#8457 opened: github: skip spread jobs when corresponding label is set <Skip spread> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8457>07:23
zygamborzecki: interesting, except that the status check is required07:32
zygamborzecki: perhaps instead wrap that in ${{  }}07:32
zygaand have the worker essentially do nothing?07:32
zygamborzecki: ${{ .. }} is required in run blocks07:33
mborzeckizyga: hm which pr?07:33
zygayour pr07:33
mborzeckithere's 2 ;)07:33
zyga845707:33
zygaand there's a syntax error07:33
zygaI would drop the first part07:34
zygaas all events are pull requests07:34
zygalet me pull the docs07:34
zygaif: contains(github.event.pull_request.labels.*.name, 'Skip spread')07:35
zygathen just negate07:35
zygaif: !contains(github.event.pull_request.labels.*.name, 'Skip spread')07:35
zygabut as we learned, that should not go into if because then the status check wont report07:35
zygaso maybe:07:35
zygarun: | echo ${{ !contains(...) }}07:36
zygaand see what that prints (probably true as that is just js)07:36
mborzeckiheh07:36
zygathen wrap that into a shell07:36
zygaand should be good07:36
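A minimal Go sketch of what the `contains(github.event.pull_request.labels.*.name, 'Skip spread')` expression above evaluates: it walks the `labels` array of the pull request object and compares each `name`. The `pullRequestEvent` type, the `hasLabel` helper and the sample payload are hypothetical and only model the fields discussed here; this is not how GitHub implements the expression.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pullRequestEvent models only the part of the event payload used here
// (hypothetical type, trimmed to pull_request.labels[].name).
type pullRequestEvent struct {
	PullRequest struct {
		Labels []struct {
			Name string `json:"name"`
		} `json:"labels"`
	} `json:"pull_request"`
}

// hasLabel reports whether the pull request in payload carries the label.
func hasLabel(payload []byte, label string) (bool, error) {
	var ev pullRequestEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return false, err
	}
	for _, l := range ev.PullRequest.Labels {
		if l.Name == label {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Hypothetical, heavily trimmed payload.
	payload := []byte(`{"pull_request": {"labels": [{"name": "Skip spread"}]}}`)
	skip, err := hasLabel(payload, "Skip spread")
	if err != nil {
		panic(err)
	}
	// Mirrors the negated form: run spread only when the label is absent.
	fmt.Println("run spread:", !skip)
}
```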
mborzeckii mean, wtf are the docs about labels?07:36
zygathey are there07:36
zygahold on07:36
zygait's somewhat confusing because they are not in the action docs07:36
zygabut in the bigger github docs07:36
zygathe whole object model is documented07:36
zygahttps://developer.github.com/v3/issues/labels/07:37
zygaby doing ${{ ... }} you're effectively tapping into that07:37
mborzeckizyga: the pull request event is this: https://developer.github.com/v3/activity/events/types/#pullrequestevent  doesn't list the label there but it's in the example07:38
mborzeckiand it's an empty array07:38
mborzeckihowever, there's actually an example in the issues event payload07:39
zygahttps://developer.github.com/v3/pulls/ has the labels listed07:39
mvozyga: re 8454 sure, I will merge once the spread tests finished, they are still running07:40
zygathanks07:40
zygaone test already failed07:41
zygaon portal info07:41
mvozyga: oh, ok. is james aware of the flakiness here?07:41
zygaI don't know07:41
zygait's in spread-unstable so perhaps nobody noticed?07:41
zygajamesh: can you please check if this is expected07:42
mvoaha, could be07:42
zygahttps://github.com/snapcore/snapd/pull/8454/checks?check_run_id=569207445#step:4:81407:42
mupPR #8454: tests/session-tool: session ordering is non-deterministic <Created by zyga> <https://github.com/snapcore/snapd/pull/8454>07:42
zygafedora failed to prepare, network error07:43
mborzeckizyga: idk, i think that the labels is not actually included there07:44
zygamborzecki: where specifically?07:44
mborzeckizyga: if the pull_request object is the same as pull_request in https://developer.github.com/v3/activity/events/types/#pullrequestevent then the label is not there07:44
mborzeckibut should be?07:44
mborzeckiidk07:44
zygapull request *event*07:44
jameshzyga: It isn't expected.  If you're seeing this error, then it can't map the process ID to a snap via cgroups07:44
zygarefers to pull request07:44
zygathat has labels07:45
zygajamesh: fun, I guess it is debug time then07:45
zygamvo: https://github.com/snapcore/snapd/pull/8456/files07:56
zygais the vendor change expected?07:56
mupPR #8456: tests: add 32 bit machine to GH actions <Created by mvo5> <https://github.com/snapcore/snapd/pull/8456>07:56
zygamvo: https://github.com/snapcore/snapd/pull/8440 is green07:56
mupPR #8440: github: move spread to self-hosted workers <Created by zyga> <https://github.com/snapcore/snapd/pull/8440>07:56
zygabut let's chat about that in the call07:56
mborzeckizyga: don't think that check works https://github.com/snapcore/snapd/pull/8457 looks like the spread jobs are still schedule07:56
mupPR #8457: github: skip spread jobs when corresponding label is set <Skip spread> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8457>07:56
mborzeckid07:56
zygamborzecki: how do you determine that?07:57
zygamborzecki: they are required, so they are marked as expected07:57
zygamborzecki: note that normally you don't get any jobs until the previous pass is successful07:57
zygaso I don't believe this is accurate as measurement07:58
mborzeckiok, let's wait then07:58
mvozyga: yeah07:58
zygaah07:59
zygaI see the 2nd commit now07:59
zygacool07:59
zygathanks07:59
mupPR snapd#8440 closed: github: move spread to self-hosted workers <Created by zyga> <Merged by zyga> <https://github.com/snapcore/snapd/pull/8440>08:11
jameshmborzecki: one option would be to move the if: clause down to the step level08:15
mborzeckizyga: have you seen the 'cancel workflow' request to have any effect?08:16
mborzeckijamesh: supposedly job level `if` is supported now https://github.blog/changelog/2019-10-01-github-actions-new-workflow-syntax-features/08:16
jameshmborzecki: it's not quite as efficient since a job would still be sent to a runner, but it would mean the job would be considered successful08:16
mborzeckiunless it isn't :/ idk, maybe i just need to wait08:17
jameshmborzecki: yes, but if the conditional causes the job not to run, then it isn't considered successful08:17
jameshif you want to get rid of the "Some checks haven’t completed yet" message, the jobs need to at least do something08:18
zygamvo: there's a problem with the -32 bit build08:25
zygasrc/github.com/snapcore/snapd/vendor/github.com/chrisccoulson/go-tpm2/mu.go:267:17: constant 4294967295 overflows int08:25
zygachrisccoulson: ^ FYI08:26
zygamborzecki: IIRC cancelling works but spread doesn't cancel and the worker is killed08:26
mvozyga: I know, I updated the PR that adds 32bit works, it should have a fix08:33
zygamaybe the hash is wrong?08:34
mvozyga: oh, let me double check :(08:34
mvozyga: could be that govendor confused me08:34
zygawhen you push again merge master please08:34
mvozyga: sorry, I'm an idiot, I updated go-tpm instead go-tpm208:35
* zyga hugs mvo08:35
zygahttps://github.com/snapcore/snapd/pull/8403 needs a 2nd review08:36
mupPR #8403: sandbox/cgroup: avoid making arrays we don't use <Skip spread> <Created by zyga> <https://github.com/snapcore/snapd/pull/8403>08:36
zygait failed on store traffic: - Fetch and check assertions for snap "test-snapd-content-slot-no-content-attr" (1) (error reading assertion headers: read tcp 10.240.1.50:58298->91.189.92.20:443: use of closed network connection (Client.Timeout exceeded while reading body))08:37
mupPR snapd#8458 opened: github: allow cached debian downloads to restore <Created by zyga> <https://github.com/snapcore/snapd/pull/8458>08:41
zygajamesh: https://github.com/snapcore/snapd/pull/845808:41
mupPR #8458: github: allow cached debian downloads to restore <Created by zyga> <https://github.com/snapcore/snapd/pull/8458>08:41
zygathis should fix the cache08:41
zygathough I think it looks only in the scope of the PR, there's still more opportunity to cache things than we exploit08:42
zyga(caches are associated with objects and are not global)08:42
zygabrb08:44
jameshI suspect caches are probably scoped  to the (repo, user) pair08:46
* zyga monitors https://github.com/snapcore/snapd/actions?query=is%3Aqueued08:53
mupPR snapd#8421 closed: tests: enable unit tests on debian-sid again <Simple 😃> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8421>09:01
zygamvo: that seems to have fixed things09:03
zygaoh, I spoke too soon09:03
zygamvo: src/github.com/snapcore/snapd/vendor/github.com/snapcore/secboot/utils.go:73:37: cannot call non-function he.TPMError.Code (type tpm2.ErrorCode)09:03
zygaI think this commit is not good :/09:03
zygawhy didn't this get flagged by the unit test run?09:04
zygaare we not building / testing secboot?09:04
zygaahh wait09:04
zygathat's weird09:04
zygaah, snapcore/secboot is a different repository09:04
zygaoh well09:04
zyga(we don't seem to test anything there in CI)09:05
mvozyga: meh09:06
zygabut at least the tests were quick now :)09:06
mvozyga: haha, yes. but that's slightly annoying that this fails09:08
mvozyga: one more try09:10
zygaok09:10
zygastill 0 queued09:10
zyga(which is good)09:11
zygamborzecki: thanks for the suggestion in https://github.com/snapcore/snapd/pull/761409:15
zygaupdated09:15
mupPR #7614: cmd/snap-confine: implement snap-device-helper internally <Created by zyga> <https://github.com/snapcore/snapd/pull/7614>09:15
zygastill 0 queued09:16
zygamvo: I also wonder if actions are more heavily used in US, making afternoon "harder"09:16
jameshI've always found CI runs faster before you Europeans wake up09:17
zygamborzecki: could you look at https://github.com/snapcore/snapd/pull/7825 and tell me if you think it's worth splitting09:18
jameshI think it is more a case of two groups of users using CI at once09:18
mupPR #7825: many: use transient scope for tracking apps and hooks <Security-High> <Created by zyga> <https://github.com/snapcore/snapd/pull/7825>09:18
zygaI could take the go bits that do cgroup scanning out and push separately09:18
zygajamesh: haha, yeah09:18
mborzeckiheh, as jamesh commented, https://github.com/snapcore/snapd/pull/8457 does appear to be stuck09:20
mupPR #8457: github: skip spread jobs when corresponding label is set <Skip spread> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8457>09:20
mborzeckithe unit tests job should run though, but it hasn't yet09:20
mborzeckiweird, i'll wait a little bit longer09:22
jameshcould it have rejected the workflow entirely?09:22
mborzeckiidk, clearly something is off09:27
zygaone job queued09:27
zyga(all 32 spread workers are busy)09:28
zygamborzecki: weird09:28
zygamborzecki: can you rebase on master and push?09:28
zygaat 32 spread runs I'm seeing roughly 1MB/s in and 1MB/s out09:29
zygathat's not too terrible09:29
zygait spikes to 10MB/s09:29
zygaespecially when new jobs kick in and there's the initial sync09:29
mborzeckizyga: where do you see that?09:29
zygaspread has an inefficiency where the starting worker pushes the same tarball to each node09:29
zygamborzecki: on the machine running spread workers09:29
zygawe could optimize that traffic down by just sending the tarball once and then fetching it from the cloud09:30
pstolowskipedronis: hi. currently FilesystemOnlyApply skips core-only handlers if release is classic; i think this needs to be relaxed for image/setupSeed with a flag passed down to FilesystemOnlyApply; makes sense?09:36
pedronispstolowski: let me look09:39
pedronispstolowski: yes, the cleanest thing is probably for the package not use release.OnClassic at all, and get info through some options09:44
pstolowskipedronis: k, thanks for confirming09:45
zygacore 18 revert tests failed: https://github.com/snapcore/snapd/pull/8454/checks?check_run_id=57024800210:05
mupPR #8454: tests/session-tool: session ordering is non-deterministic <Created by zyga> <https://github.com/snapcore/snapd/pull/8454>10:05
zyga+ snap list10:06
zygaerror: cannot list snaps: cannot communicate with server: timeout exceeded while waiting for response10:06
mupPR snapd#8454 closed: tests/session-tool: session ordering is non-deterministic <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8454>10:10
popeyogra where should bugs about ubuntu core images be filed?10:15
popeyactually, probably a bug in the installer, is that subiquity on core? (the first run thing)10:16
* popey starts a forum thread.10:21
zygamborzecki: TBH I really wish there were type annotations10:26
zygareading foreign python code is like "where are the types" :(10:26
zygamborzecki: did you try adding any annotations?10:30
mborzeckizyga: not really, i've had enough fun with implementing the chooser ui10:34
mborzeckizyga: anyways if you want to play with it, better talk to mwhudson first10:34
zygamborzecki: https://github.com/CanonicalLtd/subiquity/pull/692#pullrequestreview-38984454910:37
* zyga goes upstairs to make tea10:37
mupPR CanonicalLtd/subiquity#692: console_conf: various recover chooser tweaks <Created by bboozzoo> <https://github.com/CanonicalLtd/subiquity/pull/692>10:37
zygawe are running at 23/32 workers now10:37
zygawe've reached saturation once for about 20 minutes10:37
pedronismvo: I made some comments in #8325, some are really general hindsight questions10:39
mupPR #8325: snap-bootstrap: copy auth data from real ubuntu-data in recovery mode <UC20> <Created by mvo5> <https://github.com/snapcore/snapd/pull/8325>10:39
mupPR snapd#8458 closed: github: allow cached debian downloads to restore <Created by zyga> <Merged by zyga> <https://github.com/snapcore/snapd/pull/8458>10:45
mupPR snapd#8448 closed: tests/session-tool: add session-tool --dump <Simple 😃> <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8448>10:47
zygathanks!10:47
mvopedronis: thanks, will look in a wee bit, looks like it is closed, I will try to get it to a landable point today :)10:48
ograpopey, yeah, subiquity is correct10:51
ograpopey, but the issue is indeed the clock ...10:52
pedronismvo: I don't know, there are some open questions11:02
zygamborzecki: https://github.com/CanonicalLtd/subiquity/pull/692#pullrequestreview-38987031711:03
mupPR CanonicalLtd/subiquity#692: console_conf: various recover chooser tweaks <Created by bboozzoo> <https://github.com/CanonicalLtd/subiquity/pull/692>11:03
mborzeckizyga: thanks!11:04
popeyogra ok11:07
pedronismvo: can you merge #8449, it's all green but travis never came back or started, afaict ?11:22
mupPR #8449: dirs: don't depend on osutil anymore, mv apparmor vars to apparmor pkg <Simple 😃> <Test Robustness> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/8449>11:22
mvopedronis: sure11:25
mupPR snapd#8449 closed: dirs: don't depend on osutil anymore, mv apparmor vars to apparmor pkg <Simple 😃> <Test Robustness> <Created by anonymouse64> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8449>11:26
zygacore 20 recovery design11:30
zygaMAGA - make appliance good again11:30
* zyga hides11:30
zygawe are at 3/32 workers11:31
zygathough it will go back to ~20 once canary jobs are done11:31
ograMAGA ? so should we deny it exists until it hits us hard ? :)11:32
mborzeckiogra: you mean another customs war?11:37
zygaI started implementing snapctl refresh-available11:39
zygashould have a simple version today11:39
zygabut first, *hot* tea11:39
zygathe office is horribly cold even today11:39
zygaI need a 2nd review for https://github.com/snapcore/snapd/pull/840311:40
mupPR #8403: sandbox/cgroup: avoid making arrays we don't use <Skip spread> <Created by zyga> <https://github.com/snapcore/snapd/pull/8403>11:40
mupPR snapd#8459 opened: asserts: it should be possible to omit many snap-ids if allowed, fix <Created by pedronis> <https://github.com/snapcore/snapd/pull/8459>12:08
zygapedronis: ^ gofmt12:15
diddledanno, you go fmt!12:17
mupPR snapd#8460 opened: tests/session-tool: kill cron session, if any <Created by zyga> <https://github.com/snapcore/snapd/pull/8460>12:26
zygapedronis: ta12:26
pedronisI'm seeing failures on core-16-64, that are not obviously bogus12:42
zygawhat kind of failures?12:44
zygaI saw two kinds today:12:44
zyga- reboot that went nowhere12:44
zyga- snap rollback and timeout on "snap list"12:45
zygathat felt really broken12:45
zygamborzecki: https://github.com/snapcore/snapd/pull/8457/checks?check_run_id=570718915 <- cache of debian deps worked!12:45
mupPR #8457: github: skip spread jobs when corresponding label is set <Skip spread> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8457>12:45
pedroniszyga: possibly, yes, reboot that went nowhere, but it seems new and real12:45
zygamborzecki: I wonder if we can set cache scope to "global" to make sure everyone benefits12:45
zygapedronis: I saw the reboot failure about twice last week as well12:45
zygabut never when testing with -debug to see :/12:46
zygamborzecki: spread-canary started on your skip label PR12:46
zygamborzecki: and it works!!!12:46
zygamborzecki: cool12:46
zygamborzecki: with some extra love you could set a status label that shows it was skipped12:47
zygabut the feature works :)12:47
mborzeckizyga: uhh i don't like it though12:47
zygamborzecki: why?12:47
mborzeckizyga: we still need to take as many workers as distros12:47
zygamborzecki: but not spread Vms12:47
zygamborzecki: that's nearly free12:47
zygamborzecki: they all passed now12:47
zygamborzecki: it adds ~30 seconds12:47
zygaand it's green - except for "pending travis"12:48
mborzeckihahah12:48
zygamvo: https://github.com/snapcore/snapd/pull/8457 <-12:48
mupPR #8457: github: skip spread jobs when corresponding label is set <Skip spread> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8457>12:48
mborzeckino surprises there12:48
* zyga hugs maciek12:48
zygathank you :)12:48
mborzeckinow, i still need to figure out that cla check12:53
mborzeckilooks like there's a difference in what gets merged where between gh and travis12:54
zygamvo: src/github.com/snapcore/snapd/vendor/github.com/chrisccoulson/go-tpm2/mu.go:267:17: constant 4294967295 overflows int12:56
zygamvo: this now breaks master .deb builds12:57
zygamborzecki: you can change how we check out things12:57
zygamborzecki: there's also plenty of 3rd party solutions for this but I didn't look deeper12:57
zygamborzecki: one was cool though, each CLA signature was a signed file in the repo12:57
zygamborzecki: so the check was entirely offline12:57
pedroniszyga: it's because we are getting a pc-kernel update in the middle of the tests12:57
zygaohh12:58
zygapedronis: did you reproduce it12:58
mborzeckizyga: there's an action ready for that, brought to you by SAP (?!)12:58
pedroniszyga: no, but the log is obvious12:58
zygapedronis: we should probably hold refreshes for snaps that cause reboots12:58
pedronisonce you look at it and that the tests12:58
zygamborzecki: yes, SAP12:58
mborzeckihttps://github.com/cla-assistant/github-action12:58
zygamborzecki: fun world :)12:58
pedroniszyga: we don't have single snaps holding12:58
zygaI read that one12:58
pedronisbut it's still strange12:58
zygapedronis: oh... right12:58
zygahmm12:58
zygabut why doesn't it come back?12:58
pedroniswhy would we try to refresh kernel anyway12:58
zygamaybe really buggy?12:58
pedronissomething is off12:58
zygaoh12:58
zygastandup time12:58
* zyga needs to check one thing first12:59
pedroniszyga: because we make reboot slow explicitly12:59
pedronisthe test don't really support unexplicit12:59
pedronisreboots12:59
zygaah, indeed12:59
pedronisthese tests are not meant to reboot12:59
mupPR snapd#8457 closed: github: skip spread jobs when corresponding label is set <Skip spread> <Created by bboozzoo> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/8457>13:00
pedroniszyga: anyway I do think we want to add per-snap holding at some point, just not clear when13:00
rbasakI just accidentally published a snap to beta, when nothing was published in beta before. Can I undo that?13:09
roadmrrbasak: yes13:10
roadmrrbasak: snapcraft close your-snap beta13:10
rbasakAh, I found a "close" option?13:10
rbasakGot it. Thanks!13:10
roadmr👍13:10
rbasakWhile I'm on the topic, is there any way I can unpublish the i386 snaps (on a different snap)? We don't build those now, so the ones that are there are way behind and probably useless now.13:12
mvozyga: yeah, trying to fix it in 845613:18
jdstrandzyga: note, I re-read through PR 8408 yesterday (though after it was merged; it was fine)13:23
mupPR #8408: snap/naming: add validator for snap security tag <Skip spread> <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8408>13:23
zygajdstrand: thank you!13:23
jdstrandzyga: PR 7614 and PR 7825 are very high on the list13:24
zygajdstrand: thanks13:24
mupPR #7614: cmd/snap-confine: implement snap-device-helper internally <Created by zyga> <https://github.com/snapcore/snapd/pull/7614>13:24
mupPR #7825: many: use transient scope for tracking apps and hooks <Security-High> <Created by zyga> <https://github.com/snapcore/snapd/pull/7825>13:24
jdstrandzyga: the apparmor upload and some training I gave set me back a bit, but I will be getting to them13:24
zygajdstrand: both fail in CI now, one on silly thing and one (f31 or f32) fails on something that seems real, I'll investigate soon13:24
zygajdstrand: but any feedback would be great13:24
zygajdstrand: in the branch about refresh-app-awareness please note if I should split up the cgroup scanning code to a separate PR, it could be reviewed faster and land, aiding further review13:25
* jdstrand nods13:26
zygauh13:49
zygamy daughter's school friend is at a hospital13:50
zygahe lives next door :/13:50
roadmrohnoes13:50
zygaFYI we run at capacity now, saturated 32 workers13:51
zygabut the queue is empty13:51
zygaand we should see a drop to ~ half of that in a few minutes13:51
zygaI will look at implementing the ideas we had today, that should reduce the queue load significantly13:52
zygawell, worker load13:52
zygawe're still not queueing because we manage to process everything (for now)13:52
zygaand we are at 24/3213:56
pstolowskizyga: oh14:06
mupPR snapcraft#3019 closed: static: consolidate tooling setup to setup.cfg <Created by sergiusens> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/3019>14:06
mupPR snapcraft#3020 closed: spread tests: default base for local plugin tests <Created by sergiusens> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/3020>14:06
mupPR snapcraft#3022 opened: plugins: introduce v2.PluginV2 and v2.NilPlugin <Created by sergiusens> <https://github.com/snapcore/snapcraft/pull/3022>14:06
zygapstolowski: yeah :/14:07
* zyga is hungry and breaks for lunch14:07
zygao/14:07
mupPR snapd#8411 closed: boot: cleanup more things, simplify code <Reviewed> <UC20> <Created by anonymouse64> <Merged by anonymouse64> <https://github.com/snapcore/snapd/pull/8411>14:11
stgraberzyga: back to what I asked you about yesterday, any reason why "snap run --command=stop" would block on snapd.socket?14:11
stgraberzyga: if so, you need to find a way to make snapd keep running until the last snap has been stopped14:11
zygayes, it does so when system key is different to the one on disk14:11
zygait normally happens when you boot a new kernel14:12
zygawe ask snapd to generate new profiles and wait until it does so14:12
stgraberzyga: the current situation means LXD is never stopped properly, causes a 10min shutdown delay and data loss14:12
stgraber*never stopped properly in those cases14:12
stgraberI'm not sure it's just about kernel updates, I have a system that seems to reproduce it every time, let me try it again today14:13
zygastgraber: please raise a bug to mvo14:13
* zyga is at lunch14:13
zyga(well almost)14:13
zygamborzecki: https://github.com/snapcore/snapd/pull/846114:13
mupPR #8461: github: run non-canary if label is present <Created by zyga> <https://github.com/snapcore/snapd/pull/8461>14:13
* zyga is gone14:13
mupPR snapd#8461 opened: github: run non-canary if label is present <Created by zyga> <https://github.com/snapcore/snapd/pull/8461>14:14
zygastgraber: FYI I experienced this issue but was unable to debug it at the time14:16
stgraberyeah, got it easily reproducible on an arm64 system somehow, seems to happen at every single reboot14:16
stgraberthe new kernel thing would explain why other users only get it somewhat randomly though14:17
zygaFor me it was x8614:17
stgraberfiling a critical bug against snapd claiming data loss14:17
zygaPerhaps system key is buggy14:17
zygaPlease!14:17
stgrabermvo: https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/187165214:22
mupBug #1871652: Daemon snaps not properly stopped in some cases <champagne> <snapd (Ubuntu):Triaged> <https://launchpad.net/bugs/1871652>14:22
stgrabermvo: as you know, every single server and cloud instances of 20.04 will use the LXD snap and all upgrading users of 18.04 snap will upgrade to the snap too, so we really really need this resolved or we're in for a lot of data loss / corruption issues.14:23
mvostgraber: looking14:25
stgrabermvo: thanks!14:25
stgraberI'm creating a test VM on that arm64 system which can be played with as much as needed, should make fixing this easier14:25
mvostgraber: I think zyga is on to something here, snap run will wait for snapd to re-generate the profiles, if snapd is already stopped this of course won't work, I need to see why this happens/how to do fix it14:27
stgrabermvo: I've updated the LP bug with my reproducer on arm6414:40
stgrabermvo: I'm happy to sort out a way for someone from your team to access that system if that helps14:40
* cachio afk14:41
* cachio afk14:42
mvostgraber: in various meetings right now, need to find someone to look at this while I'm "off"14:42
zygare14:42
zygastgraber: I have plenty of arm64 boards14:42
zygaI can look later today14:43
stgraberzyga: VM capable?14:43
zygastgraber: hmmmm14:43
zygastgraber: good question14:43
* zyga checks14:43
stgraberzyga: the system I'm testing this on is a 48 core, 128GB RAM, arm64 server :)14:43
zygastgraber: I think you win :)14:43
stgraber(which Qualcomm kindly forgot in my basement before firing the entire team who designed it)14:43
zygaGCE had a hiccup, restarted a job to see if it was temporary14:44
mupPR snapd#8459 closed: asserts: it should be possible to omit many snap-ids if allowed, fix <UC20> <⚠ Critical> <Created by pedronis> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/8459>14:53
mupPR snapd#8460 closed: tests/session-tool: kill cron session, if any <Created by zyga> <Merged by zyga> <https://github.com/snapcore/snapd/pull/8460>14:55
cjwatsonthat's one way to get hardware14:56
zygastgraber: if you _ever_ want to throw it out14:58
zygajust remember14:58
zygabring it to europe on a plane and I can relieve you of it ;-)14:59
stgraber:)15:00
zygastgraber: are there any arm servers available that don't require a datacenter contract?15:00
zygastgraber: looking at the bug now15:04
zygastgraber: would it be possible for me to get a shell on a machine where this can be reproduced?15:04
zygaalternatively, I'd love to see the system key snapd writes15:05
stgraberzyga: happen to have IPv6 on your side?15:05
zygastgraber: it is in /var/lib/snapd/system-key15:05
zygastgraber: unfortunately no :/15:05
zygamaybe don't open access for now,15:05
zygasystem-key is ... well, the key15:05
mvozyga: thanks for looking, I'm a bit busy with meetings15:05
zygait may be revealing15:05
stgraber{"version":10,"build-id":"799a88b406b245795da51b18f6224003020c6fb9","apparmor-features":["caps","dbus","domain","file","mount","namespaces","network","network_v8","policy","ptrace","query","rlimit","signal"],"apparmor-parser-mtime":1538072454,"apparmor-parser-features":["unsafe"],"nfs-home":false,"overlay-root":"","seccomp-features":["allow","errno","kill_process","kill_thread","log","trace","trap","user_noti15:08
stgraberf"],"seccomp-compiler-version":"66988dd2c3fb0abf9b1fb29be212771d7c38ae85 2.4.1 8c73f36d3de1f71977107bf6687514f16787f639058b4db4c67b28dfdb2fd3af bpf-actlog","cgroup-version":"1"}15:08
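A rough Go sketch of the idea described earlier in the log, where `snap run` waits for snapd when the current system key differs from the one on disk. It decodes two JSON documents and compares them field by field. The `keysMatch` helper, the trimmed field set and the second build-id are hypothetical; this is an illustration under those assumptions, not snapd's actual comparison code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// keysMatch reports whether two system-key JSON documents hold the same
// fields and values, ignoring key order and whitespace.
func keysMatch(onDisk, current []byte) (bool, error) {
	var a, b map[string]interface{}
	if err := json.Unmarshal(onDisk, &a); err != nil {
		return false, err
	}
	if err := json.Unmarshal(current, &b); err != nil {
		return false, err
	}
	return reflect.DeepEqual(a, b), nil
}

func main() {
	// A few fields taken from the paste above.
	onDisk := []byte(`{"version":10,"build-id":"799a88b406b245795da51b18f6224003020c6fb9","nfs-home":false,"cgroup-version":"1"}`)
	// Hypothetical key computed at run time, with a made-up build-id.
	current := []byte(`{"version":10,"build-id":"hypothetical-other-build","nfs-home":false,"cgroup-version":"1"}`)

	same, err := keysMatch(onDisk, current)
	if err != nil {
		panic(err)
	}
	fmt.Println("profiles need regenerating:", !same)
}
```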
zygathanks, let me inspect things now15:08
zygastgraber: so, why does lxd stop itself using snap run?15:11
zygathis is not a bug on your side, I think, I'm just curious15:12
stgraberthat's how the systemd units are generated15:12
stgraberall Commands in there wrap using snap run I think15:12
zygaah, I see,15:12
zygaindeed15:13
zygastgraber: is it possible to reproduce this with SNAPD_DEBUG=1 set15:15
zygastgraber: if so please attach that15:16
zygaI need to break now, my 1yo daughter just woke up15:16
zygabut I have a hunch I know what it is15:16
zygahaving that will confirm15:16
pstolowskiso, fun fact about persistent journal; restarting systemd-journald triggers snapd restart (?) and since this is happening from config hook, bad things happen :/15:18
pedronispstolowski: that's a problem for sure :/15:21
pstolowskipedronis: it's annoying, because core16 seems to need journald restart15:22
zygaWhaaat?15:25
zygaWhy do we restart?15:25
zygaCan we reload it instead?15:25
pstolowskizyga: i need to try15:26
pedronispstolowski: core18 and 20 work without?15:39
pstolowskipedronis: core18 - yes. i haven't checked 2015:41
pstolowskiFailed to reload systemd-journald.service: Job type reload is not applicable for unit systemd-journald.service.15:43
pstolowski:}15:43
pstolowskithere you go15:43
pstolowskionly systemctl restart does it15:43
pedronispstolowski: if it works with 18 and 20 without, I would just go without, restarting the journal is kind of weird anyway15:44
pstolowskipedronis: ok. i'll double check if i wasn't dreaming15:45
pedronispstolowski: anyway what you could try is kill USR115:46
pedronispstolowski: see man systemd-journald15:46
pstolowskipedronis: aaha, thanks!15:47
pedronispstolowski: as usual is not super clear what it does15:49
pedronisfrom the man15:49
pedroniszyga: mvo: I got again a bunch of allocation problems: https://github.com/snapcore/snapd/pull/8436/checks?check_run_id=57097899415:55
mupPR #8436: configcore,tests: use daemon-reexec to apply watchdog config <Reviewed> <Squash-merge> <Created by pedronis> <https://github.com/snapcore/snapd/pull/8436>15:55
zygapedronis: looking15:56
zygapedronis: happened once today15:56
zygait looks like some permission issue15:56
zygait was mentioned on the internal channel15:56
zygaplease restart the workflow, it's not a capacity problem15:56
zygawe don't know what caused it15:56
zygabetter yet, merge master for more fixes :)15:57
pedroniszyga: I merged master many times15:58
pstolowskipedronis: systemctl kill --signal=SIGUSR1 systemd-journald does the job on core1616:10
pedronispstolowski: good16:10
pedronispstolowski: that seems safe everywhere16:10
pedronissystemd has Kill I think, right?16:11
pedronisI mean systemd our package16:11
pstolowskipedronis: yes, i'm just looking at it right now16:12
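[editor's note] The approach discussed above (signal journald instead of restarting it) might look roughly like this in Go. This is a sketch under assumptions: the helper names journaldSignalArgs and signalJournald are hypothetical and are not the helpers in snapd's systemd package; only the systemctl command line itself comes from the conversation. According to the systemd-journald man page, SIGUSR1 asks journald to flush journal data from /run to /var.

```go
package main

import (
	"fmt"
	"os/exec"
)

// journaldSignalArgs builds the systemctl arguments quoted in the chat:
//   systemctl kill --signal=SIGUSR1 systemd-journald
func journaldSignalArgs(signal string) []string {
	return []string{"kill", "--signal=" + signal, "systemd-journald"}
}

// signalJournald sends the signal through systemctl rather than restarting
// the unit, and reports systemctl's output when the command fails.
func signalJournald(signal string) error {
	out, err := exec.Command("systemctl", journaldSignalArgs(signal)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("cannot signal systemd-journald: %v (output: %q)", err, out)
	}
	return nil
}
```

Calling signalJournald("SIGUSR1") needs sufficient privileges and a running systemd, so the argument builder is kept separate to allow checking the command line without running it.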
mvozyga: do you have anything to share about the lxd bug ? anything you figured out already that is worth for me to know?16:41
zygamvo: I'm still partially AFK but give me some more time16:41
zygamvo: I have conditions to reproduce it16:42
zygamvo: and I _suspect_ I know what the problem is16:42
mvozyga: nice, keep me updated please16:43
ijohnsonpedronis: did you want me to change to use a string pointer for mockedMountInfo in 8451 ?16:47
pedronisijohnson: yes, if it's not too annoying16:48
ijohnsonsure I mean I'll only have to re-start the workflows 1000 more times anyways so it's not a big deal16:48
pedronisijohnson: about the selinux tests, yes, that's fine, anyway it's a different package, it was really testing two levels16:49
ijohnsonright16:49
zygalet's merge https://github.com/snapcore/snapd/pull/8456#pullrequestreview-39018974917:37
zygait needs a 2nd review17:37
mupPR #8456: tests: add 32 bit machine to GH actions <Created by mvo5> <https://github.com/snapcore/snapd/pull/8456>17:38
zygaijohnson: any issues?17:38
ijohnsonmm?17:39
ijohnsonoh the PR you just mentioned?17:39
zygawith CI17:39
ijohnsonI haven't been looking at CI in the past hour or two, it just seems annoying that every time I look at one of my PRs exactly one check out of the 17 has failed and so I have to restart everything17:40
zygaijohnson: I'll prepare the quad workflows for tomorrow17:41
ijohnsonI reviewed 845617:41
zygaijohnson: it's late and I'm looking at something else17:41
ijohnsonyes that would be much appreciated17:41
ijohnsonalso did you see the mount ns bug I assigned to you last night ?17:41
* zyga needs coffee and checks17:41
ijohnsonI couldn't reproduce it with robust-mount-namespace-updates=true with a small reproducer snap, but with the full snap I can still reproduce the EBUSY17:41
ijohnsonanyways I can send you the snap when you have time to look at the issue17:42
mupPR snapd#8456 closed: tests: add 32 bit machine to GH actions <Created by mvo5> <Merged by zyga> <https://github.com/snapcore/snapd/pull/8456>17:42
zygaijohnson: cannot find it, let me check my mail17:43
pedronisdid we get a newer systemd recently in 20.04 ?17:50
mvoijohnson: I can override failures in merges fwiw18:01
mvoijohnson: we are a bit timezone challenged so not ideal but do ping me if you have such a case18:02
ijohnsonmvo ack maybe I'll send you an email at my EOD if needed18:04
mvoijohnson: sure thing!18:04
zygare, back to work18:08
zygapedronis: 244 was in February18:09
zyga245 was in March18:09
zygawe are now on 245.218:10
pedronisok, just confused because a test that I tried failed now, anyway it indeed needs tweaking for systemd >=24318:13
zygamvo: I debugged the issue related to lxd and snapd19:17
zygacachio: 19.10 images also have GDM19:19
zygacachio: it would be good to regenerate them so that we don't have the desktop19:19
ijohnsonzyga: what was the issue with lxd and snapd ?19:20
ijohnsonI'm curious19:20
zygaijohnson: https://bugs.launchpad.net/snapd/+bug/187165219:25
zygait's all there19:25
mupBug #1871652: Daemon snaps not properly stopped in some cases <champagne> <snapd:Confirmed for zyga> <snapd (Ubuntu):Confirmed for zyga> <https://launchpad.net/bugs/1871652>19:25
zygaijohnson: but tl;dr; is in the last comment19:25
* ijohnson reads19:25
zygaijohnson: it's pretty interesting actually19:25
zygastgraber: thank you for the debugging environment19:25
zygastgraber: it's late so unless it's very urgent I will fix it first thing tomorrow after discussing with the team19:25
stgraberzyga: it's been happening for a long long time, we can wait another day :)19:26
zygaI hope one last day19:26
zygalet me do one more test today19:26
stgraberit's just that now that we understand it, we also understand the danger from it (containers aren't stopped at all, filesystem isn't unmounted, so data loss potential)19:26
stgraberit actually explains why we've seen some odd db corruption in the past which we couldn't really explain based on logs19:26
zygayes, I think the bug is well marked as critical19:27
mvozyga: ohhhh, what did you find out?19:28
mvozyga: ok, how involved is the fix :) ?19:28
zygastgraber: as a small note, setting19:28
zygaSNAPD_DEBUG_SYSTEM_KEY_RETRY=019:28
zygashould work around it19:28
zygamvo: it depends19:29
zygamvo: please read https://bugs.launchpad.net/snapd/+bug/187165219:29
zygamvo: it's probably something we can fix tomorrow19:29
mupBug #1871652: Daemon snaps not properly stopped in some cases <champagne> <snapd:Confirmed for zyga> <snapd (Ubuntu):Confirmed for zyga> <https://launchpad.net/bugs/1871652>19:29
zygamvo: tl;dr; is https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1871652/comments/719:29
mvozyga: nice19:29
mvozyga: but it does sound like the fix will not be entirely easy19:29
zygamvo: it's actually very easy19:30
zygamvo: just the if ( ... ) part needs discussing19:30
mvoI like the sound of that19:30
zygawe cannot wait for system key on shutdown19:30
zygaand we probably should depend on core/snapd19:30
zygaand not let them go away / unmounted19:30
zygathis was never expressed in systemd terms19:30
zygabut let's discuss that tomorrow19:30
mvook19:30
zygait's late and I'd love to get off my chair :)19:30
mvozyga: sounds great, thank you so much19:31
zygawe know exactly how to reproduce this19:31
zygaand we know what to change to fix it, we need to discuss how to introduce the changes19:31
zygamvo: I suspect we _can_ do a minimal fix tomorrow19:31
zygawithout ill effects19:31
zygaand work on a more proper fix for +119:31
zygathe minimal fix will just detect the shutdown and ignore system key19:32
zygathe proper fix will introduce dependencies19:32
zygaso lxd will not stop after core is unmounted19:32
mvozyga: sounds good to me19:32
zygabut that's more iffy for the reasons you probably know about (wrappers and ensure)19:32
* zyga waves and takes a break19:32
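[editor's note] The "minimal fix" described above (do not wait for the system key when stopping) could be sketched as follows. Everything here is an assumption made for illustration: the function name, the default retry count, and the reading of SNAPD_DEBUG_SYSTEM_KEY_RETRY as a retry count are hypothetical. The conversation only states that setting the variable to 0 should work around the bug, and it does not say how the shutdown is detected.

```go
package main

import "strconv"

// defaultSystemKeyRetries is a made-up default for this sketch.
const defaultSystemKeyRetries = 12

// systemKeyRetries returns how many times to retry waiting for the system
// key. While stopping it never waits, mirroring the idea of the minimal fix.
// getenv is passed in so the logic can be exercised without touching the
// real process environment.
func systemKeyRetries(stopping bool, getenv func(string) string) int {
	if stopping {
		return 0
	}
	// Assumed meaning of the debug variable: an explicit retry count.
	if v := getenv("SNAPD_DEBUG_SYSTEM_KEY_RETRY"); v != "" {
		if n, err := strconv.Atoi(v); err == nil && n >= 0 {
			return n
		}
	}
	return defaultSystemKeyRetries
}
```

In real use, os.Getenv would be passed as getenv. The "proper fix" mentioned in the chat (expressing dependencies on core/snapd in systemd terms) is a separate change and is not covered by this sketch.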
cachiozyga, I'll run the update19:33
zygacachio: let me know if you need mount-ns test changes for that19:33
mvozyga: thanks again and good night19:33
zygacachio: it slipped from my radar but I will send the patches tomorrow19:33
cachiozyga, sure19:34
zygastgraber: and I know why I cannot reproduce it, for development I disabled reexec on my main machine19:34
stgraberah, that'd do it19:34
cachiozyga, job running19:37
cachiozyga, https://travis-ci.org/github/snapcore/spread-cron/builds/67266609519:38
cachioit would be ready in about 40 minutes19:38
mupPR snapcraft#3021 closed: remote-build: remove artifact sanity check <Created by cjp256> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/3021>19:49
zygastgraber: ^19:50
zygahttps://github.com/snapcore/snapd/pull/846219:50
zygaactually ^19:50
mupPR #8462: cmd/snap: don't wait for system key when stopping <Created by zyga> <https://github.com/snapcore/snapd/pull/8462>19:50
zygait should do the job, we need to package it with tests and stuff19:50
mupPR snapd#8462 opened: cmd/snap: don't wait for system key when stopping <Created by zyga> <https://github.com/snapcore/snapd/pull/8462>19:50
zygathat code didn't seem to have unit tests before so it will take me more19:51
zyganow I'm really gone19:51
zygamvo: T19:51
zyga^19:51
stgraberlooks simple enough :)19:51
mupPR snapcraft#3023 opened: pluginhandler: move attributes to PluginHandler <Created by sergiusens> <https://github.com/snapcore/snapcraft/pull/3023>20:50
mupPR snapd#8463 opened: secboot: key sealing also depends on secure boot enabled <UC20> <Created by cmatsuoka> <https://github.com/snapcore/snapd/pull/8463>21:16
