[00:53] Bug #1887153 opened: Installing network-manager drops the network [00:59] Bug #1887153 changed: Installing network-manager drops the network [01:05] Bug #1887153 opened: Installing network-manager drops the network [06:15] morning [06:17] mvo: hey [06:19] hey mborzecki [06:19] mborzecki: do you think you could have a look at 9358? seems like we need it for 2.47 :) [06:20] mborzecki: what did I miss otherwise? [06:20] mvo: yes, i'll push a patch dropping that fmt.Println [06:20] otherwise it seems ok [06:21] mvo: so the gadget assets update seems to be a nogo now, maybe we should have a managers test instead? [06:21] i totally missed that we're only modifying the conf files [06:22] mborzecki: why a no-go? wouldn't changing/resigning work? [06:22] mborzecki: yeah, sorry for that, I messed up the test :( [06:23] mvo: i mean, in the current form it's a no go, needs more work [06:23] mborzecki: yes, sorry, then we are in agreement [06:23] mborzecki: I was wondering if maybe just changing the pe header timestamp would be enough [06:24] mborzecki: I'm looking how archivable this is right now [06:24] mvo: cool, that would modify the contnt so a differnt hash, thus triggering assets to be updated and a reseal after [06:25] mborzecki: yes, iirc secboot does understand pe binaries already to some extend so there are libs [06:25] mborzecki: and then the same problem applies to the kernel [06:25] mborzecki: I guess managers should not write code or tests anymore :( [06:26] mborzecki: anyway, looking at this now [06:26] mvo: why miss out on all the fun? :) [06:26] mborzecki: lol [06:28] mvo: fwiw, maybe we should emply reflection for modeenv at some point, when we add more fields [06:29] mvo: i mean field tags, and so on, then it'd be easier to not loose new fields by accident [06:29] good point [06:32] good morning [06:34] zyga: good morning [06:35] zyga: hey [06:38] hey guys [06:39] everyone had breakfast but Lucy decided to go back to bed for some reason [06:39] so here I am [06:39] let me see if I can connect to the office [06:39] I had some issues last night, I think the switch is faulty [06:39] it's got a flaky power cable and turns off easily when moved [06:43] mborzecki: testing proper gadget manip now locally before pushing to avoid congesting spread [06:46] I'll slowly start working while Lucy is not doing much [06:48] sounds good [06:58] mvo: tweaked #9358 a little, please take a look [06:58] PR #9358: boot/modeenv: track unknown keys in Read and put back into modeenv during Write <⚠ Critical> [07:00] morning [07:03] adjusted ExportEntryt to be a struct now [07:04] pstolowski: hey [07:05] mvo: looked at gadget assets and 9348, both ran nested core20 tests which were successful (not counting gadget assets reseal test) [07:10] mborzecki: http://paste.ubuntu.com/p/6x7ZXFhccH/ is what I have for the gadget reseal test [07:10] mborzecki: it's running locally now [07:10] good morning [07:16] mvo: haha, can you maye do sed -i 's/This program cannot be run in DOS mode/This program cannot be run in mode/' otherwise it's hard to spot the O/0 change [07:18] brb [07:20] re [07:31] mborzecki: sure thing [07:31] mborzecki: the test is funny right now, I ran it but now it resealed when I installed the same gadget again, I'm confused but maybe I need to merge master or something else if funky [07:34] mvo: ok, let me try to run it [07:36] mborzecki: I'm also running it now [07:36] mborzecki: I changed the XXX [07:40] mvo: so you're saying it stopped after the install of unchanged gadget? [07:41] mborzecki: http://paste.ubuntu.com/p/mCVhCYSv5t/ [07:41] mborzecki: that's the full log [07:41] mborzecki: eh, sorry, that does not look full, one sec [07:43] mborzecki: so SEALED_KEY_MTIME_1 and _2 are different, the original error scrolled past my logbuffer .( [07:43] mborzecki: which is strange because the "snap change" output of the gadget install tells me that there is gadget update needed [07:44] mborzecki: which also reminds me that I should add code that checks if a gadget update was triggered to this test [07:58] mvo: added some extra logs and running again [08:02] pedronis: Model/Brand was a problem when i tried adding tags, looked like there'd be a separate structure needed for read/write with just the eported fields (and tags) [08:03] mborzecki: no, we can do something simple, I will try. but in a meeting now [08:03] pedronis: ok [08:29] mvo: after reinstalling the gadget snap, the reseal count is unchaged [08:29] (at least that's what I see here) [08:34] mborzecki: oh, so the test should work? that's great actually [08:35] mvo: not quite, https://github.com/snapcore/snapd/pull/9341/files#r489259093 [08:35] PR #9341: tests: add nested core20 gadget reseal test <⛔ Blocked> [08:35] mvo: and i'm executing it manually, one by line, there's something wrong with the gadget.yaml i think [08:36] mvo: 2020-09-16T08:34:04Z ERROR bootloader not declared in any volume, need to find out why [08:37] mvo: fwiw, i can work on the test and push fixes there :) [08:37] mborzecki: pleae, in a meeting now [08:37] mborzecki: and then another one [08:37] mborzecki: and then another one [08:38] haha [08:41] mborzecki: fwiw, this is my updated kernel reseal draft https://github.com/snapcore/snapd/compare/master...mvo5:kernel-reseal-better-test but I think this misses kernel re-signing [08:41] mborzecki: most FYI but if the gadget stuff goes quickly it might be a good one to check out too :) [08:41] mborzecki: and THANK YOU for this :) [08:50] afk === doko_ is now known as doko [09:04] mvo: so, i got a reboot, but then this appeared: https://paste.ubuntu.com/p/jRJhY4yp7W/ [09:10] hmm [09:11] i'm looking at boot chains, grubx64.efi for the recovery chain was changed (by the test), but also bootx64.efi was changed during the update [09:25] mborzecki: oh, interessting, aha, I think it's because we resign both [09:25] mborzecki: just resign both [09:25] mborzecki: I will explain more, in a meeting still [09:52] morning folks [09:55] morning ijohnson ! [09:55] morning ijohnson [09:56] hey pstolowski tobias_ o/ [09:57] pedronis: hey, i think i'm confused by one thing we discussed about snapshots - why would we keep external .hash file (apart from .idx), seems it would duplicate meta.sha3_384 or i misunderstood that? [09:58] pstolowski: sorry, what external hash file? [09:59] pedronis: afair we would have the .zip file, but then we would introduce .idx and .hash file (for hash of the metadata), but maybe i made a mistake in my notes [09:59] pstolowski: no, idx would have both the set id and the new hash [09:59] not two files [09:59] does that make more sense? [10:00] PR snapd#9357 closed: interfaces/process-control: add sched_setattr to seccomp [10:02] pedronis: not really, i'm not clear why would we need to store the hash again if we have it in meta.sha3_384 (which would be caclulcated with id:0 in new code) [10:03] pstolowski: maybe, we need to sync again [10:03] huh can anyone else actually access the log for this spread run ? https://github.com/snapcore/snapd/pull/9333/checks?check_run_id=1119044782 [10:03] PR #9333: tests/nested, fakestore: changes necessary to run nested uc20 signed/secured tests [10:03] when I cleck on "view raw logs" it just gives me like 4 lines of trying to allocate a self hosted worker, but the time it took to run is 1hr 39m, so spread must have run [10:05] pedronis: also, maybe i'm confused, but it seems we store timestamp inside meta, and it will affect checksum as well, we may want to do move it out as well [10:10] pstolowski: let's chat when we both have a moment [10:10] * pedronis lunch [10:10] ok [10:18] ijohnson: so you are saying that beta-image recovery partition is now broken? [10:18] xnox: sorry do you mean about last night's discovery about the regression after resealing ? [10:18] ijohnson: meaning, it's no longer forward compatible with current snapd/sealing any more? [10:18] ijohnson: yes. [10:19] seed partition will always have older kernel/snapd/snap-bootstrap than what is in the ubuntu-data partition. and it currently must remain forever forward compatible with how ubuntu-data is sealed. [10:19] until we can update seed partition with a new recovery system. [10:20] xnox: well I don't think the recovery partition is broken because while the snap-bootstrap there will always drop new keys that it didn't know about in May from the modeenv which breaks fresh install, an old boot to recover will always work because that recovery system must have been generated with a snapd that didn't have resealing logic in it [10:20] sure [10:20] but doesn't resealing logic, reseals things upon upgrading snapd in the ubuntu-data partition? [10:21] xnox: hmm although I wonder if an old system refreshed to a new snapd with modeenv on the encrypted data gets the new resealing keys in the modeenv, if during recover mode we will ever modify the data modeenv [10:21] I don't think we will [10:21] but I can test that [10:21] give me a moment, I will try with a beta image refresh to edge snapd, then recover, then go back to normal run mode and see if things are broken [10:22] i guess it will reseal when you also trigger a reseal event? or do we now have two codepaths depending on how the system was installed?! [10:22] ack [10:22] that would help us understand the scope of brokeness =) [10:22] my hope/expectation is that recover mode won't mutate the run mode modeenv, because for example the initramfs of recover mode writes the modeenv for recover mode to the tpmfs data [10:22] maybe refresh snapd first, then refresh kernel or like gadget. [10:22] but maybe I'm wrong [10:23] it has access to both, so it could do either, or bugs [10:23] the number of possible bugs is only limited by our imagination [10:32] mborzecki: this is what I'm doing: https://paste.ubuntu.com/p/PNXqQ5yWpw/ is it too naive? I'm not sure what you tried (I'm not happy with the model& syntax but I can change it) [10:36] pedronis: looks fine for the PR ijohnson has, i tried to do the serialization bits and ran into trouble with model being / [10:36] ah [10:36] mborzecki: is there a bug in my PR? [10:37] no, but mborzecki suggested trying to avoid mantaining the list of fields separately from the struct [10:37] by using tags [10:37] yeah that makes a lot of sense [10:37] ijohnson: no, we're trying to use tags to replace the hard coded list of keys [10:37] but then I think he tried to do even more and then it gets complicated [10:37] yup [10:38] pedronis: anyways, the thing you have seems fine [10:38] ijohnson: re. https://github.com/snapcore/core20/issues/72 -- my interest came from the "GDM on Ubuntu Core" experiments I've been doing. It's the sudo group I'm more interested in than docker. [10:39] jamesh: right understood [10:39] jamesh: I will keep you and the issue updated with the result of the discussion we have tomorrow about this issue [10:39] our more immediate concern is what to do for uc20 1.0 but that will involve some discussion of what to do in general for adding users to groups like that [10:41] pedronis: hmm so if I use cloud-init to create a user on uc20 and don't run console-conf, should console-conf run when I transition to recover mode? seems wrong to me to run console-conf in recover mode, but then again I suppose since it was never run in run mode, you could just reboot the device back to run mode and then run console-conf anyways [10:43] are we even running console-conf in non run mode? [10:43] pedronis: well in this case it seems console-conf is being run in recover mode [10:43] pedronis: it's my feeling that that's a bug [10:45] uhh [10:45] I seem to have refreshed the kernel snap without getting a reboot scheduled [10:45] ah the same revision of the kernel snap is on edge as beta [10:45] ijohnson: remember though that the service also trigger the recovery menu [10:46] so i'ts a bit unclear exactly what we need there [10:46] pedronis: sure, shall I just file a bug about it and we sort out the right thing to do there later ? [10:46] yes [10:46] k [10:47] to be clear I'm not sure never running it is the general answer either [10:47] the problem is more if its runs what should it do/not do [10:47] and when should it run [10:48] yeah it's a bit confusing [10:49] pedronis: how does disabling console-conf work from the gadget defaults now? [10:49] it creates a the complete file (that hasn't changed) [10:50] ok, I wonder if we copy the complete file over from the run system to the recover system, I think we do [10:50] hmm [10:50] I actually don't think we do :-( [10:51] why would we [10:51] anyway if it's disabled in the gadget we will disable it in recover [10:51] mode [10:51] mvo has a PR about that triggering a bug [10:51] pedronis: right that was my question was about whether we will disable it in recover when we seed [10:51] (when we seed the recovery system that is) [10:52] sorry this has probably already been discussed and I'm just now discovering the fixes that were merged for it [10:54] mvo: hm so, the gadget asset test manages to get the first mtime of sealed key before snapd reseals on mark boot successful (and the branch does not have fixes from pedronis about avoiding that) [10:55] mvo: also explains why i could run the test manuall line by line [10:55] ijohnson: this is the PR https://github.com/snapcore/snapd/pull/9353 it actually but install, not recover [10:55] PR #9353: configcore: do not error in console-conf.disable for install mode [10:56] mborzecki: which fixes you mean? (I fixed lots of things) [10:56] pedronis: hmm so I wonder if we need to make that work for recover mode too [10:56] pedronis: do you want me to test that works as advertised ? [10:56] pedronis: not resealing with unasserted kernel when modeenv was not changed [10:56] (gadget default to disable console-conf in recover mode works) [10:56] mborzecki: ah [10:57] ijohnson: yes [10:57] ack will add it to my list of things to test this morning [10:57] still trying to see if beta image gets broken by installing edge snapd [11:06] mborzecki: I think maybe it's not that hard to do marshal/unmarshal as well but probably not something to do now [11:06] s/probably/very likely/ [11:08] mborzecki: ijohnson: pushed [11:17] pedronis: hmm so I have an image built with edge snapd and old pc-kernel, but upon refreshing the pc-kernel, it did not add the current_trusted_... keys to the modeenv ever [11:17] ijohnson: that's expected we have no code to do that atm [11:18] maybe that isn't clear [11:18] pedronis: hmm ok [11:18] pedronis: wait but I built the image with a new edge snapd, shouldn't it have added the current_trusted_... keys to the modeenv during install mode? [11:18] yes [11:18] hmm [11:19] but then you said old kernel, doesnt it lose stuff as we know? [11:19] yes I built the image with an old kernel but new snapd [11:19] ahhhh [11:19] there are lots of moving parts, they behave kind of predictably at least [11:19] let me think about this, probably this is expected then I guess [11:20] ijohnson: we really need to decide what to support or not in terms of backward compat [11:20] mvo knows that is an open question [11:20] probably not an issue for 1.0 [11:20] since new images for 1.0 should be fine [11:27] ah dammit I'm an idiot, too many different images I just ended up booting an old snapd with an old kernel so it worked and never tried to write the extra modeenv stuff [11:27] * ijohnson tries again to not be so foolish [11:32] ok so yes booting an image with new snapd and old kernel fails to seed in run mode as expected because the old kernel there is broken, however given the fact that we never will add the resealing keys to the modeenv, an old snapd refreshing to a new snapd will not add the keys, so thus will not trigger a resealing and thus will not be broken by transitioning back to an old recover mode snapd [11:32] xnox: so we don't have a regression where your old installed system refreshes to new snapd, then goes back into old recovery system [11:34] pedronis: since old beta systems will never be properly resealed since we don't add those keys to the modeenv, will old beta systems ever get broken by the new secboot version format for example? [11:34] ijohnson: sorry, I was confused last night, the secboot version thing works differently, v0 keys system will keep v0 keys [11:35] ok, so beta systems will keep on working as always [11:35] they just will never reseal [11:35] they can reseal, to v0 keys again [11:35] so the problem is really more on our side about that [11:35] but does our code ever trigger resealing for those old systems though? [11:35] it will, it just explode doing so [11:36] because we don't have the required data [11:36] ... so old systems will eventually get broken by resealing [11:36] ? [11:36] well, I would say more by trying to reseal [11:36] unless we do something [11:37] so is that what you meant about backwards compatibility? maybe we don't care that beta systems will eventually explode themselves ? [11:37] * ijohnson thought we were not supposed to let beta systems explode themselves [11:38] we said we would prefer not to, but we also didn't propose not to break them [11:38] but as I said, it's really a decision to make with mvo [11:38] I see [11:38] s/propose/promise/ [11:39] but we also said we would have tests but we don't [11:39] ok well I will leave testing that aside for the moment then, we know that beta systems are not yet exploded by refreshing to new snapd [11:39] pedronis: we do have tests tho [11:39] mvo: #9341 is green, but want to push one more commit addressing the race [11:39] PR #9341: tests: add nested core20 gadget reseal test <⛔ Blocked> [11:39] we build an image from beta and then refresh to edge [11:39] err rather I think we download the beta image then refresh to edge [11:39] * zyga adjusts remaining manifest leftovers [11:39] and see if it explodes [11:39] ijohnson: I see, it's probably not fully enough [11:39] anyway it's a decision to make [11:40] pedronis: yes there's an additional test we should have I think which I mentioned last night which would have caught this issue with modeenv in pr's before things landed [11:41] I will try to propose it today [11:41] really this new test is about making sure old initramfs doesn't break new snapd [11:41] what I mean, we need a test that upadtes snapd and then a kernel [11:42] or simply updates snapd and then reboots (that probably would fail already) [11:42] mmm yes we could have a test that updates snapd first then kernel, it's a bit tricky because we need to disable refreshes from the gadget otherwise the old kernel will get refreshed first before new snapd [11:42] and as I said new snapd + old kernel right now is totally broken [11:55] pedronis: ijohnson: ignoring beta systems exploading is fine, but the problem repeats itself though cause v1 (GA) image should continue to reseal going forward, no? [11:55] xnox: yes new images built with 1.0 will properly reseal and not explode [11:55] as in new snapd snap, should be backwards compatible with older snapd that we ship on GA [11:55] when we make v2 resealing [11:55] xnox: yes it will be [11:55] or v3 resealing etc [11:55] and like still dont' have a way to update seed partition. [11:55] cool [11:56] xnox: did you get a chance to test my core-initrd pr yet? [11:56] xnox: last night's discoveries led to the realization that we really really need a new snap-bootstrap in the initramfs [11:58] pedronis: should` snap reboot --recover` without a label work? it doesn't but feels like it should :-/ it currently has an unhelpful error: `error: cannot request system reboot: method "POST" not allowed` because the client and cli doesn't validate whether systemLabel is empty [11:58] ijohnson: that's a bug [11:59] pedronis: ack I can try to fix that, we could probably just do a GET to /v2/systems and then figure out the current one and then do a post to /v2/systems/ [11:59] missing spread test [11:59] ijohnson: maybe [11:59] fair [11:59] oh hey look at that I just defeated fde on uc20 [12:00] I mean not sure exactly what the solution is [12:00] pedronis: so gadget default for console-conf disable does not work at all for recover mode [12:00] because console-conf is disabled for run mode and doesn't run, or doesn't let you create a user properly, but then you go to recover mode and it lets you create a user and login [12:01] ijohnson: we need to fix that [12:01] I'm going to be in a meeting now [12:01] yes, I think mvo's patch can just be expanded to recover mode too [12:01] I will test this out [12:01] mvo: i think that #9341 is good to go [12:01] PR #9341: tests: add nested core20 gadget reseal test [12:02] dropped the blocked label [12:03] oh actually maybe that won't work [12:05] mborzecki: \o/ [12:06] mvo: we have a blocker for 2.47, console-conf doesn't get properly disabled in recover mode [12:06] I'm looking at a fix now [12:06] ijohnson: ta [12:16] xnox, hi, I see there are OVMF_CODE_4M and OVMF_VARS_4M in ovmf [12:16] xnox, I tried those with uc20 yesterday but image didn't boot [12:17] xnox, any idea if we need any change to qemu command line? or any other configuration to support that? [12:18] cachio: those ones are without keys, you want the ones with MS keys and 4k, no? [12:18] cachio: which filenames did you use? [12:26] xnox, I tried OVMF_CODE_4M.secboot.fd and OVMF_VARS_4M.snakeoil.fd [12:26] PR snapd#9358 closed: boot/modeenv: track unknown keys in Read and put back into modeenv during Write <⚠ Critical> [12:27] xnox, and the ms as well [12:28] mborzecki: haha so your branch bboozzoo/uc20-tweak-sid-packaging-params.go exists in the my local git tree and breaks building the snap locally because go interprets that file in the .git dir as a go file [12:29] *face palm* [12:29] 2020/09/16 12:27:18 processFiles failed with: /root/parts/snapd-deb/build/.git/refs/remotes/bboozzoo/bboozzoo/uc20-tweak-sid-packaging-params.go:1:1: expected 'package', found 'IDENT' c0de648a090a1690ce9d3f6b350d5e3dc92d6bd7 [12:29] hahah [12:29] feels pretty clear to me the debian packaging should ignore everything in .git as possible go source files [12:30] ok, running spread now [12:30] need to adjust two variable names [12:30] and lots of related function names [12:30] but that's just small thing [12:30] first need to check if nothing is broken [12:33] * ijohnson afk for coffee [12:46] xnox, do we need to you diferent ones for uc20? [12:49] brb, see you at the standupo [12:52] mmm I kinda just wanna copy the complete file for console-conf from the initramfs from run mode to recover mode if it exists [12:52] maybe that's too naive [13:51] hm that mbr thing on encrypted device shouldn't be a big problem, i think we can make sure that we readlink of the mount source and then walk the /sys/block//slaves up to a point where there's no more slaves [13:51] the upside is taht it will work even if there's multiple dm devices invovled, lvm, luks and others [14:03] cachio: could you look at #9333 again? [14:03] PR #9333: tests/nested, fakestore: changes necessary to run nested uc20 signed/secured tests [14:18] ijohnson, on that [14:19] thanks [14:20] hey, I saw in 9359 a bunch of lines like: [ 124.065523] snapd[670]: 2020/09/16 13:52:57.974158 stateengine.go:150: state ensure error: devicemgr: cannot mark boot successful: cannot compose run mode boot chains: cannot find expected boot asset bootx64.efi in modeenv - does that ring any bells? [14:20] mvo: yes that was the regression I was talking about [14:21] mvo: that is what happens when you boot a uc20 image with old kernel and new snapd with resealing, it is broken right now [14:21] mvo: to fix it we need new snap-bootstrap in the kernel initramfs [14:21] sorry if I was unclear about that in SU [14:21] PR snapd#9359 opened: tests: improve kernel reseal test [14:30] ijohnson: oh, right [14:30] yeah this is why even without my core-initrd pr merged it would be really good to have a new ubuntu-core-initramfs built [14:30] ijohnson, done [14:31] cachio: what do you mean about setting the env vars in spread.yaml? the vars are set from a specific tests's environment keyword which will be set before anything in spread's prepare, etc. is executed no? [14:32] at least that's what I see [14:32] ijohnson, you need to add them in environment: [14:32] GOHOME: /home/gopath [14:32] GOPATH: $GOHOM [14:33] in this part [14:33] cachio: but why [14:33] or in the suite specific env dict [14:33] spread will set just those vars in the environment for the test [14:35] otherwise the var will not be available for the script [14:35] cachio: hmm but I just use those variables for a specific nested/manual test where [14:35] I set the env vars and use them from prepare and it works [14:36] ijohnson, which ones? [14:36] cachio: see my other pr, https://github.com/snapcore/snapd/pull/9334 [14:36] PR #9334: tests/nested/manual: add test for grades above signed booting with testkeys [14:37] ijohnson, that works [14:38] what doesn't work is to set the model from cli [14:38] cachio: right but why would we need to set the model from the cli [14:38] is that a thing you do ? [14:38] ijohnson, I do that for spread cron [14:38] but [14:39] I think those vars are not going to be chagned [14:39] cachio: so do you still want me to define those things in the spread.yaml environment? [14:39] I used to update kvm, secboot, tpm, etc [14:39] ijohnson, no [14:39] ok [14:39] I can address your other feedback now though [14:40] thanks [14:42] ijohnson, for installing jq [14:42] we already have a function called get_test_snap_suffix [14:42] it is in core-config.hs [14:42] .sh [14:42] in the lib [14:42] that could be usedfull [14:43] ah ok, interesting, I can take a look at that === pedronis_ is now known as pedronis [14:50] cmatsuoka, I reviewed hte msr doc and the case we have is the following https://paste.ubuntu.com/p/wNVYrWxxxj/ [14:50] cmatsuoka, because the msr code starts with 0xffffffff [14:50] PR snapcraft#3288 closed: cmake v2 plugin: add help for cmake generators [14:55] got to run to a school meeting [15:04] cachio: I think the document I mentioned earlier is this one: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3c-part-3-manual.pdf [15:06] cmatsuoka, nice, thanks [15:06] cachio: pr updated [15:08] ijohnson, +1 [15:08] thanks [15:19] pstolowski: I commented on #9350 [15:19] PR #9350: [RFC] store: handle v2 error when fetching assertions [15:20] ty [15:23] * cachio lunch [16:06] PR snapd#9341 closed: tests: add nested core20 gadget reseal test [16:08] pedronis: do you think you have capacity to look at 9293 ? [16:09] pedronis: and maybe 9277 (which is actually more important)? [16:13] mvo: do we need to merge the improved reseal tests into 9277 before merging it? [16:14] mvo: https://github.com/snapcore/snapd/pull/9277#pullrequestreview-489786595 [16:14] PR #9277: secboot: add boot manager profile to pcr protection profile <⛔ Blocked> [16:16] pedronis: yeah, I think it makes sense to merge master once more, maybe cmatsuoka can do it? I will then branch 2.47 in the morning most likely [16:16] this should give it ample time to finish :) [16:16] PR snapd#9360 opened: tests: make gadget-reseal more robust [16:17] cachio: I noticed we get some nested uc18 failures now sometimes, do you think you could have a look if you spot something (a pattern or so?) [16:17] * mvo dinners [16:17] mvo: so we need to land #9359 and #9360 first ? [16:17] PR #9359: tests: improve kernel reseal test [16:17] PR #9360: tests: make gadget-reseal more robust [16:21] only #9359 [16:21] PR #9359: tests: improve kernel reseal test [16:29] mvo, checking [16:30] mvo, do you have any log? [16:30] cachio: we have stopped using snakeoil keys months ago, so snakeoil.fd has not been working for months now. [16:31] ijohnson: if we had strange cla check failure on mvo PR: https://github.com/snapcore/snapd/pull/9359/checks?check_run_id=1124165015, maybe you can look when you have a moment. I reviewed your cloud-init PR [16:31] PR #9359: tests: improve kernel reseal test [16:33] xnox, I also tried ms keys [16:33] Thanks pedronis [16:33] xnox, should that work? [16:33] The cla-check issue seems to be a problem with cmatsuoka spread machine [16:33] cachio: it would be helpful if you could paste the full thing you are trying [16:34] cachio: qemu / libvirt, and which things were initialized with which and on what series. [16:34] cachio: i'll doublecheck what i do locally, and can compare. [16:34] cmatsuoka: perhaps you updated your spread worker python? [16:36] xnox, ok [16:36] let me prepare the environmnent [16:36] I'll leave you a note with that [16:46] ijohnson: uhm? cla-checker? [16:46] cmatsuoka: see pedronis' link above, the log has `claudio-spread-1` in it and has a python backtrace when it tries to run cla-checker [16:47] ugh, let's see what's happening here [16:52] PR snapd#9333 closed: tests/nested, fakestore: changes necessary to run nested uc20 signed/secured tests [16:53] mvo: should I merge #9359 into #9277? [16:53] PR #9359: tests: improve kernel reseal test [16:53] PR #9277: secboot: add boot manager profile to pcr protection profile <⛔ Blocked> [16:53] * cmatsuoka just came back from lunch [16:53] mvo, I didn't find the errors on nested uc18, I'll try to reproduce it manually here [16:53] ijohnson: is the cla-check expecting to use python 2? [16:54] cachio: the issue I think he saw was the same problem with users disappearing when spread tries to login to the nested machine [16:54] cmatsuoka: hmm good question, let me take a look quickyl [16:55] ijohnson, ah, ok, I think we need to skip the configuration for the images [16:55] the runner only had python 3 installed, I installed a minimal python2 now [16:55] ijohnson, i'll create a pr for that [16:55] cachio: ah interesting maybe that's the issue indeed [16:55] ijohnson, yes [16:57] cmatsuoka: it is possible that it only supports python2, but that doesn't seem to be the case just looking at the code, it seems fine python3 code to me from my non-experienced python eyes [16:58] we'll soon find out when the next cla-check job arrives [17:07] cmatsuoka: yes after some work, checking things the cla script does not support python3 [17:07] err it needs to be ported slightly [17:07] ijohnson: ok, the runner now has python 2 too [17:12] cmatsuoka: ok well now I have a branch that works [17:12] with python3 that is [17:12] ah nice [17:17] PR snapd#9361 opened: tests/lib/cl_check.py: use python3 compatible code [17:22] PR snapd#9362 opened: tests: skip nested images pre-configuration by default [17:22] ijohnson, I thhink with this #9362 that issue should be gone [17:22] PR #9362: tests: skip nested images pre-configuration by default [17:23] cachio: thanks I'll have a look in a bit === ijohnson is now known as ijohnson|lunch [17:37] cmatsuoka: I think merging master is enough [17:39] mvo: ack [17:45] ijohnson|lunch, I need to go to the kinesiologist now, please leave a note if you think the fix is not correct [17:45] I'll take a llok once I am back [17:45] * cachio -> kinesiologist [17:55] back [17:56] * zyga is physically tired but mentally ready for coding === ijohnson|lunch is now known as ijohnson [18:35] brb [18:35] coffee [18:35] I may have fixed crashes [18:35] now let's run and see [18:35] cachio: 9362 lgtm, not clear that it will fix all of our problems, but I don't think it can hurt [19:11] ijohnson: https://github.com/snapcore/snapd/pull/9359/checks?check_run_id=1124222729 failed on nested.sh shell stuff, could look into it? [19:11] PR #9359: tests: improve kernel reseal test [19:11] pedronis: sure [19:11] failed sadly tbh :/ [19:11] /home/gopath/src/github.com/snapcore/snapd/tests/lib/nested.sh: line 290: $4: unbound variable [19:12] hmm [19:12] :( [19:12] well I mean to be fair it was only provided 3 argument [19:12] it's still sad [19:12] yes still sad [19:12] given how long it took to get to that error [19:12] do you want me to push a fix to this branch ? [19:12] yes [19:12] ack, one moment [19:13] I don't see how to avoid another run [19:13] unless somebody runs it locally [19:13] is the concern that we wanted to branch 2.47 tonight with this pr merged ? [19:14] I can tend to this pr if it needs some poking tonight [19:15] otherwise I could try running it locally to confirm my patch works [19:16] it's a prereq to run cmatsuoka PR again [19:16] I think the idea was to cut 2.47 in the morning [19:16] but this shifts a bit everything [19:17] anyway I don't have particular opionions on this anymore [19:18] ok, well I'm trying a spread run locally with qemu-nested on my desktop, perhaps that will be faster, if it passes I will pastebin the results into the PR [19:20] ijohnson: how long does it take on your desktop [19:20] ijohnson: and what are your cpu specs roughly? [19:21] zyga: I dunno I haven't run this particular test, but in general qemu-nested is a few minutes faster on my first gen 16 core threadripper [19:21] ok [19:21] faster than gce at least [19:21] faster than gce? [19:21] right [19:22] hey, main/snap-run passed [19:22] I'll run more tests and see what else needs changes, snap-confine profile is still complain mode [19:22] PR snapd#9318 closed: .github/workflows/test.yaml: also run 20.04 nested tests with UC20 label [19:27] ijohnson, left you a comment in #9362 [19:27] PR #9362: tests: skip nested images pre-configuration by default [19:28] cachio: ah right I didn't realize that all the other nested tests were using the generic images [19:28] cachio: did you ever measure for non-uc20 how much speedup you get from not rebuilding/reconfiguring the image fresh for each test ? [19:29] if you run 2 tests with the same image in the same machine you save time [19:29] ijohnson, thats the measurement I did for uc20 [19:29] ah so you didn't measure for non-uc20 then? [19:29] not sure for uc16 and uc18 [19:30] I just measured the slowest [19:30] I see [19:31] cachio: sure let's merge this if we keep seeing the issue with minimal-smoke on uc18, then let's revert this pr and try something else [19:31] ijohnson, sure [19:34] ijohnson, well [19:34] I thinks for uc16 and 18 the time is pretty the same [19:34] I just compared PRs [19:35] ah [19:35] and the times are pretty similar [19:35] allways between 34m and 40m [19:38] ok, snapctl adjusted [19:47] PR snapd#9363 opened: boot: adjust comments, naming, log success around reseal [19:48] ijohnson: cmatsuoka: ^ simple PR addressing previous feedback [19:49] * cmatsuoka checks [19:50] approved [19:54] ijohnson: I made a comment about your doubt around overlord and snap-bootstrap, we already import some bits of overlord, what we don't want to import are the managers [19:54] that's why we build with nomanagers [19:55] ah [19:55] I didn't know about nomanagers [19:55] ok this is much easier now then [19:55] thanks for the hint [20:07] PR snapd#9363 closed: boot: adjust comments, naming, log success around reseal [20:55] ijohnson: how it's going the local run of #9359 ? [20:55] PR #9359: tests: improve kernel reseal test [21:04] Sorry got distracted but it passed [21:05] pedronis: ^ [21:05] I just pasted in the results so you can force-merge if you like [21:05] the tests are not done running yet for nested uc20 yet [21:06] in the pr that is [21:09] I think we should chip in and buy google some faster machines [21:14] * zyga considers going to bed now [21:14] I'll finish the rest in the morning [21:14] good luck, see you tomorrow guys [21:15] ijohnson: cmatsuoka: I force merged it [21:15] pedronis: thanks, I'll merge master into #9277 [21:15] PR #9277: secboot: add boot manager profile to pcr protection profile <⛔ Blocked> [21:15] cmatsuoka: thank you, was about to ask about that [21:15] * pedronis calls is a day [21:17] * cmatsuoka will patiently wait for the tests to complete [21:18] PR snapd#9359 closed: tests: improve kernel reseal test [21:18] which will still be way better than the 8h layover in LHR last year [21:21] Yeah we would have been in Frankfurt again this week!