[06:48] <zyga> good morning
[06:48] <zyga> today was way more efficient
[06:48] <zyga> not even 9 and Lucy is 100% taken care of :)
[06:49] <zyga> and I had an hour-long morning walk to exercise
[06:52] <zyga> mvo o/
[06:55] <mvo> good morning zyga
[07:02] <pstolowski> morning
[07:02] <zyga> pstolowski please check the message I sent on tg last evening
[07:02] <zyga> mvo https://listed.zygoon.pl/17533/measuring-execution-coverage-of-shell-scripts
[07:03] <pstolowski> zyga: looking
[07:03] <zyga> pstolowski I think the first thing to establish is if the generator really works on 16.04
[07:03] <zyga> if it does not we need to focus on that problem firsst
[07:03] <zyga> if it does, we just need to understand what's broken in the test (on 16.04) and fix the image (on 20.04)
[07:11] <pstolowski> zyga: got it, thanks for the clues, i'm investigating
[07:11] <zyga> super, thank you!
[07:15] <mvo> good morning pstolowski
[07:19] <mvo> zyga: I guess I still need to merge 9174, right?
[07:19] <zyga> yes
[07:19] <zyga> and we need to deal with the fallout
[07:20] <mup> PR snapd#9174 closed: tests: fix lxd test wrongly tracking 'latest' <Test Robustness> <⚠ Critical> <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9174>
[07:20] <mup> PR snapd#9178 closed: secboot: document exported functions <Simple 😃> <Skip spread> <UC20> <Created by cmatsuoka> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9178>
[07:21] <mvo> hm, arch is also broken in the interfaces-udisks2 test it seems
[07:22] <zyga> I noticed last night
[07:22] <zyga> maybe an interface changed upstream
[07:22] <zyga> I wrote so many udisks2 tests in this company :P
[07:25] <mup> PR snapd#9177 closed: tests: remove support for ubuntu 19.10 from spread tests <Created by sergiocazzolato> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9177>
[07:27] <mvo> zyga: we can probably disable it on arch for now, it's mostly relevant for  core anyway
[07:29] <zyga> arch is the harbinger of doom on remaining systems
[07:33] <mvo> zyga, pstolowski fwiw, systemd 229 (xenial) should support system generators, it ships systemd-fstab-generator so the functionality itself should be there
[07:33] <zyga> that's reassuring, so the question is - why is the test failing to observe the generator data
[07:34] <zyga> maybe it doesn't support something else, like /run/systemd/container?
[07:34] <mvo> zyga: yeah, a good question
[07:34] <pstolowski> mvo: also Dmitri was saying they way it's implemented would work all the way back to xenial
[07:34] <mvo> zyga, pstolowski fwiw, inside my lxc xenial I see /run/systemd/container - but maybe it's racy or something?
[07:35] <zyga> interesting, maybe
[07:37] <mvo> also we should check if we have the snap.mount in lxd, this is what the old generator was generating
[07:39] <zyga> I did see that one
[07:39] <mvo> and when I run it manually inside lxd it work, puzzling
[07:39] <zyga> which is even more puzzling
[07:39] <zyga> like half of it ran
[07:39] <zyga> but that was in the 20.04 container so that was a bit optimistic
[07:39] <zyga> I didn't re-check 16.04
[07:44] <jamesh> Looks like the Go latest/edge snap is broken based on CI results
[07:44] <zyga> mvo, pstolowski: if that's okay, I'll focus on features for now, if you need me please just say so
[07:44] <zyga> jamesh oh, I think mwhudson may want to know
[07:44] <zyga> hi :)
[07:46] <jamesh> mwhudson: ^^^ I get a failure when it tries to copy a file to /snap/go/6312/pkg/linux_amd64/runtime.a -- maybe the library is missing from the snap or out of date?
[07:46] <pstolowski> mvo, zyga yeah i just concluded the same. manual run creates dropins. but daemon-reload only creates snap.mount 🧐
[07:46] <pstolowski> ah wait silly me
[07:48] <pstolowski> scratch that; yes daemon-reload doesn't have the effect, weird
[07:49] <zyga> pstolowski maybe time to dig into systemd source
[07:49] <mvo> pstolowski: I am making some progress, fails with exit code 2
[07:50] <mvo> pstolowski: if I put systemd into debug mode inside lxc and call daemon-reload
[07:50] <pstolowski> mvo: right i see exit 2 in journalctl
[07:50] <mvo> pstolowski: I bet PATH is unset
[07:50] <mvo> pstolowski: we have a null check for that
[07:51] <mvo> pstolowski: there you go
[07:51] <pstolowski> mvo: ha, that's great finding
[07:51] <pstolowski> mvo: i'll test a quickfix by hardcoding things
[07:52] <zyga> so PATH is unset?
[07:52] <zyga> that's, interesting
[07:53] <mvo> pstolowski: yeah, http://paste.ubuntu.com/p/mphVQNCqKK/ fixed it for me
[07:53] <zyga> mvo so another place to bake path ;-) ?
[07:53] <mvo> pstolowski: in my quick test, I have a meeting now so can't work on this, thanks for taking over \o/
[07:53] <zyga> I wonder if this was fixed in future systemd
[07:53] <mvo> zyga: yeah, fun
[07:53] <mvo> zyga: it is evidently
[07:53] <mvo> zyga: or it would fail in the same way
[07:53] <zyga> what's the path we get there?
[07:54] <mvo> pstolowski: I had this suspicion during the review that the environment maybe totally blank but then did not mention it. oh well :/
[07:54] <mvo> zyga: no idea
[07:54] <mvo> zyga: I remembering looking at this a while ago but it's even hard to see debug prints when generators run :(
[07:54] <zyga> mvo open(/tmp/foo) and redirect stderr there
[07:55] <zyga> to be clear, I'm mainly asking about the path drive this problem to a clear solution
[07:55] <zyga> and not bake a slightly wrong path by accident
[07:55] <zyga> I'd love to double check both the value of PATH and the method systemd computes it
[08:01] <pstolowski> so yes hardcoding a path there makes it happy
[08:02] <mvo> \o/
[08:03] <pstolowski> mvo, zyga: what do you think about a fallback with a predefined path
[08:03] <mvo> pstolowski: I don't think we have an alternative :)
[08:03] <zyga> I think it's unavoidable, I just want to use a correct value
[08:03] <mvo> zyga: which one of the correct values ;P
[08:03] <zyga> brb, let me fetch something to drink
[08:03] <zyga> it's finally not so hot today but I just had a small sip in the morning
[08:03] <pstolowski> zyga: this is only for 16.04, we don't have preseeding for non-ubuntu
[08:04] <pstolowski> and therefore we don't need dropins elsewhere
[08:04] <zyga> pstolowski you say that but it's only a bit of code that will not be looked at in 6 months when something changes, let's just spend the extra 30 minutes to understand how PATH is provided in 18.04+ and what the value is
[08:11] <pstolowski> zyga: it's not strictly about PATH but where snapfuse is installed, that's not going to change in 16.04
[08:19] <zyga> my point is that we should not make this about 16.04
[08:19] <zyga> we either need path and we need to provide it somehow (and maybe only conditionally for old systemd)
[08:19] <zyga> or we don't need path and we bake something else in
[08:33] <pstolowski> of course man pages doesn't say anything about env and path
[08:49] <mvo> zyga, pstolowski meeting is over - I think we should simply use a sanitized versoin of /etc/environment PATH in the generator
[08:50] <mvo> zyga, pstolowski i.e.
[08:50] <mvo> PATH="/usr/sbin:/usr/bin:/sbin:/bin"
[08:50] <mvo> do you see any downside?
[08:53] <pstolowski> mvo: do you mean hardcoded? or do you think we should parse /etc/environment?
[08:54] <mvo> pstolowski: hardcoded
[08:54] <mvo> pstolowski: not more parsing code
[08:54] <zyga> I think hardcoded is ok if PATH is NULL
[08:54] <pstolowski> mvo: that's what i'm running right now
[08:54] <pstolowski> yes that's what i hae
[08:54] <pstolowski> *have
[08:54] <mvo> pstolowski: also the PATH in /etc/environment is silly - it include /usr/local and /games and stuff which I think we don't want
[08:54] <mvo> pstolowski: \o/
[08:54] <pstolowski> +1
[08:54] <zyga> (although the exact string should be picked up from systemd most likely)
[08:56] <mvo> zyga: from systemd?
[08:56] <zyga> even if we copy-paste the string, yeah
[08:56] <mvo> zyga: from the systemd source you mean?
[08:56] <zyga> yes
[08:57] <mvo> zyga: which version of systemd? with unified /usr/bin,/bin or without? not sure about the advantage over /etc/environment which is the "canonical" thing for ubuntu
[08:57] <pstolowski> i looked at systemd, starting from manager_run_generator and didn't see any explicit paths
[08:58] <mvo>  systemd uses a fixed value of
[08:58] <mvo>            "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
[08:58] <mvo> that's what the manpage for systemd.exec tells me
[08:58] <mvo> When compiled for systems with "unmerged /usr" (/bin is
[08:58] <mvo>            not a symlink to /usr/bin), ":/sbin:/bin" is appended.
[08:59] <pstolowski> ah, i didn't look into that man page :/
[08:59] <pstolowski> joys of a dozen of manpages
[09:00] <mvo>                 m->transient_environment = strv_new("PATH=" DEFAULT_PATH);
[09:00] <mvo> pstolowski: no worries, it's really hidden
[09:00] <pstolowski> ty!
[09:01] <mvo> and https://github.com/systemd/systemd/blob/master/src/basic/path-util.h#L13
[09:01] <mvo> so we could include "local" given that systemd is also doing this
[09:02] <mvo> zyga, pstolowski thanks, I think that's indeed the best option, stick to what systemd is using with unmerged-usr
[09:03] <mvo> (and we should document it in the code why we picked the particular values)
[09:03] <pstolowski> yes doing
[09:03] <mvo> thank you!
[09:07] <pstolowski> amazing how many things can go wrong. starting with lxd test issue the hid the problem
[09:10] <mvo> indeed
[09:10] <mvo> the joy of complexity!
[09:20] <zyga> mvo we should move off latest go
[09:20] <zyga> go test runtime: copying /home/runner/.cache/go-build/63/633155d8056ad255f77968645705b93d20bb3173db582e095c1d93bb6cdab259-d: open /snap/go/6312/pkg/linux_amd64/runtime.a: read-only file system
[09:25] <mup> PR snapd#9179 opened: cmd/snapd-generator: use PATH fallback if PATH is not set <⚠ Critical> <Created by stolowski> <https://github.com/snapcore/snapd/pull/9179>
[09:32] <mvo> zyga: oh, where do you see this?
[09:32] <mvo> fwiw, I'm debugging the udisks2 arch failure right now
[09:32] <zyga> mvo our unit test action
[09:33] <zyga> https://github.com/snapcore/snapd/pull/9175/checks?check_run_id=1001916625
[09:33] <mup> PR #9175: tests: find -ignore_readdir_race when scanning cgroups <Simple 😃> <Test Robustness> <Created by zyga> <https://github.com/snapcore/snapd/pull/9175>
[09:33] <mvo> zyga: ok
[09:34] <mvo> zyga: we can temporarly disable, I can look once I resolved udisks2
[09:34] <pstolowski> just saw "go test runtime: ... open /snap/go/6312/pkg/linux_amd64/runtime.a: read-only file system" on my PR
[09:40] <jamesh> pstolowski: the edge channel of the Go snap seems to be broken, yeah
[09:45] <mwhudson> jamesh: hmm haven't looked at the go-tip snap for a long time
[09:45] <mwhudson> jamesh: can you tell when it broke?
[09:47] <mwhudson> til snap refresh --edge does not change tracks
[09:47] <jamesh> mwhudson: it looks like it was working 10 hours ago when edge was version devel-77a11c0.  It's broken now on devel-84a6245
[09:47] <mwhudson> oh ok
[09:47] <mwhudson> that at least narrows things down :)
[09:48] <jamesh> mwhudson: this is based on browsing through snapd's CI logs
[09:48] <mwhudson> hmm very very light testing has it still working
[09:49] <jamesh> mwhudson: This is an example failed run: https://github.com/snapcore/snapd/pull/9168/checks?check_run_id=1001544453
[09:49] <mup> PR #9168: o/hookstate/ctlcmd: make is-connected check whether the plug or slot exists <Created by jhenstridge> <https://github.com/snapcore/snapd/pull/9168>
[09:50] <jamesh> it looks like it's trying to copy a compiled version of the runtime.a package to $GOROOT, and tripping up on the read only file system error
[09:50] <mwhudson> hmm
[09:50] <jamesh> (since $GOROOT is obviously a squashfs)
[09:51] <mwhudson> $ go_edge list -f '{{ .Stale }}' runtime
[09:51] <mwhudson> true
[09:51] <mwhudson> seems bad
[09:53] <jamesh> clearly that would fail with EACCESS as a normal user.  But the CI is running as root, so gets EROFS instead
[09:54] <jamesh> my mistake: this wouldn't be running as root.  It's still getting EROFS though
[09:54] <mwhudson> stalereason is "not installed but available in build cache"
[09:57] <mwhudson> which doesn't make a lot of sense
[09:59] <mwhudson> will have to look tomorrow
[10:05] <mup> PR snapd#9180 opened: github: use latest/stable go, not latest/edge <Skip spread> <Created by zyga> <https://github.com/snapcore/snapd/pull/9180>
[10:08] <mvo> the arch thing is strange, I can manually mount it but mounting via udevctl (without a snap in between even) fails. so udisks is wonky there it seems
[10:11] <zyga> mvo ^ this seems to be working
[10:11] <zyga> mvo shall I look?
[10:12] <zyga> pstolowski and ideas on the 20.04 image?
[10:15] <mup> PR snapd#9181 opened: tests: disable udisks2 test on arch linux <⚠ Critical> <Created by mvo5> <https://github.com/snapcore/snapd/pull/9181>
[10:15] <mvo> zyga: look at what?
[10:15] <zyga> arch
[10:17] <mvo> zyga: I think it's fine, I pushed a PR
[10:17] <zyga> great
[10:17] <zyga> ah, I see it
[10:17] <zyga> just above
[10:18] <mvo> zyga: hm, latest/stable is 14.07 but there is a 1.15 too
[10:18] <mvo> zyga: oh well,
[10:18] <zyga> mvo how many things shall we debug today? :)
[10:18] <zyga> I would stick to something that works for a few days and let mwhudson look
[10:20] <mup> PR snapd#9180 closed: github: use latest/stable go, not latest/edge <Skip spread> <Created by zyga> <Merged by zyga> <https://github.com/snapcore/snapd/pull/9180>
[10:20] <zyga> mvo you need to force merge https://github.com/snapcore/snapd/pull/9181#pullrequestreview-470319752
[10:20] <mup> PR #9181: tests: disable udisks2 test on arch linux <⚠ Critical> <Created by mvo5> <https://github.com/snapcore/snapd/pull/9181>
[10:21] <mvo> zyga: yeah
[10:22] <mvo> pstolowski: one comment in 9179
[10:25] <mup> PR snapd#9181 closed: tests: disable udisks2 test on arch linux <⚠ Critical> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9181>
[10:25] <mup> PR snapd#9182 opened: tests: re-enable udisks test on debian-sid <Simple 😃> <Created by mvo5> <https://github.com/snapcore/snapd/pull/9182>
[10:26] <zyga> pstolowski one more comment from me
[10:26] <pstolowski> ty
[10:26] <pstolowski> zyga: re 20.04 i can't reproduce, i don't remember what i did yesterday..
[10:26] <zyga> pstolowski boot 16.04 and use a 20.04 container
[10:27] <zyga> or boot 20.04 and boot a 20.04 container
[10:27] <zyga> inside the container check if snapd seeded and has correctly installed snaps
[10:27] <zyga> and if any snap is broken
[10:29] <pstolowski> so yeah 20.04 + 20.04 works
[10:29] <mvo> I booted a qemu with the failure but I have a meeting now
[10:46] <pstolowski> 16.04 + 20.04 container also worked
[10:47] <zyga> pstolowski when it works
[10:47] <zyga> can you look at the hash of the rootfs squash
[10:47] <zyga> maybe it just got fixed there
[10:48] <zyga> the one we saw fail was ....
[10:48] <pstolowski> where is it stored?
[10:48] <zyga> look at /var/snap/lxd
[10:48] <zyga> it's the big file there
[10:50] <zyga-x240> pstolowski: 97c470e427c425cf2ec4d7d55b6f1397ea55043c518b194a58fc6b9da426f540.rootfs
[10:50] <zyga> I'm going to grab some coffee and think about a problem
[10:51] <zyga> pstolowski feel free to telegram/facetime me if you want to talk
[10:51] <zyga> mvo  same ^
[10:52] <pstolowski> zyga: https://paste.ubuntu.com/p/ZQgyZ9tkjd/
[10:52] <pstolowski> i have that + some newer
[10:54] <pstolowski> hmm but 97c470e427c4 is being used (i think)
[11:04]  * mvo is in a meeting
[12:12] <zyga> re
[12:12] <zyga> pstolowski is 9* rootfs something that was pre-seeded?
[12:13] <zyga> I spent a while thinking about possible ways to handle ~/snap and I have some experiments to run
[12:13] <zyga> but first lunch, then standup, then experiments and then PT
[12:14] <pstolowski> zyga: no, not preseeded
[12:14] <pstolowski> seeded the old way
[12:14] <zyga> ok, so it should seed normally
[12:14] <zyga> and with your generator fix
[12:14] <zyga> does it work?
[12:17] <pstolowski> it doesn't have any of my generator changes, that's unrelated. but  i tested my generator fix on 16.04
[12:18] <zyga> but does it seed normally or are snaps broken
[12:18] <zyga> if it works that's good
[12:21] <pstolowski> i can't repro anymore for some reason. i don't remember if it seeded fine or not, but all snaps were broken. i'd like to understand what's different today or what am i missing (and why you can reproduce)
[12:28] <mvo> I have a meeting in 2min but I'm inside the nested lxd container and it looks like /snap is not mounted at all
[12:29] <mvo> no mount units generated but "snap changes" tells me that the seeidng was successful
[12:30] <zyga> mvo hmmmmm
[12:30] <zyga> perhaps nesting is the key
[12:30] <zyga> pstolowski there were two containers in the lxd test, the one that had more problems was indeed the nesting one
[12:51] <mup> PR snapd#9176 closed: cmd/snap: use ⬏ instead of ↑ where applicable <⛔ Blocked> <Created by anonymouse64> <Closed by anonymouse64> <https://github.com/snapcore/snapd/pull/9176>
[12:52] <pstolowski> zyga: ok, reproduced with lxd test in my-nesting-ubuntu
[12:53] <zyga> that's great
[12:53] <zyga> I wonder why nesting matters
[12:53] <zyga> I guess we'll learn ;)
[12:55] <zyga> mvo, pstolowski: I guess we should force-merge https://github.com/snapcore/snapd/pull/9179
[12:55] <mup> PR #9179: cmd/snapd-generator: use PATH fallback if PATH is not set <⚠ Critical> <Created by stolowski> <https://github.com/snapcore/snapd/pull/9179>
[12:56] <zyga> to reduce the number of failures and have a clear picture of what's what as we iterate
[12:56] <pstolowski> +1
[12:56] <mvo> zyga in a meeting
[12:56] <zyga> k
[13:01] <mvo> zyga meeting overruns
[13:01] <zyga> k
[13:01] <mvo> zyga I will be slightly late
[13:01] <zyga> should we wait or start
[13:01] <pstolowski> zyga: i wonder where is this coming from in this failing container https://pastebin.ubuntu.com/p/vGBtq6YCGc/
[13:05] <zyga-x240> pstolowski: quick idea, change snapd to log anything we log in any case via tasks to systemd
[13:05] <zyga-x240> we'd get time-correlated logs
[13:05] <zyga-x240> and that would be a very welcome change in general
[13:21] <mup> PR snapd#9179 closed: cmd/snapd-generator: use PATH fallback if PATH is not set <⚠ Critical> <Created by stolowski> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9179>
[13:23] <zyga-x240> thanks for merging that
[13:27] <pstolowski> zyga-x240: that's an interesting idea
[13:27] <pstolowski> zyga-x240: one other observation in the meantime
[13:27] <zyga-x240> yeah?
[13:28] <pstolowski> zyga-x240: this unmounting of snaps appears 2 seconds after 12:44:13 my-nesting-ubuntu systemd[1]: snapd.seeded.service: Succeeded.
[13:28] <pstolowski> zyga-x240: and after snapd restart
[13:28] <zyga-x240> why does snapd restart?
[13:28] <zyga-x240> for the seeding?
[13:28] <zyga-x240> I mean snapd + core18 model?
[13:31] <pstolowski> yes i think so
[13:31] <pstolowski> 2020-08-19T12:43:20Z INFO Requested daemon restart (snapd snap).
[13:31] <pstolowski> 2020-08-19T12:43:20Z INFO Waiting for automatic snapd restart...
[13:31] <pstolowski> that's from our change 1 log
[13:32] <zyga-x240> can you check if systemd knows about the mount units?
[13:32] <zyga-x240> systemctl status snap-snapd-1234.mount
[13:32] <pstolowski> zyga-x240: it doesn't
[13:33] <zyga-x240> pstolowski: and journald, it should remember
[13:34] <pstolowski> zyga-x240: indeed, but nothing interesting, mounts followed by unmount
[13:35] <zyga-x240> pstolowski: so the key question is why did systemd stop it
[13:35] <zyga-x240> pstolowski: can you paste the log?
[13:35] <zyga-x240> (everything in the nested container)
[13:37] <pstolowski> sure
[13:41] <pstolowski> zyga-x240: https://paste.ubuntu.com/p/9jgxNM5ZDz/
[13:43] <zyga> pstolowski note that the log doesn't say anything about the mount units being stopped
[13:43] <zyga> I suspect it's not that
[13:43] <zyga> can you do a test please
[13:43] <zyga> craft a quick unit for whatever snap
[13:43] <zyga> start it with systemd
[13:43] <zyga> check the log to see what it says
[13:43] <zyga> then unmount that by hand
[13:43] <zyga> then check the unit status
[13:44] <pstolowski> zyga: these umounts appear right after  "Stopped Snap Daemon.", isn't that weird?
[13:44] <mvo> pstolowski, zyga it looks like prep-snapd-in-lxd.sh fails in "apt autoremove --purge -y snapd" because it cannot unmount /snap directory
[13:44] <zyga> ohh, wait a second
[13:44] <mvo> (or are you guys at this point already?)
[13:45] <zyga> no!
[13:45] <pstolowski> nope
[13:45] <mvo> *but* that does not make the test fail, the following "apt update" is run
[13:45] <zyga> heh
[13:45] <mvo> which is super strange
[13:45] <mvo> *gar*
[13:45] <mvo> because we do "lxc exec -- bash /root/prep...
[13:45] <mvo> but theere is a "#!/bin/sh -e" in the script but no set -e
[13:46] <mvo> *fail*
[13:46] <zyga> ah
[13:46] <zyga> nice
[13:46] <mvo> so that's the first thing we need to fix
[13:46] <pstolowski> oh
[13:46] <mvo> so that it at least fails at the right point :)
[13:47] <mvo> I push a fix for this in 2min
[13:47] <ijohnson> sorry folks that's my fault
[13:47] <zyga> mvo I guess the bash -e is not needed
[13:48] <zyga> just invoke it
[13:48]  * ijohnson is bad at lxding
[13:48] <zyga> mvo and the error is repeated below
[13:48] <zyga> for the inner case
[13:48] <pstolowski> hmm isn't it messing up with seeding when it removes snapd deb initially?
[13:49] <zyga> I think that's an existing issue
[13:49] <zyga> there's an open PR
[13:49] <zyga> we should look at it and maybe land it
[13:49] <zyga> there was some discussion so it never moved
[13:51] <mup> PR snapd#9183 opened: tests: use "set -ex" in prep-snapd-in-lxd.sh <Created by mvo5> <https://github.com/snapcore/snapd/pull/9183>
[13:52]  * pstolowski short lunch break
[13:52] <zyga> mvo commented
[13:52] <zyga> I can push a patch if you want and agree with my suggestion
[13:53] <mvo> zyga: I wonder if it is using bash because 755 mode is not transfered on "lxc file push" or something, but I'm too lazy to test. if it works +1
[13:53] <zyga> yes
[13:53] <zyga> it's that
[13:53] <zyga> ;D
[13:53] <zyga> mvo yeah, I'll get it to work and push
[13:54] <ijohnson> now that you mention it, yes I think that's why I did that
[13:54] <ijohnson> because the mode was lost when pushing it
[13:54] <mvo> zyga: let's not overcomplicate it, if we need a bunch of chmods I think it's fine
[13:54] <zyga> yeah, and lxd file push mode is broken (or was last time)
[13:54] <zyga> yeah
[13:54] <mvo> zyga: but I have no strong opinions, just a bit tired that it's red :)
[13:54] <zyga> so it's like a rock
[13:54] <zyga> you turn it over
[13:54] <zyga> and there's a new rock attached to it
[13:55] <zyga> and it's another problem
[13:55] <mvo> heh
[13:55] <zyga> and we turn it over and, guess what,
[13:55] <zyga> it's a rock
[13:55] <mvo> haha
[13:55] <mvo> I really want to also get to the bottom of what is actually broken before my next meeting :(
[13:55] <pstolowski> lol
[13:55] <zyga> it's a quarry
[13:56] <zyga> wife back, afk
[13:56] <mup> PR snapd#9182 closed: tests: re-enable udisks test on debian-sid <Simple 😃> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9182>
[14:10] <mvo> zyga, pstolowski /snap is mounted inside lxd - does http://paste.ubuntu.com/p/h95HbBjKqk/   make sense?
[14:11] <cmatsuoka> ijohnson: are the existing disk cross-checks enough to validate that name-enc comes from the right disk, and name, if encrypted, comes from name-enc (and therefore from the right disk)?
[14:13] <ijohnson> cmatsuoka: with my PR, we do the following things: https://www.irccloud.com/pastebin/B549qart/
[14:14] <zyga> looking
[14:14] <zyga> mvo yes it makes sense
[14:14] <zyga> mvo containers don't use MS_SHARED /
[14:14] <zyga> (for whatever reason)
[14:14] <cmatsuoka> ijohnson: so the answer seems to be yes. I'm asking because you left a TODO comment about doing these checks in secboot_tpm
[14:15] <zyga> the systemd generator we have used to emit a mount unit to do just that
[14:15] <zyga> make /snap -> /snap bind mount
[14:15] <zyga> and then change sharing
[14:15] <zyga> all to allow snaps to mount into derivative mount namespaces via mount event propagation
[14:15] <ijohnson> cmatsuoka: I left a TODO about doing these checks in secboot_tpm ? it's probably left over with my PR that should implement all the cross-checking we want
[14:15] <zyga> otherwise things appear to work until you refresh and then the view in the container disagrees with the view in the per-snap mount namespace
[14:17] <mvo> zyga ok, so we need an extra umount in purge?
[14:17]  * mvo tries that
[14:17] <zyga> I think it's more complex than that
[14:17] <zyga> we had issues with this
[14:17] <zyga> and I think we have something in purge but it may not work
[14:17] <zyga> or we may not have, I honestly don't recall
[14:17] <zyga> I would say we want systemctl stop snap.mount
[14:17] <zyga> not an unmount
[14:18] <pstolowski> re
[14:18] <zyga> and only if this is a container
[14:18] <cmatsuoka> ijohnson: why are the checks in the paste just for run mode? aren't we cross-checking in recover mode as well?
[14:19] <ijohnson> cmatsuoka: all those checks are for run and recover mode (or should be at least), but for run mode, we use ubuntu-boot as our reference mountpoint/partition, where as for recover mode we use ubuntu-seed as our reference mountpoint/partition - that should be the only difference
[14:19] <zyga> mvo: I pushed to https://github.com/snapcore/snapd/pull/9183/files
[14:19] <mup> PR #9183: tests: use "set -ex" in prep-snapd-in-lxd.sh <Created by mvo5> <https://github.com/snapcore/snapd/pull/9183>
[14:20] <cmatsuoka> ijohnson: ah right, thanks
[14:20] <zyga> afk again
[14:26] <mvo> zyga and that works, i.e. mode is preserved?
[14:31] <mup> PR snapd#9184 opened: [RFC] packaging: umount /snap on purge for good measure <Created by mvo5> <https://github.com/snapcore/snapd/pull/9184>
[14:34] <jdstrand> mvo: hey, fyi PR 8301 has two approvals but fails the 4 lxd tests (unrelated failure). PR 8920 also has the same failures and has all comments addressed. these are both ones we discussed for 2.46. by my eod, I will have my misc updates PR done (ie, easy reviews). Depending on the speed in which I do that, I may have a separate very small, policy updates only PR for k8s-support (for microk8s strict on
[14:34] <mup> PR #8301: interfaces/many: deny arbitrary desktop files and misc from /usr/share <Created by jdstrand> <https://github.com/snapcore/snapd/pull/8301>
[14:34] <mup> PR #8920: interfaces: update cups-control and add cups for providing snaps <Needs Samuele review> <Created by jdstrand> <https://github.com/snapcore/snapd/pull/8920>
[14:34] <jdstrand> focal). I know what is needed, but I want to document why and that might take a moment (but it will only be like 3 rules, so fast to review)
[14:35] <mvo> jdstrand: we are debugging/fixing this lxd things as we speak
[14:35] <jdstrand> mvo: ack, let me know if you want me to merge master
[14:37] <mvo> jdstrand: not quite there yet, also I have a super busy afternoon with meetings so there maybe little progress, but maybe my team will pickup 9184 (if it's green)
[14:37] <jdstrand> mvo: have you decided if you are going to pull master back to 2.46? (you mentioned you may not; if you do, hopefully 8301 and 8920 will be applied and I won't need to do separate PRs for 2.46 (though, if that is what you want, that is of course fine)
[14:38] <ijohnson> cmatsuoka: thanks, for the comment, I see the TODO and just pushed a commit to remove it
[14:40] <pstolowski> ijohnson: since you worked on this area of main/lxd test -
[14:41] <ijohnson> pstolowski: s/worked/broke/ but yes
[14:41] <pstolowski> haha
[14:41] <cmatsuoka> ijohnson: thanks, I'm building an image from this branch to run a minimal test before +1'ing it
[14:41] <pstolowski> ijohnson: why do we push snapd deb into the nested container to test that snapd isn't working if there is snapd already there?
[14:42] <ijohnson> pstolowski: that is to ensure that we don't regress and actually start working in that nested container, because if that started working it would very very very likely be a confinement bug somewhere in the stack
[14:42] <ijohnson> pstolowski: or it would mean that snapd stopped doing all the checks that it should be doing
[14:43] <pstolowski> ijohnson: ah ok, so we want to test the current snapd code for this, makes sense
[14:44] <pstolowski> thx
[14:47] <ijohnson> yes
[15:19] <mvo> jdstrand: pretty sure I will (90%) just merge master into 2.46 at this point, so many fixes went into that
[15:31]  * cachio lunch
[15:58] <pstolowski> i've been trying to improve prep-snapd-in-lxd.sh in the lxd test but no luck so far and i'll soon stop for today, need to take my daughter for the training
[15:59] <ijohnson> pstolowski: where did you get ?
[16:00] <ijohnson> pstolowski: I am happy to take another look at improving it, I have some modifications I made locally yesterday that probably help quite a bit and may overlap with your work that I could propose
[16:08] <pstolowski> ijohnson: i don't have anything tangible to share, i tried to stop snapd service before apt-get remove.., now also trying apt-get remove without --purge
[16:08] <ijohnson> pstolowski: ack I will propose something on top of what 9183 already does then
[16:09] <pstolowski> ijohnson: mvo is also experimenting in 9184, but i haven't tried that
[17:01] <ijohnson> cachio: why is BUILD_SNAPD_FROM_CURRENT false for tests/nested/manual ?
[17:02] <ijohnson> it seems to me that even though we use the HOST: ... trick for that env var for the manual suite it doesn't take effect when we run nested spread tests as part of a PR with that label
[17:03] <ijohnson> cachio: see https://github.com/snapcore/snapd/pull/9102/files#r473185796 for example, that duplicate system key makes the test fail for me now on master, but it somehow passed on that PR
[17:03] <mup> PR #9102: corecfg: add "system.timezone" setting to the system settings <Run nested> <Squash-merge> <Created by mvo5> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/9102>
[17:05] <mup> PR snapcraft#3254 closed: tests: restrict colcon / ros2-foxy test to amd64 & arm64 <Created by cjp256> <Merged by sergiusens> <https://github.com/snapcore/snapcraft/pull/3254>
[17:07] <cachio> ijohnson|lunch, it is because SPREAD_BUILD_SNAPD_FROM_CURRENT=true is defined for the github actions workflow
[17:07] <ijohnson|lunch> cachio: yeah I wonder if that is working though
[17:07] <cachio> I'll update that value in the PR to make sure we allways use it
[17:07] <ijohnson|lunch> cachio: have you seen other issues like this before ?
[17:07] <cachio> ijohnson|lunch, I'll update PR 9098 with this
[17:07] <mup> PR #9098: tests: new organization for nested tests <Run nested> <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/9098>
[17:08] <cachio> thanks for the heads up
[17:08] <ijohnson|lunch> where the nested spread test passes on the PR but then immediately after merging fails ?
[17:12] <cachio> ijohnson|lunch, just pushed the update
[18:32] <mup> PR snapd#9185 opened: secboot: use the snapcore/secboot native recovery key type <UC20> <Created by cmatsuoka> <https://github.com/snapcore/snapd/pull/9185>
[18:37] <mup> PR snapd#9186 opened: interfaces: add vcio interface <Created by jdstrand> <https://github.com/snapcore/snapd/pull/9186>
[18:38] <jdstrand> ogra_: hey, that is for you ^
[18:38] <ogra_> whee !
[18:38]  * ogra_ dances
[18:45]  * cachio -> kinesiologist
[19:17] <ijohnson> jdstrand: small comment on that vcio PR
[20:09] <jdstrand> ijohnson: thanks, responded
[20:09] <ijohnson> jdstrand: how did you check?
[20:10] <ijohnson> I don't see that device in the device cgroup on my rpi running uc20 for a snap that uses an interface that puts it into the device cgroup
[20:10] <jdstrand> ijohnson: I logged into an rpi UC device and /dev/vcio was present
[20:10] <jdstrand> oh, the device cgroup, sorry, I was thinking of kmod
[20:10] <jdstrand> ijohnson: you are right about udev
[20:10] <ijohnson> jdstrand: don't we need `connectedPlugUDev: ...` rules to ensure that the device gets added to the device cgroup ?
[20:10]  * jdstrand adjusts
[20:10] <ijohnson> ah ok, I see
[20:28] <jdstrand> ijohnson: ok, pushed
[20:28] <ijohnson> ack
[20:39] <ijohnson> jdstrand: one more comment on that PR, but otherwise lgtm
[20:58] <ijohnson> cachio: hey I noticed the tests/nested/core/core-snap-refresh-on-core test fail on uc16
[20:58] <ijohnson> cachio: the test is currently flaky and needs to be updated
[20:58] <cachio> in my pr?
[20:58] <cachio> because I already updated that
[20:59] <ijohnson> cachio: oh sorry I just saw it fail on master
[20:59] <ijohnson> cachio: which PR did you fix it in ?
[21:00] <cachio> #9098
[21:00] <mup> PR #9098: tests: new organization for nested tests <Run nested> <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/9098>
[21:00] <ijohnson> cachio: no you need to make more changes in addition to that PR
[21:00] <ijohnson> cachio: the function refresh_to_new_core is coded wrong right now
[21:00] <ijohnson> cachio: that function currently does this:
[21:01] <ijohnson>             execute_remote "sudo snap refresh core --${NEW_CHANNEL}"
[21:01] <ijohnson>             execute_remote "snap info core" | grep -E "^tracking: +latest/${NEW_CHANNEL}"
[21:01] <ijohnson> cachio: it should wait for a reboot immediately after the `snap refresh` command, then get the info for the tracking channel after the reboot
[21:01] <ijohnson> cachio: right now the test is racing with snapd, if snapd finishes quick enough then it won't respond to `snap info` and it will hang
[21:03] <cachio> ijohnson, ah, I just saw that depending on the var it was trying to refresh to the same revision and failed
[21:03] <ijohnson> cachio: here's an example failure log https://pastebin.ubuntu.com/p/Wbgq47BYRg/
[21:03] <ijohnson> cachio: yes that's an independent issue
[21:03] <cachio> ijohnson, ah, nice, I'll fix it in that case
[21:03] <ijohnson> thanks!
[21:03] <cachio> thanks for that
[21:04] <ijohnson> cachio: I also saw google-nested:ubuntu-20.04-64:tests/nested/manual/refresh-revert-fundamentals:snapd fail in github actions
[21:04] <ijohnson> cachio: but I don't see an obvious reason why that failed yet
[21:05] <cachio> ijohnson, do you have a link?
[21:05] <ijohnson> cachio: yes one moment
[21:06] <ijohnson> cachio: https://pastebin.ubuntu.com/p/Yskh9yqNZH/
[21:08] <cachio> thanks
[21:08] <cachio> I'll insvestigate that one too
[21:24] <jdstrand> ijohnson: nice! done
[21:24] <ijohnson> cachio: if you could look at https://github.com/snapcore/snapd/pull/9187 that would be great
[21:24] <mup> PR #9187: tests/lib/nested.sh: use more robust code for finding what loop dev we mounted <Run nested> <Test Robustness> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/9187>
[21:24] <cachio> sure
[21:25] <ijohnson> jdstrand: thanks, +1d, we still need pedronis to review the iface name, he is back next week I think
[21:25]  * jdstrand nods
[21:28] <mup> PR snapd#9187 opened: tests/lib/nested.sh: use more robust code for finding what loop dev we mounted <Run nested> <Test Robustness> <⚠ Critical> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/9187>
[21:31] <cachio> ijohnson, seems to be ok the change
[21:32] <cachio> lets see the results
[21:32] <cachio> thanks!
[21:32] <ijohnson> thanks cachio
[21:32] <ijohnson> it passed for me once locally so 🤞
[21:33] <cachio> nice
[22:48] <mup> PR snapd#9188 opened: interfaces: misc policy updates xlvi <Created by jdstrand> <https://github.com/snapcore/snapd/pull/9188>
[23:31] <mup> PR snapcraft#3255 opened: remote-build: use requests.get() instead of urlopen() <Created by cjp256> <https://github.com/snapcore/snapcraft/pull/3255>