[02:06] <mup> PR snapd#8152 opened: managers_test: add uc20 kernel snap update happy and panic tests <Test Robustness> <UC20> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/8152>
[06:15] <mborzecki> morning
[06:38] <mborzecki> school run, back in 30
[07:06] <mborzecki> re
[07:45] <mborzecki> mvo: morning
[07:48] <mvo> hey mborzecki ! good morning
[08:01] <pstolowski> morning
[08:14] <zyga> re
[08:14] <zyga> good morning
[08:16] <zyga> tests are somewhat unhappy
[08:16] <zyga> I've seen snapd-failover fail a lot yesterday
[08:16] <zyga> I know Ian was looking into that but perhaps it needs more eyes
[08:18] <zyga> and snap-run hangs, oh my
[08:18] <pstolowski> zyga: right, i saw snapd-failover failures as well
[08:22] <zyga> so
[08:22] <zyga> I need bionic
[08:22] <zyga> and I need tests that run as user
[08:22]  * zyga gets to work
[08:40] <zyga> ok, test written, let's run it
[09:10] <zyga> mborzecki: do you expect https://bugs.launchpad.net/snapd/+bug/1863859 to be true given our environment generators?
[09:10] <mup> Bug #1863859: Using su removes /snap/bin from PATH <snapd:New> <https://launchpad.net/bugs/1863859>
[09:10] <zyga> mborzecki: I just noticed this in tests
[09:13] <mborzecki> zyga: su - ?
[09:13] <zyga> su test -c "snap run foo" vs su test -c "foo"
[09:13] <mborzecki> zyga: try su -l
[09:13] <zyga> mborzecki: k
[09:14] <mborzecki> or . $TESTSLIB/user.sh and then as_user_simple ?
[09:14] <mborzecki> zyga: simple will not load the profile
[09:16] <zyga> I'll look at some tests that suffer from this
[09:32]  * zyga loves calls with baby crying next door
[09:33] <mup> PR snapd#8153 opened: [RFC] "snap run --explain" with different formating <Created by mvo5> <https://github.com/snapcore/snapd/pull/8153>
[09:33] <zyga> mvo: I'll review in a moment, need to check what the crying is all about
[09:34] <mvo> zyga: no rush, thank you!
[09:55] <zyga> re
[10:10]  * zyga debugs all kinds of things through one failing test
[10:46] <zyga> mborzecki: quick question
[10:46] <zyga> https://www.irccloud.com/pastebin/B2mlykUV/
[10:46] <zyga> (this is test-snapd-sh without arguments
[10:46] <zyga> spawning endless loop of bad substitution
[10:46] <zyga> why would our argv[0] cause this?
[10:48] <zyga> I'm patching snap-exec to show what's going on at that layer
[10:48] <zyga> it is driving me crazy, I want to finally fix it
[10:53] <zyga> mborzecki: it's a dash specific message
[10:53] <zyga> that's interesting
[10:53] <zyga> I think we do something incorrectly
[10:53] <zyga> oh well, let me grab coffee
[10:55] <zyga> desktop 18.04 install in progress
[10:55] <zyga> I think we have a problem with systemd there :/
[10:55] <zyga> but I think my test setup is not doing what is right
[11:08] <zyga> brb
[11:14] <zyga> ha
[11:14] <zyga> got it
[11:14] <zyga> found a bug in spread
[11:17] <zyga> PS1 breaks dash
[11:17] <zyga> because SPREAD_PATH is unset
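
One way to keep a prompt string that relies on unset variables away from an interactive shell is to start that shell with a plain, literal PS1. The following is a minimal Python sketch of that idea only; the helper name `env_with_plain_prompt` and the `"$ "` prompt are assumptions for illustration, and the actual fix in the spread tests may be done differently.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_plain_prompt(
    base: Optional[Mapping[str, str]] = None, prompt: str = "$ "
) -> Dict[str, str]:
    """Return a copy of the environment with PS1 replaced by a literal prompt.

    Hypothetical helper: it does not touch the caller's environment, it only
    builds a new mapping suitable for passing to a child process.
    """
    env = dict(os.environ if base is None else base)
    env["PS1"] = prompt  # no parameter expansions left for the shell to choke on
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the shell.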
[11:18] <zyga> is it just me or is this a quiet channel?
[11:30] <sdhd-sascha> zyga: hey :-)
[11:30] <zyga> hey
[11:30] <zyga> how are you?
[11:30] <sdhd-sascha> zyga: i'm fine. thank you. Little bit tired. But fine :-) And you ?
[11:30] <zyga> a bit scared I missed something big that came up yesterday evening
[11:31] <zyga> debugging it now
[11:31] <sdhd-sascha> oh, good luck
[11:32] <sdhd-sascha> I've just started fixing the snap for the github actions.
[11:34] <sdhd-sascha> zyga: at which point is it best to chmod a script? I have a /run.sh and snapcraft says it's not executable. I currently use chmod in `override-build`.
[11:35] <zyga> sdhd-sascha: are you building on windows?
[11:35] <sdhd-sascha> `Failed to generate snap metadata: The specified command 'run.sh' defined in the app 'runner' is not executable.`
[11:35] <zyga> sdhd-sascha: it's best to keep it executable in the tree
[11:35] <sdhd-sascha> I wonder, why the generated tar isn't executable ... i will see
[11:38] <sdhd-sascha> zyga: if i go into `multipass exec ` ... Then inside of `/root/parts/runner/build/run.sh` it's executable.
[11:38] <sdhd-sascha> ```
[11:38] <sdhd-sascha> apps:
[11:38] <sdhd-sascha>   runner:
[11:38] <sdhd-sascha>     command: run.sh
[11:38] <sdhd-sascha> ```
[11:38] <sdhd-sascha> But i still get: `Failed to generate snap metadata: The specified command 'run.sh' defined in the app 'runner' is not executable.`
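
The error above is about the executable bits on the file that `command:` points at. A minimal Python sketch of checking and setting those bits follows; `ensure_executable` is a hypothetical helper name, not something snapcraft provides, and it says nothing about which copy of the file snapcraft actually inspects.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Add the user/group/other executable bits to path if any are missing.

    Returns True if the mode was changed, False if it was already executable
    for everyone. Hypothetical helper for illustration only.
    """
    mode = os.stat(path).st_mode
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if mode & exec_bits == exec_bits:
        return False
    os.chmod(path, mode | exec_bits)
    return True
```

As zyga suggests, keeping the script executable in the source tree avoids needing a step like this at build time.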
[11:39] <sdhd-sascha> hmm
[11:39] <zyga>  oh great
[11:39] <zyga> killing dash killed gnome-terminal
[11:39] <sdhd-sascha> wow
[11:40] <zyga> anyway
[11:40] <zyga> just more bugs
[11:43] <cmatsuoka> ijohnson: thanks for the PR
[11:43] <cmatsuoka> ijohnson: it seems that tests failed in a strange way
[11:46] <cmatsuoka> command systemctl is-active snapd.failure.service keeps failing after 30 attempts
[11:47] <cmatsuoka> and snapd.service: Failed to execute command: Exec format error
[11:47] <cmatsuoka> zyga: does that sound familiar to you?
[11:47] <zyga> cmatsuoka: hey
[11:47] <zyga> yeah, there's something really broken
[11:47] <zyga> it's worse recently perhaps the test is new or something related changed
[11:47] <zyga> it fails left and right
[11:48] <cmatsuoka> EXEC spawning /snap/snapd/x2/usr/lib/snapd/snapd: Exec format error
[11:48] <zyga> I haven't looked into it yet
[11:48] <zyga> cmatsuoka: yeah, the test is trying to install corrupt snapd
[11:48] <zyga> to check failover
[11:48] <zyga> but failover never kicks in
[11:48] <zyga> the exec format error is expected and harmless noise (for the test)
[11:49] <cmatsuoka> ah ok
[12:02] <mup> PR snapd#8154 opened: tests: reset PS1 before possibly interactive dash <Created by zyga> <https://github.com/snapcore/snapd/pull/8154>
[12:02] <zyga> mborzecki: ^
[12:02] <zyga> this caused me some grief
[12:02] <zyga> cmatsuoka: I can look after I investigate more systemd stuff
[12:03] <ijohnson> cmatsuoka: zyga: yes I was looking into this a bit yesterday, I think that the patch mborzecki recommended to snap-failure is useful here, I had that small change proposed in my larger core spread PR, but perhaps I should break it out just to get that merged
[12:03] <ijohnson> and morning folks btw
[12:03] <mborzecki> ijohnson: hey!
[12:03] <ijohnson> hey mborzecki
[12:03]  * ijohnson is here for real this morning
[12:04] <mborzecki> `The dash bug has been reported to the dash mailing list (there is no bug tracker).` hahah welp
[12:08] <zyga> mborzecki: oh well
[12:09] <zyga> mborzecki: it's very unixy
[12:09] <zyga> mborzecki: mailing list archive, CCing people
[12:09] <zyga> mborzecki: 3-character names
[12:09] <zyga> thank you for the review guys
[12:09] <mvo> I see a lot of red in the most recent spread runs, has anyone investigated yet?
[12:09] <zyga> mvo: snapd-failover
[12:09] <zyga> it's failing ... over and ... I'll shut up now ;)
[12:10] <cmatsuoka> :D
[12:10]  * zyga returns to debugging systemd 
[12:14] <mup> PR snapd#8155 opened: tests: mv ubuntu-core-snapd{,-failover} to core/ suite <Test Robustness> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/8155>
[12:14] <ijohnson> mvo: I opened ^ which I think will help with the snapd-failover test
[12:15] <cachio> ijohnson, zyga failover test started failing a while ago but with external backend
[12:15] <cachio> now it started failing on travis too
[12:15] <ijohnson> cachio: interesting you mean on the lab machines ?
[12:15] <ijohnson> cachio: do you have logs?
[12:15] <cachio> ijohnson, http://paste.ubuntu.com/p/DXD9WN7CtJ/
[12:16] <cachio> this is from the 2.43.3
[12:16] <cachio> line v
[12:16] <cachio> 1330
[12:16] <cachio> I thought it was related to our configuration for the external backend
[12:16] <cachio> because it was not possible to reproduce on travis
[12:16] <cachio> using gce
[12:17] <cachio> but seems to be something else
[12:17] <ijohnson> yeah that seems to fail the same way it's failing on travis now
[12:17] <ijohnson> the issue is that snapd.failure.service is not being started ever by systemd
[12:17] <cachio> the weird part is that when using the external backend the test is not failing 100% of the times
[12:17] <cachio> right
[12:18] <ijohnson> yes I don't think it is failing 100% of the time on travis either, indeed when I tried reproducing yesterday I didn't hit it
[12:18] <cachio> a while ago I shared more info about the error, I am trying to find it
[12:19] <cachio> I remember I sent the error to mvo
[12:20] <cachio> failover was trying to find core and core snap was not installed
[12:20] <cachio> ijohnson, but I cant find the logs
[12:20] <ijohnson> cachio: hmm that sounds different
[12:21] <cachio> the error is the same you see in the log I sent you
[12:21] <cachio> I'll reproduce it
[12:22] <cachio> and share the logs again, give me 15 minutes
[12:22] <zyga> google:ubuntu-18.04-64 .../tests/main/cgroup-tracking# test-snapd-sh.sh
[12:22] <zyga> $
[12:22] <zyga> google:ubuntu-18.04-64 .../tests/main/cgroup-tracking#
[12:22] <zyga> ^ getting this feels like this day is already worth it
[12:24] <ijohnson> thanks cachio
[12:25] <zyga> mborzecki: but it was worth reporting, it's a known problem
[12:25] <zyga> https://patchwork.kernel.org/patch/11343121/
[12:25] <zyga> and there's a patch
[12:28] <mvo> nice
[12:40] <mup> PR snapd#8156 opened: [RFC] cmd/snap-bootstrap: subcommand to detect if we want a chooser to run <UC20> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/8156>
[12:41] <mborzecki> mvo: ^^ the key-watching thing
[12:41] <mborzecki> pedronis: mvo: we should probably discuss the naming there
[12:46] <zyga> oh
[12:46] <zyga> I wanted to mention that yesterday
[12:46] <zyga> about recovery and prompt and stuff
[12:46] <zyga> something to keep an open mind
[12:46] <zyga> smart IOT bulbs
[12:46] <zyga> do recovery
[12:46] <zyga> if ...
[12:46] <zyga> you turn them on and off five times in a row with specific time delay
[12:47] <zyga> our discussion about key detection feels incorrect
[12:47] <zyga> it should not be something we mandate
[12:47] <codingpanic> Hello all. I installed snapd on a raspbian based pi today. Unfortunately, none of the built-in interfaces/slots are available. What creates these?
[12:47] <pedronis> zyga: we don't mandate it, we need a reference experience
[12:48] <codingpanic> Since they dont exist, snaps are unable to connect to anything
[12:48] <ijohnson> codingpanic: did you install snapd as a deb pkg ?
[12:48] <codingpanic> ijohnson: yes
[12:48] <ogra> hrm
[12:48] <ogra> ubuntu@localhost:~$ snap version
[12:48] <ogra> snap    2.43.3+git1047.g6b52b37
[12:48] <ogra> snapd   unavailable
[12:48] <ogra> series  -
[12:48] <ogra> ubuntu@localhost:~$ snap list
[12:48] <ogra> error: cannot list snaps: cannot communicate with server: timeout exceeded while waiting for response
[12:48] <ogra> ubuntu@localhost:~$
[12:48] <ijohnson> codingpanic: what's `snap list` ?
[12:48] <ogra> ubuntu@localhost:~$ ps ax|grep snapd
[12:48] <ogra>  8932 ?        Ssl    0:00 /snap/snapd/6521/usr/lib/snapd/snapd
[12:48] <ijohnson> ogra: why did you break your snapd
[12:49] <ogra> this is how i found one of my qemu machines this morning
[12:49] <ijohnson> well don't leave it like that :-)
[12:49] <ogra> seems it got some reboot-triggering update yesterday evening
[12:49] <zyga> ogra: journalctl -u snapd.service
[12:49] <ijohnson> ogra: anything in the system journal for snapd?
[12:49] <codingpanic> pi@raspberrypi:/var/log $ snap list
[12:49] <codingpanic> Name     Version   Rev   Tracking  Publisher     Notes
[12:49] <codingpanic> core18   20200124  1671  stable    canonical✓    base
[12:49] <codingpanic> scummvm  2.1.1     3126  stable    snapcrafters  -
[12:50] <codingpanic> I've also tried retroarch
[12:51] <zyga> codingpanic: snap install core
[12:51] <ogra> https://paste.ubuntu.com/p/xvRDVNPW4S/
[12:51] <ogra> the last part simply repeats over and over
[12:51] <zyga> codingpanic: it's a known bug
[12:51] <codingpanic> zyga: got it, thank you!
[12:51] <zyga> ogra: Feb 19 08:11:22 localhost snapd[824]: panic: cannot checkpoint even after 5m0s of retries every 3s: write /var/
[12:52] <zyga> ogra: did you run out of disk space
[12:52] <zyga> ?
[12:52] <codingpanic> zyga: that fixed it
[12:52] <ogra> zyga, bah, yeah
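
A quick way to confirm the "out of disk space" suspicion from a script is to ask for the free space on the filesystem in question. This is a minimal Python sketch using the standard library; the function names and the 100 MiB threshold are arbitrary assumptions for illustration, not anything snapd uses.

```python
import shutil


def free_mebibytes(path: str = "/") -> float:
    """Return the free space, in MiB, on the filesystem that holds path."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 * 1024)


def is_low_on_space(path: str = "/", min_free_mib: float = 100.0) -> bool:
    """Report whether free space is below min_free_mib (illustrative threshold)."""
    return free_mebibytes(path) < min_free_mib
```

Pointing `path` at the directory whose writes are failing checks the right filesystem when it is on a separate mount.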
[12:53]  * zyga hugs ogra 
[12:53] <zyga> been there done that (too many times)
[12:53] <zyga> I need a coffee
[12:53] <zyga> brb
[12:54] <ijohnson> pedronis: I need to run an errand right after SU (3:30 PM your time to be specific), can we put something on the calendar for 4:30 PM your time ?
[12:54] <ijohnson> zyga: I thought we fixed that bug though ?
[12:54] <zyga> ijohnson: no, that's a separate bug
[12:54] <zyga> ijohnson: look at snap list output
[12:54] <zyga> ijohnson: according to our design you have 0 interfaces in that case
[12:55] <zyga> ijohnson: you must get snapd or core
[12:55] <ijohnson> zyga: but I thought we fixed the bug where the core snap was not installed when you go to install only a snap with `base: core18` ?
[12:55] <zyga> ijohnson: also this is raspbian so probably runs an older copy
[12:55] <ijohnson> zyga: ah right yes probably older version of snapd with the bug still in place
[12:55] <zyga> yep
[12:56] <ijohnson> zyga: we should probably update snapd in raspbian, seems to happen somewhat frequently now I think
[12:56] <zyga> ijohnson: I don't think we can
[12:56] <zyga> ijohnson: we should update snapd in debian
[12:56] <zyga> it's very old
[12:56] <ijohnson> zyga: who handles snapd update in debian?
[12:56] <zyga> nobody
[12:56] <zyga> us
[12:56] <zyga> sometimes
[12:56] <zyga> when $fire
[12:57] <ijohnson> also doesn't raspbian have its own archives based on, but separate from debian? perhaps I'm misremembering
[12:57] <ogra> hrm ... freeing up 120MB and rebooting just gets me back to that state :/
[12:57] <zyga> ijohnson: it does
[12:57] <zyga> ijohnson: but we don't have any way to change it
[12:57] <ogra> bah ... because it is secretly upgrading ... it just rebooted again
[13:00] <ijohnson> hooray now ogra's ephemeral qemu VM is more secure
[13:07] <jdstrand> zyga: hey, fyi, https://github.com/snapcore/snapd/pull/7825#discussion_r381277068
[13:07] <mup> PR #7825: many: use transient scope for tracking apps and hooks <Security-High> <Created by zyga> <https://github.com/snapcore/snapd/pull/7825>
[13:07] <zyga> hye
[13:07] <zyga> hey*
[13:07] <zyga> ack
[13:07] <zyga> I'm still looking into this
[13:07] <zyga> and especially what happened to your interactive blackbox testing
[13:10] <jdstrand> zyga: that's fine. I just wanted you to be aware of that weird thing I saw where I semi-started a transient unit and couldn't get it to go away in case it saves you some time
[13:11] <zyga> thank you
[13:12] <jdstrand> zyga: otoh, can you also make sure that these tests aren't skipped on core? you mentioned trusty but I thought I remembered something about user sessions on core.... that said, xenial is starting systemd --user so hopefully it is all ok
[13:13] <zyga> jdstrand: I will
[13:13] <zyga> jdstrand: I  think this approach was picked specifically to support 16.04 desktop
[13:13] <jdstrand> that sounds familiar :)
[13:13] <jdstrand> that was a long time ago :)
[13:14] <jdstrand> zyga: fwiw, I'm really bought into the technique and like the PR overall :)
[13:14] <jdstrand> just need to make sure systemd does what we need everywhere
[13:14] <zyga> jdstrand: yeah, I just hope I didn't miss anything
[13:17] <jdstrand> zyga: I did notice yesterday that runc was using StartTransientUnit for at least something... could look there to build confidence. but one of the things I like about this is we are just asking systemd to take care of things for us. so long as we use the api correctly (which it appears you are), we should be good (barring bugs (systemd or otherwise))
[13:17] <zyga> jdstrand: the only thing I'm worried about is --user
[13:17] <zyga> the API is solid and supported for a long time
[13:17] <zyga> but user session is more complex
[13:17] <zyga> systemd --user is asked to do stuff
[13:17] <zyga> but cannot
[13:17] <zyga> so asks system systemd to do it
[13:17] <jdstrand> zyga: kinda like how we ask it to manage starting and stopping services. we tell it a little bit and say "thanks for doing this for me!". same here
[13:17] <zyga> and that path I believe is somewhat new
[13:18] <zyga> I'm digging into this now
[13:18]  * jdstrand nods
[13:18] <jdstrand> zyga: well, it isn't *terribly* new if it is in bionic (I didn't test xenial), but there is definitely some different behavior
[13:19] <jdstrand> which might be cherrypicks, etc
[13:19] <zyga> I think it's all the way back to xenial, but give me a moment to debug it
[13:19] <jdstrand> zyga: we always have in out back pocket runtime detection and gracefully degrading too (not saying we need it yet)
[13:19] <jdstrand> s/in out/in our/
[13:20] <zyga> yes, it's not essential in a way
[13:22] <cachio> ijohnson, this is the output https://paste.ubuntu.com/p/qNnRgrHktg/
[13:22] <mvo> mborzecki: thanks, looking
[13:23] <cachio> from a local run on core-18
[13:23] <cachio> ijohnson, at the end you can see the issue that I raised a while ago snap-failure[9015]: cmd_snapd.go:124: restoring invoking snapd from: /snap/core/current/usr/lib/snapd/snapd
[13:23] <ijohnson> thanks cachio, I'm looking now
[13:24] <cachio> also the error snapd.service: Failed at step EXEC spawning /snap/snapd/x1/usr/lib/snapd/snapd: Exec format error
[13:25] <ijohnson> cachio: the EXEC spawning issue is very odd, because it should be x2 that has the Exec format error
[13:26] <jdstrand> zyga: so it is crystal clear: the reproducer is: take a desktop vm install, install snapd/enable the feature/blah, in the graphical session, open a terminal, use hello-world.sh/snap run --shell/etc and it works (find /sys/fs/cgroup -name '*.scope'). ssh in/console login and try the same and it doesn't (but no errors in SNAPD_DEBUG=1 or anywhere else I could find
[13:26] <jdstrand> )
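
The `find /sys/fs/cgroup -name '*.scope'` check in the reproducer can also be scripted. Below is a minimal Python sketch under the assumption that only directories are of interest (unlike `find -name`, which would also match regular files); `find_scopes` is a hypothetical name.

```python
import os
from typing import List


def find_scopes(root: str = "/sys/fs/cgroup") -> List[str]:
    """Return sorted paths of directories under root whose name ends in .scope."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name.endswith(".scope"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Comparing the result before and after running a snap app in each kind of session shows whether a new scope was created for it.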
[13:26]  * zyga nods 
[13:26] <zyga> let me check that quickly
[13:26] <zyga> I was going via spread and I hit another issue there
[13:26]  * jdstrand nods
[13:27] <cachio> ijohnson, I have a debug session opened
[13:27] <cachio> ijohnson, if you need any output please tell me
[13:27] <jdstrand> I figured it might be easier to see what to recreate if you saw the problem
[13:27] <ijohnson> I think there is a known issue in that snap-failure should be more robust in trying to restart the socket, and isn't
[13:27] <ijohnson> cachio: ^
[13:27]  * jdstrand leaves zyga to it
[13:28] <ijohnson> cachio: sorry let me keep reading I'm not sure what I would like you to run in the debug session
[13:32] <zyga> jdstrand: wwo
[13:32] <zyga> jdstrand: so
[13:32] <zyga> jdstrand: it works on a fresh install of bionic
[13:32] <zyga> jdstrand: desktop session
[13:32] <zyga> jdstrand: I recorded bustle (dbus traffic analyzer) dumps
[13:32] <jdstrand> zyga: right, in the desktop session, it works
[13:33] <zyga> and even confirmed it the simple way
[13:33] <zyga> https://www.irccloud.com/pastebin/pENvRQqK/
[13:33] <zyga> oh? I missed that
[13:33] <zyga> when didn't it work?
[13:33] <jdstrand> zyga: ssh in or do console login and it doesn't
[13:33] <zyga> ah
[13:33] <zyga> I see
[13:33] <zyga> okay that's much clearer now
[13:33] <zyga> thanks, I'll focus on that
[13:33] <jdstrand> zyga: that is what made it tricky for me to diagnose :)
[13:33] <zyga> jdstrand: but I'm 99% sure it's because there's no session bus there
[13:33] <zyga> and this is a dbus interaction
[13:34] <jdstrand> zyga: I saw systemd --user started with just ssh
[13:34] <zyga> jdstrand: ok, I'll debug both cases (console login and ssh)
[13:34] <zyga> jdstrand: were you logging into the same user as the interactive session?
[13:34] <zyga> jdstrand: or another user
[13:34] <jdstrand> zyga: but yeah. maybe there is an activation thing or something. idk
[13:34] <jdstrand> I sopped looking when I ran out of day
[13:34] <jdstrand> stopped
[13:34] <zyga> ok
[13:34] <zyga> I'll explore those
[13:35] <zyga> because previously it was a bit magic
[13:35] <jdstrand> zyga: I was
[13:35] <zyga> as to why it worked in places I checked
[13:35] <jdstrand> (same user)
[13:35] <zyga> ok
[13:35] <zyga> that would explain the systemd --user aspect
[13:36] <jdstrand> zyga: well, no. I logged into core a few minutes ago before asking about adding uc tests and there was --user
[13:36] <zyga> jdstrand: ok, please give me a day to get to the bottom of this
[13:36] <zyga> jdstrand: if you could review any of the other PRs I mentioned I could perhaps make progress on those
[13:36] <zyga> jdstrand: but for this one, I know what to do
[13:36] <jdstrand> that doesn't mean there is a dbus session bus...
[13:36] <zyga> jdstrand: right
[13:36] <jdstrand> yes, that was the plan
[13:36] <zyga> jdstrand: perhaps 18.04 doesn't have enough activation
[13:36] <zyga> testing is paramount
[13:37] <zyga> I'll try to cover those
[13:37]  * jdstrand nods
[13:42] <sdhd-sascha> mvo: fyi. github runner tries to create config files in the binary directory. This needs to be fixed first. Seems to be hardcoded paths.
[13:42] <sdhd-sascha> https://github.com/sd-hd/runner-snap/releases
[13:43] <zyga> jdstrand, mvo: fyi https://github.com/snapcore/snapd/pull/7825#discussion_r381296772
[13:43] <mup> PR #7825: many: use transient scope for tracking apps and hooks <Security-High> <Created by zyga> <https://github.com/snapcore/snapd/pull/7825>
[13:46] <mvo> zyga: 0.05s? that is not bad :)
[13:48] <zyga> mvo: that's wall clock time
[13:48] <zyga> mvo: the actual new code is 1ms
[13:57] <ogra> you have a wall clock that measures 0.05s ?!?
[13:58] <roadmr> ogra: well if you look hard enough a lot of wristwatches measure 0.25s (seconds hand moves at 4hz)
[13:58] <roadmr> that's mechanical wind-up wristwatches mind you
[13:58]  * roadmr now looks really old and anachronistic ⌚
[14:00] <ogra> haha, yeah
[14:08] <zyga> jdstrand: reproduced
[14:08] <zyga> https://www.irccloud.com/pastebin/lg1cEV4Q/
[14:08] <zyga> jdstrand: ^ that's bionic over ssh as user
[14:08] <zyga> jdstrand: running systemd-run --user --scope ls
[14:09] <zyga> jdstrand: on the up side, I now have systemd --user running as my user (remote ssh activated it)
[14:09] <zyga> jdstrand: on the down side, no idea why it failed yet
[14:09] <zyga> but I can dig
[14:10]  * jdstrand nods
[14:10] <jdstrand> I <3 reproducers
[14:10] <zyga> jdstrand: but that's really weird, raspbian (older) works okay
[14:10] <zyga> jdstrand: but I'll dig
[14:12] <zyga> jdstrand: I'll check 20.04 and 16.04 as well
[14:12] <zyga> jdstrand: _complexity_ :)
[14:14] <jdstrand> zyga: yes. hopefully it isn't too deep (I recall how the snap_daemon spun out cause of bugs, bugs, bugs. I hope very much that isn't the case here (I don't think it is))
[14:16] <zyga> I think it's just configuration (famous last words?)
[14:18] <mborzecki> jdstrand: zyga: your ssh shell should already be part of a user slice/session scope (?)
[14:18] <zyga> it is
[14:19] <mborzecki> zyga: same when you log in on a tty, though no clue who arranges that, pam & logind maybe?
[14:19] <mborzecki> s/who/what/
[14:20] <zyga> mborzecki: the problem is why remote cannot systemd-run
[14:20] <zyga> smells like polkit to me
[14:20] <zyga> I'll look after the standup
[14:23] <mborzecki> hmm hmm interesting
[14:23] <roadmr> jdstrand, pedronis : is there a non-system-files interface that would allow write access to /etc/ssh/ssh{,d}_config ?
[14:24] <roadmr> if not, is it acceptable to grant system-files in this case (it's a device-specific snap in a brand store, so mostly well-scoped)
[14:30] <zyga> mborzecki: it's polkit
[14:30] <zyga> mborzecki: give me a while to get the bits in place
[14:30] <mborzecki> zyga: polkit?
[14:30] <zyga> yep
[14:30] <zyga> it's always polkit ;)
[14:30] <mborzecki> it's always /pulseaudio/polkit/snapd/
[14:30] <mborzecki> heh forgot systemd
[14:35] <zyga> I need to take the dog out
[14:38] <roadmr> who let the dogs out? woof wof...
[14:42] <mborzecki> zyga: doesn't look like polkit to me
[14:42] <mborzecki> openat(AT_FDCWD, "/sys/fs/cgroup/unified/user.slice/user-1000.slice/user@1000.service/run-rbcaa5f3010d646e38684382cacc4aa70.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC)
[14:42] <mborzecki> = 34
[14:42] <mborzecki> fcntl(34, F_GETFL)                      = 0x8001 (flags O_WRONLY|O_LARGEFILE)
[14:42] <mborzecki> write(34, "2821\n", 5)                  = -1 EACCES (Permission denied)
[14:42] <mborzecki> uh, w8 that's on focal ;)
[14:46] <mborzecki> hm weird, there's this sequence
[14:46] <mborzecki> openat(AT_FDCWD, "/sys/fs/cgroup/unified/user.slice/user-1000.slice/user@1000.service/run-r51cceeed963d4c33ace8cf029e2657d5.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC)
[14:46] <mborzecki> = 30
[14:46] <mborzecki> write(30, "2504\n", 5)                  = -1 EACCES (Permission denied)
[14:47] <mborzecki> close(30)
[14:47] <mborzecki> followed by opening *.scope directory and getdents()
[14:48] <mborzecki> anyways, the message is Failed to add PIDs to scope's control group: Permission denied which kind of makes sense in this context
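
The strace output above shows the open of `cgroup.procs` succeeding and the subsequent write of the PID failing with EACCES. A minimal Python sketch that mirrors that open-then-write sequence and reports which errno comes back is shown here; `try_add_pid` is a hypothetical helper, and the scope path has to be supplied by the caller.

```python
import os


def try_add_pid(cgroup_procs_path: str, pid: int) -> int:
    """Write pid to a cgroup.procs file; return 0 on success, else the errno.

    Uses the same open flags as in the trace (O_WRONLY|O_NOCTTY|O_CLOEXEC).
    """
    try:
        fd = os.open(cgroup_procs_path, os.O_WRONLY | os.O_NOCTTY | os.O_CLOEXEC)
    except OSError as err:
        return err.errno
    try:
        os.write(fd, f"{pid}\n".encode())
    except OSError as err:
        return err.errno
    finally:
        os.close(fd)
    return 0
```

A return value equal to `errno.EACCES` would correspond to the "Failed to add PIDs to scope's control group: Permission denied" message.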
[14:53] <zyga> re
[14:53] <zyga> mborzecki: which systemd is that?
[14:54] <zyga> ah, focal
[14:54] <zyga> mborzecki: systemd --user should ask system to do that write
[14:54] <zyga> mborzecki: that's what I read from sources
[14:54] <zyga> mborzecki: anyway, I'll keep digging
[15:01]  * zyga went outside for a moment and finally cooled the fever of his laptop while using google meet
[15:02] <pedronis> roadmr: I don't think there is one atm
[15:03] <roadmr> pedronis: :( any objections with using system-files for this then?
[15:03] <pstolowski> mvo:  thanks for the suggestion re version check!
[15:05] <zyga> dns timeouts ugh
[15:05] <zyga> Feb 19 07:03:47 bionic systemd-resolved[618]: Grace period over, resuming full feature set (UDP+EDNS0) for DNS server 172.16.153.2.
[15:05] <pedronis> roadmr: no deep objections as long it's scoped, we might add something like sshd-control at some point but not immediately
[15:06] <jdstrand> roadmr (cc pedronis): there isn't one. if this is in a brand store, 'no', but they can certainly hurt themselves
[15:06] <jdstrand> pedronis: I might suggest 'snap set core' things instead of sshd-control
[15:07] <pedronis> jdstrand: yea, but we need clear use cases
[15:07] <pedronis> to get tehre
[15:07] <pedronis> get there
[15:07] <jdstrand> yes
[15:08] <jdstrand> pedronis: they can be added one by one to snap set. sshd-control is risky and security sensitive
[15:08] <jdstrand> imho
[15:08] <pedronis> jdstrand: I understand, the surface of the config is large though
[15:08] <jdstrand> roadmr: but yeah, no objections for system-files in a brand store so long as they know the risk
[15:08] <pedronis> anyway not something we'll do soon
[15:08]  * jdstrand nods
[15:08] <roadmr> jdstrand: yep, like pedronis said this can be scoped to the specific store to limit risk. I can tell them there is indeed a risk so they are super aware
[15:09] <roadmr> thanks jdstrand pedronis
[15:09] <pedronis> roadmr: but if they can tell use what kind of config they are changing it could help us design something later
[15:09] <pedronis> s/use/us/
[15:09] <roadmr> pedronis: sure thing, I'll also ask
[15:10] <pedronis> roadmr: thx
[15:44] <zyga> jdstrand: FYI https://github.com/snapcore/snapd/pull/7825#discussion_r381360336
[15:44] <mup> PR #7825: many: use transient scope for tracking apps and hooks <Security-High> <Created by zyga> <https://github.com/snapcore/snapd/pull/7825>
[15:44] <zyga> github is kind of slow for me today
[15:44] <zyga> but new notifications are really cool, a big improvement over YOU HAVE NAN NOTIFICATIONS
[15:45] <jdstrand> github is both slow and broken for me today
[15:45] <jdstrand> 500 on the above url
[15:45] <zyga> also 500 often
[15:45] <zyga> haha
[15:45] <zyga> yeah, just reload a few times
[15:45] <zyga> it works eventually
[15:45] <jdstrand> not able to expand resolved conversations...
[15:45] <zyga> jdstrand: at some point we can say that github is slow because new core snap refreshes ;-)
[15:46] <jdstrand> heh
[15:46] <zyga> jdstrand: the comment says that cgroup v2 doesn't have pids, it's just one big flat tree
[15:46] <zyga> jdstrand: in v1 we _could_ scope it but current code doesn't require it
[15:46] <zyga> jdstrand: scoping it mainly requires us to know more about systemd, we could do it at a low cost but I deem it non-essential now
[15:46] <zyga> jdstrand: certainly something we can improve ahead of making this stable
[15:47] <zyga> jdstrand: I don't know how you develop locally but I find a desktop (workstation) installation (vm) of fedora 31 is very useful for exploring a full complex system that uses pure v2 mode
[15:52] <jdstrand> zyga: I obviously need some practical v2 experience since I can never remember how the layout is. I occasionally have a fedora vm, but I find by the next time I need to look at fedora, the one I have is eol ;P
[15:52] <jdstrand> zyga: I replied
[15:52] <zyga> jdstrand: :)
[15:55] <ijohnson> okay I think that I have the snapd-failover test working again
[15:55] <ijohnson> will push my fix up
[15:55] <zyga> ijohnson: great news
[15:55] <pedronis> great
[15:55] <zyga> ijohnson: cannot wait to see what it was
[15:56] <ijohnson> StartLimitInterval{,Sec}=0 doesn't play well with OnFailure, because the OnFailure only runs after the unit is in "totally really absolutely failed" state, which systemd only considers after it has exhausted the start limits
[15:57] <ijohnson> so in our spread tests we specifically set StartLimitInterval=0, which breaks the OnFailure
[15:57] <roadmr> jdstrand: how can I combine allow-installation constraints? I have used "on-store" by itself, and "plug-attributes" by itself but for this one I need both to apply (ANDing them). I can provide both snap.yaml and my partial snap-decl plugs thingy
[15:57] <ijohnson> It's unclear why this broke now, I think it may have been a systemd bug that this test worked at all on spread before, and they just now fixed the bug that makes it resiliently fail
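
The interaction ijohnson describes could be caught by a small lint over unit files and overrides. This is a minimal Python sketch, assuming his explanation holds and that the unit text is simple enough for `configparser` to parse; `onfailure_may_never_run` is a hypothetical name and the check only recognizes the literal value `0` (not forms such as `0s`).

```python
import configparser


def onfailure_may_never_run(unit_text: str) -> bool:
    """Return True if the unit sets OnFailure= and a start limit interval of 0.

    Looks for StartLimitInterval= or StartLimitIntervalSec= in either [Unit]
    or [Service], since the chat mentions both names and both sections.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # systemd directive names are case-sensitive
    parser.read_string(unit_text)

    has_onfailure = parser.has_option("Unit", "OnFailure")
    for section in ("Unit", "Service"):
        for key in ("StartLimitInterval", "StartLimitIntervalSec"):
            if parser.has_option(section, key) and parser.get(section, key).strip() == "0":
                return has_onfailure
    return False
```

Real unit files can repeat directives and use drop-ins, so a check like this is only a first pass, not a replacement for `systemctl show`.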
[15:59] <jdstrand> roadmr: please supply both. feel free in privmsg
[15:59] <roadmr> jdstrand: coming up!
[16:01] <ijohnson> cachio: zyga: mvo: can y'all (re-)review #8155? I think the test is fixed now, but I want to be sure that there's not unintended consequences to setting StartLimitInterval=100
[16:01] <mup> PR #8155: tests: mv ubuntu-core-snapd{,-failover} to core/ suite <Test Robustness> <Created by anonymouse64> <https://github.com/snapcore/snapd/pull/8155>
[16:01] <zyga> y
[16:02] <zyga> another new github feature? https://usercontent.irccloud-cdn.com/file/m2oxIZH8/Zrzut%20ekranu%202020-02-19%20o%2017.02.13.png
[16:04] <mborzecki> zyga: the first bit was from focal, the second from bionic
[16:04] <zyga> ijohnson: I cannot find StartLimitInterval documentation
[16:04] <zyga> ijohnson: but there is StartLimitIntervalSec
[16:04] <zyga> ijohnson: and also StartLimitBurst
[16:04] <ijohnson> zyga: it was renamed to StartLimitIntervalSec in systemd 230 I think
[16:04] <zyga> ijohnson: are both names supported?
[16:05] <ijohnson> zyga: I'm pretty sure, but let me double check
[16:08] <zyga> ijohnson: interesting, I would love to understand why it worked before
[16:08] <zyga> ijohnson: perhaps there's more to it than that
[16:08] <zyga> ijohnson: I think that the explanation on how the value 0 interacts with OnFailure is good but I need to cross check it with documentation
[16:09] <ijohnson> leonard has some comments on bug reports explaining the interaction, does that count :-)
[16:09] <ijohnson> zyga: see comment 7 https://bugs.freedesktop.org/show_bug.cgi?id=87799
[16:10] <ijohnson> but then of course see comment 8 right below it :-)
[16:12] <zyga> ijohnson: indeed, we should reference in the code there
[16:12] <zyga> https://bugs.freedesktop.org/show_bug.cgi?id=87799#c7
[16:12] <ijohnson> should we add a comment in the code or the commit message ?
[16:12] <cachio> ijohnson, nice, reviewing it
[16:12] <zyga> in the code I think
[16:12] <zyga> easier to keep track of
[16:12] <ijohnson> hmm I guess if it was in the override I would have seen it much quicker in the running system
[16:13] <ijohnson> yeah good point I'll add it to the code
[16:17] <pedronis> pstolowski: I re-reviewed 8130
[16:18] <pstolowski> pedronis: just saw it, thank you!
[16:21] <ijohnson> okay I have it added to the override unit we put in the spread image, I will put it in a followup PR so that as soon as this one is green we can merge
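The idea discussed above, keeping the bug reference next to the setting inside the override itself, could look roughly like the following sketch. The helper name `render_override`, the default section, and the interval value passed in are assumptions for illustration only; the actual override shipped in the spread image is not shown in this log.

```python
def render_override(
    interval: int,
    section: str = "Unit",
    reference: str = "https://bugs.freedesktop.org/show_bug.cgi?id=87799#c7",
) -> str:
    """Render a systemd drop-in that documents why the start limit is set.

    Hypothetical helper: the section and value are parameters because the
    log does not state what the real override contains.
    """
    lines = [
        f"[{section}]",
        # Lines starting with '#' are comments in systemd unit files, so the
        # reference stays visible when inspecting the running system.
        f"# See {reference} for how this setting interacts with OnFailure=",
        f"StartLimitInterval={interval}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_override(100), end="")
```

Keeping the comment in the drop-in rather than only in a commit message means it shows up wherever the override is read, which is the point made in the discussion.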
[16:29] <pstolowski> ijohnson: can you take another look at #8003 when you have a moment?
[16:29] <mup> PR #8003: o/ifacestate, api: implementation of snap disconnect --forget <Needs Samuele review> <Created by stolowski> <https://github.com/snapcore/snapd/pull/8003>
[16:29] <ijohnson> pstolowski: I'll try to look this afternoon or tomorrow morning
[16:32] <pstolowski> thx
[16:32] <zyga> ijohnson: reviewed, please look
[16:32] <zyga> I'll be back shortly
[16:38] <ijohnson> zyga: it looks like StartLimitInterval is only valid in [Service] in systemd 229 and below; in systemd 230 and above, StartLimitInterval is still supported in [Service], but StartLimitIntervalSec is only supported in [Unit], so we are fine to keep using StartLimitInterval in [Unit] like the test does
[16:38] <ijohnson> see https://lists.freedesktop.org/archives/systemd-devel/2017-July/039255.html, but this is poorly documented in systemd docs
[16:38] <ijohnson> zyga: how would you like me to document that in the PR?
[16:40] <ijohnson> zyga: see the corresponding commit too https://github.com/systemd/systemd/commit/f0367da7d1a61ad698a55d17b5c28ddce0dc265a
[17:01] <zyga> ijohnson: a code comment will suffice
[17:01] <zyga> I just felt it is a bit too magic to leave as-is
[17:03] <mup> PR snapcraft#2948 opened: Regenerate the GDK pixbuf loaders cache file if for whatever reason it isn't there (LP: #1863801) <Created by oSoMoN> <https://github.com/snapcore/snapcraft/pull/2948>
[17:03] <ijohnson> zyga: a follow-up okay for the comment then?
[17:05] <zyga> ijohnson: is it green now?
[17:06] <ijohnson> Ah debian seems unhappy now
[17:07] <roadmr> 😞  <- debian
[17:08] <cwayne> I love how roadmr is just like the dad of all IRC channels
[17:08] <roadmr> I am everywhere.
[17:08] <roadmr> I am inevitable 👌
[17:08]  * zyga needs a nap
[17:09] <zyga> ijohnson: feel free to discard my review and land with a comment
[17:09] <zyga> I will be back in 2 hours
[17:09] <zyga> So sleepy now
[17:30] <mup> PR snapd#8157 opened: tests: using google storage when downloading ubuntu cloud images from gce <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/8157>
[17:43]  * cachio afk 
[17:43] <cachio> going to the doctor
[19:22] <zyga> re
[19:22] <zyga> *ah, I needed that*
[19:23] <zyga> so now nfs-support fails
[19:24] <zyga> ijohnson: do you think that is related to the change?
[19:24] <zyga> that's an 18.04 failure
[19:24] <zyga> but actually, not
[19:24] <ijohnson> zyga: yes I noticed now that fails on the PR
[19:24] <ijohnson> zyga: I have a fix for that
[19:24] <zyga> across the board
[19:24] <zyga> do you know what happened?
[19:24] <ijohnson> I was setting StartLimitInterval wrong, that's just the, well, interval, not the number of times starts can happen
[19:24] <ijohnson> StartLimitBurst is the number of times it can start
[19:25] <zyga> aha
[19:25] <zyga> ok, let's try that
[19:25] <ijohnson> So I changed it to set both StartLimitBurst and StartLimitInterval and now it's good
[19:25] <ijohnson> let me push it up
[19:25] <zyga> thanks!
[19:25] <ijohnson> was in the middle of UC20 debugging
[19:25] <zyga> no worries, I just woke up
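The distinction drawn above (the interval is only the time window, the burst is how many starts fit inside it) can be illustrated with a small model. This is a simplified sketch and not systemd's implementation: the class name `StartLimiter`, the fixed-window behaviour, and the numbers used are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartLimiter:
    """Toy model: at most `burst` starts are allowed per `interval` seconds."""

    interval: float  # length of the window in seconds; 0 disables limiting
    burst: int       # number of starts allowed inside one window
    window_start: Optional[float] = None
    starts_in_window: int = 0

    def allow_start(self, now: float) -> bool:
        # An interval of 0 turns rate limiting off entirely in this model.
        if self.interval == 0:
            return True
        # Open a fresh window on the first start or once the old one expired.
        if self.window_start is None or now - self.window_start > self.interval:
            self.window_start = now
            self.starts_in_window = 0
        # Inside the window, only the first `burst` starts are permitted.
        if self.starts_in_window < self.burst:
            self.starts_in_window += 1
            return True
        return False


if __name__ == "__main__":
    limiter = StartLimiter(interval=100, burst=2)
    for t in (0, 1, 2, 150):
        print(t, limiter.allow_start(t))
```

In this model, changing only `interval` leaves the number of permitted starts untouched, which is why both values had to be set together.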
[19:39] <mup> PR snapcraft#2949 opened: Fix clean on Windows <Created by NickZ> <https://github.com/snapcore/snapcraft/pull/2949>
[19:55] <mup> PR snapd#8158 opened: tests: disable archlinux system <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/8158>
[19:56] <ijohnson> cachio: should we use the no-spread label on that PR?
[20:05] <mup> PR snapd#8158 closed: tests: disable archlinux system <Skip spread> <Created by sergiocazzolato> <Merged by sergiocazzolato> <https://github.com/snapcore/snapd/pull/8158>
[21:16] <mup> PR snapcraft#2933 closed: remote-build: introduce --launchpad-snapcraft-channel option <Created by cjp256> <Closed by cjp256> <https://github.com/snapcore/snapcraft/pull/2933>
[23:36] <mup> PR snapd#8159 opened: snap-bootstrap: remove created partitions on reinstall <UC20> <Created by cmatsuoka> <https://github.com/snapcore/snapd/pull/8159>