=== NickZ3 is now known as NickZ [01:52] PR snapcraft#2950 opened: meta: Snap to_dict() cleanup [05:14] PR snapd#8148 closed: overlord/configstate/configcore: add support for backlight service <⛔ Blocked> [06:08] morning [06:35] school run [06:38] PR snapd#8160 opened: overlord/configstate: add backlight option [07:12] re [07:33] mvo: hey [07:38] mborzecki: good morning [07:50] good morning [07:50] hmm looks like my git filter-branch incantations on #8156 were not good enough [07:50] zyga: hey [07:50] PR #8156: [RFC] cmd/snap-bootstrap: subcommand to detect if we want a chooser to run [07:51] hmm [07:51] didn't I review https://github.com/snapcore/snapd/pull/8160 already? [07:51] PR #8160: overlord/configstate: add backlight option [07:51] or something just like it? [07:53] zyga: I think the config var got tweaks [07:54] meanwhile master is still red [07:54] Ian sent a fix but it seems to be insufficient or also broken somehow [07:55] hmm, what is curious is that the failure is only present on core18 [07:56] I wonder if that's because of the extra logic on how snapd is started there [08:03] zyga: I run a debug session now [08:03] good morning pstolowski [08:03] morning [08:03] hey Pawel, good morning [08:03] zyga: it's a bit annoying, also it's super unclear why it's starting now [08:03] pstolowski: morning! [08:04] zyga: actually - new systemd in bionic since 2020-02-17 [08:04] ohhh [08:04] zyga: I think this is roughtly when the trouble started? [08:04] that's a good trail [08:04] yeah, let's see the changelog [08:05] zyga: we could run core18 tests with stable core18, that still has the old version [08:05] wow, that's true, that's brilliant [08:05] if that passes we know for sure it's the base OS that changed [08:05] zyga: the changelog has a smoking gun [08:06] reading it now [08:06] og [08:06] OnFailure job something [08:06] don't trigger ONFailure [08:06] - Only trigger OnFailure= if Restart= is not in effect (LP: #1849261) [08:06] Bug #1849261: Update systemd for ubuntu 18.04 with fix for interaction between OnFailure= and Restart= [08:06] [08:06] zyga: exactly, this looks super suspicious [08:06] zyga: also wrong, I mean, if it fails to restart for n times we need to go into failure [08:06] anyway [08:07] hmm hmm hmm [08:07] mvo: but it passes in 20.04 [08:07] so something changed there [08:08] the patch is https://github.com/systemd/systemd/pull/9158/files [08:08] PR systemd/systemd#9158: trigger OnFailure= only if Restart= is not in effect [08:08] I'll check if that got changed further down the line [08:08] zyga: cool, thank you [08:08] zyga: I think this is it, kind of annoying [08:08] zyga: but at least we know now what to do [08:08] mvo: the test does run on 20.04, right? [08:09] zyga: lol - no [08:09] zyga: it has a TODO:UC20: "does not work for unknown reasons" [08:09] pfff [08:09] hahahaha [08:09] so it's broken [08:09] and now we need to live with it [08:10] ok [08:10] zyga: well [08:10] zyga: let me try something [08:10] the code didn't change since that patch, systemd master still behaves the same way it does in 18.04 [08:10] ok [08:10] so [08:11] I think we need to change how snapd restarts [08:11] if we limit that to a fixed number [08:11] zyga: I hope we just need to change the startlimitinterval [08:11] then on failure will trigger again [08:11] no, not the interval [08:11] zyga: then the failure will be triggered again [08:11] zyga: no? 
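To make the changelog entry quoted above concrete: with the updated systemd, OnFailure= only fires once a unit actually ends up in the "failed" state, which for a unit with Restart= means only after the start limit is exceeded. A minimal, hypothetical unit illustrating the interaction (unit names and values are examples, not the shipped snapd.service):
```
# Illustration only -- a hypothetical unit, not the real snapd.service.
[Unit]
Description=Example daemon
# With the bionic systemd update discussed above (systemd PR 9158),
# OnFailure= no longer fires on every crash while Restart= is in effect;
# it fires only once the unit actually reaches the "failed" state.
OnFailure=example-failure.service
# The unit reaches "failed" once it has been restarted more than
# StartLimitBurst times within StartLimitIntervalSec (older systemd
# spells this StartLimitInterval=, usually under [Service]).
StartLimitIntervalSec=10s
StartLimitBurst=3

[Service]
ExecStart=/usr/bin/example-daemon
Restart=on-failure
```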
[08:11] we need to allow snapd to really fail [08:11] that's how I understand this fragment: [08:11] https://www.irccloud.com/pastebin/phqUSrfE/ [08:11] let me read the diff again [08:12] zyga: right, my understanding is that once we hit the restart interval it will actually go into failed state and then the OnFailure is run [08:12] zyga: but I just infered this, did not read the patch [08:12] we need to check [08:13] I mean [08:13] just do [08:13] zyga: mvo: what I think is the best fix to fix this with the new systemd is 2 things, 1 nfs-support test needs to also reset-failed on snapd.socket and 2 snap-failure needs to reset-failed before trying to start the socket unit [08:13] https://github.com/systemd/systemd/pull/9158/files#diff-a89a9b6f80aada989d298b4c2c3a9d64R2435 [08:14] PR systemd/systemd#9158: trigger OnFailure= only if Restart= is not in effect [08:14] so, as long as the service will auto restart [08:14] we don't get failed state [08:14] ijohnson: good mornign! [08:14] (it's kind of early, wow :) [08:14] Yeah couldn't sleep but probably will try to go back to bed shortly [08:15] ijohnson: woha, hey [08:15] ijohnson: good evening :) [08:15] * mvo hugs ijohnson [08:15] mvo: indeed :-) [08:15] ijohnson: hey! already adjusting to CET timezone? [08:16] mvo: I need to handle an errand at home, I'll be back in 20 [08:16] zyga: no worries [08:16] zyga: testing a patch now [08:16] Anyways snap-failure needs to be a bit more robust about starting the snapd.socket service anyways [08:16] ijohnson: thanks, that's good to know [08:17] Because cachio ran into another problem with this test that happens because the socket is still lying around [08:17] ijohnson: I'm poking at it now, a bit annoying since it takes forever to run each test [08:17] Haha yes that was most of my day yesterday waiting for spread and fixing other random things in the meantime [08:18] mvo: but also see my comment about the nfs-support test, it calls reset-failed on _only_ snapd, it should also call that on snapd.socket too [08:22] stepping out for a bit to get the papers to my accountant, and then will probably work from some place in the city [08:34] PR snapd#8161 opened: tests: set StartLimitInterval in snapd failover test [08:36] ijohnson: thanks, I check these area too but it seems the current issue is about the StartLimitInterval [08:36] re [08:41] there's some chaos at home today [08:41] I'm heading out [08:41] mvo: what I was trying to say is that the nfs-support test will start failing too when you make adjustments to the StartLimitInterval that are favorable to snapd-failover because that test is flaky due to not resetting the failed count of the snapd.socket service [08:41] don't want to be a part of this [08:41] (family life and living with parent-in-law) [08:43] * mvo hugs zyga [08:44] ijohnson: aha, thanks. my current approach was to adjust the StartLimitInterval only in this one test [08:44] ijohnson: i.e. 
just enough to get us on our feed again, not at all against fixing more [08:45] Ah okay nvm me then [08:47] ijohnson: let's talk some more later, I really don't want to keep you from sleep :) [08:47] It's fine no worries [09:08] mvo: re [09:08] mvo: small observation [09:08] mvo: perhaps we want to consider a spread suite that does run in autopkgtest and gates classic systemd [09:09] mvo: we would catch this if that test was a part of that set [09:09] mvo: something to consider after this is resolved [09:10] mvo: I'll work on my tasks from Jamie's review - please pull me into a review once you have the fix [09:10] mvo: and it's also a good good catch about this test, if we had disabled that test and released a new core we could really have a bad day [09:11] zyga: it's an interessting idea, we did have autopkgtest for snapd but it was terrile unreliable [09:11] zyga: we would have caught this thought [09:11] yeah, perhaps we can turn some knobs to make it better though [09:11] dunno [09:11] * mvo nods [09:13] totally offtopic, I noticed that systemd --user runs for gdm [09:13] we should probably not run snap services for gdm [09:15] mvo: hi, is it expected there's no standup in the calendar today? [09:16] pedronis: not expected, uh, let me fix this [09:16] pedronis: thanks for catching that [09:23] #8130 needs a 2nd review [09:24] PR #8130: overlord, state: don't abort changes if spawn time before StartOfOperationTime (2/2) [09:27] PR snapd#8162 opened: features: enable "parallel-installs" by default [09:29] yay [09:29] mvo: that's for 2.44? [09:30] mvo: that's great to see, thank you! [09:32] mborzecki: that's something to discuss [09:32] mborzecki: 2.44 means it's ready for 20.04 and most likely on the CD [09:32] mborzecki: which is great [09:33] mborzecki: but also risky [09:34] ayy ;) [09:34] mborzecki: let's talk at the standup [09:34] mvo: ok [09:34] also - amazon linux is failing to install packages right now :( [09:35] mvo: duh, got logs? 
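Regarding ijohnson's point above about the nfs-support test: once the start limit has been hit, the units stay in the "failed" state and refuse to start until their counters are cleared, so the cleanup has to cover the socket unit too. A minimal sketch of that cleanup, assuming a spread-style shell step rather than the actual test code:
```
# clear failure state on both units, not just snapd.service
systemctl reset-failed snapd.service snapd.socket || true
systemctl start snapd.socket
```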
[09:36] mborzecki: yes, one sec [09:36] mborzecki: I moved it to unstable for now, looks like some repo issue [09:36] mborzecki: probably transient [09:37] mborzecki: https://travis-ci.org/snapcore/snapd/jobs/652883253?utm_medium=notification&utm_source=github_status [09:39] yay, 8161 unbreaks the snapd-failover test [09:39] mvo: we really need to do something about aliases before turning on parallel installs by default [09:39] mvo: yeah, looks like inconsistency in amazon's repos [09:40] mvo: and kinda basic, iproute, iptables :/ [09:40] PR snapd#8162 closed: features: enable "parallel-installs" by default [09:40] pedronis: ok, closing this again then until we have time for this [09:41] mborzecki: yeah, that was my impression as well, hopefully transient [09:41] pedronis: thanks for the review [09:41] mvo: yea, sadly it needs more work, but I think the current experience is suboptimal [09:41] and if we turn it on we are stuck with the behavior forever [09:45] reviews for 8161 would be appreciated [09:45] (should be super simple) [09:45] and fixed the snapd-failover test - at least in the one run that happened there :) [10:09] mvo: what's the context of amazon linux and unstable [10:09] the commit message has no other information [10:12] also - amazon linux is failing to install packages right now :( [10:13] PR snapd#8163 opened: tests: enable snapd-failover on uc20 [10:37] * pedronis reboots [11:26] PR snapd#8164 opened: snap: use the actual staging snap-id for snapd [11:26] pedronis: got a question about https://github.com/snapcore/snapd/pull/8156#discussion_r381881462 you'd see the WaitTriggerKey() be moved to some other file, right, and just keep the interfaces and related structs in triggerwatch.go? [11:26] PR #8156: [RFC] cmd/snap-bootstrap: subcommand to detect if we want a chooser to run [11:27] mborzecki: not quite, I think the structs should go where they are used as input or output to functions [11:27] mborzecki: this is go so the interfaces can be defined on the consuming side [11:29] mborzecki: I mean keyEvent etc should be in evdev.go [11:35] pedronis: hmm right, the only concern i have is that keyEvent is a concrete type, so it's kind of shared between the consumer and producer, not sure if i'm explaining this right :) [11:35] mborzecki: you are, but it is not a concern in go [11:36] the interfaces exist mostly for testing, no? [11:36] mborzecki: it's very unusual for a consumer to define structs it wants [11:36] pedronis: yeah, well, maybe i'm trying to complicate it too hard :) [11:37] mborzecki: let's put it this way: if you tried to stick evdev.go in its own package as is, it wouldn't work [11:42] mborzecki: does this ^ remark make sense? I'm also happy to HO if I'm still confusing you [11:47] pedronis: it's ok, i'm overcomplicating this, thought about a scenario when evdev went to a separate package and KeyEvent would either need to be an iface, or one package would need to import the other to get the definition, but it makes no sense to restructure it this way [11:49] cachio: hey! i've asked 2 questions under #8157 [11:49] PR #8157: tests: using google storage when downloading ubuntu cloud images from gce [11:49] pstolowski, hi, I'll take a look [11:54] mborzecki: just out of curiosity, what's the time difference between running the chooser trigger check from initramfs or early in the normal system?
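A self-contained sketch of the Go idiom pedronis describes above: the producer owns the concrete event struct, while the consumer declares only the small interface it needs, which Go satisfies implicitly. The names (KeyEvent, WaitForTrigger, waitTriggerKey) are simplified stand-ins for the triggerwatch/evdev code, not the real implementation:
```
package main

import "fmt"

// --- producer side (think evdev.go): owns the concrete types ---

// KeyEvent is a concrete type passed by value; it can live with the
// producer without forcing any extra imports on the consumer.
type KeyEvent struct {
	Code int
	Err  error
}

type evdevDevice struct{}

// WaitForTrigger reports a key press on the supplied channel.
func (d *evdevDevice) WaitForTrigger(ch chan KeyEvent) {
	ch <- KeyEvent{Code: 59} // pretend the trigger key was pressed
}

// --- consumer side (think triggerwatch.go): declares only what it needs,
// mostly so tests can substitute a fake device ---

type triggerDevice interface {
	WaitForTrigger(chan KeyEvent)
}

func waitTriggerKey(dev triggerDevice) error {
	ch := make(chan KeyEvent, 1)
	go dev.WaitForTrigger(ch)
	ev := <-ch
	if ev.Err != nil {
		return ev.Err
	}
	fmt.Println("trigger key observed, code:", ev.Code)
	return nil
}

func main() {
	if err := waitTriggerKey(&evdevDevice{}); err != nil {
		fmt.Println("error:", err)
	}
}
```
Sharing the concrete KeyEvent type between the two files is not a layering problem here; the interface lives on the consuming side and is satisfied implicitly, which is the point pedronis makes.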
[11:56] cmatsuoka: we can start the detection early, in my VMs it's like <1s earlier than early boot, but i guess it may be different on the actual devices [11:56] cmatsuoka: and there's also a question, whether we'd be able to execute the gadget-provided trigger hook in initramfs, which i suspect we won't [11:57] cmatsuoka: as i said during the standup, it's probably only to the benefit of a device that supports a trivial input method like keyboard or somesuch [11:57] mborzecki: I had this feeling that initramfs should execute really fast [11:58] mborzecki: but yeah I don't know how slow real devices could be [12:07] zyga: hey, fyi, https://github.com/snapcore/snapd/pull/7980 is -><- close [12:07] PR #7980: packaging,snap-confine: stop being setgid root [12:15] jdstrand: cool [12:16] jdstrand: I'll check soon [12:16] jdstrand: going through coverity, last one to fix [12:20] * pstolowski lunch [12:20] moving back home [12:20] bbiab [12:25] mvo: the enable failover tests failed for some reason [12:26] jdstrand: https://github.com/snapcore/snapd/pull/8165 [12:27] PR #8165: cmd/snap-confine: fix everything flagged by coverity [12:27] PR snapd#8165 opened: cmd/snap-confine: fix everything flagged by coverity [12:30] jdstrand: replied to setgid, I'll jump there immediately now as it's so low hanging and it could be done shortly [12:30] jdstrand: the bigger branch is still in progress, I'm working on improving the testing setup and resolving the issue you highlighted in remote access to the 18.04 system [12:30] jdstrand: it is real but for reasons I need to dig deeper to understand [12:31] mvo, jdstrand: FYI - I will make the coverity scan automatic, this is just the first step [12:32] jdstrand: btw, I think we got remarkably few things flagged by coeveity [12:32] Coverity [12:33] I was bracing for a huge list [12:33] It is a really cool tool [12:37] mvo: it seems core 20 needs more love for failover [12:37] mvo: /bin/bash: line 89: /usr/bin/snap: No such file or directory [12:39] * zyga small coffee & pączek break [12:41] Pączki look delicious! [12:44] cmatsuoka: it's fat Thursday here [12:44] cmatsuoka: so pączki are everywhere [12:52] mvo: is it known that the snapd snap ships the code in /snap ? [12:56] pedronis: you mean the go source code? yes, but it's to be solved with ijohnson's patch to use base: core in snapcraft.yaml [12:59] we back up like real men [12:59] by shipping all code to each user in production [13:00] Dear $NAME; This is not a scam. Can you send us your core snap back please? We lost all disks in a thunderstorm. kthxbye [13:05] re [13:07] I just jumped through initrd break=premount [13:07] fun [13:15] I think I will head back [13:15] better to have the standup in a private setting [13:15] I'll finish my coffee and walk home [13:19] Speaking of my branch it should be ready to land when the Snapcraft in candidate goes to stable [13:19] cmatsuoka, zyga re core20 failover> it is strange, I ran this locally and it worked, oh well. needs investigation [13:21] mvo: your fix for master failed on the preseed test during image download. Perhaps it would make sense to land cachio's 8157 first (and flag it with 'skip spread') [13:29] zyga: Can you explain what syntax/grammar the mount command here has? And where is the command received then? [13:29]
```
emit(" mount options=(bind, rw) %s -> %s,\n", bindFile, path)
emit(" mount options=(rprivate) -> %s,\n", path)
emit(" umount %s,\n", path)
```
[13:29] I mean: who receives the message? [13:31] zyga: wait.
it's in the comments. Thank you [13:34] pstolowski: hm, I can cherry pick his fix into my PR [13:36] mvo: that works too [13:44] re [13:44] sdhd-sascha: that's apparmor mount specification [13:45] sdhd-sascha: I'm glad you asked about grammar [13:45] sdhd-sascha: that's about the only thing that is very thoroughly documented :) [13:45] sdhd-sascha: go to http://manpages.ubuntu.com/manpages/bionic/man5/apparmor.d.5.html [13:46] sdhd-sascha: and check the MOUNT RULE = part [13:55] mvo: Thursdays SUs are in a different room now? [13:56] zyga: :-) thank you, again. [13:56] i'm looking for a way to mount when a classic snap service restarts. Think the install hook is not enough. [13:56] sdhd-sascha: can you summarize as to what you want to detect and what you want to do in response please [13:57] cmatsuoka: not really, sorry, silly calendar [13:59] hmm [13:59] should I join what meet proposes [13:59] I'm in the standp [13:59] nobody there [13:59] zyga: use the regular one I guess [14:00] can you spare the link [14:00] zyga: the github-runner, needs write access in the same directory. so i create /var/lib/runner/$SNAP_NAME/_work directory on install-hook. Then i want to be sure, that this is mounted to $SNAP/usr/lib/runner/_work [14:00] I just go to meet.google.com [14:00] mvo: can you share the correct link [14:01] zyga: it's classic, because i don't know what the github-action-runner-scripts can do [14:02] zyga: on classic, the layout in snapcraft.yaml is not executed... [14:02] sdhd-sascha: layouts don't do anything during classic confinement [14:03] * sdhd-sascha nod [14:04] Would be nice if layouts in classic mode would also work. [14:04] sdhd-sascha: what would you expect them to do? [14:04] sdhd-sascha: change the host? [14:05] sdhd-sascha: what if two snaps want to have $SNAP/foo in /usr/lib? [14:05] sdhd-sascha: who wins? [14:05] maybe, limit the mounts to `/var/lib/runner/$SNAP_NAME/$SNAP_REVISION/_work` like above [14:06] sdhd-sascha: what is /var/lib/runner? [14:06] i mean, without runner [14:06] PR snapd#8166 opened: cmd/snap-bootstrap: create a new parser instance [14:06] sdhd-sascha: I think this is weird, a classic snap can just mount stuff there [14:06] sdhd-sascha: perhaps I'm missing something but I think layouts are not meant for this [14:07] ok [14:07] thank you [14:07] sdhd-sascha: layouts take something from the snap and put it somewhere in the view of the snap [14:07] sdhd-sascha: but none of that changes the host for real [14:24] cmatsuoka: btw. could we just mkfs rather than delete/remove partitions? [14:27] mborzecki: right, I think we could just redeploy them instead of messing with the partition table [14:27] cmatsuoka: just look at the attributes to know which ones we can wipe [14:29] mborzecki: let's try that [14:36] mvo, hi, sorry to ping again wrt #1817276, is there anything I can capture when this happens to help debugging? I see it quite frequently locally. it might be made worse by the fact that my laptop is a bit old [14:36] Bug #1817276: snapfuse use a lot of CPU inside containers [14:39] ackk: I need to appologize for this one, we still have not invstigated how this can happen [14:40] mborzecki: I blocked the PR while I try those changes, but I'll work on the TPM cmdline measuring first [14:40] cmatsuoka: cool, added a note there too [14:41] mvo, np. I can't change the status of that bug. does it make sense to reopen or should I file a new one ? 
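For reference, the mount rules generated by the emit() calls quoted earlier expand to something like the following in the generated AppArmor profile; the paths here are hypothetical, and the grammar is documented in the MOUNT RULE section of apparmor.d(5):
```
# bind-mount the prepared source over the target, read-write
mount options=(bind, rw) /snap/example/1/opt/tool -> /opt/tool,
# stop mount events from propagating out of the target
mount options=(rprivate) -> /opt/tool,
# and allow the target to be unmounted again
umount /opt/tool,
```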
[14:42] mborzecki: ping me when the chooser PR is ready for re-review [14:42] ackk: let me reopen it [14:42] mvo, thanks [14:43] ackk: reopened and updated the description. let me try to investigate again [14:48] ijohnson: do you think you could have a look at 8161 ? it's green and seems to fix the snapd failover, if it's not wrong I'm in favor of merging and then we can merge your improvements next? [14:48] ijohnson: it should unblock all the other PRs we have open [14:50] mvo: sure [14:50] mvo, thanks [14:53] mvo: approved [14:54] pedronis: thanks for letting me know about the /snaps subdir in the snapd snap - I (strongly) suspect that snapcraft is acting silly here, will investigate [14:55] mvo: it's fixed in modern snapcraft [14:55] ijohnson: aha! so we just need to land your PR :) ? [14:55] mvo: my PR open right now fixes that, but we just need to wait until snapcraft 3.10 on candidate channel is promoted to stable [14:55] yes [14:55] ijohnson: I guess I could change our build recipe to use snapcraft from candidate? [14:56] ijohnson: let me try this [14:56] mvo: yes that would let us land sooner, not sure if that's okay to build snapd that gets released with candidate? depends on how much you trust sergiusens I guess :-) [14:56] mvo: the spread test we have is already using candidate [14:56] (but that doesn't release anywhere, just makes sure it builds) [14:57] ijohnson: I trust sergiusens a lot - plus the extra testing snapcraft would get this way is probably good [14:57] sure sounds good then :-) [14:57] PR snapd#8161 closed: tests: set StartLimitInterval in snapd failover test <⚠ Critical> [14:57] ijohnson: I am mostly waiting on feedback from field, but I probably won't be releasing this week [14:58] sergiusens: I switched our default snapd builds to use the snapcraft from candidate now, I think this makes sense anyway for testing your stuff earlier [14:59] ijohnson: so just to double check 7904 will work with snapcraft 3.10 in candidate? so I can remove the blocked label? [14:59] mvo: yes [15:00] ijohnson: awsome [15:02] thanks mvo, that will help! [15:06] zyga: https://github.com/snapcore/snapd/pull/8165/files#r382057305 [15:06] PR #8165: cmd/snap-confine: fix everything flagged by coverity [15:07] zyga: found an example of use in qemu: https://github.com/qemu/qemu/blob/master/scripts/coverity-model.c [15:08] mborzecki: looking [15:21] cmatsuoka: can you take a quick look at https://github.com/snapcore/snapd/pull/8166 ? [15:21] PR #8166: cmd/snap-bootstrap: create a new parser instance [15:22] ackk: in the bugreport you write this happens inside a bionic container, this is puzzling. however it seems like I can explain why it's happening inside a disco container. but it's a bionic container? [15:22] ackk: anyway, I will dig a bit deeper, if you have access to the container the output of "apt list squashfsfuse" would be nice [15:23] ackk: to the problematic container [15:27] mvo, well yeah I've seen it in bionic, but also on focal while testing maas upgrade from deb to snap [15:28] ackk: ok, I keep diging [15:28] mvo, in this scenario I launch a bionic container, install maas from deb, do-release-upgrade to focal which updates to the transitional deb (which installs the snap). once services in the snap start, snapfuse eats all cpu (along with python for the services) [15:43] ackk: in a meeting right now, will get back to you (and try this out) [15:52] Is it possible to backup a snap installation and restore them on a new installation as blob ? 
[15:53] It could be faster on github-actions if the installation of "snap", "lxd", "snapcraft" could be shortend with a tar-package [15:54] sdhd-sascha: ish, it's not trivial [15:54] * zyga is busy and won't respond now, sorry [15:56] mborzecki: thanks, that's very useful! [15:58] * mborzecki figures it's better to rebuild initramfs before repacking the kernel snap /o\ [16:03] ackk: just one more question - what is your "host" os that you run lxd on ? [16:03] ackk: just to make sure I get a realistic reproducer [16:03] mvo, currently eoan [16:03] mvo, lxd backend is btrfs (if that matters) [16:04] (lxd 3.20) [16:05] ackk: thank you [16:05] mvo, np, let me know if you need any more info [16:08] * cachio lunch [16:09] ijohnson: 7904 is green :) [16:09] can someone do a second review on 7904 please? [16:10] y [16:10] mvo: pstolowski reviewed and approved it a while ago [16:10] but sure more reviews is never a bad idea [16:10] :-) [16:11] mvo: what does adopt-info snapd do? [16:11] I remember the concept [16:11] PR snapd#8163 closed: tests: enable snapd-failover on uc20 [16:11] but what's the data source for "snapd"? [16:12] zyga: it means to get info like version from the snapd part [16:12] aha, I see [16:12] but where is the part defining anything that is adopted? [16:12] is that set-version snapcraftctl? [16:15] zyga: the data source for snapd when not specified defaults to "." [16:16] err for any snapcraft part rather [16:18] zyga: `adopt-info: snapd` means get metadata for the snap from the part named "snapd", which is a bit counter-intuitive since the whole snap is named snapd and we also have the magic `type: snapd` so it's not clear if it's a special value or just a part [16:18] yeah [16:18] zyga: if you like I could rename the part from snapd to `snapd-deb` as that's really what the part is building [16:18] a comment would be nice [16:18] followup material [16:18] cool, yes I would prefer a followup if that's okay [16:19] * ijohnson is not feeling lucky [16:19] * ijohnson about tests today [16:19] hahaha [16:19] PR snapd#8167 opened: o/standby: add SNAPD_STANDBY_WAIT to control standby in development [16:22] PR snapd#7904 closed: snapcraft.yaml: use build-base and adopt-info, rm builddeb plugin [16:22] ackk: one more question - in the bugrpeort you say that you see squashfuse generating a lot of cpu, it's squashfuse (and not snapfuse), right? [16:23] mvo, let me confirm [16:26] mvo, it seems it's actually snapfuse [16:26] ackk: that's interessting [16:27] ackk: the bugreport says "after installing core and rebooting". this is installing core in the container, I suppose but then rebooting the host? [16:27] mvo, no, rebooting the container, but that might be a red herring, I tried reproducing it that way and didn't manage to. so it might have been something else [16:27] mvo, what I did right now, though is just to install the maas snap, and initialize it [16:27] not a clean container, just my dev onne [16:28] pedronis: mvo: xnox mentioned to me a while back that we should consider upgrading the build-base of the snapd snap to build a newer libc/other tools, thoughts on this? I don't think we need to do this now, but I could add a TODO:UC20: to the snapcraft.yaml for the snapd snap so we don't lose this suggestion [16:28] mvo, I take the more I/O the application does, the more snapfuse has to work as well? [16:28] ackk: yes [16:29] ijohnson: if you do, you do need to check minumum kernel requirements from newer libc, and how it matches the platforms you support. 
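A generic illustration of the adopt-info mechanism discussed above (a fragment, not the real snapd snapcraft.yaml): the part named by adopt-info is the one allowed to provide snap metadata, typically by calling snapcraftctl set-version from an override script.
```
name: example
adopt-info: example-deb   # pull version and other metadata from this part

parts:
  example-deb:
    plugin: nil
    source: .
    override-pull: |
      snapcraftctl pull
      # this call is where the adopted metadata actually comes from
      snapcraftctl set-version "$(git describe --tags --always)"
```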
[16:29] ackk: I mean, snapfuse taking quite a bit of cpu is normal but we fixed using the wrong snapfuse binary a while ago and it puzzling that apparently this is not fully working for you [16:29] also xnox in other news, you should now be able to cleanly build snapd snap from git master with candidate snapcraft :-) [16:29] xnox: hmm good point [16:30] but do note that trusty is no longer a support target for snapd [16:30] mvo, I haven't seen issues with other snaps in containers, but maas does quite a lot of IO even just when starting up / setting up, so maybe the issue shows there more prominently [16:30] mvo, plus, my laptop is old and non-nvme [16:31] mvo, btw, the original issue reported by BjornT was on zfs, mine is on btrfs, so I was wondering if COW filesystems might play a role [16:31] PR snapcraft#2939 closed: pluginhandler: user directories scoped to partdir for snapcraftctl [16:32] mvo, I have a focal machine, I can try spawning a container there and see if it it's different [16:32] ackk: interessting, could be. I'm trying in a clean 19.10 VM, all freshly created in qemu and see snapfuse hoover around 40% cpu which seems not that unexpected [16:33] mvo, is that with maas? [16:33] ackk: yes, with maas init [16:33] ackk: I now see two snapfuse, one 65 the other 20% [16:33] zyga: see ^ or v depending on how quick mup is [16:33] ackk: I think you mentioned multiple ones with 100% ? [16:33] ackk: but this is ext4, I need to try zfs/btrfs :/ [16:33] PR snapd#8168 opened: snapcraft.yaml: add comments, rename snapd part to snapd-deb [16:34] there it is ^ [16:34] mvo, no just one, plus multiple python processes also using quite a lot of cpu, but that's expected during init [16:34] ackk: yeah, I see python too but was assuming that is maas :) [16:34] ackk: just one> but 100% cpu? [16:34] PR snapcraft#2947 closed: remote-build: pass through 'source-subdir' property [16:34] mvo, correct [16:35] ackk: ok, I think I need to re-do this with a different fs, ext4 seems to not show it (or I'm doing something wrong) [16:35] cachio: can you merge master into 8157? [16:37] ackk: it runs for a long time, that is normal? [16:37] mvo, yeah, although snapfuse stealing cpu makes it worse [16:37] ackk: yeah :( [16:39] mvo, fwiw in my container here maas init has finished but snapfuse is still running at 100%cpu [16:39] ackk: will redo this now with a different fs [16:39] ackk: uh, that's interessting [16:39] mvo, that's what always happens [16:39] ackk: can you check where snapfuse comes from? 
it should be /usr/bin/snapfuse from inside the container [16:40] mvo, /usr/bin/snapfuse [16:40] (from /proc/pid/exec) [16:40] ackk: yeah, that should be fine (well, "should") [16:40] err, exe [16:40] PR snapcraft#2943 closed: spread: capture developer debug information [16:40] * mvo add zfs or something to see if it makes a difference [16:41] * ackk stops maas before laptop catches fire [16:42] ackk: let it burn and buy a new one already :) [16:42] mborzecki: 8136 needs a 2nd review [16:43] BjornT, because of free mentining it I'm now kinda waiting for gen8 [16:45] PR snapd#8166 closed: cmd/snap-bootstrap: create a new parser instance [16:46] PR snapcraft#2923 closed: requirements: Update PyYAML requirement to 5.3 [16:47] PR snapd#8155 closed: tests: mv ubuntu-core-snapd{,-failover} to core/ suite <⚠ Critical> [16:51] mvo: #8153 needs more work [16:51] PR #8153: [RFC] "snap run --explain" with different formating [16:52] pedronis: cmatsuoka: i've updated #8156 [16:52] PR #8156: cmd/snap-bootstrap: subcommand to detect UC chooser trigger [16:54] pedronis: as for naming, snap-boostrap core-chooser-trigger and snapd.core-chooser-trigger.service ? [16:55] mvo, I started a bionic container and installed maas there on my other machine which is running focal, snapfuse is not using cpu there [16:55] pstolowski, done [16:55] pstolowski, thanks for the heads up [16:56] mborzecki: s/core/recovery/ [16:56] pedronis: recovery-chooser-trigger? [16:56] ackk: oh? so on focal things are working for you? [16:56] mvo, it seems so, I wonder what happens if I update my laptop [16:57] mborzecki: yes [16:59] cachio: yw [17:02] ackk: but I can reproduce the "keeps hogging cpu" issue in bionic [17:02] mvo, oh, "good" [17:04] cachio: i think only 19.10 and 20.04 images need to be re-downloaded often [17:05] ackk: but I see a ton of run-supervisorctl restart in my ps output, so the snapfuse might be legit traffice because of all this activity in the container? [17:06] ackk: like literally >200 supervisorctl restart [17:06] ackk: is that expected? also it seems like its not actually restarting :( [17:06] ackk: and the number fluctuates [17:06] ackk: like I had 206 some moments ago, now 211, then 214, 215 [17:07] ackk: now 217 [17:07] #8130 and #8136 could use a 2nd review [17:08] mvo, you mean there are that many processes in parallel running? [17:08] PR #8130: overlord, state: don't abort changes if spawn time before StartOfOperationTime (2/2) [17:08] PR #8136: boot: write current_kernels in bootstate20, makebootable [17:09] ackk: correct [17:09] ackk: I added that to the bugreport too [17:10] mvo, can you check /var/snap/maas/common/log/supervisor-run.log ? [17:11] ackk: I see a lot of waiting/stopped/spawned [17:11] ackk: I can try to scp but i'm at >1600 process right now and I think this will soon OOM [17:11] ackk: I am using 2.7/edge maybe edge was not a good idea :) [17:11] mvo, you might wanna snap stop maas [17:11] master is fixed, right? 
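For reference, the check mvo asked ackk to run earlier (which snapfuse binary is actually serving the squashfs) comes down to resolving each process's exe link inside the container; a quick sketch:
```
# inside the affected container
for pid in $(pidof snapfuse); do
    readlink -f "/proc/$pid/exe"   # expected per the discussion: /usr/bin/snapfuse
done
```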
[17:11] o [17:12] zyga: I have seen green things [17:12] zyga: snapd-failover should be good now with mvo's PR [17:12] mvo, oh, yeah although I don't think there's much difference with stable [17:12] superb news [17:12] can't speak to other things [17:12] thank you everyone who pushed towards fixing that bug :) [17:12] ackk: http://paste.ubuntu.com/p/MkgJwgfhnf/ [17:13] zyga: yeah, master should be green [17:13] zyga: thank ijohnson mostly [17:13] Thank you ijohnson :) [17:13] PR snapd#8164 closed: snap: use the actual staging snap-id for snapd [17:13] ackk: should I attach the log to the bugreport too? [17:13] ackk: I wonder if maybe this is causing this heat/cpu issue [17:14] mvo, if you could grab a tarball of the whole log/ dir it would be great [17:14] mvo, I can take a look at that [17:15] pedronis: cool, pushed the naming update to #8156, it'd be great if mvo, cmatsuoka or ijohnson could take a look [17:15] PR #8156: cmd/snap-bootstrap: subcommand to detect UC chooser trigger [17:15] mborzecki: I'll try to take a look today, it was on my queue of things to look at [17:16] mborzecki: will check asap, will finish a debug run here first [17:16] ackk: sure, I will do and attach to the bug(?). it looks huge though [17:16] ijohnson: cmatsuoka: if there's anything super silly feel free to push a patch, i'll pick it up in the morning [17:17] mvo, yeah if you don't want to attach it just put it somewhere and I can download it [17:17] mborzecki: sounds good [17:19] ackk: the gzip version of this dir is ~1.2Gb [17:19] oh boy [17:20] ackk: yeah, I will upload to some gdrive, not sure if LP will like it if I attach to the bugreport [17:22] heh [17:23] PR snapd#8169 opened: [wip] tests/many: don't use StartLimitInterval, StartLimitBurst anymore [17:25] mvo, thanks, I think you're probably now hitting some bug [17:25] mvo, how did you reproduce the issue btw? just installing in a bionic container? [17:25] ackk: correct. it's outlined in the original bug. I did use a clean 19.10 eon VM with 8g ram [17:25] ackk: then installed snapd and lxd [17:26] ackk: then lxd init with btfs as backend [17:26] mvo, weird, on this focal machine I don't even see snapfuse using CPU while maas init is running [17:26] ackk: then created an "lxc launch images:ubuntu/18.04 container" and "lxc exec bash" and installed maas in there and ran "maas init" [17:27] ackk: that's pretty cool, at least it means this problem will go away :) [17:27] mvo: snap-confine bug? [17:27] ackk: but yeah, frustrating [17:27] zyga: do you have more details? [17:27] mvo, yeah I hope so. I'll try upgrading my laptop to focal, the installer was crashing on me before, have to try a more recent daily [17:27] mvo: no, I'm asking what this is about but now I see this is the snapfuse perf bug [17:27] zyga: aha, yes, that's correct [17:28] zyga: trying to reproduce their issue but no luck and apparently hitting a different bug [17:28] mvo: what did you hit? [17:29] mvo, it might be related, because I also usually see supervisord-managed process restarting a lot, but never seen so many running supervsiorctl calls [17:29] zyga: I don't know but it's some sort of (slow) fork bomb [17:29] ackk: interessting! [17:30] mvo, I thought the restarts would be beacuse of timeouts caused by processes being slowed down by snapfuse [17:32] ackk: could be but my VM is pretty fast, it's an ssd and has enough ram and the cpu load is not that high. 
still a possiblity of course [17:37] ackk: shared my dir with you, please let me know if you need anything else before I stop this vm again [17:39] PR snapd#8170 opened: snap-preseed: support for preseeding of snapd and core18 [17:48] mvo, can you try starting the maas snap again? just to confirm if it starts happening again [17:50] mvo, (downloading logs as we speak) [17:50] PR snapd#8167 closed: o/standby: add SNAPD_STANDBY_WAIT to control standby in development [17:52] pstolowski: reviewed 8003 [17:57] ijohnson: thank you! if it's green then i'd like to land it and address your suggestions in a followup; if it fails then i'll push to this PR [17:57] pstolowski: sounds good [18:07] seems travis is clogged up with jobs :-/ === ijohnson is now known as ijohnson|lunch [18:09] we opened too many new PRs especially considering that master was red [18:10] yeah but this actually happens pretty regularly in the afternoon my TZ, not sure why, but my thinking is that everyone in EU submits things before they log off === ijohnson|lunch is now known as ijohnson [18:44] * ijohnson -> dentist [18:50] check yo teef [18:56] jdstrand: it looks like firefox isn't auto-connecting the pulseaudio plug anymore https://twitter.com/thecalmsprings/status/1230542924489875457 [18:56] love getting bug reports via twitter :) [18:58] kenvandine: that's weird. I grandfathered that [18:58] * jdstrand fixes and investigates [18:58] jdstrand: thanks [18:59] ackk: sure, let me retry this [19:00] ackk: seems to be happening again, I see proxy, ntp, syslog [19:00] ackk: then proxy again, it's now at 15 [19:04] ackk: now at 46 [19:08] PR snapd#8149 closed: snapmgr, backends: maybe restart & security backend options [19:50] * ijohnson -> back [20:24] cachio: do you have a reliable way to reproduce that issue with snapd-failover you had the other morning, i.e. where in the logs snap-failure couldn't start snapd.socket because the socket file already exists? I have a fix for that I'd like to confirm if it works but I have never been able to reliably reproduce that error condition [20:25] PR snapd#8165 closed: cmd/snap-confine: fix everything flagged by coverity [20:38] this PR title is a lie, there are three left that weren't "new" [20:38] oh well [20:44] ijohnson, yes I have [20:44] but I am leaving in 2 mins [20:44] I ll be back in 2 hours [20:45] or just telegram me [20:45] cachio: no problem, it can wait til tomorrow [20:45] I will mention you on the PR I open I think, so if you could comment there tomorrow that would be great [20:46] ijohnson, sure [20:46] when I am back I'll comment [20:47] sorry but I was working with the new arch image [20:48] * cachio afk [20:48] cachio: no problem, ttyl [21:01] debug tpm unlocking inside the initrd is painful [21:07] sounds quite painful [21:07] Hope this works now: Hey snap people. A question regarding an issue (https://github.com/keepassxreboot/keepassxc-browser/issues/439) with the KeepassXC browser plugin (Firefox, Chromium) talking to KeepassXC through NativeMessaging. Afaik this won't work with snap-packaged apps, because of sandboxing / security. [21:07] Ah. That looks better... [21:08] PR snapd#8171 opened: cmd/snap-failure/snapd: also rm snapd.socket if it still exists [21:09] What would be the "correct" solution to connect e.g. a browser plugin to an application ouside of the sandbox? [21:10] raer: it depends, I suggest starting a post on forum.snapcraft.io where more folks will be available to look at your issue [21:11] I'm reluctant to open that box... 
[21:11] raer: AFAIK it's a known problem and oSoMoN has a bug filed somewhere about this issue with the chromium snap specifically [21:12] https://bugs.launchpad.net/ubuntu/+source/chromium-browser/+bug/1741074 ? [21:12] (and notable oSoMoN is offline right now) [21:12] Bug #1741074: [snap] chrome-gnome-shell extension fails to detect native host connector [21:12] dat :) [21:12] it seems so [21:13] I'm not familiar enough with the chromium snap to answer myself without more details about how plugins work with chromium and you're sure to find the right folks on the forum so that would be my recommendation [21:13] you could also try asking in EU tz morning time [21:13] It is broken for 2 years now [21:14] A workaround is to install from repo, which is kind of ridiculous. [21:15] Might ask oSoMoN in the morning. Not keen on signing up for yet another forum... [21:16] if you have a LP account I believe that the ubuntu forums at discourse.ubuntu.com are tied to your LP account so you don't have to create another account [21:16] and oSoMoN is around there as well [21:17] ok. thanks. [21:20] ijohnson: did you ever get multipass working on the raspberry pi 4? [21:21] NickZ: no I haven't tried for some time, but I think the required kvm options were added to the kernel now [21:21] NickZ: IIRC there was some other problem with multipass not qemu/kvm related [21:21] yeah, I was lookinga t your issue [21:21] that issue seems resolved now, but im having another one [21:22] I was just wondering if anyone else had gotten this working [21:22] what's the new issue ? [21:23] NickZ: do you know if Pi3 kernel also got those? [21:24] can't say, haven't tried, although I heard that hardware virtualization is only supported on arm64 hosts [21:24] ijohnson: https://github.com/canonical/multipass/issues/1376 [21:26] NickZ: thanks, we'll have a look - it's weird that -machine would be required, maybe the default can be set somewhere [21:26] hmm yeah I can't say I know anything about that but I bet Saviq can be of more help :-) [21:27] yeah, it seems unique for raspi hosts, dunno why [21:39] PR snapd#8157 closed: tests: using google storage when downloading ubuntu cloud images from gce [22:06] PR snapd#8172 opened: snapcraft.yaml: add python3-apt, tzdata as build-deps for the snapd snap [23:19] ijohnson: found the problem [23:27] cmatsuoka: what was the problem? [23:41] ijohnson, hey [23:42] ijohnson, I'll run the test in a loop [23:43] the fix is in the code? or in the tests? [23:44] cachio the fix in the PR I mentioned was in the code for snap-failure [23:44] At least what I think is a fix [23:45] I don't entirely understand the root cause but this should at least let snap-failure get past the issue and still recover [23:45] ijohnson, mmmm, in that case it is not so simple to test [23:45] because I need to create an image with the fix [23:46] is this PR still open? [23:46] Yes one sec let me get you the link [23:46] cachio: https://github.com/snapcore/snapd/pull/8171 [23:46] PR #8171: cmd/snap-failure/snapd: also rm snapd.socket if it still exists [23:48] ijohnson, I'll run it in google [23:48] ijohnson, the other way I can reproduce it is difficul because I need an image [23:48] Great, the tests all passed for that PR in Travis which is a good sign [23:48] cachio it can certainly wait til tomorrow to [23:49] *too [23:49] ijohnson, yes, I'll test it tonight and add a note in the PR [23:51] Awesome thanks [23:53] yaw
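A rough sketch of the recovery sequence discussed around PR #8171, written as shell for clarity; the real change lives in the snap-failure Go code, and the exact units, ordering, and socket paths below are assumptions:
```
# assumed steps snap-failure needs to guarantee before restarting snapd
systemctl stop snapd.socket snapd.service || true
systemctl reset-failed snapd.socket snapd.service || true
# a stale socket file left behind by the crashed snapd can prevent
# snapd.socket from binding again, so remove it first (paths assumed)
rm -f /run/snapd.socket /run/snapd-snap.socket
systemctl start snapd.socket
```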