[05:22] morning
[05:30] PR snapd#7193 opened: [WIP] cgroupsv2 spread run
[06:08] PR snapd#7194 opened: packaging: use %systemd_user_* macros to enable session agent socket according to presets
[06:19] quick errand, back in 30
[06:38] re
[06:38] hmm fmt.Printf("aligninfo: %+v\n", aligninfo)
[06:38] ...open /home/maciek/work/canonical/workspace/src/github.com/snapcore/snapd/cmd/snap/cmd_services_test.go: too many open files
[06:56] hmm some http.Response structs with no calls to rsp.Body.Close()
[06:59] PR snapd#7184 closed: daemon: do not modify test data in user suite
[07:02] and there's different --help output when running unit tests via go test or via a prebuilt test binary :/
=== pstolowski|afk is now known as pstolowski
[07:15] morning
[07:16] pstolowski: hey
[07:17] pstolowski: we're leaking fds in unit tests
[07:18] pstolowski: and a lot
[07:18] pstolowski: ./snap.test -check.f TestSession -test.count=99999 is leaking ~50 fds/s on my machine
[07:18] pstolowski: as in go test github.com/snapcore/snapd/cmd/snap
[07:26] mborzecki: oh /o\, nice discovery.. TestSession appears to be a new thing?
[07:27] pstolowski: fwiw, we were leaking fds before that too, it's just that now there appear to be more
[07:28] Hi all, I'm new to Snap and have some questions; I'd be very happy if you could help me answer them.
[07:30] I've also posted my question in the Snapcraft community: https://forum.snapcraft.io/t/azul-jre-bundled-with-my-snap/12524
[07:35] Basically, as I understand it, Snap works similarly to AppImage: it should come with all the dependencies bundled in the snap.
[07:37] So, when I create a snap package for a Java app I want the JRE bundled in the snap, and to use java from there and not the one installed (if installed) in the system.
[07:37] Am I right?
[07:42] Talzzz: yes. this is relevant to your question: https://snapcraft.io/docs/java-applications
[07:43] Talzzz: i'm not familiar with building java-based snaps at all, but it seems to me that the gradle plugin does the magic
[07:44] pstolowski: ok, so http.Server.Shutdown() when passed a context.WithTimeout() does not work the way we think it works
[07:44] mborzecki: ah
[07:45] pstolowski: in SessionAgent.Stop(), when passing a nil or background context, there's no fd leak
[07:47] mborzecki: so, we're leaking fds for real, not just in tests?
[07:53] pstolowski: in the real world, shutdown is in the exit path, so the process will be wrapped up anyway
[07:53] mborzecki: good
[08:02] quick errand, back in 20
[08:23] PR snapd#7195 opened: cmd/snap, usersession/agent: close http.Response body in tests
[08:23] pstolowski: ^^
[08:24] mborzecki: ty! +1
[08:26] haha, and need to fix it :P
[08:28] pstolowski: force pushed
[08:34] mborzecki: hmm. i didn't spot the difference, perhaps i was looking at the new code already?
[08:35] pstolowski: https://github.com/snapcore/snapd/compare/88aba0a637d35b96a419f1f1f1018d411b8d9b9e..e6e95530470b25a7dc6fa3a336f541aa8901f0f8
[08:35] mborzecki: ah, this, very subtle ;)
[08:47] pstolowski: and the unit test job stalled
[08:47] more fixes clearly needed
[08:50] mborzecki: uhm... this is annoying. i'll look at reproducing this too
[08:50] pstolowski: try passing -timeout 5m to go test
[08:59] pstolowski: wondering whether we should even use a graceful shutdown in the user session
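For reference, the http.Server.Shutdown() behaviour being circled above: Shutdown closes the listeners and then waits for active connections to go idle; if the context expires first, it returns ctx.Err() but leaves those connections (and their fds) open. A minimal sketch of a stop path that falls back to a hard close; the names here are illustrative, not snapd's actual SessionAgent code:

    package main

    import (
    	"context"
    	"net/http"
    	"time"
    )

    // stopServer tries a graceful shutdown first. Shutdown closes the
    // listeners and waits for active connections to become idle; if ctx
    // expires first it returns ctx.Err() without closing those
    // connections, so their fds linger unless we hard-close them.
    func stopServer(srv *http.Server) error {
    	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    	defer cancel()
    	if err := srv.Shutdown(ctx); err != nil {
    		return srv.Close() // drop whatever is still open
    	}
    	return nil
    }

    func main() {
    	srv := &http.Server{Addr: "127.0.0.1:0"}
    	go srv.ListenAndServe() // error ignored for brevity
    	time.Sleep(100 * time.Millisecond)
    	_ = stopServer(srv)
    }

The test-side half of the leak is more mundane: every http.Response body must be closed (defer rsp.Body.Close()) or the underlying connection's fd stays alive, which is what PR 7195 addresses.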
[09:04] mborzecki: is -timeout for one test, or for all of them?
[09:04] 'if a test binary runs for more than '
[09:05] it isn't clear if 'go test ./...' counts as one binary or multiple
[09:05] :)
[09:05] i can test
[09:05] Chipaca: yeah, i think it's per package
[09:05] Chipaca: go test builds a *.test binary for each package
[09:05] yeah, if your change works, i guess it is :)
[09:05] also we can drop the silly 'go list | grep -v vendor' i think?
[09:05] Chipaca: well, it did get stuck on travis :/ so
[09:05] * Chipaca checks with 1.9
[09:06] actually that's >= 1.10 so it should just work with ./... directly
[09:06] hmm
[09:07] mborzecki: interestingly, why didn't you set -timeout for the 'short' run?
[09:07] surely that'd be the interesting one, to catch things fast
[09:09] Chipaca: missed that, force pushed now
[09:09] omg, and we actually run --unit-short on travis /o/
[09:09] :)
[09:11] yeah the 'go list | grep' trick is no longer needed with 1.9 either
[09:11] just tested :)
[09:14] but that's for another time probably
[09:30] https://paste.ubuntu.com/p/T5w4r43FSt/ locally with -timeout
[09:32] mborzecki: do you have the arch with 5.2 to hand?
[09:32] mborzecki: need to know the version of util-linux
[09:33] Chipaca: 2.34
[09:33] mborzecki: and systemd?
[09:33] Chipaca: 242
[09:33] ok
[09:33] thanks
[09:38] things move forward on the 'zany loop zaniness' front
[09:49] heh, so the travis unit tests job is passing each time i restart it now
[09:49] sounds terrible
[09:57] PR snapd#7196 opened: packaging: use passthrough for type:snapd
[09:59] Chipaca: something fishy, osutil.RunWithContext() spawns a goroutine to call Process.Kill() when the context times out; just got a failure where there were several terminal pages of runnable goroutines in that code
[10:00] mborzecki: is this with an insane -count ?
[10:00] maybe it's leaking goroutines?
[10:00] anyway RunWithContext should go
[10:00] (we've got contexts now)
[10:01] Chipaca: running it this way: waitDone := make(chan struct{})
[10:01] duh, wrong paste
[10:01] while true; do echo "---------------------- $(date)"; GOMAXPROCS=1 GOTRACEBACK=all go test -count=1 -timeout 5m $(go list ./... |grep -v vendor |grep -v snap-seccomp) || break ; done
[10:01] at the top of the tree
[10:01] Chipaca: just running go test -count=insane inside osutil doesn't trigger it
[10:01] mborzecki: phew
[10:02] mborzecki: IIRC the goroutine sticks around until timeout
[10:02] mborzecki: which should be fine normally
[10:02] Chipaca: hm, but waitDone is closed earlier than that
[10:02] but, anyway, seriously, stdlib now has exec.CommandContext which replaces this
[10:02] anyway let me see
[10:02] Chipaca: is that in 1.9 too?
[10:03] mborzecki: 1.7+
[10:03] mborzecki: look at the right of https://golang.org/pkg/os/exec/#CommandContext
[10:04] mborzecki: with enough contention (e.g. in tests?) i could see some piling up here while it sorts things out
[10:04] mborzecki: ctx.Done() has a synchronizing mechanism which will be slow enough to appear
[10:04] mborzecki: unless you're telling me it died _because_ of the number of goroutines i wouldn't be alarmed
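The replacement Chipaca points to: since Go 1.7 the standard library's exec.CommandContext does what a hand-rolled run-with-context helper has to do by hand, killing the process when the context is done. A minimal sketch; the sleep command is only for illustration:

    package main

    import (
    	"context"
    	"fmt"
    	"os/exec"
    	"time"
    )

    func main() {
    	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
    	defer cancel()
    	// If ctx expires before the command exits, the runtime kills the
    	// process and CombinedOutput returns an error; no extra watcher
    	// goroutine is needed on the caller's side.
    	out, err := exec.CommandContext(ctx, "sleep", "10").CombinedOutput()
    	fmt.Printf("out=%q err=%v\n", out, err)
    }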
[10:06] Chipaca: heh, couldn't scroll back that far
[10:06] mborzecki: the unit tests for osutil alone fire off 76 of those fwiw
[10:07] no, 41, sorry
[10:07] * Chipaca is very good at counting
[10:07] Chipaca: omg https://travis-ci.org/snapcore/snapd/jobs/565388637 no backtrace in the log /o\
[10:07] and now i count 74
[10:08] mborzecki: ?
[10:08] mborzecki: line 1213?
[10:08] Chipaca: w8, there's one, maybe i opened the log page too early
[10:08] * Chipaca w8s, m8
[10:08] 2l8
[10:09] * Chipaca hugs diddledan
[10:09] \o/
[10:09] goroutine 2440 [IO wait, 4 minutes]: heh, stuck in a godbus read?
[10:09] godbus!
[10:09] public transport for heaven?
[10:10] go go dbus :)
[10:12] via "go dbus go" I now have a "doo-doo-doo-doo-doo inspector dbus" earworm. hope you're all happy
[10:13] cjwatson: snapd snap, doo doo doo doo doo doo
[10:13] Chipaca: fwiw, i don't see the goroutine that's supposed to send SIGUSR1 to the process, so it probably happened and it exited, but the signal was not received?
[10:13] cjwatson: snapd snap, doo doo doo doo doo doo
[10:14] cjwatson: snapd snap!
[10:14] hopefully that got rid of the ear bug
[10:15] cjwatson: https://youtu.be/zuq55zgXxHU
[10:15] diddledan: https://www.youtube.com/watch?v=XqZsoesa55w
[10:16] mborzecki: maybe it went to the wrong process?
[10:16] raising the bar: https://www.youtube.com/watch?v=wcibw-m8tRk
[10:17] Chipaca: golang, process signals & goroutines sounds like a bad mix
[10:18] https://lore.kernel.org/linux-block/f7e6717f-be46-1dfe-80ed-f85a18065c85@i-love.sakura.ne.jp/T/#m7b86e0c0f368ad6523c80b02bcd758d387a93a77
[10:18] fwiw
[10:19] bug understood, fix … who knows
[10:19] mborzecki: why would you raise the bar, this was a fight to the bottom
[10:19] hahaha
[10:20] but, ok, let's raise the bar: https://www.youtube.com/watch?v=UMVjToYOjbM
[10:25] this loop device racing is interesting. maybe ioctl(LOOP_CTL_GET_FREE) could lock the device it returns for the PID (and child PIDs?) calling the ioctl for a short period, so that the next ioctl(LOOP_CTL_GET_FREE) gets a different device because the first one is locked
[10:28] continuing raising the bar, https://www.youtube.com/watch?v=6SFNW5F8K9Y
[10:28] or maybe we need a different ioctl entirely that returns the loop number AND does what we're currently requiring a second ioctl for
[10:28] psh, locks, who needs those
[10:29] * diddledan throws away the key
[10:36] * Chipaca proposes https://www.youtube.com/watch?v=11GYvfYjyV0 not as a bar-raising but as a bar-mellowing thing
[10:36] * Chipaca just puts things on shuffle now
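The race diddledan describes at [10:25]/[10:28] in code form: LOOP_CTL_GET_FREE and LOOP_SET_FD are two separate ioctls, so two processes can be handed the same "free" device, and the slower one gets EBUSY. A sketch of the usual userspace retry loop, assuming golang.org/x/sys/unix; illustrative, not snapd's code:

    package main

    import (
    	"fmt"
    	"os"

    	"golang.org/x/sys/unix"
    )

    // attachLoop shows the race: LOOP_CTL_GET_FREE and LOOP_SET_FD are two
    // separate ioctls, so another process can claim the "free" device in
    // between; the loser gets EBUSY and has to ask for another one.
    func attachLoop(backing *os.File) (string, error) {
    	ctl, err := os.OpenFile("/dev/loop-control", os.O_RDWR, 0)
    	if err != nil {
    		return "", err
    	}
    	defer ctl.Close()
    	for {
    		n, err := unix.IoctlRetInt(int(ctl.Fd()), unix.LOOP_CTL_GET_FREE)
    		if err != nil {
    			return "", err
    		}
    		dev := fmt.Sprintf("/dev/loop%d", n)
    		loop, err := os.OpenFile(dev, os.O_RDWR, 0)
    		if err != nil {
    			return "", err
    		}
    		err = unix.IoctlSetInt(int(loop.Fd()), unix.LOOP_SET_FD, int(backing.Fd()))
    		loop.Close() // the kernel holds its own reference once attached
    		if err == unix.EBUSY {
    			continue // lost the race, try the next free device
    		}
    		if err != nil {
    			return "", err
    		}
    		return dev, nil
    	}
    }

    func main() {
    	backing, err := os.Open("disk.img") // hypothetical image file
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println(attachLoop(backing))
    }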
[10:42] pstolowski: Chipaca: can you take another look at https://github.com/snapcore/snapd/pull/7195 i've pushed a couple of fixes
[10:42] PR #7195: many: fix unit tests getting stuck
[10:54] mborzecki: extra +1 for good fortune
[10:57] Chipaca: haha well https://paste.ubuntu.com/p/FnBhJbYYv7/
[11:05] PR snapd#7197 opened: usersession: track connections to session agent for exit on idle and peer credential checks
[11:08] mborzecki, hey
[11:09] nice forum post
[11:09] just a clarification
[11:09] the image is not fedora 30
[11:09] it's fedora rawhide
[11:10] mborzecki, I created that based on our current fedora rawhide image (not using the previous one)
[11:10] cachio: thanks for catching this, can you build one on fedora 30 too? just under a different name
[11:11] mborzecki, yes
[11:11] cachio: great, thanks!
[11:13] running process
[11:13] mborzecki, it should be ready in less than 30 minutes
[11:15] PR snapd#7198 opened: tests: reboot the node when restoring after a test involving lxd
[11:16] zyga: ^^
[11:24] cachio: good job on spread#82
[11:25] PR spread#82: Introducing tags for the test which allows to filter executions
[11:25] Chipaca, thanks!!
[11:25] not a +1 yet, but, yeah, good job
[11:25] Chipaca, :)
[11:25] Chipaca, I did it as simple as possible
[11:25] cachio: I can't stress enough how nice it is to see a new, useful feature be minimal so we can be sure it's useful before going crazy
[11:26] change that first 'useful' to 'seemingly useful' for that sentence to make sense 100%
[11:26] Chipaca, nice!!
[11:27] Chipaca, and it has spread tests too :)
[11:27] yes :-)
[11:27] cachio: as noted, the implementation can be made even simpler at the code level (conceptually it's the same)
[11:31] Chipaca, thanks for the review, checking it...
[11:39] zyga, preset request for the new snapd user socket service: https://bugzilla.redhat.com/show_bug.cgi?id=1734371
[11:39] mborzecki: ^
[11:40] Eighth_Doctor: nice
[11:51] o/
[11:51] Eighth_Doctor: thank you
=== morphis9 is now known as morphis
[12:46] PR snapcraft#2648 opened: remote-build: include project name in build-id
[13:02] ijohnson: hiya
[13:28] pstolowski: what was the issue with snapcraft 3.x and lxd/travis?
[13:36] ijohnson: i forgot the exact error message, but it was something about not being able to install core from /var/tmp/core.snap
[13:37] is there any way for that test to remove the hacked core and just use the normal core snap when running snapcraft?
[13:38] ah I think I found the PR, it's 7092 right?
[13:38] err #7092
[13:38] PR #7092: packaging: use snapd type and snapcraft 3.x (2/4) <⛔ Blocked>
[13:41] ijohnson: 7092, correct
[13:42] ijohnson: the test isn't hacking the core, it's just trying to install snapcraft 3.x and build the snapd snap (using an lxd container, as multipass doesn't work due to missing kvm extensions)
[13:43] right, I understand why you can't use multipass, but I'm wondering why there aren't assertions for that core snap when it runs
[13:43] is it perhaps because a test that ran before this one installed an unasserted core snap?
[13:45] ijohnson: hmm, interesting idea
[13:46] the error here: "error: cannot find signatures with metadata for snap "/var/tmp/core.snap"" is not coming from snapcraft, it's coming from snapd inside the lxd container
[13:47] snapcraft is trying to be smart and re-use the core snap from the host (which in this case is a test VM), but there aren't assertions for that core snap for some reason
[13:47] jdstrand_, I'm using review-tools.snap-review and getting "uncompressed snap is too large for available space (1160M > 1212M)". Is there a work-around?
[13:51] nm, was just disc space and a weird "1160M > 1212M" claim
[13:52] ijohnson: ahhh, i see, sounds plausible
[13:53] ijohnson: i'll see if refreshing to a proper core will help
[13:53] ack
[13:53] ijohnson: thanks for the insight!
[13:59] pstolowski: it looks like snapcraft will install the snap in the LXD machine with --dangerous only if the revision number starts with "x", otherwise it assumes it's able to download the .assert file as well as the .snap file from the snaps/${snap}/file snapd endpoint
[14:03] ijohnson: also, i've just noticed the comment/suggestion from jamesh (thanks!)
[14:04] right, yeah I think there might be a silently ignored error from snapcraft when it tries to copy the assertion file (which doesn't exist in this case) between the host and the container
[14:10] i'm trying destructive-mode first, that would be much faster than lxd for our tests, if it works
[14:20] Hey
[14:20] How are things?
[14:24] * cachio afk
[14:27] Chipaca, mborzecki, pstolowski anything interesting to report?
[14:27] zyga: unit tests hate us
[14:28] zyga: but, nah
[14:28] nothing on particular fire
[14:28] Chipaca: is that random hang understood better now?
[14:35] zyga: we think so
[14:35] zyga: not entirely solved yet though
[14:35] zyga: signals and threads are fun
[14:35] Chipaca: because of more coding required or because of review/landing?
[14:35] Chipaca: indeed :)
[14:36] i'm not quite sure where it's at, mborzecki is on it
[14:36] Chipaca: any thread in the process can receive async signals, unless signalling a specific thread and unless a sync signal is sent
[14:36] mborzecki: are you around?
[14:36] zyga: feel free to take a look at https://github.com/snapcore/snapd/pull/7195
[14:36] thank you
[14:36] PR #7195: many: fix unit tests getting stuck
[14:36] zyga: the 5.2 mount issue is understood, and there's a patch
[14:37] zyga: that's also nice news
[14:37] zyga: both userd and session-agent use signals to know when to stop, and both tests (run one after another, but inside the same process) send signals to self to stop userd/session-agent
[14:37] Chipaca: excellent, that is indeed good news
[14:37] Chipaca: on the kernel side or on the systemd side?
[14:38] zyga: kernel
[14:38] upstream or other?
[14:38] zyga: the 5.3 mount issue (affecting snapped lxd launching things) is https://lkml.org/lkml/2019/7/26/388
[14:38] zyga: upstream
[14:39] zyga: the 5.3 one has a proposed patch from Al Viro, which is probably as good as it gets, at https://lkml.org/lkml/2019/7/26/1441
[14:39] i haven't checked to see if that's in yet.
[14:40] Chipaca: thanks, that's really good news
[14:42] mborzecki: I managed to get lucky and https://github.com/snapcore/snapd/pull/7190 is green
[14:42] PR #7190: tests: set GOTRACEBACK=1 when running tests
[14:42] mborzecki: it switched the timeout to 5 minutes
[14:42] I think you did that in your branches as well
[14:42] mborzecki: so with the signal fix, do we know why unit tests failed in that PR?
[14:43] zyga: no, the test likely sends the signal to stop the agent, but the agent code continues as if there was no signal
[14:44] mborzecki: which signal is that?
[14:44] mborzecki: perhaps the test should fork and run in a separate process
[14:44] mborzecki: (and bridge the gap to the parent for reporting)
[14:44] mborzecki: or perhaps we should entirely mock that away for testing
[14:45] zyga: i changed the userd test to use SIGUSR1 and session-agent to use SIGUSR2, but the same thing happens
[14:46] mborzecki: are those signals used by the go runtime or by the unit testing stack?
[14:46] zyga: go test has no functionality to run a test as a separate process
[14:46] mborzecki: and do you know how exactly go sets up the signal handling logic?
[14:46] zyga: the options are: move session-agent and userd into separate packages (then they're run as separate processes), or mock away the signal bits
[14:48] mborzecki: I'm happy with both, please use your best judgment
[14:48] Chipaca: do you have preferences on how that should be done ^
[14:48] * zyga needs a trivial +1 on https://github.com/snapcore/snapd/pull/7191/files
[14:48] PR #7191: tests: remove installed snap on restore
[14:49] mborzecki: zyga: I'm OK with mocking the signal setup for tests
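Of the two options mborzecki lists, the mocking one usually means giving the agent a test seam around signal registration. A sketch with hypothetical names, not the actual usersession code:

    package main

    import (
    	"fmt"
    	"os"
    	"os/signal"
    )

    // signalNotify is a seam for tests: production code registers for
    // real signal delivery, while a test swaps in a stub returning a
    // channel it controls, so nothing sends process-wide signals.
    var signalNotify = func(sig ...os.Signal) <-chan os.Signal {
    	ch := make(chan os.Signal, len(sig))
    	signal.Notify(ch, sig...)
    	return ch
    }

    // waitForStop blocks until a stop signal (real or injected) arrives.
    func waitForStop() os.Signal {
    	return <-signalNotify(os.Interrupt)
    }

    func main() {
    	// what a test's mock would do: inject a stop without a real signal
    	restore := signalNotify
    	fake := make(chan os.Signal, 1)
    	signalNotify = func(...os.Signal) <-chan os.Signal { return fake }
    	defer func() { signalNotify = restore }()

    	fake <- os.Interrupt
    	fmt.Println("got:", waitForStop())
    }

A test suite would swap the variable in SetUpTest and restore it in TearDownTest, the same save-and-restore pattern used for other mocked globals.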
[14:49] +1 :) [14:49] mborzecki: zyga: as long as we're sure that bit works :) [14:50] in case this is not working today let's please revert the code for now or disable the test [14:50] so that rest of master is not stuck [14:57] zyga: left a note under https://github.com/snapcore/snapd/pull/7198 [14:58] PR #7198: tests: reboot the node when restoring after a test involving lxd [14:58] mborzecki: ah, that's cool [14:58] if that's the case +1 [15:00] I need to hear back from samuele before doing more work on the refactor, so I'm going to step out now, and come back for a couple of hours after dinner [15:01] mborzecki: commented in the lxd branch [15:01] telegram me if you need me to come in before that :-) [15:01] Chipaca: if you ask me a question I can relay [15:01] zyga: i'm in communication with him [15:01] +1 [15:01] zyga: but thank you [15:01] great, thanks [15:02] zyga: hmm, it failed on 16.04 only??? [15:03] I think it failed on other systems as well [15:03] zyga: nope, 2019-07-30 12:02:48 Rebooting on jul301153-640125 (google:ubuntu-18.04-64:tests/main/lxd) as requested... [15:07] ijohnson, jamesh: it worked in destructive mode! [15:07] pstolowski: cool! [15:29] hey Chipaca any chance you could comment on my model PR with what direction I should go? from SU I recall that device-key is wrong and it should be at the bottom and use "device-key: |\nblahblahblah", but I'm not sure what you would like me to do about the others [15:57] degville: I've just amended the docker and docker-support interface docs slightly. Can you check the wording is appropriate, for me, please? I edited the top-most line in each doc (https://forum.snapcraft.io/t/the-docker-support-interface/7810 and https://forum.snapcraft.io/t/the-docker-interface/7787) [15:58] diddledan: will do - thanks for making the updates. [15:58] cheers :-) [16:04] diddledan, your indendation of the plugs is wrong [16:04] (in your last forum post) [16:04] nuh uh [16:05] for lists you can do without the indentation [16:05] oh ? [16:05] i never tried ... heh [16:06] https://www.irccloud.com/pastebin/iIgMWTmF/ [16:08] of course it gets confusing when you do: https://www.irccloud.com/pastebin/4h38icL2/ [16:09] that's why I like to not indent the list so that an object is clearly delineated as being either incorrectly nested or a child of a list-item [16:10] s/object/hash/ [16:10] hash/map [16:10] I good words do [16:11] what happened to the nice syntax highlighting/colouring on the forums?! [16:16] it went away a while ago and nobody has added it back [16:16] PR snapd#7196 closed: packaging: use passthrough for type:snapd [16:17] diddledan: https://forum.snapcraft.io/t/code-block-syntax-highlighting-no-longer-works/11004/3 [16:18] yeah, I don't get why it went away though === pstolowski is now known as pstolowski|afk [17:27] PR snapd#7191 closed: tests: remove installed snap on restore [17:57] PR core-build#49 opened: initramfs: check keypress in run mode [17:59] PR snapcraft#2647 closed: build providers: catch LXD socket error [18:02] PR snapcraft#2649 opened: Release changelog for 3.7.1 [18:20] hi, coming here from suggestion at https://forum.snapcraft.io/t/network-manager-broken-for-desktop-ubuntu/12512/26?u=jcrben - does anyone know how to make a PR to update text in a store description, such as https://snapcraft.io/network-manager? [19:29] PR snapcraft#2649 closed: Release changelog for 3.7.1 [20:30] is it snapped yet? 
[16:11] what happened to the nice syntax highlighting/colouring on the forums?!
[16:16] it went away a while ago and nobody has added it back
[16:16] PR snapd#7196 closed: packaging: use passthrough for type:snapd
[16:17] diddledan: https://forum.snapcraft.io/t/code-block-syntax-highlighting-no-longer-works/11004/3
[16:18] yeah, I don't get why it went away though
=== pstolowski is now known as pstolowski|afk
[17:27] PR snapd#7191 closed: tests: remove installed snap on restore
[17:57] PR core-build#49 opened: initramfs: check keypress in run mode
[17:59] PR snapcraft#2647 closed: build providers: catch LXD socket error
[18:02] PR snapcraft#2649 opened: Release changelog for 3.7.1
[18:20] hi, coming here from a suggestion at https://forum.snapcraft.io/t/network-manager-broken-for-desktop-ubuntu/12512/26?u=jcrben - does anyone know how to make a PR to update text in a store description, such as https://snapcraft.io/network-manager?
[19:29] PR snapcraft#2649 closed: Release changelog for 3.7.1
[20:30] is it snapped yet? https://www.collabora.com/news-and-blog/news-and-events/moving-the-linux-desktop-to-another-reality.html
[22:02] jcrben, https://git.launchpad.net/~snappy-hwe-team/snappy-hwe-snaps/+git/network-manager/tree/ the text comes from the snapcraft.yaml ... but since you are already talking to alfonso (who is the maintainer) you can probably just go on discussing changes to the text in the forum thread with him