[02:33] PR snapcraft#2066 closed: errors: feature flag error reports
[02:36] PR snapcraft#2069 opened: Reports
[04:43] jjohansen: artful is also affected. I will give you more results today
[04:44] zyga: ah good, I was going through the diff and going htf is artful not affected and bionic is :)
[04:48] Mainline is not affected or very little
[04:48] Loading one profile over and over leaks very very quickly
[04:48] Maybe some new table is not released?
[04:49] I wrote some tests but went to sleep. I will keep looking today
[05:03] mainline certainly has a leak
[05:03] which kernel version for mainline are you not seeing it on?
[05:03] zyga: ^
[05:05] jjohansen: I built mainline from yesterday, I was at e241e3f2bf97
[05:05] okay, thanks
[05:06] jjohansen: when I say there was no leak I mean that loading a profile over and over (unchanged) ran for over 30 minutes with minimal memory bump (probably noise from other programs)
[05:06] jjohansen: at most 300MB
[05:06] jjohansen: on an affected kernel a few minutes of this would consume all my RAM
[05:07] jjohansen: I will give you more data soon, sorry, yesterday I just collapsed
[05:08] jjohansen: this is the base64-encoded binary profile I was loading in a loop http://paste.ubuntu.com/p/Jfs3RRKcPw/
[05:11] jjohansen: on 4.13.16 inserting that 10K times leaks 440MB
[05:11] (on amd64)
[05:12] jjohansen: perhaps other profiles we tested inside spread+snapd leaked more memory but I wanted to keep using one profile for experiments
[05:12] morning
[05:13] hey mborzecki, good morning
[05:13] any fires today?
[05:14] mborzecki: no, I think all is the same for now
[05:31] jjohansen: on xenial kernel the jump is from 946 all the way up to ~2GB
[05:31] (this time using distribution kernel, not my build of the corresponding tag)
[05:31] jjohansen: xenial kernel is misleading, this was 4.13.0-37
[05:35] jjohansen: 4.4.0-119 on xenial is also affected but very slightly so, same profile, same count, 626MB->660MB
[05:35] jjohansen: I'll test intermediate kernels now
[05:40] jjohansen: 4.8.0-58 goes from 699M -> 773M
[05:42] jjohansen: 4.10.0-42 goes from 698M -> 723M
[05:42] jjohansen: so that feels like noise so far
[05:42] the real jump is in 4.13
[05:42] where we drop a significant amount of memory
[05:48] jjohansen: 4.15.0-13 goes from 640M to 1.61G
[05:50] jjohansen: so for all practical purposes the diff between 4.10 and 4.13 has introduced the major part of the leak
[05:51] jjohansen: but note that even on 4.4 there's some memory going somewhere, maybe that's just slab growing
[05:51] jjohansen: I'll introduce a variant that does 10M insertions to see if slab stabilizes
[06:12] zyga: do you know if issue #3 'Memory use on minimal/constrained systems' had any further developments?
[06:12] jjohansen: on the xenial 4.4 kernel 10M insertions doesn't seem to actually leak memory, after some initial growth (of non-free memory) slab stabilises at 929M and just stays there
[06:12] mborzecki: no, I didn't focus on it
[06:13] i know we said it's won'tfix for 18.04, but didn't see any messages that would indicate it was further discussed later yesterday
[06:16] mborzecki: sorry, I don't know more than that
[06:16] mvo: did you end up having the meeting with cloud guys?
[06:54] I'll terminate testing 4.4, it's pretty much rock solid
[06:55] mborzecki: hi, some of the things you worked on recently (autostart, timers?) need to be added here https://forum.snapcraft.io/t/the-snap-format/698 ?
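The measurement described in the messages above (load the same profile N times, compare memory use before and after) can be sketched roughly as follows. This is only an illustration: the log does not say how memory was measured, so the definition of "used" (total minus free, buffers and page cache) and the names `parse_meminfo`, `used_kb` and `measure_growth` are assumptions, and the profile-loading step is passed in as a callable rather than spelled out.

```python
from typing import Callable, Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}."""
    info: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def used_kb(info: Dict[str, int]) -> int:
    """Assumed notion of 'used': total minus free, buffers and page cache."""
    return (info["MemTotal"] - info["MemFree"]
            - info.get("Buffers", 0) - info.get("Cached", 0))


def measure_growth(load_once: Callable[[], None],
                   read_meminfo: Callable[[], str],
                   iterations: int) -> int:
    """Return how much used memory (kB) grew across `iterations` loads."""
    before = used_kb(parse_meminfo(read_meminfo()))
    for _ in range(iterations):
        load_once()  # hypothetical: whatever loads the profile once
    after = used_kb(parse_meminfo(read_meminfo()))
    return after - before
```

On a real system `read_meminfo` would return the contents of `/proc/meminfo`, and the result for 10K iterations could then be compared across kernel versions the way the numbers above are.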
[06:56] pedronis: thanks, will do
[06:56] mborzecki: as usual we need to put the version (2.xx+) where it starts working
[06:58] zyga, mborzecki yeah, we had a meeting yesterday. we will not do anything right now, it's too risky, but we want to prepare so that we can provide a fix post-release asap
[06:58] jjohansen: 4.4 is rock solid, doesn't leak memory over extreme number of insertions, I'm looking at 4.10 now and it also looks good, memory use stops at ~1.03GB after 10s of thousands of insertions
[06:59] jjohansen: I'll keep it running for some more time and then try 4.13 where I suspect we really leak memory constantly
[06:59] mvo: sounds very godo
[06:59] good
[07:03] pedronis: mvo: one thing about exiting when idle, we don't have snap.refresh.timer anymore to wake us up, but we could schedule a command to run as an on-demand timer using systemd-run
[07:04] moin moin
[07:07] zyga: nice findings on the kernel mem leak front
[07:07] mborzecki: indeed
[07:07] mvo: i hope we can find the leak soon enough :-)
=== pstolowski|afk is now known as pstolowski
[07:13] good morning
[07:14] mborzecki: it depends what's the goal
[07:14] mvo: is the plan to make exit on idle, generalized behavior?
[07:15] mvo: did you discuss just timings or also a bit the goals?
[07:17] mborzecki: is setting configuration with "set system" landed?
[07:18] pedronis: yes
[07:18] it's on edge but not 2.32, right?
[07:18] but iirc it's in master only
[07:18] ok
[07:19] bit unfortunate, but oh well
[07:20] hm timer services are in edge too
[07:20] but not in 2.32.*
[07:20] ah
[07:20] * pedronis admits to have lost track of things a bit (2.32 being so long lived)
[07:21] mborzecki: anyway that's bit less of an issue
[07:21] the issue with set core vs set system is that it must work before one installs core
[07:21] pedronis: gustavo prefers the wake up, do stuff, exit approach over not doing anything at all. but it's still a bit undecided so worthwhile to have another meeting to discuss options. I personally still favour the "do as little as possible via units" approach
[07:21] pedronis: yeah, 2.32 is the new 2.33 :/ it's a bit annoying
[07:22] so basically we need to document set core and support it
[07:22] for the life of bionic
[07:22] (more or less)
[07:23] mvo: discussion for monday I suppose?
[07:23] pedronis: yeah, *maybe* today but I think gustavo is pretty busy today
[07:23] pedronis: it's 2 small patches, should be easy to cherry-pick in case we want fixes in 2.32
[07:23] it's too late I think
[07:25] mborzecki: you can prepare a PR and target it so that *if* we need to rebuild we have it. but I'm with pedronis probably too late
[07:25] mvo: sure
[07:26] mvo: so I imagine we concluded that it's called 2.32.4, not 2.33 ?
[07:27] pedronis: yes, I had a call with Adam about it, the amount of work to make it 2.33 is just too high at this point
[07:28] I don't think we have promised/enforced minor releases to be small or have no features
[07:28] we try to
[07:29] in theory we have assumes, but seems they stay unused
[07:29] (anyway they are not relevant for the API, it's transparent)
[07:33] PR snapd#5044 opened: 'system' nickname for 'core' in snap get/set (2.32)
[07:37] mvo: I'm going to merge my PRs about doMountSnap, should I prepare cherry-picks? or will you later?
[07:40] PR snapd#5036 closed: overlord/snapstate: allow to get an error from readInfo instead of a broken stub, use it in doMountSnap
[07:45] * pedronis is clearly not rested enough
[07:56] zyga: I posted on opensuse-packaging, but it's a very low traffic list, will let you know if I get any replies
[07:59] 5038 failing on travis, works if i run it from host, cleaned the git tree but still
[08:00] and it's awkward, afaict all services are getting enabled by dh snippets in postinst, but the snapd.wakeup.service is disabled when the test starts
[08:01] https://media1.giphy.com/media/12NUbkX6p4xOO4/giphy.gif probably
[08:04] Caelum: perfect, thanks!
[08:17] i really hate packaging at times
[08:23] if only someone invented a simple way to package stuff!
[08:23] slackware?
[08:24] zyga: could you take another look at https://github.com/snapcore/snapd/pull/4989 later on?
[08:24] lol
[08:24] PR #4989: tests: add arch to CI
[08:24] mborzecki: sure
[08:32] moin moin
[08:33] hey Chipaca
[08:33] mvo: do you remember why switch didn't have --devmode and etc?
[08:33] Chipaca: I think there is no reason, it's a nice idea
[08:33] also what we agreed about --no-devmode
[08:33] ditto --classic vs --no-classic, etc
[08:33] Chipaca: iirc nobody asked that switch would have this capability but I think it's a nice idea
[08:33] mvo: people have asked, we've just not been paying attention =)
[08:33] Chipaca: I had this problem too (wanting to switch from strict to devmode and vice-versa)
[08:34] Chipaca: *cough*
[08:34] Chipaca: details ;)
[08:34] Chipaca: don't destroy my narrative
[08:34] mvo: https://forum.snapcraft.io/t/refresh-into-devmode/4130 and https://forum.snapcraft.io/t/refreshing-snaps-in-devmode/4942
[08:35] mvo: but also I remember niemeyer had a reason for not having --no-devmode etc, but I don't remember it
[08:35] and I'm not sure it wasn't due to him confusing go-flags and flags, wrt --=
[08:41] mborzecki: https://github.com/snapcore/snapd/pull/4989/files#r181321369
[08:41] PR #4989: tests: add arch to CI
[08:43] Chipaca: sorry, I don't remember why we would not want --no-devmode etc
[08:43] mborzecki: after that +1
[08:47] mvo: how do you get out of devmode?
[08:48] Chipaca: I think you need to refresh to a new revision for this right now, no?
[08:52] mvo: I think so yes
[08:56] the memory leak was introduced in one of 144 patches
[08:57] I will review them quickly and see if I can automate a test loop
[09:02] interestingly this patch is in that list a7c3e901a46ff54c016d040847eda598a9e3e653
[09:02] zyga: nice
[09:02] zyga: bisect ftw
[09:03] it's not the bug yet though
[09:08] PR snapd#5039 closed: overlord/snapstate: use the readInfo in doMountSnap as a check only, undo if it errors
[09:08] PR snapd#5040 closed: overlord/snapstate: poll for up to 10s if a snap is unexpectedly not mounted in doMountSnap
[09:09] jdstrand: I didn't know we already supported loading arbitrary extended data into profiles!
[09:11] mborzecki: thanks for your review for 5043
[09:11] mborzecki: I will dig a bit more in a bit but right now I think there is no way to disentangle --kill-who=main and KillMode=process
[09:11] mvo: so! what can i help with today?
[09:12] Chipaca: smart ideas about 5043 are in short supply right now :)
[09:13] PR snapd#5045 opened: overlord/snapstate: poll for up to 10s if a snap is unexpectedly not mounted in doMountSnap (2.32)
[09:14] mvo: ^ prepared the backport
[09:14] pedronis: \o/ thank you
[09:14] mvo: I hear good things about runit
[09:14] Chipaca: :-D
[09:14] mvo: :-)
[09:15] so things that appear described as unrelated are interacting in obscure ways and are not orthogonal :/
[09:16] mvo: so, if I understand the issue correctly, it's that if a daemon has refresh-mode=potato but does not handle sigpotato and instead dies, then the whole service is killed?
[09:18] Chipaca: yeah, all processes in the cgroup will be killed, that is my understanding
[09:18] mvo: right
[09:19] mvo: but isn't a daemon not handling the signal it asks to be delivered on refresh a bug?
[09:19] Chipaca: from reading the source (my understanding is still a bit incomplete) I think what happens is that the main pid dies and that triggers sigchld in systemd which notices that the main pid of the given unit died
[09:19] Chipaca: this makes the unit enter "stop state" and systemd will do what it needs to do when this state appears. which includes the cleanup of the cgroup (AIUI)
[09:20] Chipaca: in the sigterm case what we want is that the daemon restarts, I guess one could argue it should re-exec and use the same pid
[09:20] Chipaca: this would solve the problem but I think this is not how most apps deal with it :/
[09:22] mvo: what are we trying to accomplish? with refresh-mode=, when the snap refreshes, we do what? and the daemon does what? and systemd does what?
[09:22] we=snapd there
[09:24] zyga: have you seen jdstrand's comment to 5041?
[09:25] Chipaca: on snap refresh with --refresh-mode=sigterm what we want is that the main process of the unit in question gets a sigterm. but that the other processes in the cgroup are left alone and survive
[09:25] Chipaca: the use case is e.g. libvirt when it has a bunch of vms running that should not stop
[09:26] Chipaca: we tell systemd to do "systemctl kill --kill-who=main -s TERM snap.name.app"
[09:26] Chipaca: instead of the usual "systemctl stop snap.name.app"
[09:26] Chipaca: making sense so far?
[09:26] yes
[09:26] now the problem seems to be that the option --kill-who=main is not orthogonal to KillMode= in a service file
[09:27] Chipaca: or it is but in a different way, there is some entanglement
[09:27] (the entanglement I described above, main pid dies, systemd wants to cleanup)
[09:27] yes
[09:28] mvo: what do we _want_ systemd to do?
[09:29] Chipaca: on "systemctl kill --kill-who=main" we want it to kill the main pid and leave the rest alone
[09:29] Chipaca: on unit stop we want it to stop everything
[09:30] mvo: do we want it to restart the thing?
[09:30] or _just_ kill it?
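The two systemd invocations contrasted above can be written down as argument lists. This is a minimal sketch, not snapd's actual code: the function names `refresh_signal_cmd` and `stop_cmd` are made up for illustration, and only the command lines quoted in the log are used.

```python
from typing import List


def refresh_signal_cmd(unit: str, signal: str = "TERM") -> List[str]:
    """Signal only the main process of the unit, as quoted in the log."""
    return ["systemctl", "kill", "--kill-who=main", "-s", signal, unit]


def stop_cmd(unit: str) -> List[str]:
    """The usual full stop of the unit."""
    return ["systemctl", "stop", unit]
```

Either list could be passed to `subprocess.run`. The difficulty discussed in the log is not in building these commands but in what systemd does after the main pid exits, which depends on `KillMode=` and `Restart=` in the unit file.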
[09:30] Chipaca: do whatever is defined in Restart=
[09:30] Chipaca: it seems that this is a decision for the snap
[09:30] Chipaca: but normally it would be Restart=on-failure (which is our default)
[09:31] Chipaca: for a lot of things (SIGHUP) it's a non-problem because the process will handle it and not die but sigterm is the problematic one
[09:31] Chipaca: still making sense :) ?
[09:31] restart=on-failure does not restart the thing when killed with TERM
[09:32] might this be the reason it's entering stop mode at all?
[09:32] Chipaca: I think I tested with "Restart=always"
[09:32] Chipaca: and it had no effect but I can do so again
[09:33] pstolowski: looking
[09:34] mvo: sigterm is like sighup, in that the process can catch it etc (sigkill is the uncatchable one)
[09:34] but, ok
[09:34] Chipaca: I know
[09:34] Chipaca: but it seems the services we care about do not catch it
[09:34] mvo: a'ight
[09:35] Chipaca: we could argue they should and the problem would go away
[09:35] mvo: and AIUI the problem with using KillMode is that 'stop' will no longer work as expected?
[09:35] jdstrand: I replied to your comments on 5041
[09:36] Chipaca: correct
[09:36] Chipaca: it will mean there are processes hanging around (potentially)
[09:37] mvo: and can ExecStop itself call systemctl?
[09:40] Chipaca: a good question, I think so, what do you have in mind?
[09:40] mvo: wondering whether we can manually use ExecStop to get the 'stop' behaviour we want
[09:42] mvo: as all the rest seems to be ok with the particular choice of restart/killmode
[09:44] Chipaca: hm, won't systemd just call ExecStop= in both cases? when kill was used and when stop was used?
[09:50] mvo: will it?
[09:53] Chipaca: it seems to be, I added "ExecStop=/bin/sh -c "echo foo >/tmp/foo"" and ran a kill (with Restart=always) and /tmp/foo with the content got created
[09:55] mvo: and is ExecStopPost _also_ run with kill?
[09:59] Chipaca: let me check
[10:00] Chipaca: yes, I also get a debug file with it
[10:01] mvo: it sounds to me like the lesser weevil is to document that if you use refresh-mode, systemctl stop will be weird
[10:02] Chipaca: yeah, and on remove do an extra kill --kill-who=all
[10:02] Chipaca: I can't figure another way but I can poke at it a bit more after lunch
[10:02] mvo: and do that on 'stop' itself also
[10:02] Chipaca: on snap stop service?
[10:02] ie 'snap stop' should work as expected even when systemctl stop doesn't
[10:03] Chipaca: that's a nice idea
[10:03] yeh
[10:44] re
[10:49] jjohansen, jdstrand: I took the profile with a leak and started removing features from it; I want to see if any of the newly-added features may be responsible
[10:50] jjohansen, jdstrand: I also stress-tested all of the sysfs files in apparmorfs for extensive reading and can say that they are not a factor
[10:51] (though I found one curious behaviour of the "revision" file, is that documented anywhere?)
[10:55] the revision file's behavior is known and going to change
[10:58] jjohansen: that it "sleeps"
[10:58] jjohansen: loading an empty profile leaks as well
[11:03] jjohansen: https://github.com/zyga/apparmor-bug-leak
[11:04] loading this is sufficient https://github.com/zyga/apparmor-bug-leak/blob/master/neutered-sample.aa
[11:04] so it's not like a new optional feature is there and causes the leak
[11:04] maybe something is not unref'd?
[11:18] mvo: there's a failure in https://travis-ci.org/snapcore/snapd/builds/365982227?utm_source=github_status&utm_medium=notification in linode:debian-9-64:tests/main/snap-service-refresh-mode https://paste.ubuntu.com/p/cxTNbkBbtb/
[11:21] mborzecki: thanks, looking
[11:22] mvo: just the paste, i've restarted the build
[11:22] mborzecki: thanks, I see it in the paste I will chase that
[11:23] mborzecki: looking into this area anyway currently
[11:23] mvo: yup, that's what i thought :)
[11:27] jjohansen: I varied the test to insert a profile with ever-changing contents, I will check how that behaves across kernel versions
[11:29] jjohansen: I noticed that one of the things that differs between the broken and working kernels is the presence/absence of symlinks in apparmorfs/policy/profiles/$PROFILE/
[11:40] jjohansen, jdstrand: loading _different_ profiles forever on affected kernels doesn't leak memory
[11:41] mvo: as a workaround I can generate a random garbage rule like /tmp/.snapd.bug.1234.$RANDOM r,
[11:41] mvo: and inject that into all profiles
[11:41] mvo: we will lose all caching but we will not leak
[11:45] pstolowski: would you like to review https://github.com/snapcore/snapd/pull/5034 ? :)
[11:45] PR #5034: userd: set up journal logging streams for autostarted apps
[11:46] pstolowski: it's based on 5024, so it's just the last 2 patches that are different, 5034 specific
[11:47] mvo: more ideas on v
[11:47] https://forum.snapcraft.io/t/oom-for-interfaces-many-on-bionic-i386/4101/18?u=zyga
[11:49] jjohansen, jdstrand: I'm now looking at error recovery in aa_replace_profiles
[11:49] zyga: yay
[11:49] zyga: let me know if you have anything to test
[11:50] mvo: look at the options I gave
[11:50] mvo: 1 is simple but wasteful
[11:50] mvo: 2 should have no downsides but is complex
[11:50] mvo: I can prepare a test with .1 quickly
[11:50] zyga: if (1) is simple, we just do it as an experiment?
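The workaround floated at 11:41 (make every loaded profile differ by adding a throwaway rule) might look something like the sketch below. It is an illustration only and not the change that was eventually proposed: it assumes a text profile whose last `}` closes the profile body, and the function name `inject_cache_buster` and the use of `secrets` for the random token are assumptions.

```python
import secrets


def inject_cache_buster(profile_text: str, token: str = "") -> str:
    """Add a unique, harmless read rule just before the final closing brace."""
    token = token or secrets.token_hex(8)  # fresh value per call by default
    rule = f"  /tmp/.snapd.bug.1234.{token} r,\n"
    head, brace, tail = profile_text.rpartition("}")
    if not brace:
        raise ValueError("profile has no closing brace")
    if head and not head.endswith("\n"):
        head += "\n"  # keep the injected rule on its own line
    return head + rule + brace + tail
```

As noted in the log, the cost of this approach is that identical profiles no longer compare equal, so any caching keyed on profile contents stops helping.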
[11:50] surte
[11:50] *sure
[11:50] zyga: yeah, let's do it if it does not take much time on your side
[11:57] mborzecki: sure
[11:58] mvo: after the break, now need to do stuff at home
[12:01] Son_Goku: hey! am i missing something, or copr.fedorainfracloud.org doesn't actually offer an option for uploading spec+srpm as advertised on the wiki?
[12:01] pstolowski: yes you are
[12:01] you can upload a srpm via the copr CLI tool
[12:01] or you can upload it somewhere and tell the web ui to fetch it
[12:02] pstolowski: you should be able to from http://copr.fedorainfracloud.org/coprs/pstolowski/go-udev/packages/
[12:02] ah, and you can upload the srpm from the web ui too
[12:03] pstolowski: https://copr.fedorainfracloud.org/coprs/pstolowski/go-udev/add_build_upload/
[12:04] Son_Goku: ah, there! I was staring at the Packages -> Add New Package where you need to have git/svn
[12:04] thanks
[12:24] reminds me to drop my copr packages, they're quite old anyway
[12:26] PR snapd#5046 opened: snap, wrappers, tests: rename refresh-mode -> stop-mode, endure -> skip-refresh
[12:26] mvo: ^^
[12:36] * kalikiana lunch
[12:49] re
[12:49] mvo: looking at that workaround now, it will be a moment, I'm almost done
[12:52] mborzecki: \o/
[12:52] zyga: yay
[12:55] Issue snapcraft#2028 closed: When asking to release to a branch that's too long, a traceback is printed that gives no hints as to the source of the error
[12:55] PR snapcraft#2059 closed: storeapi: handle 500 error response when releasing snap
[12:55] mborzecki: haven't we had a release with refresh-mode already?
[12:59] * Chipaca hunts for his headphones
[12:59] Chipaca: yes, that's why i have some doubts about backwards compat
[13:02] Chipaca: standup
[13:06] PR snapd#5047 opened: tests: removing linode-sru backend
[13:20] PR snapd#5048 opened: tests: updating bionic version for spread tests on google
[13:32] Is it "known" that opengl apps on 18.04 on binary nvidia are broken?
[13:33] popey: no
[13:33] shotcut "could not initialize opengl" but works on intel
[13:33] popey: on stable?
[13:33] i am on beta
[13:33] but i can go back to stable to test
[13:33] popey: can you give us some more data, which hardware, which snaps, etc
[13:33] mborzecki: ^
[13:33] sure
[13:34] popey: after standup i'll reboot and check on bionic
[13:34] filing a bug to capture it
[13:34] popey: how does it fail? is there any log?
[13:35] https://bugs.launchpad.net/snapd/+bug/1763717
[13:35] Bug #1763717: some opengl applications don't work on nvidia binary driver
[13:35] popey: try to run /usr/lib/snapd/snap-discard-ns, and then capture the log with SNAPD_DEBUG=1 SNAP_CONFINE_DEBUG=1
[13:35] ok
[13:37] added to the bug
[13:37] popey: it's a classic snap
[13:37] re
[13:38] Is that bad? :)
[13:38] popey: if it's a classic snap then it does not get any of the nvidia namespace setup treatment
[13:38] (I mean, we'd like it to not be classic)
[13:38] popey: classic snaps don't have any support for that
[13:39] so it should "just work" right?
[13:39] it depends on how it is made
[13:39] but it's something for kalikiana perhaps
[13:39] we don't influence i
[13:39] we don't influence it
[13:39] it probably doesn't work
[13:39] popey: shotcut snap right?
[13:39] yes
[13:39] because 18.04 and 16.04 differ
[13:39] it's in the store
[13:40] host's handling of nvidia changed
[13:40] there's nothing we can do IMO
[13:41] it'll probably fail on arch too, let me try
[13:42] popey: aborts on arch too https://paste.ubuntu.com/p/qPtZStGWth/
[13:42] :(
[13:45] you can surely do something about it in a wrapper script in the snap itself
[13:45] (hacks)
[13:52] popey: i've installed all dependencies and can run shotcut directly outside of snap
[13:54] I'm at a loss what we suggest the developer does for a smooth experience installing snaps on multiple distros at different releases and have it work.
[13:54] do not use classic :)
[13:54] mvo: I'll take the dog out now but then I will be back to propose the workaround
[13:54] zyga: thank you
[13:54] popey: you _can_ use nvidia but the snap needs some code for that
[13:54] popey: talk to kalikiana and sergiusens
[13:54] popey: it's doable just nobody has done it
[13:54] ok
[13:55] popey: snapd is not preventing that
[13:55] popey: it's just not enabling that (because it cannot)
[13:55] * zyga -> afk
[13:55] yeah ... as i said, wrapper hackery
[13:55] mvo: mborzecki: so...
[13:56] mvo: mborzecki: removing catalogrefresh from snapd drops it (with an ~empty state) from 25MB rss to 15MB rss on start
[13:56] pedronis: also ^
[13:56] catalogrefresh is all kind of evil :)
[13:56] Chipaca: so basically .text + .data + .bss size
[13:56] * Chipaca covers catalogrefresh's ears
[13:56] Chipaca: woah
[13:57] all kinds of .bs
[13:57] this is also because of bolt db code
[13:57] Chipaca: heh ;)
[13:57] I suppose
[13:57] pedronis: that's my assumption, yes
[13:57] Chipaca: with GOGC=1 RSS was 19MB
[13:57] Chipaca: nice, let's move it out into a separate helper
[13:57] Chipaca: thanks, that's a huge win
[13:57] mvo: now, or post-1804
[13:58] there are some issues around auth to move it out
[13:58] pedronis: why? i thought it didn't auth at all
[13:58] Chipaca: depends on your schedule for today, if you have free cycles I would say asap but does not have to be *now*
[13:58] Chipaca: we need to talk with nessita, apparently /names can be per store
[13:58] I don't know if it is yet
[13:58] oh
[13:58] ok
[13:59] mvo: I'll check with nessita and work on it today
[14:00] mborzecki: I will merge your refresh-mode-endure-rename PR into my snap-refreshmode-fixes PR and work from that, ok?
[14:00] Chipaca: sounds good, thank you
[14:00] mvo: if we do a .5 we should consider #5044
[14:00] PR #5044: 'system' nickname for 'core' in snap get/set (2.32)
[14:01] I don't remember how risky it is
[14:01] pedronis: +1 (cc mborzecki if you have cycles you could prepare a backport)
[14:01] but it would be good to have that in bionic from the start
[14:01] pedronis: iirc not risky
[14:01] mvo: that is already the backport
[14:01] afaiu
[14:03] mvo: sounds good
[14:03] mvo: 5044 is a backport already
[14:03] mborzecki, pedronis: aha, excellent
[14:03] mvo: or did you mean a backport of the rename pr?
[14:04] mborzecki: he said he will merge the rename in his PR
[14:05] mborzecki: I meant the system one
[14:05] mvo: ok, so 5044 is good then ;)
[14:06] mborzecki: yes!
[14:06] mborzecki: if you want you can look into a smarter way to do https://github.com/snapcore/snapd/pull/5043/files#diff-08088787360fb3ca74a0aed4c6103b22R312
[14:06] PR #5043: many: fix "refresh-mode: sig{term,hup,usr[12]}" via KillMode=process
[14:07] roadmr: hey, can you pull r1025? this turns on by default (but that doesn't matter with your django changes), improves the resquash error message for snapcraft 2.38 and fixes a typo in overrides.py
[14:07] mborzecki: i.e. what we want is to ensure that on remove everything in the unit gets killed eventually
[14:07] jdstrand: sure thing!
[14:07] roadmr: thanks!
[14:08] mborzecki: so something like "check unit, anything in there left? if so send sigterm. poll for up to 5 (or 10) seconds and check every ~1s if there is anything left. then do a hard sigkill on the group"
[14:13] PR snapcraft#2070 opened: extractors: support for setup.py
[14:14] gnome shell froze again :/
[14:14] Pharaoh_Atem: Hey
[14:14] I will build a test snap on top of the
=== mborzeck1 is now known as mborzecki
[14:14] Fedora base snap this weekend
[14:15] And I will try to publish your base
[14:16] mborzecki: did you get my last messages or shall I repaste?
[14:16] mvo: repaste please
[14:16] mvo: ah, I'm wondering if we should operate a bit more like systemd when we really try to kill the whole service
[14:16] mborzecki: so something like "check unit, anything in there left? if so send sigterm. poll for up to 5 (or 10) seconds and check every ~1s if there is anything left. then do a hard sigkill on the group"
[14:17] pedronis: yeah, I think on remove we must be
[14:17] mborzecki: if you want you can look into a smarter way to do https://github.com/snapcore/snapd/pull/5043/files#diff-08088787360fb3ca74a0aed4c6103b22R312
[14:17] PR #5043: many: fix "refresh-mode: sig{term,hup,usr[12]}" via KillMode=process
[14:17] mborzecki: i.e. what we want is to ensure that on remove everything in the unit gets killed eventually
[14:17] mborzecki: sorry, slightly out of order
[14:17] that's ok
[14:17] I left a comment there
[14:18] about a code org question
[14:18] and John's input
[14:18] Chipaca: you should also look at #5043
[14:18] PR #5043: many: fix "refresh-mode: sig{term,hup,usr[12]}" via KillMode=process
[14:18] pedronis: I did, had a long chat with mvo about it this morning
[14:19] heh, ok
[14:22] zyga: mborzecki I am not convinced this is a classic only issue. snap install mame --beta, that's not classic and doesn't work on 18.04/nvidia, but does on 16.04/intel.
[14:23] I am certain this worked previously on 18.04/nvidia (because both I and Wimpress have tested it on that platform)
[14:23] Yes, it worked on 16.04 by accident
[14:23] But Ubuntu changed so it no longer does
[14:24] It probably still works on 16.04
[14:24] Chipaca can confirm as he has the right software and hardware
[14:25] i what now?
[14:25] This doesn't feel like an optimal response to "My snap worked and now it doesn't"
[14:25] popey: updating my bionic install now, i'll check in a minute
[14:26] zyga: ?
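The "check, sigterm, poll, then sigkill" sequence quoted above can be sketched as a small function. This is a hedged illustration of the idea rather than the code in PR #5043: `kill_remaining`, `list_pids` and `send_signal` are hypothetical names, and how the remaining processes of a unit are enumerated or signalled is deliberately left to the caller.

```python
import time
from typing import Callable, Sequence


def kill_remaining(list_pids: Callable[[], Sequence[int]],
                   send_signal: Callable[[str], None],
                   timeout: float = 10.0,
                   interval: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> str:
    """Ask politely with TERM, poll until `timeout`, then fall back to KILL."""
    if not list_pids():
        return "empty"            # nothing left in the unit
    send_signal("TERM")
    deadline = clock() + timeout
    while clock() < deadline:
        if not list_pids():
            return "terminated"   # everything exited after TERM
        sleep(interval)           # check again roughly every `interval`
    if not list_pids():
        return "terminated"
    send_signal("KILL")           # hard kill for whatever is still around
    return "killed"
```

Passing `sleep` and `clock` as parameters is a design choice made here so the polling loop can be exercised without real waiting; the 10 second timeout and 1 second interval mirror the numbers mentioned in the log.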
[14:29] pedronis: thank you, I have a look, I just added stop-mode and tweaked a bit but I need to add more tests and tweaks but I think the direction is nice
[14:29] PR snapd#5046 closed: snap, wrappers, tests: rename refresh-mode -> stop-mode, endure -> skip-refresh
[14:30] zyga: before we publish anything, we need permission from Fedora Council and Server WG for usage of the name 'fedora'
[14:34] Pharaoh_Atem: that's a good point
[14:34] Pharaoh_Atem: how can we get that?
[14:34] probably talk to sgallagh to see how to do that
[14:34] Pharaoh_Atem: can we publish it as phedora for now?
[14:34] for development
[14:34] before anyone can think it's ready for use
[14:35] fedora-release needs to be swapped for generic-release, and then we can call it whatever
[14:35] Pharaoh_Atem: that's nice
[14:36] Pharaoh_Atem: I'll make a test snap, if we agree on the name "phedora"
[14:36] or something like that :)
[14:36] with generic-release, the system identifies as Generic Linux :)
[14:36] https://src.fedoraproject.org/rpms/generic-release/blob/master/f/generic-release.spec#_136-154
[14:36] popey: ohmygiraffe works fine, trying out mame now, any roms you can recommend? :)
[14:36] mborzecki: did you play golden sun?
[14:37] mborzecki: I have it on an old Nintendo console, it's an amazing game
[14:38] Pharaoh_Atem: generic is ... too generic
[14:38] but we can come up with something I'm sure
[14:38] i'm a metal slug fan ;)
[14:38] mborzecki: yes, omg works, but that was built and never rebuilt a year ago.
[14:38] mborzecki: ah, I love that game too :) man I'm so old now that I think of it
[14:39] zyga: we can also fork generic-release and produce a -release package to brand as another distro that says "ID_LIKE=fedora"
[14:39] again, fairly trivial stuff
[14:39] yeah, that's good
[14:39] mborzecki: Personally I like ghosts & goblins, r-type, r-type 2, rtype leo, scramble, nemesis, side arms and bomb jack - and I own the arcade boards so I'm (allegedly) legal ;)
[14:39] I'll make some packages and stick it into copr
[14:39] ;)
[14:39] popey: _wow_
[14:40] zyga: you could even call it Ubuntu RPM Edition if you wanted :P
[14:40] pedronis: re https://github.com/snapcore/snapd/pull/5043#discussion_r181401324 - do you have a suggestion what place to use?
[14:40] PR #5043: many: add "stop-mode: sig{term,hup,usr[12]}{,-all}" instead of conflating that with refresh-mode
[14:40] Pharaoh_Atem: but then ubuntu trademark ;-)
[14:40] though I think sabdfl would be a little bit peeved :P
[14:40] Pharaoh_Atem: I'll call it zygoonix
[14:40] hehe
[14:40] Pharaoh_Atem: hahaha
[14:40] popey: yeah, mame does not work, but the nvidia libs get mounted at the proper location, i'll check the wrapper script in case it's doing something silly
[14:40] Orange Hat
[14:40] :D
[14:40] OMG
[14:40] lol
[14:40] why not, eh?
[14:40] orange is the new hat ;-)
[14:41] * zyga gets back to kernel bug fixing and workarounds
[14:41] mborzecki: I was using a manual launcher I made previously, I switched to desktop-glib-only
[14:41] if you want to have more fun, there's also a generic-logos package you can fork and make up some fun branding for
[14:41] mvo: I think the helper should just return if the mode is for all or main? and then wrappers picks process based on that
[14:42] pedronis: yes, that sounds good
[14:42] pedronis: I will change it this way
[14:42] popey: Purple Cap :P
[14:42] mvo: thank you, is mostly for future readers, is a bit strange to be in snap
[14:42] pedronis: I will call it "KillEmAll()"
[14:42] pedronis: totally agreed, the wrong place
[14:42] mvo: how about SecureKillEmAll
[14:43] we want to be safe, after all
[14:43] Pharaoh_Atem: I like!
[14:43] I think I have a name
[14:44] but I'll announce it later :P
[14:44] popey: strace shows it's not loading libGL from the right location
[14:44] zyga: now don't forget, we need logos and stuff too :P
[14:45] mvo: it looks like I won't be splitting catalog refresh today
[14:45] Pharaoh_Atem: and a boot up chime
[14:45] YES!
[14:45] Pharaoh_Atem: it can be "SCO LINUX" played backwards
[14:45] haha
[14:46] it takes ~40 minutes to make your own branded derivative of Fedora, satisfying _all_ of the necessary trademark guidelines
[14:46] and with your derivative, you can do whatever you want
[14:48] popey: this is ld.so log from when it tries to load libGL https://paste.ubuntu.com/p/S5MyfKdyXT/ SNAP_LIBRARY_PATH is not in LD_LIBRARY_PATH anymore
[14:49] that's why it does not pick up the right libGL
[14:50] mborzecki: can we inject that for classic snaps?
[14:50] (this isn't classic)
[14:50] oh
[14:50] then we can definitely influence that
[14:50] so could I work around this by adding an environment stanza which prepends SNAP_LIBRARY_PATH to LD_LIBRARY_PATH?
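The prepend asked about at 14:50 is simple, but one detail is easy to get wrong: an empty entry in `LD_LIBRARY_PATH` (for example from joining onto an unset variable) is treated by the dynamic loader as the current working directory. A minimal sketch of a safe prepend follows; `prepend_path` is a hypothetical helper, not something from snapd or snapcraft.

```python
def prepend_path(new_dir: str, current: str) -> str:
    """Put new_dir first in a colon-separated list, without empty entries."""
    # Drop empty components (they would mean "current directory" to ld.so)
    # and any earlier occurrence of new_dir so it is not listed twice.
    rest = [p for p in current.split(":") if p and p != new_dir]
    return ":".join([new_dir] + rest)
```

In a wrapper script the same effect would be achieved by building the new value from `$SNAP_LIBRARY_PATH` and the existing `$LD_LIBRARY_PATH` before exec'ing the real binary.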
[14:52] popey: that's what snapcraft does [14:52] popey: i'm looking into the wrappers now [14:52] ok [14:52] thanks [14:53] huh, I did apt purge snapd and now have two loopback devices hanging around [14:53] Chipaca: can you run [14:54] losetup [14:54] and paste that [14:54] I can run, just not on open road just yet [14:54] too hard on the back [14:54] zyga: it lists two devices, both with "deleted" [14:54] Chipaca: super interesting [14:54] so two snaps were removed but their loop devices were not cleaned up [14:54] yes [14:54] what are those [14:54] and I can mount them :-) [14:55] some experiments or regular things? [14:55] core and lxd [14:55] intersting [14:55] maybe related to what pedronis is chasing [14:55] and i can't detach them even though they're not mounted [14:55] do you have lxd containers running? [14:57] pedronis: I don't think so, but how do i check? [14:57] huh, i do have lxd stuff running [14:57] * zyga found one typo, closer to having a patch [14:57] that might explain why didn't go away [14:57] maybe [14:58] dbus, lxd itself, and dnsmasq [14:58] thinking about it I manually removed the lxd snap yesterday [14:58] as in rm -f [14:58] no, as in 'snap remove lxd' [14:58] zyga, mvo, niemeyer: Flock 2018 is in Dresden, Germany: https://fedoramagazine.org/flock-2018-venue-announcement/ [14:59] during holidays [14:59] nice [14:59] maybe I can come [14:59] even totally unofficialy [14:59] * pedronis is going to eow [15:03] zyga: does RPHAT have higher priority than LD_LIBRARY_PATH? 
[15:03] RPATH [15:05] mborzecki: thinking [15:05] mborzecki: I don't know :/ [15:05] I need to check spec for that [15:05] rpath that has lower precedence than the LD_LIBRARY_PATH environment variable [15:06] googled and memory recollection seems to align with that [15:06] elf files can be disabled from looking at LD_LIBRARY_PATH though [15:07] sergiusens: zyga: this is what I see: https://paste.ubuntu.com/p/8mbNy545Z4/ [15:07] if i LD_PRELOAD then mame works [15:07] and I can see the paths from LD_LIBRARY_PATH being used when searching for other libs [15:09] PR snapd#5048 closed: tests: updating bionic version for spread tests on google [15:10] mborzecki: don't you need to snap --shell run `mame` and set LD_LIBARARY_PATH in there? [15:11] sergiusens: i'm inside the snap ns [15:11] PR snapd#5047 closed: tests: removing linode-sru backend [15:12] oh, I read incorrectly [15:12] can I see all of readelf's output? [15:12] but will look shortly, need to do a pickup [15:15] sergiusens: http://paste.ubuntu.com/p/bGPHtsMHMs/ [15:21] sergiusens: LD_DEBUG=all http://paste.ubuntu.com/p/Zm57vmQCwR/, at first it's only looking at RPATH, then around line 5217 it picks up LD_LIBRARY_PATH when it moves to loading deps of other libraries without RPATH [15:22] pff no clue what's happening, i could probably try to patch mame binary and strip/fixup rpath [15:36] mvo: testing the fix now [15:36] zyga: testing as in running spread with 18.04-32 ? [15:36] yes [15:37] cool [15:37] mvo: I've added a constraint that runs this only on classic [15:37] we can further refine it to look for apparmor "feature" non-leaking kernel or something [15:39] btw, where did you get bionic 32 image [15:39] I had to use my hacky scripts to make one [15:39] for qemu [15:40] zyga: autopkgtest-buildvm-ubuntu-cloud -r bionic -a i386 [15:41] yeah [15:41] that's what I do [15:41] zyga: oh, and it did not work? 
[15:42] no, I mean, I hoped we had a google image [15:43] zyga: aha, not yet afaict [15:43] afaik [15:44] mvo: quick pre-review? [15:44] https://github.com/snapcore/snapd/compare/master...zyga:tweak/inject-random-profiles?expand=1 [15:44] Chipaca, zyga 5043 is ready I think for a review, tests still running so maybe some tweaks needed but hopefully its ok [15:44] ok, looking [15:45] holly molly [15:45] that's not a small change sadly [15:45] but let me read on [15:55] jdstrand: hey [15:55] jdstrand: if around, can you look at the link above please [15:55] zyga: did you see jdstrand commented in the forum? [15:56] checking [15:56] yeah, just saw [16:00] zyga: fwiw, it looks reasonable - does it help? i.e. is the test working now? [16:00] still going for now [16:01] it definitely does help in my simplified testing where I just insert random profile in a tight loop [16:01] that's stable over 100K insertions [16:02] nice === sergiusens_ is now known as sergiusens [16:58] * kalikiana wrapping up for the week o/ [16:59] jdstrand, [Apr13 16:23] audit: type=1400 audit(1523636600.835:36): apparmor="DENIED" operation="open" profile="snap-update-ns.classic" name="/dev/urandom" pid=1811 comm="3" requested_mask="r" denied_mask="r" fsuid=0 ouid=0 [17:00] jdstrand, looks like the classic-support interface would like /dev/urandom support ;) [17:02] snap-update-ns.classic [17:02] what would make it open /dev/urandom? [17:02] ogra_: that profile is applied to a program that modifies snap namespaces, not to the program itself [17:02] zyga, thats the classic (developer chroot) snap on a core system [17:03] yes, but the denial is odd, this is not a profile for the application [17:03] something inside tries to access urandom i guess [17:03] it is a profile for the tool that constructs the execution environment [17:03] it never should open /dev/urandom [17:03] can you tell me how to reproduce this? [17:04] not really but i can try to proxy ... 
afaik ian only noticed it in his logs after working a day with his dragonboard [17:04] in any case it should not cause any actual harm [17:04] no, thats clear [17:04] but i thought it was just missing device access in the classic-support interface [17:08] Issue snapcraft#1920 closed: Design error reporting [17:08] PR snapcraft#1951 closed: repo: do not pull in libc6-dev by default for stage-packages [17:09] zyga: snap-update-ns might load bits of snapd that try to use random generation [17:09] ah, indeed [17:09] perhaps osutil does need it [17:10] we would need to investigate, it might not be something that snap-update-ns uses but some init of something linked might use it [17:11] Issue snapcraft#2071 opened: patchelf to handle RUNPATH [17:42] * cachio afk [17:53] hrm, hrm, no travis slots, the typical friday evening problem after a busy week [17:53] * mvo is slightly sad about this [17:59] roadmr, are there issues with the staging store right now? [17:59] roadmr, having trouble logging in, do you see any issues here? https://pastebin.ubuntu.com/p/KVXTxkYZBf/ [18:02] kyrofa: what's a 504?? hehe let me see [18:03] A timeout, indeed, that seemed odd to me as well [18:03] weird [18:03] a sec [18:06] Issue snapcraft#1918 closed: Add y/n support for sending errors back [18:06] PR snapcraft#2069 closed: Reports [18:07] kyrofa: I'm able to "snapcraft login" to staging just fine [18:08] roadmr, I actually had a success with another account as well-- any chance there could be an issue with a specific account? [18:08] Either that or it doesn't always happen [18:08] Although I got it twice in a row with the same account [18:09] kyrofa: could be! can you authenticate with that account at login.staging.ubuntu.com? [18:09] Let me try [18:10] kyrofa: hm, the URL that's timing out on you is actually in the staging store / dashboard. Incremented Retry for (url='/dev/api/account'): Retry(total=4, connect=None, read=None, redirect=None) [18:10] kyrofa: when was this? 
[18:11] Just now [18:11] kyrofa: ok, let me check whether any of the app servers are wonky [18:12] (they were earlier today, that's why I asked "when") [18:14] kyrofa: everything looks healthy :( [18:14] kyrofa: I'm about to go lunch but you can ask in #snapstore, anyone there should be able to look at this [18:21] roadmr, note that I can indeed login to login.staging.ubuntu.com [18:23] wooooot [18:23] first thunderstorm of the season [18:23] !!! [18:23] waterfall from the sky [18:25] zyga: do you know if mb figured out what was breaking mame? [18:25] (is there something i can do in the launcher?) [18:25] popey: sorry, I don't know [18:25] ok [18:25] I'll ask next week [18:25] I'm looking at a kernel leak all day [18:27] hey zyga, any news about the daemon thingy? :) [18:27] suebt: hey, no, I will try next week, sorry :/ [18:28] ok np === devil is now known as Guest61956 [18:38] PR snapd#5049 opened: interfaces/apparmor: workaround kernel memory leak [18:39] kyrofa: yes, per my earlier comment, it's not a login.staging.u.c issue, but a dashboard.snapcraft.io one [18:40] Ah, okay [18:40] jdstrand: I can't upload with snapcraft 2.40 - despite your thread suggesting 2.40 implements this -no-fragments thing [18:40] https://forum.snapcraft.io/t/automated-reviews-and-snapcraft-2-38/4982 [18:41] Hang on, this is an electron application - built with electron-builder. Does that bake in something older than 2.40? [18:43] kyrofa: just so we're clear, things are working well for you now, and you'll let me (or anyone in store team) know if you see this again, right? [18:43] kyrofa: (just to avoid an expectation mismatch "I thought you were looking into it" "I thought you'd tell me if it broke again" :) [18:46] roadmr, no, something is wrong. I can definitely login with other accounts, but not this one. Ever [18:46] Same thing happens every time [19:01] kyrofa: oh! 
ok, let's look at it from this angle then [19:05] kyrofa: oi oi [19:05] Hey there cprov [19:06] kyrofa: the `account` endpoint (used on login to cache account-id, IIRC) is returning a lot more data in `snaps` and it's undeniably slower, specially for crazy testing users we have in staging [19:07] cprov, if you look at the snap names, we testing things like registration [19:07] cprov, and then we never touch some again. Is there a way we can tell the store to toast them? [19:07] kyrofa: cleaning up snaps is not operational yet, for now the best suggestion I have is to create a new user [19:07] Ouch [19:08] This is all CI [19:08] Can we manuall clear out the snaps? [19:08] manually* [19:08] kyrofa: it's not that bad, a new user will be performing well up to 1000s snaps, which is weeks of testing [19:10] kyrofa: cleanup is too manual atm, via the UI one by one, we need come up with a script to do that properly [19:12] cprov, okay. Is there a different endpoint we could be using that won't give us every single snap? [19:13] kyrofa: not yet, but what do you have in mind ? [19:13] cprov, just seems silly to return a bunch of data we don't need [19:13] I don't want to create a new user every few weeks [19:14] kyrofa: why does it need the account/ on login ? [19:15] kyrofa: ftr, there are 4200 u1test* snaps in staging [19:17] zyga: I see #5049 - did it survive ubuntu-18.04-32? [19:17] PR #5049: interfaces/apparmor: workaround kernel memory leak [19:20] cprov, no idea, do you think it shouldn't? 
[19:21] I can find out if you think it unnecessary [19:21] kyrofa: let me check the code, I don't see an obvious reason [19:24] mvo: my workaround worked [19:24] mvo: it failed on network on one test [19:24] mvo: I restarted but it works [19:25] cprov, _check_dev_agreement_and_namespace_statuses I think [19:25] mvo: I get dinner now [19:25] We check to ensure the account is properly setup with the dev namespace and whatnot [19:25] kyrofa: ah, I see [19:27] zyga: nice [19:27] zyga: I'm also waiting for tests [19:36] PR snapcraft#2072 opened: lifecycle: handle missing version correctly === devil is now known as Guest55708 [19:45] popey: you said 'hold on'. I don't know if electron builder pins snapcraft in any way, but I can say the store queue is empty [19:45] popey: make sure that schroot, lxd containers, your snaps, your debs, your git checkout, etc is using the latest snapcraft [19:46] popey: if you refresh the review-tools snap, you can do 'snap-review /path/to/snap' and it will have better error reporting on the issue and can let you know if -no-fragments was used or not, or something else [19:48] popey: you mentioned electron-- if you are using 'allow-sandbox: true' with a setuid chome sandbox, that will also cause it to fail this test (it will also fail *other* tests though too) [19:48] any ideas about: cannot change profile for the next exec call: No such file or directory [19:48] snap-update-ns failed with code 1: No such file or directory [19:49] jdstrand, zyga: ^ [19:49] popey: but ultimately, you didn't give a store url so I can't look into it more [19:49] that's when running corebird, previously working [19:49] Yes [19:49] But not debugged [19:50] Watching movie with family now [19:57] zyga: enjoy. still no travis slot for me, I call it a day [19:58] Boo :/ sorry to hear that [19:58] Travis has a weekend mode [20:07] popey: please see the forum [20:31] PR snapcraft#2068 closed: states: track override scriptlets [20:34] jdstrand: ok. 
i dont know how electron does it [20:34] jdstrand: there's none in the queue I guess because it never got to the store [20:41] popey, I believe they call mksquashfs themselves [20:41] kyrofa: ah, nice! if that's true and electron-builder can be poked at, their mksquashfs call can be made to conform to what Jamie suggested [20:44] No, I lied: https://forum.snapcraft.io/t/snapcraft-push-error-keyerror-architectures/4056/9 [20:44] Sounds like they use snapcraft pack [20:44] But what version? [20:45] aha :) [20:46] They have a docker image, I bet it hasn't been respun [20:49] Though I can't pretend to understand fully how it works [20:50] their instructions ask to install snapcraft as a deb [20:51] 2.40 is available for xenial, artful and bionic, and from this I understand it'd use the deb installed on the system... which popey said was 2.40 [20:52] Interesting [20:52] it looks like e-b manually does some unsquashfs / mksquashfs, but I'm not a developer, so I am going by what I think i see on github [20:52] https://github.com/electron-userland/electron-builder/blob/7800ce1301154281564a23b5707e9a79bf3aa52d/scripts/snap-template.sh#L8 [20:54] I saw that too... but it looked more like a test [20:56] looks like the node module electron-builder-lib [20:57] I don't understand this :( [20:57] one for monday. [20:58] PR snapcraft#1911 closed: Add support enable configurable runtime version for .NET Core applications [20:59] niemeyer, there? 
[21:05] cachio: Here, but still in the conference [21:06] niemeyer, nice, just 1 quick question [21:07] I am working with the amazon image [21:07] it is working but still trying to resize it [21:07] I read it is not possible to shrink xfs filesystems [21:07] and it is using an xfs [21:07] I see 2 alternatives [21:08] I could create a smaller partition and copy the fs to there [21:09] or leave it with the same size and make a change in spread to skip setting the size of the disk when it is defined in the spread.yaml [21:09] niemeyer, do you see other alternatives? [21:09] what do you prefer? [21:14] cachio: What happens if you set the image to exactly 10GB? I hope it won't try to resize it [21:15] cachio: We shouldn't change the nature of the image though.. shouldn't be a different filesystem for example [21:15] niemeyer, you mean if I truncate the image to 10GB? [21:15] cachio: Worst case we can have a setting in spread to preserve the image size [21:16] niemeyer, yes, that was the other alternative [21:17] I can try to resize the partition but we could lose data [21:18] If you agree I could make a PR in spread to preserve the image size [21:19] niemeyer, in that case we can use the amazon image which is already uploaded [21:22] Issue snapcraft#2064 closed: Support for set-grade [21:22] PR snapcraft#2067 closed: many: add snapcraftctl set-grade [21:25] PR snapcraft#2073 opened: meta: validate extracted and scriptlet metadata [22:42] cachio: Yeah, I think that's fine.. we can use "preserve" as a value, that internally sets the value to zero, and we interpret it as such [23:16] cachio: Actually, zero won't work as that's the default in which we want to resize as what most cases need.. preserve could set it to -1 though [23:16] niemeyer, ok [23:17] I'll try it [23:18] thanks
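(editor's note) The "preserve" idea from the last few messages could look roughly like the sketch below. This is only an illustration of the sentinel choice discussed (zero stays the "unset, resize as usual" default, -1 means "keep the image size"); it is not spread's actual code, and the names `parse_size_setting`, `should_resize`, `PRESERVE` and `UNSET` are hypothetical, as is the accepted "<n>M"/"<n>G" syntax.

```python
# Hypothetical sketch, not spread's real implementation.
PRESERVE = -1  # assumed sentinel: leave the image at its current size
UNSET = 0      # default: no size given, so the usual resize applies

# Assumed size suffixes for this sketch only.
_UNITS = {"M": 1024 ** 2, "G": 1024 ** 3}


def parse_size_setting(value):
    """Map a size setting read from a config file to an integer.

    None or an empty string -> UNSET (0)
    "preserve"              -> PRESERVE (-1)
    "<n>M" or "<n>G"        -> size in bytes
    """
    if value is None or value.strip() == "":
        return UNSET
    text = value.strip()
    if text.lower() == "preserve":
        return PRESERVE
    unit = text[-1].upper()
    number = text[:-1]
    if unit not in _UNITS or not number.isdigit():
        raise ValueError(f"unsupported size: {value!r}")
    return int(number) * _UNITS[unit]


def should_resize(size):
    """Resize in every case except the explicit 'preserve' sentinel."""
    return size != PRESERVE
```

Using -1 rather than 0 for "preserve" keeps the two cases distinguishable: an omitted setting still parses to 0 and gets resized, which is what most systems want, while only an explicit "preserve" skips the resize.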