=== chihchun_afk is now known as chihchun
mupPR snapd#5459 opened: cmd/snap: add 'debug paths' command <Simple> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/5459>06:28
zygagood morning06:50
mborzeckizyga: hey06:56
mborzeckipstolowski: heya07:05
mvohey pstolowski and mborzecki07:08
mborzeckimvo: hey, have you seen https://forum.snapcraft.io/t/need-help-snapd-services-in-ubuntu-18-04/6205/4?07:08
mvomborzecki: good morning - not yet, let me look07:10
* zyga settles in for work07:12
zygatoday looks, finally, like an emerging summer day07:13
zygait's not expected to rain and temperatures will be a little above 2007:13
mvomborzecki: hm, hm07:14
mborzeckimvo: wouldn't that be caused by stuff running with 5minute iteration times?07:22
zygamvo: that's not great :/07:27
zygamvo: is seeding blocking 1st boot?07:27
zygamvo: or perhaps registration blocking 1st boot07:27
mvomborzecki: can you elaborate a bit, not sure I get what you mean. I think we have two problems: a) it's so slow - why is that? we don't seed on classic, so the snapd startup is slow. I saw earlier reports about missing entropy in VMs so that might be a clue; then we need to figure out why we need random data b) why we block login, i.e. we should not have wantedby=multi-user.target07:28
mvozyga: aiui this happens on every boot not just first boot07:28
mvozyga: in any case, I think we need to check what target we can use that is not blocking login07:29
zygamvo: if this is on every boot then something is very wrong07:29
zygaunless it fails and keeps retrying07:29
mvozyga: yeah, it *might* be entropy but it might also be something else, definitely not reproducible on my (real HW) box but I try a VM next07:30
zygamvo: I'm all-day-vm and I haven't seen that07:30
zygamvo: but I'm in vmware07:30
mborzeckimvo: thought that it's some ensure thingy, but i see that doMarkSeeded calls EnsureBefore(0) so it's probably good07:34
pedronismborzecki: yes, I added it so that registration starts immediately but nothing waits on registration afaiu07:36
pedronis(I mean refresh does but shouldn't be visible outside)07:36
mborzeckihm if release.OnClassic && !osutil.FileExists(seedYamlFile) {, is there a seed.yaml in the desktop images?07:36
pedronisthey do seeding in 18.04, no?07:37
pedronisI mean desktop07:37
mborzeckihmm maybe that gnome-calculator snap07:37
mborzeckistill doesn't explain why this happens on each boot :(07:38
pedroniseach is weird07:38
pedronisapparmor stuff ?07:38
pedronisdo we reload profiles for some strange reasons at each boot?07:38
mvomborzecki: do you know if there is a default systemd target that runs after multi-user?07:42
zygapedronis: on a new kernel we would but otherwise no07:42
mborzeckimvo: iirc graphical requires multi-user07:42
zygamborzecki: graphical.target07:42
mborzeckiyup systemctl cat graphical.target07:43
* zyga learned this while installing RHEL 7 07:43
mborzeckihaha ;)07:43
zygaany objections for merging https://github.com/snapcore/snapd/pull/544307:44
mupPR #5443: interfaces: treat "snapd" snap as type:os <Core18> <Created by zyga> <https://github.com/snapcore/snapd/pull/5443>07:45
zygamborzecki: I was thinking if we should build a script that collects some information for debugging07:45
zygafor people that come and say "foo is broken"07:46
zygawe ask snap version, os-release, a few others07:46
mborzeckizyga: a combination of all debug commands probably07:52
Chipacazyga: 'snap debug lies'07:56
zygaChipaca: snap debug report07:57
zygaI already like it07:57
zygabut I think it should be a shell script in case "snap debug" is broken07:57
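
A minimal sketch of the standalone report script discussed above, assuming it only needs the items mentioned (snap version, os-release) plus the kernel version. The function names and section headers are hypothetical, not an existing snapd tool. It keeps working when the snap command is missing or broken.

```shell
#!/bin/sh
# Hypothetical debug-report collector; names and layout are illustrative.

# Print a header so each part of the report is easy to spot.
section() { printf '== %s ==\n' "$1"; }

debug_report() {
    section "snap version"
    if command -v snap >/dev/null 2>&1; then
        # Do not abort the report if snap itself is broken.
        snap version 2>&1 || echo "snap version failed"
    else
        echo "snap command not found"
    fi

    section "os-release"
    cat /etc/os-release 2>/dev/null || echo "/etc/os-release not readable"

    section "kernel"
    uname -a
}

debug_report
```

Because it is plain sh with no dependency on snapd, the output can be pasted into a bug report even when "snap debug" cannot run.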
mvohm, askubuntu is interesting07:57
zygamvo: ?07:57
mvoone guy reports the problem *and* he did not upgrade snapd but the kernel and a bunch of unrelated packages07:57
mupPR snapd#5443 closed: interfaces: treat "snapd" snap as type:os <Core18> <Created by zyga> <Merged by zyga> <https://github.com/snapcore/snapd/pull/5443>08:01
mborzeckiChipaca: and you don't know if it's a verb or a noun ;)08:05
* zyga debugs interfaces-many08:10
zygapublic service announcement08:13
zygausing MATCH for if ... MATCH; then ... else .... fi is not good, it spams the log with garbage that you don't want to see08:13
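
A small sketch of the difference, assuming a MATCH helper that prints a diagnostic to stderr when the pattern is not found. The MATCH below is a hypothetical stand-in written for this example, not the definition shipped by spread.

```shell
#!/bin/sh
# Hypothetical stand-in for the MATCH helper: succeed if stdin matches the
# regexp, otherwise print a diagnostic to stderr and fail.
MATCH() {
    input=$(cat)
    printf '%s\n' "$input" | grep -q -E "$@" || {
        printf 'error: pattern not found on stdin:\n%s\n' "$input" >&2
        return 1
    }
}

# Noisy: a non-match is an expected outcome here, yet the diagnostic is
# still written to stderr and ends up in the test log.
if printf 'alpha\n' | MATCH beta; then echo found; else echo missing; fi

# Quiet: grep -q only sets the exit status, so the branch stays silent.
if printf 'alpha\n' | grep -q beta; then echo found; else echo missing; fi
```

Both branches print "missing"; only the first one also writes the error text.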
zygaalso running this test makes me want to run and optimize08:21
mvoso askubuntu indicates that using the older kernel makes the problem go away and the kernel was released two days ago08:24
mvoso that matches - *but* it's not clear why I can't reproduce it :(08:24
Chipacazyga: make 'snap foo' try to run snap-foo if foo isn't an internal command, and ship snap-debug-report?08:25
zygathat's sweet08:25
zygaactually /usr/lib/snapd/snap-foo maybe08:25
zygathat would be nice, too bad golang cannot do small executables08:26
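
The dispatch idea can be sketched in shell as follows. Everything here is hypothetical: the dispatch function, the list of internal commands and the HELPER_DIR variable are made up for illustration, with /usr/lib/snapd taken from the suggestion above.

```shell
#!/bin/sh
# Hypothetical sketch of "run snap-foo when foo is not an internal command".
HELPER_DIR=${HELPER_DIR:-/usr/lib/snapd}

dispatch() {
    [ $# -gt 0 ] || { echo "usage: dispatch <command> [args...]" >&2; return 2; }
    cmd=$1
    shift
    case "$cmd" in
        version|list)
            # Stand-ins for commands that are built in.
            echo "internal: $cmd"
            ;;
        *)
            helper="$HELPER_DIR/snap-$cmd"
            if [ -x "$helper" ]; then
                # Unknown command: hand over to the external helper.
                "$helper" "$@"
            else
                echo "error: unknown command \"$cmd\"" >&2
                return 1
            fi
            ;;
    esac
}

dispatch version
```

With this layout a shell script installed as snap-debug-report in the helper directory would be reachable as "debug-report" without touching the main binary.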
* zyga applied one simple apparmor optimization and goes to look for breakfast08:26
Chipacazyga: i thought you said it'd be a script?08:26
zygabut in general08:26
zygathis one would be a script, yes08:26
Chipacazyga: second breakfast i hope08:26
zygaI tend to shift my hours late08:28
zygaAnd I wake up later as well08:28
zygaWell apart from today at 5AM because $SUNSHINE08:29
* Chipaca off to take one of the boys to the doc08:35
mborzeckisimple review https://github.com/snapcore/snapd/pull/5459 anyone?08:40
mupPR #5459: cmd/snap: add 'debug paths' command <Simple> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/5459>08:40
zygaohh, nice08:43
pedronisJob for systemd-journald.service failed. See "systemctl status systemd-journald.service" and "journalctl -xe" for details.08:47
zygapedronis: I saw that on travis yesterday08:47
zygaweird-ish but I think not new08:47
zygavery unlikely combination of journal ops08:47
zygawe probably don't wait for it to stop and sync08:48
zygaand restart fails because it's running08:48
zygaor something08:48
mvoxnox: hey, good morning! a recent kernel update caused some issues with random numbers and that caused snapd to start super slow (because it now waits for entropy). what is the best way to a) ensure snapd starts b) but is not in the way of the login screen? do we need a different target (wantedby) than multi-user.target?08:51
=== chihchun is now known as chihchun_afk
zygatests are taking longer, I suspect they will pass now09:11
zygawhich is good :)09:11
niemeyermvo: I've merged the spread PR, with a small tweak so closing the old session is done right before assigning the new one, similar to what we have before.. that avoids multi-closing and leaving a closed session assigned.. please let me know if the old behavior was intended somehow09:14
niemeyermvo: Travis is also updated with the new logic09:15
mvoniemeyer: sounds correct, thank you09:15
=== chihchun_afk is now known as chihchun
niemeyermvo: Replied on https://forum.snapcraft.io/t/how-to-specific-the-kernel-snap-on-core18/5947/5 .. please let me know if we need anything else about this09:24
pedronismborzecki: mvo: this is happening quite a lot:  Job for systemd-journald.service failed. See "systemctl status systemd-journald.service" and "journalctl -xe" for details.09:25
pedronisI got two test runs in a row hitting that09:25
mvoniemeyer: thanks, I think this is perfect09:25
mborzeckipedronis: and on ubuntu-core only iirc09:25
pedronisseems so, but on random tests09:26
pedronisanyway is part of prepare09:26
mvopedronis, mborzecki thanks, sounds like we need to add debug code there. side-note: tests seem to be more unstable lately again, kind of annoying09:27
mborzeckipedronis: i saw one more with `find ..../state-lib/*`, seemed like a glob gone wrong09:27
* mvo meanwhile goes into a systemd fistfight09:27
mborzeckipedronis: ^^09:27
pstolowskiuh, google:ubuntu-16.04-64:tests/main/interfaces-contacts-service failure again09:28
mborzeckipstolowski: same as before?09:29
pstolowskimborzecki: i don't remember the previous failure. relevant bit is https://pastebin.ubuntu.com/p/p6vTVH5nxx/09:30
mborzeckipstolowski: nope, this one occurred randomly before (rarely though) and iirc i tracked it back to libevolution-blahblah09:31
pstolowskimborzecki: ack, not good.. thanks09:31
zygaChipaca: hey09:45
zygaquestion about the warnings09:45
Chipacazyga: 'sup09:45
Chipacazyga: sure09:46
zygashould we seek integration with desktop notification systems?09:46
zyga(where appropriate)09:46
zygaimagine we never open the CLI09:46
zygaand never see any warnings there09:46
Chipacazyga: I'd expect each client to keep track of warnings in its own way09:46
zygaso gnome-software09:46
zygathat's sane, yes09:47
Chipacazyga: e.g. gnome-software would keep a 'last warning seen' timestamp around, separate from snapd's09:47
Chipacazyga: even the acking mechanism could be different09:47
Chipacazyga: for example, gnome software might take 'ownership' of the warnings itself09:47
zygacould a warning be generated asynchronously by snapd itself09:47
Chipacazyga: whereas snap does not09:47
Chipacazyga: did not follow09:48
Chipacazyga: all warnings are generated by snapd itself09:48
zygayou are on an idle desktop, snapd fires a warning, a desktop notification shows up, you click on it and go to gnome-software showing the details there09:48
zygaChipaca: (without user action triggering it)09:48
Chipacazyga: that'd be gnome-software's integration work if so09:48
Chipacaalso that'd probably only happen after we got notifications working from snapd09:48
Chipacathat is09:48
Chipacagnome-software have asked for a no-polling way of getting stuff09:49
Chipaca(so if you install something from snap, and gnome software has a listing, it can update the listing for example)09:49
Chipacaor if it's showing a listing and a snap refreshes, it can update it09:49
Chipacaanyway, i need to step away again09:49
Chipacazyga: but, this thing should support all that (modulo the no-poll thing)09:50
xnoxmvo, it sounds like you want to fix the kernel, no?10:13
xnoxmvo, do you have the entropy bug? is snapd consuming entropy? (e.g. we had a bug where generating a few uuids could stall things)10:13
xnoxmvo, if a hardware number generator is available.... is it in use?10:14
xnoxmvo, and we do need to ensure that snapd starts when it does =/ because of cloud-init, etc.10:15
xnoxmvo, also, there were some kernels rolled back.10:16
mvoxnox: well, I follow #stable-kernel and have not seen anything rolled back yet. as for fixing: yes. however I wonder if we can mitigate this somehow by ensuring that snapd is not blocking login10:34
mvoxnox: I think it's an entropy problem, we use random numbers for various things, I need to dig where this exactly hangs though10:34
xnoxmvo, that's conflicting goals. as then you will break all the public clouds that preseed snaps and expect that sdk/agents are there, and working upon login.10:35
mvoxnox: hm, hm. fair point10:35
zygahonestly I think that is a special case10:35
zygaand it should not be default10:36
pedroniswell, we cannot really change it10:37
pedroniswe have the relationship setup that way since a while10:37
zygaI need 2nd review on https://github.com/snapcore/snapd/pull/545710:39
mupPR #5457: many: lessen the use of core-support <Core18> <Created by zyga> <https://github.com/snapcore/snapd/pull/5457>10:39
pedronismvo: also I'm not quite sure what we need entropy for after the first boot10:39
mvopedronis: yeah, I'm checking if I can find anything10:41
mvopedronis: catalogRefresh needs quite a bit of getrandom() data - that one is easy to delay. however we also have something that uses 4 bytes of getrandom() before main() which I'm looking at right now. aiui the problem is that the urandom entropy now only becomes non-blocking once a certain amount of real entropy is available to seed the prng (but I might be wrong here). so even those 4 bytes are a problem, blocking the boot10:55
zygamvo: are all our timer randomization things using real randomness?11:07
mvozyga: getrandom is pseudo random unless a flag is specified. but it looks like the bug is that urandom now is blocking until it's initialized with real entropy (my working theory so far)11:09
zygahmmm, I strongly doubt that is the case11:09
zyga(urandom blocking ever)11:09
mupPR snapd#5460 opened: tests: use grep to avoid non-matching messages from MATCH <Created by zyga> <https://github.com/snapcore/snapd/pull/5460>11:12
mupPR snapd#5461 opened: tests: "snap connect" is idempotent so just connect <Simple> <Created by zyga> <https://github.com/snapcore/snapd/pull/5461>11:13
mvozyga: well11:13
mvozyga: " If  the  urandom  source has not yet been initialized, then getrandom()11:13
mvo       will block, unless GRND_NONBLOCK is specified in flags.11:13
mvozyga: from the man-page11:13
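
One way to check whether this applies on an affected machine is to look for the point at which the kernel reports its random number generator as initialized. The helper below is a sketch that assumes the kernel log contains a "random: crng init done" line with the usual bracketed timestamp; the function name is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the
# timestamp (seconds since boot) of the first "crng init done" message.
crng_ready_time() {
    sed -n 's/^\[ *\([0-9.]*\)] random: crng init done.*/\1/p' | head -n 1
}

# Reading the kernel log may need root; print nothing if it is unavailable.
dmesg 2>/dev/null | crng_ready_time || true
```

If the printed time is late in the boot, anything calling getrandom() without GRND_NONBLOCK before that point would have been waiting.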
zygathat's very interesting then11:14
zygaso urandom is not really that reliable after all11:15
mvozyga: it looks like it, I'm digging. we are not the only ones affected I'm trying to figure out a fix for the common case11:15
mvozyga: seeding will be hard though, here we need urandom11:16
zygamvo: can we pollinate from snapd?11:16
mvozyga: but at least we should not block in the already-seeded case11:16
zygamvo: if urandom blocks then we do what pollinate does11:16
mvozyga: an interesting idea! the kernel team did investigate pollinate and they suspect it might be buggy though11:16
zygafun :)11:16
zygaI'll go to core18 topics for now11:16
mvozyga: I did not follow that11:16
zygagood luck11:16
mupPR snapd#5403 closed: many: use extra "releases" information on store "revision-not-found" errors to produce better errors <Squash-merge> <Created by pedronis> <Merged by pedronis> <https://github.com/snapcore/snapd/pull/5403>11:20
mborzeckihmmpf, working on some changes in snapstate, broke aliases not even touching that code :/11:23
pedronismborzecki: aliases are delicate11:26
pedronismborzecki: let me know if I can help11:27
* Chipaca ~> lunch11:38
* zyga tests a switch over to internal LTE11:48
pedronismvo: why does catalogUpdate delay startup? it's a goroutine, no?12:00
mvopedronis: because it accesses getrandom12:02
pedronisand it blocks everything?12:02
pedronissorry, I'm dense but still not understanding12:03
=== alan_g is now known as alan_g|lunch
mvopedronis: sorry, let me give a bit more context12:03
mvopedronis: the latest kernel update makes reading from urandom block at early boot12:03
mvopedronis: until it has a certain amount of entropy12:03
mvopedronis: that seems to be a regression and a bug but it's not totally clear yet, the kernel team is researching this, it might be a valid security fix12:04
pedronisI understand12:04
mvopedronis: I looked into why we need urandom at startup of snapd12:04
pedronisbut I understand why getrandom from some init code would block stuff12:04
pedronisI don't understand why getrandom from a goroutine not in the main daemon start path would12:04
pedroniswe do systemdSdNotify READY  from  daemon.Start12:05
mvopedronis: I need to look, maybe the catalog-update is not the problem. there is another reader of getrandom() (bson.go) which happens during "func init()" time12:05
mvopedronis: I just wrote it in the forum, its two places right now12:06
pedronisyes, I understand12:06
pedronisI see the problem with init12:06
pedronisnot sure I understand the other one12:06
pedronisunless go is starved somehow of threads for os calls12:07
ogra_hmm, wasn't the purpose of urandom (vs random) to be non-blocking ?12:10
ogra_(sounds like a kernel bug all over, why would someone change that)12:11
=== pstolowski is now known as pstolowski|lunch
mvopedronis: if the theory is correct it does not even enter main() because the init code in bson.go reads urandom - or am I misunderstanding you maybe :) ?12:14
pedronismvo: yea, then why would catalog-update matter?12:14
mvoogra_: yeah, I'm simplifying here a bit but the getrandom syscall man page explains that it may block if the prng is not initialized yet12:14
mvopedronis: maybe/probably it does not12:15
mvopedronis: sorry, I was just hunting for the sources of what reads getrandom12:15
mvopedronis: I don't run into the bug myself so I can't test my theory :/ but I will add code that does12:15
mvo(that does test my theory)12:15
ogra_mvo, well, but thats new behaviour and getrandom can be switched back to the old one if GRND_NONBLOCK is set as i understand12:18
mvopedronis: you are correct, catalog-update does not matter -12:23
mvopedronis: sorry for that12:23
mvoogra_: correct, GRND_NONBLOCK is not used by the golang runtime though and we can not control that12:23
ogra_evil ...12:24
mvoogra_: yes12:27
mvopedronis: I updated the forum message - it's just bson.go it seems that we would have to fix to work around the issue12:28
mvopedronis: anyway lets talk in the standup12:29
=== alan_g|lunch is now known as alan_g
=== pstolowski|lunch is now known as pstolowski
* Chipaca on his way13:01
Chipacamvo: is #1779948 the same issue again?13:25
mupBug #1779948: Snapd gets stuck when starting Ubuntu. <amd64> <apport-bug> <bionic> <snapd (Ubuntu):New> <https://launchpad.net/bugs/1779948>13:25
mvoChipaca: probably, let me look13:28
zygaha, this is fun13:32
zyga while systemctl restart systemd-journald; do :; done13:32
zygathis fails nearly instantly13:32
* zyga investigates13:32
Chipacapstolowski: you reminded me of https://www.facebook.com/jesse.newton.37/posts/776177951574 (viewable in incognito if you don't use the book of faces)13:56
mborzeckipstolowski: hahah https://github.com/snapcore/snapd/pull/5416/files#diff-9c91792e9bb71d29a3ae728fc544152fR4814:04
* zyga goes to make some coffee14:04
mupPR #5416: interfaces/hotplug: add hotplug Specification and HotplugDeviceInfo <Created by stolowski> <https://github.com/snapcore/snapd/pull/5416>14:04
zygaChipaca: I know that story, it's terrible14:04
pstolowskiChipaca: lovely, rotfl :). and btw, irccloud kindly gave me the entire content inline, they do that apparently for facebook too14:05
zygapstolowski: irc cloud is nice, eh?14:06
zygaare you using the snap or the web browser to use it?14:06
pstolowskimborzecki: yeah i do that when i run out of foo & bar vocabulary ;)14:06
zygamy only issue with it is lack (apparent) of any way to set the font I want14:06
pstolowskizyga: i didn't know of a snap; i just use it in the browser. yes it's nice, i haven't looked back since i started my subscription a few months back14:07
mborzeckiChipaca: found the problem with aliases :( magic name handling in fakeSnappyBackend.ReadInfo14:11
pedronismborzecki: that's used for everything though, is not just aliases14:16
pedroniswe fake various snaps there14:16
Chipacamborzecki: I'm still grinning evilly, here14:19
mborzeckipedronis: yeah, i just missed a little `if strings.Contains(snapName, "alias-snap") {` which changes the name :(14:23
mborzeckifunny that it worked until it hit reenabling of manual aliases which looks in info.Apps[], which was obviously an empty map14:25
mborzeckianyone up for a simple review of #5459?14:42
mupPR #5459: cmd/snap: add 'debug paths' command <Simple> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/5459>14:42
mupPR snapd#5462 opened: many: use extra "releases" information on store "revision-not-found" errors to produce better errors (2.34) <Created by pedronis> <https://github.com/snapcore/snapd/pull/5462>14:45
mborzeckipedronis: i've resolved the conflicts in #5452 and force pushed15:09
mupPR #5452: store, overlord/snapstate: introduce instance name in store APIs <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/5452>15:09
pedronismborzecki: thx,  I will look tomorrow morning I think15:10
mborzeckipedronis: great, thanks!15:10
mupPR snapd#5463 opened: Optimize snap install time 1 <Created by alfonsosanchezbeato> <https://github.com/snapcore/snapd/pull/5463>15:12
mupPR snapd#5464 opened: vendor: switch to latest bson <Created by mvo5> <https://github.com/snapcore/snapd/pull/5464>15:25
* Chipaca grins15:30
mupPR snapd#5465 opened: daemon, overlord/state: warnings pipeline <Created by chipaca> <https://github.com/snapcore/snapd/pull/5465>15:30
=== chihchun is now known as chihchun_afk
* cachio lunch15:44
=== pstolowski is now known as pstolowski|afk
* ogra_ hugs popey 16:11
ogra_(for also doing an armhf build of xonotic !)16:11
popeybet that doesn't work16:12
popeywe're dumping their pre-built binaries, not building from source16:12
ogra_don't laugh, my next objective is kiosk systems so after i have a proper chromium kiosk image for the pi (which might still take a lot of work, it is definitely not accelerated atm) i'll move on to kodi and xonotic ;)16:13
ogra_i dont see why it wouldnt work ...16:13
ogra_anyway ... as a xonotic junkie i'm happy to see we have a snap now16:14
popeyi see threads suggesting it could work16:15
popeythe snap is smaller than the upstream zip too :)16:15
ogra_ha !16:15
popey(and stays compressed of course, double bonus)16:16
ogra_the little ram might be an issue while gaming16:17
ogra_though if it fully utilizes the GPU it should work16:18
cachiozyga, do you have a dragonboard with you?16:22
cachioI can reproduce the error of MATCH16:23
cachiomvo, do you have one?16:23
zygacachio: no, I don't :-(16:29
zygacachio: well, I do back in my office16:29
zygaI could get it online shortly but not instantly16:29
zygacachio: can you tell me more about the match issue?16:29
cachiozyga, the test interfaces-bluetooth-control is failing on dragonboard16:30
cachioit fails when it does MATCH "Permission denied" < btusb.error16:31
cachiobut when I debug it, the file contains the string16:31
cachiothen, if I change the match by a grep it works16:31
cachiozyga, it is super weird16:32
cachioI'll run with -vv to see the details16:32
zygacachio: that's the same as the "^test:" string we've seen elsewhere16:33
zygaI don't think it's specific to any hardware16:33
pedronisdo we get the wrong MATCH definition in some context?16:34
cachiozyga, the test just works on dragonboard16:34
pedronisit's starting to be too strange16:34
zygacachio: can we add a trace to what MATCH does16:34
zygacachio: yes but the user check is universal16:34
zygapedronis: I doubt that, it is defined by spread16:34
zygapedronis: definitely we don't have one that's nearly the same but inverts one bit of logic while keeping everything else16:35
pedroniszyga: well,  I see tests/lib/spread-funcs16:35
zygalook inside16:35
pedronisthat is used sometimes16:35
zygathe only definition is that from spread16:35
pedronisare we getting confused by that16:35
zygaand this is not new, we had that for months16:35
pedronismaybe something changed16:36
zygait'd be interesting to see if we can run tests from last month and hit this16:36
cachioin the debug session I defined MATCH as spread does16:38
cachioand I run the same line which is failing during the test and it works16:38
cachiozyga, pedronis, I'll make a change to spread to add some debug info16:41
zygacachio: maybe just set -x16:41
pedroniswe can also try to do declare -pf MATCH somewhere close to where we get those errors16:41
zygaor redirect to a file16:41
zygapedronis: good idea16:41
zygamaybe in that prepare logic16:42
zygathat seems to be hit very often16:42
zygawe can also re-define MATCH just ahead of that16:42
zygathough I think it must be something that is racing in the system16:42
zygain a way that we don't understand16:42
zygathat impacts MATCH16:42
zygabut I cannot put my finger on anything that could do something like this16:43
zygaone thing to, perhaps, think about16:43
zygais a very obscure mechanism in bash (and maybe dash)16:43
zygathat "inherits" function definitions16:43
zygafrom one shell to another16:43
zygabut this still doesn't explain why it is racy16:43
zygaespecially when invoked with a file16:43
zygawe could also try to copy /etc/passwd to /tmp/INSANE16:43
zygaand MATCH that to isolate from anything writing to passwd16:44
zygasome ideas to explore16:44
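
Two of the ideas above can be tried in isolation: printing the definition a shell currently holds, and checking how bash passes function definitions to child shells. The sketch below uses a hypothetical one-line MATCH rather than spread's real helper, and runs under bash explicitly because "declare -f" and "export -f" are bash features.

```shell
#!/bin/sh
# Feed the demonstration to bash so it behaves the same from any POSIX shell.
show_inheritance() {
    bash <<'EOF'
# Hypothetical stand-in for the MATCH helper, not spread's real definition.
MATCH() { grep -q -E "$@"; }

# Print the definition this shell currently holds.
declare -f MATCH

# Export the function so child bash processes inherit it.
export -f MATCH
bash -c 'echo "Permission denied" | MATCH "Permission" && echo "child inherited MATCH"'

# A child can still replace the inherited definition with its own.
bash -c 'MATCH() { return 1; }; echo x | MATCH x || echo "child overrode MATCH"'
EOF
}

show_inheritance
```

Dumping the definition right before a failing check would show whether the shell is running the expected MATCH or one it picked up from a parent process.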
cachiozyga, ok, working on that16:51
cachiolet's see what's going on16:51
zygathank you!16:52
mupPR snapd#5466 opened: tests: remove extra ' which breaks interfaces-bluetooth-control test <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/5466>17:24
mvocachio: sorry, I don't have a dragonboard17:41
cachiomvo, np17:54
zygaChipaca: are we there yet ;-) https://twitter.com/c___f___b/status/101452917974281011217:54
* zyga break18:33
mupPR snapd#5467 opened: tests: stop restarting journald service on prepare <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/5467>18:58
mupPR snapd#5461 closed: tests: "snap connect" is idempotent so just connect <Simple> <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/5461>19:22
mupPR core#92 closed: Remove core-support plug <Created by zyga> <Merged by mvo5> <https://github.com/snapcore/core/pull/92>19:32
* cachio afk19:40
mupPR snapd#5279 closed: interfaces/builtin: create can-bus interface <Created by jocave> <Merged by niemeyer> <https://github.com/snapcore/snapd/pull/5279>20:09
