[05:11] morning
[05:57] trivial review #5550
[05:57] PR #5550: spread: switch Fedora and openSUSE images (2.34)
[06:13] mvo: hi
[06:14] mborzecki: hey, good morning
[06:15] mvo: i have a vague recollection that you were supposed to be off today :)
[06:15] mborzecki: I'm not really here today but the SRU is still pending and I need to do some testing to ensure it can go in in time :/
[06:15] mborzecki: so yeah, hopefully not here for long
[06:15] mvo: aah
[06:16] mvo: while at it, #5550 is really simple, i hope it fixes the issues with fedora we're seeing in 2.34
[06:16] PR #5550: spread: switch Fedora and openSUSE images (2.34)
[06:17] mborzecki: oh, nice!
[06:17] mvo: we somehow ended up using a different image in the 2.34 branch, and i'd guess only the one used in master branch got updated
[06:17] mborzecki: yeah, makes sense
[06:17] mborzecki: thanks for finding that
[06:17] mborzecki: long standing branches are always a bit of a pain :/
[06:19] mborzecki: feel free to merge once it's green
[06:20] mvo: ack
[06:21] good morning
[06:22] * zyga sort of feels better now
[06:24] mvo: and merged, i've restarted #5545
[06:24] PR #5545: snapstate: allow setting "refresh.timer=managed" (2.34)
[06:24] zyga: hey
[06:24] hey, how are things?
[06:32] zyga: good morning! how are you? well rested?
[06:34] mvo: so so but better, this trip was pretty unfortunate (lots of delays, lots of gate changes, terrible seats next to crying babies)
[06:34] mvo: but I'm much better than yesterday :)
[06:34] mborzecki, zyga: we have a bit of a problem - on a fresh install via http://cdimage.ubuntu.com/bionic/daily-live/current/ when doing "snap install gedit" it hangs forever in auto-connect
[06:34] zyga: trip> meh, not good
[06:34] hmm hmm, we had this bug before right?
[06:35] mvo: do we have any logs?
[06:35] auto-connect task would spin in a loop, failing and trying again
[06:36] I have a hunch I know what causes mount failures
[06:36] the "https://forum.snapcraft.io/t/unexplained-mount-failure-protocol-error-what-we-know-so-far/5682/7" issue
[06:36] increased parallelism
[06:37] if allocating a loopback device is racy
[06:37] then we have two racing tasks that somehow end up both using one /dev/loopX
[06:37] fwiw, happens with 2.33 and 2.34
[06:38] http://paste.ubuntu.com/p/DDCFVn7d4d/
[06:38] zyga: yeah that sounds sensible
[06:38] mborzecki: the pastebin is the change changes output
[06:38] mvo: sadly we don't know what happens inside that task
[06:39] mvo: as in, there is no log of attempts and failures
[06:39] http://paste.ubuntu.com/p/VfKnHtHkzy/
[06:39] zyga: yeah
[06:39] hmm
[06:40] spins locally too
[06:40] mvo: the seed order is wrong no? we told them to fix it but they haven't
[06:41] pedronis: oh, right
[06:42] mvo: or the last change in the gnome snap needs yet a different order
[06:42] pedronis: yeah, task <1> is also in doing
[06:42] pedronis: I think that explains it
=== chihchun_afk is now known as chihchun
[06:46] pedronis, mborzecki, zyga I think pedronis is right and the hang is just a side effect of the incorrect seed order. so less urgent and we can wait for the new ISO build with the seed order fix
[06:47] mvo: do we know they fix the order?
[06:48] pedronis: yeah, they added a workaround in livecd-rootfs
[06:48] pedronis: where gtk-themes-common is sorted first now
[06:48] ok
[06:48] pedronis: I hope it actually works, it was a bunch of shell
[06:48] mvo: btw I'm happy for us to fix it, but I think for 2.34 it is too much of a risk (it would need to add a bunch of code)
[06:49] we can try to have something for 2.36
[06:49] pedronis: +1
[06:49] 2.34.2 is now in bionic-updates
[06:50] so we will be on the 18.04.1 iso :)
[06:50] thx
[07:00] mvo: #5545 is finally green :)
[07:00] PR #5545: snapstate: allow setting "refresh.timer=managed" (2.34)
[07:05] small coffee break to be more awake
[07:31] zyga: when you're more awake, could I get your opinion on this? https://forum.snapcraft.io/t/should-v2-interfaces-select-connected-return-unconnected-plugs-slots/6455
[07:32] I ran into it last week, and the suggestion was to wait for you to get back from the sprint
[07:39] zyga: with this weather some cold brew would be great :P
[08:17] re, sorry, this took a while
[08:17] jamesh: looking now
[08:18] zyga: no problem. I was getting over my own jetlag last week :)
[08:18] jamesh: aha, curious, let me look at the code
[08:20] jamesh: so, the output is as intended, I think what is missing is connection-level information (that was supposed to be returned by the never-implemented "snap connections")
[08:21] jamesh: the last part I don't fully follow, the /v2/interfaces endpoint has two GET behaviors, legacy (which is more connection oriented) and non-legacy which is interface (not plug/slot but actual interface) oriented
[08:21] zyga: I was just trying to work out what users would want the current behaviour
[08:21] jamesh: "Also, if one output mode is referred to as getLegacyConnections, should it be considered a bug that you can’t get the equivalent information from the non-legacy interface?"
- this is the part that I don't understand
[08:21] zyga: I'm not particularly concerned what the plug is connected to: just whether it is connected
[08:22] jamesh: right, the thing is that the /v2/interfaces?select=connected returns _interfaces_ for which a connection exists between a plug and a slot
[08:22] jamesh: currently there is no way to limit plug and slot references it returns (when prompted) to just those that actually have a connection
[08:23] jamesh: the behavior is specifically modeled for "snap interface" to show a smaller list of interfaces (just those that have some actual connections established), unlike what "snap interfaces" (plural) does
[08:23] zyga: so the only way to find out if one particular plug is connected is by asking snapd to serialise every single connected plug/slot on the system and then filter it client side :(
[08:24] jamesh: still there is no great way to show the state of a particular plug or slot
[08:24] jamesh: I think we are open to improvements, I'm just explaining why it is the way it is now
[08:24] jamesh: I think we need a connection-oriented view
[08:24] jamesh: and perhaps that would also answer the query you are after
[08:24] jamesh: what is it specifically that you want to access?
[08:25] (or perhaps what is the UX that this is a part of)
[08:25] zyga: "does snap A have a connected plug of interface type X?"
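The client-side filtering described above (fetch everything, then answer "does snap A have a connected plug of interface type X?") could be sketched roughly as follows. This is only an illustration: the payload shape (a "plugs" list whose entries carry "snap", "plug", "interface" and an optional "connections" list) is an assumption about the legacy /v2/interfaces result, and the snap and interface names in the sample are hypothetical. No request to snapd is made here.

```python
from typing import Any, Dict


def has_connected_plug(legacy: Dict[str, Any], snap: str, interface: str) -> bool:
    """Return True if `snap` has at least one connected plug of type `interface`.

    `legacy` is assumed to look like the "result" object of the legacy
    /v2/interfaces response; that shape is an assumption, not a spec.
    """
    for plug in legacy.get("plugs", []):
        if plug.get("snap") != snap or plug.get("interface") != interface:
            continue
        # A missing or empty "connections" list means the plug is unconnected.
        if plug.get("connections"):
            return True
    return False


# Hypothetical sample data, not captured from a real system.
SAMPLE = {
    "plugs": [
        {"snap": "snap-a", "plug": "mic", "interface": "x",
         "connections": [{"snap": "core", "slot": "x"}]},
        {"snap": "snap-a", "plug": "net", "interface": "y"},
    ],
    "slots": [],
}

print(has_connected_plug(SAMPLE, "snap-a", "x"))
print(has_connected_plug(SAMPLE, "snap-a", "y"))
```

Matching on the interface type rather than the plug name means a snap with several plugs of the same type counts as connected if any one of them is.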
[08:25] jamesh: interesting, of interface type X or of name X
[08:26] one thing I was looking at just last week was that in spread tests we often grep for stuff while we just want to answer "is this connected"
[08:26] snap is-connected snap-name:plug-or-slot-name
[08:26] this is for a trusted helper type use, making a decision based on the interface connection state
[08:26] and I was thinking we should implement the connections endpoint which returns all connections, with enough filtering to just answer that
[08:26] I see
[08:27] jamesh: note that in theory one snap can define multiple plugs or slots of the same type
[08:27] (as long as those have different names)
[08:27] zyga: the particular use case is having pulse audio restrict access to microphones unless a particular interface is connected
[08:27] in conversations a long time ago we wanted "snap connections" to essentially return a set of tuples, one per connection
[08:27] ah, right
[08:27] since we can't handle that kind of thing at the AppArmor level
[08:28] jamesh: so I think that right now we don't have a nice API for that
[08:28] jamesh: I think we should write one (along with snap-connections)
[08:28] jamesh: as the only other alternative is the legacy endpoint and client side filtering
[08:30] okay
[08:54] is mvo around today?
[09:07] zyga: he is officially off
[09:07] ah, I didn't know
[09:07] just today or for a week?
[09:08] just today I think
[09:09] ok
[09:43] hm building amzn2 using common fedora spec is triggering issues when generating selinux policy files, looks like selinux-policy is missing map for file class :(
[09:52] I have code that refuses to work locally if I imported from snapd, but works elsewhere imported from purportedly the same code, and works locally if I paste it into somewhere else. i've removed pkg/ from my gopath and no change. Strace doesn't seem to point to any weird imports. Any ideas?
[09:54] vendored dependencies?
[09:55] hmmm
[09:55] jamesh: moving aside vendor does make the code work!
[09:57] jamesh: but that makes even less sense: the code I'm running is in one of the unit tests that has to be using the vendored code
[09:57] but as soon as I run govendor sync again, it fails again
[09:58] jamesh: code is https://pastebin.ubuntu.com/p/rtv8QH34qB/ fwiw
[09:58] although I'm going to assume it's something local
[09:59] Chipaca: either (a) you're running into a bug in a dep that has been fixed since the revision govendor pulls, or (b) something bad happens when two copies of the package exist in the same process
[10:00] anything github.com/snapcore/snapd/strutil pulls in will reference github.com/snapcore/snapd/vendor/gopkg.in/yaml.v2
[10:01] by "doesn't work", do you mean crashes, or fails to compile?
[10:02] jamesh: yaml: unmarshal errors: line 2: cannot unmarshal !!map into yaml.MapSlice {[] map[]}
[10:02] jamesh: option (c) :-)
[10:04] Chipaca: Looking at the yaml code, it has a branch of code dependent on "reflect.TypeOf(MapItem{})"
[10:04] indeed it fails as soon as I copy gopkg.in into vendor/
[10:05] Chipaca: so the MapItem as seen by strutil.OrderedMap is different to the MapItem the non-vendored yaml package sees
[10:05] Chipaca: presumably you could change your program to instead import "github.com/snapcore/snapd/strutil/vendor/gopkg.in/yaml.v2"
[10:06] I thought go blocked that
[10:06] lemme see
[10:06] imports github.com/snapcore/snapd/strutil/vendor/gopkg.in/yaml.v2: must be imported as gopkg.in/yaml.v2
[10:07] eh, nevermind
[10:07] jamesh: now I understand it's not my go installation broken in weird ways (although arguably all installations are, given this), i can stop worrying
[10:07] anyway.
You can see that strutil.OrderedMap.UnmarshalYAML asks the non-vendored yaml package to unmarshal into the vendored yaml.MapSlice type
[10:08] which fails because it sees the vendored yaml.MapSlice as just some random third party type rather than something to handle specially
[10:10] jamesh: i thought it was the other way around: the u function will be unvendored (as it's provided by the caller which is outside of snapd, so unvendored) and the MapSlice will be vendored (as it's imported by strutil which will use the vendored)
[10:10] that's what I said, isn't it?
[10:10] ah maybe that's what you said
[10:10] :-)
[10:10] jamesh: yes
[10:10] jamesh: I'm easily confused, it seems
[10:40] Mornings
[10:43] niemeyer: o/
[10:53] could somebody run the unit tests on cmd/snap on master a few (10?) times and tell me if it fails?
[10:54] it fails here, at least once every 5-10 times; and it's failing every _single_ time on #5506 :-(
[10:54] PR #5506: cmd/snap: add a green check mark to verified publishers
[10:54] completely unrelated to the colour green :-(
[10:55] the error is in cmd_aliases_test, ... value client.ConnectionError = client.ConnectionError{error:(*url.Error)(0xc82034e030)} ("cannot communicate with server: Get http://127.0.0.1:34111/v2/aliases: dial tcp 127.0.0.1:34111: getsockopt: connection refused")
[11:00] Chipaca: got it on the first run
[11:00] Chipaca: on master
[11:23] niemeyer: can you upload the latest spread to s3? i've opened a pr for amazon linux but spread is complaining about invalid size string: "preserve-size"
[11:24] mborzecki: Ah, indeed I haven't updated, waiting for feedback on whether it worked
[11:24] cachio: have you had a chance to try it out?
[11:24] niemeyer: it worked :)
[11:24] mborzecki: There you go :)
[11:24] mborzecki: Updating
[11:25] niemeyer: thanks!
[11:25] niemeyer, yes, I updated the amazon image using this one
[11:25] mborzecki: Done, please let me know
[11:26] hey niemeyer
[11:27] how are you feeling?
[11:27] in case anyone wants to take a look #5552
[11:27] I had a rough ride home, I haven't felt this tired after returning from a sprint in a while
[11:27] PR #5552: (WIP) Amazon Linux 2 packaging and spread tests
[11:38] zyga: Yo
[11:40] zyga: Feeling pretty reasonable.. don't have much time to feel tired this week :)
[11:40] zyga: The sprint was intense indeed, though
[11:40] I blame the short sessions, in part
[11:43] indeed, lots of task switching during the week
[11:54] pedronis: did you get a chance to take another look at #5434 ?
[11:54] PR #5434: overlord: introduce InstanceKey to SnapState and SnapSetup, renames
[11:59] mborzecki: I think I skimmed it again, but didn't do a full re-review
[12:00] pedronis: do you think you could do it this week? it'd be nice to land it before you're off for vacation :)
[12:01] pedronis: i'm updating the store pr too, should be pushing the changes later today/tomorrow
=== chihchun is now known as chihchun_afk
[12:10] * zyga -> walk
[12:15] mborzecki: yes, not today though, more likely tomorrow
[12:15] pedronis: works for me, thank you!
[12:25] * zyga actually goes for that walk...
[13:18] Saviq, yesterday I updated images, please tell me if you see any error
[13:30] zyga: very nice umbrella :P
[13:49] mborzecki: I have a snap package for a program that uses vulkan but it complains about not being able to find the vulkan loader. I was told you were the person to talk to. Any suggestions? :)
[13:49] hunterk: is it published?
[13:49] yes, retroarch
[13:51] it initially launches with a GL renderer, but I can walk you through enabling the Vulkan renderer if you like
[13:52] hunterk: let me check my notes, iirc there was some fishy stuff about vulkan and how it finds icd files
[13:52] kk
[13:52] i have to run to a meeting, so no hurry
=== plars_ is now known as plars
[14:29] Pharaoh_Atem: hey
[14:29] around?
[14:29] zyga: yes?
[14:29] hey, I
[14:30] hey, I'm looking into f29 base snap now, I've started playing with image factory, trying to get it to do _basic_ things (whatever those are) in a way I understand
[14:30] I wanted to quickly sync with you if that is the right way to start
[14:31] Pharaoh_Atem: my plan is to write a plugin for it (called snapcraft) that builds a base snap according to the stuff in the template (still hand-wavy at this stage)
[14:31] sounds like a good strategy
[14:32] zyga, do you know why we are installing linux-image-extra-* ?
[14:32] Pharaoh_Atem: I don't know what the constraints are, I suspect more modern things may be a dependency problem, I plan to use python 2.7 and shell for now
[14:32] zyga, what do we need from it?
[14:32] cachio: AFAIK for the extra drivers, but not specifically
[14:32] * Chipaca takes a break from bashing his head on tests and goes get a cuppa
[14:33] zyga: you can email Brendan Reilly about it
[14:33] bah
[14:33] Brendan Reilly
[14:33] zyga, ok, because there is not a package for the latest update on ubuntu 16.04-64
[14:33] on gce
[14:33] cachio: I don't know what that means,
[14:33] zyga: I'd plan on py3 compatible py2 code, since I think imgfac is going to be ported soon
[14:34] are you saying the package is out of sync somehow?
[14:34] it is built as a part of the kernel
[14:34] zyga, we are installing this on snapd test suite
[14:34] linux-image-extra-$(uname -r)
[14:34] Pharaoh_Atem: that's a good hint
[14:34] Pharaoh_Atem: who is Brendan?
[14:34] but the last kernel is 4.15.0-1014-gcp
[14:34] he's the maintainer and main developer of Image Factory/Oz
[14:35] at least for the last two releases, he's been the guy making them
[14:35] zyga, what I can do is to install 4.15.0-15 to make that work
[14:35] zyga: https://github.com/redhat-imaging/imagefactory/blob/master/imagefactory.spec#L132-L147
[14:35] cachio: sorry, I don't know enough about the problem to help you
[14:35] zyga, ok, np
[14:35] I'll fix it
[14:35] Pharaoh_Atem: ah, I see
[14:36] Pharaoh_Atem: do you expect we will need to make changes outside of image factory in order to get the base snap building in place?
[14:36] we may need to touch pungi and koji
[14:36] cachio: I know we are installing it in the test suite but I don't know what the problem is really, is the package uninstallable?
[14:37] https://pagure.io/pungi & https://pagure.io/pungi-fedora; https://pagure.io/koji
[14:37] Pharaoh_Atem: to ping imagefactory or to do some other things?
[14:37] oz is run by koji, which is kicked off by pungi
[14:37] and how does imagefactory fit into this?
[14:38] for example: https://koji.fedoraproject.org/koji/buildinfo?buildID=1130195
[14:38] zyga, the problem is that when I update the image for xenial 64, there is no linux-image-extra for the new kernel installed
[14:39] zyga: imagefactory is run as a koji task
[14:39] cachio: I would ask the kernel team about this
[14:39] zyga, ok, makes sense
[14:39] Pharaoh_Atem: and how does this relate to pungi?
[14:39] pungi is the tool that actually kicks off all the koji tasks
[14:40] and tells the tools what they should do
[14:40] so koji can build things
[14:40] by deferring to imagefactory
[14:40] but it has no scheduler
[14:40] so pungi is doing that?
[14:40] koji is the build system, imgfac is the tool, and pungi is the orchestrator
[14:40] pungi -> koji -> imgfac
[14:40] ok
[14:40] thanks, I see now
[14:41] so I started with the right place it seems :)
[14:41] hmm, did we have any recent snapctl changes ?
[14:41] zyga: you can also ask mboddu in #fedora-releng for more details ;)
[14:41] https://paste.ubuntu.com/p/9Qg7szthFq/
[14:41] i'm seeing "snapctl: Permission denied" errors
[14:41] Pharaoh_Atem: for now I think this is enough to keep me busy but I will write that down, contact points are useful
[14:42] Mohan will be at Flock, too
[14:42] (this has worked before last edge update of core i think)
[14:42] ogra_: it moved from /usr/bin/ to /usr/lib/snapd/ so maybe apparmor is out of sync somehow
[14:42] see if you have any denials
[14:42] Pharaoh_Atem: Mohan == mboddu?
[14:43] yeah
[14:43] his name is Mohan Boddu
[14:43] heh, surely a gazillion (but from other stuff ...)
[14:43] is he responsible for koji and friends?
[14:43] ogra_: look for specific denial for snapctl
[14:43] [ 1007.643957] audit: type=1400 audit(1532443192.989:1239): apparmor="DENIED" operation="exec" profile="snap.chromium-mir-kiosk.hook.configure" name="/usr/lib/snapd/snapctl" pid=4522 comm="configure" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
[14:43] [ 1007.710435] audit: type=1400 audit(1532443193.053:1240): apparmor="DENIED" operation="exec" profile="snap.chromium-mir-kiosk.hook.configure" name="/usr/lib/snapd/snapctl" pid=4529 comm="configure" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
[14:43] yeah
[14:43] should another reboot help here ?
[14:44] no, one sec
[14:44] (i'm freshly booted after core update)
[14:44] can you check if /var/lib/snapd/apparmor/profiles/snap.chromium-mir-kiosk.hook.configure talks about snapctl (grep for it)
[14:44] ogra_: if it doesn't then i think this is a bug
[14:45] ogra@pi3:~$ sudo grep snapctl /var/lib/snapd/apparmor/profiles/snap.chromium-mir-kiosk.hook.configure
[14:45] # snapctl and its requirements
[14:45] /usr/bin/snapctl ixr,
[14:45] /usr/lib/snapd/snapctl ixr,
[14:45] ogra@pi3:~$
[14:45] looks like it is there
[14:45] with the correct path
[14:45] that's interesting
[14:45] can you run apparmor_parser -r on that file (as root)
[14:45] and see if that fixes it
[14:46] I wonder if we have a cache issue
[14:46] it's also likely the file is correct now
[14:46] oh
[14:46] so run configure again
[14:46] maybe it will work without running apparmor_parser
[14:46] if it doesn't then do run apparmor_parser
[14:46] and then try to run configure again
[14:46] i did run configure a few times already
[14:46] ok
[14:46] and it fails, right?
[14:47] yes, running the parser now
[14:47] now the configure works
[14:47] ha
[14:47] ogra@pi3:~$ snap set chromium-mir-kiosk disablekiosk=true
[14:47] ogra@pi3:~$
[14:47] can you reproduce that issue?
[14:47] no issues
[14:48] I mean, can you get to a state where it happens again
[14:48] phew ... that takes a while (takes minutes to install the chromium snap)
[14:48] well, i'd remove and reinstall the snap
[14:48] not sure if that changes anything though
[14:49] the image is a week old or so and silently updated core but nothing else when i applied network 30min ago
[14:49] not sure if i can repro that state
[14:50] bah, dang !
[14:50] ogra@pi3:~$ snap remove chromium-mir-kiosk
[14:50] error: cannot remove "chromium-mir-kiosk": snap "chromium-mir-kiosk" is not removable
[14:50] it is in the model assertion as required snap ...
[14:51] so no way i could try to repro that by removing the snap
[15:08] cachio: ack, thanks for the heads up (down?) ;)
[15:09] isn't that also called "nodding" ?
[15:09] "heads up (down)"
[15:13] pedronis: pushed fixes to the store PR too, thanks for the review comments, there was indeed an error in mapping install errors there
=== jkridner|pd is now known as jkridner
[15:26] zyga, aha ... seems a reboot gets me back into the broken state
[15:27] that's very interesting
[15:27] [ 1178.619187] audit: type=1400 audit(1532445976.250:48): apparmor="DENIED" operation="exec" profile="snap.chromium-mir-kiosk.hook.configure" name="/usr/lib/snapd/snapctl" pid=4393 comm="configure" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
[15:27] because it seems to suggest that we load a profile from cache
[15:27] but loading a profile from the file (compiling it again) results in correct behavior
[15:27] even when i ran the parser manually ?
[15:27] ogra_: that's different, the cache behaves in another way
[15:27] i thought that would also update the cache
[15:27] ah
[15:28] no, that's separate
[15:28] can you provide the details
[15:28] on the forum
[15:28] and save the cache / profiles somewhere
[15:29] this is very interesting to debug
[15:29] if you can (and this is a SD card)
[15:29] just save the card and don't change it
[15:29] e.g. fill empty space with zero, dd the card, compress and send to me
[15:29] zyga, http://people.canonical.com/~ogra/snappy/kiosk/ ... it is just this image
[15:29] I should be able to extract cache data and debug this
[15:30] ogra_: do you have RTC on the device where this runs? is the hardware public?
[15:30] aha, pi
[15:30] do I need a specific pi?
[15:30] it will auto-update core (built from edge) on first boot and then you should see the error when trying to set anything for chromium-mir-kiosk
[15:30] so I suspect this is a real bug in the cache system
[15:30] nope, that uses my universal gadget
[15:30] and it affects devices that have no RTC that is battery backed
[15:30] runs on every pi
[15:30] so on boot the time is very much wrong
[15:30] (well, every pi we support in core indeed)
[15:30] thank you for this, this is very very useful
[15:31] well, the clock is correct after first network connection
[15:31] and the device was only rebooted, not powered off
[15:31] so it comes up with a proper clock on reboot
[15:31] jdstrand_, jjohansen: ^ it looks like apparmor_parser cache is susceptible to misbehavior when loading profiles on boot on a device without battery backed RTC
[15:31] ogra_: yes but apparmor loads much earlier than that
[15:31] ogra_: before most of regular startup
[15:32] earlier than what ?
[15:32] ogra_: and definitely before network
[15:32] the HW clock is set correct on reboot
[15:32] ogra_: look at "systemctl cat apparmor.service"
[15:32] ogra_: how is it being set?
[15:32] you don't understand :)
[15:32] ogra_: what sets the hardware clock?
[15:32] ntp should call systohc
[15:32] once it gets the correct time
[15:32] ogra_: does pi have an RTC on the board?
AFAIK it doesn't
[15:33] so on reboot the time is constant
[15:33] so the HW clock itself should be fine until you power off the board
[15:33] and only gets fixed by userspace later
[15:33] I see
[15:33] this is something to investigate
[15:33] it surely has an RTC, just no battery
[15:33] can you please collect the apparmor cache and profiles, just in case
[15:33] anyway, the board also boots with fixrtc set
[15:34] cachio: from /etc/apparmor and /var/lib/snapd/ and /var/cache AFAIR
[15:34] brb
[15:34] er
[15:34] ogra_: ^
[15:34] so even if the clock was wrong, it would only be slightly off (set to the last mount time of the rootfs)
[15:34] I need to go afk for some time
[15:37] hunterk: so i've tried vulkaninfo from my graphics-debug-tools-bboozzoo snap and ran into issues, we're not picking up a library from the host which apparently is required
[15:37] hmm, actually it doesn't have an RTC ... but fixrtc should still kick in from the initrd
[15:37] hunterk: i've opened a PR #5553, feel free to build it locally and check
[15:37] PR #5553: cmd/snap-confine: (nvidia) pick up libnvidia-glvkspirv.so
[15:37] zyga: ^^
[15:38] hunterk: also, you actually need to set a path to the nvidia ICD file, the way i'm running vulkaninfo is (inside snap run --shell): VK_ICD_FILENAMES=/var/lib/snapd/lib/vulkan/icd.d/nvidia_icd.json /var/lib/snapd/snap/graphics-debug-tools-bboozzoo/current/command-vulkaninfo.wrapper
[15:38] What does fixrtc do?
[15:39] hunterk: for comparison egl has icd search paths, which is a ':'-separated list of dirs with icd files, libvulkan had no such thing :( hence the manual hack
[15:44] ogra_: what does fixrtc do?
[15:44] zyga, running from initrd, setting the clock to the last mount time of the rootfs
[15:44] on boot
[15:45]
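The cache suspicion raised around 15:27-15:31 can be illustrated with a generic timestamp-based freshness check. This is only a sketch of how any mtime-compared cache can be fooled when a source file is rewritten while the clock is behind; it does not claim to be apparmor_parser's actual logic or the confirmed cause of the denials above, and the file names and timestamps are hypothetical.

```python
import os
import tempfile


def cache_looks_fresh(source_path: str, cache_path: str) -> bool:
    """Treat the cache as valid when it is at least as new as its source."""
    return os.path.getmtime(cache_path) >= os.path.getmtime(source_path)


def demo() -> bool:
    """Rewrite a source file under a lagging clock and report the cache verdict."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "profile")
        cache = os.path.join(tmp, "profile.cache")

        # Step 1: source and compiled cache are written with a correct clock.
        with open(source, "w") as f:
            f.write("old rules\n")
        with open(cache, "w") as f:
            f.write("compiled old rules\n")
        os.utime(source, (1_000_000_000, 1_000_000_000))
        os.utime(cache, (1_000_000_100, 1_000_000_100))

        # Step 2: the source is rewritten while the clock is behind, so its
        # new mtime is still older than the cache built from the old content.
        with open(source, "w") as f:
            f.write("new rules\n")
        os.utime(source, (1_000_000_050, 1_000_000_050))

        # The stale cache still passes the freshness check.
        return cache_looks_fresh(source, cache)


print(demo())
```

Recompiling from the source file bypasses the check entirely, which would be consistent with a manual parser run fixing things until the next boot loads from cache again.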