[05:11] morning [05:57] trivial review #5550 [05:57] PR #5550: spread: switch Fedora and openSUSE images (2.34) [06:13] mvo: hi [06:14] mborzecki: hey, good morning [06:15] mvo: i have a vague recollection that you were suppsoed to be off today :) [06:15] mborzecki: I'm not really here today but the SRU is still pending and I need to do some testing to ensure it can go in in time :/ [06:15] mborzecki: so yeah, hopefully not here for long [06:15] mvo: aah [06:16] mvo: while at it, #5550 is really simple, i hope if fixes the issues with fedora we're seeing in 2.34 [06:16] PR #5550: spread: switch Fedora and openSUSE images (2.34) [06:17] mborzecki: oh, nice! [06:17] mvo: we somehow ended up using different image in 2.34 branch, and i'd guess only the one used in master branch got updated [06:17] mborzecki: yeah, makes sense [06:17] mborzecki: thanks for finding that [06:17] mborzecki: long standing branches are always a bit of a pain :/ [06:19] mborzecki: feel free to merge once its green [06:20] mvo: ack [06:21] good morning [06:22] * zyga sort of feels better now [06:24] mvo: and merged, i've restart #5545 [06:24] PR #5545: snapstate: allow setting "refresh.timer=managed" (2.34) [06:24] zyga: hey [06:24] hey, how are things? [06:32] zyga: good morning! how are you? well rested? [06:34] mvo: so so but better, this trip was pretty unfortunate (lots of delays, lots of gate changes, terrible seats next to crying babies) [06:34] mvo: but I'm much better than yesterday :) [06:34] mborzecki, zyga: we have a bit of a problem - on a fresh install via http://cdimage.ubuntu.com/bionic/daily-live/current/ when doing "snap install gedit" it hangs forever in auto-connect [06:34] zyga: trip> meh, not good [06:34] hmm hmm, we had this bug before right? [06:35] mvo: do we have any logs? 
[06:35] auto-connect task would spin in a loop, failing and trying again [06:36] I have a hunch I know what causes mount failures [06:36] the "https://forum.snapcraft.io/t/unexplained-mount-failure-protocol-error-what-we-know-so-far/5682/7" issue [06:36] increased parallelism [06:37] if allocating a loopback device is racy [06:37] then we have two racing tasks that somehow end up both using one /dev/loopX [06:37] fwiw, happens with 2.33 and 2.34 [06:38] http://paste.ubuntu.com/p/DDCFVn7d4d/ [06:38] zyga: yeah that sounds sensible [06:38] mborzecki: the pastebin is the change changes output [06:38] mvo: sadly we don't know what happens inside that task [06:39] mvo: as in, there is no log of attempts and failures [06:39] http://paste.ubuntu.com/p/VfKnHtHkzy/ [06:39] zyga: yeah [06:39] hmm [06:40] spins locally too [06:40] mvo: the seed order is wrong, no? we told them to fix it but they haven't [06:41] pedronis: oh, right [06:42] mvo: or the last change in the gnome snap needs yet a different order [06:42] pedronis: yeah, task <1> is also in doing [06:42] pedronis: I think that explains it === chihchun_afk is now known as chihchun [06:46] pedronis, mborzecki, zyga I think pedronis is right and the hang is just a side effect of the incorrect seed order. so less urgent and we can wait for the new ISO build with the seed order fix [06:47] mvo: do we know they fix the order?
[06:48] pedronis: yeah, they added a workaround in livecd-rootfs [06:48] pedronis: where gtk-themes-common is sorted first now [06:48] ok [06:48] pedronis: I hope it actually works, it was a bunch of shell [06:48] mvo: btw I'm happy for us to fix it, but I think for 2.34 is too much of a risk (it would need to add a bunch of code) [06:49] we can try to have something for 2.36 [06:49] pedronis: +1 [06:49] 2.34.2 is now bionic-updates [06:50] so we will be on the 18.04.1 iso :) [06:50] thx [07:00] mvo: #5545 is finally green :) [07:00] PR #5545: snapstate: allow setting "refresh.timer=managed" (2.34) [07:05] small coffee break to be more awake [07:31] zyga: when you're more awake, could I get your opinion on this? https://forum.snapcraft.io/t/should-v2-interfaces-select-connected-return-unconnected-plugs-slots/6455 [07:32] I ran into it last week, and the suggestion was to wait for you to get back from the sprint [07:39] zyga: with this weather some cold brew would be great :P [08:17] re, sorry, this took a while [08:17] jamesh: looking now [08:18] zyga: no problem. I was getting over my own jetlag last week :) [08:18] jamesh: aha, curious, let me look at the code [08:20] jamesh: so, the output is as intended, I think what is missing is connection-level information (that was supposed to be returned by the never-implemented "snap connections" [08:21] jamesh: the last part I don't fully follow, the /v2/interfaces endpoint has two GET behaviors, legacy (which is more connection oriented) and non-legacy which is interface (not plug/slot but actual interface) oriented [08:21] zyga: I was just trying to work out what users would want the current behaviour [08:21] jamesh: "Also, if one output mode is referred to as getLegacyConnections, should it be considered a bug that you can’t get the equivalent information from the non-legacy interface?" 
- this is the part that I don't understand [08:21] zyga: I'm particularly concerned what the plug is connected to: just whether it is connected [08:22] jamesh: right, the thing is that the /v2/interfaces?select=connected returns _interfaces_ for which a connection exists between a plug and a slot [08:22] jamesh: currently there is no way to limit plug and slot references it returns (when prompted) to just those that actually have a connection [08:23] jamesh: the behavior is specifically modeled for "snap interface" to show a smaller list of interfaces (just those that have some actual connections established), unlike what "snap interfaces" (plural) does [08:23] zyga: so the only way to find out if one particular plug is connected is by asking snapd to serialise every single connected plug/slot on the system and then filter it client side :( [08:24] jamesh: still there is no great way to show the state of a particular plug or slot [08:24] jamesh: I think we are open to improvements, I'm just explaining why it is the way it is now [08:24] jamesh: I think we need a connection-oriented view [08:24] jamesh: and perhaps that would also answer the query you are after [08:24] jamesh: what is it specifically that you want to access? [08:25] (or perhaps what is the UX that this is a part of) [08:25] zyga: "does snap A have a connected plug of interface type X?" 
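[editor's note] The client-side filtering mentioned at 08:23 could look roughly like the sketch below. It assumes the legacy /v2/interfaces result is a dictionary with a "plugs" list whose entries carry "snap", "plug", "interface" and an optional "connections" list; that shape, the helper name has_connected_plug and the sample data are all illustrative assumptions, not something confirmed in this log.

```python
from typing import Any, Dict, Optional


def has_connected_plug(legacy_result: Dict[str, Any], snap: str,
                       interface: str, plug_name: Optional[str] = None) -> bool:
    """Answer "does snap A have a connected plug of interface type X?".

    legacy_result is assumed to be the already-decoded "result" object of
    the legacy endpoint (assumed shape, see the note above).
    """
    for plug in legacy_result.get("plugs") or []:
        if plug.get("snap") != snap or plug.get("interface") != interface:
            continue
        # A snap may define several plugs of the same interface type,
        # so optionally narrow the match down to one plug name.
        if plug_name is not None and plug.get("plug") != plug_name:
            continue
        # A plug counts as connected when its connection list is non-empty.
        if plug.get("connections"):
            return True
    return False


# Hypothetical sample data, hand-written for illustration only.
sample = {
    "plugs": [
        {"snap": "recorder", "plug": "mic", "interface": "pulseaudio",
         "connections": [{"snap": "core", "slot": "pulseaudio"}]},
        {"snap": "recorder", "plug": "mic-alt", "interface": "pulseaudio"},
        {"snap": "viewer", "plug": "network", "interface": "network"},
    ],
    "slots": [],
}

print(has_connected_plug(sample, "recorder", "pulseaudio"))
print(has_connected_plug(sample, "recorder", "pulseaudio", plug_name="mic-alt"))
```

The optional plug_name argument covers both readings of the question raised at 08:25 (interface type X versus plug name X). The cost noted in the chat remains: the whole plug list still has to be serialised by snapd before it can be filtered.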
[08:25] jamesh: interesting, of interface type X or of name X [08:26] one thing I was looking at just last week was that in spread tests we often grep for stuff while we just want to answer "is this connected" [08:26] snap is-connected snap-name:plug-or-slot-name [08:26] this is for a trusted helper type use, making a decision based on the interface connection state [08:26] and I was thinking we should implement the connections endpoint which returns all connections, with enough filtering to just answer that [08:26] I see [08:27] jamesh: note that in theory one snap can define multiple plugs or slots of the same type [08:27] (as long as those have different names) [08:27] zyga: the particular use case is having pulse audio restrict access to microphones unless a particular interface is connected [08:27] in conversations long time ago we wanted "snap connections" to essentially return a set of tuples, one per connection [08:27] ah, right [08:27] since we can't handle that kind of thing at the AppArmor level [08:28] jamesh: so I think that right now we don't have a nice API for that [08:28] jamesh: I think we should write one (along with snap-connections) [08:28] jamesh: as the only other alternative is the legacy endpoint and client side flitering [08:30] okay [08:54] is mvo around today? [09:07] zyga: he is offically off [09:07] ah, I didn't know [09:07] just today or for a week? [09:08] just today I think [09:09] ok [09:43] hm building amzn2 using common fedora spec is triggering isseses when generating selinux policy files, looks like selinux-policy is missing map for file class :( [09:52] I have code that refuses to work locally if I imported from snapd, but works elsewhere imported from purportedly the same code, and works locally if I paste it into somewhere else. i've removed pkg/ from my gopath and no change. Strace doesn't seem to point to any weird imports. Any ideas? [09:54] vendored dependencies? 
[09:55] hmmm [09:55] jamesh: moving aside vendor does make the code work! [09:57] jamesh: but that makes even less sense: the code I'm running is in one of the unit tests that has to be using the vendored code [09:57] but as soon as I run govendor sync again, it fails again [09:58] jamesh: code is https://pastebin.ubuntu.com/p/rtv8QH34qB/ fwiw [09:58] although I'm going to assume it's something local [09:59] Chipaca: either (a) you're running into a bug in a dep that has been fixed since the revision govendor pulls, or (b) something bad happens when two copies of the package exist in the same process [10:00] anything github.com/snapcore/snapd/strutil pulls in will reference github.com/snapcore/snapd/vendor/gopkg.in/yaml.v2 [10:01] by "doesn't work", do you mean crashes, or failes to compile? [10:02] jamesh: yaml: unmarshal errors: line 2: cannot unmarshal !!map into yaml.MapSlice {[] map[]} [10:02] jamesh: option (c) :-) [10:04] Chipaca: Looking at the yaml code, it has a branch of code dependent on "reflect.TypeOf(MapItem{})" [10:04] indeed it fails as soon as I copy gopkg.in into vendor/ [10:05] Chipaca: so the MapItem as seen by strutil.OrderedMap is different to the MapItem the non-vendored yaml package sees [10:05] Chipaca: presumably you could change your program to instead import "github.com/snapcore/snapd/strutil/vendor/gopkg.in/yaml.v2" [10:06] I thought go blocked that [10:06] lemme see [10:06] imports github.com/snapcore/snapd/strutil/vendor/gopkg.in/yaml.v2: must be imported as gopkg.in/yaml.v2 [10:07] eh, nevermind [10:07] jamesh: now I understand it's not my go installation broken in weird ways (although arguably all installations are, given this), i can stop worrying [10:07] anyway. 
You can see that strutil.OrderedMap.UnmarshalYAML asks the non-vendored yaml package to unmarshal into the vendored yaml.MapSlice type [10:08] which fails because it sees the vendored yaml.MapSlice as just some random third party type rather than something to handle specially [10:10] jamesh: i thought it was the other way around: the u function will be unvendored (as it's provided by the caller which is outside of snapd, so unvendored) and the MapSlice will be vendored (as it's imported by strutil which will use the vendored) [10:10] that's what I said, isn't it? [10:10] ah maybe thats what you said [10:10] :-) [10:10] jamesh: yes [10:10] jamesh: I'm easily confused, it seems [10:40] Mornings [10:43] niemeyer: o/ [10:53] could somebody run the unit tests on cmd/snap on master a few (10?) times and tell me if it fails? [10:54] it fails here, at least once every 5-10 times; and it's failing every _single_ time on #5506 :-( [10:54] PR #5506: cmd/snap: add a green check mark to verified publishers [10:54] completely unrelated to the colour green :-( [10:55] the error is in cmd_aliases_test, ... value client.ConnectionError = client.ConnectionError{error:(*url.Error)(0xc82034e030)} ("cannot communicate with server: Get http://127.0.0.1:34111/v2/aliases: dial tcp 127.0.0.1:34111: getsockopt: connection refused") [11:00] Chipaca: got it on the first run [11:00] Chipaca: on master [11:23] niemeyer: can you upload the latest spread to s3? i've opened a pr for amazon linux but spread is complaining about invalid size string: "preserve-size" [11:24] mborzecki: Ah, indeed I haven't updated, waiting for feedback on whether it worked [11:24] cachio: have you had a chance to try it out? [11:24] niemeyer: it worked :) [11:24] mborzecki: There you go :) [11:24] mborzecki: Updating [11:25] niemeyer: thanks! [11:25] niemeyer, yes, I updated the amazon image using this one [11:25] mborzecki: Done, please let me know [11:26] hey niemeyer [11:27] how are you feeling? 
[11:27] in case anyone wants to take a look #5552 [11:27] I had a rough ride home, I haven't felt this tired after returning from a sprint in a while [11:27] PR #5552: (WIP) Amazon Linux 2 packaging and spread tests [11:38] zyga: Yo [11:40] zyga: Feeling pretty reasonable.. don't have much time to feel tired this week :) [11:40] zyga: The sprint was intense indeed, though [11:40] I blame the short sessions, in part [11:43] indeed, lots of tasks switching during the week [11:54] pedronis: did you get a chance to take another look at #5434 ? [11:54] PR #5434: overlord: introduce InstanceKey to SnapState and SnapSetup, renames [11:59] mborzecki: I think I skimmed it again, but didn't do a full re-review [12:00] pedronis: do you think you could do it this week? it'd be nice to land it before you're off for vacation :) [12:01] pedronis: i'm updating the store pr too, should be pushing the changes later today/tomorrow === chihchun is now known as chihchun_afk [12:10] * zyga -> walk [12:15] mborzecki: yes, not today though, more likely tomorrow [12:15] pedronis: works for me, thank you! [12:25] * zyga actually goes for that walk... [13:18] Saviq, yesterday I updated images, please tell me if you see any error [13:30] zyga: vey nice umbrella :P [13:49] mborzecki: I have a snap package for a program that uses vulkan but it complains about not being able to find the vulkan loader. I was told you were the person to talk to. Any suggestions? :) [13:49] hunterk: is it published? [13:49] yes, retroarch [13:51] it initially launches with a GL renderer, but I can walk you through enabling the Vulkan renderer if you like [13:52] hunterk: let me check my notes, iirc there was some fishy stuff about vulkan an how it finds icd files [13:52] kk [13:52] i have to run to a meeting, so no hurry === plars_ is now known as plars [14:29] Pharaoh_Atem: hey [14:29] around? [14:29] zyga: yes? 
[14:29] hey, I [14:30] hey, I'm looking into f29 base snap now, I've started playing with image factory, trying to get it to do _basic_ things (whatever those are) in a way I understand [14:30] I wanted to quickly sync with you if that is the right way to start [14:31] Pharaoh_Atem: my plan is to write a plugin for it (called snapcraft) that builds a base snap according to the stuff in the template (still hand-wavy at this stage) [14:31] sounds like a good strategy [14:32] zyga, do you know why we are installing linux-image-extra-* ? [14:32] Pharaoh_Atem: I don't know what the constraints are, I suspect more modern things may be a dependency problem, I plan to use python 2.7 and shell for now [14:32] zyga, what so we need from it? [14:32] cachio: AFAIK for the extra drivers, but not specifcially [14:32] * Chipaca takes a break from bashing his head on tests and goes get a cuppa [14:33] zyga: you can email Brendan Reilly about it [14:33] bah [14:33] Brendan Reilly [14:33] zyga, ok, because there is not a package for the latest update on ubuntu 16.04-64 [14:33] on gce [14:33] cachio: I don't know what that means, [14:33] zyga: I'd plan on py3 compatible py2 code, since I think imgfac is going to be ported soon [14:34] are you saying the package is out of sync somehow? [14:34] it is built as a part of the kernel [14:34] zyga, we are installing this on snapd test suite [14:34] linux-image-extra-$(uname -r) [14:34] Pharaoh_Atem: that's a good hint [14:34] Pharaoh_Atem: who is Brendan? 
[14:34] but the last kernel is 4.15.0-1014-gcp [14:34] he's the maintainer and main developer of Image Factory/Oz [14:35] at least for the last two releases, he's been the guy making them [14:35] zyga, what I can so is to install 4.15.0-15 to make that work [14:35] zyga: https://github.com/redhat-imaging/imagefactory/blob/master/imagefactory.spec#L132-L147 [14:35] cachio: sorry, I don't know enough about the problem to help you [14:35] zyga, ok, np [14:35] I'll fix it [14:35] Pharaoh_Atem: ah, I see [14:36] Pharaoh_Atem: do you expect we will need to make changes outside of image factory in order to get the base snap building in place? [14:36] we may need to touch pungi and koji [14:36] cachio: I know we are installing it in the test suite but I don't know what the problem is really, is the package uninstallable? [14:37] https://pagure.io/pungi & https://pagure.io/pungi-fedora; https://pagure.io/koji [14:37] Pharaoh_Atem: to ping imagefactory or to do some other things? [14:37] oz is run by koji, which is kicked off by pungi [14:37] and how does imagefactory fit into this? [14:38] for example: https://koji.fedoraproject.org/koji/buildinfo?buildID=1130195 [14:38] zyga, the problem is that when I update the image for xenial 64, there is not linux-image-extra for the new kernel instlaled [14:39] zyga: imagefactory is run as a koji task [14:39] cachio: I would ask the kernel team about htis [14:39] zyga, ok, make sense [14:39] Pharaoh_Atem: and how does this relate to pungi? [14:39] pungi is the tool that actually kicks off all the koji tasks [14:40] and tells the tools what they should do [14:40] so koji can build things [14:40] by deferring to imagefactory [14:40] but it has no scheduler [14:40] so pungi is doing that? 
[14:40] koji is the build system, imgfac is the tool, and pungi is the orchestrator [14:40] pungi -> koji -> imgfac [14:40] ok [14:40] thanks, I see now [14:41] so I started with the right place it seems :) [14:41] hmm, did we have any recent snapctl changes ? [14:41] zyga: you can also ask mboddu in #fedora-releng for more details ;) [14:41] https://paste.ubuntu.com/p/9Qg7szthFq/ [14:41] i'm seeing "snapctl: Permission denied" errors [14:41] Pharaoh_Atem: for now I think this is enough to keep me busy but I will write that down, contact points are useful [14:42] Mohan will be at Flock, too [14:42] (this has worked before last edge update of core i think) [14:42] ogra_: it moved from /usr/bin/ to /usr/lib/snapd/ so maybe apparmor is out of sync somehow [14:42] see if you have any denials [14:42] Pharaoh_Atem: Mohan == mboddu? [14:43] yeah [14:43] his name is Mohan Boddu [14:43] heh, surely a gazillion (but from other stuff ...) [14:43] is he responsible for for koji and friends? [14:43] ogra_: look for specific denial for snapctl [14:43] [ 1007.643957] audit: type=1400 audit(1532443192.989:1239): apparmor="DENIED" operation="exec" profile="snap.chromium-mir-kiosk.hook.configure" name="/usr/lib/snapd/snapctl" pid=4522 comm="configure" requested_mask="x" denied_mask="x" fsuid=0 ouid=0 [14:43] [ 1007.710435] audit: type=1400 audit(1532443193.053:1240): apparmor="DENIED" operation="exec" profile="snap.chromium-mir-kiosk.hook.configure" name="/usr/lib/snapd/snapctl" pid=4529 comm="configure" requested_mask="x" denied_mask="x" fsuid=0 ouid=0 [14:43] yeah [14:43] should another reboot help here ? 
[14:44] no, one sec [14:44] (i'm freshly booted after core update) [14:44] can you check if /var/lib/snapd/apparmor/profiles/snap.chromium-mir-kiosk.hook.configure talks about snapctl (grep for it) [14:44] ogra_: if it doesn't then i think this is a bug [14:45] ogra@pi3:~$ sudo grep snapctl /var/lib/snapd/apparmor/profiles/snap.chromium-mir-kiosk.hook.configure [14:45] # snapctl and its requirements [14:45] /usr/bin/snapctl ixr, [14:45] /usr/lib/snapd/snapctl ixr, [14:45] ogra@pi3:~$ [14:45] looks like it is there [14:45] with the correct path [14:45] that's interesting [14:45] can you run apparmor_parser -r on that file (as root) [14:45] and see if that fixes it [14:46] I wonder if we have a cache issue [14:46] it's also likely the file is correct now [14:46] oh [14:46] so run configure again [14:46] maybe it will work without running apparmor_parser [14:46] if it doesn't then do run apparmor_parser [14:46] and then try to run configure again [14:46] i did run configure a few times already [14:46] ok [14:46] and it fails, right? [14:47] yes, running the parser now [14:47] now the configure works [14:47] ha [14:47] ogra@pi3:~$ snap set chromium-mir-kiosk disablekiosk=true [14:47] ogra@pi3:~$ [14:47] can you reproduce that issue? [14:47] no issues [14:48] I mean, can you get to a state where it happens again [14:48] phew ... that takes a while (takes minutes to install the chromium snap) [14:48] well, i'd remove and reinstall the snap [14:48] not sure if that changes anything though [14:49] the image is a week old or so and silently updated core but nothing else when i applied network 30min ago [14:49] not sure if i can repro that state [14:50] bah, dang ! [14:50] ogra@pi3:~$ snap remove chromium-mir-kiosk [14:50] error: cannot remove "chromium-mir-kiosk": snap "chromium-mir-kiosk" is not removable [14:50] it is in the model assertion as required snap ... 
[14:51] so no way i could try to repro that by removing the snap [15:08] cachio: ack, thanks for the heads up (down?) ;) [15:09] isnt that also called "nodding" ? [15:09] "heads up (down)" [15:13] pedronis: pushed fixes to the store PR too, thanks for the review comments there was indeed an error in mapping install errors there === jkridner|pd is now known as jkridner [15:26] zyga, aha ... seems a reboot gets me back into the broken state [15:27] that's very interesting [15:27] [ 1178.619187] audit: type=1400 audit(1532445976.250:48): apparmor="DENIED" operation="exec" profile="snap.chromium-mir-kiosk.hook.configure" name="/usr/lib/snapd/snapctl" pid=4393 comm="configure" requested_mask="x" denied_mask="x" fsuid=0 ouid=0 [15:27] because it seems to suggest that we load a profile from cache [15:27] but loading a profile from the file (compiling it again) results in correct behavior [15:27] even when i ran the parser manually ? [15:27] ogra_: that's different, the cache behaves in another wy [15:27] i thought that would also update the cache [15:27] ah [15:28] no, that's separatet [15:28] can you provide the details [15:28] on the forum [15:28] and save the cache / profiles somewhere [15:29] this is very interesting to debug [15:29] if you can (and this is a SD card) [15:29] just save the card and don't change it [15:29] e.g. fill empty space with zero, dd the card, compress and send to me [15:29] zyga, http://people.canonical.com/~ogra/snappy/kiosk/ ... it is just this image [15:29] I should be able to extract cache data and debug this [15:30] ogra_: do you have RTC on the device where this runs? is the hardware public? [15:30] aha, pi [15:30] do I need a specific pi? 
[15:30] it will auto-update core (built from edge) on first boto and then you should see the error when trying to set anything for chromium-mir-kiosk [15:30] so I suspect this is a real bug in the cache system [15:30] nope, that uses my universal gadget [15:30] and it affects devices that have no RTC that is battery backed [15:30] runs on every pi [15:30] so on boot the time is very much wrong [15:30] (well, every pi we support in core indeed) [15:30] thank you for this, this is very very useful [15:31] well, the clock is correct after first network connection [15:31] and the device was only rebooted, not powered off [15:31] so it comes up with a proper clock on reboot [15:31] jdstrand_, jjohansen: ^ it looks like apparmor_parser cache is susceptible to misbehavior when loading profiles on boot on a device without battery backed RTC [15:31] ogra_: yes but apparmor loads much earlier than that [15:31] ogra_: before most of regular startup [15:32] earlier than what ? [15:32] ogra_: and definitely before network [15:32] the HW clock is set correct on reboot [15:32] ogra_: look at "systemctl cat apparmor.service" [15:32] ogra_: how is it being set? [15:32] you dont understand :) [15:32] ogra_: what sets the hardware clock? [15:32] ntp should call systohc [15:32] once it gets the correct time [15:32] ogra_: does pi have an RTC on the board? 
AFAIK it doesn't [15:33] so on reboot the time is constant [15:33] so the HW clock itself should be fine until you power off the board [15:33] and only gets fixed by userspace later [15:33] I see [15:33] this is something to investigate [15:33] it surely has an RTC, just no battery [15:33] can you please collect the apparmor cache and profiles, just in case [15:33] anyway, the board also boots with fixrtc set [15:34] cachio: from /etc/apparmor and /var/lib/snapd/ and /var/cache AFAIR [15:34] brb [15:34] er [15:34] ogra_: ^ [15:34] so even if the clock was wrong, it would only be slightly off (set to the last mount time of the rootfs) [15:34] I need to go afk for some time [15:37] hunterk: so i've tried vulkaninfo from my graphics-debug-tools-bboozzoo snap and ran into issues, we're not picking up a library from the host which apparently is required [15:37] hmm, actually it doesnt have an RTC ... but fixrtc should still kick in from the initrd [15:37] hunterk: i've opened a PR #5553, feel free to build it locally and check [15:37] PR #5553: cmd/snap-confine: (nvidia) pick up libnvidia-glvkspirv.so [15:37] zyga: ^^ [15:38] hunterk: also, you actually need to set a path to the nvidia ICD file, the way i'm running vulkaninfo is (inside snap run --shell): VK_ICD_FILENAMES=/var/lib/snapd/lib/vulkan/icd.d/nvidia_icd.json /var/lib/snapd/snap/graphics-debug-tools-bboozzoo/current/command-vulkaninfo.wrapper [15:38] What does fixrtc do? [15:39] hunterk: for comparison egl has icd search paths which is : separated list of dirs with icd files, libvulkan had no such thing :( hence the manual hack [15:44] ogra_: what does fixrtc do? [15:44] zyga, running from initrd, setting the clock to the last mount time of the rootfs [15:44] on boot [15:45] so if the clock is off it is only by a really small margin [15:47] mmm [15:47] mount or unmount? 
[15:47] it may well explain the problem [15:48] mborzecki: reviewed [15:52] zyga (cc jjohansen): re battery backed rtc> yes, the clock needs to be right because of the mtime check. Ubuntu had a bug on one of the Touch devices for that [15:52] mborzecki: awesome. I'll take a look. Thanks for your help! [15:52] jdstrand_: I think this would neatly explain this === jdstrand_ is now known as jdstrand [15:52] jdstrand: if it is really _mount_ time it is clearly a way to be wrong and reproduce the problem [15:53] jdstrand: no immediate action now but I will look at reproducing this [15:53] zyga: thanks! [15:53] ogra_: can you point me to fixrtc sources? [15:54] zyga, apt-get source initramfs-tools [15:55] ogra_: thank you very much sir! [15:55] ir uses dumpe2fs to read the last mount date from whatever root= is [15:55] *it [15:55] and then uses date to set the clock to it ... pretty simple thing (and ages old) [16:00] ogra_: thank you [16:00] ogra_: and I think that is the bug actually [16:00] but I will confirm first [16:01] whats the bug ? tha the clock is some minutes in the past ? [16:01] *that [16:02] (i could imagine it being a but if it is in the future ... but the past ? 
) [16:02] *a bug [16:03] ogra_: if it really is the mount time it will be wrong for the cache use case [16:03] ogra_: it must be the unmount time to be correct [16:03] well, thats nothing the metadata of ext4 stores sadly [16:04] you only have creation, last mount and last write [16:04] oh, and last checked [16:04] last write then [16:04] last mount is 100% wrong in this case [16:04] because depending on how the cache is made [16:04] then you are in the past and the cache is from the future [16:05] or you actually get the _old_ entry with the correct "window" of time [16:05] I will reproduce this and gather some evidence [16:07] ah, right [16:07] that makes sense indeed [16:08] yeah, it's such an interesting bug [16:08] it dates back to 15.04 [16:09] ogra_: I cannot find fixrtc there [16:09] I got the sid version of the package [16:09] ubuntu [16:09] it isnt in debian [16:09] aha [16:09] thanks [16:10] getting now [16:10] great [16:13] ogra_: can you please run this on your pi just now: [16:14] dumpe2fs -h /dev/mmcblk0XXX [16:14] adjust to point to rootfs [16:14] and uptime [16:14] and pastebin both [16:14] oh ... i think we have a bug here [16:14] https://paste.ubuntu.com/p/pJqTTZzYkg/ [16:15] damn ... [16:15] if this is the bug we shall try those 1L beer mugs at the next sprint [16:15] well, the bug goes deeper [16:15] look at the timestamps [16:15] Filesystem created: Mon Jul 9 13:01:52 2018 [16:15] Last mount time: Mon Jul 9 13:12:09 2018 [16:15] Last write time: Mon Jul 9 13:12:09 2018 [16:15] July 9! [16:15] yeah [16:15] thats the image creation date [16:16] and uptime? [16:16] ogra_: ha, I suspected that :D [16:16] uptime ?
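[editor's note] The "you are in the past and the cache is from the future" remark at 16:04 can be illustrated with a toy model. The sketch assumes the cache is trusted whenever its mtime is newer than the profile's mtime; that rule is a simplification of the "mtime check" mentioned at 15:52 and is not taken from the apparmor_parser sources. The function name and all timestamps are made up for illustration.

```python
from datetime import datetime


def cache_looks_fresh(profile_mtime: datetime, cache_mtime: datetime) -> bool:
    """Toy freshness rule (assumption): trust the cache if it is newer
    than the profile text it was compiled from."""
    return cache_mtime > profile_mtime


# Cache compiled while the clock was correct (after the network came up).
cache_written = datetime(2018, 7, 24, 16, 0, 0)

# After a reboot the clock restarts from a stale filesystem timestamp,
# so a profile regenerated early in boot gets an mtime "in the past".
profile_rewritten_bad_clock = datetime(2018, 7, 9, 13, 15, 0)

# The same rewrite with a correct clock would be stamped after the cache.
profile_rewritten_good_clock = datetime(2018, 7, 24, 17, 30, 0)

# Stale cache wrongly accepted: it appears newer than the new profile.
print(cache_looks_fresh(profile_rewritten_bad_clock, cache_written))

# With a correct clock the stale cache is rejected and would be rebuilt.
print(cache_looks_fresh(profile_rewritten_good_clock, cache_written))
```

Whether this is what actually happened on the pi is still unconfirmed at this point in the log. The sketch only shows why a clock that jumps backwards is dangerous for any cache validated by timestamps.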
[16:16] you mean date [16:16] no, really uptime [16:16] Tue Jul 24 16:16:20 UTC 2018 [16:16] ogra@pi3:~$ uptime [16:16] 16:16:22 up 1:09, 1 user, load average: 2.34, 2.23, 2.64 [16:16] ogra@pi3:~$ [16:16] to know when it booted [16:16] ok [16:17] so for some reason the ext4 metadata doesnt get updated correctly in our weird stacked rootfs mounts setup [16:17] ogra_: and date? [16:17] see above [16:17] ogra_: it thinks it's 9th of July because that's what is baked as fallback when the image was not mounted, then it runs with that because maybe we never unmount cleanly so that is what stays there forever [16:17] (above the uptime call) [16:17] ah, that's good (date) [16:17] ogra_: if you have a serial, can you shut down / reboot [16:17] well, Chipaca's helper unmounts it [16:18] ogra_: and see if things error on the poweroff tool we wrote [16:18] so technically it shoudl unmount cleanly [16:18] ogra_: well, it unmounts stuff but it happily ignores errors [16:18] ogra_: and refcount must go to 0 for the unmount to be effective [16:18] nope, they do not error ... i see it telling me it unmounts (i did a few reboots today) [16:18] ogra_: and after reboot, let's look at that data again: from dumpe2fs [16:18] ok [16:18] there is the usual systemd error for writable ... [16:19] so this is golden: [16:19] then the helper kicks in at the very end and tells that it unmounted writable fine [16:19] we have two bugs: one is that we must use "Last write time" to fix apparmor cache [16:19] two is that we somehow not unmount cleanly (or so it seems) [16:19] yeah [16:19] next up: look at dumpe2fs to see how that field is read [16:20] ogra_: or maybe the kernel doens't flush buffers before pi reboots [16:20] ogra_: "enough" [16:20] to really sync the SD card [16:20] can you reboot to just ensure once and for all that this is still the 9th? 
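[editor's note] A rough sketch of the first fix named at 16:19 (use "Last write time" rather than "Last mount time"), written as a parser over the text printed by dumpe2fs -h. The fallback order (write, then mount, then creation) and the helper names are assumptions made for illustration. The real fixrtc is an initramfs shell script and is not reproduced here.

```python
from datetime import datetime
from typing import Dict, Optional

# Preference order is an assumption: last write first, then older fallbacks.
FIELDS = ("Last write time", "Last mount time", "Filesystem created")


def parse_times(dumpe2fs_header: str) -> Dict[str, Optional[datetime]]:
    """Extract the timestamp fields of interest from `dumpe2fs -h` output."""
    times: Dict[str, Optional[datetime]] = {}
    for line in dumpe2fs_header.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in FIELDS:
            continue
        try:
            times[key] = datetime.strptime(value.strip(),
                                           "%a %b %d %H:%M:%S %Y")
        except ValueError:
            # Unset fields are printed as "n/a" on newer e2fsprogs.
            times[key] = None
    return times


def clock_floor(times: Dict[str, Optional[datetime]]) -> Optional[datetime]:
    """Pick the first usable timestamp in preference order, if any."""
    for key in FIELDS:
        value = times.get(key)
        if value is not None:
            return value
    return None


# Header lines as pasted at 16:15.
sample = """\
Filesystem created:       Mon Jul  9 13:01:52 2018
Last mount time:          Mon Jul  9 13:12:09 2018
Last write time:          Mon Jul  9 13:12:09 2018
"""

print(clock_floor(parse_times(sample)))
```

Parsing alone does not help with the second bug: if the superblock timestamps are never updated, as seen on the pi, any choice among these fields still yields the image creation date.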
[16:20] and I will look at dumpe2fs [16:21] ogra@pi3:~$ sudo touch /writable/foobar [16:21] ogra@pi3:~$ sudo sync [16:21] ogra@pi3:~$ sudo dumpe2fs -h /dev/disk/by-label/writable | grep "Last write time" [16:21] dumpe2fs 1.42.13 (17-May-2015) [16:21] Last write time: Mon Jul 9 13:12:09 2018 [16:21] ogra@pi3:~$ [16:21] i think it is the stacked nature of our rootfs that breaks it [16:21] ogra_: I only suspect that gets updated when we really unmount [16:21] even touching and syncing a file doesnt update the field [16:21] ogra_: make a loopback mounted ext4 and see [16:21] really ? [16:22] i'd excpect sync to update it [16:22] ogra_: it would be silly to sync that on _any_ metadata write [16:22] ogra_: superblock [16:22] ogra_: sync is really "buffers are flushed" [16:22] well, i explicitly tell the kernel to ... [16:22] but this buffer gets dirty when we unmount the superblock [16:22] if you see what I mean [16:22] it is really only written once we unmount [16:22] not at any time [16:23] a loopback ext4 test will confirm that [16:23] well, then it is a miracle to me that the filesystem doesnt completely fall apart all the time [16:23] I'm looking at e2fsprogs now [16:23] i mean ... [16:25] ogra_: yeah, Last write time: ... is coming from the superblock [16:25] ... it effectively thimnks there hasnt been written anything in 3 weeks [16:25] ogra_: I will check when the kernel writes there now [16:25] ogra_: yes, fun finding eh? :) [16:25] ogra_: I love things like this, casual chat ends up finding very serious bug :) [16:26] and long long long standing one :) [16:26] and the journal can only hold 16M ... [16:26] (i definitely installed and wrote more stuff than 16M since july 9 ) [16:26] journal as in journald? [16:26] ah [16:26] this journal [16:26] ad in filesystem journal [16:26] right [16:26] so where is my data !!! [16:27] (it isnt like its not there ... but ... but ... 
) [16:27] ogra_: which kernel version do you have there, I'll sync the right tag [16:28] 4.4 whatever is in the stable channel [16:28] ah, no, i'm lying ... its edge [16:28] pi2-kernel 4.4.0-1092.100 56 edge canonical kernel [16:29] thanks [16:30] zyga, btw, we can only walk iteratively over creation, last mount, last write ... mount and write wont be populated at all on new images ... and there are other boards relying on it where it works (if you simply use a server image without loop mounted stuff ) [16:30] https://github.com/torvalds/linux/blob/894b8c000ae6c106cc434ee0000dfb77df8bd9db/fs/ext2/super.c#L1251 [16:30] so if we change fixrtc we have to do it very careful to not break the world [16:31] mhm [16:32] ogra_: this field is used by ext2, not sure if ext4 _also_ uses it [16:32] looking [16:32] pretty likely [16:33] https://github.com/torvalds/linux/blob/ca04b3cca11acbaf904f707f2d9ca9654d7cc226/fs/ext4/super.c#L4813 [16:33] interesting [16:33] if the filesystem is mounted read only the superblock write time is _not_ touched [16:33] I will do some tests now [16:33] sure [16:33] but we (the initrd) remount it rw [16:34] wonder if this actually happens when we make / ro just before unmounting [16:34] yeah [16:34] but on shutdown [16:34] it becomes ro for a sec AFAIR [16:34] well [16:34] ah [16:34] so... maybe that's enough not to write this [16:34] yeah [16:34] testing now [16:35] whats a bit bothering is that "Last checked" is also not updated ... 
i'd expect that to be done separately from unmounting [16:36] and we definitely run fsck once per boot [16:36] Filesystem created: Tue Jul 24 18:36:09 2018 [16:36] Last mount time: n/a [16:36] Last write time: Tue Jul 24 18:36:10 2018 [16:36] this is a loopback, freshly created, never mounted [16:37] obviously [16:37] this is the same thing after mounting and writing a file [16:37] Filesystem created: Tue Jul 24 18:36:09 2018 [16:37] Last mount time: Tue Jul 24 18:37:00 2018 [16:37] Last write time: Tue Jul 24 18:37:00 2018 [16:37] note that mount time == write time [16:37] (the n/a is new with 16.04 ... before it was hardcoded to the epoch) [16:37] and I obviously wrote _after_ (a few seconds after that) [16:38] sync doesn't affect that [16:38] ok [16:38] after unmounting I get this: [16:38] Filesystem created: Tue Jul 24 18:36:09 2018 [16:38] Last mount time: Tue Jul 24 18:37:00 2018 [16:38] Last write time: Tue Jul 24 18:38:14 2018 [16:38] so so far all is good [16:38] now ... if you remount ro and do an fsck ... is the check field updated ? [16:38] I will now remount it ro before unmounting (repeating earlier experiments) [16:38] :D [16:38] :) [16:38] yep, that's what I want to know [16:39] first, I mounted it ro as we would in initrd [16:39] Filesystem created: Tue Jul 24 18:36:09 2018 [16:39] Last mount time: Tue Jul 24 18:37:00 2018 [16:39] Last write time: Tue Jul 24 18:38:14 2018 [16:39] (after mounting as read only) [16:39] note how the "last mount time" is not changed [16:39] yeah [16:39] I will now unmount it (still ro) to just ensure this is changed (or not changed) [16:40] Last mount time: Tue Jul 24 18:37:00 2018 [16:40] it is not changed [16:40] there we go [16:40] ok, now I will make it writable for a moment [16:40] I mounted it ro, remounted to rw [16:40] i still dont get how it manages to not lose data that way ... 
unless your journal grows and grows [16:41] Filesystem created: Tue Jul 24 18:36:09 2018 [16:41] Last mount time: Tue Jul 24 18:40:42 2018 [16:41] Last write time: Tue Jul 24 18:38:14 2018 [16:41] last mount time has changed [16:41] yeah [16:41] and it is still mounted [16:41] so at least this suggests we should see "last mount time" changing [16:41] unless we really cannot write the superblock back [16:41] well ... [16:41] we set the clock in initrd ... [16:42] I touched the file again [16:42] ah, right, when we ... have no time :) [16:42] then mount rw, do an fsck ... [16:42] touching the file did not affect last write time [16:42] remounting as ro and unmounting now [16:42] well, we have a time [16:42] I re-mounted as ro and got no change [16:42] but only the time from that metadata [16:43] Filesystem created: Tue Jul 24 18:36:09 2018 [16:43] Last mount time: Tue Jul 24 18:40:42 2018 [16:43] Last write time: Tue Jul 24 18:38:14 2018 [16:43] and bingo! [16:43] Filesystem created: Tue Jul 24 18:36:09 2018 [16:43] Last mount time: Tue Jul 24 18:40:42 2018 [16:43] Last write time: Tue Jul 24 18:38:14 2018 [16:43] so the snapd helper would need to remount rw once, go back to ro and only then shut down ? [16:43] well, not sure yet [16:44] or force call the internal fs sync command [16:44] but re-mounting the filesystem as read only means we never write the superblock [16:44] right [16:44] so all the dates are stuck from any previous experiment [16:44] this feels like a kernel bug [16:44] it should remember the FS was writable and written to [16:44] it probably does somewhere in the journal [16:45] ogra_: we could perhaps use an ext4 specific utility to write that date ourselves [16:45] but anyway, this is the culprit right there [16:45] right [16:45] that coupled with the other bug (wrong timestamp used) [16:45] I will update the forum thread about this [16:45] and let's down this beer in September :) [16:45] 0.5 or more [16:45] +1 ! 
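The idea raised above of walking over the creation, last mount and last write fields can be sketched roughly as follows. This is only an illustration, assuming the header text printed by `dumpe2fs -h` has already been captured as a string; the function names are hypothetical and this is not the actual fixrtc code. It skips fields reported as `n/a` and returns the newest timestamp it can find.

```python
from datetime import datetime

# Superblock fields mentioned in the discussion, as printed by `dumpe2fs -h`.
FIELDS = ("Filesystem created", "Last mount time", "Last write time")


def parse_superblock_times(header_text):
    """Return {field name: datetime} for fields that carry a real timestamp."""
    times = {}
    for line in header_text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Skip unrelated lines and fields that were never populated ("n/a").
        if not sep or name not in FIELDS or value == "n/a":
            continue
        # Assumes English day/month names, as in the pasted output.
        times[name] = datetime.strptime(value, "%a %b %d %H:%M:%S %Y")
    return times


def newest_timestamp(header_text):
    """Pick the latest of the known fields, or None if none could be parsed."""
    times = parse_superblock_times(header_text)
    return max(times.values()) if times else None


if __name__ == "__main__":
    # Values from the "stuck" state pasted above (ro remount before unmount).
    sample = (
        "Filesystem created: Tue Jul 24 18:36:09 2018\n"
        "Last mount time: Tue Jul 24 18:40:42 2018\n"
        "Last write time: Tue Jul 24 18:38:14 2018\n"
    )
    print(newest_timestamp(sample))
```

Taking the maximum rather than trusting "Last write time" alone matters in exactly the case found here: after the read-only remount the write time stays stale while the mount time is newer.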
[16:45] I wonder who gets more drunk a polish guy or a german guy drinking beer :D [16:46] :D [16:46] haha, we'll see :) [16:46] like this cat that spins with a toast on his back;-) [16:46] jdstrand: we got to the bottom of the issue! [16:46] * zyga goes to the forum [16:52] zyga, https://forum.snapcraft.io/t/snapctl-permission-denied-with-latest-edge-core-update/6520 [16:55] ogra_: https://forum.snapcraft.io/t/apparmor-profile-caching/1268/9 [16:56] bah [16:56] look at this and check if I got things right [16:56] let's just cross reference [16:56] tomorrow we can discuss with mvo on how to fix this [16:56] yeah [16:56] I need to make a coffee :) [16:56] and play with fedora some more [16:56] I'm super happy we found this [16:56] me too ! [16:56] all pi devices and other RTCless devices will benefit [16:57] yep [16:57] high five orgra :) [16:57] o/ [16:57] * ogra_ ^5 zyga [16:57] woot :-) [16:58] thanks for the cross-ref :) [16:58] zyga, i guess this should hold back the next release til we have a fix ... else all configure hooks will explode [16:59] yes [16:59] this is a release blocker [16:59] CC cachio [16:59] that's a very important observation ogra_ [16:59] :) [17:00] I cross-referenced mvo and will discuss with him tomorrow [17:00] this was a good day :) [17:00] what kind of beer do you like more dark or light? [17:01] * ogra_ goes back to watch aquaman HD trailers in loops on the chromium kiosk RPi ... SW rendered but it's *not* a slideshow !! [17:01] zyga, depends ... 
thats a daily mood thing [17:01] generally pilsner style but i have my dark-ale days :) [17:03] zyga, and on bad days: https://d3r6kbofdnmd8.cloudfront.net/media/catalog/product/cache/image/1536x/a4e40ebdc3e371adff845072e1c73f37/1/0/100135_Fucking-Hell-Bier-6x033L-49-Vol_4.jpg [17:03] (thats actually real :) ) [17:10] zyga, thanks for the heads up [17:10] hecking [17:10] checking [17:10] ogra_: I'm slowly getting into the more proper coffee [17:10] ogra_: not "fire and forget" [17:11] ogra_: I wonder if I will reach that level in beer :) [17:11] get a proper espresso machine and start drinking americano ... IMHO the only proper way of consuming coffee ;) [17:12] americano? OMG I cannot stand it [17:13] oh, why ? [17:13] it should be just called diluto [17:13] !strong enough [17:13] sorry ubottu :) [17:13] heh [17:14] but tasty ! [17:14] (and it is as strong as the espresso you take for it ...) [17:14] sure but then you can fit more espresso [17:14] though I understand diluting wine with juice and fruit to make sangria [17:15] so ... maybe I just need to find the taste [17:15] heh [17:16] the point is that putting clear water into an already produced espresso makes the thing keep its taste ... you just dont get a heart attack after three cupts (and i have ten during the day) [17:17] it is definitely my preferred coffee over any filter coffee or even crema [17:17] (though at times i like a straight espresso) [17:24] re, just finished brewing it [17:24] I never liked filtered [17:24] but my parents drink that sometimes [17:24] though I suspect it's just because it was easier to make [17:30] yeah [17:31] https://www.ecm.de/fileadmin/products/slider/ECM-Espressomaschine-Classika-II-Hauptbild.png [17:31] :D [17:31] ogra_: I'll switch to fedora work now [17:31] (thats what makes my coffee) [17:31] yeah, enjoy ! [17:31] that's neat [17:31] is that all metal? 
[17:31] yeah [17:31] hand made [17:32] looks very nice [17:32] (german company from heidelberg) [17:32] brews very nice too ;) [18:09] new snap alert! WOOP WOOP. new snap alert! https://snapcraft.io/starruler2 [18:23] wow ... revision 1 in stable ! [18:24] :-D [18:24] I don't upload until it works :-p [18:30] ... 518kB/s ... [18:31] * ogra_ twiddles thumbs [18:31] eep [18:31] it's "only" 490MB [18:32] smaller than supertuxkart :) [18:41] jibel: good news (maybe you know already) the latest desktop image is fine, snap install gedit finishes as expected [18:59] * cachio afk [19:50] roadmr: hey, did you flip on resquashfs enforcement yet? [19:50] jdstrand: no! was waiting for your ok. I can do so now [19:50] roadmr: please do so. thanks! [19:51] jdstrand: I'm 2 hours from EOD but the store never sleeps (tm) so feel free to holler if there're issues [19:52] roadmr: yep, thanks :) I suspect few issues, if any. the last time it was on for a week and only had the couple failures we needed to look at [19:52] roadmr: do remember it will make reviews take longer, in case there is a question wrt your monitoring infra [19:53] jdstrand: noted, but last time there was no issue with that [19:53] * jdstrand nods === LinAGKar[m] is now known as LinAGKar [19:53] jdstrand: switch flipped! [19:53] \o\ [19:53] /o/ [19:53] \o/ [19:53] :) [19:54] roadmr: thanks again :) hoping this is the *one* [19:54] hopefully! [21:28] hello [21:29] does anyone have a snap file for firefox v52.9.0 ESR? [21:30] I need it badly, I don't want to use any more recent version of firefox [21:32] `snap install skype` -> `error: This revision of snap "skype" was published using classic confinement and thus may perform arbitrary system changes outside of the security sandbox that snaps are usually confined to, which may put your system at risk. If you understand and want to proceed repeat the command including --classic.` how safe is it? I don't understand. 
Is it just a general warning or is Skype requiring some extra permissions? [21:34] !classic [21:35] I'm confused by ubuntus use of snappy, it looks like they're using snaps for gnome applications? is that right? [22:09] halfbit: yes, some gnome apps on 18.04 are shipped as snaps [22:09] * zyga heads to bed so cannot have a long conversation [22:10] FreeBDSM: firefox snap has a ESR track, for more info look at "snap info firefox", it is not at the exact version you wanted though [22:10] zyga: exactly why I'm asking here for a particular version. [22:10] FreeBDSM: classic confinement is "like typical distribution package", there is no confinement between the application process and the system [22:11] it is as safe as a .deb or .rpm [22:11] it is based on trust in the actual publisher (microsoft in this case) and that they are not attacking your machine [22:11] when such a snap (with classical confinement) gets removed from the system - does it leave trails? [22:12] FreeBDSM: usually only in logs and leftover data in your home directory (if any) [22:12] but not in the system [22:12] good [22:12] technically when a snap is removed it is just unmounted, the whole snap is in one file [22:12] * zyga hasn't seen jdstrand this happy in a while! [22:12] I've heard there are also flatpaks, how do they differ from snaps? [22:12] (just scrolling back and noticing the victory dance) [22:13] FreeBDSM: in tons of ways, this is a huge topic and it's far too late to discuss here [22:13] * jdstrand notes that a classic snap can do anything to the system, so it could leave trails. a well-behaved content snap won't do that of course [22:13] (now, it's after midnight for me) [22:13] zyga: goodnight! :) [22:13] zyga: utc+3? 
[22:13] s/content/classic/ [22:13] FreeBDSM: in the security model, in the distribution method, in the update method, in the scope and intended feature set, etc [22:14] okay, got it, it's a huge topic [22:14] FreeBDSM: I'm sure that jdstrand can give you one difference to research [22:14] thanks for the answers [22:14] it's past midnight for me as well [22:14] I can too but I'm not an expert on flatpak and I may be imprecise [22:15] FreeBDSM: both projects use many kernel features to make the apps work [22:15] and we also use some of the work on the portal system that flatpak initiated [22:15] but I think we are building slightly different things in the end [22:16] 230 users on #flatpak vs 245 on #snappy, hehe [22:16] alex is a cool, skilled and motivated developer (really kudos for that) [22:16] well, irc is kind of niche so I don't think that's a useful metric [22:16] ask him too, I'm sure he will have interesting things to share [22:18] yes. in terms of isolation, flatpak in general uses namespaces and trusted helpers (eg, portals). strict mode snaps can be unmodified, are wrapped with LSM and seccomp filtering (a form of containerization), but also use various traditional containerization techniques (eg, device cgroup, mount namespace, devpts new instance) [22:19] but snaps as of recently can use portals. they can also run as classic snaps (ie, unconfined) [22:20] there was talk of flatpak doing more with LSM, but afaik, that is still a roadmap item [22:20] like zyga said though, trying to do different things [22:20] okay, sorry, I don't understand a thing, haha [22:25] so, does anyone have a snap for firefox 52.9.0 esr? [22:25] or any instructions on how to build it? [22:56] is there a chat for ubuntu core, looks like something I'd be interested in checking out [23:35] can I run all of gnome on ubuntu core? [23:35] or is that a standard deb based ubuntu only thing