[06:40] morning
[06:45] PR snapd#8289 opened: xdgopenproxy: forward requests to the desktop portal
[06:58] hmm the code in master isn't formatted according to gofmt 1.10?
[07:26] jamesh: thanks for opening the PR! there's some unit test failures in the travis job
[07:26] mborzecki: yep. Just looking into that (borrowing some logic from gio)
[07:27] PR snapd#8290 opened: run-checks: tweak formatting checks
[07:30] mborzecki: I was adjusting the testing to run against a real bus rather than trying to mock everything
[07:32] jamesh: right, userd tests did that too iirc
[07:41] mvo: hey, a trivial PR to start your day with: https://github.com/snapcore/snapd/pull/8290
[07:41] PR #8290: run-checks: tweak formatting checks
[07:42] mborzecki: sure!
[07:44] PR snapd#8288 closed: release: 2.44
[07:55] good morning
[07:56] good morning zyga
[07:56] how is everyone?
[07:56] it looks like a record-hot day today
[07:57] I gave up and bought a disk for debian :/
[07:57] I could not find any spare drives at home, that were not filled with stuff
[07:59] did you guys notice the github app went live yesterday
[07:59] I tried it and it's pretty slick
[07:59] it feels almost better than using the browser on a desktop
[08:05] zyga: instead of just one browser you can run n instances now :)
[08:06] morning
[08:06] hey :)
[08:06] pstolowski: still waiting?
[08:06] zyga: yep; not even dispatched
[08:06] it will come
[08:07] just thinking about some stuff I talked about with mborzecki earlier
[08:07] mborzecki: RGB HDD FTW
[08:11] mvo: is 2.44 out?
[08:11] zyga: yes, since last night in beta
[08:11] woot
[08:11] is that expected to be the 2.44 final that goes out to ISOs?
[08:12] zyga: there will be a 2.44.1 that will probably go on the iso
[08:12] zyga: with search v2
[08:12] aha
[08:12] ok
[08:12] should I target fixes to it or to 2.45
[08:12] thinking about stuff like EROFS from yesterday
[08:12] zyga: fixes for 2.44
[08:12] zyga: please mark everything that is worth it for 2.44
[08:12] https://github.com/snapcore/snapd/pull/8285 is like one
[08:12] PR #8285: cmd/snap-update-ns: ignore EROFS from rmdir/unlink
[08:12] zyga: we don't really have time for a 2.45 before the release
[08:13] sure
[08:13] I was thinking if we are doing any .1 or not at all
[08:13] zyga: yeah, most likely a .1
[08:13] zyga: (like ~95%)
[08:13] zyga: I added the 2.44 milestone to your PR and will cherry-pick
[08:13] I'll make a milestone on LP and retarget the bug
[08:14] zyga: haha rgb hdd, how about a transparent rgb hdd with a led platter on top of the magnetic ones? :)
[08:15] mborzecki: as soon as we have that transparent aluminum ;)
[08:15] mborzecki: but I would totally buy an RGB HDD
[08:15] because, why not :)
[08:41] zyga: heh, already auto-filed https://bugzilla.redhat.com/show_bug.cgi?id=1814552
[08:41] yeah I got the mail ping a moment ago
[08:41] ITSABUGFIXITNOOOW
[08:42] mvo: will you make the release tarballs?
[08:49] brb
[08:54] zyga: sure
[09:01] thank you :)
[09:20] pedronis: I pushed an update to the nocloud, probably needs a lot of naming tweaks
[09:22] mvo: #8278 ?
[09:22] PR #8278: devicestate: disable cloud-init by default on uc20
[09:22] pedronis: yes
[09:22] mvo: ok, I will look at it, looking at the next secboot PR atm
[09:23] pedronis: no rush, thanks!
[09:26] pedronis: actually let me tweak it a bit more, I think it's silly at this point to split it into two
[09:28] i can't decide whether go allowing to embed `*type` and `type` is a good or bad thing
[09:37] zyga: question to the env fix PR
[09:41] pstolowski: ok
[09:42] zyga: ah, it's splitN, so it's fine
[09:42] heh, I just replied the same :)
[09:42] I believe this is also tested
[09:43] there's a test like K=V=...
[09:43] anyway
[09:43] I need to take half day off
[09:43] until we figure out home schooling
[09:43] Lucy is constantly with me
[09:43] crying if she is with grandparents
[09:43] and I just cannot focus
[09:44] until enough discipline and getting over the fact working at home is working is understood by older kids
[09:44] sorry, I'll file the time off later :/
[09:44] we're walking up and down the stairs all morning
[09:51] PR snapd#8234 closed: devicestate: support for "cloud.cfg.d" cloud-init in uc20 with grade: dangerous
[09:55] mvo: I'm a bit unsure why you think it is silly to split it? I asked it to be split that way
[10:02] mvo: what I mean, please don't merge 8234 into 8278
[10:04] pedronis: ok, I will do them separately, sorry
[10:05] pedronis: then 8278 should be ok for a first look, probably lots of tweaks still needed but hopefully it captures what we discussed
[10:06] mvo: 8278 is not that small anyway, with more stuff it would maybe be >500 lines
[10:06] pedronis: yeah, makes sense
[10:24] mborzecki: how close do you think is #8159, apart from your comment?
[10:24] PR #8159: snap-bootstrap: remove created partitions on reinstall
[10:24] pedronis: it's just some nitpicks
[10:25] pedronis: i think a helper called from Run() before CreateMising() that explicitly mutates LaidOutVolume based on options would be fine there
[10:25] mborzecki: I see, I sort of feel there are already too many helpers but maybe it's just me
[10:28] pedronis: perhaps that should even be something in gadget, since gadget is looked at from other places too (eg. boot assets update)
[10:28] pedronis: wdyt?
[10:28] morning folks
[10:29] ijohnson: hey!
[10:29] hey mborzecki o/
[10:29] mborzecki: I honestly wonder if overriding what the gadget says is worth the trouble
[10:30] the spec says anyway that the normal partitions can be luks
[10:30] afaiu
[10:30] pedronis: leave a type as it is then?
[10:30] yes
[10:31] mborzecki: see here: https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/
[10:31] pedronis: mhm, cryptsetup open probably does not even look at the partition type in gpt/mbr
[10:32] it's cute to have the "right" type but atm it just seems like grief
[10:32] haha fair enough
[10:32] notice that I didn't request that
[10:32] also as you said, not sure other gadget code will be happy
[10:32] about the changed type
[10:34] any idea why snap prepare-image would fail to find the account-key here: https://pastebin.ubuntu.com/p/Qz3RvprPK6/, this is from #8286
[10:34] PR #8286: tests: cleanup various uc20 boot tests from previous PR
[10:41] zyga, hi, is https://paste.ubuntu.com/p/Yv9dkYThZr/ the same issue with mounts you're fixing?
[10:41] zyga: hrm, hrm, the debian package fails with https://paste.ubuntu.com/p/P674rpMTMB/ in my pbuilder, which is super strange given that this is building in travis, I wonder if we do something silly there like not excluding the vendor stuff or silliness like this
[10:44] pedronis: ok, let's discuss that with claudio after the standup maybe?
[10:47] mvo: I +1ed 8278 but it needs a 2nd review given all the changes, also the Opts name needs to change. Lots of details are still unclear to me but shouldn't be a blocker for what is there currently
[10:47] ackk: no, you cannot chown $SNAP
[10:47] hmmm
[10:48] pedronis: I'm reviewing 8278 right now
[10:48] mvo: weird
[10:48] is golang-evdev in the vendor tarball?
[10:48] or are we de-vendoring?
[10:48] I mean, are we really building the devendorized tarball in our spread runs?
[10:48] we shouldn't be building things that need on debian
[10:48] need it
[10:48] anyway
[10:48] pedronis: good point
[10:48] Lucy is asleep now but I think today is rather hectic :
[10:48] maybe we are because
[10:48] :/
[10:49] mborzecki: I'm not sure there is more to discuss, that PR is open since forever
[10:49] pedronis: yeah, I'm unhappy that we did not catch this, the sbuild nightly test is failing since days and it was not noticed
[10:50] pedronis: ok, let me add a note that we drop the type change
[10:50] zyga, that's not $SNAP, it's $SNAP_DATA
[10:50] mvo did you see my comment about cloud-init spread test on 8278 ?
[10:50] zyga, oh wait, I'm dumb
[10:50] ackk: snap/maas/current/usr/lib/postgresql/10/bin seems like $SNAP
[10:50] zyga, I saw the "current" there and thought it was SNAP_DATA, sorry :)
[10:50] ijohnson: I think we need core 20 specific cloud-init tests, I suppose that's mvo's plan
[10:50] :D
[10:51] mvo: do you know what is missing in our spread setup?
[10:51] ijohnson: given the behavior will be tied to grade etc
[10:51] ok, sure I would still like to see a TODO:UC20: somewhere about that so that we remember to write that :-)
[10:51] zyga: not sure yet, just checking one idea
[10:51] ijohnson: that's ok, my point is that the TODO should simply say re-enable this for UC20
[10:51] heh
[10:52] should not simply say
[10:52] :-)
[10:52] zyga: in any case, the nightly suite did notice the issue we just did not pay attention, I think we need alerting here
[10:52] :/
[10:52] ijohnson: yeah, sorry, need to reply, but I think we need special tests, uc20 behaves differently from the other oses we have
[10:52] yes that's fine
[10:53] * ijohnson just likes todos so we don't forget
[10:53] I like todos too
[10:53] I like to do todos sometimes too
[10:53] or even done todos too
[10:53] :)
[10:53] haha nah I'm not much for that, I just like them to add up
[10:53] :-)
[10:54] there's a benjamin franklin quote something like "I love deadlines, I love the whooshing sound they make as they go by"
[10:54] lol pedronis :D
[10:54] we need to do do more todos
[11:52] PR snapd#8291 opened: packaging,tests: ensure debian-sid builds without vendor/
[11:53] mvo: lol :)
[11:53] sorry :)
[11:54] mvo: approved
[11:55] zyga: it's all very sad
[11:56] * zyga elbow-hugs mvo
[11:56] not _all_ very sad :)
[11:56] PR snapd#8292 opened: travis.yml: run unit tests on master as well
[12:02] mvo: small review for 8292
[12:03] mvo: I also wonder if this bug is related https://bugs.launchpad.net/snapd/+bug/1867755
[12:03] Bug #1867755: snapd fails to build in focal, unit test clientSuite.TestClientFindFromPathErrIsWrapped fails
[12:04] kinda meh that dh_auto_build is so magical you need to remove parts of the source tree
[12:04] mborzecki: when it works it is useful
[12:04] no packaging is good
[12:11] PR snapd#8293 opened: packaging: add README.source for debian
[12:14] mborzecki: ha
[12:14] mborzecki: check this out please
[12:14] https://www.irccloud.com/pastebin/5hCE1Blq/
[12:14] specifically line 4
[12:15] zyga: 1867755 is fun, it seems to be a race
[12:15] does it seem to you that dbus activation is somehow not working?
[12:15] zyga: which is so strange - I saw it sometimes in a PPA build (rarely) and even once in travis I think
[12:16] heh
[12:16] mvo: use teasaiding they
[12:17] mborzecki: for some reason I can't answer your review comment on #8159 but you're right, we're mutating external data that shouldn't be touched there
[12:17] PR #8159: snap-bootstrap: remove created partitions on reinstall
[12:17] zyga: can't parse that
[12:17] mvo: use threads they said | scramble
[12:17] cmatsuoka: left a note there, after discussing with pedronis it probably makes most sense to not update the type at all and leave what's defined in the gadget
[12:17] zyga: hahahahaa
[12:18] mborzecki: ok, will do it that way
[12:19] zyga: hmm session systemd has no love for the session tool?
[12:19] mborzecki: it fails on 18.04
[12:19] but not in others somehow
[12:19] I collected some more data, let me go through it
[12:20] ijohnson: as discussed apart the test renames #8208 seems good to go, it needs 2nd review though
[12:20] PR #8208: boot_test: add many boot robustness tests for UC20 kernel MarkBootSuccessul and SetNextBoot
[12:27] * cmatsuoka hates when you notice something strange in your code, check the branch to see if it's right (it is) and then realize you're on the wrong host
[12:35] ijohnson: I reviewed #8286, thanks for it
[12:35] PR #8286: tests: cleanup various uc20 boot tests from previous PR
[12:36] PR snapd#8285 closed: cmd/snap-update-ns: ignore EROFS from rmdir/unlink
[12:36] ah, damn, should have squash merged
[12:42] #8292 (small) needs a 2nd review
[12:42] PR #8292: travis.yml: run unit tests on master as well
[12:45] making progress :)
[12:46] brb, I'll make coffee and check up on kids
[12:57] ijohnson: did you change anything related to grubenv yesterday? cachio reports boot errors in the ci tests
[12:58] ijohnson: qemu-system-x86_64[62371]: error: file `/EFI/ubuntu/grubenv' not found.
[13:18] last things merged 2 days ago, was the start of uboot support
[13:18] shouldn't have broken amd64
[13:19] don't know if something was done to the gadget or somewhere else
[13:19] ha, ok I think I got it
[13:19] mborzecki: it works :)
[13:20] now the only thing I need is to inject one more exec
[13:20] as the pid I get is not the pid of the app
[13:20] but of the intermediate bash
[13:29] zyga: not urgent, but if you could look at my changes to #8242 it could otherwise be landable
[13:29] PR #8242: many: improve environment handling, fixing duplicate entries
[13:30] ack
[13:30] Pawel reviewed it as well
[13:30] in a few minutes
[13:30] I had a look originally but I need to read it again now as it paged out of my memory
[13:31] oh, right. 2.44 is branched so we can land 2.45 things again
[13:44] cmatsuoka: that sounds like the same issue you and cachio had at the sprint no?
[13:45] zyga, mvo, xnox: as part of my SRU verification for ubuntu-image, I just built an amd64 uc20 image, but I can't get a working system - I just get the grub menu, and whatever I select I get some errors, is that expected?
[13:45] I just want to know if it's ubuntu-image broken or something else
[13:45] sil2100: I don't know
[13:46] cachio: could you check for "bootloader files not found" messages in the system journal?
[13:46] cachio: of see if the bootloader is actually there
[13:48] ijohnson: that's a possibility, I asked sergio to confirm if that's the case
[13:48] cachio: s/of/or/
[13:49] cmatsuoka, what I see now is that the system is restarting in a loop
[13:49] sil2100: what are the menu items in the grub you see
[13:50] the last thing I see is Mar 18 13:49:05 mar181302-158658 qemu-system-x86_64[61560]: [ 16.102470] systemd[1]: Started Create Static Device Nodes in /dev.
[13:50] then always restarts in install mode
[13:50] cachio: that didn't happen with the bootloader error in the sprint, right?
[13:51] cmatsuoka, I think it is not the same but not totally sure
[13:51] cmatsuoka, sometimes I even see Mar 18 13:51:27 mar181302-158658 qemu-system-x86_64[61560]: Press enter to configure.
[13:51] and then reboots again automatically
[13:52] ijohnson: 'Recover using /systems/*', 'Install using /systems/*' and 'System setup'
[13:53] cmatsuoka, this is the last log v
[13:53] https://paste.ubuntu.com/p/6bNqps9Np2/
[13:53] cachio: that's very strange, what changed from the latest build that worked? is there a new kernel snap, or new ubuntu-image?
[13:55] There has not been a new ubuntu-image in stable since a while, 1.9 is the latest
[13:56] sil2100: thanks for the information
[13:57] cachio: so let's see what the differences are, something must be different
[13:58] cmatsuoka, cachio: when did you start noticing the problems? And are you using ubuntu-image from the snap or deb?
[13:59] That being said, I don't think we had any uc20 related changes to 1.9 even
[14:02] sil2100, snap
[14:03] cmatsuoka, I think the problem was already there
[14:03] the system is restarting and restarting until at some point it does not restart
[14:03] then it works
[14:04] cachio: coming to SU?
[14:04] cachio: while we're at it, what's up with core18 the snap? I see it's not marked as ready for beta all the time, is something failing for the pi3?
[14:04] it started failing now because I reduced the timeouts because I am trying to speed up a bit the tests because it is taking like 40 minutes each test
[14:04] cwayne, plars: ^
[14:05] sil2100, let me check
[14:06] sil2100, this is the problem https://trello-attachments.s3.amazonaws.com/5da8bc830df86851446e9d4e/5e688b02c26ce805a8a979ef/8bb39aabc282ceb61f07f599cd7d72b6/core18_20200311_(1705)_pi3_arm32.log
[14:06] the system fails to flash the image
[14:07] cachio: oh, so not really an issue with the snap itself - should we poke plars around it and see if he can help?
[14:08] I think the issue is with the device in the lab
[14:08] sil2100, not related to the image
[14:09] I triggered it again
[14:09] let's see
[14:11] sil2100: what error do you see when you try to use the `install using ...` menu item?
[14:14] sil2100: did you boot in UEFI mode, with secureboot, and snakeoil UEFI VARS?
[14:14] using OVMF firmware?
[14:14] sil2100: if you see "*" it means you booted in bios mode
[14:14] * xnox should push a grub change to print something sensible
[14:18] cachio: that doesn't make any sense, we use that image on that pi a lot
[14:21] cachio: cwayne: I can't even reach that device at the moment, so I have no idea, but it's dead now. My best guess is that something was wrong with the sd card on it, but I can't login to check
[14:25] plars, ok
[14:26] cwayne, I just tried to run smoke tests for core18 and that happened
[14:28] mborzecki, zyga: so this is now a thing: https://docs.fedoraproject.org/en-US/ci/how-to-add-dist-git-test/
[14:28] perhaps one day we could add some tests for snapd for fedora infra to run (similar to autopkgtests for debian)
[14:29] Eighth_Doctor: oh nice, we had some ideas about testing packages before pushing to distros
[14:29] nice, thank you
[14:30] you can kind of thank pitti here, he pushed for this to become a thing, and learned from the mistakes in making autopkgtests
[14:31] yay, so session tool can now track PIDs of the started processes
[14:32] separately from the session manager
[14:32] so that's good
[14:32] I added that to the cgroup-tracking test
[14:32] let's see what we get
[14:43] Eighth_Doctor: can you ask him to push it back to Debian ;D
[14:45] hah, you know as well as I do that debian doesn't do change :P
[14:49] Eighth_Doctor: it does, in debian stable ;)
[14:58] \o/
[14:58] jdstrand: I reproduced the issue reliably in spread now
[14:58] woot
[14:58] Mar 18 14:56:41 mar181449-606426 systemd[20017]: snap.8661f9ab-6dbb-4fd5-bb75-9bb871f5dddc.test-snapd-sh.sh.scope: Failed to add PIDs to scope's control group: Permission denied
[14:59] I can now push some simple fixes for session-tool
[14:59] and start working on untangling this particular error
[14:59] let me just quickly verify that it's not affecting other releaes
[14:59] *releases
[15:00] pedronis, mvo this is the log for the retries on uc20 https://paste.ubuntu.com/p/kjhYcqcmrw/
[15:01] it is saying: error: file `/EFI/ubuntu/grubenv' not found.
[15:01] quite a bit
[15:01] cmatsuoka: I had a look at the log for PR 8159, and if you merge master that snapd-failover issue should go away
[15:01] PR #8159: snap-bootstrap: remove created partitions on reinstall
[15:02] ijohnson: thanks, will do
[15:08] 16.04 passes
[15:08] I'm pretty sure 18.04 has a bug
[15:08] that's not present in 20.04 or 16.04
[15:08] ah, sorry 20.04 passed, 16.04 still running
[15:08] cachio: mvo: we don't even get to systemd in the first couple of those reboots
[15:09] afaict
[15:09] pedronis: hm, yeah, this looks like initramfs snafu
[15:10] or something uefi related
[15:11] mvo: the first time we get to EFI Variables Facility v0.08 2004-May-17
[15:11] and then there's a reboot there
[15:12] pedronis, could be related to the gadget snap version?
[15:12] I am retrying with the edge version now, this run was using the stable version
[15:13] cachio: yeah, that sounds like it could be
[15:16] mvo, I'll have results soon
[15:16] cool
[15:42] PR snapd#8294 opened: seed: make Brand() part of the Seed interface
[15:49] * cachio lunch
[15:57] mvo, pedronis using gadget from edge I see the same
[15:58] this is the full log https://paste.ubuntu.com/p/bsmwC5Q52b/
[15:58] xnox: does https://paste.ubuntu.com/p/bsmwC5Q52b/ ring any bells?
[16:03] xnox: the funny part is that after some reboots this goes away, is there maybe something in initramfs that looks for it, if it's missing the unit fails and reboots and sometimes we run stuff with the right timing and things work, would the initramfs reboot if a unit fails?
[16:04] in a meeting
[16:04] xnox: no worries, sry
[16:34] mvo: hey
[16:35] mvo: so that message about lack of grubenv is from grub. it failing to read and load grubenv, seems to indicate that this is not a UC20 boot? because UC20 gadget relies on grubenv to be present and valid.....
[16:36] xnox: aha, that's interesting (cc cachio) - what's strange is that apparently for cachio it works after a bunch of reboots in his VM setup
[16:37] mvo: i am concerned on how you are starting the VM and whether or not you passed which disk is the expected correct bootdevice
[16:37] mvo: because it seems like your tpm state will be corrupted.
[16:38] xnox, https://paste.ubuntu.com/p/JTGNRh7djq/
[16:39] xnox, should I change anything in the command?
[16:39] mvo: the boot does not seem to be complete. as the boot is in install mode, and the next one is as well.....
[16:39] do the tests reboot twice correctly? and again, without grubenv being written correctly, it will not have pointer to boot into run mode.
[16:40] cachio: you have bootindex=1 set correctly there, so boot device is there.
[16:41] xnox, do you want to inspect the image?
[16:41] mvo: cachio: i'd like to see the VM image ubuntu-core-new.img contents after it was created, but before it is booted. And the file listing of all files on all partitions, and contents of grubenv file.
[16:41] =)
[16:41] cachio: we think alike ;-)
[16:41] #8249 is super simple and needs a 2nd review ;)
[16:41] PR #8249: interfaces: make gpio robust against not-existing gpios in /sys
[16:41] heh
[16:42] #8294 this one
[16:42] PR #8294: seed: make Brand() part of the Seed interface
[16:45] mborzecki: could you review #8208 when you have time (it's not simple)
[16:45] PR #8208: boot_test: add many boot robustness tests for UC20 kernel MarkBootSuccessul and SetNextBoot
[16:47] cachio: mvo: the image looks quite small, i don't know if we have enough disk space to complete the install
[16:48] it is 5.4GB big, and I usually locally make 8G/10G big images.
[16:49] xnox, ahh, I can try with 10GB
[16:49] does it make sense?
[16:52] mvo: cachio: the failing to load grubenv is a red herring, and i should fix our grub.cfg to not print that pointlessly
[16:52] cachio: please try with a 10GB image, yes.
[16:53] pedronis: #8270 updated with the extra fields, ready for review
[16:53] PR #8270: store: support for search API v2
[16:53] xnox, sure, running
[17:01] #8286 needs a 2nd review, it's simple (mostly adding tests and improving code behavior a bit)
[17:02] PR #8286: tests: cleanup various uc20 boot tests from previous PR
[17:03] cachio: i'm afk / busy until end of day, sorry. Can chat tomorrow, if you will have uc20 logs from a bigger image.
[17:03] xnox, sure, I'll try again and send you the logs in case it fails again
[17:03] thanks for your help
[17:11] jdstrand: hey
[17:11] jdstrand: I've debugged the issue
[17:11] pedronis: ack
[17:11] jdstrand: dbus-user-session
[17:11] jdstrand: the new app tracking feature depends on it and we just didn't notice because it's pre installed since eoan (on server)
[17:12] jdstrand: I've expanded the test to completely cover running as user
[17:12] and with this package it all passes
[17:12] and without it it behaves just as your system did
[17:13] jdstrand: a bionic desktop has it installed by default
[17:13] jdstrand: but a bionic server does not
[17:13] jdstrand: can you, if you remember, confirm where you were testing my branch at the time?
[17:13] zyga: hi! (sorry, was doing 360s)
[17:14] jdstrand: no worries :)
[17:14] I'm so glad I made some progress on this branch
[17:15] zyga: iirc, it was a bionic desktop amd64 vm, logging in over ssh
[17:15] jdstrand: do you have dbus-user-session installed in that VM?
[17:15] I'm checking now
[17:18] zyga: it is installed. when I login I see this in the environ: DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
[17:18] hmmm
[17:18] ookay
[17:18] that's interesting, and it's not installed recently?
[17:18] I'll double check in any way
[17:18] thank you
[17:19] zyga: let me look at the PR again. I think I said what I was using
[17:19] right now the tests do pass in all xenial+ systems
[17:19] as root and as user
[17:19] but not as user logged in via ssh
[17:19] (I didn't write that test since it's all kind of remote)
[17:19] but maybe it's a relevant factor
[17:22] "Unfortunately, on (at least) bionic, when logged in via ssh or directly via the console"
[17:22] ok, I'll get to it
[17:22] I suspect something else is a factor then
[17:22] but I'm closer, I think
[17:22] zyga: sounds like you are getting there! :)
[17:23] zyga: it's the feature that won't stop giving :)
[17:23] oh yes
[17:24] zyga: and for completeness, logging into a vt, DBUS_SESSION_BUS_ADDRESS was set on the console as well
[17:24] I suspect that is logind for both
[17:24] I took a snapshot of my 18.04 desktop vm and I'll dig in
[17:25] yes, that's pam-systemd
[17:25] though that's really
[17:25] dbus-session-bus started in the session
[17:26] jdstrand: is your desktop VM logged in as a desktop user?
[17:27] zyga: the vm came up and is sitting at the gdm prompt. I did not login. I then ssh as the normal user (ie, the one I would use at the gdm login)
[17:28] ok
[17:28] thanks
[17:28] I have the same setup now
[17:28] and when you are logged in via ssh
[17:28] PR snapd#8295 opened: osutil: do not leave processes behind after the test run
[17:28] loginctl shows only two sessions?
[17:28] one for gdm
[17:28] zyga: I can say that /run/user/1000/bus exists and it is /lib/systemd/systemd --user that setup that socket
[17:28] and one for your ssh
[17:29] zyga: however, there is no dbus-daemon running under this user. just dbus-daemon --system (root) and dbus-daemon --session (gdm)
[17:29] hmmm
[17:29] I see
[17:29] https://www.irccloud.com/pastebin/H3nx0PgY/
[17:29] (well, there is also the accessibility bus for gdm)
[17:29] zyga 1689 0.0 0.1 49792 3840 ? Ss 18:26 0:00 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
[17:30] zyga: loginctl shows gdm and the user I logged in as
[17:31] zyga: you have that running as 'zyga' without logging into gdm?
[17:31] I ssh'd in
[17:31] yes
[17:31] but I have dbus-user-session installed
[17:31] let me log out and double check PIDs recycle
[17:31] oh wait
[17:31] it's gone now
[17:32] I have dbus-user-session installed, logged in via ssh and do *not* have the above in ps for my user
[17:32] maybe it's just activated
[17:32] ha
[17:32] wait
[17:32] it didn't
[17:32] zyga@bionic-desktop:~$ systemd-run --user --scope ls
[17:32] Job for run-re0d33bd5a3334fc39992748ea3053b88.scope failed.
[17:32] See "systemctl status run-re0d33bd5a3334fc39992748ea3053b88.scope" and "journalctl -xe" for details.
[17:32] that's exactly what you saw
[17:32] what the...
[17:32] I think my auto-login spawned dbus
[17:32] and it was around
[17:32] I logged out, killed that session
[17:32] ah, yes, that would do it
[17:32] logged out from ssh
[17:32] logged back in
[17:32] and no dbus
[17:32] *progress*
[17:33] \o/
[17:33] thank you, I won't bother you anymore until I crack this
[17:33] zyga: it feels so close! :)
[17:35] mar 18 18:32:12 bionic-desktop systemd[2002]: run-re0d33bd5a3334fc39992748ea3053b88.scope: Failed to add PIDs to scope's control group: Permission denied
[17:43] mvo: small review on 8295
[17:44] mvo: will kill-sleeper work without job control in tty?
[17:46] zyga: I don't know, I tested it locally and it was ok
[17:46] locally it would have your pty
[17:46] zyga: yeah
[17:46] let's see what happens
[17:46] mvo: I think exec sleep is better
[17:46] then you can kill it
[17:47] zyga: but I need the pid?
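A minimal sketch of the `exec sleep` idea from this exchange, written in Python rather than the shell used by the test. The helper names (`start_sleeper`, `read_pid`, `kill_sleeper`) are hypothetical and not taken from PR #8295. It assumes a POSIX `sh` and `sleep`: the shell writes `$$` to a file and then replaces itself with `sleep` via `exec`, so the recorded PID is the sleeper's own PID and can be signalled directly.

```python
import os
import signal
import subprocess
import time


def start_sleeper(pid_file, seconds=60):
    """Start a sleeper whose PID is written to pid_file by the shell."""
    # $$ is the shell's PID; exec replaces the shell with sleep and keeps
    # that PID, so no intermediate shell is left behind.
    script = 'echo $$ > "$1"; exec sleep "$2"'
    return subprocess.Popen(["sh", "-c", script, "sh", pid_file, str(seconds)])


def read_pid(pid_file, timeout=5.0):
    """Wait until the PID file has content, then return the PID."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(pid_file) as f:
                text = f.read().strip()
        except FileNotFoundError:
            text = ""
        if text:
            return int(text)
        time.sleep(0.01)
    raise TimeoutError("PID file was not written in time")


def kill_sleeper(pid_file):
    """Send SIGTERM to the process recorded in pid_file and return its PID."""
    pid = read_pid(pid_file)
    os.kill(pid, signal.SIGTERM)
    return pid
```

Because the PID is recorded before the `exec`, this does not depend on job control being available in the terminal.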
[17:47] exec :D
[17:47] h
[17:47] ah
[17:47] sorry
[17:47] :/
[17:47] missed that
[17:47] zyga: :)
[17:47] zyga: no worries
[17:47] you could echo $$ > /tmp/pid
[17:47] and kill that
[17:47] but uck
[17:47] zyga: happy about better ideas, feel free to push to the PR, it was the best I could think of
[17:47] yeah, that's fine
[17:47] let's see if it passes
[17:47] I'm debugging systemd behavior now
[17:48] but happy to look next
[17:48] zyga: no worries, if it's good enough that would be nice if not we can iterate
[17:54] I think I found the systemd bug
[17:54] mkdir("/sys/fs/cgroup/pids/user.slice/user-1000.slice/user@1000.service/run-rbabf8a8b75af4ad68cf81da60da72a47.scope", 0755) = -1 EACCES (Permission denied)
[17:54] https://bugzilla.redhat.com/show_bug.cgi?id=1413075
[17:54] seems related
[18:52] PR snapd#8294 closed: seed: make Brand() part of the Seed interface
[18:52] I give up for today
[18:52] I'm attached to systemd --user that exhibits the error
[18:52] I have debug symbols and all that
[18:52] I can reproduce the error at will
[18:53] need to jump into this with fresh head tomorrow
[18:53] jdstrand: ^ FYI
[18:53] I also have the same setup with newer systemd on focal where it doesn't fail
[18:53] that's all for today folks
[18:53] o/
[18:55] cachio: booting the test image it worked, but auto-import from /dev/sda failed (with exit code 127)
[18:57] cmatsuoka, perhaps mvo knows about which other thing to check
[18:57] cachio: auto-import.assert is indeed in /dev/sda
[18:58] cmatsuoka: any more output from auto-import ?
[18:59] ijohnson: not much, let me retrieve the actual log entries
[19:01] ijohnson: https://pastebin.ubuntu.com/p/8B3JyXMsbx/
[19:02] cmatsuoka: 127 is command-not-found IIRC
[19:02] ijohnson: /dev/sda is where the assertion is, I mounted it and it's there
[19:02] oh
[19:02] so /bin/snap whatever is not there
[19:02] that's embarrassing
[19:02] ;D
[19:02] well then
[19:02] * zyga goes away
[19:03] cachio: I think it's explained then, let me check the initramfs to see if the executable is there...
[19:04] I mean
[19:04] it's on the real system, right? so no, it's not tehre
[19:04] there
[19:06] cmatsuoka: this indicates that the snap command is not there, this usually happens on the very first boot when snapd is not seeded yet
[19:07] cmatsuoka, do you see any error during the seeding?
[19:07] mvo: ah yes, that makes sense
[19:07] cachio: seeding seems good to me
[19:09] cachio: no more auto-import messages later
[19:09] if you restart it does it work?
[19:09] does the auto-import service depend on seeded system?
[19:10] it probably shold
[19:10] *should
[19:10] zyga: it seems that it doesn't but it should, from what we see here
[19:11] I think mvo has thoughts on it already
[19:11] this is the same bug that came up during the sprint
[19:11] yes
[19:12] it's a known bug atm
[19:12] cmatsuoka, pedronis yeah, I think we either need to just echo the device paths and catchup or do a udev scan of the block devices in snap.autoimport.service. it's mostly jfdi but I didn't manage to find the time for it yet
[19:13] it's not an immediate blocker, the cost is rebooting again
[19:13] pedronis, uc20 is taking so long to boot
[19:13] ok then, now I see how this ties to the sprint conversation
[19:14] we were researching why
[19:14] cachio: well, I new run reboot shouldn't take that longer
[19:14] cachio: isn't the issue the multiple reboots?
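For the exit code 127 reported by auto-import earlier in this exchange: a POSIX shell returns 127 when the command it was asked to run cannot be found, which fits the explanation that the snap command was not yet available. A small sketch of that behaviour follows. The helper name `shell_exit_status` and the command name in the usage note are made up for illustration and have nothing to do with snapd itself.

```python
import subprocess


def shell_exit_status(command):
    """Run command through `sh -c` and return the shell's exit status."""
    result = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode
```

Calling `shell_exit_status("hypothetical-missing-command-xyz")` on a system without such a command yields 127, while a command that exists but fails returns its own status instead.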
[19:14] or is this something else
[19:14] cachio: I mean, did we fix the issue you showed today, where we get a bunch of reboots
[19:14] that don't even reach systemd
[19:15] it is something else, it was caused because the size of the image was 5g and we needed 10g
[19:15] cachio: anyway, in all my tests it booted correctly to run mode without any boot loops
[19:15] interesting, that seems big, do we know what's the space needed for?
[19:16] I don't yet
[19:16] ok
[19:16] pedronis: that's curious because the partition grow code is not there yet
[19:17] so the partition sizes should stay the same
[19:18] do we have a weird bug in install code that makes a partition of the wrong sizes?
[19:19] * zyga EODs
[19:19] cachio: could you generate a 5G image for me?
[19:19] cachio: so I can check if there's something funky in partitioning
[19:20] sure
[19:21] first, could you try restart that vm
[19:22] in my case when I restart the vm it goes in a reboot loop
[19:22] and I see error: file `/EFI/ubuntu/grubenv' not found.
[19:25] do we know if it's in the image?
[19:26] it should be (at least we put one there end of Feb)
[19:27] I am generating a new image to check again
[19:41] ijohnson: cachio: so I made an image and there's no grubenv, but that's expected I think, there will be one now with the new code
[19:42] pedronis: there won't be a grubenv on the root ubuntu-seed grubenv
[19:42] yes
[19:43] not until we set mode to run
[19:43] err sorry there won't be any variables set, I guess I don't know for certain whether there would be an empty grubenv or not
[19:43] there isn't one
[19:43] yes when we run makeBootable20RunMode then we set grubenv on ubuntu-seed
[19:43] I just made an image and mounted
[19:43] iirc
[19:43] it
[19:46] https://paste.ubuntu.com/p/w3NbfYdt5V/
[20:20] cmatsuoka: is #8159 ready for a re-review ?
[20:20] PR #8159: snap-bootstrap: remove created partitions on reinstall [20:27] pedronis: yes, I also fixed issues raised by maciek (the struct pointer issue and better attribute parsing) [20:28] cmatsuoka: ok, thanks, I'll look at it in my morning [20:29] pedronis: I'm currently skipping partitions that have a type that's not in our list. the alternative would be to fail and abort installation [21:03] * cachio afk [21:09] PR snapd#8296 opened: httputil/client_test.go: add two TLS version tests [21:10] ijohnson: seems here failover failed again, but it seems that PR got the latest code for that: https://api.travis-ci.org/v3/job/664035070/log.txt [21:11] pedronis: which pr [21:11] ijohnson: your latest [21:11] 8287 ? [21:11] #8286 [21:11] PR #8286: tests: cleanup various uc20 boot tests from previous PR [21:13] also the ubuntu-image test change breaks it [21:13] su => session-tool [21:13] s/ubuntu-image/prepare-image/ [21:13] hmm? [21:13] session-tool broke something? [21:13] I'm also not sure why using session-tool there [21:14] it seems overkill, but I may be missing something [21:14] where? [21:14] I just came to check on something and noticed [21:14] zyga: https://github.com/snapcore/snapd/pull/8286/files [21:14] PR #8286: tests: cleanup various uc20 boot tests from previous PR [21:14] zyga: ijohnson replaced a su invocation with session-tool, do the test is just calling prepare-image, and things break [21:14] do you know why it is invoked as user in the first place? [21:15] mvo wrote the original test fwiw [21:15] to check that it works [21:15] as a normal user [21:15] aha [21:15] and how does it break? [21:15] it's not finding an assertion, the breakage is strange [21:15] anyway not sure it needs to use session-tool [21:16] zyga: there's spread logs there [21:16] I mean isn't this why we have session-tool though ... [21:16] ok [21:16] zyga: https://pastebin.ubuntu.com/p/sJzvXy3dWw/ [21:16] ijohnson: do break working tests? [21:16] thanks!
[21:16] to break working tests? [21:16] pedronis: true always works ;) [21:16] no I mean to have a tool that "does all the right things" [21:16] su doesn't do all the right things [21:17] what is right or wrong is very contextually dependent [21:17] hmm [21:17] is /home/test/tmp//model.assertion sensible [21:17] as in // that feels like something not expanded [21:17] it's just a var ending in / [21:17] ah, $ROOT has / at the end [21:17] anyway works with just su [21:18] it's easy enough to back out that change but it's just frustrating that we have these problems :-/ [21:18] what places model.assertion there? [21:19] also the snapd-failover failure is very depressing tbh [21:19] ijohnson: yes, that one is [21:19] definitely [21:19] ijohnson: I made some improvements to session-tool, fixed some issues with quoting [21:19] but I don't suspect it's a factor in this case [21:19] ijohnson: the other one I would just move back to su and be happy for a while [21:19] oh wait [21:20] hmm [21:20] I'm quite sure all the other tests like this one use su anyway [21:20] so we would have to change all or none [21:20] session-tool expects stuff normally, not as one big string [21:20] maybe my quoting fix would actually help [21:20] and it doesn't seem a good use of time at this moment [21:20] ijohnson: yeah, leave it to me [21:21] I plan to do a pass [21:21] awesome, thanks zyga [21:21] I wonder if we should just stop snapd.socket then start it again rather than a restart [21:21] I just really really wish we understood why that keeps happening anyways [21:23] I reverted to su for now [21:24] we have 29 places using it still [21:25] I pushed some fixes to session-tool [21:25] https://github.com/snapcore/snapd/pull/8297 [21:25] PR #8297: tests: session-tool improvements [21:25] that's good, I don't think converting su to session-tool is something anybody should work on unless there is a good reason for specific tests at this point in time [21:26] though [21:26] too many other things
[21:26] PR snapd#8297 opened: tests: session-tool improvements [21:26] pedronis: I think it's that they are definitely not really testing the feature [21:26] which feature? [21:26] pedronis: su is not representative of a user running the code [21:26] pedronis: anything [21:27] pedronis: it's really 90% root [21:27] pedronis: 10% user [21:27] maybe [21:27] it depends [21:27] pedronis: if the point of the test is to check if it runs for normal people [21:27] su is not it [21:27] it does depend [21:27] but we should generally not use su, because "it depends" is complicated [21:27] is still not a good use of time all things considered [21:27] at this point in time [21:27] sure, at some point it will though [21:27] yea [21:28] let's try to get there [21:28] indeed :) [21:28] * zyga wrote a comment on https://github.com/snapcore/snapd/pull/7825#issuecomment-600861003 that is sad but relevant [21:28] PR #7825: many: use transient scope for tracking apps and hooks [21:28] (not to this discussion though) [21:29] and I'm back to being away [21:40] zyga: we prefer "win the lottery" instead of "hit by bus" - just saying :) [21:41] mvo: if I win a lottery I will perpetually send patches for things I like and not worry about paycheck ;) [21:41] mvo: *then* you get the bus :) [21:42] I've installed buster now [21:42] and installing nvidia driver [21:42] haha [21:44] ha! [21:44] * mvo really calls it a day [21:44] mvo: can you dput to buster-backports? [21:44] * zyga wonders if he saw that [21:54] nvidia + debian = :( [21:54] + snaps [22:41] zyga: Not so nice to hear. Thanks for investigating the Nvidia problem though [23:23] cachio: #8260 tests failed [23:23] PR #8260: tests: enable nested on core20 and test current branch [23:36] cmatsuoka, I'll push a small change now to retrigger this [23:36] need to wait until tests pass here [23:40] it seems that tests are especially slow today