/srv/irclogs.ubuntu.com/2021/10/12/#snappy.txt

ijohnson[m]waltman: you should create a forum topic with more detailed output to share for folks to look at, it's hard to tell what the problem is without seeing the full output and many folks are offline right now01:13
ijohnson[m]mwhudson: so is there an easy reproducer for this bug ?01:13
waltmanijohnson[m]: What category should it go in?01:29
ijohnson[m]waltman: probably the snapd category makes the most sense01:29
waltmanCool, that's what I picked01:30
mwhudsonijohnson[m]: yes, boot the server installer and pkill snapd01:36
mwhudsoner sudo pkill -TERM snapd01:36
ijohnson[m]mwhudson: which server installer?01:37
mwhudsonijohnson[m]: impish01:37
mwhudsonijohnson[m]: https://cdimage.ubuntu.com/ubuntu-server/daily-live/pending/impish-live-server-amd64.iso01:37
ijohnson[m]perfect thanks, I'll give it a try and see what's up01:37
mwhudsonthanks01:37
mwhudsonijohnson[m]: having fun yet?03:18
ijohnson[m]yeah this is weird03:19
ijohnson[m]mwhudson: so would the previous installers where this was working have used the release pocket or would they have been using the proposed pocket ?03:28
ijohnson[m]cause I see the same behavior with 2.53+21.10 deb of snapd too03:28
ijohnson[m]so I'm thinking this was broken between 2.51.7 -> 2.5303:28
mwhudsonijohnson[m]: the isos are built with the release pocket03:31
ijohnson[m]and that hasn't changed recently right03:31
mwhudsonijohnson[m]: i'm not sure if you saying it's weird makes me happy (i didn't miss something obvious) or sad (it might be a pain to fix)03:31
mwhudsonijohnson[m]: no03:31
mwhudsonwhen did 2.53 hit the release pocket?03:32
ijohnson[m]it's okay to have complicated and conflicting feelings about snapd03:32
ijohnson[m]2.53+21.10ubuntu1 just hit like 2 days ago03:32
ijohnson[m]but I can see the same behavior with 2.53+21.10 which is why I was wondering if it changed03:32
mwhudson2.53~pre1.git19b68f708 landed about two weeks ago it seems03:33
ijohnson[m]that's the one I'm trying now03:33
mwhudsontempted to build my own snapd with more debugging03:36
ijohnson[m]seems fine with that version03:36
mwhudsonah ok03:36
ijohnson[m]that's probably the next step03:36
mwhudsonhopefully that's a smaller diff to read :)03:36
ijohnson[m]yeah03:36
mwhudsonoh no03:37
mwhudsoni wonder if it's go 1.16 vs go 1.1703:37
mwhudsonah no 2.53~pre1.git19b68f708 was built with 1.1703:38
ijohnson[m]hmm though after numerous iterations on the same VM I can't reproduce it anymore 😕03:38
mwhudsonhmm03:41
mwhudsoni'll try an iso with 2.53~pre1.git19b68f708 installed03:41
ijohnson[m]yeah all I was doing was just downloading the debs and installing them in the root shell03:42
mwhudsonhmm 2.53~pre1.git19b68f708 seems the same here :/03:42
mwhudsonoh but i got "WARNING: cannot gracefully shut down in-flight snapd API activity in: 25s"03:43
mwhudsonhow did it get that far without writing maintenance.json03:45
ijohnson[m]yeah so I booted a fresh iso that had 2.53+21.10ubuntu1 on it, reproduced the bug, immediately downloaded snapd_2.53~pre1.git19b68f708_amd64.deb + installed it and now I can't reproduce the bug anymore03:45
mwhudsonhmmmm03:45
mwhudsondid snapd get restarted during the upgrade?03:45
ijohnson[m]yes03:48
ijohnson[m]so I see the same thing with 2.53+21.10 too03:48
ijohnson[m]this is what I'm doing https://pastebin.ubuntu.com/p/ZR2Zh62DrP/03:49
mwhudsoni installed ~pre1 into an iso and trying to reinstall (i accidentally left the deb in the live session's rootfs) is hanging03:49
ijohnson[m]maybe I'm not triggering the bug the same way?03:49
ijohnson[m]the first time I run pkill -TERM it doesn't kill snapd immediately, but after downgrading the deb, then snapd is restarted and now it doesn't exhibit the bug 03:50
mwhudsonyeah so in general it seems after snapd has restarted once it's ok03:50
ijohnson[m]yeah agreed03:54
ijohnson[m]it's getting a bit late for me, but I think checking if an iso built with 2.53+21.10 and another one built with 2.53~pre1-blah are affected the same way would be super helpful in bisecting where this got broken03:55
ijohnson[m]I really hope we don't have to go all the way back to 2.51.7, but also that would be really surprising if this only recently just started failing03:55
mwhudsontrying 2.51.1+21.10 now04:01
mwhudsonseems to fail the same way???04:04
mwhudsoni have to go and make dinner now04:04
ijohnson[m]oh noes04:04
ijohnson[m]mwhudson: I pinged the EU folks who should be around in 1-2 hours who can take a look04:04
mwhudsonijohnson[m]: maybe it will turn out to be systemd's fault!04:05
ijohnson[m]Oooh that's my favorite04:06
mwhudsonijohnson[m]: i'm stumped and am giving up for now, hopefully the european wizards can figure it out04:15
mwhudson(also had covid vaccine #2 yesterday and am finding it a bit hard to think)04:16
mardymvo: hi! Does golang have a sort of event loop? I'm just trying to make some sense out of https://bugs.launchpad.net/ubuntu-cdimage/+bug/1946656 (added a few questions in the last comment)06:31
mupBug #1946656: [daily impish-live-server] snap stuck in the installer system <fr-1794> <snapd:New for mardy> <subiquity:New> <Ubuntu CD Images:New> <https://launchpad.net/bugs/1946656>06:31
mborzeckimwhudson: still around, i see that subiquity restarts the snapd.service for some reason, and it takes a while for snapd to go down, however after it is finally down, snapd.socket is stopped too (does subiquity request that?), anyways, if the socket is down, snap list will obviously block07:09
zygagood morning 07:25
mupPR snapd#10908 opened: cmd/snap-failure: use snapd from the snapd snap if core is not present <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10908>07:54
mborzeckizyga: hey08:01
zygamborzecki, hey :)08:02
zygahow are you today?08:02
zygaI'm playing with intellinet PDU08:02
mborzeckizyga: fighting fires with PRs08:02
zygamborzecki, good luck, is the fire externally caused or internal / rushing fixes for impish?08:05
mardyzyga: hi!08:05
zygahey mardy 08:05
sil2100mvo, mardy, mborzecki: thanks for looking into LP: #1946656 ! Hope you and the subiquity guys can find out what's up there since this is basically a 21.10 release-blocker, so we'd appreciate all-hands-on-deck for that one o/08:22
mupBug #1946656: [daily impish-live-server] snap stuck in the installer system <fr-1794> <snapd:New for mardy> <subiquity:New> <Ubuntu CD Images:New> <https://launchpad.net/bugs/1946656>08:22
mardysil2100: maybe you can help me: can I replace a binary in the impish-live-server-amd64.iso? I mounted it as a loop device, but inside it I don't see a normal Linux FS; just the boot/ folder, and then a debian repo08:35
mardyah, I just found a couple of squashfs files08:37
sil2100mardy: depending on what kind of changes you need to do, but I suppose using something like https://github.com/mwhudson/livefs-editor might be helpful!08:37
mardysil2100: wow, that looks handy, thanks!08:46
mborzeckisil2100: who can take a look into subiquity? i see there's like 25 connections to /run/snapd.socket from /snap/subiquity/2793/usr/bin/python3.8 -m subiquity.cmd.server that don't go away when snapd is shutting down (also what's causing snapd to wait longer due to a graceful shutdown)08:50
sil2100mwhudson, dbungert: ^08:54
mwhudsonmborzecki: yeah i noticed the same thing08:55
mborzeckimwhudson: just added https://bugs.launchpad.net/ubuntu-cdimage/+bug/1946656/comments/1308:55
mupBug #1946656: [daily impish-live-server] snap stuck in the installer system <fr-1794> <snapd:New for mardy> <subiquity:New> <Ubuntu CD Images:New> <https://launchpad.net/bugs/1946656>08:55
mwhudsonmborzecki: all i can think is that moving from core18 to core20 brought a new version of some library that is messing things up08:55
mardymwhudson: how do I install livefs-edit? I tried "pip install .", but then the program fails with:08:57
mborzeckimwhudson: i suspect the problem here is that snapd tries to do a graceful shutdown and waits for those connections to be idle, but somehow they do not enter such state, so eventually we hit a sigterm timeout in systemd which issues a sigkill, thus snapd fails, triggering snap-failure which hits a 'bug' i fixed in https://github.com/snapcore/snapd/pull/10908, but fixing the connections part would also make the problem go away08:57
mupPR #10908: cmd/snap-failure: use snapd from the snapd snap if core is not present <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10908>08:57
mardy  File "/home/mardy/.local/bin/livefs-edit", line 8, in <module>08:57
mardy    sys.exit(__main__())08:57
mardyTypeError: 'module' object is not callable08:57
mwhudsonmardy: oops08:58
mwhudsonmardy: i usually just run "sudo PYTHONPATH=~/src/livefs-editor python3 -m livefs_edit ..."08:58
mardymwhudson: I'll follow your steps, then, thanks :-)08:59
mwhudsonmborzecki: yeah i wonder why the connections are not going idle08:59
mwhudsonmborzecki: maybe subiquity isn't reading the complete response or something?08:59
mborzeckimwhudson: like a trailing \n or something, hmm that's possible09:00
mwhudsonone quick hack coming up09:00
mborzeckimwhudson: also why so many connections? maybe there's a connection per request?09:01
mwhudsonmborzecki: i suspect the answer to that is the answer to the other thing09:01
mwhudsonmaybe requests_unixsocket just isn't very good09:02
mborzeckimwhudson: yeah, might be, perhaps the connection does not become idle as data was not fully received, and a new one is created for a subsequent request09:02
mardymwhudson: could it be possibly related to https://bugs.launchpad.net/snapd/+bug/1943169 ?09:05
mupBug #1943169: snapd daemon doesn't send an EOF <snapd:Invalid by mardy> <https://launchpad.net/bugs/1943169>09:05
mwhudsonmardy: i doubt it09:06
mardymborzecki: regardless of that, can't/shouldn't snapd just close all the connections abruptly?09:06
mwhudsonpfffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff09:06
mwhudsonso this patch seems to fix the issue https://paste.ubuntu.com/p/ThkpshCXRz/09:07
mborzeckimardy: well, we can, but that's not nice 😉09:07
mardymwhudson: nice! ship it! :-)09:07
mborzeckimwhudson: ah enter/exit 😉 nice09:08
mardymborzecki: yes, but relying on clients behaving properly is not robust (though it's nice :-) ), since we might not control them09:09
mborzeckimwhudson: so what's the plan now? fix in subiquity, rebuild the package & respin the image?09:09
mwhudsonmborzecki: my plan is to ask sil2100 what to do :)09:10
mborzeckihaha 🙂 fair enough09:10
mborzeckimaybe i can patch the livecd and check with the patch locally09:11
sil2100mwhudson: ooooh! So we can fix this in subiquity then?09:17
sil2100mwhudson: yes, if this works then damn, let's get this into subiquity stable and respin o/09:18
mborzeckisil2100: please let me know when you have something09:20
mwhudsonmborzecki: have you seen the joy/horror that is ./scripts/quick-test-this-branch.sh in the subiquity tree?09:22
mborzeckimwhudson: oooh, interesting09:24
mwhudsonmborzecki, sil2100: https://github.com/canonical/subiquity/pull/109409:24
mupPR canonical/subiquity#1094: close the session object after each request to the snapd API <Created by mwhudson> <https://github.com/canonical/subiquity/pull/1094>09:24
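The idea in the PR title above (close the session after each request to the snapd API) can be sketched roughly as follows. This is not the subiquity patch: it uses the standard library's http.client in place of requests_unixsocket, and UnixHTTPConnection and snapd_get are hypothetical names chosen for illustration. The default socket path is the one quoted earlier in the log.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix domain socket instead of TCP.

    Hypothetical helper for illustration; not part of subiquity or snapd.
    """

    def __init__(self, socket_path, timeout=10):
        # The host name is only used for the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def snapd_get(path, socket_path="/run/snapd.socket"):
    """Issue one GET and always close the connection afterwards.

    Reading the full body and closing in `finally` means no connection is
    left half-read or lingering on the socket once the call returns.
    """
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()  # drain the whole response before closing
        return resp.status, body
    finally:
        conn.close()
```

On a machine with snapd running, something like `snapd_get("/v2/snaps")` would return the status code and raw JSON body; each call opens and closes its own connection, trading a little connection reuse for a daemon that never sees stale client sockets.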
mardymborzecki: are you doing any more work on this issue? Just to double-check that we don't do duplicate efforts09:38
mborzeckimardy: not really, trying to repack subiquity snap and edit the livecd09:39
mardymborzecki: I've reproduced part of the issue locally, where snapd gets stuck after a TERM signal if it has a pending connection, and I'd like to try to fix it (maybe wait for three seconds, and then just close all connections)09:39
mardymborzecki: nice09:39
mborzeckimardy: look at the code in daemon.go, we pass context.WithTimeout() to Shutdown(), which in theory should hit a timeout at some point09:40
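The "wait a bounded time, then close whatever is left" shutdown discussed above can be illustrated with a small sketch. This is not snapd's code (snapd is written in Go and, per the message above, passes a context with a timeout to Shutdown()); it is a Python illustration of the same idea, and ConnectionTracker and its methods are hypothetical names.

```python
import threading
import time


class ConnectionTracker:
    """Tracks open client connections so a daemon can drain them on shutdown.

    Hypothetical illustration only; anything with a close() method can be
    tracked (for example a socket object).
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._open = set()

    def add(self, conn):
        with self._cond:
            self._open.add(conn)

    def remove(self, conn):
        # Called when a connection finishes on its own.
        with self._cond:
            self._open.discard(conn)
            self._cond.notify_all()

    def shutdown(self, grace_seconds):
        """Wait up to grace_seconds for connections to finish, then close
        any that remain. Returns the number of connections force-closed."""
        deadline = time.monotonic() + grace_seconds
        with self._cond:
            while self._open:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break  # grace period used up; stop being polite
                self._cond.wait(remaining)
            leftover = list(self._open)
            self._open.clear()
        for conn in leftover:
            conn.close()
        return len(leftover)
```

The point of the deadline is that a misbehaving client can delay shutdown by at most grace_seconds, so the daemon exits on its own terms instead of being killed by the service manager's stop timeout.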
mborzeckimeh, livefs-editor doesn't like me really10:12
frederic_02speak french?10:23
mupPR snapd#10909 opened: daemon: make daemon shutdown timeout shorter <Simple 😃> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10909>11:00
mupPR snapd#10901 closed: overlord: add managers unit test demonstrating cyclic dependency between gadget and kernel updates <Skip spread> <Created by bboozzoo> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/10901>11:20
mupPR snapd#10910 opened: o/snapstate: check snaps for duplicate or invalid names <Created by MiguelPires> <https://github.com/snapcore/snapd/pull/10910>12:55
mupPR snapd#10911 opened: daemon: use the syscall connection to get the socket credentials <Created by mardy> <https://github.com/snapcore/snapd/pull/10911>13:00
mupPR snapd#10912 opened: tests: not testing lxd snap anymore on i386 architecture <Simple 😃> <Created by sergiocazzolato> <https://github.com/snapcore/snapd/pull/10912>14:31
flotterIs snapd supported on the platforms under packages/ and are all those up to date ?14:43
flotteri.e. fedora14:44
mupPR snapd#10890 closed: tests: using test-snapd-curl snap instead of http snap <Created by sergiocazzolato> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10890>15:31
ijohnson[m]flotter: do you mean `packaging/` dir in the snapd git tree? if so then yes those are the supported distros and they should generally be up to date, sometimes they can lag a bit since it takes time to do release artifacts and get them through the upstream pipelines, etc after we have cut a tag for a release15:40
miguelpiresmvo: can you merge this https://github.com/snapcore/snapd/pull/10897 please? failures are unrelated16:02
mupPR #10897: osutil: ensure parent dir is opened and sync'd <Simple 😃> <Created by MiguelPires> <https://github.com/snapcore/snapd/pull/10897>16:02
mvomiguelpires: sure16:51
mupPR snapd#10897 closed: osutil: ensure parent dir is opened and sync'd <Simple 😃> <Created by MiguelPires> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10897>16:56
mupPR snapd#10913 opened: [WIP] docs: update HACKING.md instructions <Created by flotter> <https://github.com/snapcore/snapd/pull/10913>17:36
