=== nuccitheboss1 is now known as nuccitheboss
=== guiverc2 is now known as guiverc
=== nuccitheboss1 is now known as nuccitheboss
[06:50] teward: i am now
=== sem2peie- is now known as sem2peie
=== bdrung changed the topic of #ubuntu-devel to: Archive: Kinetic open! | Devel of Ubuntu (not support) | Build failures: http://qa.ubuntuwire.com/ftbfs/ | #ubuntu for support and discussion of Bionic-Jammy | If you can't send messages here, authenticate to NickServ first | Patch Pilots: bdrung
[09:23] is https://reports.qa.ubuntu.com/reports/sponsoring/index.html also not working for others? if so, do you know who is maintaining the service/where to report?
[09:24] bdmurray, ^ iirc you had at least access?
[09:30] seb128: i can confirm it's not working now. it did work for me about 2 hours ago
[09:35] ginggs, thanks
[10:10] hi - any chance systemd/ppc64el could be added to https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/tree/big_packages ?
[10:10] systemd/amd64 is already there, and the ppc64el test run does the exact same things, and often times out
[10:13] bluca: would you submit an MR please? also -> #ubuntu-release
[10:18] can do, what's the repo?
[10:27] bluca: as above? https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/ or do I misunderstand?
[10:34] no my bad, thought you meant on Salsa for some reason - https://code.launchpad.net/~bluca/autopkgtest-cloud/+git/autopkgtest-package-configs/+merge/424487
[10:40] bluca: thanks!
[10:43] thanks for the review - who can do the merge?
[10:48] anyone in ~ubuntu-release -- i've pinged them
[10:50] thanks!
[12:42] seb128: http://reqorts.qa.ubuntu.com/reports/sponsoring/ working again for me now
[12:44] ginggs, thanks; it also works here now
[12:44] vorlon: "do we know why these tests are running more slowly on ppc64el than on other archs" > no, I did not investigate any further. I just noticed the tests were very inconsistently hitting timeouts due to the race conditions.
[14:15] ginggs: can you look at #1814302 and advise waveform on prepping kinetic and jammy patches and not have version collission?
[14:15] you sponsored it last according to LP
[14:15] while waveform TIL
[14:16] also i cant spell apparently. *goes to get coffee*
[14:16] (merge/sync for Jammy == regression introduction)
[14:18] teward, the last rebuild that ginggs sponsored was to do with ... (checks notes) ... oh, the debhelper prerm fun. I think the actual sync from debian was before that but I've no idea what triggered that (given we had a delta for the lxd apparmor bits)
[14:19] waveform: based on affected releases I would say Jammy introduced the dropped delta which regressed the SRU
[14:20] but to apply in Kinetic and Jammy you need to avoid the version string collision (same version, two releases)
[14:20] at least until newer quassel is merged :P
[14:20] i am on a commute or i would JFDI for uploading :)
[14:20] yeah, I figured it'd get sponsored into kinetic, then I could look at SRU'ing to jammy (with something like ~22.04.1 on the end)
[14:23] but if it can go into both that's certainly easier
[14:24] use -Xubuntu0.22.04.1
[14:24] *points at sec team docs*
[14:24] waveform: if you create the jammy debdiff i'll upload both simultaneously
[14:25] at least... when i'm at my computer
[14:25] phone is not capable of debsign xD
[14:26] *adds a note to stab bryceh about nginx merge from Debian*
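To make the version-collision point above concrete: a rough sketch of how the two uploads could be numbered so the jammy SRU sorts below the kinetic version. The package name "foo" and the base version are hypothetical, not the actual package discussed.

    # Hypothetical package "foo" at Debian version 1.2-3 in both releases:
    #   kinetic upload:   foo 1.2-3ubuntu1
    #   jammy SRU upload: foo 1.2-3ubuntu0.22.04.1
    # The jammy version sorts lower, so a later jammy -> kinetic upgrade
    # still sees a newer package. Quick sanity check with dpkg:
    dpkg --compare-versions 1.2-3ubuntu0.22.04.1 lt 1.2-3ubuntu1 && echo "jammy SRU sorts below kinetic"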
[15:03] teward: I doubt you'll have to worry about a new quassel for a while, perhaps years. They're slower than turtles crawling through peanut butter at their releases.
[15:03] :P
[15:03] Eickmeyer: maybe, but i can expect a -3 in Debian at any time so :P
[15:04] Fair.
[15:04] Seeded in Studio, FWIW.
[15:05] good afternoon
[15:06] ddstreet: hello there, I'm having a problem which might be in systemd and I'm collecting info to try to report a bug, if you have some mins I would appreciate your advice, details below
[15:07] - I have some servers to [re]build kubuntu packages
[15:07] - in these servers I have LXD containers
[15:08] - in the containers meant to build the package I also do an autopkgtest run
[15:08] - I'm using autopkgtest with the LXD backend, therefore I'm using nested containers
[15:09] - when autopkgtest starts it executes something like this to verify the container actually started:
[15:09] lxc exec runlevel
[15:10] the thing is, since some time ago, "runlevel" no longer works in nested containers
[15:10] so I have been researching the problem and I found out a couple of interesting systemd services:
[15:10] waveform: uploaded to Kinetic and jammy-proposed
[15:11] systemd-update-utmp.service and systemd-update-utmp-runlevel.service
[15:11] the first one works fine, the second doesn't
[15:12] the second service is supposed to execute "/lib/systemd/systemd-update-utmp runlevel"
[15:13] if I execute that manually in a nested container I get this line of output "Failed to get new runlevel, utmp update skipped."
[15:13] teward, thanks
[15:14] I have been looking into systemd's upstream git, but I couldn't find anything interesting so far
[15:14] any advice to continue the research? I'm a bit stuck with this atm
[15:16] oh I forgot to mention
[15:17] executing "runlevel" in a nested container is always returning "unknown" (and that's what is making my autopkgtest executions fail)
[15:18] santa_ in the top-level container, are you using 'security.nesting=true'? if not, it's worth giving that a try first
[15:19] ddstreet: yes of course, also in case you are wondering this happens both with privileged and non-privileged containers
[15:20] hmm, and is the 'runlevel' call failing for older releases too or just jammy/kinetic?
[15:21] in my server I have 20.04
[15:21] in the building containers I have 18.04
[15:21] the nested containers have kinetic
[15:22] however, I tried other versions
[15:22] for example this happens with 20.04 top-level container + 20.04 nested container
[15:23] also 20.04 top-level + kinetic nested container
[15:24] ... and others (I don't remember all combinations I tried but I couldn't get the runlevel thing working yet)
[15:24] i think you'll likely need to enable systemd debug in the nested container to get more detail about why 'runlevel' is failing
[15:24] maybe you can manually start the nested container and manually check 'runlevel'
[15:26] yes, that's what I'm doing, it's always returning "unknown"
[15:27] I mean in addition to the building infra I have, I have been creating other containers manually to experiment
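One way to follow the "enable systemd debug in the nested container" suggestion above, sketched with hypothetical container names ("outer" for the build container, "inner" for the nested one) rather than the real setup:

    # Raise pid1's log level inside the nested container, then look at what
    # is still queued or logged while it sits in the "starting" state.
    lxc exec outer -- lxc exec inner -- systemctl log-level debug
    lxc exec outer -- lxc exec inner -- systemctl list-jobs
    lxc exec outer -- lxc exec inner -- journalctl -b -p debug --no-pager | tail -n 50
    lxc exec outer -- lxc exec inner -- runlevel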
[15:27] what does 'systemctl is-system-running' report?
[15:31] ddstreet: "starting" on nested container
[15:31] yeah that's the problem
[15:31] I think it should say "running" at this point
[15:32] the 'runlevel' program connects to pid1 over dbus, and asks for the state of specific units, namely 'graphical.target', 'multi-user.target', and 'rescue.target'
[15:32] if any of those are 'active' (or 'reloading'), it uses that 'runlevel' (respectively, 5, 3, 1)
[15:33] if none of those are active/reloading, it falls into the error you're seeing
[15:33] and if the system is still 'starting' none of those will be active yet
[15:33] and really, that makes sense, if the system is still starting it isn't in any 'runlevel'
[15:35] is autopkgtest the code calling 'runlevel' in the container? if so, it should be waiting for the container to start, like with 'systemctl --wait is-system-running' or similar
[15:35] ack
[15:35] yes, let me find the code...
[15:36] https://salsa.debian.org/ci-team/autopkgtest/-/blob/master/virt/autopkgtest-virt-lxd#L101
[15:37] perhaps it's about time that autopkgtest switches to 'systemctl --wait is-system-running'?
[15:38] well they have a loop there, which is fine (i guess 'runlevel' is ok to use if you don't know if the container is running systemd...?)
[15:38] i think they need to increase that timeout though, the default system timeout for units is 90 seconds
[15:39] so any unit that delays the bootup will almost certainly go past that 60 second timeout
[15:39] yeah, I have a customized autopkgtest increasing some timeouts
[15:39] in the container you could check what service is delaying bootup, networking is a frequent offender (i.e. systemd-networkd-wait-online) but it could be something else
[15:40] ah, I have 3 failed units, I could have a look into that
[15:41] yep, probably one or more of those delaying boot
[15:41] also, if you look at line 109 of the code I linked above, that seems to contain legacy code in case you are not using systemd
[15:42] so I guess the idea of executing runlevel was to support old systems without systemd
[15:42] yep, seems reasonable since Debian does allow not using systemd
[15:43] main thing that probably needs changing there is to increase the timeout from 60 to 120 or so
[15:43] unless the intention is to detect failed boot-time units, of course
[15:46] ddstreet: funny coincidence I have 120 in mine + other fix: https://git.launchpad.net/~tritemio-maintainers/tritemio/+git/autopkgtest/commit/?id=976c322fca13a69a366c7dc60f1fe423702ff7ba
[15:46] Commit 976c322 in ~tritemio-maintainers/tritemio/+git/autopkgtest "Increase various timeouts"
[15:47] maybe I should send part of that patch to the autopkgtest guys
[15:53] ok, so going back to systemd in nested containers, it seems there is a problem with apparmor
[15:54] so I have 3 failed services:
[15:54] - apparmor.service
[15:54] - networkd-dispatcher.service
[15:54] - snapd.apparmor.service
[16:16] ok I have been playing around with a kinetic container nested in another kinetic container:
[16:16] - I had the 3 services mentioned above failing
[16:17] - I have removed the "apparmor" and "networkd-dispatcher" packages, after that no more services failing
[16:18] - however the nested container is still in the same weird state
[16:20] santa_ i started up a nested kinetic, and it looks like dbus isn't running, or is having major problems
[16:20] weird state = runlevel reporting 'unknown', 'systemctl is-system-running' says 'starting'
[16:21] ddstreet: ah, I remember some packages being updated and complaining about dbus or something
[16:21] yeah while 'starting' runlevel will always return 'unknown'
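To illustrate the explanation at 15:32-15:33: 'runlevel' itself asks pid1 over D-Bus, but a rough shell approximation of the check it performs (a sketch, not the real implementation) looks like this:

    # Map the three legacy targets to runlevels 5/3/1; if none of them is
    # active or reloading (e.g. the system is still "starting"), report unknown.
    for pair in graphical.target:5 multi-user.target:3 rescue.target:1; do
        unit=${pair%%:*}; level=${pair##*:}
        state=$(systemctl is-active "$unit" 2>/dev/null)
        if [ "$state" = "active" ] || [ "$state" = "reloading" ]; then
            echo "N $level"
            exit 0
        fi
    done
    echo "unknown"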
[16:22] also the multi-user, graphical or emergency targets were not reached apparently
[16:23] yep, i am pretty sure the nested lxd containers are causing some issue(s)
[16:23] i don't think it's necessarily systemd that's broken
[16:24] yeah, I share that opinion
[16:24] * I agrre
[16:24] * I agree
[16:24] also I have "Unexpected error response from GetNameOwner(): Connection terminated" from various systemd services
[16:24] yep me too
[16:27] slyon have you looked into problems with nested lxd containers with kinetic? ^
[17:18] ok, I have found a workaround
[17:18] https://bugs.launchpad.net/cloud-init/+bug/1905493
[17:18] Launchpad bug 1905493 in snapd "cloud-init status --wait hangs indefinitely in a nested lxd container" [Low, Confirmed]
[17:19] ddstreet: this thing you mention here: https://bugs.launchpad.net/cloud-init/+bug/1905493/comments/2 fixes 'runlevel' for me
[17:22] + removing "apparmor" package
[17:52] santa_ ah wow that bug still isn't fixed! wonder if server team might want to review it, cpaelzer_ ^
[17:53] ddstreet: probably it was fixed, then at some point regressed
[17:53] I mean I have been taking some time for personal reasons, but a few months ago my test rebuilds were working
[17:59] bdmurray: Any idea why this systemd-fsckd test would be marked FAIL in a PPA autopkgtest, despite returning 77 and having 'skippable' in the test restrictions? https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic-enr0n-systemd-251/kinetic/amd64/s/systemd/20220607_181205_41289@/log.gz
[17:59] I ran it locally in qemu, and the test was skipped as expected
[18:00] enr0n did it output anything to stderr and not have allow-stderr?
[18:01] ddstreet: AFAICT, no. It just makes a print() call stating "SKIP: root file system is being checked by initramfs already"
[18:02] So that's stdout
[18:03] enr0n: Out of curiosity, what does the systemd-fsckd test output look like when you run it locally?
[18:03] enr0n: Also what version of autopkgtest are you using when you run it locally?
[18:04] bdmurray: Here is the local output: https://paste.ubuntu.com/p/nK88K8nxfQ/
[18:05] bdmurray: And I have autopkgtest 5.20 installed on jammy
[18:06] enr0n: that test ran with systemd 251-2ubuntu1~ppa3
[18:06] but your PPA now has 251.2-2ubuntu1~ppa1
[18:07] It's also worth mentioning that the autopkgtest deb is not the same autopkgtest code being used in the infrastructure
[18:10] ginggs: ah, weird. Yeah it looks like the testbed for my 251.2-2ubuntu1~ppa1 test died on amd64. Thanks for catching that.
[18:12] enr0n: Alright, are you set then?
[18:13] enr0n as ginggs mentioned that test was with ~ppa3 and pull-ppa-source for that version shows it doesn't include 'skippable'
[18:17] bdmurray: Yeah I think so. However, my PPA test for amd64 with trigger systemd/251.2-2ubuntu1~ppa1 shows a kernel panic, but is still listed on https://autopkgtest.ubuntu.com/running.
[18:18] I don't know how that page works exactly, but maybe that test needs to be killed manually or something?
[18:19] ddstreet: yeah, thank you. I see that now. I got confused by the timing of my test results. The tests I had triggered last week are still running I guess.
[18:24] enr0n: I'll kill that test run
[18:24] bdmurray: thanks!
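For reference on the 'skippable' mechanism discussed above: with the skippable restriction declared in debian/tests/control, a test that exits 77 is recorded as a skip rather than a failure. A generic, illustrative entry (not systemd's actual test metadata, and the test name is made up):

    # debian/tests/control
    Tests: example-test
    Depends: @
    Restrictions: skippable

    # and inside the corresponding test script, a skip looks like:
    echo "SKIP: nothing to check in this environment"
    exit 77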
[19:00] !dmb-ping
[19:00] rbasak, sil2100, teward, bdmurray, kanashiro, seb128: DMB ping
[19:01] *spawns in, but requires rbasak to pay for the coffee this time*
[19:03] * genii 's ears perk up for a moment at the mention of coffee
[19:04] * arraybolt3[m] activates coffee teleporter and accidentally explodes cup everywhere
[19:45] bdmurray: I think something funky is happening with the rest of my PPA autopkgtests listed on https://autopkgtest.ubuntu.com/running#pkg-systemd. They have each been showing various "Running for" durations the last couple days (i.e. the durations don't appear to be strictly increasing).
[20:01] enr0n: looking
[20:07] enr0n: the arm64 one is still running, I killed the amd64 one again
[20:09] enr0n: the s390x test is also running
[20:10] enr0n: so is the ppc64el one.
[20:11] enr0n: Or are you concerned that the tests should have finished by now but keep running for some reason?
[20:11] bdmurray: Yes, the latter. They have been "running" since at least Friday I believe.
[20:17] enr0n: When you run it locally how long would it take for all the tests to finish?
[20:20] bdmurray: Hm, 30 minutes or so? If I do run it locally, I usually turn my attention to something else for a while.
[22:42] enr0n: the amd64 test of systemd should stop running now
[23:18] bdmurray: thanks, I have the full logs for that now