=== nuccitheboss1 is now known as nuccitheboss | ||
=== guiverc2 is now known as guiverc | ||
ginggs | teward: i am now | 06:50 |
---|---|---|
=== sem2peie- is now known as sem2peie | ||
=== bdrung changed the topic of #ubuntu-devel to: Archive: Kinetic open! | Devel of Ubuntu (not support) | Build failures: http://qa.ubuntuwire.com/ftbfs/ | #ubuntu for support and discussion of Bionic-Jammy | If you can't send messages here, authenticate to NickServ first | Patch Pilots: bdrung | ||
seb128 | is https://reports.qa.ubuntu.com/reports/sponsoring/index.html also not working for others? if so, do you know who is maintaining the service/where to report? | 09:23 |
seb128 | bdmurray, ^ iirc you had at least access? | 09:24 |
ginggs | seb128: i can confirm it's not working now. it did work for me about 2 hours ago | 09:30 |
seb128 | ginggs, thanks | 09:35 |
bluca | hi - any chance systemd/ppc64el could be added to https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/tree/big_packages ? | 10:10 |
bluca | systemd/amd64 is already there, and the ppc64el test run does the exact same things, and often times out | 10:10 |
ginggs | bluca: would you submit a MR please? also -> #ubuntu-release | 10:13 |
bluca | can do, what's the repo? | 10:18 |
ginggs | bluca: as above? https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/ or do I misunderstand? | 10:27 |
bluca | no my bad, thought you meant on Salsa for some reason - https://code.launchpad.net/~bluca/autopkgtest-cloud/+git/autopkgtest-package-configs/+merge/424487 | 10:34 |
ginggs | bluca: thanks! | 10:40 |
bluca | thanks for the review - who can do the merge? | 10:43 |
ginggs | anyone in ~ubuntu-release -- i've pinged them | 10:48 |
bluca | thanks! | 10:50 |
ginggs | seb128: http://reqorts.qa.ubuntu.com/reports/sponsoring/ working again for me now | 12:42 |
seb128 | ginggs, thanks; it also works here now | 12:44 |
enr0n | vorlon: "do we know why these tests are running more slowly on ppc64el than on other archs" > no, I did not investigate any further. I just noticed the tests were very inconsistently hitting timeouts due to the race conditions. | 12:44 |
teward | ginggs: can you look at #1814302 and advise waveform on prepping kinetic and jammy patches and not have a version collision? | 14:15 |
teward | you sponsored it last according to LP | 14:15 |
teward | while waveform TIL | 14:15 |
teward | also i cant spell apparently. *goes to get coffee* | 14:16 |
teward | (merge/sync for Jammy == regression introduction) | 14:16 |
waveform | teward, the last rebuild that ginggs sponsored was to do with ... (checks notes) ... oh, the debhelper prerm fun. I think the actual sync from debian was before that but I've no idea what triggered that (given we had a delta for the lxd apparmor bits) | 14:18 |
teward | waveform: based on affected releases I would say Jammy introduced the dropped delta which regressed the SRU | 14:19 |
teward | but to apply in Kinetic and Jammy you need to avoid the version string collision (same version, two releases) | 14:20 |
teward | at least until newer quassel is merged : P | 14:20 |
teward | i am on a commute or i would JFDI for uploading :) | 14:20 |
waveform | yeah, I figured it'd get sponsored into kinetic, then I could look at SRU'ing to jammy (with something like ~22.04.1 on the end) | 14:20 |
waveform | but if it can go into both that's certainly easier | 14:23 |
teward | use -Xubuntu0.22.04.1 | 14:24 |
teward | *points at sec team docs* | 14:24 |
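To illustrate the convention teward is pointing at: for a hypothetical package whose Debian revision is 1.2-3, the development upload and the jammy SRU could be versioned so the SRU sorts below the devel version and the two uploads never collide (names and numbers below are made up, not taken from the bug):

```sh
# hypothetical versions for a package at Debian revision 1.2-3:
#   kinetic (devel):  1.2-3ubuntu1
#   jammy   (SRU):    1.2-3ubuntu0.22.04.1
# check that the SRU version sorts strictly below the devel one:
dpkg --compare-versions 1.2-3ubuntu0.22.04.1 lt 1.2-3ubuntu1 && echo "no collision"
```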
teward | waveform: if you create the jammy debdiff i'll upload both simultaneously | 14:24 |
teward | at least... when i'm at my computer | 14:25 |
teward | phone is not capable of debsign xD | 14:25 |
teward | *adds a note to stab bryceh about nginx merge from Debian* | 14:26 |
Eickmeyer | teward: I doubt you'll have to worry about a new quassel for a while, perhaps years. They're slower than turtles crawling through peanut butter at their releases. | 15:03 |
teward | :P | 15:03 |
teward | Eickmeyer: maybe, but i can expect a -3 in Debian at any time so :P | 15:03 |
Eickmeyer | Fair. | 15:04 |
Eickmeyer | Seeded in Studio, FWIW. | 15:04 |
santa_ | good afternoon | 15:05 |
santa_ | ddstreet: hello there, I'm having a problem which might be in systemd and I'm collecting info to try to report a bug, if you have some mins I would appreciate your advice, details below | 15:06 |
santa_ | - I have some servers to [re]build kubuntu packages | 15:07 |
santa_ | - in these servers I have LXD containers | 15:07 |
santa_ | - in the containers meant to build the package I also do an autopkgtest run | 15:08 |
santa_ | - I'm using autopkgtest with the LXD backend, therefore I'm using nested containers | 15:08 |
santa_ | - when autopkgtest starts it executes something like this to verify the container actually started: | 15:09 |
santa_ | lxc exec <container_name> runlevel | 15:09 |
santa_ | the thing is, since some time ago, "runlevel" no longer works in nested containers | 15:10 |
santa_ | so I have been researching the problem and I found out a couple of interesting systemd services: | 15:10 |
teward | waveform: uploaded to Kinetic and jammy-proposed | 15:10 |
santa_ | systemd-update-utmp.service and systemd-update-utmp-runlevel.service | 15:11 |
santa_ | the first one works fine, the second doesn't | 15:11 |
santa_ | the second service is supposed to execute "/lib/systemd/systemd-update-utmp runlevel" | 15:12 |
santa_ | if I execute that manually in a nested container I get this line of output "Failed to get new runlevel, utmp update skipped." | 15:13 |
waveform | teward, thanks | 15:13 |
santa_ | I have been looking into systemd's upstream git, but I couldn't find anything interesting so far | 15:14 |
santa_ | any advice to continue the research? I'm a bit stuck with this atm | 15:14 |
santa_ | oh I forgot to mention | 15:16 |
santa_ | executing "runlevel" in a nested container is always returning "unknown" (and that's what is making my autopkgtest executions fail) | 15:17 |
ddstreet | santa_ in the top-level container, are you using 'security.nesting=true'? if not, it's worth giving that a try first | 15:18 |
santa_ | ddstreet: yes of course, also in case you are wondering this happens both with privileged and non-privileged containers | 15:19 |
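For reference, a sketch of how that flag is normally set on the top-level container (the container name is just an example):

```sh
lxc config set builder security.nesting true   # "builder" is a hypothetical container name
lxc restart builder                            # make sure the setting is in effect
lxc config get builder security.nesting        # should print "true"
```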
ddstreet | hmm, and is the 'runlevel' call failing for older releases too or just jammy/kinetic? | 15:20 |
santa_ | in my server I have 20.04 | 15:21 |
santa_ | in the building containers I have 18.04 | 15:21 |
santa_ | the nested containers have kinetic | 15:21 |
santa_ | however, I tried other versions | 15:22 |
santa_ | for example this happens with 20.04 top-level container + 20.04 nested container | 15:22 |
santa_ | also 20.04 top-level + kinetic nested container | 15:23 |
santa_ | ... and others (I don't remember all the combinations I tried, but I couldn't get the runlevel thing working yet) | 15:24 |
ddstreet | i think you'll likely need to enable systemd debug in the nested container to get more detail about why 'runlevel' is failing | 15:24 |
ddstreet | maybe you can manually start the nested container and manually check 'runlevel' | 15:24 |
santa_ | yes, that's what I'm doing, it's always returning "unknown" | 15:26 |
santa_ | I mean, in addition to the building infra I have, I have been creating other containers manually to experiment | 15:27 |
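Two standard ways to turn on that extra systemd detail inside the nested container (generic knobs, not something specific to this report):

```sh
# raise PID 1's log level at runtime
systemctl log-level debug

# or persistently, picked up on the next start of the container
mkdir -p /etc/systemd/system.conf.d
printf '[Manager]\nLogLevel=debug\n' > /etc/systemd/system.conf.d/10-debug.conf

# then read the extra detail for the unit in question
journalctl -b -u systemd-update-utmp-runlevel.service
```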
ddstreet | what does 'systemctl is-system-running' report? | 15:27 |
santa_ | ddstreet: "starting" on nested container | 15:31 |
ddstreet | yeah that's the problem | 15:31 |
santa_ | I think it should say "running" at this point | 15:31 |
ddstreet | the 'runlevel' program connects to pid1 over dbus, and asks for the state of specific units, namely 'graphical.target', 'multi-user.target', and 'rescue.target' | 15:32 |
ddstreet | if any of those are 'active' (or 'reloading'), it uses that 'runlevel' (respectively, 5, 3, 1) | 15:32 |
ddstreet | if none of those are active/reloading, it falls into the error you're seeing | 15:33 |
ddstreet | and if the system is still 'starting' none of those will be active yet | 15:33 |
ddstreet | and really, that makes sense, if the system is still starting it isn't in any 'runlevel' | 15:33 |
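A rough shell rendering of the mapping ddstreet describes (a sketch only, not the actual code in systemd):

```sh
#!/bin/sh
# runlevel maps the first active/reloading target to a SysV runlevel:
#   graphical.target -> 5, multi-user.target -> 3, rescue.target -> 1
for pair in graphical.target:5 multi-user.target:3 rescue.target:1; do
    unit=${pair%:*}
    level=${pair#*:}
    state=$(systemctl is-active "$unit")
    if [ "$state" = active ] || [ "$state" = reloading ]; then
        echo "$level"
        exit 0
    fi
done
echo unknown   # nothing active yet, e.g. while the manager is still 'starting'
```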
ddstreet | is autopkgtest the code calling 'runlevel' in the container? if so, it should be waiting for the container to start, like with 'systemctl --wait is-system-running' or similar | 15:35 |
santa_ | ack | 15:35 |
santa_ | yes, let me find the code... | 15:35 |
santa_ | https://salsa.debian.org/ci-team/autopkgtest/-/blob/master/virt/autopkgtest-virt-lxd#L101 | 15:36 |
santa_ | perhaps it's about time that autopkgtest switches to 'systemctl --wait is-system-running'? | 15:37 |
ddstreet | well they have a loop there, which is fine (i guess 'runlevel' is ok to use if you don't know if the container is running systemd...?) | 15:38 |
ddstreet | i think they need to increase that timeout though, the default system timeout for units is 90 seconds | 15:38 |
ddstreet | so any unit that delays the bootup will almost certainly go past that 60 second timeout | 15:39 |
santa_ | yeah, I have a customized autopkgtest increasing some timeouts | 15:39 |
ddstreet | in the container you could check what service is delaying bootup, networking is a frequent offender (i.e. systemd-networkd-wait-online) but it could be something else | 15:39 |
santa_ | ah, I have 3 failed units, I could have a look into that | 15:40 |
ddstreet | yep, probably one or more of those delaying boot | 15:41 |
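The usual commands for that kind of triage (standard systemd tooling, nothing specific to this setup):

```sh
systemctl list-units --state=failed   # which units failed
systemctl list-jobs                   # jobs still queued while the manager is 'starting'
systemd-analyze blame                 # slowest units (only once boot has actually finished)
systemd-analyze critical-chain        # what the default target waited on (likewise)
```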
santa_ | also, if you look at line 109 of the code I linked above, that seems to contain legacy code in case you are not using systemd | 15:41 |
santa_ | so I guess the idea of executing runlevel was to support old systems without systemd | 15:42 |
ddstreet | yep, seems reasonable since Debian does allow not using systemd | 15:42 |
ddstreet | main thing that probably needs changing there is to increase the timeout from 60 to 120 or so | 15:43 |
ddstreet | unless the intention is to detect failed boot-time units, of course | 15:43 |
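A sketch of what that readiness check could look like with a longer timeout, handling both systemd and non-systemd containers (illustrative only; this is not the actual autopkgtest-virt-lxd code, and the container name is hypothetical):

```sh
#!/bin/sh
CONTAINER=adt-kinetic   # hypothetical container name
timeout=120             # outlast the 90 s default unit timeout, per the suggestion above
while [ "$timeout" -gt 0 ]; do
    state=$(lxc exec "$CONTAINER" -- systemctl is-system-running 2>/dev/null)
    case "$state" in
        running|degraded) break ;;   # booted, possibly with some failed units
    esac
    # legacy fallback for containers that do not run systemd
    if lxc exec "$CONTAINER" -- runlevel 2>/dev/null | grep -qv '^unknown'; then
        break
    fi
    sleep 1
    timeout=$((timeout - 1))
done
```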
santa_ | ddstreet: funny coincidence I have 120 in mine + other fix: https://git.launchpad.net/~tritemio-maintainers/tritemio/+git/autopkgtest/commit/?id=976c322fca13a69a366c7dc60f1fe423702ff7ba | 15:46 |
ubottu | Commit 976c322 in ~tritemio-maintainers/tritemio/+git/autopkgtest "Increase various timeouts" | 15:46 |
santa_ | maybe I should send part of that patch to the autopkgtest guys | 15:47 |
santa_ | ok, so going back to systemd in nested containers, it seems there is a problem with apparmor | 15:53 |
santa_ | so I have 3 failed services: | 15:54 |
santa_ | - apparmor.service | 15:54 |
santa_ | - networkd-dispatcher.service | 15:54 |
santa_ | - snapd.apparmor.service | 15:54 |
santa_ | ok I have been playing around with a kinetic container nested in another kinetic container: | 16:16 |
santa_ | - I had the 3 services mentioned above failing | 16:16 |
santa_ | - I have removed the "apparmor" and "networkd-dispatcher" packages; after that, no more services failing | 16:17 |
santa_ | - however the nested container is still in the same weird state | 16:18 |
ddstreet | santa_ i started up a nested kinetic, and it looks like dbus isn't running, or is having major problems | 16:20 |
santa_ | weird state = runlevel reporting 'unknown', 'systemctl is-system-running' says 'starting' | 16:20 |
santa_ | ddstreet: ah, I remember some packages being updated and complaining about dbus or something | 16:21 |
ddstreet | yeah while 'starting' runlevel will always return 'unknown' | 16:21 |
santa_ | also the multi-user, graphical or emergency targets were not reached apparently | 16:22 |
ddstreet | yep, i am pretty sure the nested lxd containers are causing some issue(s) | 16:23 |
ddstreet | i don't think it's necessarily systemd that's broken | 16:23 |
santa_ | yeah, I share that opinion | 16:24 |
santa_ | * I agrre | 16:24 |
santa_ | * I agree | 16:24 |
santa_ | also I have "Unexpected error response from GetNameOwner(): Connection terminated" from various systemd services | 16:24 |
ddstreet | yep me too | 16:24 |
ddstreet | slyon have you looked into problems with nested lxd containers with kinetic? ^ | 16:27 |
santa_ | ok, I have found a workaround | 17:18 |
santa_ | https://bugs.launchpad.net/cloud-init/+bug/1905493 | 17:18 |
ubottu | Launchpad bug 1905493 in snapd "cloud-init status --wait hangs indefinitely in a nested lxd container" [Low, Confirmed] | 17:18 |
santa_ | ddstreet: this thing you mention here: https://bugs.launchpad.net/cloud-init/+bug/1905493/comments/2 fixes 'runlevel' for me | 17:19 |
santa_ | + removing "apparmor" package | 17:22 |
ddstreet | santa_ ah wow, that bug still isn't fixed! wonder if server team might want to review it, cpaelzer_ ^ | 17:52 |
santa_ | ddstreet: probably it was fixed, then at some point regressed | 17:53 |
santa_ | I mean I have been taking some time for personal reasons, but a few months ago my test rebuilds were working | 17:53 |
enr0n | bdmurray: Any idea why this systemd-fsckd test would be marked FAIL in a PPA autopkgtest, despite returning 77 and having 'skippable' in the test restrictions? https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic-enr0n-systemd-251/kinetic/amd64/s/systemd/20220607_181205_41289@/log.gz | 17:59 |
enr0n | I ran it locally in qemu, and the test was skipped as expected | 17:59 |
ddstreet | enr0n did it output anything to stderr and not have allow-stderr? | 18:00 |
enr0n | ddstreet: AFAICT, no. It just makes a print() call stating "SKIP: root file system is being checked by initramfs already" | 18:01 |
enr0n | So that's stdout | 18:02 |
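For context, the mechanism under discussion lives in debian/tests/control: a test whose Restrictions include 'skippable' is reported as SKIP when it exits 77, and stderr output only fails the test if 'allow-stderr' is absent. The stanza below is illustrative, not copied from the systemd packaging:

```
Tests: systemd-fsckd
Depends: @, python3
Restrictions: skippable, needs-root, isolation-machine
```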
bdmurray | enr0n: Out of curiosity, what does the systemd-fsckd test output look like when you run it locally? | 18:03 |
bdmurray | enr0n: Also what version of autopkgtest are you using when you run it locally? | 18:03 |
enr0n | bdmurray: Here is the local output: https://paste.ubuntu.com/p/nK88K8nxfQ/ | 18:04 |
enr0n | bdmurray: And I have autopkgtest 5.20 installed on jammy | 18:05 |
ginggs | enr0n: that test ran with systemd 251-2ubuntu1~ppa3 | 18:06 |
ginggs | but your PPA now has 251.2-2ubuntu1~ppa1 | 18:06 |
bdmurray | It's also worth mentioning that the autopkgtest deb is not the same autopkgtest code being used in the infrastructure | 18:07 |
enr0n | ginggs: ah, weird. Yeah it looks like the testbed for my 251.2-2ubuntu1~ppa1 test died on amd64. Thanks for catching that. | 18:10 |
bdmurray | enr0n: Alright are you set then? | 18:12 |
ddstreet | enr0n as ginggs mentioned that test was with ~ppa3 and pull-ppa-source for that version shows it doesn't include 'skippable' | 18:13 |
enr0n | bdmurray: Yeah I think so. However, my PPA test for amd64 with trigger systemd/251.2-2ubuntu1~ppa1 shows a kernel panic, but is still listed on https://autopkgtest.ubuntu.com/running. | 18:17 |
enr0n | I don't know how that page works exactly, but maybe that test needs to be killed manually or something? | 18:18 |
enr0n | ddstreet: yeah, thank you. I see that now. I got confused by the timing of my test results. The tests I had triggered last week are still running I guess. | 18:19 |
bdmurray | enr0n: I'll kill that test run | 18:24 |
enr0n | bdmurray: thanks! | 18:24 |
rbasak | !dmb-ping | 19:00 |
ubottu | rbasak, sil2100, teward, bdmurray, kanashiro, seb128: DMB ping | 19:00 |
teward | *spawns in, but requires rbasak to pay for the coffee this time* | 19:01 |
* genii 's ears perk up for a moment at the mention of coffee | 19:03 | |
* arraybolt3[m] activates coffee teleporter and accidentally explodes cup everywhere | 19:04 | |
enr0n | bdmurray: I think something funky is happening with the rest of my PPA autopkgtests listed on https://autopkgtest.ubuntu.com/running#pkg-systemd. They have each been showing various "Running for" durations the last couple days (i.e. the durations don't appear to be strictly increasing). | 19:45 |
bdmurray | enr0n: looking | 20:01 |
bdmurray | enr0n: the arm64 one is still running, I killed the amd64 one again | 20:07 |
bdmurray | enr0n: the s390x test is also running | 20:09 |
bdmurray | enr0n: so is the ppc64el one. | 20:10 |
bdmurray | enr0n: Or are you concerned that the tests should have finished by now but keep running for some reason? | 20:11 |
enr0n | bdmurray: Yes, the latter. They have been "running" since at least Friday I believe. | 20:11 |
bdmurray | enr0n: When you run it locally how long would it take for all the tests to finish? | 20:17 |
enr0n | bdmurray: Hm, 30 minutes or so? If I do run it locally, I usually turn my attention to something else for a while. | 20:20 |
bdmurray | enr0n: the amd64 test of systemd should stop running now | 22:42 |
enr0n | bdmurray: thanks, I have the full logs for that now | 23:18 |