[21:20] juliank: indeed it does look awful; I don't think we've got any ideas either so all advice welcomed
[21:23] sarnold: Unfortunately no idea :/
[21:26] :(
[21:27] sarnold: have you heard of lp-bug-dupe-properties?
[21:28] bdmurray: nope
[21:28] sarnold: http://pastebin.ubuntu.com/24911326/
[21:29] bdmurray: well that's cool
[21:29] I don't know how helpful that specifically is but the tool is neat.
[21:29] that indeed answers a question I had :) I was hoping there'd be a way to see releases affected and relative weights
[21:30] well there you go!
[21:30] it might be skewed towards new releases because I only manually duped bugs that were filed this month. it might be identical cause with earlier bugs but something changed "recently" to make this far worse
[21:31] there's always a trickle of half-installed failed upgrades but this is a real river..
[21:32] It looks like there is a bug with lp-bug-dupe-properties since getting RelatedPackageVersions doesn't work. It might be worth fixing that to find the versions of apt and dpkg.
[21:33] This "NULL: ConfigurePending" is weird too.
[21:34] sarnold: The "is not a symbolic link" thing would imply to me that they have a second libssl.so.1.0.0 in their library path, and ldconfig wants to symlink to it, but can't, because the real file exists.
[21:34] sarnold: But why that would suddenly have become an issue is a bit of a mystery.
[21:34] bdmurray: hrm I think I see loads of NULL: ConfigurePending
[21:34] it doesn't feel unique here
[21:35] infinity: oh ewww.
[21:35] sarnold: oh you mean across all bugs?
[21:35] sarnold: Though, ldconfig doesn't exit with an error code in that case, so that could be a red herring.
[21:35] bdmurray: yeah, I know I've seen it in many other bugs
[21:36] it might still be a contributing factor here, like I said, no ideas..
[21:36] sarnold: http://paste.ubuntu.com/24911399/
[21:36] sarnold: I'd posit those postinsts are failing for other reasons that aren't giving you useful terminal output.
[21:37] :(
[21:43] sarnold: Oh and, indeed, they can't be failing postinsts, because the packages haven't gotten that far.
[21:43] Hrm.
[21:44] sarnold: I've confirmed your statement about NULL: ConfigurePending existing in other bug reports
[21:44] sarnold: Plus, libssl-doc and libssl-dev are both represented here.
[21:44] Aren't ldconfig triggers?
[21:44] juliank: Yeah, the ldconfig thing is a red herring here.
[21:44] The real problem is that the packages are half-unpacked. Which is probably why there's a second libssl in the library path.
[21:45] does this mean dpkg isn't spitting out an error for one of its operations?
[21:45] It could well be a dpkg bug. Are you seeing this on all releases, or...?
[21:46] infinity: of the ones filed this month, it's 90% 16.04 LTS and 10% 16.10 http://pastebin.ubuntu.com/24911326/
[21:46] Looks like at least xenial and zesty.
[21:47] Preparing to unpack .../libssl1.0.0_1.0.2g-1ubuntu4.8_amd64.deb ...
[21:47] Unpacking libssl1.0.0:amd64 (1.0.2g-1ubuntu4.8) over (1.0.2g-1ubuntu4.6) ...
[21:47] Log ended: 2017-06-11 18:51:11
[21:47] Should the log have just ended there?
[21:47] Ideally not, but that could be an apt logging bug. :P
[21:48] Which log was that?
[21:48] Unpacking libssl-doc (1.0.2g-1ubuntu4.8) over (1.0.2g-1ubuntu4.6) ...
[21:48] Log ended: 2017-06-08 17:32:24
[21:48] ^-- From the one that later complains about libssl-doc being half-installed
[21:48] https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1692981/comments/8
[21:48] Launchpad bug 1692981 in openssl (Ubuntu) "package libssl-dev:amd64 1.0.2g-1ubuntu11.2 failed to install/upgrade: package is in a very bad inconsistent state; you should reinstall it before attempting configuration" [Undecided,Confirmed]
[21:49] bdmurray: Yup, so that would explain the ldconfig output on that one.
[21:49] bdmurray: If dpkg was interrupted mid-unpack, you'd have libssl.so.1.0.0.dpkg-tmp sitting there.
[21:50] Also the other full term log file shows libssl-doc being unpacked and ending.
[21:50] And, thus, two versions of the same SONAME.
[21:50] So, in both cases, it looks like dpkg is just dying (or being interrupted by the frontend)
[21:51] Well, s/interrupted/killed/, cause I think a SIGINT would actually unwind gracefully.
[21:51] HistoryLog.txt shows "/usr/bin/dpkg exited unexpectedly"
[21:51] Indeed it does. Shame it doesn't log a signal. Though I'm not sure that would be illuminating.
[21:52] Were the upgrades unattended?
[21:52] Commandline: aptdaemon role='role-upgrade-packages' sender=':1.3320'
[21:52] update-manager, maybe?
[21:53] I suppose there's a super tiny chance it could have been OOM killed. But I wouldn't expect that to be extra bad this week compared to last.
[21:53] Unless we recently shipped a memory leak or two.
[21:54] * infinity looks at dmesg.
[21:54] All the openssl bugs are amd64 fwiw
[21:54] bdmurray: Could also be a red herring, just because most of our users are amd64.
[21:54] right
[21:55] Nothing remotely fun in dmesg. Like, at all.
[21:55] I wonder if they rebooted before apporting.
[21:55] oh? I thought I saw :i386 in many of them
[21:55] Oh hey - what's this? https://errors.ubuntu.com/problem/fbab11d5ffef5f64e1affa33c6e5bf0330075104
[21:56] log on to the internerr
[21:56] ermagerd internerr!
[21:56] sarnold: ack there are some i386 ones
[21:57] can we kill dpkg-split already
[21:58] I'm blind. Where's the indication that that crash is the same as the bug we were looking at?
[21:58] infinity: it's not; I went looking for dpkg crashes since it died unexpectedly and found this.
[21:59] I mean there is no clear indication.
[21:59] But if dpkg is crashing on a stable release it wouldn't by default end up in LP.
[21:59] Of course, related or not, that's a whole lot o' crashes.
What's the process for getting non-Canonical people access to their crash data? I think guillem should have a poke at that.
[21:59] fill out a form somewhere
[22:00] https://forms.canonical.com/reports/
[22:00] Though, that one seems to be fixed in >> xenial?
[22:00] Which means it's probably *not* the bug we're looking at.
[22:00] That error tracker bug looks weird
[22:01] only 3 of the reports are using dpkg 1.18.10*
[22:01] That's 3 more than 0.
[22:01] Wait, which reports?
[22:02] 3 of the duplicates of the openssl bug in Launchpad
[22:02] For the error tracker, I see everything being <= 1.18.4
[22:02] The openssl bug, I saw a few in newer releases, yeah.
[22:02] Hence my guess that it's not the same issue.
[22:03] It can't be related because it's a crash that only occurs when formatting an error message when dpkg-split failed. In which case there is nothing to partially unpack
[22:04] it's not a crash when formatting; it's an abort with message
[22:05] but still, dpkg-split
[22:05] Ah yes, that makes more sense, it calls abort() there
[22:06] Anyhow, still obviously unrelated, since dpkg-split isn't involved in dpkg --unpack :P
[22:06] Yes, and fixed in newer dpkgs
[22:06] Well, it does not abort() anymore, just logs an error
[22:06] Still worth fixing in 16.04 though.
[22:07] it's worth suppressing those crash reports
[22:07] it's not really worth fixing; if dpkg-split was invoked at all, your real problem is somewhere upstream
[22:07] how do you ask autopkgtest to use multiple triggers? I remember doing so in the past but I seem unable to now
[22:07] bdmurray: dpkg-split is the tool used to reassemble floppy-disk-sized fragments of a .deb into a single file on your hard drive for installation.
It's slightly obsolete
[22:07] with `&trigger=pkg1/version1,pkg2/version2` it says it's malformed
[22:08] mapreri: trigger=&trigger=&trigger=
[22:08] slangasek: That's dpkg.git commit 521e84da3a2b9ad62d5dbab0f4e1794aef149996
[22:08] &mushroom=&mushroom
[22:08] infinity: ah, thanks
[22:08] SNAKE, AND A SNAKE.
[22:08] https://anonscm.debian.org/cgit/dpkg/dpkg.git/commit/?id=521e84da3a2b9ad62d5dbab0f4e1794aef149996
[22:08] It was fixed in 1.18.5, so should probably apply easily
[22:09] bdmurray: so if the best way to suppress the garbage being sent to errors.u.c is by SRUing dpkg, ok - but "fixing" that won't actually benefit users
[22:10] More to the point, we've still got no idea what's eating dpkg on those libssl logs.
[22:10] Which I imagine isn't related to libssl at all. Has anyone found dupes elsewhere?
[22:10] infinity: yes
[22:11] bug 1692996
[22:11] Given it's happened to libssl1.0-dev (which is a metapackage), libssl1.0.0 (a library package), and libssl-doc (static docs), it seems like just a timing coincidence.
[22:11] bug 1692996 in apport (Ubuntu) "package apport 2.20.4-0ubuntu4 failed to install/upgrade: package is in a very bad inconsistent state; you should reinstall it before attempting configuration" [Undecided,Confirmed] https://launchpad.net/bugs/1692996
[22:12] bdmurray: Okay, that's mildly "comforting".
[22:13] bdmurray: I wonder if the common thread here is that it's all aptdaemon. Though, again, could be a red herring just due to how many aptdaemon upgrades happen.
[22:14] Also, unless we can correlate some actual dpkg *crashes*, it really feel more like something's just murdering it.
[22:14] s/feel/feels/
[22:14] * bdmurray checks his cache o' bugs
=== sarnold_ is now known as sarnold
[22:17] Fine time for a server to explode.
[22:18] mapreri: What was your full request URI?
[22:18] mapreri: Here's an example of a working one (don't click it :P)
[22:18] mapreri: http://autopkgtest.ubuntu.com/request.cgi?release=artful&arch=amd64&package=snapd&trigger=linux-meta/4.10.0.20.22&trigger=linux/4.10.0-20.22
[22:19] mapreri: You might just need to urlencode your + and/or ~ in those versions.
[22:19] yeah that's what i've hit in the past (urlencoding)
[22:19] I don't recall the encoding for either of those chars because I haven't done web devel for decades, but I suspect Google can help. :P
[22:19] mapreri_: mapreri: You might just need to urlencode your + and/or ~ in those versions.
[22:20] + = %2B and ~ = %7E
[22:21] hah excess flood kills too. he probably saw none of this.
[22:21] Oh well.
[22:21] * infinity pokes mapreri tentatively.
[22:21] mapreri: Alive this time?
[22:22] * infinity gives up for now until freenode gets over itself.
[22:23] http://autopkgtest.ubuntu.com/request.cgi?release=artful&arch=amd64&package=visp&trigger=visp/3.0.1-2build1&trigger=opencv/3.1.0%2Bdfsg1-1%0Aexp1 still says "malformed trigger" :\
[22:23] yeah, even if my client says it has quite some lag
[22:23] I should probably tell my bouncer to try to join channels slower than ~30 at once.
[22:23] luckily it happens rarely enough ^^
[22:23] Did nobody write a request-autopkgtest script yet? :D
[22:23] heh 88 second ping to mapreri :)
[22:24] mapreri: %0A is \n
[22:24] mapreri: Not surprised it doesn't like that. ;)
[22:25] mapreri: ~ = %7E
[22:25] infinity: should I be looking for bugs where dpkg exited unexpectedly and aptdaemon is not in use and the package is in an inconsistent state?
[22:25] bdmurray: If it's easier to prove a negative than a positive, yeah.
[22:26] bdmurray: I mean, we want to know one way or the other if *all* related bugs involve aptdaemon, cause then we might have a clue that aptdaemon is murdering its children.
[22:26] but if the package is already installed and configured that's not the same issue?
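A minimal sketch of the request URL discussed above, in Python: one trigger= parameter per trigger (not a comma-separated list), with + sent as %2B and ~ as %7E. The function name build_request_url is hypothetical, the endpoint is copied from the working example, and the opencv version string 3.1.0+dfsg1-1~exp1 is an assumption read back from the rejected request, where %0A had been used in place of %7E. Whether the service expects anything beyond this is not confirmed here.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the working example in the log.
REQUEST_CGI = "http://autopkgtest.ubuntu.com/request.cgi"


def build_request_url(release, arch, package, triggers):
    """Return a request URL with one trigger= parameter per trigger."""
    params = [("release", release), ("arch", arch), ("package", package)]
    # Repeat the key for each trigger instead of joining them with commas.
    params += [("trigger", t) for t in triggers]
    # Keep "/" readable; "+" is percent-encoded as %2B by quote().
    query = urlencode(params, safe="/", quote_via=quote)
    # Recent Python versions leave "~" unescaped, so encode it explicitly.
    return REQUEST_CGI + "?" + query.replace("~", "%7E")


url = build_request_url(
    "artful", "amd64", "visp",
    # Assumed trigger versions, see the note above.
    ["visp/3.0.1-2build1", "opencv/3.1.0+dfsg1-1~exp1"],
)
print(url)
```

The second trigger comes out as trigger=opencv/3.1.0%2Bdfsg1-1%7Eexp1, which is the encoding given at [22:20].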
[22:26] bdmurray: But, again, we do so much automagic upgrading with aptdaemon, it could still be a red herring. :/
[22:26] bdmurray: already-installed-and-configured is a different bug, yes.
[22:26] Test request submitted.
[22:26] triggers
[22:26] ['visp/3.0.1-2build1', 'opencv/3.1.0+dfsg1-1~exp1']
[22:26] \o/
[22:26] infinity: ♥
[22:27] aptdaemon could still be a particular cause of some kinds of failures, e.g. apt debconf pre-configuration
[22:27] bdmurray: This bug would key on half-installed in the terminal log, or dpkg exiting unexpectedly in the dpkg log.
[22:27] but that would only apply if it were really always libssl1.0.0 getting the axe
[22:28] ... also libssl1.0.0 has no config script, so nevermind
[22:28] bdmurray: What I really want to know is if we have any dpkg crashes that match the same systems/times as these bugs. If not, then it's more likely a parent committing infanticide.
[22:29] slangasek: It's happened to libssl1.0-dev, which is just a transitional with no maintainer scripts or triggers at all, which I think rules out any packaging issues other than dpkg/frontend being derp.
[22:29] * slangasek nods
[22:30] infinity: There's not an easy way to do that but we could ask people to check their whoopsie logs or give us their system identifier.
[22:32] I kinda want to find out that it's actually a memory leak shipped in compiz a week earlier and dpkg is being OOM-killed, just because that would be almost as much fun as Can't Print on Tuesdays.
[22:34] * infinity tries to find one with an intact dmesg that wasn't obviously rebooted before the report was sent.
[22:35] The fact that all of them (so far I've been through about 10) seem to have been rebooted before the report was sent. Is that coincidence due to how/when apport runs, or could that be indicative of a larger issue that forced a reboot?
[22:35] bdmurray: ^
[22:36] Cause another suspicion here is that weird "cannot fork" thing I saw on my machine a while back.
[22:36] Which would, indeed, lead to dpkg bailing during unpack.
[22:37] infinity: you ran into a 'cannot fork' in the wild? as root?
[22:38] sarnold: Hard to say if I ran into it as root, given that I couldn't execute anything to elevate privs.
[22:38] sarnold: So, "maybe"? :P
[22:40] infinity: hehe
[22:40] infinity: I thought maybe that was apt/dpkg ...
[22:42] Anyhow, every one of those dupes has a fresh dmesg. Which might mean something, or might just mean apport doesn't do useful things until reboot.
[22:42] In which case, sending dmesg instead of kern.log.* seems mostly useless.
[22:43] I'm not sure about the relative merits of kern.log vs dmesg but I close a ton of bugs that are hardware related thanks to dmesg errors
[22:47] An interesting thing about apport and whoopsie bug reports is that they add what's in /var/crash and I'm not seeing any dpkg crashes there.
[22:49] bdmurray: Which might well imply that dpkg is being killed, not crashed.
[22:50] Bug #1699356 is libgcc1:i386 -- the dpkg history log shows "Sub-process /usr/bin/dpkg returned an error code (1)"
[22:50] bug 1699356 in gcc-6 (Ubuntu) "package libgcc1:i386 1:6.2.0-5ubuntu12 failed to install/upgrade: package libgcc1:i386 is not ready for configuration cannot configure (current status 'half-installed')" [Undecided,New] https://launchpad.net/bugs/1699356
[22:51] .. before the "Sub-process /usr/bin/dpkg exited unexpectedly"
[22:53] could anybody merge doxygen? doxygen-latex is uninstallable in proposed due to latex being rude at the world, and uninstallable doxygen-latex makes me sad.
[22:53] (happy to provide debdiffs and all, but `grab-merge doxygen` gets it right by itself…)
[22:56] actually, grab-merge is not enough, as the latest upload is not yet there
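A rough sketch of the triage discussed above, in Python: scan a report's log text for the marker strings quoted in this log ("half-installed", "dpkg exited unexpectedly", "returned an error code", and the aptdaemon Commandline). The function name summarize and the sample text are hypothetical; the sample is stitched together from lines quoted in different reports, so it does not represent any one real bug, and plain substring matching is only a heuristic.

```python
import re

# Marker strings quoted in the log; matching on them is a heuristic only.
KILLED = "dpkg exited unexpectedly"
ERROR_CODE = re.compile(r"dpkg returned an error code \((\d+)\)")
HALF_INSTALLED = "half-installed"
APTDAEMON = "Commandline: aptdaemon"


def summarize(log_text):
    """Report which of the markers discussed above appear in log_text."""
    match = ERROR_CODE.search(log_text)
    return {
        "exited_unexpectedly": KILLED in log_text,
        "error_code": int(match.group(1)) if match else None,
        "half_installed": HALF_INSTALLED in log_text,
        "via_aptdaemon": APTDAEMON in log_text,
    }


# Hypothetical excerpt assembled from lines quoted in the log.
sample = (
    "Commandline: aptdaemon role='role-upgrade-packages' sender=':1.3320'\n"
    "Sub-process /usr/bin/dpkg returned an error code (1)\n"
    "Sub-process /usr/bin/dpkg exited unexpectedly\n"
)
print(summarize(sample))
```

Run over a set of duplicates, a summary like this could help check whether every report with the unexpected exit also went through aptdaemon, which is the open question in the discussion.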