[08:18] sil2100, hey https://github.com/CanonicalLtd/ubuntu-image/pull/155 still needs another round :)
[10:27] coreycb: this change got accidentally dropped when you merged imagemagick: https://launchpadlibrarian.net/363532898/imagemagick_8%3A6.9.7.4+dfsg-16ubuntu5_8%3A6.9.7.4+dfsg-16ubuntu6.diff.gz
[11:39] jbicha: i'll take a look shortly. sorry about that.
[11:42] coreycb: no problem, I tried to push that change into Debian but I wasn't able to convince the maintainer why we wanted it
[19:13] kees, stgraber: no TB meeting today?
[19:19] slangasek: I'm conferencing in vancouver
[19:20] ok
[19:20] seb128: are you as sick of this ding autopkgtest failure as I am?
[19:21] slangasek, yes! looks like you got a success in your most recent try though :)
[19:22] seb128: yeah except that's the baseline test, which fails to confirm that this is an ignorable regression in the release pocket :/
[19:22] :/
[19:25] slangasek: how about https://autopkgtest.ubuntu.com/packages/u/udisks2/cosmic/s390x ? 😢
[19:25] jbicha: good news, that one *did* fail a baseline retest
[19:26] slangasek: I think we might need to talk about an actual policy about flapping tests: that is, they should be either fixed, disabled, or hinted. Retrying endlessly to sometimes get a thing we think might be a correct result isn't sane QA or CI.
[19:27] infinity: perhaps you would like to weigh in on my proposal that britney should automatically reset the gate whenever a retest against the release pocket fails
[19:27] (And, worse, ignoring a flapping test, while we hope it means we're ignoring a bad testsuite, often means we're ignoring bad code that the test is rightly finding by accident, due to external fuzz)
[19:28] jbicha, we should probably skip that one again :/
[19:28] That is, retrying until it passes is a false sense of security.
[19:28] the test that was always failed got fixed but the other ones are still flaky
[19:28] infinity: there is lots of bad code that isn't tested and the bad code that triggers flaky tests is not per se more important to pay attention to than other bad code
[19:29] slangasek: I don't like the "automated" part of that proposal, cause it then implies that no human will give any thought to the how or the why of the regression.
[19:29] infinity: it only means that humans thinking about the regressions are decoupled from the actual transitioning
[19:29] slangasek: Your thing in theory, mine in practice, I suspect.
[19:30] Without gates, people's carefactor is much lower. Whether the gate is britney or having to justify themselves to a release member or whatever.
[19:31] infinity: the carefactor SHOULD BE MUCH LOWER because the regressions have already happened in the release pocket
[19:31] instead, right now it's a waste of release team time to manually hint
[19:31] automate that; and deal with questions of adequately resourcing efforts to fix autopkgtests elsewhere
[19:32] slangasek: One could make the exact inverse argument that carefactor should be higher because we somehow screwed up and let a bug slip out of proposed. :P
[19:32] All of proposed can be as buggy as can be, but when things start regressing in the release pocket, quality is going backward.
[19:35] slangasek: And if it's just about not caring about the devel series until closer to release, s/release/updates/ and make the same arguments. A regression noticed in updates should be a short stop the line event to investigate the severity and determine a course of action. It doesn't need to be actioned immediately if it's determined to be not a big deal (hint it and put it on someone's TODO for when-never), but it's worth the 5 minutes of ...
[19:35] ... talking about it.
[19:35] infinity: except when the test regressed because it was a bad test to begin with (flaky); or when the infrastructure changed in a way that invalidates the test but has not regressed the code; or we are treating as "regressed" the case of a failure in a test that was previously not run (due to testbed restrictions and then we moved the testbed), and now runs and fails; or the test is dumb and
[19:36] embeds a pre-generated SSL certificate with an expiration date
[19:36] slangasek: I argue that the round-trip with the release (or in the updates case, SRU) team triggers that 5 minutes line stoppage and conversation.
[19:36] slangasek: Yes, there are cases where the investigation immediately goes to "derp, this isn't something we care about", but a computer can't make that call.
[19:37] infinity: you do understand the geometrically-scaling impact of each of these 5-minute conversations on library transitions, right?
[19:37] slangasek: I much preferred the other proposal to allow a reset-baseline style hint, so we can have that discussion ONCE, but then not have to keep bumping hints for eternity.
[19:41] infinity: BTW, you just caused all of IS (and perhaps other groups) to check this channel :)
[19:41] do not invoke the holy name of STL in vain
[19:42] fo0bar: You're not the only people in the world who use the term, perhaps you could adjust your filters to only scan #is and #is-outage (which is where I yell it when I want you).
[19:42] fo0bar: Also, insert Nelson "ha ha" clip here?
[19:44] infinity: there are a few other channels, but yeah, just giving you grief
[19:48] fo0bar: Yeah, #canonical-sysadmin also came to mind after I hit [enter].
[19:49] fo0bar: Anyhow, grief registered and returned. ;)
=== gurmble is now known as grumble
[20:55] slangasek, i'd like to see that "this all-release-pocket fail" + "this all-proposed-pocket fail" cause we have things migrating that do pass in all-proposed due to incomplete deps/ordering.
[20:56] for the case where "this" is only published in release pocket for example.
[20:56] slangasek, i do not want this automated, no. that feels wrong.
[20:56] slangasek, i feel it should be "force-always-failed package/version" to set the threshold of where we count as always failed.
[20:57] slangasek: same as kees
[20:57] (well usual syntax to specify arch/min-ver-barrier
[20:57] )
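
Editor's sketch: the two positions argued in the log (an automatic gate reset when a retest against the release pocket also fails, versus an explicit human-placed "force-always-failed" style hint) can be contrasted in a few lines. This is only an illustration under stated assumptions, not britney's code or hint syntax: `TestState`, `blocks_migration`, and every field and parameter name below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TestState:
    """Hypothetical summary of one package's autopkgtest on one architecture."""
    proposed_passed: bool                   # result when run against the proposed pocket
    baseline_passed: Optional[bool] = None  # retest against the release pocket; None if not run
    ever_passed: bool = True                # whether the test has ever passed in the release pocket


def blocks_migration(state: TestState, auto_reset: bool,
                     hinted_always_failed: bool = False) -> bool:
    """Return True if the failing test should hold the package in proposed.

    auto_reset models the "reset the gate automatically" proposal;
    hinted_always_failed models a human-placed hint (assumed semantics).
    """
    if state.proposed_passed:
        return False  # nothing failed, nothing to gate on
    if not state.ever_passed:
        return False  # always failed, so not counted as a regression
    if hinted_always_failed:
        return False  # a human reviewed the failure and reset the threshold
    if auto_reset and state.baseline_passed is False:
        return False  # the release pocket fails too; gate reset without review
    return True       # treated as a regression; migration is held
```

With a test that fails in both proposed and the baseline retest, `auto_reset=True` lets the package through with no human involved, while `auto_reset=False` keeps it blocked until someone adds the hint. That difference is the disagreement in the conversation above.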