[11:54] <benjaoming> Heya, is there a sysop that would be able to look into "OOPS-35d6453724aa3422a94066b42dbbc86b" ? I've had problems copying a specific package for a couple of days now.
[11:57] <cjwatson> Ah.  That will be a fun one.
[11:58] <cjwatson> Let me dig up the bug explaining that case
[11:59] <cjwatson> https://bugs.launchpad.net/launchpad/+bug/1475358
[12:00] <cjwatson> benjaoming: Could you save me some fiddly queries and point me to the PPA and the package you're trying to copy?
[12:01] <benjaoming> cjwatson: thanks, yes it's this package: https://launchpad.net/%7Elearningequality/+archive/ubuntu/ka-lite-proposed/+sourcepub/6389034/+listing-archive-extra
[12:01] <benjaoming> Am trying to copy from "ka-lite-proposed" to "ka-lite"
[12:02] <cjwatson> Right, as expected there are multiple diffs there.
[12:02] <wgrant> Why have three DB patches tonight when we can have four!
[12:03] <cjwatson> Well, we'll have to clean up the duplicates first ...
[12:03] <wgrant> Indeed, but that is a simple query.
[12:04] <cjwatson> This is the wgrant version of simple that involves window expressions?
[12:05] <cjwatson> Also it's not clear whether we can put the DB constraint in place when the code is still racy.
[12:07] <cjwatson> Without something like the tweak I suggest towards the end of that bug description I think just having the constraint would result in failed copies.
[12:07] <benjaoming> cjwatson: thanks for looking into it! I take it that the linked bug report is possibly the one that causes this, should I just subscribe to it and suppose it's wrapped up soon? For now, it just means that there's no Trusty version of our package. I could also bump the version and build a new one if the wait is long.
[12:07] <wgrant> launchpad_dogfood=# DELETE FROM packagediff USING (SELECT from_source, to_source, MIN(id) FROM packagediff GROUP BY from_source, to_source HAVING count(*) > 1) AS duplicates WHERE duplicates.from_source = packagediff.from_source AND duplicates.to_source = packagediff.to_source AND id != duplicates.min;
[12:07] <wgrant> DELETE 106
[12:08] <wgrant> cjwatson: No window functions required, unless that's required.
[12:08] <cjwatson> Fair enough :)
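The cleanup wgrant ran above keeps the lowest-id row for each (from_source, to_source) pair and deletes the rest. Below is a minimal sketch of the same idea against an in-memory SQLite database, not Launchpad's PostgreSQL. SQLite has no `DELETE ... USING`, so the query is rewritten with `NOT IN`. The table layout, the sample rows, the function name `remove_duplicate_diffs`, and the index name are assumptions made for illustration only.

```python
import sqlite3


def remove_duplicate_diffs(conn):
    """Delete all but the lowest-id row per (from_source, to_source) pair.

    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM packagediff
        WHERE id NOT IN (
            SELECT MIN(id) FROM packagediff
            GROUP BY from_source, to_source
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Hypothetical, simplified table: the real packagediff table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packagediff ("
    "id INTEGER PRIMARY KEY, from_source INTEGER, to_source INTEGER)"
)
conn.executemany(
    "INSERT INTO packagediff (from_source, to_source) VALUES (?, ?)",
    [(1, 2), (1, 2), (1, 2), (3, 4), (5, 6), (5, 6)],
)

deleted = remove_duplicate_diffs(conn)
remaining = conn.execute(
    "SELECT id, from_source, to_source FROM packagediff ORDER BY id"
).fetchall()

# With the duplicates gone, a uniqueness constraint can be put in place.
conn.execute(
    "CREATE UNIQUE INDEX packagediff_pair_key "
    "ON packagediff (from_source, to_source)"
)
```

Running the cleanup before adding the index matters: creating a unique index while duplicate pairs still exist would fail, which is why the duplicates have to go first.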
[12:08] <cjwatson> benjaoming: Right, it's definitely the same bug.  We'll try to sort it out soon, but failing that, a fresh upload is a workaround.
[12:08] <wgrant> Indeed, it would cause racing copies to fail, but better than corrupting DB state. Copies should be fixed to retry on conflicts anyway.
[12:09] <cjwatson> This isn't a conflict as such though, it's just PackageDiffAlreadyRequested in a different form.
[12:09] <cjwatson> Does my slightly grotty IntegrityError suggestion sound reasonable?
[12:09] <wgrant> I haven't read the bug all the way through, let me see.
[12:10] <wgrant> cjwatson: That sort of solution is quite awkward, as the integrityerror will abort the whole transaction.
[12:10] <wgrant> The solution in the webapp is just to retry the whole operation a couple of times.
[12:10] <cjwatson> Maybe the existing places where we use that are with very short transactions.  Or maybe they're broken.
[12:10] <wgrant> I've removed a few of them over the years.
[12:11] <wgrant> eg. there was a great confusing librarian race which would manifest as a "you can't have a / in a filename" exception, because it blindly caught IntegrityError when other changes could be flushed as well.
[12:11] <wgrant> Catching IntegrityError in normal code is almost always wrong, unless you're explicitly managing transactions carefully.
[12:11] <wgrant> garbo is the only remaining non-infra code that catches it, AFAICS.
[12:12] <cjwatson> I guess this is the not at all scary LaunchpadBrowserPublication.handleException
[12:12] <wgrant> Grepping for "Retry" should find it.
[12:12] <wgrant> And you are correct.
[12:12] <cjwatson> Oh, right, I filed that bug before r17764.
[12:12] <wgrant> Jobrunners already know how to retry on some exceptions.
[12:12] <wgrant> eg. AdvisoryLockHeld
[12:14] <wgrant> cjwatson: So I'm tempted to fix the data now, fix the schema when I get all the other hot patches applied tomorrow, and live with the very occasional copy crashing rather than making future copies crash.
[12:14] <cjwatson> It's certainly a better failure mode.
[12:14] <cjwatson> But we should leave the bug open because the incidence rate will be basically equal to the current one, just changed in form.
[12:14] <wgrant> It's very similar to https://bugs.launchpad.net/launchpad/+bug/682692, but that one needs much more complicated DB surgery.
[12:15] <wgrant> Indeed, the bug should remain open, but the key difference is that you can fix the new failure mode by simply retrying.
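The approach discussed above (let the uniqueness constraint reject the racing insert, roll back the whole transaction, and retry the whole operation) can be sketched roughly as follows. This is an illustration with Python's sqlite3 module and two connections to a temporary database file, not Launchpad's actual code or its retry machinery. The schema, the `copy_log` table, and the names `run_with_retry`, `copy_with_diff`, and `race` are all hypothetical.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema: the UNIQUE constraint is what turns a racing
# duplicate insert into an IntegrityError instead of a duplicate row.
SCHEMA = """
CREATE TABLE packagediff (
    id INTEGER PRIMARY KEY,
    from_source INTEGER NOT NULL,
    to_source INTEGER NOT NULL,
    UNIQUE (from_source, to_source)
);
CREATE TABLE copy_log (id INTEGER PRIMARY KEY, note TEXT);
"""


def run_with_retry(conn, operation, attempts=3):
    """Run operation(conn) in one transaction, retrying it from scratch.

    On IntegrityError the whole transaction is rolled back, so nothing
    from the failed attempt survives. Returns (result, attempts_used).
    """
    for attempt in range(1, attempts + 1):
        try:
            result = operation(conn)
            conn.commit()
            return result, attempt
        except sqlite3.IntegrityError:
            conn.rollback()
            if attempt == attempts:
                raise


def copy_with_diff(conn, from_source, to_source, race=None):
    """Check-then-insert, which is racy without the constraint."""
    rows = conn.execute(
        "SELECT id FROM packagediff WHERE from_source = ? AND to_source = ?",
        (from_source, to_source),
    ).fetchall()
    if race is not None:
        race()  # simulates another worker acting between check and insert
    conn.execute("INSERT INTO copy_log (note) VALUES ('copied')")
    if rows:
        return rows[0][0]  # a diff already exists, reuse it
    cur = conn.execute(
        "INSERT INTO packagediff (from_source, to_source) VALUES (?, ?)",
        (from_source, to_source),
    )
    return cur.lastrowid


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    worker = sqlite3.connect(path)
    rival = sqlite3.connect(path)
    worker.executescript(SCHEMA)

    fired = []

    def race():
        # Fire only once, so that the retry sees the rival's committed row.
        if not fired:
            fired.append(True)
            rival.execute(
                "INSERT INTO packagediff (from_source, to_source) VALUES (1, 2)"
            )
            rival.commit()

    diff_id, attempts_used = run_with_retry(
        worker, lambda c: copy_with_diff(c, 1, 2, race)
    )
    diff_count = worker.execute("SELECT COUNT(*) FROM packagediff").fetchone()[0]
    log_count = worker.execute("SELECT COUNT(*) FROM copy_log").fetchone()[0]
    worker.close()
    rival.close()
```

In this sketch the first attempt loses the race and is rolled back in full, including its `copy_log` row, and the second attempt finds the rival's diff and reuses it. Retrying at the level of the whole operation avoids catching IntegrityError in the middle of a transaction that may have other changes pending.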
[12:39] <wgrant> benjaoming: That copy should work now.
[12:39] <wgrant> We've repaired the data.
[12:40] <benjaoming> wgrant: great! let me try it out immediately!
[12:48] <benjaoming> wgrant, cjwatson: it worked, thanks so much!
[12:48] <wgrant> benjaoming: Great, thanks for confirming.
[19:16] <CarlFK> https://launchpad.net/~carlfk/+archive/ubuntu/ppa/+packages Builds: amd64 - Pending publication. Note: Some binary packages for this source are not yet published in the repository.
[19:16] <teward> CarlFK: that's normal?
[19:16] <CarlFK> how long does that take?
[19:17] <CarlFK> wondering if now is a good time to get lunch
[19:17] <teward> CarlFK: um, why would taking lunch have an impact?
[19:17] <teward> if it's successfully built, and is pending publishing, go get food
[19:17] <teward> applaud yourself
[19:21] <CarlFK> if it is a few min, I'll wait and see what I run into next, which may mean I send an email and wait an hour.. which would be a better time to eat
[19:22] <teward> CarlFK: i don't understand your developer process here.  pending publication is not something I can really gauge, though I've never seen it be more than 15 minutes unless there's something evil going on
[19:23] <CarlFK> that's what I was hoping for
[19:51] <cjwatson> CarlFK: it's on the order of 10-20 minutes
[19:51] <cjwatson> the publisher is not the fastest thing on the planet and there are a lot of PPAs
[19:52] <CarlFK> no complaints.  I think the whole ppa thing is fantastic.
[19:53] <cjwatson> well, I'd love for it to be not intrinsically bound to a single machine, which is kind of the source of lots of the problems.  Maybe later this year
[19:53] <CarlFK> I was just evangelizing it to a friend a few days ago.  he builds .deb and then uses file system things to get them onto the target box
[19:53] <CarlFK> usb stick would not surprise me
[19:54] <CarlFK> or email to the other developer
[21:06] <teward> can someone take a look at OOPS-0a1448e0ecb45e3d5e9fe97cc39f71d5 and tell me why a ppa package copy failed?
[21:06] <teward> and if i need to worry about it