[08:06] <wgrant> NOTICE:  identifier "binarypackagereleasedownloadcount__archive__bpr__day__count__idx" will be truncated to "binarypackagereleasedownloadcount__archive__bpr__day__count__id"
[08:06] <wgrant> so. close.
[09:12] <wgrant> cjwatson: Index addition to optimise BPRDC queries, if you have a moment: https://code.launchpad.net/~wgrant/launchpad/+git/launchpad/+merge/379942
[09:20] <tomwardill> wgrant: if you're about, I have a column in the OCI stuff that needs to be NOT NULL, is it safe to add a migration to do that (given we should have 0 rows atm)
[09:20] <wgrant> (prevents HOT updates, but almost certainly worth it given that updates are async)
[09:20] <wgrant> tomwardill: Yep
[09:21] <tomwardill> righto
[09:21]  * tomwardill does the thing
[10:36] <cjwatson> wgrant: Is it worth including country as well?  Archive.getPackageDownloadCount uses that
[10:37] <wgrant> cjwatson: It'd make the index a bit wider, and nobody really cares about the country breakdown, but it's true it wouldn't cost much more at all.
[10:37] <wgrant> I suppose we might as well.
[10:37] <wgrant> It's how I'd model it in Cassandra were I not doing it in 2009.
[10:38] <wgrant> commit 1c040ea5ee863a784fb85b1a6455668f94e0d0be
[10:38] <wgrant> Author: William Grant <me@williamgrant.id.au>
[10:38] <wgrant> Date:   Sat Feb 20 15:36:29 2010 +1100
[10:38] <wgrant>     Add and test Archive.updatePackageDownloadCount.
[10:39] <wgrant> Just made it ten years
[10:39] <wgrant> nice
[10:39] <wgrant> Ten years and close to a billion rows before it needed significant revision.
[10:41] <SpecialK|Canon> Nice
[10:42] <wgrant> When we upgrade to PostgreSQL >=11 we can replace this index and binarypackagereleasedownloadcount__archive__binary_package_rele (the UNIQUE constraint) with a new UNIQUE constraint with INCLUDE (count), but for now we can afford the extra 30GB.
[10:43] <wgrant> Right, that additional column is worth it just because it makes the commit message wrapping less ugly.
[10:43] <wgrant> cjwatson: Updated.
[10:45] <wgrant> cjwatson: btw I got some bug update timeouts earlier, but they seemed shorter lived -- and now much easier to debug, as there's a lot less that can be going wrong in the new triggers.
[10:45] <wgrant> I wonder if we can capture interesting bits of pg_locks regularly
[10:47] <cjwatson> wgrant: Should it be "count, country" rather than "country, count" so that the query in getDailyDownloadTotals can use the partial index more effectively?
[10:47] <cjwatson> (Feel free to tell me that this doesn't matter in practice)
[10:48] <cjwatson> wgrant: Anything interesting in the bug update oopses?
[10:48] <wgrant> cjwatson: It's important that it be country, count, or the queries that filter on country can't use it effectively. You can only efficiently filter on a prefix -- if you filter on something other than a prefix, you end up having to scan through the intermediate columns.
[10:49] <wgrant> But if you're not filtering on country then you are scanning through all the tuples anyway, so the fact you can't filter by country isn't problematic.
[10:49] <cjwatson> wgrant: Right, but doesn't getDailyDownloadTotals filter on only archive, bpr, day, count?
[10:49] <wgrant> It doesn't filter on count. It selects it.
[10:49] <cjwatson> Oh no, I see, it's getting count from the index but ... that
[10:49] <wgrant> https://oops.canonical.com/oops/?oopsid=OOPS-ae3af2c30556a12f4fe8f8acf345a419
[10:49] <cjwatson> Got it
[10:50] <wgrant> https://oops.canonical.com/oops/?oopsid=OOPS-f0f9a7577246c3195e93f200cbb32ca1
[10:50] <cjwatson> wgrant: r=me
[10:50] <wgrant> cjwatson: And I assume db=you?
[10:50] <wgrant> Ah yes
[10:50] <cjwatson> Indeed
[10:51] <wgrant> Will get it applied hot.
[10:51] <wgrant> And hopefully we weren't relying too much on HOT
[10:51] <wgrant> But given the nature of PPAALP, probably not
[10:52] <cjwatson> So those timeouts rule out the theory (which may have been only spitballing to start with) that it was a lock that prevented the creation of temporary tables
[10:52] <wgrant> It was only spitballing, though I'm pretty confident we did narrow it down to the bugsummary triggers at one point.
[10:52] <wgrant> But there are very limited locks that can have caused that timeout. Mostly a row lock on bugtaskflat.
[10:56] <cjwatson> Presumably the bugsummary triggers, though it's annoying that the pseudo-traceback doesn't say that
[10:58] <cjwatson> I wondered whether it might be a deadlock with the BugSummaryJournalRollup garbo job somehow, but logs show that that job wasn't running at the time of https://oops.canonical.com/oops/?oopsid=OOPS-ae3af2c30556a12f4fe8f8acf345a419
[11:00] <wgrant> But unless we're SERIALIZABLE they shouldn't be able to conflict, since the trigger only inserts journal rows.
[11:00] <wgrant> I think reporting on long locks on bugtaskflat might be interesting.
[11:02] <cjwatson> Agreed.  How do we organise that?
[11:03] <cjwatson> Perhaps stub can help
[11:12] <tomwardill> stand back, twom made a DB patch! https://code.launchpad.net/~twom/launchpad/+git/launchpad/+merge/379960
[11:13] <wgrant> tomwardill: Wasn't the intent that git_repository_url was to be an alternative?
[11:13] <wgrant> Like snap.git_repository_url
[11:13] <cjwatson> I recall deciding YAGNI on that for the moment
[11:14] <tomwardill> wgrant: I think eventually, yes. But for now, git_repository_url doesn't exist.
[11:14] <tomwardill> so atm it's possible to create OCIRecipes that make no sense as there's no source
[11:15] <wgrant> Fair enough, db=me
[11:18] <tomwardill> so I should be able to Top Approve that, then provide the equivalent code patch?
[11:19] <wgrant> I don't think there's a particular sequencing constraint between the DB and code patches here, but yes.
[11:20] <cjwatson> Right, code can land concurrently with that as long as the code does the corresponding checks (which IIRC it does)
[11:29] <ilasc> Signing Service question: looking at the code / data model I think we have a 1:1 mapping for: Client (one) <-> (one) ClientPublicKey
[11:29] <ilasc> can I please get a confirmation on that being the case in the real world too, sqlAlchemy requires explicit mapping definition
[11:30] <cjwatson> ilasc: No.  A client may have multiple public keys.
[11:30] <cjwatson> It's explicitly not one-to-one, because that would make it impossible to do rotation.
[11:30] <cjwatson> ilasc: Note the existence of Client.public_keys
[11:31] <ilasc> cjwatson: excellent, indeed saw that now that u mentioned it, thank you!
[11:33] <tomwardill> ah, cool. All the code appears to assume that git_repository is required
[11:33] <tomwardill> although I found a typo in an error message
[11:38] <cjwatson> tomwardill: Will take a while anyway - Thiago has a patch ahead of you in the DB queue
[11:39] <tomwardill> no problems, I think there's just the DB patch for that, so it can wait a bit :)
[11:39] <cjwatson> Let me get a ticket in for that one at least
[11:40] <wgrant> cjwatson: I think we can consider relaxing the one patch per deployment constraint for obviously fine cases.
[11:40] <wgrant> We've done it once or twice in the past.
[11:40] <wgrant> And it's been a long time since we've actually run into deployment trouble.
[11:41] <cjwatson> This is true
[11:41] <cjwatson> The documented rule is "not without manual approval" anyway
[11:41] <cjwatson> So we can just manually approve obviously fine cases :)
[11:41] <tomwardill> tiny typo fix while I don't forget: https://code.launchpad.net/~twom/launchpad/+git/launchpad/+merge/379961
[11:42] <cjwatson> I updated LPS for Thiago's patch, but I'll just wait until Tom's passes buildbot and gets onto staging, and then update the commit ID before filing the ticket
[11:42] <wgrant> Like I wouldn't deploy the bugsummary triggers and the BPRDC index at the same time, because they both have unpredictable performance implications.
[11:42] <cjwatson> tomwardill: r=me
[11:43] <cjwatson> Yeah
[11:44]  * tomwardill tries to remember where he left the OCIProject UI patch
[13:09] <cjwatson> wgrant: Could you comment on the indexing problem in https://code.launchpad.net/~pappacena/launchpad/+git/launchpad/+merge/379218 ?
[13:10] <cjwatson> I'm not sure we've split up {,Archive}SigningKey in the right way.
[13:10] <cjwatson> Maybe key_type needs to be denormalised onto ArchiveSigningKey?
[13:12] <wgrant> cjwatson: Hm, are type and purpose somewhat orthogonal?
[13:12] <wgrant> The restriction is on the purpose, not necessarily the type.
[13:12] <wgrant> But yes, if we want to say type == purpose then the way to express the constraint would be to denorm SigningKey.type onto ArchiveSigningKey and add a composite FK
[13:13] <cjwatson> wgrant: Not really usefully orthogonal, because lp-signing needs to know more than just the bare key type in order to implement the right signing logic
[13:13] <wgrant> True.
[13:15] <wgrant> I'm thinking of cases like existing OpenPGP archive signing keys, which are used to sign not just indexes but also directly signable customs.
[13:16] <cjwatson> Composite FK as in ADD CONSTRAINT archivesigningkey__key_type__signing_key__fk FOREIGN KEY (key_type, signing_key) REFERENCES signingkey (key_type, signing_key) right?
[13:16] <wgrant> Right.
[13:18] <cjwatson> I'm trying to think if even the OpenPGP case would benefit from type != purpose.
[13:18] <cjwatson> I can't quite see how it would.
[13:19] <cjwatson> It would only benefit if we had a use case for multiple different OpenPGP keys per (archive, [series]) that were used for different things.
[13:19] <wgrant> Right
[13:19] <pappacena> Can't we create a trigger to check if we are not creating more than one ArchiveSigningKey with the same key type, and prevent the duplication of key_type in ArchivesSigningKey table?
[13:19] <cjwatson> pappacena: composite FKs are simpler than triggers
[13:19] <wgrant> We have that for UEFI today, though it happens that the alternate key is only held by Microsoft.
[13:20] <cjwatson> And denormalisation is permitted
[13:20] <wgrant> Making the trigger non-racy would also be somewhere between difficult and impossible.
[13:21] <wgrant> We're likely to need more granularity on things like FIT keys later
[13:21] <wgrant> But I guess we can do that later.
[13:22] <cjwatson> I was thinking that that granularity would be more likely to be expressed by extra columns on ArchiveSigningKey (e.g. sourcepackagename, not that I'm enthusiastic about that)
[13:23] <cjwatson> But perhaps I don't understand what the likely changes are
[13:25] <cjwatson> At any rate, the composite FK approach doesn't preclude type != purpose later if we need to; we'd add an extra purpose column, relax the index, and maybe add a constraint for the permitted combinations of type and purpose
[13:25] <cjwatson> s/index/unique constraint/
[13:26] <wgrant> Yep, fair enough.
[13:26] <wgrant> sourcepackagename for FIT, arch for UEFI, that sort of thing
[13:27] <cjwatson> Yeah
[13:28] <wgrant> The signing service could conceivably have a key type of X.509 with purpose restriction UEFI, then LP ties that to series and arches and packages
[13:28] <wgrant> Also a key type of OpenPGP with purpose restriction APT_INDEX and DIST_UPGRADER
[13:28] <pappacena> Just to make sure I understood it correctly, this granularity doesn't exist today, right?
[13:28] <wgrant> Since currently those are the same key, but there's no actual reason for that.
[13:29] <wgrant> Right, not needed today. It's only by upload type and series so far.
[13:30] <cjwatson> For X.509, I prefer to have the signing service have different key types if the signing process requires using a different binary (sbsign vs. kmodsign), even if they're technically the same type of key material.  But otherwise fair enough.
[13:30] <cjwatson> To me that's slightly more than a restriction of purpose (which is about which keys may validly be selected for use).
[13:31] <cjwatson> Combination of key type and mechanism, perhaps, but collapsed into key_type
[13:46] <wgrant> Right, but I'm wondering if in some cases they might be the same key material.
[13:46] <wgrant> Which we could just duplicate, of course.
[13:50] <pappacena> I think we could just have several ArchiveSigningKey with different "filters" pointing to the same SigningKey object (on LP's database and lp-signing service)
[13:53] <pappacena> And having different key_types (meaning "purpose", and not really the key file format), there shouldn't be a problem with the unique constraint.
[13:56] <cjwatson> Yep, we already have that situation for keys that have currently been manually copied between archives
[14:00]  * SpecialK|Canon watches Chromium spin
[14:02] <cjwatson> It's true that we might end up with SigningKeys that have the same material in some cases.  Duplication probably isn't the worst thing there
[14:03] <wgrant> OK
[14:17] <cjwatson> tomwardill: http://lpbuildbot.canonical.com/builders/lp-db-devel-xenial/builds/874/steps/shell_9/logs/summary   oh dear, that DB patch is going to have trouble until we get some code changes landed on production
[14:18] <tomwardill> that's... weird
[14:18] <tomwardill> I ran those tests
[14:18] <cjwatson> Did you definitely apply the DB patch to the launchpad_ftest_template DB?
[14:19] <cjwatson> tomwardill: I'm going to go ahead and request that Thiago's DB patch be deployed, since yours seems likely to be blocked for a bit
[14:19] <tomwardill> hmm, maybe no. I ran ./database/schema/update.py
[14:19] <tomwardill> cjwatson: sure, go ahead
[14:19] <tomwardill> I'll replicate and fix
[14:20] <cjwatson> database/schema/upgrade.py defaults to, err, I'm not sure which but probably launchpad_dev
[14:20] <cjwatson> It takes a -d option
[14:20] <tomwardill> yeah. The one time I don't just run `make schema` and try to be all fancy, it bites me :)
[14:20] <cjwatson> Well, you don't need to do that necessarily
[14:21] <cjwatson> You can run upgrade.py on all five (launchpad_empty, launchpad_dev_template, launchpad_dev, launchpad_ftest_template, launchpad_ftest_playground) to make sure
[14:21] <pappacena> I think it defaults to launchpad_dev... I ran it few days ago without the '-d'...
[14:21] <cjwatson> And usually best to do likewise with security.py
[14:21]  * tomwardill improves notes
[14:21] <cjwatson> Having a "make -C database/schema migrate" or similar to do them all might not be the worst thing in the world
[14:23] <cjwatson> Though with a comment that it's only for development
[14:28] <tomwardill> aaah, there we go, that's all the errors I was kind of expecting
[14:28] <tomwardill> Total: 100 tests, 0 failures, 52 errors, 0 skipped in 52.836 seconds.
[14:28]  * tomwardill coughs
[14:36] <tomwardill> hmm, why would `self.git_repository = value.repository` trigger the NOT NULL constraint? I'm setting the value, not trying to read it
[15:10] <tomwardill>  cjwatson: I can't work out why this line: https://git.launchpad.net/launchpad/tree/lib/lp/oci/model/ocirecipe.py#n156 when called via the makeOCIRecipe method in factory.py is causing a storm flush. Any hints?
[15:11] <cjwatson> tomwardill: Have you tried it with LP_DEBUG_SQL=1 ?
[15:14] <tomwardill> cjwatson: yeah, it seems to do a select, then an insert: https://pastebin.canonical.com/p/dfSbgR9NmN/
[15:14] <tomwardill> the insert is missing the git_repository value
[15:14] <tomwardill> but I've no idea why
[15:15] <tomwardill> (well, it's doing the insert on the line that the git_repository value is being set, so it's more 'why is it doing this insert' rather than 'why is git_repository missing')
[15:15] <cjwatson> This sort of thing usually happens when it needs to flush something in order to get an ID
[15:16] <cjwatson> IIRC in Snap I had to order some things very carefully to avoid this
[15:16] <cjwatson> Which of the failing tests is this?
[15:17] <tomwardill> lp.oci.tests.test_ocirecipebuild.TestOCIRecipeBuild.test_queueBuild
[15:17] <tomwardill> although I think it's the same symptom in most of them
[15:18] <cjwatson> OK, give me a few minutes to upgrade
[15:30] <cjwatson> Ah, of course.  Snap doesn't run into this because it's intentionally possible to have a Snap without any source at all, in order to avoid blocking repository deletion on deleting snaps built on it.  Instead, we just "detach" the snap when the source is deleted, leaving it with all of branch, git_repository, and git_repository_url being NULL.
[15:31] <cjwatson> And there's a check in SnapBuildBehaviour.extraBuildArgs to refuse to dispatch a build of a snap that's in this state.
[15:31] <tomwardill> aha
[15:31] <tomwardill> that explains the checks in there (like the git_ref setter allowing it to be None)
[15:32] <cjwatson> So while we could fix this by more careful ordering in some way (because you need to get hold of stuff from the GitRef row before you populate anything in the new OCIRecipe row), I think it would be better to just revert this DB patch for the same reasons.
[15:32] <cjwatson> And implement the same detaching logic for OCIRecipe.
[15:32]  * cjwatson slowly loads stuff back into their brain
[15:32] <tomwardill> sure, that makes sense
[15:33] <cjwatson> Sorry for misleading you about this
[15:34] <cjwatson> You'll need to add a bit of stuff to the machinery around GitRepository.destroySelf as well
[15:34] <tomwardill> no worries :)
[15:34] <cjwatson> Look for detachFromGitRepository
[15:35] <cjwatson> We can just revert the DB patch directly rather than doing anything more clever; because it hasn't passed buildbot, it isn't on staging or anything like that
[15:36] <cjwatson> Anyone who's applied it locally will have to manually unapply it (DROP NOT NULL and delete the LaunchpadDatabaseRevision row), but that's probably just the two of us
[15:36] <tomwardill> incoming MP :)
[15:36] <tomwardill> https://code.launchpad.net/~twom/launchpad/+git/launchpad/+merge/379978
[15:37] <cjwatson> db=me
[15:38] <tomwardill> landing
[15:39] <cjwatson> for db in launchpad_empty launchpad_dev_template launchpad_dev launchpad_ftest_template launchpad_ftest_playground; do psql -c 'ALTER TABLE OCIRecipe ALTER COLUMN git_repository DROP NOT NULL;' -c 'DELETE FROM LaunchpadDatabaseRevision WHERE major = 2210 AND minor = 8 AND patch = 4;' "$db"; done
[15:39] <cjwatson> ^- manual revert
[15:40] <tomwardill> ace :)
[15:40]  * tomwardill blindly copy pastes random things from irc...
[15:41] <cjwatson> Robert'); DROP TABLE Students;--
[15:43] <tomwardill> there's at least one company named similar in companies house records
[16:34] <cjwatson> Anyone up for reviewing https://code.launchpad.net/~cjwatson/lp-source-dependencies/+git/lp-source-dependencies/+merge/379682 and https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/379683 to make it possible to build a wheel for unittest2?
[16:37] <tomwardill> how much work would it be to drop unittest2 entirely? Or is a dep for something more fundamental?
[16:43] <cjwatson> testtools mainly
[16:43] <tomwardill> ah, that would do it, yes
[16:43] <cjwatson> In fact testtools exclusively
[16:43] <cjwatson> https://github.com/testing-cabal/testtools/pull/277
[16:45] <tomwardill> +1 to both
[16:45] <cjwatson> Thanks