[01:19] wgrant: Thanks for that analysis. I think then that the indexes in https://code.launchpad.net/~cjwatson/launchpad/package-cache-indexes/+merge/296379 should be enough, since those are basically the inverted ones you got to in the end. The bit I missed was probably pulling out the archive ids. I don't agree that the query is artificial and can be restricted by distro, because users of these vocab [01:19] ularies are basically always ... [01:19] ... selecting both distribution and package at the same time, unfortunately; but we could get hold of all main archives in all distributions, which is still pretty fast. [01:19] wgrant: In fact inverting the binary index too ("CREATE INDEX temp_dspc_really_2 ON distroseriespackagecache (binarypackagename, archive);") makes it comfortably sub-20ms even without pulling out the archive ids. [01:20] Not sure why I wasn't seeing that before but it seems quite consistent now. [01:21] wgrant: So the thing I still need to work out is how to push the LIMITs down through the UNION in the combined vocabulary, which I bet will be tedious Storm fiddling but should be doable. [01:22] wgrant: Could you elaborate on what you meant by the definition of search being weird? [05:25] cjwatson: I'd be more comfortable with the archive IDs pulled out anyway, since the stats are totally different for archive 1 and archive every else, so they should affect the planning. [05:26] cjwatson: Substring matching then sorting by name isn't the most useful kind of search. [11:20] wgrant: Do we have better patterns? I'm not very familiar with our search code really. [12:40] cjwatson: Most other searches are FTI, which is a bit less silly. But it would, for example, be better here to list an exact match first. And perhaps also prefer $term% and %-$term. [12:40] Not preferring exact matches is infuriating for eg. retargeting bugs. [12:43] cjwatson: Pushing LIMITs down through the union is somewhat complicated, because the result of the union is ordered. But with appropriate indexes postgres should terminate the search when it has enough. [12:44] I hope. [12:44] Hm, maybe not. [12:44] Difficult for it to realise that the underlying results are ordered the same as the top level. [12:45] otoh it's not much of a problem if we don't do substring matching until the usual 3-character threshold. [12:45] (but avoiding even exatch matching until that threshold? again, infuriating)