/srv/irclogs.ubuntu.com/2016/06/04/#launchpad-dev.txt

cjwatsonwgrant: Thanks for that analysis.  I think then that the indexes in https://code.launchpad.net/~cjwatson/launchpad/package-cache-indexes/+merge/296379 should be enough, since those are basically the inverted ones you got to in the end.  The bit I missed was probably pulling out the archive ids.  I don't agree that the query is artificial and can be restricted by distro, because users of these vocabularies are basically always selecting both distribution and package at the same time, unfortunately; but we could get hold of all main archives in all distributions, which is still pretty fast.01:19
cjwatsonwgrant: In fact inverting the binary index too ("CREATE INDEX temp_dspc_really_2 ON distroseriespackagecache (binarypackagename, archive);") makes it comfortably sub-20ms even without pulling out the archive ids.01:19
cjwatsonNot sure why I wasn't seeing that before but it seems quite consistent now.01:20
cjwatsonwgrant: So the thing I still need to work out is how to push the LIMITs down through the UNION in the combined vocabulary, which I bet will be tedious Storm fiddling but should be doable.01:21
cjwatsonwgrant: Could you elaborate on what you meant by the definition of search being weird?01:22
wgrantcjwatson: I'd be more comfortable with the archive IDs pulled out anyway, since the stats are totally different for archive 1 and for every other archive, so they should affect the planning.05:25
wgrantcjwatson: Substring matching then sorting by name isn't the most useful kind of search.05:26
cjwatsonwgrant: Do we have better patterns?  I'm not very familiar with our search code really.11:20
wgrantcjwatson: Most other searches are FTI, which is a bit less silly. But it would, for example, be better here to list an exact match first. And perhaps also prefer $term% and %-$term.12:40
wgrantNot preferring exact matches is infuriating for eg. retargeting bugs.12:40
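
[Editor's sketch] The ordering wgrant describes (exact match first, then names starting with the term, then names ending in "-term", then any other substring match) could look roughly like the following. This is a plain-Python illustration only, not Launchpad's vocabulary code; the function names and sample package names are made up.

```python
def match_rank(name, term):
    """Lower rank sorts first. The tiers follow the preference suggested in the log."""
    if name == term:
        return 0  # exact match
    if name.startswith(term):
        return 1  # the "$term%" pattern
    if name.endswith("-" + term):
        return 2  # the "%-$term" pattern
    return 3      # any other substring match


def search_names(names, term, limit=10):
    """Substring search, ordered by match quality and then by name (hypothetical helper)."""
    matches = [name for name in names if term in name]
    return sorted(matches, key=lambda name: (match_rank(name, term), name))[:limit]


if __name__ == "__main__":
    sample = ["libpython", "python", "python-apt", "cpython", "foo-python"]
    print(search_names(sample, "python"))
```

With a plain sort by name, "python" would land after "cpython", "foo-python" and "libpython", which is the behaviour the log calls infuriating.
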
wgrantcjwatson: Pushing LIMITs down through the union is somewhat complicated, because the result of the union is ordered. But with appropriate indexes postgres should terminate the search when it has enough.12:43
wgrantI hope.12:44
wgrantHm, maybe not.12:44
wgrantDifficult for it to realise that the underlying results are ordered the same as the top level.12:44
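
[Editor's sketch] One way to see why pushing the LIMIT into each branch is safe when every branch uses the same ordering as the outer query: the top N of the union can only come from the top N of each branch. The example below uses SQLite from Python's standard library purely for illustration; the discussion in the log is about PostgreSQL and Storm, and the table names, column names and data here are hypothetical.

```python
import sqlite3


def build_db():
    """Create two small name tables standing in for the two halves of the union."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_names (name TEXT PRIMARY KEY);
        CREATE TABLE binary_names (name TEXT PRIMARY KEY);
        """
    )
    conn.executemany(
        "INSERT INTO source_names VALUES (?)",
        [(f"pkg-{i:03d}",) for i in range(0, 200, 2)],
    )
    conn.executemany(
        "INSERT INTO binary_names VALUES (?)",
        [(f"pkg-{i:03d}",) for i in range(0, 200, 3)],
    )
    return conn


def naive_union(conn, pattern, limit):
    """LIMIT applied only after the whole union has been built and sorted."""
    rows = conn.execute(
        """
        SELECT name FROM source_names WHERE name LIKE ?
        UNION
        SELECT name FROM binary_names WHERE name LIKE ?
        ORDER BY name LIMIT ?
        """,
        (pattern, pattern, limit),
    )
    return [row[0] for row in rows]


def pushed_down_union(conn, pattern, limit):
    """Each branch is ordered and limited first; the outer query repeats both."""
    rows = conn.execute(
        """
        SELECT name FROM (
            SELECT name FROM source_names WHERE name LIKE ? ORDER BY name LIMIT ?
        )
        UNION
        SELECT name FROM (
            SELECT name FROM binary_names WHERE name LIKE ? ORDER BY name LIMIT ?
        )
        ORDER BY name LIMIT ?
        """,
        (pattern, limit, pattern, limit, limit),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_db()
    print(naive_union(conn, "%pkg%", 5))
    print(pushed_down_union(conn, "%pkg%", 5))
```

Both queries return the same rows here. Whether a given planner stops scanning early in the first form is a separate question, which is the doubt raised in the log.
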
wgrantotoh it's not much of a problem if we don't do substring matching until the usual 3-character threshold.12:45
wgrant(but avoiding even exact matching until that threshold? again, infuriating)12:45
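
[Editor's sketch] A minimal illustration of the compromise implied above: terms shorter than the threshold skip the expensive substring scan but still return an exact match. The threshold value of 3 comes from the log; the function name and the sample data are assumptions for the example, not Launchpad code.

```python
MIN_SUBSTRING_LENGTH = 3  # the "usual 3-character threshold" mentioned in the log


def match_names(names, term):
    """Return matching names, sorted. Short terms only ever match exactly."""
    if len(term) < MIN_SUBSTRING_LENGTH:
        # Below the threshold: no substring scan, but an exact hit is still returned.
        return sorted(name for name in names if name == term)
    return sorted(name for name in names if term in name)


if __name__ == "__main__":
    sample = ["at", "atop", "bat", "cat"]
    print(match_names(sample, "at"))
    print(match_names(sample, "ato"))
```
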
