/srv/irclogs.ubuntu.com/2016/06/04/#launchpad-dev.txt

cjwatson	wgrant: Thanks for that analysis. I think then that the indexes in https://code.launchpad.net/~cjwatson/launchpad/package-cache-indexes/+merge/296379 should be enough, since those are basically the inverted ones you got to in the end. The bit I missed was probably pulling out the archive ids. I don't agree that the query is artificial and can be restricted by distro, because users of these vocab	01:19
cjwatson	ularies are basically always ...	01:19
cjwatson	... selecting both distribution and package at the same time, unfortunately; but we could get hold of all main archives in all distributions, which is still pretty fast.	01:19
cjwatson	wgrant: In fact inverting the binary index too ("CREATE INDEX temp_dspc_really_2 ON distroseriespackagecache (binarypackagename, archive);") makes it comfortably sub-20ms even without pulling out the archive ids.	01:19
cjwatson	Not sure why I wasn't seeing that before but it seems quite consistent now.	01:20
cjwatson	wgrant: So the thing I still need to work out is how to push the LIMITs down through the UNION in the combined vocabulary, which I bet will be tedious Storm fiddling but should be doable.	01:21
cjwatson	wgrant: Could you elaborate on what you meant by the definition of search being weird?	01:22
wgrant	cjwatson: I'd be more comfortable with the archive IDs pulled out anyway, since the stats are totally different for archive 1 and archive every else, so they should affect the planning.	05:25
wgrant	cjwatson: Substring matching then sorting by name isn't the most useful kind of search.	05:26
cjwatson	wgrant: Do we have better patterns? I'm not very familiar with our search code really.	11:20
wgrant	cjwatson: Most other searches are FTI, which is a bit less silly. But it would, for example, be better here to list an exact match first. And perhaps also prefer $term% and %-$term.	12:40
wgrant	Not preferring exact matches is infuriating for eg. retargeting bugs.	12:40
wgrant	cjwatson: Pushing LIMITs down through the union is somewhat complicated, because the result of the union is ordered. But with appropriate indexes postgres should terminate the search when it has enough.	12:43
wgrant	I hope.	12:44
wgrant	Hm, maybe not.	12:44
wgrant	Difficult for it to realise that the underlying results are ordered the same as the top level.	12:44
wgrant	otoh it's not much of a problem if we don't do substring matching until the usual 3-character threshold.	12:45
wgrant	(but avoiding even exatch matching until that threshold? again, infuriating)	12:45

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!