[00:04] <StevenK> wgrant: I'm not certain that the info query in _getDirectAndDuplicateSubscriptions is to blame. The slow query is using ORDER BY and DISTINCT
[00:07] <StevenK> wgrant: It's pretty much the same query, though
[00:20] <StevenK> wgrant: Yeah, the long query is from IBug.getSubscribersForPerson
[00:21] <StevenK> And just took 4 minutes on DF
[00:22] <wgrant> :D
[00:22] <wgrant> So, explanalyze it
[00:22] <StevenK> That was what took 4 minutes
[00:22] <wgrant> What's the plan?
[00:24] <StevenK> wgrant: http://pastebin.ubuntu.com/1274106/ and http://pastebin.ubuntu.com/1274107/
[00:38] <StevenK> wgrant: Still distracted?
[01:01] <wgrant> StevenK: Sorry, food was required.
[01:01] <wgrant> Let's see
[01:02] <wgrant> StevenK: So, do you see in the plan what I described on the call this morning?
[01:03] <StevenK> That's okay, I amused myself by destroying Bug{Security,Visibility}Change.
[01:03] <wgrant> Great
[01:03] <StevenK> y% bzr di | diffstat -s
[01:03] <StevenK>  7 files changed, 36 insertions(+), 236 deletions(-)
[01:04] <wgrant> :)
[01:04] <StevenK> wgrant: So I can recall you said there are two ways postgres can look up the information, either by the bug or the team
[01:04] <StevenK> And it's obviously going via the bug due to:
[01:04] <StevenK>                      ->  Index Scan using bug_pkey on bug  (cost=0.00..0.62 rows=1 width=4) (actual time=1.729..1.729 rows=0 loops=147850)
[01:04] <StevenK>                            Index Cond: (id = bugsubscription.bug)
[01:05] <StevenK>                            Filter: ((id = 429322) OR (duplicateof = 429322))
[01:05] <wgrant> StevenK: Not quite.
[01:05] <wgrant> The plan is largely executed from inside to out, top to bottom
[01:05] <wgrant> So it starts with the innermost topmost thing
[01:05] <wgrant> In this case, the teamparticipation index scan
[01:06] <wgrant> So it's finding all teamparticipations for the person in question
[01:06] <wgrant> And if you look at the right, you'll see "rows=27"
[01:06] <wgrant> So it finds 27 teamparticipation rows
[01:06] <wgrant> for that person
[01:10] <StevenK> wgrant: Right
[01:10] <wgrant> Then it you go up to the parent node
[01:10] <StevenK> wgrant: And that feeds into the bugsubscription index with loops=27
[01:10] <wgrant> And see it's a nested loop
[01:10] <wgrant> A nested loop has two nodes: the second is executed once for every row in the first node
[01:10] <wgrant> So for each teamparticipation record, it finds the relevant bugsubscriptions
[01:10] <wgrant> And there are 27 rows, so loops=27
[01:11] <StevenK> Ah, right
[01:11] <wgrant> The row count there is an average
[01:11] <wgrant> So, now we exit the first nested loop
[01:11] <wgrant> And find ourselves in a second one
[01:11] <wgrant> With 150000 rows to process
[01:12] <wgrant> *Now* we query the bug for each of those 150000 bugsubscriptions
[01:12] <wgrant> And check whether it matches our filter
[01:12] <wgrant> So that's roughly 150000 random seeks
[01:12] <wgrant> (although in reality it'll be cached, it's still not good)
[01:12] <StevenK> Right
[01:12] <wgrant> Once we find the bugs that match, we do one final nested loop to join the people so we can return Person.name
[01:12] <wgrant> So, what's the problem here?
[01:13] <StevenK> wgrant: That bug number is the worst of the lot with 1.1k dupes
[01:13] <wgrant> Right
[01:13] <wgrant> Is there an alternate strategy that postgres could use to get the same result?
[01:13] <StevenK> wgrant: I guess we'd ideally like to answer the bugsubscriptions for the bug and the person in one shot using an index
[01:13] <wgrant> Not really
[01:14] <wgrant> The problem here is that it has to do 150000 index lookups
[01:14] <wgrant> If we can minimise that step, we win
[01:14] <StevenK> It wouldn't be that many
[01:14] <StevenK> Since it filter the index based on the information we want
[01:15] <wgrant> An alternate strategy is to first ask for all the duplicate bugs, then for each bug look for the subscriptions, and compare them to a list of teams that we participate in
[01:15] <wgrant> That works out to roughly one index scan and a hash lookup for each bug
[01:15] <StevenK> wgrant: Using a JOIN, or what are you thinking?
[01:16] <wgrant> Well, once we've devised a plan, we need to trick postgres into using it
[01:16] <wgrant> We can't tell it directly.
[01:18] <StevenK> wgrant: And we do that by tweaking the query?
[01:18] <wgrant> Right
[01:18] <wgrant> It requires a bit of experimentation
[01:19] <wgrant> Often you can force it to do your bidding using subqueries or CTEs
[01:20] <wgrant> eg. a common thing is to convince it to build a hash of the teams we're looking for
[01:20] <wgrant> By using "
[01:20] <wgrant> WITH teams AS (SELECT team FROM teamparticipation WHERE person = 1191103)"
[01:22] <StevenK> wgrant: http://pastebin.ubuntu.com/1274170/
[01:22] <wgrant> That's a different query, isn't it?
[01:22] <wgrant> Although it is quite related
[01:22] <wgrant> And a similar plan
[01:22] <StevenK> Yeah, it's different
[01:22] <StevenK> Bug.id IN (SELECT id FROM BUG WHERE id = 429322 OR duplicateof = 429322)
[01:23] <wgrant> Right
[01:23] <StevenK> Versus Bug.id = 429322 OR Bug.duplicateof = 429322
[01:23] <wgrant> Note also in both plans where postgres goes wrong
[01:23] <wgrant> After the nested loop against bug:
[01:23] <StevenK> Yeah, in the same place
[01:23] <wgrant>                      ->  Nested Loop  (cost=0.00..1172.79 rows=139 width=16) (actual time=0.063..2313.480 rows=147850 loops=1)
[01:23] <wgrant> The first parenthesised bit is the estimate
[01:23] <wgrant> The second is the actual
[01:23] <wgrant> So it expected it'd get 139 bugs
[01:23] <wgrant> But actually got 147850
[01:23] <StevenK> It estimated ~150 rows and got back ~150k ?
[01:24] <wgrant> Right
[01:24] <wgrant> Its plan was a good one given the estimates it had
[01:25] <StevenK> wgrant: How does the teamparticipation CTE look?
[01:26] <wgrant> StevenK: Monumentally unsuccessful
[01:26] <StevenK> Maybe a bug CTE?
[01:28] <wgrant> StevenK: That might be worth a try
[01:28] <wgrant> It could convince it to use a hash for the bug lookups
[01:28] <StevenK> And the query in IPersonSubscriptionInfo._getDirectAndDuplicatesSubscriptions has the same estimate versus actual factor difference, so the same change will probably make both queries performant
[01:28] <wgrant> Right :)
[01:29] <StevenK> WITH bugs AS (SELECT id FROM bug WHERE id = 429322 OR duplicateof = 429322)
[01:29] <wgrant> Right
[01:29] <StevenK> Where does that go in the query, though?
[01:29] <wgrant> CTEs always go at the start
[01:30] <wgrant> eg
[01:30] <wgrant> WITH teams AS (SELECT team FROM teamparticipation WHERE person = 1191103), bugs AS (SELECT id FROM bug WHERE duplicateof = 429322)
[01:30] <wgrant> SELECT BugSubscription.id, BugSubscription.bug
[01:30] <wgrant> [...]
[01:30] <StevenK> and then WHERE Bug.id IN bugs ?
[01:30] <wgrant> Bug.id IN (SELECT id FROM bugs)
[01:30] <wgrant> It appears as a table
[01:30] <wgrant> Not an expression itself
[01:31] <wgrant> (also note that the CTE I quoted ignores the bug.id check, just to simplify this initial experimentation)
[01:31] <wgrant> So it only counts duplicates, not the bug itself
[01:31] <StevenK> http://pastebin.ubuntu.com/1274176/
[01:32] <StevenK> It still sucks as estimating
[01:32] <wgrant> It's never going to estimate well
[01:32] <wgrant> Because this is an utterly extraordinary case
[01:32] <wgrant> We have to force its hand
[01:33] <StevenK> Let me try the two CTEs like you suggest
[01:33] <wgrant> That sounds like a good plan
[01:38] <StevenK> wgrant: http://pastebin.ubuntu.com/1274179/
[01:40] <wgrant> StevenK: You have a pointless bug join in your main query
[01:40] <wgrant> StevenK: You can just say bugsubscription.bug IN (SELECT id FROM bugs)
[01:48] <wgrant> StevenK: Any luck?
[01:53] <StevenK> wgrant: Sorry http://pastebin.ubuntu.com/1274200/
[01:54] <wgrant> StevenK: Oh
[01:55] <wgrant> StevenK: You have this person join there that's unconstrained
[01:55] <StevenK> wgrant: I don't need Person.id IN ... ?
[01:55] <StevenK> I wasn't really sure about that bit
[01:55] <wgrant> You perhaps want "AND Person.id = BugSubscription.person" instead
[01:55] <wgrant> So Person is actually joined against BugSubscription
[01:58] <StevenK> wgrant: http://pastebin.ubuntu.com/1274207/
[01:58] <StevenK> That one came back quicker than I was expecting. :-)
[01:59] <wgrant> Indeed, that's more like it
[01:59] <wgrant> Now, index-only scans would make a pretty nice difference here
[01:59] <wgrant> Sadly we don't have them yet
[02:00] <wgrant> StevenK: It may be helpful to see what's going on by using EXPLAIN (ANALYZE ON, VERBOSE ON) instead of EXPLAIN ANALYZE
[02:00] <wgrant> THen it'll show the columns it grabs from each result node
[02:01] <StevenK> wgrant: http://pastebin.ubuntu.com/1274209/
[02:02] <wgrant> So we can now see exactly what it's doing
[02:02] <wgrant> StevenK: It's still trying to go TeamParticipation->BugSubscription->Bug, as you can see
[02:02] <wgrant> But that's faster now since it's just a hash lookup on a thousand bugs
[02:02] <wgrant> Not an index lookup
[02:03] <wgrant> It may be worth trying to convince it to start at the bugs, then go to the bugsubscriptions
[02:03] <StevenK> So you want faster still?
[02:03] <wgrant> Perhaps by creating a CTE of bugsubscriptions
[02:03] <wgrant> StevenK: Right
[02:03] <wgrant> StevenK: This is still quite slow
[02:03] <wgrant> Fast enough to work
[02:03] <wgrant> But it'd be nice if we could get it a bit faster
[02:03] <StevenK> Nod
[02:04] <StevenK> wgrant: Can you have a play, I need to get lunch?
[02:04] <wgrant> Sure
[02:04] <wgrant> Well
[02:04] <wgrant> I already have had a play
[02:05] <StevenK> wgrant: Oh?
[02:06] <lifeless> might be better to nab the subscriptions and do a constrained cross product with person
[02:07] <lifeless> move the person manipulation out of the inner loops
[02:08] <lifeless> what you have there is 8 seconds on prod
[02:10] <lifeless> mm, you are already.
[02:10]  * lifeless fiddles
[02:10] <lifeless> btw - http://paste.ubuntu.com/1274216/
[02:10] <wgrant> I have a reasonable solution myself, I'm trying to walk StevenK to it :)
[02:11] <lifeless>                ->  Nested Loop  (cost=0.07..857.65 rows=4 width=19) (actual time=0.447..622.478 rows=147850 loops=1) is the issue, cold cache effects will hurt you
[02:11] <wgrant> lifeless: Hm, which query is that from?
[02:11] <wgrant> Yes
[02:11] <lifeless> wgrant: http://pastebin.ubuntu.com/1274209/
[02:11] <wgrant> That's what we're trying to eliminate
[02:11] <wgrant> Oh, that one, right
[02:12]  * lifeless tries w/out person
[02:13] <lifeless> that distinct you have is distinctly odd
[02:13] <lifeless> why do you want order by person.name ?
[02:13] <wgrant> I haven't looked at that particular query, so I don't know
[02:13] <wgrant> I was looking at the other one
[02:13] <lifeless> if its the portlet
[02:13] <wgrant> The core loop is similar
[02:13] <lifeless> then we're not slicing
[02:14] <lifeless> so bring them all back and sort in appserver.
[02:14] <lifeless> 1200ms
[02:14] <lifeless> 300ms hot
[02:15] <wgrant> Right, about 1.5x faster than DF, as usual
[02:15] <lifeless> the person including one comes down to 500ms hot
[02:15] <lifeless> 229ms
[02:15] <lifeless> 344ms
[02:15] <lifeless> new one is 285,193, 400 - significantly faste
[02:16] <lifeless> he says from probably too few trials
[02:17] <lifeless> 5000 lookups on bugsubscription_person_idx
[02:17] <lifeless> 11ms there, 35ms on bug hash lookups
[02:18] <lifeless> still 159K rows scanning bugsubscriptions
[02:18] <lifeless> union time perhaps
[02:19] <lifeless> http://paste.ubuntu.com/1274226/ is what I just played with for ref
[02:20] <wgrant> Huh
[02:20] <wgrant> That's a bit slow
[02:20] <wgrant> That sort of thing was like 500ms on DF an hour ago
[02:20] <wgrant> Or is it subtly different somehow...
[02:21] <lifeless> union is faster at the innermost layer
[02:22]  * lifeless tries again
[02:23] <lifeless> nah, needs bigger hammer
[02:25] <wgrant> Yeah
[02:25] <wgrant> I have a 15ms hammer
[02:25] <wgrant> I'm pretty sure you can't do much better than that
[02:26] <lifeless> gnar hash join diaf
[02:26] <wgrant> :D
[02:26] <wgrant> Then you disable hashjoin
[02:27] <wgrant> And it does mergejoin instead :)
[02:29] <wgrant> Ah
[02:29] <wgrant> And this thing is even <100ms on the worst bug in existence
[02:30] <lifeless> wait, wtf is this doing
[02:30] <lifeless> its expanding all the teams?
[02:30] <lifeless> I mean, if its the portlet, *why are we expanding teams* ?
[02:31] <lifeless> or is it 'teams I am a member of that are subscribed'
[02:31] <lifeless> that must be it
[02:31] <wgrant> That's the one
[02:32] <wgrant> http://pastebin.ubuntu.com/1274107/ is similarly misbehaved, but without the insane distinct/order
[02:32] <wgrant> The core misbehaviour is identical
[02:32] <wgrant> It endeavours to identify those duplicate bugs to which the user or their teams has a bugsubscription
[02:33] <lifeless> wheee, seq scan on bugsub
[02:33] <lifeless> diaf again. Srs for reals.
[02:40] <wgrant> Ah
[02:40] <wgrant> And with another tweak it's down to 20ms for the worst bug ever
[02:41] <lifeless> you have become very good at this
[02:41] <lifeless> you should blog about it
[02:41] <wgrant> Hah
[02:41] <lifeless> seriously
[02:41] <lifeless> I'm down to 13ms
[02:41] <lifeless> whats the worst bug ever ?
[02:42] <lifeless> or is that bug, person combo I've been using the worst case already ?
[02:42] <wgrant> On DF, 429322 and 419488 are probably at the top of their respective badness dimensions
[02:43] <lifeless> 7ms
[02:43] <lifeless> for the pair I've been testing
[02:43] <lifeless> which is bug 42933 and person 1191103
[02:43] <_mup_> Bug #42933: Kubuntu Installer crashed trying to manually edit partitions table -- Reproduceable <ubiquity (Ubuntu):New> < https://launchpad.net/bugs/42933 >
[02:44] <lifeless> wgrant: http://paste.ubuntu.com/1274251/
[02:45] <lifeless> person 419488 gives me 0.168ms
[02:45] <wgrant> I mean bug 419488
[02:45] <wgrant> One has the most dupes, the other has the most dupe subscribers
[02:45] <lifeless> ah, stay with person 1.1M ?
[02:45] <wgrant> Right
[02:46] <lifeless> 14ms
[02:46] <lifeless> wgrant: I'm curious what you came up with
[02:46] <lifeless> wgrant: and serious about bloggin :)
[02:49] <wgrant> sec
[02:49] <wgrant> just trying with another bug
[02:49] <wgrant> 492322 is very strange
[02:50] <wgrant> Er, 429322
[02:50] <wgrant> It has 1107 duplicates
[02:51] <wgrant> But only 35 dupe subscribers, apparently
[02:51] <wgrant> 659438 has around 1000 of each
[02:52] <wgrant> lifeless: http://paste.ubuntu.com/1274256/ is what I had
[02:52] <lifeless> 333 cold, 18 hot
[02:52] <wgrant> On 659438 the performance is basically identical to yours
[02:52] <wgrant> 26-27ms
[02:52] <lifeless> for 659438
[02:52] <wgrant> I can probably inline two of those CTEs
[02:52] <lifeless> as low as 11
[02:52] <wgrant> I haven't tried since createing the bugsub CTE
[02:52] <wgrant> Which is the critical bit
[02:53] <wgrant> Ah
[02:53] <wgrant> So yours is basically the same
[02:53] <lifeless> I think you'll trigger hash joins
[02:53] <wgrant> Just unformatted and with joins instead
[02:53] <lifeless> I threw a union in there
[02:53] <wgrant> CAn you try mine on prod/staging with 429322, 419488 and 659438?
[02:54] <lifeless> 52,12,20
[02:54] <wgrant> :(
[02:54] <lifeless> 19,11,12
[02:55] <lifeless> 39,16,13
[02:55] <wgrant> Oh
[02:55] <wgrant> Those are three runs from each?
[02:55] <wgrant> So mine's pretty competitive
[02:55] <lifeless> yes
[02:55] <wgrant> Yay
[02:55] <lifeless> yah, and a bit simpler
[02:55] <lifeless> but returning different data
[02:56] <wgrant> Is it? :/
[02:56] <wgrant> Oh, just the missing decorations like person and distincts and such
[02:56] <lifeless> I'm returning the person
[02:56] <wgrant> I ripped them out since they weren't relevant to optimising the core
[02:56] <lifeless> you're returning the subscription id and the bug
[02:56] <wgrant> Yeah
[02:57] <lifeless> 42, 29, 36
[02:57] <wgrant> I was worried that this technique would be terrible for people with few teams on bugs with tonnes of dupe subs
[02:57] <lifeless> doing 6* with changed return column to bugsubscription.person
[02:57] <wgrant> But it seems to be pretty fast
[02:57] <wgrant> Hm
[02:57] <wgrant> Mysterious
[02:58] <lifeless> I get 66,11,25 doing three of mine on it
[02:58] <lifeless> more variance
[02:58] <lifeless> 10,27,27 repeating
[02:59] <wgrant> Anyway, a bit better than what we had before
[02:59] <lifeless> induitable
[02:59] <wgrant> Though I think we might have scared StevenK away
[02:59] <lifeless> y
[02:59] <StevenK> I just got back from lunch
[02:59] <StevenK> Since it's raining cats and dogs, I had to drive
[03:00] <lifeless> teamparticipation_person_idx + bug_duplicateof_idx + bugsubscription_bug_idx and a hash on the directsubscribers result.
[03:00] <wgrant> lifeless: I had this 20ms solution before you started investigating, FWIW :P
[03:00] <wgrant> Right, that's basically the optimal plan, AFAICT
[03:00] <StevenK> Excellent, so now you share :-P
[03:00] <lifeless> for you, teamparticipation_person_idx a BitmapOr on bug_duplicateof_idx, and - here is the difference - a HashAggregate on bugs + bugsubscription_bug_idx scan.
[03:01] <lifeless> and then you also on top o fthat do a hash for the teamwork
[03:01] <lifeless> wgrant: I know, but you were teasing StevenK with a solution you had.
[03:02] <lifeless> wgrant: which btw is slightly cruel.
[03:03] <wgrant> I hadn't revealed I had a good solution yet!
[03:04] <wgrant> I was hoping to walk StevenK through optimising that query, but I had to know I was going the right direction first :)
[03:07] <StevenK> I didn't think he was teasing
[03:08] <StevenK> He was waiting for me to catch up
[03:08] <StevenK> Admitedly, he could have landed and QA'd his solution before I got there
[03:08] <wgrant> lifeless: Inlining my bugs/teams CTEs as joins gets removes the excess hashes.
[03:08] <lifeless> mentoring is good
[03:08] <lifeless> wgrant: /query me the new one
[03:08] <StevenK> wgrant: Ah ha, so we want Joins
[03:09] <wgrant> StevenK: Well, yes and no. The key is a CTE
[03:09] <StevenK> Oh, so BOTH. A CTE and we join on it?
[03:09] <lifeless> wgrant: interestingly I have that union *because* without it I got big dirty great hashes on bugs
[03:10] <wgrant> lifeless: Interesting
[03:10] <lifeless> 258ms
[03:10] <wgrant> lifeless: I've run into that before, but it doesn't end up as a problem here
[03:10] <wgrant> Huh
[03:10] <lifeless> 15ms
[03:10] <wgrant> That very query
[03:10] <wgrant> Ah :)
[03:10] <lifeless> 14ms
[03:10] <wgrant> It's 24ms on DF
[03:10] <lifeless> fwiw mine is still strictly faster :P
[03:11] <wgrant> fuuuu
[03:12] <wgrant> Where's the time?
[03:12] <wgrant> Might be able to save a couple of ms with a slightly different TP join
[03:13] <lifeless> http://paste.ubuntu.com/1274276/
[03:14] <lifeless> both in one paste
[03:14] <lifeless> best of the runs I just did for each.
[03:14] <wgrant> Yours is actually 1.5ms slower on DF
[03:14] <wgrant> Intriguing, thanks
[03:14] <lifeless> inconsequential diff on teams
[03:14] <wgrant> Yeah
[03:15] <wgrant> I'm not sure how your union wins
[03:15] <wgrant> But it does
[03:15] <lifeless> directsubscribers is 3ms faster
[03:15] <lifeless> unions are good
[03:15] <wgrant> Right, but it should be inconsequential here
[03:16] <wgrant> It's probably the extra heap scan on bug
[03:16] <wgrant> Because the bitmapor needs a recheck
[03:16] <wgrant> Yeah, that likely explains it
[03:16] <wgrant> It's trivial in just about every case
[03:17] <lifeless> it is inconsequential, its like 20% on a query with 5000% variance
[03:17] <wgrant> But we are competing over trivial differences here, exactly :)
[03:17] <lifeless> whose competing :)
[03:17] <lifeless> <- ego.
[03:17] <wgrant> Heh
[03:18] <lifeless> right, time to mail silbs
[03:18] <wgrant> :(
[03:19] <StevenK> Indeed :-(
[03:20] <wgrant> Hm
[03:20] <wgrant> You might be getting away without the recheck because of your limited select list
[03:21] <wgrant> Yep
[03:21] <wgrant> I think
[03:22] <wgrant> ... or not
[03:22] <wgrant> lifeless: Your query has the heap scan when run on DF
[03:22] <wgrant> Around the bug_duplicateof_idx scan
[03:22] <wgrant> Same as mine...
[03:23] <wgrant> And that indeed explains the extra 2ms
[03:25] <lifeless> wgrant: combine both and get a win ?
[03:25] <StevenK> Hah, this might mean that we can destroy self._subscriber_cache in IBug
[03:26] <StevenK> If the query in getSubscribersForPerson is performant, we can just check
[03:26] <lifeless> StevenK: long as you don't destroy they 'and folk unsubscribing, unassigning etc still get notified hack that that cache serves'
[03:27] <StevenK> lifeless: The only use of _subscriber_cache was in another method
[03:31] <StevenK> lifeless: So I don't think that cache serves that function
[03:31] <StevenK> I think that cache serves that function because the query was a pig
[03:31] <wgrant> StevenK: It'll be fast, but still not as fast as looking at a local cache
[03:31] <wgrant> We don't want to run this query more than once a request
[03:31] <StevenK> Aww
[03:31] <wgrant> It's still not superfast, and probably cannot be
[03:31] <wgrant> 20ms is quite a while
[03:31] <wgrant> Or 14ms or whatever the latest result was
[03:32] <wgrant> And, in general, you don't want to be repeating queries at all
[03:33] <StevenK> Right
[03:33] <StevenK> So I'll leave it alone, sadly
[03:33] <StevenK> INNER JOIN is the same as LEFT JOIN, isn't it?
[03:34] <wgrant> No
[03:34] <wgrant> LEFT JOIN is shorthand for LEFT OUTER JOIN
[03:34] <wgrant> A plain JOIN is an INNER JOIN
[03:34] <StevenK> Right
[03:34] <wgrant> No, a RIGHT JOIN is also an outer join :)
[03:35] <StevenK> Bleh
[03:36] <StevenK> Blah, lifeless' query is too simple
[03:36] <wgrant> Yeah, lifeless' doesn't give you enough to get the subs/bugs back
[03:36] <wgrant> Just the teams
[03:38] <StevenK> And I don't understand your query either :-(
[03:42] <StevenK> Right
[03:42] <StevenK> Now I have a query, I need to translate it to Storm
[03:42] <wgrant> You just want the bug IDs, right?
[03:43] <StevenK> Which is messy since you're using the CTE as the main table in FROM
[03:43] <wgrant> I forget the original query
[03:43] <wgrant> Right, that is a tad awkward
[03:43] <StevenK> IBug.getSubscribersForPerson() wants (Person, BugSubscription)
[03:43] <wgrant> Ah, I forgot about that method
[03:43] <wgrant> Let's see what it actually does
[03:43] <StevenK> Populates a cache using DRS and results the results
[03:44] <StevenK> returns the results even
[03:49] <wgrant> StevenK: Look what it does with the the BugSubscription, though
[03:49] <wgrant>         def cache_subscriber(row):
[03:49] <wgrant>             subscriber, subscription = row
[03:49] <wgrant>             if subscription.bug_id == self.id:
[03:49] <wgrant>                 self._subscriber_cache.add(subscriber)
[03:49] <wgrant>             else:
[03:49] <wgrant>                 self._subscriber_dups_cache.add(subscriber)
[03:49] <wgrant>             return subscriber
[03:49] <wgrant> It doesn't even really care about the bug ID; it just cares about whether it's this bug or not
[03:50] <StevenK> That's what the DRS does, what about IBug.getSubscribersForPerson()'s callsites?
[03:50] <wgrant> StevenK: They don't get to see the subscription
[03:50] <wgrant> DRS just says "return subscriber"
[03:50] <wgrant> They get the person
[03:50] <StevenK> Right, okay, it really wants bug.id
[03:51] <wgrant> And it's somewhat inefficient to query each person out
[03:51] <wgrant> It might be worth just grabbing (Person.id, Bug.id), then querying the distinct people
[03:52] <wgrant> But then this leads to something interesting
[03:52] <StevenK> wgrant: Why? It's always one person.
[03:52] <wgrant> If you have multiple subscriptions to dupes, it will return the person many times
[03:52] <wgrant> StevenK: No, it may be a team
[03:52] <StevenK> Ah
[03:52] <wgrant> And, in the pathological case we discovered yesterday, it will be crash-bugs-universe 400 times
[03:52] <wgrant> Perhaps look at revising the callsites
[03:53] <wgrant> I suspect they don't need the full functionality of this method, and it can be cut back
[03:53] <StevenK> lib/lp/bugs/browser/bug.py only cares if it's empty
[03:54] <wgrant> I suspect that it's the same with a lot of the others
[03:54] <wgrant> But I haven't looked
[03:54] <StevenK> lib/lp/bugs/browser/bugsubscription.py probably wants distinct anyway
[03:54] <wgrant> I can't see much of a legitimate use for the detail
[03:54] <wgrant> s
[03:54] <StevenK> lib/lp/bugs/browser/bugtask.py seems to want to use it as a preload if it's the callsite I'm thinking of
[03:55] <wgrant>         # This line of code keeps the view's query count down,
[03:55] <wgrant>         # possibly using witchcraft.
[03:56] <wgrant> Encouraging.
[03:56] <wgrant> It doesn't make very much sense
[03:56] <wgrant> So I'd look at the bug and delete it
[03:56] <wgrant> Oh
[03:56] <wgrant> The bug is actually just saying we should delete it
[03:56] <wgrant> It's two years old now
[03:56] <wgrant> Nuke
[03:56] <wgrant> Pray
[03:56] <StevenK> Destroy the function?
[03:57] <StevenK> bugsubscription's use of it is ... interesting
[03:57] <wgrant> Destroy the preloading callsite in browser.bugtask
[03:57] <wgrant> It doesn't care about the output, at least
[03:58] <StevenK> lib/lp/bugs/browser/bug.py is fine, since it doesn't care about the output either
[03:58] <wgrant> Right, so browser.bugsubscription just wants distinct people
[03:58] <StevenK> Right
[03:59] <wgrant> Which ends up being what lifeless' thing does
[03:59] <wgrant> But it may also use the extra caching side-effect
[03:59] <wgrant> Which lifeless' doesn't satisfy
[03:59] <StevenK> bugsubscription doesn't, no
[03:59] <StevenK> Other things in model/bug want it
[04:01] <StevenK> wgrant: Oh, it alreayd does distinct on Person.name, bugsubscription.person_id
[04:01] <wgrant> StevenK: What does?
[04:01] <wgrant> Oh, so it does
[04:02] <wgrant> wtf is it trying to do
[04:02] <wgrant> I assumed it was distinct on person, bugsubscription
[04:04] <StevenK> wallyworld: https://code.launchpad.net/~stevenk/launchpad/destroy-old-bugactivity/+merge/129339
[04:09] <StevenK> wgrant: I don't get it either
[04:11] <StevenK> Given the only thing that cares about its output is a bugsubscription browser property that wants distinct people we can probably just make it sane
[04:12] <wgrant> You should also track down everything that uses the cache
[04:12] <wgrant> Forget what the function does today
[04:13] <wgrant> Just look at what the function and cache callsites need
[04:13] <StevenK> The cache callsites are all in model/bug
[04:13] <StevenK> They're using the cache to work out if the person is subscribed or not and if it is this bug or one of the dupes
[04:14] <StevenK> personIsDirectSubscriber uses _subscriber_cache and _unsubscribed_cache
[04:15] <wgrant> Does anything care about the actual list of subscribed teams?
[04:15] <wgrant> Possibly +subscribe
[04:15] <wgrant> I suspect most things just care about "am I or my teams subscribed to this bug?" and "am I or my teams subscribed to any of this bug's duplicates?"
[04:15] <StevenK> personIsSubscribedToDuplicate uses _subscriber_dups_cache and _unsubscribed_cache
[04:16] <StevenK> That's it
[04:16] <wgrant> So
[04:16] <wgrant> Make a list of the function and cache callsites
[04:16] <wgrant> And what they need
[04:16] <wgrant> What each needs, that is
[04:16] <wgrant> Then we can work out what the function has to do
[04:21] <StevenK> wgrant: http://pastebin.ubuntu.com/1274325/
[04:22] <wgrant> There's also the browser.bug and browser.bugtask callsites
[04:22] <StevenK> browser.bugtask is gone
[04:22] <wgrant> It may be gone
[04:22] <wgrant> It probably is
[04:23] <wgrant> But we should check if the lack of cache might confuse things
[04:25] <wgrant> OK, so I agree with your assessment of those callsites
[04:25] <StevenK> wgrant: http://pastebin.ubuntu.com/1274328/
[04:27] <wgrant> StevenK: Right. So now we need to look at their callsites to see what's done with them
[04:27] <wgrant> Messy :(
[04:28] <StevenK> wgrant: The two cache callsites are easy, they're just a boolean, so the cache is entirely internal
[04:28] <wgrant> StevenK: It's not quite that simple
[04:28] <wgrant> StevenK: They only return a boolean, sure
[04:28] <wgrant> But they take a person argument
[04:29] <wgrant> wat
[04:29] <wgrant> Anyway
[04:29] <StevenK> Digging
[04:30] <wgrant> BugSubscriptionSubscribeSelfView.shouldShowUnsubscribeFromDupesWarning calls personIsDirectSubscriber and personIsSubscribedToDuplicate
[04:30] <wgrant> Indirectly, with all of the user's participated teams
[04:32] <wgrant> That may in fact be the only place it's called with anyone other than the current user
[04:34] <StevenK> Ah
[04:55] <StevenK> wgrant: So what should we do
[04:55] <wgrant> StevenK: Fix the world
[04:56] <StevenK> Hah
[04:57] <wgrant> Most pages probably only care if the user is subscribed
[04:57] <wgrant> Somehow
[04:57] <wgrant> Not whether a specific team is subscribed
[04:58] <StevenK> We want to all this trouble to write a query, so let's make use of it for IBug.getSubscribersForPerson and IPersonSubscriptionInfo._getDirectAndDuplicateSubscriptions?
[04:58] <wgrant> Other than like +subscribe, basically everything else probably just wants to know if there's a participated subscription for the bug, or for its dupes
[04:58] <wgrant> Right, either way the reworked query is necessary there :)
[04:58] <wgrant> So it may be best to work on that first
[04:58] <wgrant> Now, IIRC that just wants bug.id?
[04:59] <StevenK>         info = store.find(
[04:59] <StevenK>             (BugSubscription, Bug, Person),
[04:59] <wgrant> That's what it asks for
[04:59] <wgrant> What it asks for and what it wants are often not the same thing
[04:59] <StevenK> It adds the bug to a set
[05:00] <StevenK> It adds the subscriber, bug and bugsubscription to another set
[05:00] <wgrant> Ah, it uses it to build the annotations, right
[05:00] <wgrant> Still don't know why that's there, but I guess we should preserve it
[05:01] <lifeless> or nuke with prejuidice
[05:01] <lifeless> WCPGW
[05:01] <StevenK> lifeless: Then you deal with the fallout
[05:02] <StevenK> I'd like to get this landed before EOD due to holidays
[05:02] <lifeless> <sarcastic response elided>
[05:02] <wgrant> Oh right, holidays
[05:03] <StevenK> wgrant: Yes! And we're so damn close!
[05:03] <StevenK> Which I'm in favour of just fiddling the two queries and then review/landing
[05:05] <wgrant> http://paste.ubuntu.com/1274347/
[05:05] <wgrant> May be a reasonably easy way to stormify it
[05:06] <wgrant> You can grab BugSubscription, Bug, Person in the outer query as before
[05:06] <wgrant> But all the filtering is done in the CTEs
[05:06] <wgrant> I think the CTEs have to be split like that, or it'll try to optimise
[05:06] <StevenK> wgrant: You assume I've done CTEs with Storm before ...
[05:07]  * StevenK finds a nice example in sharingservice
[05:08] <wgrant> StevenK: Grep for store.with_
[05:08] <wgrant> Yeah
[05:08] <wgrant> sharingservice's isn't too bad
[05:08] <wgrant> There's also a good one in I forget where
[05:09] <wgrant> bugsummaryrebuild is a nice example of using complex CTEs to pull complex data out
[05:09] <wgrant> But we don't need that here.
[05:10] <wgrant> Still, the bugsummaryrebuild stuff might be a good example
[05:11] <StevenK> wgrant: How do I shoehorn the join in Select() ?
[05:12] <wgrant> StevenK: You can either specify it explicitly with eg. tables=[Foo, Join(Bar, Bar.id == Foo.bar_id)], or tables=[Foo, Bar], where=(Bar.id == Foo.bar_id)
[05:20] <StevenK> wgrant: http://pastebin.ubuntu.com/1274363/ and then I'm stuck
[05:24] <wgrant> StevenK: http://pastebin.ubuntu.com/1274369/ might work
[05:24] <wgrant> I haven't tested it
[05:24] <wgrant> And it's only slightly different from yours
[05:24] <wgrant> But I think it might work
[05:25] <StevenK> Except for the hanging Person.id which is what I was actually asking about
[05:25] <wgrant> Oh
[05:25] <wgrant> Person.id == BugSubscription.personID
[05:25] <wgrant> Or person_id
[05:25] <wgrant> I forget if it's new or old
[05:26] <wgrant> And Bug.id == BugSubscription.bugID
[05:26] <StevenK> person_id
[05:28] <StevenK> wgrant: So refactor the with_statement out and make use of it in model/bug?
[05:28] <wgrant> StevenK: Right, probably
[05:29] <wgrant> StevenK: postgres doesn't try very hard to optimise across CTEs, so they usually act as a handy optimisation barrier
[05:29] <wgrant> So you can more reliably substitute them into another query and have them behave as you expect
[05:37] <StevenK> Right, pyflakes is happy at least
[05:41] <wgrant> And the queries work?
[05:41] <wgrant> You probably also want to throw the branch at DF
[05:41] <wgrant> To confirm
[05:42] <StevenK> I'm even earlier than that, I'm running tests.
[05:43] <wgrant> Well, if the query compiles at all then that's good enough for me
[05:43] <StevenK> So far it doesn't, and I've been fixing it
[05:43] <wgrant> Ah
[05:44] <StevenK> TypeError: __init__() takes exactly 3 arguments (5 given)
[05:44] <StevenK> Still getting that
[05:44] <wgrant> From which constructor? Select()?
[05:44] <StevenK>     with_statement = generate_subscription_with(bug, person)
[05:44] <StevenK>   File "/home/steven/launchpad/lp-branches/no-more-o-n-queries-bug-dupes/lib/lp/bugs/model/bug.py", line 2847, in generate_subscription_with
[05:44] <StevenK>     where=[TeamParticipation.personID == person.id]))
[05:44] <StevenK> TypeError: __init__() takes exactly 3 arguments (5 given)
[05:45] <StevenK> So ... I guess?
[05:45] <wgrant> StevenK: Oh
[05:45] <wgrant> You need a second With()
[05:46] <StevenK> Oh, handy.
[05:46] <nigelb> lifeless: are you leaving Canonical?
[05:46] <wgrant> [With('foo', ...), With('bar', ...)]
[05:46] <wgrant> I think
[05:48] <StevenK> ProgrammingError: relation "all_bugscriptions" does not exist
[05:48] <StevenK> Oh, facepalm
[05:48]  * StevenK goes to wipe the egg from his face
[05:48] <lifeless> nigelb: I have left as of ~2 hrs back
[05:48] <lifeless> StevenK: nice typo
[05:48] <wgrant> bugscriptions
[05:48] <wgrant> That's new one
[05:48] <wgrant> a new one
[05:49] <nigelb> lifeless: oh!
[05:49] <StevenK> lifeless: Oh, thanks. I try.
[05:49] <nigelb> lifeless: congrats! What next? :)
[05:49] <StevenK> This is what I get for rushing
[05:49] <lifeless> nigelb: I start at HP Cloud Services on Monday
[05:49]  * nigelb blinks
[05:49] <nigelb> Nice!
[05:50] <StevenK> ProgrammingError: column reference "id" is ambiguous
[05:50] <StevenK> LINE 1: ...ption, Person WHERE BugSubscription.id IN (SELECT id FROM bu...
[05:50] <StevenK> Haha
[05:50] <StevenK> I guess that wants bugsubscription.id
[05:51] <lifeless> nigelb: I hope so :)
[05:51] <lifeless> nigelb: be a real PITA to go through the effort of leaving... and then have it be a bad choice :)
[05:51] <nigelb> Hehe
[05:51] <StevenK> You should find out by Monday afternoon :-P
[05:51] <nigelb> Been there, done that :)
[05:52] <spm> o/ nigelb
[05:54] <nigelb> Hey spm!
[05:54] <StevenK> wgrant: So now I'm getting an AssertionError from PSI.add
[05:55] <wgrant> PackageSetInclusion?
[05:55] <wgrant> Oh, PersonSubscriptionInfo?
[05:55] <StevenK> Yeah
[05:55] <wgrant> What is the assertionerror?
[05:55] <StevenK>     assert principal.is_team, (principal, self.person)
[05:55] <StevenK> AssertionError: (<Person at 0x122a14d0 name12 (Sample Person)>, <Person at 0x12b71c90 person-name-100517 (Person-name-100517)>)
[05:56] <wgrant> Sounds liek the query might not be returning just relevant people
[05:58] <StevenK> So my query is buggy :-(
[06:02] <wgrant> StevenK: What is the query?
[06:05] <StevenK> wgrant: http://pastebin.ubuntu.com/1274401/
[06:06] <wgrant> StevenK: I'm surprised that postgres accepts it
[06:06] <wgrant> "(SELECT all_bugsubscriptions" probably wants to be "(SELECT all_bugsubscriptions.id"
[06:06] <wgrant> I'm not sure what it's interpreting the former as
[06:06] <wgrant> But it might not be what you want
[06:07] <StevenK> Heh
[06:08] <StevenK> wgrant: Same AssertionError
[06:09] <StevenK> With AS (SELECT all_bugsubscriptions.id FROM all_bugsubscriptions ...
[06:10] <wgrant> StevenK: Oh
[06:10] <wgrant> WHERE BugSubscription.id IN
[06:10] <wgrant>     (SELECT bugsubscription.id
[06:10] <wgrant>      FROM bugsubscriptions)
[06:10] <wgrant> Spot the issue
[06:10] <StevenK> Hahahaha
[06:11] <StevenK> wgrant: Shall I revert the SELECT all_bugsubscriptions bit then?
[06:11] <StevenK> Or that's fine too?
[06:11] <wgrant> You should need both fixes
[06:12] <wgrant> It even returns the same number of rows on DF
[06:12] <wgrant> WIth those fixes
[06:13] <StevenK> Total: 35 tests, 0 failures, 0 errors in 48.975 seconds.
[06:13] <StevenK> (test_bug_views)
[06:13] <wgrant> :)
[06:13] <StevenK> It neatly exercised both queries, totally by accident
[06:14] <wgrant> (that query is also confirmed fast on DF)
[06:14] <StevenK> :-D
[06:14] <StevenK> So, commit and push?
[06:15] <wgrant> Might as well
[06:20] <StevenK> Right, done
[06:20]  * StevenK waits for the branch scanner
[06:32] <StevenK> wgrant: https://code.launchpad.net/~stevenk/launchpad/no-more-o-n-queries-bug-dupes/+merge/129355
[07:53] <czajkowski> morning
[08:03] <adeuring> good morning
[10:50] <rick_h_> morning
[10:52] <czajkowski> rick_h_: ello
[13:09] <wgrant> blah
[13:11] <czajkowski> wgrant: we can assign you everything if thats what you want
[15:45] <adeuring> bac, sinzui, flacoste: could one of you please review my latest attempt for the security adapter for IProduct: https://code.launchpad.net/~adeuring/launchpad/authentication-for-private-products-2/+merge/129459 ?
[15:47]  * sinzui looks
[15:50] <flacoste> adeuring: i'll leave someone else to actually review the code, but the explanation you give in the MP looks very good to me
[15:50] <adeuring> thanks :)
[15:51] <sinzui> adeuring, looks good. I think we can simplify one test. overall r=me
[15:51]  * sinzui update MP with smaller test
[15:51] <adeuring> sinzui: thanks
[15:57] <czajkowski> sinzui: a question from potential user on commerical, I'm unsure of. in what form code is stored
[15:57] <czajkowski> locally at your site -- is there any encryption ?
[15:57] <sinzui> no
[15:58] <czajkowski> that answers that then
[15:58] <czajkowski> thank you
[17:27] <abentley> gary_poster: I believe I've found a bug in Zope: http://pastebin.ubuntu.com/1275277/
[17:28] <gary_poster> abentley, hi.  I'm actually mildly curious--those two particular parts of Zope are historically rock solid, with expectation bugs if anything--but I'm in crunch mode till this upcoming Tuesday's demo
[17:29] <gary_poster> bugs in the expectations of the code, is what I mean
[17:29] <gary_poster> I mean people expect something different than it does
[17:29] <gary_poster> sigh :-)
[17:36] <abentley> gary_poster: What I can tell is that the adaptation of ProductSeries to IInformationType (which returns a Product) has the side effect of changing the Checker for the Product so that Checker.set_permissions does not include 'bugtracker'.
[17:36] <abentley> Even though I don't use the instance of Product that's returned.
[17:44] <abentley> gary_poster: When you do have time, I've pushed the example to lp:~abentley/launchpad/adaptation-checker-insanity
[17:47] <abentley> deryck, rick_h_, adeuring: Adaptation (i.e. to IInformationType) has side effects that look like a bug in Zope.  http://pastebin.ubuntu.com/1275277/
[17:49] <deryck> abentley, hmm, yeah, that looks odd.
[17:50] <abentley> deryck: So if you see ForbiddenAttribute, it doesn't necessarily mean that the attribute isn't exposed in configure.zcml.
[17:51] <deryck> abentley, can you work around it?
[17:51] <abentley> deryck: Yes.  First I'll see if using IProductUtil helps.  If not, I can just check self.product.information_type.
[17:52] <abentley> s/IProductUtil/IPackagingUtil/
[17:52] <deryck> ok, cool.
[17:53] <gary_poster> ack abentley
[18:11] <sinzui> bac, Do you have time to review a rewrite of a docttest? https://code.launchpad.net/~sinzui/launchpad/nomination-investigation-1/+merge/129489
[18:12] <bac> sinzui: actually, no.  i'm trying to land two branches.  maybe later.
[18:12] <bac> hi statik!
[18:12] <sinzui> thanks bac
[18:25] <abentley> rick_h_: When I go to change information_type for a product, I see radio buttons, not the choice widget.  And it includes choices it shouldn't (Public Security, Private Security...)
[18:39] <rick_h_> abentley: heading out the door, but shoot me the info of where you see this and I'll look.
[18:39] <abentley> rick_h_: https://launchpad.dev/firefox/+edit
[18:41] <rick_h_> https://qastaging.launchpad.net/launchpad/+edit I see the edit, only the three options
[18:41] <rick_h_> firefox a projectgroup? maybe I missed something there
[18:51] <abentley> rick_h_: No, firefox is a product, same as elsewhere.
[20:09] <abentley> rick_h_: If I disable JavaScript, I see the same thing on qastaging.
[20:19] <abentley> deryck: If you're around, could you please review https://code.launchpad.net/~abentley/launchpad/no-proprietary-linked-products/+merge/129507 ?
[20:22] <deryck> abentley, sure
[20:22] <abentley> deryck: Thanks.
[20:22] <deryck> abentley, np
[20:47] <deryck> abentley, r=me
[20:47] <abentley> deryck: Thanks.