[00:41] wgrant, Although it is unfortunate for several reasons that we have to support obsolete series, I'm curious why you care as it isn't you that has to do the work to support the obsolete release. :P
[00:42] cody-somerville: Publisher performance.
[00:42] cody-somerville: We can save a lot of time if we have to publish 6 releases, not 14.
[00:43] We should probably be querying for dirty series anyway, but it's a whole lot easier to just skip obsolete ones.
[00:46] wgrant, sounds like a scaling issue.
[00:47] cody-somerville: It is.
[00:47] But I'd also like to reject uploads to obsolete series.
[00:47] we has a lot of them
[00:47] I can haz perf problems
[00:47] I now haz full publisher log, so I can actually see where the problems lie.
[00:49] https://devpad.canonical.com/~lpqateam/qa_reports/deployment-stable.html is updated
[00:49] Need simple eager loading :(
[01:01] wgrant: well, I've specced it now... you can write it if you like :)
[01:04] Hmm.
[01:04] I'd really like to see how much faster the PPA publisher is if we turn off cache flushing.
[01:05] elmo: have you seen http://blog.benstrong.com/2010/11/google-and-microsoft-cheat-on-slow.html ?
[01:24] flacoste: I wasn't arguing against the desiloization in general or arguing for soyuz retaining specific people - as I think I've said to you and certainly to lifeless before now, I'm all for desiloization, I think it's important. what I was reacting to was the idea that soyuz is no different to e.g. bugs or jml's "be bold, be confident, do what the fuck you want" (I may be paraphrasing) and wanted to reinforce the idea that soyuz is not like the rest of L
[01:24] lifeless: I hadn't - that's interesting
[01:25] elmo: your long statement cut off at 'the rest of L'
[01:25] lala
[01:25] P - it a) doesn't have the safety net of the browser in terms of impact on systems and b) can impact orders of magnitude more systems in an automated way compared to other LP systems that also don't have the browser safety net (e.g. code)
[01:25] flacoste: ^--
[01:33] man, now i'm going to be reading this tcpm list for hours
[02:00] elmo: sorry!
[02:06] === Top 10 Time Out Counts by Page ID ===
[02:06] Hard / Soft  Page ID
[02:06]  168 / 6026  Archive:+index
[02:06]   94 /  256  POFile:+translate
[02:06]   90 /  269  BugTask:+index
[02:06]   19 /  268  Distribution:+bugs
[02:06]   16 /  118  ProjectGroupSet:CollectionResource:#project_groups
[02:06]   15 /  282  Distribution:+bugtarget-portlet-bugfilters-stats
[02:06]   15 /    2  ProjectGroup:+milestones
[02:06]   10 /   21  DistroSeries:+queue
[02:06]   10 /   19  DistroSeriesLanguage:+index
[02:06]    8 /    0  ProductSeries:+edit
[02:07] lifeless, what exactly is the difference between a hard and a soft timeout? is it where it ran up against our time limits vs when the server actually gives up and dies?
[02:08] that may be a stupid question, but i just realized i don't actually know.
[02:12] soft timeouts are requests that complete ok, longer than X where X is configured in lp-production-configs
[02:12] hard timeouts are requests that trip the timeout code (grep for RequestExpired) and get interrupted
[02:13] the default hard timeout is currently 15 seconds
[02:13] and some page ids have higher ones, while we recover from pg8.4
[02:30] lifeless: Any chance you could throw one of the recent DistroSeries:+queue timeout query logs my way?
[03:45] wgrant: sorry no
[03:45] wgrant: the grouping in the report means that high frequency things crowd out the details for lower ones
[03:46] wgrant: so I don't have a +queue OOPS that is a timeout.
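
(A minimal sketch of the soft/hard distinction lifeless describes above. Only RequestExpired and the 15-second default come from the log; SOFT_TIMEOUT, classify_request, and the wrapper shape are illustrative - in Launchpad the hard timeout is enforced inside the request machinery, not by a wrapper like this.)

    # Sketch only: a soft timeout completes but exceeds a configured threshold;
    # a hard timeout is interrupted (RequestExpired) at the hard limit.
    import time

    HARD_TIMEOUT = 15.0   # current production default, per the log
    SOFT_TIMEOUT = 5.0    # hypothetical; the real value lives in lp-production-configs

    class RequestExpired(Exception):
        """Raised when a request trips the hard timeout and is cut off."""

    def classify_request(render):
        start = time.monotonic()
        try:
            render()                  # the page render; hard timeout enforced within
        except RequestExpired:
            return "hard"             # counted in the 'Hard' column of the report
        elapsed = time.monotonic() - start
        return "soft" if elapsed > SOFT_TIMEOUT else "ok"
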
[03:46] wgrant: https://bugs.launchpad.net/soyuz/+bug/276950 is the bug
[03:46] <_mup_> Bug #276950: DistroSeries:+queue Timeout accepting many packages queue page
[03:46] wgrant: and it has a one month old bug
[03:46] one month old OOPS
[03:49] lifeless: Ah, OK.
[03:49] Thanks.
[03:49] I'll pull out data from it in a sec
[03:54] wgrant: updated
[03:55] lifeless: Thanks.
[03:58] wow
[03:58] Hm?
[03:59] https://bugs.launchpad.net/soyuz/+bug/276950/comments/12
[03:59] <_mup_> Bug #276950: DistroSeries:+queue Timeout accepting many packages queue page
[04:00] Uh.
[04:01] Why didn't it use packageupload__distroseries__status__idx?
[04:01] Or does it really need a full (archive, distroseries, status) index?
[04:02] (that should be the index anyway, but I still think it should have been able to get some benefit from the existing one)
[04:02] it may be outdated statistics
[04:03] its the order by
[04:03] you need an index with the order key, to use the index
[04:03] SELECT PackageUpload.archive, PackageUpload.changesfile, PackageUpload.date_created, PackageUpload.distroseries, PackageUpload.id, PackageUpload.pocket, PackageUpload.signing_key, PackageUpload.status FROM PackageUpload WHERE packageupload.distroseries = 103 AND packageupload.archive IN (1, 534) AND packageupload.status IN (0) LIMIT 31 OFFSET 0;
[04:03] Interesting.
[04:03] (1 row)
[04:04] Despite there being only one row.
[04:04] Time: 1.601 ms
[04:04] standard sql rules
[04:04] So I guess we want an (archive, distroseries, status, id) index?
[04:04] -id
[04:04] but yeah
[04:05] Limit (cost=0.00..90.66 rows=31 width=36) (actual time=58.704..58.709 rows=1 loops=1)
[04:05]   ->  Index Scan using packageupload__distroseries__status__idx on packageupload (cost=0.00..8946.13 rows=3059 width=36) (actual time=58.701..58.704 rows=1 loops=1)
[04:05]         Index Cond: ((distroseries = 103) AND (status = 0))
[04:05]         Filter: (archive = ANY ('{1,534}'::integer[]))
[04:05] without the order by
[04:05] That's a little better.
[04:06] 1.5ms hot :)
[04:06] anyhow
[04:06] theres another index that I created, you do archive, distroseries, status, id DESC
[04:06] or something like that
[04:06] That's not the normal timeout, but it'll do for now.
[04:06] we may be able to do
[04:07] shaving 1.5s off normal is WIN
[04:07] "packageupload__distroseries__status__idx" btree (distroseries, status)
[04:07] "packageupload__distroseries__status__idx" btree (distroseries, status, id DESC)
[04:07] may be enough
[04:07] experiment on your local machine
[04:07] disable all scans
[04:07] How do I disable scans?
[04:08] http://www.postgresql.org/docs/8.4/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-ENABLE
[04:09] Thanks.
[04:09] so its set
[04:09] set enable_seqscan = off
[04:35] wgrant: did that help?
[04:36] wgrant: also see http://www.postgresql.org/docs/8.4/static/sql-createindex.html
[04:36] look for ORDER BY
[04:40] lifeless: Debugging Soyuz DB corruption issues at the moment.
[04:40] \o/
[04:41] looking at this
[04:41] I'd do
[04:41] distroseries, status, archive, id
[04:42] as the index, replace status__idx
[04:44] lifeless: I've tried a few combinations, but it will always use an explicit sort step. If I disable explicit sort steps, it reverts to using distroreleasequeue_pkey
[04:45] Regardless of sort direction.
[04:45] Odd.
[04:45] Oh.
[04:45] There we go.
[04:45] .. no.
[04:45] It will not use the index.
[04:46] Can I deprioritise distroreleasequeue_pkey somehow?
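
(A sketch of the experiment wgrant is running here, assuming a local psycopg2 connection - the DSN is a placeholder: build the candidate index lifeless suggests, turn off sequential scans via the enable_* planner knobs from the docs linked above, and EXPLAIN ANALYZE the slow +queue query.)

    import psycopg2

    QUERY = """
        EXPLAIN ANALYZE
        SELECT PackageUpload.archive, PackageUpload.changesfile,
               PackageUpload.date_created, PackageUpload.distroseries,
               PackageUpload.id, PackageUpload.pocket,
               PackageUpload.signing_key, PackageUpload.status
        FROM PackageUpload
        WHERE PackageUpload.distroseries = 103
          AND PackageUpload.archive IN (1, 534)
          AND PackageUpload.status IN (0)
        ORDER BY PackageUpload.id DESC
        LIMIT 31 OFFSET 0
    """

    conn = psycopg2.connect("dbname=launchpad_dev")  # placeholder DSN
    with conn, conn.cursor() as cur:
        # Candidate 4-column index from the discussion above.
        cur.execute("""
            CREATE INDEX packageupload__distroseries__status2__idx
                ON PackageUpload (distroseries, status, archive, id)
        """)
        # Planner knob from the runtime-config-query docs: steer the planner
        # away from sequential scans so index candidates get considered.
        cur.execute("SET enable_seqscan = off")
        cur.execute(QUERY)
        for (line,) in cur.fetchall():  # EXPLAIN output, one text row per line
            print(line)
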
[04:50] well
[04:50] the stats will play into what it uses
[04:50] I guess I really need to try on real data.
[04:50] Right.
[05:12] hey stub
[05:12] reckon you could create an index on qastaging to test a query improvement for wgrant and me?
[05:12] create index packageupload__distroseries__status2__idx on packageupload(distroseries, status, archive, id);
[05:13] this is the query: 'SELECT PackageUpload.archive, PackageUpload.changesfile, PackageUpload.date_created, PackageUpload.distroseries, PackageUpload.id, PackageUpload.pocket, PackageUpload.signing_key, PackageUpload.status FROM PackageUpload WHERE packageupload.distroseries = 103 AND packageupload.archive IN (1, 534) AND packageupload.status IN (0) ORDER BY packageupload.id desc LIMIT 31 OFFSET 0;'
[05:18] sinzui: The bugjam doesn't start for another 9 days. You don't need to start drowning us already :P
[05:43] lifeless: yo
[05:46] Currently that is a crap query (28 sec)
[05:46] It wasn't always :(
[05:47] yer - surprised it isn't using the distroseries,status index
[05:47] Exactly.
[05:47] That should really be archive, distroseries, status nowadays, but still.
[05:47] We are missing an index on archive
[05:48] (different problem)
[05:48] Oh, true, I misread the fkey as an index.
[05:50] Just adding an archive, status index brings it down to 1.6 seconds
[05:50] Oh... must be warm cache now. It isn't using that index ;)
[05:52] stub: it needs an index on id
[05:52] stub: because of the order by
[05:52] yeah - warming up to that
[05:52] stub: thus the 4-element one I'm proposing
[05:55] Its always wanting to use the primary key index
[05:58] That's what I found locally.
[05:58] I hoped real data would work better :(
[05:59] Define 'better'. PG is sometimes smarter than we are when picking plans. It might be correct that this approach is faster with the real data
[06:00] Ok... id needs to be first
[06:01] stub: ?!
[06:01] ... what?
[06:01] stub: win
[06:01] Limit (cost=0.00..518.75 rows=31 width=36) (actual time=0.061..92.017 rows=1 loops=1)
[06:01]   ->  Index Scan Backward using pu_test7 on packageupload (cost=0.00..52209.53 rows=3120 width=36) (actual time=0.059..92.014 rows=1 loops=1)
[06:01]         Index Cond: ((distroseries = 103) AND (status = 0))
[06:01]         Filter: (archive = ANY ('{1,534}'::integer[]))
[06:01] Total runtime: 92.065 ms
[06:01] (5 rows)
[06:01] How does that make sense?
[06:01] stub: whats the defn of that index?
[06:01] It doesn't have to materialize the whole thing before truncating 'first 30 items'
[06:02] It can pull suitably filtered rows straight off the index in order.
[06:02] Ahh.
[06:02] stub: thats still way worse than if you drop the ORDER BY :(
[06:02] create index pu_test7 on packageupload(id, distroseries, status, archive);
[06:02] stub: 50x worse
[06:03] Looking at that, it doesn't bother using the index for archive
[06:04] what about
[06:04] (not surprising if those archives are big ones... it might get used for small archives so we can leave that column in the index)
[06:04] Given the distribution that might not be unreasonable.
[06:04] create index packageupload__distroseries__status2__idx on packageupload(distroseries, status, archive, id desc);
[06:04] lifeless: tried that one
[06:05] stub: with the DESC ?
[06:05] "pu_test1" btree (archive, status)
[06:05] "pu_test2" btree (archive, status, id)
[06:05] "pu_test3" btree (distroseries, archive, status, id)
[06:05] "pu_test4" btree (archive, status, id DESC)
[06:05] "pu_test5" btree (distroseries, status, archive, id)
[06:05] "pu_test6" btree (distroseries, status, archive, id DESC)
[06:05] "pu_test7" btree (id, distroseries, status, archive)
[06:05] kk
[06:05] DESC shouldn't make a difference in this case - it can use the index backwards just as easily. If you were ordering by 'archive, id DESC' that would be different.
[06:06] yeah, theory n prac though
[06:06] Yup. I always pick my theory after empirical results are in ;)
[06:07] I sound smarter that way.
[06:08] Do we need the index for this rollout?
[06:09] It's not a recent regression.
[06:09] So I'd imagine not.
[06:10] I can apply the patch live if we want to skip getting a patch on the production branch
[06:10] Tomorrow though - already got some indexes building on production for translations
[06:10] no panic
[06:10] but we should figure it out :)
[06:11] If you put in a db patch, please add the index on archive too... I don't like having an ON DELETE CASCADE foreign key reference without the index to support that quickly.
[06:11] Shall I drop these indexes on qastaging now so we are qaing what we think we are qaing?
[06:15] stub: please
[06:16] stub: I don't know that we've got a good enough index yet.
[06:16] stub: for the archive one, tell wgrant what it should be, he can add a patch for that for your happiness :)
[06:16] CREATE INDEX packageupload__archive__idx ON PackageUpload(archive);
[06:18] I can't see how to make that index better
[06:19] and 91ms is pretty good
[06:19] unordered its 1.6ms :)
[06:20] Yes, but that is pulling rows in disk order rather than randomly - far fewer pages to pull up off disk.
[06:20] One block vs 30 odd.
[06:21] I can make an index that is used that runs slower ;-)
[06:21] oh?
[06:22] yer - you want the terms that filter the most rows on the left.
[06:23] bbiafew hours
[06:23] But with timings this low, we are dealing with the shape of the data for those parameters
[06:23] Ditto
[15:11] wgrant, sorry. This is not about the bugjam. I want every registry bug tagged with something I can find before the bugs are moved to launchpad.
[15:21] 'morning
[18:08] sinzui: I think its great that you're tagging separately from 'closing'... I'm expecting the bugjam to close bugs it shouldn't really.
[18:08] theres not much nuance in the message :)
[18:09] lifeless, I filter all bug mail from me to junk :)
[18:10] sinzui: I read all bugmail on all lp projects :)
[18:10] + all bzr projects
[18:10] + my own things
[18:11] I think bug mail is easy to read with the right filters. Maybe we should start a project/wiki of bug mail filters
[18:13] might be interesting
[18:13] I'm reading it without filters atm
[18:14] wow. There are toddlers who are faster at reading than me. I cheat to keep up
[18:14] I've got fairly good at short circuiting things as I go
[18:25] sinzui: what do you think of alexanders comments on bug 525235
[18:25] <_mup_> Bug #525235: milestone page: expected release said XX hours ago, when scheduled for today
[18:26] I have put some thought into that
[18:26] we use a date field, not a datetime, but our formatter assumes it is a datetime
[18:27] I think we want to assume the date + 23:59:59.9 seconds so all the UI cases look right
[18:29] lifeless, the case of "today" is like the case of "minutes ago": our formatters are ridiculously precise (2 seconds ago).
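
(Sinzui's date + 23:59:59.9 suggestion, sketched in modern Python with illustrative names - the real code is a TALES formatter, not shown here: pin a date-only milestone value to the end of its day before handing it to a datetime-precision formatter, so "scheduled for today" never renders as hours ago.)

    from datetime import date, datetime, time, timezone

    def as_end_of_day(value):
        """Pin a date-only value to 23:59:59.9 UTC on its day (hypothetical helper)."""
        if isinstance(value, datetime):
            return value          # already has time-of-day precision
        return datetime.combine(value, time(23, 59, 59, 900000), tzinfo=timezone.utc)

    # A milestone expected today now shows as "due in N hours", not "N hours ago".
    remaining = as_end_of_day(date.today()) - datetime.now(timezone.utc)
    print(remaining)
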
[18:29] right
[18:30] another bug is open about that :)
[18:30] I think we are tracking that issue in launchpad-web tagged with tales
[18:30] \o/ one bug tracker
[18:30] I think an engineer could close all the tales bugs in 2 days
[18:30] that would be awesome
[18:30] then we could delete the project :)
[18:32] sinzui: what about bug 212439 - application vs acceptance ?
[18:32] <_mup_> Bug #212439: Incorrect date for "member since" on +members, round 2
[18:32] sinzui: it seems reasonable to me that until you're accepted, you're not a member.
[18:34] I think there are really two issues here. One is an effort to fix the existing data with data that does not exist. We really do not know the answer to when you were accepted
[18:34] sinzui: poll support is being yanked, right?
[18:35] sinzui: so, I accept that we can't retrospectively fix, but when someone goes through the process now, do we show 'accepted' or 'applied' in the 'member since' field ?
[18:35] As to recording that information, we could... but Launchpad is not hosting the team. The correct way is to register a team: Lp could learn of the team, and the memberships, years too late
[18:35] sinzui: if we show applied, I think its a legit bug - we should update the application date when they are accepted.
[18:36] Polls will be removed. Someone will get to close twenty bugs soon
[18:36] sinzui: you've already started :)
[18:36] sinzui: so, https://edge.launchpad.net/~pythonistas/+members *is* an lp team
[18:37] sinzui: this leaves me confused
[18:37] the member since/accepted issue assumes the team only exists in Lp. I can be a member of a team before Lp learns of it and before I decide it is now time to register with Lp and join the team
[18:37] sinzui: right, but thats a bit meta
[18:38] I mean, its addressable by saying 'member of (in launchpad)'
[18:38] as the column heading
[18:38] The admin will need a new field to track that data....
[18:38] sinzui: why?
[18:38] note that we do not have a date when a team was registered.
[18:38] sinzui: I'm really confused.
[18:38] sinzui: the bug doesn't talk about membership *outside* lp.
[18:39] The issue is that there is an underlying assumption that Lp knows about these dates
[18:39] sinzui: thats *an* issue, not the only one.
[18:40] sinzui: I agree that addressing the issue with the underlying assumption might also address the issue the users are talking about.
[18:40] We do not know when a team was registered. membership dates in a team are mutable, they change often as the user's role changes
[18:41] sinzui: mmm, holding-a-role dates change, part-of-team doesn't fluctuate like that, neither in reality nor as lp models it
[18:41] join-exit-join does cause terrible modelling issues.
[18:41] lifeless, I think my tagging of "teams" now shows that that is the most common problem in the registry domain. Teams do not work as users expect.
[18:41] sinzui: ok
[18:42] sinzui: so, I guess I'm saying: bug 212439 sounds like a reasonable thing we could fix, even if its not the big picture. You seem to disagree (its marked wont fix); why?
[18:42] <_mup_> Bug #212439: Incorrect date for "member since" on +members, round 2
[18:42] lifeless, I cannot fix that user's data. that is why.
[18:42] (indeed, it sounds like a trivial thing to fix)
[18:43] The data does not exist
[18:43] sinzui: its not a request for data fixing
[18:43] We could record the data, then tell admins that we have a field that can be set to correct the lost data
[18:43] they are asking why lp didn't record the time that the admin clicked on 'accept' in the team.
[18:44] sinzui: separate the concerns. a) fix existing data, b) do better in future.
[18:44] I'm talking about b) only
[18:44] The reason is that all membership changes overwrite the date
[18:44] sinzui: they seem to be saying that they *don't*
[18:44] see the other bugs that note that that underlying issue is wrong
[18:44] or are you saying 'if they had joined recently the acceptance date is what would have been shown' ?
[18:45] bugger
[18:45] I misread the other bugs
[18:46] The other bugs are about *expiration* date, not acceptance date
[18:49] sinzui: I might reopen this one?
[18:49] There is still an issue that we do not know how to fix the data
[18:50] right
[18:50] but I think we might be able to do better going forward
[18:50] (date_review, date_accepted) -> date_joined. They are used based on subscription policy, and that too is mutable
[21:13] wgrant: care to put a patch up for CREATE INDEX packageupload__archive__idx ON PackageUpload(archive);
[22:03] lifeless: Will do it later today.
[22:06] wgrant: kk
[22:30] Hmmmmmmmmmmmmmmmmmm.
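
(For reference, the reasoning behind stub's earlier request: PackageUpload.archive is an ON DELETE CASCADE foreign key, and PostgreSQL does not index referencing columns automatically, so each Archive deletion has to scan PackageUpload for matching rows unless that column is indexed. A sketch, with a placeholder DSN, of checking that the new index picks up that lookup:)

    import psycopg2

    conn = psycopg2.connect("dbname=launchpad_dev")  # placeholder DSN
    with conn, conn.cursor() as cur:
        # The index wgrant agreed to put up a patch for.
        cur.execute(
            "CREATE INDEX packageupload__archive__idx ON PackageUpload (archive)"
        )
        # Roughly the lookup a cascade delete performs per Archive row;
        # EXPLAIN should now report an index scan rather than a seq scan.
        cur.execute("EXPLAIN SELECT 1 FROM PackageUpload WHERE archive = %s", (534,))
        for (line,) in cur.fetchall():
            print(line)
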