[05:04] wallyworld__: I've reviewed your branch
[05:04] "And it's all rubbish"
[05:04] :-P
[05:04] Not quite :)
[05:05] wgrant: thanks, i'll look after i get back from school pickup
[06:36] wgrant: i can't use rs.find().set() because that doesn't return the rowcount
[06:37] Ah
[06:37] also, i can't include lpsprc.dateuploaded < dateuploaded in the where condition, i have to use greatest() so rowcount correctly says if that row exists or not
[06:37] wallyworld__: You may be better off doing an initial SELECT to work out whether you need to UPDATE or INSERT
[06:37] that's an extra select i'd rather avoid
[06:37] unless you think it won't affect things
[06:37] But that turns each batch into 4 queries
[06:38] Currently you're doing 2+n queries
[06:38] hmmm. i do 1 update for creator != null, 1 for maintainer != null, and a possible insert
[06:39] (aha, my first test run on DF crashed with the INSERT conflict that I mentioned)
[06:39] Right
[06:39] as well as the initial select
[06:39] You do the initial select, two updates for each row, and an insert
[06:39] no, a possible insert
[06:39] I'd suggest instead the initial select, some computation, a batch select, a batch update, and a batch insert
[06:39] True
[06:40] hmmm. you really think that would be better?
[06:40] It's going to reduce the query count by a factor of 1000 or more
[06:40] most times the insert will not be required once the table is populated
[06:41] Sure, but the insert is negligible
[06:41] Because it's batched
[06:41] Its elimination does not help performance to any measurable degree
[06:42] so you want to initially select from lpsprc where spr_id in (all the spr ids from the slice)?
[06:42] and see what archive/spn/distroseries records exist
[06:42] Right
[06:42] Otherwise you can't identify which rows need inserting unless you do a thousand separate UPDATEs for the batch
[06:43] Which is obviously substantially more expensive than a single SELECT
[06:43] my thinking was most times all the updates would be required anyway
[06:43] once the table is populated
[06:43] But then you have 1000 queries instead of 1
[06:43] Still
[06:43] i mean, if the updates are required, you still have 1000 queries
[06:44] plus the initial computation as an overhead
[06:44] Not with my technique
[06:45] You can do a batch update using either a few CASE expressions or a literal subselect
[06:45] i'd have to think a bit about that
[06:47] i could insert the records to be updated into a temp table. or are you suggesting a subselect from the original spph table?
[06:50] UPDATE lpsprc SET dateuploaded = sub.dateuploaded FROM (VALUES (some, cache, row, key, that_dateuploaded), (some, other, row, key, other_dateuploaded)) AS sub WHERE lpsprc.some = sub.some AND [...] AND lpsprc.dateuploaded < sub.dateuploaded;
[06:52] ok, thanks, i'll see how that works
[06:54] (it's not a pattern we use anywhere else yet, because we rarely have code adequately architected to be able to benefit from a bulk update)
[06:55] yeah
[06:55] This is a garbo job we should try to get Right™, as it's likely to be used as a model for many more in the future.
[06:56] i haven't used that pattern before elsewhere either
[06:56] And I don't think it's too much effort to get it Right™, apart from getting Storm to cope with that UPDATE syntax
[06:56] But it should be relatively easy to do with some of the helpers in lp.services.database.bulk
[06:57] yeah, not sure if there would be native storm support for it, may need a bit of SQL("..")
[07:03] There isn't native support AFAIK. You'll probably just need to define a VALUES stormexpr near your helper and then do a bit of SQL() with column definitions maybe. But the hideousness can be encapsulated in lp.services.database.bulk, as the insert/load horrors are
[07:05] i'll see what falls out
[07:05] fun fun fun
[07:10] Storm has the ROW constructor nowadays, but the only use of VALUES expressions that I know of is the INSERT special case
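A minimal sketch of that SQL() escape hatch, assuming a Storm store and the lpsprc/spr_id/dateuploaded names from the conversation above, with the key simplified to a single column (the real cache key is composite: archive/spn/distroseries). bulk_update_dateuploaded is a hypothetical helper, not an existing lp.services.database.bulk function, and it assumes the backend's cursor reports rowcount:

    from storm.expr import SQL

    def bulk_update_dateuploaded(store, rows):
        """Batch-update lpsprc.dateuploaded for (spr_id, dateuploaded) pairs.

        Storm has no native UPDATE ... FROM (VALUES ...) support, so the
        statement is built as raw SQL with one placeholder group per row.
        """
        if not rows:
            return 0
        # One "(?, ?)" group per row; all values travel as bound parameters.
        placeholders = ", ".join(["(?, ?)"] * len(rows))
        params = [value for row in rows for value in row]
        statement = SQL(
            "UPDATE lpsprc SET dateuploaded = sub.dateuploaded "
            "FROM (VALUES " + placeholders + ") AS sub (spr_id, dateuploaded) "
            "WHERE lpsprc.spr_id = sub.spr_id "
            "AND lpsprc.dateuploaded < sub.dateuploaded",
            params)
        # rowcount reports how many rows the single UPDATE actually touched.
        return store.execute(statement).rowcount

Because every value is a bound parameter, the whole batch stays one round trip without hand-quoting literals into the VALUES list; the dateuploaded < comparison in the WHERE clause is safe here because row existence is established by the initial SELECT rather than by rowcount.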
[07:11] wallyworld__: btw, I've confirmed on DF that my suggested optimisations for the getPendingUpdates query are indeed of great import.
[07:12] ok, ta. i'll let you know when it's ready to look at again, will be after dinner and a few errands
[07:12] may well need another iteration
[07:13] Yeah
[07:13] But this is looking very good, and I think the correctness concerns I had have been ironed out.
[07:14] and with the bulk update this will be extremely fast, which is good because I was banking on the "consider more rows, but calculate and process them far more quickly" strategy.
[07:15] hopefully this pattern could be applicable to other similar reporting-type scenarios where we do live calculations off a fairly normalised model
[07:19] wallyworld__: Exactly
[07:19] wallyworld__: This effort probably isn't completely worth it just to fix those pages, but it is the first of a new style of caching table that has a huge number of potential uses.
[07:21] This is our first table with eventual consistency and just a few minutes of lag.
[07:22] Everything else we have is either updated by triggers or a huge daily cronjob :)
[07:30] We should destroy nightly.sh for being a terrible idea and making a lot of people very angry
[07:39] In favour of?
[07:40] smaller celery tasks on demand might be nicer
[07:40] afternoonly.sh
[07:40] even if it would mean more overhead
[07:40] manyana.sh
[09:03] good morning
[10:37] wgrant: Can you think of any reason why Archive._copySources (for syncSource/syncSources) couldn't just pass unembargo=True as a direct replacement for allow_delayed_copies=True?
[10:37] It seems too eay
[10:37] *easy
[10:42] cjwatson: As long as it's been fixed to not reupload then it should be fine
[10:42] Yeah, I destroyed the reuploading a while back
[10:43] Yeah, I thought I recalled that
[10:46] I guess I have to destroy delayed copies in two pieces, to avoid a situation where we created a delayed copy just before a rollout and then have no way to publish it
[10:49] Yeah
[10:49] And we might have to keep a tiny amount of code around for the sake of old entries in DONE queues and the like
[11:05] cjwatson: Yeah, but most of the code can die
[11:24] frankban: could you please review this mp: https://code.launchpad.net/~adeuring/launchpad/product-lp-limitedview-3-1/+merge/134075 ?
[11:29] adeuring: sure
[11:29] frankban: thanks!
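For reference, the batched flow suggested earlier (initial select, some computation, a batch update, a batch insert) might hang together roughly as follows. This is again a sketch with hypothetical names, reusing the hypothetical bulk_update_dateuploaded helper from the earlier sketch and the same simplified single-column key:

    def update_cache_batch(store, computed):
        """Apply one batch, where computed maps spr_id -> new dateuploaded."""
        # One SELECT to learn which cache rows already exist for this slice,
        # so updates and inserts can be separated without per-row queries.
        spr_ids = list(computed)
        marks = ", ".join(["?"] * len(spr_ids))
        existing = set(
            row[0] for row in store.execute(
                "SELECT spr_id FROM lpsprc WHERE spr_id IN (" + marks + ")",
                spr_ids).get_all())
        to_update = [(k, v) for k, v in computed.items() if k in existing]
        to_insert = [(k, v) for k, v in computed.items() if k not in existing]
        if to_update:
            # Single UPDATE ... FROM (VALUES ...), as sketched above.
            bulk_update_dateuploaded(store, to_update)
        if to_insert:
            groups = ", ".join(["(?, ?)"] * len(to_insert))
            store.execute(
                "INSERT INTO lpsprc (spr_id, dateuploaded) VALUES " + groups,
                [value for row in to_insert for value in row])

That keeps each batch at three queries regardless of batch size, which is where the factor-of-1000 reduction in query count mentioned above comes from.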
=== frankban changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On Call Reviewer: frankban | Firefighting: - | Critical bugs: ~200
[11:31] adeuring: LimitedView delegates to View by default, so you probably don't need to check for View at all
[11:32] wgrant: ah, right, let me try this
[11:32] The ifs will still be a little ugly, but it's at least slightly less bad
=== yofel_ is now known as yofel
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
=== rick_h changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On Call Reviewer: frankban, rick_h | Firefighting: - | Critical bugs: ~200
[13:12] rick_h: morning
[13:12] morning czajkowski
[13:13] rick_h: I come with a gift :) https://bugs.launchpad.net/launchpad/+bug/1078239
[13:13] <_mup_> Bug #1078239: loading team blueprints leads to timeout Person:+index < https://launchpad.net/bugs/1078239 >
[13:14] czajkowski: ok, https://bugs.launchpad.net/launchpad/+bug/1075365 is in qa mode so we should hopefully have that ready for a deploy in a couple of hours
[13:14] <_mup_> Bug #1075365: Timeout when trying to visit previous sprint pages < https://launchpad.net/bugs/1075365 >
[13:15] and if that doesn't fix it... well then back to the code
[13:15] rick_h: great
[13:16] czajkowski: so a card is added to our board and we'll make sure to follow up.
[14:07] morning all.
[14:09] party
[14:13] morning.
[14:30] hi
=== matsubara is now known as matsubara-lunch
[14:45] sinzui: what does the tag specification mean?
[14:46] The error involves the specification object
[14:55] deryck: here's the test case I'm working on with the matchers https://pastebin.canonical.com/78291/ and the html content appears in line 151 of the test failure output
[14:55] which is the browser.contents
[14:56] I moved that