[00:39] lifeless: you probably figured it out already, but there has to be a 1:1 mapping from values to tokens (and back)
[00:47] benji: yeah
[00:47] I had a query that wasn't DISTINCT
[00:47] easy fixed
[00:48] benji: a better error would list the token and titles that were duplicated
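The 1:1 constraint benji tripped over is easy to state in code. Here is a minimal sketch of the better error lifeless suggests, assuming a hypothetical check_unique_tokens helper fed the (token, title) rows that should have been DISTINCT; the real vocabulary machinery lives elsewhere:

    from collections import defaultdict

    def check_unique_tokens(terms):
        """Raise ValueError naming every token mapped to more than one title.

        `terms` is an iterable of (token, title) pairs, e.g. the rows
        returned by the vocabulary query.
        """
        seen = defaultdict(list)
        for token, title in terms:
            seen[token].append(title)
        duplicates = [(token, titles) for token, titles in sorted(seen.items())
                      if len(titles) > 1]
        if duplicates:
            details = '; '.join(
                '%s -> %s' % (token, ', '.join(titles))
                for token, titles in duplicates)
            raise ValueError('Duplicate vocabulary tokens: %s' % details)

    # A query missing DISTINCT can yield the same row twice:
    # check_unique_tokens([('fw', 'firefox'), ('fw', 'firefox')]) raises
    # ValueError: Duplicate vocabulary tokens: fw -> firefox, firefox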
[02:11] wgrant: if you're interested
[02:11] 54 SELECT DISTINCT DistributionSourcePackageCache.archive, DistributionSourcePackageCache.binpkgdesc ... DistributionSourcePackageCache.archive IN (%s, %s) AND BinaryPackageName.name = %s ORDER BY name:
[02:11] GET: 54 Robots: 2 Local: 40
[02:11] 54 https://launchpad.net/ubuntu/+search (Distribution:+search)
[02:11] OOPS-1704C1791, OOPS-1704C1793, OOPS-1704C554, OOPS-1704C773, OOPS-1704C775
[02:11] wgrant: e.g. highest oopser yesterday
[02:30] thumper, I just got your whinge email about lazr-js. What's up?
[02:30] * rockstar has no problems
[02:30] rockstar: no worky
[02:30] thumper, did you do a make?
[02:30] aye
[02:30] I pulled trunk
[02:30] went make
[02:30] said I had bad versions
[02:30] went to download-cache
[02:30] did bzr up
[02:30] went back
[02:31] did make
[02:31] said invalid version for setuptools
[02:31] edited version.cfg
[02:31] thumper, what about download-cache 'bzr up'
[02:31] did make
[02:31] worked
[02:31] but examples don't work
[02:31] thumper, do you get any javascript errors?
[02:32] not obviously
[02:32] CSS seems broken
[02:32] thumper, probably because your make didn't actually work.
[03:03] thumper, could you file a bug with the output of make and the like?
[03:03] thumper, I'm concerned that my 'make' works and yours doesn't.
[03:03] rockstar: bit busy right now trying to get ian connected to my network again
[03:04] thumper, no hurry. I'm EOD'd, but I'd like to make it work for you tomorrow if I can.
[03:04] Might be nice for wallyworld to be able to see the demos...
=== Ursinha-afk is now known as Ursinha
[04:13] anyone seen this ?
[04:13] File "/home/robertc/launchpad/lp-branches/working/lib/canonical/lazr/pidfile.py", line 83, in get_pid
[04:13] raise ValueError("Invalid PID %s" % repr(pid))
[04:13] ValueError: Invalid PID ''
[04:16] nvm, found the files
=== Ursinha is now known as Ursinha-afk
[08:23] noodles775: It looks to me like the distroseriesdifference model doesn't allow for multiple parent series.
[08:24] That seems like a pretty critical feature.
[08:24] Is it in the spec?
[08:24] wgrant: pretty sure its not
[08:24] Hm.
[08:24] Odd.
[08:24] How's that going to work, then?
[08:25] Surely Linaro's N release is going to want to inherit from Linaro M, and then merge from Ubuntu N.
[08:25] I suspect it will be done thusly:
[08:25] L-N clones from L-M, with differences, and then inherits U-N
[08:27] wgrant: I've not heard of it before it was mentioned in that email, but I wasn't involved in the initial discussions.
[08:28] Ah, maybe I should read this thread, then.
[08:28] * wgrant looks.
[08:28] hrm, maybe it was on https://dev.launchpad.net/LEP/DerivativeDistributions/UserTestingRound2 ...
[08:29] * noodles775 checks
[08:29] wgrant: yes, it was mentioned by Loic (see Mock-up 2 for Loic on the above page).
[08:30] And cjwatson as well.
[08:30] I should probably read the whole thing.
[08:30] Yep.
[08:31] Although Loïc refers to a parent distribution, not series.
[09:00] good morning
[09:30] Hmm.
[09:30] wgrant?
[09:30] A friend just ran into a bit of a strange PPA key generator case.
[09:31] He created his first PPA a while ago, but never uploaded anything.
[09:31] Then created a new one just now, and uploaded something to it.
[09:31] Now, of course it had no key preset... so it's now going to get one of its own generated.
[09:43] How often does the key generator run?
[09:45] wgrant: currently every 20mins
[09:45] Yeah, it just appeared now.
[09:45] Thanks.
[09:49] any reason that can't run on-demand ?
[09:49] * noodles775 remembers cprov requesting hardware entropy.
[09:50] But not sure.
[09:50] well, on-demand would have no higher or lower demand on entropy
[09:50] on average (and if there is enough to backlog after 20 minutes, there is enough to backlog for 20 minutes)
[09:51] I thought it already did have a hardware RNG.
[09:52] It's particularly bad, since the PPA will be published unsigned.
[09:52] Then you have to upload something again to get it signed.
[09:59] lifeless: But yes, the Referer check is stupid.
[09:59] I proposed it as a quick solution.
[09:59] I hoped that the user backlash would be enough to convince Foundations to fix it properly.
[09:59] But apparently not.
[10:08] thumper: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1704EB1743
[10:09] So... user pages have been broken for five days now.
[10:10] Or 6. One of them.
[10:11] losa ping
=== jtv is now known as jtv-afk
[10:40] losa ping?
[10:41] lifeless: sorry, missed the last one
[10:41] lifeless: here now... :)
[10:41] thats ok :)
[10:41] I didn't want to interrupt steve if he'd EOD'd.
[10:41] anyhow
[10:41] two things
[10:41] there is a CP for the home page maps problem in LP
[10:41] this one actually has the code
[10:42] secondly, staging - has it successfully restored yet ?
[10:42] sweet
[10:42] I'm a little worried about staging, monday PQM freezes and folk will be QAing like mad... we hope.
[10:43] hmm, there was a stale lockfile from the last broken restore, I've removed it and it'll kick off very soon
[10:45] starting now...
[10:49] mthaddon: thanks; can you please let uhm, someone, know about the CP ?
[10:50] lifeless: I'm doing it now
[10:50] perhaps get a dev to tickle launchpadstatus on identi.ca or something if it works
[10:50] mthaddon: cool cool, not rushing you
[10:50] handing-off, I have to sleep, early wakeup tomorrow
[10:50] it's all scripted, so there's no rushing to do :)
[10:50] \o/
[11:42] (I fail at sleep)
[11:42] SpamapS: https://lp-oops.canonical.com/oops.py/?oopsid=1704S20 btw
[11:42] thats your slow page
[11:43] for some reason its counting every open bug.
[11:43] clearly daft.
[11:46] SpamapS: https://bugs.edge.launchpad.net/malone/+bug/627974
[11:46] <_mup_> Bug #627974: MaloneApplication:CollectionResource is slow/times out
[12:06] Morning, all.
=== Ursinha-afk is now known as Ursinha
=== matsubara-afk is now known as matsubara
=== Ursinha is now known as Ursinha-afk
=== jtv1 is now known as jtv
=== Ursinha-afk is now known as Ursinha
[13:52] morning, all.
[13:53] quick question for anyone who might have an answer: is it unexpected that a test should fail on EC2 but work just fine when you run that test locally?
[14:01] jcsackett, yes, but you might try to run that test by itself on EC2, to find out whether it's a test-fails-on-EC2 problem or a test-fails-when-running-full-testsuite problem
[14:02] if you're brave enough to run the full testsuite locally and the test passes, then it's likely to be something related to EC2 or one of those intermittent failures we have once in a while
[14:09] um, anyone around I could talk to about the map issue? Do you folks think its because of the sensor parameter?
[14:09] It would be nice to have a way to check if its that.
[14:10] wgrant: you around?
[14:14] nigelb, thanks! :)
[14:15] benji: Hi.
[14:15] Ursinha: :)
[14:16] wgrant: hey; I hear that you have a launchpadlib script that I should take a look at
[14:17] benji: It mostly works (albeit very slowly) when running on a lazr.restfulclient with retry support.
[14:17] But http://qa.ubuntuwire.org/ftbfs/source/ has the code.
[14:20] jcsackett, instead of running the entire suite locally you may want to try just running the part of the suite that contains the failing test. If lp.registry.openid.foo failed, try running all the tests in lp.registry.openid
[14:21] jcsackett, it is a fast-running way to see that you did not encounter a test isolation failure
[14:22] jcsackett, while you are doing that you can also fire off an EC2 run or two to see if the failure is intermittent: ec2 test -o '-vv lp.registry.foo.bar'
[14:24] mars: i've already run the single test locally; it passed.
[14:25] i'm about to fire off EC2
[14:26] mars: i believe it may be an isolation issue; if i run the entire bugs set of tests the one in question fails.
[14:26] cool
[14:27] jcsackett, if it fails locally when running that one suite then I wouldn't bother re-running on EC2
[14:27] mars, good to know.
[14:28] thanks, mars.
[14:50] wgrant: we'd like for you to try out the new lazr.restfulclient (0.10.0) and launchpadlib (1.6.5) and the devel version of the web service; using those versions will be much faster for you and be easier on the servers
[14:50] benji: Is there a PPA for those two?
[14:50] mmm, I don't think so; let me see
=== Ursinha is now known as Ursinha-afk
[14:51] wgrant: sorry, there isn't
[14:58] benji: OK. It's about to hit midnight, so I'll look at installing those tomorrow.
[14:58] benji: The script shouldn't need any changes?
[14:59] wgrant: it shouldn't
[15:02] benji: Great. Thanks.
[15:04] Hurrah for tests that should never have passed and yet somehow did.
[15:11] lifeless: :) thanks, I've subscribed to that bug report. :)
[15:21] rockstar, ping, have a moment for a question about the bzr pipeline mail you sent to the list last week?
[15:46] Is anyone here an ace with zope.component?
[15:47] allenap: what's up?
[15:48] benji: Hi. I'm trying to land lp:~allenap/launchpad/cache-experiment-roll-out, but I keep getting adaption errors when it runs in ec2. I can't replicate them locally.
[15:49] benji: The adapters are registered normally with some zcml, but I've also registered them with the global site manager at import time.
[15:50] allenap: can you point me at the test results?
[15:51] benji: http://pastebin.ubuntu.com/486797/
[16:02] allenap: I have a suspicion, but it's taking forever for bzr to download the branch so I can confirm the suspicion
[16:04] benji: Shall I send you a bundle?
[16:06] allenap: a diff would be good
[16:07] benji: http://paste.ubuntu.com/486805/
[16:10] gary_poster: can I ask you again for help about this failure: https://pastebin.canonical.com/36340/ for this diff https://pastebin.canonical.com/36336/ ?
[16:10] * gary_poster whimpers
[16:11] trying to help noodles with an immediate problem while trying to get a longer-term fix done in background cycles
[16:11] allenap: I /suspect/ that the IPropertyCache adapter (get_default_cache perhaps) is returning None, which means "I couldn't adapt"
[16:11] but will look adeuring
[16:11] gary_poster: thanks!
[16:11] of course, I don't know why that would happen in EC2 and not in dev
[16:12] benji: Yeah, it's very odd.
[16:13] adeuring: help me page back in Robert's comments, please--what did he say?
[16:14] you might add an assertion to get_default_cache that the return value is non-None and run just one of the failing tests in EC2 to see if my suspicion is correct
[16:14] gary_poster: that we should try a completely different approach for the current issue, i.e., completely avoid this code, for unrelated reasons.
[16:15] gary_poster: problem is: his approach -- allow access to the restricted librarian for certain DC machines -- is refused by the admins
[16:15] due to security concerns
[16:17] benji: Okay, I'll give that a go.
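For readers unfamiliar with the zope.component behaviour benji is leaning on: an adapter factory that returns None is treated as "could not adapt", so the caller sees an adaptation failure even though the factory ran. Below is a minimal sketch of the assertion he suggests; the names IPropertyCache and get_default_cache come from the branch under discussion, but the bodies are invented for illustration:

    from zope.component import getGlobalSiteManager
    from zope.interface import Interface

    class IPropertyCache(Interface):
        """Marker interface for a per-object property cache."""

    def get_default_cache(obj):
        cache = getattr(obj, '_property_cache', None)
        # The suggested diagnostic: fail loudly here, instead of returning
        # None and letting zope.component turn it into a bare "could not
        # adapt" error with no hint about the object involved.
        assert cache is not None, (
            'get_default_cache returned None for %r' % (obj,))
        return cache

    # Registering for Interface means the adapter applies to any object;
    # IPropertyCache(some_object) now fails with the object's repr rather
    # than an opaque TypeError.
    getGlobalSiteManager().registerAdapter(
        get_default_cache, required=(Interface,), provided=IPropertyCache)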
[16:17] gary_poster: so, to get a fix for bug 620458, i am back to the implementation as sketched by francis
[16:17] <_mup_> Bug #620458: cannot access attachments of private bugs any more
[16:17] adeuring: oh, LOSAs or James or somebody weighed in and said no?
[16:18] gary_poster: it was james
[16:18] ok
[16:18] then here we are, yes, got it. :-(
[16:18] Robert's concerns as to load were very valid
[16:18] gary_poster: sure...
[16:18] but we don't have another answer now, I know
[16:19] right.
=== matsubara is now known as matsubara-lunch
[16:19] FWIW, adeuring...
[16:20] Zope/WSGI allows us to pass file objects back to serve
[16:20] sounds interesting
[16:20] not immediately useful, maybe...but yeah
[16:20] if we had the files to serve
[16:21] then Zope could just give the open files back as a response (there's probably some incantation)
[16:21] gary_poster: well, the current implementation even creates a temp file
[16:21] and then the thread would be back open for business.
[16:21] right, but it blocks on downloading the temp file, right?
[16:22] well s/right/good/ :-)
[16:22] eh
[16:22] gary_poster: yes. But: We have a file-like object there
[16:22] I meant
[16:22] "good, but it blocks on downloading the temp file, right?"
[16:22] fwiw
[16:22] ok
[16:23] so adeuring, just so I understand, this is an oauth sort of problem, right?
[16:23] gary_poster: do you mean my test failures or the whole issue?
[16:24] adeuring, the whole issue
[16:24] gary_poster: yes
[16:24] ok
[16:24] your test failures, AIUI, are about the fact that something somewhere in the vicinity of canonical_url does not perform as we expect
[16:24] gary_poster: exactly
[16:26] adeuring: does your code work correctly in the real world?
[16:27] if you know what I mean?
[16:27] gary_poster: well, it works on launchpad.dev
[16:27] right, ok
[16:32] adeuring: There, of course, are tons of pieces of context that are not set up, between what you have here in the test, and what you have in live usage. If this were my problem, I would put a pdb in getUrl, and walk through it once when it worked (launchpad.dev) and once when it didn't (tests). If you want, give me the branch name, and I'll try to do that comparison for you.
[16:33] (To be clear, I'm saying I don't know what context you need. Sorry. :-/ )
[16:33] gary_poster: ok, just a second
[16:36] gary_poster: lp:~adeuring/launchpad/bug-620458-private-bugattachments-api-access
[16:40] on it, adeuring. What are the instructions for testing live?
=== salgado is now known as salgado-lunch
[16:42] gary_poster: (1) add an attachment to a bug, (2) make that bug private, (3) use the sample code from pitti in bug 620458
[16:43] <_mup_> Bug #620458: cannot access attachments of private bugs any more
[16:44] gary_poster: in short: lp.bugs[1].attachments[0].open()
[16:44] k
=== Ursinha-afk is now known as Ursinha
[17:11] gary_poster: r375 of storm trunk has the fix we discussed
[17:12] lifeless, awesome, thanks for the update. May make a temporary release then, when I get to it.
[17:12] gary_poster: It should be cherrypickable; and the workaround that is in launchpad/lazr for the sqlobject variant should be backed out
[17:12] agreed
[17:13] I suspect a cherrypick on 0.17 will work better than a snapshot of trunk; there are two bugs in 0.17; one is fixed in trunk, but it exposes the other... the first one results in more queries, the second in an internal error.
[17:13] you're welcome to try however :P just hoping to save you some time.
=== benji is now known as benji-lunch
[17:14] leonardr: ^
[17:16] ack lifeless, thanks
=== beuno is now known as beuno-lunch
[17:38] is the entire change history of a branch_merge_proposal? like all status changes and such (not votes/comments)
=== matsubara-lunch is now known as matsubara
=== salgado-lunch is now known as salgado
=== benji-lunch is now known as benji
=== deryck is now known as deryck[lunch]
=== gary_poster_ is now known as gary_poster
[18:41] lifeless: if you had to choose between running zope or django for the standalone instance of the results tracker, what would you do?
=== beuno-lunch is now known as beuno
=== deryck[lunch] is now known as deryck
[19:03] leonardr: ping
[19:05] or anyone else i guess
[19:05] should launchpadlib throw HTTPError when it gets a 301 redirect for an api call?
[19:06] dobey: no. what are you doing? does launchpad.me work?
[19:07] leonardr: getSeries() on a distribution is giving me a redirect and HTTPError
[19:08] http://pastebin.ubuntu.com/486892/
[19:08] dobey: do you know if this is a recent change, or did you just start using getSeries?
[19:09] i just started using it. i'm trying to get a valid series link so i can pass it in to the source_package_recipe.requestBuild() method, since it doesn't accept u'maverick' as valid
[19:10] dobey: give me the code you use to invoke getSeries so i can see if the problem has been fixed in a more recent version
[19:11] leonardr: do you need the whole script, or just the launchpad bits after authentication?
[19:12] dobey: the latter
[19:13] leonardr: http://pastebin.ubuntu.com/486896/
[19:14] adeuring: I believe the biggest difference between live QA and automated testing is that the WebserviceTestRequest has different behavior in getRootURL than the real webservice request. The real request has the behavior in line 1332 of c/l/w/servers.py, while the test request does not include that method for some reason.
[19:14] http://pastebin.ubuntu.com/486893/ gets closer (sorry, it includes your changes because I did this in a not very smart way). See lib/canonical/launchpad/webapp/servers.py for the most important change, and line 135. Can I toss this to you for a bit more digging on your side?
[19:14] dobey: can you tell me the value of self.series?
[19:14] 'maverick'
[19:15] dobey: ok, i can reproduce it. will you file a bug against lazr.restfulclient? it might be a problem in httplib2 but we'll start there
[19:16] leonardr: does distribution.series[u'maverick'] give the (ideal) same result?
[19:16] dobey: well, [x for x in distribution.series if x.name=='maverick'] will work
[19:17] but the [] operator isn't supported on arbitrary collections right now
[19:17] ok
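Putting leonardr's workaround together with dobey's goal, something like the following sketch should sidestep the redirect bug; the application name and login details are placeholders, and requestBuild's exact signature is not shown in the channel:

    from launchpadlib.launchpad import Launchpad

    launchpad = Launchpad.login_with('example-app', 'production')
    ubuntu = launchpad.distributions['ubuntu']

    def get_series(distribution, name):
        """Find a distro series by name without calling getSeries()."""
        matches = [series for series in distribution.series
                   if series.name == name]
        if not matches:
            raise KeyError('no series named %r' % name)
        return matches[0]

    # The resulting entry is a valid series link to hand to requestBuild(),
    # which rejects the bare string u'maverick'.
    maverick = get_series(ubuntu, 'maverick')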
[19:32] bryceh: don't put the [] stuff in the launchpad commit message
[19:32] bryceh: ec2land does it for you
[19:39] lifeless, why not? does it hurt anything?
[19:39] lifeless, from WorkingWithDbDevel it seems to suggest including it in the commit message
[19:39] leonardr: bug #628267
[19:39] <_mup_> Bug #628267: distribution.getSeries() raises HTTPError on redirect
[19:40] lifeless: did you notice my question above? personally, I'm leaning towards django, not by preference but to keep things lean for now
[19:43] bryceh: yes, it hurts
[19:43] bryceh: you'll have them twice
[19:43] bryceh: they have to be in what pqm receives, which is -not- what goes in the lp commit message; its what ec2 land outputs for you
[19:44] cr3: uhm, I don't know. Maybe neither?
[19:44] lifeless: what did you have in mind?
[19:44] lifeless, sheesh
[19:44] cr3: trolling you successfully
[19:45] bryceh: please do improve the wiki page to prevent other folk having the same misunderstanding
[19:45] cr3: if you feel like being experimental as well as lean, you might like bobo: http://bobo.digicool.com/
[19:46] rockstar: ^ also tarmac will need to know the logic ec2land uses to do commit messages/approval recording etc; presumably via a plugin (and as folk set this from the commandline, I guess we need extra data when the branch is set to 'Queued')
[19:46] cr3: our best of breed set is (django, zope);
[19:47] cr3: frankly I think there is a bit of a hole in the middle; django gives too little support, zope ---too much---
[19:47] lifeless, maybe I'm missing context there, but that sentence didn't make a lot of sense to me.
[19:48] rockstar: to use tarmac to land launchpad branches, preserving the [foo=bar] in the commit messages which is used for QA workflow, we'll need to make sure that the 'ec2 land' process still handshakes properly with the thing doing the commits.
[19:48] lifeless: agreed, I'll work towards that
[19:48] cr3: either django or zope would make sense to me here I guess
[19:49] lifeless, oh, yeah. I kinda see what you're saying. I suspect Tarmac does most of that already (you can get it to format your commit message for you)
[19:49] rockstar: the tricky bit will be things like ec2 land --incr
[19:49] rockstar: and --no-qa
[19:49] rockstar: we may need to change how ec2 land does it for pqm, in advance, if we want to make migration easy
[19:50] lifeless: a combination of both works too, manage.py here, storm there and interfaces everywhere. something like that
[19:51] lifeless: what does ec2land do for approval recording?
[19:51] lifeless, hm. I think we just teach ec2 how to set a commit message then.
[19:52] dobey: generates the [r=foo] prefixes launchpad uses by convention (and that PQM checks against a regex)
[19:52] rockstar: something :)
[19:52] rockstar: just alerting you is all
[19:52] lifeless, when we move to Tarmac, we won't want to be setting "Approved" on the branches, and deferring that instead to having ec2 do it on a successful test run.
[19:53] rockstar: I can't quite parse the nested clauses there.
[19:53] rockstar: can you rephrase?
[19:53] lifeless: ah. so, tarmac automatically harvests the list of reviewers and shoves them in a revision property on the new revision, called 'reviewers' :)
[19:54] Tarmac is pretty dumb (by design). We could have it grow some brains, but I think having our tools know how Tarmac likes it would be better.
[19:54] dobey, yeah, kinda like what we already do with the commit message formatter. :)
[19:54] rockstar: pqm is too :P
[19:54] does it only generate [r=xxx] or does it also include the [bug=nnn] and [ui=none] stuff?
[19:54] bryceh: it generates it all
[19:54] * bryceh grumbles about crufty wiki pages
[19:54] bryceh, allenap was the last to touch the Tarmac commit message formatter, but I think it'll do bug= as well.
[19:55] this is like the 3rd thing WorkingWithDbDevel has misled
[19:55] er, me on
[19:55] lets try my oops branch again
[19:56] I've fixed WorkingWithReviews - it was mentioning that you shouldn't use [r=xx] but only in a footnotey way
[19:57] well, we've moved away from the [r=] bit in Ubuntu One stuff, for branches we're landing with tarmac, since it stores the metadata in revision properties
[19:57] probably we want a dedicated page for landing-stuff
[19:58] dobey: sure; the main thing is toolchain friction - need to update a bunch of analysis tools, and in the past folk have said they really like it being visible in trunk with just 'bzr log'
[19:59] yeah, the bugs are currently the only thing visible there
[19:59] but we could (should) fix bzr to make it much easier to see additional revision properties
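To make the convention under discussion concrete: 'ec2 land' prepends tags such as [r=...][ui=...][bug=...] to the message PQM receives. The formatter and regex below are approximations for illustration, not the actual ec2 or PQM code:

    import re

    def format_commit_message(message, reviewers, bugs=(), ui='none'):
        """Build an ec2-land style tag prefix for a PQM commit message."""
        tags = '[r=%s]' % ','.join(reviewers)
        tags += '[ui=%s]' % ui
        if bugs:
            tags += '[bug=%s]' % ','.join(str(bug) for bug in bugs)
        return '%s %s' % (tags, message)

    # An approximation of the sort of regex PQM checks against.
    TAG_PATTERN = re.compile(r'^\[r=[^]]+\]\[ui=[^]]+\](\[bug=[^]]+\])? ')

    message = format_commit_message(
        'Speed up the bug heat lookup.', ['wgrant'], bugs=[615644])
    assert TAG_PATTERN.match(message)
    # '[r=wgrant][ui=none][bug=615644] Speed up the bug heat lookup.'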
[20:02] gary_poster: thanks!
[20:03] bryceh: ah so the key thing is 'do not use -s'
[20:03] I'll make that edit
[20:04] bryceh: also you don't need to add options, you just target your branch to db-devel and it all Just Works.
[20:05] lifeless, yeah you'll notice I corrected that already
[20:05] bryceh: I just used bigger scissors
[20:05] as a db-devel newbie I found that page full of snakes
[20:06] adeuring1: welcome. :-) please feel free to toss it back after you've tackled it a bit more.
[20:06] I can imagine
[20:06] bryceh: anyhow, I've just deleted all the stale info around the ectest/land bits
[20:07] er actually DatabaseSchemaChangesProcess is the one full of snakes
[20:07] the wiki is full of schnaaaaake
[20:07] describes cases that are actually special cases the ordinary newbie doesn't actually need to care about
[20:08] e.g. all the emphasis about doing sampledata updates, security.cfg changes, etc.
[20:09] lifeless, yeah that looks better
[20:11] bryceh: partly this is cruft we've accumulated/tolerated
[20:11] bryceh: you do need to do security.py very often
[20:11] fwiw
[20:12] lifeless, in my case I did it but it wasn't needed. The problem with the docs is it is vague about indicating *when* it should/shouldn't be done (perhaps a link to an appropriate doc about what the file is/does?)
[20:12] good idea :)
[20:12] the README in the schema is probably appropriate
[20:12] its visible on bazaar.lp.net/...
[20:19] bac: is the review meeting today?
[20:19] bac: if so, how many hours away
[20:20] 4:40
[20:20] thanks
[20:20] 0Z
[20:20] is https://staging.launchpad.net/ supposed to say 'Code Update In Progress' right now?
[20:21] yes
[20:21] usually does about this time
[20:21] however, unless a losa restarted it, its bust
[20:21] losa ping
[20:22] hello lifeless
[20:24] hi mbarnett
[20:24] so staging failed overnight
[20:24] (my overnight)
[20:24] slony again
[20:24] stub isn't around for 6-7 hours
[20:24] lifeless: i am pretty sure tom kicked it off again a couple hours ago
[20:25] let me check
[20:25] staging is down right now; do we have any way to bring it up or kill-db-clients-harder and start it again.
[20:25] I saw chatter in lp-code 2 hours back
[20:33] lifeless: hmm, it looks like a db error that we are actually waiting on stub for.
[20:33] mbarnett: the slony timeout
[20:34] mbarnett: thats what I said above, isn't it ?
[20:34] or is it something different?
[20:35] the slony timeout seems to be the current offender
[20:37] righto
[20:37] My understanding, totally limited, is that connected clients break slony.
[20:37] that *looks* to my limited understanding to be what's happening.
[20:42] lifeless, hi. Some timeout work questions, if you don't mind.
[20:42] shoot
[20:42] I'm all eyes
[20:43] so I see 2-3 bugs that all seem to come back to adding notifications. and I'm trying small steps to improve that, then breaking it out....
[20:43] trying to just get rid of a flush to start with :-) See: http://pastebin.ubuntu.com/486931/
[20:43] looking
[20:43] now obviously, this has the downside of doing several inserts, compared to the original way
[20:43] if its not in an inner loop it shouldn't be an issue
[20:44] ah, so that flush I removed is probably not an issue then?
[20:44] I don't think it will be
[20:44] ah, ok
[20:44] flushes have the following overhead:
[20:44] - a cache walk to detect dirty objects
[20:45] - DB update/insert calls per modified object to clean them up
[20:45] a flush with no modified objects then has a cache-walk-only overhead (costly, but not fatal)
[20:45] a flush with many modified objects pending behind it is costly, but not because its a flush, because there is a lot of work to do.
[20:46] note that storm, when you do a select, will often flush automatically anyway.
[20:46] ok, that makes sense and is useful to understand.
[20:46] the previous single INSERT INTO is probably maximally efficient.
[20:46] right
[20:46] whats the bug report that got you digging into this code ?
[20:47] and that was my other question, is there a storm way to get that single insert? i.e. without manually building the sql statement?
[20:47] lets have a quick poke at expr.py
[20:47] (storm/expr.py)
[20:48] see the class Insert
[20:48] the docstring on it is fairly clear to me:
[20:48] ok, cool. it wasn't to me.
[20:48] hmm, Insert isn't good enough
[20:48] looking more
[20:50] ah yes it is, it uses itervalues() which confuses me
[20:50] I wonder how it guarantees iteration order
[20:50] deryck: https://bugs.edge.launchpad.net/storm/+bug/411556
[20:50] <_mup_> Bug #411556: Storm should support multi-row inserts
[20:51] ah, looking at the bug....
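While that Storm bug is open, the single-statement insert can still be had by dropping to SQL through the store, as a sketch. The table and column names are invented for illustration; Storm substitutes '?' placeholders in raw statements passed to store.execute():

    def bulk_insert_notifications(store, rows):
        """Insert (bug, message, is_comment) rows with one INSERT.

        One statement avoids both the per-row round trips and the implicit
        flush cost (cache walk plus per-object writes) described above.
        `store` is a storm.store.Store.
        """
        if not rows:
            return
        values = ', '.join(['(?, ?, ?)'] * len(rows))
        params = [value for row in rows for value in row]
        store.execute(
            'INSERT INTO BugNotification (bug, message, is_comment) '
            'VALUES %s' % values, params)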
=== Ursinha is now known as Ursinha-afk
[20:52] deryck: so what bug/oopses are you looking at ?
[20:52] let me get numbers, I started with 2 or 3 and just poked and ended up where I was at.
=== jtv is now known as jtv-zzz
[20:54] deryck: yeah, I don't doubt that we have a few key inflection points for many issues
[20:54] bug 618403 and bug 611115 and something else that led me here... maybe the one about addComment
[20:54] <_mup_> Bug #618403: BugTask:+editstatus-page timing out in ~4% of requests
[20:54] deryck: I'm just curious ;)
[20:54] <_mup_> Bug #611115: timeout: bug notifications are calculated in-request
[20:54] hah, I recognise the chap that filed those.
[20:54] terrible fellow
[20:55] heh
[20:55] lets see if we can get an oops for the first one
[20:55] yeah, todays edge OOPS report had that one
[20:55] did it, looking
[20:56] trying to get on myself.... slow browser
[20:56] https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1704ED939
[20:57] so
[20:57] I think this is fixed
[20:57] ah no
[20:57] I think the actual timeout on max(bug.heat) is misleading.
[20:58] just before is the call to Bug.addChange, which many of these I looked at had in common
[20:58] sec
[20:59] just collating
[20:59] ok
[21:00] so I've put my normal stuff-from-an-oops in the bug
[21:00] what I've done is:
[21:00] - linked the url
[21:00] - the query parameters that are needed to make sense of whats happening (if not private)
[21:00] - the broad times
[21:00] and the sql queries that really stand out
[21:01] note the *300* email lookups - they will be hurting more than 1.6 seconds; I'd estimate 2 or more seconds
[21:01] as for the 8 second heat lookup, lets see how it does on staging quickly
[21:01] wheeee its slow
[21:01] still waiting
[21:02] Time: 14447.014 ms
[21:02] yeah, i guess it is the max heat select. I could have sworn that in the ones I looked at today it was misleading.
[21:02] deryck: I don't think the 8 second heat lookup is misleading at all :)
[21:02] right :-)
[21:02] now, its faster the second time, but I could swear we had this the other day
[21:03] lpmain_staging=> select bug.heat from bug, bugtask where bugtask.bug = bug.id and bugtask.distribution = 1 order by bug.heat desc limit 1;
[21:03] heat
[21:03] -------
[21:03] 11062
[21:03] (1 row)
[21:03] Time: 58.710 ms
[21:03] that should fix it nicely.
[21:04] nice
[21:04] easy peasy then :-)
[21:04] should be yes
[21:04] except it makes me nervous.
[21:05] how so ?
[21:05] I remember us moving to limit 1 before for speed, but then someone changed back to max() saying it was faster.
[21:05] so now limit 1 is suddenly faster again?
[21:05] so this is db query plan hell
[21:05] yup
[21:05] lets check whether this exact query went through that transition
[21:05] annotate time!
[21:06] heh, blame FTW!
[21:07] bugtarget.py seems to be the file
[21:08] yes
[21:08] 7675.778.1 is relevant
[21:08] 10580 perhaps
[21:09] and 10428
[21:09] running with -p now
[21:10] yeah, 06-25 it was switched
[21:10] r=me in fact
[21:10] bug 615644
[21:10] <_mup_> Bug #615644: BugTask:+distrotask timeout on HEAT lookup
[21:11] interestingly
[21:11] the patch didn't completely change it one way or the other
[21:11] right, I thought I recalled this.
[21:11] it left projectgroups alone, or they were the current way already or something
[21:11] yeah, it was inconsistent in the first iteration.
[21:12] I imagine oversight, not design :-)
[21:12] naturally
[21:12] if we used a query builder for this
[21:12] rather than complete literals, it would mean only the builder needs changing
[21:12] rather than multiple similar clauses
[21:13] ok
[21:13] so I've looked in the whole history
[21:13] yes, this has flip flopped
[21:13] lets look at that old bug
[21:13] and see what context it was searching on
[21:13] it may be that distros want one, and products another
[21:14] ok, so it was a different case stub analysed
[21:14] and it still runs fast
[21:15] still runs fast meaning the limit runs as fast for the case stub tried as for the case we're at now?
[21:15] rather than the max he suggested?
[21:16] the current code runs fast for the case that bug 615644 examined
[21:16] <_mup_> Bug #615644: BugTask:+distrotask timeout on HEAT lookup
[21:16] gmb, quite reasonably, changed the other similar queries at the same time.
[21:16] sadly it looks like the different queries go through the planner differently
[21:17] here:
[21:17] lpmain_staging=> SELECT max(Bug.heat) FROM Bug, BugTask, DistroSeries
[21:17] lpmain_staging-> WHERE BugTask.bug = Bug.id
[21:17] lpmain_staging-> AND BugTask.distroseries = DistroSeries.id
[21:17] lpmain_staging-> AND DistroSeries.distribution=3;
[21:17] max
[21:17] -----
[21:17] (1 row)
[21:17] Time: 58.259 ms
[21:17] this is likely related to how many bugs are on distribution 3
[21:18] 3 is debian
[21:18] 1 is ubuntu
[21:18] OOPS-1705EB2156
[21:18] hmm... I thought there was a bot that would give a link if I pasted an oops into the channel?
[21:18] #lp
[21:19] deryck: so there are two variations here:
[21:19] - different query with distro series
[21:19] - different distro
[21:19] lets try changing 1 to 3
[21:19] yeah, I did. longer but not horrible. ~200 ms
[21:19] well, not horrible on the order of 8000 :-)
[21:20] and the limit approach there is 10ms
[21:20] lets try the limit approach for the query from the 06 bug
[21:20] bingo thats slow
[21:21] - no rows
[21:21] lifeless, which query is this?
[21:22] I'm trying permutations
[21:22] I wonder if adding the condition heat not null makes it faster?
[21:22] we have: MAX or LIMIT
[21:22] we have distro 1 and distro 3
[21:22] we have with distroseries and without
[21:22] Does anyone know why calling setOwner on a branch via web API OOPses even though it's successful?
[21:22] cody-somerville: web API ?
[21:23] Module canonical.launchpad.webapp.publisher, line 691, in _handle_next_object
[21:23] raise NotFound(self.context, name)
[21:23] cody-somerville: please file a bug
[21:23] NotFound: Object: , name: u'omsk'
[21:23] deryck: no change for me adding not null there
[21:24] with the MAX case
[21:24] lifeless, I meant for the limit case, sorry.
[21:24] no
[21:24] 5 seconds
[21:25] hmmm
[21:26] I'm going to fill in this table:
[21:26] MAX/LIMIT 1/3 D/DS
[21:26] if you're on staging, and wanted to do all the LIMIT ones
[21:26] I'll do all the MAX ones
[21:27] MAX/LIMIT 1/3 D/DS TIME
[21:27] lifeless, Filed bug #628352
[21:27] ah.... checking that distroseries is not null seems to speed up the original offending query using limit 1.
[21:27] <_mup_> Bug #628352: Calling setOwner on a branch via web API OOPses even though operation is successful
[21:28] MAX 1 D 2000ms
[21:28] MAX 3 D 150ms
[21:28] MAX 1 DS 196ms
[21:28] MAX 3 DS 2ms
[21:29] cody-somerville: I don't know what a 'web API' is
[21:29] deryck: adding that in
[21:29] lifeless, the distroseries-not-null condition?
[21:30] lifeless, http://en.wikipedia.org/wiki/Web_API
[21:30] deryck: can you paste the query, so I get it right
[21:30] cody-somerville: ugh
[21:31] cody-somerville: please say 'LP API' and you will avoid all confusion
[21:31] or just "API"
[21:31] at least round these parts
[21:31] confirming query, lifeless ...
[21:33] lifeless, slow first, fast second:
[21:33] SELECT Bug.heat FROM Bug, Bugtask, DistroSeries
[21:33] WHERE Bugtask.bug = Bug.id AND
[21:33] Bugtask.distroseries = DistroSeries.id AND
[21:33] DistroSeries.distribution = 3 ORDER BY Bug.heat
[21:33] DESC LIMIT 1;
[21:33] SELECT Bug.heat FROM Bug, Bugtask, DistroSeries
[21:33] WHERE Bugtask.bug = Bug.id AND
[21:33] Bugtask.distroseries IS NOT NULL AND
[21:33] Bugtask.distroseries = DistroSeries.id AND
[21:33] DistroSeries.distribution = 3 ORDER BY Bug.heat
[21:33] DESC LIMIT 1;
[21:34] ugh
[21:34] (because inner joins imply not null anyhow)...
[21:35] doing the limit cases in the table
[21:36] lifeless, I really appreciate this help, but I need to leave now. Passed EOD and kids are arriving home.
[21:36] Sorry if I distracted you from other work
[21:36] deryck: this sort of stuff is my job :)
[21:36] MAX/LIMIT 1/3 D/DS/DSNN TIME
[21:36] MAX 1 D 2000ms
[21:36] MAX 3 D 150ms
[21:36] MAX 1 DS 196ms
[21:36] MAX 3 DS 2ms
[21:36] LIMIT 3 DSNN 5000ms
[21:36] LIMIT 3 DSNN 2ms
[21:36] LIMIT 1 DSNN 200ms
[21:37] LIMIT 1 DS 2ms
[21:37] its missing two rows - LIMIT 'D'
[21:37] adding them in a sec, I'll put this in the bug
[21:37] catch you later
[21:37] ok, thanks! I'll build on this work tomorrow, if you don't beat me to it :-)
[21:38] later on
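A sketch of how a table like the one above can be gathered, for anyone repeating the exercise. The DSN is a placeholder and timings depend heavily on cache warmth (note the 14-second first run earlier); the query shapes are the ones pasted in the channel:

    import time

    import psycopg2

    QUERIES = {
        'MAX': ('SELECT max(Bug.heat) FROM Bug, BugTask '
                'WHERE BugTask.bug = Bug.id AND BugTask.distribution = %s'),
        'LIMIT': ('SELECT Bug.heat FROM Bug, BugTask '
                  'WHERE BugTask.bug = Bug.id AND BugTask.distribution = %s '
                  'ORDER BY Bug.heat DESC LIMIT 1'),
    }

    def time_query(cursor, sql, distribution):
        start = time.time()
        cursor.execute(sql, (distribution,))
        cursor.fetchall()
        return (time.time() - start) * 1000.0

    connection = psycopg2.connect('dbname=lpmain_staging')  # placeholder DSN
    cursor = connection.cursor()
    for label in ('MAX', 'LIMIT'):
        for distribution in (1, 3):  # 1 is Ubuntu, 3 is Debian
            print '%-5s distro=%s %8.1f ms' % (
                label, distribution,
                time_query(cursor, QUERIES[label], distribution))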
[22:02] anyone from registry still around ?
[22:06] lifeless, yes. what's up?
[22:06] I had a question
[22:06] uhm
[22:06] ah yes
[22:06] LoginToken:+accountmerge
[22:06] is that on your kanban board for moving to a job ?
[22:07] bug 104088
[22:07] <_mup_> Bug #104088: Time-out problem at merging accounts
[22:07] lifeless, yeah it's on our OOPS backlog.
[22:07] ok
[22:07] cool
[22:07] its the 3rd slowest page in the candidate-timeout-reports
[22:08] if you like i can reference it tomorrow at our standup; i'm wrapping up work on some stuff today, and may be able to take a look at it tomorrow.
[22:08] well, its possibly longwinded; I was just curious
[22:08] whatever work item selection process you are following; keep following it :)
[22:13] oh sodium
[22:13] thou art the very essence of a reactive metal
[22:51] * rockstar afks
=== matsubara is now known as matsubara-afk
[22:52] so if i have a testing situation that is coming up b/c of the way bugs are being cached breaking test isolation, is there a way in a test to clear out the cache?
[22:53] what sort of cache
[22:53] memcache? cachedproperty? storm cache?
[22:56] jcsackett: ^
[22:58] lifeless: in your refactoring of launchpad, are you considering splitting the database into multiple databases, per pillar for example?
[22:58] cr3: yes to splitting the database, ENotDesigned to anything further
[22:59] and splitting may mean 'one big db using a different tech'
[22:59] there are all sorts of options
[22:59] lifeless: fun!
[22:59] I need the breathing space to get into it though
[22:59] get page performance up and the DB will have a lot more headroom
[22:59] we can relax a bit and take time to fix stuff up
[23:02] lifeless: storm, i believe.
[23:02] i'm working off an unclaimed, not entirely clear XXX.
[23:03] gimme details
[23:03] <- data monster
[23:04] lifeless: this is the test case: https://pastebin.canonical.com/36595/
[23:04] ahha, I wrote this :)
[23:04] if i run that test, its test case, or the file it's in by itself, it passes.
[23:05] if i run the bugs.tests module or more, it fails, with one more query than expected.
[23:05] this is true both locally and on ec2.
[23:05] let me have a go
[23:05] so there are two possibilities
[23:06] lifeless: i suspect it will pass for you. if i run this on devel there's no problem. i think, given the behavior, it's an isolation problem. but it's one that cropped up b/c of my changes.
[23:06] * jcsackett realizes that might not be what you meant by "have a go"
[23:06] one is that the fuzz I'm permitting in the test is letting it pass when bug 619017 triggers, and that when it doesn't you've caused a scaling bug
[23:06] <_mup_> Bug #619017: __storm_loaded__ called on empty object
[23:07] the other possibility is that, as you say, its purely storm caches getting in the way
[23:07] so, to reset the storm cache
[23:07] you can do IMasterStore(Bug).flush(); IMasterStore(Bug).reset()
[23:07] or something like that
[23:07] at the top
[23:08] test_testcase (I think) does this
[23:08] its a bit of a heavy hammer, but likely needed
[23:08] 0.18 should fix this, but 0.18 has a different showstopper bug in it
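As a sketch, the heavy hammer lifeless describes looks roughly like this in a test's setUp; the import locations are from memory of the Launchpad tree at the time and may differ:

    from canonical.launchpad.interfaces.lpstorm import IMasterStore
    from lp.bugs.model.bug import Bug

    class ResetStormCacheMixin(object):
        """Mix in ahead of TestCase to isolate query-count assertions."""

        def setUp(self):
            super(ResetStormCacheMixin, self).setUp()
            store = IMasterStore(Bug)
            store.flush()  # write out anything pending first...
            store.reset()  # ...then drop the cached object state entirely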
[23:08] jcsackett: what I'd like you to do though, is to compare the two queries
[23:08] drop the 24 to 20 or so
[23:08] and run it in the it-fails manner
[23:08] that will get you a query list
[23:08] shove that in a buffer somewhere
[23:09] and compare it to the query list you get when it fails without modifications
[23:09] jcsackett: by have a go I meant run bugs.tests
[23:09] i can do that, lifeless.
[23:12] probably won't have a comparison for a while though.
[23:14] jcsackett: ok, well I'm going to pop out for a chore or two
[23:14] lifeless: i'll be EODing shortly. i'll ping you tomorrow if i have anything informative.
[23:14] I'll be delighted to help figure out whats going on when I get back. I suspect a diff will be reasonably helpful though.
[23:15] jcsackett: ok.
[23:15] thanks for the help. :-)
[23:15] (and who knows, i may still be poking at this)
[23:15] if it is an isolation issue, we'll definitely want to fix it systematically - I'm adding *lots* of these tests.
[23:16] and I hope others will too.
[23:16] its a great way to prevent scaling surprises
[23:17] lifeless, i hear that.
[23:17] one place in particular that may be causing the queries
[23:17] is Person._init
=== Ursinha-afk is now known as Ursinha
[23:24] losa ping
[23:24] lpoops is telling me
[23:24] Exception Value:
[23:24] [Errno 13] Permission denied: '/x/launchpad.net-logs/production/mizuho/2010-09-01'
[23:24] on https://lp-oops.canonical.com/oops.py/?oopsid=1705A2026
[23:25] let me take a look
[23:26] hmm, that's a bad path
[23:26] * mbarnett lies
[23:27] well, it is. there is no mizuho directory there
[23:27] wow, i am failing this evening
[23:27] there is. checking permissions.
[23:27] * lifeless sniggers
[23:31] try again
[23:34] lifeless: that time i got that it didn't match any oopses..
[23:34] mbarnett: I got OSError at /oops.py/
[23:34] [Errno 2] No such file or directory: 'lp_publish'
[23:35] hmm, reloading it got me that as well.
[23:37] [Wed Sep 01 23:18:25 2010] [error] [client 122.63.10.108] Unauthorized access attempt for 'https://lp-oops.canonical.com/oops.py/?oopsid=1705A2026' by 'https://login.ubuntu.com/+id/kPbPBDC' (teams: [])
[23:37] looks like there might be a bug in the lp-oops auth code...
[23:38] hah
[23:38] would not surprise me
[23:38] i wonder who would handle that..
[23:39] matsubara, but he's a) afk and b) swamped with merge workflow stuff
[23:40] heh, yeah. i pinged him anywhoo.
[23:41] I saw :P
[23:41] * mbarnett goes to file a bug
[23:41] thanks
[23:41] ok, bbs
=== salgado is now known as salgado-afk
[23:51] lifeless: how do I profile a page?
[23:51] lifeless: the new cloud is in place
[23:51] lifeless: but the queries still seem a little slow to me
[23:51] but I don't know which bit is slow