[00:24] desperately seeking reviewers [00:25] lifeless: Where? [00:25] I don't see a lifeless review pending. [00:25] https://code.launchpad.net/~lifeless/launchpad/bug-739799/+merge/54287 [00:26] Oh, it didn't actually exist yet. [00:26] I see. [00:26] lifeless: Still 100ms? :/ [00:28] wgrant: see the oops: 9000ms cold, new query is 900ms cold, 100ms hot [00:30] No thumper, and no StevenK yet I guess :( [00:30] Hmm? === wgrant changed the topic of #launchpad-dev to: Performance Tuesday | https://dev.launchpad.net/ | firefighting: - | On call reviewer: wgrant | https://code.launchpad.net/launchpad-project/+activereviews [00:31] StevenK: https://code.launchpad.net/~lifeless/launchpad/bug-739799/+merge/54287 [00:31] next thing is to permit bulk_load chainging [00:31] and we'll have a poor man persistence layer [00:31] Indeed. [00:32] I also need to look at a helper for collections. [00:33] wgrant: Done. [00:34] StevenK: Thanks. [00:34] StevenK, lifeless: https://code.launchpad.net/~cjwatson/launchpad/tar-xz/+merge/54215 [00:37] wgrant / sinzui can you confirm jc is still on Cve:+index ? [00:37] lifeless: He mentioned it on the call this morning. [00:37] kk [00:37] So I think so. [00:38] sob [00:38] I don't want to touch 27 / 26 ProductSet:CollectionResource:#projects [00:38] It's not that bad. [00:38] Just needs a few things preloaded. [00:38] A couple of them are collections, though. [00:38] its a nuisance determining all the traversals being hopped to [00:39] Indeed. [00:39] We may want to fix lazr.restful. [00:39] To provide a hook. [00:39] productseries should be sufficient to make the timeout go away [00:40] That was my impression, but officialbugtag might be bad too. [00:44] StevenK, lifeless: Thanks. [00:46] sinzui: [00:46] Traceback (most recent call last): [00:46] File "bin/sprite-util", line 54, in [00:46] url_prefix_substitutions={'/@@/': '../images/'}) [00:46] File "/home/robertc/launchpad/lp-branches/working/lib/lp/services/spriteutils.py", line 94, in __init__ [00:46] css_template_file, group_name, url_prefix_substitutions) [00:46] File "/home/robertc/launchpad/lp-branches/working/lib/lp/services/spriteutils.py", line 101, in _loadCSSTemplate [00:46] self.css_object = cssutils.parseFile(css_template_file) [00:46] File "/home/robertc/launchpad/lp-sourcedeps/eggs/cssutils-0.9.6-py2.6.egg/cssutils/__init__.py", line 156, in parseFile [00:46] return CSSParser().parseFile(*a, **k) [00:46] File "/home/robertc/launchpad/lp-sourcedeps/eggs/cssutils-0.9.6-py2.6.egg/cssutils/parse.py", line 130, in parseFile [00:46] return self.parseString(open(filename, 'rb').read(), [00:46] IOError: [Errno 2] No such file or directory: '/home/robertc/launchpad/lp-branches/working/lib/canonical/launchpad/icing/style-3-0.css.in' [00:51] * StevenK glares at run_script. Something looks to be running DELETE FROM FeatureFlag first [00:53] see jtv's recent landing? [00:53] StevenK: wgrant: any idea about this issue I have ? [00:54] or should I back 12638 out? [00:55] lifeless: Reproduced. [00:55] bah, not that rev. other rev [00:55] lifeless: We have another 3 hours left on buildbot. [00:55] no, actually, that rev [00:55] Let's debug slightly first. [00:56] its trying to build the style-3.0 regardless [00:57] Ah. [00:57] I wonder if bin/sprite-util didn't get rebuilt... [00:57] so the easiest fix is to rename it [00:58] indeed [00:58] Or touch something that buildout depends on. [00:58] removing it [00:58] and it fails to run [00:58] 'make' doesn't know that bin/sprite-util should be rebuilt [00:58] bin/buildout fixed [00:58] style-3-0.css.in should not exit /bin/buildout will fix the utils [00:59] lifeless: Fixed? [00:59] wgrant: made 'make' work [00:59] Looks like the buildout templates were not regenerated during the deploy [00:59] we need to land a dependency rule to rerun buildout [00:59] it may|maynot fail on deploy [00:59] lifeless: It shouldn't fail on deploy. [01:00] wgrant: we don't start with a clean tree [01:00] wgrant: do we explicitly run buildout? [01:00] Then why does it take 30 minutse to run buildout? [01:00] what [01:00] ? [01:01] bbr [01:01] brb [01:01] sinzui: Your change to sprite-util.in wasn't enough to cause buildout to be rerun. [01:02] sinzui: So bin/sprite-util is still old. [01:02] ./bin/buildout will remake the bin dir from the new bin templates [01:02] Right, but until then everyone's tree is broken. [01:03] Maybe I should just upgrade loggerhead. [01:03] Kill two birds with one stone. [01:03] Oh, no. [01:03] versions.cfg, not sourcedeps.conf :( [01:03] Mine wasn't. I think this is case of different setups. I branch and make new trees every tine [01:04] sinzui: I normally do, but sometimes run stuff from devel. [01:04] And as of yesterday I'm trying bzr-colo, so all my one tree is broken. [01:04] thanks for trying it [01:04] i do use it for lp development fwiw, but not as often as you will be [01:04] I stopped running from devel when we switched to eggs. It became unstrust worthy [01:04] what happened? [01:04] I miss simple python and debs [01:05] sinzui: Indeed. [01:05] poolie: Makefile + buildout == confusion [01:05] Compounded by running with a single tree. [01:05] sinzui, what's the alternative? [01:05] to running from devel, i mean [01:06] my frustration here is that i only touch lp stuff every 1-3 weeks and so every time i do, i need to fight with dependencies [01:08] so minimally [01:08] someone needs to mail the dev list [01:08] secondly, we should just fix Makefile to invoke buildout here [01:08] poolie: yes :( I run rocketfuel-setup the moment I start my day, merge and rebuild all my trees. I can review projects, check spam and answer questions, before all that work is complete some days [01:09] wgrant, so the breakage is not specifically from bzr-colo? [01:09] poolie: No. [01:09] poolie: colo is working OK, but the fact that it shares trees means that our screwed up build system is more exposed. [01:17] jcsackett: I have a change to lib/lp/bugs/browser/cve.py that may trivially fix the CVE:+index bug, though I'm not sure if it is/isn't related [01:19] jcsackett: actually nvm, I was on (hopeful) crack [01:23] huh [01:23] sinzui: looks to me like there is only one user of ProductSet.search : the API. [01:24] sinzui: if I'm right, could make it siiiiimpler === jtv-afk is now known as jtv [01:25] morning folks [01:36] wgrant: fyi, offiical bug tags may be an issue, but they are also locked behind a mixing behind a property hidden in a dilemma. [01:36] wgrant: e.g. TooHardBin [01:36] lifeless: Perhaps. [01:38] wgrant: certainly not low hanging fruit [01:52] hi StevenK, are you here? [02:09] lifeless: I would expect ProductSetView to use IProductSet.search(). I see it uses IPillarNameSet. I think this is because users do not care care that we divide things into distros, projects, and project groups [02:09] sinzui: I had such expectations too [02:09] sinzui: I found a total of 1 use: bugalsoaffects [02:16] can has review? https://code.launchpad.net/~lifeless/launchpad/bug-727020/+merge/54290 [02:18] jtv: I am now. What are you after? [02:18] StevenK: I figured it out for myself now. [02:19] jtv: Nice work. :-) [02:20] Well I still haven't solved my problem. :) The difficulty was finding out where Sources gets generated. Turns out the answer depends on the archive purpose. In my case it should be done by a-f. [02:31] StevenK: https://code.launchpad.net/~lifeless/launchpad/bug-727020/+merge/54290, if you would be so kind [02:34] wgrant: how is Archive:+index looking perf wise now, do you think ? [02:34] lifeless: Done. [02:35] thanks [02:35] lifeless: getBuild(Status)?Summar(y|ies)(ForSource(Id)?s(AndArchive)?)? is still being problematic. [02:35] Across +index, +packages, +copy-packages and +delete-packages [02:36] wgrant: want me to look ? [02:37] or I can fix ScopedCollection:CollectionResource:#person-page-resource [02:37] hmm, might do that one [02:37] lifeless: I'm looking. [02:37] Well, was looking before, and will be looking again in a while, I imagine... [02:38] lifeless: It needs to be rewritten to not execute four 300ms queries, basically. [02:38] wgrant: thats only 1.2 seconds [02:38] lifeless: The queries are at best 300ms. They're often much more. [02:40] jtv: Hi! :-) [02:40] uh-oh [02:40] yes? [02:40] jtv: Is there anything special I need to do to see feature flags in a script? [02:41] StevenK: if it's a LaunchpadScript, no. [02:41] If it's not a LaunchpadScript, you're still on your own. [02:41] I don't think it is. Can you point me at the scaffolding I need to set up? [02:42] Surprisingly, you can find the answer in LaunchpadScript. :) [02:42] Have a look at lp.services.scripts.base: LaunchpadScript.run [02:42] In a nutshell: [02:42] from lp.services.features import install_feature_controller, make_script_feature_controller [02:43] Ah ha, it's a JobCronScript, which bases from LaunchpadCronScript, which is based off LaunchpadScript. [02:43] install_script_feature_controller(make_script_feature_controller("identifying script name")) [02:43] Well then there you go. [02:43] The feature controller is installed only during "run." [02:43] After that, it restores whatever controller was previously active (which probably means "none"). [02:44] If you call main directly in a test, it uses whatever controller the test had set up. [02:44] I call the cronscript directly using run_script [02:45] Then it should just work (apart from ZCML taking #$%@ ages to parse during script startup) [02:45] It insists the feature flag isn't set [02:46] I shall worry about it after lunch [02:47] StevenK: my branch landed only recently, so make sure it's included. [02:50] jtv: This is with latest devel merged in [02:52] jtv: Is it because the test set up uses a fixture? [02:52] Uses a fixture for what? [02:52] self.useFixture(FeatureFixture({FEATURE_FLAG_ENABLE_MODULE: u'on'})) [02:54] Does FeatureFixture actually change your feature-flag settings? Because it'd have to do that to have any influence on your script run. [02:54] jtv: Running the cronscript test by itself with LP_DEBUG_SQL=1, it runs "DELETE FROM FeatureFlag" [02:55] The job gets added, then there is a pause, which is ZCML and such like, and then two statements get executed, the first is the delete, and the second is checking the feature flag which is now unset [02:55] No idea where that comes from; does the FeatureFixture also INSERT your test flag (and do you commit before the script runs)? [02:56] My impression was that the FeatureFixture is an in-memory trick only, in which case it's not going to do anything for scripts. [02:56] jtv: So this is using the testcase you added :-) [02:56] How does that follow? [02:57] If FeatureFixture is in memory only, how do I go about enabling the flag? [02:57] its not in memory only AFAIK [02:57] it was discussed, not done. [02:57] IMBW [02:57] check the code [02:58] lifeless: So how can I tell what is running the DELETE? [02:58] LP_DEBUG_SQL_EXTRA=1 [02:58] I'm already running that [02:59] I was right, it is the fixture. [02:59] it should show a backtrace then [02:59] It's cleaning itself up before the script runs [03:00] StevenK: pastebin the code? [03:01] lifeless: It's in an MP: https://code.launchpad.net/~stevenk/launchpad/dsdj-runner/+merge/54159 ; I'm just pushing latest changes [03:02] Unless I cause the fixture to cleanup by running transaction.commit() [03:02] Anyway, lunch [03:02] (Finally) [03:09] grrrrr [03:09] self._fd.close() [03:09] IOError: [Errno 28] No space left on device [03:09] trying to setup a test env [03:09] because sill factory triggers [03:09] File "/home/robertc/launchpad/lp-sourcedeps/eggs/zope.sendmail-3.7.1-py2.6.egg/zope/sendmail/delivery.py", line 136, in createDataManager [03:09] msg.close() [03:10] File "/home/robertc/launchpad/lp-sourcedeps/eggs/zope.sendmail-3.7.1-py2.6.egg/zope/sendmail/maildir.py", line 136, in close [03:30] StevenK: looks like the fixture DELETE should happen at the end of the test, not before === jamesh_ is now known as jamesh [03:50] wgrant: I'm driving myself up the walls trying to find out why a-f isn't generating any Sources files for me. One thing I notice is that there's no trace of binary packages in the setup that the test publisher has produced for me… is that normal? [03:52] I do get a _main_source referring to an existing dsc file… what else is needed? [03:52] jtv: What does the a-f conf look like? [03:52] * jtv finds pastebin [03:53] wgrant: I hope apt.conf is the one you need… http://paste.ubuntu.com/583629/ [03:53] jtv: That's the one. [03:54] So, that should work. But I wonder if it's not working because you have no DASes. [03:54] I'm not sure anyone has tested that. [03:54] So the test publisher doesn't set any up? [03:54] Not if you just makeDistroSeries, no. [03:54] Also, the contents.header that that config refers to doesn't exist prior to the a-f run. [03:54] It's not designed to placate buggy a-f stuff. [03:54] Right, that's fine. [03:55] Trying it with a DAS in place, just for the hell of it… [03:56] It shouldn't make a difference, but you never know. [03:58] I'm just getting more error messages out of a-f. :) [03:58] It now also says "could not open file" about the Packages files. [03:58] Could you pastebin the full output? [03:58] Or throw the branch my way? [04:00] wgrant: the branch is lp:~jtv/launchpad/bug-55798 ; run ./bin/test lp.soyuz.scripts.tests.test_publish_ftpmaster -t publishes [04:00] I've set it up to tar up directory contents in /tmp, though not all of that is committed (because I don't want it in my final branch) [04:04] lifeless: It looks to me that the commit is setting it off [04:09] wgrant: another fairly wild guess: if none of this has been tested with distros other than Ubuntu, could something be reacting badly to the test distro not having (or populating) all the pockets that ubuntu has? [04:09] StevenK: the commit will commit all pending stuff [04:09] StevenK: which will include a cleanout of *prior* featureflags [04:09] StevenK: and should include an add of yours [04:10] StevenK: check the FeatureFixture code [04:10] jtv: Sorry, got distracted by uncontrollable anti-Microsoft rage. [04:10] Let's see. [04:10] wgrant: never apologize for that. It's Microsoft's fault and don't stop until they're gone. [04:12] argh [04:12] activemembers unions with adminmembers. why [04:14] lifeless: That doesn't explain what I see with LP_DEBUG_SQL_EXTRA. http://pastebin.ubuntu.com/583636/ [04:15] StevenK: thats after the test ran [04:16] File "/home/steven/launchpad/lp-sourcedeps/eggs/testtools-0.9.8-py2.6.egg/testtools/runtest.py", line 154, in _run_cleanups [04:16] StevenK: so its irrelevant to you === Ursinha is now known as Ursinha-afk [04:16] Drat, pasted the wrong bit [04:17] lifeless: http://pastebin.ubuntu.com/583637/ [04:18] lifeless: The traceback from the commit is half-way through the test [04:25] jtv: Ah [04:27] StevenK: still in *test* cleanups [04:27] File "/home/steven/launchpad/lp-sourcedeps/eggs/testtools-0.9.8-py2.6.egg/testtools/runtest.py", line 154, in _run_cleanups [04:30] lifeless: So the code for the statement is before the statement, http://pastebin.ubuntu.com/583640/ shows the entire test, and you can plainly see it runs the clean ups after line 6 of the test [04:33] StevenK: this suggests that that commit is failing [04:33] StevenK: raising an exception, which triggers cleanups [04:33] StevenK: i promise you, you've left your entire test according the backtraces you are showing me [04:34] StevenK: if I was debugging this, I would break in with pdb about now [04:34] Project db-devel build #477: FAILURE in 5 hr 1 min: https://lpci.wedontsleep.org/job/db-devel/477/ [04:35] argh [04:35] ---------------------------------------------------------------------- [04:35] SELECT 1 FROM (SELECT PersonTransferJob.json_data, PersonTransferJob.id, PersonTransferJob.job, PersonTransferJob.job_type, PersonTransferJob.major_person, PersonTransferJob.minor_person FROM Job, PersonTransferJob WHERE PersonTransferJob.job_type = 1 AND PersonTransferJob.job = Job.id AND Job.status IN (0, 1, 4) AND (PersonTransferJob.minor_person = 251673 OR PersonTransferJob.major_person = 251673) LIMIT 1) AS "_tmp" LIMIT 1 [04:35] ---------------------------------------------------------------------- [04:35] (repeat for the batch size) [04:35] SELECT 1 FROM (SELECT PersonTransferJob.json_data, PersonTransferJob.id, PersonTransferJob.job, PersonTransferJob.job_type, PersonTransferJob.major_person, PersonTransferJob.minor_person FROM Job, PersonTransferJob WHERE PersonTransferJob.job_type = 1 AND PersonTransferJob.job = Job.id AND Job.status IN (0, 1, 4) AND (PersonTransferJob.minor_person = 251672 OR PersonTransferJob.major_person = 251672) LIMIT 1) AS "_tmp" LIMIT 1 [04:35] jtv: Ran 1 tests with 0 failures and 0 errors in 32.236 seconds. [04:35] wgrant: yes, but no Sources files. Check /tmp/var_archive.tar.gz. [04:37] jtv: It's empty, but it's there. [04:37] (because I fixed it) [04:37] wgrant: you do know how to keep the tension up, don't you? [04:37] Indeed. [04:37] How did you fix it, and why is it empty? [04:37] You had your paths wrong [04:37] config.archivepublisher.root on prod is /srv/launchpad.net/ubuntu-archive, not /srv/launchpad.net. [04:37] And what's all this PPA business? [04:40] lp:~wgrant/launchpad/jtv-bug-55798 [04:40] What is what PPA business? [04:41] Your test and cron.publish-ftpmaster talk about PPAs. [04:41] cron.publish-ftpmaster does not go near PPAs. [04:41] Oh, right, there was that alternate codepath that could never have worked. [04:42] You speak in riddles. [04:42] - ARCHIVEROOT_PARTNER=$PUBLISHER_ROOT/ppa/$DISTRONAME-partner [04:42] All my test did was create some ppa directories that appeared to be needed for anything to run. [04:42] You added that. [04:43] Possibly because the original did this: [04:43] - ARCHIVEROOT_PARTNER=/srv/launchpad.net/ppa/ubuntu-partner [04:43] How is that different? [04:43] That branch has the minor issue of not making any sense at all. [04:43] So I removed it. [04:43] What branch? [04:44] -if [ "$LPCONFIG" = "$PRODUCTION_CONFIG" ]; then [04:44] - ARCHIVEROOT_PARTNER=/srv/launchpad.net/ubuntu-archive/ubuntu-partner [04:44] - GNUPGHOME=/srv/launchpad.net/ubuntu-archive/gnupg-home [04:44] -else [04:44] - # GNUPGHOME does not need to be set, keys can come from ~/.gnupg. [04:44] - ARCHIVEROOT_PARTNER=/srv/launchpad.net/ppa/ubuntu-partner [04:44] -fi [04:44] That's how it is in devel now. [04:44] The second part of that is broken. [04:44] You mean "branch" as in "if"? [04:44] Yes. [04:44] Sorry. [04:45] bah [04:45] wgrant: sinzui is gone, can you show https://bugs.launchpad.net/launchpad/+bug/739961 to him tomorrow ? [04:45] <_mup_> Bug #739961: person collections in API late evaluate PersonTransferJob < https://launchpad.net/bugs/739961 > [04:45] wgrant: Okay, but does this change really affect the Sources issue? I can see how the _other_ change affects it. [04:45] lifeless: Sure. [04:46] review plox https://code.launchpad.net/~lifeless/launchpad/bug-727540/+merge/54293 [04:46] jtv: I don't recall exactly. But the partner dists move was crashing until I fixed that. [04:48] wgrant: your change does mean that ARCHIVEROOT becomes /srv/launchpad.net/ubuntu instead of /srv/launchpad.net/ubuntu-archive/ubuntu as it currently is, doesn't it? [04:49] jtv: Yes. [04:49] Er. [04:49] Er. [04:49] Where does cron.publish-ftpmaster get this info? [04:49] It duplicates what it finds in the lazr config. [04:49] So it should be fine, then. [04:49] Which is one of the things that made this untestable. [04:50] Because it will see config.archivepublisher.root == /srv/launchpad.net/ubuntu-archive [04:50] But that's different yet again. [04:51] On development systems, by the way, it's all in /var/tmp/archive [04:53] Right. [04:53] But how is it different yet again? [04:53] Assuming that you give cron.publish-ftpmaster config.archivepublisher.root, it should behave the same on prod as the one in devel does. [04:55] StevenK: perhaps you could review https://code.launchpad.net/~lifeless/launchpad/bug-727540/+merge/54293 ? its very simple and has its diff now [04:55] StevenK: I'm going to grab dinner, after that I can perhaps pull your branch down and have a poke for you [04:58] wgrant: except that the script duplicates it all in its variables. You just said the production config said /srv/launchpad.net/ubuntu-archive, but you changed the script to use /srv/launchpad.net/ubuntu instead. [04:58] This is all so confusing. [04:58] lifeless: So where is the eager loading? [04:59] jtv: The production dists tree lives in /srv/launchpad.net/ubuntu-archive/ubuntu/dists [04:59] jtv: The dev one traditionally lives in /var/tmp/archive/ubuntu/dists [04:59] oic! [05:00] So then you missed a change: [05:00] PUBLISHER_ROOT="${PUBLISHER_ROOT:-/srv/launchpad.net/ubuntu-archive}" [05:00] Right? [05:00] StevenK: _members(need_validity..) etc [05:01] wgrant: in other words, the "ubuntu-archive" goes into the $PUBLISHER_ROOT instead of being added in ARCHIVEPARENT. [05:01] And in fact PUBLISHER_ROOT and ARCHIVEPARENT are the same thing [05:01] jtv: I did enough to make the tests pass. [05:01] (which the script previously hid by using $ARCHIVEROOT/..) [05:02] I didn't actually look at what changes you really made. [05:02] Enough to get the tests to pass with a sane path structure, that is. [05:02] It's pernicious because there's no way of testing it. Eventually it'll all go; right now I'm concerned with the "chalk outline" that the new version must fill. [05:03] Righ. [05:03] Rhymes with "Sigh." :) [05:03] * jtv looks up production config [05:04] Also note that Julian's config work will change how this works, but the values should stay the same. [05:04] Just your tests will need to change, if they decide to override configs. [05:15] They don't—wouldn't work for external script runs. [05:16] Well, it can. [05:16] We have facilities for that now. [05:16] Indeed you probably should. [05:17] (eg. the librarian and DB fixtures push config overlays then write them to disk) [05:23] That may become necessary later as we move this script to python and create more appropriate tests. Then again, at that point it won't involve script runs any more so we should be able to do more in-memory. [05:25] wgrant: you were right by the way: the archivepublisher's "root" setting on production is /srv/launchpad.net/ubuntu-archive. [05:25] jtv: Yup. [05:26] Thanks for helping me with this. I'm reluctant to drop the ppa stuff until I've got something of a proper test hammered out, but it'd be good to streamline things then. [05:26] Right. [05:27] Food. === jtv is now known as jtv-eat [05:35] Right, something odd is going on, since I've added the flag manually, and the cron script still can't see it. [05:39] * StevenK doesn't understand feature flags [05:40] StevenK: does your script install a flag handler? [jtv's work was in this area] [05:41] lifeless: jtv added it to LaunchpadScript's run() method from what I can see, and my script (through 2 layers of indirection) bases off it [05:44] lifeless: Unless I'm wrong. What's a flag handler? [05:46] sadness [05:47] 1112 timeouts [05:47] StevenK: featurecontroller or whatever [05:47] install_feature_controller(make_script_feature_controller(self.name)) [05:47] Is that enough? [05:51] lifeless: ^ [05:51] yeah, should be [05:51] StevenK: whats the feature controller present when your code does a flag lookup ? [05:51] How do I check that? [05:52] print ? [05:52] Sure, but print *what* ? :-) [05:52] lp.services.features.mumble [05:55] Guessing get_relevant_feature_controller(), which gives None [05:56] lifeless: ^ Which is likely sub-optimal [05:58] StevenK: that is indeed likely to be a problem [05:58] StevenK: is your script threaded? [05:58] lifeless: I have no idea, but I doubt it. [06:07] timeouts top to bottom - ugh, ugh, ugh, fixed fixed fixed ugh ugh in-progress ugh ugh ugh ugh [06:08] StevenK: so, keep spattering prints around - like after install_feature_controller - until you can see what goes wrong [06:10] ok seriously did someone rename style-3-0.css.in? [06:10] huwshimi: run bin/buildout [06:14] lifeless: So directly after install_feature_controller, it's still None [06:15] so, either you're looking in the wrong place, or that method is failing ;) [06:16] I don't get what install_feature_controller is supposed to do [06:19] StevenK: it's a simple setter for the feature controller, though it only applies to the running thread. === jtv-eat is now known as jtv [06:19] I'm close to ripping this out for now [06:20] lifeless: That didn't help. I think it has been renamed. [06:20] I don't understand it, and it seems quite opaque [06:20] huwshimi: delete bin/sprite-util [06:20] huwshimi: then run it [06:20] Or make clean; make [06:21] StevenK: let's start by establishing that you're talking to the right feature controller. [06:21] But that's a bigger hammer [06:22] StevenK: if get_relevant_feature_controller returns None right after install_feature_controller, then are you sure you didn't pass it a None? [06:22] 'install_feature_controller(make_script_feature_controller(self.name))' is the call [06:23] Which is the line you added to LaunchpadScript [06:23] Yes [06:24] Oh actually you don't need to install it; make_script_feature_controller calls install_feature_controller. [06:24] After which in fact it returns None. [06:24] Oh, so it's broken [06:24] :-P [06:24] Yup. [06:24] Fixing. [06:25] It's fine, I can do it [06:25] You can replace install_feature_controller(make_script_feature_controller(self.name)) with just make_script_feature_controller; or better yet, [06:25] make make_script_feature_controller return the controller instead of installing it. [06:27] We don't need the controller, so if it installs it, that's enough [06:27] jtv: This was the failing test for me, so I can just fix the bug in my branch if you wish [06:27] True, it's just nice to export just a single function to set the thing. [06:27] Needs a test also. [06:28] Which you're working on? [06:28] Yes [06:39] StevenK: pushing fix [06:40] bug 739986 [06:40] <_mup_> Bug #739986: LaunchpadScript fails to install feature controller < https://launchpad.net/bugs/739986 > [06:45] lifeless: just fyi that file was renamed [06:46] StevenK: care to review? https://code.launchpad.net/~jtv/launchpad/bug-739986/+merge/54299 [06:48] jtv: r=me [06:48] thanks [06:48] and sorry for the… [06:49] oh wait I already said that didn't I [06:49] A few times :-P [06:49] possibly not enough :) [06:49] This is something that still bothers me about our testing. Quis custodiet ipsos custodies? [06:50] jtv: You'll toss that through ec2? [06:50] I was considering it, yes—meanwhile you could just merge it in [06:50] Yup [06:51] huwshimi: it was, but the reason you had a problem was due to a stale built file [06:51] huwshimi: if you fixed it locally by renaming again, you've done it wrong [06:53] lifeless: I haven't renamed it, I was just getting conflicts within the file and it was renamed and now I'm in a world of hurt with strange diffs, but I think it's completely unrelated to the buildout stuff. I just happened to run into it at the same time. [06:54] lifeless: I wasn't really expecting to come across the changes and renaming of the files without some kind of warning (like an email) [06:59] heh. wgrant, was that you who wrote "I do not care about sources"? Because that seems to be what's bothering a-f now. :) Not sure that we need to fix that though, as long as a Sources file gets generated. [07:00] jtv: I don't recall saying that, but I may have. [07:00] Anyone want to review https://code.launchpad.net/~wgrant/launchpad/restore-archivepublisher-config/+merge/54301? [07:00] It just sounded like you somehow. :) [07:00] Probably. [07:00] I suppose I owe you one, so sure. [07:01] It's not very difficult :) [07:01] wgrant: hit jtv for Cake. that way you can start paying us back. /unsubtle-hint [07:01] jtv: (that was cprov) [07:02] Cake? [07:02] But yes, it does sound like something I'd say. [07:04] wgrant: I just approved without comment. May be a first. [07:04] (I did check the bug reference) [07:05] jtv: I'm tossing my branch at ec2. Which means your bug fix might inadvertly land [07:05] StevenK: oh woe is us [07:05] jtv: Thanks. [07:06] And in case the sarcasm didn't drip quite where everyone can see it, I'll add what BBC teletext page 888 does: "(!)" [07:06] (I hope I'm not so much older that I now have to explain what teletext is) [07:06] I'm so tempted to ask, now [07:06] Even I know what Teletext is. [07:06] although only because I didn't know what a button on the old TV did. [07:06] StevenK: there you go, ask william just like you would with any soyuz question. [07:07] The BBC has excellent subtitling on page 888. [07:07] Someone hums half a bar of an ill-remembered melody, off tune, in an anvil factory and the subtitles will tell you it's the 3rd movement from Vivaldi's opus #128 etc. [07:08] jtv: That is surely crucial to the plot. [07:08] Surely. [07:08] They also use different colours for different characters in films, as I recall. [07:09] It was a great help in learning English. [07:10] huwshimi: renames are normally pretty painless; also having short lived branches avoids a lot of pain [07:38] StevenK: in replacing cron.publish-ftpmaster, do I need to call getPubConfig on every archive that the script is to work on? [07:40] publish-distro will do that for you. [07:40] But the dists musical chairs probably needs it run before publish-distro. [07:40] But youi'll be moving that into publish-distro soon. [07:40] I expect I'll probably need to do some setup before then, yes [07:41] Will I? [07:41] I hope so. [07:41] It belongs there. [07:43] For now I've been far too busy getting it to run at all. Probably wouldn't have been wise to start hunting for cleanup opportunities too soon. :) [07:43] My next step is to pythonify the shell script as-is. [07:44] I'm not sure that Pythonifying it directly is particularly valuable. [07:44] It's just one of the steps. [07:44] It means that I can use config instead of shell variables that must match the config, for one. [07:45] It also means that I can use dedicated temporary directories again, which helps avoid pollution by previous test runs. [07:46] And of course it means I can run unit tests and get the results the same day. :) [07:49] Project windmill build #85: FAILURE in 1 hr 6 min: https://lpci.wedontsleep.org/job/windmill/85/ [08:00] have I mentioned how much I love that Storm and sqlobject result sets are incompatible with each other ? [08:01] * jkakar hugs lifeless [08:02] lifeless: Isn't the problem just that the SQLObject one is incompatible with Storm? ;b [08:02] jkakar: they both ship in the storm tarball >< [08:02] lifeless: I know, I was just being facetious... sorry, coffee hasn't set in yet. [08:02] jkakar: it makes migration tricky [08:02] jkakar: :) [08:02] jkakar: end of long day for me, so the reverse... [08:02] lifeless: Heh [08:04] heya jkakar! [08:05] spm: Hi! :) [08:16] jkakar: so you miss us after all, do you? :) [08:17] jtv: I'm having fun at the new gig, but yes, I do miss you. :) [08:17] :) [08:20] wgrant: Hi, I'd appreciate a review of https://code.launchpad.net/~rvb/launchpad/db-dds-fix-filter-form-add-parent-name/+merge/54239 [08:23] * lifeless wonders if http://dev.launchpad.net/ReviewingCodeImports?action=diff&rev1=9&rev2=10 was intended [08:28] wgrant: I need some blurb aboutthe change for oem [08:32] rvba: Done. === jam1 is now known as jam [08:32] lifeless: :( === wgrant changed the topic of #launchpad-dev to: Performance Tuesday | https://dev.launchpad.net/ | firefighting: - | On call reviewer: - | https://code.launchpad.net/launchpad-project/+activereviews [08:33] Yippie, build fixed! [08:33] Project windmill build #86: FIXED in 43 min: https://lpci.wedontsleep.org/job/windmill/86/ [08:33] wgrant: I can just say 'xss hole and escaping blah', but perhaps a little more accuracy would help? [08:33] wgrant: thx [08:34] wgrant: if you're doing up an incident report, just make a clear summary, and I'll copy that across - avoids you duplicating effort [08:34] lib/lp/bugs/tests/../doc/bugnotification-sending.txt", line 1260, in bugnotification-sending.txt [08:34] ^ db-devel failure. [08:34] ring any bells? [08:35] ah [08:35] - Matching filters: Allow-comments filter [08:35] ? ^ ^ ^^ [08:35] + Matching subscriptions: Allow-comments filter [08:35] ? ^^^^^^ ^ ^^^ [08:35] someone didn't use ec2 :) [08:36] (or ran into a landing race I guess) [08:40] lifeless: Merge issue. Fix already landed. [08:45] lifeless: https://wiki.canonical.com/IncidentReports/2011-03-22-LP-unescaped-json, but the summary is not very good. [08:46] jtv: Hi! [08:46] hi henninge === almaisan-away is now known as al-maisan [08:47] jtv: I picked up bug 605924 as you may have noticed. I just commented on what I plan to do. [08:47] <_mup_> Bug #605924: Clean up IHasTranslationTemplates < https://launchpad.net/bugs/605924 > [08:47] if you are interested. [08:47] * jtv has a look [08:50] henninge: looks good—I agree that the hard part is to find all uses and make sensible decisions for them. The tests probably aren't going to find all of those, but that's no excuse for ignoring the issue forever. [08:50] jtv: well, I am picking it up because it is hindering me in using this properly. [08:51] That's as good a reason as any other, isn't it? :) [08:54] well, it certainly raises its priority ... or urgency? whatever [08:56] wgrant: cool; are you forcing a build? [08:56] lifeless: It's already building again. [08:56] cool [08:56] Because there was the testfix rev in the queue. [08:59] good morning [09:00] morning adeuring [09:00] lifeless: hello [09:00] hi jtv [09:01] jml: hi [09:02] lifeless: skype? [09:02] jml: aye [09:03] lifeless: hello, I can now hear you [09:07] lifeless: Total: 231 (+42, -48 since 2011-03-15) [09:08] lifeless: Thanks. [09:08] wgrant: de nada, thank you! [09:09] is staging down? [09:10] i'm getting a 504 gateway timeout trying to make api calls against it [09:10] it might be just me [09:10] staging is down, yes. [09:10] Fix landed. [09:11] thanks [09:11] We might be able to get it going again mid-morning tomorrow. [09:11] Maybe. [09:14] is there any point filing a bug saying the api error is less clear than the web ui one? [09:14] nope [09:22] this seems like another reason to make lplib default away from staging [09:22] morgen [09:22] Morning bigjools. [09:23] bigjools: I've restored those archivepublisher config schema entries in db-devel, so the configs work again. [09:23] wgrant: why? [09:23] bigjools: staging broke. [09:24] yes, it was broke before and Chex cowyboyed it [09:24] Also because having to merge configs in lockstep with deployments sucks. [09:24] We're going to have to cowboy it every time it updates... [09:24] grar [09:24] Anyway, fixed now. [09:24] I detest this config system [09:24] I just have to remove the entries right after the next deployment. [09:24] yes, but it's hassle and pain and annoyance [09:25] It is. [09:26] Bloody hell. If it isn't the backup killing garbo-hourly, it's a vacuum [09:26] wgrant: nice security fix earlier :) [09:26] bigjools: Yeah :/ [09:26] you'd think it having been broken once before there'd be a test case [09:26] I plan to add one tomorrow. [09:28] Also I might upgrade our simplejson and fix it properly. [09:28] Since IE prevents any sensible fix. [09:30] heh, that explains your FB update earlier [09:32] I mean, XHTML was only standardised 11 years ago. [09:38] wow, hooray for bug expiry [09:38] Project db-devel build #478: STILL FAILING in 5 hr 4 min: https://lpci.wedontsleep.org/job/db-devel/478/ [09:41] jml: 11 / 18 Product:+code-index [09:42] jml: https://code.qastaging.launchpad.net/++profile++show/launchpad [09:43] jml: 06699-12526@SQL-launchpad-main-slave SELECT DISTINCT BugBranch.branch FROM Bug, BugBranch WHERE BugBranch.branch IN (24637, 34755, 34756, 352362, 423464, 423252, 438596, 423710, 438533, 28862, 421710, 425429, 425609, 424278, 422318, 423317, 423707, 423704, 422533, 421039, 423618, 422191, 423555, 423341, 423342, 423273, 423359, 423465, 419565, 423403, 423337, 30466, 421067, 422299, 421762, 422907, 422741, 422712, 319807, 422 [09:43] is 6000ms [10:00] jml: https://lp-oops.canonical.com/oops.py/?oopsid=1906G1635#repeatedstatements === gmb changed the topic of #launchpad-dev to: Performance Tuesday | https://dev.launchpad.net/ | firefighting: - | On call reviewer: gmb | https://code.launchpad.net/launchpad-project/+activereviews === almaisan-away is now known as al-maisan [10:42] jml: https://lp-oops.canonical.com/oops.py/?oopsid=1906G1635#statementlog - query 39 and 40 appear identical to me [10:42] jml: both are 'bmps where the source is for this product and both branches are visible [10:43] jml: I think we could get a little fuzzy and use 'the source is for this product and the source is visible' - because you cannot /create/ a bmp targeting a branch you cannot see: [10:44] some person subscribed to the source might see the *count* of a bmp they cannot otherwise see [10:44] but would that matter? anonymous users wouldn't get wrong data, nor would authenticated users with no access to the source branches [10:45] doing that makes the query run in 70ms [10:48] sorry to bother people---I've been searching for about 15 minutes and haven't turned it up yet [10:48] I've had an email passed to me about visual/side-by-side diffing in code-hosting of non-code [10:49] along similiar lines to https://github.com/cameronmcefee/Image-Diff-View-Modes/commit/8e95f70c9c47168305970e91021072673d7cdad8 [10:49] I remember this being discussed at length ~3 years ago, but I haven't turned up any blueprints, mailing list posts or even bugs against launchpad itself [10:50] lifeless: looking... [10:50] ah https://answers.launchpad.net/bzr/+question/121049 [10:51] "Can Bazaar meet requirements for game-development workflows?" [10:52] sladen: yeah, discussed on canonical-launchpad and canonical-bazaar too:) [10:53] lifeless: my apologies. Guess I need to join those to read the archives [10:56] jml: actually, I've got a 130ms plan [10:56] jml: with half the cost estimate [10:56] lifeless: nice [10:57] https://bugs.launchpad.net/launchpad/+bug/736008 [10:57] <_mup_> Bug #736008: Product:+code-index timeouts < https://launchpad.net/bugs/736008 > [11:01] Morning, all. === henninge_ is now known as henninge [11:29] danilos, jtv: Any idea why we have this in our code (for productseries and distroseries) ? http://paste.ubuntu.com/583755/ [11:29] danilos, jtv: I mean, why override the usage of the parent object? [11:29] ah [11:30] Asside from the fact that we get a nice loop here [11:30] because the template cound is dependend on the usage ... [11:30] Ah, that's fun… [11:31] In any case it seems like a strange fallback. [11:31] It was decided at the time that there is basically no usage as long as there are no templates, and now we do allow that again. [11:36] jtv: I understand the "usage" more like an expression of the maintainers intend. [11:36] That was historically the idea behind official_rosetta, yes. [11:37] well, the question is what are the consequences of the usage value? [11:37] But in UI terms, I thought the idea had shifted (as part of the cross-app "usage" renovation) to "there is a stated intent and there is a usable setup." [11:38] ok, but this is doing "a stated intend OR a usable setup" [11:38] which is strange. [11:39] I'd rather have LP say "The user wants to use launchpad for translations but has not yet configured it fully." [11:39] Also, the presence of templates does not imply "LAUNCHPAD" usage, it might also be external templates being mirrored in. [11:39] through a branch [11:40] which is actually a common setup we are trying to achieve for upstream message sharing. [11:40] I agree with your desire, but I'd talk to sinzui and… was it jcsackett who re-did the usage flags? [11:40] ok [11:40] thanks, jtv [11:41] jtv: oh, I just saw that there is wrong line in my paste [11:41] ? [11:41] line 5 [11:42] I hope you were not discussing that? ;-) [11:42] Now I have to look again… [11:42] Here is the correct version that reflects the current code: http://paste.ubuntu.com/583761/ [11:43] henninge: oh, that was so weird my eyes just skipped over it. [11:43] phew [11:44] danilos: ^ in case you read the backscroll [11:45] henninge: and "usage" is "self.usage"? [11:46] argh [11:46] jtv: this is the correct paste. really now. http://paste.ubuntu.com/583762/ [11:46] usage = self.product.translations_usage [11:46] sorry for the confusion [11:46] Ah, that was the sort of line my eyes assumed must be there in previous versions. :) === al-maisan is now known as almaisan-away [12:27] leonardr: did my info help with your start up event problem? [12:27] benji: yes, thanks! [12:27] good === matsubara-afk is now known as matsubara [12:59] henninge: i see that jtv recommends talking with me and/or sinzui? something about translations_usage? [12:59] TypeError: getPublishedSources() got an unexpected keyword argument 'status' [12:59] * bigjools blinks [13:05] leonardr: enabling the IJSONRequestCache.objects functionality for anonymous users causes xx-bug-obfuscation.txt to fail. I think this mechanism may be too revealing. [13:05] abentley: what's the failure? [13:05] leonardr: http://pastebin.ubuntu.com/583778/ === almaisan-away is now known as al-maisan [13:06] abentley: so the problem is not that None gets into the cache, it's that something gets into the cache that shouldn't [13:07] leonardr: yes. [13:07] i don't think "don't show the cache at all" is a winning long-term strategy. that cache is becoming more and more important [13:07] the code that populates the cache needs to learn some discretion [13:07] do you agree? [13:07] leonardr: no, I don't. [13:08] leonardr: I think it suggests that we should be specific about the data we pass on. === al-maisan is now known as almaisan-away [13:08] you may not agree with me, but i agree with you [13:08] leonardr: rather than passing whole objects. [13:08] leonardr: We should just pass the specific attributes we need. [13:08] :q [13:10] if we had the expand/filter system in place we could exploit that [13:10] what attributes do you want to pass? [13:13] leonardr: For ProductSeries, title, web link, autoimport status, project link. For Branch, unique_name, web link. For Project, translations_usage, title, web link. [13:13] abentley: i think you may have uncovered a larger bug. presumably the anonymous user is seeing the uncensored email address in the representation of the bug [13:13] what happens if the anonymous user just requests that bugtask through the web service? [13:15] leonardr: no idea. [13:15] abentley: i'm going to speak with the authority of someone who won't be working here next week :) [13:16] putting entire objects into the cache when you don't need the whole thing is inefficient, but it should not cause data breaches [13:16] there's something wrong with the way the cache is populated that's causing xx-bug-obfuscation to fail [13:17] leonardr: the email address is appearing in the bug description. [13:17] right. but why is launchpad giving an uncensored bug description to an anonymous user? either the censoring isn't happening when a json representation is requested, in which case the web service as a whole is leaking data [13:18] or when we make an internal request for the json representation of the bug, we are for whatever reason escalating the user's privileges [13:18] so they see an uncensored version [13:19] either way, refusing to show the cache at all is just papering over the problem === almaisan-away is now known as al-maisan [13:19] leonardr: if the privileges are being escalated, they're being escalated after the tal condition used to be evaluated, which is very late. [13:20] leonardr: Well, I suppose they could be escalated and then dropped by the time the tal is evaluated. But the template knows we've got an unauthenticated user. [13:33] jcsackett: Hi, yes about its intended meaning. [13:34] jcsackett: I understand the usage flags to reflect the intention of the maintainer of how to use that particular part of LP for their project. [13:35] henninge: right. as i recall, we had some confusion of what best reflected that for translations, as translations has quite a few more "nobs" than other parts of launchpad. [13:35] at least when we initially put the enums into use, translations used the enums plus some other information to determine what would be showed. [13:36] * jcsackett doesn't know what's become of that since. [13:36] jcsackett: I was particularly surprised to find this in the code http://paste.ubuntu.com/583762/ [13:36] jcsackett: ok, I guess I can feel free to redefine that, then ... ;-) [13:37] hi jcsackett, bug 740153 looks more like a LP Translations bug to me (possibly related to message sharing?) than a synaptic bug. Shouldn't it be reassigned to LP? [13:37] <_mup_> Bug #740153: Translations skewed for synaptic < https://launchpad.net/bugs/740153 > [13:37] dpm: that could be. i may have misunderstood the complaint. [13:38] jcsackett, the reported was saying that the translations in LP were overwritten by others done a long time ago, which I'd guess couldn't have happened in Synaptic itself, but in LP [13:39] dpm: ah, yes. okay, that should be targeted back to LP. [13:39] henninge: correct. there was some ongoing work on translations when we were phasing that in (perhaps recife?) that was going to require changes to the translations_usage property. [13:40] does that jive with what concerns you? [13:49] jcsackett: possibly. [13:49] jcsackett: I am trying this out here to see what happens. [13:51] jcsackett, would you mind reassigning it to LP when you've got a minute? I don't seem to be able to make it, I get a "Too many matches. Please try to narrow your search." when trying to change the product [13:51] dpm: i'm getting the same issue. :-P i'll reassign it soon. [13:52] jcsackett, ah, and I thought you launchpadders had superpowers to overcome the issue :P [13:53] dpm: we have many superpowers. in this case, the super power of being very patient with launchpad may be most applicable. :-) [13:53] :-) [13:53] abentley, adeuring: I will relocate quickly now, might be a bit late for the stand-up. [13:58] dpm: and we're good. [13:58] thanks for catching my mistake. :-) [13:58] jcsackett, awesome, thanks! [14:09] hi deryck, can i quiz you about some structural subscription minutiae? === leonardr changed the topic of #launchpad-dev to: Performance Tuesday | https://dev.launchpad.net/ | firefighting: - | On call reviewer: gmb, leonardr | https://code.launchpad.net/launchpad-project/+activereviews [14:13] bac: you can, but I'm about to do a call with abentley. Can we chat after that? [14:13] deryck: sure [14:13] ok, cool. I'll ping when I'm free. [14:18] henninge: danilos: can either of you assist me in answering https://answers.launchpad.net/launchpad/+question/148141 [14:19] sinzui: looking [14:23] sinzui: unfortunately he is right. That feature has been removed. [14:23] unfortunate for his project, that is. [14:25] henninge: Why did we remove this for projects. I think I understand the issue for Ubuntu [14:25] sinzui: I am not sure, actually. It may be pure oversight. [14:30] henninge: Should I report a bug asking that the new/translated status for upstreams be restored for projects? [14:32] sinzui: Yes, at least to have a point of discussion why this was done. [14:33] okay. Thank you [14:39] Project db-devel build #479: STILL FAILING in 5 hr 1 min: https://lpci.wedontsleep.org/job/db-devel/479/ [14:48] sinzui, I've just caught the conversation re: new/translated status in projects. When you file the bug, would you mind subscribing me or pointing me to it? I'd be interested in following the discussion. Thanks! [14:48] gmb: leonardr: could one of you please review this MP: https://code.launchpad.net/~adeuring/launchpad/no-package-translation-uploads-when-sharing/+merge/54356 ? [14:48] i got it [14:49] dpm bug 740225 [14:49] <_mup_> Bug #740225: The differences between New and Translated for upstreams was removed < https://launchpad.net/bugs/740225 > [14:49] I just confirmed another XSS vulnerability in loggerhead for bug #740142 [14:49] how do we roll out fixes for that sort of thing? [14:49] thanks sinzui [14:51] leonardr: thanks! [14:52] I'd also like to avoid pushing the fix making it public === al-maisan is now known as almaisan-away === almaisan-away is now known as al-maisan [15:03] adeuring: r=me [15:03] leonardr: thanks! === al-maisan is now known as almaisan-away === Ursinha-afk is now known as Ursinha-lunch [15:37] bac: about ready now. voice or chat? [15:37] deryck: mumble might be quickest [15:38] ok. firing it up.... [15:40] bac: what room? [15:40] bac: just drag me around ;) === salgado is now known as salgado-lunch === beuno is now known as beuno-lunch === matsubara is now known as matsubara-lunch === Ursinha-lunch is now known as Ursinha === beuno-lunch is now known as beuno === salgado-lunch is now known as salgado === deryck is now known as deryck[lunch] [17:15] sinzui: do you know much about how security adapters work? [17:19] or, anyone, same question? i'm having trouble getting an adapter (or really a new security.py file) working. [17:20] we really need to fix the ajax-forms-lose-data problem [17:20] just got burned big time [17:21] jcsackett: I do now quite a bit [17:21] by clicking a non ajax find-person link on the bug filing page :/ [17:21] jcsackett: is this about adaption to IFAQTarget or IQuestionTarget [17:21] it's a security adapter for IQuestion. [17:21] sinzui ^ [17:21] * sinzui starts mumble [17:22] * jcsackett starts it too. [17:23] sinzui: i suspect you cannot hear me. i cannot hear you either. :-P [17:24] * sinzui stabs mumble [17:24] i heard you. [17:24] i will restart mumble as well. [17:25] sinzui: unrelated, are there still long term plans to deprecate polls (there's a question in #launchpad) [17:25] jcsackett: I see the mic setting went missing. I was there 18 hours ago [17:25] They are deprecated. We do not support them. There exist only for the Ubuntu community [17:26] sinzui: i can hear you. [17:27] but clearly you cannot hear me. :-P [17:27] no I cannot [17:27] hm. [17:41] abentley: I have some questions about the TwistedJobRunner [17:42] jml: shoot [17:42] abentley: in getTaskSource, why does the producer exhaust the iterReady() iterator before the loop begins? [17:43] abentley: hmm. actually another way of putting this is, what are you trying to get out of using PollingTaskSource? [17:43] jml: I believe that's so that it runs for a finite time, instead of pulling jobs infinitely. [17:45] jml: PollingTaskSource was existing Twisted code for doing this kind of thing. [17:45] abentley: I think we can achieve the same effect more simply by using DeferredSemaphore instead of PollingTaskSource and ParallelLimitedTaskConsumer, but I'm a little bit hesitant because I don't see many tests. [17:46] Project windmill build #89: FAILURE in 1 hr 17 min: https://lpci.wedontsleep.org/job/windmill/89/ [17:46] abentley: also, that reminds me, TwistedJobRunner uses ParallelLimitedTaskConsumer with a hard-coded maximum number of 1 task at a time. What's the reason behind that? [17:47] lifeless: and as if by magic, bug 740250 appears [17:47] jml: The idea was that we could relax the maximum when we felt that would improve performance. [17:47] <_mup_> Bug #740250: Timeout accepting packages < https://launchpad.net/bugs/740250 > [17:47] abentley: fair enough. [17:47] abentley: last question, I think... [17:48] abentley: I think I've isolated the intermittent failure in TestTwistedJobRunner.test_timeout. It comes from ampoule killing the job before the custom signal handler is installed [17:48] I don't think that's possible. [17:49] abentley: I can prove it with science. [17:49] the signal handler is installed when the first job runs, right? [17:50] hmm. interesting point. [17:51] OK. I can say with much greater confidence that if I reduce the lease_length in the test from 1 to a very small number, like 0 or 0.01, then the "intermittent" failure becomes deterministic. [17:51] abentley: is it guaranteed to re-use the same process? [17:52] jml, that's what it's supposed to do, and I didn't see anything to make me believe otherwise. [17:52] abentley: good answer :) [17:53] jml: I haven't dug through ampoule's internals thoroughly, and it's been a while since I looked at them at all. [17:54] abentley: I had a bit of a poke. There's definitely something racy going on with the lease_length (ampoule's "deadline"). [17:55] jml: Okay, IIRC 0 is treated completely stupidly. [17:55] jml: I don't remember the details. [17:55] abentley: it's a bit obscured by the abstraction in job. [17:55] abentley: ok. I'll keep poking. [17:55] abentley: are there tests for the TwistedJobRunner other than TestTwistedJobRunner? [17:56] jml: There are tests for the branch_merge_proposal_jobs script, which uses it. [17:56] abentley: thanks. I'll use those as backup. [17:59] g'night everyone [18:00] bigjools: g'night. [18:03] jml: I wonder if setting the value to a very small number would cause the deadlineCall to fire before the callRemote? [18:03] abentley: that does happen. Although interestingly, the failure in the test is that the job *does* complete successfully (you can tell because the wait goes up to 30s+) [18:04] jml: That's the failure case I hadn't been able to pin down-- the unexpected success. [18:05] abentley: I guess I'll instrument ampoule & see what happens. [18:08] jml: I think I was wrong about the 0 handling, or else it was improved. In any case, that's the timeout handling, and we're using deadlines, IIRC. === deryck[lunch] is now known as deryck [18:14] oh. [18:14] hm. [18:14] abentley: it uses the same mechanism, afaict. [18:15] http://paste.ubuntu.com/583909/ [18:15] jml: well, similar, but without special-casing 0. [18:15] right. [18:35] leonardr: I'm writing some code that wants to treat an object the same whether it's from the LP.cache or retrieved from the web service. [18:35] abentley: that's a good place to be going. what problem are you having? [18:36] leonardr: But WS objects use .get('foo') and LP.cache objects use ['foo']. [18:36] probably because ws objects are wrapped in a LP.Entry class (or whatever it's called), and LP.cache objects are just data structures [18:37] leonardr: right. [18:37] you should be able to wrap the data structure yourself, and i'd even say that should be done automatically [18:38] leonardr: sounds reasonable. [18:57] benji, i've once again reached the point where i can't stand to work on this multiversion stuff anymore. do you want to talk about filter/expand? [18:59] leonardr: sure, say 10 minutes from now? [18:59] ok [19:01] abentley: OK. I've got it figured out. http://paste.ubuntu.com/583931/ [19:01] abentley: the signal is being sent before the callRemote call... [19:01] (as you said) [19:02] abentley: what this means is that the signal handler is triggered, raises an error, and then ... well, who cares? it's not going to terminate the process. [19:03] jml: Cool. That's great you were able to track it down. [19:04] jml: so we need to tweak ampoule so it doesn't try to kill processes until they have launched? [19:04] abentley: not sure. in this case the process has launched, it's just that the command isn't running. tbh I'm not sure by what mechanism an exception that is raised in a signal handler is propagated out. [19:04] abentley: experimenting now... [19:08] *maybe* something to do with hooking up the OOPS system to the Twisted log? [19:09] errors raised in sighup handlers while reactor is running get treated as Unhandled errors, I think. [19:12] benji: i'd like to start the chat in irc if that's ok === matsubara-lunch is now known as matsubara [19:12] leonardr: sounds good [19:13] i'm vacillating on having it here vs. in another room where we won't flood everyone else's conversations [19:13] jml: I see what you mean. [19:13] like the defunct #launchpad-meeting [19:14] anyway, let me try to find the stuff i wrote on this before [19:16] here we go: https://dev.launchpad.net/Foundations/Webservice/DraftProposal [19:17] i think the only thing not mentioned in that document [19:17] leonardr: we can do it in PM, or we can just fabricate a chan., like #expand-and-filter [19:18] let's do it in pm, i don't think there's all that much to discuss now that i've found that document [19:18] since it never got much further than that document [19:18] k [19:18] basically i just think this is really important to the future of the web service and i don't want it to fall through the cracks when i leave [19:19] yep, agreed [19:25] abentley: so actually, it's just that the exception is raised in whatever code happens to be running at the time. maybe we can store some state ("hey, we got HUPped") and then raise during runJob if we see that state? [19:29] jml: The exception being TimeoutError? [19:30] abentley: yes, or indeed anything that 'handler' might raise. [19:30] jml: And if the job hasn't started, that would be raised in ampoule code, I guess. [19:31] jml: And then, for some reason, it's not causing the process to terminate. [19:31] abentley: sort of. It will be raised as an unhandled error, because it's during the reactor.run() in BOOTSTRAP [19:31] abentley: e.g. http://paste.ubuntu.com/583943/ (simplified version) [19:31] the reactor keeps running [19:32] and whatever things were set up to listen to connections are still set up [19:32] in this case, the AMP protocol handlers [19:32] jml: The reactor in the child process keeps running? [19:32] abentley: yes. [19:32] abentley: if the SIGHUP happens between jobs, it's only the child process that ever learns about the raised error. [19:35] jml: I guess your solution makes some sense, but it doesn't feel satisfying. What if no job ever gets run in the child? [19:36] abentley: I agree that it doesn't feel satisfying. I don't know what would happen in that case, or what we desire to happen. [19:36] abentley: I guess the timeout/deadline/lease is tied to the job rather than the child process. [19:36] abentley: so maybe it wouldn't matter. [19:36] jml: true. [19:37] jml: What about the case where the timeout fires after the job completes, but before the next is dispatched? [19:37] abentley: then in the simplest implementation, the next job would "time out" instantly [19:38] or at least, that's what it would look like. [19:38] I'm also worried about making timeout code too complex. It's supposed to be something you can rely on when other things fail. [19:39] jml: Fully agreed. [19:39] jml: Can we tweak this so that the child always dies when we raise TimeoutError? [19:39] abentley: probably. it's easy enough to sys.exit or reactor.stop or something. how does ampoule handle dead children? [19:40] sinzui: can i set you as help contact in #launchpad? [19:41] even if we stopped the process, we'd have to figure out how to tell the parent that we did it because of a timeout. I guess we have OOPS logging available to us from BaseJobRunner. [19:43] jml: I don't know how it handles dead children, but it expects timeouts to kill children, so it should expect this. [19:44] abentley: OK. I'll give it a try, see how it goes. [19:44] relatedly, I'd like to have a few more direct unit tests for this code. [19:44] not tonight though. [19:45] Yippie, build fixed! [19:45] Project db-devel build #480: FIXED in 5 hr 5 min: https://lpci.wedontsleep.org/job/db-devel/480/ [19:45] jml: I guess at the time it seemed like there weren't a lot of units to test. [19:47] abentley: I know the feeling. maybe there aren't, but after debugging this for a while I'd like to give it a try. [19:51] abentley: we'll have to do some work to translate the death into a TimeoutError, but I think it can be done [19:52] but that's enough from me for this evening [19:52] abentley: thanks for the help. [19:52] jml: no problem. [19:52] jml: good to get some fresh eyes on this. [20:11] g'night all. [20:12] jcsackett: I am so sorry. I forgot :( [20:12] sinzui: no worries, it's been a quiet day. [20:13] i just didn't want to switch you over if you were busy. [20:58] hi sinzui [21:04] bac [21:05] benji: hi. did you get a chance to look at my lazr restful enhancement? i was hoping to check with leonard but he's not around atm it seems [21:10] wallyworld_: yep I looked at it and didn't see anything bad jump out at me, I'd like to give it some more thought and get back to you tomorrow, how's that? [21:10] hi sinzui [21:10] hello [21:11] i wanted to ask you about the new hidden menu items and the use of the invisible-link class [21:11] it doesn't seem to do what i thought it did, i.e. make the link invisible [21:11] benji: that's fine thanks. i'll continue on with a branch to use the new stuff to fix the actual bug and we can revisit when you are ready [21:12] sinzui: was i correct in understanding the link would be present but invisible until i removed the class in JS? [21:13] bac: yes [21:13] hmm, ok [21:13] bac: let me look for a page with such as link [21:37] Yippie, build fixed! [21:37] Project windmill build #90: FIXED in 1 hr 8 min: https://lpci.wedontsleep.org/job/windmill/90/ [22:09] wgrant: ping me when you're back [22:55] wgrant: back yet? [23:23] lifeless: Hi. [23:24] wgrant: so, why is the id desc sort important? [23:25] lifeless: For the POST? It's probably not; that query shouldn't be happening at all. [23:25] For the GET... what other order is there? [23:25] Apart from date_created, which is the same issue. [23:25] wgrant: so I proposed distroseries desc, status desc, archive desc, PackageUpload.id DESC [23:26] wgrant: distroseries and status are constants [23:26] so this is known as 'noddy fool the planner stuff' [23:26] archive has two values - 1, 534 [23:26] It's fooling the planner into making a bad decision. [23:26] wgrant: good decision [23:27] Bad, because the ID sort is unindexed in that case. [23:27] This will probably exacerbate timeouts in the DONE case. [23:27] wgrant: huh, no, id is in the index [23:27] Oh, so it doesn't use it even with the full index? [23:27] I suspect your index is buggy. [23:28] packageupload(archive, distroseries, status, id) is what I think is ideal. [23:28] create index packageupload__distroseries__status__archive__idx on packageupload(distroseries, status, archive, id); [23:28] Ahh. [23:28] is what we tried [23:28] Interesting. [23:28] That's better for this particular query, but worse for others, but mine worked fine on mawson. [23:29] Can you tell why the planner is avoiding the index? [23:29] Limit (cost=0.00..105.89 rows=31 width=36) (actual time=0.106..0.108 rows=1 loops=1) [23:29] -> Index Scan Backward using packageupload__distroseries__status__archive__idx on packageupload (cost=0.00..5365.99 rows=1571 width=36) (actual time=0.104..0.106 rows=1 loops=1) [23:29] Index Cond: ((distroseries = 104) AND (status = 0)) [23:29] Filter: (archive = ANY ('{1,534}'::integer[])) [23:29] Total runtime: 0.159 ms [23:30] wgrant: what are the other queries? theres no reason for use to ues the same query for different cases [23:30] bah, you know what I mean [23:31] lifeless: Everything except that page is driven by archive, not distroseries. So having archive first is going to be more generally useful. [23:31] wgrant: anyhow, *why* do we need id to sort before archive *on this page* [23:32] lifeless: Because it's meant to be a chronological ordering, not have every partner upload ever first. [23:32] Perhaps if primary was first. [23:32] But definitely not partner. [23:32] wgrant: this is for processing NEW right? [23:33] lifeless: Not just NEW, but yes. [23:34] lifeless: The other statuses get looked at to see what has happened to various packages. [23:34] And having to scroll through an indeterminate number of pages of partner to get to the latest uploads is not going to make people happy. [23:34] https://bugs.launchpad.net/launchpad/+bug/276950/comments/23 [23:34] <_mup_> Bug #276950: DistroSeries:+queue Timeout accepting many packages queue page < https://launchpad.net/bugs/276950 > [23:34] Both because it's stupid, and because it's partner. [23:35] lifeless: What if you try another series? [23:35] The planner is being really stupid here. [23:36] wgrant: this is ubuntu [23:36] no ? [23:36] lifeless: Ubuntu and Canonical Partner. [23:36] yes [23:36] wgrant: so, I propose we take a different approach to fixing this [23:37] wgrant: a) define the scope [23:37] b) solve that scaope [23:37] c) wait for others to have issues [23:37] theres no a-priori requirement to have the same query generated by python for different use cases. [23:38] The same query is fine. [23:38] PostgreSQL's planner is not. [23:39] folk shouldn't be scrolling through pages and pages to get to things [23:39] thats nuts anyway [23:39] so I really don't think we should care about status not in (0,1) [23:40] and for status in (0,1) its a work queue thats meant to be driven to zero; it shouldn't normally matter if we group on archive or not [23:40] These postgres workarounds are piling up. [23:40] But OK. [23:41] we can do a partial index - (date_created desc, archive, distroseries) where status in (0,1) [23:41] that models the work queue case accurately. [23:41] which is *always* going to be lopsided [23:41] the non-queue case is 2.7MILLION rows [23:42] lifeless: And the indexes handle that trivially, when they are properly used. [23:42] But we cannot make them be properly used. [23:42] wgrant: storing a work queue in a historical table is improper use [23:42] Because postgres doesn't let us override its bad judgement. [23:42] wgrant: thats why postgresql is having trouble [23:42] lifeless: It depends on your definition of "proper", but probably. [23:43] proper = will work well for the DB stack we're using [23:43] I would s/proper/hackish/ [23:44] wgrant: working against the stack will be hard [23:44] wgrant: its not adroit :P [23:44] Sure. [23:44] But it's still a really bad hack. [23:44] We probably have no choice. [23:44] wgrant: what makes it that ? [23:44] But that doesn't make it good. [23:44] wgrant: seriously, we're not using bdb [23:45] lifeless: We are changing data models and rewording queries to convince the postgres planner to do exactly what we want it to do. [23:45] When we know exactly what we want it to do, but we can't tell it that. [23:45] wgrant: you're talking about query hinting ? [23:46] wgrant: how would you describe what we want the planner to do? [23:46] wgrant: (in english) - because looking at the indices *and the query constraints* I think its being pretty sane [23:47] specifically, its looking for a small number of rows near the end of the table by walking backwards. [23:47] its been told: [23:47] - we don't want every row (limit 31) and we want the beginning of the sequence (offset 0) [23:47] so its a very smart choice. [23:48] lifeless: It's avoiding using an index which is perfect for that query. [23:48] We know that. [23:48] But the skewed data prevents it from knowing that. [23:48] packageupload__distroseries__status__idx isn't perfect for the query [23:48] With archive and id it is. [23:50] wgrant: results of hte partial on the bug [23:50] Thanks. [23:51] wgrant: the index isn't perfect - archive 1 + 534 - 1M rows [23:52] Is comment #24 corect? [23:52] How is the index condition distroseries, when date_created is first in the index? [23:53] its scanning the index [23:53] it can't select at the root [23:53] but because its only looking at status 0,1 its very cheap [23:54] Ah, true. [23:54] it can satisfy the distroseries rule from the index [23:54] before looking at table row [23:54] We will probably need to revisit this in a year or so. [23:54] Once we have a derivative distro or two. [23:54] Index cond -- index tells us, filter -- table row is used. [23:54] and - win [23:54] we can make the current query use it [23:55] How fast? [23:55] 7ms [23:55] Not bad. [23:55] I can live with it [23:55] It still leaves accepted/rejected/done timing out often, but I guess that is less of a user-affecting issue. [23:57] wgrant: that use case needs to be revisted anyway [23:57] wgrant: it should be a search etc, not a freaking pagination [23:58] wgrant: showing 1M rows of done in a pagination view is very hard to do at all, let alone well [23:58] Indeed. [23:59] wgrant: I'm going to prep a db patch to add the index, and get it applied live [23:59] lifeless: Thanks. [23:59] I think jam has halted(), we should do loggerhead ourselves