[00:05] thumper: did you really mean to link to bug 240067 ? [00:05] <_mup_> Bug #240067: Launchpad projects need wikis [00:09] lifeless: no [00:09] lifeless: I've fixed it [00:09] I had two bugs open [00:09] and I linked to the wrong one :( [00:09] * thumper heading afk for hair cut and lunch :) [00:14] Project devel build (109): STILL FAILING in 1 hr 28 min: https://hudson.wedontsleep.org/job/devel/109/ [00:14] * Launchpad Patch Queue Manager: [r=allenap][ui=none][bug=656759] It's now possible to subscribe with, [00:14] and filter direct subscriptions by, a BugNotificationLevel. [00:14] * Launchpad Patch Queue Manager: [r=adeuring][ui=none][bug=651240] The Bug:+subscribe functionality [00:14] has been moved into its own view class. [00:35] * wgrant deletes lucilleconfig. [01:15] james_w: around? does that develop= line work to use regular non-buildout projects ? [01:19] dunno [01:19] lifeless, I believe the result is to run "setup.py develop" in those dirs, and then add a .egg-link to them [01:19] and I don't think that requires them to use buildout [01:19] it wouldn't really surprise me if it didn't work though [01:20] develop isn't in standard setup.py [01:21] hmm, it may work, something funny with the iport thought [01:22] \o/ [01:22] >>> fixtures.PopenFixture [01:22] [01:25] depends what you mean by "standard", it's in setuptools, so a fair number of projects will support it, and they don't have to use buildout to have it [01:27] distutils 4 eva [01:27] anyhow, develop= worked with python-fixtures [01:27] which is nice === Ursinha is now known as Ursinha-afk [01:44] Project devel build (110): STILL FAILING in 1 hr 29 min: https://hudson.wedontsleep.org/job/devel/110/ [01:44] * Launchpad Patch Queue Manager: [r=leonardr][ui=none][bug=650343] Add X-Launchpad-Original-To to [01:44] recipient lists; [01:44] move some related files from c/l/mail to lp/services/mail. 
[01:44] * Launchpad Patch Queue Manager: [r=leonardr][ui=none][bug=656213] Allow .lpconfig to be honored if [01:44] initial configuration is "development" [01:45] """Manage a publisher configuration from the database. (Read Only) [01:45] This class provides a useful abstraction so that if we change [01:45] how the database stores configuration then the publisher will not [01:45] need to be re-coded to cope""" [01:45] What an *awesome* idea. [01:45] uhm [01:45] yagni [01:45] Yes. [01:46] We haven't needed it in more than five years, so I'm deleting it. [02:20] wallyworld, hey. [02:20] rockstar: hiho [02:21] see my email? [02:21] wallyworld, just out of curiosity, could you `make clean build` and then run the windmill test? [02:21] that's what i did :-( [02:22] wallyworld, hm. [02:23] rockstar: any thought's? looks like YUI() is borked in windmill [02:24] wallyworld, this is the Y.io() thing, right? [02:24] yeah [02:24] and also any other YUI eg Y.all() [02:24] or maybe not Y.all - it may have been run on the wrong instance. i need to check that [02:25] but Y.io is definitely broken [02:25] wallyworld, what happens if you -D, wait for it to fail, and then try and look up the url being requested in the url bar? [02:26] rockstar: good question. hold that thought and i'll get back to you [02:27] wallyworld, I'm also curious to know whether or not a YUI 3.2 update would make it work. [02:28] rockstar: i'm not yet familiar enough with the failure mode nor what's new in 3.2 to know if it will help or not. can't hurt to try i guess [02:28] wallyworld, well, as I upgraded, I found that lots of bugs were hidden in our javascript that YUI 3.2 doesn't pass silently. Maybe that might help as well. [02:28] wallyworld, I don't know, we're at a weird impass now. [02:29] rockstar: i'll check this one other thing. perhaps you could grab my branch and try it with 3.2? [02:30] wallyworld, sure. One sec. 
[02:30] rockstar: bin/test -vvt test_branch_broken_links <--- for when you have the branch pulled [02:30] rockstar: hey [02:30] lifeless, hi sir. [02:31] rockstar: I have a question about windmill [02:31] rockstar: what will happen if I run two windmill tests concurrently? [02:31] lifeless, I SEVERELY doubt I have the answer, but I can probably get as close as anyone... [02:32] lifeless, I think, were you to do that, you'd want them running in separate processes at least. They'll need their own instances of the browser they fire up. [02:32] rockstar: will doing xvfb-run be enough to do that ? [02:32] lifeless, no idea. [02:33] I'll ask on twitter :) [02:34] lifeless, :) [02:34] lifeless, mikeal may have a good answer for you. [02:34] I wish I knew more about windmill, so I could actually fix it. [02:54] rockstar: the client call to the server side endpoint works fine from a console on the test browser, where the Y.io call fails [02:55] rockstar: ie window.open('http://code.launchpad.dev:8085/+check-links?link_hrefs={}') [02:55] wallyworld, what if you use the actual browser that it brings up? The Firefox instance? [02:55] yes, that's what i'm using [02:56] i open the firebug light console [02:56] Is that the exact url that is getting requested? [02:57] almost: i have cut out the actual json guts between the {}. with an empty {} it just returns {} so the end-end works [02:58] i can open a new tab and type the url directly or use the firebug console and do a window.open() [02:59] do our tests for windmill go through apache?! [02:59] lifeless, no. [02:59] whew [02:59] so the 8085 is just a daft default? [02:59] Yeah. [03:00] a default that's typed in many, many places in the code [03:00] wallyworld, where? [03:01] wallyworld: it shouldn't be; but if it is replace it with a config lookup please [03:01] * wallyworld looks at the code [03:01] wallyworld, canonical_url should just give it to you correctly. [03:02] * rockstar wrote that patch... 
[03:03] lots of tests (doc and py) have code like browser.open('http://launchpad.dev:8085/+login') [03:04] ugh [03:04] wallyworld: care to file a bug? [03:04] (or better yet, just fix) [03:04] lifeless: will do. [03:05] lifeless: after rockstar and i get this @!^#@#@@& YUI and windmill issue fixed [03:05] naturally [03:08] rockstar: i'm hungry. i'll grab a bite to eat and check back after that on the progess using YUI 3.2 to see if it sheds more light on the ajax breakage [03:09] wallyworld, windmill tests are running now. [03:09] rockstar: ok. i'll wait a bit. did you try the specific one i mentioned? [03:10] wallyworld, no, because I'm also trying to fix my test. :) [03:10] np [03:10] wallyworld, I'm testing lifeless' query right now... [03:11] My laptop is not happy. [03:11] rockstar: from our discussions, the root cause of the ajax issue for your and my tests is likely the same anyways i think? [03:12] wallyworld, I'm not so sure. [03:13] rockstar: ok. i'll wait for a progress report from your side of things. in the meantime, food awaits [03:13] rockstar: about parallel testing? [03:13] lifeless, yes. [03:13] rockstar: I'd expect things to blow up hugely [03:14] rockstar: but not because of windmill [03:14] lifeless, no, it was windmill. [03:14] rockstar: is there some way to identify windmill tests a priori? [03:14] lifeless, so, the second instance of bin/test died because port 8085 was in use. [03:15] rockstar: right, thats not a windmill issue [03:15] lifeless, I think that means that we might be able to run it fine once everything is set up. [03:15] rockstar: theres a long chain of things to fix ;) [03:15] rockstar: the testdb, all the external helpers, the oops dir, the log dir [03:16] lifeless, but there's also probably a huge penalty in starting another firefox instance. 
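The port clash described above (a second bin/test dying because 8085 was already in use) is the usual argument for looking a port up instead of typing it into tests. A minimal sketch, assuming nothing about Launchpad's own config API: `find_free_port` and `base_url` are hypothetical helpers, and binding to port 0 simply asks the OS for an unused port.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 means "pick any free port for me".
        sock.bind((host, 0))
        return sock.getsockname()[1]


def base_url(hostname, port):
    """Build a test URL from a looked-up port (hypothetical helper)."""
    return "http://%s:%d/" % (hostname, port)


if __name__ == "__main__":
    port = find_free_port()
    print(base_url("launchpad.dev", port) + "+login")
```

Note that the port is released again when the socket closes, so another process could grab it before the test server binds. This only removes the fixed-port collision; the other shared resources listed in the conversation (test db, oops dir, log dir) are untouched.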
[03:18] poolie: I think it's a bad idea to link the code style guide from the front page [03:19] poolie: but I'd like to know why before rolling your change back [03:19] i'd like to know why too :) [03:19] I think it's a bad idea because really all the guides need to be read, to write patches for LP [03:20] special casing that one makes it feel like that element on the front page is a nested table of contents [03:20] which it's not [03:20] poolie: your turn :) [03:20] it's the one i've found myself linking to or looking at most [03:21] obviously other people may care more about js style [03:21] feel free to revert [03:21] I'd be happy to list them all there and delete the list-page of the guides [03:21] there's no content on the indirection page [03:21] what do you think? [03:22] you could make 'Style guides' a new blue-ish section [03:22] also 'architecture guideline' is not really a style guide [03:22] blue section ? [03:22] heh, doubly so because it does not exist :) [03:23] like 'process', 'tools', [03:23] an h2 or h3 or whatever [03:23] 'does not exist' meaning the first link on https://dev.launchpad.net/StyleGuides is broken [03:24] * rockstar suspects the js style guide needs some updating. [03:27] lifeless: so that's settled? [03:27] inlined it on the front page [03:27] great [03:27] glad i could push it into a better direction :) [03:43] how much memory does a bin/test process need [03:44] poolie: we can hope it is better :) [03:44] lifeless, I guess it depends on what tests it's running. [03:44] rockstar: peak memory is what I need [03:44] like, if the machine has 2 GB [03:44] and 16 procs [03:44] should I start 16 test runners [03:44] or 4 [03:45] 16 cores [04:27] bac: is that the threading thing [04:27] ? [04:27] lifeless: yes [04:27] test_network [04:28] lifeless: do you have a sec to talk about the metric in the ArchitectureGuide wrt 2 second tests? 
[04:29] sure [04:29] so i have a branch that adds a new test, it runs four tests and the test case is instantiated for five different pillars [04:29] do you mean irc or voice when you say talk [04:29] lifeless: here is good and my voice is gone today [04:29] sure [04:29] so it runs 20 tests and was taking 42 seconds with the straightforward (naive) implementation [04:29] so, new test, 4 cases, parameterised over 5 pillars [04:30] the first question I have is why it needs parameterisation (can things actually vary) [04:30] i started playing around with it yesterday and got it down to <11 seconds with the same coverage [04:30] good, yeah? [04:30] bac: that's excellent. [04:30] bac: I presume that's not including the layer setup time. [04:31] well, in order to do that i had to collapse the four tests into one so the expensive setup was not called multiple times, violating our guideline to only test one thing at a time [04:31] what makes the setup expensive? [04:31] that was the bulk of it, caching some properties accounted for the rest, which jtv objected to [04:31] lifeless: creating objects [04:31] bac: if caching properties makes the test suite faster, it doesn't necessarily follow that prod will be faster - but it may well. [04:32] bac: if it's noticeably faster, it probably will make prod faster. [04:32] anyhow, back to your question. [04:33] well, this is not really a production issue. the test is just to verify that the correct anti-robot meta is included in the rendered templates based on LP usage [04:33] so the speed or slowness of the test really has nothing to do with performance in production [04:33] 2 seconds is a statement of where we'd like to be. If we can't reach it right now, I suggest analysing *why* far enough to be confident that we know what to fix, file a XXX bug on that (e.g. creating DB test objects is slow) and move on. 
[04:33] however I'd be utterly delighted if you were to keep driving it down to a sensible figure [04:34] bac: I have to disagree about its relevance for production [04:34] lifeless: my question is this: what do we value more: test readability and separation or speed? [04:34] bac: all three [04:35] bac: in this case, for instance, you can use a layer (ugh, but thats the current tool) to provide the same expensive setup for all 20 tests, and have separate clear test code per case [04:35] and it should perform identically [04:35] except that you'll need to intercept the db layer rollback (and there is a flag for that to let you do so) [04:37] so in your layer setUp to set the 'do not rollback' flag, and in your layer tearDown you restore it. [04:38] lifeless: here is the original: http://pastebin.ubuntu.com/513562 [04:38] bac: I'm still staggered that setUp is so slow for your test [04:39] and here is the restructured one http://pastebin.ubuntu.com/513561 [04:40] lifeless: i make the argument about no effect on production b/c all i have done is make changes to the structure of the test not any of the production code [04:40] so you're saying that canonical_url and getUserBrowser are sufficiently slow [04:40] bac, I'm happy to do reviewer meeting duty next week. [04:40] we use canonical_url in prod [04:41] bac: I agree, seeing that, but its data we should gather for where prod performance issues are [04:41] lifeless: right. but i haven't sped them up, i've avoided them [04:41] rockstar rocks [04:41] the getUserBrowser cachedproperty looks like it will get no hits, FWIW [04:42] how's that? 
getRenderedContents is called four times [04:43] oh, I see [04:43] thanks [04:43] so, I think that that test is fine [04:43] personally I'd structure it as a matcher [04:43] then you wouldn't need a mixin [04:44] but doing multiple similar checks on one fixture are fine [04:44] totally readable [04:44] lifeless: i guess my main point is i'm dissatisfied b/c i find the explicitness of the original appealing [04:44] but dog slow [04:44] i did not know about using a new layer to accomplish what you suggest [04:45] i may pursue that and if it works send out an email as a case study [04:45] bac: I wouldn't in this case, its nice and clean, not at all what I envisioned [04:45] bac: please don't, layers are on the way out [04:45] oh, good [04:46] ok, well maybe i'll just send an email with what i've done and the four-fold increase in speed [04:46] I'd suggest you look into testresources or similar, but thats conflicts with layers, so its not a good learning point just yet. [04:46] anyhow, If you were testing many different things in those cases, I'd understand and agree with it being better to have 4 cases [04:46] lifeless would you mind putting your stamp on the MP for these changes [04:46] but I think you're really testing a single contract [04:47] the contract being the 4 cases where robots are blocked. [04:47] whats the MP ? [04:47] https://code.edge.launchpad.net/~bac/launchpad/bug-652280-pg-trans/+merge/38195 [04:48] you couldn't use cached property here anyway, you need to reload the browser each time, you could if you called .open(), but the thing you were measuring was probably the open and render time anyhow. [04:48] (looking at the diff in the MP I see that you're not caching) [04:50] lifeless: i do reload the browser each time [04:50] cool [04:51] lifeless: the MP is not yet updated [04:51] bac: I've stamped my opinion on the bits in question. 
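The trade-off discussed above, separate readable test cases versus paying for an expensive setup only once, can be sketched with plain `unittest`. This is an illustration only, not the Launchpad layer or testresources mechanism mentioned in the conversation; `ExpensivePage`, its contents and `run_suite` are made up for the example.

```python
import unittest


class ExpensivePage:
    """Stand-in for a slow-to-build fixture (hypothetical)."""

    builds = 0  # counts how many times the fixture was constructed

    def __init__(self):
        type(self).builds += 1
        self.contents = '<meta name="robots" content="noindex,nofollow" />'


class TestRobotsMeta(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # Built once for the whole class instead of once per test method.
        cls.page = ExpensivePage()

    def test_has_robots_meta(self):
        self.assertIn('name="robots"', self.page.contents)

    def test_blocks_indexing(self):
        self.assertIn("noindex", self.page.contents)

    def test_blocks_following(self):
        self.assertIn("nofollow", self.page.contents)


def run_suite():
    """Run the tests and return (tests_run, fixture_builds, ok)."""
    ExpensivePage.builds = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRobotsMeta)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, ExpensivePage.builds, result.wasSuccessful()
```

Each check keeps its own name and failure report, while the fixture is constructed once. The catch is the usual one for shared fixtures: the tests must not mutate `cls.page`, or they stop being independent.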
[04:51] I haven't dug really deep, but I see you've had a thorough discussion anyhow ;) [04:51] yeah,thanks [04:53] lifeless, just saw your tweet. I think you need to at least have two separate firefox instances. [04:54] rockstar: why? [04:54] I mean, if we need it, we need it [04:54] but its going to suck having 8 concurrent firefoxes doing their thing [04:55] or 16 on serious desktops [04:55] e.g. [04:55] robertc 15152 2.9 4.2 412104 86656 pts/1 Sl+ 14:56 0:01 | \_ /usr/bin/python -S bin/test -vt test_parallel --parallel --subunit [04:55] robertc 15167 7.4 6.5 399240 134248 pts/1 S+ 14:56 0:03 | \_ /usr/bin/python -S bin/test -vt test_parallel --subunit --load-list /tmp/tmpSbVCl6 [04:55] robertc 15168 7.5 6.5 399320 134244 pts/1 S+ 14:56 0:03 | \_ /usr/bin/python -S bin/test -vt test_parallel --subunit --load-list /tmp/tmpsNzicF [04:55] lifeless, I think the interface to remotely control firefox is per browser. I remember mikeal's pycon talk about it. [04:56] ah [04:56] still, out of scope for me -just yet- [04:58] jml: I'm sad; someone fucked with list-tests and now it prints test descriptions not ids, and prints other guff too. [05:01] jml: either that or it never really existed. [05:11] ooooh yeah [05:11] parallel testing working. |o| [05:11] of course, it'll blow up trivially on stuff like rockstars experiment [05:13] Or, say, database access? [05:14] that will be the height of hilarity. [05:24] StevenK: does your hudson have resources to run a parallel version of launchpad tests? [05:24] StevenK: one that I expect to fail massively. [05:57] wgrant: https://code.edge.launchpad.net/~lifeless/launchpad/paralleltests if you want to play [05:57] can anyone tell me the difference between zope.app.pagetemplate and zope.pagetemplate? 
[06:00] generally zope.app stuff is more tightly tied to the application server environment [06:00] often it is glue between generic and specific stuff [06:00] Lots of things from zope.app.* are being moved into zope.* as part of the ZTK rework. [06:00] So zope.app.pagetemplate could just be a deprecated alias now, or it could still have some functionality of its own. [06:01] i have a patch for zope.pagetemplate.engine.py - is the best course of action to ask on the zope dev list. i assume htere is one? [06:02] wgrant: excellent! [06:02] bah [06:02] wallyworld: ^ [06:02] https://mail.zope.org/mailman/listinfo/zope3-dev is the list. [06:02] But you might be better off talking to some of our Zopeish people... [06:03] lifeless: it's a simple one. i've replaced if getattr(object, '__class__', None) == dict: with if zope_isinstance(object, dict): [06:03] yay [06:03] so, find the project in lp and propose a merge proposal [06:03] they are usin gLP [06:03] it restores the short circuit for menu links when used with the branch i just did for performance improvements [06:03] lifeless: They're even using LP for MPs? [06:03] wgrant: yes [06:03] Because they still use zope.org svn. [06:03] what's gLP? [06:03] incoherently [06:04] wallyworld: using LP [06:04] ok. i had a quick look at lp. there's a few zope projects there. i wasn't sure if we were just mirroring their stuff or not [06:04] or if i had to wander over to zope.org [06:04] either, AFAIK [06:05] ok. thanks [06:05] wgrant: thanks also for the info [06:05] perhaps both is best [06:06] wgrant: this is what you'll get: [06:06] psycopg2.ProgrammingError: database "launchpad_ftest" already exists [06:06] :) [06:06] with --parallel on db tests [06:07] lifeless: Ah, I see. [06:07] Still, that should be reasonably easy to fix. [06:09] wgrant: like I say, there's a list of things to fix ;) [06:10] * wgrant is demolishing lucilleconfig while trying to avoid finishing assignments. 
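The one-line zope.pagetemplate change quoted above swaps an exact class comparison for an isinstance-style check. A sketch of the difference in plain Python, with the builtin `isinstance` standing in for `zope_isinstance` (which is not reproduced here); `MenuLinks` is a made-up subclass, and the log does not say why the change restores the short circuit.

```python
class MenuLinks(dict):
    """Hypothetical dict subclass, used only for illustration."""


def is_dict_by_class(obj):
    # Shape of the original check: exact class comparison.
    return getattr(obj, "__class__", None) == dict


def is_dict_by_isinstance(obj):
    # Shape of the replacement, with builtin isinstance as a stand-in.
    return isinstance(obj, dict)


if __name__ == "__main__":
    for value in ({}, MenuLinks(), []):
        print(type(value).__name__,
              is_dict_by_class(value),
              is_dict_by_isinstance(value))
```

The two checks agree on a plain dict and on non-dicts, and differ on subclasses. How either behaves on proxied objects is outside this sketch.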
[06:11] lifeless: I'm not sure how to get Hudson to run arbitary branches [06:12] TBH, I'd just throw it at ec2 and see what happens [06:12] StevenK: set up a new job, also of trunk [06:12] StevenK: you misunderstand me [06:12] StevenK: I've added a new bin/test option, --parallel. [06:13] StevenK: due to our test suite making many inappropriate assumptions, this won't work for all tests yet. [06:13] StevenK: I want a ratchet, a visible progress marker. [06:14] StevenK: so I want it run, with as many cores at once, to find unknown issues [06:14] https://hudson.wedontsleep.org/job/parallel-test/ [06:14] and when it starts to pass most tests, we can organise a dedicated burn-down window for it [06:15] StevenK: awesome; you'll need --parallel in the job options (I can't see the config can I?) and my branch to land. [06:15] bbs [06:15] lifeless: I can add an account for you and you can fix the tests [06:16] fix the tests? [06:16] Er, fix the config [06:18] its just --parallel on bin/test :) [06:18] verry simple. [06:18] I don't run bin/test, I call make check [06:18] you can validate it using lp:~lifeless/launchpad/paralleltests [06:18] ah [06:18] ok account me up [06:18] we should have openid sometime soon [06:21] lifeless: I've added the ACLs, now how do I actually create an account for you? [06:21] depends [06:22] oh, you've made a warthogs user already ? [06:22] Yes [06:22] has it got write? [06:22] No, but it can [06:22] lifeless: You just want to use warthogs? [06:22] ok, I can login as that [06:23] yeah, why not.. simples [06:23] lifeless: Okay, you should be able to configure parellel-test [06:23] yea [06:24] lifeless: You can turn off IRC notification if you wish, but e-mail notification isn't set up [06:26] running a test build [06:27] Hm, that should start another executor, not block on devel [06:29] Ah, there it is [06:42] Bleh, how are people landing things when PQM tells me we're in testfix? 
=== almaisan-away is now known as al-maisan [06:56] StevenK: failures? [06:57] lifeless: On db-devel [06:59] right [06:59] file a bug :) [06:59] And the failures look strange [06:59] a librarian process is probably still runnning [07:00] *or* was interrupted so violently that the pid wasn't removed. [07:00] we need to figure out which [07:00] and file a bug to prevent it happening again [07:00] Whoops. [07:00] Yes, I thought so too, so I suspect the next step is stab buildbot? [07:00] 2400 line branch. [07:00] StevenK: no [07:00] we should never stab [07:00] we should analyse the cause [07:00] gather data [07:00] and make sure we have enough info to fix [07:00] lifeless: Even when the failure is 'slave lost'? :-) [07:01] *then* stab [07:01] StevenK: yes. [07:01] why was it lost? [07:01] where did it go? [07:01] what happened? was it manually rebooted? did the machine crash? [07:01] slave lost is usually buildbot going gaga [07:01] its the lack of analysis and debugging that has us in this situation. [07:02] (Cough, hudson) [07:02] right [07:02] Speaking of, there's a new one, apparently [07:02] but we'll have issues with hudson that need similar care and detailed attention to fix, too. [07:03] Project parallel-test build (1): FAILURE in 26 min: https://hudson.wedontsleep.org/job/parallel-test/1/ [07:11] Project parallel-test build (2): STILL FAILING in 1 min 39 sec: https://hudson.wedontsleep.org/job/parallel-test/2/ [07:12] StevenK: can you kill the executor ? [07:13] lifeless: Done [07:22] lifeless: Just thinking about it, I didn't think anyone had access to the buildbot slaves? [07:23] they don't [07:23] so someone needs to work with the losas to diagnose. [07:23] its way past EOD for me, but I'll swing my irc every hour or two till I sleep [07:24] So, spm is off sick, and Tom turns up in ~ 2 hours, and I have branches to land. :-( [07:24] ugh [07:24] you could ask a gsa [07:24] but its really important to understand whats gone on. 
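The two possibilities raised above (a process still running, or one killed so abruptly that its pid file was left behind) can be told apart mechanically. A minimal POSIX-only sketch, not the librarian's or buildbot's actual handling; `pid_is_running` and `classify_pidfile` are hypothetical names.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without signalling
    except OSError as e:
        if e.errno == errno.ESRCH:
            return False  # no such process
        if e.errno == errno.EPERM:
            return True  # exists, but owned by another user
        raise
    return True


def classify_pidfile(path):
    """Return 'missing', 'running' or 'stale' for a pid file."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return "missing"
    return "running" if pid_is_running(pid) else "stale"
```

A 'running' result only says that some process has that pid, not that it is the original service, since pids get reused. A pid file with non-numeric contents raises `ValueError` here rather than being guessed at.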
[07:25] I don't know enough about the buildbot infrastructure to know which machines are involved [07:25] * StevenK will wait for Tom [07:30] anyhow, I'm off for a while [07:30] jml: I've asked for your review on this parallel tests branch; to help you I reviewed all the testtools MP's. [08:28] hello jml, StevenK [08:28] lifeless: that's great new about parallelization [08:31] poolie: Hi [08:45] lifeless: I'm sorry to hear about list-tests. [08:46] lifeless: and thanks for the reviews, I'll look at your parallel tests branch soon. [08:48] Hm, that's odd. [08:49] Looks like a stale pid file [08:56] good morning [09:09] Hello [09:18] poolie: thanks, yes. [09:18] jml: \o/ [09:18] jml: yeah to get a listing its --list-tests --subunit | subunit-ls, which is a bit grotty [09:22] lifeless: oh, I think what I did there is deleted our own hack in favour of the upstream support... iirc. [09:22] gotta run, late for an appointment [09:22] ciao [09:29] Project parallel-test build (3): STILL FAILING in 2 hr 4 min: https://hudson.wedontsleep.org/job/parallel-test/3/ [09:30] bigjools: You're not particularly attached to lucilleconfig, are you? [09:31] wgrant: I'm not sure which part of "must die in fire" was ambiguous, no. [09:31] Haha [09:31] bigjools: Well, I accidentally slipped and deleted it today. [09:31] that's quite a slip [09:32] 2400 lines or so. [09:32] ! [09:33] Although some of that is a sampledata rebuild. [09:33] * bigjools otp [09:37] hahah https://hudson.wedontsleep.org/job/parallel-test/3/testReport/? [09:37] *boom* [09:38] and meh, something did a 'print' and mangled the stream [09:38] I'll look more closely monday. [09:38] jml, i just added http://pastebin.ubuntu.com/513690/ to my dkim tree [09:38] is that a better explanation? [09:58] it feels like we've been in testfix mode for a week [09:59] oh wait ... [10:05] bigjools: Could you please grab Ubuntu's lucilleconfig from staging? 
[10:05] Just need to check that it's within archivepublisher.root [10:05] Since that's what the new code uses. [10:06] wgrant: what are you replacing lucilleconfig with? [10:07] bigjools: distroseries.lucilleconfig is now calculated from ComponentSelection. [10:07] http://pastebin.ubuntu.com/513712/ [10:07] distribution.lucilleconfig's paths are calculated from archivepublisher.root, like we already do for !primary. [10:07] Aha, thanks. [10:08] And stayofexecution should be a new config key. [10:08] But right now is hardcoded. [10:11] * wgrant digs out the prod configs. [10:12] inability to remove hardcoded sample data is another thing that's wrong with doctest :/ [10:13] s//difficulty of [10:13] poolie: I think we need more reasons to hate doctest, there's not enough :) [10:13] perhaps at the 2012 epic it can be burned in effigy [10:13] What is the purpose of the 2011 Epic? [10:14] saying "god that's fast" [10:14] wgrant: To laugt at you. [10:14] *laugh [10:14] :( [10:14] and when stevenk laughs at you, you stay laughed :} [10:14] yet somehow we're laughing at StevenK [10:14] poolie: rofl [10:18] bigjools: So my change is compatible with production's primary archive setup. [10:18] And it aligns it with the way PPA/partner/debug/copy has been done forever. [10:18] good [10:18] And it removes those goddamn text ini columns. [10:19] \o/ [10:20] I mean, there are some strange bits of Soyuz. [10:20] But that just sort of wins. [10:25] wgrant: http://imagebin.ca/view/gayZQ1G.html [10:25] fun :) [10:26] bigjools: That is a pretty picture [10:26] But why was utilisation so low towards the end of the low-resource period? [10:29] Project devel build (111): STILL FAILING in 4 hr 4 min: https://hudson.wedontsleep.org/job/devel/111/ [10:29] wgrant: different arches [10:29] there were a load of amd64 builds queued up and nothing else [10:29] bigjools: Oh, right, forgot that. 
[10:29] the graph is good because it's telling us we can share arches [10:30] we just need to do the work for that [10:30] Well, we could if it didn't break out build time estimation algorithm, somewhat ironically. [10:30] indeed [10:30] queuing theory... [10:30] I hear there's been a bit of research in that area [10:33] We could also just declare that dispatch time estimates are too inaccurate and drop them. [10:33] That is very tempting. [10:38] bigjools: P[P3]As never publish outside main, do they? [10:38] wgrant: they're not *supposed* to :) [10:38] Even the security one? [10:38] yes [10:38] bigjools: I'm wondering why we don't restrict the PPA publisher to main. [10:39] Like we restrict the partner one to partner. [10:39] hysterical raisins [10:39] * wgrant might fix that later, then, now the code is not nauseating. [10:40] Ah, yeah, all pubs are overridden to main during creation, so it should be impossible for them to be anywhere else. [10:41] we used to publish all components if you remember [10:41] I do indeed. [10:41] then it was decided that we should simplify it [10:41] which was a superb idea [10:42] I was really wondering if you knew a reason that the restriction wasn't added then. [10:43] I guess nobody thought to do it [10:44] I am going to have to fix that double-copy bug - the errors are mounting in the publisher log :/ [10:45] not to mention a bunch of PoolFileOverwriteError which I can't remember why is happening [10:45] The PFOEs were meant to all be fixed. [10:45] But I saw some last week. [10:45] We must have a new bug. [10:45] yes [10:46] mthaddon: Did you intentionally not enable natty/armel PPA support? [10:47] wgrant: not intentionally - I just did what I was asked - bigjools, is that something we'd want to do? [10:47] mthaddon: I think so, yes. [10:47] Ah, NRCP is wrong. [10:47] * wgrant fixes. [10:47] bigjools: do we need to check with someone? 
[10:47] mthaddon: I'd say lamont but he's not around this week [10:48] so whoever is covering him? [10:48] It's already enabled for all the old series. [10:48] let me check [10:49] ffs, landing one branch worked then the other gets caught by testfix mode [10:52] bigjools: See, you should use EC2. It forces you to get used to frequent arbitrary rejections :) [10:52] haha [10:53] * bigjools goes to get caffeine [10:53] My average success rate in the last three weeks is below 10%. === al-maisan is now known as almaisan-away [10:59] bigjools: sorry, what would lamont be able to tell us? [10:59] elmo: whether to enable natty/armel for PPAs [11:00] given armel's enabled for the other series (restricted to certain PPAs of course) [11:00] well, that's why we're not enabling it - it's not clear to me how the restriction works and whether it's automatically carried over [11:00] I was hoping you would know [11:02] elmo: it's carried over - the restriction is based on the ProcessorFamily only [11:02] mthaddon: ^-- go ahead and enable arm too [11:03] ok [11:03] bigjools: erm, where would I do that? [11:04] mthaddon: same as the ones you already did, just for natty/armel/+admin [11:05] doh [11:05] bigjools, elmo: I take it not "Official Support", just "PPA Support Enabled"? [11:05] yup [11:05] ok, done [11:06] Thanks. [11:45] bigjools: We have actually changed it. It's always been 5 days in the code, and 1 day in production. [11:45] I guess I could change it to 1 day everywhere. [11:45] (I'm not sure when the production change happened) [11:45] wgrant: I don't recall it changing so it's been there a while [11:45] and changing config is as painful as changing code, so code it and reduce complexity [11:46] Yep. [11:46] Thanks. [11:49] jml: around? [11:49] bigjools: yes [11:50] jml: I've started hacking up the shutdown but I could do with a little help on testing! 
[11:50] http://pastebin.ubuntu.com/513776/ [11:51] jml: we don't need that addSystemEventTrigger since this is a Service ... heh [12:03] Morning, all. [12:14] deryck: hi [12:14] bigjools: ok. all yours. [12:14] jml: how sweet :) [12:15] bigjools: testing reactor shutdown is unsupported, pretty much [12:16] bigjools: I reckon the best thing to do here is call stopService in your tests directly [12:17] jml: there's aleady one that does that - well, see testBuilddManagerRuns [12:17] it's not in your diff [12:17] but I've no idea if my stopService is being called or not [12:18] jml: it's already there [12:18] bigjools: ahh, ok. [12:18] starts up via the tac [12:18] and then shuts down [12:18] I need to page more stuff in before I confuse myself or you any further [12:19] (also, a pox on London traffic) [12:20] s/London/UK/ [12:21] well, I'm not learning to drive in the bits of the UK that aren't in London [12:22] bigjools: ok, I think I've got this figured. [12:23] bigjools: BuilddManager is a Service, so you don't need to do the manual reactor.addSystemEventTrigger [12:23] jml: I know that bit, I said above :) [12:23] forgot to remove it from the diff [12:23] oh ok [12:23] I'm looking at the tip of your branch [12:23] not push [12:23] ed [12:23] hang on [12:24] bigjools: there's no really good way of checking that stopService is called when the reactor is shut down... Twisted makes that promise any way [12:25] jml: pushed. Does the shutdown code look sane though? I'm just making it wait on all the LoopingCall.start() deferreds that fire when stop() is called. [12:28] bigjools: yeah. it's good. looks correct too :) [12:28] bigjools: I'd probably just keep the return values of startCycle and scheduleScan rather than making a new property on SlaveScanner and NewBuildersScanner, but ymmv [12:29] possible, but it's more hassle in the manager [12:29] (class) [12:29] yeah. 
it's a trade-off, more state in the manager vs unnecessary fattening of interfaces [12:30] I don't think either is clearly better [12:30] me neither [12:30] my main question to you was the sanity of doing that - I think it's ok, but I need to test it on DF to get a better idea. But DF is down for another day. [12:31] it's completely sane [12:31] great [12:32] now, I need to go out and get a new photo for my driving licence, for which I am privileged to offer the government 20 of my hard-earned pounds. [13:23] StevenK, around? [13:27] salgado: Barely [13:29] StevenK, on r11710 you've added a initialisedistroseries section to configs/development/launchpad-lazr.conf but I don't seem to have a matching section in schema-lazr.conf, which causes a trunk branch to fail to build [13:29] Eeep [13:36] StevenK, if you're quick, even buildbot won't notice it. ;) [13:38] salgado: Yeah, I'm putting together a branch now [13:40] cool! [14:00] salgado: https://code.edge.launchpad.net/~stevenk/launchpad/fix-idsjob-schema/+merge/38536 [14:06] StevenK, taking ages to update the diff. can you paste it somewhere? [14:07] salgado: http://paste.ubuntu.com/513849/ [14:11] StevenK, any luck with the hudson builder fixes? [14:12] mars: https://code.edge.launchpad.net/~stevenk/launchpad/test-thread-debug/+merge/38510 [14:13] StevenK, great, so that solves the two code hosting threads issue, which is encouraging for solving the windmill issue as well [14:14] mars: Yeah, I just don't know where to start with the windmill ones [14:14] StevenK, or does it happen to fix the windmill issue too? [14:15] StevenK, it would if an errant puller was also running in the windmill suite. I guess your Hudson builder will tell us if it worked? [14:16] mars: I've been using ec2 to diagnose if it was fixed, and 2 windmill tests still failed [14:16] argh, ok. 
one can always hope :) === Ursinha is now known as Ursinha-afk === Ursinha-afk is now known as Ursinha [14:26] Project devel build (112): STILL FAILING in 3 hr 40 min: https://hudson.wedontsleep.org/job/devel/112/ [14:26] Launchpad Patch Queue Manager: [r=lifeless][ui=none][no-qa] Create a cancel_on_timeout() helper that [14:26] cancels Twisted Deferreds after a configurable timeout. [14:27] Project devel build (113): STILL FAILING in 1 min 14 sec: https://hudson.wedontsleep.org/job/devel/113/ [14:27] * Launchpad Patch Queue Manager: [r=gmb][ui=none][bug=608621] Add a cronscript for [14:27] InitialiseDistroSeriesJobs. [14:27] * Launchpad Patch Queue Manager: [r=gmb][ui=none][bug=608621] Allow parameters to be passed to [14:27] InitialiseDistroSeriesJobs. [14:27] * Launchpad Patch Queue Manager: [r=allenap][ui=none][bug=659129] Prevent the /ubuntu/+ppas page from [14:27] timing out by fixing a slow query that PG8.4 hates. [14:37] salgado: Haha. buildbot may not have noticed, but Hudson did [14:59] jam: ping [14:59] morning abentley [15:00] morning, jam. thumper suggested I should talk to you about history-db. [15:01] sure [15:01] jam, are you up for a Skype or Mumble conversation? [15:01] sure, in just a bit [15:01] jam: cool. === almaisan-away is now known as al-maisan [15:08] bigjools: Your landing to db-devel "Prevent the /ubuntu/+ppas page from timing out by fixing a slow query that PG8.4 hates", is that safe to land in devel? [15:08] allenap: yes [15:08] I landed it on db-devel by mistake, so I also landed it on devel [15:08] bigjools: Cool, 'cause it will be soon owing to my mistake. [15:09] :) [15:09] bigjools: Oh, so it's already there. Even better. [15:13] jml: hello, can I again bounce something off you for a sanity check? [15:13] bigjools: sure [15:14] jml: thanks. 
See _update_builder_status in builder.py [15:14] it's called from updateBuilderStatus [15:14] I want to ditch _update_builder_status and make updateBuilderStatus call builder.rescueIfLost [15:14] ie, not trap errors there any more [15:15] then we can allow the failure to hit the manager code which will look at failure counts [15:15] bigjools: i.e. completely ditch the rescue_failed errback? [15:16] jml: yes [15:16] bigjools: I think that's a great idea [15:16] I want to move all the failure handling code to the manager [15:16] cool, thanks jml [15:17] fwiw, for similar-ish things, it sometimes makes sense to distinguish between "expected errors" and, well, errors that should be in OOPS reports [15:18] yeah - eventually that's what I want to do in _scanFailed [15:18] hence the list of trapped exceptions [15:19] cool [15:20] I reckon that will make the code much more readable. [15:20] jml: I'm also tempted to remove handleTimeout [15:20] hmm. [15:20] it does one of two things: [15:21] I'm looking at it :) [15:21] 1. if the builder is virtual it immediately tries to reset it. That's wasteful, because the exact same thing will happen again when the next job comes along [15:21] abentley: want to do the call now? Skype works best for me [15:21] 2. for a non-virt builder it just fails it immediately [15:21] both of these actions are undesirable [15:22] that sounds reasonable [15:22] also, it doesn't seem like 'timeout' is any different from any other unexpected error [15:22] indeed [15:23] another thing I want to do is allow more slack for failing builders - sometimes they come back after 4-5 attempts to contact [15:23] but that becomes trivial once all the errors are handled in _scanFailed [15:24] message-oriented programming yay [15:25] I'd prefer massage-oriented [15:26] :D [15:26] jml: one more thing - I am thinking of adding transaction.abort() as the first line of code in _scanFailed(), what do you think?
[15:27] I reckon in the past we've had problems with data corruption where half of the previous failure was committed with the subsequent successful dispatch [15:28] bigjools: I think until all of the transaction management is done outside of manager.py it's all pretty much rearranging deck chairs. it seems like a not-unreasonable paranoid step. [15:28] bigjools: it would be better if you had a test that reproduced one of those problems [15:28] jml: FWIW I think *all* txns should be done in the manager, but we can't realistically do that for everything [15:28] yes, a test :) [15:31] well, the manager could conceivably be written so as to only talk to the db via the webservice or XML-RPC. But even before that wire-level change was made, there could be an object that's solely responsible for mutating db state separate from the object that's responsible for coordinating the whole thing. I think such a separation would make the txn business much more transparent. [15:32] StevenK: I'm getting this error locally. lazr.config.interfaces.ConfigErrors: ConfigErrors: schema-lazr.conf does not have a initialisedistroseries section. [15:32] EdwinGrubbs, StevenK: I'm getting the same error in EC2 [15:32] jml: that's another way of looking at it, yeah. [15:33] jml: http://pastebin.ubuntu.com/513924/ [15:33] gmb: I think we just need to remove that section from configs/development/launchpad-lazr.conf until it is added to lib/canonical/config/schema-lazr.conf [15:34] EdwinGrubbs: Agreed. Can you work up a branch for that and I'll review it? [15:34] I'm right in the middle of some hairy test changes. [15:34] gmb: sure [15:35] bigjools: why not delete updateBuilderStatus altogether and just call builder.rescueIfLost directly? [15:35] jml: eventually - but some other shit needs it split out (can't remember what offhand) [15:36] bigjools: if you say so, but as best as I can judge here all updateBuilderStatus is adding is an optional log message [15:37] otherwise, looks good.
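The failure-handling plan discussed above (abort any half-finished transaction first, handle every scan error in one place in the manager, and give a builder several attempts before failing it) could be sketched roughly as follows. This is only an illustration, not the Launchpad code: FakeTransaction, Builder, scan_failed, scan_succeeded and FAILURE_THRESHOLD are hypothetical names, and the threshold of 5 is an assumption taken loosely from the "4-5 attempts" remark.

```python
class FakeTransaction:
    """Hypothetical stand-in for a real transaction manager."""

    def __init__(self):
        self.pending = []   # uncommitted changes from the current scan
        self.aborts = 0

    def abort(self):
        # Discard anything the failed scan left half-written.
        self.pending.clear()
        self.aborts += 1


class Builder:
    """Hypothetical minimal builder record."""

    def __init__(self, name):
        self.name = name
        self.failure_count = 0
        self.builderok = True


# Assumed value; the log only says builders sometimes recover after 4-5 attempts.
FAILURE_THRESHOLD = 5


def scan_failed(txn, builder, error):
    """Single place where the manager reacts to a failed scan."""
    # Abort first so no partial state is committed with a later success.
    txn.abort()
    builder.failure_count += 1
    if builder.failure_count >= FAILURE_THRESHOLD:
        builder.builderok = False
    return builder.builderok


def scan_succeeded(builder):
    """A successful scan resets the slack given to the builder."""
    builder.failure_count = 0
```

With every error routed through one function like this, changing how much slack a builder gets is a one-line edit, which seems to be the point being made about _scanFailed.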
[15:37] EdwinGrubbs: Thanks. [15:38] jml: heh, MockBuilder is using it [15:39] updateBuilderStatus is also an utterly crap name [15:41] jml: so.......I think the coding is done. (except the minor cleanups) [15:42] bigjools: awesome [15:42] Someone broke devel. the config uses initialisedistroseries but that is not defined in the schema [15:42] sinzui: sounds like StevenK's change :/ [15:42] but I know he used ec2, so ... [15:42] bigjools: can you post it up for review? I won't be able to do a line-by-line, but I'd like a chance to see how it's all come together before it lands. [15:43] * sinzui knows [15:43] jml: sure thing. I'm just going to prune the XXXes that are done [15:43] bigjools: cool. [15:43] 12 left! [15:43] bigjools, the section was added to a config, but the testrunner does not use that config. [15:43] fwiw, I'm making the twisted/testtools thing I'm doing as rock-solid as I can. [15:43] bigjools, I have a branch I want to land. I will have a patch in a few minutes [15:44] sinzui: ha. [15:44] ok thanks sinzui [15:44] Edwin ping [15:44] EdwinGrubbs ping [15:46] EdwinGrubbs, We can add the section to the schema just as easily as removing it. Are you landing directly to PQM? [15:50] Project db-devel build (68): STILL FAILING in 4 hr 8 min: https://hudson.wedontsleep.org/job/db-devel/68/ [15:50] Project devel build (114): STILL FAILING in 24 min: https://hudson.wedontsleep.org/job/devel/114/ [15:51] sinzui: gmb was going to review it first. I'm actually looking into adding it to the schema now, although I am surprisingly getting errors. [15:52] Edwin, this does not work http://pastebin.ubuntu.com/513935/ when added to the schema? [15:52] it works for me [15:54] sinzui: well, it breaks the cronscript, because it doesn't have a dbuser, but that really shouldn't be my concern, since the tests are so weak, they don't detect that.
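The breakage being chased here is a config file that names a section the schema does not define. A minimal sketch of that kind of check, using only the standard library's configparser rather than lazr.config itself, might look like the following. The function name missing_sections and the sample section contents are made up for illustration, and the sketch ignores whatever extra schema features lazr.config supports.

```python
import configparser


def missing_sections(schema_text, config_text):
    """Return config sections that the schema does not define (sorted)."""
    # interpolation=None so literal '%' characters in values are left alone.
    schema = configparser.ConfigParser(interpolation=None)
    schema.read_string(schema_text)
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(config_text)
    return sorted(set(config.sections()) - set(schema.sections()))


# Made-up sample contents; only the section names echo the discussion.
SCHEMA = """
[codehosting]
port: 5022
"""

CONFIG = """
[codehosting]
port: 5022

[initialisedistroseries]
dbuser: initialisedistroseries
"""

if __name__ == "__main__":
    print(missing_sections(SCHEMA, CONFIG))
```

A check like this only compares section names. It would not catch the separate problem mentioned above, where the section exists but lacks a key such as dbuser.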
[15:56] EdwinGrubbs, yes, and I do not think that error dir path is in production [15:58] EdwinGrubbs, are you submitting directly to PQM? [15:58] jml: https://code.edge.launchpad.net/~julian-edwards/launchpad/builderslave-resume/+merge/36351 [15:58] it'll update with a bit more in 5 mins [15:59] bigjools: ta [15:59] jml: I'm confused since I landed the cancel_on_timeout stuff separately but it's conflicting. It's the same damn revision, so why! [16:00] EdwinGrubbs, this was my test to verify the config is sane: ./bin/test -vvc --layer=Unit -t test_config [16:00] bigjools: I can wave my hands and say criss-cross merges [16:00] heh [16:00] fixing it .... [16:01] sinzui: I was going to get a review first, although I'm basically copying your section. [16:01] jml: --lca DTRT FWIW. (that's a lot of acronyms) [16:01] ten four [16:02] It's a shame I don't get karma per line of diff [16:04] sinzui, gmb: here is my mp to fix the config error if either of you want to review it. [16:04] bigjools, I would win in that case, and I already have a ridiculous amount [16:04] * bigjools bows to the new overlord, sinzui [16:05] I like the sound of overlord. [16:05] it fits well with your green nail polish [16:05] One day I will be the god-emperor overload of the Earth, The Universe, and Canada [16:05] \o/ [16:05] EdwinGrubbs: r=me. [16:09] bloody internet crap [16:09] ooo I just noticed that the merge page groups commits that were pushed at the same time [16:10] mars, I think Canada is behind global warming. The transcontinental railroad is a dissatisfying hack to get to the Pacific. Canada really wants a NorthWest passage and it can have it when the Arctic ice cap is gone. [16:19] sinzui, alas, our true intentions became known at Copenhagen. Why else would we start a war with Denmark over a miserable piece of arctic rock off the coast of Ellesmere Island? Someone will have to put a lighthouse on it, and the Danes just wouldn't paint it the right colour. [16:19] abentley, hi. 
maybe you can help me. I pushed a branch for review with bzr lp-propose-merge. I did this branch as a pipe and the plugin made my early pipe a pre-requisite in the MP... [16:19] :)) [16:19] abentley, is there a way to edit the MP and drop the pre-req? [16:19] jml: there's one more thing we didn't get around to fixing - an async getFile() [16:20] I should probably do that now since it will have the biggest effect on concurrent performance of any of the changes :) [16:21] * jml looks [16:21] oh right [16:21] if you don't care about streaming data, then t.web.client.getPage is probably enough. === benji is now known as benji-lunch [16:25] deryck: no, you need to create a new merge proposal. [16:25] abentley, ok, thanks. [16:26] deryck: why was your early pipe not a suitable prerequisite branch? [16:27] jml: didn't we try and use that elsewhere and had issues? [16:27] bigjools: that was a different branch, I think [16:27] jml: maybe - it's all a blur :) [16:28] abentley, because I wanted all revs from the start of the work in the diff for review. Using the prev pipe as a pre-req, only the changes in the current pipe were listed in the diff for review. [16:30] Edwin-afk, StevenK has already fixed the problem you've reported on launchpad-dev@ [16:31] deryck: ah. The design assumes that you will want to submit each pipe for review separately. [16:32] abentley, right. and in my case, I did some work (not submitted) then created new pipes to break up the work. [16:34] deryck: so you still have the unsubmitted work in the first pipe? [16:35] sort of. [16:35] had branch 1, did work which amounted to a bunch of test refactors, saw the diff was getting long, add-pipe to submit it as its own branch, then back to first pipe to keep working while that gets approved. [16:36] but I did a couple small fixes in the second pipe to prep for review, which ended up being all that was up for review when the first branch was listed as a pre-req.
[16:37] * jml opens up buildout.txt again [16:39] what do I have to do to get "python setup.py egg_info -r -bDEV sdist" to work. Currently getting "error: invalid command 'egg_info'" [16:50] jam: ping [16:51] hey abentley [16:51] jam: when would you like to chat? [16:51] now would be ok [16:51] I tried earlier, not sure if you missed my ping [16:52] jam: I guess I did. Sorry. [16:52] anyway, skype works best for me, if that is ok === salgado is now known as salgado-lunch [17:13] jml: I am thinking "separate branch" for the async getFile - the rabbit hole goes deep again === matsubara is now known as matsubara-lunch [17:13] jml: so go ahead, I won't make any more changes to that branch now [17:16] bigjools: ok. I might defer it until Monday, actually. [17:17] jml: defer - heh [17:23] hmm. [17:23] I should probably bump up the default timeout on the testtools twisted support [17:23] 0.005s isn't really going to work for Launchpad [17:24] the good news is that in substance it seems to work [17:24] I dunno whether to laugh or cry === al-maisan is now known as almaisan-away [17:31] jam: https://dev.launchpad.net/Code/BranchRevisions === deryck is now known as deryck[lunch] === gary_poster is now known as gary-lunch === benji-lunch is now known as benji [17:44] EdwinGrubbs: regarding your email to -dev about the config error, did you see that another branch landed 30 minutes later that adds the schema-lazr stuff? [18:11] g'night all === matsubara-lunch is now known as matsubara === gary-lunch is now known as gary_poster === deryck[lunch] is now known as deryck === gary_poster is now known as gary-bbs === salgado-lunch is now known as salgado === gary-bbs is now known as gary_poster [19:46] abentley, does anything use the storm Sugar class? [19:49] rockstar: I don't know. [19:49] abentley, didn't you write it? [19:50] rockstar: yes. [19:50] abentley, did you write it with a specific class in mind? [19:50] rockstar: well, it was meant to be used with all classes. 
[19:50] abentley, okay, so we're supposed to be moving towards it? [19:51] rockstar: I dunno. No one else really got behind it. [19:51] abentley, yeah, that's what it looks like. [20:02] Project devel build (115): STILL FAILING in 3 hr 45 min: https://hudson.wedontsleep.org/job/devel/115/ [20:02] Launchpad Patch Queue Manager: [r=salgado][ui=none][no-qa] Add the initialisedistroseries section to [20:02] the LAZR schema. [21:32] mwhudson: if you're awake and online, I just pushed up what I think is a successful layer setting up the forking service at the same time as the ssh twisted daemon [21:32] I'd like to discuss one aspect, though. [21:35] What is the state of ec2? I had six failures overnight. [21:35] jml: hah. I tried to create an mp, the form hadn't submitted. [21:36] wgrant: clearly the state is that it has 6 failures... :) [21:36] jam: Well, given that it had at least six very different failure modes over the course of last week, I'm not sure that's a valid assumption. [21:38] jml: https://code.edge.launchpad.net/~lifeless/launchpad/paralleltests/+merge/38594 if you'd like to stash an approve vote for the toolchains delight [21:48] lifeless: didn't we talk about problems pushing/pulling looms from lp? [22:04] abentley: you still around? [22:33] barry: there was a mail thread about it [22:33] barry: where I said roughly 'update looms to use the newer bzrlib api I wrote for it' [22:48] lifeless: yeah, i'd forgotten where we discussed this, and couldn't find a bug on it. (it's still broken). i suppose i should file a bug, but where? launchpad-code? bzr? [22:51] How are people landing stuff? Or is ec2 only broken for me? [23:03] wgrant: broken like this https://pastebin.canonical.com/37924/ ? [23:03] benji: EPERM [23:03] wgrant, ? [23:04] In test_network? [23:04] wgrant, in lp.codehosting.puller.tests.test_worker.TestWorkerProgressReporting.test_network [23:04] Yeah, that one. 
[23:04] wgrant, stevenk has a fix in devel [23:04] I know that's known to be broken, and there is a branch that fixes it... but stuff seems to still be landing. [23:04] looking for a link [23:04] Is it actually in devel yet? [23:04] I didn't see it fly past. [23:05] wgrant, https://code.edge.launchpad.net/~stevenk/launchpad/test-thread-debug/+merge/38510 [23:05] wgrant, you may still get windmill errors though. And the consensus is to ignore the lot and use bzr lp-land [23:09] mars: ah, so it has landed. [23:09] Thanks. [23:10] np [23:11] In that case, could someone please try to ec2 https://code.edge.launchpad.net/~wgrant/launchpad/bug-629921-packages-empty-filter/+merge/37339, https://code.edge.launchpad.net/~wgrant/launchpad/bug-655648-a-f-maverick/+merge/37820, https://code.edge.launchpad.net/~wgrant/launchpad/better-publisher-index-tests/+merge/38462 and https://code.edge.launchpad.net/~wgrant/launchpad/bug-661109-buildable-architectures/+merge/38529? [23:19] Project db-devel build (69): STILL FAILING in 4 hr 8 min: https://hudson.wedontsleep.org/job/db-devel/69/ === matsubara is now known as matsubara-afk [23:43] Project devel build (116): STILL FAILING in 3 hr 41 min: https://hudson.wedontsleep.org/job/devel/116/ [23:43] * Launchpad Patch Queue Manager: [r=sinzui][ui=none][bug=none] Removing redundant config section. [23:43] * Launchpad Patch Queue Manager: [r=lifeless][ui=none][no-qa] Fix single-line glob imports in doctests. [23:43] * Launchpad Patch Queue Manager: [r=gmb][ui=none][bug=none] Fixed ConfigError: schema-lazr.conf does [23:43] not have a initialisedistroseries section. [23:43] * Launchpad Patch Queue Manager: [r=sinzui][ui=none][bug=652315] Adds nofollow, [23:43] noindex to unknown code page for products. [23:43] * Launchpad Patch Queue Manager: [r=adeuring][ui=none][no-qa] Use FixedHttpServer in the branch [23:43] puller's test_worker. [23:44] Down to a single failure. Yay. [23:55] barry: bzr-loom + udd