=== Ursinha-bbl is now known as Ursinha [00:07] * poolie might try to finish my dkim branch [00:08] oh, poo, codehosting i guess is entirely off line [00:15] yup [00:15] * mtaylor once again annoyingly renews his objection to making launchpad completely unusable for 2 hours in the middle of the afternoon for the US west coast in the middle of the week [00:17] i know, it's crazy [00:18] also, i don't see any good reason why you shouldn't be able to at least read branches [00:18] and indeed why not write to them, because they're not stored in the db [00:18] but, it's getting better [00:18] i have trust in lifeless and co [00:32] so today we had a db config issue on the readonly slave; another case where schema problems have bitten us. [00:32] (the way we do schema changes, I mean) [00:38] lifeless, what server does PQM run on? [00:38] prae [00:39] pqm runs outside a chroot, but code from our branches runs inside the chroot [00:39] ah, darnit [00:39] ? [00:40] darnit that we are not there yet. [00:40] with Python 2.6 [00:40] right [00:40] there may be other gotchas [00:40] on prae? yes, maybe [00:40] the simplest thing IME is to stay compatible until we have *every possible case* covered. [00:40] no shortcuts [00:41] well, that is going to be what happens now :) [00:59] thumper, rockstar, stub, mwhudson, stevek, lifeless, wgrant -- Review Meeting starting soon [01:00] bac: thumper sends his apologies, but i'll lurk if that's ok [01:00] wallyworld__: that's great. [01:02] lifeless: ping === Chex changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 4 of 10.10 | PQM is Release-Critical; devel is closed (Release manager: EdwinGrubbs) | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting [01:06] Edwin-afk: so when does devel open again? :-) [01:07] bac: hi [01:08] lifeless: join us in #launchpad-meeting? [02:23] StevenK: An Ohloh reimport is in process, BTW. 
[02:23] https://www.ohloh.net/p/launchpad/enlistments [02:24] Huzzah [02:30] mars: Do you have a few seconds for me to bend your ear? [02:31] StevenK, sure [02:32] mars: I keep seeing failures such as https://hudson.wedontsleep.org/job/db-devel/lastFailedBuild/testReport/junit/lp.codehosting.puller.tests.test_worker/TestWorkerProgressReporting/test_network/ keep appearing in hudson, do you have any clues? [02:32] I think the same failures happen on ec2 too [02:33] looking [02:33] wgrant, this project does amazing things for one's Ohloh Kudos rank :) [02:33] mars: If that small snippet isn't helpful, there's a full console output link on the left [02:34] StevenK, unfortunately that tells me enough. This is Benji's error from ec2 earlier today: https://pastebin.canonical.com/38615/ [02:35] /with/ a patch I had to try and fix it [02:35] what's funny is the thread ID is the same [02:35] Always the 18th thread started [02:36] mars: It does! [02:36] It looks like this import might only take a few days. [02:37] StevenK, I am almost to the point of desperate measures to solve this. Two thoughts: in the tearDown, enumerate all running threads, .join(3.0) on them. Give them time to halt. [02:37] mars: There are two others linked from https://hudson.wedontsleep.org/job/db-devel/lastFailedBuild/testReport/junit/ as well [02:37] StevenK, or run something yucky like a custom tracer via threading.settrace(), then pick out whatever the heck it is that is hanging around [02:38] StevenK, I think it might be an intermittent race condition between the zope testrunner and our test infrastructure. I solved this once before (and forget how once again) [02:40] StevenK, this work will probably become a priority. When it does, you will have a fix for it. [02:40] mars: Excellent. My only other concern is why does it impact hudson and ec2, but not buildbot? 
[02:41] StevenK, my theory: BB servers are faster, affecting the race [02:42] That could be why many of us can't reproduce it locally [02:43] yes [02:43] mars: So it may even end up being a race in zope's testrunner, and we just happen to tickle it? [02:44] maybe. The testrunner does no thread cleanup [02:46] mars: That sounds like a zope fail to me [02:46] When's devel likely to reopen? [02:46] wgrant: It was going to be in an hour, but now it's 3. [02:46] More seriously, RSN [02:47] StevenK, got to run - fwiw, the testrunner could die horribly when doing a thread.join() - unjoined threads are test garbage and break isolation [02:47] best leave their cleanup to the test itself [02:47] same as leaking memory garbage [02:47] later === thumper changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 4 of 10.10 | PQM is open | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting [03:19] mars: StevenK: do we know what threads they are? [03:19] https://code.edge.launchpad.net/~wgrant/launchpad/bug-655648-a-f-maverick/+merge/37820, https://code.edge.launchpad.net/~wgrant/launchpad/bug-629921-packages-empty-filter <-- can someone please land those? [03:21] lifeless: Personally, I have no idea [03:21] energy invested in making that automatically determined would be of great value [03:22] or we could just disable the check, though I know thats not terribly popular an idea. [03:25] lifeless: Hudson is showing, time and again, that there is a number of them that fail the same way [03:25] Ursinha: I may be confused [03:25] Ursinha: but I thought there was a report of oopses-received-today [03:26] lifeless, it's the lpnet-oops.html [03:26] same for edge and staging [03:26] Ursinha: how often does it update? [03:26] lifeless, hourly, I guess [03:26] let me check [03:26] last updated at 11pm utc, not updated since. 
[03:30] lifeless: Hints on where to look would be awesome [03:30] StevenK: have the teardown that bitches introspect the thread objects [03:30] theres a few attributes that may be useful like name [03:31] but also whether its daemonised and if accessible the start function would be good. oh and the class, though I think we're seeing that already. [03:32] does anyone know if there's a bug report i can associate https://code.edge.launchpad.net/~jameinel/launchpad/lp-service/+merge/37531 with? [03:32] mwhudson: yes [03:32] hmm, there was. [03:32] but hell, file a new one. [03:37] lifeless: ok, let me know the number when you have it? [03:37] or link it, either works [03:39] lifeless: Using threading.enumerate(): http://paste.ubuntu.com/512834/ [03:40] lifeless, we don't know what threads they are. That is why I am thinking of using the threading.settrace() method to find out. My research says threads are a black box otherwise [03:40] unless you use the thread name well [03:41] mars: ^ [03:42] yes, not the most helpful thread names :) [03:42] StevenK, didn't think of using the imp module - would that work? [03:43] err, inspect, not imp [03:43] StevenK, for each thread, can you see the .__class__ method? [03:43] attribute [03:44] bad typing tonight [03:45] mwhudson: I couldn't find it; want to file one ? [03:46] lifeless: ok [03:46] mars: not a black box [03:47] >>> t = threading.Thread(target=lambda:None) [03:47] >>> t._Thread__target [03:47] <function <lambda> at 0x7f75dfdd92a8> [03:47] t.daemon [03:47] StevenK: print those two things as well please [03:47] (t.name, t.daemon, t._Thread__target) [03:51] lifeless, ah, you are accessing a private object member - clever [03:52] >>> t = threading.Thread(target=lambda:None) [03:52] >>> dir(t) [03:52] and look [03:52] we may also want _Thread__args [03:52] but thats more likely to throw up in our face, I think. [03:54] um [03:54] is person search on launchpad completely horked right now? 
[03:54] yes [03:54] the huge vocab bug [03:54] 8.4 regression [03:54] ah ok [03:54] once Ursinha gets back to me about lpnet-oops being stale I will raise the timeout for it via flags. [03:56] its probably validpersoncache [03:56] but also we've got this bizarre thing where staging is fine and prod sucks [03:56] which reminds me, its bug fiuling time on that [04:00] lifeless, so.. it seems oops-tools can't find any oopses [04:00] aarrrgghh [04:01] I'm checking devapd [04:01] Ursinha: are there any on disk on sodium? [04:01] devpad [04:01] that's what I'm checking [04:01] kk, great minds :) [04:03] lifeless, mars: Sorry, was afk: http://paste.ubuntu.com/512844/ [04:03] lifeless, hm, I see oopses there [04:03] lifeless, /srv/launchpad.net-logs/lpnet/gandwana/2010-10-14 [04:03] I see a bunch [04:04] for instance [04:04] 340 [04:04] Ursinha: ok, so oopstools is broken ? [04:05] StevenK: so, that tells use that that one has a bzr server [04:05] lifeless, investigating what's happening.. [04:05] StevenK: two threads; making the error print this extra info would be useful :) [04:05] lifeless: Which I'm guessing it started and didn't tear down? [04:05] StevenK: theres a few possibilities [04:06] oh god [04:06] StevenK: process_request_thread may mean that there is a half closed socket, for instance. [04:06] mwhudson: been drinking ? :) [04:06] i think this is because bzrlib's test http server implementation doesn't join() its thread [04:06] lifeless: i wish [04:06] lifeless, are you the one that's viewing src/oopstools/oops/dboopsloader.py? [04:06] mwhudson: invoking the 'oh god' :) [04:07] Ursinha: no [04:07] hm [04:07] mwhudson: Does that make it a bzrlib bug, then? [04:07] eggs/bzr-2.2.0-py2.6-linux-x86_64.egg/bzrlib/tests/http_server.py:597 and thereabouts [04:08] StevenK: 'difference of opinion' vs bug [04:08] that said, i thought we joined that thread in launchpad, so maybe it's something else [04:08] Heh, I see that, reading the comment. 
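The thread introspection suggested in the discussion above (name, daemon flag, class and start function of every thread still alive after a test) can be sketched roughly as follows. This is a minimal sketch, not the actual Launchpad teardown code: `describe_thread` and `find_leaked_threads` are hypothetical helper names, the 3-second join mirrors the `.join(3.0)` idea floated earlier in the log, and `_target` is the private Python 3 spelling of the `_Thread__target` attribute mentioned above, so it may differ between Python versions.

```python
import threading


def describe_thread(thread):
    """Return (name, daemon, class name, target name) for one thread.

    `_target` is a private attribute (spelled `_Thread__target` on the
    Python 2 used in the log); getattr keeps this safe if it is missing.
    """
    target = getattr(thread, "_target", None)
    target_name = getattr(target, "__name__", repr(target))
    return (thread.name, thread.daemon, type(thread).__name__, target_name)


def find_leaked_threads(before, join_timeout=3.0):
    """Describe threads that were not in `before` and are still alive.

    `before` is a snapshot such as set(threading.enumerate()) taken in
    setUp. Each new thread gets `join_timeout` seconds to finish first.
    """
    leaked = []
    current = threading.current_thread()
    for thread in threading.enumerate():
        if thread in before or thread is current:
            continue
        thread.join(join_timeout)  # give the thread time to halt
        if thread.is_alive():
            leaked.append(describe_thread(thread))
    return leaked
```

A test tearDown could call `find_leaked_threads` with the snapshot taken in setUp and include the returned tuples in its failure message, which would name the offending thread instead of only counting it.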
[04:08] that's not entirely unrelated [04:08] lifeless, I see a problem here in the oops loader, have to find out how to break the lock file [04:08] Ursinha: ok [04:08] Ursinha: I wait with bated breath [04:08] StevenK: you could try putting a join() in there and seeing what happens [04:09] mwhudson, this argues for a test teardown .join() call then [04:09] /win 2 [04:10] mars: not really [04:11] lifeless, why not? [04:11] mars: it argues that our test code looking for thread leaks is wrong [04:11] mars: / unnecessarily strict [04:11] mars: looking over the life of the whole test run should be sufficient, for instance (which is what bzrlib does, more or less) [04:12] that said, there's no reason not to join there, that comment is *almost certainly* premature optimisation. [04:12] lifeless, sorry, you lost me - no reason not to join where? [04:12] Can I encourage 'you' to put a patch forward to bzr to fix this. [04:12] mars: stop_server(self) [04:14] Adding self._http_thread.join() to bzrlib has no effect on the output [04:14] lifeless, oops-tools are mostly matsubara-afk 's domain, I'm not sure what to do there without possibly breaking something [04:15] lifeless: ^ [04:16] hmm [04:16] I guess I solved :) [04:16] update_infestation is running, let's see if oops-tools loads all oopses now [04:17] StevenK: we're not sure that function is being called, are we? [04:18] hmm, stop_server is being called [04:18] and a gc.collect() is being called [04:18] at least, according to the test [04:19] Indeed [04:19] Ursinha: awesome, how long should I wait before trying [04:20] lifeless, no idea, I'll let you know as soon as it finishes, but shouldn't take long [04:21] lifeless, hm, something is really wrong, script output was "no infestation updated" [04:21] it's not recognizing the oopses [04:22] hmm [04:23] lifeless, the line was commented out the crontab... 
[04:23] I ran that manually and it's loading the oopses [04:23] but not sure why that was commented [04:23] hopefully it won't cause any trouble [04:32] lifeless, I ran out of ideas. It loaded 7k oopses to the database but it just cannot find it for lpnet, only edge [04:32] edge has a partial report now, but lpnet doesn't [04:32] report generator claims there are no oopses for lpnet [04:32] we'll have to wait for diogo [04:34] Ursinha: thanks for trying [04:34] stub: hey, can we have a brief skype call? [04:35] no problem lifeless, sorry not being more helpful [04:35] going to eat something and sleep [04:35] Ursinha: ciao === Ursinha is now known as Ursinha-afk [04:38] lifeless: sure [04:38] lifeless: stuartbishop [04:43] https://bugs.edge.launchpad.net/soyuz/+bug/659129 [04:43] <_mup_> Bug #659129: Distribution:+ppas timeout in PPA search [04:46] lifeless, StevenK, I tried the .enumerate() patch, but the results are nonsense. Here is the patch, in case you see anything obviously wrong with it: http://pastebin.ubuntu.com/512867/ [04:47] lifeless, StevenK, the threads are still alive at the end of the test run, after TestCaseWithTransport.tearDown(self), but the zope testrunner doesn't complain [04:51] mars: Throw that at ec2 and see what happens after runs the test suite? [04:51] StevenK, it fails locally. 
It is bunk :( [04:51] no point in running it through ec2 [04:59] StevenK: http://wiki.postgresql.org/wiki/Postgres-XC [04:59] bah [04:59] stub: http://wiki.postgresql.org/wiki/Postgres-XC [05:17] https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1747S227 [05:17] ^ [05:17] 06:27 < bigjools> lifeless: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1747S227 [05:17] 06:27 < bigjools> the fixed query is not the one that's timing out [05:17] 06:28 < bigjools> I smell a problem with the ValidPersonOrTeamCache view [05:18] https://bugs.edge.launchpad.net/launchpad-registry/+bug/655802 [05:18] <_mup_> Bug #655802: Branch:+huge-vocabulary timeout (Person and team AJAX picker fails) [05:20] stub: https://lp-oops.canonical.com/oops.py/?oopsid=1740EC788 [05:26] "The PostgreSQL Project finally switched from CVS to Git in September 2010" [05:26] O.o [05:40] stub: https://bugs.edge.launchpad.net/launchpad-foundations/+bug/660291 [05:40] <_mup_> Bug #660291: inconsistent performance between staging and prod [06:20] stub: I've tagged all the bugs pg83; probably a little zealous, but better safe than sorry ;) [06:33] ALL the bugs?? [06:35] "Every bug is now tagged pg83, enjoy" ? :-) [06:44] stub: all the timeout ones === almaisan-away is now known as al-maisan [07:12] So can anyone tell me why the debugging information isn't being printed at https://lpbuildbot.canonical.com/builders/lucid_prod_lp/builds/7/steps/shell_7/logs/stdio ? Relevant code is test_on_merge.py around line 80. [07:27] Project devel build (104): STILL FAILING in 4 hr 2 min: https://hudson.wedontsleep.org/job/devel/104/ [07:27] stub: what debugging info are you expecting ? [07:28] lifeless: The information about the open connections that test_on_merge.py should emit around line 80. [07:29] stub: if not results: [07:29] break [07:30] hmm, no, that can't be triggering [07:30] yer - I don't follow. Either it is a logic error I can't see, or buildbot is stripping the information, or it isn't on that branch. 
[07:30] I suspect the latter [07:31] that code is in prod-devel [07:31] Given how noisy test_on_merge usually is [07:31] hangon [07:32] stub: thats not the code thats executing [07:32] print 'Cannot rebuild database. There are open connections.' [07:32] != [07:32] Cannot rebuild database. There are 1 open connections. [07:32] ahh [07:32] Can we move to Hudson now? Buildbot is annoying me. [07:33] StevenK: you were working on reliability - hows that going? [07:33] lifeless: I got distracted by real work [07:33] StevenK: fair enough [07:34] stub: I don't know what code *is* running, but its not test on merge [07:34] I can't find it in the tree [07:34] Maybe buildbot scripts? [07:34] I guess [07:34] it claims its running test_on_merge [07:34] yer... [07:35] perhaps a .pyc stale issue [07:36] thats the old output [07:36] Right. Or that branch just doesn't have my patch. [07:36] and now bb is down >< [07:36] stub: it does [07:36] stub: I've pulled and checked [07:36] rev 9848 [07:38] redoing just to be sure [07:38] nope, its good. [07:42] hmm, where *has* julians distribution:+ppas patch gone [07:43] ah, its in devel [07:44] way past eod; ciao. [07:44] stub: build 7 was old [07:44] https://lpbuildbot.canonical.com/builders/lucid_prod_lp/builds/9/steps/shell_7/logs/stdio was current [07:45] till spm nuked bb [07:45] My fault [07:48] Could someone be convinced to EC2 my two long-approved branches? [07:49] wgrant: Oh, sorry, I meant to do that. Link me again? [07:50] StevenK: https://code.edge.launchpad.net/~wgrant/launchpad/+activereviews [07:50] wgrant: Rah [07:50] Oh? [07:51] * StevenK blinks [07:51] return self._mp.queue_status == 'Approved' [07:52] AttributeError: 'Entry' object has no attribute 'queue_status' [07:52] The URL is right? [07:53] Oh, wait [07:54] wgrant: Yeah, my fault [07:55] Yay. [07:55] EC2 doesn't hate me for the seventh time :) [08:02] Hm [08:02] ec2 still uses pg8.3 [08:02] Eep. 
[08:03] I think we need a new machine image [08:07] wgrant: You don't really care about the ec2 URLs, right? [08:07] * StevenK also notes that ec2 land gets really unhappy if two copies are running [08:22] wgrant: Both branches are in ec2 [08:42] grah we're in testfix ? :( [08:52] lifeless: Only because buildbot sucks [08:53] good morning [08:59] heh, oops [08:59] " [08:59] Invalid stacked on location: /+branch/qbzr [08:59] " [08:59] So whilst the ssh server understands those new URLs, Launchpad gets irked if you use them as stacking locations [09:04] please file a bug [09:11] filed as bug 660358, with a bonus easter egg extra idea :-) [09:47] lifeless: still around? [09:50] mars: O hai -- our ec2 images still use postgres 8.3, could you prod them up to 8.4? [10:00] StevenK: Thanks. [10:01] * bigjools is way behind on email [10:01] mrevell: Fancy looking at bug 660283? [10:01] <_mup_> Bug #660283: Bug search pages should document valid search expressions [10:01] * mrevell looks [10:01] thanks allenap [10:03] I'm confused. Should we have "timeout" and "oops" on timeouts? Or just oops? Or just timeout? [10:08] bigjools: timeouts should have 'oops' and 'timeout' unless something has changed recently [10:08] jml: I need to have a word with Rob then :) [10:09] bigjools: is lifeless deleting tags? [10:11] jml: a bunch of soyuz bugs had "oops timeout" turned into "timeout" IIRC and when he added the pg83 tag the oops tag got removed. I don't know if that's deliberate. [10:11] bigjools: me either. [10:12] Might have been removed on purpose - pg83 indicates we don't have a current valid OOPS (although a number of non-db ones have got that tag too...) [10:13] ahh, yeah, that'll be it. 
=== jtv is now known as jtv-afk [10:13] furry muff [10:24] bigjools: I'm told that timeout and oops tags are mutually exclusive [10:24] bigjools: by urshina [10:24] lifeless: that seems sub-optimal to me [10:25] I'll talk to her and see why, thanks [10:25] https://dev.launchpad.net/LaunchpadBugTags doesn't explain [10:25] there was a different page [10:26] anyhow, on my second day or so I was updating bugs with 'timeout oops' as per how I read the policy [10:26] and urshina said that it was meant to be one or the other [10:26] bigjools: to me one seems as good as the other, just as long as we're consistent so tools can be written [10:26] jml: bigjools: ah, here it is https://dev.launchpad.net/PolicyAndProcess/ZeroOOPSPolicy [10:26] lifeless: thanks [10:26] 'It should be tagged with either 'oops' or 'timeout' on it. [10:27] it doesn't say why [10:27] and I find it useful to have both tags [10:28] bigjools: It doesn't matter to me either way as long as I don't need to remember different rules for different parts of the same project [10:28] agreed [10:28] bigjools: I'd be delighted to change, if you want to bring it up with urshina/the list [10:29] my (probably faulty) recollection is that it was for qa tooling [10:29] I find it useful to search for just one tag "oops" and get everything related. If we don't get stuff tagged with just "timeout" it's more time-consuming, at least for me, to remember to look for the other tag too. [10:29] also, someday I'm going to make that oops/timeout graph that lifeless asked me for [10:29] heh [10:30] consistent tagging will be important [10:43] * bigjools totally loves PG84's psql that tells you what other tables reference your column [10:44] stub: regarding that sql you did in the bug comment, I vaguely remember someone saying something about there being a way to check person validity directly on Person? [10:45] bigjools: Yer - love that. 
Pain in the arse backtracking that stuff before [10:46] bigjools: You need person, emailaddress and account for our current 'valid person' rules. [10:46] bigjools: With just person, you can't tell if their account is active or if they have a preferred email address [10:46] ok [10:47] I still need to order on person, so I need that extra crap :( [10:48] So my timings on current staging seem ok. lifeless got one with a 10 second query. [10:48] I'll make another patch to try out with that changed query [10:51] sqlobj doesn't do LEFT OUTER JOIN does it :/ [10:53] I've blotted all that from my mind. [10:57] this is going to be tricky to change [10:58] It's not easily Stormifiable? [10:58] no [10:59] take a look at Distribution.searchPPAs [10:59] it has fti stuff - last time I tried that in Storm it was a world of pain [11:00] If necessary you could just SQL() that bit. [11:00] I could, which is what I think I did [11:00] but something else is nagging me and I can't remember what it was [11:00] The horrible horrible string concatenation in that method? [11:00] heh [11:01] that's how it was done with sqlobj [11:01] yes, but that's like so three years ago. [11:01] I do not want to stormify this query [11:01] not right now anyway [11:01] Seems the fastest approach to me [11:02] It looks easy enough to Stormify... as long as the callsites aren't braindead. [11:02] ... [11:02] you know what they say about assumptions [11:02] But this should have only one callsite. [11:02] hahaha [11:03] Well, there's only one non-test callsite that cares about the result. [11:05] * bigjools considers store.execute [11:05] * stub ♥ store.execute() [11:06] Why are you ordering by Person.name anyway? [11:06] Seems... odd... [11:06] because this stuff needs to appear on +ppas [11:07] Yer, but Person.name is a very arbitrary order. [11:07] true but it's less arbitrary than anything else [11:07] displayname is better. [11:08] But isn't relevance better still? 
[11:08] yes [11:08] If you pick a field in Archive to order by, you don't have to rewrite it in Storm :) [11:08] stub: I know, don't think I hadn't considered that :) [11:08] Oh, you order by relevance then name, I see. [11:09] * stub changes his Person.name to 'aaaaaaaaaaaaaaaa_stub' [11:09] I could order by relevance, then ppa name perhaps [11:09] * bigjools thinks [11:09] * nigelb points stub to http://uncyclopedia.wikia.com/wiki/AAAAAAAAA! [11:10] like :) [11:13] wgrant: actually something sensible needs to be the default for when no search term is supplied [11:14] I like displayname to some extent [11:14] bigjools: Person.name is probably not that [11:14] Archive.displayname might be. [11:14] indeed [11:14] since it will include the person.name anyway unless they changed it :) [11:14] Um, well, not any more. [11:14] There is no default [11:15] mmm true [11:15] I think it's a better default, let's see [11:15] We really need to fix that lack of default. [11:15] Although it's not so bad now that the key doesn't acquire that name permanently. [11:24] WARNING:root:Memcache set failed for... [11:24] whut? [11:25] Didn't you disable memcache on your system? [11:26] I did [11:26] sigh [11:31] actually, not on this machine, so something else is wrong [11:33] test isolation crappiness, it seems [11:40] actually... [11:40] mpt did a spec about consolidating the name/displayname/title thingy to make it consistent across LP [11:40] we should do that [11:44] We should do a lot of things :( [11:44] Yes! [11:44] Do more things! [11:47] An intriguing proposal. [11:48] wgrant: https://staging.launchpad.net/ubuntu/+ppas?name_filter= [11:49] jml: +1000000000000000000 [11:50] bigjools: Looks a bit crap, but not much worse than the old one. [11:50] The testsuite doesn't talk to the system memcache [11:50] wgrant: seems better to me since it's sorting by something you can see :) [11:50] bigjools: True. [11:53] What was the hack in /default to stop rabbitmq starting up? 
I used to use update-rc.d but that isn't nice apparently [11:53] This a.bono guy has a lot of PPAs. [11:55] please file a bug on the rabbit package if there isn't something in /etc/default/rabbitmq [11:56] stub: I put DAEMON=true in it [11:56] And that *stops* it starting? [11:56] yes [11:56] huh [11:56] jml, https://dev.launchpad.net/RegistrySimplifications [11:56] best not to ask :) [11:56] please file a bug, thats really nonobvious [11:57] mpt: thank you. [11:57] I didn't touch Archive. though [11:57] mpt: That is a lot of steps. [11:57] wgrant: but they are simple steps. [11:57] True. [11:57] (some of them might be tedious, but that's a different axis) [12:00] Archive.displayname is, I guess, what ends up being shown in the left pane of Ubuntu Software Center [12:01] Morning, all. [12:01] deryck: good morning [12:07] morning deryck [12:08] stub: so why was that join to Person having such a big effect in the query? And only on 8.4? [12:08] I suspect it always had that effect - from my testing it reduced a 1.5 second query to 0.5. [12:09] bug #660460 [12:09] <_mup_> Bug #660460: Need option to not launch server on boot [12:11] Booger. /ubuntu/+ppas just gave me a timeout on staging. [12:12] There seems to be a db restore going on... might just be busy [12:18] yeah it was very quick last time [12:18] wgrant: Both your branches failed. :-( [12:21] @#@!(U$@!$! [12:21] #7 [12:21] Although that's only failure #3 for the other one. [12:23] So, um... === jtv-afk is now known as jtv [12:30] StevenK: I need to check it out, but I think that's a very old "bug" in bzr [12:33] jml: Sure, I'd be happy to be pointed at where the issue actually is. [12:33] StevenK: looking now. 
[12:33] iirc, it might have something to do with python's socket lib being terrible [12:34] StevenK: https://bugs.edge.launchpad.net/bzr/+bug/193253 [12:34] <_mup_> Bug #193253: sockets being leaked in branch puller tests [12:37] * jml replies on list [12:42] done === matsubara-afk is now known as matsubara [13:49] Project devel build (105): STILL FAILING in 4 hr 8 min: https://hudson.wedontsleep.org/job/devel/105/ === Ursinha-afk is now known as Ursinha === salgado is now known as salgado-dr [15:28] rockstar, ping. (read: yo, where my new lazr-js, sucka!) [15:31] jelmer: You know that chardet-dep branch that you fixed? Well I think you forgot to push :-) [15:32] maxb: bzr says "No new revisions to push.". Looking into it.. [15:33] deryck, ping [15:33] hi sinzui [15:34] deryck, I would like your +1 to run this sql update in production. I just confirmed the test run on staging did purge 2500 known spam messages from openjdk: http://pastebin.ubuntu.com/513143/ [15:35] * deryck looks [15:36] deryck, I have it reviewed and everything, but I have 9 test failures, and they are all bugs' fault. [15:36] "fault" is such a harsh term. ;) [15:37] sinzui, r=me. Do I need to update this on the LPS page? [15:37] rockstar, do you need our help fixing these test failures? Are they Windmill or yuitest? [15:37] deryck, I do not know. [15:39] sinzui, if you didn't add the query to LPS, I wouldn't think I need to do so. If a losa wants it added there, I can update when you ping back. [15:39] I will do that now [15:41] deryck, windmill. I think I just need to see details on the failures before I can make a statement on help. [15:41] rockstar, ok, cool. [15:44] deryck, https://wiki.canonical.com/InformationInfrastructure/OSA/LaunchpadProductionStatus [15:44] deryck, bug 660541 [15:44] <_mup_> Bug #660541: Messages from the list in the moderation queue are spam, discard them with a script [15:44] * deryck updates LPS page [15:45] sinzui, done. r=me, of course. 
[15:45] thanks [15:46] np [15:47] maxb: fixed [16:00] rockstar, ping. [16:03] jcsackett, pong. [16:04] rockstar, could i get your input on https://bugs.edge.launchpad.net/launchpad-code/+bug/652126? [16:04] <_mup_> Bug #652126: branch collection action portlet is missing links [16:04] it was filed as part of our bridging-the-gap work; i think we can either dismiss it or we need to talk about a different way of presenting the statistics it's talking about. [16:05] if it's the latter, i don't want to take a stab at it without talking with someone on the code team. [16:06] rockstar ^ === james_w` is now known as james_w [16:25] gary_poster: you uh, TIMEd? :) [16:26] dobey, you mean Time Inc? If so, yeah :-) [16:26] gary_poster: i mean, i see a CTCP TIME from you :) [16:28] anyway, lunch is calling. bbiab [16:28] dobey: ah, yeah. We were having a discussion of python keyring and I was bringing up concerns you mentioned in passing, and I idly did a whois sort of thing [16:28] nothing to respond to [16:28] ah ok [16:28] cheers then [16:29] bye :-) [16:34] bigjools: otp now, fwiw. [16:34] bigjools: errr, off the phone [16:35] :) [16:35] jml: tests are working now [16:35] \o/ [16:35] jml: one more thing, we're still trapping xmlrpclib.Failure - is that raised by twisted? [16:35] rockstar: had a chance to look at that bug? [16:35] bigjools: you mean xmlrpclib.Fault? [16:35] yes :/ [16:35] jcsackett, yeah, but I'm not entirely sure why we need links there. [16:35] bigjools: yes, it is. I think t.w.xmlrpc.Fault is just a re-export of that [16:36] coolio [16:36] It's displaying information, but it doesn't need to link anywhere. [16:36] At least, that's my opinion. [16:36] jml: committing and pushing so you can peruse, one sec [16:36] rockstar: thus my pinging you. it looks to me like that is meant as statistics, not action. [16:36] rockstar: inasmuch as the only place those could link to is the very page you're looking at. [16:37] bigjools: ta [16:37] jcsackett, yes. 
[16:37] rockstar: do you think it's better to dismiss the bug or find a different way to display the info? [16:37] rockstar: i'm not sure, but perhaps the sidebar things are meant to be action links only? [16:37] * jcsackett looks for precedent. [16:38] jcsackett, well, there are lots of things wrong with that portlet. [16:38] jml: http://bazaar.launchpad.net/~julian-edwards/launchpad/builderslave-resume/revision/11676 [16:38] well, if it would not OOPS :/ [16:38] rockstar: so would something like showing at the top of the page the "N branches, M commits" and removing it from the portlet make sense? [16:38] jcsackett, maybe. [16:39] jcsackett, although I'd contend that it shouldn't link to active reviews if there are no active reviews... [16:39] bigjools: I'll just pull the branch. [16:39] rockstar: also valid, but maybe that's a different bug. [16:39] jml: url works now, but ok [16:39] jcsackett, I hate hate hate when we do things like "0 foos in the bar" [16:40] rockstar: yeah; i can see that. [16:40] jcsackett, if you're going to fix this portlet, I suggest killing the whole portlet and moving all of the information to be part of the page. [16:40] rockstar: i was just thinking that. [16:40] jcsackett: perhaps placed above the drop down selector [16:40] * jcsackett looks at what he typed. [16:41] i clearly need more sleep and/or coffee. rockstar, last comment was meant for you, not me. :-P [16:43] bigjools: instead of: "server_proxy.queryFactory = QuietQueryFactory" you could also do "server_proxy.queryFactory.noisy = False" and then delete the QuietQueryFactory class [16:44] jml: I know, I kinda like the explicit factory [16:44] but meh [16:44] bigjools: yeah, it's a taste thing. I think they are both equally explicit in terms of intent though [16:46] bigjools: looks good. [16:46] jml: cool, thanks. So now I have some more tests to write that I thought of that are lacking from the previous manager. 
[16:46] bigjools: it might be worth adding an errback that transforms the CancelledError into a TimeoutError or something, just to make the API clearer. [16:46] ah very good point [16:46] bigjools: and _with_timeout probably ought to be a standalone function in lp.services.twistedsupport [16:47] also we need the top-level to handle it [16:48] handle the timeout? yes. [16:48] at the moment it would print a stack trace [16:48] it just needs to be added to the list of known failures [16:49] jml: oh one more thing, when testing on dogfood I noticed that not all types of failure would get back to the top level scan(), which left the scan "dead" after a traceback. [16:49] I'm not sure what was going on - but at the very least I need to make sure that *all* errors are caught in scan() [16:50] and I don't know why it's not doing that [16:50] bigjools: hmm. [16:50] bigjools: I don't either – that's not much to go on. Was it an "Unhandled error"? [16:50] once it gets in there the failure counting will deal nicely with it [16:50] * bigjools hunts logs === beuno is now known as beuno-lunch [16:51] jml: yes, Unhandled Error [16:51] bigjools: ahh right [16:51] it was a coding mistake [16:52] but it should be caught further up [16:52] the manager should trap those and kill the build [16:52] bigjools: that means a Deferred somewhere has had errback called on it, but there's nothing handling that error. Normally either the deferred is not being returned ... [16:52] or it is and someone didn't add errback [16:52] hmmm [16:52] bigjools: there's not many options open to you if you don't return the Deferred. [16:53] jml: so since _startCycle() does d.addErrback(self._scanFailed) then I presume it's the former [16:54] bigjools: or something down deeper. [16:54] bigjools: setting 'debug' (or is it 'DEBUG'?) on the Deferred class to True will give you more information [16:54] debug [16:55] can you get on mawson? 
[16:55] no [16:56] http://pastebin.ubuntu.com/513226/ [16:56] that's the general type of thing [16:57] I need to handle shutdown gracefully as well, did we work out if there was a reactor hook? [16:58] bigjools: I didn't find out. [16:59] bigjools: part of the problem with unhandled errors is that there's no way of telling a particular error is unhandled until the gc kills the deferred with the error. [16:59] * jml has a look at the code [16:59] and that's bad, because in at least one case it failed the builder immediately (I need to purge all those) [17:00] jml: def defer_with_timeout(self, d, timeout): [17:00] look ok? [17:00] in lib/lp/services/twistedsupport/xmlrpc.py [17:02] jcsackett, I think that's fine. It'll probably make that page a little busier. [17:03] bigjools: it's actually not anything to do w/ xmlrpc, I'd put it in __init__ and call it cancel_on_timeout, I think. [17:03] bigjools: and I presume you saw the recent discussion on #twisted [17:03] no, not paying attention [17:03] rockstar: i think if it's refactor into one smooth sentence it'll be alright. i'll get a ui review on it when i'm done to be sure though. [17:03] bigjools: reactor.addSystemEventTrigger('before', 'shutdown', f) [17:03] bigjools: also, we should be using Services, but one thing at a time [17:04] oo nice [17:07] mrevell: do we have any docs about the product release finder? [17:09] hmm [17:11] jml we have this http://blog.launchpad.net/cool-new-stuff/automatically-import-files-to-launchpad-using-product-release-finder [17:14] mrevell: ta === deryck is now known as deryck[lunch] [17:20] jml: I want to test the scenario where the deferred completes instead of cancelling [17:20] and struggling a bit [17:21] bigjools: from a proxy or from cancel_on_timeout? 
[17:21] bigjools: from a proxy, just do DeadProxy again but with 'return defer.succeed(None)' [17:21] jml: testing the cancel on timeout [17:22] bigjools: from cancel_on_timeout, d = cancel_on_timeout(defer.succeed(None), timeout=10) [17:22] jml: can I assert it was not cancelled? [17:22] that will not fail either way :) [17:22] bigjools: sort of. you can assert that you can add a callback to it and get whatever you passed in to succeed() [17:23] bigjools: and rely on the TestCase to assert that the callLater has been cancelled [17:23] right [17:23] bigjools: if that's too implicit for you... === al-maisan is now known as almaisan-away [17:24] actually... that won't work anyway [17:24] self.assertEqual([], clock.getDelayedCalls()) [17:25] because Trial makes assertions about the actual reactor, and presumably you are passing in a fake. [17:26] (see IReactorTime for docs on that) [17:28] implicit is fine [17:28] bigjools: not if you are passing in a clock, it isn't. it won't make the assertion at all. [17:28] oh, meh [17:30] Project db-devel build (66): STILL FAILING in 4 hr 19 min: https://hudson.wedontsleep.org/job/db-devel/66/ [17:31] now, where was I [17:31] error handling. [17:32] Project devel build (106): STILL FAILING in 3 hr 43 min: https://hudson.wedontsleep.org/job/devel/106/ [17:32] Launchpad Patch Queue Manager: [r=jelmer][ui=none][bug=656295] Allow derivers to specify which [17:32] packagesets they want to copy to the child distroseries in [17:32] InitialiseDistroSeries. === matsubara is now known as matsubara-lunch [17:34] jml: http://pastebin.ubuntu.com/513242/ [17:35] bigjools: sweet. [17:35] I'll commit here but land it in a separate branch too [17:35] bigjools: good call. 
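[editor's note] The cancel_on_timeout idea discussed above (cancel the pending operation after a timeout, and surface a clearer error than CancelledError) can be sketched roughly as follows. This is only an illustration and not the code from bigjools's branch: it uses the standard library's asyncio in place of Twisted Deferreds so that it runs without extra dependencies, and the BuilderTimeout name is a made-up placeholder for the "TimeoutError or something" jml suggests.

```python
import asyncio


class BuilderTimeout(Exception):
    """Hypothetical error raised in place of a bare cancellation error."""


async def cancel_on_timeout(awaitable, timeout):
    """Await `awaitable`, giving up after `timeout` seconds.

    asyncio.wait_for cancels the inner awaitable when the timeout expires;
    the resulting timeout error is translated into BuilderTimeout so callers
    see one clearly named failure.
    """
    try:
        return await asyncio.wait_for(awaitable, timeout)
    except asyncio.TimeoutError:
        raise BuilderTimeout("no answer after %s seconds" % timeout)


async def quick():
    # Stands in for a call that completes well before the timeout.
    return "done"


async def slow():
    # Stands in for a builder that never answers in time.
    await asyncio.sleep(10)


if __name__ == "__main__":
    print(asyncio.run(cancel_on_timeout(quick(), timeout=10)))
    try:
        asyncio.run(cancel_on_timeout(slow(), timeout=0.01))
    except BuilderTimeout as error:
        print("timed out:", error)
```

The first call mirrors the "deferred completes instead of cancelling" scenario from 17:20: the result passes through untouched. The second mirrors the timeout path, where the caller only ever sees the translated error.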
[17:36] one of the few things I can :) [17:36] bigjools: I've just looked at that unhandled error in the paste you posted earlier [17:37] bigjools: this is the relevant code: [17:37] d = self.scan() [17:37] d.addErrback(self._scanFailed) [17:37] return d [17:37] if _scanFailed fails unexpectedly, nothing will handle that [17:37] ha [17:37] so you could change it to: [17:37] d = self.scan() [17:37] d.addErrback(self._scanFailed) [17:37] d.addErrabck(self._handleUtterDisaster) [17:37] return d [17:37] but I'm not sure that would win you very much. [17:37] so I need an error handler for the error handler === benji is now known as benji-lunch [17:38] (particularly if you typo addErrback!) [17:38] belt & braces [17:38] :) [17:39] bigjools: hmm, I'm not sure writing more code to handle runtime cases where there are programming errors counts as belt&braces [17:39] *deleting* code would be a far more sound strategy :) [17:39] I know what you mean, *but* there is one piece of code that's out of the managers hands [17:39] builder.getCurrentBuildFarmJob() [17:40] that can failed and we have no control [17:40] fail, even [17:40] fair enough. in that case maybe put a try/except around it in _scanFailed? [17:40] can do - and fail the job immediately [17:51] unfortunately I have to go now. === beuno-lunch is now known as beuno [18:11] abentley: around? [18:18] bac: hi? [18:24] dobey: back. [18:26] abentley: hi. about the bzr upgrade of remote branches issues. is there some way to separately upgrade the repository from the branch itself? or is that what the link on the web does? [18:27] dobey: I don't understand the question. Technically, the branch will already be in format 7, and its repository is the only thing that could be upgraded. 
[18:28] abentley: ok, i upgraded a branch to 2a several weeks ago, and no new revisions have been committed, and the web page still says KnitPacks5 for the repository format [18:28] abentley: https://code.edge.launchpad.net/~configglue/configglue/trunk === matsubara-lunch is now known as matsubara [18:31] dobey: the branch and its repository have been upgraded, but the database is out of sync. === deryck[lunch] is now known as deryck [18:32] abentley: and there's nothing i can do myself short of committing a new revision to force a rescan, right? [18:32] dobey: well, we could try taking out a branch lock and seeing if that triggers a scan. [18:33] dobey: or you could uncommit a revision and then push it again. === salgado-dr is now known as salgado === benji-lunch is now known as benji [18:35] abentley: ok. i'm not especially concerned at this point. i was just wondering since your e-mail suggested this case shouldn't be happening, as an upgrade should cause a rescan to trigger. [18:35] It shouldn't be happening. An upgrade should cause a rescan to trigger. I can investigate why that didn't happen, now that I have an example. [18:37] abentley: do you know more specifically when upgrades should have started causing a rescan? if the code that handles that was deployed after i did the upgrade, that would probably be the reason. but "these days" isn't a very exact metric :) [18:37] is there a way to force lowercase on text in a template? ideally without using "python: " ? [18:38] dobey: months ago. [18:38] ah ok [18:39] dobey: probably around May. [18:39] yeah this was definitely upgraded after that then [18:40] well the last revision was in August, and I think I did the upgrade probably about 3-4 weeks ago [18:41] dobey: my logs go back to Sept 23. Was it before that? 
[18:42] let me see if i can find an exact date for it [18:42] abentley: it probably would have been sept 22 or 23 though [18:43] Thu 2010-09-23 11:09:34 -0400 [18:43] 0.025 bazaar version: 2.2.0 [18:43] 0.025 bzr arguments: [u'upgrade', u'--default', u'lp:configglue'] [18:43] abentley: should i file a bug? [18:46] dobey: I guess. I'm not sure we can do more than mark it incomplete, at this stage. [18:47] dobey: what's your offset from UTC? [18:49] abentley: -0400 [18:50] So 11:09 would be 15:09 [18:50] dobey: So you remotely upgraded the branch, rather than using the web page, right? [18:51] abentley: right, i did bzr upgrade --default lp:configglue [18:52] dobey: okay, it's possible we don't handle that case. I was thinking of web-based upgrades, where I know we handle it. [18:55] abentley: ok, i'll file a bug [19:03] Yippie, build fixed! [19:03] Project devel build (107): FIXED in 1 hr 30 min: https://hudson.wedontsleep.org/job/devel/107/ [19:03] * Launchpad Patch Queue Manager: [r=leonardr][ui=none][no-qa] Modify xx-person-subscriptions.txt not [19:03] to use sample data. [19:03] * Launchpad Patch Queue Manager: [r=lifeless][ui=none][bug 46581] Do not oops when users hack poll [19:03] types. [19:03] <_mup_> Bug #46581: Change a poll type URL manually crashes [19:04] abentley: filed bug #660695 [19:04] <_mup_> Bug #660695: Remote upgrade not triggering rescans [19:05] dobey: I can't reproduce it. I just created a branch and upgraded it remotely. [19:05] hmm [19:06] abentley: let me see if i have another branch i can try on [19:06] dobey: And it shows the correct repository format. [19:06] dobey: wrong branch format, though. [19:08] hmm [19:21] abentley: interesting. i see that as well on another branch i tried. and yet another i tried, has the right branch format, but still says KnitPack6 [19:23] dobey: you mean, you just created a new branch in KnitPack6, upgraded it, and it's showing the wrong data? 
[19:23] abentley: no, i found a branch already on lp which i already owned, which was in the old format, and upgraded it [19:24] dobey: but you upgraded it just now? Okay, maybe it's something to do with the format. [19:24] yes [19:25] abentley: it seems that things that are already BranchFormat7, exhibit the issue of showing the old Repository format info [19:25] dobey: aha! [19:25] abentley: and BranchFormat6 branches seem to always show BranchFormat6, but the Repository format gets updated [19:27] abentley: that makes sense to you i take it? :) [19:27] dobey: I can see how such a bug would happen. Someone misunderstood the significance of branch format, and assumed if it hadn't changed, the repository format hadn't changed. [19:29] abentley: ah. and presumably the reverse as well, in the other case where BranchFormat doesn't change? [19:29] (doesn't change in the UI) [19:29] dobey: AFAICT, there are no cases where the branch changes in the UI from an upgrade alone. [19:31] abentley: so for the case where the UI still shows BranchFormat6 after an upgrade to 2a, is a separate bug? [19:31] dobey: yes, I've filed a bug for it. [19:31] abentley: ok, cool. glad i could help :) [19:31] dobey: bug 660706 [19:31] <_mup_> Bug #660706: Wronng branch format shown on web page after remote upgrade [19:34] dobey: Successfully reproduced the bug. [19:39] abentley: cool. === cjohnston_ is now known as cjohnston === almaisan-away is now known as al-maisan [20:09] good morning [20:11] moin === al-maisan is now known as almaisan-away === almaisan-away is now known as al-maisan [21:07] Any progress on the EC2 breakage? [21:08] wgrant: i just sent a mail that may be related [21:08] Ah. [21:08] * wgrant looks. [21:09] Hm, yes. [21:09] mwhudson: so, have you filed it upstream yet? 
[21:09] lifeless: no [21:10] at least, not that i remember [21:15] mwhudson: (hint) [21:16] yeah yeah [21:16] i don't really want to remember the details, they made me very angry [21:18] Simple* can do that ;) [21:20] i'll do it when i get through my mail [21:20] Project db-devel build (67): STILL FAILING in 3 hr 49 min: https://hudson.wedontsleep.org/job/db-devel/67/ [21:20] * Launchpad Patch Queue Manager: [rs=buildbot-poller] automatic merge from stable. Revisions: 11696 [21:20] included. [21:20] which has a new way for oopsprune to fail today [21:20] * Launchpad Patch Queue Manager: [r=gmb][ui=none][bug=655567] Wire up structural bug subscription [21:20] filters to the subscriber selection machinery. [21:20] IOError: [Errno 13] Permission denied: '/srv/launchpad.net/production-logs/2010-10-14/57341.C3848' [21:23] I wonder why Hudson managed to discover the errors more than a week before EC2. [21:26] wgrant: hudson is nice :) [21:27] Well, yes, but... [21:30] mwhudson: #python-testing may interest you as a lurking channel [21:30] I need a teddy bear for where to put some testing code [21:30] lifeless: lp.testing.$whatever? [21:30] mwhudson: mocker vs fixtures.* [21:30] ah [21:31] so, I'm writing some code that invokes external processes. [21:31] I have no interest in actually invoking them in my tests. [21:31] partly because the process is bin/test so it would be hugely meta to do so [21:31] and partly because the zope stack is so slow I'd die of old age waiting for things to run [21:33] Can I send you off on a diversion to fix that slowness? :P [21:33] That generally seems to be fairly effective. [21:33] wgrant: yes, ditch zcml & global state. Thanks, I'll expect a patch :) [21:35] mwhudson: just to be sure, merging *from* devel and proposing *into* db-devel is always safe, right? 
[21:36] (aside from short-term bugs in devel that haven't made it into stable yet, that would eventually cause testfix mode) [21:36] jam: yeah, but you'll end up with a diff that's way bigger than necessary in the mp [21:36] it sucks that we can't retarget the existing mp [21:36] thumper: hey, fix that :-p [21:37] yeah yeah [21:37] I've talked with wally about that [21:38] mwhudson: why would it be too big? Stuff that devel has that hasn't propagated into db-devel yet? [21:38] Right. That can take days if buildbot is screwed. [21:38] Which is fairly often these days. [21:39] jam: yeah, exactly [21:39] mwhudson: so where was I [21:40] right, I want to use a test double for 'subprocess.Popen' [21:40] I *could* use a mocking library [21:40] but I'll still have an implementation of subprocess hanging around which is a little larger than a mock [21:41] So instead I'm thinking to add a tested double and a glue fixture, with setUp monkey patching the system module [21:41] mwhudson: what do you think? === mordred_ is now known as mtaylor [21:42] lifeless: You wanted less global state, but you are advocating monkeypatching out subprocess.Popen? :P [21:42] wgrant: sadly yes. But for the duration of a single test [21:43] wgrant: modules are themselves global state; we can work better layers up on top of them though. [21:43] another option is to make the code I write take a Popen implementation [21:46] dependency injection woo [21:46] that said, i think monkey patching will produce less wtfs per second [21:47] jelmer: Hi. [21:48] yeah [21:48] wgrant: Hello. [21:49] jelmer: I was just stalking recent Launchpad branches, and I think the assert in 135610-duplicated-ancestry is still a little wrong. [21:50] hmm, I probably need to upload fixtures to debian soonish [21:50] There can be more than one Published publication if the publisher hasn't finished yet. 
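[editor's note] The approach lifeless describes from 21:40 onward (a small test double for subprocess.Popen plus a glue fixture whose setUp monkey patches the module for a single test) might look something like the sketch below. Every name here (FakeProcess, PopenFixture, run_tests) is hypothetical and chosen for illustration; it is not the fixture that was actually written.

```python
import subprocess


class FakeProcess:
    """Minimal stand-in for a Popen instance; it never spawns anything."""

    def __init__(self, args, returncode=0, stdout=b""):
        self.args = args
        self.returncode = returncode
        self._stdout = stdout

    def communicate(self, input=None, timeout=None):
        # Return canned output, as (stdout, stderr).
        return self._stdout, b""


class PopenFixture:
    """Replace subprocess.Popen with a recording double for one test."""

    def __init__(self, returncode=0, stdout=b""):
        self.returncode = returncode
        self.stdout = stdout
        self.calls = []

    def _fake_popen(self, args, **kwargs):
        # Record how the code under test tried to start the process.
        self.calls.append((args, kwargs))
        return FakeProcess(args, self.returncode, self.stdout)

    def setUp(self):
        self._original = subprocess.Popen
        subprocess.Popen = self._fake_popen

    def cleanUp(self):
        # Always restore the real Popen so the patch cannot leak.
        subprocess.Popen = self._original

    def __enter__(self):
        self.setUp()
        return self

    def __exit__(self, *exc_info):
        self.cleanUp()
        return False


def run_tests(test_filter):
    """Example code under test: would normally launch bin/test."""
    proc = subprocess.Popen(
        ["bin/test", "-t", test_filter], stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    with PopenFixture(returncode=1, stdout=b"boom") as fixture:
        print(run_tests("lp.codehosting"))
        print(fixture.calls)
```

The alternative raised at 21:43, passing a Popen implementation into the code under test, would drop the monkey patching entirely: run_tests would take a popen argument defaulting to subprocess.Popen, and the test would pass fixture._fake_popen directly.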
[21:50] (it first marks everything has Published, then later marks old ones Superseded) [21:52] lifeless: NEW is stalled again though :-/ [21:52] wgrant: Hmm [21:53] when logging into the local launchpad instance, what username are you supposed to use? (I'm trying to get an ssh key registered, etc) [21:53] It seems to be redirecting me to "testopenid.dev" but that isn't letting me create a new account [21:53] jam: admin@canonical.com is an admin. [21:53] admin@canonical.com:test [21:54] wgrant: and that person is then "name16"... very strange [21:54] but it worked, thanks [21:54] jam: Well, username != password [21:55] I think there's a script (utilities/make-lp-user?) which will create a new user and upload your SSH key to it. [21:55] wgrant: or login name, or account name, or... [21:55] Not sure if it still works. [21:55] jam: By password I of course meant email address. [21:55] username != email address [21:56] wgrant: yeah, but also calling "admin@canonical.com" "name16" isn't very obvious, either [21:56] "admin" would have been a bit more obvious [21:56] admin@canonical.com is a relatively new alias. [21:56] name16 dates to 2005 or so. [21:56] Yay sampledata. [21:56] I wouldn't have too much problem, but it seems that every "make schema" nukes the stored ssh keys [21:56] Why are you running make schema so often? [21:56] so just about every time I update my launchpad branch, I have to start from scratch [21:57] You could always upgrade your DB with database/schema/upgrade.py, I guess. [21:57] wgrant: I get a lot of "schema is out of date" failures, but partly because this was originally based on db-devel [21:57] But make-lp-user may be a better option. [21:58] I have a couple of scripts which I use to prepare a fresh DB. [21:58] So make schema isn't so bad. [22:07] abentley, thumper, rockstar: standup? [22:07] wallyworld, sure. === al-maisan is now known as almaisan-away [22:14] jam, so, still hoping for a landing before uds? 
[22:14] poolie: well, land, maybe deployed to staging, pretty unlikely to be deployed to production [22:14] is my current thought [22:15] Hm. [22:15] launchpad-developer-dependencies should depend on lpreview-body, or something. [22:15] mwhudson: so I haven't reproduced the failing tests yet, but mostly because "make schema && bin/test -vvt lp.codehosting" seems to be awfully slow going [22:15] but I'll hopefully get there before I stop for tonight [22:16] jam: you should definitely only have to run make schema once per branch [22:16] mwhudson: well, and every time I update from somewhere [22:16] I think I was slightly out of date from the last time I pulled in db-devel [22:16] jam: if you're targeted devel now, then merge from there, commit, run make schema [22:16] once :) [22:16] mwhudson: right [22:17] it can be a pain though, it takes a long time [22:17] jam so it looks like you're not blocked, at least [22:17] I'm at the point that "bin/test" has been running for 20 minutes or so, and it is on "lp.codehosting.tests" [22:17] take that back "bin/test" has been running for 1:06 [22:18] jam i recently fixed a bug where resizing the window made the tests crash [22:18] _that_ is annoying when they've been going for an hour [22:18] poolie: definitely [22:19] I'm running it on a 1GB VM, and having bin/test fork bin/test fork python fork... and all at 177MB of virtual memory isn't very happy, either [22:19] argh [22:19] I'm pretty sure lp. has now bloated from about 120MB to about 150MB since I last updated [22:20] mwhudson: I at least make sure to "/etc/init.d/apache2 stop" since that would be running yet-another copy of lp.* [22:22] jam: yes, layers are terrible (layers support the idea of unteardownable fixtures) [22:22] which is frankly batshit insane [22:23] hi lifeless [22:24] wallyworld, do you have time for a chat now, or do you have kids to deal with? [22:25] rockstar: i have to do the school drop off - can you ping me alter after you've done your things? 
[22:25] hi poolie [22:25] wallyworld, mine things don't need to be done for 2.5 hours, so ping me when you're back. [22:25] lifeless: would it matter if you just made sure all of the layers that *launchpad* has can be torn down? [22:26] jam: at that point we can just drop in testresources :) [22:26] lifeless, yes please. [22:26] Does lp-propose do prereq branches? [22:26] rockstar: actually, we can have a chat now if you like. i don't have to do it for another 20 mins or so [22:26] wallyworld, okay. [22:26] lifeless: so is that why it forks its own children? So that it can ensure clean state? [22:27] jam: yes, it forks its own child when it hits 'NotImplementedError' in a layer tearDown [22:27] best feature evar [22:27] rockstar: skype? [22:27] wallyworld, yea, I'm up. [22:27] and depending on where its at in the test process hitting that error means it won't tear down other layers, so a month ago you'd have had multiple librarians, memcached, etc all running too [22:28] any ideas on the "No module named mailman.monkeypatches.*" stuff? I did just run "rocketfuel-get" and "make" [22:28] jam, try 'make clean && make' just to be safe, and check for conflicts [22:28] jam: rm -rf lib/mailman [22:28] jam: then do 'make [22:29] mwhudson: I can no longer understand your comment in layers.py about twisted signal handling: http://paste.ubuntu.com/513406/ [22:29] lifeless: so you're saying that I need to start this 1+ hour test run from scratch again... [22:29] :) [22:29] jam: its a stale statically-copied mailman element [22:30] I find it interesting that the import facist complains so loudly, yet it doesn't stop anything from running [22:30] jml: it's possible it doesn't apply with twisted any more [22:30] mwhudson: well, aiui, when the reactor is running there *is* a sigchld handler installed [22:31] jml: let me read some code again [22:31] mwhudson: ok. 
[22:32] jml: ah ok [22:32] the point is to make sure that the sigchld handler is not installed while the test_* method runs [22:33] mwhudson: the point of the comment or the code? [22:33] if a test returns a non-fired deferred, trial starts the reactor [22:33] which installs a sigchld handler [22:33] and then stops/crashes/mutilates it again [22:33] which doesn't uninstall the handler [22:33] jml: the code [22:34] jml: i can probably rephrase the comment a bunch [22:34] mwhudson: I thought the code was about undoing the mangling that trial does [22:34] jml: i don't think so [22:34] mwhudson: then save & restore are odd names [22:35] jml: they're about undoing the mangling that reactor.run does [22:35] mwhudson: oh ok, that's what I meant. [22:36] mwhudson: if you could rephrase that comment that would help me, I think [22:38] jml: ok [22:40] jml: twisted no longer raises PotentialZombieWarning [22:40] mwhudson: so we can forget about the comment? [22:40] well, simplify it a lot [22:40] heh [22:41] i think we suppressed the warnings in a few tests, but i deleted the suppressions when i upgraded to 10.1 [22:44] jml: it's still a bit of a novel: http://paste.ubuntu.com/513424/ [22:45] mwhudson: that's good, thanks. [22:46] mwhudson: so if we replaced our Twisted tests with something that always ran the reactor, basically we'd be unable to use tachandler as-is [22:46] Project devel build (108): FAILURE in 3 hr 42 min: https://hudson.wedontsleep.org/job/devel/108/ [22:46] * Launchpad Patch Queue Manager: [r=edwin-grubbs][ui=none][bug=347218] Allow a project's or [22:46] distribution's bug supervisor to set the official bug tags for it. [22:46] * Launchpad Patch Queue Manager: [r=gary][ui=none][bug=628510] Fixes bug 628510 by overriding the [22:46] default permission value for the oops file and oops dir when [22:46] those are created. 
[22:46] <_mup_> Bug #628510: OSError at /oops.py/ when using lp-oops [22:47] jml: yeah [22:47] "IOError: [Errno 28] No space left on device" [22:47] \o/ [22:47] yeeeeeeeeeeeha! [22:48] rockstar: just checked - old version has same issue :-( [22:50] wallyworld, okay. Maybe the problem IS in our code. [22:50] * rockstar sads [22:51] mwhudson: ok, thanks. the testtools/twisted thing I'm coming up with does the signal save/restore, but always runs tests in the reactor [22:51] mwhudson: so I guess I'm going to have to fix tachandler [22:52] jml: ok [22:52] jml: i think there is something in twisted that is supposed to do signal handling in a less interruptive fashion [22:53] jml: but i don't know the details and it didn't seem to work for the buildd-manager so ... [22:53] mwhudson: maybe I'll be forced to write APIs for twistd [22:54] jml: woo === matsubara is now known as matsubara-afk [22:57] I would like some more days [22:59] hello jml [22:59] poolie: hello [22:59] poolie: do you have some days that I can use? [23:10] jml: you can use saturday and sunday :) [23:10] thumper: you'd think so [23:10] thumper: but they are decreasingly available to me for programming [23:12] jml, octember's looking good [23:14] :) [23:14] poolie: Cool, thanks. I'll see what I can do. [23:21] * jml -> bed [23:21] g'night === _thumper_ is now known as thumper