[00:01] wgrant: yes, we should celebrate this joyous occasion. [00:02] Although the branch is a bit of a read :/; [00:10] LOSA ping: Can someone please run http://paste.ubuntu.com/511984/ on prod? We seem to have more corrupt builds like the ones we found when trying to open Natty. [00:10] wooo [00:10] We had to fix those quickly, so the evidence was destroyed. [00:11] So it is probably... good... that there is more breakage. [00:11] FSVO 'good' [00:11] wgrant: (4510 rows) but they're all looking like " 4789 | | |" [00:11] (missing BuildQueues in both cases) [00:11] Er. [00:11] Did I miss a join somewhere... [00:12] Whoops. Left out the WHERE clause in that version of the query. [00:12] WHERE sourcepackagerecipebuild.id IN (4256, 4257); [00:12] *Two* rows. [00:13] ha [00:14] id | id | id | id [00:14] ------+------+---------+---- [00:14] 4256 | 4255 | 6311021 | [00:14] 4257 | 4256 | 6311022 | [00:14] wgrant: ^^ [00:14] Precisely as I suspected. [00:14] Thanks. [00:14] Now to work out why we have various builds without BuildQueues. [00:20] How I hate this model. [00:22] thumper: https://devpad.canonical.com/~stub/ppr/lpnet/latest-daily-pageids.html [00:24] How far do buildd-manager logs go back? [00:29] A long while [00:29] * StevenK goes for breakfast before his stomach breaks out on its own [00:34] rooooar [00:34] holy cow [00:34] hwsubmission || 525012.40 tuples/sec [00:35] 'release time anyone' ? [00:35] except, its not writes. hmmm [00:36] Yeah, that can't really be writes, as it would go through the last released count of Ubuntu users in 24 seconds. [00:36] wgrant: s/count/estimate/ [00:36] :-P [00:37] elmo: True that. [00:52] thumper: -> https://bugs.edge.launchpad.net/bzr/+bug/93609 <- poolie: [00:52] <_mup_> Bug #93609: Better error messages for bzr lp:// [00:53] wgrant: Do you need logs? [00:55] StevenK: I'd like to see what the logs have to say about BinaryPackageBuilds 1968526, 1968544, 1969032, and SourcePackageRecipeBuilds 4256, 4257. [00:55] Because they somehow ended up NEEDSBUILD without a BuildQueue. [00:55] wgrant: How far back are we talking? [00:55] StevenK: The BPBs are a little over three weeks ago. The SPRBs 11 days ago. [00:57] In two months I will be able to stop pestering people for logs, yay. [01:09] wgrant: in that losas aren't people? or??? [01:09] Heh. [01:36] losa ping [01:37] I need someone to delete "hard_timeoutpageid:BugTask:+index15000" from https://staging.launchpad.net/+feature-rules [01:39] lifeless: you have been unfeatured [01:41] thank you [02:08] I *so* want to not get mail for invalid bug tasks. [02:09] There are two types, though. [02:09] Of Invalid task. [02:10] lifeless: do i also need to ask for a separate review in #launchpad-reviews for that menu branch? or is your stamp of approval enough? [02:12] wallyworld: IIRC userdict provides better hook points than dict, though its not really relevant here [02:12] wgrant: bad and worse? [02:12] wgrant: I've clickly clicked for you [02:12] bah [02:12] wallyworld: [02:12] lifeless: yo [02:13] ^ 3 lines [02:13] ack. so i'll leave the code as is... [02:13] lifeless: "This isn't a valid bug", and "This isn't a valid bug here" [02:13] uhm [02:14] I think you're conflating task and bug [02:14] something like : a bug is invalid if all tasks are invalid [02:14] It is necessary to conflate them. [02:15] ‽ [02:15] We have only per-task statuses. [02:15] But, as you say, a bug could be considered invalid if all its tasks are. [02:15] The interactions with notifications may depend on both.
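The paste wgrant asked the LOSAs to run has long since expired. Judging from the four-id output pasted above, it was a LEFT JOIN hunting for builds whose BuildQueue row had vanished; a hypothetical reconstruction follows — the join path and column names are guesses from the conversation, not the real Launchpad schema:

    # Hypothetical reconstruction of the expired paste: find builds stuck in
    # NEEDSBUILD that have lost their BuildQueue row.  Table and column names
    # are guessed from the chat, not taken from the actual schema.
    import psycopg2

    conn = psycopg2.connect("dbname=launchpad_prod")  # placeholder DSN
    cur = conn.cursor()
    cur.execute("""
        SELECT sourcepackagerecipebuild.id,
               packagebuild.id,
               buildfarmjob.id,
               buildqueue.id
          FROM sourcepackagerecipebuild
          JOIN packagebuild ON packagebuild.id = sourcepackagerecipebuild.package_build
          JOIN buildfarmjob ON buildfarmjob.id = packagebuild.build_farm_job
     LEFT JOIN buildqueue ON buildqueue.job = buildfarmjob.id
         WHERE sourcepackagerecipebuild.id IN (4256, 4257)
    """)
    for row in cur.fetchall():
        print row  # a NULL in the last column means the BuildQueue is missing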
[02:16] well, for my needs, if I'm structurally notified via an invalid task, i don't want to be. [02:16] But what if the user is arguing that the bug is valid? [02:16] You do want to be notified. [02:16] hi lifeless [02:16] 5-digit bug hey [02:16] i agree that looking at the cases is good [02:16] wgrant: can't they toggle it to new? [02:17] lifeless: That's impolite. [02:17] poolie: hey, so, thumper and I were talking. [02:17] poolie: I promised to file a bug describing a design to get better messages out with hopefully backwards compat [02:17] poolie: and suggested that spiv would be a great person to look at the detail. [02:17] poolie: when I went to file it I found such a closely aimed bug that I reused it. [02:18] wow [02:18] 1026 / 119 Distribution:+ppas [02:18] ^- unhappy pages R us. [02:18] Yeah, known 8.4 regression. [02:18] Not sure of the details. [02:18] yeah [02:18] But there are a few bugs on it. [02:18] count(*) [02:18] we've dupped them all I think [02:19] bad query is listed in the remaining active [02:19] yep, i thought a bug would be good too [02:19] and that's it [02:20] I don't see the bug. [02:20] * wgrant searches other projects. [02:20] https://bugs.edge.launchpad.net/launchpad-project/+bugs?field.tag=timeout [02:20] wgrant: ^ [02:20] Ah, on launchpad. [02:20] fixed [02:21] such a nuisance [02:21] I was trying to fix it. [02:21] But it kept using the non-AJAX form. [02:21] Something odd is going on. [02:21] you clicked too quickly [02:21] gotta wait for yui to initialize and overlay [02:21] I thought so. [02:21] But apparently not. [02:21] odd indeed then [02:21] I can wait a few seconds after the status picker works, and the project one still doesn't. [02:22] oh, it never has [02:22] Hm? [02:22] there isn't an ajax project picker is there? just the drop-down if you click on the expander [02:22] There is an AJAX project picker. [02:22] I've never seen it :P [02:22] It's often buggy, but it usually at least appears. [02:36] what does str(Person) do (when Person is a Launchpadlib object) [02:41] lifeless: IIRC str()ing an Entry will give you its URL. [02:41] blah, thanks. [02:42] Ah, now I see the relevance. [02:43] wgrant: ? [02:44] The qa-tagger bug. [02:44] ah yes [02:44] I really should add that to my sieve script, since it ends up in my inbox. [02:44] open sourcing by incremental pastebin :P [02:45] I learnt a lot about LP through dogfood back when it had public tracebacks :P [02:45] But then Julian fixed that :( [03:03] aw [03:03] i wonder why not enable them? [03:04] They leak data. [03:04] in variable values etc? [03:04] Although not much worse than API exceptions can do. [03:04] The most obvious one is object reprs in Unauthorized exceptions. [03:04] But there are probably others. [03:09] As an odd question, I wonder why ohloh doesn't have any updates from Launchpad branches since Jan [03:09] really? [03:10] you could mail them [03:10] I'm checking again [03:10] Hmm. I have a PPA build finished 13 minutes ago and still not published [03:11] Last commit it shows was 4 months ago, but the code metrics seem much older [03:15] 18 minutes :-/ [03:16] The publisher runs every 20 minutes, so that is the worst case, right? [03:16] The *PPA* publisher used to run every 5 minutes, last time I was told [03:18] maxb: Which PPA? [03:19] bzr-beta-ppa/ppa - it does seem to have just published now [03:59] lifeless, ping, if you have a moment, I am in need of some help with a zope-related implementation question [04:02] or thumper, if you have a moment to spare?
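wgrant's recollection about str() on launchpadlib entries is easy to check from a Python prompt. A quick sketch — login_anonymously and the 'production' service-root alias are standard launchpadlib, but whether str() really renders the URL is only as solid as the IIRC above:

    # Minimal launchpadlib session to see what str(entry) gives you.
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_anonymously('str-entry-test', 'production')
    person = lp.people['lifeless']
    print person.self_link  # the entry's canonical URL
    print str(person)       # per wgrant's IIRC, this also yields the URL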
[04:04] mars: whazzup? [04:04] Hi thumper, I have a question about storing state for the duration of a Zope request [04:04] right... [04:05] and what is the question? [04:06] The profiling code needs to store the active profiler information for the duration of the request. I could throw that information onto the request object itself, or take the current approach of using a module-level threading.local() [04:07] One approach turns the request object into a cesspool, the other uses a sort-of-magic secret module level variable - is one approach preferred over the other? [04:07] mars: does this relate to the existing timeline code? [04:08] no [04:08] mars: my suggestion would be to do what the timeline code does [04:09] mars: the request object is already a cesspool [04:09] heh, ok [04:10] thumper, where would I find the timeline code? [04:10] mars: I think there is a part of the request object used to store "stuff" [04:10] mars: lp.services ? [04:10] right [04:13] thank you for the help thumper [04:13] mars: hi [04:13] mars: webapp.adapter [04:13] mars: as thumper says, see what lp.services.timeline.requesttimeline does [04:14] (which is to shove it into webapp.adapter and cry with XXX's.) [04:14] lifeless, cool, thanks [04:14] mars: doesn't the profiler already work though? [04:14] [I'm interested in what you're hacking on] [04:15] lifeless, the profiler worked when using lazr.config. The setting was static for the duration of the request, so profiling was either on or off [04:15] mars: are you working on flags to enable profiling? [04:16] mars: if so, i think it still needs to be either on or off, just evaluated earlier. [04:16] lifeless, but feature flags are set and cleared with request events, and lo, so is profiling. If the features get nuked before the profiling end hook is run, no profile is generated [04:17] I am dealing with an event processing order bug [04:17] mars: sure, but wouldn't 'just' caching the result of 'should we profile' be all thats needed (and in fact I thought that that is what it did?) [04:17] win 67 [04:18] lifeless, no, it did not cache the 'profiling is on' flag (unless you count the aforementioned thread-local storage as the flag) [04:18] lifeless, and I was asking about the best implementation for said cache [04:18] either thread-local storage, or a request attribute [04:19] end_request does this [04:19] _profilers.profiler is not None [04:19] so it caches (by virtue of using an object) [04:20] mars: I expect you'll have a great deal of trouble changing the cache implementation for profiling [04:20] until we fix two bugs: [04:20] hmm, ok [04:20] - many tests call in-request-context-code outside of a request-context [04:20] - scripts don't establish a request-context in the same way the webapp does [04:21] the second point won't really affect you atm as we don't profile scripts, but the first one will hurt - a lot- [04:21] so if I change this then I'll probably trip over yet more test-related bugs [04:21] see the 'except AttributeError:' on line 78 [04:21] I'd be delighted to see the root issue here fixed. [04:21] but I don't know if you have the time budgeted for that :) [04:22] this has rotted on the kanban board already (as I am sure you and Martin have noticed). I think I'll just try and land it :) [04:22] ok, so perhaps I can help solve your immediate issue. [04:22] what is it ?
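The two options mars is weighing look roughly like this. A minimal sketch of the module-level threading.local() approach the timeline code already uses — all names here are illustrative, not the real lp.services code:

    # Per-request state via a module-level threading.local(): each request is
    # served by one thread, so thread-local storage is per-request storage.
    # Names are illustrative; this is not Launchpad's actual profiling code.
    import threading

    _profilers = threading.local()

    class Profiler:
        """Stand-in for the real profiler object."""
        def stop_and_dump(self, request):
            pass

    def start_request(profile_requested):
        # IStartRequestEvent handler: remember whether this request profiles.
        _profilers.profiler = Profiler() if profile_requested else None

    def end_request(request):
        # IEndRequestEvent handler: emit the profile if one was collected.
        # getattr() tolerates code running outside a request context, the
        # test-suite problem lifeless warns about above.
        profiler = getattr(_profilers, 'profiler', None)
        if profiler is not None:
            profiler.stop_and_dump(request)
            _profilers.profiler = None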
[04:23] well, I think it is solved then - I just have to add a comment string or two to that thread-local storage to make it less magic [04:23] name it _active_profilers or some such obvious thing, etc. [04:24] mmm [04:24] there are no inactive_profilers [04:24] well, _active_profile then [04:24] if _active_profile.actions = None [04:25] mars: if you come up with a good name; great. Myself, I'd just add a comment to the assignment and the module docstring. [04:25] anyway, something like that [04:25] lifeless, yes, will do [04:25] lifeless, thank you for the help, and for pointing past the minefield [04:28] de nada [04:39] If I have a db-devel-based branch that will land in devel after the release, how should I propose it? With db-devel as a prereq? [04:40] propose it to devel [04:40] after the release, the diff will be normal :) [04:41] If I somehow convince it to update after the release, sure. [04:41] I guess I could just wait until after the release. [04:41] one day [04:41] Although I have to wonder why we don't merge db-devel into devel at the time devel freezes. [04:41] I suggested that on the list ;) [04:41] Ah, I'm a bit behind at the moment. [04:49] EdwinGrubbs, pong? [04:49] EdwinGrubbs, holiday here today, sorryu [04:49] *sorry [04:56] Ursinha-afk: no problem, we sorted out some issues we were having with the qatagger. It's working now, and we opened bugs for the problems that should be fixed later. [05:10] StevenK: re. Ohloh: The import has been tried a few times, but always fails after working for a couple of weeks. [05:10] StevenK: Their bzr importer doesn't seem to be awfully good. [05:30] * wgrant requests that it be kicked. [05:49] I wonder if the regular failures are because of the frequent LP downtime. [05:50] sush :-) [05:51] Well, we are the only hosting site I know of that has regular downtime... [06:20] lifeless: http://sourcefrog.wordpress.com/2010/10/13/59/ about ssl performance may interest you [06:29] poolie: thanks [06:45] stub: question about the BranchRevision patch… [06:45] ? [06:45] The problem is too much downtime when setting a new primary key, right? [06:46] The change requires creation of a huge new unique index, even though we already have one. [06:47] So I'm wondering: why do we need to declare a primary key in the first place? Does Slony require it? [06:47] Storm does. [06:48] Storm requires us to declare a primary key _to Storm_. But does it require us to declare a primary key _in the schema_? [06:48] Ah, fair point. [06:48] Because AFAIK this whole primary key business is little more than convention, a bit of optional syntactic sugar. [06:49] "Slony-I needs to have a primary key or candidate thereof on each table that is replicated." [06:49] Hm. [06:49] Okay, what does it consider a candidate? Because we have a unique index and we have not-null. [06:50] One can override it to another candidate key. [06:50] So it should be doable, yes. [06:50] (just needs UNIQUE and NOT NULL) [06:52] stub: we already have the index and constraint we need… what do you think? [06:52] jtv: Slony requires a primary key [06:53] It says it requires a primary key or a candidate key. [06:53] stub: see what wgrant just dug up—we can define a primary key there without declaring it in the schema. [06:54] I like to stick to best practices, and from the best practices section " Discussed in the section on Replication Sets, it is ideal if each replicated table has a true primary key constraint; it is acceptable to use a "candidate primary key."
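For reference, the "candidate primary key" Slony's documentation accepts is nothing more than a UNIQUE index over NOT NULL columns. A sketch of what qualifying BranchRevision might look like, with the column names assumed from the discussion (the table's real constraints may already cover this):

    # Assumed DDL: give BranchRevision a Slony-acceptable candidate key
    # without ever declaring PRIMARY KEY.  Column names are assumptions.
    import psycopg2

    conn = psycopg2.connect("dbname=launchpad_dev")  # placeholder DSN
    cur = conn.cursor()
    cur.execute("ALTER TABLE BranchRevision ALTER COLUMN branch SET NOT NULL")
    cur.execute("ALTER TABLE BranchRevision ALTER COLUMN revision SET NOT NULL")
    # NOT NULL plus UNIQUE is all Slony needs to treat (branch, revision)
    # as a candidate primary key.
    cur.execute("""
        CREATE UNIQUE INDEX branchrevision__branch__revision__key
            ON BranchRevision (branch, revision)
    """)
    conn.commit()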
[06:54] Well, it's either use a candidate key or have several hours of downtime... [06:55] there's also the 'make branchrevision2' plan isn't there? [06:55] or has that been shot down for some reason [06:55] That would work too. [06:55] At the expense of a crapload of disk, I guess. [06:55] yeah [06:56] Candidate key is certainly an option, but we need to teach our maintenance scripts about it. [06:56] Everyone bring your spare disks to the DC for Operation Dunkirk. :) [06:57] There is also the ignore-it-until-we-drop-branchrevision route. There might be faster, smaller and better solutions than the big table in the RDB. [06:57] How long until we run out of rows? [06:57] yeah, that's probably a reasonable approach [06:58] wgrant: it's at 600 million ish and runs out at 2 billion [06:58] so, no massive rush [06:58] Oh, I didn't realise it was only 600 million. [06:58] What's the next id though? [06:58] The table is pretty much INSERT only [06:59] good question, but probably not noticeably more than the row count [06:59] Except on branch deletion and overwrite, I guess. [06:59] Highest id is 757259996 [06:59] And aborted transactions, and maintenance. [06:59] True. [06:59] So that's okay. [07:00] according to the graph (which is only an estimate i think) we've added 300 million ish rows in the last year [07:00] Well then, let's just take one bit off the id for now and save 1/32nd of the space :) [07:00] jtv: don't muck about eh? drop database. [07:01] at last we'll have some disk space free [07:01] Also, since the only thing that inserts rows into that table is the branch scanner, we could do magic and have everything live while we do the maintenance and only have the branch scanner disabled. [07:02] stub: I don't suppose there's a way to make the new index be created concurrently like you can with a regular index? [07:02] The new primary key index? No. [07:04] And the *other* other alternative is to talk to the pg-hackers and update the system catalog to promote an existing unique index to the primary key index. [07:06] oooh getting dirty now [07:08] I keep getting this when I run tests on any branch locally: [07:08] Test-modules with import problems: [07:08] lp.services.mailman.tests.test_lpmoderate [07:08] (And one other) [07:08] It can't find blah.blah.mailman.monkeypatches, or something like that? [07:09] That error mysteriously vanished when I reinstalled and did a clean rf-setup. [07:38] wgrant: Ah ha [07:45] wgrant: Just to be clear, you mailed ohloh and asked them to kick their LP import? [07:46] StevenK: I posted in the forum as they suggest. [07:46] They seem to kick them pretty quickly. [07:47] I wonder if they'll have to do another full re-import (which takes like a month) [08:02] stub: http://paste.ubuntu.com/512180/ is what I have so far—running tests now [08:03] jtv: That looks rather evil. [08:04] wgrant: stop, you're making me blush [08:04] Less evil, however, than an email I just received from my university. [08:04] ? [08:04] It was apparently generated by a recent Microsoft Office product, and is a fairly short largely textual email. The text/plain version is <1KiB. The text/html version is 14KiB, contains 53 XML namespace declarations, and 6KiB of CSS. [08:05] It is evil, but also something I was considering :) Certainly needs to go past upstream though - everything that needs to be updated may not be obvious. [08:05] stub: absolutely—I only found out about the pg_index part because \d didn't pick up the change. [08:06] wgrant: that's actually pretty good.
Last time I did that sort of analysis, it went something like 10:1 for some very basic HTML cleanups such as not declaring the font for every paragraph, and then it was another 10:1 for the cleaned-up HTML against the text. [08:10] jtv: 53 NAMESPACES. [08:10] That may be a tad on the high side, yes. [08:11] They probably just couldn't be bothered to see which ones were really needed. [08:11] Indeed. [08:11] After all, who doesn't want flashing Microsoft logos with a half-functional copy of their website embedded in their letters? [08:11] Or something. [08:23] lifeless: When you say 'the OOPS rate has tripled', is that ongoing or are you referring to the spikes during and just after the upgrades? [08:27] stub: its ongoing [08:27] Hmm... wasn't aware it was that much. [08:28] daily oops reports - we were at a few hundred, we're at 1.mumble K [08:28] its noisy too, not flat [08:28] +ppas and the person picker could well make up most of that, though. [08:28] Or maybe we have just tripled our throughput - the DB load is still half what we used to see and I don't know why ;) [08:29] stub: we should normalise by requests/day - I have a bug open for that on the oops summaries [08:29] ok, tuolumne is doing some -weird- aggregation here - https://lpstats.canonical.com/graphs/OopsEdgeHourly/20100914/20101014/ [08:30] lifeless: Oh... and we might be getting a lot of OOPS reports from newly enabled cronscripts, including the doubling up of OOPSes from some scripts. If we just look at timeouts though, the cronscripts won't contribute to that. [08:30] stub: the upgrades - https://lpstats.canonical.com/graphs/OopsLpnetHourly/20100914/20101014/ [08:32] stub: I misspoke, I should have said 'timeout rates have tripled' [08:33] Yer, and those spikes helpfully making the rest of the graph unreadable :-P I agree it looks like hard timeouts have increased significantly, and user generated errors too [08:33] stub: the ppa search is a major one [08:33] So this was for pre-release, so we can't blame increased traffic from 10/10 [08:33] 1000 a day [08:34] I haven't followed the PPA search issues - do you have a bug # handy? [08:34] https://bugs.edge.launchpad.net/soyuz/+bug/659129 [08:34] <_mup_> Bug #659129: Distribution:+ppas timeout in PPA search [08:34] I tagged it dba :) [08:34] (only this morning though - not digging at you) [08:38] isn't SourcepackagePublishingHistory a big table ? [08:40] Maybe 10-15 million records? [08:40] yup [08:40] SELECT DISTINCT archive FROM SourcepackagePublishingHistory WHERE status IN (2, 1) [08:40] Is that the package cache updater? [08:40] thats the ppa search query [08:41] That can't be the whole query. [08:41] no, its an IN clause [08:41] wgrant: see the bug I linked [08:41] But still, that's indexed. [08:41] Time: 1265.807 ms [08:42] 6690 rows [08:42] its not an obvious cause for 15 second queries, but its a tad slow. [08:42] whats annoying is that the query in the OOPS runs in 1.5s on staging [08:43] so I can't reproduce [08:54] stub: how long does it take on the prod slaves? [08:55] Just over 2 seconds [08:57] >< [08:57] on all ? [08:57] Yup. I can run the query and return the 6733 results to Bangkok in 2 seconds. [08:58] damn [08:59] 2.7 seconds on a cold slave [08:59] actually hitting up https://edge.launchpad.net/ubuntu/+ppas?name_filter=munin times out though [09:00] That would be doing a different query, or is the filtering done client side?
[09:00] slightly different yes [09:00] (Error ID: OOPS-1747ED330) === almaisan-away is now known as al-maisan [09:06] waiting for that to sync [09:06] Hello [09:07] hi === al-maisan is now known as almaisan-away === almaisan-away is now known as al-maisan [09:12] morning jml [09:13] grrr oops syncing :< [09:13] lifeless: good morning [09:17] mwhudson: thanks for doing the branch-distro thing [09:18] jml: it's probably still failing intermittently [09:18] mwhudson: I had been meaning to ask, the failures in the bug report, do they require manual intervention? [09:18] jml: no [09:19] actually seems to be going currently [09:19] jml: well, the script needs restarting, that's all [09:19] mwhudson: what's the intermittent failure then? [09:19] mwhudson: ahh... that's manual enough for me :) [09:19] jml: intermittent, as in, until the losas notice [09:19] lifeless, wgrant: Huh. SourcepackagePublishingHistory is only 1.2 million records. I thought it was bigger too. [09:20] Think I must have been confused with BinarypackagePublishingHistory, which is 11 million [09:20] mwhudson: I can't tell you how keen I am to make this much a fire-and-forget operation for next time [09:21] jml: well, you saw my advice on that front i guess [09:21] mwhudson: rather than a multi-person project [09:21] jtv: Hi. Edwin asked me to check with you about the bug you're still qa-ing. Any news? [09:21] mwhudson: yeah, thanks [09:21] gmb: yes, the build farm is now finally up and running. And now something else is broken. [09:22] Ah. Doubleplusungood. [09:22] gmb: I'm leaving out some other breakages for brevity. [09:22] losa ping - where is OOPS 1747ED330 ? [09:23] lifeless: erm, context? [09:23] jtv: Anything i can do to help? [09:23] stub: Er, yes, I mixed them up. [09:23] Oops. [09:23] mthaddon: sorry; we're trying to track down the highest timeout since pg8.4 [09:23] mthaddon: thats an OOPS I triggered 24 minutes ago [09:23] but lp-oops isn't showing it [09:23] lifeless: and what do you mean by "where is"? which server is it on, or you can't find it? [09:24] mthaddon: I guess I mean 'has it rsynced across, has the injection into lp-oops completed' - I'm very vague about where to look to figure this out myself :( [09:25] oh great, edge4 and edge5 are using the same OOPS prefix [09:25] lifeless: you should familiarise yourself with the OOPS prefix system I think - very useful for tracking this kind of stuff down [09:26] mthaddon: I'll file a bug about the fact the overlap wasn't detected [09:26] mthaddon: I rewrote part of it recently ;) [09:26] mthaddon: but the operational deployment is a bit different [09:27] That 'stable has test failures' in the topic is old now, right? [09:27] I don't know :) === stub changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 4 of 10.10 | PQM is Release-Critical; devel is closed (Release manager: EdwinGrubbs) | firefighting: - | https:/​/​dev.launchpad.net/​ | Get the code: https:/​/​dev.launchpad.net/​Getting [09:28] Its a lie anyway - if there are failures nobody is fighting the fire ;) [09:28] lifeless: I don't see an OOPS with that ID on either edge4 or edge5 :/ [09:29] jml: in other news, a longer lead in time on new dev would perhaps allow a more thorough qa, which *might* have caught the other problem [09:29] mthaddon: filed https://bugs.edge.launchpad.net/launchpad-foundations/+bug/659752 about the overlap aspect [09:29] <_mup_> Bug #659752: duplicate oops prefixes are not detected in production-configs [09:29] mwhudson: I don't quite get it. 
[09:30] jml: thumper tested his branch on like 1% of package branches [09:30] if he'd tested on 10%, it might have found the bug i'd reported [09:30] (probably not though) [09:31] it seems there were some maverick branches that hadn't been scanned [09:31] lifeless: I'm not really sure how else to track this OOPS down if it doesn't appear on either of the two servers with the "ED" OOPS prefix [09:31] I poked them to get scanned [09:31] which seemed to make them continue [09:31] we are just of 13K through [09:32] when it fails, it fails after the restacking [09:32] so the failing ones just need to be scanned [09:32] mthaddon: fair enough [09:32] thumper: one of the branches i poked at seemed broken [09:32] openoffice.org? [09:32] mthaddon: I'm assuming that you mean you looked on the appserver itself when you say that ? [09:32] no, the one in the bug report [09:32] mthaddon: if so thank you very much for prodding. [09:32] ok [09:32] i didn't poke very hard [09:32] mthaddon: How about 1746B1391 which also hasn't synced? [09:33] * mwhudson back to cheesy drama tv [09:33] lifeless: yeah, looked on both appservers edge4 and edge5 [09:33] mthaddon: do you need a merge proposal to allocate a new prefix for edge5, or are you doing that? [09:33] lifeless: I'll take care of it [09:33] mthaddon: ok, thanks. [09:34] stub: don't see that one on lpnet2 either :( [09:35] mthaddon: I have a note here to check that we're still on target for the 18th for golive of qastaging [09:36] lifeless: there have been some delays because the DB build for it was causing blocking problems for the staging DB [09:36] lifeless: so I'm not sure if we'll still make the 18th [09:37] lifeless: So the PPA search is still timing out on the COUNT() in the bug report, and that query takes 40+ seconds, and I've added the solution to the bug report. Would be nice to know we are not throwing OOPS reports away though ;) [09:37] stub: didn't you just try the count() and have it fast on prod etc? [09:38] (it was fast on staging) [09:38] I wonder how much of the PPA publisher's time is spent in domination. [09:38] That's *really* slow at the moment, since it blows away the cache a couple of times for each (suite, arch). [09:39] stub: I'm wondering if we're facing a systematic difference between staging and prod making staging unsuitable for evaluating performance [09:39] lifeless: SELECT DISTINCT archive FROM SourcepackagePublishingHistory WHERE status IN (2, 1), which was the preceding query in my scrollback when you asked [09:39] ah [09:39] so on staging [09:39] SELECT COUNT(*) FROM Archive, Person, ValidPersonOrTeamCache WHERE Archive.purpose = 2 AND Archive.distribution = 1 AND Person.id = Archive.owner AND Person.id = ValidPersonOrTeamCache.id AND Archive.id IN ( SELECT DISTINCT archive FROM SourcepackagePublishingHistory WHERE status IN (2, 1)) AND Archive.fti @@ ftq('munin') AND Archive.private = FALSE AND Archive.enabled = TRUE AND (1=1) [09:39] ; [09:39] count [09:39] ------- [09:39] 11 [09:39] (1 row) [09:40] Time: 1204.301 ms [09:40] wgrant: care to whip up a patch for https://bugs.edge.launchpad.net/soyuz/+bug/659129 with stubs tweaks ? [09:40] <_mup_> Bug #659129: Distribution:+ppas timeout in PPA search [09:40] lifeless: I'd love to, but I have five projects due on Monday, so I probably shouldn't. [09:41] stub: \d validpersonorteamcache [09:41] ERROR: column "reltriggers" does not exist [09:41] LINE 1: SELECT relhasindex, relkind, relchecks, reltriggers, relhasr... [09:41] ^ [09:41] wgrant: ok, no worries.
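The chat never shows stub's actual tweak, but his later remark that "the DISTINCT is the killer" points at the subquery: IN (...) already has set semantics, so the DISTINCT only buys an extra sort over the whole publishing table. One way the pasted count could be rewritten — an illustration, not the patch that landed:

    # Illustrative rewrite of the pasted count query: replace the
    # IN (SELECT DISTINCT ...) with a correlated EXISTS so the planner can
    # semi-join instead of materialising and de-duplicating the subquery.
    FASTER_COUNT = """
        SELECT COUNT(*)
          FROM Archive, Person, ValidPersonOrTeamCache
         WHERE Archive.purpose = 2
           AND Archive.distribution = 1
           AND Person.id = Archive.owner
           AND Person.id = ValidPersonOrTeamCache.id
           AND EXISTS (
                SELECT 1 FROM SourcepackagePublishingHistory
                 WHERE SourcepackagePublishingHistory.archive = Archive.id
                   AND SourcepackagePublishingHistory.status IN (2, 1))
           AND Archive.fti @@ ftq('munin')
           AND Archive.private = FALSE
           AND Archive.enabled = TRUE
    """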
[09:42] lifeless: postgresql client package needs to be upgraded on devpad [09:42] 8.3 client doesn't know about 8.4 system table updates [09:43] ah [09:45] and it even tells you that in the client at startup :) [09:47] lifeless: you've been busy duping bugs, thanks [09:48] bigjools: I picked a newer one but it had more detail, hope that is ok [09:48] bigjools: and sorry for the confusion about the % bug [09:48] lifeless: sure thing [09:48] lifeless: no worries. you did make me double check it. Twice. :) [09:49] hah :) [09:49] I know I'm a little scatterbrained.... [09:50] lifeless: btw are you fixing that ppa search timeout? [09:50] stub: so any idea why the count is fast on staging? [09:50] bigjools: certainly not tonight [09:50] it would be nice to make the release [09:50] bigjools: stub has a proposed fix for the query [09:50] I saw, that's great [09:50] bigjools: should be near trivial to apply it (ignore my comment about validpersonorteamcache for now) [09:51] lifeless: btw your comment on the ML thread about not joining via TeamParticipation confused me - it's not needed I don't think? [09:52] bigjools: if you or someone in non-asia timezone can get a fix together it should be possible to make the release. [09:52] though thats only 12 hours away now [09:52] I know that I can't make the release; too tired to be coding right now, and not enough time to run the test suite after I start work before the release ;) [09:54] bigjools: teamparticipation - its needed because 'team X has open subscription' can be achieved two ways: [09:54] - directly and indirectly [09:54] eek [09:54] for direct, the team flag matters [09:54] -> - ops, private matter [09:54] for indirect the same flag on *any team that is a member of the team*. [09:58] lifeless: do you remember where in the code that slow ppa query is? [09:58] * bigjools curses cold fingers [09:59] bigjools: no, I hadn't looked for it [09:59] sorry [09:59] np, was just looking for a time-saver :) [10:00] yeah, I wish I had one for you [10:06] found it [10:08] stub: in your fixed query you say joining to Person is not necessary, but I think it is as it's ordering on Person.name [10:08] bigjools: its not needed for the count(*); this points at a storm issue [10:09] lifeless: interesting - I dunno how it could know that [10:09] neither do I :) [10:10] but the way we use .count() implies that we need to teach it, or to start using separate query definitions for estimation, sizing and results [10:10] we may need to do that anyway. [10:11] stubs comment that the distinct is the key thing suggests that ignoring the Person usage is reasonable [10:13] lifeless: the batch nav would need fixing to do that [10:14] would be a nice fix though [10:14] we can supply fake values for huge sets where preciseness is not useful [10:14] there is an estimator in the code base [10:14] which uses the table stats to make an estimate [10:15] I wonder what someone thought that DISTINCT was doing [10:16] refactored code perhaps [10:17] lifeless: how long are you around for?
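The in-tree estimator lifeless mentions works off the planner's statistics rather than a real scan; the standard PostgreSQL trick looks like this (the helper's actual name and location aren't given in the log):

    # Cheap O(1) row-count estimate from pg_class.reltuples, which ANALYZE
    # keeps roughly current.  Good enough for a batch navigator's
    # "about N results" label on huge result sets.
    import psycopg2

    def estimate_row_count(cur, table_name):
        cur.execute(
            "SELECT reltuples::bigint FROM pg_class WHERE relname = %s",
            (table_name.lower(),))  # unquoted identifiers are stored lowercased
        return cur.fetchone()[0]

    conn = psycopg2.connect("dbname=launchpad_dev")  # placeholder DSN
    print estimate_row_count(conn.cursor(), 'SourcepackagePublishingHistory')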
[10:18] * bigjools grumbles at "make schema" building wadl*3 [10:18] bigjools: if you need me, I can be around [10:18] bigjools: if you don't, given there is a team lead meeting in < 8 hours, I won't be around for long at all [10:18] lifeless: ok I'm just running tests on this trivial change, if you can bless the branch that'd be good [10:19] * bigjools will be 5 mins [10:19] ok [10:19] just doing the MP [10:21] lifeless: https://code.edge.launchpad.net/~julian-edwards/launchpad/slow-ppa-search-bug-659129/+merge/38307 [10:21] rc blessed [10:22] ta [10:22] lifeless: can you r= it as well :) [10:23] no, lp won't let me. [10:23] jml: can you? [10:23] Oo [10:23] bigjools: only one vote type per person [10:23] prob cause the diff is not ready [10:24] we need a rabbit [10:24] You could teach ec2 to interpret 'code release-critical' properly. [10:26] Bigjools: I'll take a look [10:27] ah thanks gmb [10:27] still no diff.... is the scanner foobar? [10:27] Didn't we just branch Ubuntu? [10:28] /o\ [10:28] yes [10:28] but the branch of ubuntu has staggering code [10:28] so, last time the backlog was how many days? :) [10:28] Bye bye scanner for the next month. Unless that thing got CP'd? [10:29] * bigjools back in 5 [10:29] bigjools: How many lines should there be in the diff? [10:29] I still see "updating diff" but there's also a diff... [10:29] I see no "updating diff" any more. [10:29] ... and now the message has gone. [10:30] Right. [10:30] Nice. [10:30] Hurrah for confusing-ui. [10:30] bigjools: Anyway, r=me. [10:31] thumper: Edwin-afk asked me to check on the status of https://code.edge.launchpad.net/~thumper/launchpad/better-errors-for-translate-path/+merge/38178 whilst he slept. Is it in EC2 atm? [10:32] gmb: ta [10:36] bigjools: Keep the Person join if you want - it doesn't slow things that much. The DISTINCT is the killer. [10:36] stub: right. I can't work out why it was needed for an IN query as well [10:37] fix heading to PQM now [10:37] bigjools: It may well have been an optimization. I think PG 8.4 query planner handles IN better now. [10:38] good to know, ta [10:38] lifeless: can I what? [10:53] jml: do a code review (which gmb has done) [10:53] lifeless: good good :) [10:53] halt() [11:01] jtv: So, talk to me (as an Edwin proxy). What's the situation w.r.t. breakages and how likely are these things to become rollout blockers? [11:01] (Note that I have *zero* context besides Edwin saying you had bug that was qa-needstesting) [11:01] gmb: we're still working to get staging fully operational for our purposes, which is kind of a prerequisite for a rollout. [11:02] Ah, right. [11:02] One thing we identified is that the librarian gc grace period needs to be long enough to keep essential files present over the lifetime of a staging copy. [11:03] I don't know if there are any other branches waiting for Q/A that would be affected by this. Possibly not; most buildfarm Q/A can happen on dogfood. [11:03] Can DF not be used? [11:03] Just one thing missing on dogfood: codehosting. [11:03] There's a hack for that.™ [11:04] Oh, but I guess you need the scanner. Sad. [11:04] Well we need to trigger an ITipChanged event and watch it fire off a buildfarm job. [11:04] Exactly. [11:06] You can't QA the scanner on staging and manually create the records on DF? [11:06] gmb: it's _probably_ safe to roll back my revision, but again we can't verify that assertion in staging's current state. [11:06] jtv: Right. Deep joy.
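The ITipChanged event jtv and gmb need fired is a plain zope.event notification; schematically it is no more than the following — only the interface name comes from the log, so its real module path and attributes are assumptions:

    # Schematic only: how a branch-tip-change notification reaches
    # subscribers such as the template build-farm job creator.
    from zope.event import notify
    from zope.interface import Attribute, Interface, implements

    class ITipChanged(Interface):
        branch = Attribute("The branch whose tip revision changed.")

    class TipChanged:
        implements(ITipChanged)

        def __init__(self, branch):
            self.branch = branch

    # The branch scanner would do roughly this after scanning new revisions;
    # zope.component subscribers registered for ITipChanged then run.
    notify(TipChanged(branch=object()))  # placeholder branch object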
[11:07] jtv: So, once staging's operational for you, your plans are to do QA and / or check that you can roll back safely? [11:07] * gmb stabs wildly in the dark [11:07] gmb: a brilliant guess [11:07] I'm still betting on forward rather than back. [11:08] Cool. [11:09] bigjools: are you going to be working on the buildd-manager branch of death today? [11:11] * jml pokes network drivers… again [11:11] jml: otp, but yes === jtv is now known as jtv-afk [11:14] bigjools: cool. just trying to plan my day. [11:30] jml: I'll start on it again in about 30 mins, do you want to join in? [11:38] bigjools: sure [11:39] great - just wading through the mountain of email first [11:39] bigjools: That branch made me scream, but then I noticed that you killed buildd-slavescanner.txt, so all was forgotten. [11:39] :) [11:39] it's working on dogfood, did I mention that? :) [11:40] You did. That's good news. [11:40] * jml out arranging lunch – it's all about timing the interruptions. [11:40] argh, this person picker timeout is seriously pissing me off [11:41] lifeless: is that on your radar? [11:58] looks like branch-distro finished? [11:58] * gmb lunches [11:58] mwhudson_: cool! how can you tell? [11:58] Finished, or failed again? [11:58] https://code.edge.launchpad.net/ubuntu/natty/ has a big number on it [11:59] and it's not changing [11:59] i guess it might have fallen over again [12:03] gmb: Revision: 9885 of db-devel, but buildbot seems down [12:03] * thumper -> bed [12:03] bigjools: Are we going to need something like CustomUploadPublishingHistory, maybe? [12:04] no idea, I've not thought about it [12:04] Ah. [12:04] possibly, though [12:05] I think it would probably make a lot of things suck less. But it needs thought. [12:05] Morning, all. [12:30] gmb: Hi! Can you look at https://code.edge.launchpad.net/~stevenk/launchpad/db-add-derivedistroseries-api/+merge/38189 again? I've made the changes you suggested. [12:57] thumper, lifeless: do you know if bug 658124 can be marked qa-ok? It's on staging already, but stub is not around. [12:57] <_mup_> Bug #658124: Revision karma allocator glacial and needed to be disabled [13:00] jtv-afk: how goes QA? [13:05] EdwinGrubbs: both are asleep. It's 1am in NZ [13:06] jml: you wouldn't happen to know about thumper's two branches that are release-critical approved but not landed? [13:07] EdwinGrubbs: not off the top of my head. [13:08] EdwinGrubbs: I tried to check in with thumper but he'd already left. [13:09] EdwinGrubbs: jtv-afk's situation is thus: [13:09] jtv: So, talk to me (as an Edwin proxy). What's the situation w.r.t. breakages and how likely are these things to become rollout blockers? [13:09] (Note that I have *zero* context besides Edwin saying you had bug that was qa-needstesting) [13:09] gmb: we're still working to get staging fully operational for our purposes, which is kind of a prerequisite for a rollout. [13:09] Ah, right. [13:09] One thing we identified is that the librarian gc grace period needs to be long enough to keep essential files present over the lifetime of a staging copy. [13:09] I don't know if there are any other branches waiting for Q/A that would be affected by this. Possibly not; most buildfarm Q/A can happen on dogfood. [13:09] Can DF not be used? [13:09] Just one thing missing on dogfood: codehosting. [13:09] There's a hack for that.™ [13:09] Oh, but I guess you need the scanner. Sad. [13:09] Well we need to trigger an ITipChanged event and watch it fire off a buildfarm job. [13:09] Exactly.
[13:09] You can't QA the scanner on staging and manually create the records on DF? [13:09] gmb: it's _probably_ safe to roll back my revision, but again we can't verify that assertion in staging's current state. [13:09] jtv: Right. Deep joy. [13:09] jtv: So, once staging's operational for you, your plans are to do QA and / or check that you can roll back safely? [13:09] * gmb stabs wildly in the dark [13:09] gmb: a brilliant guess [13:09] I'm still betting on forward rather than back. [13:09] EdwinGrubbs: Not sure if that helps you any. [13:11] gmb: well, it confirms that it is as bad as I thought. [13:14] What's actually broken about staging? [13:15] No idea. [13:16] I see a TTBJ in progress there as we speak. [13:16] Ah, it just restarted. [13:16] Well. [13:16] I think the chroot is missing. [13:16] Easy fix. [13:18] Yes, the staging DB just needs to have a new chroot thrown at it. [13:21] So, unless there's some issue deeper than the obvious fatal one, it should take roughly two commands and ten seconds to fix. [13:23] one of these days I'm going to think seriously about the build farm UI [13:24] Oh? [13:25] e.g. that end users care about https://edge.launchpad.net/builders and actually check the page smells wrong to me [13:27] I think we should wait and see what happens now that multi-day queues are rare. [13:27] well I'm hardly proposing immediate action [13:27] a page that showed the queue, otoh, I could see being useful to end-users and operators alike [13:28] That sounds like my bug #155758 [13:28] <_mup_> Bug #155758: Global PPA +builds would be useful === jtv-afk is now known as jtv [13:30] hi gmb, hi EdwinGrubbs, I'm back from dinner [13:30] wgrant: right [13:31] jtv: I'm hoping staging works for you now. [13:31] no, not yet [13:31] wgrant: I guess in general I'm thinking that we should think about operator UI and user UI as being two distinct things that only sometimes overlap [13:32] jtv: Is there more to the staging breakage than the obvious chroot LFA expiration? [13:33] wgrant: I was about to ask how mthaddon had fared [13:33] mthaddon: staging is back… have you tried running those commands to upload the new chroot again? [13:45] jtv: I haven't no [13:45] mthaddon: could you? [13:45] Just grabbing the current prod one and manage-chroot.py'ing it in? [13:46] running it now [13:46] jtv: and done [13:46] mthaddon: yay! Thanks. [13:47] Success. [13:47] mthaddon: and now it's producing output. I'll throw in another "thanks" for good measure. [13:47] yer welcome [13:47] wgrant is obviously watching the same page I am. :) [13:48] Indeed. But I'm a little confused as to how clementine is involved. I thought that was used by the server team for stuff. [13:49] wgrant: are you saying it's not supposed to be a staging builder!? [13:51] EdwinGrubbs, gmb: yay! qa-ok. Only took a few seconds once everything was set up. :) [13:51] jtv: yipeee [13:51] I.. don't know. === al-maisan is now known as almaisan-away [13:53] wgrant: I've had clementine for these jobs before. Can't say if that was IS's intention, but that's how it's been working. [13:56] clementine is the staging builder [13:56] phew [13:56] and build it did. [13:57] I got my template. [13:57] Shouldn't feel like such an achievement for something we've already had in production for a while, should it? :) [13:57] So many moving parts. [13:57] the build farm is complicated... 
[13:59] ok jml I am about to start hacking on the BFH [14:02] jam: heh, I just googled for something in libpqxx and found it on arbash-meinel.com :) [14:04] bigjools: \o/ [14:05] gmb: Did you see my gentle prod? [14:06] gmb: You did, never mind. Thanks for the review. [14:08] wallyworld_, rockstar: ping [14:12] wgrant: https://edge.launchpad.net/ubuntu/+source/linux-lts-backport-maverick [14:12] spot the problem [14:12] bigjools: Already commented. [14:13] :) [14:13] your diagnosis is the same as mine [14:16] bigjools: Right, clementine is clearly a staging builder... but I also saw it being used to publish Maverick's EC2 AMIs. So I'm a little confused. [14:16] Maybe there are two clementines. [14:20] * jtv suppresses mumbled remark about her sister [14:20] —having listened to far too much Tom Lehrer for his own good === Ursinha-afk is now known as Ursinha [14:52] rockstar, abentley: ping [14:52] EdwinGrubbs: pong [14:53] abentley: can you QA bug 634149 and bug 639785 that thumper landed? [14:53] <_mup_> Bug #634149: Poor error messages from the smart server [14:53] <_mup_> Bug #639785: InvalidProductName when trying to access some branches in LP [14:53] allenap or gmb, how should I run test_bugtask_status? I get LayerIsolationError about the librarian being killed or hung. [14:54] I've tried all manner of ./bin/test with test_bugtask_status [14:55] deryck: Can you paste the error? I would use bin/test -vvct test_bugtask_status (and I have recently because I've changed it). === almaisan-away is now known as al-maisan [14:55] EdwinGrubbs: I'm not sure. The merge proposal doesn't describe how to demo the fix. [14:56] deryck: What allenap said. [14:56] abentley: well, if we can't QA it, I guess we can't release it. [14:56] allenap, gmb -- http://pastebin.ubuntu.com/512400/ [14:57] devel should be up to date for me, but confirming that now. [14:57] abentley, adeuring, allenap, bac, danilo, sinzui, deryck, EdwinGrubbs, flacoste, gary, gmb, henninge, jelmer, jtv, bigjools, leonard, mars, noodles775: Reviewer Meeting in 3 minutes [14:58] deryck: Have you tried removing pid files, or even (gasp) rebooting? [14:58] heh [14:58] gasp indeed! ;) [14:59] deryck: kill all existing librarians then remove the pid file [14:59] deryck: Again, what the other guy said. I can't reproduce locally... [14:59] reproduce *the error* [14:59] it happens to me occasionally [14:59] * bigjools chuckles at gmb [14:59] * gmb saw that coming [15:00] ... [15:00] *sigh* [15:00] Oh, hey, grep works better if you spell things right. Who'da thunk? [15:01] hmmm, no luck. [15:01] rm'ed /var/tmp/testrunner-librarian.pid, still have the error [15:01] don't show the librarian running via ps [15:02] oh, well. Will dig in more after reviewer meeting [15:03] deryck: Try lsof +D $tree_dir to see if anything is still running there. [15:08] nothing running except gvim :-) [15:10] deryck: is that plain devel or do you have some changes? [15:10] a-ha! rm /var/tmp/fatsam.test/librarian.pid did the trick. [15:10] lifeless, plain devel. [15:10] but I've got it now, finally. [15:12] thats very odd; you should have been getting an error from setUp [15:12] (layers.py line 589) [15:20] lifeless! [15:21] having trouble sleeping [15:22] so hey, is https://devpad.canonical.com/~lpqateam/qa_reports/deployment-db-stable.html down for other people ?
I get connection refused [15:25] yeah looks like devpad is misbehaving [15:31] jml: ok just looked at your patch *finally* and it appears I hadn't pushed up the latest revision for you to play with :( [15:31] bigjools: I'm not surprised, it looked wrong :) [15:31] bigjools: but I reckon my patch is good enough for illustration [15:31] yep [15:31] thanks [15:33] Ursinha, matsubara: do you know of any oopses or other problems that might be a rollout blocker? [15:33] abentley, are you still having problems with 'make check' failing on your local system? [15:33] EdwinGrubbs, I'm checking right now [15:33] abentley, in reference to bug 629746 [15:33] <_mup_> Bug #629746: make check fails with "A test appears to be hung. There has been no output for 600 seconds." [15:34] mars: I dunno-- I stopped using it. bin/test has always worked fine. [15:34] abentley, ok, I'll flag it as Won't Fix [15:34] mars: !!! [15:34] mars: Why would you do something like that? [15:35] mars: make check uses test_on_merge, which turns out to be completely different code from bin/test. [15:36] abentley, well, 'make check' is a target intended for automated test suite use. After discussing this with the other build people, we decided that bin/test for local dev was fine, and save 'make check' for the bots [15:37] mars: The difference in behaviour between bin/test and test_on_merge worries me. [15:37] EdwinGrubbs, devpad is down, I can't keep checking right now :/ [15:37] it should be back soon, according to losas [15:37] so I'll have an answer soon [15:37] abentley, it only has a call to 'make schema' in between, and it calls xvfb-run [15:38] mars: last I saw, test_on_merge was its own test runner. [15:38] yes, unfortunately [15:38] jml: I reckon _with_timeout could be a nice decorator in lp.services.twisted [15:38] abentley, it checks the child process' exit status [15:39] abentley, and kills hung tests [15:39] bigjools: interesting thought. it assumes that the Deferred has a nice canceller defined, but that might not be a problem. [15:39] jml: or perhaps a context manager [15:40] mars: my mistake. I guess I saw lots and lots of Python and assumed it was a test runner. [15:41] bigjools: yeah. maybe. I still find it hard to get excited about them. [15:41] mars: still, I wonder-- are these tests not displaying output for 15 minutes and then succeeding on my box? [15:42] jml: I know what you mean, but the code would look much nicer if nothing else :) [15:42] abentley, I would say run the test in isolation with bin/test to see [15:42] no opinion on that yet [15:43] jml: oh I was going to ask - does our version of Twisted have that Deferred cancellation in it for the one callRemote returns? [15:43] bigjools: I don't know off the top of my head. the NEWS file will mention it. [15:44] mars: Do you think testPPAUploadResultingInNoBuilds is the test in question? [15:44] abentley, no, I can't be sure. The other problem with test_on_merge.py is that output from hung tests, etc. gets buffered [15:45] abentley, usually one needs to run the sub-suite with bin/test [15:45] bin/test does the right thing: hangs, then prints to the console when you hit Ctrl-C [15:47] mars: Perhaps you should send SIGINT before signal 15, then? [15:47] hmm, I don't know, maybe [15:48] mars: so by sub-suite, you mean test_ppauploadprocessor? [15:48] yes [15:48] and if that does not reproduce it, step one suite higher in the chain again [15:48] jml: yes, it does [15:49] bigjools: cool.
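jml sketches a signature for this a little further down (def cancel_on_timeout(timeout, d, reactor=None)); filled out, the helper under discussion might look like the following — assuming, as the thread notes, that the Deferred has a sensible canceller, and taking the reactor as a parameter so tests can pass a fake one:

    # A sketch of the timeout helper being discussed: cancel a Deferred that
    # hasn't fired within `timeout` seconds.  Needs Twisted >= 10.1 for
    # Deferred.cancel().
    def cancel_on_timeout(timeout, d, reactor=None):
        if reactor is None:
            from twisted.internet import reactor
        delayed_call = reactor.callLater(timeout, d.cancel)

        def stop_clock(passthrough):
            # The Deferred fired (or failed) first; drop the pending cancel.
            if delayed_call.active():
                delayed_call.cancel()
            return passthrough

        return d.addBoth(stop_clock)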
[15:49] * bigjools hax0rs [15:55] benji, leonardr, I think I have an idea about cleaning up windmill, if that is indeed the problem [15:56] mars: great [15:56] I'll whip up a patch [15:58] jtv: do you have the link? === Ursinha is now known as Ursinha-afk === Ursinha-afk is now known as Ursinha [16:10] jam: what, for the libpqxx stuff? Not anymore… I've got a little design problem with notification listeners in libpqxx so I thought I'd look up who uses them and how. === Ursinha is now known as Ursinha-lunch [16:10] gary_poster: thanks for following up on the storm issue [16:11] np jtv [16:12] jtv: hmmm, when I search for it, I get an old .weave file for a project I used to work on. Interesting to have text-readable repository content get indexed [16:13] jml: btw did you look into how to pass a connection timeout value to the reactor's connect method? [16:14] jam: I think there was mention of university stuff. On an unrelated note, I'm sorry there's so much trouble getting the BranchRevision patch through. [16:14] bigjools: it's still the same as yesterday: patch Twisted or clone & mutate the Proxy class. [16:14] jml: yes I realise that :) Just wondered if you'd done anything. [16:14] bigjools: oh, no. [16:15] I think I found a problem with the decorator approach as well - it's kinda hard to pass different reactors for tests :( [16:16] mars: I can't reproduce the specific delay. I have noticed the test suite seems to hang sometimes and then gets moving again, but I don't usually run with -vv, so I'm not sure where. [16:17] bigjools: good point [16:17] jml: unless we decorate the whole class [16:18] mars: I worry that it might be because getUniqueInteger now has a random starting point. [16:18] bigjools: that seems like one of solutions in search of a problem [16:18] yer [16:18] one of those, I meant to say [16:18] I'm just looking for ways to make the code re-usable. [16:18] jtv: well, modifying a 600M row table and having a hard-timeout on how long the upgrade can run is always going to be a bit of a problem :) [16:19] bigjools: well, it can be a function, rather than a method [16:19] jam: that's the thing—we're not doing anything that should take noticeable time. [16:19] jml: that's the easy approach, yes [16:19] def cancel_on_timeout(timeout, d, reactor=None): [16:19] jam: dropping a table in postgres is at its core just dropping it from metadata, and leaving the existing rows untouched (until the next vacuum I guess—but we vacuum anyway) [16:20] jtv: it is changing the primary key, which requires re-indexing [16:20] jam: that's the stupid thing. No technical reason for that that we can think of. It's just that nobody bothered to allow it yet. [16:21] jam: changing a primary key is a matter of dropping one constraint and adding another, which is really just a not-null plus a unique plus a little marker saying "this is the primary key." [16:22] jam: a primary key has basically no special meaning, except various scripts and slony like to work with the primary key by default. [16:23] jam: so stub came up with a good one… just ram the change into the system catalog. Make it say that there's a primary-key constraint and that the existing index was made for it. But that needs careful deliberation with pg-hackers. === deryck is now known as deryck[lunch] [16:23] jtv: true, though I'd rather work with thumper to redesign the whole logic and turn 600M rows into <1M [16:23] :) [16:24] jam: that's good too :-) [16:24] Is that the "compression" scheme you mentioned in Prague? 
it would be interesting to get a dump of that content, and see how small it can be [16:24] yes [16:27] EdwinGrubbs: qa-ed. [16:31] jam, speaking of that, can we chat? [16:31] abentley: thanks [16:33] jml: success. Now to write tests :) [16:34] bigjools: I don't know whether to be happy or sad. [16:34] jml: DDT [16:34] arf === gary_poster is now known as gary-lunch [16:42] gmb: are we still supposed to write up a rollout report? I don't see one for last cycle. https://wiki.canonical.com/Launchpad/RolloutReports === benji is now known as benji-lunch [16:43] EdwinGrubbs: Yes. https://wiki.canonical.com/Launchpad/RolloutReports/10.09. I just forgot to link to it. === salgado is now known as salgado-physio [16:54] EdwinGrubbs: ping. [16:54] jcsackett: hi [16:55] EdwinGrubbs: how do you feel about altering the codehosting_usage property on projects so it's set to LAUNCHPAD if products for the group are? [16:56] rationale is that we have a bug about showing the unknown code hosting message if there are no branches for the projectgroup products, but it seems odd to test branches when we're looking at usage. [16:56] i'm pinging you, EdwinGrubbs, since you most recently refactored that. (really nicely, btw). === deryck[lunch] is now known as deryck [16:59] alternatively right now i can just test the output of getBranches on a projectgroup and display the unknown message if it's 0; that just seems odd. [17:00] jcsackett: hmmm, maybe projectgroups need a special message when there are no branches to say that you can submit code on the following products that have codehosting turned on. [17:00] leonardr, benji-lunch, here is a patch that should help if the problem is the windmill suite: http://pastebin.ubuntu.com/512467/ [17:01] jcsackett: I assume that if we change the codehosting_usage to LAUNCHPAD for the projectgroup, it will show a bunch of action links that don't apply. [17:01] EdwinGrubbs: fair point; the overview involvement menu will trigger, which is ridiculous. [17:01] * jcsackett hadn't thought that through. [17:01] leonardr, benji-lunch, create a new branch of your current work, apply this patch, push, and run with "ec2 test -o '-t whatever_the_failing_test'", see if the problem persists. If it does not, try a full test run. [17:02] will do [17:03] EdwinGrubbs: i think a message about no branches and list of links to the products makes sense. just displaying the "we don't know" message seems underwhelming to me. [17:03] thanks. === bigjools is now known as bigjools-afk === beuno is now known as beuno-lunch === Ursinha-lunch is now known as Ursinha === benji-lunch is now known as benji === gary-lunch is now known as gary_poster === bigjools-afk is now known as bigjools === beuno-lunch is now known as beuno [18:02] hi guys, what's the story with python-psycopg2 for lp development on maverick? launchpad-dependencies won't install because it requires < 2.0.14 [18:03] g'night [18:03] jml: good night [18:11] barry: manually install the lucid .deb, then install launchpad-dependencies [18:14] maxb: ah. what are the prospects for updating to the maverick version of the package? [18:14] I am not the person to ask about that [18:14] maxb: gotcha, thanks. maybe flacoste, gary_poster or sinzui knows? [18:14] The underlying question is "How soon can we resolve our/Storm's unicode issues and be able to use PsycoPg 2.2?" [18:15] barry, maxb, you need to revert to the older one.
stub has a task to do the update probably in the next month/cycle [18:16] gary_poster: cool, thanks [18:16] np [18:17] bigjools: what was the outcome of testing that query fix ? [18:18] lifeless: nothing else happened [18:18] barry: https://bugs.edge.launchpad.net/ubuntu/+source/psycopg2/+bug/650777 [18:18] it doesn't break functionality so there's no point worrying [18:18] <_mup_> Bug #650777: operator does not exist: text = bytea [18:18] bigjools: ok [18:18] bigjools: works for me [18:18] lifeless: we can do more investigation on the performance later [18:18] ack [18:19] lifeless: thanks. subscribed [18:19] barry: we have lucids version, which works, in the lp PPA [18:19] lifeless: so i just need to uninstall the mav version and it should work? [18:19] there is an argument that 'lp is broken and this just shows it'. [18:20] barry: downgrade yeah [18:20] +1 === matsubara is now known as matsubara-lunch [18:26] lifeless: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1747S227 [18:27] the fixed query is not the one that's timing out [18:27] I smell a problem with the ValidPersonOrTeamCache view [18:28] oh wow, lp-oops openid dance is special now [18:28] 3 continue buttons [18:28] yeah I had to log out/in [18:28] I frickin hate openid [18:29] yeah, the valid* caches have been the source of lots of timeouts [18:29] there's a validity clause for Person [18:29] which you can incorporate into a storm query, guess we need the same for team [18:30] * bigjools heads off for dinner [18:30] laterz [18:30] ciao [18:34] (mars, look over here :-) ) flacoste: do you know if dogfood is on lucid now, or if not, when it will be? [18:35] gary_poster: only a losa or bigjools would know [18:35] ok thanks flacoste [18:35] huh, oh look, I haven't been on canonical irc all day :-P [18:39] lifeless: you mentioned to me a while ago that storm trunk had a bug that would bite us. Do you remember what it was, and do you know if it has been resolved? If it has been resolved, you can even ignore the first question if you want :-) [18:40] lifeless/bigjools: try lp-oops now [18:40] I don't think it has [18:40] lifeless, hi. Are you subscribed to the security mailing list that gets mail for security bugs? [18:40] elmo: thanks [18:40] deryck: I don't know [18:40] lifeless, I posed a question for you in bug 638054, and then realized I wasn't sure if you would see it. [18:41] deryck: I'm structurally subscribed to 'launchpad-project' ;) [18:42] lifeless, but that doesn't get you security bugs. non-obvious as that may be and all. ;) [18:42] huh, well I got the bug mail anyhow [18:42] and I've replied just now [18:52] lifeless, "I don't think it has" was directed to me, I think; thank you. Do you know what the issue was? I'd like to mention it to jamu [18:54] something like [18:54] there was this cache incoherency [18:55] mm [18:55] I know, its an XXX === salgado-physio is now known as salgado [18:55] https://bugs.edge.launchpad.net/storm/+bug/619017 [18:55] <_mup_> Bug #619017: __storm_loaded__ called on empty object [18:56] thats the bug I mentioned, i think [18:56] now to find the fallout [18:56] https://bugs.edge.launchpad.net/storm/+bug/620615 [18:56] <_mup_> Bug #620615: please contact the developers [18:56] we hit that one when we use storm with the earlier one fixed. [18:56] IIRC
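The "validity clause" bigjools mentions above amounts to constraining a person to rows present in the ValidPersonOrTeamCache view. Schematically, in Storm — only the view's name comes from the log; the model classes here are stand-ins, not Launchpad's real ones:

    # Schematic Storm query incorporating the person-validity check.
    from storm.locals import Int, Storm, Store, Unicode, create_database

    class Person(Storm):
        __storm_table__ = 'Person'
        id = Int(primary=True)
        name = Unicode()

    class Archive(Storm):
        __storm_table__ = 'Archive'
        id = Int(primary=True)
        owner = Int()

    class ValidPersonOrTeamCache(Storm):
        __storm_table__ = 'ValidPersonOrTeamCache'
        id = Int(primary=True)

    store = Store(create_database('postgres://user@localhost/launchpad_dev'))
    # Joining through the cache view drops archives whose owner is no
    # longer a valid person or team.
    valid_ppas = store.find(
        Archive,
        Archive.owner == Person.id,
        Person.id == ValidPersonOrTeamCache.id)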
[19:01] deryck: my reply is a little brief - but all the context is in the RT
[19:01] gary_poster: I'm not aware of any other blockers
[19:01] great, thanks again
[19:02] lifeless, ok, thanks for responding. Brief is fine. :-) I'll look at the RT.
=== matsubara-lunch is now known as matsubara
=== al-maisan is now known as almaisan-away
[19:33] abentley: ping
[19:34] jcsackett: OTP
[19:35] never mind then; rockstar, you available for a quick question?
[19:35] jcsackett, on the phone.
[19:37] jcsackett, abentley and I are on the phone together. :)
[19:37] rockstar: that's sort of what i guessed from the paired responses. :-)
[19:37] jcsackett: ok, I think I can answer your question now.
[19:37] EdwinGrubbs: i'm all ears.
[19:38] jcsackett: if you go to https://launchpad.dev/$project/+branchvisibility, you can specify the visibility policy for multiple teams separately. You can also specify the default visibility for people not in one of those teams.
[19:43] jcsackett: back
[19:43] EdwinGrubbs: can anyone with Edit permission for the product use that page?
[19:44] abentley: i'm trying to answer a question that EdwinGrubbs has helped me out with a bit. namely, if a product has branches set as private to one team, can you make branches on the product for another team?
[19:45] jcsackett: I don't know a lot about branch privacy.
[19:45] ah. Edwin thought you might.
[19:46] jcsackett: I don't think that "private to one team" is a thing.
[19:46] different teams can have private branches on the same product
[19:46] jcsackett: I'm pretty sure the enum is "private", "private by default" or "public".
[19:46] jcsackett: I don't think there's a team relationship.
[19:46] the ability to make private branches is a commercial feature
[19:46] abentley: the policy is team based
[19:47] lifeless: i know, i'm trying to answer a question for a commercial team. https://answers.edge.launchpad.net/launchpad-registry/+question/129258
[19:47] he marked it as solved, but has a new question in the comment he closed it with.
[19:47] non-munged link: https://answers.edge.launchpad.net/launchpad-registry/+question/129258
[19:49] EdwinGrubbs pointed out he might be able to set it up with the +branchvisibility link; i'm just figuring out all the bits before i answer him.
[19:49] jcsackett, you may need to retarget the question to launchpad-code, or at least get thumper or abentley to explain that branch visibility does not work the way they think.
[19:52] sinzui: dig. i'll retarget.
[19:56] sinzui: jcsackett: I don't think it needs retargeting. Have you looked at their oops?
[19:57] lifeless: yes, that's why i was asking my initial question.
[19:58] the fact they are encountering an oops suggests they have found a bug, to me.
[19:59] they do need to have each *team* that is pushing branches set up with a branch policy (or have a meta-team containing all their teams, and give it the policy)
[19:59] if the project is proprietary they should have the no-defined-policy behavior be to prevent pushing.
[20:00] lifeless, they cannot define branch visibility. We suck
[20:01] sinzui: they can have us do it though, no?
[20:01] how do they know that?
[20:01] Who walked them through how privacy works?
[20:02] Where are the documents that they can read?
[20:02] lifeless, this is an unusable feature. Lp engineers read the code to find out how it works
[20:02] sinzui: i think this *is* us walking them through privacy. :-/
=== mwhudson_ is now known as mwhudson
[20:03] jcsackett, I think we should have done that 6 hours ago. This is the third problem.
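To make the mechanics above concrete: per EdwinGrubbs, abentley, and bigjools, visibility is controlled per team via +branchvisibility, with rule values like the ones abentley lists plus a project-wide default. A toy resolution function under those assumptions; this illustrates the model being described, not the actual Launchpad code:

    # Toy sketch: resolve the branch visibility rule for a pushing user.
    PUBLIC = 'Public'
    PRIVATE = 'Private'
    PRIVATE_BY_DEFAULT = 'Private by default'

    def branch_visibility(team_policies, default_policy, pushing_teams):
        """team_policies maps team name -> rule; pushing_teams are the
        teams the pushing user belongs to. An explicit team rule wins;
        otherwise the project-wide default applies."""
        for team in pushing_teams:
            if team in team_policies:
                return team_policies[team]
        return default_policy

    # Different teams can have private branches on the same product:
    policies = {'team-a': PRIVATE, 'team-b': PRIVATE_BY_DEFAULT}
    assert branch_visibility(policies, PUBLIC, ['team-b']) == PRIVATE_BY_DEFAULT
    assert branch_visibility(policies, PUBLIC, ['outsiders']) == PUBLIC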
[20:04] sinzui: https://help.launchpad.net/Code/SubscribingToPrivateBranches <- does this help?
[20:04] it seems lightweight to me
[20:05] jcsackett, I do not service commercial requests anymore, because my experience as a subscriber to lp bugs, questions, feedback@ and commercial@ is preventing other engineers from discovering they need to do better
[20:05] sinzui: that's reasonable. thus i am now the one learning we need to do better. :-P
[20:06] however, i do not seem to have bac's powers to deal with their issue (insofar as setting visibility).
[20:06] lifeless, no, they want to define visibility for another team, I think 10 people can do that
[20:06] sinzui: do you file bugs about the things that are faulty/missing?
[20:07] jcsackett, bac did not put you in the commercial admin team
[20:08] lifeless, I have done more than that, I have had meetings to explain that it is wrong.
[20:08] sinzui: could you bind 'what' to a value, for clarity
[20:08] I also have nasty recipes in FAQs for how to set up a commercial project that works somewhat like someone expects
[20:09] I have an lp create-project tool in lptools
[20:09] would it be reasonable to extend that for commercial projects (so that jc, bac etc - those with lp perms to do it - can do it easily)
[20:10] lifeless, I am at 99% frustration and I am on my vacation. Lp engineers do not believe they made a bad service that is an embarrassment to charge money for.
[20:10] sinzui: I didn't realise you were on leave. Shoo! go relax.
[20:15] lifeless: so, acknowledging that this sucks (and that the OOPS in question should probably be a bug, since a useful warning would be better), you good with it being retargeted per our vacationing friend's suggestion?
[20:15] jcsackett: I don't see that a handoff will help the user.
[20:15] jcsackett: if you think it's the best thing for them to hand it off, sure, by all means do so.
[20:16] jcsackett: if I was handling the question, I would:
[20:16] - file a bug for the user for the OOPS
[20:16] * jcsackett nods.
[20:16] - write up a help page now with what we know about branch privacy
[20:16] - point them at it
[20:16] not entirely sure we *know* about it; but i can seed a page with what i've learned.
[20:16] s/i've/we've
[20:17] - read the wiki page curtis mentioned that is in the internal wiki
[20:17] (and use that page's notes as part of the seed for the help.lp.net page)
[20:18] dig.
[20:19] * jcsackett did not expect to open this can of worms.
[20:19] if we write it up well, it will stay more closed in future
[20:30] leonardr, ping
[20:30] rockstar, hi
[20:30] leonardr, did you write the lp.client javascript?
[20:31] rockstar: i don't remember. i probably wrote some of it
[20:31] can we talk about dominion instead?
[20:31] leonardr, :)
[20:32] leonardr, I only see a definition of the lp.client.plugins module, never a mention of just the lp.client module.
[20:33] rockstar: look in l/c/l/javascript/client/client.js?
[20:33] leonardr, yeah, no definition in there of lp.client
[20:33] line 4 seems to have one?
[20:34] no, it's tautological
[20:34] well, no, it's not tautological
[20:34] LP.client starts out as an empty dictionary
[20:34] leonardr, yeah, I can't seem to be able to do YUI().use("lp.client");
[20:34] leonardr, that's what I mean.
[20:34] and then over the course of that file, stuff is added to the dictionary
[20:35] rockstar: at that point you've lost me. it's possible there's some module management thing that this file isn't doing, but you should maybe talk to mars
[20:35] rockstar, did you check the base template for it?
[20:36] base-template-macros.pt
[20:36] mars, yes.
[20:36] mars, it's included there.
=== almaisan-away is now known as al-maisan
[20:37] rockstar, what is the problem?
[20:37] mars, on upgrading to YUI 3.2, I get a "requirement NOT loaded: lp.client"
[20:37] mars, that's all I know.
[20:38] I suspect it's related to a dependent module not being available, but I have no idea how to prove that.
[20:38] rockstar, have you grepped the source tree for the lp.client symbol?
[20:38] I don't know where the Y.add('lp.client') is.
[20:44] In fact, I'm almost positive now that there isn't one.
[20:48] rockstar, that is odd
[20:49] mars, I think it's a boo boo that thumper made that might have caused everything to break in YUI 3.2.
[20:49] mars, I think it's gotten stricter.
[20:49] There is no lp.client.
[20:49] that would be nice
[20:49] the loose dependency loading in earlier versions caused problems for developers
[20:50] mars, yeah, I'm doing a clean and a build real quick. I think this might fix it.
[20:50] rockstar, well, if there is no .add() for it, but there is a 'requires:' stanza, then I would say it is an error in the requirements
[20:50] mars, yeah, I think that's what's broken.
[20:51] leonardr, benji, any luck with the windmill fix I proposed earlier?
[20:51] mars: i had promising results, but i'm not giving them a lot of credence since the problem seems to skip from test to test. am doing a full run now
[20:51] mars: I haven't gotten my test results back yet (I did a full run because I don't get the same errors twice)
[20:52] ok, let me know how it turns out
[20:59] mars, we upgraded windmill, right?
[21:00] rockstar, we did, to tip.
[21:00] mars, er, we've upgraded windmill since we discovered the 512K bug, right?
[21:00] yes
[21:00] we upgraded last week
[21:04] mars, great. I'm hoping the 512K bug is fixed. Otherwise, I've hit yet another blocker in my quest to get a new YUI in...
[21:05] rockstar, you could ask on #windmill. It is early PDT
[21:05] that reminds me
[21:05] mars, I didn't think it was a windmill bug, but one in the way we used windmill.
[21:06] I have a patch I would like to get into windmill trunk
[21:11] rockstar: what boo boo?
=== al-maisan is now known as almaisan-away
[21:11] thumper, your popup diff claims to require lp.client, but there is no lp.client module.
[21:11] rockstar: so fix it :)
[21:12] thumper, not a big problem, except that I believed there really was one, so I approached the problem thinking something I changed broke it.
[21:12] Truthfully, something I changed just made the breakage more apparent (and broke ALL OTHER JAVASCRIPT in the process...)
[21:14] I'm out. Later on, all.
[21:18] EdwinGrubbs, qa-tagger is back, it tagged a bug as needstesting, should I keep that as qa-bad as it was? it's bug 659129
[21:18] <_mup_> Bug #659129: Distribution:+ppas timeout in PPA search
[21:19] Ursinha: qa-ok for that bug, if you're touching it
[21:19] lifeless, cool, in a second
[21:19] Ursinha: the fix wasn't /wrong/, it just wasn't deep enough (a later query times out instead)
[21:20] I see
[21:27] abentley: thanks for adding to the arch guide
[21:32] jcsackett: yes, that's the page - see the bottom where it talks about branch privacy setup
[21:32] lifeless: ok, thanks.
[21:33] lifeless, hi
[21:34] re. bug 659618
[21:34] <_mup_> Bug #659618: crash when assignee is None
[21:34] jcsackett: and unless we're talking confidential stuff, the default place to talk is here :)
[21:34] I'm getting the committer from the info of the merged branch
[21:34] lifeless: right; i went there b/c i was unsure of the status of that page, re: confidentiality.
[21:34] Ursinha: I'd like some more detail than that, if you can humour me.
[21:34] lifeless, sure, a minute
[21:35] jcsackett: fair enough; uhm, it's not about any specific customer or a security vulnerability or embargoed surprise feature etc.
[21:35] jcsackett: nor is it operational details we'd hold sacred
[21:35] * jcsackett nods.
[21:36] fair enough.
[21:36] lifeless, the committer there: http://paste.ubuntu.com/512644/
[21:36] jcsackett: nor is it of a personal nature for one of us ;)
[21:37] i would hope that case doesn't come up too often on IRC. :-P
[21:37] lifeless, get_apparent_authors
[21:37] Ursinha: so using that for assignee has a few problems.
[21:37] jcsackett: people taking leave is personal, and shouldn't be discussed in public (unless the person in question does)
[21:37] jcsackett: (because if the house is vacant...)
[21:37] lifeless, go ahead
[21:38] firstly, there may be no merged commits.
[21:38] lifeless, in devel/stable?
[21:38] this happens when someone does a fix directly on trunk on pqm
[21:38] which is needed (rarely) when things go completely cactus
[21:38] I've never seen that happen. As in, all commits go to trunk through pqm
[21:39] Ursinha: there was one about 7 weeks ago in fact :)
[21:39] so pqm will always be the first committer, and we'll have the apparent authors
[21:40] Ursinha: no, that doesn't make sense
[21:40] in your pastebin
[21:40] rev 11682
[21:40] will have apparent authors of pqm
[21:40] and rev 11643.2.7
[21:40] will have apparent authors of stevenk
[21:40] get_apparent_authors applies only to a single rev.
[21:40] right..
[21:41] anyhow, you need to handle the case of there being no merged revisions
[21:41] lifeless, it goes one level deeper to get the apparent authors
[21:41] secondly, if fred does a lot of work but goes on leave, barney may merge devel to fix conflicts and then land it; barney isn't the assignee
[21:42] or if fred and barney are collaborating, the revision graph would look similar.
[21:43] what would we want in the web report in those cases?
[21:43] lifeless, how do you suggest fixing that?
[21:43] Ursinha: well, I don't know what 'assignee' in the report is for, actually.
[21:43] Ursinha: so I'd like to understand the use case and our goals before offering solutions
[21:45] lifeless, the assignee, in my understanding, should be the person responsible for that fix, as the one that should QA it and fix it if anything goes wrong
[21:45] lifeless, the person we should poke about that item
[21:45] ok
[21:45] if we rename it 'authors'
[21:46] and include *all* the authors found by looking at the trunk revision and all the merge revisions, transitively
[21:46] then it would do that comprehensively
[21:47] thumper, I wonder if you and I could have a phone call/review on my lazr-js branch.
[21:47] the common case will be one person
[21:47] but in the uncommon case we'll have accurate information
[21:47] lifeless, right.
[21:47] lifeless, mind writing this on the bug report?
[21:48] I have to leave now and should return in ~1h30
[21:48] you'll be here anyway :)
[21:48] sure
[21:48] thanks lifeless
=== Ursinha is now known as Ursinha-bbl
[21:54] rockstar: ok, OTP right now
[21:55] thumper, ack. We can do it much later in the day.
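A sketch of the "authors, transitively" scheme lifeless outlines above, using bzrlib's API (get_apparent_authors per revision, plus the repository graph to find everything a trunk landing merged). Treat this as an illustration of the approach, not qa-tagger's actual code:

    from bzrlib.branch import Branch

    def authors_for_landing(branch_url, revid):
        """Authors of a trunk revision plus everything it merged.

        Covers the no-merged-revisions case (a fix committed straight to
        trunk via pqm) by falling back to the landing's own authors.
        """
        branch = Branch.open(branch_url)
        branch.lock_read()
        try:
            repo = branch.repository
            rev = repo.get_revision(revid)
            authors = set(rev.get_apparent_authors())
            if len(rev.parent_ids) > 1:
                graph = repo.get_graph()
                # Revisions reachable from revid but not from its first
                # (left-hand) parent: i.e. everything this landing merged.
                merged = graph.find_unique_ancestors(
                    revid, rev.parent_ids[:1])
                merged.discard(revid)
                for merged_rev in repo.get_revisions(list(merged)):
                    authors.update(merged_rev.get_apparent_authors())
            return authors
        finally:
            branch.unlock()

In the common case this yields one person; in the fred-and-barney cases it reports everyone whose work actually landed.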
[22:02] thumper: https://dev.launchpad.net/ArchitectureGuide
[22:07] morning
[22:17] lifeless: got a second to be a second set of eyes on the help.lp.net page?
=== Chex changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 4 of 10.10 | PQM is Release-Critical; devel is closed (Release manager: EdwinGrubbs) | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | 17:15 -!- Topic for #ubuntu-devel: 10.10 released on 10/10/10 at 10:10:10UTC!! | Development of Ubuntu (not support, not app development) | #ubuntu fo
=== Chex changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 4 of 10.10 | PQM is Release-Critical; devel is closed (Release manager: EdwinGrubbs) | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | Launchpad down/read-only from 22:00 - 00:00 UTC for a code update
[22:19] mars: no dice on the tests: https://pastebin.canonical.com/38615/
[22:20] Project db-devel build (65): STILL FAILING in 3 hr 44 min: https://hudson.wedontsleep.org/job/db-devel/65/
[22:20] benji, darnit
[22:21] * Launchpad Patch Queue Manager: [release-critical=lifeless][r=gmb][ui=none][bug=659129] Speed up the query used to search for PPAs.
[22:21] * Launchpad Patch Queue Manager: [release-critical=lifeless][r=jtv][ui=none][bug=634149, 639785] Catch the CannotHaveLinkedBranch exception in translatePath.
[22:21] ok
[22:21] hmm
[22:21] so testtools has this nifty addDetail feature
[22:22] perhaps that could be used to create a graceful way to do printf debugging on these errors
[22:23] /if/ I can get useful identifying information about the active thread
[22:23] monkey-patch threading.Thread.start()? :(
[22:24] yuck
[22:24] benji, I'll think about it. leonardr, looks like the errors persisted for benji.
[22:35] jcsackett: sure
[22:36] lifeless: https://help.launchpad.net/Code/PrivateBranches
[22:36] mars: do you know what the errors even mean? If so, can you explain them to me?
[22:36] restricted -or- moderated?
[22:38] benji, it means that threading.enumerate() says there are still live unjoined threads after the test is finished
[22:38] benji, and the thread is still alive
[22:38] benji, threading has a settrace() method that may provide some insight
[22:38] mars: and that's something that RemotedTestCase tests for and generates errors if it is the case?
[22:39] jcsackett: other than that, great
[22:39] benji: RemotedTestCase is just the local proxy for a test that ran in another process
[22:39] benji: BaseLayer checks for the threads.
[22:40] lifeless: good catch. the internal stuff said restricted, but they're referring to a special case of all of this.
[22:40] thanks!
[22:40] benji, it is added by zope.testing.testrunner.runner.TestResult, printed as a subunit stream by zope.testing.testrunner.formatter, and then the stream is interpreted by RemotedTestCase
[22:40] and raises the test error
[22:48] thumper, https://code.edge.launchpad.net/~rockstar/launchpad/javascript-refresh/+merge/38373
=== salgado is now known as salgado-afk
[22:54] flacoste: have you looked at the google doc I made? Any thoughts?
[22:55] lifeless: i found it interesting, and a good direction, but didn't have any specific thoughts when i read it
[22:55] flacoste: ok; I'm going to code today.
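For the thread errors being debugged above, a stripped-down illustration of the check mars describes: snapshot threading.enumerate() before a test, then flag any new threads still alive afterwards. This mimics the behaviour under discussion; it is not the zope.testing/BaseLayer code itself:

    import threading

    class ThreadLeakCheck(object):
        """Flag threads a test started but never joined."""

        def before_test(self):
            self._before = set(threading.enumerate())

        def after_test(self, test_name):
            leaked = [t for t in threading.enumerate()
                      if t.is_alive() and t not in self._before]
            if leaked:
                names = ', '.join(t.name for t in leaked)
                raise AssertionError(
                    '%s left live unjoined threads: %s' % (test_name, names))

    # Usage around a test run:
    check = ThreadLeakCheck()
    check.before_test()
    # ... run the test ...
    check.after_test('lp.some.test')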
[22:55] mars: my full test run also failed with three random failures
[22:56] flacoste: I'll get back to it tomorrow/monday
=== matsubara is now known as matsubara-afk
[22:56] jtv: The chroot_url seems to work fine for me...
[22:56] wgrant: yeah, _now_ it does
[22:58] jtv: Maybe you caught it in a moment when the chroots were being replaced?
[22:58] I can't see why they'd be blank.
[22:59] wgrant: so I have this straight: you're saying the production maverick chroot librarian URL works if you edit it to use the staging librarian?
[23:00] jtv: No, no, I was referring to your wiki edit.
[23:00] Sorry.
[23:01] ah
[23:01] wgrant: no idea what caused it, but glad to hear it.
[23:03] Ah, now this is one of those "we have no idea how long it's going to take, because staging sucks" upgrades, isn't it?
[23:05] wgrant: We do have some idea; stub did a manual db upgrade that doesn't show up in successful-updates.txt
[23:05] Ah.
[23:07] thumper: did you see https://lp-oops.canonical.com/oops.py/?oopsid=1747ED661 and https://answers.edge.launchpad.net/launchpad-registry/+question/129258 ?
[23:10] i'm glad someone tweeted at the relevant time
[23:27] poolie: yes I saw it