=== mrevell is now known as mrevell-lunch
=== salgado is now known as salgado-brb
=== mrevell-lunch is now known as mrevell
=== salgado-brb is now known as salgado
[14:58] ping MootBot
[15:00] #startmeeting
[15:00] Meeting started at 09:00. The chair is matsubara.
[15:00] Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
=== salgado is now known as salgado-lunch
[15:00] Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
[15:00] [TOPIC] Roll Call
[15:00] New Topic: Roll Call
[15:00] me
[15:00] me
[15:00] me
[15:00] me
[15:00] me
[15:01] I'm standing in for henninge today, since he's on sprint.
[15:01] thanks jtv
[15:01] me
[15:01] me
[15:02] so, if any of you can't make the meeting next week, please coordinate with another teammate to cover for you and add a notice in the Apologies section in the MeetingAgenda page.
[15:02] flacoste: ping
[15:02] me
[15:03] bigjools: ping
[15:03] me
[15:03] me
[15:04] ok, let's move on, stub can join later
[15:04] [TOPIC] Agenda
[15:04] New Topic: Agenda
[15:04] * Actions from last meeting
[15:04] * Oops report & Critical Bugs
[15:04] * Operations report (mthaddon/herb/spm)
[15:04] * DBA report (DBA contact)
[15:04] [TOPIC] * Actions from last meeting
[15:04] New Topic: * Actions from last meeting
[15:04] * bac to check with barry if there's an open bug for OOPS-1125A1096, if not, Ursinha to file one - there was, bug 280925
[15:04] * intellectronica to work on bug 279561
[15:04] * rockstar to check OOPS-1125CEMAIL1
[15:04] * bac to take a look at OOPS-1125A165 - bac filed bug 322792
[15:04] * Ursinha to check with kiko if any other rollouts will happen this week
[15:04] Launchpad bug 280925 in launchpad-registry "Project overview page shows obsolete series" [Low,Fix released] https://launchpad.net/bugs/280925
[15:05] https://devpad.canonical.com/~jamesh/oops.cgi/1125A1096
[15:05] Error: Could not parse data returned by Launchpad: The read operation timed out (https://launchpad.net/bugs/279561/+text)
[15:05] Launchpad bug 322792 in launchpad-bazaar "Attempting traversal past an unknown object causes an OOPS" [Undecided,Fix released] https://launchpad.net/bugs/322792
[15:05] https://devpad.canonical.com/~jamesh/oops.cgi/1125A165
[15:05] holy crap
[15:05] that bug is timing out
[15:05] or what?
[15:06] anyway
[15:06] intellectronica: any news about the api bug?
[15:06] my items were done
[15:06] rockstar: what's up about that oops?
[15:06] matsubara: sorry, no news yet
[15:06] matsubara, he landed a fix for that
[15:06] Ursinha: who's he?
[15:06] :-)
[15:06] matsubara, rockstar :)
[15:06] sorry
[15:06] matsubara, I got an RC in for that one.
[15:07] ok, I remember that one.
[15:07] so I guess the only pending one is the api bug which is a mystery to everyone
[15:08] the good news is that intellectronica found out another thing about that bug that might lead to its root cause
[15:08] intellectronica: thanks for keeping us posted in the report
[15:08] let's move on
[15:08] [TOPIC] * Oops report & Critical Bugs
[15:08] New Topic: * Oops report & Critical Bugs
[15:09] so in today's oops section I'd like to talk about the timeout bugs you guys are working on for the LPW
[15:09] https://dev.launchpad.net/PerformanceWeeks/February2009
[15:10] I'm going to review all the landings related to LPW work and add them to that page.
[15:10] so if you wanna help, point me to revision numbers on RF related to that work
[15:10] matsubara: bug 324264 is now Fix Committed.
[15:10] Launchpad bug 324264 in rosetta "Speed up +translations" [High,Fix committed] https://launchpad.net/bugs/324264
[15:11] matsubara: r7705 for Bug: 325321
[15:11] jtv, great
[15:11] intellectronica, BjornT: any news about the bug number 1 time out?
[15:11] matsubara: EdwinGrubbs will land his branch today
[15:11] work on bug 302798 is in progress in different ways (there's a commit from 323something which disabled external suggestions to give us a better idea on how stuff is working, and henning is working on removal of obsolete translations which will reduce our DB size by ~33%)
[15:11] Launchpad bug 302798 in rosetta "Timeout on +translate page" [High,Triaged] https://launchpad.net/bugs/302798
[15:12] matsubara: me, intellectronica, and allenap are working on it
[15:12] danilos: hey, I was putting that paragraph together!
[15:12] matsubara: allenap is looking at reducing the time it takes to render the comments, possibly by not showing all by default
[15:13] jtv: ok, I'll let you handle it all from now on :)
[15:13] matsubara: intellectronica is working on loading the subscribers portlet in a different request
[15:13] matsubara: you can have full trust in jtv as far as PW is concerned :)
[15:13] *cough*
[15:13] matsubara: and i'm working on optimizing code, based off profiling information
[15:13] danilos: I do! he's been very helpful with the status updates
[15:14] matsubara: also, intellectronica's initial tests on dogfood were successful, reducing the time quite a lot :)
[15:14] great
[15:14] matsubara: i'm working on bug 316881 (which isn't an OOPS per se, but related to performance anyhow)
[15:14] Launchpad bug 316881 in launchpad "Page headers not suitable for HTTP caching" [High,In progress] https://launchpad.net/bugs/316881
[15:15] stub: there's an email from jono asking for some help with the +project-cloud oops, so if you could help out there, would be awesome
[15:15] thanks flacoste, i'll add it to the page
[15:15] I've replied on the bug. Not sure if I'm helpful though.
[15:15] stub: cool. thank you
[15:16] bigjools: news in soyuz. how about the one muharem is taking care of?
[15:16] s/./?/
[15:16] matsubara: it's not going so well unfortunately
[15:16] I don't expect any progress this week
[15:16] [action] matsubara to add 316881 to foundations section in LPW wiki page
[15:16] ACTION received: matsubara to add 316881 to foundations section in LPW wiki page
[15:17] bigjools: why is that?
[15:17] matsubara: the first attempt to fix it failed. he's also been at the distro sprint this week
[15:18] bigjools: oh right. well, you guys are excused since the whole team is sprinting and you already landed 2(?) timeout fixes :-)
[15:18] matsubara: thanks :)
[15:18] I guess that's it from me. Ursinha, anything else?
[15:19] matsubara, no, the pending oops for soyuz I already talked about with bigjools
[15:19] yeah, get edge updating again and we'll see how it went
[15:19] great. thanks everyone!
[15:19] [TOPIC] * Operations report (mthaddon/herb/spm)
[15:19] New Topic: * Operations report (mthaddon/herb/spm)
[15:19] - 2009-01-30 - Friday we updated lpnet, edge and the scripts servers to r7676.
[15:19] - 2009-02-04 - Yesterday we updated codebrowse to r43
[15:20] - I feel like I'm starting to sound like a broken record... But we're still being bothered daily, often multiple times, by bug #156453 and bug #118625 which seem to be related.
[15:20] Launchpad bug 156453 in loggerhead "production loggerhead branch leaks memory" [Critical,In progress] https://launchpad.net/bugs/156453
[15:20] - We also continue to run into issues associated with bug #260171
[15:20] Launchpad bug 118625 in launchpad-bazaar "codebrowse sometimes hangs" [High,Triaged] https://launchpad.net/bugs/118625
[15:20] Bug 260171 on http://launchpad.net/bugs/260171 is private
[15:20] herb: i herd mwhudson was working on loggerhead performance this week
[15:20] as part of LPW
[15:20] yes!
[15:20] herb: There is a sprint in the planning to fix that too
[15:20] yes, he is
[15:20] mwhudson is indeed working on that
[15:21] herb: and well-placed sources also told me that a sprint is being organized to fix those damn issues
[15:21] herb: so there is hope!
[15:21] excellent. a fix would be huge for the LOSAs
[15:21] herb, mwhudson is on the verge of insanity tracking loggerhead issues down.
[15:22] thanks for all the work on that. it is much appreciated.
[15:22] thanks herb
[15:22] [TOPIC] * DBA report (stub)
[15:22] New Topic: * DBA report (stub)
[15:22] The production dbs seem to be ticking away nicely.
[15:22] Staging db updates are being disabled by the losas for some testing, so expect a drop in timeout OOPSES.
[15:22] I've had some db patches for this cycle come through already, which is great.
[15:24] 3
[15:24] 2
[15:24] 1
[15:24] all right. thanks stub
[15:24] what about the staging issues?
[15:25] herb: ^
[15:25] staging isn't available at the moment
[15:25] it's restoring
[15:25] but we didn't understand why it failed?
[15:25] or did we?
[15:25] herb: how long will it take to restore?
[15:25] and it's just that gmb is done with the testing and we can resume normal staging updates?
[15:25] matsubara: should be back up within the next couple of hours.
[15:26] herb, about 10 hours ago I was talking with spm and he was unsuccessful in putting staging on again
[15:27] flacoste: upgrade.py failed because there were still connections open to the staging db. This left the DB in an indeterminate state, so we had to go through the full restore process with a new copy of the staging DB.
[15:27] flacoste: I'm not done yet.
[15:27] flacoste: I need staging to be up for that
[15:27] so this means that we are still screwed?
[15:28] This is all useful information for read only launchpad btw.
[15:28] we either need to fix upgrade.py to work without a db restore
[15:28] or to turn off staging upgrades for the duration of gmb's test
[15:28] flacoste: upgrade.py works fine
[15:29] and we'll need to turn off upgrades while gmb's testing, per stub's note above.
[15:29] why were there still connections open?
[15:29] upgrade.py cannot work when there are active db connections, as upgrade requires exclusive locks on all the replicated db tables.
[15:29] flacoste: the rollout process shuts down the app servers, but there are still potentially cron scripts running, etc.
[15:30] why is this usually not a problem then?
[15:31] Because the restore, replicate and upgrade process is done as a fresh db. Once it is finished, it is swapped into place.
[15:31] ah right!
[15:31] * bigjools has to run, will catch scrollback later if you need anything from me
[15:32] so when I said fix upgrade.py, i should have said 'fix rollout process'
[15:32] so I guess it's best to simply disable upgrade for now
[15:32] i also heard that gmb might do his tests elsewhere
[15:32] so that might become a moot issue
[15:32] but like stub said, very instructive for read-only launchpad
[15:33] flacoste: Well, that's an embryonic idea right now. I still need a staging / demo machine for the foreseeable future.
[15:34] I want to make an action item for the fix rollout process thing
[15:34] not sure who would be responsible for that
[15:34] losas?
[15:35] whoa
[15:35] with the help of stub probably
[15:35] [action] losas and stub to fix rollout process to avoid the staging restore problems
[15:35] ACTION received: losas and stub to fix rollout process to avoid the staging restore problems
[15:35] there isn't anything inherently wrong with the rollout process.
[15:35] but ok
[15:36] thanks everyone
[15:36] anything else before I close?
[15:36] well, it's currently not reliable if we don't do a DB restore it seems
[15:36] nope
[15:36] Thank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs.
[15:36] #endmeeting
[15:36] Meeting finished at 09:36.
[15:36] thanks matsubara
=== matsubara is now known as matsubara-lunch
=== salgado-lunch is now known as salgado
=== matsubara-lunch is now known as matsubara
=== bac is now known as bac_lunch
=== abentley1 is now known as abentley
=== bac_lunch is now known as bac
=== salgado is now known as salgado-afk
=== mthaddon_ is now known as mthaddon
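
The staging failure discussed between 15:27 and 15:31 comes down to one precondition: the upgrade needs exclusive locks on the replicated tables, so no other session (for example a cron script that is still running) may be connected when it starts. The following is a minimal Python sketch of such a pre-flight check. The `Session` record, the function names and the sample cron entry are all hypothetical illustrations; none of them is taken from upgrade.py or from the actual rollout scripts.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Session:
    """One open database session (hypothetical shape, for illustration only)."""
    pid: int
    user: str
    application: str


def blocking_sessions(sessions: Iterable[Session], own_pid: int) -> List[Session]:
    """Return every session other than our own; any of them could hold a lock."""
    return [s for s in sessions if s.pid != own_pid]


def can_upgrade(sessions: Iterable[Session], own_pid: int) -> bool:
    """The upgrade needs exclusive locks, so it is only safe with no other sessions."""
    return not blocking_sessions(sessions, own_pid)


# Example: the app servers are down, but a (hypothetical) cron script is still connected.
sessions = [
    Session(pid=100, user="upgrade", application="upgrade.py"),
    Session(pid=200, user="cron", application="hypothetical-cron-script"),
]
for s in blocking_sessions(sessions, own_pid=100):
    print(f"refusing to upgrade: pid {s.pid} ({s.application}) is still connected")
```

In a real rollout the session list would come from the database itself, and the rollout would either wait for or stop the listed sessions before starting the upgrade. Which of those is appropriate is exactly the open question behind the action item recorded at 15:35.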