[00:17] sorry guys.
[00:18] * jml is back.
=== herb__ is now known as herb
=== salgado-afk is now known as salgado
=== salgado is now known as salgado-brb
=== salgado-brb is now known as salgado
=== salgado_ is now known as salgado
[15:50] henninge, :)
[15:50] Ursinha: Hi :)
[16:02] Hi
[16:02] #startmeeting
[16:02] Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
[16:02] [TOPIC] Roll Call
[16:02] Not on the Launchpad Dev team? Welcome! Come "me" with the rest of us!
[16:02] Meeting started at 10:02. The chair is Ursinha.
[16:02] Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
[16:02] New Topic: Roll Call
[16:02] me
[16:05] me
[16:05] ich
[16:05] me
[16:05] me
[16:05] me
[16:05] me
[16:05] herb, intellectronica, hi
[16:05] :)
[16:05] who's missing?
[16:05] stub is missing, but he can join later
[16:05] intellectronica is missing too
[16:05] let's move on
[16:05] [TOPIC] Agenda
[16:05] * Actions from last meeting
[16:05] * Oops report & Critical Bugs
[16:05] * Operations report (mthaddon/herb/spm)
[16:05] * DBA report (stub)
[16:05] New Topic: Agenda
[16:05] [TOPIC] * Actions from last meeting
[16:05] * Ursinha to talk to intellectronica about bug 357316
[16:05] * Ursinha to talk to henninge about bug 302449
[16:05] * rockstar to confirm that bzr fix for bug 360791 was applied to LP's bzr tree.
[16:05] New Topic: * Actions from last meeting
[16:05] * cprov to request CP of fix for bug 370513
[16:06] Launchpad bug 357316 in malone "hwdb +submit failing with KeyError OOPS" [Undecided,Triaged] https://launchpad.net/bugs/357316
[16:06] Launchpad bug 302449 in rosetta "Uploading a file with the same name triggers a database constraint." [Medium,Triaged] https://launchpad.net/bugs/302449
[16:06] I suck and failed mine
[16:06] Launchpad bug 360791 in bzr/1.14 "bzr pull/branch shows "Error received from smart server: ('NoSuchRevision',)"" [Critical,In progress] https://launchpad.net/bugs/360791
[16:06] Launchpad bug 370513 in soyuz "failure to accept PPA uploads" [Critical,Fix committed] https://launchpad.net/bugs/370513
[16:06] [action] Ursinha to talk to intellectronica about bug 357316
[16:06] ACTION received: Ursinha to talk to intellectronica about bug 357316
[16:06] henninge, hi :)
[16:06] I think danilo was onto that ...
[16:06] I don't know if the fix has been cherry picked into production...
[16:06] henninge, can you just confirm that, please? it's set as medium, do we use that status?
[16:08] Ursinha: in rosetta we do ;)
[16:08] rockstar, hm, can you check that too?
[16:08] Code team does, for things of medium importance.
[16:08] as does Soyuz
[16:08] rockstar: it was cherry picked on 2009-05-09
[16:08] I rarely see medium statuses :) that's why I'm asking
[16:08] thanks herb
[16:08] herb: cool, thanks.
[16:08] [action] henninge to check with danilo the status of bug 302449
[16:08] ACTION received: henninge to check with danilo the status of bug 302449
[16:08] Launchpad bug 302449 in rosetta "Uploading a file with the same name triggers a database constraint." [Medium,Triaged] https://launchpad.net/bugs/302449
[16:08] cool
[16:08] moving on then
[16:08] [TOPIC] * Oops report & Critical Bugs
[16:08] there's only one worth mentioning, that is the one causing the InterfaceError oopses, we're still having lots and lots of occurrences (bugs 374909 and 376207), seems to be worked on by jamesh, is that correct flacoste?
[16:08] New Topic: * Oops report & Critical Bugs
[16:08] Launchpad bug 374909 in storm "InterfaceError: connection already closed should be converted into DisconnectionError" [High,Triaged] https://launchpad.net/bugs/374909
[16:08] Launchpad bug 376207 in launchpad-foundations "LaunchpadOpenIDStore doesn't support database disconnection" [High,In progress] https://launchpad.net/bugs/376207
[16:11] so, jamesh is working on 374909
[16:11] and the other one also
[16:11] right
[16:11] but stuart has an easy fix for the later, that I'll likely asked to be cherrypicked
[16:11] we can deploy jamesh proper fix during next roll-out
[16:11] flacoste, hm, good
[16:11] flacoste, about the cp, when do you think it can be done?
[16:11] i didn't look at the branch
[16:11] 374909 is still cropping up from time to time, btw. though it's much gooder(tm) than it was last week.
[16:11] but once i approved it, as soon as the LOSA can take care of it
[16:12] flacoste, right. okay
[16:13] i don't think i'll ask a C-P of the INterfaceError
[16:14] (storm fix being worked on by jamesh)
[16:14] flacoste: why?
[16:14] well, because that's not a root-cause
[16:14] so with the other fix in place, we shouldn't see it
[16:14] ok
[16:14] that fix is more prophylactic
[16:14] flacoste, who's investigating the root cause?
[16:17] if you are talking about why we are getting disconnection errors in the first place
[16:17] nobody, really
[16:17] but we have a fix for the places where we should be trapping those
[16:17] and recovering
[16:17] yes, the disconnection errors
[16:17] we have no ideas at why it's happening
[16:17] there is nothing in the DB logs
[16:17] about them
[16:17] this is creepy
[16:17] the fix you say you have
[16:18] is inside the fixes for one of those bugs?
[16:18] yes
[16:18] great
[16:18] so, you'll ask a cp for the second bug
[16:18] exactly
[16:20] [action] flacoste to ask a cp for fix for bug 376207 after reviewing it
[16:20] ACTION received: flacoste to ask a cp for fix for bug 376207 after reviewing it
[16:20] Launchpad bug 376207 in launchpad-foundations "LaunchpadOpenIDStore doesn't support database disconnection" [High,In progress] https://launchpad.net/bugs/376207
[16:20] awesome
[16:20] we have one critical bug, being worked on
[16:20] so, unless anyone has anything to point, moving to the next section!
[16:20] good
[16:20] [TOPIC] * Operations report (mthaddon/herb/spm)
[16:20] New Topic: * Operations report (mthaddon/herb/spm)
[16:20] 2009-05-07 - Cherry pick r8906 to the scripts server and r122 of storm to lpnet* & edge*
[16:20] 2009-05-09 - Cherry pick r4006 of bzr to the codehosting server and r123 of storm to lpnet* & edge*
[16:20] 2009-05-09 - Cherry pick r8348, r8312 to the PPA server and r8376 to lpnet*
[16:20] 2009-05-10 - mailman didn't have access the necessary access to the DB server, but it was only noticed after restarting for log rotation. mailing lists were unavailable for approximately 7 hours.
[16:20] We still seem to be encountering bug #156453 and bug #118626, but the situation is much improved since the rollout.
[16:23] flacoste: cprov requested a rollout of the current production tree to cesium. Apparently there was a critical fix that was included in while in RC, but we didn't re-roll to cesium. Can you (dis)approve that?
[16:23] Launchpad bug 156453 in loggerhead "production loggerhead branch leaks memory" [Critical,In progress] https://launchpad.net/bugs/156453
[16:23] Launchpad bug 118626 in bzr-email "plugin documentation does not make interaction with checkouts clear" [Medium,Confirmed] https://launchpad.net/bugs/118626
[16:23] herb: it's already approved by kiko
[16:23] herb: i think kiko did? but otherwise, i can look into it
[16:23] right
[16:23] bigjools: missed it. thanks
[16:23] any other questions to herb?
[16:23] okay
[16:23] [TOPIC] * DBA report (stub)
[16:23] New Topic: * DBA report (stub)
[16:23] herb: I was under the impression that the loggerhead stuff was WAY better than "much improved"
[16:23] stub sent it to the list
[16:23] rockstar: we're still restarting a couple of times a week. which is much improved over a couple of times a day.
[16:23] rockstar: order of magnitude.
[16:23] flacoste, to lp list? I can't seem to find it
[16:23] herb: okay. Is it memory restarts, or hanging restarts
[16:23] The ex-master database (launchpad_prod on hackberry) is bloated and
[16:23] started exceeding its free space map settings. Nothing really to worry
[16:23] about - it might cause bloat to spiral but I suspect not in this case.
[16:23] The losas can bounce it after shutting down the systems using it as a
[16:23] slave, and I've suggested using it as the standalone replica for
[16:23] read-only mode launchpad during the rollout because we then rebuild it
[16:23] (The memory restarts shouldn't be happening anymore)
[16:23] afterwards and it will be all nice and freshly packed.
[16:23] Nothing major do do with database patches this cycle. rockstar's
[16:23] bugbranch and specbranch column pruning needs to be cleared with Mark
[16:23] still.
[16:23] thanks flacoste
[16:24] flacoste: yeah, and there are other branches dependent on that one.
[16:24] rockstar: The memory situation seems ok (ie. not death spiraling). seems to be hangs at this point.
[16:25] rockstar: we still see it ~1.5GB resident, but doesn't seem to grow beyond that.
[16:25] herb: well, "death spiraling" to be is different than the memory issues.
[16:26] herb: it's going to be *kinda* memory intensive just because of what it's serving.
[16:26] rockstar: understood
[16:26] rockstar: as I said, much improved. 1.2 - 1.5G is much better than 3.7G
[16:26] herb: and we know the cause of the hangs, we just don't know how to fix it.
[16:26] rockstar: good news, bad news, eh?
[16:26] herb: yeah, something like that.
[16:27] okay. anyone else want to say something?
[16:27] 5
[16:27] 4
[16:28] 3
[16:28] 2
[16:28] 1
[16:28] Thank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs.
[16:28] #endmeeting
[16:28] Meeting finished at 10:27.
[16:28] thanks all
[16:28] thanks Ursinha ;)
[16:28] henninge, :)
=== salgado is now known as salgado-lunch
=== Ursinha is now known as Marvin_
=== Marvin_ is now known as Ursinha
=== Ursinha is now known as Marvin_
=== Marvin_ is now known as Ursinha
=== salgado-lunch is now known as salgado
=== salgado_ is now known as salgado
=== salgado is now known as salgado-afk