=== bigjools-afk is now known as bigjools
[16:00] #startmeeting
[16:00] Meeting started at 10:00. The chair is matsubara.
[16:00] Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
[16:00] Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
[16:00] [TOPIC] Roll Call
[16:00] New Topic: Roll Call
[16:00] me
[16:01] me
[16:01] me/2
[16:01] me
[16:01] only half of you henninge ? :-)
[16:01] stub, hi
[16:01] rockstar, hi
[16:01] me
[16:01] bigjools, hello
[16:01] ni!
[16:01] me
[16:01] matsubara: yes, sorry, concurrent meeting in #ubuntu-meeting
[16:02] ok, everyone is here
[16:02] Ursinha sends apologies as she's still ill
[16:02] [TOPIC] Agenda
[16:02] New Topic: Agenda
[16:02] * Actions from last meeting
[16:02] * Oops report & Critical Bugs
[16:02] * Operations report (mthaddon/herb/spm)
[16:02] * DBA report (stub)
[16:02] [TOPIC] * Actions from last meeting
[16:02] New Topic: * Actions from last meeting
[16:03] * Ursinha to talk to flacoste about buildbot and storm updating for testing when he's available today
[16:03] I've asked Ursinha and she did that
[16:03] so, moving on
[16:03] [TOPIC] * Oops report & Critical Bugs
[16:03] New Topic: * Oops report & Critical Bugs
[16:04] I have 1 bug and 2 oopses for today. bug 394560 for malone, OOPS-1274A1231 for translations and OOPS-1274ED146 for registry
[16:04] Launchpad bug 394560 in launchpad "LookupError: unknown encoding: processing email" [Undecided,New] https://launchpad.net/bugs/394560
[16:04] https://devpad.canonical.com/~jamesh/oops.cgi/1274A1231
[16:04] https://devpad.canonical.com/~jamesh/oops.cgi/1274ED146
[16:04] who should I talk to to update ubottu oops urls?
[16:04] [action] matsubara to chase owner of ubottu to update the oops url
[16:04] ACTION received: matsubara to chase owner of ubottu to update the oops url
[16:05] intellectronica, I need to try to reproduce bug 394560 and add more info
[16:05] Launchpad bug 394560 in launchpad "LookupError: unknown encoding: processing email" [Undecided,New] https://launchpad.net/bugs/394560
[16:05] matsubara, you should chase the owner with a bit of wood.
[16:05] intellectronica, once I do that, do you think you can assign someone for a fix?
[16:06] * intellectronica looking
[16:06] rockstar, you mean like a club?
[16:06] i hate to say this, but this looks like a Foundations issue :-(
[16:06] hehe
[16:07] matsubara: sure, i'll assign to myself and investigate. i have no idea, off the top of my head, what the problem is
[16:07] flacoste, intellectronica: ok, I'll try to reproduce the issue today and let you know. I'll move the bug to the right project accordingly
[16:07] matsubara, :)
[16:07] intellectronica: well, it seems that we are trying to decode the string
[16:07] based on the encoding specified in the email
[16:07] and that encoding is x-unknown
[16:07] which of course doesn't exist
[16:07] so i'd say from the face of it it's a client-side problem
[16:07] [action] matsubara to reproduce bug 394560, add more info and work with team responsible to have it prioritized
[16:07] Launchpad bug 394560 in launchpad "LookupError: unknown encoding: processing email" [Undecided,New] https://launchpad.net/bugs/394560
[16:07] ACTION received: matsubara to reproduce bug 394560, add more info and work with team responsible to have it prioritized
[16:08] matsubara: that oops is covered by bug 394224
[16:08] Launchpad bug 394224 in rosetta "More unique constraints in updateTranslation" [High,Triaged] https://launchpad.net/bugs/394224
[16:08] henninge, I see it's already triaged and prioritized, so thanks!
[16:09] flacoste: yeah, if that's the case then it's not a bug
[16:09] sinzui, the other one is OOPS-1274ED146, a timeout on team membership view
[16:09] https://devpad.canonical.com/~jamesh/oops.cgi/1274ED146
[16:09] sinzui, it's issuing > 3000 queries
[16:10] so it probably can be optimized further
[16:10] [action] matsubara to file a bug for OOPS-1274ED146
[16:10] https://devpad.canonical.com/~jamesh/oops.cgi/1274ED146
[16:10] ACTION received: matsubara to file a bug for OOPS-1274ED146
[16:10] https://devpad.canonical.com/~jamesh/oops.cgi/1274ED146
[16:10] flacoste: or maybe it's worth coercing x-unknown to None, if that's a common behaviour for mailers
[16:10] This looks just like the timeout for merging teams
[16:11] matsubara: I will ask Edwin to look into this since the problem looks identical
[16:11] sinzui, thanks. once I have a bug for it I'll assign to him
[16:11] sinzui, looking at the page where the oops happened, doesn't seem related to team merge
[16:12] The two repeat queries are the same that I looked at two hours ago
[16:13] I see. I'll file a bug anyway and can dupe later on to the team merge bug if that's the case
[16:13] sinzui, thanks
[16:13] matsubara: We have a systemic problem in that we do not have a standard method to get a lot of IPerson
[16:14] sinzui, so the fix for the team merge one will include such a method?
[16:14] No not at all, We need a big spec to fix a systemic problem
[16:15] * sinzui created a spec yesterday to end the nightmare of deactivated account being members or owners of other Launchpad objects
[16:15] sinzui, by spec you mean a blueprint or a new story or something else?
[16:16] I mean both.
It is a project that requires lots of planning and breakdown into work
[16:17] for the critical bugs: we have two, one in progress for which the fix is ready (according to the bug report) and another one in Triaged state
[16:17] bug 391903 and bug 390563
[16:17] Launchpad bug 391903 in launchpad-code "Scanner user needs more database permissions" [Critical,In progress] https://launchpad.net/bugs/391903
[16:17] Launchpad bug 390563 in bzr "absent factory exception from smart server when streaming 2a stacked branches" [Critical,In progress] https://launchpad.net/bugs/390563
[16:18] rockstar, do you know about the latter one? the lp-code bugtask is in triaged state
[16:18] matsubara, so, there were two issues here, one client side, one server side. The server side one has been fixed on Launchpad.
[16:18] sinzui, all right. where are you keeping track of this work? I'd like to link bugs that might benefit from it to the spec
[16:19] bug 391903 should probably not be critical - a patch and test update needs to land this cycle sometime. Production is fine at the moment.
[16:19] Launchpad bug 391903 in launchpad-code "Scanner user needs more database permissions" [Critical,In progress] https://launchpad.net/bugs/391903
[16:19] A blueprint. I have not scheduled it since my team may state it requires an infinite number of monkeys to complete
[16:19] stub, is it not critical anymore because the permissions were granted on the production db?
[16:20] I think so. The patch does have to land before next rollout though or that will revert. Does that count as critical?
[16:20] sinzui, right. let me know the URL please
[16:21] stub, matsubara: that should simply be high
[16:21] I'll lower the importance. thanks flacoste and stub
[16:21] rockstar, since it was fixed on LP, could you set the bug as fix released?
[16:21] matsubara: systemic problems are not fixed in critical issues.
I am merely stating that I expect to continue to allocate staff to fix these timeout issues until we have new tools to get masses of users and teams from the db
[16:22] s/bug/bugtask/
[16:23] sinzui, the oops I brought to you today is not critical.
[16:23] matsubara, I think there are some other things we're trying to track (like the client side issue), but I'll raise it in the standup today.
[16:23] matsubara: So I do not need to fix this this cycle?
[16:24] sinzui, what I'm trying to do is link those bugs to the blueprint, so when the systemic problem is fixed, we can fix the oopses with the new infrastructure (or at least consider those bugs while implementing the new infrastructure)
[16:25] sinzui, I don't think they are critical, since it's a soft time out. I'm bringing the issue to you to keep it on your radar
[16:25] matsubara: I must create a blueprint for this class of problem first
[16:26] [action] sinzui to create a blueprint about the systemic problem in retrieving lots of IPerson and let matsubara know
[16:26] ACTION received: sinzui to create a blueprint about the systemic problem in retrieving lots of IPerson and let matsubara know
[16:26] rockstar, thanks!
[16:27] I think that's all for the oops & critical bugs section
[16:27] thanks everyone
[16:27] [TOPIC] * Operations report (mthaddon/herb/spm)
[16:27] New Topic: * Operations report (mthaddon/herb/spm)
[16:27] herb, ?
[16:28] Since I totally slacked off last week on the OSA report, I'll cover the last two weeks here.
[16:28] 2009-06-24 - Production rollout to r2.2.6. Rollout took longer and caused unforeseen downtime. Details can be found in the incident report.
[16:28] 2009-06-26 - Cherry picked r8204 to codehost
[16:28] 2009-06-26 - Cherry picked r8205 to lpnet*, edge* and the scripts server.
[16:28] 2009-06-26 - Cherry picked r8701 and r8703 to lpnet* and (part of) soyuz.
[16:28] Since the 2.2.6 rollout we've seen a slight uptick in load on the app servers and, early on at least, significant load on the code importers.
[16:28] Staging has seen significant memory leaks. The app server has often been 5-10GB resident. This is a pretty bug that will clearly need to be resolved before we can push to production.
[16:29] herb: it's because of the storm update, stub thinks he landed a fix today
[16:29] herb: plan is to land this storm update on edge once it clears staging
[16:29] flacoste: I figured. I'd like to see it running on staging for a couple of days without leaks before seeing it on edge.
[16:30] herb: sure
[16:30] cool
[16:30] I'll assign bug 393990 to stub and move to -foundations.
[16:30] Launchpad bug 393990 in launchpad "staging app server using too much memory" [Undecided,New] https://launchpad.net/bugs/393990
[16:31] matsubara: that's actually a dupe
[16:31] matsubara: that's probably a dupe of 390861
[16:31] sorry #390861
[16:31] bug 390861
[16:31] Launchpad bug 390861 in launchpad-foundations "Appserver memory issues with Storm 0.14" [High,Triaged] https://launchpad.net/bugs/390861
[16:32] dammit!
[16:32] :)
[16:32] ah, ok. so it's all fine
[16:32] matsubara: bug 390861
[16:32] well, not fine, but at least it's being tracked :-)
[16:32] thanks herb and flacoste
[16:32] let's move on
[16:32] [TOPIC] * DBA report (stub)
[16:32] New Topic: * DBA report (stub)
[16:32] Staging is now hopefully running a non-leaking version of Storm (0.14 + r290 from lp:storm). It's been up for maybe an hour now with no sign of memory bloat, so it is looking promising. if all goes well, next step is to land this on launchpad/devel for testing on edge.
[16:32] I repaired about 1300 invalid crosslinks between Person and EmailAddress as best I could. Hopefully these were caused by manual updates or since fixed bugs. The other 2.7 million records are fine.
Admins or myself should be able to recover any lost permissions of branches if the reclaim account and merge processes don't suffice. garbo-nightly.py is growing detection of this case (code done, tests to go).
[16:33] .
[16:34] question to stub?
[16:34] ok, I think that's all for today then.
[16:35] Thank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs.
[16:35] stub: is the permissions reclamation process documented?
[16:35] #endmeeting
[16:35] Meeting finished at 10:35.
[16:35] oops, sorry herb
[16:35] no problem
[16:35] thanks matsubara
[16:36] herb: No - it will depend what the problem is. Most people affected (assuming they are active users) should be able to recover things using their registered email addresses.
[16:36] ok
[16:36] herb: It should be like a forgotten password...
[16:36] * stub crosses fingers
[16:37] heh. got it.
[16:37] matsubara: https://blueprints.edge.launchpad.net/launchpad-registry/+spec/efficient-user-sets
[16:39] sinzui, thanks
=== salgado is now known as salgado-lunch
=== salgado-lunch is now known as salgado
=== salgado is now known as salgado-afk
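
Editor's note on bug 394560: the meeting floated coercing an unknown charset such as x-unknown to a default instead of raising LookupError. A minimal sketch of that idea follows, using only the Python standard library. The function name `decode_with_fallback` and the choice of UTF-8 as the fallback are assumptions made for illustration; this is not Launchpad's actual code or fix.

```python
import codecs


def decode_with_fallback(raw, charset, fallback="utf-8"):
    """Decode bytes using the declared charset, or a fallback if it is unknown.

    Hypothetical helper: the name and the UTF-8 fallback are illustrative
    assumptions, not taken from Launchpad.
    """
    encoding = charset or fallback
    try:
        # codecs.lookup raises LookupError for names such as "x-unknown".
        codecs.lookup(encoding)
    except LookupError:
        encoding = fallback
    # errors="replace" avoids a UnicodeDecodeError on bytes that do not fit.
    return raw.decode(encoding, errors="replace")
```

With this sketch a mail part declaring x-unknown would be decoded as UTF-8 with replacement characters for invalid bytes, so the message could still be processed. Whether that is preferable to rejecting the mail was left open in the meeting.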