=== thumper is now known as eric-the-viking
=== eric-the-viking is now known as thumper
=== mrevell is now known as mrevell-lunch
=== mrevell-lunch is now known as mrevell
=== danilos is now known as danilo[home]
=== salgado is now known as salgado-lunch
=== matsubara-lunch is now known as matsubara
[16:00] #startmeeting
[16:00] Meeting started at 10:00. The chair is matsubara.
[16:00] Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
[16:00] Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
[16:00] [TOPIC] Roll Call
[16:00] New Topic: Roll Call
[16:00] me
[16:00] ni! ni!
[16:00] me
[16:00] me
[16:00] me
[16:01] Chex, bigjools, hi
[16:01] me
[16:01] Chex: hello
[16:01] ok, everyone is here.
[16:02] .. /o\ err, hi
[16:02] apologies from Ursinha and stub
[16:02] [TOPIC] Agenda
[16:02] New Topic: Agenda
[16:02] * Actions from last meeting
[16:02] * Oops report & Critical Bugs & Broken scripts
[16:02] * Operations report (mthaddon/Chex/spm/mbarnett)
[16:02] * DBA report (stub)
[16:02] * Proposed items
[16:02] [TOPIC] * Actions from last meeting
[16:02] New Topic: * Actions from last meeting
[16:02] * Ursinha to send one email to lp list explaining the qa-tags experiment
[16:02] * Chex to follow up with thumper about the multiple git import failures on the importd
[16:02] * matsubara to file a high/critical bug for OOPS-1430F2574
[16:02] * matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45
[16:02] * matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536
[16:02] https://lp-oops.canonical.com/oops.py/?oopsid=1430F2574
[16:02] * emailed Tim about it
[16:02] https://lp-oops.canonical.com/oops.py/?oopsid=1427EA45
[16:02] * matsubara to talk to TL about not having the LP production meeting anymore or changing its format
[16:02] https://lp-oops.canonical.com/oops.py/?oopsid=1426EC1536
[16:02] * matsubara to email QA contacts about next LP prod. meeting at 16UTC
[16:02] * emailed the list and QA contacts about this :-)
[16:02] * matsubara to email losas about their weekly report
[16:02] * emailed them requesting that the operation report be sent to the list
[16:03] I talked to salgado about OOPS-1430F2574 and it wasn't necessary to file a bug for that one. it was a one off problem and I'm keeping an eye on oops reports if it shows up again
[16:03] https://lp-oops.canonical.com/oops.py/?oopsid=1430F2574
[16:03] danilos sent the email about the qa-tags experiment
[16:04] matsubara, Ursula is on vacation, I'd like her to give us her PoV as well
[16:04] Chex, did you sort out the failures in the git import with thumper?
[16:04] matsubara, though, that will likely happen as part of the discussion on the list, so we can probably take the action item off
[16:04] danilos, right. thanks for starting the discussion
[16:04] matsubara: I followed up with him briefly, but was not able to resolve anything, I need to talk to him again, sorry about that
[16:04] np, it was way overdue
[16:05] Chex, shall I re-add the action item to the list?
[16:05] matsubara: yes please do
[16:05] [action] * Chex to follow up with thumper about the multiple git import failures on the importd
[16:05] ACTION received: * Chex to follow up with thumper about the multiple git import failures on the importd
[16:05] ok, thanks Chex
[16:05] let's move on
[16:06] [TOPIC] * Oops report & Critical Bugs & Broken scripts
[16:06] New Topic: * Oops report & Critical Bugs & Broken scripts
[16:06] bigjools, https://bugs.edge.launchpad.net/soyuz/+bug/493703
[16:06] Ubuntu bug 493703 in soyuz "LocationError raised in build page and distribution arch series binary package page" [High,Triaged]
[16:06] it's targeted to .12, currently not assigned and the cycle will end tomorrow. Any chance of having that one fixed before the holidays? it's generating > 1K OOPS a day (most of them from robots but it's pretty importante nonetheless)
[16:06] matsubara: zero chance
[16:06] s/importante/important/ sorry for the portuguese leakage there :-)
[16:07] heh I am used to it from working with cprov :)
[16:07] bigjools, :-(
[16:07] the sad smile is for the zero chance comment btw
[16:07] yeah, there's another serious problem that is taking precedence. If by some miracle I get that fixed then we can look at the oopses
[16:07] hmm that's the top OOPS we have
[16:07] gar sorry
[16:08] when will the pain end this week
[16:08] oh, the retry dep thingie?
[16:08] yep
[16:08] right. ok then
[16:08] gary_poster, https://bugs.edge.launchpad.net/launchpad-foundations/+bug/403618
[16:08] Ubuntu bug 403618 in launchpad-foundations "Launchpad should return a 404 instead of ForbiddenAttribute OOPS" [High,Triaged]
[16:09] gary_poster, same thing, that one is happening quite frequently. any chance of landing a fix before the holidays?
[16:09] matsubara: holidays, yes, next release, no
[16:09] where yes is "any chance" :-)
[16:09] I suppose it can be an RC then
[16:09] I mean CP
[16:10] gary_poster, all right. as long as they disappear from the OOPS summaries, it's good :-)
[16:10] :-) understood
[16:10] gary_poster, could you take a look at https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1439EB784 ?
[16:10] https://lp-oops.canonical.com/oops.py/?oopsid=1439EB784
[16:10] it's a timeout error on the api
[16:11] I'm not sure if it's just a regular timeout; if it is, then I'd need to update oops-tools to handle those just like any other timeouts
[16:11] currently they show up in the exceptions section
[16:11] matsubara: yes, it's another timeout
[16:11] rockstar, OOPS-1438EA844
[16:11] https://lp-oops.canonical.com/oops.py/?oopsid=1438EA844
[16:11] gary_poster, so just a matter of moving that kind of exception to the right section in the summaries?
[16:11] * rockstar looks
[16:11] matsubara: i.e., this is something that should be addressed by bugs, not foundations/leonardr
[16:12] gary_poster, ok, looks like a timeout using the dupe finder
[16:12] but using the API
[16:12] so, I'll talk to the bugs team about it and sort it out (and file a bug to have oops-tools updated to handle it appropriately)
[16:12] matsubara: right. the problem probably needs to be addressed in lp.bugs.model.bugtask, line 571, in findSimilarBugs
[16:13] sinzui, https://bugs.edge.launchpad.net/launchpad-registry/+bug/495051
[16:13] Ubuntu bug 495051 in launchpad-registry "UnboundLocalError editing proposed team membership" [High,In progress]
[16:13] 'nough said
[16:14] allenap, I have a few timeout OOPSes on +filebug. are you interested in those? I know gmb just landed code to make it async so maybe it's just a matter of waiting for that and seeing how things will behave
[16:14] sinzui, indeed! thanks dude!
[16:14] matsubara, how common is e.g. bug 493703 in OOPSes? it looks reasonably simple to solve, so that somebody outside soyuz can fix it? bigjools, is my assessment wrong?
[16:14] Launchpad bug 493703 in soyuz "LocationError raised in build page and distribution arch series binary package page" [High,Triaged] https://launchpad.net/bugs/493703
[16:14] matsubara: gmb's async stuff probably won't make the timeouts any different, it's just that the user won't be so affected.
[16:15] danilos: noodles is going to look into it
[16:15] [action] matsubara to talk to bugs team about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1439EB784 and file a bug on oops-tools to handle LaunchpadTimeoutError correctly
[16:15] https://lp-oops.canonical.com/oops.py/?oopsid=1439EB784
[16:15] ACTION received: matsubara to talk to bugs team about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1439EB784 and file a bug on oops-tools to handle LaunchpadTimeoutError correctly
[16:15] https://lp-oops.canonical.com/oops.py/?oopsid=1439EB784
[16:15] matsubara: The URL will change slightly, to +filebug-inline-form. Timeouts for this page are far less important.
[16:15] bigjools, ok, if you need help finding someone to work on it (though, looking into it might be most of the work anyway), I'd be happy to give a hand tomorrow (in looking for someone, not doing it :)
[16:15] danilos: ok thanks :)
[16:16] danilos, thanks. we have around 1K of those a day (mostly from bots triggering it)
[16:16] matsubara, right, thanks
[16:17] allenap, all right. yesterday bigkev was trying to file some bugs and couldn't due to timeouts. I wonder if you need more OOPSes to help investigate the issue
[16:18] matsubara: He should be able to file bugs today, because the async dupe-finder is there now. But more OOPS reports are useful, if you have a bug to attach them to?
[16:19] allenap, cool. I'll add those to the bug report then
[16:19] matsubara: Thanks :)
[16:19] [action] matsubara to add +filebug timeout oopses to the bug report
[16:19] ACTION received: matsubara to add +filebug timeout oopses to the bug report
[16:20] rockstar, that SQLObjectNotFound oops is quite strange
[16:20] rockstar, have you seen it before? looks like it happened only twice
[16:20] matsubara, yeah, I'm looking at it. It probably should have 404'd - Not sure though.
[16:20] and I was unable to reproduce
[16:21] I have not seen it before. It's probably some corrupted bmp somewhere.
[16:21] matsubara, that strikes me as replication-related, but I am no expert :)
[16:21] rockstar, worth a bug report for that one?
[16:21] danilos, yeah, that's what I thought.
[16:21] matsubara, not sure. I'll look into it, and file one if need be.
[16:21] danilos, rockstar yeah, same thought here
[16:21] rockstar, cool. thanks! I'll let you know if it happens again
[16:22] we had some script failures since last week
[16:23] Scripts failed to run: loganberry:allocate-revision-karma, loganberry:flag-expired-memberships
[16:23] sinzui, ^ I think that one is yours?
[16:23] the retry depwait script failure is being worked on by bigjools
[16:23] matsubara: We have had intermittent timing issues because of long processes
[16:24] floundering on
[16:24] matsubara: there are no errors and the scripts do run when they get their turn
[16:25] sinzui, ok, so that means that the failures on mizuho:librarian-gc and loganberry:karma-update, loganberry:allocate-revision-karma, loganberry:launchpad-stats, loganberry:expire-questions, loganberry:productreleasefinder, loganberry:update-cache, loganberry:launchpad-targetnamecacheupdate are probably related to that?
[16:25] I guess so, since the last failures for those were 2 days ago
[16:27] we have only one critical bug, which bigjools is on.
[16:27] matsubara: right. I do not investigate a failure to run for 24 hours after the notice because ANOTHER process is responsible for that. When all scripts fail, I might investigate within 24 hours
[16:27] I see. all right then. thanks sinzui
[16:27] and thanks everyone. let's move on
[16:28] [TOPIC] * Operations report (mthaddon/Chex/spm/mbarnett)
[16:28] New Topic: * Operations report (mthaddon/Chex/spm/mbarnett)
[16:28] matsubara: and i check if we changed production code in the last 24 hours
[16:29] hello everyone, a report focused on the LP rollout:
[16:29] Chex, ?
[16:29] wow, nice timing :-)
[16:29] - The LP 3.1.11 rollout was last week, and there is an upcoming 'short' LP rollout next week.
[16:29] - 3.1.11 roll-out took 2 days, due to some problems with the rollout
[16:29] process. We are working to address these issues for next time.
[16:29] Steps we are taking to improve the process are:
[16:29] : moving to build centrally before pushing code out to speed up pushing and building of code
[16:29] : investigating less error prone ways (and quicker ways) of switching to read-only mode
[16:29] : ensuring we're not interrupted by other DB jobs on other servers in the cluster that block the DB upgrade
[16:29] and that's all for this week, questions/comments, anyone??
[16:31] Chex, thanks
[16:31] matsubara: you're welcome
[16:31] [TOPIC] * DBA report (stub)
[16:31] New Topic: * DBA report (stub)
[16:31] Chex, yes, are we getting any of this for 3.1.12?
[16:31] oops, sorry, go ahead danilos
[16:32] [action] matsubara to email stub about the DBA report
[16:32] ACTION received: matsubara to email stub about the DBA report
[16:32] danilos: yes, most of those items I listed should make it into the 3.1.12 release, I believe
[16:32] Chex, ok, cool, that sounds great then, but there's always potential for failure with new features like that; I'll try to keep an eye on that :)
[16:33] danilos: ok, great, we appreciate all and any eyeballs on the process
[16:33] Chex, fwiw, I'll be doing a release manager rotation, it's not that I don't trust our lovely LOSA team :)
[16:33] bigjools, (add a link to the image :)
[16:34] http://people.canonical.com/~ed/losa-team.png
[16:34] LINK received: http://people.canonical.com/~ed/losa-team.png
[16:34] LOL
[16:35] thank you, we can move on :)
[16:35] sorry, got very distracted by that picture hehe
[16:35] [TOPIC] * Proposed items
[16:35] New Topic: * Proposed items
[16:35] there are no new proposed items
[16:36] the new meeting time seems to work fine for everyone
[16:36] anything else before I close?
[16:36] And if anyone has any issues that may need tracking, please ping me as the release manager for 3.1.12. Thank you.
[16:37] danilos, my hero
=== salgado-lunch is now known as salgado
[16:37] Thank you all for attending this week's Launchpad Production Meeting. See https://dev.launchpad.net/MeetingAgenda for the logs.
[16:37] #endmeeting
[16:37] Meeting finished at 10:37.
[16:37] cheers
=== danilos is now known as danilo-bblot
=== salgado is now known as salgado-afk
=== matsubara is now known as matsubara-afk
=== EdwinGrubbs_ is now known as EdwinGrubbs