[16:00] <matsubara> #startmeeting
[16:00] <MootBot> Meeting started at 10:00. The chair is matsubara.
[16:00] <MootBot> Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
[16:00] <matsubara> Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
[16:00] <matsubara> [TOPIC] Roll Call
[16:00] <MootBot> New Topic:  Roll Call
[16:00] <gary_poster> me
[16:00] <rockstar> ni! ni!
[16:00] <danilos> me
[16:00] <allenap> me
[16:00] <sinzui> me
[16:01] <matsubara> Chex, bigjools, hi
[16:01] <bigjools> me
[16:01] <Chex> Chex: hello
[16:01] <matsubara> ok, everyone is here.
[16:02] <Chex> .. /o\ err, hi
[16:02] <matsubara> apologies from Ursinha and stub
[16:02] <matsubara> [TOPIC] Agenda
[16:02] <MootBot> New Topic:  Agenda
[16:02] <matsubara>  * Actions from last meeting
[16:02] <matsubara>  * Oops report & Critical Bugs & Broken scripts
[16:02] <matsubara>  * Operations report (mthaddon/Chex/spm/mbarnett)
[16:02] <matsubara>  * DBA report (stub)
[16:02] <matsubara>  * Proposed items
[16:02] <matsubara> [TOPIC] * Actions from last meeting
[16:02] <MootBot> New Topic:  * Actions from last meeting
[16:02] <matsubara> * Ursinha to send one email to lp list explaining the qa-tags experiment
[16:02] <matsubara> * Chex to follow up with thumper about the multiple git import failures on the importd
[16:02] <matsubara> * matsubara to file a high/critical bug for OOPS-1430F2574
[16:02] <matsubara> * matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45
[16:02] <matsubara> * matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536
[16:02] <matsubara>     * emailed Tim about it
[16:02] <matsubara> * matsubara to talk to TL about not having the LP production meeting anymore or change its format
[16:02] <matsubara> * matsubara to email QA contacts about next LP prod. meeting at 16UTC
[16:02] <matsubara>     * emailed the list and QA contacts about this :-)
[16:02] <matsubara> * matsubara to email losas about their weekly report
[16:02] <matsubara>     * emailed them requesting that the operation report be sent to the list
[16:03] <matsubara> I talked to salgado about OOPS-1430F2574 and it wasn't necessary to file a bug for that one. it was a one-off problem and I'm keeping an eye on the oops reports in case it shows up again
[16:03] <matsubara> danilos sent the email about the qa-tags experiment
[16:04] <danilos> matsubara, Ursula is on vacation, I'd like her to give us her PoV as well
[16:04] <matsubara> Chex, did you sort out the failures in the git import with thumper?
[16:04] <danilos> matsubara, though, that will likely happen as part of the discussion on the list, so we can probably take the action item off
[16:04] <matsubara> danilos, right. thanks for starting the discussion
[16:04] <Chex> matsubara: I followed up with him briefly, but was not able to resolve anything, I need to talk to him again, sorry about that
[16:04] <danilos> np, it was way overdue
[16:05] <matsubara> Chex, shall I re-add the action item to the list?
[16:05] <Chex> matsubara: yes please do
[16:05] <matsubara> [action] * Chex to follow up with thumper about the multiple git import failures on the importd
[16:05] <MootBot> ACTION received:  * Chex to follow up with thumper about the multiple git import failures on the importd
[16:05] <matsubara> ok, thanks Chex
[16:05] <matsubara> let's move on
[16:06] <matsubara> [TOPIC] * Oops report & Critical Bugs & Broken scripts
[16:06] <MootBot> New Topic:  * Oops report & Critical Bugs & Broken scripts
[16:06] <matsubara> bigjools,  https://bugs.edge.launchpad.net/soyuz/+bug/493703
[16:06] <matsubara> it's targeted to .12, currently not assigned and the cycle will end tomorrow. Any chance of having that one fixed before the holidays? it's generating > 1K OOPS a day (most of them from robots but it's pretty importante nonetheless)
[16:06] <bigjools> matsubara: zero chance
[16:06] <matsubara> s/importante/important/ sorry for the portuguese leakage there :-)
[16:07] <bigjools> heh I am used to it from working with cprov :)
[16:07] <matsubara> bigjools, :-(
[16:07] <matsubara> the sad smile is for the zero chance comment btw
[16:07] <bigjools> yeah, there's another serious problem that is taking precedence.  If by some miracle I get that fixed then we can look at the oopses
[16:07] <matsubara> hmm that's the top OOPS we have
[16:07] <bigjools> gar sorry
[16:08] <bigjools> when will the pain end this week
[16:08] <matsubara> oh, the retry dep thingie?
[16:08] <bigjools> yep
[16:08] <matsubara> right. ok then
[16:08] <matsubara> gary_poster, https://bugs.edge.launchpad.net/launchpad-foundations/+bug/403618
[16:09] <matsubara> gary_poster, same thing, that one is happening quite frequently. any chance of landing a fix before the holidays?
[16:09] <gary_poster> matsubara: holidays, yes, next release, no
[16:09] <gary_poster> where yes is "any chance" :-)
[16:09] <gary_poster> I suppose it can be an RC then
[16:09] <gary_poster> I mean CP
[16:10] <matsubara> gary_poster, all right. as long as they disappear from the OOPS summaries, it's good :-)
[16:10] <gary_poster> :-) understood
[16:10] <matsubara> gary_poster, could you take a look at https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1439EB784 ?
[16:10] <matsubara> it's a timeout error on the api
[16:11] <matsubara> I'm not sure if those are just regular timeouts; if they are, then I'd need to update oops-tools to handle them just like any other timeouts
[16:11] <matsubara> currently they show up in the exceptions section
[16:11] <gary_poster> matsubara: yes, it's another timeout
[16:11] <matsubara> rockstar, OOPS-1438EA844
[16:11] <matsubara> gary_poster, so just a matter of moving that kind of exception to the right section in the summaries?
[16:11]  * rockstar looks
[16:11] <gary_poster> matsubara: i.e., this is something that should be addressed by bugs, not foundations/leonardr
[16:12] <matsubara> gary_poster, ok, looks like a time out using the dupe finder
[16:12] <matsubara> but using the API
[16:12] <matsubara> so, I'll talk to the bugs team about it and sort it out (and file a bug to have oops-tools updated to handle it appropriately)
[16:12] <gary_poster> matsubara: right.  the problem probably needs to be addressed in lp.bugs.model.bugtask, line 571, in findSimilarBugs
[16:13] <matsubara> sinzui, https://bugs.edge.launchpad.net/launchpad-registry/+bug/495051
[16:13] <sinzui> 'nough said
[16:14] <matsubara> allenap, I have a few timeout OOPSes on +filebug. are you interested in those? I know gmb just landed code to make it async so maybe it's just a matter of waiting for that and seeing how things behave
[16:14] <matsubara> sinzui, indeed! thanks dude!
[16:14] <danilos> matsubara, how common is e.g. bug 493703 in OOPSes? it looks simple enough to solve that somebody outside soyuz could fix it? bigjools, is my assessment wrong?
[16:14] <allenap> matsubara: gmb's async stuff probably won't make the timeouts any different, it's just that the user won't be so affected.
[16:15] <bigjools> danilos: noodles is going to look into it
[16:15] <matsubara> [action] matsubara to talk to bugs team about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1439EB784 and file a bug on oops-tools to handle LaunchpadTimeoutError correctly
[16:15] <MootBot> ACTION received:  matsubara to talk to bugs team about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1439EB784 and file a bug on oops-tools to handle LaunchpadTimeoutError correctly
[16:15] <allenap> matsubara: The URL will change slightly, to +filebug-inline-form. Timeouts for this page are far less important.
[16:15] <danilos> bigjools, ok, if you need help finding someone to work on it (though, looking into it might be most of the work anyway), I'd be happy to give a hand tomorrow (in looking for someone, not doing it :)
[16:15] <bigjools> danilos: ok thanks :)
[16:16] <matsubara> danilos, thanks. we have around 1K of those a day (mostly from bots triggering it)
[16:16] <danilos> matsubara, right, thanks
[16:17] <matsubara> allenap, all right. yesterday bigkev was trying to file some bugs and couldn't due to timeouts. I wonder if you need more OOPSes to help investigate the issue
[16:18] <allenap> matsubara: He should be able to file bugs today, because the async dupe-finder is there now. But more OOPS reports are useful, if you have a bug to attach them to?
[16:19] <matsubara> allenap, cool. I'll add those to the bug report then
[16:19] <allenap> matsubara: Thanks :)
[16:19] <matsubara> [action] matsubara to add +filebug timeout oopses to the bug report
[16:19] <MootBot> ACTION received:  matsubara to add +filebug timeout oopses to the bug report
[16:20] <matsubara> rockstar, that SQLObjectNotFound oops is quite strange
[16:20] <matsubara> rockstar, have you seen it before? looks like it happened only twice
[16:20] <rockstar> matsubara, yeah, I'm looking at it.  It probably should have 404'd - Not sure though.
[16:20] <matsubara> and I was unable to reproduce
[16:21] <rockstar> I have not seen it before.  It's probably some corrupted bmp somewhere.
[16:21] <danilos> matsubara, that strikes me as replication-related, but I am no expert :)
[16:21] <matsubara> rockstar, worth a bug report for that one?
[16:21] <rockstar> danilos, yeah, that's what I thought.
[16:21] <rockstar> matsubara, not sure.  I'll look into it, and file one if need be.
[16:21] <matsubara> danilos, rockstar  yeah, same thought here
[16:21] <matsubara> rockstar, cool. thanks! I'll let you know if it happens again
[16:22] <matsubara> we had some script failures since last week
[16:23] <matsubara> Scripts failed to run: loganberry:allocate-revision-karma, loganberry:flag-expired-memberships
[16:23] <matsubara> sinzui, ^ I think that one is yours?
[16:23] <matsubara> the retry depwait script failure is being worked on by bigjools
[16:23] <sinzui> matsubara: We have had intermittent timing issues because of long processes
[16:24] <bigjools> floundering on
[16:24] <sinzui> matsubara: there are no errors and the scripts do run when they get their turn
[16:25] <matsubara> sinzui, ok, so that means that the failures on mizuho:librarian-gc and loganberry:karma-update, loganberry:allocate-revision-karma, loganberry:launchpad-stats, loganberry:expire-questions, loganberry:productreleasefinder, loganberry:update-cache, loganberry:launchpad-targetnamecacheupdate are probably related to that?
[16:25] <matsubara> I guess so, since the last failures for those were 2 days ago
[16:27] <matsubara> we have only one critical bug, which bigjools is on.
[16:27] <sinzui> matsubara: right. I do not investigate a failure to run for 24 hours after the notice because ANOTHER process is responsible for that. When all scripts fail, I might investigate within 24 hours
[16:27] <matsubara> I see. all right then. thanks sinzui
[16:27] <matsubara> and thanks everyone. let's move on
[16:28] <matsubara> [TOPIC] * Operations report (mthaddon/Chex/spm/mbarnett)
[16:28] <MootBot> New Topic:  * Operations report (mthaddon/Chex/spm/mbarnett)
[16:28] <sinzui> matsubara: and i check if we changed production code in the last 24 hours
[16:29] <Chex> hello everyone, a report focused on the LP rollout:
[16:29] <matsubara> Chex, ?
[16:29] <matsubara> wow, nice timing :-)
[16:29] <Chex> - The LP 3.1.11 rollout was last week, and there is an upcoming 'short' LP rollout next week.
[16:29] <Chex> - 3.1.11 roll-out took 2 days, due to some problems with the rollout
[16:29] <Chex>           process. We are working to address these issues for next time.
[16:29] <Chex>     Steps we are taking to improve the process are:
[16:29] <Chex>         : moving to build centrally before pushing code out to speed up pushing and building of code
[16:29] <Chex>         : investigating less error prone ways (and quicker ways) of switching to read-only mode
[16:29] <Chex>         : ensuring we're not interrupted by other DB jobs on other servers in the cluster that block the DB upgrade
[16:29] <Chex> and that's all for this week, questions/comments, anyone??
[16:31] <matsubara> Chex, thanks
[16:31] <Chex> matsubara: you're welcome
[16:31] <matsubara> [TOPIC] * DBA report (stub)
[16:31] <MootBot> New Topic:  * DBA report (stub)
[16:31] <danilos> Chex, yes, are we getting any of this for 3.1.12?
[16:31] <matsubara> oops, sorry, go ahead danilos
[16:32] <matsubara> [action] matsubara to email stub about the DBA report
[16:32] <MootBot> ACTION received:  matsubara to email stub about the DBA report
[16:32] <Chex> danilos: yes, most of those items I listed should make it into the 3.1.12 release, I believe
[16:32] <danilos> Chex, ok, cool, that sounds great then, but there's always potential for failure with new features like that; I'll try to keep an eye on that :)
[16:33] <Chex> danilos: ok, great, we appreciate all and any eyeballs on the process
[16:33] <danilos> Chex, fwiw, I'll be doing a release manager rotation, it's not that I don't trust our lovely LOSA team :)
[16:33] <danilos> bigjools, (add a link to the image :)
[16:34] <bigjools> http://people.canonical.com/~ed/losa-team.png
[16:34] <MootBot> LINK received:  http://people.canonical.com/~ed/losa-team.png
[16:34] <matsubara> LOL
[16:35] <danilos> thank you, we can move on :)
[16:35] <matsubara> sorry, got very distracted by that picture hehe
[16:35] <matsubara> [TOPIC] * Proposed items
[16:35] <MootBot> New Topic:  * Proposed items
[16:35] <matsubara> there are no new proposed items
[16:36] <matsubara> the new meeting time seems to work fine for everyone
[16:36] <matsubara> anything else before I close?
[16:36] <danilos> And if anyone has any issues that may need tracking, please ping me as the release manager for 3.1.12. Thank you.
[16:37] <bigjools> danilos, my hero
[16:37] <matsubara> Thank you all for attending this week's Launchpad Production Meeting. See https://dev.launchpad.net/MeetingAgenda for the logs.
[16:37] <matsubara> #endmeeting
[16:37] <MootBot> Meeting finished at 10:37.
[16:37] <danilos> cheers