[15:04] <matsubara> #startmeeting
[15:04] <MootBot> Meeting started at 09:04. The chair is matsubara.
[15:04] <MootBot> Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
[15:04] <matsubara> Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
[15:04] <sinzui> me
[15:04] <matsubara> [TOPIC] Roll Call
[15:04] <MootBot> New Topic:  Roll Call
[15:04] <allenap> me
[15:04] <matsubara> sorry for being late, I was in my stand-up meeting
[15:05] <matsubara> stub, Chex, gary_poster, rockstar, bigjools, danilo_: Hi
[15:05] <gary_poster> matsubara: hi
[15:05] <danilos> me
[15:05] <bigjools> matsubara: TLs are in a call.... :/
[15:05] <bigjools> noodles775: can you cover me please?
[15:05] <noodles775> bigjools: OK.
[15:05] <matsubara> bigjools, right, by the end of this meeting sinzui will propose a change for the production meeting so it won't clash anymore
[15:05] <bigjools> ta
[15:05] <danilos> bigjools, we've got an agenda item to move the call, so you should be in for at least that
[15:07] <matsubara> ok, so let's move on. Chex, rockstar and stub can join in later
[15:08] <matsubara> [TOPIC] Agenda
[15:08] <matsubara>  * Actions from last meeting
[15:08] <matsubara>  * Oops report & Critical Bugs & Broken scripts
[15:08] <matsubara>  * Operations report (mthaddon/Chex/spm/mbarnett)
[15:08] <matsubara>  * DBA report (stub)
[15:08] <MootBot> New Topic:  Agenda
[15:08] <matsubara>  * Proposed items
[15:08] <matsubara> [TOPIC] * Actions from last meeting
[15:08] <MootBot> New Topic:  * Actions from last meeting
[15:08] <matsubara> * Ursinha to send one email to lp list explaining the qa-tags experiment
[15:08] <matsubara> * matsubara to chase someone from code team about bug 480000
[15:08] <matsubara> * matsubara to chase code people about code script failures (create-merge-proposals, branch puller and update branches)
[15:08] <matsubara> * matsubara to ask someone from code about bug 485318
[15:08] <matsubara>     * emailed tim about these
[15:08] <matsubara> * Chex to follow up with thumper about the multiple git import failures on the importd
[15:08] <matsubara> * matsubara to file a bug for OOPS-1420ED1047
[15:08] <matsubara>     * there was a bug filed for this already. Bug 484368
[15:08] <matsubara> * sinzui to investigate failure on the mirror prober (The script 'distributionmirror-prober' didn't run on 'loganberry' between 2009-11-23 06:07:04 and 2009-11-23 12:07:04 (last seen 2009-11-23 04:54:10.444057))
[15:08] <matsubara> * matsubara to ask gary about python2.5 update and get back to losas
[15:08] <matsubara>     * francis emailed the list and gary about this
[15:08] <matsubara> * matsubara to ask stub to contact losas about load increase on wildcherry
[15:08] <matsubara>     * emailed stub about this one
[15:09] <matsubara> I don't recall an email explaining the qa-tags experiment but Ursula did show me a wiki page before leaving on vacation
[15:09] <sinzui> matsubara: the script was delayed, ran fine later. The same thing happened to the PRF. it is running fine
[15:09] <matsubara> danilo_, do you know if that email was done?
[15:09] <matsubara> sinzui, thanks for checking
[15:09] <matsubara> s/done/sent/
[15:11] <matsubara> mthaddon, around?
[15:12] <matsubara> I guess people are too busy with other stuff
[15:12] <matsubara> let's move on
[15:12] <matsubara> [action] * Ursinha to send one email to lp list explaining the qa-tags experiment
[15:12] <MootBot> ACTION received:  * Ursinha to send one email to lp list explaining the qa-tags experiment
[15:12] <matsubara> [action] * Chex to follow up with thumper about the multiple git import failures on the importd
[15:12] <MootBot> ACTION received:  * Chex to follow up with thumper about the multiple git import failures on the importd
[15:13] <matsubara> [TOPIC] * Oops report & Critical Bugs & Broken scripts
[15:13] <MootBot> New Topic:  * Oops report & Critical Bugs & Broken scripts
[15:13] <matsubara> danilo_, are https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1425EA795 and https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1429EB593 related to bug https://bugs.edge.launchpad.net/rosetta/+bug/484368?
[15:14] <matsubara> sinzui, https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1430F2574
[15:14] <matsubara> sinzui, is this registry or foundations?
[15:14] <sinzui> foundations
[15:14] <sinzui> matsubara: high/critical
[15:15] <sinzui> matsubara: This may be caused by the replication delay
[15:15] <matsubara> all right. I'll file a bug for it and ask gary_poster to take a look
[15:15] <matsubara> [action] matsubara to file a high/critical bug for OOPS-1430F2574
[15:15] <MootBot> ACTION received:  matsubara to file a high/critical bug for OOPS-1430F2574
[15:16] <matsubara> rockstar, https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45 I've seen 4 occurrences of this oops last week, is this a known issue? some bad data? worth a bug?
[15:16] <matsubara> https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536 shouldn't this one be a NotFound rather than a NotFoundError?
[15:16] <matsubara> I guess I'll have to email tim about those as well
[15:16] <gary_poster> OOPS-1430F2574 : I agree that this is probably replication
[15:16] <matsubara> [action] matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45 I've seen 4 occurrences of this oops last week, is this a known issue? some bad data? worth a bug?
 https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536 shouldn't this one be a NotFound rather than a NotFoundError?
[15:16] <MootBot> ACTION received:  matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45 I've seen 4 occurrences of this oops last week, is this a known issue? some bad data? worth a bug?
[15:16] <matsubara> damn
[15:17] <matsubara> [action] matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536  shouldn't this one be a NotFound rather than a NotFoundError?
[15:17] <MootBot> ACTION received:  matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536  shouldn't this one be a NotFound rather than a NotFoundError?
[15:18] <matsubara> we have 5 critical bugs, 4 of them fix committed and 1 in progress
[15:18] <matsubara> so all good in that area
[15:19] <matsubara> no script failures since last week (well, only PRF but that's fine per sinzui)
[15:20] <matsubara> [TOPIC] * Operations report (mthaddon/Chex/spm/mbarnett)
[15:20] <MootBot> New Topic:  * Operations report (mthaddon/Chex/spm/mbarnett)
[15:21] <matsubara> let's move to the next topic as there's no losa around
[15:21] <matsubara> [TOPIC] * DBA report (stub)
[15:21] <MootBot> New Topic:  * DBA report (stub)
[15:21] <stub> We have had two incidents where appserver requests have sent the load on the main database server over 100 in some sort of a feedback loop we dubbed the DB Death Spiral. We think we tracked down the trigger - the page the load balancers used to detect if Launchpad is up accessed the session database, and our session machinery becomes a bottleneck under load.
[15:22] <stub> What we hope is the immediate fix lands tomorrow - stopping that page from accessing the database. I have plans of offloading the bulk of the session machinery work to memcache so it should stop becoming a bottleneck under load, but that is work for the next cycle or two.
[15:22] <stub> We also managed to have replication issues, because when it rains it pours. Both times were to do with adding a new replica into the cluster.
[15:22] <stub> The first time, it turned out some events were left around that should have been cleared up, causing conflicts. So when one of our replicas tried to confirm it had seen an event, it found the confirmation was already there so it aborted.
[15:22] <stub> The second one, today, removing the replica from the cluster hadn't quite succeeded so replication lag on the cluster was increasing. This wasn't noticed or was ignored, and we attempted to re-add the database back into a heavily lagged cluster. This needed recovering. I don't think users were affected today.
[15:23] <stub> And that is all I've typed so far ;)
[15:24] <matsubara> stub, should I expect to see lots of OOPSes in the reports about this replication lag issue?
[15:24] <stub> I've got a bug open to add some more safety belts to our helpers to catch these cases.
[15:24] <stub> matsubara: Hopefully not. I'm not sure though.
[15:25] <matsubara> stub, all right. I'll let you know if I spot anything
[15:25] <matsubara> thanks stub
[15:25] <matsubara> [TOPIC] * Proposed items
[15:25] <MootBot> New Topic:  * Proposed items
[15:26] <matsubara> # Move the production meeting one hour later to avoid clash with other meetings (sinzui)
[15:26] <sinzui> please
[15:26] <sinzui> I am in another meeting right now
[15:26] <matsubara> I'm +1 on the change. it'd be actually better for me to have the meeting at 16UTC
[15:26] <matsubara> how about the others?
[15:27] <noodles775> I'm assuming that bigjools is also +1 for the same reason.
[15:27] <matsubara> and danilo too
[15:27] <bigjools> +1
[15:28] <sinzui> +1
[15:28] <stub> That is getting nuts for me, but I can do the report by email just as easily as typing it up here.
[15:28] <matsubara> on the other hand, what do you think about not having this meeting at all anymore? do you think it's useful or the format could be changed? I see lots of people missing this meeting or not paying much attention...
[15:29] <matsubara> stub, your section and the losas section are the ones that interest me the most :-)
[15:29] <matsubara> stub, reports by email are fine by me, not sure about others
[15:30] <stub> I tend to think email would be a better forum rather than playing Chinese whispers.
[15:30] <matsubara> yeah
[15:32] <matsubara> [action] matsubara to talk to TL about not having the LP production meeting anymore or change its format
[15:32] <MootBot> ACTION received:  matsubara to talk to TL about not having the LP production meeting anymore or change its format
[15:32] <matsubara> and for the next one, let's try to have it at 16UTC. I'll email the QA contacts to let everyone know.
[15:33] <matsubara> [action] matsubara to email QA contacts about next LP prod. meeting at 16UTC
[15:33] <MootBot> ACTION received:  matsubara to email QA contacts about next LP prod. meeting at 16UTC
[15:33] <stub> At that hour, it will be drunk from a gogo bar :)
[15:33] <matsubara> [action] matsubara to email losas about their weekly report
[15:33] <MootBot> ACTION received:  matsubara to email losas about their weekly report
[15:33] <matsubara> hehe
[15:33] <matsubara> and I think that's all for today
[15:33] <matsubara> Thank you all for attending this week's Launchpad Production Meeting. See https://dev.launchpad.net/MeetingAgenda for the logs.
[15:34] <noodles775> Thanks matsubara
[15:34] <matsubara> #endmeeting
[15:34] <MootBot> Meeting finished at 09:34.