/srv/irclogs.ubuntu.com/2009/02/26/#launchpad-meeting.txt

=== salgado is now known as salgado-afk
=== danilo_ is now known as danilos
=== danilos is now known as danilo-afk
=== danilo-afk is now known as danilos
=== danilos is now known as danilo-afk
=== Ursinha__ is now known as Ursinha
=== Ursinha is now known as Guest41422
=== mrevell is now known as mrevell-dejeuner
=== mrevell-dejeuner is now known as mrevell
matsubara#startmeeting15:00
MootBotMeeting started at 09:00. The chair is matsubara.15:00
MootBotCommands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]15:00
matsubaraWelcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.15:00
matsubara[TOPIC] Roll Call15:00
MootBotNew Topic:  Roll Call15:00
matsubaraNot on the Launchpad Dev team? Welcome! Come "me" with the rest of us!15:00
henningeme15:00
Ursinhame15:00
matsubaraUrsinha, flacoste, bigjools, intellectronica, herb15:00
bigjoolsme15:00
herbme15:00
matsubarabac, ping15:00
flacosteme15:00
Ursinhamatsubara, already answered15:01
intellectronicame15:01
matsubararockstar, hi15:01
rockstarme15:01
rockstarmatsubara, hi15:01
bacme15:01
matsubaraok, stub can join later. everyone else is here.15:03
matsubara[TOPIC] Agenda15:03
MootBotNew Topic:  Agenda15:03
matsubara * Actions from last meeting15:03
matsubara * Oops report & Critical Bugs15:03
matsubara * Operations report (mthaddon/herb/spm)15:03
matsubara * DBA report (DBA contact)15:03
matsubara[TOPIC] * Actions from last meeting15:03
MootBotNew Topic:  * Actions from last meeting15:03
matsubara * stub to investigate the fix to avoid staging restore problems15:03
matsubara * matsubara to chase rockstar about a fix for OOPS-1138CEMAIL1215:03
matsubara    * asked jml about this. It's bug 326056 and had importance raised.15:03
matsubara * cprov and bigjools to investigate OOPS-1145EA1415:03
matsubara * Ursinha to file bugs:15:03
matsubara    * Bug 333072: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1143EB18915:03
ubottuLaunchpad bug 326056 in launchpad-bazaar "OOPS on BadStateTransition when reviewing code by mail" [High,Triaged] https://launchpad.net/bugs/32605615:03
matsubara    * Bug 333071: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1145EA1415:03
ubottuLaunchpad bug 333072 in soyuz "AttributeError OOPS on Build:+index" [Undecided,Invalid] https://launchpad.net/bugs/33307215:03
ubottuLaunchpad bug 333071 in soyuz "AssertionError OOPS on +copy-packages" [High,Triaged] https://launchpad.net/bugs/33307115:03
bigjools333072 is invalid15:04
matsubarabigjools, any news about 333071?15:04
bigjoolsyes, it's not too serious, we've set it for 2.2.315:04
bigjoolsit's a corner case in the copying15:04
bigjoolsdespite the doom-mongering error message15:05
matsubaraok. thanks bigjools15:05
matsubara[action] matsubara to chase stub about staging restore problems15:06
MootBotACTION received:  matsubara to chase stub about staging restore problems15:06
matsubara[TOPIC] * Oops report & Critical Bugs15:06
MootBotNew Topic:  * Oops report & Critical Bugs15:06
* matsubara hands Ursinha the mic15:06
* Ursinha looks15:06
* rockstar runs15:06
Ursinharegistry, foundations, code and bugs: oopses for you15:07
UrsinhaRegistry:-15:07
Ursinhahttps://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153E91915:07
Ursinhahttps://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153A1135 (or foundations, not sure)15:07
UrsinhaFoundations:-15:07
Ursinhahttps://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153D66715:07
UrsinhaCode:15:07
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153E91915:07
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153A113515:07
Ursinhahttps://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1152XMLP115:07
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153D66715:07
UrsinhaBugs:15:07
Ursinhahttps://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1152EA16215:07
Ursinha~15:07
Ursinharockstar, ha!15:07
Ursinharockstar, have you seen this one: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1152XMLP1?15:07
rockstarUrsinha, looking at all of them now.15:08
Ursinharockstar, you can just look at code's one :)15:08
Ursinhasinzui, hi15:08
Ursinhasinzui, I'm not sure if https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153A1135 is foundations or registry15:09
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153A113515:09
matsubaraUrsinha, looks like registry15:09
intellectronicaUrsinha: strange. do you see lots of those?15:09
bacUrsinha: yes, looks like registry15:09
Ursinhaintellectronica, no, actually not15:09
Ursinhaintellectronica, but never saw one of those before15:09
Ursinhaso better bring to attention15:10
matsubaraintellectronica, Ursinha that one looks like caused by the rollout15:10
sinzuiUrsinha I don't know the answer either. I will look into it and assign it. I suspect salgado-afk is working on15:10
Ursinhamatsubara, even in the time it happened?15:10
intellectronicamatsubara: i also thought so15:10
=== salgado-afk is now known as salgado
intellectronicabut it is quite early15:10
Ursinhaintellectronica, I've discarded the rollout possibility because of its timestamp15:11
Ursinhasinzui, thanks for that15:11
matsubarayeah, too early to be caused by the rollout.15:11
Ursinhaintellectronica, can you take a look then, please?15:11
rockstarUrsinha, I'll have to investigate our oops.  It's the XML-RPC server, and it requires the sacrifice of a virgin goat.15:11
matsubaracheck OSAs incident log to see if something happened during that time15:11
intellectronicaso, this isn't really a bugs oops, but i don't know whether it's rollout-related or not. fwiw it's more than three hours before rollout, so it's hard to see how it would be related15:11
Ursinharockstar, oh, I have a bunch here in my backyard if you need some15:11
rockstarUrsinha, :)15:12
Ursinhaintellectronica, I'll do what matsubara suggested15:12
matsubara[action] ursinha to check OSAs incident log to help identify cause of OOPS-1152EA16215:13
MootBotACTION received:  ursinha to check OSAs incident log to help identify cause of OOPS-1152EA16215:13
Ursinhathanks intellectronica and matsubara15:14
matsubara[action] rockstar to investigate xmlrpc oops OOPS-1152XMLP115:14
MootBotACTION received:  rockstar to investigate xmlrpc oops OOPS-1152XMLP115:14
Ursinhaflacoste, hi15:14
henningeTranslations is happy, that POFile:+translate dropped from the timeout top ten now ..15:14
henningebtw15:14
henninge;)15:15
Ursinhahenninge, indeed, congrats to translate team :)15:15
Ursinhatranslations15:15
Ursinhathere he is :)15:15
henningeUrsinha: thank you, I will pass it on.15:15
Ursinhasinzui, about the other oops15:15
stubSorry - on a call and didn't realize the time15:16
sinzuibac: can you look at it.15:16
bacUrsinha: they seem to be related (acting for sinzui today)15:16
* sinzui is in another meeting15:16
flacostehmm15:16
flacostei'd say registry15:16
bacyes, i think registry for both15:17
flacosteUrsinha are you talking about OOPS-1153A1135?15:17
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153A113515:17
Ursinhabac, hi :) so, can you take a look in both oopses? do you need me to file bugs about them?15:17
* Ursinha looks15:17
bacUrsinha: yes i'll look at them both15:17
baci can open the bugs15:17
bacunless you need the karma15:17
Ursinhaflacoste, no, https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153D66715:17
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153D66715:17
Ursinhabac, haha, no15:17
flacosteUrsinha: that's also a registry query15:18
Ursinha[action] bac to file bugs and take care of https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153E919 and https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153A113515:18
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153E91915:18
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153A113515:18
matsubara[action] bac to file bugs for OOPS-1153E919 and OOPS-1153A113515:18
MootBotACTION received:  bac to file bugs for OOPS-1153E919 and OOPS-1153A113515:18
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153E91915:18
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153A113515:18
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153E91915:18
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153A113515:18
bacwow, y'all are insistent today!  :)15:19
Ursinha:)15:19
Ursinhaflacoste, hm.15:19
Ursinhathanks15:19
Ursinhabac, can you take a look at that too?15:19
bacwhich?15:20
Ursinhapromise not to paste the oops again15:20
=== danilo-afk is now known as danilos
Ursinhabac, https://devpad.canonical.com/~jamesh/oops.cgi/1153D66715:20
UrsinhaI tried :)15:20
* bac looks15:20
bacyes15:20
Ursinhabac, thanks15:21
Ursinhathat's all from me from the oops land15:21
matsubara[action] bac to also file a bug and take care of OOPS-1153D66715:21
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153D66715:22
MootBotACTION received:  bac to also file a bug and take care of OOPS-1153D66715:22
ubottuhttps://devpad.canonical.com/~jamesh/oops.cgi/1153D66715:22
matsubaraok, thanks everyone.15:22
matsubara[TOPIC] * Operations report (mthaddon/herb/spm)15:22
MootBotNew Topic:  * Operations report (mthaddon/herb/spm)15:22
Ursinhathere's one critical bug, though15:22
Ursinhaargh15:22
Ursinhabad bad timing15:22
herbshall I wait for the critical bug?15:23
Ursinhaherb, just a second, let me check with henninge15:23
matsubaradanilo is handling the critical bug, so won't duplicate what's in the bug report.15:23
matsubarait's bug 33478715:23
Ursinhamatsubara, okay, if you say so15:23
ubottuLaunchpad bug 334787 in rosetta "Ubuntu packagers are not translation editors (assertion error)" [Critical,In progress] https://launchpad.net/bugs/33478715:23
matsubaralet's move on15:23
Ursinhago ahead herb, thanks15:24
herb2009-02-20 - We had an issue that may have caused some users to experience intermittent outages on Launchpad. I worked with joey and flacosted to find the issue. joey's notes were sent to the list. I would be interested in hearing any updates we might have on this issue.15:24
herb2009-02-21 and 2009-02-22 - It appears we had a bit of buggy code land on edge that caused a performance problem on both edge and production. The revision was backed out and I believe the code has been fixed.15:24
herb2009-02-26 - We rolled out 2.2.2 based on r776315:24
herbWe continue to see problems relating to bug #156453 and bug #118625. So much so that we're going to start bouncing codebrowse regularly to hopefully head off any issues. I want to emphasize that this will be masking the problem and we really do need to find the root cause and fix it.15:24
ubottuLaunchpad bug 156453 in loggerhead "production loggerhead branch leaks memory" [Critical,Triaged] https://launchpad.net/bugs/15645315:24
ubottuLaunchpad bug 118625 in launchpad-bazaar "codebrowse sometimes hangs" [High,Triaged] https://launchpad.net/bugs/11862515:24
herbBug #260171 continues to creep up regularly (every few days). This is already marked as high and I know that mwhudson's plate is full with codebrowse issues, but can we get an update on this one?15:24
ubottuBug 260171 on http://launchpad.net/bugs/260171 is private15:24
* herb somehow managed to change flacoste into a verb.15:24
danilosmatsubara, Ursinha: I am running tests on the critical bug fix, will let you know once it has landed15:24
flacostei saw!15:25
baci've been flacosted!15:25
matsubaradanilos, thanks15:25
Ursinhathanks danilos15:25
matsubararockstar, can you bring up the codebrowse issue to the code team?15:25
rockstarmatsubara, everyday.  :)15:25
matsubararockstar, thanks :-)15:25
rockstarCodebrowse is being ACTIVELY worked on.  It'd be nice if we knew what the issue is.  Right now, we're just fixing things and hoping that was the problem.15:26
herbrockstar: let the losas know if there is anything we can do to help.15:26
rockstarherb, we certainly will.15:26
stubShould we be bringing in any outside help to instrument, test and diagnose the issue?15:27
matsubaraherb, anything happened to the DB during the time of this OOPS-1152EA162?15:27
matsubaraor maybe stub might know ^15:28
herbmatsubara: nothing in the incident log.15:28
stubmatsubara: That is one of the connection reaper scripts kicking in15:28
herbmatsubara: I think that's also on the void between LOSAs.15:29
herbah, there we go.15:29
stubWe kill connections idle in a transaction more than a few hours (and should be more aggressive), and appserver connections that have been in a transaction for more than 2 minutes.15:29
Ursinhastub, I see15:30
matsubarastub, ok. so if we start seeing too many of those, we have a problem somewhere and a few is kinda normal?15:30
stubThe notification gets sent to the error-reports list (where we can confirm that this is indeed what happened)15:30
matsubarastub, aha. that's better. I'll chase the lp-errors for that one15:31
matsubaras/lp-errors/lp-errors list/15:31
stubIf we see many of them, we have a problem. One is probably a problem - appserver requests taking two minutes on the db means we need to investigate why the normal timeout mechanisms didn't work.15:31
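A minimal sketch of the reaping rule stub describes above, not the actual reaper scripts. The `Connection` record, its field names, and the 3-hour value standing in for "a few hours" are assumptions made for illustration; only the 2-minute appserver limit comes from the discussion.

```python
from dataclasses import dataclass

# "more than 2 minutes" for appserver connections, per the discussion.
APPSERVER_TXN_LIMIT = 2 * 60
# Hypothetical stand-in for "a few hours"; the log gives no exact figure.
IDLE_IN_TXN_LIMIT = 3 * 60 * 60


@dataclass
class Connection:
    """Hypothetical snapshot of one database connection."""
    pid: int
    is_appserver: bool
    transaction_seconds: float          # time spent in the current transaction
    idle_in_transaction_seconds: float  # 0 if the connection is not idle


def should_reap(conn: Connection) -> bool:
    """Return True if the connection breaks either limit."""
    if conn.is_appserver and conn.transaction_seconds > APPSERVER_TXN_LIMIT:
        return True
    return conn.idle_in_transaction_seconds > IDLE_IN_TXN_LIMIT


def connections_to_reap(conns):
    """Return the pids of every connection that should be killed."""
    return [c.pid for c in conns if should_reap(c)]
```

Under these assumptions, an appserver connection 121 seconds into a transaction would be reaped, while a non-appserver connection idle in a transaction for ten minutes would not.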
matsubara[action] matsubara to look at lp-errors list to determine cause of OOPS-1152EA16215:31
MootBotACTION received:  matsubara to look at lp-errors list to determine cause of OOPS-1152EA16215:31
matsubararight. thanks for the explanation15:32
stub-1 second non-sql time, 0 seconds total time indicates a problem at the appserver? The request never got started?15:32
matsubaraI'll file a bug about that one and we can discuss there15:33
stubhmm... might be a reconnection bug - perhaps the previous request handled by that thread got killed?15:34
stubI don't know if we Retry on DisconnectionError exceptions, or if it is a good idea in all cases.15:34
matsubaraok15:35
matsubara[TOPIC] * DBA report (stub)15:35
MootBotNew Topic:  * DBA report (stub)15:35
matsubaraand thanks herb and stub15:35
stubNew hardware exists and is being brought online by IS. I've realized I might need to tweak the db maintenance scripts (upgrade.py, security.py etc.) to cope with a third replica - I think it only copes with a single master and slave at the moment.15:36
stubStaging can be moved by the LOSAs as soon as the hardware is available and they have time, which will move that load from the production systems.15:36
stubI assume the rollout went fine as far as the db upgrade procedure goes.15:36
herbI assume it did too. I didn't hear any complaints from my colleagues.15:37
matsubarastub, great news! with the new hardware we won't have the staging restore problems anymore?15:37
herbstub: what's the plan with the 3rd replica?15:37
stubThe staging restore problems should no longer be a problem.15:38
* herb feels like he missed something15:38
stubherb: We can start by pointing half the appservers at the new slave when it is online. We really should get a connection pool/load balancer thingy running though, like pgbouncer or pgpool 1 or 2.15:38
herbstub: gotcha15:39
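A rough sketch of the "half the appservers on the new slave" idea, assuming a static assignment done in configuration rather than a real pooler such as pgbouncer or pgpool. The function name and the appserver and slave names are hypothetical.

```python
def assign_slaves(appservers, slaves):
    """Spread appservers across slave databases round-robin.

    With two slaves, this points half the appservers at each one.
    """
    if not slaves:
        raise ValueError("at least one slave is required")
    return {app: slaves[i % len(slaves)] for i, app in enumerate(appservers)}


# Hypothetical names, for illustration only.
assignment = assign_slaves(
    ["app1", "app2", "app3", "app4"],
    ["old-slave", "new-slave"],
)
```

A static split like this does not react to load or to a slave going down, which is the gap a connection pool or load balancer would fill.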
stubherb: I realized just now though that upgrade.py won't apply patches to a third replica, which would be bad. So that needs to be fixed.15:39
herbyeah. that's important.15:40
stubOr actually, slonik may take care of all that. I need to confirm anyway.15:40
stubI forget and it is too late for my brain :)15:40
stuberm... late as in evening15:40
matsubaraall right. I guess that's all unless there are questions for stub15:42
matsubarathanks stub15:42
matsubaraThank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs.15:42
matsubara#endmeeting15:42
MootBotMeeting finished at 09:42.15:42
intellectronicathanks matsubara15:42
flacostehey15:42
flacostematsubara: question15:42
flacostedo we need a new roll-out?15:43
flacosteand i think it applies to everyone here15:43
flacosteanyone requires a new roll-out?15:43
matsubaraflacoste, I was on vacation and need to check that15:43
matsubarabut I think there's at least danilos' bug to re roll15:43
bacflacoste: i don't know of any issues for us15:43
danilosmatsubara, flacoste: yes15:43
stubI thought it was policy to let enough bugs through qa to require a rerollout?15:43
flacostewe're getting better at QA stub15:44
flacosteeven the code team weren't that late this cycle :-)15:44
matsubaraok, so we'll need a re-roll for translations. need to check for the other teams, but so far, there's nothing on the radar15:45
stubWe need a counter somewhere - 'Launchpad has been running for n days without need for a release critical patch'15:46
Ursinhastub, :)15:46
matsubaraI think that's all then. thanks everyone15:47
Ursinhathanks matsubara15:48
=== matsubara is now known as matsubara-lunch
=== matsubara-lunch is now known as matsubara
=== salgado is now known as salgado-lunch
=== salgado-lunch is now known as salgado
=== thumper_laptop is now known as thumper
=== Ursinha is now known as Ursinha-fud
=== salgado is now known as salgado-afk
