=== salgado is now known as salgado-afk | ||
=== danilo_ is now known as danilos | ||
=== danilos is now known as danilo-afk | ||
=== danilo-afk is now known as danilos | ||
=== danilos is now known as danilo-afk | ||
=== Ursinha__ is now known as Ursinha | ||
=== Ursinha is now known as Guest41422 | ||
=== mrevell is now known as mrevell-dejeuner | ||
=== mrevell-dejeuner is now known as mrevell | ||
matsubara | #startmeeting | 15:00 |
---|---|---|
MootBot | Meeting started at 09:00. The chair is matsubara. | 15:00 |
MootBot | Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE] | 15:00 |
matsubara | Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues. | 15:00 |
matsubara | [TOPIC] Roll Call | 15:00 |
MootBot | New Topic: Roll Call | 15:00 |
matsubara | Not on the Launchpad Dev team? Welcome! Come "me" with the rest of us! | 15:00 |
henninge | me | 15:00 |
Ursinha | me | 15:00 |
matsubara | Ursinha, flacoste, bigjools, intellectronica, herb | 15:00 |
bigjools | me | 15:00 |
herb | me | 15:00 |
matsubara | bac, ping | 15:00 |
flacoste | me | 15:00 |
Ursinha | matsubara, already answered | 15:01 |
intellectronica | me | 15:01 |
matsubara | rockstar, hi | 15:01 |
rockstar | me | 15:01 |
rockstar | matsubara, hi | 15:01 |
bac | me | 15:01 |
matsubara | ok, stub can join later. everyone else is here. | 15:03 |
matsubara | [TOPIC] Agenda | 15:03 |
MootBot | New Topic: Agenda | 15:03 |
matsubara | * Actions from last meeting | 15:03 |
matsubara | * Oops report & Critical Bugs | 15:03 |
matsubara | * Operations report (mthaddon/herb/spm) | 15:03 |
matsubara | * DBA report (DBA contact) | 15:03 |
matsubara | [TOPIC] * Actions from last meeting | 15:03 |
MootBot | New Topic: * Actions from last meeting | 15:03 |
matsubara | * stub to investigate the fix to avoid staging restore problems | 15:03 |
matsubara | * matsubara to chase rockstar about a fix for OOPS-1138CEMAIL12 | 15:03 |
matsubara | * asked jml about this. It's bug 326056 and had importance raised. | 15:03 |
matsubara | * cprov and bigjools to investigate OOPS-1145EA14 | 15:03 |
matsubara | * Ursinha to file bugs: | 15:03 |
matsubara | * Bug 333072: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1143EB189 | 15:03 |
ubottu | Launchpad bug 326056 in launchpad-bazaar "OOPS on BadStateTransition when reviewing code by mail" [High,Triaged] https://launchpad.net/bugs/326056 | 15:03 |
matsubara | * Bug 333071: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1145EA14 | 15:03 |
ubottu | Launchpad bug 333072 in soyuz "AttributeError OOPS on Build:+index" [Undecided,Invalid] https://launchpad.net/bugs/333072 | 15:03 |
ubottu | Launchpad bug 333071 in soyuz "AssertionError OOPS on +copy-packages" [High,Triaged] https://launchpad.net/bugs/333071 | 15:03 |
bigjools | 333072 is invalid | 15:04 |
matsubara | bigjools, any news about 333071? | 15:04 |
bigjools | yes, it's not too serious, we've set it for 2.2.3 | 15:04 |
bigjools | it's a corner case in the copying | 15:04 |
bigjools | despite the doom-mongering error message | 15:05 |
matsubara | ok. thanks bigjools | 15:05 |
matsubara | [action] matsubara to chase stub about staging restore problems | 15:06 |
MootBot | ACTION received: matsubara to chase stub about staging restore problems | 15:06 |
matsubara | [TOPIC] * Oops report & Critical Bugs | 15:06 |
MootBot | New Topic: * Oops report & Critical Bugs | 15:06 |
* matsubara hands Ursinha the mic | 15:06 | |
* Ursinha looks | 15:06 | |
* rockstar runs | 15:06 | |
Ursinha | registry, foundations, code and bugs: oopses for you | 15:07 |
Ursinha | Registry:- | 15:07 |
Ursinha | https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153E919 | 15:07 |
Ursinha | https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153A1135 (or foundations, not sure) | 15:07 |
Ursinha | Foundations:- | 15:07 |
Ursinha | https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153D667 | 15:07 |
Ursinha | Code: | 15:07 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153E919 | 15:07 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153A1135 | 15:07 |
Ursinha | https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1152XMLP1 | 15:07 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153D667 | 15:07 |
Ursinha | Bugs: | 15:07 |
Ursinha | https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1152EA162 | 15:07 |
Ursinha | ~ | 15:07 |
Ursinha | rockstar, ha! | 15:07 |
Ursinha | rockstar, have you seen this one: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1152XMLP1? | 15:07 |
rockstar | Ursinha, looking at all of them now. | 15:08 |
Ursinha | rockstar, you can just look at code's one :) | 15:08 |
Ursinha | sinzui, hi | 15:08 |
Ursinha | sinzui, I'm not sure if https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153A1135 is foundations or registry | 15:09 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153A1135 | 15:09 |
matsubara | Ursinha, looks like registry | 15:09 |
intellectronica | Ursinha: strange. do you see lots of those? | 15:09 |
bac | Ursinha: yes, looks like registry | 15:09 |
Ursinha | intellectronica, no, actually not | 15:09 |
Ursinha | intellectronica, but never saw one of those before | 15:09 |
Ursinha | so better bring to attention | 15:10 |
matsubara | intellectronica, Ursinha that one looks like it was caused by the rollout | 15:10 |
sinzui | Ursinha I don't know the answer either. I will look into it and assign it. I suspect salgado-afk is working on | 15:10 |
Ursinha | matsubara, even in the time it happened? | 15:10 |
intellectronica | matsubara: i also thought so | 15:10 |
=== salgado-afk is now known as salgado | ||
intellectronica | but it is quite early | 15:10 |
Ursinha | intellectronica, I've discarded the rollout possibility because of its timestamp | 15:11 |
Ursinha | sinzui, thanks for that | 15:11 |
matsubara | yeah, too early to be caused by the rollout. | 15:11 |
Ursinha | intellectronica, can you take a look then, please? | 15:11 |
rockstar | Ursinha, I'll have to investigate our oops. It's the XML-RPC server, and it requires the sacrifice of a virgin goat. | 15:11 |
matsubara | check OSAs incident log to see if something happened during that time | 15:11 |
intellectronica | so, this isn't really a bugs oops, but i don't know whether it's rollout-related or not. fwiw it's more than three hours before rollout, so it's hard to see how it would be related | 15:11 |
Ursinha | rockstar, oh, I have a bunch here in my backyard if you need some | 15:11 |
rockstar | Ursinha, :) | 15:12 |
Ursinha | intellectronica, I'll do what matsubara suggested | 15:12 |
matsubara | [action] ursinha to check OSAs incident log to help identify cause of OOPS-1152EA162 | 15:13 |
MootBot | ACTION received: ursinha to check OSAs incident log to help identify cause of OOPS-1152EA162 | 15:13 |
Ursinha | thanks intellectronica and matsubara | 15:14 |
matsubara | [action] rockstar to investigate xmlrpc oops OOPS-1152XMLP1 | 15:14 |
MootBot | ACTION received: rockstar to investigate xmlrpc oops OOPS-1152XMLP1 | 15:14 |
Ursinha | flacoste, hi | 15:14 |
henninge | Translations is happy that POFile:+translate dropped from the timeout top ten now .. | 15:14 |
henninge | btw | 15:14 |
henninge | ;) | 15:15 |
Ursinha | henninge, indeed, congrats to translate team :) | 15:15 |
Ursinha | translations | 15:15 |
Ursinha | there he is :) | 15:15 |
henninge | Ursinha: thank you, I will pass it on. | 15:15 |
Ursinha | sinzui, about the other oops | 15:15 |
stub | Sorry - on a call and didn't realize the time | 15:16 |
sinzui | bac: can you look at it. | 15:16 |
bac | Ursinha: they seem to be related (acting for sinzui today) | 15:16 |
* sinzui is in another meeting | 15:16 | |
flacoste | hmm | 15:16 |
flacoste | i'd say registry | 15:16 |
bac | yes, i think registry for both | 15:17 |
flacoste | Ursinha are you talking about OOPS-1153A1135? | 15:17 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153A1135 | 15:17 |
Ursinha | bac, hi :) so, can you take a look in both oopses? do you need me to file bugs about them? | 15:17 |
* Ursinha looks | 15:17 | |
bac | Ursinha: yes i'll look at them both | 15:17 |
bac | i can open the bugs | 15:17 |
bac | unless you need the karma | 15:17 |
Ursinha | flacoste, no, https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153D667 | 15:17 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153D667 | 15:17 |
Ursinha | bac, haha, no | 15:17 |
flacoste | Ursinha: that's also a registry query | 15:18 |
Ursinha | [action] bac to file bugs and take care of https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153E919 and https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1153A1135 | 15:18 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153E919 | 15:18 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153A1135 | 15:18 |
matsubara | [action] bac to file bugs for OOPS-1153E919 and OOPS-1153A1135 | 15:18 |
MootBot | ACTION received: bac to file bugs for OOPS-1153E919 and OOPS-1153A1135 | 15:18 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153E919 | 15:18 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153A1135 | 15:18 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153E919 | 15:18 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153A1135 | 15:18 |
bac | wow, y'all are insistent today! :) | 15:19 |
Ursinha | :) | 15:19 |
Ursinha | flacoste, hm. | 15:19 |
Ursinha | thanks | 15:19 |
Ursinha | bac, can you take a look at that too? | 15:19 |
bac | which? | 15:20 |
Ursinha | promise not to paste the oops again | 15:20 |
=== danilo-afk is now known as danilos | ||
Ursinha | bac, https://devpad.canonical.com/~jamesh/oops.cgi/1153D667 | 15:20 |
Ursinha | I tried :) | 15:20 |
* bac looks | 15:20 | |
bac | yes | 15:20 |
Ursinha | bac, thanks | 15:21 |
Ursinha | that's all from me from the oops land | 15:21 |
matsubara | [action] bac to also file a bug and take care of OOPS-1153D667 | 15:21 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153D667 | 15:22 |
MootBot | ACTION received: bac to also file a bug and take care of OOPS-1153D667 | 15:22 |
ubottu | https://devpad.canonical.com/~jamesh/oops.cgi/1153D667 | 15:22 |
matsubara | ok, thanks everyone. | 15:22 |
matsubara | [TOPIC] * Operations report (mthaddon/herb/spm) | 15:22 |
MootBot | New Topic: * Operations report (mthaddon/herb/spm) | 15:22 |
Ursinha | there's one critical bug, though | 15:22 |
Ursinha | argh | 15:22 |
Ursinha | bad bad timing | 15:22 |
herb | shall I wait for the critical bug? | 15:23 |
Ursinha | herb, just a second, let me check with henninge | 15:23 |
matsubara | danilo is handling the critical bug, so won't duplicate what's in the bug report. | 15:23 |
matsubara | it's bug 334787 | 15:23 |
Ursinha | matsubara, okay, if you say so | 15:23 |
ubottu | Launchpad bug 334787 in rosetta "Ubuntu packagers are not translation editors (assertion error)" [Critical,In progress] https://launchpad.net/bugs/334787 | 15:23 |
matsubara | let's move on | 15:23 |
Ursinha | go ahead herb, thanks | 15:24 |
herb | 2009-02-20 - We had an issue that may have caused some users to experience intermittent outages on Launchpad. I worked with joey and flacosted to find the issue. joey's notes were sent to the list. I would be interested in hearing any updates we might have on this issue. | 15:24 |
herb | 2009-02-21 and 2009-02-22 - It appears we had a bit of buggy code land on edge that caused a performance problem on both edge and production. The revision was backed out and I believe the code has been fixed. | 15:24 |
herb | 2009-02-26 - We rolled out 2.2.2 based on r7763 | 15:24 |
herb | We continue to see problems relating to bug #156453 and bug #118625. So much so that we're going to start bouncing codebrowse regularly to hopefully head off any issues. I want to emphasize that this will be masking the problem and we really do need to find the root cause and fix it. | 15:24 |
ubottu | Launchpad bug 156453 in loggerhead "production loggerhead branch leaks memory" [Critical,Triaged] https://launchpad.net/bugs/156453 | 15:24 |
ubottu | Launchpad bug 118625 in launchpad-bazaar "codebrowse sometimes hangs" [High,Triaged] https://launchpad.net/bugs/118625 | 15:24 |
herb | Bug #260171 continues to creep up regularly (every few days). This is already marked as high and I know that mwhudson's plate is full with codebrowse issues, but can we get an update on this one? | 15:24 |
ubottu | Bug 260171 on http://launchpad.net/bugs/260171 is private | 15:24 |
* herb somehow managed to change flacoste into a verb. | 15:24 | |
danilos | matsubara, Ursinha: I am running tests on the critical bug fix, will let you know once it has landed | 15:24 |
flacoste | i saw! | 15:25 |
bac | i've been flacosted! | 15:25 |
matsubara | danilos, thanks | 15:25 |
Ursinha | thanks danilos | 15:25 |
matsubara | rockstar, can you bring up the codebrowse issue to the code team? | 15:25 |
rockstar | matsubara, everyday. :) | 15:25 |
matsubara | rockstar, thanks :-) | 15:25 |
rockstar | Codebrowse is being ACTIVELY worked on. It'd be nice if we knew what the issue is. Right now, we're just fixing things and hoping that was the problem. | 15:26 |
herb | rockstar: let the losas know if there is anything we can do to help. | 15:26 |
rockstar | herb, we certainly will. | 15:26 |
stub | Should we be bringing in any outside help to instrument, test and diagnose the issue? | 15:27 |
matsubara | herb, anything happened to the DB during the time of this OOPS-1152EA162? | 15:27 |
matsubara | or maybe stub might know ^ | 15:28 |
herb | matsubara: nothing in the incident log. | 15:28 |
stub | matsubara: That is one of the connection reaper scripts kicking in | 15:28 |
herb | matsubara: I think that's also on the void between LOSAs. | 15:29 |
herb | ah, there we go. | 15:29 |
stub | We kill connections idle in a transaction more than a few hours (and should be more aggressive), and appserver connections that have been in a transaction for more than 2 minutes. | 15:29 |
Ursinha | stub, I see | 15:30 |
matsubara | stub, ok. so if we start seeing too many of those, we have a problem somewhere and a few is kinda normal? | 15:30 |
stub | The notification gets sent to the error-reports list (where we can confirm that this is indeed what happened) | 15:30 |
matsubara | stub, aha. that's better. I'll chase the lp-errors for that one | 15:31 |
matsubara | s/lp-errors/lp-errors list/ | 15:31 |
stub | If we see many of them, we have a problem. One is probably a problem - appserver requests taking two minutes on the db means we need to investigate why the normal timeout mechanisms didn't work. | 15:31 |
matsubara | [action] matsubara to look at lp-errors list to determine cause of OOPS-1152EA162 | 15:31 |
MootBot | ACTION received: matsubara to look at lp-errors list to determine cause of OOPS-1152EA162 | 15:31 |
matsubara | right. thanks for the explanation | 15:32 |
stub | -1 second non-sql time, 0 seconds total time indicates a problem at the appserver? The request never got started? | 15:32 |
matsubara | I'll file a bug about that one and we can discuss there | 15:33 |
stub | hmm... might be a reconnection bug - perhaps the previous request handled by that thread got killed? | 15:34 |
stub | I don't know if we Retry on DisconnectionError exceptions, or if it is a good idea in all cases. | 15:34 |
matsubara | ok | 15:35 |
matsubara | [TOPIC] * DBA report (stub) | 15:35 |
MootBot | New Topic: * DBA report (stub) | 15:35 |
matsubara | and thanks herb and stub | 15:35 |
stub | New hardware exists and is being brought online by IS. I've realized I might need to tweak the db maintenance scripts (upgrade.py, security.py etc.) to cope with a third replica - I think it only copes with a single master and slave at the moment. | 15:36 |
stub | Staging can be moved by the LOSAs as soon as the hardware is available and they have time, which will move that load from the production systems. | 15:36 |
stub | I assume the rollout went fine as far as the db upgrade procedure goes. | 15:36 |
herb | I assume it did too. I didn't hear any complaints from my colleagues. | 15:37 |
matsubara | stub, great news! with the new hardware we won't have the staging restore problems anymore? | 15:37 |
herb | stub: what's the plan with the 3rd replica? | 15:37 |
stub | The staging restore problems should no longer be a problem. | 15:38 |
* herb feels like he missed something | 15:38 | |
stub | herb: We can start by pointing half the appservers at the new slave when it is online. We really should get a connection pool/load balancer thingy running though, like pgbouncer or pgpool 1 or 2. | 15:38 |
herb | stub: gotcha | 15:39 |
stub | herb: I realized just now though that upgrade.py won't apply patches to a third replica, which would be bad. So that needs to be fixed. | 15:39 |
herb | yeah. that's important. | 15:40 |
stub | Or actually, slonik may take care of all that. I need to confirm anyway. | 15:40 |
stub | I forget and it is too late for my brain :) | 15:40 |
stub | erm... late as in evening | 15:40 |
matsubara | all right. I guess that's all unless there are questions for stub | 15:42 |
matsubara | thanks stub | 15:42 |
matsubara | Thank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs. | 15:42 |
matsubara | #endmeeting | 15:42 |
MootBot | Meeting finished at 09:42. | 15:42 |
intellectronica | thanks matsubara | 15:42 |
flacoste | hey | 15:42 |
flacoste | matsubara: question | 15:42 |
flacoste | do we need a new roll-out? | 15:43 |
flacoste | and i think it applies to everyone here | 15:43 |
flacoste | anyone requires a new roll-out? | 15:43 |
matsubara | flacoste, I was on vacation and need to check that | 15:43 |
matsubara | but I think there's at least danilos' bug to re roll | 15:43 |
bac | flacoste: i don't know of any issues for us | 15:43 |
danilos | matsubara, flacoste: yes | 15:43 |
stub | I thought it was policy to let enough bugs through qa to require a rerollout? | 15:43 |
flacoste | we're getting better at QA stub | 15:44 |
flacoste | even the code team weren't that late this cycle :-) | 15:44 |
matsubara | ok, so we'll need a re-roll for translations. need to check for the other teams, but so far, there's nothing on the radar | 15:45 |
stub | We need a counter somewhere - 'Launchpad has been running for n days without need for a release critical patch' | 15:46 |
Ursinha | stub, :) | 15:46 |
matsubara | I think that's all then. thanks everyone | 15:47 |
Ursinha | thanks matsubara | 15:48 |
=== matsubara is now known as matsubara-lunch | ||
=== matsubara-lunch is now known as matsubara | ||
=== salgado is now known as salgado-lunch | ||
=== salgado-lunch is now known as salgado | ||
=== thumper_laptop is now known as thumper | ||
=== Ursinha is now known as Ursinha-fud | ||
=== salgado is now known as salgado-afk |
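
Editor's note on the connection reaper that stub describes at 15:29: the policy he outlines (kill connections idle in a transaction for more than a few hours, and appserver connections in a transaction for more than 2 minutes) can be sketched roughly as below. This is only an illustration, not Launchpad's actual script. The `Connection` class, the `should_reap` function, and all field names are hypothetical. The 3-hour idle limit is an assumed placeholder for "a few hours". Transaction age is used as a stand-in for idle time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# From the log: appserver connections in a transaction for more than 2 minutes.
APPSERVER_TXN_LIMIT = timedelta(minutes=2)
# The log only says "a few hours"; 3 hours is an assumed placeholder value.
IDLE_IN_TXN_LIMIT = timedelta(hours=3)


@dataclass
class Connection:
    """Hypothetical snapshot of one database connection."""
    is_appserver: bool
    idle_in_transaction: bool
    transaction_start: Optional[datetime]  # None when no transaction is open


def should_reap(conn, now,
                idle_limit=IDLE_IN_TXN_LIMIT,
                appserver_limit=APPSERVER_TXN_LIMIT):
    """Return True if the connection exceeds either limit from the log."""
    if conn.transaction_start is None:
        return False  # no open transaction, nothing to reap
    age = now - conn.transaction_start
    # Appserver connections get the short limit, idle or not.
    if conn.is_appserver and age > appserver_limit:
        return True
    # Any connection sitting idle in a transaction gets the long limit.
    # Simplification: transaction age approximates time spent idle.
    return conn.idle_in_transaction and age > idle_limit
```

As stub notes at 15:31, even a single appserver connection hitting the 2-minute limit is worth investigating, because the normal timeout mechanisms should have stopped the request well before the reaper did.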