/srv/irclogs.ubuntu.com/2008/09/25/#launchpad-meeting.txt

mrevell#t/nick mrevell-lunch12:23
mrevelldamn12:23
=== mrevell is now known as mrevell-lunch
=== salgado-afk is now known as salgado
=== mrevell-lunch is now known as mrevell
=== bac_afk is now known as bac
=== salgado is now known as salgado-brb
=== salgado-brb is now known as salgado
matsubara#startmeeting16:00
MootBotMeeting started at 10:00. The chair is matsubara.16:00
MootBotCommands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]16:00
matsubaraWelcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.16:00
matsubara[TOPIC] Roll Call16:00
MootBotNew Topic:  Roll Call16:00
sinzuime16:00
bigjoolsme16:00
matsubaraso, who is here today?16:00
matsubarame16:00
matsubaraflacoste, Ursinha, Rinchen,?16:01
flacosteme16:01
Ursinhame16:01
bigjoolscprov, ping16:01
cprovpong16:01
cprovme16:01
herbme16:01
matsubarawe're missing, code16:02
matsubararockstar: ?16:02
intellectronicame16:02
flacostei'm Foundations and the DBA report16:02
matsubaraUrsinha: who's the qa contact for rosetta?16:02
matsubaradanilo[home]__: ?16:02
danilosme16:02
Ursinhadanilo[home]__, ping16:02
matsubaraah, hiding there16:03
matsubaraok, so we're missing code16:03
danilosmatsubara: right, danilo[home]__ is, surprisingly, my account at home :)16:03
matsubaralet's move on then16:03
matsubara[TOPIC] Agenda16:04
MootBotNew Topic:  Agenda16:04
matsubara * Next meeting16:04
matsubara * Actions from last meeting16:04
matsubara * Oops report & Critical Bugs16:04
matsubara * Operations report (mthaddon/herb/spm)16:04
matsubara * DBA report (DBA contact)16:04
matsubara * Sysadmin requests (Rinchen)16:04
matsubara[TOPIC] Next meeting16:04
MootBotNew Topic:  Next meeting16:04
matsubaranext meeting, same time next week? ok for everyone?16:04
flacosteyep16:04
Rinchenme16:04
rockstarmatsubara, here16:04
flacostestub should be able to make it16:04
matsubaraRinchen, rockstar: noted. thanks16:05
flacostehe had a previous engagement tonight16:05
Ursinhaflacoste, nice16:05
matsubaragreat16:05
rockstarI forgot that we were doing this earlier.16:05
matsubara[TOPIC] * Actions from last meeting16:05
MootBotNew Topic:  * Actions from last meeting16:05
matsubara * stub to patch our fti regexp to avoid OOPSes (bug 174368) and discuss a proper fix with jtv16:05
ubottuLaunchpad bug 174368 in launchpad-foundations "Search query triggering error in tsearch" [Undecided,Confirmed] https://launchpad.net/bugs/17436816:05
flacostematsubara: that's not started16:05
Ursinhaa hanging bug16:06
flacostehonestly, i don't think it's high priority16:06
matsubarahmm, shall I keep bringing it up during this meeting?16:06
flacosteit mostly triggers an OOPS when somebody tries spamming one of our search fields16:06
flacostewell deserved imho16:06
flacoste:-)16:06
Ursinha:)16:07
matsubara:-)16:07
Ursinhamatsubara, well, guess not so16:07
flacostebut you might have a different opinion?16:07
matsubaraI'd like to have it targeted to a milestone at least16:07
Ursinhayes16:07
matsubaraso we won't keep pushing oops bugs to the end of the queue16:07
bigjoolswell, if it's acceptable it should not be an OOPS should it?16:07
flacosteit's not acceptable16:08
flacostethere are some rare but legitimate queries that are affected by it16:08
matsubarait's important that we fix OOPS bugs, even if they affect a few users16:08
danilostargeting to a milestone doesn't mean much if there is no real dedication to finish it in a certain timeframe16:08
flacosteyepp16:08
Ursinhadanilos, indeed16:09
danilosfor some bugs, like this one, it's not known how long they might take to solve (it might be a bunch of different things)16:09
flacostei'll keep it on our radar16:09
matsubarawell, that's the point of targeting it, isn't it? you set a deadline to fix it16:09
flacosteand try to get to it at the end of the cycle16:09
matsubaraall right. thanks flacoste16:09
matsubaraI'll take it off the actions from last meeting16:09
matsubara[TOPIC] * Oops report & Critical Bugs16:10
MootBotNew Topic:  * Oops report & Critical Bugs16:10
matsubaraToday's oops report is about bugs 271561, 27336316:10
ubottuLaunchpad bug 271561 in launchpad-bazaar "OOPS calling __repr__ in xmlrpc method" [Undecided,New] https://launchpad.net/bugs/27156116:10
ubottuLaunchpad bug 273363 in launchpad-foundations "'LaunchpadDatabasePolicy' object has no attribute 'read_only' in xmlrpc server" [Undecided,New] https://launchpad.net/bugs/27336316:10
matsubararockstar, any news about #271561?16:10
matsubarathat's been happening at least once a day and I didn't see any progress in the bug report.16:11
rockstarmatsubara, it's being worked on, that's all I know about it.16:11
matsubaraflacoste, do you think bug 273363 might be related to bug 271902?16:11
ubottuLaunchpad bug 271902 in launchpad-foundations "db_policy equals None causing OOPS" [High,In progress] https://launchpad.net/bugs/27190216:11
sinzuiEdwin has a fix for the no attribute 'read_only' one, I believe16:11
flacostematsubara: stub has a fix in review for that one16:11
flacosteand yes, they are dupes16:11
flacostesinzui?16:12
matsubarasinzui and flacoste you might want to coordinate who will fix it then? :-)16:12
flacostewell, it's assigned to stuart16:12
matsubarabut I guess that's on stub's turf16:12
sinzui273363 is assigned to Edwin16:12
flacostelol16:12
matsubarathe other one is assigned to EdwinGrubb16:12
flacostewell, stuart has a branch fixing both in review16:12
sinzuiIt is caused by Edwin's fix to the cookie issue with feeds16:13
flacostei'll speak to Edwin16:13
matsubararockstar: can you assign it to the devel fixing that issue and change the status to in progress?16:13
flacostematsubara: can you dup it?16:13
matsubaraflacoste: sure16:13
rockstarmatsubara, sure.16:13
matsubarathanks guys.16:13
Ursinhaok16:13
matsubaraUrsinha: stage is yours16:13
Ursinhaone critical, bug 27348916:14
ubottuLaunchpad bug 273489 in rosetta "Remaining Intrepid template approvals" [Critical,In progress] https://launchpad.net/bugs/27348916:14
Ursinhadanilos, i've sent one email to jtv yesterday, to get more details on the problem16:14
danilosright16:14
Ursinhacan you help me with that after the meeting?16:14
danilosthis has basically been 'fix committed'16:14
daniloswe are now importing all the Intrepid templates16:14
Ursinharight16:14
matsubaraI also have one soyuz critical. yesterday cprov identified an oops that affected 60 or so PPAs16:14
danilosand this morning there were around 13K files left to import (yesterday afternoon around 18K)16:15
danilosso, this should be completely fixed in two days at most16:15
matsubaracprov: can we expect an IR for that one? I presume no bug was filed and things were already fixed, right?16:15
Ursinhadanilos, great, thanks16:15
cprovmatsubara: and it was fixed at that time as well, with a production update query.16:15
bigjoolsit was only edge that was affected16:15
Ursinhamatsubara, which bug is it?16:15
matsubaraso, no code change needed?16:15
cprovmatsubara: no16:15
danilosUrsinha: ping me after the meeting for more details, but right after the meeting, I am likely to get out soon16:16
matsubaraUrsinha: no bug reported for that one16:16
cprovmatsubara: as I said, we just had to rush the data migration.16:16
matsubaracprov: ok. so, Rinchen asked about doing an IR for that.16:16
Ursinhamatsubara, are we going to file one?16:17
RinchenDid we file a critical bug for that?16:17
Rinchenso, if we didn't file a bug....16:17
matsubarano, I can file one, but it's kinda pointless, isn't it? since the problem is fixed16:17
Rinchenmatsubara, only in the sense that we fixed the problem, not what caused the problem16:18
matsubaraand there'll be no code change16:18
bigjoolsit's not a bug, it's an edge rollout issue16:18
Rinchenso let's start please with a write up to the dev list about what happened and how to prevent it and not do an IR16:18
Rinchenis that acceptable to everyone?16:18
bigjoolscprov already did that last night16:18
cprovRinchen: yes, email already sent.16:18
Rinchenok, thanks. I've been searching for it but haven't found it yet. :-(16:19
RinchenI'll keep looking.16:19
RinchenThanks!16:19
Rinchenand thanks for resolving it quickly last night16:19
matsubarathanks16:19
matsubaramoving on16:20
Ursinhathanks guys16:20
matsubara[TOPIC] * Operations report (mthaddon/herb/spm)16:20
MootBotNew Topic:  * Operations report (mthaddon/herb/spm)16:20
herb* 2008-09-19 - Updated 2.1.9 to r7035. This update included planned downtime. The service was down for approximately an hour.16:20
herb* During the week we've had a few app servers die and leave core files. flacoste was investigating.16:20
herb* 2008-09-23 - Cherry pick r7058 and r7064 to the scripts server and bzrsyncd server respectively.16:20
herb* 2008-09-24 - Cherry pick r7066 and r7072 to lpnet*, update edge* to r7072.16:20
flacosteherb: so i investigated the core files16:20
herbflacoste: any update on the dying app servers?16:20
herbcool16:20
flacosteunfortunately, the stack trace is pretty useless16:21
herbboo16:21
flacosteseems like accessing corrupted memory16:21
flacosteone thing that barry and mwhudson said we could consider is running the appserver16:21
flacosteusing python2.4-dbg16:21
flacostewhich has some more debugging stuff in it16:21
flacostebut it requires that the packages we are using also have a -dbg build16:21
flacosteand are three times slower16:21
flacosteanother interesting thing16:22
herbthat's less than ideal.16:22
flacosteis the stack trace posted by jtv16:22
flacostethat occurred in one of his scripts16:22
flacosteit seems to point to a zope or storm problem16:22
flacostein the case of zope, the landscape team has a fix for it16:22
flacostewe should get it by moving to zope 3.4 (which we are going to attempt next week)16:23
flacostebut it wasn't clear from the discussion i overheard if it looked like the same problem they experienced16:23
flacosteanother hypothesis made it a problem with the Storm C extensions16:23
flacosteit's a good working hypothesis that the script and app server death is related to the same symptom16:24
flacostethe fact that we have a better stack trace in the script case is probably due to the fact that it runs single-threaded16:24
herbok. short of running with all -dbg packages is there anything we can do to help isolate the problem?16:24
flacosteso next step is to follow-up on the jtv, gustavo, barry discussion16:24
flacosteand see what conclusions were reached16:25
flacosteand if there is something we can try from there16:25
flacosteone last thing16:25
flacostei gave mthaddon the command to extract a stacktrace from a core file16:25
flacostethat's really the only thing we need or can do with it16:25
mthaddonhttps://launchpad.canonical.com/OSA/HowTo/BacktraceFromCoredump16:25
flacosteso you could just save the stack trace instead of the whole core files16:25
flacostemthaddon: awesome!16:25
herbflacoste: ok. cool.16:25
flacosteEOT unless there are questions16:26
herbthat's it from the LOSAs unless there are any questions.16:26
matsubaracool. thanks for the thorough explanation flacoste16:26
matsubaraand thanks herb, mthaddon!16:26
matsubara[TOPIC] * DBA report (DBA contact)16:26
MootBotNew Topic:  * DBA report (DBA contact)16:26
flacosteNothing unusual I'm aware of on the production database.16:26
flacosteReplication testing scheduled this week using demo.launchpad.net as16:26
flacosteper discussions with Francis. Turn off the monitoring if it was16:26
flacosteswitched back on.16:27
flacosteAssuming demo.launchpad.net testing doesn't push us back to the16:27
flacostedrawing board, I want to have a replication version of the  staging16:27
flacosterollout scripts ready for next cycle, which should involve some16:27
flacostetesting this cycle to ensure they actually work.16:27
flacosteSchedule is to have the production Launchpad database replicated as16:27
flacostepart of the 2.1.11 release. Staging running replicated for the whole16:27
flacostecycle should give enough experience for signoff from everyone. There16:27
flacosteshould be no unusual downtime requirements. I will need to be around16:27
flacostefor the rollout though.16:27
flacosteThe new DB baseline doesn't affect production or staging.16:27
flacostemthaddon, herb: stuart will send you over notes on the changes this means for the staging roll-out process16:27
flacostei can take questions16:27
mthaddonflacoste, great, thx16:28
herbflacoste: ok.  when should we expect notes?16:28
flacostenot before the end of next week i think16:28
flacosteit will take that much time to test on demo16:28
flacosteand then upgrade the staging scripts16:28
herbflacoste: at a high level how will this change the rollout process?16:28
matsubaraflacoste: can Ursinha and I help with the testing somehow?16:28
flacostethe restoration of the DB16:28
flacostematsubara: i don't think so, at this stage it's not about QA, but more about seeing performance16:29
flacostematsubara: i'll talk to stuart and forward the offer, i might be mistaken16:29
flacosteherb: we'll be having a replicated DB (so a master and a slave) on staging16:29
matsubaraflacoste: okie. tell him to ping/email us if he needs something16:30
flacosteso this affects how the DB sync is done16:30
herbflacoste: ok16:30
herbflacoste: thanks16:30
matsubaraall right. thanks flacoste16:30
matsubara[TOPIC] * Sysadmin requests (Rinchen)16:30
MootBotNew Topic:  * Sysadmin requests (Rinchen)16:30
RinchenHi!16:30
RinchenIs anyone blocked on an RT or have any that are becoming urgent?16:30
matsubaraI have one RT #31795, which I'd like to suggest priority ~8016:31
RinchenDoes anyone think this section of the meeting is worthwhile?16:31
rockstarRinchen, it might if me had an RT...16:31
Rinchen:-)16:31
rockstars/me/we16:31
danilosRinchen: I guess we are free to raise RT tickets with you and our team leads whenever we want anyway16:31
bigjoolsI think it is worthwhile16:31
Rinchenmatsubara, ok, will look into that16:31
intellectronicamaybe we should make it possible for people to add an RT-related item to the agenda before the meeting, if they have one, but skip the section if no-one does16:31
matsubaraRinchen: thank you16:31
bigjoolsif nothing else it acts as a memory jog16:31
danilosintellectronica: +116:32
matsubaraintellectronica: I like that16:32
Rinchendanilos, intellectronica - yeah, that was my point.  If others find it helpful though, I'm happy to continue it.16:32
Rinchenit's not like it's a hard thing to do :-D16:32
RinchenI'll let matsubara and Ursinha make the call.16:32
RinchenAny other tickets?16:32
Rinchenok, if you do have something, please ping me!16:33
Rinchenthanks matsubara16:33
matsubaraok, let's experiment with intellectronica's suggestion for a while then. I'll update the MeetingAgenda page to reflect that16:33
matsubarathank you Rinchen16:33
matsubaraanything else before I close?16:34
matsubaraall right16:34
matsubaraThank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs.16:34
matsubaraactually the log part is a lie16:35
flacostethx!16:35
matsubarabut thanks!16:35
matsubara#endmeeting16:35
MootBotMeeting finished at 10:35.16:35
danilosthanks all, matsubara especially for running the meeting :)16:35
matsubaranp16:36
intellectronicathanks, matsubara16:36
Ursinhathanks matsubara16:36
=== matsubara is now known as matsubara-lunch
=== salgado is now known as salgado-lunch
=== matsubara-lunch is now known as matsubara
=== salgado-lunch is now known as salgado
=== thumper_laptop is now known as thumper
=== salgado is now known as salgado-afk
