mrevell | #t/nick mrevell-lunch | 12:23 |
---|---|---|
mrevell | damn | 12:23 |
=== mrevell is now known as mrevell-lunch | ||
=== salgado-afk is now known as salgado | ||
=== mrevell-lunch is now known as mrevell | ||
=== bac_afk is now known as bac | ||
=== salgado is now known as salgado-brb | ||
=== salgado-brb is now known as salgado | ||
matsubara | #startmeeting | 16:00 |
MootBot | Meeting started at 10:00. The chair is matsubara. | 16:00 |
MootBot | Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE] | 16:00 |
matsubara | Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues. | 16:00 |
matsubara | [TOPIC] Roll Call | 16:00 |
MootBot | New Topic: Roll Call | 16:00 |
sinzui | me | 16:00 |
bigjools | me | 16:00 |
matsubara | so, who is here today? | 16:00 |
matsubara | me | 16:00 |
matsubara | flacoste, Ursinha, Rinchen,? | 16:01 |
flacoste | me | 16:01 |
Ursinha | me | 16:01 |
bigjools | cprov, ping | 16:01 |
cprov | pong | 16:01 |
cprov | me | 16:01 |
herb | me | 16:01 |
matsubara | we're missing code | 16:02 |
matsubara | rockstar: ? | 16:02 |
intellectronica | me | 16:02 |
flacoste | i'm Foundations and the DBA report | 16:02 |
matsubara | Ursinha: who's the qa contact for rosetta? | 16:02 |
matsubara | danilo[home]__: ? | 16:02 |
danilos | me | 16:02 |
Ursinha | danilo[home]__, ping | 16:02 |
matsubara | ah, hiding there | 16:03 |
matsubara | ok, so we're missing code | 16:03 |
danilos | matsubara: right, danilo[home]__ is, surprisingly, my account at home :) | 16:03 |
matsubara | let's move on then | 16:03 |
matsubara | [TOPIC] Agenda | 16:04 |
MootBot | New Topic: Agenda | 16:04 |
matsubara | * Next meeting | 16:04 |
matsubara | * Actions from last meeting | 16:04 |
matsubara | * Oops report & Critical Bugs | 16:04 |
matsubara | * Operations report (mthaddon/herb/spm) | 16:04 |
matsubara | * DBA report (DBA contact) | 16:04 |
matsubara | * Sysadmin requests (Rinchen) | 16:04 |
matsubara | [TOPIC] Next meeting | 16:04 |
MootBot | New Topic: Next meeting | 16:04 |
matsubara | next meeting, same time next week? ok for everyone? | 16:04 |
flacoste | yep | 16:04 |
Rinchen | me | 16:04 |
rockstar | matsubara, here | 16:04 |
flacoste | stub should be able to make it | 16:04 |
matsubara | Rinchen, rockstar: noted. thanks | 16:05 |
flacoste | he had a previous engagement tonight | 16:05 |
Ursinha | flacoste, nice | 16:05 |
matsubara | great | 16:05 |
rockstar | I forgot that we were doing this earlier. | 16:05 |
matsubara | [TOPIC] * Actions from last meeting | 16:05 |
MootBot | New Topic: * Actions from last meeting | 16:05 |
matsubara | * stub to patch our fti regexp to avoid OOPSes (bug 174368) and discuss a proper fix with jtv | 16:05 |
ubottu | Launchpad bug 174368 in launchpad-foundations "Search query triggering error in tsearch" [Undecided,Confirmed] https://launchpad.net/bugs/174368 | 16:05 |
flacoste | matsubara: that's not started | 16:05 |
Ursinha | a hanging bug | 16:06 |
flacoste | honestly, i don't think it's high priority | 16:06 |
matsubara | hmm, shall I keep bringing it up during this meeting? | 16:06 |
flacoste | it mostly triggers an OOPS when somebody tries spamming one of our search fields | 16:06 |
flacoste | well deserved imho | 16:06 |
flacoste | :-) | 16:06 |
Ursinha | :) | 16:07 |
matsubara | :-) | 16:07 |
Ursinha | matsubara, well, guess not so | 16:07 |
flacoste | but you might have a different opinion? | 16:07 |
matsubara | I'd like to have it targeted to a milestone at least | 16:07 |
Ursinha | yes | 16:07 |
matsubara | so we won't keep pushing oops bugs to the end of the queue | 16:07 |
bigjools | well, if it's acceptable it should not be an OOPS should it? | 16:07 |
flacoste | it's not acceptable | 16:08 |
flacoste | there are some rare but legitimate queries that are affected by it | 16:08 |
matsubara | it's important that we fix OOPS bugs, even if they affect a few users | 16:08 |
danilos | targeting to a milestone doesn't mean much if there is no real dedication to finish it in a certain timeframe | 16:08 |
flacoste | yepp | 16:08 |
Ursinha | danilos, indeed | 16:09 |
danilos | with some bugs, like this one, for example, it's not known how long they might take to solve (it might be a bunch of different things) | 16:09 |
flacoste | i'll keep it on our radar | 16:09 |
matsubara | well, that's the point of targeting it, isn't it? you set a deadline to fix it | 16:09 |
flacoste | and try to get to it at the end of the cycle | 16:09 |
matsubara | all right. thanks flacoste | 16:09 |
matsubara | I'll take it off the actions from last meeting | 16:09 |
matsubara | [TOPIC] * Oops report & Critical Bugs | 16:10 |
MootBot | New Topic: * Oops report & Critical Bugs | 16:10 |
matsubara | Today's oops report is about bugs 271561, 273363 | 16:10 |
ubottu | Launchpad bug 271561 in launchpad-bazaar "OOPS calling __repr__ in xmlrpc method" [Undecided,New] https://launchpad.net/bugs/271561 | 16:10 |
ubottu | Launchpad bug 273363 in launchpad-foundations "'LaunchpadDatabasePolicy' object has no attribute 'read_only' in xmlrpc server" [Undecided,New] https://launchpad.net/bugs/273363 | 16:10 |
matsubara | rockstar, any news about #271561? | 16:10 |
matsubara | that's been happening at least once a day and I didn't see any progress in the bug report. | 16:11 |
rockstar | matsubara, it's being worked on, that's all I know about it. | 16:11 |
matsubara | flacoste, do you think bug 273363 might be related to bug 271902? | 16:11 |
ubottu | Launchpad bug 271902 in launchpad-foundations "db_policy equals None causing OOPS" [High,In progress] https://launchpad.net/bugs/271902 | 16:11 |
sinzui | Edwin has a fix for the no attribute 'read_only' one, I believe | 16:11 |
flacoste | matsubara: stub has a fix in review for that one | 16:11 |
flacoste | and yes, they are dupped | 16:11 |
flacoste | sinzui? | 16:12 |
matsubara | sinzui and flacoste you might want to coordinate who will fix it then? :-) | 16:12 |
flacoste | well, it's assigned to stuart | 16:12 |
matsubara | but I guess that's on stub's turf | 16:12 |
sinzui | 273363 is assigned to Edwin | 16:12 |
flacoste | lol | 16:12 |
matsubara | the other one is assigned to EdwinGrubb | 16:12 |
flacoste | well, stuart has a branch fixing both in review | 16:12 |
sinzui | It is caused by Edwin's fix to the cookie issue with feeds | 16:13 |
flacoste | i'll speak to Edwin | 16:13 |
matsubara | rockstar: can you assign it to the devel fixing that issue and change the status to in progress? | 16:13 |
flacoste | matsubara: can you dup it? | 16:13 |
matsubara | flacoste: sure | 16:13 |
rockstar | matsubara, sure. | 16:13 |
matsubara | thanks guys. | 16:13 |
Ursinha | ok | 16:13 |
matsubara | Ursinha: stage is yours | 16:13 |
Ursinha | one critical, bug 273489 | 16:14 |
ubottu | Launchpad bug 273489 in rosetta "Remaining Intrepid template approvals" [Critical,In progress] https://launchpad.net/bugs/273489 | 16:14 |
Ursinha | danilos, i've sent one email to jtv yesterday, to get more details on the problem | 16:14 |
danilos | right | 16:14 |
Ursinha | can you help me with that after the meeting? | 16:14 |
danilos | this has basically been 'fix committed' | 16:14 |
danilos | we are now importing all the Intrepid templates | 16:14 |
Ursinha | right | 16:14 |
matsubara | I also have one soyuz critical as well. yesterday cprov identified an oops that affected 60 or so PPA's | 16:14 |
danilos | and this morning there were around 13K files left to import (yesterday afternoon around 18K) | 16:15 |
danilos | so, this should be completely fixed in two days at most | 16:15 |
matsubara | cprov: can we expect an IR for that one? I presume no bug was filed and things were already fixed, right? | 16:15 |
Ursinha | danilos, great, thanks | 16:15 |
cprov | matsubara: and it was fixed at that time as well, with a production update query. | 16:15 |
bigjools | it was only edge that was affected | 16:15 |
Ursinha | matsubara, which bug is it? | 16:15 |
matsubara | so, no code change needed? | 16:15 |
cprov | matsubara: no | 16:15 |
danilos | Ursinha: ping me after the meeting for more details, but right after the meeting, I am likely to get out soon | 16:16 |
matsubara | Ursinha: no bug reported for that one | 16:16 |
cprov | matsubara: as I said, we just had to rush the data migration. | 16:16 |
matsubara | cprov: ok. so, Rinchen asked about doing an IR for that. | 16:16 |
Ursinha | matsubara, are we going to file one? | 16:17 |
Rinchen | Did we file a critical bug for that? | 16:17 |
Rinchen | so, if we didn't file a bug.... | 16:17 |
matsubara | no, I can file one, but it's kinda pointless, isn't it? since the problem is fixed | 16:17 |
Rinchen | matsubara, only in the sense that we fixed the problem, not what caused the problem | 16:18 |
matsubara | and there'll be no code change | 16:18 |
bigjools | it's not a bug, it's an edge rollout issue | 16:18 |
Rinchen | so let's start please with a write up to the dev list about what happened and how to prevent it and not do an IR | 16:18 |
Rinchen | is that acceptable to everyone? | 16:18 |
bigjools | cprov already did that last night | 16:18 |
cprov | Rinchen: yes, email already sent. | 16:18 |
Rinchen | ok, thanks. I've been searching for it but haven't found it yet. :-( | 16:19 |
Rinchen | I'll keep looking. | 16:19 |
Rinchen | Thanks! | 16:19 |
Rinchen | and thanks for resolving it quickly last night | 16:19 |
matsubara | thanks | 16:19 |
matsubara | moving on | 16:20 |
Ursinha | thanks guys | 16:20 |
matsubara | [TOPIC] * Operations report (mthaddon/herb/spm) | 16:20 |
MootBot | New Topic: * Operations report (mthaddon/herb/spm) | 16:20 |
herb | * 2008-09-19 - Updated 2.1.9 to r7035. This update included planned downtime. The service was down for approximately an hour. | 16:20 |
herb | * During the week we've had a few app servers die and leave core files. flacoste was investigating. | 16:20 |
herb | * 2008-09-23 - Cherry pick r7058 and r7064 to the scripts server and bzrsyncd server respectively. | 16:20 |
herb | * 2008-09-24 - Cherry pick r7066 and r7072 to lpnet*, update edge* to r7072. | 16:20 |
flacoste | herb: so i investigated the core files | 16:20 |
herb | flacoste: any update on the dying app servers? | 16:20 |
herb | cool | 16:20 |
flacoste | unfortunately, the stack trace is pretty useless | 16:21 |
herb | boo | 16:21 |
flacoste | seems like accessing corrupted memory | 16:21 |
flacoste | one thing that barry and mwhudson said we could consider is running the appserver | 16:21 |
flacoste | using python2.4-dbg | 16:21 |
flacoste | which has some more debugging stuff in it | 16:21 |
flacoste | but it requires that the packages we are using also have a -dbg build | 16:21 |
flacoste | and are three times slower | 16:21 |
flacoste | another interesting thing | 16:22 |
herb | that's less than ideal. | 16:22 |
flacoste | is the stack trace posted by jtv | 16:22 |
flacoste | that occurred in one of his scripts | 16:22 |
flacoste | it seems to point to a zope or storm problem | 16:22 |
flacoste | in the case of zope, the landscape team has a fix for it | 16:22 |
flacoste | we should get it by moving to zope 3.4 (which we are going to attempt next week) | 16:23 |
flacoste | but it wasn't clear from the discussion i overheard if it looked like the same problem they experienced | 16:23 |
flacoste | another hypothesis made it a problem with the Storm C extensions | 16:23 |
flacoste | it's a good working hypothesis that the script and app server death is related to the same symptom | 16:24 |
flacoste | the fact that we have a better stack trace in the script case is probably due to the fact that it runs single-threaded | 16:24 |
herb | ok. short of running with all -dbg packages is there anything we can do to help isolate the problem? | 16:24 |
flacoste | so next step is to follow-up on the jtv, gustavo, barry discussion | 16:24 |
flacoste | and see what conclusions were reached there | 16:25 |
flacoste | and if there is something we can try from there | 16:25 |
flacoste | one last thing | 16:25 |
flacoste | i gave mthaddon the command to extract a stacktrace from a core file | 16:25 |
flacoste | that's really the only thing we need or can do with it | 16:25 |
mthaddon | https://launchpad.canonical.com/OSA/HowTo/BacktraceFromCoredump | 16:25 |
flacoste | so you could just save the stack trace instead of the whole core files | 16:25 |
flacoste | mthaddon: awesome! | 16:25 |
herb | flacoste: ok. cool. | 16:25 |
flacoste | EOT unless there are questions | 16:26 |
herb | that's it from the LOSAs unless there are any questions. | 16:26 |
matsubara | cool. thanks for the thorough explanation flacoste | 16:26 |
matsubara | and thanks herb, mthaddon! | 16:26 |
matsubara | [TOPIC] * DBA report (DBA contact) | 16:26 |
MootBot | New Topic: * DBA report (DBA contact) | 16:26 |
flacoste | Nothing unusual I'm aware of on the production database. | 16:26 |
flacoste | Replication testing scheduled this week using demo.launchpad.net as | 16:26 |
flacoste | per discussions with Francis. Turn off the monitoring if it was | 16:26 |
flacoste | switched back on. | 16:27 |
flacoste | Assuming demo.launchpad.net testing doesn't push us back to the | 16:27 |
flacoste | drawing board, I want to have a replication version of the staging | 16:27 |
flacoste | rollout scripts ready for next cycle, which should involve some | 16:27 |
flacoste | testing this cycle to ensure they actually work. | 16:27 |
flacoste | Schedule is to have the production Launchpad database replicated as | 16:27 |
flacoste | part of the 2.1.11 release. Staging running replicated for the whole | 16:27 |
flacoste | cycle should give enough experience for signoff from everyone. There | 16:27 |
flacoste | should be no unusual downtime requirements. I will need to be around | 16:27 |
flacoste | for the rollout though. | 16:27 |
flacoste | The new DB baseline doesn't affect production or staging. | 16:27 |
flacoste | mthaddon, herb: stuart will send you over notes on the changes this means for the staging roll-out process | 16:27 |
flacoste | i can take questions | 16:27 |
mthaddon | flacoste, great, thx | 16:28 |
herb | flacoste: ok. when should we expect notes? | 16:28 |
flacoste | not before the end of next week i think | 16:28 |
flacoste | it will take that much time to test on demo | 16:28 |
flacoste | and then upgrade the staging scripts | 16:28 |
herb | flacoste: at a high level how will this change the rollout process? | 16:28 |
matsubara | flacoste: can Ursinha and I help with the testing somehow? | 16:28 |
flacoste | the restoration of the DB | 16:28 |
flacoste | matsubara: i don't think so, at this stage it's not about QA, but more about seeing performance | 16:29 |
flacoste | matsubara: i'll talk to stuart and forward the offer, i might be mistaken | 16:29 |
flacoste | herb: we'll be having a replicated DB (so a master and a slave) on staging | 16:29 |
matsubara | flacoste: okie. tell him to ping/email us if he needs something | 16:30 |
flacoste | so this affects how the DB sync is done | 16:30 |
herb | flacoste: ok | 16:30 |
herb | flacoste: thanks | 16:30 |
matsubara | all right. thanks flacoste | 16:30 |
matsubara | [TOPIC] * Sysadmin requests (Rinchen) | 16:30 |
MootBot | New Topic: * Sysadmin requests (Rinchen) | 16:30 |
Rinchen | Hi! | 16:30 |
Rinchen | Is anyone blocked on an RT or have any that are becoming urgent? | 16:30 |
matsubara | I have one RT #31795, which I'd like to suggest priority ~80 | 16:31 |
Rinchen | Does anyone think this section of the meeting is worthwhile? | 16:31 |
rockstar | Rinchen, it might if me had an RT... | 16:31 |
Rinchen | :-) | 16:31 |
rockstar | s/me/we | 16:31 |
danilos | Rinchen: I guess we are free to raise RT tickets with you and our team leads whenever we want anyway | 16:31 |
bigjools | I think it is worthwhile | 16:31 |
Rinchen | matsubara, ok, will look into that | 16:31 |
intellectronica | maybe we should make it possible for people to add an RT-related item to the agenda before the meeting, if they have one, but skip the section if no-one does | 16:31 |
matsubara | Rinchen: thank you | 16:31 |
bigjools | if nothing else it acts as a memory jog | 16:31 |
danilos | intellectronica: +1 | 16:32 |
matsubara | intellectronica: I like that | 16:32 |
Rinchen | danilos, intellectronica - yeah, that was my point. If others find it helpful though, I'm happy to continue it. | 16:32 |
Rinchen | it's not like it's a hard thing to do :-D | 16:32 |
Rinchen | I'll let matsubara and Ursinha make the call. | 16:32 |
Rinchen | Any other tickets? | 16:32 |
Rinchen | ok, if you do have something, please ping me! | 16:33 |
Rinchen | thanks matsubara | 16:33 |
matsubara | ok, let's experiment with intellectronica's suggestion for a while then. I'll update MeetingAgenda page to reflect that | 16:33 |
matsubara | thank you Rinchen | 16:33 |
matsubara | anything else before I close? | 16:34 |
matsubara | all right | 16:34 |
matsubara | Thank you all for attending this week's Launchpad Production Meeting. See the channel topic for the location of the logs. | 16:34 |
matsubara | actually the log part is a lie | 16:35 |
flacoste | thx! | 16:35 |
matsubara | but thanks! | 16:35 |
matsubara | #endmeeting | 16:35 |
MootBot | Meeting finished at 10:35. | 16:35 |
danilos | thanks all, matsubara especially for running the meeting :) | 16:35 |
matsubara | np | 16:36 |
intellectronica | thanks, matsubara | 16:36 |
Ursinha | thanks matsubara | 16:36 |
=== matsubara is now known as matsubara-lunch | ||
=== salgado is now known as salgado-lunch | ||
=== matsubara-lunch is now known as matsubara | ||
=== salgado-lunch is now known as salgado | ||
=== thumper_laptop is now known as thumper | ||
=== salgado is now known as salgado-afk | ||
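
Note on the Operations report: flacoste suggested saving just the stack trace instead of the whole core files. The exact command he gave mthaddon is not in the log, so the following is only a rough sketch of one way that could look, assuming gdb is installed and run in batch mode. The function names and any paths passed in (interpreter, core file, output file) are hypothetical and not taken from the meeting or the wiki page.

```python
import subprocess
from pathlib import Path


def backtrace_command(executable, core_file):
    """Build a gdb command line that prints a backtrace for every thread.

    Assumption: gdb is on PATH. `executable` is the binary that produced
    the core file (for an app server, the Python interpreter it ran under).
    """
    return [
        "gdb",
        "--batch",                      # run the -ex commands, then exit
        "-ex", "thread apply all bt",   # backtrace of every thread
        str(executable),
        str(core_file),
    ]


def save_backtrace(executable, core_file, output_path):
    """Run gdb on the core file and keep only its text output."""
    result = subprocess.run(
        backtrace_command(executable, core_file),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,       # keep gdb warnings with the trace
        universal_newlines=True,
        check=False,                    # gdb may exit non-zero on odd cores
    )
    Path(output_path).write_text(result.stdout)
    return output_path
```

Something like save_backtrace("/usr/bin/python2.4", "core.1234", "core.1234.bt") would then leave a small text file behind, after which the core file itself could be deleted. Whether a plain gdb backtrace is useful without -dbg builds is exactly the limitation flacoste describes above.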