/srv/irclogs.ubuntu.com/2012/02/07/#launchpad-dev.txt

salgadois there a way to move a PPA from a team to another or do we need to create a new PPA and copy all the packages over?00:35
lifelessnew ppa + copy00:38
salgadohm, ok. thanks lifeless00:42
rick_hStevenK: what's up?02:36
StevenKrick_h: sidnei marked my MP as Needs Fixing :-(02:38
StevenKI didn't think combo_app() was actually tested.02:38
rick_hStevenK: yea, it is02:40
rick_hthe TestApp() wraps combo_app02:40
rick_hand loads it as a wsgi application that gets tested in the tests/test_combo.py02:40
rick_hStevenK: not sure what he wants test-wise, the current tests make sure combo_app functions, I suppose a test mounting _application in TestApp would work02:41
rick_hStevenK: and yea, I kind of agree with him that it'd probably be best to not _ it if we're going to import it02:41
StevenKRight02:42
StevenKI also think I didn't explain it the best in the MP, if he's asking "Why do you need this?"02:42
rick_hStevenK: yea, I didn't follow it either until you showed me how we were changing our code side02:42
rick_hthough I think that the reason of "combo_app should be able to be wrapped as a wsgi app" is enough reason02:43
StevenKRight, so pasting our WSGI wrapper might help02:43
rick_hStevenK: yea, at least in the MP so there's a record I guess. Technically it's the more 'correct' way anyway.02:43
StevenKrick_h: I'll look at sorting it out after lunch.02:43
rick_hStevenK: ok, thanks02:44
rick_hI didn't see his response so sorry I didn't catch him during the day with it02:44
rick_hjust ping'd to let him know we needed it and ask him to take a peek02:44
* StevenK tries to work out how to run convoy's test suite04:39
wgrantGrarrrr04:46
* wgrant mauls the branch scanner04:46
* wgrant chainsaws the branch scanner04:46
StevenKHaha04:47
StevenKWhyfor?04:47
wgrantIt is slow.04:48
wgrantIt holds locks.04:48
wgrantIt randomly hangs.04:48
wgrantIt's like Soyuz, except more unreliable and with a simpler task.04:48
StevenKwgrant: Can't bug 910492 be closed?04:48
_mup_Bug #910492: long urls break lazr restful object representation cache <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/910492 >04:48
wgrantStevenK: Done.04:49
wgrant        # Bug heat increases by a quarter of the maximum bug heat04:49
wgrant        # divided by the number of days since the bug's creation date.04:49
wgrantwut04:49
wgrantlifeless: Bug heat confuses me04:49
lifelessI expect there are plentiful lies by now in the code04:51
lifelessdue to age04:51
wgrantHa ha04:51
wgrantHm04:53
spmwgrant: while you're in a chainsawing slow services mindset, may I direct your energies towards.... checkwatches?04:54
spmor have I trolled too far?04:54
wgrantHey, it hasn't hung in weeks.04:54
spmis it running?04:55
wgrantUnfortunately.04:56
spm:-)04:56
StevenKwgrant: O hai. So given https://code.launchpad.net/~launchpad/+recipe/launchpad-convoy . If I push changes to the packaging branch does that mean it will build 0.2.1-0~19-oneiric1 again?05:40
wgrantStevenK: Yes05:42
wgrantStevenK: The packaging revno is not included in that version template.05:42
wgrantAlso, why do both thumper and StevenK ask me about recipes, when it was their project :(05:42
StevenKHaha05:42
StevenKwgrant: What would you recommend?05:42
StevenKwgrant: You know, the brain suppresses bad memories ...05:43
wgrantStevenK: Either commit to trunk or include the packaging revno in the template (possibly temporarily)05:47
StevenKwgrant: I don't want the packaging in trunk05:50
wgrantStevenK: I never suggested that :)05:52
StevenKI could pull out lp:convoy, but then I/someone else has to merge lp:convoy into the packaging branch and push it05:53
wgrantWhat about one of the options I gave?05:53
StevenKI'm just not sure how/where to inject the packaging revno05:55
StevenKWithout screwing up upgrades05:55
wgrantLet me find one of mine.06:00
wgrantStevenK: https://code.launchpad.net/~wgrant/+recipe/ivle-trunk06:01
StevenKWhy +dr?06:04
wgrantdebian revision06:04
wgrantIt's arbitrary.06:04
StevenKRight06:04
wgrantStevenK: https://code.launchpad.net/~wgrant/launchpad/stop-this-aging-nonsense/+merge/9176306:05
StevenKIt's over 9000!06:06
StevenK(IE, r=me)06:07
wgrantThanks.06:07
StevenKI thought bug heat was already done in the DB06:09
wgrantThe calculation is a PL/Python function, if that's what you mean.06:10
StevenKRight06:11
nigelblol. stop this aging nonsense.06:13
StevenKYes. wgrant will forever be 16.06:13
nigelbhahaha06:13
nigelbwgrant is the Australian vampire I suppose. :P06:14
nigelbStevenK: ^06:14
StevenKHaha06:14
cody-somervilleHe doesn't sparkle though.06:18
StevenKSpoken like a Twilight fan06:18
wgrantHeh06:19
=== almaisan-away is now known as al-maisan
wgrantstub: Oh, didn't know that was possible. Thanks.07:27
stubThe planner gets to inline SQL functions into SQL queries. Not sure if it will help in this case.07:29
wgrantUnlikely, since the surrounding Python is terrible.07:30
wgrantBut it's something.07:30
adeuringgood  morning08:50
bigjoolsjml: who do we ping to get your updated testtools in the archive? Do you know who packages it?10:30
jmlbigjools: it gets imported from Debian, lifeless is the maintainer there.10:30
bigjoolsjml: ok ta10:31
bigjoolsjml: there's an ubuntu-specific version in the archive at the moment10:31
jmloh really?10:31
bigjoolsjml: 0.9.11-1ubuntu110:31
* jml should really pay more attention to downstreams.10:31
bigjoolsI haven't looked at its local patch10:31
bigjoolsheh, debian has 0.9.11-110:32
=== al-maisan is now known as almaisan-away
jml"  * Build using dh_python2."10:44
jmlfrom doko10:44
jmlI guess it's not much of a patch :)10:44
jml(incidentally, well done LP for making that easy to find out: https://launchpad.net/ubuntu/+source/python-testtools)10:44
bigjools\m/10:46
adeuringgmb, wgrant: could you review this MP: https://code.launchpad.net/~adeuring/launchpad/bug-829074-ui/+merge/91796?11:03
gmbadeuring: Sure thing.11:04
=== gmb changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: gmb | Firefighting: - | Critical bugtasks: 4*10^2
adeuringgmb: thanks!11:05
gmbadeuring: Just finishing another branch, but I'll get to it presently.11:05
gmbadeuring: Looks good. r=me.11:48
adeuringgmb: thanks!11:48
gmbWelcome :)11:49
StevenKadeuring: I think your change in r14748 does require QA.11:52
adeuringStevenK: yes and no -- the point is that the new features can't be used yet. The branch just reviewed by Graham will make that easier11:53
rick_hmorning12:01
adeuringmorning rick_h12:01
wallyworld_bigjools: wtf, just read that kubuntu is being killed :-(12:16
jelmerwallyworld_: it no longer has a dedicated canonical engineer working on it (as was the case when riddell was on rotation to Bazaar)12:18
jelmerwallyworld_: that's not quite the same as it being killed12:18
wallyworld_jelmer: the net effect will be the same i fear12:19
jelmerthey seem to've done fine for 11.10 when jonathan was on rotation to bzr, and {x,edu,l}ubuntu seem to do fine with just infrastructure support too12:20
jelmernot saying it won't have a negative impact12:21
stubI think it would take much more than a single bullet to kill off kubuntu.12:21
wallyworld_yeah, maybe i'm being too pessimistic12:21
wallyworld_just a bit sad i guess12:21
bigjoolswallyworld_: not killed12:21
bigjoolswallyworld_: I think it's a good thing actually12:22
wallyworld_really?12:22
bigjoolsyeah, it means any criticism about it will need to be levelled at the community, not canonical12:23
wallyworld_true12:23
bigjoolsand the community will not be encumbered by anything12:23
rick_hStevenK: can the combo-url land? Or we waiting on RT?12:26
StevenKrick_h: We are waiting for the convoy MP.12:28
rick_hStevenK: ok cool.12:28
StevenKrick_h: If that lands, then I can update the convoy package, and land combo-url12:30
rick_hadeuring: do you know anyone that knows translations well?13:36
adeuringrick_h: jtv for example13:36
rick_hadeuring: I'm trying to find some way to mass download .pot files without any success in wiki/google13:36
rick_hadeuring: ok, thanks13:36
rick_hjtv: ping, got a sec for a translation question? someone is asking about mass downloading all ubuntu .pot files for spanish languages?13:57
rick_hjtv: I don't see any way to mass download from the wiki/webui. I see a lp-translations tools package that seems to do mass uploads though, but code doesn't seem to download?13:58
rick_hadeuring: so did maint. RT, questions, translations, and new projects13:58
adeuringrick_h: cool -- i sucked again :(13:59
rick_hbah, missed jtv13:59
deryckMorning, all.14:01
rick_hmorning deryck14:05
=== almaisan-away is now known as al-maisan
deryckadeuring, rick_h -- I'd like to do a G+ hangout for our standup today.14:10
rick_hderyck: sounds like a plan14:11
=== matsubara is now known as matsubara-lunch
adeuringderyck: gahh -- i still have no g+ account :(14:16
deryckadeuring, see my PM to you. :)14:16
deryckabentley, we're G+ hanging out today for standup.14:33
abentleyjcsackett, sinzui: A branch's unique name is a well-established term.  Unique names do not include the lp: prefix.15:17
sinzuimy apologies15:17
abentleysinzui: np, just let's keep the definition consistent.15:18
jelmerwhat's happening with bug heat?15:26
jelmerI thought it was going to be removed - is it going to stay around in some form (given it has just changed)?15:26
=== matsubara-lunch is now known as matsubara
jcsackettabentley, sinzui: so do we need to roll that back?15:30
abentleyjcsackett: No, but you should change the function name, or else change where you attach the prefix.15:31
jcsackettabentley: ok, i'll land a follow up to correct the name.15:31
abentleyjcsackett: thanks.15:31
abentleyjcsackett: You should also change the HTML so it's not called #branch-unique-name.15:34
* jcsackett nods15:34
jcsackettseems odd unique name was ever used, as it's all about presenting the location, which incorporates the unique name but isn't the same thing at all.15:34
abentleyjcsackett: I don't know where this is, but it's possible the authors thought it was about presenting the name, not the location.15:35
jcsackettabentley: fair. could be it was misappropriated for presenting the location later.15:36
abentleyjcsackett: Nope, looks like it was always presenting the name and calling it the location.15:38
jcsackettright on. well, i'll be making it consistent shortly.15:38
abentleyjcsackett: I guess you could stick the lp: in the template.15:39
jcsackettpossibly, but it would have to exist outside of the node, since that gets set by the js.15:39
jcsackettseems a might bit hackish.15:39
abentleyjcsackett: Yes, it's treating HTML as a template language.15:42
abentleyjelmer: https://blog.launchpad.net/general/bugheatchange16:18
jelmerabentley: ah, thanks - missed that for some reason16:20
sinzuidanhg, talky talky time?16:26
danhgHey sinzui, I'm in the middle of MaaS tests, I should be free by 18:00 GMT?16:47
sinzuiokay16:47
deryckadeuring, hey, any luck on those interrupt duties? (a friendly ping from one slacker to another. ;)17:29
adeuringderyck: sigh... got distracted again by working on a branch...17:30
lifelessderyck: hey17:52
deryckhi lifeless17:59
lifelesshow is your schedule to-day?18:00
derycklifeless, unfortunately, my wife ninja-scheduled me for the dentist.  and I need to leave a little early today.18:01
deryckshe knows I'm a baby and need her to hold my hand.18:01
lifelessderyck: I take it that that is in a few minutes time? If it was say 40-50 minutes away we could do a quick call...18:04
derycklifeless, no, it's actually couple hours away still.  I'm just heads down trying to finish these hanging actions from TL call.18:05
derycklifeless, I'd really love to chat, but I don't want to be yet another day working on these "investigations" either. :)18:05
derycklifeless, how about tomorrow post TL call?18:06
lifelessIIRC I was going to give you a hand with one of them18:06
derycklifeless, you gave me enough of a hand, I think.  I filed a bug this morning.18:06
lifelessuhm, I have no idea, let me check (the new time has thrown out my memorised schedule)18:06
derycklet me see bug number....18:06
derycklifeless, bug 92832718:06
_mup_Bug #928327: codebrowse hangs due to exception/oops handling <loggerhead:Triaged> < https://launchpad.net/bugs/928327 >18:06
derycklifeless, my guess/diagnosis could easily be wrong ^^ so I appreciate you looking at the bug.18:07
lifelessgary_poster: we have a parallel testing biweekly thing conflicting with the TL new time18:07
lifelessderyck: I have a slot *before* the tl meeting; after has my 1:1 with statik18:07
derycklifeless, that works for me better actually.  forgot about the TL call time shift.18:08
lifelessderyck: why do you think oops is implicated ?18:09
deryckthe hang seems to be in oops_middleware18:10
lifelessI don't follow - oops_middleware is in the call stack yes, but its a WSGI middleware, so it will always be so.18:10
lifelessthread 11 in https://pastebin.canonical.com/59603/ is in the middle of a global GC run18:11
lifelessbut the other one has no GC in it, so either different cases, or not GC.18:12
derycklifeless, so I saw the threads that seemed stuck in sock_sendall had stuff happening in httpexceptions and oops_middleware....18:12
derycklifeless, so I just assumed something was hanging in dealing with an oops.18:12
lifelessso this, for instance:18:13
lifeless#6 0x00000000004fa67b in sock_sendall (s=0xa8e4ba0, args=<value optimized out>) from ../Modules/socketmodule.c18:13
lifeless#7 0x00000000004a7c5e in call_function () from ../Python/ceval.c18:13
lifeless/usr/lib/python2.6/socket.py (282): flush18:13
lifeless/usr/lib/python2.6/socket.py (292): write18:13
lifeless/srv/codebrowse.launchpad.net/production/launchpad2-rev-14640/eggs/Paste-1.7.2-py2.6.egg/paste/httpserver.py (123): wsgi_write_chunk18:13
lifeless/srv/codebrowse.launchpad.net/production/launchpad2-rev-14640/eggs/oops_wsgi-0.0.8-py2.6.egg/oops_wsgi/middleware.py (131): oops_write18:13
lifeless?18:14
lifelessderyck: ^18:15
derycklifeless, indeed.  that's what I meant.18:16
lifelessderyck: ok, so the way wsgi works means that every layer that offers facilities will /tend/ to have its own 'write' callable that is passed down.18:17
gary_posterlifeless, yeah I noticed18:17
gary_posterlifeless, I think TL wins ;-)18:17
lifelessoops_write is the callable passed from the oops middleware to the next deeper wsgi thing18:17
lifelessand wsgi_write_chunk is the callable that was returned by the paste http server18:18
gary_posterflacoste, lifeless, should we move parallel testing to 4PM Eastern, 21 UTC?18:18
lifelesshttp_exceptions etc18:18
gary_posterWed still?18:18
lifelessgary_poster: is that 1 hour later or something?18:18
derycklifeless, ah, ok.  Didn't realize that.18:18
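The write-callable chain lifeless describes can be sketched in a few lines of Python. This is only an illustration of the WSGI pattern, not the real oops_wsgi or Paste code: make_logging_middleware, demo_app and fake_server_call are hypothetical names invented for the example. It shows why a middleware's write wrapper sits in the call stack of every write, whether or not anything has gone wrong.

```python
# Hypothetical sketch: these names are illustrative, not the real
# oops_wsgi or Paste implementations.

def make_logging_middleware(app, log):
    """Wrap a WSGI app so every write() call passes through our callable."""
    def middleware(environ, start_response):
        def wrapped_start_response(status, headers, exc_info=None):
            # The server (or the next layer out) hands back its write callable.
            outer_write = start_response(status, headers, exc_info)

            def middleware_write(chunk):
                # This frame appears in the backtrace of every write,
                # even when the middleware has nothing to report.
                log.append(len(chunk))
                outer_write(chunk)

            # The inner app only ever sees our wrapper.
            return middleware_write
        return app(environ, wrapped_start_response)
    return middleware


def demo_app(environ, start_response):
    """Inner app that uses the legacy write() callable."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"hello ")
    write(b"world")
    return []


def fake_server_call(app):
    """Stand-in for an HTTP server: collects what would go to the socket."""
    sent = []

    def start_response(status, headers, exc_info=None):
        # list.append plays the role of the server's write callable.
        return sent.append

    body = app({}, start_response)
    sent.extend(body)
    return b"".join(sent)
```

Calling fake_server_call(make_logging_middleware(demo_app, [])) routes both chunks through middleware_write before they reach the stand-in server, which mirrors oops_write sitting above wsgi_write_chunk in the paste.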
gary_posterlifeless, right18:18
lifelessgary_poster: I have a call with statik then18:18
gary_posterlifeless, doesn't parallel testing take precedence? ;-)18:19
gary_posterlifeless, ok.  I'll look at schedules and make another proposal later18:19
lifelessgary_poster: I'm about a month out what with budapest, sickness, the QBR.18:19
lifelessgary_poster: if I hadn't missed that many 1:1's I'd say sure...18:19
gary_posterlifeless, sure. np18:19
lifelessbut IME if you don't pin statik down with a nailgun .. :)18:19
derycklifeless, so it seems those pastes are pretty useless then, if I understand that right.  except for knowing we're stuck in socket send.  or am I missing something?18:19
lifelessderyck: well, we don't know we're stuck in socket send18:20
deryckok18:20
lifelessderyck: there are lots of threads, and some of them were writing content when the core was taken18:20
lifelesswe don't know how long they had been there18:20
lifelessderyck: it *may* be that that is a smoking gun indicating e.g. network issues talking to haproxy or something18:21
lifelessor it may be totally irrelevant18:21
deryckah, gotcha.18:21
lifelessderyck: lets go through in some detail tomorrow, for now I've gardened the bug to have just the definitive data18:21
lifelessgary_poster: +118:21
gary_postercool18:22
derycklifeless, ok18:22
lifelessderyck: note that sock_sendall is in a python C module (socketmodule.c), so it may well get involved in or mangled by GIL issues, bad locking etc18:22
lifelessderyck: we may end up spelunking into C18:22
lifelessderyck: that said, we're missing line numbers18:23
derycksounds fun :)18:23
lifelessderyck: what command did you use to get the traces ?18:23
lifelessderyck: and did you get missing symbol errors when you fired up gdb in the chroot ?18:23
derycklifeless, used pygdb.  and no, I don't think so.  I can look again now.18:23
lifelessderyck: if you could, with regular gdb, uhm, 'thread apply all bt' and see if you get line numbers for the C frames18:24
lifelessif you don't, then we haven't got the debug environment right18:24
derycklifeless, ah, yes, that is better.  line numbers indeed.18:25
derycklifeless, could have sworn I did this and didn't get anything, and then tried pystack macros which hung.18:25
derycklifeless, but maybe the regular bt attempt was when I was running locally still, and not in right env.18:26
lifelessderyck: not to worry - you have line numbers now ;) - could you refresh the paste links in the bug ?18:27
abentleygmb: are you still ocr?18:31
derycklifeless, done.18:35
=== al-maisan is now known as almaisan-away
lifelessderyck: in bug 928327 ? I still see the old numbers.18:36
_mup_Bug #928327: codebrowse hangs in production <loggerhead:Triaged> < https://launchpad.net/bugs/928327 >18:36
lifelessderyck: ahha18:37
lifelessderyck: mmm, cluster lag18:37
lifelessderyck: the trick for the source is apt-get source python2.6 in the chroot18:39
lifelessderyck: so we can see that in https://pastebin.canonical.com/59625/18:40
lifelessthread 3 is in a libc call -  n = send(s->sock_fd, buf, len, flags);18:41
lifelessthread 7 is in the same libc call - send()18:42
lifelessthe content being written looks like fairly inane annotated pages18:42
lifelessmysql-5.1-wl820 in thread318:43
lifelesssame branch in thread 718:43
lifelesstotally different thing in thread 8 - ~dcplusplus-team/dcplusplus/dcpp-plugins/revision/3/win32/PluginPage.h18:44
lifelessand it renders pretty snappily18:44
lifelessthread 10 is in the python zlib module18:45
lifelessI've found race conditions / bugs in it before18:45
lifelessso little alerts are going off for me18:45
lifelessnote that it is in PyEval_RestoreThread18:45
lifelesssee (http://docs.python.org/c-api/init.html) - in short, this is a common place for hangs18:46
lifelessit means it will be trying to get the GIL18:46
lifelessnow, looking down its frames, that is in knit extraction18:47
lifelessso this should be safe as long as loggerhead isn't sharing the same objects across threads18:48
lifeless(it may be safe if the objects are being shared, but its less of an automatic assumption)18:48
flacostegary_poster: how about at the old TL call position?18:49
lifelessthreads 14,13,12 10 are all waiting on the GIL18:49
lifeless(determined by taking the GIL lock which the call to RestoreThread identifies and searching for it)18:50
derycklifeless, interesting.  I had to read back a few times, but I follow now. I feel +2 times smarter now. :)18:51
lifelessif you check the code for PyEval_RestoreThread you can see how I got the GIL lock just from the backtrace18:51
lifelessbecause the only lock it tries to get is the GIL18:51
lifelessthread 11 is doing a GC18:52
lifelessthis means thread 11 holds the GIL18:52
lifelessthe threads that are in sock_sendall have released the GIL18:52
lifeless(line 2723 in socketmodule.c is wrapped in Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS)18:52
lifelessagain - see http://docs.python.org/c-api/init.html - that means they have released the GIL18:53
lifelessso thats all the threads18:53
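The point about Py_BEGIN_ALLOW_THREADS can be seen from pure Python as well. The following is a minimal sketch, assuming CPython and a platform that provides socket.socketpair(); blocked_reader and demo are hypothetical names made up for the illustration. A thread parked in a blocking socket call does not hold the GIL, so other threads keep running Python code.

```python
import socket
import threading


def blocked_reader(sock, result):
    # recv() releases the GIL while it waits in the kernel, the same way
    # the send() call in sock_sendall does.
    result.append(sock.recv(16))


def demo():
    a, b = socket.socketpair()
    result = []
    reader = threading.Thread(target=blocked_reader, args=(b, result))
    reader.start()

    # Nothing has been sent yet, so the reader cannot have finished.
    # This loop still runs because the reader is not holding the GIL.
    total = sum(range(100000))
    still_blocked = reader.is_alive() and not result

    # Unblock the reader and clean up.
    a.sendall(b"done")
    reader.join(timeout=5)
    a.close()
    b.close()
    return total, still_blocked, result
```

This only demonstrates the benign direction (blocked in I/O, GIL released). A thread that holds the GIL inside C code, such as the collector in thread 11, cannot be shown as easily from pure Python.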
lifelessthe other threads have noise in their stack18:53
lifelessI *suspect* they are killed threads by the paste thread killing code18:53
lifelesse.g. dead but not joined yet18:54
lifelessnow, if the server has been attempted to shutdown18:54
lifelessbut hasn't gone18:54
lifelessthis would explain why there is no main thread visible18:54
lifeless(thread 1 shows18:55
lifelessThread 1 (Thread 27332):18:55
lifeless#0  0x00002b34765e5ebc in ?? ()18:55
lifeless#1  0x0000000000000000 in ?? ()18:55
lifeless)18:55
lifelessderyck: ^ probably need to ask webops for the exact sequence of events leading up to the core to validate that theory18:55
lifelessif we can validate the theory then we can make an interesting observation18:55
lifelesswhich is that the listen event loop *has* shutdown properly; what is missing is cleanup of these other threads18:56
lifelesswhich cannot happen until garbage collection completes18:56
derycklifeless, right18:56
lifelesswell, not properly :P18:56
lifelessthere are 2 threads in sock operations, 4 threads waiting for the gil (and apparently fine otherwise) and 1 in gc with nothing sensible higher up its stack18:57
lifelessthats a total of 7, but we'd expect 10 worker threads IIRC, plus mainloop18:58
lifelessso I strongly suspect a SIGINT or something already sent18:58
lifelessnow, lets peek at the other core18:59
lifelessthread 4 is in sendall19:00
lifelessas is 9, same spot as they have all been19:00
lifelessthread 12 is taking a threading lock19:00
lifelesslets see19:00
gary_posterflacoste, sorry, just saw this.  The old team lead call time is fine with me, but I thought lifeless would prefer not to meet that early.  The later we make it, the less likely Europeans from my team can attend, and the easier it is for lifeless, AFAIK.19:00
lifelessgary_poster: I can attend at that time, but we'll have to boot deryck :)19:01
lifelessgary_poster: who claimed that spot like a flash, before19:01
gary_posterlifeless, heh19:01
gary_posterum19:01
gary_posterok, I'll go look at the calendar...19:02
lifelessderyck: you'd be ok ~ this time on your thursday ?19:02
deryckI'm more Batman than Flash.  are we talking about me? :)19:02
gary_posterheh19:02
derycklifeless, sure.19:02
=== abentley changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: - | Firefighting: - | Critical bugtasks: 4*10^2
lifelessok, so deryck if you move your one +24 hours or so, and gary_poster you can move the paralleltests one an hour earlier.19:02
derycklifeless, done!19:03
gary_posterthank you lifeless & deryck19:03
lifelessderyck: thread 12 looks like its the implementation for a threaded queue or something (haven't checked the .py source yet)19:03
gary_posterflacoste, done on calendar19:04
lifelessderyck: so its not going to be the gil that its waiting on - in fact it has just released the gil (see threadmodule.c line 46)19:04
lifelessderyck: it looks like it is waiting for another request to service, judging from the call stack19:05
lifelessthread 13 is running some code *I think*19:06
lifelessthe python integration blew up though (frame 0 is the fail)19:06
lifelessso we need to check the source to see if it holds the GIL19:06
lifelessand indeed, line 943 is in the middle of the big opcode jump case statement19:07
lifelessso, thread 13 holds the gil19:07
lifelessand is processing a knit repository, which the other inventory access in the other core was doing as well19:08
lifeless /~starbuggers/sakila-server/mysql-5.1-wl820/view/head:/plugin/java_udf/java_context_test.cc is the file19:08
abentleyderyck, rick_h: Interrupt duties done in less than an hour.  Went down the whole list.19:10
lifeless /~starbuggers/sakila-server/mysql-5.1-wl820/view/head:/plugin/java_udf/grokjni.pl was the inventory content the other core was doing19:10
deryckabentley, nice!  I'll look forward to mine in an hour then. :)19:10
rick_habentley: rocking19:11
rick_hderyck: coat tail rider :P19:11
lifelessso, weak correlation there19:11
deryckrick_h, now you've finally figured me out.  oh no, my secret is exposed! :)19:11
lifelessthread 14 is another waiting-for-a-request19:12
lifelessas is 15,16,1719:12
derycklifeless, ah, so dealing with the same objects in different threads.  did I understand that right?19:12
lifeless1819:12
lifelessderyck: no, no indication of that yet; was noting that the same branch is being accessed from each core19:12
lifelessso there may be something to do with that content19:12
derycklifeless, ok19:12
lifelessits also a 'knit' format branch which is bzr < 1.0's native format19:13
lifelessI think, or something in that general area19:13
lifeless18 is waiting for a request19:13
lifeless19 and 20 too19:13
lifeless21 is waiting for the GIL19:14
lifelessand its the actual mainloop - note the serve_forever () and the PyMain at the top of the stack19:14
lifelessPy_Main I mean19:14
lifelessderyck: I think https://pastebin.canonical.com/59626/ has two different bt's in it, its a little confusing19:15
lifelessyeah, definitely does19:15
lifelessmy info is ok, because I started at the bottom which was indeed the other set of bt's19:17
lifelessanyhow, what does this mean19:17
lifeless12,14,15,16,17,18,19,20 are workers waiting for a request, 21 is the mainloop, 13 is doing work - thats 9 waiting, one mainloop and one worker working19:18
lifelessso this second core looks totally healthy and unstuck19:18
lifelessthreads 9 and 4 are a little worrying - that send() behaviour19:20
lifelessbut they don't hold the GIL19:20
deryckdid I cut-n-paste wrong or something, to get different bt's?  I thought I just scp'ed gdb and pasted straight as is.19:21
deryckgdb.txt, I meant.19:21
lifelessthere is nothing, assuming thread 13 would come alive again, stopping the healthy workers from serving more requests19:21
lifelessderyck: https://pastebin.canonical.com/59626/ and https://pastebin.canonical.com/59625/19:21
lifelessderyck: compare the first four lines19:21
lifelessderyck: and then the bottom four lines19:21
lifelessthe bottom four lines of 59625 appear in the middleish of 5962619:22
lifelessderyck: so, the core with happy workers has only one real issue and thats a busy thread; its possible that that isn't releasing the GIL for some reason, but just regular bzrlib code *should* give other threads timeslices19:23
lifelessderyck: were both cores taken from hung loggerheads? How was hung determined ?19:23
derycklifeless, that's a webops question.  not sure.  I can ask them.19:24
lifelessderyck: the mysql urls in question both render near-instantly for me19:24
lifelesshttp://bazaar.launchpad.net/~starbuggers/sakila-server/mysql-5.1-wl820/view/head:/plugin/java_udf/grokjni.pl and http://bazaar.launchpad.net/~starbuggers/sakila-server/mysql-5.1-wl820/view/head:/plugin/java_udf/java_context_test.cc19:25
lifelessderyck: so, you may want to copy some of this to the bug; the bad news is I see no reason for the second process to appear hung, and the first process appears to have had its mainloop killed (e.g. via the OOM killer, manual SIGINT, whatever) and that *will* stop it serving.19:26
lifelessderyck: we now need to track down more data around the state of both of the cores, to see if we can infer anything else.19:26
lifelessderyck: I hope this has helped!19:27
derycklifeless, I really don't mind copying this to the bug.  but it's a lot of text.  Would it be better for you to just summarize this briefly there?19:27
deryckjust so I don't mis-represent.19:27
lifelessone core has damaged (I suspect killed but not joined()) threads including a missing mainloop. The missing mainloop would on its own make it appear dead to haproxy.19:28
lifelessIt is in gc in another thread; one possible theory is it got too big memory wise and what we are looking at is damaged fallout from some attempt to recover it19:29
lifelessthe other core appears entirely healthy except for the oddness that stuff is stuck in send(); but that is normal if the OS buffer is full, which will happen if the internets are not brilliantly happy (because buffering affects the entire chain)19:29
lifelessso we need to know for the first one, as much as we can about how it got to that state - were any sysadmin interventions applied first? (if so, the core doesn't represent the failure, it represents the failure + mangling)19:31
lifelessfor the second, we need to know the symptoms that were being reported19:32
lifelessderyck: I suggest putting the transcript in an attachment for folk wanting to check the workings19:32
derycklifeless, done19:37
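The remark that threads sitting in send() are normal when the OS buffer is full can be reproduced with a small sketch. It assumes a platform with socket.socketpair(); fill_send_buffer is a hypothetical helper written for this illustration and is not taken from loggerhead or Paste. The sender is made non-blocking so that the moment a blocking send() would have stalled shows up as an exception instead of a hang.

```python
import socket


def fill_send_buffer(chunk_size=4096, max_bytes=64 * 1024 * 1024):
    """Send to a peer that never reads until the kernel buffer is full."""
    sender, receiver = socket.socketpair()
    # Non-blocking mode turns "would hang in send()" into BlockingIOError.
    sender.setblocking(False)
    sent = 0
    would_block = False
    try:
        while sent < max_bytes:
            try:
                sent += sender.send(b"x" * chunk_size)
            except BlockingIOError:
                # A blocking socket would be parked inside send() here,
                # which is what the threads in sock_sendall look like.
                would_block = True
                break
    finally:
        sender.close()
        receiver.close()
    return sent, would_block
```

The byte count at which the buffer fills depends on the platform, so the sketch reports it rather than assuming a value. With a blocking socket and a slow or stalled reader further down the chain, the same condition leaves a worker thread waiting in send() without holding the GIL.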
lifelesscool19:41
lifelessand now, breakfast.19:41
=== matsubara is now known as matsubara-afk
barrysinzui: is there anything we can do to fix private mailing list archive access? :(20:06
lifelessbarry: isd have a fix20:07
lifelessbarry: it is 'in deployment'20:07
barrylifeless: excellent, thanks20:07
sinzuilifeless, since when.20:08
sinzuibarry, lifeless bug 663923 gives no indication there is a fix available20:10
_mup_Bug #663923: Cannot view list archive of private team <escalated> <mailing-lists> <ml-archive-sucks> <regression> <Apache OpenID:In Progress by mars> <Launchpad itself:In Progress by mars> < https://launchpad.net/bugs/663923 >20:10
* barry subscribes20:11
sinzuiI still believe grackle will be deployed and that bug will be fixed20:11
barrysinzui: what's grackle?20:13
sinzuibarry, the archiver we are writing20:13
barryah, right.  do you mean once grackle is deployed, you won't need the openid dance?20:14
sinzuicorrect20:14
barrycool20:15
barrythat'll be nice20:15
barryheck,  i might even switch to grackle in mm320:15
sinzuibarry, possibly. I think Cassandra should be a choice rather than a requirement. We can written an almost complete memory store implementation that could be subclassed to implement a sql or simple mbox implementation20:17
sinzuis/We can written/We HAVE written/20:17
barrynice.  what's the status of it?  is code available?  is it functional yet?20:19
abentleybarry: We started work on it at the Thunderdome, but I haven't been involved since.20:23
barrywhere are the branches? :)20:23
abentleybarry: lp:grackle20:23
* barry branches it for later20:24
sinzuibarry, I need one more day to complete the client. We can then complete the server in a few days20:25
sinzuibarry, all the code is in trunk https://code.launchpad.net/grackle20:26
lifelesssinzui: since the ISD weekly report20:26
barrythanks.  i will definitely keep my eye on it20:27
marssinzui, it is also in my team's goals for Q420:27
sinzuimars, thanks20:28
lifelesssinzui: are you on isd-announce?20:28
sinzuino20:28
lifelesssinzui: I'm not sure how to get you on it; but it does have a weekly summary of what ISD are up to that may be informative20:37
sinzuilifeless, I do not need to be more involved. This issue will be closed  soon20:38
lifelesssinzui: bug 92839120:47
_mup_Bug #928391: ProgrammingError creating new team <oops> <Launchpad itself:Triaged> < https://launchpad.net/bugs/928391 >20:47
lifelesssinzui: I think that that might be something your squad knows aboot20:47
sinzuilifeless, I learned about it an hour ago.20:48
sinzuiMy team will fix it20:48
lifelesskk20:53
deryckdentist time, yuck21:02
abentleyrick_h: have  you closed bug #294656  ?21:12
_mup_Bug #294656: Every page requests two JavaScript libraries (remove MochiKit) <javascript> <lp-bugs> <lp-translations> <lp-web> <tech-debt> <Launchpad itself:Triaged> < https://launchpad.net/bugs/294656 >21:12
rick_habentley: ah, sorry. Guess that never got linked to the branch. Yea, mochi is done and gone21:13
wallyworld_sinzui: https://pastebin.canonical.com/59655/22:27
wgrantfuuuu22:31
wgrant<unprintable Unauthorized object>22:31
wgrantFrom the isd team creation forbidden22:31
jelmerg'morning wallyworld_, wgrant22:32
wallyworld_jelmer: g'day22:33
sinzuiwallyworld_, are you running tip? I see what looks like a fix: https://code.launchpad.net/~bzr-pqm-devel/bzr-pqm/devel22:37
wallyworld_sinzui: no, just whatever a default precise install provides. i'll try tip, thanks22:38
wgrantMorning jelmer.22:38
wgrantStevenK: Bug #92844022:39
_mup_Bug #928440: When attempting to create a new team, I'm told I am "Not allowed here" <fallout> <regression> <Launchpad itself:Triaged> < https://launchpad.net/bugs/928440 >22:39
wgrantSee my comment22:39
sinzuiwallyworld_, or revert to -r 80.22:42
wallyworld_sinzui: tip still breaks, so will try that rev22:43
sinzuiwallyworld_, the branch is in ec2 now22:47
wallyworld_sinzui: and rev 80 breaks too. i need to see where RemoteBranch lives22:47
wallyworld_thanks for landing22:47
wallyworld_sinzui: the issue is the version of bzrlib22:48
sinzuiwallyworld_, yes, I think I am using the system lib22:48
wallyworld_sinzui: makes sense. i am using the one from lp-sourcedeps22:49
wgrantlifeless: Shall I start landing my heat incineration branches?23:16
lifelesswgrant: yes23:27
lifelessnoone has flamed us AFAICT23:27
* thumper flames lifeless23:27
wgrantThat was my thinking23:27
* thumper flames wgrant and wallyworld for good measure23:27
wgrantUhoh23:27
* thumper leaves again23:28
wallyworldthumper: what have i done this time?23:28
thumperwallyworld: I'm sure you know...23:28
wallyworldthumper: well, it could be one or soooo many things23:28
wallyworlds/or/of23:28
wgrantlifeless: Does this count as removing complexity to offset disclosure? :P23:29
lifelesswgrant: I can see we're going to have fun with that23:29
lifelessdisclosure is offsetting user ticket complexity and performance suckiness too23:29
wgrantIt is23:29
wgrantI am trying to respect the 5s rule with bug searches.23:30
wgrantAs much as I can.23:30
lifelessanyhow23:30
lifelessheat was signed off on by stakeholders including ubuntu, for the changes effective monday23:30
lifelessI see no reason to wait an extended period23:31
wgrantSure.23:31
wgrantThe stakeholders aren't the only stakeholders, but indeed the outcry seems to be nonexistent.23:31
lifelesswgrant: btw23:31
wgrantWhich is as I expected.23:31
lifelesswgrant: you can't delete the garbo job straight away23:31
wgrantOh?23:31
wgrant(he says, as he Ctrl+Cs the lp-landing of the garbo job removal)23:32
lifelessexercise for the reader. You will facepalm.23:32
lifelesstell me if you timeout ;)23:32
lifelessrick_h: is bug 928500 your work?23:33
_mup_Bug #928500: 'Series and Milestones' graph not loading - LPJS is not defined <graph> <javascript> <latency> <loading> <lpjs> <milestone> <series> <Launchpad itself:New> < https://launchpad.net/bugs/928500 >23:33
wgrantlifeless: I don't see the issue.23:33
lifelesswgrant: we are changing the rule for heat calculation to not include age.23:33
lifelesswgrant: what process will we use to update bugs that *are not changed* to use the new rule ?23:34
wgrantlifeless: I decided that we don't care enough.23:34
wgrantDo we?23:34
lifelessWell, if we can point any-and-all 'wtf' bug reports to you, sure.23:34
lifelessI think it's pretty cheap to let the garbo do one full scan post-heat-calculation-change, and it ensures that it is all consistent23:35
wgrantThen I'll mark the bug as affecting and then not-affecting me, and then say "wtf" back because the value is correct :)23:35
wgrantBut true.23:35
wgrantSo, I guess I'll put the DB patch in a separate pipe and do that first.23:36
lifeless\o/23:37
wgrantlifeless: Hm,23:37
wgrantlifeless: Except that the updater never completes.23:37
wgrantBug #90619323:37
lifelesswgrant: I'm pretty sure it is incremental23:37
wgrantProbably better to do a one-off23:37
_mup_Bug #906193: BugHeatUpdater never completes <Launchpad itself:In Progress by wgrant> < https://launchpad.net/bugs/906193 >23:37
wgrantIt's not23:37
wgrantOh23:37
wgrantI guess it is23:37
lifelessthe warning was bogus, last I looked at it23:37
lifelessit doesn't do a full scan in 1 hour23:37
wgrantYeah, true.23:38
wgrantIt probably never catches up, though.23:38
wgrantAnyway, will land the DB patch without the garbo dropping.23:38
lifelesslet me check ze code23:38
lifelesswgrant: so yeah -23:39
lifeless    def _outdated_bugs(self):23:39
lifeless        outdated_bugs = getUtility(IBugSet).getBugsWithOutdatedHeat(23:39
lifeless            self.max_heat_age)23:39
wgrantBut it seems to never finish.23:39
wgrantWhich means it is behind.23:39
wgrantIt's incremental, but probably never catches up.23:39
lifelessyour new function should be cheaper23:40
wgrantI suspect it's better just to do a one-off four-line script to update everything.23:40
wgrantIt is about 5 times cheaper, true.23:40
lifelessI don't have an opinion; script is fine if that's what you think is best23:40
lifelessmy reading of the code is that the heat updater will, on each run, do the lowest-id N older-than-X bugs23:40
lifelessthis could always be behind but still hit everything23:41
wgrantIt does, yes.23:41
wgrantOh?23:41
wgrantWhat if the first hundred thousand bugs get updated regularly?23:41
wgrantThe top 800000 never will23:41
lifelessso lets say it takes Y days to become stale23:41
lifelessmm, rephrase23:42
lifelessruns hourly23:42
lifelessin one hour, it does N bugs23:42
lifelessif the number of bugs *becoming stale* per hours is greater than N23:42
wgrantI wouldn't expect it to be, but it looks like isDone is never hit.23:42
wgrantWhich suggests that it is.23:42
lifelessthen after Y days, it will have to do the first N again, and you'll have a loop over 24*Y*N bugs23:43
lifelessany bug updated for other reasons within that Y period will perturbate the loop and get other bugs updated.23:43
lifelessanyhoo; shrug. Like I say, choose the best use of your time w/curtis, and have fun.23:43
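
[Editor's note] The scheduling argument above (hourly runs, each handling the lowest-id N bugs whose heat is older than the staleness window, looping over roughly 24*Y*N bugs) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name `simulate`, its parameters, and the list-based bookkeeping are hypothetical and are not Launchpad code, and it ignores bugs that get refreshed for other reasons during the window.

```python
def simulate(total_bugs, batch_size, stale_after_hours, hours):
    """Toy model of an hourly updater (hypothetical, not Launchpad code).

    Each hourly run refreshes the lowest-id `batch_size` bugs whose last
    refresh is at least `stale_after_hours` old (or that were never
    refreshed). Returns the set of bug ids touched during the simulation.
    """
    last_updated = [None] * total_bugs  # None means "never refreshed"
    touched = set()
    for now in range(hours):
        batch = []
        # Scan in id order so the lowest-id stale bugs are picked first.
        for bug_id in range(total_bugs):
            stamp = last_updated[bug_id]
            if stamp is None or now - stamp >= stale_after_hours:
                batch.append(bug_id)
                if len(batch) == batch_size:
                    break
        for bug_id in batch:
            last_updated[bug_id] = now
        touched.update(batch)
    return touched


if __name__ == "__main__":
    # Window of 10 hours, 5 bugs per run: the loop covers 10 * 5 = 50 bugs.
    starved = simulate(total_bugs=100, batch_size=5,
                       stale_after_hours=10, hours=200)
    print(len(starved), "of 100 bugs were ever refreshed")

    # With only 30 bugs the updater finishes before the window expires.
    caught_up = simulate(total_bugs=30, batch_size=5,
                         stale_after_hours=10, hours=200)
    print(len(caught_up), "of 30 bugs were ever refreshed")
```

In this model the updater reaches every bug only when the total is no larger than window-in-hours times batch size; beyond that, the highest ids are never reached unless something else perturbs the loop, which the sketch deliberately leaves out.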
