[00:05] blr: Hi
[00:06] hello!
[00:06] blr: Colin can't do his Monday evening next week, so how does your Wednesday morning work for a meeting?
[00:06] sure
[00:21] I think that works here. Kirsten's sometimes out on Tuesdays which leaves me looking after the children, but if we're talking about something like 23:00 London then that would normally be OK, I think.
[00:22] Of course the difference shifts radically between winter and summer ...
[00:23] 2300 London is, what, 1000 for wgrant and 1200 for blr?
[00:23] I think so.
[00:23] I can easily do 3-4h earlier, but I suspect that makes it clashier for you.
[00:24] Yeah, that's kind of pessimal here
[00:24] 11 for me currently I think
[00:24] 90 minutes ago, so midday for you.
[00:24] We should probably replan in summer rather than trying to find something that has 2h of tolerance with no practical experience
[00:24] Definitely.
[00:24] Silly planet
[00:25] Well, silly DST
[00:26] timezones are such a nuisance.
[00:26] We should all move to Queensland.
[00:26] * blr plays the NZ gigabit fibre trump card
[00:26] Yeah yeah, you and your marginally less insane government.
[00:27] _marginally_
[00:28] Gigabit fibre with a silly name
[00:30] well cjwatson and wgrant, I can try to be flexible around times, tend to be up fairly early
[00:31] and happy to hop on a hangout late if that also works on the other side
[00:31] I think there's few enough of us that we can have a bit of slippage
[00:31] We'll have more in APAC soon to swing it our way :P
[00:31] Yeah, if anything I'm more likely to find it occasionally convenient to be later
[00:32] Chances of having the children in bed before 10 < chances of Australia having a sane government
[00:32] hahah
[00:32] Heh
[00:33] But we can see how it goes
[00:34] Night
[00:34] Night
[00:34] g'night
[00:46] Hrm
[00:47] So the librarian leak is not technically a leak at all. The HTTPConnection cycles are collectable, it's just that the GC doesn't run in time at a particular load level.
[00:47] Locally, 200rps is fine but 100rps eventually dies
[03:41] ohai
[03:46] o/
[03:47] Morning.
[03:47] thomi: Your fix is on qastaging now, btw. Please check that it does what it says and doesn't break anything obvious, then change the bug tag from qa-needstesting to qa-ok.
[03:47] http://lpqateam.canonical.com/qa-reports/deployment-stable.html will then be happy and green
[03:48] wgrant: ok, stupid question... where is qastaging?
[03:48] thomi: The aptly named qastaging.launchpad.net
[03:48] It's a snapshot of the prod DB that is updated when I get around to it, automatically running the latest tested code.
[03:49] ahh
[03:49] latest _untested_ code?
[03:49] Tested, but not manually QAed.
[03:49] gotchya
[03:50] We land code to devel, http://lpbuildbot.canonical.com/waterfall runs the test suite over any new revisions.
[03:50] Assuming all the tests pass, a bot pulls devel into stable.
[03:50] qastaging polls stable every couple of minutes and updates whenever it changes.
[03:50] deployment-stable.html lists anything that's on qastaging but not on production.
[03:51] cool
[03:51] wgrant: I'm getting timeouts on qastaging. I sure hope that's not due to my code...
[03:52] thomi: No, qastaging's just on a much smaller DB server
[03:52] Only 32GiB of RAM
[03:52] ok
[03:52] It'll take a few refreshes for the bugs homepage to load, if it ever does.
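
wgrant's observation at [00:47] is subtle enough to deserve a sketch: objects in a reference cycle are collectable, but plain reference counting never frees them, so any file descriptors they hold stay open until the cyclic collector happens to run. A minimal illustration with hypothetical Connection/Response classes (not the real librarian or httplib code):

    import gc

    class Connection(object):
        def __init__(self):
            self.response = None  # will point back at us, completing a cycle

    class Response(object):
        def __init__(self, connection):
            self.connection = connection  # back-reference, completing the cycle

    def handle_request():
        conn = Connection()
        conn.response = Response(conn)
        # conn goes out of scope here, but the cycle keeps both objects
        # (and any socket they held) alive until the cyclic GC runs.

    for _ in range(10000):
        handle_request()

    before = sum(1 for o in gc.get_objects() if isinstance(o, Connection))
    gc.collect()
    after = sum(1 for o in gc.get_objects() if isinstance(o, Connection))
    print("connections alive before collect: %d, after: %d" % (before, after))

Under enough load, such cycles can accumulate descriptors faster than the generational collector happens to fire, which is why this behaves like a leak without technically being one.
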
[03:52] that makes me a little bit sad on the inside
[03:52] Yeah
[03:52] I have like 8GiB of RAM sitting around, I'll mail them to you :D
[03:52] But qastaging's resourcing can never quite be a valid match for prod, simply because it has very little load.
[03:53] So even if it had 128GiB of RAM like prod, there'd be less contention so it would be implausibly fast.
[03:53] wgrant: would you have a moment to re-review the verbose-diff branch again next week, or did you have some concerns around unhandled edge cases still?
[03:54] blr: I think it's probably good, but I need to torture test it.
[03:55] I'll get to it on Monday unless the sky keeps falling :)
[03:55] Thanks for working through the bzrlib etc. changes. I think we have a good solution now.
[03:55] ok, it would just potentially be forgotten amongst all the git work, so thought I would mention it :)
[03:56] I had planned to do it before you got back, but production burning down has had me unfortunately distracted.
[03:56] hah yes
[03:57] and what was the deal with the spam thing?
[03:57] Just a few incompetent spammers from Egypt trying to find gaps.
[03:57] They really like advertising Arabs Got Talent.
[03:57] launchpad seems like an unlikely vector heh weird
[03:57] Though my Arabic isn't very good, so I may be misinterpreting.
[03:58] It's a particularly odd vector for non-English spam, since the vast majority of the content is English and anything else is very detectable.
[03:59] they should contribute to translations as penance
[03:59] Heh
[04:30] right - I'm off for the day.
[04:31] Will be at LCA next week, so wgrant gets a week's respite from my annoying questions.
[04:31] see y'all in the future!
[04:41] yep, about to pass out myself. wgrant the routing in cornice/pyramid is a refreshing change from django :)
[05:33] wgrant: The explicit close branch should sort the need for garbage collection under load, assuming requests.Session.close() actually closes the sockets and doesn't leave that for the garbage collector...
[05:34] wgrant: the http_connection is set (if it doesn't already exist) for all actions afaics, not just on retries.
[05:35] wgrant: But maybe we are more confident with just bumping up the fd limit and not worrying about switching to an untested-by-us python-swiftclient at this stage?
[05:42] We would need over 100 *errors* per second to trigger the issue, wouldn't we? I guess a pentest or something might generate that many 404s.
[05:49] stub: the explicit close branch doesn't reliably fix it locally. I'm not sure why.
[05:49] Where do you get the 100 errors per second number?
[05:49] It mostly depends on how often gc.collect() is invoked.
[05:50] 07:47:36> So the librarian leak is not technically a leak at all. The HTTPConnection cycles are collectable, it's just that the GC doesn't run in time at a particular load level.
[05:50] 07:47:49> Locally, 200rps is fine but 100rps eventually dies
[05:50] Oh, right
[05:50] load is load
[05:51] If the close() branch does nothing, I guess we want to drop it (at least for now) to avoid unnecessary changes.
[05:52] I'd rather not sabotage this Swift cutover unnecessarily
[05:54] With the 404s going back to the pool, as they were supposed to, the other errors should be so infrequent that it won't matter if gc is slow.
[05:54] Right, that's my theory.
[05:55] Cool. Then we are done if reality agrees, apart from landing a one line patch.
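
The "explicit close" idea stub describes at [05:33] amounts to releasing the pooled sockets deterministically rather than leaving them to the cyclic collector. A minimal sketch, under the same assumption stub hedges above (that requests.Session.close() really does tear down the underlying sockets); the function name and URL are illustrative:

    import requests

    def fetch(url):
        session = requests.Session()
        try:
            response = session.get(url)
            response.raise_for_status()
            return response.content
        finally:
            # Close the pooled connections now, rather than leaving the
            # sockets (and their file descriptors) for the garbage collector.
            session.close()

As wgrant notes at [05:49], this didn't reliably fix things locally, so the one-line patch returning 404s to the pool remained the operative fix.
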
[05:55] I wonder if it's worth looking at the new swiftclient to see if we can use its internal multithreading support rather than creating hundreds of them.
[05:55] Anyway, this seems to work for now.
[05:56] We should only ever create 10 of them, really
[05:58] I didn't know enough about twisted internals to be more clever. e.g. are deferToThread threads reused forever or a limited number of times? If the latter, threading.local values will leak.
[16:55] wgrant: \o/ working split-out txpkgupload
[16:56] passes all tests, successful ftp and sftp uploads
[16:56] authenticating against local LP authserver
[16:57] shall I release lazr.sshserver 0.1 and proceed with removing that from the LP tree?
[16:57] lp:txpkgupload exists if you want to give it a once-over.
[17:02] I expect I'll tweak things a bit as I attempt to split it out of LP and think about deployment.
[19:14] The only thing I've noticed so far is that I probably want to make twistd --logfile DTRT rather than requiring YAML log configuration.
[19:16] I think I have roughly suitable puppet branches for deploymgr config and for switching over globally. Need to think about the best deployment ordering at some point when it isn't 7pm on a Friday.
[22:42] cjwatson: Oh, excellent.
[22:43] cjwatson: We should probably upgrade txlongpoll while we're looking at that sort of stuff. Prod's currently using the pre-YAML version.
[22:44] What tweaking do you envisage as you split it out of LP?
[22:44] I'd get that working before releasing lazr.sshserver, just because there's no real reason to release beforehand and it shouldn't take long.
[23:51] wgrant: I haven't noticed anything other than making the logging a bit more graceful so far (the mechanics for this changed when switching from TAC to plugins and there were a few ways I could have done it), but I haven't played with it very much yet as I only got it working pretty close to EOD.
[23:51] pre-YAML> useful to know, that means I can't look to it for advice ;-)
[23:52] I need to figure out where to put configs for different installations. Possibly just different command-line options from the init script.
[23:54] Yeah.
[23:55] txlongpoll is also packaged for use by MAAS, but I suspect it doesn't have its own initscript.
[23:55] -rw-r--r-- root/root 404 2012-03-14 16:14 ./etc/init/txlongpoll.conf
[23:56] exec /usr/bin/twistd -n --pidfile=/run/txlongpoll.pid --logfile=/var/log/txlongpoll.log txlongpoll --config-file=/etc/txlongpoll.yaml
[23:57] I don't think it's worthwhile packaging this though
[23:57] No, just mentioning it for possible inspiration in terms of config handling.
[23:58] Mm. I based txpkgupload's plugin pretty directly on it.
[23:58] But I should check back for how their logging works, since I see --logfile there.
[23:59] Should be able to release to download-cache and PyPI on Monday, anyway.
[23:59] Yeah, I'll look over them both on my Monday, but from what I've seen they're all good.
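
For context on the TAC-to-plugin switch and the --logfile question: twistd itself handles process-level flags such as --pidfile and --logfile, so only service-specific options (like a config file path) need to live in the plugin. A bare-bones sketch with hypothetical names (not the actual txpkgupload or txlongpoll code); this file would sit in a twisted/plugins/ directory:

    from twisted.application import service
    from twisted.application.service import IServiceMaker
    from twisted.plugin import IPlugin
    from twisted.python import usage
    from zope.interface import implementer

    class Options(usage.Options):
        # Only service-specific options belong here; twistd provides
        # --pidfile and --logfile on its own.
        optParameters = [
            ["config-file", "c", "/etc/example.yaml", "Path to the YAML config."],
        ]

    @implementer(IServiceMaker, IPlugin)
    class ExampleServiceMaker(object):
        tapname = "example"  # run as: twistd example --config-file=...
        description = "A sketch of a twistd plugin service."
        options = Options

        def makeService(self, options):
            # Read options["config-file"] and build the real service here.
            return service.MultiService()

    serviceMaker = ExampleServiceMaker()

The txlongpoll exec line quoted at [23:56] has exactly this shape: twistd-level flags first, then the plugin name followed by its own options.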