dpm | good morning all | 08:04 |
---|---|---|
TLE | whois mdke | 12:28 |
TLE | whoops, just didn't remember the name ;) | 12:28 |
dpm | :) | 12:33 |
dpm | hi TLE | 12:33 |
TLE | hi | 12:33 |
TLE | I'm writing the todo list for tomorrow | 12:34 |
dpm | ah, cool | 12:34 |
dpm | I'm progressing on the database import performance improvements for the translation stats. I've thrown away sqlite and I'm using postgresql and raw sql queries instead of django. Postgresql has a nice COPY command especially designed to import data from CSV files (well, json with a bit of massaging in my case) | 12:35 |
dpm | that reduced the import from hours to just a few minutes, which is rather nice :) | 12:36 |
dpm | now I'm fighting with SQL, as I need to do another additional INSERT query after the import, which is either taking a bit long (on django 1.3) or failing (on django 1.1) | 12:37 |
dpm | but that's probably me needing to learn SQL | 12:37 |
TLE | ahh, sounds like good progress | 12:37 |
dpm | yeah, but I wanted to have it ready before the UGJ this weekend, and I'm not sure I'll have the time :/ | 12:38 |
dpm | I'll figure it out, I'll just have to find a postgresql or sql expert to give me a hand :) | 12:39 |
TLE | when you say manual INSERT, you mean directly with SQL and not via Django? | 12:41 |
dpm | TLE, exactly, I'm doing raw sql inserts, which really boosts performance in this particular case | 12:59 |
TLE | but then it must be a python sql thing that is failing and not django, right? | 13:00 |
dpm | yeah, psycopg2 | 13:00 |
dpm | they are the underlying python DB bindings Django is using for Postgresql access | 13:01 |
dpm | http://bazaar.launchpad.net/~dpm/+junk/ubuntu-translations-stats/view/head:/stats/management/commands/importdata.py#L207 | 13:01 |
TLE | ahhhh, that way | 13:01 |
dpm | the other query below, on line 256 is the one that's failing on django 1.1 (or rather in whichever psycopg2 version django 1.1 uses). You can also see the original Django code commented out above | 13:04 |
dpm | I still haven't figured out why, but it returns duplicate rows when it shouldn't | 13:05 |
TLE | hmm, I can't spot anything, there is an extra , after the columns in 256, but I don't think it matters | 13:12 |
TLE | well, better find an appropriate SQL expert | 13:13 |
dpm | yeah, but thanks for looking anyway! | 13:34 |
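
The COPY-based import that dpm describes at 12:35 could look roughly like the sketch below. It is not the code from the linked importdata.py: the function names, the table name and the column names are hypothetical, and the JSON "massaging" is assumed to be a flat list of objects. It assumes `cursor` is a psycopg2 cursor, whose `copy_expert()` method streams a file-like object to a `COPY ... FROM STDIN` statement.

```python
import csv
import io
import json


def json_to_csv_buffer(json_text, columns):
    """Turn a JSON array of flat objects into an in-memory CSV file.

    Assumption: every object carries all the keys listed in `columns`.
    None values are written as empty fields, which COPY in CSV mode
    reads back as NULL by default.
    """
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        writer.writerow([record[column] for column in columns])
    buffer.seek(0)  # rewind so COPY reads from the start
    return buffer


def copy_into_table(cursor, table, columns, csv_buffer):
    """Bulk-load the CSV buffer in a single COPY instead of per-row INSERTs.

    Assumption: `table` and `columns` are trusted identifiers from the
    program itself, since they are interpolated into the SQL text.
    """
    sql = "COPY {} ({}) FROM STDIN WITH CSV".format(table, ", ".join(columns))
    cursor.copy_expert(sql, csv_buffer)
```

With a real connection this would be called as `copy_into_table(conn.cursor(), "stats_demo", ["lang", "translated"], buf)` followed by `conn.commit()`, where `stats_demo` is a made-up table name. The speed-up dpm reports is plausible because the rows travel to the server in one stream rather than as one statement per row.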