[03:11] <carlos> SteveA: please, ping us when you are ready
[03:13] <carlos> if you are too busy to have the meeting now, tell us too, so we can have another meeting that we have pending
[03:13] <carlos> please
[03:14] <SteveA> carlos: hi
[03:14] <SteveA> I'm here
[03:14] <carlos> hi
[03:14] <SteveA> two things
[03:14] <SteveA> 1. I want to understand better the timestamp and manually entered data thing
[03:15] <SteveA> 2. about the view refactoring, one point that came out of discussion with kiko and mark is that it is something to do if it will be a small job, and will improve things
[03:15] <SteveA> don't spend time on it before 1.0 if it is a larger thing
[03:15] <SteveA> it isn't a goal in itself
[03:15] <SteveA> it is an idea that might help work on TranslationReview
[03:16] <SteveA> so, dealing with (2) first
[03:16] <SteveA> what do you think?
[03:17] <carlos> well, the thing is that I could implement TranslationReview with current views, but adding some hacks to see, for instance, the copy button that the user clicked to copy a message in the text area to modify it
[03:18] <carlos> I don't think it would be much more ugly than what we have right now
[03:19] <carlos> so if you prefer to leave it for later, I can do it, but take that into account at review time
[03:19] <SteveA> maybe you should guesstimate it?
[03:20] <SteveA> how long to do the refactoring in days?  how long to do TranslationReview after the refactoring?  how long to do TranslationReview before the refactoring?  how long to do the refactoring after TranslationReview?
[03:20] <carlos> based on what kiko told me or what danilo and I talk ?
[03:21] <SteveA> tell me what you think
[03:21] <SteveA> after all, you'll be doing the work
[03:22] <carlos> well, I would need to think about it before I can give you an estimation
[03:22] <carlos> I know the idea behind kiko's suggestion, but I haven't thought about it in depth yet, because of the pending meetings that would change the solution a bit
[03:22] <danilos> (just as a sidenote, this is exactly the reasoning why I want to work on generalized TranslationImport stuff: it will take 2-4 days to implement it, but adding OOo, KDE PO would be much shorter and cleaner)
[03:22] <carlos> I could give you that info at the end of today
[03:23] <carlos> is that ok?
[03:24] <SteveA> yes, that's fine.
[03:24] <SteveA> that will give you some idea of whether to do this first or later
[03:25] <carlos> danilos: well, in that case you are preventing the hack, we already have that hack in production, so it makes mucho more sense in your case
[03:25] <SteveA> so, about the exports and timestamps
[03:25] <carlos> s/mucho/much/
[03:25] <carlos> ok
[03:25] <danilos> carlos: I know, just giving the reasoning behind it
[03:25] <carlos> SteveA: let me tell you what we have atm and how does it work, ok?
[03:26] <SteveA> what I understand is that there is this process that involves producing exports
[03:26] <SteveA> and when pitti says one is good enough, that latest one becomes the baseline
[03:26] <carlos> yeah
[03:26] <SteveA> where does the manually entered timestamp come into it?
[03:27] <carlos> pitti tells me the timestamp of the tarball used for the baseline
[03:27] <carlos> and I ask Stuart to store it in our database
[03:27] <SteveA> is that the same data that is the most recent baseline export?
[03:28] <carlos> not always
[03:28] <SteveA> the tarball is made from the most recent baseline export?
[03:28] <SteveA> why might it not be?
[03:28] <carlos> because there is a small delay
[03:28] <SteveA> a small delay between what and what?
[03:28] <carlos> between when pitti gets a tarball, creates the .deb packages and notifies me of the timestamp
[03:28] <carlos> and we create a new export every day
[03:28] <SteveA> the timestamp is based on data in the database
[03:29] <SteveA> it is purely within the database's data
[03:29] <carlos> no, the timestamp is based on the date when the export was done
[03:29] <SteveA> what happens to that data -- making a tarball or a deb -- doesn't affect the timestamp of when the export was produced
[03:29] <carlos> oh
[03:29] <carlos> I see what you mean
[03:29] <carlos> sort of
[03:30] <SteveA> so, I cannot see a reason to put a timestamp *into* the database
[03:30] <carlos> it would work that way if our exports come from production 
[03:30] <SteveA> it is something that should only ever come out of the database
[03:30] <carlos> but it's not the case
[03:30] <SteveA> what I can see going into the database is
[03:30] <SteveA> marking which export is the one that is used
[03:30] <carlos> we use a read only mirror
[03:30] <carlos> we don't store the list of exports in our database
[03:30] <SteveA> oh
[03:31] <carlos> our datamodel doesn't allow it
[03:31] <carlos> is not like Ubuntu packages, we don't need that complexity
[03:32] <SteveA> ok
[03:32] <SteveA> so, the way I'd approach that is
[03:32] <SteveA> in the exported data, add a timestamp + checksum of timestamp
[03:32] <SteveA> so there cannot be a typo
[03:34] <SteveA> but, I'm more inclined to connect to the real database
[03:34] <carlos> me too
[03:34] <SteveA> and store that an export was produced
[03:34] <SteveA> and store the date of the latest translation used, or whatever is appropriate
[03:34] <SteveA> and give that an export-id
[03:34] <SteveA> so pitti can just say "this export-id is the baseline"
[03:35] <carlos> I guess we could use the link to the librarian as that 'export-id'
[03:36] <carlos> so we could have it for free adding a link to latest baseline langpack
[03:37] <carlos> SteveA: also, I was thinking of using the timestamp of the latest translation in that export as the timestamp for the language pack, so this problem wouldn't happen again
[03:37] <SteveA> right
[03:38] <SteveA> please file a bug or two on these issues
[03:38] <carlos> there is still another issue
[03:38] <SteveA> what's that?
[03:38] <carlos> as we generate daily language packs
[03:39] <carlos> we need to provide Martin Pitti a way to go to launchpad and note himself which language pack has been used as the base package
[03:40] <SteveA> how does pitti get a particular language pack?
[03:40] <SteveA> does he get an email?
[03:40] <SteveA> or go to a page in launchpad?
[03:40] <carlos> atm, he fetches it from people.ubuntu.com
[03:40] <carlos> once we move to production, he will get an email
[03:40] <carlos> with a link to librarian
[03:40] <SteveA> how does it get to people.ubuntu.com?
[03:41] <carlos> my script in carbon pushes the tarball there once built
[03:41] <SteveA> I see
[03:41] <SteveA> so, if we had a database table for langpacks produced
[03:41] <SteveA> he could see it in a UI
[03:41] <SteveA> and mark in that same UI if one has been used as the baseline for what
[03:41] <carlos> although Martin asked us for a fixed URL in librarian so he doesn't need to parse the email
[03:41] <carlos> SteveA: right
[03:41] <SteveA> or we could offer him xmlrpc 
[03:42] <carlos> I think that would be the right approach
[03:42] <SteveA> if he wants to automate it
[03:42] <carlos> yeah, that would be also a good way to do it
[03:42] <SteveA> ok
[03:42] <SteveA> that seems like a small spec to me
[03:42] <SteveA> couple of paragraphs explaining the background and proposed solution
[03:42] <SteveA> so we can schedule it for post 1.0
[03:43] <carlos> ok
[03:43] <carlos> also, I think that's the only missing part to move language packs to production
[03:44] <SteveA> ok
[03:44] <SteveA> so we talked about...
[03:44] <SteveA>  - having the export script write to the db
[03:44] <SteveA>   maybe in a special table
[03:44] <SteveA> or maybe using the librarian id
[03:44] <carlos> well, we still need a table
[03:45] <SteveA> so that there is a database-produced unique ID for a langpack that is generated
[03:45] <carlos> it's a one to many relation
[03:45] <SteveA> so, use the 'id' in the table rather than the librarian id perhaps
[03:45] <carlos> one distrorelease has n language pack exports
[03:45] <carlos> ok
[03:46] <SteveA> ok
[03:46] <SteveA> and then we want to record there whether it is used as the baseline
[03:46] <SteveA> or a baseline
[03:46] <SteveA> and the timestamp of the most recent translation
[03:46] <carlos> ok
[03:46] <carlos> also we have another kind of language packs, the ones that only have updates...
[03:47] <carlos> but I don't think I should bore you with that
[03:47] <carlos> I will note that in the spec
[03:47] <SteveA> I know that they exist
[03:47] <SteveA> and the script that produces them can use this data to know what range of translations to include
[03:47] <carlos> right
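The table discussed above could look roughly like the following sketch. It is only an illustration under stated assumptions: the table name, column names, helper functions and sample values are all hypothetical (nothing here is Launchpad's real schema), and sqlite3 stands in for the production database.

```python
import sqlite3

# Hypothetical schema: one distrorelease has n language pack exports.
SCHEMA = """
CREATE TABLE language_pack_export (
    id INTEGER PRIMARY KEY,              -- database-produced export id
    distrorelease TEXT NOT NULL,
    librarian_file TEXT NOT NULL,        -- link to the stored tarball
    latest_translation TEXT NOT NULL,    -- timestamp of newest translation
    is_baseline INTEGER NOT NULL DEFAULT 0
);
"""


def record_export(conn, distrorelease, librarian_file, latest_translation):
    """Store one export and return its database-generated id."""
    cur = conn.execute(
        "INSERT INTO language_pack_export "
        "(distrorelease, librarian_file, latest_translation) "
        "VALUES (?, ?, ?)",
        (distrorelease, librarian_file, latest_translation))
    return cur.lastrowid


def mark_baseline(conn, export_id):
    """Flag an export as a baseline; refuse ids that do not exist."""
    cur = conn.execute(
        "UPDATE language_pack_export SET is_baseline = 1 WHERE id = ?",
        (export_id,))
    if cur.rowcount != 1:
        raise LookupError("unknown export id: %r" % (export_id,))


def latest_baseline(conn, distrorelease):
    """Return (id, latest_translation) of the newest baseline, or None."""
    return conn.execute(
        "SELECT id, latest_translation FROM language_pack_export "
        "WHERE distrorelease = ? AND is_baseline = 1 "
        "ORDER BY id DESC LIMIT 1", (distrorelease,)).fetchone()


# Made-up sample data: two daily exports, the first chosen as baseline.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = record_export(conn, "dapper", "langpack-a.tar.gz", "2006-08-20 10:00")
second = record_export(conn, "dapper", "langpack-b.tar.gz", "2006-08-21 10:00")
mark_baseline(conn, first)
```

With something like this, nobody types a timestamp: the baseline is chosen by id, and a script building update-only packs could read `latest_baseline(conn, "dapper")` to know where its range of translations starts.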
[03:48] <SteveA> and then we also talked about a UI + maybe xmlrpc for pitti
[03:48] <carlos> also, I'm thinking of adding something to launchpad that allows pitti to select whether we are going to generate updates or full exports
[03:48] <SteveA> to get the langpacks, see what langpacks are available, and mark the one he chooses as a baseline
[03:48] <SteveA> ok
[03:48] <carlos> so he can decide to do a new full export
[03:48] <SteveA> so, this is something to discuss in person with pitti perhaps?
[03:49] <SteveA> all these things together
[03:49] <carlos> to reduce the packages with just updates (it already happened with dapper point release, but he had to do it manually)
[03:49] <carlos> I guess, let's try first phone call after we have a draft
[03:50] <carlos> and see whether that's needed
[03:50] <carlos> anyway, if it's post 1.0, we could talk about it in the allhands meeting
[03:51] <SteveA> well, when are you moving the langpack production into production?
[03:51] <carlos> once we have this new spec implemented
[03:51] <SteveA> ok
[03:52] <SteveA> and that's not a 1.0 goal
[03:52] <SteveA> as far as I'm aware
[03:52] <carlos> right, it's post 1.0
[03:52] <SteveA> ok, then I think that's settled
[03:52] <SteveA> what do you think danilos ?
[03:52] <danilos> I'm fine with it
[03:52] <SteveA> ok, great.
[03:53] <SteveA> thanks for having this meeting.
[03:53] <danilos> and it has just moved to carbon
[03:53] <danilos> so performance should not be an issue in the near future
[03:53] <SteveA> we must just be careful about those timestamps
[03:53] <carlos> well, the performance issues were already fixed even before moving it to carbon
[03:53] <SteveA> until using a better system
[03:53] <SteveA> maybe add a checksum to the timestamp as an interim measure... ?
[03:54] <SteveA> or ensure that the mail sent to set it in the db
[03:54] <carlos> SteveA: don't worry, what I will try to do is to use the timestamp of the latest modified translation as the export timestamp
[03:54] <SteveA> is sent to pitti too
[03:54] <carlos> SteveA: that way we solve the mirror problem
[03:54] <SteveA> carlos:  you still need to store that somewhere
[03:54] <carlos> SteveA: inside the tarball exported
[03:54] <SteveA> so, there's still a timestamp coming out of the database
[03:54] <SteveA> across to pitti
[03:54] <carlos> we already have a timestamp.txt
[03:54] <SteveA> then back to you
[03:54] <SteveA> and back into the database
[03:54] <carlos> right
[03:55] <SteveA> and that is error-prone
[03:55] <carlos> I will add also the checksum as you suggested to be completely sure that nothing was lost
[03:55] <SteveA> so, consider this
[03:55] <carlos> those are more or less trivial tasks
[03:55] <SteveA> add a checksum to timestamp.txt
[03:56] <SteveA> and write a small script to read a timestamp.txt, check checksum and set it in the database
[03:56] <SteveA> if that will take under 2 hrs, then I'd say do that
[03:56] <SteveA> if more, then it is too much work
[03:56] <carlos> hmmm, why should I do the last part?
[03:56] <SteveA> what does that mean?
[03:56] <carlos> I still need to ask stuart to run the UPDATE statement
[03:56] <carlos> I don't have direct access to production database
[03:56] <SteveA> you can tell stuart to run that script
[03:57] <SteveA> or you can send stuart the timestamp.txt
[03:57] <SteveA> and tell stuart to run the script on it
[03:57] <carlos> I see
[03:57] <carlos> ok
[03:57] <SteveA> the point is to avoid having someone typing in a manual query
[03:57] <carlos> so we reduce the chance to introduce a typo
[03:57] <SteveA> copied from some email or text file
[03:57] <SteveA> but, it is worth doing this only if it will be quick
[03:57] <carlos> good plan
[03:57] <SteveA> don't bother if it will be more than 2 hours
[03:57] <carlos> ok
[03:58] <carlos> I think it would fit in a 2 hours slot
[03:58] <SteveA> because it is just an interim thing
[03:58] <SteveA> until the better system is designed and implemented
[03:58] <carlos> so pitti needs to send me the timestamp.txt file and that's all
[03:58] <SteveA> yes
[03:58] <carlos> ok
[03:59] <SteveA> >>> md5.new('timestamp').hexdigest()
[03:59] <SteveA> 'd7e6d55ba379a13d08c25d15faf2a23b'
[04:00] <carlos> SteveA: yeah, I already used md5 module while debugging this problem
[04:00] <carlos> to know which files change between current language packs
[04:00] <carlos> and a full export I forced
[04:00] <carlos> so don't worry
[04:01] <SteveA> something like that
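The interim measure discussed above (a checksum inside timestamp.txt plus a small script that verifies it before anything reaches the database) might be sketched as below. The two-line file layout and both function names are assumptions made for illustration; the log does not say how timestamp.txt is actually formatted. `hashlib.md5` is the current equivalent of the old `md5.new` call quoted above.

```python
import hashlib


def format_timestamp_file(timestamp):
    """Return file content: the timestamp, then its MD5 hex digest."""
    digest = hashlib.md5(timestamp.encode("utf-8")).hexdigest()
    return "%s\n%s\n" % (timestamp, digest)


def parse_timestamp_file(content):
    """Return the timestamp if its checksum matches, else raise ValueError."""
    lines = content.splitlines()
    if len(lines) != 2:
        raise ValueError("expected exactly two lines: timestamp and checksum")
    timestamp, digest = (line.strip() for line in lines)
    expected = hashlib.md5(timestamp.encode("utf-8")).hexdigest()
    if digest != expected:
        # A typo or truncation anywhere in the round trip ends up here.
        raise ValueError("checksum mismatch; refusing to use %r" % timestamp)
    return timestamp
```

The script run against production would call `parse_timestamp_file` on the file pitti sends back and only then issue a parameterized update, so the value is never retyped by hand.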
[04:01] <SteveA> dsfok, great
[04:01] <SteveA> dsfok, gre
[04:01] <SteveA> um
[04:01] <SteveA> lag
[04:01] <SteveA> great
[04:03] <carlos> lag + garbage :-P
[04:04] <carlos> SteveA: btw, now that we talk about this
[04:04] <carlos> there is another issue we should fix
[04:04] <SteveA> what's that?
[04:04] <carlos> related with Rosetta and dapper
[04:04] <carlos> and perhaps breezy
[04:04] <carlos> https://launchpad.net/bugs/58221
[04:05] <carlos> SteveA: pkgstriptranslations was stripping and feeding Rosetta with translations
[04:05] <carlos> for packages in the backports pocket
[04:05] <carlos> danilos: please, pay attention too, because we need to solve this issue
[04:06] <danilos> carlos: no problem, I am ;)
[04:06] <carlos> that means that we have some .pot files imported in dapper
[04:06] <carlos> that don't match what dapper has in its official release
[04:06] <SteveA> does this continue to happen?
[04:07] <SteveA> or is it just that we have some bad data now?
[04:07] <carlos> no, 30 minutes ago, the buildds for backports have been fixed
[04:07] <carlos> but we have some bad data 
[04:07] <SteveA> no to what?
[04:07] <carlos> that we need to fix
[04:07] <SteveA> you're saying that the problem is fixed
[04:07] <SteveA> but the bad data remains?
[04:07] <carlos> right
[04:08] <SteveA> does rosetta get told the pocket by the buildds?
[04:08] <carlos> Well, I think we can know that from our datamodel 
[04:08] <carlos> and is a protection we should implement to prevent such breakages in the future
[04:09] <SteveA> right, please file a bug on that
[04:09] <SteveA> or, add a task to that bug
[04:09] <carlos> I did already
[04:09] <carlos> let me look for it
[04:09] <SteveA> now, about the bad data.  what do we need to do?
[04:10] <carlos> https://launchpad.net/products/rosetta/+bug/58223
[04:10] <carlos> I think that we need to get the right .pot files and reimport them again
[04:10] <carlos> that should be enough
[04:11] <SteveA> how do you identify what pot files are needed?
[04:11] <carlos> because the .po imported didn't change anything in Rosetta other than what came from upstream, so nothing was broken from the Ubuntu translator point of view (we just added some new translations from upstream)
[04:12] <carlos> well, that's the problem I don't know exactly how to solve
[04:12] <carlos> I was thinking of asking for a list of packages in the backports pocket
[04:12] <carlos> and filter out the ones without translations
[04:12] <carlos> so I get that list
[04:12] <carlos> but I think this would be a manual process
[04:13] <carlos> on the other hand, I guess the number of packages should be low because Dapper was released only 3 months ago
[04:14] <SteveA> well...
[04:14] <SteveA> all those packages will also need to be rebuilt
[04:14] <SteveA> with non-stripped translations
[04:14] <carlos> breezy would be more complicated, but the number of templates imported is much lower than for dapper, so I don't think the problem would be too bad (we didn't get a full import for Hoary or Breezy)
[04:14] <carlos> SteveA: right
[04:15] <carlos> because they are without translations atm
[04:15] <SteveA> are there buildd logs we can use?
[04:16] <carlos> I guess, but I could check with Martin the right solution for this, because he would need to get that list too 
[04:17] <carlos> to rebuild the packages
[04:17] <SteveA> if there are buildd logs or records of this, that would be probably better
[04:17] <SteveA> maybe ask celso
[04:17] <SteveA> or infinity
[04:17] <carlos> there are buildd logs and I think we can see some output from pkgstriptranslations script
[04:17] <carlos> so it should be more or less doable
[04:18] <carlos> at least we use them from time to time to debug some problems with .pot regeneration
[04:18] <SteveA> so, what is the plan?
[04:18] <carlos> talk with Martin, just in case he already got the list of packages to rebuild
[04:20] <carlos> if he doesn't have such list, talk with celso/infinity to see if they could provide the logs using any script (they are available from launchpad/librarian) so we don't need to use the web interface for every single package in backports
[04:20] <carlos> and get the list of packages with translations using those logs
[04:21] <SteveA> ok.  let me know how it goes.
[04:21] <carlos> once we get the list, I think the only chance we have is to upload again one by one the .pot files for those packages
[04:21] <carlos> (we have them available at people.ubuntu.com)
[04:21] <SteveA> is there any risk of nuking work that has been done since?
[04:21] <carlos> no, we don't nuke anything
[04:21] <carlos> that work will appear as suggestions
[04:21] <SteveA> so, there is work to do :-(
[04:21] <carlos> what we will do is to 'hide' them for dapper
[04:21] <SteveA> especially without translation review
[04:22] <carlos> and show again some others that were removed when the backports one were imported
[04:22] <carlos> hmm
[04:22] <carlos> not really
[04:22] <carlos> I mean, they don't need to reactivate anything
[04:22] <SteveA> if there are translations that were made, which were confirmed, and which are now just suggestions
[04:22] <SteveA> then that's a step backwards
[04:22] <SteveA> and work needs to be done confirming the suggestions
[04:23] <carlos> it's just a matter of setting some strings as obsolete and remove the obsolete mark from others
[04:23] <carlos> what I mean by suggestions is that if that string that we are hiding in Dapper appears later in another distro release
[04:23] <carlos> it will appear as a suggestion
[04:23] <SteveA> we need to find out how many packages are involved.
[04:23] <SteveA> ok
[04:23] <SteveA> that's for strings that are in the backport
[04:23] <carlos> so we will reuse that work later as part of our translation memory
[04:23] <SteveA> but not in the one in main
[04:24] <carlos> hmm
[04:24] <carlos> the problem is that the backports have packages in main
[04:24] <carlos> oh, you mean with 'main' release?
[04:26] <SteveA> I'm concerned that after uploading the new POTs
[04:26] <SteveA> that the state of translations in there will overrule work people have done since those POTs were produced
[04:26] <carlos> no
[04:26] <carlos> any translation done
[04:27] <carlos> will remain selected
[04:27] <carlos> what we do is that, for instance
[04:27] <carlos> a backport for ktorrent includes a new string 'ktorrent rules'
[04:27] <carlos> once we revert to previous .pot
[04:27] <carlos> that string will not appear anymore in dapper's imports
[04:27] <SteveA> consider this
[04:28] <SteveA> week 1: translation done on ktorrent
[04:28] <carlos> just because it doesn't belong to dapper
[04:28] <SteveA> week 2: dapper POT produced
[04:28] <SteveA> week 3: more translation done on ktorrent
[04:28] <SteveA> week 4: backport built
[04:28] <SteveA> week 5: we fix problem by uploading the dapper POT
[04:28] <SteveA> have we lost the work done in week 3?
[04:28] <carlos> no
[04:29] <carlos> we will be at that exact status
[04:29] <carlos> my point was that
[04:29] <carlos> week 4.5: translated something new from backport
[04:29] <danilos> afai get this, we might only have some additional translations which belong in backports, but these won't be used
[04:30] <carlos> in that case, those new translations will be hidden with the .pot change, nothing else
[04:30] <danilos> right, week 4.5 :)
[04:30] <carlos> danilos: ;-)
[04:30] <SteveA> ok
[04:30] <SteveA> that was my concern.  if you're confident that's not an issue, then that's good
[04:30] <carlos> but we don't remove them so they would appear as suggestions for Edgy if we publish the same version that the backport had
[04:31] <carlos> that was my point
[04:31] <carlos> sorry if that introduced some misunderstandings
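The week-by-week scenario above can be played through with a toy model. This is a sketch of the behaviour carlos describes, not Rosetta's code: the class, the `visible` helper and every string except 'ktorrent rules' are hypothetical. The assumption being illustrated is that importing a .pot only changes which strings are active for a release, while translations stay in a shared store.

```python
class ReleaseTemplate:
    """The set of msgids currently shown for one distro release."""

    def __init__(self, msgids):
        self.active = set(msgids)

    def import_pot(self, msgids):
        # Importing a template only changes which msgids are active;
        # it never deletes stored translations.
        self.active = set(msgids)


def visible(template, memory):
    """Translations shown for a release: stored ones whose msgid is active."""
    return {m: t for m, t in memory.items() if m in template.active}


memory = {}                                   # shared translation store
memory["Open file"] = "Abrir archivo"         # week 1: translation work
dapper = ReleaseTemplate(["Open file", "Quit"])             # week 2: POT
memory["Quit"] = "Salir"                      # week 3: more translation
dapper.import_pot(["Open file", "Quit", "ktorrent rules"])  # week 4: backport
memory["ktorrent rules"] = "ktorrent mola"    # week 4.5: backport string
dapper.import_pot(["Open file", "Quit"])      # week 5: dapper POT again

# A later release that ships the backported version sees the stored work.
edgy = ReleaseTemplate(["Open file", "ktorrent rules"])
```

In this model `visible(dapper, memory)` still contains the week 1 and week 3 work after the fix, and the week 4.5 translation is hidden for dapper but remains available to `edgy`.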
[04:32] <SteveA> ok
[04:32] <SteveA> I have a call now.  let me know how the discussion with various people goes.
[04:32] <SteveA> thanks
[04:33] <carlos> SteveA: you are welcome
[04:33] <carlos> danilos: let me have a 15 minutes break and then, we could start our next meeting. Is that ok for you?
[04:33] <danilos> carlos: sure
[04:34] <carlos> so let's talk at 16:45 our time
[04:34] <danilos> carlos: I'd want to drop by store as well, so I wonder if I should do that first as well?
[04:34] <danilos> (but 10mins is not enough for that ;)
[04:34] <danilos> carlos: so how about 17h?
[04:34] <carlos> ok
[04:34] <danilos> great
[04:34] <carlos> 17h works for me
[04:35] <carlos> see you later
[04:35] <danilos> later; SteveA, carlos, thanks for bringing these issues up, even if I didn't have much to say on them :)
[04:35] <carlos> danilos: you are welcome ;-)
[04:36] <carlos> danilos: at least you should know about those issues
[04:36] <danilos> carlos: indeed
[05:03] <carlos> danilos: hi, should we have the meeting here?
[05:03] <danilos> carlos: sure
[05:03] <danilos> carlos: so, lets start with TranslationImport thing
[05:04] <danilos> did you have a chance to take a look at the very drafty-spec, and more importantly, to think about it?
[05:04] <danilos> my idea is to create a simple interface which will provide all data we need, *without* any database stuff
[05:05] <danilos> the reasoning is that most of the database stuff is repeated (as experienced developing xpi import)
[05:05] <carlos> yeah I saw your document
[05:05] <carlos> but I'm not completely sure of how do you plan to do it...
[05:06] <carlos> I know the idea
[05:06] <danilos> well, I haven't written anything about implementation
[05:06] <carlos> to have an object with a single file as an input
[05:06] <carlos> and n files as output
[05:06] <danilos> and that's why I want to discuss it with you and have a preimplementation call with a reviewer
[05:07] <carlos> I think that the basic idea is good enough to work on it
[05:07] <danilos> my current idea is as written above: accept path/content in constructor, and produce something like a list of templates with all the needed data
[05:07] <danilos> specifically, I would make TranslationImport.templates a dict keyed by potemplate name
[05:07] <carlos> but I would like to know how do you plan to deal with the import queue (this changes it a lot)
[05:08] <danilos> well, I'd go for minimal changes in import queue
[05:08] <danilos> when there is only a single POT/PO being imported, we'd have the same thing we have now
[05:08] <carlos> sure
[05:08] <carlos> but most powerful part of the import queue
[05:08] <danilos> when there are more than one, we fully expect imported file to list all the templates it wants to go into
[05:08] <carlos> are tarball imports from Ubuntu packages
[05:08] <carlos> with multiple .po and .pot files
[05:09] <carlos> and the guessing code to decide where that should be imported
[05:09] <danilos> indeed, and no reason to abandon that
[05:09] <danilos> I'll just move that to separate TranslationImport class
[05:10] <danilos> the only problem I can see is that we'd have to approve/disapprove the entire tarball
[05:10] <carlos> so you will need to open tarballs every single time we scan all entries in the queue?
[05:11] <carlos> that's not possible, we should be able to reject or block single files within a tarball just like we do right now
[05:11] <danilos> hum, if I want to provide more details, yes
[05:11] <carlos> so I guess some extra metadata would be needed
[05:11] <carlos> hmmm
[05:11] <danilos> well, the other, probably better option is to also allow separation as it's done currently
[05:12] <danilos> for tarballs, that is
[05:12] <carlos> well, the code that guesses where every entry should be imported needs to check the paths of every single file
[05:12] <danilos> so, eg. GSI files would present themselves as separate queue entries as well
[05:12] <danilos> and we can link to the same librarian file for all of the entries
[05:12] <carlos> so we would have more than one entry for a single GSI?
[05:13] <danilos> yeah, for different potemplates/languages
[05:13] <carlos> so we use 'path' field as the way to filter out the file from the tarball stored in the librarian?
[05:13] <danilos> that's right
[05:13] <carlos> hmm, I think I like that
[05:14] <danilos> the only thing we need to watch out for is not to delete file in librarian until all references to it are cleared ;)
[05:14] <carlos> that will not need too many changes in our current code
[05:14] <carlos> librarian handles that automatically
[05:14] <carlos> no entries are removed until there are no more references to it
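The idea agreed on above (several queue entries pointing at one stored tarball, with the 'path' field selecting the member) can be sketched with the standard tarfile module. The `QueueEntry` tuple, the dict standing in for the librarian and all file names are hypothetical; only the shape of the idea comes from the discussion.

```python
import io
import tarfile
from collections import namedtuple

# Hypothetical queue row: a reference to the stored file plus a member path.
QueueEntry = namedtuple("QueueEntry", "librarian_alias path")


def build_tarball(files):
    """Build an in-memory tarball from a {path: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def queue_entries_for(alias, tarball):
    """One entry per .po/.pot member, all sharing the same stored file."""
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    return [QueueEntry(alias, name) for name in sorted(names)
            if name.endswith((".po", ".pot"))]


def read_entry(entry, librarian):
    """Extract only the member named by entry.path, when it is needed."""
    data = librarian[entry.librarian_alias]
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        return tar.extractfile(entry.path).read()


# A dict stands in for the librarian: alias -> stored bytes.
librarian = {"alias-1": build_tarball({
    "po/foo.pot": b"template",
    "po/es.po": b"spanish",
    "README": b"not a translation file",
})}
entries = queue_entries_for("alias-1", librarian["alias-1"])
```

Each entry can then be approved, blocked or rejected on its own, and the tarball is opened only when a given file is actually imported.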
[05:14] <danilos> exactly, and we get pretty sophisticated management of even other things, we'd be able to move KDE langpack support to that, etc.
[05:15] <carlos> how's that?
[05:16] <danilos> well, a single kde-l10n-<LANG> will appear as several PO files in the queue entry; i.e. the same way as tarballs work now, just per-language, not per-template
[05:17] <danilos> and, we'd be able to approve "subcomponents" of files (eg. approve single language from GSI file, which may contain a number of languages)
[05:18] <carlos> I see
[05:18] <carlos> what I mean is that in this concrete case, kde-l10n handling will not change a lot
[05:18] <carlos> we will eat less disk space, but that's all
[05:18] <danilos> well, it won't change at all
[05:19] <carlos> phone...
[05:19] <danilos> except that code will be cleaner, imho ;)
[05:20] <carlos> I'm back, sorry
[05:20] <danilos> no problem
[05:21] <carlos> well, I'm not completely sure whether it would be really cleaner....
[05:21] <carlos> I mean, we still need code to handle the tarball extract
[05:21] <carlos> o well, the list of the tarball
[05:21] <carlos> s/o/or/
[05:21] <carlos> and the disk space needed will be reduced a lot
[05:21] <danilos> indeed, but there won't be things like hardcoding all the checks in translation_import_queue.py
[05:22] <carlos> how's that?
[05:22] <carlos> I know that's true for OO and FF
[05:22] <danilos> i.e. we currently check if path.endswith('.po') or path.endswith('.pot')...
[05:22] <danilos> and completely special-case kde stuff with another function
[05:22] <carlos> right
[05:23] <carlos> but there are other checks
[05:23] <carlos> that cannot be moved the way you plan
[05:23] <carlos> for instance
[05:23] <danilos> ok, let me rephrase that: instead of "cleaner", I should have said "more generalized"
[05:23] <carlos> GNOME tarballs mix two different layouts
[05:23] <carlos> one for the application with a po/foo.pot and several .po files inside that directory
[05:24] <carlos> and another for documentation with something/foo.pot and then subdirectories for the .po files
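The two layouts just described can be handled by one small matching rule: look for a .pot next to the .po, then one directory up. The sketch below is a guess at what such a rule could look like, not the guessing code in translation_import_queue; the function name and the example paths are hypothetical.

```python
import posixpath


def match_po_to_pot(paths):
    """Map each .po path to its template path, or None if ambiguous/missing.

    Application layout: po/foo.pot with po/es.po in the same directory.
    Documentation layout: help/foo.pot with help/es/es.po one level down.
    """
    pots_by_dir = {}
    for path in paths:
        if path.endswith(".pot"):
            pots_by_dir.setdefault(posixpath.dirname(path), []).append(path)

    matches = {}
    for path in paths:
        if not path.endswith(".po"):
            continue
        directory = posixpath.dirname(path)
        matches[path] = None
        # Try the .po file's own directory first, then its parent.
        for candidate in (directory, posixpath.dirname(directory)):
            pots = pots_by_dir.get(candidate, [])
            if len(pots) == 1:
                matches[path] = pots[0]
                break
    return matches


layout = [
    "po/evolution.pot", "po/es.po", "po/sr.po",
    "help/evolution-doc.pot", "help/es/es.po",
]
matched = match_po_to_pot(layout)
```

A rule like this depends only on the paths inside the tarball, which is why it has to see every file name before queue entries can be linked.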
[05:24] <danilos> yeah, but what's the problem with that?
[05:24] <carlos> anyway, even leaving the code to handle GNOME things, I agree now, it would be cleaner so we move KDE specific code to KDE tarballs
[05:25] <carlos> danilos: it cannot be moved outside translation_import_queue
[05:25] <carlos> that's all
[05:25] <danilos> I don't understand why not
[05:26] <danilos> afaics, I'd have TranslationImport.providesTemplates which would return a list of all the templates a file provides, and then that would reference all translations for those templates
[05:27] <carlos> Hmmm, I see
[05:28] <danilos> and don't forget that we also have default_file_format to determine which importer to use
[05:28] <carlos> so you mean that you 'extract' a tarball only when you know exactly where its entries will be imported?
[05:28] <carlos> right, in fact I was thinking on default_file_format and it doesn't solve the problem with GNOME layout
[05:28] <danilos> well, you list the filenames when you create import queue entries
[05:29] <carlos> I'm a bit lost because I see some holes in the process you describe
[05:29] <carlos> could you please describe step by step
[05:29] <danilos> actually, I don't really care about space-savings, so I may also extract them right away, just like you do with current tarball imports
[05:29] <carlos> what would happen when we import, for instance, evolution 2.18 translations?
[05:30] <danilos> especially if I am going to lose a lot of speed with that approach
[05:30] <carlos> so we get a tarball and the reference to the sourcepackage and distrorelease
[05:30] <danilos> TranslationImport detects there is only one template in there, and creates a single template, and a bunch of pofile import queue entries
[05:31] <carlos> let's see what do you have in mind, and then, decide whether we need to untar the entries or not (I don't think we need to untar it, just extract it when the .po file is actually used)
[05:31] <carlos> danilos: doesn't it have doc + application .pot files?
[05:31] <danilos> (as for untarring, that can be handled independently of TranslationImport design)
[05:31] <carlos> sure
[05:32] <danilos> well, if it has both doc & application .pot files, then it will provide two entries for templates, along with a bunch of pofile entries for each of them
[05:32] <danilos> now, you're probably thinking of automatic po file matching?
[05:33] <danilos> especially because evolution POT file will probably need a rename
[05:33] <carlos> no, I just want to see the full process, not thinking yet on specific details
[05:33] <carlos> from what you just described to me
[05:33] <carlos> is the same thing we do right now
[05:33] <danilos> and we should establish the relation between template and pofiles once they are added to queue
[05:34] <carlos> get the tarball and fill the queue with all .pot and .po files included
[05:34] <carlos> ok so until this step, the code would be the same, perhaps moving it to other parts of our tree, but same procedure
[05:34] <danilos> well, yes, that's the point; it would work mostly the same for what we have already, yet allow easy extending to what we don't have (like Firefox, OOo, KDE PO, Zope...)
[05:34] <danilos> I don't really see much wrong in the current procedure, to be honest
[05:35] <danilos> except that I would move it outside the import queue code and generalize it
[05:35] <carlos> so you would have a kind of adaptor
[05:36] <carlos> for tarballs that will do what we do atm
[05:36] <carlos> another for .po and .pot files that do nothing
[05:36] <carlos> (if it's not a KDE PO or Zope one
[05:36] <carlos> )
[05:36] <danilos> the only problem I see with the current implementation is that I need to modify like 5-6 places and add a couple of lines on each of them, and when I develop a new importer, I duplicate much of the code from poimport.py
[05:36] <danilos> that's right
[05:36] <carlos> well, even if it's a KDE PO or Zope one, as we agreed the format change will be done on import time not as a .po file 
[05:37] <carlos> another for OOo that will split the file in smaller chunks
[05:37] <danilos> well, for KDE PO file, we probably need to descend from POParser only
[05:37] <carlos> etc
[05:37] <carlos> etc
[05:37] <danilos> that's right, but without duplication of database object creation
[05:37] <carlos> sure 
[05:38] <carlos> all them inheriting from a common class
[05:38] <danilos> for Firefox, I ended up copying most of the import_po stuff, and just replacing relevant parts
[05:38] <danilos> that's right
[05:38] <carlos> and we only write the method to do the split
[05:38] <carlos> ok
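The adaptor idea discussed above (one importer per upload format, all inheriting from a common class and only writing the method that does the split) could be sketched roughly as follows. This is only an illustration: the class names `TranslationImportBase`, `SingleFileImport` and `TarballImport`, the `split()` and `queue_entries()` methods, and the dict-shaped queue entries are assumptions made for this sketch, not the actual TranslationImport interface.

```python
import io
import tarfile
from abc import ABC, abstractmethod


class TranslationImportBase(ABC):
    """Hypothetical common base: the shared queue-filling step lives here once."""

    def __init__(self, filename, content):
        self.filename = filename
        self.content = content

    @abstractmethod
    def split(self):
        """Return a list of (path, content) pairs found in the upload."""

    def queue_entries(self):
        # Shared step: every format ends up as the same kind of queue entry.
        return [{"path": path, "content": data} for path, data in self.split()]


class SingleFileImport(TranslationImportBase):
    """A lone .po or .pot file: there is nothing to split."""

    def split(self):
        return [(self.filename, self.content)]


class TarballImport(TranslationImportBase):
    """A tarball: keep only the .pot and .po members."""

    def split(self):
        entries = []
        with tarfile.open(fileobj=io.BytesIO(self.content)) as tar:
            for member in tar.getmembers():
                if member.isfile() and member.name.endswith((".pot", ".po")):
                    entries.append((member.name, tar.extractfile(member).read()))
        return entries
```

With this shape, supporting a new format would mean writing one more subclass with its own `split()`, without copying the code that creates the entries.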
[05:38] <danilos> well, all of that would be part of TranslationImport interface, that was my idea
[05:38] <danilos> so, how do you feel about that?
[05:39] <carlos> I see an easy optimisation to reduce wasted disk space, but let's leave that for later
[05:39] <carlos> That solves the problem with code duplication that you talk about
[05:39] <carlos> but
[05:40] <danilos> ?
[05:40] <carlos> I still don't see how you plan to move the code that guesses the POFile and POTemplate where an entry should be imported out of translation_import_queue (we should start thinking about renaming those tables...)
[05:40] <carlos> I see that when you extract an entry
[05:41] <carlos> you can try to guess it and link it 
[05:41] <carlos> but
[05:41] <danilos> well, it's easy to have TranslationImport.template['evolution'] .guessed_template property
[05:41] <carlos> as you already pointed out, when you need to do a translation domain change, the .pot file will not find a link
[05:41] <carlos> sure
[05:41] <danilos> and translation import queue will use the same method as now: you can override it
[05:41] <carlos> and you call that before creating the entry in the queue, right?
[05:42] <danilos> that's right
[05:42] <carlos> ok
[05:42] <carlos> let's see it this other way
[05:42] <carlos> let's say you have a layout like gtk+
[05:42] <danilos> ok
[05:42] <carlos> where they use the package version as part of the path
[05:42] <carlos> so you have something like gtk+-2.10.3/po/gtk20.pot
[05:43] <carlos> and we have imported 2.10.2
[05:43] <carlos> the automatic matching will not work here
[05:43] <danilos> hum, I don't follow
[05:43] <carlos> so you will not be able to do that link
[05:43] <carlos> we had this problem with Edgy
[05:43] <carlos> to link the new .pot file
[05:43] <danilos> if we have imported 2.10.2, where do we get gtk+-2.10.3 from?
[05:43] <carlos> is the new one
[05:44] <carlos> that we are handling
[05:44] <danilos> ah, ok
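The gtk+ case above can be reproduced in a few lines: when the package version is part of the path, an exact path comparison between the already imported template and the new one fails. The `strip_version` helper and its regular expression are purely hypothetical and are not something proposed in this discussion; they only show what a path comparison that ignores the version would have to do.

```python
import re

# Hypothetical pattern: a trailing "-<digits and dots>" on the top-level directory.
_VERSION_SUFFIX = re.compile(r"-\d+(?:\.\d+)*$")


def strip_version(path):
    """Drop a version suffix from the first path component, if there is one."""
    top, sep, rest = path.partition("/")
    return _VERSION_SUFFIX.sub("", top) + sep + rest


imported = "gtk+-2.10.2/po/gtk20.pot"
incoming = "gtk+-2.10.3/po/gtk20.pot"

# Exact matching, as described in the log, does not find the existing template.
exact_match = imported == incoming
# Comparing with the version removed would find it (assumed normalisation).
normalised_match = strip_version(imported) == strip_version(incoming)
```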
[05:44] <danilos> we have sourcepackage and distrorelease here, right?
[05:45] <carlos> we do TranslationImport.template['gtk20']  .... this has a problem, the translation domain is not always the same as the .pot filename so we look for pot files based on its path
[05:45] <carlos> yeah, you know sourcepackagename and distrorelease
[05:45] <danilos> that will appear in the queue as a separate entry, just like it appears now
[05:46] <danilos> and gtk20-properties will as well
[05:46] <danilos> i.e. I am still not getting what you are aiming at
[05:47] <carlos> ok, but without a link to a POTemplate or POFile, right?
[05:47] <danilos> that's right
[05:47] <carlos> just like we do right now
[05:47] <carlos> ok
[05:47] <carlos> what I do atm is
[05:48] <carlos> go to Edgy's gtk20 template
[05:48] <carlos> and change the path
[05:48] <carlos> optionally, I could link the .pot file with this POTemplate that I just fixed
[05:48] <carlos> ok?
[05:49] <danilos> ok, so you want us to automate that as well?
[05:49] <carlos> no, it's not my point, we should do it, but it's not related to this discussion
[05:49] <carlos> now, what happens with the .po files?
[05:49] <carlos> with current code, the .po files will be visited again and this time we will be able to link them with POFiles
[05:49] <carlos> because we find now a POTemplate in the same path
[05:50] <danilos> well, that's the point I had above: we need to link them to this template queue entry once we create them as well
[05:50] <carlos> hmm I see your point
[05:50] <danilos> so, instead of doing "path matching" in queue entry code, we'd do that while adding queue entries
[05:50] <danilos> which means that we might need another column in translationimportqueue
[05:51] <danilos> rather, translationimportqueueentry ;)
[05:51] <carlos> like template_entry?
[05:51] <danilos> something like that, yes
[05:51] <danilos> and then we can just directly approve them once template gets imported
[05:52] <carlos> I see
[05:52] <carlos> would you note that we need a trigger or something to check
[05:52] <carlos> that the entry pointed by template_entry should have its template_entry set to NULL ?
[05:53] <danilos> sure, I'll summarize this entire discussion in the spec when we are done, so I'll note that as well
[05:53] <carlos> either that, or use an external table to represent this relationship so we don't need triggers
[05:53] <carlos> Stuart should tell you the best solution
[05:53] <carlos> ok
[05:53] <carlos> I agree more or less with this solution
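A minimal sketch of the `template_entry` idea, assuming in-memory objects instead of database rows. The `QueueEntry` class, the status strings and both helper functions are invented for illustration. The check in `link_to_template` stands in for the trigger mentioned above: the entry pointed to by `template_entry` must itself have `template_entry` unset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueEntry:
    path: str
    status: str = "NEEDS_REVIEW"  # hypothetical status names
    template_entry: Optional["QueueEntry"] = None


def link_to_template(entry, template):
    """Point `entry` at the queue entry of its template."""
    # Invariant from the discussion: a template entry has no template_entry.
    if template.template_entry is not None:
        raise ValueError("%s is not a template entry" % template.path)
    entry.template_entry = template


def approve_linked(entries, template):
    """Once the template is imported, approve every entry linked to it."""
    template.status = "IMPORTED"
    for entry in entries:
        if entry.template_entry is template:
            entry.status = "APPROVED"
```

Whether this constraint ends up as a trigger or as a separate table for the relationship is left open in the log; the sketch does not take a side.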
[05:54] <carlos> but let's talk about some corner cases
[05:54] <carlos> ok?
[05:54] <danilos> sure
[05:54] <carlos> (I think this one is easy)
[05:54] <carlos> we get a tarball with a set of .po files and no .pot files at all
[05:54] <carlos> that link will not be done
[05:54] <carlos> which is fine
[05:55] <carlos> now, we get a new version of that tarball that includes the .pot file
[05:55] <carlos> your code should be able to detect the duplicated .po files and update those rows
[05:55] <carlos> I think this corner case is not a big deal
[05:55] <danilos> that's right
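The first corner case (a tarball with only .po files, followed by a newer tarball that also contains the .pot file) could be handled by updating existing rows instead of adding duplicates. The sketch below assumes a queue kept as a dict keyed by path and links each .po file to a template in the same directory; `fill_queue` and the entry layout are hypothetical.

```python
import posixpath


def fill_queue(queue, upload_paths):
    """Add or update entries for one upload; `queue` maps path -> entry dict."""
    templates = [p for p in upload_paths if p.endswith(".pot")]
    for path in upload_paths:
        # setdefault reuses the existing row when the path was seen before.
        entry = queue.setdefault(path, {"path": path, "template": None})
        if path.endswith(".po"):
            directory = posixpath.dirname(path)
            same_dir = [t for t in templates if posixpath.dirname(t) == directory]
            if same_dir:
                entry["template"] = same_dir[0]
    return queue
```

Running it once with only the .po paths leaves the entries unlinked; running it again with the template included updates the same entries in place.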
[05:55] <carlos> ok, next one
[05:56] <carlos> we implement a new layout support 
[05:56] <carlos> for something that we already got imported into the queue
[05:56] <carlos> so the .pot and .po files aren't linked
[05:56] <danilos> bullocks, DELETE those, and reimport ;)
[05:56] <carlos> that's not possible, a reimport requires a new Ubuntu package release
[05:56] <danilos> we can also try to handle that using the same librarian reference and paths
[05:57] <carlos> the DELETE is not needed, a new import will update those entries
[05:57] <carlos> with that, you will need then to store references to the tarball
[05:57] <danilos> if we go without extracting, but if we extract everything, then the only thing we can work with are paths
[05:58] <carlos> instead of the content of the concrete entry
[05:58] <carlos> ok
[05:58] <carlos> so no extracting++
[05:58] <carlos> after handling the queue
[05:58] <carlos> you will need to revisit every entry in the queue grouped by its librarian reference
[05:58] <carlos> and try again to detect its potemplate and pofile
[05:59] <danilos> that's right, and make sure that logic for linking those in TranslationImport is separated, so it can be run just on paths
[05:59] <carlos> hmmm
[06:00] <carlos> I'm not sure about that last thing
[06:00] <carlos> I mean, you are interested only in librarian links
[06:00] <danilos> well, how would we otherwise do the matching after the import?
[06:00] <danilos> so you think redoing the import would be better?
[06:00] <carlos> once you do that, you can handle it the same way a new import is handled
[06:00] <danilos> ok, sure, it's simpler
[06:01] <carlos> because you already have code to deal with already existing entries
[06:01] <carlos> otherwise the complexity would be higher than needed, isn't it?
[06:01] <carlos> we are talking about opening a tarball and get the list of entries inside it
[06:02] <carlos> but we are not fetching its content
[06:02] <danilos> that's right, I agree
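Revisiting the queue grouped by librarian reference, and listing what is inside a tarball without fetching the content of its members, might look like the following. The `librarian_ref` key and both function names are assumptions for this sketch. Note that for a compressed tarball the stream still has to be read to find the member headers, even though no file is extracted.

```python
import io
import tarfile
from collections import defaultdict


def group_by_librarian_ref(entries):
    """Group queue entries (dicts) by the tarball they came from."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["librarian_ref"]].append(entry)
    return dict(groups)


def list_translation_paths(tarball_bytes):
    """Return the .pot/.po member names; member contents are never extracted."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes)) as tar:
        return [name for name in tar.getnames() if name.endswith((".pot", ".po"))]
```

The resulting list of paths can then go through the same code path as a fresh import, which is the simplification agreed on above.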
[06:02] <carlos> if you see it as a performance problem, we can do what you suggest
[06:03] <danilos> it shouldn't be too much of a problem, I believe
[06:03] <carlos> also, to prevent long delays in the queue
[06:03] <carlos> I think we should note last time we checked a set of entries
[06:03] <carlos> so we try to guess the same entries once per day
[06:04] <carlos> not sure if you understand what I mean
[06:04] <danilos> sure
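The "once per day" rule could be as small as a timestamp comparison. In this sketch `needs_recheck` and `RECHECK_INTERVAL` are hypothetical names, and `last_checked` is assumed to be `None` for a set of entries that has never been checked.

```python
from datetime import datetime, timedelta

RECHECK_INTERVAL = timedelta(days=1)


def needs_recheck(last_checked, now):
    """True when a set of entries was never checked or was checked over a day ago."""
    return last_checked is None or now - last_checked >= RECHECK_INTERVAL
```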
[06:04] <carlos> ok, let me see if I find any other corner case based on what we found already....
[06:05] <danilos> we also need to do the guessing iff we have a new template import entry
[06:06] <carlos> for product imports?
[06:07] <danilos> for tarballs and stuff
[06:07] <carlos> other than products
[06:07] <carlos> what's the point for that?
[06:07] <carlos> products rely on manual imports
[06:08] <carlos> so we could get a potemplate and later a tarball with languages
[06:08] <carlos> well, the other way, first translations and later a template
[06:09] <danilos> not sure I understand you
[06:09] <carlos> if we get a tarball without template
[06:09] <danilos> I meant that apart from checking entries only once a day, we also don't need to do that if there was no template
[06:09] <carlos> and later a new upload
[06:09] <danilos> ah, right
[06:09] <carlos> oh, I see
[06:09] <carlos> sorry, I misunderstood you :-P
[06:10] <danilos> no problem ;)
[06:10] <carlos> but, you need to do it anyway
[06:10] <danilos> right
[06:10] <carlos> just in case we already had a template imported
[06:10] <danilos> anyway, who do you suggest that I ask to be my pre-implementation call reviewer? :)
[06:10] <carlos> phone, sorry
[06:18] <carlos> ok, back
[06:18] <carlos> hmm
[06:19] <carlos> well, SI guess james or Bjorn would fit
[06:19] <carlos> grr
[06:19] <carlos> weel, Steve usually suggests james or Bjorn for this kind of things
[06:19] <carlos> s/weel/well/
[06:19] <carlos> I'm a bit dyslexic today...
[06:20] <danilos> ok ;)
[06:20] <danilos> I'll probably ask Bjorn, since I haven't worked with him so far ;)
[06:22] <carlos> ok
[06:22] <carlos> but prepare a spec update with what we just talked about
[06:22] <carlos> so he can read it before the call
[06:22] <carlos> ok?
[06:23] <carlos> let's take another 10 minutes break and we could start with our other meeting about view restructuring 
[06:23] <carlos> ok?
[06:24] <danilos> carlos: anyway, we also planned to discuss your view restructuring work, right?
[06:25] <danilos> sure
[06:25] <carlos> :-)
[06:25] <carlos> I see you have lag...
[06:26] <carlos> let's meet again at 18:35 local time, ok?
[06:26] <danilos> yeah, that's fine
[06:40] <carlos> so
[06:40] <carlos> danilos: meeting time? (again...)
[06:40] <danilos> carlos: yay ;)
[06:41] <carlos> danilos: You would want to read the email from kiko
[06:41] <carlos> he sent it 31st August with the subject: "POFileTranslationView/POMsgSetView cleanup guide, was Re: Status of a few bugs"
[06:42] <danilos> ok, sure
[06:43] <danilos> ok, it's already marked as "important" in my mail folder ;)
[06:43] <carlos> danilos: the important bits are the ones related with pomsgset.py
[06:43] <carlos> :-P
[06:46] <danilos> ok, I've read the msgset.py bits
[06:48] <danilos> carlos: ping ;)
[06:48] <carlos> ok
[06:48] <carlos> sorry, I wasn't taking care of this channel O:-)
[06:48] <carlos> so
[06:48] <danilos> weren't you talking about not using several view classes?
[06:49] <carlos> well
[06:49] <carlos> the suggestion by kiko is not exactly that
[06:49] <danilos> and cleaning up _*_submissions is basically what bug 30602 was all about ;)
[06:49] <carlos> at the moment, the view for POMsgSet pages is used from POFile's view
[06:49] <carlos> and that's broken
[06:50] <danilos> ah, ok
[06:50] <danilos> I see your point
[06:50] <carlos> because we need to know whether we are using it from a POFile view or a POMsgSet view directly
[06:50] <carlos> in this case
[06:50] <carlos> the shared bits are moved to a different view
[06:50] <danilos> so, the plan is to use POMsgSetView from both?
[06:50] <carlos> yeah
[06:51] <carlos> but without using that view as a zope one
[06:51] <danilos> ok, sounds much better
[06:51] <carlos> so it's not linked with any web page
[06:51] <carlos> it just has information
[06:51] <carlos> that the POFile and POMsgSet views will use
[06:51] <danilos> yeah, understood
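The planned split can be pictured like this: `POMsgSetView` becomes a plain helper that only holds per-message information, and each page view builds its own navigation, so the helper never needs to know which page is using it. The class names come from the discussion; the constructors, attributes and the `navigation_label` method are assumptions made for this sketch and are not taken from the real code.

```python
class POMsgSetView:
    """Plain helper, not registered as a Zope view: per-message data only."""

    def __init__(self, sequence, msgid, translation):
        self.sequence = sequence
        self.msgid = msgid
        self.translation = translation

    @property
    def is_translated(self):
        return bool(self.translation)


class POFileTranslationView:
    """Page view for a set of messages."""

    def __init__(self, messages):
        self.msgset_views = [
            POMsgSetView(number, msgid, text)
            for number, (msgid, text) in enumerate(messages, start=1)
        ]

    def navigation_label(self):
        # Navigation belongs to the page view, not to the shared helper.
        return "batch of %d messages" % len(self.msgset_views)


class POMsgSetPageView:
    """Page view for a single message."""

    def __init__(self, sequence, msgid, translation):
        self.msgset_view = POMsgSetView(sequence, msgid, translation)

    def navigation_label(self):
        return "message %d" % self.msgset_view.sequence
```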
[06:52] <carlos> I still think that POFileTranslationView and POMsgSetPageView would use a common class from where they would inherit 
[06:52] <carlos> because they would share a lot of code
[06:53] <carlos> because POFileTranslationView is for a set of messages and POMsgSetPageView is just for a single message
[06:53] <carlos> but I guess we could leave it for later
[06:53] <danilos> Yeah, I know
[06:54] <danilos> the thing, as I see it, is that it would mostly be about template sharing
[06:54] <carlos> so POFileTranslationView and POMsgSetPageView will have just the needed bits to render the web page
[06:54] <carlos> well, we already have that done
[06:54] <danilos> i.e. it would be mostly the same template for processing data from POMsgSetView
[06:54] <carlos> we are already sharing the template
[06:54] <carlos> and that doesn't need any change
[06:54] <carlos> to go with this solution
[06:55] <danilos> ok, so what code is useful for both, yet can't be moved to POMsgSetView?
[06:55] <carlos> the main problem was with navigation links, that we had to check whether we were being used from a POFile view or a POMsgSetView directly
[06:55] <carlos> to generate them
[06:56] <danilos> right, but I am wondering what would need sharing? navigation would be separate, so that's cool ;)
[06:56] <carlos> tabindex generation, general statistics info (the one at the end of the form)
[06:57] <carlos> the alternative language selector code
[06:57] <carlos> more or less I think that's all, but anyway, we can leave it duplicated as it's atm and think on the inheritance later
[06:57] <danilos> hum, some of them might belong in other classes (like statistics being part of POFile, no?)
[06:58] <danilos> sure, I don't see much use of inheritance right away
[06:58] <carlos> could be
[06:58] <carlos> but don't worry, I don't want to handle that right now
[06:58] <danilos> ok
[06:59] <danilos> so, is there anything specific you want to discuss?
[06:59] <carlos> we just need POMsgSetView to handle all information that is part of the small section of a message
[06:59] <carlos> whether you think this is a good thing to do  ;-)
[06:59] <danilos> ah, ok :)
[06:59] <carlos> I think this solves some problems that our current model has
[07:00] <carlos> for instance
[07:00] <danilos> well, I believe it's a very good thing, it will make it even more clear for anyone delving into code in the future as well ;)
[07:01] <danilos> i.e. I know it wasn't very simple for me to track down all the relations and dependencies this way, with views being used from views, etc. :)
[07:01] <carlos> the link problem
[07:01] <carlos> we currently solve that adding a flag to the POMsgSetView class to note if it comes from a POFile
[07:01] <danilos> and at the same time, you will be doing the _*_submissions() cleanup, probably reducing the number of queries as well ;)
[07:01] <carlos> to give one kind of links or others...
[07:01] <carlos> well
[07:01] <carlos> not really
[07:02] <danilos> yeah, which is a kludge, agreed
[07:02] <carlos> or not sure...
[07:02] <carlos> I should not change anything but the restructuring
[07:02] <carlos> anything else should be deferred to another branch
[07:02] <carlos> (I don't mind to take care of that task anyway, but not as part of that branch)
[07:03] <danilos> ok, maybe you won't, but then I will later on, and it will be simpler for me :)
[07:03] <carlos> yeah, that's the goal
[07:03] <carlos> so do you think this solution would require less time to fix that bug?
[07:03] <carlos> could you quantify it for me?, you don't need to be precise ;-)
[07:04] <carlos> because this would be another argument to do this now instead of post 1.0
[07:04] <carlos> :-)
[07:05] <danilos> well, it will probably drop from 3-6 days of active work to 2-4 days, not really sure
[07:05] <danilos> the thing is that it requires some optimization work and profiling, and you never know how long that will take (just remember our edgy migration work in london ;)
[07:06] <carlos> I know, but that's enough
[07:06] <carlos> I think I could do the restructuring in around 4 hours + test fixes + move from POST to GET
[07:07] <carlos> I guess that in 1 day and a half of work would get that done
[07:07] <carlos> I just need to move code around
[07:07] <carlos> and change the POST to be a GET
[07:07] <carlos> + fix a lot of tests
[07:08] <carlos> do you think this is something optimistic or realistic?
[07:08] <danilos> realo-optimistic ;)
[07:08] <carlos> in fact, add half day to the mix to cover me from being a bit lazy...
[07:08] <carlos> so 2 days
[07:08] <danilos> sure, sounds reasonable
[07:09] <danilos> so, I guess with what we win (nicer code, altogether maybe 1 day more work), I believe it's worthy it
[07:09] <danilos> s/worthy/worth/
[07:09] <carlos> well, 1 day more work only related with your bug...
[07:10] <carlos> with TranslationReview we have less extra work
[07:10] <carlos> but it's hard for me to estimate what we would save
[07:10] <carlos> I think that we would save nothing, just clear code
[07:10] <carlos> s/clear/clearer/ 
[07:11] <carlos> which could save more time in the future...
[07:11] <danilos> yeah, right
[07:11] <danilos> so, if we stick to our time estimates, I believe it's the way to go
[07:11] <danilos> what do you think?
[07:11] <carlos> I think so, yes
[07:12] <carlos> I'm going to write to Steve and the list about this
[07:12] <carlos> in fact
[07:12] <carlos> since I'm not going to change anything there, I'm not sure whether another meeting with a reviewer would be necessary
[07:13] <carlos> the bigger part of this is part of your bug...
[07:13] <carlos> and I'm not going to do it at this stage
[07:13] <carlos> but later
[07:13] <carlos> what do you think?
[07:13] <danilos> yeah, sounds reasonable
[07:14] <carlos> ok, thanks
[07:14] <carlos> Is there anything else we should talk about?
[07:14] <danilos> and you need to be careful with the tests and GET/POST switch
[07:15] <carlos> I already did such change for the message filtering code, so don't worry, I felt the pain already....
[07:15] <carlos> we were doing POST for them until some months ago
[07:16] <danilos> ok, great :)
[07:16] <danilos> I think that's all, enough meetings for today :)
[07:17] <carlos> yeah
[07:17] <carlos> today was pretty intense with meetings...
[07:17] <carlos> I didn't write any code :-(
[07:18] <carlos> thanks for your input
[07:19] <carlos> danilos: do you need anything from me?
[07:19] <danilos> no, that's all; enjoy your evening ;)
[07:20] <carlos> same for you
[07:20] <danilos> I am out myself, will be back later for some more action though :)
[07:20] <carlos> ok
[07:20] <carlos> cheers!