[03:03] <pitti> ok, so is everybody familiar with the current structure of locales?
[03:04] <Kamion> ish
[03:05] <pitti> ok, let's summarize it for the others to read
[03:05] <pitti> previously, package 'locales' shipped all locale definitions; /etc/locale.gen selected the ones which were compiled and available in the system
[03:05] <pitti> now, 'locales' only ships helper scripts, keyboard stuff, etc., but not the locale definitions any more
[03:06] <pitti> and the locale definitions are shipped in the respective langpacks
[03:06] <pitti> langpack postinst/prerm generates/removes the compiled locales now
[03:07] <pitti> this way we can update them after release and, more important, handle them with Rosetta
[03:07] <pitti> that was the way as described by LocalesThatDontSuck
[03:07] <pitti> unfortunately it turns out that they suck differently now
[03:07] <pitti> oh, mvo, you were away
[03:07] <pitti> jbailey, doko, Mithrandir: please ack that you read the summary
[03:07] <mvo> pitti: can you /msg me the scrollback?
[03:08] <jbailey> pitti: ack
[03:08] <doko> pitti: how often is the locale data updated (from your experience)
[03:08] <jbailey> doko: Do you mean from Belocs?
[03:08] <pitti> in breezy we did it a couple of times
[03:08] <pitti> but we didn't ship belocs locales (too late for testing)
[03:08] <pitti> so it would have been better to update more of them 
[03:08] <jbailey> doko: Belocs upstream takes a steady stream of patches into their repository.  Not that fast, but occasionally.
[03:09] <Mithrandir> mvo: done
[03:09] <Mithrandir> pitti: read the summary, yes
[03:09] <doko> I mwan, how often are files touched in /usr/share/i18n/locales ?
[03:09] <pitti> so AFAICS we now have three problems:
[03:09] <doko> #s/mwan/mean/
[03:09] <pitti> 1) locales cannot be generated without installing a langpack
[03:09] <pitti> 2) some test suites currently use external locale definitions
[03:10] <pitti> 3) d-i needs locales as well
[03:10] <pitti> seriously, 2) seems like a non-issue to me, but doko wants them back
[03:10] <Mithrandir> I guess the live cd usecase is an instance of 1)?
[03:10] <pitti> 1) is inconvenient
[03:10] <pitti> and 3)  is a real problem
[03:10] <pitti> Mithrandir: right, 1 or 3, whatever
[03:10] <pitti> jbailey: what was the primary reason to ship them in the langpacks instead of 'locales'?
[03:10] <doko> pitti: please don't take 2) too lightly ...
[03:11] <Kamion> by 3) I assume you mean the localechooser build process
[03:11] <jbailey> pitti: As in, why did we split them out?
[03:11] <pitti> doko: somebody still needs to convince me that a static expected output tested against dynamic external data makes sense :)
[03:11] <pitti> jbailey: yes
[03:11] <Kamion> the rest of d-i's problems are due to 1) (namely, if you don't have network access and the language pack isn't on the CD, you can't even get the locale installed so you're doomed to a zillion perl errors on install)
[03:11] <pitti> Kamion: well, that, and that CDs which do not ship the langpack for  the locale the user selects are broken
[03:11] <Kamion> and you'll probably get wrong collation order etc.
[03:12] <doko> pitti: ohh, I don't include glibc and binutils in the gcc testsuite as well. do you think I should?
[03:12] <Kamion> pitti: right, I think that's more or less what I said?
[03:12] <pitti> Kamion: yes, didn't get that line early enough :)
[03:12] <Mithrandir> so 1) and 3) are really the same problem?
[03:12] <pitti> yes, probably
[03:12] <Kamion> Mithrandir: related, anyway
[03:12] <Mithrandir> or 3) is a subinstance of 1)
[03:12] <pitti> so it all comes down to shipping the locale definitions in 'locales' again
[03:13] <jbailey_> Lagging a sec, X hung on me.
[03:13] <pitti> unless anyone has a different great idea?
[03:13] <Mithrandir> just making perl, etc shut up about not being able to set locale is possible, but it doesn't solve the problem, it just hides the symptoms.
[03:13] <pitti> jbailey_: so which use case would break if we put the locale sources back into 'locales'?
[03:13] <doko> pitti: so why not have a second folder for /usr/share/i18n/locales, into which you can install updated locale data, and which is searched first?
[03:14] <pitti> doko: that's another option
[03:14] <Kamion> Mithrandir: indeed, and I'm concerned that we don't generally test the case where setlocale() fails, at least not on the level of "will your desktop work properly"
[03:14] <pitti> but that would mean keeping the locale data in two different sorts of packages, which I don't like
[03:14] <doko> then you can ship the locale data in locales, and update them in rosetta
[03:15] <Kamion> I'd much rather be able to assert that setlocale() will never fail immediately after a standard installation
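The failure mode Kamion describes can be probed directly. The following is a minimal sketch, not something from the discussion: the helper name `try_setlocale` is hypothetical, and it only assumes Python's standard `locale` module, where `locale.setlocale()` raises `locale.Error` when the requested locale has not been generated on the system.

```python
import locale


def try_setlocale(name):
    """Return True if the locale `name` can be activated, False otherwise.

    The previous locale setting is restored before returning, so the
    check has no lasting effect on the process.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        # Raised when the C library cannot find compiled data for `name`.
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    # "C" is always available; anything else depends on the installed locales.
    for candidate in ("C", "fr_CA.UTF-8"):
        print(candidate, try_setlocale(candidate))
```

A check along these lines, run right after a standard installation, would be one way to express the "setlocale() never fails" assertion as an automated test.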
[03:15] <pitti> right, that seems easiest
[03:15] <Mithrandir> I think the approach with shipping locale data in locales is fine, but we want it generated from the belocs data?
[03:15] <jbailey_> pitti: The idea of having them split was twofold.  1) Generating new languages in the future in Rosetta is then an isolated problem.  It doesn't need to affect other languages.
[03:15] <pitti> so my question is what the primary reason was for splitting them out to langpacks in the first place
[03:16] <jbailey_> 2) As we get more and more languages, it means more systems are carrying around baggage that they don't need.  We expect folks to install language packs anyway for all of their language components, so they then have the locale data present.
[03:16] <doko> jbailey_: yes, but you can overcome this kind of problem with a second folder, which is searched
[03:16] <pitti> hm, with a central location we need to update locales for every locale update, right, but that shouldn't bite too hard?
[03:16] <doko> jbailey_: /usr/share/i18n/locales is 750k compressed
[03:16] <Kamion> there's another option which sort of addresses jbailey's baggage objections, although I don't like it for other reasons; still I might as well throw it out there
[03:16] <doko> pitti: it's the only package built from the source package
[03:17] <Kamion> create a udeb with all the locale data and have the installer copy in just the relevant locale
[03:17] <jbailey_> What problem are we trying to solve here?  I think I missed that when my X died.
[03:17] <pitti> doko: right, nowadays (was previously in libc, which hurt)
[03:17] <Kamion> it does mean that the locale file is owned by no package
[03:17] <Mithrandir> Kamion: doesn't solve the live cd problem, though.
[03:17] <Kamion> 14:11 < Kamion> the rest of d-i's problems are due to 1) (namely, if you don't have network access and the language pack isn't on the CD, you can't even get the locale installed so you're doomed to a zillion
[03:17] <Kamion>                 perl errors on install)
[03:17] <Kamion> 14:14 < Kamion> Mithrandir: indeed, and I'm concerned that we don't generally test the case where setlocale() fails, at least not on the level of "will your desktop work properly"
[03:17] <Kamion> jbailey_: ^-- basically that
[03:17] <Kamion> (imho)
[03:18] <doko> Kamion: and these files are installed during the installation?
[03:18] <Kamion> and there's an analogous problem on the live CD
[03:18] <Kamion> doko: yes, but as Mithrandir says it wouldn't solve the similar live CD problem anyway
[03:18] <ogra> for edubuntu its more than perl errors, it breaks the complete install ...
[03:18] <jbailey_> Is it correct for us to offer the locales as an installation option that aren't on the CD anyway?
[03:18] <Kamion> jbailey_: not having the locale your environment variables say you do is a fair bit worse than not having any text
[03:19] <jbailey_> It seems like it's a confusing thing to pick, say, French Canadian and not have the resulting system be in French Canadian.
[03:19] <Kamion> jbailey_: yes; very few languages fit on the CD
[03:19] <Kamion> I really don't want to strip our language offering down to that level
[03:19] <pitti> jbailey_: I think so, language-selector can fix things up after installation
[03:19] <Mithrandir> jbailey_: sure, especially since you can set LC_CTYPE to your local locale even though you don't want messages in that locale..
[03:19] <jbailey_> Right, that's what I mean.  We generally expect people to be able to cope with English on the installed system.  I'm wondering if just having locales available on the CD is a bandaid solution.
[03:19] <Kamion> and also localechooser doesn't have access to the contents of the CD yet
[03:20] <Kamion> jbailey_: having the locale data for all languages on the CD has worked fine up until now
[03:21] <jbailey> Hmm
[03:21] <Kamion> as ogra says, not having a locale breaks various package maintainer scripts too
[03:21] <pitti> yes, it looks totally ridiculous, but it works
[03:21] <jbailey> Right, I agree that we shouldn't set a locale to something that's not even available.
[03:21] <jbailey> It breaks things like printing in incredibly subtle ways.
[03:21] <Kamion> there's a case for saying that those maintainer scripts are buggy, but (a) the specific case in question is actually kind of non-obvious, (b) we don't test that scenario much
[03:22] <jbailey> (like perl spew winding up in the middle of postscript output)
[03:23] <jbailey> I'm just wondering if the right solution is to delay the whole language conversation until install time when the user has to deal with it anyway.
[03:23] <Kamion> If you look at it from the point of view of a user who was using a supported language from warty->breezy (which is most users - I think we support >> 60% of the globe even without good support for the Indics), it seems like a cut-and-dried regression
[03:23] <pitti> ok, so does anybody see a problem with that: we rebuild all langpacks without locales, stuff all locales back to 'locales'
[03:23] <Kamion> jbailey: we don't have time for that sort of major reengineering in dapper, I'm afraid
[03:23] <pitti> and we keep the locale installation/removal in the langpack postinst/prerm to avoid the conffile?
[03:23] <jbailey> Kamion: I don't think labelling it as a 'regression' is valid here.
[03:24] <Kamion> jbailey: I do, judging from my bugs
[03:24] <jbailey> We break things all the time when chasing feature goals.  People are expected to read the release notes for system changes.
[03:24] <jbailey> This isn't new.
[03:24] <Kamion> Sure, breakage during development is fine, but this *will* be a regression come release.
[03:24] <pitti> btw, NB that this doesn't affect upgrades so much
[03:24] <jbailey> How so?  The release notes say "Install your language pack"
[03:24] <pitti> just mainly new installs
[03:24] <Kamion> in the installer, so of course not many developers care
[03:24] <pitti> since locales for upgraded systems won't automatically be removed
[03:24] <Kamion> jbailey: Edubuntu doesn't even *install*
[03:25] <doko> pitti: when you talk about locale data, you only mean /usr/share/i18n/locales ?
[03:25] <jbailey> I have a side concern about applications relying on locale data for their testsuite, I'd like to visit that after.
[03:25] <Kamion> jbailey: also, I'm concerned that the initial desktop is not going to look great if the locale isn't there
[03:26] <Kamion> for such a release note to be a valid get-out, stuff has to work right until you can install the language pack
[03:26] <jbailey> Kamion: If you're expecting that you've installed the system in French, and it comes up in English, it's already not great.
[03:26] <jbailey> I agree that we might not have time to reengineer this for dapper.
[03:26] <Kamion> jbailey: seriously, this has been fine up until now, and we just made it lots worse
[03:26] <jbailey> But I really do wonder if perhaps another question can be gotten rid of from the installer by punting it to runtime.
[03:26] <Kamion> I really don't accept that this is somehow not a regression because we can push more of the job of the installer onto the user
[03:27] <jbailey> I don't buy the argument "It's been fine up until now" as an argument to reverse it in the long term.
[03:27] <jbailey> It needs to really be "The solution is wrong because of FOO"
[03:27] <jbailey> I'm trying to ask whether or not the solution is right, and we just need to do the reengineering.
[03:27] <Mithrandir> jbailey_: in the (fairly common) case that the user has a network connection, she won't see it come up in English, she'll see it in French because the langpack has been downloaded.
[03:27] <jbailey> Separate from the question of whether this whole thing is right for dapper.
[03:27] <Kamion> OK, the solution is wrong because it makes it unnecessarily difficult for the installer to construct a minimally working system in the locale you requested in the absence of network access
[03:27] <jbailey> Mithrandir: Err.. Really?  My desktops all came up in English.
[03:28] <Kamion> and it requires slow network access during install even if you *do* have network access
[03:28] <Kamion> language packs aren't small
[03:28] <jbailey> Right, I'll buy those as reasonable arguments then.
[03:28] <ogra> especially in edubuntu, where i have gnome and kde langpacks this is a pain
[03:29] <pitti> jbailey: french is shipped on the install CDs, just not on the live ones
[03:29] <Kamion> ogra: the installer only needs to install the base langpack, so that isn't so bad
[03:29] <jbailey> pitti: hmm.  I should check why it didn't work on my ppc installation sometime, then.
[03:29] <pitti> jbailey: yes, that's a bit frightening; if you select the correct language in the installer, it should work
[03:29] <ogra> Kamion, KDE-de is somewhere around 20-30MB
[03:30] <pitti> ogra: uncompressed?
[03:30] <ogra> nope
[03:30] <pitti> ogra: the compressed deb should be around 4 MB
[03:30] <ogra> all the stuff that gets downloaded
[03:30] <pitti> anyway, it's still huge
[03:30] <ogra> yup
[03:30] <pitti> ogra: ah, that's language-support-de, for sure
[03:31] <Kamion> Note that most of our live CDs only have English
[03:32] <ogra> i wasnt talking about the liveCD ...
[03:32] <Kamion> yeah, I know, just saying
[03:32] <jbailey> Is there a threshold at which we should revisit this?
[03:32] <jbailey> I know that the belocs locales are larger in their source form than the glibc ones are, just because there are so many more of them.
[03:32] <Kamion> basically if the live CD can't get at the locale, we need to make sure that the entire desktop works even in an incorrect locale until such time as you can run the language selector
[03:32] <jbailey> I don't know off hand how much more they are.
[03:33] <jbailey> But at some point, I'd assume that Rosetta will be pumping out new ones with updates and changes every week or so.
[03:33] <jbailey> And as people add more languages, I don't know how big it will get.
[03:33] <Kamion> There's a limit to how much we can do with those in the installer / live CD context anyway ...
[03:33] <pitti> we can still update the locales package with that approach
[03:34] <jbailey> Yup.  I'm just considering size-wise.
[03:34] <jbailey> Like, if it hits a meg, do we suddenly care?  Is there another piece that we need to watch for?
[03:35] <jbailey> (In mirror hit time for stable updates, etc)
[03:35] <pitti> hm, one meg of new data would require us to double the number of locale definitions - that's certainly a lot
[03:35] <Kamion> if it overflows the CD, we care, but that sounds like a very long way off
[03:36] <jbailey> Right, but it might be a threshold at which someone notices and says "Hey, let's think about this", etc.
[03:36] <jbailey> Okay, cool.  Measured that way is fine, too.
[03:36] <Kamion> I'd say if it's << language pack size 
[03:36] <Kamion> let me start again
[03:36] <Kamion> I'd say if it's < language pack size, then we're probably OK, considering that language packs are updated (more?) regularly, there are loads more of them, and they're bigger
[03:36] <Kamion> like, an average language pack
[03:38] <jbailey> And I guess relative to langpack updates, updating the locales packages for folks on a regular basis won't be bad either.
[03:38] <pitti> right
[03:38] <pitti> but locales-updates is a bit too much for my taste
[03:38] <pitti> so we should always update the whole lot
[03:38] <jbailey> Yes, I think so.
[03:39] <jbailey> It can still be generated by the same method you use now.
[03:39] <Kamion> pitti: do you have a rough planned schedule for dapper locales/langpack updates?
[03:39] <jbailey> And just generate a whole new package each time.
[03:39] <pitti> Kamion: I'm still desperately waiting for Rosetta
[03:39] <Kamion> mm, understood
[03:39] <pitti> Kamion: but I can update the packs with buildd data at any time
[03:40] <pitti> so, if we say we need locales centralized by tomorrow, I'll build new packs and update locales very soon
[03:40] <jbailey> pitti: jordi and daf each seemed a bit surprised that updates with rosetta had any problem.  It might be worth poking a bit harder.
[03:40] <Kamion> if I manage to arrange for netboot installations to install from breezy-updates, then at least netboot installs will have current locales from the beginning
[03:40] <pitti> not sure how soon, my head is exploding (bad cold), but I'll manage
[03:40] <Mithrandir> pitti: it's blocking a part of me, but I'm on VAC from Wednesday, so not a lot of hurry from me.
[03:40] <Mithrandir> just a slight urgency.
[03:41] <jbailey> I have an ongoing concern about packages build-dep'ing on locales for their testsuites.
[03:41] <jbailey> I think it's still a mistake.
[03:41] <pitti> ok, so shall we do that then? centralize locales again?
[03:42] <pitti> jbailey: I agree, but that's pretty orthogonal, I think
[03:42] <jbailey> pitti: Right.  It's more the "now that we're centralising, this becomes an issue again"
[03:42] <pitti> the centralized file has the disadvantage that we probably need /etc/locale.gen back, or not?
[03:43] <pitti> of course we could also add a 'generate only this locale' function to localedef
[03:43] <pitti> but that wouldn't be updated with locales updates
[03:44] <jbailey> Nothing static should be provided outside of the locales definition, since glibc has no knowledge of those locales.
[03:44] <pitti> jbailey: sorry, I don't understand that
[03:45] <jbailey> perhaps I misunderstood you then too.
[03:45] <pitti> Currently I call localedef manually to generate a test locale (I need a latin1 one for testing)
[03:45] <jbailey> What are you saying wouldn't get updated and might need to?
[03:46] <pitti> that works fine, but when that locale is updated, nothing will rebuild it since it isn't mentioned in any config file
[03:46] <pitti> we got rid of /etc/locale.gen, right?
[03:46] <jbailey> Oh, I see.
[03:46] <pitti> and /v/l/locales/supported.d will not mention manually generated locales
[03:46] <pitti> unless we teach locale-def to add a 'local' file there
[03:46] <Kamion> it could, if locale-gen added one ... what you said
[03:47] <Kamion> that's another good point, we have to honour people's requests for legacy locales
[03:48] <Kamion> language packs don't do that because they only generate UTF-8 locales
[03:48] <jbailey> Kamion: How legacy?  We drop old locales that don't make any sense anymore (Like where countries disappear, etc.)
[03:48] <pitti> so we could put all non-UTF8 locales to supported.d/local during the upgrade from the breezy locales package?
[03:48] <Kamion> jbailey: legacy as in non-UTF-8
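The upgrade step pitti proposes amounts to partitioning the previously configured locales by encoding. As a rough sketch only: the function name `split_legacy` is hypothetical, and the input is assumed to use the `<locale> <charset>` line format familiar from /etc/locale.gen; whether the supported.d files use exactly that format is an assumption here, not something stated in the discussion.

```python
def split_legacy(entries):
    """Partition '<locale> <charset>' lines into (utf8, legacy) lists.

    Blank lines and '#' comments are skipped. The line format is an
    assumption modelled on /etc/locale.gen.
    """
    utf8, legacy = [], []
    for raw in entries:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        _name, charset = line.split()
        if charset.upper() == "UTF-8":
            utf8.append(line)    # would be handled by the language packs
        else:
            legacy.append(line)  # candidates for supported.d/local
    return utf8, legacy


if __name__ == "__main__":
    old_config = [
        "# locales configured before the upgrade",
        "de_DE.UTF-8 UTF-8",
        "de_DE ISO-8859-1",
        "de_DE@euro ISO-8859-15",
    ]
    utf8, legacy = split_legacy(old_config)
    print("UTF-8:", utf8)
    print("legacy:", legacy)
```

Only the second list would need to be written out during the upgrade, since the UTF-8 entries are regenerated by the language packs anyway.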
[03:51] <pitti> ok, so far this all sounds like there is only one solution that would not break too badly for now - locales back to 'locales', new langpacks without them, add legacy and local feature to locale-def
[03:51] <pitti> does anybody still have a concern? The discussion seems to die down a bit
[03:52] <doko> pitti: could you summarize?
[03:52] <pitti> I just did
[03:52] <pitti> locales back to 'locales', new langpacks without them, add legacy and local feature to locale-def
[03:52] <jbailey> Just that I'd like to work with someone who knows a bit more about custom locales to help me make sure that the localedef hack does what they need.
[03:53] <pitti> so new langpacks would still automatically handle locales
[03:53] <pitti> jbailey: I think it should be in locale-gen
[03:53] <Kamion> jbailey: I think it should be a locale-gen hack (high-level) rather than a localedef hack (low-level)
[03:53] <doko> pitti: what is "legacy and local feature"?
[03:53] <pitti> doko: when upgrading from breezy, we put all locales not covered by langpacks to /var/lib/locales/supported.d/local
[03:53] <jbailey> Ah, okay.  I was confused when you said locale-def =)
[03:54] <doko> pitti: thanks
[03:54] <pitti> doko: similarly, if you want to generate zh_CN.UTF-8 locally for testing, you don't need to install the zh langpack
[03:54] <pitti> but just do sth like 'sudo locale-gen zh_CN.UTF-8'
[03:54] <pitti> since this is mainly a developer feature, I think that's ok
[03:54] <doko> pitti: let me re-ask: it's not enough to install locales for generating a locale?
[03:54] <pitti> normal users won't care about locales
[03:54] <Kamion> yeah, I have a bunch of locales lying around for man-db testing, similarly
[03:55] <Kamion> which are in just about every encoding under the sun
[03:55] <pitti> doko: right now not, but we have to change that
[03:55] <doko> pitti: ok, thanks for clarifying. so we avoid changing other packages ...
[03:55] <Kamion> doko: just installing the locales package won't generate all possible locales, if that's what you're asking
[03:55] <jbailey> pitti, doko: Right, but that's what I'm saying shouldn't be done for testsuites.
[03:55] <pitti> langpacks install would do --keep-existing, locales upgrade would regenerate them all, right?
[03:55] <pitti> right
[03:56] <pitti> test suites should have synthetic local locales
[03:56] <jbailey> testsuites should not *count* on the data in the locales being consistent from one upload to the next.
[03:56] <Kamion> same as it's been since BenC redid the locales package in Nov 2000
[03:56] <pitti> yep
[03:56] <doko> Kamion: true. but I am able to generate them myself with a b-d on locales (and only that b-d)
[03:56] <Kamion> doko: right
[03:56] <jbailey> You'll already find that in the Belocs locales that things like collation orders have changed for almost all western european locales.
[03:56] <pitti> doko: but that does not prove anything
[03:57] <pitti> doko: python wants to test that a locale with a particular date format spits out that date format; not that the date format in German is 14.01.2005
[03:57] <doko> pitti: ?
[03:57] <pitti> test suites want to test correct output for known input
[03:57] <pitti> not test assumptions about external locale data
[03:58] <pitti> e. g. above date format (it actually changes), or the assumption that 'german' is latin-1
[03:58] <pitti> etc.
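The distinction pitti draws, testing the formatting logic against known input rather than against whatever the system's de_DE happens to contain, can be sketched as follows. Everything here is illustrative: `SYNTHETIC_LOCALE` and `format_date` are hypothetical names, and the dictionary merely stands in for a synthetic test locale shipped with the test suite.

```python
from datetime import date

# A synthetic "locale" owned by the test suite. Its date format is
# deliberately unlike any real locale, so the test cannot pass by accident
# and cannot break when the system's locale data is updated.
SYNTHETIC_LOCALE = {"d_fmt": "%Y|%m|%d"}


def format_date(d, loc):
    """Format `d` using the date format carried by the given locale data."""
    # %Y, %m and %d are purely numeric, so the result does not depend on
    # the process locale.
    return d.strftime(loc["d_fmt"])


if __name__ == "__main__":
    # Known input, known expected output: this checks that the locale's
    # format is honoured, not what the German date format currently is.
    print(format_date(date(2005, 1, 14), SYNTHETIC_LOCALE))
```

The expected string is fixed by the test's own data, which is the property the external locale definitions cannot guarantee.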
[03:58] <doko> jbailey: well, then at least you can ping upstream to adapt their test suites
[03:58] <pitti> right
[03:58] <jbailey> doko: No! It's still a mistake.
[03:58] <jbailey> Because it can still change often.
[03:59] <jbailey> There are no promises around consistency of that format.
[03:59] <jbailey> s/format/definition/
[03:59] <jbailey> Like right now, de_DE is different on our setup than other glibc based setups.
[03:59] <pitti> ok, that's another discussion
[03:59] <pitti> very interesting, but different
[03:59] <jbailey> pitti: It was one of the three concerns you mentioned originally.
[03:59] <pitti> jbailey: right
[04:00] <pitti> jbailey: what I wanted to ask you is, do you see a major drawback with above plan?
[04:00] <pitti> since decentralizing was a major step that had its reasons
[04:00] <doko> jbailey: you won't convince upstream to ship locale data, because they got all they need for windows and mac with the system. why has linux to be an exception?
[04:00] <jbailey> pitti: Nope.  It still leads us to the two goals I really care about 1) I don't want to have to think about locales. 2) I want Rosetta to think about locales.
[04:01] <jbailey> pitti: The separating into langpacks was just a logical extension of the rosetta idea that didn't work out.
[04:01] <pitti> right, I agree
[04:01] <jbailey> doko: I suspect you're wrong on that, especially since it actually lets them test what they want to test.
[04:02] <pitti> doko: then they either don't test on windows, or it's similarly broken
[04:02] <jbailey> doko: I wouldn't ship whole locales, I would ship locale fragments that are designed to test particular i18n features.
[04:02] <pitti> ok, thank you for that first part. sounds like a plan. I'll see to implement it ASAP
[04:04] <Kamion> ok, sounds good, thanks all
[04:04] <mvo> thanks
[04:04] <doko> fine, let's delay the test discussion for dapper+1 ...
[04:04] <pitti> doko: oh, spank upstream, tell'em to fix it, kthxbye :)
[04:05] <jbailey> I think it's not so much, delay the test discussion as, it's not a bug when it breaks.
[04:06] <jbailey> And if you try to fix it upstream, they may point out to you that on their systems it works fine.
[04:06] <jbailey> And that's not a bug either.
[04:06] <doko> pitti: unfortunately you have to address this with every upstream ...
[04:07] <pitti> well, with everyone that tests locales
[04:07] <pitti> it seems that you have the honor of maintaining the majority of these :)
[04:07] <pitti> anyway, it's not something we have to fix in a hurry