Choose-locale script defaults non-UTF8

Woof has a major overhaul in how the locale is set; see my blog report:

http://puppylinux.com/blog/?viewDetailed=00523

Woof has all international locales built-in, nothing to download (unlike all earlier puppies).

However, as I have reported in the last few posts, UTF-8 locales cause scripts to run incredibly slowly. In my Woof-Intrepid build, with the locale set to en_US.utf8, my little speed-test script took 155 seconds. I'm now using en_US and, amazingly, the time is 36 seconds.
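A comparison like this can be repeated with a small timing helper. The following is only a sketch: the function name time_under_locale and the script name ./speedtest.sh are made up for illustration (the actual speed-test script is not shown here), and it assumes a 'date' that supports +%s.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a given LANG value and
# print the elapsed time in whole seconds.
time_under_locale() {
    loc="$1"; shift
    start=$(date +%s)
    # An empty LC_ALL is treated as unset, so LANG decides the locale here.
    LC_ALL= LANG="$loc" "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example calls (./speedtest.sh is a placeholder for your own script):
#   time_under_locale en_US.utf8 ./speedtest.sh
#   time_under_locale en_US      ./speedtest.sh
```

The numbers will depend heavily on what the script does; text-processing tools such as grep and sed are usually where the locale makes a difference.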

So, I have modified the 'chooselocale' script to default to non-UTF8. This script has two GUI interfaces, one using 'dialog' for running at first boot, the other using Xdialog for running in X. The latter has a checkbox to enable UTF8 if required.

Note, I am putting LANG=C into my scripts, so that even if the system does have UTF8 turned on, the overall speed should be ok. I don't know about the effect on compiled applications, though.

Note, scripts can still switch to the system locale whenever user input or output is required.
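As a rough illustration of that pattern (not taken from any actual Woof script), a script can remember the system locale, do its work under LANG=C, and switch back only for the user-facing parts. The names SYSTEM_LANG and with_system_locale are hypothetical, and the sketch assumes LC_ALL is not set, since LC_ALL would override LANG.

```shell
#!/bin/sh
# Remember the system locale, then run the bulk of the script in the C locale.
SYSTEM_LANG="${LANG:-C}"
LANG=C
export LANG

# Hypothetical helper: run one command with the saved system locale,
# for example a dialog that has to show localized text.
with_system_locale() {
    LANG="$SYSTEM_LANG" "$@"
}

# Text processing here runs under LANG=C ...
count=$(printf 'a\nb\nc\n' | grep -c '')

# ... while user-facing output is produced under the system locale.
with_system_locale printf 'Lines counted: %s\n' "$count"
```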


Posted on 1 Feb 2009, 18:48


Comments:

Posted on 2 Feb 2009, 6:21 by BarryK
Turning on UTF8
A further qualification. Although 'chooselocale' defaults to a non-UTF8 locale, in some cases only UTF8 is available. The glibc package has a file, /usr/share/i18n/SUPPORTED that documents this -- I have included this file in Woof as chooselocale reads it, but it isn't in earlier puppies.

magerlab posted yesterday that UTF8 is the best choice for Russian. Looking in the SUPPORTED file:

ru_RU.UTF-8 UTF-8
ru_RU ISO-8859-5

I only have ISO-8859-1, ISO-8859-2 and ISO-8859-15 charmap files in Woof (same as earlier puppies) (see /usr/share/i18n/charmaps), so the chooselocale script will decide to use UTF8.

Anyway, it's simply a matter of ticking a checkbox in the chooselocale script if you do want UTF8. Or, in magerlab's case UTF8 will get chosen regardless.
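The selection described above could be sketched roughly as follows. This is not the real chooselocale code: the function name pick_locale is made up, and it assumes that each line of SUPPORTED has the form "<locale> <charmap>" as in the lines quoted above, and that a charmap counts as available if a file of that name (optionally gzip-compressed) exists in the charmaps directory.

```shell
#!/bin/sh
# Hypothetical sketch of the default choice, not the actual chooselocale logic.
# Usage: pick_locale LANGCODE [SUPPORTED_FILE] [CHARMAP_DIR]
pick_locale() {
    lang="$1"
    supported="${2:-/usr/share/i18n/SUPPORTED}"
    charmaps="${3:-/usr/share/i18n/charmaps}"
    fallback=""
    # Each line is assumed to look like "ru_RU ISO-8859-5".
    while read -r locale charmap; do
        case "$locale" in
            "$lang"|"$lang".*) ;;      # an entry for the requested language
            *) continue ;;
        esac
        if [ "$charmap" = "UTF-8" ]; then
            # Keep the first UTF-8 entry in case nothing else is usable.
            [ -n "$fallback" ] || fallback="$locale"
        elif [ -e "$charmaps/$charmap" ] || [ -e "$charmaps/$charmap.gz" ]; then
            # Prefer a non-UTF8 locale whose charmap file is present.
            echo "$locale"
            return 0
        fi
    done < "$supported"
    if [ -n "$fallback" ]; then
        echo "$fallback"
    else
        return 1
    fi
}
```

With only the ISO-8859-1, ISO-8859-2 and ISO-8859-15 charmaps present, "pick_locale ru_RU" would fall through to ru_RU.UTF-8, while "pick_locale sl_SI" would print sl_SI.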



Posted on 3 Feb 2009, 7:51 by Leon
Some real progress in internationalization
Forum member 'wow' made some real progress in internationalization.

In my experience, his Unnamed-pupplet-puppy411-kernel-2.6.27.5 Xorg-7.4-LXDE is the only Puppy that properly displays sl_SI locale special characters in Rox.

http://www.murga-linux.com/puppy/viewtopic.php?t=36592&sid=66d8e4ae588a99a706f551ed19372bcd

His description of this feature:

"Default encoding for filenames in vfat partitions is UTF-8 (kernel module nls_utf8 by default)"

I noticed that it does not depend on the locale set by the chooselocale script.

Is there any chance to implement this feature in Woof?


Posted on 4 Feb 2009, 13:29 by BarryK
nls_utf8.ko
Leon,
According to /usr/share/i18n/SUPPORTED file:

sl_SI.UTF-8 UTF-8
sl_SI ISO-8859-2

Woof will default to the ISO-8859-2 charmap, but if you tick the checkbox in the Choose-locale script then you will get UTF8, and perhaps that will fix things for you.

Regarding the nls_utf8 kernel module, I wasn't aware that any URLs required this. That is, I assumed all international URLs use English characters only. Perhaps I haven't realised, as I have never had nls_utf8.ko loaded.



Posted on 4 Feb 2009, 13:34 by BarryK
Charmap clarification
"I only have ISO-8859-1, ISO-8859-2 and ISO-8859-15 charmap files in Woof (same as earlier puppies) (see /usr/share/i18n/charmaps), so the chooselocale script will decide to use UTF8."

That statement is misleading. Actually this is what Woof has in /usr/share/i18n/charmaps:

CP737 CP775 IBM437 IBM850 IBM852 IBM855 IBM857 IBM860 IBM861 IBM862 IBM863 IBM865 IBM866 IBM869 ISO-8859-15 ISO-8859-1 ISO-8859-2 UTF-8