Internationalization
Page updated March 14, 2011
BaCon version 1.0.22 introduced internationalization support. This
means that we can write programs that automatically insert
non-English text in the appropriate places, depending on the locale (the language, country, currency, date format, and other localizations).
I would like to thank Peter van Eerten and "L18L" who helped me with
some teething troubles when I wrote my first internationalized "Hello
World" program. Our discussion: http://basic-converter.proboards.com/index.cgi?board=general&action=display&thread=89
Before we get started, a note to anyone who wants to create a language translation file for an application: make sure that UTF-8
is enabled for your locale. All Woof-built puppies built from Woof
dated March 2011 will have this as default. To check your locale, click
the 'setup' icon on the desktop, choose 'Configure Puppy for your
country', then 'Choose your locale' -- you will then see a checkbox for
UTF-8 -- if you change this, restart X for it to take effect.
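The same check can also be made from a terminal. This is only a minimal sketch, assuming a POSIX shell and the standard locale utility; the helper name is_utf8_charmap is made up for illustration and is not part of Puppy or BaCon.

```shell
#!/bin/sh
# Return success if the given character map name means UTF-8.
# is_utf8_charmap is a hypothetical helper name, used for illustration only.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# "locale charmap" asks the system for the active character encoding.
current=$(locale charmap 2>/dev/null || true)
if is_utf8_charmap "$current"; then
    echo "UTF-8 is enabled"
else
    echo "UTF-8 is NOT enabled (charmap: ${current:-unknown})"
fi
```

If the second message appears, change the locale as described above and restart X.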
Hello World CLI example
CLI means Command Line Interface, that is, an application that does not
have a GUI (Graphical User Interface). Here is my simple program, hello.bac:
OPTION INTERNATIONAL TRUE
SETENVIRON "OUTPUT_CHARSET", "UTF-8"
PRINT INTL$("hello world")
PRINT INTL$("some more text")
Three things have been done here. Firstly, the INTERNATIONAL argument
is given to the OPTION statement, which must be placed at the beginning
of the program. Secondly, the output text is specified to be UTF-8.
Thirdly, the INTL$ function is used, as it must be for any text that needs to be internationalized.
1. Now, you compile the program, but you have to use the '-x' parameter:
> bacon -x hello.bac
This compiles the program, but also extracts the translatable text strings. Two files are generated: hello, the binary executable, and hello.pot, which holds the text strings.
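To see quickly which strings were extracted, something like the following can be used. It is a rough sketch: list_msgids is a made-up helper name, and it assumes each string sits on a single msgid line, which holds for short strings like the ones in hello.bac.

```shell
#!/bin/sh
# Print the translatable strings found in a .pot (or .po) file, one per line.
# list_msgids is a hypothetical helper name. Simplification: strings that
# have been wrapped over several lines are not joined back together.
list_msgids() {
    grep '^msgid "' "$1" | grep -v '^msgid ""$' | sed 's/^msgid //'
}

# Usage, assuming hello.pot was generated by "bacon -x hello.bac":
#   list_msgids hello.pot
```

The empty msgid "" that is skipped here is the header entry, not a string from the program.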
As you can see from the source code, the default text strings are in
English. However, I could pretend they are not and define alternative
text for the en locale. The same principle applies to any other locale, for example de for German, fr for French, etc.
2. The next step is to create a .po file for the required locale:
> msginit --locale=en --output-file=en.po --input=hello.pot
Then you edit en.po and insert your alternative text:
# English translations for temp package.
# Copyright (C) 2011 THE temp'S COPYRIGHT HOLDER
# This file is distributed under the same license as the temp package.
# root <root@localhost>, 2011.
#
msgid ""
msgstr ""
"Project-Id-Version: temp 3\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2011-03-09 07:40+0800\n"
"PO-Revision-Date: 2011-03-09 07:41+0800\n"
"Last-Translator: root <root@localhost>\n"
"Language-Team: English\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

#: hello.bac.c:31
msgid "hello world"
msgstr "Howdi Guys"

#: hello.bac.c:35
msgid "some more text"
msgstr "yadda yadda"
Note that in the header above I set charset=UTF-8.
It may default to ASCII or ISO-8859-1, but please always change it to
UTF-8. This is necessary to ensure correct translation under all
conditions.
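If you would rather not edit the header by hand, the charset can be rewritten with sed. This is a sketch only: set_po_charset_utf8 is a made-up helper name, and it changes just the declared charset in the Content-Type header line. It does not convert any text already in the file, so the file itself must still be saved as UTF-8.

```shell
#!/bin/sh
# Rewrite the declared charset in a .po header to UTF-8.
# set_po_charset_utf8 is a hypothetical helper name, not a gettext tool.
set_po_charset_utf8() {
    po_file=$1
    tmp_file="$po_file.tmp"
    # Only the Content-Type header line is touched; translations are left alone.
    sed '/^"Content-Type:/s/charset=[A-Za-z0-9_-]*/charset=UTF-8/' "$po_file" > "$tmp_file" &&
        mv "$tmp_file" "$po_file"
}

# Usage, assuming en.po exists in the current directory:
#   set_po_charset_utf8 en.po
#   grep charset en.po
```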
3. Finally, you compile and install the translation file:
> mkdir -p /usr/share/locale/en/LC_MESSAGES
> msgfmt --check --output-file=/usr/share/locale/en/LC_MESSAGES/hello.mo en.po
Three simple steps. Now run the program:
> ./hello
Howdi Guys
yadda yadda
What about en_AU, en_CA, en_DK, en_US...
A Linux user sets their system up with a locale such as en_AU, which means English, Australia, or en_CA,
which is English, Canada, etc. This is because there are different
dialects of English in different parts of the world. In the above Hello
World example, I just specified the en
translation, which is generic for all English dialects. However, if I
wanted a specific translation for Australian English, this is what I
would do:
> msginit --locale=en_AU --output-file=en_AU.po --input=hello.pot
I then edited en_AU.po with suitable Aussie text strings. Then:
> mkdir -p /usr/share/locale/en_AU/LC_MESSAGES
> msgfmt --output-file=/usr/share/locale/en_AU/LC_MESSAGES/hello.mo en_AU.po
Yes, it works:
> ./hello
How ya goin mate
put another prawn on the barby
Note that the generic en translation still works for all other English dialects.
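The fallback from a specific dialect to the generic language can be pictured as a search through a short list of catalog files. The sketch below is a simplification written for this page, not how gettext is actually implemented (the real lookup also considers codeset and modifier variants), and catalog_candidates is a made-up helper name.

```shell
#!/bin/sh
# Print the catalog paths that would be tried for a locale, most specific first.
# Simplified: real gettext also considers codeset and modifier variants.
# catalog_candidates is a hypothetical helper name.
catalog_candidates() {
    locale_name=$1
    app=$2
    base=${locale_name%%.*}     # drop a codeset such as ".UTF-8"
    base=${base%%@*}            # drop a modifier such as "@euro"
    lang=${base%%_*}            # drop the territory such as "_AU"
    echo "/usr/share/locale/$base/LC_MESSAGES/$app.mo"
    if [ "$lang" != "$base" ]; then
        echo "/usr/share/locale/$lang/LC_MESSAGES/$app.mo"
    fi
}

catalog_candidates en_AU.UTF-8 hello
catalog_candidates en_CA.UTF-8 hello
```

For en_AU the first path exists, so the Aussie strings are used. For en_CA only the second, generic en path exists, so that translation is picked up instead.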
A simple GUI hello world
Here is a simple Hello World GTK GUI application, using HUG (the HUG functions here are WINDOW, MARK, ATTACH and DISPLAY):
OPTION INTERNATIONAL TRUE
SETENVIRON "OUTPUT_CHARSET", "UTF-8"
INCLUDE "hug.bac"

mainwin = WINDOW("Internationalized Hello World", 400, 50)
label1 = MARK(INTL$("Hello World"), 350, 15)
ATTACH(mainwin, label1, 58, 10)

DISPLAY
Following the same steps:
> bacon -x hello-gui.bac
> msginit --locale=en_US --output-file=en.po --input=hello-gui.pot
> msgfmt --output-file=/usr/share/locale/en/LC_MESSAGES/hello-gui.mo en.po
The end result:
What you can see from this is that the person who maintains an
application written in BaCon only has to provide a .pot file. Other
people can take that and create a .po file for their language
and country. So, in future when I release a binary PET package, I will
also include the .pot file, and users can create .po files and email
them to me for inclusion in the PET.
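For a maintainer who receives several .po files, the compile and install step is the same for each one, so it can be scripted. The following is a sketch under a stated assumption: contributors name their files after the locale (de.po, en_AU.po, and so on), as in the examples on this page. The helper name mo_path_for is made up, and the loop only prints the commands rather than running them.

```shell
#!/bin/sh
# Work out where the compiled catalog for a contributed .po file belongs.
# Assumes files are named after the locale, e.g. de.po or en_AU.po
# (a naming convention assumed here, matching the examples on this page).
# mo_path_for is a hypothetical helper name.
mo_path_for() {
    po_file=$1
    app=$2
    locale_name=$(basename "$po_file" .po)
    echo "/usr/share/locale/$locale_name/LC_MESSAGES/$app.mo"
}

# Dry run: print the commands for every .po file in the current directory.
for po in *.po; do
    [ -e "$po" ] || continue          # no .po files present: nothing to do
    mo=$(mo_path_for "$po" hello)
    echo "mkdir -p $(dirname "$mo")"
    echo "msgfmt --check --output-file=$mo $po"
done
```

Once the printed commands look right, they can be run as they are or the echo wrappers removed.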
Note that if there is a compiled .mo file that you would like to edit
and update, you can un-compile it, which creates a .po file:
> msgunfmt hello.mo
Singular and plural
This is an appropriate web page to mention the issue of text
that has to represent singular or plural values. Thanks to "L18L" who
showed how to do this. For example, say that you had a variable x in an
application representing the number of green bottles. You want to print
a message stating how many green bottles. You could do it like this, for the example of quantity 2:
"There is/are two green bottle(s)"
...not very professional!
What you really want is two different messages: "There is one green bottle" or "There are two green bottles".
A problem with internationalization is that the text message may have
to be different depending on whether there are quantities of 1, 2, 3,
or more. The problem, and its solution, is explained in the 'gettext'
documentation on plural forms.
In BaCon, the NNTL$ function will handle printing of singular and plural forms. For example:
OPTION INTERNATIONAL TRUE
SETENVIRON "OUTPUT_CHARSET", "UTF-8"

x=2
PRINT INTL$("first msg here")
PRINT x FORMAT NNTL$("There is one green bottle","There are %d green bottles",x)
...The x FORMAT is required if you want to substitute the value of x into the strings (in place of %d).
Compile with the -x option, and a .pot file is generated, with this in it:
msgid "There is one green bottle"
msgid_plural "There are %d green bottles"
msgstr[0] ""
msgstr[1] ""
...insert the singular string into msgstr[0] and the plural form into msgstr[1], including the %d
as-is. As mentioned, some languages have different text depending on
plurality being 2, 3, or more -- see the 'gettext' documentation on
plural forms for further information.
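Which msgstr[N] gets used is decided by the Plural-Forms line in the .po header. To get a feel for it, the expression can be evaluated by hand with shell arithmetic. This is only an illustration, with made-up function names; the English rule is the one from the en.po header shown earlier, and the Polish rule is included as an example of a language with three forms, on the assumption that it is the usual rule given for Polish in the gettext documentation.

```shell
#!/bin/sh
# Evaluate a Plural-Forms expression by hand to see which msgstr[N] is chosen.
# The function names are hypothetical, used for illustration only.

# English: nplurals=2; plural=(n != 1)
plural_index_en() {
    echo $(( $1 != 1 ))
}

# Polish, as an example of three forms: nplurals=3;
# plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)
plural_index_pl() {
    n=$1
    echo $(( n == 1 ? 0 : ( (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) ? 1 : 2 ) ))
}

for n in 1 2 5 22; do
    echo "n=$n  en: msgstr[$(plural_index_en "$n")]  pl: msgstr[$(plural_index_pl "$n")]"
done
```

For English only 1 selects msgstr[0]; every other quantity selects msgstr[1]. With the Polish rule, 5 selects msgstr[2] while 22 goes back to msgstr[1], which is why a translator for such a language needs a third msgstr entry.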
Further reading
These are some extra links that you might find useful:
And of course HUG is Highlevel Universal GUI, an easy way to write GUI applications in BaCon: my-1st-gtk-app/index.html
© Copyright Barry Kauler 2011 bkhome.org All rights reserved
See FAQ for legal statement.
The BaCon logo is © copyright Peter van Eerten, used with permission.