Kitchen.i18n Module
I18N is an important piece of any modern program. Unfortunately, setting up i18n in your program is often a confusing process. The functions provided here aim to make the programming side of that a little easier.
Most projects will be able to do something like this when they start up:
# myprogram/__init__.py:
import os
import sys

from kitchen.i18n import easy_gettext_setup

_, N_ = easy_gettext_setup('myprogram', localedirs=(
    os.path.join(os.path.realpath(os.path.dirname(__file__)), 'locale'),
    os.path.join(sys.prefix, 'lib', 'locale')
    ))
Then, in other files that have strings that need translating:
# myprogram/commands.py:
from myprogram import _, N_

def print_usage():
    # print() with a single argument behaves the same on python2 and python3
    print(_(u"""available commands are:
--help              Display help
--version           Display version of this program
--bake-me-a-cake    as fast as you can
"""))

def print_invitations(age):
    print(_('Please come to my party.'))
    # N_() picks the singular or plural msgid based on the value of age
    print(N_('I will be turning %(age)s year old',
             'I will be turning %(age)s years old', age) % {'age': age})
See the documentation of easy_gettext_setup() and get_translation_object() for more details.
See also
- gettext
For details of how the python gettext facilities work.
- babel
The babel module, for in-depth information on gettext, message catalogs, and translating your app. babel provides some nice features for i18n on top of gettext.
Functions
easy_gettext_setup() should satisfy the needs of most users. get_translation_object() is designed to ease the way for anyone who needs more control.
- kitchen.i18n.easy_gettext_setup(domain, localedirs=(), use_unicode=True)
Set up translation functions for an application
- Parameters
domain – Name of the message domain. This should be a unique name that can be used to look up the message catalog for this app.
localedirs – Iterator of directories to look for message catalogs under. The first directory to exist is used regardless of whether messages for this domain are present. If none of the directories exist, fall back on sys.prefix + /share/locale. Default: no directories to search, so we just use the fallback.
use_unicode – If True, return the gettext functions for str strings; otherwise return the functions for bytes for the translations. Default is True.
- Returns
tuple of the gettext function and the gettext function for plurals
Setting up gettext can be a little tricky because of lack of documentation. This function will set up gettext using the Class-based API for you. For the simple case, you only need to pass the message domain and can leave the other arguments at their defaults:
_, N_ = easy_gettext_setup('myprogram')
This will get you two functions, _() and N_(), that you can use to mark strings in your code for translation. _() is used to mark strings that don’t need to worry about plural forms no matter what the value of the variable is. N_() is used to mark strings that do need to have a different form if a variable in the string is plural.
See also
- Kitchen.i18n Module
This module’s documentation has examples of using _() and N_().
- get_translation_object()
For information on how to use localedirs to get the proper message catalogs both when in development and when installed to FHS compliant directories on Linux.
Note
The gettext functions returned from this function should be superior to the ones returned from gettext. The traits that make them better are described in the DummyTranslations and NewGNUTranslations documentation.
Changed in version kitchen-0.2.4; API kitchen.i18n 2.0.0: Changed easy_gettext_setup() to return the lgettext functions instead of gettext functions when use_unicode=False.
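The localedirs rule described for easy_gettext_setup() (use the first directory that exists, otherwise fall back on sys.prefix + /share/locale) can be sketched in a few lines. This is only an illustration of the documented rule: pick_localedir() is a hypothetical helper, not a function provided by kitchen, and kitchen’s real implementation may differ.

```python
import os
import sys
import tempfile


def pick_localedir(localedirs=()):
    """Return the first directory in localedirs that exists.

    Hypothetical helper: it mirrors the rule described in the text,
    it is not kitchen's actual source code.
    """
    for candidate in localedirs:
        if os.path.isdir(candidate):
            return candidate
    # None of the candidates exist: use the documented fallback.
    return os.path.join(sys.prefix, 'share', 'locale')


existing = tempfile.mkdtemp()
missing = os.path.join(existing, 'does-not-exist')

chosen = pick_localedir((missing, existing))   # skips the missing directory
fallback = pick_localedir((missing,))          # nothing exists: fallback
```

Note that the check only looks at whether the directory exists, not at whether it holds a catalog for the domain, which matches the wording of the localedirs parameter.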
- kitchen.i18n.get_translation_object(domain, localedirs=(), languages=None, class_=None, fallback=True, codeset=None, python2_api=True)
Get a translation object bound to the message catalogs
- Parameters
domain – Name of the message domain. This should be a unique name that can be used to look up the message catalog for this app or library.
localedirs – Iterator of directories to look for message catalogs under. The directories are searched in order for message catalogs. For each of the directories searched, we check for message catalogs in any language specified in languages. The message catalogs are used to create the Translation object that we return. The Translation object will attempt to look up the msgid in the first catalog that we found. If it’s not in there, it will go through each subsequent catalog looking for a match. For this reason, the order in which you specify the localedirs may be important. If no message catalogs are found, either return a DummyTranslations object or raise an IOError depending on the value of fallback. The default localedir from gettext, which is os.path.join(sys.prefix, 'share', 'locale') on Unix, is implicitly appended to the localedirs, making it the last directory searched.
languages – Iterator of language codes to check for message catalogs. If unspecified, the user’s locale settings will be used. See also gettext.find() for information on what environment variables are used.
class_ – The class to use to extract translations from the message catalogs. Defaults to NewGNUTranslations.
fallback – If set to False, raise an IOError if no message catalogs are found. If True, the default, return a DummyTranslations object.
codeset – Set the character encoding to use when returning bytes objects. This is equivalent to calling set_output_charset() on the Translations object that is returned from this function.
python2_api – When True (default), return Translation objects that use the python2 gettext api (gettext() and lgettext() return bytes; ugettext() exists and returns str strings). When False, return Translation objects that use the python3 gettext api (gettext() returns str strings and lgettext() returns bytes; ugettext() does not exist).
- Returns
Translation object to get gettext methods from
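The fallback parameter has a counterpart of the same name on the standard library’s gettext.translation(). As a rough illustration only, the sketch below uses the standard library function as a stand-in (so it returns gettext.NullTranslations rather than DummyTranslations) and an empty temporary directory in which no catalog can be found. The domain name 'foo' and the language 'es' are arbitrary.

```python
import gettext
import tempfile

empty_dir = tempfile.mkdtemp()  # contains no message catalogs

# fallback=True: a do-nothing translation object is returned.
translations = gettext.translation('foo', localedir=empty_dir,
                                   languages=['es'], fallback=True)
message = translations.gettext('My Message')  # comes back untranslated

# fallback=False: the missing catalog is reported as an error.
try:
    gettext.translation('foo', localedir=empty_dir, languages=['es'],
                        fallback=False)
    raised = False
except IOError:  # on python3 this name is an alias of OSError
    raised = True
```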
If you need more flexibility than easy_gettext_setup(), use this function. It sets up a gettext Translation object and returns it to you. Then you can access any of the methods of the object that you need directly. For instance, if you specifically need to access lgettext():
translations = get_translation_object('foo')
translations.lgettext('My Message')
This function is similar to the python standard library gettext.translation() but makes it better in two ways:
- It returns NewGNUTranslations or DummyTranslations objects by default. These are superior to the gettext.GNUTranslations and gettext.NullTranslations objects because they are consistent in the string type they return and they fix several issues that can cause the python standard library objects to throw UnicodeError.
- This function takes multiple directories to search for message catalogs.
The latter is important when setting up gettext in a portable manner. There is not a common directory for translations across operating systems so one needs to look in multiple directories for the translations. get_translation_object() is able to handle that if you give it a list of directories to search for catalogs:
translations = get_translation_object('foo', localedirs=(
    os.path.join(os.path.realpath(os.path.dirname(__file__)), 'locale'),
    os.path.join(sys.prefix, 'lib', 'locale')))
This will search several different directories, in this order:
- A directory named locale in the same directory as the module that called get_translation_object()
- /usr/lib/locale
- /usr/share/locale (the fallback directory)
This allows gettext to work on Windows and in development (where the message catalogs are typically in the toplevel module directory) and also when installed under Linux (where the message catalogs are installed in /usr/share/locale). You (or the system packager) just need to install the message catalogs in /usr/share/locale and remove the locale directory from the module to make this work. For example:
In development:
~/foo   # Toplevel module directory
~/foo/__init__.py
~/foo/locale    # With message catalogs below here:
~/foo/locale/es/LC_MESSAGES/foo.mo
Installed on Linux:
/usr/lib/python2.7/site-packages/foo
/usr/lib/python2.7/site-packages/foo/__init__.py
/usr/share/locale/  # With message catalogs below here:
/usr/share/locale/es/LC_MESSAGES/foo.mo
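The search order can be approximated with the standard library’s gettext.find(). The sketch below is an assumption-laden illustration, not kitchen’s code: find_catalogs() is a hypothetical helper, and the development tree is faked in a temporary directory with an empty placeholder file standing in for a compiled catalog.

```python
import gettext
import os
import sys
import tempfile


def find_catalogs(domain, localedirs=(), languages=None):
    """Collect .mo files for domain, searching localedirs in order.

    Hypothetical helper built on the stdlib gettext.find(); the default
    directory is appended last, as the text describes.
    """
    search_path = list(localedirs)
    search_path.append(os.path.join(sys.prefix, 'share', 'locale'))
    catalogs = []
    for localedir in search_path:
        catalogs.extend(gettext.find(domain, localedir, languages, all=True))
    return catalogs


# Fake a development tree: <tmp>/locale/es/LC_MESSAGES/foo.mo
dev_locale = os.path.join(tempfile.mkdtemp(), 'locale')
mo_dir = os.path.join(dev_locale, 'es', 'LC_MESSAGES')
os.makedirs(mo_dir)
mo_path = os.path.join(mo_dir, 'foo.mo')
open(mo_path, 'wb').close()  # find() only checks that the file exists

found = find_catalogs('foo', localedirs=(dev_locale,), languages=['es'])
```

Because the development directory is listed first, its catalog is the first entry in found; any catalog installed under the default directory would come after it.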
Note
This function will set up Translation objects that attempt to look up msgids in all of the found message catalogs. This means if you have several versions of the message catalogs installed in different directories that the function searches, you need to make sure that localedirs specifies the directories so that newer message catalogs are searched first. It also means that if a newer catalog does not contain a translation for a msgid but an older one that’s in localedirs does, the translation from that older catalog will be returned.
Changed in version kitchen-1.1.0; API kitchen.i18n 2.1.0: Add more parameters to get_translation_object() so it can more easily be used as a replacement for gettext.translation(). Also change the way we use localedirs. We cycle through them until we find a suitable locale file rather than simply cycling through until we find a directory that exists. The new code is based heavily on the python standard library gettext.translation() function.
Changed in version kitchen-1.2.0; API kitchen.i18n 2.2.0: Add python2_api parameter
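The lookup order described in the note (first catalog found, then each subsequent one) resembles the fallback chain that the standard library translation classes already support through add_fallback(). The sketch below shows that chaining with a toy class; DictTranslations and its dictionaries are hypothetical stand-ins for real message catalogs and are not part of kitchen.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog backed by a dict (hypothetical, for illustration only)."""

    def __init__(self, messages):
        gettext.NullTranslations.__init__(self)
        self._messages = messages

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # Not in this catalog: defer to the next one in the chain, if any.
        return gettext.NullTranslations.gettext(self, message)


newer = DictTranslations({'Hello': 'Hola'})
older = DictTranslations({'Hello': 'Buenas', 'Goodbye': 'Hasta luego'})
newer.add_fallback(older)  # newer is searched first, then older

hello = newer.gettext('Hello')      # found in the newer catalog
goodbye = newer.gettext('Goodbye')  # only the older catalog has it
thanks = newer.gettext('Thanks')    # in neither: returned unchanged
```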
Translation Objects
The standard translation objects from the gettext module suffer from several problems:
- They can throw UnicodeError
- They can’t find translations for non-ASCII byte str messages
- They may return either unicode string or byte str from the same function even though the functions say they will only return unicode or only return byte str.
DummyTranslations and NewGNUTranslations were written to fix these issues.
- class kitchen.i18n.DummyTranslations(fp=None, python2_api=True)
Safer version of gettext.NullTranslations
This Translations class doesn’t translate the strings and is intended to be used as a fallback when there were errors setting up a real Translations object. It’s safer than gettext.NullTranslations in its handling of bytes vs str strings.
Unlike NullTranslations, this Translation class will never throw a UnicodeError. The code that you have around a call to DummyTranslations might throw a UnicodeError but at least that will be in code you control and can fix. Also, unlike NullTranslations, all of this Translation object’s methods guarantee to return bytes except for ugettext() and ungettext(), which guarantee to return str strings.
When bytes are returned, the strings will be encoded according to this algorithm:
1. If a fallback has been added, the fallback will be called first. You’ll need to consult the fallback to see whether it performs any encoding changes.
2. If a bytes object was given, the same bytes object will be returned.
3. If a str string was given and set_output_charset() has been called, then we encode the string using the output_charset.
4. If a str string was given and this is gettext() or ngettext() and _charset was set, output in that charset.
5. If a str string was given and this is gettext() or ngettext(), we encode it using ‘utf-8’.
6. If a str string was given and this is lgettext() or lngettext(), we encode using the value of locale.getpreferredencoding().
For ugettext() and ungettext(), we go through the same set of steps with the following differences:
- We transform bytes into str strings for these methods.
- The encoding used to decode the bytes is taken from input_charset if it’s set, otherwise we decode using UTF-8.
- input_charset
is an extension to the python standard library gettext that specifies what charset a message is encoded in when decoding a message to str. This is used for two purposes:
1. If the message string is a bytes object, this is used to decode the string to a str string before looking it up in the message catalog.
2. In ugettext() and ungettext() methods, if a bytes object is given as the message and is untranslated, this is used as the encoding when decoding to str. This is different from _charset, which may be set when a message catalog is loaded, because input_charset is used to describe an encoding used in a python source file while _charset describes the encoding used in the message catalog file.
Any characters that aren’t able to be transformed from bytes to a str string or vice versa will be replaced with a replacement character (ie: u'�' in unicode based encodings, '?' in other ASCII compatible encodings).
See also
- gettext.NullTranslations
For information about what methods are available and what they do.
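The replacement behaviour mentioned for untransformable characters has the same visible effect as Python’s built-in 'replace' error handler. The sketch below uses that handler directly as an illustration; it does not show kitchen’s own conversion code.

```python
undecodable = b'caf\xff'   # 0xff is never valid in UTF-8
unencodable = u'caf\xe9'   # U+00E9 has no ASCII representation

# Decoding: the bad byte becomes U+FFFD, the unicode replacement character.
decoded = undecodable.decode('utf-8', 'replace')

# Encoding: the unrepresentable character becomes '?'.
encoded = unencodable.encode('ascii', 'replace')
```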
Changed in version kitchen-1.1.0; API kitchen.i18n 2.1.0:
* Although we had adapted gettext(), ngettext(), lgettext(), and lngettext() to always return bytes, we hadn’t forced those bytes to always be in a specified charset. We now make sure that gettext() and ngettext() return bytes encoded using output_charset if set, otherwise charset and if neither of those, UTF-8. With lgettext() and lngettext(), we use output_charset if set, otherwise locale.getpreferredencoding().
* Make setting input_charset and output_charset also set those attributes on any fallback translation objects.
Changed in version kitchen-1.2.0; API kitchen.i18n 2.2.0: Add python2_api parameter to __init__()
- output_charset()
Compatibility for python2.3 which doesn’t have output_charset
- set_output_charset(charset)
Set the output charset
This serves two purposes. The normal gettext.NullTranslations.set_output_charset() does not set the output on fallback objects. On python-2.3, gettext.NullTranslations objects don’t contain this method.
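Propagating the output charset to fallback objects can be pictured with a small subclass of the standard library class. This is a hedged sketch, not kitchen’s implementation: PropagatingTranslations is a hypothetical name, and the code relies on the private _fallback attribute that gettext.NullTranslations.add_fallback() fills in.

```python
import gettext


class PropagatingTranslations(gettext.NullTranslations):
    """Hypothetical subclass: stores the charset and passes it to fallbacks."""

    def set_output_charset(self, charset):
        self._output_charset = charset
        # _fallback is the private attribute that add_fallback() sets.
        if self._fallback is not None:
            self._fallback.set_output_charset(charset)


primary = PropagatingTranslations()
secondary = PropagatingTranslations()
primary.add_fallback(secondary)

# One call on the primary object reaches every object in the chain.
primary.set_output_charset('utf-8')
```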
- class kitchen.i18n.NewGNUTranslations(fp=None, python2_api=True)
Safer version of gettext.GNUTranslations
gettext.GNUTranslations suffers from two problems that this class fixes.
- gettext.GNUTranslations can throw a UnicodeError in gettext.GNUTranslations.ugettext() if the message being translated has non-ASCII characters and there is no translation for it.
- gettext.GNUTranslations can return bytes from gettext.GNUTranslations.ugettext() and str strings from the other gettext() methods if the message being translated is the wrong type.
When bytes are returned, the strings will be encoded according to this algorithm:
1. If a fallback has been added, the fallback will be called first. You’ll need to consult the fallback to see whether it performs any encoding changes.
2. If a bytes object was given, the same bytes object will be returned.
3. If a str string was given and set_output_charset() has been called, then we encode the string using the output_charset.
4. If a str string was given and this is gettext() or ngettext() and a charset was detected when parsing the message catalog, output in that charset.
5. If a str string was given and this is gettext() or ngettext(), we encode it using UTF-8.
6. If a str string was given and this is lgettext() or lngettext(), we encode using the value of locale.getpreferredencoding().
For ugettext() and ungettext(), we go through the same set of steps with the following differences:
- We transform bytes into str strings for these methods.
- The encoding used to decode the bytes is taken from input_charset if it’s set, otherwise we decode using UTF-8.
- input_charset
is an extension to the python standard library gettext that specifies what charset a message is encoded in when decoding a message to str. This is used for two purposes:
1. If the message string is a bytes object, this is used to decode the string to a str string before looking it up in the message catalog.
2. In ugettext() and ungettext() methods, if a bytes object is given as the message and is untranslated, this is used as the encoding when decoding to str. This is different from the _charset parameter that may be set when a message catalog is loaded because input_charset is used to describe an encoding used in a python source file while _charset describes the encoding used in the message catalog file.
Any characters that aren’t able to be transformed from bytes to a str string or vice versa will be replaced with a replacement character (ie: u'�' in unicode based encodings, '?' in other ASCII compatible encodings).
See also
- gettext.GNUTranslations.gettext
For information about what methods this class has and what they do.
Changed in version kitchen-1.1.0; API kitchen.i18n 2.1.0: Although we had adapted gettext(), ngettext(), lgettext(), and lngettext() to always return bytes, we hadn’t forced those bytes to always be in a specified charset. We now make sure that gettext() and ngettext() return bytes encoded using output_charset if set, otherwise charset and if neither of those, UTF-8. With lgettext() and lngettext(), we use output_charset if set, otherwise locale.getpreferredencoding().