A little I18N framework for libraries and applications
Project description
A library with CLI tools allowing to add i18n and l10n to Python apps and libs with ease.
Contents
Synopsis
verboselib is a thin abstraction layer on top of GNU gettext toolset and Python gettext module.
In contrast to the raw gettext API, verboselib provides a unified and thread-safe way to access translation catalogs and an easier way to set active language dynamically.
Additionally, it allows definitions of lazy translations, which can be useful for module-level variables or class-level attributes.
Besides, verboselib comes with a bunch of CLI tools for extracting translatable messages from sources and compiling them. Those tools do not impose code dependencies and can be used as stand-alone utilities.
Domains of Use
The primary domain of use is applications & services. However, it can also be used in libraries.
In such case users of a target library will have to be aware of using verboselib as well. Hence, verboselib can be viewed as an i18n & l10n framework.
Generally, reliance on frameworks is a thing to refrain from in stand-alone public libraries, but it can also be a totally valid design decision for auxiliary libraries at a product-level scale.
Installation
Available as a PyPI package:
pip install verboselib
Quickstart
The following examples provide a quick overview of what usage of verboselib looks like.
Brief Example
The briefest usage example shows how to get immediate translations:
from verboselib import Translations # (1)
from verboselib import set_language # (2)
translations = Translations( # (3)
domain="the_app",
locale_dir_path="locale",
)
_ = translations.gettext # (4)
set_language("en") # (5)
print(_("Hi there!")) # (6) 'Hi there!'
set_language("sv") # (7)
print(_("Hi there!")) # (8) 'Hej där!'
And here is the explanation for the noted lines:
Import Translations class, which is a translations registry.
Import set_language() function, which allows switching between languages.
Create an instance of Translations class, specifying the messages domain and location of the translations catalogs directory.
Define _ as a shortcut for translations.gettext() function.
Set the current language to English.
Print a Hi there! there message passed as an argument to the _ function. This gives Hi there! as the output.
Set the current language to Swedish.
Print the same message again and get Hej där! as the output.
The example is naïve, but calls to gettext() via the _ shortcut are very common inside functions, where messages are translated when functions are called. For example:
def print_message():
print(_("Hi there!"))
set_language("en")
print_message() # 'Hi there!'
set_language("sv")
print(_("Hi there!")) # 'Hej där!'
Lazy Translations Example
Oftentimes there’s a need to have a placeholder or just a message, the definition of which must be separated from its evaluation. This is achieved via lazy translations:
from verboselib import Translations
from verboselib import set_language
translations = Translations(
domain="the_app",
locale_dir_path="locale",
)
L_ = translations.gettext_lazy # (1)
class Greeter:
greeting_fmt = L_("Hi there, {name}!") # (2)
@classmethod
def make_greeting(cls, name):
return cls.greeting_fmt.format( # (3)
name=name,
)
set_language("en")
print(Greeter.make_greeting("user")) # (4) 'Hi there, user!'
set_language("sv")
print(Greeter.make_greeting("user")) # 'Hej där, user!'
Comments for the noted lines:
gettext_lazy is used instead of gettext and L_ shortcut is used instead of _.
A translatable string is defined as a class-level attribute using L_ shortcut.
The translatable string is accessed as a normal string.
The method is called and a parameterized translated string is returned.
This example is also naïve, but here the value of Greeter.greeting_fmt is not translated into a solid string during construction of the Greeter class. This is important, as the class is constructed only once. The actual type of greeting_fmt is not a string, but lazy_string.LazyString, which is a string’s proxy:
>>> type(Greeter.greeting_fmt)
<class 'lazy_string.LazyString'>
API
There are several aspects to consider when using verboselib:
Active language.
Translations catalogs registry — an instance of verboselib.Translations class.
Translations catalogs directory — a directory where .po and .mo files are located.
Translatable messages themselves.
Tools for extracting messages and compiling translations.
The sections below describe those aspects separately.
Active Language
Active language is the language which will be used for getting final values of translatable strings.
Its current value is a string defined by a user, e.g. “en”. The value can be missing, i.e. not set.
In addition to the current value, it is possible to define a default value. So, if the current value is not set, it will fallback to the default value.
By default, both “current value” and “default value” are not set, i.e. they are None.
Finally, it’s possible to turn translations off, so that translations will be equal to original messages.
Current Language
The current language in verboselib is controlled and queried via the following functions:
- set_language(language)
Sets the current language for the current thread.
from verboselib import set_language set_language("en")
- get_language()
Queries name of the current language in the current thread as a string.
from verboselib import get_language from verboselib import set_language get_language() # None set_language("en") get_language() # 'en'
- drop_language()
Removes the value of the current language for the current thread. The value will fallback to the default value.
from verboselib import drop_language from verboselib import get_language from verboselib import set_language set_language("en") get_language() # 'en' drop_language() get_language() # None
Default Language
The default language is controlled by functions which are similar to functions used to control the current language:
- set_default_language(language)
Sets the default language for the current thread.
from verboselib import set_default_language set_default_language("en")
- get_default_language
Queries value of the default language for the current thread as a string.
from verboselib import get_default_language from verboselib import set_default_language get_default_language() # None set_default_language("en") get_default_language() # 'en'
- drop_default_language
Removes the value of the default language for the current thread.
from verboselib import drop_default_language from verboselib import get_default_language from verboselib import set_default_language set_default_language("en") get_default_language() # 'en' drop_default_language() get_default_language() # None
Usually, only the set_default_language(...) is used. This can be helpful if None is a possible value for the current language. In such a case at least a default language will be used:
def greet_user(user):
set_language(user.language) # can be None
print(_("Hi there, {name}!").format(name=user.name))
drop_language()
set_default_language("en")
...
user = get_user()
greet_user(user)
Disabling Translations
At certain times it can be useful to disable translations, for example, during debugging.
This can be done via set_language_bypass() function. It disables the current language and prevents it from falling back to the default language.
from verboselib import drop_language
from verboselib import get_language
from verboselib import set_language
from verboselib import set_default_language
from verboselib import set_language_bypass
set_default_language("en")
set_language("fr")
get_language() # 'fr'
set_language_bypass()
get_language() # None
drop_language()
get_language() # 'en'
Note that the 2nd call to get_language() returned None.
Locale-to-language Conversions
verboselib comes up with a couple of helper functions for converting languages to locales:
from verboselib import to_locale
to_locale("en-us") # 'en_US'
and vice versa, for converting locales to languages:
from verboselib import to_language
to_language("en_US") # 'en-us'
Translations Catalogs Registry
Translations catalogs registry (verboselib.Translations) is a facade in front of gettext.GNUTranslations.
Instance Creation
The verboselib.Translations class requires the following arguments to be provided:
- domain
A name (string) of the domain of translations. Usually, it’s the name of the application, of the library, or it can be just "messages".
- locale_dir_path
A path (string or pathlib.Path) to the translations catalogs directory, which is a place where actual translations are stored. Usually, such directory is called locale and is located inside the top-level directory of the application or library. The path is strongly recommended to be absolute.
Example:
from pathlib import Path
from verboselib import Translations
__here__ = Path(__file__).absolute().parent
translations = Translations(
domain="messages",
locale_dir_path=(__here__ / "locale"),
)
Instance Location
Although instances of Translations are just objects which can be passed to functions, it is recommended to create a single instance of Translations as a global variable in a separate module, say translations.py. Those instances are thread-safe.
Additionally, it can be handy to make module-level aliases for the methods of a Translations instance:
# foo_package/translations.py
from pathlib import Path
from verboselib import Translations
translations = Translations(
domain="foo_package",
locale_dir_path=(Path(__file__).absolute().parent / "locale"),
)
gettext = translations.gettext
gettext_lazy = translations.gettext_lazy
This can look a bit ugly, but in such a case it’s convenient to access those methods as functions from other modules, e.g.:
# foo_package/logic.py
from .translations import gettext as _
from .translations import gettext_lazy as L_
print(_("Hello"))
greeting_fmt = L_("Hello, {name}")
Methods
The API of verboselib.Translations is compatible with GNUTranslations and includes the following methods:
gettext(message)
ngettext(singular, plural, n)
pgettext(context, message)
npgettext(context, singular, plural, n)
Additionally, verboselib.Translations provides their lazy versions:
gettext_lazy(message)
ngettext_lazy(singular, plural, n)
pgettext_lazy(context, message)
npgettext_lazy(context, singular, plural, n)
Those lazy methods return an instance of lazy_string.LazyString which is a string’s proxy.
As for ngettext and npgettext methods and their lazy counterparts, not only an int can be passed as the n argument, but also a callable accepting no arguments and returning an int. For example, both the following calls are valid and conceptually identical:
translations.ngettext("window", "windows", 1)
translations.ngettext("window", "windows", lambda: 1)
Translations Catalogs Directory
All translations are stored in a catalogs directory, where each language has its own subdirectory.
This section describes how to build such a catalog.
Workflow Overview
Firstly, translatable messages are extracted from source files into .po files. Those files contain IDs of messages and file locations where those messages are observed, e.g.:
#: foo.py:105 foo.py:203
msgid "Hi there, {name}!"
msgstr ""
Translators fill in, well, translations for IDs inside .po files:
#: foo.py:105 foo.py:203
msgid "Hi there, {name}!"
msgstr "Hej där, {name}!"
Finally, those .po files are compiled into .mo files.
Discovery of Translatable Messages
In order to be discovered and extracted, messages in source files have to be marked in a certain way.
This is achieved by wrapping a message by a pair of parentheses () prefixed by a keyword. This makes it look like a function call, which it really is:
gettext("a message")
Here, gettext is used as a keyword. It’s also possible to use its shortcut which is _:
_("a message")
Both of those variants are equal, but the latter is more concise.
By default verboselib recognizes the following keywords:
Keyword |
Shortcut |
Example |
---|---|---|
gettext |
_ |
_("message") |
gettext_lazy |
L_ |
L_("message") |
ngettext |
N_ |
N_("single", "plural", 123) |
ngettext_lazy |
LN_ |
LN_("single", "plural", 123) |
pgettext |
P_ |
P_("message context", "message") |
pgettext_lazy |
LP_ |
LP_("message context", "message") |
npgettext |
NP_ |
LP_("message context", "single", "plural", 123) |
npgettext_lazy |
LNP_ |
LP_("message context", "single", "plural", 123) |
Technically, any literal can be used as a keyword. But if a non-default keyword is used, it must be specified during extraction, which is described later.
.po files
Every .po file includes a header at the beginning.
It consists of key-value metadata separated from file’s body via a blank line. Example:
msgid ""
msgstr ""
"Project-Id-Version: foo 1.0.0\n"
"PO-Revision-Date: 2020-10-09 21:24+0300\n"
"Report-Msgid-Bugs-To: support@foo.com\n"
"Last-Translator: Mr Translator <translation-team@foo.com>\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
msgid "Log in"
msgstr ""
Refer to gettext docs for more details on .po files and on .po headers.
Message Contexts
Functions as pgettext(), npgettext(), and their lazy fellows allow to provide a message context.
This is just a string which will appear in .po files to give a hint for translators about the meaning of the message.
For example, the following call to pgettext:
P_("abbrev. month", "Jan")
will add a msgctxt attribute into .po files:
#: foo.py:90
msgctxt "abbrev. month"
msgid "Jan"
msgstr ""
Plural Forms
Functions as ngettext(), npgettext(), and their lazy counterparts allow to get different translations depending on the integer number n provided to them, e.g.:
N_("window", "windows", 1)
In this trivial example n is 1. However, it can be a variable or a parameterless callable returning an int:
def get_users_online() -> int:
n = ...
return n
N_("user online", "users online", get_users_online)
This looks pretty simple, but that is not the end of the story.
Different languages can have different number of plural forms and each form can have their own calculation rules.
So, in order to make plural forms actually work, each .po file must include a Plural-Forms metadata in its header.
For example, languages of the Germanic family, like English, have 2 plural forms defined as:
"Plural-Forms: nplurals=2; plural=n != 1\n"
Examples of rules for other languages can be found at Plural-Forms documentation page.
Finally, every message having plural forms must have as many translations as there are plural forms specified by Plural-Forms. For example:
#: foo.py:74
msgid "window"
msgid_plural "windows"
msgstr[0] "вікно"
msgstr[1] "вікна"
msgstr[2] "вікон"
Refer to gettext docs for more info on translating plural forms.
Utilities
verboselib comes with a couple of stand-alone CLI utilities for extracting and compiling translatable messages.
These utilities are implemented as subcommands of the main command named verboselib.
Run verboselib with -h flag to get generic help:
verboselib -h
usage: verboselib [-h] [-V] {extract,x,compile,c} ...
run a verboselib command
optional arguments:
-h, --help show this help message and exit
-V, --version show version of verboselib and exit
subcommands:
{extract,x,compile,c}
extract (x) extract translatable strings from sources into '.po' files
compile (c) compile '.po' text files into '.mo' binaries
extract or x
Used to extract translatable messages from sources. Creates or updates the directory with translations catalogs. Run with -h flag for help:
verboselib x -h
usage: extract [-h] [-d DOMAIN] [-l LOCALE] [-a] [-o OUTPUT_DIR] [-k KEYWORD] [--no-default-keywords] [-e EXTENSIONS] [-s] [-i PATTERN] [--no-default-ignore] [--no-wrap]
[--no-location] [--no-obsolete] [--keep-pot] [--xgettext-extra-args XGETTEXT_EXTRA_ARGS] [--msguniq-extra-args MSGUNIQ_EXTRA_ARGS]
[--msgmerge-extra-args MSGMERGE_EXTRA_ARGS] [--msgattrib-extra-args MSGATTRIB_EXTRA_ARGS] [-v]
extract translatable strings from sources into '.po' files
optional arguments:
-h, --help show this help message and exit
-d DOMAIN, --domain DOMAIN
domain of message files (default: messages)
-l LOCALE, --locale LOCALE
create or update '.po' message files for the given locale(s), ex: 'en_US'; can be specified multiple times (default: None)
-a, --all update all '.po' message files for all existing locales (default: False)
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
path to the directory where locales will be stored, a.k.a. 'locale dir' (default: locale)
-k KEYWORD, --keyword KEYWORD
extra keyword to look for, ex: 'L_'; can be specified multiple times (default: None)
--no-default-keywords
do not use default keywords as {'_', 'gettext', 'L_', 'gettext_lazy', 'N_:1,2', 'ngettext:1,2', 'LN_:1,2', 'ngettext_lazy:1,2', 'P_:1c,2',
'pgettext:1c,2', 'LP_:1c,2', 'pgettext_lazy:1c,2', 'NP_:1c,2,3', 'npgettext:1c,2,3', 'LNP_:1c,2,3', 'npgettext_lazy:1c,2,3'} (default: False)
-e EXTENSIONS, --extension EXTENSIONS
extra file extension(s) to scan in addition to '.py'; separate multiple values with commas or specify the parameter multiple times (default: None)
-s, --links follow links to files and directories when scanning sources for translation strings (default: False)
-i PATTERN, --ignore PATTERN
extra glob-style patterns for ignoring files or directories; can be specified multiple times (default: None)
--no-default-ignore do not ignore the common glob-style patterns as {'.*', '*~', 'CVS', '__pycache__', '*.pyc'} (default: False)
--no-wrap do not break long message lines into several lines (default: False)
--no-location do not write location lines, ex: '#: filename:lineno' (default: False)
--no-obsolete remove obsolete message strings (default: False)
--keep-pot keep '.pot' file after creating '.po' files (useful for debugging) (default: False)
--xgettext-extra-args XGETTEXT_EXTRA_ARGS
extra arguments for 'xgettext' utility; can be comma-separated or specified multiple times (default: None)
--msguniq-extra-args MSGUNIQ_EXTRA_ARGS
extra arguments for 'msguniq' utility; can be comma-separated or specified multiple times (default: None)
--msgmerge-extra-args MSGMERGE_EXTRA_ARGS
extra arguments for 'msgmerge' utility; can be comma-separated or specified multiple times (default: None)
--msgattrib-extra-args MSGATTRIB_EXTRA_ARGS
extra arguments for 'msgattrib' utility; can be comma-separated or specified multiple times (default: None)
-v, --verbose use verbose output (default: False)
The basic usage example:
verboselib x -l 'uk' -l 'en' -l 'it'
Use -a flag to update all existing .po files:
verboselib x -a
Use --keyword (-k) argument to specify additional keywords to look for, e.g.:
verboselib x -a -k 'FOO_' -k 'BAR_'
compile or c
Compiles all .po files into .mo files. Basic usage has no arguments:
verboselib c
Use -h flag for help:
verboselib c -h
usage: compile [-h] [-d LOCALES_DIR] [-l LOCALE] [-e EXCLUDE] [-f] [--msgfmt-extra-args MSGFMT_EXTRA_ARGS] [-v]
compile '.po' text files into '.mo' binaries
optional arguments:
-h, --help show this help message and exit
-d LOCALES_DIR, --locale-dir LOCALES_DIR
path to the directory where locales are stored (default: locale)
-l LOCALE, --locale LOCALE
locale(s) to process, ex: 'en_US'; can be specified multiple times; all locales are processed if not specified (default: None)
-e EXCLUDE, --exclude EXCLUDE
locale(s) to exclude, ex: 'en_US'; can be specified multiple times (default: None)
-f, --use-fuzzy use fuzzy translations (default: False)
--msgfmt-extra-args MSGFMT_EXTRA_ARGS
extra arguments for 'msgfmt' utility; can be comma-separated or specified multiple times (default: None)
-v, --verbose use verbose output (default: False)
Thread-safety
The current and the default languages are thread-local. Hence, the functions for manipulating and querying them, like set_language(), are thread-safe. However, the values have to be set in each thread separately.
As for the translations catalog registry, verboselib.Translations, it is also thread-safe, as it relies on RLocks. It’s recommended to be used in libraries. However, if the target is an application and it is guaranteed to be single-threaded, it’s possible to use a not-thread-safe version:
from verboselib import NotThreadSafeTranslations
Changelog
1.1.0 (Nov 18, 2020)
API changes:
verboselib.lazy is extracted into a standalone lib lazy-string.
Misc:
Annotations using classes from typing are tidied up and deprecations starting from Python 3.9 are handled.
1.0.1 (Oct 30, 2020)
Fix verboselib.utils.export() helper which adds objects to __all__ variable of their own modules.
1.0.0 (Oct 11, 2020)
API changes:
verboselib.factory.TranslationsFactory is now verboselib.translations.Translations.
locale_dir argument of Translations is locale_dir_path now and instances of pathlib.Path can be used in addition to strings.
verboselib.translations.Translations is now thread-safe.
verboselib.translations.NotThreadSafeTranslations is added.
Methods ugettext() and ugettext_lazy() are removed from Translations.
Methods ngettext(), ngettext_lazy(), pgettext(), pgettext_lazy(), npgettext(), and npgettext_lazy() are added to Translations.
Function get_default_language() is added.
Function verboselib.heplers.to_locale() no longer has to_lower argument.
verboselib-manage CLI utility is now just verboselib.
compile and extract subcommands of verboselib have c and x aliases respectively.
Flags --no-default-keywords, --xgettext-extra-args, --msguniq-extra-args, --msgmerge-extra-args, and --msgattrib-extra-args are added to the extract command.
Flags --exclude, --use-fuzzy, and --msgfmt-extra-args are added to the compile command.
Python support:
Support of all Python versions below 3.7 is dropped.
Other:
All external dependencies are removed.
The license is switched from LGPLv3 to MIT.
0.2.1 (Jul 16, 2017)
Fix version command.
Rename verboselib-manage.py executable to simply verboselib-manage.
0.2.0 (Dec 31, 2014)
Add get_default_language() method.
Use default translation classes from gettext module.
0.1.0 (Jul 17, 2014)
Initial version.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file verboselib-1.1.0.zip
.
File metadata
- Download URL: verboselib-1.1.0.zip
- Upload date:
- Size: 47.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8bf44376493066c1bd13620f95c7b120938722cd83235a7f6c153161ad4d5c11 |
|
MD5 | 5a9e8a2fed03714e394e6db0f1e104fa |
|
BLAKE2b-256 | f92b7f65b19615b4d69e9286ffb0f419195f7f49c26e1a1de92730682992861b |