Classes and functions for performing pseudo-localization on strings and PO files.

Project description

Python module for performing pseudo-localization on strings. Tested against Python 2, Python 3, PyPy, and PyPy3.

Installation

The module is available on PyPI and is installable via pip:

pip install pseudol10nutil

Dependencies

This package has the following external dependencies:

  • six - for Python 2 to 3 compatibility

PseudoL10nUtil class

Class for pseudo-localizing strings. The class currently has the following members:

  • transforms - field that contains the list of transforms to apply to the string. The transforms will be applied in order. Default is [transliterate_diacritic, pad_length, square_brackets]

  • pseudolocalize(s) - method that returns a new string with the transforms applied to the input string s.
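
The effect of transform ordering can be pictured with a small stand-alone sketch. It is not the library's implementation: TransformPipeline, exclaim, and brackets are hypothetical names used only for illustration. The only parts taken from the description above are a transforms list and a pseudolocalize(s) method that applies the list in order.

```python
class TransformPipeline:
    """Hypothetical stand-in that mimics the documented behaviour."""

    def __init__(self, transforms=None):
        # Each transform is a callable that takes a string and returns a string.
        self.transforms = list(transforms or [])

    def pseudolocalize(self, s):
        # Apply the transforms in list order; each one sees the previous result.
        for transform in self.transforms:
            s = transform(s)
        return s


def exclaim(s):
    # Illustrative transform: append an exclamation mark.
    return s + "!"


def brackets(s):
    # Illustrative transform: wrap the string in ASCII brackets.
    return "[" + s + "]"


inside = TransformPipeline([exclaim, brackets]).pseudolocalize("Hello")
outside = TransformPipeline([brackets, exclaim]).pseudolocalize("Hello")
print(inside)   # [Hello!]
print(outside)  # [Hello]!
```

Swapping the two entries changes the result, which is why a bracket transform normally goes last in the list.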

pseudol10nutil.transforms module

The following transforms are currently available:

  • transliterate_diacritic - Takes the input string and returns a copy with diacritics added e.g. Hello -> Ȟêĺĺø.

  • transliterate_circled - Takes the input string and returns a copy with circled versions of the letters e.g. Hello -> Ⓗⓔⓛⓛⓞ

  • transliterate_fullwidth - Takes the input string and returns a copy with the letters converted to their fullwidth counterparts e.g. Hello -> Ｈｅｌｌｏ

  • pad_length - Appends a series of characters to the end of the input string to increase the string length per IBM Globalization Design Guideline A3: UI Expansion.

  • angle_brackets - Surrounds the input string with ‘《’ and ‘》’ characters.

  • curly_brackets - Surrounds the input string with ‘❴’ and ‘❵’ characters.

  • square_brackets - Surrounds the input string with ‘⟦’ and ‘⟧’ characters.
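
A transform is essentially a function from string to string. The sketch below shows how the circled and fullwidth transliterations, and a bracket wrapper, could be written using Unicode code point offsets. The function names to_circled, to_fullwidth, and wrap are hypothetical and are not part of pseudol10nutil; the library's own transforms may be implemented differently (for example with lookup tables).

```python
def to_circled(s):
    """Replace ASCII letters with their circled forms (U+24B6.., U+24D0..)."""
    out = []
    for ch in s:
        if "A" <= ch <= "Z":
            out.append(chr(0x24B6 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(0x24D0 + ord(ch) - ord("a")))
        else:
            out.append(ch)  # leave digits, spaces and punctuation alone
    return "".join(out)


def to_fullwidth(s):
    """Replace ASCII letters with fullwidth forms (offset 0xFEE0)."""
    return "".join(
        chr(ord(ch) + 0xFEE0) if ("A" <= ch <= "Z" or "a" <= ch <= "z") else ch
        for ch in s
    )


def wrap(s, left, right):
    """Surround the string with a pair of marker characters."""
    return left + s + right


print(to_circled("Hello"))
print(to_fullwidth("Hello"))
print(wrap("Hello", "\u27e6", "\u27e7"))
```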

Format string support

When pseudo-localizing a string, format placeholders are left untouched so that the result can still be used for string formatting. Both Python-style placeholders (e.g. {foo}) and printf-style placeholders (e.g. %s) are supported. For example:

Input [1]: Source {source1} returned 0 rows.
Output [1]: ⟦Șøüȓċê {source1} ȓêťüȓñêđ 0 ȓøẁš.﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹⟧

Input [2]: Source %(source2)s returned 1 row.
Output [2]: ⟦Șøüȓċê %(source2)s ȓêťüȓñêđ 1 ȓøẁ.﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹퓛⟧

Input [3]: Source %s returned %d rows.
Output [3]: ⟦Șøüȓċê %s ȓêťüȓñêđ %d ȓøẁš.﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ⟧
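
One way to get this kind of behaviour is to split the string on a placeholder pattern and transform only the text between matches. The following is a minimal sketch of that idea, not the library's actual code: the PLACEHOLDER pattern and the transform_outside_placeholders helper are assumptions made for illustration, and the pattern is deliberately simplified (it does not handle escaped braces such as {{ or every printf variant).

```python
import re

# Matches Python-style fields such as {foo} or {0}, and printf-style
# specifiers such as %s, %d or %(name)s. Simplified on purpose.
PLACEHOLDER = re.compile(
    r"(\{[^{}]*\}|%(?:\([^)]*\))?[-#0 +]*\d*(?:\.\d+)?[diouxXeEfFgGcrs%])"
)


def transform_outside_placeholders(s, transform):
    """Apply transform to the text around placeholders, never to them."""
    parts = PLACEHOLDER.split(s)
    # With a single capturing group, re.split alternates between
    # plain text (even indexes) and matched placeholders (odd indexes).
    return "".join(
        part if index % 2 else transform(part)
        for index, part in enumerate(parts)
    )


# str.upper stands in for a real pseudo-localization transform.
print(transform_outside_placeholders("Source {source1} returned 0 rows.", str.upper))
print(transform_outside_placeholders("Source %(source2)s returned 1 row.", str.upper))
print(transform_outside_placeholders("Source %s returned %d rows.", str.upper))
```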

Example usage

Python 3 example:

>>> from pseudol10nutil import PseudoL10nUtil
>>> util = PseudoL10nUtil()
>>> s = u"The quick brown fox jumps over the lazy dog."
>>> util.pseudolocalize(s)
'⟦Ťȟê ʠüıċǩ ƀȓøẁñ ƒøẋ ǰüɱƥš øṽêȓ ťȟê ĺàźÿ đøğ.﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎Ѝא⟧'
>>> import pseudol10nutil.transforms
>>> util.transforms = [pseudol10nutil.transforms.transliterate_fullwidth, pseudol10nutil.transforms.curly_brackets]
>>> util.pseudolocalize(s)
'❴Ｔｈｅ ｑｕｉｃｋ ｂｒｏｗｎ ｆｏｘ ｊｕｍｐｓ ｏｖｅｒ ｔｈｅ ｌａｚｙ ｄｏｇ.❵'
>>> util.transforms = [pseudol10nutil.transforms.transliterate_circled, pseudol10nutil.transforms.pad_length, pseudol10nutil.transforms.angle_brackets]
>>> util.pseudolocalize(s)
'《Ⓣⓗⓔ ⓠⓤⓘⓒⓚ ⓑⓡⓞⓦⓝ ⓕⓞⓧ ⓙⓤⓜⓟⓢ ⓞⓥⓔⓡ ⓣⓗⓔ ⓛⓐⓩⓨ ⓓⓞⓖ.﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎Ѝא》'

Example web app

There is an example web app in the examples/webapp/ directory that provides a web UI and a REST endpoint for pseudo-localizing strings. This example is also available on Docker Hub.

Once the Docker container is running, the web UI can be accessed via the following URL:

http://localhost:8080/pseudol10nutil/

The REST endpoint can be accessed as follows:

>>> import pprint
>>> import requests
>>> strings = { "s1": "The quick brown {0} jumps over the lazy {1}.", }
>>> data = { "strings": strings }
>>> headers = { "Accept": "application/json", "Content-Type": "application/json" }
>>> api_url = "http://localhost:8080/pseudol10nutil/api/v1.0/pseudo"
>>> resp = requests.post(api_url, headers=headers, json=data)
>>> resp.status_code
200
>>> pprint.pprint(resp.json())
{'strings': {'s1': '⟦Ťȟê ʠüıċǩ ƀȓøẁñ {0} ǰüɱƥš øṽêȓ ťȟê ĺàźÿ '
                   '{1}.﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎Ѝא⟧'}}
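
If the requests package is not available, the same call can be made with the standard library. The sketch below assumes the endpoint URL, request body, and response shape shown in the session above; build_request and pseudolocalize_remote are hypothetical helper names. Only build_request runs here, so no server is needed to try the block.

```python
import json
import urllib.request

API_URL = "http://localhost:8080/pseudol10nutil/api/v1.0/pseudo"


def build_request(strings, api_url=API_URL):
    """Build a JSON POST request carrying {"strings": {...}}."""
    body = json.dumps({"strings": strings}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def pseudolocalize_remote(strings, api_url=API_URL, timeout=10):
    """Send the request; requires the example web app to be running."""
    request = build_request(strings, api_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["strings"]


request = build_request({"s1": "The quick brown {0} jumps over the lazy {1}."})
print(request.get_method(), request.full_url)
```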

POFileUtil class

Class for performing pseudo-localization on .po (Portable Object) message catalogs. Currently the class has a single method, pseudolocalizefile(input_file, output_file, input_encoding='UTF-8', output_encoding='UTF-8', overwrite_existing=True).

The default transforms will be applied to the strings in the input file. To override this behavior, create an instance of the PseudoL10nUtil class with the desired behavior and assign it to the l10nutil field prior to calling the pseudolocalizefile() method.
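
Conceptually, pseudo-localizing a message catalog means writing a transformed copy of each msgid into its msgstr. The stand-alone sketch below illustrates that idea on a list of lines. It is not how POFileUtil is implemented: fill_msgstrs is a hypothetical helper, and it only handles single-line msgid and msgstr entries (no plural forms, multi-line strings, or escape sequences).

```python
import re

MSGID = re.compile(r'^msgid "(.*)"\s*$')
MSGSTR = re.compile(r'^msgstr "(.*)"\s*$')


def fill_msgstrs(lines, transform):
    """Return a copy of lines with each msgstr set to transform(msgid)."""
    out = []
    current_msgid = None
    for line in lines:
        msgid_match = MSGID.match(line)
        if msgid_match:
            current_msgid = msgid_match.group(1)
        elif MSGSTR.match(line) and current_msgid:
            # Skip the header entry, whose msgid is the empty string.
            line = 'msgstr "{}"\n'.format(transform(current_msgid))
        out.append(line)
    return out


catalog = [
    'msgid ""\n',
    'msgstr ""\n',
    "\n",
    'msgid "Hello {0}!"\n',
    'msgstr ""\n',
]
# str.upper stands in for a real pseudo-localization transform.
result = fill_msgstrs(catalog, str.upper)
print("".join(result))
```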

Example usage

Using pypy3:

>>>> from pseudol10nutil import POFileUtil
>>>> pofileutil = POFileUtil()
>>>> input_file = "./testdata/locales/helloworld.pot"
>>>> output_file = "./testdata/locales/eo/LC_MESSAGES/helloworld_pseudo.po"
>>>> pofileutil.pseudolocalizefile(input_file, output_file)
>>>> with open(input_file, mode="r") as fileobj:
....     for line in fileobj:
....         if line.startswith("msgstr"):
....             print(line)
....
msgstr ""

msgstr ""

msgstr ""

>>>> with open(output_file, mode="r") as fileobj:
....     for line in fileobj:
....         if line.startswith("msgstr"):
....             print(line)
....
msgstr ""

msgstr "⟦Ẃȟàť ıš ÿøüȓ ñàɱê?: ﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹⟧"

msgstr "⟦Ȟêĺĺø {0}!﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹⟧"

>>>> from pseudol10nutil import PseudoL10nUtil
>>>> util = PseudoL10nUtil()
>>>> import pseudol10nutil.transforms
>>>> util.transforms = [pseudol10nutil.transforms.transliterate_circled, pseudol10nutil.transforms.pad_length]
>>>> pofileutil.l10nutil = util
>>>> pofileutil.pseudolocalizefile(input_file, output_file)
>>>> with open(output_file, mode="r") as fileobj:
....     for line in fileobj:
....         if line.startswith("msgstr"):
....             print(line)
....
msgstr ""

msgstr "Ⓦⓗⓐⓣ ⓘⓢ ⓨⓞⓤⓡ ⓝⓐⓜⓔ?: ﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹"

msgstr "Ⓗⓔⓛⓛⓞ {0}!﹎ЍאdžᾏⅧ㈴㋹퓛ﺏ𝟘🚦﹎ЍאdžᾏⅧ㈴㋹"

>>>>

License

This is released under an MIT license. See the LICENSE file in this repository for more information.
