Translinguer
Allows writing simple scripts to manage locale/translation files.
The core idea behind this project is:
- Provide a powerful yet flexible, extensible tool able to support arbitrary text types/formats 💪
- Allow writing short and easy-to-use scripts 😊
- It's free! 💃
Here is the simplest usage example:
from translinguer import Translinguer
document = Translinguer().load_from_gsheets(name="My Translations")
document.save_cfg_by_language_page("__path__")
Origin
As a startup founder and indie game developer, I have built many apps with multi-language support and constantly ran into platform-, OS- and framework-specific text file formats that had to be handled manually (which is especially painful for cross-platform games like TacticToy). I found no universal tool that met all my requirements (except paid services), so I wrote separate scripts until I finally had enough (pain and) experience and understanding to create a comprehensive tool like Translinguer.
It was originally developed for the Destiny Garden game and widely enhanced while developing Factorio mods. Now I hope it can benefit other developers on various platforms!
Workflow
My typical translating process consists of these steps:
- (Optionally) import existing raw locale files (json, ini, cfg, csv) and upload to Google Sheets
- Write/update texts in a Google Sheet (with myself, friends and volunteers, or Google Translate and ChatGPT)
- Download and parse texts from the Google Sheet
- (Optionally) Save into local cache
- Export texts to a required format (json, ini, cfg, csv)
But you may have a really different process, and Translinguer will still suit your needs!
Inner data structure
I'm a big fan of typing and robust decomposition! To support all the needs I met, I designed a 5-layer structure: (1) a document consists of (2) pages that consist of (3) sections which consist of (4) entries that consist of (5) texts by language.
1. Translinguer aka Document
- languages:
LanguageList = List[str]
- texts:
Locales = Dict[PageName: str, Page]
2. Page
- name:
str
- sections:
Dict[SectionName: str, Section]
- languages:
Optional[LanguageList]
3. Section
- name:
str
– may be an empty string
- entries:
Dict[key: str, Entry]
4. Entry
- key:
str
- by_language:
Dict[Language: str, text: str]
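To picture how the layers nest, here is a minimal sketch using plain dataclasses. It only mirrors the field names listed above; the class `Document` and all default values are assumptions for illustration, and the real classes in the library may be defined differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    key: str
    by_language: Dict[str, str] = field(default_factory=dict)  # language -> text


@dataclass
class Section:
    name: str  # may be an empty string for the default section
    entries: Dict[str, Entry] = field(default_factory=dict)


@dataclass
class Page:
    name: str
    sections: Dict[str, Section] = field(default_factory=dict)
    languages: Optional[List[str]] = None


@dataclass
class Document:  # hypothetical stand-in for Translinguer itself
    languages: List[str] = field(default_factory=list)
    texts: Dict[str, Page] = field(default_factory=dict)  # page name -> Page


# Build the smallest possible document: one page, one nameless section, one entry.
entry = Entry("greeting", {"Eng": "Hello, world!", "Rus": "Привет, мир!"})
section = Section("", {entry.key: entry})
page = Page("my-texts", {section.name: section}, ["Eng", "Rus"])
doc = Document(["Eng", "Rus"], {page.name: page})

# Walk all five layers: document -> page -> section -> entry -> text by language.
text = doc.texts["my-texts"].sections[""].entries["greeting"].by_language["Rus"]
print(text)
```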
Usage
1. Simple yet verbose example
Most translation file formats support sections and comments. Translinguer allows this even for tables such as CSV and Google Sheets.
Consider the following table:
key | Eng | Rus |
---|---|---|
# This is | a comment | |
greeting | Hello, world! | Привет, мир! |
farewell | Good bye! | Всего хорошего! |
[some-section] | | |
j1.1 | In the beginning was the word | В начале было слово |
Let it be a sheet named "my-texts" in a Google Sheets document. After parsing, it will become a Translinguer document with internal structure looking like this:
{
"languages": ["Eng", "Rus"],
"pages": {
"my-texts": {
"languages": ["Eng", "Rus"],
"sections": {
"": { # Default nameless section
"entries": {
"greeting": {
"Eng": "Hello, world!",
"Rus": "Привет, мир!",
},
"farewell": {
"Eng": "Good bye!",
"Rus": "Всего хорошего!",
},
}
},
"some-section": {
"entries": {
"j1.1": {
"Eng": "In the beginning was the word",
"Rus": "В начале было слово",
},
}
},
}
}
},
}
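The step from table rows to this nested structure could look roughly like the sketch below. It is not the library's parser: `parse_rows` is a hypothetical helper, and only the argument names `comment_prefix`, `section_prefix` and `section_postfix` are taken from the method arguments documented further down.

```python
def parse_rows(rows, comment_prefix="#", section_prefix="[", section_postfix="]"):
    """Turn table rows into {"languages": [...], "sections": {...}}."""
    header, *body = rows
    languages = header[1:]  # the first column holds the keys
    sections = {"": {"entries": {}}}  # default nameless section
    current = sections[""]
    for row in body:
        key = row[0].strip() if row else ""
        if not key or key.startswith(comment_prefix):
            continue  # skip blank rows and whole-row comments
        if key.startswith(section_prefix):
            name = key[len(section_prefix):]
            if section_postfix and name.endswith(section_postfix):
                name = name[: -len(section_postfix)]
            current = sections.setdefault(name, {"entries": {}})
            continue
        texts = row[1:] + [""] * (len(languages) - len(row[1:]))  # pad short rows
        current["entries"][key] = dict(zip(languages, texts))
    return {"languages": languages, "sections": sections}


rows = [
    ["key", "Eng", "Rus"],
    ["# This is", "a comment", ""],
    ["greeting", "Hello, world!", "Привет, мир!"],
    ["farewell", "Good bye!", "Всего хорошего!"],
    ["[some-section]", "", ""],
    ["j1.1", "In the beginning was the word", "В начале было слово"],
]
page = parse_rows(rows)
print(list(page["sections"]))
```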
On converting to CFG files, it will become /en/my-texts.cfg:
greeting=Hello, world!
farewell=Good bye!
[some-section]
j1.1=In the beginning was the word
And /ru/my-texts.cfg:
greeting=Привет, мир!
farewell=Всего хорошего!
[some-section]
j1.1=В начале было слово
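For illustration, a minimal sketch of how one page could be rendered into such CFG text for a single language. `render_cfg` is a hypothetical helper working on the plain dict layout shown earlier; the library's own writer may differ in details such as blank lines or escaping.

```python
def render_cfg(sections, language):
    """Render one page's sections as CFG text for a single language."""
    lines = []
    for name, section in sections.items():
        if name:  # the default nameless section gets no header
            lines.append(f"[{name}]")
        for key, texts in section["entries"].items():
            if language in texts:
                lines.append(f"{key}={texts[language]}")
    return "\n".join(lines) + "\n"


sections = {
    "": {"entries": {
        "greeting": {"Eng": "Hello, world!", "Rus": "Привет, мир!"},
        "farewell": {"Eng": "Good bye!", "Rus": "Всего хорошего!"},
    }},
    "some-section": {"entries": {
        "j1.1": {"Eng": "In the beginning was the word", "Rus": "В начале было слово"},
    }},
}
cfg_en = render_cfg(sections, "Eng")
cfg_ru = render_cfg(sections, "Rus")
print(cfg_en)
```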
The comment and section syntax can easily be changed with method arguments.
Note that only a whole row can be marked as a comment; don't use the comment prefix elsewhere. By the way, Google Sheets provides notes and comments that aren't part of the table content, and I recommend using them.
Here is a small script performing all these things:
from translinguer import Translinguer
document = Translinguer(lang_mapper={
'Eng': 'en',
'Rus': 'ru',
})
document.load_from_gsheets(key="__XYZ__")
print(document.stats)
document.validate(raise_error=True)
document.save_cfg_by_language_page("__output_folder__")
The lang_mapper argument isn't required, but it allows having different language names in source tables and raw/result files, which is more readable and just fancy! ✨
The majority of methods print logs to stdout for a user to know what's going on.
2. Real example
Here is usage for one of my open source Factorio mods:
3. Multi-project setup
You can work with multiple projects in a single document on separate pages with different languages. This may be useful for several small or related sets of texts so that translators can work on them in one place.
TODO: add an example...
4. Customisation
You can easily customise or extend Translinguer with:
- New formats for reading/writing – just have a look at the source code.
- Your own validation logic, similar to the built-in validate method. Example use cases:
  - To check that all entries from your source code are present. You can see it in the real example mentioned above.
  - To ensure some project-specific consistency
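As a rough idea of the first use case, here is a sketch of a check that all keys used in source code exist somewhere in the texts. It assumes the plain dict layout from the usage example ("pages" → "sections" → "entries"); `find_missing_keys` and the key "settings-title" are hypothetical and not part of the library.

```python
def find_missing_keys(document_dict, required_keys):
    """Return required keys that are absent from every page and section."""
    present = set()
    for page in document_dict["pages"].values():
        for section in page["sections"].values():
            present.update(section["entries"])  # collect entry keys
    return sorted(set(required_keys) - present)


document_dict = {
    "languages": ["Eng", "Rus"],
    "pages": {
        "my-texts": {
            "sections": {
                "": {"entries": {"greeting": {"Eng": "Hello, world!", "Rus": "Привет, мир!"}}},
            }
        }
    },
}
# Keys that the (hypothetical) source code refers to.
missing = find_missing_keys(document_dict, ["greeting", "settings-title"])
print(missing)
```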
Install
Already wanna try it out? 😁
pip install Translinguer
# If you also wanna use Google Sheets:
pip install gspread
Docs
Here is a specification of Translinguer class 🧩
Properties
pages: [page_name: str, Page]
– main data storage
entries_number: int
– returns total number of entries
texts_number: int
– returns total number of texts in entries
stats: str
– returns a detailed string to print
General methods
__init__
languages: List[str], optional
– generally there is no need to set this manually
lang_mapper: LangRenamer = Dict[str, str], optional
– allows having different language names in source tables and raw/result files. Defaults to ProxyDict, which stores nothing and simply returns the given key as a value.
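The pass-through default can be pictured with the sketch below. `PassthroughDict` and `output_name` are hypothetical names used only for illustration; the library's actual ProxyDict may be implemented differently.

```python
from collections.abc import Mapping


class PassthroughDict(Mapping):
    """A mapping that stores nothing and returns the requested key itself."""

    def __getitem__(self, key):
        return key

    def __iter__(self):
        return iter(())  # there are no stored keys

    def __len__(self):
        return 0


def output_name(language, lang_mapper=None):
    """Resolve the language name used for raw/result files."""
    mapper = PassthroughDict() if lang_mapper is None else lang_mapper
    return mapper[language]


print(output_name("Eng"))
print(output_name("Eng", {"Eng": "en", "Rus": "ru"}))
```

With no mapper the source name is reused as-is; with an explicit mapping the name from the source table is translated to the one used in file paths.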
validate -> int
Checks each entry for missing language texts. Returns the number of errors and optionally raises an exception.
raise_error: bool = False
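A minimal sketch of such a check over the plain dict layout from the usage example. This free function is an assumption about how the check could work, not the library's code; in particular, raising `ValueError` is a guess, since the exception type isn't documented here.

```python
def validate(document_dict, raise_error=False):
    """Count entries that lack a text for some language of their page."""
    errors = 0
    for page_name, page in document_dict["pages"].items():
        # Fall back to the document languages if the page defines none.
        languages = page.get("languages") or document_dict["languages"]
        for section in page["sections"].values():
            for key, texts in section["entries"].items():
                for language in languages:
                    if not texts.get(language):
                        print(f"{page_name}: '{key}' has no {language} text")
                        errors += 1
    if errors and raise_error:
        raise ValueError(f"{errors} missing text(s)")  # assumed exception type
    return errors


sample = {
    "languages": ["Eng", "Rus"],
    "pages": {
        "my-texts": {
            "sections": {
                "": {"entries": {
                    "greeting": {"Eng": "Hello, world!", "Rus": "Привет, мир!"},
                    "farewell": {"Eng": "Good bye!", "Rus": ""},
                }},
            }
        }
    },
}
error_count = validate(sample)
```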
Text formats
0. Cache (JSON)
Used to store fully serialized document data.
Filename can be specified with an argument, defaults to DEFAULT_CACHE_FILE = 'texts.json'
to_dict -> DocumentDict = Dict[str, Dict[...]]
Converts the whole document into a plain Python object.
from_dict
Loads document data from a dict.
data: DocumentDict = Dict[str, Dict[...]]
save_cache
Saves the document into the self.cache file.
filename: str, optional = DEFAULT_CACHE_FILE
– cache filename
load_cache
Loads the document from the self.cache file.
filename: str, optional = DEFAULT_CACHE_FILE
– cache filename
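Conceptually, the cache is a JSON round trip of the serialized document. The sketch below shows that idea with the standard library only; the free functions here are stand-ins, and the real save_cache / load_cache methods may write the file differently.

```python
import json
import os
import tempfile

DEFAULT_CACHE_FILE = "texts.json"


def save_cache(data, filename=DEFAULT_CACHE_FILE):
    """Write the document dict as UTF-8 JSON."""
    with open(filename, "w", encoding="utf-8") as file:
        # ensure_ascii=False keeps non-Latin texts readable in the file
        json.dump(data, file, ensure_ascii=False, indent=2)


def load_cache(filename=DEFAULT_CACHE_FILE):
    """Read the document dict back from JSON."""
    with open(filename, encoding="utf-8") as file:
        return json.load(file)


data = {
    "languages": ["Eng", "Rus"],
    "pages": {"my-texts": {"sections": {"": {"entries": {
        "greeting": {"Eng": "Hello, world!", "Rus": "Привет, мир!"},
    }}}}},
}
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, DEFAULT_CACHE_FILE)
    save_cache(data, path)
    restored = load_cache(path)
```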
1. GSh (Google Sheets)
load_from_gsheets
Updates the current document with texts from the specified Google Sheets table.
Sections and comments are supported; their syntax can be configured with method arguments.
Note that either name or key must be provided.
name: str, optional
– Google Sheets file name
key: str, optional
– Google Sheets URL key
page_filter: set[str], optional
– parses only the sheets with the specified names
merge_pages: str, optional
– merges sheets into one page with the given name
comment_prefix: str = '#'
– if the first column starts with this, the row is treated as a comment
section_prefix: str = '['
– if the first column starts with this, the row is treated as a section declaration
section_postfix: str = ']'
– section postfix, stripped to clean the section name
2. CSV
to_csv -> str
Exports the document (whole or in part) to a CSV string. This is useful for parsing raw text files in order to upload them into Google Sheets. Sections are supported; their syntax can be configured with method arguments.
lang_mapper: FlexibleRenamer, optional
page_filter: set[str], optional
sections: bool = False
– whether to write sections
section_prefix: str = '['
– if the first column starts with this, it is considered a section declaration
section_postfix: str = ']'
– section postfix to clean its name
delimiter: str = '\t'
– CSV delimiter
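To make the export shape concrete, here is a sketch of flattening the plain dict layout into a tab-delimited string with the standard csv module. It is a simplified stand-in for the method above: lang_mapper and page_filter are left out, and the "key" header name is borrowed from the table example.

```python
import csv
import io


def to_csv(document_dict, sections=False, section_prefix="[",
           section_postfix="]", delimiter="\t"):
    """Flatten a document dict into a delimited string, one row per entry."""
    languages = document_dict["languages"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["key", *languages])
    for page in document_dict["pages"].values():
        for name, section in page["sections"].items():
            if sections and name:  # the nameless section gets no header row
                writer.writerow([f"{section_prefix}{name}{section_postfix}"])
            for key, texts in section["entries"].items():
                writer.writerow([key, *(texts.get(lang, "") for lang in languages)])
    return buffer.getvalue()


csv_source = {
    "languages": ["Eng", "Rus"],
    "pages": {"my-texts": {"sections": {
        "": {"entries": {"greeting": {"Eng": "Hello, world!", "Rus": "Привет, мир!"}}},
        "some-section": {"entries": {"j1.1": {"Eng": "A", "Rus": "Б"}}},
    }}},
}
csv_text = to_csv(csv_source, sections=True)
print(csv_text)
```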
3. INI
save_ini_by_language
The method saves texts into {output_path}/{language}.{ext} files. Note that pages get merged.
output_path: str
lang_mapper: FlexibleRenamer, optional
page_filter: set[str], optional
ext: str = 'ini'
4. CFG
save_cfg_by_language_page
The method saves texts into {output_path}/{language}/{page_name}.cfg files.
output_path: str
lang_mapper: FlexibleRenamer, optional
page_filter: set[str], optional
load_cfg
The method loads texts from {input_path}/{language}/{page_name}.cfg files.
input_path: str
lang_mapper: FlexibleRenamer, optional
Other
Translinguer lib has some inner types, utility objects and functions.
Classes:
EntryDict, SectionDict, PageDict, DocumentDict
– used for serialization, inherited from TypedDict.
Entry, Section, Page
– actual content of TranslinguerBase.
ProxyDict
– stores nothing and simply returns the given key as a value.
Types:
LangRenamer = Mapping[source: str, raw: str]
– mapping to keep different language names for raw and result files, used as an init parameter, may be a ProxyDict.
FlexibleRenamer = Union[None, LangRenamer, Callable[[dict], dict]]
– optionally, the same mapping or a function to adjust it, notably dict_reversed, which is used to parse raw/result files.
PageFilter = Optional[set[str]]
– an alias for a set of page names to pick when exporting texts.
LanguageList = List[str]
– an alias for a list of language names as strings.
Locales = Dict[page_name: str, Page]
– an alias for a dict of Page objects.
ToDo List
General:
- Describe data structure
- Describe existing methods and supported formats
- Publish to PyPI
Core:
- Replace dicts in dicts with lists of dedicated classes
- Allow to have separate languages for pages
- Add unit-tests
- Add CI
- Add Google Translate / ChatGPT API usage
- Allow importing multiple files with different languages (get rid of self.pages = outside of base init)
Formats:
- GSh: add page name mapping
- GSh: add saving
- CSV: add reading
- Export to iOS / Android locale files. With multiple mappings from key to components?
Contributing
Feel free to make PRs! Just follow the guide below. Also, you can join my Discord to discuss anything.
New formats
- Publish a file type only if it belongs to a popular platform or framework; there is no need for project-specific formats.
- When adding support for a new format, follow the type hinting approach of the existing mixin classes and the method naming convention described in the next section.
- "Private" methods should have the file type in their names to avoid collisions.
- Type-specific settings must be passed as method arguments (see ... in the naming convention), not stored in class properties.
- If you add a tabular format, make sure to use the arguments comment_prefix, section_prefix, section_postfix. Use the existing GSh and CSV formats as a reference.
- On document saving/loading (cases 2 & 3 in the naming convention), print these events to the console, similarly to the methods of other formats.
You can use the existing add_TYPE.py files as templates.
Naming convention
This is the naming and signature convention for the public methods of file types.
1. Dealing with files as strings
from_TYPE...(content: str, ...)
to_TYPE...(lang_mapper: FlexibleRenamer, ...) -> str
2. Dealing with files by local machine path
Typically, these deal with multiple raw/result files at once.
load_TYPE...(input_path: str, ...) -> self
save_TYPE...(output_path: str, lang_mapper: FlexibleRenamer, ...)
3. Dealing with external resources like Google Drive
load_from_TYPE...(...) -> self
save_to_TYPE...(..., lang_mapper: FlexibleRenamer)