Pyatus is another localization QA tool.
pyatus
pyatus is another localization QA tool, which is a Python implementation of hiatus.
Some outdated functions have been removed.
Detectable errors
- Glossary: When a glossary source term is found in a source segment, the tool checks if the corresponding glossary target term exists in the target segment. Supports RegExp for advanced matching.
- Search Source or Target Segment (Defined as monolingual): Searches source or target segments exclusively and reports errors if specified text is found. Supports RegExp for advanced matching.
- Inconsistency: Checks for inconsistencies bidirectionally: Source-to-Target and Target-to-Source.
- Numbers: Detects numbers present in the source but missing in the target.
- Length: Flags source and target segments when their lengths differ by more than ±50%.
- Skipped Translation: Reports errors for blank target segments.
- Identical Translation: Reports errors when the source and target segments are identical.
- Alphanumeric Strings in Target but NOT in Source (Defined as unsourced): Effective only when the target language is non-alphabetic (e.g., Japanese, Chinese, Korean).
- Alphanumeric Strings in Source but NOT in Target (Defined as unsourced_rev): Effective only when the source language is non-alphabetic (e.g., Japanese, Chinese, Korean).
- Spell: Spellcheck is conducted using pyspellchecker.
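A few of the simpler checks can be sketched in plain Python. This is a minimal illustration and not pyatus's actual implementation: the function names, the number pattern, and the choice to measure length in characters relative to the source are all assumptions.

```python
import re
from collections import defaultdict

# Assumed number pattern: digit runs, optionally joined by "." or ",".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def missing_numbers(source, target):
    """Return numbers that appear in the source but not in the target."""
    target_numbers = set(NUMBER_RE.findall(target))
    return [n for n in NUMBER_RE.findall(source) if n not in target_numbers]


def length_differs(source, target, tolerance=0.5):
    """True when the target length is outside source length +/- tolerance."""
    if not source:
        return bool(target)
    ratio = len(target) / len(source)
    return abs(ratio - 1.0) > tolerance


def inconsistencies(pairs):
    """Return sources that were translated in more than one way.

    Pass (source, target) pairs for a Source-to-Target check, or
    (target, source) pairs for the Target-to-Source direction.
    """
    seen = defaultdict(set)
    for key, value in pairs:
        seen[key].add(value)
    return {key: values for key, values in seen.items() if len(values) > 1}
```

For example, `missing_numbers("Press 3 times within 10 seconds", "10秒以内に押す")` reports `3`, because `10` is present on both sides.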
Supported Bilingual File Formats
- CSV
- XLSX
Features
- pyatus can automatically convert dictionary forms into possible inflected forms for English (optional).
  Example: Converts write into the RegExp (?:write|writes|writing|wrote|written).
- Simple output report (XLSX) that is easy to filter.
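The idea behind the automatic conversion can be sketched as follows. This is a hedged illustration only: `IRREGULAR`, `inflect`, and `to_pattern` are hypothetical names, the irregular-verb table is deliberately tiny, and rules such as consonant doubling ("run" to "running") are not handled. How pyatus itself derives the forms is not described here.

```python
import re

# Tiny illustrative table; a real tool would need a much larger one.
IRREGULAR = {
    "write": ("wrote", "written"),
}


def inflect(verb):
    """Return candidate forms for an English verb given in dictionary form."""
    # Present participle: drop a trailing "e" before adding "ing".
    ing_form = (verb[:-1] if verb.endswith("e") else verb) + "ing"
    # Third person singular: "es" after sibilant endings, otherwise "s".
    third = verb + ("es" if verb.endswith(("s", "x", "z", "ch", "sh")) else "s")
    if verb in IRREGULAR:
        past, participle = IRREGULAR[verb]
    else:
        past = participle = verb + ("d" if verb.endswith("e") else "ed")
    forms = [verb, third, ing_form, past, participle]
    return list(dict.fromkeys(forms))  # de-duplicate while keeping order


def to_pattern(verb):
    """Build a non-capturing alternation that matches any candidate form."""
    return "(?:" + "|".join(re.escape(form) for form in inflect(verb)) + ")"


print(to_pattern("write"))
```

With the table above, `to_pattern("write")` yields the same expression as the example in the feature list.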
Environment
Python 3.x.x
Installation
pip install pyatus
How to use pyatus?
Fill out the necessary fields in config.yaml.
from pyatus import Pyatus
# Create a Pyatus instance.
# Pyatus(path): path is the file path (str) to the config.yaml file.
p = Pyatus('foo/config.yaml')
# To generate error report -> XLSX file is generated.
p.generate_report()
# Only to read files -> List of file info is returned.
p.read_files()
# Only to output errors -> List of error info is returned.
p.run_checker()
About config.yaml
You can find config.yaml in the sample folder.
# Specify the folder where files you want to check are located, and columns to read.
reader:
  folder_path: python/pyatus/sample/target_files
  source_column: "en_US" # column number (integer starting from 0) or header string. Type ("int" or "str") should be the same as target column.
  target_column: "ja_JP" # column number (integer starting from 0) or header string. Type ("int" or "str") should be the same as source column.
# Specify True for checks you want to run, paths to read glossary and/or monolingual files, and source and target languages for spellcheck.
checker:
  source_lang: "en_US"
  target_lang: "ja_JP"
  glossary: True
  glossary_path: python/pyatus/sample/glossary
  inconsistency_s2t: False
  inconsistency_t2s: True
  skip: True
  identical: False
  spell: False
  monolingual: True
  monolingual_path: python/pyatus/sample/monolingual
  numbers: True
  unsourced: True
  unsourced_rev: False
  length: False
# Specify the path on which error report is generated.
writer:
  output_path: python/pyatus/sample/report
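Before running a check, it can help to sanity-check the configuration. The sketch below works on a plain dictionary shaped like the sample config (as a YAML parser would produce it). `validate_config` is a hypothetical helper, not part of pyatus, and the rule that an enabled glossary or monolingual check needs its path is an assumption drawn from the comments in the sample.

```python
def validate_config(config):
    """Return a list of problems found in a pyatus-style config mapping."""
    problems = []

    reader = config.get("reader", {})
    src = reader.get("source_column")
    tgt = reader.get("target_column")
    # Both columns must be given the same way: both int or both str.
    if type(src) is not type(tgt) or type(src) not in (int, str):
        problems.append(
            "source_column and target_column must both be int or both be str"
        )

    checker = config.get("checker", {})
    # Assumption: a check that reads external files needs its path set.
    for check, path_key in (
        ("glossary", "glossary_path"),
        ("monolingual", "monolingual_path"),
    ):
        if checker.get(check) and not checker.get(path_key):
            problems.append(f"{check} is enabled but {path_key} is missing")

    return problems


sample = {
    "reader": {"source_column": "en_US", "target_column": "ja_JP"},
    "checker": {"glossary": True, "glossary_path": "python/pyatus/sample/glossary"},
}
print(validate_config(sample))
```

An empty list means no problems were found. Mixing a column number with a header string, for instance `source_column: 0` with `target_column: "ja_JP"`, would be reported.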
How to create Glossary file?
Refer to the following instructions and files in the sample folder.
Glossary File Format
Four-column, TAB-delimited text in UTF-8.
Structure
| Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|
| Source | Target | Option | Comment |
| Column | Description |
|---|---|
| Source | Glossary source term. RegExp supported. Required |
| Target | Glossary target term. RegExp supported. Required |
| Option | Conversion option. Required |
| Comment | Comment. Optional |
About Options
Available options are combinations of the following:
| Option | Description |
|---|---|
| i | Ignore case + Auto Conversion |
| z | No Conversion + No RegExp + Case-Insensitive |
| Blank | No Conversion + No RegExp + Case-Sensitive (= As is) |
| # (prefix) | Auto Conversion OFF. When you use your own RegExp, add # at the beginning of the option field (e.g., #i) |
Sample
Server サーバー z
(?:node|nodes) ノード #i ノードの訳に注意
import(?:ing) インポート #i
Japan 日本 JapanはCase-sensitive
run 走る i
(?<!start\-|end\-)point 点 #i Feedback No.2
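To make the file format and the glossary check concrete, here is a minimal sketch of reading one glossary line and applying it to a segment pair. The function names are hypothetical and the option handling is simplified: a plain `i` is treated as a case-insensitive literal, so the Auto Conversion step is left out. It is not pyatus's own code.

```python
import re


def parse_glossary_line(line):
    """Split one TAB-delimited line into (source, target, option, comment)."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (4 - len(fields))  # pad missing trailing columns
    source, target, option, comment = fields[:4]
    return source, target, option, comment


def compile_term(term, option):
    """Compile a term according to its option field (simplified)."""
    # A leading "#" means the term is the user's own RegExp.
    use_regex = option.startswith("#")
    ignore_case = "i" in option or "z" in option
    pattern = term if use_regex else re.escape(term)
    return re.compile(pattern, re.IGNORECASE if ignore_case else 0)


def glossary_errors(entries, source_segment, target_segment):
    """Report entries whose source term is found but whose target term is not."""
    errors = []
    for source, target, option, _comment in entries:
        found_in_source = compile_term(source, option).search(source_segment)
        found_in_target = compile_term(target, option).search(target_segment)
        if found_in_source and not found_in_target:
            errors.append((source, target))
    return errors


entries = [
    parse_glossary_line("Server\tサーバー\tz\t"),
    parse_glossary_line("(?:node|nodes)\tノード\t#i\tnote"),
]
print(glossary_errors(entries, "Restart the server on all nodes",
                      "すべてのノードでサービスを再起動"))
```

Here the `Server` entry is reported because the target segment does not contain サーバー, while the `node` entry passes.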
How to create Monolingual file?
Refer to the following instructions and the files in the sample folder.
Monolingual File Format
Four-column, TAB-delimited text in UTF-8.
Structure
| Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|
| s or t | Expression | Option | Comment |
| Column | Description |
|---|---|
| s or t | Segment to search. 's' is source, 't' is target segment. Required |
| Expression | Search expression. RegExp supported. Required |
| Option | Conversion option. Required |
| Comment | Comment. Optional |
About Options
Available options are combinations of the following:
| Option | Description |
|---|---|
| i | Ignore case + Auto Conversion |
| z | No Conversion + No RegExp + Case-Insensitive |
| Blank | No Conversion + No RegExp + Case-Sensitive (= As is) |
| # (prefix) | Auto Conversion OFF. When you use your own RegExp, add # at the beginning of the option field (e.g., #i) |
Sample
t ； # 全角セミコロン；を使用しない
t [\p{Katakana}ー]・ # カタカナ間の中黒を使用しない
t [０１２３４５６７８９]+ # 全角数字を禁止
s not z 否定文?
t Shared Document #i Windows のファイル パスはローカライズする(共有ドキュメント)。
t [あいうえお] # Hiragana left
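A monolingual rule only looks at one side of the pair. The sketch below shows one way such rules could be applied. `monolingual_hits` is a hypothetical name, the rules are given as already-parsed tuples, and Auto Conversion is again omitted. Note that Python's built-in `re` module does not accept `\p{...}` property classes, so a rule like the katakana one would need the third-party `regex` package or an explicit character range. Which approach pyatus takes is not stated here.

```python
import re


def monolingual_hits(rules, source_segment, target_segment):
    """Return the rules whose expression is found in the selected segment."""
    hits = []
    for side, expression, option, comment in rules:
        # "s" searches the source segment, "t" the target segment.
        text = source_segment if side == "s" else target_segment
        ignore_case = "i" in option or "z" in option
        # A leading "#" means the expression is the user's own RegExp.
        pattern = expression if option.startswith("#") else re.escape(expression)
        if re.search(pattern, text, re.IGNORECASE if ignore_case else 0):
            hits.append((side, expression, comment))
    return hits


rules = [
    ("t", "[０１２３４５６７８９]+", "#", "full-width digits"),
    ("s", "not", "z", "negative sentence?"),
]
print(monolingual_hits(rules, "Do NOT delete", "１０個のファイル"))
```

Both rules fire in this example: the target contains full-width digits, and the source contains "not" when compared case-insensitively.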