Skip to main content

Pyatus is another localization QA tool.

Project description

Pyatus

Pyatus is another localization QA tool, which is a Python implementation of hiatus.
Some outdated functions are removed.

Detectable errors

  • Glossary
    When a glossary source term is found in a source segment, the tool checks if the corresponding glossary target term exists in the target segment. Supports RegExp for advanced matching.

  • Search Source or Target Segment (Defined as monolingual)
    Searches source or target segments exclusively and reports errors if specified text is found. Supports RegExp for advanced matching.

  • Inconsistency
    Checks for inconsistencies bidirectionally: Source-to-Target and Target-to-Source.

  • Numbers
    Detects numbers present in the source but missing in the target.

  • Length
    Flags source and target segments when their lengths differ by more than ±50%.

  • Skipped Translation
    Reports errors for blank target segments.

  • Identical Translation
    Reports errors when the source and target segments are identical.

  • Alphanumeric Strings in Target but NOT in Source (Defined as unsourced)
    Effective only when the target language is non-alphabetic (e.g., Japanese, Chinese, Korean).

  • Alphanumeric Strings in Source but NOT in Target (Defined as unsourced_rev)
    Effective only when the source language is non-alphabetic (e.g., Japanese, Chinese, Korean).

  • Spell
    Spellcheck is conducted using pyspellchecker.

Supported Bilingual File Formats

  • CSV
  • XLSX

Features

  • pyatus can automatically convert dictionary forms into possible active forms for English (optional).
    Example: Converts write into RegExp (?:write|writes|writing|wrote|written).
  • Simple output report (XLS) that is easy to filter.

Environment

Python 3.x.x

Installation

pip install pyatus

How to use pyatus?

Fill out the necessary fields in config.yaml.

from pyatus import Pyatus

p = Pyatus('foo/config.yaml')

# To generate error report -> XLSX file is generated.
p.generate_report()

# Only to read files -> List of file info is returned.
p.read_files()

# Only to output errors -> List of error info is returned.
p.run_checker()

About config.yaml

You can find the simple config.yaml in the sample folder.

# Specify the folder where files you want to check are located, and columns to read.

reader:
  folder_path: python/pyatus/sample/target_files
  source_column: "en_US" # column number (integer starting from 0) or header string. Type ("int" or "str") should be the same as target column.
  target_column: "ja_JP" # column number (integer starting from 0) or header string. Type ("int" or "str") should be the same as source column.


# Specify True for checks you want to run, paths to read glossary and/or monolingual files, and source and target languages for spellcheck.

checker:
  source_lang: "en_US"
  target_lang: "ja_JP"
  glossary: True
  glossary_path: python/pyatus/sample/glossary
  inconsistency_s2t: False
  inconsistency_t2s: True
  skip: True
  identical: False
  spell: False
  monolingual: True
  monolingual_path: python/pyatus/sample/monolingual
  numbers: True
  unsourced: True
  unsourced_rev: False
  length: False

# Specify the path on which error report is generated.

writer:
  output_path: python/pyatus/sample/report

How to create Glossary file?

Refer to the following instructions and files in the sample folder.

Glossary File Format

Four-Column TAB delimited Text in the UTF-8 format.

Structure

Column 1 Column 2 Column 3 Column 4
Source Target Option Comment
Column Description
Source Glossary source term. RegExp supported. Required
Target Glossary target term. RegExp supported. Required
Option Conversion option. Required
Comment Comment. Optional

About Options

Available options are combination of followings

Option Description
i ignore case + Auto Conversion
z No Conversion + No RegExp + Case-Insensitive
Blank No Conversion + No RegExp + Case-Sensitive (= As is)
Prefix #
# Auto Conversion OFF. When you use your own RegExp, add # at the beginning of the option field

Sample

Server	 サーバー	z
(?:node|nodes)	ノード	#i	ノードの訳に注意
import(?:ing)	インポート	#i
Japan	日本		JapanはCase-sensitive
run	走る	i	
(?<!start¥-|end¥-)point	点	#i	Feedback No.2

How to create Monolingual file?

See below and the files in sample folder.

Monolingual File Format

Four-Column TAB delimited Text in the UTF-8 format.

Structure

Column 1 Column 2 Column 3 Column 4
s or t Expression Option Comment
Column Description
s or t Segment to search. 's' is source, 't' is target segment. Required
Expression Search expression. RegExp supported. Required
Option Conversion option. Required
Comment Comment. Optional

About Option

Available options are combination of followings

Option Description
i ignore case + Auto Conversion
z No Conversion + No RegExp + Case-Insensitive
Blank No Conversion + No RegExp + Case-Sensitive (= As is)
Prefix #
# Auto Conversion OFF. When you use your own RegExp, add # at the beginning of the option field

Sample

t	;	#	全角セミコロン;を使用しない
t	[\p{Katakana}ー]・	#	カタカナ間の中黒を使用しない
t	[0123456789]+	#	全角数字を禁止
s	not	z	否定文?
t	Shared Document	#i	Windows のファイル パスはローカライズする(共有ドキュメント)。
t	[あいうえお]	#	Hiragana left

License

Copyright © 2025 Ayumu Hanba (ayumuhamba19<at_mark>gmail.com)
Distributed under the GPL License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyatus-1.0.1.tar.gz (13.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

Pyatus-1.0.1-py3-none-any.whl (15.3 kB view details)

Uploaded Python 3

File details

Details for the file pyatus-1.0.1.tar.gz.

File metadata

  • Download URL: pyatus-1.0.1.tar.gz
  • Upload date:
  • Size: 13.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.1

File hashes

Hashes for pyatus-1.0.1.tar.gz
Algorithm Hash digest
SHA256 42e421a012b16bc7773fcf79e90ae62abbf1aa5f28ab8f28366fc1fc1f2515d0
MD5 1ec442986fe6759441f55627f48df752
BLAKE2b-256 a49061f64d52c53716d33ef4025e684ef0ae00fed4d6ed20b41a831c17f5a30b

See more details on using hashes here.

File details

Details for the file Pyatus-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: Pyatus-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 15.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.1

File hashes

Hashes for Pyatus-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 b287dc2860981bb9dc88ed2dc0f6e8595d759b6f77630657420522724df6dd15
MD5 23b8a1b01c1338d647311f69974b8d47
BLAKE2b-256 3a1f34fe7301a80181311e28ee81c2f54dc564e6c22159c720e12d884656fe5f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page