Skip to main content

Pyatus is another localization QA tool.

Project description

pyatus

pyatus is another localization QA tool, which is a Python implementation of hiatus.
Some outdated functions are removed.

Detectable errors

  • Glossary
    When a glossary source term is found in a source segment, the tool checks if the corresponding glossary target term exists in the target segment. Supports RegExp for advanced matching.

  • Search Source or Target Segment (Defined as monolingual)
    Searches source or target segments exclusively and reports errors if specified text is found. Supports RegExp for advanced matching.

  • Inconsistency
    Checks for inconsistencies bidirectionally: Source-to-Target and Target-to-Source.

  • Numbers
    Detects numbers present in the source but missing in the target.

  • Length
    Flags source and target segments when their lengths differ by more than ±50%.

  • Skipped Translation
    Reports errors for blank target segments.

  • Identical Translation
    Reports errors when the source and target segments are identical.

  • Alphanumeric Strings in Target but NOT in Source (Defined as unsourced)
    Effective only when the target language is non-alphabetic (e.g., Japanese, Chinese, Korean).

  • Alphanumeric Strings in Source but NOT in Target (Defined as unsourced_rev)
    Effective only when the source language is non-alphabetic (e.g., Japanese, Chinese, Korean).

  • Spell
    Spellcheck is conducted using pyspellchecker.

Supported Bilingual File Formats

  • CSV
  • XLSX

Features

  • pyatus can automatically convert dictionary forms into possible active forms for English (optional).
    Example: Converts write into RegExp (?:write|writes|writing|wrote|written).
  • Simple output report (XLS) that is easy to filter.

Environment

Python 3.x.x

Installation

pip install pyatus

How to use pyatus?

Fill out the necessary fields in config.yaml.

from pyatus import Pyatus

# Generate a Pyatus instance.
# p = Pyatus(str)
# str = File path to config.yaml file

p = Pyatus('foo/config.yaml')

# To generate error report -> XLSX file is generated.
p.generate_report()

# Only to read files -> List of file info is returned.
p.read_files()

# Only to output errors -> List of error info is returned.
p.run_checker()

About config.yaml

You can find config.yaml in the sample folder.

# Specify the folder where files you want to check are located, and columns to read.

reader:
  folder_path: python/pyatus/sample/target_files
  source_column: "en_US" # column number (integer starting from 0) or header string. Type ("int" or "str") should be the same as target column.
  target_column: "ja_JP" # column number (integer starting from 0) or header string. Type ("int" or "str") should be the same as source column.


# Specify True for checks you want to run, paths to read glossary and/or monolingual files, and source and target languages for spellcheck.

checker:
  source_lang: "en_US"
  target_lang: "ja_JP"
  glossary: True
  glossary_path: python/pyatus/sample/glossary
  inconsistency_s2t: False
  inconsistency_t2s: True
  skip: True
  identical: False
  spell: False
  monolingual: True
  monolingual_path: python/pyatus/sample/monolingual
  numbers: True
  unsourced: True
  unsourced_rev: False
  length: False

# Specify the path on which error report is generated.

writer:
  output_path: python/pyatus/sample/report

How to create Glossary file?

Refer to the following instructions and files in the sample folder.

Glossary File Format

Four-Column TAB delimited Text in the UTF-8 format.

Structure

Column 1 Column 2 Column 3 Column 4
Source Target Option Comment
Column Description
Source Glossary source term. RegExp supported. Required
Target Glossary target term. RegExp supported. Required
Option Conversion option. Required
Comment Comment. Optional

About Options

Available options are combination of followings

Option Description
i ignore case + Auto Conversion
z No Conversion + No RegExp + Case-Insensitive
Blank No Conversion + No RegExp + Case-Sensitive (= As is)
Prefix #
# Auto Conversion OFF. When you use your own RegExp, add # at the beginning of the option field

Sample

Server	 サーバー	z
(?:node|nodes)	ノード	#i	ノードの訳に注意
import(?:ing)	インポート	#i
Japan	日本		JapanはCase-sensitive
run	走る	i	
(?<!start¥-|end¥-)point	点	#i	Feedback No.2

How to create Monolingual file?

See below and the files in sample folder.

Monolingual File Format

Four-Column TAB delimited Text in the UTF-8 format.

Structure

Column 1 Column 2 Column 3 Column 4
s or t Expression Option Comment
Column Description
s or t Segment to search. 's' is source, 't' is target segment. Required
Expression Search expression. RegExp supported. Required
Option Conversion option. Required
Comment Comment. Optional

About Option

Available options are combination of followings

Option Description
i ignore case + Auto Conversion
z No Conversion + No RegExp + Case-Insensitive
Blank No Conversion + No RegExp + Case-Sensitive (= As is)
Prefix #
# Auto Conversion OFF. When you use your own RegExp, add # at the beginning of the option field

Sample

t	;	#	全角セミコロン;を使用しない
t	[\p{Katakana}ー]・	#	カタカナ間の中黒を使用しない
t	[0123456789]+	#	全角数字を禁止
s	not	z	否定文?
t	Shared Document	#i	Windows のファイル パスはローカライズする(共有ドキュメント)。
t	[あいうえお]	#	Hiragana left

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyatus-1.0.2.tar.gz (13.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

Pyatus-1.0.2-py3-none-any.whl (15.2 kB view details)

Uploaded Python 3

File details

Details for the file pyatus-1.0.2.tar.gz.

File metadata

  • Download URL: pyatus-1.0.2.tar.gz
  • Upload date:
  • Size: 13.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.1

File hashes

Hashes for pyatus-1.0.2.tar.gz
Algorithm Hash digest
SHA256 a9a84564488c395163bfaede3e4d86496cce84ddc7906ad718151524123c6845
MD5 d3a88f3c4c85769f0e526db6fcb5feb4
BLAKE2b-256 7eac934100c655264711cd182b4fdce15162494a17c8d26ab796038e9801c616

See more details on using hashes here.

File details

Details for the file Pyatus-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: Pyatus-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 15.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.1

File hashes

Hashes for Pyatus-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 c2caf69862567ed8e817c8099c8597ad73310e51f04517e1e1b8f73474f1da88
MD5 97b3e947e404786aedac7e9963ec5909
BLAKE2b-256 777a957b45dbd3f774188cb280e5e16af10f0fa330ff7079b63fec670da04f24

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page