Skip to main content

Machine Translation from Chinese, Japanese, or Korean.

Project description

Website: <http://lapislazulitexts.com/sanzang>

GitHub: <https://github.com/yaoguai/sanzang-utils>

Sanzang is a compact and simple cross-platform machine translation system. This system is especially useful for translating from CJK languages (Chinese, Japanese, and Korean). Unlike most other machine translation systems, Sanzang is useful even for ancient and otherwise difficult texts.

Due to its origins in translating texts from the Chinese Buddhist canon, this translation system is called Sanzang (三藏), a literal translation of the Sanskrit word “Tripitaka,” which is a general term for the Buddhist canon. As demonstrated by the Sanzang translator itself:

$ echo 三藏 | szu-t sztab
1.1|三藏
1.2| sānzàng
1.3| tripiṭaka

Anyone can learn how to use Sanzang Utils, and use these programs to read and analyze texts. Unlike other systems, Sanzang is small and approachable. You can easily develop your own translation rules, and these rules are just stored in a text file and applied at runtime.

Components

Sanzang Utils includes the following programs:

  • szu-ed (1) - Command-based translation table editor

  • szu-r (1) - Preprocessor for reformatting CJK text

  • szu-t (1) - The main translation program

  • szu-ss (1) - Case-sensitive string substitution tool

These are all written in Python 3 and available under the MIT License.

Installation

To run these programs, Python 3.x is required. Installation on a Unix-like platform is advised, but Windows is possible too. If you are using Windows, then Cygwin is the best environment for these programs. To install Sanzang Utils, you can call pip3 to download and install it:

# pip3 install sanzang-utils

If you do not have pip3 installed on your system, you can download the Sanzang Utils software manually and then install it using the old method:

# python3 setup.py install

Documentation

After installing Sanzang Utils, you may want to read the introduction and the tutorial. The introduction introduces you to the core concepts and rationale behind the system. The tutorial gives practical instruction on exactly how to use each component, how they relate to one another, and how you can develop your own translation tables.

Sanzang Introduction: <http://lapislazulitexts.com/articles/sanzang_intro>

Sanzang Utils Tutorial: <http://lapislazulitexts.com/articles/szu_tutorial>

In addition to the introduction and tutorial, each program also includes a traditional Unix manual page that can be used as a reference.

Updates

To find out what new fixes and features are available with each new version, you can check the NEWS file, which lists all notable changes according to the version number and date.

Versions of Sanzang Utils follow the scheme N.N.N, indicating the major version, minor version, and the patch number. The patch number is incremented for sets of small updates or bug fixes, the minor number indicates some new feature or new behavior, and the major number indicates a big change or new and incompatible behavior.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sanzang-utils-1.3.3.tar.gz (19.1 kB view details)

Uploaded Source

File details

Details for the file sanzang-utils-1.3.3.tar.gz.

File metadata

File hashes

Hashes for sanzang-utils-1.3.3.tar.gz
Algorithm Hash digest
SHA256 84426c3be8bd79006c2ed48c22092eae96da998fea0e925f652f4d94aec0906b
MD5 edd81f520c5e7e43aef8827fd90c6b4f
BLAKE2b-256 9328ad39a3fa2be57c38e9bf0d98b92c277588d5a238092b64b0101bc08e6969

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page