Identification and conversion functions for Chinese text processing
Dragon Mapper is a Python library that provides identification and conversion functions for Chinese text processing.
Documentation: http://dragonmapper.rtfd.org
Free software: MIT license
Features
Convert between Chinese characters, Pinyin, Zhuyin, and the International Phonetic Alphabet.
Identify a string as Traditional or Simplified Chinese, Pinyin, Zhuyin, or the International Phonetic Alphabet.
>>> import dragonmapper.hanzi
>>> import dragonmapper.transcriptions
>>> s = '我是一个美国人。'
>>> dragonmapper.hanzi.is_simplified(s)
True
>>> dragonmapper.hanzi.to_pinyin(s)
'wǒshìyīgèměiguórén。'
>>> dragonmapper.hanzi.to_pinyin(s, all_readings=True)
'[wǒ][shì/shi/tí][yī][gè/ge/gě/gàn][měi][guó][rén/ren]。'
>>> s = 'Wǒ shì yīgè měiguórén.'
>>> dragonmapper.transcriptions.is_pinyin(s)
True
>>> dragonmapper.transcriptions.pinyin_to_zhuyin(s)
'ㄨㄛˇ ㄕˋ ㄧ ㄍㄜˋ ㄇㄟˇ ㄍㄨㄛˊ ㄖㄣˊ.'
>>> dragonmapper.transcriptions.pinyin_to_ipa(s)
'wɔ˧˩˧ ʂɨ˥˩ i˥ kɤ˥˩ meɪ˧˩˧ kwɔ˧˥ ʐən˧˥.'
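To give a rough sense of what a transcription conversion involves, the sketch below turns a single accented Pinyin syllable into numbered form using only the standard library. It is not Dragon Mapper's implementation: the helper name syllable_to_numbered is hypothetical, it handles one syllable at a time, and writing the neutral tone as 5 is an assumed convention.

```python
import unicodedata

# Combining diacritics used for the four Pinyin tones.
TONE_MARKS = {
    "\u0304": "1",  # macron, first tone
    "\u0301": "2",  # acute, second tone
    "\u030c": "3",  # caron, third tone
    "\u0300": "4",  # grave, fourth tone
}


def syllable_to_numbered(syllable):
    """Convert one accented Pinyin syllable to numbered form.

    Hypothetical helper for illustration only; not part of Dragon Mapper.
    """
    # Decompose so each tone mark becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = "5"  # assumed convention: neutral tone when no mark is present
    letters = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # Recompose so that ü (u + diaeresis) stays a single character.
    return unicodedata.normalize("NFC", "".join(letters)) + tone


if __name__ == "__main__":
    for syllable in ["wǒ", "shì", "yī", "rén", "lǜ", "ma"]:
        print(syllable, syllable_to_numbered(syllable))
```

A full converter also has to split running text into syllables, preserve capitalization, and insert apostrophes where syllable boundaries would otherwise be ambiguous, which is the kind of detail the change log entries below deal with.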
Getting Started
Report bugs and ask questions via GitHub Issues
Refer to the API documentation when you need more technical information
Contribute documentation, code, or feedback
Change Log
0.2.4 (2015-04-08)
Fixes #8. Adds re.UNICODE to transcription conversion.
Fixes misformatted readings for certain characters.
Fixes #7. Fixes incorrect Unihan Database readings for the 'ou' vowel combinations.
0.2.3 (2014-04-28)
Fixes #6. Adds -r suffix syllable to transcription mapping data.
0.2.2 (2014-04-28)
Fixes a capitalization bug related to #5.
0.2.1 (2014-04-28)
Reformats README.rst.
Renames change log file to *.rst.
Adds authors and contributing files.
Sets up Travis CI.
Adds version to __init__.py.
Fixes #5. Makes accented_to_numbered() add apostrophes when needed.
Fixes #4. Fixes numbered_to_accented() handling of 'v' vowel.
Fixes #3. Changes IndexError exception handlers to KeyError.
Fixes #2. Fixes accented_to_numbered() with uppercase accented vowel.
0.2.0 (2014-04-14)
Fixes typo in is_pinyin.
Adds is_pinyin_compatible() and is_zhuyin_compatible() functions.
Removes code for identifying Hanzi and incorporates Hanzi Identifier library.
Removes Sphinx viewcode extension.
Adds Python 3.4 environment to tox configuration.
Fixes typo in setup.py. Fixes #1.
0.1.0 (2014-02-17)
Initial release.