Convert a Chinese sentence to Pinyin or Jyutping
Project description
Python module that converts a Chinese sentence, written in Simplified or Traditional characters, to Mandarin Pinyin or Cantonese Jyutping. Output uses diacritics (accented characters) by default, or tone numbers on request. I designed this library to create Mandarin and Cantonese flashcards.
Want to support my work on this module? Become a supporter: https://www.patreon.com/lucw
Install
$ pip install pinyin_jyutping
Usage
Pinyin
>>> import pinyin_jyutping
>>> p = pinyin_jyutping.PinyinJyutping()
>>> p.pinyin('忘拿一些东西了')[0]
'wàng ná yīxiē dōngxī le'
>>> p.pinyin('忘拿一些东西了', tone_numbers=True)[0]
'wang4 na2 yi1xie1 dong1xi1 le5'
>>> p.pinyin('忘拿一些东西了', tone_numbers=True, spaces=True)[0]
'wang4 na2 yi1 xie1 dong1 xi1 le5'
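The tone-number and diacritic forms shown above carry the same information. As a rough illustration of how one maps to the other, here is a minimal sketch of the standard Hanyu Pinyin tone-mark placement rule (mark `a` or `e` if present, the `o` in `ou`, otherwise the last vowel). The names `TONE_MARKS`, `mark_syllable`, and `mark_sentence` are hypothetical and are not part of this module's API; the library's own conversion may be implemented differently.

```python
# Sketch only: standard Pinyin tone-mark placement, not this library's code.
# Each string holds the marked forms of a vowel for tones 1 to 4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable: str) -> str:
    """Convert one numbered syllable, e.g. 'wang4', to its accented form."""
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no tone number, leave untouched
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in (1, 2, 3, 4):
        return base  # neutral tone (written 5 or 0) carries no mark

    # Placement rule: 'a' or 'e' wins, then the 'o' of 'ou',
    # otherwise the last vowel in the syllable.
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("ou")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return base
        idx = vowels[-1]

    marked = TONE_MARKS[base[idx]][tone - 1]
    return base[:idx] + marked + base[idx + 1:]


def mark_sentence(numbered: str) -> str:
    """Apply mark_syllable to every space-separated syllable."""
    return " ".join(mark_syllable(s) for s in numbered.split())


if __name__ == "__main__":
    print(mark_sentence("wang4 na2 yi1 xie1 dong1 xi1 le5"))
```

This sketch expects one syllable per space-separated token, so it matches the `spaces=True` output style rather than the word-joined one.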
Jyutping
>>> import pinyin_jyutping
>>> j = pinyin_jyutping.PinyinJyutping()
>>> j.jyutping('我出去攞野食')[0]
'ngǒ cēothêoi ló jěsik'
>>> j.jyutping('我出去攞野食', tone_numbers=True)[0]
'ngo5 ceot1heoi3 lo2 je5sik6'
>>> j.jyutping('我出去攞野食', tone_numbers=True, spaces=True)[0]
'ngo5 ceot1 heoi3 lo2 je5 sik6'
How it works
The sentence is first tokenized into words with the Jieba library (https://github.com/fxsjy/jieba). Each word is then converted to Pinyin or Jyutping, either as a whole word or character by character, using the CC-Canto dictionary (http://cantonese.org/about.html). The Jyutping diacritic notation is not standard; it was originally described here: http://www.cantonese.sheik.co.uk/phorum/read.php?1,127274,129006
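The lookup step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: `WORD_READINGS` and `CHAR_READINGS` are tiny hand-written stand-ins for the CC-Canto data, `convert_tokens` is a hypothetical helper, and the token list is written by hand to stand in for Jieba's output (with Jieba installed, tokens could come from `jieba.cut(sentence)`).

```python
# Sketch only: whole-word lookup with a character-by-character fallback.
# Both dictionaries are hypothetical toy data, not the real CC-Canto entries.
WORD_READINGS = {
    "一些": ["yi1", "xie1"],
    "东西": ["dong1", "xi1"],
}
CHAR_READINGS = {
    "忘": "wang4",
    "拿": "na2",
    "了": "le5",
    "一": "yi1",
    "些": "xie1",
    "东": "dong1",
    "西": "xi1",
}


def convert_tokens(tokens, spaces=False):
    """Romanize a list of word tokens.

    Each token is looked up as a whole word first. If it is not found,
    every character is looked up on its own; characters with no known
    reading are passed through unchanged.
    """
    words = []
    for token in tokens:
        syllables = WORD_READINGS.get(token)
        if syllables is None:
            # Fallback: convert character by character.
            syllables = [CHAR_READINGS.get(ch, ch) for ch in token]
        # Syllables of one word are joined directly unless spaces=True.
        separator = " " if spaces else ""
        words.append(separator.join(syllables))
    return " ".join(words)


if __name__ == "__main__":
    # Assumed tokenization of '忘拿一些东西了', written by hand.
    tokens = ["忘", "拿", "一些", "东西", "了"]
    print(convert_tokens(tokens))
    print(convert_tokens(tokens, spaces=True))
```

Looking up whole words first matters because many characters have more than one reading, and the word-level entry usually picks the right one in context; the per-character fallback only guarantees that unknown words still produce some output.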
Download files
Source Distribution
File details
Details for the file pinyin_jyutping-0.5.tar.gz.
File metadata
- Download URL: pinyin_jyutping-0.5.tar.gz
- Upload date:
- Size: 7.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6d58edeea116643acca5c1727294ed3e7bf5ad8744d068594409f1c717e24c60
MD5 | 079ef54e3fe9e762cd0dd1e7e4d089a5
BLAKE2b-256 | 6883378bf49d12e691ac26fff34d989454fc2cb932c52a3c4888002e0fed2a8c