Convert a Chinese sentence to Pinyin or Jyutping

A Python module that converts a Chinese sentence, written in Simplified or Traditional characters, to Mandarin Pinyin or Cantonese Jyutping. By default the output uses diacritics (accented characters). I designed this library to create Mandarin and Cantonese flashcards.

Want to support my work on this module? Become a supporter: https://www.patreon.com/lucw

Install

$ pip install pinyin_jyutping

Usage

Pinyin

Generate the best solution:

>>> import pinyin_jyutping
>>> p = pinyin_jyutping.PinyinJyutping()
>>> p.pinyin('忘拿一些东西了')
'wàng ná yīxiē dōngxī le'
>>> p.pinyin('忘拿一些东西了', tone_numbers=True)
'wang4 na2 yi1xie1 dong1xi1 le5'
>>> p.pinyin('忘拿一些东西了', tone_numbers=True, spaces=True)
'wang4 na2 yi1 xie1 dong1 xi1 le5'
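When building flashcards, it can be handy to split the tone-numbered output into individual syllables. The sketch below is not part of the module: `split_syllables` is a hypothetical helper that only post-processes the string shown above, and it assumes every syllable is a run of letters followed by a single tone digit.

```python
import re

# Hypothetical helper, not part of pinyin_jyutping.
# Assumption: each syllable is a run of letters followed by one tone digit (1-6).
SYLLABLE = re.compile(r"([^\W\d_]+)([1-6])")

def split_syllables(romanized):
    """Return (syllable, tone) pairs from tone-numbered Pinyin or Jyutping."""
    return [(text, int(tone)) for text, tone in SYLLABLE.findall(romanized)]

# String copied from the tone_numbers=True example above
pairs = split_syllables('wang4 na2 yi1xie1 dong1xi1 le5')
print(pairs)
```

Because the pattern ignores whitespace, it should behave the same with or without `spaces=True`.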

Generate all possible solutions:

>>> import pinyin_jyutping
>>> p = pinyin_jyutping.PinyinJyutping()
>>> p.pinyin_all_solutions('忘拿一些东西了')
{'word_list': ['忘', '拿', '一些', '东西', '了'], 'solutions': [['wàng'], ['ná'], ['yīxiē'], ['dōngxī', 'dōngxi'], ['le', 'liǎo', 'liào']]}
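The returned dictionary pairs each entry of `word_list` with a list of alternatives at the same index in `solutions`. As a rough sketch of how this structure could be consumed, the snippet below expands it into every candidate sentence. `candidate_sentences` is a hypothetical helper, and the `result` literal is copied from the output above so the example runs without the module installed.

```python
from itertools import product

# Result copied from the pinyin_all_solutions() example above.
result = {
    'word_list': ['忘', '拿', '一些', '东西', '了'],
    'solutions': [['wàng'], ['ná'], ['yīxiē'], ['dōngxī', 'dōngxi'], ['le', 'liǎo', 'liào']],
}

def candidate_sentences(result):
    """Hypothetical helper: join one alternative per word, for every combination."""
    return [' '.join(choice) for choice in product(*result['solutions'])]

sentences = candidate_sentences(result)
print(len(sentences))  # 1 * 1 * 1 * 2 * 3 = 6 combinations
for sentence in sentences:
    print(sentence)
```

The number of combinations grows multiplicatively with the number of ambiguous words, so for long sentences it may be better to let the user pick per word rather than enumerate everything.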

Jyutping

Generate the best solution:

>>> import pinyin_jyutping
>>> j = pinyin_jyutping.PinyinJyutping()
>>> j.jyutping('我出去攞野食')
'ngǒ cēothêoi ló jěsik'
>>> j.jyutping('我出去攞野食', tone_numbers=True)
'ngo5 ceot1heoi3 lo2 je5sik6'
>>> j.jyutping('我出去攞野食', tone_numbers=True, spaces=True)
'ngo5 ceot1 heoi3 lo2 je5 sik6'

Generate all possible solutions:

>>> import pinyin_jyutping
>>> j = pinyin_jyutping.PinyinJyutping()
>>> j.jyutping_all_solutions('我出去攞野食')
{'word_list': ['我', '出去', '攞', '野食'], 'solutions': [['ngǒ'], ['cēothêoi'], ['ló', 'lō'], ['jěsik', 'jězi', 'jěsit', 'jězik']]}
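For flashcard review, the words worth double-checking are those with more than one alternative. The following sketch filters them out of the structure shown above. `ambiguous_words` is a hypothetical helper, not part of the module, and the `result` literal is copied from the example output.

```python
# Result copied from the jyutping_all_solutions() example above.
result = {
    'word_list': ['我', '出去', '攞', '野食'],
    'solutions': [['ngǒ'], ['cēothêoi'], ['ló', 'lō'], ['jěsik', 'jězi', 'jěsit', 'jězik']],
}

def ambiguous_words(result):
    """Hypothetical helper: map each word with several romanizations to its alternatives."""
    return {
        word: alternatives
        for word, alternatives in zip(result['word_list'], result['solutions'])
        if len(alternatives) > 1
    }

for word, alternatives in ambiguous_words(result).items():
    print(word, alternatives)
```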

How it works

The module uses the Jieba library (https://github.com/fxsjy/jieba) to tokenize the sentence. Each word is then converted to Pinyin/Jyutping, either as a whole or character by character, using the CC-Canto dictionary (http://cantonese.org/about.html). The Jyutping diacritic conversion is not standard; it was originally described here: http://www.cantonese.sheik.co.uk/phorum/read.php?1,127274,129006
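The lookup step described above can be sketched roughly as follows. This is an illustration of the idea, not the module's actual code: `TOY_DICTIONARY`, `romanize_word` and `romanize` are hypothetical names, the dictionary holds a handful of made-up entries rather than CC-Canto data, and the word list is written out by hand where a tokenizer such as Jieba would normally supply it.

```python
# Toy dictionary for illustration only; the real data comes from CC-Canto.
TOY_DICTIONARY = {
    '一些': 'yi1xie1',
    '忘': 'wang4',
    '拿': 'na2',
    '东': 'dong1',
    '西': 'xi1',
    '了': 'le5',
}

def romanize_word(word, dictionary):
    """Look the word up as a whole, else fall back to one character at a time."""
    if word in dictionary:
        return dictionary[word]
    # Assumption of this sketch: characters missing from the dictionary are kept unchanged.
    return ''.join(dictionary.get(char, char) for char in word)

def romanize(words, dictionary):
    """Romanize an already tokenized sentence, separating words with spaces."""
    return ' '.join(romanize_word(word, dictionary) for word in words)

# Hand-written word list standing in for the tokenizer's output
words = ['忘', '拿', '一些', '东西', '了']
print(romanize(words, TOY_DICTIONARY))
```

Here '一些' is found as a whole word, while '东西' is absent from the toy dictionary and is therefore assembled from its two characters.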
