Convert a Chinese sentence to Pinyin or Jyutping

A Python module that converts a Chinese sentence, written in either Simplified or Traditional characters, to Mandarin Pinyin or Cantonese Jyutping. By default the output uses diacritics (accented characters). I designed this library to create Mandarin and Cantonese flashcards.

Want to support my work on this module? Become a supporter: https://www.patreon.com/lucw

Install

$ pip install pinyin_jyutping_sentence

Usage

>>> import pinyin_jyutping_sentence
>>> pinyin_jyutping_sentence.pinyin("提高口语")
'tígāo kǒuyǔ'
>>> pinyin_jyutping_sentence.jyutping("我出去攞野食")
'ngǒ cēothêoi ló jěsik'
# the tone_numbers argument can be used to disable diacritics
>>> pinyin_jyutping_sentence.pinyin("忘拿一些东西了", tone_numbers=True)
'wang4 na2 yi1xie1 dong1xi5 le5'
# the spaces argument adds a space between each syllable
>>> pinyin_jyutping_sentence.pinyin("忘拿一些东西了", tone_numbers=True, spaces=True)
'wang4 na2 yi1 xie1 dong1 xi5 le5'
>>> pinyin_jyutping_sentence.jyutping("有啲好貴", tone_numbers=True)
'jau5 di1 hou3 gwai3'
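
To make the relationship between the two Pinyin output styles concrete, here is a minimal sketch that turns tone-numbered syllables into their diacritic form using the standard Pinyin tone mark placement rules. It is not taken from this library's source: the function name numbered_to_diacritic is hypothetical, and the sketch covers Mandarin Pinyin only, not the non-standard Jyutping diacritics.

```python
import unicodedata

# Combining marks for Mandarin tones 1-4; tone 5 (neutral) carries no mark.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}


def numbered_to_diacritic(syllable):
    """Hypothetical helper: convert e.g. 'wang4' to its accented form."""
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no trailing tone number, leave untouched
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in TONE_MARKS:
        return base  # neutral tone: just drop the number
    lower = base.lower()
    # Standard placement: 'a' or 'e' takes the mark, then the 'o' of 'ou',
    # otherwise the last vowel of the syllable.
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("o")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in "iouü"]
        if not vowels:
            return base
        idx = vowels[-1]
    marked = base[: idx + 1] + TONE_MARKS[tone] + base[idx + 1 :]
    # Compose base letter + combining mark into a single accented character.
    return unicodedata.normalize("NFC", marked)


numbered = "wang4 na2 yi1 xie1 dong1 xi5 le5"
print(" ".join(numbered_to_diacritic(s) for s in numbered.split()))
# prints: wàng ná yī xiē dōng xi le
```

The sketch works on one syllable at a time, so it expects spaced input like the output of the spaces=True example above.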

Changelog

  • v1.1: improve conversion logic for single characters

  • v0.9: removed stdout logging, added tox support

  • v0.8: embed MDBG CC-CEDICT for more accurate Pinyin conversions

  • v0.6: allow converting Traditional characters to Pinyin, and Simplified to Jyutping

Google Sheets add-on

This library is also available as a Google Sheets add-on. You can read about it here: https://medium.com/@lucw/converting-chinese-characters-to-pinyin-or-jyutping-on-google-sheets-eb12cca669cb

How it works

The module uses the Jieba library (https://github.com/fxsjy/jieba) to tokenize the sentence into words. Each word is then converted to Pinyin or Jyutping, either as a whole or character by character, using the CC-Canto dictionary (http://cantonese.org/about.html). The Jyutping diacritic notation is not standard; it was originally described here: http://www.cantonese.sheik.co.uk/phorum/read.php?1,127274,129006
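
The tokenize-then-look-up flow described above can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code: a greedy longest-match function stands in for Jieba so the example runs without dependencies, the two small dictionaries are made-up stand-ins for the real dictionary data, and all names (WORD_READINGS, CHAR_READINGS, tokenize, convert_word, convert_sentence) are hypothetical.

```python
# Toy data standing in for the real dictionaries; entries are illustrative only.
WORD_READINGS = {"提高": "ti2 gao1", "口语": "kou3 yu3"}
CHAR_READINGS = {"提": "ti2", "高": "gao1", "口": "kou3", "语": "yu3", "我": "wo3"}


def tokenize(sentence, vocabulary, max_len=4):
    """Greedy longest-match segmentation (a simple stand-in for Jieba)."""
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first; a single character always matches.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i : i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def convert_word(word):
    """Look the word up as a whole, else fall back to character by character."""
    if word in WORD_READINGS:
        return WORD_READINGS[word].replace(" ", "")
    # Unknown characters (punctuation, Latin text) pass through unchanged.
    return "".join(CHAR_READINGS.get(ch, ch) for ch in word)


def convert_sentence(sentence):
    """Join the syllables of each word, and separate words with spaces."""
    words = tokenize(sentence, WORD_READINGS)
    return " ".join(convert_word(w) for w in words)


print(convert_sentence("提高口语"))  # prints: ti2gao1 kou3yu3
print(convert_sentence("我提高"))    # prints: wo3 ti2gao1
```

Looking up the whole word first matters because many characters have several readings, and a word-level dictionary entry records the reading that applies in that word; the per-character fallback only handles words the dictionary does not list.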

