
Project description


Download and install

PyCantonese is available through pip:

$ pip install -U pycantonese
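
To confirm that the install succeeded, one option is to ask Python which version of the package it can see. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version is hypothetical and not part of PyCantonese.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for illustration; it is not part of PyCantonese.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pycantonese")
    if version is None:
        print("pycantonese is not installed in this environment")
    else:
        print("pycantonese", version, "is installed")
```

If the helper reports that the package is missing, pip may have installed it into a different Python environment than the one running the script.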

Setting up a Development Environment

The latest code under development is available on GitHub at pycantonese/pycantonese. To obtain this version for experimental features or for development:

$ git clone
$ cd pycantonese
$ pip install -r requirements.txt
$ pip install -r dev-requirements.txt
$ python setup.py develop

To run tests:

$ py.test -vv --cov pycantonese pycantonese
$ flake8 pycantonese


Developer: Jackson L. Lee

A talk introducing PyCantonese:

Lee, Jackson L. 2015. PyCantonese: Cantonese linguistic research in the age of big data. Talk at the Childhood Bilingualism Research Centre, Chinese University of Hong Kong. September 15, 2015. Notes+slides

Please also see

Change Log

Please see


MIT License. Please see LICENSE.txt for details.

The HKCanCor dataset included in PyCantonese is substantially modified from its source in terms of format. The original dataset has a CC BY license. Please see pycantonese/data/hkcancor/ for details.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename                            Size      File type  Python version  Upload date
pycantonese-2.2.0-py3-none-any.whl  685.2 kB  Wheel      py3             Jul 1, 2018
pycantonese-2.2.0.tar.gz            621.8 kB  Source     None            Jul 1, 2018
