Cantonese segmentation tool 粵語分詞工具
cantoseg
Install
$ pip install cantoseg
Usage
>>> import cantoseg
>>> cantoseg.cut('香港喺舊石器時代就有人住')
['香港', '喺', '舊石器時代', '就', '有人', '住']
A generator version is also available: cantoseg.lcut.
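As a small follow-on to the usage above, the token sequence can be joined back into a space-delimited string. This is only a sketch: `join_tokens` is a hypothetical helper (not part of cantoseg), and the token list is copied from the documented output rather than produced by calling the package, so the snippet runs without it installed. The helper accepts any iterable of strings, so it should work with either a list or a generator of tokens.

```python
from typing import Iterable


def join_tokens(tokens: Iterable[str], sep: str = " ") -> str:
    """Join segmented tokens with a separator (hypothetical helper, not a cantoseg API)."""
    return sep.join(tokens)


text = "香港喺舊石器時代就有人住"

# Token list copied from the documented output of cantoseg.cut(text) above.
# With the package installed this would be: tokens = cantoseg.cut(text)
tokens = ["香港", "喺", "舊石器時代", "就", "有人", "住"]

# Space-delimited form, convenient for tools that expect whitespace tokenization.
spaced = join_tokens(tokens)

# A generator works as well; in this example the tokens concatenate back to the input.
roundtrip = join_tokens((t for t in tokens), sep="")

print(spaced)  # 香港 喺 舊石器時代 就 有人 住
```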
Design
See the article Cantonese Segmentation and Part-Of-Speech Tagging (in Chinese).
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
cantoseg-0.0.1.tar.gz (3.3 kB)
Built Distribution
cantoseg-0.0.1-py3-none-any.whl (3.1 MB)
File details
Details for the file cantoseg-0.0.1.tar.gz.
File metadata
- Download URL: cantoseg-0.0.1.tar.gz
- Upload date:
- Size: 3.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 82269ecf698dd0010f2b3d759d8e8d0ed1192c9248498830a4ccfc133695a671
MD5 | 7611787ec96127776114b82d67b1f006
BLAKE2b-256 | f851ab8e527a594c7058464464a8560fff37fbf41fd69010d61474ec0f3c2155
File details
Details for the file cantoseg-0.0.1-py3-none-any.whl.
File metadata
- Download URL: cantoseg-0.0.1-py3-none-any.whl
- Upload date:
- Size: 3.1 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | b5229263ca114ba087b980baa9e7edda1e3f5ffb426dd2dbec910e5f008a2539
MD5 | 61c95dbddbc2cf5015bdecd4804f181b
BLAKE2b-256 | dd10dc1dca275bfa12e6c7002cb89b6074d2cb1190345b4e6aab5e7bc14e0a13
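The SHA256 digests listed for the two files can be checked against a local download with Python's standard `hashlib`. The following is a minimal sketch, assuming the files were saved under their original names; `sha256_of` and `verify` are hypothetical helper names introduced here for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash tables above.
EXPECTED_SHA256 = {
    "cantoseg-0.0.1.tar.gz":
        "82269ecf698dd0010f2b3d759d8e8d0ed1192c9248498830a4ccfc133695a671",
    "cantoseg-0.0.1-py3-none-any.whl":
        "b5229263ca114ba087b980baa9e7edda1e3f5ffb426dd2dbec910e5f008a2539",
}


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path) -> bool:
    """Return True only if the file name is listed and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `verify("cantoseg-0.0.1.tar.gz")` would return `True` only when the downloaded file's digest matches the one listed above. Files whose names are not in the table are rejected rather than silently accepted.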