Multi-criteria Cantonese segmentation with dashes, intermediates, pipes, and spaces.

Project description

pydips

Note: This package is still in beta, so future releases may include breaking changes. It currently supports macOS (Apple Silicon) and Linux (x86_64 with AVX, AVX2, and FMA instructions).

Install

pip install pydips

Usage

>>> from pydips import BertModel
>>> model = BertModel()

>>> model.cut('阿張先生嗰時好nice㗎', mode='coarse')
['阿張先生', '嗰時', '好', 'nice', '㗎']

>>> model.cut('阿張先生嗰時好nice㗎', mode='fine')
['阿', '張', '先生', '嗰', '時', '好', 'nice', '㗎']

>>> model.cut('阿張先生嗰時好nice㗎', mode='dips_str')
'阿-張|先生 嗰-時 好 nice 㗎'

>>> model.cut('阿張先生嗰時好nice㗎', mode='dips')
['S', 'D', 'P', 'I', 'S', 'D', 'S', 'S', 'I', 'I', 'I', 'S']
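
The four modes above appear to be different views of the same per-character tagging. The sketch below shows how the 'dips' tags could be turned into the other three outputs in plain Python. It does not call pydips. The tag meanings (D = dash, I = intermediate, P = pipe, S = space) and the helper names tags_to_dips_str and segment are assumptions inferred from the example output and the package description, not taken from the pydips source.

```python
# Separator written before a character for each tag.
# Assumed mapping, inferred from the example output above.
SEPARATORS = {"D": "-", "I": "", "P": "|", "S": " "}


def tags_to_dips_str(text, tags):
    """Join characters, inserting the separator implied by each tag."""
    if len(text) != len(tags):
        raise ValueError("text and tags must have the same length")
    pieces = []
    for index, (char, tag) in enumerate(zip(text, tags)):
        if index > 0:  # no separator before the first character
            pieces.append(SEPARATORS[tag])
        pieces.append(char)
    return "".join(pieces)


def segment(text, tags, boundary_tags):
    """Start a new segment at every character whose tag is in boundary_tags."""
    if len(text) != len(tags):
        raise ValueError("text and tags must have the same length")
    words = []
    for char, tag in zip(text, tags):
        if not words or tag in boundary_tags:
            words.append(char)
        else:
            words[-1] += char
    return words


text = "阿張先生嗰時好nice㗎"
tags = ["S", "D", "P", "I", "S", "D", "S", "S", "I", "I", "I", "S"]

print(tags_to_dips_str(text, tags))        # 阿-張|先生 嗰-時 好 nice 㗎
print(segment(text, tags, {"S"}))          # ['阿張先生', '嗰時', '好', 'nice', '㗎']
print(segment(text, tags, {"D", "P", "S"}))  # ['阿', '張', '先生', '嗰', '時', '好', 'nice', '㗎']
```

Under these assumptions, the coarse segmentation breaks only at S tags, while the fine segmentation breaks at D, P, and S tags. Working from the tags rather than from the 'dips_str' output avoids ambiguity when the input text itself contains a dash, pipe, or space.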

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pydips-0.0.4.tar.gz (3.8 MB)

Uploaded Source

Built Distribution

pydips-0.0.4-py3-none-any.whl (3.8 MB)

Uploaded Python 3

File details

Details for the file pydips-0.0.4.tar.gz.

File metadata

  • Download URL: pydips-0.0.4.tar.gz
  • Upload date:
  • Size: 3.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.0

File hashes

Hashes for pydips-0.0.4.tar.gz
Algorithm     Hash digest
SHA256        c05bfadfac41a620fa28c3015cad5b4b9d54d601936b537a6d02f8eff5e2f2df
MD5           fa601b5046c58288c8524a1f90ecd34d
BLAKE2b-256   00e094bbfc9797b01b5d749d34e025638ea60d011f220e06ab868094462f74a4

File details

Details for the file pydips-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: pydips-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 3.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.0

File hashes

Hashes for pydips-0.0.4-py3-none-any.whl
Algorithm     Hash digest
SHA256        9ff7f2b48fa253c9112b72482573f0185c7c8f2ead14e776e85fe10e76d64f3a
MD5           ab12395f3e25c8530f79473a4a9a0c89
BLAKE2b-256   8e2b7397dc6e8b9707afa6287ab7dd11cb15d8d05d8e52fb78a4832dcd4b8368
