NLP tools

word_segmentation

A Chinese word segmentation algorithm that requires no training corpus.

Usage

from word_segmentation import get_words

# Text to segment: "Beijing has a larger population than Wuhan, but Beijing's weather
# is not as hot as Wuhan's; Wuhan has hot dry noodles, and Beijing has Peking duck."
content = '北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'

# Keep candidate words of at most 2 characters that clear both the
# aggregation (cohesion) and boundary-entropy thresholds.
words = get_words(content, max_word_len=2, min_aggregation=1, min_entropy=0.5)
print(words)
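
The parameter names point at the classic corpus-free segmentation recipe: treat every character n-gram in the input as a candidate word, score it by internal cohesion ("aggregation": how much more often its characters occur together than chance would predict) and by boundary entropy (how varied the characters adjacent to it are), then keep the candidates that clear both thresholds. The sketch below illustrates those two statistics under that assumed interpretation; candidate_words is a hypothetical stand-in, not hellonlp's actual implementation.

import math
from collections import Counter

def candidate_words(text, max_word_len=2, min_aggregation=1.0, min_entropy=0.5):
    # Illustrative sketch of corpus-free word discovery, not hellonlp's code.
    n = len(text)
    # Frequencies of every n-gram up to max_word_len characters.
    counts = Counter(text[i:i + k]
                     for k in range(1, max_word_len + 1)
                     for i in range(n - k + 1))

    def aggregation(word):
        # Cohesion: log ratio of the word's probability to that of its most
        # probable split into two parts (PMI against the easiest decomposition).
        split_prob = max((counts[word[:i]] / n) * (counts[word[i:]] / n)
                         for i in range(1, len(word)))
        return math.log((counts[word] / n) / split_prob)

    def boundary_entropy(word):
        # Entropy of the characters seen immediately left/right of the word;
        # real words appear in varied contexts, so both sides score high.
        def entropy(neighbors):
            total = sum(neighbors.values())
            if total == 0:
                return 0.0
            return -sum(c / total * math.log(c / total)
                        for c in neighbors.values())
        left, right = Counter(), Counter()
        pos = text.find(word)
        while pos != -1:
            if pos > 0:
                left[text[pos - 1]] += 1
            if pos + len(word) < n:
                right[text[pos + len(word)]] += 1
            pos = text.find(word, pos + 1)
        return min(entropy(left), entropy(right))

    return sorted(w for w in counts if len(w) > 1
                  and aggregation(w) >= min_aggregation
                  and boundary_entropy(w) >= min_entropy)

print(candidate_words('北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'))

On the sample sentence this sketch yields the recurring names 北京 and 武汉: both repeat intact and in varied contexts, while incidental pairs such as 汉的 always see the same left neighbor and fail the entropy filter.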

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hellonlp-0.2.41.tar.gz (1.5 MB)

Built Distribution

hellonlp-0.2.41-py3-none-any.whl (1.5 MB, Python 3)
