NLP tools
word_segmentation
A corpus-free Chinese word segmentation algorithm: candidate words are discovered from the statistics of the input text itself, with no dictionary or pre-built training corpus required.
Usage
from word_segmentation import get_words

# Example sentence: "Beijing has a larger population than Wuhan, but Beijing's
# weather is not as hot as Wuhan's; Wuhan has hot dry noodles, and Beijing has
# Peking duck."
content = '北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'

# Keep substrings of at most 2 characters whose aggregation and entropy
# scores clear the given thresholds.
words = get_words(content, max_word_len=2, min_aggregation=1, min_entropy=0.5)
print(words)
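
How get_words decides what counts as a word is not documented here, but the parameter names suggest the classic unsupervised recipe: a substring qualifies as a word when its internal aggregation (how much more often it occurs than its parts would predict) and its boundary entropy (how varied its neighbouring characters are) both clear the thresholds. The following is a minimal sketch of that idea under those assumptions; extract_words is a hypothetical name, not the package's API.

import math
from collections import Counter, defaultdict

def extract_words(text, max_word_len=2, min_aggregation=1.0, min_entropy=0.5):
    """Hypothetical corpus-free word finder: scores every substring of the
    input text by internal aggregation and boundary entropy."""
    n = len(text)

    # Frequency of every substring of length 1..max_word_len.
    freq = Counter(text[i:i + k]
                   for k in range(1, max_word_len + 1)
                   for i in range(n - k + 1))

    def p(s):
        # Relative frequency of s among all windows of the same length.
        return freq[s] / (n - len(s) + 1)

    # Distributions of the characters adjacent to each multi-char candidate.
    left, right = defaultdict(Counter), defaultdict(Counter)
    for k in range(2, max_word_len + 1):
        for i in range(n - k + 1):
            w = text[i:i + k]
            if i > 0:
                left[w][text[i - 1]] += 1
            if i + k < n:
                right[w][text[i + k]] += 1

    def entropy(counter):
        total = sum(counter.values())
        return -sum(c / total * math.log(c / total)
                    for c in counter.values()) if total else 0.0

    words = []
    for w in freq:
        if len(w) < 2:
            continue
        # Aggregation: probability of the whole versus its worst split.
        agg = min(p(w) / (p(w[:i]) * p(w[i:])) for i in range(1, len(w)))
        # Boundary entropy: the less varied side is the binding constraint.
        ent = min(entropy(left[w]), entropy(right[w]))
        if agg >= min_aggregation and ent >= min_entropy:
            words.append(w)
    return words

print(extract_words('北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'))

On the example sentence, repeated strings such as 北京 and 武汉 are exactly the candidates this scoring favours: they occur often relative to their parts and appear in varied contexts, while one-off substrings are filtered out by the zero boundary entropy of a single occurrence.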