NLP tools
Project description
word_segmentation
A Chinese word segmentation algorithm that requires no pre-built corpus.
Usage
```python
from word_segmentation import get_words

# max_word_len caps the length of candidate words; min_aggregation and
# min_entropy are the acceptance thresholds for internal cohesion and
# boundary entropy, respectively.
content = '北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'
words = get_words(content, max_word_len=2, min_aggregation=1, min_entropy=0.5)
print(words)
```
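For intuition about what the two thresholds measure, the sketch below shows the standard unsupervised scoring that the parameter names suggest: every n-gram is scored by its internal cohesion (the minimum pointwise mutual information over all of its binary splits) and by the entropy of the characters flanking its occurrences. This is not the library's implementation; `extract_words`, `_entropy`, and the probability estimates are illustrative assumptions.

```python
import math
from collections import Counter

def _entropy(counter):
    """Shannon entropy of a neighbour-character distribution."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counter.values())

def extract_words(text, max_word_len=2, min_aggregation=1.0, min_entropy=0.5):
    """Hypothetical sketch: keep n-grams whose weakest-split PMI and
    boundary entropy both clear the given thresholds."""
    n = len(text)
    # Frequency of every substring of up to max_word_len characters.
    counts = Counter(
        text[i:i + size]
        for size in range(1, max_word_len + 1)
        for i in range(n - size + 1)
    )

    def prob(s):
        # Relative frequency over all positions where s could start.
        return counts[s] / (n - len(s) + 1)

    words = {}
    for size in range(2, max_word_len + 1):
        for i in range(n - size + 1):
            w = text[i:i + size]
            if w in words:
                continue
            # Aggregation: even the weakest binary split of w must look
            # more "glued together" than chance co-occurrence predicts.
            pmi = min(
                math.log(prob(w) / (prob(w[:k]) * prob(w[k:])))
                for k in range(1, size)
            )
            if pmi < min_aggregation:
                continue
            # Boundary entropy: a real word recurs in varied contexts,
            # so the characters to its left and right are unpredictable.
            left, right = Counter(), Counter()
            for j in range(n - size + 1):
                if text[j:j + size] == w:
                    if j > 0:
                        left[text[j - 1]] += 1
                    if j + size < n:
                        right[text[j + size]] += 1
            if min(_entropy(left), _entropy(right)) >= min_entropy:
                words[w] = round(pmi, 3)
    return words
```

On the example sentence above, this sketch keeps strings that recur next to varied characters, such as 北京 and 武汉, while one-off bigrams fail the boundary-entropy test.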
Download files
Source Distribution
hellonlp-0.2.40.tar.gz (9.6 kB)

Built Distribution
hellonlp-0.2.40-py3-none-any.whl (12.9 kB)
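The file names show the package is published on PyPI as hellonlp, so the usual route is to install it with pip rather than download the distributions manually:

```
pip install hellonlp
```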
Hashes for hellonlp-0.2.40-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | dba65b5e077689430e341bb656d93c4e9789aa9db2d9835a7d203d900e4e9166
MD5 | 800e435bd260871925311c6c80f1937e
BLAKE2b-256 | 0d858298bbd927c17157341546cf8b824fb402395ce4bb0b577952a6bc2d83b9