NLP tools
Project description
word_segmentation
An unsupervised Chinese word segmentation algorithm: it discovers words directly from the input text, with no pre-annotated corpus or dictionary required.
Usage
from word_segmentation import get_words

# "Beijing has more people than Wuhan, but Beijing's weather is not as hot as
# Wuhan's; Wuhan has hot-dry noodles, Beijing has Peking roast duck."
content = '北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'
# Candidate words up to 2 characters, filtered by cohesion and boundary entropy.
words = get_words(content, max_word_len=2, min_aggregation=1, min_entropy=0.5)
print(words)
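
The package's internals aren't shown on this page, but the parameter names point at the classic recipe for corpus-free word discovery: treat every character n-gram in the input as a candidate word, and keep it only if its characters co-occur far more often than chance (aggregation, i.e. internal cohesion) and if the characters flanking its occurrences vary freely (boundary entropy). The following is a minimal sketch of that recipe, not hellonlp's actual code: get_words_sketch and its helpers are hypothetical names, the log-ratio cohesion score is one common choice among several, and a real implementation would normally split the text on punctuation first.

import math
from collections import Counter

def ngram_counts(text, n):
    """Count every character n-gram in the text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def aggregation(word, counts, totals):
    """Cohesion score: log of how much more often the word occurs than
    its least cohesive binary split would predict by chance."""
    p_word = counts[len(word)][word] / totals[len(word)]
    worst = float('inf')
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        p_split = (counts[len(left)][left] / totals[len(left)]) \
                * (counts[len(right)][right] / totals[len(right)])
        worst = min(worst, p_word / p_split)
    return math.log(worst)

def boundary_entropy(word, text, side):
    """Shannon entropy of the characters adjacent to each occurrence of
    the word; high entropy on both sides marks a free-standing unit."""
    neighbours = Counter()
    pos = text.find(word)
    while pos != -1:
        idx = pos - 1 if side == 'left' else pos + len(word)
        if 0 <= idx < len(text):
            neighbours[text[idx]] += 1
        pos = text.find(word, pos + 1)
    n = sum(neighbours.values())
    return -sum(c / n * math.log(c / n) for c in neighbours.values()) if n else 0.0

def get_words_sketch(text, max_word_len=2, min_aggregation=1, min_entropy=0.5):
    """Return every n-gram whose cohesion and boundary entropy clear the thresholds."""
    counts = {n: ngram_counts(text, n) for n in range(1, max_word_len + 1)}
    totals = {n: sum(c.values()) for n, c in counts.items()}
    return [w for n in range(2, max_word_len + 1) for w in counts[n]
            if aggregation(w, counts, totals) >= min_aggregation
            and boundary_entropy(w, text, 'left') >= min_entropy
            and boundary_entropy(w, text, 'right') >= min_entropy]

content = '北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'
print(get_words_sketch(content))  # 北京 and 武汉 both clear the default thresholds

On this sample, 北京 and 武汉 recur with varied neighbouring characters, so they pass both filters, while one-off bigrams such as 烤鸭 fail the entropy test; that is exactly the trade-off the min_aggregation and min_entropy thresholds control.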