NLP tools
Project description
word_segmentation
An unsupervised Chinese word segmentation algorithm that requires no pre-built corpus or dictionary: candidate words are extracted directly from the input text and filtered by word length, internal aggregation, and boundary entropy (the parameters shown below).
Usage
from word_segmentation import get_words

# "Beijing has a larger population than Wuhan, but Beijing's weather is not as
# hot as Wuhan's; Wuhan has hot dry noodles, Beijing has Peking roast duck."
content = '北京比武汉的人口多,但是北京的天气没有武汉的热,武汉有热干面,北京有北京烤鸭'

# max_word_len:    maximum candidate word length, in characters
# min_aggregation: minimum internal aggregation (cohesion) score to keep a candidate
# min_entropy:     minimum left/right boundary entropy to keep a candidate
words = get_words(content, max_word_len=2, min_aggregation=1, min_entropy=0.5)
print(words)
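The package does not document its internals here, but the parameter names suggest the standard corpus-free approach: score every n-gram of the text by internal aggregation (pointwise mutual information across its split points) and by the entropy of the characters adjacent to it, then keep n-grams that clear both thresholds. A minimal sketch of that idea follows; `candidate_words` and `entropy` are illustrative names, not part of hellonlp, and this is not the library's actual implementation.

```python
import math
from collections import Counter

def entropy(counter):
    """Shannon entropy (nats) of a Counter of neighboring characters."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counter.values())

def candidate_words(text, max_word_len=2, min_aggregation=1.0, min_entropy=0.5):
    """Return substrings of `text` that look like words: high internal
    cohesion (minimum PMI over all internal split points) and high
    left/right boundary entropy."""
    n = len(text)
    # Frequencies of every substring up to max_word_len characters.
    freq = Counter(text[i:i + size]
                   for size in range(1, max_word_len + 1)
                   for i in range(n - size + 1))
    words, seen = [], set()
    for size in range(2, max_word_len + 1):
        for i in range(n - size + 1):
            w = text[i:i + size]
            if w in seen:
                continue
            seen.add(w)
            p = freq[w] / n
            # Aggregation: the worst (minimum) PMI over internal splits.
            agg = min(math.log(p / ((freq[w[:k]] / n) * (freq[w[k:]] / n)))
                      for k in range(1, size))
            # Characters seen immediately left/right of each occurrence of w.
            left = Counter(text[j - 1] for j in range(1, n - size + 1)
                           if text[j:j + size] == w)
            right = Counter(text[j + size] for j in range(n - size)
                            if text[j:j + size] == w)
            if (agg >= min_aggregation
                    and min(entropy(left), entropy(right)) >= min_entropy):
                words.append(w)
    return words
```

A word with high aggregation but whose neighbors are always the same character (low boundary entropy) is likely a fragment of a longer word, which is why both filters are applied.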
Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source distribution: hellonlp-0.2.38.tar.gz (9.6 kB)
Built distribution: hellonlp-0.2.38-py2-none-any.whl (13.7 kB)
Hashes for hellonlp-0.2.38-py2-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 989855b4be9e27616666f3c8996dd66d61907f0b8ca078499c7fb0eaa557ef29
MD5 | 8e030ca26d25423bff8fd256e2950793
BLAKE2b-256 | ed7cb2477eec121b4c1e2d67daec3d0389fea5e019f41c0c9379cfd448745a59