A small NLP tool package.
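WuYi (五艺)
WuYi is a simple toolkit for Chinese natural language processing.
Its main planned features are Chinese word segmentation, part-of-speech tagging, sentiment analysis, named entity recognition, relation extraction, keyword extraction, text summarization, new word discovery, and text clustering.
The project is still under development.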
Installation
Install with pip:
pip install WuYi
Chinese word segmentation
from wuyi import BasicTokenizer
tokenizer = BasicTokenizer()
text = "测试中文分词效果。"
tokens = tokenizer.tokenize(text=text)
print(tokens)
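
For readers new to Chinese word segmentation, the sketch below shows one classic dictionary-based approach, forward maximum matching. It is an illustration only and makes no claim about how `BasicTokenizer` works internally; the function name `forward_max_match`, the `max_len` parameter, and the toy dictionary are all assumptions introduced for this example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest dictionary word that matches,
    falling back to a single character when nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only (not shipped with WuYi).
vocab = {"测试", "中文", "分词", "效果"}
print(forward_max_match("测试中文分词效果。", vocab))
# ['测试', '中文', '分词', '效果', '。']
```

Greedy matching is simple and fast, but it cannot resolve ambiguous splits, which is why practical segmenters usually add statistical or neural models on top.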
Evaluation metrics
from wuyi import ROUGE, BLEU
rouge = ROUGE()
bleu = BLEU()
hyp = "简单测试一下五艺的效果。"
ref = "测试是否能够正确输出。"
rouge_score = rouge.get_scores(hyp, ref, avg=True)
print(rouge_score)
bleu_score = bleu.get_scores(hyp, ref)
print(bleu_score)
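
To give a sense of what such scores measure, here is a minimal sketch of character-level ROUGE-1 (unigram overlap) in plain Python. It is not WuYi's implementation and does not reproduce its output format; the function name `rouge_1`, the returned keys `p`, `r`, `f`, and the choice to treat every character (including punctuation) as a token are assumptions made for this example.

```python
from collections import Counter


def rouge_1(hyp, ref):
    """Character-level ROUGE-1 between a hypothesis and a reference string."""
    hyp_counts = Counter(hyp)
    ref_counts = Counter(ref)
    # Clipped overlap: a character counts at most as often as it occurs in both.
    overlap = sum((hyp_counts & ref_counts).values())
    precision = overlap / len(hyp) if hyp else 0.0
    recall = overlap / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"p": precision, "r": recall, "f": f1}


hyp = "简单测试一下五艺的效果。"
ref = "测试是否能够正确输出。"
print(rouge_1(hyp, ref))
```

For this pair the two strings share three characters (测, 试, and the full stop), so precision is 3/12 and recall is 3/11. BLEU is built on the same clipped n-gram counts but reports precision combined with a brevity penalty.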
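Development progress
Chinese word segmentation [not started]
Part-of-speech tagging [not started]
Sentiment analysis [not started]
Named entity recognition [not started]
Relation extraction [not started]
Keyword extraction [not started]
Text summarization [not started]
New word discovery [not started]
Text clustering [started 8.17]
Evaluation metrics [started 8.30]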
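Directory structure
\examples Example code
\wuyi
\clustering Clustering algorithms [unfinished]
kmeans.py K-Means algorithm [unfinished]
\core Core code [unfinished]
\tokenizers Tokenization code [unfinished]
BasicTokenizer.py Basic tokenizer [done]
\metric Metrics code [unfinished]
BLEU.py BLEU metric [done]
ROUGE.py ROUGE metric [done]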