Query tools for Chinese Synonyms

Project description

Chinese-Synonyms

Chinese Synonyms: a Chinese synonym query toolkit for natural language processing and understanding.

cnsyn

"cnsyn": a Python toolkit for querying Chinese synonyms.

GitHub: https://github.com/shangfr/Chinese-Synonyms

Features

  • Synonym lookup
  • Custom dictionary support (removed)
  • Licensed under the Apache License 2.0

Online demo:

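Installation

  • Fully automatic: pip install cnsyn
  • Semi-automatic: download the package from https://pypi.python.org/pypi/cnsyn/ , unpack it, and run python setup.py install
  • Manual: place the cnsyn directory in the current directory or in site-packages, then load it with import cnsyn.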
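
Whichever installation method is used, a quick check that the package is visible to the interpreter can save debugging time later. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical and not part of cnsyn.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if `package_name` can be located on the import path."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # True means that `import cnsyn` should succeed in this environment.
    print(is_installed("cnsyn"))
```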

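Synonym sources

  • 1. wiki: a Chinese synonym list built from Wikipedia (AitSimwords.txt).
  • 2. cndict: a Chinese synonym dictionary (chinese_dictionary.txt).
  • 3. words_id_emb: word vectors obtained from a pretrained PaddleNLP TokenEmbedding model, covering the merged wiki and cndict vocabularies, 129,691 words in total.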

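How queries work

  • Traditional word-based recall: uses an inverted index. When the user enters a query word, it is looked up in the inverted index to retrieve its synonyms.

  • Vector-based semantic recall: uses the KNN BallTree algorithm to find the set of words whose vectors are closest to a given word's vector.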
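
The word-based recall described above can be illustrated with a small inverted index. This is a minimal sketch under assumptions, not cnsyn's actual implementation: the function names `build_inverted_index` and `lookup`, the grouping of synonyms into lists, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(synonym_groups):
    """Map each word to the ids of the synonym groups that contain it."""
    index = defaultdict(set)
    for group_id, group in enumerate(synonym_groups):
        for word in group:
            index[word].add(group_id)
    return index


def lookup(index, synonym_groups, word, top_k=None):
    """Collect the other members of every group that contains `word`."""
    found = []
    # index.get avoids creating an empty entry for unknown words.
    for group_id in sorted(index.get(word, ())):
        for candidate in synonym_groups[group_id]:
            if candidate != word and candidate not in found:
                found.append(candidate)
    return found if top_k is None else found[:top_k]


# Toy data for illustration only; not taken from the cnsyn word lists.
groups = [
    ["垃圾", "废物", "废品"],
    ["高兴", "开心", "快乐"],
]
index = build_inverted_index(groups)
print(lookup(index, groups, "垃圾"))
print(lookup(index, groups, "高兴", top_k=1))
```

A lookup touches only the groups listed for the query word, so its cost does not grow with the size of the whole vocabulary.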

Code example

# encoding=utf-8

import cnsyn

# Look up synonyms (all sources)
word = '垃圾'
cnsyn.search(word)
cnsyn.search(word, topK=3)
# Use the wiki source
cnsyn.search(word, origin='wiki')
# Use the Chinese synonym dictionary source
cnsyn.search(word, origin='cndict')

# Vector-based semantic recall (approximate nearest neighbor search)
cnsyn.anns(word)
cnsyn.anns(word, topK=3)
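
To make the vector-based recall more concrete, the sketch below finds the nearest words by brute force over a handful of made-up vectors. It is a stand-in for illustration and not how cnsyn is implemented: a ball tree returns the same exact neighbors but prunes most of the distance computations on a large vocabulary. The function name `nearest_words`, the Euclidean metric, and the 2-D vectors are all assumptions.

```python
import heapq
import math


def nearest_words(query, embeddings, top_k=3):
    """Return the `top_k` words whose vectors are closest to that of `query`."""
    if query not in embeddings:
        return []
    query_vec = embeddings[query]
    # Pair every other word with its Euclidean distance to the query vector.
    candidates = (
        (math.dist(query_vec, vec), word)
        for word, vec in embeddings.items()
        if word != query
    )
    return [word for _, word in heapq.nsmallest(top_k, candidates)]


# Made-up 2-D vectors for illustration; real embeddings have many more dimensions.
toy_embeddings = {
    "垃圾": (0.9, 0.1),
    "废物": (0.8, 0.2),
    "废品": (0.7, 0.1),
    "高兴": (0.1, 0.9),
}
print(nearest_words("垃圾", toy_embeddings, top_k=2))
```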

Download files

Source Distribution

cnsyn-1.2.0.tar.gz (41.7 MB view details)

Uploaded Source

Built Distribution

cnsyn-1.2.0-py3-none-any.whl (42.7 MB view details)

Uploaded Python 3

File details

Details for the file cnsyn-1.2.0.tar.gz.

File metadata

  • Download URL: cnsyn-1.2.0.tar.gz
  • Upload date:
  • Size: 41.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for cnsyn-1.2.0.tar.gz
Algorithm Hash digest
SHA256 0586b4dc4ba5c4070a62c59e6df3bd4eeca9a576c435f435eadedd3708f90ebe
MD5 e1e5be989753f89c29dbefe71796e229
BLAKE2b-256 9794ecb40802c10dcd92b0790255065aab7bc60def0151092bb00559377b0b79

File details

Details for the file cnsyn-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: cnsyn-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 42.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for cnsyn-1.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bb07b530cd96e6f2e62a27c104cf5ae5802da3365eda3ba495e197d578acc3b5
MD5 1e366edd6f0bb89dc9733bf404b01a47
BLAKE2b-256 35839544c435806c5f4dc45800ea0ab0f5053c64d51348fdaeb1cbd39a33404c
