stopwords-zh


🔥stopwords-zh🔥


Contributions are welcome: help build a shared Chinese stopword list.

Install

pip install -U stopwords-zh

Docs

Usage

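  • source: string, the stopword source. Currently supported values:
    • baidu: Baidu stopword list
    • hit: Harbin Institute of Technology (HIT) stopword list
    • ict: stopword list from the Institute of Computing Technology, Chinese Academy of Sciences
    • scu: stopword list from the Machine Intelligence Laboratory, Sichuan University
    • cn: a widely circulated Chinese stopword list of unknown origin
    • marimo: Chinese stopwords from the Marimo multi-lingual stopwords collection
    • iso: Chinese stopwords from Stopwords ISO
    • all: the union of all the lists above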
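import jieba
from stopwords import stopwords, filter_stopwords

# Segment the sentence into words with jieba, then drop the stopwords.
# The sample sentence means "Contributions are welcome: help build a shared Chinese stopword list."
print(filter_stopwords(jieba.cut('欢迎提交更新,共建中文停用词库')))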
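To make the source option and the "all" union more concrete, here is a minimal, self-contained sketch of the idea. It does not use or reproduce the stopwords-zh implementation: the names TOY_SOURCES, toy_stopwords, and toy_filter are hypothetical, and the two tiny word sets are placeholders for the real lists, which are much larger.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical miniature word lists standing in for the real sources.
TOY_SOURCES: Dict[str, Set[str]] = {
    "baidu": {"的", "了"},
    "hit": {"的", "和"},
}


def toy_stopwords(source: str = "all") -> Set[str]:
    """Return one toy list, or the union of every list when source is 'all'."""
    if source == "all":
        return set().union(*TOY_SOURCES.values())
    if source not in TOY_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return set(TOY_SOURCES[source])


def toy_filter(tokens: Iterable[str], source: str = "all") -> List[str]:
    """Keep only the tokens that are not in the chosen toy stopword list."""
    words = toy_stopwords(source)
    return [tok for tok in tokens if tok not in words]


# Already-segmented tokens, as a tokenizer such as jieba might produce them.
tokens = ["中文", "的", "停用词", "和", "词库"]
print(toy_filter(tokens, "baidu"))  # ['中文', '停用词', '和', '词库']
print(toy_filter(tokens, "all"))    # ['中文', '停用词', '词库']
```

A single source removes only the words in that list, while "all" removes any word found in at least one list, so it filters more aggressively.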

TODO

  • Stopwords
  • Sentiment lexicon

