stopwords-zh
Contributions are welcome: help build a shared Chinese stopword list.
Install
pip install -U stopwords-zh
Docs
Usage
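- source: string, the stopword source. Currently supported:
  - baidu: Baidu stopword list
  - hit: Harbin Institute of Technology (HIT) stopword list
  - ict: stopword list from the Institute of Computing Technology, Chinese Academy of Sciences
  - scu: stopword list from the Machine Intelligence Laboratory, Sichuan University
  - cn: a widely circulated Chinese stopword list of unknown origin
  - marimo: the Chinese stopwords in the Marimo multi-lingual stopwords collection
  - iso: the Chinese stopwords in Stopwords ISO
  - all: the union of all of the above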
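import jieba
from stopwords import stopwords, filter_stopwords

# Segment a sample sentence with jieba, then drop the stopwords from the tokens.
# The sentence reads: "Contributions are welcome: help build a shared Chinese stopword list."
print(filter_stopwords(jieba.cut('欢迎提交更新,共建中文停用词库')))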
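To make the `source` option and the `all` union concrete, here is a minimal, self-contained sketch of the idea. It does not use the package's code: `SOURCES`, `load_stopwords`, and `drop_stopwords` are hypothetical names, and the two tiny word lists are placeholders for the much larger real lists.

```python
# Hypothetical miniature word lists standing in for the real sources.
SOURCES = {
    "baidu": {"的", "了"},
    "hit": {"的", "和"},
}


def load_stopwords(source="all"):
    """Return one source's stopwords, or the union of every source for 'all'."""
    if source == "all":
        return set().union(*SOURCES.values())
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return set(SOURCES[source])


def drop_stopwords(tokens, source="all"):
    """Keep the tokens that are not in the chosen stopword list, in order."""
    words = load_stopwords(source)
    return [t for t in tokens if t not in words]


tokens = ["欢迎", "提交", "更新", "和", "了"]
print(drop_stopwords(tokens, "baidu"))  # only the placeholder baidu list applies
print(drop_stopwords(tokens, "all"))    # union of both placeholder lists applies
```

Choosing a single source keeps filtering conservative, while `all` removes any token that appears in at least one list.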
TODO
- Stopwords
- Sentiment lexicon
Hashes for stopwords-zh-2023.5.4.10.49.5.tar.gz
Algorithm | Hash digest
---|---
SHA256 | dfa77da1b1f392384d1a65901ad4746f6eefc17dc741096225312ce84d073492
MD5 | 94b82cff879520939ccf14fd549dfea0
BLAKE2b-256 | 9969ec4464fd2c45096822351a7cc9ed70fd31988b1651cc3518fea77772ab09
Hashes for stopwords_zh-2023.5.4.10.49.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5a598a8742a85c5cf5e6bb1b6f7796fcbd7404413451b6683bf03503d333948a
MD5 | 9cd7e84b8f9230f6c4b1043f78638224
BLAKE2b-256 | 150935573a3a65ac8af8b289049a8afd0e5e85c312d725533dbbf412fb528fc2