Topic modeling of Chinese documents with BERTopic
About
A package for topic modeling of Chinese documents, built on BERTopic.
Install
$ pip3 install -U bertopic_base_chinese
Directory
- bertopic_base_chinese
- _model.py
_model.py
- The BERTopic class overrides __init__() to set embedding_model to "paraphrase-multilingual-MiniLM-L12-v2", select jieba.lcut as the tokenizer, and initialize the class parameters.
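The configuration described above can be approximated with the upstream bertopic package directly. The following is a minimal sketch, not the package's actual source: the helper name build_chinese_topic_model and the constant EMBEDDING_MODEL are hypothetical, and passing jieba.lcut through a scikit-learn CountVectorizer is an assumption about how the tokenizer is wired in.

```python
# Name of the embedding model, taken from the description above.
EMBEDDING_MODEL = "paraphrase-multilingual-MiniLM-L12-v2"


def build_chinese_topic_model(embedding_model: str = EMBEDDING_MODEL, **kwargs):
    """Return a BERTopic instance configured for Chinese text (illustrative sketch)."""
    # Imported lazily so that defining this helper does not load the heavy dependencies.
    import jieba
    from bertopic import BERTopic
    from sklearn.feature_extraction.text import CountVectorizer

    # jieba.lcut splits a string into a list of words. CountVectorizer accepts any
    # callable as its tokenizer, so topic keywords are built from Chinese words
    # instead of whitespace-separated tokens. token_pattern=None avoids the
    # "token_pattern is unused" warning when a custom tokenizer is supplied.
    vectorizer_model = CountVectorizer(tokenizer=jieba.lcut, token_pattern=None)

    # BERTopic accepts a sentence-transformers model name as embedding_model.
    return BERTopic(
        embedding_model=embedding_model,
        vectorizer_model=vectorizer_model,
        **kwargs,
    )
```

Calling build_chinese_topic_model() requires bertopic, jieba, and scikit-learn to be installed. Extra keyword arguments are forwarded to BERTopic unchanged.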
Usage
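from bertopic_base_chinese import BERTopic
# Two short sample documents; in practice BERTopic needs a much larger corpus to cluster.
docs = ["我爱北京天安门", "我家大门常打开,开放怀抱等你"]
# Uses the package defaults: multilingual MiniLM embeddings and the jieba tokenizer.
topic_model = BERTopic()
# topics holds one topic id per document; probs holds the matching probabilities.
topics, probs = topic_model.fit_transform(docs)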
Contact us
Hashes for bertopic_base_chinese-0.0.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 495acdc81a31a4c4e6dd361538221bb6bc2d2adaf6e3348f09dc1071fff8c26c
MD5 | f235e4785f9e7606b91a2219e3d02259
BLAKE2b-256 | 19a967f3cabab763db3290752e01230f8457faebc298cb70038a7fac7423b9f3