Topic modeling of Chinese documents based on BERTopic

Project description

About

A package for topic modeling of Chinese documents, built on BERTopic.

Install

$ pip3 install -U bertopic_base_chinese

Directory

  • bertopic_base_chinese
    • _model.py

_model.py

  • The BERTopic class overrides __init__() to set embedding_model to "paraphrase-multilingual-MiniLM-L12-v2", select jieba.lcut as the tokenizer, and initialize the class parameters.
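As a small illustration of why a Chinese-aware tokenizer such as jieba.lcut matters here: Chinese text has no spaces between words, so a whitespace-oriented token pattern returns a whole clause as a single token. The sketch below applies the default token pattern of scikit-learn's CountVectorizer using only the standard library. The helper name default_pattern_tokens is hypothetical and is not part of this package.

```python
import re
from typing import List

# Default token pattern of scikit-learn's CountVectorizer:
# runs of two or more word characters between word boundaries.
DEFAULT_TOKEN_PATTERN = r"(?u)\b\w\w+\b"


def default_pattern_tokens(text: str) -> List[str]:
    """Hypothetical helper: tokenize text with the default pattern only."""
    return re.findall(DEFAULT_TOKEN_PATTERN, text)


# English words are separated by spaces, so the pattern finds each word.
english_tokens = default_pattern_tokens("topic modeling")
print(english_tokens)  # ['topic', 'modeling']

# A Chinese sentence contains no spaces, so it comes back as one token.
chinese_tokens = default_pattern_tokens("我爱北京天安门")
print(chinese_tokens)  # ['我爱北京天安门']
```

A word segmenter such as jieba.lcut splits the sentence into individual words instead, which gives the topic representation meaningful terms to count.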

Usage

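from bertopic_base_chinese import BERTopic

# The two sentences below only illustrate the input format;
# fitting generally needs a much larger collection of documents.
docs = ["我爱北京天安门", "我家大门常打开,开放怀抱等你"]
topic_model = BERTopic()
topics, probs = topic_model.fit_transform(docs)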
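For readers who prefer to configure the upstream bertopic package directly, a roughly comparable setup might look like the sketch below. It assumes that bertopic, jieba, and scikit-learn are installed, and that the defaults described under _model.py correspond to passing the embedding model name and a jieba-based CountVectorizer. The function name build_chinese_topic_model is hypothetical, and this is not necessarily how the package implements it.

```python
def build_chinese_topic_model():
    """Hypothetical factory: a BERTopic model configured for Chinese text.

    Assumption: this mirrors the defaults described for _model.py; the
    package's actual implementation may differ.
    """
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    import jieba
    from bertopic import BERTopic
    from sklearn.feature_extraction.text import CountVectorizer

    # Use jieba.lcut to segment Chinese text into words for the
    # bag-of-words step that produces the topic terms.
    vectorizer = CountVectorizer(tokenizer=jieba.lcut)

    return BERTopic(
        embedding_model="paraphrase-multilingual-MiniLM-L12-v2",
        vectorizer_model=vectorizer,
    )
```

Calling build_chinese_topic_model().fit_transform(docs) on a sufficiently large list of documents would then follow the same pattern as the usage above.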

Contact us

may.xiaoya.zhang@gmail.com

Download files

Source Distribution

bertopic_base_chinese-0.0.1.tar.gz (2.3 kB)
