Quick Topic Modeling Toolkit
The quick-topic toolkit lets you quickly analyze and evaluate topic models using a variety of methods.
Functions
- Topic Prevalence Trends Analysis
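To make the idea of a topic prevalence trend concrete, here is a minimal sketch that measures, for each year, the share of tokens that belong to a topic's keyword list. It is an illustration only and not necessarily how quick-topic computes prevalence internally; the function name `topic_prevalence_by_year`, the input shape (pre-tokenized documents paired with a year), and the sample data are all assumptions made for this example.

```python
from collections import Counter


def topic_prevalence_by_year(docs, topics):
    """Hypothetical helper, not part of quick-topic.

    docs:   iterable of (year, tokens) pairs, tokens being a list of words
    topics: dict mapping a topic label to its list of keywords
    Returns {label: {year: share of that year's tokens that are topic keywords}}.
    """
    token_totals = Counter()                       # year -> total token count
    hits = {label: Counter() for label in topics}  # label -> year -> keyword hits
    keyword_sets = {label: set(words) for label, words in topics.items()}

    for year, tokens in docs:
        token_totals[year] += len(tokens)
        for label, keywords in keyword_sets.items():
            hits[label][year] += sum(1 for token in tokens if token in keywords)

    return {
        label: {
            year: hits[label][year] / token_totals[year]
            for year in sorted(token_totals)
            if token_totals[year] > 0
        }
        for label in topics
    }


# Made-up sample data: three tiny, already segmented documents.
docs = [
    (2000, ["投资", "能源", "政府"]),
    (2000, ["煤炭", "电力", "经济"]),
    (2001, ["能源", "石油", "就业", "天气"]),
]
topics = {
    "economics": ["投资", "经济", "就业"],
    "energy": ["能源", "煤炭", "电力", "石油"],
}
prevalence = topic_prevalence_by_year(docs, topics)
print(prevalence)
```

Normalizing by the total token count per year keeps years with more text from dominating the trend; other normalizations (per document, per sentence) are equally plausible choices.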
Usage
Example 1: Topic Prevalence over Time
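from quick_topic.topic_prevalence.main import run_topic_prevalence

# Input data: a metadata CSV file, plus a folder of .txt files whose
# names match the ID field in the CSV file
meta_csv_file = "../datasets/list_company_news_meta.csv"
text_root = r"../datasets/text_data_processed2"
# Keyword lists used for word segmentation
list_keywords_path = [
    "../datasets/keywords/countries.csv",
    "../datasets/keywords/leaders_unique_names.csv",
    "../datasets/keywords/carbon2.csv"
]
# Stop words to remove
stop_words_path = "../datasets/stopwords/hit_stopwords.txt"
# Year range for the analysis
start_year = 2000
end_year = 2021
# Topic labels: economy, energy, public, government
label_names = ['经济主题', '能源主题', '公众主题', '政府主题']
# Economy: investment, financing, economy, rent, government, employment,
# posts, work, occupation, skills
topic_economics = ['投资', '融资', '经济', '租金', '政府', '就业', '岗位', '工作', '职业', '技能']
# Energy: green, emissions, hydrogen energy, bioenergy, natural gas, wind energy,
# oil, coal, electricity, energy, consumption, minerals, fuel, power grid,
# power generation
topic_energy = ['绿色', '排放', '氢能', '生物能', '天然气', '风能', '石油', '煤炭', '电力', '能源', '消耗', '矿产', '燃料', '电网', '发电']
# Public: health, air pollution, household, energy expenditure, behavior, price,
# air emissions, death, cooking, expenditure, renewable, liquefied petroleum gas,
# pollutants, recycling, income, citizens, the public
topic_people = ['健康', '空气污染', '家庭', '能源支出', '行为', '价格', '空气排放物', '死亡', '烹饪', '支出', '可再生', '液化石油气', '污染物', '回收',
                '收入', '公民', '民众']
# Government: security, energy security, oil security, natural gas security,
# electricity security, infrastructure, retail, international cooperation,
# taxation, power grid, exports, power transmission, grid expansion,
# government, economies of scale
topic_government = ['安全', '能源安全', '石油安全', '天然气安全', '电力安全', '基础设施', '零售业', '国际合作', '税收', '电网', '出口', '输电', '电网扩建',
                    '政府', '规模经济']
list_topics = [
    topic_economics,
    topic_energy,
    topic_people,
    topic_government
]
# Run the full topic prevalence analysis
run_topic_prevalence(
    meta_csv_file=meta_csv_file,
    raw_text_folder=text_root,
    save_root_folder="results/target1",
    list_keywords_path=list_keywords_path,
    stop_words_path=stop_words_path,
    start_year=start_year,
    end_year=end_year,
    label_names=label_names,
    list_topics=list_topics
)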
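Because the example expects every ID in the metadata CSV to have a matching text file, it can help to check the inputs before a long run. The following is a minimal sketch of such a check, not part of quick-topic: the helper `find_missing_text_files`, the column name `ID`, the `.txt` extension, and the UTF-8 encoding are assumptions drawn from the comments in the example, and the `title` column and sample files are invented for the demonstration.

```python
import csv
import tempfile
from pathlib import Path


def find_missing_text_files(meta_csv_file, text_root, id_field="ID"):
    """Hypothetical pre-flight check, not part of quick-topic.

    Returns the IDs listed in the metadata CSV for which no
    <ID>.txt file exists in text_root.
    """
    text_root = Path(text_root)
    missing = []
    with open(meta_csv_file, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            doc_id = row[id_field].strip()
            if not (text_root / f"{doc_id}.txt").is_file():
                missing.append(doc_id)
    return missing


# Demonstration on a throwaway dataset: doc2 is listed but has no text file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    meta = root / "meta.csv"
    meta.write_text("ID,title\ndoc1,First\ndoc2,Second\n", encoding="utf-8")
    texts = root / "texts"
    texts.mkdir()
    (texts / "doc1.txt").write_text("sample text", encoding="utf-8")
    missing = find_missing_text_files(meta, texts)

print(missing)
```

With the paths from the example, the call would be `find_missing_text_files(meta_csv_file, text_root)`; an empty list means every metadata row has a text file.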
License
The quick-topic toolkit is provided by Donghua Chen under the MIT License.