Quick Topic Modeling Toolkit

The quick-topic toolkit lets us quickly evaluate topic models using a variety of methods.

Functions

  1. Topic Prevalence Trends Analysis
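
To make the idea of a prevalence trend concrete, here is a minimal sketch of one way to measure it: the share of documents per year that mention at least one keyword of a topic. This is an illustrative assumption, not quick-topic's actual implementation; the function name `keyword_prevalence_by_year` and the sample documents are hypothetical.

```python
from collections import defaultdict


def keyword_prevalence_by_year(docs, topic_keywords):
    """Return {year: share of documents containing any topic keyword}.

    docs is an iterable of (year, text) pairs. This is a hypothetical
    helper for illustration only, not part of quick-topic.
    """
    totals = defaultdict(int)  # number of documents per year
    hits = defaultdict(int)    # documents per year that mention the topic
    for year, text in docs:
        totals[year] += 1
        if any(keyword in text for keyword in topic_keywords):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Made-up sample documents, for demonstration only
docs = [
    (2020, "wind power investment grows"),
    (2020, "local football results"),
    (2021, "coal and wind power prices"),
]
prevalence = keyword_prevalence_by_year(docs, ["wind", "coal"])
print(prevalence)  # {2020: 0.5, 2021: 1.0}
```

Substring matching is used here because it also works for unsegmented text; a real pipeline would more likely match against segmented tokens.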

Usage

Example 1: Topic Prevalence over Time

from quick_topic.topic_prevalence.main import run_topic_prevalence

# Input data: a CSV metadata file, plus a folder of .txt files whose
# names match the ID field in the CSV file
meta_csv_file = "../datasets/list_company_news_meta.csv"
text_root = r"../datasets/text_data_processed2"

# Keyword lists used for word segmentation
list_keywords_path = [
    "../datasets/keywords/countries.csv",
    "../datasets/keywords/leaders_unique_names.csv",
    "../datasets/keywords/carbon2.csv"
]

# Stop words to remove from the text
stop_words_path = "../datasets/stopwords/hit_stopwords.txt"

# Year range for the analysis
start_year = 2000
end_year = 2021

# Topic labels (economy, energy, public, government) and the keywords for each topic
label_names = ['经济主题', '能源主题', '公众主题', '政府主题']
topic_economics = ['投资', '融资', '经济', '租金', '政府', '就业', '岗位', '工作', '职业', '技能']
topic_energy = ['绿色', '排放', '氢能', '生物能', '天然气', '风能', '石油', '煤炭', '电力', '能源', '消耗', '矿产', '燃料', '电网', '发电']
topic_people = ['健康', '空气污染', '家庭', '能源支出', '行为', '价格', '空气排放物', '死亡', '烹饪', '支出', '可再生', '液化石油气', '污染物', '回收',
                '收入', '公民', '民众']
topic_government = ['安全', '能源安全', '石油安全', '天然气安全', '电力安全', '基础设施', '零售业', '国际合作', '税收', '电网', '出口', '输电', '电网扩建',
                    '政府', '规模经济']
list_topics = [
    topic_economics,
    topic_energy,
    topic_people,
    topic_government
]

# Run the full topic prevalence analysis
run_topic_prevalence(
    meta_csv_file=meta_csv_file,
    raw_text_folder=text_root,
    save_root_folder="results/target1",
    list_keywords_path=list_keywords_path,
    stop_words_path=stop_words_path,
    start_year=start_year,
    end_year=end_year,
    label_names=label_names,
    list_topics=list_topics
)
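
The example above expects every ID in the metadata CSV to have a matching text file in the text folder. A quick check of that layout before running the analysis could look like the sketch below. It rests on assumptions not confirmed by the project: the ID column is literally named `ID`, the files use a `.txt` extension, and everything is UTF-8 encoded. The helper `find_missing_texts` is hypothetical and not part of quick-topic.

```python
import csv
import tempfile
from pathlib import Path


def find_missing_texts(meta_csv_file, text_root, id_field="ID"):
    """Return IDs listed in the CSV that have no matching <ID>.txt file.

    Hypothetical helper; the column name and encoding are assumptions.
    """
    root = Path(text_root)
    with open(meta_csv_file, newline="", encoding="utf-8") as f:
        ids = [row[id_field] for row in csv.DictReader(f)]
    return [doc_id for doc_id in ids if not (root / f"{doc_id}.txt").is_file()]


# Demonstration on a throwaway folder with made-up data
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    meta = tmp_path / "meta.csv"
    meta.write_text("ID,title\n1,first\n2,second\n", encoding="utf-8")
    texts = tmp_path / "texts"
    texts.mkdir()
    (texts / "1.txt").write_text("some text", encoding="utf-8")
    missing = find_missing_texts(meta, texts)

print(missing)  # ['2']
```

On real data, the call would be `find_missing_texts(meta_csv_file, text_root)`, with `id_field` set to whatever the ID column is actually called.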

License

The quick-topic toolkit is provided by Donghua Chen under the MIT License.
