大邓

Username    thunderhit
Date joined   Joined

15 projects

cntext

Last released

Chinese text analysis library for word frequency statistics, dictionary expansion, sentiment analysis, similarity, readability, co-occurrence analysis, and social calculation (attitude, prejudice, culture) on texts.

pdfdocx

Last released

Reads PDF and DOCX files and returns the text they contain.

multistop

Last released

Stopword lists for text analysis, supporting 15 languages including Chinese, English, German, and French.

ashares

Last released

A simple API for A-share market quote data.

smartscraper

Last released

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

simtext

Last released

Similarity calculation for texts and documents.

wordexpansion

Last released

bsite

Last released

bsite is a Python package for collecting Bilibili user video list pages and video comment data. https://github.com/thunderhit/bsite

eventextraction

Last released

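Chinese compound event extraction, which can identify patterns in text, including conditional, causal, sequential, and reversal events. The code was originally designed by Liu Huanyong (刘焕勇); the project is at https://github.com/liuhuanyong/ComplexEventExtraction and its introduction is very detailed, so anyone interested should take a look at the original project. I only made simple modifications to the code, adding explanatory function comments and a stats function that can be used to count the distribution (number) of each pattern in a text.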

tidytextpy

Last released

A port of the R tidytext package to Python, giving simple access to functions such as unnest_tokens, get_sentiments, get_stopwords, and bind_tf_idf.

weibo-crawler

Last released

weibo_crawler: the simplest Weibo crawler, suitable for light-duty Weibo data collection.

cnsenti

Last released

Chinese sentiment analysis library (Chinese Sentiment) for emotion analysis and positive/negative sentiment analysis of text.

weiboa

Last released

Collects all posts under a given Weibo topic and all comments on a given Weibo post. https://github.com/thunderhit/weiboa

shreport

Last released

Downloads periodic reports of companies listed on the Shanghai Stock Exchange. Project page: https://github.com/thunderhit/shreport

bar-chart-race-cn

Last released

Fixes the problem of bar_chart_race being unable to display Chinese characters.
