Weibo Scraper

A simple Weibo tweet scraper that crawls Weibo tweets without authorization. The official API has many limitations, so this project relies on the mobile site, which has its own API that can be inspected with Chrome.

Why

Crawl Weibo data for big-data research.
Back up data in the face of Weibo's shameful blockade.

Installation

pip

$ pip install weibo-scraper==1.0.7b1

Or upgrade it:

$ pip install --upgrade weibo-scraper

pipenv

$ pipenv install weibo-scraper==1.0.7b0

Or upgrade it:

$ pipenv update --outdated # show packages which are outdated

$ pipenv update weibo-scraper # just update weibo-scraper

Only Python 3.6+ is supported.

Usage

CLI

$ weibo-scraper -h

usage: weibo-scraper [-h] [-u U] [-p P] [-o O] [-f FORMAT]
                     [-efn EXPORTED_FILE_NAME] [-s] [-d] [--more] [-v]

weibo-scraper-1.0.7-beta 🚀

optional arguments:
  -h, --help            show this help message and exit
  -u U                  username [nickname] which want to exported
  -p P                  pages which exported [ default 1 page ]
  -o O                  output file path which expected [ default 'current
                        dir' ]
  -f FORMAT, --format FORMAT
                        format which expected [ default 'txt' ]
  -efn EXPORTED_FILE_NAME, --exported_file_name EXPORTED_FILE_NAME
                        file name which expected
  -s, --simplify        simplify available info
  -d, --debug           open debug mode
  --more                more
  -v, --version         weibo scraper version
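
When the CLI is driven from a script, it can help to assemble the argument list in one place. The helper below is a minimal sketch that uses only flags listed in the help output above. The function name `build_weibo_scraper_command` and the `./exports` directory are hypothetical, and the commented-out `subprocess.run` call assumes `weibo-scraper` is installed and on the `PATH`.

```python
import shlex
from typing import List, Optional


def build_weibo_scraper_command(
    username: str,
    pages: int = 1,
    output_dir: Optional[str] = None,
    fmt: str = "txt",
    simplify: bool = False,
) -> List[str]:
    """Build an argv list for the weibo-scraper CLI (hypothetical helper)."""
    # -u, -p and -f come from the help text; 1 page and 'txt' are the
    # defaults documented there.
    command = ["weibo-scraper", "-u", username, "-p", str(pages), "-f", fmt]
    if output_dir is not None:
        # Leaving -o out lets the CLI fall back to the current directory.
        command += ["-o", output_dir]
    if simplify:
        command.append("-s")
    return command


command = build_weibo_scraper_command("嘻红豆", pages=2, output_dir="./exports")
# Print a copy-pasteable shell line; shlex.quote protects unusual characters.
print(" ".join(shlex.quote(part) for part in command))

# To actually run it (needs network access and an installed CLI):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with non-ASCII nicknames.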

API

First, you can get a Weibo profile by name or uid.

>>> from weibo_scraper import get_weibo_profile
>>> weibo_profile = get_weibo_profile(name='来去之间')
>>> # ...

The response is of type weibo_base.UserMeta and includes the following fields:

field	description	type	sample
id	user ID	str
screen_name	Weibo nickname	Option[str]
avatar_hd	HD avatar	Option[str]	'https://ww2.sinaimg.cn/orj480/4242e8adjw8elz58g3kyvj20c80c8myg.jpg'
cover_image_phone	mobile cover image	Option[str]	'https://tva1.sinaimg.cn/crop.0.0.640.640.640/549d0121tw1egm1kjly3jj20hs0hsq4f.jpg'
description	description	Option[str]
follow_count	following count	Option[int]	3568
follower_count	follower count	Option[int]	794803
gender	gender	Option[str]	'm'/'f'
raw_user_response	raw response	Option[dict]
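
Because most of these fields are optional, code that reads a profile should tolerate `None`. The following is a minimal sketch of that pattern, not part of the library: `summarize_profile` is a hypothetical helper, and a `SimpleNamespace` stands in for a real `weibo_base.UserMeta` so the example runs without network access. The `id` value is made up; the counts are the sample values from the table.

```python
from types import SimpleNamespace


def summarize_profile(profile) -> str:
    """Return a one-line summary, tolerating fields that are None."""
    name = profile.screen_name or "(no screen name)"
    followers = profile.follower_count if profile.follower_count is not None else "?"
    following = profile.follow_count if profile.follow_count is not None else "?"
    return f"{name} (id={profile.id}): {followers} followers, {following} following"


# Stand-in object with the attribute names listed in the table.
sample_profile = SimpleNamespace(
    id="0000000000",          # made-up id, for illustration only
    screen_name="来去之间",
    follow_count=3568,
    follower_count=794803,
    gender=None,
)
print(summarize_profile(sample_profile))
```

The explicit `is not None` checks matter for the counts, since a legitimate value of `0` would otherwise be treated as missing.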

Second, you can get Weibo tweets via tweet_container_id. This approach is rarely needed, but it works well.

>>> from weibo_scraper import get_weibo_tweets
>>> for tweet in get_weibo_tweets(tweet_container_id='1076033637346297', pages=1):
...     print(tweet)
>>> # ...

Of course, you can also get raw Weibo tweets by the nickname of an existing user. The pages parameter is optional.

>>> from weibo_scraper import get_weibo_tweets_by_name
>>> for tweet in get_weibo_tweets_by_name(name='嘻红豆', pages=1):
...     print(tweet)
>>> # ...

To get all tweets, set the pages parameter to None:

>>> from weibo_scraper import get_weibo_tweets_by_name
>>> for tweet in get_weibo_tweets_by_name(name='嘻红豆', pages=None):
...     print(tweet)
>>> # ...
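
With pages=None the loop can run for a long time on an account with many tweets. If the function yields tweets lazily (an assumption; the examples here only show it being iterated with `for`), `itertools.islice` can cap how many items are consumed. The sketch below uses a hypothetical generator, `fake_tweets`, in place of `get_weibo_tweets_by_name(..., pages=None)` so that it runs offline; `take` is likewise a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def take(items: Iterable[T], limit: int) -> Iterator[T]:
    """Yield at most `limit` items, then stop consuming the source."""
    return islice(items, limit)


def fake_tweets() -> Iterator[dict]:
    """Hypothetical stand-in for an endless stream of raw tweets."""
    index = 0
    while True:  # never ends on its own, like a very long timeline
        yield {"index": index}
        index += 1


# Only the first three items are ever requested from the generator.
for tweet in take(fake_tweets(), 3):
    print(tweet)
```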

There has been a giant update since 1.0.6 🍰!

You can also get formatted tweets via weibo_scraper.get_formatted_weibo_tweets_by_name:

>>> from weibo_scraper import get_formatted_weibo_tweets_by_name
>>> result_iterator = get_formatted_weibo_tweets_by_name(name='嘻红豆', pages=None)
>>> for user_meta in result_iterator:
...     if user_meta is not None:
...         for tweet_meta in user_meta.cards_node:
...             print(tweet_meta.mblog.text)
>>> # ...
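
The nested loop above can be wrapped in a small function that returns the tweet texts as a list. This is a sketch only: `collect_tweet_texts` and `make_card` are hypothetical names, and the `SimpleNamespace` objects merely imitate the `cards_node` / `mblog.text` attribute path used in the example, so the code runs without calling Weibo.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_tweet_texts(result_iterator: Iterable) -> List[str]:
    """Flatten formatted results into a list of tweet texts."""
    texts = []
    for user_meta in result_iterator:
        if user_meta is None:
            # Mirror the None check from the example above.
            continue
        for tweet_meta in user_meta.cards_node:
            texts.append(tweet_meta.mblog.text)
    return texts


def make_card(text: str) -> SimpleNamespace:
    """Build a stand-in object exposing .mblog.text (illustration only)."""
    return SimpleNamespace(mblog=SimpleNamespace(text=text))


fake_results = [
    SimpleNamespace(cards_node=[make_card("first"), make_card("second")]),
    None,
    SimpleNamespace(cards_node=[make_card("third")]),
]
print(collect_tweet_texts(fake_results))
```

With real data, the same function would be called as `collect_tweet_texts(result_iterator)`.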

Weibo Flasgger

Weibo Flasgger is web API documentation for weibo-scraper, powered by flasgger.

P.S

Inspired by Twitter-Scraper.
For 'XIHONGDOU'.
Forks are welcome.

LICENSE

MIT

Release history

version	date	status
1.0.7rc1	Jun 6, 2021	pre-release
1.0.7rc1.dev3	Nov 26, 2022	pre-release
1.0.7rc1.dev2	Nov 26, 2022	pre-release
1.0.7rc1.dev1	Jun 6, 2021	pre-release
1.0.7b1	Jun 6, 2021	pre-release (this version)
1.0.7b0	Dec 19, 2018	pre-release
1.0.6	Jun 9, 2018
1.0.4	May 21, 2018
1.0.2	May 10, 2018

Download files

Source Distribution

weibo-scraper-1.0.7b1.tar.gz (20.4 kB)

Uploaded Jun 6, 2021 Source

Built Distribution

weibo_scraper-1.0.7b1-py2.py3-none-any.whl (25.3 kB)

Uploaded Jun 6, 2021 Python 2 Python 3

Hashes for weibo-scraper-1.0.7b1.tar.gz
Algorithm	Hash digest
SHA256	`6058a23cc2164d247d834e25dcbd6e5a686964e38e95cb854764fabeea8d5702`
MD5	`815dd0de8ae80c68fc8f16d3cba2d616`
BLAKE2b-256	`c9be29f9bb8c9d61b54db042c1e5b0ba52be508e750b35ae130bbf914ff0b1c1`

Hashes for weibo_scraper-1.0.7b1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`4f433a73b44d62fad3285e107757647cfe5935d703e635594c9be6e1b03b54ba`
MD5	`59250744f4616eb7975b556a7b1c9d33`
BLAKE2b-256	`217ec8d70e5de7ce00f3b9aa9d1b760a6b0ec72b9e494cdb05b759394261b31d`