Skip to main content

Weibo trending posts scraper

Project description

Weibo trending posts scraper

Scrap trending posts from Weibo front page.

Usage guide

As a library

Install

pip install weibo-trending

Get and parse posts

from weibo_trending import get_new_posts, parse_response

resp = get_new_posts()
mblogs = parse_response(resp)
for mblog in mblogs:
    print(mblog)

Get and parse comments

from weibo_trending import get_raw_comments, parse_comments

example_post_id = "4840673008419178"
raw_cmt = get_raw_comments(example_post_id)
comments = parse_comments(raw_cmt)
for c in comments:
    print(c)

As a command line tool

Install

pip install weibo-trending

Usage

python -m weibo_trending --help

    usage: weibo_trending [-h] [-d DIR] [-s]

    Scrape and parse Weibo trending posts.

    optional arguments:
    -h, --help          show this help message and exit
    -d DIR, --dir DIR   specify the output directory. Defaults to the current working directory
    -s, --skip-parsing  whether to skip parsing and dump the raw JSON response from Weibo


python -m weibo_trending

weibo_trending will save each scraped post with the following filename format:

  • weibo_<user ID>_<post ID>.json
  • Example: weibo_1631153043_4834313265233660.json

Each call to weibo_trending usually saves 10 new files. If you get fewer than 10, that means the response contains one or more deleted posts. They are not saved.

Develop

git clone https://github.com/ericlingit/weibo-trending.git
cd weibo-trending
python3 -m venv venv
source venv/bin/activate
pip install -U pip wheel
pip install -r requirements.txt
pip install -e .
pytest

Package

python -m build --wheel

The built wheel is in ./dist/.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

weibo_trending-0.0.6.tar.gz (49.8 kB view hashes)

Uploaded Source

Built Distribution

weibo_trending-0.0.6-py3-none-any.whl (17.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page