
Scraping Weibos

Project description

Setup

  1. Install the PostgreSQL database:
    brew install postgresql
    brew services start postgresql
    
  2. Create a database:
    createdb your_database_name
    
  3. Write the configuration:
    from sinaspider import config
    # write the configuration
    config(
       account_id='your account id',  # your Weibo account id
       database_name='your_database_name',  # weibos and user info are saved to this database
       write_xmp=True  # write weibo metadata into image files (optional; requires ExifTool)
    )
    # read the configuration back
    config()
    >>> ConfigObj({'database_name': 'sina_test', 'write_xmp': 'True', 'account_id': '6619193364'})
    
  4. Set the cookie:
    import keyring
    cookie = '...your cookie from m.weibo.cn...'  # a cookie for the m.weibo.cn site is required
    keyring.set_password('sinaspider', 'cookie', cookie)
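The cookie stored above is the raw `Cookie` header copied from the browser's developer tools. As an illustration (this helper is not part of sinaspider), such a header string can be split into a dict before being attached to HTTP requests:

```python
def parse_cookie_header(header: str) -> dict:
    """Split a raw Cookie header ("k1=v1; k2=v2") into a dict."""
    cookies = {}
    for pair in header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies

# hypothetical cookie values, for illustration only
cookies = parse_cookie_header("SUB=abc123; SSOLoginState=1660000000")
print(cookies)  # {'SUB': 'abc123', 'SSOLoginState': '1660000000'}
```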
    

Quick Start

Add the accounts you follow to the config list:

owner = Owner()
for following in owner.following():
    UserConfig(following)

Read the configured users back:

>>> for user_config in UserConfig.yield_config_user():
...     print(user_config)
...     break
# enable all config options
>>> user_config.toggle_all()
Fetch Weibo: True
Fetch Retweet: True
Download Media: True
Fetch following: True
# fetch all weibos
>>> user_config.fetch_weibo()
Fetching Retweet: True
Media Saving: ~/Downloads/sinaspider
Update Config: True
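The `toggle_all` behaviour above can be pictured with a small stand-in class (hypothetical names, not sinaspider's actual implementation):

```python
from dataclasses import dataclass, fields

@dataclass
class FakeUserConfig:
    """Stand-in for sinaspider's per-user config flags (illustrative only)."""
    weibo_fetch: bool = False
    retweet_fetch: bool = False
    media_download: bool = False
    following_fetch: bool = False

    def toggle_all(self, value: bool = True) -> None:
        # flip every flag at once, as toggle_all() does in the session above
        for f in fields(self):
            setattr(self, f.name, value)

cfg = FakeUserConfig()
cfg.toggle_all()
print(cfg.weibo_fetch, cfg.media_download)  # True True
```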

Each user exposes the following config options:

  1. weibo_fetch: whether to download weibos
  2. weibo_since: only fetch weibos posted after this date (defaults to 1970-01-01, i.e. fetch everything)
  3. retweet_fetch: whether to download retweeted weibos
  4. media_download: whether to download images and videos
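The `weibo_since` option amounts to a simple date cutoff. A sketch with made-up post data (the dict keys are assumptions, not sinaspider's schema):

```python
from datetime import date

def filter_since(posts, since=date(1970, 1, 1)):
    """Keep only posts created on or after `since` (the weibo_since default)."""
    return [p for p in posts if p["created_at"] >= since]

posts = [
    {"id": 1, "created_at": date(2021, 5, 1)},
    {"id": 2, "created_at": date(2019, 3, 2)},
]
# with the default since, everything passes; a 2020 cutoff keeps only id 1
print(filter_since(posts, since=date(2020, 1, 1)))
```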

Saving and Downloading Weibos

User

Fetch user info:

>>> from sinaspider import User
>>> uid = 6619193364  # the user id
>>> user = User(uid)

Weibo pages can be fetched via user.weibos; see get_weibo_pages for the full parameter list.

# fetch all weibos from page 3 through page 10 and save files to `path/to/download`
weibos = user.weibos(retweet=True, start_page=3, end_page=10,
                     download_dir='path/to/download')
# return the next weibo
next(weibos)
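The `weibos` call returns a lazy generator, which is why `next(weibos)` yields one weibo at a time. Its page-range behaviour can be sketched like this, where `fetch_page` is a made-up stand-in for the network call:

```python
def iter_weibos(fetch_page, start_page=1, end_page=None):
    """Yield weibos one by one, page by page, until end_page or an empty page."""
    page = start_page
    while end_page is None or page <= end_page:
        items = fetch_page(page)
        if not items:
            break  # an empty page means we ran out of weibos
        yield from items
        page += 1

# fake two-item pages for pages 1-3, empty afterwards
def fetch_page(page):
    return [f"weibo-{page}-{i}" for i in range(2)] if page <= 3 else []

weibos = iter_weibos(fetch_page, start_page=2, end_page=3)
print(next(weibos))  # 'weibo-2-0'
```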

Owner

from sinaspider import Owner
from pathlib import Path
owner = Owner()
# get your own profile
owner.info
# get the accounts you follow
myfollow = owner.following()
# get your own weibos
myweibo = owner.weibos(download_dir='path/to/dir')
# get your favorites
>>> mycollection = owner.collections(download_dir='path/to/dir')
>>> next(mycollection)
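In the calls above, `download_dir` is an ordinary filesystem path; saving media boils down to creating the directory and writing bytes into it. A generic sketch (not sinaspider's internals), using dummy bytes:

```python
import tempfile
from pathlib import Path

def save_media(download_dir, filename, content: bytes) -> Path:
    """Write media bytes under download_dir, creating the directory if needed."""
    target = Path(download_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return target

with tempfile.TemporaryDirectory() as tmp:
    path = save_media(tmp, "example.jpg", b"\xff\xd8fake-jpeg")
    print(path.exists())  # True
```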



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sinaspider-0.4.1.tar.gz (15.8 kB)

Uploaded: Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sinaspider-0.4.1-py2.py3-none-any.whl (16.9 kB)

Uploaded: Python 2, Python 3

File details

Details for the file sinaspider-0.4.1.tar.gz.

File metadata

  • Download URL: sinaspider-0.4.1.tar.gz
  • Upload date:
  • Size: 15.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.27.1

File hashes

Hashes for sinaspider-0.4.1.tar.gz:

  • SHA256: 35666c72fc7f8e82d16c3c2a4296e385c9f8bcbc960edf0a326483ade30df589
  • MD5: 01fef47906e9de6033824eb6a57674c6
  • BLAKE2b-256: 4ca0d7c33652867ef20a082890d461f49e31badc418fc613ac60078c6cfcccaf


File details

Details for the file sinaspider-0.4.1-py2.py3-none-any.whl.

File metadata

  • Download URL: sinaspider-0.4.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 16.9 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.27.1

File hashes

Hashes for sinaspider-0.4.1-py2.py3-none-any.whl:

  • SHA256: e6d74abe0b79a0f7f6f402f2896bad67bc39d531f3af34bbc662cac5b2113051
  • MD5: c19eec474dd136053206d1adebf3fb0d
  • BLAKE2b-256: 847fca80b7b7f496a4c9e4ee6534a002c5a11c3101da8377b7f8ba71df709441

