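Welcome to bilibili-spiders
Crawl, download, and display video data from bilibili's popular (综合热门), weekly must-watch (每周必看), must-watch classics (入站必刷), and ranking (排行榜) lists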
Installation
pip install spiders-for-all # requires Python >= 3.10
Usage
Command line
List the built-in spiders
python -m spiders_for_all list-spiders
Run a spider
By spider name:
python -m spiders_for_all run-spider -n precious
Or by alias:
python -m spiders_for_all run-spider -n 入站必刷
Analyze the crawled data
python -m spiders_for_all data-analysis -n precious
Download a video by bvid
Note: Before downloading videos, make sure ffmpeg is installed on your host. If you run the project via docker, you can skip this step.
python -m spiders_for_all download-video -b BV1hx411w7MG -s ./videos_dl
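As a quick way to confirm the ffmpeg requirement before starting a download, the following minimal sketch checks whether an `ffmpeg` executable is visible on `PATH`. The helper name `ffmpeg_available` is hypothetical and not part of spiders_for_all; it relies only on the Python standard library.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH.

    Hypothetical helper for illustration; not part of spiders_for_all.
    """
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found on PATH.")
    else:
        print("ffmpeg not found on PATH; install it before downloading videos.")
```

This only checks that the executable can be located; it does not verify the ffmpeg version or its codecs.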
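Specify SESS_DATA to download high-definition video
How to get SESS_DATA
- Log in to bilibili in a web browser
- Press F12 to open the developer tools
- Refresh the page
- Open the Network tab
- Select any request that includes Cookies
- Copy the SESSDATA value from the Cookie field in the Request Headers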
python -m spiders_for_all download-video -b BV1hx411w7MG -s ./videos_dl -d {SESS_DATA}
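If you copy the whole Cookie header rather than picking out the value by hand, a small script can pull out the SESSDATA entry. This is a minimal sketch using only the standard library; the function name `extract_sessdata` and the sample cookie values are made up for illustration and are not part of spiders_for_all.

```python
from typing import Optional


def extract_sessdata(cookie_header: str) -> Optional[str]:
    """Return the SESSDATA value from a raw Cookie header string, or None.

    Hypothetical helper for illustration; not part of spiders_for_all.
    """
    # A Cookie header is a list of "name=value" pairs separated by ";".
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name == "SESSDATA":
            return value
    return None


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    sample = "buvid3=abc; SESSDATA=xyz%2C123; bili_jct=def"
    print(extract_sessdata(sample))
```

The returned value is what you would pass after `-d` in the command above. Treat it like a password, since it identifies your logged-in session.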
View help
python -m spiders_for_all --help
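When scripting several downloads, it can help to assemble the `download-video` command programmatically. The sketch below builds the argument list from the `-b`, `-s`, and `-d` options shown above; the function name `build_download_command` is hypothetical, and the example only prints the command rather than running it.

```python
import shlex
import sys
from typing import List, Optional


def build_download_command(
    bvid: str, save_dir: str, sess_data: Optional[str] = None
) -> List[str]:
    """Build the argv list for the download-video command shown above.

    Hypothetical helper for illustration; not part of spiders_for_all.
    """
    command = [
        sys.executable, "-m", "spiders_for_all",
        "download-video", "-b", bvid, "-s", save_dir,
    ]
    # Only pass -d when a SESSDATA value is supplied.
    if sess_data:
        command += ["-d", sess_data]
    return command


if __name__ == "__main__":
    for bvid in ["BV1hx411w7MG"]:
        print(shlex.join(build_download_command(bvid, "./videos_dl")))
```

To actually run a command, the returned list can be handed to `subprocess.run(command, check=True)`, assuming spiders_for_all is installed in the same Python environment.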
Code
Run a spider
from spiders_for_all.bilibili.spiders import PreciousSpider

if __name__ == '__main__':
    spider = PreciousSpider()
    spider.run()
Analyze the crawled data
from spiders_for_all.bilibili.analysis import Analysis
from spiders_for_all.bilibili import db

if __name__ == '__main__':
    analysis = Analysis(db.BilibiliPreciousVideos)
    analysis.show()
Download a video by bvid
from spiders_for_all.bilibili.download import Downloader

if __name__ == '__main__':
    downloader = Downloader(
        bvid='BV1hx411w7MG',
        save_path='./videos_dl',
        sess_data="YOUR_SESS_DATA_HERE"
    )
    downloader.download()
Customize your own spider
from spiders_for_all.core.base import Spider


class CustomSpider(Spider):
    api = "Your api url to request"
    name = "Your spider name"
    alias = "Your spider alias"

    # database model to save all your crawled data
    database_model = YourDatabaseModel  # type: db.Base

    # item model to validate your crawled data
    item_model = YourItemModel  # type: pydantic.BaseModel

    # response model to validate your api response
    response_model = YourResponseModel  # type: pydantic.BaseModel

    def run(self):
        # Your spider logic here.
        # Note: You must implement this method.
        pass