Project description

Crawl and download data and videos from sites such as Bilibili and Xiaohongshu (xhs). Continuously updated.

Warning:

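This project is for learning and exchange purposes only. Do not use it for commercial or illegal purposes; the author is not responsible for any consequences arising from such use.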

Quick Preview

Crawl the videos posted on a Bilibili user's homepage by user id

python -m spiders_for_all bilibili download-by-author -m USER_ID -s SAVE_DIR

Batch-crawl Xiaohongshu note content by note_id

python -m spiders_for_all xhs download-by-id -i note_id1,note_id2,note_id3 -s SAVE_DIR
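
When the ids come from a script rather than being typed by hand, the two commands above can be assembled programmatically. The following is a minimal sketch, not part of the package: the helper names `bilibili_author_cmd` and `xhs_notes_cmd` are hypothetical, and the only flags used are the ones shown in the Quick Preview (`-m`, `-i`, `-s`).

```python
import shlex
import sys


def bilibili_author_cmd(user_id: str, save_dir: str) -> list[str]:
    """Build the download-by-author command shown in the Quick Preview."""
    return [sys.executable, "-m", "spiders_for_all", "bilibili",
            "download-by-author", "-m", str(user_id), "-s", save_dir]


def xhs_notes_cmd(note_ids: list[str], save_dir: str) -> list[str]:
    """Build the download-by-id command; note ids are joined with commas."""
    if not note_ids:
        raise ValueError("at least one note_id is required")
    return [sys.executable, "-m", "spiders_for_all", "xhs",
            "download-by-id", "-i", ",".join(note_ids), "-s", save_dir]


if __name__ == "__main__":
    # Print instead of executing, so nothing is downloaded by accident.
    print(shlex.join(xhs_notes_cmd(["note_id1", "note_id2"], "./notes")))
```

To actually run a command, the returned list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems in the save directory.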

See the Documentation section for more usage.

Installation

pip install spiders-for-all # requires Python >= 3.12
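
If it is unclear whether the current interpreter meets the version requirement, a quick check like the one below can be run before installing. This is a small sketch; the helper name `python_is_supported` is hypothetical and the only fact it relies on is the `>= 3.12` requirement stated above.

```python
import sys

# Minimum version taken from the installation note above.
MIN_PYTHON = (3, 12)


def python_is_supported(version=None) -> bool:
    """Return True if the given (major, minor, ...) version is at least 3.12."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    print("supported" if python_is_supported() else "Python 3.12 or newer is required")
```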

Documentation

Click through to the usage documentation for each platform

Roadmap

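  • bilibili
    • Spiders for sections such as Popular (综合热门) and Must-Watch (入站必刷)
    • Crawl videos by bvid, singly or in batches
    • Crawl the videos posted on a user's homepage by user id
    • Crawl user feed updates
  • xhs
    • Crawl notes by note_id, singly or in batches
    • Crawl the first page of notes on a user's homepage by user id
    • Crawl note comments
  • GUI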

Known Issues

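  • When crawling a user's posted notes on Xiaohongshu, only the first page of the user's posts can be crawled, because the Xiaohongshu signature algorithm has not yet been worked out; data that loads on scroll cannot be crawled for now
  • Older versions of SQLite may not support the ON CONFLICT DO UPDATE syntax; if you run into this problem, try upgrading SQLite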
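
To find out whether the SQLite library linked into your Python supports this syntax, it can be checked directly. The sketch below is illustrative only and does not use anything from spiders_for_all: the function names are hypothetical, and the 3.24.0 threshold is the SQLite release that introduced the UPSERT clause, according to the SQLite documentation.

```python
import sqlite3

# SQLite added the UPSERT clause (ON CONFLICT ... DO UPDATE) in 3.24.0.
UPSERT_MIN_VERSION = (3, 24, 0)


def version_supports_upsert(version=sqlite3.sqlite_version_info) -> bool:
    """Compare a SQLite version tuple against the UPSERT threshold."""
    return tuple(version) >= UPSERT_MIN_VERSION


def probe_upsert() -> bool:
    """Try an actual upsert against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
        conn.execute("INSERT INTO t (id, v) VALUES (1, 'old')")
        conn.execute(
            "INSERT INTO t (id, v) VALUES (1, 'new') "
            "ON CONFLICT(id) DO UPDATE SET v = excluded.v"
        )
        row = conn.execute("SELECT v FROM t WHERE id = 1").fetchone()
        return row[0] == "new"
    except sqlite3.OperationalError:
        # Older SQLite versions reject the statement as a syntax error.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite", sqlite3.sqlite_version, "upsert supported:", probe_upsert())
```

Note that `sqlite3.sqlite_version` reports the version of the SQLite library Python is linked against, not the version of the `sqlite3` command-line tool, so upgrading the latter alone may not help.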

Download files

Source Distribution

spiders_for_all-0.2.5.tar.gz (40.4 kB)

Uploaded Source

Built Distribution

spiders_for_all-0.2.5-py3-none-any.whl (52.4 kB)

Uploaded Python 3
