Crawler for Xiaohongshu (Little Red Book) notes, profile pages, and note detail pages

Project description

Spider_XHS

Watermark-free crawling of images and videos from Xiaohongshu user profile pages

Requirements

Python environment

Node.js environment

Usage: put the URLs for all the IDs you want to crawl into the list


# Profile page crawling
from xhs_spider.home import Home

home = Home()
url_list = [
    'https://www.xiaohongshu.com/user/profile/6185ce66000000001000705b',
    'https://www.xiaohongshu.com/user/profile/6034d6f20000000001006fbb',
]
home.main(url_list)

# Note crawling
from xhs_spider.note import Note

one_note = Note()
url_list = [
    'https://www.xiaohongshu.com/explore/64356527000000001303282b',
]
one_note.main(url_list)

# Search result crawling
from xhs_spider.search import Search

search = Search()
# Search keyword
query = '你好'
# Number of results to fetch (the first N)
number = 22
search.main(query, number)
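
Because the profile and note entry points take lists of full URLs, a small helper can build those URLs from bare IDs. The sketch below is not part of xhs_spider: the names `profile_url` and `note_url` are hypothetical, and the 24-hex-character ID format is an assumption inferred from the example URLs above.

```python
import re

# Assumption: IDs are 24 lowercase hex characters, as in the example URLs.
_ID_RE = re.compile(r"[0-9a-f]{24}")

PROFILE_BASE = "https://www.xiaohongshu.com/user/profile/"
NOTE_BASE = "https://www.xiaohongshu.com/explore/"


def _check_id(raw_id: str) -> str:
    """Return the stripped ID, or raise ValueError if it does not look valid."""
    cleaned = raw_id.strip()
    if not _ID_RE.fullmatch(cleaned):
        raise ValueError(f"unexpected ID format: {raw_id!r}")
    return cleaned


def profile_url(user_id: str) -> str:
    """Build a profile page URL from a user ID (hypothetical helper)."""
    return PROFILE_BASE + _check_id(user_id)


def note_url(note_id: str) -> str:
    """Build a note URL from a note ID (hypothetical helper)."""
    return NOTE_BASE + _check_id(note_id)


user_ids = ["6185ce66000000001000705b", "6034d6f20000000001006fbb"]
url_list = [profile_url(uid) for uid in user_ids]
```

The resulting `url_list` has the same shape as the lists passed to `home.main(url_list)` in the usage above.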

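Changelog

  1. 23/08/08 First commit.

  2. 23/09/13 [API change: two fields added to params] Fixed images failing to download and errors caused by pages that could not be accessed.

  3. 23/09/16 [Encoding problem with larger videos] Fixed the video encoding problem and added exception handling.

  4. 23/09/18 Refactored the code and added retry on failure.

  5. 23/09/19 Added support for downloading search results.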

Notes

This project is for learning and exchange only; any infringing content will be removed on request.

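Other

  1. Put your own cookies into cookies.txt in the project directory. You can find them under Application in the browser settings or in the network requests; see the cookie.txt file for which ones are needed.

  2. You can obtain the cookie using the method below, then run the corresponding file.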

[Screenshots: obtaining the cookie]

  3. Stars are welcome; the project is updated from time to time.

  4. If you have questions, get in touch via QQ or WeChat (992822653).
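
For the cookie step above, the following is a minimal sketch of turning a copied cookie string into a dictionary. It assumes cookies.txt holds a single `name=value; name2=value2` string as copied from the browser's request headers; the format xhs_spider actually expects is not documented here, and `parse_cookie_header` and `load_cookies` are hypothetical helper names, not part of the package.

```python
from pathlib import Path


def parse_cookie_header(text: str) -> dict[str, str]:
    """Split a 'name=value; name2=value2' string into a dict."""
    cookies = {}
    for part in text.split(";"):
        part = part.strip()
        # Skip empty fragments and fragments without a name=value shape.
        if not part or "=" not in part:
            continue
        # Split on the first '=' only, since values may contain '='.
        name, _, value = part.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


def load_cookies(path: str = "cookies.txt") -> dict[str, str]:
    """Read the cookie string from a file (assumed UTF-8) and parse it."""
    return parse_cookie_header(Path(path).read_text(encoding="utf-8"))
```

Checking the parsed dictionary against the names listed in the reference cookie file is a quick way to confirm that nothing was lost when copying.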

Download files

Source Distribution

xhs_spider-1.0.1.tar.gz (11.1 kB)

File details

Details for the file xhs_spider-1.0.1.tar.gz.

File metadata

  • Download URL: xhs_spider-1.0.1.tar.gz
  • Upload date:
  • Size: 11.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for xhs_spider-1.0.1.tar.gz
Algorithm    Hash digest
SHA256       e878f43515bd815482628590f0b8ecde717feaf1ce09b6f00483f23d7268703b
MD5          8d09ed7901607753f00dfc489b62c501
BLAKE2b-256  7395d71fd8122516a59f1311ed4ea5903e29568a537cd17f7e6bb19d8f206f12
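
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; `sha256_of_file` and `verify_download` are hypothetical helper names, and the file path is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest listed above for xhs_spider-1.0.1.tar.gz.
EXPECTED_SHA256 = "e878f43515bd815482628590f0b8ecde717feaf1ce09b6f00483f23d7268703b"


def sha256_of_file(path: str, chunk_size: int = 1 << 16) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest matches the expected one."""
    return sha256_of_file(path) == expected.lower()
```

For example, `verify_download("xhs_spider-1.0.1.tar.gz")` should return True only for an unmodified copy of the archive.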
