Crawler for Xiaohongshu (Little Red Book) notes, profile pages, and note detail pages

Project description

Spider_XHS

image

Watermark-free crawling of images and videos from Xiaohongshu (Little Red Book) user profile pages

Screenshots

image

image

image

image

Requirements

Python

Node.js
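
Since both runtimes are required, a quick pre-flight check can save a confusing failure later. The following is a minimal sketch and not part of xhs_spider: the function name `check_environment` and the minimum Python version are assumptions, because this page does not state which versions are supported.

```python
import shutil
import sys


def check_environment(min_python=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK.

    min_python is an assumed lower bound, not a documented requirement.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d or newer is assumed, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("node") is None:
        problems.append("the 'node' executable was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Environment problem:", problem)
```

Returning a list instead of raising lets the caller report every missing piece at once.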

Usage: put every ID you want to crawl into the list (as full URLs, as shown below).


# Crawl user profile pages
from xhs_spider.home import Home

home = Home()
url_list = [
    'https://www.xiaohongshu.com/user/profile/6185ce66000000001000705b',
    'https://www.xiaohongshu.com/user/profile/6034d6f20000000001006fbb',
]
home.main(url_list)

# Crawl individual notes
from xhs_spider.note import Note

# Instantiate the class that is actually imported above.
one_note = Note()
url_list = [
    'https://www.xiaohongshu.com/explore/64356527000000001303282b',
]
one_note.main(url_list)

# Crawl search results
from xhs_spider.search import Search

search = Search()
query = '你好'  # search keyword ("hello")
# Number of search results to fetch (the top N)
number = 22
search.main(query, number)
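
Because the crawlers take plain URL lists, it can help to validate and de-duplicate the list before handing it over. The sketch below is not part of xhs_spider. The names `PROFILE_RE`, `NOTE_RE`, and `extract_ids` are hypothetical, and the 24-hexadecimal-character ID format is only inferred from the example URLs above, so treat it as an assumption.

```python
import re

# Patterns inferred from the example URLs: a 24-character hexadecimal ID,
# optionally followed by a trailing slash and/or a query string.
PROFILE_RE = re.compile(
    r"^https://www\.xiaohongshu\.com/user/profile/([0-9a-f]{24})/?(?:\?.*)?$"
)
NOTE_RE = re.compile(
    r"^https://www\.xiaohongshu\.com/explore/([0-9a-f]{24})/?(?:\?.*)?$"
)


def extract_ids(urls, pattern):
    """Return unique IDs in first-seen order; raise ValueError on a bad URL."""
    seen = []
    for url in urls:
        match = pattern.match(url.strip())
        if match is None:
            raise ValueError("unexpected URL format: %r" % url)
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    profiles = [
        "https://www.xiaohongshu.com/user/profile/6185ce66000000001000705b",
        "https://www.xiaohongshu.com/user/profile/6034d6f20000000001006fbb",
    ]
    print(extract_ids(profiles, PROFILE_RE))
    # ['6185ce66000000001000705b', '6034d6f20000000001006fbb']
```

Failing early on a malformed URL keeps a typo from surfacing halfway through a long crawl.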

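Changelog

  1. 23/08/08 First commit.

  2. 23/09/13 [API change: two fields added to params] Fixed images failing to download, and fixed errors raised when some pages could not be accessed.

  3. 23/09/16 [Encoding problem with larger videos] Fixed the video encoding problem and added exception handling.

  4. 23/09/18 Refactored the code and added retry on failure.

  5. 23/09/19 Added support for downloading search results.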
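
The changelog mentions retry on failure without saying how it is implemented. As an illustration only, here is one common way to retry a flaky download with exponential backoff. The helper name `retry` and all of its parameters are hypothetical and are not taken from xhs_spider.

```python
import time


def retry(func, attempts=3, delay=1.0, backoff=2.0, exceptions=(Exception,)):
    """Call func() until it succeeds or `attempts` calls have failed.

    The wait between attempts starts at `delay` seconds and is multiplied
    by `backoff` after every failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
            delay *= backoff
```

A caller would wrap a single download in a zero-argument callable, for example `retry(lambda: download(url))`, where `download` stands for whatever function fetches one file.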

Notice

This project is intended for learning and exchange only. Any infringing content will be removed on request.

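Other

  1. Put your own cookies into cookies.txt in the project directory. You can find them under the Application panel (in the browser's developer tools) or in the network requests. See the cookie.txt file for which cookies are needed.

  2. You can obtain the cookie as shown in the screenshots below, then run the corresponding file.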

image

image

  3. Stars are welcome. The project is updated from time to time.

  4. If you have questions, you can reach the author on QQ or WeChat (992822653).
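
For the cookie step above, a cookie string copied from the browser usually has the form `name=value; name2=value2`. The sketch below shows one way to turn such a string into a dictionary. It assumes cookies.txt holds a single header-style string, which this page does not confirm, and the helper names `parse_cookie_header` and `load_cookies` are hypothetical rather than part of xhs_spider.

```python
from pathlib import Path


def parse_cookie_header(raw):
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}.

    Fragments without a name or without '=' are skipped.
    """
    cookies = {}
    for part in raw.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def load_cookies(path="cookies.txt"):
    """Read a cookie header string from a text file and parse it.

    Assumes the file contains one header-style cookie string.
    """
    return parse_cookie_header(Path(path).read_text(encoding="utf-8").strip())
```

Splitting on the first `=` only (via `partition`) keeps values that themselves contain `=` intact.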

Download files

Source Distribution

xhs_spider-1.0.8.tar.gz (50.3 kB)

Uploaded Source

File details

Details for the file xhs_spider-1.0.8.tar.gz.

File metadata

  • Download URL: xhs_spider-1.0.8.tar.gz
  • Upload date:
  • Size: 50.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for xhs_spider-1.0.8.tar.gz

Algorithm    Hash digest
SHA256       124485fafca37b51ad17273a8a4c03b7458d80fb843bdb6e1b49fc485cb0404a
MD5          c35848f216be23661826f6c916c93fbc
BLAKE2b-256  7a0bb408f4ef2209896ba6561c13fc10535db9aaf128c3fb76ee52aea2b69bb1
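
The SHA256 digest listed above can be used to check a downloaded copy of the archive. A minimal sketch using only the standard library follows. The function names `sha256_of` and `verify` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for xhs_spider-1.0.8.tar.gz.
EXPECTED_SHA256 = "124485fafca37b51ad17273a8a4c03b7458d80fb843bdb6e1b49fc485cb0404a"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest matches `expected`."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("xhs_spider-1.0.8.tar.gz")` should return True for an unmodified download saved under that name in the current directory.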
