Skip to main content

Search Xiaohongshu and generate AI-powered research reports

Project description

xhs-research

Search Xiaohongshu (小红书) and generate AI-powered research reports from the command line.

Instead of scrolling through dozens of posts one by one, type one command and get a structured summary with recommendations, price comparisons, and user sentiment — all powered by AI.

Features

  • One command, full report — search a keyword → scrape posts → AI generates a structured Markdown report
  • Multi-model support — OpenAI, Claude, DeepSeek, or run locally with llama.cpp / Ollama (zero cost)
  • Smart summarization — chunks large result sets and merges summaries to fit any model's context window
  • Structured output — recommendations table, buying advice, red flags, sentiment breakdown
  • Login state persistence — scan QR code once, cookies saved for reuse

Quick Start

Option A: Install from PyPI (recommended)

pip install xhs-research
playwright install firefox

Option B: Install from GitHub

pip install git+https://github.com/yongsinfok/xhs-research.git
playwright install firefox

Option C: Clone for development

git clone https://github.com/yongsinfok/xhs-research.git
cd xhs-research
pip install -e .
playwright install firefox

2. Configure AI model

mkdir -p ~/.xhs-research
cp config.example.yaml ~/.xhs-research/config.yaml

Edit ~/.xhs-research/config.yaml:

ai:
  api_key: sk-your-key        # not needed for local models
  base_url: null              # local models: http://localhost:11434/v1
  model: gpt-4o               # or deepseek-chat, llama3, etc.

3. Run

xhs-research search "马来西亚高性价比扫地机器人"

A browser window opens. Scan the QR code with the Xiaohongshu app to log in (only needed the first time). The tool then scrapes posts and generates a report.

Usage

# Basic search (default 20 posts)
xhs-research search "吉隆坡美食推荐"

# More posts for better coverage
xhs-research search "新加坡PR申请攻略" --limit 30

# Use a specific model
xhs-research search "MacBook Pro M4 值得买吗" --model deepseek-chat

# Save to a specific path
xhs-research search "装修避坑指南" --output ./my-report.md

# Also export raw data as JSON
xhs-research search "搬家攻略" --json

# View config file location
xhs-research config-path

Report Example

# 马来西亚扫地机器人推荐 调研报告

> 基于 20 篇小红书帖子 · 2026-05-25

## 核心发现

- 小米 X20+ 关注度最高(447赞),有实测背书
- Dreame D20/Ultra 为热门候选,性价比讨论多
- Mova E40 作为竞品出现

## 推荐清单

| 品牌/型号   | 提及次数 | 最高赞 | 定位       |
|-------------|---------|--------|------------|
| 小米 X20+   | 1+      | 447    | 高关注实测 |
| Dreame D20 Ultra | 2  | 131   | 热门候选   |
| Dreame D20  | 1       | 28     | 性价比讨论 |
| Mova E40    | 1       | 28     | 竞品对比   |

## 购买建议 / 踩坑提醒 / 观点分布
...

Supported AI Models

Provider base_url model example Cost
OpenAI (default) gpt-4o Paid
Anthropic (default) claude-sonnet-4-6 Paid
DeepSeek https://api.deepseek.com/v1 deepseek-chat Low cost
Ollama http://localhost:11434/v1 llama3, qwen2 Free
llama.cpp http://localhost:8080/v1 (local model) Free
Any OpenAI-compatible API (your endpoint) (your model) Varies

Any endpoint that exposes an OpenAI-compatible /v1/chat/completions API works out of the box.

Project Structure

xhs-research/
├── xhs_research/
│   ├── cli.py              # CLI entry point (typer)
│   ├── config.py           # YAML config loader
│   ├── models/post.py      # Post / Comment data models
│   ├── ai/
│   │   ├── client.py       # Unified AI client (OpenAI SDK)
│   │   └── summarizer.py   # Chunk + merge summarization
│   └── scraper/
│       ├── browser.py      # Playwright browser manager
│       ├── login.py        # QR code login handler
│       └── parser.py       # Search result scraper
├── config.example.yaml
├── requirements.txt
└── README.md

Limitations

  • Xiaohongshu web restrictions — post detail pages are often blocked on web, so reports are primarily based on search result titles and card summaries. Increasing --limit improves coverage.
  • Anti-scraping — uses standard Playwright Firefox. For better evasion, consider camoufox.
  • Login expiration — cookies may expire; re-scan QR code when prompted.
  • Personal use only — respect Xiaohongshu's terms of service. Do not use for commercial scraping.

Contributing

Issues and pull requests are welcome. Areas to contribute:

  • Mobile API support for full post content
  • Better anti-detection (camoufox integration)
  • Web UI or API server mode
  • Support for other platforms (Douyin, Bilibili, etc.)

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xhs_research-0.1.1.tar.gz (14.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

xhs_research-0.1.1-py3-none-any.whl (15.5 kB view details)

Uploaded Python 3

File details

Details for the file xhs_research-0.1.1.tar.gz.

File metadata

  • Download URL: xhs_research-0.1.1.tar.gz
  • Upload date:
  • Size: 14.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for xhs_research-0.1.1.tar.gz
Algorithm Hash digest
SHA256 3116ac9a682c7ae69e4d0ba5b470df71d90d9b9f17d8f8cbcb1f8773421c84b2
MD5 0e18222d5d0dd49748f81519e1451776
BLAKE2b-256 56b59483b2ffeb49145aae5242bb4b8ecbde99dd583cf7f0f247adf7115c8909

See more details on using hashes here.

Provenance

The following attestation bundles were made for xhs_research-0.1.1.tar.gz:

Publisher: publish.yml on yongsinfok/xhs-research

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file xhs_research-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: xhs_research-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 15.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for xhs_research-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7079cedce1ed6cfe1f876bc1e01b45101a980bf619cf62ac8a5284b288fc85d5
MD5 9b2042697d4074f99424be90b595c81d
BLAKE2b-256 b34c18dfd32194ba721f477aaf6b5cb59680fdd7ce25f171b706ccb286ebb888

See more details on using hashes here.

Provenance

The following attestation bundles were made for xhs_research-0.1.1-py3-none-any.whl:

Publisher: publish.yml on yongsinfok/xhs-research

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page