Xiaohongshu contact lead crawler for fashion creators
red-crawler
CLI crawler for collecting Xiaohongshu beauty creator contact leads from profile bios and recommendation chains, with SQLite persistence and nightly automation.
Usage
Install the published CLI:
uv tool install red-crawler==0.1.0
Install the Playwright browser runtime:
red-crawler install-browsers
For local development from a checkout:
uv sync
uv run playwright install chromium
Save a reusable login session first:
red-crawler login --save-state "./state.json"
The command opens a visible browser. Log in to Xiaohongshu there, then return to the terminal and press Enter to save the session file.
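Before starting a long crawl, it can help to confirm that the saved file looks like a Playwright storage state. The sketch below relies only on the documented top-level `cookies` and `origins` keys of that JSON format. The helper names are hypothetical and are not part of red-crawler, and finding cookies does not prove the session is still valid on the server side.

```python
import json
from pathlib import Path


def summarize_storage_state(path):
    """Return (cookie_count, origin_count) for a Playwright storage state file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return len(data.get("cookies", [])), len(data.get("origins", []))


def has_cookie_for(path, domain_suffix):
    """Return True if any saved cookie's domain ends with domain_suffix."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return any(
        cookie.get("domain", "").endswith(domain_suffix)
        for cookie in data.get("cookies", [])
    )
```

For example, `has_cookie_for("./state.json", "xiaohongshu.com")` returning `False` would suggest the login step did not capture a session and should be repeated.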
Run a manual crawl with an existing Playwright storage state file:
red-crawler crawl-seed \
--seed-url "https://www.xiaohongshu.com/user/profile/USER_ID" \
--storage-state "./state.json" \
--max-accounts 20 \
--max-depth 2 \
--db-path "./data/red_crawler.db" \
--output-dir "./output"
crawl-seed defaults to safe mode, adding slower request pacing and dwell/scroll delays that look more like a normal browsing session. Use --no-safe-mode only if you explicitly want a faster run.
crawl-seed now does both:
- exports accounts.csv, contact_leads.csv, and run_report.json
- upserts the same result into SQLite
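To take a quick look at an exported CSV without opening a spreadsheet, a small preview helper is enough. This is a minimal sketch: the column layout of the exports is not documented here, so the code reads whatever header the file has instead of assuming field names. `preview_csv` is a hypothetical helper, not part of the CLI.

```python
import csv
from pathlib import Path


def preview_csv(path, limit=5):
    """Return (header, first `limit` rows, total data-row count).

    No column names are assumed; the header is taken from the file itself.
    """
    # utf-8-sig reads plain UTF-8 and also strips a byte-order mark if present.
    with Path(path).open(newline="", encoding="utf-8-sig") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])
        rows = []
        total = 0
        for row in reader:
            total += 1
            if len(rows) < limit:
                rows.append(row)
    return header, rows, total
```

Calling `preview_csv("./output/contact_leads.csv")` would return the header row, a few sample rows, and the number of leads written by the last run, assuming the `--output-dir` used in the example above.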
Optional note-page expansion:
red-crawler crawl-seed \
--seed-url "https://www.xiaohongshu.com/user/profile/USER_ID" \
--storage-state "./state.json" \
--include-note-recommendations
List high-quality contactable creators from the SQLite database:
red-crawler list-contactable \
--db-path "./data/red_crawler.db" \
--min-relevance-score 0.7 \
--limit 20
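The database can also be inspected directly with Python's standard `sqlite3` module. The table names and schema are not documented here, so this sketch discovers them from `sqlite_master` rather than assuming any. `table_row_counts` is a hypothetical helper name.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Map each user table in a SQLite file to its row count."""
    # sqlite3.connect would silently create an empty file, so check first.
    if not Path(db_path).is_file():
        raise FileNotFoundError(f"no SQLite database at {db_path}")
    conn = sqlite3.connect(str(db_path))
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier; double any embedded double quotes.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Running `table_row_counts("./data/red_crawler.db")` before and after a crawl gives a rough view of how much each run added.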
Run nightly auto-collection with queue, search bootstrap, seed promotion, and daily report output:
red-crawler collect-nightly \
--storage-state "./state.json" \
--db-path "./data/red_crawler.db" \
--report-dir "./reports" \
--cache-dir "./.cache/red-crawler" \
--crawl-budget 30
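If the nightly run is triggered from a Python script rather than typed by hand, the same invocation can be assembled as an argument list. This is a sketch only: the flags are the ones shown in the command above, while the function names and the pre-flight check on the storage state file are assumptions added for illustration.

```python
import subprocess
from pathlib import Path


def build_collect_nightly_cmd(storage_state, db_path, report_dir,
                              cache_dir, crawl_budget=30):
    """Build the argv list for `red-crawler collect-nightly`."""
    return [
        "red-crawler", "collect-nightly",
        "--storage-state", str(storage_state),
        "--db-path", str(db_path),
        "--report-dir", str(report_dir),
        "--cache-dir", str(cache_dir),
        "--crawl-budget", str(crawl_budget),
    ]


def run_collect_nightly(storage_state, db_path, report_dir,
                        cache_dir, crawl_budget=30):
    """Run the nightly collection; raises CalledProcessError on failure."""
    # Fail early if the login step has not produced a state file yet.
    if not Path(storage_state).is_file():
        raise FileNotFoundError(f"storage state not found: {storage_state}")
    cmd = build_collect_nightly_cmd(
        storage_state, db_path, report_dir, cache_dir, crawl_budget
    )
    return subprocess.run(cmd, check=True).returncode
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.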
Export a weekly growth report and a contactable creator CSV:
red-crawler report-weekly \
--db-path "./data/red_crawler.db" \
--report-dir "./reports" \
--days 7
Key outputs:
- manual crawl: accounts.csv, contact_leads.csv, run_report.json
- nightly automation: reports/daily-run-report.json, reports/weekly-growth-report.json, reports/contactable_creators.csv
- SQLite database: data/red_crawler.db
OpenClaw
The OpenClaw skill for this project lives at openclaw-skills/red-crawler-ops/.
To install it from a local path, point OpenClaw at that folder, or copy the skill directory into your OpenClaw skills location and register the same path.
Use the OpenClaw skill actions in this order:
1. bootstrap validates a local working directory and can run Chromium installation when explicitly requested.
2. login creates the Playwright storage state explicitly.
3. crawl_seed and collect_nightly require an authenticated Playwright storage state file.
4. report_weekly and list_contactable run from the SQLite database and do not require --storage-state.
The skill does not clone repositories or create login sessions implicitly. Install the red-crawler CLI package first, point workspace_path at a local working directory, run bootstrap only for reviewed local setup steps, then run login when you are ready to create state.json.
Publishing
The package builds as a standard Python wheel and source distribution:
uv build
See docs/publishing.md for the release checklist and PyPI/TestPyPI commands.
launchd
For macOS local scheduling, use the template at docs/launchd/red-crawler.collect-nightly.plist.
Replace the placeholder paths:
- __WORKDIR__
- __UV_BIN__
- __STORAGE_STATE__
- __DB_PATH__
- __REPORT_DIR__
- __CACHE_DIR__
- __LOG_DIR__
Then load it with:
launchctl unload ~/Library/LaunchAgents/com.red-crawler.collect-nightly.plist 2>/dev/null || true
cp docs/launchd/red-crawler.collect-nightly.plist ~/Library/LaunchAgents/com.red-crawler.collect-nightly.plist
launchctl load ~/Library/LaunchAgents/com.red-crawler.collect-nightly.plist
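The placeholder replacement can be scripted instead of edited by hand. The sketch below fills the seven placeholders named above and XML-escapes each value, since a plist is an XML file. `fill_template` is a hypothetical helper, and it assumes the placeholders appear as literal text in the template.

```python
from xml.sax.saxutils import escape

PLACEHOLDERS = (
    "__WORKDIR__",
    "__UV_BIN__",
    "__STORAGE_STATE__",
    "__DB_PATH__",
    "__REPORT_DIR__",
    "__CACHE_DIR__",
    "__LOG_DIR__",
)


def fill_template(text, values):
    """Replace every placeholder in `text` with its XML-escaped value.

    `values` maps each placeholder name to a path string.
    """
    missing = [key for key in PLACEHOLDERS if key not in values]
    if missing:
        raise KeyError(f"missing values for: {', '.join(missing)}")
    for key in PLACEHOLDERS:
        # escape() converts &, < and > so the plist stays well-formed XML.
        text = text.replace(key, escape(values[key]))
    return text
```

One way to use it is to read docs/launchd/red-crawler.collect-nightly.plist, pass its text through `fill_template`, and write the result to ~/Library/LaunchAgents/com.red-crawler.collect-nightly.plist in place of the `cp` step, then run `launchctl load` as shown.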