
CrawlNest MCP server for web scraping, Twitter, Reddit, GitHub, Hacker News, YC directory, Google Play Store, Apple App Store, video download, image download, and transcription. Use with Claude Code, Claude Desktop, Cursor, and other MCP clients.


crawlnest-mcp

MCP server for web scraping, Twitter, and Reddit — 17 tools for any MCP-compatible LLM client.

Works with Claude Code, Claude Desktop, Cursor, Windsurf, and any other MCP client.

Quick Start

pip install crawlnest-mcp

Then configure your client with two environment variables:

| Variable | Description |
| --- | --- |
| CRAWLNEST_API_URL | Your CrawlNest API endpoint (e.g., https://api.crawlnest.com, or http://100.x.x.x:8000 for Tailscale) |
| CRAWLNEST_API_KEY | Your API key (starts with cn_...) |

That's it. No SSH, no scripts, no cloning repos.
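
Before wiring up a client, it can help to sanity-check the two values. The following is a minimal sketch that assumes only what the table above states: the URL is an http(s) endpoint and the key starts with cn_. The helper name check_crawlnest_env is hypothetical and not part of the crawlnest-mcp package.

```python
import os
from urllib.parse import urlparse


def check_crawlnest_env(env):
    """Return a list of problems; an empty list means both values look plausible."""
    problems = []

    # The URL must carry a scheme and a host, e.g. https://api.crawlnest.com
    parsed = urlparse(env.get("CRAWLNEST_API_URL", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("CRAWLNEST_API_URL should be an http(s) URL")

    # The README states that keys start with cn_
    if not env.get("CRAWLNEST_API_KEY", "").startswith("cn_"):
        problems.append("CRAWLNEST_API_KEY should start with cn_")

    return problems


if __name__ == "__main__":
    for problem in check_crawlnest_env(os.environ):
        print("warning:", problem)
```

This only checks the shape of the values; it does not contact the API or confirm that the key is valid.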

Client Configuration

Claude Code

claude mcp add crawlnest \
  -e CRAWLNEST_API_URL=https://api.crawlnest.com \
  -e CRAWLNEST_API_KEY=cn_your_key \
  -- crawlnest-mcp

Or add to ~/.claude.json manually:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Claude Desktop

Add to your config file:

| Platform | Config Path |
| --- | --- |
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Restart Claude Desktop after saving.
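
If you script your setup, the platform table above can be expressed as a small lookup. This is a sketch only: claude_desktop_config_path is a hypothetical helper, and it assumes APPDATA is set on Windows (it raises KeyError otherwise).

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_desktop_config_path(platform=None, env=None, home=None):
    """Return the Claude Desktop config path for the given (or current) platform."""
    platform = platform or sys.platform
    env = os.environ if env is None else env
    home = Path(home) if home is not None else Path.home()

    if platform == "darwin":  # macOS
        return home / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):  # Windows: %APPDATA%\Claude\...
        return Path(env["APPDATA"]) / "Claude" / CONFIG_NAME
    return home / ".config" / "Claude" / CONFIG_NAME  # Linux


if __name__ == "__main__":
    print(claude_desktop_config_path())
```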

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}
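
All four clients above take the same mcpServers entry, so adding it can be scripted. The sketch below merges a crawlnest entry into an existing config file without touching other servers or settings. The function names are hypothetical, and the code assumes the file, if present, already holds a valid JSON object (an empty or malformed file raises json.JSONDecodeError).

```python
import json
from pathlib import Path


def crawlnest_entry(api_url, api_key):
    """Build the server entry shown in the client configuration examples."""
    return {
        "command": "crawlnest-mcp",
        "env": {"CRAWLNEST_API_URL": api_url, "CRAWLNEST_API_KEY": api_key},
    }


def add_crawlnest_server(config_path, api_url, api_key):
    """Insert or overwrite mcpServers.crawlnest, keeping everything else as-is."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}

    # Create the mcpServers object if the file did not have one yet.
    config.setdefault("mcpServers", {})["crawlnest"] = crawlnest_entry(api_url, api_key)

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, add_crawlnest_server(Path.home() / ".cursor" / "mcp.json", "https://api.crawlnest.com", "cn_your_key") would update the Cursor config. Restart the client afterwards so it picks up the change.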

Self-Hosted / Tailscale / Proxmox

Point CRAWLNEST_API_URL at your instance:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "http://100.64.0.5:8000",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Replace 100.64.0.5 with your server's Tailscale IP (run tailscale ip on the server to find it).

Tools (17)

Web Scraping

| Tool | Description | Key Parameters |
| --- | --- | --- |
| scrape_url | Scrape a single page with anti-bot bypass | url |
| crawl_website | Crawl up to 50 pages with sitemap discovery | url, max_pages |
| extract_structured | Extract structured data with custom schema | url, schema, output_format |
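
MCP clients invoke these tools for you, but it can be useful to see what a call looks like on the wire. The sketch below builds the JSON-RPC 2.0 tools/call message that the MCP protocol defines, using the tool and parameter names from the table above. The argument types, and which parameters are optional, are assumptions; the helper tool_call_request is hypothetical.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a plain dict (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Parameter names come from the table; the values are illustrative.
scrape = tool_call_request(1, "scrape_url", {"url": "https://news.ycombinator.com"})
crawl = tool_call_request(
    2, "crawl_website", {"url": "https://anthropic.com/news", "max_pages": 5}
)

print(json.dumps(crawl, indent=2))
```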

Twitter

| Tool | Description | Credentials |
| --- | --- | --- |
| fetch_tweet | Fetch tweet by URL (3-layer cascade) | No |
| twitter_user_info | User profile info | No |
| twitter_search | Search tweets by query | Required |
| twitter_trending | Trending topics (local/global) | Required |
| twitter_news | News timeline | Required |
| twitter_for_you | Explore timeline | Required |
| twitter_home_timeline | Home feed (for_you/following) | Required |

Reddit

| Tool | Description | Credentials |
| --- | --- | --- |
| reddit_post | Fetch post + comment tree | No |
| reddit_community | Subreddit post listings | No |
| reddit_about | Subreddit metadata | No |
| reddit_search | Search across Reddit | No |
| reddit_explore | Discover subreddits | No |
| reddit_popular | Trending posts (with geo filter) | No |
| reddit_feed | Personalized home/news feed | Only with use_cookies=true |
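
The credential columns above can be condensed into a lookup, which is handy when deciding whether a failing call is a credentials problem. This sketch transcribes the Twitter and Reddit tables; treating the three web scraping tools as needing no credentials is an assumption, since their table has no credentials column. The function required_credentials is hypothetical.

```python
# Transcribed from the Twitter and Reddit tables.
TWITTER_NEEDS_LOGIN = {
    "twitter_search", "twitter_trending", "twitter_news",
    "twitter_for_you", "twitter_home_timeline",
}
TWITTER_PUBLIC = {"fetch_tweet", "twitter_user_info"}
REDDIT_TOOLS = {
    "reddit_post", "reddit_community", "reddit_about", "reddit_search",
    "reddit_explore", "reddit_popular", "reddit_feed",
}
# Assumption: the web scraping table lists no credential requirement.
WEB_TOOLS = {"scrape_url", "crawl_website", "extract_structured"}


def required_credentials(tool, use_cookies=False):
    """Return "twitter", "reddit", or None for the given tool name."""
    if tool in TWITTER_NEEDS_LOGIN:
        return "twitter"
    if tool == "reddit_feed" and use_cookies:
        return "reddit"
    if tool in TWITTER_PUBLIC or tool in REDDIT_TOOLS or tool in WEB_TOOLS:
        return None
    raise KeyError(f"unknown tool: {tool}")
```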

Social Media Credentials

Twitter and Reddit tools that need authentication use credentials stored on the server; they are never passed through MCP. Set them once via the API.

Twitter (for search, trending, timelines)

Get auth_token and ct0 from x.com cookies (DevTools → Application → Cookies):

curl -X POST $CRAWLNEST_API_URL/api/twitter/credentials \
  -H "X-API-Key: cn_your_key" \
  -H "Content-Type: application/json" \
  -d '{"auth_token": "YOUR_AUTH_TOKEN", "ct0": "YOUR_CT0"}'

Reddit (for personalized feeds only)

Get reddit_session from reddit.com cookies:

curl -X POST $CRAWLNEST_API_URL/api/subreddit/credentials \
  -H "X-API-Key: cn_your_key" \
  -H "Content-Type: application/json" \
  -d '{"reddit_session": "YOUR_SESSION"}'
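
The two curl commands above can also be issued from Python with the standard library. This is a sketch under the same assumptions as the curl examples (endpoint paths, X-API-Key header, JSON body); the response format is not documented here, so the code does not try to interpret it. The helper credentials_request is hypothetical.

```python
import json
import urllib.request

# Endpoint paths as used in the curl examples.
CREDENTIAL_PATHS = {
    "twitter": "/api/twitter/credentials",
    "reddit": "/api/subreddit/credentials",
}


def credentials_request(api_url, api_key, service, payload):
    """Build (but do not send) the POST request that stores credentials."""
    return urllib.request.Request(
        api_url.rstrip("/") + CREDENTIAL_PATHS[service],
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = credentials_request(
    "https://api.crawlnest.com",
    "cn_your_key",
    "twitter",
    {"auth_token": "YOUR_AUTH_TOKEN", "ct0": "YOUR_CT0"},
)
# To send it: urllib.request.urlopen(req, timeout=10)
```

For Reddit, pass "reddit" as the service and {"reddit_session": "YOUR_SESSION"} as the payload.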

Usage Examples

> Scrape https://news.ycombinator.com and show the top stories

> Extract product name, price, and image from https://example.com/products as CSV

> Crawl https://anthropic.com/news with max 5 pages

> Get the tweet at https://x.com/anthropic/status/123456

> What's trending on Twitter right now?

> Show me the top posts from r/technology this week

> Search Reddit for "machine learning" in r/programming

Troubleshooting

"Command not found: crawlnest-mcp"

pip install --force-reinstall crawlnest-mcp
which crawlnest-mcp

"CrawlNest API not reachable"

curl -H "X-API-Key: cn_your_key" $CRAWLNEST_API_URL/health
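
The same reachability check can be scripted. The sketch below mirrors the curl command above using only the standard library; it treats any 2xx response from /health as reachable, which is an assumption about how the endpoint behaves. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def health_request(api_url, api_key):
    """Build the GET request for the /health endpoint."""
    return urllib.request.Request(
        api_url.rstrip("/") + "/health",
        headers={"X-API-Key": api_key},
    )


def is_reachable(api_url, api_key, timeout=5.0):
    """Return True if /health answers with a 2xx status, False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(health_request(api_url, api_key), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP error statuses.
        return False
```

If this returns False for a self-hosted instance, check that the server is running and that the host and port in CRAWLNEST_API_URL are correct.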

"Twitter credentials missing"

Save credentials first (see Social Media Credentials).

Tools not showing

  1. Restart your client
  2. Check config JSON syntax: python3 -m json.tool config.json (substitute the path to your client's config file)
  3. Check logs: tail -f /tmp/mcp_server.log
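
Beyond plain JSON syntax, a config can parse cleanly and still lack the entry the client needs. The sketch below checks both, based on the config shape shown in Client Configuration. The helper check_mcp_config is hypothetical, and the set of required keys is inferred from the examples in this README rather than from any client's specification.

```python
import json

REQUIRED_ENV = ("CRAWLNEST_API_URL", "CRAWLNEST_API_KEY")


def check_mcp_config(text, server="crawlnest"):
    """Return a list of problems found in a client config; empty means it looks fine."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ["missing mcpServers object"]

    entry = servers.get(server)
    if not isinstance(entry, dict):
        return [f"missing mcpServers.{server} entry"]

    problems = []
    if entry.get("command") != "crawlnest-mcp":
        problems.append("command should be crawlnest-mcp")
    env = entry.get("env")
    if not isinstance(env, dict):
        env = {}
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append(f"env.{name} is not set")
    return problems
```

Pass it the file contents, for example check_mcp_config(open(path).read()), and fix each reported problem before restarting the client.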

Self-Hosting

git clone https://github.com/WonderCrafts/CrawlNest
cd CrawlNest
make setup && make infra && make db-apply && make dev

Then use http://localhost:8000 as your CRAWLNEST_API_URL.

License

MIT
