
CrawlNest MCP server for web scraping, Twitter, Reddit, video download, image download, and transcription. Use with Claude Code, Claude Desktop, Cursor, and other MCP clients.

crawlnest-mcp

MCP server for web scraping, Twitter, and Reddit — 17 tools for any MCP-compatible LLM client.

Works with Claude Code, Claude Desktop, Cursor, Windsurf, and any other MCP client.

Quick Start

pip install crawlnest-mcp

Then configure your client with two environment variables:

| Variable | Description |
| --- | --- |
| CRAWLNEST_API_URL | Your CrawlNest API endpoint (e.g., https://api.crawlnest.com or http://100.x.x.x:8000 for Tailscale) |
| CRAWLNEST_API_KEY | Your API key (starts with cn_...) |

That's it. No SSH, no scripts, no cloning repos.
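
Before wiring the variables into a client, it can help to sanity-check them. The following is a minimal sketch, not part of crawlnest-mcp: the helper name `check_crawlnest_env` is hypothetical, and it assumes every key carries the `cn_` prefix described in the table above.

```python
import os
from urllib.parse import urlparse


def check_crawlnest_env(env=None):
    """Return a list of problems found in the two CrawlNest variables."""
    env = os.environ if env is None else env
    problems = []

    # The endpoint must be a full http(s) URL, e.g. https://api.crawlnest.com
    parsed = urlparse(env.get("CRAWLNEST_API_URL", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("CRAWLNEST_API_URL must be an http(s) URL")

    # Assumption: keys always start with the cn_ prefix.
    if not env.get("CRAWLNEST_API_KEY", "").startswith("cn_"):
        problems.append("CRAWLNEST_API_KEY should start with cn_")

    return problems


if __name__ == "__main__":
    for problem in check_crawlnest_env():
        print(problem)
```

Run it in the same shell you launch your client from; no output means both variables look plausible.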

Client Configuration

Claude Code

claude mcp add crawlnest \
  -e CRAWLNEST_API_URL=https://api.crawlnest.com \
  -e CRAWLNEST_API_KEY=cn_your_key \
  -- crawlnest-mcp

Or add to ~/.claude.json manually:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Claude Desktop

Add to your config file:

| Platform | Config Path |
| --- | --- |
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Restart Claude Desktop after saving.
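
If the config file already lists other MCP servers, the crawlnest entry has to be merged in rather than pasted over them. A rough sketch of that merge, assuming the config paths listed above; `desktop_config_path` and `add_crawlnest` are hypothetical helper names, and the sample "other" server is made up for illustration.

```python
import json
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def desktop_config_path(platform=sys.platform):
    """Map a sys.platform value to the Claude Desktop config path."""
    if platform == "darwin":
        return Path.home() / "Library/Application Support/Claude" / CONFIG_NAME
    if platform.startswith("win"):
        return Path(os.environ.get("APPDATA", "")) / "Claude" / CONFIG_NAME
    return Path.home() / ".config/Claude" / CONFIG_NAME


def add_crawlnest(config, api_url, api_key):
    """Return a copy of config with the crawlnest entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers["crawlnest"] = {
        "command": "crawlnest-mcp",
        "env": {"CRAWLNEST_API_URL": api_url, "CRAWLNEST_API_KEY": api_key},
    }
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # Demo only: merge into an in-memory config and print the result.
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    merged = add_crawlnest(existing, "https://api.crawlnest.com", "cn_your_key")
    print(f"Would write to {desktop_config_path()}:")
    print(json.dumps(merged, indent=2))
```

The sketch only prints the merged JSON; writing it back to the file is left out on purpose so nothing is overwritten by accident.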

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "https://api.crawlnest.com",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Self-Hosted / Tailscale / Proxmox

Point CRAWLNEST_API_URL at your instance:

{
  "mcpServers": {
    "crawlnest": {
      "command": "crawlnest-mcp",
      "env": {
        "CRAWLNEST_API_URL": "http://100.64.0.5:8000",
        "CRAWLNEST_API_KEY": "cn_your_key"
      }
    }
  }
}

Replace 100.64.0.5 with your server's Tailscale IP (run tailscale ip on the server to find it).

Tools (17)

Web Scraping

| Tool | Description | Key Parameters |
| --- | --- | --- |
| scrape_url | Scrape a single page with anti-bot bypass | url |
| crawl_website | Crawl up to 50 pages with sitemap discovery | url, max_pages |
| extract_structured | Extract structured data with custom schema | url, schema, output_format |
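
Your MCP client normally builds tool calls for you, but seeing the message shape can help when debugging. Under the MCP protocol a tool is invoked with a JSON-RPC `tools/call` request. The sketch below builds one for crawl_website; the argument names come from the Key Parameters column, while the request id and the example values are assumptions, and the server's full argument schema may include more fields than are shown here.

```python
import json


def tool_call(request_id, name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names follow the Key Parameters column; the values are examples.
message = tool_call(1, "crawl_website", {"url": "https://example.com", "max_pages": 5})
print(json.dumps(message, indent=2))
```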

Twitter

| Tool | Description | Credentials |
| --- | --- | --- |
| fetch_tweet | Fetch tweet by URL (3-layer cascade) | No |
| twitter_user_info | User profile info | No |
| twitter_search | Search tweets by query | Required |
| twitter_trending | Trending topics (local/global) | Required |
| twitter_news | News timeline | Required |
| twitter_for_you | Explore timeline | Required |
| twitter_home_timeline | Home feed (for_you/following) | Required |

Reddit

| Tool | Description | Credentials |
| --- | --- | --- |
| reddit_post | Fetch post + comment tree | No |
| reddit_community | Subreddit post listings | No |
| reddit_about | Subreddit metadata | No |
| reddit_search | Search across Reddit | No |
| reddit_explore | Discover subreddits | No |
| reddit_popular | Trending posts (with geo filter) | No |
| reddit_feed | Personalized home/news feed | Only with use_cookies=true |

Social Media Credentials

Twitter and Reddit tools that need authentication use credentials stored on the CrawlNest server; they are never passed through MCP. Set them once via the API.

Twitter (for search, trending, timelines)

Get auth_token and ct0 from x.com cookies (DevTools → Application → Cookies):

curl -X POST $CRAWLNEST_API_URL/api/twitter/credentials \
  -H "X-API-Key: cn_your_key" \
  -H "Content-Type: application/json" \
  -d '{"auth_token": "YOUR_AUTH_TOKEN", "ct0": "YOUR_CT0"}'

Reddit (for personalized feeds only)

Get reddit_session from reddit.com cookies:

curl -X POST $CRAWLNEST_API_URL/api/subreddit/credentials \
  -H "X-API-Key: cn_your_key" \
  -H "Content-Type: application/json" \
  -d '{"reddit_session": "YOUR_SESSION"}'
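
The same requests can be made from Python with only the standard library. This is a sketch of the two curl commands above, not an official client: `credentials_request` is a hypothetical helper, and the fallback URL and key are placeholders.

```python
import json
import os
import urllib.request


def credentials_request(base_url, api_key, path, payload):
    """Build (but do not send) the POST request that the curl commands make."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = credentials_request(
        os.environ.get("CRAWLNEST_API_URL", "http://localhost:8000"),
        os.environ.get("CRAWLNEST_API_KEY", "cn_your_key"),
        "/api/twitter/credentials",
        {"auth_token": "YOUR_AUTH_TOKEN", "ct0": "YOUR_CT0"},
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

For Reddit, pass "/api/subreddit/credentials" and {"reddit_session": "YOUR_SESSION"} instead.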

Usage Examples

> Scrape https://news.ycombinator.com and show the top stories

> Extract product name, price, and image from https://example.com/products as CSV

> Crawl https://anthropic.com/news with max 5 pages

> Get the tweet at https://x.com/anthropic/status/123456

> What's trending on Twitter right now?

> Show me the top posts from r/technology this week

> Search Reddit for "machine learning" in r/programming

Troubleshooting

"Command not found: crawlnest-mcp"

pip install --force-reinstall crawlnest-mcp
which crawlnest-mcp

"CrawlNest API not reachable"

curl -H "X-API-Key: cn_your_key" $CRAWLNEST_API_URL/health
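
For a scripted version of this check, something like the following may work. It is a sketch that assumes the /health endpoint answers a healthy instance with HTTP 200; `api_reachable` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def api_reachable(base_url, api_key, timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        req = urllib.request.Request(
            base_url.rstrip("/") + "/health",
            headers={"X-API-Key": api_key},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Refused connections, DNS failures, timeouts, HTTP errors, malformed URLs.
        return False
```

A False result does not say why the check failed; the curl command above shows the actual error.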

"Twitter credentials missing"

Save credentials first (see Social Media Credentials).

Tools not showing

  1. Restart your client
  2. Check config JSON syntax: python3 -m json.tool /path/to/your/config.json
  3. Check logs: tail -f /tmp/mcp_server.log
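
Valid JSON can still be missing the crawlnest entry or one of its variables. Step 2 can be extended with a small structural check, sketched below; `check_mcp_config` is a hypothetical helper, and it assumes the mcpServers layout shown in the Client Configuration examples.

```python
import json


def check_mcp_config(text):
    """Return a list of problems with the crawlnest entry in a config file's text."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    entry = servers.get("crawlnest") if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return ["no mcpServers.crawlnest entry"]

    problems = []
    if not entry.get("command"):
        problems.append("command is missing or empty")
    env = entry.get("env")
    if not isinstance(env, dict):
        env = {}
    for name in ("CRAWLNEST_API_URL", "CRAWLNEST_API_KEY"):
        if not env.get(name):
            problems.append(f"env.{name} is missing or empty")
    return problems
```

Call it with the file's contents, for example check_mcp_config(open(path).read()); an empty list means the entry looks complete.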

Self-Hosting

git clone https://github.com/WonderCrafts/CrawlNest
cd CrawlNest
make setup && make infra && make db-apply && make dev

Then use http://localhost:8000 as your CRAWLNEST_API_URL.

License

MIT
