
An MCP server that helps AI assistants access text content from bot-protected websites. It fetches HTML or markdown from sites with anti-automation measures using Scrapling.


scrapling-fetch-mcp


An MCP server and Claude Code skill that help AI assistants access text content from websites that implement bot detection, bridging the gap between what you can see in your browser and what the AI can access.

Intended Use

This tool is optimized for low-volume retrieval of documentation and reference materials (text/HTML only) from websites that implement bot detection. It has not been designed or tested for general-purpose site scraping or data harvesting.

Note: This project was developed in collaboration with Claude Sonnet 3.7 and Claude Sonnet 4.5, using LLM Context.

Installation

Requirements

  • Python 3.10+
  • uv package manager

Install

# Install scrapling-fetch-mcp
uv tool install git+https://github.com/cyberchitta/scrapling-fetch-mcp

# Install browser binaries (REQUIRED - large downloads)
uvx --from git+https://github.com/cyberchitta/scrapling-fetch-mcp scrapling install

Important: The browser installation downloads hundreds of MB of data and must complete before first use. If the MCP server times out on first use, the browsers may still be installing in the background. Wait a few minutes and try again.

Setup with Claude Desktop

Add this configuration to your Claude Desktop MCP settings:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "scrapling-fetch": {
      "command": "uvx",
      "args": ["scrapling-fetch-mcp"]
    }
  }
}

After updating the config, restart Claude Desktop.
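
If the config file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not part of the project: `add_server` is a hypothetical helper, and the server entry is copied from the configuration above.

```python
import json
from pathlib import Path

# Server entry copied from the configuration shown above.
SERVER_ENTRY = {"command": "uvx", "args": ["scrapling-fetch-mcp"]}


def add_server(config_path: Path, name: str = "scrapling-fetch") -> dict:
    """Hypothetical helper: merge the entry into the config, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            # Reuse whatever is already configured.
            config = json.loads(text)
    # Create "mcpServers" if missing, then add or overwrite this one entry.
    config.setdefault("mcpServers", {})[name] = SERVER_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS this could be called as `add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`. Back up the file first if it holds settings you care about.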

What It Does

This MCP server provides two tools that Claude can use automatically when you ask it to fetch web content:

  • Page fetching: Retrieves complete web pages with support for pagination
  • Pattern extraction: Finds and extracts specific content using regex patterns
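
To make the idea of pattern extraction concrete, here is a small sketch of regex matching with surrounding context. It is an illustration only and does not reproduce the server's implementation: `extract_matches` and `context_chars` are hypothetical names.

```python
import re


def extract_matches(text: str, pattern: str, context_chars: int = 40) -> list[str]:
    """Return each regex match with up to context_chars of text on either side."""
    snippets = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Clamp the window so it never runs past the ends of the text.
        start = max(0, match.start() - context_chars)
        end = min(len(text), match.end() + context_chars)
        snippets.append(text[start:end])
    return snippets
```

Returning a window around each match, rather than the bare match, gives the assistant enough surrounding text to judge whether a hit is relevant.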

The AI decides which tool to use based on your request. You just ask naturally:

"Can you fetch the docs at https://example.com/api"
"Find all mentions of 'authentication' on that page"
"Get me the installation instructions from their homepage"

Protection Modes

The tools support three levels of bot detection bypass:

  • basic: Fast (1-2s), works for most sites
  • stealth: Moderate (3-8s), handles more protection
  • max-stealth: Slowest (10+s), for heavily protected sites

Claude automatically starts with basic mode and escalates if needed.
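
The escalation described above can be sketched as a loop over the three modes. This is an illustration under stated assumptions, not the server's code: `fetch_with_escalation` is a hypothetical name, and the `fetch` callable stands in for whatever actually retrieves the page. Only the mode names come from this document.

```python
from typing import Callable

# Mode names as listed above, ordered from fastest to most protected.
MODES = ["basic", "stealth", "max-stealth"]


def fetch_with_escalation(url: str, fetch: Callable[[str, str], str]) -> tuple[str, str]:
    """Try each mode in order; return (mode, content) for the first that succeeds."""
    last_error = None
    for mode in MODES:
        try:
            return mode, fetch(url, mode)
        except Exception as exc:  # assumption: any failure triggers escalation
            last_error = exc
    raise RuntimeError(f"all modes failed for {url}") from last_error
```

Starting with the cheapest mode keeps typical requests fast and reserves the 10+ second path for sites that actually need it.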

Tips for Best Results

  • Just ask naturally - Claude handles the technical details
  • For large pages, Claude can page through content automatically
  • For specific searches, mention what you're looking for and Claude will use pattern matching
  • The metadata returned helps Claude decide whether to page or search
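
As a rough illustration of how paging metadata could drive the "page or search" decision, the sketch below slices content into chunks and reports where the next chunk starts. The function names and the `start_index`, `max_length`, `total_length`, and `next_index` fields are assumptions made for this example and may not match the server's actual parameters.

```python
from typing import Iterator, Optional


def fetch_chunk(content: str, start_index: int = 0, max_length: int = 5000) -> dict:
    """Return one chunk plus metadata describing what remains (hypothetical shape)."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    chunk = content[start_index:start_index + max_length]
    end = start_index + len(chunk)
    next_index: Optional[int] = end if end < len(content) else None
    return {
        "text": chunk,
        "total_length": len(content),
        "start_index": start_index,
        "next_index": next_index,  # None once the last chunk has been returned
    }


def iter_pages(content: str, max_length: int) -> Iterator[str]:
    """Yield successive chunks until the metadata reports no further page."""
    index: Optional[int] = 0
    while index is not None:
        page = fetch_chunk(content, index, max_length)
        yield page["text"]
        index = page["next_index"]
```

With metadata like this, a caller can compare `total_length` against `max_length` and choose a targeted pattern search instead of paging when a page is very large.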

Limitations

  • Designed for text content only (documentation, articles, references)
  • Not for high-volume scraping or data harvesting
  • May not work with sites requiring authentication
  • Performance varies by site complexity and protection level

Claude Code Skill

This repo also ships as a Claude Code skill with a /s-fetch slash command that fetches URLs directly via scrapling. The skill requires the MCP server to be installed first (it reuses the same scrapling environment).

Install from within Claude Code:

/skills install github:cyberchitta/scrapling-fetch-mcp

Once installed, Claude will use /s-fetch automatically when you ask it to fetch a URL from a bot-protected site. You can also invoke it directly as /s-fetch <url> with an optional mode (basic, stealth, max-stealth) and format (markdown, html).

License

Apache 2.0

