Simple URL fetcher MCP server for Scrapling. Retrieves HTML/markdown from automation-blocking sites with three protection levels.

Scrapling Fetch MCP Server

A simple Model Context Protocol (MCP) server implementation that integrates with Scrapling for retrieving web content with advanced bot detection avoidance.

Note: This project was developed in collaboration with Claude Sonnet 3.7, using LLM Context to share code during development. An initial vibe-coding session with Sonnet produced a working prototype, followed by several curation sessions in which I (@restlessronin) refactored and refined the code with Sonnet's feedback.

Intended Use

This tool is optimized for low-volume retrieval of documentation and reference materials (text/html only) from websites that implement bot detection. It has not been designed or tested for general-purpose site scraping or data harvesting.

Features

  • Retrieve content from websites that implement advanced bot protection
  • Three protection levels (basic, stealth, max-stealth)
  • Two output formats (HTML, markdown)

Installation

Install Scrapling and its browser dependencies, then install this server:

uv tool install scrapling
scrapling install
uv tool install scrapling-fetch-mcp

Usage with Claude

Add this configuration to your Claude client's MCP server configuration:

{
  "mcpServers": {
    "Cyber-Chitta": {
      "command": "uvx",
      "args": ["scrapling-fetch-mcp"]
    }
  }
}
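
If the client configuration already lists other servers, the entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge on an in-memory dictionary. The helper name `add_mcp_server` and the `other-server` entry are hypothetical, and the sketch assumes the server is launched with the package name as its only argument. It does not read or write any real configuration file.

```python
import json
from copy import deepcopy


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `config` with one more entry under "mcpServers".

    Hypothetical helper for illustration; it is not part of this package.
    """
    merged = deepcopy(config)  # leave the caller's dictionary untouched
    servers = merged.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}

# Assumption: the server is started with the package name as the only argument.
updated = add_mcp_server(existing, "Cyber-Chitta", "uvx", ["scrapling-fetch-mcp"])
print(json.dumps(updated, indent=2))
```

The printed JSON can then be compared against the client's configuration before anything is saved.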

Available Tools

scrapling-fetch

Fetch a URL with configurable bot-detection avoidance levels.

{
  "name": "scrapling-fetch",
  "arguments": {
    "url": "https://example.com",
    "mode": "stealth",
    "format": "markdown",
    "max_length": 5000,
    "start_index": 0
  }
}
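
MCP clients normally build the wire message themselves, but it can help to see how a call like the one above is wrapped. The sketch below assembles a JSON-RPC 2.0 `tools/call` request, which is the method MCP uses for tool invocation. The helper name `build_tools_call` and the request id are illustrative assumptions; nothing here is sent to a server.

```python
import json


def build_tools_call(request_id: int, name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 "tools/call" request.

    Hypothetical helper for illustration only.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tools_call(
    1,  # arbitrary request id chosen for this example
    "scrapling-fetch",
    {
        "url": "https://example.com",
        "mode": "stealth",
        "format": "markdown",
        "max_length": 5000,
        "start_index": 0,
    },
)
print(json.dumps(request, indent=2))
```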

Parameters

  • url (required): The URL to fetch
  • mode (optional, default: "basic"): Protection level
    • basic: Fast retrieval with minimal protection
    • stealth: Balanced protection against bot detection
    • max-stealth: Maximum protection with all anti-detection features
  • format (optional, default: "markdown"): Output format (options: html, markdown)
  • max_length (optional, default: 5000): Maximum number of characters to return
  • start_index (optional, default: 0): Character index to start from in the response (useful for paginated content)
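
To read a page longer than max_length, a caller can repeat the request while advancing start_index. The sketch below shows that loop against a local stand-in for the tool. `fake_fetch` and `iter_chunks` are hypothetical names, and the stopping rule (an empty or short chunk marks the end) is an assumption of this sketch, not documented server behavior.

```python
from typing import Callable, Iterator


def iter_chunks(fetch: Callable[[int, int], str], max_length: int = 5000) -> Iterator[str]:
    """Yield successive chunks by advancing start_index after each call.

    `fetch(start_index, max_length)` stands in for a scrapling-fetch call.
    Assumption: an empty or shorter-than-requested chunk marks the end.
    """
    start_index = 0
    while True:
        chunk = fetch(start_index, max_length)
        if not chunk:
            break
        yield chunk
        if len(chunk) < max_length:
            break
        start_index += len(chunk)


# Local stand-in for the remote content: 12,000 characters of filler.
document = "x" * 12000


def fake_fetch(start_index: int, max_length: int) -> str:
    """Simulate the tool by slicing the local document."""
    return document[start_index:start_index + max_length]


chunks = list(iter_chunks(fake_fetch))
print([len(c) for c in chunks])  # [5000, 5000, 2000]
```

Concatenating the chunks in order reproduces the full text, so the caller can process the page in windows of max_length characters.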

License

Apache 2
