A dedicated web content fetching and conversion service based on the MCP philosophy.

huoshui-fetch

A dedicated web content fetching and conversion MCP (Model Context Protocol) server that provides tools for fetching, converting, and extracting data from web pages.

Features

Fetching Tools

  • fetch_url: Fetch content from URLs with customizable timeout, redirect handling, and user-agent
  • fetch_with_headers: Fetch URLs with custom headers for authenticated requests

Conversion Tools

  • html_to_markdown_tool: Convert HTML to clean Markdown format
  • html_to_text_tool: Extract plain text from HTML
  • clean_html_tool: Remove scripts/styles and sanitize HTML
  • json_to_markdown_tool: Convert JSON data to readable Markdown
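To give a feel for what the text-extraction and cleaning tools do, here is a minimal standard-library sketch that drops script and style content and keeps the visible text. It is an illustration only: the names `TextExtractor` and `html_to_text` are hypothetical, and the server's own tools may rely on different libraries and handle many more cases.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Hypothetical helper: return the visible text, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(html_to_text("<h1>Hello</h1><script>var x = 1;</script><p>World</p>"))
```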

Extraction Tools

  • extract_article_tool: Extract main article content using readability
  • extract_links_tool: Extract all links with filtering options
  • extract_metadata_tool: Extract page metadata (title, description, OG tags)
  • extract_images_tool: Extract images with size filtering
  • extract_structured_data_tool: Extract JSON-LD and microdata
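Link extraction with filtering can be pictured roughly as follows. This is a sketch under assumptions, not the server's implementation: `LinkCollector`, `extract_links`, and the `same_domain_only` option are hypothetical names chosen for illustration, and the real tool's filtering options may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links such as "/about" to absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url, same_domain_only=False):
    """Hypothetical helper: return absolute links, optionally same-host only."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    links = collector.links
    if same_domain_only:
        host = urlparse(base_url).netloc
        links = [link for link in links if urlparse(link).netloc == host]
    return links


page = '<a href="/about">About</a><a href="https://other.org/x">Other</a>'
print(extract_links(page, "https://example.com/", same_domain_only=True))
```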

Installation

From MCP Registry (Recommended)

This server is available in the Model Context Protocol Registry. Install it using your MCP client.

mcp-name: io.github.huoshuiai42/huoshui-fetch

# From a local clone of the repository, using uv (recommended)
uv sync

# Or install from GitHub (replace yourusername with the repository owner)
pip install git+https://github.com/yourusername/huoshui-fetch.git

Usage

Run with uvx (recommended for one-time use)

# From the repository
uvx --from . huoshui-fetch

# From GitHub (once published)
uvx --from git+https://github.com/yourusername/huoshui-fetch.git huoshui-fetch

Run directly

# Using uv
uv run python -m huoshui_fetch

# Or if installed
python -m huoshui_fetch

The server communicates over standard input/output (stdio), so it can be launched directly by Claude Desktop and other MCP-compatible clients.
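For readers curious what travels over stdin, MCP's stdio transport carries JSON-RPC 2.0 messages as newline-delimited JSON. The sketch below builds one `tools/call` request line. The helper name `build_tool_call` is hypothetical, and the `url` argument name is an assumption, since this README does not list the tools' parameter names. A real client also performs an initialization handshake before calling tools, which is omitted here.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Hypothetical helper: serialize one JSON-RPC tools/call request line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # One message per line: json.dumps escapes any newlines inside strings.
    return json.dumps(message) + "\n"


# The "url" argument name is an assumption for illustration.
line = build_tool_call(1, "fetch_url", {"url": "https://example.com"})
print(line, end="")
```

In practice the MCP client library handles this framing, so tool users never write these messages by hand.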

Configuration for Claude Desktop

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "huoshui-fetch": {
      "command": "uvx",
      "args": ["--no-cache", "--from", ".", "huoshui-fetch"],
      "cwd": "/path/to/huoshui-fetch"
    }
  }
}

Or if installed from GitHub:

{
  "mcpServers": {
    "huoshui-fetch": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/yourusername/huoshui-fetch.git",
        "huoshui-fetch"
      ]
    }
  }
}
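If the configuration file already lists other servers, the new entry has to be merged into the existing `mcpServers` object rather than replacing it. A minimal sketch of that merge, assuming the configuration has already been loaded into a dictionary (the helper name `with_server` and the `other-server` entry are hypothetical):

```python
import json


def with_server(config, name, command, args):
    """Return a copy of config with one server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}

updated = with_server(
    existing,
    "huoshui-fetch",
    "uvx",
    ["--from", "git+https://github.com/yourusername/huoshui-fetch.git", "huoshui-fetch"],
)
print(json.dumps(updated, indent=2))
```

The function copies the dictionaries instead of mutating them, so the original configuration stays available if the write fails.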

Example Usage

Once configured, you can use the tools in Claude Desktop:

// Fetch a webpage
fetch_url("https://example.com")

// Convert HTML to Markdown
html_to_markdown_tool("<h1>Hello</h1><p>World</p>")

// Extract article content
extract_article_tool(html_content, "https://example.com/article")

Requirements

  • Python 3.11+
  • Dependencies listed in pyproject.toml

Development & Publishing

This project includes scripts that automate building the package and publishing it to TestPyPI and PyPI.

Automated Publishing Workflow

# Complete automated workflow (TestPyPI + PyPI)
uv run python scripts/publish.py --include-pypi

# TestPyPI only (recommended for testing)
uv run python scripts/publish.py

# Bump version and publish
uv run python scripts/publish.py --version-bump patch --include-pypi

Individual Commands

# Version management
uv run python scripts/version_manager.py --check
uv run python scripts/version_manager.py --bump patch

# Build package
uv run python scripts/build.py

# Run comprehensive tests
uv run python scripts/test.py

# Upload to PyPI
uv run python scripts/upload.py
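The `patch` argument above suggests semantic-version bumping. The internals of `scripts/version_manager.py` are not shown here, so the following is only a sketch of the usual rule, assuming plain `MAJOR.MINOR.PATCH` version strings; `bump_version` is a hypothetical name.

```python
def bump_version(version, part):
    """Hypothetical helper: bump one component of a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1.2", "patch"))
```

Bumping a component resets the components to its right to zero, which is why a minor bump of `0.1.2` gives `0.2.0` rather than `0.2.2`.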

Publishing Features

  • Version Management: Automatic synchronization across all files
  • Quality Checks: Ruff linting and MyPy type checking
  • Build Automation: Clean builds with validation
  • Testing Suite: Comprehensive package and functionality tests
  • Publishing Workflow: TestPyPI → PyPI with validation
  • Error Recovery: Built-in error handling and recovery options

See PUBLISHING.md for detailed documentation.

DXT Extension

This project supports DXT (Desktop Extensions) format for easy distribution and installation.

To build the DXT extension:

python build_dxt.py

This creates a huoshui-fetch-{version}.dxt file that can be installed in desktop applications that support DXT.

License

MIT
