autodocs-mcp

Generate Model Context Protocol (MCP) servers from ReadTheDocs documentation.

Overview

autodocs-mcp is a CLI tool that automatically scrapes ReadTheDocs documentation, generates embeddings for semantic search, and creates a ready-to-use MCP server that can be integrated with VSCode and other MCP-compatible tools.

Features

  • 🔍 Format Detection: Automatically detects documentation format (Sphinx, MkDocs, or generic)
  • 📚 Smart Scraping: Uses objects.inv for Sphinx docs, sitemap.xml for MkDocs, with fallback to HTML crawling
  • 🧠 Semantic Search: Generates embeddings and creates a vector store for semantic search
  • ⚙️ MCP Server Generation: Creates a fully functional MCP server with tools and resources
  • 🔌 VSCode Integration: Generates VSCode configuration for easy integration

Installation

Install from PyPI:

pip install autodocs-mcp

We recommend using uv for faster and more reliable package management:

uv pip install autodocs-mcp

Or install from source:

git clone https://github.com/ziyacivan/autodocs-mcp.git
cd autodocs-mcp
uv sync

Usage

After installation, you can use autodocs-mcp directly from the terminal:

Basic Usage

autodocs-mcp generate https://docs.example.com/

Alternatively, you can run it as a Python module:

python -m autodocs_mcp generate https://docs.example.com/

Options

autodocs-mcp generate <readthedocs_url> \
  --output-dir ./mcp-server \
  --embedding-model all-MiniLM-L6-v2 \
  --python-path python

Options:

  • --output-dir: Output directory for generated files (default: ./mcp-server)
  • --embedding-model: Embedding model to use (default: all-MiniLM-L6-v2)
  • --cache-dir: Cache directory for scraped content (default: the cache/ subdirectory of --output-dir)
  • --python-path: Path to Python interpreter (default: python)
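
If the generator needs to be driven from a script, the options above can be assembled into an argument list. The following is a minimal sketch: build_generate_command is a hypothetical helper, not part of autodocs-mcp, and it uses only the flags and defaults listed above.

```python
from typing import Optional


def build_generate_command(
    url: str,
    output_dir: str = "./mcp-server",
    embedding_model: str = "all-MiniLM-L6-v2",
    cache_dir: Optional[str] = None,
    python_path: str = "python",
) -> list[str]:
    """Build an argv list for `autodocs-mcp generate` (hypothetical helper)."""
    cmd = [
        "autodocs-mcp", "generate", url,
        "--output-dir", output_dir,
        "--embedding-model", embedding_model,
        "--python-path", python_path,
    ]
    # Only pass --cache-dir when set, so the tool's own default applies otherwise.
    if cache_dir is not None:
        cmd += ["--cache-dir", cache_dir]
    return cmd


if __name__ == "__main__":
    print(build_generate_command("https://docs.example.com/"))
```

The resulting list can be handed to subprocess.run(cmd, check=True), which raises an error if the command exits with a non-zero status.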

Example

# Generate MCP server for a documentation site
autodocs-mcp generate https://docs.readthedocs.io/en/stable/

# This will:
# 1. Detect the documentation format
# 2. Scrape all pages
# 3. Generate embeddings
# 4. Create vector store
# 5. Generate MCP server code
# 6. Create VSCode configuration

Output Structure

After running the tool, you'll get:

mcp-server/
├── mcp_server.py          # Generated MCP server
├── vector_store/          # ChromaDB vector store
├── vscode_config.json     # VSCode configuration
└── cache/                 # Cached content (optional)

VSCode Integration

  1. The tool generates a vscode_config.json file with the MCP server configuration.

  2. Add the configuration to your VSCode settings.json:

{
  "mcp.servers": {
    "docs-example-com": {
      "command": "python",
      "args": ["/path/to/mcp-server/mcp_server.py"]
    }
  }
}

  3. Restart VSCode to load the MCP server.
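
The merge in step 2 can also be scripted. The sketch below assumes that vscode_config.json has the same shape as the snippet above (a top-level "mcp.servers" key); merge_mcp_servers is a hypothetical helper, not something the tool provides.

```python
import json


def merge_mcp_servers(settings: dict, generated: dict) -> dict:
    """Return a copy of `settings` with the generated MCP servers added.

    Assumes both dicts keep servers under a top-level "mcp.servers" key.
    """
    merged = dict(settings)
    # Copy the existing server table so the caller's dict is not mutated.
    servers = dict(merged.get("mcp.servers", {}))
    servers.update(generated.get("mcp.servers", {}))
    merged["mcp.servers"] = servers
    return merged


if __name__ == "__main__":
    settings = {"editor.fontSize": 14}
    generated = {
        "mcp.servers": {
            "docs-example-com": {
                "command": "python",
                "args": ["/path/to/mcp-server/mcp_server.py"],
            }
        }
    }
    print(json.dumps(merge_mcp_servers(settings, generated), indent=2))
```

Note that a real settings.json may contain comments, which json.loads rejects, so editing the file by hand can be the safer route.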

How It Works

Format Detection

The tool automatically detects the documentation format:

  1. Sphinx: Checks for objects.inv file
  2. MkDocs: Checks for sitemap.xml file
  3. Generic: Falls back to HTML crawling
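
The detection order above could be sketched roughly as follows. This is an illustration, not the tool's actual implementation: detect_format and url_exists are hypothetical names, and probing with a HEAD request is an assumption.

```python
from typing import Callable
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` succeeds (hypothetical helper)."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError):
        # URLError, HTTPError, and timeouts are all OSError subclasses.
        return False


def detect_format(base_url: str, exists: Callable[[str], bool] = url_exists) -> str:
    """Probe in the documented order: Sphinx, then MkDocs, then generic."""
    # A trailing slash makes urljoin append to the path instead of replacing
    # its last segment.
    base = base_url if base_url.endswith("/") else base_url + "/"
    if exists(urljoin(base, "objects.inv")):
        return "sphinx"
    if exists(urljoin(base, "sitemap.xml")):
        return "mkdocs"
    return "generic"
```

The exists parameter is injectable so the ordering can be exercised without network access. Checking objects.inv first matters when a site exposes both files.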

Scraping Process

  • Sphinx: Uses sphobjinv to parse objects.inv and extract all documentation objects
  • MkDocs: Parses sitemap.xml or analyzes HTML navigation structure
  • Generic: Crawls HTML pages starting from the index page
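
As an illustration of the sitemap.xml path, the sketch below pulls page URLs out of a sitemap using only the standard library. page_urls_from_sitemap is a hypothetical name, and the tool's own parser may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def page_urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://docs.example.com/</loc></url>
  <url><loc>https://docs.example.com/usage/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in page_urls_from_sitemap(SAMPLE):
        print(url)
```

This handles a plain urlset only. A sitemap index that points to further sitemaps would need an extra pass.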

Embedding Generation

  • Splits content into chunks (configurable size and overlap)
  • Generates embeddings using sentence transformers
  • Stores embeddings in ChromaDB vector store
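
The chunking step can be pictured with a small sliding-window sketch. Splitting by character count is an assumption made for illustration (the tool may split by tokens or sentences), and split_into_chunks and its default values are hypothetical.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `overlap` characters so that a sentence cut at
    a boundary still appears whole in one of the chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    print(split_into_chunks("abcdefghij", chunk_size=4, overlap=1))
```

With the sentence-transformers package installed, the resulting list could then be encoded with SentenceTransformer("all-MiniLM-L6-v2").encode(chunks) before being stored.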

MCP Server Features

The generated MCP server provides:

  • Resources: List of all documentation pages
  • Tools:
    • search_documentation: Semantic search across documentation
    • get_page_content: Get full content of a specific page
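
To give a feel for what a semantic search tool such as search_documentation does, the sketch below ranks stored chunks by cosine similarity to a query vector. It is a plain-Python stand-in for the vector store lookup, not ChromaDB's API or the generated server's code. The function names and the toy two-dimensional vectors are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vec: list[float],
    indexed: list[tuple[str, list[float]]],
    top_k: int = 3,
) -> list[tuple[float, str]]:
    """Return the `top_k` (score, text) pairs most similar to `query_vec`."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy embeddings; real ones come from the embedding model.
    index = [
        ("Installation", [1.0, 0.0]),
        ("Usage", [0.0, 1.0]),
        ("Quickstart", [1.0, 1.0]),
    ]
    for score, text in search([1.0, 0.0], index, top_k=2):
        print(f"{score:.3f}  {text}")
```

In the generated server the query would first be embedded with the same model used at indexing time, so that query and chunk vectors are comparable.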

Requirements

  • Python 3.10+
  • See pyproject.toml for full dependency list

Development

Local Development Setup

For local development, install the package in editable mode:

# Clone the repository
git clone https://github.com/ziyacivan/autodocs-mcp.git
cd autodocs-mcp

# Install in editable mode with development dependencies
pip install -e ".[dev]"

# Or using uv
uv sync --extra dev

After installation, you can use the CLI tool:

# Using the CLI command (after editable install)
autodocs-mcp --help

# Or as a Python module
python -m autodocs_mcp --help

Testing

# Run tests
pytest

# Run tests with coverage
pytest --cov=src/autodocs_mcp --cov-report=html

# Format code
black src/ tests/

# Lint code
ruff check src/ tests/

# Fix auto-fixable linting issues
ruff check --fix src/ tests/

# Install pre-commit hooks (optional but recommended)
pre-commit install

Testing Package Installation

To test the built package locally:

# Build the package
python -m build

# Install from the built wheel
pip install dist/autodocs_mcp-*.whl

# Test the CLI command
autodocs-mcp --help

License

MIT License - see LICENSE file for details.

Troubleshooting

Common Issues

Issue: "No pages found"

  • Ensure the URL is correct and accessible
  • Check if the documentation site requires authentication
  • Verify the site is using a supported format (Sphinx, MkDocs, or generic HTML)

Issue: "Could not find Python executable"

  • Specify the Python path explicitly using --python-path
  • Ensure Python 3.10+ is installed and in your PATH

Issue: Embedding model download fails

  • Check your internet connection
  • The model is downloaded from Hugging Face on first use, so the first run needs network access
  • Ensure you have sufficient disk space (~100MB per model)

Issue: MCP server not working in VSCode

  • Verify the Python path in vscode_config.json is correct
  • Ensure all dependencies are installed: uv pip install chromadb sentence-transformers mcp (or pip install chromadb sentence-transformers mcp)
  • Check that the VSCode MCP extension is installed and enabled
  • Restart VSCode after configuration changes

Performance Tips

  • Use a smaller embedding model (e.g., all-MiniLM-L6-v2) for faster processing
  • Enable caching to avoid re-scraping documentation
  • For large documentation sites, consider processing in batches

Roadmap

  • Support for additional documentation formats
  • Incremental updates (only scrape changed pages)
  • Custom chunking strategies
  • Multiple embedding model support
  • Docker containerization
  • Pre-built MCP servers for popular documentation sites

Contributing

Contributions are welcome! Please read our Contributing Guide for details on our code of conduct and the process for submitting pull requests.
