wayback-mcp

A Model Context Protocol server giving Claude structured access to the Internet Archive's Wayback Machine.


Overview

wayback-mcp is an async Python MCP server that exposes five core Internet Archive APIs (Availability, CDX, Advanced Search, Metadata, and Wayback content) as first-class tools, prompts, and resources for Claude. It handles rate limiting, retry/back-off, and response-shape normalisation, so the model only sees structured Pydantic data.
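
As a rough illustration of response-shape normalisation, the sketch below turns a raw Availability API payload into a small typed object. Snapshot and normalise_availability are hypothetical names, not the project's own models (which the overview describes as Pydantic); a stdlib dataclass is used only to keep the example dependency-free. The sample payload follows the public response shape of https://archive.org/wayback/available.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical normalised shape for one archived capture."""

    url: str
    captured_at: datetime
    status: int


def normalise_availability(payload: dict[str, Any]) -> Snapshot | None:
    """Return the closest snapshot, or None when nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Wayback timestamps are 14-digit UTC strings: YYYYMMDDhhmmss.
    captured_at = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return Snapshot(
        url=closest["url"],
        captured_at=captured_at.replace(tzinfo=timezone.utc),
        status=int(closest["status"]),  # the API returns the status as a string
    )


# Sample payload in the shape the Availability API returns.
sample = {
    "url": "example.com",
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
            "timestamp": "20130919044612",
            "status": "200",
        }
    },
}
snapshot = normalise_availability(sample)
print(snapshot)
```

An unarchived URL yields an empty archived_snapshots object, which this sketch maps to None rather than raising.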

Features

  • Six MCP tools covering availability checks, snapshot lookups, full-text item search, domain crawls, page-text extraction, and item metadata
  • Four guided prompts: research_topic, track_site_changes, audit_link_rot, setup_authentication
  • One MCP resource: wayback://item/{identifier} exposes IA item metadata as JSON
  • Async token-bucket rate limiter with per-endpoint buckets and Retry-After honoring
  • In-memory response cache with per-endpoint TTLs to keep token usage and IA load low
  • Internet Archive S3 authentication (optional) for higher rate-limit ceilings
  • Structured error model — expected failures return ToolError; unexpected ones raise
  • Tested against live IA APIs via an opt-in --integration pytest flag
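
The rate-limiting feature listed above can be pictured with a minimal async token bucket. This is a sketch under assumptions, not the project's implementation: the TokenBucket class, its pause_for method, and the rate and capacity values are all hypothetical. A per-endpoint setup would keep one such bucket per API in a dict.

```python
import asyncio
import time


class TokenBucket:
    """Hypothetical async token bucket: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self._tokens = capacity
        self._updated = time.monotonic()
        self._blocked_until = 0.0
        self._lock = asyncio.Lock()

    def pause_for(self, retry_after_seconds: float) -> None:
        """Honor a Retry-After value by blocking the bucket until it elapses."""
        deadline = time.monotonic() + retry_after_seconds
        self._blocked_until = max(self._blocked_until, deadline)

    async def acquire(self) -> None:
        """Wait until one token is available, then consume it."""
        async with self._lock:  # waiters are served one at a time
            while True:
                now = time.monotonic()
                if now < self._blocked_until:
                    await asyncio.sleep(self._blocked_until - now)
                    continue
                # Refill in proportion to the time elapsed since the last update.
                elapsed = now - self._updated
                self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
                self._updated = now
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                # Sleep roughly long enough for the missing fraction to refill.
                await asyncio.sleep((1 - self._tokens) / self.rate)


async def demo() -> float:
    """Acquire four tokens from a bucket that holds two; the last two must wait."""
    bucket = TokenBucket(rate=20.0, capacity=2)  # illustrative numbers only
    start = time.monotonic()
    for _ in range(4):
        await bucket.acquire()
    return time.monotonic() - start


elapsed = asyncio.run(demo())
print(f"four acquisitions took {elapsed:.3f}s")
```

When a response carries a Retry-After header, a caller could pass its value in seconds to pause_for so that every subsequent acquire on that endpoint waits it out.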

Installation

Requires Python 3.11+.

pip install mcp-server-wayback

This works once the package is published to PyPI. Until then, see Development for the from-source workflow.

Usage

Wire it into Claude Desktop

Add an entry to claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "wayback": {
      "command": "mcp-server-wayback"
    }
  }
}

Restart Claude Desktop. The wayback tools, prompts, and resources will appear in the MCP picker.

If you prefer not to install globally, run it on demand with uvx:

{
  "mcpServers": {
    "wayback": {
      "command": "uvx",
      "args": ["mcp-server-wayback"]
    }
  }
}

Optional: Internet Archive authentication

Set both keys in the server's environment to authenticate every IA request and raise your rate-limit ceiling. In Claude Desktop, add an env object to the wayback entry in claude_desktop_config.json. Run the setup_authentication prompt from Claude to walk through it interactively.

"env": {
  "WAYBACK_MCP_IA_ACCESS_KEY": "<your access key>",
  "WAYBACK_MCP_IA_SECRET_KEY": "<your secret key>"
}

Get keys at https://archive.org/account/s3.php.
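
To show where the env block sits relative to the server entry, here is a small sketch that builds the wayback entry and merges it into an existing config dict. The helper names build_server_entry and merge_into_config, and the "other" server in the sample config, are hypothetical; only the command name and the two environment variable names come from this README.

```python
from __future__ import annotations

import json


def build_server_entry(access_key: str | None = None, secret_key: str | None = None) -> dict:
    """Build the "wayback" entry; the env block is added only when both keys are given."""
    entry: dict = {"command": "mcp-server-wayback"}
    if access_key and secret_key:
        entry["env"] = {
            "WAYBACK_MCP_IA_ACCESS_KEY": access_key,
            "WAYBACK_MCP_IA_SECRET_KEY": secret_key,
        }
    return entry


def merge_into_config(config: dict, entry: dict) -> dict:
    """Return a copy of the config with the wayback entry added, keeping other servers."""
    servers = {**config.get("mcpServers", {}), "wayback": entry}
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server already registered.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
config = merge_into_config(
    existing,
    build_server_entry("<your access key>", "<your secret key>"),
)
print(json.dumps(config, indent=2))
```

The printed JSON has env as a sibling of command inside the wayback entry, which is the layout Claude Desktop expects for per-server environment variables.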

Tools

Tool                  Purpose
check_availability    Is this URL archived? Returns the closest snapshot
lookup_snapshots      List CDX snapshots for a URL with date / status filters
search_archive        Lucene search across IA collections with mediatype + year range
search_domain         Discover archived URLs under a domain or path prefix
get_snapshot_content  Fetch an archived page and extract its readable text
get_item_metadata     Rich structured metadata for any IA item identifier
Prompts

Prompt                What it does
research_topic        Multi-mediatype IA search → synthesised topic overview
track_site_changes    Sample snapshots over time → narrate how a page evolved
audit_link_rot        Bulk-check URLs and surface archived alternatives
setup_authentication  Walks the user through configuring IA S3 keys

Development

Requires Python 3.11+ and uv.

git clone https://github.com/lakshyamehta03/wayback-machine-mcp.git
cd wayback-machine-mcp
uv sync
uv run mcp-server-wayback      # run the server
uv run pytest                  # unit tests (httpx mocked via respx)
uv run pytest --integration    # also hit live Internet Archive APIs

CI runs the unit suite on every push and pull request via GitHub Actions.

License

MIT. The Wayback Machine logo is © Internet Archive and used here under fair use to identify the upstream service this project integrates with.
