Skip to main content

MCP server for searching and downloading files from the Internet Archive (archive.org)

Project description

mcarchive-org

An MCP (Model Context Protocol) server that lets an LLM search, inspect, and download content from the Internet Archive.

Built on FastMCP + httpx. No API key required — archive.org's read endpoints are public.

Tools

Tool Purpose
search_items Small Solr-style search via advancedsearch.php (1–200 rows, paginated)
scrape_items Bulk cursor-paginated search via Scrape API (count ≥ 100)
get_item_metadata Metadata for one item; skips the (possibly huge) files list by default
list_files Files array with optional format / glob filtering — includes download_url per file
get_file_url Build a canonical download URL without hitting the network
download_file Stream a file to disk with resume support and optional MD5 verification

Also exposes an MCP resource template: archive://item/{identifier}.

Install & run

# From a checkout:
uv sync
uv run mcarchive-org

# Or from PyPI (once published):
uvx mcarchive-org

Register with Claude Code:

claude mcp add archive-org -- uvx mcarchive-org
# or, from a local checkout:
claude mcp add archive-org -- uv run --directory /path/to/mcarchive-org mcarchive-org

Environment

Variable Default Purpose
MCARCHIVE_DOWNLOAD_ROOT ./downloads Base directory for download_file

Example flow

search_items(query='mediatype:audio AND creator:"Grateful Dead"', sort=['downloads desc'])
  → identifier 'gd77-05-08.sbd.hicks.4982.sbeok.shnf' (among others)

list_files(identifier='gd77-05-08.sbd.hicks.4982.sbeok.shnf', formats=['VBR MP3'])
  → [{ name: 'gd1977-05-08d1t01.mp3', size: 6342912, md5: '…', download_url: '…' }, …]

download_file(identifier='gd77-…', filename='gd1977-05-08d1t01.mp3', verify_md5='…')
  → { path: './downloads/gd77-…/gd1977-…mp3', bytes: 6342912, md5_ok: True }

Query syntax notes

archive.org uses a Solr/Lucene dialect:

  • mediatype:(audio OR movies) — restrict to media types
  • collection:etree — items in a specific collection
  • date:[1977-01-01 TO 1977-12-31] — date ranges
  • creator:"Grateful Dead" — phrase match
  • -subject:bootleg — exclusion
  • Sort by downloads desc, date asc, addeddate desc, etc.

See archive.org's search docs for the full grammar.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mcarchive_org-2026.4.21.tar.gz (105.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mcarchive_org-2026.4.21-py3-none-any.whl (15.6 kB view details)

Uploaded Python 3

File details

Details for the file mcarchive_org-2026.4.21.tar.gz.

File metadata

  • Download URL: mcarchive_org-2026.4.21.tar.gz
  • Upload date:
  • Size: 105.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.3 {"installer":{"name":"uv","version":"0.11.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"EndeavourOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for mcarchive_org-2026.4.21.tar.gz
Algorithm Hash digest
SHA256 ad7a1d7a9b55d255b92389d42dce2c21bd7bf02b70057f95dbc1f9764279a6e2
MD5 1ea69eb612daec99b1e27bd7a00ddbeb
BLAKE2b-256 b1c121b938738d4ecf12c2e6bce474df2c1f814550db134af5eb99bbafa47486

See more details on using hashes here.

File details

Details for the file mcarchive_org-2026.4.21-py3-none-any.whl.

File metadata

  • Download URL: mcarchive_org-2026.4.21-py3-none-any.whl
  • Upload date:
  • Size: 15.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.3 {"installer":{"name":"uv","version":"0.11.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"EndeavourOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for mcarchive_org-2026.4.21-py3-none-any.whl
Algorithm Hash digest
SHA256 7e346926bc5ea8ca0924f3f8ee651f7aa388c83129ef2740d00ea71c063790a0
MD5 f0dc2e3e7ea8b657ccd37644eaa44f6a
BLAKE2b-256 bb1b108fa759a09076b4a80809399e1fc6e5b84b114d9a83c8b1f94c636fb939

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page