
MCP server for the public PuRe (PubMan) REST API — search and retrieve Max Planck Society publications.


pure-mpg-mcp


An MCP server for the PuRe (PubMan) REST API — the Max Planck Society's publication repository at pure.mpg.de.

It lets any MCP client (Claude Desktop, Claude Code, etc.) search and retrieve Max Planck publications, organizational units, collections, and feeds.

Public & read-only. This server is anonymous: it only reaches RELEASED, publicly visible records. It does not log in, write, or access embargoed/private content. The PuRe write/curation/admin endpoints require authorization and are intentionally not exposed.

PuRe is the center. Every tool starts from a PuRe record. A few tools enrich that record with other free public scholarly APIs (CONE, OpenAlex, Crossref, Unpaywall, Semantic Scholar), but always keyed on identifiers PuRe itself provides (DOI, person ids). The external sources are enrichment only — never queried on their own, never the canonical record.

Tools

Search & retrieval

| Tool | What it does |
| --- | --- |
| search_publications | Search by free text, author, genre, and year (compact results) |
| search_raw | Run a raw Elasticsearch query for advanced cases |
| get_publication | Full metadata for one item id (e.g. item_1552993) |
| find_by_doi | Look up a publication by DOI (bare or doi.org URL) |
| export_publication | Export as BibTeX, citation, MARC, EndNote, … |
| get_file_metadata | Metadata for an attached file (component) |
| search_organizations | Search institutes / departments (organizational units) |
| list_top_organizations | Top-level organizational units |
| search_collections | Search contexts (collections) |
| recent_publications | Feed of recently released items |
| open_access_feed | Feed of recent open-access items |
| service_info | Version / status of the PuRe instance |
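
find_by_doi accepts a DOI either bare or as a doi.org URL, which implies some normalization step before the lookup. The sketch below shows one way that could work; `normalize_doi` is a hypothetical helper written for illustration, not the server's actual code, and lowercasing the result is an assumption (DOIs are case-insensitive).

```python
import re

# Hypothetical helper: strips a doi.org URL prefix or a "doi:" prefix.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)


def normalize_doi(value: str) -> str:
    """Return the bare, lowercased DOI for a bare DOI or a doi.org URL."""
    doi = _DOI_PREFIX.sub("", value.strip())
    # Every DOI starts with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {value!r}")
    return doi.lower()
```

With this, `normalize_doi("https://doi.org/10.1000/XYZ")` and `normalize_doi("10.1000/xyz")` yield the same key, so both input forms resolve to one lookup.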

Authority & analysis (for bibliometrics)

| Tool | What it does |
| --- | --- |
| resolve_author | Resolve a name/person-id against the CONE authority → full name, affiliation, ORCID. Expands initials. |
| author_publications | List an author's publications (by CONE id or family name) |
| publication_statistics | Distributions over a result set: by year, genre, language, organization, or open_access |
| coauthorship_analysis | Collaboration patterns: avg team size, solo-authored count, top co-authors & institutions |
| analyze_authors | Extract & enrich authors of a publication/query: full names (initials expanded via CONE), ORCID, affiliation |
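
To make the coauthorship_analysis figures concrete, here is a minimal sketch of how average team size, solo-authored count, and top co-authors can be derived from a list of records. The record shape (a dict with an `authors` list of names) and the function name `coauthorship_summary` are assumptions for illustration; the server's real data model may differ.

```python
from collections import Counter


def coauthorship_summary(records, focus_author, top_n=3):
    """Summarize collaboration patterns over records with an 'authors' list.

    The record shape is assumed for this sketch, not taken from PuRe.
    """
    team_sizes = []
    coauthors = Counter()
    for rec in records:
        authors = rec.get("authors", [])
        if not authors:
            continue  # skip records without author data
        team_sizes.append(len(authors))
        # Count everyone except the author the analysis is centered on
        coauthors.update(a for a in authors if a != focus_author)
    n = len(team_sizes)
    return {
        "publications": n,
        "avg_team_size": sum(team_sizes) / n if n else 0.0,
        "solo_authored": sum(1 for size in team_sizes if size == 1),
        "top_coauthors": coauthors.most_common(top_n),
    }
```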

External enrichment (PuRe DOI → public scholarly APIs)

| Tool | What it does |
| --- | --- |
| enrich_publication | Attach external signals to a PuRe item: citations, topics, institutions (ROR), funders, license, OA full text. Pick sources from openalex, crossref, unpaywall, semanticscholar |
| get_citation_metrics | Citation counts for one publication side-by-side across OpenAlex, Crossref, and Semantic Scholar (incl. influential citations) |
| find_full_text | Locate free full text: PuRe's own public files first, then Unpaywall / OpenAlex open-access locations |
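
The lookup order described for find_full_text (PuRe's own public files first, then Unpaywall, then OpenAlex) is a simple first-match fallback. A rough sketch follows, assuming hypothetical field names (`visibility`, `url`) and a hypothetical function `pick_full_text`; none of these are confirmed by the project.

```python
from typing import Optional


def pick_full_text(pure_files, unpaywall_url=None, openalex_url=None) -> Optional[dict]:
    """Return the first available full-text location, preferring PuRe itself."""
    # 1. PuRe's own publicly visible files (field names are assumptions)
    for f in pure_files:
        if f.get("visibility") == "PUBLIC" and f.get("url"):
            return {"source": "pure", "url": f["url"]}
    # 2. External open-access locations, in the documented order
    for source, url in (("unpaywall", unpaywall_url), ("openalex", openalex_url)):
        if url:
            return {"source": source, "url": url}
    return None  # nothing freely available
```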

Enrichment sources

All are free and require no authentication. They are queried only with an identifier taken from a PuRe record, and any source lacking that record is silently omitted.

| Source | Adds | Notes |
| --- | --- | --- |
| CONE | Full author names, ORCID, affiliation | MPG's own authority service |
| OpenAlex | Citation count, topics, institutions/ROR, OA status, related works | No key |
| Crossref | References, funders, license, citing count | No key |
| Unpaywall | Definitive OA status + free full-text PDF | Requires a contact email |
| Semantic Scholar | Influential-citation count, TLDR summary | No key; rate-limited |

Citation counts differ across sources because each indexes a different corpus — that's expected, and why get_citation_metrics shows them side by side rather than picking one.
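
As an illustration of the side-by-side idea, the sketch below keeps every source's count as reported, drops sources that have no record, and reports the spread instead of a single "true" number. The function `side_by_side` and the input shape are hypothetical; the counts in the usage note are made-up sample values, not real data.

```python
def side_by_side(counts: dict) -> dict:
    """Keep per-source citation counts; omit sources that returned nothing."""
    # A value of None stands for "this source has no record for the DOI"
    available = {src: n for src, n in counts.items() if n is not None}
    if not available:
        return {"counts": {}, "spread": None}
    values = available.values()
    return {"counts": available, "spread": max(values) - min(values)}
```

For example, `side_by_side({"openalex": 120, "crossref": 98, "semanticscholar": None})` keeps the two available counts and reports a spread of 22 between them.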

Note on analytics. PuRe's search endpoint strips Elasticsearch aggregations, so publication_statistics and coauthorship_analysis fetch a capped sample of records (scrolled, default 300–500) and aggregate client-side. When numberOfRecords exceeds the cap, treat the figures as sample-based, and raise max_records if you need more (at the cost of more requests).
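
A minimal sketch of what client-side aggregation over a capped sample looks like, including the flag a caller would check before trusting the figures. The record shape (a dict with a `year` key) and the name `year_distribution` are assumptions for illustration; only `numberOfRecords` and `max_records` come from the note above.

```python
from collections import Counter


def year_distribution(sample, number_of_records, max_records=300):
    """Aggregate a fetched sample by year and flag sample-based results.

    sample: records actually fetched (at most max_records).
    number_of_records: total hits the search reported (numberOfRecords).
    """
    by_year = Counter(rec["year"] for rec in sample if rec.get("year"))
    return {
        "by_year": dict(sorted(by_year.items())),
        "sampled": len(sample),
        "total": number_of_records,
        # True when the cap cut the result set short
        "sample_based": number_of_records > max_records,
    }
```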

Install

Requires Python ≥ 3.10. Using uv:

# from source (clone first)
git clone https://github.com/Toymen/pure-mpg-mcp.git
cd pure-mpg-mcp
uv pip install -e .

Once published to PyPI it will also be installable directly:

uvx pure-mpg-mcp          # run without installing
# or: uv pip install pure-mpg-mcp

Run

pure-mpg-mcp      # stdio transport

Claude Desktop / Claude Code config

Add to your MCP config (claude_desktop_config.json or .mcp.json):

{
  "mcpServers": {
    "pure-mpg": {
      "command": "pure-mpg-mcp"
    }
  }
}

If you installed into a virtualenv, point command at that venv's pure-mpg-mcp binary (e.g. /path/to/.venv/bin/pure-mpg-mcp). Once the package is on PyPI you can instead have the client fetch and run it via uvx:

{
  "mcpServers": {
    "pure-mpg": {
      "command": "uvx",
      "args": ["pure-mpg-mcp"]
    }
  }
}
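
If you would rather generate the entry than write it by hand, the following sketch picks the absolute path of an installed pure-mpg-mcp binary when one is on PATH and falls back to the uvx form otherwise. The helper name `mcp_server_entry` is made up for this example; where you paste the output depends on your client.

```python
import json
import shutil


def mcp_server_entry(executable="pure-mpg-mcp"):
    """Build an mcpServers entry, preferring an installed binary over uvx."""
    path = shutil.which(executable)  # absolute path, or None if not on PATH
    if path:
        server = {"command": path}
    else:
        server = {"command": "uvx", "args": [executable]}
    return {"mcpServers": {"pure-mpg": server}}


if __name__ == "__main__":
    print(json.dumps(mcp_server_entry(), indent=2))
```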

Configuration

| Env var | Default | Purpose |
| --- | --- | --- |
| PURE_BASE_URL | https://pure.mpg.de/rest | Override the API base (e.g. a QA instance) |
| PURE_CONE_URL | https://pure.mpg.de/cone | Override the CONE authority base |
| PURE_CONTACT_EMAIL | (unset) | A real contact email. Used for the OpenAlex/Crossref "polite pool" and required by Unpaywall: find_full_text and enrich_publication skip the Unpaywall source (and say so) until this is set. @example.com addresses are treated as unset. |
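
The rules in the table can be read as the following sketch: defaults for the two base URLs, and a contact email that counts as unset when it is missing, blank, or an @example.com address. `load_settings` and the returned keys are hypothetical names for illustration, not the package's actual configuration code.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve the documented env vars; names of the returned keys are assumed."""
    email: Optional[str] = env.get("PURE_CONTACT_EMAIL", "").strip() or None
    # Placeholder addresses are treated the same as no address at all
    if email and email.lower().endswith("@example.com"):
        email = None
    return {
        "base_url": env.get("PURE_BASE_URL", "https://pure.mpg.de/rest").rstrip("/"),
        "cone_url": env.get("PURE_CONE_URL", "https://pure.mpg.de/cone").rstrip("/"),
        "contact_email": email,
        # Unpaywall needs an email, so it is skipped without one
        "unpaywall_enabled": email is not None,
    }
```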

Example

"Find recent open-access articles from the Max Planck Institute for Evolutionary Anthropology about Neanderthals, and give me the BibTeX for the top hit."

The agent calls search_publications(text="Neanderthal", genre="ARTICLE"), then export_publication(item_id, format="BibTex").
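
On the wire, each of those calls is an MCP `tools/call` JSON-RPC request. The sketch below builds the two request payloads for the example; the `tool_call` helper is hypothetical, and `item_1552993` is only a placeholder id standing in for whatever the search returns as its top hit.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(name, **arguments):
    """Build an MCP tools/call request for the given tool and arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


search = tool_call("search_publications", text="Neanderthal", genre="ARTICLE")
# Placeholder id: in practice this comes from the search result
export = tool_call("export_publication", item_id="item_1552993", format="BibTex")

if __name__ == "__main__":
    print(json.dumps([search, export], indent=2))
```

An MCP client library normally builds and sends these messages for you; the payloads are shown only to make the two-step flow explicit.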

Development

uv pip install -e ".[dev]"
ruff check .
pytest -m "not network"   # offline unit tests (what CI runs)
pytest                     # include live API smoke tests (network)

Tests are split with a network marker: offline tests cover all the pure aggregation/parsing logic and run in CI; network-marked tests hit the live public APIs and are skipped in CI so the suite never depends on third-party uptime or rate limits. GitHub Actions runs lint + offline tests on Python 3.10 and 3.12 (.github/workflows/ci.yml).

Publishing

MCP servers aren't "hosted" on GitHub — GitHub holds the source, and clients launch the server locally over stdio. The standard distribution path:

  1. GitHub — source of truth (this repo).
  2. PyPI — so users can uvx pure-mpg-mcp. Tag a release and the publish workflow builds and uploads via PyPI Trusted Publishing (no stored token). Configure the trusted publisher on PyPI first.
  3. MCP Registry (optional) — server.json is the manifest; the <!-- mcp-name: io.github.toymen/pure-mpg-mcp --> line in this README verifies ownership. Publish with the mcp-publisher CLI after the PyPI release exists.

API reference

License

MIT. This project is an independent client and is not affiliated with or endorsed by the Max Planck Society / Max Planck Digital Library.
