MCP server for Perseus Greek and Latin text research

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: MIT.

Perseus-mcp

Give Claude / Cursor / Windsurf direct access to the Perseus Digital Library — ancient Greek and Latin texts, precise CTS navigation, plaintext, search, and more.

A high-quality MCP server for Classical Greek and Latin literature. It runs as a local FastMCP server so MCP-capable applications can attach these Perseus tools to the LLM/model provider of your choice.

Features

This server exposes twenty-three MCP tools. Every tool returns a text payload: some are raw Perseus CTS XML or Scaife JSON, while the discovery and plaintext helpers return locally shaped JSON or readable text.

  • get_passage(urn) — fetch a CTS passage by URN.
  • get_passage_plus(urn) — fetch passage text plus contextual metadata.
  • get_passage_plaintext(urn) — fetch a CTS passage as plain readable text.
  • get_valid_references(urn, level=None) — retrieve navigable citation references for a work or edition.
  • get_valid_references_json(urn, level=None, limit=100, offset=0) — retrieve paged citation references as JSON (limit: 1–500).
  • count_valid_references(urn, level=None) — count valid references without returning the full list.
  • get_capabilities() — list available texts/editions from Perseus CTS.
  • get_cache_status() — inspect local metadata cache state.
  • refresh_metadata_cache() — refresh cached CTS and Scaife library metadata.
  • clear_metadata_cache() — clear in-memory and disk metadata cache entries.
  • list_text_groups(language=None, query=None, limit=100, offset=0) — list matching authors/textgroups and works with pagination metadata (limit: 1–500).
  • get_author_resources(author, language=None) — list works, editions, and translations for a matching author name or CTS textgroup URN.
  • find_author_names(query, language=None, limit=100, offset=0) — find author/textgroup names by partial name match with pagination metadata (limit: 1–500).
  • get_work_resources(urn_or_title, language=None) — list editions, translations, and resources for a work, optionally filtered by original language.
  • get_label(urn) — fetch human-readable metadata labels for a URN.
  • get_first_urn(urn) — get the first navigable URN under a work/edition.
  • get_prev_next_urn(urn) — get neighboring passage URNs for navigation.
  • search_perseus(query, language="greek", query_format="auto", author=None, search_kind="form", preserve_operators=False, page_num=1, text_group=None, work=None, result_format="instances") — search texts via Scaife search API. Greek queries may be entered as Unicode Greek (for example μῆνιν) or Beta Code (for example mh=nin).
  • search_within_text(query, text_urn, ..., size=10, offset=0) — search within one Scaife text/edition URN (size: 1–500).
  • get_passage_highlights(query, passage_urn, ...) — get Scaife token highlight positions for one passage.
  • get_scaife_library_metadata(urn) — get Scaife JSON metadata for a library URN.
  • get_scaife_passage_json(urn) — get Scaife JSON for a passage URN.
  • get_scaife_passage_text(urn) — get Scaife plaintext for a passage URN.
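
Several of the tools above page their results with limit and offset (limit: 1–500). The loop below is a minimal sketch of how a client might walk all pages. fetch_page is a hypothetical stand-in for whatever calls the tool and returns one page of items; this README does not show the actual response shape of tools such as get_valid_references_json, so a plain list is assumed.

```python
from typing import Callable, Iterator, List

MAX_LIMIT = 500  # upper bound stated in the tool list above


def iter_pages(
    fetch_page: Callable[[int, int], List[str]],
    limit: int = 100,
) -> Iterator[str]:
    """Yield items page by page until a short or empty page is returned.

    fetch_page(limit, offset) is a hypothetical callable, not part of perseus-mcp.
    """
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short page means there is nothing left
            break
        offset += limit


# Stand-in data source: 250 made-up reference labels, no network involved.
_REFS = [f"1.{n}" for n in range(1, 251)]


def fake_fetch(limit: int, offset: int) -> List[str]:
    return _REFS[offset : offset + limit]


all_refs = list(iter_pages(fake_fetch, limit=100))
```

Stopping on a short page avoids needing a total count up front; when a total is available (for example from count_valid_references), it could be used instead.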

Greek Search Input

search_perseus normalizes Greek search terms before sending them to Scaife. You can pass Unicode Greek directly, or use Beta Code such as mh=nin a)/eide. Search queries must contain at least one non-whitespace character, and they are normalized to composed Greek Unicode (NFC), matching sampled Perseus Greek text. The tool uses Scaife's JSON search route and returns the JSON response as text.

The default query_format="auto" detects explicit Beta Code marks like =, /, (, ), and *, and also treats short unaccented Greek-looking queries such as logos as Beta Code. If an ASCII query is ambiguous, set query_format="betacode" to force conversion or query_format="unicode" to preserve it exactly.

The language argument accepts recognized Greek aliases (greek, grc, gr) or Latin aliases (latin, lat, la); blank input defaults to Greek and unrecognized values raise an error. It controls query normalization and is not currently sent to Scaife as a corpus language filter. For inventory discovery, list_text_groups, find_author_names, get_author_resources, and get_work_resources accept language="greek" or language="latin" (and common codes such as grc or lat) as an actual work language filter. find_author_names merges the legacy CTS inventory with the Scaife library catalog, so Scaife-only authors such as Philo Judaeus remain discoverable. Passage and navigation tools use CTS URNs, whose greekLit/latinLit namespace and edition identifier already select the text.

Pass author to resolve a CTS author/textgroup name or URN. When it resolves to exactly one textgroup, Scaife receives a server-side text_group filter; ambiguous matches fall back to local CTS URN-prefix filtering of the current result page. Use page_num for pagination and pass text_group or work to use Scaife's server-side scope filters directly.

Use search_kind="lemma" for lemma search; the default search_kind="form" keeps existing form-search behavior. For Scaife operator queries such as quoted phrases, -, |, *, or ~, set preserve_operators=True so Beta Code auto-detection does not consume operator characters. For example: search_perseus('"μῆνιν ἄειδε"', query_format="unicode", preserve_operators=True), search_perseus("μῆνιν -ἄειδε", query_format="unicode", preserve_operators=True), or search_perseus("λόγος | ἀνήρ", search_kind="lemma", query_format="unicode", preserve_operators=True).
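
To make the two normalization ideas above concrete, here is a small sketch of explicit Beta Code mark detection and NFC composition. It is not the server's implementation: looks_like_explicit_beta_code is a hypothetical helper that covers only the explicit marks listed above, and it deliberately leaves out the heuristic for short unaccented queries such as logos, which this README does not specify.

```python
import unicodedata

# Marks that the text above names as explicit Beta Code signals.
BETA_CODE_MARKS = frozenset("=/()*")


def looks_like_explicit_beta_code(query: str) -> bool:
    """Hypothetical helper: True for an ASCII query that carries a Beta Code mark."""
    return query.isascii() and any(ch in BETA_CODE_MARKS for ch in query)


def to_composed_greek(query: str) -> str:
    """Normalize to NFC so base letters and combining accents become single code points."""
    return unicodedata.normalize("NFC", query)


# mu, eta, combining perispomeni, nu, iota, nu (a decomposed spelling of the first word of the Iliad)
decomposed = "\u03bc\u03b7\u0342\u03bd\u03b9\u03bd"
composed = to_composed_greek(decomposed)

explicit = looks_like_explicit_beta_code("mh=nin a)/eide")
plain_ascii = looks_like_explicit_beta_code("logos")
```

Here composed has one code point fewer than decomposed, because eta and the combining perispomeni merge into a single precomposed character. The plain ASCII word logos carries no explicit mark, which is why a separate heuristic (or query_format="betacode") is needed for such input.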

Local Metadata Cache

Discovery and navigation tools cache stable CTS metadata locally to avoid repeated multi-megabyte GetCapabilities and GetValidReff requests. The default disk cache lives in .cache/perseus-mcp under the current working directory, and the running server process also keeps an in-memory cache. Configure it with:

  • PERSEUS_MCP_CACHE_DIR — override the disk cache directory.
  • PERSEUS_MCP_CACHE_TTL_SECONDS — set cache TTL; default is 86400 seconds.
  • PERSEUS_MCP_DISABLE_CACHE=1 — disable both memory and disk cache reads/writes.

The current working directory is the directory from which the Python process is started. Running the MCP server from the repository root uses .cache/perseus-mcp; running a notebook from examples/ would otherwise use examples/.cache/perseus-mcp. That is not a second server instance, only a second cache location for a separate Python process. To keep one cache location across notebooks and MCP clients, set PERSEUS_MCP_CACHE_DIR to an absolute path such as /path/to/Perseus-mcp/.cache/perseus-mcp. Disk entries are written to unique sibling temporary files and atomically replaced, so multiple local processes can safely share that directory without exposing partially written cache files. Disk-cache write failures emit a MetadataCacheWarning but do not discard a successfully fetched upstream response.
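
The write pattern described above (unique sibling temporary file, then atomic replace) can be sketched with the standard library as follows. This is an illustration under assumptions, not the project's code: resolve_cache_dir, atomic_write_text, and the entry name capabilities.xml are hypothetical, and only the PERSEUS_MCP_CACHE_DIR variable and the default path come from this README.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Mapping

DEFAULT_CACHE_DIR = Path(".cache") / "perseus-mcp"  # relative to the working directory


def resolve_cache_dir(env: Mapping[str, str]) -> Path:
    """Return the PERSEUS_MCP_CACHE_DIR override if set, else the default path."""
    override = env.get("PERSEUS_MCP_CACHE_DIR", "").strip()
    return Path(override) if override else DEFAULT_CACHE_DIR


def atomic_write_text(target: Path, content: str) -> None:
    """Write to a unique sibling temp file, then atomically replace the target."""
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if writing or replacing failed.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as scratch:
    cache_dir = resolve_cache_dir({"PERSEUS_MCP_CACHE_DIR": scratch})
    entry = cache_dir / "capabilities.xml"  # hypothetical cache entry name
    atomic_write_text(entry, "<first/>")
    atomic_write_text(entry, "<second/>")
    final_content = entry.read_text(encoding="utf-8")
    leftovers = [p.name for p in cache_dir.iterdir() if p.name.endswith(".tmp")]
```

Creating the temporary file in the same directory as the target keeps both on one filesystem, which is what lets os.replace swap the file in a single step.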

URN Discovery

Available edition URNs can differ between Perseus CTS and Scaife search results, and the live inventory can change. Use get_author_resources, get_work_resources, or list_text_groups before constructing edition-specific CTS passage URNs. The notebooks select advertised CTS editions from discovery results instead of assuming that a Scaife edition URN is valid for Perseus CTS.

The live Perseus CTS implementation may return malformed HTML for GetFirstUrn and GetPrevNextUrn. The MCP tools detect that response and derive valid XML results from GetValidReff.
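
Deriving navigation from a list of valid references amounts to an index lookup in an ordered list. The sketch below shows that idea only; the function names are hypothetical, the real tools return XML rather than tuples, and the URNs are illustrative placeholders that may not match an edition currently advertised by Perseus CTS.

```python
from typing import List, Optional, Tuple


def first_ref(refs: List[str]) -> Optional[str]:
    """Return the first navigable reference, or None for an empty list."""
    return refs[0] if refs else None


def prev_next(refs: List[str], current: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the neighbors of current in document order (None at either end)."""
    index = refs.index(current)  # raises ValueError if current is not a valid reference
    previous = refs[index - 1] if index > 0 else None
    following = refs[index + 1] if index + 1 < len(refs) else None
    return previous, following


# Illustrative, ordered reference list as a GetValidReff-style call might supply it.
EDITION = "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2"  # placeholder edition URN
refs = [f"{EDITION}:1.{line}" for line in (1, 2, 3)]

start = first_ref(refs)
middle_neighbors = prev_next(refs, refs[1])
first_neighbors = prev_next(refs, refs[0])
```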

Perseus may also return 429 Too Many Requests when a workflow sends many CTS requests in a short period. Pause before retrying, reduce concurrency, and add delays to passage-processing loops. The server currently exposes the upstream HTTP error instead of retrying automatically.
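
Because the server does not retry on its own, a calling workflow may want a small backoff wrapper. The following is a minimal sketch, assuming the caller can recognize a rate-limit failure: TooManyRequests is a hypothetical exception standing in for however your client surfaces an upstream 429, and the delay values are arbitrary examples.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class TooManyRequests(Exception):
    """Hypothetical stand-in for an upstream HTTP 429 error."""


def call_with_backoff(
    func: Callable[[], T],
    delays: Sequence[float] = (1.0, 2.0, 4.0),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, pausing for each delay in turn after a rate-limit error."""
    for delay in delays:
        try:
            return func()
        except TooManyRequests:
            sleep(delay)
    return func()  # final attempt; a remaining rate-limit error propagates


# Demonstration with a fake call that is rate-limited twice, then succeeds.
attempts: List[int] = []
waits: List[float] = []


def flaky_call() -> str:
    attempts.append(1)
    if len(attempts) <= 2:
        raise TooManyRequests()
    return "ok"


outcome = call_with_backoff(flaky_call, sleep=waits.append)
```

Passing sleep as a parameter keeps the sketch testable without real pauses; in a passage-processing loop the default time.sleep would apply.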

Setup

1) Install dependencies

Using uv:

uv sync

Or with pip:

pip install -e .

Once a release is published to PyPI, users can install it without cloning the repository:

pip install perseus-mcp

For development and tests:

pip install -e ".[dev]"

2) Run tests

pytest

With uv, use:

uv run --extra dev pytest

3) Run locally

uv run perseus-mcp

The installed console command and module entry point are equivalent:

perseus-mcp
python -m perseus_mcp

4) Inspect tools (optional)

npx @modelcontextprotocol/inspector uv run perseus-mcp

Test strategy and automation

Perseus MCP uses layered checks rather than relying on one end-to-end test. Most behavior is covered by deterministic pytest tests with local XML/JSON fixtures and mocked asynchronous HTTP calls. GitHub Actions separately verifies the supported Python and operating-system matrix, package artifacts, secrets, release tags, and publication.

The workflow files under .github/workflows/ are the executable source of truth.

Test suite organization

Pytest is configured in pyproject.toml to import from src/ and discover tests under tests/.

Test module                          Main responsibility
test_author_resources.py             CTS author, work, and resource parsing; merged-author behavior
test_disk_cache.py                   Atomic cache writes, cache disabling, cleanup, and concurrent writers
test_exploration_tools.py            Discovery, navigation, cache tools, author scope, and structured responses
test_greek_query_normalization.py    Unicode Greek, Beta Code, Scaife parameters, and search operators
test_limits_and_language.py          Result limits, paging bounds, and language aliases
test_packaging.py                    Metadata, dependencies, documentation assets, notebooks, and workflow expectations
test_scaife_urls.py                  Safe URL construction and CTS URN percent encoding
test_shared_http_client.py           Connection reuse, event-loop changes, shutdown, and HTTP errors
test_xml_hardening.py                Safe XML parsing and rejection of entity-based XML attacks

A regression fix should include a focused test that fails for the original problem. Tests should assert observable behavior and cover failure paths and boundary values as well as successful calls.

Isolation from Perseus and Scaife

Routine tests do not depend on live upstream services. HTTP helpers are monkeypatched with asynchronous test doubles, while representative CTS XML and Scaife JSON are stored in test fixtures. This keeps CI deterministic when catalogs change or an upstream service is unavailable, avoids unnecessary traffic to public scholarly infrastructure, and makes malformed-response tests safe.

Live read-only probes may be used during manual review for endpoint compatibility or connection-lifecycle changes, but they supplement rather than replace the automated suite.

Async test cleanup

Several tests invoke tools with asyncio.run(), which creates a new event loop for each call. The server uses a process-wide shared httpx.AsyncClient, so the autouse fixture in tests/conftest.py closes and resets that client after every test. Tests that manipulate shared client state must also leave it reset.

Local test commands

Run the complete suite:

python -m pytest

Run a module or one test:

python -m pytest tests/test_disk_cache.py
python -m pytest tests/test_disk_cache.py::test_disk_cache_set_writes_readable_content

Show skipped tests, the slowest tests, and local variables on failure:

python -m pytest -ra --durations=10 -l

Disable metadata-cache reads and writes during a test run:

PERSEUS_MCP_DISABLE_CACHE=1 python -m pytest

PowerShell equivalent:

$env:PERSEUS_MCP_DISABLE_CACHE = "1"
python -m pytest

GitHub Actions test matrix

.github/workflows/tests.yml installs the editable project with development dependencies and runs python -m pytest on:

  • Ubuntu and Windows;
  • Python 3.11, 3.12, and 3.13.

The matrix uses fail-fast: false, so every platform/version job finishes even when one fails. This makes version-specific and Windows-specific regressions visible in one run. Documentation-only changes under docs/** are excluded from the Python test workflow; exact event and branch filters remain defined in the workflow file.

Package validation

.github/workflows/package.yml checks that the repository produces a valid source distribution and pure-Python wheel. It installs Python 3.12, runs:

python -m build
python -m twine check dist/*

and uploads dist/ as the python-package workflow artifact. The workflow is path-filtered to package-relevant files and supports manual dispatch.

tests/test_packaging.py complements this build by checking repository-level expectations such as metadata, dependencies, documentation files, notebook JSON, and workflow configuration. Both layers matter: metadata tests can pass while an isolated build fails, and a package can build while required repository assets are missing.

Secret scanning

.github/workflows/secret-scan.yml rejects tracked OpenRouter keys matching:

sk-or-v1-[A-Za-z0-9_-]{20,}

The workflow reports affected files without printing the matching secret. A detected key must be removed and rotated; the check should never be bypassed. This focused scan does not replace normal credential hygiene: do not commit .env files, tokens, private MCP configuration, or notebook outputs containing credentials.
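
The pattern can be tried locally with Python's re module. This sketch only demonstrates what the expression matches; lines_with_keys is a hypothetical helper, the workflow's own scanning script is not shown in this README, and the key used here is synthetic. Like the workflow, it reports locations rather than the matching text.

```python
import re
from typing import List

# Same expression as shown above.
KEY_PATTERN = re.compile(r"sk-or-v1-[A-Za-z0-9_-]{20,}")


def lines_with_keys(text: str) -> List[int]:
    """Return 1-based line numbers that contain a key-shaped string."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if KEY_PATTERN.search(line)
    ]


placeholder = "OPENROUTER_API_KEY=sk-or-v1-..."  # documented placeholder, not a key
fake_key = "sk-or-v1-" + "A" * 20  # synthetic value, built here to avoid a literal key
sample = "\n".join([placeholder, f"OPENROUTER_API_KEY={fake_key}", "unrelated line"])

flagged = lines_with_keys(sample)
```

The placeholder is not flagged because the dots fall outside the character class and the prefix alone does not reach the 20-character minimum.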

Release and publication gates

.github/workflows/release.yml runs for v* tags or manual dispatch. For tag runs it verifies that the tag equals v<project.version>, builds the wheel and source archive, validates both with Twine, attaches them to a generated GitHub release, and dispatches the PyPI workflow using the same tag.

.github/workflows/publish.yml requires a tag reference, repeats the tag/version check, rebuilds and revalidates the artifacts, and publishes through PyPI trusted publishing. The protected pypi GitHub environment uses OIDC (id-token: write), so no long-lived PyPI API token is stored.

Rebuilding during publication avoids trusting an unrelated workflow artifact, while the repeated tag check prevents publishing from a branch or mismatched release tag.

Documentation deployment

.github/workflows/pages.yml builds docs/ with Jekyll and deploys the generated artifact to GitHub Pages after documentation changes reach main or master. This Pages site is intended primarily for end users; development and test-strategy documentation lives in this repository's README.

Interpreting failures

  • Failures on every matrix job usually indicate a general regression.
  • A single Python-version failure suggests version-specific syntax, dependencies, or standard-library behavior.
  • Windows-only failures commonly involve paths, permissions, read-only attributes, or event-loop lifecycle.
  • A package failure with green pytest jobs usually concerns metadata, manifests, README rendering, or build isolation.
  • A secret-scan failure requires credential removal and rotation.
  • A release failure before publication commonly means the tag and project.version do not match.

Understand the failure before rerunning a job, and preserve useful workflow logs or tracebacks in the pull request when the cause is not obvious.

Maintainer-level conventions for extending the suite are also kept beside the tests in tests/testing.md.

Example notebooks

The examples/ directory includes Jupyter notebooks that demonstrate both direct endpoint calls and MCP client usage with real Greek and Latin data:

  • examples/00_install_and_run_perseus_mcp.ipynb — installation and launch guide covering PyPI, pip, uv, local repository development, MCP client configuration, verification, upgrades, and troubleshooting.
  • examples/01_basic_cts_workflow.ipynb — minimal direct CTS requests.
  • examples/02_search_and_navigation.ipynb — direct Scaife JSON search and CTS navigation from valid references.
  • examples/03_mcp_connection_homer_iliad.ipynb — FastMCP client connection, Homer resource discovery, and Iliad Greek passage analysis.
  • examples/04_mcp_greek_search_and_navigation.ipynb — MCP Greek search with Unicode/Beta Code, valid references, and passage navigation.
  • examples/05_mcp_all_tools.ipynb — complete MCP tool catalog with descriptions and input schemas.
  • examples/06_openrouter_llm_mcp_interaction.ipynb — optional OpenRouter LLM tool-calling loop over the local MCP tools, using OpenRouter's Free Models Router by default.
  • examples/07_mcp_advanced_search_options.ipynb — MCP form/lemma search, Scaife operator queries, and author-scoped search examples.
  • examples/08_mcp_cache_and_search_tools.ipynb — advanced demonstration of cache tools, paged references, scoped search, reader search, highlights, and Scaife metadata/text retrieval.
  • examples/09_openrouter_philo_politeia_analysis.ipynb — OpenRouter-assisted, evidence-first analysis of πολιτεία in Philo of Alexandria using scoped MCP search results and cited passages.
  • examples/10_mcp_latin_augustine_workflow.ipynb — Latin-language discovery, CTS navigation, passage retrieval, and a small text analysis using Augustine's Epistulae selections.

Run them after installing the project dependencies. The MCP notebooks use FastMCP's in-process client transport and call the same tools exposed to external MCP clients. The optional OpenRouter notebook also requires an OpenRouter API key; the MCP server itself does not. Notebook setup cells install notebook-only helpers such as python-dotenv directly. Those helpers are not core runtime dependencies of perseus-mcp.

Configure the OpenRouter API key

For examples/06_openrouter_llm_mcp_interaction.ipynb and examples/09_openrouter_philo_politeia_analysis.ipynb, copy .env.example to .env in the project root and replace the placeholder:

OPENROUTER_API_KEY=sk-or-v1-...

Get your API key at openrouter.ai. See OpenRouter's API key documentation for authentication details. The .env file is ignored by Git. You can also set OPENROUTER_API_KEY in your environment or enter it securely when the notebook prompts.

Both OpenRouter notebooks default to openrouter/free. This router selects among free models currently available on OpenRouter and filters for capabilities required by the request, such as tool calling or structured output. It avoids binding the examples to one free model that may later be removed or temporarily unavailable. The tradeoff is reduced reproducibility: separate runs may use different concrete models, so the notebooks record the resolved model returned by OpenRouter. Set OPENROUTER_MODEL to a fixed model slug when exact model selection matters.

Notebook 06 (examples/06_openrouter_llm_mcp_interaction.ipynb) can be saved and committed with its LLM and tool-call outputs so they render on GitHub. Python variables and kernel memory are not stored in an .ipynb file, and the notebook does not print the API key. Before committing a credentialed run, review the visible outputs and scan for a full OpenRouter key:

rg "sk-or-v1-[A-Za-z0-9_-]{20,}" examples/06_openrouter_llm_mcp_interaction.ipynb

The command should produce no output. It does not match the documented sk-or-v1-... placeholder.

Using with any MCP-capable LLM client

This project does not require a specific LLM. Configure your client to launch the local MCP server with:

uv --directory /full/path/to/Perseus-mcp run perseus-mcp

Most MCP clients need the same pieces: server name perseus, command uv, args --directory /full/path/to/Perseus-mcp run perseus-mcp, and an empty environment unless you have local customizations. See docs/enduser.md for generic client guidance and docs/architecture.md for the architecture choices, including why FastMCP is used.

Claude Desktop and Claude Code

The server runs with Claude over stdio, with no OpenRouter or API key required (OpenRouter is only needed for the optional demo client).

Claude Desktop — add to claude_desktop_config.json:

{
  "mcpServers": {
    "perseus": {
      "command": "uv",
      "args": ["--directory", "/full/path/to/Perseus-mcp", "run", "perseus-mcp"]
    }
  }
}

Restart Claude Desktop; the Perseus tools appear in the tools list.

Claude Code — one line:

claude mcp add perseus -- uv --directory /full/path/to/Perseus-mcp run perseus-mcp

Verified against a stdio MCP handshake: all 23 tools register and live calls return results (tested with search_perseus and list_text_groups).

Build a PyPI distribution

Install the development dependencies, then build and validate both distribution formats:

python -m pip install -e ".[dev]"
python -m build
python -m twine check dist/*

The build creates a wheel and source archive under dist/. Test the wheel in a clean virtual environment before publishing. Upload to TestPyPI first:

python -m twine upload --repository testpypi dist/*

After verifying installation from TestPyPI, upload the same artifacts to PyPI:

python -m twine upload dist/*

PyPI does not allow replacing an existing release. Update project.version in pyproject.toml, rebuild from a clean dist/ directory, and publish each version only once. The package build workflow also builds and checks artifacts in CI without publishing them.

Automated GitHub release and PyPI publishing

The release automation follows the same trusted-publishing pattern as MorphKit:

  1. Set the release version in pyproject.toml, for example 1.2.3.
  2. Merge the version change to the commit that should be released.
  3. Create and push the matching tag, for example v1.2.3.
  4. The Build release artifacts workflow verifies the tag/version match, builds and validates both distributions, and attaches them to a generated GitHub release.
  5. That workflow dispatches Publish to PyPI, which rebuilds and validates the package before publishing through PyPI trusted publishing.

Configure the repository once before the first automated upload:

  • Create a GitHub Actions environment named pypi.
  • In the existing PyPI project settings, or as a pending publisher before the first upload, add a trusted publisher for owner tonyjurg, repository Perseus-mcp, workflow publish.yml, and environment pypi.
  • Do not add a PyPI API token; the workflow uses GitHub OIDC with id-token: write.

The workflows reject a tag such as v1.2.4 when project.version is still 1.2.3. PyPI versions are immutable, so increment the version before retrying a release that was already uploaded.
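
The tag/version gate reduces to a string comparison. The sketch below illustrates that rule in Python; it is not the workflow's actual script, the function names are hypothetical, and the dictionary stands in for a parsed pyproject.toml (for example as loaded with tomllib).

```python
from typing import Any, Mapping


def expected_tag(pyproject: Mapping[str, Any]) -> str:
    """Build the tag a release must use: 'v' followed by project.version."""
    return f"v{pyproject['project']['version']}"


def check_release_tag(tag: str, pyproject: Mapping[str, Any]) -> None:
    """Raise ValueError when the pushed tag does not equal v<project.version>."""
    expected = expected_tag(pyproject)
    if tag != expected:
        raise ValueError(
            f"tag {tag!r} does not match project.version (expected {expected!r})"
        )


# Stand-in for the parsed pyproject.toml, using the example version from the steps above.
pyproject = {"project": {"name": "perseus-mcp", "version": "1.2.3"}}

check_release_tag("v1.2.3", pyproject)  # matches, so nothing is raised

try:
    check_release_tag("v1.2.4", pyproject)
    mismatch_message = ""
except ValueError as error:
    mismatch_message = str(error)
```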

Contributing and reporting issues

Bug reports, documentation fixes, focused feature requests, and pull requests are welcome. Please report problems through the GitHub issue tracker and include the command, Python version, MCP client, tool arguments, traceback, and any relevant CTS URN or Greek search query when possible.

See docs/contributing.md for contribution guidance.

Responsible disclosure

This project was created with assistance from OpenAI Codex. The human maintainer remains responsible for reviewing, testing, and accepting all code and documentation changes.

License

This project is released under the MIT License. See LICENSE for details.

