MCP server for Perseus Greek and Latin text research
Perseus-mcp
Give Claude / Cursor / Windsurf direct access to the Perseus Digital Library — ancient Greek and Latin texts, precise CTS navigation, plaintext, search, and more.
A high-quality MCP server for Classical Greek and Latin literature. It runs as a local FastMCP server so MCP-capable applications can attach these Perseus tools to the LLM/model provider of your choice.
Features
This server exposes twenty-three MCP tools. Every tool returns a text payload: some are raw Perseus CTS XML or Scaife JSON, while the discovery and plaintext helpers return locally shaped JSON or readable text.
- get_passage(urn): fetch a CTS passage by URN.
- get_passage_plus(urn): fetch passage text plus contextual metadata.
- get_passage_plaintext(urn): fetch a CTS passage as plain readable text.
- get_valid_references(urn, level=None): retrieve navigable citation references for a work or edition.
- get_valid_references_json(urn, level=None, limit=100, offset=0): retrieve paged citation references as JSON (limit: 1–500).
- count_valid_references(urn, level=None): count valid references without returning the full list.
- get_capabilities(): list available texts/editions from Perseus CTS.
- get_cache_status(): inspect local metadata cache state.
- refresh_metadata_cache(): refresh cached CTS and Scaife library metadata.
- clear_metadata_cache(): clear in-memory and disk metadata cache entries.
- list_text_groups(language=None, query=None, limit=100, offset=0): list matching authors/textgroups and works with pagination metadata (limit: 1–500).
- get_author_resources(author, language=None): list works, editions, and translations for a matching author name or CTS textgroup URN.
- find_author_names(query, language=None, limit=100, offset=0): find author/textgroup names by partial name match with pagination metadata (limit: 1–500).
- get_work_resources(urn_or_title, language=None): list editions, translations, and resources for a work, optionally filtered by original language.
- get_label(urn): fetch human-readable metadata labels for a URN.
- get_first_urn(urn): get the first navigable URN under a work/edition.
- get_prev_next_urn(urn): get neighboring passage URNs for navigation.
- search_perseus(query, language="greek", query_format="auto", author=None, search_kind="form", preserve_operators=False, page_num=1, text_group=None, work=None, result_format="instances"): search texts via the Scaife search API. Greek queries may be entered as Unicode Greek (for example μῆνιν) or Beta Code (for example mh=nin).
- search_within_text(query, text_urn, ..., size=10, offset=0): search within one Scaife text/edition URN (size: 1–500).
- get_passage_highlights(query, passage_urn, ...): get Scaife token highlight positions for one passage.
- get_scaife_library_metadata(urn): get Scaife JSON metadata for a library URN.
- get_scaife_passage_json(urn): get Scaife JSON for a passage URN.
- get_scaife_passage_text(urn): get Scaife plaintext for a passage URN.
Greek Search Input
search_perseus normalizes Greek search terms before sending them to Scaife.
You can pass Unicode Greek directly, or use Beta Code such as mh=nin a)/eide.
Search queries must contain at least one non-whitespace character.
The default query_format="auto" detects explicit Beta Code marks like =, /, (, ), and *, and also treats short unaccented Greek-looking queries such as logos as Beta Code.
If an ASCII query is ambiguous, set query_format="betacode" to force conversion or query_format="unicode" to preserve it exactly.
Search queries are normalized to composed Greek Unicode (NFC), matching sampled Perseus Greek text.
The tool uses Scaife's JSON search route and returns the JSON response as text.
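As a rough illustration of two of the steps described above (spotting explicit Beta Code marks and normalizing to NFC), the following sketch uses only the Python standard library. The helper names are hypothetical and are not taken from the perseus-mcp source. The sketch does not convert Beta Code to Greek and does not cover the heuristic for unaccented queries such as logos.

```python
import unicodedata

# Characters the text above lists as explicit Beta Code marks.
BETA_CODE_MARKS = frozenset("=/()*")


def has_beta_code_marks(query: str) -> bool:
    """Return True for an ASCII query that contains an explicit Beta Code mark."""
    return query.isascii() and any(ch in BETA_CODE_MARKS for ch in query)


def to_composed_greek(query: str) -> str:
    """Normalize a Unicode query to composed form (NFC)."""
    return unicodedata.normalize("NFC", query)


if __name__ == "__main__":
    print(has_beta_code_marks("mh=nin a)/eide"))
    # Eta followed by a combining perispomeni, i.e. a decomposed spelling.
    decomposed = "\u03bc\u03b7\u0342\u03bd\u03b9\u03bd"
    print(len(decomposed), len(to_composed_greek(decomposed)))
```

Because * is also a Scaife operator, a check like this one would misread operator queries, which is presumably why preserve_operators exists.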
The language argument accepts recognized Greek aliases (greek, grc, gr)
or Latin aliases (latin, lat, la); blank input defaults to Greek and
unrecognized values raise an error. It controls query normalization and is not
currently sent to Scaife as a corpus language filter.
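The alias rules above can be sketched as a small lookup. This is a minimal illustration, not the project's code: the function name is hypothetical, and treating None as blank and matching case-insensitively are assumptions.

```python
from typing import Optional

GREEK_ALIASES = {"greek", "grc", "gr"}
LATIN_ALIASES = {"latin", "lat", "la"}


def resolve_language(value: Optional[str]) -> str:
    """Map a user-supplied language value to 'greek' or 'latin'."""
    cleaned = (value or "").strip().lower()
    if not cleaned:
        return "greek"  # blank input defaults to Greek
    if cleaned in GREEK_ALIASES:
        return "greek"
    if cleaned in LATIN_ALIASES:
        return "latin"
    raise ValueError(f"Unrecognized language: {value!r}")
```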
For inventory discovery, list_text_groups, find_author_names,
get_author_resources, and get_work_resources accept language="greek" or
language="latin" (and common codes such as grc or lat) as an actual work
language filter. find_author_names merges the legacy CTS inventory with the
Scaife library catalog, so Scaife-only authors such as Philo Judaeus remain
discoverable. Passage and navigation tools use CTS URNs, whose
greekLit/latinLit namespace and edition identifier already select the text.
Pass author to resolve a CTS author/textgroup name or URN. When it resolves
to exactly one textgroup, Scaife receives a server-side text_group filter;
ambiguous matches fall back to local CTS URN-prefix filtering of the current
result page.
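The local fallback can be pictured as a prefix test on each result's URN. The sketch below assumes that each search hit is a dictionary with a "urn" key, which is a hypothetical shape chosen for illustration; the real Scaife payload and the server's own filter may differ.

```python
from typing import Iterable


def urn_in_textgroup(urn: str, textgroup_urn: str) -> bool:
    """True when urn is the textgroup itself or anything beneath it."""
    if not urn.startswith(textgroup_urn):
        return False
    rest = urn[len(textgroup_urn):]
    # Require a component boundary so tlg0012 does not also match tlg00120.
    return rest == "" or rest[0] in ".:"


def filter_page(results: Iterable[dict], textgroup_urns: Iterable[str]) -> list[dict]:
    """Keep only the hits on the current page that belong to a candidate textgroup."""
    groups = list(textgroup_urns)
    return [
        hit for hit in results
        if any(urn_in_textgroup(hit.get("urn", ""), group) for group in groups)
    ]
```

A filter like this only sees the page that Scaife already returned, so a page can shrink or come back empty even when later pages contain matches. That is the practical advantage of the server-side text_group filter.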
Use search_kind="lemma" for lemma search; the default search_kind="form"
keeps existing form-search behavior. For Scaife operator queries such as
quoted phrases, -, |, *, or ~, set preserve_operators=True so Beta
Code auto-detection does not consume operator characters. For example:
search_perseus('"μῆνιν ἄειδε"', query_format="unicode", preserve_operators=True),
search_perseus("μῆνιν -ἄειδε", query_format="unicode", preserve_operators=True),
or search_perseus("λόγος | ἀνήρ", search_kind="lemma", query_format="unicode", preserve_operators=True).
Use page_num for pagination and pass text_group or work to use Scaife's
server-side scope filters. When author resolves to exactly one CTS textgroup,
search_perseus sends that textgroup to Scaife instead of filtering only the
returned page locally.
Local Metadata Cache
Discovery and navigation tools cache stable CTS metadata locally to avoid
repeated multi-megabyte GetCapabilities and GetValidReff requests. The
default disk cache lives in .cache/perseus-mcp under the current working
directory; the running server process also keeps an in-memory cache.
Configure caching with these environment variables:
- PERSEUS_MCP_CACHE_DIR: override the disk cache directory.
- PERSEUS_MCP_CACHE_TTL_SECONDS: set cache TTL; default is 86400 seconds.
- PERSEUS_MCP_DISABLE_CACHE=1: disable both memory and disk cache reads/writes.
The current working directory is the directory from which the Python process is
started. Running the MCP server from the repository root uses
.cache/perseus-mcp; running a notebook from examples/ would otherwise use
examples/.cache/perseus-mcp. That is not a second server instance, only a
second cache location for a separate Python process. To keep one cache location
across notebooks and MCP clients, set PERSEUS_MCP_CACHE_DIR to an absolute
path such as /path/to/Perseus-mcp/.cache/perseus-mcp.
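To make the interaction between the environment variables and the working directory concrete, here is a minimal sketch of how such settings could be resolved. The CacheSettings type and load_cache_settings function are hypothetical names; only the variable names and defaults come from the text above.

```python
import os
from pathlib import Path
from typing import Mapping, NamedTuple, Optional


class CacheSettings(NamedTuple):
    directory: Path
    ttl_seconds: int
    enabled: bool


def load_cache_settings(
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
) -> CacheSettings:
    """Resolve cache settings from environment variables and the working directory."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    override = env.get("PERSEUS_MCP_CACHE_DIR")
    # Without an override the location depends on where the process was started.
    directory = Path(override) if override else cwd / ".cache" / "perseus-mcp"
    ttl_seconds = int(env.get("PERSEUS_MCP_CACHE_TTL_SECONDS", "86400"))
    enabled = env.get("PERSEUS_MCP_DISABLE_CACHE") != "1"
    return CacheSettings(directory, ttl_seconds, enabled)
```

Calling load_cache_settings({}, Path("examples")) and load_cache_settings({}, Path(".")) yields two different directories, which is the situation an absolute PERSEUS_MCP_CACHE_DIR avoids.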
Disk entries are written to unique sibling temporary files and atomically
replaced, so multiple local processes can safely share that directory without
exposing partially written cache files. Disk-cache write failures emit a
MetadataCacheWarning but do not discard a successfully fetched upstream
response.
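The write-then-replace pattern described here is a common one. The following sketch shows the general technique with the standard library; it is not the project's implementation, and write_atomically is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, data: bytes) -> None:
    """Write data to a unique sibling temp file, then swap it into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates a uniquely named file, so concurrent writers never collide.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(target.parent), prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace overwrites the destination in one step on the same filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Readers therefore see either the old complete file or the new complete file, never a half-written one.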
URN Discovery
Available edition URNs can differ between Perseus CTS and Scaife search results,
and the live inventory can change. Use get_author_resources,
get_work_resources, or list_text_groups before constructing
edition-specific CTS passage URNs. The notebooks select advertised CTS editions
from discovery results instead of assuming that a Scaife edition URN is valid
for Perseus CTS.
The live Perseus CTS implementation may return malformed HTML for
GetFirstUrn and GetPrevNextUrn. The MCP tools detect that response and
derive valid XML results from GetValidReff.
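The idea behind that fallback can be sketched as follows, assuming an ordered list of citation references has already been obtained from GetValidReff. The function names are hypothetical, and the real tools additionally wrap the result in XML, which this sketch omits.

```python
from typing import Optional, Sequence, Tuple


def first_reference(references: Sequence[str]) -> Optional[str]:
    """Return the first navigable reference, or None for an empty list."""
    return references[0] if references else None


def prev_next(
    references: Sequence[str], current: str
) -> Tuple[Optional[str], Optional[str]]:
    """Return the neighbors of current; raises ValueError if it is not listed."""
    index = references.index(current)
    previous = references[index - 1] if index > 0 else None
    following = references[index + 1] if index + 1 < len(references) else None
    return previous, following
```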
Perseus may also return 429 Too Many Requests when a workflow sends many CTS
requests in a short period. Pause before retrying, reduce concurrency, and add
delays to passage-processing loops. The server currently exposes the upstream
HTTP error instead of retrying automatically.
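Since the server does not retry, a client-side loop can add its own pause. The sketch below is one possible approach with exponential backoff. RateLimited is a hypothetical exception that the caller would raise after recognizing a 429 in whatever error its MCP client reports, and the delays are illustrative rather than values recommended by Perseus.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker for an upstream 429 Too Many Requests response."""


def call_with_backoff(
    fetch: Callable[[], T],
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch, pausing base_delay * 2**attempt seconds after each 429."""
    if retries < 0:
        raise ValueError("retries must be non-negative")
    for attempt in range(retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == retries:
                raise  # give up and surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Running passage requests sequentially through a helper like this also keeps concurrency at one.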
Setup
1) Install dependencies
Using uv:
uv sync
Or with pip:
pip install -e .
Once a release is published to PyPI, users can install it without cloning the repository:
pip install perseus-mcp
For development and tests:
pip install -e ".[dev]"
2) Run tests
pytest
With uv, use:
uv run --extra dev pytest
3) Run locally
uv run perseus-mcp
The installed console command and module entry point are equivalent:
perseus-mcp
python -m perseus_mcp
4) Inspect tools (optional)
npx @modelcontextprotocol/inspector uv run perseus-mcp
Test strategy and automation
Perseus MCP uses layered checks rather than relying on one end-to-end test. Most behavior is covered by deterministic pytest tests with local XML/JSON fixtures and mocked asynchronous HTTP calls. GitHub Actions separately verifies the supported Python and operating-system matrix, package artifacts, secrets, release tags, and publication.
The workflow files under .github/workflows/ are the executable source of
truth.
Test suite organization
Pytest is configured in pyproject.toml to import from src/ and discover
tests under tests/.
| Test module | Main responsibility |
|---|---|
| test_author_resources.py | CTS author, work, and resource parsing; merged-author behavior |
| test_disk_cache.py | Atomic cache writes, cache disabling, cleanup, and concurrent writers |
| test_exploration_tools.py | Discovery, navigation, cache tools, author scope, and structured responses |
| test_greek_query_normalization.py | Unicode Greek, Beta Code, Scaife parameters, and search operators |
| test_limits_and_language.py | Result limits, paging bounds, and language aliases |
| test_packaging.py | Metadata, dependencies, documentation assets, notebooks, and workflow expectations |
| test_scaife_urls.py | Safe URL construction and CTS URN percent encoding |
| test_shared_http_client.py | Connection reuse, event-loop changes, shutdown, and HTTP errors |
| test_xml_hardening.py | Safe XML parsing and rejection of entity-based XML attacks |
A regression fix should include a focused test that fails for the original problem. Tests should assert observable behavior and cover failure paths and boundary values as well as successful calls.
Isolation from Perseus and Scaife
Routine tests do not depend on live upstream services. HTTP helpers are monkeypatched with asynchronous test doubles, while representative CTS XML and Scaife JSON are stored in test fixtures. This keeps CI deterministic when catalogs change or an upstream service is unavailable, avoids unnecessary traffic to public scholarly infrastructure, and makes malformed-response tests safe.
Live read-only probes may be used during manual review for endpoint compatibility or connection-lifecycle changes, but they supplement rather than replace the automated suite.
Async test cleanup
Several tests invoke tools with asyncio.run(), which creates a new event loop
for each call. The server uses a process-wide shared httpx.AsyncClient, so the
autouse fixture in tests/conftest.py closes and resets that client after every
test. Tests that manipulate shared client state must also leave it reset.
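The reason a reset is needed can be shown without httpx. In this sketch, SharedResource is a hypothetical stand-in for a lazily created client that remembers the event loop it was first used on; it simplifies the situation and does not reproduce the server's client handling.

```python
import asyncio
from typing import Optional


async def running_loop() -> asyncio.AbstractEventLoop:
    """Return the loop that asyncio.run() created for this call."""
    return asyncio.get_running_loop()


class SharedResource:
    """Hypothetical process-wide resource bound to the loop that first used it."""

    def __init__(self) -> None:
        self._loop: Optional[asyncio.AbstractEventLoop] = None

    async def acquire(self) -> str:
        loop = asyncio.get_running_loop()
        if self._loop is not None and self._loop is not loop:
            raise RuntimeError("resource is bound to an earlier event loop")
        self._loop = loop
        return "ready"

    def reset(self) -> None:
        """Forget the old loop so the next call can bind to a fresh one."""
        self._loop = None


if __name__ == "__main__":
    first = asyncio.run(running_loop())
    second = asyncio.run(running_loop())
    print(first is second, first.is_closed())
```

Each asyncio.run() call builds and then closes its own loop, so state left over from one test would point at a closed loop in the next unless it is reset in between.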
Local test commands
Run the complete suite:
python -m pytest
Run a module or one test:
python -m pytest tests/test_disk_cache.py
python -m pytest tests/test_disk_cache.py::test_disk_cache_set_writes_readable_content
Show skipped tests, the slowest tests, and local variables on failure:
python -m pytest -ra --durations=10 -l
Disable metadata-cache reads and writes during a test run:
PERSEUS_MCP_DISABLE_CACHE=1 python -m pytest
PowerShell equivalent:
$env:PERSEUS_MCP_DISABLE_CACHE = "1"
python -m pytest
GitHub Actions test matrix
.github/workflows/tests.yml installs the editable project with development
dependencies and runs python -m pytest on:
- Ubuntu and Windows;
- Python 3.11, 3.12, and 3.13.
The matrix uses fail-fast: false, so every platform/version job finishes even
when one fails. This makes version-specific and Windows-specific regressions
visible in one run. Documentation-only changes under docs/** are excluded
from the Python test workflow; exact event and branch filters remain defined in
the workflow file.
Package validation
.github/workflows/package.yml checks that the repository produces a valid
source distribution and pure-Python wheel. It installs Python 3.12, runs:
python -m build
python -m twine check dist/*
and uploads dist/ as the python-package workflow artifact. The workflow is
path-filtered to package-relevant files and supports manual dispatch.
tests/test_packaging.py complements this build by checking repository-level
expectations such as metadata, dependencies, documentation files, notebook
JSON, and workflow configuration. Both layers matter: metadata tests can pass
while an isolated build fails, and a package can build while required
repository assets are missing.
Secret scanning
.github/workflows/secret-scan.yml rejects tracked OpenRouter keys matching:
sk-or-v1-[A-Za-z0-9_-]{20,}
The workflow reports affected files without printing the matching secret. A
detected key must be removed and rotated; the check should never be bypassed.
This focused scan does not replace normal credential hygiene: do not commit
.env files, tokens, private MCP configuration, or notebook outputs containing
credentials.
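For a local check before pushing, the same pattern can be applied in Python. This is a sketch of the idea only. The function name and the in-memory mapping of file names to text are assumptions, and the actual workflow may be implemented differently.

```python
import re

# Pattern quoted above for OpenRouter keys.
OPENROUTER_KEY = re.compile(r"sk-or-v1-[A-Za-z0-9_-]{20,}")


def files_with_keys(contents: dict[str, str]) -> list[str]:
    """Return the names of files containing a key-shaped string.

    Only file names are returned, so the matching secret is never echoed.
    """
    return sorted(
        name for name, text in contents.items() if OPENROUTER_KEY.search(text)
    )
```

The placeholder sk-or-v1-... does not match because the dots are outside the allowed character class and fewer than 20 valid characters follow the prefix.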
Release and publication gates
.github/workflows/release.yml runs for v* tags or manual dispatch. For tag
runs it verifies that the tag equals v<project.version>, builds the wheel and
source archive, validates both with Twine, attaches them to a generated GitHub
release, and dispatches the PyPI workflow using the same tag.
.github/workflows/publish.yml requires a tag reference, repeats the
tag/version check, rebuilds and revalidates the artifacts, and publishes through
PyPI trusted publishing. The protected pypi GitHub environment uses OIDC
(id-token: write), so no long-lived PyPI API token is stored.
Rebuilding during publication avoids trusting an unrelated workflow artifact, while the repeated tag check prevents publishing from a branch or mismatched release tag.
Documentation deployment
.github/workflows/pages.yml builds docs/ with Jekyll and deploys the
generated artifact to GitHub Pages after documentation changes reach main or
master. This Pages site is intended primarily for end users; development and
test-strategy documentation lives in this repository README.
Interpreting failures
- Failures on every matrix job usually indicate a general regression.
- A single Python-version failure suggests version-specific syntax, dependencies, or standard-library behavior.
- Windows-only failures commonly involve paths, permissions, read-only attributes, or event-loop lifecycle.
- A package failure with green pytest jobs usually concerns metadata, manifests, README rendering, or build isolation.
- A secret-scan failure requires credential removal and rotation.
- A release failure before publication commonly means the tag and project.version do not match.
Understand the failure before rerunning a job, and preserve useful workflow logs or tracebacks in the pull request when the cause is not obvious.
Maintainer-level conventions for extending the suite are also kept beside the
tests in tests/testing.md.
Example notebooks
The examples/ directory includes Jupyter notebooks that demonstrate both direct endpoint calls and MCP client usage with real Greek and Latin data:
- examples/00_install_and_run_perseus_mcp.ipynb: installation and launch guide covering PyPI, pip, uv, local repository development, MCP client configuration, verification, upgrades, and troubleshooting.
- examples/01_basic_cts_workflow.ipynb: minimal direct CTS requests.
- examples/02_search_and_navigation.ipynb: direct Scaife JSON search and CTS navigation from valid references.
- examples/03_mcp_connection_homer_iliad.ipynb: FastMCP client connection, Homer resource discovery, and Iliad Greek passage analysis.
- examples/04_mcp_greek_search_and_navigation.ipynb: MCP Greek search with Unicode/Beta Code, valid references, and passage navigation.
- examples/05_mcp_all_tools.ipynb: complete MCP tool catalog with descriptions and input schemas.
- examples/06_openrouter_llm_mcp_interaction.ipynb: optional OpenRouter LLM tool-calling loop over the local MCP tools, using OpenRouter's Free Models Router by default.
- examples/07_mcp_advanced_search_options.ipynb: MCP form/lemma search, Scaife operator queries, and author-scoped search examples.
- examples/08_mcp_cache_and_search_tools.ipynb: advanced demonstration of cache tools, paged references, scoped search, reader search, highlights, and Scaife metadata/text retrieval.
- examples/09_openrouter_philo_politeia_analysis.ipynb: OpenRouter-assisted, evidence-first analysis of πολιτεία in Philo of Alexandria using scoped MCP search results and cited passages.
- examples/10_mcp_latin_augustine_workflow.ipynb: Latin-language discovery, CTS navigation, passage retrieval, and a small text analysis using Augustine's Epistulae selections.
Run them after installing the project dependencies. The MCP notebooks use
FastMCP's in-process client transport and call the same tools exposed to
external MCP clients. The optional OpenRouter notebook also requires an
OpenRouter API key; the MCP server itself does not.
Notebook setup cells install notebook-only helpers such as python-dotenv
directly. Those helpers are not core runtime dependencies of perseus-mcp.
Configure the OpenRouter API key
For examples/06_openrouter_llm_mcp_interaction.ipynb and
examples/09_openrouter_philo_politeia_analysis.ipynb, copy .env.example to
.env in the project root and replace the placeholder:
OPENROUTER_API_KEY=sk-or-v1-...
Get your API key at openrouter.ai. See
OpenRouter's API key documentation for
authentication details.
The .env file is ignored by Git. You can also set OPENROUTER_API_KEY in your
environment or enter it securely when the notebook prompts.
Both OpenRouter notebooks default to openrouter/free. This router selects
among free models currently available on OpenRouter and filters for capabilities
required by the request, such as tool calling or structured output. It avoids
binding the examples to one free model that may later be removed or temporarily
unavailable. The tradeoff is reduced reproducibility: separate runs may use
different concrete models, so the notebooks record the resolved model returned
by OpenRouter. Set OPENROUTER_MODEL to a fixed model slug when exact model
selection matters.
Notebook 06 can be saved and committed with its LLM and tool-call outputs so
they render on GitHub. Python variables and kernel memory are not stored in an
.ipynb file, and the notebook does not print the API key. Before committing a
credentialed run, review the visible outputs and scan for a full OpenRouter key:
rg "sk-or-v1-[A-Za-z0-9_-]{20,}" examples/06_openrouter_llm_mcp_interaction.ipynb
The command should produce no output. It does not match the documented
sk-or-v1-... placeholder.
Using with any MCP-capable LLM client
This project does not require a specific LLM. Configure your client to launch the local MCP server with:
uv --directory /full/path/to/Perseus-mcp run perseus-mcp
Most MCP clients need the same pieces: server name perseus, command uv, args --directory /full/path/to/Perseus-mcp run perseus-mcp, and an empty environment unless you have local customizations. See docs/enduser.md for generic client guidance and docs/architecture.md for the architecture choices, including why FastMCP is used.
Claude Desktop and Claude Code
The server runs with Claude over stdio, with no OpenRouter or API key required (OpenRouter is only needed for the optional demo client).
Claude Desktop — add to claude_desktop_config.json:
{
  "mcpServers": {
    "perseus": {
      "command": "uv",
      "args": ["--directory", "/full/path/to/Perseus-mcp", "run", "perseus-mcp"]
    }
  }
}
Restart Claude Desktop; the Perseus tools appear in the tools list.
Claude Code — one line:
claude mcp add perseus -- uv --directory /full/path/to/Perseus-mcp run perseus-mcp
Verified against a stdio MCP handshake: all 23 tools register, and live calls to search_perseus and list_text_groups return results.
Build a PyPI distribution
Install the development dependencies, then build and validate both distribution formats:
python -m pip install -e ".[dev]"
python -m build
python -m twine check dist/*
The build creates a wheel and source archive under dist/. Test the wheel in a
clean virtual environment before publishing. Upload to TestPyPI first:
python -m twine upload --repository testpypi dist/*
After verifying installation from TestPyPI, upload the same artifacts to PyPI:
python -m twine upload dist/*
PyPI does not allow replacing an existing release. Update project.version in
pyproject.toml, rebuild from a clean dist/ directory, and publish each
version only once. The package build workflow also builds and checks artifacts
in CI without publishing them.
Automated GitHub release and PyPI publishing
The release automation follows the same trusted-publishing pattern as MorphKit:
- Set the release version in pyproject.toml, for example 1.2.3.
- Merge the version change to the commit that should be released.
- Create and push the matching tag, for example v1.2.3.
- The Build release artifacts workflow verifies the tag/version match, builds and validates both distributions, and attaches them to a generated GitHub release.
- That workflow dispatches Publish to PyPI, which rebuilds and validates the package before publishing through PyPI trusted publishing.
Configure the repository once before the first automated upload:
- Create a GitHub Actions environment named pypi.
- In the existing PyPI project settings, or as a pending publisher before the first upload, add a trusted publisher for owner tonyjurg, repository Perseus-mcp, workflow publish.yml, and environment pypi.
- Do not add a PyPI API token; the workflow uses GitHub OIDC with id-token: write.
The workflows reject a tag such as v1.2.4 when project.version is still
1.2.3. PyPI versions are immutable, so increment the version before retrying
a release that was already uploaded.
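The tag/version gate amounts to a string comparison. The sketch below shows that rule in Python with hypothetical function names. The workflows themselves may implement the check in shell or another form, and reading project.version out of pyproject.toml is left out.

```python
def tag_from_ref(ref: str) -> str:
    """Strip a leading refs/tags/ so a full Git ref and a bare tag compare alike."""
    prefix = "refs/tags/"
    return ref[len(prefix):] if ref.startswith(prefix) else ref


def tag_matches_version(ref: str, project_version: str) -> bool:
    """True only when the tag is exactly 'v' followed by project.version."""
    return tag_from_ref(ref) == f"v{project_version}"
```

Under this rule v1.2.4 is rejected while project.version is still 1.2.3, and so is a bare 1.2.3 without the v prefix.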
Contributing and reporting issues
Bug reports, documentation fixes, focused feature requests, and pull requests are welcome. Please report problems through the GitHub issue tracker and include the command, Python version, MCP client, tool arguments, traceback, and any relevant CTS URN or Greek search query when possible.
See docs/contributing.md for contribution guidance.
Responsible disclosure
This project was created with assistance from OpenAI Codex. The human maintainer remains responsible for reviewing, testing, and accepting all code and documentation changes.
License
This project is released under the MIT License. See LICENSE for details.
File details
Details for the file perseus_mcp-1.0.2.tar.gz.
File metadata
- Download URL: perseus_mcp-1.0.2.tar.gz
- Size: 1.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 56c07f86a66f1bb2a4d3b8869d427bcf5a80688d64ad93c4d761fdd74b02f098 |
| MD5 | 06832e857ca9c43b2addde858eb257b9 |
| BLAKE2b-256 | e5781a59e6c1ed57f6b555c41661d5320c50134184271771e657811f16b7b7e4 |
Provenance
The following attestation bundles were made for perseus_mcp-1.0.2.tar.gz:
Publisher: publish.yml on tonyjurg/Perseus-mcp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: perseus_mcp-1.0.2.tar.gz
- Subject digest: 56c07f86a66f1bb2a4d3b8869d427bcf5a80688d64ad93c4d761fdd74b02f098
- Sigstore transparency entry: 1967825634
- Permalink: tonyjurg/Perseus-mcp@20c211cc5eb30fca86f03ff20c7c7ba759a9f9e3
- Branch / Tag: refs/tags/v1.0.2
- Owner: https://github.com/tonyjurg
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@20c211cc5eb30fca86f03ff20c7c7ba759a9f9e3
- Trigger Event: workflow_dispatch
File details
Details for the file perseus_mcp-1.0.2-py3-none-any.whl.
File metadata
- Download URL: perseus_mcp-1.0.2-py3-none-any.whl
- Size: 26.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 636b19d53d0f8b6a2e0c9440665402d33194d6020fdbed28867a1fdf3e6c21d0 |
| MD5 | ff61822ccd80446a2187382afc8a2fe6 |
| BLAKE2b-256 | 3f301e078e86ae4d97daa51f7afe75e4f5e53175c19d0232e7af7a255f70773b |
Provenance
The following attestation bundles were made for perseus_mcp-1.0.2-py3-none-any.whl:
Publisher: publish.yml on tonyjurg/Perseus-mcp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: perseus_mcp-1.0.2-py3-none-any.whl
- Subject digest: 636b19d53d0f8b6a2e0c9440665402d33194d6020fdbed28867a1fdf3e6c21d0
- Sigstore transparency entry: 1967825740
- Permalink: tonyjurg/Perseus-mcp@20c211cc5eb30fca86f03ff20c7c7ba759a9f9e3
- Branch / Tag: refs/tags/v1.0.2
- Owner: https://github.com/tonyjurg
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@20c211cc5eb30fca86f03ff20c7c7ba759a9f9e3
- Trigger Event: workflow_dispatch