Skip to main content

Academic research MCP server — search, extract, and manage papers

Project description

GRaDOS

English | 简体中文

  .oooooo.    ooooooooo.             oooooooooo.     .oooooo.    .oooooo..o
 d8P'  `Y8b   `888   `Y88.           `888'   `Y8b   d8P'  `Y8b  d8P'    `Y8
888            888   .d88'  .oooo.    888      888 888      888 Y88bo.     
888            888ooo88P'  `P  )88b   888      888 888      888  `"Y8888o. 
888     ooooo  888`88b.     .oP"888   888      888 888      888      `"Y88b
`88.    .88'   888  `88b.  d8(  888   888     d88' `88b    d88' oo     .d8P
 `Y8bood8P'   o888o  o888o `Y888""8o o888bood8P'    `Y8bood8P'  8""88888P' 

Graduate Research and Document Operating System

The Python MCP server for academic paper search, full-text extraction, visible local paper storage, and semantic retrieval over a built-in ChromaDB index.

GRaDOS gives AI agents (Claude, Codex, Cursor, and similar clients) a single stdio MCP server that can search academic databases, fetch papers through paywalls, parse PDFs into canonical Markdown, and revisit saved papers for citation-grounded writing.

Architecture 🧭

GRaDOS is designed to sit inside an agent research workflow:

  1. Check the local paper library first with search_saved_papers, get_saved_paper_structure, or grados://papers/{safe_doi}
  2. Search remote academic sources in configured priority order
  3. Fetch full text through TDM -> OA -> Sci-Hub -> Headless
  4. Parse PDFs through PyMuPDF -> Marker -> Docling
  5. Save raw PDFs to downloads/, canonical Markdown to papers/, and semantic data to ChromaDB
  6. Re-open saved papers with low-token structure cards and deep-reading windows before citing them

MCP Tools 🔧

Server Tool Description
GRaDOS search_academic_papers Search Crossref, PubMed, Web of Science, Elsevier, and Springer with DOI deduplication plus resumable continuation tokens.
GRaDOS search_saved_papers Search the canonical saved-paper library backed by ChromaDB.
GRaDOS extract_paper_full_text Fetch a paper by DOI through the full-text waterfall, parse it, and save canonical Markdown plus raw PDF assets.
GRaDOS read_saved_paper Read paragraph windows from a saved paper for synthesis and citation verification.
GRaDOS get_saved_paper_structure Return a low-token structure card with preview text, headings, and asset summary.
GRaDOS import_local_pdf_library Bulk-import an existing PDF folder into the canonical paper store and retrieval index.
GRaDOS parse_pdf_file Parse a local PDF into canonical Markdown, optionally binding it to a DOI.
GRaDOS save_paper_to_zotero Save actually cited papers to Zotero through the Web API.

MCP Resources 📚

Resource Description
grados://papers/index Low-token index of all saved papers.
grados://papers/{safe_doi} Canonical overview card for one saved paper.

Local Paper Library 🗂️

After extraction or import, GRaDOS keeps papers in a visible on-disk layout:

Directory Content Purpose
config.json Runtime configuration One config file for the whole install
papers/ Canonical Markdown papers with YAML front-matter Deep reading, structure cards, and retrieval
downloads/ Raw .pdf files Archival copies of fetched or imported papers
database/chroma/ ChromaDB collections Built-in semantic retrieval store
browser/ Managed Chromium, profile, extensions Browser fallback for difficult publisher pages
models/ Embedding and OCR model caches Runtime assets warmed by setup

Repository Map 🗺️

  • README.md / README.zh-CN.md: primary installation and usage guides
  • .mcp.json: repo-local MCP wiring example
  • skills/grados/SKILL.md: structured research workflow built on top of the MCP tools
  • grados-python-implementation-plan.md: implementation plan and completion ledger
  • TODO.md: concise execution snapshot derived from the implementation plan

Installation 🚀

Option A: uv tool install (recommended)

uv tool install grados
grados setup --all

This creates ~/GRaDOS/config.json, prepares the visible directory layout, installs managed browser assets, and warms the default embedding model.

Option B: extras, zero-install, or pip

# Core install
uv tool install grados

# Install optional parser extras
uv tool install "grados[marker]"
uv tool install "grados[docling]"
uv tool install "grados[full]"

# Zero-install run
uvx grados version

# Traditional Python install
pip install grados

Extras in the current package:

  • grados: core MCP server, CLI, ChromaDB storage, default parser, browser automation, and built-in Zotero save support
  • grados[marker]: core plus the Marker PDF parser
  • grados[docling]: core plus the Docling PDF parser
  • grados[full]: core plus both heavier parsers

Option C: from source

git clone https://github.com/STSNaive/GRaDOS.git
cd GRaDOS
uv sync --all-extras
uv run grados setup --all
uv run grados status

Quick Start ⚡

  1. Install GRaDOS with uv tool install grados
  2. Run grados setup --all
  3. Edit ~/GRaDOS/config.json
  4. Run grados status to confirm dependencies, browser assets, and API keys
  5. Point your MCP client at grados or uvx grados
  6. If you already have a PDF library, run grados import-pdfs --from /path/to/papers --recursive

Configure your MCP client 🔌

Claude Code / Claude Desktop:

{
  "mcpServers": {
    "grados": {
      "command": "uvx",
      "args": ["grados"]
    }
  }
}

Codex:

[mcp_servers.grados]
command = "uvx"
args = ["grados"]

Use uvx when you want zero-install MCP launching. For long-lived local use, uv tool install grados plus the grados executable remains the primary path. If you want a custom data root, set GRADOS_HOME in your MCP client's environment.

Configuration ⚙️

Commands 🧰

Command Purpose
grados Start the MCP stdio server
grados setup --all Create directories, write config.json, install browser assets, and warm models
grados setup --with browser Install only browser runtime assets
grados setup --with models Warm only the embedding model
grados import-pdfs --from /path/to/papers --recursive Import an existing local PDF library into the canonical paper store
grados status Show config, dependency, runtime-asset, and API-key health
grados paths Show the resolved GRaDOS filesystem layout
grados update-db Build or refresh the ChromaDB index from papers/
grados migrate-config --from /path/to/legacy Migrate data from an older GRaDOS install
grados version Show package versions

Filesystem Layout 🗄️

By default, GRaDOS keeps everything in a visible directory:

~/GRaDOS/
├── config.json
├── papers/
├── downloads/
├── browser/
│   ├── chromium/
│   ├── profile/
│   └── extensions/
├── models/
├── database/
│   └── chroma/
├── logs/
└── cache/

Root selection priority:

  1. GRADOS_HOME
  2. ~/GRaDOS

API Keys 🔑

Key Source Required
ELSEVIER_API_KEY Elsevier Developer Portal No
WOS_API_KEY Clarivate Developer Portal No
SPRINGER_meta_API_KEY Springer Nature Metadata API No
SPRINGER_OA_API_KEY Springer Nature Open Access API No
LLAMAPARSE_API_KEY LlamaCloud No
ZOTERO_API_KEY Zotero Settings -> Keys No

Crossref and PubMed require no API keys. GRaDOS will use whichever services are configured and skip the rest. At minimum, the default remote search flow still works with the free sources, and the local paper workflow works without any third-party key.

Runtime Order 🌊

Search priority:

{
  "search": {
    "order": ["Elsevier", "Springer", "WebOfScience", "Crossref", "PubMed"]
  }
}

Full-text fetch priority:

{
  "extract": {
    "fetchStrategy": {
      "order": ["TDM", "OA", "SciHub", "Headless"]
    }
  }
}

PDF parsing priority:

{
  "extract": {
    "parsing": {
      "order": ["PyMuPDF", "Marker", "Docling"]
    }
  }
}

Migrating From Older Installs ♻️

If you already have an older GRaDOS data directory, use grados migrate-config to carry papers, downloads, browser assets, models, and compatible settings into the current layout.

Recommended migration flow:

uv tool install grados
grados migrate-config --from /path/to/legacy
grados status

What grados migrate-config carries forward:

  • Saved Markdown papers into papers/
  • Archived PDFs into downloads/
  • Managed browser assets into browser/
  • Model caches into models/
  • Compatible search, extraction, Zotero, and API-key settings into the new config.json

Path mapping:

Older layout Current layout
grados-config.json config.json
markdown/ papers/
downloads/ downloads/
.grados/browser/ browser/
models/ models/

Development 🛠️

uv sync --all-extras
uv run grados version
uv run pytest
uv build

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

grados-0.6.6.tar.gz (67.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

grados-0.6.6-py3-none-any.whl (61.9 kB view details)

Uploaded Python 3

File details

Details for the file grados-0.6.6.tar.gz.

File metadata

  • Download URL: grados-0.6.6.tar.gz
  • Upload date:
  • Size: 67.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grados-0.6.6.tar.gz
Algorithm Hash digest
SHA256 51fcde08eaa89516206be12dd9e84e7091578e1e345f2d80c2904c534094b5bf
MD5 707a20276441c2821c2ae1f546c946f5
BLAKE2b-256 f00bca5a59a74fd57ba749553412856422069ab201c4dfb3248aafa5c8c4b73e

See more details on using hashes here.

Provenance

The following attestation bundles were made for grados-0.6.6.tar.gz:

Publisher: publish.yml on STSNaive/GRaDOS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file grados-0.6.6-py3-none-any.whl.

File metadata

  • Download URL: grados-0.6.6-py3-none-any.whl
  • Upload date:
  • Size: 61.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grados-0.6.6-py3-none-any.whl
Algorithm Hash digest
SHA256 88eab42e6a8588c2b6e7c4751c8351969e6aaa3c8ae5a47116076f1eb0e691df
MD5 2ce1d16069aa042b2820016e9fdd328e
BLAKE2b-256 222bc06f531622fe7d2faa52f81c05d4b5209ef3bb68976cb8f3b87f7841d4db

See more details on using hashes here.

Provenance

The following attestation bundles were made for grados-0.6.6-py3-none-any.whl:

Publisher: publish.yml on STSNaive/GRaDOS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page