Academic research MCP server — search, extract, and manage papers

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

STSNaive

These details have not been verified by PyPI

Project description

GRaDOS

English | 简体中文

  .oooooo.    ooooooooo.             oooooooooo.     .oooooo.    .oooooo..o
 d8P'  `Y8b   `888   `Y88.           `888'   `Y8b   d8P'  `Y8b  d8P'    `Y8
888            888   .d88'  .oooo.    888      888 888      888 Y88bo.     
888            888ooo88P'  `P  )88b   888      888 888      888  `"Y8888o. 
888     ooooo  888`88b.     .oP"888   888      888 888      888      `"Y88b
`88.    .88'   888  `88b.  d8(  888   888     d88' `88b    d88' oo     .d8P
 `Y8bood8P'   o888o  o888o `Y888""8o o888bood8P'    `Y8bood8P'  8""88888P'

Graduate Research and Document Operating System

The enrichment-grade MCP server for academic paper workflows. For science.

GRaDOS gives AI agents (Claude, Codex, Cursor, and similar clients) a single stdio MCP server that can search academic databases, fetch papers through paywalls, parse PDFs into canonical Markdown, and revisit saved papers for citation-grounded writing.

Architecture 🧭

GRaDOS is designed to sit inside an agent research workflow:

1. Check the local paper library first with `search_saved_papers`, `get_saved_paper_structure`, or `grados://papers/{safe_doi}`
2. Search remote academic sources in configured priority order
3. Fetch full text through api -> browser -> oa -> scihub
4. Parse PDFs through Docling -> Marker -> PyMuPDF
5. Save raw PDFs to downloads/, canonical Markdown to papers/, the paper index to database/chroma/, and remote metadata to database/remote_metadata/
6. Re-open saved papers with low-token structure cards and deep-reading windows before citing them
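The fetch and parse steps above both follow a "try each strategy in order, keep the first success" pattern. The sketch below illustrates that pattern only; it is not GRaDOS's implementation. The `fetch_with_fallback` function and the stand-in fetchers are hypothetical names introduced for illustration, and the real browser strategy can pause for manual verification rather than simply falling through.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical fetcher type: returns PDF bytes on success, None on failure.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(
    doi: str, order: Sequence[str], fetchers: Dict[str, Fetcher]
) -> Tuple[Optional[str], Optional[bytes]]:
    """Try each strategy in order and return the first one that yields data."""
    for name in order:
        fetcher = fetchers.get(name)
        if fetcher is None:
            continue  # strategy not configured, so skip it
        try:
            data = fetcher(doi)
        except Exception:
            data = None  # treat any error as "this strategy failed"
        if data:
            return name, data
    return None, None


def always_none(doi: str) -> Optional[bytes]:
    return None


def always_raises(doi: str) -> Optional[bytes]:
    raise RuntimeError("simulated failure")


def returns_pdf(doi: str) -> Optional[bytes]:
    return b"%PDF-1.7 demo"


# Stand-in fetchers for illustration only; "scihub" is deliberately absent.
demo_fetchers = {"api": always_none, "browser": always_raises, "oa": returns_pdf}
strategy, data = fetch_with_fallback(
    "10.1234/demo", ["api", "browser", "oa", "scihub"], demo_fetchers
)
print(strategy)
```

Here the first two strategies fail and the third succeeds, so later strategies are never attempted.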

MCP Tools 🔧

Server	Tool	Description
GRaDOS	`search_academic_papers`	Search remote academic databases and return paper metadata with DOI deduplication, resumable continuation tokens, and local saved/full-text/summary state. Optional `indepth=true` materializes the returned candidates using the same `limit`; it is off in the default config.
GRaDOS	`search_saved_papers`	Search the local saved-paper library with semantic retrieval, metadata filters, and optional lexical reranking. Returned snippets are screening hints, not citation evidence.
GRaDOS	`extract_paper_full_text`	Fetch, parse, and save one paper's canonical full text by DOI. Returns a compact save receipt with URI, file path, sections, and warnings rather than the full paper text.
GRaDOS	`read_saved_paper`	Read paragraph windows from one saved paper for canonical deep reading and citation verification. Accepts a DOI, safe DOI, or `grados://papers/...` URI.
GRaDOS	`get_saved_paper_structure`	Return a low-token structure card for one saved paper with preview text, headings, and asset summary. Use it for screening before deep reading, not as the final citation source.
GRaDOS	`import_local_pdf_library`	Import a local PDF file or directory into the canonical paper store and retrieval index. Returns an import summary plus the first 25 item results.
GRaDOS	`parse_pdf_file`	Parse a local PDF into markdown. Without a DOI it returns a truncated preview; with a DOI it saves the paper into the canonical library and returns a save receipt.
GRaDOS	`save_paper_to_zotero`	Save one paper to the configured Zotero library through the Web API, typically for papers that actually support the final answer.
GRaDOS	`save_research_artifact`	Persist reusable intermediate outputs such as search snapshots, extraction receipts, and evidence grids in the local SQLite state store.
GRaDOS	`query_research_artifacts`	Query previously saved research artifacts by id, kind, or keyword. `detail=true` returns the full stored content.
GRaDOS	`manage_failure_cases`	Record, inspect, and summarize failed fetch, parse, search, or citation attempts. Can also suggest conservative retry steps from local failure memory.
GRaDOS	`get_citation_graph`	Return lightweight local citation relationships, including citation neighbors, common references, and reverse citing-paper lookups.
GRaDOS	`get_papers_full_context`	Return structured full-context material for a small paper set, with token estimates or actual section content for CAG-style deep reading.
GRaDOS	`build_evidence_grid`	Build topic- or subquestion-centered evidence grids from the local paper library before drafting.
GRaDOS	`compare_papers`	Extract aligned comparison material across multiple saved papers, focused on methods, results, or full text.
GRaDOS	`audit_draft_support`	Audit draft claims against the local paper library and return `supported`, `weak`, `unsupported`, or `misattributed` statuses with candidate evidence. `misattributed` is currently reliable for resolvable Latin-script or Chinese author-year citations; numeric citations stay support-only until bibliography mapping exists.

MCP Resources 📚

Resource	Description
`grados://papers/index`	Low-token index of all saved papers.
`grados://papers/{safe_doi}`	Canonical overview card for one saved paper.

safe_doi is an opaque GRaDOS paper ID returned by save receipts, search results, or resource URIs. New saves include a short normalized-DOI hash suffix to avoid filename collisions; older IDs such as 10_1234_demo still resolve. Prefer passing the DOI itself or the returned URI instead of deriving a paper ID by replacing DOI punctuation.
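To see why deriving a paper ID by replacing DOI punctuation is fragile, consider the sketch below. It is illustrative only: `naive_id` and `hashed_id` are hypothetical helpers, and the SHA-256 prefix is an assumed stand-in for whatever hash GRaDOS actually appends.

```python
import hashlib
import re


def naive_id(doi: str) -> str:
    """Replace every non-alphanumeric character with '_' (collision-prone)."""
    return re.sub(r"[^A-Za-z0-9]", "_", doi)


def hashed_id(doi: str, suffix_len: int = 8) -> str:
    """Hypothetical scheme: naive ID plus a short hash of the normalized DOI."""
    normalized = doi.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{naive_id(normalized)}_{digest[:suffix_len]}"


doi_a, doi_b = "10.1234/a.b", "10.1234/a_b"
print(naive_id(doi_a), naive_id(doi_b))    # both map to the same string
print(hashed_id(doi_a), hashed_id(doi_b))  # the suffix keeps them apart
```

Because the real suffix format is not specified here, treat the returned `safe_doi` as opaque and pass the DOI or URI back to the tools instead of rebuilding the ID.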

Local Paper Library 🗂️

After extraction or import, GRaDOS keeps papers in a visible on-disk layout:

Directory	Content	Purpose
`config.json`	Runtime configuration	One config file for the whole install
`papers/`	Canonical Markdown papers with YAML front-matter	Deep reading, structure cards, and retrieval
`downloads/`	Raw `.pdf` files	Archival copies of fetched or imported papers
`database/chroma/`	ChromaDB collections	Built-in semantic retrieval store
`database/remote_metadata/`	ChromaDB collection	Remote paper metadata, fetch status, and browser-resume cache
`research_checkpoints/`	`checkpoint.json` and rendered `checkpoint.md` files	Recoverable indepth research workflow state
`paper_summaries/`	Query-independent derived paper summaries	Navigation and context recovery, never citation evidence
`browser/`	Managed Chromium, profile, extensions	Browser fallback for difficult publisher pages
`models/`	Embedding and OCR model caches	Runtime assets warmed by setup

Repository Map 🗺️

README.md / README.zh-CN.md: primary installation and usage guides
.mcp.json: repo-local MCP wiring example
.claude-plugin/: native Claude Code plugin manifests
.agents/plugins/marketplace.json: repo-hosted Codex marketplace manifest
plugin.mcp.json: root plugin-scoped MCP config used by the Claude Code plugin
plugins/grados/.codex-plugin/: self-contained Codex plugin bundle used by the marketplace
plugins/grados/plugin.mcp.json: plugin-scoped MCP config copied into the Codex bundle
skills/grados/SKILL.md: structured research workflow built on top of the MCP tools

Installation 🚀

Option A: `uv tool install` (recommended)

uv tool install grados
grados setup
grados client install all

This creates ~/GRaDOS/config.json, prepares the visible directory layout, installs managed browser assets, and warms the default Harrier embedding runtime. docling is now included in the default install because the canonical parsing pipeline is Docling-first. Use grados auth set <provider> to store API keys in the OS keychain. Plaintext keys placed in config.json are treated as a one-time import path and are cleared after a successful migration.

Option B: extras, zero-install, or pip

# Default install (includes Docling)
uv tool install grados

# Optional heavier parser extras
uv tool install "grados[marker]"
uv tool install "grados[full]"

# Zero-install run
uvx grados version

# Traditional Python install
pip install grados

Extras in the current package:

grados: core MCP server, CLI, ChromaDB storage, Docling-first default parser, PyMuPDF fallback, browser automation, and built-in Zotero save support
grados[marker]: core plus the Marker PDF parser
grados[docling]: compatibility alias for the built-in Docling runtime
grados[full]: core plus the Marker parser

Option C: from source

git clone https://github.com/STSNaive/GRaDOS.git
cd GRaDOS
uv sync --all-extras
uv run grados setup
uv run grados client install all
uv run grados status

Quick Start ⚡

1. Install GRaDOS with `uv tool install grados` (this now includes Docling by default)
2. Run `grados setup`
3. Run `grados client install all` to register Claude Code and Codex in one step
4. Run `grados auth set elsevier` (and any other providers you need)
5. Run `grados status` to confirm dependencies, browser assets, keychain health, and API-key sources
6. If you already have a PDF library, run `grados import-pdfs --from /path/to/papers --recursive`
7. If you are upgrading from an older MiniLM-backed index, run `grados reindex` once before semantic search

Configure your clients 🔌

Recommended:

grados client install all

This currently installs GRaDOS into both Claude Code and Codex:

registers the grados MCP server through each client's own CLI
copies the bundled grados skill into the user's skills directory

You can also target a single client:

grados client install claude
grados client install codex
grados client list
grados client doctor

Manual MCP wiring (fallback)

Claude Code / Claude Desktop:

{
  "mcpServers": {
    "grados": {
      "command": "uvx",
      "args": ["grados"]
    }
  }
}

Codex:

[mcp_servers.grados]
command = "uvx"
args = ["grados"]

Use uvx when you want zero-install MCP launching. For long-lived local use, uv tool install grados plus the grados executable remains the primary path, and now brings Docling with it by default. If you want a custom data root, set GRADOS_HOME in your MCP client's environment.
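A custom data root can be added to the JSON wiring shown earlier by giving the server entry an environment mapping. The sketch below generates such an entry; `mcp_entry` is a hypothetical helper, `/data/grados` is a placeholder path, and the `env` key is assumed to be honored by your MCP client (check the client's own documentation).

```python
import json
from typing import Optional


def mcp_entry(grados_home: Optional[str] = None) -> dict:
    """Build a Claude-style mcpServers entry that launches GRaDOS through uvx."""
    server: dict = {"command": "uvx", "args": ["grados"]}
    if grados_home:
        # Assumption: the client passes this "env" mapping to the server process.
        server["env"] = {"GRADOS_HOME": grados_home}
    return {"mcpServers": {"grados": server}}


print(json.dumps(mcp_entry("/data/grados"), indent=2))
```

Calling `mcp_entry()` with no argument reproduces the minimal configuration, so the same helper covers both cases.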

Native Plugin Install 🧩

GRaDOS now ships native plugins for Codex and Claude Code.

Claude Code:

/plugin marketplace add STSNaive/GRaDOS
/plugin install grados@grados-plugins
/reload-plugins

Codex:

codex plugin marketplace add STSNaive/GRaDOS
codex
/plugins

Then choose the GRaDOS Plugins marketplace, install the GRaDOS plugin, and start a new thread. You can call @grados explicitly or just describe the research task directly.

Companion Skill 🤖

GRaDOS still ships a repo-local skill in skills/grados/. The grados client install ... flow above is now the preferred path for local use. Plugin install remains the alternative when you specifically want the native plugin packaging.

skills/grados/SKILL.md contains the current search -> structure -> deep read -> cite -> verify workflow
skills/grados/references/tools.md documents the current 16 tools and 2 resources
skills/grados/agents/openai.yaml describes the OpenAI / Codex-facing dependency on the grados MCP server

Codex and Claude Code use the same skill directory shape, <skills-root>/grados/SKILL.md, with the same supporting files under that directory. Only the skills root differs:

Codex personal skills: ~/.agents/skills
Claude Code personal skills: ~/.claude/skills
Claude Code project skills: .claude/skills

Install it by copying the entire skills/grados/ directory into the appropriate skills root:

mkdir -p "<skills-root>"
cp -R skills/grados "<skills-root>/"

For Codex, set <skills-root> to ~/.agents/skills
For Claude Code personal skills, set <skills-root> to ~/.claude/skills
For Claude Code project skills, set <skills-root> to .claude/skills

This fallback assumes the grados MCP server is already registered in your client. This repository's .mcp.json is the minimal repo-local example; after copying the skill, reload your client so it can discover the new skill files.

Configuration ⚙️

Use grados-config.example.json as the commented reference. Edits to config.json take effect on the next CLI run or MCP server restart.

Timeout / Retry Knobs

search: connect_timeout, read_timeout
extract: fetch_connect_timeout, fetch_read_timeout
extract.headless_browser: deadline_seconds, networkidle_timeout, poll_min_seconds, poll_max_seconds
retry_policy: max_attempts, max_wait, respect_retry_after
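As a rough illustration of how the three `retry_policy` knobs might interact, the sketch below computes wait times between attempts. The exponential backoff, the `base` parameter, and the capping of a server-provided Retry-After value at `max_wait` are assumptions made for this example, not documented GRaDOS behavior.

```python
from typing import List, Optional


def retry_delay(
    attempt: int,
    max_wait: float,
    retry_after: Optional[float] = None,
    respect_retry_after: bool = True,
    base: float = 1.0,
) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    if respect_retry_after and retry_after is not None:
        return min(float(retry_after), max_wait)  # server hint, capped at max_wait
    backoff = base * (2 ** (attempt - 1))  # 1, 2, 4, 8, ...
    return min(backoff, max_wait)


def delays(max_attempts: int, max_wait: float) -> List[float]:
    """Waits between attempts: max_attempts tries means max_attempts - 1 waits."""
    return [retry_delay(n, max_wait) for n in range(1, max_attempts)]


print(delays(max_attempts=5, max_wait=5.0))  # [1.0, 2.0, 4.0, 5.0]
```

With `max_attempts=5` there are four waits, and the last one is clipped by `max_wait`.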

Commands 🧰

Command	Purpose
`grados`	Start the MCP stdio server
`grados setup`	Create directories, write `config.json`, install browser assets, and warm models
`grados client install claude`	Register GRaDOS in Claude Code and install bundled skills into `~/.claude/skills`
`grados client install codex`	Register GRaDOS in Codex and install bundled skills into `~/.agents/skills`
`grados client install all`	Install GRaDOS into both Claude Code and Codex
`grados client list`	Show which supported clients currently have GRaDOS installed
`grados client doctor`	Run a lightweight health check for supported clients
`grados client remove claude` / `grados client remove codex`	Remove GRaDOS from the named client
`grados auth set/status/migrate/clear`	Manage provider API keys in the OS keychain
`grados import-pdfs --from /path/to/papers --recursive`	Import an existing local PDF library into the canonical paper store
`grados status`	Show config, dependency, runtime-asset, and API-key health
`grados paths`	Show the resolved GRaDOS filesystem layout
`grados update-db`	Incrementally refresh the ChromaDB index from `papers/` when the active indexing config is unchanged
`grados reindex`	Rebuild the semantic index from scratch after embedding-model or chunking changes
`grados version`	Show package versions

If you change indexing.model_id, indexing.max_length, or the section-aware chunking settings in config.json, use grados reindex instead of grados update-db.

indexing.batch_size is a runtime-only tuning knob; changing it alone does not require a rebuild.
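The rule above can be expressed as a small decision helper. This is a sketch under assumptions: `index_command` is a hypothetical function, the indexing section is treated as a flat dictionary, and `chunking` is a placeholder for the section-aware chunking settings, whose real key names are not listed here.

```python
# "model_id" and "max_length" come from the text; "chunking" is a placeholder
# for the section-aware chunking settings (real key names are not given here).
REBUILD_KEYS = {"model_id", "max_length", "chunking"}


def index_command(old_indexing: dict, new_indexing: dict) -> str:
    """Pick the CLI command to run after editing the indexing section."""
    changed = {
        key
        for key in old_indexing.keys() | new_indexing.keys()
        if old_indexing.get(key) != new_indexing.get(key)
    }
    if changed & REBUILD_KEYS:
        return "grados reindex"  # embeddings or chunks would no longer match
    return "grados update-db"  # incremental refresh is still valid


defaults = {
    "model_id": "microsoft/harrier-oss-v1-270m",
    "max_length": 4096,
    "batch_size": 0,
}
print(index_command(defaults, {**defaults, "batch_size": 8}))
print(index_command(defaults, {**defaults, "model_id": "microsoft/harrier-oss-v1-0.6b"}))
```

The first call changes only the batch size and keeps the incremental path; the second switches the embedding model and therefore calls for a full rebuild.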

Indexing Defaults 🧠

Default model: microsoft/harrier-oss-v1-270m
Heavier opt-in model: microsoft/harrier-oss-v1-0.6b
Default indexing.max_length: 4096
Default indexing.batch_size: 0 (auto, conservative on CPU/MPS and wider on CUDA)
Overlong single paragraphs are re-split by sentence or clause before embedding so grados reindex does not send giant chunks into SentenceTransformer.encode()
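The re-splitting of overlong paragraphs can be pictured with the sketch below. It is a simplified stand-in, not GRaDOS's chunker: `split_overlong` is a hypothetical function, it budgets by characters rather than tokens, and it treats sentence and clause punctuation as equivalent split points in a single pass.

```python
import re
from typing import List


def split_overlong(paragraph: str, max_chars: int) -> List[str]:
    """Split a paragraph into chunks of at most max_chars where possible."""
    if len(paragraph) <= max_chars:
        return [paragraph]
    # Break after sentence- or clause-ending punctuation followed by whitespace.
    pieces = re.split(r"(?<=[.!?;:,])\s+", paragraph)
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk before it overflows
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "Alpha beta. Gamma delta, epsilon zeta. Eta theta."
for chunk in split_overlong(text, max_chars=25):
    print(repr(chunk))
```

A single piece with no punctuation that exceeds the budget is left oversized in this sketch; a production chunker would need a further fallback for that case.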

GRaDOS does not assume FlashAttention is available on local macOS / CPU setups. If your runtime says it can use SDPA, that still does not guarantee a fused CUDA FlashAttention path; the safer default is smaller chunks, a shorter indexing length, and conservative batching.

Filesystem Layout 🗄️

By default, GRaDOS keeps everything in a visible directory:

~/GRaDOS/
├── config.json
├── papers/
├── downloads/
├── browser/
│   ├── chromium/
│   ├── profile/
│   └── extensions/
├── models/
├── database/
│   ├── chroma/
│   └── remote_metadata/
├── logs/
└── cache/

Root selection priority:

1. GRADOS_HOME
2. ~/GRaDOS
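The root selection order can be mirrored in a few lines of Python, for example in a script that needs to locate the paper store. `resolve_root` and `layout` are hypothetical helpers written for this example; `grados paths` remains the authoritative way to see the resolved layout.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Subdirectories taken from the tree shown above.
SUBDIRS = (
    "papers",
    "downloads",
    "browser",
    "models",
    "database/chroma",
    "database/remote_metadata",
    "logs",
    "cache",
)


def resolve_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return GRADOS_HOME when it is set and non-empty, else ~/GRaDOS."""
    env = os.environ if env is None else env
    override = env.get("GRADOS_HOME")
    if override:
        return Path(override).expanduser()
    return Path.home() / "GRaDOS"


def layout(root: Path) -> Dict[str, Path]:
    """Map each known subdirectory name to its path under the root."""
    return {name: root / name for name in SUBDIRS}


root = resolve_root()
print(root / "config.json")
print(layout(root)["database/chroma"])
```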

API Keys 🔑

Key	Source	Required
`ELSEVIER_API_KEY`	Elsevier Developer Portal	No
`PUBMED_API_KEY`	NCBI E-utilities API key	No
`WOS_API_KEY`	Clarivate Developer Portal	No
`SPRINGER_meta_API_KEY`	Springer Nature Metadata API	No
`SPRINGER_OA_API_KEY`	Springer Nature Open Access API	No
`LLAMAPARSE_API_KEY`	LlamaCloud	No
`ZOTERO_API_KEY`	Zotero Settings -> Keys	No

Crossref works without an API key. PubMed also works without one; PUBMED_API_KEY is optional and raises the E-utilities request rate limit. GRaDOS uses whichever services are configured and skips the rest, so the default remote search flow still works with the free sources, and the local paper workflow works without any third-party key.
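The "use what is configured, skip the rest" behavior can be sketched as a filter over the search order. The mapping from each source to the keys it needs is an assumption made for illustration (the table lists the keys but not how GRaDOS binds them to sources), and `active_sources` is a hypothetical helper.

```python
from typing import Dict, Iterable, List, Set

# Assumed mapping from search source to required key names.
# Crossref and PubMed have no required key because both work without one.
REQUIRED_KEYS: Dict[str, List[str]] = {
    "Elsevier": ["ELSEVIER_API_KEY"],
    "Springer": ["SPRINGER_meta_API_KEY"],
    "WebOfScience": ["WOS_API_KEY"],
    "Crossref": [],
    "PubMed": [],
}


def active_sources(order: Iterable[str], configured: Set[str]) -> List[str]:
    """Keep sources, in order, whose required keys are all configured."""
    return [
        source
        for source in order
        if source in REQUIRED_KEYS
        and all(key in configured for key in REQUIRED_KEYS[source])
    ]


order = ["Elsevier", "Springer", "WebOfScience", "Crossref", "PubMed"]
print(active_sources(order, configured=set()))
print(active_sources(order, configured={"ELSEVIER_API_KEY"}))
```

With no keys configured only the free sources remain, and the relative priority of the remaining sources is preserved.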

The preferred path is grados auth set <provider>, which stores the secret in the OS keychain. If you temporarily place a plaintext key in ~/GRaDOS/config.json, GRaDOS will import it into the keychain on the next run and then clear the plaintext value from the file.

Runtime Order 🌊

Search priority:

{
  "search": {
    "order": ["Elsevier", "Springer", "WebOfScience", "Crossref", "PubMed"]
  }
}

Full-text fetch priority:

{
  "extract": {
    "fetch_strategy": {
      "order": ["api", "browser", "oa", "scihub"]
    }
  }
}

Legacy fetch-strategy aliases such as TDM, OA, SciHub, and Headless are still accepted while existing configs migrate. The current scihub runtime uses extract.sci_hub.endpoints as an ordered access list: the first endpoint is tried first, and later entries are fallbacks. The legacy extract.sci_hub.fallback_mirror value is still accepted when endpoints is omitted or empty.
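The alias handling and endpoint fallback described above might look roughly like this. The alias targets are assumptions (the text names the legacy aliases but not what each maps to), the helper names are hypothetical, and the `.example` URLs are placeholders.

```python
from typing import Dict, Iterable, List

# Assumed alias targets; only the legacy names themselves come from the text.
LEGACY_ALIASES: Dict[str, str] = {
    "TDM": "api",
    "Headless": "browser",
    "OA": "oa",
    "SciHub": "scihub",
}


def normalize_order(order: Iterable[str]) -> List[str]:
    """Map legacy names to current ones, dropping duplicates but keeping order."""
    seen = set()
    result: List[str] = []
    for name in order:
        canonical = LEGACY_ALIASES.get(name, name)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


def scihub_endpoints(sci_hub: dict) -> List[str]:
    """Prefer the ordered endpoints list; fall back to the legacy single mirror."""
    endpoints = [e for e in (sci_hub.get("endpoints") or []) if e]
    if endpoints:
        return endpoints
    legacy = sci_hub.get("fallback_mirror")
    return [legacy] if legacy else []


print(normalize_order(["TDM", "Headless", "OA", "SciHub"]))
print(scihub_endpoints({"endpoints": [], "fallback_mirror": "https://mirror.example"}))
```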

The browser strategy is a first-class path for institutional publisher access. If a publisher verification page blocks PDF capture, GRaDOS records a challenge with manual-resume metadata in remote_metadata; complete the verification in the managed browser profile, then call extract_paper_full_text again with resume_browser=true to continue from the saved browser URL/profile instead of restarting at api.
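For clients that build MCP requests by hand, the resume step amounts to a second `tools/call` with one extra argument. The sketch below builds both requests as JSON-RPC 2.0 payloads. The `doi` argument name is an assumption based on the tool description, `tool_call` is a hypothetical helper, and the DOI is a placeholder; `resume_browser` is the flag named above.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request in JSON-RPC 2.0 form."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# First attempt: may stop at a publisher verification page.
first = tool_call(1, "extract_paper_full_text", {"doi": "10.1234/demo"})

# After completing verification in the managed browser profile, retry with
# resume_browser set so the fetch continues from the saved browser state.
resume = tool_call(
    2, "extract_paper_full_text", {"doi": "10.1234/demo", "resume_browser": True}
)

print(json.dumps(resume, indent=2))
```

Most agent clients construct these messages for you, so in practice this usually means asking the agent to call the tool again with `resume_browser=true`.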

PDF parsing priority:

{
  "extract": {
    "parsing": {
      "order": ["Docling", "Marker", "PyMuPDF"]
    }
  }
}

Importing Existing PDF Libraries ♻️

If you already have a local PDF library, use grados import-pdfs to parse and copy those files into the canonical papers/ + downloads/ layout:

grados import-pdfs --from /path/to/papers --recursive
grados status

Development 🛠️

uv sync --all-extras
uv run grados version
uv run pytest
uv build

Project Docs 📚

ADR.md
- Records accepted architectural decisions and why the project chose them.
CHANGELOG.md
- Records completed, user-visible changes across releases and unreleased work.

Release history

This version

0.6.11

May 6, 2026

0.6.10

Apr 21, 2026

0.6.9

Apr 16, 2026

0.6.8

Apr 16, 2026

0.6.7

Apr 15, 2026

0.6.6

Apr 7, 2026

Download files

Source Distribution

grados-0.6.11.tar.gz (226.0 kB)

Uploaded May 6, 2026 Source

Built Distribution

grados-0.6.11-py3-none-any.whl (166.5 kB)

Uploaded May 6, 2026 Python 3

File details

Details for the file grados-0.6.11.tar.gz.

File metadata

Download URL: grados-0.6.11.tar.gz
Upload date: May 6, 2026
Size: 226.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for grados-0.6.11.tar.gz
Algorithm	Hash digest
SHA256	`b41a6d5d8bd367cd7be6bfe01ce31fe900d0cbbf5fac158e92fca70121b1a82a`
MD5	`5b6dfcce34add4586bc521ff1617cb48`
BLAKE2b-256	`9be879fc2468d4822abe021409cca852dc8e9a8235301a417a445b4a1838ef26`

Provenance

The following attestation bundles were made for grados-0.6.11.tar.gz:

Publisher: publish.yml on STSNaive/GRaDOS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: grados-0.6.11.tar.gz
- Subject digest: b41a6d5d8bd367cd7be6bfe01ce31fe900d0cbbf5fac158e92fca70121b1a82a
- Sigstore transparency entry: 1448360471
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: STSNaive/GRaDOS@382461230a704660d92f5421bfe2557b89409e13
- Branch / Tag: refs/tags/v0.6.11
- Owner: https://github.com/STSNaive
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@382461230a704660d92f5421bfe2557b89409e13
- Trigger Event: push

File details

Details for the file grados-0.6.11-py3-none-any.whl.

File metadata

Download URL: grados-0.6.11-py3-none-any.whl
Upload date: May 6, 2026
Size: 166.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for grados-0.6.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`646f07459bf6c340affd6d2643900f3100dc44505b35c94f7154aff95d6da307`
MD5	`98d7684a729d2f8005b66b69910d8a68`
BLAKE2b-256	`00735569eeceaa0b01429af1e06c50341adf5a2f2c4f708612c35d17d85f5216`

Provenance

The following attestation bundles were made for grados-0.6.11-py3-none-any.whl:

Publisher: publish.yml on STSNaive/GRaDOS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: grados-0.6.11-py3-none-any.whl
- Subject digest: 646f07459bf6c340affd6d2643900f3100dc44505b35c94f7154aff95d6da307
- Sigstore transparency entry: 1448360606
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: STSNaive/GRaDOS@382461230a704660d92f5421bfe2557b89409e13
- Branch / Tag: refs/tags/v0.6.11
- Owner: https://github.com/STSNaive
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@382461230a704660d92f5421bfe2557b89409e13
- Trigger Event: push
