A multi-agent LLM orchestrator for academic peer-review.
Agentic_Paper
Built for students, PhDs, and researchers who want a transparent, reproducible second opinion on a manuscript, not another opaque chatbot.
Why Agentic_Paper?
This is not another ChatGPT wrapper.
A single LLM, given a paper and the prompt "please review this", gives you the average of the internet. Agentic_Paper does something genuinely different:
- 12 specialised reviewer agents run in parallel, each with its own role, prompt, and base complexity: Methodology, Results, Literature, Structure, Impact, Contradiction, Ethics, AI-Origin, Hallucination, Citation Validator, Statcheck Validator, Revision Assessor.
- A Coordinator synthesises their structured verdicts, names disagreements, and orders revision priorities.
- An Editor + an Author/Editor Summary agent produce a journal-style decision letter and the confidential note to the editor, separately.
- Every single LLM call is audited (token counts, latency, cost estimate, prompt hash, thinking-mode flag, seed) and written to audit.jsonl so you can prove what was asked and answered. No hallucination hides in the dark.
- Citations are validated against OpenAlex (~250M open scholarly records, no API key needed). Fabricated references get flagged automatically.
- Reported p-values are recomputed via the R statcheck package: if a paper says t(28) = 2.3, p = .01 and the math says p ≈ 0.029, you'll see it.
- Multi-provider, pluggable: OpenAI, Anthropic Claude, Google Gemini, and any OpenAI-compatible local endpoint (see § Local & Free Models).
- Typed everything: reviewers don't return free-form prose, they return validated pydantic models. Downstream agents consume structure, not substrings.
Outputs: a Markdown report, a stand-alone HTML dashboard, a structured JSON, and a run_id-scoped folder you can hand off when a journal asks "how was this assessment produced?".
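To make the "typed everything" point concrete, here is a minimal sketch of what a validated reviewer verdict could look like. The `ReviewerVerdict` class, its fields (`agent`, `score`, `issues`), and the 1 to 5 scale are hypothetical, and a standard-library dataclass stands in for the project's pydantic models so the snippet runs without extra dependencies.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReviewerVerdict:
    """Hypothetical structured verdict; the real models live in schemas.py."""
    agent: str
    score: int  # assumed scale: 1 (reject) to 5 (accept)
    issues: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject structurally valid but semantically wrong replies early.
        if not self.agent:
            raise ValueError("agent must be non-empty")
        if not 1 <= self.score <= 5:
            raise ValueError(f"score out of range: {self.score}")


def parse_verdict(raw: str) -> ReviewerVerdict:
    """Turn a raw JSON reply into a validated object, or raise ValueError."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    try:
        return ReviewerVerdict(
            agent=str(data["agent"]),
            score=int(data["score"]),
            issues=[str(i) for i in data.get("issues", [])],
        )
    except KeyError as exc:
        raise ValueError(f"missing field: {exc}") from exc
```

A downstream agent would then read `verdict.score` or `verdict.issues` directly instead of searching the reply text for keywords.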
Installation
pip install agentic-paper
That's it. Pure-Python; works on macOS, Linux, and Windows with Python 3.10+.
For the optional web UI (FastAPI + HTMX live demo):
pip install "agentic-paper[web]"
For statistical sanity checking (recommended for empirical papers), also install R and the statcheck + jsonlite packages:
install.packages(c("statcheck", "jsonlite"))
If R isn't available, the rest of the pipeline still runs; the Statcheck Validator simply reports "not available" in the final report.
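For intuition about what the statistical check does, the sketch below recomputes a two-tailed p-value from a reported t statistic in pure Python. It is a simplified stand-in, not statcheck's actual procedure (which also handles rounding and other test types); the function names and the 0.005 tolerance are assumptions made for illustration.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma keeps the normalising constant stable for large df.
    log_c = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value: 1 - 2 * integral of the density over [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (h / 3.0) * total


def is_consistent(t: float, df: int, reported_p: float, tol: float = 0.005) -> bool:
    """True when the reported p-value is within tol of the recomputed one."""
    return abs(two_tailed_p(t, df) - reported_p) <= tol
```

Calling `two_tailed_p(2.3, 28)` gives a value of roughly 0.029, so a reported p = .01 for the same statistic would be flagged by `is_consistent`.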
Quickstart
1. Set a provider key
export OPENAI_API_KEY="sk-..."
# Optional, for multi-provider routing:
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
No budget? Skip this step and jump to Local & Free Models.
2. Review a paper from the terminal
agentic-paper paper.pdf --seed 42
Outputs land under output_paper_review/<run_id>/. Open dashboard_*.html for a styled report, or read review_report_*.md directly.
3. Or use the web UI
agentic-paper-web --port 8000
# → http://127.0.0.1:8000/
A clean drop-zone page: drag a PDF in, watch the 12 agents think live (real thinking_delta stream when the provider supports it), then read the report inline. Optional Bring-Your-Own-Key form for sharing the demo with colleagues without exposing your account: keys are held in the worker stack frame, never logged, never written to disk.
+-----------------------------------------------+
|  drop a PDF here -> watch the agents work     |
|    methodology   reading...                   |
|    results       done (4.2 s, $0.018)         |
|    literature    thinking...                  |
|    ...                                        |
+-----------------------------------------------+
Auto-Mode: never fail because of a missing key
The web UI's routing profiles (max / std / quick) deliberately spread agents across multiple vendors to play to each model's strengths. For example, std sends High-tier reasoning to Claude, Standard tier to GPT, and Basic tier to Gemini. If you paste only one API key into the BYOK form, naive routing would fail on the other two providers and tank the run.
Auto-Mode fixes this transparently. When the BYOK form is submitted:
- Each tier is checked against the keys you actually provided.
- Tiers pointing to an unavailable provider are remapped to an equivalent model on a provider you do have (e.g. tier_high: anthropic/claude-opus-4-7 → google/gemini-3-pro).
- thinking_budget and the tier's role intensity are preserved: Auto-Mode picks the flagship reasoning model of the fallback vendor for tier_high, the mid-tier for tier_standard, and the cheapest for tier_basic.
- A yellow banner at the top of the run page lists every remap with the original vs. new (provider, model) so you know exactly what changed.
The run proceeds end-to-end with a single key, with no manual config edits. Auto-Mode only kicks in when at least one BYOK key is supplied; runs that use the server-side config are left alone.
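The remapping rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the `Route` type, the `remap` function, the `FALLBACKS` table, and every model name in it are hypothetical placeholders, and the sketch simply falls back to the first provider for which a key was supplied.

```python
from typing import NamedTuple


class Route(NamedTuple):
    provider: str
    model: str


# Hypothetical fallback table; the model names are placeholders only.
FALLBACKS: dict[str, dict[str, str]] = {
    "openai": {"tier_high": "openai-flagship", "tier_standard": "openai-mid", "tier_basic": "openai-cheap"},
    "anthropic": {"tier_high": "anthropic-flagship", "tier_standard": "anthropic-mid", "tier_basic": "anthropic-cheap"},
    "google": {"tier_high": "google-flagship", "tier_standard": "google-mid", "tier_basic": "google-cheap"},
}


def remap(routing: dict[str, Route], available: list[str]) -> tuple[dict[str, Route], list[str]]:
    """Return the effective routing plus one note per remapped tier."""
    if not available:
        raise ValueError("at least one provider key is required")
    fallback = available[0]
    effective: dict[str, Route] = {}
    notes: list[str] = []
    for tier, route in routing.items():
        if route.provider in available:
            effective[tier] = route  # key present: leave the tier untouched
            continue
        new = Route(fallback, FALLBACKS[fallback][tier])  # same tier, new vendor
        effective[tier] = new
        notes.append(f"{tier}: {route.provider}/{route.model} -> {new.provider}/{new.model}")
    return effective, notes
```

The `notes` list corresponds to what the yellow banner would display: one line per remap, original first, replacement second.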
Local & Free Models (with Ollama)
You don't need a credit card to use Agentic_Paper. The ProviderRegistry accepts any OpenAI-compatible endpoint, which means you can run the entire pipeline against Ollama, LM Studio, vLLM, or any local server you control. Free peer-review, fully private, all on your laptop.
Step-by-step: Ollama + Llama 3
# 1. Install Ollama from https://ollama.com (one-line installer)
# 2. Start Ollama in the background (it auto-serves an OpenAI-compatible API on :11434);
#    skip this if the desktop app or system service is already running
ollama serve &
# 3. Pull a model: Llama 3.1 8B fits on a laptop with 16 GB RAM
ollama pull llama3.1
# 4. Point Agentic_Paper at it; two env vars are all it takes
export OPENAI_API_KEY="ollama" # any non-empty string
export OPENAI_API_BASE="http://localhost:11434/v1" # Ollama's OpenAI-compat endpoint
# 5. Run the review using your local model
agentic-paper paper.pdf --config config.local.yaml
Minimal config.local.yaml to wire every tier to the local model:
output_dir: output_paper_review
routing:
  tier_high:     { provider: openai, model: llama3.1 }
  tier_standard: { provider: openai, model: llama3.1 }
  tier_basic:    { provider: openai, model: llama3.1 }
providers:
  openai:
    api_key_env: OPENAI_API_KEY
    base_url: http://localhost:11434/v1
Recommended local model tiers
| Hardware | Suggested model | Notes |
|---|---|---|
| Laptop, 16 GB RAM | llama3.1 (8B) | Solid baseline. Reviews are slower but coherent. |
| Workstation, 32 GB+ | llama3.1:70b or qwen2.5:32b | Closer to GPT-4o quality on reasoning. |
| GPU box, 24 GB+ VRAM | deepseek-r1 via vLLM | Excellent for the Methodology / Contradiction reviewers. |
| Mac Studio (M2 Ultra+) | llama3.1:70b MLX | Apple-silicon native; faster than CUDA at comparable memory. |
Caveats with local models
- Structured outputs: small open-weight models occasionally violate the JSON schema. Agentic_Paper retries with tenacity and falls back to response_format: json_object. Larger models (≥ 30B) are noticeably more reliable.
- Quality: a 7-8B local model will not match Claude Opus 4.7, but for a first pass on a draft (catching contradictions, missing citations, structural issues), it's more than enough.
- Privacy: nothing leaves your machine. Perfect for unpublished manuscripts under embargo or NDA.
- Cost: literally zero (modulo electricity).
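The retry-then-fallback behaviour described in the first caveat can be pictured roughly as follows. This is a sketch only: a plain loop stands in for tenacity, and `call_with_fallback`, its `call(strict)` callback, and the `required` field set are hypothetical names, not the project's API.

```python
import json
from typing import Callable


def call_with_fallback(
    call: Callable[[bool], str],
    required: set[str],
    attempts: int = 3,
) -> dict:
    """Try strict-schema mode up to `attempts` times, then one relaxed call.

    `call(True)` is assumed to request schema-constrained output and
    `call(False)` a plain JSON object; both return the raw reply text.
    """
    for _ in range(attempts):
        try:
            data = json.loads(call(True))
        except json.JSONDecodeError:
            continue  # malformed reply: retry in strict mode
        if isinstance(data, dict) and required.issubset(data):
            return data
    # Relaxed mode; a malformed reply here raises JSONDecodeError (a ValueError).
    data = json.loads(call(False))
    if not isinstance(data, dict) or not required.issubset(data):
        raise ValueError("reply is missing required fields")
    return data
```

A real version would likely add backoff between attempts, which is the part tenacity handles in the project.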
Mixed routing: free local + paid top-tier
You can also keep the cheap agents local and route only the heavy reasoning to a paid provider:
routing:
  tier_high:     { provider: anthropic,    model: claude-opus-4-7, thinking_budget: auto }
  tier_standard: { provider: openai,       model: gpt-5.4-mini }
  tier_basic:    { provider: ollama_local, model: llama3.1 }
providers:
  ollama_local:
    api_key_env: OPENAI_API_KEY
    base_url: http://localhost:11434/v1
The framework treats any custom provider name with a base_url as OpenAI-compatible.
Architecture (in 30 seconds)
PDF --> PaperExtractor --> paper.txt + complexity score
                |
                v
    +------------------------+
    | ConcurrentAgentRunner  |
    |    (asyncio.gather)    |
    +-----------+------------+
                |  12 reviewers in parallel
                v
    Coordinator --> Author/Editor Summary
                |
                v
             Editor
                |
                v
    Markdown · JSON · HTML · audit.jsonl
    (all under output_paper_review/<run_id>/)
The codebase is deliberately small and modular:
- orchestrator.py: coordinates the pipeline; doesn't know about concurrency.
- agent_runner.py: ConcurrentAgentRunner owns the asyncio machinery. Swappable for Celery / Ray / Dask without touching the orchestrator.
- storage.py: StorageProvider ABC + LocalFileStorage. Implement S3Storage or PostgresStorage once; everything else keeps working.
- providers/: one module per vendor (OpenAI, Anthropic, Google, OpenAI-compat). Each implements a uniform LLMProvider interface.
- agents/: one file per role. Each defines KEY, NAME, INSTRUCTIONS, SCHEMA, base_complexity. Adding a 13th reviewer is a 30-line file.
- schemas.py: pydantic models. Every LLM call returns a validated instance, not a parsed string.
- external/: OpenAlex (citations), statcheck (R subprocess).
If you read one file to understand the project, read agentic_paper/orchestrator.py. It's ~570 lines and reads like the table of contents of this README.
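As a rough picture of the fan-out stage in the diagram, the following sketch runs placeholder reviewers concurrently with asyncio.gather and hands the results to a toy coordinator. The functions `run_reviewer`, `run_all`, and `coordinate` are hypothetical, and recording a failed agent instead of aborting the run is an assumption about the desired behaviour, not a description of ConcurrentAgentRunner.

```python
import asyncio


async def run_reviewer(name: str, paper: str) -> dict:
    """Placeholder reviewer; a real agent would call an LLM provider here."""
    await asyncio.sleep(0)  # yield control, standing in for network I/O
    return {"agent": name, "chars_read": len(paper)}


async def run_all(names: list[str], paper: str) -> dict[str, dict]:
    """Run every reviewer concurrently and keep going if one of them fails."""
    results = await asyncio.gather(
        *(run_reviewer(n, paper) for n in names),
        return_exceptions=True,  # exceptions come back as values
    )
    return {
        n: (r if not isinstance(r, BaseException) else {"agent": n, "error": repr(r)})
        for n, r in zip(names, results)
    }


def coordinate(verdicts: dict[str, dict]) -> dict:
    """Toy coordinator: count completed agents and list the failed ones."""
    failed = sorted(n for n, v in verdicts.items() if "error" in v)
    return {"completed": len(verdicts) - len(failed), "failed": failed}
```

Usage would look like `coordinate(asyncio.run(run_all(["methodology", "results"], paper_text)))`, with `paper_text` being the extracted contents of paper.txt.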
What's in the run directory
After agentic-paper paper.pdf finishes, output_paper_review/<run_id>/ contains:
audit.jsonl               one JSON row per LLM call (12 fields)
paper.txt                 extracted text (kept for retry-failed-agents)
paper_info.json           title / authors / abstract / detected sections
review_<agent>.txt        every reviewer's validated, structured verdict
review_report_*.md        the human-readable report
review_results_*.json     machine-readable bundle (incl. routing + audit summary)
executive_summary_*.md    one-page TL;DR
dashboard_*.html          stand-alone styled report (no server needed)
prompts/<agent>.txt       exact prompt sent (full prompt + context dump)
responses/<agent>.json    raw response payload from the provider
paper_review_system.log   debug log of the whole run
This is the reproducibility bundle. Hand it off when a journal asks "how was this assessment produced?" and the answer is one tarball.
Reproducibility & determinism
agentic-paper paper.pdf --seed 42
The seed is forwarded to every provider that supports it:
- OpenAI: seed=N on Responses + Chat Completions.
- Google Gemini: GenerateContentConfig.seed=N.
- Anthropic: recorded in audit but not propagated (the Messages API doesn't expose a seed yet); pair with temperature: 0 for maximal stability.
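The per-provider handling listed above amounts to a small mapping from the run-level seed to request options. The helper below is a hypothetical illustration of that mapping (the name `request_options` and the returned dictionary shapes are assumptions); it does not import or call any vendor SDK.

```python
from typing import Any, Optional


def request_options(provider: str, seed: Optional[int]) -> dict[str, Any]:
    """Map a run-level seed onto per-provider request options (illustrative)."""
    if seed is None:
        return {}
    if provider == "openai":
        return {"seed": seed}              # forwarded as seed=N
    if provider == "google":
        return {"config": {"seed": seed}}  # i.e. GenerateContentConfig.seed=N
    if provider == "anthropic":
        return {"temperature": 0}          # no seed parameter; seed is audit-only
    raise ValueError(f"unknown provider: {provider}")
```

Whatever the provider, the seed itself would still be written to the audit record so the run can be described accurately afterwards.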
Cost, latency, and token counts for every call are queryable from audit.jsonl with one jq command; no separate observability stack is required.
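If you would rather stay in Python than use jq, a per-agent summary is a few lines. Note that the field names used here (`agent`, `cost_usd`, `latency_s`) are assumptions for illustration; check a real audit.jsonl row for the actual keys before relying on them.

```python
import json
from collections import defaultdict
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict[str, dict[str, float]]:
    """Aggregate call count, cost, and latency per agent from JSONL rows."""
    totals: dict[str, dict[str, float]] = defaultdict(
        lambda: {"calls": 0, "cost_usd": 0.0, "latency_s": 0.0}
    )
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        entry = totals[row["agent"]]  # assumed field name
        entry["calls"] += 1
        entry["cost_usd"] += row.get("cost_usd", 0.0)    # assumed field name
        entry["latency_s"] += row.get("latency_s", 0.0)  # assumed field name
    return dict(totals)
```

Because a file object iterates line by line, `summarize(open(path))` works directly on the audit.jsonl inside a run directory.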
Limitations (honest)
Things Agentic_Paper does not do:
- Substitute for human peer review. It surfaces mechanical issues (internal inconsistencies, citation gaps, statistical misreporting) faster than a tired human reviewer. It does not have taste, domain depth in your niche, or knowledge of journal-specific norms.
- Inspect figures, tables, or equations rendered as images. Only text is parsed (pdfplumber + heuristics).
- Fact-check beyond citations. No PubMed / arXiv / Semantic Scholar grounding, only OpenAlex resolution of explicit references.
- Multi-paper synthesis. One paper per run; use a shell loop for batch.
- Translate. Non-English papers technically work but the reviewer prompts assume an English peer-review register.
Development
git clone https://github.com/albertogerli/Agentic_Paper.git
cd Agentic_Paper
pip install -e ".[dev,web]"
pytest -q --cov=agentic_paper --cov-fail-under=60
224 tests, ~74 % line coverage, CI on Python 3.10 / 3.11 / 3.12.
PRs welcome, especially: new local-model recipes, new reviewer roles, S3/Postgres StorageProvider implementations, non-English prompt packs.
Citing
If Agentic_Paper contributes to research output, please cite:
@software{gerli_agentic_paper_2026,
author = {Gerli, Alberto G.},
title = {Agentic\_Paper: A multi-agent, multi-provider, structured-output
peer-review pipeline for scientific manuscripts},
year = {2026},
url = {https://github.com/albertogerli/Agentic_Paper},
version = {2.0.0}
}
License
MIT. Use it, fork it, ship it.
Contact
- Issues / PRs: https://github.com/albertogerli/Agentic_Paper/issues
- Email: alberto@albertogerli.it
- Workshop: Physalia 2026, Agentic Workflows for Scientific Reviewing