A multi-agent LLM orchestrator for academic peer-review.

Maintainers

elbec80

Project description

🧑‍🔬 Agentic_Paper

A multi-agent LLM orchestrator for academic peer-review.

Built for students, PhDs, and researchers who want a transparent, reproducible second opinion on a manuscript — not another opaque chatbot.

Why Agentic_Paper?

This is not another ChatGPT wrapper.

A single LLM, given a paper and the prompt "please review this", gives you the average of the internet. Agentic_Paper does something genuinely different:

🧠 12 specialised reviewer agents run in parallel, each with its own role, prompt, and base complexity — Methodology, Results, Literature, Structure, Impact, Contradiction, Ethics, AI-Origin, Hallucination, Citation Validator, Statcheck Validator, Revision Assessor.
🧑‍⚖️ A Coordinator synthesises their structured verdicts, names disagreements, and orders revision priorities.
✉️ An Editor + an Author/Editor Summary agent produce a journal-style decision letter and the confidential note to the editor — separately.
📜 Every single LLM call is audited — token counts, latency, cost estimate, prompt hash, thinking-mode flag, seed — written to audit.jsonl so you can prove what was asked and answered. No hallucination hides in the dark.
🔎 Citations are validated against OpenAlex (~250M open scholarly records, no API key needed). Fabricated references get flagged automatically.
🧮 Reported p-values are recomputed via the R statcheck package — if a paper says t(28) = 2.3, p = .01 and the math says p ≈ 0.029, you'll see it.
🔌 Multi-provider, pluggable: OpenAI, Anthropic Claude, Google Gemini, and any OpenAI-compatible local endpoint — see § Local & Free Models.
🎛️ Typed everything: reviewers don't return free-form prose, they return validated pydantic models. Downstream agents consume structure, not substrings.

Outputs: a Markdown report, a stand-alone HTML dashboard, a structured JSON, and a run_id-scoped folder you can hand off when a journal asks "how was this assessment produced?".
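
To make the statcheck example above concrete, here is a minimal Python sketch that recomputes a two-tailed p-value from a reported t statistic and its degrees of freedom. This is not how Agentic_Paper does it (the project delegates to the R statcheck package); the function names `t_pdf`, `two_tailed_p`, and `is_consistent`, the Simpson's-rule integration, and the 0.005 tolerance are all assumptions chosen for illustration.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value, integrating the density over [0, |t|] with Simpson's rule."""
    if steps % 2:  # Simpson's rule needs an even number of intervals
        steps += 1
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central_mass


def is_consistent(reported_p: float, t: float, df: int, tol: float = 0.005) -> bool:
    """True when the reported p-value is within `tol` of the recomputed one."""
    return abs(two_tailed_p(t, df) - reported_p) <= tol


recomputed = two_tailed_p(2.3, 28)
print(f"recomputed p = {recomputed:.3f}")
print("reported p = .01 consistent:", is_consistent(0.01, 2.3, 28))
```

For t(28) = 2.3 the recomputed value lands close to the 0.029 quoted above, so a reported p = .01 falls outside the tolerance and would be flagged.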

Installation

pip install agentic-paper

That's it. Pure-Python; works on macOS, Linux, and Windows with Python 3.10+.

For the optional web UI (FastAPI + HTMX live demo):

pip install "agentic-paper[web]"

For statistical sanity checking (recommended for empirical papers), also install R and the statcheck + jsonlite packages:

install.packages(c("statcheck", "jsonlite"))

If R isn't available, the rest of the pipeline still runs — the Statcheck Validator simply reports "not available" in the final report.

Quickstart

1. Set a provider key

export OPENAI_API_KEY="sk-..."
# Optional, for multi-provider routing:
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."

💡 No budget? Skip this step and jump to Local & Free Models.

2. Review a paper from the terminal

agentic-paper paper.pdf --seed 42

Outputs land under output_paper_review/<run_id>/ — open dashboard_*.html for a styled report, or read review_report_*.md directly.

3. Or use the web UI

agentic-paper-web --port 8000
# → http://127.0.0.1:8000/

A clean drop-zone page: drag a PDF in, watch the 12 agents think live (real thinking_delta stream when the provider supports it), then read the report inline. Optional Bring-Your-Own-Key form for sharing the demo with colleagues without exposing your account — keys are held in the worker stack frame, never logged, never written to disk.

┌─────────────────────────────────────────────┐
│  drop a PDF here  →  watch the agents work  │
│  ⠋ methodology   reading…                   │
│  ✓ results       done (4.2 s, $0.018)       │
│  ⠴ literature    thinking…                  │
│  …                                          │
└─────────────────────────────────────────────┘

⚡ Auto-Mode: never fail because of a missing key

The web UI's routing profiles (max / std / quick) deliberately spread agents across multiple vendors to play to each model's strengths: for example, std sends High-tier reasoning to Claude, Standard tier to GPT, and Basic tier to Gemini. If you paste only one API key into the BYOK form, naïve routing would fail authentication on the other two providers and tank the run.

Auto-Mode fixes this transparently. When the BYOK form is submitted:

Each tier is checked against the keys you actually provided.
Tiers pointing to an unavailable provider are remapped to an equivalent model on a provider you do have (e.g. tier_high: anthropic/claude-opus-4-7 → google/gemini-3-pro).
thinking_budget and the tier's role intensity are preserved — Auto-Mode picks the flagship reasoning model of the fallback vendor for tier_high, the mid-tier for tier_standard, and the cheapest for tier_basic.
A yellow banner at the top of the run page lists every remap with the original vs. new (provider, model) so you know exactly what changed.

The run proceeds end-to-end with a single key, with no manual config edits. Auto-Mode only kicks in when at least one BYOK key is supplied — runs that use the server-side config are left alone.
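
The remapping rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: `Route`, `FALLBACK_MODELS`, and `auto_mode` are hypothetical names, the fallback vendor is simply the alphabetically first provider with a key, and every model name other than `claude-opus-4-7`, `gemini-3-pro`, and `gpt-5.4-mini` (which appear in this README) is a placeholder.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    thinking_budget: Optional[str] = None


# Hypothetical per-vendor equivalents for each tier (mostly placeholder names).
FALLBACK_MODELS = {
    "anthropic": {"tier_high": "claude-opus-4-7",
                  "tier_standard": "anthropic-mid-tier",
                  "tier_basic": "anthropic-cheapest"},
    "google": {"tier_high": "gemini-3-pro",
               "tier_standard": "google-mid-tier",
               "tier_basic": "google-cheapest"},
    "openai": {"tier_high": "openai-flagship",
               "tier_standard": "gpt-5.4-mini",
               "tier_basic": "openai-cheapest"},
}


def auto_mode(routing, byok_providers):
    """Return (remapped routing, banner lines) for the providers that have keys."""
    if not byok_providers:
        # No BYOK key supplied: the server-side config is left alone.
        return dict(routing), []
    fallback = sorted(byok_providers)[0]  # assumption: deterministic first choice
    remapped, banner = {}, []
    for tier, route in routing.items():
        if route.provider in byok_providers:
            remapped[tier] = route
            continue
        # replace() keeps thinking_budget and swaps only provider and model.
        new_route = replace(route, provider=fallback,
                            model=FALLBACK_MODELS[fallback][tier])
        remapped[tier] = new_route
        banner.append(f"{tier}: {route.provider}/{route.model} -> "
                      f"{new_route.provider}/{new_route.model}")
    return remapped, banner


std_profile = {
    "tier_high": Route("anthropic", "claude-opus-4-7", thinking_budget="auto"),
    "tier_standard": Route("openai", "gpt-5.4-mini"),
    "tier_basic": Route("google", "google-cheapest"),
}
routing, banner = auto_mode(std_profile, byok_providers={"google"})
for line in banner:
    print(line)
```

With only a Google key, the high and standard tiers are remapped and the basic tier is untouched; the banner lines carry the original and new (provider, model) pairs.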

🦙 Local & Free Models (with Ollama)

You don't need a credit card to use Agentic_Paper. The ProviderRegistry accepts any OpenAI-compatible endpoint, which means you can run the entire pipeline against Ollama, LM Studio, vLLM, or any local server you control. Free peer-review, fully private, all on your laptop.

Step-by-step: Ollama + Llama 3

# 1. Install Ollama from https://ollama.com (one-line installer)
# 2. Start the Ollama server if it is not already running as a service
#    (it serves an OpenAI-compatible API on :11434)
ollama serve &

# 3. Pull a model: Llama 3.1 8B fits on a laptop with 16 GB RAM
#    (pulling needs the server from step 2 to be running)
ollama pull llama3.1

# 4. Point Agentic_Paper at it — two env vars is all it takes
export OPENAI_API_KEY="ollama"                          # any non-empty string
export OPENAI_API_BASE="http://localhost:11434/v1"      # Ollama's OpenAI-compat endpoint

# 5. Run the review using your local model
agentic-paper paper.pdf --config config.local.yaml

Minimal config.local.yaml to wire every tier to the local model:

output_dir: output_paper_review
routing:
  tier_high:     { provider: openai, model: llama3.1 }
  tier_standard: { provider: openai, model: llama3.1 }
  tier_basic:    { provider: openai, model: llama3.1 }
providers:
  openai:
    api_key_env: OPENAI_API_KEY
    base_url: http://localhost:11434/v1

Recommended local model tiers

Hardware	Suggested model	Notes
Laptop, 16 GB RAM	`llama3.1` (8B)	Solid baseline. Reviews are slower but coherent.
Workstation, 32 GB+	`llama3.1:70b` or `qwen2.5:32b`	Closer to GPT-4o quality on reasoning.
GPU box, 24 GB+ VRAM	`deepseek-r1` via vLLM	Excellent for the Methodology / Contradiction reviewers.
Mac Studio (M2 Ultra+)	`llama3.1:70b` MLX	Apple-silicon native; can be competitive with CUDA setups at comparable memory.

Caveats with local models

Structured outputs: small open-weight models occasionally violate the JSON schema. Agentic_Paper retries with tenacity and falls back to response_format: json_object. Larger models (≥ 30B) are noticeably more reliable.
Quality: a 7-8B local model will not match Claude Opus 4.7 — but for a first pass on a draft (catching contradictions, missing citations, structural issues), it's more than enough.
Privacy: nothing leaves your machine. Perfect for unpublished manuscripts under embargo or NDA.
Cost: literally zero (modulo electricity).
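
The retry-then-fallback behaviour described under "Structured outputs" can be illustrated with a standard-library sketch. The project itself uses tenacity and pydantic; here a plain loop and a dataclass stand in for them, and `Verdict`, `parse_verdict`, `call_with_retries`, the mode strings, and the `flaky_model` stub are hypothetical names introduced only for this example.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    score: int
    summary: str


def parse_verdict(raw: str) -> Verdict:
    """Parse and validate a model response; raise ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")
    if not isinstance(data.get("score"), int) or not isinstance(data.get("summary"), str):
        raise ValueError("response does not match the Verdict schema")
    return Verdict(score=data["score"], summary=data["summary"])


def call_with_retries(call_model: Callable[[str], str], max_attempts: int = 3) -> Verdict:
    """Try strict schema mode first; use plain JSON-object mode on the last attempt."""
    modes = ["json_schema"] * (max_attempts - 1) + ["json_object"]
    last_error = None
    for mode in modes:
        try:
            return parse_verdict(call_model(mode))
        except ValueError as exc:
            last_error = exc  # remember the failure and try the next mode
    raise RuntimeError(f"no valid verdict after {max_attempts} attempts") from last_error


# Stub standing in for a small local model that only succeeds in fallback mode.
calls: List[str] = []


def flaky_model(mode: str) -> str:
    calls.append(mode)
    if mode == "json_object":
        return '{"score": 4, "summary": "Sound methods, thin literature review."}'
    return "Sure! Here is my review: ..."


verdict = call_with_retries(flaky_model)
print(verdict, calls)
```

The design point is that downstream code only ever sees a validated `Verdict` or an explicit error, never a half-parsed string.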

Mixed routing: free local + paid top-tier

You can also keep the cheap agents local and route only the heavy reasoning to a paid provider:

routing:
  tier_high:     { provider: anthropic, model: claude-opus-4-7, thinking_budget: auto }
  tier_standard: { provider: openai,    model: gpt-5.4-mini }
  tier_basic:    { provider: ollama_local, model: llama3.1 }
providers:
  ollama_local:
    api_key_env: OPENAI_API_KEY
    base_url: http://localhost:11434/v1

The framework treats any custom provider name with a base_url as OpenAI-compatible.

Architecture (in 30 seconds)

        PDF ──▶ PaperExtractor ──▶ paper.txt + complexity score
                                          │
                                          ▼
                              ┌────────────────────────┐
                              │ ConcurrentAgentRunner  │
                              │   (asyncio.gather)     │
                              └──────────┬─────────────┘
                                         │ 12 reviewers in parallel
                                         ▼
                              Coordinator ─▶ Author/Editor Summary
                                         │
                                         ▼
                                      Editor
                                         │
                                         ▼
                          Markdown · JSON · HTML · audit.jsonl
                          (all under output_paper_review/<run_id>/)

The codebase is deliberately small and modular:

orchestrator.py — coordinates the pipeline; doesn't know about concurrency.
agent_runner.py — ConcurrentAgentRunner owns the asyncio machinery. Swappable for Celery / Ray / Dask without touching the orchestrator.
storage.py — StorageProvider ABC + LocalFileStorage. Implement S3Storage or PostgresStorage once; everything else keeps working.
providers/ — one module per vendor (OpenAI, Anthropic, Google, OpenAI-compat). Each implements a uniform LLMProvider interface.
agents/ — one file per role. Each defines KEY, NAME, INSTRUCTIONS, SCHEMA, base_complexity. Adding a 13th reviewer is a 30-line file.
schemas.py — pydantic models. Every LLM call returns a validated instance, not a parsed string.
external/ — OpenAlex (citations), statcheck (R subprocess).
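
As a rough illustration of the split between orchestrator.py and agent_runner.py, the sketch below keeps all asyncio details inside a runner class and exposes a plain synchronous call to the caller. It is not the project's code: `ConcurrentAgentRunnerSketch`, `run_reviewer`, `review`, and the `max_concurrency` parameter are hypothetical, and the reviewer body is a placeholder for a real LLM call.

```python
import asyncio
from typing import Dict, List


async def run_reviewer(key: str, paper_text: str) -> dict:
    """Placeholder for one reviewer agent; a real one would call an LLM provider."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"agent": key, "chars_read": len(paper_text)}


class ConcurrentAgentRunnerSketch:
    """Owns the asyncio machinery so the caller never touches it."""

    def __init__(self, max_concurrency: int = 4) -> None:
        self.max_concurrency = max_concurrency

    async def run_all(self, keys: List[str], paper_text: str) -> Dict[str, object]:
        semaphore = asyncio.Semaphore(self.max_concurrency)

        async def guarded(key: str) -> dict:
            async with semaphore:  # cap the number of in-flight reviewers
                return await run_reviewer(key, paper_text)

        # return_exceptions=True keeps one failing reviewer from cancelling the rest.
        results = await asyncio.gather(*(guarded(k) for k in keys),
                                       return_exceptions=True)
        return dict(zip(keys, results))


def review(paper_text: str, keys: List[str]) -> Dict[str, object]:
    """Orchestrator-style entry point: synchronous, unaware of concurrency."""
    return asyncio.run(ConcurrentAgentRunnerSketch().run_all(keys, paper_text))


results = review("An example manuscript.", ["methodology", "results", "ethics"])
for agent, outcome in results.items():
    print(agent, outcome)
```

Because `review` only depends on the runner's `run_all` contract, a different backend could replace the class without changing the caller, which is the property the module list above describes.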

If you read one file to understand the project, read agentic_paper/orchestrator.py. It's ~570 lines and reads like the table of contents of this README.

What's in the run directory

After agentic-paper paper.pdf finishes, output_paper_review/<run_id>/ contains:

audit.jsonl              ← one JSON row per LLM call (12 fields)
paper.txt                ← extracted text (kept for retry-failed-agents)
paper_info.json          ← title / authors / abstract / detected sections
review_<agent>.txt       ← every reviewer's validated, structured verdict
review_report_*.md       ← the human-readable report
review_results_*.json    ← machine-readable bundle (incl. routing + audit summary)
executive_summary_*.md   ← one-page TL;DR
dashboard_*.html         ← stand-alone styled report (no server needed)
prompts/<agent>.txt      ← exact prompt sent — full prompt + context dump
responses/<agent>.json   ← raw response payload from the provider
paper_review_system.log  ← debug log of the whole run

This is the reproducibility bundle. Hand it off when a journal asks "how was this assessment produced?" and the answer is one tarball.

Reproducibility & determinism

agentic-paper paper.pdf --seed 42

The seed is forwarded to every provider that supports it:

OpenAI — seed=N on Responses + Chat Completions.
Google Gemini — GenerateContentConfig.seed=N.
Anthropic — recorded in audit but not propagated (the Messages API doesn't expose a seed yet); pair with temperature: 0 for maximal stability.
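
The per-provider behaviour listed above can be summarised as a small lookup. This is a sketch under assumptions: `seed_options` and `audit_fields` are hypothetical helpers, the returned dictionaries only mirror the parameter names mentioned in this README (`seed`, `GenerateContentConfig.seed`, `temperature: 0`), and no provider SDK is called.

```python
def seed_options(provider: str, seed: int) -> dict:
    """Extra request options a provider would receive for a given seed."""
    if provider == "openai":
        return {"seed": seed}
    if provider == "google":
        # Stands in for GenerateContentConfig.seed in the Gemini SDK.
        return {"config": {"seed": seed}}
    if provider == "anthropic":
        # No seed parameter to forward; temperature 0 is the suggested pairing.
        return {"temperature": 0}
    raise ValueError(f"unknown provider: {provider}")


def audit_fields(provider: str, seed: int) -> dict:
    """The seed is always recorded, whether or not it was forwarded."""
    options = seed_options(provider, seed)
    forwarded = "seed" in options or "seed" in options.get("config", {})
    return {"provider": provider, "seed": seed, "seed_forwarded": forwarded}


for name in ("openai", "google", "anthropic"):
    print(name, seed_options(name, 42), audit_fields(name, 42))
```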

Cost, latency, and token counts for every call are queryable from audit.jsonl with one jq command — no separate observability stack required.
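
For readers who prefer Python to jq, the same kind of query might look like the sketch below. The field names `agent`, `input_tokens`, `output_tokens`, and `cost_usd` are assumptions made for illustration; check a real audit.jsonl row for the actual keys before relying on them.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable


def summarise_audit(lines: Iterable[str]) -> Dict[str, dict]:
    """Aggregate call count, tokens, and cost per agent from JSONL rows."""
    totals = defaultdict(lambda: {"calls": 0, "tokens": 0, "cost_usd": 0.0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)
        bucket = totals[row.get("agent", "unknown")]
        bucket["calls"] += 1
        bucket["tokens"] += row.get("input_tokens", 0) + row.get("output_tokens", 0)
        bucket["cost_usd"] += row.get("cost_usd", 0.0)
    return dict(totals)


# Hypothetical rows, standing in for the contents of audit.jsonl.
sample_rows = [
    '{"agent": "methodology", "input_tokens": 1000, "output_tokens": 200, "cost_usd": 0.5}',
    '{"agent": "methodology", "input_tokens": 500, "output_tokens": 100, "cost_usd": 0.25}',
    '{"agent": "results", "input_tokens": 800, "output_tokens": 150, "cost_usd": 0.25}',
]
summary = summarise_audit(sample_rows)
for agent, stats in summary.items():
    print(agent, stats)
```

To run it on a real bundle, pass the lines of output_paper_review/<run_id>/audit.jsonl instead of `sample_rows`.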

Limitations (honest)

Things Agentic_Paper does not do:

Substitute for human peer review. It surfaces mechanical issues — internal inconsistencies, citation gaps, statistical misreporting — faster than a tired human reviewer. It does not have taste, domain depth in your niche, or knowledge of journal-specific norms.
Inspect figures, tables, or equations rendered as images. Only text is parsed (pdfplumber + heuristics).
Fact-check beyond citations. No PubMed / arXiv / Semantic Scholar grounding — only OpenAlex resolution of explicit references.
Synthesise across multiple papers. Each run reviews one paper; use a shell loop for batch.
Translate. Non-English papers technically work but the reviewer prompts assume an English peer-review register.
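
For the one-paper-per-run limitation, a batch can also be driven from Python rather than a shell loop. The sketch below only uses the command form shown in this README (`agentic-paper <file> --seed N`); `build_commands` and `review_folder` are hypothetical helper names, and the folder layout is an assumption.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(folder: str, seed: int = 42) -> List[List[str]]:
    """One agentic-paper invocation per PDF in `folder`, in sorted order."""
    return [
        ["agentic-paper", str(pdf), "--seed", str(seed)]
        for pdf in sorted(Path(folder).glob("*.pdf"))
    ]


def review_folder(folder: str, seed: int = 42) -> List[int]:
    """Run the reviews one after another and collect the exit codes."""
    exit_codes = []
    for command in build_commands(folder, seed):
        # check=False: a failure on one paper does not stop the batch.
        completed = subprocess.run(command, check=False)
        exit_codes.append(completed.returncode)
    return exit_codes
```

Calling `review_folder("papers")` would then process every PDF in a `papers` directory sequentially, each with its own run_id-scoped output folder.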

Development

git clone https://github.com/albertogerli/Agentic_Paper.git
cd Agentic_Paper
pip install -e ".[dev,web]"
pytest -q --cov=agentic_paper --cov-fail-under=60

224 tests, ~74 % line coverage, CI on Python 3.10 / 3.11 / 3.12.

PRs welcome — especially: new local-model recipes, new reviewer roles, S3/Postgres StorageProvider implementations, non-English prompt packs.

Citing

If Agentic_Paper contributes to research output, please cite:

@software{gerli_agentic_paper_2026,
  author    = {Gerli, Alberto G.},
  title     = {Agentic\_Paper: A multi-agent, multi-provider, structured-output
               peer-review pipeline for scientific manuscripts},
  year      = {2026},
  url       = {https://github.com/albertogerli/Agentic_Paper},
  version   = {2.0.0}
}

License

MIT. Use it, fork it, ship it.

Contact

Issues / PRs: https://github.com/albertogerli/Agentic_Paper/issues
Email: alberto@albertogerli.it
Workshop: Physalia 2026 — Agentic Workflows for Scientific Reviewing

Release history

2.1.0 (this version): May 27, 2026
2.0.0: May 27, 2026

Download files

Source Distribution

agentic_paper-2.1.0.tar.gz (102.8 kB)

Uploaded May 27, 2026 Source

Built Distribution

agentic_paper-2.1.0-py3-none-any.whl (124.6 kB)

Uploaded May 27, 2026 Python 3

File details

Details for the file agentic_paper-2.1.0.tar.gz.

File metadata

Download URL: agentic_paper-2.1.0.tar.gz
Upload date: May 27, 2026
Size: 102.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentic_paper-2.1.0.tar.gz
Algorithm	Hash digest
SHA256	`73130b1eba01905ddb254602adc5eb15f2bdc1a5630645688242fecd6ade76a5`
MD5	`9c66cf33b9691ed8bebdbaadd437c640`
BLAKE2b-256	`9e17d4b1151f23b6ce684b8e1cb3d3a17e1c741e1d1812935a3266c87f21e979`

Provenance

The following attestation bundles were made for agentic_paper-2.1.0.tar.gz:

Publisher: release.yml on albertogerli/Agentic_Paper

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentic_paper-2.1.0.tar.gz
- Subject digest: 73130b1eba01905ddb254602adc5eb15f2bdc1a5630645688242fecd6ade76a5
- Sigstore transparency entry: 1645199598
- Sigstore integration time: May 27, 2026
Source repository:
- Permalink: albertogerli/Agentic_Paper@fac8f542b125f686d7cf95234410732cc605b3af
- Branch / Tag: refs/tags/v2.1.0
- Owner: https://github.com/albertogerli
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@fac8f542b125f686d7cf95234410732cc605b3af
- Trigger Event: push

File details

Details for the file agentic_paper-2.1.0-py3-none-any.whl.

File metadata

Download URL: agentic_paper-2.1.0-py3-none-any.whl
Upload date: May 27, 2026
Size: 124.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentic_paper-2.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`106283ffac050b16b4f4fac30cd50f0103e1d82ef5907d1422844a994d9eb0c5`
MD5	`9de663601bb63b09fb8d294d0ea0b8c2`
BLAKE2b-256	`261df3f7b5e1704cba2280457cd69d30f18226e905b5a8955e9ce807c4b8c70f`

Provenance

The following attestation bundles were made for agentic_paper-2.1.0-py3-none-any.whl:

Publisher: release.yml on albertogerli/Agentic_Paper

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentic_paper-2.1.0-py3-none-any.whl
- Subject digest: 106283ffac050b16b4f4fac30cd50f0103e1d82ef5907d1422844a994d9eb0c5
- Sigstore transparency entry: 1645199653
- Sigstore integration time: May 27, 2026
Source repository:
- Permalink: albertogerli/Agentic_Paper@fac8f542b125f686d7cf95234410732cc605b3af
- Branch / Tag: refs/tags/v2.1.0
- Owner: https://github.com/albertogerli
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@fac8f542b125f686d7cf95234410732cc605b3af
- Trigger Event: push
