OpenAIReview
AI-powered academic paper reviewer
Our goal is to provide thorough, detailed reviews that help researchers conduct better research. See more examples here.
Installation
uv venv && uv pip install openaireview
# or: pip install openaireview
For fast PDF processing (requires MISTRAL_API_KEY):
uv pip install openaireview[mistral]
For development:
git clone https://github.com/ChicagoHAI/OpenAIReview.git
cd OpenAIReview
uv venv && uv pip install -e .
# or: pip install -e .
Updates
- `--max-pages` and `--max-tokens` to limit input size and save OCR cost
- Mistral OCR and DeepSeek OCR as optional PDF engines (`pip install openaireview[mistral]`)
- `openaireview extract` subcommand for a two-stage OCR + review workflow
- Multi-provider routing: OpenRouter, OpenAI, Anthropic, Gemini, Mistral (`--provider`)
- Table and figure extraction from arXiv HTML (tables as markdown)
- pymupdf4llm + GNN layout as default PDF fallback (replaces raw PyMuPDF)
- Mobile-responsive visualization UI
- Collapsible resolved comments in viz
- Claude Code skill (`/openaireview`) with multi-agent pipeline
PDF parsing engines (optional)
PDF extraction quality matters — math symbols, tables, and reading order all affect review quality. Four engines are supported, tried in order:
| Engine | Install | Best for | Notes |
|---|---|---|---|
| Mistral OCR | `pip install openaireview[mistral]` + set `MISTRAL_API_KEY` | Best overall quality, math, tables | Cloud API, ~$0.001/page |
| DeepSeek OCR | `pip install openaireview[deepseek]` + local backend | Privacy-sensitive docs | Local model via Ollama/vLLM |
| Marker | `uv tool install marker-pdf --with psutil` | Math-heavy PDFs (offline) | Slow without GPU |
| pymupdf4llm | (included) | Fallback, always available | No math symbol support |
The engine is auto-detected: if MISTRAL_API_KEY is set, Mistral OCR is tried first; then DeepSeek (if installed); then Marker (if on PATH); finally pymupdf4llm. You can force a specific engine with --ocr:
openaireview review paper.pdf --ocr mistral
openaireview review paper.pdf --ocr marker
For papers with math, we recommend using .tex source, .md, or arXiv HTML URLs instead of PDF when possible — these always produce correct output without needing an OCR engine.
Quick Start
First, set an API key for any supported provider:
export OPENROUTER_API_KEY=your_key_here # OpenRouter (supports all models)
# or
export OPENAI_API_KEY=your_key_here # OpenAI native
# or
export ANTHROPIC_API_KEY=your_key_here # Anthropic native
# or
export GEMINI_API_KEY=your_key_here # Google Gemini native
# or
export MISTRAL_API_KEY=your_key_here # Mistral native (also enables Mistral OCR)
Or create a .env file in your working directory (see .env.example).
Then review a paper and visualize results:
# Review a local file
openaireview review paper.pdf
# Or review directly from an arXiv URL
openaireview review https://arxiv.org/html/2602.18458v1
# Visualize results
openaireview serve
# Open http://localhost:8080
CLI Reference
openaireview review <file_or_url>
Review an academic paper for technical and logical issues. Accepts a local file path or an arXiv URL.
| Option | Default | Description |
|---|---|---|
| `--method` | `progressive` | Review method: `zero_shot`, `local`, `progressive`, `progressive_full` |
| `--model` | `anthropic/claude-opus-4-6` | Model to use |
| `--provider` | (auto) | LLM provider: `openrouter`, `openai`, `anthropic`, `gemini`, `mistral` |
| `--ocr` | (auto) | PDF OCR engine: `mistral`, `deepseek`, `marker`, `pymupdf` |
| `--max-pages` | (all) | Only process the first N pages of a PDF (saves OCR cost) |
| `--max-tokens` | (all) | Truncate input text to the first N tokens before review |
| `--output-dir` | `./review_results` | Directory for output JSON files |
| `--name` | (from filename) | Paper slug name |
openaireview extract <file>
Run OCR extraction only and save as markdown with metadata frontmatter. Useful for a two-stage workflow: extract first, then review the markdown.
| Option | Default | Description |
|---|---|---|
| `-o, --output` | `<file>.md` | Output markdown path |
| `--ocr` | (auto) | PDF OCR engine: `mistral`, `deepseek`, `marker`, `pymupdf` |
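The "markdown with metadata frontmatter" output can be sketched as below. The exact frontmatter field names are assumptions for illustration, not the package's documented schema.

```python
def to_markdown_with_frontmatter(body: str, source: str, engine: str) -> str:
    """Wrap extracted text in YAML-style frontmatter recording where the
    text came from and which OCR engine produced it (field names assumed)."""
    return f"---\nsource: {source}\nocr_engine: {engine}\n---\n\n{body}"
```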
openaireview serve
Start a local visualization server to browse review results.
| Option | Default | Description |
|---|---|---|
| `--results-dir` | `./review_results` | Directory containing result JSON files |
| `--port` | `8080` | Server port |
Supported Input Formats
- PDF (`.pdf`) — auto-selects the best available engine (Mistral OCR > DeepSeek > Marker > pymupdf4llm); see PDF parsing engines
- DOCX (`.docx`) — via python-docx
- LaTeX (`.tex`) — plain text with title extraction from `\title{}`
- Text/Markdown (`.txt`, `.md`) — plain text
- arXiv HTML — fetch and parse directly from `https://arxiv.org/html/<id>` or `https://arxiv.org/abs/<id>`
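Dispatching on these input kinds can be sketched with a small classifier (a hypothetical helper, not the package's real loader):

```python
from urllib.parse import urlparse

def classify_input(path_or_url: str) -> str:
    """Map an input path or URL to the loader category from the list above."""
    parsed = urlparse(path_or_url)
    if parsed.scheme in ("http", "https") and parsed.netloc == "arxiv.org":
        return "arxiv_html"  # covers both /html/<id> and /abs/<id>
    suffix = path_or_url.rsplit(".", 1)[-1].lower()
    return {
        "pdf": "pdf",
        "docx": "docx",
        "tex": "latex",
        "txt": "text",
        "md": "text",
    }.get(suffix, "unknown")
```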
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | | OpenRouter API key (supports all models) |
| `OPENAI_API_KEY` | | OpenAI native API key |
| `ANTHROPIC_API_KEY` | | Anthropic native API key |
| `GEMINI_API_KEY` | | Google Gemini native API key |
| `MISTRAL_API_KEY` | | Mistral API key (also used for Mistral OCR) |
| `MODEL` | `anthropic/claude-opus-4-6` | Default model |
| `REVIEW_PROVIDER` | (auto) | Force a specific LLM provider |
Set one API key. The provider is auto-detected from whichever key is set (priority: OpenRouter > OpenAI > Anthropic > Gemini > Mistral). See .env.example for a template.
Supported Models & Pricing
All models available on OpenRouter are supported — use any model ID via --model. The following models have built-in pricing for accurate cost tracking in the visualization:
| Model | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|
| `anthropic/claude-opus-4-6` | $5.00 | $25.00 |
| `anthropic/claude-opus-4-5` | $5.00 | $25.00 |
| `openai/gpt-5.2-pro` | $21.00 | $168.00 |
| `google/gemini-3.1-pro-preview` | $2.00 | $12.00 |
For models not listed above, a default rate of $5.00/$25.00 per 1M tokens is used.
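Cost tracking with these rates reduces to a simple per-token calculation; a minimal sketch using the rates from the table, with the stated $5.00/$25.00 fallback:

```python
PRICING = {  # (input, output) in $ per 1M tokens, from the pricing table
    "anthropic/claude-opus-4-6": (5.00, 25.00),
    "anthropic/claude-opus-4-5": (5.00, 25.00),
    "openai/gpt-5.2-pro": (21.00, 168.00),
    "google/gemini-3.1-pro-preview": (2.00, 12.00),
}
DEFAULT_RATE = (5.00, 25.00)  # fallback for unlisted models

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated review cost in dollars."""
    inp, out = PRICING.get(model, DEFAULT_RATE)
    return (input_tokens * inp + output_tokens * out) / 1_000_000
```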
Review Methods
- zero_shot — single prompt asking the model to find all issues
- local — deep-checks each chunk with surrounding window context (no filtering)
- progressive — sequential processing with running summary, then consolidation
- progressive_full — same as progressive but returns all comments before consolidation
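The progressive method's control flow can be sketched with stand-in callables for the LLM steps (`review_chunk` and `consolidate` below are placeholders, not real package APIs):

```python
def progressive_review(chunks, review_chunk, consolidate):
    """Sketch of the progressive method: each chunk is reviewed together
    with a running summary of everything seen so far, and the accumulated
    comments are consolidated at the end."""
    summary = ""
    comments = []
    for chunk in chunks:
        # An LLM call would return new comments plus an updated summary.
        new_comments, summary = review_chunk(chunk, summary)
        comments.extend(new_comments)
    return consolidate(comments)  # progressive_full would skip this step
```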
Claude Code Skill
A deep-review skill is bundled with the package. It runs a multi-agent pipeline — one sub-agent per paper section plus cross-cutting agents — and produces severity-tiered findings (major / moderate / minor).
Install once:
pip install openaireview
openaireview install-skill
Then in any Claude Code project:
/openaireview paper.pdf
/openaireview https://arxiv.org/abs/2602.18458
Finally, run openaireview serve to see results.
Development
Install with dev dependencies (includes pytest):
uv pip install -e ".[dev]"
Run tests:
pytest tests/
Integration tests that call the API require OPENROUTER_API_KEY and are skipped automatically when it's not set.
Benchmarks
Benchmark data and experiment scripts are in benchmarks/. See benchmarks/REPORT.md for results.