A comprehensive tool for validating reference accuracy in academic papers

RefChecker

Validate reference accuracy in academic papers.
Catch citation errors, fabricated references, and metadata mismatches before they reach reviewers.

Quick Start · Features · Web UI · CLI · Hallucination Detection · Deployment

Download for macOS   Download for Windows

Linux: .AppImage · .deb · all builds

Native desktop builds powered by Tauri · Built and signed by GitHub Actions on every release tag.

What the desktop app adds (v0.6.19)

  • Cascade extraction (token saver). Settings → Reference Extraction picks between cascade (regex/BibTeX/GROBID first, LLM only on the messy or unrecognized entries) and LLM-only. Default is cascade — typically uses 60–90% fewer LLM tokens on well-formatted papers.
  • Global reference library (read + write, every code path). Every verified reference — including ones that flow through post-processing rewriters like the hallucination resolver — is now persisted to the global identity cache (DOI / arXiv / normalized title key) via a single emit_progress hook. Previous builds only persisted along one of six code paths, which left a lot of refs uncached. Future verifications consult the cache automatically for instant matches. New Seen References (library) view at the top of the main panel.
  • Find similar papers — multi-source, actively verified. Pulls candidates from Semantic Scholar (recommendations + co-citation), OpenAlex (DOI-resolved co-citation), the user's configured web-search provider (OpenAI / Anthropic / Gemini), and the user's default LLM ("what else should I read?"). Candidates are deduped across sources, each row shows a source badge, and any cache-miss candidate is actively re-verified through the hybrid checker before display — so you see real ✓ verified or ? unconfirmed, not just metadata.
  • Real citation graph. Force-directed view of the paper's references, edges drawn from the real Semantic Scholar citation graph (ref A → ref B iff A cites B), nodes sized by in-paper in-degree (how many other refs in the same bibliography cite each one). Double-click any node to expand one hop further — pulls in that paper's top outgoing references as new nodes.
  • Citation context. Each reference card shows the sentence in the paper where it's cited — now matches both numeric markers ([12]) AND author-year style ((Smith et al., 2020), Smith (2020)), so APA / Chicago papers light up too. Up to two sentences per ref, captured at parse time with no extra LLM call.
  • Live citation health badge. Score recomputes on every edit (Apply Fix / Add / Remove). Copy as Markdown for a README badge.
  • Add / Remove / Suggest alternative — everywhere. Now in both the References tab and the Corrections tab. Newly-added references are re-verified live, and Apply Fix in the Corrections tab also re-runs the verifier on the corrected metadata so the citation-health chip moves in real time. Suggest alternative combines an LLM-backed "what real paper did the author probably mean?" with Semantic Scholar title-search — and each candidate is rendered in your currently-selected citation style (APA / IEEE / BibTeX / your custom template) with a one-click Copy button, so the replacement drops into your bibliography in the right format.
  • Minimal Grammarly-style citation-health chip sits inline in the Summary header — color-coded, hover for breakdown, recomputes on every edit. No copy / download clutter; the score follows you.
  • Tunable citation styles + custom-style builder. APA / IEEE / Vancouver / etc. now expose Max-authors, et-al threshold, and Include-URL toggles. Need a journal-specific format? Customize style → New custom style lets you save a template like {authors} ({year}). {title}. {venue}. {doi} and pick it from the dropdown forever after.
  • Seen References — live + clearable. The library auto-refreshes whenever a check finishes (newly-verified refs appear immediately) and a Clear cache button wipes the whole identity table when you need a fresh start.
  • LLM token + cost meter. Inline chip in the Summary header tracks total tokens and an estimated USD cost across every provider you've used (OpenAI / Anthropic / Gemini) with a per-provider breakdown on hover. Counters persist across restarts via llm_usage.json. Cost rates are list-price USD-per-1K-tokens, hand-curated per model.
  • Smoother citation graph. Labels are now hover-on-demand (one node at a time, source always labelled) so the canvas stays readable; hovered node gets a soft outline ring. Slower force-cooldown lets the network breathe.
  • Bullet-proof external links. Open in browser / GitHub / DOI / arXiv buttons now use an explicit shell:allow-open scope (https://**, http://**, mailto:**, tel:**) so the Tauri shell plugin actually opens them — the default scope was silently empty.
  • Apply Fix actually moves the citation-health chip. Accepting a correction now merges the verifier's suggested metadata back into the stored reference before re-verifying, so the ref flips to verified and the badge updates in real time. Apply-all-visible parallelises the re-verifies at 4 in flight.
  • Better "Similar Papers" diagnostics. When nothing comes back, the panel now shows which sources were tried and how many candidates each produced, and explicit hints for the common causes (rate limits, refs without DOIs, no web-search-capable LLM provider).
  • Token meter shows per-kind breakdown + cascade savings hint. Hover the chip for tokens grouped by call kind (extraction / hallucination / suggest-alt / web-search). When cascade saved measurable cost vs an LLM-only path, the savings are estimated and surfaced at the bottom of the tooltip.
  • Desktop version visible in Settings. Settings → footer now shows both the desktop bundle version (e.g. Desktop v0.6.6) and the underlying Python engine version separately so you can tell at a glance which build you're on.
  • Reference-manager export (RIS). New Export → RIS option produces a .ris file that imports directly into Zotero, EndNote, Mendeley, Rayyan, Papers, and RefWorks — including the verifier's corrected metadata (DOI, arXiv ID, fixed authors) instead of the wrong-as-cited values. The export menu also gets a Sort control: citation order, alphabetical (first author), or year ascending/descending.
  • References tab honors the citation-style picker live. Pick APA / IEEE / BibTeX / your custom template in the References-tab header and each card renders a styled preview line above the structured metadata. Title / authors / venue rows stay below so per-field badges keep working.
  • Tab pill counts respect Summary filters. Click "Errors" in the Summary chips and the References tab pill drops to "3" instead of staying at the full bibliography count — so the tab header matches the in-page header.
  • Author rendering no longer leaks [object Object]. formatAuthors now normalises every author shape we've seen from upstream (bare strings, {name}, {display_name}, OpenAlex {author:{display_name}}, CSL JSON {family, given}, JSON-encoded arrays).
  • Drag-and-drop + Open With. Drop a PDF / DOCX / ODT / RTF / Markdown / HTML / BibTeX / LaTeX / plain text on the window — or right-click any of those in Finder/Explorer and pick RefChecker — and the check starts immediately.
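The custom-style template quoted in the list above, {authors} ({year}). {title}. {venue}. {doi}, can be illustrated with a minimal Python sketch. It assumes the placeholders behave like Python format fields, which may not be how the desktop app implements them; render_reference and the sample record are hypothetical names used only for illustration.

```python
# Minimal sketch: filling a custom citation-style template.
# Assumption: placeholders behave like Python str.format fields.
# render_reference is a hypothetical helper, not part of RefChecker.

def render_reference(template: str, ref: dict) -> str:
    """Fill the template; fields missing from `ref` render as empty strings."""
    fields = {"authors": "", "year": "", "title": "", "venue": "", "doi": ""}
    fields.update({key: value for key, value in ref.items() if value is not None})
    return template.format(**fields)


TEMPLATE = "{authors} ({year}). {title}. {venue}. {doi}"

ref = {
    "authors": "He, K., Zhang, X., Ren, S., & Sun, J.",
    "year": 2016,
    "title": "Deep Residual Learning for Image Recognition",
    "venue": "CVPR",
    "doi": "10.1109/CVPR.2016.90",
}

print(render_reference(TEMPLATE, ref))
```

Keeping the template as a plain string means one saved style can be reused for every reference in the bibliography.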

RefChecker verifies citations against Semantic Scholar, OpenAlex, CrossRef, DBLP, and ACL Anthology, and uses LLM-powered deep web search to flag likely fabricated references. When the LLM finds a more likely source than the first database match, RefChecker re-verifies the citation against the LLM-found metadata before deciding whether it is an error or a hallucination. It supports single papers, bulk batches, and automated scanning of entire OpenReview venues.

Built by Mark Russinovich with AI assistants (Cursor, GitHub Copilot, Claude Code). Watch the deep dive video.



Quick Start

Web UI (Docker)

docker run -p 8000:8000 ghcr.io/markrussinovich/refchecker:latest

Open http://localhost:8000 in your browser.

Web UI (pip)

pip install academic-refchecker[llm,webui]
refchecker-webui

CLI (pip)

pip install academic-refchecker[llm]
academic-refchecker --paper 1706.03762
academic-refchecker --paper /path/to/paper.pdf

LLM extraction is generally more accurate, but PDFs can fall back to GROBID when no extraction LLM is configured. Deep hallucination checks require a hallucination-capable LLM provider: OpenAI, Anthropic, Google, or Azure.

Tip: Set SEMANTIC_SCHOLAR_API_KEY to speed up verification: about 1-2s per reference with a key versus 5-10s without.


Features

| Category | What it does |
|---|---|
| Input formats | ArXiv IDs/URLs, PDFs, LaTeX (.tex), BibTeX (.bib/.bbl), plain text |
| Verification sources | Semantic Scholar, OpenAlex, CrossRef, DBLP, ACL Anthology |
| LLM extraction | OpenAI, Anthropic, Google, Azure, or local vLLM for parsing complex bibliographies |
| Metadata checks | Titles, authors, years, venues, DOIs, ArXiv IDs, URLs |
| Smart matching | Handles formatting variations (BERT vs B-ERT, pre-trained vs pretrained) |
| Hallucination detection | Flags likely fabricated references using deterministic pre-filters, LLM deep web search, and metadata reverification when the LLM finds a better match |
| AI-generated-text detection (opt-in) | Optionally analyzes the body text of each checked article for AI-generated-likelihood, returning a low/medium/high band plus advisory flagged passages. Three engines: a local calibrated model (offline, downloadable), an LLM judge (reuses your configured LLM), or an external API (Pangram/GPTZero). Advisory only: detection is unreliable on technical and non-native-English academic writing, so results are framed as a self-check and never as proof of misconduct. Enable under Settings → AI Detection. |
| Bulk checking | Upload multiple files or a ZIP in the Web UI; use --paper-list or --openreview in the CLI |
| OpenReview scanning | Fetch all accepted (or submitted) papers for a venue and scan them in one command |
| Reports | JSON, JSONL, CSV, or text, with error details, corrections, and hallucination assessments |
| Corrections | Auto-generates corrected BibTeX, plain-text, and bibitem entries for each error |
| Web UI | Real-time progress, history sidebar, batch tracking, split extraction/hallucination LLM settings, export (Markdown/text/BibTeX), dark mode |
| Multi-user hosting | OAuth sign-in (Google, GitHub, Microsoft), per-user rate limiting, admin controls |
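The "Smart matching" row can be illustrated with a small sketch. It assumes a simple normalize-then-compare approach (lowercase, strip accents, drop everything that is not a letter or digit); RefChecker's actual matching logic is not shown in this README and is probably more tolerant. The names normalize_title and titles_match are hypothetical.

```python
# Minimal sketch of formatting-insensitive title matching.
# Assumption: normalization = lowercase + accent stripping + removing
# non-alphanumerics. This is illustrative, not RefChecker's matcher.
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Reduce a title to a comparison key that ignores case, hyphens, and spacing."""
    decomposed = unicodedata.normalize("NFKD", title)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", "", without_accents.lower())


def titles_match(cited: str, actual: str) -> bool:
    """True when two titles differ only in formatting."""
    return normalize_title(cited) == normalize_title(actual)


print(titles_match("BERT", "B-ERT"))
print(titles_match("pre-trained", "pretrained"))
```

Exact equality on the normalized key is deliberately strict; a production matcher would likely add a similarity threshold to tolerate small typos.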

Sample Output

Web UI

[Screenshot: RefChecker Web UI]

CLI — Single Paper

📄 Processing: Attention Is All You Need
   URL: https://arxiv.org/abs/1706.03762

[1/45] Neural machine translation in linear time
       Nal Kalchbrenner et al. | 2017
       ⚠️  Warning: Year mismatch: cited '2017', actual '2016'

[2/45] Effective approaches to attention-based neural machine translation
       Minh-Thang Luong et al. | 2015
       ❌ Error: First author mismatch: cited 'Minh-Thang Luong', actual 'Thang Luong'

[3/45] Deep Residual Learning for Image Recognition
       Kaiming He et al. | 2016 | https://doi.org/10.1109/CVPR.2016.91
       ❌ Error: DOI mismatch: cited '10.1109/CVPR.2016.91', actual '10.1109/CVPR.2016.90'

============================================================
📋 SUMMARY
📚 Total references processed: 68
❌ Total errors: 55  ⚠️ Total warnings: 16  ❓ Unverified: 15

CLI — Hallucination Flagging

[5/7] Efficient Neural Network Pruning Using Iterative Sparse Retraining
      Shuang Li, Yifan Chen | 2019
      ❓ Could not verify
      🚩 Hallucination assessment: LIKELY
         A web search for the exact title and authors yields no results in any
         academic database. The paper does not appear in ICML 2019 proceedings,
         indicating it is probably fabricated.

Install

PyPI (recommended)

pip install academic-refchecker[llm,webui]  # Web UI + CLI + LLM providers
pip install academic-refchecker[llm]        # CLI + LLM providers; recommended for best extraction and hallucination checks
pip install academic-refchecker             # CLI only; PDFs can still fall back to GROBID when available

From Source (development)

git clone https://github.com/markrussinovich/refchecker.git && cd refchecker
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[llm,webui]"
pip install -r requirements-dev.txt                  # pytest, playwright, etc.

Requirements: Python 3.11+. Node.js 20.19+ is only needed for Web UI frontend development.


Web UI

The Web UI provides real-time progress, check history, batch tracking, and one-click export of corrections.

LLM extraction is preferred, but PDF uploads and direct PDF URLs can fall back to GROBID. Hallucination checks use a separate hallucination LLM selection when one is configured; otherwise the UI falls back to the selected extraction LLM only if that provider supports web search. Local vLLM can be used for extraction, but hallucination checks require OpenAI, Anthropic, Google, or Azure.

refchecker-webui                    # default: http://localhost:8000
refchecker-webui --port 9000        # custom port

Key features:

  • Single check — paste an ArXiv URL/ID or upload a PDF/BibTeX/LaTeX file
  • Bulk check — upload multiple files (up to 50) or a single ZIP archive; papers are grouped into a batch with a progress bar
  • Bulk URL list — paste up to 50 URLs or ArXiv IDs (one per line) to check in a single batch
  • Status dashboard — filterable badge counts for errors, warnings, unverified, and hallucinated references
  • Reference cards — per-reference details with corrections, source links (Semantic Scholar, ArXiv, DOI), and hallucination assessment
  • Export — download corrections as Markdown, plain text, or BibTeX
  • History sidebar — browse and re-run previous checks; batches are grouped together
  • Settings — separate extraction and hallucination LLM provider/model selection, API key management, Semantic Scholar key validation, local database directory, dark/light/system theme

Frontend Development

cd web-ui && npm install && npm start     # http://localhost:5173

Or run backend and frontend separately:

# Terminal 1 — Backend
python -m uvicorn backend.main:app --reload --port 8000

# Terminal 2 — Frontend
cd web-ui && npm run dev

See web-ui/README.md for more.


CLI

# ArXiv (ID or URL)
academic-refchecker --paper 1706.03762
academic-refchecker --paper https://arxiv.org/abs/1706.03762

# Local files (PDF, LaTeX, text, BibTeX)
academic-refchecker --paper paper.pdf
academic-refchecker --paper paper.tex
academic-refchecker --paper refs.bib

# With LLM extraction (recommended for complex bibliographies)
academic-refchecker --paper paper.pdf --llm-provider anthropic

# Save human-readable output
academic-refchecker --paper 1706.03762 --output-file errors.txt

# Save structured report (JSON, JSONL, CSV, or text)
academic-refchecker --paper 1706.03762 --report-file report.json --report-format json

# Bulk: check a list of papers
academic-refchecker --paper-list papers.txt --report-file report.json

# OpenReview: fetch and scan an entire venue
academic-refchecker --openreview iclr2024 --report-file report.json

# OpenReview: fetch the paper list only and save it to a custom path
academic-refchecker --openreview aistats2025 --openreview-list-only --openreview-output-file paper_lists/aistats2025.txt

All CLI Options

Input (choose one):
  --paper PAPER              ArXiv ID, URL, PDF, LaTeX, text, or BibTeX file
  --paper-list PATH          Newline-delimited file of paper specs (URLs, IDs, paths)
  --openreview VENUE         Fetch papers from a supported OpenReview venue (iclr, icml, aistats, uai, corl)
  --openreview-status MODE   accepted (default) or submitted
  --openreview-list-only     Fetch the OpenReview paper list and exit without scanning
  --openreview-output-file PATH
                            Custom path for the generated OpenReview paper list

LLM:
  --llm-provider PROVIDER    openai, anthropic, google, azure, or vllm
  --llm-model MODEL          Override the default model for the provider
  --llm-endpoint URL         Custom endpoint (e.g. local vLLM server)
  --llm-parallel-chunks      Enable parallel LLM chunk processing (default)
  --llm-no-parallel-chunks   Disable parallel LLM chunk processing
  --llm-max-chunk-workers N  Max workers for parallel LLM chunks (default: 4)
  --hallucination-provider PROVIDER
                            Separate provider for deep hallucination checks: openai, anthropic, google, or azure
  --hallucination-model MODEL
                            Override the hallucination-check model for the provider
  --hallucination-endpoint URL
                            Custom endpoint for the hallucination-check provider

Verification:
  --database-dir PATH        Directory containing local DBs: semantic_scholar.db, openalex.db, crossref.db, dblp.db, acl_anthology.db
  --s2-db PATH               Path to local Semantic Scholar database
  --openalex-db PATH         Path to local OpenAlex database
  --crossref-db PATH         Path to local CrossRef database
  --dblp-db PATH             Path to local DBLP database
  --acl-db PATH              Path to local ACL Anthology database
  --update-databases         Install/update configured local databases
  --openalex-since DATE      Only ingest OpenAlex partitions newer than YYYY-MM-DD during updates
  --openalex-min-year YEAR   Only ingest OpenAlex works published in YEAR or later during updates
  --db-path PATH             (Deprecated) alias for --s2-db
  --semantic-scholar-api-key KEY   Override SEMANTIC_SCHOLAR_API_KEY env var
  --disable-parallel         Run verification sequentially
  --max-workers N            Max parallel verification threads (default: 6)

Output:
  --output-file [PATH]       Human-readable output (default: reference_errors.txt)
  --report-file PATH         Structured report (includes hallucination assessments)
  --report-format FORMAT     json (default), jsonl, csv, or text
  --debug                    Verbose logging

Hallucination Detection

RefChecker automatically evaluates suspicious references for potential fabrication using deterministic filters, LLM deep web search, and metadata reverification.

Stage 1 — Deterministic Pre-filter (no LLM needed)

References are flagged for deeper inspection when they exhibit:

  • Unverified status — not found in Semantic Scholar, OpenAlex, CrossRef, DBLP, or ACL Anthology
  • Author overlap below 60% — fewer than 60% of cited authors match any known paper (applies to references with 3+ authors)
  • Identifier conflicts — DOI or ArXiv ID resolves to a different paper
  • URL verification failure — cited URL is broken or points to a different paper

References with only minor issues (year off by one, venue variation) are not flagged.
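The four pre-filter conditions above can be sketched as a single function. This is a minimal illustration under stated assumptions: the CheckResult fields, the exact-name overlap calculation, and the reason labels are hypothetical, and RefChecker's real author comparison is likely fuzzier.

```python
# Minimal sketch of the Stage 1 pre-filter rules described above.
# Assumptions: CheckResult, author_overlap, and the reason labels are
# hypothetical; names are compared exactly (case-insensitive).
from dataclasses import dataclass


@dataclass
class CheckResult:
    verified: bool
    cited_authors: list[str]
    matched_authors: list[str]
    identifier_conflict: bool = False
    url_failed: bool = False


def author_overlap(cited: list[str], matched: list[str]) -> float:
    """Fraction of cited authors that also appear in the matched paper."""
    if not cited:
        return 1.0
    known = {name.casefold() for name in matched}
    hits = sum(1 for name in cited if name.casefold() in known)
    return hits / len(cited)


def flag_reasons(result: CheckResult) -> list[str]:
    """Return the reasons a reference should go on to the LLM stage (empty if none)."""
    reasons = []
    if not result.verified:
        reasons.append("unverified")
    # The overlap rule only applies to references with 3 or more authors.
    if (len(result.cited_authors) >= 3
            and author_overlap(result.cited_authors, result.matched_authors) < 0.60):
        reasons.append("low_author_overlap")
    if result.identifier_conflict:
        reasons.append("identifier_conflict")
    if result.url_failed:
        reasons.append("url_failure")
    return reasons
```

A reference with an empty reason list skips the LLM stage entirely, which is why this stage needs no LLM.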

Stage 2 — LLM Deep Web Search

Flagged references are sent to the configured hallucination LLM for a mandatory web search. The LLM must look for a dedicated page for the cited work, not just a citation in another paper's reference list. It returns a short verdict plus the best link it found and any found title, authors, and year.

Supported hallucination-check providers are OpenAI, Anthropic, Google, and Azure. The CLI can use the extraction provider when it is hallucination-capable, or you can pass --hallucination-provider / --hallucination-model to use a different model. The Web UI exposes the same split as separate extraction and hallucination selectors in Settings.

Stage 3 — Reverification Against LLM-Found Metadata

When the LLM says the reference is probably real (UNLIKELY) and provides found metadata, RefChecker re-runs its normal title, author, and year comparison against that LLM-found metadata. This catches cases where a database lookup matched the wrong edition, version, or similarly titled work. If the cited title/authors/year match the LLM-found source, stale unverified or wrong-match errors can be cleared and the LLM-found URL is added as an llm_verified source. If substantive mismatches remain, the reference stays an error rather than being blindly upgraded.

If the LLM cannot find an exact source, or finds only a similar paper with different authors or identifiers, the reference remains suspicious and can be marked as a likely hallucination.
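The reverification decision described above can be sketched as follows. This is an illustration only: the reverify function, its return shape, and the strict field comparison are assumptions, whereas RefChecker reuses its normal (more tolerant) title, author, and year comparison.

```python
# Minimal sketch of the Stage 3 decision, assuming a strict field comparison.
# reverify and its return structure are hypothetical; only the verdict names
# and the llm_verified source label come from the description above.
from typing import Optional


def _key(value) -> str:
    """Comparison key that ignores case, spacing, and punctuation."""
    return "".join(ch for ch in str(value).casefold() if ch.isalnum())


def reverify(cited: dict, verdict: str, found: Optional[dict]) -> dict:
    # Without an UNLIKELY verdict plus found metadata, the reference stays suspicious.
    if verdict != "UNLIKELY" or not found:
        return {"status": "suspicious", "mismatches": [], "sources": []}
    mismatches = [
        field for field in ("title", "authors", "year")
        if _key(cited.get(field, "")) != _key(found.get(field, ""))
    ]
    if mismatches:
        # Substantive differences remain, so the reference is not upgraded.
        return {"status": "error", "mismatches": mismatches, "sources": []}
    return {
        "status": "verified",
        "mismatches": [],
        "sources": [{"type": "llm_verified", "url": found.get("url")}],
    }
```

The point of the second comparison is that an LLM verdict alone never clears an error; the cited metadata still has to agree with what the LLM found.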

Each reference receives a verdict:

| Verdict | Meaning |
|---|---|
| 🚩 LIKELY | Probably fabricated: no exact source was found, or the found source conflicts substantially with the citation |
| UNCERTAIN | Inconclusive: may exist but could not be confirmed |
| UNLIKELY | Probably real: found on a dedicated page with matching title/authors, then rechecked against the cited metadata |

Hallucination assessments appear inline in CLI output, in Web UI reference cards, and in structured reports (JSON/JSONL/CSV) via the hallucination_assessment field.


AI-Generated Text Detection

Opt-in and advisory only. AI-text detection is unreliable on academic, technical, and non-native-English writing, and on human text polished with AI. RefChecker frames every result as a low/medium/high likelihood band with a permanent disclaimer — never a binary verdict or proof of misconduct, and never a basis for an accusation, grade, or decision. Below ~300 words, or on equation/code/citation-heavy passages, it abstains (inconclusive).

When enabled (Settings → AI Detection), each checked article's body text is analyzed for AI-generated likelihood, in single and batch modes. If both reference checking and AI detection are on, they run in parallel. Results show a band + score, an optional list of advisory flagged passages, and the engine/model used.

Detection engines (pick one in Settings)

| Engine | What it is | Cost | Notes |
|---|---|---|---|
| Local model (default) | desklib/ai-text-detector (DeBERTa-v3, MIT) run offline via ONNX/Transformers | Free | One-time download (managed in Settings); calibrated, reproducible; no data leaves your machine |
| LLM judge | Reuses your configured LLM provider (OpenAI/Anthropic/Google/Azure) with an anti-false-positive rubric | LLM tokens | Uncalibrated, so it is hard-capped at "medium" and can never raise a standalone "high" |
| External API | Pangram or GPTZero | Per-word $ | Requires an API key and explicit consent (your manuscript text is sent to a third party) |

Usage & cost tracking

AI-detection work is metered in the same per-check token/$ badge under an "AI-generated-text detection" flow: the local model records the processed word count at $0; the API backends record words sent plus an estimated dollar cost; the LLM-judge records real input/output tokens and their cost.

Graph 2nd-degree expansion

In the Graph tab, the 2nd-degree expansion has a "Refs only" vs "+ AI-gen" toggle. With "+ AI-gen", each expanded article also gets an AI-likelihood ring (red = high, amber = medium), estimated locally from its abstract (free, offline). Abstracts are short, so most come back inconclusive — this is an advisory signal, never a full-text analysis.

Sources & credits

The detection engines build on the open-source projects and services named in the engine table above.

On the reliability of detectors for academic/non-native-English text, see Liang et al., arXiv:2304.02819.


Bulk Checking

Web UI

Upload multiple files or a ZIP archive to check up to 50 papers in a single batch. Alternatively, paste a list of URLs or ArXiv IDs (one per line). Batches track progress per paper and appear as a group in the history sidebar.

Supported file types: PDF, TXT, TEX, BIB, BBL, ZIP.

CLI

Create a text file with one paper per line (ArXiv IDs, URLs, or local file paths):

1706.03762
https://openreview.net/pdf?id=ZG3RaNIsO8
paper/local_sample.bib
/path/to/paper.pdf

Then run:

academic-refchecker --paper-list papers.txt --report-file bulk_report.json

The report includes per-paper rollups and a cross-paper summary with flagged reference counts.
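If the papers live in a folder, the list file can be generated rather than typed by hand. The sketch below is one possible approach; write_paper_list and build_command are hypothetical helper names, while --paper-list and --report-file are the flags shown above.

```python
# Minimal sketch: build papers.txt from explicit specs plus a folder of PDFs.
# write_paper_list and build_command are hypothetical helpers.
from pathlib import Path


def write_paper_list(specs, folder, out_path):
    """Write one paper spec per line: given IDs/URLs first, then sorted PDF paths."""
    lines = list(specs)
    lines += sorted(str(path) for path in Path(folder).glob("*.pdf"))
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def build_command(list_path, report_path):
    """Assemble the bulk-check command as an argument list."""
    return [
        "academic-refchecker",
        "--paper-list", str(list_path),
        "--report-file", str(report_path),
    ]


# Example (not executed here):
#   import subprocess
#   write_paper_list(["1706.03762"], "papers/", "papers.txt")
#   subprocess.run(build_command("papers.txt", "bulk_report.json"), check=True)
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.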


OpenReview Integration

Scan all accepted (or submitted) papers for an OpenReview venue in one command:

# Scan accepted papers
academic-refchecker --openreview iclr2024 --report-file report.json

# Scan all public submissions instead
academic-refchecker --openreview iclr2024 --openreview-status submitted --report-file report.json

Supported venues: ICLR, ICML, AISTATS, UAI, and CoRL.

Use shorthands like iclr2024, icml2025, aistats2025, uai2025, or corl2025.

The command fetches the paper list from OpenReview, writes it to output/openreview_<venue>_<status>.txt by default, and then runs a bulk scan. Use --openreview-list-only to generate the list without running verification, and --openreview-output-file to choose the output path. The structured report includes per-paper rollups with flagged record counts and error-type distributions, making it easy to triage an entire conference for citation problems.


Output & Reports

Result Types

| Type | Description | Examples |
|---|---|---|
| ❌ Error | Critical issues needing correction | Author/title/DOI mismatches, incorrect ArXiv IDs |
| ⚠️ Warning | Minor issues to review | Year differences, venue variations |
| ℹ️ Suggestion | Recommended improvements | Add missing ArXiv/DOI URLs |
| ❓ Unverified | Could not verify against any source | Rare publications, preprints |
| 🚩 Hallucination | Likely fabricated reference | Unverifiable with rich metadata, identifier conflicts |

Structured Reports

Write machine-readable reports with --report-file and --report-format:

academic-refchecker --paper 1706.03762 --report-file report.json --report-format json

Example JSON report structure:

{
  "generated_at": "2026-03-15T19:50:52Z",
  "summary": {
    "total_papers_processed": 1,
    "total_references_processed": 7,
    "total_errors_found": 2,
    "total_warnings_found": 2,
    "total_unverified_refs": 4,
    "flagged_records": 3,
    "flagged_papers": 1
  },
  "papers": [
    {
      "source_paper_id": "local_hallucination_7ref_sample",
      "source_title": "Hallucination 7Ref Sample",
      "total_records": 6,
      "flagged_records": 3,
      "max_flag_level": "high",
      "error_type_counts": { "unverified": 3, "multiple": 2, "year (v1 vs v2 update)": 1 },
      "reason_counts": { "unverified": 3, "web_search_not_found": 3 }
    }
  ],
  "records": [
    {
      "ref_title": "Deep Residual Learning for Image Recognition",
      "ref_authors_cited": "Jian He, Xiangyu Zhang, Shaoqing Ren, Jian Sun",
      "ref_authors_correct": "Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun",
      "error_type": "multiple",
      "error_details": "- First author mismatch ...\n- Year mismatch ...",
      "ref_corrected_bibtex": "@inproceedings{he2016resnet, ... year = {2015} ...}",
      "hallucination_assessment": { "verdict": "UNLIKELY", "explanation": "..." }
    }
  ]
}

CLI output examples:

❌ Error: First author mismatch: cited 'Jian He', actual 'Kaiming He'
❌ Error: DOI mismatch: cited '10.5555/3295222.3295349', actual '10.48550/arXiv.1706.03762'
⚠️ Warning: Year mismatch: cited '2019', actual '2018'
ℹ️ Suggestion: Add ArXiv URL https://arxiv.org/abs/1706.03762
❓ Could not verify: Llama guard (M. A. Research, 2024)
🚩 Hallucination assessment: LIKELY — no matching paper found in academic databases

Each report record includes the original reference, error details, corrected metadata (BibTeX, plain text, bibitem), verified URLs, and hallucination assessment when applicable.
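A JSON report with the structure shown above can be summarized in a few lines of Python. This sketch relies only on the fields visible in the example (records, error_type, ref_title, hallucination_assessment.verdict); summarize_report and load_report are hypothetical helper names, and real reports may contain additional fields.

```python
# Minimal sketch: summarize a RefChecker JSON report.
# Assumes only the fields visible in the example report structure.
import json
from collections import Counter


def load_report(path: str) -> dict:
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_report(report: dict) -> dict:
    """Count records by error type and verdict, and list likely fabrications."""
    records = report.get("records", [])
    error_types = Counter(rec.get("error_type", "unknown") for rec in records)
    verdicts = Counter(
        (rec.get("hallucination_assessment") or {}).get("verdict", "none")
        for rec in records
    )
    likely = [
        rec.get("ref_title", "")
        for rec in records
        if (rec.get("hallucination_assessment") or {}).get("verdict") == "LIKELY"
    ]
    return {
        "error_types": dict(error_types),
        "verdicts": dict(verdicts),
        "likely_fabricated": likely,
    }


# Example (not executed here):
#   print(summarize_report(load_report("report.json")))
```

Records without a hallucination assessment are counted under "none" so the verdict totals always add up to the number of records.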


Deployment

Docker

Pre-built multi-architecture images are published to GitHub Container Registry on every release.

# Quick start
docker run -p 8000:8000 ghcr.io/markrussinovich/refchecker:latest

# With LLM API key (recommended)
docker run -p 8000:8000 -e ANTHROPIC_API_KEY=your_key ghcr.io/markrussinovich/refchecker:latest

# Persistent data
docker run -p 8000:8000 \
  -e ANTHROPIC_API_KEY=your_key \
  -v refchecker-data:/app/data \
  ghcr.io/markrussinovich/refchecker:latest

Other LLM providers:

docker run -p 8000:8000 -e OPENAI_API_KEY=your_key ghcr.io/markrussinovich/refchecker:latest
docker run -p 8000:8000 -e GOOGLE_API_KEY=your_key ghcr.io/markrussinovich/refchecker:latest

Docker Compose

git clone https://github.com/markrussinovich/refchecker.git && cd refchecker
cp .env.example .env   # Add your API keys
docker compose up -d
docker compose logs -f    # View logs
docker compose down       # Stop
docker compose pull       # Update to latest

| Tag | Description | Arch | Size |
|---|---|---|---|
| latest | Latest stable release | amd64, arm64 | ~800MB |
| X.Y.Z | Specific version (e.g., 2.0.18) | amd64, arm64 | ~800MB |

Multi-User Server (OAuth)

By default, RefChecker runs in single-user mode — no login required. Enable multi-user mode for shared deployments where each visitor signs in via OAuth. If the server has LLM provider environment variables such as ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, or AZURE_OPENAI_API_KEY, the Web UI exposes those providers as selectable server-environment configs without revealing the secret. Users can still enter their own keys to override the server key for their browser session; user-entered keys are stored in the browser's localStorage and sent per-request — never stored on the server.

1. Generate a JWT Secret Key

python -c "import secrets; print(secrets.token_hex(32))"

2. Register an OAuth Application

Configure at least one provider:

| Provider | Registration URL | Callback URL |
|---|---|---|
| Google | Google Cloud Console | https://<domain>/api/auth/callback/google |
| GitHub | GitHub Developer Settings | https://<domain>/api/auth/callback/github |
| Microsoft | Azure App Registrations | https://<domain>/api/auth/callback/microsoft |

3. Configure Environment Variables

cp .env.example .env
REFCHECKER_MULTIUSER=true
JWT_SECRET_KEY=<output from step 1>
SITE_URL=https://<your-domain>
HTTPS_ONLY=true

# At least one OAuth provider
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...

GITHUB_CLIENT_ID=...
GITHUB_CLIENT_SECRET=...

MS_CLIENT_ID=...
MS_CLIENT_SECRET=...

# Optional
REFCHECKER_ADMINS=github:you  # comma-separated; first sign-in is auto-admin
MAX_CHECKS_PER_USER=3         # max concurrent checks per user (default: 3)
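Before launching, it can help to sanity-check the environment. The sketch below is a hypothetical pre-flight check, not something RefChecker ships: it uses the variable names from the file above and assumes that multi-user mode needs JWT_SECRET_KEY plus at least one complete CLIENT_ID / CLIENT_SECRET pair.

```python
# Minimal sketch of a pre-flight check for multi-user settings.
# Assumption: JWT_SECRET_KEY and one complete OAuth pair are required;
# multiuser_config_problems is a hypothetical helper.
import os

OAUTH_PREFIXES = ("GOOGLE", "GITHUB", "MS")


def multiuser_config_problems(env=None) -> list[str]:
    """Return human-readable problems; an empty list means the config looks usable."""
    env = os.environ if env is None else env
    if env.get("REFCHECKER_MULTIUSER", "").lower() != "true":
        return []  # single-user mode needs none of these settings
    problems = []
    if not env.get("JWT_SECRET_KEY"):
        problems.append("JWT_SECRET_KEY is not set")
    complete = 0
    for prefix in OAUTH_PREFIXES:
        has_id = bool(env.get(f"{prefix}_CLIENT_ID"))
        has_secret = bool(env.get(f"{prefix}_CLIENT_SECRET"))
        if has_id and has_secret:
            complete += 1
        elif has_id or has_secret:
            problems.append(f"{prefix}_CLIENT_ID and {prefix}_CLIENT_SECRET must be set together")
    if complete == 0:
        problems.append("no OAuth provider is fully configured")
    return problems


if __name__ == "__main__":
    for problem in multiuser_config_problems():
        print("config problem:", problem)
```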

4. Launch

docker compose up -d

Or without Docker:

pip install "academic-refchecker[llm,webui]"
REFCHECKER_MULTIUSER=true JWT_SECRET_KEY=<secret> GOOGLE_CLIENT_ID=... GOOGLE_CLIENT_SECRET=... \
  refchecker-webui --port 8000

Verify:

curl http://localhost:8000/api/auth/providers
# {"providers":["google","github"]}

Notes:

  • The first user to sign in is automatically admin. Add more via REFCHECKER_ADMINS.
  • Each user may run up to MAX_CHECKS_PER_USER concurrent checks (default 3). Once the limit is reached, a further check (the 4th with the default) returns HTTP 429.
  • The CLI is unaffected — academic-refchecker works without any auth configuration.
  • Place the server behind a TLS-terminating reverse proxy (nginx, Caddy) for HTTPS.
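The per-user concurrency limit in the notes above can be pictured with a small sketch. CheckLimiter is a hypothetical class written for illustration; the server's actual implementation is not shown in this README.

```python
# Minimal sketch of a per-user concurrent-check limit.
# CheckLimiter is hypothetical; only the limit (default 3) and the
# HTTP 429 response come from the notes above.
import threading
from collections import defaultdict


class CheckLimiter:
    def __init__(self, max_checks_per_user: int = 3):
        self._max = max_checks_per_user
        self._active = defaultdict(int)
        self._lock = threading.Lock()

    def try_start(self, user_id: str) -> int:
        """Return 200 if the check may start, or 429 if the user is at the limit."""
        with self._lock:
            if self._active[user_id] >= self._max:
                return 429
            self._active[user_id] += 1
            return 200

    def finish(self, user_id: str) -> None:
        """Release one slot when a check completes."""
        with self._lock:
            if self._active[user_id] > 0:
                self._active[user_id] -= 1


limiter = CheckLimiter()
print([limiter.try_start("github:you") for _ in range(4)])
```

Because the counter tracks checks in flight rather than requests per minute, a user regains capacity as soon as one of their checks finishes.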

Deploy to Render

RefChecker includes a render.yaml Blueprint for one-click deployment to Render:

  1. Fork this repo (or connect your own copy).
  2. On Render, click New + → Blueprint, then select the repo.
  3. Render reads render.yaml and creates the service with a persistent disk.
  4. Set environment variables in the Render dashboard (Environment tab):
    • SITE_URL — your public URL including https:// (must match exactly — OAuth fails otherwise).
    • HTTPS_ONLY=true for production.
    • REFCHECKER_DATA_DIR=/data (matches the persistent disk mount).
    • At least one OAuth provider's CLIENT_ID / CLIENT_SECRET.
  5. Register each provider's callback URL as https://<your-url>/api/auth/callback/{google,github,microsoft}.

Note: The persistent disk at /data stores the SQLite database and uploaded files, so data survives redeployments. For other PaaS hosts (Railway, Fly.io), the same Docker image works — set PORT, REFCHECKER_DATA_DIR, and the auth env vars.


Configuration

LLM Providers

LLM-powered extraction improves accuracy with complex bibliographies. Hallucination detection is configured separately so you can use one model for extraction and another, web-search-capable model for deep hallucination checks. Claude Sonnet 4 performs best for extraction; GPT-4o may hallucinate DOIs.

| Provider | Env Variable | Example Model |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-6 |
| OpenAI | OPENAI_API_KEY | gpt-4.1 |
| Google | GOOGLE_API_KEY | gemini-3.1-flash-lite-preview |
| Azure | AZURE_OPENAI_API_KEY | gpt-4.1 |
| vLLM (local) | | meta-llama/Llama-3.3-70B-Instruct |

When running the Web UI, provider keys present in the server environment are added automatically as selectable LLM configurations in both single-user and multi-user mode. The key value is not returned to the browser; users can still enter a browser/session key to override the server environment key for their own run.

export ANTHROPIC_API_KEY=your_key
academic-refchecker --paper 1706.03762 --llm-provider anthropic

academic-refchecker --paper paper.pdf --llm-provider openai --llm-model gpt-4.1
academic-refchecker --paper paper.pdf --llm-provider vllm --llm-model meta-llama/Llama-3.3-70B-Instruct

# Use one model for extraction and another for hallucination checks
academic-refchecker --paper paper.pdf \
  --llm-provider vllm --llm-model meta-llama/Llama-3.3-70B-Instruct \
  --hallucination-provider anthropic --hallucination-model claude-sonnet-4-6

Hallucination-capable providers are OpenAI, Anthropic, Google, and Azure. vLLM can extract references but cannot perform live web search, so pair it with --hallucination-provider when you want hallucination checks.
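When scripting runs, the pairing rule above can be enforced before the CLI is invoked. The sketch below only assembles the argument list from the flags shown in this section; build_command is a hypothetical helper, and the identifiers "google" and "azure" are assumed to follow the same lowercase pattern as the others.

```python
import shlex
from typing import Optional

# Providers that can run hallucination checks, per the paragraph above.
HALLUCINATION_PROVIDERS = {"openai", "anthropic", "google", "azure"}


def build_command(
    paper: str,
    llm_provider: str,
    llm_model: str,
    hallucination_provider: Optional[str] = None,
    hallucination_model: Optional[str] = None,
) -> list[str]:
    """Assemble an academic-refchecker command line as an argument list."""
    cmd = ["academic-refchecker", "--paper", paper,
           "--llm-provider", llm_provider, "--llm-model", llm_model]
    if hallucination_provider is not None:
        if hallucination_provider not in HALLUCINATION_PROVIDERS:
            raise ValueError(
                f"{hallucination_provider!r} cannot run hallucination checks"
            )
        cmd += ["--hallucination-provider", hallucination_provider]
        if hallucination_model:
            cmd += ["--hallucination-model", hallucination_model]
    return cmd


if __name__ == "__main__":
    cmd = build_command("paper.pdf", "vllm", "meta-llama/Llama-3.3-70B-Instruct",
                        "anthropic", "claude-sonnet-4-6")
    print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd) to execute it
```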

Local Models (vLLM)

Run an OpenAI-compatible vLLM server for local inference:

pip install "academic-refchecker[vllm]"
python scripts/start_vllm_server.py --model meta-llama/Llama-3.3-70B-Instruct --port 8001
academic-refchecker --paper paper.pdf --llm-provider vllm --llm-endpoint http://localhost:8001/v1
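
Before pointing RefChecker at the local server, it may be worth confirming that the endpoint answers. OpenAI-compatible servers expose a model listing under /models relative to the base URL, so a small check could look like the sketch below. The helper names are hypothetical, and the port matches the example above.

```python
import json
import urllib.request


def models_url(endpoint: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base endpoint."""
    return endpoint.rstrip("/") + "/models"


def list_models(endpoint: str, timeout: float = 5.0) -> list[str]:
    """Return the model ids the server reports; raises if it is unreachable."""
    with urllib.request.urlopen(models_url(endpoint), timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["id"] for model in payload.get("data", [])]


if __name__ == "__main__":
    try:
        print(list_models("http://localhost:8001/v1"))
    except OSError as exc:  # URLError and timeouts are OSError subclasses
        print("vLLM server not reachable:", exc)
```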

Environment Variables

# LLM
export REFCHECKER_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=your_key           # Also: OPENAI_API_KEY, GOOGLE_API_KEY

# Performance
export SEMANTIC_SCHOLAR_API_KEY=your_key    # Higher rate limits / faster verification

Local Database

For offline verification or faster processing:

python scripts/download_db.py \
  --field "computer science" \
  --start-year 2020 --end-year 2024

academic-refchecker --paper paper.pdf --s2-db semantic_scholar_db/semantic_scholar.db
academic-refchecker --paper paper.pdf --database-dir /path/to/local-db-folder
academic-refchecker --database-dir /path/to/local-db-folder --update-databases
academic-refchecker --database-dir /path/to/local-db-folder --update-databases --openalex-min-year 2020

--update-databases now refreshes local S2, DBLP, and OpenAlex databases when those paths are configured. DBLP follows Hallucinator's offline-dump approach by downloading and parsing dblp.xml.gz. OpenAlex follows Hallucinator's S3 snapshot model and can be scoped with --openalex-since or --openalex-min-year to avoid a full build. CrossRef remains API-first: RefChecker still uses live CrossRef lookups, and offline CrossRef population is not automated yet.

When the Web UI has local databases configured, it scans REFCHECKER_DATABASE_DIRECTORY for well-formed DB names (semantic_scholar.db, openalex.db, crossref.db, dblp.db) and schedules asynchronous background refresh tasks for discovered DBs. Background refresh uses the bundled local database updater for discovered S2, DBLP, and OpenAlex files. The downloader also writes a latest_snapshot.txt file next to the SQLite database for operator visibility, while the Web UI shows the current snapshot from the database metadata in the settings panel.
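
The directory scan described above can be approximated in a few lines. This sketch only illustrates the file-name convention; discover_databases is a hypothetical name, and the real Web UI may apply additional checks before scheduling a refresh.

```python
import os
from pathlib import Path

# Well-formed database file names listed in the paragraph above.
KNOWN_DB_NAMES = ("semantic_scholar.db", "openalex.db", "crossref.db", "dblp.db")


def discover_databases(directory: str) -> dict[str, Path]:
    """Map each recognized database file name to its path, if the file exists."""
    root = Path(directory)
    return {
        name: root / name
        for name in KNOWN_DB_NAMES
        if (root / name).is_file()
    }


if __name__ == "__main__":
    db_dir = os.environ.get("REFCHECKER_DATABASE_DIRECTORY", ".")
    for name, path in discover_databases(db_dir).items():
        print(f"{name}: {path}")
```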


Documentation

Detailed project documentation lives under docs/README.md, including the Web UI guide and testing guide.


Testing

680+ tests covering unit, integration, and end-to-end scenarios.

pytest tests/               # All tests
pytest tests/unit/          # Unit only
pytest tests/e2e/           # End-to-end (Playwright)
pytest --cov=src tests/     # With coverage
make clean                  # Remove generated local artifacts (logs, debug output, cache, build files)

See tests/README.md for details.


License

MIT License — see LICENSE.
