Detect hallucinated references in academic papers

hallucinator-rs

Rust implementation of the Hallucinated Reference Detector. Includes a CLI and an interactive terminal UI (TUI) for batch-processing PDFs and archives.

Same validation engine as the Python version — queries 10 academic databases in parallel, fuzzy-matches titles, checks for retractions — but with a native async runtime and a full-screen TUI for working through large batches interactively.


Building

Requires a Rust toolchain. Install from rustup.rs or rust-lang.org/tools/install.

cd hallucinator-rs
cargo build --release

Binaries are placed in target/release/:

  • hallucinator-cli — command-line interface
  • hallucinator-tui — terminal UI

CLI

# Check a PDF
hallucinator-cli check paper.pdf

# With offline databases (recommended)
hallucinator-cli check --dblp-offline=dblp.db --acl-offline=acl.db paper.pdf

# With API keys
hallucinator-cli check --openalex-key=KEY --s2-api-key=KEY paper.pdf

# Save output to file
hallucinator-cli check --output=report.log paper.pdf

# Disable specific databases
hallucinator-cli check --disable-dbs=OpenAlex,PubMed paper.pdf

# No color
hallucinator-cli check --no-color paper.pdf

CLI Options

Option                    Description
--openalex-key=KEY        OpenAlex API key
--s2-api-key=KEY          Semantic Scholar API key
--dblp-offline=PATH       Path to offline DBLP database
--acl-offline=PATH        Path to offline ACL Anthology database
--output=PATH             Write output to file
--no-color                Disable colored output
--disable-dbs=CSV         Comma-separated database names to skip
--check-openalex-authors  Flag author mismatches from OpenAlex (off by default)
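
As an illustration of how a --disable-dbs value could be turned into the set of databases that stay enabled, here is a minimal sketch. The helper name enabled_databases, the case-insensitive matching, and the exact spelling of the database names are assumptions made for this example; the names are taken from the Databases table in this README, and the real CLI may accept different identifiers.

```rust
use std::collections::HashSet;

/// Database names as they appear in this README's "Databases" table.
/// Assumption: the CLI may spell its identifiers differently.
const ALL_DATABASES: [&str; 10] = [
    "CrossRef", "arXiv", "DBLP", "Semantic Scholar", "ACL Anthology",
    "NeurIPS", "SSRN", "Europe PMC", "PubMed", "OpenAlex",
];

/// Hypothetical helper: returns the databases that remain enabled after
/// applying a `--disable-dbs` value such as "OpenAlex,PubMed".
/// Assumption: matching ignores case and surrounding whitespace.
fn enabled_databases(disable_csv: &str) -> Vec<&'static str> {
    // Normalize each comma-separated entry and drop empty ones.
    let disabled: HashSet<String> = disable_csv
        .split(',')
        .map(|name| name.trim().to_lowercase())
        .filter(|name| !name.is_empty())
        .collect();

    // Keep every known database whose normalized name was not listed.
    ALL_DATABASES
        .iter()
        .copied()
        .filter(|name| !disabled.contains(&name.to_lowercase()))
        .collect()
}
```

Under these assumptions, enabled_databases("OpenAlex,PubMed") leaves the other eight databases enabled, and an empty string leaves all ten.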

Building Offline Databases

# DBLP (~4.6 GB download; builds a SQLite database with an FTS5 index)
hallucinator-cli update-dblp dblp.db

# ACL Anthology
hallucinator-cli update-acl acl.db

TUI

The TUI is designed for processing multiple papers at once — pick files, queue them up, and watch results stream in.

# Launch with file picker
hallucinator-tui

# Pre-load PDFs or archives
hallucinator-tui paper1.pdf paper2.pdf proceedings.zip

# With options
hallucinator-tui --dblp-offline=dblp.db --acl-offline=acl.db --theme=modern

TUI Options

All CLI options above, plus:

Option                 Description
--theme hacker|modern  Color theme (default: hacker)
--mouse                Enable mouse support
--fps N                Target framerate, 1-120 (default: 30)

The TUI also has update-dblp and update-acl subcommands, same as the CLI.

Screens

File Picker — Browse directories, select PDFs or archives (ZIP, tar.gz). Archives are streamed: PDFs are extracted and queued as they're found, so processing starts immediately.

Queue — Shows all papers with real-time progress bars. Sort by order, problem count, problem %, or filename. Filter by status (all, has problems, done, running, queued). Search by filename with /.

Paper Detail — All references for a single paper. Filter to show problems only. Sort by reference number, verdict, or source database.

Reference Detail — Full info for a single reference: title, authors, raw citation, matched authors, source database, DOI/arXiv info, retraction warnings, per-database timeout status. Mark false positives as safe with Space.

Config — Edit all settings inline: API keys (masked display), database paths, disabled databases, concurrency limits, timeouts, archive size limit, theme, FPS.

Export — Save results as JSON, CSV, Markdown, plain text, or HTML. Export a single paper or all papers at once.

Key Bindings

Key            Action
j/k or arrows  Navigate
Enter          Select / confirm
Esc            Back / cancel
o              Add more PDFs to queue
e              Export results
,              Open config
s              Cycle sort order
f              Cycle filter
Space          Mark reference as safe
Tab            Toggle activity pane
?              Help screen

Configuration

Settings are loaded from (highest to lowest priority):

  1. CLI arguments
  2. Environment variables (OPENALEX_KEY, S2_API_KEY, DBLP_OFFLINE_PATH, ACL_OFFLINE_PATH, DB_TIMEOUT, DB_TIMEOUT_SHORT)
  3. Config file
  4. Defaults
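
The priority order above can be sketched as a simple fallback chain. This is only an illustration, not the project's actual loader: the function names resolve_setting and resolve_db_timeout are hypothetical, the default of 10 seconds is an assumption borrowed from the example config file, and the environment is passed in as a map so the sketch does not depend on the real process environment.

```rust
use std::collections::HashMap;

/// Resolves one setting by walking the sources from highest to lowest
/// priority: CLI argument, environment variable, config file, default.
fn resolve_setting(
    cli: Option<&str>,
    env: Option<&str>,
    file: Option<&str>,
    default: &str,
) -> String {
    cli.or(env).or(file).unwrap_or(default).to_string()
}

/// Hypothetical wrapper for the per-database timeout in seconds.
/// `env` stands in for the process environment (DB_TIMEOUT is the
/// variable named in the list above).
fn resolve_db_timeout(
    cli: Option<&str>,
    env: &HashMap<String, String>,
    file: Option<&str>,
) -> u64 {
    let from_env = env.get("DB_TIMEOUT").map(String::as_str);
    // Assumption: 10 is used as the default, mirroring the sample config.
    let raw = resolve_setting(cli, from_env, file, "10");
    // Assumption: an unparsable value falls back to the default.
    raw.parse().unwrap_or(10)
}
```

With an empty environment and no CLI flag, a config-file value is used; setting DB_TIMEOUT overrides the file, and a CLI value overrides both.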

Config File

The TUI looks for config files at:

  1. ./hallucinator.toml (current directory)
  2. ~/.config/hallucinator/config.toml (or platform equivalent via $XDG_CONFIG_HOME)

Settings changed in the TUI config screen are persisted automatically.

[api_keys]
openalex_key = "..."
s2_api_key = "..."

[databases]
dblp_offline_path = "/path/to/dblp.db"
acl_offline_path = "/path/to/acl.db"
disabled = ["OpenAlex", "PubMed"]

[concurrency]
max_concurrent_papers = 2
max_concurrent_refs = 4
db_timeout_secs = 10
db_timeout_short_secs = 5
max_archive_size_mb = 500  # 0 = unlimited

[display]
theme = "modern"
fps = 30

Offline Database Auto-Detection

If no path is specified, the tool checks:

  1. dblp.db / acl.db in the current directory
  2. ~/.local/share/hallucinator/dblp.db (or platform equivalent)
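
The lookup order above could be implemented roughly as follows. This is a sketch under stated assumptions: find_offline_db is a hypothetical name, and the caller is expected to supply the current directory and the data directory (for example ~/.local/share/hallucinator), since resolving the platform equivalent is outside the scope of the example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: picks the offline database file to use.
/// `explicit` models a path given on the command line or in the config;
/// `cwd` and `data_dir` are the two auto-detection locations.
fn find_offline_db(
    explicit: Option<&Path>,
    cwd: &Path,
    data_dir: &Path,
    file_name: &str,
) -> Option<PathBuf> {
    if let Some(path) = explicit {
        // An explicit path always wins; no auto-detection is attempted.
        return Some(path.to_path_buf());
    }
    // Check the current directory first, then the data directory.
    vec![cwd.join(file_name), data_dir.join(file_name)]
        .into_iter()
        .find(|candidate| candidate.is_file())
}
```

Calling find_offline_db(None, cwd, data_dir, "dblp.db") returns the copy in the current directory if one exists, otherwise the one in the data directory, otherwise None.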

Databases

Same 10 databases as the Python version:

Database          Coverage
CrossRef          DOIs, journal articles, conference papers
arXiv             Preprints (CS, physics, math, etc.)
DBLP              Computer science bibliography (online + offline)
Semantic Scholar  Aggregates Academia.edu, SSRN, PubMed, and more
ACL Anthology     Computational linguistics (online + offline)
NeurIPS           NeurIPS proceedings
SSRN              Social science research
Europe PMC        Life science literature (42M+ abstracts)
PubMed            Biomedical literature via NCBI
OpenAlex          250M+ works (optional, needs API key)

Each reference is checked against all enabled databases concurrently. First verified match wins (early exit).


Architecture

Workspace Crates

Crate              Purpose
hallucinator-pdf   PDF text extraction (MuPDF), reference parsing, archive handling
hallucinator-core  Validation engine, database backends, fuzzy matching, retraction checks
hallucinator-dblp  Offline DBLP database builder and querier (SQLite + FTS5)
hallucinator-acl   Offline ACL Anthology database builder and querier
hallucinator-cli   CLI binary
hallucinator-tui   Terminal UI (Ratatui)
hallucinator-web   Web interface

Concurrency Model

  • Configurable number of papers processed in parallel (TUI; max_concurrent_papers)
  • 4 references checked in parallel per paper by default (configurable via max_concurrent_refs)
  • All enabled databases queried concurrently per reference
  • Early exit on first verified match
  • Retry pass for timed-out queries at the end
  • Per-batch cancellation token for graceful stopping
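
To make the "query everything concurrently, stop at the first verified match" idea concrete, here is a small sketch. It deliberately swaps in standard-library threads and a channel in place of the async runtime and cancellation token the project describes, and it does not model the retry pass. FakeDb, Lookup, and first_verified are hypothetical names, and the latencies are simulated with sleeps.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Outcome of one simulated database lookup.
#[derive(Debug, PartialEq)]
enum Lookup {
    Verified,
    NotFound,
}

/// Hypothetical backend: a name, a simulated latency, and a canned answer.
struct FakeDb {
    name: &'static str,
    latency_ms: u64,
    answer: Lookup,
}

/// Queries every backend on its own thread and returns the name of the
/// first one to report a verified match, without waiting for slower
/// backends. Returns None if all backends finish without a match.
fn first_verified(dbs: Vec<FakeDb>) -> Option<&'static str> {
    let (tx, rx) = mpsc::channel();
    for db in dbs {
        let tx = tx.clone();
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(db.latency_ms));
            // After an early exit the receiver is gone; ignore that error.
            let _ = tx.send((db.name, db.answer));
        });
    }
    // Drop the original sender so the loop below ends once every worker
    // has reported.
    drop(tx);

    for (name, answer) in rx {
        if answer == Lookup::Verified {
            return Some(name); // early exit: later results are discarded
        }
    }
    None
}
```

In this sketch the slower workers simply run to completion in the background and their results are dropped; a cancellation token, as listed above, would let them stop early instead.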

Result Persistence

The TUI automatically saves results to ~/.cache/hallucinator/runs/<timestamp>/ as JSON, so completed work is not lost if you quit mid-batch.
