Detect hallucinated references in academic papers

Project description

Python Bindings

Python bindings for the Rust hallucinator engine, powered by PyO3 and Maturin. Extract references from academic PDFs and validate them against 10 academic databases — all from Python, with Rust-native performance.

from hallucinator import PdfExtractor, Validator, ValidatorConfig

# Extract references from a PDF
ext = PdfExtractor()
result = ext.extract("paper.pdf")
print(f"Found {len(result)} references")

# Validate them against academic databases
config = ValidatorConfig()
validator = Validator(config)
results = validator.check(result.references)

for r in results:
    print(f"[{r.status}] {r.title}")

Installation

From PyPI (recommended)

Pre-compiled wheels for Python 3.12 — no Rust toolchain needed:

pip install hallucinator

Available platforms: Linux (x86_64), macOS (x86_64 + Apple Silicon), Windows (x86_64).

From source

Requires Python 3.9+ and a Rust toolchain (rustup.rs).

cd hallucinator-rs

# Using uv (recommended)
uv venv && source .venv/bin/activate   # or .venv\Scripts\activate on Windows
uv pip install maturin
maturin develop --release

# Or with pip
pip install maturin
maturin develop --release

After installation, the hallucinator package is importable:

>>> import hallucinator
>>> hallucinator.PdfExtractor()
PdfExtractor(...)

PDF Extraction

Quick start

from hallucinator import PdfExtractor

ext = PdfExtractor()
result = ext.extract("paper.pdf")

for ref in result.references:
    print(ref.title)
    print(f"  Authors: {', '.join(ref.authors)}")
    if ref.doi:
        print(f"  DOI: {ref.doi}")

PdfExtractor

The main entry point for extraction. Wraps the Rust engine and adds support for custom Python segmentation strategies.

ext = PdfExtractor()

# Full pipeline: PDF file → ExtractionResult
result = ext.extract("paper.pdf")

# Full pipeline on already-extracted text
result = ext.extract_from_text(text)

# Individual pipeline stages
text = ext.extract_text("paper.pdf")       # Step 1: PDF → raw text
section = ext.find_section(text)            # Step 2: locate references section
segments = ext.segment(section)             # Step 3: split into individual refs
ref = ext.parse_reference(segments[0])      # Step 4: parse a single reference

Configuration

Override regex patterns and thresholds to handle non-standard paper formats.

Property	Default	Description
`section_header_regex`	Matches "References", "Bibliography", etc.	Regex to find the start of the references section
`section_end_regex`	Matches "Appendix", "Acknowledgments", etc.	Regex to find the end of the references section
`fallback_fraction`	`0.7`	Fraction of document to skip when no header found (0.7 = use last 30%)
`ieee_segment_regex`	Matches `[1]`, `[2]`, etc.	Regex for IEEE-style reference numbering
`numbered_segment_regex`	Matches `1.`, `2.`, etc.	Regex for numbered-list references
`fallback_segment_regex`	Double newline	Fallback segmentation when no numbering detected
`min_title_words`	`4`	Minimum words in a title (shorter → skipped)
`max_authors`	`15`	Cap on extracted author count per reference

ext = PdfExtractor()

# Handle Spanish papers
ext.section_header_regex = r"(?i)\n\s*(?:Bibliografía|Referencias)\s*\n"

# Accept shorter titles
ext.min_title_words = 3

# Custom venue cutoff (don't include journal name in title)
ext.add_venue_cutoff_pattern(r"(?i)\.\s*Nature\b.*$")

# Preserve compound words across line breaks
ext.add_compound_suffix("powered")   # "AI- powered" → "AI-powered"
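Before assigning a custom pattern, it can help to try it on a snippet of the text you expect. The sketch below checks the Spanish header pattern from the example with Python's `re` module. This is only an approximation: the engine compiles patterns in Rust, whose regex syntax may differ from Python's in edge cases, and the sample text is made up.

```python
import re

# Pattern copied from the Spanish example above
header = r"(?i)\n\s*(?:Bibliografía|Referencias)\s*\n"

# Made-up sample text, for illustration only
sample = (
    "...conclusiones del estudio.\n\n"
    "REFERENCIAS\n"
    "[1] García, J. Un título de ejemplo. 2020.\n"
)

match = re.search(header, sample)
# Everything after the header is where a references section would begin
section_start = match.end() if match else None
print(sample[section_start:] if match else "header not found")
```

If the pattern does not match your sample, adjust it here first; it is quicker than re-running a full extraction.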

Custom segmentation strategies

For reference formats that no regex can handle, register a Python callable:

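from __future__ import annotations  # lets the `list[str] | None` hint load on Python 3.9

import re

def paren_segmenter(text: str) -> list[str] | None:
    """Split references numbered as (1), (2), (3)..."""
    # (?:^|\n) also matches a "(1)" marker on the very first line of the section
    parts = re.split(r'(?:^|\n)\s*\(\d+\)\s+', text)
    parts = [p.strip() for p in parts if p.strip()]
    return parts if len(parts) >= 3 else None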

ext = PdfExtractor()
ext.add_segmentation_strategy(paren_segmenter)
result = ext.extract("unusual_paper.pdf")

Strategies are tried in registration order. Return None (or fewer than 3 items) to fall through to the next strategy, then to Rust built-ins.

ext.add_segmentation_strategy(try_format_a)
ext.add_segmentation_strategy(try_format_b)
# Falls through: try_format_a → try_format_b → Rust built-ins

ext.clear_segmentation_strategies()  # Remove all custom strategies
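The fall-through rule can be modelled in plain Python, which is handy for unit-testing a strategy before registering it. This is a sketch of the documented behaviour only, not the engine's code: `first_usable` and `never_matches` are hypothetical helpers, and in the real pipeline the Rust built-ins take over when every custom strategy declines.

```python
import re
from typing import Callable, List, Optional

Strategy = Callable[[str], Optional[List[str]]]

def paren_segmenter(text: str) -> Optional[List[str]]:
    """Split references numbered as (1), (2), (3)..."""
    parts = re.split(r"(?:^|\n)\s*\(\d+\)\s+", text)
    parts = [p.strip() for p in parts if p.strip()]
    return parts if len(parts) >= 3 else None

def never_matches(text: str) -> Optional[List[str]]:
    """A strategy that always declines, to show the fall-through."""
    return None

def first_usable(text: str, strategies: List[Strategy]) -> Optional[List[str]]:
    """Return the output of the first strategy that yields at least 3 segments."""
    for strategy in strategies:
        parts = strategy(text)
        if parts is not None and len(parts) >= 3:
            return parts
    return None  # here the Rust built-ins would be tried instead

refs_text = (
    "(1) Smith, J. First paper.\n"
    "(2) Lee, K. Second paper.\n"
    "(3) Wu, X. Third paper."
)
segments = first_usable(refs_text, [never_matches, paren_segmenter])
print(segments)
```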

Archive extraction

Extract references from ZIP or tar.gz archives containing PDFs, BBL, or BIB files:

ext = PdfExtractor()

for entry in ext.extract_archive("papers.zip"):
    print(f"{entry.filename} ({entry.file_type})")
    if entry.result:  # PDF — full extraction
        for ref in entry.result.references:
            print(f"  {ref.title}")
    elif entry.content:  # BBL/BIB — raw text
        print(f"  {len(entry.content)} chars")

Use max_size_bytes to cap total extracted size (0 = unlimited):

for entry in ext.extract_archive("papers.tar.gz", max_size_bytes=100_000_000):
    ...

Check the iterator's warnings for size-limit messages after iteration.

The is_archive_path() helper detects supported formats:

from hallucinator import is_archive_path

is_archive_path("papers.zip")     # True
is_archive_path("papers.tar.gz")  # True
is_archive_path("paper.pdf")      # False

ArchiveEntry

Each item yielded from archive iteration:

entry.filename   # str — original filename within the archive
entry.file_type  # str — "pdf", "bbl", or "bib"
entry.result     # ExtractionResult | None — populated for PDFs
entry.content    # str | None — populated for BBL/BIB files
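As a sketch of how these fields might be consumed, the helper below turns archive entries into one summary dictionary. `summarize_entries` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins shaped like `ArchiveEntry` so that the example runs without a real archive.

```python
from types import SimpleNamespace

def summarize_entries(entries):
    """Map each archive member to its reference titles (PDF) or text length (BBL/BIB)."""
    summary = {}
    for entry in entries:
        if entry.result is not None:
            summary[entry.filename] = [ref.title for ref in entry.result.references]
        elif entry.content is not None:
            summary[entry.filename] = len(entry.content)
    return summary

# Stand-ins with the documented attribute names (illustration only)
fake_entries = [
    SimpleNamespace(
        filename="a.pdf",
        file_type="pdf",
        result=SimpleNamespace(references=[SimpleNamespace(title="Some Title")]),
        content=None,
    ),
    SimpleNamespace(filename="refs.bib", file_type="bib", result=None, content="@article{x}"),
]
summary = summarize_entries(fake_entries)
print(summary)
```

With the package installed, the same helper should accept the iterator directly, for example `summarize_entries(ext.extract_archive("papers.zip"))`.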

ExtractionResult

Returned by extract() and extract_from_text().

result = ext.extract("paper.pdf")

result.references   # list[Reference]
len(result)         # number of parsed references

# Skip statistics
result.skip_stats.total_raw     # total raw segments before filtering
result.skip_stats.url_only      # skipped: non-academic URLs only
result.skip_stats.short_title   # skipped: title too short
result.skip_stats.no_title      # references with no parseable title
result.skip_stats.no_authors    # references with no parseable authors
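The skip counters are easier to compare across papers as percentages of the raw segment count. A minimal sketch, assuming only the attribute names listed above; `skip_percentages` is a hypothetical helper and the numbers are invented.

```python
from types import SimpleNamespace

def skip_percentages(skip_stats):
    """Express each skip counter as a percentage of the raw segment count."""
    names = ("url_only", "short_title", "no_title", "no_authors")
    total = skip_stats.total_raw
    if total == 0:
        return {name: 0.0 for name in names}
    return {name: 100.0 * getattr(skip_stats, name) / total for name in names}

# Invented numbers standing in for result.skip_stats
fake_stats = SimpleNamespace(total_raw=40, url_only=2, short_title=4, no_title=1, no_authors=0)
percentages = skip_percentages(fake_stats)
print(percentages)
```

A high `short_title` share may suggest lowering `min_title_words`, though that is a judgment call per corpus.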

Reference

A parsed reference with structured fields.

ref.raw_citation    # str — the cleaned-up citation text
ref.title           # str | None — extracted title
ref.authors         # list[str] — author names
ref.doi             # str | None — DOI if found
ref.arxiv_id        # str | None — arXiv ID if found
ref.original_number # int — 1-based position in the PDF (0 for manually created refs)
ref.skip_reason     # str | None — why this ref was skipped ("url_only", "short_title"), or None

Creating references manually

You can create Reference objects directly — without PDF extraction — for batch validation of structured data (e.g. from a CSV, BibTeX parser, or API response):

from hallucinator import Reference

ref = Reference("Attention Is All You Need", authors=["Vaswani", "Shazeer"])
ref = Reference("BERT", doi="10.18653/v1/N19-1423")
ref = Reference("My Paper", authors=["Smith"], arxiv_id="2301.00001")

Constructor signature:

Reference(
    title: str,
    authors: list[str] = [],
    doi: str | None = None,
    arxiv_id: str | None = None,
    raw_citation: str | None = None,  # defaults to title if omitted
)

Batch validation without PDF extraction

For use cases where you already have structured reference data (titles, authors, DOIs) and want to validate without going through PDF extraction:

from hallucinator import Reference, Validator, ValidatorConfig

# Build references from structured data
refs = [
    Reference("Attention Is All You Need", authors=["Vaswani", "Shazeer"]),
    Reference("BERT: Pre-training of Deep Bidirectional Transformers",
              authors=["Devlin", "Chang"], doi="10.18653/v1/N19-1423"),
    Reference("A Completely Made Up Paper Title That Does Not Exist"),
]

# Validate
config = ValidatorConfig()
validator = Validator(config)
results = validator.check(refs)

for r in results:
    print(f"[{r.status}] {r.title}")
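When the structured data lives in a CSV file, the rows first need to be mapped onto the constructor arguments. The sketch below does only that step with the standard library. The column names (`title`, `authors`, `doi`, `arxiv_id`) and the semicolon separator for authors are assumptions about your file, not something the package prescribes.

```python
import csv
import io

def rows_to_reference_kwargs(csv_text):
    """Turn CSV rows into arguments matching Reference(title, authors=..., doi=..., arxiv_id=...)."""
    kwargs_list = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        authors = [a.strip() for a in (row.get("authors") or "").split(";") if a.strip()]
        kwargs_list.append({
            "title": row["title"].strip(),
            "authors": authors,
            "doi": (row.get("doi") or "").strip() or None,       # empty cell -> None
            "arxiv_id": (row.get("arxiv_id") or "").strip() or None,
        })
    return kwargs_list

sample_csv = (
    "title,authors,doi,arxiv_id\n"
    "Attention Is All You Need,Vaswani;Shazeer,,\n"
    "BERT,Devlin;Chang,10.18653/v1/N19-1423,\n"
)
reference_kwargs = rows_to_reference_kwargs(sample_csv)
print(reference_kwargs)

# With the package installed, each dict maps onto the documented constructor:
# refs = [Reference(kw["title"], authors=kw["authors"], doi=kw["doi"], arxiv_id=kw["arxiv_id"])
#         for kw in reference_kwargs]
```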

Reference Validation

After extracting references, validate them against academic databases. The validator queries up to 10 databases concurrently per reference, with early exit on first match.

Quick start

from hallucinator import PdfExtractor, Validator, ValidatorConfig

ext = PdfExtractor()
result = ext.extract("paper.pdf")

config = ValidatorConfig()
validator = Validator(config)
results = validator.check(result.references)

for r in results:
    if r.status == "verified":
        print(f"  OK: {r.title} (via {r.source})")
    elif r.status == "not_found":
        print(f"  ?? {r.title}")
    elif r.status == "author_mismatch":
        print(f"  ~~ {r.title} (authors don't match)")

ValidatorConfig

All configuration for database queries. Create one, tweak what you need, pass it to Validator().

config = ValidatorConfig()

API keys

config.s2_api_key = "your-semantic-scholar-key"
config.openalex_key = "your-openalex-key"
config.crossref_mailto = "you@university.edu"  # CrossRef polite pool

Concurrency and timeouts

config.num_workers = 4               # references checked in parallel (default: 4)
config.db_timeout_secs = 10          # per-database timeout (default: 10)
config.db_timeout_short_secs = 5     # short timeout for fast DBs (default: 5)
config.max_rate_limit_retries = 3    # max 429 retries per DB query (default: 3)

Persistent cache

config.cache_path = "/path/to/cache.db"  # SQLite cache for cross-run persistence

# Cache TTL tuning (optional)
config.cache_positive_ttl_secs = 604800  # verified results (default: 7 days)
config.cache_negative_ttl_secs = 86400   # not-found results (default: 24 hours)
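The TTL values are plain seconds. Deriving them from `timedelta` keeps the intent readable and avoids arithmetic slips. The 6-hour value is a hypothetical alternative, shown only to illustrate tuning.

```python
from datetime import timedelta

# The documented defaults, derived instead of hard-coded
positive_ttl_secs = int(timedelta(days=7).total_seconds())
negative_ttl_secs = int(timedelta(hours=24).total_seconds())

# Hypothetical tuning: re-check not-found references after 6 hours
short_negative_ttl_secs = int(timedelta(hours=6).total_seconds())

print(positive_ttl_secs, negative_ttl_secs, short_negative_ttl_secs)

# With the package installed:
# config.cache_negative_ttl_secs = short_negative_ttl_secs
```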

SearxNG web search fallback

config.searxng_url = "http://localhost:8888"  # optional SearxNG instance URL

Disable databases

config.disabled_dbs = ["openalex", "pubmed"]

Database names: crossref, arxiv, dblp, semantic_scholar, acl, neurips, ssrn, europe_pmc, pubmed, openalex.
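How the library reacts to a misspelled database name is not stated here, so it may be worth checking names yourself before assigning `disabled_dbs`. A minimal sketch; `checked_db_names` is a hypothetical helper, and `KNOWN_DBS` is copied from the list above.

```python
KNOWN_DBS = (
    "crossref", "arxiv", "dblp", "semantic_scholar", "acl",
    "neurips", "ssrn", "europe_pmc", "pubmed", "openalex",
)

def checked_db_names(names):
    """Return the names unchanged, or raise ValueError if any is not a known database."""
    unknown = sorted(set(names) - set(KNOWN_DBS))
    if unknown:
        raise ValueError(f"unknown database name(s): {', '.join(unknown)}")
    return list(names)

to_disable = checked_db_names(["openalex", "pubmed"])
print(to_disable)

# With the package installed:
# config.disabled_dbs = to_disable
```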

Offline databases

Point to local copies of DBLP and ACL Anthology (SQLite databases built with the CLI's update-dblp / update-acl commands). A local OpenAlex path can be set the same way. Offline lookups are dramatically faster than online queries.

config.dblp_offline_path = "/path/to/dblp.db"
config.acl_offline_path = "/path/to/acl.db"
config.openalex_offline_path = "/path/to/openalex.idx"

If the path doesn't exist or the file isn't a valid database, Validator(config) raises RuntimeError.
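Because a bad path surfaces only when the validator is constructed, one option is to filter out missing files first and still guard the constructor. The sketch below covers the first half with `pathlib`. `existing_paths` is a hypothetical helper, and the temporary directory only simulates one present and one missing database file.

```python
import tempfile
from pathlib import Path

def existing_paths(candidates):
    """Keep only the offline database paths that point to an existing file."""
    return {name: path for name, path in candidates.items() if Path(path).is_file()}

with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "dblp.db"
    present.write_bytes(b"")  # placeholder file, not a real database
    usable = existing_paths({
        "dblp": str(present),
        "acl": str(Path(tmp) / "missing.db"),
    })

print(sorted(usable))

# With the package installed, an existing but invalid file still raises, so guard it:
# for name, path in usable.items():
#     setattr(config, f"{name}_offline_path", path)
# try:
#     validator = Validator(config)
# except RuntimeError as exc:
#     print(f"offline database rejected: {exc}")
```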

Author checking

config.check_openalex_authors = True  # verify authors for OpenAlex matches (default: False)

Validator

The main validation engine. Create it once, call check() as many times as needed.

validator = Validator(config)

check()

Validates a list of Reference objects against all enabled databases. Blocks until complete but releases the Python GIL, so other threads can run.

results = validator.check(references)
# or with a progress callback:
results = validator.check(references, progress=on_progress)

Returns list[ValidationResult].

Progress callbacks

Pass a callable to check() to receive real-time progress events:

def on_progress(event):
    if event.event_type == "checking":
        print(f"[{event.index + 1}/{event.total}] Checking: {event.title}")
    elif event.event_type == "result":
        r = event.result
        print(f"[{event.index + 1}/{event.total}] {r.status}: {r.title}")
    elif event.event_type == "warning":
        print(f"Warning: {event.title} — {event.message}")
    elif event.event_type == "retrying":
        print(f"Retrying: {event.title} (failed: {', '.join(event.failed_dbs)})")
    elif event.event_type == "retry_pass":
        print(f"Retrying {event.count} unresolved references...")
    elif event.event_type == "db_query_complete":
        print(f"  {event.db_name}: {event.db_status} ({event.elapsed_ms:.0f}ms)")
    elif event.event_type == "rate_limit_wait":
        print(f"  Rate limited on {event.db_name}, waiting {event.wait_ms:.0f}ms...")
    elif event.event_type == "rate_limit_retry":
        print(f"  Retrying {event.db_name} (attempt {event.attempt}, backoff {event.backoff_ms:.0f}ms)")

results = validator.check(refs, progress=on_progress)

ProgressEvent properties

All properties return None when not applicable to the event type.

Property	Type	Event types
`event_type`	`str`	all
`index`	`int`	checking, result, warning, retrying
`total`	`int`	checking, result, warning, retrying
`title`	`str`	checking, warning, retrying
`result`	`ValidationResult`	result
`failed_dbs`	`list[str]`	warning, retrying
`message`	`str`	warning
`count`	`int`	retry_pass
`paper_index`	`int`	db_query_complete
`ref_index`	`int`	db_query_complete, rate_limit_retry
`db_name`	`str`	db_query_complete, rate_limit_wait, rate_limit_retry
`db_status`	`str`	db_query_complete
`elapsed_ms`	`float`	db_query_complete
`attempt`	`int`	rate_limit_retry
`wait_ms`	`float`	rate_limit_wait
`backoff_ms`	`float`	rate_limit_retry
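A callback does not have to print. The sketch below collects event counts and per-database timings so they can be inspected after `check()` returns. It relies only on the property names in the table above. `ProgressTally` is a hypothetical class, the lock is a precaution in case callbacks arrive from more than one thread, and the events fed to it here are stand-ins.

```python
import threading
from collections import Counter, defaultdict
from types import SimpleNamespace

class ProgressTally:
    """Progress callback that counts events and records per-database timings."""

    def __init__(self):
        self._lock = threading.Lock()
        self.by_type = Counter()
        self.db_times_ms = defaultdict(list)

    def __call__(self, event):
        with self._lock:
            self.by_type[event.event_type] += 1
            if event.event_type == "db_query_complete" and event.elapsed_ms is not None:
                self.db_times_ms[event.db_name].append(event.elapsed_ms)

    def mean_ms(self, db_name):
        """Average query time for one database, or None if it was never seen."""
        times = self.db_times_ms.get(db_name, [])
        return sum(times) / len(times) if times else None

tally = ProgressTally()
# Stand-in events (illustration only); real ones come from validator.check(refs, progress=tally)
for fake_event in (
    SimpleNamespace(event_type="checking"),
    SimpleNamespace(event_type="db_query_complete", db_name="crossref", elapsed_ms=100.0),
    SimpleNamespace(event_type="db_query_complete", db_name="crossref", elapsed_ms=200.0),
):
    tally(fake_event)

print(dict(tally.by_type), tally.mean_ms("crossref"))
```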

Cancellation

Cancel a running check from another thread:

import threading
import time

validator = Validator(config)
outcome = {}

def run_check():
    # Keep the return value so the main thread can read it after join()
    outcome["results"] = validator.check(refs)

t = threading.Thread(target=run_check)
t.start()

# Cancel after 30 seconds
time.sleep(30)
validator.cancel()
t.join()
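A variation on the same idea is to wrap the deadline in a helper with `threading.Timer`, so the calling thread runs `check()` itself and the timer thread calls `cancel()`. This relies on `check()` releasing the GIL, as described above. `check_with_deadline` and `SlowFakeValidator` are hypothetical. What `check()` returns or raises after a cancel is not stated here, so the stand-in simply returns an empty list.

```python
import threading

def check_with_deadline(validator, refs, seconds):
    """Run validator.check(refs), calling validator.cancel() if it exceeds `seconds`."""
    timer = threading.Timer(seconds, validator.cancel)
    timer.start()
    try:
        return validator.check(refs)
    finally:
        timer.cancel()  # harmless if the timer has already fired

class SlowFakeValidator:
    """Stand-in that blocks in check() until cancel() is called (illustration only)."""

    def __init__(self):
        self.cancelled = threading.Event()

    def check(self, refs):
        self.cancelled.wait(timeout=5)  # simulates a long-running validation
        return []

    def cancel(self):
        self.cancelled.set()

fake = SlowFakeValidator()
outcome = check_with_deadline(fake, ["some reference"], seconds=0.05)
print(outcome, fake.cancelled.is_set())
```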

Stats

Compute summary statistics from results:

stats = Validator.stats(results)
print(f"Total:           {stats.total}")
print(f"Verified:        {stats.verified}")
print(f"Not found:       {stats.not_found}")
print(f"Author mismatch: {stats.author_mismatch}")
print(f"Retracted:       {stats.retracted}")
print(f"Skipped:         {stats.skipped}")

ValidationResult

The result of checking a single reference.

r = results[0]

r.title            # str — reference title
r.raw_citation     # str — original citation text
r.status           # "verified" | "not_found" | "author_mismatch"
r.source           # str | None — database that verified it (e.g. "crossref")
r.ref_authors      # list[str] — authors from the parsed reference
r.found_authors    # list[str] — authors from the matching DB record
r.paper_url        # str | None — URL in the matching database
r.failed_dbs       # list[str] — databases that timed out or errored

Per-database results

Every database query is recorded, even if it didn't match:

for db in r.db_results:
    print(f"  {db.db_name}: {db.status}", end="")
    if db.elapsed_ms is not None:
        print(f" ({db.elapsed_ms:.0f}ms)", end="")
    if db.paper_url:
        print(f" → {db.paper_url}", end="")
    print()

DbResult.status values: "match", "no_match", "author_mismatch", "timeout", "rate_limited", "error", "skipped".
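One use for these statuses is to tell a clean "not found" apart from one where some databases never answered. The sketch below groups database names by status and flags incomplete checks. `dbs_by_status` and `was_fully_checked` are hypothetical helpers, and treating "timeout", "rate_limited" and "error" as incomplete is this example's assumption rather than library policy.

```python
from collections import defaultdict
from types import SimpleNamespace

INCOMPLETE_STATUSES = {"timeout", "rate_limited", "error"}

def dbs_by_status(result):
    """Group database names by their DbResult.status."""
    grouped = defaultdict(list)
    for db in result.db_results:
        grouped[db.status].append(db.db_name)
    return dict(grouped)

def was_fully_checked(result):
    """True if no database timed out, was rate limited, or errored."""
    return not any(db.status in INCOMPLETE_STATUSES for db in result.db_results)

# Stand-in shaped like a ValidationResult (illustration only)
fake_result = SimpleNamespace(db_results=[
    SimpleNamespace(db_name="crossref", status="no_match"),
    SimpleNamespace(db_name="dblp", status="timeout"),
    SimpleNamespace(db_name="arxiv", status="no_match"),
])
grouped = dbs_by_status(fake_result)
print(grouped, was_fully_checked(fake_result))
```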

DOI and arXiv info

if r.doi_info:
    print(f"DOI: {r.doi_info.doi} (valid={r.doi_info.valid})")
    if r.doi_info.title:
        print(f"  Resolved title: {r.doi_info.title}")

if r.arxiv_info:
    print(f"arXiv: {r.arxiv_info.arxiv_id} (valid={r.arxiv_info.valid})")

Retraction info

if r.retraction_info and r.retraction_info.is_retracted:
    print("RETRACTED!")
    if r.retraction_info.retraction_doi:
        print(f"  Retraction DOI: {r.retraction_info.retraction_doi}")
    if r.retraction_info.retraction_source:
        print(f"  Source: {r.retraction_info.retraction_source}")

Complete example

Extract, validate, and report — the full pipeline:

from hallucinator import PdfExtractor, Validator, ValidatorConfig

# Extract
ext = PdfExtractor()
result = ext.extract("paper.pdf")
refs = result.references
print(f"Extracted {len(refs)} references")

# Configure
config = ValidatorConfig()
config.s2_api_key = "your-key"              # optional but improves results
config.dblp_offline_path = "dblp.db"        # optional, faster than online
config.disabled_dbs = ["openalex"]           # skip DBs you don't need

# Validate with progress
def on_progress(event):
    if event.event_type == "checking":
        print(f"  [{event.index + 1}/{event.total}] {event.title}")
    elif event.event_type == "result":
        r = event.result
        icon = {"verified": "+", "not_found": "?", "author_mismatch": "~"}.get(r.status, "!")
        src = f" ({r.source})" if r.source else ""
        print(f"  [{icon}] {r.title}{src}")

validator = Validator(config)
results = validator.check(refs, progress=on_progress)

# Summary
stats = Validator.stats(results)
print(f"\nVerified: {stats.verified}/{stats.total}")
if stats.not_found:
    print(f"Potentially hallucinated: {stats.not_found}")
if stats.retracted:
    print(f"Retracted: {stats.retracted}")

# Flag suspicious references
for r in results:
    if r.status == "not_found":
        print(f"\n  NOT FOUND: {r.title}")
        print(f"  Citation: {r.raw_citation[:120]}...")
    if r.retraction_info and r.retraction_info.is_retracted:
        print(f"\n  RETRACTED: {r.title}")
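For a report that other tools can read, the same fields can be written as JSON instead of printed. A minimal sketch using only attributes documented for `ValidationResult`; `results_to_json` is a hypothetical helper and the two results are stand-ins.

```python
import json
from types import SimpleNamespace

def results_to_json(results):
    """Serialize the main fields of each validation result to a JSON string."""
    rows = [
        {
            "title": r.title,
            "status": r.status,
            "source": r.source,
            "failed_dbs": list(r.failed_dbs),
        }
        for r in results
    ]
    return json.dumps(rows, indent=2)

# Stand-ins shaped like ValidationResult (illustration only)
fake_results = [
    SimpleNamespace(title="Attention Is All You Need", status="verified",
                    source="crossref", failed_dbs=[]),
    SimpleNamespace(title="A Made Up Paper", status="not_found",
                    source=None, failed_dbs=["dblp"]),
]
report = results_to_json(fake_results)
print(report)
```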

API Reference

Extraction types

Class	Description
`PdfExtractor`	Configurable PDF extraction pipeline with custom strategy support
`ExtractionResult`	Container for parsed references and skip statistics
`Reference`	A parsed reference (title, authors, DOI, arXiv ID) — also constructible manually
`SkipStats`	Counts of skipped references by reason
`ArchiveEntry`	A single entry yielded from archive extraction
`ArchiveIterator`	Iterator over archive entries
`is_archive_path()`	Returns `True` if a path looks like a supported archive

Validation types

Class	Description
`ValidatorConfig`	Configuration: API keys, timeouts, concurrency, offline DBs, cache, SearxNG
`Validator`	Validation engine — call `.check(refs)` to validate
`ValidationResult`	Per-reference result: status, source, authors, per-DB details
`DbResult`	Single database query result: status, elapsed time, found authors
`DoiInfo`	DOI resolution result
`ArxivInfo`	arXiv resolution result
`RetractionInfo`	Retraction check result
`ProgressEvent`	Real-time progress callback event
`CheckStats`	Summary statistics (total, verified, not_found, author_mismatch, retracted, skipped)

Status values

ValidationResult.status: "verified" | "not_found" | "author_mismatch"

Examples

See python/examples/ for runnable scripts:

Example	Description
`basic_usage.py`	Extract references from a PDF
`step_by_step.py`	Run each pipeline stage individually
`custom_regexes.py`	Override patterns for non-standard formats
`validate_references.py`	Full pipeline: extract + validate + report
`batch_validate.py`	Validate references without PDF extraction (#178)

Threading and performance

GIL release: Validator.check() releases the Python GIL during the Rust async runtime call. Other Python threads can execute freely while validation runs.
Concurrency: References are checked in parallel (default 4 at a time). All 10 databases are queried concurrently per reference. First match triggers early exit.
Progress callbacks: The GIL is briefly re-acquired to call Python progress callbacks. Since events fire once per reference (not per HTTP request), overhead is negligible.
Tokio runtime: Each Validator instance owns a tokio multi-threaded runtime. Creating many validators is wasteful — reuse a single instance for multiple check() calls.
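Taken together, these points suggest one long-lived validator with `check()` calls queued on a worker thread, leaving the main thread free. The sketch below shows that shape with a single-worker `ThreadPoolExecutor`. `EchoValidator` is a stand-in so the example runs offline. Whether concurrent `check()` calls on one instance are safe is not stated here, which is why the pool has exactly one worker.

```python
from concurrent.futures import ThreadPoolExecutor

class EchoValidator:
    """Stand-in for Validator so the sketch runs without network access."""

    def __init__(self):
        self.calls = 0

    def check(self, refs):
        self.calls += 1
        return [f"checked:{ref}" for ref in refs]

validator = EchoValidator()  # with the package installed: Validator(config)
papers = {"a.pdf": ["ref1", "ref2"], "b.pdf": ["ref3"]}

# One worker: check() calls are queued and never overlap on the validator
with ThreadPoolExecutor(max_workers=1) as pool:
    futures = {name: pool.submit(validator.check, refs) for name, refs in papers.items()}
    # The main thread is free to do other work here while validation runs
    results_by_paper = {name: future.result() for name, future in futures.items()}

print(results_by_paper)
```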

Maintainers

worst

Release history

Version	Release date
0.2.0	Apr 30, 2026
0.1.2 (this version)	Apr 10, 2026
0.1.1	Feb 24, 2026
0.1.1a5 (pre-release)	Feb 21, 2026
0.1.0	Feb 12, 2026
0.0.1	Feb 12, 2026

Download files

Source Distribution

File	Size	Uploaded	Tags
hallucinator-0.1.2.tar.gz	757.0 kB	Apr 10, 2026	Source

Built Distributions

File	Size	Uploaded	Tags
hallucinator-0.1.2-cp312-cp312-win_amd64.whl	11.0 MB	Apr 10, 2026	CPython 3.12, Windows x86-64
hallucinator-0.1.2-cp312-cp312-manylinux_2_28_x86_64.whl	13.3 MB	Apr 10, 2026	CPython 3.12, manylinux: glibc 2.28+ x86-64
hallucinator-0.1.2-cp312-cp312-macosx_11_0_arm64.whl	10.9 MB	Apr 10, 2026	CPython 3.12, macOS 11.0+ ARM64
hallucinator-0.1.2-cp312-cp312-macosx_10_15_x86_64.whl	11.4 MB	Apr 10, 2026	CPython 3.12, macOS 10.15+ x86-64
