hallucimap
Cartograph hallucination risk across an LLM's knowledge space
hallucimap doesn't just ask "did the model hallucinate?". It builds a persistent danger map showing where a model confabulates across the domains, time periods, and entity types you probe.
science/physics      ████████░░ 0.82 ← high risk zone
history/wwii         █████░░░░░ 0.53
medicine/anatomy     ███░░░░░░░ 0.31
finance/markets      ██░░░░░░░░ 0.19
factual/mathematics  █░░░░░░░░░ 0.08 ← well-calibrated
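A text summary like the one above could be rendered from per-cell risk scores with a few lines of Python. This is only an illustrative sketch: `risk_bar` is a hypothetical helper, not part of hallucimap, and the ten-character bar width and rounding rule are assumptions.

```python
def risk_bar(score: float, width: int = 10) -> str:
    """Render a risk score in [0, 1] as a fixed-width text bar."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    filled = round(score * width)  # number of filled blocks (assumed rounding rule)
    return "█" * filled + "░" * (width - filled)


# Example cells taken from the summary above.
cells = {"science/physics": 0.82, "factual/mathematics": 0.08}
for name, score in cells.items():
    print(f"{name:<21}{risk_bar(score)} {score:.2f}")
```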
What is this?
Most hallucination tools check a single output: one prompt, one verdict. That tells you almost nothing about the model's systematic failure modes.
hallucimap takes a different approach:
- Probe the model across hundreds of questions spanning domains, time periods, and entity types
- Score each answer using consistency sampling: ask the same question N times and measure how much the model contradicts itself
- Map the scores into a persistent RiskAtlas, a 2-D grid of hallucination risk per knowledge cell
- Visualize the atlas as an interactive heatmap so you can instantly see the danger zones
The result is a reusable, persistent fingerprint of where a model is unreliable: not just on today's test set, but across the regions of its knowledge space that you have probed.
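The probe, score, and map steps above can be sketched end to end with a deterministic stand-in for the model. This is a minimal illustration, not hallucimap's implementation: `fake_model`, `disagreement`, and `build_atlas` are hypothetical names, and disagreement is measured here as the share of samples that differ from the most common answer, which is a simpler proxy than the pairwise similarity described later in this README.

```python
from collections import Counter, defaultdict
from itertools import cycle
from statistics import mean


def disagreement(answers):
    """1 minus the share of samples that match the most common answer."""
    normalized = [a.strip().lower() for a in answers]
    top_count = Counter(normalized).most_common(1)[0][1]
    return 1.0 - top_count / len(normalized)


def build_atlas(questions, ask, n_samples=5):
    """Map (domain, subdomain) to the mean disagreement over its questions."""
    per_cell = defaultdict(list)
    for question, domain, subdomain in questions:
        answers = [ask(question) for _ in range(n_samples)]
        per_cell[(domain, subdomain)].append(disagreement(answers))
    return {cell: mean(scores) for cell, scores in per_cell.items()}


# Hypothetical stand-in for an LLM: stable on one question, erratic on the other.
_erratic = cycle(["1912", "1913", "1912", "1915", "1911"])


def fake_model(question):
    return "H2O" if "water" in question else next(_erratic)


questions = [
    ("What is the chemical symbol for water?", "factual", "chemistry"),
    ("What year was the fictional Treaty of Elmsworth signed?", "history", "treaties"),
]
atlas = build_atlas(questions, fake_model)
for (domain, subdomain), risk in atlas.items():
    print(f"{domain}/{subdomain}: {risk:.2f}")
```

The stable question lands in a zero-risk cell, while the question the stand-in answers inconsistently lands in a higher-risk cell.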
Features
| Feature | Description |
|---|---|
| Consistency Sampling | Ask the same question N times at temperature > 0. Low agreement = high risk. |
| Factual Grounding | Cross-check answers against known references to catch confident confabulation. |
| Persistent RiskAtlas | JSON-serializable danger map that accumulates across multiple scan sessions. |
| Multi-Model | OpenAI (GPT-4o), Anthropic (Claude 3.5+), or any local HuggingFace model. |
| Async-First | Fully non-blocking: scans run concurrently with tunable parallelism. |
| Interactive Heatmap | Plotly HTML output: hover for domain, risk score, confidence, and sample count. |
| Incremental Scans | Load an existing atlas and extend it, probing only what you haven't mapped yet. |
| CLI | hallucimap scan and hallucimap show, batteries included. |
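One way a JSON-serializable atlas could accumulate across scan sessions is a sample-weighted running mean per cell. The sketch below is an assumption about how such merging might work, not hallucimap's actual schema or update rule; `merge_cell`, `update_atlas`, and `unmapped` are hypothetical names.

```python
import json
from typing import Optional


def merge_cell(old: Optional[dict], new_risk: float, new_count: int) -> dict:
    """Combine an existing cell with new observations via a sample-weighted mean."""
    if old is None:
        return {"risk": new_risk, "samples": new_count}
    total = old["samples"] + new_count
    risk = (old["risk"] * old["samples"] + new_risk * new_count) / total
    return {"risk": risk, "samples": total}


def update_atlas(atlas: dict, session: dict) -> dict:
    """Fold one scan session into the atlas, keyed by 'domain/subdomain'."""
    merged = dict(atlas)
    for cell, (risk, count) in session.items():
        merged[cell] = merge_cell(merged.get(cell), risk, count)
    return merged


def unmapped(atlas: dict, wanted: list) -> list:
    """Cells an incremental scan would still need to probe."""
    return [cell for cell in wanted if cell not in atlas]


atlas = {}
atlas = update_atlas(atlas, {"science/physics": (0.9, 10)})
atlas = update_atlas(atlas, {"science/physics": (0.6, 5), "history/wwii": (0.5, 15)})

# Round-trip through JSON, as a persisted atlas file would.
restored = json.loads(json.dumps(atlas))
print(restored["science/physics"])
print(unmapped(restored, ["science/physics", "law/contracts"]))
```

Weighting by sample count keeps a small follow-up scan from overwriting a cell that was already measured with many samples.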
Quickstart
Install
pip install hallucimap
Scan a model
# OpenAI
export OPENAI_API_KEY=sk-...
hallucimap scan --model gpt-4o --domains science,history,medicine --samples 5
# Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
hallucimap scan --model claude-3-5-sonnet-20241022 --domains law,finance --samples 5
Visualize the danger map
# Open interactive heatmap in browser
hallucimap show atlas_gpt-4o.json --browser
# Save as standalone HTML
hallucimap show atlas_gpt-4o.json --save map.html
Print a summary table
┌────────────────────────────────────────────────────────┐
│ Risk Atlas: gpt-4o                                     │
├──────────┬──────────────┬───────┬────────────┬─────────┤
│ Domain   │ Subdomain    │ Risk  │ Confidence │ Samples │
├──────────┼──────────────┼───────┼────────────┼─────────┤
│ science  │ physics      │ 0.821 │ 0.91       │ 25      │
│ temporal │ post_cutoff  │ 0.764 │ 0.88       │ 20      │
│ entity   │ person       │ 0.612 │ 0.85       │ 15      │
│ history  │ wwii         │ 0.534 │ 0.83       │ 15      │
│ medicine │ pharmacology │ 0.487 │ 0.82       │ 15      │
└──────────┴──────────────┴───────┴────────────┴─────────┘
RiskAtlas(model=gpt-4o, cells=24, mean_risk=0.371, sessions=2)
Python API
import asyncio

from hallucimap import RiskAtlas, HallucinationScorer
from hallucimap.models import AnthropicAdapter
from hallucimap.probes import DomainProbe, EntityProbe, TemporalProbe
from hallucimap.viz import HeatmapRenderer


async def main():
    # 1. Set up adapter + scorer
    adapter = AnthropicAdapter(model="claude-3-5-sonnet-20241022")
    scorer = HallucinationScorer(adapter=adapter, n_samples=5, temperature=0.9)
    atlas = RiskAtlas(model_id="claude-3-5-sonnet-20241022")

    # 2. Run probes across multiple domains
    probes = [
        DomainProbe(domain="science"),
        DomainProbe(domain="medicine"),
        EntityProbe(entity_type="person"),
        TemporalProbe(cutoff_year=2024),
    ]
    for probe in probes:
        results = await probe.run_all(adapter, concurrency=10)
        questions = [(r.question, r.domain, r.subdomain) for r in results]
        references = [r.reference for r in results]
        scored = await scorer.score_batch(questions, references=references)
        atlas.update(scored)

    # 3. Inspect the danger map
    print(atlas.summary())
    for cell in atlas.hottest_cells(n=5):
        print(f" {cell.domain}/{cell.subdomain} risk={cell.risk_score:.3f}")

    # 4. Persist + visualize
    atlas.save("atlas.json")
    HeatmapRenderer(atlas).save("atlas.html")  # standalone interactive HTML


asyncio.run(main())
How It Works
┌─────────────┐   questions   ┌───────────────┐  N completions  ┌─────────────────────┐
│   Probes    │ ────────────► │  LLM Adapter  │ ──────────────► │ HallucinationScorer │
│             │               │ (async+retry) │                 │                     │
│ • Temporal  │               └───────────────┘                 │  consistency score  │
│ • Entity    │                                                 │  + grounding score  │
│ • Domain    │                                                 │  → risk_score [0,1] │
│ • Factual   │                                                 └──────────┬──────────┘
└─────────────┘                                                            │
                                                                           ▼
                                                                ┌─────────────────────┐
                                                                │      RiskAtlas      │
                                                                │                     │
                                                                │  domain/subdomain   │
                                                                │     → AtlasCell     │
                                                                │ (risk, confidence,  │
                                                                │    sample_count)    │
                                                                └──────────┬──────────┘
                                                                           │
                                                         ┌─────────────────┴──────────────┐
                                                         │                                │
                                                         ▼                                ▼
                                                ┌──────────────────┐             ┌──────────────────┐
                                                │   atlas.save()   │             │ HeatmapRenderer  │
                                                │    atlas.json    │             │   → atlas.html   │
                                                │  (incremental)   │             │ (Plotly, hover)  │
                                                └──────────────────┘             └──────────────────┘
Scoring algorithm
The risk score for any (question, domain, subdomain) is:
consistency = mean pairwise similarity across N samples
grounding   = token-F1 between response and reference answer (if known)
risk_score  = α × (1 − consistency) + β × (1 − grounding)
where α = 0.7, β = 0.3 (grounding term omitted if no reference)
Phase 2 will replace the token-F1 heuristic with sentence-transformer embeddings (all-MiniLM-L6-v2) for semantic consistency scoring.
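The formula above can be made concrete with a small standard-library sketch. Several details here are assumptions rather than hallucimap's confirmed behavior: pairwise similarity is computed with the same token-F1 heuristic, grounding is averaged over all samples, and when no reference is given the grounding term is simply dropped without renormalizing the weights. The names `token_f1` and `risk_score` are illustrative.

```python
from collections import Counter
from itertools import combinations
from statistics import mean
from typing import List, Optional


def token_f1(a: str, b: str) -> float:
    """Bag-of-words F1 between two strings (lowercased whitespace tokens)."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return float(ta == tb)  # both empty counts as a match, one empty as a miss
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(ta), overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def risk_score(samples: List[str], reference: Optional[str] = None,
               alpha: float = 0.7, beta: float = 0.3) -> float:
    """alpha * (1 - consistency) + beta * (1 - grounding)."""
    pairs = list(combinations(samples, 2))
    # Assumption: a single sample has nothing to contradict, so consistency is 1.
    consistency = mean(token_f1(x, y) for x, y in pairs) if pairs else 1.0
    risk = alpha * (1.0 - consistency)
    if reference is not None:
        # Assumption: grounding is the mean token-F1 of every sample vs. the reference.
        grounding = mean(token_f1(s, reference) for s in samples)
        risk += beta * (1.0 - grounding)
    return risk


print(round(risk_score(["Paris", "Paris", "Lyon"], reference="Paris"), 3))
```

For the samples "Paris", "Paris", "Lyon" with reference "Paris", consistency is 1/3 and grounding is 2/3, so the risk is 0.7 × 2/3 + 0.3 × 1/3 ≈ 0.567.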
Probe Types
FactualProbe: calibration baseline
Tests unambiguous facts with known answers (capitals, atomic numbers, historical dates). A well-functioning model should score near 0 here; elevated scores flag systemic issues.
from hallucimap.probes import FactualProbe
probe = FactualProbe()
# → "What is the chemical symbol for water?" ref: "H2O"
# → "How many sides does a hexagon have?" ref: "6"
EntityProbe: named entity knowledge
Probes biographical facts, founding dates, and key attributes of people, organizations, and places. This is a classic hallucination flashpoint where models invent plausible-but-wrong details.
from hallucimap.probes import EntityProbe
probe = EntityProbe(entity_type="person")
# → "What year was Marie Curie born?" ref: "1867"
# → "What university did Einstein attend?" ref: "ETH Zurich"
DomainProbe: deep domain knowledge
Targets specialized fields where overconfident confabulation is dangerous: biomedical, legal, financial, and scientific knowledge.
from hallucimap.probes import DomainProbe
probe = DomainProbe(domain="medicine", subdomain="pharmacology")
# → "What is the antidote for acetaminophen overdose?" ref: "N-acetylcysteine"
# → "What are SSRIs used to treat?" ref: "depression, anxiety"
TemporalProbe: post-cutoff events
Tests knowledge of events after the model's training cutoff. A well-calibrated model should hedge; a hallucinating one will invent confident but fabricated details.
from hallucimap.probes import TemporalProbe
probe = TemporalProbe(cutoff_year=2024, target_years=[2024, 2025])
# → "Who won the Nobel Prize in Physics in 2025?"
# → "What major AI models were released in 2025?"
Supported Models
| Provider | Models | Adapter |
|---|---|---|
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo | OpenAIAdapter |
| Anthropic | claude-3-5-sonnet-20241022, claude-opus-4-6, claude-3-5-haiku-20241022 | AnthropicAdapter |
| HuggingFace | Any local causal LM (Llama, Mistral, Phi…) | HFAdapter |
All adapters share the same async interface, so you can swap models by changing one line.
# Swap from OpenAI to Anthropic (or a local model): nothing else changes
adapter = OpenAIAdapter(model="gpt-4o")
adapter = AnthropicAdapter(model="claude-3-5-sonnet-20241022")
adapter = HFAdapter(model="meta-llama/Llama-3-8B-Instruct", device="cuda")
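To illustrate why a shared async interface makes backends interchangeable, here is a minimal sketch using only the standard library. It does not reproduce hallucimap's real base class: `BaseAdapter`, `EchoAdapter`, the `complete` method, and `sample_all` are hypothetical names chosen for this example.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class BaseAdapter(ABC):
    """Hypothetical common interface: every backend exposes one async method."""

    @abstractmethod
    async def complete(self, prompt: str, temperature: float = 0.9) -> str:
        ...


class EchoAdapter(BaseAdapter):
    """Stand-in backend that needs no API key or network access."""

    async def complete(self, prompt: str, temperature: float = 0.9) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"echo: {prompt}"


async def sample_all(adapter: BaseAdapter, prompts: List[str],
                     concurrency: int = 10) -> List[str]:
    """Run all prompts concurrently with at most `concurrency` in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def one(prompt: str) -> str:
        async with gate:
            return await adapter.complete(prompt)

    # gather preserves the order of the input prompts.
    return list(await asyncio.gather(*(one(p) for p in prompts)))


answers = asyncio.run(sample_all(EchoAdapter(), ["a", "b", "c"], concurrency=2))
print(answers)
```

Because `sample_all` depends only on the abstract method, any backend that implements it can be passed in unchanged, and the semaphore gives the tunable parallelism mentioned in the feature list.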
Project Structure
hallucimap/
├── src/hallucimap/
│   ├── core/
│   │   ├── atlas.py          ← RiskAtlas: load / update / save / query
│   │   ├── scorer.py         ← HallucinationScorer: consistency + grounding
│   │   └── topology.py       ← KnowledgeTopology: 2-D PCA/UMAP projection
│   ├── probes/
│   │   ├── base.py           ← BaseProbe (abstract)
│   │   ├── temporal.py       ← post-cutoff date facts
│   │   ├── entity.py         ← named entities (people, orgs, places)
│   │   ├── domain.py         ← domain knowledge (bio, law, finance)
│   │   └── factual.py        ← verifiable factual claims
│   ├── models/
│   │   ├── base.py           ← BaseLLMAdapter (abstract)
│   │   ├── openai_adapter.py
│   │   ├── anthropic_adapter.py
│   │   └── hf_adapter.py
│   ├── viz/
│   │   └── heatmap.py        ← Plotly interactive heatmap renderer
│   ├── testing.py            ← MockAdapter for downstream tests
│   └── cli.py                ← hallucimap scan / hallucimap show
├── tests/                    ← 53 tests, 63% coverage
├── examples/
│   ├── scan_gpt4o.py
│   └── scan_claude.py
└── .github/workflows/ci.yml
Development
git clone https://github.com/advait27/hallucimap.git
cd hallucimap
pip install -e ".[dev]"
# Lint
ruff check src tests
# Type-check
mypy src
# Tests with coverage
pytest
# Full CI check (lint + types + tests)
ruff check src tests && mypy src && pytest
Roadmap
| Phase | Status | Description |
|---|---|---|
| 0: Scaffold | Done | Package structure, Pydantic models, adapter stubs, CLI skeleton |
| 1: Adapters | Done | OpenAI, Anthropic, HuggingFace adapters with retry logic |
| 2: Scorer | Next | Embedding-based consistency via all-MiniLM-L6-v2 |
| 3: Topology | Planned | UMAP projection of knowledge space; semantic clustering |
| 4: Probe Datasets | Planned | TriviaQA, Wikidata, curated post-cutoff corpora |
| 5: Heatmap v2 | Planned | Topology-aware heatmap overlay; cluster annotations |
| 6: CLI + Docs | Planned | Rich progress bars, hallucimap summary, hosted docs |
| 7: PyPI | Planned | Publish to PyPI; versioned releases |
License
MIT © Advait Dharmadhikari. See LICENSE for details.