Fact Error Rate (FaER) evaluation framework for deep research reports
AI systems can now generate research reports with citations, but how do you know that the citations actually support the claims, that each source URL is real, and that the facts are correct?
Faerie (Fact Error Rate) is an evaluation framework that answers these questions. Give it a Markdown report and it will parse every claim, fetch every cited source, verify citations against source content, independently fact-check claims via web search, and produce a detailed grade card with metrics. The pipeline runs in seven steps (parse, verify sources, extract claims, verify claims, calculate metrics, assess quality, extract insights) and outputs a structured EvaluationReport with scores, verdicts, and recommendations.
- Source verification — fetches every cited URL to check accessibility and content
- Citation accuracy — checks whether each claim is actually supported by its cited source
- Independent fact-checking — verifies claims against the web, not just the cited source
- Exponential penalty scoring — false facts halve the score (0.5^n), unverifiable claims cut 25% each (0.75^n)
- Quality assessment — LLM judge evaluates report structure and thoroughness
- Insight extraction — identifies Nth-order insights that synthesize across multiple facts
- Grade card — S+ through F- with ASCII art, per-category assessment, and actionable recommendations
Installation
pip install faerie-eval
Quick Start
CLI
faerie evaluate report.md
faerie evaluate report.md --output results.json --verbose
Python
from faerie import FaerieEvaluator
evaluator = FaerieEvaluator(model="gemini-3-flash-preview", verbose=True)
result = evaluator.evaluate_file("report.md")
print(f"Overall Score: {result.overall_quality_score:.0%}")
print(f"Citation Accuracy: {result.faer_metrics.citation_accuracy_rate:.0%}")
print(f"Factual Accuracy: {result.faer_metrics.fact_accuracy_rate:.0%}")
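To compare several reports, the same call can be wrapped in a small helper. The sketch below is illustrative only: `rank_reports` is a hypothetical name, and it relies on nothing beyond the `evaluate_file` method and `overall_quality_score` attribute shown above. The evaluator is passed in rather than imported, so the helper works with any object that exposes that interface.

```python
from typing import Any, Iterable


def rank_reports(evaluator: Any, paths: Iterable[str]) -> list[tuple[str, float]]:
    """Evaluate each report and return (path, score) pairs, best first.

    `evaluator` is expected to behave like the FaerieEvaluator shown above:
    it must expose evaluate_file(path) returning an object that has an
    overall_quality_score attribute.
    """
    scored = []
    for path in paths:
        result = evaluator.evaluate_file(path)
        scored.append((path, result.overall_quality_score))
    # Highest overall score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With Faerie installed, this would presumably be called as `rank_reports(FaerieEvaluator(model="gemini-3-flash-preview"), ["a.md", "b.md"])`.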
Generate + Evaluate
Faerie pairs with Dragen for end-to-end research pipelines. Generate a report, then evaluate it:
# Generate a research report
python examples/deep_research.py "AI agents in 2025"
# Evaluate the generated report
faerie evaluate report.md --verbose
CLI Options
faerie evaluate <report.md> [OPTIONS]
  --output, -o PATH      Save full JSON results to file
  --model, -m MODEL      LLM model for verification (default: gemini-3-flash-preview)
  --skip-fact-check      Skip independent fact verification (faster, citation-only)
  --skip-insights        Skip Nth-order insight extraction
  --max-workers, -w N    Max parallel verification workers (default: 3)
  --verbose, -v          Print detailed progress
  --json                 Output full JSON to stdout
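When driving the CLI from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags listed above; the helper name `build_evaluate_command` and its keyword arguments are hypothetical and not part of Faerie.

```python
from typing import Optional


def build_evaluate_command(
    report: str,
    output: Optional[str] = None,
    model: Optional[str] = None,
    skip_fact_check: bool = False,
    skip_insights: bool = False,
    max_workers: Optional[int] = None,
    verbose: bool = False,
    json_stdout: bool = False,
) -> list[str]:
    """Build the argv list for `faerie evaluate` from keyword options."""
    cmd = ["faerie", "evaluate", report]
    if output is not None:
        cmd += ["--output", output]
    if model is not None:
        cmd += ["--model", model]
    if skip_fact_check:
        cmd.append("--skip-fact-check")
    if skip_insights:
        cmd.append("--skip-insights")
    if max_workers is not None:
        cmd += ["--max-workers", str(max_workers)]
    if verbose:
        cmd.append("--verbose")
    if json_stdout:
        cmd.append("--json")
    return cmd


# Faster, citation-only run that saves the JSON results.
print(" ".join(build_evaluate_command(
    "report.md", output="results.json", skip_fact_check=True
)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to invoke the CLI.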
Scoring
The overall score combines four dimensions with exponential penalties for errors:
| Dimension | Weight | Description |
|---|---|---|
| Citation Accuracy | 30% | Do sources support the claims they're cited for? |
| Factual Accuracy | 60% | Are claims independently verifiable as true? |
| Structure | 5% | Is the report well-organized? |
| Thoroughness | 5% | Does it cover the topic in depth? |
Penalty system: Each false fact multiplies the score by 0.5. Each unverifiable claim multiplies it by 0.75. A report with 3 false facts and 2 unverifiable claims gets: base_score * 0.5^3 * 0.75^2 = base_score * 0.125 * 0.5625 ≈ base_score * 0.07.
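The arithmetic above can be checked with a short sketch. It assumes the base score is a plain weighted sum of the four dimension scores, each in [0, 1]. The README does not say how the dimensions are combined, so that part is an assumption, and the names `base_score` and `penalized_score` are illustrative rather than Faerie's actual implementation.

```python
# Weights taken from the scoring table above.
WEIGHTS = {
    "citation_accuracy": 0.30,
    "factual_accuracy": 0.60,
    "structure": 0.05,
    "thoroughness": 0.05,
}


def base_score(dimensions: dict[str, float]) -> float:
    """Weighted sum of dimension scores (assumption: linear combination)."""
    return sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)


def penalized_score(base: float, false_facts: int, unverifiable: int) -> float:
    """Apply the penalties: x0.5 per false fact, x0.75 per unverifiable claim."""
    return base * (0.5 ** false_facts) * (0.75 ** unverifiable)


# Worked example from the text: 3 false facts and 2 unverifiable claims.
multiplier = penalized_score(1.0, false_facts=3, unverifiable=2)
print(f"{multiplier:.4f}")  # 0.0703
```

Because the penalties multiply, a handful of false facts dominates the result regardless of how well the report scores on the weighted dimensions.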
Citation
If you use Faerie in your research, please cite it as:
@software{faerie,
title = {Faerie: Fact Error Rate Evaluation Framework},
author = {Chonkie Inc.},
url = {https://github.com/chonkie-inc/faerie},
license = {MIT},
year = {2025-2026}
}
File details
Details for the file faerie_eval-0.1.0.tar.gz.
File metadata
- Download URL: faerie_eval-0.1.0.tar.gz
- Upload date:
- Size: 28.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0c79a38b8387b13d21c72653459195929a5db43c6d4a391ced4d529a5bd67197 |
| MD5 | 984053971914d48402896d814188184e |
| BLAKE2b-256 | 2055a2af10da365f9b615ee5125b90d778aa943f8d44f9b6d3ad96be6bc54b3a |
File details
Details for the file faerie_eval-0.1.0-py3-none-any.whl.
File metadata
- Download URL: faerie_eval-0.1.0-py3-none-any.whl
- Upload date:
- Size: 36.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7cc7467b9cd828ca43ef213945f438750fe47dfb11757bb1d78af8365386c399 |
| MD5 | ae237d8f868293f72ff287994b3d5566 |
| BLAKE2b-256 | 7e5a379638d969d916c2f1cb8b6034af1998b394f55a5fe4d6f2cb8608230e30 |