Genuine AI epistemic self-assessment framework - Universal interface for single AI tracking
Empirica
Teaching AI to know what it knows—and what it doesn't
What is Empirica?
Empirica is an epistemic self-awareness framework that enables AI agents to genuinely understand the boundaries of their own knowledge. Instead of producing confident-sounding responses regardless of actual understanding, AI agents using Empirica can accurately assess what they know, identify gaps, and communicate uncertainty honestly.
The core insight: AI systems today lack functional self-awareness. They can't reliably distinguish between "I know this well" and "I'm guessing." Empirica provides the cognitive infrastructure to make this distinction measurable and actionable.
Why This Matters
The Problem: AI agents exhibit "confident ignorance"—they generate plausible-sounding responses about topics they don't actually understand. This leads to:
- Hallucinated facts presented as truth
- Wasted time investigating already-explored dead ends
- Knowledge lost between sessions
- No way to tell when an AI is genuinely confident vs. bluffing
The Solution: Empirica introduces epistemic vectors—quantified measures of knowledge state that AI agents track in real-time. These vectors emerged from observing what information actually matters when assessing cognitive readiness.
The 13 Foundational Vectors
These vectors weren't designed in a vacuum. They emerged from 600+ real working sessions across multiple AI systems (Claude, GPT-4, Gemini, Qwen, and others), with Claude serving as the primary development partner due to its reasoning capabilities.
The pattern proved universal: regardless of which AI system we tested, these same dimensions consistently predicted success or failure in complex tasks.
The Vector Space
| Tier | Vector | What It Measures |
|---|---|---|
| Gate | engagement | Is the AI actively processing or disengaged? |
| Foundation | know | Domain knowledge depth (≥ 0.70 = ready to act) |
| Foundation | do | Execution capability |
| Foundation | context | Access to relevant information |
| Comprehension | clarity | How clear is the understanding? |
| Comprehension | coherence | Do the pieces fit together? |
| Comprehension | signal | Signal-to-noise in available information |
| Comprehension | density | Information richness |
| Execution | state | Current working state |
| Execution | change | Rate of progress/change |
| Execution | completion | Task completion level |
| Execution | impact | Significance of the work |
| Meta | uncertainty | Explicit doubt tracking (≤ 0.35 = ready to act) |
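As a rough illustration, the table can be held in code as a mapping from tier to vector names, with one score per vector. This is a minimal sketch, not Empirica's actual data model: `TIERS`, `ALL_VECTORS`, `EpistemicState`, and the assumed 0.0 to 1.0 score range are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field

# Tier -> vector names, copied from the table above.
TIERS = {
    "Gate": ["engagement"],
    "Foundation": ["know", "do", "context"],
    "Comprehension": ["clarity", "coherence", "signal", "density"],
    "Execution": ["state", "change", "completion", "impact"],
    "Meta": ["uncertainty"],
}

ALL_VECTORS = [name for names in TIERS.values() for name in names]


@dataclass
class EpistemicState:
    """Hypothetical container: one score per vector (assumed range 0.0 to 1.0)."""

    scores: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject names that are not among the 13 vectors.
        unknown = set(self.scores) - set(ALL_VECTORS)
        if unknown:
            raise ValueError(f"unknown vectors: {sorted(unknown)}")
        # Reject scores outside the assumed range.
        for name, value in self.scores.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [0, 1]")

    def tier(self, tier_name):
        """Return the scores recorded for one tier."""
        return {n: self.scores[n] for n in TIERS[tier_name] if n in self.scores}


state = EpistemicState({"know": 0.72, "context": 0.60, "uncertainty": 0.30})
print(len(ALL_VECTORS))          # 13
print(state.tier("Foundation"))  # {'know': 0.72, 'context': 0.6}
```

Vectors that have not been scored are simply absent from the result, which keeps "not assessed" distinct from a score of zero.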
Why These Vectors?
Readiness Gate: Through empirical observation, we found that know ≥ 0.70 AND uncertainty ≤ 0.35 reliably predicts successful task execution. Below these thresholds, investigation is needed.
The Key Insight: The uncertainty vector is explicitly tracked because AI systems naturally underreport doubt. Making it a first-class metric forces honest assessment.
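The readiness gate described above reduces to a two-condition check. The sketch below uses the thresholds quoted in the text; the function name `readiness` and the return labels are assumptions for illustration, not part of Empirica's API.

```python
KNOW_MIN = 0.70         # threshold quoted in the text
UNCERTAINTY_MAX = 0.35  # threshold quoted in the text


def readiness(know: float, uncertainty: float) -> str:
    """Return "proceed" when both thresholds are met, else "investigate"."""
    if know >= KNOW_MIN and uncertainty <= UNCERTAINTY_MAX:
        return "proceed"
    return "investigate"


print(readiness(0.80, 0.20))  # proceed
print(readiness(0.80, 0.50))  # investigate (too much doubt)
print(readiness(0.55, 0.20))  # investigate (not enough knowledge)
```

Both conditions must hold: high knowledge with high uncertainty still routes to investigation, which is the point of tracking doubt as its own metric.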
Applications Across Industries
While the vectors emerged from software development work, they map to any domain requiring knowledge assessment:
| Industry | Primary Vectors | Use Case |
|---|---|---|
| Software Development | know, context, uncertainty, completion | Code review, architecture decisions, debugging |
| Research & Analysis | know, clarity, coherence, signal | Literature review, hypothesis testing |
| Healthcare | know, uncertainty, impact | Diagnostic confidence, treatment recommendations |
| Legal | context, clarity, coherence | Case analysis, precedent research |
| Education | know, do, completion | Learning assessment, curriculum design |
| Finance | know, uncertainty, impact | Risk assessment, investment analysis |
Why Software Development First?
Software engineering provides an ideal testbed because:
- Measurable outcomes - Code either works or it doesn't
- Complex knowledge states - Requires synthesizing documentation, code, tests, and context
- Session continuity - Projects span days/weeks with context loss between sessions
- Multi-agent potential - Team collaboration benefits from shared epistemic state
Empirica was battle-tested here before expanding to other domains.
Quick Start
For End Users
Visit getempirica.com for the guided setup experience with tutorials and support.
For Developers: One-Command Install
The installer sets up everything: Claude Code hooks, system prompts, environment configuration, and a demo project.
Linux / macOS
curl -fsSL https://raw.githubusercontent.com/Nubaeon/empirica/main/scripts/install.py | python3 -
Or download and run manually:
wget https://raw.githubusercontent.com/Nubaeon/empirica/main/scripts/install.py
python3 install.py
Windows (PowerShell)
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Nubaeon/empirica/main/scripts/install.py" -OutFile "install.py"
python install.py
What the Installer Does
- Installs Empirica via pip
- Sets up Claude Code hooks for automatic epistemic continuity
- Places CLAUDE.md in the correct location (~/.claude/CLAUDE.md)
- Configures environment variables for your shell
- Creates a demo project so you can try it immediately
- Optionally sets up Qdrant for semantic memory (local vector search)
Manual Installation
If you prefer manual setup:
# Install from PyPI
pip install empirica
# Or with all features
pip install "empirica[all]"
# MCP Server (for Claude Desktop, Cursor, etc.)
pip install empirica-mcp
# Initialize in your project
cd your-project
empirica project-init
⚠️ Important: System Prompt Required
Empirica's full epistemic workflow (CASCADE phases, calibration, Sentinel gates) requires a system prompt so that the AI understands the framework. The CLI tools work without one.
For manual installations, copy the system prompt:
# Create Claude Code config directory
mkdir -p ~/.claude
# Copy the system prompt (choose your AI)
curl -fsSL https://raw.githubusercontent.com/Nubaeon/empirica/main/docs/human/developers/system-prompts/CLAUDE.md \
  -o ~/.claude/CLAUDE.md
The installer handles this automatically. See System Prompts for prompts for other AI assistants (Copilot, etc.).
Homebrew (macOS)
brew tap nubaeon/tap
brew install empirica
Docker
# Standard image (Debian slim, ~414MB)
docker pull nubaeon/empirica:1.5.6
# Security-hardened Alpine image (~276MB, recommended)
docker pull nubaeon/empirica:1.5.6-alpine
# Run
docker run -it -v "$(pwd)/.empirica:/data/.empirica" nubaeon/empirica:1.5.6 /bin/bash
After Installation: Getting Started
Once installed, let Empirica teach you how it works:
Option 1: Interactive Onboarding (Recommended)
# Start the guided onboarding experience
empirica onboard
This walks you through creating your first session, understanding vectors, and logging your first finding.
Option 2: Ask the AI to Explain
If you're using Claude Code or another AI with Empirica installed:
"Explain how Empirica works using docs-explain"
"What are epistemic vectors and how do I use them?"
"Help me set up Empirica for my project"
The AI can query Empirica's documentation semantically and explain concepts tailored to your context.
Option 3: Explore Documentation
# Search documentation semantically
empirica docs-explain --topic "epistemic vectors"
empirica docs-explain --topic "CASCADE workflow"
empirica docs-explain --topic "session management"
# List all available topics
empirica docs-list
Option 4: Try the Demo Project
The installer creates a demo project at ~/empirica-demo/. Navigate there and follow the WALKTHROUGH.md:
cd ~/empirica-demo
cat WALKTHROUGH.md
Expanding Your Own Projects
Once you understand the basics, add epistemic foundations to your existing projects:
cd your-existing-project
empirica project-init
# Create your first session
empirica session-create --ai-id claude-code --output json
# Start tracking what you know
empirica preflight-submit -
Documentation
For Humans
Start here based on your role:
| Role | Start With | Then Read |
|---|---|---|
| End User | Getting Started | Empirica Explained Simply |
| Developer | Developer README | Claude Code Setup |
Documentation Structure:
docs/
├── human/ # Human-readable documentation
│ ├── end-users/ # Installation, concepts, troubleshooting
│ └── developers/ # Integration, system prompts, API
│ └── system-prompts/ # AI system prompts (Claude, Copilot, etc.)
│
└── architecture/ # Technical architecture (for AI context loading)
For AI Integration
If you're integrating Empirica into an AI system:
- System Prompts: docs/human/developers/system-prompts/
- MCP Server: empirica-mcp/ (Model Context Protocol integration)
- Architecture Docs: docs/architecture/ (AI-optimized technical reference)
Key Guides
| Guide | Purpose |
|---|---|
| CASCADE Workflow | The PREFLIGHT → CHECK → POSTFLIGHT loop |
| Epistemic Vectors Explained | Deep dive into all 13 vectors |
| CLI Reference | Complete command documentation |
| Storage Architecture | Four-layer data persistence |
How It Works
The CASCADE Workflow
Every significant task follows this loop:
PREFLIGHT ────────► CHECK ────────► POSTFLIGHT
    │                 │                 │
    │                 │                 │
 Baseline          Decision          Learning
Assessment           Gate             Delta
    │                 │                 │
"What do I       "Am I ready       "What did I
 know now?"        to act?"          learn?"
- PREFLIGHT: AI assesses its knowledge state before starting work.
- CHECK: Sentinel gate validates readiness (know ≥ 0.70, uncertainty ≤ 0.35).
- POSTFLIGHT: AI measures what it learned, creating a learning delta.
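The three phases can be sketched as a small control loop. This is a simplified illustration under stated assumptions, not Empirica's implementation: `run_cascade`, its callback parameters, and the `max_rounds` limit are hypothetical, and only the two gate thresholds come from the text.

```python
def run_cascade(assess, investigate, act, max_rounds=3):
    """Hypothetical PREFLIGHT -> CHECK -> POSTFLIGHT loop.

    assess() returns a dict with "know" and "uncertainty" scores,
    investigate() is called while the CHECK gate fails,
    act() performs the task once the gate passes.
    """
    baseline = assess()  # PREFLIGHT: baseline assessment
    current = baseline
    rounds = 0
    # CHECK: gate on the thresholds quoted above.
    while not (current["know"] >= 0.70 and current["uncertainty"] <= 0.35):
        if rounds == max_rounds:
            return {"acted": False, "delta": None}
        investigate()
        current = assess()
        rounds += 1
    act()
    final = assess()  # POSTFLIGHT: measure what changed since the baseline
    delta = {key: round(final[key] - baseline[key], 2) for key in baseline}
    return {"acted": True, "delta": delta}


# Simulated agent: each investigation raises know and lowers uncertainty.
sim = {"know": 0.50, "uncertainty": 0.50}


def assess():
    return dict(sim)


def investigate():
    sim["know"] = round(sim["know"] + 0.15, 2)
    sim["uncertainty"] = round(sim["uncertainty"] - 0.10, 2)


def act():
    sim["know"] = round(sim["know"] + 0.05, 2)


result = run_cascade(assess, investigate, act)
print(result)  # {'acted': True, 'delta': {'know': 0.35, 'uncertainty': -0.2}}
```

In this simulation the gate fails twice, two rounds of investigation bring the scores past the thresholds, and the learning delta is computed against the PREFLIGHT baseline rather than the last CHECK.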
Learning Compounds Across Sessions
Session 1: know=0.40 → know=0.65 (Δ +0.25)
↓ (findings persisted)
Session 2: know=0.70 → know=0.85 (Δ +0.15)
↓ (compound learning)
Session 3: know=0.82 → know=0.92 (Δ +0.10)
Each session starts higher because learnings persist. No more re-investigating the same questions.
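The arithmetic in this example can be checked with a few lines of Python. The helper names `learning_deltas` and `starts_rise` are made up for this sketch; the numbers are the ones shown above.

```python
# (start, end) "know" scores per session, taken from the example above.
sessions = [(0.40, 0.65), (0.70, 0.85), (0.82, 0.92)]


def learning_deltas(pairs):
    """Return the within-session gain for each (start, end) pair."""
    return [round(end - start, 2) for start, end in pairs]


def starts_rise(pairs):
    """True when every session starts higher than the one before it."""
    starts = [start for start, _ in pairs]
    return all(a < b for a, b in zip(starts, starts[1:]))


print(learning_deltas(sessions))  # [0.25, 0.15, 0.1]
print(starts_rise(sessions))      # True
```

Note that a session may start slightly below where the previous one ended (0.82 after 0.85 here) while still starting above the previous session's baseline.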
Live Metacognitive Signal
With Claude Code hooks enabled, you see epistemic state in your terminal:
[empirica] ⚡94% │ 🎯3 ❓12/5 │ POSTFLIGHT │ K:95% U:5% C:92% │ ✓ │ ✓ stable
What this tells you:
- ⚡94% — Overall epistemic confidence (⚡ high, 💡 good, 💫 uncertain, 🌑 low)
- 🎯3 ❓12/5 — Open goals (3) and unknowns (12 total, 5 blocking goals)
- POSTFLIGHT — CASCADE phase (PREFLIGHT → CHECK → POSTFLIGHT)
- K:95% U:5% C:92% — Knowledge, Uncertainty, Context scores
- ✓ / ⚠ / △ — Learning delta summary (net positive / net negative / neutral)
- ✓ stable — Drift indicator (✓ stable, ⚠ drifting, ✗ severe)
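If you want to consume the status line from a script, the K/U/C segment can be pulled out with a regular expression. This sketch assumes the format matches the single example shown above; the real output may vary, and `parse_scores` is a hypothetical helper, not an Empirica function.

```python
import re

# Matches the "K:95% U:5% C:92%" segment of the status line shown above.
SCORES = re.compile(r"K:(\d+)% U:(\d+)% C:(\d+)%")


def parse_scores(status_line):
    """Extract knowledge, uncertainty, and context as fractions, or None."""
    match = SCORES.search(status_line)
    if match is None:
        return None
    know, uncertainty, context = (int(group) / 100 for group in match.groups())
    return {"know": know, "uncertainty": uncertainty, "context": context}


line = "[empirica] ⚡94% │ 🎯3 ❓12/5 │ POSTFLIGHT │ K:95% U:5% C:92% │ ✓ │ ✓ stable"
print(parse_scores(line))  # {'know': 0.95, 'uncertainty': 0.05, 'context': 0.92}
```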
Built With Empirica
Projects using Empirica's epistemic foundations:
| Project | Description | Use Case |
|---|---|---|
| Docpistemic | Epistemic documentation system | Self-aware documentation that tracks what it explains well vs. poorly |
| Carapace | Defensive AI shell | Security-focused AI wrapper with epistemic safety gates |
| Empirica CRM | Customer relationship management | CRM where AI knows its confidence about customer insights |
Building something with Empirica? Open an issue to get listed here.
What's New in 1.5.6
- Qdrant Hardening — File-based fallback removed (#45), None guards on all 36 call sites, graceful degradation when no server
- Schema Migration Fix (#44) — CREATE INDEX runs after migrations that add columns, fixing crash on existing DBs
- project-embed Path Resolution (#46) — Resolves correct sessions.db from workspace.db, not CWD
- Instance Isolation — Closed transactions persist as project anchors for post-compact resolution
- transaction-adopt Fix — Same-instance adoption no longer loses the transaction file
Privacy & Data
Your data stays local:
- .empirica/: Local SQLite database (gitignored by default)
- .git/refs/notes/empirica/*: Epistemic checkpoints (local unless you push)
- Qdrant runs locally if enabled
No cloud dependencies. No telemetry. Your epistemic data is yours.
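Because the data lives in a local SQLite database, you can inspect it with Python's standard library. The sketch below is a hedged illustration: the `*.db` file pattern and the `list_tables` helper are assumptions, since this README does not document the database file names or schema. It opens each file read-only so inspection cannot alter your data.

```python
import sqlite3
from pathlib import Path


def list_tables(project_dir="."):
    """Map each *.db file under .empirica/ to its table names.

    The *.db pattern is an assumption; the README only says the
    directory holds a local SQLite database.
    """
    found = {}
    for db_path in sorted(Path(project_dir, ".empirica").rglob("*.db")):
        # Open read-only via a file: URI so nothing is modified.
        uri = db_path.resolve().as_uri() + "?mode=ro"
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            ).fetchall()
        finally:
            conn.close()
        found[str(db_path)] = [name for (name,) in rows]
    return found


if __name__ == "__main__":
    # Prints an empty dict when no .empirica/ directory exists here.
    print(list_tables("."))
```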
Community & Support
- Website: getempirica.com
- Issues: GitHub Issues
- Discussions: GitHub Discussions
License
MIT License — Maximum adoption, aligned with Empirica's transparency principles.
See LICENSE for details.
Author: David S. L. Van Assche
Version: 1.5.6
Turtles all the way down — built with its own epistemic framework, measuring what it knows at every step.