Measurement and calibration layer for AI — track what it knows, gate what it does

These details have not been verified by PyPI

Project description

Empirica

We Gave AI a Mirror. Now It Measures What It Believes.

Epistemic infrastructure for AI — measurement, memory, and calibration across sessions.

Empirica tracks what AI knows, gates what it does, and compounds learning across session boundaries. It measures the gap between what AI predicts and what's true — making AI agents measurably more reliable.

Training & Guides | CLI Reference | Architecture

Important: Empirica is an AI measurement framework. It has no cryptocurrency, token, coin, or blockchain component. Any token using the Empirica name (including "$EMPIRICA" on Solana) is unauthorized and not affiliated with this project or Empirica AI GmbH.

The Problem

A typical AI coding agent today has no self-awareness about what it knows:

Forgets between sessions — same questions, same dead ends, every time
Acts before understanding — edits your code without knowing the architecture
Can't tell you when it's guessing — no distinction between knowledge and confabulation
No audit trail — reasoning evaporates with the context window

What Empirica Does

Capability	What You Experience
Measures before acting	AI investigates your codebase before touching it. The Sentinel gate blocks edits until understanding is demonstrated
Remembers across sessions	Findings, dead-ends, and learnings persist in a 4-layer memory system. Session 3 starts where Session 2 left off
Prevents confident mistakes	The CHECK gate uses domain-aware thresholds scaled by criticality — cybersec/high is stricter than default/low
Shows confidence in real-time	Live statusline in your terminal: `[empirica] ⚡94% ↕70% │ 🎯3 │ POST 🔍92% │ K:95% C:92%`
Calibrates against reality	Three-vector model: self-assessed, observed (from deterministic checks), and AI-reasoned grounded state with rationale. Domain compliance loops iterate until all checks pass
Tracks your codebase	Temporal entity model auto-extracts functions, classes, and imports from every file edit — the AI knows what's alive and what's stale
Coordinates with peer AIs	Cross-Claude mesh via Cortex — peer AIs propose work, ECO accepts/declines, completion handshakes carry commit SHAs. A persistent listener wakes idle sessions on inbox events
Works through natural language	You describe tasks normally. The AI operates the measurement system automatically
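
The CHECK gate row above says thresholds depend on domain and criticality. As a rough illustration only, here is one way such a lookup could work. The table values and the names `GATE_THRESHOLDS`, `required_confidence`, and `gate_passes` are hypothetical and are not Empirica's actual configuration; the only detail taken from the text is that cybersec/high is stricter than default/low.

```python
# Hypothetical threshold table: (domain, criticality) -> minimum confidence.
# The numbers are invented for illustration.
GATE_THRESHOLDS = {
    ("default", "low"): 0.60,
    ("default", "high"): 0.75,
    ("cybersec", "low"): 0.75,
    ("cybersec", "high"): 0.90,
}


def required_confidence(domain: str, criticality: str) -> float:
    """Return the minimum confidence for a domain/criticality pair.

    Unknown domains fall back to the "default" row. An unknown
    criticality raises KeyError.
    """
    key = (domain, criticality)
    if key in GATE_THRESHOLDS:
        return GATE_THRESHOLDS[key]
    return GATE_THRESHOLDS[("default", criticality)]


def gate_passes(confidence: float, domain: str, criticality: str) -> bool:
    """True when the assessed confidence meets the threshold."""
    return confidence >= required_confidence(domain, criticality)


print(gate_passes(0.80, "default", "low"))    # True
print(gate_passes(0.80, "cybersec", "high"))  # False
```

The same 80% confidence clears the gate for low-criticality default work but not for high-criticality security work, which is the behaviour the table describes.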

How You Use It

You talk to your AI normally. Empirica works in the background:

You:      "Fix the authentication bug in the login flow"

Empirica: [AI investigates → logs findings → passes Sentinel gate → implements fix → measures learning]

You see:  ⚡87% ↕70% │ 🎯1 │ POST 🔍85% │ K:88% C:82% │ Δ +K

You direct. The AI measures.

Empirica's CLI has 150+ commands spanning investigation, measurement, calibration, and memory — like a cockpit instrument panel. You don't need to learn any of them. The AI reads the instruments, operates the controls, and reports back in natural language. The statusline gives you the flight data at a glance.

For power users, direct CLI access is always available: `empirica goals-list`, `empirica calibration-report`, `empirica project-search --task "..."`, and more.

Learn the full workflow: getempirica.com has interactive training, guides, and deep explanations of every concept.

Quick Start

Install + Claude Code (Recommended)

pip install empirica
empirica setup-claude-code

Then just start working. The hooks, Sentinel, system prompt, statusline, and MCP server are all configured automatically. See Claude Code Setup for details — including a "What the hooks inject" section for Claude sessions that want to see the contract (which hook fires when, what it adds to the AI's context, source pointers for every emission) before agreeing to install.

Already have Claude Code configured? Without --force, setup only writes files that don't already exist, so if you've already used Claude Code, the default internals stay in place and Empirica's hooks won't activate. Use --force to install Empirica's epistemic hooks in place of your default Claude Code settings:

empirica setup-claude-code --force

With --force, setup replaces Empirica's own hooks in settings.json and nothing else; hooks from other plugins (Railway, Superpowers, etc.) are preserved.
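
The --force behaviour described above (drop Empirica's old hooks, keep everyone else's, install the new set) can be sketched as a list merge. This is a simplified model, not the installer's code: the flat hook dictionaries, the command names, and the `looks_like_empirica` ownership test are all assumptions made for the example.

```python
def merge_hooks(existing, empirica_hooks, is_empirica):
    """Drop old Empirica hooks, keep everything else, then append the new set."""
    preserved = [hook for hook in existing if not is_empirica(hook)]
    return preserved + list(empirica_hooks)


# Hypothetical ownership test: a hook belongs to Empirica if its command
# mentions "empirica". The real installer may identify its hooks differently.
def looks_like_empirica(hook):
    return "empirica" in hook["command"]


# Invented hook entries, using a flattened shape for readability.
existing = [
    {"event": "PreToolUse", "command": "railway-check"},
    {"event": "PreToolUse", "command": "empirica-old-gate"},
]
new = [{"event": "PreToolUse", "command": "empirica-gate"}]

merged = merge_hooks(existing, new, looks_like_empirica)
print([hook["command"] for hook in merged])
# ['railway-check', 'empirica-gate']
```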

Alternative Installation Methods

Homebrew (macOS)

brew tap nubaeon/tap
brew install empirica
empirica setup-claude-code

Docker

# Security-hardened Alpine image (~276MB, recommended)
docker pull nubaeon/empirica:1.10.5-alpine

# Standard image (Debian slim, ~414MB)
docker pull nubaeon/empirica:1.10.5

# Run
docker run -it -v "$(pwd)/.empirica:/data/.empirica" nubaeon/empirica:1.10.5 /bin/bash

Manual / Other AI Platforms

pip install empirica
pip install empirica-mcp        # MCP Server (for Cursor, Cline, etc.)
cd your-project && empirica project-init

The CLI works standalone on any platform. The full epistemic workflow (epistemic transactions, Sentinel, calibration) requires loading the system prompt into your AI — the easiest path is empirica setup-claude-code, which wires the lean prompt into ~/.claude/empirica-system-prompt.md and references it from your ~/.claude/CLAUDE.md. See Claude Code Setup for details.

First Session

empirica onboard   # Interactive walkthrough of the full workflow

Or just start working — with Claude Code hooks active, the AI manages the epistemic workflow automatically.

The Measurement Architecture

Empirica works through nested abstraction layers:

Plan
 └── Transaction 1 (Goal A)
      ├── NOETIC: investigate, search, read → findings, unknowns, dead-ends
      ├── CHECK: Sentinel gate → proceed / investigate more
      ├── PRAXIC: implement, write, commit → goals completed
      └── POSTFLIGHT: measure learning delta → persists to memory
 └── Transaction 2 (Goal B, informed by T1's findings)
      └── ...

Plans decompose into transactions — one per goal or Claude Code task. Each transaction is a noetic-praxic loop: investigate first (noetic), then act (praxic), with the Sentinel gating the transition. Along the way, the AI collects and reads artifacts (findings, unknowns, assumptions, dead-ends, decisions) while using semantic search to surface relevant epistemic patterns and anti-patterns from the project's history. Top artifacts are ranked by confidence and fed into each project's MEMORY.md as a hot cache.
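
To make the "hot cache" idea concrete, the sketch below ranks artifacts by confidence and keeps the top few. It assumes a 0.0 to 1.0 confidence scale; the `Artifact` class, the `hot_cache` function, and the sample entries are hypothetical and do not reflect how Empirica stores or formats MEMORY.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    kind: str          # e.g. "finding", "unknown", "dead-end", "decision"
    text: str
    confidence: float  # assumed 0.0-1.0 scale


def hot_cache(artifacts, limit=3):
    """Return the `limit` highest-confidence artifacts, best first."""
    return sorted(artifacts, key=lambda a: a.confidence, reverse=True)[:limit]


# Invented sample data.
artifacts = [
    Artifact("finding", "Login flow validates tokens in middleware", 0.92),
    Artifact("dead-end", "Session cookie is not the cause", 0.70),
    Artifact("assumption", "Token expiry is 15 minutes", 0.40),
    Artifact("decision", "Patch the middleware, not the client", 0.85),
]

for artifact in hot_cache(artifacts):
    print(f"- [{artifact.kind}] {artifact.text} ({artifact.confidence:.0%})")
```

The low-confidence assumption is left out of the cache, so the next session starts from what is best supported.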

The Epistemic Transaction Cycle

PREFLIGHT ────────► CHECK ────────► POSTFLIGHT
    │                 │                  │
 Baseline         Sentinel           Learning
 Assessment        Gate               Delta
    │                 │                  │
 "What do I      "Am I ready      "What did I
  know now?"      to act?"         learn?"

PREFLIGHT: AI assesses its knowledge state before starting work.
CHECK: Sentinel gate validates readiness before allowing code edits.
POSTFLIGHT: AI measures what it learned, creating a delta that persists.
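
The three phases can be sketched as a small loop: record a baseline, investigate until the gate opens, act, then report the difference. This is a toy model under stated assumptions: the gate here checks only a `know` score against a 0.70 threshold (the value shown in the sample statusline), and `run_transaction` with its callables is a hypothetical name, not part of Empirica's API.

```python
def run_transaction(preflight, investigate, act, threshold=0.70):
    """Walk PREFLIGHT -> CHECK -> POSTFLIGHT for one goal.

    preflight:   dict of vector name -> score before work starts
    investigate: callable returning an updated score dict (noetic phase)
    act:         callable returning the final score dict (praxic phase)
    """
    state = dict(preflight)

    # CHECK: keep investigating until the "know" score clears the gate.
    rounds = 0
    while state["know"] < threshold:
        state = investigate(state)
        rounds += 1
        if rounds > 10:  # safety stop for the sketch
            raise RuntimeError("gate never passed")

    final = act(state)

    # POSTFLIGHT: learning delta relative to the PREFLIGHT baseline.
    delta = {name: round(final[name] - preflight[name], 2) for name in preflight}
    return final, delta


baseline = {"know": 0.40, "context": 0.50}
final, delta = run_transaction(
    baseline,
    # Each investigation round raises both scores a little.
    investigate=lambda s: {"know": s["know"] + 0.2, "context": s["context"] + 0.1},
    # Acting changes nothing in this toy example.
    act=lambda s: s,
)
print(delta)  # {'know': 0.4, 'context': 0.2}
```

Two investigation rounds are needed before `know` reaches the threshold, and the delta records how far each score moved from the baseline.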

Live Statusline

With Claude Code hooks enabled, you see the AI's epistemic state in real-time:

[empirica] ⚡94% ↕70% │ 🎯3 ❓12/5 │ POST 🔍92% │ K:95% C:92% │ Δ +K +C

Signal	Meaning
⚡94%	Overall epistemic confidence
↕70%	Sentinel threshold (know gate) — user-facing only
🎯3 ❓12/5	Open goals (3), unknowns (12 total, 5 blocking)
POST 🔍92%	Transaction phase + work state (🔍 investigating / 🔨 acting) with composite score
K:95% C:92%	Knowledge and Context vectors (color-coded by gap to threshold)
Δ +K +C	Learning delta (POSTFLIGHT only) — which vectors improved
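
If you want to log or chart these signals, the statusline can be parsed with a few regular expressions. The sketch below is based only on the sample line shown above; the `parse_statusline` function and its field names are hypothetical, and the real layout may differ between versions, so missing signals are returned as `None`.

```python
import re

STATUSLINE = "[empirica] ⚡94% ↕70% │ 🎯3 ❓12/5 │ POST 🔍92% │ K:95% C:92% │ Δ +K +C"


def parse_statusline(line: str) -> dict:
    """Extract the signals from a statusline string."""

    def number(pattern):
        match = re.search(pattern, line)
        return int(match.group(1)) if match else None

    unknowns = re.search(r"❓(\d+)/(\d+)", line)
    phase = re.search(r"(\w+) [🔍🔨]", line)
    return {
        # \ufe0f is an optional emoji variation selector.
        "confidence": number(r"⚡\ufe0f?(\d+)%"),
        "threshold": number(r"↕\ufe0f?(\d+)%"),
        "open_goals": number(r"🎯(\d+)"),
        "unknowns_total": int(unknowns.group(1)) if unknowns else None,
        "unknowns_blocking": int(unknowns.group(2)) if unknowns else None,
        "phase": phase.group(1) if phase else None,
        "composite": number(r"[🔍🔨](\d+)%"),
        "know": number(r"K:(\d+)%"),
        "context": number(r"C:(\d+)%"),
        # Vectors listed after the delta marker, e.g. ["K", "C"].
        "improved": re.findall(r"\+([A-Z])", line),
    }


signals = parse_statusline(STATUSLINE)
print(signals["confidence"], signals["threshold"], signals["improved"])
# 94 70 ['K', 'C']
```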

The 13 Epistemic Vectors

These vectors emerged from 600+ real working sessions across multiple AI systems. They measure the dimensions that consistently predict success or failure in complex tasks.

Tier	Vector	What It Measures
Gate	`engagement`	Is the AI actively processing or disengaged?
Foundation	`know`	Domain knowledge depth
Foundation	`do`	Execution capability
Foundation	`context`	Access to relevant information
Comprehension	`clarity`	How clear is the understanding?
Comprehension	`coherence`	Do the pieces fit together?
Comprehension	`signal`	Signal-to-noise in available information
Comprehension	`density`	Information richness
Execution	`state`	Current working state
Execution	`change`	Rate of progress/change
Execution	`completion`	Task completion level
Execution	`impact`	Significance of the work
Meta	`uncertainty`	Explicit doubt tracking

Deep dive: Epistemic Vectors Explained
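
For readers who think in data structures, the table above maps naturally onto a dictionary of tiers. The vector and tier names come from the table; everything else is an assumption for illustration, including the 0.0 to 1.0 score scale, the `validate_assessment` and `tier_averages` helpers, and the idea of averaging within a tier (Empirica may aggregate differently).

```python
VECTOR_TIERS = {
    "Gate": ["engagement"],
    "Foundation": ["know", "do", "context"],
    "Comprehension": ["clarity", "coherence", "signal", "density"],
    "Execution": ["state", "change", "completion", "impact"],
    "Meta": ["uncertainty"],
}

ALL_VECTORS = [name for names in VECTOR_TIERS.values() for name in names]


def validate_assessment(scores: dict) -> dict:
    """Check that an assessment names all 13 vectors with scores in [0, 1]."""
    missing = [name for name in ALL_VECTORS if name not in scores]
    unexpected = [name for name in scores if name not in ALL_VECTORS]
    out_of_range = [n for n, v in scores.items() if not 0.0 <= v <= 1.0]
    if missing or unexpected or out_of_range:
        raise ValueError(
            f"missing={missing} unexpected={unexpected} out_of_range={out_of_range}"
        )
    return scores


def tier_averages(scores: dict) -> dict:
    """Average the scores within each tier."""
    return {
        tier: sum(scores[name] for name in names) / len(names)
        for tier, names in VECTOR_TIERS.items()
    }


scores = validate_assessment({name: 0.5 for name in ALL_VECTORS})
print(len(ALL_VECTORS))                 # 13
print(tier_averages(scores)["Foundation"])  # 0.5
```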

How It Works With Claude Code

Empirica doesn't replace or reinvent anything Claude Code already does. Claude Code owns tasks, plans, memory, and projects. Empirica adds the measurement layer on top:

Claude Code Does	Empirica Adds
Task management	Epistemic goals with measurable completion
Plan mode	Investigation phase with Sentinel gating — no edits until understanding is verified
MEMORY.md	Auto-curated hot cache ranked by epistemic confidence
Context window	4-layer memory that survives compaction and persists across sessions
Code editing	Grounded calibration — was the AI's confidence justified by test results?
Subagent spawning	Bounded autonomy with delegated work counting and budget tracking

The result: Claude Code's native capabilities, enhanced with measurement, gating, and calibration feedback that compounds over time.
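
"Grounded calibration" boils down to comparing what the AI claimed with what checks later showed. A minimal sketch of that comparison follows; the function names and the sample numbers are hypothetical, and the simple subtraction used here is an assumption, not necessarily the formula Empirica applies.

```python
def calibration_gaps(self_assessed: dict, observed: dict) -> dict:
    """Per-vector gap: positive means overconfident, negative underconfident."""
    shared = self_assessed.keys() & observed.keys()
    return {
        name: round(self_assessed[name] - observed[name], 3)
        for name in sorted(shared)
    }


def mean_absolute_gap(gaps: dict) -> float:
    """Single summary number: 0.0 means perfectly calibrated."""
    if not gaps:
        return 0.0
    return round(sum(abs(gap) for gap in gaps.values()) / len(gaps), 3)


# Illustrative numbers only.
self_assessed = {"know": 0.90, "completion": 0.80}
observed = {"know": 0.70, "completion": 0.85}  # e.g. derived from test results

gaps = calibration_gaps(self_assessed, observed)
print(gaps)                     # {'completion': -0.05, 'know': 0.2}
print(mean_absolute_gap(gaps))  # 0.125
```

Here the AI overestimated its knowledge by 20 points and slightly underestimated completion. Tracking that gap over many transactions is what lets calibration feedback compound.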

Cross-AI Mesh

Empirica isn't just per-session measurement — multiple Claude sessions across projects can coordinate as peers. The mesh runs on top of Empirica Cortex (proprietary serving layer):

empirica AI ──cortex_propose──► ECO Accept/Decline ──► outreach AI wakes
                                                             │
                                       cortex_complete_proposal (commit SHA)
                                                             │
empirica AI wakes ◄─────── outbox/completed event ───────────┘

Capability	What it does
`cortex_propose` (two flavors)	`collab_brief` is auto-accepted (FYI / question / discussion). Code change / architecture / investigation requests are ECO-gated — they wait for an Accept/Decline decision before the target AI acts
`empirica mailbox reply`	One verb does `cortex_propose` + `cortex_complete_proposal` atomically — closes the AI-to-AI handshake in a single step instead of two
Persistent listener service	systemd-user / launchd daemon holds an ntfy stream open. Idle sessions wake the moment a peer's proposal is decided, not on next user prompt
Canonical loops	`cortex-mailbox-poll` (30s adaptive) and `message-cleanup` (daily git-notes prune) auto-install per AI — no per-project config needed
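
The proposal lifecycle in the table can be read as a small state machine: `collab_brief` proposals start accepted, everything else waits for an Accept/Decline decision, and completion records a commit SHA. The sketch below models only that logic. The `Proposal` class, the state names, and the `code_change` kind label are hypothetical and are not Cortex's actual API.

```python
from enum import Enum


class ProposalState(Enum):
    PENDING = "pending"      # waiting for an ECO decision
    ACCEPTED = "accepted"
    DECLINED = "declined"
    COMPLETED = "completed"


AUTO_ACCEPTED_KINDS = {"collab_brief"}


class Proposal:
    def __init__(self, kind: str, summary: str):
        self.kind = kind
        self.summary = summary
        self.commit_sha = None
        # collab_brief skips the gate; everything else waits for a decision.
        self.state = (
            ProposalState.ACCEPTED
            if kind in AUTO_ACCEPTED_KINDS
            else ProposalState.PENDING
        )

    def decide(self, accept: bool) -> None:
        if self.state is not ProposalState.PENDING:
            raise ValueError(f"cannot decide a proposal in state {self.state.name}")
        self.state = ProposalState.ACCEPTED if accept else ProposalState.DECLINED

    def complete(self, commit_sha: str) -> None:
        if self.state is not ProposalState.ACCEPTED:
            raise ValueError("only accepted proposals can be completed")
        self.commit_sha = commit_sha
        self.state = ProposalState.COMPLETED


brief = Proposal("collab_brief", "FYI: auth module was refactored")
change = Proposal("code_change", "Please update the client to the new API")
print(brief.state.name, change.state.name)  # ACCEPTED PENDING

change.decide(accept=True)
change.complete(commit_sha="abc1234")  # placeholder SHA
print(change.state.name, change.commit_sha)  # COMPLETED abc1234
```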

The browser-side ECO surface (Accept/Decline, inbox triage, publish review) lives in the proprietary Empirica Extension.

Practice Model + Entity Graph (1.10.0)

Empirica's workspace stores entities (projects, contacts, organisations, engagements, users) in entity_registry with typed edges in entity_memberships. The Practice Model frames this consistently:

Term	Maps to
Practitioner	the AI working on the project (you)
Practice	the empirica project itself
Agent	a subagent spawned during the work

Four CLI verbs query the graph without raw SQL:

empirica entity-list [--type project|contact|organization|engagement|user]
empirica entity-show <type:id>          # full record + incoming/outgoing edges
empirica entity-walk <type:id> --depth 3 # BFS membership graph, cycle-safe
empirica entity-search "query" [--type T]

All four are read-only and support --output json. They back cross-project orchestration, CRM workflows, and the entity-aware POSTFLIGHT retrospective.
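
The entity-walk verb is described as a cycle-safe BFS with a depth limit. For readers unfamiliar with the technique, here is a minimal standalone sketch of a depth-limited breadth-first walk over a membership graph. It does not call Empirica; the in-memory `edges` dictionary, the sample identifiers, and the `entity_walk` function are hypothetical stand-ins for the real tables.

```python
from collections import deque


def entity_walk(edges: dict, start: str, depth: int = 3) -> list:
    """Breadth-first walk from `start`, at most `depth` hops, visiting each
    entity once so that membership cycles cannot loop forever."""
    visited = {start}
    order = [(start, 0)]
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the depth limit
        for neighbour in edges.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                order.append((neighbour, dist + 1))
                queue.append((neighbour, dist + 1))
    return order


# Invented sample graph using the type:id form shown above.
edges = {
    "project:1": ["organization:7", "contact:3"],
    "organization:7": ["engagement:2"],
    "engagement:2": ["project:1"],  # cycle back to the start
    "contact:3": [],
}
print(entity_walk(edges, "project:1", depth=2))
# [('project:1', 0), ('organization:7', 1), ('contact:3', 1), ('engagement:2', 2)]
```

The `visited` set is what makes the walk cycle-safe: the edge from `engagement:2` back to `project:1` is seen but never followed a second time.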

Platform Support

Platform	Integration Level	What You Get
Claude Code	Full (production)	Hooks, Sentinel gate, skills, agents, statusline, MCP
Cursor, Cline	MCP server	Epistemic transaction workflow, memory, calibration via MCP tools
Gemini CLI, Copilot	Experimental	System prompt + CLI
Any AI	CLI + prompt	Full measurement via CLI commands and system prompt

Documentation & Training

Resource	What It Covers
getempirica.com	Training course, interactive guides, deep explanations
Natural Language Guide	How to collaborate with AI using Empirica
Getting Started	First-time setup and concepts
CLI Reference	All 150+ commands documented
Architecture	Technical reference for contributors
Claude Code Setup	Install + system prompt + plugin wiring
Changelog	Full release history — every version since 1.0
Upgrade to 1.10	Migration guide for the `subtask` → `task` CLI rename

The Empirica Ecosystem

Project	Description	Status
Empirica	Core measurement system — epistemic transactions, Sentinel, calibration, 13 vectors	Open source
Empirica Iris	Epistemic browser automation with SVG spatial indexing — Sentinel gating for visual interactions	Open source
Docpistemic	Epistemic documentation coverage assessment — know what your docs know	Open source
Breadcrumbs	Survive context compacts with git notes — dead simple session continuity	Open source
Empirica Cortex	Cross-project intelligence layer — serves verified predictions and accumulated learnings to condition future work	Proprietary
Empirica Workspace	Entity Knowledge Graph, Epistemic Prompt Engine, CRM, portfolio dashboard	Proprietary
Empirica Extension	Chrome extension — desktop face of the mesh. ECO Accept/Decline, inbox/outbox triage, publish review, conversation extraction from Claude.ai / ChatGPT / Gemini / Grok	Proprietary

Building something with Empirica? Open an issue to get listed.

What's New in 1.10.4

Windows: every hook failed on every event — fixed (#111) — setup-claude-code now writes forward-slash hook paths (Git Bash was eating the backslashes)
Listener replay storms fixed — loop_fires.log rotates by rename, so the wake-Monitors stop re-firing duplicate events across the mesh
gh run/gh workflow reads un-gated in the Sentinel — CI-status checks no longer need a CHECK gate
Decay recency extended to lessons + eidetic — the read-time recency rerank now covers lessons + eidetic facts (longevity modulator), not just findings

Privacy & Data

Your data stays local:

.empirica/ — Local SQLite database (gitignored by default)
.git/refs/notes/empirica/* — Epistemic checkpoints (local unless you push)
Qdrant runs locally if enabled

No cloud dependencies. No telemetry. Your epistemic data is yours.

Community & Support

Website: getempirica.com
Issues: GitHub Issues
Discussions: GitHub Discussions

License

MIT License — see LICENSE for details.

Author: David S. L. Van Assche
Version: 1.10.5

Turtles all the way down — built with its own epistemic framework, measuring what it knows at every step.

Project details

Release history

1.12.13

Jul 4, 2026

1.12.12

Jul 3, 2026

1.12.11

Jul 3, 2026

1.12.10

Jul 2, 2026

1.12.9

Jun 30, 2026

1.12.8

Jun 29, 2026

1.12.7

Jun 26, 2026

1.12.6

Jun 24, 2026

1.12.5

Jun 23, 2026

1.12.4

Jun 21, 2026

1.12.3

Jun 21, 2026

1.12.2

Jun 19, 2026

1.12.1

Jun 16, 2026

1.12.0

Jun 14, 2026

1.11.11

Jun 9, 2026

1.11.10

Jun 8, 2026

1.11.9

Jun 5, 2026

1.11.8

Jun 3, 2026

1.11.7

Jun 3, 2026

1.11.6

Jun 3, 2026

1.11.5

Jun 3, 2026

1.11.4

Jun 3, 2026

1.11.3

Jun 3, 2026

1.11.2

Jun 1, 2026

1.11.1

Jun 1, 2026

1.11.0

Jun 1, 2026

1.10.6

May 31, 2026

1.10.5 (this version)

May 31, 2026

1.10.4

May 29, 2026

1.10.3

May 28, 2026

1.10.2

May 27, 2026

1.10.1

May 26, 2026

1.10.0

May 26, 2026

1.9.11

May 25, 2026

1.9.10

May 21, 2026

1.9.9

May 18, 2026

1.9.8

May 17, 2026

1.9.7

May 17, 2026

1.9.6

May 16, 2026

1.9.5

May 14, 2026

1.9.4

May 13, 2026

1.9.3

May 12, 2026

1.9.2

May 8, 2026

1.9.1

May 6, 2026

1.9.0

May 5, 2026

1.8.20

May 4, 2026

1.8.19

May 2, 2026

1.8.18

May 2, 2026

1.8.17

Apr 30, 2026

1.8.16

Apr 29, 2026

1.8.15

Apr 29, 2026

1.8.14

Apr 28, 2026

1.8.13

Apr 27, 2026

1.8.12

Apr 25, 2026

1.8.11

Apr 24, 2026

1.8.10

Apr 23, 2026

1.8.9

Apr 22, 2026

1.8.8

Apr 18, 2026

1.8.7

Apr 17, 2026

1.8.4

Apr 15, 2026

1.8.2

Apr 13, 2026

1.8.1

Apr 10, 2026

1.8.0

Apr 9, 2026

1.7.13

Apr 8, 2026

1.7.12

Apr 7, 2026

1.7.11

Apr 6, 2026

1.7.10

Apr 6, 2026

1.7.9

Apr 5, 2026

1.7.8

Apr 5, 2026

1.7.7

Apr 4, 2026

1.7.6

Apr 4, 2026

1.7.5

Apr 3, 2026

1.7.4

Apr 2, 2026

1.7.3

Mar 29, 2026

1.7.2

Mar 27, 2026

1.7.1

Mar 26, 2026

1.7.0

Mar 26, 2026

1.6.23

Mar 23, 2026

1.6.22

Mar 23, 2026

1.6.21

Mar 23, 2026

1.6.20

Mar 22, 2026

1.6.19

Mar 22, 2026

1.6.18

Mar 22, 2026

1.6.17

Mar 22, 2026

1.6.16

Mar 22, 2026

1.6.15

Mar 20, 2026

1.6.14

Mar 20, 2026

1.6.13

Mar 20, 2026

1.6.12

Mar 20, 2026

1.6.11

Mar 19, 2026

1.6.10

Mar 18, 2026

1.6.7

Mar 16, 2026

1.6.6

Mar 16, 2026

1.6.5

Mar 16, 2026

1.6.4

Mar 13, 2026

1.6.3

Mar 10, 2026

1.6.2

Mar 10, 2026

1.6.1

Mar 4, 2026

1.6.0

Mar 1, 2026

1.5.9

Feb 27, 2026

1.5.8

Feb 25, 2026

1.5.7

Feb 23, 2026

1.5.6

Feb 22, 2026

1.5.5

Feb 21, 2026

1.5.4

Feb 20, 2026

1.5.3

Feb 18, 2026

1.5.2

Feb 16, 2026

1.5.1

Feb 13, 2026

1.5.0

Feb 8, 2026

1.4.2

Jan 25, 2026

1.4.1

Jan 23, 2026

1.4.0

Jan 21, 2026

1.3.3

Jan 15, 2026

1.3.2

Jan 13, 2026

1.3.1

Jan 13, 2026

1.3.0

Jan 9, 2026

1.2.4

Jan 6, 2026

1.2.3

Jan 2, 2026

1.2.2

Jan 1, 2026

1.2.1

Jan 1, 2026

1.2.0

Dec 30, 2025

1.1.3

Dec 29, 2025

1.1.2

Dec 29, 2025

1.1.1

Dec 29, 2025

1.1.0

Dec 28, 2025

1.0.5

Dec 22, 2025

1.0.3

Dec 19, 2025

1.0.2

Dec 19, 2025

1.0.1

Dec 18, 2025

1.0.0

Dec 18, 2025

1.0.0b0 pre-release

Dec 18, 2025

Download files

Source Distribution

empirica-1.10.5.tar.gz (2.0 MB)

Uploaded May 31, 2026 Source

Built Distribution

empirica-1.10.5-py3-none-any.whl (2.0 MB)

Uploaded May 31, 2026 Python 3

File details

Details for the file empirica-1.10.5.tar.gz.

File metadata

Download URL: empirica-1.10.5.tar.gz
Upload date: May 31, 2026
Size: 2.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for empirica-1.10.5.tar.gz
Algorithm	Hash digest
SHA256	`a08365aff4098fee9952a07155c40d97c42d00ee44e459e2921ed7eeeb67bdeb`
MD5	`b487980c301cb01f6e701c8ed8863e2d`
BLAKE2b-256	`4fe9e8969496cc6b277e9e67bedeede18c772eeccc7b168bc69d8e7c939da2e5`

File details

Details for the file empirica-1.10.5-py3-none-any.whl.

File metadata

Download URL: empirica-1.10.5-py3-none-any.whl
Upload date: May 31, 2026
Size: 2.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for empirica-1.10.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ac868e004a11e8571e11f2e89e7d5c25b9241fbd801d305aa4b715c8754e662d`
MD5	`c68ba1887617f373cf4303d2d2c7ba38`
BLAKE2b-256	`da611e25a6857d95ddfc8d9a53d44cd4072eeef8a5f65b2ee039084f32afb02f`
