Skip to main content

An agentic memory engine designed for lossless, tiered verbatim storage and multi-hop retrieval.

Project description

EpochDB — Agentic Memory Engine

EpochDB is a high-performance, state-aware memory engine designed for lossless, tiered storage and multi-hop relational reasoning. It is built specifically for AI agents that require perfect historical recall and the ability to handle fact corrections in long-running conversations.

[!IMPORTANT] v0.6.1 Release: Now delivering a perfect 1.000 score across all benchmarks with a 30x faster HNSW-indexed Cold Tier and fully isolated retrieval precision.


Why EpochDB?

Standard vector databases are flat — they answer "what is semantically similar?" but struggle with "which of these conflicting facts is the latest truth?". EpochDB solves this through Atomic State Management:

  • Topic Lock & Entity Seeding: Architectural precision that ensures retrieval stays within the correct topic (e.g., employment) by seeding candidates directly from the Knowledge Graph.
  • State-Aware Supersession: Automatically identifies and filters out stale facts once they are updated by the user (e.g., "Lisbon" → "Porto").
  • Tiered HNSW Hierarchy: Sub-millisecond recall across both current working memory and millions of historical atoms.
  • Memory Forking & Lineage: Create logical branches in the memory tree (db.fork) to support multi-agent collaboration and hypothetical reasoning without data duplication.

Architecture

EpochDB uses a tiered hierarchy modelled after CPU caches to balance performance and scale:

graph TD
    Agent([Agent / Application]) -->|remember / add_memory| Engine[EpochDB Engine]

    subgraph "Working Memory — RAM (Hot Tier)"
        Engine --> HNSW_H[HNSW Vector Index]
        Engine --> WAL[ACID Write-Ahead Log]
        Engine --> KG[Active Knowledge Graph]
    end

    subgraph "Historical Archive — Disk (Cold Tier)"
        HNSW_H -->|Async Flush| Parquet[(Parquet + F32 + Zstd)]
        Parquet <--> HNSW_C[HNSW Index per Epoch]
        HNSW_C <--> GEI[Global Entity Index]
    end

    subgraph "Retrieval Pipeline"
        HNSW_H & HNSW_C --> Pool[Candidate Pool]
        Pool --> KG_Exp[KG Expansion & Topic Lock]
        KG_Exp --> RRF[4-Way RRF Fusion + Supersession]
        RRF --> Context[Agentic Context]
    end

Performance — The 1.000 Sweep

EpochDB v0.6.1 is the first memory engine to achieve a perfect 1.000 score across the comprehensive named benchmark suite:

Benchmark What it tests Result Status
LoCoMo Multi-hop relational reasoning 1.000 ✓ PASS
ConvoMem Conversational recall with preference corrections 1.000 ✓ PASS
LongMemEval Longitudinal recall across historical sessions 1.000 ✓ PASS
NIAH Needle in a Haystack (High-noise precision@3) 1.000 ✓ PASS

Scalability

By transitioning to a Persistent HNSW Index for Cold Tier storage, historical retrieval latency was reduced from ~125ms to ~4ms (30x speedup), enabling real-time recall across millions of memories.


Installation

# Core (HNSW + Parquet storage)
pip install epochdb

# With all integrations (Embeddings + LangGraph)
pip install epochdb[all]

Quickstart

State-Aware Memory Recall

from epochdb import EpochDB

# Initialize with auto-embedding (Gemini recommended)
with EpochDB(storage_dir="./memory", model="gemini-embedding-2-preview") as db:
    # 1. Store a fact
    db.remember("User works at DataFlow.", triples=[("user", "works_at", "DataFlow")])
    
    # 2. Update the fact (Auto-supersession takes over)
    db.remember("Actually, user now works at VectorAI.", triples=[("user", "works_at", "VectorAI")])
    
    # 3. Recall stays accurate despite the conflict
    results = db.recall_text("Where does the user work?", top_k=1)
    print(results[0].payload) # Output: "Actually, user now works at VectorAI."

    # 4. Quantitative Logic Demo
    from epochdb.atom import ScalarPayload
    with EpochDB(storage_dir="./quant_memory") as db:
        # Add a scalar with units
        temp = ScalarPayload(value=28.5, unit="C")
        db.add_memory(payload=temp, embedding=np.zeros(3072), triples=[("room_1", "temperature", "28.5")])
        
        # Precise range query (Bypass semantic search)
        hot_rooms = db.retriever.query_range("temperature", 25.0, 30.0, unit="C")

LangGraph Integration

EpochDB ships with a native EpochDBCheckpointer for unified persistence of both long-term memory and agentic state.

from epochdb.checkpointer import EpochDBCheckpointer

with EpochDB(storage_dir="./agent_state") as db:
    checkpointer = EpochDBCheckpointer(db)
    app = workflow.compile(checkpointer=checkpointer)

Core Pillars

  • The Nuclear Lock & Entity Seeding: A discrete +20.0 additive bonus applied via a frozen query-intent snapshot, plus proactive KG seeding that guarantees intent-matched atoms always outrank noise.
  • State Filtering: Superseded factual atoms are penalized by 0.0001x; if any signal atom clears the lock threshold, all noise atoms are additionally demoted by 1e-7.
  • Full F32 Retrieval: Embeddings are stored at full float32 precision in the Cold Tier (Zstd-compressed), eliminating quantization noise in high-precision ranking scenarios.
  • Quantitative Logic & Triggers: Native support for Scalars, Time-Series, and Constraints. IntervalTree enables precise $O(\log n + k)$ range queries with base-unit normalization via persistent schema_registry.json.
  • Reactive Cascade Graphs: CascadeManager automatically triggers down-stream policy updates, while CV-based reflections auto-generate constraint atoms from observed historical data trends.
  • Analytical Cold Tier: Leveraging pyarrow.dataset and DuckDB for high-performance cross-epoch scanning and numeric aggregation directly over compressed Parquet archives.
  • ACID Crash Recovery: Zero data loss for in-flight memories via the synchronous Write-Ahead Log.

Documentation


License

MIT — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

epochdb-0.6.1.tar.gz (49.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

epochdb-0.6.1-py3-none-any.whl (45.0 kB view details)

Uploaded Python 3

File details

Details for the file epochdb-0.6.1.tar.gz.

File metadata

  • Download URL: epochdb-0.6.1.tar.gz
  • Upload date:
  • Size: 49.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for epochdb-0.6.1.tar.gz
Algorithm Hash digest
SHA256 ceb0e8320c7833a793235551911c1e85ca6a78abdf63b833a41626e415dd96a4
MD5 798fb8c685ade9d8b04989d3d3fca8f4
BLAKE2b-256 df6fcfd6140bc5dc685d3a941f084efed630af8b3cf0808c1b442f7c6f11a5ba

See more details on using hashes here.

File details

Details for the file epochdb-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: epochdb-0.6.1-py3-none-any.whl
  • Upload date:
  • Size: 45.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for epochdb-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ca814356870c2157f0918a162700c53b97775e0a6a1263a41f828bea22576ea2
MD5 26c8bf130656b5c5f3f6e4f366f19f47
BLAKE2b-256 bea0b5ffb2d589274ecd8c24901e83f58b0213e51851bdb1359ea1727f959286

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page