YantrikDB — A Cognitive Memory Engine for Persistent AI Systems

The memory engine for AI that actually knows you.

The Problem

Current AI systems have no coherent memory architecture. They bolt together generic databases — vector stores, knowledge graphs, key-value caches — none of which were designed for how cognition works. This makes persistent, evolving AI relationships impossible at scale.

Today's AI memory is:

Store everything → Embed → Retrieve top-k → Inject into context → Hope it helps.

That pipeline does not scale cognitively: nothing in it ages, consolidates, or reconciles what has been stored.

The Thesis

AI needs a purpose-built memory engine with native support for:

  • Temporal decay — memories age and fade like human memory
  • Semantic consolidation — patterns are extracted, redundancy is compressed
  • Conflict resolution — contradictions are detected and resolved conversationally
  • Multi-device replication — local-first CRDT-based sync across devices
  • Proactive cognition — background processing that gives AI genuine reasons to initiate conversation

All in a single embedded engine — no server, no network hops, no stitching together five databases.

Why Not Use Existing Solutions?

| Solution | What it does | What it lacks |
| --- | --- | --- |
| Vector DBs (Pinecone, Weaviate, Milvus) | High-dimensional nearest-neighbor lookup | No time awareness, no causality, no compression, no self-organization |
| Knowledge Graphs (Neo4j) | Structured relations, entity linking | Hard to scale dynamically, poor for fuzzy memory, not adaptive |
| Memory Frameworks (LangChain, LlamaIndex) | Retrieval wrappers, context injection | Not true memory architectures — just middleware |

Human memory is hierarchical, compressed, contextual, self-updating, emotionally weighted, time-aware, and predictive. No existing system addresses this holistically.

Architecture

Design Principles

  • Embedded, not client-server — single file, no server process (like SQLite)
  • Local-first, sync-native — works offline, syncs when connected
  • Cognitive operations, not SQL: record(), recall(), relate(), not SELECT
  • Living system, not passive store — does work between conversations

Unified Index Architecture

Five index types in one engine, sharing the same memory pages, WAL, and query planner:

┌─────────────────────────────────────────────────────┐
│                  YantrikDB Engine                   │
│                                                     │
│  ┌───────────┬───────────┬───────────┬───────────┐ │
│  │  Vector   │  Graph    │ Temporal  │   Decay   │ │
│  │  Index    │  Index    │  Index    │   Heap    │ │
│  │  (HNSW)   │ (Entities)│ (Events)  │(Priority) │ │
│  └───────────┴───────────┴───────────┴───────────┘ │
│  ┌───────────┐                                      │
│  │ Key-Value │                                      │
│  │  Store    │                                      │
│  └───────────┘                                      │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │         Write-Ahead Log (WAL)                 │  │
│  └───────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────┐  │
│  │      Replication Log (append-only)            │  │
│  │      CRDT-based conflict resolution           │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘
  1. Vector Index (HNSW) — semantic similarity search across memories
  2. Graph Index — entity relationships ("Max is user's dog", "user works at Meta")
  3. Temporal Index — time-series style, "what happened around Tuesday"
  4. Decay Heap — priority queue with importance scores that degrade over time
  5. Key-Value Store — fast facts ("user's name is Pranab")
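
To make the Decay Heap idea concrete, here is a minimal sketch of a priority queue whose scores shrink with age. It is not YantrikDB's implementation: the exponential half-life formula, the `HALF_LIFE_DAYS` constant, and the names `decayed_importance`, `weakest_first`, and `prune` are all assumptions chosen for illustration.

```python
import heapq
import math
from dataclasses import dataclass, field

HALF_LIFE_DAYS = 30.0  # assumed half-life, not a YantrikDB default


def decayed_importance(importance: float, age_days: float,
                       half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: the score halves every `half_life_days`."""
    return importance * math.pow(0.5, age_days / half_life_days)


@dataclass(order=True)
class HeapEntry:
    score: float                           # only field used for ordering
    memory_id: str = field(compare=False)


def weakest_first(memories: dict) -> list:
    """Build a min-heap of entries keyed on decayed score.

    `memories` maps memory_id -> (importance, age_days).
    """
    heap = [HeapEntry(decayed_importance(imp, age), mid)
            for mid, (imp, age) in memories.items()]
    heapq.heapify(heap)
    return heap


def prune(heap: list, threshold: float) -> list:
    """Pop and return the ids of entries whose score is below `threshold`."""
    pruned = []
    while heap and heap[0].score < threshold:
        pruned.append(heapq.heappop(heap).memory_id)
    return pruned
```

Scores in this sketch are a snapshot taken when the heap is built; a long-running engine would need to recompute or lazily re-score entries as time passes.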

Memory Types

Inspired by cognitive science (Tulving's taxonomy):

| Type | What it stores | Example |
| --- | --- | --- |
| Episodic | Events, experiences with context | "User had a rough day at work on Feb 20" |
| Semantic | Facts, knowledge, abstractions | "User is a software engineer who likes AI" |
| Procedural | Strategies, behaviors, what worked | "User prefers concise answers with code examples" |
| Emotional | Valence weighting on memories | "Dog's death → high emotional weight → never forget" |

Core Operations

yantrikdb.record(memory, importance=0.8, emotion="frustrated")
yantrikdb.recall("What does the user feel about their job?")
yantrikdb.relate("user.job", "user.stress", strength=0.7)
yantrikdb.consolidate(topic="user.career", since="30d")
yantrikdb.decay(threshold=0.1)       # prune low-importance memories
yantrikdb.forget(memory_id)          # explicit removal
yantrikdb.conflict(memory_a, memory_b)  # flag contradiction
yantrikdb.resolve(conflict_id, resolution)  # user-driven resolution

# Session tracking: memories auto-link to the active session
yantrikdb.session_start(namespace, client_id)
yantrikdb.session_end(session_id)    # computes summary, topics, valence

# Temporal awareness
yantrikdb.stale(days=14)             # forgotten high-importance memories
yantrikdb.upcoming(days=7)           # memories with approaching deadlines
yantrikdb.entity_profile("Alice")    # valence, domains, frequency, trend
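
The operations above are listed without their semantics, so the following toy store may help show what record, recall, relate, and forget could mean in practice. It is a hypothetical stand-in written for this README, not the yantrikdb package: `ToyMemoryStore`, its method signatures, and the word-overlap ranking (used here in place of vector search) are all assumptions.

```python
import re
from itertools import count
from typing import Optional


def _tokens(text: str) -> set:
    """Lowercased word set; a real engine would use embeddings instead."""
    return set(re.findall(r"\w+", text.lower()))


class ToyMemoryStore:
    """Hypothetical in-memory stand-in, NOT the yantrikdb package API."""

    def __init__(self) -> None:
        self.memories = {}        # memory_id -> record dict
        self.edges = {}           # (source, target) -> strength
        self._next_id = count(1)

    def record(self, text: str, importance: float = 0.5,
               emotion: Optional[str] = None) -> int:
        memory_id = next(self._next_id)
        self.memories[memory_id] = {
            "text": text, "importance": importance, "emotion": emotion,
        }
        return memory_id

    def recall(self, query: str, top_k: int = 3) -> list:
        """Rank memories by Jaccard word overlap weighted by importance."""
        q = _tokens(query)
        scored = []
        for memory in self.memories.values():
            t = _tokens(memory["text"])
            union = q | t
            overlap = len(q & t) / len(union) if union else 0.0
            if overlap > 0:
                scored.append((overlap * memory["importance"], memory["text"]))
        scored.sort(reverse=True)
        return [text for _, text in scored[:top_k]]

    def relate(self, source: str, target: str, strength: float) -> None:
        self.edges[(source, target)] = strength

    def forget(self, memory_id: int) -> None:
        self.memories.pop(memory_id, None)
```

Weighting overlap by importance means a highly important memory can outrank a slightly closer but trivial one, which is the behavior the importance parameter suggests.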

Conflict Resolution — Human-in-the-Loop

When synced devices produce contradictory memories, YantrikDB doesn't guess. It creates a conflict segment — a first-class data structure:

┌──────────────────────────────────────────┐
│            Conflict Segment              │
│                                          │
│  conflict_id:  c_0042                    │
│  type:         identity_fact             │
│  priority:     high                      │
│  memory_a:     "works at Google" (phone) │
│  memory_b:     "works at Meta" (laptop)  │
│  status:       pending_resolution        │
│  strategy:     ask_user                  │
│  resolved_by:  null                      │
│  resolution:   null                      │
└──────────────────────────────────────────┘
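
As a rough illustration, the conflict segment could be modeled as a plain record. The field names below mirror the diagram; the `resolve` method and the `"resolved"` status value are assumptions, since the source only shows the pending state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConflictSegment:
    conflict_id: str
    type: str
    priority: str
    memory_a: str
    memory_b: str
    status: str = "pending_resolution"
    strategy: str = "ask_user"
    resolved_by: Optional[str] = None
    resolution: Optional[str] = None

    def resolve(self, resolution: str, resolved_by: str = "user") -> None:
        """Record the outcome; 'resolved' is an assumed status name."""
        self.resolution = resolution
        self.resolved_by = resolved_by
        self.status = "resolved"


segment = ConflictSegment(
    conflict_id="c_0042",
    type="identity_fact",
    priority="high",
    memory_a='"works at Google" (phone)',
    memory_b='"works at Meta" (laptop)',
)
```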

Resolution happens conversationally, not programmatically:

"Oh by the way — last month you mentioned something about Meta. Did you end up switching from Google?"

Conflicts are triaged by priority:

| Conflict Type | Action |
| --- | --- |
| Critical identity facts | Ask immediately |
| Preferences that changed | Ask naturally in conversation |
| Minor contradictions | Keep both, resolve lazily |
| Temporal conflicts | Prefer most recent, flag if uncertain |
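
The triage rules can be sketched as a lookup plus one small rule for temporal conflicts. The enum names, the `TRIAGE` mapping, and the one-day `min_gap` used to decide when two timestamps are "too close to call" are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from enum import Enum


class ConflictType(Enum):
    CRITICAL_IDENTITY = "critical_identity"
    PREFERENCE_CHANGE = "preference_change"
    MINOR = "minor"
    TEMPORAL = "temporal"


# Actions copied from the triage table; keys are hypothetical labels.
TRIAGE = {
    ConflictType.CRITICAL_IDENTITY: "ask immediately",
    ConflictType.PREFERENCE_CHANGE: "ask naturally in conversation",
    ConflictType.MINOR: "keep both, resolve lazily",
    ConflictType.TEMPORAL: "prefer most recent, flag if uncertain",
}


def resolve_temporal(a: tuple, b: tuple,
                     min_gap: timedelta = timedelta(days=1)) -> tuple:
    """Pick the newer of two (text, recorded_at) memories.

    Returns (winning_text, uncertain). `uncertain` is True when the two
    timestamps are closer than `min_gap`, an assumed threshold.
    """
    newer = a if a[1] >= b[1] else b
    uncertain = abs(a[1] - b[1]) < min_gap
    return newer[0], uncertain
```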

Multi-Device Sync Protocol

YantrikDB is local-first with CRDT-based replication:

┌──────────────────────┐       ┌──────────────────────┐
│   Device A (Phone)   │       │  Device B (Laptop)   │
│                      │       │                      │
│  ┌────────────────┐  │ sync  │  ┌────────────────┐  │
│  │YantrikDB Engine│◄─┼───────┼─►│YantrikDB Engine│  │
│  └────────────────┘  │       │  └────────────────┘  │
│  ┌────────────────┐  │       │  ┌────────────────┐  │
│  │ Replication    │  │       │  │ Replication    │  │
│  │ Log            │  │       │  │ Log            │  │
│  └────────────────┘  │       │  └────────────────┘  │
└──────────────────────┘       └──────────────────────┘
         │                              │
         └──────────┬───────────────────┘
                    │
            P2P / Relay / BLE
        (encrypted, zero-knowledge)
  • Append-only replication log — every write, consolidation, and decay event is logged
  • CRDT merging — graph edges/nodes and facts merge without conflicts
  • Vector indexes rebuild locally — raw memories sync, each device rebuilds HNSW
  • Forget propagation — tombstones ensure forgotten memories stay forgotten
  • Optional cloud relay — dumb encrypted pipe, not a server. Sees nothing.
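
The source does not say which CRDTs YantrikDB uses. As one possible reading of "CRDT merging" plus "forget propagation", the sketch below uses a two-phase set: a grow-only map of memories and a grow-only set of tombstones, merged by union. The `Replica` class and its methods are hypothetical, and memory ids are assumed to be globally unique and immutable.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Two-phase set: additions and tombstones only ever grow."""
    added: dict = field(default_factory=dict)       # memory_id -> text
    tombstones: set = field(default_factory=set)    # forgotten memory ids

    def record(self, memory_id: str, text: str) -> None:
        self.added[memory_id] = text

    def forget(self, memory_id: str) -> None:
        self.tombstones.add(memory_id)

    def merge(self, other: "Replica") -> None:
        """Union both sides; commutative, associative, and idempotent."""
        self.added.update(other.added)
        self.tombstones |= other.tombstones

    def visible(self) -> dict:
        """Memories that have been added and never forgotten."""
        return {mid: text for mid, text in self.added.items()
                if mid not in self.tombstones}


phone, laptop = Replica(), Replica()
phone.record("m1", "works at Google")
phone.record("m2", "has a dog named Max")
laptop.record("m3", "prefers concise answers")

laptop.merge(phone)
laptop.forget("m1")      # the tombstone travels with the next sync
phone.merge(laptop)
laptop.merge(phone)
```

Because a tombstone is never removed, a device that syncs late cannot resurrect a forgotten memory, at the cost of the tombstone set growing over time.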

Storage Tiers

| Tier | Backing | Use case |
| --- | --- | --- |
| Hot | In-memory | Recent/frequent memories, active conversation |
| Warm | SSD-backed | Medium-term, weeks to months |
| Cold | Compressed archival | Old memories, on-demand hydration |
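
A tiering policy of this kind might be sketched as follows. The day thresholds and the use of zlib for the cold tier are assumptions; the source does not state how tiers are chosen or which compression is used.

```python
import zlib


def choose_tier(days_since_access: float,
                hot_days: float = 7, warm_days: float = 180) -> str:
    """Map recency to a tier name; both thresholds are assumed values."""
    if days_since_access <= hot_days:
        return "hot"
    if days_since_access <= warm_days:
        return "warm"
    return "cold"


def archive(text: str) -> bytes:
    """Compress a memory for cold storage."""
    return zlib.compress(text.encode("utf-8"))


def hydrate(blob: bytes) -> str:
    """Decompress a cold memory on demand."""
    return zlib.decompress(blob).decode("utf-8")
```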

Proactive Cognition Loop

YantrikDB runs a background processing loop even between conversations — giving AI genuine reasons to reach out:

┌─────────────────────────────────────────────────┐
│           Proactive Trigger System               │
│                                                  │
│  Memory Conflicts    → "You mentioned two        │
│  (need resolution)     different moving dates"   │
│                                                  │
│  Pattern Detection   → "You seem stressed        │
│  (noticed something)   every Sunday evening"     │
│                                                  │
│  Temporal Triggers   → "Your mom's birthday      │
│  (time-based)          is tomorrow"              │
│                                                  │
│  Decay Warnings      → "I'm fuzzy on your        │
│  (about to forget)     new coworker's name"      │
│                                                  │
│  Goal Tracking       → "How's the marathon       │
│  (user set a goal)     training going?"          │
│                                                  │
│  Consolidation       → "I noticed you always     │
│  Insights              feel better after talking  │
│                        to your sister"            │
└─────────────────────────────────────────────────┘

Every proactive message is grounded in real memory data — not engagement farming.

Built-in safety constraints:

| Rule | Purpose |
| --- | --- |
| Cooldown periods | No messaging every hour |
| Priority threshold | Only reach out when it matters |
| Time-of-day awareness | Don't message at 3am |
| User-controlled frequency | "Check in weekly" vs "only urgent" |
| Groundedness requirement | Every message must trace to real memories |
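
These constraints compose naturally into a single gate that every proactive message must pass. The sketch below is one way to express that; `OutreachPolicy`, `may_send`, and all default values (24-hour cooldown, 0.7 priority floor, quiet hours from 22:00 to 08:00) are hypothetical, with the user-controlled frequency modeled as a user-set cooldown.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class OutreachPolicy:
    cooldown: timedelta = timedelta(hours=24)   # user-controlled frequency
    min_priority: float = 0.7                   # priority threshold
    quiet_start: time = time(22, 0)             # quiet hours wrap midnight
    quiet_end: time = time(8, 0)


def may_send(policy: OutreachPolicy, now: datetime,
             last_sent: Optional[datetime], priority: float,
             source_memory_ids: list) -> bool:
    """Return True only if every safety rule allows a proactive message."""
    if not source_memory_ids:            # groundedness requirement
        return False
    if priority < policy.min_priority:   # only when it matters
        return False
    if last_sent is not None and now - last_sent < policy.cooldown:
        return False                     # still cooling down
    t = now.time()
    in_quiet_hours = t >= policy.quiet_start or t < policy.quiet_end
    return not in_quiet_hours
```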

Background Processing Cycle

  1. Consolidation pass — compress, summarize, abstract
  2. Conflict detection — find contradictions across synced devices
  3. Pattern mining — "user tends to X when Y"
  4. Cross-domain discovery — find surprising connections between work, health, hobbies
  5. Entity bridge detection — identify people/concepts that span multiple domains
  6. Trigger evaluation — "is anything worth reaching out about?"
  7. Decay pass — age out low-importance memories
  8. Session cleanup — abandon stale sessions, compute summaries
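
The cycle is an ordered pipeline, which can be sketched as a fixed list of pass names and a runner that invokes whichever handlers are registered. The snake_case pass names and the `run_cycle` function are labels invented here; only the ordering comes from the list above.

```python
from typing import Callable

# Order taken from the numbered list; the identifiers are hypothetical.
PASS_ORDER = [
    "consolidation",
    "conflict_detection",
    "pattern_mining",
    "cross_domain_discovery",
    "entity_bridge_detection",
    "trigger_evaluation",
    "decay",
    "session_cleanup",
]


def run_cycle(passes: dict, state: dict) -> list:
    """Run registered passes in PASS_ORDER and return the names that ran.

    `passes` maps a pass name to a Callable[[dict], None] that mutates
    `state`. Unregistered passes are skipped.
    """
    ran = []
    for name in PASS_ORDER:
        handler: Callable = passes.get(name)
        if handler is None:
            continue
        handler(state)
        ran.append(name)
    return ran
```

Running the decay pass after trigger evaluation, as the list orders it, gives a "decay warning" trigger the chance to fire before the memory is aged out.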

Session Tracking & Temporal Awareness

YantrikDB tracks conversation sessions as first-class engine primitives — not faked via metadata:

┌────────────────────────────────────────────────┐
│              Session Lifecycle                  │
│                                                │
│  session_start("default", "mcp-server")        │
│       ↓                                        │
│  record() → auto-linked to active session      │
│  record() → auto-linked to active session      │
│  record() → auto-linked to active session      │
│       ↓                                        │
│  session_end() → computes:                     │
│    • memory_count, avg_valence                 │
│    • topic extraction from entity graph        │
│    • duration                                  │
└────────────────────────────────────────────────┘
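
The lifecycle above can be approximated with a small class that collects valences while the session is open and computes a summary when it ends. This is a sketch under assumptions: the `Session` class, its `record` and `end` methods, and the summary keys are hypothetical, and topic extraction from the entity graph is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Session:
    namespace: str
    client_id: str
    started_at: datetime
    valences: list = field(default_factory=list)
    ended_at: Optional[datetime] = None

    def record(self, valence: float) -> None:
        """Link one memory to this session by storing its valence."""
        self.valences.append(valence)

    def end(self, now: datetime) -> dict:
        """Close the session and compute its summary statistics."""
        self.ended_at = now
        n = len(self.valences)
        return {
            "memory_count": n,
            "avg_valence": sum(self.valences) / n if n else 0.0,
            "duration_seconds": (now - self.started_at).total_seconds(),
        }
```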

Temporal helpers give the engine time awareness:

  • stale(days) — high-importance memories not accessed in N days ("I'm forgetting something important")
  • upcoming(days) — memories with deadlines approaching ("Your dentist appointment is Thursday")
  • entity_profile(entity, days) — rich profile: valence trend, domain distribution, session count, interaction frequency, dominant emotion
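
The first two helpers reduce to simple date filters, sketched below over a list of dictionaries. The record keys (`id`, `importance`, `last_accessed`, `deadline`) and the `min_importance` cutoff of 0.7 are assumptions; the real engine presumably answers these from its temporal index.

```python
from datetime import datetime, timedelta


def stale(memories: list, now: datetime, days: int,
          min_importance: float = 0.7) -> list:
    """Ids of important memories not accessed in the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [m["id"] for m in memories
            if m["importance"] >= min_importance
            and m["last_accessed"] < cutoff]


def upcoming(memories: list, now: datetime, days: int) -> list:
    """Ids of memories whose deadline falls within the next `days` days."""
    horizon = now + timedelta(days=days)
    return [m["id"] for m in memories
            if m.get("deadline") is not None
            and now <= m["deadline"] <= horizon]
```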

Cross-domain pattern mining uses the HNSW vector index to find surprising connections between domains:

  • A work frustration pattern that correlates with health domain entries
  • An entity (person, concept) that bridges finance and family domains
  • Scored by similarity × domain_surprise × entity_support — common co-occurrences are penalized
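
The scoring product can be worked through with a small example. The source gives only the formula, so the definitions below are assumptions: similarity is cosine similarity, `domain_surprise` is one minus how often the two domains already co-occur, and `entity_support` is the number of shared entities capped and normalized to the range 0 to 1.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def domain_surprise(cooccurrence_rate: float) -> float:
    """1.0 for domain pairs never seen together, 0.0 for constant pairs."""
    return 1.0 - cooccurrence_rate


def connection_score(vec_a: list, vec_b: list, cooccurrence_rate: float,
                     shared_entities: int, max_entities: int = 5) -> float:
    """similarity x domain_surprise x entity_support, as in the text."""
    entity_support = min(shared_entities, max_entities) / max_entities
    return (cosine(vec_a, vec_b)
            * domain_surprise(cooccurrence_rate)
            * entity_support)
```

With these definitions, two equally similar memories score higher when their domains rarely appear together, which is how common co-occurrences end up penalized.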

Technical Decisions

| Decision | Choice | Rationale |
| --- | --- | --- |
| Architecture | Embedded (like SQLite) | No server overhead, sub-ms local reads, single-tenant |
| Core language | Rust | Memory safety without GC pauses, ideal for embedded engines |
| Bindings | Python, TypeScript | Agent/AI layer integration |
| Storage format | Single file per user | Portable, backupable, no infrastructure |
| Sync | CRDTs + append-only log | Conflict-free for most operations, deterministic |
| Query interface | Cognitive operations API | Not SQL — designed for how agents think |
| Sessions | Engine-native tracking | Auto-links memories, computes valence/topics per session |
| Cross-domain mining | HNSW-based | Uses existing vector index for O(k·n) instead of O(n²) pairwise |

Target Use Cases

  • AI Companions — persistent, evolving relationships across devices
  • Autonomous Agents — long-horizon planning with memory consolidation
  • Multi-Agent Systems — shared memory between cooperating agents
  • Personal AI Assistants — that actually remember and grow with you

Roadmap

  • V0 — Single device, embedded engine, core memory model (record, recall, relate, consolidate, decay)
  • V1 — Replication log, sync between two devices
  • V2 — Conflict resolution with human-in-the-loop, production-grade sync
  • V3 — Proactive cognition loop, pattern detection, trigger system
  • V4 — Sessions, temporal awareness, cross-domain pattern mining, entity profiles
  • V5 — Multi-agent shared memory, federated learning across users

Research & Publications

Author

Pranab Sarkar

Patent

YantrikDB's cognitive memory methods are covered by U.S. Patent Application No. 19/573,392 (filed March 20, 2026), claiming priority to Provisional Application No. 63/991,357 (filed February 26, 2026).

License

Copyright (c) 2026 Pranab Sarkar

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, version 3.

See LICENSE for the full text.
