
Cuba-Memorys


Persistent memory for AI agents — A Model Context Protocol (MCP) server that gives AI coding assistants long-term memory with a knowledge graph, neuroscience-inspired algorithms, and anti-hallucination grounding.

19 tools with Cuban soul. Sub-millisecond handlers. Mathematically rigorous.

[!IMPORTANT] v0.6.0 — Contextual Retrieval, importance priors, score breakdown, session provenance, compact format, semantic dedup, auto-tagging, Adamic-Adar link prediction, contradiction detection, prospective memory triggers, Bayesian calibration, bulk ingest, episodic memory with power-law decay, temporal search filters, and gap detection. 56 tests, 0 clippy warnings.

Demo

Cuba-Memorys MCP demo — AI agent session with knowledge graph, hybrid search, and graph analytics


Why Cuba-Memorys?

AI agents forget everything between conversations. Cuba-Memorys solves this:

  • Stratified exponential decay — Memories fade by type (facts=30d, errors=14d, context=7d), strengthen with access
  • Hebbian + BCM metaplasticity — Self-normalizing importance via Oja's rule with EMA sliding threshold
  • Hybrid RRF fusion search — pg_trgm + full-text + pgvector HNSW, entropy-routed weighting (k=60), temporal filters, tag filters, compact format
  • Knowledge graph — Entities, observations, typed relations with Leiden community detection and Adamic-Adar link prediction
  • Anti-hallucination grounding — Verify claims with graduated confidence + Bayesian calibration over time
  • Episodic memory — Separate temporal events (Tulving 1972) with power-law decay I(t) = I₀/(1+ct)^β (Wixted 2004)
  • Contradiction detection — Scan for semantic conflicts via embedding cosine + bilingual negation heuristics
  • Prospective memory — Triggers that fire on entity access, session start, or error match ("remind me when X")
  • Contextual Retrieval — Entity context prepended before embedding (Anthropic technique, +20% recall)
  • REM Sleep consolidation — Autonomous stratified decay + PageRank + auto-prune + auto-merge + episode decay
  • Graph intelligence — PageRank, Leiden communities, Brandes centrality, Shannon entropy, gap detection
  • Session awareness — Provenance tracking, session diff, importance priors per observation type
  • Error memory — Never repeat the same mistake (anti-repetition guard + pattern detection)
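
The episodic power-law decay mentioned above can be sketched directly from the formula I(t) = I₀/(1+ct)^β; the constants c and β below are illustrative placeholders, not the server's actual parameters:

```rust
/// Power-law forgetting (Wixted 2004): intensity = i0 / (1 + c*t)^beta.
/// `c` and `beta` are illustrative; the server's real constants may differ.
fn episode_intensity(i0: f64, days: f64, c: f64, beta: f64) -> f64 {
    i0 / (1.0 + c * days).powf(beta)
}
```

Unlike exponential decay, the power law loses intensity quickly at first and then flattens, so old episodes fade slowly rather than vanishing.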

Comparison

| Feature | Cuba-Memorys | Basic Memory MCPs |
|---|---|---|
| Knowledge graph with typed relations | Yes | No |
| Exponential importance decay | Yes | No |
| Hebbian learning + BCM metaplasticity | Yes | No |
| Hybrid entropy-routed RRF fusion | Yes | No |
| KG-neighbor query expansion | Yes | No |
| GraphRAG topological enrichment | Yes | No |
| Leiden community detection | Yes | No |
| Brandes betweenness centrality | Yes | No |
| Shannon entropy analytics | Yes | No |
| Adaptive prediction error gating | Yes | No |
| Anti-hallucination verification | Yes | No |
| Error pattern detection | Yes | No |
| Session-aware search boost | Yes | No |
| REM Sleep autonomous consolidation | Yes | No |
| Optional ONNX BGE embeddings | Yes | No |
| Write-time dedup gate | Yes | No |
| Contradiction auto-supersede | Yes | No |
| GDPR Right to Erasure | Yes | No |
| Graceful shutdown (SIGTERM/SIGINT) | Yes | No |

Installation

PyPI (recommended)

pip install cuba-memorys

npm

npm install -g cuba-memorys

From source

git clone https://github.com/LeandroPG19/cuba-memorys.git
cd cuba-memorys/rust
cargo build --release

Binary download

Pre-built binaries available at GitHub Releases.


Quick Start

1. Start PostgreSQL (if you don't have one running):

docker compose up -d

2. Configure your AI editor (Claude Code, Cursor, Windsurf, etc.):

Claude Code
claude mcp add cuba-memorys -- cuba-memorys

Set the environment variable:

export DATABASE_URL="postgresql://cuba:memorys2026@127.0.0.1:5488/brain"
Cursor / Windsurf / VS Code

Add to your MCP config (.cursor/mcp.json, .windsurf/mcp.json, or .vscode/mcp.json):

{
  "mcpServers": {
    "cuba-memorys": {
      "command": "cuba-memorys",
      "env": {
        "DATABASE_URL": "postgresql://cuba:memorys2026@127.0.0.1:5488/brain"
      }
    }
  }
}

The server auto-creates the brain database and all tables on first run.

Optional: ONNX Embeddings

For real BGE-small-en-v1.5 semantic embeddings instead of hash-based fallback:

export ONNX_MODEL_PATH="$HOME/.cache/cuba-memorys/models"
export ORT_DYLIB_PATH="/path/to/libonnxruntime.so"

Without ONNX, the server uses deterministic hash-based embeddings — functional but without semantic understanding.


The 19 Tools

Every tool is named after Cuban culture — memorable, professional, meaningful.

Knowledge Graph

| Tool | Meaning | What it does |
|---|---|---|
| cuba_alma | Alma — soul | CRUD entities. Types: concept, project, technology, person, pattern, config. Hebbian boost + access tracking. Fires prospective triggers on access. |
| cuba_cronica | Cronica — chronicle | Observations with semantic dedup, PE gating V5.2, importance priors by type, auto-tagging (TF-IDF top-5 keywords), session provenance, contextual embedding. Also manages episodic memories (episode_add/episode_list) and the timeline view. |
| cuba_puente | Puente — bridge | Typed relations. Traverse walks the graph, infer discovers transitive paths, and predict suggests missing relations via Adamic-Adar link prediction. |
| cuba_ingesta | Ingesta — intake | Bulk knowledge ingestion: accepts arrays of observations or long text with auto-classification by paragraph. |
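
The Adamic-Adar link prediction behind cuba_puente's predict action can be sketched as follows; the adjacency representation and entity names are illustrative, not the server's internals:

```rust
use std::collections::{HashMap, HashSet};

/// Adamic-Adar score for a candidate edge (u, v): sum over common
/// neighbors w of 1 / ln(degree(w)). Rare shared neighbors count more.
fn adamic_adar(adj: &HashMap<&str, HashSet<&str>>, u: &str, v: &str) -> f64 {
    let (nu, nv) = match (adj.get(u), adj.get(v)) {
        (Some(a), Some(b)) => (a, b),
        _ => return 0.0, // unknown entity: no prediction signal
    };
    nu.intersection(nv)
        .map(|w| {
            let deg = adj.get(w).map_or(0, |n| n.len());
            // Skip degree-1 neighbors, where ln(deg) would be 0.
            if deg > 1 { 1.0 / (deg as f64).ln() } else { 0.0 }
        })
        .sum()
}
```

A high score for an unlinked pair suggests a missing relation worth proposing.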

Search & Verification

| Tool | Meaning | What it does |
|---|---|---|
| cuba_faro | Faro — lighthouse | RRF fusion (k=60) with entropy routing, pgvector, temporal filters (before/after), tag filters, score breakdown (text/vector/importance/session), compact format (~35% fewer tokens), Bayesian-calibrated accuracy. |

Error Memory

| Tool | Meaning | What it does |
|---|---|---|
| cuba_alarma | Alarma — alarm | Report errors. Auto-detects patterns (>=3 similar = warning). Fires prospective triggers on error match. |
| cuba_remedio | Remedio — remedy | Resolve errors with cross-reference to similar unresolved issues. |
| cuba_expediente | Expediente — case file | Search past errors. Anti-repetition guard: warns if a similar approach failed before. |

Sessions & Decisions

| Tool | Meaning | What it does |
|---|---|---|
| cuba_jornada | Jornada — workday | Session tracking with goals, outcomes, session diff (what was learned), and previous session context on start. Fires prospective triggers. |
| cuba_decreto | Decreto — decree | Record architecture decisions with context, alternatives, rationale. |

Cognition & Analysis

| Tool | Meaning | What it does |
|---|---|---|
| cuba_reflexion | Reflexion — reflection | Gap detection: isolated entities, underconnected hubs, type silos, observation gaps, density anomalies (z-score). |
| cuba_hipotesis | Hipotesis — hypothesis | Abductive inference: given an effect, find plausible causes via backward causal traversal. Plausibility = path_strength x importance. |
| cuba_contradiccion | Contradiccion — contradiction | Scan for semantic conflicts between same-entity observations via embedding cosine + bilingual negation heuristics. |
| cuba_centinela | Centinela — sentinel | Prospective memory triggers: "remind me when X is accessed / session starts / error matches". Auto-deactivate on max_fires, expiration support. |
| cuba_calibrar | Calibrar — calibrate | Bayesian confidence calibration: track faro/verify predictions, compute P(correct given grounding_level) via a Beta distribution. Closes the verify-correct feedback loop. |
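
A minimal sketch of the Beta-distribution calibration that cuba_calibrar describes, assuming a uniform Beta(1,1) prior (the server's actual prior and counting scheme are not documented here):

```rust
/// Posterior estimate of P(correct | grounding_level) from observed
/// verify outcomes, using a Beta(1 + correct, 1 + incorrect) posterior.
/// The uniform Beta(1,1) prior is an assumption for illustration.
fn calibrated_accuracy(correct: u32, incorrect: u32) -> f64 {
    // Posterior mean of the Beta distribution.
    (1.0 + correct as f64) / (2.0 + (correct + incorrect) as f64)
}
```

With no data the estimate is 0.5, and it converges to the empirical accuracy as verify outcomes accumulate per grounding level.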

Memory Maintenance

| Tool | Meaning | What it does |
|---|---|---|
| cuba_zafra | Zafra — sugar harvest | Stratified decay (30d/14d/7d by type), power-law episode decay, prune, merge, summarize, pagerank, find_duplicates, export, stats, reembed (model migration with versioning). Auto-consolidation on >50 observations. |
| cuba_eco | Eco — echo | RLHF feedback: positive (Oja boost), negative (decrease), correct (update with versioning). |
| cuba_vigia | Vigia — watchman | Analytics: summary, enhanced health (null embeddings, active triggers, table sizes, embedding model), drift (chi-squared), Leiden communities, Brandes bridges. |
| cuba_forget | Forget — forget | GDPR Right to Erasure: cascading hard-delete of an entity and ALL references (observations, episodes, relations, errors, sessions). Irreversible. |

Architecture

cuba-memorys/
├── docker-compose.yml           # Dedicated PostgreSQL 18 (port 5488)
├── rust/                        # v0.3.0
│   ├── src/
│   │   ├── main.rs              # mimalloc + graceful shutdown
│   │   ├── protocol.rs          # JSON-RPC 2.0 + REM daemon (4h cycle)
│   │   ├── db.rs                # sqlx PgPool (10 max, 600s idle, 1800s lifetime)
│   │   ├── schema.sql           # 8 tables, 20+ indexes, HNSW
│   │   ├── constants.rs         # Tool definitions, thresholds, importance priors
│   │   ├── handlers/            # 19 MCP tool handlers (1 file each)
│   │   ├── cognitive/           # Hebbian/BCM, access tracking, PE gating V5.2
│   │   ├── search/              # RRF fusion, confidence, LRU cache
│   │   ├── graph/               # Brandes centrality, Leiden, PageRank (NF-IDF)
│   │   └── embeddings/          # ONNX multilingual-e5-small (contextual, spawn_blocking)
│   ├── scripts/
│   │   └── download_model.sh    # Download multilingual-e5-small ONNX
│   └── tests/
└── server.json                  # MCP Registry manifest

Performance: Rust vs Python

| Metric | Python v1.6.0 | Rust v0.6.0 |
|---|---|---|
| Binary size | ~50 MB (venv) | 7.6 MB |
| Entity create | ~2 ms | 498 µs |
| Hybrid search | <5 ms | 2.52 ms |
| Analytics | <2.5 ms | 958 µs |
| Memory usage | ~120 MB | ~15 MB |
| Startup time | ~2 s | <100 ms |
| Dependencies | 12 Python packages | 0 runtime deps |

Database Schema

| Table | Purpose | Key Features |
|---|---|---|
| brain_entities | KG nodes | tsvector + pg_trgm + GIN indexes, importance, bcm_theta |
| brain_observations | Facts with provenance | 9 types, versioning, vector(384), importance priors, auto-tags TEXT[], session_id FK, embedding_model tracking |
| brain_relations | Typed edges | 5 types, bidirectional, Hebbian strength, blake3 dedup |
| brain_errors | Error memory | JSONB context, synapse weight, pattern detection |
| brain_sessions | Working sessions | Goals (JSONB), outcome tracking, session diff |
| brain_episodes | Episodic memory | Tulving 1972, actors/artifacts TEXT[], power-law decay (Wixted 2004) |
| brain_triggers | Prospective memory | on_access/on_session_start/on_error_match, max_fires, expiration |
| brain_verify_log | Bayesian calibration | claim, confidence, grounding_level, outcome (correct/incorrect) |

Search Pipeline

Reciprocal Rank Fusion (RRF, k=60) with entropy-routed weighting:

| # | Signal | Source | Condition |
|---|---|---|---|
| 1 | Entities (ts_rank + trigrams + importance) | brain_entities | Always |
| 2 | Observations (ts_rank + trigrams + importance) | brain_observations | Always |
| 3 | Errors (ts_rank + trigrams + synapse_weight) | brain_errors | Always |
| 4 | Vector cosine distance (HNSW) | brain_observations.embedding | pgvector installed |

Post-fusion pipeline: Dedup -> KG-neighbor expansion -> Session boost -> GraphRAG enrichment -> Token-budget truncation -> Batch access tracking


Mathematical Foundations

Built on peer-reviewed algorithms, not ad-hoc heuristics:

Exponential Decay (V3)

importance_new = importance * exp(-0.693 * days_since_access / halflife)

halflife=30d by default. Decision/lesson observations are protected from decay. Importance directly affects search ranking (score × 0.7 + importance × 0.3).
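
The formula above can be checked with a small sketch (the 0.693 ≈ ln 2 constant is taken verbatim from the formula):

```rust
/// Exponential importance decay with a configurable half-life:
/// importance * exp(-0.693 * days / halflife), where 0.693 ≈ ln 2.
fn decayed_importance(importance: f64, days_since_access: f64, halflife_days: f64) -> f64 {
    importance * (-0.693 * days_since_access / halflife_days).exp()
}
```

With the 30-day default, an untouched 0.5 observation sits near 0.25 after 30 days, matching the worked example later in this README.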

Hebbian + BCM — Oja (1982), Bienenstock-Cooper-Munro (1982)

Positive: importance += eta * throttle(access_count, theta_M)
BCM EMA: theta_M = max(10, (1-alpha)*theta_prev + alpha*access_count)

V3: theta_M persisted in bcm_theta column for true temporal smoothing.
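A sketch of the two update rules above; the exact throttle() function is not documented, so the saturating form below is an assumption chosen to shrink the boost once access_count passes theta_M:

```rust
/// BCM sliding threshold via EMA, floored at 10 (per the formula above).
fn bcm_theta(theta_prev: f64, access_count: f64, alpha: f64) -> f64 {
    ((1.0 - alpha) * theta_prev + alpha * access_count).max(10.0)
}

/// Hebbian boost. The saturating throttle here is an assumption, not
/// the server's documented form; it caps importance at 1.0.
fn hebbian_boost(importance: f64, eta: f64, access_count: f64, theta_m: f64) -> f64 {
    let throttle = access_count / (access_count + theta_m);
    (importance + eta * throttle).min(1.0)
}
```

The EMA means a burst of accesses raises theta_M, which in turn damps further boosts: the self-normalizing behavior the BCM rule is used for.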

RRF Fusion — Cormack (2009)

RRF(d) = sum( w_i / (k + rank_i(d)) )   where k = 60

Entropy-routed weighting: keyword-dominant vs mixed vs semantic queries get different signal weights.
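
The fusion formula can be sketched as a weighted RRF accumulator; the per-signal weights here are placeholders for the entropy-routed values:

```rust
use std::collections::HashMap;

/// Weighted Reciprocal Rank Fusion (Cormack 2009):
/// score(d) = sum over signals of w_i / (k + rank_i(d)), ranks 1-based.
fn rrf_fuse(ranked_lists: &[(f64, Vec<&str>)], k: f64) -> HashMap<String, f64> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (weight, list) in ranked_lists {
        for (rank, doc) in list.iter().enumerate() {
            *scores.entry((*doc).to_string()).or_insert(0.0)
                += weight / (k + (rank as f64 + 1.0));
        }
    }
    scores
}
```

Because only ranks enter the formula, signals with incomparable raw scores (ts_rank, trigram similarity, cosine distance) fuse cleanly without per-signal normalization.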

Other Algorithms

| Algorithm | Reference | Used in |
|---|---|---|
| Leiden communities | Traag et al. (Nature 2019) | community.rs -> vigia.rs |
| Personalized PageRank | Brin & Page (1998) | pagerank.rs -> zafra.rs |
| Brandes centrality | Brandes (2001) | centrality.rs -> vigia.rs |
| Adaptive PE gating | Friston (Nature 2023) | prediction_error.rs -> cronica.rs |
| Shannon entropy | Shannon (1948) | density.rs -> information gating |
| Chi-squared drift | Pearson (1900) | Error distribution change detection |

Configuration

Environment Variables

| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | (none) | PostgreSQL connection string (required) |
| ONNX_MODEL_PATH | (none) | Path to BGE model directory (optional) |
| ORT_DYLIB_PATH | (none) | Path to libonnxruntime.so (optional) |
| RUST_LOG | cuba_memorys=info | Log level filter |

Docker Compose

Dedicated PostgreSQL 18 Alpine:

  • Port: 5488 (avoids conflicts with 5432/5433)
  • Resources: 256MB RAM, 0.5 CPU
  • Restart: always
  • Healthcheck: pg_isready every 10s

How It Works

1. The agent learns from your project

Agent: FastAPI requires async def with response_model.
-> cuba_alma(create, "FastAPI", technology)
-> cuba_cronica(add, "FastAPI", "All endpoints must be async def with response_model")

2. Error memory prevents repeated mistakes

Agent: IntegrityError: duplicate key on numero_parte.
-> cuba_alarma("IntegrityError", "duplicate key on numero_parte")
-> cuba_expediente: Similar error found! Solution: "Add SELECT EXISTS before INSERT"

3. Anti-hallucination grounding

Agent: Let me verify before responding...
-> cuba_faro("FastAPI uses Django ORM", mode="verify")
-> confidence: 0.0, level: "unknown" — "No evidence. High hallucination risk."

4. Memories decay naturally

Initial importance:    0.5  (new observation)
After 30d no access:  0.25 (halved by exponential decay)
After 60d no access:  0.125
Active access resets the clock — frequently used memories stay strong.

5. Community intelligence

-> cuba_vigia(metric="communities")
-> Community 0 (4 members): [FastAPI, Pydantic, SQLAlchemy, PostgreSQL]
  Summary: "Backend stack: async endpoints, V2 validation, 2.0 ORM..."
-> Community 1 (3 members): [React, Next.js, TypeScript]
  Summary: "Frontend stack: React 19, App Router, strict types..."

Security & Audit

Internal Audit Verdict: GO (2026-03-28)

| Check | Result |
|---|---|
| SQL injection | All queries parameterized (sqlx bind) |
| SEC-002 wildcard injection | Fixed (POSITION-based) |
| CVEs in dependencies | 0 active (sqlx 0.8.6, tokio 1.50.0) |
| UTF-8 safety | safe_truncate on all string slicing |
| Secrets | All via environment variables |
| Division by zero | Protected with .max(1e-9) |
| Error handling | All ? propagated with anyhow::Context |
| Clippy | 0 warnings |
| Tests | 106/106 passing (51 unit/smoke + 55 E2E) |
| Licenses | All MIT/Apache-2.0 (0 GPL/AGPL) |

Dependencies

| Crate | Purpose | License |
|---|---|---|
| tokio | Async runtime | MIT |
| sqlx | PostgreSQL (async) | MIT/Apache-2.0 |
| serde / serde_json | Serialization | MIT/Apache-2.0 |
| pgvector | Vector similarity | MIT |
| ort | ONNX Runtime (optional) | MIT/Apache-2.0 |
| tokenizers | HuggingFace tokenizers | Apache-2.0 |
| blake3 | Cryptographic hashing | Apache-2.0/CC0 |
| mimalloc | Global allocator | MIT |
| tracing | Structured JSON logging | MIT |
| lru | O(1) LRU cache | MIT |
| chrono | Timezone-aware timestamps | MIT/Apache-2.0 |

Version History

| Version | Key Changes |
|---|---|
| 0.3.0 | Deep Research V3: exponential decay replaces FSRS-6, dead code/columns eliminated, SEC-002 fix, importance in ranking, embeddings storage on write, GraphRAG CTE fix, Opus 4.6 token optimization, zero tech debt. 106 tests (51 unit/smoke + 55 E2E), 0 clippy warnings. |
| 0.2.0 | Complete Rust rewrite. BCM metaplasticity, Leiden communities, Shannon entropy, blake3 dedup. Internal audit: GO verdict. |
| 1.6.0 | KG-neighbor expansion, embedding LRU cache, async embed rebuild, community summaries, batch access tracking |
| 1.5.0 | Token-budget truncation, post-fusion dedup, source triangulation, adaptive confidence, session-aware decay |
| 1.3.0 | Modular architecture (CC avg D->A), 87% CC reduction |
| 1.1.0 | GraphRAG, REM Sleep, conditional pgvector, 4-signal RRF |
| 1.0.0 | Initial release: 12 tools, Hebbian learning |

License

CC BY-NC 4.0 — Free to use and modify, not for commercial use.


Author

Leandro Perez G.

Credits

Mathematical foundations: Oja (1982), Bienenstock, Cooper & Munro (1982, BCM), Cormack (2009, RRF), Brin & Page (1998, PageRank), Traag et al. (2019, Leiden), Brandes (2001), Shannon (1948), Pearson (1900, chi-squared), Friston (2023, PE gating), BAAI (2023, BGE), Malkov & Yashunin (2018, HNSW), O'Connor et al. (2020, blake3).
