Self-Healing Agent Memory Architecture — an immune system for AI agent memory

SHAMA - Self-Healing Agent Memory Architecture

An immune system for AI agent memory. Memories that know what they've forgotten - and fix it.

The Problem

AI agents running long sessions tend to lose context, hallucinate past events, or accumulate poisoned memories. Existing solutions (conversation buffers, naive RAG) have no mechanism to detect stale facts, resolve contradictions, or autonomously correct errors.

The Solution

SHAMA is a drop-in memory layer that gives your agent:

Dual memory store - episodic (what happened) + semantic (what is true)
Confidence half-life decay - C(t) = C₀ × 2^(−t/τ) - each memory's confidence decays exponentially over time
Autonomous contradiction detection - scans for conflicting facts on every write, resolved by LLM judge
Self-correction loop - re-verifies and deprecates stale/wrong memories automatically
Full audit trail - every memory lifecycle event logged for complete data ownership
Swappable backends - Qdrant, Neo4j, Redis by default; swap any component with one config change
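
As a rough illustration of the contradiction scan in the list above, the sketch below finds candidate conflicts: stored facts that share an entity and relation with a new fact but carry a different value. The `Fact` dataclass and `find_conflict_candidates` are hypothetical names used only for illustration, not SHAMA's API. In SHAMA the final verdict comes from the LLM judge, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical stand-in for a semantic memory (entity-relation-value triple)."""
    entity: str
    relation: str
    value: str


def find_conflict_candidates(new_fact: Fact, stored: list) -> list:
    """Return stored facts with the same entity and relation but a different value."""
    return [
        f for f in stored
        if f.entity == new_fact.entity
        and f.relation == new_fact.relation
        and f.value != new_fact.value
    ]


stored = [
    Fact("alice", "lives_in", "Paris"),
    Fact("alice", "works_at", "Acme"),
    Fact("bob", "lives_in", "Berlin"),
]
candidates = find_conflict_candidates(Fact("alice", "lives_in", "Lisbon"), stored)
print(candidates)  # one candidate: alice / lives_in / Paris
```

Only the candidates returned here would need to go to a judge, which keeps the number of LLM calls per write small.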

Architecture

INPUT (text)
  └► LLM.score_importance()        ← how important is this memory?
  └► Embedding.embed()             ← convert to vector
  └► EpisodicNode written          ← append-only event log (Qdrant)
  └► Redis working memory updated  ← last 20 turns cached per session
  └► Audit event logged            ← immutable SQLite trail

PROMOTION JOB (every 60 min)
  └► Fetch unpromoted episodic nodes
  └► Cluster by cosine similarity (threshold 0.80)
  └► LLM distills cluster → entity-relation-value triples
  └► SemanticNode written          ← knowledge graph (Qdrant + Neo4j)
  └► Episodic nodes marked promoted

CONTRADICTION SCAN (every semantic write)
  └► Find nodes with same entity + relation, different value
  └► LLM judge: is_contradiction? winner?
  └► CONFLICTS_WITH edge added in Neo4j
  └► Both nodes → status = CONTESTED
  └► SelfCorrector: winner → ACTIVE, loser → DEPRECATED

DECAY SCHEDULER (every 15 min)
  └► Recompute confidence: C(t) = C₀ × 2^(−t/τ)
  └► Select nodes with confidence < 0.30
  └► confidence < 0.10 → auto-deprecate
  └► 0.10 ≤ confidence < 0.30 → re-verify via LLM
      └► confirmed  → confidence restored, status ACTIVE
      └► refuted    → status DEPRECATED
      └► uncertain  → status CONTESTED, escalated

RECALL (query string)
  └► Embed query
  └► ANN search: top-10 episodic + top-10 semantic (Qdrant)
  └► Graph hop: neighbors of top-3 semantic hits (Neo4j, 1-2 hops)
  └► Merge + deduplicate
  └► Re-rank: score = relevance×0.5 + confidence×0.3 + recency×0.2
  └► Filter: confidence >= 0.15
  └► Trim to 4000 token budget
  └► Return RetrievedContext with confidence-annotated memories
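
The last steps of the RECALL flow above (re-rank, filter, trim) can be sketched in plain Python. This is a minimal illustration, not SHAMA's implementation: the `Candidate` class and `rerank` function are hypothetical, the three signals are assumed to be normalized to [0, 1], and stopping at the first item that overflows the token budget is one possible trimming policy among several.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical retrieved memory; signals are assumed to lie in [0, 1]."""
    text: str
    relevance: float
    confidence: float
    recency: float
    tokens: int


def rerank(candidates, min_confidence=0.15, max_tokens=4000):
    """Rank by weighted score, drop low-confidence items, trim to the token budget."""
    def score(c):
        return c.relevance * 0.5 + c.confidence * 0.3 + c.recency * 0.2

    ranked = sorted(candidates, key=score, reverse=True)
    kept = [c for c in ranked if c.confidence >= min_confidence]

    selected, used = [], 0
    for c in kept:
        if used + c.tokens > max_tokens:
            break  # assumed policy: stop at the first item that would overflow
        selected.append(c)
        used += c.tokens
    return selected


memories = [
    Candidate("user prefers dark mode", 0.9, 0.8, 0.5, 1500),
    Candidate("stale rumour", 0.95, 0.10, 0.9, 100),
    Candidate("meeting moved to Friday", 0.6, 0.9, 0.9, 2000),
    Candidate("long tool output", 0.5, 0.5, 0.5, 1000),
]
top = [c.text for c in rerank(memories)]
print(top)  # ['user prefers dark mode', 'meeting moved to Friday']
```

The highly relevant but low-confidence "stale rumour" is filtered out, and the fourth item is dropped because it would push the total past 4000 tokens.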

Confidence Half-Life

C(t) = C₀ × 2^(−t/τ)

C₀  = original confidence at write time (1.0)
t   = hours elapsed since creation
τ   = half-life in hours (per memory type)

Memory type	Half-life (τ)	After 1 half-life	After 2 half-lives
Conversational event	24 hrs	0.50	0.25
Tool output / API result	48 hrs	0.50	0.25
Distilled semantic fact	720 hrs (30 days)	0.50	0.25
User preference	2160 hrs (90 days)	0.50	0.25

C(t) < 0.30 → re-verify job fires
C(t) < 0.10 → auto-deprecate
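
To make the formula and thresholds concrete, here is a small worked sketch. The function names `decayed_confidence`, `triage`, and `hours_until` are illustrative assumptions, not part of SHAMA's API; the formula and the 0.30 / 0.10 thresholds come from this section.

```python
import math


def decayed_confidence(c0: float, hours_elapsed: float, half_life_hours: float) -> float:
    """C(t) = C0 * 2^(-t / tau)."""
    return c0 * 2.0 ** (-hours_elapsed / half_life_hours)


def triage(confidence: float, reverify: float = 0.30, deprecate: float = 0.10) -> str:
    """Map a confidence value to the action described above."""
    if confidence < deprecate:
        return "deprecate"
    if confidence < reverify:
        return "reverify"
    return "keep"


def hours_until(threshold: float, c0: float, half_life_hours: float) -> float:
    """Solve C(t) = threshold for t: t = tau * log2(C0 / threshold)."""
    return half_life_hours * math.log2(c0 / threshold)


# A conversational event (tau = 24 h) written with confidence 1.0
for hours in (24, 48, 72, 96):
    c = decayed_confidence(1.0, hours, 24.0)
    print(hours, round(c, 4), triage(c))
# 24 0.5 keep
# 48 0.25 reverify
# 72 0.125 reverify
# 96 0.0625 deprecate
```

Under these definitions a conversational event crosses the re-verify threshold after about 41.7 hours (`hours_until(0.30, 1.0, 24.0)`) and the auto-deprecate threshold after about 79.7 hours.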

Prerequisites

Before starting, make sure you have:

Tool	Version	Install
Python	3.11+	https://python.org
Docker Desktop	Latest	https://docker.com/products/docker-desktop
Git	Any	https://git-scm.com
pip	23+	comes with Python

Check your versions:

python --version      # must be 3.11+
docker --version      # must be installed
docker compose version

Step 1 - Get API Keys

SHAMA uses two separate API keys - one for embeddings, one for LLM reasoning.

Embedding Key - OpenAI (required)

SHAMA uses OpenAI for converting text to vectors. DeepSeek does not provide an embedding API, so OpenAI is required for embeddings even when using DeepSeek as the LLM.

Go to https://platform.openai.com/api-keys
Click "Create new secret key"
Name it shama-embeddings
Copy the key - it starts with sk-...
Make sure your account has billing enabled (embeddings are very cheap - ~$0.001 per 1000 chunks)

LLM Key - DeepSeek (for reasoning, contradiction judging, promotion)

DeepSeek is the recommended LLM provider - significantly cheaper than GPT-4o with comparable reasoning quality.

Go to https://platform.deepseek.com
Sign up / log in
Go to API Keys → Create API Key
Name it shama-llm
Copy the key
Add credits (minimum $5 recommended for testing)

HuggingFace - fully local (no API keys, full privacy) or the free HuggingFace Inference API

For the Inference API:

Sign up / log in at https://huggingface.co
Go to https://huggingface.co/settings/tokens and create an access token (Read scope is enough)
Name it shama-llm
Copy the token - it starts with hf_...

For local usage

# Runs entirely on your machine - zero API calls, zero cost after download
from shama import ShamaClient

client = ShamaClient.from_config(
    huggingface_local_llm_model="microsoft/Phi-3-mini-4k-instruct",   # ~3.8GB
    huggingface_local_embedding_model="BAAI/bge-base-en-v1.5",        # ~440MB
    huggingface_local_device="cpu",    # or "cuda" / "mps" (Apple Silicon)
)
# First run downloads models. Subsequent runs use cache.

Using OpenAI for both? You can use one OpenAI key for both embedding and LLM - just set openai_api_key and leave deepseek_api_key empty.

Using Anthropic? Set anthropic_api_key + embedding_api_key (OpenAI key for embeddings).

Using HuggingFace API? You can use one HuggingFace key for both embedding and LLM - just set HUGGINGFACE_API_KEY, HF_JUDGE_MODEL, HF_FAST_MODEL, and HF_EMBEDDING_MODEL.

Step 2 - Clone & Install

# Clone the repo (or unzip the package you received)
git clone https://github.com/gowthamsai09/shama
cd shama

# Create a virtual environment (strongly recommended)
python -m venv .venv

# Activate it
# macOS / Linux:
source .venv/bin/activate
# Windows:
.venv\Scripts\activate

# Install SHAMA with all dependencies for testing
pip install -e ".[dev,openai]"

# Optional: add the extras for fully local HuggingFace models
pip install "shama[huggingface-local]"

Verify installation:

python -c "import shama; print(shama.__version__)"
# Expected: 0.1.0

Step 3 - Configure Environment

# Copy the example env file (assumed here to be named .env.example)
cp .env.example .env

Open .env and fill in your values:

#  Infrastructure (Docker will handle these - leave as default) 
QDRANT_URL=http://localhost:6333
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
REDIS_URL=redis://localhost:6379
SHAMA_AUDIT_DB_PATH=./shama_audit.db

#  LLM Provider 
# Option A: DeepSeek for LLM + OpenAI for embeddings (recommended - cheapest)
DEEPSEEK_API_KEY=your_deepseek_key_here
EMBEDDING_API_KEY=your_openai_key_here

# Option B: OpenAI for everything (simplest)
# OPENAI_API_KEY=sk-...

# Option C: Anthropic for LLM + OpenAI for embeddings
# ANTHROPIC_API_KEY=sk-ant-...
# EMBEDDING_API_KEY=sk-...

# Option D - HuggingFace Inference API (LLM + embeddings both from HF)
# HUGGINGFACE_API_KEY=hf_...
# HF_JUDGE_MODEL=mistralai/Mistral-7B-Instruct-v0.3
# HF_FAST_MODEL=mistralai/Mistral-7B-Instruct-v0.3
# HF_EMBEDDING_MODEL=BAAI/bge-large-en-v1.5
 
# Option E - Fully local (no API keys needed)
# HF_LOCAL_LLM_MODEL=microsoft/Phi-3-mini-4k-instruct
# HF_LOCAL_EMBEDDING_MODEL=BAAI/bge-base-en-v1.5
# HF_LOCAL_DEVICE=cpu


#  Embedding Config 
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

#  SHAMA Tuning (defaults are fine for testing) 
SHAMA_REVERIFY_THRESHOLD=0.30
SHAMA_DEPRECATE_THRESHOLD=0.10
SHAMA_EPISODIC_HALF_LIFE=24.0
SHAMA_SEMANTIC_HALF_LIFE=720.0
SHAMA_MAX_CONTEXT_TOKENS=4000
SHAMA_DECAY_INTERVAL_MINUTES=15
SHAMA_PROMOTION_INTERVAL_MINUTES=60

# Recommended HuggingFace models (uncomment to use with Option D)
# HUGGINGFACE_API_KEY=hf_your_token
# HF_JUDGE_MODEL=meta-llama/Llama-3.1-8B-Instruct
# HF_FAST_MODEL=meta-llama/Meta-Llama-3-8B-Instruct
# HF_EMBEDDING_MODEL=BAAI/bge-large-en-v1.5

Important: Also update the Neo4j password in docker-compose.yml so it matches NEO4J_PASSWORD in your .env:
NEO4J_AUTH: neo4j/your_password
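
Before starting the services, it can help to sanity-check the tuning values. The sketch below reads the SHAMA_* variables listed above with the same defaults and checks that the two thresholds are ordered sensibly. `TuningConfig` and `load_tuning` are hypothetical helpers written for this example; how SHAMA itself parses its environment is not shown in this README.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningConfig:
    """Hypothetical container for the SHAMA_* tuning values shown above."""
    reverify_threshold: float
    deprecate_threshold: float
    episodic_half_life: float
    semantic_half_life: float
    max_context_tokens: int
    decay_interval_minutes: int
    promotion_interval_minutes: int


def load_tuning(env=os.environ) -> TuningConfig:
    """Read tuning values from a mapping, falling back to the documented defaults."""
    cfg = TuningConfig(
        reverify_threshold=float(env.get("SHAMA_REVERIFY_THRESHOLD", "0.30")),
        deprecate_threshold=float(env.get("SHAMA_DEPRECATE_THRESHOLD", "0.10")),
        episodic_half_life=float(env.get("SHAMA_EPISODIC_HALF_LIFE", "24.0")),
        semantic_half_life=float(env.get("SHAMA_SEMANTIC_HALF_LIFE", "720.0")),
        max_context_tokens=int(env.get("SHAMA_MAX_CONTEXT_TOKENS", "4000")),
        decay_interval_minutes=int(env.get("SHAMA_DECAY_INTERVAL_MINUTES", "15")),
        promotion_interval_minutes=int(env.get("SHAMA_PROMOTION_INTERVAL_MINUTES", "60")),
    )
    # The deprecate threshold must sit below the re-verify threshold,
    # otherwise no memory would ever be re-verified before deprecation.
    if not 0.0 <= cfg.deprecate_threshold < cfg.reverify_threshold <= 1.0:
        raise ValueError("expected 0 <= deprecate threshold < re-verify threshold <= 1")
    return cfg


defaults = load_tuning({})
print(defaults.reverify_threshold, defaults.deprecate_threshold)  # 0.3 0.1
```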

Step 4 - Start Infrastructure (Docker)

SHAMA needs three services running: Qdrant (vector DB), Neo4j (graph DB), Redis (cache). Docker Compose starts all three with one command.

# Start all services in background
docker compose up -d

Expected output:

 Container shama-qdrant  Started
 Container shama-neo4j   Started
 Container shama-redis   Started

This downloads ~800MB of images on first run. Subsequent starts reuse the cached images and are much faster.

Step 5 - Verify Infrastructure

Run each check before proceeding:

Qdrant

curl http://localhost:6333/
# Expected: {"title":"qdrant - vector search engine","version":"..."}

Neo4j

Open http://localhost:7474 in your browser.

Username: neo4j
Password: whatever you set in .env (e.g. shama_2026)
You should see the Neo4j Browser UI.

Redis

docker exec shama-redis redis-cli ping
# Expected: PONG

Qdrant and Redis via Python

python -c "
import asyncio
import os
from dotenv import load_dotenv
load_dotenv()

async def check():
    from shama.stores.vector.qdrant import QdrantVectorStore
    from shama.stores.cache.redis import RedisCacheStore

    v = QdrantVectorStore(url=os.getenv('QDRANT_URL', 'http://localhost:6333'))
    await v.initialize()
    print('Qdrant:', await v.health_check())

    r = RedisCacheStore(url=os.getenv('REDIS_URL', 'redis://localhost:6379'))
    await r.initialize()
    print('Redis: ', await r.health_check())

asyncio.run(check())
"
# Expected:
# Qdrant: True
# Redis:  True

Public API Reference

import asyncio
from shama import ShamaClient

async def main():
    client = ShamaClient.from_config(...)  # see Provider Combinations for the arguments
    await client.initialize()

asyncio.run(main())

Method	Parameters	Returns	Description
`remember()`	`content, agent_id, session_id, source, turn_index`	`EpisodicNode`	Write raw observation to episodic memory
`remember_fact()`	`entity, relation, value, agent_id, session_id, confidence`	`SemanticNode`	Write structured fact + auto contradiction scan
`recall()`	`query, agent_id, session_id, min_confidence, max_tokens`	`RetrievedContext`	Retrieve ranked memory context
`export_agent_data()`	`agent_id`	`dict`	Export all data as JSON (data portability)
`delete_agent_data()`	`agent_id`	`dict`	Hard delete all agent data (GDPR)
`get_audit_trail()`	`agent_id, event_types, since, limit`	`list[dict]`	Full audit history
`run_decay_pass()`	`agent_id`	`dict`	Manual decay trigger
`run_promotion_pass()`	`agent_id`	`dict`	Manual promotion trigger
`health_check()`	-	`dict[str, bool]`	All backend health status

Swappable Backends

Implement any interface from shama.core.interfaces and pass it to from_components():

Layer	Interface	Default	Swap to
Vector DB	`VectorStore`	Qdrant	Pinecone, Weaviate, pgvector
Graph DB	`GraphStore`	Neo4j	Amazon Neptune, FalkorDB
Cache	`CacheStore`	Redis	DragonflyDB, Memcached
Embeddings	`EmbeddingProvider`	OpenAI	Cohere, local models
LLM	`LLMProvider`	DeepSeek / OpenAI	Any LLM
Audit	`AuditStore`	SQLite	PostgreSQL, ClickHouse

from shama import ShamaClient
from my_company.stores import MyPineconeStore

client = ShamaClient.from_components(
    vector_store=MyPineconeStore(),
    graph_store=...,
    cache_store=...,
    embedding_provider=...,
    llm_provider=...,
    audit_store=...,
)
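
What a custom backend might look like is sketched below as an in-memory cache store. The abstract methods in shama.core.interfaces are not listed in this README, so treat the shape as an assumption: only `initialize()` and `health_check()` mirror calls that appear in the verification script above, while `get()`, `set()`, and the TTL handling are hypothetical. A real backend would subclass the actual `CacheStore` interface and implement whatever it declares.

```python
import asyncio
import time
from typing import Optional


class InMemoryCacheStore:
    """Hypothetical cache backend for illustration only.

    initialize() and health_check() mirror calls shown elsewhere in this
    README; get() and set() are assumed method names.
    """

    def __init__(self) -> None:
        self._data = {}  # key -> (value, expiry timestamp or None)
        self._ready = False

    async def initialize(self) -> None:
        self._ready = True

    async def health_check(self) -> bool:
        return self._ready

    async def set(self, key: str, value: str, ttl_seconds: Optional[float] = None) -> None:
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    async def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value


async def demo():
    store = InMemoryCacheStore()
    await store.initialize()
    await store.set("session:1:last_turn", "hello")
    await store.set("session:1:temp", "gone", ttl_seconds=0)
    return (
        await store.health_check(),
        await store.get("session:1:last_turn"),
        await store.get("session:1:temp"),
    )


result = asyncio.run(demo())
print(result)  # (True, 'hello', None)
```

An in-memory store like this is mainly useful for unit tests, where starting Redis would be overkill.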

Provider Combinations

Use case	LLM	Embeddings	Install
HF cloud (cheapest after DeepSeek)	Mistral-7B via HF API	BGE-large via HF API	`pip install shama[huggingface]`
Fully local / air-gapped	Phi-3 local	BGE-base local	`pip install shama[huggingface-local]`
Best local quality	Llama-3-8B local	BGE-large local	`pip install shama[huggingface-local]`

# DeepSeek LLM + OpenAI embeddings (recommended for cost)
client = ShamaClient.from_config(
    deepseek_api_key="your_deepseek_key",
    embedding_api_key="your_openai_key",       # OpenAI used only for embeddings
)

# OpenAI for everything (simplest)
client = ShamaClient.from_config(
    openai_api_key="sk-...",
)

# Anthropic LLM + OpenAI embeddings
client = ShamaClient.from_config(
    anthropic_api_key="sk-ant-...",
    embedding_api_key="sk-...",                # OpenAI key for embeddings
)

# Azure OpenAI (full Azure stack)
client = ShamaClient.from_config(
    azure_api_key="...",
    azure_endpoint="https://my-resource.openai.azure.com/",
    azure_judge_deployment="gpt-4o",
    azure_fast_deployment="gpt-4o-mini",
    azure_embedding_deployment="text-embedding-3-small",
)

# HuggingFace LLM + HuggingFace embeddings (cloud, cheapest after DeepSeek)
# Get your token at https://huggingface.co/settings/tokens (Read scope is enough).
client = ShamaClient.from_config(
    huggingface_api_key="hf_...",
    huggingface_judge_model="mistralai/Mistral-7B-Instruct-v0.3",
    huggingface_fast_model="mistralai/Mistral-7B-Instruct-v0.3",
    huggingface_embedding_model="BAAI/bge-large-en-v1.5",   # 1024 dims
)

# HuggingFace LLM + OpenAI embeddings (best quality embeddings)
client = ShamaClient.from_config(
    huggingface_api_key="hf_...",
    embedding_api_key="sk-...",    # OpenAI key for embeddings only
)

Quick usage reference

# Option 1: HF Inference API - both LLM and embeddings
client = ShamaClient.from_config(
    huggingface_api_key="hf_...",
    huggingface_embedding_model="BAAI/bge-large-en-v1.5",
)
 
# Option 2: HF for LLM + OpenAI for embeddings
client = ShamaClient.from_config(
    huggingface_api_key="hf_...",
    embedding_api_key="sk-...",
)
 
# Option 3: Fully local - zero API cost, full privacy
client = ShamaClient.from_config(
    huggingface_local_llm_model="microsoft/Phi-3-mini-4k-instruct",
    huggingface_local_embedding_model="BAAI/bge-base-en-v1.5",
    huggingface_local_device="cpu",
)
 
# Option 4: from_components - maximum flexibility
from shama import ShamaClient, HuggingFaceLLMProvider, HuggingFaceLocalEmbeddingProvider
 
client = ShamaClient.from_components(
    llm_provider=HuggingFaceLLMProvider(api_key="hf_...", judge_model="Qwen/Qwen2.5-72B-Instruct"),
    embedding_provider=HuggingFaceLocalEmbeddingProvider(model_name="BAAI/bge-large-en-v1.5"),
    # ... other components
)

Background Scheduler

SHAMA's self-healing runs automatically via Celery. Start it alongside your application:

# In your app startup
from shama.scheduler.tasks import register_shama_context

register_shama_context({
    **client.get_scheduler_context(),
    "agent_registry": ["agent-001", "agent-002"],  # agents to process
})

# Terminal 1 - Celery worker
celery -A shama.scheduler.tasks worker --loglevel=info

# Terminal 2 - Celery beat (scheduler)
celery -A shama.scheduler.tasks beat --loglevel=info

Default schedule:

Decay pass: every 15 minutes
Promotion pass: every 60 minutes
Re-verification: on demand (triggered by the decay pass)
Contradiction resolution: on demand (triggered by every semantic write)
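
For a feel of what a promotion pass does when it clusters episodic nodes by cosine similarity (threshold 0.80, per the Architecture section), here is a minimal sketch. The README does not say which clustering algorithm SHAMA uses; the greedy single-pass approach below, which compares each vector against the first member of each existing cluster, is an assumption chosen for brevity, and both function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(vectors, threshold=0.80):
    """Assign each vector to the first cluster whose seed vector is similar enough."""
    clusters = []  # each cluster is a list of indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar seed found: start a new cluster
    return clusters


embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]
clusters = greedy_cluster(embeddings)
print(clusters)  # [[0, 1], [2, 3]]
```

Each resulting cluster would then be handed to the LLM for distillation into entity-relation-value triples.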

License

MIT - use freely, including commercially.

Project details

This version

0.1.0

Jun 20, 2026

Download files

Source Distribution

shama-0.1.0.tar.gz (54.5 kB)

Uploaded Jun 20, 2026 Source

Built Distribution

shama-0.1.0-py3-none-any.whl (60.3 kB)

Uploaded Jun 20, 2026 Python 3

File details

Details for the file shama-0.1.0.tar.gz.

File metadata

Download URL: shama-0.1.0.tar.gz
Upload date: Jun 20, 2026
Size: 54.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.14

File hashes

Hashes for shama-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`83142b28f2307526047fd221180dd798c93ec545ceab2f8d9261758f4d49c35d`
MD5	`879380fe964629211c34fc9c1be8fd9c`
BLAKE2b-256	`513549ac9ed1a7279954a9c4f5fd2bb668cce6962697f703bf04180065ce032f`

File details

Details for the file shama-0.1.0-py3-none-any.whl.

File metadata

Download URL: shama-0.1.0-py3-none-any.whl
Upload date: Jun 20, 2026
Size: 60.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.14

File hashes

Hashes for shama-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`04d89b355044c6a0c7689e9306372c5284960cb690b100bd0ba72a0fbee04913`
MD5	`1b7e514457c3c27a84d98069849bf881`
BLAKE2b-256	`15c64477e8c1012066b484fbd35566b006410b7d09244013f7e98e51111632de`
