Memory traces for AI agents - Think like humans

engram

Memory traces for AI agents — Think like humans.


Dual-memory AI system combining episodic (vector) + semantic (graph) memory with LLM reasoning. Enterprise-ready with multi-tenancy, auth, caching, observability, and Docker deployment. Exposes CLI, MCP, HTTP API (/api/v1/), and WebSocket API (/ws) for real-time agent communication.

Features

Core Memory

Episodic Memory — ChromaDB vector store, semantic similarity search, Ebbinghaus decay, activation-based scoring, topic-key upsert
Semantic Graph — NetworkX MultiDiGraph, typed entities and relationships, SQLite (default) or PostgreSQL backend, weighted edges
Reasoning Engine — LLM synthesis (Gemini via litellm), dual-memory context fusion, constitution-guarded prompts
Recall Pipeline — Query decision, temporal+pronoun entity resolution, parallel multi-source search, dedup, composite scoring
Auto Memory — Detect and persist save-worthy messages automatically, poisoning guard for injection prevention
Feedback Loop — Confidence scoring (+0.15/−0.2), importance adjustment, auto-delete on 3x negative feedback
Graph Visualization — Interactive entity relationship explorer with dark theme, search, click-to-inspect (vis-network)
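The Ebbinghaus decay mentioned for episodic memory is usually modeled as exponential retention, R = exp(-t / S). The sketch below shows that general idea only; the function names, the `stability_hours` parameter, and the choice to multiply similarity by retention are assumptions for illustration, not engram's actual scoring code.

```python
import math


def retention(age_hours: float, stability_hours: float = 24.0) -> float:
    """Ebbinghaus-style retention R = exp(-t / S), a value in (0, 1]."""
    if age_hours < 0 or stability_hours <= 0:
        raise ValueError("age must be >= 0 and stability must be > 0")
    return math.exp(-age_hours / stability_hours)


def decayed_score(similarity: float, age_hours: float,
                  stability_hours: float = 24.0) -> float:
    """Weight a similarity score by how much of the memory is still retained.

    Assumption: a plain product; a real system may blend in other signals.
    """
    return similarity * retention(age_hours, stability_hours)
```

With these assumptions, a fresh memory keeps its full similarity score, and a larger `stability_hours` makes older memories fade more slowly.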

Intelligence Layer

Temporal Resolution — 28 Vietnamese+English date patterns resolve "hôm nay/yesterday" → ISO dates before storing
Pronoun Resolution — "anh ấy/he/she" → named entity from graph context, LLM-based fallback
Fusion Formatter — Group recall results by type [preference]/[fact]/[lesson] for structured LLM context
Memory Consolidation — Jaccard clustering + LLM summarization reduces redundancy
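To make the Jaccard clustering step of memory consolidation concrete, here is a minimal sketch. It assumes whitespace tokenization, a greedy single pass, and a 0.5 threshold; all three are illustrative choices, and the LLM summarization that follows clustering is not shown.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """|A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_jaccard(texts: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: each text joins the first cluster whose
    representative (its first member) is at least `threshold` similar."""
    clusters: list[tuple[set[str], list[str]]] = []
    for text in texts:
        tokens = set(text.lower().split())
        for rep_tokens, members in clusters:
            if jaccard(tokens, rep_tokens) >= threshold:
                members.append(text)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append((tokens, [text]))
    return [members for _, members in clusters]
```

Each resulting cluster with more than one member would then be a candidate for summarization into a single memory.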

Enterprise

Multi-Surface — CLI (Typer), MCP Server (stdio), HTTP API (FastAPI), WebSocket, Web UI
Authentication — JWT + API keys with RBAC (ADMIN, AGENT, READER), optional, disabled by default
Multi-Tenancy — Isolated per-tenant stores, contextvar propagation, row-level PostgreSQL isolation
Caching — Redis-backed result caching with per-endpoint TTLs
Rate Limiting — Sliding-window per-tenant limits, fail_open option
Audit Trail — Structured before/after JSONL log for every episodic mutation
Resource Tiers — 4-tier LLM degradation (FULL → STANDARD → BASIC → READONLY), 60s auto-recovery
Data Constitution — 3-law LLM governance (namespace isolation, no fabrication, audit rights), SHA-256 tamper detection
Consolidation Scheduler — Asyncio background tasks (cleanup daily, consolidate 6h, decay daily), tier-aware
Key Rotation — Failover/round-robin for embedding API keys (GEMINI_API_KEY + GEMINI_API_KEY_FALLBACK)
Observability — OpenTelemetry + JSONL audit logging (optional)
Deployment — Docker Compose, Kubernetes-ready, health checks
Backup/Restore — Memory snapshots, point-in-time recovery
Benchmark Suite — p50/p95/p99 latency measurements for all endpoints
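As an illustration of the sliding-window, per-tenant rate limiting listed above, the following sketch keeps timestamps in memory. The class name, the injectable clock, and the `check` helper are hypothetical; the real limiter is described as Redis-backed, which is why `check` models `fail_open` by catching a connection error.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """In-memory sliding-window limiter keyed by tenant (illustrative only)."""

    def __init__(self, limit: int, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, tenant_id: str) -> bool:
        now = self.clock()
        hits = self._hits[tenant_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def check(limiter: SlidingWindowLimiter, tenant_id: str,
          fail_open: bool = True) -> bool:
    """If the backing store is unreachable, allow (fail open) or deny."""
    try:
        return limiter.allow(tenant_id)
    except ConnectionError:
        return fail_open
```

Failing open trades strict enforcement for availability: requests keep flowing when the store is down, at the cost of temporarily unenforced limits.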

Architecture

flowchart TD
    CLI["CLI (Typer)"]
    MCP["MCP (stdio)"]
    HTTP["HTTP API /api/v1/"]
    WS["WebSocket /ws"]

    CLI & MCP & HTTP & WS --> Auth["Auth Middleware\n(JWT + RBAC, optional)"]
    Auth --> Tenant["TenantContext (ContextVar)"]
    Tenant --> Recall["Recall Pipeline\n(decision → resolve → search → feedback)"]
    Recall --> Episodic["EpisodicStore\n(ChromaDB)"]
    Recall --> Semantic["SemanticGraph\n(NetworkX + SQLite/PG)"]
    Episodic & Semantic --> Reasoning["Reasoning Engine\n(Gemini via litellm)"]
    Episodic --> Cache["Redis Cache (optional)"]
    WS --> EventBus["Event Bus\n(push events)"]

Quick Start

# Install from source
git clone https://github.com/docaohieu2808/Engram-Mem.git
cd Engram-Mem
pip install -e .

# Optional: dev dependencies
pip install -e ".[dev]"

# Optional: OpenTelemetry support
pip install -e ".[telemetry]"

Requirements: Python 3.11+

Optional: GEMINI_API_KEY for LLM reasoning and embeddings. Basic storage works without it.

CLI Usage

export GEMINI_API_KEY="your-key"

# Store a memory
engram remember "Deployed v2.1 to production at 14:00 - caused 503 spike"

# Retrieve similar memories
engram recall "production incidents"

# Reason across all memory
engram think "What deployment issues have we had?"

# Add knowledge graph entities
engram add node "PostgreSQL" --type Technology
engram add edge "Service:API" "Technology:PostgreSQL" --relation uses

# Start HTTP API server
engram serve

Python SDK

import asyncio

from engram import EngramClient


async def main() -> None:
    async with EngramClient(namespace="my-agent") as client:
        # Auto-recalls relevant context before the LLM call and
        # auto-extracts facts from the response in the background
        response = await client.chat([
            {"role": "user", "content": "Deploy to production"}
        ])

        # Explicit operations (run them while the client is still open)
        await client.remember("No deploys on Fridays", memory_type="decision", priority=8)
        results = await client.recall("deployment policy", limit=5)
        answer = await client.think("What are our deployment rules?")


asyncio.run(main())

HTTP API Usage

# Store memory
curl -X POST http://localhost:8765/api/v1/remember \
  -H "Content-Type: application/json" \
  -d '{"content": "Deployed v1.0 to production", "memory_type": "fact", "priority": 8}'

# Search
curl "http://localhost:8765/api/v1/recall?query=deployment&limit=5"

# Reason
curl -X POST http://localhost:8765/api/v1/think \
  -H "Content-Type: application/json" \
  -d '{"question": "What deployment issues have we had?"}'
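The same requests can be built from Python with only the standard library. This is a sketch, not part of the engram SDK: the helper names are made up, and the base URL simply reuses the host and port from the curl examples. The requests are constructed but not sent, so the snippet runs without a server.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8765/api/v1"  # host/port taken from the curl examples


def remember_request(content: str, memory_type: str, priority: int) -> urllib.request.Request:
    """Build the POST /remember request with a JSON body."""
    body = json.dumps(
        {"content": content, "memory_type": memory_type, "priority": priority}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/remember",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recall_request(query: str, limit: int = 5) -> urllib.request.Request:
    """Build the GET /recall request with URL-encoded query parameters."""
    qs = urllib.parse.urlencode({"query": query, "limit": limit})
    return urllib.request.Request(f"{BASE_URL}/recall?{qs}", method="GET")


# To send a request against a running server:
#   with urllib.request.urlopen(req) as resp:
#       payload = json.load(resp)
```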

Configuration

Config file: ~/.engram/config.yaml — Priority: CLI flags > env vars > YAML > defaults

episodic:
  provider: chromadb
  path: ~/.engram/episodic
  namespace: default

embedding:
  provider: gemini
  model: gemini-embedding-001
  # Key rotation: failover (default) or round-robin
  key_strategy: failover  # GEMINI_API_KEY + GEMINI_API_KEY_FALLBACK

semantic:
  provider: sqlite          # or postgresql
  path: ~/.engram/semantic.db
  # For PostgreSQL:
  # dsn: postgresql://user:pass@localhost/engram
  # pool_min: 5
  # pool_max: 20

llm:
  provider: gemini
  model: gemini/gemini-2.0-flash
  api_key: ${GEMINI_API_KEY}

auth:
  enabled: false            # Set to true for production
  jwt_secret: "use-32+-chars-or-${ENV_VAR}"

cache:
  enabled: false            # Set to true with Redis
  redis_url: redis://localhost:6379/0

rate_limit:
  enabled: false
  requests_per_minute: 100
  fail_open: true           # Allow through on Redis failure

audit:
  enabled: false
  path: ~/.engram/audit.jsonl

telemetry:
  enabled: false            # Requires telemetry extra
  otlp_endpoint: http://localhost:4317

recall:
  enabled: true
  decision_skip_trivial: true
  entity_resolution_enabled: true
  parallel_search_enabled: true
  feedback_enabled: true
  auto_consolidate_threshold: 20
  retrieval_audit_enabled: true

ingestion:
  auto_memory_enabled: true
  guard_enabled: true

feedback:
  confidence_positive_delta: 0.15
  confidence_negative_delta: 0.2
  auto_delete_threshold: 3

resolution:
  temporal_enabled: true
  pronoun_enabled: true

fusion:
  formatter_enabled: true

graph:
  visualization_enabled: true
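The precedence stated at the top of this section (CLI flags > env vars > YAML > defaults) can be sketched with `collections.ChainMap`, which returns the first match across layered mappings. The flat dotted keys and the `resolve_config` name are assumptions for illustration; engram's own loader may be structured differently.

```python
from collections import ChainMap
from typing import Any, Mapping


def resolve_config(cli: Mapping[str, Any], env: Mapping[str, Any],
                   yaml_cfg: Mapping[str, Any],
                   defaults: Mapping[str, Any]) -> ChainMap:
    """Earlier layers win: CLI flags > env vars > YAML > defaults."""
    return ChainMap(dict(cli), dict(env), dict(yaml_cfg), dict(defaults))
```

A lookup such as `cfg["auth.enabled"]` walks the layers in order, so a value set by a CLI flag hides the same key in the environment, the YAML file, and the defaults.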

Environment Variables

Variable	Purpose
`GEMINI_API_KEY`	LLM + embeddings (primary key)
`GEMINI_API_KEY_FALLBACK`	Secondary key for key rotation
`GEMINI_KEY_STRATEGY`	`failover` (default) or `round-robin`
`ENGRAM_AUTH_ENABLED`	Enable auth (`true`/`false`)
`ENGRAM_AUTH_JWT_SECRET`	JWT signing key (32+ chars)
`ENGRAM_SEMANTIC_PROVIDER`	`sqlite` or `postgresql`
`ENGRAM_SEMANTIC_DSN`	PostgreSQL connection string
`ENGRAM_CACHE_ENABLED`	Enable Redis caching
`ENGRAM_CACHE_REDIS_URL`	Redis URL
`ENGRAM_RATE_LIMIT_ENABLED`	Enable rate limiting
`ENGRAM_RATE_LIMIT_REQUESTS_PER_MINUTE`	Requests allowed per minute (default 60)
`ENGRAM_AUDIT_ENABLED`	Enable audit logs
`ENGRAM_TELEMETRY_ENABLED`	Enable OpenTelemetry
`ENGRAM_RECALL_ENABLED`	Enable recall pipeline
`ENGRAM_INGESTION_AUTO_MEMORY_ENABLED`	Detect save-worthy messages
`ENGRAM_INGESTION_GUARD_ENABLED`	Block prompt injection
`ENGRAM_FEEDBACK_ENABLED`	Enable feedback loop
`ENGRAM_RESOLUTION_TEMPORAL_ENABLED`	Temporal entity resolution
`ENGRAM_RESOLUTION_PRONOUN_ENABLED`	Pronoun resolution via LLM
`ENGRAM_FUSION_FORMATTER_ENABLED`	Format recall by type
`ENGRAM_GRAPH_VISUALIZATION_ENABLED`	Enable graph UI at /graph

Note: ${VARIABLE} syntax in YAML is expanded at load time.
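A minimal sketch of what `${VARIABLE}` expansion at load time could look like. The regular expression, the `expand_vars` name, and the choice to leave unknown variables untouched are assumptions; the source does not say how engram treats unset variables.

```python
import re
from typing import Mapping

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(value: str, env: Mapping[str, str]) -> str:
    """Replace each ${NAME} with env[NAME]; unknown names are left as-is."""
    return _VAR.sub(lambda m: env.get(m.group(1), m.group(0)), value)
```

In practice `env` would be `os.environ`; passing it explicitly keeps the function easy to test.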

API Reference

Start server: engram serve [--host 0.0.0.0] [--port 8765]

Memory and graph data endpoints live under /api/v1/; /health, /health/ready, and /graph are served from the root. Legacy routes redirect. Auth is disabled by default.

Method	Endpoint	Purpose
`GET`	`/health`	Liveness check (always available)
`GET`	`/health/ready`	Readiness probe
`POST`	`/api/v1/remember`	Store episodic memory
`GET`	`/api/v1/recall`	Search memories (`?query=X&limit=5&offset=0`)
`POST`	`/api/v1/think`	LLM reasoning across episodic + semantic
`GET`	`/api/v1/query`	Graph search (`?keyword=X&node_type=Y&related_to=Z`)
`POST`	`/api/v1/ingest`	Extract entities + store memories from messages
`POST`	`/api/v1/feedback`	Record positive/negative feedback on a memory
`GET`	`/api/v1/graph/data`	Graph data JSON for visualization
`GET`	`/graph`	Interactive graph visualization UI
`POST`	`/api/v1/cleanup`	Delete expired memories (admin only)
`POST`	`/api/v1/summarize`	LLM synthesis of recent memories (admin only)
`POST`	`/api/v1/auth/token`	Issue JWT (admin_secret required)
`POST`	`/api/v1/backup`	Export all memory to JSON
`POST`	`/api/v1/restore`	Import backup snapshot

Responses: All wrapped in {data, meta}. Errors: {error: {code, message}, meta}.

Auth: When enabled, use Authorization: Bearer <JWT> or X-API-Key: <key>.

WebSocket API

Real-time bidirectional communication. Connect via ws://host:8765/ws?token=JWT (token optional when auth disabled).

Commands (client to server):

Command	Payload
`remember`	`{"content": "...", "priority": 7}`
`recall`	`{"query": "...", "limit": 5}`
`think`	`{"question": "..."}`
`feedback`	`{"memory_id": "abc123", "feedback": "positive"}`
`query`	`{"keyword": "PostgreSQL"}`
`ingest`	`{"messages": [...]}`
`status`	`{}`

Example:

{"id": "corr-1", "type": "remember", "payload": {"content": "User prefers dark mode", "priority": 7}}
{"id": "corr-1", "type": "response", "status": "ok", "data": {"id": "mem-xyz"}}

Push Events (server to all agents in same tenant):

{"type": "event", "event": "memory_created", "tenant_id": "default", "data": {"id": "mem-xyz"}}

Events: memory_created, memory_updated, memory_deleted, feedback_recorded

CLI Reference

Memory Operations

# Store with options
engram remember <content> [--type fact|decision|preference|todo|error|context|workflow]
                          [--priority 1-10] [--tags tag1,tag2] [--expires 2h|1d|7d]

# Search
engram recall <query> [--limit 5] [--type <type>] [--tags tag1,tag2] [--namespace <ns>]
              [--resolve-entities] [--resolve-temporal]

# Reason
engram think <question>
engram summarize [--count 20] [--save]
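The `--expires` values shown above (2h, 1d, 7d) follow a number-plus-unit pattern. A minimal parser for that pattern might look like the sketch below; it accepts only the `h` and `d` units that appear in the help text, and whether engram supports other units is not stated here.

```python
import re
from datetime import timedelta

_UNITS = {"h": "hours", "d": "days"}  # only the units shown in the CLI help


def parse_expires(spec: str) -> timedelta:
    """Turn strings like '2h' or '7d' into a timedelta."""
    m = re.fullmatch(r"(\d+)([hd])", spec.strip())
    if m is None:
        raise ValueError(f"unsupported expiry: {spec!r}")
    return timedelta(**{_UNITS[m.group(2)]: int(m.group(1))})
```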

Semantic Graph

engram add node <name> --type <NodeType>
engram add edge <from_key> <to_key> --relation <relation>
engram remove node <key>
engram query [<keyword>] [--type <NodeType>] [--related-to <name>] [--format table|json]
engram graph [--search <keyword>]       # Open interactive graph visualization

Intelligence & Pipeline

engram resolve <query> [--context "..."]      # Entity/temporal resolution
engram feedback <id> [--positive|--negative]  # Record feedback on memory
engram audit [--last N]                        # View retrieval audit log
engram benchmark --questions file.json         # Run accuracy benchmarks
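The effect of `engram feedback` can be sketched from the documented values: +0.15 confidence for positive feedback, -0.2 for negative, and auto-delete after 3 negatives. The `MemoryRecord` class, the 0.5 starting confidence, the clamping to [0, 1], and counting negatives cumulatively are all assumptions made for this example.

```python
from dataclasses import dataclass

POSITIVE_DELTA = 0.15       # feedback.confidence_positive_delta
NEGATIVE_DELTA = 0.2        # feedback.confidence_negative_delta
AUTO_DELETE_THRESHOLD = 3   # feedback.auto_delete_threshold


@dataclass
class MemoryRecord:
    """Hypothetical stand-in for a stored memory."""
    confidence: float = 0.5   # starting value is an assumption
    negative_count: int = 0
    deleted: bool = False


def apply_feedback(mem: MemoryRecord, positive: bool) -> MemoryRecord:
    """Adjust confidence and flag the memory for deletion when warranted."""
    if positive:
        mem.confidence = min(1.0, mem.confidence + POSITIVE_DELTA)
    else:
        mem.confidence = max(0.0, mem.confidence - NEGATIVE_DELTA)
        mem.negative_count += 1
        if mem.negative_count >= AUTO_DELETE_THRESHOLD:
            mem.deleted = True
    return mem
```

The asymmetric deltas mean one negative vote outweighs one positive vote, so a memory needs more confirmations than complaints to gain confidence.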

System

engram status                          # Memory + graph stats
engram serve [--host 127.0.0.1] [--port 8765]
engram watch [--daemon]                # Watch inbox + start scheduler
engram resource-status                 # Show resource tier (FULL/STANDARD/BASIC/READONLY)
engram constitution-status             # Show 3-law governance + SHA-256 hash
engram scheduler-status                # Show background task schedule + state

Maintenance

engram cleanup                         # Delete expired memories
engram ingest <file.json> [--dry-run]  # Ingest chat JSON
engram backup                          # Export memory snapshot
engram restore <file>                  # Import snapshot
engram config show                     # Print current configuration
engram config get <key>
engram config set <key> <value>

MCP Integration

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "engram": {
      "command": "/path/to/.venv/bin/engram-mcp",
      "env": {
        "GEMINI_API_KEY": "your-key"
      }
    }
  }
}

Available MCP Tools:

Tool	Description
`engram_remember`	Store memory with type, priority, tags, namespace
`engram_recall`	Search episodic memories (compact format by default)
`engram_resolve`	Entity/temporal resolution for query context
`engram_feedback`	Record positive/negative feedback on memories
`engram_think`	Reason across episodic + semantic memory via LLM
`engram_summarize`	Summarize recent N memories via LLM
`engram_cleanup`	Delete all expired memories
`engram_status`	Show memory statistics
`engram_get_memory`	Retrieve full memory content by ID or prefix
`engram_timeline`	Chronological context around a memory
`engram_add_entity`	Add entity node to knowledge graph
`engram_add_relation`	Add relationship edge between entities
`engram_query_graph`	Query knowledge graph
`engram_get_graph_data`	Retrieve graph data for visualization
`engram_ingest`	Dual ingest: extract entities + store memories
`engram_session_start`	Begin new conversation session
`engram_session_end`	End active session with optional summary
`engram_session_summary`	Get summary of completed session
`engram_session_context`	Retrieve memories from active session

Claude Code Hooks: Engram ships Stop + SessionEnd hooks that capture conversation turns automatically — no SDK integration needed.

Embeddings

Two embedding modes are available: Gemini (the default) and a local fallback that needs no API key.

Mode	Model	Dimensions	Requires
Gemini (default)	`gemini-embedding-001`	3072	`GEMINI_API_KEY`
Fallback	`all-MiniLM-L6-v2` (ChromaDB default)	384	nothing

Embedding dimensions must remain consistent within a collection. If you switch providers, reinitialize the episodic store or create a new collection.

Key rotation: Set GEMINI_API_KEY_FALLBACK and GEMINI_KEY_STRATEGY=round-robin to distribute quota across multiple API keys.
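The difference between the two key strategies can be sketched as follows. `KeyRing` and its methods are hypothetical, and wrapping back to the first key after the last one fails is an assumption; the sketch only illustrates that failover sticks with one key until it errors, while round-robin alternates on every call.

```python
class KeyRing:
    """Illustrative key selector for 'failover' and 'round-robin'."""

    def __init__(self, keys: list[str], strategy: str = "failover") -> None:
        if not keys:
            raise ValueError("at least one key is required")
        if strategy not in ("failover", "round-robin"):
            raise ValueError(f"unknown strategy: {strategy!r}")
        self.keys = keys
        self.strategy = strategy
        self._index = 0

    def next_key(self) -> str:
        """Return the key to use for the next request."""
        key = self.keys[self._index]
        if self.strategy == "round-robin":
            # Advance on every call to spread quota across keys.
            self._index = (self._index + 1) % len(self.keys)
        return key

    def report_failure(self) -> None:
        """Failover only: switch to the next key after an error."""
        if self.strategy == "failover":
            self._index = (self._index + 1) % len(self.keys)
```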

Benchmarks

Run python tests/benchmark_performance.py --host 127.0.0.1 --port 8765 against a running server.

Sample results (local, SQLite backend, no Redis):

Operation	p50	p95	p99	Notes
`GET /health`	0.8ms	1.2ms	2.1ms	Always fast
`POST /remember`	415ms	680ms	920ms	Embedding API bound
`GET /recall`	1.3ms	3.2ms	5.8ms	Vector search
`POST /think`	5.6s	7.2s	9.4s	LLM bound

Flags: --quick for a faster subset, --concurrency N for load testing.
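For readers unfamiliar with the p50/p95/p99 columns, the sketch below shows one common way to compute them from raw latency samples using the standard library. It is not taken from `tests/benchmark_performance.py`; the function name and the choice of the inclusive quantile method are assumptions.

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from latency samples (needs at least two)."""
    # n=100 yields 99 cut points; index i holds the (i + 1)th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

p99 is sensitive to small sample counts, so tail figures from short runs should be read as rough indications.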

Docker

Quick Start

docker build -t engram:latest .
docker run -e GEMINI_API_KEY="your-key" -p 8765:8765 engram:latest

Production (PostgreSQL + Redis)

# Production env vars
ENGRAM_AUTH_ENABLED=true
ENGRAM_AUTH_JWT_SECRET=$(openssl rand -hex 32)
ENGRAM_SEMANTIC_PROVIDER=postgresql
ENGRAM_SEMANTIC_DSN=postgresql://user:pass@postgres:5432/engram
ENGRAM_CACHE_ENABLED=true
ENGRAM_CACHE_REDIS_URL=redis://redis:6379/0
ENGRAM_AUDIT_ENABLED=true
GEMINI_API_KEY=your-key
GEMINI_API_KEY_FALLBACK=your-fallback-key
GEMINI_KEY_STRATEGY=round-robin

See deployment-guide.md for Docker Compose with PostgreSQL, Redis, and OpenTelemetry.

Test Coverage

# Run all tests
pytest tests/ -v

# With coverage
pytest tests/ --cov=src/engram --cov-report=html

# Specific suites
pytest tests/ -k "recall or resolution or feedback" -v
pytest tests/ -k "websocket" -v

894+ tests across all modules
61%+ code coverage (core features 80%+)
CI/CD: GitHub Actions runs full suite on every PR and commit

Documentation

Project Overview & PDR — Features, requirements, config reference
System Architecture — Design, data flow, deployment patterns
Code Standards — Conventions, patterns, best practices
Deployment Guide — Docker, Kubernetes, environment variables, auth setup
Codebase Summary — Module inventory, metrics
Project Roadmap — Completed phases and future work
Changelog — Full version history

Contributing

Pull requests welcome. See CONTRIBUTING.md.

Before submitting:

Run ruff check src/ for linting
Run pytest tests/ — do not ignore failing tests
Do not commit .env files or API keys

License

MIT — Copyright (c) Do Cao Hieu
