Episodic Memory Kernel - An immutable, append-only ledger of agent experiences for AI systems
emk - Episodic Memory Kernel
"The Hard Drive" - An immutable, append-only ledger of agent experiences.
Overview
emk (Episodic Memory Kernel) is a Layer 1 primitive for storing agent experiences as structured episodes. Unlike context construction systems (like caas), emk is the permanent record. It provides a simple, immutable storage layer for agent experiences following the pattern: Goal → Action → Result → Reflection.
Core Value Proposition
- Immutable Storage: Episodes are append-only; no modifications allowed
- Structured Memory: Each episode captures the complete experience cycle
- Flexible Retrieval: Support for both simple file-based and vector-based retrieval
- Minimal Dependencies: Core functionality requires only pydantic and numpy
- Not "Smart": Does not summarize or interpret - just stores and retrieves
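To make the first two bullets concrete, here is a minimal sketch of what "immutable" and "append-only" can mean in plain Python. It uses a frozen dataclass as a stand-in. `EpisodeSketch`, `try_to_modify` and the `ledger` list are hypothetical names for illustration only and are not part of emk, whose own schema is built on pydantic.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import List


@dataclass(frozen=True)
class EpisodeSketch:
    """Hypothetical stand-in for an episode; NOT emk's actual Episode class."""
    goal: str
    action: str
    result: str
    reflection: str


def try_to_modify(episode: EpisodeSketch) -> bool:
    """Return True if the record could be changed in place, False if it refused."""
    try:
        episode.result = "rewritten"  # frozen dataclasses reject assignment
    except FrozenInstanceError:
        return False
    return True


episode = EpisodeSketch(
    goal="Retrieve user preferences",
    action="Query database",
    result="Preferences retrieved",
    reflection="Query was efficient",
)

# Append-only: the ledger only ever grows; entries are never edited or removed.
ledger: List[EpisodeSketch] = []
ledger.append(episode)

print(try_to_modify(episode))  # False
```

A correction in such a ledger is expressed by appending a new episode that describes what changed, not by editing the old one.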
Installation
# Basic installation
pip install emk
# With ChromaDB support for vector search
pip install emk[chromadb]
# Development installation
pip install emk[dev]
Quick Start
Basic Usage with FileAdapter
from emk import Episode, FileAdapter
# Create a memory store
store = FileAdapter("agent_memories.jsonl")
# Create and store an episode
episode = Episode(
    goal="Retrieve user preferences",
    action="Query database for user_id=123",
    result="Successfully retrieved preferences",
    reflection="Database query was efficient and returned expected data",
    metadata={"user_id": "123", "query_time_ms": 45}
)
# Store the episode
episode_id = store.store(episode)
print(f"Stored episode: {episode_id}")
# Retrieve recent episodes
recent = store.retrieve(limit=10)
for ep in recent:
    print(f"{ep.goal} -> {ep.result}")
# Retrieve by ID
specific = store.get_by_id(episode_id)
print(f"Retrieved: {specific.goal}")
# Filter by metadata
user_episodes = store.retrieve(filters={"user_id": "123"})
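For readers curious how a JSONL-backed, append-only store with metadata filtering can work, the following is a rough sketch using only the standard library. It is not emk's FileAdapter implementation. `append_record` and `read_records` are hypothetical helpers, and the newest-first ordering and exact-match filter semantics are assumptions made for the example.

```python
import json
import os
import tempfile
from typing import Any, Dict, List, Optional


def append_record(path: str, record: Dict[str, Any]) -> None:
    """Append one record as a single JSON line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_records(
    path: str,
    filters: Optional[Dict[str, Any]] = None,
    limit: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Return records newest-first, keeping those whose metadata matches every filter."""
    with open(path, encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    if filters:
        records = [
            r for r in records
            if all(r.get("metadata", {}).get(k) == v for k, v in filters.items())
        ]
    records.reverse()  # assumption: most recently appended first
    return records[:limit] if limit is not None else records


path = os.path.join(tempfile.mkdtemp(), "episodes.jsonl")
append_record(path, {"goal": "A", "metadata": {"user_id": "123"}})
append_record(path, {"goal": "B", "metadata": {"user_id": "456"}})
append_record(path, {"goal": "C", "metadata": {"user_id": "123"}})

matches = read_records(path, filters={"user_id": "123"})
print([r["goal"] for r in matches])  # ['C', 'A']
```

Opening the file in mode "a" is what keeps the sketch append-only: every write lands at the end of the file, and reads never modify it.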
Using ChromaDB for Vector Search
from emk import Episode, ChromaDBAdapter
import numpy as np
# Create a vector store
store = ChromaDBAdapter(
    collection_name="agent_episodes",
    persist_directory="./chroma_data"
)
# Create an episode with embedding
episode = Episode(
    goal="Learn Python syntax",
    action="Read Python documentation",
    result="Understood basic syntax",
    reflection="Need more practice with decorators"
)
# Create a simple embedding (in practice, use a real embedding model)
embedding = np.random.rand(384)
# Store with embedding
store.store(episode, embedding=embedding)
# Query by similarity
query_embedding = np.random.rand(384)
similar = store.retrieve(query_embedding=query_embedding, limit=5)
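Querying by similarity means ranking stored embeddings by how close they are to the query embedding. The sketch below shows the idea with cosine similarity in dependency-free Python. Cosine similarity is one common choice; which metric ChromaDBAdapter actually uses is not stated here. `cosine_similarity`, `top_k` and the three-dimensional toy vectors are hypothetical and purely illustrative.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    stored: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the IDs of the k stored vectors most similar to the query."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [episode_id for episode_id, _ in ranked[:k]]


# Toy 3-dimensional "embeddings"; a real model would produce e.g. 384 dimensions.
stored = [
    ("ep-python", [1.0, 0.0, 0.0]),
    ("ep-auth", [0.0, 1.0, 0.0]),
    ("ep-decorators", [0.9, 0.1, 0.0]),
]

nearest = top_k([1.0, 0.0, 0.0], stored, k=2)
print(nearest)  # ['ep-python', 'ep-decorators']
```

Random vectors, as in the example above, exercise the API but produce meaningless rankings; useful results depend on embeddings from a real model.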
Using the Indexer
from emk import Episode, Indexer
episode = Episode(
    goal="Implement user authentication",
    action="Created JWT-based auth system",
    result="Users can now login securely",
    reflection="Should add rate limiting next"
)
# Generate searchable tags
tags = Indexer.generate_episode_tags(episode)
print(f"Tags: {tags}")
# Create search text for embedding
search_text = Indexer.create_search_text(episode)
print(f"Search text: {search_text}")
# Enrich metadata with indexing info
enriched = Indexer.enrich_metadata(episode, auto_tags=True)
print(f"Enriched metadata: {enriched}")
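As an illustration of what tag generation and search-text creation might involve, here is a small keyword-based sketch. It does not reproduce the Indexer's actual logic. `extract_tags_sketch`, `search_text_sketch`, the stop-word list and the minimum word length are all assumptions chosen for the example.

```python
import re
from typing import List

# Hypothetical stop-word list; a real indexer may use a different one or none.
STOP_WORDS = {"a", "an", "and", "the", "to", "for", "of", "in", "on", "with"}


def extract_tags_sketch(text: str, min_length: int = 3) -> List[str]:
    """Lowercase the text, split it into words, and keep unique non-trivial ones."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    tags: List[str] = []
    for word in words:
        if len(word) >= min_length and word not in STOP_WORDS and word not in tags:
            tags.append(word)
    return tags


def search_text_sketch(goal: str, action: str, result: str, reflection: str) -> str:
    """Join the four episode fields into one string suitable for embedding."""
    return " ".join([goal, action, result, reflection])


tags = extract_tags_sketch("Implement user authentication")
print(tags)  # ['implement', 'user', 'authentication']

text = search_text_sketch(
    "Implement user authentication",
    "Created JWT-based auth system",
    "Users can now login securely",
    "Should add rate limiting next",
)
```

The combined search text is what would be passed to an embedding model before storing the episode in a vector store.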
Architecture
The Schema
The Episode class defines the core data structure:
class Episode:
    goal: str                 # The agent's intended objective
    action: str               # The action taken
    result: str               # The outcome
    reflection: str           # Analysis or learning
    timestamp: datetime       # Auto-generated
    metadata: Dict[str, Any]  # Additional context
    episode_id: str           # Unique hash-based ID
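The schema describes episode_id as a hash-based ID. One plausible way to derive such an ID is to hash a canonical serialization of the episode's content, sketched below. Which fields emk actually hashes, and with which algorithm, is not specified here; `content_hash_id`, the choice of SHA-256 and the inclusion of the timestamp are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone


def content_hash_id(
    goal: str,
    action: str,
    result: str,
    reflection: str,
    timestamp: datetime,
) -> str:
    """Derive a deterministic ID by hashing a canonical JSON form of the fields."""
    payload = json.dumps(
        {
            "goal": goal,
            "action": action,
            "result": result,
            "reflection": reflection,
            "timestamp": timestamp.isoformat(),
        },
        sort_keys=True,  # stable key order, so equal content gives equal bytes
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


when = datetime(2025, 1, 1, tzinfo=timezone.utc)
id_a = content_hash_id("Process payment", "Called gateway", "Failed", "Add retry", when)
id_b = content_hash_id("Process payment", "Called gateway", "Failed", "Add retry", when)
id_c = content_hash_id("Process payment", "Called gateway", "Succeeded", "None", when)

print(id_a == id_b)  # True: identical content yields the same ID
print(id_a == id_c)  # False: any change in content yields a different ID
```

A content-derived ID fits an immutable ledger well: since an episode can never change after it is written, its ID stays valid for its whole lifetime.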
The Store
The storage layer consists of one abstract interface and two concrete adapters:
- VectorStoreAdapter (Abstract Interface)
  - Defines the contract for all storage implementations
  - Methods: store(), retrieve(), get_by_id()
- FileAdapter (Simple JSONL)
  - Local file-based storage
  - No external dependencies
  - Well suited to logging and simple use cases
- ChromaDBAdapter (Vector Search)
  - Requires the optional chromadb dependency
  - Supports embedding-based similarity search
  - Ideal for semantic retrieval
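The relationship between an abstract store interface and its concrete adapters can be sketched with Python's abc module. The classes below are illustrative only: `StoreContractSketch` and `InMemoryStore` are hypothetical, the method signatures are simplified guesses, and episodes are plain dictionaries instead of Episode objects.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class StoreContractSketch(ABC):
    """Hypothetical contract exposing the three methods named above."""

    @abstractmethod
    def store(self, episode: Dict[str, Any]) -> str:
        """Persist an episode and return its ID."""

    @abstractmethod
    def retrieve(self, limit: int = 10) -> List[Dict[str, Any]]:
        """Return up to `limit` episodes."""

    @abstractmethod
    def get_by_id(self, episode_id: str) -> Optional[Dict[str, Any]]:
        """Return the episode with this ID, or None if it is unknown."""


class InMemoryStore(StoreContractSketch):
    """Toy backend that keeps episodes in a list; data is lost on exit."""

    def __init__(self) -> None:
        self._episodes: List[Dict[str, Any]] = []

    def store(self, episode: Dict[str, Any]) -> str:
        episode_id = f"ep-{len(self._episodes)}"
        self._episodes.append({**episode, "episode_id": episode_id})
        return episode_id

    def retrieve(self, limit: int = 10) -> List[Dict[str, Any]]:
        return list(reversed(self._episodes))[:limit]  # newest first (assumption)

    def get_by_id(self, episode_id: str) -> Optional[Dict[str, Any]]:
        for episode in self._episodes:
            if episode["episode_id"] == episode_id:
                return episode
        return None


memory_store = InMemoryStore()
first_id = memory_store.store({"goal": "A"})
second_id = memory_store.store({"goal": "B"})
print(memory_store.get_by_id(first_id)["goal"])  # A
```

Code written against the contract can switch backends without changes, which is the point of placing an abstract interface in front of the file and vector adapters.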
The Indexer
Utilities for tagging and indexing episodes:
- extract_tags(): Extract searchable tags from text
- generate_episode_tags(): Auto-generate tags from episodes
- enrich_metadata(): Add indexing metadata
- create_search_text(): Generate text for embeddings
Design Principles
✅ What emk Does
- Stores episodes immutably
- Provides simple retrieval interfaces
- Indexes episodes for efficient search
- Maintains historical records
❌ What emk Does NOT Do
- Does not summarize memories (that's for agents or caas)
- Does not overwrite data (append-only)
- Does not depend on other agent systems
- Does not make "smart" decisions
Dependency Rules
Allowed:
- numpy (for vectors)
- pydantic (for schemas)
- chromadb (optional extra)
Strictly Forbidden:
- caas (caas depends on emk, not the other way around)
- agent-control-plane (the memory store is agnostic)
- Any "smart" processing libraries
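Because chromadb is an optional extra, calling code may want to check whether it is installed before choosing a backend. The following sketch shows one generic way to do that with the standard library. `is_available` and `choose_backend` are hypothetical helpers, not part of emk, and the fallback policy is an assumption.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


def choose_backend() -> str:
    """Prefer the vector backend when chromadb is installed, else fall back to files."""
    return "chromadb" if is_available("chromadb") else "file"


print(choose_backend())
```

Checking with find_spec avoids importing the package just to test for its presence.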
Development
# Clone the repository
git clone https://github.com/imran-siddique/emk.git
cd emk
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=emk --cov-report=html
# Format code
black emk tests
# Lint code
ruff check emk tests
Testing
# Run all tests
pytest
# Run specific test file
pytest tests/test_schema.py
# Run with verbose output
pytest -v
# Run with coverage
pytest --cov=emk
Use Cases
1. Agent Learning History
from emk import Episode, FileAdapter
store = FileAdapter("agent_learning.jsonl")
episode = Episode(
    goal="Solve user query about weather",
    action="Called weather API",
    result="Provided accurate forecast",
    reflection="API response format changed - need to update parser"
)
store.store(episode)
2. Debugging Agent Behavior
# Store all agent actions for debugging
episode = Episode(
    goal="Process payment",
    action="Called payment gateway",
    result="Payment failed",
    reflection="Gateway timeout - need retry logic",
    metadata={"error_code": "TIMEOUT", "amount": 50.00}
)
store.store(episode)
# Later, retrieve failed episodes
failed = store.retrieve(filters={"error_code": "TIMEOUT"})
3. Building Agent Memory
# Store experiences for later retrieval by reasoning systems
from emk import Episode, ChromaDBAdapter
store = ChromaDBAdapter("agent_memory")
# When agent learns something
episode = Episode(
    goal="Understand user preference",
    action="Analyzed past interactions",
    result="User prefers concise responses",
    reflection="Should adjust response length in future"
)
store.store(episode)
# Later, retrieve relevant experiences
# (current_context_embedding is the embedding of the agent's current context, computed elsewhere)
relevant = store.retrieve(query_embedding=current_context_embedding)
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
Roadmap
- Additional storage backends (SQLite, PostgreSQL)
- Advanced filtering and query capabilities
- Batch operations for efficiency
- Export/import utilities
- Performance optimizations for large datasets
Links
- Repository: https://github.com/imran-siddique/emk
- PyPI: https://pypi.org/project/emk/
- Issues: https://github.com/imran-siddique/emk/issues
File details
Details for the file emk-0.1.0.tar.gz.
File metadata
- Download URL: emk-0.1.0.tar.gz
- Upload date:
- Size: 35.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2795d850a33418aa9b179b43eef8de922c436646a67079cb75718ce3836539e3 |
| MD5 | a1a6907f306402e9fff795950b59c69b |
| BLAKE2b-256 | ef84a348a711c56758786161dedd9932deb61b679998cf04acd3ae8cab9bd84e |
Provenance
The following attestation bundles were made for emk-0.1.0.tar.gz:
Publisher: publish.yml on imran-siddique/emk
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: emk-0.1.0.tar.gz
- Subject digest: 2795d850a33418aa9b179b43eef8de922c436646a67079cb75718ce3836539e3
- Sigstore transparency entry: 848796848
- Sigstore integration time:
- Permalink: imran-siddique/emk@7a84d375b46fa84db7637cec6ce9080425ab2075
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/imran-siddique
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7a84d375b46fa84db7637cec6ce9080425ab2075
- Trigger Event: release
File details
Details for the file emk-0.1.0-py3-none-any.whl.
File metadata
- Download URL: emk-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe3b2e5972da9904827dbdf431862e3c416d71ffefc6b471d611e4620acd7803 |
| MD5 | b9a6b49f13fa73c3732ecf1951aabf9b |
| BLAKE2b-256 | 8ea4d31016b6f8134dcd6131951960f6e628293d071d57cefbcae934188f9206 |
Provenance
The following attestation bundles were made for emk-0.1.0-py3-none-any.whl:
Publisher: publish.yml on imran-siddique/emk
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: emk-0.1.0-py3-none-any.whl
- Subject digest: fe3b2e5972da9904827dbdf431862e3c416d71ffefc6b471d611e4620acd7803
- Sigstore transparency entry: 848796904
- Sigstore integration time:
- Permalink: imran-siddique/emk@7a84d375b46fa84db7637cec6ce9080425ab2075
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/imran-siddique
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7a84d375b46fa84db7637cec6ce9080425ab2075
- Trigger Event: release