Episodic Memory Kernel - An immutable, append-only ledger of agent experiences for AI systems

emk - Episodic Memory Kernel

"The Hard Drive" - An immutable, append-only ledger of agent experiences.

Overview

emk (Episodic Memory Kernel) is a Layer 1 primitive that stores agent experiences as structured episodes. Unlike context-construction systems such as caas, emk is the permanent record: a simple, immutable storage layer in which every episode follows the pattern Goal → Action → Result → Reflection.

Core Value Proposition

  • Immutable Storage: Episodes are append-only; no modifications allowed
  • Structured Memory: Each episode captures the complete experience cycle
  • Flexible Retrieval: Support for both simple file-based and vector-based retrieval
  • Minimal Dependencies: Core functionality requires only pydantic and numpy
  • Not "Smart": Does not summarize or interpret - just stores and retrieves
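To make the append-only idea concrete, here is a minimal standard-library sketch of a JSONL ledger that only ever adds lines. `AppendOnlyLog` and `demo_append_only` are hypothetical names used for illustration; this is not emk's `FileAdapter`, whose internals are not shown here.

```python
import json
import os
import tempfile


class AppendOnlyLog:
    """Hypothetical append-only JSONL ledger (illustration only, not emk code)."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Mode "a" writes at the end of the file; existing lines are never rewritten.
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, sort_keys=True) + "\n")

    def read_all(self):
        # Return every stored record in the order it was appended.
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]


def demo_append_only():
    # Write two records to a temporary ledger and read them back.
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendOnlyLog(os.path.join(tmp, "ledger.jsonl"))
        log.append({"goal": "g1", "result": "ok"})
        log.append({"goal": "g2", "result": "failed"})
        return log.read_all()
```

The class deliberately exposes no update or delete method, which is the property the "Immutable Storage" bullet describes.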

Installation

# Basic installation
pip install emk

# With ChromaDB support for vector search
pip install emk[chromadb]

# Development installation
pip install emk[dev]

Quick Start

Basic Usage with FileAdapter

from emk import Episode, FileAdapter

# Create a memory store
store = FileAdapter("agent_memories.jsonl")

# Create and store an episode
episode = Episode(
    goal="Retrieve user preferences",
    action="Query database for user_id=123",
    result="Successfully retrieved preferences",
    reflection="Database query was efficient and returned expected data",
    metadata={"user_id": "123", "query_time_ms": 45}
)

# Store the episode
episode_id = store.store(episode)
print(f"Stored episode: {episode_id}")

# Retrieve recent episodes
recent = store.retrieve(limit=10)
for ep in recent:
    print(f"{ep.goal} -> {ep.result}")

# Retrieve by ID
specific = store.get_by_id(episode_id)
print(f"Retrieved: {specific.goal}")

# Filter by metadata
user_episodes = store.retrieve(filters={"user_id": "123"})

Using ChromaDB for Vector Search

from emk import Episode, ChromaDBAdapter
import numpy as np

# Create a vector store
store = ChromaDBAdapter(
    collection_name="agent_episodes",
    persist_directory="./chroma_data"
)

# Create an episode (the embedding is supplied separately when storing)
episode = Episode(
    goal="Learn Python syntax",
    action="Read Python documentation",
    result="Understood basic syntax",
    reflection="Need more practice with decorators"
)

# Create a simple embedding (in practice, use a real embedding model)
embedding = np.random.rand(384)

# Store with embedding
store.store(episode, embedding=embedding)

# Query by similarity
query_embedding = np.random.rand(384)
similar = store.retrieve(query_embedding=query_embedding, limit=5)
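For intuition about what a similarity query does, the following sketch ranks stored vectors against a query vector by cosine similarity using only the standard library. Cosine similarity is one common choice; which metric `ChromaDBAdapter` actually uses is not stated above, and `cosine_similarity` and `top_k_similar` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def top_k_similar(query, stored, k=5):
    # `stored` maps an episode ID to its embedding vector.
    scored = [(cosine_similarity(query, vec), eid) for eid, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [eid for _, eid in scored[:k]]
```

A real vector store replaces this linear scan with an index, but the ranking idea is the same.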

Using the Indexer

from emk import Episode, Indexer

episode = Episode(
    goal="Implement user authentication",
    action="Created JWT-based auth system",
    result="Users can now login securely",
    reflection="Should add rate limiting next"
)

# Generate searchable tags
tags = Indexer.generate_episode_tags(episode)
print(f"Tags: {tags}")

# Create search text for embedding
search_text = Indexer.create_search_text(episode)
print(f"Search text: {search_text}")

# Enrich metadata with indexing info
enriched = Indexer.enrich_metadata(episode, auto_tags=True)
print(f"Enriched metadata: {enriched}")

Architecture

The Schema

The Episode class defines the core data structure:

from datetime import datetime
from typing import Any, Dict

class Episode:
    goal: str                    # The agent's intended objective
    action: str                  # The action taken
    result: str                  # The outcome
    reflection: str              # Analysis or learning
    timestamp: datetime          # Auto-generated
    metadata: Dict[str, Any]     # Additional context
    episode_id: str              # Unique hash-based ID
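As an illustration of how an immutable record with a hash-based ID could be built, here is a sketch using a frozen dataclass and SHA-256. The exact hashing scheme emk uses is not described above, so the choice of fields and hash function is an assumption, and `EpisodeSketch` is a hypothetical name.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)
class EpisodeSketch:
    """Hypothetical stand-in for Episode; frozen=True blocks attribute reassignment."""

    goal: str
    action: str
    result: str
    reflection: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: Dict[str, Any] = field(default_factory=dict)

    @property
    def episode_id(self) -> str:
        # Assumption: the ID is a SHA-256 digest of the core fields plus the timestamp.
        payload = json.dumps(
            {
                "goal": self.goal,
                "action": self.action,
                "result": self.result,
                "reflection": self.reflection,
                "timestamp": self.timestamp.isoformat(),
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Deriving the ID from content means two records with identical fields and timestamp share an ID, which makes accidental duplicates easy to detect.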

The Store

The storage layer consists of one abstract interface and two concrete adapters:

  1. VectorStoreAdapter (Abstract Interface)

    • Defines the contract for all storage implementations
    • Methods: store(), retrieve(), get_by_id()
  2. FileAdapter (Simple JSONL)

    • Local file-based storage
    • No external dependencies
    • Perfect for logging and simple use cases
  3. ChromaDBAdapter (Vector Search)

    • Requires optional chromadb dependency
    • Supports embedding-based similarity search
    • Ideal for semantic retrieval
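The adapter contract can be pictured as an abstract base class with the three methods named above. In this sketch only the method names `store()`, `retrieve()`, and `get_by_id()` come from the list; the signatures, the dict-based episodes, the newest-first ordering, and the class names `StoreInterfaceSketch` and `InMemoryStoreSketch` are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class StoreInterfaceSketch(ABC):
    """Hypothetical contract mirroring the three methods listed above."""

    @abstractmethod
    def store(self, episode: Dict[str, Any]) -> str: ...

    @abstractmethod
    def retrieve(self, filters: Optional[Dict[str, Any]] = None,
                 limit: int = 10) -> List[Dict[str, Any]]: ...

    @abstractmethod
    def get_by_id(self, episode_id: str) -> Optional[Dict[str, Any]]: ...


class InMemoryStoreSketch(StoreInterfaceSketch):
    """Toy adapter that keeps episodes in a list, in insertion order."""

    def __init__(self):
        self._episodes = []

    def store(self, episode):
        # Append a copy; earlier entries are never modified.
        self._episodes.append(dict(episode))
        return episode["episode_id"]

    def retrieve(self, filters=None, limit=10):
        filters = filters or {}
        # Keep episodes whose metadata matches every filter key exactly.
        matches = [
            ep for ep in self._episodes
            if all(ep.get("metadata", {}).get(k) == v for k, v in filters.items())
        ]
        # Assumption: most recently stored episodes come first.
        return list(reversed(matches))[:limit]

    def get_by_id(self, episode_id):
        for ep in self._episodes:
            if ep["episode_id"] == episode_id:
                return ep
        return None
```

Because callers depend only on the base class, a file-backed or vector-backed adapter can be swapped in without changing agent code.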

The Indexer

Utilities for tagging and indexing episodes:

  • extract_tags(): Extract searchable tags from text
  • generate_episode_tags(): Auto-generate tags from episodes
  • enrich_metadata(): Add indexing metadata
  • create_search_text(): Generate text for embeddings
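To show the kind of processing these utilities imply, here is a deliberately simple sketch of tag extraction and search-text creation. How emk's `Indexer` actually tokenizes or filters text is not documented above; the stopword list, the minimum length, and the function names `extract_tags_sketch` and `create_search_text_sketch` are assumptions.

```python
import re

# Assumed stopword list, kept small for illustration.
STOPWORDS = {"a", "an", "the", "and", "or", "to", "for", "of", "in", "on", "with", "is", "was"}


def extract_tags_sketch(text, min_length=3):
    # Lowercase, split on non-alphanumeric characters, drop stopwords and short
    # words, and keep the first occurrence of each remaining word.
    tags = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if len(word) >= min_length and word not in STOPWORDS and word not in tags:
            tags.append(word)
    return tags


def create_search_text_sketch(goal, action, result, reflection):
    # Join the four episode fields into one string suitable for embedding.
    return " ".join([goal, action, result, reflection])
```

Keeping this step purely mechanical is consistent with the stated goal that emk stores and retrieves without interpreting.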

Design Principles

✅ What emk Does

  • Stores episodes immutably
  • Provides simple retrieval interfaces
  • Indexes episodes for efficient search
  • Maintains historical records

❌ What emk Does NOT Do

  • Does not summarize memories (that's for agents or caas)
  • Does not overwrite data (append-only)
  • Does not depend on other agent systems
  • Does not make "smart" decisions

Dependency Rules

Allowed:

  • numpy (for vectors)
  • pydantic (for schemas)
  • chromadb (optional extra)

Strictly Forbidden:

  • caas (caas depends on emk, not the other way)
  • agent-control-plane (memory store is agnostic)
  • Any "smart" processing libraries
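One way such a rule could be enforced in a test suite is to scan source files for forbidden imports. The sketch below does this with the standard `ast` module; `forbidden_imports` is a hypothetical helper, and the import name `agent_control_plane` is an assumed module name for the agent-control-plane package.

```python
import ast

# Assumed top-level import names for the forbidden packages.
FORBIDDEN = {"caas", "agent_control_plane"}


def forbidden_imports(source, forbidden=FORBIDDEN):
    # Parse the source and collect top-level package names from import statements.
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # Relative imports have no module name.
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in forbidden:
                found.add(top)
    return sorted(found)
```

A test could call this on every file in the package and fail if the returned list is non-empty.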

Development

# Clone the repository
git clone https://github.com/imran-siddique/emk.git
cd emk

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=emk --cov-report=html

# Format code
black emk tests

# Lint code
ruff check emk tests

Testing

# Run all tests
pytest

# Run specific test file
pytest tests/test_schema.py

# Run with verbose output
pytest -v

# Run with coverage
pytest --cov=emk

Use Cases

1. Agent Learning History

from emk import Episode, FileAdapter

store = FileAdapter("agent_learning.jsonl")

episode = Episode(
    goal="Solve user query about weather",
    action="Called weather API",
    result="Provided accurate forecast",
    reflection="API response format changed - need to update parser"
)
store.store(episode)

2. Debugging Agent Behavior

# Store all agent actions for debugging.
# Reuses the FileAdapter `store` created in the previous example.
episode = Episode(
    goal="Process payment",
    action="Called payment gateway",
    result="Payment failed",
    reflection="Gateway timeout - need retry logic",
    metadata={"error_code": "TIMEOUT", "amount": 50.00}
)
store.store(episode)

# Later, retrieve failed episodes
failed = store.retrieve(filters={"error_code": "TIMEOUT"})

3. Building Agent Memory

from emk import Episode, ChromaDBAdapter

# Store experiences for later retrieval by reasoning systems
store = ChromaDBAdapter(collection_name="agent_memory")

# When agent learns something
episode = Episode(
    goal="Understand user preference",
    action="Analyzed past interactions",
    result="User prefers concise responses",
    reflection="Should adjust response length in future"
)
store.store(episode)

# Later, retrieve relevant experiences.
# current_context_embedding is a placeholder: compute it with your embedding model.
relevant = store.retrieve(query_embedding=current_context_embedding)

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

Roadmap

  • Additional storage backends (SQLite, PostgreSQL)
  • Advanced filtering and query capabilities
  • Batch operations for efficiency
  • Export/import utilities
  • Performance optimizations for large datasets
