Production-ready memory system for LLM agents - NO MOCKS, real semantic search, clear LLM vs embedding provider separation
Project description
AbstractMemory
Intelligent memory system for LLM agents with two-tier architecture
AbstractMemory provides efficient, purpose-built memory solutions for different types of LLM agents - from simple task-specific tools to sophisticated autonomous agents with persistent, grounded memory.
🎯 Project Goals
AbstractMemory is part of the AbstractLLM ecosystem refactoring, designed to power both simple and complex AI agents:
- Simple agents (ReAct, task tools) get lightweight, efficient memory
- Autonomous agents get sophisticated temporal memory with user tracking
- No over-engineering - memory complexity matches agent purpose
🏗️ Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│                    AbstractLLM Ecosystem                    │
├──────────────────┬──────────────────┬───────────────────────┤
│   AbstractCore   │  AbstractMemory  │     AbstractAgent     │
│                  │                  │                       │
│ • LLM Providers  │ • Simple Memory  │ • ReAct Agents        │
│ • Sessions       │ • Complex Memory │ • Autonomous Agents   │
│ • Tools          │ • Temporal KG    │ • Multi-user Agents   │
└──────────────────┴──────────────────┴───────────────────────┘
🧠 Two-Tier Memory Strategy
Tier 1: Simple Memory (Task Agents)
Perfect for focused, single-purpose agents:
from abstractmemory import create_memory
# ReAct agent memory
scratchpad = create_memory("scratchpad", max_entries=50)
scratchpad.add_thought("User wants to learn Python")
scratchpad.add_action("search", {"query": "Python tutorials"})
scratchpad.add_observation("Found great tutorials")
# Simple chatbot memory
buffer = create_memory("buffer", max_messages=100)
buffer.add_message("user", "Hello!")
buffer.add_message("assistant", "Hi there!")
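The capacity limit on a buffer can be pictured with a few lines of plain Python. This is a toy stand-in for illustration, not the real BufferMemory implementation:

```python
from collections import deque

class TinyBuffer:
    """Illustrative stand-in for BufferMemory: oldest messages are
    evicted once max_messages is reached."""
    def __init__(self, max_messages=100):
        self.messages = deque(maxlen=max_messages)

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})

buf = TinyBuffer(max_messages=2)
buf.add_message("user", "Hello!")
buf.add_message("assistant", "Hi there!")
buf.add_message("user", "Another message")  # evicts "Hello!"
print(len(buf.messages))        # 2
print(buf.messages[0]["role"])  # assistant
```

The real class adds conversation-specific behavior on top, but the eviction policy is the essential idea: memory cost stays bounded no matter how long the conversation runs.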
Tier 2: Complex Memory (Autonomous Agents)
For sophisticated agents with persistence and learning:
# Autonomous agent with full memory capabilities
memory = create_memory("grounded", working_capacity=10, enable_kg=True)
# Multi-user context
memory.set_current_user("alice", relationship="owner")
memory.add_interaction("I love Python", "Python is excellent!")
memory.learn_about_user("Python developer")
# Get personalized context
context = memory.get_full_context("programming", user_id="alice")
🔧 Quick Start
Installation
pip install abstractmemory
# For real LLM integration tests
pip install abstractmemory[llm]
# For LanceDB storage (optional)
pip install lancedb
Basic Usage
from abstractmemory import create_memory
# 1. Choose memory type based on agent purpose
memory = create_memory("scratchpad") # Simple task agent
memory = create_memory("buffer") # Simple chatbot
memory = create_memory("grounded") # Autonomous agent
# 2. Use memory in your agent
if agent_type == "react":
    memory.add_thought("Planning the solution...")
    memory.add_action("execute", {"command": "analyze"})
    memory.add_observation("Analysis complete")
elif agent_type == "autonomous":
    memory.set_current_user("user123")
    memory.add_interaction(user_input, agent_response)
    context = memory.get_full_context(query)
🗄️ Persistent Storage Options
AbstractMemory now supports sophisticated storage for observable, searchable AI memory:
Observable Markdown Storage
Perfect for development, debugging, and transparency:
# Human-readable, version-controllable AI memory
memory = create_memory(
    "grounded",
    storage_backend="markdown",
    storage_path="./memory"
)
# Generates organized structure:
# memory/
# ├── verbatim/alice/2025/09/24/10-30-45_python_int_abc123.md
# ├── experiential/2025/09/24/10-31-02_learning_note_def456.md
# ├── links/2025/09/24/int_abc123_to_note_def456.json
# └── index.json
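The timestamped layout shown above can be reproduced with a small path builder. This is a hypothetical helper written only to illustrate the naming scheme; `verbatim_path` is not part of the library's API:

```python
from datetime import datetime
from pathlib import Path
import secrets

def verbatim_path(root, user_id, topic, now=None):
    """Build a path shaped like
    <root>/verbatim/<user>/<YYYY>/<MM>/<DD>/<HH-MM-SS>_<topic>_int_<id>.md"""
    now = now or datetime.now()
    interaction_id = secrets.token_hex(3)  # short id, e.g. 'abc123'
    name = f"{now:%H-%M-%S}_{topic}_int_{interaction_id}.md"
    return Path(root, "verbatim", user_id,
                f"{now:%Y}", f"{now:%m}", f"{now:%d}", name)

p = verbatim_path("memory", "alice", "python",
                  datetime(2025, 9, 24, 10, 30, 45))
print(p)  # e.g. memory/verbatim/alice/2025/09/24/10-30-45_python_int_3fa2b1.md
```

Because each file sits under a date-partitioned directory and carries its own timestamp and id, the store stays greppable, diffable, and safe to put under version control.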
Powerful Vector Search
High-performance search with default optimized embeddings:
# Uses default all-MiniLM-L6-v2 model (recommended)
memory = create_memory(
    "grounded",
    storage_backend="lancedb",
    storage_uri="./memory.db"
    # embedding_provider automatically configured with all-MiniLM-L6-v2
)
# Or with custom embedding model
from abstractmemory.embeddings.sentence_transformer_provider import create_sentence_transformer_provider
custom_provider = create_sentence_transformer_provider("bge-base-en-v1.5")
memory = create_memory(
    "grounded",
    storage_backend="lancedb",
    storage_uri="./memory.db",
    embedding_provider=custom_provider
)
# Semantic search across stored interactions
results = memory.search_stored_interactions("machine learning concepts")
🎯 Default Embedding Model: AbstractMemory now uses all-MiniLM-L6-v2 as the default embedding model, providing:
- Superior accuracy (best semantic similarity performance)
- Maximum efficiency (22M parameters, 384D embeddings)
- 50% storage savings compared to larger models
- Perfect retrieval performance (100% P@5, R@5, F1 scores)
See embedding comparison report for detailed benchmarks.
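The 50% storage figure follows directly from vector size: a float32 embedding costs 4 bytes per dimension, so halving the dimensionality halves the per-vector storage. A quick check of the arithmetic:

```python
# float32 vectors: bytes per embedding = dimensions * 4
small = 384 * 4   # all-MiniLM-L6-v2 (384D)
large = 768 * 4   # a 768D model such as EmbeddingGemma
print(small, large)       # 1536 3072
print(1 - small / large)  # 0.5 -> the "50% storage savings"

# at scale the difference adds up
n_interactions = 1_000_000
saved_mb = (large - small) * n_interactions / 1024 / 1024
print(f"{saved_mb:.0f} MB saved per million interactions")
```

Smaller vectors also mean proportionally faster similarity scans, which is part of why the 384D default performs well in practice.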
Dual Storage - Best of Both Worlds
Complete observability with powerful search:
# Dual storage: markdown (observable) + LanceDB (searchable)
memory = create_memory(
    "grounded",
    storage_backend="dual",
    storage_path="./memory",
    storage_uri="./memory.db",
    embedding_provider=provider
)
# Every interaction stored in both formats
# - Markdown files for complete transparency
# - Vector database for semantic search
📚 Documentation
👉 START HERE: Complete Documentation Guide
Core Guides
- 🚀 Quick Start - Get running in 5 minutes
- 🔍 Semantic Search - Vector embeddings and similarity search
- 🧠 Memory Types - ScratchpadMemory, BufferMemory, GroundedMemory
- 📊 Performance Guide - Embedding timing and optimization
Advanced Topics
- 🏗️ Architecture - System design and two-tier strategy
- 💾 Storage Systems - Markdown + LanceDB dual storage
- 🎯 Usage Patterns - Real-world examples and best practices
- 🔗 Integration Guide - AbstractLLM ecosystem integration
- 📖 API Reference - Complete method documentation
🔬 Key Features
✅ Purpose-Built Memory Types
- ScratchpadMemory: ReAct thought-action-observation cycles for task agents
- BufferMemory: Simple conversation history with capacity limits
- GroundedMemory: Four-tier architecture with semantic search and temporal context
✅ State-of-the-Art Research Integration
- MemGPT/Letta Pattern: Self-editing core memory
- Temporal Grounding: WHO (relational) + WHEN (temporal) context
- Zep/Graphiti Architecture: Bi-temporal knowledge graphs
✅ Four-Tier Memory Architecture (Autonomous Agents)
Core Memory ←→ Semantic Memory ←→ Working Memory ←→ Episodic Memory
 (Identity)     (Validated Facts)   (Recent Context)   (Event Archive)
✅ Learning Capabilities
- Failure/Success Tracking: Learn from experience
- User Personalization: Multi-user context separation
- Fact Validation: Confidence-based knowledge consolidation
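Confidence-based consolidation can be sketched as a counter that promotes a fact into validated knowledge once it has been observed often enough. This is a toy model of the idea, not GroundedMemory's actual implementation:

```python
from collections import Counter

class FactValidator:
    """Hypothetical sketch: a fact moves from 'working' to 'semantic'
    memory after it has been observed a threshold number of times."""
    def __init__(self, threshold=3):
        self.observations = Counter()  # working memory: unconfirmed facts
        self.validated = set()         # semantic memory: consolidated facts
        self.threshold = threshold

    def observe(self, fact):
        self.observations[fact] += 1
        if self.observations[fact] >= self.threshold:
            self.validated.add(fact)

v = FactValidator(threshold=2)
v.observe("alice is a Python developer")
print("alice is a Python developer" in v.validated)  # False (seen once)
v.observe("alice is a Python developer")
print("alice is a Python developer" in v.validated)  # True (consolidated)
```

The point of the threshold is robustness: a fact mentioned once might be a misunderstanding, while a fact repeated across interactions is worth committing to long-term memory.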
✅ Dual Storage Architecture
- 📝 Markdown Storage: Human-readable, observable AI memory evolution
- 🔍 LanceDB Storage: Vector search with SQL capabilities via AbstractCore
- 🔄 Dual Mode: Best of both worlds - transparency + powerful search
- 🧠 AI Reflections: Automatic experiential notes about interactions
- 🔗 Bidirectional Links: Connect interactions to AI insights
- 🔍 Search Capabilities: Text-based and semantic similarity search
✅ Semantic Search with AbstractCore
- 🎯 Real Embeddings: Uses AbstractCore's EmbeddingManager with Google's EmbeddingGemma (768D)
- ⚡ Immediate Indexing: Embeddings generated instantly during add_interaction() (~36ms)
- 🔍 Vector Similarity: True semantic search finds contextually relevant content
- 🗄️ Dual Storage: Observable markdown files + searchable LanceDB vectors
- 🎯 Production Ready: Sub-second search, proven with 200+ real-implementation tests
🧪 Testing & Validation
AbstractMemory includes 200+ comprehensive tests using ONLY real implementations:
# Run all tests (NO MOCKS - only real implementations)
python -m pytest tests/ -v
# Run specific test suites
python -m pytest tests/simple/ -v # Simple memory types
python -m pytest tests/components/ -v # Memory components
python -m pytest tests/storage/ -v # Storage system tests
python -m pytest tests/integration/ -v # Full system integration
# Test with real LLM providers (requires AbstractCore)
python -m pytest tests/integration/test_llm_real_usage.py -v
# Test comprehensive dual storage with real embeddings
python -m pytest tests/storage/test_dual_storage_comprehensive.py -v
IMPORTANT: All tests use real implementations:
- Real embedding providers (AbstractCore EmbeddingManager)
- Real LLM providers (Anthropic, OpenAI, Ollama via AbstractCore)
- Real memory components and storage systems
- NO MOCKS anywhere in the codebase
🚀 Quick Start
Installation
# Install with semantic search capabilities (includes sentence-transformers for default all-MiniLM-L6-v2 model)
pip install abstractmemory[embeddings]
# Or install everything
pip install abstractmemory[all]
# Basic memory only (no semantic search)
pip install abstractmemory
🔄 Upgrading from v0.1.0?
Version 0.2.0 adds semantic search! See Migration Guide for:
- New AbstractCore dependency (pip install abstractcore>=2.1.0)
- LanceDB schema changes (recreate .db files)
- New embedding_provider parameter
⚠️ Critical: LLM vs Embedding Provider Separation
Understanding the difference between LLM and Embedding providers:
- LLM Providers (text generation): change freely between Anthropic, OpenAI, Ollama, etc.
- Embedding Providers (semantic search): must remain consistent within a storage space
For semantic search consistency:
- ✅ Choose ONE embedding model and stick with it per storage space
- ✅ You can customize which embedding model to use (AbstractCore, OpenAI, Ollama, etc.)
- ❌ Don't change embedding models mid-project - it breaks vector search
- 🚨 AbstractMemory automatically warns when an embedding-model change is detected
Example of correct separation:
# LLM for text generation (can change anytime)
llm = create_llm("anthropic") # or "openai", "ollama", etc.
# Dedicated embedding provider (must stay consistent)
embedder = EmbeddingManager() # AbstractCore embeddings
memory = create_memory("grounded", embedding_provider=embedder) # NOT llm!
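One plausible way such a change warning could work is to record the model name alongside the storage space and compare it on every open. This is a hypothetical sketch of the idea; `check_embedding_model` is not the library's actual mechanism:

```python
import json
import warnings
from pathlib import Path

def check_embedding_model(storage_dir, model_name):
    """Record which embedding model a storage space was built with;
    warn if a different model is used later."""
    marker = Path(storage_dir) / "embedding_model.json"
    if marker.exists():
        recorded = json.loads(marker.read_text())["model"]
        if recorded != model_name:
            warnings.warn(
                f"Embedding model changed from {recorded!r} to {model_name!r}; "
                "existing vectors are not comparable to newly generated ones."
            )
    else:
        marker.write_text(json.dumps({"model": model_name}))
```

The underlying reason for the warning: vectors from different models live in different embedding spaces, so cosine similarity between a MiniLM vector and a BGE vector is meaningless even when the dimensions happen to match.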
Basic Usage
from abstractmemory import create_memory
# 1. Create memory with default all-MiniLM-L6-v2 embeddings (recommended)
memory = create_memory(
    "grounded",
    storage_backend="dual",          # Markdown + LanceDB
    storage_path="./memory_files",   # Observable files
    storage_uri="./memory.db"        # Vector search (auto-configured with all-MiniLM-L6-v2)
)
# 2. Add interactions (embeddings generated automatically!)
memory.set_current_user("alice")
memory.add_interaction(
    "I'm working on machine learning projects",
    "Great! ML has amazing applications in many fields."
)
# ⏳ Takes ~13ms: optimized all-MiniLM-L6-v2 embedding generated and stored
# 3. Semantic search finds contextually relevant content
results = memory.search_stored_interactions("artificial intelligence research")
# ⏳ Finds ML interaction via semantic similarity (not keywords!)
print(f"Found {len(results)} relevant conversations")
# Optional: Use custom embedding model
from abstractmemory.embeddings.sentence_transformer_provider import create_sentence_transformer_provider
custom_provider = create_sentence_transformer_provider("bge-base-en-v1.5")
custom_memory = create_memory(
    "grounded",
    storage_backend="dual",
    storage_path="./memory_files",
    storage_uri="./memory.db",
    embedding_provider=custom_provider
)
🔄 What Happens When You Add Interactions
memory.add_interaction("I love Python", "Great choice!")
# → IMMEDIATE PROCESSING:
# 1. Text combined: "I love Python Great choice!"
# 2. EmbeddingManager.embed() called (36ms)
# 3. 768D vector generated with EmbeddingGemma
# 4. Saved to markdown file: ./memory_files/verbatim/alice/...
# 5. Stored in LanceDB: vector + text + metadata
# 6. Interaction immediately searchable via semantic similarity
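The six steps above can be traced with a self-contained toy pipeline. The hash-based `toy_embed` stands in for the real EmbeddingManager, and the storage layer is reduced to one markdown file plus a plain list acting as the vector table; all names here are illustrative, not library API:

```python
import hashlib
from pathlib import Path

def toy_embed(text, dims=8):
    """Stand-in for a real embedding model: a deterministic
    pseudo-vector derived from a hash of the text."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dims]]

def add_interaction(user_text, agent_text, md_dir, vector_table):
    combined = f"{user_text} {agent_text}"              # 1. combine texts
    vector = toy_embed(combined)                        # 2-3. generate vector
    path = Path(md_dir) / "interaction.md"              # 4. observable markdown copy
    path.write_text(f"## User\n{user_text}\n\n## Agent\n{agent_text}\n")
    vector_table.append({"vector": vector,              # 5. searchable vector row
                         "text": combined})
    return path                                         # 6. immediately queryable
```

The key property the sketch preserves is that both stores are written in the same call: there is no background indexing job, so an interaction is searchable the moment `add_interaction` returns.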
🔗 AbstractLLM Ecosystem Integration
AbstractMemory seamlessly integrates with AbstractCore, maintaining clear separation between LLM and embedding providers:
Critical Architecture: LLM vs Embedding Separation
from abstractllm import create_llm
from abstractllm.embeddings import EmbeddingManager
from abstractmemory import create_memory
# SEPARATE PROVIDERS for different purposes:
# 1. LLM Provider - for TEXT GENERATION (can change freely)
llm_provider = create_llm("anthropic", model="claude-3-5-haiku-latest")
# 2. Embedding Provider - for SEMANTIC SEARCH (must stay consistent)
embedding_provider = EmbeddingManager()
# Create memory with DEDICATED embedding provider
memory = create_memory(
    "grounded",
    enable_kg=True,
    storage_backend="dual",
    storage_path="./memory",
    storage_uri="./memory.db",
    embedding_provider=embedding_provider  # DEDICATED for embeddings
)
# Use in agent reasoning with CLEAR separation
context = memory.get_full_context(query)
response = llm_provider.generate(prompt, system_prompt=context) # LLM for text
memory.add_interaction(query, response.content) # Embeddings handled internally
# Search uses embedding provider for semantic similarity
similar_memories = memory.search_stored_interactions("related concepts")
Key Points:
- LLM Provider: Change freely between Anthropic → OpenAI → Ollama
- Embedding Provider: Must remain consistent within storage space
- Never pass LLM provider as embedding provider
- Always use dedicated embedding provider for semantic search
With AbstractAgent (Future)
from abstractagent import create_agent
from abstractmemory import create_memory
# Autonomous agent with sophisticated memory
memory = create_memory("grounded", working_capacity=20)
agent = create_agent("autonomous", memory=memory, provider=provider)
# Agent automatically uses memory for consistency and personalization
response = agent.execute(task, user_id="alice")
🏗️ Architecture Principles
- No Over-Engineering: Memory complexity matches agent requirements
- Real Implementation Testing: NO MOCKS anywhere - all tests use real implementations
- SOTA Research Foundation: Built on proven patterns (MemGPT, Zep, Graphiti)
- Clean Abstractions: Simple interfaces, powerful implementations
- Performance Optimized: Fast operations for simple agents, scalable for complex ones
📊 Performance Characteristics
- Simple Memory: < 1ms operations, minimal overhead
- Complex Memory: < 100ms context generation, efficient consolidation
- Scalability: Handles thousands of memory items efficiently
- Real LLM Integration: Context + LLM calls complete in seconds
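The sub-millisecond claim for simple memory is easy to sanity-check on your own machine with a micro-benchmark. Here a plain deque stands in for a buffer-style memory; the numbers are machine-dependent:

```python
import time
from collections import deque

buffer = deque(maxlen=1000)

N = 10_000
start = time.perf_counter()
for i in range(N):
    buffer.append({"role": "user", "content": f"message {i}"})
elapsed_ms = (time.perf_counter() - start) * 1000

per_op_us = elapsed_ms / N * 1000
print(f"{per_op_us:.2f} µs per append")  # appends are far under the 1 ms budget
```

A bounded deque append is O(1), so simple memory stays cheap regardless of conversation length; the more expensive operations (embedding, context generation) only appear in the complex tier.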
🤝 Contributing
AbstractMemory is part of the AbstractLLM ecosystem. See CONTRIBUTING.md for development guidelines.
📄 License
[License details]
AbstractMemory: Smart memory for smart agents 🧠✨
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file abstractmemory-0.2.3.tar.gz.
File metadata
- Download URL: abstractmemory-0.2.3.tar.gz
- Upload date:
- Size: 38.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7bf0e5125e2061bdb98caea265e4a29c6b4403de873119292a2d78846123d1af |
| MD5 | 445a4e192ccfb3bc2a37a6b75450e154 |
| BLAKE2b-256 | 74616ede72bcccde34f03ed87059ca82763a1f85ff9d91d7d4ce6c8e8c3ca585 |
File details
Details for the file abstractmemory-0.2.3-py3-none-any.whl.
File metadata
- Download URL: abstractmemory-0.2.3-py3-none-any.whl
- Upload date:
- Size: 45.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ed8bcf80a34160de17de8f3cd2e9746f2992eff1a47d155026fe7c0ac3ddd530 |
| MD5 | 3b66cc991a19db0bf050fc7b361ed418 |
| BLAKE2b-256 | 13f46049eb5edb2f3d3b3c6d62a510287de99a76eb44abb53cc43a815310fd6e |