
Production-ready memory system for LLM agents - NO MOCKS, real semantic search, clear LLM vs embedding provider separation


AbstractMemory

Intelligent memory system for LLM agents with two-tier architecture

AbstractMemory provides efficient, purpose-built memory solutions for different types of LLM agents - from simple task-specific tools to sophisticated autonomous agents with persistent, grounded memory.

🎯 Project Goals

AbstractMemory is part of the AbstractLLM ecosystem refactoring, designed to power both simple and complex AI agents:

  • Simple agents (ReAct, task tools) get lightweight, efficient memory
  • Autonomous agents get sophisticated temporal memory with user tracking
  • No over-engineering - memory complexity matches agent purpose

๐Ÿ—๏ธ Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                    AbstractLLM Ecosystem                    │
├─────────────────┬─────────────────┬─────────────────────────┤
│  AbstractCore   │ AbstractMemory  │    AbstractAgent        │
│                 │                 │                         │
│ • LLM Providers │ • Simple Memory │ • ReAct Agents          │
│ • Sessions      │ • Complex Memory│ • Autonomous Agents     │
│ • Tools         │ • Temporal KG   │ • Multi-user Agents     │
└─────────────────┴─────────────────┴─────────────────────────┘

🧠 Two-Tier Memory Strategy

Tier 1: Simple Memory (Task Agents)

Perfect for focused, single-purpose agents:

from abstractmemory import create_memory

# ReAct agent memory
scratchpad = create_memory("scratchpad", max_entries=50)
scratchpad.add_thought("User wants to learn Python")
scratchpad.add_action("search", {"query": "Python tutorials"})
scratchpad.add_observation("Found great tutorials")

# Simple chatbot memory
buffer = create_memory("buffer", max_messages=100)
buffer.add_message("user", "Hello!")
buffer.add_message("assistant", "Hi there!")
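
The scratchpad above is essentially a bounded trace of typed entries. A minimal sketch of the idea (the `Scratchpad` and `Entry` classes below are illustrative, not AbstractMemory's actual implementation):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Entry:
    kind: str       # "thought", "action", or "observation"
    content: object

class Scratchpad:
    """Bounded ReAct trace: when full, the oldest entries drop off."""

    def __init__(self, max_entries=50):
        self.entries = deque(maxlen=max_entries)

    def add_thought(self, text):
        self.entries.append(Entry("thought", text))

    def add_action(self, name, args):
        self.entries.append(Entry("action", {"name": name, "args": args}))

    def add_observation(self, text):
        self.entries.append(Entry("observation", text))

pad = Scratchpad(max_entries=3)
pad.add_thought("User wants to learn Python")
pad.add_action("search", {"query": "Python tutorials"})
pad.add_observation("Found great tutorials")
pad.add_thought("Summarize the results")  # oldest entry is evicted here
```

The `maxlen` deque keeps memory proportional to the agent's task, which is the point of Tier 1: no consolidation, no persistence, just a fixed-size trace.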

Tier 2: Complex Memory (Autonomous Agents)

For sophisticated agents with persistence and learning:

# Autonomous agent with full memory capabilities
memory = create_memory("grounded", working_capacity=10, enable_kg=True)

# Multi-user context
memory.set_current_user("alice", relationship="owner")
memory.add_interaction("I love Python", "Python is excellent!")
memory.learn_about_user("Python developer")

# Get personalized context
context = memory.get_full_context("programming", user_id="alice")

🔧 Quick Start

Installation

pip install abstractmemory

# For real LLM integration tests
pip install abstractmemory[llm]

# For LanceDB storage (optional)
pip install lancedb

Basic Usage

from abstractmemory import create_memory

# 1. Choose memory type based on agent purpose
memory = create_memory("scratchpad")  # Simple task agent
memory = create_memory("buffer")      # Simple chatbot
memory = create_memory("grounded")    # Autonomous agent

# 2. Use memory in your agent
if agent_type == "react":
    memory.add_thought("Planning the solution...")
    memory.add_action("execute", {"command": "analyze"})
    memory.add_observation("Analysis complete")

elif agent_type == "autonomous":
    memory.set_current_user("user123")
    memory.add_interaction(user_input, agent_response)
    context = memory.get_full_context(query)

๐Ÿ—‚๏ธ Persistent Storage Options

AbstractMemory now supports sophisticated storage for observable, searchable AI memory:

Observable Markdown Storage

Perfect for development, debugging, and transparency:

# Human-readable, version-controllable AI memory
memory = create_memory(
    "grounded",
    storage_backend="markdown",
    storage_path="./memory"
)

# Generates organized structure:
# memory/
# ├── verbatim/alice/2025/09/24/10-30-45_python_int_abc123.md
# ├── experiential/2025/09/24/10-31-02_learning_note_def456.md
# ├── links/2025/09/24/int_abc123_to_note_def456.json
# └── index.json
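
The timestamped paths in the tree above follow a user/date/topic/id pattern. A sketch of how such observable paths could be derived (`verbatim_path` is a hypothetical helper for illustration, not the library's code):

```python
import hashlib
from datetime import datetime
from pathlib import Path

def verbatim_path(root, user_id, topic, text, now=None):
    """Build a <root>/verbatim/<user>/YYYY/MM/DD/HH-MM-SS_<topic>_<id>.md path."""
    now = now or datetime.now()
    short_id = hashlib.sha1(text.encode()).hexdigest()[:6]  # stable short id
    return (Path(root) / "verbatim" / user_id
            / now.strftime("%Y/%m/%d")
            / f"{now.strftime('%H-%M-%S')}_{topic}_{short_id}.md")

p = verbatim_path("memory", "alice", "python_int",
                  "I love Python", datetime(2025, 9, 24, 10, 30, 45))
```

Date-sharded directories keep each folder small, and the human-readable topic plus short hash makes individual interactions easy to find, diff, and version-control.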

Powerful Vector Search

High-performance semantic search with an optimized default embedding model:

# Uses default all-MiniLM-L6-v2 model (recommended)
memory = create_memory(
    "grounded",
    storage_backend="lancedb",
    storage_uri="./memory.db"
    # embedding_provider automatically configured with all-MiniLM-L6-v2
)

# Or with custom embedding model
from abstractmemory.embeddings.sentence_transformer_provider import create_sentence_transformer_provider
custom_provider = create_sentence_transformer_provider("bge-base-en-v1.5")
memory = create_memory(
    "grounded",
    storage_backend="lancedb",
    storage_uri="./memory.db",
    embedding_provider=custom_provider
)

# Semantic search across stored interactions
results = memory.search_stored_interactions("machine learning concepts")

🎯 Default Embedding Model: AbstractMemory now uses all-MiniLM-L6-v2 as the default embedding model, providing:

  • Superior accuracy (best semantic similarity performance)
  • Maximum efficiency (22M parameters, 384D embeddings)
  • 50% storage savings compared to larger models
  • Perfect retrieval performance (100% P@5, R@5, F1 scores)

See embedding comparison report for detailed benchmarks.
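
Semantic search ranks stored interactions by embedding similarity rather than keyword overlap. A toy sketch of the underlying idea, using made-up 3-D vectors (a real provider such as all-MiniLM-L6-v2 returns 384-D vectors):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings" of stored interactions (illustrative values).
store = {
    "I'm working on machine learning projects": [0.9, 0.1, 0.2],
    "My cat sleeps all day":                    [0.1, 0.9, 0.1],
}

# Pretend embedding of the query "artificial intelligence research".
query_vec = [0.8, 0.2, 0.3]

# Rank stored texts by similarity to the query vector.
best = max(store, key=lambda text: cosine(store[text], query_vec))
```

Because the query and the ML interaction point in similar directions in embedding space, the ML text wins even though the two strings share no keywords. This is also why the embedding model must stay fixed per store: vectors from different models live in incompatible spaces.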

Dual Storage - Best of Both Worlds

Complete observability with powerful search:

# Dual storage: markdown (observable) + LanceDB (searchable)
memory = create_memory(
    "grounded",
    storage_backend="dual",
    storage_path="./memory",
    storage_uri="./memory.db",
    embedding_provider=provider
)

# Every interaction stored in both formats
# - Markdown files for complete transparency
# - Vector database for semantic search

📚 Documentation

👉 START HERE: Complete Documentation Guide

Core Guides

Advanced Topics

🔬 Key Features

✅ Purpose-Built Memory Types

  • ScratchpadMemory: ReAct thought-action-observation cycles for task agents
  • BufferMemory: Simple conversation history with capacity limits
  • GroundedMemory: Four-tier architecture with semantic search and temporal context

✅ State-of-the-Art Research Integration

  • MemGPT/Letta Pattern: Self-editing core memory
  • Temporal Grounding: WHO (relational) + WHEN (temporal) context
  • Zep/Graphiti Architecture: Bi-temporal knowledge graphs

✅ Four-Tier Memory Architecture (Autonomous Agents)

Core Memory ──→ Semantic Memory ──→ Working Memory ──→ Episodic Memory
   (Identity)     (Validated Facts)    (Recent Context)   (Event Archive)
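
One way to picture the boundary between the last two tiers: recent context lives in a bounded working tier, and overflow is archived to the episodic tier rather than lost. An illustrative sketch (class and field names are assumptions, not AbstractMemory's API):

```python
from collections import deque

class TieredMemory:
    """Sketch: recent context in working memory; overflow archived to episodic."""

    def __init__(self, working_capacity=10):
        self.working = deque()   # recent context (bounded)
        self.episodic = []       # event archive (grows)
        self.capacity = working_capacity

    def add(self, event):
        self.working.append(event)
        # Consolidate: move oldest items to the episodic archive when full.
        while len(self.working) > self.capacity:
            self.episodic.append(self.working.popleft())

m = TieredMemory(working_capacity=2)
for event in ["greet", "ask about Python", "ask about ML"]:
    m.add(event)
```

The key property is that nothing is discarded: context generation can draw on the small working set for speed, while the archive remains available for retrieval.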

✅ Learning Capabilities

  • Failure/Success Tracking: Learn from experience
  • User Personalization: Multi-user context separation
  • Fact Validation: Confidence-based knowledge consolidation
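
Confidence-based consolidation can be pictured as counting observations until a fact crosses a threshold and is promoted to validated knowledge. An illustrative sketch (`FactStore` is hypothetical; the real validation logic is more involved):

```python
class FactStore:
    """Sketch: a fact becomes validated knowledge once observed often enough."""

    def __init__(self, threshold=3):
        self.pending = {}        # fact -> observation count
        self.validated = set()   # consolidated semantic memory
        self.threshold = threshold

    def observe(self, fact):
        self.pending[fact] = self.pending.get(fact, 0) + 1
        if self.pending[fact] >= self.threshold:
            self.validated.add(fact)  # promote to validated knowledge

facts = FactStore(threshold=2)
facts.observe("alice is a Python developer")
facts.observe("alice is a Python developer")  # second observation: validated
facts.observe("alice likes cats")             # only seen once: still pending
```

Requiring repeated observation before consolidation keeps one-off or mistaken statements out of the agent's long-term knowledge.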

✅ Dual Storage Architecture

  • 📄 Markdown Storage: Human-readable, observable AI memory evolution
  • 🔍 LanceDB Storage: Vector search with SQL capabilities via AbstractCore
  • 🔄 Dual Mode: Best of both worlds - transparency + powerful search
  • 🧠 AI Reflections: Automatic experiential notes about interactions
  • 🔗 Bidirectional Links: Connect interactions to AI insights
  • 📊 Search Capabilities: Text-based and semantic similarity search

✅ Semantic Search with AbstractCore

  • 🎯 Real Embeddings: sentence-transformers all-MiniLM-L6-v2 (384D) by default, or AbstractCore's EmbeddingManager
  • ⚡ Immediate Indexing: Embeddings generated instantly during add_interaction()
  • 🔍 Vector Similarity: True semantic search finds contextually relevant content
  • 🗄️ Dual Storage: Observable markdown files + searchable LanceDB vectors
  • 🎯 Production Ready: Sub-second search, proven with 200+ real implementation tests

🧪 Testing & Validation

AbstractMemory includes 200+ comprehensive tests using ONLY real implementations:

# Run all tests (NO MOCKS - only real implementations)
python -m pytest tests/ -v

# Run specific test suites
python -m pytest tests/simple/ -v          # Simple memory types
python -m pytest tests/components/ -v      # Memory components
python -m pytest tests/storage/ -v         # Storage system tests
python -m pytest tests/integration/ -v     # Full system integration

# Test with real LLM providers (requires AbstractCore)
python -m pytest tests/integration/test_llm_real_usage.py -v

# Test comprehensive dual storage with real embeddings
python -m pytest tests/storage/test_dual_storage_comprehensive.py -v

IMPORTANT: All tests use real implementations:

  • Real embedding providers (AbstractCore EmbeddingManager)
  • Real LLM providers (Anthropic, OpenAI, Ollama via AbstractCore)
  • Real memory components and storage systems
  • NO MOCKS anywhere in the codebase

🚀 Quick Start with Semantic Search

Installation

# Install with semantic search capabilities (includes sentence-transformers for default all-MiniLM-L6-v2 model)
pip install abstractmemory[embeddings]

# Or install everything
pip install abstractmemory[all]

# Basic memory only (no semantic search)
pip install abstractmemory

📋 Upgrading from v0.1.0?

Version 0.2.0 adds semantic search! See Migration Guide for:

  • New AbstractCore dependency (pip install abstractcore>=2.1.0)
  • LanceDB schema changes (recreate .db files)
  • New embedding_provider parameter

โš ๏ธ Critical: LLM vs Embedding Provider Separation

Understanding the difference between LLM and Embedding providers:

  • 🔄 LLM Providers (text generation): Change freely between Anthropic, OpenAI, Ollama, etc.
  • 🔒 Embedding Providers (semantic search): Must remain consistent within a storage space

For semantic search consistency:

  • ✅ Choose ONE embedding model and stick with it per storage space
  • ✅ You can customize which embedding model to use (AbstractCore, OpenAI, Ollama, etc.)
  • ❌ Don't change embedding models mid-project - it breaks vector search
  • 🚨 AbstractMemory automatically warns when an embedding model change is detected

Example of correct separation:

# LLM for text generation (can change anytime)
llm = create_llm("anthropic")  # or "openai", "ollama", etc.

# Dedicated embedding provider (must stay consistent)
embedder = EmbeddingManager()  # AbstractCore embeddings

memory = create_memory("grounded", embedding_provider=embedder)  # NOT llm!
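
The automatic warning on embedding-model changes could work by persisting the model name beside the store and comparing it on every open. A hedged sketch of that idea (`check_embedding_model` is illustrative, not AbstractMemory's actual mechanism):

```python
import json
import tempfile
from pathlib import Path

def check_embedding_model(store_dir, model_name):
    """Record which embedding model a store was built with; warn on mismatch."""
    meta = Path(store_dir) / "embedding_model.json"
    if meta.exists():
        recorded = json.loads(meta.read_text())["model"]
        if recorded != model_name:
            return f"WARNING: store built with {recorded}, opened with {model_name}"
        return "ok"
    meta.write_text(json.dumps({"model": model_name}))  # first use: record it
    return "ok"

store = tempfile.mkdtemp()
first = check_embedding_model(store, "all-MiniLM-L6-v2")    # records the model
again = check_embedding_model(store, "all-MiniLM-L6-v2")    # consistent: fine
changed = check_embedding_model(store, "bge-base-en-v1.5")  # mismatch: warn
```

Persisting the model name alongside the vectors is what makes the mismatch detectable at all; without it, a store silently mixes incompatible embedding spaces.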

Basic Usage

from abstractmemory import create_memory

# 1. Create memory with default all-MiniLM-L6-v2 embeddings (recommended)
memory = create_memory(
    "grounded",
    storage_backend="dual",           # Markdown + LanceDB
    storage_path="./memory_files",    # Observable files
    storage_uri="./memory.db"         # Vector search (auto-configured with all-MiniLM-L6-v2)
)

# 2. Add interactions (embeddings generated automatically!)
memory.set_current_user("alice")
memory.add_interaction(
    "I'm working on machine learning projects",
    "Great! ML has amazing applications in many fields."
)
# ↳ Takes ~13ms: optimized all-MiniLM-L6-v2 embedding generated and stored

# 3. Semantic search finds contextually relevant content
results = memory.search_stored_interactions("artificial intelligence research")
# ↳ Finds ML interaction via semantic similarity (not keywords!)
print(f"Found {len(results)} relevant conversations")

# Optional: Use custom embedding model
from abstractmemory.embeddings.sentence_transformer_provider import create_sentence_transformer_provider
custom_provider = create_sentence_transformer_provider("bge-base-en-v1.5")
custom_memory = create_memory(
    "grounded",
    storage_backend="dual",
    storage_path="./memory_files",
    storage_uri="./memory.db",
    embedding_provider=custom_provider
)

📋 What Happens When You Add Interactions

memory.add_interaction("I love Python", "Great choice!")
# ↓ IMMEDIATE PROCESSING:
# 1. Text combined: "I love Python Great choice!"
# 2. Embedding provider called (~13ms with the default model)
# 3. 384D vector generated with all-MiniLM-L6-v2
# 4. Saved to markdown file: ./memory_files/verbatim/alice/...
# 5. Stored in LanceDB: vector + text + metadata
# 6. Interaction immediately searchable via semantic similarity
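
The steps above can be sketched end to end. Everything here is illustrative: `fake_embed` stands in for a real embedding provider (all-MiniLM-L6-v2 would return 384 floats), and the two lists stand in for the markdown and LanceDB backends:

```python
def fake_embed(text):
    """Stand-in for a real embedding provider (returns one float per char)."""
    return [float(ord(c) % 7) for c in text]

def add_interaction(user_text, agent_text, markdown_log, vector_rows):
    combined = f"{user_text} {agent_text}"                    # 1. combine the turn
    vector = fake_embed(combined)                             # 2. embed immediately
    markdown_log.append(f"# Interaction\n\n{combined}")       # 3. observable markdown record
    vector_rows.append({"text": combined, "vector": vector})  # 4. searchable vector row

markdown_files, lancedb_rows = [], []
add_interaction("I love Python", "Great choice!", markdown_files, lancedb_rows)
```

Embedding at write time (rather than batching later) is what makes an interaction searchable the moment `add_interaction()` returns.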

🔗 AbstractLLM Ecosystem Integration

AbstractMemory seamlessly integrates with AbstractCore, maintaining clear separation between LLM and embedding providers:

Critical Architecture: LLM vs Embedding Separation

from abstractllm import create_llm
from abstractllm.embeddings import EmbeddingManager
from abstractmemory import create_memory

# SEPARATE PROVIDERS for different purposes:

# 1. LLM Provider - for TEXT GENERATION (can change freely)
llm_provider = create_llm("anthropic", model="claude-3-5-haiku-latest")

# 2. Embedding Provider - for SEMANTIC SEARCH (must stay consistent)
embedding_provider = EmbeddingManager()

# Create memory with DEDICATED embedding provider
memory = create_memory(
    "grounded",
    enable_kg=True,
    storage_backend="dual",
    storage_path="./memory",
    storage_uri="./memory.db",
    embedding_provider=embedding_provider  # DEDICATED for embeddings
)

# Use in agent reasoning with CLEAR separation
context = memory.get_full_context(query)
response = llm_provider.generate(prompt, system_prompt=context)  # LLM for text
memory.add_interaction(query, response.content)  # Embeddings handled internally

# Search uses embedding provider for semantic similarity
similar_memories = memory.search_stored_interactions("related concepts")

Key Points:

  • LLM Provider: Change freely between Anthropic ↔ OpenAI ↔ Ollama
  • Embedding Provider: Must remain consistent within storage space
  • Never pass LLM provider as embedding provider
  • Always use dedicated embedding provider for semantic search

With AbstractAgent (Future)

from abstractagent import create_agent
from abstractmemory import create_memory

# Autonomous agent with sophisticated memory
memory = create_memory("grounded", working_capacity=20)
agent = create_agent("autonomous", memory=memory, provider=provider)

# Agent automatically uses memory for consistency and personalization
response = agent.execute(task, user_id="alice")

๐Ÿ›๏ธ Architecture Principles

  1. No Over-Engineering: Memory complexity matches agent requirements
  2. Real Implementation Testing: NO MOCKS anywhere - all tests use real implementations
  3. SOTA Research Foundation: Built on proven patterns (MemGPT, Zep, Graphiti)
  4. Clean Abstractions: Simple interfaces, powerful implementations
  5. Performance Optimized: Fast operations for simple agents, scalable for complex ones

📈 Performance Characteristics

  • Simple Memory: < 1ms operations, minimal overhead
  • Complex Memory: < 100ms context generation, efficient consolidation
  • Scalability: Handles thousands of memory items efficiently
  • Real LLM Integration: Context + LLM calls complete in seconds

๐Ÿค Contributing

AbstractMemory is part of the AbstractLLM ecosystem. See CONTRIBUTING.md for development guidelines.

📄 License

[License details]


AbstractMemory: Smart memory for smart agents 🧠✨
