
antaris-memory

File-based persistent memory for AI agents. Zero dependencies (stdlib only), crash-safe writes, file-based JSON storage.

Available on PyPI. Requires Python 3.9+. Apache-2.0 licensed.

Installation

pip install antaris-memory

Zero dependencies. No API keys. No external services.

Core Features

  • BM25 search with TF-IDF scoring - Full-text search with keyword relevance ranking
  • Write-ahead log (WAL) - Crash-safe writes with automatic recovery
  • Sharding - Horizontal scaling for large memory stores (10,000+ entries)
  • Decay engine - Memories fade over time unless reinforced (Ebbinghaus curves)
  • Sentiment tagging - Automatic emotional context detection
  • Temporal awareness - Time-based queries and chronological context
  • Confidence scoring - Reliability metrics for stored information
  • Compression and consolidation - Automatic deduplication and clustering
  • Forgetting engine - Selective deletion with audit trails
  • Input gating - P0-P3 priority classification at intake
  • Recovery presets - Context restoration after compaction/restart
  • Thread-safe operations - FileLock using os.mkdir() atomicity
  • MCP server integration - Expose as MCP tools (optional)

Quick Start

from antaris_memory import MemorySystem

# Initialize — agent_name personalizes logs and namespace labels
mem = MemorySystem("./workspace", half_life=7.0, agent_name="MyBot")
mem.load()

# Store memories
mem.ingest("PostgreSQL chosen for primary database", 
           category="technical", memory_type="episodic")
mem.ingest("API costs exceed $500/month budget",
           category="operational", confidence=0.9)

# Search with BM25 ranking (searches all memories by default)
results = mem.search("database decision")
for r in results:
    print(f"[{r.confidence:.2f}] {r.content}")

# Multi-tenant: scope search to a specific session
results = mem.search("database decision", session_id="session-abc")

# Save to disk
mem.save()

API Reference

Core Exports

from antaris_memory import (
    MemorySystem,           # Main interface (aliases MemorySystemV4)
    MemoryEntry,            # Individual memory record
    SearchResult,           # Search result with metadata
    RecoveryManager,        # Post-compaction context restoration
    RecoveryConfig,         # Recovery configuration presets
    
    # Engine Components
    DecayEngine,            # Time-based memory degradation
    SentimentTagger,        # Emotional context detection
    TemporalEngine,         # Time-aware queries
    ConfidenceEngine,       # Reliability scoring
    CompressionEngine,      # Deduplication and clustering
    ForgettingEngine,       # Selective deletion
    ConsolidationEngine,    # Memory optimization
    InputGate,              # Priority classification
)

MemorySystem

Primary interface for all memory operations:

mem = MemorySystem(
    workspace_path="./memory",
    half_life=7.0,                    # Days until 50% decay
    enable_wal=True,                  # Write-ahead logging
    shard_threshold=1000,             # Entries per shard
    recovery_config=RecoveryConfig()  # Post-restart recovery
)

Core Methods

# Loading and saving
mem.load()                # Load from disk
mem.save()                # Save to disk with WAL

# Memory ingestion
mem.ingest(content, source="", category="", memory_type="episodic",
           confidence=1.0, tags=[], metadata={})

# Typed ingestion helpers
mem.ingest_fact("PostgreSQL supports JSON columns")
mem.ingest_preference("User prefers concise responses")
mem.ingest_mistake("Connection pool not closed properly", "Use context managers or explicit close() in finally block")
mem.ingest_procedure("Deploy: git push → CI → staging → prod")

# Input gating (P0-P3 classification)
mem.ingest_with_gating("Critical security alert", source="monitoring")
mem.ingest_with_gating("Thanks for the update", source="chat")  # Dropped (P3)

# Search
results = mem.search(query, limit=10, explain=False)
results = mem.search(query, tags=["technical", "database"])
results = mem.between("2026-02-01", "2026-02-28")

# Search with instrumentation context (primary API for OpenClaw plugin / agent use)
results, ctx = mem.search_with_context(query, limit=5)
mem.mark_used([r.id for r in results], ctx)  # Boosts relevance of retrieved memories

# Temporal queries
recent = mem.on_date("2026-02-14")
this_week = mem.between("2026-02-17", "2026-02-24")
story = mem.narrative(topic="database migration")

# Maintenance
report = mem.consolidate()        # Dedup and optimize
mem.forget(entity="John Doe")     # GDPR deletion with audit
mem.compact()                     # Archive old shards

BM25 Search Engine

Full-text search using the BM25 algorithm with TF-IDF scoring:

# Basic search
results = mem.search("database performance issues")

# Search with explanation
results = mem.search("postgres slow", explain=True)
for r in results:
    print(f"[{r.relevance:.2f}] {r.content[:60]}")
    print(f"  Explanation: {r.explanation}")

# Advanced parameters
results = mem.search(
    query="API optimization",
    limit=20,
    min_confidence=0.7,
    memory_types=["procedural", "episodic"],
    categories=["technical"]
)

Search scoring combines:

  • BM25 keyword relevance
  • Temporal decay (recent memories score higher)
  • Access frequency (frequently accessed memories boost)
  • Memory type boost (procedural > episodic for how-to queries)
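The four factors above can be sketched as a single multiplicative score. The weights and function shapes here are illustrative assumptions, not the library's actual internals:

```python
import math
import time

def combined_score(bm25, created_at, access_count, type_boost,
                   half_life_days=7.0, now=None):
    """Illustrative ranking: BM25 relevance scaled by exponential
    temporal decay, a log-dampened access-frequency boost, and a
    memory-type multiplier. Weights are assumptions for demonstration."""
    now = now if now is not None else time.time()
    age_days = (now - created_at) / 86400
    decay = math.exp(-math.log(2) * age_days / half_life_days)  # temporal decay
    freq_boost = 1.0 + 0.1 * math.log1p(access_count)           # access frequency
    return bm25 * decay * freq_boost * type_boost

# A fresh, frequently retrieved procedural memory outranks a stale episodic one
now = time.time()
fresh = combined_score(2.0, now, access_count=5, type_boost=2.5, now=now)
stale = combined_score(2.0, now - 30 * 86400, access_count=0, type_boost=1.0, now=now)
assert fresh > stale
```

The multiplicative form means a strong keyword match can still lose to a weaker but recent, frequently used memory, which matches the ranking behavior described above.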

Write-Ahead Log (WAL)

Crash-safe writes with automatic recovery:

# WAL enabled by default
mem = MemorySystem("./workspace", enable_wal=True)

# Writes go through WAL first
mem.ingest("Important data")  # Written to WAL, then committed
mem.save()                    # Flush WAL to main storage

# Automatic recovery on next load
mem.load()  # Replays uncommitted WAL entries

WAL format:

workspace/
├── wal/
│   ├── 20260224_143022_001.wal
│   ├── 20260224_143022_002.wal
│   └── current.wal

Sharding

Horizontal scaling for large memory stores:

# Auto-sharding at 1000 entries per shard (default)
mem = MemorySystem("./workspace", shard_threshold=1000)

# Custom sharding strategy
mem = MemorySystem("./workspace", shard_strategy="temporal")  # By date
mem = MemorySystem("./workspace", shard_strategy="semantic")   # By topic

Shard structure:

workspace/
├── shards/
│   ├── 2026-02-technical.json    # 847 entries
│   ├── 2026-02-operational.json  # 1,203 entries
│   └── 2026-01-tactical.json     # 512 entries
├── indexes/
│   ├── global_search.json
│   └── shard_manifest.json

Decay Engine

Memories fade over time unless reinforced:

# Configure decay parameters
mem = MemorySystem("./workspace", half_life=7.0)  # 7-day half-life

# Query with/without decay consideration
recent = mem.search("performance", use_decay=True)
historical = mem.search("performance", use_decay=False)

# Decay statistics
stats = mem.get_stats()
print(f"Total memories: {stats['total']}")
print(f"Average confidence: {stats['avg_confidence']:.3f}")

Decay follows Ebbinghaus forgetting curve:

strength = initial_strength * exp(-ln(2) * age_days / half_life)
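In runnable form, the curve works out as:

```python
import math

def decayed_strength(initial, age_days, half_life=7.0):
    # Strength halves every `half_life` days (exponential Ebbinghaus-style decay)
    return initial * math.exp(-math.log(2) * age_days / half_life)

print(round(decayed_strength(1.0, 7.0), 3))   # 0.5 after one half-life
print(round(decayed_strength(1.0, 14.0), 3))  # 0.25 after two
```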

Sentiment Tagging

Automatic emotional context detection:

# Automatic sentiment detection
mem.ingest("The deployment failed catastrophically")
# → Tagged with sentiment: negative, intensity: 0.8

mem.ingest("Successfully migrated all user data")  
# → Tagged with sentiment: positive, intensity: 0.7

# Query by sentiment via search filter
positive_memories = mem.search("", sentiment_filter="positive")
issues = mem.search("", sentiment_filter="negative", category="technical")

# Access sentiment metadata
for result in mem.search("deployment"):
    if result.metadata.get("sentiment"):
        sent = result.metadata["sentiment"]
        print(f"Sentiment: {sent['polarity']} ({sent['intensity']:.2f})")

Sentiment classification uses lexicon-based analysis (no model calls).
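The lexicon-based approach can be approximated in a few lines. The word lists and intensity formula below are illustrative stand-ins, not the library's actual lexicon:

```python
# Tiny illustrative lexicon; the library's actual word lists are larger.
POSITIVE = {"successfully", "success", "great", "fixed", "improved"}
NEGATIVE = {"failed", "catastrophically", "error", "broken", "slow"}

def tag_sentiment(text):
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos == neg:
        return {"polarity": "neutral", "intensity": 0.0}
    polarity = "positive" if pos > neg else "negative"
    # Intensity grows with the density of sentiment words (capped at 1.0)
    intensity = min(1.0, abs(pos - neg) / max(len(words), 1) * 5)
    return {"polarity": polarity, "intensity": round(intensity, 2)}

print(tag_sentiment("The deployment failed catastrophically"))
```

Because no model is called, tagging cost is a fixed set lookup per word, which is what keeps ingest latency in the sub-millisecond range reported below.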

Temporal Engine

Time-aware queries and chronological context:

# Date range queries
Q1_memories = mem.between("2026-01-01", "2026-03-31")

# Single-day queries
yesterday_memories = mem.on_date("2026-02-23")

# Chronological narrative (returns formatted string)
story = mem.narrative(topic="database migration")

# Time-filtered search
recent_deployments = mem.search(
    "deployment",
    date_range=("2026-02-01", "2026-02-28")
)

Confidence Engine

Track reliability of stored information:

# Store with confidence scores
mem.ingest("PostgreSQL handles 10K QPS", confidence=0.95)  # Measured
mem.ingest("MongoDB might be faster", confidence=0.3)     # Speculation

# Filter by confidence
reliable = mem.search("database performance", min_confidence=0.8)

# Confidence statistics
stats = mem.get_stats()
print(f"Total memories: {stats['total']}")
print(f"Average confidence: {stats['avg_confidence']:.3f}")

Compression and Consolidation

Automatic deduplication and memory optimization:

# Manual consolidation — deduplicates and clusters similar memories
report = mem.consolidate()
print(f"Consolidated: {report}")

# Auto-consolidation at ingest (configurable threshold)
mem = MemorySystem("./workspace", auto_consolidate_threshold=5000)

# Compact — archives old shards and frees memory
report = mem.compact()
print(f"Compacted: {report}")

Forgetting Engine

Selective deletion with audit trails:

# GDPR deletion by entity
result = mem.forget(entity="John Doe")
print(f"Removed {len(result['removed'])} entries")

# Time-based cleanup
mem.forget(before_date="2025-01-01")

# Topic-based deletion
mem.forget(topic="staging")

# Audit trail (returned by forget())
result = mem.forget(entity="Jane Smith")
audit = result["audit"]

Input Gating

Priority classification at intake (P0-P3):

Note: ingest_with_gating() automatically filters low-signal content (P3 ephemeral). Dropped items are logged at DEBUG level. Use mem.ingest() directly if you want to store everything without filtering.

# Automatic classification
mem.ingest_with_gating("URGENT: API down", source="alerts")
# → P0 (critical) → stored immediately

mem.ingest_with_gating("Decided on PostgreSQL", source="meeting")  
# → P1 (operational) → stored

mem.ingest_with_gating("sounds good!", source="chat")
# → P3 (ephemeral) → dropped (logged at DEBUG)

# Gating statistics
stats = mem.get_stats()
print(f"Total stored: {stats['total']}")

Priority levels:

  • P0 (Critical): Security alerts, errors, deadlines, financial data
  • P1 (Operational): Decisions, technical choices, assignments
  • P2 (Tactical): Background info, research, general discussion
  • P3 (Ephemeral): Greetings, acknowledgments, social noise (dropped)
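A rule-based gate over these levels might look like the following sketch. The keyword sets are illustrative assumptions; the library's actual classifier is richer:

```python
# Illustrative keyword rules for P0-P3 intake classification.
P0_TERMS = {"urgent", "critical", "security", "error", "deadline"}
P1_TERMS = {"decided", "decision", "chose", "assigned", "migrate"}
P3_TERMS = {"thanks", "hello", "hi", "ok", "sounds"}

def classify_priority(text):
    words = {w.strip(".,:!").lower() for w in text.split()}
    if words & P0_TERMS:
        return "P0"   # critical: stored immediately
    if words & P1_TERMS:
        return "P1"   # operational: stored
    if words & P3_TERMS:
        return "P3"   # ephemeral: dropped at the gate
    return "P2"       # default: tactical/background

assert classify_priority("URGENT: API down") == "P0"
assert classify_priority("Decided on PostgreSQL") == "P1"
assert classify_priority("sounds good!") == "P3"
```

Set intersection over a handful of keyword tables is also consistent with the sub-millisecond classification times in the benchmarks below.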

Recovery Manager

Context restoration after compaction or restart:

# Default smart recovery
config = RecoveryConfig()  # 50 memories, 24h window
mem = MemorySystem("./workspace", recovery_config=config)

# Minimal recovery for token efficiency
config = RecoveryConfig(recovery_mode="minimal")  # 10 memories, session only

# Custom recovery
config = RecoveryConfig(
    recovery_search_limit=100,
    recovery_time_window="48h",
    recovery_channels="current",
    recovery_inject="cache"
)

mem.load()
mem.recover_memories()  # Automatic on load

# Access recovered context
recovery_mgr = mem.recovery_manager
cached = recovery_mgr.get_cached_memories()
context_block = recovery_mgr.inject_into_context()

Recovery presets:

| Mode    | Memories | Window  | Tokens | Use Case          |
|---------|----------|---------|--------|-------------------|
| smart   | 50       | 24h     | 5-10K  | Balanced recovery |
| minimal | 10       | session | 1-2K   | Token-constrained |

Context Packets

Package relevant memories for sub-agent injection:

# Single query packet
packet = mem.build_context_packet(
    task="Debug authentication flow",
    max_memories=10,
    max_tokens=2000,
    include_mistakes=True,
    tags=["auth", "security"]
)

# Render for injection
markdown_context = packet.render("markdown")
xml_context = packet.render("xml")

# Multi-query with deduplication
packet = mem.build_context_packet_multi(
    task="Performance optimization",
    queries=["slow queries", "database bottleneck", "caching"],
    max_tokens=3000
)

# Token budget management
packet.trim(max_tokens=1500)
print(f"Final token count: {packet.token_count}")

MCP Server Integration

Expose memory as MCP tools:

# Requires: pip install mcp
from antaris_memory import create_mcp_server

# Create server
server = create_mcp_server(workspace="./memory")

# Run with stdio transport
server.run()  # Connect from Claude Desktop, Cursor, etc.

# Available MCP tools:
# - memory_search(query, limit)
# - memory_ingest(content, category, memory_type)
# - memory_consolidate()
# - memory_stats()

Thread Safety

Multiple processes can safely access the same workspace:

from antaris_memory import FileLock

# Exclusive write lock
with FileLock("/path/to/shard.json", timeout=10.0):
    data = load_shard()
    modify_data(data)
    save_shard(data)

# Optimistic concurrency for reads
from antaris_memory import VersionTracker

tracker = VersionTracker()
version = tracker.snapshot("/path/to/data.json")
data = load_data()
process_data(data)
tracker.check(version)  # Raises ConflictError if modified

FileLock uses os.mkdir() for cross-platform atomic operations.
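The trick is that directory creation either succeeds atomically or raises, on every platform. A minimal version of the idea (not the library's actual FileLock implementation):

```python
import os
import tempfile
import time

class MkdirLock:
    """Minimal cross-platform lock sketch: os.mkdir atomically creates
    the lock directory for exactly one process, or raises FileExistsError."""
    def __init__(self, path, timeout=10.0, poll=0.05):
        self.lockdir = path + ".lock"
        self.timeout = timeout
        self.poll = poll

    def __enter__(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                os.mkdir(self.lockdir)  # atomic on POSIX and Windows
                return self
            except FileExistsError:
                if time.monotonic() > deadline:
                    raise TimeoutError(f"could not acquire {self.lockdir}")
                time.sleep(self.poll)

    def __exit__(self, *exc):
        os.rmdir(self.lockdir)  # release the lock

shard = os.path.join(tempfile.mkdtemp(), "shard.json")
with MkdirLock(shard):
    assert os.path.isdir(shard + ".lock")  # held exclusively here
assert not os.path.exists(shard + ".lock")
```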

Benchmarks

Tested on Apple M4, Python 3.14, SSD storage.

Search Performance

| Memories | Ingest (avg) | Search (avg) | Search (p99) | Memory (MB) |
|----------|--------------|--------------|--------------|-------------|
| 100      | 0.053ms      | 0.40ms       | 0.65ms       | 8           |
| 1,000    | 0.033ms      | 3.43ms       | 5.14ms       | 45          |
| 10,000   | 0.035ms      | 24.7ms       | 38.2ms       | 180         |
| 50,000   | 0.041ms      | 127ms        | 195ms        | 850         |

Comparison with Other Libraries

Search performance against existing memory libraries:

| Library          | 1K memories | 10K memories | Dependencies      |
|------------------|-------------|--------------|-------------------|
| antaris-memory   | 3.4ms       | 24.7ms       | 0 (stdlib)        |
| mem0             | 610ms       | 1,507,000ms  | Redis + Vector DB |
| langchain-memory | 185ms       | 4,460ms      | Multiple          |

Result: 61,030x faster than mem0 at scale, 180x faster at small scale.

Note: These benchmarks compare antaris-memory's local file-based storage against mem0's default networked backend (Qdrant vector DB + Redis). antaris-memory is designed as a zero-infrastructure local solution; mem0's strength is cloud-scale distributed search. The speed advantage comes from eliminating network round-trips and serialization overhead.

Input Gating Performance

P0-P3 classification speed:

| Metric                 | Value                     |
|------------------------|---------------------------|
| Average classification | 0.177ms                   |
| P99 classification     | 0.45ms                    |
| Throughput             | 5,650 classifications/sec |

Storage Efficiency

| Memories | Raw JSON | Compressed | Compression Ratio |
|----------|----------|------------|-------------------|
| 1,000    | 1.1MB    | 340KB      | 3.2:1             |
| 10,000   | 11.2MB   | 2.8MB      | 4.0:1             |
| 50,000   | 56.8MB   | 12.1MB     | 4.7:1             |
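Ratios in this range are typical of gzip over repetitive JSON records. You can measure it yourself on synthetic data (an illustration, not the library's actual codec or record schema):

```python
import gzip
import json

# Synthetic repetitive memory records, roughly shard-shaped
records = [{"id": i, "content": f"memory entry number {i}",
            "category": "technical", "confidence": 0.9}
           for i in range(1000)]
raw = json.dumps(records).encode()
packed = gzip.compress(raw)
print(f"raw={len(raw)}B compressed={len(packed)}B "
      f"ratio={len(raw) / len(packed):.1f}:1")
```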

Storage Format

Plain JSON files for transparency and debuggability:

workspace/
├── shards/
│   ├── 2026-02-technical.json    # Technical memories
│   ├── 2026-02-operational.json  # Operational decisions
│   └── 2026-01-archive.json      # Archived memories
├── indexes/
│   ├── search_index.json         # BM25 inverted index
│   ├── tag_index.json            # Tag mappings
│   ├── date_index.json           # Temporal index
│   └── confidence_index.json     # Confidence levels
├── namespaces/                   # Isolated namespace stores
│   └── project-alpha/
│       ├── shards/
│       ├── indexes/
│       └── ...
├── wal/
│   ├── current.wal              # Active write-ahead log
│   └── 20260224_143022.wal      # Rotated WAL files
├── audit/
│   └── deletions.json           # GDPR audit trail
└── config.json                  # Workspace configuration

Architecture

MemorySystem (v4.0.0)
├── Core Components
│   ├── ShardManager         # Horizontal scaling
│   ├── IndexManager         # Search indexes
│   └── WALManager           # Write-ahead logging
├── Search Engine
│   ├── BM25Engine           # Keyword ranking
│   ├── TFIDFScorer          # Term frequency scoring
│   └── TemporalRanker       # Time-based relevance
├── Memory Processing
│   ├── DecayEngine          # Ebbinghaus forgetting
│   ├── SentimentTagger      # Emotional context
│   ├── ConfidenceEngine     # Reliability scoring
│   ├── CompressionEngine    # Deduplication
│   ├── ConsolidationEngine  # Memory optimization
│   └── ForgettingEngine     # Selective deletion
├── Input Processing
│   ├── InputGate            # P0-P3 classification
│   └── MemoryTyper          # Episodic/semantic/procedural
├── Recovery System
│   ├── RecoveryManager      # Post-restart context
│   └── RecoveryConfig       # Smart/minimal presets
├── Concurrency
│   ├── FileLock             # Cross-platform locking
│   └── VersionTracker       # Optimistic concurrency
└── Integration
    ├── MCPServer            # Model Context Protocol
    └── ContextPacketBuilder # Sub-agent injection

Memory Types

Store memories with type-specific optimizations:

# Episodic: events, decisions, meeting notes
mem.ingest("Decided to migrate to PostgreSQL in Q2 meeting", memory_type="episodic")

# Semantic: facts, concepts, general knowledge
mem.ingest("PostgreSQL supports ACID transactions", memory_type="semantic")

# Procedural: how-to steps, runbooks, processes (shorthand helper)
mem.ingest_procedure("Deploy: git push → CI → staging → production")

# Preference: user preferences, style notes (shorthand helper)
mem.ingest_preference("User prefers Python code examples over pseudocode")

# Mistake: errors to avoid, lessons learned (shorthand helper)
mem.ingest_mistake("Forgot to close database connections in worker threads", "Use context managers or explicit close() in finally block")

Type-specific recall boosts:

  • Procedural: 2.5x boost for how-to queries
  • Preference: 2.0x boost for style/format queries
  • Mistake: 1.8x boost for troubleshooting queries
  • Semantic: 1.2x boost for factual queries
  • Episodic: Baseline (1.0x)
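Applied to a raw relevance score, the boosts amount to a type-keyed multiplier. The query-intent check below is a deliberate simplification (a single how-to heuristic); the library's real per-type intent detection is assumed to be richer:

```python
TYPE_BOOSTS = {"procedural": 2.5, "preference": 2.0, "mistake": 1.8,
               "semantic": 1.2, "episodic": 1.0}

def boosted_relevance(relevance, memory_type, query):
    # Apply the type boost only when the query looks like a how-to
    # question; otherwise fall back to the 1.0x baseline.
    howto = any(cue in query.lower() for cue in ("how", "steps", "deploy"))
    boost = TYPE_BOOSTS.get(memory_type, 1.0) if howto else 1.0
    return relevance * boost

assert boosted_relevance(1.0, "procedural", "how to deploy to staging") == 2.5
assert boosted_relevance(1.0, "procedural", "meeting notes from Q2") == 1.0
```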

Namespace Isolation

Multi-tenant workspaces with hard boundaries:

from antaris_memory import NamespacedMemory, NamespaceManager

# Create isolated namespaces
manager = NamespaceManager("./workspace") 
agent_a = manager.create_namespace("agent-a")
agent_b = manager.create_namespace("agent-b")

# Each namespace is fully isolated
agent_a.ingest("Agent A decision")
agent_b.ingest("Agent B decision")

# Search within namespace only
results_a = agent_a.search("decision")  # Only sees agent A memories
results_b = agent_b.search("decision")  # Only sees agent B memories

# Cross-namespace operations (explicit)
all_decisions = manager.search_across_namespaces("decision", 
                                                 namespaces=["agent-a", "agent-b"])

Testing

Run the full test suite:

git clone https://github.com/Antaris-Analytics-LLC/antaris-suite.git
cd antaris-suite/antaris-memory
python -m pytest tests/ -v

# Run specific test categories
python -m pytest tests/test_search.py -v          # Search engine
python -m pytest tests/test_wal.py -v            # Write-ahead log
python -m pytest tests/test_decay.py -v          # Memory decay
python -m pytest tests/test_concurrency.py -v    # Thread safety

564 tests pass with zero external dependencies.

Migration from v3.x

Automatic schema migration on first load:

# v3.x workspaces load automatically
mem = MemorySystem("./existing_v3_workspace")
mem.load()  # Auto-detects v3 format, migrates to v4

# New v4 features available immediately
mem.ingest_with_gating("Test message", source="migration")
results = mem.search("test", explain=True)

Limitations

We are honest about what this library cannot do:

  1. Storage scale: JSON files work well up to ~50,000 memories. Beyond that, you need a database.
  2. Semantic understanding: Core search is keyword-based. Add your own embedding function for semantic search.
  3. Graph relationships: Flat memory store. No entity relationships or graph traversal.
  4. Real-time updates: File-based storage has write latency. Not suitable for real-time applications.
  5. Distributed systems: Single-machine only. No clustering or distributed consensus.

When you hit these limits, you know it's time for a more complex solution.

License

Licensed under the Apache License 2.0. See LICENSE for details.
