
antaris-memory

File-based persistent memory for AI agents. Zero dependencies (stdlib only), crash-safe writes, file-based JSON storage.


Installation

pip install antaris-memory

Zero dependencies. No API keys. No external services.

Core Features

  • BM25 search with TF-IDF scoring - Full-text search with keyword relevance ranking
  • Write-ahead log (WAL) - Crash-safe writes with automatic recovery
  • Sharding - Horizontal scaling for large memory stores (10,000+ entries)
  • Decay engine - Memories fade over time unless reinforced (Ebbinghaus curves)
  • Sentiment tagging - Automatic emotional context detection
  • Temporal awareness - Time-based queries and chronological context
  • Confidence scoring - Reliability metrics for stored information
  • Compression and consolidation - Automatic deduplication and clustering
  • Forgetting engine - Selective deletion with audit trails
  • Input gating - P0-P3 priority classification at intake
  • Recovery presets - Context restoration after compaction/restart
  • Thread-safe operations - FileLock using os.mkdir() atomicity
  • MCP server integration - Expose as MCP tools (optional)

Quick Start

from antaris_memory import MemorySystem

# Initialize with 7-day half-life for decay
mem = MemorySystem("./workspace", half_life=7.0)
mem.load()

# Store memories
mem.ingest("PostgreSQL chosen for primary database", 
           category="technical", memory_type="episodic")
mem.ingest("API costs exceed $500/month budget",
           category="operational", confidence=0.9)

# Search with BM25 ranking
results = mem.search("database decision")
for r in results:
    print(f"[{r.confidence:.2f}] {r.content}")

# Save to disk
mem.save()

API Reference

Core Exports

from antaris_memory import (
    MemorySystem,           # Main interface (aliases MemorySystemV4)
    MemoryEntry,            # Individual memory record
    SearchResult,           # Search result with metadata
    RecoveryManager,        # Post-compaction context restoration
    RecoveryConfig,         # Recovery configuration presets
    
    # Engine Components
    DecayEngine,            # Time-based memory degradation
    SentimentTagger,        # Emotional context detection
    TemporalEngine,         # Time-aware queries
    ConfidenceEngine,       # Reliability scoring
    CompressionEngine,      # Deduplication and clustering
    ForgettingEngine,       # Selective deletion
    ConsolidationEngine,    # Memory optimization
    InputGate,              # Priority classification
)

MemorySystem

Primary interface for all memory operations:

mem = MemorySystem(
    workspace_path="./memory",
    half_life=7.0,                    # Days until 50% decay
    enable_wal=True,                  # Write-ahead logging
    shard_threshold=1000,             # Entries per shard
    recovery_config=RecoveryConfig()  # Post-restart recovery
)

Core Methods

# Loading and saving
mem.load()                # Load from disk
mem.save()                # Save to disk with WAL

# Memory ingestion
mem.ingest(content, source="", category="", memory_type="episodic", 
          confidence=1.0, tags=[], metadata={})

# Typed ingestion helpers
mem.ingest_fact("PostgreSQL supports JSON columns")
mem.ingest_preference("User prefers concise responses")
mem.ingest_mistake("Connection pool not closed properly")
mem.ingest_procedure("Deploy: git push → CI → staging → prod")

# Input gating (P0-P3 classification)
mem.ingest_with_gating("Critical security alert", source="monitoring")
mem.ingest_with_gating("Thanks for the update", source="chat")  # Dropped (P3)

# Search
results = mem.search(query, limit=10, explain=False)
results = mem.search_by_tags(["technical", "database"])
results = mem.search_by_date_range("2026-02-01", "2026-02-28")

# Temporal queries
recent = mem.on_date("2026-02-14")
story = mem.narrative(topic="database migration")

# Maintenance
report = mem.consolidate()        # Dedup and optimize
mem.forget(entity="John Doe")     # GDPR deletion with audit
mem.compact()                     # Archive old shards

BM25 Search Engine

Full-text search using BM25 algorithm with TF-IDF scoring:

# Basic search
results = mem.search("database performance issues")

# Search with explanation
results = mem.search("postgres slow", explain=True)
for r in results:
    print(f"[{r.relevance:.2f}] {r.content[:60]}")
    print(f"  Explanation: {r.explanation}")

# Advanced parameters
results = mem.search(
    query="API optimization",
    limit=20,
    min_confidence=0.7,
    memory_types=["procedural", "episodic"],
    categories=["technical"]
)

Search scoring combines the following signals (a sketch of one way to combine them follows the list):

  • BM25 keyword relevance
  • Temporal decay (recent memories score higher)
  • Access frequency (frequently accessed memories are boosted)
  • Memory type boost (procedural > episodic for how-to queries)
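The exact weights are internal to the library, but conceptually these signals multiply together. A minimal sketch of such a composite score, with illustrative weights and field names that are assumptions rather than the library's API:

import math
import time

def combined_score(bm25_score, created_at, access_count, type_boost, half_life_days=7.0):
    """Illustrative composite: BM25 relevance damped by age, lifted by access frequency and type."""
    age_days = (time.time() - created_at) / 86400
    decay = math.exp(-math.log(2) * age_days / half_life_days)  # same curve as the decay engine
    frequency_boost = 1.0 + 0.1 * math.log1p(access_count)      # diminishing returns
    return bm25_score * decay * frequency_boost * type_boost

# A week-old, often-used procedural memory can still outrank a fresh but weak match
print(combined_score(8.2, time.time() - 7 * 86400, access_count=5, type_boost=2.5))
print(combined_score(3.1, time.time(), access_count=0, type_boost=1.0))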

Write-Ahead Log (WAL)

Crash-safe writes with automatic recovery:

# WAL enabled by default
mem = MemorySystem("./workspace", enable_wal=True)

# Writes go through WAL first
mem.ingest("Important data")  # Written to WAL, then committed
mem.save()                    # Flush WAL to main storage

# Automatic recovery on next load
mem.load()  # Replays uncommitted WAL entries

WAL format:

workspace/
├── wal/
│   ├── 20260224_143022_001.wal
│   ├── 20260224_143022_002.wal
│   └── current.wal
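The WAL internals aren't documented on this page, but the append-before-commit pattern it describes can be sketched as follows (hypothetical one-record-per-line JSON format, not antaris-memory's actual on-disk layout):

import json
import os

def wal_append(wal_path, record):
    """Append one JSON record per line and fsync so it survives a crash."""
    with open(wal_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())

def wal_replay(wal_path):
    """Yield records from an uncommitted log, ignoring a torn final line."""
    if not os.path.exists(wal_path):
        return
    with open(wal_path, encoding="utf-8") as f:
        for line in f:
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                break  # partial write at the crash point

wal_append("current.wal", {"op": "ingest", "content": "Important data"})
for op in wal_replay("current.wal"):
    print(op)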

Sharding

Horizontal scaling for large memory stores:

# Auto-sharding at 1000 entries per shard (default)
mem = MemorySystem("./workspace", shard_threshold=1000)

# Custom sharding strategy
mem = MemorySystem("./workspace", shard_strategy="temporal")  # By date
mem = MemorySystem("./workspace", shard_strategy="semantic")   # By topic

Shard structure:

workspace/
├── shards/
│   ├── 2026-02-technical.json     # 847 entries
│   ├── 2026-02-operational.json  # 1,203 entries
│   └── 2026-01-tactical.json     # 512 entries
├── indexes/
│   ├── global_search.json
│   └── shard_manifest.json

Decay Engine

Memories fade over time unless reinforced:

# Configure decay parameters
mem = MemorySystem("./workspace", half_life=7.0)  # 7-day half-life

# Manual reinforcement
mem.reinforce("database migration plan")  # Resets decay

# Query with decay consideration
recent = mem.search("performance", consider_decay=True)
historical = mem.search("performance", consider_decay=False)

# Decay statistics
stats = mem.decay_stats()
print(f"Memories at <10% strength: {stats['nearly_forgotten']}")
print(f"Average decay factor: {stats['avg_decay']:.3f}")

Decay follows Ebbinghaus forgetting curve:

strength = initial_strength * exp(-ln(2) * age_days / half_life)
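Evaluating the formula directly shows the behaviour: with the default 7-day half-life, strength drops to 50% after one week and 25% after two (a standalone snippet, not a library call):

import math

def decay_strength(initial_strength, age_days, half_life=7.0):
    return initial_strength * math.exp(-math.log(2) * age_days / half_life)

print(decay_strength(1.0, 7.0))    # 0.5   (one half-life)
print(decay_strength(1.0, 14.0))   # 0.25  (two half-lives)
print(decay_strength(1.0, 30.0))   # ~0.05 (nearly forgotten)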

Sentiment Tagging

Automatic emotional context detection:

# Automatic sentiment detection
mem.ingest("The deployment failed catastrophically")
# → Tagged with sentiment: negative, intensity: 0.8

mem.ingest("Successfully migrated all user data")  
# → Tagged with sentiment: positive, intensity: 0.7

# Query by sentiment
positive_memories = mem.search_by_sentiment("positive", min_intensity=0.6)
issues = mem.search_by_sentiment("negative", categories=["technical"])

# Access sentiment metadata
for result in mem.search("deployment"):
    if result.metadata.get("sentiment"):
        sent = result.metadata["sentiment"]
        print(f"Sentiment: {sent['polarity']} ({sent['intensity']:.2f})")

Sentiment classification uses lexicon-based analysis (no model calls).
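The shipped lexicon isn't reproduced here, but lexicon-based tagging of this kind amounts to weighted word lists; the words and weights below are illustrative assumptions, not the library's data:

NEGATIVE = {"failed": 0.5, "catastrophically": 0.3, "error": 0.4, "slow": 0.2}
POSITIVE = {"successfully": 0.5, "fixed": 0.4, "migrated": 0.2, "fast": 0.3}

def tag_sentiment(text):
    """Sum lexicon weights per polarity; the heavier side wins."""
    tokens = text.lower().split()
    neg = sum(NEGATIVE.get(t, 0.0) for t in tokens)
    pos = sum(POSITIVE.get(t, 0.0) for t in tokens)
    if neg == 0.0 and pos == 0.0:
        return {"polarity": "neutral", "intensity": 0.0}
    polarity = "negative" if neg > pos else "positive"
    return {"polarity": polarity, "intensity": min(1.0, abs(neg - pos))}

print(tag_sentiment("The deployment failed catastrophically"))
# {'polarity': 'negative', 'intensity': 0.8}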

Temporal Engine

Time-aware queries and chronological context:

# Date range queries
Q1_memories = mem.search_by_date_range("2026-01-01", "2026-03-31")

# Relative time queries
last_week = mem.in_last_days(7)
yesterday = mem.on_date("yesterday")

# Chronological narrative
story = mem.narrative(
    topic="database migration",
    start_date="2026-01-01",
    end_date="2026-02-28"
)

# Temporal patterns
patterns = mem.temporal_patterns(topic="deployment", window="weekly")
for pattern in patterns:
    print(f"Week {pattern['week']}: {pattern['event_count']} events")
    print(f"  Common themes: {pattern['themes']}")

Confidence Engine

Track reliability of stored information:

# Store with confidence scores
mem.ingest("PostgreSQL handles 10K QPS", confidence=0.95)  # Measured
mem.ingest("MongoDB might be faster", confidence=0.3)     # Speculation

# Filter by confidence
reliable = mem.search("database performance", min_confidence=0.8)

# Update confidence based on validation
mem.update_confidence("PostgreSQL handles 10K QPS", new_confidence=1.0)

# Confidence statistics
stats = mem.confidence_stats()
print(f"High confidence (>0.8): {stats['high_confidence']} memories")
print(f"Needs validation (<0.5): {stats['low_confidence']} memories")

Compression and Consolidation

Automatic deduplication and memory optimization:

# Manual consolidation
report = mem.consolidate()
print(f"Removed {report['duplicates_removed']} duplicates")
print(f"Merged {report['similar_merged']} similar memories")
print(f"Compressed {report['entries_compressed']} entries")

# Auto-consolidation threshold
mem = MemorySystem("./workspace", auto_consolidate_threshold=5000)

# Consolidation strategies
report = mem.consolidate(
    strategy="aggressive",      # Remove more aggressively
    similarity_threshold=0.85,  # Merge similar content
    preserve_temporal=True      # Keep chronological order
)

# Review before applying
preview = mem.consolidate(dry_run=True)
print(f"Would remove {preview['would_remove']} entries")
if input("Apply? [y/N]: ").lower() == 'y':
    mem.consolidate(apply=True)

Forgetting Engine

Selective deletion with audit trails:

# GDPR deletion by entity
mem.forget(entity="John Doe", reason="GDPR request")

# Time-based cleanup
mem.forget(before_date="2025-01-01", reason="Data retention policy")

# Category-based deletion
mem.forget(categories=["temporary", "staging"], reason="Cleanup")

# Selective forgetting with conditions
mem.forget_if(
    condition=lambda entry: entry.confidence < 0.2,
    reason="Low confidence cleanup"
)

# Audit trail
audit = mem.get_audit_log()
for deletion in audit:
    print(f"{deletion['timestamp']}: Removed {deletion['count']} entries")
    print(f"  Reason: {deletion['reason']}")
    print(f"  Hash: {deletion['content_hash'][:8]}...")

Input Gating

Priority classification at intake (P0-P3):

# Automatic classification
mem.ingest_with_gating("URGENT: API down", source="alerts")
# → P0 (critical) → stored immediately

mem.ingest_with_gating("Decided on PostgreSQL", source="meeting")  
# → P1 (operational) → stored

mem.ingest_with_gating("sounds good!", source="chat")
# → P3 (ephemeral) → dropped

# Manual priority override
mem.ingest("Low priority note", priority="P2")

# Gate configuration
gate_config = {
    "drop_p3": True,           # Drop ephemeral content
    "batch_p2": True,          # Batch low-priority writes
    "immediate_p0": True       # Immediate write for critical
}
mem.configure_input_gate(gate_config)

# Gating statistics
stats = mem.gate_stats()
print(f"P0 (critical): {stats['p0_count']}")
print(f"P3 dropped: {stats['p3_dropped']}")
print(f"Avg classification time: {stats['avg_classify_ms']:.2f}ms")

Priority levels (a classification sketch follows the list):

  • P0 (Critical): Security alerts, errors, deadlines, financial data
  • P1 (Operational): Decisions, technical choices, assignments
  • P2 (Tactical): Background info, research, general discussion
  • P3 (Ephemeral): Greetings, acknowledgments, social noise (dropped)
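The gate's real rules aren't listed on this page; a keyword-driven classifier in the same spirit might look like this (hypothetical marker lists, not the library's implementation):

P0_MARKERS = ("urgent", "security", "outage", "error", "deadline", "invoice")
P3_MARKERS = ("thanks", "sounds good", "hello", "great")
P1_MARKERS = ("decided", "chose", "assigned", "will use", "migrate to")

def classify_priority(text):
    """Return P0-P3 from simple keyword heuristics; P2 is the default tier."""
    lowered = text.lower()
    if any(m in lowered for m in P0_MARKERS):
        return "P0"
    if any(m in lowered for m in P3_MARKERS):
        return "P3"  # ephemeral; the gate drops these
    if any(m in lowered for m in P1_MARKERS):
        return "P1"
    return "P2"

print(classify_priority("URGENT: API down"))         # P0
print(classify_priority("Decided on PostgreSQL"))    # P1
print(classify_priority("sounds good!"))             # P3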

Recovery Manager

Context restoration after compaction or restart:

# Default smart recovery
config = RecoveryConfig()  # 50 memories, 24h window
mem = MemorySystem("./workspace", recovery_config=config)

# Minimal recovery for token efficiency
config = RecoveryConfig(recovery_mode="minimal")  # 10 memories, session only

# Custom recovery
config = RecoveryConfig(
    recovery_search_limit=100,
    recovery_time_window="48h",
    recovery_channels="current",
    recovery_inject="cache"
)

mem.load()
mem.recover_memories()  # Automatic on load

# Access recovered context
recovery_mgr = mem.recovery_manager
cached = recovery_mgr.get_cached_memories()
context_block = recovery_mgr.inject_into_context()

Recovery presets:

Mode      Memories   Window    Tokens   Use Case
smart     50         24h       5-10K    Balanced recovery
minimal   10         session   1-2K     Token-constrained

Context Packets

Package relevant memories for sub-agent injection:

# Single query packet
packet = mem.build_context_packet(
    task="Debug authentication flow",
    max_memories=10,
    max_tokens=2000,
    include_mistakes=True,
    tags=["auth", "security"]
)

# Render for injection
markdown_context = packet.render("markdown")
xml_context = packet.render("xml")

# Multi-query with deduplication
packet = mem.build_context_packet_multi(
    task="Performance optimization",
    queries=["slow queries", "database bottleneck", "caching"],
    max_tokens=3000
)

# Token budget management
packet.trim(max_tokens=1500)
print(f"Final token count: {packet.token_count}")

MCP Server Integration

Expose memory as MCP tools:

# Requires: pip install mcp
from antaris_memory import create_mcp_server

# Create server
server = create_mcp_server(workspace="./memory")

# Run with stdio transport
server.run()  # Connect from Claude Desktop, Cursor, etc.

# Available MCP tools:
# - memory_search(query, limit)
# - memory_ingest(content, category, memory_type)
# - memory_consolidate()
# - memory_stats()

Thread Safety

Multiple processes can safely access the same workspace:

from antaris_memory import FileLock

# Exclusive write lock
with FileLock("/path/to/shard.json", timeout=10.0):
    data = load_shard()
    modify_data(data)
    save_shard(data)

# Optimistic concurrency for reads
from antaris_memory import VersionTracker

tracker = VersionTracker()
version = tracker.snapshot("/path/to/data.json")
data = load_data()
process_data(data)
tracker.check(version)  # Raises ConflictError if modified

FileLock uses os.mkdir() for cross-platform atomic operations.
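os.mkdir() either creates the directory or raises FileExistsError, atomically, on every major platform, which is what makes it usable as a lock primitive. A simplified sketch of the idea (the shipped FileLock shown above is the one to use in practice):

import os
import time

class MkdirLock:
    """Illustrative mkdir-based lock; creation succeeds for exactly one process."""

    def __init__(self, path, timeout=10.0, poll=0.05):
        self.lock_dir = path + ".lock"
        self.timeout = timeout
        self.poll = poll

    def __enter__(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                os.mkdir(self.lock_dir)  # atomic create-or-fail
                return self
            except FileExistsError:
                if time.monotonic() > deadline:
                    raise TimeoutError(f"could not acquire {self.lock_dir}")
                time.sleep(self.poll)

    def __exit__(self, *exc):
        os.rmdir(self.lock_dir)

with MkdirLock("/tmp/shard.json"):
    pass  # read-modify-write the shard here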

Benchmarks

Tested on Apple M4, Python 3.14, SSD storage.

Search Performance

Memories   Ingest (avg)   Search (avg)   Search (p99)   Memory (MB)
100        0.053ms        0.40ms         0.65ms         8
1,000      0.033ms        3.43ms         5.14ms         45
10,000     0.035ms        24.7ms         38.2ms         180
50,000     0.041ms        127ms          195ms          850

Comparison with Other Libraries

Search performance against existing memory libraries:

Library            1K memories   10K memories   Dependencies
antaris-memory     3.4ms         24.7ms         0 (stdlib)
mem0               610ms         1,507,000ms    Redis + Vector DB
langchain-memory   185ms         4,460ms        Multiple

Result: 61,030x faster than mem0 at 10,000 memories, 180x faster at 1,000 memories.

Input Gating Performance

P0-P3 classification speed:

Metric                   Value
Average classification   0.177ms
P99 classification       0.45ms
Throughput               5,650 classifications/sec

Storage Efficiency

Memories   Raw JSON   Compressed   Compression Ratio
1,000      1.1MB      340KB        3.2:1
10,000     11.2MB     2.8MB        4.0:1
50,000     56.8MB     12.1MB       4.7:1

Storage Format

Plain JSON files for transparency and debuggability:

workspace/
├── shards/
│   ├── 2026-02-technical.json     # Technical memories
│   ├── 2026-02-operational.json  # Operational decisions
│   └── 2026-01-archive.json      # Archived memories
├── indexes/
│   ├── search_index.json         # BM25 inverted index
│   ├── tag_index.json           # Tag mappings
│   ├── date_index.json          # Temporal index
│   └── confidence_index.json    # Confidence levels
├── wal/
│   ├── current.wal              # Active write-ahead log
│   └── 20260224_143022.wal      # Rotated WAL files
├── audit/
│   └── deletions.json           # GDPR audit trail
└── config.json                  # Workspace configuration

Architecture

MemorySystem (v4.0.0)
├── Core Components
│   ├── ShardManager         # Horizontal scaling
│   ├── IndexManager         # Search indexes
│   └── WALManager           # Write-ahead logging
├── Search Engine
│   ├── BM25Engine           # Keyword ranking
│   ├── TFIDFScorer          # Term frequency scoring
│   └── TemporalRanker       # Time-based relevance
├── Memory Processing
│   ├── DecayEngine          # Ebbinghaus forgetting
│   ├── SentimentTagger      # Emotional context
│   ├── ConfidenceEngine     # Reliability scoring
│   ├── CompressionEngine    # Deduplication
│   ├── ConsolidationEngine  # Memory optimization
│   └── ForgettingEngine     # Selective deletion
├── Input Processing
│   ├── InputGate            # P0-P3 classification
│   └── MemoryTyper          # Episodic/semantic/procedural
├── Recovery System
│   ├── RecoveryManager      # Post-restart context
│   └── RecoveryConfig       # Smart/minimal presets
├── Concurrency
│   ├── FileLock             # Cross-platform locking
│   └── VersionTracker       # Optimistic concurrency
└── Integration
    ├── MCPServer            # Model Context Protocol
    └── ContextPacketBuilder # Sub-agent injection

Memory Types

Store memories with type-specific optimizations:

# Episodic: events, decisions, meeting notes
mem.ingest_episodic("Decided to migrate to PostgreSQL in Q2 meeting")

# Semantic: facts, concepts, general knowledge  
mem.ingest_semantic("PostgreSQL supports ACID transactions")

# Procedural: how-to steps, runbooks, processes
mem.ingest_procedural("Deploy: git push → CI → staging → production")

# Preference: user preferences, style notes
mem.ingest_preference("User prefers Python code examples over pseudocode")

# Mistake: errors to avoid, lessons learned
mem.ingest_mistake("Forgot to close database connections in worker threads")

Type-specific recall boosts (a sketch of applying them follows the list):

  • Procedural: 2.5x boost for how-to queries
  • Preference: 2.0x boost for style/format queries
  • Mistake: 1.8x boost for troubleshooting queries
  • Semantic: 1.2x boost for factual queries
  • Episodic: Baseline (1.0x)
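Read these multipliers as factors applied to the base relevance score when the query intent matches the memory type. A rough sketch of the procedural case, assuming simple cue-based intent detection (not the library's implementation):

TYPE_BOOSTS = {"procedural": 2.5, "preference": 2.0, "mistake": 1.8,
               "semantic": 1.2, "episodic": 1.0}
HOW_TO_CUES = ("how do i", "how to", "steps", "deploy", "runbook")

def boosted_score(base_score, memory_type, query):
    """Apply the procedural boost only when the query looks like a how-to question."""
    boost = TYPE_BOOSTS.get(memory_type, 1.0)
    if memory_type == "procedural" and not any(c in query.lower() for c in HOW_TO_CUES):
        boost = 1.0
    return base_score * boost

print(boosted_score(4.0, "procedural", "How do I deploy to staging?"))  # 10.0
print(boosted_score(4.0, "episodic", "How do I deploy to staging?"))    # 4.0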

Namespace Isolation

Multi-tenant workspaces with hard boundaries:

from antaris_memory import NamespacedMemory, NamespaceManager

# Create isolated namespaces
manager = NamespaceManager("./workspace") 
agent_a = manager.create_namespace("agent-a")
agent_b = manager.create_namespace("agent-b")

# Each namespace is fully isolated
agent_a.ingest("Agent A decision")
agent_b.ingest("Agent B decision")

# Search within namespace only
results_a = agent_a.search("decision")  # Only sees agent A memories
results_b = agent_b.search("decision")  # Only sees agent B memories

# Cross-namespace operations (explicit)
all_decisions = manager.search_across_namespaces("decision", 
                                                 namespaces=["agent-a", "agent-b"])

Testing

Run the full test suite:

git clone https://github.com/Antaris-Analytics-LLC/antaris-suite.git
cd antaris-suite/antaris-memory
python -m pytest tests/ -v

# Run specific test categories
python -m pytest tests/test_search.py -v          # Search engine
python -m pytest tests/test_wal.py -v            # Write-ahead log
python -m pytest tests/test_decay.py -v          # Memory decay
python -m pytest tests/test_concurrency.py -v    # Thread safety

All 1,159 tests pass with zero external dependencies.

Migration from v3.x

Automatic schema migration on first load:

# v3.x workspaces load automatically
mem = MemorySystem("./existing_v3_workspace")
mem.load()  # Auto-detects v3 format, migrates to v4

# New v4 features available immediately
mem.ingest_with_gating("Test message", source="migration")
results = mem.search("test", explain=True)

Limitations

We are honest about what this library cannot do:

  1. Storage scale: JSON files work well up to ~50,000 memories. Beyond that, you need a database.
  2. Semantic understanding: Core search is keyword-based. Add your own embedding function for semantic search (a re-ranking sketch follows below).
  3. Graph relationships: Flat memory store. No entity relationships or graph traversal.
  4. Real-time updates: File-based storage has write latency. Not suitable for real-time applications.
  5. Distributed systems: Single-machine only. No clustering or distributed consensus.

When you hit these limits, you know it's time for a more complex solution.
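If you need semantic ranking today, one low-effort path is to re-rank the keyword results with an embedding function you supply yourself. A sketch that assumes only the documented mem.search() API plus a caller-provided embed(text) -> list[float]:

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

def semantic_rerank(mem, query, embed, limit=10):
    """Fetch extra BM25 candidates, then reorder them by embedding similarity."""
    candidates = mem.search(query, limit=limit * 3)
    query_vec = embed(query)
    ranked = sorted(candidates, key=lambda r: cosine(query_vec, embed(r.content)), reverse=True)
    return ranked[:limit]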

License

Licensed under the Apache License 2.0. See LICENSE for details.

Related Packages
