File-based persistent memory for AI agents. Zero dependencies.

antaris-memory

Production-ready file-based persistent memory for AI agents. Zero dependencies (core).

Store, search, decay, and consolidate agent memories using only the Python standard library. Sharded storage for scalability, fast search indexes, namespace isolation, MCP server support, and automatic schema migration. No vector databases, no infrastructure, no API keys.

What's New in v2.4.0 (antaris-suite 3.0)

bulk_ingest(entries) — O(1) deferred index rebuild; ingest 1M entries without O(n²) WAL flush penalty
with mem.bulk_mode(): — context manager for existing ingest() call sites; single index rebuild on exit
Retrieval Feedback Loop — record_outcome(ids, "good"|"bad"|"neutral") adapts memory importance in real time
BLAKE2b-128 hashing — replaces MD5 for entry deduplication (SEC-001); tools/migrate_hashes.py for pre-3.0 stores
Audit log — memory_audit.json → memory_audit.jsonl; O(1) append per entry, no full-file rewrite
Cross-platform locking — _pid_running() uses ctypes/OpenProcess on Windows, os.kill(pid,0) on POSIX
Production Cleanup API — purge(), rebuild_indexes(), wal_flush(), wal_inspect() — bulk removal, index repair, and WAL management without manual shard surgery (see Production Cleanup API)
WAL subsystem — write-ahead log for safe, fast ingestion; auto-flushes every 50 appends or at 1 MB; crash-safe replay on startup
LRU read cache — Sprint 11 search caching with access-count boosting; configurable size via cache_max_entries
purge() glob patterns — source="pipeline:pipeline_*" removes all memories from any pipeline session at once
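
The BLAKE2b-128 deduplication item above can be pictured with the standard library alone. The following is a minimal sketch, not the package's implementation: `content_hash` and `dedupe` are hypothetical helpers, and the whitespace and case normalization is an assumption.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a 128-bit BLAKE2b digest (32 hex chars) of normalized text."""
    # Assumption: collapse whitespace and lowercase before hashing.
    normalized = " ".join(text.split()).lower()
    return hashlib.blake2b(normalized.encode("utf-8"), digest_size=16).hexdigest()


def dedupe(entries: list[str]) -> list[str]:
    """Keep the first occurrence of each distinct (normalized) entry."""
    seen: set[str] = set()
    unique = []
    for entry in entries:
        digest = content_hash(entry)
        if digest not in seen:
            seen.add(digest)
            unique.append(entry)
    return unique


print(dedupe(["Use  Postgres", "use postgres", "Use Redis"]))
```

A 16-byte digest keeps index keys short while avoiding MD5, which is the trade-off the release note describes.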

Previous v2.0.0 highlights (still fully available):

MCP Server — expose your memory workspace as MCP tools via create_mcp_server() (requires pip install mcp)
Hybrid semantic search — plug in any embedding function with set_embedding_fn(fn); BM25 and cosine blend automatically
Memory types — typed ingestion: episodic, semantic, procedural, preference, mistake — each with recall priority boosts
Namespace isolation — NamespacedMemory and NamespaceManager for multi-tenant memory with hard boundaries
Context packets — build_context_packet() packages relevant memories for sub-agent injection with token budgeting
293 tests (all passing)

See CHANGELOG.md for full version history.

Install

pip install antaris-memory

Quick Start

from antaris_memory import MemorySystem

mem = MemorySystem("./workspace", half_life=7.0)
mem.load()  # No-op on first run; auto-migrates old formats

# Store memories
mem.ingest("Decided to use PostgreSQL for the database.",
           source="meeting-notes", category="strategic")

# Typed helpers
mem.ingest_fact("PostgreSQL supports JSON natively")
mem.ingest_preference("User prefers concise explanations")
mem.ingest_mistake("Forgot to close DB connections in worker threads")
mem.ingest_procedure("Deploy: push to main → CI runs → auto-deploy to staging")

# Input gating — drops ephemeral noise (P3) before storage
mem.ingest_with_gating("Decided to switch to Redis for caching", source="chat")
mem.ingest_with_gating("thanks for the update!", source="chat")  # → dropped (P3)

# Search (BM25; hybrid BM25+cosine if embedding fn set)
for r in mem.search("database decision"):
    print(f"[{r.confidence:.2f}] {r.content}")

# Save
mem.save()

🧹 Production Cleanup API (v2.1.0)

These four methods replace manual shard surgery for production maintenance. Use them after bulk imports, pipeline restarts, or to clean up test data.

`purge()` — Bulk removal with glob patterns

Remove memories by source, content substring, or custom predicate. The WAL is filtered too, so purged entries cannot be replayed on the next load().

# Remove all memories from a specific pipeline session
result = mem.purge(source="pipeline:pipeline_abc123")
print(f"Removed {result['removed']} memories, {result['wal_removed']} WAL entries")

# Glob pattern — remove ALL pipeline sessions at once
result = mem.purge(source="pipeline:pipeline_*")

# Remove by content substring (case-insensitive)
result = mem.purge(content_contains="context_packet")

# Multiple criteria (OR logic: removes if ANY criterion matches)
result = mem.purge(
    source="openclaw:auto",
    content_contains="symlink mismatch",
)

# Always persist after purge
mem.save()

Return value:

{
    "removed": 10,        # from in-memory set
    "wal_removed": 2,     # from WAL file
    "total": 12,
    "audit": {
        "operation": "purge",
        "count": 12,
        "sources": ["pipeline:pipeline_abc123"],
        "timestamp": "2026-02-19T..."
    }
}

`rebuild_indexes()` — Repair search indexes after bulk operations

Call after any bulk change (purge, manual shard edits, imports) to ensure the search index matches live data.

result = mem.rebuild_indexes()
print(f"Indexed {result['memories']} memories, {result['words_indexed']} words")
# → {"memories": 9990, "words_indexed": 5800, "tags": 24}

`wal_flush()` — Force-flush WAL to shard files

Normally the WAL auto-flushes every 50 appends or at 1 MB. Call this explicitly before making backups, running migrations, or reading shard files directly.

flushed = mem.wal_flush()
print(f"Flushed {flushed} pending WAL entries to shards")
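
For readers unfamiliar with write-ahead logs, the sketch below shows the general idea behind threshold-based flushing and replay on startup. It is a simplified illustration, not the package's WAL: the `TinyWAL` class, its file names, and the JSONL shard format are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyWAL:
    """Append-only log that flushes to a shard after N appends or M bytes."""

    def __init__(self, directory: str, max_appends: int = 50,
                 max_bytes: int = 1_000_000):
        self.wal_path = os.path.join(directory, "pending.jsonl")
        self.shard_path = os.path.join(directory, "shard.jsonl")
        self.max_appends = max_appends
        self.max_bytes = max_bytes
        self.pending = self._replay()

    def _replay(self) -> list[dict]:
        # On startup, recover entries that an earlier process never flushed.
        if not os.path.exists(self.wal_path):
            return []
        with open(self.wal_path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def append(self, entry: dict) -> None:
        with open(self.wal_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the append durable before returning
        self.pending.append(entry)
        if (len(self.pending) >= self.max_appends
                or os.path.getsize(self.wal_path) >= self.max_bytes):
            self.flush()

    def flush(self) -> int:
        """Move pending entries to the shard file; return how many moved."""
        flushed = len(self.pending)
        with open(self.shard_path, "a", encoding="utf-8") as f:
            for entry in self.pending:
                f.write(json.dumps(entry) + "\n")
        self.pending.clear()
        open(self.wal_path, "w").close()  # truncate the log
        return flushed


with tempfile.TemporaryDirectory() as workdir:
    wal = TinyWAL(workdir, max_appends=3)
    wal.append({"content": "first"})
    wal.append({"content": "second"})
    recovered = TinyWAL(workdir, max_appends=3)  # simulates a restart
    print(len(recovered.pending))  # 2
    wal.append({"content": "third"})  # third append triggers a flush
    print(len(wal.pending))  # 0
```

A crash between the shard write and the truncation would replay entries that were already flushed, so a production design needs that step to be idempotent (for example by deduplicating on a content hash).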

`wal_inspect()` — Health check without mutating state

status = mem.wal_inspect()
# {
#     "pending_entries": 14,
#     "size_bytes": 8192,
#     "sample": ["content preview 1...", "content preview 2..."]
# }
print(f"WAL pending: {status['pending_entries']} entries ({status['size_bytes']} bytes)")

Typical production maintenance flow

from antaris_memory import MemorySystem

mem = MemorySystem("./workspace")
mem.load()

# 1. Inspect WAL health
status = mem.wal_inspect()
if status["pending_entries"] > 100:
    print(f"WAL has {status['pending_entries']} pending — flushing...")
    mem.wal_flush()

# 2. Purge stale/unwanted data
result = mem.purge(source="pipeline:pipeline_old_session_*")
print(f"Purged {result['total']} stale entries")

# 3. Rebuild indexes after purge
index_result = mem.rebuild_indexes()
print(f"Re-indexed {index_result['memories']} memories")

# 4. Persist
mem.save()

OpenClaw Integration

antaris-memory ships as a native OpenClaw plugin (antaris-memory). Once enabled, the plugin fires automatically before and after each agent turn:

before_agent_start — searches memory for relevant context, injects into agent prompt
agent_end — ingests the turn into persistent memory

openclaw plugins enable antaris-memory

Also ships with an MCP server for any MCP-compatible host:

from antaris_memory import create_mcp_server  # pip install mcp
server = create_mcp_server(workspace="./memory")
server.run()  # MCP tools: memory_search, memory_ingest, memory_consolidate, memory_stats

What It Does

Sharded storage for production scalability (10,000+ memories, sub-second search)
Fast search indexes (full-text, tags, dates) stored as transparent JSON files
Automatic schema migration from single-file to sharded format with rollback
Multi-agent shared memory pools with namespace isolation and access controls
Retrieval weighted by recency × importance × access frequency (Ebbinghaus-inspired decay)
Input gating classifies incoming content by priority (P0–P3) and drops ephemeral noise at intake
Detects contradictions between stored memories using deterministic rule-based comparison
Runs fully offline — zero network calls, zero tokens, zero API keys
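
The "recency × importance × access frequency" weighting can be sketched as follows. This is a hedged illustration only: the exact formula is not given in this README, so the exponential half-life form, the interpretation of `half_life` in days, and the logarithmic access boost are assumptions, and both function names are hypothetical.

```python
import math


def recency(age_days: float, half_life: float = 7.0) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life)


def retrieval_weight(age_days: float, importance: float,
                     access_count: int, half_life: float = 7.0) -> float:
    # Assumption: access frequency enters as a damped logarithmic boost.
    frequency = 1.0 + math.log1p(access_count)
    return recency(age_days, half_life) * importance * frequency


print(recency(7.0))   # 0.5
print(recency(14.0))  # 0.25
print(retrieval_weight(7.0, importance=1.0, access_count=5)
      > retrieval_weight(7.0, importance=1.0, access_count=0))  # True
```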

What It Doesn't Do

Not a vector database — no embeddings by default. Core search uses BM25 keyword ranking. Semantic search requires you to supply an embedding function (set_embedding_fn(fn)) — we never make that call for you.
Not a knowledge graph — flat memory store with metadata indexing. No entity relationships or graph traversal.
Not semantic by default — contradiction detection compares normalized statements using explicit conflict rules, not inference.
Not LLM-dependent — all operations are deterministic. No model calls, no prompt engineering.
Not infinitely scalable — JSON file storage works well up to ~50,000 memories per workspace.

Memory Types

mem.ingest("Deploy: push to main, CI runs, auto-deploy to staging",
           memory_type="procedural")   # High recall boost for how-to queries
mem.ingest_fact("PostgreSQL supports JSONB indexing")   # Semantic memory
mem.ingest_preference("User prefers Python examples")   # Preference memory
mem.ingest_mistake("Forgot to handle connection timeout")  # Mistake memory
mem.ingest_procedure("Run pytest from venv, not global pip")  # Procedure

Type	Use for	Recall boost
`episodic`	Events, decisions, meeting notes	Normal
`semantic`	Facts, concepts, general knowledge	Medium
`procedural`	How-to steps, runbooks	High
`preference`	User preferences, style notes	High
`mistake`	Errors to avoid, lessons learned	High

Hybrid Semantic Search

import openai

def my_embed(text: str) -> list[float]:
    resp = openai.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

mem.set_embedding_fn(my_embed)  # BM25+cosine hybrid activates automatically

# Or use a local model
import ollama
mem.set_embedding_fn(
    lambda text: ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]
)

When no embedding function is set, search uses BM25 only (zero API calls).
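
How a BM25 and cosine blend might work is sketched below. The README does not document the blending formula, so treat this as one plausible scheme: the `alpha` weight, the max-normalization of BM25 scores, and the `hybrid_scores` helper are all assumptions for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_scores(bm25: list[float], query_vec: list[float],
                  doc_vecs: list[list[float]], alpha: float = 0.5) -> list[float]:
    """Blend max-normalized BM25 scores with cosine similarity."""
    top = max(bm25, default=0.0)
    # Assumption: BM25 is scaled to [0, 1] by dividing by the best score.
    lexical = [s / top if top > 0 else 0.0 for s in bm25]
    semantic = [cosine(query_vec, v) for v in doc_vecs]
    return [alpha * lex + (1 - alpha) * sem for lex, sem in zip(lexical, semantic)]


scores = hybrid_scores(
    bm25=[2.0, 1.0, 0.0],
    query_vec=[1.0, 0.0],
    doc_vecs=[[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
)
best = max(range(len(scores)), key=scores.__getitem__)
print(best)  # 1: the second document wins once semantic similarity is blended in
```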

Input Gating (P0–P3)

mem.ingest_with_gating("CRITICAL: API key compromised", source="alerts")
# → P0 (strategic) → stored with confidence 0.9

mem.ingest_with_gating("Decided to switch to PostgreSQL", source="meeting")
# → P1 (operational) → stored

mem.ingest_with_gating("thanks for the update!", source="chat")
# → P3 (ephemeral) → dropped silently

Level	Category	Stored	Examples
P0	Strategic	✅	Security alerts, errors, deadlines
P1	Operational	✅	Decisions, assignments, technical choices
P2	Tactical	✅	Background info, research
P3	—	❌	Greetings, acknowledgments, filler

Classification: keyword and pattern matching — no LLM calls. 0.177ms avg per input.

Note: ingest() and ingest_with_gating() silently drop content shorter than 15 characters. Single-concept memories such as "Use Redis" (9 chars) or "Done" (4 chars) fall below this threshold. Store them with a brief qualifier: "Prefer Redis for caching" (24 chars → stored).
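
A keyword-and-pattern gate of this kind can be approximated in a few lines. The sketch below is illustrative only: the regular expressions, the `classify` and `should_store` helpers, and the default of P2 are assumptions, not the package's actual rule set. Only the 15-character minimum comes from the note above.

```python
import re

MIN_LENGTH = 15  # from the note above: shorter content is dropped

# Assumption: these patterns are illustrative, not the package's rules.
RULES = [
    ("P0", re.compile(r"\b(critical|security|compromised|deadline|error)\b", re.I)),
    ("P1", re.compile(r"\b(decided|assigned|switch(ed)? to|will use)\b", re.I)),
    ("P3", re.compile(r"^(thanks|thank you|ok|okay|hi|hello|got it)\b", re.I)),
]


def classify(text: str) -> str:
    """Return 'P0'..'P3' using ordered pattern rules; default is P2."""
    for level, pattern in RULES:
        if pattern.search(text):
            return level
    return "P2"


def should_store(text: str) -> bool:
    """Apply the length floor, then drop anything classified as P3."""
    text = text.strip()
    return len(text) >= MIN_LENGTH and classify(text) != "P3"


print(classify("CRITICAL: API key compromised"))  # P0
print(should_store("thanks for the update!"))     # False
print(should_store("Use Redis"))                  # False (9 chars)
print(should_store("Prefer Redis for caching"))   # True
```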

Namespace Isolation

from antaris_memory import NamespacedMemory, NamespaceManager

manager = NamespaceManager("./workspace")
agent_a = manager.create_namespace("agent-a")
agent_b = manager.create_namespace("agent-b")

ns = NamespacedMemory("project-alpha", "./workspace")
ns.load()
ns.ingest("Alpha-specific decision")
results = ns.search("decision")

Context Packets (Sub-Agent Injection)

# Single-query context packet
packet = mem.build_context_packet(
    task="Debug the authentication flow",
    tags=["auth", "security"],
    max_memories=10,
    max_tokens=2000,
    include_mistakes=True,
)
print(packet.render("markdown"))  # → structured markdown for prompt injection

# Multi-query with deduplication
packet = mem.build_context_packet_multi(
    task="Fix performance issues",
    queries=["database bottleneck", "slow queries", "caching strategy"],
    max_tokens=3000,
)
packet.trim(max_tokens=1500)
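
Token budgeting of the kind `max_memories` and `max_tokens` imply can be sketched as a greedy packer. This is a rough illustration, not how the package builds packets: `estimate_tokens` uses an assumed four-characters-per-token heuristic, and `pack_memories` is a hypothetical helper.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; use a real tokenizer if needed.
    return max(1, len(text) // 4)


def pack_memories(scored: list[tuple[float, str]], max_memories: int,
                  max_tokens: int) -> list[str]:
    """Greedily take the highest-scoring memories that fit the budget."""
    packed: list[str] = []
    used = 0
    for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if len(packed) == max_memories:
            break
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            continue  # too large; a shorter, lower-ranked memory may still fit
        packed.append(text)
        used += cost
    return packed


scored = [(0.9, "a" * 40), (0.8, "b" * 80), (0.7, "c" * 8)]
picked = pack_memories(scored, max_memories=10, max_tokens=15)
print([len(text) for text in picked])  # [40, 8]
```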

Selective Forgetting (GDPR-ready)

audit = mem.forget(entity="John Doe")       # Remove by entity
audit = mem.forget(topic="project alpha")   # Remove by topic
audit = mem.forget(before_date="2025-01-01")  # Remove old entries
# Audit trail appended to memory_audit.jsonl (memory_audit.json before v2.4.0)
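
The v2.4.0 notes describe the audit log as an append-only JSONL file. The sketch below shows what date-based forgetting with such a log could look like. It is an assumption-laden illustration: `forget_before`, the `created` field, and the audit record layout are hypothetical and not taken from the package.

```python
import json
import os
import tempfile
from datetime import date, datetime, timezone


def forget_before(entries: list[dict], cutoff: str, audit_path: str) -> list[dict]:
    """Drop entries created before `cutoff` (ISO date) and log the deletion."""
    limit = date.fromisoformat(cutoff)
    kept = [e for e in entries if date.fromisoformat(e["created"]) >= limit]
    record = {
        "operation": "forget",
        "before_date": cutoff,
        "count": len(entries) - len(kept),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # One JSON object per line: appending never rewrites earlier audit records.
    with open(audit_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return kept


with tempfile.TemporaryDirectory() as workdir:
    audit_file = os.path.join(workdir, "memory_audit.jsonl")
    entries = [
        {"content": "old note", "created": "2024-12-31"},
        {"content": "new note", "created": "2025-03-01"},
    ]
    kept = forget_before(entries, "2025-01-01", audit_file)
    with open(audit_file, encoding="utf-8") as f:
        audit_lines = f.read().splitlines()

print(len(kept), len(audit_lines))  # 1 1
```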

Shared Memory Pools

from antaris_memory import SharedMemoryPool, AgentPermission

pool = SharedMemoryPool("./shared", pool_name="team-alpha")
pool.grant("agent-1", AgentPermission.READ_WRITE)
pool.grant("agent-2", AgentPermission.READ_ONLY)

mem_1 = pool.open("agent-1")
mem_1.ingest("Deployed new API endpoint")

mem_2 = pool.open("agent-2")
results = mem_2.search("API deployment")

Concurrency

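import json

from antaris_memory import FileLock, VersionTracker

shard_path = "/path/to/shard.json"

# Exclusive write access (atomic on all platforms including network filesystems)
with FileLock(shard_path, timeout=10.0):
    with open(shard_path, encoding="utf-8") as f:
        data = json.load(f)   # read the shard while holding the lock
    # ... modify data here ...
    with open(shard_path, "w", encoding="utf-8") as f:
        json.dump(data, f)    # write back before the lock is released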
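
For intuition about how an exclusive file lock can be built from the standard library, here is a minimal sketch based on atomic lock-file creation. It is not the package's `FileLock`: the `lock_file` helper and the `.lock` suffix are hypothetical, there is no stale-lock recovery, and it is only meant for a local filesystem.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def lock_file(target: str, timeout: float = 10.0, poll: float = 0.05):
    """Hold an exclusive `<target>.lock` file for the duration of the block."""
    lock_path = target + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # caller can create the lock file at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {target}")
            time.sleep(poll)
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield lock_path
    finally:
        os.close(fd)
        os.remove(lock_path)


workdir = tempfile.mkdtemp()
shard = os.path.join(workdir, "shard.json")
with lock_file(shard, timeout=1.0):
    with open(shard, "w", encoding="utf-8") as f:
        f.write("{}")
print(os.path.exists(shard + ".lock"))  # False: released on exit
```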

Storage Format

workspace/
├── shards/
│   ├── 2026-02-strategic.json
│   ├── 2026-02-operational.json
│   └── 2026-01-tactical.json
├── indexes/
│   ├── search_index.json
│   ├── tag_index.json
│   └── date_index.json
├── .wal/
│   └── pending.jsonl          # Write-ahead log (auto-managed)
├── access_counts.json          # Access-frequency tracker
├── migrations/history.json
└── memory_audit.jsonl          # Deletion audit trail (GDPR), one JSON object per line

Plain JSON and JSON Lines files. Inspect or edit with any text editor.
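
The shard file names in the tree follow a `YYYY-MM-<category>.json` pattern. A hypothetical helper that derives such a path might look like this (`shard_path` is an illustrative name, not a function from the package):

```python
from datetime import date
from pathlib import Path


def shard_path(workspace: str, when: date, category: str) -> Path:
    """Build `<workspace>/shards/YYYY-MM-<category>.json`."""
    return Path(workspace) / "shards" / f"{when:%Y-%m}-{category}.json"


print(shard_path("workspace", date(2026, 2, 19), "strategic").as_posix())
# workspace/shards/2026-02-strategic.json
```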

Architecture

MemorySystem (v2.1)
├── ShardManager         — Date/topic sharding
├── IndexManager         — Full-text, tag, and date indexes
│   ├── SearchIndex      — BM25 inverted index
│   ├── TagIndex         — Tag → hash mapping
│   └── DateIndex        — Date range queries
├── SearchEngine         — BM25 + optional cosine hybrid
├── WALManager           — Write-ahead log (crash-safe ingestion)
├── ReadCache            — LRU search result cache
├── AccessTracker        — Per-entry access-count boosting
├── PerformanceMonitor   — Timing/counter stats
├── MigrationManager     — Schema versioning with rollback
├── InputGate            — P0-P3 classification at intake
├── DecayEngine          — Ebbinghaus forgetting curves
├── ConsolidationEngine  — Dedup, clustering, contradiction detection
├── ForgettingEngine     — Selective deletion with audit
├── SharedMemoryPool     — Multi-agent coordination
├── NamespaceManager     — Multi-tenant isolation
└── ContextPacketBuilder — Sub-agent context injection

Benchmarks

Measured on Apple M4, Python 3.14.

Memories	Ingest	Search (avg)	Search (p99)	Consolidate	Disk
100	5.3ms (0.053ms/entry)	0.40ms	0.65ms	4.2ms	117KB
500	16.8ms (0.034ms/entry)	1.70ms	2.51ms	84.3ms	575KB
1,000	33.2ms (0.033ms/entry)	3.43ms	5.14ms	343.3ms	1.1MB
5,000	173.7ms (0.035ms/entry)	17.10ms	25.70ms	4.3s	5.6MB

Input gating classification: 0.177ms avg per input.
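
To reproduce average and p99 figures like these on your own hardware, a simple timing harness is enough. The sketch below is not the script used for the table above; `measure` is a hypothetical helper, and the list-comprehension "search" is a stand-in workload.

```python
import statistics
import time


def measure(fn, runs: int = 200) -> dict:
    """Time `fn()` repeatedly and report avg and p99 latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p99_index = min(len(samples) - 1, int(0.99 * len(samples)))
    return {"avg_ms": statistics.fmean(samples), "p99_ms": samples[p99_index]}


corpus = [f"memory number {i} about databases" for i in range(1000)]
stats = measure(lambda: [m for m in corpus if "databases" in m])
print(f"avg {stats['avg_ms']:.3f} ms, p99 {stats['p99_ms']:.3f} ms")
```

Replace the lambda with a call to your own search to compare against the table.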

Large Corpus Management

antaris-memory can ingest 1M+ items using bulk_ingest() (11,600 items/s on M4 hardware). At runtime, however, a safety limit caps the active in-memory set at 20,000 entries by default. This is a deliberate design choice: searching across millions of live entries would require a different index architecture (approximate nearest-neighbour, etc.).

What this means in practice:

# This completes in ~86s for 1M items:
mem.bulk_ingest(corpus_generator())  # all entries written to shards on disk

# On the next load(), only the 20K highest-scoring entries are loaded into RAM:
mem.load()  # emits a UserWarning if the corpus exceeds the limit

A UserWarning is emitted when the limit is hit so you won't miss it in logs.

Working with large corpora:

# Compact the corpus: dedup + consolidate, then trim to high-value entries
mem.compact()

# Archive old shards to keep the active set small
# (shards are plain JSON — archive to S3, local disk, etc.)

# Raise the limit explicitly (advanced — ensure you have enough RAM):
# Set _LOAD_LIMIT in core_v4._load_sharded() or subclass MemorySystemV4.

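Rule of thumb: search latency grows roughly linearly with the active set. The benchmarks above measure about 3.4ms average at 1,000 memories and 17ms at 5,000, so a linear extrapolation suggests tens of milliseconds near the default 20,000-entry limit. At 1M loaded entries (with a raised limit), p50 search is ~2.4s; plan accordingly.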
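
The load-time cap described in this section amounts to a top-N selection with a warning. A minimal sketch follows, assuming each entry carries a precomputed `score` field. That field and the `select_active` helper are hypothetical and not part of the package.

```python
import heapq
import warnings

LOAD_LIMIT = 20_000  # default cap described in this section


def select_active(entries: list[dict], limit: int = LOAD_LIMIT) -> list[dict]:
    """Keep only the `limit` highest-scoring entries, warning if any are dropped."""
    if len(entries) <= limit:
        return list(entries)
    warnings.warn(
        f"corpus has {len(entries)} entries; loading top {limit}",
        UserWarning,
        stacklevel=2,
    )
    # nlargest avoids sorting the full corpus when limit << len(entries).
    return heapq.nlargest(limit, entries, key=lambda e: e["score"])


entries = [{"id": i, "score": i / 10} for i in range(10)]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    top = select_active(entries, limit=3)
print([e["id"] for e in top], len(caught))  # [9, 8, 7] 1
```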

MCP Server

from antaris_memory import create_mcp_server  # pip install mcp

server = create_mcp_server("./workspace")
server.run()  # Stdio transport — connect from Claude Desktop, Cursor, etc.

MCP tools exposed: memory_search, memory_ingest, memory_consolidate, memory_stats.

Running Tests

git clone https://github.com/Antaris-Analytics/antaris-memory.git
cd antaris-memory
python -m pytest tests/ -v

All 293 tests pass with zero external dependencies.

Migrating from v2.0.0

No breaking changes. The new purge(), rebuild_indexes(), wal_flush(), and wal_inspect() methods are additive. Existing workspaces load automatically — no migration required.

pip install --upgrade antaris-memory

Migrating from v1.x

# Existing workspaces load automatically — no changes required
mem = MemorySystem("./existing_workspace")
mem.load()  # Auto-detects format, migrates if needed

Zero Dependencies (Core)

The core package uses only the Python standard library. Optional extras:

pip install mcp — enables create_mcp_server()
Supply your own embedding function to set_embedding_fn() — any callable returning list[float] works (OpenAI, Ollama, sentence-transformers, etc.)

Part of the Antaris Analytics Suite

antaris-memory — Persistent memory for AI agents (this package)
antaris-router — Adaptive model routing with SLA enforcement
antaris-guard — Security and prompt injection detection
antaris-context — Context window optimization
antaris-pipeline — Agent orchestration pipeline

License

Apache 2.0 — see LICENSE for details.

Built with ❤️ by Antaris Analytics
Deterministic infrastructure for AI agents
