File-based persistent memory for AI agents. Zero dependencies.
Antaris Memory
Production-ready file-based persistent memory for AI agents. Zero dependencies (core).
Store, search, decay, and consolidate agent memories using only the Python standard library. Sharded storage for scalability, fast search indexes, automatic schema migration. No vector databases, no infrastructure, no API keys.
What's New in v1.0.0
- BM25-inspired search — proper relevance ranking with IDF weighting. No more wall of 0.50 scores.
- File locking: cross-platform os.mkdir()-based locks prevent data loss from concurrent writers
- Optimistic conflict detection: mtime/hash tracking catches stale read-modify-write patterns
- 78 tests — comprehensive coverage across search, locking, versioning, and all core features
See CHANGELOG.md for full version history.
What It Does
- Sharded storage for production scalability (10,000+ memories, sub-second search)
- Fast search indexes (full-text, tags, dates) stored as transparent JSON files
- Automatic schema migration from single-file to sharded format with rollback
- Multi-agent shared memory pools with namespace isolation and access controls
- Retrieval weighted by recency × importance × access frequency (Ebbinghaus-inspired decay)
- Input gating classifies incoming content by priority (P0–P3) and drops ephemeral noise at intake
- Detects contradictions between stored memories using deterministic rule-based comparison
- Runs fully offline — zero network calls, zero tokens, zero API keys
What It Doesn't Do
- Not a vector database — no embeddings. Search uses TF-IDF-style keyword matching on an inverted index, not semantic similarity. If you need "find memories similar in meaning," this isn't the right tool yet.
- Not a knowledge graph — flat memory store with metadata indexing. No entity relationships or graph traversal.
- Not semantic — contradiction detection compares normalized statements using explicit conflict rules (negation, numeric disagreement), not inference. It will not catch contradictions phrased differently.
- Not LLM-dependent — all operations are deterministic. No model calls, no prompt engineering.
- Not infinitely scalable — JSON file storage works well up to ~50,000 memories per workspace. Beyond that, you'll want a database. We're honest about this because we'd rather you succeed than discover limits in production.
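To make the keyword-versus-semantic distinction concrete, here is a minimal sketch of IDF-weighted ranking over an inverted index. It uses the textbook BM25 formula with common default parameters (k1=1.5, b=0.75); the function names and parameters are illustrative assumptions, not the library's internals.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to {doc_id: term_frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index

def search(query, docs, index, k1=1.5, b=0.75):
    """Rank documents with the textbook BM25 formula (illustrative parameters)."""
    n_docs = len(docs)
    lengths = {d: len(tokenize(t)) for d, t in docs.items()}
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        # Rare terms get a higher IDF weight than common ones.
        idf = math.log(1 + (n_docs - len(postings) + 0.5) / (len(postings) + 0.5))
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

docs = {
    "m1": "Decided to use PostgreSQL for the database.",
    "m2": "The API costs $500/month, too expensive.",
    "m3": "Database migration scheduled for next sprint.",
}
index = build_index(docs)
results = search("database decision", docs, index)
```

In this sketch the query term "decision" matches nothing, because "Decided" is a different token. Only the literal term "database" contributes to the ranking, which is the kind of miss a keyword index accepts in exchange for determinism.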
Design Goals
| Goal | Rationale |
|---|---|
| Deterministic | Same input → same output. No model variance. |
| Offline | No network, no API keys, no phoning home. |
| Minimal surface area | One class (MemorySystem), obvious method names. |
| No hidden processes | Consolidation and synthesis run only when called. |
| Transparent storage | Plain JSON files. Inspect with any text editor. |
Install
pip install antaris-memory
Quick Start
from antaris_memory import MemorySystem
mem = MemorySystem("./workspace", half_life=7.0)
mem.load() # Load existing state (no-op if first run)
# Store memories
mem.ingest("Decided to use PostgreSQL for the database.",
source="meeting-notes", category="strategic")
mem.ingest("The API costs $500/month — too expensive.",
source="review", category="operational")
# Search (results ranked by relevance × decay score)
for r in mem.search("database decision"):
    print(f"[{r.confidence:.1f}] {r.content}")
# Temporal queries
mem.on_date("2026-02-14")
mem.narrative(topic="database migration")
# Selective deletion
mem.forget(entity="John Doe") # GDPR-ready, with audit trail
mem.forget(before_date="2025-01-01")
# Background consolidation
report = mem.consolidate()
mem.save()
More examples in the examples/ directory:
- quickstart.py: basic usage
- openclaw_integration.py: OpenClaw agent integration
- langchain_integration.py: LangChain memory backend
Input Gating (P0–P3)
Input gating classifies content at intake by priority level. Low-value data (greetings, filler, acknowledgments) never enters storage, keeping memory clean without manual curation.
mem.ingest_with_gating("CRITICAL: API key compromised", source="alerts")
# → P0 (critical) → stored in strategic tier
mem.ingest_with_gating("Decided to switch to PostgreSQL", source="meeting")
# → P1 (operational) → stored in operational tier
mem.ingest_with_gating("thanks for the update!", source="chat")
# → P3 (ephemeral) → dropped, not stored
| Level | Category | Stored | Examples |
|---|---|---|---|
| P0 | Strategic | ✅ | Security alerts, errors, deadlines, financial commitments |
| P1 | Operational | ✅ | Decisions, assignments, technical choices |
| P2 | Tactical | ✅ | Background info, research, general discussion |
| P3 | — | ❌ | Greetings, acknowledgments, filler |
Classification uses keyword and pattern matching — no LLM calls. It's fast (0.177ms avg) but not perfect. Edge cases exist; when in doubt, it errs toward storing.
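As a rough illustration of keyword and pattern classification, the sketch below assigns a priority level with a few regular expressions. The patterns, the six-word cutoff, and the function names are hypothetical; the library's actual rules are not reproduced here.

```python
import re

# Hypothetical keyword patterns, chosen only to illustrate the idea.
P0_PATTERNS = [r"\bcritical\b", r"\bcompromised\b", r"\bdeadline\b", r"\berror\b"]
P1_PATTERNS = [r"\bdecided\b", r"\bassigned\b", r"\bswitch(?:ed)? to\b"]
P3_OPENERS = re.compile(r"^(thanks|thank you|ok|okay|hi|hello|got it)\b")

def classify(text):
    """Return 'P0'..'P3' for a piece of incoming content."""
    lowered = text.strip().lower()
    # Higher priorities are checked first, so a greeting that also
    # mentions a deadline is still stored.
    if any(re.search(p, lowered) for p in P0_PATTERNS):
        return "P0"
    if any(re.search(p, lowered) for p in P1_PATTERNS):
        return "P1"
    # Only short messages that open with filler are treated as ephemeral.
    if P3_OPENERS.match(lowered) and len(lowered.split()) <= 6:
        return "P3"
    return "P2"  # unknown content errs toward storing

def should_store(text):
    """P3 content is dropped at intake; everything else is kept."""
    return classify(text) != "P3"
```

Falling through to P2 mirrors the stated bias toward storing when the classifier is unsure.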
Knowledge Synthesis
Knowledge synthesis identifies gaps in stored knowledge and integrates new research. It scans existing memories for topics mentioned frequently but lacking detail, then suggests targeted research.
# What does the agent not know enough about?
suggestions = mem.research_suggestions(limit=5)
# → [{"topic": "token optimization", "reason": "mentioned 3x, no details", "priority": "P1"}, ...]
# Integrate external findings
report = mem.synthesize(research_results={
"token optimization": "Context window management techniques..."
})
Memory Decay
Memories fade over time unless reinforced by access:
score = importance × 2^(-age / half_life) + reinforcement
- Fresh memories score high
- Unused memories decay toward zero
- Accessed memories are automatically reinforced (each search hit boosts the score)
- Below-threshold memories are candidates for compression
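The formula above can be evaluated directly. This sketch assumes age and half_life share the same unit (days here, which the document does not state explicitly) and treats reinforcement as a plain additive bonus; the function name is illustrative.

```python
def decay_score(importance, age_days, half_life=7.0, reinforcement=0.0):
    """score = importance * 2^(-age / half_life) + reinforcement"""
    return importance * 2 ** (-age_days / half_life) + reinforcement

fresh = decay_score(1.0, age_days=0.0)        # 1.0: no decay yet
one_half_life = decay_score(1.0, age_days=7.0)    # 0.5: halved after one half-life
two_half_lives = decay_score(1.0, age_days=14.0)  # 0.25: halved again
reinforced = decay_score(1.0, age_days=14.0, reinforcement=0.1)
```

With a 7-day half-life, an unreinforced memory of importance 1.0 drops to 0.5 after one week and 0.25 after two; the reinforcement term is what keeps frequently accessed memories above the compression threshold.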
Consolidation
Run periodically to maintain memory health:
report = mem.consolidate()
Sample output (10 memories, 3 topic clusters):
{
  "timestamp": "2026-02-16T02:23:58",
  "total": 10,
  "active": 10,
  "archive_candidates": 0,
  "duplicates": 0,
  "clusters": 3,
  "contradictions": 0,
  "top_clusters": {
    "postgresql": ["4d8c1f76", "9178bfd3"],
    "cost": ["a0811e1b", "5b42672b"],
    "$500": ["a0811e1b", "5b42672b"]
  }
}
- Finds and merges near-duplicate memories (e.g., "Chose PostgreSQL" and "PostgreSQL selected as database")
- Discovers topic clusters (memories that reference the same subjects)
- Flags contradictions (e.g., "API costs are reasonable" vs "API costs too much" — when phrased with explicit negation)
- Suggests memories for archival (old, low-importance, rarely accessed)
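To show what a deterministic negation rule can and cannot catch, here is a minimal sketch. It assumes a deliberately strict rule: two statements conflict only if they are identical after removing negation words and differ in negation parity. The word list and function names are illustrative, not the library's rules.

```python
import re

# Illustrative negation words; a real rule set would likely be larger.
NEGATIONS = {"not", "no", "never"}

def normalize(statement):
    """Return (core tokens without negations, whether the statement is negated)."""
    tokens = re.findall(r"[a-z0-9]+", statement.lower())
    negated = sum(t in NEGATIONS for t in tokens) % 2 == 1
    core = tuple(t for t in tokens if t not in NEGATIONS)
    return core, negated

def contradicts(a, b):
    """True when both statements share a core but differ in negation."""
    core_a, neg_a = normalize(a)
    core_b, neg_b = normalize(b)
    return core_a == core_b and neg_a != neg_b
```

Under this strict rule, "API costs are reasonable" conflicts with "API costs are not reasonable", while a differently phrased statement such as "API costs too much" is not flagged.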
Concurrency
Multiple processes can safely read and write to the same memory workspace.
File Locking
from antaris_memory import FileLock
# Exclusive access to a resource
with FileLock("/path/to/shard.json", timeout=10.0):
    data = load(shard)
    modify(data)
    save(shard, data)
# Non-blocking try
lock = FileLock("/path/to/shard.json")
if lock.acquire(blocking=False):
    try:
        ...
    finally:
        lock.release()
Locks use os.mkdir() — atomic on all platforms, works on network filesystems, zero dependencies. Stale locks from crashed processes are automatically detected and broken (by age or dead PID).
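The idea behind a directory-based lock can be sketched in a few lines. This is not the library's FileLock: the dir_lock name, the ".lock" suffix, and the polling interval are assumptions, and the stale-lock detection described above is omitted for brevity.

```python
import os
import time
from contextlib import contextmanager

@contextmanager
def dir_lock(path, timeout=10.0, poll=0.05):
    """Hold an exclusive lock by creating '<path>.lock' as a directory."""
    lock_dir = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # mkdir either creates the directory or fails if it exists,
            # so only one caller can succeed at a time.
            os.mkdir(lock_dir)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_dir}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release the lock so waiting callers can proceed.
        os.rmdir(lock_dir)
```

A caller would wrap its read-modify-write cycle in `with dir_lock("/path/to/shard.json"):`. A process that crashes while holding the lock leaves the directory behind, which is why a production lock needs the age or PID checks mentioned above.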
Optimistic Conflict Detection
For read-heavy workloads where locking overhead isn't worth it:
from antaris_memory import VersionTracker
tracker = VersionTracker()
# Snapshot before reading
data_path = "/path/to/data.json"
version = tracker.snapshot(data_path)
data = load(data_path)
modify(data)
# Check before writing: raises ConflictError if another process modified the file
tracker.check(version)
save(data_path, data)
# Or use the retry helper:
tracker.safe_update("/path/to/data.json", lambda d: {**d, "count": d["count"] + 1})
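For readers curious how mtime/hash tracking can detect a stale read, the sketch below records a content hash and modification time, then re-checks before writing. The function names and the ConflictDetected exception are hypothetical stand-ins, not the VersionTracker API.

```python
import hashlib
import os

class ConflictDetected(Exception):
    """Raised when a file changed between snapshot and check."""

def snapshot(path):
    """Record the file's modification time and content hash."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {"path": path, "mtime_ns": os.stat(path).st_mtime_ns, "sha256": digest}

def check(version):
    """Raise ConflictDetected if the content no longer matches the snapshot."""
    current = snapshot(version["path"])
    # The hash is authoritative. The recorded mtime could serve as a cheap
    # fast path on filesystems with fine-grained timestamps.
    if current["sha256"] != version["sha256"]:
        raise ConflictDetected(f"{version['path']} was modified by another writer")
```

On a conflict, a caller would typically reload the file, reapply its change, and try again, which is the pattern a retry helper automates.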
Safety Stack
All JSON writes use atomic_write_json() which combines:
- Atomic writes (tmpfile → fsync → os.replace) — prevents torn files
- File locks (os.mkdir) — prevents lost updates from concurrent writers
- Directory fsync (POSIX) — crash-consistent renames
To opt out of locking for single-process workloads: atomic_write_json(path, data, lock=False).
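The tmpfile, fsync, os.replace sequence listed above can be sketched with the standard library alone. The function name below is illustrative and the locking step is left out; it shows only the atomic-write and directory-fsync parts under those assumptions.

```python
import json
import os
import tempfile

def atomic_write_json_sketch(path, data):
    """Write JSON so readers see either the old file or the new one, never a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # force file contents to disk before the rename
        os.replace(tmp_path, path)  # atomic swap of the directory entry
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave a stray temp file behind
        raise
    if os.name == "posix":
        # Sync the directory so the rename itself survives a crash.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Writing the temp file next to the target matters: os.replace is only atomic when source and destination live on the same filesystem.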
Benchmarks
Measured on Apple M4, Python 3.14 (beta). Results on Python 3.9–3.13 will be comparable — no version-specific optimizations are used. Reproducible via scripts/ollama_benchmark.py.
| Memories | Ingest | Search (avg) | Search (p99) | Consolidate | Disk |
|---|---|---|---|---|---|
| 100 | 1.0ms (0.010ms/entry) | 0.35ms | 0.40ms | 2.6ms | 46KB |
| 500 | 4.4ms (0.009ms/entry) | 1.55ms | 1.83ms | 51ms | 230KB |
| 1,000 | 7.1ms (0.007ms/entry) | 2.69ms | 3.20ms | 195ms | 460KB |
| 5,000 | 36.8ms (0.007ms/entry) | 13.8ms | 16.2ms | 354ms | 2.3MB |
Input gating (P0–P3 classification): 0.177ms avg per input.
Scaling notes: JSON file storage is practical up to ~50,000 memories per workspace. At that scale, expect ~50-100ms search and ~50MB on disk. Beyond that, consider sharding across multiple workspaces or migrating to a database. We chose this limit deliberately — most agent workloads generate hundreds to low thousands of memories, not millions.
Storage Format
v0.4 (sharded) — memories are split across multiple files by date and topic:
workspace/
├── shards/
│ ├── 2026-02-strategic.json # Strategic memories from Feb 2026
│ ├── 2026-02-operational.json # Operational memories from Feb 2026
│ └── 2026-01-tactical.json # Tactical memories from Jan 2026
├── indexes/
│ ├── search_index.json # Full-text inverted index
│ ├── tag_index.json # Tag → memory hash lookup
│ └── date_index.json # Date range index
├── migrations/
│ └── history.json # Applied migration log
└── memory_audit.json # Deletion audit trail (GDPR)
Each shard is a plain JSON file containing an array of memory entries:
{
  "hash": "a1b2c3d4e5f6",
  "content": "Decided to use PostgreSQL",
  "source": "meeting-notes",
  "category": "strategic",
  "created": "2026-02-15T10:00:00",
  "importance": 1.0,
  "confidence": 0.8,
  "sentiment": {"strategic": 0.6},
  "tags": ["postgresql", "deployment"]
}
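Because shards are plain JSON, they can be inspected without the library. The sketch below assumes the layout shown above (a shards/ directory whose files each hold a top-level array of entries); the helper names are illustrative.

```python
import json
from pathlib import Path

def iter_memories(workspace):
    """Yield (shard filename, entry) for every entry in every shard file."""
    for shard in sorted(Path(workspace, "shards").glob("*.json")):
        with open(shard, encoding="utf-8") as f:
            for entry in json.load(f):
                yield shard.name, entry

def by_category(workspace, category):
    """Collect all entries whose 'category' field matches."""
    return [e for _, e in iter_memories(workspace) if e.get("category") == category]
```

A call such as `by_category("./workspace", "strategic")` would return the strategic entries across all months. This kind of read-only scan is handy for debugging or for exporting data when migrating to a database.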
v0.2/v0.3 (legacy) — single memory_metadata.json file. Automatically migrated to sharded format on first v0.4 load, with backup and rollback support.
Storage format may evolve between versions. Breaking changes will increment MAJOR version. See CHANGELOG.
Architecture
MemorySystem (v1.0)
├── ShardManager — Distributes memories across date/topic shards
├── IndexManager — Full-text, tag, and date indexes for fast lookup
│ ├── SearchIndex — Inverted index for text search
│ ├── TagIndex — Tag → memory hash mapping
│ └── DateIndex — Date range queries
├── MigrationManager — Schema versioning with backup and rollback
├── SearchEngine — BM25-inspired ranking with IDF, phrase boost, field boost
├── FileLock — Cross-platform directory-based file locking
├── VersionTracker — Optimistic conflict detection (mtime/hash)
├── InputGate — P0-P3 classification at intake
├── DecayEngine — Ebbinghaus forgetting curves
├── SentimentTagger — Rule-based keyword tone tagging
├── TemporalEngine — Date queries and narrative building
├── ConfidenceEngine — Reliability scoring
├── CompressionEngine — Old file summarization
├── ForgettingEngine — Selective deletion with audit
├── ConsolidationEngine — Dedup, clustering, contradiction detection
└── KnowledgeSynthesizer — Gap identification and research integration
Data flow: ingest → classify (P0-P3) → normalize → shard-route → index → persist → search (index lookup) → decay-weight → return
Works With Local Models (Ollama)
All memory operations are local and deterministic — no tokens consumed, no API calls. Pair with Ollama for a fully local agent stack at zero marginal cost.
mem = MemorySystem("./workspace")
mem.load()
mem.ingest_with_gating("Meeting notes from standup", source="daily")
results = mem.search("standup decisions")
On a Mac Mini (32GB) running Ollama for inference and antaris-memory for persistence, your entire agent stack runs locally. On a Mac Studio (256GB), you can run 70B+ models alongside thousands of indexed memories, with lookups in the low milliseconds (see Benchmarks).
Running Tests
git clone https://github.com/Antaris-Analytics/antaris-memory.git
cd antaris-memory
python -m pytest tests/ -v
All 78 tests pass with zero external dependencies. No test fixtures, no mocking libraries, no network access.
Zero Dependencies (Core)
The core package uses only the Python standard library — no install-time dependencies. Optional extras (pip install antaris-memory[embeddings]) add integration points but are never required. All core operations (ingest, search, decay, consolidation) are fully deterministic with no external calls.
Comparison
| Feature | Antaris Memory | LangChain Memory | Mem0 | Zep |
|---|---|---|---|---|
| Input gating | ✅ P0-P3 | ❌ | ❌ | ❌ |
| Knowledge synthesis | ✅ Gap detection | ❌ | ❌ | ❌ |
| No database required | ✅ | ❌ | ❌ | ❌ |
| Memory decay | ✅ Ebbinghaus | ❌ | ❌ | ⚠️ Temporal graphs |
| Tone tagging | ✅ Rule-based keywords | ❌ | ❌ | ✅ NLP |
| Temporal queries | ✅ | ❌ | ❌ | ✅ |
| Contradiction detection | ✅ Rule-based | ❌ | ❌ | ⚠️ Fact evolution |
| Selective forgetting | ✅ With audit | ❌ | ⚠️ Invalidation | ⚠️ Invalidation |
| Infrastructure needed | None | Redis/PG | Vector + KV + Graph | PostgreSQL + Vector |
Honest caveat: LangChain, Mem0, and Zep offer features we don't — embeddings-based semantic search, graph relationships, real-time sync. They require more infrastructure but may be the right choice if you need those capabilities. Antaris Memory is for teams that want a simple, transparent, offline-first memory primitive.
Part of the Antaris Analytics Suite
- antaris-memory — Persistent memory for AI agents (this package)
- antaris-router — Adaptive model routing with outcome learning
- antaris-guard — Security and prompt injection detection (coming soon)
- antaris-context — Context window optimization (coming soon)
License
Licensed under the Apache License 2.0. See LICENSE for details.