
File-based persistent memory for AI agents. Zero dependencies.


🧠 antaris-memory

Persistent, intelligent memory for AI agents. The flagship package of the Antaris Analytics suite.

PyPI · Python 3.8+ · Zero dependencies · License: MIT


What Is This?

AI agents are stateless by default. Every spawn is a cold start with no knowledge of what happened before. antaris-memory solves this by giving agents a persistent, searchable, intelligent memory store that:

  • Remembers what happened across sessions, spawns, and agent restarts
  • Retrieves the right memories when they're needed, using an 11-layer search engine
  • Decays old memories gracefully so signal-to-noise stays high
  • Learns from mistakes, facts, and procedures with specialized memory types
  • Shares knowledge across multi-agent teams
  • Enriches itself via LLM hooks to dramatically improve recall

This is not a vector database wrapper. It is a zero-dependency, pure-Python, file-backed memory system designed from first principles for agentic workloads.


⚡ Quick Start

pip install antaris-memory
from antaris_memory import MemorySystem

mem = MemorySystem(workspace="./memory", agent_name="my-agent")
mem.load()

# Store a memory
mem.ingest("Deployed v2.3.1 to production at 14:32 UTC. All checks green.",
           source="deploy-log", session_id="session-123", channel_id="ops-channel")

# Cross-session recall - finds memories from other sessions
results = mem.search("production deployment", crossSessionRecall="semantic")
for r in results:
    print(f"[{r.session_id}] {r.content}")

mem.save()

That's it. No API keys required, no external services, no configuration files.


📦 Installation

pip install antaris-memory

Version: 5.0.1
Requirements: Python 3.8+ · Zero external dependencies · stdlib only


๐Ÿ—บ๏ธ Feature Matrix

| Feature | Available | Version |
|---------|-----------|---------|
| Core ingestion & search | ✅ | v1.0 |
| Memory types (episodic/fact/mistake/procedure/preference) | ✅ | v1.0 |
| Temporal decay | ✅ | v1.0 |
| Export / Import | ✅ | v4.2 |
| GCS cloud backend | ✅ | v4.2 |
| Web & data file ingestion | ✅ | v4.7 |
| Tiered storage (hot/warm/cold) | ✅ | v4.7 |
| LLM enrichment hooks | ✅ | v4.6.5 |
| 11-layer search architecture | ✅ | v4.x |
| Graph intelligence (entity/relationship) | ✅ | v4.8/v4.9 |
| Shared / team memory pools | ✅ | v4.8 |
| Context packets (cold-spawn solver) | ✅ | v1.1 |
| MCP server | ✅ | v4.9 |
| Hybrid BM25 + semantic embedding search | ✅ | v4.x |
| Co-occurrence / PPMI semantic tier | ✅ | v4.x |
| Input gating (P0–P3 priority) | ✅ | v4.x |
| Cross-session memory recall | ✅ | v5.0.1 |
| Auto memory type classification | ✅ | v5.0.1 |
| Session/channel provenance | ✅ | v5.0.1 |
| doc2query (search query generation) | ✅ | v5.0.2 |
| Recovery system | ✅ | v3.3 |
| CLI tooling | ✅ | v4.x |

📖 Table of Contents

  1. Core API
  2. Memory Types
  3. Ingestion Methods
  4. Search & Retrieval
  5. 11-Layer Search Architecture
  6. Tiered Storage
  7. LLM Enrichment
  8. Graph Intelligence
  9. Context Packets
  10. Shared / Team Memory Pools
  11. MCP Server
  12. GCS Backend
  13. Export & Import
  14. Recovery System
  15. Co-occurrence / PPMI Semantic Tier
  16. Input Gating
  17. Hybrid Semantic Search
  18. Maintenance & Operations
  19. Stats & Health
  20. CLI Reference
  21. Full API Reference

🔧 Core API

Constructor

from antaris_memory import MemorySystem

mem = MemorySystem(
    workspace="./memory",        # Root directory for all memory files (required)
    agent_name="my-agent",       # REQUIRED: agent scoping, omitting triggers UserWarning
    half_life=7.0,               # Decay half-life in days (default: 7.0)
    tag_terms=None,              # Custom auto-tag terms (list of strings)
    use_sharding=True,           # Enterprise sharding for large stores
    use_indexing=True,           # Pre-built search indexes (faster queries)
    enable_read_cache=True,      # LRU in-memory cache
    cache_max_entries=1000,      # Max LRU cache entries
    enricher=None,               # LLM enrichment callable (see LLM Enrichment)
    tiered_storage=True,         # Hot/warm/cold tier management
    graph_intelligence=True,     # Entity extraction + knowledge graph
    quality_routing=True,        # Follow-up pattern detection
    semantic_expansion=True,     # Word embedding query expansion
)

Why agent_name matters: Each agent gets a scoped memory namespace. Without it, memories from different agents bleed together, and per-agent search filtering stops working. Always set it.

Lifecycle

# Load memories from disk into memory
count = mem.load()
print(f"Loaded {count} memories")

# Save current state to disk
path = mem.save()

# Flush Write-Ahead Log (WAL) to shards without full save
result = mem.flush()
# → {"flushed_entries": 42, "wal_cleared": True}

# Graceful shutdown: flush + release all resources
mem.close()

WAL (Write-Ahead Log): Every ingest() call appends to a WAL first. This makes writes fast and crash-safe. The WAL is periodically compacted into shards. Use flush() frequently in long-running agents; use close() at shutdown.
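
The append-then-compact mechanics can be illustrated with a minimal, self-contained sketch. This is not the library's actual implementation; the file names, JSON-lines format, and compaction policy below are invented for illustration only:

```python
import json
import os
import tempfile

class MiniWAL:
    """Toy write-ahead log: appends are cheap and crash-safe;
    compaction folds the log into a single 'shard' file."""

    def __init__(self, workspace: str):
        self.wal_path = os.path.join(workspace, "wal.jsonl")
        self.shard_path = os.path.join(workspace, "shard.json")

    def append(self, entry: dict) -> None:
        # One fsync'd line per write: a crash loses at most the in-flight entry.
        with open(self.wal_path, "a") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def flush(self) -> int:
        # Fold WAL entries into the shard, then clear the WAL.
        entries = []
        if os.path.exists(self.shard_path):
            with open(self.shard_path) as f:
                entries = json.load(f)
        flushed = 0
        if os.path.exists(self.wal_path):
            with open(self.wal_path) as f:
                for line in f:
                    entries.append(json.loads(line))
                    flushed += 1
            os.remove(self.wal_path)
        with open(self.shard_path, "w") as f:
            json.dump(entries, f)
        return flushed

workspace = tempfile.mkdtemp()
wal = MiniWAL(workspace)
wal.append({"content": "first"})
wal.append({"content": "second"})
flushed = wal.flush()
```

The design point is the same as in antaris-memory: sequential appends keep the write path fast, while compaction amortizes the cost of rewriting the main store.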


๐Ÿท๏ธ Memory Types

Memory types are not just labels — they change how memories decay, how they score in search, and how they surface in context packets.

| Type | Decay Rate | Importance Multiplier | Special Behavior |
|------|------------|-----------------------|------------------|
| episodic | Normal (half_life days) | 1× | Default type. General events and observations. |
| fact | Normal | 1× (high recall priority) | Verified facts. Prioritized in recall. |
| mistake | 10× slower | 2× importance | Surfaces as Known Pitfalls in context packets. Never forget your mistakes. |
| preference | 3× slower | 1× | High context-matched recall. User/agent preferences persist longer. |
| procedure | 3× slower | 1× | High task-matched recall. How-to knowledge stays relevant. |

Why This Matters for Agents

Mistakes outlive everything. If an agent tried an approach that failed, that memory persists 10× longer and scores 2× higher. It will surface when context is relevant — preventing the agent from repeating the same mistake.

Procedures don't decay during active projects. A procedure stored on Monday is still highly relevant on Friday. Regular episodic memories would fade; procedures don't.
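
The decay curve is standard exponential half-life decay. The sketch below assumes `score = 0.5 ** (age_days / effective_half_life)` with the per-type multipliers from the table above; the library's exact scoring formula may differ:

```python
# Half-life multipliers from the memory-type table (assumed mapping).
HALF_LIFE_MULTIPLIER = {
    "episodic": 1.0,
    "fact": 1.0,
    "mistake": 10.0,     # 10× slower decay
    "preference": 3.0,   # 3× slower decay
    "procedure": 3.0,
}
IMPORTANCE = {"mistake": 2.0}  # 2× importance; everything else 1×

def decayed_score(base: float, age_days: float, memory_type: str,
                  half_life: float = 7.0) -> float:
    """Exponential half-life decay, slowed per memory type."""
    effective = half_life * HALF_LIFE_MULTIPLIER.get(memory_type, 1.0)
    decay = 0.5 ** (age_days / effective)
    return base * IMPORTANCE.get(memory_type, 1.0) * decay

# After one week, an episodic memory is at half strength...
episodic_week = decayed_score(1.0, 7.0, "episodic")
# ...while a mistake of the same age has barely decayed and is boosted 2×.
mistake_week = decayed_score(1.0, 7.0, "mistake")
```

Under these assumptions a mistake takes 70 days (10 × 7) to reach the strength an episodic memory hits after one week.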

# Store a verified fact
mem.ingest_fact(
    "The production database is PostgreSQL 14.2, hosted on RDS us-east-1",
    source="infra-docs",
    tags=["database", "infrastructure"]
)

# Record a mistake with full context
entry = mem.ingest_mistake(
    what_happened="Used DROP TABLE instead of TRUNCATE, lost test data",
    correction="Always use TRUNCATE for clearing data; reserve DROP TABLE for schema removal",
    root_cause="Confused SQL semantics under pressure",
    severity="high",
    tags=["sql", "database", "destructive-ops"]
)

# Store a user/agent preference
mem.ingest_preference(
    "User prefers concise responses under 200 words unless explicitly asked for detail",
    source="user-feedback"
)

# Store a repeatable procedure
mem.ingest_procedure(
    "Deployment checklist: 1) Run tests, 2) Tag release, 3) Push to staging, 4) Monitor 10min, 5) Push to prod",
    source="runbook"
)

📥 Ingestion Methods

Basic Ingestion

# Full control ingestion with v5.0.1 features
entry_id = mem.ingest(
    content="Completed sprint 14. Delivered auth module, skipped rate-limiter due to scope.",
    source="sprint-retro",
    category="engineering",
    memory_type="episodic",      # Auto-classified (semantic/episodic) in v5.0.1
    tags=["sprint", "auth"],
    agent_id="agent-007",        # Override agent scoping
    session_id="session-abc",    # Session provenance (v5.0.1)
    channel_id="channel-main",   # Channel provenance (v5.0.1)
    source_url="https://example.com/doc",  # Source URL tracking (v5.0.1)
    content_hash="sha256:abc123",  # Content hash for deduplication (v5.0.1)
)
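
The content_hash value shown above can be produced with stdlib hashlib; the `sha256:<hexdigest>` format is an assumption based on the example string, not a documented requirement:

```python
import hashlib

def content_hash(content: str) -> str:
    """SHA-256 of the UTF-8 content, in 'sha256:<hexdigest>' form."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"

h = content_hash("Completed sprint 14. Delivered auth module.")
```

Because the hash is deterministic, re-ingesting identical content yields the same value, which is what makes it usable for deduplication.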

Typed Ingestion Shortcuts

# Facts — verified knowledge
mem.ingest_fact("API rate limit is 1000 req/min", source="api-docs", tags=["api"])

# Preferences — persist 3× longer
mem.ingest_preference("Prefers dark mode and compact layouts", source="user-settings")

# Procedures — task-matched recall
mem.ingest_procedure("To reset 2FA: go to /settings → Security → Reset authenticator", source="support-kb")

# Mistakes — 2× importance, 10× half-life
entry = mem.ingest_mistake(
entry = mem.ingest_mistake(
    what_happened="Sent email to wrong recipient list",
    correction="Always preview recipient list before sending bulk mail",
    root_cause="Copy-paste error in distribution list",
    severity="medium",
    tags=["email", "communication"]
)

File & Directory Ingestion

# Ingest a single file
count = mem.ingest_file("./notes/meeting-2024-03.md", category="meetings")

# Ingest all matching files in a directory
count = mem.ingest_directory(
    "./docs",
    category="documentation",
    pattern="*.md"           # Glob pattern, default: *
)
print(f"Ingested {count} documents")

Web Ingestion

# Ingest a web page (and optionally crawl linked pages)
result = mem.ingest_url(
    "https://docs.example.com/api",
    depth=2,           # How many levels of links to follow (default: 1)
    incremental=True   # Skip pages already ingested (default: True)
)
# → {"ingested": 14, "skipped_duplicates": 3, "source_url": "https://docs.example.com/api"}

# Remove all memories from a source URL
result = mem.delete_source("https://docs.example.com/api")
# → {"deleted": 14, "source_url": "https://docs.example.com/api"}

Structured Data Ingestion

# CSV file
result = mem.ingest_data_file("./data/customers.csv", format="csv")
# → {"ingested": 512, "source": "customers.csv", "format": "csv"}

# JSON file
result = mem.ingest_data_file("./data/events.json", format="json")

# SQLite database
result = mem.ingest_data_file("./data/app.db", format="sqlite")

# Auto-detect format
result = mem.ingest_data_file("./data/report.csv", format="auto")

# Ingest from a SQL query
result = mem.ingest_sql(
    db_path="./data/app.db",
    query="SELECT id, title, body, created_at FROM articles WHERE published = 1"
)

Input Gating

See the Input Gating section for P0–P3 priority filtering.


๐Ÿ” Search & Retrieval

Core Search

results = mem.search(
    query="production deployment failure",
    limit=10,                        # Max results (default: 10)
    tags=["production"],             # Filter by tags
    tag_mode="any",                  # "any" or "all" (default: "any")
    date_range=("2024-01-01", "2024-03-31"),  # ISO date strings
    use_decay=True,                  # Apply temporal decay scoring (default: True)
    category="engineering",          # Filter by category
    min_confidence=0.3,              # Minimum relevance score (0.0–1.0)
    sentiment_filter="negative",     # Filter by sentiment
    memory_type="mistake",           # Filter by memory type
    explain=True,                    # Include score breakdown in results
    session_id="session-abc",        # Filter to specific session
    agent_id="agent-007",            # Filter to specific agent
    channel_id="ops-channel",        # Filter by channel (v5.0.1)
    crossSessionRecall="semantic",   # Cross-session semantic filtering (v5.0.1)
    include_cold=False,              # Include cold-tier memories (default: False)
)

for result in results:
    print(f"[{result.score:.3f}] {result.content[:100]}")
    if hasattr(result, 'score_breakdown'):
        print(f"  BM25={result.score_breakdown.get('bm25', 0.0):.3f}")

Search With Context (Instrumented)

results, ctx = mem.search_with_context(
    query="API authentication patterns",
    limit=10,
    instrumentation_context={"session": "my-session"},
    cooccurrence_boost=True          # Apply PPMI co-occurrence reranking
)

# ctx is a SearchContext object
print(f"Expanded query: {ctx.expanded_query}")
print(f"Intent detected: {ctx.intent}")
print(f"Tiers searched: {ctx.tiers_searched}")
print(f"Search time: {ctx.search_time_ms:.1f}ms")

Recency-First Retrieval

# Get recent memories without keyword matching (pure recency)
recent = mem.recent(
    limit=20,
    agent_id="agent-007",       # Filter to agent (optional)
    include_shared=True         # Include shared pool memories
)

Temporal Queries

# All memories from a specific date
entries = mem.on_date("2024-03-15")

# All memories in a date range
entries = mem.between("2024-03-01", "2024-03-31")

Narrative Generation

# Generate a prose narrative about a topic from memory
story = mem.narrative("deployment incidents in Q1")
print(story)
# → "In January, the team experienced three deployment incidents.
#    The first occurred on January 8th when..."

Analysis & Synthesis

# Structured analysis of memories related to a topic
analysis = mem.analyze("user authentication", limit=20)
# → {
#     "topic": "user authentication",
#     "memory_count": 8,
#     "themes": ["OAuth", "JWT", "session management"],
#     "timeline": [...],
#     "sentiment_distribution": {...}
# }

# Free-text knowledge synthesis
summary = mem.synthesize_knowledge("deployment best practices", limit=30)
print(summary)
# → "Based on accumulated knowledge: deployments succeed most often when..."

๐Ÿ—๏ธ 11-Layer Search Architecture

antaris-memory uses an 11-layer search pipeline. Each layer refines the ranked list before returning results. This is why recall quality dramatically exceeds that of naive keyword search.

Query Input
    │
    ▼
┌──────────────────────────────────────────────────────────────────┐
│  Layer 1: BM25+ TF-IDF                                           │
│           BM25_DELTA=1.0 floor ensures smooth scoring            │
├──────────────────────────────────────────────────────────────────┤
│  Layer 2: Exact Phrase Bonus                                     │
│           1.5× in content body · 1.3× in enriched summary        │
├──────────────────────────────────────────────────────────────────┤
│  Layer 3: Field Boosting                                         │
│           Tags 1.2× · Source 1.1× · Category 1.3×                │
├──────────────────────────────────────────────────────────────────┤
│  Layer 4: Rarity Boost + Proper Noun Boost                       │
│           ≤1% corpus → 2.0× · 1–5% → 1.5× · 5–15% → 1.2×         │
│           Proper nouns (NNP detection) → 1.5×                    │
├──────────────────────────────────────────────────────────────────┤
│  Layer 5: Sliding Window Context + Positional Salience           │
│           First/last window → 1.3× (intro/conclusion bias)       │
├──────────────────────────────────────────────────────────────────┤
│  Layer 6: Semantic Expansion                                     │
│           QueryExpander · PPMIBootstrap · CategoryTagger         │
├──────────────────────────────────────────────────────────────────┤
│  Layer 7: Intent Reranker                                        │
│           Detects: temporal · entity · event · decision          │
│                    comparison · howto · quantity · location      │
├──────────────────────────────────────────────────────────────────┤
│  Layer 8: Qualifier & Negation Sensitivity                       │
│           Handles: before/after · success/failure · negation     │
├──────────────────────────────────────────────────────────────────┤
│  Layer 9: Cross-Memory Clustering Boost                          │
│           Post-normalization cluster coherence scoring           │
├──────────────────────────────────────────────────────────────────┤
│  Layer 10: MiniLM Embedding Reranker                             │
│            .word_embeddings.json + .embeddings_minilm.json       │
├──────────────────────────────────────────────────────────────────┤
│  Layer 11: Pseudo-Relevance Feedback                             │
│            Top-term extraction from top-3 docs · 70/30 blend     │
└──────────────────────────────────────────────────────────────────┘
    │
    ▼
Ranked Results

Layer Details

Layer 1 — BM25+ TF-IDF: The baseline relevance signal. BM25+ with a delta floor of 1.0 ensures no matching term scores zero, preventing the "lucky keyword match" problem that plagues basic TF-IDF implementations.
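
The delta floor can be sketched with a textbook BM25+ term scorer (Lv & Zhai's formulation). The k1, b, and IDF variant below are illustrative choices, not the library's tuned values:

```python
import math

def bm25_plus_term(tf: float, dl: float, avgdl: float,
                   n_docs: int, df: int,
                   k1: float = 1.2, b: float = 0.75,
                   delta: float = 1.0) -> float:
    """BM25+ contribution of a single query term to one document.
    `delta` guarantees any matching term scores at least idf * delta,
    even in very long documents where length normalization bites."""
    if tf == 0:
        return 0.0
    idf = math.log((n_docs + 1) / df)  # simple non-negative IDF variant
    sat = ((k1 + 1) * tf) / (k1 * (1 - b + b * dl / avgdl) + tf)
    return idf * (sat + delta)

# A single match in a very long doc still clears the idf * delta floor.
long_doc = bm25_plus_term(tf=1, dl=1000, avgdl=100, n_docs=1000, df=10)
floor = math.log(1001 / 10) * 1.0
```

Without the delta term, the same match in a 10×-average-length document would be heavily penalized; with it, every genuine match keeps a nonzero floor.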

Layer 2 — Exact Phrase Bonus: When the exact query phrase appears verbatim, the score jumps. Content hits get a bigger boost (1.5×) than hits in enriched summaries (1.3×) to reward genuine signal over derived metadata.

Layer 3 — Field Boosting: Memories indexed with a matching category (1.3×) or matching tags (1.2×) rank higher. This rewards structured ingestion.

Layer 4 — Rarity & Proper Noun Boost: Rare terms matter more. If a term appears in ≤1% of the corpus, matching it scores 2× over a common term. Proper nouns (detected via capitalization heuristics) get an additional 1.5× because they're almost always semantically important.
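
The tier thresholds above translate directly into a lookup; exact boundary inclusivity and the capitalization heuristic are assumptions in this sketch:

```python
def rarity_boost(corpus_fraction: float) -> float:
    """Multiplier for a query term by the fraction of the corpus it appears in.
    Thresholds follow the description above (boundary handling assumed)."""
    if corpus_fraction <= 0.01:
        return 2.0
    if corpus_fraction <= 0.05:
        return 1.5
    if corpus_fraction <= 0.15:
        return 1.2
    return 1.0

def proper_noun_boost(token: str) -> float:
    """Crude capitalization heuristic standing in for NNP detection."""
    return 1.5 if token[:1].isupper() else 1.0
```

So a term like "PostgreSQL" appearing in 0.5% of memories would combine both boosts: 2.0 × 1.5 = 3.0× over a common lowercase term.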

Layer 5 — Positional Salience: Not all text positions are equal. The first and last windows of a memory entry score 1.3× higher — mimicking human reading patterns where intro/conclusion carry the most signal.
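
A minimal sketch of the windowing idea, assuming fixed-size word windows with the 1.3× first/last boost (the actual window size and overlap strategy are not documented here):

```python
def window_weights(text: str, window_size: int = 50):
    """Split text into word windows; boost the first and last 1.3×."""
    words = text.split()
    windows = [words[i:i + window_size]
               for i in range(0, len(words), window_size)]
    weights = [1.0] * len(windows)
    if weights:
        weights[0] = 1.3    # intro bias
        weights[-1] = 1.3   # conclusion bias
    return [(" ".join(w), wt) for w, wt in zip(windows, weights)]

chunks = window_weights("alpha " * 120, window_size=50)  # 3 windows
```

A term match is then scored against the weight of the window it falls in, so matches in the middle of a long entry count for less.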

Layer 6 — Semantic Expansion: The query is expanded with related terms using PPMIBootstrap co-occurrence statistics and the CategoryTagger. A search for "API failure" also matches "endpoint crash", "service down", "503 error" — without any synonym dictionary.

Layer 7 — Intent Reranker: Detects the semantic intent of the query and reranks accordingly. A "how to" query surfaces procedural memories first. A "when did" query surfaces episodic/temporal memories first.

Layer 8 — Qualifier & Negation Sensitivity: Understands before/after, success/failure, and negation. Searching for "failed deployment" does not match "successful deployment" — a distinction most search systems get wrong.

Layer 9 — Clustering Boost: Memories that cluster with other highly-relevant results get a bonus. This rewards coherent knowledge clusters over isolated matching documents.

Layer 10 — MiniLM Embedding Reranker: Pre-computed sentence embeddings (MiniLM-based) provide semantic similarity scoring. Works entirely from local files — no model inference at search time.

Layer 11 — Pseudo-Relevance Feedback: The top 3 results are analyzed for their most distinctive terms. Those terms are folded back into the query at a 70/30 blend. This is a classic IR technique (Rocchio) applied to agent memory — the search becomes smarter the more memories you have.
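
A Rocchio-style feedback loop can be sketched as follows. The 70/30 split and top-3 window come from the description above; scoring feedback terms by raw frequency is a simplification (real PRF usually weights by distinctiveness):

```python
from collections import Counter

def prf_expand(query_terms, ranked_docs, feedback_k=5,
               query_weight=0.7, feedback_weight=0.3):
    """Blend original query terms (70%) with frequent terms
    drawn from the top 3 ranked documents (30%)."""
    counts = Counter()
    for doc in ranked_docs[:3]:        # only the top 3 results feed back
        counts.update(doc.lower().split())
    for t in query_terms:              # never re-weight the query's own terms
        counts.pop(t, None)
    feedback_terms = [t for t, _ in counts.most_common(feedback_k)]
    weights = {t: query_weight for t in query_terms}
    for t in feedback_terms:
        weights[t] = feedback_weight
    return weights

w = prf_expand(
    ["deployment", "failure"],
    ["deployment failed rollback initiated rollback complete",
     "rollback procedure executed after deployment failure",
     "unrelated note about lunch"],
)
```

Here "rollback" never appeared in the query, but because it dominates the top results it is folded back in at the lower feedback weight.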


๐Ÿ—„๏ธ Tiered Storage (Hot / Warm / Cold)

Large memory stores would be expensive to load entirely on startup. Tiered storage solves this by keeping recent memories fast and old memories accessible but lazy.

| Tier | Age | Behavior |
|------|-----|----------|
| Hot | 0–3 days | Loaded on startup. Always in memory. |
| Warm | 3–14 days | Loaded on demand when hot search returns < 3 results. |
| Cold | 14+ days | Never auto-loaded. Requires include_cold=True. |

# Default search (hot + warm if needed)
results = mem.search("recent API changes")

# Explicitly search cold tier too
results = mem.search("API changes from last month", include_cold=True)

# Check tier distribution
stats = mem.get_stats()
print(f"Hot: {stats['hot_entries']}")
print(f"Warm: {stats['warm_entries']}")
print(f"Cold: {stats['cold_entries']}")

# Get the most-accessed hot entries
hot = mem.get_hot_entries(top_n=10)

Why tiers matter: An agent that's been running for months might have 50,000+ memories. Loading all of them on every spawn is expensive. Tiered storage keeps startup cost constant regardless of total memory size: the hot tier is small, the warm tier is medium, and the cold tier is archived.
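
The routing rules from the table reduce to two small functions; boundary inclusivity at 3 and 14 days is an assumption in this sketch:

```python
def tier_for_age(age_days: float) -> str:
    """Assign a memory to a storage tier by age."""
    if age_days < 3:
        return "hot"
    if age_days < 14:
        return "warm"
    return "cold"

def tiers_to_search(hot_result_count: int, include_cold: bool = False):
    """Hot always; warm only when hot is thin; cold only on request."""
    tiers = ["hot"]
    if hot_result_count < 3:
        tiers.append("warm")
    if include_cold:
        tiers.append("cold")
    return tiers
```

This is why a default search over a huge store stays fast: the warm and cold tiers are only touched when the hot tier can't answer.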


🤖 LLM Enrichment

Out-of-the-box keyword search only finds what's literally in the text. LLM enrichment dramatically improves recall by having a language model generate additional search signals at ingest time.

How It Works

  1. You provide an enricher callable when constructing MemorySystem
  2. On every ingest(), your callable receives the content and returns metadata
  3. That metadata is indexed with weighted boost factors:
    • search_queries: 3× TF weight — artificial query-document pairs
    • enriched_summary: 2× TF weight — search-optimized restatement
    • search_keywords: 2× TF weight — extra search terms

Enricher Interface

from typing import TypedDict, List

class EnrichmentResult(TypedDict):
    tags: List[str]             # Auto-generated tags
    summary: str                # Search-optimized restatement of the content
    keywords: List[str]         # Additional search terms
    search_queries: List[str]   # doc2query: LLM-generated search queries (v5.0.2)

Example: Anthropic Enricher

import anthropic
import json

client = anthropic.Anthropic()

def my_enricher(content: str) -> dict:
    response = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": f"""Analyze this memory and return JSON:
{{
  "tags": ["tag1", "tag2"],
  "summary": "one-sentence search-optimized restatement",
  "keywords": ["keyword1", "keyword2"],
  "search_queries": ["what query should find this?", "another natural query"]
}}

Memory: {content}"""
        }]
    )
    return json.loads(response.content[0].text)

mem = MemorySystem(
    workspace="./memory",
    agent_name="my-agent",
    enricher=my_enricher
)

Batch Enrichment

Enrich memories that were ingested before an enricher was configured:

# Enrich all non-enriched entries in batches of 50
count = mem.re_enrich(
    batch_size=50,
    progress_fn=lambda i, total: print(f"Enriching {i}/{total}"),
    overwrite=False   # True to re-enrich already-enriched entries
)
print(f"Enriched {count} entries")

# Track enrichment costs
enrichment_count = mem.get_enrichment_count(reset=False)
stats = mem.get_stats()
print(f"Total enrichments: {stats['enrichment_count']}")
print(f"Estimated cost: ${stats['enrichment_cost_usd']:.4f}")

๐Ÿ•ธ๏ธ Graph Intelligence

Beyond keyword search, antaris-memory builds a knowledge graph from ingested content. Entity extraction happens automatically — you don't need to annotate anything.

What It Does

  • EntityExtractor: Identifies named entities (people, organizations, projects, locations, concepts) from memory content using zero-dependency heuristics
  • MemoryGraph: An in-memory knowledge graph of entity relationships derived from co-occurrence patterns and explicit relationship signals

Graph Queries

# Search by relationship triple (subject, relation, object)
triples = mem.graph_search(
    subject="PostgreSQL",          # Filter by subject (None = any)
    relation="used_by",            # Filter by relation type (None = any)
    obj=None                       # Filter by object (None = any)
)
for triple in triples:
    print(f"{triple.subject} --[{triple.relation}]--> {triple.object}")

# Find shortest path between two entities
path = mem.entity_path(
    source="deployment-service",
    target="production-database",
    max_hops=3
)
print(" → ".join(path))
# → "deployment-service → RDS → production-database"

# Get full entity info
entity = mem.get_entity("PostgreSQL")
# → {"canonical": "PostgreSQL", "aliases": [...], "edge_count": 12, "memories": [...]}

# Graph statistics
stats = mem.get_graph_stats()
print(f"Nodes: {stats['nodes']}")
print(f"Edges: {stats['edges']}")
print(f"Density: {stats['density']:.4f}")

# Rebuild the graph from scratch (after bulk ingestion)
node_count = mem.rebuild_graph()
print(f"Graph rebuilt with {node_count} nodes")

# Rebuild topic clusters
cluster_count = mem.rebuild_clusters()

When to Use Graph Search

Graph search excels at questions keyword search struggles with:

  • "What services depend on the auth module?" → graph_search(subject=None, relation="depends_on", obj="auth-module")
  • "How is the payment service connected to the database?" → entity_path("payment-service", "database")
  • "What do we know about the Stripe integration?" → get_entity("Stripe")

📋 Context Packets

The cold spawn problem: when you launch a new sub-agent, it has zero context. You could dump 50 raw memories into its prompt, but that's token-inefficient and hard to parse.

Context packets solve this. They are structured, token-budgeted memory summaries that prime an agent with exactly what it needs for a specific task.

Building a Context Packet

packet = mem.build_context_packet(
    task="Deploy the new auth service to production",
    tags=["deployment", "auth"],
    category="engineering",
    environment="production",
    instructions="Focus on known failure modes and deployment checklist",
    max_memories=20,
    max_tokens=3000,
    min_relevance=0.3,
    include_mistakes=True,    # Adds "Known Pitfalls" section
    max_pitfalls=5
)

# Render to markdown (for agent system prompt injection)
markdown = packet.render()
print(markdown)

# Trim to a strict token budget
packet.trim(max_tokens=2000)

# Serialize to dict (for JSON transport to sub-agents)
data = packet.to_dict()

Multi-Query Packets

When a task requires multiple knowledge domains:

packet = mem.build_context_packet_multi(
    task="Migrate the database to the new schema",
    queries=[
        "database migration procedures",
        "schema change failures",
        "rollback procedures",
        "maintenance window requirements"
    ],
    max_tokens=4000,
    include_mistakes=True
)

Example Rendered Output

# Context Packet: Deploy auth service to production

## Relevant Knowledge
1. **Deployment Checklist** (score: 0.89, procedure)
   Run tests → Tag release → Push staging → Monitor 10min → Push prod

2. **Auth Service Architecture** (score: 0.82, fact)
   JWT-based. Refresh tokens stored in Redis. Session expiry: 24h.

## Known Pitfalls ⚠️
1. **Failed auth deployment (2024-02-14)** [SEVERITY: HIGH]
   - What happened: Deployed without running migration scripts first
   - Correction: Always run `alembic upgrade head` before service restart
   - Root cause: Skipped pre-deploy checklist under time pressure

Why This Is Critical

Context packets are the connective tissue of multi-agent systems. Without them, sub-agents are islands. With them, every spawned agent inherits the team's accumulated knowledge, including hard-won lessons from past failures.


👥 Shared / Team Memory Pools

Multi-agent systems need shared knowledge. A research agent should be able to write findings that a writing agent can later retrieve. Shared memory pools enable cross-agent knowledge sharing.

Setup

from antaris_memory import MemorySystem, AgentRole

# Agent 1: Coordinator
mem1 = MemorySystem(workspace="./agent1-memory", agent_name="coordinator")
pool = mem1.enable_shared_pool(
    pool_dir="./shared-pool",       # Shared filesystem location
    pool_name="project-alpha",
    agent_id="coordinator",
    role=AgentRole.COORDINATOR,     # COORDINATOR | WRITER | READER
    load_existing=True
)

# Agent 2: Worker (separate process/instance)
mem2 = MemorySystem(workspace="./agent2-memory", agent_name="worker")
pool2 = mem2.enable_shared_pool(
    pool_dir="./shared-pool",
    pool_name="project-alpha",
    agent_id="worker",
    role=AgentRole.WRITER
)

Writing to the Shared Pool

# Write to the shared pool (available to all agents in the pool)
entry = mem1.shared_write(
    content="Research complete: competitor uses GraphQL, not REST. Swagger docs at /api-docs.",
    namespace="research",        # Organize by namespace
    category="competitive-intel",
    metadata={"source": "api-analysis", "confidence": 0.9}
)

Reading from the Shared Pool

# Search shared pool
results = mem2.shared_search(
    query="competitor API architecture",
    namespace="research",
    limit=5
)
for r in results:
    print(r.content)

Agent Roles

| Role | Can Read | Can Write | Can Admin |
|------|----------|-----------|-----------|
| COORDINATOR | ✅ | ✅ | ✅ |
| WRITER | ✅ | ✅ | ❌ |
| READER | ✅ | ❌ | ❌ |
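
The role matrix is just a permission lookup. This sketch is illustrative only — it is not the library's enforcement code, and the action names are assumptions:

```python
# Permission matrix from the roles table above.
PERMISSIONS = {
    "COORDINATOR": {"read", "write", "admin"},
    "WRITER": {"read", "write"},
    "READER": {"read"},
}

def can(role: str, action: str) -> bool:
    """Check whether a pool role is allowed to perform an action."""
    return action in PERMISSIONS.get(role, set())
```

In practice this means a READER agent can call shared_search but any shared_write attempt should be rejected by the pool.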

🔌 MCP Server

antaris-memory ships with a built-in MCP (Model Context Protocol) server, allowing any MCP-compatible client (Claude Desktop, etc.) to interact with memory directly.

Starting the MCP Server

Via CLI:

python -m antaris_memory serve \
    --workspace ./memory \
    --agent-name my-agent

Via Python:

from antaris_memory.mcp import AntarisMCPServer

server = AntarisMCPServer(
    workspace="./memory",
    agent_name="my-agent"
)
server.run_stdio()

MCP Tools Exposed

The MCP server exposes memory operations as tools that MCP clients can call:

  • memory_search — search memories
  • memory_ingest — store new memories
  • memory_recent — get recent entries
  • memory_stats — get memory statistics
  • memory_context_packet — build context packet for a task

Claude Desktop Integration

Add to your Claude Desktop config.json:

{
  "mcpServers": {
    "antaris-memory": {
      "command": "python",
      "args": ["-m", "antaris_memory", "serve", "--workspace", "/path/to/memory", "--agent-name", "claude"]
    }
  }
}

โ˜๏ธ GCS Cloud Backend

For cloud-native deployments, antaris-memory supports Google Cloud Storage as a backend.

from antaris_memory.backends import GCSMemoryBackend
from antaris_memory import MemorySystem

backend = GCSMemoryBackend(
    bucket="my-agent-memory-bucket",
    prefix="agents/production/"
)

mem = MemorySystem(
    workspace="./local-cache",    # Local cache directory
    agent_name="prod-agent",
    backend=backend               # GCS backend for persistence
)

Use cases:

  • Persistent memory that survives container restarts
  • Shared memory accessible from multiple cloud instances
  • Memory backup and audit trail in GCS

Requirements: google-cloud-storage must be installed separately (pip install google-cloud-storage). The core antaris-memory package remains zero-dependency.


📤 Export & Import

Move memory stores between agents, environments, or archive them for later use.

Export

# Export all memories to a JSON file
count = mem.export(
    output_path="./backup/memory-2024-03.json",
    include_metadata=True    # Include scores, timestamps, enrichment data
)
print(f"Exported {count} memories")

Export format:

{
  "version": "5.0.1",
  "exported_at": "2024-03-15T14:32:00Z",
  "workspace": "./memory",
  "entries": [
    {
      "id": "mem_abc123",
      "content": "...",
      "memory_type": "fact",
      "tags": ["api", "auth"],
      "created_at": "2024-03-10T09:15:00Z",
      "score": 0.92,
      "enriched_summary": "...",
      "search_queries": ["auth API", "JWT authentication"],
      "graph_entities": ["JWT", "OAuth"],
      "session_id": "session-abc",
      "channel_id": "ops-channel", 
      "source_url": "https://docs.example.com/auth",
      "content_hash": "sha256:abc123def456"
    }
  ]
}
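The `content_hash` field enables deduplication across exports and re-imports. A minimal sketch that reproduces the format shown above, assuming the hash is a plain SHA-256 over the UTF-8 content (the exact hashing scheme is an assumption, not documented behavior):

```python
import hashlib

def content_hash(content: str) -> str:
    # Assumption: SHA-256 over the UTF-8 encoded content, prefixed "sha256:".
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"

# Identical content always yields the same hash, so duplicates can be skipped on import.
h = content_hash("Deployed v2.3.1 to production at 14:32 UTC.")
```

A stable content hash makes `merge=True` imports idempotent: re-importing the same backup twice need not create duplicate entries.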

Import

# Import memories (merge with existing)
count = mem.import_from(
    input_path="./backup/memory-2024-03.json",
    merge=True    # True = merge; False = replace
)
print(f"Imported {count} memories")

# Alias
count = mem.import_memories("./backup/memory-2024-03.json")

Use cases:

  • Bootstrap a new agent with knowledge from an existing agent
  • Restore from backup after data loss
  • Migrate from staging to production
  • Share domain knowledge between specialized agents

🔄 Recovery System

When an agent is spawned mid-task, it needs to reconstruct what happened before. The recovery system provides structured presets for this.

Presets

| Preset | Memories | Time Window | Approximate Tokens |
|---|---|---|---|
| smart (default) | 50 | 24 hours | ~5,000–10,000 |
| minimal | 10 | Current session | ~1,000–2,000 |

# Smart recovery — get full 24h context
recent = mem.recent(limit=50)

# Build a targeted recovery packet
packet = mem.build_context_packet(
    task="Continue where we left off",
    max_memories=50,
    max_tokens=8000
)
recovery_context = packet.render()

Why recovery matters: An agent spawned to continue a long-running task needs to know: what was decided, what was tried, what failed, and what's pending. Without recovery context, it starts from scratch and may repeat mistakes or redo completed work.


📊 Co-occurrence / PPMI Semantic Tier

The PPMIBootstrap component builds a co-occurrence statistical model over the memory corpus. This enables semantic query expansion without any ML models or external dependencies.

How PPMI Works

PPMI (Positive Pointwise Mutual Information) measures how much more often two terms co-occur than expected by chance. Terms with high PPMI are semantically related.

PPMI(term_a, term_b) = max(0, log( P(a,b) / (P(a) × P(b)) ))

Over time, as you ingest memories, the PPMI matrix learns your domain's vocabulary automatically. "API" and "endpoint" will have high PPMI. "deployment" and "rollback" will have high PPMI.
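The statistic can be computed directly from co-occurrence counts. A self-contained sketch over a toy corpus (document-level co-occurrence windows are an assumption here; the library's windowing may differ):

```python
import math
from collections import Counter
from itertools import combinations

def ppmi_table(docs: list) -> dict:
    """Compute PPMI for every co-occurring term pair, one window per document."""
    term_counts = Counter()
    pair_counts = Counter()
    for doc in docs:
        terms = set(doc.split())
        term_counts.update(terms)
        pair_counts.update(frozenset(p) for p in combinations(sorted(terms), 2))
    n_docs = len(docs)
    table = {}
    for pair, n_ab in pair_counts.items():
        a, b = tuple(pair)
        p_ab = n_ab / n_docs                      # joint probability
        p_a = term_counts[a] / n_docs             # marginal probabilities
        p_b = term_counts[b] / n_docs
        table[pair] = max(0.0, math.log(p_ab / (p_a * p_b)))
    return table

docs = [
    "api endpoint returned error",
    "api endpoint timeout",
    "deployment rollback completed",
]
table = ppmi_table(docs)
# "api" and "endpoint" always appear together, so their PPMI is positive.
```

Pairs that co-occur more often than their marginal frequencies predict get a positive score; independent or never-co-occurring pairs get zero.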

Practical Effect

A search for "API crash" will be expanded with high-PPMI neighbors such as "endpoint failure", "service error", and "HTTP 500": terms that appear in the same contexts in your memory store, not entries from a generic synonym dictionary.

# co-occurrence stats visible in stats output
stats = mem.get_stats()
print(f"Co-occurrence pairs: {stats['cooccurrence_pairs']}")

# Use cooccurrence boost explicitly
results, ctx = mem.search_with_context(
    query="API problems",
    cooccurrence_boost=True
)
print(f"Expanded query: {ctx.expanded_query}")
# โ†’ "API problems endpoint failure service error 503"

🚦 Input Gating

Not every piece of information is worth storing. Input gating classifies content by priority and drops low-value content before it enters the memory store.

Priority Levels

| Level | Label | Behavior |
|---|---|---|
| P0 | Critical | Always stored, highest importance weighting |
| P1 | Important | Stored |
| P2 | Standard | Stored |
| P3 | Ephemeral | Dropped — never stored |

# The context dict informs the gating decision
entry_id = mem.ingest_with_gating(
    content="User said 'ok thanks'",
    source="chat-log",
    context={
        "session_type": "casual",
        "has_factual_claim": False,
        "is_action": False,
        "sentiment": "neutral"
    }
)
# → 0 (dropped as P3 ephemeral content)

entry_id = mem.ingest_with_gating(
    content="Production outage: auth service down. ETA 15 minutes. Cause: Redis connection pool exhausted.",
    source="incident-log",
    context={
        "is_incident": True,
        "severity": "high",
        "has_factual_claim": True
    }
)
# → memory_id (stored as P0 critical)
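The two calls above can be approximated with a rule-based classifier. The rules and thresholds below are illustrative assumptions, not the library's actual gating logic:

```python
def classify_priority(content: str, context: dict) -> str:
    """Map content + context hints to a P0-P3 priority (illustrative rules only)."""
    if context.get("is_incident") or context.get("severity") == "high":
        return "P0"  # critical: always stored
    if context.get("is_action") or context.get("has_factual_claim"):
        return "P1"  # important
    if len(content.split()) <= 5 and context.get("sentiment") == "neutral":
        return "P3"  # ephemeral: dropped before storage
    return "P2"  # standard

low = classify_priority("User said 'ok thanks'",
                        {"sentiment": "neutral", "has_factual_claim": False, "is_action": False})
high = classify_priority("Production outage: auth service down.",
                         {"is_incident": True, "severity": "high"})
```

With these rules the casual acknowledgement classifies as P3 (dropped) while the incident classifies as P0, matching the outcomes of the two `ingest_with_gating` examples.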

Why gating matters for agents: Agents that remember everything get noisy. An LLM-in-the-loop conversational agent might process thousands of utterances per day. Without gating, the memory store fills with "ok", "got it", and "sure", drowning out the signal.


🔀 Hybrid Semantic Search

When you have embedding infrastructure available, antaris-memory can blend BM25 keyword search with cosine similarity semantic search.

Blend Ratio

Final Score = 0.40 × BM25_score + 0.60 × cosine_similarity
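The blend can be reproduced in a few lines. A sketch assuming both signals are already normalized to comparable [0, 1] ranges (the library's normalization strategy is an assumption):

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_score(bm25_score: float, query_vec, doc_vec) -> float:
    # Final Score = 0.40 * BM25 + 0.60 * cosine similarity
    return 0.40 * bm25_score + 0.60 * cosine(query_vec, doc_vec)

# Identical vectors give cosine 1.0, so the blended score is 0.4*0.8 + 0.6*1.0.
score = hybrid_score(0.8, [1.0, 0.0], [1.0, 0.0])
```

The 40/60 split weights the semantic signal more heavily, so near-paraphrases can outrank exact keyword matches.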

Setup

# Provide any embedding function — OpenAI, local model, whatever you have
def my_embed(text: str) -> "list[float]":  # string annotation keeps Python 3.8 compatibility
    # Returns the embedding vector as a list of floats
    import openai
    response = openai.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

mem.set_embedding_fn(my_embed)

# All subsequent searches now use hybrid BM25+semantic scoring
results = mem.search("machine learning pipeline optimization")

Without an embedding function, all 11 layers still run — including the Layer 10 MiniLM reranker using pre-computed local embeddings. The set_embedding_fn hook adds a 60% cosine-similarity signal on top.


🔧 Maintenance & Operations

Compaction

Over time, memories accumulate duplicates, stale entries, and expired content. Compaction cleans them up.

result = mem.compact()
# → {
#     "entries_before": 1240,
#     "entries_after": 987,
#     "removed_count": 253,
#     "shards_before": 8,
#     "shards_after": 6,
#     "space_freed_mb": 12.4,
#     "duration_ms": 340
# }

Forgetting

Selectively remove memories matching criteria:

result = mem.forget(
    topic="sprint-12",           # Remove by topic/keyword
    entity="TempProject",        # Remove by entity name
    before_date="2023-12-31"     # Remove entries older than date
)
# โ†’ {"forgotten": 42, "criteria": {...}}

Consolidation

Group similar memories and deduplicate:

result = mem.consolidate()
# โ†’ {"consolidated": 18, "duplicates_removed": 7}

Compression

Archive old memories to compressed format:

archived = mem.compress_old(days=60)   # Compress entries older than 60 days
print(f"Archived {len(archived)} entries")

Reindexing

Rebuild search indexes after bulk imports or schema changes:

mem.reindex()  # Rebuilds all search indexes

Relevance Management

# Mark memories as used (boosts their score for future recall)
count = mem.mark_used(
    memory_ids=["mem_abc123", "mem_def456"],
    context="used in deployment planning"
)

# Manually boost a specific memory
success = mem.boost_relevance(
    memory_id="mem_abc123",
    multiplier=1.5          # 1.5ร— score boost
)

Migration

# Migrate from older format to v4
result = mem.migrate_to_v4()
# โ†’ {"migrated": 840, "errors": 0, "duration_ms": 1200}

# Rollback if something goes wrong
result = mem.rollback_migration()

# Validate data integrity
report = mem.validate_data()
# โ†’ {"valid": True, "errors": [], "warnings": 3, "checked": 987}

📈 Stats & Health

Comprehensive Stats

stats = mem.get_stats()
# Equivalent: mem.stats()

| Key | Description |
|---|---|
| total_entries | Total memories across all tiers |
| hot_entries | Entries in hot tier (0–3 days) |
| warm_entries | Entries in warm tier (3–14 days) |
| cold_entries | Entries in cold tier (14+ days) |
| wal_size | Write-Ahead Log entry count |
| enrichment_count | Total LLM enrichment calls made |
| enrichment_cost_usd | Estimated enrichment cost |
| graph_enabled | Whether graph intelligence is active |
| graph_nodes | Total entities in knowledge graph |
| graph_edges | Total relationships in knowledge graph |
| cooccurrence_pairs | PPMI co-occurrence term pairs |
| cache_size | Current LRU cache size |
| avg_search_time_ms | Average search latency |
| cache_hit_rate | LRU cache hit percentage |
| disk_usage_mb | Total disk usage |
| version | Library version |
| workspace | Workspace path |
| agent_name | Configured agent name |

Health Check

health = mem.get_health()
# → {
#     "status": "ok",         # "ok" or "degraded"
#     "checks": {
#         "workspace_accessible": True,
#         "memories_loaded": True,
#         "wal_ok": True,
#         "search_ok": True,
#         "graph_ok": True
#     }
# }

if health["status"] != "ok":
    failed = [k for k, v in health["checks"].items() if not v]
    print(f"Health degraded: {failed}")

๐Ÿ–ฅ๏ธ CLI Reference

The antaris-memory CLI provides 4 commands for workspace management.

Global Flags

All commands accept:

--workspace PATH    Path to the memory workspace directory (default: ./memory)

init — Initialize a workspace

python -m antaris_memory init \
    --workspace ./my-agent-memory \
    --agent-name my-agent \
    [--force]

| Flag | Description |
|---|---|
| --workspace PATH | Target directory to initialize |
| --agent-name NAME | Agent name to embed in config |
| --force | Overwrite existing workspace |

Example output:

✓ Workspace initialized: ./my-agent-memory
✓ Agent name: my-agent
✓ Created: shards/, wal/, indexes/
Ready to use.

status — Show workspace status

python -m antaris_memory status --workspace ./my-agent-memory

Example output:

antaris-memory v5.0.1
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Workspace:      ./my-agent-memory
Agent:          my-agent
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total entries:  1,247
  Hot  (0-3d):  83
  Warm (3-14d): 312
  Cold (14d+):  852
WAL entries:    14
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Graph nodes:    428
Graph edges:    1,102
Cooccurrence:   8,940 pairs
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Disk usage:     47.2 MB
Cache hit rate: 82.4%
Avg search:     8.3 ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Health: ✓ ok

rebuild-graph — Rebuild knowledge graph

python -m antaris_memory rebuild-graph --workspace ./my-agent-memory

Example output:

Rebuilding knowledge graph...
Processed 1,247 entries
✓ Graph rebuilt: 428 nodes, 1,102 edges
Duration: 2.4s

Use after bulk imports or when graph data seems stale.


serve — Start MCP server

python -m antaris_memory serve \
    --workspace ./my-agent-memory \
    --agent-name my-agent

| Flag | Description |
|---|---|
| --workspace PATH | Memory workspace to serve |
| --agent-name NAME | Agent name for scoping |

The server communicates over stdio (MCP protocol). Connect via any MCP-compatible client.


📚 Full API Reference

Constructor

| Parameter | Type | Default | Description |
|---|---|---|---|
| workspace | str | required | Root directory for memory files |
| half_life | float | 7.0 | Temporal decay half-life in days |
| tag_terms | list | None | Custom auto-tag terms |
| use_sharding | bool | True | Enterprise shard splitting |
| use_indexing | bool | True | Pre-built search indexes |
| enable_read_cache | bool | True | LRU read cache |
| cache_max_entries | int | 1000 | LRU cache size limit |
| agent_name | str | None | Agent scope (⚠️ required) |
| enricher | Callable | None | LLM enrichment hook |
| tiered_storage | bool | True | Hot/warm/cold tier management |
| graph_intelligence | bool | True | Entity extraction + graph |
| quality_routing | bool | True | Follow-up pattern detection |
| semantic_expansion | bool | True | PPMI query expansion |
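The half_life parameter above implies the standard exponential half-life weighting used for temporal decay. A sketch of that curve (the library's exact decay function is an assumption):

```python
def decay_weight(age_days: float, half_life: float = 7.0) -> float:
    """Exponential decay: a memory's weight halves every `half_life` days."""
    return 0.5 ** (age_days / half_life)

today = decay_weight(0.0)    # fresh memory, full weight
week = decay_weight(7.0)     # one half-life old with the default 7.0
month = decay_weight(28.0)   # four half-lives old
```

Raising half_life makes old memories linger longer in search results; lowering it biases recall toward recent events.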

Lifecycle Methods

| Method | Returns | Description |
|---|---|---|
| load() | int | Load from disk; returns entry count |
| save() | str | Save to disk; returns path |
| flush() | dict | Compact WAL to shards |
| close() | None | Flush + release resources |

Ingestion Methods

| Method | Returns | Description |
|---|---|---|
| ingest(content, source, category, memory_type, tags, agent_id, session_id, channel_id, source_url, content_hash) | int | Store a memory entry |
| ingest_fact(content, source, tags, category) | int | Store a verified fact |
| ingest_preference(content, source, tags, category) | int | Store a preference |
| ingest_procedure(content, source, tags, category) | int | Store a procedure |
| ingest_mistake(what_happened, correction, root_cause, severity, tags) | MemoryEntry | Store a mistake with full context |
| ingest_file(file_path, category) | int | Ingest a file |
| ingest_directory(dir_path, category, pattern) | int | Ingest a directory of files |
| ingest_url(url, depth, incremental) | dict | Ingest web content |
| ingest_data_file(path, format, **kwargs) | dict | Ingest CSV/JSON/SQLite |
| ingest_sql(db_path, query) | dict | Ingest SQL query results |
| ingest_with_gating(content, source, context) | int | Ingest with P0–P3 priority gating |

Search & Retrieval Methods

| Method | Returns | Description |
|---|---|---|
| search(query, limit, tags, tag_mode, date_range, use_decay, category, min_confidence, sentiment_filter, memory_type, explain, session_id, agent_id, include_cold) | list | 11-layer search |
| search_with_context(query, limit, instrumentation_context, cooccurrence_boost) | tuple[list, SearchContext] | Search with instrumentation |
| recent(limit, agent_id, include_shared) | list | Recency-first retrieval |
| on_date(date) | list | All memories from a date |
| between(start, end) | list | All memories in a date range |
| analyze(query, limit) | dict | Structured topic analysis |
| synthesize_knowledge(topic, limit) | str | Free-text knowledge synthesis |
| narrative(topic) | str | Prose narrative from memories |

Graph Methods

| Method | Returns | Description |
|---|---|---|
| graph_search(subject, relation, obj) | list | Query relationship triples |
| entity_path(source, target, max_hops) | list | Find entity relationship path |
| get_entity(canonical) | dict | Get entity node info |
| get_graph_stats() | dict | Graph statistics |
| rebuild_graph() | int | Rebuild graph from all entries |
| rebuild_clusters() | int | Rebuild topic clusters |

Context Packet Methods

| Method | Returns | Description |
|---|---|---|
| build_context_packet(task, tags, category, environment, instructions, max_memories, max_tokens, min_relevance, include_mistakes, max_pitfalls) | ContextPacket | Build single-query context packet |
| build_context_packet_multi(task, queries, ...) | ContextPacket | Build multi-query context packet |

ContextPacket methods:

  • packet.render() → str (markdown)
  • packet.to_dict() → dict (serializable)
  • packet.trim(max_tokens) → trims in place

Shared Pool Methods

| Method | Returns | Description |
|---|---|---|
| enable_shared_pool(pool_dir, pool_name, agent_id, role, load_existing) | SharedMemoryPool | Enable shared pool |
| shared_write(content, namespace, category, metadata) | object | Write to shared pool |
| shared_search(query, namespace, limit) | list | Search shared pool |

LLM Enrichment Methods

| Method | Returns | Description |
|---|---|---|
| re_enrich(batch_size, progress_fn, overwrite) | int | Batch-enrich existing entries |
| get_enrichment_count(reset) | int | Get enrichment call count |
| set_embedding_fn(fn) | None | Set embedding function for hybrid search |

Maintenance Methods

| Method | Returns | Description |
|---|---|---|
| compact() | dict | Remove duplicates, expire stale entries |
| consolidate() | dict | Group and deduplicate similar memories |
| compress_old(days) | list | Compress entries older than N days |
| reindex() | None | Rebuild search indexes |
| forget(topic, entity, before_date) | dict | Selectively remove memories |
| delete_source(source_url) | dict | Remove all memories from a source |
| mark_used(memory_ids, context) | int | Mark memories as used |
| boost_relevance(memory_id, multiplier) | bool | Boost a memory's score |

Stats & Health Methods

| Method | Returns | Description |
|---|---|---|
| get_stats() / stats() | dict | Comprehensive statistics |
| get_health() | dict | Health check |
| get_hot_entries(top_n) | list | Most-accessed hot entries |

Data Integrity Methods

| Method | Returns | Description |
|---|---|---|
| export(output_path, include_metadata) | int | Export to JSON |
| import_from(input_path, merge) | int | Import from JSON |
| import_memories(path) | int | Import (alias) |
| validate_data() | dict | Validate data integrity |
| migrate_to_v4() | dict | Migrate from older format |
| rollback_migration() | dict | Rollback migration |

🧪 Full Example: Production Agent

import json
import anthropic
from antaris_memory import MemorySystem

# ── Enricher ───────────────────────────────────────────────────────────
client = anthropic.Anthropic()

def enricher(content: str) -> dict:
    response = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=400,
        messages=[{"role": "user", "content": f"""
Return JSON only:
{{"tags": ["tag1"], "summary": "one-line restatement", 
 "keywords": ["kw1"], "search_queries": ["natural query that should find this"]}}

Content: {content[:500]}"""}]
    )
    try:
        return json.loads(response.content[0].text)
    except Exception:
        return {"tags": [], "summary": content[:100], "keywords": [], "search_queries": []}

# ── Setup ──────────────────────────────────────────────────────────────
mem = MemorySystem(
    workspace="./production-memory",
    agent_name="prod-agent",
    enricher=enricher,
    tiered_storage=True,
    graph_intelligence=True,
    semantic_expansion=True,
)
count = mem.load()
print(f"Loaded {count} memories")

# ── Ingest various memory types ────────────────────────────────────────
mem.ingest_fact(
    "AWS us-east-1 is our primary region. Failover to us-west-2.",
    source="infra-docs",
    tags=["aws", "infrastructure"]
)

mem.ingest_procedure(
    "Incident response: 1) Page on-call, 2) Open incident channel, 3) Assign incident commander, 4) Update status page",
    source="runbook",
    tags=["incident", "ops"]
)

mem.ingest_mistake(
    what_happened="Deployed to production without a feature flag, caused 2h outage",
    correction="All new features must ship behind a LaunchDarkly flag",
    root_cause="Skipped pre-deploy checklist",
    severity="critical",
    tags=["deployment", "feature-flags", "outage"]
)

mem.ingest_url("https://docs.example.com/api/v2", depth=2, incremental=True)

# ── Build context packet for sub-agent ─────────────────────────────────
packet = mem.build_context_packet(
    task="Deploy new payment service to production",
    tags=["deployment", "payment"],
    max_tokens=4000,
    include_mistakes=True,
    max_pitfalls=5
)

print(packet.render())  # Inject into sub-agent system prompt

# ── Search ─────────────────────────────────────────────────────────────
results = mem.search(
    "production deployment failure",
    limit=5,
    memory_type="mistake",
    explain=True
)

for r in results:
    print(f"[{r.score:.3f}] {r.content[:80]}")

# ── Graph query ────────────────────────────────────────────────────────
path = mem.entity_path("payment-service", "aws-rds", max_hops=3)
print(" โ†’ ".join(path))

# ── Maintenance ────────────────────────────────────────────────────────
stats = mem.get_stats()
print(f"Entries: {stats['total_entries']} | Graph: {stats['graph_nodes']} nodes")

health = mem.get_health()
if health["status"] != "ok":
    print(f"DEGRADED: {health}")

result = mem.compact()
print(f"Compacted: removed {result['removed_count']} entries")

mem.save()
mem.close()

๐Ÿ›๏ธ Architecture Overview

antaris-memory/
├── Core
│   ├── MemorySystem (MemorySystemV4)
│   ├── MemoryEntry (typed entry schema)
│   └── WAL (Write-Ahead Log for crash safety)
│
├── Storage
│   ├── ShardManager (enterprise sharding)
│   ├── TierManager (hot/warm/cold routing)
│   └── GCSMemoryBackend (cloud backend)
│
├── Search
│   ├── BM25PlusIndex (Layer 1)
│   ├── PhraseMatcher (Layer 2)
│   ├── FieldBooster (Layer 3)
│   ├── RarityBooster (Layer 4)
│   ├── WindowScorer (Layer 5)
│   ├── QueryExpander + PPMIBootstrap (Layer 6)
│   ├── IntentReranker (Layer 7)
│   ├── QualifierFilter (Layer 8)
│   ├── ClusterBooster (Layer 9)
│   ├── EmbeddingReranker (Layer 10)
│   └── PseudoRelevanceFeedback (Layer 11)
│
├── Intelligence
│   ├── EntityExtractor
│   ├── MemoryGraph
│   ├── LLMEnricher
│   └── CategoryTagger
│
├── Multi-Agent
│   ├── SharedMemoryPool
│   ├── AgentRole (COORDINATOR/WRITER/READER)
│   └── AgentConfig
│
├── Context
│   ├── ContextPacketBuilder
│   └── ContextPacket
│
└── Server
    ├── AntarisMCPServer
    └── CLI (init/status/rebuild-graph/serve)
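For orientation, the kind of scoring the Layer 1 index performs can be sketched with the textbook BM25 formula (conventional k1 and b defaults; this is an illustration, not the internals of BM25PlusIndex):

```python
import math
from collections import Counter

def bm25_score(query: str, doc: str, corpus: list,
               k1: float = 1.5, b: float = 0.75) -> float:
    """Standard BM25: sum over query terms of IDF * length-normalized TF."""
    docs_tokens = [d.split() for d in corpus]
    avgdl = sum(len(t) for t in docs_tokens) / len(docs_tokens)
    n = len(corpus)
    tokens = doc.split()
    tf = Counter(tokens)
    score = 0.0
    for term in query.split():
        df = sum(1 for t in docs_tokens if term in t)          # document frequency
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)        # smoothed IDF
        denom = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
        score += idf * (tf[term] * (k1 + 1)) / denom           # saturated TF
    return score

corpus = ["api endpoint error", "deployment rollback", "api timeout"]
s = bm25_score("api error", corpus[0], corpus)
```

The later layers then adjust this base ranking with phrase, field, rarity, proximity, and semantic signals before results are returned.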

📄 License

Apache 2.0 License


🔗 Links


antaris-memory is the flagship package of the Antaris Analytics LLC suite — built for production AI agent deployments.
