
Parsica-Memory

Persistent, intelligent memory for AI agents. Zero dependencies. Pure Python. File-backed.



What Is This?

AI agents are stateless by default. Every spawn is a cold start. parsica-memory gives agents a persistent, searchable, intelligent memory store that:

  • Remembers across sessions, spawns, and restarts
  • Retrieves the right memories using a 12-layer BM25+ search engine
  • Decays old memories gracefully so signal stays high
  • Enriches memories via LLM at write time for dramatically better recall
  • Shares knowledge across multi-agent teams
  • Learns from mistakes, facts, and procedures with specialized memory types
  • Cross-session recall - semantic memories surface across all sessions automatically

78.0% LOCOMO accuracy (beats mem0 at 66.9%). 88.5% doc2query R@1. Zero vector database. Zero API keys required for core features. Zero external services.


Quick Start

pip install parsica-memory

Option 1: MemoryManager (Recommended)

The easiest way to use Parsica-Memory. Handles per-user isolation, noise filtering, cross-session recall, and optional LLM enrichment out of the box.

from parsica_memory import MemoryManager

# Basic - no enrichment (BM25-only, still good)
mm = MemoryManager("./store")

# With enrichment - pass any API key
mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")

# Or auto-detect from environment variables
mm = MemoryManager("./store", auto_enrich=True)

# Store a conversation
mm.ingest("user123", "What's the capital of France?", "The capital of France is Paris.")

# Recall relevant memories
results = mm.recall("user123", "France capital")
for r in results:
    print(r)

# Search with full metadata
results = mm.search("user123", "Paris", explain=True)
for r in results:
    print(f"{r['relevance']:.2f} - {r['content'][:80]}")

Option 2: Direct MemorySystem (Full Control)

from parsica_memory import MemorySystem

mem = MemorySystem(workspace="./memory", agent_name="my-agent")
mem.load()

mem.ingest("Deployed v2.3.1 to production. All checks green.",
           source="deploy-log", session_id="session-123")

results = mem.search("production deployment",
                     session_id="session-456",
                     cross_session_recall="semantic")
for r in results:
    print(r.content)

mem.save()

Installation

pip install parsica-memory

Version: 2.1.2. Requirements: Python 3.9+. Zero external dependencies (stdlib only).


MemoryManager

MemoryManager is the batteries-included wrapper. It gives you everything you need to build a memory-enabled application without writing boilerplate.

What It Does

  • Per-user isolation - each user gets their own memory store in a sub-directory
  • Noise filtering - automatically skips junk (context packets, system messages, very short text)
  • Cross-session semantic recall - facts and decisions cross session boundaries, episodic chatter stays scoped
  • Auto-save - saves to disk after every ingest so nothing is lost on crash
  • LLM enrichment - optional, pluggable, works with any provider

Constructor

MemoryManager(
    store_path="./store",      # Root directory for all user stores
    search_limit=10,           # Max results per search
    min_relevance=0.3,         # Minimum score to include (0-1)
    api_key=None,              # LLM API key for enrichment (optional)
    provider=None,             # "anthropic", "openai", "gemini", "local", or "auto"
    model=None,                # Model override (uses sensible defaults)
    auto_enrich=False,         # Auto-detect enricher from env vars
    agent_name="parsica",      # Name for memory provenance
)

Methods

# Ingest a conversation turn (filters noise, enriches if configured, auto-saves)
mm.ingest(user_id, user_message, bot_response, session_id="", channel_id="")

# Ingest arbitrary content (documents, notes, facts)
mm.ingest_raw(user_id, content, source="manual", category="note")

# Recall relevant memories as plain strings
results = mm.recall(user_id, query, session_id="", limit=None)

# Search with full metadata (relevance, source, category, matched_terms)
results = mm.search(user_id, query, explain=False)

# Batch-enrich all existing memories for a user
count = mm.enrich_all(user_id, overwrite=False)

# Export a user's memories to JSON
count = mm.export(user_id, "./export.json")

# Get stats for a user or all users
stats = mm.stats(user_id)

# Delete all memories for a user
count = mm.clear(user_id)

Provider Auto-Detection

If you pass just an API key without specifying a provider, MemoryManager guesses from the key prefix:

Key Prefix   Detected Provider   Default Model
sk-ant-      Anthropic           claude-sonnet-4-6
sk-          OpenAI              gpt-4o-mini
AI           Google Gemini       gemini-2.0-flash

Or set provider="auto" / auto_enrich=True to check environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY) in order.
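The prefix rules can be expressed as a small helper. This is an illustrative sketch of the documented table, not the actual detection code inside MemoryManager:

```python
def detect_provider(api_key: str) -> tuple:
    """Guess (provider, default_model) from an API key prefix.

    Illustrative only; mirrors the documented prefix table.
    """
    # Check "sk-ant-" before the generic "sk-" so Anthropic keys
    # are not misread as OpenAI keys.
    if api_key.startswith("sk-ant-"):
        return ("anthropic", "claude-sonnet-4-6")
    if api_key.startswith("sk-"):
        return ("openai", "gpt-4o-mini")
    if api_key.startswith("AI"):
        return ("gemini", "gemini-2.0-flash")
    return (None, None)  # Unknown prefix: fall back to env-var detection
```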


LLM Enrichment: How It Works

Enrichment is what makes Parsica-Memory beat vector databases at recall. When you ingest a memory, an optional LLM call generates additional search metadata that dramatically improves retrieval quality.

The Problem Enrichment Solves

Without enrichment, a memory like:

"We decided to use FastAPI for the backend"

...can only be found by searching for words that appear in it: "FastAPI", "backend", "decided". But a user might search for "web framework", "API server", "http library", or "what tech stack did we pick?" - none of which appear in the original text.

How Enrichment Fixes This

At ingest time, an LLM reads the memory and generates four fields:

  • tags - 5-10 semantic concept labels (web_framework, architecture_decision, python); boosted in field matching
  • summary - a one-sentence rephrase using different vocabulary than the original; 2x weight in search scoring
  • keywords - 8-15 terms a person might type when looking for this memory; 2x weight in search scoring
  • search_queries - 3 specific natural-language questions that ONLY this memory answers; 3x weight in search scoring

For the FastAPI example, enrichment might produce:

{
  "tags": ["web_framework", "architecture_decision", "python", "backend", "api_server"],
  "summary": "The team selected FastAPI as their Python HTTP framework for the API layer",
  "keywords": ["framework", "http", "api server", "asgi", "tech stack", "web library"],
  "search_queries": [
    "What web framework did we choose for the backend?",
    "Why did we pick FastAPI over Flask or Django?",
    "What is our Python API server technology?"
  ]
}

Now the memory is findable by ANY of those terms, not just the original words. This is why our LOCOMO accuracy (78.0%) beats mem0 (66.9%) - enrichment closes the vocabulary gap without needing embeddings.

Enrichment at Ingest Time

When you configure an enricher, every new memory gets enriched automatically at write time:

from parsica_memory import MemoryManager

# Enrichment happens automatically on every ingest
mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")
mm.ingest("user1", "Let's use PostgreSQL for the database", "Good choice, PostgreSQL it is.")
# ^ This memory is now findable by: "database", "postgres", "sql",
#   "what DB are we using", "data storage decision", etc.

Enriching Existing Memories (Backfill)

Already have memories stored without enrichment? The enrich_all() method batch-processes everything:

from parsica_memory import MemoryManager

mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")

# Enrich all un-enriched memories for a user
count = mm.enrich_all("user1")
print(f"Enriched {count} memories")

# Force re-enrich everything (even already-enriched entries)
count = mm.enrich_all("user1", overwrite=True)

# With progress tracking
def progress(done, total):
    print(f"  {done}/{total}...")
count = mm.enrich_all("user1", progress_fn=progress)

Using the Low-Level API for Backfill

If you're using MemorySystem directly instead of MemoryManager:

from parsica_memory import MemorySystem
from parsica_memory.enrichers import anthropic_enricher

# Create enricher
enrich = anthropic_enricher("sk-ant-...")

# Load existing store and attach enricher
mem = MemorySystem(workspace="./store", agent_name="my-agent", enricher=enrich)
mem.load()

# Batch-enrich all existing memories
count = mem.re_enrich(
    batch_size=50,          # Progress callback fires every N entries
    overwrite=False,        # Skip already-enriched entries
    progress_fn=lambda done, total: print(f"{done}/{total}")
)
print(f"Enriched {count} memories")
mem.save()

Cost Estimate

Enrichment uses one LLM call per memory, with a 600-character input cap and 256-token output cap:

Provider         Model               Approximate Cost per 1,000 Memories
Anthropic        claude-sonnet-4-6   ~$0.50
OpenAI           gpt-4o-mini         ~$0.15
Google           gemini-2.0-flash    ~$0.05
Local (Ollama)   llama3.2            $0.00

Pre-Built Enricher Templates

Four provider-ready enricher factories ship with Parsica-Memory. All use raw urllib from the standard library - zero SDK dependencies.

Anthropic (Claude)

from parsica_memory.enrichers import anthropic_enricher

enrich = anthropic_enricher(
    api_key="sk-ant-...",            # Or set ANTHROPIC_API_KEY env var
    model="claude-sonnet-4-6",       # Default
)
mem = MemorySystem("./store", agent_name="my-agent", enricher=enrich)

OpenAI (GPT)

from parsica_memory.enrichers import openai_enricher

enrich = openai_enricher(
    api_key="sk-...",                # Or set OPENAI_API_KEY env var
    model="gpt-4o-mini",            # Default (cheap and fast)
    base_url="https://api.openai.com",  # Change for Azure, etc.
)

Google Gemini

from parsica_memory.enrichers import gemini_enricher

enrich = gemini_enricher(
    api_key="AI...",                 # Or set GOOGLE_API_KEY env var
    model="gemini-2.0-flash",       # Default
)

Local Models (Ollama, LM Studio, vLLM)

from parsica_memory.enrichers import local_enricher

enrich = local_enricher(
    base_url="http://localhost:11434",  # Ollama default
    model="llama3.2",
    timeout=60,                         # Longer timeout for local models
)

No API key needed. Works with anything that speaks the OpenAI Chat Completions format.

Auto-Detect

from parsica_memory.enrichers import auto_enricher

# Checks ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY in order
enrich = auto_enricher()
if enrich:
    mem = MemorySystem("./store", agent_name="my-agent", enricher=enrich)

Custom Enricher

Write your own - just return a dict:

def my_enricher(text: str) -> dict:
    # Call whatever you want
    return {
        "tags": ["tag1", "tag2"],
        "summary": "One sentence rephrase",
        "keywords": ["term1", "term2", "term3"],
        "search_queries": [
            "Question 1?",
            "Question 2?",
            "Question 3?"
        ]
    }

mem = MemorySystem("./store", agent_name="my-agent", enricher=my_enricher)

All fields are optional. Missing fields are silently ignored.


v2.0.0 Features

Fact Decomposition

When enrichment is enabled, conversation turns are automatically broken down into atomic facts and stored as separate searchable entries.

Before (v1.x): A conversation like "User: I like vertical monitors and ultrawide. Also my dog's name is Koda." is stored as one blob.

After (v2.0): The enricher extracts atomic facts and stores each one independently:

  • "User prefers vertical monitor + ultrawide setup" (type: fact, provenance: decomposed)
  • "User's dog is named Koda" (type: fact, provenance: decomposed)
  • Original conversation still stored as episodic entry

Each fact gets:

  • memory_type="fact" - crosses session boundaries automatically
  • provenance="decomposed" - tracks where it came from
  • parent_hash - links back to the original conversation entry
  • confidence=0.8 - starts high (confirmed by the user in conversation)

# Facts are extracted automatically during ingest when an enricher is configured
mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")
mm.ingest("user1", "I like dark mode and my password min is 8 chars", "Got it!")

# Each fact is now independently searchable
results = mm.search("user1", "password requirements")
# -> Returns: "Password minimum is 8 characters" (not the whole conversation)

Provenance Tracking

Every memory now carries provenance metadata - where it came from and how it got there:

Provenance    Meaning
self          Created directly by this agent (default)
decomposed    Extracted as an atomic fact from a conversation
shared        Received from another agent via shared memory pool
transferred   Migrated from another memory system
workspace     Ingested from a workspace file
manual        Manually added by a user or admin

# Structured recall includes provenance
results = mm.recall_structured("user1", "password")
for r in results:
    print(f"{r['content']} (from: {r['provenance']}, confidence: {r['confidence']:.2f})")

Fuzzy Dedup + Canonicalization

When the same fact appears in slightly different phrasings, Parsica-Memory detects the overlap and links them instead of storing duplicates.

How it works:

  • At ingest time, new facts are compared against existing facts using Jaccard word similarity
  • If similarity >= 0.85 (configurable), the existing canonical fact is kept and its metadata is updated:
    • confidence bumped by +0.05 (capped at 1.0)
    • sighting_count incremented
    • last_sighted timestamp updated
  • The duplicate is NOT stored

This means facts that are confirmed multiple times become MORE confident over time, while the store stays clean.

# First mention
mm.ingest("user1", "I prefer dark mode", "Noted!")
# -> Stored as fact: "User prefers dark mode" (confidence: 0.8)

# Second mention (different phrasing)
mm.ingest("user1", "Can you enable dark mode? I always use it", "Done!")
# -> Recognized as same fact, sighting_count bumped, confidence: 0.85
# -> NO duplicate created
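For intuition, Jaccard word similarity is just set overlap. A minimal sketch (the library's real tokenization and canonicalization logic is not shown here):

```python
def jaccard_similarity(a: str, b: str) -> float:
    # Word-level Jaccard: |intersection| / |union| of lowercased word sets.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

# Two phrasings of the same fact share most of their words, so their
# similarity lands near the 0.85 dedup threshold; unrelated text scores low.
score = jaccard_similarity("user prefers dark mode", "user always prefers dark mode")
```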

Confidence Threshold ("I Don't Know")

Search now supports a confidence_floor parameter. If nothing scores above the threshold, the system returns an empty list instead of guessing.

# Normal search - returns best matches regardless of confidence
results = mm.recall("user1", "quantum physics experiments")
# -> Might return vaguely related results

# With confidence floor - only returns if confident
results = mm.recall("user1", "quantum physics experiments", confidence_floor=0.5)
# -> Returns [] if nothing truly matches (the "I don't know" response)

The floor is calculated as entry.confidence * search_relevance_score. Both the memory's inherent confidence AND its match quality must be high enough.
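That gate reduces to a simple product check, sketched here from the formula above (a hypothetical helper, not the library's internal code):

```python
def clears_floor(entry_confidence: float, relevance: float, floor: float) -> bool:
    # Both the memory's inherent confidence AND its match quality must be
    # high: a 0.9-confidence fact with a weak 0.3 relevance score (0.27)
    # still fails a 0.5 floor.
    return entry_confidence * relevance >= floor
```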

Anti-Adjacency Penalty (Layer 12)

New search scoring layer that penalizes results matching the TOPIC but not the specific ANSWER.

The problem: Searching "what was the test phrase?" might return "we did memory testing" instead of "the test phrase was Peaches from Space."

The fix: When the query asks a specific question (starts with "what is", "who", "which", etc.), results are checked for specific identifiers:

  • Proper nouns, quoted strings, version numbers, code identifiers, constants
  • If the result is meta-discussion without a concrete answer, it gets a 0.6x penalty
  • If the result lacks specific values, it gets a 0.85x penalty
  • General queries (not specific questions) are NOT penalized

Enhanced Rarity Boost

Unique, identifying information now scores dramatically higher:

Content Type                          Boost (v1.x)   Boost (v2.0)
Proper nouns                          1.5x           3.0x
Extremely rare terms (< 1% of docs)   2.0x           3.0x
Very rare terms (1-5% of docs)        1.5x           2.25x
Quoted strings                        1.0x           3.0x
Version numbers                       1.0x           2.5x
Code identifiers (backticks)          1.0x           2.5x
CONSTANTS/ACRONYMS                    1.0x           2.0x

This means "Peaches from Space" massively outranks "some testing happened" when both match a query.


12-Layer Search Engine

Every query runs through 12 scoring layers:

  1. BM25+ TF-IDF - baseline relevance with delta floor
  2. Exact Phrase Bonus - verbatim matches score higher
  3. Field Boosting - tags, source, category weighted; enriched fields get 2-3x weight
  4. Rarity & Proper Noun Boost - rare terms and names surface
  5. Positional Salience - intro/conclusion bias
  6. Semantic Expansion - PPMI co-occurrence query widening
  7. Intent Reranker - temporal, entity, howto detection
  8. Qualifier & Negation - "failed" != "successful"
  9. Clustering Boost - coherent result groups score higher
  10. Embedding Reranker - optional local embeddings (no API)
  11. Pseudo-Relevance Feedback - top results refine the query
  12. Anti-Adjacency Penalty - penalizes topic-match without answer-match (v2.0)

Memory Types

Type         Decay Rate   Importance    Cross-Session            Use Case
episodic     Normal       1x            Same session only        General events, chat
semantic     Normal       1x            Yes (crosses sessions)   Facts, decisions
fact         Normal       High recall   Yes                      Verified knowledge
mistake      10x slower   2x            Yes                      Never forget failures
preference   3x slower    1x            Yes                      User/agent preferences
procedure    3x slower    1x            Yes                      How-to knowledge

Memories are automatically classified at ingest time. No manual tagging needed.
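The relative decay rates in the table can be pictured as exponential decay with a stretched half-life. The exact curve below is an assumption for illustration (half_life=7.0 is the documented MemorySystem default):

```python
def decay_weight(age_days: float, half_life: float = 7.0,
                 slowdown: float = 1.0) -> float:
    # Illustrative exponential decay: the weight halves every
    # (half_life * slowdown) days. Per the table, mistakes decay
    # 10x slower (slowdown=10), preferences/procedures 3x slower.
    return 0.5 ** (age_days / (half_life * slowdown))

# After two weeks, an episodic memory has faded far more than a mistake:
episodic = decay_weight(14.0)                # 0.25 at the 7-day half-life
mistake = decay_weight(14.0, slowdown=10.0)  # ~0.87, still near full weight
```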


Cross-Session Recall

results = mem.search(
    "what's the API key format?",
    session_id="session-B",
    cross_session_recall="semantic"  # "all" | "semantic" | "none"
)

  • "all" - search everything regardless of session (default)
  • "semantic" - other sessions' memories only surface if they're facts/decisions, not episodic chatter
  • "none" - strict session isolation

Shared / Team Memory

Multiple agents can share a memory pool with role-based access control:

from parsica_memory import SharedPool, AgentRole, AgentConfig

pool = mem.enable_shared_pool(
    pool_dir="./shared",
    pool_name="project-alpha",
    agent_id="worker-1",
    role=AgentRole.WRITER
)
mem.shared_write("Research complete: competitor uses GraphQL", namespace="research")

Context Packets

Cold-spawn solution for sub-agents:

packet = mem.build_context_packet(
    task="Deploy the auth service",
    max_tokens=3000,
    include_mistakes=True
)
# Inject packet.render() into sub-agent system prompt

Graph Intelligence

Automatic entity extraction and knowledge graph:

path = mem.entity_path("payment-service", "database", max_hops=3)
triples = mem.graph_search(subject="PostgreSQL", relation="used_by")

Tiered Storage

Tier   Age         Behavior
Hot    0-3 days    Always loaded, fastest access
Warm   3-14 days   Loaded on-demand
Cold   14+ days    Requires include_cold=True
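The age thresholds map to tiers as follows (an illustrative sketch; the real TierManager internals may differ):

```python
def storage_tier(age_days: float) -> str:
    # Maps a memory's age to the documented tier boundaries.
    if age_days < 3:
        return "hot"    # always loaded
    if age_days < 14:
        return "warm"   # loaded on-demand
    return "cold"       # requires include_cold=True to search
```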

Input Gating

P0-P3 priority classification drops noise before it enters the store:

mem.ingest_with_gating("ok thanks", source="chat")  # dropped (P3 - noise)
mem.ingest_with_gating("Production outage: auth down", source="incident")  # stored (P0 - critical)

MCP Server

parsica-memory-mcp --workspace ./memory --agent-name my-agent

Works with Claude Desktop and any MCP-compatible client.
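For Claude Desktop, an mcpServers entry along these lines should work (illustrative config; the server name and workspace path are placeholders to adapt):

```json
{
  "mcpServers": {
    "parsica-memory": {
      "command": "parsica-memory-mcp",
      "args": ["--workspace", "./memory", "--agent-name", "my-agent"]
    }
  }
}
```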


CLI

# Check status
python -m parsica_memory status --workspace ./memory

# Initialize a workspace
python -m parsica_memory init --workspace ./memory --agent-name my-agent

# Rebuild knowledge graph
python -m parsica_memory rebuild-graph --workspace ./memory

# Start MCP server
python -m parsica_memory serve --workspace ./memory --agent-name my-agent

Full API Reference

MemorySystem

from parsica_memory import MemorySystem

mem = MemorySystem(
    workspace="./memory",          # Required - directory path
    agent_name="my-agent",         # Required - scopes the store
    half_life=7.0,                 # Decay half-life in days
    enricher=None,                 # LLM enrichment callable
    tiered_storage=True,           # Hot/warm/cold tiers
    graph_intelligence=True,       # Entity extraction + graph
    semantic_expansion=True,       # PPMI query expansion
)

# Lifecycle
mem.load()                         # Load from disk
mem.save()                         # Save to disk
mem.flush()                        # WAL to shards
mem.close()                        # Flush + release

# Ingestion
mem.ingest(content, source=..., session_id=..., channel_id=...)
mem.ingest_fact(content, source=...)
mem.ingest_mistake(what_happened=..., correction=..., root_cause=..., severity=...)
mem.ingest_preference(content, source=...)
mem.ingest_procedure(content, source=...)
mem.ingest_file(path, category=...)
mem.ingest_url(url, depth=2)

# Search
mem.search(query, limit=10, cross_session_recall="semantic", explain=False, confidence_floor=0.0)
mem.recent(limit=20)

# Enrichment
mem.re_enrich(batch_size=50, overwrite=False, progress_fn=None)

# Graph
mem.graph_search(subject=..., relation=..., obj=...)
mem.entity_path(source, target, max_hops=3)

# Context
mem.build_context_packet(task=..., max_tokens=3000, include_mistakes=True)

# Maintenance
mem.compact()
mem.consolidate()
mem.reindex()
mem.forget(topic=..., before_date=...)
mem.stats()

Benchmarks

Measured on Mac Mini M4 (10-core, 32GB), Python 3.14, 9,896 live memories:

Metric               Parsica-Memory             mem0
LOCOMO Accuracy      78.0%                      66.9%
doc2query R@1        88.5%                      -
p50 Search Latency   21ms                       -
Dependencies         0                          15+
Requires API Key     No (enrichment optional)   Yes
Requires Vector DB   No                         Yes

Architecture

parsica-memory/
+-- Core: MemorySystem, MemoryEntry, WAL, ShardManager
+-- Search: 12-layer BM25+ pipeline, SearchEngine
+-- Intelligence: EntityExtractor, MemoryGraph, LLM Enricher
+-- Multi-Agent: SharedMemoryPool, AgentRoles
+-- Context: ContextPacketBuilder
+-- Manager: MemoryManager (batteries-included wrapper)
+-- Enrichers: Anthropic, OpenAI, Gemini, Local, Auto-detect
+-- Storage: TierManager, GCS backend
+-- Server: MCP server, CLI

Relationship to Antaris Core

parsica-memory is the standalone memory package from Antaris Analytics. It ships as part of Antaris Core (the full agent infrastructure suite) and also as this independent package. Same engine, same features, independent install.


License

Apache 2.0



Built by Antaris Analytics LLC. Zero dependencies. Ships complete.
