Parsica-Memory
Persistent, intelligent memory for AI agents. Zero dependencies. Pure Python. File-backed.
What Is This?
AI agents are stateless by default. Every spawn is a cold start. parsica-memory gives agents a persistent, searchable, intelligent memory store that:
- Remembers across sessions, spawns, and restarts
- Retrieves the right memories using a 12-layer BM25+ search engine
- Decays old memories gracefully so signal stays high
- Enriches memories via LLM at write time for dramatically better recall
- Shares knowledge across multi-agent teams
- Learns from mistakes, facts, and procedures with specialized memory types
- Cross-session recall - semantic memories surface across all sessions automatically
78.0% LOCOMO accuracy (beats mem0 at 66.9%). 88.5% doc2query R@1. Zero vector database. Zero API keys required for core features. Zero external services.
Quick Start
```shell
pip install parsica-memory
```
Option 1: MemoryManager (Recommended)
The easiest way to use Parsica-Memory. Handles per-user isolation, noise filtering, cross-session recall, and optional LLM enrichment out of the box.
```python
from parsica_memory import MemoryManager

# Basic - no enrichment (BM25-only, still good)
mm = MemoryManager("./store")

# With enrichment - pass any API key
mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")

# Or auto-detect from environment variables
mm = MemoryManager("./store", auto_enrich=True)

# Store a conversation
mm.ingest("user123", "What's the capital of France?", "The capital of France is Paris.")

# Recall relevant memories
results = mm.recall("user123", "France capital")
for r in results:
    print(r)

# Search with full metadata
results = mm.search("user123", "Paris", explain=True)
for r in results:
    print(f"{r['relevance']:.2f} - {r['content'][:80]}")
```
Option 2: Direct MemorySystem (Full Control)
```python
from parsica_memory import MemorySystem

mem = MemorySystem(workspace="./memory", agent_name="my-agent")
mem.load()

mem.ingest("Deployed v2.3.1 to production. All checks green.",
           source="deploy-log", session_id="session-123")

results = mem.search("production deployment",
                     session_id="session-456",
                     cross_session_recall="semantic")
for r in results:
    print(r.content)

mem.save()
```
Installation
```shell
pip install parsica-memory
```
Version: 2.1.2. Requirements: Python 3.9+. Zero external dependencies (stdlib only).
MemoryManager
MemoryManager is the batteries-included wrapper. It gives you everything you need to build a memory-enabled application without writing boilerplate.
What It Does
- Per-user isolation - each user gets their own memory store in a sub-directory
- Noise filtering - automatically skips junk (context packets, system messages, very short text)
- Cross-session semantic recall - facts and decisions cross session boundaries, episodic chatter stays scoped
- Auto-save - saves to disk after every ingest so nothing is lost on crash
- LLM enrichment - optional, pluggable, works with any provider
Constructor
```python
MemoryManager(
    store_path="./store",     # Root directory for all user stores
    search_limit=10,          # Max results per search
    min_relevance=0.3,        # Minimum score to include (0-1)
    api_key=None,             # LLM API key for enrichment (optional)
    provider=None,            # "anthropic", "openai", "gemini", "local", or "auto"
    model=None,               # Model override (uses sensible defaults)
    auto_enrich=False,        # Auto-detect enricher from env vars
    agent_name="parsica",     # Name for memory provenance
)
```
Methods
```python
# Ingest a conversation turn (filters noise, enriches if configured, auto-saves)
mm.ingest(user_id, user_message, bot_response, session_id="", channel_id="")

# Ingest arbitrary content (documents, notes, facts)
mm.ingest_raw(user_id, content, source="manual", category="note")

# Recall relevant memories as plain strings
results = mm.recall(user_id, query, session_id="", limit=None)

# Search with full metadata (relevance, source, category, matched_terms)
results = mm.search(user_id, query, explain=False)

# Batch-enrich all existing memories for a user
count = mm.enrich_all(user_id, overwrite=False)

# Export a user's memories to JSON
count = mm.export(user_id, "./export.json")

# Get stats for a user or all users
stats = mm.stats(user_id)

# Delete all memories for a user
count = mm.clear(user_id)
```
Provider Auto-Detection
If you pass just an API key without specifying a provider, MemoryManager guesses from the key prefix:
| Key Prefix | Detected Provider | Default Model |
|---|---|---|
| `sk-ant-` | Anthropic | claude-sonnet-4-6 |
| `sk-` | OpenAI | gpt-4o-mini |
| `AI` | Google Gemini | gemini-2.0-flash |
Or set provider="auto" / auto_enrich=True to check environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY) in order.
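The detection itself is just an ordered prefix check. A minimal sketch (the function name and error handling are ours for illustration, not the library's internals — note that `sk-ant-` must be tested before the more general `sk-`):

```python
# Illustrative sketch of key-prefix provider detection (not the actual
# library internals). Order matters: "sk-ant-" is a subset of "sk-".
def detect_provider(api_key: str) -> tuple[str, str]:
    """Return (provider, default_model) guessed from the key prefix."""
    if api_key.startswith("sk-ant-"):
        return ("anthropic", "claude-sonnet-4-6")
    if api_key.startswith("sk-"):
        return ("openai", "gpt-4o-mini")
    if api_key.startswith("AI"):
        return ("gemini", "gemini-2.0-flash")
    raise ValueError("Unrecognized key prefix; pass provider= explicitly")

print(detect_provider("sk-ant-xxxx"))  # ('anthropic', 'claude-sonnet-4-6')
print(detect_provider("sk-xxxx"))      # ('openai', 'gpt-4o-mini')
```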
LLM Enrichment: How It Works
Enrichment is what makes Parsica-Memory beat vector databases at recall. When you ingest a memory, an optional LLM call generates additional search metadata that dramatically improves retrieval quality.
The Problem Enrichment Solves
Without enrichment, a memory like:
"We decided to use FastAPI for the backend"
...can only be found by searching for words that appear in it: "FastAPI", "backend", "decided". But a user might search for "web framework", "API server", "http library", or "what tech stack did we pick?" - none of which appear in the original text.
How Enrichment Fixes This
At ingest time, an LLM reads the memory and generates four fields:
| Field | What It Contains | Search Weight |
|---|---|---|
| `tags` | 5-10 semantic concept labels (web_framework, architecture_decision, python) | Boosted in field matching |
| `summary` | One-sentence rephrase using DIFFERENT vocabulary than the original | 2x weight in search scoring |
| `keywords` | 8-15 terms a person might type when looking for this memory | 2x weight in search scoring |
| `search_queries` | 3 specific natural-language questions ONLY this memory answers | 3x weight in search scoring |
For the FastAPI example, enrichment might produce:
```json
{
  "tags": ["web_framework", "architecture_decision", "python", "backend", "api_server"],
  "summary": "The team selected FastAPI as their Python HTTP framework for the API layer",
  "keywords": ["framework", "http", "api server", "asgi", "tech stack", "web library"],
  "search_queries": [
    "What web framework did we choose for the backend?",
    "Why did we pick FastAPI over Flask or Django?",
    "What is our Python API server technology?"
  ]
}
```
Now the memory is findable by ANY of those terms, not just the original words. This is why our LOCOMO accuracy (78.0%) beats mem0 (66.9%) - enrichment closes the vocabulary gap without needing embeddings.
Enrichment at Ingest Time
When you configure an enricher, every new memory gets enriched automatically at write time:
```python
from parsica_memory import MemoryManager

# Enrichment happens automatically on every ingest
mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")
mm.ingest("user1", "Let's use PostgreSQL for the database", "Good choice, PostgreSQL it is.")
# ^ This memory is now findable by: "database", "postgres", "sql",
#   "what DB are we using", "data storage decision", etc.
```
Enriching Existing Memories (Backfill)
Already have memories stored without enrichment? The enrich_all() method batch-processes everything:
```python
from parsica_memory import MemoryManager

mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")

# Enrich all un-enriched memories for a user
count = mm.enrich_all("user1")
print(f"Enriched {count} memories")

# Force re-enrich everything (even already-enriched entries)
count = mm.enrich_all("user1", overwrite=True)

# With progress tracking
def progress(done, total):
    print(f"  {done}/{total}...")

count = mm.enrich_all("user1", progress_fn=progress)
```
Using the Low-Level API for Backfill
If you're using MemorySystem directly instead of MemoryManager:
```python
from parsica_memory import MemorySystem
from parsica_memory.enrichers import anthropic_enricher

# Create enricher
enrich = anthropic_enricher("sk-ant-...")

# Load existing store and attach enricher
mem = MemorySystem(workspace="./store", agent_name="my-agent", enricher=enrich)
mem.load()

# Batch-enrich all existing memories
count = mem.re_enrich(
    batch_size=50,     # Progress callback fires every N entries
    overwrite=False,   # Skip already-enriched entries
    progress_fn=lambda done, total: print(f"{done}/{total}")
)
print(f"Enriched {count} memories")
mem.save()
```
Cost Estimate
Enrichment uses one LLM call per memory, with a 600-character input cap and 256-token output cap:
| Provider | Model | Approximate Cost per 1,000 Memories |
|---|---|---|
| Anthropic | claude-sonnet-4-6 | ~$0.50 |
| OpenAI | gpt-4o-mini | ~$0.15 |
| Google | gemini-2.0-flash | ~$0.05 |
| Local (Ollama) | llama3.2 | $0.00 |
Pre-Built Enricher Templates
Four provider-ready enricher factories ship with Parsica-Memory. All use raw urllib from the standard library - zero SDK dependencies.
Anthropic (Claude)
```python
from parsica_memory.enrichers import anthropic_enricher

enrich = anthropic_enricher(
    api_key="sk-ant-...",       # Or set ANTHROPIC_API_KEY env var
    model="claude-sonnet-4-6",  # Default
)
mem = MemorySystem("./store", agent_name="my-agent", enricher=enrich)
```
OpenAI (GPT)
```python
from parsica_memory.enrichers import openai_enricher

enrich = openai_enricher(
    api_key="sk-...",                   # Or set OPENAI_API_KEY env var
    model="gpt-4o-mini",                # Default (cheap and fast)
    base_url="https://api.openai.com",  # Change for Azure, etc.
)
```
Google Gemini
```python
from parsica_memory.enrichers import gemini_enricher

enrich = gemini_enricher(
    api_key="AI...",            # Or set GOOGLE_API_KEY env var
    model="gemini-2.0-flash",   # Default
)
```
Local Models (Ollama, LM Studio, vLLM)
```python
from parsica_memory.enrichers import local_enricher

enrich = local_enricher(
    base_url="http://localhost:11434",  # Ollama default
    model="llama3.2",
    timeout=60,                         # Longer timeout for local models
)
```
No API key needed. Works with anything that speaks the OpenAI Chat Completions format.
Auto-Detect
```python
from parsica_memory.enrichers import auto_enricher

# Checks ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY in order
enrich = auto_enricher()
if enrich:
    mem = MemorySystem("./store", agent_name="my-agent", enricher=enrich)
```
Custom Enricher
Write your own - just return a dict:
```python
def my_enricher(text: str) -> dict:
    # Call whatever you want
    return {
        "tags": ["tag1", "tag2"],
        "summary": "One sentence rephrase",
        "keywords": ["term1", "term2", "term3"],
        "search_queries": [
            "Question 1?",
            "Question 2?",
            "Question 3?"
        ]
    }

mem = MemorySystem("./store", agent_name="my-agent", enricher=my_enricher)
```
All fields are optional. Missing fields are silently ignored.
v2.0.0 Features
Fact Decomposition
When enrichment is enabled, conversation turns are automatically broken down into atomic facts and stored as separate searchable entries.
Before (v1.x): A conversation like "User: I like vertical monitors and ultrawide. Also my dog's name is Koda." is stored as one blob.
After (v2.0): The enricher extracts atomic facts and stores each one independently:
- "User prefers vertical monitor + ultrawide setup" (type: fact, provenance: decomposed)
- "User's dog is named Koda" (type: fact, provenance: decomposed)
- The original conversation is still stored as an episodic entry
Each fact gets:
- `memory_type="fact"` - crosses session boundaries automatically
- `provenance="decomposed"` - tracks where it came from
- `parent_hash` - links back to the original conversation entry
- `confidence=0.8` - starts high (confirmed by the user in conversation)
```python
# Facts are extracted automatically during ingest when enricher is configured
mm = MemoryManager("./store", api_key="sk-ant-...", provider="anthropic")
mm.ingest("user1", "I like dark mode and my password min is 8 chars", "Got it!")

# Each fact is now independently searchable
results = mm.search("user1", "password requirements")
# -> Returns: "Password minimum is 8 characters" (not the whole conversation)
```
Provenance Tracking
Every memory now carries provenance metadata - where it came from and how it got there:
| Provenance | Meaning |
|---|---|
| `self` | Created directly by this agent (default) |
| `decomposed` | Extracted as an atomic fact from a conversation |
| `shared` | Received from another agent via shared memory pool |
| `transferred` | Migrated from another memory system |
| `workspace` | Ingested from a workspace file |
| `manual` | Manually added by a user or admin |
```python
# Structured recall includes provenance
results = mm.recall_structured("user1", "password")
for r in results:
    print(f"{r['content']} (from: {r['provenance']}, confidence: {r['confidence']:.2f})")
```
Fuzzy Dedup + Canonicalization
When the same fact appears in slightly different phrasings, Parsica-Memory detects the overlap and links them instead of storing duplicates.
How it works:
- At ingest time, new facts are compared against existing facts using Jaccard word similarity
- If similarity >= 0.85 (configurable), the existing canonical fact is kept and its metadata is updated:
  - `confidence` bumped by +0.05 (capped at 1.0)
  - `sighting_count` incremented
  - `last_sighted` timestamp updated
- The duplicate is NOT stored
This means facts that are confirmed multiple times become MORE confident over time, while the store stays clean.
```python
# First mention
mm.ingest("user1", "I prefer dark mode", "Noted!")
# -> Stored as fact: "User prefers dark mode" (confidence: 0.8)

# Second mention (different phrasing)
mm.ingest("user1", "Can you enable dark mode? I always use it", "Done!")
# -> Recognized as same fact, sighting_count bumped, confidence: 0.85
# -> NO duplicate created
```
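The similarity test described above is plain set arithmetic over word tokens. A minimal sketch in stdlib Python (function names are ours; the library's internals may tokenize or normalize differently):

```python
# Illustrative Jaccard word-similarity check behind fuzzy dedup.
def jaccard(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def is_duplicate(new_fact: str, existing_fact: str, threshold: float = 0.85) -> bool:
    # Above the threshold, the existing canonical fact is kept and updated
    return jaccard(new_fact, existing_fact) >= threshold

print(jaccard("user prefers dark mode", "user prefers dark mode always"))  # 0.8
```

At 0.8 similarity this pair would fall just below the default 0.85 threshold and be stored separately; identical phrasings score 1.0 and are merged.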
Confidence Threshold ("I Don't Know")
Search now supports a confidence_floor parameter. If nothing scores above the threshold, the system returns an empty list instead of guessing.
```python
# Normal search - returns best matches regardless of confidence
results = mm.recall("user1", "quantum physics experiments")
# -> Might return vaguely related results

# With confidence floor - only returns if confident
results = mm.recall("user1", "quantum physics experiments", confidence_floor=0.5)
# -> Returns [] if nothing truly matches (the "I don't know" response)
```
The floor is calculated as entry.confidence * search_relevance_score. Both the memory's inherent confidence AND its match quality must be high enough.
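That gate can be sketched in a few lines (our own minimal re-implementation of the formula above, not the library code):

```python
# A result survives only if its stored confidence times its search
# relevance clears the floor - both factors must be high.
def apply_confidence_floor(results, floor):
    """results: list of (content, confidence, relevance) tuples."""
    return [r for r in results if r[1] * r[2] >= floor]

hits = [
    ("the test phrase was Peaches from Space", 0.9, 0.8),  # 0.9 * 0.8 = 0.72
    ("we did some memory testing", 0.8, 0.3),              # 0.8 * 0.3 = 0.24
]
print(apply_confidence_floor(hits, 0.5))  # only the first result survives
```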
Anti-Adjacency Penalty (Layer 12)
New search scoring layer that penalizes results matching the TOPIC but not the specific ANSWER.
The problem: Searching "what was the test phrase?" might return "we did memory testing" instead of "the test phrase was Peaches from Space."
The fix: When the query asks a specific question (starts with "what is", "who", "which", etc.), results are checked for specific identifiers:
- Proper nouns, quoted strings, version numbers, code identifiers, constants
- If the result is meta-discussion without a concrete answer, it gets a 0.6x penalty
- If the result lacks specific values, it gets a 0.85x penalty
- General queries (not specific questions) are NOT penalized
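The shape of that check can be sketched as follows. This is our own simplified illustration of the idea, not the library's implementation: the regexes, names, and the single 0.85x penalty are placeholders (the real layer also applies a stronger 0.6x penalty for pure meta-discussion):

```python
import re

# Penalize topic-only matches when the query is a specific question.
SPECIFIC_QUESTION = re.compile(r"^(what is|what was|who|which|where|when)\b", re.I)

def has_specific_identifier(text: str) -> bool:
    # Quoted strings, version numbers, or a capitalized word past position 0
    if re.search(r'"[^"]+"', text) or re.search(r"\bv?\d+\.\d+(\.\d+)?\b", text):
        return True
    return any(w[0].isupper() for w in text.split()[1:])

def adjusted_score(query: str, result: str, score: float) -> float:
    if SPECIFIC_QUESTION.match(query) and not has_specific_identifier(result):
        return score * 0.85  # topic match, but no concrete answer
    return score

print(adjusted_score("what was the test phrase?", "we did some memory testing", 1.0))  # 0.85
```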
Enhanced Rarity Boost
Unique, identifying information now scores dramatically higher:
| Content Type | Boost (v1.x) | Boost (v2.0) |
|---|---|---|
| Proper nouns | 1.5x | 3.0x |
| Extremely rare terms (< 1% of docs) | 2.0x | 3.0x |
| Very rare terms (1-5% of docs) | 1.5x | 2.25x |
| Quoted strings | 1.0x | 3.0x |
| Version numbers | 1.0x | 2.5x |
| Code identifiers (backticks) | 1.0x | 2.5x |
| CONSTANTS/ACRONYMS | 1.0x | 2.0x |
This means "Peaches from Space" massively outranks "some testing happened" when both match a query.
12-Layer Search Engine
Every query runs through 12 scoring layers:
1. BM25+ TF-IDF - baseline relevance with delta floor
2. Exact Phrase Bonus - verbatim matches score higher
3. Field Boosting - tags, source, category weighted; enriched fields get 2-3x weight
4. Rarity & Proper Noun Boost - rare terms and names surface
5. Positional Salience - intro/conclusion bias
6. Semantic Expansion - PPMI co-occurrence query widening
7. Intent Reranker - temporal, entity, howto detection
8. Qualifier & Negation - "failed" != "successful"
9. Clustering Boost - coherent result groups score higher
10. Embedding Reranker - optional local embeddings (no API)
11. Pseudo-Relevance Feedback - top results refine the query
12. Anti-Adjacency Penalty - penalizes topic-match without answer-match (v2.0)
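For intuition on layer 1, here is a toy per-term BM25+ scorer. The parameter values are conventional textbook defaults, not Parsica's actual tuning; the point is the `delta` term, which guarantees a non-zero floor for any document containing the term at all:

```python
import math

# Toy BM25+ term scoring: classic BM25 saturation plus a delta floor.
def bm25_plus(tf, doc_len, avg_len, df, n_docs, k1=1.2, b=0.75, delta=1.0):
    idf = math.log((n_docs + 1) / (df + 0.5))
    norm = tf + k1 * (1 - b + b * doc_len / avg_len)
    return idf * (tf * (k1 + 1) / norm + delta)

# Term appearing twice in an average-length doc, present in 3 of 100 docs:
print(round(bm25_plus(tf=2, doc_len=50, avg_len=50, df=3, n_docs=100), 3))
```

Higher term frequency and rarer terms both raise the score, while the saturation in `norm` stops a term repeated fifty times from dominating.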
Memory Types
| Type | Decay Rate | Importance | Cross-Session | Use Case |
|---|---|---|---|---|
| `episodic` | Normal | 1x | Same session only | General events, chat |
| `semantic` | Normal | 1x | Yes - crosses sessions | Facts, decisions |
| `fact` | Normal | High recall | Yes | Verified knowledge |
| `mistake` | 10x slower | 2x | Yes | Never forget failures |
| `preference` | 3x slower | 1x | Yes | User/agent preferences |
| `procedure` | 3x slower | 1x | Yes | How-to knowledge |
Memories are automatically classified at ingest time. No manual tagging needed.
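To see how the decay rates in the table interact with the `half_life` constructor parameter (default 7.0 days), here is a hedged sketch assuming a standard exponential half-life curve; the library's exact decay formula may differ:

```python
import math

# Type-aware decay sketch: slower-decaying types get a longer effective
# half-life (multipliers taken from the table above).
SLOWDOWN = {"episodic": 1.0, "semantic": 1.0, "fact": 1.0,
            "mistake": 10.0, "preference": 3.0, "procedure": 3.0}

def decay_weight(age_days: float, memory_type: str, half_life: float = 7.0) -> float:
    effective_half_life = half_life * SLOWDOWN[memory_type]
    return 0.5 ** (age_days / effective_half_life)

# After one week an episodic memory is at half weight; a mistake barely fades.
print(round(decay_weight(7, "episodic"), 2))  # 0.5
print(round(decay_weight(7, "mistake"), 2))   # 0.93
```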
Cross-Session Recall
```python
results = mem.search(
    "what's the API key format?",
    session_id="session-B",
    cross_session_recall="semantic"  # "all" | "semantic" | "none"
)
```

- `"all"` - search everything regardless of session (default)
- `"semantic"` - other sessions' memories only surface if they're facts/decisions, not episodic chatter
- `"none"` - strict session isolation
Shared / Team Memory
Multiple agents can share a memory pool with role-based access control:
```python
from parsica_memory import SharedPool, AgentRole, AgentConfig

pool = mem.enable_shared_pool(
    pool_dir="./shared",
    pool_name="project-alpha",
    agent_id="worker-1",
    role=AgentRole.WRITER
)

mem.shared_write("Research complete: competitor uses GraphQL", namespace="research")
```
Context Packets
Cold-spawn solution for sub-agents:
```python
packet = mem.build_context_packet(
    task="Deploy the auth service",
    max_tokens=3000,
    include_mistakes=True
)
# Inject packet.render() into the sub-agent's system prompt
```
Graph Intelligence
Automatic entity extraction and knowledge graph:
```python
path = mem.entity_path("payment-service", "database", max_hops=3)
triples = mem.graph_search(subject="PostgreSQL", relation="used_by")
```
Tiered Storage
| Tier | Age | Behavior |
|---|---|---|
| Hot | 0-3 days | Always loaded, fastest access |
| Warm | 3-14 days | Loaded on-demand |
| Cold | 14+ days | Requires include_cold=True |
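The tier boundaries above amount to a simple age bucketing. An illustrative sketch (the function name is ours, not the library's; the real `TierManager` also handles on-demand loading and the `include_cold=True` gate):

```python
# Classify a memory into a storage tier by age, per the table above.
def tier_for(age_days: float) -> str:
    if age_days < 3:
        return "hot"    # always loaded, fastest access
    if age_days < 14:
        return "warm"   # loaded on-demand
    return "cold"       # requires include_cold=True at search time

print([tier_for(d) for d in (0, 5, 30)])  # ['hot', 'warm', 'cold']
```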
Input Gating
P0-P3 priority classification drops noise before it enters the store:
```python
mem.ingest_with_gating("ok thanks", source="chat")                           # dropped (P3 - noise)
mem.ingest_with_gating("Production outage: auth down", source="incident")    # stored (P0 - critical)
```
MCP Server
```shell
parsica-memory-mcp --workspace ./memory --agent-name my-agent
```
Works with Claude Desktop and any MCP-compatible client.
CLI
```shell
# Check status
python -m parsica_memory status --workspace ./memory

# Initialize a workspace
python -m parsica_memory init --workspace ./memory --agent-name my-agent

# Rebuild knowledge graph
python -m parsica_memory rebuild-graph --workspace ./memory

# Start MCP server
python -m parsica_memory serve --workspace ./memory --agent-name my-agent
```
Full API Reference
MemorySystem
```python
from parsica_memory import MemorySystem

mem = MemorySystem(
    workspace="./memory",      # Required - directory path
    agent_name="my-agent",     # Required - scopes the store
    half_life=7.0,             # Decay half-life in days
    enricher=None,             # LLM enrichment callable
    tiered_storage=True,       # Hot/warm/cold tiers
    graph_intelligence=True,   # Entity extraction + graph
    semantic_expansion=True,   # PPMI query expansion
)

# Lifecycle
mem.load()    # Load from disk
mem.save()    # Save to disk
mem.flush()   # WAL to shards
mem.close()   # Flush + release

# Ingestion
mem.ingest(content, source=..., session_id=..., channel_id=...)
mem.ingest_fact(content, source=...)
mem.ingest_mistake(what_happened=..., correction=..., root_cause=..., severity=...)
mem.ingest_preference(content, source=...)
mem.ingest_procedure(content, source=...)
mem.ingest_file(path, category=...)
mem.ingest_url(url, depth=2)

# Search
mem.search(query, limit=10, cross_session_recall="semantic", explain=False, confidence_floor=0.0)
mem.recent(limit=20)

# Enrichment
mem.re_enrich(batch_size=50, overwrite=False, progress_fn=None)

# Graph
mem.graph_search(subject=..., relation=..., obj=...)
mem.entity_path(source, target, max_hops=3)

# Context
mem.build_context_packet(task=..., max_tokens=3000, include_mistakes=True)

# Maintenance
mem.compact()
mem.consolidate()
mem.reindex()
mem.forget(topic=..., before_date=...)
mem.stats()
```
Benchmarks
Measured on Mac Mini M4 (10-core, 32GB), Python 3.14, 9,896 live memories:
| Metric | Parsica-Memory | mem0 |
|---|---|---|
| LOCOMO Accuracy | 78.0% | 66.9% |
| doc2query R@1 | 88.5% | - |
| p50 Search Latency | 21ms | - |
| Dependencies | 0 | 15+ |
| Requires API Key | No (enrichment optional) | Yes |
| Requires Vector DB | No | Yes |
Architecture
```
parsica-memory/
+-- Core: MemorySystem, MemoryEntry, WAL, ShardManager
+-- Search: 12-layer BM25+ pipeline, SearchEngine
+-- Intelligence: EntityExtractor, MemoryGraph, LLM Enricher
+-- Multi-Agent: SharedMemoryPool, AgentRoles
+-- Context: ContextPacketBuilder
+-- Manager: MemoryManager (batteries-included wrapper)
+-- Enrichers: Anthropic, OpenAI, Gemini, Local, Auto-detect
+-- Storage: TierManager, GCS backend
+-- Server: MCP server, CLI
```
Relationship to Antaris Core
parsica-memory is the standalone memory package from Antaris Analytics. It ships as part of Antaris Core (the full agent infrastructure suite) and also as this independent package. Same engine, same features, independent install.
License
Apache 2.0
Links
- PyPI: https://pypi.org/project/parsica-memory/
- GitHub: https://github.com/Antaris-Analytics-LLC/Parsica-Memory
- Antaris Analytics: https://antarisanalytics.ai
Built by Antaris Analytics LLC. Zero dependencies. Ships complete.