The SQLite of AI Memory - an embeddable, zero-dependency AI memory library for any LLM application
MemoryMesh - The SQLite of AI Memory
MemoryMesh is an embeddable AI memory library with zero required dependencies that gives any LLM application persistent, intelligent memory. Install it with pip install memorymesh and add long-term memory to your AI agents in three lines of code. It works with ANY LLM -- Claude, GPT, Gemini, Llama, Ollama, Mistral, and more. Runs everywhere Python runs (Linux, macOS, Windows). All data stays on your machine by default. No servers, no APIs, no cloud accounts required. Privacy-first by design.
Why MemoryMesh?
Every AI application needs memory, but existing solutions come with heavy trade-offs:
| Solution | Approach | Trade-off |
|---|---|---|
| Mem0 | SaaS / managed service | Requires cloud account, data leaves your machine, ongoing costs |
| Letta / MemGPT | Full agent framework | Heavy framework lock-in, complex setup, opinionated architecture |
| Zep | Memory server | Requires PostgreSQL, Docker, server infrastructure |
| MemoryMesh | Embeddable library | Zero dependencies. Just SQLite. Works anywhere. |
MemoryMesh takes a fundamentally different approach. Just as SQLite revolutionized embedded databases, MemoryMesh brings the same philosophy to AI memory: a simple, reliable, embeddable library that just works. No infrastructure. No lock-in. No surprises.
Quick Start
from memorymesh import MemoryMesh
memory = MemoryMesh()
memory.remember("User prefers Python and dark mode")
results = memory.recall("What does the user prefer?")
That is it. One import plus three lines give your AI application persistent, semantic memory.
How MemoryMesh Saves You Money
Without memory, every AI interaction requires re-sending the full conversation history. As conversations grow, so do your token costs -- linearly, every single turn.
MemoryMesh flips this model. Instead of sending thousands of tokens of raw conversation history, you recall only the top-k most relevant memories (typically 3-5 short passages) and inject them as context. The conversation itself stays short.
Token cost comparison: input tokens per turn as a conversation grows
| Turn | Without Memory (full history) | With MemoryMesh (recall top-5) |
|---|---|---|
| 1 | ~250 tokens | ~250 tokens |
| 5 | ~1,500 tokens | ~400 tokens |
| 10 | ~4,000 tokens | ~400 tokens |
| 20 | ~10,000 tokens | ~450 tokens |
| 50 | ~30,000 tokens | ~500 tokens |
Estimates based on typical conversational turns of ~250 tokens each, with MemoryMesh recalling 5 relevant memories (~50 tokens each) per turn.
How it works
- Store -- After each interaction, `remember()` the key facts (not the full conversation).
- Recall -- Before the next interaction, `recall()` retrieves only the most relevant memories ranked by semantic similarity, recency, and importance.
- Inject -- Pass the recalled memories as system context to your LLM. The full conversation history is never needed.
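The three steps above can be sketched as a small loop. This is a minimal illustration, not MemoryMesh's implementation: `ToyMemory` is a hypothetical in-memory stand-in that only mirrors the `remember()`/`recall()` names, its `k` parameter and list return type are assumptions, and it ranks by naive word overlap rather than semantic similarity, recency, and importance.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical stand-in exposing remember()/recall(); not MemoryMesh itself."""

    facts: list[str] = field(default_factory=list)

    def remember(self, text: str) -> None:
        # Store step: keep only the key fact, not the whole conversation.
        self.facts.append(text)

    def recall(self, query: str, k: int = 5) -> list[str]:
        # Recall step: rank stored facts by naive word overlap with the query.
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(f.lower().split())), f) for f in self.facts]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [fact for score, fact in ranked[:k] if score > 0]


def build_system_prompt(memory: ToyMemory, user_message: str, k: int = 5) -> str:
    """Inject step: turn the top-k recalled facts into a short system prompt."""
    recalled = memory.recall(user_message, k=k)
    if not recalled:
        return "No stored context."
    return "User context:\n" + "\n".join(f"- {m}" for m in recalled)


memory = ToyMemory()
memory.remember("User prefers Python and dark mode")
memory.remember("User is building a healthcare startup")
prompt = build_system_prompt(memory, "Which language does the user prefer?")
```

The resulting `prompt` string is what would be passed as system context; its size depends on `k` and the length of the stored facts, not on how many turns have elapsed.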
The result: Your input token count stays roughly constant regardless of how long the conversation has been going. At $3/million input tokens (Claude Sonnet pricing), the 50th turn of a conversation costs ~$0.09 in input tokens without memory vs. ~$0.0015 with MemoryMesh -- a 60x reduction for that request.
This is not just a cost saving. It also means your application stays within context window limits, responds faster (fewer tokens to process), and retrieves only what is actually relevant instead of forcing the LLM to sift through thousands of tokens of conversational noise.
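The cost figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this section (the $3 per million input tokens rate and the turn-50 estimates of ~30,000 and ~500 tokens); the function and variable names are illustrative.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, the rate quoted in this section


def input_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Cost in USD of sending `tokens` input tokens in a single request."""
    return tokens * price_per_million / 1_000_000


# Turn-50 input sizes taken from the comparison table.
full_history_cost = input_cost(30_000)   # about $0.09
with_recall_cost = input_cost(500)       # about $0.0015
reduction = full_history_cost / with_recall_cost  # about 60x
```

These are per-request figures for turn 50. Summed over a whole conversation the gap is larger, because the full-history input grows every turn while the recalled context stays roughly flat.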
Installation
# Base installation (no external dependencies, uses built-in keyword matching)
pip install memorymesh
# With local embeddings (sentence-transformers, runs entirely on your machine)
pip install "memorymesh[local]"
# With Ollama embeddings (connect to a local Ollama instance)
pip install "memorymesh[ollama]"
# With OpenAI embeddings
pip install "memorymesh[openai]"
# Everything
pip install "memorymesh[all]"
Features
- Simple API -- `remember()`, `recall()`, `forget()`. That is the core interface. No boilerplate, no configuration ceremony.
- SQLite-Based -- All memory is stored in SQLite files. No database servers, no infrastructure. Automatic schema migrations keep existing databases up to date.
- Framework-Agnostic -- Works with any LLM, any framework, any architecture. Use it with LangChain, LlamaIndex, raw API calls, or your own custom setup.
- Pluggable Embeddings -- Choose the embedding provider that fits your needs: local models, Ollama, OpenAI, or plain keyword matching with zero dependencies.
- Time-Based Decay -- Memories naturally fade over time, just like human memory. Recent and frequently accessed memories are ranked higher.
- Auto-Importance Scoring -- Automatically detect and prioritize key information. MemoryMesh analyzes text for keywords, structure, and specificity to assign importance scores without manual tuning.
- Episodic Memory -- Group memories by conversation session. Recall with session context for better continuity across multi-turn interactions.
- Memory Compaction -- Detect and merge similar or redundant memories to keep your store lean. Reduces noise and improves recall accuracy over time.
- Encrypted Storage -- Optionally encrypt memory text and metadata at rest. All data stays protected on disk using application-level encryption with zero external dependencies.
- Privacy-First -- All data stays on your machine by default. No telemetry, no cloud calls, no data collection. You own your data.
- Cross-Platform -- Runs on Linux, macOS, and Windows. Anywhere Python runs, MemoryMesh runs.
- MCP Support -- Expose memory as an MCP (Model Context Protocol) server for seamless integration with AI assistants.
- Multi-Tool Sync -- Sync memories to Claude Code, OpenAI Codex CLI, and Google Gemini CLI simultaneously. Your knowledge follows you across tools.
- Memory Categories -- Automatic categorization with scope routing. Preferences and guardrails go to global scope; decisions and patterns stay in the project. MemoryMesh decides where memories belong.
- Session Start -- Structured context retrieval at the beginning of every AI session. Returns user profile, guardrails, common mistakes, and project context in one call.
- Auto-Compaction -- Transparent deduplication that runs automatically during normal use. Like SQLite's auto-vacuum, you never need to think about it.
- CLI -- Inspect, search, export, compact, and manage memories from the terminal. No Python code required.
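To make the time-based decay and relevance ranking in the list above more concrete, here is one way such a score could be computed. It is a sketch under stated assumptions: the signals (similarity, recency, importance, access frequency) are the ones this README names, but the exponential half-life, the weights, and every identifier below are hypothetical and not taken from MemoryMesh's source.

```python
from dataclasses import dataclass


@dataclass
class ScoredMemory:
    text: str
    similarity: float   # 0..1, from an embedding or keyword match
    importance: float   # 0..1
    age_days: float     # days since the memory was stored or last accessed
    access_count: int   # how many times it has been recalled


def relevance(m: ScoredMemory, half_life_days: float = 30.0) -> float:
    """Weighted blend of four signals; weights and half-life are illustrative."""
    recency = 0.5 ** (m.age_days / half_life_days)    # halves every half_life_days
    frequency = 1.0 - 1.0 / (1.0 + m.access_count)    # 0 when unused, saturates toward 1
    return 0.5 * m.similarity + 0.2 * recency + 0.2 * m.importance + 0.1 * frequency


fresh = ScoredMemory("User prefers dark mode", 0.8, 0.5, age_days=0, access_count=0)
stale = ScoredMemory("User prefers dark mode", 0.8, 0.5, age_days=30, access_count=0)
ranked = sorted([stale, fresh], key=relevance, reverse=True)
```

With these weights an otherwise identical memory loses rank as it ages, and repeated access can partly offset that, which matches the behavior described for decay.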
Works with Any LLM
MemoryMesh is not tied to any specific LLM provider. It works as a memory layer alongside whatever model you use:
from memorymesh import MemoryMesh
memory = MemoryMesh()
# Store memories from any source
memory.remember("User is a senior Python developer")
memory.remember("User is building a healthcare startup")
memory.remember("User prefers concise explanations")
# Recall relevant context before calling ANY LLM
context = memory.recall("What do I know about this user?")
# Use with Claude
response = claude_client.messages.create(
model="claude-sonnet-4-20250514",
system=f"User context: {context}",
messages=[{"role": "user", "content": "Help me design an API"}],
)
# Or GPT
response = openai_client.chat.completions.create(
model="gpt-4",
messages=[
{"role": "system", "content": f"User context: {context}"},
{"role": "user", "content": "Help me design an API"},
],
)
# Or Ollama, Gemini, Mistral, Llama, or literally anything else
Documentation
| Guide | Description |
|---|---|
| Configuration | Embedding providers, Ollama setup, all constructor options |
| MCP Server | Setup for Claude Code, Cursor, Windsurf + teaching your AI to use memory |
| Multi-Tool Sync | Sync memories across Claude, Codex, and Gemini CLI |
| CLI Reference | Terminal commands for inspecting and managing memories |
| API Reference | Full Python API with all methods and parameters |
| Architecture | System design, dual-store pattern, and schema migrations |
| FAQ | Common questions answered |
| Benchmarks | Performance numbers and how to run benchmarks |
Roadmap
v0.1 -- MVP
- Core `remember()` / `recall()` / `forget()` API
- SQLite-based persistent storage
- Pluggable embedding providers (none, local, ollama, openai)
- Time-based memory decay
- Relevance scoring (semantic + recency + importance + frequency)
- MCP server for AI assistant integration (Claude Code, Cursor, Windsurf)
- Security hardening (input limits, path validation, error sanitization)
- Multi-tool memory sync (Claude, Codex, Gemini) with format adapters
- CLI viewer and management tool (`memorymesh list`, `search`, `stats`, `sync`, etc.)
- Automatic schema migrations (safe upgrades for existing databases)
v1.0 -- Production Ready
- Episodic memory with session tracking (`session_id` on remember/recall)
- Auto-importance scoring (heuristic-based: keywords, structure, specificity)
- Encrypted storage at rest (application-level, zero external dependencies)
- Memory compaction (detect and merge similar/redundant memories)
- Comprehensive benchmarks (`make bench` -- throughput, latency, concurrency, disk usage)
v2.0 -- Personality & Learning Engine (Current)
- Memory categories with automatic scope routing (`category="preference"` -> global)
- Auto-categorization from text heuristics (`auto_categorize=True`)
- `session_start()` method for structured context at the beginning of every AI session
- Category-aware sync produces structured MEMORY.md (User Profile, Guardrails, Decisions, etc.)
- 9 built-in categories: preference, guardrail, mistake, personality, question, decision, pattern, context, session_summary
v3.0 -- Advanced
- Graph-based memory relationships
- Multi-device sync
- Plugin system for custom relevance strategies
- Streaming recall for large memory sets
Contributing
We welcome contributions from everyone. See CONTRIBUTING.md for guidelines on how to get started.
License
MIT License. See LICENSE for the full text.
Built for Humanity
MemoryMesh is part of the SparkVibe open-source AI initiative. We believe that foundational AI tools should be free, open, and accessible to everyone -- not locked behind paywalls, cloud subscriptions, or proprietary platforms.
Our mission is to reduce the cost and complexity of building AI applications, so that developers everywhere -- whether at a startup, a research lab, a nonprofit, or learning on their own -- can build intelligent systems without barriers.
If AI is going to shape the future, the tools that power it should belong to all of us.