The SQLite of AI Memory - an embeddable, zero-dependency AI memory library for any LLM application

MemoryMesh - The SQLite of AI Memory

MemoryMesh is an embeddable AI memory library with zero required dependencies that gives any LLM application persistent, intelligent memory. Install it with pip install memorymesh and add long-term memory to your AI agents in three lines of code.

It works with any LLM -- Claude, GPT, Gemini, Llama, Mistral, and more, including models served locally through Ollama -- and it runs everywhere Python runs (Linux, macOS, Windows). All data stays on your machine by default: no servers, no APIs, no cloud accounts required. Privacy-first by design.

Why MemoryMesh?

Every AI application needs memory, but existing solutions come with heavy trade-offs:

Solution	Approach	Trade-off
Mem0	SaaS / managed service	Requires cloud account, data leaves your machine, ongoing costs
Letta / MemGPT	Full agent framework	Heavy framework lock-in, complex setup, opinionated architecture
Zep	Memory server	Requires PostgreSQL, Docker, server infrastructure
MemoryMesh	Embeddable library	Zero required dependencies. Just SQLite. Works anywhere Python runs.

MemoryMesh takes a fundamentally different approach. It brings the philosophy that made SQLite the default embedded database to AI memory: a simple, reliable, embeddable library that just works. No infrastructure. No lock-in. No surprises.

Quick Start

from memorymesh import MemoryMesh

memory = MemoryMesh()
memory.remember("User prefers Python and dark mode")
results = memory.recall("What does the user prefer?")

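That is it. Three lines after the import give your AI application persistent memory. The base install ranks memories with built-in keyword matching; semantic recall needs one of the embedding extras listed under Installation.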

How MemoryMesh Saves You Money

Without memory, every AI interaction requires re-sending the full conversation history. As the conversation grows, the input for each new turn grows roughly linearly with it, so the cumulative token cost climbs faster still.

MemoryMesh flips this model. Instead of sending thousands of tokens of raw conversation history, you recall only the top-k most relevant memories (typically 3-5 short passages) and inject them as context. The conversation itself stays short.

Token cost comparison: input tokens per turn

Turn	Without Memory (full history)	With MemoryMesh (recall top-5)
1	~250 tokens	~250 tokens
5	~1,500 tokens	~400 tokens
10	~4,000 tokens	~400 tokens
20	~10,000 tokens	~450 tokens
50	~30,000 tokens	~500 tokens

Rough estimates of the input tokens sent on each turn, assuming typical conversational turns of ~250 tokens each and MemoryMesh recalling 5 relevant memories (~50 tokens each) per turn.

How it works

Store -- After each interaction, remember() the key facts (not the full conversation).
Recall -- Before the next interaction, recall() retrieves only the most relevant memories ranked by semantic similarity, recency, and importance.
Inject -- Pass the recalled memories as system context to your LLM. The full conversation history is never needed.
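
The three steps above can be sketched as follows. To keep the example runnable without installing anything, it uses a hypothetical ToyMemory class with naive keyword overlap as a stand-in. It is not MemoryMesh's implementation, and build_system_prompt is likewise an illustrative helper rather than part of the library.

```python
import re


class ToyMemory:
    """Hypothetical stand-in for a memory store; NOT MemoryMesh's implementation."""

    def __init__(self):
        self._facts = []

    def remember(self, text):
        self._facts.append(text)

    def recall(self, query, k=3):
        # Rank stored facts by how many words they share with the query.
        query_words = set(re.findall(r"\w+", query.lower()))
        scored = []
        for fact in self._facts:
            overlap = len(query_words & set(re.findall(r"\w+", fact.lower())))
            if overlap:
                scored.append((overlap, fact))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:k]]


def build_system_prompt(memories):
    # Inject: turn the recalled facts into a short system-context string.
    if not memories:
        return "No stored context."
    return "Known facts about the user:\n" + "\n".join(f"- {m}" for m in memories)


memory = ToyMemory()
memory.remember("User prefers Python and dark mode")       # Store
memory.remember("User is building a healthcare startup")
memory.remember("Project deadline is in March")

recalled = memory.recall("Does the user prefer Python?")    # Recall
system_prompt = build_system_prompt(recalled)               # Inject
print(system_prompt)
```

Only the recalled facts reach the prompt; the unrelated deadline note is left out because it shares no words with the query.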

The result: your input token count stays roughly constant regardless of how long the conversation has been going. At $3 per million input tokens (Claude Sonnet pricing), the input for turn 50 costs ~$0.09 with full history (~30,000 tokens) vs. ~$0.0015 with MemoryMesh (~500 tokens) -- a 60x reduction for that turn.

This is not just a cost saving. It also means your application stays within context window limits, responds faster (fewer tokens to process), and retrieves only what is actually relevant instead of forcing the LLM to sift through thousands of tokens of conversational noise.
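
The arithmetic behind the cost figures above can be checked in a few lines. This is a minimal sketch: the token counts are the turn-50 estimates from the comparison table, the price is the $3 per million input tokens quoted above, and input_cost_usd is a hypothetical helper, not part of MemoryMesh.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, the figure quoted in the text


def input_cost_usd(tokens, price_per_million=PRICE_PER_MILLION_INPUT_TOKENS):
    """Cost of sending `tokens` input tokens at the given price."""
    return tokens * price_per_million / 1_000_000


full_history_tokens = 30_000  # turn 50, full history (table estimate)
with_recall_tokens = 500      # turn 50, top-5 recall (table estimate)

cost_full = input_cost_usd(full_history_tokens)
cost_recall = input_cost_usd(with_recall_tokens)
ratio = cost_full / cost_recall

print(f"full history: ${cost_full:.4f}")
print(f"with recall:  ${cost_recall:.4f}")
print(f"reduction:    {ratio:.0f}x")
```

Plugging in your own provider's price and measured token counts gives a more reliable estimate than the table's rough figures.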

Installation

# Base installation (no external dependencies, uses built-in keyword matching)
pip install memorymesh

# With local embeddings (sentence-transformers, runs entirely on your machine)
pip install "memorymesh[local]"

# With Ollama embeddings (connect to a local Ollama instance)
pip install "memorymesh[ollama]"

# With OpenAI embeddings
pip install "memorymesh[openai]"

# Everything
pip install "memorymesh[all]"

Features

Simple API -- remember(), recall(), forget(). That is the core interface. No boilerplate, no configuration ceremony.
SQLite-Based -- All memory is stored in SQLite files. No database servers, no infrastructure. Automatic schema migrations keep existing databases up to date.
Framework-Agnostic -- Works with any LLM, any framework, any architecture. Use it with LangChain, LlamaIndex, raw API calls, or your own custom setup.
Pluggable Embeddings -- Choose the embedding provider that fits your needs: local models, Ollama, OpenAI, or plain keyword matching with zero dependencies.
Time-Based Decay -- Memories naturally fade over time, just like human memory. Recent and frequently accessed memories are ranked higher.
Auto-Importance Scoring -- Automatically detect and prioritize key information. MemoryMesh analyzes text for keywords, structure, and specificity to assign importance scores without manual tuning.
Episodic Memory -- Group memories by conversation session. Recall with session context for better continuity across multi-turn interactions.
Memory Compaction -- Detect and merge similar or redundant memories to keep your store lean. Reduces noise and improves recall accuracy over time.
Encrypted Storage -- Optionally encrypt memory text and metadata at rest. All data stays protected on disk using application-level encryption with zero external dependencies.
Privacy-First -- All data stays on your machine by default. No telemetry, no cloud calls, no data collection. You own your data.
Cross-Platform -- Runs on Linux, macOS, and Windows. Anywhere Python runs, MemoryMesh runs.
MCP Support -- Expose memory as an MCP (Model Context Protocol) server for seamless integration with AI assistants.
Multi-Tool Sync -- Sync memories to Claude Code, OpenAI Codex CLI, and Google Gemini CLI simultaneously. Your knowledge follows you across tools.
Memory Categories -- Automatic categorization with scope routing. Preferences and guardrails go to global scope; decisions and patterns stay in the project. MemoryMesh decides where memories belong.
Session Start -- Structured context retrieval at the beginning of every AI session. Returns user profile, guardrails, common mistakes, and project context in one call.
Auto-Compaction -- Transparent deduplication that runs automatically during normal use. Like SQLite's auto-vacuum, you never need to think about it.
CLI -- Inspect, search, export, compact, and manage memories from the terminal. No Python code required.
Pin Support -- Pin critical memories so they never decay and always rank at the top. Use for guardrails and non-negotiable rules.
Privacy Guard -- Automatically detect secrets (API keys, tokens, passwords) before storing. Optionally redact them with redact=True.
Contradiction Detection -- Catch conflicting facts when storing new memories. Choose to keep both, update, or skip.
Retrieval Filters -- Filter recall by category, minimum importance, time range, or metadata key-value pairs.
Web Dashboard -- Browse and search all your memories in a local web UI (memorymesh ui).
Evaluation Suite -- Built-in tests for recall quality and adversarial robustness.
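
To make the time-based decay and pinning features above concrete, here is a minimal sketch of one way such a ranking score could be computed. The README names the signals but not the formula, so the exponential half-life, the weights, and the function names below are all assumptions for illustration. Access frequency is left out for brevity.

```python
def recency_score(age_seconds, half_life_days=30.0):
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    half_life_seconds = half_life_days * 86_400
    return 0.5 ** (age_seconds / half_life_seconds)


def relevance(similarity, importance, age_seconds, pinned=False,
              w_sim=0.6, w_recency=0.2, w_importance=0.2):
    """Weighted blend of similarity, recency, and importance (hypothetical weights)."""
    if pinned:
        # A pinned memory is treated as maximally important and never decays.
        importance, recency = 1.0, 1.0
    else:
        recency = recency_score(age_seconds)
    return w_sim * similarity + w_recency * recency + w_importance * importance


ninety_days = 90 * 86_400
unpinned = relevance(similarity=0.5, importance=0.4, age_seconds=ninety_days)
pinned = relevance(similarity=0.5, importance=0.4, age_seconds=ninety_days, pinned=True)
print(round(unpinned, 3), round(pinned, 3))
```

Note that a weighted sum like this only boosts pinned memories; guaranteeing that they always rank first would need an extra rule, such as sorting pinned items ahead of everything else.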

What's New in v3

Pin support -- remember("critical rule", pin=True) sets importance to 1.0 with zero decay.
Privacy guard -- Detects API keys, GitHub tokens, JWTs, AWS keys, passwords, and more. Use redact=True to auto-redact before storing.
Contradiction detection -- on_conflict="update" replaces contradicting memories; "skip" discards the new one; "keep_both" flags it.
Retrieval filters -- recall(query, category="decision", min_importance=0.7, time_range=(...), metadata_filter={...}).
Web dashboard -- memorymesh ui launches a local browser-based memory viewer.
Evaluation suite -- 32 tests covering recall quality, adversarial inputs, scope isolation, and importance ranking.
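
As an illustration of the kind of check a privacy guard performs, the sketch below detects and redacts a few secret formats with regular expressions. The patterns, the placeholder format, and the function names are assumptions chosen for the example; the README does not describe the detection rules MemoryMesh actually uses.

```python
import re

# Hypothetical patterns for three common secret formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def find_secrets(text):
    """Return the names of all patterns that match somewhere in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def redact_secrets(text):
    """Replace every match with a labeled placeholder."""
    for name, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "Deploy key is AKIAIOSFODNN7EXAMPLE, rotate it monthly"
print(find_secrets(sample))
print(redact_secrets(sample))
```

Pattern-based detection misses secrets that have no recognizable prefix, so it is best treated as a safety net rather than a guarantee.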

Works with Any LLM

MemoryMesh is not tied to any specific LLM provider. It works as a memory layer alongside whatever model you use:

from anthropic import Anthropic
from openai import OpenAI

from memorymesh import MemoryMesh

memory = MemoryMesh()
claude_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
openai_client = OpenAI()     # reads OPENAI_API_KEY from the environment

# Store memories from any source
memory.remember("User is a senior Python developer")
memory.remember("User is building a healthcare startup")
memory.remember("User prefers concise explanations")

# Recall relevant context before calling ANY LLM
context = memory.recall("What do I know about this user?")

# Use with Claude
response = claude_client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required by the Anthropic Messages API
    system=f"User context: {context}",
    messages=[{"role": "user", "content": "Help me design an API"}],
)

# Or GPT
response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"User context: {context}"},
        {"role": "user", "content": "Help me design an API"},
    ],
)

# Or Ollama, Gemini, Mistral, Llama, or literally anything else
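
One way to keep the memory layer independent of the provider is to pass the recall function and the model call in as plain callables. The sketch below is a hypothetical helper, not part of MemoryMesh. It assumes that recall returns an iterable whose items render to the memory text with str(), which the README does not confirm, and it uses stand-in functions so it runs offline.

```python
def answer_with_memory(recall, call_llm, user_message, k=5):
    """Recall up to k memories, inject them as system context, then call the model."""
    memories = list(recall(user_message))[:k]
    system = "User context:\n" + "\n".join(f"- {m}" for m in memories)
    return call_llm(system, user_message)


# Stand-ins so the sketch runs offline; swap in memory.recall and a real client call.
def fake_recall(query):
    return ["User is a senior Python developer", "User prefers concise explanations"]


def fake_llm(system, user):
    return f"[system has {system.count('- ')} memories] reply to: {user}"


reply = answer_with_memory(fake_recall, fake_llm, "Help me design an API")
print(reply)
```

Switching providers then means writing a new call_llm wrapper; the recall and injection code stays the same.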

Documentation

Guide	Description
Configuration	Embedding providers, Ollama setup, all constructor options
MCP Server	Setup for Claude Code, Cursor, Windsurf + teaching your AI to use memory
Multi-Tool Sync	Sync memories across Claude, Codex, and Gemini CLI
CLI Reference	Terminal commands for inspecting and managing memories
API Reference	Full Python API with all methods and parameters
Architecture	System design, dual-store pattern, and schema migrations
FAQ	Common questions answered
Benchmarks	Performance numbers and how to run benchmarks

Roadmap

v0.1 -- MVP

Core remember() / recall() / forget() API
SQLite-based persistent storage
Pluggable embedding providers (none, local, ollama, openai)
Time-based memory decay
Relevance scoring (semantic + recency + importance + frequency)
MCP server for AI assistant integration (Claude Code, Cursor, Windsurf)
Security hardening (input limits, path validation, error sanitization)
Multi-tool memory sync (Claude, Codex, Gemini) with format adapters
CLI viewer and management tool (memorymesh list, search, stats, sync, etc.)
Automatic schema migrations (safe upgrades for existing databases)
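
Automatic schema migrations on SQLite are commonly built on the database's user_version pragma. The sketch below shows that general technique with Python's standard sqlite3 module. The table layout and migration statements are hypothetical and are not MemoryMesh's actual schema.

```python
import sqlite3

# Hypothetical migrations; entry N upgrades the schema to version N.
MIGRATIONS = [
    "CREATE TABLE memories ("
    "id INTEGER PRIMARY KEY, text TEXT NOT NULL, created_at REAL NOT NULL)",
    "ALTER TABLE memories ADD COLUMN importance REAL NOT NULL DEFAULT 0.5",
]


def migrate(conn):
    """Apply every migration newer than the version stored in the database file."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version > current:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(statement)
                conn.execute(f"PRAGMA user_version = {version}")
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)    # applies both migrations
second_run = migrate(conn)   # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(memories)")]
print(first_run, second_run, columns)
```

Because the version number lives in the database file itself, an older file opened by a newer release picks up only the steps it is missing.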

v1.0 -- Production Ready

Episodic memory with session tracking (session_id on remember/recall)
Auto-importance scoring (heuristic-based: keywords, structure, specificity)
Encrypted storage at rest (application-level, zero external dependencies)
Memory compaction (detect and merge similar/redundant memories)
Comprehensive benchmarks (make bench -- throughput, latency, concurrency, disk usage)
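
Memory compaction can be pictured as near-duplicate detection followed by a merge. The sketch below uses Jaccard similarity over word sets and keeps the longer of two near-duplicates. The similarity measure, the threshold, and the merge rule are assumptions for illustration, not the method MemoryMesh uses.

```python
import re


def tokens(text):
    """Lowercased set of words in the text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Share of words two texts have in common (1.0 means identical word sets)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def compact(memories, threshold=0.8):
    """Merge near-duplicates, keeping the longer text of each similar pair."""
    kept = []
    for text in memories:
        for i, existing in enumerate(kept):
            if jaccard(text, existing) >= threshold:
                if len(text) > len(existing):
                    kept[i] = text
                break
        else:
            # No similar memory found, so this one is new.
            kept.append(text)
    return kept


store = [
    "User prefers dark mode",
    "user prefers dark mode.",
    "Deadline is in March",
]
print(compact(store))
```

With embeddings available, cosine similarity between vectors would be a natural replacement for the word-set comparison.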

v2.0 -- Personality & Learning Engine

Memory categories with automatic scope routing (category="preference" -> global)
Auto-categorization from text heuristics (auto_categorize=True)
session_start() method for structured context at the beginning of every AI session
Category-aware sync produces structured MEMORY.md (User Profile, Guardrails, Decisions, etc.)
9 built-in categories: preference, guardrail, mistake, personality, question, decision, pattern, context, session_summary
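
Scope routing by category can be sketched as a keyword heuristic followed by a lookup table. The global and project assignments for preference, guardrail, decision, and pattern follow the Features section. The keyword rules, the fallback to the context category, and the default project scope for other categories are assumptions made for this example.

```python
# Scopes stated in the Features section; other categories default to "project" here.
CATEGORY_SCOPE = {
    "preference": "global",
    "guardrail": "global",
    "decision": "project",
    "pattern": "project",
}

# Hypothetical keyword heuristics, checked in order.
KEYWORD_RULES = [
    ("guardrail", ("never", "must not", "do not")),
    ("preference", ("prefer", "likes", "favorite")),
    ("decision", ("decided", "we chose", "will use")),
]


def categorize(text):
    """Return the first category whose keywords appear in the text."""
    lowered = text.lower()
    for category, keywords in KEYWORD_RULES:
        if any(keyword in lowered for keyword in keywords):
            return category
    return "context"  # assumed fallback category


def route(text):
    """Return (category, scope) for a new memory."""
    category = categorize(text)
    return category, CATEGORY_SCOPE.get(category, "project")


for note in ("User prefers dark mode",
             "We decided to use PostgreSQL",
             "Never commit secrets"):
    print(note, "->", route(note))
```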

v3.0 -- Intelligent Memory (Current)

Pin support for critical memories (zero decay, always top-ranked)
Privacy guard with secret detection and optional redaction
Contradiction detection with configurable conflict resolution
Advanced retrieval filters (category, importance, time range, metadata)
Web dashboard for browsing and searching memories (memorymesh ui)
Evaluation suite (recall quality + adversarial robustness tests)
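
The retrieval filters listed above narrow the candidate set before ranking. The sketch below applies the same four filter types to plain dictionaries. The record layout and the filter_memories function are hypothetical; only the parameter names mirror the recall() call shown under What's New in v3.

```python
def filter_memories(memories, category=None, min_importance=None,
                    time_range=None, metadata_filter=None):
    """Keep only the records that pass every filter that was supplied."""
    results = []
    for m in memories:
        if category is not None and m["category"] != category:
            continue
        if min_importance is not None and m["importance"] < min_importance:
            continue
        if time_range is not None:
            start, end = time_range
            if not (start <= m["created_at"] <= end):
                continue
        if metadata_filter and any(
            m["metadata"].get(key) != value for key, value in metadata_filter.items()
        ):
            continue
        results.append(m)
    return results


store = [
    {"text": "Chose SQLite for storage", "category": "decision", "importance": 0.9,
     "created_at": 100.0, "metadata": {"team": "backend"}},
    {"text": "Chose tabs over spaces", "category": "decision", "importance": 0.3,
     "created_at": 200.0, "metadata": {"team": "frontend"}},
    {"text": "User prefers dark mode", "category": "preference", "importance": 0.8,
     "created_at": 300.0, "metadata": {}},
]

hits = filter_memories(store, category="decision", min_importance=0.7,
                       time_range=(0.0, 150.0), metadata_filter={"team": "backend"})
print([m["text"] for m in hits])
```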

v4.0 -- Advanced

Graph-based memory relationships
Multi-device sync
Plugin system for custom relevance strategies
Streaming recall for large memory sets

Contributing

We welcome contributions from everyone. See CONTRIBUTING.md for guidelines on how to get started.

License

MIT License. See LICENSE for the full text.

Built for Humanity

MemoryMesh is part of the SparkVibe open-source AI initiative. We believe that foundational AI tools should be free, open, and accessible to everyone -- not locked behind paywalls, cloud subscriptions, or proprietary platforms.

Our mission is to reduce the cost and complexity of building AI applications, so that developers everywhere -- whether at a startup, a research lab, a nonprofit, or learning on their own -- can build intelligent systems without barriers.

If AI is going to shape the future, the tools that power it should belong to all of us.

Release history

Version	Released
4.3.0	Mar 3, 2026
4.1.1	Mar 2, 2026
4.1.0	Feb 28, 2026
4.0.1	Feb 27, 2026
4.0.0	Feb 27, 2026
3.1.0	Feb 20, 2026
3.0.0 (this version)	Feb 18, 2026
2.0.0	Feb 17, 2026

Download files

File	Type	Size	Uploaded	Tags
memorymesh-3.0.0.tar.gz	Source distribution	209.9 kB	Feb 18, 2026	Source
memorymesh-3.0.0-py3-none-any.whl	Built distribution	111.2 kB	Feb 18, 2026	Python 3

Both files were uploaded via twine/6.2.0 CPython/3.13.2, without Trusted Publishing.

File hashes

File	Algorithm	Hash digest
memorymesh-3.0.0.tar.gz	SHA256	`6e4c4b7413146beaecf94710805517834027a81df5fd0a3f1a287813d6eaf8b0`
memorymesh-3.0.0.tar.gz	MD5	`91409d2e6c6e3481a137ff19cb7633fc`
memorymesh-3.0.0.tar.gz	BLAKE2b-256	`2552392a3245a744d602a9643781edd0c0dd0596fa33107303468c8db262b1d4`
memorymesh-3.0.0-py3-none-any.whl	SHA256	`2fdaf2535cd09d50ba3144dd255d9fbc718c62dcbaeecb32697e7b236d52ab34`
memorymesh-3.0.0-py3-none-any.whl	MD5	`ddbe8a68bb9347dd669872f2dceca1fa`
memorymesh-3.0.0-py3-none-any.whl	BLAKE2b-256	`b000cc266d82df26b52a3d50233120910a2b6878d39809a52cc19c4d14532d78`
