The SQLite of AI Memory - an embeddable, zero-dependency AI memory library for any LLM application

MemoryMesh - The SQLite of AI Memory

MemoryMesh is an embeddable AI memory library with zero required dependencies that gives any LLM application persistent, intelligent memory. Install it with pip install memorymesh and add long-term memory to your AI agents in three lines of code. It works with ANY LLM -- Claude, GPT, Gemini, Llama, Ollama, Mistral, and more. Runs everywhere Python runs (Linux, macOS, Windows). All data stays on your machine by default. No servers, no APIs, no cloud accounts required. Privacy-first by design.


Why MemoryMesh?

Every AI application needs memory, but existing solutions come with heavy trade-offs:

| Solution | Approach | Trade-off |
| --- | --- | --- |
| Mem0 | SaaS / managed service | Requires cloud account, data leaves your machine, ongoing costs |
| Letta / MemGPT | Full agent framework | Heavy framework lock-in, complex setup, opinionated architecture |
| Zep | Memory server | Requires PostgreSQL, Docker, server infrastructure |
| MemoryMesh | Embeddable library | Zero dependencies. Just SQLite. Works anywhere. |

MemoryMesh takes a fundamentally different approach. Just as SQLite revolutionized embedded databases, MemoryMesh brings the same philosophy to AI memory: a simple, reliable, embeddable library that just works. No infrastructure. No lock-in. No surprises.


Quick Start

from memorymesh import MemoryMesh

memory = MemoryMesh()
memory.remember("User prefers Python and dark mode")
results = memory.recall("What does the user prefer?")

That is it. Three lines to give your AI application persistent, semantic memory.


How MemoryMesh Saves You Money

Without memory, every AI interaction requires re-sending the full conversation history. As the conversation grows, the input tokens billed on each turn grow roughly linearly with it, so the cumulative cost of the conversation grows faster still.

MemoryMesh flips this model. Instead of sending thousands of tokens of raw conversation history, you recall only the top-k most relevant memories (typically 3-5 short passages) and inject them as context. The conversation itself stays short.

Token cost comparison: input tokens per turn over a 50-turn conversation

| Turn | Without Memory (full history) | With MemoryMesh (recall top-5) |
| --- | --- | --- |
| 1 | ~250 tokens | ~250 tokens |
| 5 | ~1,500 tokens | ~400 tokens |
| 10 | ~4,000 tokens | ~400 tokens |
| 20 | ~10,000 tokens | ~450 tokens |
| 50 | ~30,000 tokens | ~500 tokens |

Estimates based on typical conversational turns of ~250 tokens each, with MemoryMesh recalling 5 relevant memories (~50 tokens each) per turn.

How it works

  1. Store -- After each interaction, remember() the key facts (not the full conversation).
  2. Recall -- Before the next interaction, recall() retrieves only the most relevant memories ranked by semantic similarity, recency, and importance.
  3. Inject -- Pass the recalled memories as system context to your LLM. The full conversation history is never needed.
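The three steps above can be sketched as a small helper. Only remember() and recall() come from this README; `build_system_prompt`, `MemoryLike`, and `StubMemory` are hypothetical names, and the sketch assumes recall() returns an iterable of items that can be formatted as text.

```python
from typing import Iterable, Protocol


class MemoryLike(Protocol):
    """Anything exposing the remember()/recall() pair described above."""

    def remember(self, text: str) -> object: ...
    def recall(self, query: str) -> Iterable[object]: ...


def build_system_prompt(memory: MemoryLike, user_message: str) -> str:
    """Steps 2 and 3: recall relevant memories, format them as system context."""
    lines = [f"- {item}" for item in memory.recall(user_message)]
    if not lines:
        return "No stored context for this user."
    return "Known user context:\n" + "\n".join(lines)


class StubMemory:
    """Hypothetical in-memory stand-in so the sketch runs without MemoryMesh."""

    def __init__(self) -> None:
        self._facts: list[str] = []

    def remember(self, text: str) -> None:
        # Step 1: store a key fact, not the whole conversation.
        self._facts.append(text)

    def recall(self, query: str) -> list[str]:
        # Naive keyword overlap, for illustration only. The library itself
        # ranks by semantic similarity, recency, and importance.
        words = set(query.lower().split())
        return [f for f in self._facts if words & set(f.lower().split())]


memory = StubMemory()
memory.remember("User prefers Python")
memory.remember("Deploys on Fridays")
prompt = build_system_prompt(memory, "Which language does the user prefer?")
```

Swapping `StubMemory()` for a real `MemoryMesh()` instance should work as long as the assumption about recall()'s return value holds; the returned string is then passed as the system prompt of whichever LLM you call.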

The result: Your input token count stays roughly constant regardless of how long the conversation has been going. At $3/million input tokens (Claude Sonnet pricing), turn 50 of a conversation costs ~$0.09 in input tokens without memory vs. ~$0.0015 with MemoryMesh -- a 60x reduction for that turn.
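The arithmetic behind those figures can be checked in a few lines. This is a worked calculation using the turn-50 estimates from the table (~30,000 vs. ~500 input tokens) and the $3/million rate quoted above; `input_cost` is a hypothetical helper name.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, the rate quoted above


def input_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Input-token cost in USD for a single request."""
    return tokens * price_per_million / 1_000_000


full_history = input_cost(30_000)  # turn 50, full history re-sent
with_recall = input_cost(500)      # turn 50, top-5 recalled memories
ratio = full_history / with_recall

print(f"full history: ${full_history:.4f}")
print(f"with recall:  ${with_recall:.4f}")
print(f"reduction:    {ratio:.0f}x")
```

Both costs are for the input tokens of a single turn. Summing over all 50 turns would widen the gap, because the full-history column grows every turn while the recall column stays nearly flat.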

This is not just a cost saving. It also means your application stays within context window limits, responds faster (fewer tokens to process), and retrieves only what is actually relevant instead of forcing the LLM to sift through thousands of tokens of conversational noise.


Installation

# Base installation (no external dependencies, uses built-in keyword matching)
pip install memorymesh

# With local embeddings (sentence-transformers, runs entirely on your machine)
pip install "memorymesh[local]"

# With Ollama embeddings (connect to a local Ollama instance)
pip install "memorymesh[ollama]"

# With OpenAI embeddings
pip install "memorymesh[openai]"

# Everything
pip install "memorymesh[all]"

Features

  • Simple API -- remember(), recall(), forget(). That is the core interface. No boilerplate, no configuration ceremony.
  • SQLite-Based -- All memory is stored in SQLite files. No database servers, no infrastructure. Automatic schema migrations keep existing databases up to date.
  • Framework-Agnostic -- Works with any LLM, any framework, any architecture. Use it with LangChain, LlamaIndex, raw API calls, or your own custom setup.
  • Pluggable Embeddings -- Choose the embedding provider that fits your needs: local models, Ollama, OpenAI, or plain keyword matching with zero dependencies.
  • Time-Based Decay -- Memories naturally fade over time, just like human memory. Recent and frequently accessed memories are ranked higher.
  • Auto-Importance Scoring -- Automatically detect and prioritize key information. MemoryMesh analyzes text for keywords, structure, and specificity to assign importance scores without manual tuning.
  • Episodic Memory -- Group memories by conversation session. Recall with session context for better continuity across multi-turn interactions.
  • Memory Compaction -- Detect and merge similar or redundant memories to keep your store lean. Reduces noise and improves recall accuracy over time.
  • Encrypted Storage -- Optionally encrypt memory text and metadata at rest. All data stays protected on disk using application-level encryption with zero external dependencies.
  • Privacy-First -- All data stays on your machine by default. No telemetry, no cloud calls, no data collection. You own your data.
  • Cross-Platform -- Runs on Linux, macOS, and Windows. Anywhere Python runs, MemoryMesh runs.
  • MCP Support -- Expose memory as an MCP (Model Context Protocol) server for seamless integration with AI assistants.
  • Multi-Tool Sync -- Sync memories to Claude Code, OpenAI Codex CLI, and Google Gemini CLI simultaneously. Your knowledge follows you across tools.
  • Memory Categories -- Automatic categorization with scope routing. Preferences and guardrails go to global scope; decisions and patterns stay in the project. MemoryMesh decides where memories belong.
  • Session Start -- Structured context retrieval at the beginning of every AI session. Returns user profile, guardrails, common mistakes, and project context in one call.
  • Auto-Compaction -- Transparent deduplication that runs automatically during normal use. Like SQLite's auto-vacuum, you never need to think about it.
  • CLI -- Inspect, search, export, compact, and manage memories from the terminal. No Python code required.
  • Pin Support -- Pin critical memories so they never decay and always rank at the top. Use for guardrails and non-negotiable rules.
  • Privacy Guard -- Automatically detect secrets (API keys, tokens, passwords) before storing. Optionally redact them with redact=True.
  • Contradiction Detection -- Catch conflicting facts when storing new memories. Choose to keep both, update, or skip.
  • Retrieval Filters -- Filter recall by category, minimum importance, time range, or metadata key-value pairs.
  • Web Dashboard -- Browse and search all your memories in a local web UI (memorymesh ui).
  • Evaluation Suite -- Built-in tests for recall quality and adversarial robustness.

What's New in v3

  • Pin support -- remember("critical rule", pin=True) sets importance to 1.0 with zero decay.
  • Privacy guard -- Detects API keys, GitHub tokens, JWTs, AWS keys, passwords, and more. Use redact=True to auto-redact before storing.
  • Contradiction detection -- on_conflict="update" replaces contradicting memories; "skip" discards the new one; "keep_both" flags it.
  • Retrieval filters -- recall(query, category="decision", min_importance=0.7, time_range=(...), metadata_filter={...}).
  • Web dashboard -- memorymesh ui launches a local browser-based memory viewer.
  • Evaluation suite -- 32 tests covering recall quality, adversarial inputs, scope isolation, and importance ranking.
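To make the privacy guard idea concrete, here is a minimal sketch of regex-based secret detection and redaction. It is not MemoryMesh's detector: the pattern set, `find_secrets`, `redact_secrets`, and the placeholder format are all assumptions chosen for illustration, and they cover only three well-known token shapes.

```python
import re

# Illustrative patterns only; a real detector would cover many more formats.
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Classic GitHub personal access tokens: "ghp_" plus 36 alphanumerics.
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    # JWTs: three base64url segments; the header starts with "eyJ".
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def find_secrets(text: str) -> list[str]:
    """Return the name of every pattern that matches somewhere in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def redact_secrets(text: str) -> str:
    """Replace each detected secret with a labelled placeholder."""
    for name, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


sample = "deploy key AKIAIOSFODNN7EXAMPLE for prod"
print(find_secrets(sample))
print(redact_secrets(sample))
```

Running detection before the write, as the feature describes, means a secret never reaches disk in the first place, which is simpler than scrubbing it from storage afterwards.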

Works with Any LLM

MemoryMesh is not tied to any specific LLM provider. It works as a memory layer alongside whatever model you use:

from memorymesh import MemoryMesh

memory = MemoryMesh()

# Store memories from any source
memory.remember("User is a senior Python developer")
memory.remember("User is building a healthcare startup")
memory.remember("User prefers concise explanations")

# Recall relevant context before calling ANY LLM
context = memory.recall("What do I know about this user?")

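# Use with Claude (requires `pip install anthropic` and ANTHROPIC_API_KEY)
import anthropic

claude_client = anthropic.Anthropic()
response = claude_client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required by the Messages API
    system=f"User context: {context}",
    messages=[{"role": "user", "content": "Help me design an API"}],
)

# Or GPT (requires `pip install openai` and OPENAI_API_KEY)
from openai import OpenAI

openai_client = OpenAI()
response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"User context: {context}"},
        {"role": "user", "content": "Help me design an API"},
    ],
)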

# Or Ollama, Gemini, Mistral, Llama, or literally anything else

Documentation

| Guide | Description |
| --- | --- |
| Configuration | Embedding providers, Ollama setup, all constructor options |
| MCP Server | Setup for Claude Code, Cursor, Windsurf + teaching your AI to use memory |
| Multi-Tool Sync | Sync memories across Claude, Codex, and Gemini CLI |
| CLI Reference | Terminal commands for inspecting and managing memories |
| API Reference | Full Python API with all methods and parameters |
| Architecture | System design, dual-store pattern, and schema migrations |
| FAQ | Common questions answered |
| Benchmarks | Performance numbers and how to run benchmarks |

Roadmap

v0.1 -- MVP

  • Core remember() / recall() / forget() API
  • SQLite-based persistent storage
  • Pluggable embedding providers (none, local, ollama, openai)
  • Time-based memory decay
  • Relevance scoring (semantic + recency + importance + frequency)
  • MCP server for AI assistant integration (Claude Code, Cursor, Windsurf)
  • Security hardening (input limits, path validation, error sanitization)
  • Multi-tool memory sync (Claude, Codex, Gemini) with format adapters
  • CLI viewer and management tool (memorymesh list, search, stats, sync, etc.)
  • Automatic schema migrations (safe upgrades for existing databases)
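The relevance-scoring item above names four signals: semantic similarity, recency, importance, and frequency. One plausible way to blend them is a weighted sum with exponential time decay, sketched below. The weights, the 30-day half-life, the frequency curve, and the `ScoredMemory` record are all assumptions for illustration; this README does not specify the formula MemoryMesh uses.

```python
import math
from dataclasses import dataclass


@dataclass
class ScoredMemory:
    text: str
    similarity: float   # 0..1, from whichever embedding or keyword matcher is in use
    age_days: float     # time since the memory was stored or last accessed
    importance: float   # 0..1
    access_count: int


def recency(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def relevance(m: ScoredMemory) -> float:
    """Weighted blend of the four signals; the weights are illustrative."""
    frequency = 1.0 - math.exp(-m.access_count / 5.0)  # saturates toward 1.0
    return (
        0.5 * m.similarity
        + 0.2 * recency(m.age_days)
        + 0.2 * m.importance
        + 0.1 * frequency
    )


fresh = ScoredMemory("User prefers Python", 0.8, 0.0, 0.5, 3)
stale = ScoredMemory("User prefers Python", 0.8, 90.0, 0.5, 3)
ranked = sorted([stale, fresh], key=relevance, reverse=True)
```

With everything else equal, the fresher memory ranks first, which is the behavior the time-based decay feature describes.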

v1.0 -- Production Ready

  • Episodic memory with session tracking (session_id on remember/recall)
  • Auto-importance scoring (heuristic-based: keywords, structure, specificity)
  • Encrypted storage at rest (application-level, zero external dependencies)
  • Memory compaction (detect and merge similar/redundant memories)
  • Comprehensive benchmarks (make bench -- throughput, latency, concurrency, disk usage)
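As an illustration of what memory compaction involves, the sketch below removes near-duplicate memories using Jaccard similarity over word sets. This is a simplified stand-in, not the library's algorithm: `jaccard`, `compact`, and the 0.8 threshold are assumptions, and it keeps the longer of two near-duplicates rather than merging their content.

```python
def jaccard(a: str, b: str) -> float:
    """Share of distinct words that two texts have in common (0..1)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def compact(memories: list[str], threshold: float = 0.8) -> list[str]:
    """Greedy pass: keep a memory unless it nearly duplicates one already kept."""
    kept: list[str] = []
    for text in memories:
        duplicate_of = next(
            (i for i, existing in enumerate(kept) if jaccard(text, existing) >= threshold),
            None,
        )
        if duplicate_of is None:
            kept.append(text)
        elif len(text) > len(kept[duplicate_of]):
            kept[duplicate_of] = text  # prefer the more detailed wording
    return kept


store = [
    "User prefers dark mode",
    "user prefers dark mode",
    "Project uses PostgreSQL",
]
compacted = compact(store)
```

Word overlap only catches near-identical phrasing. Catching paraphrases would need embedding similarity, which is where the optional embedding providers would come in.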

v2.0 -- Personality & Learning Engine

  • Memory categories with automatic scope routing (category="preference" -> global)
  • Auto-categorization from text heuristics (auto_categorize=True)
  • session_start() method for structured context at the beginning of every AI session
  • Category-aware sync produces structured MEMORY.md (User Profile, Guardrails, Decisions, etc.)
  • 9 built-in categories: preference, guardrail, mistake, personality, question, decision, pattern, context, session_summary
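Scope routing can be pictured as a lookup from category to scope. The sketch below encodes only the four routings this README states (preferences and guardrails are global; decisions and patterns stay in the project). Where the other five categories go is not stated here, so the fallback default is an assumption, as is the `route_scope` name.

```python
GLOBAL_CATEGORIES = {"preference", "guardrail"}  # stated in the Features list
PROJECT_CATEGORIES = {"decision", "pattern"}     # stated in the Features list


def route_scope(category: str, default: str = "project") -> str:
    """Return "global" or "project" for a memory category."""
    if category in GLOBAL_CATEGORIES:
        return "global"
    if category in PROJECT_CATEGORIES:
        return "project"
    # Assumption: the README does not say where the remaining categories
    # are routed, so the caller chooses the fallback.
    return default


print(route_scope("preference"))
print(route_scope("decision"))
```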

v3.0 -- Intelligent Memory (Current)

  • Pin support for critical memories (zero decay, always top-ranked)
  • Privacy guard with secret detection and optional redaction
  • Contradiction detection with configurable conflict resolution
  • Advanced retrieval filters (category, importance, time range, metadata)
  • Web dashboard for browsing and searching memories (memorymesh ui)
  • Evaluation suite (recall quality + adversarial robustness tests)
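The retrieval filters can be read as a conjunction of optional predicates. The sketch below applies that logic to plain dictionaries. The keyword names mirror the recall() parameters listed under "What's New in v3", but the record shape (`category`, `importance`, `created_at`, `metadata` keys) and the inclusive `(start, end)` form of `time_range` are assumptions, and `apply_filters` is a hypothetical helper rather than part of the library.

```python
from datetime import datetime
from typing import Any, Optional


def apply_filters(
    records: list[dict[str, Any]],
    category: Optional[str] = None,
    min_importance: Optional[float] = None,
    time_range: Optional[tuple[datetime, datetime]] = None,
    metadata_filter: Optional[dict[str, Any]] = None,
) -> list[dict[str, Any]]:
    """Return the records that satisfy every filter that was supplied."""

    def keep(record: dict[str, Any]) -> bool:
        if category is not None and record["category"] != category:
            return False
        if min_importance is not None and record["importance"] < min_importance:
            return False
        if time_range is not None:
            start, end = time_range  # assumed inclusive on both ends
            if not (start <= record["created_at"] <= end):
                return False
        if metadata_filter is not None:
            metadata = record.get("metadata", {})
            if any(metadata.get(key) != value for key, value in metadata_filter.items()):
                return False
        return True

    return [record for record in records if keep(record)]


records = [
    {"text": "Use Postgres", "category": "decision", "importance": 0.9,
     "created_at": datetime(2025, 1, 10), "metadata": {"team": "core"}},
    {"text": "Likes tabs", "category": "preference", "importance": 0.4,
     "created_at": datetime(2025, 2, 1), "metadata": {}},
]
important_decisions = apply_filters(records, category="decision", min_importance=0.7)
```

Filters that are left as `None` are skipped, so calling `apply_filters(records)` with no filters returns every record.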

v4.0 -- Advanced

  • Graph-based memory relationships
  • Multi-device sync
  • Plugin system for custom relevance strategies
  • Streaming recall for large memory sets

Contributing

We welcome contributions from everyone. See CONTRIBUTING.md for guidelines on how to get started.


License

MIT License. See LICENSE for the full text.


Built for Humanity

MemoryMesh is part of the SparkVibe open-source AI initiative. We believe that foundational AI tools should be free, open, and accessible to everyone -- not locked behind paywalls, cloud subscriptions, or proprietary platforms.

Our mission is to reduce the cost and complexity of building AI applications, so that developers everywhere -- whether at a startup, a research lab, a nonprofit, or learning on their own -- can build intelligent systems without barriers.

If AI is going to shape the future, the tools that power it should belong to all of us.
