The Zero-Dependency, Sub-Millisecond AI Memory System
Mnemosyne
Native, zero-cloud memory for AI agents. SQLite-backed. Sub-millisecond. Fully private.
Mnemosyne is a local-first memory system for the Hermes Agent framework. It stores conversations, preferences, and knowledge in SQLite with native vector search (sqlite-vec) and full-text search (FTS5) — no external databases, no API keys, no network calls.
Quick Start
Option A: Full install (recommended)
For the CLI, Python API, and Hermes MemoryProvider:
# 1. Clone and install
git clone https://github.com/AxDSan/mnemosyne.git
cd mnemosyne
pip install -e .
⚠️ Ubuntu 24.04 / Debian 12 users: If you get error: externally-managed-environment, your system Python is PEP 668-protected. Use a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
Make sure to activate the venv every time you run Hermes, or install Hermes itself inside the same venv.
# 2. Register with Hermes
python -m mnemosyne.install
# 3. Activate as your memory provider
hermes memory setup
# → Select "mnemosyne" and press Enter
Option B: Hermes MemoryProvider only (no pip needed)
If you only need Mnemosyne as a Hermes memory backend and want to skip pip entirely:
curl -sSL https://raw.githubusercontent.com/AxDSan/mnemosyne/main/deploy_hermes_provider.sh | bash
This symlinks the provider into ~/.hermes/plugins/mnemosyne and adds the repo to sys.path at runtime. No virtual environment required — works out of the box on Ubuntu 24.04.
Verify:
hermes memory status # Should show "Provider: mnemosyne"
hermes mnemosyne stats # Shows working + episodic memory counts
Note: The hermes memory setup picker defaults to "Built-in only" every time it opens. This is normal Hermes UI behavior; your previous selection is still saved. Just select Mnemosyne and press Enter.
What Makes It Different
| | Mnemosyne | Cloud alternatives |
|---|---|---|
| Latency | < 1ms | 10-100ms |
| Dependencies | Python stdlib + optional ONNX | External APIs, auth, rate limits |
| Privacy | 100% local | Data leaves your machine |
| Cost | Free | Freemium / per-call |
| Setup | pip install -e . | API keys, accounts, config |
Key capabilities:
- BEAM architecture — Three tiers: hot working memory, long-term episodic memory, temporary scratchpad
- Hybrid search — 50% vector similarity + 30% FTS5 rank + 20% importance, all inside SQLite
- Automatic consolidation — Old working memories are summarized and moved to episodic memory via mnemosyne_sleep()
- Temporal triples — Time-aware knowledge graph with automatic invalidation
- Export / import — Move your entire memory database to a new machine with one JSON file
- Cross-session scope — remember(..., scope="global") makes facts visible everywhere
- Configurable compression — float32 (default), int8 (4x smaller), or bit (32x smaller) vectors
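The hybrid-search weights in the list above can be read as a plain weighted sum. The sketch below is illustrative only: the function name hybrid_score and the assumption that each signal is already normalized to [0, 1] are not taken from Mnemosyne's source, and the real ranking may normalize or combine the signals differently.

```python
def hybrid_score(vector_sim: float, fts_rank: float, importance: float) -> float:
    """Combine three signals using the 50/30/20 weights from the capability list.

    Assumption: every input has already been normalized to the range [0, 1].
    """
    signals = {"vector_sim": vector_sim, "fts_rank": fts_rank, "importance": importance}
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 0.5 * vector_sim + 0.3 * fts_rank + 0.2 * importance


# Hypothetical candidates with made-up signal values, ranked best first.
candidates = {
    "dark mode preference": hybrid_score(0.92, 0.80, 0.90),
    "neovim note": hybrid_score(0.40, 0.10, 0.80),
}
ranked = sorted(candidates, key=candidates.get, reverse=True)
```

With these weights, a memory that matches poorly on both vector similarity and keywords cannot outrank a strong match on importance alone, since importance contributes at most 0.2.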
Benchmarks
All numbers measured on CPU with sqlite-vec + FTS5 enabled.
LongMemEval (ICLR 2025)
| System | Score | Notes |
|---|---|---|
| Mnemosyne (dense) | 98.9% Recall@All@5 | Oracle subset, 100 instances, bge-small-en-v1.5 |
| Mempalace | 96.6% Recall@5 | AAAK + Palace architecture |
| Mastra Observational Memory | 84.23% (gpt-4o) | Three-date model |
| Full-context GPT-4o baseline | ~60.2% | No memory system |
Latency vs. Cloud Alternatives
| Operation | Honcho | Zep | MemGPT | Mnemosyne | Speedup (vs. Honcho) |
|---|---|---|---|---|---|
| Write | 45ms | 85ms | 120ms | 0.81ms | 56x |
| Read | 38ms | 62ms | 95ms | 0.076ms | 500x |
| Search | 52ms | 78ms | 140ms | 1.2ms | 43x |
| Cold Start | 500ms | 800ms | 1200ms | 0ms | Instant |
BEAM Architecture Scaling
Write throughput:
| Operation | Count | Total | Avg |
|---|---|---|---|
| Working memory writes | 500 | 8.7s | 17.4 ms |
| Episodic inserts (with embedding) | 500 | 10.7s | 21.3 ms |
| Sleep consolidation | 300 old items | 33 ms | — |
Hybrid recall scaling (average query latency stays near 5 ms up to a corpus size of 1,000 and reaches 7.0 ms at 2,000):
| Corpus Size | Query | Avg Latency | p95 |
|---|---|---|---|
| 100 | "concept 42" | 5.1 ms | 6.9 ms |
| 500 | "concept 42" | 5.0 ms | 5.7 ms |
| 1,000 | "concept 42" | 5.3 ms | 6.5 ms |
| 2,000 | "concept 42" | 7.0 ms | 8.6 ms |
Working memory recall scaling (FTS5 fast path):
| WM Size | Query | Avg Latency | p95 |
|---|---|---|---|
| 1,000 | "concept 42" | 2.4 ms | 3.1 ms |
| 5,000 | "domain 7" | 3.2 ms | 3.8 ms |
| 10,000 | "concept 42" | 6.4 ms | 7.2 ms |
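For readers who want to reproduce average and p95 figures like those in the tables above, here is a minimal timing sketch. It is not the project's benchmark script (that lives in tests/benchmark_beam_working_memory.py); the helper names measure_latency_ms and percentile_nearest_rank are hypothetical, and the nearest-rank percentile definition is an assumption about how p95 was computed.

```python
import statistics
import time


def percentile_nearest_rank(samples, pct: int):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]


def measure_latency_ms(fn, runs: int = 200):
    """Call fn() `runs` times and return (average, p95) latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.fmean(samples), percentile_nearest_rank(samples, 95)


# Stand-in workload; replace the lambda with a real recall call to time Mnemosyne.
avg_ms, p95_ms = measure_latency_ms(lambda: sum(range(1000)), runs=50)
```

Reporting p95 next to the average is useful because a single slow query (for example, a cold page cache) can hide behind a low mean.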
Installation
Prerequisites
- Python 3.9+
- Hermes Agent (for plugin integration)
Basic
git clone https://github.com/AxDSan/mnemosyne.git
cd mnemosyne
pip install -e .
python -m mnemosyne.install
⚠️ Ubuntu 24.04 / Debian 12 users: If pip install -e . fails with externally-managed-environment, see the Quick Start → Option A note about using a virtual environment.
Optional dependencies
# Dense retrieval (required for semantic search and the 98.9% LongMemEval score)
# Quote the version specifier so the shell does not treat ">" as a redirect
pip install "fastembed>=0.3.0"
# Local LLM consolidation (sleep cycle summarization)
pip install "ctransformers>=0.2.27" "huggingface-hub>=0.20"
Note: Without fastembed, Mnemosyne falls back to keyword-only retrieval. It still works, but you won't get competitive semantic search or the benchmark scores above.
Uninstall
python -m mnemosyne.install --uninstall
Updating
Mnemosyne is installed from source, so updating is a git pull away.
Option A (pip install -e .):
cd mnemosyne
git pull
# Only re-run pip if setup.py changed (new deps, entry points, CLI commands):
pip install -e .
Option B (deploy script / symlink only):
cd mnemosyne
git pull
# Nothing to reinstall — it's a live symlink
Always restart Hermes after updating so plugin changes take effect:
hermes gateway restart
If the update includes database schema changes, run the migration helper:
python scripts/migrate_from_legacy.py
See UPDATING.md for detailed troubleshooting and rollback instructions.
Usage
CLI
# Show memory statistics
hermes mnemosyne stats
# Search memories
hermes mnemosyne inspect "dark mode preferences"
# Run consolidation (compress old working memory into episodic summaries)
hermes mnemosyne sleep
# Export all memories to a JSON file
hermes mnemosyne export --output mnemosyne_backup.json
# Import memories from a JSON file
hermes mnemosyne import --input mnemosyne_backup.json
# Clear scratchpad
hermes mnemosyne clear
Python API
from mnemosyne import remember, recall
# Store a fact
remember(
content="User prefers dark mode interfaces",
importance=0.9,
source="preference"
)
# Store a global preference (visible in every session)
remember(
content="User email is abdi.moya@gmail.com",
importance=0.95,
source="preference",
scope="global"
)
# Store a temporary credential with expiry
remember(
content="API key: sk-abc123",
importance=0.8,
source="credential",
valid_until="2026-12-31T00:00:00"
)
# Search memories
results = recall("interface preferences", top_k=3)
# Temporal knowledge graph
from mnemosyne.core.triples import TripleStore
kg = TripleStore()
kg.add("Maya", "assigned_to", "auth-migration", valid_from="2026-01-15")
kg.query("Maya", as_of="2026-02-01")
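To make "time-aware with automatic invalidation" concrete, here is a toy in-memory model of how an as_of query could behave. It is a sketch under stated assumptions, not Mnemosyne's TripleStore: the class ToyTripleStore, the valid_until field, the rule that a newer value closes the older one, and the "billing-revamp" value are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str
    valid_from: str                    # ISO-8601 date, e.g. "2026-01-15"
    valid_until: Optional[str] = None  # None means "still valid"


class ToyTripleStore:
    """In-memory stand-in for illustration; not the real TripleStore."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, subject: str, predicate: str, obj: str, valid_from: str) -> None:
        # Assumed invalidation rule: a new value for the same
        # (subject, predicate) closes the previously open triple.
        for t in self._triples:
            if (t.subject, t.predicate) == (subject, predicate) and t.valid_until is None:
                t.valid_until = valid_from
        self._triples.append(Triple(subject, predicate, obj, valid_from))

    def query(self, subject: str, as_of: str) -> List[Triple]:
        # ISO-8601 dates in the same format compare correctly as plain strings.
        return [
            t for t in self._triples
            if t.subject == subject
            and t.valid_from <= as_of
            and (t.valid_until is None or as_of < t.valid_until)
        ]


toy_kg = ToyTripleStore()
toy_kg.add("Maya", "assigned_to", "auth-migration", valid_from="2026-01-15")
toy_kg.add("Maya", "assigned_to", "billing-revamp", valid_from="2026-03-01")

february = [t.obj for t in toy_kg.query("Maya", as_of="2026-02-01")]
april = [t.obj for t in toy_kg.query("Maya", as_of="2026-04-01")]
```

In this toy model the February query sees only the first assignment and the April query sees only the second, because adding the second triple closed the first one.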
Advanced: BEAM direct access
from mnemosyne.core.beam import BeamMemory
beam = BeamMemory(session_id="my_session")
# Working memory (auto-injected into prompts)
beam.remember("Important context", importance=0.9)
# Episodic memory (long-term, searchable)
beam.consolidate_to_episodic(
summary="User likes Neovim",
source_wm_ids=["wm1"],
importance=0.8
)
# Scratchpad (temporary reasoning)
beam.scratchpad_write("todo: fix auth bug")
# Search both tiers
results = beam.recall("editor preferences", top_k=5)
Architecture
┌─────────────────────────────────────────────────────────────┐
│ HERMES AGENT │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌─────────────┐ │
│ │ pre_llm │────▶│ Mnemosyne │────▶│ SQLite │ │
│ │ hook │ │ BEAM │ │ │ │
│ └─────────────┘ └──────────────┘ │ working_mem │ │
│ ▲ │ episodic_mem│ │
│ │ │ vec_episodes│ │
│ └──────── Auto-injected context ───│ fts_episodes│ │
│ │ scratchpad │ │
│ │ triples │ │
│ └─────────────┘ │
│ │
│ No HTTP. No cloud. 100% local. │
└─────────────────────────────────────────────────────────────┘
BEAM (Bilevel Episodic-Associative Memory):
- working_memory — Hot context, auto-injected before LLM calls, TTL-based eviction
- episodic_memory — Long-term storage with sqlite-vec + FTS5 hybrid search
- scratchpad — Temporary agent reasoning workspace
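The flow from working memory to episodic memory can be pictured with a small toy model. This is only a sketch of the idea: ToyBeam is a hypothetical class, time is passed in explicitly as hours, and the "summary" is plain string concatenation rather than the LLM summarization the real sleep cycle can use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyBeam:
    """Toy model of the working -> episodic flow; not the real BeamMemory."""
    ttl_hours: float = 24.0
    working: List[dict] = field(default_factory=list)
    episodic: List[dict] = field(default_factory=list)

    def remember(self, content: str, now_hours: float) -> None:
        self.working.append({"content": content, "created": now_hours})

    def sleep(self, now_hours: float) -> int:
        """Move working items older than the TTL into one episodic summary."""
        old = [m for m in self.working if now_hours - m["created"] > self.ttl_hours]
        if not old:
            return 0
        # Placeholder summary: concatenation stands in for real summarization.
        summary = "; ".join(m["content"] for m in old)
        self.episodic.append({"summary": summary, "source_count": len(old)})
        self.working = [m for m in self.working if now_hours - m["created"] <= self.ttl_hours]
        return len(old)


toy = ToyBeam(ttl_hours=24)
toy.remember("User likes Neovim", now_hours=0)
toy.remember("User prefers dark mode", now_hours=2)
toy.remember("todo: fix auth bug", now_hours=30)
moved = toy.sleep(now_hours=30)
```

After the call, the two items older than 24 hours live in a single episodic entry and only the fresh item remains in working memory.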
Why SQLite for Hermes?
SQLite is already in your stack. Hermes uses it for session persistence. Mnemosyne extends that same file — no new dependencies, no Docker containers, no connection pooling.
| Feature | Honcho | Zep | Mnemosyne |
|---|---|---|---|
| Deployment | Docker + PostgreSQL | Docker + Postgres | pip install |
| Query Language | REST API | REST API | SELECT ... WHERE MATCH |
| Vector Store | pgvector | pgvector | sqlite-vec |
| Text Search | Separate API | Separate API | Built-in FTS5 |
| Auth Required | Yes (Supabase) | Yes | No |
| Offline Mode | No | No | Yes |
| Cold Start Latency | 500-800ms | 800ms+ | 0ms |
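The "SELECT ... WHERE MATCH" entry in the table refers to SQLite's FTS5 query syntax. The sketch below shows that syntax with Python's built-in sqlite3 module. The table name notes and the sample rows are hypothetical and do not reflect Mnemosyne's actual schema, and the example assumes your Python's SQLite build includes FTS5 (it returns None when it does not).

```python
import sqlite3


def fts5_search(docs, query):
    """Return matching docs ordered by FTS5 rank, or None if FTS5 is unavailable."""
    conn = sqlite3.connect(":memory:")
    try:
        try:
            conn.execute("CREATE VIRTUAL TABLE notes USING fts5(content)")
        except sqlite3.OperationalError:
            return None  # this SQLite build was compiled without FTS5
        conn.executemany("INSERT INTO notes(content) VALUES (?)", [(d,) for d in docs])
        rows = conn.execute(
            # "rank" is FTS5's built-in relevance column (bm25 by default).
            "SELECT content FROM notes WHERE notes MATCH ? ORDER BY rank",
            (query,),
        ).fetchall()
        return [r[0] for r in rows]
    finally:
        conn.close()


docs = [
    "User prefers dark mode interfaces",
    "User likes Neovim",
    "Dark roast coffee every morning",
]
# Two bare words in an FTS5 query are an implicit AND.
hits = fts5_search(docs, "dark mode")
```

Only the first document contains both terms, so it is the only match; the coffee note matches "dark" but not "mode".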
Backup, Export & Migration
Mnemosyne stores everything in a single SQLite file at ~/.hermes/mnemosyne/data/mnemosyne.db.
# Simple backup
cp ~/.hermes/mnemosyne/data/mnemosyne.db ~/backups/mnemosyne_$(date +%Y%m%d).db
# Export to JSON (portable across machines)
hermes mnemosyne export --output mnemosyne_backup.json
# Import on a new machine
hermes mnemosyne import --input mnemosyne_backup.json
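Copying a SQLite file with cp while another process is writing to it can capture an inconsistent snapshot. If Hermes may be running during the backup, SQLite's online backup API is a safer option. The helper below is a minimal sketch using Python's standard sqlite3 module; backup_sqlite is a hypothetical name, not part of Mnemosyne.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src_path, dest_path) -> None:
    """Copy a SQLite database using the online backup API (Python 3.7+)."""
    if not Path(src_path).is_file():
        # sqlite3.connect() would silently create an empty database otherwise.
        raise FileNotFoundError(f"no database at {src_path}")
    src = sqlite3.connect(str(src_path))
    dest = sqlite3.connect(str(dest_path))
    try:
        src.backup(dest)  # consistent snapshot, even with concurrent writers
    finally:
        dest.close()
        src.close()


# Example call using the default data path described above:
# backup_sqlite(Path.home() / ".hermes/mnemosyne/data/mnemosyne.db", "mnemosyne_backup.db")
```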
Environment Variables
| Variable | Default | Description |
|---|---|---|
| MNEMOSYNE_DATA_DIR | ~/.hermes/mnemosyne/data | Database directory |
| MNEMOSYNE_VEC_TYPE | float32 | Vector compression: float32, int8, or bit |
| MNEMOSYNE_WM_MAX_ITEMS | 10000 | Working memory item limit |
| MNEMOSYNE_WM_TTL_HOURS | 24 | Working memory TTL in hours |
| MNEMOSYNE_RECENCY_HALFLIFE | 168 | Recency decay half-life in hours (1 week) |
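A half-life setting such as MNEMOSYNE_RECENCY_HALFLIFE is usually applied as exponential decay. The sketch below shows that interpretation; the exact formula Mnemosyne uses is not documented here, so treat recency_weight as a hypothetical illustration rather than the library's implementation.

```python
import os


def recency_weight(age_hours: float, halflife_hours: float) -> float:
    """Exponential decay: the weight halves every `halflife_hours`."""
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")
    if halflife_hours <= 0:
        raise ValueError("halflife_hours must be positive")
    return 0.5 ** (age_hours / halflife_hours)


# Read the setting the same way the table describes it, falling back to 168 hours.
halflife = float(os.environ.get("MNEMOSYNE_RECENCY_HALFLIFE", "168"))

fresh = recency_weight(0, 168)         # a brand-new memory keeps full weight
one_week = recency_weight(168, 168)    # one half-life old
two_weeks = recency_weight(336, 168)   # two half-lives old
```

Under this reading, a memory from one week ago counts half as much as a fresh one, and a memory from two weeks ago a quarter as much.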
Testing
# Run tests
python -m pytest tests/test_beam.py -v
# Run benchmarks
python tests/benchmark_beam_working_memory.py
Contributing
Contributions are welcome. Areas of active interest:
- Encrypted cloud sync (optional, user-controlled)
- Browser extension for web context capture
- Additional embedding models
- Multi-language support
See CONTRIBUTING.md for guidelines.
License
MIT License — See LICENSE
Copyright (c) 2026 Abdias J
Acknowledgments
- Hermes Agent Framework — The ecosystem Mnemosyne was built for
- Honcho — For defining the stateful memory space
- Mempalace — For proving local-first memory can compete on benchmarks
- SQLite — The world's most deployed database
"The faintest ink is more powerful than the strongest memory." — Hermes Trismegistus