# Memory Layer AI
Persistent, context-aware memory for any AI assistant.
Memory Layer AI is an open-source Python library and REST API that gives AI models the ability to remember users across sessions — automatically, efficiently, and with full privacy control.
```python
import asyncio

from memory_layer import MemoryLayer

async def main() -> None:
    memory = MemoryLayer(user_id="user-123")

    # Save a conversation turn
    await memory.save("I'm building a FastAPI app using PostgreSQL")

    # Later — in a new session — recall relevant context
    result = await memory.recall("What database is the user using?")
    print(result.prompt_block)

asyncio.run(main())
```
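The `prompt_block` returned by `recall` is designed to be spliced into the prompt you send to your model. Below is a hypothetical sketch of that splice; the `build_system_prompt` helper and the delimiter format are illustrative and not part of the library's API:

```python
def build_system_prompt(base_instructions: str, memory_block: str) -> str:
    """Combine static instructions with recalled memory into one system prompt."""
    if not memory_block:
        return base_instructions
    return (
        f"{base_instructions}\n\n"
        "## What you remember about this user\n"
        f"{memory_block}"
    )

# `memory_block` stands in for `result.prompt_block` from the example above.
memory_block = "- The user is building a FastAPI app.\n- The user's database is PostgreSQL."
prompt = build_system_prompt("You are a helpful coding assistant.", memory_block)
print(prompt)
```

Keeping the splice in one place like this makes it easy to swap the memory source without touching the rest of the prompt-assembly code.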
## Why Memory Layer AI?
| Problem | Memory Layer AI |
|---|---|
| AI forgets everything between sessions | Persistent memory across unlimited sessions |
| Context window fills up with history | Smart compression and retrieval, not raw history dump |
| Tied to one model or cloud vendor | Model-agnostic: Claude, GPT, Ollama, any LLM |
| No privacy control | User-scoped memory, full delete support, local-first option |
| Black-box memory | Introspectable: view and edit what the system knows |
## Features
- Four memory types — episodic, semantic, working, procedural
- Semantic retrieval — vector similarity search, not keyword matching
- Token-budget-aware — never overflows your LLM's context window
- Auto-compression — old memories summarized, not deleted
- Local-first — runs fully offline with ChromaDB + local embedding models
- MCP-compatible — plug directly into Claude, Cursor, and any MCP-enabled tool
- REST API + Python SDK — use as a service or import as a library
- CLI debug tools — inspect, search, compress, and manage memories from the terminal
- Custom memory type plugins — extend ingestion routing with pluggable classifiers
## Quick Start

```bash
pip install "memory-layer-ai[all]"
```
```python
import asyncio

from memory_layer import MemoryLayer

async def main() -> None:
    # Embedded mode (no server needed)
    memory = MemoryLayer(user_id="alice")

    await memory.save("My name is Alice, I'm a backend engineer.")
    await memory.save("I prefer concise answers with code examples.")

    context = await memory.recall("Tell me about the user")
    print(context.prompt_block)

asyncio.run(main())
```
Or run as an API server:

```bash
uvicorn memory_layer.api.main:app --port 8000
```

The memory introspection UI is available at http://localhost:8000/ui.

Or run with Docker Compose (API + Qdrant):

```bash
docker compose up --build
```

Then open:

- http://localhost:8000/v1/health
- http://localhost:8000/docs
- http://localhost:8000/ui
## Documentation

### Documentation Website

Build and run the docs as a simple website:

```bash
pip install -e ".[docs]"
python -m mkdocs serve
```

Then open http://127.0.0.1:8000.

To build static HTML output:

```bash
python -m mkdocs build
```
### Publish Documentation to GitHub Pages

This repository includes an automated Pages workflow at `.github/workflows/docs-pages.yml`.

To make the docs publicly accessible without local setup:

- Push to the `main` branch.
- Open the GitHub repository settings.
- Go to **Pages**.
- Set the source to **GitHub Actions**.
- Wait for the `Docs Pages` workflow to complete.

Public URL format:

https://zidanmubarak.github.io/Memory-Layer-AI/
| Document | Description |
|---|---|
| Architecture | System design and component overview |
| Memory Logic | How ingestion, retrieval and compression work |
| API Reference | REST endpoint contracts |
| SDK Guide | Using Memory Layer as a Python library |
| MCP Integration | Connecting to Claude Code, Cursor, etc. |
| Benchmarking Guide | Running performance benchmark suite and baselines |
| Plugin System Guide | Building and registering custom memory type plugins |
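The Plugin System Guide above covers the real interface for custom memory type plugins; as a rough illustration of the idea, a registry of pluggable classifiers might route incoming text to a memory type like this (all names here are hypothetical, not taken from the library):

```python
from typing import Callable

# Hypothetical sketch: classifiers are tried in registration order, and
# the first one whose predicate matches decides the memory type.
_classifiers: list[tuple[str, Callable[[str], bool]]] = []

def register_classifier(memory_type: str, predicate: Callable[[str], bool]) -> None:
    _classifiers.append((memory_type, predicate))

def classify(text: str) -> str:
    for memory_type, predicate in _classifiers:
        if predicate(text):
            return memory_type
    return "episodic"  # fall back to raw event memory

register_classifier("procedural", lambda t: t.lower().startswith("how to"))
register_classifier("semantic", lambda t: " is " in t or " are " in t)

print(classify("How to deploy the API with Docker"))   # procedural
print(classify("Qdrant is the production vector store"))  # semantic
print(classify("We chatted about benchmarks today"))   # episodic
```

In practice the predicates could be anything from keyword rules to a small ML model; the registry pattern keeps ingestion routing open for extension without modifying core code.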
## Tech Stack
- Python 3.11+ · FastAPI · Pydantic v2
- ChromaDB (local) / Qdrant (production)
- SQLite via SQLModel for metadata
- sentence-transformers for local embeddings
- Typer + Rich for CLI
- AsyncIO for background jobs
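To make the "SQLite via SQLModel for metadata" split concrete: vectors live in ChromaDB or Qdrant, while per-memory bookkeeping sits in a relational table. The sketch below uses the stdlib `sqlite3` module instead of SQLModel, and the schema is an assumption for illustration, not the project's actual one:

```python
import sqlite3

# Illustrative only: a minimal metadata table in the spirit of the
# SQLite store mentioned above. Embeddings would live in the vector
# store; this table holds only bookkeeping fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memory_meta (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        memory_type TEXT NOT NULL,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        compressed INTEGER DEFAULT 0
    )
    """
)
conn.execute(
    "INSERT INTO memory_meta (user_id, memory_type) VALUES (?, ?)",
    ("alice", "semantic"),
)
rows = conn.execute(
    "SELECT user_id, memory_type, compressed FROM memory_meta"
).fetchall()
print(rows)  # [('alice', 'semantic', 0)]
```

Keeping metadata relational makes user-scoped deletion and compression flags cheap to query without touching the vector index.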
## Project Status

v0.1 — in active development. Core ingestion and retrieval are being built. See milestone tracking in the GitHub repository's issues and projects.
## Contributing
Contributions are welcome. Use GitHub Issues and Pull Requests in this repository.
## License

MIT — see LICENSE.