
Memory Layer AI

Persistent, context-aware memory for any AI assistant.

Memory Layer AI is an open-source Python library and REST API that gives AI models the ability to remember users across sessions — automatically, efficiently, and with full privacy control.

import asyncio

from memory_layer import MemoryLayer

async def main() -> None:
	memory = MemoryLayer(user_id="user-123")

	# Save a conversation turn
	await memory.save("I'm building a FastAPI app using PostgreSQL")

	# Later — in a new session — recall relevant context
	result = await memory.recall("What database is the user using?")
	print(result.prompt_block)

asyncio.run(main())
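
The recalled prompt_block is plain text, so wiring it into a model call is just string handling. Below is a minimal, client-neutral sketch: the chat-message shape is the generic chat-completions format, not part of Memory Layer AI's API, and the sample prompt_block content is made up for illustration.

# Sketch: splice a recalled prompt_block into a chat-style prompt for any LLM.
# The message shape is the generic chat-completions format, not a Memory Layer AI API.

def build_messages(prompt_block: str, user_question: str) -> list[dict]:
    system = (
        "You are a helpful assistant.\n\n"
        "Known facts about the user:\n" + prompt_block
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_question},
    ]

# Example with made-up memory content; in practice, pass result.prompt_block.
messages = build_messages(
    prompt_block="- The user is building a FastAPI app backed by PostgreSQL.",
    user_question="Which database driver should I use?",
)
for message in messages:
    print(message["role"], "->", message["content"])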

Why Memory Layer AI?

Problem                                    Memory Layer AI
AI forgets everything between sessions     Persistent memory across unlimited sessions
Context window fills up with history       Smart compression and retrieval, not a raw history dump
Tied to one model or cloud vendor          Model-agnostic: Claude, GPT, Ollama, any LLM
No privacy control                         User-scoped memory, full delete support, local-first option
Black-box memory                           Introspectable: view and edit what the system knows

Features

  • Four memory types — episodic, semantic, working, procedural
  • Semantic retrieval — vector similarity search, not keyword matching
  • Token-budget-aware — never overflows your LLM's context window (a rough sketch of the idea follows this list)
  • Auto-compression — old memories summarized, not deleted
  • Local-first — runs fully offline with ChromaDB + local embedding models
  • MCP-compatible — plug directly into Claude, Cursor, and any MCP-enabled tool
  • REST API + Python SDK — use as a service or import as a library
  • CLI debug tools — inspect, search, compress, and manage memories from the terminal
  • Custom memory type plugins — extend ingestion routing with pluggable classifiers
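
The token-budget bullet above is easiest to picture with a small sketch. The code below is not Memory Layer AI's implementation, just an illustration of greedy, budget-aware packing; it assumes the snippets are already ranked by relevance and uses a crude word count in place of a real tokenizer.

# Illustrative only: pack ranked memory snippets into a fixed token budget.
# Memory Layer AI's internal logic may differ.

def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())

def pack_prompt_block(snippets: list[str], max_tokens: int = 512) -> str:
    """Greedily keep the highest-ranked snippets until the budget is spent."""
    chosen: list[str] = []
    used = 0
    for snippet in snippets:  # assumed sorted by relevance, best first
        cost = estimate_tokens(snippet)
        if used + cost > max_tokens:
            break
        chosen.append(snippet)
        used += cost
    return "\n".join(f"- {s}" for s in chosen)

ranked = [
    "User is building a FastAPI app backed by PostgreSQL.",
    "User prefers concise answers with code examples.",
]
print(pack_prompt_block(ranked, max_tokens=40))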

Quick Start

pip install "memory-layer-ai[all]"

import asyncio

from memory_layer import MemoryLayer

async def main() -> None:
	# Embedded mode (no server needed)
	memory = MemoryLayer(user_id="alice")

	await memory.save("My name is Alice, I'm a backend engineer.")
	await memory.save("I prefer concise answers with code examples.")

	context = await memory.recall("Tell me about the user")
	print(context.prompt_block)

asyncio.run(main())

Or run as an API server:

uvicorn memory_layer.api.main:app --port 8000

The memory introspection UI is available at http://localhost:8000/ui.

Or run with Docker Compose (API + Qdrant):

docker compose up --build

Then open:

  • http://localhost:8000/v1/health
  • http://localhost:8000/docs
  • http://localhost:8000/ui
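
Once the server or the Compose stack is up, you can verify it from Python by hitting the health endpoint listed above. This assumes the default port 8000 and that the requests package is installed.

# Quick health check against the running API (defaults from the commands above).
import requests

resp = requests.get("http://localhost:8000/v1/health", timeout=5)
resp.raise_for_status()
print(resp.status_code, resp.text)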

Documentation

Documentation Website

Build and run the docs as a simple website:

pip install -e ".[docs]"
python -m mkdocs serve

Then open http://127.0.0.1:8000.

To build static HTML output:

python -m mkdocs build

Publish Documentation to GitHub Pages

This repository includes an automated Pages workflow at .github/workflows/docs-pages.yml.

To make the docs publicly accessible without local setup:

  1. Push to the main branch.
  2. Open the GitHub repository settings.
  3. Go to Pages.
  4. Set the source to GitHub Actions.
  5. Wait for the Docs Pages workflow to complete.

Public URL format:

  • https://zidanmubarak.github.io/Memory-Layer-AI/

Document              Description
Architecture          System design and component overview
Memory Logic          How ingestion, retrieval and compression work
API Reference         REST endpoint contracts
SDK Guide             Using Memory Layer as a Python library
MCP Integration       Connecting to Claude Code, Cursor, etc.
Benchmarking Guide    Running the performance benchmark suite and baselines
Plugin System Guide   Building and registering custom memory type plugins
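
The Plugin System Guide above documents the real registration API. Purely as an illustration of the general pattern of a pluggable classifier that routes ingested text to a memory type, a sketch might look like the following; every name in it is hypothetical and not part of Memory Layer AI's interfaces.

# Hypothetical sketch of pluggable ingestion routing. None of these names come
# from Memory Layer AI; see the Plugin System Guide for the actual API.
from typing import Callable

# A classifier takes raw text and returns a memory-type label, or None to pass.
Classifier = Callable[[str], str | None]

_classifiers: list[Classifier] = []

def register_classifier(fn: Classifier) -> Classifier:
    """Decorator that adds a classifier to the routing chain."""
    _classifiers.append(fn)
    return fn

@register_classifier
def preference_classifier(text: str) -> str | None:
    # Route statements of preference to a custom "preference" memory type.
    return "preference" if "prefer" in text.lower() else None

def route(text: str, default: str = "episodic") -> str:
    """Ask each registered classifier in turn; fall back to the default type."""
    for classify in _classifiers:
        label = classify(text)
        if label is not None:
            return label
    return default

print(route("I prefer concise answers."))   # -> preference
print(route("We deployed the app today."))  # -> episodic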

Tech Stack

  • Python 3.11+ · FastAPI · Pydantic v2
  • ChromaDB (local) / Qdrant (production)
  • SQLite via SQLModel for metadata
  • sentence-transformers for local embeddings (a retrieval sketch follows this list)
  • Typer + Rich for CLI
  • AsyncIO for background jobs
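
To make the local-first retrieval path concrete, here is a standalone sketch of vector similarity search using the stack above: ChromaDB as the in-memory vector store and a sentence-transformers model for embeddings. It is not Memory Layer AI's internal code, just the same building blocks used directly, and it downloads the embedding model on first run.

# Standalone sketch of local semantic retrieval with ChromaDB + sentence-transformers.
# Not Memory Layer AI's internals; just the underlying building blocks.
import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # small local embedding model
client = chromadb.Client()                        # in-memory vector store
collection = client.create_collection("memories")

memories = [
    "I'm building a FastAPI app using PostgreSQL.",
    "I prefer concise answers with code examples.",
]
collection.add(
    ids=[f"mem-{i}" for i in range(len(memories))],
    documents=memories,
    embeddings=model.encode(memories).tolist(),
)

query = "What database is the user using?"
result = collection.query(
    query_embeddings=model.encode([query]).tolist(),
    n_results=1,
)
print(result["documents"][0][0])  # -> the PostgreSQL memory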

Project Status

v0.1 — Active development. Core ingestion and retrieval are being built. See milestone tracking in the GitHub repository issues/projects.


Contributing

Contributions are welcome. Use GitHub Issues and Pull Requests in this repository.


License

MIT — see LICENSE.



Download files

Download the file for your platform.

Source Distribution

memory_vault-0.1.0.tar.gz (126.5 kB)


Built Distribution


memory_vault-0.1.0-py3-none-any.whl (75.4 kB)


File details

Details for the file memory_vault-0.1.0.tar.gz.

File metadata

  • Download URL: memory_vault-0.1.0.tar.gz
  • Upload date:
  • Size: 126.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.9

File hashes

Hashes for memory_vault-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6d12c69327dfde6e490ca237984115749d4dfa41f80fd3310a82e1c18a97103a
MD5 1823b15d0acb99da97b1c8c6771ed96d
BLAKE2b-256 a2e7a959317bc98eb13d8d772dc29a8b47a840850fa382846fd577ff48efe2f8


File details

Details for the file memory_vault-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: memory_vault-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 75.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.9

File hashes

Hashes for memory_vault-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 163d91c455ae3326202b4a151813d6c3cd241257c2bf29fe174f1f49ddb665ca
MD5 d5839c35c1e0d6045cdfbb942a58b61b
BLAKE2b-256 7aa55e3a1021d81112032c6127dc934dd3e451d96b247fbb923dbe7887dacb96

