
MemOS — Memory Operating System for AI Agents

Persistent, structured, self-maintaining memory for any LLM agent. Local-first. Framework-agnostic. Connects via MCP to Claude Code, OpenClaw, Cursor, or any HTTP client.

Python 3.11+ · License: MIT


What's new in v1.1.0

  • Security hardening — WebSocket auth, CORS defaults, Pydantic request schemas on all endpoints
  • Canvas force-graph dashboard — clustering, depth filter, time-lapse slider, KG edges
  • Modular frontend — monolithic 1768-line file split into 12 JS modules
  • 1710 tests — modernized with shared fixtures, freezegun, tmp_path
  • Zero ruff errors — 506 lint errors fixed, full formatting pass
  • CI: Python 3.11 / 3.12 / 3.13 + coverage via Codecov

See CHANGELOG.md for full history.


Installation

pip install memos-os

With local semantic recall (no external services):

pip install "memos-os[local]"    # sentence-transformers, backend="local"

With vector backend (recommended for production):

pip install "memos-os[chroma]"   # ChromaDB + Ollama embeddings
pip install "memos-os[qdrant]"   # Qdrant
pip install "memos-os[all]"      # all backends

Quick start

# Store a memory
memos learn "FastAPI is better than Flask for async workloads" --tags python,backend

# Search semantically
memos recall "which web framework should I use?"

# Start the REST API + dashboard
memos serve --port 8100
# → http://localhost:8100/dashboard

Golden path

The full lifecycle, from storing a memory to maintaining it over time:

from memos import MemOS
from memos.context import ContextStack
from memos.kg_bridge import KGBridge

mem = MemOS(backend="chroma", embed_host="http://localhost:11434")
cs = ContextStack(mem)
bridge = KGBridge(mem)

# 1. Learn — store memories with tags and importance
mem.learn("User prefers dark mode in all apps", tags=["preference", "ui"], importance=0.8)
mem.learn("Deploy with Docker on ARM64 homelab", tags=["devops", "docker"], importance=0.6)
mem.learn("Alice leads the backend team", tags=["team", "people"], importance=0.7)

# 2. Recall — semantic search
results = mem.recall("who handles the server side?", top=3)
for r in results:
    print(f"[{r.score:.2f}] {r.item.content}")
# [0.91] Alice leads the backend team

# 3. context_for — targeted context for a specific LLM call
ctx = cs.context_for("how should I deploy this app?", max_chars=1500, top=5)
# Returns a string with identity + relevant memories:
# === RELEVANT MEMORIES (2 results for: 'how should I deploy this app?') ===
# [0.87] Deploy with Docker on ARM64 homelab (tags: devops, docker)

# 4. wake_up — inject at session start
prompt_fragment = cs.wake_up(max_chars=2000, l1_top=15, include_stats=True)
# Returns L0 identity + L1 top-importance memories as a string
# ready to paste into a system prompt

# 5. Reinforce and decay — keep memories healthy over time
# Boost a memory that keeps being useful
results = mem.recall("deployment strategy", top=1)
if results:
    mem._decay.reinforce(results[0].item, strength=0.1)
    mem._store.upsert(results[0].item, namespace=mem._namespace)

# Decay stale memories (preview first)
items = mem._store.list_all(namespace=mem._namespace)
report = mem._decay.run_decay(items, dry_run=True)
print(f"{report.decayed}/{report.total} memories would decay")
# When ready: run_decay(items, dry_run=False)

Python SDK

from memos import MemOS

# In-memory (zero dependencies, great for testing)
mem = MemOS()

# JSON persistence
mem = MemOS(backend="json", persist_path="~/.memos/store.json")

# Local-first semantic recall, no Ollama/Chroma required
mem = MemOS(backend="local", persist_path="~/.memos/store.json")

# ChromaDB with local Ollama embeddings
mem = MemOS(backend="chroma", embed_host="http://localhost:11434")

# Qdrant
mem = MemOS(backend="qdrant", qdrant_path="/data/memos")

# Store
mem.learn("User prefers dark mode", tags=["preference", "ui"], importance=0.8)

# Recall
results = mem.recall("what does the user like?", top=5)
for r in results:
    print(f"[{r.score:.2f}] {r.item.content}")

# Forget
mem.forget("memory-id")           # by id
mem.delete_tag("old-project")     # all memories with this tag
mem.prune(threshold=0.2)          # decay-based cleanup

# Stats
s = mem.stats()
# MemoryStats(total_memories=142, avg_relevance=0.71, decay_candidates=8)

Which recall API should I use?

  • mem.recall() — general search and browsing; returns list[RecallResult] with scores. Skip it when you need a ready-made prompt string.
  • memory_search (MCP) — same as recall(), but over MCP; returns JSON via the MCP protocol. Skip it when you're using the Python SDK directly.
  • context_for() — context for a single LLM call; returns a str (identity + top results). Skip it when you need structured data to process.
  • recall_enriched() — answers that need entity context; returns a dict with memories + KG facts. Skip it when no KG data exists or entity resolution isn't needed.

from memos.context import ContextStack
from memos.kg_bridge import KGBridge

cs = ContextStack(mem)
bridge = KGBridge(mem)

# recall() — structured results you iterate over
results = mem.recall("docker deployment", top=5)
for r in results:
    print(r.item.content, r.score)

# context_for() — one string, ready for a system prompt
ctx = cs.context_for("docker deployment", max_chars=1000, top=5)

# recall_enriched() — memories + knowledge graph facts in one dict
enriched = bridge.recall_enriched("who is Alice?", top=5, min_score=0.3)
print(enriched["facts"])       # KG triples about Alice
print(enriched["memory_count"])

MCP — connect any agent

MemOS exposes a universal MCP endpoint. Any agent that speaks MCP can use it without any code changes.

HTTP (recommended)

Claude Code — add to ~/.claude.json:

{
  "mcpServers": {
    "memos": { "type": "http", "url": "http://localhost:8100/mcp" }
  }
}

OpenClaw — add to ~/.openclaw/openclaw.json:

{
  "mcp": {
    "servers": {
      "memos": { "type": "http", "url": "http://localhost:8100/mcp" }
    }
  }
}

Any MCP client — POST http://localhost:8100/mcp with a JSON-RPC 2.0 body.

Discovery: GET http://localhost:8100/.well-known/mcp.json
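Under the hood, an MCP call is plain JSON-RPC 2.0 over HTTP. A minimal sketch of the request body for the memory_search tool — the tools/call envelope follows the MCP spec, and the argument names (query, top_k) come from the tool table below; verify exact parameter names against the discovery document:

```python
import json

# JSON-RPC 2.0 envelope for an MCP tool call.
# "tools/call" is the standard MCP method; memory_search and its
# arguments (query, top_k) are the MemOS tool described in this README.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory_search",
        "arguments": {"query": "docker deployment", "top_k": 5},
    },
}
body = json.dumps(request)
# Send it (assumes a server on port 8100):
#   curl -X POST http://localhost:8100/mcp \
#        -H "Content-Type: application/json" -d "$BODY"
print(body)
```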

Stdio (Claude Code local)

{
  "mcpServers": {
    "memos": { "command": "memos", "args": ["mcp-stdio"] }
  }
}

Or run standalone: memos mcp-serve --port 8200

Available MCP tools

Tool Description
memory_search Semantic search — query, top_k, tags
memory_save Store a memory — content, tags, importance
memory_forget Delete by id or tag
memory_stats Counts, avg importance, decay candidates
memory_wake_up Identity + top memories ready to inject at session start
memory_context_for Context optimised for a specific query
memory_decay Run decay cycle (dry-run by default)
memory_reinforce Boost a memory's importance score
kg_add_fact Add a temporal triple to the Knowledge Graph
kg_query_entity All active facts for an entity
kg_timeline Chronological fact history for an entity
memory_recall_enriched Memories + KG facts in one call

REST API

Start the server: memos serve --port 8100

Interactive docs: http://localhost:8100/docs

POST   /api/v1/learn                Store a memory
POST   /api/v1/learn/batch          Bulk store
POST   /api/v1/recall               Semantic search
GET    /api/v1/recall/stream        SSE streaming recall
GET    /api/v1/search               Keyword search
GET    /api/v1/stats                Memory statistics
GET    /api/v1/tags                 List all tags
DELETE /api/v1/tags/{tag}           Delete tag from all memories
DELETE /api/v1/memory/{id}          Delete a memory
POST   /api/v1/prune                Decay-based cleanup
GET    /api/v1/graph                Knowledge graph (nodes + edges for D3.js)
GET    /api/v1/export/parquet       Download .parquet backup
POST   /mcp                         MCP JSON-RPC endpoint
GET    /.well-known/mcp.json        MCP discovery
GET    /dashboard                   Second Brain UI
GET    /health                      Health check
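A stdlib-only sketch of calling the learn endpoint. The JSON field names (content, tags, importance) are assumptions mirroring the SDK's learn() signature — the interactive docs at /docs are the authoritative schema:

```python
import json
import urllib.error
import urllib.request

def learn(content, tags=None, importance=0.5, base="http://localhost:8100"):
    """POST one memory to /api/v1/learn. Returns the parsed JSON response,
    or None if the server is unreachable. Field names are assumed to mirror
    the SDK's learn() -- verify against /docs."""
    payload = json.dumps(
        {"content": content, "tags": tags or [], "importance": importance}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{base}/api/v1/learn",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return json.load(resp)
    except (urllib.error.URLError, TimeoutError):
        return None  # server not running

result = learn("User prefers dark mode", tags=["preference"], importance=0.8)
```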

Configuration

All options can be set via environment variables:

MEMOS_BACKEND=chroma              # memory | json | chroma | qdrant | pinecone
MEMOS_NAMESPACE=default           # memory namespace (one per agent)
MEMOS_PERSIST_PATH=~/.memos/      # path for json/sqlite storage

# ChromaDB
MEMOS_CHROMA_URL=http://chroma:8000
MEMOS_EMBED_HOST=http://localhost:11434   # Ollama — bypasses server-side ONNX
MEMOS_EMBED_MODEL=nomic-embed-text

# Qdrant
MEMOS_QDRANT_HOST=localhost
MEMOS_QDRANT_PORT=6333

# Pinecone
MEMOS_PINECONE_API_KEY=***
MEMOS_PINECONE_INDEX=agent-memories
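Resolving these variables in your own code is a pair of os.environ lookups. The defaults below come from the comments above; falling back to "memory" as the backend is an assumption based on the SDK's zero-dependency default:

```python
import os

def memos_config() -> dict:
    """Collect MemOS settings from the environment with documented defaults.
    "memory" as the backend fallback is an assumption, not confirmed."""
    env = os.environ.get
    return {
        "backend": env("MEMOS_BACKEND", "memory"),
        "namespace": env("MEMOS_NAMESPACE", "default"),
        "persist_path": os.path.expanduser(env("MEMOS_PERSIST_PATH", "~/.memos/")),
        "embed_host": env("MEMOS_EMBED_HOST", "http://localhost:11434"),
        "embed_model": env("MEMOS_EMBED_MODEL", "nomic-embed-text"),
    }

os.environ["MEMOS_BACKEND"] = "chroma"
cfg = memos_config()
print(cfg["backend"])  # chroma
```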

Docker

Single container (JSON backend, no dependencies):

docker run -p 8100:8000 \
  -e MEMOS_BACKEND=json \
  -v memos-data:/root/.memos \
  ghcr.io/mars375/memos:latest

Full stack with ChromaDB + Ollama embeddings:

git clone https://github.com/Mars375/memos
cd memos
docker compose up -d

Import conversations

Mine your existing conversations into MemOS:

# Auto-detect format
memos mine conversations.json

# Supported formats
memos mine export.json      --format claude      # Claude Projects export
memos mine conversations.json --format chatgpt   # ChatGPT export
memos mine messages.json    --format discord
memos mine result.json      --format telegram    # Telegram Desktop export
memos mine channel.jsonl    --format slack
memos mine ~/.openclaw/workspace-labs/ --format openclaw

# Options
memos mine ~/notes/ --dry-run --tags project-x --chunk-size 600 --namespace agent-alice

Python API:

from memos.ingest.miner import Miner

miner = Miner(mem, chunk_size=800, chunk_overlap=100)
result = miner.mine_auto("conversations/")   # auto-detect
# MineResult(imported=127, dupes=12, empty=3, errors=0)
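The chunking parameters act as a sliding window: each chunk shares chunk_overlap characters with its predecessor. A toy sketch of the idea — the real Miner may split on message or sentence boundaries rather than raw offsets:

```python
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 100) -> list[str]:
    """Sliding-window chunking. Illustrative only -- not the Miner's
    actual splitter, which may respect conversational boundaries."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is fully covered by the previous chunk.
    return [text[i : i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = chunk_text("x" * 2000)
print(len(chunks))  # 3 chunks, each <= 800 chars
```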

Knowledge Graph

Store and query temporal facts between entities:

memos kg-add "Alice" "works-at" "Acme Corp" --from 2024-01-01
memos kg-query Alice
memos kg-path Alice Carol --max-hops 3
memos kg-neighbors Alice --depth 2
memos kg-timeline Alice

from memos.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph()
kg.add_fact("Alice", "works-at", "Acme Corp", valid_from="2024-01-01")
facts = kg.query("Alice")
paths = kg.find_paths("Alice", "Carol", max_hops=3)
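find_paths amounts to a bounded path search over the fact triples. A self-contained sketch of the idea — treating edges as bidirectional is an assumption; MemOS may only follow directed edges:

```python
from collections import deque

def find_paths(triples, start, goal, max_hops=3):
    """Breadth-first path search over (subject, predicate, object) facts.
    Illustrative sketch of kg-path, not MemOS's implementation."""
    # Index edges both ways so "Alice works-at Acme" links Acme back to Alice.
    adj = {}
    for s, _p, o in triples:
        adj.setdefault(s, []).append(o)
        adj.setdefault(o, []).append(s)
    paths, queue = [], deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal and len(path) > 1:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:  # hop budget exhausted
            continue
        for nxt in adj.get(node, []):
            if nxt not in path:  # no revisiting within one path
                queue.append(path + [nxt])
    return paths

facts = [("Alice", "works-at", "Acme"), ("Bob", "works-at", "Acme"), ("Bob", "mentors", "Carol")]
print(find_paths(facts, "Alice", "Carol", max_hops=3))
```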

Living Wiki

Compile memories into entity-based markdown pages with backlinks:

memos wiki-living update          # scan memories, create/update entity pages
memos wiki-living read Alice      # print Alice's page
memos wiki-living search "python" # search across all pages
memos wiki-living lint            # find orphans, contradictions, empty pages

Memory decay

Memories age automatically. Important ones persist; stale ones fade.

memos decay --dry-run      # preview what would decay
memos decay --apply        # apply decay
memos prune --threshold 0.1  # delete memories below importance threshold
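One common way to model this is exponential half-life decay. A sketch under that assumption — MemOS's actual curve, and whether it weights access counts or tags, is not specified here:

```python
from datetime import datetime, timedelta, timezone

def decayed_importance(importance, last_reinforced, half_life_days=30.0, now=None):
    """Exponential half-life decay: importance halves every half_life_days
    without reinforcement. An illustrative model, not MemOS's formula."""
    now = now or datetime.now(timezone.utc)
    age_days = max((now - last_reinforced).total_seconds() / 86400.0, 0.0)
    return importance * 0.5 ** (age_days / half_life_days)

now = datetime.now(timezone.utc)
month_old = decayed_importance(0.8, now - timedelta(days=30), now=now)
print(round(month_old, 2))  # 0.4 -- one half-life elapsed
```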

Versioning and time-travel

Every write is versioned. Query the past, diff changes, roll back.

memos history <memory-id>
memos diff <memory-id> --latest
memos rollback <memory-id> --version 1 --yes
memos recall-at "user preferences" --at 2d    # as of 2 days ago
memos snapshot-at 1w                          # all memories 1 week ago
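The mechanics can be pictured as an append-only log per memory. A toy sketch (not MemOS's storage format) in which rollback is itself a new write, so history is never rewritten:

```python
import copy

class VersionLog:
    """Append-only version history for one memory -- an illustrative
    sketch of history/rollback, not MemOS's actual storage."""
    def __init__(self):
        self.versions = []

    def write(self, item) -> int:
        self.versions.append(copy.deepcopy(item))  # every write is a new version
        return len(self.versions)                  # 1-based version number

    def history(self):
        return list(enumerate(self.versions, start=1))

    def rollback(self, version: int) -> int:
        # Re-append the old state: the past stays intact, queryable as-of any point.
        return self.write(self.versions[version - 1])

log = VersionLog()
log.write({"content": "prefers light mode"})
log.write({"content": "prefers dark mode"})
log.rollback(1)
print(log.versions[-1]["content"])  # prefers light mode
```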

Multi-namespace (multi-agent)

Each agent gets its own isolated namespace:

memos --namespace agent-alice learn "Alice's memory"
memos --namespace agent-bob learn "Bob's memory"
memos --namespace agent-alice recall "what do I know?"

Development

git clone https://github.com/Mars375/memos
cd memos
pip install -e ".[dev]"

# Lint
ruff check src/ tests/
ruff format src/ tests/

# Tests (Python 3.11 / 3.12 / 3.13)
pytest -q --tb=short          # 1710 tests
pytest tests/test_core.py     # specific module

Architecture

MemOS is built around three core layers:

  • Capture — Mine conversations and events into structured memory units via the CLI, SDK, or MCP.
  • Engine — Storage, recall, decay, reinforcement, versioning, and knowledge graph. Pluggable backends (in-memory, JSON, ChromaDB, Qdrant, Pinecone).
  • Knowledge Surface — Living wiki, graph view, and context packs (wake_up, context_for, recall_enriched) that serve the right context at the right time.
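The pluggable-backend seam can be sketched as a small storage protocol. The method names upsert and list_all appear in the Golden path example above; treating them as the whole interface is a simplification — the real one surely includes delete, search, and stats:

```python
from typing import Any, Protocol

class MemoryStore(Protocol):
    """Minimal shape of a pluggable backend -- a sketch, not the real interface."""
    def upsert(self, item: Any, namespace: str) -> None: ...
    def list_all(self, namespace: str) -> list[Any]: ...

class InMemoryStore:
    """Trivial dict-backed backend: namespace -> list of items."""
    def __init__(self) -> None:
        self._data: dict[str, list[Any]] = {}

    def upsert(self, item: Any, namespace: str) -> None:
        self._data.setdefault(namespace, []).append(item)

    def list_all(self, namespace: str) -> list[Any]:
        return list(self._data.get(namespace, []))

store: MemoryStore = InMemoryStore()
store.upsert({"content": "hello"}, namespace="agent-alice")
print(len(store.list_all("agent-alice")))  # 1
```

Because the engine only depends on the protocol, swapping JSON for ChromaDB or Qdrant is a constructor argument, not a code change.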

See ROADMAP.md for planned features and PRD.md for product requirements.


License

MIT — Mars375

Download files

Source Distribution

memos_os-2.2.0.tar.gz (387.2 kB)

Built Distribution

memos_os-2.2.0-py3-none-any.whl (301.1 kB)

File details

Details for the file memos_os-2.2.0.tar.gz.

File metadata

  • Download URL: memos_os-2.2.0.tar.gz
  • Size: 387.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for memos_os-2.2.0.tar.gz
Algorithm Hash digest
SHA256 7684f1e0655b073f67032f98d54ad45e9ab8f6b5ad99a7bd2da6a9ceb48c0ae8
MD5 2a26ba5e82766b4e3e63f1d15e1b4814
BLAKE2b-256 5b3cc795a1ef1b2bb01d6a9c5aa4cb9fa145ee6b7f449525b30cd069133bf394


Provenance

The following attestation bundles were made for memos_os-2.2.0.tar.gz:

Publisher: publish.yml on Mars375/memos


File details

Details for the file memos_os-2.2.0-py3-none-any.whl.

File metadata

  • Download URL: memos_os-2.2.0-py3-none-any.whl
  • Size: 301.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for memos_os-2.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ed2efa47f1bc455de8d69e0a936f31d8669683ea20dc25fa86d1ff03b523877a
MD5 31413c709a39f390a11d0088a99b40d2
BLAKE2b-256 5e3e4c42ba6083fb286b192cb150e26028c7bca8723112ac6624cb60b8d02b60


Provenance

The following attestation bundles were made for memos_os-2.2.0-py3-none-any.whl:

Publisher: publish.yml on Mars375/memos

