Universal plug-and-play memory engine for AI agents. Built on Graphiti's temporal knowledge graph with token-budget optimization.
MemGraph — Universal AI Agent Memory Engine
Give any AI agent perfect memory in 3 lines of code.
MemGraph wraps Graphiti's temporal knowledge graph with a token-budget allocator that packs the maximum signal into the minimum tokens. Most agent turns need zero long-term memory — those turns pay zero memory tokens.
The Problem
Every "agent memory" system does one of two dumb things:
| Approach | Problem |
|---|---|
| Dump the whole history | Wastes thousands of tokens every turn. Expensive and slow. |
| Summarise into one paragraph | Lossy. Misses important details. Can't answer precise questions. |
The MemGraph Solution
Knowledge graph + hybrid retrieval + token-budget allocator
- Knowledge graph storage — facts, entities, and temporal relationships via Graphiti + FalkorDB.
- Hybrid retrieval — FAISS vector search + graph neighbour expansion + BM25, re-ranked by relevance × recency × centrality.
- Token-budget allocator — given a hard ceiling (e.g. 1000 tokens), pack the most information-dense fragments that fit. When nothing is relevant, return empty — saving 100% of memory tokens.
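The allocator's core idea can be sketched as a density-ordered greedy knapsack. This is a simplified illustration, not MemGraph's actual implementation; the `Fragment` type, the relevance floor, and the scores are all hypothetical stand-ins:

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    text: str
    tokens: int       # pre-counted token cost of this fragment
    relevance: float  # combined relevance x recency x centrality score

def pack(fragments: list[Fragment], budget: int, min_score: float = 0.3) -> list[Fragment]:
    """Greedily pack the most information-dense fragments under a hard token ceiling."""
    # Drop fragments below the relevance floor. If nothing survives,
    # the turn pays zero memory tokens.
    candidates = [f for f in fragments if f.relevance >= min_score]
    # Sort by score-per-token (information density), best first.
    candidates.sort(key=lambda f: f.relevance / f.tokens, reverse=True)
    packed, used = [], 0
    for f in candidates:
        if used + f.tokens <= budget:
            packed.append(f)
            used += f.tokens
    return packed

frags = [
    Fragment("Luke prefers TypeScript.", tokens=6, relevance=0.9),
    Fragment("Long tangent about the weather...", tokens=500, relevance=0.1),
    Fragment("Luke runs Ollama locally.", tokens=7, relevance=0.8),
]
print([f.text for f in pack(frags, budget=100)])
```

Greedy density packing is not optimal for the general knapsack problem, but it is fast and close enough when fragments are small relative to the budget.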
Benchmark Results
Measured on a synthetic 40-turn conversation with 10 test queries. Hardware: Ryzen 5800H, 32GB DDR4, nomic-embed-text (local).
| Approach | Avg Tokens / Query | P95 Latency | Token Reduction |
|---|---|---|---|
| Raw context (full history) | ~3,200 | — | baseline |
| Sliding window (10 turns) | ~820 | — | −74% |
| MemGraph (budget=1000) | ~310 | < 30ms | −90% |
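The reduction column is simple arithmetic over the token averages, as a quick sanity check:

```python
def reduction(baseline: int, actual: int) -> int:
    """Percentage of tokens saved relative to the raw-context baseline."""
    return round(100 * (baseline - actual) / baseline)

print(reduction(3200, 820))  # sliding window → 74
print(reduction(3200, 310))  # MemGraph       → 90
```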
Run the benchmarks yourself:
```shell
uv run python benchmarks/token_comparison.py --turns 40 --budget 1000
uv run python benchmarks/retrieval_speed.py --facts 200 --queries 50
uv run python benchmarks/quality_eval.py --budget 800
```
Quickstart
Install
```shell
pip install memgraph-agent
```
Check prerequisites (Docker, FalkorDB, Ollama)
```shell
memgraph-setup        # check only
memgraph-setup --fix  # auto-start missing services
```
One-command infra
```shell
docker compose up -d  # starts FalkorDB on :6379 and :3000 (browser UI)
```
Use it
```python
from memgraph import MemGraph

async with await MemGraph.create() as mg:
    # Store anything — plain text, messages, JSON
    await mg.store("Luke prefers TypeScript and runs Ollama locally.")
    await mg.store([
        {"role": "user", "content": "What's the ATLAS stack?"},
        {"role": "assistant", "content": "Express.js + Qwen3 via Ollama."},
    ])

    # Retrieve — only pays token cost when relevant context exists
    ctx = await mg.query("What stack does Luke use?", token_budget=500)
    print(ctx.text)         # formatted, token-capped, ready to inject
    print(ctx.tokens_used)  # exact token count (0 if nothing relevant)
    print(ctx.is_empty)     # True when nothing relevant found
```
That's it. No vector DB setup, no embedding server to manage (uses Ollama locally by default).
Install Options
```shell
# Minimal (core + REST server + MCP)
pip install memgraph-agent

# With OpenAI client (for OpenAI Agents SDK adapter)
pip install memgraph-agent[openai]

# With Anthropic client
pip install memgraph-agent[anthropic]

# With LangChain adapter
pip install memgraph-agent[langchain]

# Everything
pip install memgraph-agent[all]
```
Setup Guide
Prerequisites
| Requirement | Notes |
|---|---|
| Python 3.11+ | python --version |
| Docker | For FalkorDB. Get Docker |
| Ollama (local) | For embeddings. ollama.com |
| OpenAI API key | For Graphiti entity extraction (even with local embeddings) |
Automated setup
```shell
# Check everything
memgraph-setup

# Auto-fix: creates FalkorDB container, starts Ollama, creates .env
memgraph-setup --fix

# Run only specific checks
memgraph-setup --check Docker FalkorDB "OPENAI_API_KEY"
```
The checker validates:
- Python version ≥ 3.11
- Docker installed and daemon running
- FalkorDB container reachable (creates it if `--fix`)
- Ollama API responsive (starts `ollama serve` if `--fix`)
- `nomic-embed-text` model pulled (runs `ollama pull` if `--fix`)
- `.env` file exists with required keys (copies from `.env.example` if `--fix`)
- `OPENAI_API_KEY` is set and non-placeholder
Manual setup
```shell
# 1. Clone
git clone https://github.com/SP3DK1D/Memomatic.git && cd Memomatic

# 2. Install
pip install -e ".[dev]"
# or with uv:
uv sync

# 3. Configure
cp .env.example .env
# edit .env — set OPENAI_API_KEY at minimum

# 4. Start FalkorDB
docker compose up -d falkordb

# 5. Pull embedding model
ollama pull nomic-embed-text

# 6. Verify
memgraph setup
```
Configuration
MemGraph is configured via config.yaml (overridable via environment variables):
```yaml
graph:
  host: localhost
  port: 6379
  database: memgraph  # keeps data separate from other FalkorDB projects

embeddings:
  provider: ollama         # or "openai"
  model: nomic-embed-text  # or "text-embedding-3-small"
  dim: 768

retrieval:
  top_k: 20
  default_token_budget: 2000

formatter:
  default_format: claude_xml  # claude_xml | openai_system | markdown | json
```
All config.yaml keys can be overridden via environment variables — see .env.example.
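A common convention for this kind of override, shown here as a sketch only (the actual variable names live in `.env.example`), is to map nested YAML keys onto upper-cased, underscore-joined environment variables:

```python
import os

def apply_env_overrides(config: dict, prefix: str = "MEMGRAPH") -> dict:
    """Override nested config keys from PREFIX_SECTION_KEY environment variables."""
    for section, values in config.items():
        for key in values:
            env_name = f"{prefix}_{section}_{key}".upper()
            if env_name in os.environ:
                raw = os.environ[env_name]
                current = values[key]
                # Coerce the string back to the type of the YAML default.
                if isinstance(current, bool):
                    values[key] = raw.lower() == "true"
                else:
                    values[key] = type(current)(raw)
    return config

os.environ["MEMGRAPH_GRAPH_PORT"] = "7000"
config = {"graph": {"host": "localhost", "port": 6379}}
print(apply_env_overrides(config)["graph"]["port"])  # → 7000
```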
What's in the Box
| Module | Purpose |
|---|---|
| `memgraph.core.MemGraph` | High-level entry point: `store()`, `query()`, `forget()`, `stats()` |
| `memgraph.token_budget` | Greedy knapsack packer — the key differentiator |
| `memgraph.retrieval` | FAISS + graph hybrid search with scoring |
| `memgraph.formatter` | Claude XML / OpenAI system / Markdown / JSON output |
| `memgraph.setup_check` | Prerequisite checker + auto-installer |
| `server.app` | FastAPI REST server (`memgraph-server`) |
| `server.mcp.handler` | MCP protocol handler for Claude Code / Cursor / Windsurf |
| `adapters.GenericAgentAdapter` | Two-hook adapter for any agent framework |
| `adapters.MemGraphChatMemory` | LangChain `BaseChatMemory` drop-in |
| `adapters.MemGraphHooks` | OpenAI Agents SDK hooks |
| `adapters.SimpleOpenAIAgentAdapter` | Works with raw `openai` chat completions |
| `adapters.AdapterGenerator` | Auto-generates Node.js / TypeScript / Python adapters |
| `scanner` | Detects existing memory systems in any codebase |
| `bridge` | HTTP + IPC bridge for non-Python agents |
| `cli` | Full CLI (`memgraph setup / scan / integrate / query / store`) |
Framework Adapters
Any Python agent (3 lines)
```python
from adapters.generic import GenericAgentAdapter

adapter = GenericAgentAdapter(mg, token_budget=1500, fmt="claude_xml")
context = await adapter.before_turn(user_message)  # inject into prompt
await adapter.after_turn(messages)                 # store the turn
```
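Wired into a typical agent loop, the two hooks bracket each model call. The sketch below stubs out the adapter and the model so the shape is self-contained; `StubAdapter` and `run_llm` are placeholders for the real adapter and your actual LLM call:

```python
import asyncio

class StubAdapter:
    """Stand-in with the same two-hook shape as the generic adapter."""
    async def before_turn(self, user_message: str) -> str:
        return "<agent_memory>Luke prefers TypeScript.</agent_memory>"
    async def after_turn(self, messages: list[dict]) -> None:
        pass  # the real adapter would store the turn in the graph

async def run_llm(messages: list[dict]) -> str:
    return "stub reply"  # placeholder for your actual model call

async def agent_turn(adapter, user_message: str) -> str:
    context = await adapter.before_turn(user_message)  # hook 1: retrieve memory
    messages = [
        {"role": "system", "content": context},  # inject only when non-empty
        {"role": "user", "content": user_message},
    ]
    reply = await run_llm(messages)
    # hook 2: store the completed turn
    await adapter.after_turn(messages + [{"role": "assistant", "content": reply}])
    return reply

print(asyncio.run(agent_turn(StubAdapter(), "What stack does Luke use?")))  # → stub reply
```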
LangChain
```python
from adapters.langchain import MemGraphChatMemory

memory = MemGraphChatMemory(mg, token_budget=1500)
chain = LLMChain(llm=chat_model, prompt=prompt, memory=memory)
# Works exactly like ConversationBufferMemory — but token-aware
```
OpenAI Agents SDK
```python
from adapters.openai_agents import MemGraphHooks

agent = Agent(
    name="MyAgent",
    instructions="You are a helpful assistant.",
    hooks=MemGraphHooks(mg),
)
```
Raw OpenAI completions
```python
from adapters.openai_agents import SimpleOpenAIAgentAdapter

adapter = SimpleOpenAIAgentAdapter(mg)
messages = await adapter.before_completion(messages)
response = await client.chat.completions.create(model="gpt-4o", messages=messages)
await adapter.after_completion(messages, response.choices[0].message.content)
```
Node.js / TypeScript agents (ATLAS, OpenClaw, etc.)
```shell
# Auto-detect your agent's memory system and generate an adapter
memgraph integrate /path/to/your/agent

# Or apply it directly
memgraph integrate /path/to/your/agent --apply
```
Then add ONE line to your agent's entry point:
```javascript
require('./memgraph-adapter').patch(); // Node.js
// or
import './memgraph-adapter'; // TypeScript — adapter auto-patches at import
```
MCP Integration (Claude Code / Cursor / Windsurf)
Run the MCP server:
```shell
memgraph-server
# or
uv run uvicorn server.app:app --port 8100
```
Add to your MCP client config:
```json
{
  "mcpServers": {
    "memgraph": {
      "url": "http://localhost:8100/mcp"
    }
  }
}
```
Tools exposed: store_memory, query_memory, forget, memory_stats, generate_context.
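Once connected, an MCP client invokes these tools with standard `tools/call` requests. A `store_memory` call would look roughly like this (the argument names mirror the REST API and are illustrative, not confirmed against the handler):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store_memory",
    "arguments": {
      "content": "Luke prefers TypeScript.",
      "group_id": "session-1"
    }
  }
}
```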
See examples/claude_code_mcp.py for a complete walkthrough.
REST API
```shell
# Start server
memgraph-server  # listens on :8100

# Store a memory
curl -X POST http://localhost:8100/store \
  -H 'Content-Type: application/json' \
  -d '{"content": "Luke prefers TypeScript.", "group_id": "session-1"}'

# Query memory
curl -X POST http://localhost:8100/query \
  -H 'Content-Type: application/json' \
  -d '{"query": "What stack does Luke prefer?", "token_budget": 500}'

# Health check
curl http://localhost:8100/health
```
Interactive docs at http://localhost:8100/docs.
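The same two endpoints from Python, using only the standard library. A sketch that assumes the server above is running on :8100; the live calls are commented out so the snippet stands alone:

```python
import json
import urllib.request

BASE = "http://localhost:8100"

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a MemGraph endpoint."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def post(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON reply."""
    with urllib.request.urlopen(build_request(path, payload)) as resp:
        return json.load(resp)

# With the server running, a round-trip looks like:
#   post("/store", {"content": "Luke prefers TypeScript.", "group_id": "session-1"})
#   ctx = post("/query", {"query": "What stack does Luke prefer?", "token_budget": 500})
#   print(ctx)
```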
CLI
```shell
# Check prerequisites
memgraph setup
memgraph setup --fix  # auto-install missing

# Scan a project's memory architecture
memgraph scan /path/to/atlas

# Generate + apply an adapter
memgraph integrate /path/to/atlas --apply

# Store and query directly
memgraph store "Luke runs Ollama on a Ryzen 5800H."
memgraph query "What hardware does Luke use?" --budget 300

# Check server health
memgraph status
```
Output Formats
```python
from memgraph.formatter import OutputFormat

# Claude XML (default — Claude parses this most efficiently)
ctx = await mg.query(q, fmt=OutputFormat.CLAUDE_XML)
# → <agent_memory token_budget="1000">...</agent_memory>

# OpenAI system message
ctx = await mg.query(q, fmt=OutputFormat.OPENAI_SYSTEM)
# → [MEMORY CONTEXT] ... [/MEMORY CONTEXT]

# Markdown (for CLAUDE.md injection)
ctx = await mg.query(q, fmt=OutputFormat.MARKDOWN)
# → ## Agent Memory\n- **ATLAS**: ...

# Raw JSON
ctx = await mg.query(q, fmt=OutputFormat.JSON)
# → {"entities": [...], "facts": [...], ...}
```
Roadmap
- Phase 1: Core engine (FAISS + Graphiti + token budget + formatters)
- Phase 2: FastAPI REST + MCP server
- Phase 3: Framework adapters (LangChain, OpenAI Agents, generic), benchmarks
- Phase 4: PyPI release, setup checker, CI/CD
- Hosted API tier (pay-as-you-go, no Docker required)
- LangGraph adapter
- Streaming retrieval (yield fragments as they score)
- Web dashboard (FalkorDB browser + token savings metrics)
Development
```shell
# Install with dev deps
uv sync --extra dev

# Run tests (pure, no live services needed)
uv run pytest tests/ -v

# Run tests with coverage
uv run pytest tests/ --cov=memgraph --cov-report=term-missing

# Lint + format
uv run ruff check .
uv run ruff format .

# Type check
uv run mypy memgraph/

# Run benchmarks (requires live FalkorDB + Ollama)
uv run python benchmarks/token_comparison.py
uv run python benchmarks/retrieval_speed.py
uv run python benchmarks/quality_eval.py
```
Project structure
```
memgraph/    Core Python package (store, query, forget, token budget)
server/      FastAPI REST + MCP server
adapters/    Framework adapters (LangChain, OpenAI Agents, generic, Node.js templates)
scanner/     Project scanner (detects existing memory systems)
bridge/      HTTP + IPC bridge for non-Python agents
cli/         Unified CLI (memgraph setup / scan / integrate / query)
benchmarks/  Token comparison, speed, and quality benchmarks
examples/    Working end-to-end examples
tests/       pytest suite (pure module tests, no live services)
```
License
MIT — see LICENSE.
Credits
Built by Luke (TechItLuke) on top of:
- Graphiti by Zep — temporal knowledge graph engine
- FalkorDB — lightweight Redis-compatible graph database
- FAISS by Meta — in-process vector search
- nomic-embed-text — free local embeddings via Ollama