Human-like memory for AI agents. 10x cheaper than RAG. Zero vector DB needed.
om-memory
Human-like memory for AI agents. Cheaper than RAG. Zero vector DB needed.
om-memory is a Python implementation of Observational Memory (OM) — a smarter approach to AI agent memory inspired by Mastra's OM architecture. Instead of stuffing full conversation history into every API call, OM compresses old messages into dense observations using two background agents (Observer & Reflector).
Benchmark Results (Real API Calls)
Tested with gpt-4o-mini over a 50-turn HR chatbot conversation:
| Metric | Traditional RAG | om-memory | Improvement |
|---|---|---|---|
| Total tokens | 73,599 | 54,058 | 27% savings |
| Per-turn at turn 50 | 2,824 tokens | 1,559 tokens | 45% savings |
| Memory accuracy | 100% (full history) | 100% (8/8 recall) | No loss |
| Context growth | Linear O(n) | Flat O(1) | Stable |
RAG token usage grows linearly with every turn. om-memory stays flat — the longer the conversation, the bigger the savings.
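The percentages in the table follow directly from the raw token counts. A quick check in Python, using only the figures reported above (the helper name `savings_pct` is ours for illustration and is not part of om-memory):

```python
def savings_pct(baseline: int, candidate: int) -> float:
    """Percentage reduction of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100


# Token counts copied from the benchmark table.
total = savings_pct(73_599, 54_058)
per_turn = savings_pct(2_824, 1_559)

print(f"total: {total:.1f}%, per-turn at turn 50: {per_turn:.1f}%")
```

Rounded to whole percentages, both values match the "Improvement" column.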
How It Works
Traditional RAG: [System] + [KB] + [ALL Messages] ← grows every turn
om-memory: [System] + [KB] + [Observations] + [Last 2 msgs] ← stays flat
- Observer: When message history exceeds a token threshold, compresses messages into concise observations (facts, decisions, preferences)
- Reflector: When observations pile up, merges and prunes them — like a garbage collector for memory
- Context Builder: Serves a two-block context: compressed observations + recent messages
The result: your agent remembers everything important without carrying every raw message.
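To make the division of labour concrete, here is a minimal toy sketch of the three roles. It is not om-memory's implementation: the class `ToyMemory`, its thresholds, and the word-count token estimate are all hypothetical, and the real Observer and Reflector presumably rely on an LLM where this sketch merely truncates and deduplicates strings.

```python
from dataclasses import dataclass, field


def rough_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


@dataclass
class ToyMemory:
    # All three settings are hypothetical values chosen for illustration only.
    observer_threshold: int = 12   # raw-message tokens that trigger the Observer
    max_observations: int = 3      # observation count that triggers the Reflector
    keep_recent: int = 2           # raw messages always left uncompressed
    messages: list = field(default_factory=list)
    observations: list = field(default_factory=list)

    def add(self, text: str) -> None:
        self.messages.append(text)
        if sum(rough_tokens(m) for m in self.messages) > self.observer_threshold:
            self._observe()
        if len(self.observations) > self.max_observations:
            self._reflect()

    def _observe(self) -> None:
        # Observer stand-in: turn everything except the newest messages into
        # short notes. A real Observer would extract facts, not truncate text.
        old = self.messages[:-self.keep_recent]
        self.messages = self.messages[-self.keep_recent:]
        self.observations.extend(f"note: {m[:40]}" for m in old)

    def _reflect(self) -> None:
        # Reflector stand-in: drop exact duplicates, then keep the newest notes.
        unique = list(dict.fromkeys(self.observations))
        self.observations = unique[-self.max_observations:]

    def context(self) -> str:
        # Context Builder stand-in: observations block, then recent messages.
        return "\n".join(["[Observations]", *self.observations,
                          "[Recent]", *self.messages])


memory = ToyMemory()
for turn in range(10):
    memory.add(f"turn {turn}: employee asks about policy {turn}")
print(memory.context())
```

However many turns are added, the context in this sketch never holds more than `max_observations` notes plus `keep_recent` raw messages, which is the "stays flat" behaviour described above.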
Quick Start
import asyncio
from om_memory import ObservationalMemory


async def main():
    # 1. Initialize (uses SQLite + OPENAI_API_KEY by default)
    om = ObservationalMemory()
    await om.ainitialize()
    thread_id = "user_123"

    # 2. Get compressed context for your prompt
    context = await om.aget_context(thread_id)

    # 3. Use it in your LLM call
    prompt = f"You are a helpful assistant.\n{context}\nUser: Hello!"
    response = "Hello! How can I help?"  # your LLM call here

    # 4. Tell OM what happened
    await om.aadd_message(thread_id, "user", "Hello!")
    await om.aadd_message(thread_id, "assistant", response)


asyncio.run(main())
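In a real application, steps 2 to 4 repeat on every turn, so it can help to wrap them in one coroutine. The sketch below does that under a few assumptions: `chat_turn`, `echo_llm`, and `StubMemory` are hypothetical names introduced here, `StubMemory` is an in-memory stand-in so the example runs without an API key, and `aget_context` is assumed to return a string, as the f-string in the Quick Start suggests. Only `aget_context` and `aadd_message` come from the Quick Start itself.

```python
import asyncio


class StubMemory:
    """Hypothetical stand-in exposing the two calls used in the Quick Start."""

    def __init__(self):
        self.log = {}

    async def aget_context(self, thread_id):
        turns = self.log.get(thread_id, [])
        return "\n".join(f"{role}: {text}" for role, text in turns)

    async def aadd_message(self, thread_id, role, content):
        self.log.setdefault(thread_id, []).append((role, content))


async def echo_llm(prompt):
    """Placeholder for a real LLM call."""
    return f"(reply to a {len(prompt)}-character prompt)"


async def chat_turn(om, thread_id, user_text, llm=echo_llm):
    """One turn: fetch context, call the model, then record both messages."""
    context = await om.aget_context(thread_id)
    prompt = f"You are a helpful assistant.\n{context}\nUser: {user_text}"
    reply = await llm(prompt)
    await om.aadd_message(thread_id, "user", user_text)
    await om.aadd_message(thread_id, "assistant", reply)
    return reply


async def demo():
    om = StubMemory()
    for text in ["Hello!", "How many vacation days do I have?"]:
        await chat_turn(om, "user_123", text)
    return om.log["user_123"]


history = asyncio.run(demo())
print(history)
```

If the assumptions hold, passing an initialized `ObservationalMemory` instance in place of `StubMemory()` should work unchanged, since `chat_turn` only relies on those two methods.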
Configuration
from om_memory import ObservationalMemory, OMConfig

config = OMConfig(
    observer_token_threshold=300,    # compress after ~3 exchanges
    reflector_token_threshold=1500,  # GC observations early
    message_retention_count=2,       # keep last 2 messages uncompressed
    message_token_budget=200,        # token budget for recent messages
)

om = ObservationalMemory(api_key="sk-...", config=config)
Installation
pip install om-memory
Optional extras:
pip install "om-memory[postgres]"   # PostgreSQL storage
pip install "om-memory[anthropic]"  # Anthropic provider
pip install "om-memory[gemini]"     # Google Gemini provider
pip install "om-memory[dashboard]"  # Streamlit dashboard
Why Not Traditional RAG?
| | Traditional RAG | om-memory |
|---|---|---|
| Context size | Grows linearly with turns | Stays flat |
| Infrastructure | Vector DB + embeddings | SQLite (zero setup) |
| Memory type | Retrieves fragments | Maintains narrative |
| Long conversations | Hits context limit | Unlimited |
| Cost trend | Increases per turn | Stable per turn |
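The "Context size" and "Cost trend" rows can be illustrated with a toy model of prompt size per turn. The parameter values below are made up purely for illustration; they are not measurements from om-memory, and the function names are hypothetical.

```python
def full_history_prompt(turn: int, fixed: int, per_turn: int) -> int:
    """Prompt tokens at a given turn when all earlier turns are resent."""
    return fixed + per_turn * turn


def flat_prompt(fixed: int, observations: int, recent: int) -> int:
    """Prompt tokens when history is capped at observations + recent messages."""
    return fixed + observations + recent


# Illustrative parameters only, not measurements from om-memory.
FIXED, PER_TURN, OBS, RECENT = 500, 50, 800, 200

for turn in (10, 50, 200):
    full = full_history_prompt(turn, FIXED, PER_TURN)
    flat = flat_prompt(FIXED, OBS, RECENT)
    print(f"turn {turn:>3}: full history {full:>6} tokens, capped {flat:>5} tokens")
```

With these particular numbers the capped prompt is actually larger in a short conversation and only pulls ahead after turn 20, which is one way to read "the longer the conversation, the bigger the savings".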
Try It Yourself
Run the benchmark locally:
pip install om-memory openai
export OPENAI_API_KEY="sk-..."
python demo/generate_graphs.py
Or try the interactive Colab notebook.
License
Apache 2.0