Human-like memory for AI agents. Cheaper than RAG. Zero vector DB needed.

om-memory

om-memory is a Python implementation of Observational Memory (OM), an approach to AI agent memory inspired by Mastra's OM architecture. Instead of sending the full conversation history with every API call, OM compresses older messages into dense observations using two background agents, the Observer and the Reflector.

Benchmark Results (Real API Calls)

Tested with gpt-4o-mini over a 50-turn HR chatbot conversation:

| Metric | Traditional RAG | om-memory | Improvement |
| --- | --- | --- | --- |
| Total tokens | 73,599 | 54,058 | 27% savings |
| Per-turn at turn 50 | 2,824 tokens | 1,559 tokens | 45% savings |
| Memory accuracy | 100% (full history) | 100% (8/8 recall) | No loss |
| Context growth | Linear O(n) | Flat O(1) | Stable |
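
The Improvement column follows directly from the token counts in the table. A minimal check using only the reported figures (the helper name `savings_pct` is ours and is not part of om-memory):

```python
def savings_pct(baseline: int, reduced: int) -> float:
    """Percentage of tokens saved relative to the baseline."""
    return (baseline - reduced) / baseline * 100

total = savings_pct(73_599, 54_058)   # total tokens over the 50-turn run
turn_50 = savings_pct(2_824, 1_559)   # prompt size at turn 50

print(f"total: {total:.1f}%  turn 50: {turn_50:.1f}%")
# total: 26.6%  turn 50: 44.8%
```

Rounded to whole percentages, these give the 27% and 45% shown in the table.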

RAG token usage grows linearly with every turn. om-memory stays flat — the longer the conversation, the bigger the savings.

How It Works

Traditional RAG:    [System] + [KB] + [ALL Messages]        ← grows every turn
om-memory:          [System] + [KB] + [Observations] + [Last 2 msgs]  ← stays flat
  1. Observer: When message history exceeds a token threshold, compresses messages into concise observations (facts, decisions, preferences)
  2. Reflector: When observations pile up, merges and prunes them — like a garbage collector for memory
  3. Context Builder: Serves a two-block context: compressed observations + recent messages

The result: your agent remembers everything important without carrying every raw message.
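
The three steps above can be sketched as a toy pipeline. This is an illustration of the idea under simplifying assumptions, not om-memory's implementation: `ToyMemory` and all of its fields are hypothetical, tokens are approximated by word count, and the Observer and Reflector are replaced by trivial string operations where the real agents would call an LLM.

```python
from dataclasses import dataclass, field

def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())

@dataclass
class ToyMemory:
    observer_threshold: int = 30    # compress messages above this many tokens
    reflector_threshold: int = 20   # prune observations above this many tokens
    keep_recent: int = 2            # messages left uncompressed (must be >= 1)
    messages: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)

    def add_message(self, text: str) -> None:
        self.messages.append(text)
        if sum(map(count_tokens, self.messages)) > self.observer_threshold:
            self._observe()
        if sum(map(count_tokens, self.observations)) > self.reflector_threshold:
            self._reflect()

    def _observe(self) -> None:
        # Observer: turn all but the most recent messages into short observations.
        old = self.messages[:-self.keep_recent]
        self.messages = self.messages[-self.keep_recent:]
        # Placeholder "summary": the first five words of each old message.
        self.observations += [" ".join(m.split()[:5]) for m in old]

    def _reflect(self) -> None:
        # Reflector: drop exact duplicates, then keep only the newest three.
        unique = list(dict.fromkeys(self.observations))
        self.observations = unique[-3:]

    def context(self) -> str:
        # Context builder: observations block followed by recent messages.
        return "\n".join(["[Observations]", *self.observations,
                          "[Recent]", *self.messages])

mem = ToyMemory()
for i in range(20):
    mem.add_message(f"turn {i}: the user said something fairly long about topic {i}")
print(mem.context())
```

However many turns are added, the printed context stays bounded: at most a handful of observations plus the last two messages. Note that this toy Reflector simply discards older observations, which loses information; a real one would merge them.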

Quick Start

import asyncio
from om_memory import ObservationalMemory

async def main():
    # 1. Initialize (uses SQLite + OPENAI_API_KEY by default)
    om = ObservationalMemory()
    await om.ainitialize()

    thread_id = "user_123"

    # 2. Get compressed context for your prompt
    context = await om.aget_context(thread_id)

    # 3. Use it in your LLM call
    prompt = f"You are a helpful assistant.\n{context}\nUser: Hello!"
    response = "Hello! How can I help?"  # your LLM call here

    # 4. Tell OM what happened
    await om.aadd_message(thread_id, "user", "Hello!")
    await om.aadd_message(thread_id, "assistant", response)

asyncio.run(main())
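
In a real application the four steps above repeat on every turn, so it can help to wrap them in one coroutine. The sketch below is an assumption about usage, not part of om-memory: `FakeMemory` is a hypothetical in-memory stand-in that exposes the same two method names used in the Quick Start (`aget_context`, `aadd_message`) so the example runs without an API key, and `call_llm` is a placeholder for your own model call.

```python
import asyncio

class FakeMemory:
    """Hypothetical stand-in mirroring the two methods used in the Quick Start."""

    def __init__(self) -> None:
        self._log: dict[str, list[str]] = {}

    async def aget_context(self, thread_id: str) -> str:
        return "\n".join(self._log.get(thread_id, []))

    async def aadd_message(self, thread_id: str, role: str, content: str) -> None:
        self._log.setdefault(thread_id, []).append(f"{role}: {content}")

async def call_llm(prompt: str) -> str:
    # Placeholder: replace with a real model call.
    return f"(reply to a {len(prompt)}-character prompt)"

async def chat_turn(om, thread_id: str, user_text: str) -> str:
    context = await om.aget_context(thread_id)            # fetch memory first
    prompt = f"You are a helpful assistant.\n{context}\nUser: {user_text}"
    reply = await call_llm(prompt)
    await om.aadd_message(thread_id, "user", user_text)    # then record both sides
    await om.aadd_message(thread_id, "assistant", reply)
    return reply

async def demo() -> list[str]:
    om = FakeMemory()
    return [await chat_turn(om, "user_123", text)
            for text in ("Hello!", "What did I just say?")]

replies = asyncio.run(demo())
print(replies)
```

Passing the memory object in as a parameter means an `ObservationalMemory` instance can be swapped in for the stand-in, provided its methods behave as shown in the Quick Start.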

Configuration

from om_memory import ObservationalMemory, OMConfig

config = OMConfig(
    observer_token_threshold=300,    # compress after ~3 exchanges
    reflector_token_threshold=1500,  # GC observations early
    message_retention_count=2,       # keep last 2 messages uncompressed
    message_token_budget=200,        # token budget for recent messages
)

om = ObservationalMemory(api_key="sk-...", config=config)
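
When choosing `observer_token_threshold` for your own traffic, a rough estimate of how many exchanges fit under the threshold is usually enough. The sketch below assumes about 4 characters per token, a common rule of thumb for English text and not om-memory's actual token counter; `estimate_tokens` and `exchanges_until_compression` are hypothetical helpers.

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; real tokenizers will differ."""
    return math.ceil(len(text) / chars_per_token)

def exchanges_until_compression(threshold: int, sample_exchange: tuple[str, str]) -> int:
    """Number of user/assistant exchanges needed to exceed the threshold."""
    per_exchange = max(1, sum(estimate_tokens(m) for m in sample_exchange))
    return threshold // per_exchange + 1

sample = (
    "How many vacation days do I have left this year?",
    "You have 12 of your 20 vacation days remaining. " * 6,
)
print(exchanges_until_compression(300, sample))
```

Substitute a typical user message and assistant reply from your own logs for `sample` to see roughly how often compression would trigger.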

Installation

pip install om-memory

Optional extras:

pip install "om-memory[postgres]"     # PostgreSQL storage
pip install "om-memory[anthropic]"    # Anthropic provider
pip install "om-memory[gemini]"       # Google Gemini provider
pip install "om-memory[dashboard]"    # Streamlit dashboard

Why Not Traditional RAG?

| | Traditional RAG | om-memory |
| --- | --- | --- |
| Context size | Grows linearly with turns | Stays flat |
| Infrastructure | Vector DB + embeddings | SQLite (zero setup) |
| Memory type | Retrieves fragments | Maintains narrative |
| Long conversations | Hits context limit | Unlimited |
| Cost trend | Increases per turn | Stable per turn |
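
The "Cost trend" row has a consequence worth spelling out: if the per-turn context grows linearly, the total tokens sent over a whole conversation grow quadratically, while a flat per-turn context gives linear total growth. A small model of this, with illustrative numbers that are assumptions and not measurements (`base`, `per_turn` and `flat_memory` are hypothetical parameters):

```python
from typing import Optional

def cumulative_tokens(turns: int, base: int, per_turn: int,
                      flat_memory: Optional[int] = None) -> int:
    """Total prompt tokens sent over a conversation.

    Full history: turn t sends base + t * per_turn tokens.
    Flat memory:  every turn sends base + flat_memory tokens.
    """
    if flat_memory is None:
        return sum(base + t * per_turn for t in range(1, turns + 1))
    return turns * (base + flat_memory)

for n in (50, 100, 200):
    full = cumulative_tokens(n, base=500, per_turn=50)
    flat = cumulative_tokens(n, base=500, per_turn=50, flat_memory=1_000)
    print(n, full, flat, f"{1 - flat / full:.0%} saved")
```

In this model, doubling the number of turns doubles the flat-memory total but more than triples the full-history total, which is why the relative savings widen as conversations get longer.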

Try It Yourself

Run the benchmark locally:

pip install om-memory openai
export OPENAI_API_KEY="sk-..."
python demo/generate_graphs.py

Or try the interactive Colab notebook.

License

Apache 2.0
