A self-consolidating memory layer for AI agents with schema-first design, intelligent merging, and hybrid search capabilities
🧠 Ontomem: The Self-Consolidating Memory
Ontomem is built on the concept of Ontology Memory—structured, coherent knowledge representation for AI systems.
Give your AI agent a "coherent" memory, not just "fragmented" retrieval.
Traditional RAG (Retrieval-Augmented Generation) systems retrieve text fragments. Ontomem maintains structured entities using Pydantic schemas and intelligent merging algorithms.
It excels at Time-Series Consolidation: effortlessly merging streaming observations (like logs or chat turns) into coherent "Daily Snapshots" or "Session Summaries" simply by defining a composite key (e.g., user_id + date).
It doesn't just store data—it continuously "digests" and "organizes" it.
✨ Why Ontomem?
🧩 Schema-First & Type-Safe
Built on Pydantic. All memories are strongly-typed objects. Say goodbye to {"unknown": "dict"} hell and embrace IDE autocomplete and type checking.
⏱️ Temporal Consolidation (Time-Slicing)
Ontomem isn't just about ID deduplication. By using Composite Keys (e.g., lambda x: f"{x.user}_{x.date}"), you can automatically aggregate a day's worth of fragmented events into a Single Daily Record.
- Input: 1,000 fragmented logs/observations throughout the day.
- Output: 1 structured, LLM-synthesized "Daily Summary" object.
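The grouping step behind a composite key can be sketched in plain Python. This is a minimal, standalone illustration rather than Ontomem's implementation: the `Event`, `DailyRecord`, and `consolidate` names are hypothetical, and the LLM-written summary is left out so that only the keying and accumulation are shown.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    user: str
    date: str
    action: str


@dataclass
class DailyRecord:
    user: str
    date: str
    actions: list[str] = field(default_factory=list)


def consolidate(events: list[Event]) -> dict[str, DailyRecord]:
    """Group events into one record per (user, date) composite key."""
    records: dict[str, DailyRecord] = {}
    for event in events:
        # Same key shape as the lambda above: user and date joined by "_".
        key = f"{event.user}_{event.date}"
        record = records.setdefault(key, DailyRecord(event.user, event.date))
        record.actions.append(event.action)
    return records


events = [
    Event("Alice", "2024-01-01", "Login"),
    Event("Alice", "2024-01-01", "Logout"),
    Event("Alice", "2024-01-02", "Login"),
]
daily = consolidate(events)
print(sorted(daily))                       # ['Alice_2024-01-01', 'Alice_2024-01-02']
print(daily["Alice_2024-01-01"].actions)   # ['Login', 'Logout']
```

Three events collapse into two records because two of them share the same user and date.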
🔄 Auto-Evolution
When you insert new data about an existing entity, Ontomem doesn't create duplicates. It intelligently merges them into a Golden Record using configurable strategies (Conflict Resolution, List Appending, or LLM-powered Synthesis).
🔍 Hybrid Search
- Key-Value Lookup: O(1) exact access (e.g., "Get me Alice's summary for 2024-01-01").
- Vector Search: Semantic similarity search across your entire timeline (e.g., "When was Alice frustrated?").
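The difference between the two access paths can be illustrated with a toy store. The sketch below is an assumption-laden stand-in, not Ontomem's code: the records, the two-dimensional "embeddings", and the `cosine` and `search` helpers are all made up for illustration, and the brute-force scan replaces the FAISS index that the library lists as a requirement.

```python
import math

# Hypothetical records keyed by "user_date"; vectors are hand-made toy embeddings.
records = {
    "Alice_2024-01-01": {"summary": "Calm onboarding session", "vector": [0.9, 0.1]},
    "Alice_2024-01-02": {"summary": "Repeated login failures", "vector": [0.1, 0.9]},
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search(query_vector: list[float], k: int = 1) -> list[str]:
    """Return the k keys whose vectors are most similar to the query."""
    ranked = sorted(
        records,
        key=lambda key: cosine(query_vector, records[key]["vector"]),
        reverse=True,
    )
    return ranked[:k]


exact = records["Alice_2024-01-01"]   # key-value lookup: a single dict access
nearest = search([0.0, 1.0], k=1)     # vector search: rank every record by similarity
print(exact["summary"])
print(nearest)
```

The exact lookup needs the key up front, while the similarity search only needs a query vector and returns whichever record is closest.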
💾 Stateful & Persistent
Save your complete memory state (structured data + vector indices) to disk and restore it in seconds on next startup.
🧠 Ontomem vs. Other Memory Systems
Most memory libraries store Raw Text or Chat History. Ontomem stores Consolidated Knowledge.
| Feature | Ontomem 🧠 | Mem0 / Zep | LangChain Memory | Vector DBs (Pinecone/Chroma) |
|---|---|---|---|---|
| Core Storage Unit | ✅ Structured Objects (Pydantic) | Text Chunks + Metadata | Raw Chat Logs | Embedding Vectors |
| Data "Digestion" | ✅ Auto-Consolidation & merging | Simple Extraction | ❌ Append-only | ❌ Append-only |
| Time Awareness | ✅ Time-Slicing (Daily/Session Aggregation) | ❌ Timestamp metadata only | ❌ Sequential only | ❌ Metadata filtering only |
| Conflict Resolution | ✅ LLM Logic (Synthesize/Prioritize) | ❌ Last-write-wins | ❌ None | ❌ None |
| Type Safety | ✅ Strict Schema | ⚠️ Loose JSON | ❌ String only | ❌ None |
| Ideal For | Long-term Agent Profiles, Knowledge Graphs | Simple RAG, Search | Chatbots, Context Window | Semantic Search |
💡 The "Consolidation" Advantage
- Traditional RAG: Stores 50 chunks of "Alice likes apples", "Alice likes bananas". Search returns 50 fragments.
- Ontomem: Merges them into 1 object:
User(name="Alice", likes=["apples", "bananas"]). Search returns one complete truth.
🚀 Quick Start
Build a structured memory store in 30 seconds.
1. Define & Initialize
from pydantic import BaseModel
from ontomem import OMem

# 1. Define your memory schema
class UserProfile(BaseModel):
    name: str
    skills: list[str]
    last_seen: str

# 2. Initialize (Simple mode)
memory = OMem(
    memory_schema=UserProfile,
    key_extractor=lambda x: x.name  # Unique ID
)
2. Add & Merge (Auto-Consolidation)
Ontomem automatically merges data for the same ID.
# First observation
memory.add(UserProfile(name="Alice", skills=["Python"], last_seen="10:00"))
# Later observation (New skill added, time updated)
memory.add(UserProfile(name="Alice", skills=["Docker"], last_seen="11:00"))
# Retrieve the consolidated "Golden Record"
alice = memory.get("Alice")
print(alice.skills) # ['Python', 'Docker'] (Lists merged!)
print(alice.last_seen) # "11:00" (Updated!)
3. Search & Retrieve
# Exact retrieval
profile = memory.get("Alice")
# All keys in memory
all_keys = memory.keys
# Remove a record by key
memory.remove("Alice")
💡 Advanced Examples
Example 1: The "Self-Improving" Debugger (Logic Evolution)
An AI agent that doesn't just store errors: it synthesizes debugging wisdom over time using the MergeStrategy.LLM.BALANCED strategy.
from pydantic import BaseModel
from ontomem import OMem, MergeStrategy
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

class BugFixExperience(BaseModel):
    error_signature: str
    solutions: list[str]
    prevention_tips: str

memory = OMem(
    memory_schema=BugFixExperience,
    key_extractor=lambda x: x.error_signature,
    llm_client=ChatOpenAI(model="gpt-4o"),
    embedder=OpenAIEmbeddings(),
    merge_strategy=MergeStrategy.LLM.BALANCED
)

# Day 1: Pip install
memory.add(BugFixExperience(
    error_signature="ModuleNotFoundError: pandas",
    solutions=["pip install pandas"],
    prevention_tips="Check requirements.txt"
))

# Day 2: Docker container (Different solution!)
memory.add(BugFixExperience(
    error_signature="ModuleNotFoundError: pandas",
    solutions=["apt-get install python3-pandas"],  # Added to list!
    prevention_tips="Use system packages in containers"  # LLM merges both tips
))
# Result: Single record with merged solutions + synthesized advice
guidance = memory.get("ModuleNotFoundError: pandas")
print(guidance.prevention_tips)
# >>> "In standard environments, check requirements.txt.
# In containerized environments, prefer system packages..."
Example 2: Temporal Memory & Daily Consolidation (Time-Series)
Turn a stream of fragmented events into a single "Daily Summary" record using Composite Keys.
from pydantic import BaseModel
from ontomem import OMem, MergeStrategy
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

class DailyTrace(BaseModel):
    user: str
    date: str
    actions: list[str]  # Accumulates all day
    summary: str = ""   # LLM synthesizes entire day; empty until the first merge

memory = OMem(
    memory_schema=DailyTrace,
    key_extractor=lambda x: f"{x.user}_{x.date}",  # <-- THE MAGIC KEY
    llm_client=ChatOpenAI(model="gpt-4o"),
    embedder=OpenAIEmbeddings(),
    merge_strategy=MergeStrategy.LLM.BALANCED
)
# 9:00 AM event
memory.add(DailyTrace(user="Alice", date="2024-01-01", actions=["Login"]))
# 5:00 PM event (Same day → Merges into SAME record)
memory.add(DailyTrace(user="Alice", date="2024-01-01", actions=["Logout"]))
# Next day (New date → NEW record)
memory.add(DailyTrace(user="Alice", date="2024-01-02", actions=["Login"]))
# Results:
# - Alice_2024-01-01: actions=["Login", "Logout"], summary="Active trading day..."
# - Alice_2024-01-02: actions=["Login"], summary="Brief session..."
# Semantic search across time
results = memory.search("When was Alice frustrated?", k=1)
For a complete working example, see examples/06_temporal_memory_consolidation.py
🔍 Semantic Search
Build an index and search by natural language:
# Build vector index
memory.build_index()
# Semantic search
results = memory.search("Find researchers working on transformer models and attention mechanisms")
for researcher in results:
    print(f"- {researcher.name}: {researcher.research_interests}")
🛠️ Merge Strategies
Choose how to handle conflicts:
| Strategy | Behavior | Use Case |
|---|---|---|
| FIELD_MERGE | Non-null overwrites, lists append | Simple attribute collection |
| KEEP_INCOMING | Latest data wins | Status updates (current role, last seen) |
| KEEP_EXISTING | First observation stays | Historical records (first publication year) |
| LLM.BALANCED | LLM-driven semantic merging | Complex synthesis, contradiction resolution |
| LLM.PREFER_INCOMING | LLM merges semantically, prefers new data on conflict | New information should take priority when contradictions arise |
| LLM.PREFER_EXISTING | LLM merges semantically, prefers existing data on conflict | Existing data should take priority when contradictions arise |
# Example: LLM intelligently merges conflicting information
memory = OMem(
    ...,
    merge_strategy=MergeStrategy.LLM.BALANCED  # or LLM.PREFER_INCOMING, LLM.PREFER_EXISTING
)
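To make the three non-LLM strategies concrete, here is one possible reading of their table descriptions in plain Python. It is a sketch under stated assumptions, not the library's code: the `Profile` dataclass and the three functions are hypothetical, and skipping duplicate list items during the append is an extra assumption that the table does not state.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Profile:
    name: str
    skills: list[str] = field(default_factory=list)
    role: Optional[str] = None


def field_merge(existing: Profile, incoming: Profile) -> Profile:
    """Non-null incoming values overwrite; lists are appended."""
    merged = {}
    for f in fields(existing):
        old, new = getattr(existing, f.name), getattr(incoming, f.name)
        if isinstance(old, list) and isinstance(new, list):
            # Assumption: append only items that are not already present.
            merged[f.name] = old + [item for item in new if item not in old]
        else:
            merged[f.name] = new if new is not None else old
    return Profile(**merged)


def keep_incoming(existing: Profile, incoming: Profile) -> Profile:
    """Latest data wins."""
    return incoming


def keep_existing(existing: Profile, incoming: Profile) -> Profile:
    """First observation stays."""
    return existing


old = Profile("Alice", ["Python"], "Engineer")
new = Profile("Alice", ["Docker"], None)

merged = field_merge(old, new)
print(merged.skills)                   # ['Python', 'Docker']
print(merged.role)                     # Engineer (incoming None does not overwrite)
print(keep_incoming(old, new).skills)  # ['Docker']
print(keep_existing(old, new).skills)  # ['Python']
```

The LLM strategies cannot be reduced to a rule like this, since they depend on a model call to reconcile free-text fields.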
💾 Save & Load
Snapshot your entire memory state:
# Save (structured data → memory.json, vectors → FAISS indices)
memory.dump("./researcher_knowledge")
# Later, restore instantly
new_memory = OMem(...)
new_memory.load("./researcher_knowledge")
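The structured half of such a snapshot can be pictured as a JSON round trip. The sketch below uses only the standard library and is not how Ontomem lays out its files internally: `dump_records` and `load_records` are hypothetical helpers, the `memory.json` file name is taken from the comment above, and the vector index is left out entirely.

```python
import json
import tempfile
from pathlib import Path


def dump_records(records: dict[str, dict], folder: Path) -> Path:
    """Write all records to <folder>/memory.json, creating the folder if needed."""
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "memory.json"
    target.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return target


def load_records(folder: Path) -> dict[str, dict]:
    """Read records back from <folder>/memory.json."""
    return json.loads((folder / "memory.json").read_text(encoding="utf-8"))


records = {"Alice": {"name": "Alice", "skills": ["Python", "Docker"]}}

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "researcher_knowledge"
    dump_records(records, folder)
    restored = load_records(folder)

print(restored == records)  # True
```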
🔧 Installation & Setup
Basic Installation
pip install ontomem
Or with uv:
uv add ontomem
📦 For Developers
To set up the development environment with all testing and documentation tools:
uv sync --group dev
Core Requirements:
- Python 3.11+
- LangChain (for LLM integration)
- Pydantic (for schema definition)
- FAISS (for vector search)
🤝 Contributing
We're building the next generation of AI memory standards. PRs and issues welcome!
👨‍💻 Author
Yifan Feng - evanfeng97@gmail.com
📝 License
Licensed under the Apache License, Version 2.0 - See LICENSE file for details.
You are free to use, modify, and distribute this software under the terms of the Apache License 2.0.
Built with ❤️ for AI developers who believe memory is more than just search.