A self-consolidating memory layer for AI agents with schema-first design, intelligent merging, and hybrid search capabilities

🧠 Ontomem: The Self-Consolidating Memory

Ontomem is built on the concept of Ontology Memory—structured, coherent knowledge representation for AI systems.

Give your AI agent a "coherent" memory, not just "fragmented" retrieval.

[Figure: Ontomem framework diagram]

Traditional RAG (Retrieval-Augmented Generation) systems retrieve text fragments. Ontomem maintains structured entities using Pydantic schemas and intelligent merging algorithms. It automatically consolidates fragmented observations into complete knowledge graph nodes.

It doesn't just store data—it continuously "digests" and "organizes" it.


✨ Why Ontomem?

🧩 Schema-First & Type-Safe

Built on Pydantic. All memories are strongly-typed objects. Say goodbye to {"unknown": "dict"} hell and embrace IDE autocomplete and type checking.

🔄 Auto-Consolidation

When you insert different pieces of information about the same entity (same ID) multiple times, Ontomem doesn't create duplicates. It intelligently merges them into a Golden Record using configurable strategies (field overrides, list merging, or LLM-powered intelligent fusion).
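
The upsert behavior described above can be pictured with a small stand-alone sketch. It does not use Ontomem: `ToyMemory`, `Contact`, and `merge_fields` are hypothetical names, and the merge rule (non-None incoming values overwrite, lists are appended) is an assumption made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Contact:
    """Hypothetical entity used only for this illustration."""
    contact_id: str
    email: Optional[str] = None
    skills: List[str] = field(default_factory=list)


def merge_fields(existing, incoming):
    """Assumed rule: non-None incoming values win; lists are appended."""
    merged = {}
    for f in fields(existing):
        old, new = getattr(existing, f.name), getattr(incoming, f.name)
        if isinstance(old, list) and isinstance(new, list):
            merged[f.name] = old + new
        else:
            merged[f.name] = new if new is not None else old
    return type(existing)(**merged)


class ToyMemory(Generic[T]):
    """Keeps exactly one record per key, merging on repeated inserts."""

    def __init__(self, key_extractor: Callable[[T], str]) -> None:
        self._key = key_extractor
        self._store: Dict[str, T] = {}

    def add(self, item: T) -> None:
        key = self._key(item)
        if key in self._store:
            # Same key seen before: merge instead of storing a duplicate.
            self._store[key] = merge_fields(self._store[key], item)
        else:
            self._store[key] = item

    def get(self, key: str) -> Optional[T]:
        return self._store.get(key)


memory = ToyMemory(key_extractor=lambda c: c.contact_id)
memory.add(Contact("alice", skills=["python"]))
memory.add(Contact("alice", email="alice@example.com", skills=["rust"]))
record = memory.get("alice")
print(record)
```

Both inserts share the key `"alice"`, so the store ends up with one record that carries the email from the second observation and the skills from both.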

🔍 Hybrid Search

  • Key-Value Lookup: O(1) exact entity access
  • Vector Search: Built-in FAISS indexing for semantic similarity search, automatically synced
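
As a rough picture of the two access paths, the sketch below pairs a dictionary lookup with a brute-force similarity ranking. It is not Ontomem's implementation: the word-count "embedding" and the linear cosine scan stand in for a real embedding model and a FAISS index, and the record keys and texts are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical records keyed by ID; the text is what gets "embedded".
records: Dict[str, str] = {
    "err-import": "module not found error missing dependency",
    "err-timeout": "request timeout slow network connection",
    "err-syntax": "unexpected token syntax error in source file",
}


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# "Index" built once from the stored records.
index = {key: embed(text) for key, text in records.items()}


def lookup(key: str) -> str:
    """Exact access: a single dictionary lookup."""
    return records[key]


def search(query: str, k: int = 2) -> List[str]:
    """Similarity search: rank every key by cosine score, return the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda key: cosine(q, index[key]), reverse=True)
    return ranked[:k]


print(lookup("err-timeout"))
print(search("missing dependency error", k=1))
```

The exact path needs the key up front; the similarity path only needs a description of the problem, at the cost of scoring candidates.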

💾 Stateful & Persistent

Save your complete memory state (structured data + vector indices) to disk and restore it in seconds on next startup.


🚀 Quick Start: Building a "Self-Improving" Experience Library

Imagine an AI coding agent that debugs issues. Without memory, it repeats the same trial-and-error process every time. With Ontomem, it builds a persistent "Debugging Playbook" that evolves with each new problem encountered.

1. Define Your Experience Schema

from pydantic import BaseModel
from typing import List, Optional

class BugFixExperience(BaseModel):
    """A living record of debugging knowledge."""
    error_signature: str            # Key: e.g., "ModuleNotFoundError: pandas"
    root_causes: List[str]          # Different reasons this error can occur
    solutions: List[str]            # Multiple working solutions discovered
    prevention_tips: str            # Synthesized understanding of how to avoid it
    last_updated: Optional[str] = None

2. Initialize with LLM-Powered Merging

We use the LLM.BALANCED strategy so Ontomem doesn't just list solutions—it synthesizes them into coherent, actionable guidance.

from ontomem import OMem, MergeStrategy
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

experience_memory = OMem(
    memory_schema=BugFixExperience,
    key_extractor=lambda x: x.error_signature,
    llm_client=ChatOpenAI(model="gpt-4o"),
    embedder=OpenAIEmbeddings(),
    merge_strategy=MergeStrategy.LLM.BALANCED
)

3. The Agent Learns Over Time

Day 1: The First Encounter

The agent encounters ModuleNotFoundError for pandas and fixes it with pip install.

# Experience 1: Initial observation
experience_memory.add(BugFixExperience(
    error_signature="ModuleNotFoundError: No module named 'pandas'",
    root_causes=["Missing library in environment"],
    solutions=["Run: pip install pandas"],
    prevention_tips="Always check requirements.txt before running code."
))

Day 2: New Context, Different Fix

The agent encounters the same error in a Docker container where pip fails, but apt-get install python3-pandas works.

# Experience 2: Different context, same error
experience_memory.add(BugFixExperience(
    error_signature="ModuleNotFoundError: No module named 'pandas'",
    root_causes=["Package not in system Python", "Binary incompatibility with pip"],
    solutions=["Run: apt-get install python3-pandas", "Use system package manager in containers"],
    prevention_tips="In containerized environments, prefer system packages for compiled dependencies."
))

Day 3: Agent Seeks Wisdom

When a new agent instance encounters the same error, it queries the evolved knowledge base:

# Retrieve consolidated wisdom
guidance = experience_memory.get("ModuleNotFoundError: No module named 'pandas'")

print("Root Causes:")
for cause in guidance.root_causes:
    print(f"  - {cause}")
# Example output (LLM-merged; exact wording may vary):
#   - Missing library in environment
#   - Package not in system Python
#   - Binary incompatibility with pip

print("\nSolutions:")
for i, solution in enumerate(guidance.solutions, 1):
    print(f"  {i}. {solution}")
# Example output (LLM-merged; exact wording may vary):
#   1. Run: pip install pandas (standard approach)
#   2. Run: apt-get install python3-pandas (for system Python)
#   3. Use system package manager in containers

print("\nPrevention Tips:")
print(guidance.prevention_tips)
# Example output (LLM-synthesized; exact wording may vary):
# "Check requirements.txt before running code.
#  In containers, prefer system packages for compiled dependencies.
#  Consider using virtual environments to isolate dependencies."

Day 4: Semantic Search for Similar Problems

The agent doesn't remember the exact error, but can search by concept:

# Semantic search: Find solutions for import-related issues
similar_issues = experience_memory.search(
    "Python module import failures dependency missing",
    k=5
)

print(f"Found {len(similar_issues)} related debugging experiences")

The agent went from "trial and error" to "informed decision-making". No boilerplate. No manual consolidation. Just add experiences and let Ontomem synthesize wisdom.


🔍 Semantic Search

Build an index and search by natural language:

# `memory` is an OMem instance holding researcher profiles (setup not shown)
# Build vector index
memory.build_index()

# Semantic search
results = memory.search("Find researchers working on transformer models and attention mechanisms")

for researcher in results:
    print(f"- {researcher.name}: {researcher.research_interests}")

🛠️ Merge Strategies

Choose how to handle conflicts:

| Strategy | Behavior | Use Case |
| --- | --- | --- |
| FIELD_MERGE | Non-null overwrites, lists append | Simple attribute collection |
| KEEP_INCOMING | Latest data wins | Status updates (current role, last seen) |
| KEEP_EXISTING | First observation stays | Historical records (first publication year) |
| LLM.BALANCED | LLM-driven semantic merging | Complex synthesis, contradiction resolution |
| LLM.PREFER_INCOMING | LLM merges semantically, prefers new data on conflict | New information should take priority when contradictions arise |
| LLM.PREFER_EXISTING | LLM merges semantically, prefers existing data on conflict | Existing data should take priority when contradictions arise |

# Example: LLM intelligently merges conflicting information
memory = OMem(
    ...,
    merge_strategy=MergeStrategy.LLM.BALANCED  # or LLM.PREFER_INCOMING, LLM.PREFER_EXISTING
)
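
The three rule-based strategies can be contrasted on a single pair of records. This is a stand-alone sketch, not Ontomem's implementation: the function names mirror the strategy names only for readability, records are plain dicts, and the handling of `None` and lists is an assumed reading of the behavior descriptions. The LLM-based strategies are left out because they require a model call.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def keep_incoming(existing: Record, incoming: Record) -> Record:
    """Latest observation replaces the stored one."""
    return dict(incoming)


def keep_existing(existing: Record, incoming: Record) -> Record:
    """First observation is preserved unchanged."""
    return dict(existing)


def field_merge(existing: Record, incoming: Record) -> Record:
    """Assumed rule: non-None values overwrite, lists are appended."""
    merged = dict(existing)
    for name, value in incoming.items():
        if isinstance(value, list) and isinstance(merged.get(name), list):
            merged[name] = merged[name] + value
        elif value is not None:
            merged[name] = value
    return merged


# Hypothetical records describing the same entity at two points in time.
old = {"name": "Ada", "role": "student", "papers": ["A"], "first_year": 2019}
new = {"name": "Ada", "role": "professor", "papers": ["B"], "first_year": None}

for strategy in (field_merge, keep_incoming, keep_existing):
    print(strategy.__name__, strategy(old, new))
```

Under these assumptions, only `field_merge` keeps both papers and the original `first_year` while picking up the new role; the other two return one side wholesale.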

💾 Save & Load

Snapshot your entire memory state:

# Save (structured data → memory.json, vectors → FAISS indices)
memory.dump("./researcher_knowledge")

# Later, restore instantly
new_memory = OMem(...)
new_memory.load("./researcher_knowledge")
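
To make the round trip concrete, here is a minimal sketch of persisting structured records to a folder and reading them back. It covers only the JSON half; the vector index is not handled. The `dump` and `load` functions below are hypothetical stand-ins written for this example, and the file name `memory.json` is taken from the comment above rather than from a verified on-disk layout.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

Records = Dict[str, Dict[str, Any]]


def dump(records: Records, folder: Path) -> None:
    """Write all records to <folder>/memory.json, creating the folder if needed."""
    folder.mkdir(parents=True, exist_ok=True)
    payload = json.dumps(records, indent=2)
    (folder / "memory.json").write_text(payload, encoding="utf-8")


def load(folder: Path) -> Records:
    """Read the records back from <folder>/memory.json."""
    return json.loads((folder / "memory.json").read_text(encoding="utf-8"))


# Hypothetical record; the key and fields are placeholders.
records: Records = {
    "researcher_001": {"name": "Example Researcher", "interests": ["vision", "nlp"]},
}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "researcher_knowledge"
    dump(records, target)
    restored = load(target)

print(restored == records)
```

Keeping the structured data in a human-readable file makes a snapshot easy to inspect or diff, independent of how the vectors are stored.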

📊 Ontomem vs Traditional Approaches

| Feature | Traditional Vector DB | Ontomem 🧠 |
| --- | --- | --- |
| Storage Unit | Text chunks | Structured Objects |
| Deduplication | Manual or via embeddings | Native, ID-based |
| Updates | Append-only (creates dupes) | Auto-merge (upsert) |
| Query Results | Similar text fragments | Complete entities |
| Type Safety | ❌ None | Pydantic |
| Indexing | Manual sync needed | Auto-synced |

🎯 Use Cases

🤖 AI Research Assistant

Consolidate researcher profiles, papers, and citations from multiple sources.

👤 Personal Knowledge Graph

Build a living profile of contacts, their preferences, skills, and interaction history from conversations.

🏢 Enterprise Data Hub

Unify customer/employee records from CRM, email, support tickets, and social media.

🧠 AI Agent Long-Term Memory

An autonomous agent accumulates experiences and observations—Ontomem keeps them organized and searchable.


🔧 Installation

Basic Installation

pip install ontomem

Or with uv:

uv add ontomem

For Developers

To set up the development environment with all testing and documentation tools:

uv sync --group dev

Core Requirements:

  • Python 3.11+
  • LangChain (for LLM integration)
  • Pydantic (for schema definition)
  • FAISS (for vector search)

📚 API Reference

Core Methods

add(items: Union[T, List[T]]) → None

Add item(s) to memory. Automatically merges duplicates by key.

memory.add(ResearcherProfile(...))
memory.add([item1, item2, item3])

get(key: Any) → Optional[T]

Retrieve an entity by its unique key.

researcher = memory.get("yann_lecun_001")

build_index(force: bool = False) → None

Build or rebuild the vector index for semantic search.

memory.build_index()  # Build if clean
memory.build_index(force=True)  # Force rebuild

search(query: str, k: int = 5) → List[T]

Semantic search over all entities.

results = memory.search("transformers and attention", k=10)

dump(folder_path: Union[str, Path]) → None

Save memory state (data + index) to disk.

memory.dump("./my_memory")

load(folder_path: Union[str, Path]) → None

Load memory state from disk.

memory.load("./my_memory")

remove(key: Any) → bool

Remove an entity by key.

success = memory.remove("yann_lecun_001")

clear() → None

Clear all entities and indices.

memory.clear()

Properties

keys: List[Any]

All unique keys in memory.

items: List[T]

All entity instances.

size: int

Number of entities.


🤝 Contributing

We're building the next generation of AI memory standards. PRs and issues welcome!


👨‍💻 Author

Yifan Feng - evanfeng97@gmail.com


📝 License

Licensed under the Apache License, Version 2.0 - See LICENSE file for details.

You are free to use, modify, and distribute this software under the terms of the Apache License 2.0.


Built with ❤️ for AI developers who believe memory is more than just search.
